<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	>
<channel>
	<title>Comments on: A quick JS quizz for anybody who think they know regex</title>
	<atom:link href="http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/feed/" rel="self" type="application/rss+xml" />
	<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/</link>
	<description>Web development concerns, usually revolving around implementation of designs into graphics, CSS, and HTML.</description>
	<pubDate>Tue, 07 Oct 2008 08:34:50 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.2</generator>
		<item>
		<title>By: xentek</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-25717</link>
		<dc:creator>xentek</dc:creator>
		<pubDate>Fri, 25 Jan 2008 05:46:24 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-25717</guid>
		<description>Don't think lookbehinds (backreferences, if I'm not a complete idiot) are even allowed in JavaScript (had to do a little bit of RegEx this week, after a while of not messing with it).

Of course, I just used a RegEx tester and example strings to build mine up, bit by bit, until I couldn't break the RegEx when I changed my string. But when I was looking up the style of RegEx JS uses (a limited form of Perl syntax), I remember reading about this limitation.</description>
		<content:encoded><![CDATA[<p>Don&#8217;t think lookbehinds (backreferences, if I&#8217;m not a complete idiot) are even allowed in JavaScript (had to do a little bit of RegEx this week, after a while of not messing with it).</p>
<p>Of course, I just used a RegEx tester and example strings to build mine up, bit by bit, until I couldn&#8217;t break the RegEx when I changed my string. But when I was looking up the style of RegEx JS uses (a limited form of Perl syntax), I remember reading about this limitation.</p>
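<p>For reference, backreferences and lookbehinds are two different features. The sketch below is not from the quiz. It assumes an engine that implements ES2018 (for example a recent Node.js), where lookbehind (<code>(?&lt;=...)</code>) is available, and it builds the lookbehind pattern with <code>new RegExp</code> so that an older engine throws a catchable error instead of failing to parse the script. The sample strings are made up for illustration.</p>

```javascript
// A backreference (\1) re-matches the text captured by group 1.
const doubled = /(a)\1/.exec('baab');

// A lookbehind (?<=...) asserts what precedes the match without consuming it.
// Built from a string so that engines without lookbehind throw at runtime
// rather than rejecting the whole script.
let lookbehindSupported = true;
let price = null;
try {
  price = new RegExp('(?<=\\$)\\d+').exec('cost: $42');
} catch (e) {
  lookbehindSupported = false;
}

console.log(doubled[0], doubled[1]);
console.log(lookbehindSupported ? price[0] : 'lookbehind not supported here');
```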
]]></content:encoded>
	</item>
	<item>
		<title>By: Steven Levithan</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-20633</link>
		<dc:creator>Steven Levithan</dc:creator>
		<pubDate>Wed, 24 Oct 2007 07:09:16 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-20633</guid>
		<description>IMO, example 02 is a particularly sneaky/good example. It may not be obvious why I answered the way I did unless you consider less tricky examples like /(?:(a)&#124;(b))+/.exec("ba"), which follows exactly the same principle.

If you changed it from /(?:(a)(b)?)+/.exec('aaba') to /(?:(a)(b?))+/.exec('aaba') then I would expect $1 to still be the last "a", but $2 to be "" instead of "b". It's all about group participation. For anyone curious about why browsers handle this the way they do, have a look at this post of mine from a while ago: http://blog.stevenlevithan.com/archives/npcg-javascript .</description>
		<content:encoded><![CDATA[<p>IMO, example 02 is a particularly sneaky/good example. It may not be obvious why I answered the way I did unless you consider less tricky examples like /(?:(a)|(b))+/.exec("ba"), which follows exactly the same principle.</p>
<p>If you changed it from /(?:(a)(b)?)+/.exec('aaba') to /(?:(a)(b?))+/.exec('aaba') then I would expect $1 to still be the last &#8220;a&#8221;, but $2 to be "" instead of &#8220;b&#8221;. It&#8217;s all about group participation. For anyone curious about why browsers handle this the way they do, have a look at this post of mine from a while ago: <a href="http://blog.stevenlevithan.com/archives/npcg-javascript" rel="nofollow">http://blog.stevenlevithan.com/archives/npcg-javascript</a> .</p>
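<p>The group-participation point can be checked with a minimal sketch like the one below. It uses the three patterns quoted in this comment; the <code>show</code> helper is a hypothetical name added only to make <code>undefined</code> and the empty string visibly different in the output.</p>

```javascript
// Each exec call returns [whole match, $1, $2].
const alternation = /(?:(a)|(b))+/.exec('ba');
const optionalGroup = /(?:(a)(b)?)+/.exec('aaba');
const optionalInside = /(?:(a)(b?))+/.exec('aaba');

// Hypothetical helper: print undefined (group did not participate)
// differently from "" (group participated but matched nothing).
const show = (m) =>
  m.map((c) => (c === undefined ? 'undefined' : JSON.stringify(c))).join(', ');

console.log(show(alternation));
console.log(show(optionalGroup));
console.log(show(optionalInside));
```

<p>In an engine that resets captures on every iteration of the quantified group, as the current ECMAScript specification describes, the last two results should differ only in $2: <code>undefined</code> for <code>(b)?</code> and an empty string for <code>(b?)</code>. Older browsers may report something else, which is the subject of the quiz.</p>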
]]></content:encoded>
	</item>
	<item>
		<title>By: Steven Levithan</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-20632</link>
		<dc:creator>Steven Levithan</dc:creator>
		<pubDate>Wed, 24 Oct 2007 06:49:51 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-20632</guid>
		<description>I think I know regexes pretty well. I especially know JavaScript regexes, and can tell you what ES3 and the big four browsers will return for your examples without looking it up or testing. However, since you're asking what users *think* should be returned, here's what I'd wish for / expect intuitively:

Example 01: /(a\1)+&#124;b/.exec('aaaab');

This should match within the string only one time ("b"), and the result for $1 should be undefined (null would also be acceptable), rather than an empty string. Treating non-participating capturing groups as null or undefined rather than an empty string just seems more logical, is typical behavior with most other regex flavors, and allows for some fancy regex tricks which are not otherwise possible when conditionals are not available.

Example 02: /(?:(a)(b)?)+/.exec('aaba');

This should match the entire string. $1 should be the very last "a", and $2 should be "b". It doesn't make sense to me to replace the value of backreferences for capturing groups which previously participated in the match with an empty string unless the group re-participates.

Example 03: /(?:(\2&#124;a)(b)?)+/.exec('abba');

This should not match anywhere in the string (see my take on Example 01).

@liorean, so what do *you* think they should match?</description>
		<content:encoded><![CDATA[<p>I think I know regexes pretty well. I especially know JavaScript regexes, and can tell you what ES3 and the big four browsers will return for your examples without looking it up or testing. However, since you&#8217;re asking what users *think* should be returned, here&#8217;s what I&#8217;d wish for / expect intuitively:</p>
<p>Example 01: /(a\1)+|b/.exec('aaaab');</p>
<p>This should match within the string only one time (&#8220;b&#8221;), and the result for $1 should be undefined (null would also be acceptable), rather than an empty string. Treating non-participating capturing groups as null or undefined rather than an empty string just seems more logical, is typical behavior with most other regex flavors, and allows for some fancy regex tricks which are not otherwise possible when conditionals are not available.</p>
<p>Example 02: /(?:(a)(b)?)+/.exec('aaba');</p>
<p>This should match the entire string. $1 should be the very last &#8220;a&#8221;, and $2 should be &#8220;b&#8221;. It doesn&#8217;t make sense to me to replace the value of backreferences for capturing groups which previously participated in the match with an empty string unless the group re-participates.</p>
<p>Example 03: /(?:(\2|a)(b)?)+/.exec('abba');</p>
<p>This should not match anywhere in the string (see my take on Example 01).</p>
<p>@liorean, so what do *you* think they should match?</p>
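<p>To compare these expectations with what a given engine actually returns, a small harness along these lines may help. The three patterns and inputs are the quiz examples quoted above; <code>describeMatch</code> is a hypothetical helper, and no results are listed here, in keeping with the request not to post them.</p>

```javascript
// The three quiz patterns and their input strings.
const quiz = [
  { name: 'Example 01', re: /(a\1)+|b/, input: 'aaaab' },
  { name: 'Example 02', re: /(?:(a)(b)?)+/, input: 'aaba' },
  { name: 'Example 03', re: /(?:(\2|a)(b)?)+/, input: 'abba' },
];

// Hypothetical helper: report the whole match, its index, and each capture,
// labelling captures that are undefined as non-participating.
function describeMatch(m) {
  if (m === null) return 'no match';
  const groups = m.slice(1).map((c, i) =>
    `$${i + 1}=` + (c === undefined ? '(did not participate)' : JSON.stringify(c)));
  return [`match=${JSON.stringify(m[0])} at index ${m.index}`, ...groups].join('; ');
}

for (const { name, re, input } of quiz) {
  console.log(`${name}: ${describeMatch(re.exec(input))}`);
}
```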
]]></content:encoded>
	</item>
	<item>
		<title>By: josephdietrich</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18712</link>
		<dc:creator>josephdietrich</dc:creator>
		<pubDate>Fri, 07 Sep 2007 14:54:22 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18712</guid>
		<description>Well, I tried it out and I admit I'm kind of surprised. Interesting. The results of example 01 really surprised me, because I thought the rule on that was clear.

But then I ran these through a Perl script as well and didn't get what I expected, so obviously my grasp of regular expressions needs to be improved.</description>
		<content:encoded><![CDATA[<p>Well, I tried it out and I admit I&#8217;m kind of surprised. Interesting. The results of example 01 really surprised me, because I thought the rule on that was clear.</p>
<p>But then I ran these through a Perl script as well and didn&#8217;t get what I expected, so obviously my grasp of regular expressions needs to be improved.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: josephdietrich</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18676</link>
		<dc:creator>josephdietrich</dc:creator>
		<pubDate>Thu, 06 Sep 2007 19:45:36 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18676</guid>
		<description>I'll try that tomorrow morning then. Your cryptic response is intriguing. ;-)</description>
		<content:encoded><![CDATA[<p>I&#8217;ll try that tomorrow morning then. Your cryptic response is intriguing. ;-)</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: liorean</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18675</link>
		<dc:creator>liorean</dc:creator>
		<pubDate>Thu, 06 Sep 2007 19:40:59 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18675</guid>
		<description>&lt;p&gt;Well then. I'm not saying you're not allowed to try them in browsers. I'm just asking you to reason about what the result should be before you actually try it. Actually, since you've already done your thinking, try them out in each of IE for Windows, Safari, and either Opera or Mozilla (these two behave identically in this respect), and you'll find noticeable differences.&lt;/p&gt;
&lt;p&gt;Oh, and just to avoid ruining it for others, please refrain from posting browser results in the comments.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>Well then. I&#8217;m not saying you&#8217;re not allowed to try them in browsers. I&#8217;m just asking you to reason about what the result should be before you actually try it. Actually, since you&#8217;ve already done your thinking, try them out in each of IE for Windows, Safari, and either Opera or Mozilla (these two behave identically in this respect), and you&#8217;ll find noticeable differences.</p>
<p>Oh, and just to avoid ruining it for others, please refrain from posting browser results in the comments.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: josephdietrich</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18673</link>
		<dc:creator>josephdietrich</dc:creator>
		<pubDate>Thu, 06 Sep 2007 19:17:44 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18673</guid>
		<description>Without testing it I really can't answer that question, since I am hardly a regex guru. My first instinct would be to say "what j said." I don't think that using the &lt;code&gt;?:&lt;/code&gt; option on the outer parentheses invalidates any references to the inner parentheses. So if you were to try to do a replacement with \1\2 you'd just get "ab."

But maybe I'm not understanding your question correctly ...</description>
		<content:encoded><![CDATA[<p>Without testing it I really can&#8217;t answer that question, since I am hardly a regex guru. My first instinct would be to say &#8220;what j said.&#8221; I don&#8217;t think that using the <code>?:</code> option on the outer parentheses invalidates any references to the inner parentheses. So if you were to try to do a replacement with \1\2 you&#8217;d just get &#8220;ab.&#8221;</p>
<p>But maybe I&#8217;m not understanding your question correctly &#8230;</p>
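<p>As a side note on the replacement idea: in JavaScript replacement strings, captures are written <code>$1</code> and <code>$2</code> rather than <code>\1</code> and <code>\2</code>. The sketch below uses the Example 02 pattern on a made-up single-iteration input, <code>'ab'</code>, so it does not give away the quiz answer.</p>

```javascript
// Non-capturing outer group: only the inner (a) and (b) are numbered.
const re = /(?:(a)(b)?)+/;

// In a replacement string, captures are referenced as $1, $2.
const swapped = 'ab'.replace(re, '[$2$1]');

// A replacer function receives the whole match, then each capture.
const viaFunction = 'ab'.replace(re, (match, g1, g2) =>
  `${g1.toUpperCase()}${g2.toUpperCase()}`);

console.log(swapped);     // [ba]
console.log(viaFunction); // AB
```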
]]></content:encoded>
	</item>
	<item>
		<title>By: j</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18669</link>
		<dc:creator>j</dc:creator>
		<pubDate>Thu, 06 Sep 2007 17:07:00 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18669</guid>
		<description>I'd expect the submatches for example 2 to be 'a' and 'b' respectively, as that's what the groups contain.  If they return anything else, I'd suspect a broken regex engine in that browser.</description>
		<content:encoded><![CDATA[<p>I&#8217;d expect the submatches for example 2 to be &#8216;a&#8217; and &#8216;b&#8217; respectively, as that&#8217;s what the groups contain.  If they return anything else, I&#8217;d suspect a broken regex engine in that browser.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: liorean</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18663</link>
		<dc:creator>liorean</dc:creator>
		<pubDate>Thu, 06 Sep 2007 14:38:01 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18663</guid>
		<description>&lt;p&gt;josephdietrich: Well, let's say that Example 02 works just splendidly (...it does, in all browsers). What do you expect the values to be for the first and the second captured submatch, respectively? Remember that the captured submatches are included in the return value.&lt;/p&gt;</description>
		<content:encoded><![CDATA[<p>josephdietrich: Well, let&#8217;s say that Example 02 works just splendidly (&#8230;it does, in all browsers). What do you expect the values to be for the first and the second captured submatch, respectively? Remember that the captured submatches are included in the return value.</p>
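<p>For anyone unsure what &#8220;included in the return value&#8221; means, here is a minimal sketch of the array that <code>exec</code> returns. The pattern and input are illustrative only and are not one of the quiz examples.</p>

```javascript
// Illustrative pattern and input, not one of the quiz examples.
const m = /(\d+)-(\d+)/.exec('tel 555-1234');

console.log(m[0]);     // whole match: "555-1234"
console.log(m[1]);     // first captured submatch: "555"
console.log(m[2]);     // second captured submatch: "1234"
console.log(m.index);  // position of the match in the input: 4
console.log(m.length); // 3: the whole match plus two captures
```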
]]></content:encoded>
	</item>
	<item>
		<title>By: josephdietrich</title>
		<link>http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18640</link>
		<dc:creator>josephdietrich</dc:creator>
		<pubDate>Thu, 06 Sep 2007 07:06:15 +0000</pubDate>
		<guid isPermaLink="false">http://web-graphics.com/2007/09/05/a-quick-js-quizz-for-anybody-who-think-they-know-regex/#comment-18640</guid>
		<description>Example 01 should only match "b" because you are not supposed to be able to use a backreference inside itself, so the regex should just ignore &lt;code&gt;(a\1)&lt;/code&gt; as nonsensical and only match "b".

Example 02 is, I think, effectively the same as &lt;code&gt;((a)(b)?)+&lt;/code&gt;, which will match the whole string. &lt;code&gt;?:&lt;/code&gt; just gets rid of the backreference on the outer parentheses, and in this case that's pointless. Since &lt;code&gt;(b)?&lt;/code&gt; is optional, what should happen is that the engine should consume "a" then "ab" then "a" and then match the whole string.

Like j, I'd be surprised if Example 3 works but it might. The reference &lt;code&gt;\2&lt;/code&gt; is not self-contained like in Example 1, and you &lt;em&gt;can&lt;/em&gt; use references during the match, but I don't know if it will match when it comes before the thing it is referencing. The &lt;code&gt;?:&lt;/code&gt; should get rid of the reference to the outer group, leaving the references to the two groups within. So the expression would simplify to &lt;code&gt;((b&#124;a)(b)?)+&lt;/code&gt;, which should match the whole string; first "ab" then "b" then "a".</description>
		<content:encoded><![CDATA[<p>Example 01 should only match &#8220;b&#8221; because you are not supposed to be able to use a backreference inside itself, so the regex should just ignore <code>(a\1)</code> as nonsensical and only match &#8220;b&#8221;.</p>
<p>Example 02 is, I think, effectively the same as <code>((a)(b)?)+</code>, which will match the whole string. <code>?:</code> just gets rid of the backreference on the outer parentheses, and in this case that&#8217;s pointless. Since <code>(b)?</code> is optional, what should happen is that the engine should consume &#8220;a&#8221; then &#8220;ab&#8221; then &#8220;a&#8221; and then match the whole string.</p>
<p>Like j, I&#8217;d be surprised if Example 3 works but it might. The reference <code>\2</code> is not self-contained like in Example 1, and you <em>can</em> use references during the match, but I don&#8217;t know if it will match when it comes before the thing it is referencing. The <code>?:</code> should get rid of the reference to the outer group, leaving the references to the two groups within. So the expression would simplify to <code>((b|a)(b)?)+</code>, which should match the whole string; first &#8220;ab&#8221; then &#8220;b&#8221; then &#8220;a&#8221;.</p>
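<p>One detail worth checking is what <code>?:</code> changes: the text matched stays the same, but the capture numbering shifts. A minimal sketch, using the Example 02 pattern with and without a capturing outer group (the variable names are just for illustration):</p>

```javascript
const input = 'aaba';
const capturingOuter = /((a)(b)?)+/.exec(input);
const nonCapturingOuter = /(?:(a)(b)?)+/.exec(input);

// Both patterns consume the same text...
console.log(capturingOuter[0] === nonCapturingOuter[0]);       // true
// ...but the capturing version returns one extra slot,
console.log(capturingOuter.length, nonCapturingOuter.length);  // 4 3
// and the (a) group moves from $1 to $2.
console.log(capturingOuter[2] === nonCapturingOuter[1]);       // true
```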
]]></content:encoded>
	</item>
</channel>
</rss>
