<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>beta BLOG dot NET - recently in algorithms category</title>
  <link rel="alternate" type="text/html" href="http://beta-blog.net/algorithms/" />
  <link rel="self" type="application/atom+xml" href="" />
  <id>tag:beta-blog.net,2009-08-27://1</id>
  <updated>2010-04-02T01:17:02Z</updated>
  
  <generator uri="http://www.sixapart.com/movabletype/">Movable Type 4.25</generator>

<entry>
  <title>Is LINQ functional?</title>
  <link rel="alternate" type="text/html" href="http://beta-blog.net/2010/03/is-linq-functional" />
  <id>tag:beta-blog.net,2010://1.52390</id>

  <published>2010-03-31T20:11:19Z</published>
  <updated>2010-04-02T01:17:02Z</updated>

  <summary>With its 3.5 extensions, the .NET framework started to turn into a really cool looking programming concept, not least due to the syntactic sugar of LINQ. One reason for that is surely its functional look. Well, as LINQ is integrated into an imperative context, it will never be able to guarantee state-free evaluation as a genuine functional language does. Nevertheless, it&apos;s worth discussing and playing around with a few aspects of it in terms of a multi-paradigm programming concept. </summary>
  <author>
    <name>Sebastian</name>
    <uri>http://beta-blog.net</uri>
  </author>
  
  <category term=".NET" scheme="http://www.sixapart.com/ns/types#category" />
  
  <category term="algorithms" scheme="http://www.sixapart.com/ns/types#category" />
  
  <category term="net" label=".NET" scheme="http://www.sixapart.com/ns/types#tag" />
  <category term="c" label="C#" scheme="http://www.sixapart.com/ns/types#tag" />
  <category term="math" label="math" scheme="http://www.sixapart.com/ns/types#tag" />
  
  <content type="html" xml:lang="en" xml:base="http://beta-blog.net/">
  <![CDATA[<p>
With its 3.5 extensions, the .NET framework started to turn into a really
cool looking programming concept,
not least due to the syntactic sugar of
<a href="http://msdn.microsoft.com/en-us/library/bb397676.aspx" target="_blank">LINQ</a>.
One reason for that is surely its <a href="http://en.wikipedia.org/wiki/Functional_programming" target="_blank">functional</a>
look.
Well, as LINQ is integrated into an imperative context, it will never be able to
guarantee state-free evaluation as a genuine functional language does.
Nevertheless, it's worth discussing and playing around with a few aspects of it
in terms of a multi-paradigm programming concept.
</p>

<h3>Delegating definitions in C# 3.0</h3>
<p>
Firstly, the concept of
<a href="http://en.wikipedia.org/wiki/First-class_function" target="_blank">first-class functions</a>,
i.e. the invention of the function type, leads to the notion of closures.
So for instance, a constant function such as
</p>
<pre class="code"><code class="csharpnet"><span class="kwd def">Func</span>&lt;<span class="kwd builtin">int</span>&gt; <span class="type">i</span> = () =&gt; <span class="num">1</span>;
</code></pre>
<p>
defines something like a read-only variable.
You may get its value now, later or never,
but you can always be sure that its value will never be changed anywhere in your code.
Hence, you have won a quantum of control over your program by this
weird piece of code.
That's a basic idea of functional programming.
</p>

<p>
The concept of function types leads to higher order
functions, i.e. functions mapping functions to other functions.
This brings us to the <a href="http://en.wikipedia.org/wiki/Currying" target="_blank">curry functor</a>,
a key concept in the theory of functional programming:
</p>
<p class="quote">
<span class="math">curry: (X <span class="small">x</span> Y &rarr; Z) &rarr; (X  &rarr; Y  &rarr; Z)</span>
</p>
<p>
That is, for any function <span class="math">f(x,y)</span>, there is a curried function
<span class="math">curry(f)(x)</span>
taking <span class="math">x</span> to a function <span class="math">g(y) = f(x,y)</span>.
This is now implemented easily in C# using generic types:
</p>
<pre class="code">
<code class="csharpnet"><span class="kwd builtin">static</span> <span class="kwd def">Func</span>&lt;<span class="type">X</span>, <span class="kwd def">Func</span>&lt;<span class="type">Y</span>, <span class="type">Z</span>&gt;&gt; <span class="type">Curry</span>&lt;<span class="type">X</span>, <span class="type">Y</span>, <span class="type">Z</span>&gt;(<span class="kwd def">Func</span>&lt;<span class="type">X</span>, <span class="type">Y</span>, <span class="type">Z</span>&gt; <span class="type">f</span>)
{
  <span class="kwd builtin">return</span> <span class="type">x</span> =&gt; <span class="type">y</span> =&gt; <span class="type">f</span>(<span class="type">x</span>, <span class="type">y</span>);
}
</code></pre>
<p>
(inspired by this <a target="_blank" href="http://jacobcarpenter.wordpress.com/2008/01/02/c-abuse-of-the-day-functional-library-implemented-with-lambdas/">C# abuse of the day</a>).
Well, that's more or less of academic interest, since one would hardly ever replace
<span class="code">x++</span> by
</p>
<pre class="code">
<code class="csharpnet"><span class="type">x</span> = <span class="type">Curry</span>&lt;<span class="kwd builtin">int</span>, <span class="kwd builtin">int</span>, <span class="kwd builtin">int</span>&gt;((<span class="type">a</span>, <span class="type">b</span>) =&gt; <span class="type">a</span> + <span class="type">b</span>)(<span class="num">1</span>)(<span class="type">x</span>); <span class="cmnt">// x++ ;)</span>
</code></pre>
<p>
A slightly more interesting example is the following:
</p>
<pre class="code">
<code class="csharpnet"><span class="cmnt">// using System.Text.RegularExpressions;</span>
<span class="kwd builtin">var</span> <span class="type">grep</span> = <span class="type">Curry</span>&lt;<a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx" target="_blank" rel="nofollow">Regex</a>, <a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">string</span>&gt;, <a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">string</span>&gt;&gt;(
  (<span class="type">regex</span>, <span class="type">list</span>) =&gt; <span class="kwd builtin">from</span> <span class="type">s</span> <span class="kwd builtin">in</span> <span class="type">list</span>
                   <span class="kwd builtin">where</span> <span class="type">regex.Match</span>(<span class="type">s</span>).<span class="type">Success</span>
                   <span class="kwd builtin">select</span> <span class="type">s</span>);
<span class="kwd builtin">var</span> <span class="type">grepFoo</span> = <span class="type">grep</span>(<span class="kwd builtin">new</span> <a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.regex.aspx" target="_blank" rel="nofollow">Regex</a>(<span class="str">&quot;foo&quot;</span>));
</code></pre>
<p>
Thus, <span class="code">grepFoo</span> will grep all words containing
<code class="csharpnet"><span class="str">&quot;foo&quot;</span></code>
from a wordlist.
Note that after the statement
</p>
<pre class="code">
<code class="csharpnet"><span class="kwd builtin">var</span> <span class="type">fooList</span> = <span class="type">grepFoo</span>(<span class="kwd builtin">new</span> <span class="kwd builtin">string</span>[]{<span class="str">&quot;foo&quot;</span>, <span class="str">&quot;bar&quot;</span>, <span class="str">&quot;foobar&quot;</span>});
</code></pre>
<p>
no regex has been applied yet.
Indeed, <code class="csharpnet"><span class="type">fooList</span></code>
is of type
<code class="csharpnet"><a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">string</span>&gt;</code>
and not yet enumerated at this point.
So the evaluation of the expression is deferred until its result is needed by another computation,
which smells like lazy evaluation.
</p>
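<p>
A minimal sketch to make the deferral visible. The <code>DeferredDemo</code> class, its <code>Calls</code> counter and the simplified <code>GrepFoo</code> (a plain substring test instead of a regex) are assumptions made for illustration, not part of the code above:
</p>
<pre class="code"><code class="csharpnet">using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    public static int Calls; // counts how often the predicate actually runs

    static bool ContainsFoo(string s)
    {
        Calls++;
        return s.Contains(&quot;foo&quot;);
    }

    // Builds the query only; nothing is filtered until someone enumerates it.
    public static IEnumerable&lt;string&gt; GrepFoo(IEnumerable&lt;string&gt; words)
    {
        return from s in words where ContainsFoo(s) select s;
    }

    public static void Main()
    {
        var fooList = GrepFoo(new[] { &quot;foo&quot;, &quot;bar&quot;, &quot;foobar&quot; });
        Console.WriteLine(Calls);            // 0: nothing has been evaluated yet
        Console.WriteLine(fooList.Count());  // 2: enumerating runs the predicate
        Console.WriteLine(Calls);            // 3: once per source element
        Console.WriteLine(fooList.Count());  // 2 again
        Console.WriteLine(Calls);            // 6: the whole query ran a second time
    }
}
</code></pre>
<p>
Note that the second <code>Count()</code> runs the predicate again for every word, so the query itself caches nothing.
</p>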

<h3>LINQ is not lazy!</h3>
<p>
One of the most important paradigms of functional programming is the concept of
<a href="http://en.wikipedia.org/wiki/Lazy_evaluation" target="_blank">lazy evaluation</a>.
For instance, in a functional language, such as the good old
<a href="http://haskell.org/" target="_blank">Haskell</a>,
an expression such as
</p>
<pre class="code"><code>length [1, 2, 3 `div` 0]
</code></pre>
<p>
evaluates to <span class="code">3</span>.
That is, the runtime is too lazy to fail on division by zero,
either at compile time or at run time, since it doesn't need to know any element
of the list in order to calculate its length.
In <em>C#</em> (where a constant expression such as <span class="code">1/0</span> does not even compile),
you may write
</p>
<pre class="code"><code class="csharpnet"><span class="kwd builtin">var</span> <span class="type">q1</span> = <span class="kwd builtin">from</span> <span class="type">i</span> <span class="kwd builtin">in</span> (<a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">int</span>&gt;)<span class="kwd builtin">new</span> <span class="kwd builtin">int</span>[] { <span class="num">1</span>, <span class="num">2</span>, <span class="num">3</span> }
         <span class="kwd builtin">select</span> <span class="num">1</span>/(<span class="type">i</span> - <span class="num">3</span>);
</code></pre>
<p>
without getting a run time error.
But this has nothing to do with lazy evaluation, since the query expression isn't evaluated at all at this point
(in contrast to the array definition inside the query), so the query expression is simply treated as a function definition.
However, as soon as an aggregation expression such as
</p>
<pre class="code"><code class="csharpnet"><span class="kwd builtin">int</span> <span class="type">three</span> = <span class="type">q1.Count</span>();
</code></pre>
<p>
is reached, a
<span class="code"><code class="csharpnet"><a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.dividebyzeroexception.aspx" target="_blank" rel="nofollow">DivideByZeroException</a></code></span>
will be thrown.
Thus, LINQ evaluates eagerly here, not lazily.
On the other hand,
</p>
<pre class="code"><code class="csharpnet"><span class="kwd builtin">int</span> <span class="type">two</span> = <span class="type">q1.Take</span>(<span class="num">2</span>).<span class="type">Count</span>();
</code></pre>
<p>
works fine, since the black hole stays unevaluated due to the <code>Take</code> operator.
But, having
</p>
<pre class="code"><code class="csharpnet"><span class="kwd builtin">var</span> <span class="type">q2</span> = <span class="kwd builtin">from</span> <span class="type">i</span> <span class="kwd builtin">in</span> (<a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">int</span>&gt;)<span class="kwd builtin">new</span> <span class="kwd builtin">int</span>[] { <span class="num">1</span>, <span class="num">2</span>, <span class="num">3</span> }
         <span class="kwd builtin">select</span> <span class="num">1</span>/(<span class="type">i</span> - <span class="num">1</span>);
<span class="kwd builtin">int</span> <span class="type">two2</span> = <span class="type">q2.Skip</span>(<span class="num">1</span>).<span class="type">Count</span>();
</code></pre>
<p>
instead, you will - guess what! - catch the exception again.
Thus, in contrast to the <span class="code"><code class="csharpnet"><span class="type">Take</span></code></span> operator,
the <span class="code"><code class="csharpnet"><span class="type">Skip</span></code></span> operator
does iterate through skipped elements and hence evaluates them.
Ok, that's no surprise, since these operators are using the
<code class="csharpnet"><a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerator.aspx" target="_blank" rel="nofollow">IEnumerator</a></code>
provided by the corresponding
<code class="csharpnet"><a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a></code>.
So, LINQ pretends to be lazy in the way that
</p>
<pre class="code"><code class="csharpnet"><span class="kwd builtin">var</span> <span class="type">p</span> = <span class="type">q2.Reverse</span>();
</code></pre>
<p>
won't be evaluated at this point and thus doesn't fail, whereas
</p>
<pre class="code"><code class="csharpnet"><span class="kwd builtin">int</span> <span class="type">two3</span> = <span class="type">p.Take</span>(<span class="num">2</span>).<span class="type">Count</span>();
</code></pre>
<p>
then throws the exception again, even though the evil element shouldn't be taken here:
<code>Reverse</code> has to buffer the whole sequence before it can yield its first element.
</p>
<p>
A functional approach to force laziness would be to replace value expressions with
constant functions, but the compiler won't accept something like this:
</p>
<pre class="code"><code class="csharpnet"><span class="cmnt">// The type of the expression in the select clause is incorrect.</span>
<span class="cmnt">// Type inference failed in the call to &#039;Select&#039;.</span>
<span class="kwd builtin">var</span> <span class="type">q1_</span> = <span class="kwd builtin">from</span> <span class="type">i</span> <span class="kwd builtin">in</span> (<a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">int</span>&gt;)<span class="kwd builtin">new</span> <span class="kwd builtin">int</span>[] { <span class="num">1</span>, <span class="num">2</span>, <span class="num">3</span> }
          <span class="kwd builtin">select</span> () =&gt; <span class="num">1</span> / (<span class="type">i</span> - <span class="num">3</span>);
</code></pre>
<p>
Hence, LINQ isn't lazy, but has a smart way to make function definitions
look like statement expressions.
</p>
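<p>
The inference error arises because a bare lambda has no type of its own. As a hedged sketch (the <code>LazyThunks</code> class and its <code>Thunks</code> helper are hypothetical names), an explicit cast to <code>Func&lt;int&gt;</code> should be enough to get the constant functions into the query:
</p>
<pre class="code"><code class="csharpnet">using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyThunks
{
    // Each element is wrapped in a constant function; the cast gives the lambda a type.
    public static IEnumerable&lt;Func&lt;int&gt;&gt; Thunks()
    {
        return from i in new int[] { 1, 2, 3 }
               select (Func&lt;int&gt;)(() =&gt; 1 / (i - 3));
    }

    public static void Main()
    {
        var thunks = Thunks();
        Console.WriteLine(thunks.Count());    // 3: counting never invokes a thunk
        Console.WriteLine(thunks.First()());  // 0: integer division 1 / (1 - 3)
        try
        {
            thunks.Last()();                  // only now is 1 / (3 - 3) evaluated
        }
        catch (DivideByZeroException)
        {
            Console.WriteLine(&quot;division by zero, but only on demand&quot;);
        }
    }
}
</code></pre>
<p>
The laziness here comes from the explicit functions, though, not from LINQ itself.
</p>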


<h3>Diving into recursion</h3>
<p>
Remember the famous
<a href="http://en.wikipedia.org/wiki/Fibonacci_number" target="_blank">Fibonacci numbers</a>:
</p>
<p class="quote">
<span class="math">fib<sub>0</sub> = 0, fib<sub>1</sub> = 1, fib<sub>n</sub> = fib<sub>n-1</sub> + fib<sub>n-2</sub>.</span>
</p>
<p>
The sequence starts with
</p>
<p class="quote">
<span class="math">fibs = 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ...</span>
</p>
<p>
where <span class="math">fibs<sub>100</sub></span> already has 21 digits, so the sequence grows quite fast.
Although one may calculate Fibonacci numbers in closed form using
<a href="http://mathworld.wolfram.com/BinetsFibonacciNumberFormula.html" target="_blank">Binet's formula</a>,
the definition leads to interesting comparisons of different recursion strategies.
</p>

<p>
Well, let's have a
</p>
<pre class="code"><code class="csharpnet"><span class="kwd builtin">delegate</span> <span class="kwd builtin">long</span> <span class="kwd def">Fibonacci</span>(<span class="kwd builtin">int</span> <span class="type">n</span>);
</code></pre>
<p>
A direct translation of the definition into a lambda recursion looks like this:
</p>
<pre class="code"><code class="csharpnet"><span class="kwd def">Fibonacci</span> <span class="type">fib1</span> = <span class="kwd builtin">null</span>; <span class="cmnt">// pre-assigned for use within recursion</span>
<span class="type">fib1</span> = <span class="type">n</span> =&gt; <span class="type">n</span> &lt;= <span class="num">1</span> ? <span class="type">n</span> : <span class="type">fib1</span>(<span class="type">n</span> - <span class="num">1</span>) + <span class="type">fib1</span>(<span class="type">n</span> - <span class="num">2</span>);
</code></pre>
<p>
The funny thing about this implementation is that the Fibonacci function itself determines its run time:
it's <span class="math">O(fib<sub>n</sub>)</span>, i.e. lower values will be
recalculated again and again in order to get a higher one, due to the lack of an aggregating strategy that would store intermediate results.
</p>
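<p>
One conceivable aggregating strategy is to cache each value once it is known. A minimal sketch, assuming a hypothetical <code>MemoFib</code> class with a dictionary cache (a static method is used instead of the lambda recursion above to keep it short):
</p>
<pre class="code"><code class="csharpnet">using System;
using System.Collections.Generic;

public static class MemoFib
{
    // Hypothetical cache: maps n to the n-th Fibonacci number once it is known.
    static readonly Dictionary&lt;int, long&gt; Cache = new Dictionary&lt;int, long&gt;();

    public static long Fib(int n)
    {
        if (n &lt;= 1) return n;
        long value;
        if (!Cache.TryGetValue(n, out value))
        {
            // Each n is computed at most once; later calls hit the cache.
            value = Fib(n - 1) + Fib(n - 2);
            Cache[n] = value;
        }
        return value;
    }

    public static void Main()
    {
        Console.WriteLine(Fib(50)); // 12586269025
    }
}
</code></pre>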

<p>
Now, in Haskell you may get around this very elegantly by defining an infinite list:
</p>
<pre class="code"><code>fibs = 0 : 1 : zipWith (+) fibs (tail fibs)
</code></pre>
<p>
The list is initialized with two elements.
Then, notionally, the <code>tail</code> function drops the first element of the <code>fibs</code> list,
while <code>zipWith (+)</code> creates a new list by adding the elements of
<code>fibs</code> and <code>(tail fibs)</code> pairwise.
But in practice, Haskell is smart and lazy enough to avoid any needless recalculation
of numbers already present in the <code>fibs</code> list.
Thus, the algorithm applied here is the same one a human being would apply spontaneously using a
pencil and a slip of paper. So, it's <span class="math">O(n)</span>.
</p>

<p>
To define an infinite list in C#, one should
implement the
<code class="csharpnet"><a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a></code>
interface in such a way that
the corresponding
<code class="csharpnet"><a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerator.aspx" target="_blank" rel="nofollow">IEnumerator</a></code>
expands the list on demand within its
<code class="csharpnet"><span class="type">MoveNext</span>()</code>
method.
Here, a little inline helper is enough,
mapping a list and an expanding function to a
<code class="csharpnet"><span class="kwd def">Fibonacci</span></code> type:
</p>
<pre class="code"><code class="csharpnet"><span class="kwd def">Func</span>&lt;
  <a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">long</span>&gt;,
  <span class="kwd def">Func</span>&lt;<a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">long</span>&gt;, <a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">long</span>&gt;&gt;,
  <span class="kwd def">Fibonacci</span>&gt; <span class="type">infList</span> = <span class="kwd builtin">null</span>;
<span class="type">infList</span> = (<span class="type">list</span>, <span class="type">exp</span>) =&gt; <span class="type">n</span> =&gt; <span class="type">n</span> &lt; <span class="type">list.Count</span>() ?
  <span class="type">list.Skip</span>(<span class="type">n</span>).<span class="type">First</span>() : <span class="type">infList</span>(<span class="type">exp</span>(<span class="type">list</span>), <span class="type">exp</span>)(<span class="type">n</span>);
</code></pre>
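<p>
For comparison, expanding the list on demand inside <code>MoveNext()</code> is what a C# iterator block generates. A minimal sketch, assuming a hypothetical <code>Fibs()</code> helper that is not part of the code above:
</p>
<pre class="code"><code class="csharpnet">using System;
using System.Collections.Generic;
using System.Linq;

public static class InfiniteFibs
{
    // Hypothetical helper: the compiler-generated MoveNext() computes one more element per call.
    public static IEnumerable&lt;long&gt; Fibs()
    {
        long a = 0, b = 1;
        while (true)
        {
            yield return a;
            long next = a + b; // wraps around silently for values beyond fib(92)
            a = b;
            b = next;
        }
    }

    public static void Main()
    {
        // Take(10) stops the otherwise endless enumeration after ten elements.
        Console.WriteLine(string.Join(&quot;, &quot;, Fibs().Take(10)));
        // 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
    }
}
</code></pre>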
<p>
Now, C# also provides a <code>Zip</code> function.
So, a simple syntactic translation of the Haskell list would look like this:
</p>
<pre class="code"><code class="csharpnet"><span class="kwd def">Func</span>&lt;<a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">long</span>&gt;, <a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">long</span>&gt;&gt; <span class="type">fibZip</span> = <span class="type">fibs</span> =&gt;
  <span class="type">fibs.Take</span>(<span class="num">2</span>).<span class="type">Concat</span>(<span class="type">fibs.Zip</span>(<span class="type">fibs.Skip</span>(<span class="num">1</span>), (<span class="type">x</span>, <span class="type">y</span>) =&gt; <span class="type">x</span> + <span class="type">y</span>));
</code></pre>
<p>
Hm, but this one is even worse than the naive recursion.
Indeed, trying
</p>
<pre class="code"><code class="csharpnet"><span class="kwd def">Fibonacci</span> <span class="type">fib2</span> = <span class="type">infList</span>(<span class="kwd builtin">new</span> <span class="kwd builtin">long</span>[] { <span class="num">0</span>, <span class="num">1</span> }, <span class="type">fibZip</span>);
</code></pre>
<p>
then, you will see that aggregation doesn't work at all this way: every enumeration
re-evaluates the whole chain of deferred queries, since the concept
of enumeration is not functional.
We may repair the <code>fibZip</code> as follows:
</p>
<pre class="code"><code class="csharpnet"><span class="kwd def">Func</span>&lt;<a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">long</span>&gt;, <a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">long</span>&gt;&gt; <span class="type">fibZip2</span> = <span class="type">fibs</span> =&gt;
  <span class="type">fibs.Concat</span>((<a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a>&lt;<span class="kwd builtin">long</span>&gt;)(<span class="kwd builtin">new</span> <span class="kwd builtin">long</span>[] {
    <span class="type">fibs.Skip</span>(<span class="type">fibs.Count</span>() - <span class="num">2</span>).<span class="type">Sum</span>() }));
</code></pre>
<p>
This one looks a bit weird, since it's not that easy to extend an
<code class="csharpnet"><a class="kwd def" href="http://msdn.microsoft.com/en-us/library/system.collections.ienumerable.aspx" target="_blank" rel="nofollow">IEnumerable</a></code>
by one element. Anyway,
</p>
<pre class="code"><code class="csharpnet"><span class="kwd def">Fibonacci</span> <span class="type">fib3</span> = <span class="type">infList</span>(<span class="kwd builtin">new</span> <span class="kwd builtin">long</span>[] { <span class="num">0</span>, <span class="num">1</span> }, <span class="type">fibZip2</span>);
</code></pre>
<p>
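indeed does the job with <span class="math">O(n)</span> additions, since each new element is computed exactly once from its two predecessors
(although every <code>Count()</code> and <code>Skip()</code> call still walks the list),
even though the idea of an infinite list has lost its magic this way.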
</p>

<h3>Conclusion</h3>
<p>
As expected, neither C# nor LINQ turns out to implement
the paradigms of a functional language.
Nonetheless, it's really fancy. 8-)
</p>
]]>
  
  </content>
</entry>

<entry>
  <title>a wordlist folding algorithm</title>
  <link rel="alternate" type="text/html" href="http://beta-blog.net/2009/11/a-wordlist-folding-algorithm" />
  <id>tag:beta-blog.net,2009://1.52384</id>

  <published>2009-11-28T23:33:32Z</published>
  <updated>2009-11-29T16:22:17Z</updated>

  <summary>Suppose you wish to match a large wordlist against a huge chunk of text. As a small test case, let
for, far, bar, foo, boofaz, boofar, boof, faz, foobaz, foobars, boofar
be your wordlist. Now, you may apply the corresponding regular expression:
But how would a regex engine implement the matching?
</summary>
  <author>
    <name>Sebastian</name>
    <uri>http://beta-blog.net</uri>
  </author>
  
  <category term="Perl" scheme="http://www.sixapart.com/ns/types#category" />
  
  <category term="algorithms" scheme="http://www.sixapart.com/ns/types#category" />
  
  <category term="codes" label="codes" scheme="http://www.sixapart.com/ns/types#tag" />
  <category term="perl" label="Perl" scheme="http://www.sixapart.com/ns/types#tag" />
  <category term="regex" label="regex" scheme="http://www.sixapart.com/ns/types#tag" />
  
  <content type="html" xml:lang="en" xml:base="http://beta-blog.net/">
  <![CDATA[<p>
Suppose you wish to match a large wordlist against a huge chunk of text.
As a small test case, let
</p>
<pre class="code">
for, far, bar, foo, boofaz, boofar, boof, faz, foobaz, foobars, boofar
</pre>
<p>
be your wordlist. Now, you may apply the corresponding regular expression:
</p>
<pre class="code">
(1) /\b(for|far|bar|foo|boofaz|boofar|boof|faz|foobaz|foobars|boofar)\b/
</pre>
<p>
But how would a regex engine implement the matching?
There are different options. The very worst algorithm would surely be to
look up every word separately in the whole text. That would be the same as
doing
</p>
<pre class="code"><code class="perl"><a class="kwd" href="http://perldoc.perl.org/functions/foreach.html" target="_blank" rel="nofollow">foreach</a> <span class="op ld">(</span><span class="qlo qw"><span class="kwd">qw</span><span class="op">(</span><span class="istr"> for far bar foo boofaz boofar boof faz foobaz foobars boofar </span><span class="op">)</span></span><span class="op rd">)</span>
<span class="op ld">{</span>
  <a class="kwd" href="http://perldoc.perl.org/functions/return.html" target="_blank" rel="nofollow">return</a> <span class="istr">&quot;matching!&quot;</span> <span class="kwd">if</span> <span class="var">$<span class="symb">text</span></span> =~ <span class="symb">m</span>/\<span class="symb">b</span><span class="var">$_</span>\<span class="symb">b</span>/<span class="op stmt">;</span>
<span class="op rd">}</span>
<a class="kwd" href="http://perldoc.perl.org/functions/return.html" target="_blank" rel="nofollow">return</a> <span class="istr">&quot;not matching.&quot;</span><span class="op stmt">;</span></code></pre>
<p>
Suppose you match <span class="math">m</span> words against a text consisting of <span class="math">n</span> letters:
this piece of coding horror would have an estimated runtime of <span class="math">O(m*n)</span>.
</p>

<p>
Now, a better approach would be to run only once through the text,
using a matching stack. Assuming that <span class="code">&quot; foobar &quot;</span> appears somewhere in
the text, the stack trace might look as follows (read from bottom to top):
</p>
<pre class="code">
[7] ' ' =&gt; nothing matches.
[6] 'r' =&gt; &quot;foobars&quot; might match.
[5] 'a' =&gt; &quot;foobaz&quot; or &quot;foobars&quot; might match.
[4] 'b' =&gt; &quot;foobaz&quot; or &quot;foobars&quot; might match.
[3] 'o' =&gt; &quot;foo&quot;, &quot;foobaz&quot;, or &quot;foobars&quot; might match.
[2] 'o' =&gt; &quot;for&quot;, &quot;foo&quot;, &quot;foobaz&quot;, or &quot;foobars&quot; might match.
[1] 'f' =&gt; &quot;for&quot;, &quot;far&quot;, &quot;foo&quot;, &quot;faz&quot;, &quot;foobaz&quot;, or &quot;foobars&quot; might match.
[0] ' ' =&gt; &quot;\b&quot; matches.
</pre>
<p>
But what if the wordlist gets large? It seems that we would have to run through nearly
the whole list each time a character is pushed onto the stack in order to
find out whether the current stack contents may still be matched or not.
</p>

<p>
Clearly, a considerable optimization would be to sort the word list
in advance. Moreover, instead of looking up one item after another,
a really smart approach would be to walk down a search tree.
As a tree, the wordlist above would look like this:
</p>
<pre class="code">
          _____________|_____________
          |                         |
          b                         f
    ______|______        ___________|___________
    |           |        |                     |
   oof          ar       a                     o
    |                 ___|___            ______|______
    a ?               |     |            |           |
 ___|___              r     z            o           r
 |     |                                 |
 r     z                                 ba ?
                                     ____|____
                                     |       |
                                     rs      z
</pre>
<p>
Here, the &quot;?&quot; denotes an optional node. Remember that the length of the path down
such a tree is, as long as the tree is reasonably balanced, logarithmic in the number of nodes. Thus, loosely speaking,
we have improved the worst algorithm above to <span class="math">O(n*log(m))</span> at least.
</p>
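<p>
The walk down the tree can be sketched as follows, in C# rather than Perl. This is only an illustration under assumptions: the <code>TrieNode</code> and <code>TrieDemo</code> names are hypothetical, each node keeps its children in a dictionary instead of a sorted list, and the nodes hold single characters rather than folded prefixes such as &quot;oof&quot;:
</p>
<pre class="code"><code class="csharpnet">using System;
using System.Collections.Generic;

// Hypothetical trie node: one child per possible next character.
public class TrieNode
{
    public readonly Dictionary&lt;char, TrieNode&gt; Children = new Dictionary&lt;char, TrieNode&gt;();
    public bool IsWord; // true if the path from the root spells a complete word
}

public static class TrieDemo
{
    public static TrieNode Build(IEnumerable&lt;string&gt; words)
    {
        var root = new TrieNode();
        foreach (var word in words)
        {
            var node = root;
            foreach (var c in word)
            {
                TrieNode next;
                if (!node.Children.TryGetValue(c, out next))
                {
                    next = new TrieNode();
                    node.Children[c] = next;
                }
                node = next;
            }
            node.IsWord = true;
        }
        return root;
    }

    // Walks down the trie once, one step per character of the token.
    public static bool Contains(TrieNode root, string token)
    {
        var node = root;
        foreach (var c in token)
        {
            if (!node.Children.TryGetValue(c, out node)) return false;
        }
        return node.IsWord;
    }

    public static void Main()
    {
        var root = Build(&quot;for far bar foo boofaz boofar boof faz foobaz foobars boofar&quot;.Split(' '));
        Console.WriteLine(Contains(root, &quot;foobars&quot;)); // True
        Console.WriteLine(Contains(root, &quot;foobar&quot;));  // False
    }
}
</code></pre>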
<p>
Actually I'm not sure whether regex engines would apply optimizations like that
when compiling. I guess they do, so it might be needless to replace the regex <span class="code">(1)</span> above
by the optimized version, implementing the sorted tree of alternative and optional nodes:
</p>
<pre class="code">
(2) /\b(b(?:ar|oof(?:a(?:r|z))?)|f(?:a(?:r|z)|o(?:o(?:ba(?:rs|z))?|r)))\b/
</pre>
<p>
Nevertheless, I couldn't help creating a little Perl routine that folds a wordlist into an
optimized regex. Now, here it is:
</p>
<fieldset class="collapsible"><legend><a href="javascript:void(0)" id="collapsible_awekcvse_6">[-] hide code</a></legend><div class="collapsible-container"><pre class="code"><code class="perl"><a class="kwd" href="http://perldoc.perl.org/functions/sub.html" target="_blank" rel="nofollow">sub</a> <span class="symb">foldWordsToRegex</span> <span class="op ld">{</span>

  <a class="kwd" href="http://perldoc.perl.org/functions/local.html" target="_blank" rel="nofollow">local</a> *<span class="symb">toString</span> = <a class="kwd" href="http://perldoc.perl.org/functions/sub.html" target="_blank" rel="nofollow">sub</a> <span class="op ld">{</span>
    <span class="cmnt">## node: [ prefix, [ nodes ], opt ]</span>

    <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="op ld">(</span><span class="var">$<span class="symb">prefix</span></span>, <span class="var">$<span class="symb">nodes</span></span>, <span class="var">$<span class="symb">opt</span></span><span class="op rd">)</span> = <span class="var">$<span class="op ld">{</span><span class="var">$_<span class="op ld">[</span>0<span class="op rd">]</span></span><span class="op rd">}</span></span><span class="op stmt">;</span>
    <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="var">$<span class="symb">rv</span></span> = <a class="kwd" href="http://perldoc.perl.org/functions/quotemeta.html" target="_blank" rel="nofollow">quotemeta</a> <span class="var">$<span class="symb">prefix</span></span><span class="op stmt">;</span>
    <span class="kwd">if</span> <span class="op ld">(</span> <a class="kwd" href="http://perldoc.perl.org/functions/ref.html" target="_blank" rel="nofollow">ref</a> <span class="var">$<span class="symb">nodes</span></span> <span class="symb">eq</span> <span class="qlo q"><span class="kwd">q</span><span class="op">|</span><span class="str">ARRAY</span><span class="op">|</span></span> &amp;&amp; <span class="var">@$<span class="symb">nodes</span></span> <span class="op rd">)</span>
    <span class="op ld">{</span>
      <span class="var">$<span class="symb">rv</span></span> .= <span class="str">&#039;(?:&#039;</span>.<span class="op ld">(</span><a class="kwd" href="http://perldoc.perl.org/functions/join.html" target="_blank" rel="nofollow">join</a> <span class="str">&#039;|&#039;</span>, <a class="kwd" href="http://perldoc.perl.org/functions/map.html" target="_blank" rel="nofollow">map</a> <span class="op ld">{</span> <span class="symb">toString</span><span class="op ld">(</span><span class="var">$_</span><span class="op rd">)</span> <span class="op rd">}</span> <span class="var">@$<span class="symb">nodes</span></span><span class="op rd">)</span>.<span class="str">&#039;)&#039;</span><span class="op stmt">;</span>
      <span class="var">$<span class="symb">rv</span></span> .= <span class="str">&#039;?&#039;</span> <span class="kwd">if</span> <span class="var">$<span class="symb">opt</span></span><span class="op stmt">;</span>
    <span class="op rd">}</span>
    <span class="var">$<span class="symb">rv</span></span><span class="op stmt">;</span>
  <span class="op rd">}</span><span class="op stmt">;</span>

  <a class="kwd" href="http://perldoc.perl.org/functions/local.html" target="_blank" rel="nofollow">local</a> *<span class="symb">fold</span> = <a class="kwd" href="http://perldoc.perl.org/functions/sub.html" target="_blank" rel="nofollow">sub</a><span class="op ld">(</span><span class="var">@_</span><span class="op rd">)</span> <span class="op ld">{</span>

    <a class="kwd" href="http://perldoc.perl.org/functions/sub.html" target="_blank" rel="nofollow">sub</a> <span class="symb">reduce</span><span class="op ld">(</span>$<span class="op rd">)</span><span class="op stmt">;</span>
    <a class="kwd" href="http://perldoc.perl.org/functions/local.html" target="_blank" rel="nofollow">local</a> *<span class="symb">reduce</span> = <a class="kwd" href="http://perldoc.perl.org/functions/sub.html" target="_blank" rel="nofollow">sub</a> <span class="op ld">{</span>
      <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="op ld">(</span><span class="var">$<span class="symb">prefix</span></span>, <span class="var">$<span class="symb">nodes</span></span>, <span class="var">$<span class="symb">opt</span></span><span class="op rd">)</span> = <span class="var">$<span class="op ld">{</span><span class="var">$_<span class="op ld">[</span>0<span class="op rd">]</span></span><span class="op rd">}</span></span><span class="op stmt">;</span>

      <a class="kwd" href="http://perldoc.perl.org/functions/return.html" target="_blank" rel="nofollow">return</a> <span class="var">$_<span class="op ld">[</span>0<span class="op rd">]</span></span> <span class="kwd">unless</span> <a class="kwd" href="http://perldoc.perl.org/functions/ref.html" target="_blank" rel="nofollow">ref</a> <span class="var">$<span class="symb">nodes</span></span> <span class="symb">eq</span> <span class="qlo q"><span class="kwd">q</span><span class="op">|</span><span class="str">ARRAY</span><span class="op">|</span></span> &amp;&amp; <span class="var">@$<span class="symb">nodes</span></span> &gt; <span class="num">1</span><span class="op stmt">;</span>

      <span class="cmnt">## 1st char of the prefix of 1st node in list</span>
      <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="op ld">(</span><span class="var">$<span class="symb">c</span></span>, <span class="var">$<span class="symb">qc</span></span><span class="op rd">)</span><span class="op stmt">;</span>

      <span class="cmnt">## check whether 2nd prefix starts with same letter as the 1st</span>
      <span class="kwd">if</span> <span class="op ld">(</span> <a class="kwd" href="http://perldoc.perl.org/functions/length.html" target="_blank" rel="nofollow">length</a> <span class="var">$<span class="symb">nodes</span><span class="op ptr">-&gt;</span><span class="op ld">[</span>0<span class="op rd">]</span><span class="op ld">[</span>0<span class="op rd">]</span></span> <span class="op rd">)</span>
      <span class="op ld">{</span>
        <span class="var">$<span class="symb">c</span></span> = <a class="kwd" href="http://perldoc.perl.org/functions/substr.html" target="_blank" rel="nofollow">substr</a> <span class="var">$<span class="symb">nodes</span><span class="op ptr">-&gt;</span><span class="op ld">[</span>0<span class="op rd">]</span><span class="op ld">[</span>0<span class="op rd">]</span></span>, <span class="num">0</span>, <span class="num">1</span><span class="op stmt">;</span>
        <span class="var">$<span class="symb">qc</span></span> = <a class="kwd" href="http://perldoc.perl.org/functions/quotemeta.html" target="_blank" rel="nofollow">quotemeta</a> <span class="var">$<span class="symb">c</span></span><span class="op stmt">;</span>
        <span class="var">$<span class="symb">nodes</span><span class="op ptr">-&gt;</span><span class="op ld">[</span>1<span class="op rd">]</span><span class="op ld">[</span>0<span class="op rd">]</span></span> =~ <span class="symb">m</span>/^<span class="var">$<span class="symb">qc</span></span>/ <span class="kwd">or</span> <a class="kwd" href="http://perldoc.perl.org/functions/undef.html" target="_blank" rel="nofollow">undef</a> <span class="var">$<span class="symb">c</span></span><span class="op stmt">;</span>
      <span class="op rd">}</span>

      <span class="kwd">unless</span> <span class="op ld">(</span> <a class="kwd" href="http://perldoc.perl.org/functions/defined.html" target="_blank" rel="nofollow">defined</a> <span class="var">$<span class="symb">c</span></span> <span class="op rd">)</span>
      <span class="op ld">{</span>
        <a class="kwd" href="http://perldoc.perl.org/functions/return.html" target="_blank" rel="nofollow">return</a> <span class="var">$_<span class="op ld">[</span>0<span class="op rd">]</span></span> <span class="kwd">unless</span> <span class="var">@$<span class="symb">nodes</span></span> &gt; <span class="num">2</span><span class="op stmt">;</span>

        <span class="cmnt">## try to reduce next list part</span>
        <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="var">$<span class="symb">first</span></span> = <a class="kwd" href="http://perldoc.perl.org/functions/shift.html" target="_blank" rel="nofollow">shift</a> <span class="var">@$<span class="symb">nodes</span></span><span class="op stmt">;</span>
        <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="var">$<span class="symb">next</span></span> = <span class="symb">reduce</span> <span class="op ld">[</span><span class="str">&#039;&#039;</span>, <span class="var">$<span class="symb">nodes</span></span>, <span class="num">0</span><span class="op rd">]</span><span class="op stmt">;</span>
        <a class="kwd" href="http://perldoc.perl.org/functions/return.html" target="_blank" rel="nofollow">return</a> <span class="op ld">[</span> <span class="var">$<span class="symb">prefix</span></span>, <span class="op ld">[</span> <span class="var">$<span class="symb">first</span></span>, <span class="var">$<span class="symb">next</span></span> <span class="op rd">]</span>, <span class="var">$<span class="symb">opt</span></span><span class="op rd">]</span> <span class="kwd">if</span> <a class="kwd" href="http://perldoc.perl.org/functions/length.html" target="_blank" rel="nofollow">length</a> <span class="var">$<span class="symb">next</span><span class="op ptr">-&gt;</span><span class="op ld">[</span>0<span class="op rd">]</span></span><span class="op stmt">;</span>

        <span class="cmnt">## couldn&#039;t be reduced</span>
        <a class="kwd" href="http://perldoc.perl.org/functions/return.html" target="_blank" rel="nofollow">return</a> <span class="op ld">[</span> <span class="var">$<span class="symb">prefix</span></span>, <span class="op ld">[</span> <span class="var">$<span class="symb">first</span></span>, <span class="var">$<span class="op ld">{</span><span class="var">$<span class="symb">next</span><span class="op ptr">-&gt;</span><span class="op ld">[</span>1<span class="op rd">]</span></span><span class="op rd">}</span></span> <span class="op rd">]</span>, <span class="var">$<span class="symb">opt</span></span> <span class="op rd">]</span><span class="op stmt">;</span>
      <span class="op rd">}</span>

      <span class="cmnt">## reduce any ensuing node whose prefix starts with $c</span>
      <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="var">@<span class="symb">new</span></span><span class="op stmt">;</span>
      <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="var">$<span class="symb">newopt</span></span> = <span class="num">0</span><span class="op stmt">;</span>
      <a class="kwd" href="http://perldoc.perl.org/functions/while.html" target="_blank" rel="nofollow">while</a> <span class="op ld">(</span> <span class="var">@$<span class="symb">nodes</span></span> <span class="op rd">)</span>
      <span class="op ld">{</span>
        <span class="var">$<span class="symb">nodes</span><span class="op ptr">-&gt;</span><span class="op ld">[</span>0<span class="op rd">]</span><span class="op ld">[</span>0<span class="op rd">]</span></span> =~ <span class="symb">s</span>/^<span class="var">$<span class="symb">qc</span></span>// <span class="kwd">or</span> <a class="kwd" href="http://perldoc.perl.org/functions/last.html" target="_blank" rel="nofollow">last</a><span class="op stmt">;</span>

        <span class="cmnt">## reduce node or detect new optional node</span>
        <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="var">$<span class="symb">n</span></span> = <a class="kwd" href="http://perldoc.perl.org/functions/shift.html" target="_blank" rel="nofollow">shift</a> <span class="var">@$<span class="symb">nodes</span></span><span class="op stmt">;</span>
        <span class="kwd">if</span> <span class="op ld">(</span> <a class="kwd" href="http://perldoc.perl.org/functions/length.html" target="_blank" rel="nofollow">length</a> <span class="var">$<span class="symb">n</span><span class="op ptr">-&gt;</span><span class="op ld">[</span>0<span class="op rd">]</span></span> <span class="op rd">)</span>
        <span class="op ld">{</span>
          <a class="kwd" href="http://perldoc.perl.org/functions/push.html" target="_blank" rel="nofollow">push</a> <span class="var">@<span class="symb">new</span></span>, <span class="var">$<span class="symb">n</span></span><span class="op stmt">;</span>
          <a class="kwd" href="http://perldoc.perl.org/functions/next.html" target="_blank" rel="nofollow">next</a><span class="op stmt">;</span>
        <span class="op rd">}</span>
        <span class="var">$<span class="symb">newopt</span></span> = <span class="num">1</span><span class="op stmt">;</span>
      <span class="op rd">}</span>

      <span class="kwd">if</span> <span class="op ld">(</span> <span class="var">@$<span class="symb">nodes</span></span> || <span class="var">$<span class="symb">opt</span></span> <span class="op rd">)</span>
      <span class="op ld">{</span>
        <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="var">$<span class="symb">new</span></span> = <span class="symb">reduce</span> <span class="op ld">[</span> <span class="var">$<span class="symb">c</span></span>, <span class="op ld">[</span> <span class="var">@<span class="symb">new</span></span> <span class="op rd">]</span>, <span class="var">$<span class="symb">newopt</span></span> <span class="op rd">]</span><span class="op stmt">;</span>
        <span class="kwd">if</span> <span class="op ld">(</span> <span class="var">@$<span class="symb">nodes</span></span> <span class="op rd">)</span>
        <span class="op ld">{</span>
          <span class="cmnt">## reduce remaining nodes</span>
          <a class="kwd" href="http://perldoc.perl.org/functions/my.html" target="_blank" rel="nofollow">my</a> <span class="var">$<span class="symb">next</span></span> = <span class="symb">reduce</span> <span class="op ld">[</span><span class="str">&#039;&#039;</span>, <span class="var">$<span class="symb">nodes</span></span>, <span class="num">0</span><span class="op rd">]</span><span class="op stmt">;</span>
          <a class="kwd" href="http://perldoc.perl.org/functions/return.html" target="_blank" rel="nofollow">return</a> <span class="op ld">[</span> <span class="var">$<span class="symb">prefix</span></span>, <span class="op ld">[</span> <span class="var">$<span class="symb">new</span></span>, <span class="var">$<span class="symb">next</span></span> <span class="op rd">]</span>, <span class="var">$<span class="symb">opt</span></span><span class="op rd">]</span> <span class="kwd">if</span> <a class="kwd" href="http://perldoc.perl.org/functions/length.html" target="_blank" rel="nofollow">length</a> <span class="var">$<span class="symb">next</span><span class="op ptr">-&gt;</span><span class="op ld">[</span>0<span class="op rd">]</span></span><span class="op stmt">;</span>

          <span class="cmnt">## couldn&#039;t be reduced</span>
          <a class="kwd" href="http://perldoc.perl.org/functions/return.html" target="_blank" rel="nofollow">return</a> <span class="op ld">[</span> <span class="var">$<span class="symb">prefix</span></span>, <span class="op ld">[</span> <span class="var">$<span class="symb">new</span></span>, <span class="var">$<span class="op ld">{</span><span class="var">$<span class="symb">next</span><span class="op ptr">-&gt;</span><span class="op ld">[</span>1<span class="op rd">]</span></span><span class="op rd">}</span></span> <span class="op rd">]</span>, <span class="var">$<span class="symb">opt</span></span> <span class="op rd">]</span><span class="op stmt">;</span>
        <span class="op rd">}</span>

        <span class="cmnt">## current node is optional</span>
        <a class="kwd" href="http://perldoc.perl.org/functions/return.html" target="_blank" rel="nofollow">return</a> <span class="op ld">[</span> <span class="var">$<span class="symb">prefix</span></span>, <span class="op ld">[</span> <span class="var">$<span class="symb">new</span></span>, <span class="var">@$<span class="symb">nodes</span></span> <span class="op rd">]</span>, <span class="var">$<span class="symb">opt</span></span> <span class="op rd">]</span><span class="op stmt">;</span>
      <span class="op rd">}</span>

      <span class="cmnt">## nothing left to reduce</span>
      <span class="symb">reduce</span> <span class="op ld">[</span> <span class="var">$<span class="symb">prefix</span></span>.<span class="var">$<span class="symb">c</span></span>, <span class="op ld">[</span> <span class="var">@<span class="symb">new</span></span> <span class="op rd">]</span>, <span class="var">$<span class="symb">newopt</span></span> <span class="op rd">]</span><span class="op stmt">;</span>
    <span class="op rd">}</span><span class="op stmt">;</span>

    <span class="symb">reduce</span> <span class="op ld">[</span> <span class="str">&#039;&#039;</span>, <span class="op ld">[</span><span class="op ld">(</span> <a class="kwd" href="http://perldoc.perl.org/functions/map.html" target="_blank" rel="nofollow">map</a> <span class="op ld">{</span> <span class="op ld">[</span><span class="var">$_</span><span class="op rd">]</span> <span class="op rd">}</span> <a class="kwd" href="http://perldoc.perl.org/functions/sort.html" target="_blank" rel="nofollow">sort</a> <span class="var">@_</span> <span class="op rd">)</span><span class="op rd">]</span>, <span class="num">0</span><span class="op rd">]</span><span class="op stmt">;</span>
  <span class="op rd">}</span><span class="op stmt">;</span>

  <span class="symb">toString</span><span class="op ld">(</span><span class="symb">fold</span><span class="op ld">(</span><span class="var">@_</span><span class="op rd">)</span><span class="op rd">)</span><span class="op stmt">;</span>
<span class="op rd">}</span><span class="op stmt">;</span></code></pre></div></fieldset><script type="text/javascript">/*<![CDATA[*/xLib.onLoad(function(){Blog.Collapsible.create('collapsible_awekcvse_6')})/*]]&gt;*/</script>
<p>
Well, not so easy, but it works :)
</p>
<p>
Here, the inner function <span class="code">fold</span> builds the actual tree through its recursive helper
<span class="code">reduce</span>. Each node is an array consisting of a prefix, a list of subnodes and a flag
that marks the subnodes as optional.
The other inner function, <span class="code">toString</span>, then creates the actual regular
expression string from that tree.
So, for instance, calling
</p>
<fieldset class="collapsible"><legend><a href="javascript:void(0)" id="collapsible_awekcvse_7">[-] hide code</a></legend><div class="collapsible-container"><pre class="code"><code class="perl">&amp;<span class="symb">foldWordsToRegex</span><span class="op ld">(</span><span class="qlo qw"><span class="kwd">qw</span><span class="op">(</span><span class="istr"> for far bar foo boofaz boofar boof faz foobaz foobars boofar </span><span class="op">)</span></span><span class="op rd">)</span></code></pre></div></fieldset><script type="text/javascript">/*<![CDATA[*/xLib.onLoad(function(){Blog.Collapsible.create('collapsible_awekcvse_7')})/*]]&gt;*/</script>
<p>
would return the regex <span class="code">(2)</span>.
</p>
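<p>
To make the node layout a bit more tangible, here is a minimal standalone sketch of the rendering step.
The name <span class="code">render</span> and the hand-built tree are my assumptions for illustration only;
the tree was written by hand for the words boof, boofar and boofaz and not captured from a real run of
<span class="code">fold</span>.
</p>
<pre class="code"><code class="perl">use strict;
use warnings;

# node: [ prefix, [ subnodes ], opt ]  (same layout as in foldWordsToRegex)
sub render {
    my ($prefix, $nodes, $opt) = @{ $_[0] };
    my $rv = quotemeta $prefix;
    if ( ref $nodes eq 'ARRAY' and @$nodes ) {
        $rv .= '(?:' . join( '|', map { render($_) } @$nodes ) . ')';
        $rv .= '?' if $opt;    # the whole group of subnodes may be absent
    }
    return $rv;
}

# hand-built tree for boof, boofar and boofaz (an assumption, see above)
my $tree = [ 'boof', [ [ 'a', [ ['r'], ['z'] ], 0 ] ], 1 ];

print render($tree), "\n";
</code></pre>
<p>
This prints <span class="code">boof(?:a(?:r|z))?</span>, which has the same shape as the boof branch of
<span class="code">(2)</span>: the flag on the outer node is what turns the trailing group into an optional one.
</p>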
]]>
  
  </content>
</entry>

</feed>

