Regular Expressions And Generating Functions

We'll discuss about the number of ways to generate a string of length n, matching or avoiding certain patterns as its substring.

This can be easily achieved deriving an unambiguous regular expression (RE) by constructing a minimal Deterministic Finite Automaton (DFA), and then the ordinary Generating Function (GF) whose coefficients are the number of n-letter strings having the pattern.

Let's look at a few examples, for the strings constructed over the set of symbols {0,1} such that

1. There are no two consecutive 1's

The RE is: (0+10)(1+ϵ)

Then, to get its GF, replace each symbol with x, and ϵ with 1. Hence, the GF for the RE is

G1(x)=11xx2(1+x)=n0((53)(15)n+(5+3)(1+5)n)52n+1xn=n0Fn+2xn

where Fn is the fibonacci number

2. At least one pair of consecutive 1's as its substring

RE: (0+10)11(0+1)

G2(x)=11xx2x2112x=n0(((53)((15)n)(3+5)(1+5)n)52n+1+2n)xn=n0(2nFn+2)xn

which can also be confirmed from the first example by taking its complement, since the number of possible words are 2n

3. At least one 01 as its substring

RE: 1001(0+1)

G3(x)=x2(12x)(1x)2=n0(2nn1)xn

For more, refer Analytic Combinatorics, which is a treatise on Generating Functions.