1. Introduction
 
The problem of pattern generation was first investigated by [1] and then extended by [2] and [3]. The idea of playing patterns against each other was then explored by [4], who considered only the case of $p=\tfrac{1}{2}$ and all m patterns being of the same length, and generalized by [5] to any admissible value of p and patterns of arbitrary lengths. This article summarizes and streamlines these results, pointing out that the existing formulas fail when m becomes large (as symbolic inversion of matrices is no longer feasible) and resolving this problem by deriving (with the help of the Sherman-Morrison identity) formulas which require numerical evaluation only.
 
The main mathematical tool of this article is a skillful application of generating functions; a generating function of a sequence, say $a_0,a_1,a_2,a_3,\ldots$, is defined as $A(z)=\sum_{n=0}^{\infty}a_n z^{n}$. It is easy to see that a product of two such generating functions, say $A(z)\cdot B(z)$, corresponds to a new sequence (called a convolution of the two sequences) with terms $a_0b_0$, $a_0b_1+a_1b_0$, $a_0b_2+a_1b_1+a_2b_0$, $a_0b_3+a_1b_2+a_2b_1+a_3b_0$, $\ldots$
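As a small illustration of the convolution property (the two short sequences are chosen arbitrarily),
$$\left(1+2z+3z^{2}\right)\left(4+5z\right)=4+13z+22z^{2}+15z^{3},$$
whose coefficients are indeed $1\cdot4$, $1\cdot5+2\cdot4$, $2\cdot5+3\cdot4$ and $3\cdot5$.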
 
When a sequence $a_0,a_1,a_2,\ldots$ is defined by $a_n=\Pr(Z=n)$, where Z is an integer-valued random variable, the corresponding generating function, say $P(z)$, is called a probability generating function (PGF); to find the expected value of Z, denoted $\mathbb{E}(Z)$ or $\mu$, one needs to evaluate $P'(z)$ at $z=1$. Similarly, $P''(1)$ yields the second factorial moment of Z, namely $\mathbb{E}\big(Z(Z-1)\big)$, which can be easily converted into the variance of Z, thus equal to $P''(1)+P'(1)-P'(1)^{2}$.
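For example, when Z is the number of trials needed to obtain the first S (a geometric distribution), $P(z)=\frac{pz}{1-qz}$, so that $P'(1)=\frac{1}{p}$ and $P''(1)=\frac{2q}{p^{2}}$, making the variance equal to $\frac{2q}{p^{2}}+\frac{1}{p}-\frac{1}{p^{2}}=\frac{q}{p^{2}}$.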
 
2. Generating Patterns
 
We first review the technique for constructing a probability generating function (PGF) of the number of trials to generate one such pattern (of specific consecutive outcomes, such as FFSF) for the first time. This requires introducing the following notation: for any pattern $\mathcal{P}$, let $\mathcal{P}_{(k)}$ be the first k symbols of $\mathcal{P}$ and $\mathcal{P}^{(k)}$ be its last k symbols, while $\Pi(\mathcal{P})$ is the probability of generating $\mathcal{P}$ in $\ell$ trials, where $\ell$ is the length of $\mathcal{P}$, e.g. $\Pi(\mathrm{FFSF})=p\,q^{3}$ (p and q being the single-trial probabilities of S and F, respectively).
 
For any two patterns, say $\mathcal{A}$ and $\mathcal{B}$, $\mathcal{K}_{\mathcal{A}\mathcal{B}}$ is a set of all positive integers k such that $\mathcal{A}^{(k)}\equiv\mathcal{B}_{(k)}$, where $\equiv$ indicates a perfect match of the corresponding strings; note that this set may be empty. Let $k_{\mathcal{A}\mathcal{B}}$ be the largest of these integers (equal to 0 when the set is empty; also: $k_{\mathcal{A}\mathcal{B}}$ cannot be bigger than the length of either pattern), and $\mathcal{A}\odot\mathcal{B}$ be the corresponding overlap string $\mathcal{A}^{(k_{\mathcal{A}\mathcal{B}})}\equiv\mathcal{B}_{(k_{\mathcal{A}\mathcal{B}})}$ (a string of zero length when $k_{\mathcal{A}\mathcal{B}}=0$
). Thus, for example, when $\mathcal{A}=\mathrm{SFFSFFF}$ and $\mathcal{B}=\mathrm{FFSFFFS}$, we get $k_{\mathcal{A}\mathcal{B}}=6$ and $\mathcal{A}\odot\mathcal{B}=\mathrm{FFSFFF}$, and $k_{\mathcal{B}\mathcal{A}}=1$ and $\mathcal{B}\odot\mathcal{A}=\mathrm{S}$. Furthermore, $\ell\in\mathcal{K}_{\mathcal{A}\mathcal{A}}$ and $k_{\mathcal{A}\mathcal{A}}=\ell$, implying that $\mathcal{A}\odot\mathcal{A}\equiv\mathcal{A}$ for any pattern. The $\odot$ operation is easily performed by the following Mathematica code
 
 
  (1)
 
where each pattern needs to be represented by a list of zeros (for every F) and ones (for every S); to find $\mathcal{A}\odot\mathcal{B}$ of the last example thus requires typing
 
 
  (2)
 
to obtain {0, 0, 1, 0, 0, 0}, which represents FFSFFF.
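A function along the following lines performs the task (a minimal sketch only: the name odot and its conventions are illustrative, with each pattern passed as the 0/1 list just described):

  (* returns the longest string which is both a suffix of pattern a and a prefix of pattern b *)
  odot[a_List, b_List] := Module[{ks},
    ks = Select[Range[Min[Length[a], Length[b]]],
      Take[a, -#] === Take[b, #] &];
    If[ks === {}, {}, Take[b, Max[ks]]]]

  odot[{1, 0, 0, 1, 0, 0, 0}, {0, 0, 1, 0, 0, 0, 1}]   (* -> {0, 0, 1, 0, 0, 0}, i.e. FFSFFF *)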
 
Consider generating an infinite (in principle) sequence of Bernoulli trials containing repeated occurrences of a specific pattern, say $\mathcal{P}$. Note that it is assumed that consecutive occurrences are not allowed to overlap, i.e. upon a completion of the pattern, none of its symbols can be used to help generate its next occurrence (put differently, each new occurrence needs to be built from scratch).
 
Let $f_n$ be the probability that the first occurrence of $\mathcal{P}$ is completed at Trial n, while $g_n$ is the probability that $\mathcal{P}$ is completed (from scratch, with no help from a potential previous occurrence, but not necessarily for the first time) at Trial n. Using the total-probability formula (TPF), we can partition $g_n$ based on where the first occurrence of $\mathcal{P}$ has happened, thus getting

$$g_n=\sum_{k=1}^{n}f_k\,g_{n-k}\tag{3}$$

Note that this identity is correct for all positive integers n (not when $n=0$), with the understanding that $g_0=1$ and $f_0=0$ (assumed from now on).
 
Multiplying each side of (3) by $z^{n}$ and summing over n from 1 to $\infty$ yields

$$G(z)-1=F(z)\,G(z)\tag{4}$$

where $F(z)=\sum_{n=1}^{\infty}f_n z^{n}$ is the probability generating function (PGF) of the $f_n$ sequence while $G(z)=\sum_{n=0}^{\infty}g_n z^{n}$ is the generating function of the $g_n$ sequence. This follows from the fact that the RHS of (3) represents the nth term of a convolution of the two sequences; note that the summation missed the $n=0$ term on the LHS but (since $f_0=0$) no term on the RHS; subtracting 1 from $G(z)$ on the LHS of (4) corrects for this discrepancy.
 
The last equation implies that

$$F(z)=1-\frac{1}{G(z)}\tag{5}$$

The importance of this formula rests in a relatively simple way (described shortly) of finding $G(z)$; (5) then easily converts it into the (otherwise inaccessible) $F(z)$.
 
To find $G(z)$, we apply the TPF to the probability of the consecutive outcomes which define $\mathcal{P}$ being completed at Trial n (due to the non-overlapping rule, this may not constitute a legitimate occurrence of Pattern $\mathcal{P}$). Partitioning this (trivial) probability according to where a legitimate occurrence of $\mathcal{P}$ may have been completed yields

$$\Pi(\mathcal{P})=\sum_{k\in\mathcal{K}_{\mathcal{P}\mathcal{P}}}g_{n-\ell+k}\,\Pi\left(\mathcal{P}^{(\ell-k)}\right)\qquad\quad n\ge\ell\tag{6}$$

where $\ell$ is the length of $\mathcal{P}$ and $\mathcal{K}_{\mathcal{P}\mathcal{P}}$ is the self-overlap set of $\mathcal{P}$ defined above (it always contains $k=\ell$), with the understanding that $\Pi\left(\mathcal{P}^{(0)}\right)=1$. Multiplying each side of the last equation by $z^{n}$ and summing over n, from $\ell$ to $\infty$, results in

$$\frac{\Pi(\mathcal{P})\,z^{\ell}}{1-z}=\big(G(z)-1\big)\sum_{k\in\mathcal{K}_{\mathcal{P}\mathcal{P}}}\Pi\left(\mathcal{P}^{(\ell-k)}\right)z^{\ell-k}\tag{7}$$

since $g_0=1$ and $g_j=0$ whenever $0<j<\ell$.
 
Based on (5), this further implies that

$$F(z)=1-\frac{1}{G(z)}=\frac{1}{1+\dfrac{(1-z)\sum_{k\in\mathcal{K}_{\mathcal{P}\mathcal{P}}}\Pi\left(\mathcal{P}^{(\ell-k)}\right)z^{\ell-k}}{\Pi(\mathcal{P})\,z^{\ell}}}=\frac{1}{1+(1-z)\,Q(z)}$$
$$\qquad=1+Q(1)\,(z-1)+\left(Q(1)^{2}+Q'(1)\right)(z-1)^{2}+\cdots\tag{8}$$

where $Q(z)=\sum_{k\in\mathcal{K}_{\mathcal{P}\mathcal{P}}}\dfrac{1}{\Pi\left(\mathcal{P}_{(k)}\right)z^{k}}$ (note that $\Pi(\mathcal{P})=\Pi\left(\mathcal{P}_{(k)}\right)\Pi\left(\mathcal{P}^{(\ell-k)}\right)$ for any k). The last line represents the corresponding Taylor expansion of $F(z)$ at $z=1$. The expected number of trials to generate $\mathcal{P}$ for the first time is given by $F'(1)$ or equivalently (and more easily) by the linear coefficient of this expansion, namely $Q(1)$; similarly, the second factorial moment equals the quadratic coefficient further multiplied by 2; this yields

$$\sigma^{2}=Q(1)^{2}+Q(1)+2\,Q'(1)\tag{9}$$

for the corresponding variance.
 
Example 1: When $\mathcal{P}=\mathrm{FFFSFFF}$,

$$\mathcal{K}_{\mathcal{P}\mathcal{P}}=\{1,2,3,7\}\qquad\left(\,\dot{\mathrm F}\,\dot{\mathrm F}\,\dot{\mathrm F}\,\mathrm S\,\mathrm F\,\mathrm F\,\dot{\mathrm F}\,\right)$$
$$Q(z)=\frac{1}{qz}+\frac{1}{q^{2}z^{2}}+\frac{1}{q^{3}z^{3}}+\frac{1}{p\,q^{6}z^{7}}\tag{10}$$

since, when observing $\mathrm{FFFSFFF}$ at the last $7$ trials, the actual pattern may have been completed at any one of the trials indicated by a dot. This then yields $\mu=Q(1)=\frac{1}{q}+\frac{1}{q^{2}}+\frac{1}{q^{3}}+\frac{1}{p q^{6}}$ and $2\left(Q(1)^{2}+Q'(1)\right)$ for the second factorial moment, easily converted into a variance formula.
  
 
Finding $Q(z)$ for any specific pattern is made routine by the following Mathematica function
 
 
  (11)
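One possible implementation is sketched below (prob is an auxiliary name introduced here, and p, q and z are left symbolic):

  prob[s_List] := p^Count[s, 1] q^Count[s, 0];
  Q[pat_List] := Total[Map[
     1/(prob[Take[pat, #]] z^#) &,
     Select[Range[Length[pat]], Take[pat, -#] === Take[pat, #] &]]]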
 
Thus, using the same $\mathcal{P}$ as our example, typing Q[{0, 0, 0, 1, 0, 0, 0}] results in the last line of (10); the expected number of trials to complete $\mathcal{P}$ and the corresponding variance can be easily found by evaluating this expression and its z derivative at $z=1$, as already indicated.
 
Subsequently, we will also need the PGF of a number of extra trials to generate the next occurrence of a pattern $\mathcal{B}$, given that pattern $\mathcal{A}$ has just been completed and $\mathcal{B}$ is allowed to utilize any of its symbols (i.e. $\mathcal{B}$ may overlap with $\mathcal{A}$ whenever possible, up to and including the whole $\mathcal{A}\odot\mathcal{B}$ overlap). This PGF, denoted $F_{\mathcal{B}|\mathcal{A}}(z)$ and called conditional, is found by realizing that, to generate the first occurrence of $\mathcal{B}$ from scratch, $\mathcal{A}\odot\mathcal{B}$ must be generated first (it provides exactly the same head start towards $\mathcal{B}$ as $\mathcal{A}$ itself would), while the remaining (independent) trials to complete $\mathcal{B}$ are those of $F_{\mathcal{B}|\mathcal{A}}(z)$. This implies that

$$F_{\mathcal{B}}(z)=F_{\mathcal{A}\odot\mathcal{B}}(z)\cdot F_{\mathcal{B}|\mathcal{A}}(z)\tag{12}$$

where $F_{\mathcal{B}}(z)$ is just a new notation for the $F(z)$ of (8) built from Pattern $\mathcal{B}$, $F_{\mathcal{A}\odot\mathcal{B}}(z)$ is a similarly constructed $F(z)$ of the pattern $\mathcal{A}\odot\mathcal{B}$, and the conditional $F_{\mathcal{B}|\mathcal{A}}(z)$ is a simple solution of (12).
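In terms of the two sketches above, the conditional PGF can thus be obtained along the following (again merely illustrative) lines:

  Fpgf[pat_List] := 1/(1 + (1 - z) Q[pat]);            (* the F(z) of (8) *)
  Fcond[b_List, a_List] := Fpgf[b]/Fpgf[odot[a, b]];    (* the F of B given A, by (12) *)

(note that an empty overlap makes the denominator equal to 1, so the conditional PGF then correctly reduces to the unconditional one).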
 
3. Competing Patterns
 
Let us now consider m such patterns with individual PGFs denoted $F_i(z)$, where $i=1,2,\ldots,m$, and the corresponding linear (in terms of $z-1$) coefficients are $\mu_{i}$. Similarly, the conditional PGFs for the number of trials to generate Pattern $i$ with the help of Pattern $j$ are denoted $F_{i|j}(z)$ when $i\neq j$, and their coefficients are $\mu_{i|j}$ (where $\mu_{i|j}=F_{i|j}'(1)$). The probability that Pattern $i$ is generated, for the first time (and thus winning the game) before any of the other patterns occurs, at Trial n is denoted $w_n^{(i)}$ (with $w_0^{(i)}=0$); the corresponding generating function is $W_i(z)$. We assume that none of these patterns is a sub-string of any of the others (which would either prevent it from winning the game, or allow it to be completed at the same time as the other pattern).
 
We assume that n trials have been completed, and we use TPF to partition $f_n^{(i)}$ (the probability that Pattern $i$ has been completed, for the first time, at Trial n) according to which of the other patterns may have won the game at Trial k (for $k\le n$); the remaining $n-k$ trials are then governed by the corresponding conditional distribution, whose probabilities $f_{n-k}^{(i|j)}$ are the coefficients of $F_{i|j}(z)$. This yields, for any non-negative n,

$$f_n^{(i)}=\sum_{j=1}^{m}\sum_{k=0}^{n}w_k^{(j)}\,f_{n-k}^{(i|j)}\qquad\quad i=1,2,\ldots,m\tag{13}$$

Multiplying by $z^{n}$ and summing over n from 0 to $\infty$ converts these equations into

$$F_i(z)=\sum_{j=1}^{m}W_j(z)\,F_{i|j}(z)\qquad\quad i=1,2,\ldots,m\tag{14}$$

by setting $F_{i|i}(z)\equiv1$, in agreement with (12). Dividing by $F_i(z)$ and switching sides makes these m linear equations for $W_1(z),W_2(z),\ldots,W_m(z)$ into

$$\sum_{j=1}^{m}\frac{F_{i|j}(z)}{F_i(z)}\,W_j(z)=1\qquad\quad i=1,2,\ldots,m\tag{15}$$
 
which has a simple matrix-algebra solution, namely
 
 
$$\mathbf{W}(z)=\mathbb{A}(z)^{-1}\circ\mathbf{1}\tag{16}$$

where $\mathbf{W}(z)$ is a vector whose components are the $W_i(z)$ generating functions, $\mathbb{A}(z)$ is a matrix whose $(i,j)$ element is equal to $F_{i|j}(z)/F_i(z)$, $\mathbf{1}$ is a column vector with all m components equal to 1, and the small circle implies matrix multiplication.
 
The probability of Pattern $i$ winning the game is then given by the value of $W_i(z)$ at $z=1$; since $\mathbb{A}(1)$ is a highly singular matrix of rank 1, this value can be found only by taking the $z\to1$ limit of the RHS of (16). But the RHS itself is increasingly difficult to compute as m increases, since inverting large matrices symbolically (note that the elements of $\mathbb{A}(z)$ are all functions of z) is practically impossible.
 
We can bypass this problem by expanding each $F_{i|j}(z)/F_i(z)$ at $z=1$, which yields $1+\left(\mu_i-\mu_{i|j}\right)(1-z)$, where $\mu_{i|j}=F_{i|j}'(1)$ and $\mu_i=F_i'(1)$, i.e. the expected number of trials to generate Pattern $i$ with the help of Pattern $j$ and from scratch, respectively. This further implies that, to the $O\big((z-1)^{2}\big)$ accuracy,

$$\mathbb{A}(z)\simeq\mathbf{1}\circ\mathbf{1}^{\mathrm{T}}+(1-z)\,\mathbb{M}\tag{17}$$

where $\mathbb{M}$ is a matrix of the $\mu_i-\mu_{i|j}$ values. Using the Sherman-Morrison formula, we get, to the same accuracy
 
 
$$\mathbb{A}(z)^{-1}\simeq\frac{\mathbb{M}^{-1}}{1-z}-\frac{\mathbb{M}^{-1}\circ\mathbf{1}\circ\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}}{(1-z)^{2}+(1-z)\,\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbf{1}}\tag{18}$$

Post-multiplying by $\mathbf{1}$ then enables us to approximate $\mathbf{W}(z)$ by

$$\mathbf{W}(z)\simeq\frac{\mathbb{M}^{-1}\circ\mathbf{1}}{(1-z)+\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbf{1}}\tag{19}$$

resulting in

$$\mathbf{W}(1)=\frac{\mathbb{M}^{-1}\circ\mathbf{1}}{\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbf{1}}\tag{20}$$
 
Example 2: When playing patterns FFFSF, SFSSS, FFFSS and SSSFS against each other while $p=q=\tfrac{1}{2}$ (tossing a fair coin), the four probabilities of winning are computed by the following Mathematica code
 
 
  (21)
 
resulting in 0.3113, 0.2058, 0.3113 and 0.1716 respectively (these are huge differences by gambling standards). Note that now (after p has been assigned a specific value) the computation of the matrix inverse is purely numerical, posing no problem even when hundreds of patterns are involved. 
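The computation just described can be sketched as follows (reusing the hypothetical odot and Q functions from above; pats, M and W are again illustrative names):

  pats = {{0, 0, 0, 1, 0}, {1, 0, 1, 1, 1}, {0, 0, 0, 1, 1}, {1, 1, 1, 0, 1}};  (* FFFSF, SFSSS, FFFSS, SSSFS *)
  p = 0.5; q = 0.5;
  M = Table[Q[odot[pats[[j]], pats[[i]]]] /. z -> 1, {i, 4}, {j, 4}];           (* the matrix of (17) *)
  W = Inverse[M] . ConstantArray[1, 4];
  W/Total[W]                                                                    (* the winning probabilities of (20) *)

Here the $(i,j)$ element of the matrix equals $\mu_i-\mu_{i|j}$, since this difference is exactly the $Q(1)$ value of the corresponding overlap string (that of Pattern $j$ into Pattern $i$).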
  
 
4. Game’s Duration
 
The PGF of the number of trials to complete any such game is given by
 
 
$$\sum_{i=1}^{m}W_i(z)\simeq\frac{\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbf{1}}{(1-z)+\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbf{1}}\tag{22}$$

where the last approximation no longer lets us compute individual probabilities of the distribution, but it is sufficient to find the corresponding expected value; all we need to do is to differentiate the expression with respect to z and substitute $z=1$, resulting in

$$M=\frac{1}{\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbf{1}}\tag{23}$$
 
An alternate way of deriving both (20) and the last formula is based on an elegant probabilistic argument expounded in [5] .
 
To derive a similar formula for the corresponding variance (which, unlike M, necessitates using the novel approach of this article), we need to extend the expansion of each $F_{i|j}(z)/F_i(z)$ to reach an $O\big((z-1)^{3}\big)$ accuracy, thus getting

$$\frac{F_{i|j}(z)}{F_i(z)}\simeq1+\left(\mu_i-\mu_{i|j}\right)(1-z)+\left(\frac{\nu_{i|j}-\nu_i}{2}+\mu_i\left(\mu_i-\mu_{i|j}\right)\right)(1-z)^{2}\tag{24}$$

where $\nu_{i|j}=F_{i|j}''(1)$ and $\nu_i=F_i''(1)$ are the corresponding second factorial moments. This further implies that, to the same accuracy,

$$\mathbb{A}(z)\simeq\mathbf{1}\circ\mathbf{1}^{\mathrm{T}}+(1-z)\,\mathbb{M}+(1-z)^{2}\,\mathbb{V}\tag{25}$$

where $\mathbb{V}$ is the matrix of the $\frac{\nu_{i|j}-\nu_i}{2}+\mu_i\left(\mu_i-\mu_{i|j}\right)$ values. To correspondingly extend the accuracy of the RHS of (22), we only need to replace $\mathbb{M}$ by

$$\mathbb{M}+(1-z)\,\mathbb{V}\tag{26}$$

therein. The second derivative of the resulting expression, evaluated at $z=1$, yields the corresponding second factorial moment, namely

$$\frac{2\left(1-\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbb{V}\circ\mathbb{M}^{-1}\circ\mathbf{1}\right)}{\left(\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbf{1}\right)^{2}}\tag{27}$$

Adding M and subtracting $M^{2}$ then yields a formula for the variance of the number of trials to complete the game, namely

$$\sigma^{2}=\frac{2\left(1-\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbb{V}\circ\mathbb{M}^{-1}\circ\mathbf{1}\right)}{\left(\mathbf{1}^{\mathrm{T}}\circ\mathbb{M}^{-1}\circ\mathbf{1}\right)^{2}}+M-M^{2}\tag{28}$$
 
Continuing Example 2: Extending the previous Mathematica program (where the matrix $\mathbb{M}$ has already been computed) by
 
 
  (29)
 
results in 31.96 for this variance, and 5.65 for the corresponding standard deviation. 
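One way of implementing this extension (a sketch relying on the Fpgf and Fcond functions and on the pats and M variables introduced in the earlier sketches):

  V = Table[D[Fcond[pats[[i]], pats[[j]]]/Fpgf[pats[[i]]], {z, 2}]/2 /. z -> 1, {i, 4}, {j, 4}];
  one = ConstantArray[1, 4]; Minv = Inverse[M];
  mean = 1/(one . Minv . one);                                 (* the expected duration M of (23) *)
  var = 2 (1 - one . Minv . V . Minv . one) mean^2 + mean - mean^2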
  
 
Monte-Carlo Verification
 
We have randomly generated about a hundred thousand rounds of the game of Example 2, using the following Mathematica code
 
 
  (30)
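A simulation with the same effect might look as follows (only a sketch; playGame and results are hypothetical names, and the code again relies on pats and p defined earlier):

  playGame[] := Module[{seq = {}, len = 0, winner = 0},
    While[winner == 0,
     AppendTo[seq, RandomVariate[BernoulliDistribution[p]]]; len++;
     Do[If[len >= Length[pats[[i]]] && Take[seq, -Length[pats[[i]]]] === pats[[i]],
       winner = i; Break[]], {i, Length[pats]}]];
    {winner, len}];
  results = Table[playGame[], {100000}];
  Table[Count[results[[All, 1]], i], {i, Length[pats]}]/100000.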
 
This produces a set of relative frequencies (one for each pattern) of winning the game; in our case, we obtained 0.310, 0.205, 0.313 and 0.171, in excellent agreement with the earlier computed theoretical probabilities. Similarly, the following extra line
 
 
  (31)
 
yields the average number of trials per game and the corresponding (sample) standard deviation; in our case these are 10.52 and 5.60 respectively (again, in good agreement with the theoretical values of Example 2).
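For instance (using the results list from the sketch above):

  {Mean[#], StandardDeviation[#]} &[N[results[[All, 2]]]]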
 
5. More Examples
 
Example 3: To demonstrate that the algorithm (and the corresponding Mathematica code) can easily accommodate patterns of different lengths, we now play SSS, SFFFFFFFFFFFS, and twenty-one F’s followed by a single S against each other, where S represents getting a six and F stands for any other value when rolling a die.
 
This requires changing the definition of the list of patterns in (21) to
 
 
  (32)
 
and replacing p = q = 0.5; by p = 1/6.; q = 5/6., thus getting 0.343, 0.323 and 0.335 respectively for the probabilities of winning, and 92.35 and 79.99 for the expected number of rolls and the corresponding standard deviation.
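With the 0/1 encoding used throughout these sketches, the new list of patterns could be entered as

  pats = {{1, 1, 1},
    Join[{1}, ConstantArray[0, 11], {1}],
    Join[ConstantArray[0, 21], {1}]};

(the three lists representing SSS, SFFFFFFFFFFFS and the twenty-one F's followed by S, respectively).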
 
Example 4: Here we investigate how changing the p value affects the winning probabilities when all six patterns of length four with exactly two S’s each are played against each other. This now requires the following changes in (21)
 
 
  (33)
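One way of making these changes (a sketch; winProbs is a hypothetical name, and the previously assigned numerical values of p and q must first be cleared so that the probabilities can be computed as functions of p):

  Clear[p, q];
  pats = Permutations[{1, 1, 0, 0}];                       (* the six patterns of length four with two S's *)
  winProbs[pp_?NumericQ] := Block[{p = pp, q = 1 - pp},
    Module[{M, W},
     M = Table[Q[odot[pats[[j]], pats[[i]]]] /. z -> 1, {i, 6}, {j, 6}];
     W = Inverse[M] . ConstantArray[1, 6]; W/Total[W]]];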
 
After executing the program, the following extra line
 
 
  (34)
 
results in the six graphs of Figure 1, indicating that FFSS (red) always has a better chance of winning than FSFS (blue), which in turn has a higher probability than SFFS (green); for the remaining three patterns, it is the corresponding complement. Also, the former group of patterns has a substantial advantage over the latter group when p is small, and the opposite happens when p is large.
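A line producing such a plot might read (again merely a sketch built on winProbs above):

  labels = Map[StringJoin, pats /. {1 -> "S", 0 -> "F"}];
  ListLinePlot[Table[Table[{pp, winProbs[pp][[i]]}, {pp, 0.02, 0.98, 0.02}], {i, 6}],
    PlotLegends -> labels, AxesLabel -> {"p", "probability of winning"}]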
 
To find the value of p which maximizes the probability of FFSS winning, we have to add another line of code, namely
 
 
  (35)
 
to get 0.1039 (similarly, SSFF would have its best chance of winning at $p=0.8961$).
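Such a line could be, for example (objective being another illustrative name),

  iFFSS = Position[pats, {0, 0, 1, 1}][[1, 1]];
  objective[pp_?NumericQ] := winProbs[pp][[iFFSS]];
  NMaximize[{objective[pp], 0.01 <= pp <= 0.99}, pp]

whose second part returns the location of the maximum, i.e. the optimal value of p.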
 
Example 5: Finally, assuming that each of our opponents has already selected his pattern, and we are then allowed to choose ours, the question is how to do so in a way that maximizes our chances of winning. The following program indicates (using a specific situation) how to get the answer by simply trying all permissible possibilities.
 
Using $p=\tfrac{1}{2}$ and SSFSFF, SSFFFF, FFSFFF, SFFSFS for the opponents’ choices (when all patterns must be of length six), the following few lines of code
 
 
  (36)
 
yield (practically instantaneously) the best winning chance of 0.2627 if we select SSFFSF. Note that FFFFFF would be the worst possible choice, reducing our probability of winning to 0.0403 only.
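The search itself can be sketched as follows (hypothetical names once more, relying on the Q and odot functions introduced earlier; every length-six pattern other than the opponents' four is tried in turn):

  opponents = {{1, 1, 0, 1, 0, 0}, {1, 1, 0, 0, 0, 0}, {0, 0, 1, 0, 0, 0}, {1, 0, 0, 1, 0, 1}};
  candidates = Complement[Tuples[{0, 1}, 6], opponents];
  myProb[c_] := Block[{p = 0.5, q = 0.5},
    Module[{ps = Append[opponents, c], M, W},
     M = Table[Q[odot[ps[[j]], ps[[i]]]] /. z -> 1, {i, 5}, {j, 5}];
     W = Inverse[M] . ConstantArray[1, 5]; Last[W/Total[W]]]];
  MaximalBy[candidates, myProb]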
 
6. Conclusions
 
We have summarized and extended the existing theory of pattern generation with the aim of making it accessible to a wide audience; the only prerequisites to understanding our derivation of the key formulas are a rudimentary knowledge of matrices and generating functions. We have deliberately tried to keep our exposition as brief as possible, so as not to obscure fundamental issues with less important details. In this closing section we will only indicate the directions in which the study of pattern competition can be extended, in most cases rather routinely.
 
The first modification assumes that, instead of having two possibilities for an outcome (our S and F), there may be three or more (e.g. rolling a die can result in showing one of six numbers); on closer reflection one can see that the formulas presented in this article remain practically unchanged (only the p and q probabilities need to be replaced by the corresponding $p_{1},p_{2},p_{3},\ldots$).
 
Secondly, it is also possible to employ generating functions to investigate the distribution of the number of occurrences of a specific pattern, or of the number of games (playing several patterns against each other), to be completed in a fixed number of trials.
 
Finally, one can modify the game in such a way that a pattern can win only after completing a specific number of its occurrences (before any of the other patterns can do the same); here, there are several possibilities in terms of which occurrences (e.g. of like patterns only, dissimilar patterns only, all patterns, etc.) may be allowed to overlap and thus help generate the next one. We leave these for our readers to explore.