Calculator
- Alphabet, \(\mathcal{A} = \)
- Terminators, \(\mathcal{T} = \{\}\)
Results
\(L_{\mathcal{T}}\) has mean \(\mathbb{E}{\left[L_{\mathcal{T}}\right]} =\) \(\diamondsuit\) and variance \(\mathrm{Var}{\left[L_{\mathcal{T}}\right]} =\) \(\heartsuit\). The following table shows the expectation and variance of \(L_{\mathcal{T}}\) given that the word \(W_{\mathcal{T}}\) ends in a specific terminator \(t \in \mathcal{T}\). Numerical values are given to 3 significant figures where applicable.
| \(t\) | \(\mathbb{P}{\left[W_{\mathcal{T}} \in [t]\right]}\) | \(\mathbb{E}{\left[L_{\mathcal{T}}\mid W_{\mathcal{T}} \in [t]\right]}\) | \(\mathrm{Var}{\left[L_{\mathcal{T}}\mid W_{\mathcal{T}} \in [t]\right]}\) |
|---|---|---|---|
How It Works
Let \(\mathcal{T} = \left\{t_1, t_2, \dots, t_n\right\}\) be our terminators and \(\mathcal{A}\) be our alphabet.
Correlation Matrix
As defined in the workshop notes (Definition 21), the correlation matrix \(\mathbf M_{\mathcal T} = (m)_{i,j}\) has entries given by \[(m)_{i,j} = (t_j : t_i)_{\mathcal A} = \sum_{k = 1}^{\min \left\{\lvert t_i \rvert, \lvert t_j \rvert \right\}} \lvert \mathcal{A}\rvert^k \, \mathbf{1}{\left\{R_k(t_j) = L_k(t_i)\right\}},\] which we can easily compute.
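To make this concrete, here is a minimal Python sketch of the computation, assuming \(R_k(t)\) and \(L_k(t)\) denote the length-\(k\) suffix and prefix of \(t\); the function names and the coin-flip illustration are ours, not part of the workshop notes.

```python
# A minimal sketch of Definition 21, assuming R_k(t) is the length-k suffix
# of t and L_k(t) is the length-k prefix (names are illustrative).
from itertools import product

def correlation_matrix(terminators, alphabet_size):
    """Return M_T with entries m[i][j] = (t_j : t_i)_A."""
    n = len(terminators)
    M = [[0] * n for _ in range(n)]
    for i, j in product(range(n), repeat=2):
        t_i, t_j = terminators[i], terminators[j]
        for k in range(1, min(len(t_i), len(t_j)) + 1):
            if t_j[-k:] == t_i[:k]:              # R_k(t_j) == L_k(t_i)
                M[i][j] += alphabet_size ** k
    return M

# Illustration: coin flips (|A| = 2) racing the terminators HH and TH
print(correlation_matrix(["HH", "TH"], 2))       # [[6, 2], [0, 4]]
```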
Probabilities
As derived in the workshop notes (Theorem 26), the probability vector \(\mathbf p_{\mathcal T}\) has formula \[\mathbf{p}_{\mathcal T} = \frac{1}{\mathbf{1}^{\mathsf{T}} \mathbf{M}_{\mathcal T}^{-1} \mathbf{1}} \mathbf{M}_{\mathcal T}^{-1} \mathbf{1},\] where \(\mathbf{1}\) is the \(n \times 1\) vector that is all ones.
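As an illustrative sketch (reusing `correlation_matrix` from the snippet above), the probability vector can be obtained with a single linear solve rather than an explicit inverse; the two are equivalent.

```python
# Theorem 26 in code: p_T = M^{-1} 1 / (1^T M^{-1} 1), via a linear solve.
import numpy as np

def probability_vector(terminators, alphabet_size):
    M = np.array(correlation_matrix(terminators, alphabet_size), dtype=float)
    w = np.linalg.solve(M, np.ones(len(terminators)))    # M^{-1} 1
    return w / w.sum()                                   # normalise by 1^T M^{-1} 1

print(probability_vector(["HH", "TH"], 2))   # ≈ [0.25 0.75] for the coin-flip example
```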
Expectations
As derived in the workshop notes, the expected length is given by \[\mathbb{E}{\left[L_{\mathcal{T}}\right]} = \frac{1}{\mathbf{1}^{\mathsf{T}} \mathbf{M}_{\mathcal T}^{-1} \mathbf{1}}.\]
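In code, the unconditional expectation is just the reciprocal of the same normalising constant (again a sketch reusing the helpers above):

```python
# E[L_T] = 1 / (1^T M_T^{-1} 1), reusing correlation_matrix from above.
def expected_length(terminators, alphabet_size):
    M = np.array(correlation_matrix(terminators, alphabet_size), dtype=float)
    w = np.linalg.solve(M, np.ones(len(terminators)))
    return 1.0 / w.sum()

print(expected_length(["HH", "TH"], 2))   # ≈ 3.0 for the coin-flip example
```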
Unfortunately, we were unable to find a closed form for the conditional expectation \(\mathbb{E}{\left[L_{\mathcal{T}}\mid W_{\mathcal{T}} \in [t]\right]}\) using a martingale argument. However, not all hope is lost. Theorem 2.1 of Guibas and Odlyzko's 1981 paper gives us a viable (albeit clumsy) approach to computing conditional expectations.
First, define \(f(m)\) to be the number of words of length \(m\) that do not contain any terminators. Define also \(f_i(m)\) to be the number of words of length \(m\) that end with \(t_i\) and contain no other occurrence of any terminator, so that the first terminator occurrence is \(t_i\) ending at position \(m\). Now, define \(F(z)\) and \(F_i(z)\) to be the generating functions of \(f(m)\) and \(f_i(m)\) respectively: \[F(z) = \sum_{m = 0}^\infty f(m) \, z^{-m}, \quad F_i(z) = \sum_{m = 0}^\infty f_i(m) \, z^{-m}.\] Then Theorem 2.1 of the aforementioned paper states that \(F(z)\) and \(F_i(z)\) obey the following system of equations: \[\begin{aligned}(z-\lvert \mathcal{A} \rvert) \, F(z) + z F_1(z) + zF_2(z) + \dots + zF_n(z) &= z \\ -F(z) + (t_1:t_1)_z F_1(z) + (t_2:t_1)_z F_2(z) + \dots + (t_n:t_1)_z F_n(z) &= 0 \\ -F(z) + (t_1:t_2)_z F_1(z) + (t_2:t_2)_z F_2(z) + \dots + (t_n:t_2)_z F_n(z) &= 0 \\ \vdots \\ -F(z) + (t_1:t_n)_z F_1(z) + (t_2:t_n)_z F_2(z) + \dots + (t_n:t_n)_z F_n(z) &= 0 \end{aligned}\] This can be rewritten as the matrix equation \[ \begin{pmatrix}z - \lvert \mathcal{A} \rvert & z & z & \cdots & z \\ -1 & (t_1:t_1)_z & (t_2:t_1)_z & \cdots & (t_n:t_1)_z \\ -1 & (t_1:t_2)_z & (t_2:t_2)_z & \cdots & (t_n:t_2)_z \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ -1 & (t_1:t_n)_z & (t_2:t_n)_z & \cdots & (t_n:t_n)_z \end{pmatrix} \begin{pmatrix}F(z) \\ F_1(z) \\ F_2(z) \\ \vdots \\ F_n(z) \end{pmatrix} = \begin{pmatrix} z \\ 0 \\ 0 \\ \vdots \\ 0 \end{pmatrix}. \] Let the matrix on the left be \(\mathbf A \), and let \(\mathbf C = (c)_{i,j}\) be the cofactor matrix associated with \(\mathbf A\). Then \[\begin{pmatrix}F(z) \\ F_1(z) \\ \vdots \\ F_n(z) \end{pmatrix} = \mathbf A^{-1} \begin{pmatrix} z \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \frac{1}{\text{det}{(\mathbf{A})}} \mathbf C^{\mathsf T} \begin{pmatrix} z \\ 0 \\ \vdots \\ 0 \end{pmatrix} = \frac{1}{\text{det}{(\mathbf{A})}} \begin{pmatrix}z c_{1,1} \\ z c_{1, 2} \\ \vdots \\ z c_{1, n+1} \end{pmatrix},\] which gives \[F_i(z) = \frac{z c_{1, i+1}}{\text{det}{(\mathbf{A})}}.\]

We now relate \(F_i(z)\) to the conditional expectation. Using the definition of expectation and invoking the formula for conditional probability, one has \[\mathbb{E}{\left[L_{\mathcal{T}}\mid W_{\mathcal{T}} \in [t_i]\right]} = \sum_{m = 0}^\infty \frac{m \mathbb{P}{\left[L_{\mathcal T} = m \text{ and } W_{\mathcal T} \in [t_i] \right]}}{\mathbb{P}{\left[W_{\mathcal T} \in [t_i] \right]}}.\] Observe that the probability in the numerator is the proportion of words of length \(m\) whose first terminator is \(t_i\), ending at position \(m\). Since there are \(f_i(m)\) such words out of \(\lvert \mathcal{A} \rvert^m\) total words of length \(m\), this probability is simply \(f_i(m) \, \lvert \mathcal{A} \rvert^{-m}\). Thus, \[\mathbb{E}{\left[L_{\mathcal{T}}\mid W_{\mathcal{T}} \in [t_i]\right]} = \frac1{\mathbb{P}{\left[W_{\mathcal T} \in [t_i] \right]}} \sum_{m = 0}^\infty m f_i(m) \lvert \mathcal{A} \rvert^{-m}.\] Now notice that the RHS looks like the derivative of \(F_i(z)\) evaluated at \(\lvert \mathcal{A} \rvert\).
Indeed, one can show that \[\sum_{m = 0}^\infty m f_i(m) \lvert \mathcal{A} \rvert^{-m} = -\lvert \mathcal{A} \rvert F_i'{\left(\lvert \mathcal{A} \rvert\right)}.\] Putting everything together, once we compute the cofactor \(c_{1, i+1}\), we can easily compute the conditional expectation with the following formula: \[\mathbb{E}{\left[L_{\mathcal{T}}\mid W_{\mathcal{T}} \in [t_i]\right]} = \frac{-\lvert \mathcal{A} \rvert}{\mathbb{P}{\left[W_{\mathcal T} \in [t_i] \right]}} \left[\frac{\mathrm{d}}{\mathrm{d} z} \frac{z c_{1, i+1}}{\text{det}{(\mathbf{A})}} \right]_{z = \lvert \mathcal{A} \rvert}.\] Not as pretty and not as elegant as our martingale approach, but it works.
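The whole pipeline is mechanical enough to automate. Below is a sketch using SymPy, assuming \((t_j : t_i)_z = \sum_k z^k \, \mathbf{1}\{R_k(t_j) = L_k(t_i)\}\) with \(R_k\) and \(L_k\) the length-\(k\) suffix and prefix; instead of expanding cofactors we simply solve the linear system symbolically, which yields the same \(F_i(z)\). The helper names and the coin-flip example are ours.

```python
# A sketch of the Guibas-Odlyzko route: build A(z), solve for F_i(z),
# differentiate, and evaluate at z = |A|. Names are illustrative.
import sympy as sp

def generating_functions(terminators, alphabet_size):
    """Return (z, [F_1(z), ..., F_n(z)]) as SymPy expressions."""
    z = sp.symbols("z")
    n = len(terminators)

    def corr(tj, ti):  # (t_j : t_i)_z, assuming suffix/prefix overlaps
        return sum(z**k for k in range(1, min(len(ti), len(tj)) + 1)
                   if tj[-k:] == ti[:k])

    A = sp.zeros(n + 1, n + 1)
    A[0, 0] = z - alphabet_size
    for j in range(1, n + 1):
        A[0, j] = z
    for i in range(1, n + 1):
        A[i, 0] = -1
        for j in range(1, n + 1):
            A[i, j] = corr(terminators[j - 1], terminators[i - 1])
    sol = A.solve(sp.Matrix([z] + [0] * n))   # (F, F_1, ..., F_n)
    return z, [sp.simplify(sol[i]) for i in range(1, n + 1)]

def conditional_expectations(terminators, alphabet_size, probs):
    """E[L_T | W_T in [t_i]] = -|A| F_i'(|A|) / P[W_T in [t_i]]."""
    z, Fs = generating_functions(terminators, alphabet_size)
    return [float(-alphabet_size * sp.diff(F, z).subs(z, alphabet_size)) / p
            for F, p in zip(Fs, probs)]

# Coin-flip illustration: conditional on the word ending in HH vs TH
print(conditional_expectations(["HH", "TH"], 2, [0.25, 0.75]))   # ≈ [2.0, 3.33]
```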
Variances
We begin by discussing our calculation of the conditional variance \(\mathrm{Var}{\left[L_{\mathcal{T}}\mid W_{\mathcal{T}} \in [t]\right]}\). Using an almost identical argument as above, one can show that \[\mathbb{E}{\left[L_{\mathcal{T}}^2\mid W_{\mathcal{T}} \in [t_i]\right]} = \frac{\lvert \mathcal{A} \rvert^2 F_i''{\left(\lvert \mathcal{A} \rvert\right)} + \lvert \mathcal{A} \rvert F_i'{\left(\lvert \mathcal{A} \rvert\right)}}{\mathbb{P}{\left[W_{\mathcal T} \in [t_i] \right]}},\] so we can find the conditional variance with \[\mathrm{Var}{\left[L_{\mathcal{T}}\mid W_{\mathcal{T}} \in [t_i]\right]} = \frac{\lvert \mathcal{A} \rvert^2 F_i''{\left(\lvert \mathcal{A} \rvert\right)} + \lvert \mathcal{A} \rvert F_i'{\left(\lvert \mathcal{A} \rvert\right)}}{\mathbb{P}{\left[W_{\mathcal T} \in [t_i] \right]}} - \frac{\lvert \mathcal{A} \rvert^2 \left[F_i'\left(\lvert \mathcal{A} \rvert\right)\right]^2}{\mathbb{P}{\left[W_{\mathcal T} \in [t_i] \right]}^2}.\]
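A sketch of the same computation in code, reusing `generating_functions` from the snippet above (the helper names are ours):

```python
# Conditional variances via the second derivative of F_i at z = |A|.
def conditional_variances(terminators, alphabet_size, probs):
    z, Fs = generating_functions(terminators, alphabet_size)
    a = alphabet_size
    out = []
    for F, p in zip(Fs, probs):
        d1 = float(sp.diff(F, z).subs(z, a))
        d2 = float(sp.diff(F, z, 2).subs(z, a))
        second_moment = (a**2 * d2 + a * d1) / p     # E[L_T^2 | W_T in [t_i]]
        mean = -a * d1 / p                           # E[L_T   | W_T in [t_i]]
        out.append(second_moment - mean**2)
    return out

print(conditional_variances(["HH", "TH"], 2, [0.25, 0.75]))   # ≈ [0.0, 2.22]
```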
We now discuss the total variance \(\mathrm{Var}{\left[L_{\mathcal{T}}\right]}\). For convenience, let \(W_i\) be the event \(W_{\mathcal T} \in [t_i]\). By the law of total variance, one has \[\mathrm{Var}{\left[L_{\mathcal T}\right]} = \mathbb{E}{\left[\mathrm{Var}{\left(L_{\mathcal T} \mid W_{\mathcal{T}}\right)}\right]} + \mathrm{Var}{\left[\mathbb{E}{\left(L_{\mathcal T} \mid W_{\mathcal{T}}\right)}\right]}.\] By the law of total expectation, the first term becomes \[\mathbb{E}{\left[\mathrm{Var}{\left(L_{\mathcal T} \mid W_{\mathcal{T}}\right)}\right]} = \sum_{i = 1}^n \mathbb{P}{\left[W_i\right]} \mathrm{Var}{\left[L_{\mathcal T} \mid W_i\right]},\] which we can calculate from previous parts. Meanwhile, applying the formula \(\mathrm{Var}(X) = \mathbb{E}\left[X^2\right] - \mathbb{E}[X]^2\) and invoking the law of total expectation on the second term yields \[\begin{aligned}\mathrm{Var}{\left[\mathbb{E}{\left(L_{\mathcal T} \mid W_{\mathcal{T}}\right)}\right]} &= \mathbb{E}{\left[\mathbb{E}{\left(L_{\mathcal T} \mid W_{\mathcal{T}}\right)}^2\right]} - \mathbb{E}{\left[\mathbb{E}{\left(L_{\mathcal T} \mid W_{\mathcal{T}}\right)}\right]}^2\\ &= \sum_{i = 1}^n \mathbb{P}{\left[W_i\right]} \mathbb{E}{\left(L_{\mathcal T} \mid W_i\right)}^2 - \mathbb{E}{\left[L_{\mathcal T}\right]}^2,\end{aligned}\] which we also know how to calculate from previous parts. By computing the two terms and adding them, we can find \(\mathrm{Var}{\left[L_{\mathcal{T}}\right]}\).
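Putting the pieces together (again only a sketch built on the helpers above), the law of total variance translates directly into code:

```python
# Var[L_T] = E[Var(L_T | W_T)] + Var(E(L_T | W_T)), combining earlier helpers.
def total_variance(terminators, alphabet_size):
    probs = probability_vector(terminators, alphabet_size)
    cond_means = conditional_expectations(terminators, alphabet_size, probs)
    cond_vars = conditional_variances(terminators, alphabet_size, probs)
    mean = expected_length(terminators, alphabet_size)
    within = sum(p * v for p, v in zip(probs, cond_vars))                 # E[Var(L_T | W_T)]
    between = sum(p * m**2 for p, m in zip(probs, cond_means)) - mean**2  # Var(E(L_T | W_T))
    return within + between

print(total_variance(["HH", "TH"], 2))   # ≈ 2.0 for the coin-flip example
```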