The Pólya Urn

This interactive post was written with the Idyll markup language.

Markup sources for this blog post can be found here.

An interactive study of the Pólya urn model

The Pólya urn model is a statistical experiment in which we study the distribution of two populations of distinctly colored balls in an urn. In its simplest form, the problem can be stated as follows:

Given an urn containing initially aa orange balls and bb blue balls, we pick one ball at random from the urn, place it back and add one new ball of the same color in the urn; We then repeat this sampling procedure nn times.
Pólya urn with a=a = 1 and b=b = 2

Figure 1: Simulation of a Pólya urn with initial conditions a = 1 and b = 2.

This process is a good example of a model where small imbalances are magnified over time: In fact, the more balls of one color are drawn, the more this color is likely to be drawn. As such, it has been used to model for instance propagation of epidemics in a population (Hayhoe et al.) or the correlated dynamics between novelty events (Tria et al, 2014). In the following, we study the dynamics of the model from a probabilistic point of view and its asymptotical properties, in particular with respect to the initial ratio of populations.

Basic dynamics of the model

Composition of the urn at step n

Let us first express the probability of having picked kk blue objects after nkn \geq k steps of the experiment. Denoting by BnB_n the number of blue balls at step nn , and by SnS_n the sampling event at step nn , taking value 1 if a blue ball is sampled and 0 otherwise:

P(Bn=b+k)=s{0,1}nisi=kP(S1=s1,Sn=sn)\mathbb{P} (B_n = b + k) = \sum_{\mathbf{s} \in {\{0, 1\}}^n \atop \sum_i \mathbf{s}_i = k} \mathbb{P}(S_1 = s_1, \dots S_{n} = s_n)
    P(S1=s1,,Sn=sn)\ \ \ \ \mathbb{P}(S_1 = s_1, \dots, S_{n} = s_n) =P(S1=s1)P(S2=s2  S1=s1)= \mathbb{P}(S_1 = s_1) \mathbb{P}(S_2 = s_2\ |\ S_1 = s_1) \dots

We can expand the inner term by using the probability chain rule. Moreover, the probability of drawing a ball of a certain color given the composition of the urn is simply the ratio of correspondingly colored balls in the urn, which evolves with each sampling. Therefore, the conditional probability of drawing a blue object at step nn can be written as:

(1)(1) P(Sn=1  S1=s1,,Sn1=sn1)=b+i=1n1sia+b+n1\mathbb{P}(S_n = 1\ |\ S_{1} = s_1, \dots, S_{n-1} = s_{n-1}) = \frac{b + \sum_{i = 1}^{n - 1} s_i}{a + b + n - 1}

From this expression, we can see that the probability does not depend on the time step nn but rather on how many blue balls have been sampled until now, as captured by the quantity i=1n1si\sum_{i=1}^{n-1} s_i . This also means that the model is Markovian, as the probability at step nn only depends on the urn’s composition at step n1n - 1 .

Consequently, for any sequence of draws s\mathbf{s} , we can arrange the order in which the balls have been drawn without affecting the probability of the next draw. This is sometimes called the exchangeability property. More formally, it means that Equation (1) is equivalent to:

(2)(2) P(Sn=1  S1=1,,Ssˉ=1,Ssˉ+1=0Sn1=0)\mathbb{P}(S_n = 1\ |\ S_{1} = \textcolor{CornflowerBlue}{1}, \dots, S_{\bar{\mathbf{s}}} = \textcolor{CornflowerBlue}{1}, S_{\bar{\mathbf{s}} + 1} = \textcolor{Orange}{0} \dots S_{n-1} = \textcolor{Orange}{0} )

where sˉ\bar{\mathbf{s}} is a shortcut notation for i=1n1si\sum_{i=1}^{n-1} s_i from (1)(1) . The same analysis holds for the conditional probability of drawing an orange ball. Bringing everything together, we can thus finally write P(Bn=b+k)=(nk)P(S1=1,,Sk=1,Sk+1=0Sn=0)\mathbb{P} (B_n = b + k) = \binom{n}{k} \mathbb{P}(S_1 = \textcolor{CornflowerBlue}{1}, \dots, S_k = \textcolor{CornflowerBlue}{1}, S_{k+1} = \textcolor{Orange}{0} \dots S_{n} = \textcolor{Orange}{0})

=(nk)P(S1=1)P(Sn=0  S1=1Sn1=0)= \binom{n}{k} \mathbb{P}(S_1 = \textcolor{CornflowerBlue}{1}) \dots \mathbb{P}(S_{n} = \textcolor{Orange}{0}\ |\ S_{1} = \textcolor{CornflowerBlue}{1} \dots S_{n-1} = \textcolor{Orange}{0}) =(nk)p=0k1b+pa+b+pq=0nk1a+qa+b+q= \binom{n}{k} \prod_{p=0}^{k-1} \frac{b + p}{a+ b + p} \prod_{q=0}^{n-k-1} \frac{a + q}{a+ b + q} =(nk)(b+k1)!(a+b1)!(a+nk1)!(b1)!(a1)!(a+b+n1)!= \binom{n}{k} \frac{(b+k-1)! (a+b-1)! (a+n-k-1)!}{(b-1)! (a-1)! (a+b+n-1)!}
P(Bn=b+k)=(nk)(a+bb)(a+b+n1b+k)ab(a+b)(b+k)\mathbb{P} (B_n = b + k) = \frac{\binom{n}{k} \binom{a+b}{b}}{\binom{a+b+n-1}{b+k}} \frac{ab}{(a+b)(b+k)}

Probability of the next draw

Following the previous result, we can now compute the probability of picking a blue ball at step n+1n + 1 , independently of the state of the urn at time nn :

P(Sn+1=1)=k=0nP(Sn+1=1  Bn=b+k) P(Bn=b+k)\mathbb{P} (S_{n + 1} = 1) = \sum_{k=0}^{n} \mathbb{P}(S_{n+1} = 1\ |\ B_{n} = b + k)\ \mathbb{P}(B_{n} = b + k) =k=0nb+ka+b+n P(Bn=b+k)= \sum_{k=0}^{n} \frac{b+k}{a+b+n}\ \mathbb{P}(B_{n} = b + k) (3)(3) =ab(a+bb)(a+b+n)(a+b)k=0n(nk)(a+b+n1b+k)f(n,a,b)= \frac{ab\binom{a+b}{b}}{(a+b+n)(a+b)} \underbrace{\sum_{k=0}^{n} \frac{\binom{n}{k}}{\binom{a+b+n-1}{b+k}}}_{f(n, a, b)}
Pascal’s triangle: (nk)=(n1k1)+(n1k)\binom{n}{k} = \binom{n - 1}{k - 1} + \binom{n - 1}{k}

To simplify the expression of f(n,a,b)f(n, a, b) , we can use Pascal’s triangle formula on the numerator (while isolating the terms for j=0j = 0 and j=nj = n ) and observe that: f(n,a,b)=j=0n(nj)(a+b+n1b+j)f(n, a, b) = \sum_{j=0}^n \frac{\binom{n}{j}}{\binom{a+b+n-1}{b+j}}

=1(a+b+n1b)+1(a+b+n1b+n)+j=1n1(n1j)(a+b+n1b+j)+j=1n1(n1j1)(a+b+n1b+j)= \frac{1}{\binom{a+b+n-1}{b}} + \frac{1}{\binom{a+b+n-1}{b+n}} + \sum_{j=1}^{n-1} \frac{\binom{n-1}{j}}{\binom{a+b+n-1}{b+j}} + \sum_{j=1}^{n-1} \frac{\binom{n-1}{j-1}}{\binom{a+b+n-1}{b+j}} =j=0n1(n1j)(a+b+n1b+j)+j=0n1(n1j)(a+b+n1b+j+1)= \sum_{j=0}^{n-1} \frac{\binom{n-1}{j}}{\binom{a+b+n-1}{b+j}} + \sum_{j=0}^{n-1} \frac{\binom{n-1}{j}}{\binom{a+b+n-1}{b+ j +1}} f(n,a,b)=f(n1,a+1,b)+f(n1,a,b+1)f(n, a, b) = f(n-1, a+1, b) + f(n-1, a, b+1)

By extending the relation, one can infer and finally prove by induction over nn , that f(n,a,b)=a+b+n(a+b)(a+b1b)f(n, a, b) = \frac{a+b+n}{(a+b) \binom{a+b-1}{b}} . Plugging this expression in Equation (3), we finally get:

P(Sn=1)=ba+b\mathbb{P} (S_n = 1) = \frac{b}{a+b}

In other words, the overall probability of the urn composition after nn only depends on the initial color distribution in the urn. In particular, for the case where a=1a = 1 and b=1b = 1 we have P(Bn=k+1)=[ kn ]n+1\mathbb{P} (B_n = k + 1) = \frac{[\!|\ k \leq n \ |\!]}{n + 1} and P(Sn=1)=0.5\mathbb{P} (S_n = 1) = 0.5 . In other words, all urn configurations are equiprobable when starting from a balanced urn.

Asymptotic behavior

Expected value

Since various urn compositions can be reached from the same initial configuration ( aa , bb ), one can also wonder what the urn composition looks like asymptotically, when the number of repeats nn goes to \infty .

Let us introduce Yn=Bna+b+nY_n = \frac{B_n}{a + b +n} , the ratio of the two colored ball populations at step nn . Using the previous results, we can easily see that the expected value of YnY_n is equal to the initial ratio, independently of the time step:

E(Yn)=k=0nP(Bn=b+k)b+ka+b+n\mathbb{E}(Y_n) = \sum_{k=0}^n \mathbb{P}(B_n = b + k) \frac{b + k}{a + b +n} =k=0nP(Bn=b+k)P(Sn=1  Bn=b+k)=P(Sn=1)= \sum_{k=0}^n \mathbb{P}(B_n = b + k) \mathbb{P}(S_n = 1\ |\ B_n = b + k) = \mathbb{P}(S_n = 1)
E(Yn)=ba+b=Y0\mathbb{E}(Y_n) = \frac{b}{a + b} = Y_0

This can also be observed in simulations. Here, blue lines ( --- ) represent the observed value of YnY_n for each time step nn \leq 100 for 70 different runs with the same initial conditions aa and bb . Red circles ( ) mark the average of these values across runs, for each time step, hence serve as an approximation of E(Yn)\mathbb{E}(Y_n) .

observed values of YnY_n with a=a = 1 and b=b = 2.

Figure 2: Study of the average asymptotic distribution for a Pólya urn experiment.

Going further, we can also study the convergence in distribution of the random variable YnY_n . First we observe that the probability distribution of YnY_n is directly linked to the one of BnB_n , that we expressed earlier.

P(Yn=b+ka+b+n)=P(Bn=b+k)=(nk)(a+bb)(a+b+n1b+k)ab(a+b)(b+k)\mathbb{P}\left(Y_n = \frac{b + k}{a + b+ n}\right) = \mathbb{P} (B_n = b + k) = \frac{\binom{n}{k} \binom{a+b}{b}}{\binom{a+b+n-1}{b+k}} \frac{ab}{(a+b)(b+k)}

We can further simplify the expression by making use of the Gamma and Beta functions. This yields the following result:

P(Yn=b+ka+b+n)=(nk)ab(a+bb)a+b1(a+b+n1b+k)(b+k)\mathbb{P}\left(Y_n = \frac{b + k}{a + b+ n}\right) = \binom{n}{k}\frac{ab \binom{a+b}{b}}{a + b} \frac{1}{\binom{a+b+n-1}{b+k}(b+k)} (4)(4) =(nk)Γ(a+b)Γ(a)Γ(b)Γ(b+k)Γ(a+nk)Γ(a+b+n)= \binom{n}{k} \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \frac{\Gamma(b + k) \Gamma(a + n - k)}{\Gamma(a + b + n)}
P(Yn=b+ka+b+n)=(nk)β(b+k,a+nk)β(a,b)\mathbb{P}\left(Y_n = \frac{b + k}{a + b+ n}\right) = \binom{n}{k} \frac{\beta(b + k, a + n - k)}{\beta(a, b)}

Where the Gamma and Beta functions are defined as Γ:x(x1)!\Gamma: x \mapsto (x - 1)! and β:(a,b)Γ(a)Γ(b)Γ(a+b)=01xa1(1x)b1dx\beta: (a, b) \mapsto \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)} = \int_0^1 x^{a-1} (1-x)^{b-1}dx respectively.

Note that, up to a simple rearrangement of terms in Equation (4), we observe that YnY_n follows a Dirichlet-multinomal distribution with parameters α=(a,b)\alpha = (a, b) . And in fact, this distribution is also known as the multivariate Pólya distribution and can be generalized to urns with more than two color populations.

Asymptotic distribution

To prove a convergence in distribution, we want to look at the limit of the repartition function of YnY_n , defined as FYn(x)=P(Ynx)F_{Y_n}(x) = \mathbb{P}(Y_n \leq x) , when the time step nn \rightarrow \infty . Using the previous simplified expression, we have:

FYn(t)=k=0t(a+b+n)bP(Yn=b+ka+b+n)F_{Y_n} (t) = \sum_{k = 0}^{t (a + b+ n) - b} \mathbb{P}\left(Y_n = \frac{b + k}{a + b+ n}\right) =1β(a,b)01k=0t(a+b+n)b(nk)xb+k1(1x)a+nk1dx= \frac{1}{\beta(a,b)} \int_0^1 \sum_{k=0}^{t(a+b+n) - b} \binom{n}{k} x^{b+k-1} (1 - x)^{a+n-k-1} dx (5)(5) =1β(a,b)01xb1(1x)a1k=0t(a+b+n)b(nk)xk(1x)nkg(x,t,n,a,b)dx= \frac{1}{\beta(a,b)} \int_0^1 x^{b-1} (1 - x)^{a-1} \underbrace{\sum_{k=0}^{t(a+b+n) - b} \binom{n}{k} x^{k} (1 - x)^{n-k}}_{g(x, t, n,a,b)} dx

Thus g(x,t,n,a,b)g(x,t,n, a, b) is the only term that depends on nn . Its expression is clearly linked, and similar to, a binomial expansion: More specifically, if we define the random variable XnX_n with a binomial distribution of parameters nn and x[0,1]x \in [0, 1] , then we have exactly that g(x,t,n,a,b)=P(Xnt(a+b+n)b)g(x, t, n, a, b) = \mathbb{P}(X_n \leq t(a+b+n) - b) .

We can then directly apply the Demoivre-Laplace theorem, which derives from the Central Limit theorem, to XnX_n which yields that, for large nn ,

g(x,t,n,a,b)Φ(n(tx)nx(1x)+t(a+b)bnx(1x))g(x, t, n, a,b) \simeq \Phi\left(\frac{n (t - x)}{\sqrt{nx(1-x)}} + \frac{t(a+b) - b}{\sqrt{nx(1-x)}}\right)

where Φ\Phi is the repartition function of the standard Gaussian distribution, N(0,1)\mathcal{N}(0,1) , and, as such, is continuous and bounded. As a result, we have that:

limng(x,t,n,a,b)={1,if t>x0,if t<x\text{lim}_{n \rightarrow \infty} g(x, t, n, a,b) = \left\{\begin{array}{lr} 1, & \text{if } t > x\\ 0, & \text{if } t < x \end{array}\right.

By splitting the integral from Equation (5) at point t[0,1]t \in [0, 1] , we conclude that

limnFYn(t)=1β(a,b)0txb1(1x)a1dx\text{lim}_{n \rightarrow \infty} F_{Y_n} (t) = \frac{1}{\beta(a,b)} \int_0^t x^{b-1} (1 - x)^{a-1} dx

The right-side term is exactly the repartition function of a Beta distribution at point tt . Thus, the proportion of blue balls in the urn, YnY_n , converges in distribution to the Beta distribution with parameters b and a when nn grows.

Figure 3: Histograms of observed values of Yn=1000Y_{n = 1000} for 10000 runs with different initial conditions (a , b), compared to the density function of the β(b,a)\beta(b, a) distribution (represented as --- ).

Generalized Pólya urn

We won’t study such models here, but it is possible to extend the Pólya urn experiment to more flexible frameworks: Generalized Pólya urns follow the same workflow as the one we have been working with until now, but with different replacement values. More specifically, the experiment is described as follows:

Given an urn containing initially aa orange balls and bb blue balls, we pick one ball at random from the urn. If the ball is orange, place it back and add α\alpha new orange balls and β\beta new blue balls to the urn; If the ball is blue, do the same but with γ\gamma new orange balls and δ\delta respectively instead. We then repeat this sampling procedure nn times.

Such a model is often represented by its initial conditions (a , b), and its replacement matrix, R=(αβγδ)R = \begin{pmatrix} \alpha & \beta \\ \gamma & \delta \end{pmatrix} .

It is also possible to generalize the urn to an arbitrary number of colors, leading to a d×dd \times d replacement matrix. For instance, in the standard Pólya setting, we have d=2d = 2 and R=(1001)=I2R = \begin{pmatrix} 1 & 0 \\ 0& 1 \end{pmatrix} = I_2


  • “A Pólya Contagion Model for Networks”, M. Hayhoe et al, 2017
  • “A survey of random processes with reinforcement”, R. Pemantle, 2017
  • “On generalized Pólya urn models”, M. Chen and M. Kuba, 2011
  • “Pólya urn models: Lecture Notes”, N. Pouyanne, 2004
  • “Rate of convergence of Pólya’s urn to the Beta distribution”, J. Drinane, 2008
  • The dynamics of correlated novelties”, F. Tria et al, 2014
  • “Urnes aléatoires, populations en équilibre et séries génératrices”, B. Mallein, 2010