CSU Hayward

Statistics Department

Derivation and Applications of the Poisson Distribution


Preliminaries

Because our derivation of the Poisson distribution involves taking limits, it is best to put some standard results about limits on record as we begin. All limits in this section are taken as n approaches infinity. We begin with limiting expressions that involve the fundamental mathematical constant e = 2.71828..., sometimes referred to as the base of natural logarithms. The key result is that, for any constant c, lim (1 + c/n)^n = e^c.

The following table, generated by Minitab from illustrative numbers put into the first column of a worksheet, shows how the limiting process proceeds. First, look at the column headed c = 1. As n increases, (1 + 1/n)^n increases also, approaching, but never exceeding, e. If c = 3, the limit is e^3 = 20.0855; if c = –1, the limit is e^(–1) = 1/e = 0.367879.

 MTB > set c1
 DATA> 1 2 5 10 100 1000 10000
 DATA> end
 MTB > let c2 = (1+1/c1)**c1
 MTB > let c3 = (1+3/c1)**c1
 MTB > let c4 = (1-1/c1)**c1
 MTB > print c1-c4

Row       n       c=1       c=3      c=-1

   1      1   2.00000    4.0000  0.000000
   2      2   2.25000    6.2500  0.250000
   3      5   2.48832   10.4858  0.327680
   4     10   2.59374   13.7858  0.348678
   5    100   2.70481   19.9955  0.366032
   6   1000   2.71692   20.0765  0.367695
   7  10000   2.71815   20.0846  0.367861
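The same table can be reproduced outside Minitab. The following Python sketch is only an illustration and not part of the original worksheet; the function name compound is an assumption made for this example.

```python
import math

def compound(c, n):
    """Return (1 + c/n)**n, which approaches e**c as n grows."""
    return (1 + c / n) ** n

# The same illustrative values of n used in the Minitab worksheet
ns = [1, 2, 5, 10, 100, 1000, 10000]

print(f"{'n':>6} {'c=1':>9} {'c=3':>9} {'c=-1':>9}")
for n in ns:
    print(f"{n:>6} {compound(1, n):9.5f} {compound(3, n):9.4f} {compound(-1, n):9.6f}")

# Limiting values e**c, for comparison with the last row
print(f"{'limit':>6} {math.exp(1):9.5f} {math.exp(3):9.4f} {math.exp(-1):9.6f}")
```

Adding larger values of n to the list shows each column creeping closer to its limit.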

Next we deal with a limit involving the binomial coefficient C(n, k), the "combinations of n things taken k at a time."

C(n, k) = n! / [k!(n–k)!]

= (n)(n–1)(n–2) ... (n–k+1) / k!,

where there are k factors in the numerator of the right-hand expression. From this it is easy to see that, as n approaches infinity,

lim C(n, k) / n^k = lim (1/k!) [n(n–1)(n–2) ... (n–k+1)] / n^k

= (1/k!) lim [n/n] [(n–1)/n] [(n–2)/n] ... [(n–k+1)/n]

= (1/k!) lim [1 – 1/n][1 – 2/n] ... [1 – (k–1)/n]

= 1/k!,

because each of the k – 1 bracketed factors in the next-to-last expression approaches 1. (Notice that k is a fixed number as n becomes larger.)
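A quick numerical check of this limit can be sketched in Python. The function name coeff_ratio and the choice k = 3 are assumptions made for illustration only.

```python
import math

def coeff_ratio(n, k):
    """Return C(n, k) / n**k, which approaches 1/k! as n grows with k fixed."""
    return math.comb(n, k) / n ** k

k = 3  # fixed while n increases
for n in [10, 100, 1000, 10000, 100000]:
    print(n, coeff_ratio(n, k))

print("1/k! =", 1 / math.factorial(k))
```

The printed ratios increase toward 1/3! = 1/6 because each factor 1 – j/n lies below 1 and approaches 1.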

An Approximately Binomial Model

Suppose that a radioactive source of very long half-life emits particles into a Geiger counter at an average rate of 3 per second. The number X of particles actually seen in any particular one-second interval is a random variable. The average number seen is 3, but the actual number will frequently be 2 or 4, and any of the values 0, 1, 2, 3, 4, 5, ... is possible. We can use the binomial distribution to approximate the distribution of the random variable X.

In order to construct the binomial approximation, let a one-second interval of time be divided into 100 consecutive intervals of length 0.01 sec. each. Consider each of these small intervals as a binomial trial. Because n = 100 and E(X) = np = 3, we conclude that we must have p = 0.03. This construction requires that particles arrive in the small intervals independently, that we regard a small interval as a "success" if it contains a particle, and that the probability P(Success) is the same for each small interval, namely P(Success) = p = 0.03.

Thus, the approximate distribution of X is given by the expression

P(X = k) = C(100, k)(0.03)^k (0.97)^(100–k).

This expression is only approximate because it does not take into account the possibility that there might be two or more particles in one of the small intervals. A "double hit" in a small interval is very unlikely because the probability of a single hit is about 0.03 and so, by independence, the probability of a double hit in any one small interval should be something like (0.03)^2 = 0.0009. Three or more hits in a small interval are even less likely.

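Thus there are two reasons that the above expression for P(X = k) is somewhat unsatisfactory: it is only approximate, because it ignores multiple hits in a small interval, and it is tedious to compute.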
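The binomial construction described above can also be checked by simulation. The Python sketch below treats each of the 100 small intervals as an independent trial with P(Success) = 0.03; the function name count_hits, the seed, and the number of simulated seconds are assumptions made for illustration only.

```python
import random

def count_hits(n_intervals=100, p=0.03, rng=random):
    """Count the small intervals that register a particle (one trial per interval)."""
    return sum(rng.random() < p for _ in range(n_intervals))

rng = random.Random(1999)  # fixed seed so that the run is repeatable
counts = [count_hits(rng=rng) for _ in range(10_000)]  # 10,000 simulated seconds

print("average count per second:", sum(counts) / len(counts))
print("fraction of seconds with exactly 3 hits:", counts.count(3) / len(counts))
```

The average should come out close to np = 3. Like the binomial formula itself, this simulation cannot record more than one particle in a small interval.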

A Minitab printout of this approximating binomial distribution is shown in the column headed n = 100 below. (The next column is explained just below the printout. The last column refers to the Poisson distribution derived in the next section.)


 MTB > set c11
 DATA> 0:16
 DATA> end
 MTB > pdf c11 c12;
 SUBC> bino 100 .03.
 MTB > pdf c11 c13;
 SUBC> bino 1000 .003.
 MTB > pdf c11 c14;
 SUBC> pois 3.
 MTB > print c11-c14

 Row    k       n=100     n=1000      Pois(3)

   1    0    0.047553   0.049563     0.049787
   2    1    0.147070   0.149137     0.149361
   3    2    0.225153   0.224154     0.224042
   4    3    0.227474   0.224379     0.224042
   5    4    0.170606   0.168284     0.168031
   6    5    0.101308   0.100869     0.100819
   7    6    0.049610   0.050333     0.050409
   8    7    0.020604   0.021507     0.021604
   9    8    0.007408   0.008033     0.008102
  10    9    0.002342   0.002664     0.002701
  11   10    0.000659   0.000794     0.000810
  12   11    0.000167   0.000215     0.000221
  13   12    0.000038   0.000053     0.000055
  14   13    0.000008   0.000012     0.000013
  15   14    0.000002   0.000003     0.000003
  16   15    0.000000   0.000001     0.000001
  17   16    0.000000   0.000000     0.000000

We could minimize the inaccuracy due to multiple hits by making the small intervals more numerous and the probability of success in any one of them correspondingly smaller. For example, we could deal with 1000 intervals of a millisecond each to obtain  P(X = k) = C(1000, k)(0.003)^k (0.997)^(1000–k).   The resulting computation is even more tedious, but the possibility of multiple hits is now truly minuscule. The results are given in the column headed n = 1000 of the printout above.
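For readers without Minitab, the two binomial columns can be recomputed with a short Python sketch. The function name binom_pmf is an assumption made for this example, and p squared is used only as the rough double-hit probability discussed earlier.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for a binomial distribution with n trials and success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

mean = 3  # average count per second, as in the text

# Finer subdivision makes a double hit in any one small interval less likely
for n in (100, 1000):
    p = mean / n
    print(f"n = {n}: p = {p}, rough double-hit probability p**2 = {p ** 2:.6f}")

print(f"{'k':>3} {'n=100':>10} {'n=1000':>10}")
for k in range(17):
    print(f"{k:>3} {binom_pmf(k, 100, 0.03):10.6f} {binom_pmf(k, 1000, 0.003):10.6f}")
```

Going from 100 to 1000 intervals cuts the rough double-hit probability per interval from 0.0009 to 0.000009.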

Notation note: The remainder of this document uses the lower-case Greek letter lambda (λ). In some renderings of this page it may appear instead as the Latin letter "ell" (l).

An Exact Probability Model

Suppose now that the average number of particles arriving in an interval of length 1 sec. is λ, so that we seek the distribution of a random variable X with E(X) = λ. If we let the small intervals of the previous section get ever smaller and more numerous, we are talking about binomial distributions with n trials and P(Success) = λ/n. Taking the limit as n approaches infinity, we have

P(X = k) = lim C(n, k)(λ/n)^k (1 – λ/n)^(n–k)

= λ^k lim [C(n, k)/n^k][1 + (–λ)/n]^n [1 – λ/n]^(–k).

We have shown above that the first bracketed factor approaches 1/k! and that the second approaches e^(–λ) (the case c = –λ of the limit in the Preliminaries). The third approaches 1 because k is fixed and 1 – λ/n approaches 1. Thus,

P(X = k) = e^(–λ) λ^k / k!,   k = 0, 1, 2, 3, ... .
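The behavior of the three bracketed factors can be checked numerically. In the Python sketch below, the function name limit_factors and the values k = 2 and λ = 3 (written lam in the code) are assumptions made for illustration only.

```python
import math

def limit_factors(n, k, lam):
    """Return the three bracketed factors of the binomial probability for finite n."""
    first = math.comb(n, k) / n ** k   # approaches 1/k!
    second = (1 - lam / n) ** n        # approaches e**(-lam)
    third = (1 - lam / n) ** (-k)      # approaches 1
    return first, second, third

k, lam = 2, 3
for n in [10, 100, 1000, 100000]:
    first, second, third = limit_factors(n, k, lam)
    # lam**k times the three factors is the binomial probability P(X = k)
    print(n, first, second, third, lam ** k * first * second * third)

print("limits:", 1 / math.factorial(k), math.exp(-lam), 1.0,
      math.exp(-lam) * lam ** k / math.factorial(k))
```

The last number on each line is the binomial probability for that n, and it settles toward the Poisson value on the final line.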

This is the probability density function of the Poisson distribution. Because we derived the Poisson distribution as a limit of binomials all with mean λ, it is not surprising that λ is the mean of the Poisson distribution as well. It can be shown that the variance of the Poisson distribution is also numerically equal to λ. Thus, a Poisson random variable with λ = 4 counts per minute will have a standard deviation of √4 = 2 counts per minute.
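The statements about the mean and variance can be verified numerically by summing over enough values of k. This Python sketch is an illustration, not a proof; the function name poisson_pmf and the cutoff at k = 100 are assumptions made for this example.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) = e**(-lam) * lam**k / k! for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

lam = 4
ks = range(100)  # terms beyond this cutoff are negligible when lam = 4

total = sum(poisson_pmf(k, lam) for k in ks)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean) ** 2 * poisson_pmf(k, lam) for k in ks)

print("total probability:", total)
print("mean:", mean)
print("variance:", var)
print("standard deviation:", math.sqrt(var))
```

Up to rounding error, the total probability is 1, the mean and variance are both 4, and the standard deviation is 2.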

Siméon Poisson was a French mathematician of the 19th century who developed this distribution. (Poisson is the French word for fish; it is pronounced something like PWAH-ssohn, and nothing at all like the English word poison. You probably don't say FROOD for Freud or BATCH for Bach, so it's only fair not to say POY-sson for Poisson.)

Applications of the Poisson Distribution

An extraordinarily large number of natural and social phenomena have been successfully modeled using the Poisson distribution.

The conditions in each case are the same.

As with any probability model, the application of the Poisson family of distributions to any particular situation may not be perfect (except perhaps for radioactive decay of stable samples), but the situations that can be satisfactorily modeled by Poisson distributions are extraordinarily many and varied.

Problems

1.  In a one-second period of time, a radioactive source emits a random number X of particles into a counter, where X has a Poisson distribution with mean 3.

2.  Consider the radioactive source in Problem 1. Suppose that we are interested in the number of particles counted in a two-second time period.

3.  A particular batch of steel wire has flaws occurring along the length of the wire at random locations according to a Poisson distribution at a rate of 2 flaws per meter.

4.  One hundred specimens of volume 0.1 ml each are taken from a liquid suspension containing bacteria of a particular kind. Each specimen is spread onto nutrient gel in a different culture dish. Two days later we find that cultures grew in 60 of the dishes. Assume that a culture grows if, and only if, the specimen happened to have one or more bacteria in it.

5.  Let Y be a Poisson random variable with E(Y) = 50.

6.  The number of "permutations (ordered samples) of n things taken k at a time" is  P(n, k) = n! / (n–k)!.  Show that  P(n, k)/n^k  approaches 1 as n approaches infinity (for fixed k).

7.  Try Quiz Question 7 on this site.


Copyright © 1999 by Bruce E. Trumbo. All rights reserved. Intended for instructional use at California State University, Hayward. Please request permission in advance for other uses: btrumbo@csuhayward.edu.

BT/CF: Posted 11/13/1998, Last revised (mainly to include problems) 07/08/99