Bayesian Inference – Episode 2

Poisson, Gamma and the Objective

Last time we set the objective: to find the posterior distribution of the mean of a set of Poisson-distributed variables, using a Gamma distribution over that mean as the prior information.

Caution: Math Ahead!

\text{Poisson distribution function: }  f(x_i|\mu) = \frac{\mu^{x_i}e^{-\mu}}{x_i!}, \quad i = 1, ..., n \\ \\ \text{The joint likelihood is } f(x_1, ..., x_n|\mu) = \displaystyle \prod_i f(x_i|\mu) = \displaystyle \prod_i \frac{\mu^{x_i}e^{-\mu}}{x_i!} \\ \\   =  \frac{\mu^{\sum_i x_i}e^{-n\mu}}{\displaystyle \prod_i x_i!} \\ \\  \text{Gamma distribution function: } f(\mu|a,b) = \frac{b^a}{\Gamma(a)}\mu^{a-1}e^{-b\mu} \text{  for } \mu > 0

So we have a likelihood and a prior. We obtain the posterior using Bayes' theorem.

f(\mu|x_1, ..., x_n, a, b) = \frac{f(x_1, ..., x_n|\mu)f(\mu|a,b)}{\int_0^{\infty}f(x_1, ..., x_n|\mu)f(\mu|a,b)\, d\mu} =  \frac{\frac{b^a}{\Gamma(a)} \frac{1}{ \displaystyle \prod_i x_i!} \mu^{a + \sum_i x_i - 1} e^{-(b+n)\mu}}        {\frac{b^a}{\Gamma(a)} \frac{1}{ \displaystyle \prod_i x_i!} \int_0^{\infty}\mu^{a + \sum_i x_i - 1} e^{-(b+n)\mu}\, d\mu} \\ \\   = \frac{ \mu^{a + \sum_i x_i - 1} e^{-(b+n)\mu}} {\int_0^{\infty} \mu^{a + \sum_i x_i - 1} e^{-(b+n)\mu}\, d\mu}

The integral in the denominator will be a constant. Therefore,

f(\mu|x_1, ..., x_n, a, b) \propto \mu^{a + \sum_i x_i - 1} e^{-(b+n)\mu}

Look at the above equation carefully. Don't you see the resemblance to a Gamma p.d.f., sans the normalizing constant?

\mu \,|\, x_1, ..., x_n, a, b \sim Gamma\left(a + \sum_i x_i ,\; b+n\right)
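The conjugate update above reduces to simple arithmetic on the prior parameters. A minimal sketch in Python (the helper name `posterior_params` is mine, not from the post):

```python
def posterior_params(a, b, data):
    """Gamma(a, b) prior + Poisson observations `data`
    -> Gamma(a + sum(data), b + len(data)) posterior."""
    return a + sum(data), b + len(data)

# Seven days of counts summing to 42000, Gamma(6000, 1) prior,
# matching the worked example below.
a_post, b_post = posterior_params(6000, 1, [6000] * 7)
print(a_post, b_post)  # -> 48000 8
```

No integration is needed at all: the whole Bayesian update is two additions, which is the practical payoff of conjugacy.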

End Game

So if you know a prior Gamma, you can get a posterior Gamma from the above equations. Recall the table from the previous post: the sum of the xi is 42000 and n is 7. Assume Gamma(6000, 1) as the prior. This leads to a posterior of Gamma(48000, 8), with mean = 48000/8 = 6000 and variance = 48000/8² = 750. The standard error is then the square root of the variance divided by the square root of n.

\sqrt{\frac{48000/8^2}{7}} = 10.35; \quad \text{credible interval: } 48000/8 \pm 10.35 = (5989.65, 6010.35)
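The arithmetic of the worked example can be checked in a few lines of Python (a sketch of the post's calculation, using the author's standard-error convention of dividing by √n):

```python
import math

a_post, b_post = 48000, 8           # posterior from Gamma(6000, 1) prior, n = 7
mean = a_post / b_post              # 6000.0
variance = a_post / b_post**2      # 750.0
se = math.sqrt(variance / 7)        # standard error per the post
lo, hi = mean - se, mean + se
print(round(se, 2), round(lo, 2), round(hi, 2))  # -> 10.35 5989.65 6010.35
```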

Expanding the Prior Landscape

Naturally, you may be wondering why I chose a prior with a mean of 6000, or where that distribution came from. These are valid questions. The prior was chosen arbitrarily to illustrate the calculations. In reality, you can get it from several sources: similar shops in the town, scenarios created for worst- (or best-) case situations, and so on. Rule number one of the scientific process is to challenge; rule two is to experiment. So we run a few cases and see what happens.

Imagine you come up with a prior of Gamma(8000, 2). What does this mean? A distribution with a mean of 4000 and a variance of 2000 (standard deviation ≈ 45). [Recall mean = a/b; variance = a/b².] The original distribution (Poisson) remains the same because it is your data.

Take another, Gamma(8000, 1): a distribution with a mean of 8000 and a variance of 8000 (standard deviation ≈ 89).

Yes, the updated (posterior) distributions do shift position, but they still hang around the probability density implied by your own data through the Poisson likelihood.

You may have noticed the power of Bayesian inference: prior information can shift expectations about the future, yet the core message of the data is retained.