Stein and the Normal Distribution

Hello and happy 2021!

Inspired by this tweet, I wanted to understand the basics of Stein’s characterization of the Normal distribution.

With Stein’s idea, we can identify the distribution of a random variable by checking that it satisfies some condition in expectation. For example, for the standard normal, we have that $X \sim N(0,1)$ if and only if for all $f$ with $\mathbb{E}|f'(X)| < \infty$ we have:

$$\mathbb{E}[x f(x) - f'(x)] = 0.$$

The operator $\mathcal{A}f := x f - f'$ is then called the Stein operator, and we can rewrite the result with this operator as: $\mathbb{E}_P[\mathcal{A}f] = 0$ for all $f \in C_b^1$ iff $P$ is $N(0,1)$.
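As a quick sanity check of this characterization, here is a minimal Monte Carlo sketch. It is only an illustration: the helper name `stein_residual`, the test function $f = \sin$ (which lies in $C_b^1$), and the uniform comparison distribution are choices made here, not part of the original argument.

```python
import math
import random


def stein_residual(samples, f, f_prime):
    """Monte Carlo estimate of E[x f(x) - f'(x)] over the given samples."""
    return sum(x * f(x) - f_prime(x) for x in samples) / len(samples)


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 200_000

# Samples from the standard normal.
normal_samples = [rng.gauss(0.0, 1.0) for _ in range(n)]

# Samples from a uniform distribution on [-sqrt(3), sqrt(3)],
# which also has mean 0 and variance 1 but is not normal.
a = math.sqrt(3.0)
uniform_samples = [rng.uniform(-a, a) for _ in range(n)]

# Test function f = sin with derivative f' = cos.
res_normal = stein_residual(normal_samples, math.sin, math.cos)
res_uniform = stein_residual(uniform_samples, math.sin, math.cos)

print("normal residual: ", res_normal)
print("uniform residual:", res_uniform)
```

The normal residual should be close to zero up to Monte Carlo error. For the uniform samples, a direct calculation gives $\mathbb{E}[x \sin x - \cos x] = -\cos\sqrt{3} \approx 0.16$, so matching the first two moments of $N(0,1)$ is not enough to satisfy Stein’s condition.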

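This operator is not unique, as we can always add terms whose expectation under $P$ is zero; for example, $\mathcal{B}f := x f - f' + x$ satisfies

$$\mathbb{E}_P[\mathcal{B}f] = \mathbb{E}_P[\mathcal{A}f] + \mathbb{E}_P[x] = \mathbb{E}_P[\mathcal{A}f],$$

since $\mathbb{E}_P[x] = 0$ for the standard normal.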

How can we show that $\mathcal{A}$ characterises the standard Normal? A key identity is that the probability density function (PDF) of $N(0,1)$ satisfies $P' + xP = 0$. So, if $P$ is indeed the PDF of the standard normal, then applying integration by parts to $\mathbb{E}_P[f']$ and using the differential equation gives us Stein’s formula.
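As a sketch of that integration by parts, assuming $f$ is bounded and continuously differentiable so that the boundary term $f(x)P(x)$ vanishes as $x \to \pm\infty$:

$$
\begin{aligned}
\mathbb{E}_P[f'] &= \int_{-\infty}^{\infty} f'(x)\,P(x)\,dx \\
&= \big[f(x)\,P(x)\big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} f(x)\,P'(x)\,dx \\
&= \int_{-\infty}^{\infty} x\,f(x)\,P(x)\,dx = \mathbb{E}_P[x f],
\end{aligned}
$$

where the last line uses $P' = -xP$. Rearranging gives $\mathbb{E}_P[x f - f'] = 0$.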

Conversely, if $P$ is the PDF of some distribution that satisfies Stein’s formula for all $f \in C_b^1$, then integration by parts on $\mathbb{E}_P[f']$ leads us back to $P' + xP = 0$. That ODE is separable with solution $P \propto \exp(-x^2/2)$, which, up to normalisation, is the PDF of the standard normal!
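Spelled out, as a sketch that assumes $P$ is differentiable and decays at infinity so that the boundary term vanishes:

$$
0 = \mathbb{E}_P[x f - f'] = \int x\,f(x)\,P(x)\,dx - \int f'(x)\,P(x)\,dx = \int f(x)\,\big(x P(x) + P'(x)\big)\,dx.
$$

Since this holds for every test function $f$, the term in brackets must vanish, which gives $P' + xP = 0$. Separating variables:

$$
\frac{P'(x)}{P(x)} = -x \quad\Longrightarrow\quad \log P(x) = -\frac{x^2}{2} + c \quad\Longrightarrow\quad P(x) = C\,e^{-x^2/2},
$$

and the normalisation $\int P(x)\,dx = 1$ fixes $C = 1/\sqrt{2\pi}$.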

This strategy (derive an ODE for the density function, then obtain its weak form by multiplying by a smooth function $f$ and integrating) can be repeated to get Stein operators for other distributions, e.g., the exponential.
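For instance, here is a sketch of the same recipe for the exponential distribution. The rate-1 parametrisation and the restriction to test functions with $f(0) = 0$ are choices made here for illustration. The density $P(x) = e^{-x}$ on $x \ge 0$ satisfies $P' + P = 0$. Multiplying by $f$ and integrating by parts on $[0, \infty)$:

$$
\begin{aligned}
0 &= \int_0^\infty f(x)\,\big(P'(x) + P(x)\big)\,dx \\
&= \big[f(x)\,P(x)\big]_0^\infty - \int_0^\infty f'(x)\,P(x)\,dx + \int_0^\infty f(x)\,P(x)\,dx \\
&= -f(0) + \mathbb{E}_P[f - f'].
\end{aligned}
$$

So for bounded $f$ with $f(0) = 0$ we get $\mathbb{E}_P[f - f'] = 0$, which suggests $f \mapsto f - f'$ as a candidate Stein operator for the rate-1 exponential.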
