Law of total expectation

Proposition in probability theory

The proposition in probability theory known as the law of total expectation,[1] the law of iterated expectations[2] (LIE), Adam’s law,[3] the tower rule,[4] and the smoothing theorem,[5] among other names, states that if X is a random variable whose expected value E(X) is defined, and Y is any random variable on the same probability space, then

\operatorname{E}(X) = \operatorname{E}(\operatorname{E}(X \mid Y)),

i.e., the expected value of the conditional expected value of X given Y is the same as the expected value of X.

One special case states that if {A_i}_i is a finite or countable partition of the sample space, then

\operatorname{E}(X) = \sum_i \operatorname{E}(X \mid A_i) \operatorname{P}(A_i).

Note: The conditional expected value E(X | Z) is a random variable whose value depends on the value of Z. The conditional expected value of X given the event Z = z is a function of z. If we write E(X | Z = z) = g(z), then the random variable E(X | Z) is g(Z). Similar comments apply to the conditional covariance.
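To make these two readings concrete, here is a minimal numerical sketch in Python (the joint distribution and all names are illustrative assumptions, not taken from the article): it tabulates g(z) = E(X | Z = z) for a small discrete joint distribution and checks that E(g(Z)) equals E(X).

# Minimal sketch: check E[E[X | Z]] = E[X] for an illustrative discrete joint distribution.
# joint[(x, z)] = P(X = x, Z = z)
joint = {
    (0, 0): 0.1, (0, 1): 0.2,
    (1, 0): 0.3, (1, 1): 0.1,
    (2, 0): 0.1, (2, 1): 0.2,
}
xs = sorted({x for x, _ in joint})
zs = sorted({z for _, z in joint})

def p_z(z):
    # marginal probability P(Z = z)
    return sum(joint[(x, z)] for x in xs)

def g(z):
    # g(z) = E[X | Z = z]
    return sum(x * joint[(x, z)] / p_z(z) for x in xs)

e_x = sum(x * p for (x, _), p in joint.items())   # E[X]
e_g_z = sum(g(z) * p_z(z) for z in zs)            # E[g(Z)] = E[E[X | Z]]
print(e_x, e_g_z)                                  # both print 1.0 (up to floating-point rounding)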

Example[edit]

Suppose that only two factories supply light bulbs to the market. Factory X’s bulbs work for an average of 5000 hours, whereas factory Y’s bulbs work for an average of 4000 hours. It is known that factory X supplies 60% of the total bulbs available. What is the expected length of time that a purchased bulb will work for?

Applying the law of total expectation, we have:

\operatorname{E}(L) = \operatorname{E}(L \mid X)\operatorname{P}(X) + \operatorname{E}(L \mid Y)\operatorname{P}(Y) = 5000 \cdot 0.6 + 4000 \cdot 0.4 = 4600,

where

- E(L) is the expected life of the bulb;
- P(X) = 0.6 is the probability that the purchased bulb was manufactured by factory X;
- P(Y) = 0.4 is the probability that the purchased bulb was manufactured by factory Y;
- E(L | X) = 5000 is the expected lifetime of a bulb manufactured by factory X;
- E(L | Y) = 4000 is the expected lifetime of a bulb manufactured by factory Y.

Thus each purchased light bulb has an expected lifetime of 4600 hours.
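The same arithmetic as a quick Python check (the values are taken directly from the example above):

# Expected lifetime of a purchased bulb: weighted average of the two factory means.
expected_lifetime = 5000 * 0.6 + 4000 * 0.4
print(expected_lifetime)  # 4600.0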

Proof in the finite and countable cases[edit]

Let the random variables X and Y, defined on the same probability space, assume a finite or countably infinite set of finite values. Assume that E[X] is defined, i.e. min(E[X_+], E[X_-]) < ∞. If {A_i} is a partition of the probability space Ω, then

\operatorname{E}[X] = \sum_i \operatorname{E}[X \mid A_i] \operatorname{P}(A_i).

Proof.

\sum_i \operatorname{E}[X \mid A_i] \operatorname{P}(A_i) = \sum_i \left( \sum_x x \cdot \operatorname{P}(X = x \mid A_i) \right) \operatorname{P}(A_i) = \sum_i \sum_x x \cdot \operatorname{P}(X = x,\, A_i).

If the series is finite, then we can switch the summations around, and the previous expression will become

\sum_x \sum_i x \cdot \operatorname{P}(X = x,\, A_i) = \sum_x x \operatorname{P}(X = x) = \operatorname{E}[X],

since the A_i partition Ω and hence \sum_i \operatorname{P}(X = x,\, A_i) = \operatorname{P}(X = x).

If, on the other hand, the series is infinite, then its convergence cannot be conditional, due to the assumption that min(E[X_+], E[X_-]) < ∞. The series converges absolutely if both E[X_+] and E[X_-] are finite, and diverges to infinity when either E[X_+] or E[X_-] is infinite. In both scenarios, the above summations may be exchanged without affecting the sum.
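The finite case can also be checked numerically. The sketch below is illustrative (the joint probabilities and names are assumptions, not from the article): it evaluates the double sum over (x, i) in both orders and confirms that both orders give E[X].

# Illustrative check: for a finite partition, the double sum
#   sum_i sum_x x * P(X = x, A_i)
# can be evaluated in either order, and both orders give E[X].
joint = {
    # joint[(x, i)] = P(X = x, A_i), where A_0 and A_1 partition the sample space
    (1, 0): 0.2, (1, 1): 0.1,
    (3, 0): 0.3, (3, 1): 0.4,
}
xs = sorted({x for x, _ in joint})
cells = sorted({i for _, i in joint})

sum_over_i_first = sum(sum(x * joint[(x, i)] for x in xs) for i in cells)
sum_over_x_first = sum(sum(x * joint[(x, i)] for i in cells) for x in xs)
e_x = sum(x * sum(joint[(x, i)] for i in cells) for x in xs)
print(sum_over_i_first, sum_over_x_first, e_x)  # all three print 2.4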

Proof in the general case[edit]

Let (Ω, ℱ, P) be a probability space on which two sub-σ-algebras 𝒢_1 ⊆ 𝒢_2 ⊆ ℱ are defined. For a random variable X on such a space, the smoothing law states that if E[X] is defined, i.e. min(E[X_+], E[X_-]) < ∞, then

\operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] = \operatorname{E}[X \mid \mathcal{G}_1] \quad \text{(a.s.)}.

Proof. Since a conditional expectation is a Radon–Nikodym derivative, verifying the following two properties establishes the smoothing law:

- \operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] is 𝒢_1-measurable;
- \int_{G_1} \operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] \, d\operatorname{P} = \int_{G_1} X \, d\operatorname{P} for all G_1 ∈ 𝒢_1.

The first of these properties holds by definition of the conditional expectation. To prove the second one, note that

\min\left( \int_{G_1} X_+ \, d\operatorname{P},\; \int_{G_1} X_- \, d\operatorname{P} \right) \leq \min\left( \int_\Omega X_+ \, d\operatorname{P},\; \int_\Omega X_- \, d\operatorname{P} \right) = \min(\operatorname{E}[X_+], \operatorname{E}[X_-]) < \infty,

so the integral \int_{G_1} X \, d\operatorname{P} is defined (not equal to ∞ − ∞).

The second property thus holds, since G_1 ∈ 𝒢_1 ⊆ 𝒢_2 implies

\int_{G_1} \operatorname{E}[\operatorname{E}[X \mid \mathcal{G}_2] \mid \mathcal{G}_1] \, d\operatorname{P} = \int_{G_1} \operatorname{E}[X \mid \mathcal{G}_2] \, d\operatorname{P} = \int_{G_1} X \, d\operatorname{P}.

Corollary. In the special case when 𝒢_1 = {∅, Ω} and 𝒢_2 = σ(Y), the smoothing law reduces to

\operatorname{E}[\operatorname{E}[X \mid Y]] = \operatorname{E}[X].
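The nested form of the smoothing law can be illustrated with a small discrete model (an illustrative sketch; the joint distribution and variable names are assumptions, not from the article). Here σ(Z1) plays the role of the coarser 𝒢_1 and σ(Z1, Z2) that of 𝒢_2, and the check confirms E[E[X | Z1, Z2] | Z1 = z1] = E[X | Z1 = z1] for each value z1.

# Illustrative check of the smoothing law with nested information:
# sigma(Z1) is a sub-sigma-algebra of sigma(Z1, Z2).
joint = {
    # joint[(x, z1, z2)] = P(X = x, Z1 = z1, Z2 = z2); illustrative numbers
    (0, 0, 0): 0.10, (0, 0, 1): 0.15,
    (1, 0, 0): 0.05, (1, 0, 1): 0.20,
    (0, 1, 0): 0.20, (0, 1, 1): 0.05,
    (1, 1, 0): 0.10, (1, 1, 1): 0.15,
}

def cond_exp_x(event):
    # E[X | event], where event is a predicate on (z1, z2)
    mass = sum(p for (x, a, b), p in joint.items() if event(a, b))
    return sum(x * p for (x, a, b), p in joint.items() if event(a, b)) / mass

for z1 in (0, 1):
    p_z1 = sum(p for (_, a, b), p in joint.items() if a == z1)
    # outer expectation: average E[X | Z1 = z1, Z2 = z2] over Z2, given Z1 = z1
    outer = 0.0
    for z2 in (0, 1):
        p_z1z2 = sum(p for (_, a, b), p in joint.items() if a == z1 and b == z2)
        outer += cond_exp_x(lambda a, b: a == z1 and b == z2) * (p_z1z2 / p_z1)
    direct = cond_exp_x(lambda a, b: a == z1)  # E[X | Z1 = z1]
    print(z1, round(outer, 10), round(direct, 10))  # the two values agree for each z1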

Alternative proof for \operatorname{E}[\operatorname{E}[X \mid Y]] = \operatorname{E}[X].

This is a simple consequence of the measure-theoretic definition of conditional expectation. By definition, \operatorname{E}[X \mid Y] := \operatorname{E}[X \mid \sigma(Y)] is a σ(Y)-measurable random variable that satisfies

\int_A \operatorname{E}[X \mid Y] \, d\operatorname{P} = \int_A X \, d\operatorname{P}

for every measurable set A ∈ σ(Y). Taking A = Ω proves the claim:

\operatorname{E}[\operatorname{E}[X \mid Y]] = \int_\Omega \operatorname{E}[X \mid Y] \, d\operatorname{P} = \int_\Omega X \, d\operatorname{P} = \operatorname{E}[X].

Proof of partition formula[edit]

\sum_i \operatorname{E}[X \mid A_i] \operatorname{P}(A_i) = \sum_i \operatorname{E}[X I_{A_i}],

where I_{A_i} is the indicator function of the set A_i.

If the partition {A_i}_{i=0}^{n} is finite, then, by linearity, the previous expression becomes

\sum_{i=0}^{n} \operatorname{E}[X I_{A_i}] = \operatorname{E}\!\left[ X \sum_{i=0}^{n} I_{A_i} \right] = \operatorname{E}[X],

and we are done.

If, however, the partition {A_i}_{i=0}^{∞} is infinite, then we use the dominated convergence theorem to show that

\sum_{i=0}^{\infty} \operatorname{E}[X I_{A_i}] = \operatorname{E}[X].

Indeed, for every n ≥ 0,

\left| \sum_{i=0}^{n} X I_{A_i} \right| = |X| \cdot I_{\bigcup_{i=0}^{n} A_i} \leq |X|.

Since every element of the set Ω falls into exactly one cell A_i of the partition, it is straightforward to verify that the sequence \left\{ \sum_{i=0}^{n} X I_{A_i} \right\}_{n=0}^{\infty} converges pointwise to X. By initial assumption, E|X| < ∞. Applying the dominated convergence theorem yields the desired result.
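The dominated-convergence step can be illustrated numerically (an illustrative sketch; the distribution is an assumption, not from the article). Take a countable partition A_i = {Y = i} with P(A_i) = (1/2)^{i+1} and a variable X with E[X | A_i] = i; the truncated sums Σ_{i ≤ n} E[X | A_i] P(A_i) then increase to E[X] = 1 as n grows.

# Illustrative: truncated sums over a countable partition converge to E[X].
# Here P(A_i) = 0.5 ** (i + 1) and E[X | A_i] = i, so E[X] = sum_i i * 0.5**(i + 1) = 1.
def truncated_sum(n):
    return sum(i * 0.5 ** (i + 1) for i in range(n + 1))

for n in (1, 5, 10, 20, 40):
    print(n, truncated_sum(n))  # approaches 1.0 as n grows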

See also[edit]

References[edit]
