Simple Functions and Random Variables#
Simple Random Variable
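Let $(\Omega,\mathcal{F},P)$ be a probability space, i.e. $P(\Omega)=1$. A simple random variable $X:\Omega\to\mathbb{R}$ is a real-valued function that takes only finitely many distinct values $x_1,\dots,x_p$ and such that, for each $i$,
$$\{\omega\in\Omega : X(\omega)=x_i\}\in\mathcal{F}.$$
One way to write such a function is to partition $\Omega$ into finitely many disjoint sets $\{A_i\}_{i=1}^{p}$ with $A_i\in\mathcal{F}$, i.e. $\bigcup_{i=1}^{p}A_i=\Omega$ and $A_i\cap A_j=\emptyset$ for $i\ne j$, and write
$$X(\omega)=\sum_{i=1}^{p}x_i\,\mathbf{1}[\omega\in A_i].$$
Then we can say that the probability that $X$ is equal to $x_i$ is
$$P(X=x_i)=P(\{\omega\in\Omega : X(\omega)=x_i\})=P(A_i).$$
Furthermore, this allows us to define the expectation of the simple random variable $X$ to be
$$\mathrm{E}X=\sum_{i=1}^{p}x_i\,P(X=x_i).$$
Simple Measurable Function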
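Let $(\Omega,\mathcal{F},\mu)$ be a measure space. A simple function $F:\Omega\to\mathbb{R}$ is a function of the form
$$F(\omega)=\sum_{i=1}^{p}x_i\,\mathbf{1}[\omega\in B_i],\qquad B_i\in\mathcal{F},$$
that is, a finite linear combination of indicator functions. The sets $B_i$ need not be disjoint, but any simple function can be rewritten in terms of disjoint sets, for example the level sets $\{\omega : F(\omega)=y\}$ for the finitely many values $y$ that $F$ takes.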
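As a small illustration of passing from overlapping sets to disjoint ones, the sketch below builds a simple function on a toy six-point sample space. The sample space, the uniform measure `prob`, and the sets in `terms` are all assumptions made up for this example. It groups outcomes by the value of the function to obtain disjoint level sets, then compares the weighted sum over the overlapping sets with the weighted sum over the level sets.

```python
from fractions import Fraction

# Hypothetical finite sample space with the uniform probability measure.
omega = range(6)

def prob(event):
    """Uniform probability of a subset of omega."""
    return Fraction(len(set(event)), 6)

# Overlapping representation: F = 2 * 1[B1] + 3 * 1[B2], where B1 and B2 intersect.
terms = [(2, {0, 1, 2, 3}), (3, {2, 3, 4})]

def F(w):
    """Evaluate the simple function at the outcome w."""
    return sum(x for x, B in terms if w in B)

# Disjoint representation: group outcomes by the value F takes on them.
level_sets = {}
for w in omega:
    level_sets.setdefault(F(w), set()).add(w)

# Weighted sums over the overlapping sets and over the disjoint level sets.
overlapping_sum = sum(x * prob(B) for x, B in terms)
disjoint_sum = sum(x * prob(A) for x, A in level_sets.items())
print(level_sets, overlapping_sum, disjoint_sum)
```

Both sums agree here, which is what one expects from linearity; the disjoint form is the one that matches the expectation formula for a simple random variable.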
Then, we define the integral of a simple function to be
$$\int F\,d\mu := \sum_{i=1}^{p}x_i\,\mu(B_i).$$
Measurable Functions and Random Variables#
To extend the above idea of a simple random variable, we want to replace the finitely many values $x_i$ with arbitrary Borel sets $B\subset\mathbb{R}$. We need two measurable spaces $(X,\mathcal{X})$ and $(Y,\mathcal{Y})$.
Measurable Function
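A function $f:X\to Y$ is said to be measurable (with respect to $\mathcal{X}/\mathcal{Y}$) if $f^{-1}(B)\in\mathcal{X}$ for any $B\in\mathcal{Y}$. If $Y=\mathbb{R}$ (with its Borel σ-field), then we say that $f$ is $\mathcal{X}$-measurable.
Typically, the σ-fields of interest are the Borel σ-fields, and the space is sometimes written as $(X,\mathcal{B}(X))$ when $X$ is a topological space. Moreover, the space $(X,\mathcal{X})$ is typically taken to be $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ or $(\mathbb{R}^+,\mathcal{B}(\mathbb{R}^+))$. In this case, we say that $f$ is Borel measurable.
If we replace $\mathcal{B}(\mathbb{R})$ with $\mathcal{M}_\lambda(\mathbb{R})$, the set of Lebesgue measurable subsets of $\mathbb{R}$, then we say $f$ is Lebesgue measurable.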
Cool facts about Measurable Functions#
-
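Inverse images preserve set operations, i.e. for $f:X\to Y$ and $A,A_i\subset Y$, $i\in I$,
$$f^{-1}\Big(\bigcup_{i\in I}A_i\Big)=\bigcup_{i\in I}f^{-1}(A_i)\quad\text{and}\quad f^{-1}(Y\setminus A)=X\setminus f^{-1}(A).$$
For a measurable function $f$, this implies that $\{f^{-1}(B):B\in\mathcal{Y}\}$ is a σ-field and is contained in $\mathcal{X}$. Hence, loosely speaking, $\mathcal{Y}$ should not be too large relative to $\mathcal{X}$ if we want functions to be measurable. Furthermore, this can be used to show that the measurability of $f$ can be established by checking $f^{-1}(A)\in\mathcal{X}$ only for $A$ in a collection of sets $\mathcal{A}\subset\mathcal{Y}$ that generates $\mathcal{Y}$.
Example
The collection $\mathcal{A}$ of all half-lines $A_t=(-\infty,t]$ for $t\in\mathbb{R}$ generates $\mathcal{B}(\mathbb{R})$. Thus, $f$ is measurable as long as the sets $\{x:f(x)\le t\}$ are measurable for every $t\in\mathbb{R}$.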
-
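For any $A\in\mathcal{X}$, the indicator function $f(x)=\mathbf{1}[x\in A]$ is measurable. The σ-field generated by $f$, i.e. the collection of inverse images $f^{-1}(B)$, is simply $\{\emptyset,A,A^c,X\}\subset\mathcal{X}$.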
-
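For measurable functions $f,g:X\to\mathbb{R}$, the functions $f+g$ and $fg$ are measurable.
-
For measurable functions $\{f_i\}_{i=1}^{\infty}$ from $X$ to $\mathbb{R}$, the following are also measurable: $\sup_i f_i$, $\inf_i f_i$, $\limsup_i f_i$, $\liminf_i f_i$, and $\lim_i f_i$ if it exists.
Proof
In set notation, $\{x:\sup_i f_i(x)\le t\}=\bigcap_{i=1}^{\infty}\{x:f_i(x)\le t\}$, where the right-hand side is a countable intersection of measurable sets and hence measurable. Similarly, $\{x:\inf_i f_i(x)<t\}=\bigcup_{i=1}^{\infty}\{x:f_i(x)<t\}$, and the open half-lines $(-\infty,t)$ also generate $\mathcal{B}(\mathbb{R})$. Moreover, $\limsup_i f_i=\inf_i\sup_{j\ge i}f_j$ and $\liminf_i f_i=\sup_i\inf_{j\ge i}f_j$. If $\lim_i f_i$ exists then $\limsup_i f_i=\lim_i f_i=\liminf_i f_i$. □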
-
Let $f:X\to\mathbb{R}$ be a continuous function on a topological space $X$ equipped with its Borel σ-field; then it is measurable.
Proof
If $U$ is an open set in $\mathbb{R}$, then $f^{-1}(U)$ is open in $X$ (the definition of a continuous function between two topological spaces). Thus, the set $f^{-1}(U)$ is measurable. Since the open sets of $\mathbb{R}$ generate $\mathcal{B}(\mathbb{R})$, the function $f$ is measurable. □
-
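Given a collection of functions $f_i:X\to Y$, $i\in I$, we can make them measurable by constructing the measurable space $(X,\mathcal{X})$ with $\sigma(\{f_i\}_{i\in I})\subseteq\mathcal{X}$, where $\sigma(\{f_i\}_{i\in I})$ is the σ-field generated by the sets $f_i^{-1}(B)$ for all $i\in I$ and $B\in\mathcal{Y}$.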
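The facts about inverse images and generated σ-fields can be checked by brute force on a finite set. The following is only a sketch under made-up assumptions: a five-point space `X`, a subset `A`, and the power set of $\{0,1\}$ standing in for $\mathcal{Y}$. It collects the inverse images of the indicator of `A` and confirms that they form the four-element σ-field $\{\emptyset,A,A^c,X\}$.

```python
from itertools import chain, combinations

# Hypothetical finite space X and subset A, chosen only for illustration.
X = frozenset(range(5))
A = frozenset({1, 3})

def f(x):
    """Indicator function of A."""
    return 1 if x in A else 0

def preimage(B):
    """Inverse image of B under f."""
    return frozenset(x for x in X if f(x) in B)

def powerset(values):
    """All subsets of a finite collection, as frozensets."""
    values = list(values)
    subsets = chain.from_iterable(
        combinations(values, r) for r in range(len(values) + 1)
    )
    return [frozenset(s) for s in subsets]

# The sigma-field on the codomain {0, 1} is taken to be its power set.
sigma_f = {preimage(B) for B in powerset([0, 1])}

# Closure under complements and (finite) unions, checked exhaustively.
closed_under_complement = all(X - S in sigma_f for S in sigma_f)
closed_under_union = all(S | T in sigma_f for S in sigma_f for T in sigma_f)
print(sorted(map(sorted, sigma_f)), closed_under_complement, closed_under_union)
```

On a finite space countable unions reduce to finite ones, so the two closure checks are enough to confirm the σ-field property in this toy setting.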
Almost Surely / Almost Everywhere
Let $(\Omega,\mathcal{F},\mu)$ be a measure space. For two functions $f,g:\Omega\to\mathbb{R}$, we say that $f=g$ a.e. (almost everywhere) when the set $N=\{\omega:f(\omega)\ne g(\omega)\}$ has measure $\mu(N)=0$.
In probability theory, “almost everywhere” is replaced with “almost surely”, abbreviated a.s.; equivalently, one writes “with probability 1” or w.p.1.
Example
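Let $([0,1],\mathcal{B},\lambda)$ be the standard measure space of Borel sets on the unit interval with Lebesgue measure. Let $f(t)=0$ for all $t\in[0,1]$, and let $g(t)=0$ on $[0,1]\setminus\mathbb{Q}$ and $g(t)=1$ on $[0,1]\cap\mathbb{Q}$.
Then we have $f=g$ a.e. because $\lambda([0,1]\cap\mathbb{Q})=0$. To prove that $\lambda(\mathbb{Q})=0$, we enumerate the rational numbers as $\{q_m\}_{m=1}^{\infty}$ and, for $\epsilon>0$, surround each $q_m$ with the interval $(q_m-\epsilon/2^m,\ q_m+\epsilon/2^m)$. By countable subadditivity,
$$\lambda(\mathbb{Q})\le\sum_{m=1}^{\infty}\lambda\big((q_m-\epsilon/2^m,\ q_m+\epsilon/2^m)\big)=\sum_{m=1}^{\infty}\frac{\epsilon}{2^{m-1}}=2\epsilon.$$
Since $\epsilon$ is arbitrary, we have $\lambda(\mathbb{Q})=0$.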
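The covering argument can be checked with exact arithmetic. In this minimal sketch the helper name `cover_length` and the choice $\epsilon=1/100$ are assumptions for illustration; it sums the lengths of the first $M$ covering intervals, which depend only on $m$ and not on the particular enumeration of the rationals.

```python
from fractions import Fraction

def cover_length(eps, M):
    """Total length of the first M intervals (q_m - eps/2^m, q_m + eps/2^m)."""
    return sum(2 * eps / 2**m for m in range(1, M + 1))

eps = Fraction(1, 100)
partial_sums = [cover_length(eps, M) for M in (1, 5, 20)]

# Every partial sum equals 2 * eps * (1 - 2^-M), so it stays below 2 * eps.
print(partial_sums, 2 * eps)
```

The partial sums increase towards $2\epsilon$ without ever reaching it, which matches the geometric series in the argument above.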
Integration#
We will consider measurable functions mapping from $(\Omega,\mathcal{F})$ to $[-\infty,\infty]$, which is called the extended real line. This allows us to handle sets such as $f^{-1}(\{\infty\})$.
Notation: $f_i\uparrow f$
We use $f_i\uparrow f$ to represent a sequence of functions $\{f_i\}$ that is increasing and converges to $f$ for every $\omega$. That is, $f_i(\omega)\to f(\omega)$ and $f_i(\omega)\le f_{i+1}(\omega)$ for all $\omega$.
Theorem
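Let $(\Omega,\mathcal{F})$ be a measurable space and $\mathcal{A}$ a π-system that generates $\mathcal{F}$ (i.e. $\sigma(\mathcal{A})=\mathcal{F}$). Let $\mathcal{V}$ be a linear space of functions that contains
- the indicators $\mathbf{1}_\Omega$ and $\mathbf{1}_A$ for each $A\in\mathcal{A}$
- all functions $f$ such that $\exists f_i\in\mathcal{V}$ s.t. $f_i\uparrow f$
Then $\mathcal{V}$ contains all measurable functions.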
Proof
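First, $\mathbf{1}_A\in\mathcal{V}$ for each $A\in\mathcal{A}$ by assumption. Let $\mathcal{L}=\{B\in\mathcal{F}:\mathbf{1}_B\in\mathcal{V}\}$; we claim that $\mathcal{L}$ is a λ-system. Indeed, $\Omega\in\mathcal{L}$; if $B\subseteq C$ are in $\mathcal{L}$ then $\mathbf{1}_{C\setminus B}=\mathbf{1}_C-\mathbf{1}_B\in\mathcal{V}$ by linearity; and if $B_n\in\mathcal{L}$ with $B_n\uparrow B$ then $\mathbf{1}_{B_n}\uparrow\mathbf{1}_B$, so $B\in\mathcal{L}$. Since $\mathcal{A}\subset\mathcal{L}$, by Dynkin's π-λ Theorem, $\sigma(\mathcal{A})\subseteq\mathcal{L}$. By the definition of $\mathcal{L}$, we have $\mathcal{L}\subseteq\mathcal{F}$, thus $\mathcal{L}=\mathcal{F}$, so every indicator function $\mathbf{1}_B$ with $B\in\mathcal{F}$ is in $\mathcal{V}$.
Let $f$ be a non-negative measurable function; we can define $f_i=\min\{2^{-i}\lfloor 2^i f\rfloor,\ i\}$. Each $f_i$ takes finitely many values, so it is a finite linear combination of indicator functions and hence $f_i\in\mathcal{V}$. Furthermore, $f_i\uparrow f$ and thus $f\in\mathcal{V}$.
Lastly, for a general measurable $f$, we write $f=f^+-f^-$ where
$$f^+(\omega)=\begin{cases}f(\omega)&\text{if }f(\omega)\ge 0\\0&\text{otherwise}\end{cases}\qquad f^-(\omega)=\begin{cases}-f(\omega)&\text{if }f(\omega)<0\\0&\text{otherwise}\end{cases}$$
which are both non-negative measurable functions. □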
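The dyadic approximation used in the proof is easy to experiment with numerically. This is a minimal sketch: the cap at $i$ (so that each approximant takes finitely many values), the test function $f(\omega)=\omega^2$, and the sample points are assumptions chosen for illustration. It evaluates $\min\{2^{-i}\lfloor 2^i f\rfloor, i\}$ at a few points and records how the values move as $i$ grows.

```python
import math

def dyadic_approx(f, i):
    """Return the function w -> min(2^-i * floor(2^i * f(w)), i)."""
    def f_i(w):
        return min(math.floor(2**i * f(w)) / 2**i, i)
    return f_i

def f(w):
    """Hypothetical non-negative function used for the demonstration."""
    return w * w

points = [0.0, 0.3, 1.7, 2.5]

# One row per sample point, one column per level i = 1, ..., 11.
table = [[dyadic_approx(f, i)(w) for i in range(1, 12)] for w in points]

for w, row in zip(points, table):
    print(w, f(w), row[0], row[-1])
```

Each row is non-decreasing in $i$ and stays below $f(\omega)$; once $i$ exceeds $f(\omega)$ the cap is inactive and the gap is smaller than $2^{-i}$.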
Integral of a Measurable Function
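For a non-negative measurable function $f:\Omega\to[0,\infty]$ on the measure space $(\Omega,\mathcal{F},\mu)$, we define
$$\int f\,d\mu=\sup\Big[\sum_{i\in I}\Big\{\inf_{\omega\in A_i}f(\omega)\Big\}\mu(A_i)\Big]$$
where the supremum is taken over all finite partitions $\{A_i\}_{i\in I}$ of $\Omega$ into measurable sets.
Inside the square brackets is the integral of the simple function that takes the value $\inf_{\omega\in A_i}f(\omega)$ on the set $A_i$. Hence, for a non-negative $f$, we consider all simple functions $g$ such that $0\le g\le f$ and define the integral of $f$ to be the supremum of the integrals of such $g$. To extend this to all measurable functions $f$, we write $f=f^+-f^-$ and set
$$\int f\,d\mu=\int f^+\,d\mu-\int f^-\,d\mu,$$
provided at least one of the two integrals on the right is finite.
Monotone Convergence Theorem for Simple Functions
Let $(\Omega,\mathcal{F},\mu)$ be a measure space, let $f\ge 0$ be measurable, and let $\{f_n\}_{n\in\mathbb{N}}$ be a sequence of measurable simple functions with $f_n\ge 0$ such that $f_n\uparrow f$. Then $\int f_n\,d\mu\uparrow\int f\,d\mu$.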
Proof
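Since $f_n\uparrow f$, the integrals $\int f_n\,d\mu$ are non-decreasing and bounded above by $\int f\,d\mu$, so $\int f_n\,d\mu\uparrow c$ for some $c\in[0,\infty]$ with $c\le\int f\,d\mu$. To prove the reverse inequality, let $g$ be any simple measurable function s.t. $0\le g\le f$. Our goal is to show that $\int g\,d\mu\le c$, as taking the supremum over all such $g$ will imply that $c\ge\int f\,d\mu$.
First, we write $g=\sum_{i\in I}a_i\mathbf{1}_{A_i}$ where the $A_i$ form a disjoint partition of $\Omega$, and similarly $f_n=\sum_{j\in J}b_j^{(n)}\mathbf{1}_{B_j^{(n)}}$. Consequently,
$$f_n=\sum_{i\in I}f_n\mathbf{1}_{A_i}=\sum_{i,j}b_j^{(n)}\mathbf{1}_{A_i\cap B_j^{(n)}}$$
and $\int f_n\,d\mu=\sum_{i\in I}\int_{A_i}f_n\,d\mu$. Hence, we want to show for each $i\in I$ that
$$\lim_{n\to\infty}\int_{A_i}f_n\,d\mu\ge\int_{A_i}g\,d\mu=a_i\mu(A_i).$$
First, if $a_i=0$, then the inequality above must hold. If $a_i>0$, we can divide by $a_i$, so without loss of generality we take $a_i=1$ and consider $g=\mathbf{1}_A$ for some set $A$.
For any $\epsilon>0$, let $C_n=\{x\in A:f_n(x)>1-\epsilon\}$. Then $C_n\uparrow A$, which means
$$C_1\subseteq C_2\subseteq\cdots\subseteq A\quad\text{and}\quad\bigcup_{n=1}^{\infty}C_n=A.$$
By countable additivity of $\mu$,
$$\mu(C_{n+1})=\mu(C_1)+\sum_{i=1}^{n}\mu(C_{i+1}\setminus C_i)$$
and $\mu(C_n)\uparrow\mu(A)$. Since $\int_A f_n\,d\mu\ge(1-\epsilon)\mu(C_n)$, we have $\lim_{n\to\infty}\int_A f_n\,d\mu\ge(1-\epsilon)\mu(A)=(1-\epsilon)\int_A g\,d\mu$. Taking $\epsilon$ to zero gives the desired inequality on each $A_i$, and summing over $i\in I$ gives $c\ge\int g\,d\mu$. □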
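To see the theorem at work on a concrete case, the sketch below takes $f(x)=x^2$ on $[0,1]$ with Lebesgue measure and the dyadic simple functions $f_n=2^{-n}\lfloor 2^n f\rfloor$. The choice of $f$ and the helper name are assumptions for illustration. Each level set $\{x: k/2^n\le x^2<(k+1)/2^n\}$ is an interval whose length is a difference of square roots, so the integral of $f_n$ is a finite sum.

```python
import math

def integral_of_dyadic_approx(n):
    """Integral over [0, 1] of f_n = 2^-n * floor(2^n * x^2) w.r.t. Lebesgue measure."""
    N = 2**n
    total = 0.0
    for k in range(N):
        # Lebesgue measure of the level set {x in [0, 1] : k/N <= x^2 < (k+1)/N}.
        measure = math.sqrt((k + 1) / N) - math.sqrt(k / N)
        total += (k / N) * measure
    return total

values = [integral_of_dyadic_approx(n) for n in range(1, 11)]
print(values)
```

The printed values increase with $n$ and approach $\int_0^1 x^2\,dx=1/3$ from below, with the gap bounded by $2^{-n}$ because $0\le f-f_n<2^{-n}$.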
Let $(\Omega,\mathcal{F},\mu)$ be a measure space, and let $f,g:\Omega\to[-\infty,\infty]$ be measurable with $f=g$ a.e. Then $f$ is integrable if and only if $g$ is integrable. If $f$ and $g$ are integrable, then $\int f\,d\mu=\int g\,d\mu$.
Proof
Let $f=g$ on $\Omega\setminus N$ where $\mu(N)=0$. For a measurable function $h:\Omega\to[-\infty,\infty]$, we ask when the integrals $\int_\Omega h\,d\mu$ and $\int_{\Omega\setminus N}h\,d\mu$ coincide.
- True if $h=\mathbf{1}_A$ is an indicator function, because $\mu(A)=\mu(A\setminus N)$.
- Also true when $h$ is a non-negative simple function, by linearity.
- By MCT for Simple Functions, we can pass from non-negative simple functions to any non-negative measurable function $h$.
- Lastly, writing $\int h\,d\mu=\int h^+\,d\mu-\int h^-\,d\mu$ shows that the integrals coincide for any integrable measurable function.
To complete the proof, we note that
$$\int f\,d\mu=\int_{\Omega\setminus N}f\,d\mu=\int_{\Omega\setminus N}g\,d\mu=\int g\,d\mu.$$
This result tells us that we can modify functions on a set of measure zero without changing their integrals. □
Monotone Convergence Theorem#
Monotone Convergence Theorem
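Let $(\Omega,\mathcal{F},\mu)$ be a measure space and let $\{f_i\}_{i=1}^{\infty}$ be measurable functions from $\Omega$ to $[-\infty,\infty]$ such that $f_i\uparrow f$ $\mu$-a.e. and $\int f_1\,d\mu>-\infty$. Then $\int f_i\,d\mu\uparrow\int f\,d\mu$. (This means that we can swap the limit and the integral.)
Proof
First, we need to check that $f$ is measurable. For $c\in\mathbb{R}$, we consider the sets $(c,\infty]$, which generate the Borel σ-field. Since $f_i\uparrow f$, $f^{-1}((c,\infty])=\bigcup_{i=1}^{\infty}f_i^{-1}((c,\infty])$ and each $f_i^{-1}((c,\infty])\in\mathcal{F}$, thus $f$ is measurable (by the first cool fact about measurable functions).
Next, assume that $f_1\ge 0$, and for each $f_i$ take simple functions $g_{ij}$ such that $g_{ij}\uparrow f_i$ as $j\to\infty$. Thus, by MCT for Simple Functions,
$\int g_{ij}\,d\mu\uparrow\int f_i\,d\mu$. Furthermore, let $g_i^*=\max\{g_{1i},\dots,g_{ii}\}$. These $g_i^*$ are simple functions and $g_i^*\uparrow f$. Once again, MCT for Simple Functions implies that $\int g_i^*\,d\mu\uparrow\int f\,d\mu$. But since $g_i^*\le f_i$ by construction, $\int g_i^*\,d\mu\le\int f_i\,d\mu\le\int f\,d\mu$. Thus, $\int f_i\,d\mu\uparrow\int f\,d\mu$. This means we are done for non-negative functions ($f_1\ge 0$).
Now, we assume that $f\le 0$. In this case $f_i\uparrow f$ implies that $-f_i\downarrow -f$. Let $h=-f$ and $h_i=-f_i$, so that $0\le\int h\,d\mu\le\int h_i\,d\mu$. Next, note that $0\le h_1-h_i\uparrow h_1-h$. Applying the above result gives $\int(h_1-h_i)\,d\mu\uparrow\int(h_1-h)\,d\mu$. Since $\int f_1\,d\mu>-\infty$, all of the $h_i$ and $h$ have finite integrals, so we are allowed to subtract to get $\int h_i\,d\mu\downarrow\int h\,d\mu$ and thus $\int f_i\,d\mu\uparrow\int f\,d\mu$.
For a general function $f=f^+-f^-$, we have $f_i^+\uparrow f^+$ and $f_i^-\downarrow f^-$ with $\int f_1^-\,d\mu<\infty$. So by the above special cases, $\int f_i^+\,d\mu\uparrow\int f^+\,d\mu$ and $\int f_i^-\,d\mu\downarrow\int f^-\,d\mu$. Finally, $\int f_i\,d\mu\uparrow\int f\,d\mu$. □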
We only require $f_i\uparrow f$ to hold almost everywhere to establish the result. Hence convergence can fail on a set of measure (probability) zero and we still have convergence of the integrals.
Secondly, we can redo the above proof for $f_i\downarrow f$ with $\int f_1\,d\mu<\infty$ to get a similar result for decreasing sequences.
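Fatou’s Lemma#
Fatou’s Lemma
Let $(\Omega,\mathcal{F},\mu)$ be a measure space and let $\{f_i\}_{i=1}^{\infty}$ be non-negative measurable functions from $\Omega$ to $[0,\infty]$. Then
$$\int\liminf_{i\to\infty}f_i\,d\mu\le\liminf_{i\to\infty}\int f_i\,d\mu.$$
Proof
Recall that $\liminf_{i\to\infty}f_i=\sup_j\inf_{i\ge j}f_i$. Hence, let $g_j=\inf_{i\ge j}f_i$. Then $g_j\uparrow\liminf_{i\to\infty}f_i$ and $g_1\ge 0$ by assumption. So the Monotone Convergence Theorem says that $\int g_j\,d\mu\uparrow\int\liminf_{i\to\infty}f_i\,d\mu$. By construction, $g_j\le f_i$ for any $i\ge j$, thus $\int g_j\,d\mu\le\int f_i\,d\mu$ for any $i\ge j$ and subsequently $\int g_j\,d\mu\le\inf_{i\ge j}\int f_i\,d\mu$. Taking $j\to\infty$ gives
$$\lim_{j\to\infty}\int g_j\,d\mu=\int\liminf_{i\to\infty}f_i\,d\mu\le\liminf_{i\to\infty}\int f_i\,d\mu.$$
□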
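The inequality in Fatou's Lemma can be strict. The sketch below uses a two-point space with the uniform measure and a sequence that alternates between the two indicators; this setup is an assumption chosen purely for illustration. Because the sequence is periodic with period 2, the pointwise $\liminf$ and the $\liminf$ of the integrals both reduce to a minimum over one period.

```python
from fractions import Fraction

# Hypothetical two-point measure space with uniform weights.
omega = [0, 1]
mu = {0: Fraction(1, 2), 1: Fraction(1, 2)}

def f(i, w):
    """f_i is the indicator of {0} for even i and of {1} for odd i."""
    return 1 if w == i % 2 else 0

def integral(h):
    """Integral of a function on omega with respect to mu."""
    return sum(h(w) * mu[w] for w in omega)

def liminf_f(w):
    """Pointwise liminf; the sequence has period 2, so one period suffices."""
    return min(f(0, w), f(1, w))

lhs = integral(liminf_f)                                         # integral of liminf
rhs = min(integral(lambda w, i=i: f(i, w)) for i in range(2))    # liminf of integrals
print(lhs, rhs)
```

Here the left-hand side is $0$ while every $\int f_i\,d\mu$ equals $1/2$, so the two sides of the lemma differ.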
Dominated Convergence Theorem#
Dominated Convergence Theorem
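Let $(\Omega,\mathcal{F},\mu)$ be a measure space and let $\{f_i\}_{i=1}^{\infty}$ and $g$ be measurable and absolutely integrable functions. If $|f_i|\le g$ for all $i$ and $f_i(\omega)\to f(\omega)$ for each $\omega\in\Omega$ (i.e. pointwise convergence), then $f$ is absolutely integrable and $\int f_i\,d\mu\to\int f\,d\mu$.
Proof
Since $|f_i|\le g$ for all $i$, the pointwise limit satisfies $|f|\le g$, so $f$ is absolutely integrable. Let $f_i^\wedge=\inf_{j\ge i}f_j$ and $f_i^\vee=\sup_{j\ge i}f_j$. Then $f_i^\wedge\le f_i\le f_i^\vee$. We have that $f_i^\wedge\uparrow f$ and $\int f_1^\wedge\,d\mu\ge-\int g\,d\mu>-\infty$. So the Monotone Convergence Theorem implies that $\int f_i^\wedge\,d\mu\uparrow\int f\,d\mu$.
Doing the same for $f_i^\vee$, we have that $f_i^\vee\downarrow f$ with $\int f_1^\vee\,d\mu\le\int g\,d\mu<\infty$, and hence $\int f_i^\vee\,d\mu\downarrow\int f\,d\mu$ by the decreasing version of the theorem. Since $\int f_i^\wedge\,d\mu\le\int f_i\,d\mu\le\int f_i^\vee\,d\mu$, we have the desired result that $\int f_i\,d\mu\to\int f\,d\mu$. □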
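A standard way to see why the dominating function matters is to compare two sequences of bumps on $[0,1]$ with Lebesgue measure. The sequences below are assumptions chosen for illustration: $f_i=i\,\mathbf{1}_{(0,1/i)}$, which has no integrable dominating function, and $h_i=\mathbf{1}_{(0,1/i)}$, which is dominated by the constant $1$. Both converge pointwise to $0$, and the integral of a bump is simply its height times its width.

```python
from fractions import Fraction

def integral_of_bump(height, width):
    """Lebesgue integral of height * 1_(0, width) on [0, 1], for 0 < width <= 1."""
    return Fraction(height) * Fraction(width)

indices = [1, 10, 100, 1000]

# f_i = i * 1_(0, 1/i): pointwise limit 0, but no integrable g bounds every f_i.
undominated = [integral_of_bump(i, Fraction(1, i)) for i in indices]

# h_i = 1_(0, 1/i): pointwise limit 0 and |h_i| <= 1, which is integrable on [0, 1].
dominated = [integral_of_bump(1, Fraction(1, i)) for i in indices]

print(undominated, dominated)
```

The integrals of $f_i$ stay at $1$ and do not converge to the integral of the limit, whereas the integrals of $h_i$ shrink to $0$ as the theorem predicts.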
Lebesgue-Stieltjes Measure#
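Let $(X,\mathcal{X})$ and $(Y,\mathcal{Y})$ be two measurable spaces. Let $\psi:X\to Y$ be a measurable function and $\mu$ be a measure on $(X,\mathcal{X})$. Then we can define $\nu=\mu\circ\psi^{-1}$, i.e. $\nu(B)=\mu(\psi^{-1}(B))$ for $B\in\mathcal{Y}$, to be a measure on $(Y,\mathcal{Y})$. This allows us to turn Lebesgue measure into Lebesgue-Stieltjes measures.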
Theorem
Let $F:\mathbb{R}\to\mathbb{R}$ be non-constant, right-continuous, and non-decreasing. Then there exists a unique measure $dF$ on $\mathbb{R}$ such that for all $a,b\in\mathbb{R}$ with $a<b$,
$$dF((a,b])=F(b)-F(a).$$
Proof
Let $F(\infty)=\lim_{x\to\infty}F(x)$ and $F(-\infty)=\lim_{x\to-\infty}F(x)$. We define the open interval $I=(F(-\infty),F(\infty))$ and, for $y\in I$, define $g(y)=\inf\{x\in\mathbb{R}:y\le F(x)\}$. We want to define $dF$ to be $\lambda\circ g^{-1}$ where $\lambda$ is Lebesgue measure on $I$, so we need to show that this makes sense.
We first show that $g$ is left-continuous and non-decreasing and that, for $y\in I$ and $x\in\mathbb{R}$, $g(y)\le x$ if and only if $y\le F(x)$. To show this, fix $y\in I$ and let $J_y=\{x\in\mathbb{R}:y\le F(x)\}$. As $F$ is non-decreasing, if $x\in J_y$ and $x'\ge x$ then $x'\in J_y$. As $F$ is right-continuous, if $x_n\in J_y$ and $x_n\downarrow x$ then $x\in J_y$. Therefore, $J_y=[g(y),\infty)$ and $g(y)\le x$ if and only if $y\le F(x)$. Secondly, for $y\le y'$, we have $J_{y'}\subseteq J_y$ and thus $g(y)\le g(y')$. Finally, if $y_n\uparrow y$, then $J_y=\bigcap_{n=1}^{\infty}J_{y_n}$ and thus $g(y_n)\to g(y)$, which shows that $g$ is left-continuous and non-decreasing.
From the above, $g$ is Borel measurable, and thus defining $dF=\lambda\circ g^{-1}$ gives
$$dF((a,b])=\lambda(\{y:g(y)>a,\ g(y)\le b\})=\lambda((F(a),F(b)])=F(b)-F(a).$$
Furthermore, this measure $dF$ is unique by the same arguments as before for Lebesgue measure. □
In the case that $F:\mathbb{R}\to[0,1]$ with $F(-\infty)=0$ and $F(\infty)=1$, so that $I=(0,1)$, $F$ is a cumulative distribution function, which induces a probability measure on the real line. This allows us to do things like integrate with respect to such measures, i.e. take an expectation.
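The key equivalence in the proof, $g(y)\le x$ if and only if $y\le F(x)$, can be checked exhaustively for a step function. The following sketch assumes a hypothetical discrete distribution with atoms at $0,1,2$ and probabilities $1/5,1/2,3/10$; for such an $F$ the generalized inverse $g$ returns the smallest atom whose cumulative probability reaches $y$.

```python
from bisect import bisect_left
from fractions import Fraction

# Hypothetical discrete distribution: atoms and the values of F at those atoms.
atoms = [0, 1, 2]
cumulative = [Fraction(1, 5), Fraction(7, 10), Fraction(1)]

def F(x):
    """Right-continuous, non-decreasing step CDF."""
    value = Fraction(0)
    for atom, c in zip(atoms, cumulative):
        if atom <= x:
            value = c
    return value

def g(y):
    """Generalized inverse g(y) = inf{x : y <= F(x)} for y in (0, 1)."""
    return atoms[bisect_left(cumulative, y)]

# Grids of y in I = (0, 1) and of x on the real line, in exact arithmetic.
ys = [Fraction(k, 20) for k in range(1, 20)]
xs = [Fraction(k, 2) for k in range(-2, 7)]

equivalence_holds = all((g(y) <= x) == (y <= F(x)) for y in ys for x in xs)
print(equivalence_holds)
```

Applying `g` to uniform draws from $(0,1)$ is the usual inverse-transform way of sampling from $F$, which is one practical reading of $dF=\lambda\circ g^{-1}$.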
Radon Measure#
Definition
Let $(\Omega,\mathcal{B},\mu)$ be a measure space where $\mathcal{B}$ is the Borel σ-field. The measure $\mu$ is said to be a Radon measure if $\mu(K)<\infty$ for all compact $K\in\mathcal{B}$.
-
$dF$ is a Radon measure.
-
Every non-zero Radon measure on $\mathcal{B}(\mathbb{R})$ can be written as $dF=\lambda\circ g^{-1}$ for some $F$.
If $\mu$ is a Radon measure on $\mathbb{R}$, then we can define $F$ as
$$F(x)=\begin{cases}\mu((0,x])&\text{if }x\ge 0\\-\mu((x,0])&\text{if }x<0\end{cases}$$
Thus, $F(b)-F(a)=\mu((a,b])$ for $a<b$ and hence $\mu=dF$ by uniqueness.
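The construction of $F$ from $\mu$ can be tried on a concrete measure. In this sketch, `mu` is a hypothetical Radon measure on $\mathbb{R}$ chosen for illustration: Lebesgue measure plus a unit point mass at every integer, described only through its values on half-open intervals $(a,b]$. The code builds $F$ as above and checks $F(b)-F(a)=\mu((a,b])$ on a few intervals, using exact fractions.

```python
import math
from fractions import Fraction

def mu(a, b):
    """Hypothetical measure of (a, b]: its length plus the number of integers it contains."""
    if b <= a:
        return Fraction(0)
    return Fraction(b - a) + (math.floor(b) - math.floor(a))

def F(x):
    """F(x) = mu((0, x]) for x >= 0 and -mu((x, 0]) for x < 0."""
    return mu(0, x) if x >= 0 else -mu(x, 0)

intervals = [
    (Fraction(-5, 2), Fraction(-1, 2)),
    (Fraction(-3, 2), Fraction(2)),
    (Fraction(1, 3), Fraction(7, 2)),
    (Fraction(-1), Fraction(0)),
    (Fraction(0), Fraction(3)),
]

matches = [F(b) - F(a) == mu(a, b) for a, b in intervals]
print(matches)
```

The point masses show up as unit jumps of $F$ at the integers, and $F$ stays right-continuous because each half-open interval $(a,b]$ includes its right endpoint.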