In the previous lesson, we learned that a measure is a mathematical object that assigns a notion of size to sets.
We saw examples:
- Counting measure
- Length
- Area
- Volume
- Probability
However, we never answered an important question:
How do we rigorously define “length” for arbitrary subsets of the real line?
For ordinary intervals the answer is obvious.
For complicated sets, infinite unions of intervals, fractals, and strange subsets of the real numbers, the answer is not obvious at all.
The solution is one of the greatest achievements in mathematics:
Lebesgue Measure.
This is the measure that eventually becomes the foundation of:
- Modern integration
- Probability theory
- Stochastic processes
- Statistical theory
- Bayesian inference
- Bayesian nonparametrics
The Problem with Ordinary Length
Consider an interval:
$$[0,5]$$
Its length is:
$$5$$
Similarly:
$$[3,8]$$
has length:
$$5$$
No problems so far.
Now consider:
$$[0,1]\cup[3,4]$$
Its length should be:
$$1+1=2$$
Again this seems straightforward.
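The finite case can be checked mechanically. The sketch below is only an illustration, and the helper name `union_length` is our own, not part of the lesson: it sorts a finite list of closed intervals, merges any that overlap, and adds up the lengths of the merged pieces.

```python
def union_length(intervals):
    """Total length of a finite union of closed intervals [a, b]."""
    total = 0.0
    current_start = current_end = None
    for a, b in sorted(intervals):
        if current_end is None or a > current_end:
            # Disjoint from the running block: bank it and start a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = a, b
        else:
            # Overlaps (or touches) the running block: extend it.
            current_end = max(current_end, b)
    if current_end is not None:
        total += current_end - current_start
    return total

print(union_length([(0, 1), (3, 4)]))   # disjoint pieces: 1 + 1
print(union_length([(0, 2), (1, 3)]))   # the overlap is counted once
```

Merging first matters: simply adding lengths is only correct when the pieces are disjoint.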
But what about infinitely many intervals?
For example:
$$\bigcup_{n=1}^{\infty}\left(\frac1{n+1},\frac1n\right)$$
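This union is the interval from 0 to 1 with the points 1/2, 1/3, 1/4, ... removed:
$$\left(0,1\right)\setminus\left\{\tfrac12,\tfrac13,\tfrac14,\ldots\right\}$$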
The total length should be:
$$1$$
How do we rigorously justify this?
We need a more general framework.
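Before building that framework, it helps to see why the answer 1 is plausible. The intervals with endpoints 1/(n+1) and 1/n are pairwise disjoint, and their lengths telescope. The sketch below (the function name `partial_length` is hypothetical) adds the first N lengths exactly, using rational arithmetic.

```python
from fractions import Fraction

def partial_length(n_terms):
    """Sum of the lengths of (1/(n+1), 1/n) for n = 1..n_terms, exactly."""
    return sum(Fraction(1, n) - Fraction(1, n + 1) for n in range(1, n_terms + 1))

for n_terms in (1, 10, 1000):
    s = partial_length(n_terms)
    print(n_terms, s, float(s))
```

Each partial sum equals 1 - 1/(N+1), which approaches 1. Countable additivity is exactly the axiom that lets us pass from these partial sums to the infinite union.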
What We Want From Length
A good notion of length should satisfy:
Non-negativity
$$\mu(A)\ge0$$
Empty set
$$\mu(\emptyset)=0$$
Countable Additivity
$$\mu\left(\bigcup_{i=1}^{\infty}A_i\right)=\sum_{i=1}^{\infty}\mu(A_i)$$
for disjoint sets.
These are exactly the measure axioms.
Therefore:
Length should be a measure.
The Key Idea of Lebesgue
Instead of defining length directly, Lebesgue asked:
Can we define length for intervals first and then extend it to more complicated sets?
This is precisely what Lebesgue measure accomplishes.
For an interval with endpoints a ≤ b:
$$\lambda([a,b])=b-a$$
Examples:
$$\lambda([0,1])=1$$
$$\lambda([2,7])=5$$
$$\lambda([-3,4])=7$$
The symbol:
$$\lambda$$
is commonly used for Lebesgue measure.
The First Requirement
Any reasonable notion of length should agree with ordinary geometry.
Therefore:
$$\lambda([a,b])=b-a$$
must hold.
This is the starting point.
Everything else is built from this simple rule.
Measuring Open Intervals
For open intervals:
$$(a,b)$$
the length is also:
$$\lambda((a,b))=b-a$$
Examples:
$$\lambda((0,1))=1$$
$$\lambda((3,8))=5$$
Notice that endpoints do not matter.
Why Endpoints Have Length Zero
Consider a single point:
$$\{x\}$$
Its length should be:
$$0$$
Since:
$$[0,1]=\{0\}\cup(0,1)\cup\{1\}$$
and the closed and open intervals both have length:
$$1$$
the endpoints contribute nothing.
Therefore:
$$\lambda(\{x\})=0$$
for every real number.
Finite Sets Have Length Zero
Suppose:
$$A=\{1,5,9\}$$
Then:
$$\lambda(A)=0+0+0=0$$
Therefore:
$$\lambda(A)=0$$
Every finite subset of the real line has zero Lebesgue measure.
Countable Sets Have Length Zero
Now consider:
$$\mathbb N=\{1,2,3,\ldots\}$$
Every individual point has measure zero.
Since the set is countable, countable additivity gives:
$$\lambda(\mathbb N)=0$$
Similarly:
$$\mathbb Z$$
has measure zero.
Rational Numbers Have Measure Zero
One of the most surprising results in mathematics is:
$$\lambda(\mathbb Q)=0$$
even though:
$$\mathbb Q$$
contains infinitely many points.
The rational numbers are countably infinite.
Countable unions of measure-zero sets remain measure zero.
Therefore:
$$\lambda(\mathbb Q)=0$$
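Written out, the argument is a single application of countable additivity. The enumeration below is our own notation, assumed for illustration: list the rationals as a sequence without repetition,
$$\mathbb Q=\{q_1,q_2,q_3,\ldots\}$$
so that the singletons are disjoint. Then:
$$\lambda(\mathbb Q)=\lambda\left(\bigcup_{n=1}^{\infty}\{q_n\}\right)=\sum_{n=1}^{\infty}\lambda(\{q_n\})=\sum_{n=1}^{\infty}0=0$$
The same computation works for any countable set.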
Irrational Numbers Have Full Measure
Inside:
$$[0,1]$$
we know:
$$\lambda([0,1])=1$$
Since:
$$\lambda(\mathbb Q\cap[0,1])=0$$
the irrationals occupy all the length:
$$\lambda((\mathbb R\setminus\mathbb Q)\cap[0,1])=1$$
This is astonishing.
Almost every real number is irrational.
“Almost Everywhere”
This observation gives rise to one of the most important concepts in modern mathematics.
A property holds almost everywhere if it fails only on a set of measure zero.
For example, the statement:
A real number x is irrational
holds almost everywhere on the real line.
The exceptional rational numbers occupy zero length.
Translation Invariance
Another key property of Lebesgue measure is:
Moving a set does not change its length.
If:
$$A+c=\{x+c:x\in A\}$$
then:
$$\lambda(A+c)=\lambda(A)$$
Example:
$$[0,1]$$
and
$$[100,101]$$
both have length:
$$1$$
Location does not matter.
Only size matters.
Scaling
If we stretch a set by a factor:
$$c$$
then its measure scales by:
$$|c|$$
Formally:
$$\lambda(cA)=|c|\lambda(A)$$
Example:
Stretch:
$$[0,1]$$
to:
$$[0,5]$$
Then:
$$\lambda([0,5])=5$$
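Translation invariance and scaling are both easy to check on simple sets. The sketch below assumes the set is a finite list of pairwise disjoint intervals; the helper names `length`, `translate`, and `scale` are hypothetical and exist only for this illustration.

```python
def length(intervals):
    """Total length of a finite list of pairwise disjoint intervals (a, b)."""
    return sum(b - a for a, b in intervals)

def translate(intervals, c):
    """The set A + c: shift every interval by c."""
    return [(a + c, b + c) for a, b in intervals]

def scale(intervals, c):
    """The set cA: multiply every point by c (endpoints swap when c < 0)."""
    return [tuple(sorted((c * a, c * b))) for a, b in intervals]

A = [(0, 1), (3, 4)]
print(length(A), length(translate(A, 100)))       # shifting leaves the length unchanged
print(length(scale(A, 5)), length(scale(A, -2)))  # scaling multiplies the length by |c|
```

The absolute value in the scaling rule is what handles negative factors: reflecting a set does not change its length.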
The Cantor Set Revisited
In Lesson 1 we briefly mentioned the Cantor set.
It contains:
- infinitely many points
- uncountably many points
Yet:
$$\lambda(C)=0$$
This is one of the first examples showing that:
Number of points and length are different concepts.
A set may contain infinitely many points and still occupy no length.
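To see where the zero comes from, recall the standard middle-thirds construction, assumed here: start with the interval from 0 to 1 and repeatedly remove the open middle third of every remaining piece. The sketch below (the function name `cantor_step` is hypothetical) tracks the pieces exactly and prints the total length left after each step.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Remove the open middle third from each closed interval."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

intervals = [(Fraction(0), Fraction(1))]
for step in range(1, 6):
    intervals = cantor_step(intervals)
    total = sum(b - a for a, b in intervals)
    print(step, len(intervals), total)
```

After n steps there are 2^n pieces of total length (2/3)^n. The Cantor set lies inside every stage, so its measure is at most (2/3)^n for every n, and therefore it must be zero.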
Outer Measure: Lebesgue’s Strategy
How do we measure extremely complicated sets?
Lebesgue’s idea was ingenious.
Cover the set with intervals:
$$I_1,I_2,I_3,\ldots$$
Compute:
$$\sum_{i=1}^{\infty}\text{Length}(I_i)$$
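Then take the infimum (greatest lower bound) of these sums over all countable coverings of the set by intervals. A smallest value need not actually be attained.
This leads to the concept of outer measure:
$$\lambda^*(A)=\inf\left\{\sum_{i=1}^{\infty}\text{Length}(I_i):A\subseteq\bigcup_{i=1}^{\infty}I_i\right\}$$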
Outer measure is the first step toward constructing Lebesgue measure rigorously.
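The covering idea can be tried on a small example. The sketch below is an illustration under our own assumptions: the helper name `epsilon_cover`, the choice of points, and the value of epsilon are not from the lesson. It covers the n-th point of a list with an open interval of length epsilon / 2^n, so the total length of the cover stays below epsilon no matter how many points are listed.

```python
from fractions import Fraction

def epsilon_cover(points, eps):
    """Cover the n-th point with an open interval of length eps / 2**n."""
    cover = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 1)
        cover.append((x - half, x + half))
    return cover

# Some rationals in [0, 1]; repeats such as 0/1 and 0/2 are harmless here.
points = [Fraction(p, q) for q in range(1, 6) for p in range(0, q + 1)]
eps = Fraction(1, 100)
cover = epsilon_cover(points, eps)
total = sum(b - a for a, b in cover)
print(len(points), float(total))
```

Because epsilon can be chosen as small as we like, the same construction applied to a full enumeration of a countable set shows that its outer measure is zero.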
Carathéodory’s Criterion
Not every subset of:
$$\mathbb R$$
receives a Lebesgue measure.
A set A is declared measurable if it satisfies:
$$\lambda^*(E)=\lambda^*(E\cap A)+\lambda^*(E\setminus A)$$
for every set:
$$E$$
This criterion determines which sets are Lebesgue measurable.
The resulting collection forms a sigma algebra.
Why We Care
Lebesgue measure gives us a rigorous notion of length for extremely complicated sets.
Once we have length, we can build:
- integration
- expectation
- probability
All modern probability theory rests on this construction.
Connection to Probability
Consider:
$$\Omega=[0,1]$$
Define:
$$P(A)=\lambda(A)$$
for every measurable set.
Then:
$$P([0,0.3])=0.3$$
$$P([0,0.8])=0.8$$
Probability becomes normalized length.
This is one of the deepest insights in measure-theoretic probability.
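A quick simulation makes this concrete. The sketch below assumes Python's standard `random` module as the source of uniform draws on the unit interval; the function name `empirical_probability`, the sample size, and the seed are hypothetical choices. It estimates the probability of an interval by the fraction of draws that land in it.

```python
import random

def empirical_probability(a, b, n_samples=100_000, seed=0):
    """Fraction of uniform draws from [0, 1) that land in [a, b]."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples) if a <= rng.random() <= b)
    return hits / n_samples

print(empirical_probability(0.0, 0.3))   # should be close to 0.3
print(empirical_probability(0.0, 0.8))   # should be close to 0.8
```

The estimates only approximate the interval lengths, and the agreement improves as the number of samples grows.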
Why Bayesian Nonparametrics Needs This
Later we will study:
$$G\sim DP(\alpha,G_0)$$
where:
$$G$$
is itself a probability measure.
To understand probability measures, we must first understand ordinary measures.
Lebesgue measure is the canonical example.
Everything in Bayesian nonparametrics ultimately builds upon this idea.
The Hierarchy So Far
We have now constructed:
Sets
$$\longrightarrow$$
Sigma Algebras
$$\longrightarrow$$
Measurable Spaces
$$\longrightarrow$$
Measures
$$\longrightarrow$$
Lebesgue Measure
The next step is to understand functions that interact correctly with these structures.
What You Should Know After This Lesson
You should now understand:
- Lebesgue measure generalizes ordinary length.
- Intervals satisfy:
$$\lambda([a,b])=b-a$$
- Single points have measure zero.
- Finite sets have measure zero.
- Countable sets have measure zero.
- Rational numbers have measure zero.
- A property can hold almost everywhere.
- Lebesgue measure is translation invariant.
- Lebesgue measure forms the foundation of probability theory.
- Modern Bayesian statistics ultimately relies on Lebesgue measure.
Preview of Lesson 6
Next we study:
Measurable Functions
This lesson introduces the object that eventually becomes the modern definition of a random variable.
We will discover that a random variable is not merely a number that varies—it is a special type of function that preserves measurable structure between measurable spaces.
