Measure Theory Lesson 16: The Radon–Nikodym Theorem

Introduction

The Jordan Decomposition Theorem answered an important question:

Can every signed measure be decomposed into positive and negative parts?

The answer was yes.

Now we ask a much deeper question:

Suppose we have two measures. Can one measure be described in terms of the other?

For example, suppose:

$$\lambda$$

is Lebesgue measure on:

$$\mathbb{R}$$

and define a new measure:

$$\nu(A)=\int_A 3x^2\,d\lambda(x)$$

for measurable sets:

$$A$$

Notice that:

$$\nu$$

is completely determined by:

$$\lambda$$

and the function:

$$3x^2$$

The natural question is:

Can every measure be represented this way?

The Radon–Nikodym Theorem gives a precise answer.

It is one of the central theorems of measure theory, and its reach extends far beyond it.

Probability theory, Bayesian statistics, stochastic processes, information theory, functional analysis, and noncommutative geometry all depend on it.


Motivation

Suppose:

$$f(x)\ge0$$

is integrable with respect to a measure μ.

Define:

$$\nu(A)=\int_A f\,d\mu$$

for every measurable set:

$$A$$

Then:

$$\nu$$

is a measure.

We say:

$$f$$

acts as a density that converts:

$$\mu$$

into:

$$\nu$$
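On a finite set the integral is just a weighted sum, so the construction can be checked by hand. The sketch below is only an illustration: the four-point space, the weights `mu_weights`, and the values of `density` are made-up assumptions, not part of the lesson.

```python
from fractions import Fraction

# Hypothetical finite space {0, 1, 2, 3} with base weights mu({x}).
mu_weights = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8), 3: Fraction(1, 8)}
# Hypothetical nonnegative density f.
density = {0: Fraction(2), 1: Fraction(0), 2: Fraction(4), 3: Fraction(1)}

def mu(subset):
    """mu(A): add up the base weights of the points in A."""
    return sum((mu_weights[x] for x in subset), Fraction(0))

def nu(subset):
    """nu(A) = integral of f over A with respect to mu (a finite sum here)."""
    return sum((density[x] * mu_weights[x] for x in subset), Fraction(0))

a, b = {0, 1}, {2, 3}
# Additivity on disjoint sets: nu(a | b) equals nu(a) + nu(b).
print(nu(a), nu(b), nu(a | b))
```

The empty set receives mass zero and disjoint sets add, which is what being a measure amounts to on a finite space.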

The Radon–Nikodym Theorem asks:

Under what conditions does such a density exist?


Absolute Continuity

Before stating the theorem, we need a new concept.

Let:

$$\mu$$

and

$$\nu$$

be measures on the same measurable space.

We say:

$$\nu$$

is absolutely continuous with respect to:

$$\mu$$

if:

$$\mu(A)=0 \implies \nu(A)=0$$

for every measurable set:

$$A$$

We write:

$$\nu\ll\mu$$

and read:

ν is absolutely continuous with respect to μ.


Intuition

Absolute continuity means:

ν cannot see anything that μ cannot see.

Whenever:

$$\mu$$

declares a set negligible,

$$\nu$$

must also declare it negligible.
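On a finite set where every subset is measurable, a set is μ-null exactly when each of its points has μ-weight zero, so the definition reduces to a pointwise check. Here is a minimal sketch under that assumption. The function name `is_absolutely_continuous` and the example weights are hypothetical.

```python
def is_absolutely_continuous(nu_weights, mu_weights):
    """Return True when every mu-null point is also nu-null.

    Assumes both dictionaries describe measures on the same finite set,
    keyed by point, with every subset measurable.
    """
    return all(nu_weights[x] == 0 for x in mu_weights if mu_weights[x] == 0)

mu_weights = {"a": 0.5, "b": 0.5, "c": 0.0}
spread_out = {"a": 1.0, "b": 0.0, "c": 0.0}  # no mass on the mu-null point "c"
point_mass = {"a": 0.0, "b": 0.0, "c": 1.0}  # all mass on the mu-null point "c"

print(is_absolutely_continuous(spread_out, mu_weights))
print(is_absolutely_continuous(point_mass, mu_weights))
```

The second measure sees the point that μ ignores, so it fails the test.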


Example

Let:

$$\mu=\lambda$$

be Lebesgue measure.

Define:

$$\nu(A)=\int_A 5\,d\lambda$$

Then:

$$\nu(A)=5\lambda(A)$$

If:

$$\lambda(A)=0$$

then:

$$\nu(A)=0$$

Therefore:

$$\nu\ll\lambda$$


A Counterexample

Let:

$$\delta_0$$

be the Dirac measure at:

$$0$$

Then:

$$\delta_0(\{0\})=1$$

but:

$$\lambda(\{0\})=0$$

Therefore:

$$\delta_0\not\ll\lambda$$

The Dirac measure assigns mass to a set that Lebesgue measure considers negligible.

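Thus no density with respect to Lebesgue measure can exist. If a density f did exist, integrating it over the λ-null set {0} would give:

$$\delta_0(\{0\})=\int_{\{0\}} f\,d\lambda=0$$

which contradicts the value 1 computed above.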


Statement of the Radon–Nikodym Theorem

Let:

$$\nu$$

and:

$$\mu$$

be σ-finite measures on the same measurable space X.

Assume:

$$\nu\ll\mu$$

Then there exists a measurable function:

$$f:X\to[0,\infty)$$

such that:

$$\nu(A)=\int_A f\,d\mu$$

for every measurable set:

$$A$$

Furthermore, the function:

$$f$$

is unique up to sets of measure zero.
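In the finite case the theorem can be made completely explicit: the density at a point is the ratio of the two point masses. The sketch below assumes a finite set with every subset measurable. The helper name `radon_nikodym_derivative` and the sample weights are illustrative assumptions.

```python
from fractions import Fraction

def radon_nikodym_derivative(nu_weights, mu_weights):
    """Pointwise ratio nu({x}) / mu({x}) on a finite set.

    On mu-null points the value is set to 0; any other value would do,
    because such points never contribute to an integral against mu.
    """
    derivative = {}
    for x, m in mu_weights.items():
        n = nu_weights[x]
        if m == 0:
            if n != 0:
                raise ValueError("nu is not absolutely continuous with respect to mu")
            derivative[x] = Fraction(0)
        else:
            derivative[x] = Fraction(n) / Fraction(m)
    return derivative

mu_weights = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}
nu_weights = {1: Fraction(1, 4), 2: Fraction(1, 2), 3: Fraction(1, 4)}

f = radon_nikodym_derivative(nu_weights, mu_weights)

# Integrating f against mu over {1, 3} recovers nu({1, 3}).
subset = {1, 3}
integral = sum(f[x] * mu_weights[x] for x in subset)
print(f, integral)
```

The `ValueError` branch is the finite version of the hypothesis ν ≪ μ: without it, no ratio can reproduce the mass ν places on a μ-null point.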


The Radon–Nikodym Derivative

The function:

$$f$$

is called the Radon–Nikodym derivative.

It is written:

$$\frac{d\nu}{d\mu}$$

Thus:

$$\nu(A)=\int_A \frac{d\nu}{d\mu}\,d\mu$$

This notation resembles ordinary calculus for a reason.

The theorem is a vast generalization of ordinary differentiation.


Why the Notation Makes Sense

Recall from calculus:

If:

$$F(x)=\int_a^x f(t)\,dt$$

then:

$$F'(x)=f(x)$$

The derivative measures how one quantity changes relative to another.

The Radon–Nikodym derivative plays the same role.

It tells us how:

$$\nu$$

changes relative to:

$$\mu$$

at infinitesimal scales.
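One way to see the "infinitesimal scales" claim is to take the measure with density 3x² from the introduction and compare the ν-mass of a small interval around a point with its Lebesgue length. The sketch below uses the closed form ν([a, b]) = b³ − a³. The function names and the choice x = 2 are illustrative assumptions.

```python
def nu_interval(a, b):
    """nu([a, b]) for the density 3x^2 against Lebesgue measure: b^3 - a^3."""
    return b**3 - a**3

def mass_ratio(x, h):
    """nu([x - h, x + h]) divided by the Lebesgue length 2h of the interval."""
    return nu_interval(x - h, x + h) / (2 * h)

x = 2.0
for h in (1.0, 0.1, 0.01, 0.001):
    # Algebraically the ratio equals 3x^2 + h^2, so it tends to 3x^2 = 12.
    print(h, mass_ratio(x, h))
```

As the interval shrinks, the ratio approaches the value of the density at the point, just as a difference quotient approaches a derivative.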


Example 1

Let:

$$\mu=\lambda$$

and define:

$$\nu(A)=\int_A 3x^2\,d\lambda$$

Then:

$$\frac{d\nu}{d\lambda}=3x^2$$

Indeed:

$$\nu(A)=\int_A \frac{d\nu}{d\lambda},d\lambda$$


Example 2

Suppose:

$$\nu(A)=7\lambda(A)$$

Then:

$$\frac{d\nu}{d\lambda}=7$$

The density is constant.

Every set simply receives seven times as much mass.


Example 3

Probability Density Functions

Let:

$$P$$

be a probability measure on:

$$\mathbb{R}$$

with density:

$$p(x)$$

Then:

$$P(A)=\int_A p(x)\,dx$$

Using measure-theoretic notation:

$$\frac{dP}{d\lambda}=p(x)$$

Thus every probability density function is a Radon–Nikodym derivative.
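As a concrete check, take the standard normal distribution as an assumed example. Integrating its density over an interval should reproduce the probability of that interval. The sketch below compares a midpoint-rule approximation with the closed form given by the error function. The function names and the step count are illustrative choices.

```python
import math

def normal_pdf(x):
    """Density of the standard normal distribution with respect to Lebesgue measure."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def probability(a, b, steps=10_000):
    """Midpoint-rule approximation of P([a, b]) = integral of the density over [a, b]."""
    width = (b - a) / steps
    return sum(normal_pdf(a + (i + 0.5) * width) for i in range(steps)) * width

approx = probability(-1.0, 1.0)
exact = math.erf(1 / math.sqrt(2))  # closed form for P([-1, 1])
print(approx, exact)
```

The two numbers agree to many decimal places, which is the statement P(A) = ∫_A p dλ for the set A = [−1, 1].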


Why This Is So Important

Many students learn probability densities first and only later discover that they are Radon–Nikodym derivatives.

In reality:

Probability density functions are special cases of the theorem.

The theorem explains exactly when densities exist: a probability measure P has a density with respect to λ precisely when P ≪ λ.


Uniqueness

Suppose:

$$f$$

and:

$$g$$

both satisfy:

$$\nu(A)=\int_A f\,d\mu=\int_A g\,d\mu$$

for every measurable set:

$$A$$

Then:

$$f=g$$

almost everywhere.

Thus the derivative is unique up to measure-zero sets.

This is exactly the level of uniqueness measure theory typically provides.
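The phrase "up to measure-zero sets" can be seen in a small finite example. In the sketch below, two densities disagree only at a point of μ-weight zero, yet they generate the same measure on every subset. The space, weights, and densities are made-up assumptions.

```python
from fractions import Fraction
from itertools import chain, combinations

# Hypothetical three-point space in which the point "c" is mu-null.
mu_weights = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
f = {"a": Fraction(1), "b": Fraction(3), "c": Fraction(0)}
g = {"a": Fraction(1), "b": Fraction(3), "c": Fraction(100)}  # differs from f only at "c"

def integrate(density, subset):
    """Integral of the density over the subset with respect to mu (a finite sum)."""
    return sum((density[x] * mu_weights[x] for x in subset), Fraction(0))

def all_subsets(points):
    """Every subset of the given points, as tuples."""
    points = list(points)
    return chain.from_iterable(combinations(points, r) for r in range(len(points) + 1))

same_measure = all(integrate(f, s) == integrate(g, s) for s in all_subsets(mu_weights))
disagreement = {x for x in mu_weights if f[x] != g[x]}
print(same_measure, disagreement, integrate(f, disagreement))
```

Both functions are valid versions of the derivative. The set on which they differ has μ-measure zero, so no integral can tell them apart.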


A Geometric Interpretation

Imagine:

$$\mu$$

as measuring volume.

The Radon–Nikodym derivative:

$$\frac{d\nu}{d\mu}$$

describes how density varies from point to point.

Large values indicate regions where:

$$\nu$$

places more mass than:

$$\mu$$

Small values indicate regions where:

$$\nu$$

places less mass.


Connection to Conditional Probability

One of the deepest applications occurs in probability.

Conditional expectations and conditional probabilities can often be constructed using Radon–Nikodym derivatives.

Much of modern probability theory rests upon this theorem.
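A small finite case shows the idea. Given a random variable X, define ν(A) = ∫_A X dP. When the information available is a partition of the sample space, the conditional expectation on each block B is the ratio ν(B)/P(B), which is the Radon–Nikodym derivative of ν with respect to P when both are restricted to sets built from the blocks. The sketch below assumes a fair die and the odd/even partition, and every block is assumed to have positive probability.

```python
from fractions import Fraction

# Hypothetical setup: a fair six-sided die, with X the face value.
prob = {w: Fraction(1, 6) for w in range(1, 7)}
X = {w: Fraction(w) for w in range(1, 7)}
blocks = [{1, 3, 5}, {2, 4, 6}]  # the information: is the roll odd or even?

def conditional_expectation(X, prob, blocks):
    """E[X | partition]: constant on each block, equal to nu(B) / P(B)."""
    result = {}
    for block in blocks:
        p_block = sum(prob[w] for w in block)           # P(B), assumed positive
        nu_block = sum(X[w] * prob[w] for w in block)   # nu(B) = integral of X over B
        for w in block:
            result[w] = nu_block / p_block
    return result

cond = conditional_expectation(X, prob, blocks)
print(cond[1], cond[2])
```

Odd rolls average 3 and even rolls average 4, and integrating the result over either block returns the same mass as integrating X itself.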


Connection to Bayesian Statistics

Bayesian inference constantly updates one measure into another.

Posterior distributions are often described through densities.

The mathematical justification ultimately comes from Radon–Nikodym theory.

Many Bayesian formulas are simply applications of:

$$\frac{d\nu}{d\mu}$$

under different names.
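For instance, Bayes' rule says that the posterior has density likelihood/evidence with respect to the prior. A minimal sketch with two hypotheses follows. The coin names, the prior, and the likelihood values are made-up assumptions.

```python
from fractions import Fraction

# Hypothetical prior over two coins, and the likelihood of observing heads.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood = {"fair": Fraction(1, 2), "biased": Fraction(9, 10)}

# Evidence: the total probability of the observation under the prior.
evidence = sum(likelihood[h] * prior[h] for h in prior)

# Density of the posterior with respect to the prior.
derivative = {h: likelihood[h] / evidence for h in prior}

# Multiplying the prior by that density gives the posterior.
posterior = {h: derivative[h] * prior[h] for h in prior}
print(derivative, posterior)
```

Here the prior plays the role of μ and the posterior the role of ν. The posterior is absolutely continuous with respect to the prior because a hypothesis with prior mass zero can never gain posterior mass.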


Sketch of the Proof

The actual proof is one of the masterpieces of measure theory.

Very roughly:

  1. Consider all measurable functions f ≥ 0 whose integrals are bounded by ν, meaning ∫_A f dμ ≤ ν(A) for every measurable set A.
  2. Use completeness properties of the Lebesgue integral.
  3. Construct the largest possible candidate density.
  4. Show it reproduces the measure exactly.
  5. Prove uniqueness using measure-zero arguments.

The proof combines:

  • measure theory
  • integration theory
  • completeness
  • signed measure techniques

developed throughout the previous lessons.


Why Connes Cares

The Radon–Nikodym Theorem introduces a profound idea:

Measures can be compared through derivatives.

In noncommutative geometry, ordinary measures are replaced by states and traces on operator algebras.

One of the great achievements of twentieth-century mathematics was finding noncommutative analogues of Radon–Nikodym theory.

Much of modular theory, which later becomes central in Connes’ work on von Neumann algebras, can be viewed as a vast generalization of the Radon–Nikodym idea.

In many ways:

$$\frac{d\nu}{d\mu}$$

is the classical ancestor of some of the most sophisticated constructions in noncommutative geometry.


Key Concepts Learned

By the end of this lesson you should understand:

  • Absolute continuity means:

$$\nu\ll\mu$$

if:

$$\mu(A)=0\implies\nu(A)=0$$

  • The Radon–Nikodym Theorem states that absolutely continuous measures possess densities.
  • The density is written:

$$\frac{d\nu}{d\mu}$$

  • Every probability density function is a Radon–Nikodym derivative.
  • The derivative is unique almost everywhere.
  • The theorem generalizes ordinary differentiation.
  • Radon–Nikodym theory underlies modern probability, statistics, and Bayesian inference.

Looking Ahead

In the next lesson:

Lesson 17: Absolute Continuity of Measures

we will study absolute continuity in depth, explore its consequences, and see why it is the exact condition needed for Radon–Nikodym derivatives to exist. We will also begin building toward the Lebesgue Decomposition Theorem, one of the deepest structural results in measure theory.
