Laplace approximation is a way to approximate a difficult posterior distribution using a Gaussian distribution near its peak.
The posterior is written as:
Taking logs:
So:
Near the posterior mode (\hat{\theta}), we use a second-order Taylor approximation:
At the mode:
That is why there is no first-derivative term.
Since (\hat{\theta}) is a maximum:
So the posterior becomes approximately:
Ignoring constants:
Let:
Then:
Therefore:
Example 1: Bernoulli Data With Logistic Parameter
Model
Suppose we observe binary outcomes.
For example, in healthcare:
could mean a patient responded to treatment.
could mean the patient did not respond.
The model is:
Instead of estimating (p) directly, we use the logit parameter:
So:
Assume the prior:
The likelihood is:
The log-likelihood is:
The log-prior is:
So the log-posterior is:
Laplace approximation gives:
where:
Python Code With Plot
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
from scipy.stats import norm
y = np.array([1,1,1,0,1,0,1,1,0,1])
def log_posterior(beta):
beta = beta[0] if isinstance(beta, np.ndarray) else beta
p = 1 / (1 + np.exp(-beta))
loglik = np.sum(
y * np.log(p) +
(1-y) * np.log(1-p)
)
logprior = norm.logpdf(beta, loc=0, scale=np.sqrt(10))
return loglik + logprior
result = minimize(lambda b: -log_posterior(b), x0=np.array([0.0]))
beta_hat = result.x[0]
eps = 1e-5
hessian = (
log_posterior(beta_hat + eps)
- 2 * log_posterior(beta_hat)
+ log_posterior(beta_hat - eps)
) / eps**2
variance = -1 / hessian
std = np.sqrt(variance)
print("Posterior mode beta_hat:", beta_hat)
print("Laplace variance:", variance)
print("Laplace standard deviation:", std)
grid = np.linspace(beta_hat - 4*std, beta_hat + 4*std, 500)
log_post_values = np.array([log_posterior(b) for b in grid])
posterior_values = np.exp(log_post_values - np.max(log_post_values))
posterior_values = posterior_values / np.trapz(posterior_values, grid)
laplace_values = norm.pdf(grid, loc=beta_hat, scale=std)
plt.figure(figsize=(8,5))
plt.plot(grid, posterior_values, label="Posterior")
plt.plot(grid, laplace_values, "--", label="Laplace approximation")
plt.axvline(beta_hat, linestyle=":", label="Posterior mode")
plt.xlabel("beta")
plt.ylabel("Density")
plt.title("Laplace Approximation for Bernoulli Logistic Model")
plt.legend()
plt.show()What the Plot Shows
The solid curve is the normalized posterior.
The dashed curve is the Gaussian approximation:
If the two curves are close, the Laplace approximation is good.
Example 2: Poisson Data With Log Rate
Model
Suppose we observe count data.
For example:
could be the number of emergency department arrivals per hour.
The model is:
Since:
we write:
So:
Assume the prior:
The likelihood is:
Since:
the log-likelihood is:
The log-posterior is:
Laplace approximation gives:
where:
Then:
(\lambda \approx \exp(\hat{\theta}))
Python Code With Plot
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
from scipy.stats import norm, poisson
y = np.array([3,5,4,7,6,5])
def log_posterior(theta):
theta = theta[0] if isinstance(theta, np.ndarray) else theta
lam = np.exp(theta)
loglik = np.sum(poisson.logpmf(y, mu=lam))
logprior = norm.logpdf(theta, loc=0, scale=np.sqrt(10))
return loglik + logprior
result = minimize(lambda t: -log_posterior(t), x0=np.array([1.0]))
theta_hat = result.x[0]
eps = 1e-5
hessian = (
log_posterior(theta_hat + eps)
- 2 * log_posterior(theta_hat)
+ log_posterior(theta_hat - eps)
) / eps**2
variance = -1 / hessian
std = np.sqrt(variance)
lambda_hat = np.exp(theta_hat)
print("Posterior mode theta_hat:", theta_hat)
print("Approximate lambda:", lambda_hat)
print("Laplace variance:", variance)
print("Laplace standard deviation:", std)
grid = np.linspace(theta_hat - 4*std, theta_hat + 4*std, 500)
log_post_values = np.array([log_posterior(t) for t in grid])
posterior_values = np.exp(log_post_values - np.max(log_post_values))
posterior_values = posterior_values / np.trapz(posterior_values, grid)
laplace_values = norm.pdf(grid, loc=theta_hat, scale=std)
plt.figure(figsize=(8,5))
plt.plot(grid, posterior_values, label="Posterior")
plt.plot(grid, laplace_values, "--", label="Laplace approximation")
plt.axvline(theta_hat, linestyle=":", label="Posterior mode")
plt.xlabel("theta = log(lambda)")
plt.ylabel("Density")
plt.title("Laplace Approximation for Poisson Model")
plt.legend()
plt.show()What the Plot Shows
The posterior is for:
not directly for (\lambda).
The Gaussian approximation is:
Example 3: Binomial Data With Logistic Parameter
Model
Suppose a treatment is given to n=20 patients.
Out of them: y=15 respond.
The model is:
Use:
Prior:
The likelihood is:
The log-likelihood is:
The log-posterior is:
Laplace approximation gives:
where:
Python Code With Plot
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
from scipy.stats import norm, binom
y = 15
n = 20
def log_posterior(theta):
theta = theta[0] if isinstance(theta, np.ndarray) else theta
p = 1 / (1 + np.exp(-theta))
loglik = binom.logpmf(y, n=n, p=p)
logprior = norm.logpdf(theta, loc=0, scale=np.sqrt(10))
return loglik + logprior
result = minimize(lambda t: -log_posterior(t), x0=np.array([0.0]))
theta_hat = result.x[0]
eps = 1e-5
hessian = (
log_posterior(theta_hat + eps)
- 2 * log_posterior(theta_hat)
+ log_posterior(theta_hat - eps)
) / eps**2
variance = -1 / hessian
std = np.sqrt(variance)
p_hat = 1 / (1 + np.exp(-theta_hat))
print("Posterior mode theta_hat:", theta_hat)
print("Approximate p:", p_hat)
print("Laplace variance:", variance)
print("Laplace standard deviation:", std)
grid = np.linspace(theta_hat - 4*std, theta_hat + 4*std, 500)
log_post_values = np.array([log_posterior(t) for t in grid])
posterior_values = np.exp(log_post_values - np.max(log_post_values))
posterior_values = posterior_values / np.trapz(posterior_values, grid)
laplace_values = norm.pdf(grid, loc=theta_hat, scale=std)
plt.figure(figsize=(8,5))
plt.plot(grid, posterior_values, label="Posterior")
plt.plot(grid, laplace_values, "--", label="Laplace approximation")
plt.axvline(theta_hat, linestyle=":", label="Posterior mode")
plt.xlabel("theta = logit(p)")
plt.ylabel("Density")
plt.title("Laplace Approximation for Binomial Model")
plt.legend()
plt.show()
What the Plot Shows
The approximation is Gaussian on the logit scale:
So:
Then convert back:
Example 4: Exponential Data With Log Rate
Model
Suppose we observe waiting times.
For example:
could be the time until hospital discharge.
The model is:
The exponential density is:
Since:
we use:
Prior:
The likelihood is:
The log-likelihood is:
The log-posterior is:
Laplace approximation gives:
where:
Python Code With Plot
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
from scipy.stats import norm, expon
y = np.array([2.1, 1.7, 3.4, 2.8, 1.9, 2.5, 3.0])
def log_posterior(theta):
theta = theta[0] if isinstance(theta, np.ndarray) else theta
lam = np.exp(theta)
loglik = np.sum(expon.logpdf(y, scale=1/lam))
logprior = norm.logpdf(theta, loc=0, scale=np.sqrt(10))
return loglik + logprior
result = minimize(lambda t: -log_posterior(t), x0=np.array([0.0]))
theta_hat = result.x[0]
eps = 1e-5
hessian = (
log_posterior(theta_hat + eps)
- 2 * log_posterior(theta_hat)
+ log_posterior(theta_hat - eps)
) / eps**2
variance = -1 / hessian
std = np.sqrt(variance)
lambda_hat = np.exp(theta_hat)
print("Posterior mode theta_hat:", theta_hat)
print("Approximate lambda:", lambda_hat)
print("Laplace variance:", variance)
print("Laplace standard deviation:", std)
grid = np.linspace(theta_hat - 4*std, theta_hat + 4*std, 500)
log_post_values = np.array([log_posterior(t) for t in grid])
posterior_values = np.exp(log_post_values - np.max(log_post_values))
posterior_values = posterior_values / np.trapz(posterior_values, grid)
laplace_values = norm.pdf(grid, loc=theta_hat, scale=std)
plt.figure(figsize=(8,5))
plt.plot(grid, posterior_values, label="Posterior")
plt.plot(grid, laplace_values, "--", label="Laplace approximation")
plt.axvline(theta_hat, linestyle=":", label="Posterior mode")
plt.xlabel("theta = log(lambda)")
plt.ylabel("Density")
plt.title("Laplace Approximation for Exponential Model")
plt.legend()
plt.show()
What the Plot Shows
The solid line is the posterior for:
The dashed line is:
Example 5: Normal Data With Unknown Mean
Model
Suppose we observe blood pressure measurements.
Let:
Assume () is known.
Prior:
The likelihood is:
The log-likelihood is:
The log-posterior is:
Laplace approximation gives:
where:
In this case, because the likelihood and prior are both Gaussian, the posterior is already Gaussian.
So Laplace approximation is essentially exact.
Python Code With Plot
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
from scipy.stats import norm
y = np.array([128, 132, 125, 130, 135, 127, 131])
sigma = 5
def log_posterior(mu):
mu = mu[0] if isinstance(mu, np.ndarray) else mu
loglik = np.sum(norm.logpdf(y, loc=mu, scale=sigma))
logprior = norm.logpdf(mu, loc=0, scale=10)
return loglik + logprior
result = minimize(lambda m: -log_posterior(m), x0=np.array([120.0]))
mu_hat = result.x[0]
eps = 1e-5
hessian = (
log_posterior(mu_hat + eps)
- 2 * log_posterior(mu_hat)
+ log_posterior(mu_hat - eps)
) / eps**2
variance = -1 / hessian
std = np.sqrt(variance)
print("Posterior mode mu_hat:", mu_hat)
print("Laplace variance:", variance)
print("Laplace standard deviation:", std)
grid = np.linspace(mu_hat - 4*std, mu_hat + 4*std, 500)
log_post_values = np.array([log_posterior(m) for m in grid])
posterior_values = np.exp(log_post_values - np.max(log_post_values))
posterior_values = posterior_values / np.trapz(posterior_values, grid)
laplace_values = norm.pdf(grid, loc=mu_hat, scale=std)
plt.figure(figsize=(8,5))
plt.plot(grid, posterior_values, label="Posterior")
plt.plot(grid, laplace_values, "--", label="Laplace approximation")
plt.axvline(mu_hat, linestyle=":", label="Posterior mode")
plt.xlabel("mu")
plt.ylabel("Density")
plt.title("Laplace Approximation for Normal Mean Model")
plt.legend()
plt.show()
What the Plot Shows
The posterior and Laplace approximation should almost overlap.
That is because the log-posterior is exactly quadratic in (\mu).
A quadratic log-posterior gives an exact Gaussian posterior.
Example 6: Gamma Data With Log Shape Parameter
Model
Suppose we observe positive continuous data.
For example:
could be hospital length of stay.
Assume:
where () is the shape and (\beta) is the rate.
Assume () is known.
Since:
we write:
Prior:
The gamma likelihood is:
The likelihood is:
where:
The log-likelihood is:
The log-posterior is:
Laplace approximation gives:
where:
Then:
Python Code With Plot
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
from scipy.stats import norm, gamma
y = np.array([2.4, 3.1, 2.8, 4.0, 3.6, 2.9, 3.3])
beta = 1.0
def log_posterior(theta):
theta = theta[0] if isinstance(theta, np.ndarray) else theta
alpha = np.exp(theta)
loglik = np.sum(gamma.logpdf(y, a=alpha, scale=1/beta))
logprior = norm.logpdf(theta, loc=0, scale=np.sqrt(10))
return loglik + logprior
result = minimize(lambda t: -log_posterior(t), x0=np.array([1.0]))
theta_hat = result.x[0]
eps = 1e-5
hessian = (
log_posterior(theta_hat + eps)
- 2 * log_posterior(theta_hat)
+ log_posterior(theta_hat - eps)
) / eps**2
variance = -1 / hessian
std = np.sqrt(variance)
alpha_hat = np.exp(theta_hat)
print("Posterior mode theta_hat:", theta_hat)
print("Approximate alpha:", alpha_hat)
print("Laplace variance:", variance)
print("Laplace standard deviation:", std)
grid = np.linspace(theta_hat - 4*std, theta_hat + 4*std, 500)
log_post_values = np.array([log_posterior(t) for t in grid])
posterior_values = np.exp(log_post_values - np.max(log_post_values))
posterior_values = posterior_values / np.trapz(posterior_values, grid)
laplace_values = norm.pdf(grid, loc=theta_hat, scale=std)
plt.figure(figsize=(8,5))
plt.plot(grid, posterior_values, label="Posterior")
plt.plot(grid, laplace_values, "--", label="Laplace approximation")
plt.axvline(theta_hat, linestyle=":", label="Posterior mode")
plt.xlabel("theta = log(alpha)")
plt.ylabel("Density")
plt.title("Laplace Approximation for Gamma Shape Model")
plt.legend()
plt.show()
What the Plot Shows
The approximation is done on:
not directly on (\alpha).
So the Gaussian approximation is:
Then transform back:
Complete Reusable Function for Laplace Approximation
To avoid rewriting the same code again and again, we can create a reusable function.
import numpy as np
from scipy.optimize import minimize
def laplace_approximation(log_posterior, start_value):
result = minimize(
lambda x: -log_posterior(x),
x0=np.array([start_value])
)
theta_hat = result.x[0]
eps = 1e-5
hessian = (
log_posterior(theta_hat + eps)
- 2 * log_posterior(theta_hat)
+ log_posterior(theta_hat - eps)
) / eps**2
variance = -1 / hessian
std = np.sqrt(variance)
return theta_hat, variance, std
And a reusable plotting function:
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm
def plot_laplace(log_posterior, theta_hat, std, x_label, title):
grid = np.linspace(theta_hat - 4*std, theta_hat + 4*std, 500)
log_post_values = np.array([log_posterior(t) for t in grid])
posterior_values = np.exp(log_post_values - np.max(log_post_values))
posterior_values = posterior_values / np.trapz(posterior_values, grid)
laplace_values = norm.pdf(grid, loc=theta_hat, scale=std)
plt.figure(figsize=(8,5))
plt.plot(grid, posterior_values, label="Posterior")
plt.plot(grid, laplace_values, "--", label="Laplace approximation")
plt.axvline(theta_hat, linestyle=":", label="Posterior mode")
plt.xlabel(x_label)
plt.ylabel("Density")
plt.title(title)
plt.legend()
plt.show()
Example use:
theta_hat, variance, std = laplace_approximation(log_posterior, start_value=0)
plot_laplace(
log_posterior=log_posterior,
theta_hat=theta_hat,
std=std,
x_label="theta",
title="Laplace Approximation"
)
Final Summary
Laplace approximation follows this pattern:
Approximate the log-posterior near the mode:
Then:
where:
So:
The important idea is:
The log-posterior becomes approximately quadratic.
The posterior becomes approximately Gaussian.
That is why Laplace approximation works.

Leave a Reply