Chapter 12: GAMLSS — Modeling More Than Just the Mean

By Chapter 12, something surprising happens.

Until now, almost every model we built focused on one thing:

modeling the mean.

GLM

$$
g(\mu) = X\beta
$$

GAM

$$
g(\mu) = \beta_0 + f(X)
$$

Cox Model

Models the hazard function.

But Chapter 12 asks:

Why model only the average?

What if:

  • Variability changes?
  • Skewness changes?
  • Tail behavior changes?

Real life often behaves that way.

This chapter introduces GAMLSS:

Generalized Additive Models for Location, Scale and Shape

It is one of the most flexible regression frameworks available: any parameter of the response distribution, not just the mean, can depend on the predictors.


Why Mean Alone Is Sometimes Not Enough

Suppose we have two categories:

| SKU | Average Sales | Variability |
|-----|---------------|-------------|
| A   | 20            | 2           |
| B   | 20            | 20          |

Both have the same mean.

Yet they lead to completely different business decisions.

If you only model the mean:

they look identical.

Reality says:

they are not.
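To make the difference concrete, here is a minimal sketch. It assumes the Variability column is a standard deviation and that demand is roughly Normal; neither assumption comes from the table itself, and the 25-unit stock level is a hypothetical figure chosen for illustration.

```python
from statistics import NormalDist

# Assumption: "Variability" is a standard deviation and demand is Normal.
skus = {
    "A": NormalDist(mu=20, sigma=2),
    "B": NormalDist(mu=20, sigma=20),
}

STOCK = 25  # hypothetical stock level, a little above the shared mean

# Probability that demand exceeds the stock on hand (a stockout).
stockout_prob = {name: 1 - dist.cdf(STOCK) for name, dist in skus.items()}

for name, p in stockout_prob.items():
    print(f"SKU {name}: P(demand > {STOCK}) = {p:.3f}")
```

Under these assumptions SKU A almost never runs out at 25 units, while SKU B runs out in roughly four periods out of ten, even though both have the same mean.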


The Big Idea of GAMLSS

Instead of modeling only:

$$
\mu
$$

GAMLSS models:

$$
\mu,\sigma,\nu,\tau
$$

These represent:

| Parameter | Meaning |
|-----------|---------|
| $$\mu$$ | Location (mean) |
| $$\sigma$$ | Scale (variability) |
| $$\nu$$ | Skewness |
| $$\tau$$ | Tail shape |

This means multiple parts of the distribution become dynamic.


Why This Matters

Suppose we are modeling inventory demand.

During normal months:

  • Demand is stable.

During holiday months:

  • Demand is volatile.

Not only does average demand change; the variance changes too.

GAMLSS captures that.


Ordinary Regression

Models:

$$
Y \sim N(\mu,\sigma^2)
$$

with:

$$
\sigma = \text{constant}
$$


GAMLSS

Allows:

$$
\mu = f(X)
$$

and

$$
\sigma = g(X)
$$

and

$$
\nu = h(X)
$$

and

$$
\tau = k(X)
$$

Everything can move.


Example: Revenue Per Transaction

Suppose customer count increases.

Observation:

  • Average revenue increases.
  • Variability also increases.

Model for the mean:

$$
\log(\mu)=\beta_0+f(\text{Customers})
$$

Model for the scale (variability):

$$
\log(\sigma)=\alpha_0+g(\text{Customers})
$$

Now uncertainty itself is being modeled.

Very powerful.
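A small sketch of what the two equations above imply for prediction. The coefficients are hypothetical, the smooth functions are replaced by straight lines, and a Normal response is assumed purely to keep the example short; a fitted GAMLSS would estimate all of these from data.

```python
import math
from statistics import NormalDist

# Hypothetical coefficients; linear stand-ins for the smooth functions f and g.
BETA0, BETA1 = 3.0, 0.010    # log(mu)    = BETA0  + BETA1  * customers
ALPHA0, ALPHA1 = 1.0, 0.015  # log(sigma) = ALPHA0 + ALPHA1 * customers

def predict(customers):
    """Return (mu, sigma) implied by the two log-link equations."""
    mu = math.exp(BETA0 + BETA1 * customers)
    sigma = math.exp(ALPHA0 + ALPHA1 * customers)
    return mu, sigma

def interval(customers, level=0.90):
    """Central prediction interval, assuming a Normal response."""
    mu, sigma = predict(customers)
    dist = NormalDist(mu=mu, sigma=sigma)
    tail = (1 - level) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)

for customers in (10, 100):
    low, high = interval(customers)
    print(f"{customers} customers: 90% interval {low:.1f} to {high:.1f}")
```

Because sigma has its own equation, the interval widens as customer count grows instead of keeping a constant width.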


Distribution Choice Becomes Important

Unlike GLMs, which are limited to exponential-family distributions, GAMLSS supports a much wider range of distributions.

| Data Type | Distribution |
|-----------|--------------|
| Positive Skewed Data | Gamma |
| Counts | Negative Binomial |
| Heavy Tails | t Distribution |
| Proportions | Beta |
| Continuous Symmetric Data | Normal |

This provides enormous flexibility.


Example: Inventory Demand

Suppose monthly sales behave differently throughout the year.

Average demand:

  • Higher in December.

Variance:

  • Also higher in December.

Model for the mean:

$$
\log(\mu)=f(\text{Month})
$$

Model for the scale (variability):

$$
\log(\sigma)=g(\text{Month})
$$

Result:

For December, the model predicts both:

  • More demand
  • More uncertainty

Inventory planning becomes smarter.
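The crudest version of this idea treats Month as a category and estimates a separate mean and standard deviation for each month. The sketch below does exactly that on made-up numbers (the sales figures are illustrative, not real data); a real GAMLSS would replace the per-month averages with smooth functions and a chosen distribution.

```python
from collections import defaultdict
from statistics import mean, stdev

# Made-up (month, units sold) observations, for illustration only.
sales = [
    (6, 18), (6, 21), (6, 20), (6, 19),
    (12, 35), (12, 60), (12, 28), (12, 51),
]

by_month = defaultdict(list)
for month, units in sales:
    by_month[month].append(units)

# One (mean, sample standard deviation) pair per month.
params = {month: (mean(units), stdev(units)) for month, units in by_month.items()}

for month, (mu, sigma) in sorted(params.items()):
    print(f"month {month:2d}: mean = {mu:.1f}, sd = {sigma:.1f}")
```

In this toy data December has both the higher mean and the higher spread, which is the pattern the two equations are meant to capture.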


Additive Components

Like GAMs, smooth functions appear naturally.

Mean model:

$$
g(\mu)=\beta_0+f_1(x_1)+f_2(x_2)
$$

Scale model:

$$
h(\sigma)=\alpha_0+g_1(x_1)+g_2(x_2)
$$

Different smoothers can be used for different distribution parameters.


Why Model Variance?

Suppose we have two SKUs.

Both average:

$$
20
$$

SKU A:

  • Stable demand

SKU B:

  • Wildly unpredictable demand

Inventory requirements will differ dramatically.

Mean alone fails to capture this.


Example: Your Inventory Problem

Suppose your forecast predicts:

$$
30
$$

pieces.

Should inventory also be set at:

$$
30
$$

Not necessarily.

We need uncertainty estimates.

GAMLSS might estimate:

Mean:

$$
30
$$

Variance:

$$
100
$$

Inventory decisions can then incorporate both expected demand and risk.
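One way to combine the two numbers, sketched under the assumption that demand is Normal (GAMLSS itself does not require this): a variance of 100 means a standard deviation of 10, and the stock level for a target service level is the matching quantile. The function name `stock_for_service_level` is hypothetical.

```python
import math
from statistics import NormalDist

def stock_for_service_level(mean, variance, service_level):
    """Stock level that covers demand with the given probability.

    Assumes Normal demand, which is a simplification for this sketch.
    """
    sd = math.sqrt(variance)
    return NormalDist(mu=mean, sigma=sd).inv_cdf(service_level)

forecast_mean, forecast_variance = 30, 100  # figures from the text; sd = 10

for level in (0.50, 0.90, 0.95):
    units = stock_for_service_level(forecast_mean, forecast_variance, level)
    print(f"{level:.0%} service level -> stock about {units:.1f} units")
```

Stocking exactly 30 pieces covers demand only about half the time under this assumption; covering 95% of periods takes roughly 46 pieces.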


Modeling Skewness

Sometimes demand behaves asymmetrically.

Example:

Most months:

$$
5
$$

sales.

Occasionally:

$$
50
$$

sales.

This creates a right-skewed distribution.

GAMLSS can model skewness directly:

$$
\nu = f(\text{Month})
$$

Now the shape of the distribution changes over time.
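As a quick check of what "right-skewed" means here, the sketch below computes the moment-based sample skewness for a hypothetical year of eleven months at 5 sales and one month at 50. The data and the helper name `sample_skewness` are illustrative assumptions.

```python
from statistics import mean, median

def sample_skewness(values):
    """Moment-based skewness: third central moment over the second to the 1.5."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical year: eleven ordinary months and one spike.
monthly_sales = [5] * 11 + [50]

skew = sample_skewness(monthly_sales)
print(f"mean = {mean(monthly_sales):.2f}, median = {median(monthly_sales)}")
print(f"skewness = {skew:.2f}")
```

The mean (8.75) sits well above the median (5) and the skewness is strongly positive, which is the signature a symmetric distribution cannot reproduce.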


Modeling Tails

Tail risk often matters most.

Suppose:

Most months:

$$
10
$$

sales.

Occasionally:

$$
100
$$

sales.

Inventory failures occur in these extreme situations.

The tail parameter:

$$
\tau
$$

captures this behavior.


Why GAMLSS Beats GAM Sometimes

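| Model | Mean | Variance | Shape |
|-------|------|----------|-------|
| GLM | Modeled | Fixed | Fixed |
| GAM | Modeled | Fixed | Fixed |
| GAMLSS | Modeled | Modeled | Modeled |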

GAMLSS models the entire distribution rather than only the average.


Fitting GAMLSS

Estimation typically cycles through the distribution parameters, updating one while holding the others fixed.

Step 1

Fit the mean.

Step 2

Fit the variance.

Step 3

Fit the skewness (and then the tail parameter, if the distribution has one).

Repeat until convergence.
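The alternating idea can be sketched for the simplest case: a Normal response with a linear mean and a log-linear sigma, so only the first two steps are needed. This is an illustrative toy, not the algorithm of any particular package; the function names, the identity link on the mean, and the simulated data are all assumptions of the sketch.

```python
import math
import random

def weighted_line_fit(x, y, w):
    """Weighted least squares for y ~ c0 + c1 * x; returns (c0, c1)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    c1 = sxy / sxx
    return ybar - c1 * xbar, c1

def fit_location_scale(x, y, n_cycles=50, tol=1e-8):
    """Fit mu = b0 + b1*x and log(sigma) = a0 + a1*x by alternating updates."""
    n = len(y)
    ybar = sum(y) / n
    # Start from a constant mean and a constant sigma.
    b0, b1 = ybar, 0.0
    a0, a1 = 0.5 * math.log(sum((yi - ybar) ** 2 for yi in y) / n), 0.0
    for _ in range(n_cycles):
        old = (b0, b1, a0, a1)
        # Step 1: update the mean with sigma held fixed (weights 1 / sigma^2).
        sigma = [math.exp(a0 + a1 * xi) for xi in x]
        b0, b1 = weighted_line_fit(x, y, [1.0 / s ** 2 for s in sigma])
        # Step 2: update log(sigma) with the mean held fixed.
        # One Fisher scoring step: regress the working response z on x.
        z = [
            (a0 + a1 * xi) + 0.5 * (((yi - b0 - b1 * xi) / s) ** 2 - 1.0)
            for xi, yi, s in zip(x, y, sigma)
        ]
        a0, a1 = weighted_line_fit(x, z, [1.0] * n)
        if max(abs(p - q) for p, q in zip(old, (b0, b1, a0, a1))) < tol:
            break
    return b0, b1, a0, a1

# Simulated data with known parameters, so the fit can be checked.
random.seed(0)
x = [random.uniform(0.0, 2.0) for _ in range(3000)]
y = [random.gauss(10.0 + 5.0 * xi, math.exp(0.8 * xi)) for xi in x]

b0, b1, a0, a1 = fit_location_scale(x, y)
print(f"mean:       b0 = {b0:.2f}, b1 = {b1:.2f}  (simulated with 10, 5)")
print(f"log(sigma): a0 = {a0:.2f}, a1 = {a1:.2f}  (simulated with 0, 0.8)")
```

Each pass reuses the other parameter's latest estimate, which is why the loop has to repeat: a better sigma changes the weights for the mean, and a better mean changes the residuals used for sigma.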


Overfitting Warning

GAMLSS is extremely flexible.

That flexibility comes with risk.

Too many parameters can lead to overfitting.

Common safeguards include:

  • AIC
  • BIC
  • Cross-validation
  • Holdout testing
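
For reference, the two information criteria in that list have the standard definitions below, where $$\hat{\ell}$$ is the maximized log-likelihood, $$k$$ is the number of estimated parameters (summed over all the parameter models, not just the mean), and $$n$$ is the sample size. Lower values are better for both.

$$
\text{AIC} = -2\hat{\ell} + 2k, \qquad \text{BIC} = -2\hat{\ell} + k\log n
$$

Every extra term in the models for the scale, skewness, or tail parameters increases $$k$$, so it has to improve the likelihood enough to pay for itself.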

Real Inventory Example

Suppose we model inventory demand.

Mean model:

$$
\log(\mu)=\text{Month}+f(\text{CustomerCount})+\text{Year}
$$

Scale model:

$$
\log(\sigma)=\text{Month}+\text{CustomerCount}
$$

Now high-sales months also become high-risk months.

Inventory planning improves substantially.


Connection to Your Heuristic

Your heuristic was:

$$
4 \times \text{Avg} + (P90-P50)\sqrt{4}
$$

Notice what it is trying to capture:

  • Mean demand
  • Variability

You were manually approximating what GAMLSS formalizes statistically.

An interesting observation.


When Should You Use GAMLSS?

Use GAMLSS when:

  • Variance changes over time.
  • Skewness matters.
  • Tail risk matters.
  • Percentile forecasts are important.
  • Uncertainty itself is a business problem.

Common applications:

  • Inventory planning
  • Pricing
  • Insurance
  • Reliability analysis
  • Forecasting

When NOT to Use GAMLSS

Avoid GAMLSS when:

  • Sample size is small.
  • Prediction is simple.
  • Interpretability matters more than flexibility.
  • A standard GLM or GAM already performs well.

Chapter 12’s Big Lesson

This chapter teaches a profound statistical lesson:

Averages do not tell the whole story.

Real systems differ in:

  • Center
  • Spread
  • Shape
  • Uncertainty

GAMLSS allows us to model all of them.


Final Thought

Before Chapter 12, the question was:

“What is the expected value?”

After Chapter 12, the question becomes:

“What does the entire distribution look like?”

That shift moves statistics from:

predicting averages

to

understanding uncertainty itself.
