Chapter 10 — Generalized Additive Models (GAMs): When Straight Lines Stop Making Sense

By Chapter 10, we reach an important realization.

Up until now, most models assumed: effects are linear.

Meaning:

If adding:

  • 10 customers increases sales by 5,

then adding:

  • another 10 customers also increases sales by 5.

Constant effect.
Straight lines.

But real life rarely behaves like that.

Maybe:

  • customer growth helps initially then plateaus,
  • deployment improves sales until saturation,
  • aging hurts sales only after 180 days,
  • discounts help only up to a point.

This is where Chapter 10 begins.

This chapter introduces one of the most elegant extensions of GLMs: Generalized Additive Models (GAMs).


Why GLMs Sometimes Fail

Recall GLMs:

g(\mu) = X\beta

This assumes:

every predictor contributes linearly.

Example:

\log(\mu) = \beta_0 + \beta_1 \text{Customers}

Meaning:

every additional customer contributes equally.

That may not be realistic.


Example — Customer Count

Suppose:

Customers    Sales
10           2
20           10
50           40
100          60
300          70

Notice:

sales increase quickly initially.

Then flatten.

Linear regression struggles.
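
To see why, compute the slope between consecutive rows of the table. This is a minimal sketch that reads the table as (customers, sales) pairs; the variable names are illustrative.

```python
# (customers, sales) pairs read from the table above
data = [(10, 2), (20, 10), (50, 40), (100, 60), (300, 70)]

# Extra sales per extra customer between consecutive rows.
# A linear model assumes this number is the same everywhere.
slopes = [
    (s1 - s0) / (c1 - c0)
    for (c0, s0), (c1, s1) in zip(data, data[1:])
]

for (c0, _), (c1, _), slope in zip(data, data[1:], slopes):
    print(f"{c0:>3} -> {c1:>3} customers: {slope:.2f} sales per customer")
```

The slope is close to 1 sale per customer at low counts and drops to 0.05 between 100 and 300 customers. No single coefficient can describe both ends.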


The Big Idea of GAM

Instead of:

\beta X

replace with:

f(X)

where:

  • f = a smooth function learned from data.

Model becomes:

g(\mu) = \beta_0 + f_1(x_1) + f_2(x_2) + \cdots

That single change makes the model incredibly flexible.


Why Is It Called Additive?

Because effects are added:

f_1(X_1) + f_2(X_2)

Each predictor gets:

  • its own smooth curve.
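
A small sketch of what "additive" means in practice. The two curves below are hypothetical shapes written by hand for illustration; in a real GAM they would be estimated from data.

```python
import math

# Hypothetical smooth effects, chosen only for illustration.
def f_customers(x):
    # Rises quickly, then saturates.
    return 2.0 * (1.0 - math.exp(-x / 50.0))

def f_age_days(x):
    # Close to zero well before 180 days, then a steady decline.
    return -0.5 * math.log1p(math.exp((x - 180.0) / 30.0))

def additive_predictor(customers, age_days, beta0=1.0):
    # g(mu) = beta0 + f1(x1) + f2(x2)
    return beta0 + f_customers(customers) + f_age_days(age_days)

eta = additive_predictor(customers=100, age_days=200)
mu = math.exp(eta)  # inverse of a log link (an assumption for this sketch)
print(round(eta, 3), round(mu, 3))
```

Because the curves are added, changing age shifts the predictor by the same amount whatever the customer count is. Each curve can be plotted and read on its own.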

What Is a Smooth Function?

A smooth function means:

let the data decide the shape.

Not a shape chosen in advance, such as:

  • straight,
  • quadratic,
  • cubic.

The curve is learned.


Example

Customer count effect:

Linear model: straight line.

GAM: curve.

Maybe:

  • steep,
  • flattening,
  • accelerating.

Splines — The Engine Behind GAM

This was something you discovered earlier.

GAMs are usually built using: splines.


What Is a Spline?

Splines are: small smooth polynomial pieces joined together.

Instead of: one huge equation, build many local equations.


Example

Customer count:

0–50:
one curve.

50–150:
another curve.

150–300:
another.

Joined smoothly.


Knots — Where Splines Join

The joining locations are: knots.

Example:

Customer count:

50

150

Those become transition points.

(0 and 300 are the ends of the range, not joins between pieces.)
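
One common way to write such a curve is a truncated power basis: a single cubic plus one extra term that "switches on" at each knot. The sketch below assumes interior knots at 50 and 150 (the joins between the three ranges in the example) and uses made-up coefficients. GAM software usually uses other, better conditioned bases, so treat this as an illustration of the idea only.

```python
def truncated_cubic(x, knot):
    # (x - knot)^3 beyond the knot, 0 before it.
    return max(0.0, x - knot) ** 3

def spline(x, coefs, knots):
    # Global cubic plus one truncated cubic term per knot.
    b0, b1, b2, b3, *knot_coefs = coefs
    value = b0 + b1 * x + b2 * x**2 + b3 * x**3
    for c, k in zip(knot_coefs, knots):
        value += c * truncated_cubic(x, k)
    return value

knots = [50.0, 150.0]                        # interior joins from the example
coefs = [0.0, 1.0, -4e-3, 0.0, 3e-5, -2e-5]  # hypothetical coefficients

# The pieces meet without a jump at each knot.
for k in knots:
    left = spline(k - 1e-6, coefs, knots)
    right = spline(k + 1e-6, coefs, knots)
    print(k, abs(right - left) < 1e-4)
```

Each truncated term is zero up to its knot and has zero first and second derivative there, so the curve changes shape at the knot without a jump or a kink.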


Why Not Use High Degree Polynomials?

People tried that.

Problem:

wild oscillations, especially near the edges of the data.

Splines are:

  • stable,
  • interpretable,
  • local.
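
The oscillation problem is easy to reproduce. The sketch below fits one degree-10 polynomial through 11 evenly spaced points of a smooth bump (the function 1/(1+25x²), a standard textbook test case that is not part of the customer example) and measures how far the polynomial strays between the points.

```python
def runge(x):
    # A smooth, well-behaved bump.
    return 1.0 / (1.0 + 25.0 * x * x)

def lagrange_interpolate(xs, ys, x):
    # Value at x of the single polynomial passing through all (xs, ys).
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        term = yi
        for j, xj in enumerate(xs):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

n = 11
xs = [-1.0 + 2.0 * i / (n - 1) for i in range(n)]
ys = [runge(x) for x in xs]

grid = [-1.0 + 2.0 * i / 400 for i in range(401)]
max_error = max(abs(lagrange_interpolate(xs, ys, x) - runge(x)) for x in grid)
print(f"largest gap between polynomial and true curve: {max_error:.2f}")
```

The true curve never exceeds 1, yet the polynomial misses it by more than 1 near the ends of the range. A spline avoids this because each piece only has to follow the data in its own neighborhood.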

Smoothness Penalty

Now a problem appears.

Too many knots: overfitting.

Too few: underfitting.

Solution: penalty.

Objective becomes: Fit well.

But stay smooth.
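
The trade-off can be sketched with a simplified stand-in for the spline penalty: choose fitted values z that minimize sum((y - z)^2) + lambda * sum((second differences of z)^2). This discrete version (a Whittaker-style smoother) is not what GAM software runs internally, and the data below are synthetic, but the role of lambda is the same.

```python
import numpy as np

def penalized_smooth(y, lam):
    # Minimize sum((y - z)^2) + lam * sum((second differences of z)^2).
    n = len(y)
    D = np.diff(np.eye(n), n=2, axis=0)  # second-difference matrix, shape (n-2, n)
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)

def roughness(z):
    # The quantity the penalty punishes.
    return float(np.sum(np.diff(z, n=2) ** 2))

rng = np.random.default_rng(0)
x = np.linspace(0, 300, 40)
y = 70 * (1 - np.exp(-x / 60)) + rng.normal(0, 5, size=x.size)  # synthetic data

for lam in (0.1, 10.0, 1000.0):
    z = penalized_smooth(y, lam)
    misfit = float(np.sum((y - z) ** 2))
    print(f"lambda={lam:g}  misfit={misfit:.1f}  roughness={roughness(z):.3f}")
```

Small lambda follows every wiggle in the data (low misfit, high roughness). Large lambda irons the curve out (higher misfit, low roughness). In practice lambda is chosen automatically, for example by cross-validation.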


Effective Degrees of Freedom (EDF)

One of the most important GAM outputs.

EDF tells you: how nonlinear the fitted relationship is.


If:

EDF ≈ 1

Relationship almost linear.


If:

EDF = 4

Moderately nonlinear.


If:

EDF = 10

Very flexible.


Interpretation Example

Customer spline:

EDF = 1.1

Almost linear.


Customer spline:

EDF = 6

Complex curve.
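
Where does the number come from? For a penalized smoother the fitted values are a linear function of the data, fitted = H y, and EDF is the trace of H. The sketch below assumes a simple second-difference penalty as a stand-in for a spline penalty; the function name and the lambda values are illustrative.

```python
import numpy as np

def edf(n, lam):
    # Trace of the smoother matrix H = (I + lam * D'D)^-1, where fitted = H @ y.
    D = np.diff(np.eye(n), n=2, axis=0)  # second-difference matrix
    H = np.linalg.inv(np.eye(n) + lam * D.T @ D)
    return float(np.trace(H))

n = 40
for lam in (1e-2, 1.0, 1e2, 1e4, 1e8):
    print(f"lambda={lam:g}  EDF={edf(n, lam):.2f}")
```

With almost no penalty, EDF approaches the number of data points. With a very heavy penalty it approaches 2, which is a straight line: one degree of freedom for the intercept and one for the slope. GAM software typically reports the intercept separately, which is why a straight-line smooth shows up there as EDF ≈ 1.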


GAM With GLM Distributions

Beautiful part:

GAM still supports:

  • Poisson,
  • Binomial (logistic),
  • Gamma,
  • Negative Binomial.

Only the predictor changes: linear terms become smooth functions.
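
A sketch of that idea: the same additive predictor is passed through a different inverse link depending on the family. The smooth term here is a hypothetical hand-written curve, and only two families are shown.

```python
import math

def smooth_effect(x):
    # Hypothetical smooth term f(x): rises quickly, then levels off.
    return 1.5 * (1.0 - math.exp(-x / 50.0))

def predict(x, family, beta0=-1.0):
    eta = beta0 + smooth_effect(x)   # same additive predictor for every family
    if family == "poisson":          # log link: mean count
        return math.exp(eta)
    if family == "binomial":         # logit link: probability
        return 1.0 / (1.0 + math.exp(-eta))
    raise ValueError(f"unsupported family: {family}")

for x in (10, 100, 300):
    print(x, round(predict(x, "poisson"), 3), round(predict(x, "binomial"), 3))
```

The curve lives on the link scale. The family decides how it is translated back to counts or probabilities.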


Poisson GAM

Counts.

Model:

\log(\mu) = \beta_0 + f(\text{CustomerCount})


Logistic GAM

Binary.

Model:

\log\left(\frac{p}{1-p}\right)=\beta_0+f(x)


Gamma GAM

Positive skewed.

Model:

g(\mu)=\beta_0+f(x)


Negative Binomial GAM

This became relevant to your inventory problem.

Model:

\log(\mu) = \beta_0 + \beta_1 \text{Month} + f(\text{CustomerCount}) + \beta_2 \text{Year}

Now:

  • counts,
  • overdispersion,
  • nonlinear demand,

all modeled together.
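
Overdispersion means the variance of the counts is larger than their mean. A quick sketch of the difference, assuming the common NB2 form of the negative binomial and a made-up dispersion value alpha:

```python
def poisson_variance(mu):
    # Poisson: variance equals the mean.
    return mu

def negbin_variance(mu, alpha):
    # NB2 parameterization: variance grows quadratically with the mean.
    return mu + alpha * mu**2

for mu in (2.0, 20.0, 200.0):
    print(mu, poisson_variance(mu), negbin_variance(mu, alpha=0.5))
```

With alpha = 0 the two agree. The larger alpha is, the more room the model leaves for months that are far above or below the expected count.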


Interaction Inside GAM

GAM also supports interaction.

Example:

Month changes customer effect.

Model:

g(\mu) = \beta_0 + f(\text{CustomerCount}, \text{Month})

Now:

shape changes by season.

Very powerful.
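
The difference from a purely additive model can be sketched with two hand-written predictors. Both functions below are hypothetical shapes, not fitted curves: in the first the season only shifts the curve up or down, in the second the season changes the curve itself.

```python
import math

def additive_eta(customers, month):
    # f1(customers) + f2(month): the customer curve has one fixed shape.
    seasonal = 0.3 * math.cos(2 * math.pi * (month - 12) / 12)
    return 2.0 * (1 - math.exp(-customers / 50)) + seasonal

def interaction_eta(customers, month):
    # f(customers, month): the saturation level itself depends on the month.
    peak = 2.0 + 0.5 * math.cos(2 * math.pi * (month - 12) / 12)
    return peak * (1 - math.exp(-customers / 50))

def customer_gain(eta, month):
    # Change in the predictor when customers go from 10 to 100.
    return eta(100, month) - eta(10, month)

for month in (6, 12):
    print(month,
          round(customer_gain(additive_eta, month), 3),
          round(customer_gain(interaction_eta, month), 3))
```

In the additive version the gain from extra customers is the same in June and December. In the interaction version it is larger in December, because the curve itself is taller.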


Your Inventory Example

You built something very close to a GAM.

You proposed:

\log(\mu) = \beta_0 + \beta_1 \text{Month} + f(\text{CustomerCount}) + \beta_2 \text{Year}

That is essentially: a GAM.

Interpretation:

Month:
seasonality.

Spline:
nonlinear customer effect.

Year:
market drift.


Forecasting Inventory With GAM

You wanted: forecast annual sales.

Then: scale to replenishment lead time.

Then: calculate inventory.

Workflow:

Step 1:
Predict future monthly sales.

Step 2:
Aggregate next 12 months.

Step 3:
Convert to 4-month demand.

Step 4:
Add safety stock.
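
The arithmetic of Steps 2 to 4 can be sketched as follows. The monthly forecasts and the safety stock figure are made-up numbers, and the function name is illustrative; Step 1 would come from the fitted model.

```python
def lead_time_stock(monthly_forecast, lead_time_months, safety_stock):
    if len(monthly_forecast) != 12:
        raise ValueError("expected 12 monthly forecasts")
    annual = sum(monthly_forecast)                      # Step 2
    lead_time_demand = annual * lead_time_months / 12   # Step 3
    return lead_time_demand + safety_stock              # Step 4

# Step 1 output (hypothetical): predicted sales for the next 12 months.
forecast = [100, 90, 95, 110, 120, 130, 140, 135, 125, 115, 105, 135]

target = lead_time_stock(forecast, lead_time_months=4, safety_stock=50)
print(round(target, 1))
```

Scaling the annual total by 4/12 spreads demand evenly over the year. If seasonality is strong, summing the forecasts for the specific four months ahead may be a better choice.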


GAM vs GLM

GLM                      GAM
Straight effects         Smooth effects
Fixed coefficients       Flexible curves
Simpler                  More realistic
Easier interpretation    Better fit

GAM vs Random Effects

This caused confusion earlier.


GAM

Captures:

nonlinear relationships.


Random Effects

Captures:

hidden group differences.


You can combine them:

GAMM

Generalized Additive Mixed Models.


GAMM Example

Inventory forecasting:

\log(\mu) = \beta_0 + f(\text{CustomerCount}) + u_{\text{SKU}}

Now:

  • nonlinear demand,
  • SKU-specific effects.

Very modern.
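
A sketch of how the two pieces combine. The shared curve and the SKU offsets below are hypothetical values written by hand; in a fitted GAMM both would be estimated.

```python
import math

def f_customers(x):
    # Hypothetical shared smooth: same shape for every SKU.
    return 2.0 * (1 - math.exp(-x / 50))

# Hypothetical SKU-level random intercepts u_SKU (centered on zero).
u_sku = {"SKU-A": 0.4, "SKU-B": 0.0, "SKU-C": -0.4}

def expected_sales(customers, sku, beta0=1.0):
    eta = beta0 + f_customers(customers) + u_sku[sku]
    return math.exp(eta)  # log link

for sku in u_sku:
    print(sku, [round(expected_sales(c, sku), 1) for c in (10, 100, 300)])
```

On the log scale each SKU shifts the shared curve up or down. On the sales scale that becomes a constant multiplier, exp(u), applied at every customer count.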


Why Chapter 10 Matters

Before this chapter:

models assumed:

constant effects.

After this chapter:

effects become:

learned curves.

That is a huge shift.


The Deep Lesson

Chapter 10 teaches:

relationships are rarely straight.

Reality bends.

And good models should bend with it.


Final Thought

Before Chapter 10:

you ask:

“What is the coefficient?”

After Chapter 10:

you ask:

“What shape does the relationship have?”

That question moves statistics from:

equation fitting

to

discovering how systems truly behave.
