In Chapter 2, we saw something important:

ordinary linear regression breaks down when the data violate its assumptions of normally distributed errors and constant variance.

That naturally leads to a question:

If ordinary regression fails, what replaces it?

The answer is: Generalized Linear Models (GLMs).

And Chapter 3 is where everything comes together.

This chapter is the foundation of:

logistic regression,
Poisson regression,
Gamma regression,
negative binomial models,
survival models,
even many machine learning frameworks.

If Chapter 2 explains why ordinary regression breaks,
Chapter 3 explains: how we fix it systematically.

The Big Idea of GLMs

GLMs are not “one model.”

They are: a framework.

A framework that allows us to use:

different distributions,
different variance structures,
different link functions,
while still keeping the spirit of regression.

Core GLM structure:

g(\mu)=X\beta

That single equation powers a huge portion of modern statistics.

The Three Components of a GLM

A GLM has three pieces.

1. Random Component

This specifies:

the probability distribution of the response variable.

Examples:

Type of Data	Distribution
Continuous symmetric	Normal
Counts	Poisson
Binary outcomes	Bernoulli
Positive skewed data	Gamma

This is the part where we respect:

the structure of the data.

2. Systematic Component

This is the linear predictor:

X\beta

which means:

\beta_0+\beta_1X_1+\beta_2X_2+\cdots

This is still the “regression” part.

Examples:

deployment,
customer count,
month,
discount,
seasonality.

All influence the expected outcome.

3. Link Function

The link function connects:

the expected value of the response,
to:
the linear predictor.

This is the heart of GLMs.

Why Do We Need a Link Function?

Because many outcomes cannot directly behave like:

X\beta

which can take any real value.

Examples:

probabilities must stay between 0 and 1,
counts cannot be negative,
Gamma values must stay positive.

The link function transforms the mean into something that can safely behave linearly.

Example — Logistic Regression

Probabilities cannot exceed 1
or go below 0.
So instead of modeling probability directly,
we model: log-odds.

The logit link:

\log\left(\frac{p}{1-p}\right)

transforms:

probabilities,
into:
real numbers.

Now regression becomes possible.
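A minimal sketch of the logit link and its inverse, using only Python's standard library. The function names `logit` and `inv_logit` and the example values of the linear predictor are illustrative assumptions, not something taken from the chapter.

```python
import math

def logit(p):
    """Map a probability in (0, 1) to the real line (log-odds)."""
    return math.log(p / (1.0 - p))

def inv_logit(eta):
    """Map any real number back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical values of the linear predictor X*beta.
for eta in (-4.0, -1.0, 0.0, 1.0, 4.0):
    p = inv_logit(eta)
    print(f"eta={eta:+.1f} -> p={p:.4f} -> logit(p)={logit(p):+.4f}")
```

However large or small the linear predictor gets, the implied probability stays strictly between 0 and 1.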

Example — Poisson Regression

Counts cannot be negative, so the expected count must stay positive.

So Poisson regression uses: the log link.

\log(\mu)=X\beta

Exponentiating gives:

\mu=e^{X\beta}

which guarantees:

a positive predicted mean count.

Very elegant.
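As a rough illustration of the log link, the sketch below evaluates the linear predictor and the resulting mean for one predictor. The coefficients `beta0` and `beta1` and the input values are made up for the example.

```python
import math

# Hypothetical coefficients: intercept and one slope (assumed for illustration).
beta0, beta1 = 0.5, -0.8

def poisson_mean(x):
    """Inverse log link: mu = exp(beta0 + beta1 * x)."""
    return math.exp(beta0 + beta1 * x)

xs = [-3.0, 0.0, 3.0, 10.0]
etas = [beta0 + beta1 * x for x in xs]   # linear predictor, may be negative
mus = [poisson_mean(x) for x in xs]      # predicted mean, always above zero

for x, eta, mu in zip(xs, etas, mus):
    print(f"x={x:5.1f}  eta={eta:6.2f}  mu={mu:.5f}")
```

Even where the linear predictor is strongly negative, the predicted mean only approaches zero and never crosses it.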

Bernoulli Distribution — The Foundation of Logistic Regression

A Bernoulli random variable models: one trial with two outcomes.

Examples:

buy/not buy,
churn/no churn,
sold/not sold.

Possible outcomes:

Outcome	Meaning
1	success
0	failure

Mean and Variance of Bernoulli

Mean:

E(Y)=p

Variance:

Var(Y)=p(1-p)

This is extremely important.

Notice:

variance depends on the mean.

That is already different from ordinary regression.

A Beautiful Insight About Bernoulli Variance

The variance peaks at:

p=0.5

Why?

Because uncertainty is highest when:

both outcomes are equally likely.

If:

p=0,
or:
p=1,

there is no uncertainty at all: the outcome is already known.

So the variance shrinks to zero at both ends.

This is why Bernoulli variance behaves differently from Poisson variance, which keeps growing with the mean.
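The shape of the variance p(1-p) can be checked numerically. This is a small sketch on an arbitrary grid of probabilities; the helper name `bernoulli_variance` is an assumption for illustration.

```python
def bernoulli_variance(p):
    """Variance of a Bernoulli(p) outcome."""
    return p * (1.0 - p)

# Evaluate on a grid of probabilities from 0 to 1 in steps of 0.1.
grid = [i / 10 for i in range(11)]
variances = [bernoulli_variance(p) for p in grid]

for p, v in zip(grid, variances):
    print(f"p={p:.1f}  Var(Y)={v:.2f}")

# Probability at which the variance is largest on this grid.
p_at_peak = grid[variances.index(max(variances))]
```

On this grid the variance rises from 0 at p=0 to its maximum of 0.25 at p=0.5, then falls back to 0 at p=1.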

Odds — The Key to Logistic Regression

Suppose:

probability of winning = 0.75.

Then:

probability of losing = 0.25.

Odds are:

\frac{0.75}{0.25}=3

Meaning:

winning is three times as likely as losing.

Logistic regression models: the log of the odds.
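The winning example can be traced step by step in code. This is only a sketch; the helper name `odds` is an assumption.

```python
import math

def odds(p):
    """Odds in favor of an event with probability p (0 <= p < 1)."""
    return p / (1.0 - p)

p_win = 0.75
win_odds = odds(p_win)             # 0.75 / 0.25
log_odds = math.log(win_odds)      # the quantity logistic regression models linearly

# Going back from odds to probability: p = odds / (1 + odds)
p_back = win_odds / (1.0 + win_odds)

print(win_odds, round(log_odds, 4), p_back)
```

Odds of 3 correspond to log-odds of about 1.0986, and converting back recovers the original probability of 0.75.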

Poisson Regression — Modeling Counts

Poisson regression is used for:

sales counts,
purchases,
website clicks,
arrivals,
support tickets.

Its defining property:

Var(Y)=E(Y)=\mu

Variance grows naturally with the mean.

This is exactly what count data often does.

Real Business Example

Suppose:

small retailers sell 2 items/month,
large retailers sell 100 items/month.

Large retailers naturally fluctuate more in absolute terms.

Poisson understands this.

Ordinary regression assumes:

constant variability.

That becomes unrealistic.
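To put numbers on the retailer example, the sketch below applies the Poisson property Var(Y)=\mu to the two means. It assumes both retailers are exactly Poisson, which real data may not satisfy.

```python
import math

# Monthly mean sales from the retailer example above.
means = {"small retailer": 2, "large retailer": 100}

sds = {}
for name, mu in means.items():
    variance = mu                   # Poisson: Var(Y) = mu
    sds[name] = math.sqrt(variance)
    relative = sds[name] / mu       # spread relative to the mean
    print(f"{name}: mean={mu}, sd={sds[name]:.2f}, sd/mean={relative:.2f}")
```

The large retailer's standard deviation (10) is about seven times the small retailer's (about 1.41), even though its spread relative to its own mean is smaller. A constant-variance model cannot represent both at once.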

Gamma Regression — Positive Skewed Data

Gamma regression handles:

positive continuous,
right-skewed data.

Examples:

revenue per transaction,
insurance claims,
customer spending,
waiting times.

Variance behaves like:

Var(Y)=\phi\mu^2

As the mean grows:

the variance grows with its square, faster than under Poisson.

This pattern is very common in business data.
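A short worked step, assuming only the variance function above, shows what this means on the scale of the data itself:

```latex
\mathrm{SD}(Y)=\sqrt{\mathrm{Var}(Y)}=\sqrt{\phi}\,\mu
\qquad\Rightarrow\qquad
\frac{\mathrm{SD}(Y)}{\mu}=\sqrt{\phi}
```

So under a Gamma model the standard deviation is a fixed fraction of the mean. With an illustrative \phi=0.25, a segment averaging 100 has a standard deviation of 50, and a segment averaging 1000 has a standard deviation of 500.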

Canonical Links

Each distribution in the GLM family has a natural, or canonical, link.

Distribution	Canonical Link
Normal	identity
Bernoulli	logit
Poisson	log
Gamma	inverse
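The four links in the table can be written as plain functions together with their inverses. This is a sketch using the standard library; the dictionary name `CANONICAL_LINKS` and the test value of the mean are assumptions for illustration.

```python
import math

# Canonical link g and its inverse for each family in the table above.
CANONICAL_LINKS = {
    "Normal":    (lambda mu: mu,                      lambda eta: eta),
    "Bernoulli": (lambda mu: math.log(mu / (1 - mu)), lambda eta: 1 / (1 + math.exp(-eta))),
    "Poisson":   (lambda mu: math.log(mu),            lambda eta: math.exp(eta)),
    "Gamma":     (lambda mu: 1 / mu,                  lambda eta: 1 / eta),
}

mu = 0.3  # a mean value that is valid for all four families
round_trips = {}
for family, (g, g_inv) in CANONICAL_LINKS.items():
    eta = g(mu)                        # mean -> linear-predictor scale
    round_trips[family] = g_inv(eta)   # and back to the mean scale
    print(f"{family:9s} g(mu)={eta:+.4f}  g_inv(g(mu))={round_trips[family]:.4f}")
```

Note that the inverse link, unlike the log link, does not by itself keep a Gamma mean positive, which is one reason a log link is often chosen for Gamma models in practice.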

Identity Link

For normal regression:

g(\mu)=\mu

Nothing changes.

This is why ordinary regression is actually: a special case of GLM.

That is a very important insight.

Overdispersion — When Poisson Breaks

Real count data often violates:

Var(Y)=\mu

Instead:

Var(Y)>\mu

This is called: overdispersion.

Why Overdispersion Happens

Because real-world systems are heterogeneous.

Different parts of the data behave differently.

Example:

Retailer Type	Average Sales
Weak	1
Medium	5
Strong	20

Combining them creates:

extra variability.

That’s why overdispersion often indicates:

hidden groups,
omitted variables,
clustering,
latent heterogeneity.
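The retailer table can be turned into a small calculation with the law of total variance. The sketch assumes three equally sized groups, each exactly Poisson, which is a simplification made only for this example.

```python
from statistics import mean, pvariance

# Average sales per retailer type from the table above.
group_means = [1, 5, 20]

# Assumption: equal-sized groups, each exactly Poisson (variance = mean).
overall_mean = mean(group_means)
within_var = mean(group_means)          # E[Var(Y | group)] = E[mu_group]
between_var = pvariance(group_means)    # Var(E[Y | group])
total_var = within_var + between_var    # law of total variance

print(f"mean={overall_mean:.2f}, variance={total_var:.2f}, "
      f"ratio={total_var / overall_mean:.2f}")
```

Each group on its own satisfies variance = mean, yet the pooled data have a variance of roughly 75.6 against a mean of roughly 8.7. Ignoring the retailer type is enough to produce strong overdispersion.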

Negative Binomial Regression

Negative binomial extends Poisson by adding extra variance.

Variance becomes:

Var(Y)=\mu+\alpha\mu^2

where:

\alpha \ge 0 controls the extra dispersion; as \alpha approaches 0, the model reduces to Poisson.

This is one of the most practical count models in real business analytics.
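A quick comparison of the two variance functions makes the difference concrete. The dispersion value `alpha = 0.5` is a hypothetical choice, not an estimate from any data.

```python
def poisson_variance(mu):
    """Poisson variance equals the mean."""
    return mu

def negbin_variance(mu, alpha):
    """Negative binomial variance: mu + alpha * mu**2."""
    return mu + alpha * mu ** 2

alpha = 0.5  # hypothetical dispersion value, chosen only for illustration

for mu in (1, 5, 20, 100):
    print(f"mu={mu:4d}  Poisson var={poisson_variance(mu):6.1f}  "
          f"NB var={negbin_variance(mu, alpha):8.1f}")
```

At small means the two models are close, but at a mean of 20 the negative binomial variance is already 220 instead of 20. Setting `alpha` to 0 recovers the Poisson variance.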

Real Sales Example

Suppose you model:

monthly sales counts,
by category,
over time.

You may include:

month,
customer count,
year,
seasonality.

A realistic model might look like:

\log(\mu)=\beta_0+\beta_1Month+f(CustomerCount)+\beta_2Year

where:

f(\cdot) is a spline, a smooth function that replaces a single coefficient.

Now we are entering: Generalized Additive Models (GAMs).

Splines and Nonlinearity

Not all relationships are linear.

Example:

adding customers may help sales strongly initially,
then level off later.

A spline allows:

smooth nonlinear effects.

Instead of:

\beta X

we use:

f(X)

This becomes a GAM.
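The "strong at first, then leveling off" shape can be sketched with the simplest kind of spline, a piecewise-linear one with a single knot. Real GAM software typically uses smoother, penalized bases; the knot location and coefficients below are hypothetical values chosen only to show the idea.

```python
def hinge(x, knot):
    """Truncated-linear basis function: 0 below the knot, linear above it."""
    return max(0.0, x - knot)

# Hypothetical knot and coefficients, chosen so the effect flattens after the knot.
KNOT = 50.0
B_SLOPE = 0.04       # slope below the knot
B_CHANGE = -0.035    # change in slope above the knot

def f(customers):
    """Piecewise-linear spline effect of customer count on log(mu)."""
    return B_SLOPE * customers + B_CHANGE * hinge(customers, KNOT)

for c in (0, 25, 50, 100, 200):
    print(f"customers={c:3d}  f(customers)={f(c):.3f}")
```

The model is still linear in its coefficients, because `f` is a weighted sum of basis functions. That is what lets a spline sit inside the same linear predictor as the other terms.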

Forecasting Inventory

One fascinating business application is: inventory forecasting.

It can combine:

negative binomial counts,
seasonality,
customer count,
splines,
weighted recent data,
lead times,
safety stock.

These are the same ingredients that advanced operational forecasting systems combine.

Weighted Forecasting

One scheme worth considering:

70% weight on the most recent 6 months,
30% weight on the older 18 months.

This is very practical.

Why?

Because:

markets evolve,
trends shift,
recent data often matters more.
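One way to read the 70/30 split is as per-month observation weights. The sketch below uses made-up sales figures and a simple weighted mean; in an actual model fit, the same weights would be supplied as observation weights.

```python
# Hypothetical 24 months of sales counts, oldest first (made-up numbers).
sales = [12, 9, 14, 11, 10, 13, 15, 12, 11, 14, 16, 13,
         15, 17, 14, 16, 18, 15, 20, 22, 19, 23, 21, 24]

n_recent, n_older = 6, 18
assert len(sales) == n_recent + n_older

# Spread 30% over the 18 older months and 70% over the 6 recent ones.
weights = [0.30 / n_older] * n_older + [0.70 / n_recent] * n_recent

weighted_mean = sum(w * y for w, y in zip(weights, sales))
plain_mean = sum(sales) / len(sales)

print(f"plain mean={plain_mean:.2f}, weighted mean={weighted_mean:.2f}")
```

Under this scheme each recent month carries seven times the weight of an older month (0.70/6 versus 0.30/18), so the weighted mean follows the recent upward trend more closely than the plain mean does.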

Safety Stock — Beyond Normality

Traditional safety stock formulas assume:

normal demand.

But negative binomial demand is:

discrete,
skewed,
overdispersed.

So instead of:

normal approximations,

you can use:

percentiles from the fitted distribution.

For example:

forecast the next 12 months,
scale to 4-month lead time,
calculate 90th percentile demand.

That becomes: a probabilistic inventory target.
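The percentile step can be sketched with the standard library alone. The fitted values `monthly_mu` and `monthly_alpha` are hypothetical, and the lead-time scaling assumes the months are independent with the same mean, so that the 4-month total is again negative binomial. Both are assumptions made for this illustration.

```python
import math

def negbin_pmf(k, mu, alpha):
    """Probability of k events, with mean mu and variance mu + alpha*mu**2."""
    r = 1.0 / alpha                 # "size" parameter
    p = r / (r + mu)                # success probability
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log(1.0 - p))
    return math.exp(log_pmf)

def negbin_quantile(q, mu, alpha):
    """Smallest k whose cumulative probability reaches q."""
    k, cdf = 0, negbin_pmf(0, mu, alpha)
    while cdf < q:
        k += 1
        cdf += negbin_pmf(k, mu, alpha)
    return k

# Hypothetical fitted values (not from the original text).
monthly_mu, monthly_alpha = 10.0, 0.4
lead_months = 4

# Assumption: independent months with equal means, so the lead-time total
# is negative binomial with mean 4*mu and dispersion alpha/4.
lead_mu = lead_months * monthly_mu
lead_alpha = monthly_alpha / lead_months

target = negbin_quantile(0.90, lead_mu, lead_alpha)
safety_stock = target - lead_mu
print(f"90th percentile lead-time demand: {target}, safety stock: {safety_stock:.0f}")
```

The inventory target is read directly from the fitted distribution, so the skew and overdispersion of demand carry through to the safety stock instead of being smoothed away by a normal approximation.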

This is a far more modern approach.

The Deep Philosophy of GLMs

Chapter 3 teaches something profound:

data types matter.

You cannot force:

counts,
probabilities,
skewed revenue,
overdispersed demand,

into one simplistic framework.

GLMs adapt the model to the behavior of reality.

That is why they are so powerful.

Final Thought

Ordinary regression is only one small corner of statistical modeling.

GLMs generalize regression into a flexible system capable of modeling:

counts,
probabilities,
skewed positive data,
overdispersion,
seasonality,
nonlinear effects,
real business processes.

And once you truly understand GLMs,
you begin to realize:

the distribution is not just mathematics —
it is a description of how reality behaves.
