Chapter 7 — Conditional Likelihood: How Statistics Removes What You Don’t Care About

pj316

courses, GLM, Statistics

By Chapter 7, statistics takes a fascinating turn. Until now we’ve mostly asked:

How do predictors affect outcomes?

But Chapter 7 asks a different question:

What if there are factors affecting outcomes that we do NOT want to estimate?

These unwanted factors are called: nuisance parameters.

And Chapter 7 introduces one of the cleverest ideas in statistics: Conditional Likelihood.

This chapter feels difficult initially.

But once the intuition clicks, it becomes one of the most elegant ideas in modeling.

The Problem: Hidden Baseline Differences

Suppose we want to study: Does exposure increase disease?

You collect data.

But the people you compare differ in:

genetics,
age,
income,
baseline health.

Those baseline differences interfere.

Similarly in business:

Suppose you want to know:

Does deployment increase sales?

But retailers already differ:

location,
management,
reputation,
market size.

How do we isolate the effect we care about?

The Main Idea of Conditional Likelihood

Conditional likelihood says: Instead of modeling everything, compare subjects under similar conditions.

Remove what you don’t care about.

Estimate only what matters.

Matched Pair Example

Suppose:

Pair	Case	Control
1	Exposed	Not exposed
2	Not exposed	Exposed

Case:

outcome happened.

Control:

outcome did not happen.

What Does “Case Exposed, Control Not Exposed” Mean?

Suppose:

Pair 1:

Person	Disease	Exposure
A	Yes	Yes
B	No	No

Interpretation:

The person with disease was exposed. That pair supports: exposure increases disease.

Another Pair

Person	Disease	Exposure
A	Yes	No
B	No	Yes

Now the evidence points in the opposite direction.

Why Matching?

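Matching puts subjects with similar baseline risk in the same pair.

Consider an unmatched comparison:

Retailer	Deployment	Sale
A	High	Sold
B	Low	Not Sold

But what if:

retailer A is naturally stronger?

Then deployment and baseline strength are mixed together.

Matching attempts to compare:

similar retailers.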

The Surprising Result

After conditioning: baseline disappears. Only relative information remains.

This is the magic.

Conditional Logistic Regression

Ordinary logistic:

\log\left(\frac{p}{1-p}\right)=\beta_0+\beta X

Conditional logistic:

removes:

\beta_0

You estimate only:

\beta

Why Does Baseline Cancel?

This is a natural question at this point.

Suppose:

Pair-specific model:

\log\left(\frac{p_{ij}}{1-p_{ij}}\right) = \alpha_i + \beta X_{ij}

where:

$\alpha_i$ = baseline of pair $i$, and $j$ indexes the members of the pair.

Conditioning on the number of cases in each pair mathematically removes:

\alpha_i

Now only:

\beta

remains.
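One way to see the cancellation, sketched for a single pair $i$ with members $j = 1, 2$ under the pair-specific model above. The outcome indicator $Y_{ij}$ is notation introduced here for illustration, and the sketch assumes the two outcomes are independent given $\alpha_i$.

```latex
P(Y_{i1}=1,\, Y_{i2}=0)
  = \frac{e^{\alpha_i+\beta X_{i1}}}
         {(1+e^{\alpha_i+\beta X_{i1}})(1+e^{\alpha_i+\beta X_{i2}})}

P(Y_{i1}=0,\, Y_{i2}=1)
  = \frac{e^{\alpha_i+\beta X_{i2}}}
         {(1+e^{\alpha_i+\beta X_{i1}})(1+e^{\alpha_i+\beta X_{i2}})}

P(Y_{i1}=1,\, Y_{i2}=0 \mid Y_{i1}+Y_{i2}=1)
  = \frac{e^{\alpha_i+\beta X_{i1}}}
         {e^{\alpha_i+\beta X_{i1}}+e^{\alpha_i+\beta X_{i2}}}
  = \frac{e^{\beta X_{i1}}}
         {e^{\beta X_{i1}}+e^{\beta X_{i2}}}
```

The denominators of the first two lines are identical, so they drop out of the ratio. The factor $e^{\alpha_i}$ then appears in every remaining term and divides out.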

The Key Insight

You stop asking:

“Who has higher baseline risk?”

Instead ask:

“Within similar pairs, what changed?”

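Concordant vs Discordant Pairs

This is the most important concept.

Concordant:

Case	Control
Exposed	Exposed
Not Exposed	Not Exposed

Not informative: both members have the same exposure, so the pair cannot say which way exposure points.

Discordant:

Case	Control
Exposed	Not Exposed
Not Exposed	Exposed

Very informative.

Why?

Only discordant pairs tell us:

whether exposure sits with the case or with the control.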
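A small sketch of why concordant pairs carry no information. It uses the pair-level conditional probability e^(βx_case) / (e^(βx_case) + e^(βx_control)), with exposure coded as 1 or 0. The function name and the β values are made up for illustration.

```python
import math

def pair_probability(beta, x_case, x_control):
    """Conditional probability of the observed pair, given exactly one case in it."""
    num = math.exp(beta * x_case)
    return num / (num + math.exp(beta * x_control))

betas = [-1.0, 0.0, 0.5, 2.0]  # arbitrary trial values

# Concordant pair (both exposed): the contribution is 1/2 whatever beta is.
concordant = [pair_probability(b, 1, 1) for b in betas]

# Discordant pair (case exposed, control not): the contribution moves with beta.
discordant = [pair_probability(b, 1, 0) for b in betas]

print(concordant)
print(discordant)
```

A term that never changes with β cannot shift the estimate, which is why concordant pairs drop out of the analysis.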

Numerical Example

Suppose:

Pair	Case Exposed	Control Exposed
1	Yes	No
2	Yes	No
3	No	Yes

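Estimate:

Count only the discordant pairs: 2 with the case exposed, 1 with the control exposed.

Odds ratio:

\text{OR}=\frac{2}{1}=2

Interpretation:

Exposure is estimated to double the odds, though three pairs are far too few for a reliable estimate.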
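The same arithmetic as a minimal sketch. The variable names are illustrative, and the log-likelihood function is the standard conditional one for binary exposure in 1:1 matched pairs, included only to check that the ratio of discordant counts is where it peaks.

```python
import math

# The three pairs from the table: (case_exposed, control_exposed)
pairs = [(True, False), (True, False), (False, True)]

# Count the two kinds of discordant pair; concordant pairs would be skipped.
case_only = sum(1 for case, control in pairs if case and not control)
control_only = sum(1 for case, control in pairs if control and not case)

# Ratio of discordant counts (undefined if control_only is zero).
odds_ratio = case_only / control_only
beta_hat = math.log(odds_ratio)

def conditional_loglik(beta):
    # A pair with the case exposed contributes log(e^b / (1 + e^b));
    # a pair with the control exposed contributes log(1 / (1 + e^b)).
    n = case_only + control_only
    return case_only * beta - n * math.log1p(math.exp(beta))

print(odds_ratio)
print(beta_hat)
```

Evaluating `conditional_loglik` slightly to either side of `beta_hat` gives a lower value, which is a quick way to convince yourself the ratio is the maximizer.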

Conditional Likelihood Formula

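Suppose a pair has exposure values:

x_1,x_2

where x_1 belongs to the case and x_2 to the control.

Conditional probability that the member with x_1 is the case, given that the pair contains exactly one case:

P(\text{case has } x_1 \mid \text{one case in the pair}) = \frac{e^{\beta x_1}} {e^{\beta x_1}+e^{\beta x_2}}

Notice: baseline disappeared.

Only exposure remains.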
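A quick numerical check, as a sketch. It computes the conditional probability twice: once from the full logistic model with a pair baseline, and once from the formula without it. The function names and the values of `alpha` and `beta` are arbitrary choices for illustration.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def conditional_from_full_model(alpha, beta, x1, x2):
    """P(member 1 is the case | exactly one case), using the baseline alpha."""
    p1 = logistic(alpha + beta * x1)
    p2 = logistic(alpha + beta * x2)
    one_is_case = p1 * (1 - p2)   # member 1 has the outcome, member 2 does not
    two_is_case = (1 - p1) * p2   # the reverse
    return one_is_case / (one_is_case + two_is_case)

def conditional_formula(beta, x1, x2):
    """The same probability from the conditional likelihood formula (no alpha)."""
    return math.exp(beta * x1) / (math.exp(beta * x1) + math.exp(beta * x2))

beta, x1, x2 = 0.7, 1.0, 0.0
for alpha in (-3.0, 0.0, 2.5):
    print(alpha,
          conditional_from_full_model(alpha, beta, x1, x2),
          conditional_formula(beta, x1, x2))
```

Whatever baseline is plugged in, the two columns agree up to floating-point rounding.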

Why This Is Beautiful

Because we never estimated:

intercept,
pair risk,
hidden baseline.

Statistics removed them.

Marginal vs Conditional Likelihood

This confuses a lot of people.

Marginal

Average across everyone.

Integrate nuisance away.

Conditional

Condition on a statistic that absorbs the nuisance, such as the number of cases in each pair.

Cancel nuisance.

Example:

Retailers:

Marginal:
overall average effect.

Conditional:
within-retailer effect.
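In symbols, a rough sketch of the contrast. Here $y_i$ is the data for matched set $i$, $g$ is an assumed distribution for the baselines (the text does not specify one), and $s_i$ is a statistic sufficient for $\alpha_i$, such as the number of cases in the set. All of this notation is introduced here for illustration.

```latex
L_{\text{marginal}}(\beta)
  = \prod_i \int P(y_i \mid \alpha_i, \beta)\, g(\alpha_i)\, d\alpha_i

L_{\text{conditional}}(\beta)
  = \prod_i P(y_i \mid s_i, \beta)
```

The marginal version needs a model for the baselines. The conditional version needs none, because $\alpha_i$ no longer appears once $s_i$ is held fixed.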

Hypergeometric Connection

This often appears suddenly.

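Why? Because after conditioning on the row and column totals of a 2×2 table: counts become fixed.

Probability becomes: sampling without replacement.

That creates: hypergeometric distributions (exactly so when exposure has no effect, which is the basis of Fisher's exact test).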
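A minimal sketch of that distribution for a 2×2 exposure-by-disease table with all margins fixed, assuming exposure has no effect. The margins below are hypothetical numbers chosen for illustration.

```python
from math import comb

def hypergeom_pmf(k, exposed_total, unexposed_total, cases_total):
    """P(k of the cases are exposed) with fixed margins and no exposure effect."""
    ways_exposed = comb(exposed_total, k)                   # choose the exposed cases
    ways_unexposed = comb(unexposed_total, cases_total - k) # choose the unexposed cases
    return ways_exposed * ways_unexposed / comb(exposed_total + unexposed_total, cases_total)

# Hypothetical margins: 5 exposed, 5 unexposed, 4 cases in total.
exposed_total, unexposed_total, cases_total = 5, 5, 4

probs = {k: hypergeom_pmf(k, exposed_total, unexposed_total, cases_total)
         for k in range(cases_total + 1)}

for k, p in probs.items():
    print(k, round(p, 4))
```

Choosing which subjects become cases out of a fixed pool is exactly drawing without replacement, which is why the counts follow this distribution.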

Real Business Example

Question:

Does deployment improve sales?

Retailers differ massively.

Match retailers by:

size,
region,
customer count.

Then compare: higher deployment vs lower deployment. Conditional analysis removes retailer baseline.

Very powerful.
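A sketch of how such a matched analysis could be fitted by hand. It assumes each pair holds one retailer that made the sale and one matched retailer that did not, with deployment as a numeric level. The data are invented for illustration, and the Newton iteration is one simple way to maximize the conditional log-likelihood, not the only one.

```python
import math

# Hypothetical matched pairs: (deployment at the retailer that sold,
# deployment at its matched retailer that did not). Illustrative numbers only.
pairs = [(8, 5), (6, 4), (3, 6), (9, 7), (5, 5), (7, 3), (4, 6)]
diffs = [sold - not_sold for sold, not_sold in pairs]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def score(beta, diffs):
    """First derivative of the conditional log-likelihood sum(log sigmoid(beta * d))."""
    return sum(d * (1.0 - sigmoid(beta * d)) for d in diffs)

def information(beta, diffs):
    """Negative second derivative of the conditional log-likelihood."""
    return sum(d * d * sigmoid(beta * d) * (1.0 - sigmoid(beta * d)) for d in diffs)

def fit_beta(diffs, steps=25):
    # Needs differences of both signs; otherwise the estimate runs off to infinity.
    beta = 0.0
    for _ in range(steps):
        beta += score(beta, diffs) / information(beta, diffs)  # Newton step
    return beta

beta_hat = fit_beta(diffs)
print(beta_hat, math.exp(beta_hat))  # log odds ratio and odds ratio per unit of deployment
```

No retailer baseline appears anywhere in the code: only within-pair differences in deployment enter the fit. The pair with equal deployment contributes nothing, just like a concordant pair.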

Why This Chapter Feels Hard

Because for the first time: statistics stops estimating everything.

Instead it says: some information is unnecessary.

That feels strange initially. But it is powerful.

Inventory Example

Suppose:

You want to know: Does replenishment improve sales?

Different categories behave differently. Match categories.

Condition away category baseline.

Estimate only replenishment effect.

Chapter 7’s Big Lesson

This chapter teaches: not every parameter deserves estimation.

Some factors should be removed.

And conditioning gives cleaner inference.

Final Thought

Before Chapter 7: you ask:

“How do I estimate everything?”

After Chapter 7: you ask:

“What can I safely eliminate?”

That shift changes how advanced statistical models work.

Conditional likelihood becomes the bridge into:

mixed models,
Bayesian methods,
Cox models,
survival analysis,
modern inference.
