Chapter 11 — Survival Analysis and the Cox Proportional Hazards Model: Modeling Time Until Something Happens

In Chapter 11, statistics asks a completely different question.

Until now we mostly modeled:

  • counts,
  • probabilities,
  • continuous outcomes.

But many business and scientific questions are actually about: time.

Questions like:

  • How long until a diamond sells?
  • How long until a customer churns?
  • How long until inventory becomes aged?
  • How long until equipment fails?
  • How long until a patient recovers?

Notice:

The outcome is not:

“How many?”

or

“Yes or No?”

The outcome is:

When?

This chapter introduces: Survival Analysis

and one of the most famous models in statistics: Cox Proportional Hazards Model.


What Makes Survival Data Different?

Suppose:

You deploy 100 diamonds.

After 180 days:

  • 30 sold,
  • 70 still deployed.

Question:

What do you do with the remaining 70?

You cannot pretend:

sale time = 180.

That would be wrong.

Those diamonds simply have: unknown future sale times.

This leads to: censoring.


Censoring — The Core Survival Concept

Censoring means: we know only that the event had not happened by the time we stopped observing.

Example:

Diamond   Days Out   Sold
A         40         Yes
B         120        Yes
C         300        No

Diamond C:

not sold yet.

True sale time unknown.

This is: right censoring.


Why Ordinary Regression Fails

Suppose:

Average sale time of the diamonds that sold:

100 days.

Problem:

Unsold inventory is ignored.

The average is biased downward, because the slowest diamonds are exactly the ones left out.

Survival analysis naturally incorporates:

  • sold,
  • unsold.
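
A minimal simulation sketch of that bias. Everything in it is an assumption made for illustration: exponential sale times with a true mean of 300 days, a 180-day observation window, and the helper name `simulate_sale_times`. It compares the naive average over sold diamonds with a censoring-aware estimate (total days at risk divided by the number of sales, which is the natural estimate only when sale times really are exponential).

```python
import random

def simulate_sale_times(n, mean_days, cutoff, seed=0):
    """Draw exponential sale times and right-censor them at `cutoff` days."""
    rng = random.Random(seed)
    observed, sold = [], []
    for _ in range(n):
        t = rng.expovariate(1.0 / mean_days)   # true (possibly unseen) sale time
        observed.append(min(t, cutoff))        # what the aging sheet would record
        sold.append(t <= cutoff)               # False means still unsold (censored)
    return observed, sold

observed, sold = simulate_sale_times(n=10_000, mean_days=300, cutoff=180)

# Naive approach: average only the diamonds that sold.
sold_times = [t for t, s in zip(observed, sold) if s]
naive_mean = sum(sold_times) / len(sold_times)

# Censoring-aware estimate for exponential data:
# total days at risk divided by the number of observed sales.
aware_mean = sum(observed) / sum(sold)

print(f"naive mean of sold diamonds: {naive_mean:.1f} days")
print(f"censoring-aware estimate:    {aware_mean:.1f} days")
```

The naive mean can never exceed the 180-day window, no matter how slow the inventory really is, while the censoring-aware estimate should land near the true 300 days.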

Survival Function

The first object is:

S(t)

Meaning:

Probability event has NOT happened by time t.

Formula:

S(t) = P(T > t)

Example:

S(180) = 0.70

Interpretation:

70% still unsold after 180 days.


Hazard Function — The Difficult Concept

Students usually struggle here.

Hazard is NOT probability.

Hazard means:

instantaneous risk of event.

Formula:

h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}

Translation:

Given survival until now,

how quickly is the event happening?


Example — Inventory

Suppose:

Day 300.

Inventory survived.

Hazard asks:

How fast are sales arriving right now, among diamonds still unsold?

Not cumulative.

Instantaneous.
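
One way to see that hazard is a rate rather than a probability is to approximate the limit with a small time step. The sketch below assumes an exponential survival curve with a made-up rate of 0.004 per day; the function names are illustrative only.

```python
import math

def survival(t, rate=0.004):
    """Assumed exponential survival curve S(t) = exp(-rate * t), rate per day."""
    return math.exp(-rate * t)

def approx_hazard(t, dt=1e-4, surv=survival):
    """Finite-difference version of the limit definition of h(t)."""
    # P(t <= T < t + dt | T >= t) = (S(t) - S(t + dt)) / S(t)
    cond_prob = (surv(t) - surv(t + dt)) / surv(t)
    return cond_prob / dt

h_day_300 = approx_hazard(300.0)    # hazard in sales per diamond-day
h_year_300 = h_day_300 * 365.0      # the same hazard in sales per diamond-year

print(f"hazard at day 300: {h_day_300:.4f} per day")
print(f"hazard at day 300: {h_year_300:.2f} per year")
```

The per-year figure is greater than 1, which a probability could never be. Changing the time unit rescales a hazard but leaves the underlying risk unchanged.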


Relationship Between Survival and Hazard

These are connected.

S(t) = \exp\left(-\int_0^t h(u)\,du\right)

Interpretation:

Hazard accumulates.

Higher hazard:

faster event.

Lower survival.
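
A short derivation sketch of that relationship, assuming T is continuous with density f(t) (the symbol f is introduced here and does not appear elsewhere in the chapter):

```latex
\begin{aligned}
h(t) &= \frac{f(t)}{S(t)} = -\frac{d}{dt}\log S(t), \\
\int_0^t h(u)\,du &= -\log S(t) + \log S(0) = -\log S(t)
  \qquad \text{since } S(0) = 1, \\
S(t) &= \exp\left(-\int_0^t h(u)\,du\right).
\end{aligned}
```

As a special case, a constant hazard h(u) = \lambda gives S(t) = e^{-\lambda t}, the exponential survival curve.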


Kaplan–Meier Estimator

Before regression:

estimate survival.

Formula:

\hat{S}(t) = \prod_{i:\,t_i \le t}\left(1 - \frac{d_i}{n_i}\right)

where:

  • d_i = number of events at time t_i,
  • n_i = number at risk just before t_i.

Example

Time   Sold   At Risk
100    10     100
200    10     90

Survival:

At 100:

0.90

At 200:

0.90 \times 0.889 \approx 0.80

Interpretation:

80% remain unsold after day 200.
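
The same arithmetic as a minimal pure-Python sketch. The function name `kaplan_meier` and the day-250 cutoff for the 80 unsold diamonds are assumptions; the example only specifies the two rows of sales.

```python
from collections import Counter

def kaplan_meier(durations, sold):
    """Return [(time, S_hat)] at each distinct sale time.

    durations: days out for each diamond
    sold:      True if the diamond sold at that time, False if censored
    """
    events = Counter(t for t, s in zip(durations, sold) if s)
    curve, s_hat = [], 1.0
    for t in sorted(events):
        n_at_risk = sum(1 for d in durations if d >= t)  # still unsold just before t
        s_hat *= 1.0 - events[t] / n_at_risk
        curve.append((t, s_hat))
    return curve

# 10 sales at day 100, 10 at day 200, and 80 diamonds still unsold at day 250.
# The day-250 cutoff is an assumption made to complete the data set.
durations = [100] * 10 + [200] * 10 + [250] * 80
sold = [True] * 20 + [False] * 80

curve = kaplan_meier(durations, sold)
for t, s in curve:
    print(t, round(s, 2))
```

The censored diamonds never trigger a drop in the curve, but they do count in the at-risk denominator for as long as they are observed. That is how unsold inventory contributes information.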


Comparing Survival Curves

Suppose:

Rounds vs Ovals.

Question:

Which sells faster?

Use: Log-rank test.

Hypothesis:

H_0: S_1 = S_2
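
A sketch of the two-group log-rank statistic in pure Python, using the standard observed-minus-expected form with a hypergeometric variance. The days-out values for Rounds and Ovals are invented for illustration, and the function name is hypothetical.

```python
import math

def logrank_test(times_a, sold_a, times_b, sold_b):
    """Two-group log-rank test; returns (chi_square, p_value) with 1 df."""
    data = [(t, s, 0) for t, s in zip(times_a, sold_a)]
    data += [(t, s, 1) for t, s in zip(times_b, sold_b)]
    event_times = sorted({t for t, s, _ in data if s})

    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, s, _ in data if u == t and s)          # sales at time t
        d_a = sum(1 for u, s, g in data if u == t and s and g == 0)
        obs_minus_exp += d_a - d * n_a / n                      # observed - expected in A
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    if variance == 0.0:
        return 0.0, 1.0
    chi_square = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2.0))  # chi-square(1) upper tail
    return chi_square, p_value

# Hypothetical days-out data; False marks a diamond that is still unsold.
rounds_days = [30, 45, 60, 80, 120, 180]
rounds_sold = [True, True, True, True, True, False]
ovals_days = [70, 110, 150, 180, 180, 180]
ovals_sold = [True, True, True, False, False, False]

chi_square, p_value = logrank_test(rounds_days, rounds_sold, ovals_days, ovals_sold)
print(f"chi-square = {chi_square:.2f}, p = {p_value:.3f}")
```

A small p-value is evidence against H_0, that is, evidence that the two shapes do not share the same survival curve. With only six diamonds per group the test has little power, so this is a mechanics demo rather than a realistic analysis.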

Enter Cox Regression

Now we introduce predictors.

Question: How does customer count affect sale speed?

Model:

h(t \mid X) = h_0(t)\,e^{X\beta}

This is: Cox Proportional Hazards Model


Breaking the Equation Down

Baseline hazard:

h_0(t)

The hazard over time for a reference unit.

All predictors at zero.


Predictor multiplier:

e^{X\beta}

Scales the baseline hazard up or down.


Why Cox Became Famous

Because: the baseline hazard is never specified.

No assumption about its shape.

Huge innovation.


Example

Suppose:

\beta = 0.5

Hazard ratio:

e^{0.5} \approx 1.65

Interpretation:

At every moment, the hazard is about 65% higher for each one-unit increase in X.


Hazard Ratio

This is everything in Cox.

Formula:

HR = e^{\beta}

Interpretation:


HR = 1

No effect.


HR > 1

Faster event.


HR < 1

Slower event.
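
Why e^{\beta} is the hazard ratio: compare two units whose predictor differs by one unit (x is just a placeholder value here). The baseline hazard cancels, so the ratio does not depend on t.

```latex
\frac{h(t \mid X = x + 1)}{h(t \mid X = x)}
  = \frac{h_0(t)\,e^{(x+1)\beta}}{h_0(t)\,e^{x\beta}}
  = e^{\beta}
```

For a difference of k units the same cancellation gives e^{k\beta}, so hazard ratios multiply rather than add.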


Example — Diamond Business

Suppose:

Predictor:

customer count.

Model:

\beta = 0.3

Hazard ratio:

e^{0.3} \approx 1.35

Interpretation:

Each additional customer raises the hazard of sale by about 35%, other predictors held fixed.
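
A small sketch of what that hazard ratio implies for the share still unsold. It relies on the proportional hazards identity S(t | X) = S_0(t)^{HR}; the 70% baseline at day 180 and the helper names are assumptions for illustration.

```python
import math

beta = 0.3                          # coefficient from the example above
hr_per_customer = math.exp(beta)    # about 1.35

def hazard_ratio(delta_customers, beta=beta):
    """Hazard ratio for a difference of `delta_customers` in customer count."""
    return math.exp(beta * delta_customers)

def survival_given_hr(baseline_survival, hr):
    """Under proportional hazards, S(t | X) = S_0(t) ** HR."""
    return baseline_survival ** hr

# Assumed baseline: 70% of reference diamonds are still unsold at day 180.
s0_180 = 0.70
s1_180 = survival_given_hr(s0_180, hazard_ratio(1))

print(f"HR per extra customer:       {hr_per_customer:.2f}")
print(f"HR for three extra customers: {hazard_ratio(3):.2f}")
print(f"Unsold at day 180 with one extra customer: {s1_180:.3f}")
```

A 35% higher hazard does not mean the diamond sells in 35% less time. The effect on time to sale depends on the shape of the baseline survival curve.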


Proportional Hazards Assumption

The big assumption. Hazard ratio remains constant over time.

Meaning:

if Round has twice the sale hazard of Oval today, it has twice the sale hazard tomorrow.

Parallel log-hazard curves.
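
One common informal check uses the transform log(-log S(t)): under proportional hazards, the curves for two groups differ by a constant gap equal to the log hazard ratio. The sketch below demonstrates this on a made-up Weibull baseline (scale 200, shape 1.5) with Round assumed to have twice the hazard of Oval.

```python
import math

def baseline_cum_hazard(t, scale=200.0, shape=1.5):
    """Hypothetical Weibull cumulative hazard H_0(t) = (t / scale) ** shape."""
    return (t / scale) ** shape

def survival(t, log_hr):
    """Survival under proportional hazards: S(t) = exp(-H_0(t) * exp(log_hr))."""
    return math.exp(-baseline_cum_hazard(t) * math.exp(log_hr))

def cloglog(s):
    """Complementary log-log transform, log(-log S)."""
    return math.log(-math.log(s))

log_hr_round = math.log(2.0)   # assumed: Round has twice the Oval hazard
gaps = []
for day in (30, 90, 180, 365):
    gap = cloglog(survival(day, log_hr_round)) - cloglog(survival(day, 0.0))
    gaps.append(gap)
    print(day, round(gap, 4))
```

Every gap comes out at log 2, roughly 0.693, regardless of the day. With real data the same transform would be applied to each group's Kaplan–Meier curve, and curves that cross or drift apart suggest the assumption is violated.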


Violations

Sometimes effects change.

Examples:

  • promotion effect fades,
  • seasonality changes.

Need:

time-varying effects.


Partial Likelihood — The Genius Part

Cox never estimates:

h_0(t)

Instead:

conditions it away.

Partial likelihood:

L(\beta) = \prod_i \frac{e^{X_i\beta}}{\sum_{j \in R_i} e^{X_j\beta}}

Notice:

Baseline hazard disappeared.

Very similar philosophy to Chapter 7.


Numerical Example

Risk set:

Diamond   Customer Count
A         2
B         4
C         6

A sells.

Likelihood:

\frac{e^{2\beta}}{e^{2\beta} + e^{4\beta} + e^{6\beta}}

Multiply across events.

Estimate:

\beta
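
A sketch of that procedure for a single covariate, using Newton-Raphson on the log partial likelihood. It assumes no tied sale times, and the six-diamond data set and all function names are hypothetical. Real software handles ties, multiple covariates, and standard errors.

```python
import math

def risk_set_contribution(beta, x_event, x_risk_set):
    """One factor of the partial likelihood: the unit that sold vs. its risk set."""
    return math.exp(beta * x_event) / sum(math.exp(beta * x) for x in x_risk_set)

def score_and_information(beta, times, sold, x):
    """Derivative of the log partial likelihood, and minus its second derivative."""
    score, info = 0.0, 0.0
    for i, (t_i, sold_i) in enumerate(zip(times, sold)):
        if not sold_i:
            continue                                  # censored rows add no factor
        risk = [x[j] for j, t_j in enumerate(times) if t_j >= t_i]
        w = [math.exp(beta * xj) for xj in risk]
        total = sum(w)
        mean = sum(wj * xj for wj, xj in zip(w, risk)) / total
        mean_sq = sum(wj * xj * xj for wj, xj in zip(w, risk)) / total
        score += x[i] - mean                          # observed minus weighted average
        info += mean_sq - mean * mean                 # weighted variance in the risk set
    return score, info

def fit_cox_one_covariate(times, sold, x, tol=1e-10, max_iter=50):
    """Newton-Raphson on the log partial likelihood (assumes no tied sale times)."""
    beta = 0.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, times, sold, x)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# The single factor from the example: A (count 2) sells from risk set {2, 4, 6}.
print(risk_set_contribution(0.0, 2, [2, 4, 6]))   # 1/3 when beta = 0

# Hypothetical six-diamond data set; the last diamond is still unsold.
days_out = [40, 75, 120, 150, 210, 300]
sold = [True, True, True, True, True, False]
customer_count = [6, 2, 5, 4, 1, 2]
beta_hat = fit_cox_one_covariate(days_out, sold, customer_count)
print(f"beta_hat = {beta_hat:.3f}, hazard ratio = {math.exp(beta_hat):.2f}")
```

A single factor cannot pin down \beta on its own. In the three-diamond example the lowest-count diamond sold first, so that one factor keeps growing as \beta goes to minus infinity. A finite estimate appears only once several events with mixed orderings are multiplied together.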

Time-Varying Covariates

Customer count changes.

Inventory changes.

Model becomes:

h(t \mid X(t))

Very practical.
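
In practice, time-varying covariates are usually fed to software in a start-stop layout: each unit is split into intervals over which its covariates are constant. Below is a minimal sketch of that reshaping step; the function name and the example diamond are hypothetical.

```python
def to_start_stop_rows(days_out, sold, changes):
    """Split one diamond's history into (start, stop, customer_count, event) rows.

    changes: list of (day, customer_count) pairs, sorted by day, starting at day 0.
    The event flag is True only on the final interval, and only if the diamond sold.
    """
    rows = []
    for k, (start, count) in enumerate(changes):
        if start >= days_out:
            break                                     # change happened after exit
        next_change = changes[k + 1][0] if k + 1 < len(changes) else days_out
        stop = min(next_change, days_out)
        rows.append((start, stop, count, sold and stop == days_out))
    return rows

# Hypothetical diamond: 2 customers at first, 5 from day 60, sold on day 150.
rows = to_start_stop_rows(150, True, [(0, 2), (60, 5)])
for row in rows:
    print(row)
```

Each interval enters the risk set only between its start and stop days, carrying the customer count that applied during that window.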


Frailty Models

Random effects in survival.

Model:

h(t) = u\,h_0(t)\,e^{X\beta}

Frailty u:

hidden risk, an unobserved multiplier on the hazard.

Equivalent spirit to GLMM.
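
A simulation sketch of what frailty does at the population level. It assumes a gamma-distributed frailty with mean 1 and variance theta, a constant baseline hazard, and no covariates. All of these are illustrative choices, as are the function names. Under those assumptions the population survival curve has the closed form (1 + theta * rate * t)^(-1/theta), which the simulation should approximately reproduce.

```python
import math
import random

def simulate_frailty_survival(t, n, base_rate, theta, seed=0):
    """Share of units still event-free at time t under gamma frailty.

    Each unit draws u ~ Gamma(shape=1/theta, scale=theta), so E[u] = 1 and
    Var[u] = theta, and then has a constant hazard of u * base_rate.
    """
    rng = random.Random(seed)
    surviving = 0
    for _ in range(n):
        u = rng.gammavariate(1.0 / theta, theta)
        event_time = rng.expovariate(u * base_rate)
        surviving += event_time > t
    return surviving / n

def marginal_survival(t, base_rate, theta):
    """Population-level survival after averaging over the gamma frailty."""
    return (1.0 + theta * base_rate * t) ** (-1.0 / theta)

simulated = simulate_frailty_survival(t=180, n=20_000, base_rate=0.004, theta=0.5)
expected = marginal_survival(t=180, base_rate=0.004, theta=0.5)
no_frailty = math.exp(-0.004 * 180)

print(f"simulated share unsold at day 180:  {simulated:.3f}")
print(f"closed-form share with frailty:     {expected:.3f}")
print(f"share if every unit had u = 1:      {no_frailty:.3f}")
```

Even though the average frailty is 1, more units remain at day 180 than in the no-frailty case. The high-risk units leave early, so the survivors are increasingly the low-risk ones.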


Real Dialog Application

Your dataset:

Aging sheet.

Variables:

  • Days Out
  • Shape
  • Weight
  • Customer Count

Event:

SOLD

Censoring:

blank Result.

Question:

Which factors speed up sales?

Perfect Cox use case.
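
A sketch of the first data-preparation step: turning aging-sheet rows into a duration plus an event flag. The column names follow the variables listed above, and the three rows of values are invented. The exact layout of a real sheet is an assumption.

```python
import csv
import io

# Hypothetical rows; "Result" is "SOLD" or blank, as described above.
AGING_SHEET = """Days Out,Shape,Weight,Customer Count,Result
40,Round,1.05,6,SOLD
120,Oval,1.10,3,SOLD
300,Round,1.02,1,
"""

def load_survival_rows(text):
    """Turn aging-sheet rows into dicts with a duration and an event flag."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        rows.append({
            "duration": float(rec["Days Out"]),
            "event": rec["Result"].strip().upper() == "SOLD",  # blank means censored
            "shape": rec["Shape"],
            "weight": float(rec["Weight"]),
            "customer_count": int(rec["Customer Count"]),
        })
    return rows

rows = load_survival_rows(AGING_SHEET)
for row in rows:
    print(row["duration"], row["event"], row["shape"], row["customer_count"])
```

The key design choice is that unsold rows are kept, with `event` set to False, rather than dropped. Their days out still tell the model how long those diamonds have gone without selling.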


Inventory Example

Question:

How long until:

LGRD 1.0–1.19 sells?

Not:

how many sell.

This becomes survival.


Why Chapter 11 Matters

Before this chapter:

time was ignored.

After this chapter:

time becomes the outcome.

Huge conceptual shift.


The Deep Lesson

Chapter 11 teaches:

timing matters as much as occurrence.

And incomplete information still contains signal.


Final Thought

Before Chapter 11:

you ask:

“Will it happen?”

After Chapter 11:

you ask:

“When will it happen?”

That question leads into:

  • survival analysis,
  • reliability,
  • customer lifetime value,
  • inventory aging,
  • hazard modeling,
  • modern event forecasting.
