By Chapter 11, statistics asks a completely different question.
Until now we mostly modeled:
- counts,
- probabilities,
- continuous outcomes.
But many business and scientific questions are actually about: time.
Questions like:
- How long until a diamond sells?
- How long until a customer churns?
- How long until inventory becomes aged?
- How long until equipment fails?
- How long until a patient recovers?
Notice:
The outcome is not:
“How many?”
or
“Yes or No?”
The outcome is:
When?
This chapter introduces: Survival Analysis
and one of the most famous models in statistics: the Cox Proportional Hazards Model.
What Makes Survival Data Different?
Suppose:
You deploy 100 diamonds.
After 180 days:
- 30 sold,
- 70 still deployed.
Question:
What do you do with the remaining 70?
You cannot pretend:
sale time = 180.
That would be wrong.
Those diamonds simply have: unknown sale times, each longer than 180 days.
This leads to: censoring.
Censoring — The Core Survival Concept
Censoring means: we know the event had not happened by the last time we looked, but not when it will.
Example:
| Diamond | Days Out | Sold |
|---|---|---|
| A | 40 | Yes |
| B | 120 | Yes |
| C | 300 | No |
Diamond C:
not sold yet.
True sale time unknown.
This is: right censoring.
Why Ordinary Regression Fails
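Suppose:
You average the sale times of sold diamonds only:
100 days.
Problem:
Unsold inventory is ignored.
The average is biased downward, because the slowest diamonds are exactly the ones left out.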
Survival analysis naturally incorporates:
- sold,
- unsold.
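A minimal sketch of the bias, using the three diamonds from the table above. The variable names are illustrative assumptions. It compares two naive averages with the (duration, event) encoding that survival methods work with.

```python
# Each record is (diamond, days_out, sold), taken from the small table above.
records = [
    ("A", 40, True),
    ("B", 120, True),
    ("C", 300, False),  # right-censored: still unsold at day 300
]

# Naive approach 1: average only the diamonds that sold.
sold_days = [days for _, days, sold in records if sold]
mean_sold_only = sum(sold_days) / len(sold_days)

# Naive approach 2: pretend censored diamonds sold on their last observed day.
mean_pretend = sum(days for _, days, _ in records) / len(records)

# What survival methods use instead: a duration plus an event indicator.
durations = [days for _, days, _ in records]
events = [1 if sold else 0 for _, _, sold in records]

print(mean_sold_only)          # 80.0
print(round(mean_pretend, 1))  # 153.3
print(durations, events)       # [40, 120, 300] [1, 1, 0]
```

Both averages understate the true mean, because diamond C's real sale time is somewhere beyond 300 days.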
Survival Function
The first object is:
$$S(t)$$
Meaning:
Probability the event has NOT happened by time t.
Formula:
$$S(t) = P(T > t)$$
Example:
$$S(180) = 0.70$$
Interpretation:
70% still unsold after 180 days.
Hazard Function — The Difficult Concept
Students usually struggle here.
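Hazard is NOT probability.
It is a rate, so it can exceed 1.
Hazard means:
instantaneous risk of event.
Formula:
$$h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}$$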
Translation:
Given survival until now,
how quickly is the event happening?
Example — Inventory
Suppose:
Day 300.
Inventory survived.
Hazard asks:
How likely is a sale right now, given no sale so far?
Not cumulative.
Instantaneous.
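One rough way to make the idea concrete is a discrete approximation: sales in a short window, divided by the number still at risk and by the window length. The function name and the counts below are hypothetical, chosen only for illustration.

```python
def interval_hazard(events, at_risk, interval_days):
    """Approximate hazard: events per unit still at risk, per day."""
    if at_risk <= 0 or interval_days <= 0:
        raise ValueError("at_risk and interval_days must be positive")
    return events / at_risk / interval_days

# Hypothetical: 50 diamonds are still unsold at day 300,
# and 2 of them sell during the next 10 days.
h = interval_hazard(events=2, at_risk=50, interval_days=10)
print(round(h, 4))  # 0.004 sales per diamond per day
```

The diamonds that sold before day 300 do not appear anywhere in the calculation. That is the "given survival until now" part.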
Relationship Between Survival and Hazard
These are connected:
$$S(t) = \exp\left(-\int_0^t h(u)\,du\right)$$
Interpretation:
Hazard accumulates.
Higher hazard:
faster event.
Lower survival.
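A small numerical sketch of "hazard accumulates": integrate the hazard up to time t, then exponentiate the negative. The constant hazard is an assumption, picked so that survival at day 180 equals the 70% from the earlier example.

```python
import math

def survival_from_hazard(hazard, t, steps=10_000):
    """S(t) = exp(-integral of hazard from 0 to t), via the trapezoid rule."""
    dt = t / steps
    cumulative = 0.0
    for k in range(steps):
        a, b = k * dt, (k + 1) * dt
        cumulative += 0.5 * (hazard(a) + hazard(b)) * dt
    return math.exp(-cumulative)

# Assumed constant daily hazard, chosen so that S(180) = 0.70.
rate = -math.log(0.70) / 180

low = survival_from_hazard(lambda u: rate, 180)
high = survival_from_hazard(lambda u: 2 * rate, 180)  # hazard doubled

print(round(low, 2))   # 0.7
print(round(high, 2))  # 0.49
```

Doubling the hazard squares the survival probability: 0.70 becomes 0.49.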
Kaplan–Meier Estimator
Before regression:
estimate survival.
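Formula:
$$\hat{S}(t) = \prod_{t_i \le t} \left(1 - \frac{d_i}{n_i}\right)$$
where:
- $d_i$ = events at time $t_i$,
- $n_i$ = number at risk just before $t_i$.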
Example
| Time | Sold | At Risk |
|---|---|---|
| 100 | 10 | 100 |
| 200 | 10 | 90 |
Survival:
At 100:
$$1 - \frac{10}{100} = 0.90$$
At 200:
$$0.90 \times \left(1 - \frac{10}{90}\right) = 0.90 \times 0.889 = 0.80$$
Interpretation:
80% remain.
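The same arithmetic as a short sketch. The function is a plain-Python illustration, not a library API. To rebuild the table, it assumes the 80 diamonds that never sold were last seen on day 250, which is an invented censoring time.

```python
def kaplan_meier(durations, events):
    """Return [(time, survival)] at each distinct event time."""
    event_times = sorted({t for t, e in zip(durations, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # n_i: everything not yet sold or censored just before time t
        at_risk = sum(1 for d in durations if d >= t)
        # d_i: sales exactly at time t
        sold = sum(1 for d, e in zip(durations, events) if d == t and e)
        survival *= 1 - sold / at_risk
        curve.append((t, survival))
    return curve

# 10 sales at day 100, 10 sales at day 200,
# 80 censored at day 250 (assumed censoring time).
durations = [100] * 10 + [200] * 10 + [250] * 80
events = [1] * 10 + [1] * 10 + [0] * 80

curve = kaplan_meier(durations, events)
for t, s in curve:
    print(t, round(s, 2))
# 100 0.9
# 200 0.8
```

Censored diamonds never count as events, but they stay in the risk set until they drop out of observation.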
Comparing Survival Curves
Suppose:
Rounds vs Ovals.
Question:
Which sells faster?
Use: Log-rank test.
Hypothesis:
$$H_0: S_{\text{Round}}(t) = S_{\text{Oval}}(t) \text{ for all } t$$
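A sketch of the two-group log-rank test in plain Python. At each event time it compares observed sales in one group with the number expected if both groups shared the same survival curve. The sale times are hypothetical, and the p-value uses the chi-square approximation with 1 degree of freedom, which is rough for samples this small.

```python
import math

def logrank_test(durations_a, events_a, durations_b, events_b):
    """Two-group log-rank test. Returns (chi_square, p_value), 1 df."""
    data = [(t, e, 0) for t, e in zip(durations_a, events_a)]
    data += [(t, e, 1) for t, e in zip(durations_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n_a = sum(1 for d, _, g in data if d >= t and g == 0)
        n_b = sum(1 for d, _, g in data if d >= t and g == 1)
        d_a = sum(1 for d, e, g in data if d == t and e and g == 0)
        d_b = sum(1 for d, e, g in data if d == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n          # expected sales in group A under H0
        if n > 1:                          # hypergeometric variance
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        raise ValueError("not enough information to compare the groups")
    chi_square = (observed_a - expected_a) ** 2 / variance
    p_value = math.erfc(math.sqrt(chi_square / 2))  # chi-square tail, 1 df
    return chi_square, p_value

# Hypothetical days-out and sold flags for two shapes.
round_days, round_sold = [30, 45, 60, 90, 120], [1, 1, 1, 1, 0]
oval_days, oval_sold = [80, 150, 200, 240, 300], [1, 1, 0, 1, 0]

chi_square, p_value = logrank_test(round_days, round_sold, oval_days, oval_sold)
print(round(chi_square, 3), round(p_value, 4))
```

A small p-value suggests the two curves differ. It does not say by how much, which is where Cox regression comes in.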
Enter Cox Regression
Now we introduce predictors.
Question: How does customer count affect sale speed?
Model:
$$h(t \mid x) = h_0(t)\,\exp(\beta x)$$
This is: Cox Proportional Hazards Model
Breaking the Equation Down
Baseline hazard:
$$h_0(t)$$
Natural hazard.
All predictors at zero.
Predictor multiplier:
$$\exp(\beta x)$$
Scales the hazard up or down.
Why Cox Became Famous
Because: the baseline hazard is never specified.
No assumption about its shape.
Huge innovation.
Example
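Suppose:
$$\beta = 0.5$$
Hazard ratio:
$$e^{0.5} \approx 1.65$$
Interpretation:
The hazard is 65% higher per one-unit increase in the predictor, so the event tends to occur sooner.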
Hazard Ratio
This is everything in Cox.
Formula:
$$HR = e^{\beta}$$
Interpretation:
HR = 1
No effect.
HR > 1
Faster event.
HR < 1
Slower event.
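A quick sketch of the conversion from a coefficient to a hazard ratio and a percent change in hazard. The three coefficient values are illustrative assumptions. 0.5 is included because its hazard ratio is about 1.65, the "65% faster" case.

```python
import math

def hazard_ratio(beta):
    """Hazard ratio for a one-unit increase in the predictor."""
    return math.exp(beta)

def percent_change_in_hazard(beta):
    """Same information, expressed as a percent change in hazard."""
    return (math.exp(beta) - 1) * 100

for beta in (-0.5, 0.0, 0.5):
    print(beta, round(hazard_ratio(beta), 2), round(percent_change_in_hazard(beta)))
# -0.5 0.61 -39
# 0.0 1.0 0
# 0.5 1.65 65
```

The sign of the coefficient decides the direction: negative means HR < 1, zero means HR = 1, positive means HR > 1.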
Example — Diamond Business
Suppose:
Predictor:
customer count.
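Model:
$$h(t \mid \text{customer count}) = h_0(t)\,\exp(\beta \cdot \text{customer count})$$
Hazard ratio:
$$e^{\beta} = 1.35$$
Interpretation:
Each additional customer raises the hazard of sale by 35%.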
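The 1.35 is a per-customer multiplier, so it compounds. A short sketch, assuming customer count enters the model linearly, as in the standard Cox form:

```python
import math

HR_PER_CUSTOMER = 1.35             # hazard ratio from the example
beta = math.log(HR_PER_CUSTOMER)   # implied coefficient

def hazard_ratio_for_change(beta, delta_x):
    """Hazard ratio when the predictor changes by delta_x units."""
    return math.exp(beta * delta_x)

for extra_customers in (1, 2, 3):
    print(extra_customers, round(hazard_ratio_for_change(beta, extra_customers), 2))
# 1 1.35
# 2 1.82
# 3 2.46
```

Two extra customers multiply the hazard by 1.35 × 1.35, not by 1.70.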
Proportional Hazards Assumption
The big assumption. The hazard ratio remains constant over time.
Meaning:
if Round sells twice as fast today, it sells twice as fast tomorrow.
Parallel hazards on the log scale.
Violations
Sometimes effects change.
Examples:
- promotion effect fades,
- seasonality changes.
Need:
time-varying effects.
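One simple way to picture a fading promotion is to let the coefficient drift with time. The linear form and the numbers below are assumptions for illustration only: the promotion doubles the hazard on day 0 and has no effect by day 60.

```python
import math

def time_varying_hazard_ratio(beta0, beta1, t):
    """HR(t) = exp(beta0 + beta1 * t). Proportional hazards is the case beta1 = 0."""
    return math.exp(beta0 + beta1 * t)

beta0 = math.log(2.0)         # assumed: promotion doubles the hazard at day 0
beta1 = -math.log(2.0) / 60   # assumed: effect fades to nothing by day 60

for day in (0, 30, 60):
    print(day, round(time_varying_hazard_ratio(beta0, beta1, day), 2))
# 0 2.0
# 30 1.41
# 60 1.0
```

If a single constant hazard ratio were fitted to data like this, it would average over the fade and describe neither the early nor the late period well.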
Partial Likelihood — The Genius Part
You asked this earlier.
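Cox never estimates:
$$h_0(t)$$
Instead, it
conditions it away.
Partial likelihood:
$$L(\beta) = \prod_{i:\,\text{event}} \frac{\exp(\beta x_i)}{\sum_{j \in R(t_i)} \exp(\beta x_j)}$$
where $R(t_i)$ is the risk set at event time $t_i$.
Notice:
Baseline hazard disappeared.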
Very similar philosophy to Chapter 7.
Numerical Example
Risk set:
| Diamond | Customer Count |
|---|---|
| A | 2 |
| B | 4 |
| C | 6 |
A sells.
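Likelihood:
$$\frac{e^{2\beta}}{e^{2\beta} + e^{4\beta} + e^{6\beta}}$$
Multiply across events.
Estimate:
$$\hat{\beta} = \arg\max_{\beta} L(\beta)$$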
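A sketch of the whole procedure for one covariate, assuming no tied event times. It first evaluates the single term for diamond A from the risk set above. A single event cannot pin down the coefficient, so it then fits a hypothetical five-diamond dataset. The ternary search is a simple stand-in for the Newton-type optimizers that real software uses.

```python
import math

def partial_log_likelihood(beta, durations, events, x):
    """Cox partial log-likelihood, one covariate, no tied event times."""
    total = 0.0
    for i, (t_i, sold) in enumerate(zip(durations, events)):
        if not sold:
            continue  # censored rows contribute only through risk sets
        # Risk set: everything still unsold just before t_i.
        denom = sum(math.exp(beta * x[j])
                    for j, t_j in enumerate(durations) if t_j >= t_i)
        total += beta * x[i] - math.log(denom)
    return total

def fit_beta(durations, events, x, lo=-5.0, hi=5.0, iters=200):
    """Ternary search; works because the log-likelihood is concave in beta."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if (partial_log_likelihood(m1, durations, events, x)
                < partial_log_likelihood(m2, durations, events, x)):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

# The term from the text: A (count 2) sells while A, B, C are at risk.
# At beta = 0 every diamond is equally likely to be the one that sells.
term_at_zero = math.exp(partial_log_likelihood(0.0, [1, 2, 3], [1, 0, 0], [2, 4, 6]))
print(round(term_at_zero, 4))  # 0.3333

# Hypothetical dataset: days out, sold flag, customer count.
durations = [50, 30, 40, 90, 20]
events = [1, 1, 1, 0, 1]
counts = [2, 4, 6, 3, 5]

beta_hat = fit_beta(durations, events, counts)
print(round(beta_hat, 3), round(math.exp(beta_hat), 3))  # estimate and hazard ratio
```

The baseline hazard never appears in the code. Only the ordering of sales and the covariates of whoever was still at risk enter the calculation.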
Time-Varying Covariates
Customer count changes.
Inventory changes.
Model becomes:
$$h(t \mid x(t)) = h_0(t)\,\exp(\beta\,x(t))$$
Very practical.
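In practice, time-varying covariates are usually handled by splitting each item's history into intervals over which the covariate is constant (the "start, stop" layout). A minimal sketch with a hypothetical helper and hypothetical values:

```python
def split_episodes(days_out, sold, changes):
    """Split one diamond's history into (start, stop, customer_count, event) rows.

    changes: list of (day, customer_count), sorted by day, starting at day 0.
    """
    rows = []
    for k, (start, count) in enumerate(changes):
        if start >= days_out:
            break  # change recorded after the diamond left observation
        next_start = changes[k + 1][0] if k + 1 < len(changes) else days_out
        stop = min(next_start, days_out)
        # Only the final interval can carry the sale.
        event = 1 if (sold and stop == days_out) else 0
        rows.append((start, stop, count, event))
    return rows

# Hypothetical diamond: sold on day 120.
# Customer count was 1 at first, 3 from day 40, 5 from day 90.
rows = split_episodes(120, True, [(0, 1), (40, 3), (90, 5)])
print(rows)
# [(0, 40, 1, 0), (40, 90, 3, 0), (90, 120, 5, 1)]
```

Each row then enters the risk set only for its own interval, carrying the customer count that applied during it.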
Frailty Models
Random effects in survival.
Model:
$$h_i(t) = u_i\,h_0(t)\,\exp(\beta x_i)$$
Frailty $u_i$:
hidden risk.
Equivalent spirit to GLMM.
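A small sketch of what the frailty term does: units with identical covariates still end up with different hazards. The gamma distribution with mean 1 is one common choice and is an assumption here, as are the baseline hazard and coefficient.

```python
import math
import random

def frailty_hazard(baseline, beta, x, frailty):
    """h = u * h0 * exp(beta * x) for one unit at one time point."""
    return frailty * baseline * math.exp(beta * x)

rng = random.Random(0)
shape = 4.0  # assumed: gamma frailty with mean 1 and variance 1 / shape
frailties = [rng.gammavariate(shape, 1 / shape) for _ in range(5)]

# Same baseline (0.01 per day), same coefficient, same covariate value.
hazards = [frailty_hazard(0.01, 0.3, 2, u) for u in frailties]
for u, h in zip(frailties, hazards):
    print(round(u, 3), round(h, 5))
```

The spread in the printed hazards comes entirely from the unobserved multiplier, which plays the same role as a random effect in a GLMM.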
Real Dialog Application
Your dataset:
Aging sheet.
Variables:
- Days Out
- Shape
- Weight
- Customer Count
Event:
SOLD
Censoring:
blank Result.
Question:
Which factors speed up sales?
Perfect Cox use case.
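A sketch of the preparation step for a sheet like this. The column names follow the variable list above, with "Result" holding SOLD or blank. The row values are hypothetical, and rejecting any other Result value is an assumption about how unexpected entries should be handled.

```python
# Hypothetical rows; only the column names come from the text above.
rows = [
    {"Days Out": 40, "Shape": "Round", "Weight": 1.05, "Customer Count": 4, "Result": "SOLD"},
    {"Days Out": 300, "Shape": "Oval", "Weight": 1.10, "Customer Count": 1, "Result": ""},
]

def to_survival_record(row):
    """Map one sheet row to a duration, an event flag, and covariates."""
    result = (row.get("Result") or "").strip().upper()
    if result == "SOLD":
        event = 1   # event observed
    elif result == "":
        event = 0   # blank Result: right-censored
    else:
        raise ValueError(f"unexpected Result value: {result!r}")
    return {
        "duration": float(row["Days Out"]),
        "event": event,
        "shape": row["Shape"],
        "weight": float(row["Weight"]),
        "customer_count": int(row["Customer Count"]),
    }

records = [to_survival_record(r) for r in rows]
for rec in records:
    print(rec["duration"], rec["event"], rec["shape"], rec["customer_count"])
# 40.0 1 Round 4
# 300.0 0 Oval 1
```

Once every row has a duration and an event flag, the table is in the shape a Cox model expects.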
Inventory Example
Question:
How long until:
LGRD 1.0–1.19 sells?
Not:
how many sell.
This becomes survival.
Why Chapter 11 Matters
Before this chapter:
time was ignored.
After this chapter:
time becomes the outcome.
Huge conceptual shift.
The Deep Lesson
Chapter 11 teaches:
timing matters as much as occurrence.
And incomplete information still contains signal.
Final Thought
Before Chapter 11:
you ask:
“Will it happen?”
After Chapter 11:
you ask:
“When will it happen?”
That question leads into:
- survival analysis,
- reliability,
- customer lifetime value,
- inventory aging,
- hazard modeling,
- modern event forecasting.

