Learning Objectives
By the end of this lesson, you will be able to:
- Understand joint probability.
- Understand marginal probability.
- Understand conditional probability.
- Understand statistical independence.
- Apply the multiplication rule.
- Apply the law of total probability.
- Prepare for Bayes’ Theorem in the next lesson.
Why Do We Need These Concepts?
In Bayesian statistics, we are constantly asking questions such as:
- What is the probability a patient has a disease given a positive test?
- What is the probability a customer will purchase given that they visited the website?
- What is the probability a diamond sells given its color and clarity?
All these questions involve relationships between events.
Before we can learn Bayes’ Theorem, we must understand how probabilities behave when multiple events occur together.
Events and Sample Spaces
An event is something that can happen.
Examples:
- A coin lands heads.
- A patient has a disease.
- A customer makes a purchase.
- A machine fails.
Suppose we toss a fair coin.
The sample space is:
$$S=\{H,T\}$$
where:
- H = Heads
- T = Tails
Since the coin is fair:
$$P(H)=0.5$$
and
$$P(T)=0.5$$
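The coin example can be sketched in a few lines of Python. This is a minimal illustration, assuming equally likely outcomes; the names `sample_space`, `prob`, and `event_probability` are chosen here for illustration and are not part of the lesson.

```python
from fractions import Fraction

# Sample space for one toss of a fair coin.
sample_space = ["H", "T"]

# With equally likely outcomes, each outcome has probability 1 / |S|.
prob = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}

def event_probability(event):
    """Probability of an event, given as a set of outcomes from the sample space."""
    return sum(prob[outcome] for outcome in event)

p_heads = event_probability({"H"})
p_anything = event_probability({"H", "T"})  # the whole sample space

print(p_heads, p_anything)
```

Treating an event as a set of outcomes makes the later ideas concrete: the probability of the whole sample space is 1, and any event's probability is the sum over the outcomes it contains.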
Joint Probability
What Is Joint Probability?
Joint probability is the probability that two events both occur.
Suppose:
- A = Patient has a disease
- B = Test result is positive
Then:
$$P(A,B)$$
means:
The probability that the patient has the disease AND the test is positive.
The word “AND” is the key idea.
Healthcare Example
Suppose:
- 5% of patients have a disease.
- 4% have both the disease and a positive test.
Then:
$$P(Disease)=0.05$$
and
$$P(Disease,Positive)=0.04$$
The second quantity is a joint probability.
Supply Chain Example
Suppose:
- Event A = Product is stocked.
- Event B = Product sells this week.
Then:
$$P(A,B)$$
represents the probability that the item is stocked and sold during the week.
Visualizing Joint Probability
Imagine a Venn diagram.
Event A occupies one circle.
Event B occupies another.
Their overlap represents:
$$P(A,B)$$
The overlapping region contains outcomes where both events occur.
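The overlap in the Venn diagram corresponds to a set intersection. The sketch below assumes a hypothetical population of 100 equally likely patients; the patient labels are invented for illustration and were picked so that 5% have the disease and 4% have both the disease and a positive test, as in the healthcare example.

```python
from fractions import Fraction

# Hypothetical population of 100 equally likely patients, labeled 0..99.
population = set(range(100))

has_disease = {0, 1, 2, 3, 4}                  # event A: 5 patients
tests_positive = {0, 1, 2, 3, 5, 6, 7, 8, 9}   # event B: 9 patients

# The overlap of the two circles is the intersection of the two sets.
both = has_disease & tests_positive

p_disease = Fraction(len(has_disease), len(population))
p_joint = Fraction(len(both), len(population))

print(float(p_disease), float(p_joint))
```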
Marginal Probability
What Is Marginal Probability?
Marginal probability is the probability of a single event regardless of what happens to other variables.
For example:
$$P(A)$$
or
$$P(B)$$
These are marginal probabilities.
Why Is It Called Marginal?
Historically, joint probabilities were laid out in tables, with one variable along the rows and another along the columns.
To obtain the probabilities for a single variable, statisticians summed each row or column and wrote the totals in the margins of the table.
The name stuck.
Example
Suppose a hospital records:
| Disease | Positive Test | Count |
|---|---|---|
| Yes | Yes | 40 |
| Yes | No | 10 |
| No | Yes | 50 |
| No | No | 900 |
Total observations:
$$N=1000$$
Probability of disease:
$$P(Disease)=\frac{40+10}{1000}=0.05$$
Probability of positive test:
$$P(Positive)=\frac{40+50}{1000}=0.09$$
These are marginal probabilities.
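The same calculation can be reproduced from the hospital counts. This is a small sketch, assuming the table is stored as a dictionary keyed by (disease, positive test); the variable names are illustrative.

```python
# Counts from the hospital table: (disease, positive_test) -> number of patients.
counts = {
    ("yes", "yes"): 40,
    ("yes", "no"): 10,
    ("no", "yes"): 50,
    ("no", "no"): 900,
}

total = sum(counts.values())  # 1000 observations

# Joint probabilities: each cell count divided by the total.
joint = {cell: n / total for cell, n in counts.items()}

# Marginal probabilities: add up the cells where the event of interest occurs.
p_disease = sum(p for (disease, _), p in joint.items() if disease == "yes")
p_positive = sum(p for (_, test), p in joint.items() if test == "yes")

print(round(p_disease, 3), round(p_positive, 3))
```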
Obtaining Marginal Probabilities from Joint Probabilities
Suppose:
| A | B | Probability |
|---|---|---|
| Yes | Yes | 0.20 |
| Yes | No | 0.30 |
| No | Yes | 0.10 |
| No | No | 0.40 |
To find:
$$P(A)$$
we add all rows where A occurs:
$$P(A)=0.20+0.30=0.50$$
Similarly:
$$P(B)=0.20+0.10=0.30$$
This process is called marginalization.
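Marginalization can be written as one small function that sums the joint table over the variable being ignored. The helper `marginal` below is a hypothetical name used only for this sketch.

```python
import math

# Joint table for A and B: (A occurs, B occurs) -> probability.
joint = {
    (True, True): 0.20,
    (True, False): 0.30,
    (False, True): 0.10,
    (False, False): 0.40,
}

def marginal(joint, index, value):
    """Sum joint probabilities over all rows where variable `index` equals `value`."""
    return sum(p for outcome, p in joint.items() if outcome[index] == value)

p_a = marginal(joint, 0, True)   # rows where A occurs
p_b = marginal(joint, 1, True)   # rows where B occurs

# A valid joint table sums to 1 (up to floating-point rounding).
assert math.isclose(sum(joint.values()), 1.0)

print(round(p_a, 2), round(p_b, 2))
```

Because the inputs are floating-point numbers, results are compared with `math.isclose` or rounded for display rather than tested for exact equality.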
Conditional Probability
Motivation
Suppose a doctor knows:
- The patient tested positive.
The doctor now asks:
What is the probability the patient actually has the disease?
Notice something important.
We are no longer considering everyone.
We are only considering positive tests.
This changes the probability.
Definition
Conditional probability is defined as:
$$P(A|B)=\frac{P(A,B)}{P(B)}$$
provided:
$$P(B)>0$$
Read this as:
Probability of A given B.
Intuition
When B occurs, the sample space shrinks.
We only consider outcomes where B happened.
Among those outcomes, we determine how often A also occurred.
Example: Medical Testing
Suppose:
$$P(Disease,Positive)=0.04$$
and
$$P(Positive)=0.09$$
Then:
$$P(Disease|Positive)=\frac{0.04}{0.09}\approx 0.444$$
or about:
$$44.4\%$$
Even though only 5% of people have the disease, among people with positive tests the probability of disease is roughly nine times larger.
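The medical testing example can be computed two ways: from the definition, and by shrinking the sample space to the positive tests in the hospital table and counting. The function name `conditional` is an illustrative choice, not an established API.

```python
def conditional(p_joint, p_given):
    """Return P(A|B) = P(A,B) / P(B). Undefined when P(B) is not positive."""
    if p_given <= 0:
        raise ValueError("P(B) must be greater than 0")
    return p_joint / p_given

# Formula view, using the probabilities from the example.
p_formula = conditional(0.04, 0.09)

# Counting view, using the hospital table: restrict attention to the
# 90 positive tests, then count how many of those patients have the disease.
positives = 40 + 50
positives_with_disease = 40
p_counting = positives_with_disease / positives

print(round(p_formula, 3), round(p_counting, 3))
```

Both views give the same answer because dividing by P(B) is exactly the step that restricts attention to outcomes where B occurred.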
Example: Inventory Management
Suppose:
- 20% of inventory items sell each month.
- 15% of inventory items are both premium products and sold.
Let:
- Premium = Item is a premium product
- Sold = Item sold this month
Then:
$$P(Premium,Sold)=0.15$$
and
$$P(Sold)=0.20$$
Therefore:
$$P(Premium|Sold)=\frac{0.15}{0.20}=0.75$$
Meaning:
75% of sold items are premium products.
Multiplication Rule
Rearranging the conditional probability formula gives:
$$P(A,B)=P(A|B)P(B)$$
This is called the multiplication rule.
An equivalent form is:
$$P(A,B)=P(B|A)P(A)$$
These two formulas are fundamental to Bayes’ Theorem.
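Both forms of the multiplication rule can be checked against the hospital table from the marginal probability example. The sketch uses exact fractions so the two products can be compared without rounding; the variable names are illustrative.

```python
from fractions import Fraction

# Hospital table counts (out of 1000 patients).
n_total = 1000
n_disease_and_positive = 40
n_disease = 40 + 10
n_positive = 40 + 50

# Joint probability read directly from the table.
p_joint = Fraction(n_disease_and_positive, n_total)

# Form 1: P(A,B) = P(A|B) P(B), with A = Disease and B = Positive.
p_disease_given_positive = Fraction(n_disease_and_positive, n_positive)
form_1 = p_disease_given_positive * Fraction(n_positive, n_total)

# Form 2: P(A,B) = P(B|A) P(A).
p_positive_given_disease = Fraction(n_disease_and_positive, n_disease)
form_2 = p_positive_given_disease * Fraction(n_disease, n_total)

print(form_1, form_2, p_joint)
```

Both routes reach the same joint probability, which is the fact Bayes' Theorem relies on.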
Independence
What Does Independence Mean?
Two events are independent if knowing one occurred does not change the probability of the other.
Mathematically:
$$P(A|B)=P(A)$$
provided:
$$P(B)>0$$
If this holds, A and B are independent.
Example: Independent Events
Suppose:
- Event A = Rain in Toronto.
- Event B = Coin lands heads in Calgary.
These events are unrelated.
Therefore:
$$P(A|B)=P(A)$$
Equivalent Test
Two events are independent if:
$$P(A,B)=P(A)P(B)$$
This is often easier to verify.
Example
Suppose:
$$P(A)=0.4$$
$$P(B)=0.5$$
If A and B are independent, then:
$$P(A,B)=P(A)P(B)=0.4\times0.5=0.2$$
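The product test is easy to automate. The sketch below assumes probabilities are given as floating-point numbers, so it compares with a small tolerance; `is_independent` is a hypothetical helper name.

```python
import math

def is_independent(p_a, p_b, p_joint, tol=1e-9):
    """Check whether P(A,B) equals P(A) * P(B), within a small tolerance."""
    return math.isclose(p_joint, p_a * p_b, abs_tol=tol)

# The example above: 0.4 * 0.5 matches a joint probability of 0.2.
print(is_independent(0.4, 0.5, 0.2))

# The joint table from the marginalization section:
# P(A) = 0.50, P(B) = 0.30, but P(A,B) = 0.20 rather than 0.15.
print(is_independent(0.50, 0.30, 0.20))
```

The first check passes and the second fails, so the events in the marginalization table are dependent.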
Dependence
Many real-world variables are dependent.
Examples:
- Smoking and lung disease.
- Age and mortality.
- Product price and demand.
- Inventory age and likelihood of sale.
For dependent events:
$$P(A|B)\neq P(A)$$
Law of Total Probability
Suppose events:
$$B_1,B_2,\ldots,B_n$$
partition the sample space, meaning they are mutually exclusive and together cover every possible outcome.
Then:
$$P(A)=\sum_{i=1}^{n}P(A|B_i)P(B_i)$$
This formula allows us to compute marginal probabilities using conditional probabilities.
It will become extremely important when we derive Bayes’ Theorem.
Healthcare Example
Suppose:
- 5% have disease.
- 95% do not.
Test sensitivity:
$$P(Positive|Disease)=0.80$$
False positive rate:
$$P(Positive|NoDisease)=0.05$$
Then:
$$P(Positive)=P(Positive|Disease)P(Disease)+P(Positive|NoDisease)P(NoDisease)$$
Substituting:
$$P(Positive)=0.80(0.05)+0.05(0.95)=0.0875$$
This quantity will become the denominator of Bayes’ Theorem.
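The law of total probability is a weighted sum, which the following sketch makes explicit. The function name `total_probability` and its argument names are illustrative assumptions; the check that the partition probabilities sum to 1 is a basic sanity test, not a full validation that the events form a partition.

```python
import math

def total_probability(conditionals, partition_probs):
    """Return P(A) = sum_i P(A|B_i) P(B_i) for events B_i that partition the sample space."""
    if len(conditionals) != len(partition_probs):
        raise ValueError("need one conditional probability per partition event")
    if not math.isclose(sum(partition_probs), 1.0):
        raise ValueError("partition probabilities must sum to 1")
    return sum(c * p for c, p in zip(conditionals, partition_probs))

# Healthcare example: B_1 = Disease, B_2 = NoDisease.
p_positive = total_probability(
    conditionals=[0.80, 0.05],      # P(Positive|Disease), P(Positive|NoDisease)
    partition_probs=[0.05, 0.95],   # P(Disease), P(NoDisease)
)

print(round(p_positive, 4))
```

Each term in the sum is a joint probability by the multiplication rule, so the law of total probability is marginalization written with conditional probabilities.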
Putting Everything Together
The relationships among probabilities are:
Joint Probability:
$$P(A,B)$$
Conditional Probability:
$$P(A|B)=\frac{P(A,B)}{P(B)}$$
Multiplication Rule:
$$P(A,B)=P(A|B)P(B)$$
Independence:
$$P(A,B)=P(A)P(B)$$
Law of Total Probability:
$$P(A)=\sum_i P(A|B_i)P(B_i)$$
These five ideas form the mathematical machinery behind Bayesian inference.
Key Concepts Learned
- Joint probability describes events occurring together.
- Marginal probability describes a single event.
- Conditional probability updates probabilities when new information becomes available.
- Independence means one event does not affect another.
- The multiplication rule links conditional and joint probabilities.
- The law of total probability computes marginal probabilities from conditional probabilities.
Looking Ahead
In Lesson 3 we derive the most important equation in Bayesian statistics:
$$P(\theta|D)=\frac{P(D|\theta)P(\theta)}{P(D)}$$
We will show how Bayes’ Theorem naturally emerges from the conditional probability formulas learned in this lesson and why it serves as the engine of Bayesian inference.
Exercises
Exercise 1
Suppose:
$$P(A)=0.6$$
$$P(B)=0.4$$
and A and B are independent.
Find:
$$P(A,B)$$
Exercise 2
Suppose:
$$P(A,B)=0.15$$
$$P(B)=0.30$$
Find:
$$P(A|B)$$
Exercise 3
A disease affects 2% of a population.
A test has:
$$P(Positive|Disease)=0.90$$
$$P(Positive|NoDisease)=0.04$$
Using the law of total probability, compute:
$$P(Positive)$$
This exercise will prepare you for Bayes’ Theorem in the next lesson.
