Lesson 2: Conditional Probability, Joint Probability, Marginal Probability, and Independence

Learning Objectives

By the end of this lesson, you will be able to:

  • Understand joint probability.
  • Understand marginal probability.
  • Understand conditional probability.
  • Understand statistical independence.
  • Apply the multiplication rule.
  • Apply the law of total probability.
  • Prepare for Bayes’ Theorem in the next lesson.

Why Do We Need These Concepts?

In Bayesian statistics, we are constantly asking questions such as:

What is the probability a patient has a disease given a positive test?

What is the probability a customer will purchase given that they visited the website?

What is the probability a diamond sells given its color and clarity?

All these questions involve relationships between events.

Before we can learn Bayes’ Theorem, we must understand how probabilities behave when multiple events occur together.


Events and Sample Spaces

An event is something that can happen.

Examples:

  • A coin lands heads.
  • A patient has a disease.
  • A customer makes a purchase.
  • A machine fails.

Suppose we toss a fair coin.

The sample space is:

$$S=\{H,T\}$$

where:

  • H = Heads
  • T = Tails

Since the coin is fair:

$$P(H)=0.5$$

and

$$P(T)=0.5$$
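As a minimal sketch, the coin example can be written as a small Python dictionary that maps each outcome to its probability. The names `sample_space` and `prob` are illustrative choices, not part of any library, and exact fractions are used only to avoid rounding noise.

```python
from fractions import Fraction

# Sample space for one toss of a fair coin: outcome -> probability.
sample_space = {"H": Fraction(1, 2), "T": Fraction(1, 2)}


def prob(event, space):
    """Probability of an event, where the event is a set of outcomes."""
    return sum(space[outcome] for outcome in event)


p_heads = prob({"H"}, sample_space)          # the event "coin lands heads"
p_anything = prob({"H", "T"}, sample_space)  # the whole sample space

print(p_heads, p_anything)
```

Treating an event as a set of outcomes is what lets the later sections talk about two events overlapping.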


Joint Probability

What Is Joint Probability?

Joint probability measures the probability that two events occur simultaneously.

Suppose:

  • A = Patient has a disease
  • B = Test result is positive

Then:

$$P(A,B)$$

means:

The probability that the patient has the disease AND the test is positive.

The word “AND” is the key idea.


Healthcare Example

Suppose:

  • 5% of patients have a disease.
  • 4% have both the disease and a positive test.

Then:

$$P(Disease)=0.05$$

and

$$P(Disease,Positive)=0.04$$

The second quantity is a joint probability.


Supply Chain Example

Suppose:

  • Event A = Product is stocked.
  • Event B = Product sells this week.

Then:

$$P(A,B)$$

represents the probability that the item is stocked and sold during the week.


Visualizing Joint Probability

Imagine a Venn diagram.

Event A occupies one circle.

Event B occupies another.

Their overlap represents:

$$P(A,B)$$

The overlapping region contains outcomes where both events occur.


Marginal Probability

What Is Marginal Probability?

Marginal probability is the probability of a single event regardless of what happens to other variables.

For example:

$$P(A)$$

or

$$P(B)$$

These are marginal probabilities.


Why Is It Called Marginal?

Historically, joint probabilities were written in large tables.

To obtain the probability for a single variable, statisticians summed across a row or column and wrote the total in the margin of the table.

The name remained.


Example

Suppose a hospital records:

| Disease | Positive Test | Count |
|---------|---------------|-------|
| Yes     | Yes           | 40    |
| Yes     | No            | 10    |
| No      | Yes           | 50    |
| No      | No            | 900   |

Total observations:

$$N=1000$$

Probability of disease:

$$P(Disease)=\frac{40+10}{1000}=0.05$$

Probability of positive test:

$$P(Positive)=\frac{40+50}{1000}=0.09$$

These are marginal probabilities.
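The same arithmetic can be sketched in Python. The dictionary layout and variable names below are assumptions made for illustration; only the four counts come from the hospital table.

```python
# Counts from the hospital table, keyed by (disease, positive_test).
counts = {
    ("yes", "yes"): 40,
    ("yes", "no"): 10,
    ("no", "yes"): 50,
    ("no", "no"): 900,
}

total = sum(counts.values())

# Marginal of disease: add every row where disease == "yes", ignoring the test result.
p_disease = sum(n for (disease, _), n in counts.items() if disease == "yes") / total

# Marginal of a positive test: add every row where the test == "yes", ignoring disease.
p_positive = sum(n for (_, positive), n in counts.items() if positive == "yes") / total

print(total, p_disease, p_positive)
```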


Obtaining Marginal Probabilities from Joint Probabilities

Suppose:

| A   | B   | Probability |
|-----|-----|-------------|
| Yes | Yes | 0.20        |
| Yes | No  | 0.30        |
| No  | Yes | 0.10        |
| No  | No  | 0.40        |

To find:

$$P(A)$$

we add all rows where A occurs:

$$P(A)=0.20+0.30=0.50$$

Similarly:

$$P(B)=0.20+0.10=0.30$$

This process is called marginalization.
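Marginalization can be sketched as a short Python helper that sums the joint probabilities over the variable we are not interested in. The function name `marginal` and the tuple encoding of outcomes are hypothetical choices for this example.

```python
# Joint distribution from the table above, keyed by (A occurs, B occurs).
joint = {
    (True, True): 0.20,
    (True, False): 0.30,
    (False, True): 0.10,
    (False, False): 0.40,
}


def marginal(joint, index, value=True):
    """Sum the joint probabilities of all rows where position `index` equals `value`."""
    return sum(p for outcome, p in joint.items() if outcome[index] == value)


p_a = marginal(joint, 0)  # rows where A occurs: 0.20 + 0.30
p_b = marginal(joint, 1)  # rows where B occurs: 0.20 + 0.10

print(round(p_a, 2), round(p_b, 2))
```

The results are rounded for display only, since floating-point addition can leave a tiny error in the last digit.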


Conditional Probability

Motivation

Suppose a doctor knows:

  • The patient tested positive.

The doctor now asks:

What is the probability the patient actually has the disease?

Notice something important.

We are no longer considering everyone.

We are only considering positive tests.

This changes the probability.


Definition

Conditional probability is defined as:

$$P(A|B)=\frac{P(A,B)}{P(B)}$$

provided:

$$P(B)>0$$

Read this as:

Probability of A given B.


Intuition

When B occurs, the sample space shrinks.

We only consider outcomes where B happened.

Among those outcomes, we determine how often A also occurred.


Example: Medical Testing

Suppose:

$$P(Disease,Positive)=0.04$$

and

$$P(Positive)=0.09$$

Then:

$$P(Disease|Positive)=\frac{0.04}{0.09}\approx0.444$$

or about:

$$44.4\%$$

Even though only 5% of all patients have the disease, among patients with a positive test the probability of disease rises to about 44%.
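A minimal Python sketch of the definition follows. The helper name `conditional` is an assumption for illustration; it simply divides the joint probability by the probability of the conditioning event and refuses to divide when that probability is not positive.

```python
def conditional(p_joint, p_given):
    """Return P(A|B) = P(A,B) / P(B); defined only when P(B) > 0."""
    if p_given <= 0:
        raise ValueError("the conditioning event must have positive probability")
    return p_joint / p_given


# P(Disease, Positive) = 0.04 and P(Positive) = 0.09, as in the example above.
p_disease_given_positive = conditional(0.04, 0.09)

print(round(p_disease_given_positive, 3))
```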


Example: Inventory Management

Suppose:

  • 20% of inventory items sell each month.
  • 15% of inventory items are both premium products and sold that month.

Let:

  • P = Premium
  • S = Sold

Then:

$$P(P,S)=0.15$$

and

$$P(S)=0.20$$

Therefore:

$$P(P|S)=\frac{0.15}{0.20}=0.75$$

Meaning:

75% of sold items are premium products.


Multiplication Rule

Rearranging the conditional probability formula gives:

$$P(A,B)=P(A|B)P(B)$$

This is called the multiplication rule.

An equivalent form is:

$$P(A,B)=P(B|A)P(A)$$

These two formulas are fundamental to Bayes’ Theorem.
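As a quick check, both forms can be evaluated on the hospital counts from the marginal probability example (40 patients with disease and a positive test, 50 with disease, 90 with a positive test, 1000 in total). This is only a sketch, with variable names chosen for illustration. Each conditional probability is computed inside its own restricted group, so the agreement of the two products is not built in by construction.

```python
from fractions import Fraction

n_total = 1000
n_disease = 40 + 10   # everyone with the disease
n_positive = 40 + 50  # everyone with a positive test
n_both = 40           # disease AND positive test

# Conditional probabilities, each computed within its restricted group.
p_disease_given_positive = Fraction(n_both, n_positive)
p_positive_given_disease = Fraction(n_both, n_disease)

# Multiplication rule, both ways round.
joint_via_positive = p_disease_given_positive * Fraction(n_positive, n_total)
joint_via_disease = p_positive_given_disease * Fraction(n_disease, n_total)

print(joint_via_positive, joint_via_disease)
```

Both products reduce to the same joint probability, 40 out of 1000.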


Independence

What Does Independence Mean?

Two events are independent if knowing one occurred does not change the probability of the other.

Mathematically:

$$P(A|B)=P(A)$$

If this holds (with P(B) > 0, so that the conditional probability is defined):

A and B are independent.


Example: Independent Events

Suppose:

  • Event A = Rain in Toronto.
  • Event B = Coin lands heads in Calgary.

These events are unrelated.

Therefore:

$$P(A|B)=P(A)$$


Equivalent Test

Two events are independent if:

$$P(A,B)=P(A)P(B)$$

This is often easier to verify.


Example

Suppose:

$$P(A)=0.4$$

$$P(B)=0.5$$

If A and B are independent:

$$P(A,B)=P(A)P(B)=0.4\times0.5=0.2$$
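The product test is easy to sketch in Python. The function name `is_independent` and the tolerance are assumptions for this illustration; a tolerance is used because probabilities stored as floating-point numbers rarely compare exactly equal.

```python
import math


def is_independent(p_a, p_b, p_ab, tol=1e-9):
    """Check whether P(A,B) equals P(A) * P(B), up to a small tolerance."""
    return math.isclose(p_ab, p_a * p_b, abs_tol=tol)


# The example above: P(A) = 0.4, P(B) = 0.5, P(A,B) = 0.2.
example_independent = is_independent(0.4, 0.5, 0.2)

# The A/B table from the marginalization section:
# P(A) = 0.50, P(B) = 0.30, but P(A,B) = 0.20 rather than 0.15.
table_independent = is_independent(0.50, 0.30, 0.20)

print(example_independent, table_independent)
```

The second case shows the test failing: the joint probability in that table is larger than the product of the marginals, so those events are dependent.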


Dependence

Most real-world variables are dependent.

Examples:

  • Smoking and lung disease.
  • Age and mortality.
  • Product price and demand.
  • Inventory age and likelihood of sale.

For dependent events:

$$P(A|B)\neq P(A)$$


Law of Total Probability

Suppose events:

$$B_1,B_2,\ldots,B_n$$

partition the sample space, meaning that no two of them can occur together and exactly one of them always occurs.

Then:

$$P(A)=\sum_{i=1}^{n}P(A|B_i)P(B_i)$$

This formula allows us to compute marginal probabilities using conditional probabilities.

It will become extremely important when we derive Bayes’ Theorem.
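One way to see where the formula comes from, assuming each event in the partition has positive probability, is to combine two earlier results. Marginalization gives the first equality, because the partition splits A into non-overlapping pieces. The multiplication rule applied to each piece gives the second:

$$P(A)=\sum_{i=1}^{n}P(A,B_i)=\sum_{i=1}^{n}P(A|B_i)P(B_i)$$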


Healthcare Example

Suppose:

  • 5% have disease.
  • 95% do not.

Test sensitivity:

$$P(Positive|Disease)=0.80$$

False positive rate:

$$P(Positive|NoDisease)=0.05$$

Then:

$$P(Positive)=P(Positive|Disease)P(Disease)+P(Positive|NoDisease)P(NoDisease)$$

Substituting:

$$P(Positive)=0.80(0.05)+0.05(0.95)=0.0875$$

This quantity will become the denominator of Bayes’ Theorem.
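The calculation can be sketched as a small Python function. The name `total_probability` and the argument names are hypothetical; the check that the partition probabilities sum to 1 is an added safeguard, not something the lesson requires.

```python
def total_probability(conditionals, partition_probs):
    """Return sum_i P(A|B_i) * P(B_i) for a partition B_1, ..., B_n."""
    if len(conditionals) != len(partition_probs):
        raise ValueError("need one conditional probability per partition event")
    if abs(sum(partition_probs) - 1.0) > 1e-9:
        raise ValueError("partition probabilities must sum to 1")
    return sum(c * p for c, p in zip(conditionals, partition_probs))


# Partition: Disease (5%) and NoDisease (95%).
# Conditionals: P(Positive|Disease) = 0.80, P(Positive|NoDisease) = 0.05.
p_positive = total_probability([0.80, 0.05], [0.05, 0.95])

print(round(p_positive, 4))
```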


Putting Everything Together

The relationships among probabilities are:

Joint Probability:

$$P(A,B)$$

Conditional Probability:

$$P(A|B)=\frac{P(A,B)}{P(B)}$$

Multiplication Rule:

$$P(A,B)=P(A|B)P(B)$$

Independence:

$$P(A,B)=P(A)P(B)$$

Law of Total Probability:

$$P(A)=\sum_i P(A|B_i)P(B_i)$$

These five ideas form the mathematical machinery behind Bayesian inference.


Key Concepts Learned

  1. Joint probability describes events occurring together.
  2. Marginal probability describes a single event.
  3. Conditional probability updates probabilities when new information becomes available.
  4. Independence means that knowing one event occurred does not change the probability of the other.
  5. The multiplication rule links conditional and joint probabilities.
  6. The law of total probability computes marginal probabilities from conditional probabilities.

Looking Ahead

In Lesson 3 we derive the most important equation in Bayesian statistics:

$$P(\theta|D)=\frac{P(D|\theta)P(\theta)}{P(D)}$$

We will show how Bayes’ Theorem naturally emerges from the conditional probability formulas learned in this lesson and why it serves as the engine of Bayesian inference.


Exercises

Exercise 1

Suppose:

$$P(A)=0.6$$

$$P(B)=0.4$$

and A and B are independent.

Find:

$$P(A,B)$$


Exercise 2

Suppose:

$$P(A,B)=0.15$$

$$P(B)=0.30$$

Find:

$$P(A|B)$$


Exercise 3

A disease affects 2% of a population.

A test has:

$$P(Positive|Disease)=0.90$$

$$P(Positive|NoDisease)=0.04$$

Using the law of total probability, compute:

$$P(Positive)$$

This exercise will prepare you for Bayes’ Theorem in the next lesson.
