Note 14: Conditional Probability, Independence, and Combinations
Conditional probability · Bayes' Rule · Total Probability Rule · Independence · Product Rule · Union Bound
Coin flips are independent — ten previous tails do not affect the next flip. But if you have dealt ten red cards from a deck, the probability of the next card being red drops noticeably. This difference is the essence of conditional probability: learning that an event occurred changes our probability estimates for other events. This note formalizes that idea and derives the most important computational tools in probability theory.
Conditional Probability
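Suppose event $B$ is known to have occurred, with $\mathbb{P}[B] > 0$. This shrinks our sample space from $\Omega$ to the subset $B$. Every sample point $\omega \in B$ must have its probability rescaled so that the new probabilities sum to 1; the scaling factor is $1/\mathbb{P}[B]$. Summing the rescaled probabilities over the sample points in $A \cap B$ gives the definition $\mathbb{P}[A \mid B] = \mathbb{P}[A \cap B]/\mathbb{P}[B]$.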
📘 Example — Balls and Bins: conditioning on bin 2 being empty
Throw $m=4$ balls into $n=3$ bins. Total outcomes: $3^4 = 81$, uniform space.
Let $A$ = event bin 1 is empty, $B$ = event bin 2 is empty.
$\mathbb{P}[B] = (2/3)^4 = 16/81$ (4 balls each avoid bin 2 with prob 2/3).
$A \cap B$ = both bins 1 and 2 empty, so all 4 balls fall in bin 3. $\mathbb{P}[A \cap B] = (1/3)^4 = 1/81$.
$$\mathbb{P}[A \mid B] = \frac{1/81}{16/81} = \frac{1}{16}$$
Compare: $\mathbb{P}[A] = 16/81 \approx 0.198$, but $\mathbb{P}[A \mid B] = 1/16 = 0.0625$. Knowing bin 2 is empty drastically reduces the probability that bin 1 is also empty.
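The balls-and-bins numbers can be checked by brute-force enumeration of all 81 outcomes. The sketch below is illustrative only: the helper `cond_prob` and the predicate names are assumptions introduced here, and the counting approach relies on the sample space being uniform.

```python
from fractions import Fraction
from itertools import product

# Each outcome assigns one of the bins 1..3 to each of the 4 balls.
outcomes = list(product([1, 2, 3], repeat=4))  # 3**4 = 81 equally likely outcomes


def cond_prob(event_a, event_b, space):
    """Return P[A | B] on a uniform sample space by counting outcomes."""
    in_b = [w for w in space if event_b(w)]
    in_a_and_b = [w for w in in_b if event_a(w)]
    return Fraction(len(in_a_and_b), len(in_b))


def bin1_empty(w):
    return 1 not in w


def bin2_empty(w):
    return 2 not in w


p_a = Fraction(sum(bin1_empty(w) for w in outcomes), len(outcomes))
p_a_given_b = cond_prob(bin1_empty, bin2_empty, outcomes)

print(p_a)          # 16/81
print(p_a_given_b)  # 1/16
```

Restricting the count to outcomes inside $B$ is exactly the "shrink the sample space and rescale" step described earlier.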
$\mathbb{P}[A \mid B] = 1/16$
📘 Example — Card Dealing: second ace given first is an ace
Deal 2 cards in order. $|\Omega| = 52 \times 51$. Let $B$ = first card is an ace, $A$ = second card is an ace.
$\mathbb{P}[B] = 4/52 = 1/13$. $\mathbb{P}[A \cap B] = (4 \times 3)/(52 \times 51) = 12/2652$.
$$\mathbb{P}[A \mid B] = \frac{12/2652}{4/52} = \frac{12}{2652} \times \frac{52}{4} = \frac{3}{51} = \frac{1}{17}$$
Without conditioning, $\mathbb{P}[A] = 4/52 = 1/13 \approx 0.077$. With conditioning, $\mathbb{P}[A \mid B] = 1/17 \approx 0.059$. If we know the first card is an ace, there are now only 3 aces left out of 51 remaining cards.
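The same counting check works for the two-card deal. This is a sketch under an assumed encoding: cards are labelled 0 to 51 and labels 0 to 3 stand for the four aces. Neither the encoding nor the helper `is_ace` comes from the note.

```python
from fractions import Fraction
from itertools import permutations

# Model the deck as 52 labelled cards; labels 0..3 stand for the four aces.
# (The labelling is an assumption made for this sketch.)
deck = range(52)
deals = list(permutations(deck, 2))  # ordered pairs: 52 * 51 = 2652


def is_ace(card):
    return card < 4


first_ace = [d for d in deals if is_ace(d[0])]
both_aces = [d for d in first_ace if is_ace(d[1])]
second_ace = [d for d in deals if is_ace(d[1])]

p_second_given_first = Fraction(len(both_aces), len(first_ace))
p_second = Fraction(len(second_ace), len(deals))

print(p_second_given_first)  # 1/17
print(p_second)              # 1/13
```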
$\mathbb{P}[A \mid B] = 3/51 = 1/17$
Bayesian Inference
Conditional probability is at the heart of Bayesian inference — the process of updating beliefs after observations. The key idea: you start with a prior probability $\mathbb{P}[A]$ (your belief before any observations), make an observation $B$, and compute the posterior probability $\mathbb{P}[A|B]$ (your updated belief).
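For example, let $A$ be the event that a person is affected by a condition and $B$ the event that their test comes back positive, with:
- $\mathbb{P}[A] = 0.05$ — 5% of the population is affected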
- $\mathbb{P}[B \mid A] = 0.9$ — the test is positive for 90% of affected people (10% false negatives)
- $\mathbb{P}[B \mid \bar{A}] = 0.2$ — the test is positive for 20% of healthy people (20% false positives)
Using Bayes' Rule: $\mathbb{P}[A \mid B] = \dfrac{\mathbb{P}[B \mid A]\mathbb{P}[A]}{\mathbb{P}[B]} = \dfrac{0.9 \times 0.05}{\mathbb{P}[B]} = \dfrac{0.045}{\mathbb{P}[B]}$
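By the Total Probability Rule, $\mathbb{P}[B] = \mathbb{P}[B \mid A]\mathbb{P}[A] + \mathbb{P}[B \mid \bar{A}]\mathbb{P}[\bar{A}] = 0.9 \times 0.05 + 0.2 \times 0.95 = 0.045 + 0.19 = 0.235$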
$$\mathbb{P}[A \mid B] = \frac{0.045}{0.235} = \frac{9}{47} \approx 0.191$$
Despite testing positive, there is only about a 19% chance of actually being affected!
This result is counterintuitive but crucial. The low prior probability ($5\%$) combined with the high false positive rate ($20\%$) means most positive tests are false. This is why medical screening programs are more effective for high-risk populations.
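The posterior calculation can be packaged as a small function to see how strongly the answer depends on the prior. This is a minimal sketch: the function name `posterior` and its parameter names are introduced here for illustration, and the 30% prior for a higher-risk group is a made-up value, not a figure from the note.

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P[A | B] via Bayes' Rule, with P[B] from the Total Probability Rule."""
    p_positive = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / p_positive


# The numbers from the example: 5% prior, 90% true positives, 20% false positives.
print(round(posterior(0.05, 0.9, 0.2), 3))  # 0.191

# Hypothetical higher-risk group (the 30% prior is an assumption for illustration).
print(round(posterior(0.30, 0.9, 0.2), 3))  # 0.659
```

With the same test, raising the prior from 5% to the assumed 30% lifts the posterior from about 19% to about 66%, which is the effect behind the remark on high-risk populations.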
Bayes' Rule and Total Probability Rule
Total Probability Rule: why it works
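$A \cap B$ and $\bar{A} \cap B$ are disjoint and their union is $B$, so $\mathbb{P}[B] = \mathbb{P}[A \cap B] + \mathbb{P}[\bar{A} \cap B]$. Applying the definition of conditional probability to each term gives $\mathbb{P}[A \cap B] = \mathbb{P}[B \mid A]\mathbb{P}[A]$ and $\mathbb{P}[\bar{A} \cap B] = \mathbb{P}[B \mid \bar{A}]\mathbb{P}[\bar{A}]$.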
Bayes' Rule: why it works
By definition, $\mathbb{P}[A \mid B] = \mathbb{P}[A \cap B]/\mathbb{P}[B]$. Rewrite the numerator using $\mathbb{P}[A \cap B] = \mathbb{P}[B \mid A]\mathbb{P}[A]$, then apply the Total Probability Rule to the denominator.
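Written out in full, the two arguments above combine as follows (a restatement, assuming $\mathbb{P}[B] > 0$ and $0 < \mathbb{P}[A] < 1$ so that every conditional probability is defined):
$$\mathbb{P}[B] = \mathbb{P}[A \cap B] + \mathbb{P}[\bar{A} \cap B] = \mathbb{P}[B \mid A]\,\mathbb{P}[A] + \mathbb{P}[B \mid \bar{A}]\,\mathbb{P}[\bar{A}]$$
$$\mathbb{P}[A \mid B] = \frac{\mathbb{P}[A \cap B]}{\mathbb{P}[B]} = \frac{\mathbb{P}[B \mid A]\,\mathbb{P}[A]}{\mathbb{P}[B \mid A]\,\mathbb{P}[A] + \mathbb{P}[B \mid \bar{A}]\,\mathbb{P}[\bar{A}]}$$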
📘 Example — Tennis Match
You will play either opponent X (probability 0.6) or Y (probability 0.4). You win against X with probability 0.7 and against Y with probability 0.3.
By the Total Probability Rule:
$$\mathbb{P}[\text{win}] = \mathbb{P}[\text{win} \mid X]\mathbb{P}[X] + \mathbb{P}[\text{win} \mid Y]\mathbb{P}[Y] = 0.7 \times 0.6 + 0.3 \times 0.4 = 0.42 + 0.12 = 0.54$$
You win with probability 0.54.
General Form: Partitions
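For reference, the standard general form replaces the pair $A, \bar{A}$ with any partition of the sample space. It is stated here in its usual textbook version, assuming events $A_1, \dots, A_n$ that are pairwise disjoint, together cover $\Omega$, and each have positive probability:
$$\mathbb{P}[B] = \sum_{i=1}^{n} \mathbb{P}[B \mid A_i]\,\mathbb{P}[A_i], \qquad \mathbb{P}[A_i \mid B] = \frac{\mathbb{P}[B \mid A_i]\,\mathbb{P}[A_i]}{\sum_{j=1}^{n} \mathbb{P}[B \mid A_j]\,\mathbb{P}[A_j]}$$
The tennis match is the case $n = 2$ with the partition given by the two possible opponents.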
Independence
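Intuition: the two events have nothing to do with each other. Formally, $A$ and $B$ are independent if $\mathbb{P}[A \cap B] = \mathbb{P}[A]\mathbb{P}[B]$, or equivalently (when $\mathbb{P}[B] > 0$) if $\mathbb{P}[A \mid B] = \mathbb{P}[A]$. Coin flips are independent: the outcome of the first flip carries no information about the second. Drawing cards from a deck without replacement is NOT independent: each card drawn changes the composition of the remaining deck.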
Mutual Independence vs. Pairwise Independence
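Flip a fair coin twice and define:
- $A$ = first flip is H (prob 1/2)
- $B$ = second flip is H (prob 1/2)
- $C$ = both flips are the same (prob 1/2)
Each pair of these events is independent: every pairwise intersection is the single outcome HH, with probability $1/4 = 1/2 \times 1/2$. The three events are not mutually independent, however, because $\mathbb{P}[A \cap B \cap C] = 1/4 \neq 1/8 = \mathbb{P}[A]\mathbb{P}[B]\mathbb{P}[C]$.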
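A brute-force check of pairwise versus mutual independence for $A$, $B$, and $C$ is sketched below. It assumes two fair coin flips, so the four outcomes are equally likely; the helper `prob` and the predicate names are introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for two fair coin flips (assumed setup): HH, HT, TH, TT.
space = list(product("HT", repeat=2))


def prob(event):
    """Probability of an event (a predicate on outcomes) in the uniform space."""
    return Fraction(sum(1 for w in space if event(w)), len(space))


def first_is_h(w):      # event A
    return w[0] == "H"


def second_is_h(w):     # event B
    return w[1] == "H"


def flips_match(w):     # event C
    return w[0] == w[1]


events = [first_is_h, second_is_h, flips_match]

# Pairwise independence: every pair satisfies P[X and Y] = P[X] * P[Y].
pairs = [(x, y) for i, x in enumerate(events) for y in events[i + 1:]]
pairwise = all(prob(lambda w: x(w) and y(w)) == prob(x) * prob(y) for x, y in pairs)

# Mutual independence additionally needs the triple intersection to factor.
p_all = prob(lambda w: all(e(w) for e in events))
mutual = pairwise and p_all == prob(events[0]) * prob(events[1]) * prob(events[2])

print(pairwise)  # True
print(p_all)     # 1/4
print(mutual)    # False
```

The triple intersection is just the outcome HH, so its probability is 1/4 rather than the 1/8 that mutual independence would require.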
Intersections: The Product Rule
📘 Example — Poker Flush via Product Rule
Let $A_i$ = event that the $i$-th card drawn is a Heart. We want $\mathbb{P}[A_1 \cap A_2 \cap A_3 \cap A_4 \cap A_5]$.
$$\mathbb{P}[A_1] = \frac{13}{52} = \frac{1}{4}$$
$$\mathbb{P}[A_2 \mid A_1] = \frac{12}{51}, \quad \mathbb{P}[A_3 \mid A_1 \cap A_2] = \frac{11}{50}, \quad \mathbb{P}[A_4 \mid A_1\cap A_2 \cap A_3] = \frac{10}{49}, \quad \mathbb{P}[A_5 \mid \cdots] = \frac{9}{48}$$
$$\mathbb{P}[\text{Hearts flush}] = \frac{13}{52} \times \frac{12}{51} \times \frac{11}{50} \times \frac{10}{49} \times \frac{9}{48}$$
$$\mathbb{P}[\text{any flush}] = 4 \times \frac{13 \cdot 12 \cdot 11 \cdot 10 \cdot 9}{52 \cdot 51 \cdot 50 \cdot 49 \cdot 48} \approx 0.00198$$
Same answer as Note 13, via a different method. Two approaches confirm each other.
📘 Example — Monty Hall Revisited via Product Rule
Let $C_1$ = contestant picks door 1, $P_2$ = prize behind door 2, $H_3$ = host opens door 3.
Note: $C_1$ and $P_2$ are independent (contestant's choice and prize placement are unrelated).
$$\mathbb{P}[C_1 \cap P_2 \cap H_3] = \mathbb{P}[C_1] \times \mathbb{P}[P_2 \mid C_1] \times \mathbb{P}[H_3 \mid C_1 \cap P_2]$$
$$= \frac{1}{3} \times \frac{1}{3} \times 1 = \frac{1}{9}$$
The last factor is 1 because given the contestant chose door 1 and the prize is behind door 2, the host MUST open door 3 (the only remaining goat door).
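For the denominator, the host can open door 3 only if the prize is behind door 1 or door 2. Writing $P_1$ for the event that the prize is behind door 1, and assuming the host chooses uniformly between doors 2 and 3 in that case, $\mathbb{P}[C_1 \cap H_3] = \mathbb{P}[C_1 \cap P_1 \cap H_3] + \mathbb{P}[C_1 \cap P_2 \cap H_3] = \frac{1}{3} \times \frac{1}{3} \times \frac{1}{2} + \frac{1}{9} = \frac{1}{6}$.
Then: $\mathbb{P}[P_2 \mid C_1 \cap H_3] = \dfrac{\mathbb{P}[C_1 \cap P_2 \cap H_3]}{\mathbb{P}[C_1 \cap H_3]} = \dfrac{1/9}{1/6} = \dfrac{2}{3}$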
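The Monty Hall probabilities can be reproduced by building the full joint distribution with the Product Rule. This is a sketch under one stated assumption: when the contestant's door hides the prize, the host opens one of the other two doors uniformly at random. The dictionary `joint` and the variable names are introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

doors = [1, 2, 3]
third = Fraction(1, 3)

# Joint probability of each (choice, prize, host-opened door) triple via the
# Product Rule: P[C] * P[P | C] * P[H | C, P].
# Assumption: when the contestant's door hides the prize, the host opens one of
# the other two doors uniformly at random.
joint = {}
for choice, prize in product(doors, repeat=2):
    allowed = [d for d in doors if d != choice and d != prize]
    for host in allowed:
        joint[(choice, prize, host)] = third * third * Fraction(1, len(allowed))

p_c1_p2_h3 = joint[(1, 2, 3)]
p_c1_h3 = sum(p for (c, _, h), p in joint.items() if c == 1 and h == 3)

print(p_c1_p2_h3)            # 1/9
print(p_c1_h3)               # 1/6
print(p_c1_p2_h3 / p_c1_h3)  # 2/3
```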
Switching wins with probability 2/3. Formally confirmed via the Product Rule.
Unions: Inclusion-Exclusion and Union Bound
📘 Example — Las Vegas dice game (the casino's wrong claim)
You pick a number from 1 to 6. Three dice are rolled. You win if your number appears on at least one die.
Let $A_i$ = your number appears on die $i$. $\mathbb{P}[A_i] = 1/6$. The dice are independent, so:
$$\mathbb{P}[A_i \cap A_j] = \mathbb{P}[A_i]\mathbb{P}[A_j] = \frac{1}{36}, \quad \mathbb{P}[A_1 \cap A_2 \cap A_3] = \frac{1}{216}$$
The casino's (wrong) claim: $\mathbb{P}[A_1 \cup A_2 \cup A_3] = 3 \times \frac{1}{6} = \frac{1}{2}$. Wrong because the events overlap!
Correct computation via inclusion-exclusion:
$$\mathbb{P}[A_1 \cup A_2 \cup A_3] = 3 \cdot \frac{1}{6} - 3 \cdot \frac{1}{36} + \frac{1}{216} = \frac{108 - 18 + 1}{216} = \frac{91}{216} \approx 0.421$$
Your actual odds are about 42.1%, not 50%. The casino's argument overcounts outcomes where your number appears on multiple dice.
Hard Practice Problems
Problem 1: A biased coin has $\mathbb{P}[H] = 1/3$. You flip it $n$ times. Let $A$ be the event that you get at least one head, and let $B$ be the event that you get at least one tail. Compute $\mathbb{P}[A \mid B]$ in terms of $n$. What does it approach as $n \to \infty$?
📋 Solution to Hard Problem 1
Solution:
- Compute $\mathbb{P}[A \cap B]$: this is the probability of having at least one head AND at least one tail — equivalently, the sequence is not all-heads and not all-tails. $$\mathbb{P}[A \cap B] = 1 - \mathbb{P}[\bar{A}] - \mathbb{P}[\bar{B}] = 1 - \left(\frac{2}{3}\right)^n - \left(\frac{1}{3}\right)^n$$ ($\bar{A}$ = all tails, prob $(2/3)^n$; $\bar{B}$ = all heads, prob $(1/3)^n$. These are disjoint.)
- Compute $\mathbb{P}[B]$: at least one tail means not all heads. $$\mathbb{P}[B] = 1 - \mathbb{P}[\bar{B}] = 1 - \left(\frac{1}{3}\right)^n$$
- Apply the definition: $$\mathbb{P}[A \mid B] = \frac{\mathbb{P}[A \cap B]}{\mathbb{P}[B]} = \frac{1 - (2/3)^n - (1/3)^n}{1 - (1/3)^n}$$
- As $n \to \infty$: $(2/3)^n \to 0$ and $(1/3)^n \to 0$, so $\mathbb{P}[A|B] \to 1/1 = 1$. Intuitively: if you flip many coins and we know there's at least one tail, it becomes near-certain there's also at least one head.
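The closed form can be cross-checked against a direct sum over all $2^n$ flip sequences for small $n$. This is a sketch only: the function names `formula` and `brute_force` are introduced here, and exact fractions are used so the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def formula(n):
    """Closed form from the solution above, with P[H] = 1/3."""
    all_tails = Fraction(2, 3) ** n
    all_heads = Fraction(1, 3) ** n
    return (1 - all_tails - all_heads) / (1 - all_heads)


def brute_force(n):
    """P[at least one H | at least one T], summing over all 2**n sequences."""
    p = {"H": Fraction(1, 3), "T": Fraction(2, 3)}
    p_b = Fraction(0)
    p_a_and_b = Fraction(0)
    for seq in product("HT", repeat=n):
        weight = Fraction(1)
        for flip in seq:
            weight *= p[flip]
        if "T" in seq:
            p_b += weight
            if "H" in seq:
                p_a_and_b += weight
    return p_a_and_b / p_b


for n in (1, 2, 5):
    assert formula(n) == brute_force(n)

print(formula(2))          # 1/2
print(float(formula(20)))  # close to 1, consistent with the limit
```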
Problem 2: An instructor gives a multiple-choice test with 4 options per question. A student either knows the answer (prob $p$) or guesses uniformly at random. Given that a student answered a question correctly, what is the probability they actually knew the answer (vs. guessed correctly)? Compute this for $p = 0.6$ and explain the result intuitively.
📋 Solution to Hard Problem 2
This is a classic application of Bayes' Rule.
- Define events: $K$ = student knows the answer (prob $p$), $C$ = student answers correctly.
- Given values: $\mathbb{P}[K] = p$, $\mathbb{P}[\bar{K}] = 1-p$, $\mathbb{P}[C \mid K] = 1$ (knows $\Rightarrow$ correct), $\mathbb{P}[C \mid \bar{K}] = 1/4$ (random guess, 4 options).
- Total Probability Rule: $\mathbb{P}[C] = 1 \cdot p + \frac{1}{4}(1-p) = p + \frac{1-p}{4} = \frac{3p+1}{4}$.
- Bayes' Rule: $\mathbb{P}[K \mid C] = \dfrac{\mathbb{P}[C|K]\mathbb{P}[K]}{\mathbb{P}[C]} = \dfrac{1 \cdot p}{(3p+1)/4} = \dfrac{4p}{3p+1}$.
- For $p = 0.6$: $\mathbb{P}[K|C] = \dfrac{4 \times 0.6}{3 \times 0.6 + 1} = \dfrac{2.4}{2.8} = \dfrac{6}{7} \approx 0.857$.
Intuition: A correct answer most likely means the student knew it. Even though some guessers get lucky, knowing the answer is by far the most common reason for a correct response. The 85.7% figure quantifies this. If $p$ were very small (very weak class), the fraction would be closer to 0.
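To see how the answer moves with $p$, the steps above can be wrapped in a short function. This is a minimal sketch: the name `prob_knew_given_correct` and the `options` parameter (a generalization beyond the 4-option case in the problem) are assumptions introduced here.

```python
from fractions import Fraction


def prob_knew_given_correct(p, options=4):
    """P[K | C] when a guesser picks uniformly among `options` choices."""
    p_correct = p + (1 - p) * Fraction(1, options)  # Total Probability Rule
    return p / p_correct                            # Bayes' Rule, P[C | K] = 1


print(prob_knew_given_correct(Fraction(3, 5)))  # 6/7

# A few other values of p, printed as decimals for comparison.
for p in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    print(p, float(prob_knew_given_correct(p)))
```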
$\mathbb{P}[K \mid C] = 4p/(3p+1)$. For $p=0.6$, this is $6/7 \approx 85.7\%$.
Key Formulas Reference
| Formula | When to Use |
|---|---|
| $\mathbb{P}[A \mid B] = \mathbb{P}[A\cap B]/\mathbb{P}[B]$ | Definition of conditional probability; computing $\mathbb{P}[A \mid B]$ from the intersection |
| $\mathbb{P}[B] = \mathbb{P}[B \mid A]\mathbb{P}[A]+\mathbb{P}[B \mid \bar A]\mathbb{P}[\bar A]$ | Total Probability Rule; denominator for Bayes |
| $\mathbb{P}[A \mid B] = \mathbb{P}[B \mid A]\mathbb{P}[A]/\mathbb{P}[B]$ | Bayes' Rule; "flipping" conditional probability |
| $\mathbb{P}[A\cap B]=\mathbb{P}[A]\mathbb{P}[B]$ | Test for independence |
| $\mathbb{P}[A_1\cap\cdots\cap A_n]=\mathbb{P}[A_1]\mathbb{P}[A_2 \mid A_1]\cdots$ | Product Rule (chain rule) |
| $\mathbb{P}[\bigcup A_i] = \sum_i \mathbb{P}[A_i]-\sum_{i < j}\mathbb{P}[A_i\cap A_j]+\sum_{i < j < k}\mathbb{P}[A_i\cap A_j\cap A_k]-\cdots$ | Inclusion-Exclusion for unions |
| $\mathbb{P}[\bigcup A_i] \leq \sum \mathbb{P}[A_i]$ | Union Bound (simple overestimate) |
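
The last two rows can be compared on the three-dice game from the Unions section. The sketch below enumerates all 216 rolls and sets the exact answer beside the Inclusion-Exclusion value and the Union Bound. Fixing the chosen number to 1 is an arbitrary choice made for this sketch, and the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=3))  # 6**3 = 216 equally likely rolls
my_number = 1  # any fixed choice gives the same answer by symmetry

# Exact probability that the chosen number shows on at least one die.
exact = Fraction(sum(my_number in r for r in rolls), len(rolls))

# Inclusion-Exclusion with P[A_i] = 1/6 and independent dice.
p = Fraction(1, 6)
incl_excl = 3 * p - 3 * p**2 + p**3

# Union Bound: an upper bound, not the true probability.
union_bound = 3 * p

print(exact)        # 91/216
print(incl_excl)    # 91/216
print(union_bound)  # 1/2
```

The Union Bound reproduces the casino's 1/2 figure, which is a valid upper bound but overstates the true probability of 91/216.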