Note 13: Introduction to Discrete Probability
Sample spaces · Probability spaces · Uniform spaces · Classic examples · Birthday Paradox · Monty Hall
Probability theory began with gambling. Today it is the mathematical foundation for machine learning, algorithm analysis, systems design, and scientific inference. In EECS, every time someone says "this algorithm succeeds with high probability" or "the expected running time is $O(n \log n)$," they are invoking probability theory. This note gives you the precise framework to make such statements rigorously.
Why Probability?
Consider these statements: "The chance of a flush in poker is about 0.2%." "A randomized primality test is wrong with probability at most $10^{-12}$." "In a group of 23 people, there is a better-than-even chance two share a birthday." None of these make sense without specifying an underlying probability space — an explicit model of the random experiment.
Without the probability space, probability statements are nearly meaningless. Specifying it forces you to answer: what are the possible outcomes? what probabilities do they carry? This framework eliminates the intuition traps that even mathematicians fall into (witness the Monty Hall debate).
Sample Spaces and Probability Spaces
Examples: flipping a coin 3 times gives $\Omega = \{HHH, HHT, HTH, \ldots, TTT\}$, which has $|\Omega| = 2^3 = 8$ elements.
- Non-negativity: $0 \leq \mathbb{P}[\omega] \leq 1$ for all $\omega$.
- Normalization: $\displaystyle\sum_{\omega \in \Omega} \mathbb{P}[\omega] = 1$.
Events and the Complement Rule
Classic Probability Spaces
Coin Tosses (Biased Coin)
Suppose a coin has $\mathbb{P}[H] = p$ and $\mathbb{P}[T] = 1-p$ for some $0 \leq p \leq 1$. In $n$ tosses, the sample space has $2^n$ elements. The probability of any specific sequence with $r$ heads and $n-r$ tails is $p^r(1-p)^{n-r}$. (We multiply because tosses are independent — this will be formally justified in Note 14.)
📘 Example — Biased coin with $p = 2/3$, flip 4 times
Let $A$ = event that all four tosses are the same. Then $A = \{HHHH, TTTT\}$.
$$\mathbb{P}[HHHH] = \left(\frac{2}{3}\right)^4 = \frac{16}{81}, \quad \mathbb{P}[TTTT] = \left(\frac{1}{3}\right)^4 = \frac{1}{81}$$ $$\mathbb{P}[A] = \frac{16}{81} + \frac{1}{81} = \frac{17}{81}$$Let $B$ = event of exactly 2 heads. There are $\binom{4}{2} = 6$ such sequences, each with probability $(2/3)^2(1/3)^2 = 4/81$:
$$\mathbb{P}[B] = 6 \cdot \frac{4}{81} = \frac{24}{81} = \frac{8}{27}$$ $\mathbb{P}[A] = 17/81$, $\mathbb{P}[B] = 8/27$Rolling Dice
Two fair dice: $\Omega = \{(i,j): 1 \leq i,j \leq 6\}$ with $|\Omega| = 36$, all outcomes equally likely. This is a uniform space, so $\mathbb{P}[A] = |A|/36$.
Card Shuffling and Poker Hands
Shuffling a deck: $\Omega$ consists of all $52!$ permutations, uniform probability. For a 5-card poker hand: $|\Omega| = \binom{52}{5} = 2{,}598{,}960$, all hands equally likely.
Balls and Bins
Throw $m$ labeled balls into $n$ labeled bins uniformly at random. Each ball independently picks any of the $n$ bins with equal probability $1/n$. The sample space has $n^m$ equally likely outcomes.
For $m = 20$ balls in $n = 10$ bins: $\mathbb{P}[\text{bin 1 empty}] = (9/10)^{20} \approx 0.122$.
The Birthday Paradox
How many people need to be in a room before there is a better than 50% chance that two share a birthday? Most people guess around 183 (half of 365). The true answer is just 23. This counterintuitive result is known as the birthday paradox.
Let $A$ = event that at least two people share a birthday. Use the complement: $$\mathbb{P}[A] = 1 - \mathbb{P}[\bar{A}] = 1 - \frac{365 \times 364 \times \cdots \times (365 - n + 1)}{365^n}$$ The complement $\bar{A}$ is the event that all $n$ birthdays are distinct (sampling without replacement).
The Monty Hall Problem
Three doors: one hides a car, two hide goats. You pick a door (say door 1). The host, who knows where the car is, opens one of the other doors to reveal a goat. Should you switch to the remaining door?
Most people say it doesn't matter — two doors remain, so each has probability 1/2. This is wrong. Switching wins with probability 2/3.
- If $i \neq j$ (prize and pick differ): host is forced to open the one remaining non-prize door. There are 6 such outcomes, each with probability $\frac{1}{3} \times \frac{1}{3} \times 1 = \frac{1}{9}$.
- If $i = j$ (contestant picked the prize door): host can open either of the other 2 doors. There are 6 such outcomes, each with probability $\frac{1}{3} \times \frac{1}{3} \times \frac{1}{2} = \frac{1}{18}$.
Switching wins when $i \neq j$: probability $6 \times \frac{1}{9} = \frac{2}{3}$. Staying wins when $i = j$: probability $6 \times \frac{1}{18} = \frac{1}{3}$.
The key insight: your initial door has a $1/3$ chance of being correct. The other two doors combined have a $2/3$ chance. When the host removes one of them (a goat), all of that $2/3$ probability is concentrated on the remaining door. Switching is always better.
The Four Steps of Probability
- Define the sample space $\Omega$: what are all possible outcomes of the experiment?
- Assign probabilities to each sample point: are they uniform? or does a specific formula apply?
- Identify the event $A$ you care about: which subset of $\Omega$ does it correspond to?
- Compute $\mathbb{P}[A]$: add up the probabilities of all sample points in $A$. For uniform spaces, count $|A|/|\Omega|$.
The Monty Hall problem perfectly illustrates this: many erroneous "proofs" were submitted to newspapers because people reasoned intuitively rather than systematically enumerating the sample space.
Hard Practice Problems
Problem: A class has 30 students. What is the probability that at least two students were born in the same month? (Assume 12 equally likely months and treat month-of-birth as independent across students.) Express your answer both exactly and as a decimal approximation.
📋 Solution to Hard Problem 1
This is the birthday paradox with 12 months instead of 365 days and 30 students instead of $n$.
- Sample space: all sequences of 30 month-values, each from $\{1,\ldots,12\}$. Total: $12^{30}$.
- Complement event $\bar{A}$: all 30 students have distinct birth months. But wait: with 30 students and only 12 months, by the Pigeonhole Principle, it is impossible for all 30 to have distinct months. $|\bar{A}| = 0$.
- Therefore $\mathbb{P}[\bar{A}] = 0$, so $\mathbb{P}[A] = 1 - 0 = 1$.
With 30 students and only 12 months, it is certain that at least two share a birth month. The pigeonhole principle guarantees it: 30 pigeons into 12 holes means at least one hole has $\lceil 30/12 \rceil = 3$ pigeons.
$\mathbb{P}[A] = 1$ (certainty). The birthday paradox becomes trivial when students outnumber months.Problem: Throw $n$ labeled balls into $n$ labeled bins uniformly at random. Let $A_i$ be the event that bin $i$ is empty. Compute $\mathbb{P}[A_1 \cup A_2 \cup \cdots \cup A_n]$, the probability that at least one bin is empty, using the inclusion-exclusion principle. Show that for large $n$ this is approximately $1 - 1/e \approx 0.632$.
📋 Solution to Hard Problem 2
Solution using inclusion-exclusion:
- $\mathbb{P}[A_i] = \left(\frac{n-1}{n}\right)^n$ — all $n$ balls avoid bin $i$. There are $\binom{n}{1} = n$ such events.
- $\mathbb{P}[A_i \cap A_j] = \left(\frac{n-2}{n}\right)^n$ — all balls avoid two specific bins. There are $\binom{n}{2}$ such pairs.
- In general, $\mathbb{P}[A_{i_1}\cap\cdots\cap A_{i_k}] = \left(\frac{n-k}{n}\right)^n$ for any $k$ specific bins. There are $\binom{n}{k}$ such $k$-tuples.
- By inclusion-exclusion: $$\mathbb{P}\left[\bigcup_{i=1}^n A_i\right] = \sum_{k=1}^{n} (-1)^{k-1}\binom{n}{k}\left(\frac{n-k}{n}\right)^n = \sum_{k=1}^{n-1}(-1)^{k-1}\binom{n}{k}\left(1-\frac{k}{n}\right)^n$$ (The $k=n$ term is 0 since $(0/n)^n = 0$.)
- For large $n$, $\left(1-\frac{k}{n}\right)^n \approx e^{-k}$. Using the first term (ignoring higher order terms): $$\mathbb{P}[\text{some bin empty}] \approx n \cdot e^{-1} - \binom{n}{2} e^{-2} + \ldots \approx 1 - e^{-1} \approx 0.632$$ This uses the fact that the full series $\sum_{k=1}^{\infty}(-1)^{k-1}\binom{n}{k}e^{-k} \to 1 - e^{-1}$ as $n\to\infty$, which can be verified via the expansion of $e^{-1}$.