Leon 8th Edition · Beyond the Syllabus
Chapter 7: Numerical Linear Algebra
Floating-point arithmetic · LU factorization · Pivoting · Condition numbers · Householder & Givens · QR iteration · Pseudoinverse
When we run algorithms from Chapters 1–6 on a computer, most real numbers can't be stored exactly. Tiny rounding errors compound through millions of operations. A system that looks fine mathematically can be numerically catastrophic. This chapter asks: how do we solve linear algebra problems stably and efficiently on finite-precision hardware? These are the methods inside MATLAB, NumPy, LAPACK, and every serious scientific computing library.
Section 7.1
Floating-Point Numbers
Computers store real numbers in floating-point format: $\pm\, 0.d_1d_2\cdots d_t \times b^e$ where $b$ is the base (typically 2), $t$ is the number of stored digits (the precision), and $e$ is the exponent. Not all real numbers are representable — only finitely many floats exist.
IEEE 754 Double Precision and Machine Epsilon
Machine Epsilon
The machine epsilon $\varepsilon_{\text{mach}}$ is the gap between $1$ and the next larger floating-point number. For IEEE 754 double precision (64-bit, the default in Python/MATLAB/NumPy):
$$\varepsilon_{\text{mach}} = 2^{-52} \approx 2.22\times10^{-16}$$
Every arithmetic operation introduces a relative error of at most $\varepsilon_{\text{mach}}$: if $x$ is exact, the stored value satisfies $\text{fl}(x)=x(1+\delta)$ where $|\delta|\leq\varepsilon_{\text{mach}}$.
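Both facts are directly observable in Python (a quick check, not from the text):

```python
import sys

# The gap between 1.0 and the next double is 2^-52; adding half of it
# rounds back to 1.0 under round-to-nearest-even.
eps = 2.0 ** -52
print(1.0 + eps > 1.0)                # True: 1 + eps is representable
print(1.0 + eps / 2 == 1.0)           # True: rounds back to exactly 1
print(sys.float_info.epsilon == eps)  # True: this is what Python reports
```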
| Format | Bits | Decimal digits | $\varepsilon_{\text{mach}}$ | Used in |
|---|---|---|---|---|
| Half precision | 16 | ~3 | $\approx9.8\times10^{-4}$ | GPU/ML training |
| Single precision | 32 | ~7 | $\approx1.2\times10^{-7}$ | Graphics, embedded |
| Double precision | 64 | ~16 | $\approx2.2\times10^{-16}$ | MATLAB, NumPy default |
| Extended / Quad | 80 / 128 | ~19 / ~34 | $\approx1.1\times10^{-19}$ / $\approx1.9\times10^{-34}$ | Numerical validation |
Floating-point numbers are not uniformly distributed on the real line — they are dense near zero and sparse far from zero. Each interval $[2^k, 2^{k+1}]$ contains exactly the same number of floats. Click anywhere on the number line to place your value and see the rounding error.
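NumPy exposes the local gap between neighboring doubles directly via `np.spacing`, confirming the binade picture:

```python
import numpy as np

# Each binade [2^k, 2^{k+1}) holds 2^52 equally spaced doubles,
# so the gap between neighbors is 2^(k-52).
print(np.spacing(1.0))         # 2^-52, about 2.22e-16
print(np.spacing(2.0 ** 20))   # 2^-32, about 2.33e-10
print(np.spacing(2.0 ** -20))  # 2^-72, about 2.12e-22
```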
Catastrophic Cancellation
Subtracting two nearly-equal floating-point numbers destroys significant digits, amplifying relative error by a factor of $1/(1 - x_2/x_1)$.
⚠ Example: Catastrophic Cancellation
$c=3.4215298$, $d=3.4213851$. In 6-digit decimal arithmetic:
$c'=0.342153\times10^1$, $d'=0.342139\times10^1$ — both accurate to $\sim10^{-7}$ relatively
$c'-d' = 0.140\times10^{-3}$, but exact value $= 0.1447\times10^{-3}$ — the relative error jumps from $\sim10^{-7}$ to $\approx3\%$
Then $1/(c'-d')=7143$ vs exact $\approx6911$ — only the first two digits can be trusted
The four leading significant digits cancelled in the subtraction, leaving only two accurate digits in the result.
📘 Example 7.1 — Numerically Stable Quadratic Formula
Solve $x^2-62x+1=0$. Exact roots: $x=31\pm\sqrt{960}\approx31\pm30.984$.
Naive formula
$x_2=31-30.984=0.016$ — only 2 sig figs! All others cancelled.
Stable version
Compute the large root first: $x_1=31+\sqrt{960}\approx61.984$ (accurate, no cancellation).
Then use Vieta's formula: $x_1 x_2 = c/a = 1$, so $x_2 = 1/x_1 = 1/61.984\approx0.016133$ — full precision!
Rule: compute first the root whose numerator adds two quantities of the same sign (no cancellation), then obtain the other root from Vieta's relation $x_1x_2=c/a$.
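A sketch of the stable formula in Python (the function name is ours; it implements the sign trick plus Vieta's relation):

```python
import math

def stable_quadratic_roots(a, b, c):
    """Real roots of ax^2 + bx + c = 0, avoiding cancellation.
    Assumes a != 0 and a nonnegative discriminant."""
    disc = math.sqrt(b * b - 4 * a * c)
    # q = -(b + sign(b)*disc)/2 adds two same-signed terms: no cancellation.
    q = -0.5 * (b + math.copysign(disc, b))
    x1 = q / a          # large-magnitude root, computed directly
    x2 = c / q          # small root via Vieta: x1 * x2 = c / a
    return x1, x2

x1, x2 = stable_quadratic_roots(1.0, -62.0, 1.0)
# x1 ~ 61.9839, x2 ~ 0.0161332: full precision, unlike 31 - sqrt(960)
```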
Section 7.2
Gaussian Elimination (Numerically)
Gaussian elimination (Algorithm 7.2.1 in Leon) reduces $A$ to upper triangular form $U$ in $n-1$ steps. At step $k$, it eliminates entries below the diagonal in column $k$ using multipliers $m_{ik}=a_{ik}^{(k)}/a_{kk}^{(k)}$.
Operation Count
Full Gaussian elimination costs:
$$\frac{2}{3}n^3 + O(n^2) \text{ multiplications/additions}$$
| $n$ | Operations | Time @ $10^9$ flops/s |
|---|---|---|
| 100 | $6.7\times10^5$ | 0.67 ms |
| 1,000 | $6.7\times10^8$ | 0.67 s |
| 10,000 | $6.7\times10^{11}$ | 11 min |
| 100,000 | $6.7\times10^{14}$ | 8 days |
For $n>10{,}000$, direct methods are replaced by iterative methods (Krylov, multigrid).
LU Factorization
Gaussian elimination implicitly computes $A=LU$ where $L$ is unit lower triangular (1s on diagonal, multipliers below) and $U$ is upper triangular. The LU factorization is the key to solving multiple systems with the same $A$:
LU factorization step by step. Click "Step" to apply one elimination step, watching $L$ fill in from the left and $U$ emerge on the right. Click "New RHS" to see how a new $\mathbf{b}$ is solved in $O(n^2)$ without re-factoring $A$.
📘 Example 7.2 — LU Factorization
$$A=\begin{pmatrix}2&3&1\\4&7&3\\-2&-5&-1\end{pmatrix}$$
Step 1: $m_{21}=4/2=2$, $m_{31}=-2/2=-1$. Eliminate below pivot in col 1:
$$A^{(2)}=\begin{pmatrix}2&3&1\\0&1&1\\0&-2&0\end{pmatrix}$$
Step 2: $m_{32}=-2/1=-2$. Eliminate below pivot in col 2:
$$U=\begin{pmatrix}2&3&1\\0&1&1\\0&0&2\end{pmatrix},\qquad L=\begin{pmatrix}1&0&0\\2&1&0\\-1&-2&1\end{pmatrix}$$
Verify: $LU=A$ ✓. Solve $L\mathbf{y}=\mathbf{b}=(1,1,1)^T$ (forward sub): $y_1=1$, $y_2=1-2y_1=-1$, $y_3=1+y_1+2y_2=0$.
Solve $U\mathbf{x}=\mathbf{y}$ (back sub): $x_3=0/2=0$, $x_2=-1-x_3=-1$, $x_1=(1-3x_2-x_3)/2=2$.
$\mathbf{x}=(2,-1,0)^T$ (check: $A\mathbf{x}=(1,1,1)^T$ ✓). The LU step cost $O(n^3)$; each new $\mathbf{b}$ costs only $O(n^2)$.
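Reusing a factorization is exactly how library solvers work. A minimal NumPy sketch with the $L$ and $U$ of Example 7.2 (`forward_sub`/`back_sub` are our own illustrative helpers):

```python
import numpy as np

# L and U from Example 7.2; any new right-hand side now costs only O(n^2).
L = np.array([[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [-1.0, -2.0, 1.0]])
U = np.array([[2.0, 3.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0]])
A = L @ U                                # recovers the original A
b = np.array([1.0, 1.0, 1.0])

def forward_sub(L, b):
    """Solve L y = b for unit lower triangular L in O(n^2)."""
    y = np.zeros_like(b)
    for i in range(len(b)):
        y[i] = b[i] - L[i, :i] @ y[:i]
    return y

def back_sub(U, y):
    """Solve U x = y for upper triangular U in O(n^2)."""
    x = np.zeros_like(y)
    for i in reversed(range(len(y))):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

x = back_sub(U, forward_sub(L, b))       # x = (2, -1, 0)
```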
Section 7.3
Pivoting Strategies
Without pivoting, elimination breaks when a pivot entry $a_{kk}^{(k)}=0$. Worse: even a very small pivot amplifies errors in proportion to $|a_{ij}/a_{kk}|$. The solution is systematic row reordering.
Partial Pivoting
Partial Pivoting Algorithm
Before each elimination step $k$, swap row $k$ with the row $i\geq k$ that has the largest $|a_{ik}|$. This guarantees all multipliers $|m_{ik}|\leq1$. Result: $PA=LU$ where $P$ is a permutation matrix and every entry of $L$ satisfies $|\ell_{ij}|\leq1$.
Partial pivoting on $A=\begin{pmatrix}0.001&2&3\\4&5&6\\7&8&9\end{pmatrix}$. Without pivoting, the tiny pivot $0.001$ creates multipliers of 4000 and 7000 — catastrophic. Click "Step (Partial)" or "Step (No Pivot)" to compare. Watch how the multiplier sizes differ dramatically.
📘 Example 7.3 — Why Small Pivots Destroy Accuracy
Solve $\begin{pmatrix}0.0001&1\\1&1\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}1\\2\end{pmatrix}$ in 3-digit decimal arithmetic. Exact solution: $x_1\approx1.0001$, $x_2\approx0.9999$.
Without pivoting
Multiplier $m=1/0.0001=10^4$. Elimination gives $u_{22}=\text{fl}(1-10^4\cdot1)=-10^4$ (rounded from $-9999$) and $b_2'=\text{fl}(2-10^4\cdot1)=-10^4$ (rounded from $-9998$). Back substitution: $x_2=1.00$, then $x_1=(1-1.00)/0.0001=0$. The computed $x_1$ is 100% wrong: a rounding error of size $1$ in $b_2'$ was amplified by the multiplier $10^4$.
With partial pivoting
Swap the rows so the pivot is $1$. Now $m=0.0001$, $u_{22}=\text{fl}(1-0.0001\cdot1)=1.00$, $b_2'=\text{fl}(1-0.0001\cdot2)=1.00$. Back substitution: $x_2=1.00$, $x_1=2-1.00=1.00$. Both components are correct to working precision.
The key insight: with partial pivoting, the growth factor is bounded by $2^{n-1}$ (vs unbounded without). In practice it rarely exceeds $n$. Partial pivoting is the standard in all production codes.
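The contrast can be reproduced by simulating $t$-digit decimal arithmetic in a few lines of Python. `fl` and `solve2` below are our own illustrative helpers, not library routines, and the $2\times2$ system with pivot $10^{-4}$ is a standard textbook instance:

```python
import math

def fl(x, t=3):
    """Round x to t significant decimal digits (simulated 3-digit arithmetic)."""
    if x == 0:
        return 0.0
    e = math.floor(math.log10(abs(x)))
    return round(x, -(e - t + 1))

def solve2(a11, a12, a21, a22, b1, b2):
    """2x2 Gaussian elimination with every operation rounded by fl."""
    m = fl(a21 / a11)
    u22 = fl(a22 - fl(m * a12))
    c2 = fl(b2 - fl(m * b1))
    x2 = fl(c2 / u22)
    x1 = fl(fl(b1 - fl(a12 * x2)) / a11)
    return x1, x2

# Tiny pivot 0.0001 first: x1 collapses to 0 (exact answer is ~1.0001).
print(solve2(1e-4, 1, 1, 1, 1, 2))
# Rows swapped, as partial pivoting would do: both components come out right.
print(solve2(1, 1, 1e-4, 1, 2, 1))
```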
Section 7.4
Matrix Norms & Condition Numbers
Vector and Matrix Norms
- $\|\mathbf{x}\|_1=\sum_i|x_i|$: Manhattan distance (taxicab)
- $\|\mathbf{x}\|_2=\sqrt{\sum_i x_i^2}$: Euclidean length
- $\|\mathbf{x}\|_\infty=\max_i|x_i|$: Chebyshev / max norm
- $\|A\|_1=\max_j\sum_i|a_{ij}|$: maximum absolute column sum
- $\|A\|_\infty=\max_i\sum_j|a_{ij}|$: maximum absolute row sum
- $\|A\|_2=\sigma_1(A)$: maximum stretch factor (largest singular value)
- $\|A\|_F=\sqrt{\sum_{i,j}a_{ij}^2}$: Frobenius / entry-wise norm
All satisfy $\|AB\|\leq\|A\|\|B\|$ and $\|A\mathbf{x}\|\leq\|A\|\|\mathbf{x}\|$.
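All of these are one-liners in NumPy (shown on the matrix from Example 7.4):

```python
import numpy as np

A = np.array([[3.0, 3.0], [4.0, 5.0]])
n1   = np.linalg.norm(A, 1)        # max column sum: max(3+4, 3+5) = 8
ninf = np.linalg.norm(A, np.inf)   # max row sum:    max(3+3, 4+5) = 9
n2   = np.linalg.norm(A, 2)        # largest singular value sigma_1
nf   = np.linalg.norm(A, 'fro')    # sqrt(9+9+16+25) = sqrt(59)
print(n1, ninf, n2, nf)
```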
Condition Number — Measuring Sensitivity
Condition Number
$$\text{cond}(A) = \|A\|\cdot\|A^{-1}\|$$
In the 2-norm: $\text{cond}_2(A)=\sigma_1/\sigma_n$ (ratio of largest to smallest singular value). For the perturbed system $(A+E)\tilde{\mathbf{x}}=\mathbf{b}$:
$$\frac{\|\mathbf{x}-\tilde{\mathbf{x}}\|}{\|\mathbf{x}\|}\leq\text{cond}(A)\cdot\frac{\|E\|}{\|A\|}$$
Rule of thumb: if cond$(A)\approx10^k$, then solving $A\mathbf{x}=\mathbf{b}$ in double precision loses about $k$ decimal digits of accuracy.
Geometric view of ill-conditioning. Two nearly-parallel lines: a small change in one equation shifts the intersection dramatically. Drag the slider to vary how nearly-parallel they are. Notice how the intersection jumps as the lines approach parallel — this is large condition number in action.
📘 Example 7.4 — Condition Number Computation
$$A=\begin{pmatrix}3&3\\4&5\end{pmatrix},\quad A^{-1}=\frac{1}{3}\begin{pmatrix}5&-3\\-4&3\end{pmatrix}$$
Using $\infty$-norm:
$$\|A\|_\infty=\max(3+3,\;4+5)=9,\qquad\|A^{-1}\|_\infty=\tfrac{1}{3}\max(5+3,\;4+3)=\tfrac{8}{3}$$
$$\text{cond}_\infty(A)=9\times\tfrac{8}{3}=24$$
What this means
A 1% relative change in $A$ can produce up to a 24% relative change in the solution. With double precision ($\varepsilon_{\text{mach}}\approx2.2\times10^{-16}$), we lose about $\log_{10}(24)\approx1.4$ decimal digits.
cond $=24$ — well-conditioned. Compare: the Hilbert matrix $H_{10}$ has cond $\approx1.6\times10^{13}$, meaning about 13 digits of accuracy are lost — the computed solution is numerically meaningless in double precision.
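The Hilbert comparison is easy to reproduce (a NumPy sketch; `np.linalg.cond` computes $\sigma_1/\sigma_n$ via the SVD):

```python
import numpy as np

# Hilbert matrix H_ij = 1/(i+j+1): the classic ill-conditioned example.
n = 10
i, j = np.indices((n, n))
H = 1.0 / (i + j + 1)

kappa = np.linalg.cond(H)   # 2-norm condition number sigma_1 / sigma_n
print(kappa)                # ~1.6e13: about 13 digits lost
```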
Section 7.5
Orthogonal Transformations
Orthogonal matrices $Q$ ($Q^TQ=I$) preserve the 2-norm exactly: $\|Q\mathbf{x}\|_2=\|\mathbf{x}\|_2$. Applying $Q$ introduces no amplification of rounding errors. This makes orthogonal transformations the gold standard for numerically stable algorithms.
Householder Reflections
Householder Matrix $H = I - \frac{2}{\|\mathbf{v}\|^2}\mathbf{v}\mathbf{v}^T$
Given $\mathbf{x}\in\mathbb{R}^n$, choose $\mathbf{v}=\mathbf{x}-\alpha\mathbf{e}_1$ where $\alpha=\pm\|\mathbf{x}\|_2$ (choose sign to avoid cancellation: $\alpha=-\text{sign}(x_1)\|\mathbf{x}\|$). Then:
$$H\mathbf{x} = \alpha\mathbf{e}_1 = (\pm\|\mathbf{x}\|,\;0,\;0,\;\ldots,\;0)^T$$
$H$ is symmetric ($H^T=H$), orthogonal ($H^2=I$), and self-inverse. We only store $\mathbf{v}$ and $\beta=\|\mathbf{v}\|^2/2$, not the full $n\times n$ matrix. For any $\mathbf{y}$: $H\mathbf{y}=\mathbf{y}-\frac{2}{\|\mathbf{v}\|^2}(\mathbf{v}^T\mathbf{y})\mathbf{v}$ — costs $O(n)$, not $O(n^2)$.
Applying Householder reflections column by column computes the QR factorization $A=QR$ stably. Since each $H_k$ is orthogonal and $H_{n-1}\cdots H_1 A=R$, we get $A=H_1\cdots H_{n-1}R=QR$.
📘 Example 7.5 — Householder Reflection
$\mathbf{x}=(4,4,2)^T$. Find $H$ zeroing the last two components.
$$\alpha=\|\mathbf{x}\|=6,\quad \mathbf{v}=\mathbf{x}-\alpha\mathbf{e}_1=(-2,4,2)^T,\quad\beta=\|\mathbf{v}\|^2/2=(4+16+4)/2=12$$
$$H=I-\frac{1}{12}\begin{pmatrix}4&-8&-4\\-8&16&8\\-4&8&4\end{pmatrix}=\begin{pmatrix}2/3&2/3&1/3\\2/3&-1/3&-2/3\\1/3&-2/3&2/3\end{pmatrix}$$
Verify: $H\mathbf{x}=(6,0,0)^T$ ✓. $H$ is orthogonal: $H^TH=I$ ✓. We've zeroed two components in one step.
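The $O(n)$ application rule is easy to check in NumPy. This sketch mirrors Example 7.5, including its sign choice $\alpha=+\|\mathbf{x}\|$ (production codes flip the sign as described above):

```python
import numpy as np

# Householder vector for x = (4, 4, 2), as in Example 7.5: v = x - ||x|| e1.
x = np.array([4.0, 4.0, 2.0])
alpha = np.linalg.norm(x)                    # 6.0
v = x - alpha * np.array([1.0, 0.0, 0.0])    # (-2, 4, 2)

def apply_H(y, v):
    """H y = y - (2 v.y / v.v) v: O(n) per vector; H is never formed."""
    return y - (2.0 * (v @ y) / (v @ v)) * v

print(apply_H(x, v))   # [6. 0. 0.]
```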
Givens Rotations
A Givens rotation $G(i,j,\theta)$ rotates in the $(i,j)$-plane, zeroing out a single targeted entry. More surgical than Householder: ideal for sparse matrices or updating an existing factorization. To zero $x_j$ against $x_i$, set $r=\sqrt{x_i^2+x_j^2}$, $c=x_i/r$, $s=-x_j/r$; then in the $(i,j)$-plane
$$\begin{pmatrix}c&-s\\s&c\end{pmatrix}\begin{pmatrix}x_i\\x_j\end{pmatrix}=\begin{pmatrix}r\\0\end{pmatrix}$$
Section 7.6
Eigenvalue Computation
Power Iteration
The simplest iterative eigenvalue method: starting from $\mathbf{v}_0$, iterate $\mathbf{v}_{k+1}=A\mathbf{v}_k/\|A\mathbf{v}_k\|$. Convergence is geometric with rate $|\lambda_2/\lambda_1|$ — fast when the dominant eigenvalue is well-separated.
Power iteration on $A=\begin{pmatrix}3&1\\1&3\end{pmatrix}$ (eigenvalues $\lambda_1=4$, $\lambda_2=2$). The red trail shows $\mathbf{v}_k$ spiraling toward the dominant eigenvector $(1,1)/\sqrt{2}$ (dashed line). The Rayleigh quotient $\rho_k=\mathbf{v}_k^TA\mathbf{v}_k$ converges to $\lambda_1=4$. Click "Step" to iterate.
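In code, power iteration on this matrix takes a few lines (a NumPy sketch):

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])   # eigenvalues 4 and 2
v = np.array([1.0, 0.0])                 # any start not orthogonal to (1,1)

for _ in range(50):                      # error shrinks like (2/4)^k
    w = A @ v
    v = w / np.linalg.norm(w)            # v_{k+1} = A v_k / ||A v_k||

rho = v @ A @ v                          # Rayleigh quotient
print(rho)                               # converges to lambda_1 = 4
```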
QR Iteration (Francis Algorithm)
The definitive eigenvalue algorithm, used in LAPACK's dsyev/dgeev:
Reduce $A$ to upper Hessenberg form $H=Q_0^TAQ_0$ using Householder transformations ($O(n^3)$ once)
At step $k$: form $H_k-\alpha_k I=Q_kR_k$, set $H_{k+1}=R_kQ_k+\alpha_k I$ (similar to $H_k$)
With shifts $\alpha_k\approx\lambda_j$, convergence is cubic — the last row zeros out in a few steps
Eigenvalues appear on the diagonal as $H_k\to$ upper triangular (Schur form)
Why This Works
Each $H_{k+1}=Q_k^TH_kQ_k$ — similar to $H_k$, so eigenvalues are preserved. The shifts $\alpha_k$ are chosen as eigenvalues of the trailing $2\times2$ submatrix, accelerating the zeroing of subdiagonal entries. Total cost: $O(n^3)$ for all $n$ eigenvalues — same as a single matrix multiply, but obtaining the complete eigensystem.
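A bare-bones version of the iteration (unshifted and without Hessenberg reduction, so convergence is only linear at rate $|\lambda_2/\lambda_1|$; this is a sketch of the idea, not LAPACK's implementation):

```python
import numpy as np

H = np.array([[3.0, 1.0], [1.0, 3.0]])   # eigenvalues 4 and 2
for _ in range(60):
    Q, R = np.linalg.qr(H)               # H_k = Q_k R_k
    H = R @ Q                            # H_{k+1} = R_k Q_k, similar to H_k
print(np.diag(H))                        # diagonal approaches the eigenvalues
```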
Section 7.7
Numerical Least Squares
Three approaches to $\min_{\mathbf{x}}\|A\mathbf{x}-\mathbf{b}\|_2^2$, in increasing numerical stability: the normal equations $A^TA\mathbf{x}=A^T\mathbf{b}$ (cheapest, but squares the condition number, $\text{cond}_2(A^TA)=\text{cond}_2(A)^2$), QR factorization (the stable workhorse), and the SVD (most robust, and the only one that handles rank deficiency gracefully).
For $A=U\Sigma V^T$ with rank $r$, the pseudoinverse is $A^+=V\Sigma^+U^T$ where $\Sigma^+=\text{diag}(1/\sigma_1,\ldots,1/\sigma_r,0,\ldots,0)$. The solution $\hat{\mathbf{x}}=A^+\mathbf{b}$ is:
The minimum-norm least squares solution (among all minimizers of $\|A\mathbf{x}-\mathbf{b}\|^2$, it has the smallest $\|\mathbf{x}\|$)
The unique solution satisfying the four Penrose conditions: $AA^+A=A$, $A^+AA^+=A^+$, $(AA^+)^T=AA^+$, $(A^+A)^T=A^+A$
📘 Example 7.6 — Pseudoinverse for Rank-Deficient System
$$A=\begin{pmatrix}1&1\\1&1\\0&0\end{pmatrix},\quad\mathbf{b}=\begin{pmatrix}2\\2\\0\end{pmatrix}$$
$A^TA=\begin{pmatrix}2&2\\2&2\end{pmatrix}$ is singular — normal equations fail. SVD: $\sigma_1=2$, $\sigma_2=0$.
$$A^+=\frac{1}{4}\begin{pmatrix}1&1&0\\1&1&0\end{pmatrix},\quad\hat{\mathbf{x}}=A^+\mathbf{b}=\begin{pmatrix}1\\1\end{pmatrix}$$
The solution set is $\hat{\mathbf{x}}+N(A)$ — infinitely many solutions, all minimizing $\|A\mathbf{x}-\mathbf{b}\|$. The pseudoinverse gives the one with smallest norm: $\|\hat{\mathbf{x}}\|=\sqrt{2}$.
Any $\mathbf{x}=(1+t, 1-t)^T$ is a least squares solution, but $A^+\mathbf{b}=(1,1)^T$ is the unique minimum-norm one.
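NumPy's least-squares routines use exactly this SVD machinery; for the rank-deficient system of Example 7.6:

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
b = np.array([2.0, 2.0, 0.0])

x_hat = np.linalg.pinv(A) @ b            # minimum-norm solution (1, 1)
# lstsq (SVD-based) also returns the minimum-norm solution when A is
# rank-deficient, along with the numerical rank and singular values:
x_lsq, res, rank, sigma = np.linalg.lstsq(A, b, rcond=None)
print(x_hat, rank, sigma)                # rank 1, singular values ~ (2, 0)
```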
🔭 Going Further
Iterative methods (Krylov): For $n=10^6$, dense LU is out of reach. Conjugate Gradient (for symmetric positive definite $A$) converges in $O(\sqrt{\kappa})$ iterations using only matrix-vector products; GMRES handles general $A$. Used in finite element analysis, fluid dynamics, quantum chemistry
Preconditioners: Replace $A\mathbf{x}=\mathbf{b}$ with $M^{-1}A\mathbf{x}=M^{-1}\mathbf{b}$ where cond$(M^{-1}A)\ll$ cond$(A)$. The "right" preconditioner is problem-specific and often the hardest part
Randomized linear algebra: Sketch $A$ to a smaller matrix using random projections, solve approximately, then refine. Achieves near-optimal complexity for massive data (Johnson-Lindenstrauss lemma)
LAPACK/BLAS: The reference implementation of all these algorithms — used inside NumPy, SciPy, MATLAB, Julia, R. Every routine in this chapter has a LAPACK counterpart: dgesv (LU), dgeqrf (QR), dgesvd (SVD), dsyev (symmetric eigensystem)