Leon 8th Edition · Beyond the Syllabus
Chapter 7: Numerical Linear Algebra
Floating-point arithmetic · LU factorization · Pivoting · Condition numbers · Householder & Givens · QR iteration · Pseudoinverse
When we run algorithms from Chapters 1–6 on a computer, most real numbers can't be stored exactly. Tiny rounding errors compound through millions of operations. A system that looks fine mathematically can be numerically catastrophic. This chapter asks: how do we solve linear algebra problems stably and efficiently on finite-precision hardware? These are the methods inside MATLAB, NumPy, LAPACK, and every serious scientific computing library.
Section 7.1
Floating-Point Numbers
Computers store real numbers in floating-point format: $\pm\, 0.d_1d_2\cdots d_t \times b^e$ where $b$ is the base (typically 2), $t$ is the number of stored digits (the precision), and $e$ is the exponent. Not all real numbers are representable — only finitely many floats exist.
IEEE 754 Double Precision and Machine Epsilon
Machine Epsilon
The machine epsilon $\varepsilon_{\text{mach}}$ is the gap between $1$ and the next larger floating-point number. For IEEE 754 double precision (64-bit, the default in Python/MATLAB/NumPy):
$$\varepsilon_{\text{mach}} = 2^{-52} \approx 2.22\times10^{-16}$$
Every arithmetic operation introduces a relative error of at most $\varepsilon_{\text{mach}}$: if $x$ is exact, the stored value satisfies $\text{fl}(x)=x(1+\delta)$ where $|\delta|\leq\varepsilon_{\text{mach}}$.
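Both facts are directly observable in Python (a quick check, not from the text):

```python
import sys

# The gap between 1.0 and the next double is 2^-52; adding half of it
# rounds back to 1.0 under round-to-nearest-even.
eps = 2.0 ** -52
print(1.0 + eps > 1.0)                # True: 1 + eps is representable
print(1.0 + eps / 2 == 1.0)           # True: rounds back to exactly 1
print(sys.float_info.epsilon == eps)  # True: this is what Python reports
```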
| Format | Bits | Decimal digits | $\varepsilon_{\text{mach}}$ | Used in |
|---|---|---|---|---|
| Half precision | 16 | ~3 | $\approx9.8\times10^{-4}$ | GPU/ML training |
| Single precision | 32 | ~7 | $\approx1.2\times10^{-7}$ | Graphics, embedded |
| Double precision | 64 | ~16 | $\approx2.2\times10^{-16}$ | MATLAB, NumPy default |
| Extended / Quad | 80 / 128 | ~19 / ~34 | $\approx1.1\times10^{-19}$ / $\approx1.9\times10^{-34}$ | Numerical validation |
Floating-point numbers are not uniformly distributed on the real line — they are dense near zero and sparse far from zero. Each interval $[2^k, 2^{k+1}]$ contains exactly the same number of floats. Click anywhere on the number line to place your value and see the rounding error.
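NumPy exposes the local gap between neighboring doubles directly via `np.spacing`, confirming the binade picture:

```python
import numpy as np

# Each binade [2^k, 2^{k+1}) holds 2^52 equally spaced doubles,
# so the gap between neighbors is 2^(k-52).
print(np.spacing(1.0))         # 2^-52, about 2.22e-16
print(np.spacing(2.0 ** 20))   # 2^-32, about 2.33e-10
print(np.spacing(2.0 ** -20))  # 2^-72, about 2.12e-22
```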
Catastrophic Cancellation
Subtracting two nearly-equal floating-point numbers destroys significant digits, amplifying relative error by a factor of $1/(1 - x_2/x_1)$.
⚠ Example: Catastrophic Cancellation
$c=3.4215298$, $d=3.4213851$. In 6-digit decimal arithmetic:
$c'=0.342153\times10^1$, $d'=0.342139\times10^1$ — both accurate to $\sim10^{-7}$ relatively
$c'-d' = 0.140\times10^{-3}$, but exact value $= 0.1447\times10^{-3}$ — the relative error jumps from $\sim10^{-7}$ to $\approx3\%$
Then $1/(c'-d')=7143$ vs exact $\approx6911$ — only the first two digits can be trusted
The four leading significant digits cancelled in the subtraction, leaving only two accurate digits in the result.
📘 Example 7.1 — Numerically Stable Quadratic Formula
Solve $x^2-62x+1=0$. Exact roots: $x=31\pm\sqrt{960}\approx31\pm30.984$.
Naive formula
$x_2=31-30.984=0.016$ — only 2 sig figs! All others cancelled.
Stable version
Compute the large root first: $x_1=31+\sqrt{960}\approx61.984$ (accurate, no cancellation).
Then use Vieta's formula: $x_1 x_2 = c/a = 1$, so $x_2 = 1/x_1 = 1/61.984\approx0.016133$ — full precision!
Rule: compute first the root whose numerator adds two quantities of the same sign (no cancellation), then obtain the other root from Vieta's relation $x_1x_2=c/a$.
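A sketch of the stable formula in Python (the function name is ours; it implements the sign trick plus Vieta's relation):

```python
import math

def stable_quadratic_roots(a, b, c):
    """Real roots of ax^2 + bx + c = 0, avoiding cancellation.
    Assumes a != 0 and a nonnegative discriminant."""
    disc = math.sqrt(b * b - 4 * a * c)
    # q = -(b + sign(b)*disc)/2 adds two same-signed terms: no cancellation.
    q = -0.5 * (b + math.copysign(disc, b))
    x1 = q / a          # large-magnitude root, computed directly
    x2 = c / q          # small root via Vieta: x1 * x2 = c / a
    return x1, x2

x1, x2 = stable_quadratic_roots(1.0, -62.0, 1.0)
# x1 ~ 61.9839, x2 ~ 0.0161332: full precision, unlike 31 - sqrt(960)
```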
Section 7.2
Gaussian Elimination (Numerically)
Gaussian elimination (Algorithm 7.2.1 in Leon) reduces $A$ to upper triangular form $U$ in $n-1$ steps. At step $k$, it eliminates entries below the diagonal in column $k$ using multipliers $m_{ik}=a_{ik}^{(k)}/a_{kk}^{(k)}$.
Operation Count
Full Gaussian elimination costs:
$$\frac{2}{3}n^3 + O(n^2) \text{ multiplications/additions}$$
| $n$ | Operations | Time @ $10^9$ flops/s |
|---|---|---|
| 100 | $6.7\times10^5$ | 0.67 ms |
| 1,000 | $6.7\times10^8$ | 0.67 s |
| 10,000 | $6.7\times10^{11}$ | 11 min |
| 100,000 | $6.7\times10^{14}$ | 8 days |
For $n>10{,}000$, direct methods are replaced by iterative methods (Krylov, multigrid).
LU Factorization
Gaussian elimination implicitly computes $A=LU$ where $L$ is unit lower triangular (1s on diagonal, multipliers below) and $U$ is upper triangular. The LU factorization is the key to solving multiple systems with the same $A$:
LU factorization step by step. Click "Step" to apply one elimination step, watching $L$ fill in from the left and $U$ emerge on the right. Click "New RHS" to see how a new $\mathbf{b}$ is solved in $O(n^2)$ without re-factoring $A$.
📘 Example 7.2 — LU Factorization
$$A=\begin{pmatrix}2&3&1\\4&7&3\\-2&-5&-1\end{pmatrix}$$
Step 1: $m_{21}=4/2=2$, $m_{31}=-2/2=-1$. Eliminate below pivot in col 1:
$$A^{(2)}=\begin{pmatrix}2&3&1\\0&1&1\\0&-2&0\end{pmatrix}$$
Step 2: $m_{32}=-2/1=-2$. Eliminate below pivot in col 2:
$$U=\begin{pmatrix}2&3&1\\0&1&1\\0&0&2\end{pmatrix},\qquad L=\begin{pmatrix}1&0&0\\2&1&0\\-1&-2&1\end{pmatrix}$$
Verify: $LU=A$ ✓. Solve $L\mathbf{y}=\mathbf{b}=(1,1,1)^T$ (forward sub): $y_1=1$, $y_2=1-2y_1=-1$, $y_3=1+y_1+2y_2=0$.
Solve $U\mathbf{x}=\mathbf{y}$ (back sub): $x_3=0/2=0$, $x_2=-1-x_3=-1$, $x_1=(1-3x_2-x_3)/2=2$.
$\mathbf{x}=(2,-1,0)^T$ (check: $A\mathbf{x}=(1,1,1)^T$ ✓). The LU step cost $O(n^3)$; each new $\mathbf{b}$ costs only $O(n^2)$.
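Reusing a factorization is exactly how library solvers work. A minimal NumPy sketch with the $L$ and $U$ of Example 7.2 (`forward_sub`/`back_sub` are our own illustrative helpers):

```python
import numpy as np

# L and U from Example 7.2; any new right-hand side now costs only O(n^2).
L = np.array([[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [-1.0, -2.0, 1.0]])
U = np.array([[2.0, 3.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0]])
A = L @ U                                # recovers the original A
b = np.array([1.0, 1.0, 1.0])

def forward_sub(L, b):
    """Solve L y = b for unit lower triangular L in O(n^2)."""
    y = np.zeros_like(b)
    for i in range(len(b)):
        y[i] = b[i] - L[i, :i] @ y[:i]
    return y

def back_sub(U, y):
    """Solve U x = y for upper triangular U in O(n^2)."""
    x = np.zeros_like(y)
    for i in reversed(range(len(y))):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

x = back_sub(U, forward_sub(L, b))       # x = (2, -1, 0)
```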
Section 7.3
Pivoting Strategies
Without pivoting, elimination breaks when a pivot entry $a_{kk}^{(k)}=0$. Worse: even a very small pivot amplifies errors in proportion to $|a_{ij}/a_{kk}|$. The solution is systematic row reordering.
Partial Pivoting
Partial Pivoting Algorithm
Before each elimination step $k$, swap row $k$ with the row $i\geq k$ that has the largest $|a_{ik}|$. This guarantees all multipliers $|m_{ik}|\leq1$. Result: $PA=LU$ where $P$ is a permutation matrix and every entry of $L$ satisfies $|\ell_{ij}|\leq1$.
Partial pivoting on $A=\begin{pmatrix}0.001&2&3\\4&5&6\\7&8&9\end{pmatrix}$. Without pivoting, the tiny pivot $0.001$ creates multipliers of 4000 and 7000 — catastrophic. Click "Step (Partial)" or "Step (No Pivot)" to compare. Watch how the multiplier sizes differ dramatically.
📘 Example 7.3 — Why Small Pivots Destroy Accuracy
Solve $\begin{pmatrix}0.0001&1\\1&1\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix}=\begin{pmatrix}1\\2\end{pmatrix}$ in 3-digit decimal arithmetic. Exact solution: $x_1\approx1.0001$, $x_2\approx0.9999$.
Without pivoting
Multiplier $m=1/0.0001=10^4$. Elimination gives $u_{22}=\text{fl}(1-10^4\cdot1)=-10^4$ (rounded from $-9999$) and $b_2'=\text{fl}(2-10^4\cdot1)=-10^4$ (rounded from $-9998$). Back substitution: $x_2=1.00$, then $x_1=(1-1.00)/0.0001=0$. The computed $x_1$ is 100% wrong: a rounding error of size $1$ in $b_2'$ was amplified by the multiplier $10^4$.
With partial pivoting
Swap the rows so the pivot is $1$. Now $m=0.0001$, $u_{22}=\text{fl}(1-0.0001\cdot1)=1.00$, $b_2'=\text{fl}(1-0.0001\cdot2)=1.00$. Back substitution: $x_2=1.00$, $x_1=2-1.00=1.00$. Both components are correct to working precision.
The key insight: with partial pivoting, the growth factor is bounded by $2^{n-1}$ (vs unbounded without). In practice it rarely exceeds $n$. Partial pivoting is the standard in all production codes.
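The contrast can be reproduced by simulating $t$-digit decimal arithmetic in a few lines of Python. `fl` and `solve2` below are our own illustrative helpers, not library routines, and the $2\times2$ system with pivot $10^{-4}$ is a standard textbook instance:

```python
import math

def fl(x, t=3):
    """Round x to t significant decimal digits (simulated 3-digit arithmetic)."""
    if x == 0:
        return 0.0
    e = math.floor(math.log10(abs(x)))
    return round(x, -(e - t + 1))

def solve2(a11, a12, a21, a22, b1, b2):
    """2x2 Gaussian elimination with every operation rounded by fl."""
    m = fl(a21 / a11)
    u22 = fl(a22 - fl(m * a12))
    c2 = fl(b2 - fl(m * b1))
    x2 = fl(c2 / u22)
    x1 = fl(fl(b1 - fl(a12 * x2)) / a11)
    return x1, x2

# Tiny pivot 0.0001 first: x1 collapses to 0 (exact answer is ~1.0001).
print(solve2(1e-4, 1, 1, 1, 1, 2))
# Rows swapped, as partial pivoting would do: both components come out right.
print(solve2(1, 1, 1e-4, 1, 2, 1))
```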
Section 7.4
Matrix Norms & Condition Numbers
Vector and Matrix Norms
- $\|\mathbf{x}\|_1=\sum_i|x_i|$: Manhattan distance (taxicab)
- $\|\mathbf{x}\|_2=\sqrt{\sum_i x_i^2}$: Euclidean length
- $\|\mathbf{x}\|_\infty=\max_i|x_i|$: Chebyshev / max norm
- $\|A\|_1=\max_j\sum_i|a_{ij}|$: maximum absolute column sum
- $\|A\|_\infty=\max_i\sum_j|a_{ij}|$: maximum absolute row sum
- $\|A\|_2=\sigma_1(A)$: maximum stretch factor (largest singular value)
- $\|A\|_F=\sqrt{\sum_{i,j}a_{ij}^2}$: Frobenius / entry-wise norm
All satisfy $\|AB\|\leq\|A\|\|B\|$ and $\|A\mathbf{x}\|\leq\|A\|\|\mathbf{x}\|$.
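All of these are one-liners in NumPy (shown on the matrix from Example 7.4):

```python
import numpy as np

A = np.array([[3.0, 3.0], [4.0, 5.0]])
n1   = np.linalg.norm(A, 1)        # max column sum: max(3+4, 3+5) = 8
ninf = np.linalg.norm(A, np.inf)   # max row sum:    max(3+3, 4+5) = 9
n2   = np.linalg.norm(A, 2)        # largest singular value sigma_1
nf   = np.linalg.norm(A, 'fro')    # sqrt(9+9+16+25) = sqrt(59)
print(n1, ninf, n2, nf)
```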
Condition Number — Measuring Sensitivity
Condition Number
$$\text{cond}(A) = \|A\|\cdot\|A^{-1}\|$$
In the 2-norm: $\text{cond}_2(A)=\sigma_1/\sigma_n$ (ratio of largest to smallest singular value). For the perturbed system $(A+E)\tilde{\mathbf{x}}=\mathbf{b}$:
$$\frac{\|\mathbf{x}-\tilde{\mathbf{x}}\|}{\|\mathbf{x}\|}\leq\text{cond}(A)\cdot\frac{\|E\|}{\|A\|}$$
Rule of thumb: if cond$(A)\approx10^k$, then solving $A\mathbf{x}=\mathbf{b}$ in double precision loses about $k$ decimal digits of accuracy.
Geometric view of ill-conditioning. Two nearly-parallel lines: a small change in one equation shifts the intersection dramatically. Drag the slider to vary how nearly-parallel they are. Notice how the intersection jumps as the lines approach parallel — this is large condition number in action.
📘 Example 7.4 — Condition Number Computation
$$A=\begin{pmatrix}3&3\\4&5\end{pmatrix},\quad A^{-1}=\frac{1}{3}\begin{pmatrix}5&-3\\-4&3\end{pmatrix}$$
Using $\infty$-norm:
$$\|A\|_\infty=\max(3+3,\;4+5)=9,\qquad\|A^{-1}\|_\infty=\tfrac{1}{3}\max(5+3,\;4+3)=\tfrac{8}{3}$$
$$\text{cond}_\infty(A)=9\times\tfrac{8}{3}=24$$
What this means
A 1% relative change in $A$ can produce up to a 24% relative change in the solution. With double precision ($\varepsilon_{\text{mach}}\approx2.2\times10^{-16}$), we lose about $\log_{10}(24)\approx1.4$ decimal digits.
cond $=24$ — well-conditioned. Compare: the Hilbert matrix $H_{10}$ has cond $\approx1.6\times10^{13}$, meaning about 13 digits of accuracy are lost — the computed solution is numerically meaningless in double precision.
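The Hilbert comparison is easy to reproduce (a NumPy sketch; `np.linalg.cond` computes $\sigma_1/\sigma_n$ via the SVD):

```python
import numpy as np

# Hilbert matrix H_ij = 1/(i+j+1): the classic ill-conditioned example.
n = 10
i, j = np.indices((n, n))
H = 1.0 / (i + j + 1)

kappa = np.linalg.cond(H)   # 2-norm condition number sigma_1 / sigma_n
print(kappa)                # ~1.6e13: about 13 digits lost
```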
Section 7.5
Orthogonal Transformations
Orthogonal matrices $Q$ ($Q^TQ=I$) preserve the 2-norm exactly: $\|Q\mathbf{x}\|_2=\|\mathbf{x}\|_2$. Applying $Q$ introduces no amplification of rounding errors. This makes orthogonal transformations the gold standard for numerically stable algorithms.
Householder Reflections
Householder Matrix $H = I - \frac{2}{\|\mathbf{v}\|^2}\mathbf{v}\mathbf{v}^T$
Given $\mathbf{x}\in\mathbb{R}^n$, choose $\mathbf{v}=\mathbf{x}-\alpha\mathbf{e}_1$ where $\alpha=\pm\|\mathbf{x}\|_2$ (choose sign to avoid cancellation: $\alpha=-\text{sign}(x_1)\|\mathbf{x}\|$). Then:
$$H\mathbf{x} = \alpha\mathbf{e}_1 = (\pm\|\mathbf{x}\|,\;0,\;0,\;\ldots,\;0)^T$$
$H$ is symmetric ($H^T=H$), orthogonal ($H^2=I$), and self-inverse. We only store $\mathbf{v}$ and $\beta=\|\mathbf{v}\|^2/2$, not the full $n\times n$ matrix. For any $\mathbf{y}$: $H\mathbf{y}=\mathbf{y}-\frac{2}{\|\mathbf{v}\|^2}(\mathbf{v}^T\mathbf{y})\mathbf{v}$ — costs $O(n)$, not $O(n^2)$.
Applying Householder reflections column by column computes the QR factorization $A=QR$ stably. Since each $H_k$ is orthogonal and $H_{n-1}\cdots H_1 A=R$, we get $A=H_1\cdots H_{n-1}R=QR$.
📘 Example 7.5 — Householder Reflection
$\mathbf{x}=(4,4,2)^T$. Find $H$ zeroing the last two components.
$$\alpha=\|\mathbf{x}\|=6,\quad \mathbf{v}=\mathbf{x}-\alpha\mathbf{e}_1=(-2,4,2)^T,\quad\beta=\|\mathbf{v}\|^2/2=(4+16+4)/2=12$$
$$H=I-\frac{1}{12}\begin{pmatrix}4&-8&-4\\-8&16&8\\-4&8&4\end{pmatrix}=\begin{pmatrix}2/3&2/3&1/3\\2/3&-1/3&-2/3\\1/3&-2/3&2/3\end{pmatrix}$$
Verify: $H\mathbf{x}=(6,0,0)^T$ ✓. $H$ is orthogonal: $H^TH=I$ ✓. We've zeroed two components in one step.
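The $O(n)$ application rule is easy to check in NumPy. This sketch mirrors Example 7.5, including its sign choice $\alpha=+\|\mathbf{x}\|$ (production codes flip the sign as described above):

```python
import numpy as np

# Householder vector for x = (4, 4, 2), as in Example 7.5: v = x - ||x|| e1.
x = np.array([4.0, 4.0, 2.0])
alpha = np.linalg.norm(x)                    # 6.0
v = x - alpha * np.array([1.0, 0.0, 0.0])    # (-2, 4, 2)

def apply_H(y, v):
    """H y = y - (2 v.y / v.v) v: O(n) per vector; H is never formed."""
    return y - (2.0 * (v @ y) / (v @ v)) * v

print(apply_H(x, v))   # [6. 0. 0.]
```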
Givens Rotations
A Givens rotation $G(i,j,\theta)$ rotates in the $(i,j)$-plane, zeroing out a single targeted entry. More surgical than Householder: ideal for sparse matrices or updating an existing factorization. To zero $x_j$ against $x_i$, set $r=\sqrt{x_i^2+x_j^2}$, $c=x_i/r$, $s=-x_j/r$; then in the $(i,j)$-plane
$$\begin{pmatrix}c&-s\\s&c\end{pmatrix}\begin{pmatrix}x_i\\x_j\end{pmatrix}=\begin{pmatrix}r\\0\end{pmatrix}$$
Section 7.6
Eigenvalue Computation
Power Iteration
The simplest iterative eigenvalue method: starting from $\mathbf{v}_0$, iterate $\mathbf{v}_{k+1}=A\mathbf{v}_k/\|A\mathbf{v}_k\|$. Convergence is geometric with rate $|\lambda_2/\lambda_1|$ — fast when the dominant eigenvalue is well-separated.
Power iteration on $A=\begin{pmatrix}3&1\\1&3\end{pmatrix}$ (eigenvalues $\lambda_1=4$, $\lambda_2=2$). The red trail shows $\mathbf{v}_k$ spiraling toward the dominant eigenvector $(1,1)/\sqrt{2}$ (dashed line). The Rayleigh quotient $\rho_k=\mathbf{v}_k^TA\mathbf{v}_k$ converges to $\lambda_1=4$. Click "Step" to iterate.
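In code, power iteration on this matrix takes a few lines (a NumPy sketch):

```python
import numpy as np

A = np.array([[3.0, 1.0], [1.0, 3.0]])   # eigenvalues 4 and 2
v = np.array([1.0, 0.0])                 # any start not orthogonal to (1,1)

for _ in range(50):                      # error shrinks like (2/4)^k
    w = A @ v
    v = w / np.linalg.norm(w)            # v_{k+1} = A v_k / ||A v_k||

rho = v @ A @ v                          # Rayleigh quotient
print(rho)                               # converges to lambda_1 = 4
```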
QR Iteration (Francis Algorithm)
The definitive eigenvalue algorithm, used in LAPACK's dsyev/dgeev:
Reduce $A$ to upper Hessenberg form $H=Q_0^TAQ_0$ using Householder transformations ($O(n^3)$ once)
At step $k$: form $H_k-\alpha_k I=Q_kR_k$, set $H_{k+1}=R_kQ_k+\alpha_k I$ (similar to $H_k$)
With shifts $\alpha_k\approx\lambda_j$, convergence is cubic — the last row zeros out in a few steps
Eigenvalues appear on the diagonal as $H_k\to$ upper triangular (Schur form)
Why This Works
Each $H_{k+1}=Q_k^TH_kQ_k$ — similar to $H_k$, so eigenvalues are preserved. The shifts $\alpha_k$ are chosen as eigenvalues of the trailing $2\times2$ submatrix, accelerating the zeroing of subdiagonal entries. Total cost: $O(n^3)$ for all $n$ eigenvalues — same as a single matrix multiply, but obtaining the complete eigensystem.
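A bare-bones version of the iteration (unshifted and without Hessenberg reduction, so convergence is only linear at rate $|\lambda_2/\lambda_1|$; this is a sketch of the idea, not LAPACK's implementation):

```python
import numpy as np

H = np.array([[3.0, 1.0], [1.0, 3.0]])   # eigenvalues 4 and 2
for _ in range(60):
    Q, R = np.linalg.qr(H)               # H_k = Q_k R_k
    H = R @ Q                            # H_{k+1} = R_k Q_k, similar to H_k
print(np.diag(H))                        # diagonal approaches the eigenvalues
```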
Section 7.7
Numerical Least Squares
Three approaches to $\min_{\mathbf{x}}\|A\mathbf{x}-\mathbf{b}\|_2^2$, in increasing numerical stability: the normal equations $A^TA\mathbf{x}=A^T\mathbf{b}$ (cheapest, but squares the condition number, $\text{cond}_2(A^TA)=\text{cond}_2(A)^2$), QR factorization (the stable workhorse), and the SVD (most robust, and the only one that handles rank deficiency gracefully).
For $A=U\Sigma V^T$ with rank $r$, the pseudoinverse is $A^+=V\Sigma^+U^T$ where $\Sigma^+=\text{diag}(1/\sigma_1,\ldots,1/\sigma_r,0,\ldots,0)$. The solution $\hat{\mathbf{x}}=A^+\mathbf{b}$ is:
The minimum-norm least squares solution (among all minimizers of $\|A\mathbf{x}-\mathbf{b}\|^2$, it has the smallest $\|\mathbf{x}\|$)
The unique solution satisfying the four Penrose conditions: $AA^+A=A$, $A^+AA^+=A^+$, $(AA^+)^T=AA^+$, $(A^+A)^T=A^+A$
📘 Example 7.6 — Pseudoinverse for Rank-Deficient System
$$A=\begin{pmatrix}1&1\\1&1\\0&0\end{pmatrix},\quad\mathbf{b}=\begin{pmatrix}2\\2\\0\end{pmatrix}$$
$A^TA=\begin{pmatrix}2&2\\2&2\end{pmatrix}$ is singular — normal equations fail. SVD: $\sigma_1=2$, $\sigma_2=0$.
$$A^+=\frac{1}{4}\begin{pmatrix}1&1&0\\1&1&0\end{pmatrix},\quad\hat{\mathbf{x}}=A^+\mathbf{b}=\begin{pmatrix}1\\1\end{pmatrix}$$
The solution set is $\hat{\mathbf{x}}+N(A)$ — infinitely many solutions, all minimizing $\|A\mathbf{x}-\mathbf{b}\|$. The pseudoinverse gives the one with smallest norm: $\|\hat{\mathbf{x}}\|=\sqrt{2}$.
Any $\mathbf{x}=(1+t, 1-t)^T$ is a least squares solution, but $A^+\mathbf{b}=(1,1)^T$ is the unique minimum-norm one.
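NumPy's least-squares routines use exactly this SVD machinery; for the rank-deficient system of Example 7.6:

```python
import numpy as np

A = np.array([[1.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
b = np.array([2.0, 2.0, 0.0])

x_hat = np.linalg.pinv(A) @ b            # minimum-norm solution (1, 1)
# lstsq (SVD-based) also returns the minimum-norm solution when A is
# rank-deficient, along with the numerical rank and singular values:
x_lsq, res, rank, sigma = np.linalg.lstsq(A, b, rcond=None)
print(x_hat, rank, sigma)                # rank 1, singular values ~ (2, 0)
```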
🔭 Going Further
Iterative methods (Krylov): For $n=10^6$, dense LU is out of reach. Conjugate Gradient (for symmetric positive definite $A$) converges in $O(\sqrt{\kappa})$ iterations using only matrix-vector products; GMRES handles general $A$. Used in finite element analysis, fluid dynamics, quantum chemistry
Preconditioners: Replace $A\mathbf{x}=\mathbf{b}$ with $M^{-1}A\mathbf{x}=M^{-1}\mathbf{b}$ where cond$(M^{-1}A)\ll$ cond$(A)$. The "right" preconditioner is problem-specific and often the hardest part
Randomized linear algebra: Sketch $A$ to a smaller matrix using random projections, solve approximately, then refine. Achieves near-optimal complexity for massive data (Johnson-Lindenstrauss lemma)
LAPACK/BLAS: The reference implementation of all these algorithms — used inside NumPy, SciPy, MATLAB, Julia, R. Every routine in this chapter has a LAPACK counterpart: dgesv (LU), dgeqrf (QR), dgesvd (SVD), dsyev (symmetric eigensystem)