The "Fundamental Theorem" of Linear Algebra: SVD Explained

We have spent the last few weeks mastering Eigenvectors and PCA. We learned how to find the hidden axes of data. But there was always a catch: Eigenvectors only work on Square Matrices.

But look at the real world. A dataset of Users vs. Movies is a rectangular matrix. An image is a rectangular matrix of pixels. A spreadsheet of stock prices is rectangular.

If you try to calculate the Eigenvectors of a rectangular matrix, the math breaks: the equation Av = λv only makes sense when A maps a space back onto itself, which forces A to be square. So, how do we find the hidden structure of any matrix, of any shape?

The answer is the Singular Value Decomposition (SVD). It is arguably the most important theorem in all of Linear Algebra.

The Intuition: Breaking Down the Rectangle

SVD states that any matrix A can be broken down into three clean components:

A = U Σ Vᵀ

Let's use a "Netflix" analogy where Matrix A represents Users rating Movies.

  • U (Left Singular Vectors): Relates Users to hidden "Concepts" (like Genre preferences).
  • Vᵀ (Right Singular Vectors): Relates those same Concepts to Movies.
  • Σ (Sigma - Singular Values): A diagonal matrix that tells us the Strength of each Concept. Is "Sci-Fi" a strong trend in the data, or just noise?

For an m × n matrix A, the shapes work out as: U is m × m, Σ is m × n, and Vᵀ is n × n.

The Math: The "Transpose" Trick

How do we calculate these matrices if A isn't square? We cheat. We turn the rectangle into a square.

If we multiply a rectangular matrix A by its transpose Aᵀ, the result is always square and symmetric.

  • To find V, we calculate the Eigenvectors of AᵀA.
  • To find U, we calculate the Eigenvectors of AAᵀ.
  • To find Σ (Singular Values), we take the square roots of the Eigenvalues of AᵀA (the nonzero Eigenvalues of AᵀA and AAᵀ are identical).

So, SVD isn't new math; it is just Eigenvectors applied to the square, symmetric matrices AᵀA and AAᵀ. We are recycling the tools we already built!
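
A quick sanity check of this recipe (a minimal sketch; any rectangular matrix will do):

import numpy as np

# Any rectangular matrix
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 3))

# Eigenvalues of the square, symmetric matrix A^T A (ascending order)
eigvals = np.linalg.eigvalsh(A.T @ A)

# Their square roots, sorted descending...
print(np.sqrt(eigvals[::-1]))

# ...match the singular values from NumPy's SVD
print(np.linalg.svd(A, compute_uv=False))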

Implementation: Decomposing a Rectangle

Let's prove this in Python using NumPy. First, we create a simple 3x2 matrix.

import numpy as np

# A Rectangular Matrix (3 rows, 2 columns)
A = np.array([
    [3, 1],
    [1, 3],
    [1, 1]
])

# Compute SVD (full_matrices=False returns the compact shapes,
# which makes reconstruction straightforward below)
U, S, Vt = np.linalg.svd(A, full_matrices=False)

print("U (Left Vectors):\n", np.round(U, 2))
print("S (Singular Values):\n", np.round(S, 2))
print("Vt (Right Vectors):\n", np.round(Vt, 2))
    

Note: NumPy returns S as a 1D array of singular values rather than a matrix. To reconstruct A, we convert it back to a diagonal matrix using np.diag(S); because we used full_matrices=False, the shapes line up.
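
Here is that round trip in code, continuing from the snippet above:

# Rebuild the original matrix from its three factors
A_rebuilt = U @ np.diag(S) @ Vt

print(np.round(A_rebuilt, 2))     # matches the original A
print(np.allclose(A, A_rebuilt))  # True, up to floating-point error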

The Project: Image Compression

Now for the visual payoff. To a computer, a grayscale image is just a matrix of numbers. We can use SVD to compress an image by throwing away the "noisy" singular values.

# Define a helper function to reconstruct using top 'k' components
def compress_image(k, U, S, Vt):
    # Take only the first k columns/values
    U_k = U[:, :k]
    S_k = np.diag(S[:k])
    Vt_k = Vt[:k, :]
    
    # Reconstruct
    return U_k @ S_k @ Vt_k
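
A minimal usage sketch (assuming a grayscale photo saved as photo.png, a hypothetical file name, with matplotlib used for loading and display):

import numpy as np
import matplotlib.pyplot as plt

# Load the image as a 2D matrix of pixel intensities
img = plt.imread("photo.png")
if img.ndim == 3:
    img = img.mean(axis=2)   # collapse RGB channels to grayscale

# Economy SVD of the image matrix
U, S, Vt = np.linalg.svd(img, full_matrices=False)

# Keep only the top 50 concepts and rebuild
approx = compress_image(50, U, S, Vt)

plt.imshow(approx, cmap="gray")
plt.axis("off")
plt.show()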

When we run this on a high-res image (512x512 pixels), we see something amazing. The singular values drop off incredibly fast.

We can reconstruct the image using only the top 50 components (out of 512) and it looks almost identical to the original. We essentially compressed the data by roughly 80% while keeping the essence of the image intact; the quick arithmetic below shows where that number comes from.
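
The back-of-the-envelope math: a rank-k approximation only needs the first k columns of U, the first k singular values, and the first k rows of Vᵀ.

# Storage cost of a rank-k approximation vs. the full matrix
m, n, k = 512, 512, 50
full_cost = m * n                    # 262,144 numbers
rank_k_cost = k * (m + n + 1)        # 51,250 numbers
print(f"Space saved: {1 - rank_k_cost / full_cost:.0%}")   # ~80%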

Conclusion & What's Next?

This brings us to the end of our journey through Linear Algebra. You now have the mathematical toolkit to understand the Structure and Geometry of data.

But real-world data is messy, noisy, and uncertain. Our models are never 100% correct; they are just "probable." To build true intelligence, we need to measure uncertainty.

This leads us to the language of science: Probability and Statistics. This is the topic of our next series.

Get the Code

Want to run the Image Compression demo yourself? Check out the Google Colab Notebook.

This post is the finale of the "Linear Algebra for Machine Learning" series. For the previous part, check out: How to Crush Big Data: The Math of PCA.
