What is a Vector Space? (The Math That Powers Word2Vec)
Vector Spaces: The "Playground" Where Your Data Lives
How does a computer understand that the word 'King' is related to 'Man' in the same way that 'Queen' is related to 'Woman'? Computers don't understand abstract meaning; they only understand numbers and structure.
To solve this, we represent our data—words, images, users—not as isolated items, but as points on a special kind of mathematical map with a strict set of rules. This 'map' is called a Vector Space. It's the playground where our data lives, and its rules are what allow machine learning models to perform seemingly magical feats of logic.
In this post, we'll build this concept from the ground up, starting with its most fundamental building block.
Watch the video for the full visual explanation, then scroll down for the detailed definitions and axioms.
The Foundation: What is a Mathematical "Group"?
Before we can build a Vector Space, we need to understand a simpler structure: a Group. A Group consists of just two things: a set of elements and a single operation that combines them. For this combination to qualify as a Group, it must satisfy four strict rules, or axioms.
The Developer's Mental Model: Think of these axioms as an Interface or a Contract. Any system that "implements" this interface is guaranteed to be predictable, self-contained, and reversible.
Let's use the set of all integers (ℤ) with the operation of addition (+) as our example.
- 1. Closure: If you take any two elements from the set and apply the operation, the result is still in the set. (An integer plus an integer is always another integer).
- 2. Associativity: The grouping of operations doesn't change the outcome. `(a + b) + c = a + (b + c)`.
- 3. Identity Element: There must be a special "do nothing" element. For integer addition, this is `0`, since `a + 0 = a`.
- 4. Inverse Element: For every element, there must be an "undo" element that gets you back to the identity. For any integer `a`, its inverse is `-a`, since `a + (-a) = 0`.
Because the integers with addition satisfy all four rules, they form a Group.
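The four axioms above can be spot-checked in a few lines of code. This is a hedged sketch, not a proof: a real proof covers all integers, while this just samples a few and asserts what each axiom claims.

```python
import random

# Spot-check the four group axioms for (Z, +) on a random sample.
random.seed(0)
samples = [random.randint(-100, 100) for _ in range(30)]

for a in samples:
    for b in samples:
        assert isinstance(a + b, int)            # 1. Closure
        for c in samples[:5]:
            assert (a + b) + c == a + (b + c)    # 2. Associativity
    assert a + 0 == a                            # 3. Identity element: 0
    assert a + (-a) == 0                         # 4. Inverse element: -a

print("All four group axioms hold on the sample.")
```

Thinking of the axioms as assertions like this reinforces the "Interface" mental model: any set-plus-operation that passes these checks behaves predictably.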
Building the Vector Space: The 10 Axioms
A Vector Space is a more complex structure built on top of the Group concept. It is a collection of objects called vectors, which can be added together and scaled by numbers called scalars.
For a collection to be a true Vector Space, it must satisfy 10 axioms. We can group these into two main categories.
Rule Group 1: The Vectors Themselves Form a Group
The first five rules simply state that the set of vectors (V) and the operation of vector addition (+) must form a special kind of Group (an Abelian Group).
- Closure under Addition: For any vectors v, w in V, their sum v + w is also in V.
- Associativity of Addition: (u + v) + w = u + (v + w).
- Identity Element of Addition: There is a zero vector 0 such that v + 0 = v for all v.
- Inverse Element of Addition: For every v, there is an inverse -v such that v + (-v) = 0.
- Commutativity of Addition: v + w = w + v.
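As a concrete illustration, here is a minimal sketch checking all five addition axioms for a few vectors in ℝ³, using NumPy arrays as the vectors. The specific vectors are arbitrary examples.

```python
import numpy as np

# Three arbitrary vectors in R^3, plus the zero vector.
u = np.array([0.0, 1.0, -1.0])
v = np.array([1.0, 2.0, 3.0])
w = np.array([-4.0, 0.5, 2.0])
zero = np.zeros(3)

assert (v + w).shape == (3,)                    # Closure: sum stays in R^3
assert np.allclose((u + v) + w, u + (v + w))    # Associativity
assert np.allclose(v + zero, v)                 # Identity: the zero vector
assert np.allclose(v + (-v), zero)              # Inverse: -v
assert np.allclose(v + w, w + v)                # Commutativity
```

Passing all five checks is exactly what it means for ℝ³ with vector addition to form an Abelian Group.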
Rule Group 2: The Interaction Between Scalars and Vectors
The next five rules define how scalars (from a field F, like the real numbers) interact with our vectors.
- Closure under Scalar Multiplication: For any scalar `a` and vector v, the product `a`v is also in V.
- Distributivity of Scalar Multiplication (1): `a`(v + w) = `a`v + `a`w.
- Distributivity of Scalar Multiplication (2): (`a` + `b`)v = `a`v + `b`v.
- Associativity of Scalar Multiplication: `a`(`b`v) = (`ab`)v.
- Identity Element of Scalar Multiplication: `1`v = v, where `1` is the multiplicative identity.
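The five scalar axioms can be checked the same way. This sketch uses real scalars and ℝ³ vectors; the particular values are arbitrary examples.

```python
import numpy as np

a, b = 2.5, -3.0                     # arbitrary real scalars
v = np.array([1.0, -2.0, 0.5])
w = np.array([4.0, 0.0, -1.0])

assert (a * v).shape == (3,)                     # Closure under scaling
assert np.allclose(a * (v + w), a * v + a * w)   # Distributivity (1)
assert np.allclose((a + b) * v, a * v + b * v)   # Distributivity (2)
assert np.allclose(a * (b * v), (a * b) * v)     # Associativity
assert np.allclose(1 * v, v)                     # Scalar identity
```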
Vector Subspaces: A Space Within a Space
A Vector Subspace is simply a vector space that lives inside a larger parent vector space. To qualify as a subspace, a subset of vectors only needs to be non-empty and satisfy the two closure axioms; it inherits the other eight properties from its parent.
The Subspace Test: A subset is a subspace if and only if:
- It contains the zero vector (so it is non-empty).
- It is closed under addition.
- It is closed under scalar multiplication.
The Developer's Mental Model: Think of a 3D video game world (our vector space). The flat ground is a perfect example of a subspace. If you add two vectors on the ground, the result is still on the ground. If you scale a vector on the ground, it remains on the ground. However, the surface of a sphere in that world is *not* a subspace, because adding two vectors on its surface will likely result in a vector that points outside the sphere.
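The game-world analogy can be made concrete. Below is a sketch (with hypothetical helper names `on_ground` and `on_sphere`) testing closure for the "ground" plane z = 0 and for the unit sphere surface inside ℝ³.

```python
import numpy as np

# The "ground" of the game world: the plane z = 0 inside R^3.
def on_ground(p):
    return np.isclose(p[2], 0.0)

p = np.array([3.0, 4.0, 0.0])
q = np.array([-1.0, 2.0, 0.0])
assert on_ground(p + q)        # closed under addition
assert on_ground(2.5 * p)      # closed under scaling -> it's a subspace

# The unit sphere's surface is NOT a subspace: closure fails.
def on_sphere(p):
    return np.isclose(np.linalg.norm(p), 1.0)

s = np.array([1.0, 0.0, 0.0])
t = np.array([0.0, 1.0, 0.0])
assert on_sphere(s) and on_sphere(t)
assert not on_sphere(s + t)    # |s + t| = sqrt(2), off the surface
```

One failed closure check is enough: the sphere's surface breaks the contract, so it cannot be a subspace.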
Conclusion: The Payoff for Machine Learning
These rules might seem abstract, but they are the reason machine learning models like Word2Vec can perform feats of analogy. By representing words as vectors in a high-dimensional vector space, the model can use simple vector arithmetic to understand concepts like gender and royalty.
Vector('King') - Vector('Man') + Vector('Woman') ≈ Vector('Queen')
This is not magic; it's vector arithmetic performed in a well-behaved "playground" where the rules are guaranteed. Understanding the structure of that playground is the key to decoding the complexity of the models that live within it.
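To see the arithmetic in action without training a model, here is a toy sketch using hand-made 2D "embeddings". The two dimensions (roughly "royalty" and "gender") and all the numbers are invented for illustration; real Word2Vec vectors have hundreds of learned dimensions.

```python
import numpy as np

# Hypothetical toy embeddings: dimensions are [royalty, gender].
emb = {
    "king":  np.array([0.9,  0.8]),
    "man":   np.array([0.1,  0.8]),
    "woman": np.array([0.1, -0.8]),
    "queen": np.array([0.9, -0.8]),
}

# king - man + woman: strip the "man" direction, add the "woman" one.
result = emb["king"] - emb["man"] + emb["woman"]

def cosine(a, b):
    """Cosine similarity: how closely two vectors point the same way."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# The vocabulary word whose vector lies closest to the result.
best = max(emb, key=lambda w: cosine(emb[w], result))
print(best)  # queen
```

The arithmetic only works because the embeddings live in a space where addition and scaling obey the ten axioms above.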
Your Turn...
Did the "Playground" or "Interface" analogies help clarify these abstract concepts? What other foundational math topics would you like to see decoded? Let me know in the comments!
This post is part of the "Linear Algebra: The Core of Machine Learning" series. For the previous part, check out: How to Solve Systems of Linear Equations.