Lecture 2.3
Group Theory: Irreps & Fourier Transform
This lecture dives deep into the group theoretic foundations of steerable methods. We explore Irreducible Representations (Irreps) and discover that decomposing a representation into irreps is the Fourier Transform on groups.
1. Equivalent Representations
Two representations $\rho_A(g)$ and $\rho_B(g)$ are said to be equivalent if they are related by a similarity transform (a change of basis), using one invertible matrix $Q$ that is the same for every group element $g$:
$$ \rho_B(g) = Q^{-1} \rho_A(g) Q $$
This means they describe the same transformation, just viewed from a different coordinate system. Our goal is to find a basis $Q$ where the representation becomes as simple as possible.
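As a small numerical illustration of the similarity transform above, the sketch below conjugates the 2D rotation matrix by a fixed complex matrix $Q$ and checks that the result is diagonal and still a representation. The names `rho_A`, `rho_B`, and the particular choice of `Q` (the eigenvectors of the rotation matrix) are assumptions made for this example, not part of the lecture.

```python
import numpy as np

def rho_A(theta):
    """Real 2D rotation matrix, a representation of SO(2)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

# Assumed change of basis: columns are the eigenvectors (1, i) and (1, -i)
# of the rotation matrix, normalized so that Q is unitary.
Q = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)

def rho_B(theta):
    """Equivalent representation: rho_B = Q^{-1} rho_A Q."""
    return np.linalg.inv(Q) @ rho_A(theta) @ Q

theta = 0.7
diagonal = np.diag([np.exp(-1j * theta), np.exp(1j * theta)])

# In the new basis the same rotation is a diagonal matrix of phases.
print(np.allclose(rho_B(theta), diagonal))

# The homomorphism property survives the change of basis.
print(np.allclose(rho_B(0.3) @ rho_B(0.5), rho_B(0.8)))
```

Both checks should print `True`: the change of basis alters how the matrices look, not the group structure they encode.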
2. Irreducible Representations (Irreps)
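A representation is reducible if we can find a single basis in which the matrix $\rho(g)$ becomes block-diagonal for every $g$ simultaneously, with the same block structure. Each block is then a smaller representation in its own right. If a block cannot be decomposed further, it is called an Irreducible Representation (Irrep).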
For the group $SO(2)$ (rotations in 2D), the irreps are:
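- Complex Case: 1D scalars $\rho_m(\theta) = e^{-im\theta}$, one for each integer frequency $m$.
- Real Case: 2D rotation matrices $\rho_m(\theta) = \begin{pmatrix} \cos(m\theta) & -\sin(m\theta) \\ \sin(m\theta) & \cos(m\theta) \end{pmatrix}$ for $m \geq 1$, plus the 1D trivial representation $\rho_0(\theta) = 1$ for $m = 0$.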
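Any finite-dimensional representation of $SO(2)$ can be decomposed, after a suitable change of basis $Q$, into a direct sum of these irreps: $Q^{-1} \rho(\theta) Q = \bigoplus_m \rho_m(\theta)$, where a given frequency $m$ may appear more than once.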
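The direct sum can be sketched numerically by stacking irrep blocks along the diagonal. The helper names `real_irrep`, `direct_sum`, and `rho`, and the choice of frequencies $0, 1, 2$, are illustrative assumptions; the $m = 0$ block is taken to be the $1 \times 1$ trivial representation.

```python
import numpy as np

def real_irrep(m, theta):
    """Real irrep of SO(2) at frequency m (1x1 for m = 0, 2x2 otherwise)."""
    if m == 0:
        return np.ones((1, 1))
    c, s = np.cos(m * theta), np.sin(m * theta)
    return np.array([[c, -s], [s, c]])

def direct_sum(blocks):
    """Place square blocks along the diagonal of a larger zero matrix."""
    n = sum(b.shape[0] for b in blocks)
    out = np.zeros((n, n))
    i = 0
    for b in blocks:
        k = b.shape[0]
        out[i:i + k, i:i + k] = b
        i += k
    return out

def rho(theta, freqs=(0, 1, 2)):
    """Block-diagonal representation built from the chosen frequencies."""
    return direct_sum([real_irrep(m, theta) for m in freqs])

a, b = 0.4, 1.3
print(rho(a).shape)                              # 1 + 2 + 2 = 5 rows and columns
print(np.allclose(rho(a) @ rho(b), rho(a + b)))  # composing rotations adds angles
```

Because the blocks never mix, each frequency transforms independently, which is what makes the block-diagonal basis so convenient.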
3. Connection to Fourier Transform
The "magical" change of basis matrix $Q$ that block-diagonalizes the regular representation is exactly the Fourier Basis!
- Projection: Projecting a function $f$ onto this basis (Change of Basis) is the Fourier Transform. The resulting coefficients are the signal's representation in the "frequency domain".
- Shift Theorem: A shift in the spatial domain corresponds to an element-wise multiplication in the Fourier domain. Shifting the signal by $\theta$ multiplies the coefficient at frequency $m$ by the phase factor $e^{-im\theta}$.
In Group Theory terms: The Regular Representation (shifting the signal) becomes, in the Fourier domain, a direct sum of Irreps (multiplying each coefficient by its own $\rho_m$).
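This correspondence can be checked numerically on a discretized circle. The sketch below assumes the cyclic group of $N = 8$ rotations as a stand-in for $SO(2)$, builds the regular representation of a one-step shift as a permutation matrix `P`, and uses the unitary DFT matrix `F` as the Fourier basis. All variable names are illustrative.

```python
import numpy as np

N = 8
n = np.arange(N)

# Regular representation of a one-step shift: (P f)[n] = f[(n - 1) mod N].
P = np.roll(np.eye(N), 1, axis=0)

# Unitary DFT matrix; its conjugate transpose plays the role of Q.
F = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)
Q = F.conj().T

# Q^{-1} P Q is diagonal, with the 1D complex irreps e^{-i m theta}
# evaluated at theta = 2*pi/N on the diagonal.
D = np.linalg.inv(Q) @ P @ Q
irreps = np.exp(-2j * np.pi * n / N)
print(np.allclose(D, np.diag(irreps)))

# Shift theorem: shifting the signal multiplies each Fourier
# coefficient by the matching irrep value.
rng = np.random.default_rng(0)
f = rng.standard_normal(N)
print(np.allclose(np.fft.fft(np.roll(f, 1)), irreps * np.fft.fft(f)))
```

Both checks should print `True`. For a shift by $s$ steps, the same identity holds with `irreps ** s`, mirroring $\rho_m(s\theta) = \rho_m(\theta)^s$.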
Key Insight: Steerable CNNs operate directly in this "Fourier domain." The feature vectors at each pixel are actually local Fourier coefficients, and we process them by mixing these coefficients (via Clebsch-Gordan products, covered next).