Lecture 3.4
SO(3) Representations
This lecture provides the essential group-theoretic background for 3D Steerable CNNs. We analyze the rotation group $SO(3)$ and its irreducible representations.
1. Wigner-D Matrices: The Irreps of SO(3)
Just as complex exponentials $e^{im\theta}$ form the irreducible representations of 2D rotations ($SO(2)$), the Wigner-D matrices $D^l(g)$ form the irreducible representations of 3D rotations ($SO(3)$).
- Index $l$: The frequency or degree ($l=0, 1, 2, \dots$).
- Dimension: Each matrix is $(2l+1) \times (2l+1)$.
  - $l=0$: $1 \times 1$ (Scalar, invariant under rotation).
  - $l=1$: $3 \times 3$ (Vector; in the Cartesian basis this is the standard rotation matrix).
  - $l=2$: $5 \times 5$ (Rank-2 Traceless Symmetric Tensor).
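The dimension pattern above can be checked numerically. The following is a minimal sketch, assuming the ZYZ Euler-angle convention $D^l(\alpha,\beta,\gamma) = e^{-i\alpha J_z} e^{-i\beta J_y} e^{-i\gamma J_z}$ and the complex basis $m = l, \dots, -l$; other texts and libraries use different sign and basis conventions. The helper names (`angular_momentum`, `exp_i`, `wigner_D`) are illustrative and not taken from any library.

```python
import numpy as np

def angular_momentum(l):
    """Return (Jx, Jy, Jz) for degree l in the basis m = l, l-1, ..., -l."""
    m = np.arange(l, -l - 1, -1)
    Jz = np.diag(m).astype(complex)
    # Raising operator: <m+1| J+ |m> = sqrt(l(l+1) - m(m+1))
    c = np.sqrt(l * (l + 1) - m[1:] * (m[1:] + 1))
    Jp = np.diag(c, k=1).astype(complex)
    Jm = Jp.conj().T
    return (Jp + Jm) / 2, (Jp - Jm) / 2j, Jz

def exp_i(H, theta):
    """Compute exp(-i * theta * H) for a Hermitian H via eigendecomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(-1j * theta * w)) @ V.conj().T

def wigner_D(l, alpha, beta, gamma):
    """Wigner-D matrix for ZYZ Euler angles (assumed convention)."""
    _, Jy, Jz = angular_momentum(l)
    return exp_i(Jz, alpha) @ exp_i(Jy, beta) @ exp_i(Jz, gamma)

for l in range(3):
    D = wigner_D(l, 0.3, 1.1, -0.7)
    unitary = np.allclose(D @ D.conj().T, np.eye(2 * l + 1))
    print(l, D.shape, "unitary:", unitary)
```

In this complex basis the $l=1$ matrix is unitarily equivalent to the familiar $3 \times 3$ rotation matrix rather than entry-wise equal to it; both have trace $1 + 2\cos\omega$, where $\omega$ is the rotation angle.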
2. Spherical Harmonics as Steerable Basis
Functions on the sphere $S^2$ can be expanded in the basis of Spherical Harmonics $Y^l_m$. When the function is rotated by $g$, the vector of $2l+1$ coefficients at frequency $l$ transforms according to the Wigner-D matrix $D^l(g)$, and coefficients of different $l$ do not mix. This correspondence lets us treat geometric quantities such as directions as steerable feature vectors.
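The steerability property can be illustrated for $l=2$ without any special-function library. The sketch below assumes real degree-2 spherical harmonics written as polynomials in $(x, y, z)$, scaled by a common constant so that the five functions are orthogonal with equal norm on the sphere. It evaluates them at random directions and at the rotated directions, then recovers by least squares the $5 \times 5$ matrix relating the two. The function names are illustrative only.

```python
import numpy as np

def real_sh_l2(p):
    """Real degree-2 spherical harmonics at unit vectors p of shape (N, 3),
    up to a common normalization constant."""
    x, y, z = p[:, 0], p[:, 1], p[:, 2]
    s3 = np.sqrt(3.0)
    return np.stack([
        s3 * x * y,
        s3 * y * z,
        (2 * z**2 - x**2 - y**2) / 2,
        s3 * x * z,
        s3 / 2 * (x**2 - y**2),
    ], axis=1)

def random_rotation(rng):
    """Random proper rotation from the QR decomposition of a Gaussian matrix."""
    Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(Q) < 0:
        Q[:, 0] = -Q[:, 0]  # flip one column to get det = +1
    return Q

rng = np.random.default_rng(0)
p = rng.normal(size=(200, 3))
p /= np.linalg.norm(p, axis=1, keepdims=True)  # points on the unit sphere
R = random_rotation(rng)

Y = real_sh_l2(p)            # harmonics at the original directions, (N, 5)
Y_rot = real_sh_l2(p @ R.T)  # harmonics at the rotated directions R p

# Solve Y @ D.T = Y_rot for the 5x5 matrix D acting on the l=2 feature vector.
D_T, *_ = np.linalg.lstsq(Y, Y_rot, rcond=None)
D = D_T.T

print("linear relation holds:", np.allclose(Y @ D.T, Y_rot))
print("D is orthogonal:", np.allclose(D @ D.T, np.eye(5)))
```

The recovered matrix plays the role of $D^2(g)$ in this real basis: rotating the input direction is equivalent to a fixed linear map on the 5-dimensional harmonic vector. Whether one calls this matrix $D^l(g)$ or its inverse depends on the convention for how rotations act on functions.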
3. The Clebsch-Gordan Tensor Product
The core operation in 3D steerable networks is combining features of different types. Given a type-$l_1$ vector and a type-$l_2$ vector, their tensor (outer) product is generally reducible: it decomposes into irreducible representations of types $l = |l_1 - l_2|, \dots, l_1 + l_2$, each appearing exactly once.
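A quick sanity check of this rule is that dimensions must match: the product space has dimension $(2l_1+1)(2l_2+1)$, which should equal the sum of $2l+1$ over the output types. A small sketch, with a hypothetical helper name:

```python
def tensor_product_degrees(l1, l2):
    """Output degrees l of the product of a type-l1 and a type-l2 feature."""
    return list(range(abs(l1 - l2), l1 + l2 + 1))

def dim(l):
    """Dimension of the degree-l irreducible representation."""
    return 2 * l + 1

for l1, l2 in [(1, 1), (2, 1), (2, 2)]:
    degrees = tensor_product_degrees(l1, l2)
    total = sum(dim(l) for l in degrees)
    print((l1, l2), "->", degrees, "dims:", dim(l1) * dim(l2), "=", total)
```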
Example ($1 \otimes 1$): Combining two 3D vectors ($l=1$) yields:
- $l=0$: Scalar (Dot Product).
- $l=1$: Vector (Cross Product).
- $l=2$: Rank-2 Tensor (Symmetric Traceless Outer Product).
The Clebsch-Gordan coefficients tell us exactly how to linearly combine the components of the tensor product to isolate these irreducible parts. Libraries like e3nn implement this automatically.
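For the $1 \otimes 1$ example, the decomposition can be written out by hand in Cartesian components. The sketch below is not e3nn's API and does not use the spherical basis or Clebsch-Gordan normalization; it splits the $3 \times 3$ outer product into its trace, antisymmetric, and symmetric traceless parts ($1 + 3 + 5 = 9$ components) and checks how each part transforms under a proper rotation. The function names are illustrative.

```python
import numpy as np

def decompose_1x1(a, b):
    """Split the outer product of two 3D vectors into l=0, l=1, l=2 parts."""
    T = np.outer(a, b)
    scalar = np.trace(T)                       # l=0: dot product a . b
    A = (T - T.T) / 2                          # antisymmetric part
    vector = 2 * np.array([A[1, 2], A[2, 0], A[0, 1]])  # l=1: cross product a x b
    sym_traceless = (T + T.T) / 2 - scalar / 3 * np.eye(3)  # l=2: 5 free components
    return scalar, vector, sym_traceless

def rotation(axis, angle):
    """Rodrigues formula for a rotation by `angle` about `axis`."""
    axis = np.asarray(axis, float) / np.linalg.norm(axis)
    K = np.array([[0, -axis[2], axis[1]],
                  [axis[2], 0, -axis[0]],
                  [-axis[1], axis[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * K @ K

a = np.array([1.0, 2.0, 3.0])
b = np.array([-0.5, 0.4, 2.0])
R = rotation([1.0, 1.0, 0.5], 0.9)

s, v, S = decompose_1x1(a, b)
s_rot, v_rot, S_rot = decompose_1x1(R @ a, R @ b)

print("scalar is invariant:", np.isclose(s_rot, s))
print("vector rotates with R:", np.allclose(v_rot, R @ v))
print("tensor transforms as R S R^T:", np.allclose(S_rot, R @ S @ R.T))
```

Each part transforms only within itself, which is what it means for the three pieces to be separate irreducible components. Note that under reflections the cross product behaves as a pseudovector, so this check is specific to proper rotations in $SO(3)$.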