Lecture 3.5

Steerable GNNs

We review the state of the art in Steerable GNNs: methods that use the full machinery of representation theory (Wigner-D matrices, Clebsch-Gordan tensor products) to process 3D data equivariantly.

1. The Steerable GNN Recipe

Most steerable GNNs follow a similar pattern for their layers (a minimal code sketch follows the list):

  1. Geometric Embedding: Encode the relative position $\Delta x_{ij} = x_j - x_i$ into a steerable vector $Y(\Delta x_{ij})$ using spherical harmonics.
  2. Feature Interaction: Combine neighbor features $h_j$ with the geometric embedding using the Clebsch-Gordan tensor product: $$ m_{ij} = h_j \otimes_{\text{CG}} Y(\Delta x_{ij}) $$
  3. Aggregation: Sum the messages over the neighborhood to update node $i$: $h_i^{\text{new}} = \sum_{j \in \mathcal{N}(i)} m_{ij}$.
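
To make the recipe concrete, here is a minimal sketch in PyTorch, truncated to $l \le 1$ features so that every Clebsch-Gordan path has a simple closed form (dot and cross products). The function names and the per-path weight dictionary are illustrative assumptions, not the API of any particular library; in practice the path weights come from a radial MLP of $\|\Delta x_{ij}\|$, as in TFN.

```python
import torch

def sh_l1(dx, eps=1e-9):
    # The degree-1 (l=1) real spherical harmonics of a displacement are,
    # up to normalization, just the unit direction vector.
    return dx / (dx.norm(dim=-1, keepdim=True) + eps)

def steerable_message(h0_j, h1_j, dx, w):
    """One message m_ij = h_j (x)_CG Y(dx_ij), truncated to l <= 1.

    h0_j: (E, C)    scalar (l=0) features of the sending node j
    h1_j: (E, C, 3) vector (l=1) features of the sending node j
    dx:   (E, 3)    relative positions x_j - x_i
    w:    per-path scalar weights (illustrative; learned in practice)
    Each term below is one Clebsch-Gordan path l1 x l2 -> l_out.
    """
    y1 = sh_l1(dx)                                          # Y^1(dx), shape (E, 3)
    m0 = (w["0x0->0"] * h0_j                                # 0x0->0: Y^0 is constant
          + w["1x1->0"] * (h1_j * y1[:, None, :]).sum(-1))  # 1x1->0 is the dot product
    m1 = (w["0x1->1"] * h0_j[..., None] * y1[:, None, :]    # 0x1->1: scalar gates the direction
          + w["1x0->1"] * h1_j                              # 1x0->1: vectors pass through
          + w["1x1->1"] * torch.cross(                      # 1x1->1 is the cross product
                h1_j, y1[:, None, :].expand_as(h1_j), dim=-1))
    return m0, m1  # step 3 then sums m0, m1 over the neighbors of each node
```

Rotating all positions rotates $Y^1$ and the vector features identically, so $m^1$ rotates with the input while $m^0$ is unchanged, which is exactly the steerability the recipe promises.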

2. Key Architectures

  • Tensor Field Networks (TFN): The pioneering work that introduced rotation-equivariant point convolutions. It follows the recipe above and can be interpreted as a continuous convolution with steerable kernels.
  • SE(3)-Transformers: Adds an attention mechanism: the aggregation is weighted by invariant attention coefficients $\alpha_{ij}$, letting the network focus on specific neighbors while preserving equivariance (see the sketch after this list).
  • Cormorant: Designed for molecular physics; it explicitly models physical interactions (dipoles, quadrupoles) using higher-order tensor products.
  • PaiNN: An efficient architecture that restricts features to scalars ($l=0$) and vectors ($l=1$), avoiding expensive higher-order CG products while still achieving state-of-the-art results on many tasks.
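
To illustrate the attention idea, here is a toy dense (all-pairs) layer in the same spirit: the logits are built only from rotation-invariant scalars, so the softmax weights $\alpha_{ij}$ are invariant and the weighted sum of steerable features stays equivariant. This is a sketch of the principle, not the actual SE(3)-Transformer implementation; all names are illustrative.

```python
import torch

def invariant_attention(h0, h1, x):
    """Toy dense attention: invariant weights, steerable values.
    h0: (N, C) scalar features, h1: (N, C, 3) vector features, x: (N, 3) positions.
    """
    dx = x[None, :, :] - x[:, None, :]             # (N, N, 3), x_j - x_i
    dist = dx.norm(dim=-1)                         # pairwise distances, invariant
    dots = torch.einsum('icd,jcd->ij', h1, h1)     # sum_c <h1_i, h1_j>, invariant
    logits = dots / h1.shape[1] - dist             # any function of invariants works
    alpha = torch.softmax(logits, dim=-1)          # alpha_ij, invariant weights
    # Values: here the raw neighbor features; a full layer would first build
    # the CG messages m_ij from Section 1 and weight those instead.
    out0 = alpha @ h0                              # (N, C)
    out1 = torch.einsum('ij,jcd->icd', alpha, h1)  # (N, C, 3)
    return out0, out1
```

Under a rotation, dist and dots are unchanged, so $\alpha_{ij}$ is unchanged and out1 simply rotates with the input: attention reweights neighbors without breaking equivariance.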

3. Physical Interpretation

Steerable features of increasing order ($l = 0, 1, 2, \dots$) naturally align with the terms of the multipole expansion in physics (charge, dipole, quadrupole), shown below. By using these features, the network learns to reason in the language of physics.
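
Concretely, the electrostatic potential of a localized charge distribution expands as

$$ V(\mathbf{r}) \;\propto\; \underbrace{\frac{q}{r}}_{l=0} \;+\; \underbrace{\frac{\mathbf{p} \cdot \hat{\mathbf{r}}}{r^2}}_{l=1} \;+\; \underbrace{\frac{\hat{\mathbf{r}}^\top \mathbf{Q}\,\hat{\mathbf{r}}}{2\,r^3}}_{l=2} \;+\; \dots $$

so a network carrying $l = 0, 1, 2$ features can represent the total charge $q$, the dipole moment $\mathbf{p}$, and the quadrupole moment $\mathbf{Q}$ of a local neighborhood directly as features of the matching order.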