deepmd.dpmodel.descriptor.dpa4_nn.so3
SO(3)-equivariant linear layers for DPA4/SeZM.
This module defines the channel-only and focus-aware linear maps used by SeZM SO(3) feature transformations.
This module is the dpmodel (array-API) port of deepmd.pt.model.descriptor.sezm_nn.so3.
Classes
FocusLinear | Per-focus linear projection on the last feature axis.
ChannelLinear | Channel-only linear projection on the last feature axis.
SO3Linear | Focus-aware degree-wise linear self-interaction.
Module Contents
- class deepmd.dpmodel.descriptor.dpa4_nn.so3.FocusLinear(*, in_channels: int, out_channels: int, n_focus: int, precision: str = DEFAULT_PRECISION, bias: bool = True, trainable: bool = True, seed: int | list[int] | None = None, init_std: float | None = None)
Bases: deepmd.dpmodel.NativeOP
Per-focus linear projection on the last feature axis.
- Parameters:
- in_channels
Input feature dimension.
- out_channels
Output feature dimension.
- n_focus
Number of focus streams.
- precision
Parameter precision.
- bias
Whether to use bias.
- trainable
Whether parameters are trainable.
- seed
Random seed for initialization.
- init_std
If given, use normal(0, init_std) instead of default uniform init. Useful for gate projections where small initial logits are desired.
Notes
Parameters are stored in (in, out) convention to match Muon’s rectangular correction assumption (rows=fan_in, cols=fan_out):
- weight: (in_channels, n_focus * out_channels)
- bias: (n_focus * out_channels,)
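The packed storage layout above can be illustrated with a minimal NumPy sketch. It assumes the input carries one feature vector per focus stream, with shape (..., n_focus, in_channels), and that each stream uses its own (in_channels, out_channels) column block of the packed weight. The forward pass of FocusLinear is not shown in this reference, so the function name `focus_linear_sketch` and the input layout are illustrative assumptions, not the library implementation.

```python
import numpy as np


def focus_linear_sketch(x, weight, bias, n_focus):
    """Hypothetical per-focus projection (assumed semantics, not the library code).

    x:      (..., n_focus, in_channels)   assumed input layout
    weight: (in_channels, n_focus * out_channels)
    bias:   (n_focus * out_channels,) or None
    """
    in_channels = weight.shape[0]
    out_channels = weight.shape[1] // n_focus
    # View the packed columns as one (in, out) block per focus stream.
    w = weight.reshape(in_channels, n_focus, out_channels)
    # Contract the input-channel axis separately for each focus stream f.
    y = np.einsum("...fi,ifo->...fo", x, w)
    if bias is not None:
        y = y + bias.reshape(n_focus, out_channels)
    return y


rng = np.random.default_rng(0)
n_focus, c_in, c_out = 2, 3, 4
x = rng.normal(size=(5, n_focus, c_in))
weight = rng.normal(size=(c_in, n_focus * c_out))
bias = rng.normal(size=(n_focus * c_out,))
y = focus_linear_sketch(x, weight, bias, n_focus)  # shape (5, 2, 4)
```

Keeping the weight as a single 2-D matrix with rows as fan-in and columns as fan-out is what the note above describes; the per-focus view is recovered by a reshape only.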
- class deepmd.dpmodel.descriptor.dpa4_nn.so3.ChannelLinear(*, in_channels: int, out_channels: int, precision: str = DEFAULT_PRECISION, bias: bool = True, trainable: bool = True, seed: int | list[int] | None = None, init_std: float | None = None)
Bases: deepmd.dpmodel.NativeOP
Channel-only linear projection on the last feature axis.
- Parameters:
- in_channels
Input feature dimension.
- out_channels
Output feature dimension.
- precision
Parameter precision.
- bias
Whether to use bias.
- trainable
Whether parameters are trainable.
- seed
Random seed for initialization.
- init_std
If given, use normal(0, init_std) instead of default uniform init. Useful for gate projections where small initial logits are desired.
Notes
Parameters are stored in (in, out) convention to match Muon’s rectangular correction assumption (rows=fan_in, cols=fan_out):
- weight: (in_channels, out_channels)
- bias: (out_channels,)
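As a rough illustration of the (in, out) convention and the `init_std` option, the sketch below builds parameters with the documented shapes and applies them to the last axis. The helper names `init_channel_linear` and `channel_linear_sketch` are hypothetical, and the uniform bound used when `init_std` is None is an assumption; the library's default bound is not stated in this reference.

```python
import numpy as np


def init_channel_linear(in_channels, out_channels, seed=0, init_std=None):
    """Hypothetical initializer that mirrors the documented parameter shapes."""
    rng = np.random.default_rng(seed)
    if init_std is not None:
        # normal(0, init_std), e.g. for gate projections with small logits.
        weight = rng.normal(0.0, init_std, size=(in_channels, out_channels))
    else:
        # Assumed bound for illustration only; not taken from the library.
        bound = 1.0 / np.sqrt(in_channels)
        weight = rng.uniform(-bound, bound, size=(in_channels, out_channels))
    bias = np.zeros(out_channels)
    return weight, bias


def channel_linear_sketch(x, weight, bias=None):
    """Project the last axis: (..., in_channels) -> (..., out_channels)."""
    y = x @ weight
    return y if bias is None else y + bias


weight, bias = init_channel_linear(3, 4, seed=1, init_std=0.01)
x = np.ones((5, 9, 3))
y = channel_linear_sketch(x, weight, bias)  # shape (5, 9, 4)
```

Because the weight has rows as fan-in and columns as fan-out, the projection is a plain right-multiplication and all leading axes pass through unchanged.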
- class deepmd.dpmodel.descriptor.dpa4_nn.so3.SO3Linear(*, lmax: int, in_channels: int, out_channels: int, n_focus: int = 1, precision: str = DEFAULT_PRECISION, mlp_bias: bool = False, trainable: bool = True, seed: int | list[int] | None = None, init_std: float | None = None)
Bases: deepmd.dpmodel.NativeOP
Focus-aware degree-wise linear self-interaction.
This vectorized implementation avoids Python loops by using torch.einsum and index_select. The key insight is that weights are shared across all m components within each l block.
- Parameters:
- lmax
Maximum spherical harmonic degree.
- in_channels
Number of input channels per (l, m) coefficient.
- out_channels
Number of output channels per (l, m) coefficient.
- n_focus
Number of focus streams.
- precision
Parameter precision.
- mlp_bias
Whether to use bias for l=0 (scalar) components.
- trainable
Whether parameters are trainable.
- seed
Random seed for weight initialization.
- init_std
If given, use normal(0, init_std) for all weights instead of default trunc-normal fan-in/fan-out init. Use 0.0 for zero initialization.
Notes
- Weight storage: (lmax+1, C_in, F*C_out).
- Bias storage: (F*C_out,), only applied to l=0 scalar components.
- Runtime view restores weights to (lmax+1, C_in, F, C_out) via reshape.
- expand_index maps each packed (l,m) position to its l value.
- Einsum ndfi,difo->ndfo keeps the whole multi-focus path vectorized.
- In HybridMuon slice mode, each (C_in, F*C_out) slice gets an independent NS update with stable rectangular scaling.
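The reshape, index expansion, and einsum steps in these notes can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the library code: the packed (l,m) axis is assumed to be ordered degree by degree with 2l+1 entries per l, and the einsum letters are read as n = node, d = packed (l,m) position, f = focus, i = input channel, o = output channel.

```python
import numpy as np

lmax, n_focus, c_in, c_out = 2, 2, 3, 4
n_coeff = (lmax + 1) ** 2  # number of packed (l, m) positions

# expand_index: packed position -> degree l (assumed l-major packing).
degrees = np.arange(lmax + 1)
expand_index = np.repeat(degrees, 2 * degrees + 1)  # [0, 1, 1, 1, 2, 2, 2, 2, 2]

rng = np.random.default_rng(0)
weight = rng.normal(size=(lmax + 1, c_in, n_focus * c_out))  # storage layout
bias = rng.normal(size=(n_focus * c_out,))
x = rng.normal(size=(6, n_coeff, n_focus, c_in))  # assumed (n, d, f, i) layout

# Runtime view: split the packed output axis into (focus, out_channels).
w_view = weight.reshape(lmax + 1, c_in, n_focus, c_out)
# Share one weight block across all m within each l via index expansion.
w_packed = w_view[expand_index]  # (d, i, f, o)
y = np.einsum("ndfi,difo->ndfo", x, w_packed)
# Bias touches only the l=0 scalar position, which is packed index 0 here.
y[:, 0] += bias.reshape(n_focus, c_out)
```

Sharing the same (C_in, C_out) block across every m of a given l, and restricting the bias to l=0, is what keeps a map of this form compatible with rotations of the coefficients.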