deepmd.dpmodel.descriptor.dpa4_nn.so3

SO(3)-equivariant linear layers for DPA4/SeZM.

This module defines the channel-only and focus-aware linear maps used by SeZM SO(3) feature transformations.

This module is the dpmodel (array-API) port of deepmd.pt.model.descriptor.sezm_nn.so3.

Classes

FocusLinear

Per-focus linear projection on the last feature axis.

ChannelLinear

Channel-only linear projection on the last feature axis.

SO3Linear

Focus-aware degree-wise linear self-interaction.

Module Contents

class deepmd.dpmodel.descriptor.dpa4_nn.so3.FocusLinear(*, in_channels: int, out_channels: int, n_focus: int, precision: str = DEFAULT_PRECISION, bias: bool = True, trainable: bool = True, seed: int | list[int] | None = None, init_std: float | None = None)

Bases: deepmd.dpmodel.NativeOP

Per-focus linear projection on the last feature axis.

Parameters:
in_channels

Input feature dimension.

out_channels

Output feature dimension.

n_focus

Number of focus streams.

precision

Parameter precision.

bias

Whether to use bias.

trainable

Whether parameters are trainable.

seed

Random seed for initialization.

init_std

If given, use normal(0, init_std) instead of default uniform init. Useful for gate projections where small initial logits are desired.

Notes

Parameters are stored in (in, out) convention to match Muon’s rectangular correction assumption (rows=fan_in, cols=fan_out):

  • weight: (in_channels, n_focus * out_channels)

  • bias: (n_focus * out_channels,)
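The storage convention above can be illustrated with a minimal NumPy sketch. It is not the library implementation: the function name focus_linear is hypothetical, and the sketch assumes the fused last axis unpacks as (n_focus, out_channels), by analogy with the runtime view described for SO3Linear.

```python
import numpy as np


def focus_linear(x, weight, bias, n_focus):
    """Hypothetical per-focus projection.

    x: (B, F, C_in); weight: (C_in, F * C_out); bias: (F * C_out,) or None.
    """
    c_in = weight.shape[0]
    c_out = weight.shape[1] // n_focus
    # Assumed layout: the fused output axis unpacks as (F, C_out).
    w = weight.reshape(c_in, n_focus, c_out)
    # Each focus stream f uses its own (C_in, C_out) matrix.
    y = np.einsum("bfi,ifo->bfo", x, w)
    if bias is not None:
        y = y + bias.reshape(n_focus, c_out)
    return y


rng = np.random.default_rng(0)
B, F, C_in, C_out = 5, 3, 4, 2
x = rng.normal(size=(B, F, C_in))
weight = rng.normal(size=(C_in, F * C_out))
bias = rng.normal(size=(F * C_out,))
y = focus_linear(x, weight, bias, n_focus=F)
print(y.shape)
```

Keeping the weight as a single 2-D (fan_in, fan_out) matrix and reshaping only at call time is what lets a matrix-oriented optimizer treat it as one rectangular parameter.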

in_channels
out_channels
n_focus
precision = 'float64'
trainable = True
use_bias = True
weight
call(x: Any) -> Any
Parameters:
x

Input array with shape (B, F, C_in), where F is the number of focus streams.

Returns:
Array

Projected array with shape (B, F, C_out).

serialize() -> dict[str, Any]

Serialize the FocusLinear to a dict.

classmethod deserialize(data: dict[str, Any]) -> FocusLinear

Deserialize a FocusLinear from a dict.

class deepmd.dpmodel.descriptor.dpa4_nn.so3.ChannelLinear(*, in_channels: int, out_channels: int, precision: str = DEFAULT_PRECISION, bias: bool = True, trainable: bool = True, seed: int | list[int] | None = None, init_std: float | None = None)

Bases: deepmd.dpmodel.NativeOP

Channel-only linear projection on the last feature axis.

Parameters:
in_channels

Input feature dimension.

out_channels

Output feature dimension.

precision

Parameter precision.

bias

Whether to use bias.

trainable

Whether parameters are trainable.

seed

Random seed for initialization.

init_std

If given, use normal(0, init_std) instead of default uniform init. Useful for gate projections where small initial logits are desired.

Notes

Parameters are stored in (in, out) convention to match Muon’s rectangular correction assumption (rows=fan_in, cols=fan_out):

  • weight: (in_channels, out_channels)

  • bias: (out_channels,)
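As a rough illustration of the (in, out) convention and of why a small init_std suits gate projections, the following NumPy sketch may help. The function name channel_linear, the value of init_std, and the zero bias are assumptions made for the example, not values taken from the library.

```python
import numpy as np


def channel_linear(x, weight, bias=None):
    """Hypothetical channel-only projection.

    x: (..., C_in); weight: (C_in, C_out); bias: (C_out,) or None.
    """
    # With (in, out) storage the projection is a plain right-multiplication,
    # which broadcasts over all leading axes of x.
    y = x @ weight
    if bias is not None:
        y = y + bias
    return y


rng = np.random.default_rng(0)
C_in, C_out = 8, 4
init_std = 0.01  # illustrative value only
weight = rng.normal(0.0, init_std, size=(C_in, C_out))
bias = np.zeros(C_out)  # zero bias is an assumption of this sketch

x = rng.normal(size=(2, 3, 5, C_in))
logits = channel_linear(x, weight, bias)
# Small initial logits keep a sigmoid gate close to 0.5 at the start.
gate = 1.0 / (1.0 + np.exp(-logits))
print(logits.shape)
```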

in_channels
out_channels
precision = 'float64'
trainable = True
use_bias = True
weight
call(x: Any) -> Any
Parameters:
x

Input array with shape (..., C_in).

Returns:
Array

Projected array with shape (..., C_out).

serialize() -> dict[str, Any]

Serialize the ChannelLinear to a dict.

classmethod deserialize(data: dict[str, Any]) -> ChannelLinear

Deserialize a ChannelLinear from a dict.

class deepmd.dpmodel.descriptor.dpa4_nn.so3.SO3Linear(*, lmax: int, in_channels: int, out_channels: int, n_focus: int = 1, precision: str = DEFAULT_PRECISION, mlp_bias: bool = False, trainable: bool = True, seed: int | list[int] | None = None, init_std: float | None = None)

Bases: deepmd.dpmodel.NativeOP

Focus-aware degree-wise linear self-interaction.

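This vectorized implementation avoids Python loops over degrees by gathering the per-degree weights with an index lookup and contracting them in a single einsum (torch.einsum and index_select in the PyTorch module this one is ported from). The key insight is that weights are shared across all m components within each l block, so one (C_in, F*C_out) matrix per degree l is sufficient.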
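Why sharing weights across m matters can be checked numerically. The sketch below is a NumPy illustration under stated assumptions, not the library code: random orthogonal (2l+1) x (2l+1) blocks stand in for real Wigner-D rotation matrices, the packed (l, m) axis is assumed to be ordered by increasing l, and all variable names are hypothetical. With one weight matrix per l, the channel mixing commutes with any such block-diagonal transform; with an independent weight per (l, m) position it does not.

```python
import numpy as np

rng = np.random.default_rng(0)
lmax, N, F, C_in, C_out = 2, 4, 2, 3, 5
D = (lmax + 1) ** 2
# Degree l of every packed (l, m) position: [0, 1, 1, 1, 2, 2, 2, 2, 2].
expand_index = np.repeat(np.arange(lmax + 1), 2 * np.arange(lmax + 1) + 1)

# Block-diagonal stand-in for a rotation: one orthogonal block per degree l.
R = np.zeros((D, D))
start = 0
for l in range(lmax + 1):
    size = 2 * l + 1
    q, _ = np.linalg.qr(rng.normal(size=(size, size)))
    R[start:start + size, start:start + size] = q
    start += size


def rotate(x):
    # Mix the m components inside each l block; channels are untouched.
    return np.einsum("de,nefc->ndfc", R, x)


def mix(x, w_full):
    # w_full: (D, C_in, F, C_out), one channel-mixing matrix per position.
    return np.einsum("ndfi,difo->ndfo", x, w_full)


x = rng.normal(size=(N, D, F, C_in))

# Shared across m: one weight per l, gathered to every (l, m) position.
w_shared = rng.normal(size=(lmax + 1, C_in, F, C_out))[expand_index]
err_shared = np.abs(mix(rotate(x), w_shared) - rotate(mix(x, w_shared))).max()

# Not shared: an independent weight per (l, m) position.
w_free = rng.normal(size=(D, C_in, F, C_out))
err_free = np.abs(mix(rotate(x), w_free) - rotate(mix(x, w_free))).max()

print(err_shared, err_free)
```

The first error is at floating-point round-off level, while the second is of order one, which is the property that the per-l weight sharing is meant to preserve.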

Parameters:
lmax

Maximum spherical harmonic degree.

in_channels

Number of input channels per (l, m) coefficient.

out_channels

Number of output channels per (l, m) coefficient.

n_focus

Number of focus streams.

precision

Parameter precision.

mlp_bias

Whether to use bias for l=0 (scalar) components.

trainable

Whether parameters are trainable.

seed

Random seed for weight initialization.

init_std

If given, use normal(0, init_std) for all weights instead of default trunc-normal fan-in/fan-out init. Use 0.0 for zero initialization.

Notes

  • Weight storage: (lmax+1, C_in, F*C_out).

  • Bias storage: (F*C_out,), only applied to l=0 scalar components.

  • Runtime view restores weights to (lmax+1, C_in, F, C_out) via reshape.

  • expand_index maps each packed (l,m) position to its l value.

  • Einsum ndfi,difo->ndfo keeps the whole multi-focus path vectorized.

  • In HybridMuon slice mode, each (C_in, F*C_out) slice gets an independent Newton-Schulz (NS) update with stable rectangular scaling.
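The notes above can be tied together in a short NumPy sketch of the forward pass. This is an illustration under assumptions rather than the actual implementation: the helper names make_expand_index and so3_linear are hypothetical, and the packed (l, m) axis is assumed to be ordered by increasing l so that the single l=0 coefficient sits at index 0.

```python
import numpy as np


def make_expand_index(lmax):
    """Map each packed (l, m) position to its degree l."""
    degrees = np.arange(lmax + 1)
    # Degree l occupies 2l + 1 consecutive positions.
    return np.repeat(degrees, 2 * degrees + 1)


def so3_linear(x, weight, bias, expand_index, n_focus):
    """Hypothetical degree-wise linear map.

    x: (N, D, F, C_in); weight: (lmax+1, C_in, F*C_out);
    bias: (F*C_out,) or None.
    """
    n_l, c_in, fc_out = weight.shape
    c_out = fc_out // n_focus
    # Runtime view: split the fused output axis into (F, C_out).
    w = weight.reshape(n_l, c_in, n_focus, c_out)
    # Gather one weight block per packed position: (D, C_in, F, C_out).
    w_full = w[expand_index]
    y = np.einsum("ndfi,difo->ndfo", x, w_full)
    if bias is not None:
        # Bias touches only the l=0 scalar slot (assumed to be index 0).
        y[:, 0] += bias.reshape(n_focus, c_out)
    return y


rng = np.random.default_rng(0)
lmax, N, F, C_in, C_out = 2, 4, 2, 3, 5
expand_index = make_expand_index(lmax)
x = rng.normal(size=(N, expand_index.size, F, C_in))
weight = rng.normal(size=(lmax + 1, C_in, F * C_out))
bias = rng.normal(size=(F * C_out,))
y = so3_linear(x, weight, bias, expand_index, n_focus=F)
print(expand_index.tolist())  # [0, 1, 1, 1, 2, 2, 2, 2, 2]
print(y.shape)                # (4, 9, 2, 5)
```

Restricting the bias to l=0 is consistent with equivariance, since only scalar components are left unchanged by rotations.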

lmax
in_channels
out_channels
n_focus = 1
precision = 'float64'
trainable = True
ebed_dim = 1
mlp_bias = False
weight
expand_index
call(x: Any) -> Any
Parameters:
x

Input features with shape (N, D, F, C_in) where D=(lmax+1)^2.

Returns:
Array

Degree-wise mixed features with shape (N, D, F, C_out).

serialize() -> dict[str, Any]

Serialize the SO3Linear to a dict.

classmethod deserialize(data: dict[str, Any]) -> SO3Linear

Deserialize an SO3Linear from a dict.