deepmd.dpmodel.descriptor
Submodules
Package Contents
Classes
Attention-based descriptor which is proposed in the pretrainable DPA-1[1] model. | |
Concate a list of descriptors to form a new descriptor. | |
DeepPot-SE constructed from all information (both angular and radial) of | |
DeepPot-SE_R constructed from only the radial imformation of atomic configurations. |
Functions
- class deepmd.dpmodel.descriptor.DescrptDPA1(rcut: float, rcut_smth: float, sel: List[int] | int, ntypes: int, neuron: List[int] = [25, 50, 100], axis_neuron: int = 8, tebd_dim: int = 8, tebd_input_mode: str = 'concat', resnet_dt: bool = False, trainable: bool = True, type_one_side: bool = False, attn: int = 128, attn_layer: int = 2, attn_dotr: bool = True, attn_mask: bool = False, exclude_types: List[List[int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = DEFAULT_PRECISION, scaling_factor=1.0, normalize: bool = True, temperature: float | None = None, trainable_ln: bool = True, ln_eps: float | None = 1e-05, smooth_type_embedding: bool = True, concat_output_tebd: bool = True, spin: Any | None = None, seed: int | None = None)[source]
Bases:
deepmd.dpmodel.NativeOP
,deepmd.dpmodel.descriptor.base_descriptor.BaseDescriptor
Attention-based descriptor which is proposed in the pretrainable DPA-1[1] model.
This descriptor, \(\mathcal{D}^i \in \mathbb{R}^{M \times M_{<}}\), is given by
\[\mathcal{D}^i = \frac{1}{N_c^2}(\hat{\mathcal{G}}^i)^T \mathcal{R}^i (\mathcal{R}^i)^T \hat{\mathcal{G}}^i_<,\]where \(\hat{\mathcal{G}}^i\) represents the embedding matrix:math:mathcal{G}^i after additional self-attention mechanism and \(\mathcal{R}^i\) is defined by the full case in the se_e2_a descriptor. Note that we obtain \(\mathcal{G}^i\) using the type embedding method by default in this descriptor.
To perform the self-attention mechanism, the queries \(\mathcal{Q}^{i,l} \in \mathbb{R}^{N_c\times d_k}\), keys \(\mathcal{K}^{i,l} \in \mathbb{R}^{N_c\times d_k}\), and values \(\mathcal{V}^{i,l} \in \mathbb{R}^{N_c\times d_v}\) are first obtained:
\[\left(\mathcal{Q}^{i,l}\right)_{j}=Q_{l}\left(\left(\mathcal{G}^{i,l-1}\right)_{j}\right),\]\[\left(\mathcal{K}^{i,l}\right)_{j}=K_{l}\left(\left(\mathcal{G}^{i,l-1}\right)_{j}\right),\]\[\left(\mathcal{V}^{i,l}\right)_{j}=V_{l}\left(\left(\mathcal{G}^{i,l-1}\right)_{j}\right),\]where \(Q_{l}\), \(K_{l}\), \(V_{l}\) represent three trainable linear transformations that output the queries and keys of dimension \(d_k\) and values of dimension \(d_v\), and \(l\) is the index of the attention layer. The input embedding matrix to the attention layers, denoted by \(\mathcal{G}^{i,0}\), is chosen as the two-body embedding matrix.
Then the scaled dot-product attention method is adopted:
\[A(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{V}^{i,l}, \mathcal{R}^{i,l})=\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right)\mathcal{V}^{i,l},\]where \(\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right) \in \mathbb{R}^{N_c\times N_c}\) is attention weights. In the original attention method, one typically has \(\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}\right)=\mathrm{softmax}\left(\frac{\mathcal{Q}^{i,l} (\mathcal{K}^{i,l})^{T}}{\sqrt{d_{k}}}\right)\), with \(\sqrt{d_{k}}\) being the normalization temperature. This is slightly modified to incorporate the angular information:
\[\varphi\left(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l},\mathcal{R}^{i,l}\right) = \mathrm{softmax}\left(\frac{\mathcal{Q}^{i,l} (\mathcal{K}^{i,l})^{T}}{\sqrt{d_{k}}}\right) \odot \hat{\mathcal{R}}^{i}(\hat{\mathcal{R}}^{i})^{T},\]- where \(\hat{\mathcal{R}}^{i} \in \mathbb{R}^{N_c\times 3}\) denotes normalized relative coordinates,
\(\hat{\mathcal{R}}^{i}_{j} = \frac{\boldsymbol{r}_{ij}}{\lVert \boldsymbol{r}_{ij} \lVert}\) and \(\odot\) means element-wise multiplication.
- Then layer normalization is added in a residual way to finally obtain the self-attention local embedding matrix
\(\hat{\mathcal{G}}^{i} = \mathcal{G}^{i,L_a}\) after \(L_a\) attention layers:[^1]
\[\mathcal{G}^{i,l} = \mathcal{G}^{i,l-1} + \mathrm{LayerNorm}(A(\mathcal{Q}^{i,l}, \mathcal{K}^{i,l}, \mathcal{V}^{i,l}, \mathcal{R}^{i,l})).\]- Parameters:
- rcut: float
The cut-off radius \(r_c\)
- rcut_smth: float
From where the environment matrix should be smoothed \(r_s\)
- sel
list
[int
],int
list[int]: sel[i] specifies the maxmum number of type i atoms in the cut-off radius int: the total maxmum number of atoms in the cut-off radius
- ntypes
int
Number of element types
- neuron
list
[int
] Number of neurons in each hidden layers of the embedding net \(\mathcal{N}\)
- axis_neuron: int
Number of the axis neuron \(M_2\) (number of columns of the sub-matrix of the embedding matrix)
- tebd_dim: int
Dimension of the type embedding
- tebd_input_mode: str
The way to mix the type embeddings. Supported options are concat. (TODO need to support stripped_type_embedding option)
- resnet_dt: bool
Time-step dt in the resnet construction: y = x + dt * phi (Wx + b)
- trainable: bool
If the weights of this descriptors are trainable.
- trainable_ln: bool
Whether to use trainable shift and scale weights in layer normalization.
- ln_eps: float, Optional
The epsilon value for layer normalization.
- type_one_side: bool
If ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘False’.
- attn: int
Hidden dimension of the attention vectors
- attn_layer: int
Number of attention layers
- attn_dotr: bool
If dot the angular gate to the attention weights
- attn_mask: bool
(Only support False to keep consistent with other backend references.) (Not used in this version. True option is not implemented.) If mask the diagonal of attention weights
- exclude_types
List
[List
[int
]] The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection: float
Protection parameter to prevent division by zero errors during environment matrix calculations.
- set_davg_zero: bool
Set the shift of embedding net input to zero.
- activation_function: str
The activation function in the embedding net. Supported options are “relu”, “tanh”, “none”, “linear”, “softplus”, “sigmoid”, “relu6”, “gelu”, “gelu_tf”.
- precision: str
The precision of the embedding net parameters. Supported options are “float32”, “default”, “float16”, “float64”.
- scaling_factor: float
The scaling factor of normalization in calculations of attention weights. If temperature is None, the scaling of attention weights is (N_dim * scaling_factor)**0.5
- normalize: bool
Whether to normalize the hidden vectors in attention weights calculation.
- temperature: float
If not None, the scaling of attention weights is temperature itself.
- smooth_type_embedding: bool
Whether to use smooth process in attention weights calculation.
- concat_output_tebd: bool
Whether to concat type embedding at the output of the descriptor.
- spin
(Only support None to keep consistent with other backend references.) (Not used in this version. Not-none option is not implemented.) The old implementation of deepspin.
References
[1]Duo Zhang, Hangrui Bi, Fu-Zhi Dai, Wanrun Jiang, Linfeng Zhang, and Han Wang. 2022. DPA-1: Pretraining of Attention-based Deep Potential Model for Molecular Simulation. arXiv preprint arXiv:2208.08236.
- property dim_out
Returns the output dimension of this descriptor.
- mixed_types()[source]
If true, the discriptor 1. assumes total number of atoms aligned across frames; 2. requires a neighbor list that does not distinguish different atomic types.
If false, the discriptor 1. assumes total number of atoms of each atom type aligned across frames; 2. requires a neighbor list that distinguishes different atomic types.
Share the parameters of self to the base_class with shared_level during multitask training. If not start from checkpoint (resume is False), some seperated parameters (e.g. mean and stddev) will be re-calculated across different classes.
- abstract compute_input_stats(merged: List[dict], path: deepmd.utils.path.DPPath | None = None)[source]
Update mean and stddev for descriptor elements.
- call(coord_ext, atype_ext, nlist, mapping: numpy.ndarray | None = None)[source]
Compute the descriptor.
- Parameters:
- coord_ext
The extended coordinates of atoms. shape: nf x (nallx3)
- atype_ext
The extended aotm types. shape: nf x nall
- nlist
The neighbor list. shape: nf x nloc x nnei
- mapping
The index mapping from extended to lcoal region. not used by this descriptor.
- Returns:
descriptor
The descriptor. shape: nf x nloc x (ng x axis_neuron)
gr
The rotationally equivariant and permutationally invariant single particle representation. shape: nf x nloc x ng x 3
g2
The rotationally invariant pair-partical representation. this descriptor returns None
h2
The rotationally equivariant pair-partical representation. this descriptor returns None
sw
The smooth switch function.
- classmethod deserialize(data: dict) DescrptDPA1 [source]
Deserialize from dict.
- class deepmd.dpmodel.descriptor.DescrptHybrid(list: List[deepmd.dpmodel.descriptor.base_descriptor.BaseDescriptor | Dict[str, Any]])[source]
Bases:
deepmd.dpmodel.descriptor.base_descriptor.BaseDescriptor
,deepmd.dpmodel.common.NativeOP
Concate a list of descriptors to form a new descriptor.
- Parameters:
- mixed_types()[source]
Returns if the descriptor requires a neighbor list that distinguish different atomic types or not.
Share the parameters of self to the base_class with shared_level during multitask training. If not start from checkpoint (resume is False), some seperated parameters (e.g. mean and stddev) will be re-calculated across different classes.
- compute_input_stats(merged: List[dict], path: deepmd.utils.path.DPPath | None = None)[source]
Update mean and stddev for descriptor elements.
- call(coord_ext, atype_ext, nlist, mapping: numpy.ndarray | None = None)[source]
Compute the descriptor.
- Parameters:
- coord_ext
The extended coordinates of atoms. shape: nf x (nallx3)
- atype_ext
The extended aotm types. shape: nf x nall
- nlist
The neighbor list. shape: nf x nloc x nnei
- mapping
The index mapping, not required by this descriptor.
- Returns:
descriptor
The descriptor. shape: nf x nloc x (ng x axis_neuron)
gr
The rotationally equivariant and permutationally invariant single particle representation. shape: nf x nloc x ng x 3.
g2
The rotationally invariant pair-partical representation.
h2
The rotationally equivariant pair-partical representation.
sw
The smooth switch function.
- classmethod update_sel(global_jdata: dict, local_jdata: dict) dict [source]
Update the selection and perform neighbor statistics.
- classmethod deserialize(data: dict) DescrptHybrid [source]
Deserialize the model.
- Parameters:
- data
dict
The serialized data
- data
- Returns:
BD
The deserialized descriptor
- deepmd.dpmodel.descriptor.make_base_descriptor(t_tensor, fwd_method_name: str = 'forward')
Make the base class for the descriptor.
- Parameters:
- t_tensor
The type of the tensor. used in the type hint.
- fwd_method_name
Name of the forward method. For dpmodels, it should be “call”. For torch models, it should be “forward”.
- class deepmd.dpmodel.descriptor.DescrptSeA(rcut: float, rcut_smth: float, sel: List[int], neuron: List[int] = [24, 48, 96], axis_neuron: int = 8, resnet_dt: bool = False, trainable: bool = True, type_one_side: bool = True, exclude_types: List[List[int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = DEFAULT_PRECISION, spin: Any | None = None, seed: int | None = None)[source]
Bases:
deepmd.dpmodel.NativeOP
,deepmd.dpmodel.descriptor.base_descriptor.BaseDescriptor
DeepPot-SE constructed from all information (both angular and radial) of atomic configurations. The embedding takes the distance between atoms as input.
The descriptor \(\mathcal{D}^i \in \mathcal{R}^{M_1 \times M_2}\) is given by [1]
\[\mathcal{D}^i = (\mathcal{G}^i)^T \mathcal{R}^i (\mathcal{R}^i)^T \mathcal{G}^i_<\]where \(\mathcal{R}^i \in \mathbb{R}^{N \times 4}\) is the coordinate matrix, and each row of \(\mathcal{R}^i\) can be constructed as follows
\[(\mathcal{R}^i)_j = [ \begin{array}{c} s(r_{ji}) & \frac{s(r_{ji})x_{ji}}{r_{ji}} & \frac{s(r_{ji})y_{ji}}{r_{ji}} & \frac{s(r_{ji})z_{ji}}{r_{ji}} \end{array} ]\]where \(\mathbf{R}_{ji}=\mathbf{R}_j-\mathbf{R}_i = (x_{ji}, y_{ji}, z_{ji})\) is the relative coordinate and \(r_{ji}=\lVert \mathbf{R}_{ji} \lVert\) is its norm. The switching function \(s(r)\) is defined as:
\[\begin{split}s(r)= \begin{cases} \frac{1}{r}, & r<r_s \\ \frac{1}{r} \{ {(\frac{r - r_s}{ r_c - r_s})}^3 (-6 {(\frac{r - r_s}{ r_c - r_s})}^2 +15 \frac{r - r_s}{ r_c - r_s} -10) +1 \}, & r_s \leq r<r_c \\ 0, & r \geq r_c \end{cases}\end{split}\]Each row of the embedding matrix \(\mathcal{G}^i \in \mathbb{R}^{N \times M_1}\) consists of outputs of a embedding network \(\mathcal{N}\) of \(s(r_{ji})\):
\[(\mathcal{G}^i)_j = \mathcal{N}(s(r_{ji}))\]\(\mathcal{G}^i_< \in \mathbb{R}^{N \times M_2}\) takes first \(M_2\) columns of \(\mathcal{G}^i\). The equation of embedding network \(\mathcal{N}\) can be found at
deepmd.tf.utils.network.embedding_net()
.- Parameters:
- rcut
The cut-off radius \(r_c\)
- rcut_smth
From where the environment matrix should be smoothed \(r_s\)
- sel
list
[int
] sel[i] specifies the maxmum number of type i atoms in the cut-off radius
- neuron
list
[int
] Number of neurons in each hidden layers of the embedding net \(\mathcal{N}\)
- axis_neuron
Number of the axis neuron \(M_2\) (number of columns of the sub-matrix of the embedding matrix)
- resnet_dt
Time-step dt in the resnet construction: y = x + dt * phi (Wx + b)
- trainable
If the weights of embedding net are trainable.
- type_one_side
Try to build N_types embedding nets. Otherwise, building N_types^2 embedding nets
- exclude_types
List
[List
[int
]] The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection: float
Protection parameter to prevent division by zero errors during environment matrix calculations.
- set_davg_zero
Set the shift of embedding net input to zero.
- activation_function
The activation function in the embedding net. Supported options are “relu”, “tanh”, “none”, “linear”, “softplus”, “sigmoid”, “relu6”, “gelu”, “gelu_tf”.
- precision
The precision of the embedding net parameters. Supported options are “float32”, “default”, “float16”, “float64”.
- multi_task
If the model has multi fitting nets to train.
- spin
The deepspin object.
References
[1]Linfeng Zhang, Jiequn Han, Han Wang, Wissam A. Saidi, Roberto Car, and E. Weinan. 2018. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS’18). Curran Associates Inc., Red Hook, NY, USA, 4441-4451.
- property dim_out
Returns the output dimension of this descriptor.
- mixed_types()[source]
Returns if the descriptor requires a neighbor list that distinguish different atomic types or not.
Share the parameters of self to the base_class with shared_level during multitask training. If not start from checkpoint (resume is False), some seperated parameters (e.g. mean and stddev) will be re-calculated across different classes.
- abstract compute_input_stats(merged: List[dict], path: deepmd.utils.path.DPPath | None = None)[source]
Update mean and stddev for descriptor elements.
- call(coord_ext, atype_ext, nlist, mapping: numpy.ndarray | None = None)[source]
Compute the descriptor.
- Parameters:
- coord_ext
The extended coordinates of atoms. shape: nf x (nallx3)
- atype_ext
The extended aotm types. shape: nf x nall
- nlist
The neighbor list. shape: nf x nloc x nnei
- mapping
The index mapping from extended to lcoal region. not used by this descriptor.
- Returns:
descriptor
The descriptor. shape: nf x nloc x (ng x axis_neuron)
gr
The rotationally equivariant and permutationally invariant single particle representation. shape: nf x nloc x ng x 3
g2
The rotationally invariant pair-partical representation. this descriptor returns None
h2
The rotationally equivariant pair-partical representation. this descriptor returns None
sw
The smooth switch function.
- classmethod deserialize(data: dict) DescrptSeA [source]
Deserialize from dict.
- class deepmd.dpmodel.descriptor.DescrptSeR(rcut: float, rcut_smth: float, sel: List[int], neuron: List[int] = [24, 48, 96], resnet_dt: bool = False, trainable: bool = True, type_one_side: bool = True, exclude_types: List[List[int]] = [], env_protection: float = 0.0, set_davg_zero: bool = False, activation_function: str = 'tanh', precision: str = DEFAULT_PRECISION, spin: Any | None = None, seed: int | None = None)[source]
Bases:
deepmd.dpmodel.NativeOP
,deepmd.dpmodel.descriptor.base_descriptor.BaseDescriptor
DeepPot-SE_R constructed from only the radial imformation of atomic configurations.
- Parameters:
- rcut
The cut-off radius \(r_c\)
- rcut_smth
From where the environment matrix should be smoothed \(r_s\)
- sel
list
[int
] sel[i] specifies the maxmum number of type i atoms in the cut-off radius
- neuron
list
[int
] Number of neurons in each hidden layers of the embedding net \(\mathcal{N}\)
- resnet_dt
Time-step dt in the resnet construction: y = x + dt * phi (Wx + b)
- trainable
If the weights of embedding net are trainable.
- type_one_side
Try to build N_types embedding nets. Otherwise, building N_types^2 embedding nets
- exclude_types
List
[List
[int
]] The excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- set_davg_zero
Set the shift of embedding net input to zero.
- activation_function
The activation function in the embedding net. Supported options are “relu”, “tanh”, “none”, “linear”, “softplus”, “sigmoid”, “relu6”, “gelu”, “gelu_tf”.
- precision
The precision of the embedding net parameters. Supported options are “float32”, “default”, “float16”, “float64”.
- multi_task
If the model has multi fitting nets to train.
- spin
The deepspin object.
References
[1]Linfeng Zhang, Jiequn Han, Han Wang, Wissam A. Saidi, Roberto Car, and E. Weinan. 2018. End-to-end symmetry preserving inter-atomic potential energy model for finite and extended systems. In Proceedings of the 32nd International Conference on Neural Information Processing Systems (NIPS’18). Curran Associates Inc., Red Hook, NY, USA, 4441-4451.
- property dim_out
Returns the output dimension of this descriptor.
- mixed_types()[source]
Returns if the descriptor requires a neighbor list that distinguish different atomic types or not.
Share the parameters of self to the base_class with shared_level during multitask training. If not start from checkpoint (resume is False), some seperated parameters (e.g. mean and stddev) will be re-calculated across different classes.
- abstract compute_input_stats(merged: List[dict], path: deepmd.utils.path.DPPath | None = None)[source]
Update mean and stddev for descriptor elements.
- call(coord_ext, atype_ext, nlist, mapping: numpy.ndarray | None = None)[source]
Compute the descriptor.
- Parameters:
- coord_ext
The extended coordinates of atoms. shape: nf x (nallx3)
- atype_ext
The extended aotm types. shape: nf x nall
- nlist
The neighbor list. shape: nf x nloc x nnei
- mapping
The index mapping from extended to lcoal region. not used by this descriptor.
- Returns:
descriptor
The descriptor. shape: nf x nloc x (ng x axis_neuron)
gr
The rotationally equivariant and permutationally invariant single particle representation. shape: nf x nloc x ng x 3
g2
The rotationally invariant pair-partical representation. this descriptor returns None
h2
The rotationally equivariant pair-partical representation. this descriptor returns None
sw
The smooth switch function.
- classmethod deserialize(data: dict) DescrptSeR [source]
Deserialize from dict.