5.4. Training Parameters
Note
One can load, modify, and export the input file using the web-based tool DP-GUI, either online or hosted locally through the command line interface dp gui. All training parameters below can be set in DP-GUI. Clicking “SAVE JSON” downloads the input file for further training.
Note
One can benefit from IntelliSense and validation when writing JSON files using Visual Studio Code. See here to learn how to configure it.
- model:#
- type:
dictargument path:model- type_map:#
- type:
list[str], optional
argument path: model/type_map
A list of strings that gives a name to each atom type. Note that the number of atom types in a training system must be less than 128 in a GPU environment. If not given, type.raw in each system should use the same type indexes, and type_map.raw will have no effect.
- data_stat_nbatch:#
- type:
int, optional, default:10argument path:model/data_stat_nbatchThe model determines the normalization from the statistics of the data. This key specifies the number of frames in each system used for statistics.
- data_stat_protect:#
- type:
float, optional, default:0.01argument path:model/data_stat_protectProtect parameter for atomic energy regression.
- data_bias_nsample:#
- type:
int, optional, default:10argument path:model/data_bias_nsampleThe number of training samples in a system to compute and change the energy bias.
- use_srtab:#
- type:
str, optional
argument path: model/use_srtab
The table for the short-range pairwise interaction added on top of DP. The table is a text data file with (N_t + 1) * N_t / 2 + 1 columns. The first column is the distance between atoms. The second to the last columns are energies for pairs of certain types. For example, with two atom types, 0 and 1, columns 2 to 4 are for the 0-0, 0-1, and 1-1 pairs, respectively.
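The column layout of the table can be sketched in Python. The helper names below are hypothetical and not part of DeePMD-kit, and the pair ordering for more than two types is an assumption extrapolated from the two-type example (0-0, 0-1, 1-1).

```python
from itertools import combinations_with_replacement
from typing import Dict, Tuple


def srtab_num_columns(n_types: int) -> int:
    """One distance column plus one energy column per unordered type pair."""
    return (n_types + 1) * n_types // 2 + 1


def srtab_pair_columns(n_types: int) -> Dict[Tuple[int, int], int]:
    """Map each type pair (i, j) with i <= j to its 1-based column number.

    Assumption: pairs are ordered 0-0, 0-1, ..., 1-1, ... as in the
    two-type example; column 1 holds the distance.
    """
    pairs = combinations_with_replacement(range(n_types), 2)
    return {pair: col for col, pair in enumerate(pairs, start=2)}
```

For two atom types this gives four columns, with the 0-0, 0-1, and 1-1 energies in columns 2, 3, and 4.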
- smin_alpha:#
- type:
float, optionalargument path:model/smin_alphaThe short-range tabulated interaction will be switched according to the distance of the nearest neighbor. This distance is calculated by softmin. This parameter is the decaying parameter in the softmin. It is only required when use_srtab is provided.
- sw_rmin:#
- type:
float, optionalargument path:model/sw_rminThe lower boundary of the interpolation between short-range tabulated interaction and DP. It is only required when use_srtab is provided.
- sw_rmax:#
- type:
float, optionalargument path:model/sw_rmaxThe upper boundary of the interpolation between short-range tabulated interaction and DP. It is only required when use_srtab is provided.
- pair_exclude_types:#
- type:
list, optional, default: []
argument path: model/pair_exclude_types
(Supported Backend: PyTorch) The atom pairs of the listed types are not treated as neighbors, i.e. they do not see each other.
- atom_exclude_types:#
- type:
list, optional, default: []
argument path: model/atom_exclude_types
(Supported Backend: PyTorch) Exclude the atomic contribution of the listed atom types.
- preset_out_bias:#
- type:
dict[str, list[float | list[float] | None]]|NoneType, optional, default: None
argument path: model/preset_out_bias
(Supported Backend: PyTorch) The preset bias of the atomic output. Note that set_davg_zero should be set to true. The bias is provided as a dict. For example, for an energy model with three atom types, preset_out_bias may be given as { "energy": [null, 0., 1.] }. In this case the energy biases of types 1 and 2 are set to 0. and 1., respectively. A dipole model with two atom types may set preset_out_bias as { "dipole": [null, [0., 1., 2.]] }.
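As a minimal sketch, the three-type energy example can be built as a Python dict and serialized with the standard json module, which writes None as null. The variable name model_fragment is only illustrative, and the fragment shows this one key rather than a complete model section.

```python
import json

# Energy model with three atom types: type 0 has no preset value (None -> null),
# while types 1 and 2 are preset to 0.0 and 1.0.
model_fragment = {
    "preset_out_bias": {"energy": [None, 0.0, 1.0]},
}

# Serialize the fragment as it would appear in the JSON input file.
print(json.dumps(model_fragment))
```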
- srtab_add_bias:#
- type:
bool, optional, default: True
argument path: model/srtab_add_bias
(Supported Backend: TensorFlow) Whether to add the energy bias from the statistics of the data to the short-range tabulated atomic energy. It only takes effect when use_srtab is provided.
- type_embedding:#
- type:
dict, optionalargument path:model/type_embedding(Supported Backend: TensorFlow) The type embedding. In other backends, the type embedding is already included in the descriptor.
- neuron:#
- type:
list[int], optional, default:[8]argument path:model/type_embedding/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model/type_embedding/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model/type_embedding/resnet_dtWhether to use a “Timestep” in the skip connection
- precision:#
- type:
str, optional, default:defaultargument path:model/type_embedding/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model/type_embedding/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optional, default:Noneargument path:model/type_embedding/seedRandom seed for parameter initialization
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model/type_embedding/use_econf_tebdWhether to use an electronic-configuration-based type embedding.
- use_tebd_bias:#
- type:
bool, optional, default:Falseargument path:model/type_embedding/use_tebd_biasWhether to use a bias term in the type-embedding layer.
- modifier:#
- type:
dict, optionalargument path:model/modifier(Supported Backend: TensorFlow) The modifier of model output.
Depending on the value of type, different sub args are accepted.
- type:#
The type of modifier.
dipole_charge: Use WFCC to model the electronic structure of the system. Correct the long-range interaction.
When type is set to dipole_charge:
Use WFCC to model the electronic structure of the system. Correct the long-range interaction.
- model_name:#
- type:
strargument path:model/modifier[dipole_charge]/model_nameThe name of the frozen dipole model file.
- model_charge_map:#
- type:
list[float]argument path:model/modifier[dipole_charge]/model_charge_mapThe charge of the WFCC. The list length should be the same as the sel_type.
- sys_charge_map:#
- type:
list[float]argument path:model/modifier[dipole_charge]/sys_charge_mapThe charge of real atoms. The list length should be the same as the type_map
- ewald_beta:#
- type:
float, optional, default:0.4argument path:model/modifier[dipole_charge]/ewald_betaThe splitting parameter of Ewald sum. Unit is A^-1
- ewald_h:#
- type:
float, optional, default:1.0argument path:model/modifier[dipole_charge]/ewald_hThe grid spacing of the FFT grid. Unit is A
- compress:#
- type:
dict, optionalargument path:model/compress(Supported Backend: TensorFlow) Model compression configurations
- spin:#
- type:
dict, optionalargument path:model/spinThe settings for systems with spin.
- use_spin:#
- type:
list[bool]|list[int]argument path:model/spin/use_spinWhether to use atomic spin model for each atom type. List of boolean values with the shape of [ntypes] to specify which types use spin, or a list of integer values (Supported Backend: PyTorch) to indicate the index of the type that uses spin.
- spin_norm:#
- type:
list[float], optionalargument path:model/spin/spin_norm(Supported Backend: TensorFlow) The magnitude of atomic spin for each atom type with spin
- virtual_len:#
- type:
list[float], optionalargument path:model/spin/virtual_len(Supported Backend: TensorFlow) The distance between virtual atom representing spin and its corresponding real atom for each atom type with spin
- virtual_scale:#
- type:
list[float]|float, optionalargument path:model/spin/virtual_scale(Supported Backend: PyTorch) The scaling factor to determine the virtual distance between a virtual atom representing spin and its corresponding real atom for each atom type with spin. This factor is defined as the virtual distance divided by the magnitude of atomic spin for each atom type with spin. The virtual coordinate is defined as the real coordinate plus spin * virtual_scale. List of float values with shape of [ntypes] or [ntypes_spin] or one single float value for all types, only used when use_spin is True for each atom type.
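The definition of the virtual coordinate can be written out as a short sketch. The function name is hypothetical, and coordinates and spins are represented as plain 3-tuples for illustration only.

```python
from typing import Sequence, Tuple


def virtual_coordinate(
    real: Sequence[float], spin: Sequence[float], virtual_scale: float
) -> Tuple[float, ...]:
    """Virtual coordinate = real coordinate + spin * virtual_scale, per component."""
    return tuple(r + s * virtual_scale for r, s in zip(real, spin))
```

For example, a spin of (0, 0, 2) with virtual_scale 0.5 places the virtual atom 1.0 length unit above the real atom along z.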
- finetune_head:#
- type:
str, optionalargument path:model/finetune_head(Supported Backend: PyTorch) The chosen fitting net to fine-tune on, when doing multi-task fine-tuning. If not set or set to ‘RANDOM’, the fitting net will be randomly initialized.
Depending on the value of type, different sub args are accepted.
- type:#
- type:
str (flag key), default: standard
argument path: model/type
standard: Standard model, which contains a descriptor and a fitting.
dpa4: (Supported Backend: PyTorch) DPA4/SeZM model scaffold with fixed SeZM descriptor and fitting types.
pairtab: (Supported Backend: TensorFlow) Pairwise tabulation energy model.
pairwise_dprc: (Supported Backend: TensorFlow)
linear_ener: (Supported Backend: TensorFlow)
When type is set to standard:
Standard model, which contains a descriptor and a fitting.
- descriptor:#
- type:
dictargument path:model[standard]/descriptorThe descriptor of atomic environment.
Depending on the value of type, different sub args are accepted.
- type:#
- type:
str (flag key)
argument path: model[standard]/descriptor/type
possible choices: loc_frame, se_e2_a, dpa4, se_e3, se_a_tpe, se_e2_r, hybrid, se_atten, se_e3_tebd, se_atten_v2, dpa2, dpa3, se_a_ebd_v2, se_a_mask
The type of the descriptor.
loc_frame: (Supported Backend: TensorFlow) Defines a local frame at each atom, and computes the descriptor as local coordinates under this frame.
se_e2_a: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor.
dpa4: (Supported Backend: PyTorch) DPA4/SeZM descriptor implemented as the SeZM (Smooth Equivariant Zone-bridging Model) architecture.
se_e3: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Three-body embedding will be used by this descriptor.
se_a_tpe: (Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Type embedding will be used by this descriptor.
se_e2_r: Used by the smooth edition of Deep Potential. Only the distance between atoms is used to construct the descriptor.
hybrid: Concatenation of a list of descriptors into a new descriptor.
se_atten: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism will be used by this descriptor.
se_e3_tebd: (Supported Backend: PyTorch)
se_atten_v2: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism with new modifications will be used by this descriptor.
dpa2: (Supported Backend: PyTorch)
dpa3: (Supported Backend: PyTorch)
se_a_ebd_v2: (Supported Backend: TensorFlow)
se_a_mask: (Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. It can accept a variable number of atoms in a frame (Non-PBC system). aparam is required as an indicator matrix for the real/virtual sign of input atoms.
When type is set to loc_frame:
(Supported Backend: TensorFlow) Defines a local frame at each atom, and computes the descriptor as local coordinates under this frame.
- sel_a:#
- type:
list[int]argument path:model[standard]/descriptor[loc_frame]/sel_aA list of integers. The length of the list should be the same as the number of atom types in the system. sel_a[i] gives the selected number of type-i neighbors. The full relative coordinates of the neighbors are used by the descriptor.
- sel_r:#
- type:
list[int]argument path:model[standard]/descriptor[loc_frame]/sel_rA list of integers. The length of the list should be the same as the number of atom types in the system. sel_r[i] gives the selected number of type-i neighbors. Only the relative distances of the neighbors are used by the descriptor. sel_a[i] + sel_r[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius.
- rcut:#
- type:
float, optional, default: 6.0
argument path: model[standard]/descriptor[loc_frame]/rcut
The cut-off radius.
- axis_rule:#
- type:
list[int]argument path:model[standard]/descriptor[loc_frame]/axis_ruleA list of integers. The length should be 6 times the number of types.
axis_rule[i*6+0]: class of the atom defining the first axis of type-i atom. 0 for neighbors with full coordinates and 1 for neighbors only with relative distance.
axis_rule[i*6+1]: type of the atom defining the first axis of type-i atom.
axis_rule[i*6+2]: index of the axis atom defining the first axis. Note that the neighbors with the same class and type are sorted according to their relative distance.
axis_rule[i*6+3]: class of the atom defining the second axis of type-i atom. 0 for neighbors with full coordinates and 1 for neighbors only with relative distance.
axis_rule[i*6+4]: type of the atom defining the second axis of type-i atom.
axis_rule[i*6+5]: index of the axis atom defining the second axis. Note that the neighbors with the same class and type are sorted according to their relative distance.
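The flat axis_rule layout above can be unpacked into a per-type structure, as in the following sketch. The function name and the field labels ("class", "type", "index") are hypothetical conveniences, and the sample values in the usage note are purely illustrative rather than a recommended setting.

```python
from typing import Dict, List

# Labels for the three integers that describe one axis atom.
FIELDS = ("class", "type", "index")


def decode_axis_rule(axis_rule: List[int]) -> List[Dict[str, Dict[str, int]]]:
    """Split a flat axis_rule list into first/second axis settings per atom type."""
    if len(axis_rule) % 6 != 0:
        raise ValueError("axis_rule length must be 6 times the number of types")
    decoded = []
    for i in range(len(axis_rule) // 6):
        chunk = axis_rule[i * 6 : i * 6 + 6]
        decoded.append(
            {
                "first_axis": dict(zip(FIELDS, chunk[:3])),   # entries i*6+0 .. i*6+2
                "second_axis": dict(zip(FIELDS, chunk[3:])),  # entries i*6+3 .. i*6+5
            }
        )
    return decoded
```

Calling decode_axis_rule([0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0]) returns two entries, one for type 0 and one for type 1, each naming the class, type, and index of the atoms that define its two axes.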
When type is set to
se_e2_a (or its alias se_a): Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[standard]/descriptor[se_e2_a]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
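The “auto” rule can be illustrated with a small sketch. The function name is hypothetical, and the observed maximal neighbor counts are passed in directly instead of being computed from data. How the scaled count is rounded before being raised to a multiple of 4 is an assumption here (a ceiling is used), so the actual implementation may differ by one step of 4 in edge cases.

```python
import math
from typing import List


def auto_sel(max_neighbors: List[int], spec: str = "auto") -> List[int]:
    """Derive sel from per-type maximal neighbor counts under "auto" or "auto:factor"."""
    if spec == "auto":
        factor = 1.1  # "auto" is equivalent to "auto:1.1"
    elif spec.startswith("auto:"):
        factor = float(spec.split(":", 1)[1])
    else:
        raise ValueError("spec must be 'auto' or 'auto:factor'")
    sel = []
    for n in max_neighbors:
        scaled = math.ceil(n * factor)          # assumption: ceiling of the scaled count
        sel.append(4 * math.ceil(scaled / 4))   # round up to a multiple of 4
    return sel
```

With maximal neighbor counts of 46 and 92, the default factor 1.1 gives sel values of 52 and 104.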
- rcut:#
- type:
float, optional, default:6.0argument path:model[standard]/descriptor[se_e2_a]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default: 0.5
argument path: model[standard]/descriptor[se_e2_a]/rcut_smth
Where to start smoothing. For example, the 1/r term is smoothed from rcut to rcut_smth.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[standard]/descriptor[se_e2_a]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
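The skip-connection rule can be checked for a given neuron list with the sketch below. The function name is hypothetical, and only consecutive hidden layers are compared; how the input of the first layer is treated is not covered by this sketch.

```python
from typing import List


def skip_connection_layers(neuron: List[int]) -> List[int]:
    """Indices i >= 1 of hidden layers that get a skip connection from layer i - 1.

    A skip connection is built when a layer has the same size as the
    previous layer or is twice as large.
    """
    return [
        i
        for i in range(1, len(neuron))
        if neuron[i] == neuron[i - 1] or neuron[i] == 2 * neuron[i - 1]
    ]
```

For the default [10, 20, 40], both the second and third layers double the previous width, so both qualify for a skip connection.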
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[standard]/descriptor[se_e2_a]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[se_e2_a]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_e2_a]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default: False
argument path: model[standard]/descriptor[se_e2_a]/type_one_side
If true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of centric atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[se_e2_a]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_e2_a]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[se_e2_a]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[se_e2_a]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default: 0.0
argument path: model[standard]/descriptor[se_e2_a]/env_protection
(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when paddings are used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without protection.
- set_davg_zero:#
- type:
bool, optional, default: False
argument path: model[standard]/descriptor[se_e2_a]/set_davg_zero
Set the normalization average to zero. This option should be set when atom_ener in the energy fitting is used.
When type is set to
dpa4 (or its aliases DPA4, SeZM, sezm): (Supported Backend: PyTorch) DPA4/SeZM descriptor implemented as the SeZM (Smooth Equivariant Zone-bridging Model) architecture.
- sel:#
- type:
str|int|list[int], optional, default:autoargument path:model[standard]/descriptor[dpa4]/selThe maximum number of neighbors. It can be:
int: the total maximum number of neighbors within rcut (all types combined)
list[int]: sel[i] specifies the maximum number of type-i neighbors within rcut
str: Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail, it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiplies the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[standard]/descriptor[dpa4]/rcutThe cut-off radius.
- env_exp:#
- type:
list[int], optional, default:[7, 5]argument path:model[standard]/descriptor[dpa4]/env_expC^3 cutoff envelope exponents [rbf_env_exp, edge_env_exp]. rbf_env_exp controls radial basis function envelope decay; edge_env_exp controls message passing edge weight envelope decay. Larger values give weaker suppression.
- channels:#
- type:
int, optional, default:64argument path:model[standard]/descriptor[dpa4]/channelsTotal channels per (l,m) coefficient.
- basis_type:#
- type:
str, optional, default:besselargument path:model[standard]/descriptor[dpa4]/basis_typeRadial basis type. Supported values are bessel and gaussian.
- n_radial:#
- type:
int, optional, default:16argument path:model[standard]/descriptor[dpa4]/n_radialNumber of radial basis functions.
- radial_mlp:#
- type:
list[int], optional, default:[0]argument path:model[standard]/descriptor[dpa4]/radial_mlpHidden layer sizes for radial networks. An output layer of size (l_schedule[0]+1)*channels will be automatically appended. Use 0 as a placeholder to be replaced by channels.
- use_env_seed:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa4]/use_env_seed(Supported Backend: PyTorch) If True, seed the initial node state with local-environment information: apply environment matrix FiLM conditioning on l=0 features using 4D [s, s*r_hat] representation, and enable the non-scalar geometric initial embedding when l_schedule[0] > 0. If False, the initial state contains only atom-local scalar features before message passing. Internal dimensions are derived from channels: embed_dim=min(channels, 128), axis_dim=min(4 if embed_dim < 64 else 8, embed_dim-1), type_dim=clamp(channels//4, 8, 32), rbf_out_dim=max(32, embed_dim-2*type_dim), hidden_dim=min(256, max(2*embed_dim, rbf_out_dim+2*type_dim)).
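The derived internal dimensions can be evaluated directly from the formulas above. The sketch below is a transcription of those formulas, not the library's code: the function name is hypothetical, and clamp(x, 8, 32) is assumed to mean limiting x to the range [8, 32].

```python
from typing import Dict


def env_seed_dims(channels: int) -> Dict[str, int]:
    """Evaluate the internal dimensions derived from `channels`."""
    embed_dim = min(channels, 128)
    axis_dim = min(4 if embed_dim < 64 else 8, embed_dim - 1)
    type_dim = max(8, min(channels // 4, 32))  # clamp(channels // 4, 8, 32)
    rbf_out_dim = max(32, embed_dim - 2 * type_dim)
    hidden_dim = min(256, max(2 * embed_dim, rbf_out_dim + 2 * type_dim))
    return {
        "embed_dim": embed_dim,
        "axis_dim": axis_dim,
        "type_dim": type_dim,
        "rbf_out_dim": rbf_out_dim,
        "hidden_dim": hidden_dim,
    }
```

For the default channels=64 this yields embed_dim=64, axis_dim=8, type_dim=16, rbf_out_dim=32, and hidden_dim=128.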
- random_gamma:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa4]/random_gamma(Supported Backend: PyTorch) If True, apply a random roll about the edge-aligned local +Z axis before building Wigner-D blocks. The roll is sampled independently per edge and per forward call.
- lmax:#
- type:
int, optional, default:3argument path:model[standard]/descriptor[dpa4]/lmaxMaximum degree, only used when l_schedule is None.
- l_schedule:#
- type:
NoneType|list[int], optional, default:Noneargument path:model[standard]/descriptor[dpa4]/l_schedulePyramid schedule of lmax per block, e.g. [3, 3, 2]. Must be non-increasing. If set, lmax and n_blocks will be ignored.
- mmax:#
- type:
int|NoneType, optional, default:1argument path:model[standard]/descriptor[dpa4]/mmaxMaximum SO(2) order (|m|), only used when m_schedule is None. If None, defaults to the per-block lmax.
- m_schedule:#
- type:
NoneType|list[int], optional, default:Noneargument path:model[standard]/descriptor[dpa4]/m_scheduleSchedule of mmax per block. Must have the same length as l_schedule and satisfy m_schedule[i] <= l_schedule[i]. If set, mmax will be ignored.
- n_blocks:#
- type:
int, optional, default:3argument path:model[standard]/descriptor[dpa4]/n_blocksNumber of blocks (only used when l_schedule is None).
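The interplay of lmax, n_blocks, l_schedule, mmax, and m_schedule can be sketched as follows. The function name is hypothetical. Two points are assumptions not stated in the descriptions: when l_schedule is None, a constant schedule [lmax] * n_blocks is used, and a scalar mmax is capped at each block's lmax.

```python
from typing import List, Optional, Tuple


def resolve_schedules(
    lmax: int = 3,
    mmax: Optional[int] = 1,
    n_blocks: int = 3,
    l_schedule: Optional[List[int]] = None,
    m_schedule: Optional[List[int]] = None,
) -> Tuple[List[int], List[int]]:
    """Resolve per-block (lmax, mmax) schedules from the descriptor options."""
    if l_schedule is None:
        # Assumption: lmax and n_blocks define a constant schedule.
        l_schedule = [lmax] * n_blocks
    if any(nxt > cur for cur, nxt in zip(l_schedule, l_schedule[1:])):
        raise ValueError("l_schedule must be non-increasing")
    if m_schedule is None:
        # mmax=None falls back to the per-block lmax; otherwise cap mmax (assumption).
        m_schedule = [l if mmax is None else min(mmax, l) for l in l_schedule]
    if len(m_schedule) != len(l_schedule):
        raise ValueError("m_schedule must have the same length as l_schedule")
    if any(m > l for m, l in zip(m_schedule, l_schedule)):
        raise ValueError("m_schedule[i] must not exceed l_schedule[i]")
    return list(l_schedule), list(m_schedule)
```

With the defaults, this resolves to an l schedule of [3, 3, 3] and an m schedule of [1, 1, 1]; passing l_schedule=[3, 3, 2] makes lmax and n_blocks irrelevant.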
- so2_norm:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa4]/so2_normIf True, apply intermediate ReducedEquivariantRMSNorm between SO(2) mixing layers. When False (default), no normalization is applied between layers.
- so2_layers:#
- type:
int, optional, default:4argument path:model[standard]/descriptor[dpa4]/so2_layersNumber of SO(2) mixing layers per block.
- so2_attn_res:#
- type:
str, optional, default:noneargument path:model[standard]/descriptor[dpa4]/so2_attn_res(Supported Backend: PyTorch) Depth-wise attention residual mode across the internal SO(2) layer history inside each interaction block. Must be one of none, independent, or dependent.
- radial_so2_mode:#
- type:
str, optional, default:degree_channelargument path:model[standard]/descriptor[dpa4]/radial_so2_mode(Supported Backend: PyTorch) Dynamic radial degree mixer mode inside SO(2) convolution. none applies elementwise radial modulation. degree uses an edge-conditioned cross-degree kernel W[l_in,l_out,|m|](r) shared by all channels. degree_channel uses W[l_in,l_out,|m|,c](r), optionally low-rank when radial_so2_rank > 0.
- radial_so2_rank:#
- type:
int, optional, default:1argument path:model[standard]/descriptor[dpa4]/radial_so2_rank(Supported Backend: PyTorch) Low-rank channel factorization rank for radial_so2_mode=degree_channel. 0 uses the full per-channel dynamic degree kernel.
- n_focus:#
- type:
int, optional, default:1argument path:model[standard]/descriptor[dpa4]/n_focusNumber of parallel focus streams used only inside the SO(2) convolution.
- focus_dim:#
- type:
int, optional, default:0argument path:model[standard]/descriptor[dpa4]/focus_dimHidden width per focus stream inside the SO(2) convolution. 0 means using channels.
- n_atten_head:#
- type:
int, optional, default:1argument path:model[standard]/descriptor[dpa4]/n_atten_headNumber of attention heads when aggregating messages in SO(2) convolution. 0 applies a plain envelope-weighted scatter-sum. When >0, the attention width must be divisible by n_atten_head, and envelope-gated grouped softmax attention with output-side head gate is applied. Attention uses w**2 * exp(logit) in the numerator and zeta + sum(w**2 * exp(logit)) in the denominator.
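The attention normalization described above can be sketched for a single head and a single center atom. The function name is hypothetical, w is taken to be the per-neighbor envelope weight, and zeta is treated as a given non-negative number because its origin is not specified here.

```python
import math
from typing import List, Sequence


def envelope_attention_weights(
    logits: Sequence[float], envelope: Sequence[float], zeta: float
) -> List[float]:
    """Weights w**2 * exp(logit) / (zeta + sum(w**2 * exp(logit))) over neighbors."""
    numerators = [w**2 * math.exp(x) for x, w in zip(logits, envelope)]
    denominator = zeta + sum(numerators)
    return [n / denominator for n in numerators]
```

With zeta = 0 and all envelope weights equal to 1 this reduces to an ordinary softmax. A positive zeta makes the weights sum to less than 1, and a neighbor whose envelope weight is 0 receives zero attention.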
- atten_f_mix:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa4]/atten_f_mix(Supported Backend: PyTorch) If True, merge all SO(2) focus streams into one attention stream after rotate-back. Attention heads split n_focus * focus_dim (or n_focus * channels when focus_dim=0) instead of each focus stream independently. The default False preserves per-focus attention.
- atten_v_proj:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa4]/atten_v_proj(Supported Backend: PyTorch) If True, apply an explicit degree-aware value projection inside SO(2) attention. The default False keeps the raw rotated message as the attention value.
- atten_o_proj:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa4]/atten_o_proj(Supported Backend: PyTorch) If True, apply an explicit degree-aware output projection after the SO(2) attention output gate. The default False keeps the legacy output path without this projection.
- ffn_neurons:#
- type:
int, optional, default:0argument path:model[standard]/descriptor[dpa4]/ffn_neuronsHidden width for block FFNs and the final scalar output FFN. >0 uses the same explicit width for both. 0 lets each path resolve its own width from channels: 4 * channels without GLU, (8 / 3) * channels with GLU, then round up to a multiple of 32.
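The automatic width rule can be worked through with a short sketch. The function name and the glu argument are hypothetical, and integer arithmetic is used for the ceiling to avoid floating-point rounding.

```python
def resolve_ffn_width(channels: int, ffn_neurons: int = 0, glu: bool = True) -> int:
    """Hidden FFN width: explicit if ffn_neurons > 0, otherwise derived from channels."""
    if ffn_neurons > 0:
        return ffn_neurons
    # Width factor as a fraction: 4 without GLU, 8/3 with GLU.
    num, den = (8, 3) if glu else (4, 1)
    raw = -((-num * channels) // den)  # integer ceiling of num * channels / den
    return 32 * (-((-raw) // 32))      # round up to a multiple of 32
```

For channels=64, the width is 256 without GLU and 192 with GLU, since (8 / 3) * 64 is about 170.7 and the next multiple of 32 is 192.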
- grid_mlp:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa4]/grid_mlp(Supported Backend: PyTorch) If True, use the optional grid-MLP structure for the block-internal equivariant FFN. This does not change the final l=0 output head.
- ffn_blocks:#
- type:
int, optional, default:1argument path:model[standard]/descriptor[dpa4]/ffn_blocks(Supported Backend: PyTorch) Number of FFN sublayers per interaction block.
- sandwich_norm:#
- type:
list[bool], optional, default:[False, True, True, False]argument path:model[standard]/descriptor[dpa4]/sandwich_norm(Supported Backend: PyTorch) Pre/post-norm switches for residual branches. Use [so2_pre, so2_post, ffn_pre, ffn_post] to enable pre-norm before and post-norm after SO(2) and FFN operations.
- mlp_bias:#
- type:
bool, optional, default: False
argument path: model[standard]/descriptor[dpa4]/mlp_bias
(Supported Backend: PyTorch) Whether to use bias in equivariant layers. When False, removes bias from:
- SO3Linear: l=0 bias
- SO2Linear: l=0 bias
- GatedActivation: gate linear bias
- DepthAttnRes: input-dependent query projection
- EnvironmentInitialEmbedding MLPs: rbf_proj_layer1/2 and g_layer1/2
Attention logit and output-gate parameters in SO(2) convolution are always bias-free.
- layer_scale:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa4]/layer_scale(Supported Backend: PyTorch) If True, apply learnable LayerScale (init 1e-3) on residual branches: SO(2) branch uses per-focus-channel scales (shape (n_focus, focus_dim)) on each SO(2) mixing layer, and FFN branch uses per-channel scales (shape (channels,)) on each FFN residual branch.
- full_attn_res:#
- type:
str, optional, default:noneargument path:model[standard]/descriptor[dpa4]/full_attn_res(Supported Backend: PyTorch) Descriptor-level full attention residual mode over the unit history [x0, so2_0, ffn_0_0, ffn_0_1, …, so2_1, ffn_1_0, ffn_1_1, …]. independent uses learned query vectors, while dependent derives the query from the current SeZM state before the SO(2) unit, before each FFN unit, and before the final aggregation. Must be one of none, independent, or dependent. Cannot be enabled together with block_attn_res.
- block_attn_res:#
- type:
str, optional, default:noneargument path:model[standard]/descriptor[dpa4]/block_attn_res(Supported Backend: PyTorch) Descriptor-level block attention residual mode over block history [x0, b1, b2, …], where each block summary is the sum of the SO(2) unit output and all FFN unit outputs inside one interaction block. independent uses learned query vectors, while dependent derives queries from the current SeZM state before the SO(2) unit, before each FFN unit, and before the final block aggregation. Must be one of none, independent, or dependent. Cannot be enabled together with full_attn_res.
- s2_activation:#
- type:
list[bool], optional, default: [False, True]
argument path: model[standard]/descriptor[dpa4]/s2_activation
(Supported Backend: PyTorch) Two booleans [so2_enabled, ffn_enabled]. so2_enabled=true makes the SO(2) gated activation path use activation_function="silu". ffn_enabled=true makes the block-internal FFN path use activation_function="silu" and glu_activation=true. S2-grid resolutions are resolved automatically per block. The e3nn SO(2) grid is [2 * mmax + 4, ceil_even(3 * lmax + 2)], and the e3nn FFN grid is lifted to [max(R_phi, R_theta), max(R_phi, R_theta)]. Lebedev branches use the smallest packaged rule with precision at least 3 * lmax. The final scalar output FFN is unchanged.
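The e3nn grid sizes quoted above can be computed per block with the sketch below. The function names are hypothetical. Two readings are assumptions: ceil_even rounds up to the nearest even integer, and the SO(2) grid is ordered as [R_phi, R_theta].

```python
from typing import Tuple


def ceil_even(n: int) -> int:
    """Round an integer up to the nearest even integer (assumed meaning)."""
    return n + (n % 2)


def e3nn_s2_grids(lmax: int, mmax: int) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return (SO(2) grid, FFN grid) resolutions for one block."""
    so2_grid = (2 * mmax + 4, ceil_even(3 * lmax + 2))
    side = max(so2_grid)  # the FFN grid is lifted to a square grid
    return so2_grid, (side, side)
```

For lmax=3 and mmax=1 this gives an SO(2) grid of (6, 12) and an FFN grid of (12, 12).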
- lebedev_quadrature:#
- type:
bool|list[bool], optional, default:Trueargument path:model[standard]/descriptor[dpa4]/lebedev_quadrature(Supported Backend: PyTorch) Either one boolean applied to both S2 branches, or two booleans [so2_enabled, ffn_enabled] aligned with s2_activation. If a branch is enabled here, its S2 projector uses packaged Lebedev quadrature rules instead of the e3nn product grid. The default keeps the existing e3nn behavior.
- activation_function:#
- type:
str, optional, default: silu
argument path: model[standard]/descriptor[dpa4]/activation_function
Base activation function for helper MLPs, the SO(2) gated activation path, and the final scalar output FFN. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. It is overridden to “silu” only on paths whose s2_activation switch is enabled.
- glu_activation:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa4]/glu_activation(Supported Backend: PyTorch) Base GLU switch for FFN (e.g., silu -> swiglu, gelu -> geglu). The block-internal FFN overrides this to true when s2_activation[1]=true, while the final scalar output FFN keeps the user-provided value.
- use_amp:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa4]/use_ampIf True, use automatic mixed precision (AMP) with bfloat16 on CUDA during training. This can improve speed and reduce memory usage. Enabling this option is recommended on GPUs with native bfloat16 support. Disable it on GPUs without native bfloat16 support to avoid runtime errors or additional conversion overhead.
- add_chg_spin_ebd:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa4]/add_chg_spin_ebd(Supported Backend: PyTorch) Whether to add frame-level charge and spin conditions to the descriptor type embedding.
- default_chg_spin:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[standard]/descriptor[dpa4]/default_chg_spin(Supported Backend: PyTorch) Default frame-level charge and spin conditions [charge, spin]. This option is used only when add_chg_spin_ebd is enabled. If set, the value is used when explicit charge_spin data are not provided, including during .pt2 inference.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[dpa4]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1. When the SeZM descriptor is used inside a full SeZM model config, prefer the model-level pair_exclude_types; if both fields are provided, they must match.
- precision:#
- type:
str, optional, default:float32argument path:model[standard]/descriptor[dpa4]/precisionThe precision of the descriptor parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”.
- eps:#
- type:
float, optional, default:1e-07argument path:model[standard]/descriptor[dpa4]/eps(Supported Backend: PyTorch) Small epsilon for numerical stability in division and normalization.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa4]/trainableIf the parameters in the descriptor are trainable.
- seed:#
- type:
int|NoneType, optional, default:Noneargument path:model[standard]/descriptor[dpa4]/seedRandom seed for parameter initialization.
When type is set to
se_e3(or its aliasesse_at,se_a_3be,se_t):Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Three-body embedding will be used by this descriptor.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[standard]/descriptor[se_e3]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option automatically determines sel: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
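The “auto” rule described above can be sketched in a few lines of Python. This is only an illustration of the stated arithmetic (scale by the factor, round up to a multiple of 4), not the library's implementation; `resolve_sel` and `max_neighbors` are hypothetical names, and the neighbor counts would normally be measured from the training data.

```python
import math


def resolve_sel(sel, max_neighbors):
    """Resolve a sel setting into a per-type list of selected neighbors.

    max_neighbors[i] is the largest number of type-i neighbors observed
    within the cutoff radius (hypothetical input for this sketch).
    """
    if isinstance(sel, list):
        return sel  # an explicit list is used as given
    if sel == "auto":
        factor = 1.1  # "auto" is equivalent to "auto:1.1"
    elif sel.startswith("auto:"):
        factor = float(sel.split(":", 1)[1])
    else:
        raise ValueError(f"unsupported sel value: {sel!r}")
    # Scale each maximum by the factor, then round up to a multiple of 4.
    return [math.ceil(n * factor / 4) * 4 for n in max_neighbors]


print(resolve_sel("auto", [46, 92]))      # [52, 104]
print(resolve_sel("auto:1.5", [10, 30]))  # [16, 48]
```

Because the scaling uses floating-point arithmetic, a product that lands exactly on a multiple of 4 may still be rounded up to the next multiple.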
- rcut:#
- type:
float, optional, default:6.0argument path:model[standard]/descriptor[se_e3]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[standard]/descriptor[se_e3]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed between rcut_smth and rcut.
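To make the roles of rcut_smth and rcut concrete, the sketch below smooths the 1/r term with a fifth-order polynomial switch. The polynomial form is an assumption made for illustration (this section does not state the formula), and `smoothed_inverse_r` is a hypothetical name.

```python
def smoothed_inverse_r(r, rcut_smth=0.5, rcut=6.0):
    """Return 1/r, smoothly switched off between rcut_smth and rcut.

    The fifth-order polynomial below is an assumed switching function;
    it equals 1 at rcut_smth and 0 at rcut.
    """
    if r < rcut_smth:
        return 1.0 / r  # untouched below rcut_smth
    if r < rcut:
        u = (r - rcut_smth) / (rcut - rcut_smth)
        return (u**3 * (-6 * u**2 + 15 * u - 10) + 1) / r
    return 0.0  # no contribution at or beyond the cut-off radius


for r in (0.25, 0.5, 3.0, 6.0):
    print(r, smoothed_inverse_r(r))
```

Below rcut_smth the term is left unchanged, and at rcut it reaches zero without a jump.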
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[standard]/descriptor[se_e3]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
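The skip-connection rule for neuron can be checked with a small helper. This is a sketch of the rule as stated above, not the library's code; `skip_connections` is a hypothetical name, and the input width `in_dim=1` is an assumption.

```python
def skip_connections(neuron, in_dim=1):
    """Classify each hidden layer by the skip connection the rule implies.

    in_dim is the width of the network input (assumed to be 1 here).
    """
    kinds = []
    prev = in_dim
    for width in neuron:
        if width == prev:
            kinds.append("same-size skip")
        elif width == 2 * prev:
            kinds.append("doubling skip")
        else:
            kinds.append("no skip")
        prev = width
    return kinds


print(skip_connections([10, 20, 40]))
# ['no skip', 'doubling skip', 'doubling skip']
```

With the default [10, 20, 40], every layer after the first doubles the previous width and therefore qualifies for a skip connection.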
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[se_e3]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_e3]/resnet_dtWhether to use a “Timestep” in the skip connection
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[se_e3]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_e3]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[se_e3]/seedRandom seed for parameter initialization
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_e3]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[se_e3]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
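As an illustration of how exclude_types can be read, the sketch below expands the pair list into a boolean interaction mask. `build_pair_mask` is a hypothetical helper, and treating each pair as symmetric is an assumption.

```python
def build_pair_mask(ntypes, exclude_types):
    """Return mask[i][j] == True when types i and j interact."""
    mask = [[True] * ntypes for _ in range(ntypes)]
    for ti, tj in exclude_types:
        mask[ti][tj] = False
        mask[tj][ti] = False  # assumed symmetric: (0, 1) also excludes (1, 0)
    return mask


mask = build_pair_mask(3, [[0, 1]])
print(mask[0][1], mask[1][0], mask[0][2])  # False False True
```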
- env_protection:#
- type:
float, optional, default:0.0argument path:model[standard]/descriptor[se_e3]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without this protection.
When type is set to
se_a_tpe(or its aliasse_a_ebd):(Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Type embedding will be used by this descriptor.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[standard]/descriptor[se_a_tpe]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option automatically determines sel: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[standard]/descriptor[se_a_tpe]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[standard]/descriptor[se_a_tpe]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed between rcut_smth and rcut.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[standard]/descriptor[se_a_tpe]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[standard]/descriptor[se_a_tpe]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[se_a_tpe]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_a_tpe]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_a_tpe]/type_one_sideIf true, the embedding network parameters vary by the type of the neighbor atom only, so there are $N_\text{types}$ sets of embedding network parameters. Otherwise, they vary by the types of both the central and the neighbor atoms, so there are $N_\text{types}^2$ sets of embedding network parameters.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[se_a_tpe]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_a_tpe]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[se_a_tpe]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[se_a_tpe]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[standard]/descriptor[se_a_tpe]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without this protection.
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_a_tpe]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
- type_nchanl:#
- type:
int, optional, default:4argument path:model[standard]/descriptor[se_a_tpe]/type_nchanlNumber of channels for the type embedding.
- type_nlayer:#
- type:
int, optional, default:2argument path:model[standard]/descriptor[se_a_tpe]/type_nlayerNumber of hidden layers in the type embedding net.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[standard]/descriptor[se_a_tpe]/numb_aparamDimension of the atomic parameter. If set to a value > 0, the atomic parameters are embedded.
When type is set to
se_e2_r(or its aliasse_r):Used by the smooth edition of Deep Potential. Only the distance between atoms is used to construct the descriptor.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[standard]/descriptor[se_e2_r]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option automatically determines sel: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[standard]/descriptor[se_e2_r]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[standard]/descriptor[se_e2_r]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed between rcut_smth and rcut.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[standard]/descriptor[se_e2_r]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[se_e2_r]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_e2_r]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_e2_r]/type_one_sideIf true, the embedding network parameters vary by the type of the neighbor atom only, so there are $N_\text{types}$ sets of embedding network parameters. Otherwise, they vary by the types of both the central and the neighbor atoms, so there are $N_\text{types}^2$ sets of embedding network parameters.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[se_e2_r]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_e2_r]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[se_e2_r]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[se_e2_r]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_e2_r]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
- env_protection:#
- type:
float, optional, default:0.0argument path:model[standard]/descriptor[se_e2_r]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without this protection.
When type is set to
hybrid:Concatenates a list of descriptors into a new descriptor.
- list:#
- type:
listargument path:model[standard]/descriptor[hybrid]/listA list of descriptor definitions
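A hybrid descriptor block might be assembled as follows. The keys are those documented in this section, while the chosen sub-descriptors and all numeric values are placeholders for illustration only, not recommended settings.

```python
import json

# Illustrative only: two sub-descriptors concatenated by "hybrid".
hybrid_descriptor = {
    "type": "hybrid",
    "list": [
        {
            "type": "se_e2_r",
            "sel": "auto",
            "rcut": 6.0,
            "rcut_smth": 0.5,
            "neuron": [10, 20, 40],
        },
        {
            "type": "se_e3",
            "sel": "auto:1.2",
            "rcut": 4.0,
            "rcut_smth": 0.5,
            "neuron": [10, 20, 40],
        },
    ],
}

# Nest it under model/descriptor, as the argument paths indicate.
print(json.dumps({"model": {"descriptor": hybrid_descriptor}}, indent=2))
```

Each entry of list is itself a complete descriptor definition with its own type and parameters.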
When type is set to
se_atten(or its aliasdpa1):Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism will be used by this descriptor.
- sel:#
- type:
str|int|list[int], optional, default:autoargument path:model[standard]/descriptor[se_atten]/selThis parameter sets the number of selected neighbors. Note that it differs slightly from the same parameter in other descriptors: neighbors are not separated by atom type, and only the total matters. This number strongly affects efficiency, so it should not be too large. Usually 200 or less is enough, far below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the sum of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option automatically determines sel: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[standard]/descriptor[se_atten]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[standard]/descriptor[se_atten]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed between rcut_smth and rcut.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[standard]/descriptor[se_atten]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[standard]/descriptor[se_atten]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[se_atten]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten]/type_one_sideIf ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘False’.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[se_atten]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[se_atten]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[se_atten]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[standard]/descriptor[se_atten]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without this protection.
- attn:#
- type:
int, optional, default:128argument path:model[standard]/descriptor[se_atten]/attnThe length of hidden vectors in attention layers
- attn_layer:#
- type:
int, optional, default:2argument path:model[standard]/descriptor[se_atten]/attn_layerThe number of attention layers. Note that with the PyTorch backend, model compression of se_atten works for any attn_layer value when tebd_input_mode==’strip’; other backends still require attn_layer=0 for compression. When attn_layer!=0, only the type embedding is compressed; the geometric parts are not compressed.
- attn_dotr:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten]/attn_dotrWhether to do dot product with the normalized relative coordinates
- attn_mask:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten]/attn_maskWhether to mask the diagonal in the attention matrix
- stripped_type_embedding:#
- type:
bool|NoneType, optional, default:Noneargument path:model[standard]/descriptor[se_atten]/stripped_type_embedding(Deprecated, kept only for compatibility.) Whether to strip the type embedding into a separate embedding network. Setting this parameter to True is equivalent to setting tebd_input_mode to ‘strip’. Setting it to False is equivalent to setting tebd_input_mode to ‘concat’. The default value is None, which means the tebd_input_mode setting will be used instead.
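The precedence between the deprecated flag and tebd_input_mode can be summarized in a small sketch. `resolve_tebd_input_mode` is a hypothetical helper that only restates the mapping described above.

```python
def resolve_tebd_input_mode(stripped_type_embedding=None, tebd_input_mode="concat"):
    """Return the effective tebd_input_mode given the deprecated flag."""
    if stripped_type_embedding is None:
        return tebd_input_mode  # the flag is unset, so tebd_input_mode applies
    return "strip" if stripped_type_embedding else "concat"


print(resolve_tebd_input_mode())                                # concat
print(resolve_tebd_input_mode(tebd_input_mode="strip"))         # strip
print(resolve_tebd_input_mode(stripped_type_embedding=True))    # strip
```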
- smooth_type_embedding:#
- type:
bool, optional, default:False, alias: smooth_type_embddingargument path:model[standard]/descriptor[se_atten]/smooth_type_embeddingWhether to apply smoothing in the attention weights calculation. (Supported Backend: TensorFlow) When using stripped type embedding, whether to multiply the type-embedding network output by the smooth factor to keep the network smooth, instead of setting set_davg_zero to True.
- set_davg_zero:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten]/set_davg_zeroSet the normalization average to zero. This option should be set when se_atten descriptor or atom_ener in the energy fitting is used
- trainable_ln:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten]/trainable_lnWhether to use trainable shift and scale weights in layer normalization.
- ln_eps:#
- type:
NoneType|float, optional, default:Noneargument path:model[standard]/descriptor[se_atten]/ln_epsThe epsilon value for layer normalization. The default is 1e-3 for TensorFlow, for consistency with Keras, and 1e-5 for the PyTorch and DP implementations.
- tebd_dim:#
- type:
int, optional, default:8argument path:model[standard]/descriptor[se_atten]/tebd_dim(Supported Backend: PyTorch) Dimension of the atom-type embedding (tebd).
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten]/use_econf_tebd(Supported Backend: PyTorch) Whether to use electronic configuration type embedding. For TensorFlow backend, please set use_econf_tebd in type_embedding block instead.
- use_tebd_bias:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
- tebd_input_mode:#
- type:
str, optional, default:concatargument path:model[standard]/descriptor[se_atten]/tebd_input_modeHow the atom-type embedding (tebd) is fed into the descriptor. Supported modes are [‘concat’, ‘strip’].
‘concat’: Concatenate the type embedding with the smoothed radial information as the combined input to the embedding network. When type_one_side is False, the input is input_ij = concat([r_ij, tebd_j, tebd_i]). When type_one_side is True, the input is input_ij = concat([r_ij, tebd_j]). The output is out_ij = embedding(input_ij) for the pair-wise representation of atom i with neighbor j.
‘strip’: Use a separate embedding network for the type embedding and combine its output with the radial embedding-network output. When type_one_side is False, the input is input_t = concat([tebd_j, tebd_i]). (Supported Backend: PyTorch) When type_one_side is True, the input is input_t = tebd_j. The output is out_ij = embedding_t(input_t) * embedding_s(r_ij) + embedding_s(r_ij) for the pair-wise representation of atom i with neighbor j.
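The two modes can be contrasted with a toy sketch in plain Python. The `toy_embedding` function is a stand-in for a real embedding network (an assumption for illustration), and the function names are hypothetical; only the way inputs are assembled and outputs combined follows the description above.

```python
import math


def toy_embedding(x, width=4, shift=0.0):
    """Stand-in for an embedding network: a fixed tanh feature map."""
    s = sum(x)
    return [math.tanh(s * (k + 1) * 0.1 + shift) for k in range(width)]


def pair_repr_concat(r_ij, tebd_i, tebd_j, type_one_side=False):
    # input_ij = concat([r_ij, tebd_j, tebd_i]), or without tebd_i if one-sided
    input_ij = [r_ij] + tebd_j + ([] if type_one_side else tebd_i)
    return toy_embedding(input_ij)


def pair_repr_strip(r_ij, tebd_i, tebd_j, type_one_side=False):
    # input_t = concat([tebd_j, tebd_i]), or tebd_j alone if one-sided
    input_t = tebd_j + ([] if type_one_side else tebd_i)
    g_t = toy_embedding(input_t, shift=0.5)  # plays the role of embedding_t
    g_s = toy_embedding([r_ij])              # plays the role of embedding_s
    # out_ij = embedding_t(input_t) * embedding_s(r_ij) + embedding_s(r_ij)
    return [t * s + s for t, s in zip(g_t, g_s)]


tebd_i, tebd_j = [0.2, -0.1], [0.4, 0.3]
print(pair_repr_concat(0.8, tebd_i, tebd_j))
print(pair_repr_strip(0.8, tebd_i, tebd_j))
```

In ‘strip’ mode the type part depends only on atom types, not on r_ij, so it can be evaluated separately from the radial part.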
- scaling_factor:#
- type:
float, optional, default:1.0argument path:model[standard]/descriptor[se_atten]/scaling_factor(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K). If temperature is None, the scaling of attention weights is (N_hidden_dim * scaling_factor)**0.5. Else, the scaling of attention weights is set to temperature.
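The relationship between scaling_factor and temperature can be sketched as follows. The choice of scale restates the rule above; dividing the raw scores by that scale before the softmax is an assumption based on standard scaled dot-product attention, and both function names are hypothetical.

```python
import math


def attention_scale(hidden_dim, scaling_factor=1.0, temperature=None):
    """Return the scale applied to matmul(Q, K)."""
    if temperature is None:
        return (hidden_dim * scaling_factor) ** 0.5
    return temperature  # an explicit temperature takes precedence


def attention_weights(scores, scale):
    """Softmax over scores divided by scale (division is an assumption)."""
    scaled = [s / scale for s in scores]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


scale = attention_scale(hidden_dim=128, scaling_factor=2.0)
print(scale)  # 16.0
print(attention_weights([4.0, 8.0, 12.0], scale))
```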
- normalize:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten]/normalize(Supported Backend: PyTorch) Whether to normalize the hidden vectors during attention calculation.
- temperature:#
- type:
float, optionalargument path:model[standard]/descriptor[se_atten]/temperature(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K).
- concat_output_tebd:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten]/concat_output_tebd(Supported Backend: PyTorch) Whether to concatenate the type embedding to the descriptor output.
When type is set to
se_e3_tebd:(Supported Backend: PyTorch)
- sel:#
- type:
str|int|list[int], optional, default:autoargument path:model[standard]/descriptor[se_e3_tebd]/selThis parameter sets the number of selected neighbors. Note that it differs slightly from the same parameter in other descriptors: neighbors are not separated by atom type, and only the total matters. This number strongly affects efficiency, so it should not be too large. Usually 200 or less is enough, far below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the sum of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option automatically determines sel: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[standard]/descriptor[se_e3_tebd]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[standard]/descriptor[se_e3_tebd]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed between rcut_smth and rcut.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[standard]/descriptor[se_e3_tebd]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- tebd_dim:#
- type:
int, optional, default:8argument path:model[standard]/descriptor[se_e3_tebd]/tebd_dim(Supported Backend: PyTorch) Dimension of the atom-type embedding (tebd).
- tebd_input_mode:#
- type:
str, optional, default:concatargument path:model[standard]/descriptor[se_e3_tebd]/tebd_input_modeHow the atom-type embedding (tebd) is fed into the descriptor. Supported modes are [‘concat’, ‘strip’].
‘concat’: Concatenate the type embedding with the smoothed angular information as the combined input to the embedding network. The input is input_jk = concat([angle_jk, tebd_j, tebd_k]). The output is out_jk = embedding(input_jk) for the three-body representation of atom i with neighbors j and k.
‘strip’: Use a separate embedding network for the type embedding and combine its output with the angular embedding-network output. The input is input_t = concat([tebd_j, tebd_k]). The output is out_jk = embedding_t(input_t) * embedding_s(angle_jk) + embedding_s(angle_jk) for the three-body representation of atom i with neighbors j and k.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_e3_tebd]/resnet_dtWhether to use a “Timestep” in the skip connection
- set_davg_zero:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_e3_tebd]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[se_e3_tebd]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[standard]/descriptor[se_e3_tebd]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in the environment matrix calculation without this protection.
- smooth:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_e3_tebd]/smoothWhether to apply smoothing in the calculation when using stripped type embedding: whether to multiply the type-embedding network output (out_jk) by the smooth factor (for both neighbors j and k) to keep the network smooth, instead of setting set_davg_zero to True.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[se_e3_tebd]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[se_e3_tebd]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_e3_tebd]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[se_e3_tebd]/seedRandom seed for parameter initialization
- concat_output_tebd:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_e3_tebd]/concat_output_tebd(Supported Backend: PyTorch) Whether to concatenate the type embedding to the descriptor output.
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_e3_tebd]/use_econf_tebd(Supported Backend: PyTorch) Whether to use electronic configuration type embedding.
- use_tebd_bias:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_e3_tebd]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
When type is set to
se_atten_v2:Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism with new modifications will be used by this descriptor.
- sel:#
- type:
str|int|list[int], optional, default:autoargument path:model[standard]/descriptor[se_atten_v2]/selThis parameter sets the number of selected neighbors. Note that it differs slightly from the same parameter in other descriptors: neighbors are not separated by atom type, and only the total matters. This number strongly affects efficiency, so it should not be too large. Usually 200 or less is enough, far below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the sum of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option automatically determines sel: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies the maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[standard]/descriptor[se_atten_v2]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[standard]/descriptor[se_atten_v2]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed between rcut_smth and rcut.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[standard]/descriptor[se_atten_v2]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[standard]/descriptor[se_atten_v2]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[se_atten_v2]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten_v2]/resnet_dtWhether to use a “Timestep” in the skip connection.
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten_v2]/type_one_sideIf ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘False’.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[se_atten_v2]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten_v2]/trainableWhether the parameters in the embedding net are trainable.
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[se_atten_v2]/seedRandom seed for parameter initialization.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[se_atten_v2]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
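As an illustration of how an `exclude_types` list can be read, the sketch below turns it into a lookup of non-interacting type pairs. It assumes that each pair is unordered (so [0, 1] also excludes 1 with 0), which follows from the wording "no interaction between type 0 and type 1"; the helper names are hypothetical and not part of DeePMD-kit.

```python
def build_excluded_pairs(exclude_types):
    """Store each excluded pair as an unordered set for symmetric lookup."""
    return {frozenset(pair) for pair in exclude_types}


def interacts(type_i, type_j, excluded):
    """Return True unless the (unordered) type pair is excluded."""
    return frozenset((type_i, type_j)) not in excluded


excluded = build_excluded_pairs([[0, 1]])
print(interacts(0, 1, excluded), interacts(0, 0, excluded))
```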
- env_protection:#
- type:
float, optional, default:0.0argument path:model[standard]/descriptor[se_atten_v2]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error during environment matrix calculations without protection.
- attn:#
- type:
int, optional, default:128argument path:model[standard]/descriptor[se_atten_v2]/attnThe length of hidden vectors in attention layers.
- attn_layer:#
- type:
int, optional, default:2argument path:model[standard]/descriptor[se_atten_v2]/attn_layerThe number of attention layers. Note that when tebd_input_mode is ‘strip’, model compression of se_atten works for any attn_layer value with the PyTorch backend; other backends still require attn_layer=0 to compress. When attn_layer!=0, only the type embedding is compressed; the geometric parts are not compressed.
- attn_dotr:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten_v2]/attn_dotrWhether to do dot product with the normalized relative coordinates.
- attn_mask:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten_v2]/attn_maskWhether to mask the diagonal in the attention matrix.
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten_v2]/set_davg_zeroSet the normalization average to zero. This option should be set when the se_atten descriptor or atom_ener in the energy fitting is used.
- trainable_ln:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten_v2]/trainable_lnWhether to use trainable shift and scale weights in layer normalization.
- ln_eps:#
- type:
NoneType|float, optional, default:Noneargument path:model[standard]/descriptor[se_atten_v2]/ln_epsThe epsilon value for layer normalization. The default is 1e-3 in TensorFlow, for consistency with Keras, and 1e-5 in the PyTorch and DP implementations.
- tebd_dim:#
- type:
int, optional, default:8argument path:model[standard]/descriptor[se_atten_v2]/tebd_dim(Supported Backend: PyTorch) Dimension of the atom-type embedding (tebd).
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten_v2]/use_econf_tebd(Supported Backend: PyTorch) Whether to use electronic configuration type embedding. For TensorFlow backend, please set use_econf_tebd in type_embedding block instead.
- use_tebd_bias:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_atten_v2]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
- scaling_factor:#
- type:
float, optional, default:1.0argument path:model[standard]/descriptor[se_atten_v2]/scaling_factor(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K). If temperature is None, the scaling of attention weights is (N_hidden_dim * scaling_factor)**0.5. Else, the scaling of attention weights is set to temperature.
- normalize:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten_v2]/normalize(Supported Backend: PyTorch) Whether to normalize the hidden vectors during attention calculation.
- temperature:#
- type:
float, optionalargument path:model[standard]/descriptor[se_atten_v2]/temperature(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K).
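The interplay between scaling_factor and temperature can be written out as a small function. This is a sketch of the documented rule only: it returns the scaling value and does not show how that value is applied to matmul(Q, K), which is not specified here. The function name `attention_scale` is hypothetical.

```python
def attention_scale(hidden_dim, scaling_factor=1.0, temperature=None):
    """Return the attention-weight scaling described for se_atten_v2."""
    if temperature is None:
        # Documented fallback: (N_hidden_dim * scaling_factor) ** 0.5
        return (hidden_dim * scaling_factor) ** 0.5
    # Otherwise the scaling is set directly to the temperature.
    return temperature


print(attention_scale(128))                   # default attn=128, scaling_factor=1.0
print(attention_scale(128, temperature=0.5))  # temperature overrides the formula
```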
- concat_output_tebd:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_atten_v2]/concat_output_tebd(Supported Backend: PyTorch) Whether to concatenate the type embedding to the descriptor output.
When type is set to
dpa2: (Supported Backend: PyTorch)
- repinit:#
- type:
dictargument path:model[standard]/descriptor[dpa2]/repinitArguments for the repinit block, which builds the initial atom-wise representations before repformer.
- rcut:#
- type:
floatargument path:model[standard]/descriptor[dpa2]/repinit/rcutThe cut-off radius.
- rcut_smth:#
- type:
floatargument path:model[standard]/descriptor[dpa2]/repinit/rcut_smthWhere to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth.
- nsel:#
- type:
int|strargument path:model[standard]/descriptor[dpa2]/repinit/nselMaximally possible number of selected neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- neuron:#
- type:
list, optional, default:[25, 50, 100]argument path:model[standard]/descriptor[dpa2]/repinit/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:16argument path:model[standard]/descriptor[dpa2]/repinit/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- tebd_dim:#
- type:
int, optional, default:8argument path:model[standard]/descriptor[dpa2]/repinit/tebd_dimDimension of the atom-type embedding (tebd).
- tebd_input_mode:#
- type:
str, optional, default:concatargument path:model[standard]/descriptor[dpa2]/repinit/tebd_input_modeHow the atom-type embedding (tebd) is fed into the descriptor. Supported modes are [‘concat’, ‘strip’].
‘concat’: Concatenate the type embedding with the smoothed radial information as the combined input to the embedding network. When type_one_side is False, the input is input_ij = concat([r_ij, tebd_j, tebd_i]). When type_one_side is True, the input is input_ij = concat([r_ij, tebd_j]). The output is out_ij = embedding(input_ij) for the pair-wise representation of atom i with neighbor j.
‘strip’: Use a separate embedding network for the type embedding and combine its output with the radial embedding-network output. When type_one_side is False, the input is input_t = concat([tebd_j, tebd_i]). (Supported Backend: PyTorch) When type_one_side is True, the input is input_t = tebd_j. The output is out_ij = embedding_t(input_t) * embedding_s(r_ij) + embedding_s(r_ij) for the pair-wise representation of atom i with neighbor j.
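The two tebd_input_mode formulas can be compared side by side with toy data. In this sketch, vectors are plain Python lists and `embedding_s` / `embedding_t` are stand-in callables supplied by the caller (hypothetical placeholders for the real embedding networks); only the way inputs and outputs are combined follows the description above.

```python
def concat_input(r_ij, tebd_i, tebd_j, type_one_side=False):
    """'concat' mode: build the single input vector fed to the embedding net."""
    parts = [r_ij, tebd_j] if type_one_side else [r_ij, tebd_j, tebd_i]
    return [x for part in parts for x in part]


def strip_output(r_ij, tebd_i, tebd_j, embedding_s, embedding_t, type_one_side=False):
    """'strip' mode: out_ij = embedding_t(input_t) * embedding_s(r_ij) + embedding_s(r_ij)."""
    input_t = list(tebd_j) if type_one_side else list(tebd_j) + list(tebd_i)
    s = embedding_s(r_ij)     # radial branch
    t = embedding_t(input_t)  # type-embedding branch
    return [t_k * s_k + s_k for t_k, s_k in zip(t, s)]


# Toy stand-ins for the two networks (assumptions, for illustration only).
toy_s = lambda r: [sum(r), 2.0 * sum(r)]
toy_t = lambda t: [float(len(t)), 1.0]

print(concat_input([0.5], [1.0, 0.0], [0.0, 1.0]))
print(strip_output([0.5], [1.0, 0.0], [0.0, 1.0], toy_s, toy_t))
```

In 'strip' mode the type branch does not depend on r_ij, which is why it can be handled separately from the radial branch.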
- set_davg_zero:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repinit/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[dpa2]/repinit/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”.
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa2]/repinit/type_one_sideIf true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of centric atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa2]/repinit/resnet_dtWhether to use a “Timestep” in the skip connection.
- use_three_body:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa2]/repinit/use_three_bodyWhether to concatenate an additional three-body representation to the repinit output descriptor.
- three_body_neuron:#
- type:
list, optional, default:[2, 4, 8]argument path:model[standard]/descriptor[dpa2]/repinit/three_body_neuronNumber of neurons in each hidden layer of the three-body embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- three_body_rcut:#
- type:
float, optional, default:4.0argument path:model[standard]/descriptor[dpa2]/repinit/three_body_rcutThe cut-off radius in the three-body representation.
- three_body_rcut_smth:#
- type:
float, optional, default:0.5argument path:model[standard]/descriptor[dpa2]/repinit/three_body_rcut_smthWhere to start smoothing in the three-body representation. For example the 1/r term is smoothed from three_body_rcut to three_body_rcut_smth.
- three_body_sel:#
- type:
int|str, optional, default:40argument path:model[standard]/descriptor[dpa2]/repinit/three_body_selMaximally possible number of selected neighbors in the three-body representation. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- repformer:#
- type:
dictargument path:model[standard]/descriptor[dpa2]/repformerArguments for the repformer block, which refines the representations produced by repinit.
- rcut:#
- type:
floatargument path:model[standard]/descriptor[dpa2]/repformer/rcutThe cut-off radius.
- rcut_smth:#
- type:
floatargument path:model[standard]/descriptor[dpa2]/repformer/rcut_smthWhere to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth.
- nsel:#
- type:
int|strargument path:model[standard]/descriptor[dpa2]/repformer/nselMaximally possible number of selected neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- nlayers:#
- type:
int, optional, default:3argument path:model[standard]/descriptor[dpa2]/repformer/nlayersNumber of repformer layers.
- g1_dim:#
- type:
int, optional, default:128argument path:model[standard]/descriptor[dpa2]/repformer/g1_dimDimension of the g1 representation, i.e., the rotationally invariant single-atom representation.
- g2_dim:#
- type:
int, optional, default:16argument path:model[standard]/descriptor[dpa2]/repformer/g2_dimDimension of the g2 representation, i.e., the rotationally invariant pair-atom representation.
- axis_neuron:#
- type:
int, optional, default:4argument path:model[standard]/descriptor[dpa2]/repformer/axis_neuronSize of the submatrix used in the symmetrization operations.
- direct_dist:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa2]/repformer/direct_distWhether to use the direct distance as input to the embedding net when building g2, instead of the smoothed 1/r.
- update_g1_has_conv:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/update_g1_has_convWhether to include the convolution term when updating g1.
- update_g1_has_drrd:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/update_g1_has_drrdWhether to include the drrd term when updating g1.
- update_g1_has_grrg:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/update_g1_has_grrgWhether to include the grrg term when updating g1.
- update_g1_has_attn:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/update_g1_has_attnWhether to include localized self-attention when updating g1.
- update_g2_has_g1g1:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/update_g2_has_g1g1Whether to include the g1 x g1 term when updating g2.
- update_g2_has_attn:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/update_g2_has_attnWhether to include gated self-attention when updating g2.
- use_sqrt_nnei:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/use_sqrt_nneiWhether to normalize symmetrization_op by the square root of the number of neighbors instead of by the number of neighbors itself.
- g1_out_conv:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/g1_out_convWhether to keep the convolutional update of g1 as a separate branch outside the concatenated MLP update.
- g1_out_mlp:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/g1_out_mlpWhether to keep the self-MLP update of g1 as a separate branch outside the concatenated MLP update.
- update_h2:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa2]/repformer/update_h2Whether to update the h2 representation, i.e., the rotationally equivariant pair representation.
- attn1_nhead:#
- type:
int, optional, default:4argument path:model[standard]/descriptor[dpa2]/repformer/attn1_nheadNumber of heads in the localized self-attention used to update g1.
- attn2_nhead:#
- type:
int, optional, default:4argument path:model[standard]/descriptor[dpa2]/repformer/attn2_nheadNumber of heads in the gated self-attention used to update g2.
- attn2_has_gate:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa2]/repformer/attn2_has_gateWhether to use gating in the gated self-attention used to update g2.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[dpa2]/repformer/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”.
- update_style:#
- type:
str, optional, default:res_avgargument path:model[standard]/descriptor[dpa2]/repformer/update_styleStyle to update a representation. Supported options are:
‘res_avg’: Updates a rep u with: u = 1/sqrt(n+1) (u + u_1 + u_2 + … + u_n)
‘res_incr’: Updates a rep u with: u = u + 1/sqrt(n) (u_1 + u_2 + … + u_n)
‘res_residual’: Updates a rep u with: u = u + (r_1*u_1 + r_2*u_2 + … + r_n*u_n), where r_1, r_2, …, r_n are residual weights defined by update_residual and update_residual_init.
- update_residual:#
- type:
float, optional, default:0.001argument path:model[standard]/descriptor[dpa2]/repformer/update_residualThe initial standard deviation of the residual vector weights, used when updating in residual mode.
- update_residual_init:#
- type:
str, optional, default:normargument path:model[standard]/descriptor[dpa2]/repformer/update_residual_initThe initialization mode of the residual vector weights, used when updating in residual mode. Supported modes are: [‘norm’, ‘const’].
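The three update_style formulas can be compared numerically with a short sketch. For simplicity it treats the representation u and each update u_k as plain floats and takes the residual weights as an explicit list; in the model these are tensors and the weights are initialized according to update_residual and update_residual_init. The function name `update_rep` is hypothetical.

```python
import math


def update_rep(u, updates, style="res_avg", residuals=None):
    """Combine a representation u with n update terms using the chosen style."""
    n = len(updates)
    if style == "res_avg":
        # u = 1/sqrt(n+1) * (u + u_1 + ... + u_n)
        return (u + sum(updates)) / math.sqrt(n + 1)
    if style == "res_incr":
        # u = u + 1/sqrt(n) * (u_1 + ... + u_n)
        return u + sum(updates) / math.sqrt(n)
    if style == "res_residual":
        # u = u + (r_1*u_1 + ... + r_n*u_n)
        if residuals is None or len(residuals) != n:
            raise ValueError("res_residual needs one residual weight per update")
        return u + sum(r * x for r, x in zip(residuals, updates))
    raise ValueError(f"unknown update_style: {style!r}")


print(update_rep(1.0, [1.0, 1.0, 1.0], "res_avg"))
print(update_rep(1.0, [1.0, 1.0, 1.0, 1.0], "res_incr"))
print(update_rep(1.0, [2.0, 4.0], "res_residual", residuals=[0.5, 0.25]))
```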
- set_davg_zero:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used.
- trainable_ln:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/repformer/trainable_lnWhether to use trainable shift and scale weights in layer normalization.
- ln_eps:#
- type:
NoneType|float, optional, default:Noneargument path:model[standard]/descriptor[dpa2]/repformer/ln_epsThe epsilon value for layer normalization. The default is 1e-3 in TensorFlow, for consistency with Keras, and 1e-5 in the PyTorch and DP implementations.
- concat_output_tebd:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/concat_output_tebdWhether to concatenate the type embedding to the descriptor output.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[dpa2]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- smooth:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/smoothWhether to use smoothness in processes such as attention weights calculation.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[dpa2]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[standard]/descriptor[dpa2]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error during environment matrix calculations without protection.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa2]/trainableWhether the parameters in the embedding net are trainable.
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[dpa2]/seedRandom seed for parameter initialization.
- add_tebd_to_repinit_out:#
- type:
bool, optional, default:False, alias: repformer_add_type_ebd_to_seqargument path:model[standard]/descriptor[dpa2]/add_tebd_to_repinit_outWhether to add the type embedding to the output of repinit before passing it to repformer.
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa2]/use_econf_tebd(Supported Backend: PyTorch) Whether to use an electronic-configuration-based type embedding.
- use_tebd_bias:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa2]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
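To show how the dpa2 argument paths above map onto an input file, here is a minimal sketch that assembles a descriptor block as a Python dict and prints it as JSON. The numeric values are illustrative assumptions, not recommendations; only the key names and the nesting come from the parameter list. The `REQUIRED` table reflects the arguments listed above without an "optional" marker.

```python
import json

# Keys listed without "optional" in the dpa2 section above.
REQUIRED = {"repinit": ("rcut", "rcut_smth", "nsel"),
            "repformer": ("rcut", "rcut_smth", "nsel")}

descriptor = {
    "type": "dpa2",
    "repinit": {"rcut": 6.0, "rcut_smth": 0.5, "nsel": 120},     # illustrative values
    "repformer": {"rcut": 4.0, "rcut_smth": 3.5, "nsel": 40,     # illustrative values
                  "nlayers": 3, "update_style": "res_avg"},      # documented defaults
    "concat_output_tebd": True,
}

# Collect any required key that the block forgot to set.
missing = [f"{block}/{key}" for block, keys in REQUIRED.items()
           for key in keys if key not in descriptor[block]]

print(json.dumps({"model": {"descriptor": descriptor}}, indent=2))
print("missing required keys:", missing)
```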
When type is set to
dpa3: (Supported Backend: PyTorch)
- repflow:#
- type:
dictargument path:model[standard]/descriptor[dpa3]/repflowArguments for the repflow block, which updates node, edge, and angle representations in DPA3.
- n_dim:#
- type:
int, optional, default:128argument path:model[standard]/descriptor[dpa3]/repflow/n_dimDimension of the node (atom-wise) representation.
- e_dim:#
- type:
int, optional, default:64argument path:model[standard]/descriptor[dpa3]/repflow/e_dimDimension of the edge (pair-wise) representation.
- a_dim:#
- type:
int, optional, default:64argument path:model[standard]/descriptor[dpa3]/repflow/a_dimDimension of the angle (three-body/angular) representation.
- nlayers:#
- type:
int, optional, default:6argument path:model[standard]/descriptor[dpa3]/repflow/nlayersNumber of repflow layers.
- e_rcut:#
- type:
floatargument path:model[standard]/descriptor[dpa3]/repflow/e_rcutThe edge cut-off radius.
- e_rcut_smth:#
- type:
floatargument path:model[standard]/descriptor[dpa3]/repflow/e_rcut_smthWhere to start smoothing for edges. For example, the 1/r term is smoothed from e_rcut to e_rcut_smth.
- e_sel:#
- type:
int|strargument path:model[standard]/descriptor[dpa3]/repflow/e_selMaximally possible number of selected edge neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- a_rcut:#
- type:
floatargument path:model[standard]/descriptor[dpa3]/repflow/a_rcutThe angle cut-off radius.
- a_rcut_smth:#
- type:
floatargument path:model[standard]/descriptor[dpa3]/repflow/a_rcut_smthWhere to start smoothing for angles. For example, the 1/r term is smoothed from a_rcut to a_rcut_smth.
- a_sel:#
- type:
int|strargument path:model[standard]/descriptor[dpa3]/repflow/a_selMaximally possible number of selected angle neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- a_compress_rate:#
- type:
int, optional, default:0argument path:model[standard]/descriptor[dpa3]/repflow/a_compress_rateThe compression rate for angular messages. The default value is 0, indicating no compression. If a non-zero integer c is provided, the node and edge dimensions will be compressed to a_dim/c and a_dim/2c, respectively, within the angular message.
- a_compress_e_rate:#
- type:
int, optional, default:1argument path:model[standard]/descriptor[dpa3]/repflow/a_compress_e_rateThe extra compression rate for edge in angular message compression. The default value is 1. When using angular message compression with a_compress_rate c and a_compress_e_rate c_e, the edge dimension will be compressed to (c_e * a_dim / 2c) within the angular message.
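The compressed dimensions implied by a_compress_rate and a_compress_e_rate can be computed directly. This sketch just evaluates the two documented expressions; the use of integer division and the function name `angular_message_dims` are assumptions made for illustration.

```python
def angular_message_dims(a_dim, a_compress_rate=0, a_compress_e_rate=1):
    """Return (node_dim, edge_dim) inside the angular message, or None if uncompressed."""
    c, c_e = a_compress_rate, a_compress_e_rate
    if c == 0:
        return None  # documented: 0 means no compression
    node_dim = a_dim // c                # a_dim / c
    edge_dim = c_e * a_dim // (2 * c)    # c_e * a_dim / 2c
    return node_dim, edge_dim


print(angular_message_dims(64, a_compress_rate=2))
print(angular_message_dims(64, a_compress_rate=2, a_compress_e_rate=2))
```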
- a_compress_use_split:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa3]/repflow/a_compress_use_splitWhether to split first sub-vectors instead of linear mapping during angular message compression. The default value is False.
- n_multi_edge_message:#
- type:
int, optional, default:1argument path:model[standard]/descriptor[dpa3]/repflow/n_multi_edge_messageNumber of heads in the multi-edge-message update of node features. Default is 1, i.e., a single edge-message head.
- axis_neuron:#
- type:
int, optional, default:4argument path:model[standard]/descriptor[dpa3]/repflow/axis_neuronSize of the submatrix used in the symmetrization operations.
- fix_stat_std:#
- type:
float, optional, default:0.3argument path:model[standard]/descriptor[dpa3]/repflow/fix_stat_stdIf non-zero (default is 0.3), use this constant as the normalization standard deviation instead of computing it from data statistics.
- skip_stat:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa3]/repflow/skip_stat(Deprecated, kept only for compatibility.) This parameter is obsolete and will be removed. If set to True, it forces fix_stat_std=0.3 for backward compatibility. Use the fix_stat_std parameter instead.
- update_angle:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa3]/repflow/update_angleWhether to update the angle representation. If False, only the node and edge representations are updated.
- update_style:#
- type:
str, optional, default:res_residualargument path:model[standard]/descriptor[dpa3]/repflow/update_styleStyle to update a representation. Supported options are:
‘res_avg’: Updates a rep u with: u = 1/sqrt(n+1) (u + u_1 + u_2 + … + u_n)
‘res_incr’: Updates a rep u with: u = u + 1/sqrt(n) (u_1 + u_2 + … + u_n)
‘res_residual’: Updates a rep u with: u = u + (r_1*u_1 + r_2*u_2 + … + r_n*u_n), where r_1, r_2, …, r_n are residual weights defined by update_residual and update_residual_init.
- update_residual:#
- type:
float, optional, default:0.1argument path:model[standard]/descriptor[dpa3]/repflow/update_residualThe initial standard deviation of the residual vector weights, used when updating in residual mode.
- update_residual_init:#
- type:
str, optional, default:constargument path:model[standard]/descriptor[dpa3]/repflow/update_residual_initThe initialization mode of the residual vector weights, used when updating in residual mode. Supported modes are: [‘norm’, ‘const’].
- optim_update:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa3]/repflow/optim_updateWhether to enable the optimized update method. Uses a more efficient implementation when enabled. Default is True.
- smooth_edge_update:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa3]/repflow/smooth_edge_updateWhether to make edge update smooth. If True, the edge update from angle message will not use self as padding.
- edge_init_use_dist:#
- type:
bool, optional, default:False, alias: edge_use_distargument path:model[standard]/descriptor[dpa3]/repflow/edge_init_use_distWhether to use direct distance r to initialize the edge features instead of 1/r. Note that when using this option, the activation function will not be used when initializing edge features.
- use_exp_switch:#
- type:
bool, optional, default:False, alias: use_env_envelopeargument path:model[standard]/descriptor[dpa3]/repflow/use_exp_switchWhether to use an exponential switch function instead of a polynomial one in the neighbor update. The exponential switch function ensures neighbor contributions smoothly diminish as the interatomic distance r approaches the cutoff radius rcut. Specifically, the function is defined as: s(r) = exp(-exp(20 * (r - rcut_smth) / rcut_smth)) for 0 < r ≤ rcut, and s(r) = 0 for r > rcut. Here, rcut_smth is an adjustable smoothing factor and should be chosen carefully according to rcut, ensuring s(r) approaches zero smoothly at the cutoff. Typical recommended values are rcut_smth = 5.3 for rcut = 6.0, and 3.5 for rcut = 4.0.
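The exponential switch can be evaluated directly to see why rcut_smth has to be chosen with rcut in mind. The following is a plain transcription of the formula above into Python, not the library's implementation; the function name `exp_switch` is hypothetical.

```python
import math


def exp_switch(r, rcut, rcut_smth):
    """s(r) = exp(-exp(20 * (r - rcut_smth) / rcut_smth)) for 0 < r <= rcut, else 0."""
    if r <= 0:
        raise ValueError("the switch is defined for r > 0")
    if r > rcut:
        return 0.0
    return math.exp(-math.exp(20.0 * (r - rcut_smth) / rcut_smth))


# Recommended pairing from the text: rcut_smth = 5.3 for rcut = 6.0.
for r in (1.0, 5.3, 6.0, 6.5):
    print(r, exp_switch(r, rcut=6.0, rcut_smth=5.3))
```

At r = rcut_smth the value is exp(-1), and with the recommended pairings the value at r = rcut is already very small, so the jump to zero beyond the cutoff is negligible.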
- use_dynamic_sel:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa3]/repflow/use_dynamic_selWhether to dynamically select neighbors within the cutoff radius. If True, the exact number of neighbors within the cutoff radius is used without padding to a fixed selection number. When enabled, users can safely set larger values for e_sel or a_sel (e.g., 1200 or 300, respectively) to guarantee capturing all neighbors within the cutoff radius. Note that when dynamic selection is used, smooth_edge_update must be True.
- sel_reduce_factor:#
- type:
float, optional, default:10.0argument path:model[standard]/descriptor[dpa3]/repflow/sel_reduce_factorReduction factor applied to neighbor-scale normalization when use_dynamic_sel is True. In the dynamic selection case, neighbor-scale normalization will use e_sel / sel_reduce_factor or a_sel / sel_reduce_factor instead of the raw e_sel or a_sel values, accommodating larger selection numbers.
- sequential_update:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa3]/repflow/sequential_updateWhether to use sequential update mode within each repflow layer. When True, updates are applied sequentially: edge self → angle self (using updated edge) → edge angle (using updated angle) → node (using final edge), instead of the default parallel mode where all updates use original embeddings. Currently only supports update_style=’res_residual’ and requires update_angle=True.
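Several repflow options constrain each other, so a quick consistency check before training can catch mistakes. The sketch below encodes only the constraints stated above (dynamic selection needs smooth_edge_update, and sequential_update needs update_style ‘res_residual’ and update_angle=True) together with the sel_reduce_factor rule; the helper names are hypothetical and the fallback values are the documented defaults.

```python
def check_repflow(repflow):
    """Return a list of violated constraints for a repflow dict (empty if none)."""
    problems = []
    if repflow.get("use_dynamic_sel", False) and not repflow.get("smooth_edge_update", False):
        problems.append("use_dynamic_sel=True requires smooth_edge_update=True")
    if repflow.get("sequential_update", False):
        if repflow.get("update_style", "res_residual") != "res_residual":
            problems.append("sequential_update only supports update_style='res_residual'")
        if not repflow.get("update_angle", True):
            problems.append("sequential_update requires update_angle=True")
    return problems


def neighbor_norm(sel, repflow):
    """Neighbor-scale normalization: sel / sel_reduce_factor under dynamic selection."""
    if repflow.get("use_dynamic_sel", False):
        return sel / repflow.get("sel_reduce_factor", 10.0)
    return sel


cfg = {"use_dynamic_sel": True, "smooth_edge_update": True, "e_sel": 1200, "a_sel": 300}
print(check_repflow(cfg), neighbor_norm(cfg["e_sel"], cfg), neighbor_norm(cfg["a_sel"], cfg))
```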
- concat_output_tebd:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa3]/concat_output_tebdWhether to concatenate the type embedding to the descriptor output.
- add_chg_spin_ebd:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa3]/add_chg_spin_ebdWhether to add charge and spin embedding to the descriptor. When enabled, the dedicated charge_spin input (shape [nframes, 2], [charge, spin]) is embedded and added to the type embedding. When charge_spin is missing in the input data, default_chg_spin is used as a fallback if provided.
- default_chg_spin:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[standard]/descriptor[dpa3]/default_chg_spinDefault charge and spin values used as fallback when charge_spin is not provided in the input data. Must be a list of length 2 [charge, spin]. Only used when add_chg_spin_ebd is True.
- activation_function:#
- type:
str, optional, default:siluargument path:model[standard]/descriptor[dpa3]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[dpa3]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[dpa3]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[standard]/descriptor[dpa3]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error during environment matrix calculations without protection.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa3]/trainableWhether the parameters in the embedding net are trainable.
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[dpa3]/seedRandom seed for parameter initialization.
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa3]/use_econf_tebd(Supported Backend: PyTorch) Whether to use an electronic-configuration-based type embedding.
- use_tebd_bias:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[dpa3]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
- use_loc_mapping:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[dpa3]/use_loc_mappingWhether to use local atom index mapping in training or non-parallel inference. When True, local indexing and mapping are applied to neighbor lists and embeddings during descriptor computation.
When type is set to
se_a_ebd_v2 (or its alias se_a_tpe_v2): (Supported Backend: TensorFlow)
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[standard]/descriptor[se_a_ebd_v2]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option automatically determines sel: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
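The “auto:factor” rule can be sketched in a few lines. This is only an illustration of the arithmetic described above: the function name auto_sel and the neighbor counts are made up for the example, and the real neighbor statistics come from the training data.

```python
import math


def auto_sel(max_neighbors, factor=1.1):
    """Scale per-type maximal neighbor counts and round up to a multiple of 4.

    Hypothetical helper that mirrors the described rule; it is not
    DeePMD-kit's own implementation.
    """
    if factor <= 1:
        raise ValueError("factor should be larger than 1")
    sel = []
    for count in max_neighbors:
        scaled = count * factor
        # Round up to the next multiple of 4.
        sel.append(math.ceil(scaled / 4) * 4)
    return sel


# Assumed example: at most 46 type-0 and 92 type-1 neighbors within rcut.
example_sel = auto_sel([46, 92])
print(example_sel, sum(example_sel))
```

The sum of the resulting list is the total sel value that the list[int] form says must stay below 4096 in a GPU environment.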
- rcut:#
- type:
float, optional, default:6.0argument path:model[standard]/descriptor[se_a_ebd_v2]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[standard]/descriptor[se_a_ebd_v2]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed from rcut_smth to rcut.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[standard]/descriptor[se_a_ebd_v2]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
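To make the skip-connection rule concrete, the sketch below marks which hidden layers would receive one for a given neuron list. The helper name and the assumed input width of 1 are illustrative choices, not taken from the source.

```python
def skip_connections(neuron, n_input=1):
    """Flag each hidden layer that is as wide as, or twice as wide as, the previous one."""
    flags = []
    previous = n_input  # assumption: width of the input fed to the first layer
    for width in neuron:
        flags.append(width == previous or width == 2 * previous)
        previous = width
    return flags


print(skip_connections([10, 20, 40]))
```

With the default [10, 20, 40], the second and third layers double the previous width, so under this rule both would get a skip connection.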
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[standard]/descriptor[se_a_ebd_v2]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[se_a_ebd_v2]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_a_ebd_v2]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_a_ebd_v2]/type_one_sideIf true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of central atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
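As a quick arithmetic check of the two cases, the hypothetical helper below counts the sets of embedding network parameters for a given number of atom types.

```python
def n_embedding_sets(n_types, type_one_side):
    """Number of embedding-net parameter sets implied by type_one_side."""
    # One set per neighbor type, or one per (central, neighbor) type pair.
    return n_types if type_one_side else n_types ** 2


for flag in (True, False):
    print(flag, n_embedding_sets(3, flag))
```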
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[se_a_ebd_v2]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_a_ebd_v2]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[se_a_ebd_v2]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[se_a_ebd_v2]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[standard]/descriptor[se_a_ebd_v2]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division-by-zero error in these calculations without protection.
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_a_ebd_v2]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
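Putting the se_a_ebd_v2 keys together, a descriptor block might look like the sketch below. It is built as a Python dictionary and serialized to JSON; the values are the documented defaults, except seed, which is an arbitrary number chosen for the example.

```python
import json

descriptor = {
    "type": "se_a_ebd_v2",
    "sel": "auto",
    "rcut": 6.0,
    "rcut_smth": 0.5,
    "neuron": [10, 20, 40],
    "axis_neuron": 4,
    "activation_function": "tanh",
    "resnet_dt": False,
    "type_one_side": False,
    "precision": "default",
    "trainable": True,
    "exclude_types": [],
    "set_davg_zero": False,
    "seed": 1,  # arbitrary value chosen for this example
}

# Sanity check: smoothing should start before the cut-off radius.
assert descriptor["rcut_smth"] < descriptor["rcut"]

descriptor_json = json.dumps({"model": {"descriptor": descriptor}}, indent=2)
print(descriptor_json)
```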
When type is set to
se_a_mask: (Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. It can accept a variable number of atoms in a frame (non-PBC system). aparam is required as an indicator matrix marking each input atom as real or virtual.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[standard]/descriptor[se_a_mask]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option automatically determines sel: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[standard]/descriptor[se_a_mask]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[standard]/descriptor[se_a_mask]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/descriptor[se_a_mask]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_a_mask]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[standard]/descriptor[se_a_mask]/type_one_sideIf true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of central atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[standard]/descriptor[se_a_mask]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/descriptor[se_a_mask]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[standard]/descriptor[se_a_mask]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/descriptor[se_a_mask]/seedRandom seed for parameter initialization
- fitting_net:#
- type:
dictargument path:model[standard]/fitting_netThe fitting of physical properties.
Depending on the value of type, different sub args are accepted.
- type:#
- type:
str(flag key), default:enerargument path:model[standard]/fitting_net/typeThe type of the fitting.
ener: Fit an energy model (potential energy surface).
dpa4_ener: (Supported Backend: PyTorch) Fit an energy model (potential energy surface).
dos: Fit a density of states model. The total density of states / site-projected density of states labels should be provided by dos.npy or atom_dos.npy in each data system. The file has a number of frames (rows) and a number of energy-grid columns (multiplied by the number of atoms in atom_dos.npy). See loss parameter.
property: (Supported Backend: PyTorch)
polar: Fit an atomic polarizability model. Global polarizability labels or atomic polarizability labels for all selected atoms (see sel_type) should be provided by polarizability.npy in each data system. The file should have shape (n_frames, 9*n_selected) for atomic polarizability labels, or shape (n_frames, 9) for global polarizability labels. See loss parameter.
dipole: Fit an atomic dipole model. Global dipole labels or atomic dipole labels for all selected atoms (see sel_type) should be provided by dipole.npy in each data system. The file should have shape (n_frames, 3*n_selected) for atomic dipole labels, or shape (n_frames, 3) for global dipole labels. See loss parameter.
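The label shapes quoted for the polar and dipole fittings can be written down as a small helper, which may be handy when checking a data system before training. The function name is hypothetical; only the shapes come from the descriptions above.

```python
def expected_label_shape(kind, n_frames, n_selected=None):
    """Expected shape of dipole.npy or polarizability.npy.

    Pass n_selected for atomic labels; leave it as None for global labels.
    """
    components = {"dipole": 3, "polar": 9}[kind]
    if n_selected is None:
        return (n_frames, components)
    return (n_frames, components * n_selected)


# Assumed example: 100 frames with 64 selected atoms.
print(expected_label_shape("dipole", 100, n_selected=64))
print(expected_label_shape("polar", 100))
```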
When type is set to
ener: Fit an energy model (potential energy surface).
- numb_fparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[ener]/numb_fparamThe dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[ener]/numb_aparamThe dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[standard]/fitting_net[ener]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[ener]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120], alias: n_neuronargument path:model[standard]/fitting_net[ener]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/fitting_net[ener]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/fitting_net[ener]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[standard]/fitting_net[ener]/resnet_dtWhether to use a “Timestep” in the skip connection
- trainable:#
- type:
list[bool]|bool, optional, default:Trueargument path:model[standard]/fitting_net[ener]/trainableWhether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool (Supported Backend: TensorFlow): Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
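The two accepted forms of trainable can be normalized into one flag per layer, which also makes the len(neuron)+1 requirement explicit. The helper below is a sketch with a made-up name, not part of DeePMD-kit.

```python
def expand_trainable(trainable, neuron):
    """Return one trainable flag per layer (hidden layers plus the output layer)."""
    n_layers = len(neuron) + 1
    if isinstance(trainable, bool):
        # A single bool applies to every layer.
        return [trainable] * n_layers
    flags = list(trainable)
    if len(flags) != n_layers:
        raise ValueError(f"expected {n_layers} flags, got {len(flags)}")
    return flags


# Freeze the first two hidden layers of a [120, 120, 120] fitting net.
layer_flags = expand_trainable([False, False, True, True], [120, 120, 120])
print(layer_flags)
```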
- rcond:#
- type:
NoneType|float, optional, default:Noneargument path:model[standard]/fitting_net[ener]/rcondThe condition number used to determine the initial energy shift for each type of atoms. See rcond in
numpy.linalg.lstsq() for more details.
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/fitting_net[ener]/seedRandom seed for parameter initialization of the fitting net
- atom_ener:#
- type:
list[float | None], optional, default:[]argument path:model[standard]/fitting_net[ener]/atom_enerSpecify the atomic energy in vacuum for each type
- layer_name:#
- type:
list[str], optionalargument path:model[standard]/fitting_net[ener]/layer_nameThe name of each layer. The length of this list should be equal to n_neuron + 1. If two layers, either in the same fitting or different fittings, have the same name, they will share the same neural network parameters. The shape of these layers should be the same. If null is given for a layer, parameters will not be shared.
- use_aparam_as_mask:#
- type:
bool, optional, default:Falseargument path:model[standard]/fitting_net[ener]/use_aparam_as_maskWhether to use the aparam as a mask in input. If True, the aparam will not be used in the fitting net for embedding. When the descriptor is se_a_mask, the aparam is used as a mask to indicate whether each input atom is real or virtual, and use_aparam_as_mask should be set to True.
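The pairing of the se_a_mask descriptor with use_aparam_as_mask can be sketched as a small configuration plus a consistency check. The check function is hypothetical, and numb_aparam set to 1 (one real/virtual flag per atom) is an assumption made for this example.

```python
import json

model = {
    "descriptor": {"type": "se_a_mask", "sel": "auto", "neuron": [10, 20, 40]},
    "fitting_net": {
        "type": "ener",
        "neuron": [120, 120, 120],
        "numb_aparam": 1,  # assumption: one real/virtual flag per atom
        "use_aparam_as_mask": True,
    },
}


def check_mask_settings(model):
    """Raise if se_a_mask is used without use_aparam_as_mask (illustrative check)."""
    uses_mask_descriptor = model["descriptor"]["type"] == "se_a_mask"
    mask_enabled = model["fitting_net"].get("use_aparam_as_mask", False)
    if uses_mask_descriptor and not mask_enabled:
        raise ValueError("se_a_mask expects use_aparam_as_mask to be true")
    return True


check_mask_settings(model)
print(json.dumps(model, indent=2))
```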
When type is set to
dpa4_ener (or its alias sezm_ener): (Supported Backend: PyTorch) Fit an energy model (potential energy surface).
- numb_fparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[dpa4_ener]/numb_fparamDimension of frame parameters. If set to >0, each data system should provide fparam.npy.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[dpa4_ener]/numb_aparamDimension of atomic parameters. If set to >0, each data system should provide aparam.npy.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[standard]/fitting_net[dpa4_ener]/default_fparam(Supported Backend: PyTorch) Default frame parameters used when a data system does not provide fparam.npy.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[dpa4_ener]/dim_case_embd(Supported Backend: PyTorch) Dimension of the case embedding. For multitask training or fine-tuning with case embeddings, set this value to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[0], alias: n_neuronargument path:model[standard]/fitting_net[dpa4_ener]/neuronThe number of neurons in each hidden layer of the fitting net. Use 0 as an auto-width placeholder resolved from the descriptor width.
- activation_function:#
- type:
str, optional, default:siluargument path:model[standard]/fitting_net[dpa4_ener]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- precision:#
- type:
str, optional, default:float32argument path:model[standard]/fitting_net[dpa4_ener]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[standard]/fitting_net[dpa4_ener]/resnet_dtWhether to use a “Timestep” in the skip connection
- trainable:#
- type:
list[bool]|bool, optional, default:Trueargument path:model[standard]/fitting_net[dpa4_ener]/trainableWhether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool (Supported Backend: TensorFlow): Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
- rcond:#
- type:
NoneType|float, optional, default:Noneargument path:model[standard]/fitting_net[dpa4_ener]/rcondThe condition number used to determine the initial energy shift for each type of atoms. See rcond in
numpy.linalg.lstsq() for more details.
- seed:#
- type:
int|NoneType, optional, default:Noneargument path:model[standard]/fitting_net[dpa4_ener]/seedRandom seed for parameter initialization of the fitting net
- atom_ener:#
- type:
list[float | None], optional, default:[]argument path:model[standard]/fitting_net[dpa4_ener]/atom_enerSpecify the atomic energy in vacuum for each type
- layer_name:#
- type:
list[str], optionalargument path:model[standard]/fitting_net[dpa4_ener]/layer_nameThe name of each layer. The length of this list should be equal to n_neuron + 1. If two layers, either in the same fitting or different fittings, have the same name, they will share the same neural network parameters. The shape of these layers should be the same. If null is given for a layer, parameters will not be shared.
- use_aparam_as_mask:#
- type:
bool, optional, default:Falseargument path:model[standard]/fitting_net[dpa4_ener]/use_aparam_as_maskWhether to use the aparam as a mask in input. If True, the aparam will not be used in the fitting net for embedding. When the descriptor is se_a_mask, the aparam is used as a mask to indicate whether each input atom is real or virtual, and use_aparam_as_mask should be set to True.
- case_film_embd:#
- type:
bool, optional, default:Falseargument path:model[standard]/fitting_net[dpa4_ener]/case_film_embd(Supported Backend: PyTorch) Whether to use case FiLM conditioning for shared DPA4/SeZM fitting. When enabled, the case embedding modulates fitting features instead of being concatenated to the fitting input.
When type is set to
dos: Fit a density of states model. The total density of states / site-projected density of states labels should be provided by dos.npy or atom_dos.npy in each data system. The file has a number of frames (rows) and a number of energy-grid columns (multiplied by the number of atoms in atom_dos.npy). See loss parameter.
- numb_fparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[dos]/numb_fparamThe dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[dos]/numb_aparamThe dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[standard]/fitting_net[dos]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[dos]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120]argument path:model[standard]/fitting_net[dos]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/fitting_net[dos]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- precision:#
- type:
str, optional, default:float64argument path:model[standard]/fitting_net[dos]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[standard]/fitting_net[dos]/resnet_dtWhether to use a “Timestep” in the skip connection
- trainable:#
- type:
list[bool]|bool, optional, default:Trueargument path:model[standard]/fitting_net[dos]/trainableWhether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool: Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
- rcond:#
- type:
NoneType|float, optional, default:Noneargument path:model[standard]/fitting_net[dos]/rcondThe condition number used to determine the initial energy shift for each type of atoms. See rcond in
numpy.linalg.lstsq() for more details.
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/fitting_net[dos]/seedRandom seed for parameter initialization of the fitting net
- numb_dos:#
- type:
int, optional, default:300argument path:model[standard]/fitting_net[dos]/numb_dosThe number of gridpoints on which the DOS is evaluated (NEDOS in VASP)
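For the dos fitting, numb_dos fixes the number of energy-grid columns in the label files. The sketch below pairs a minimal fitting_net block with the label shapes implied by the description; the helper name and the atom count are illustrative assumptions.

```python
import json

fitting_net = {
    "type": "dos",
    "numb_dos": 300,
    "neuron": [120, 120, 120],
    "resnet_dt": True,
    "precision": "float64",
}


def dos_label_shape(n_frames, numb_dos, n_atoms=None):
    """Shape of dos.npy (n_atoms is None) or atom_dos.npy (n_atoms given)."""
    if n_atoms is None:
        return (n_frames, numb_dos)
    # Site-projected labels: one block of numb_dos columns per atom.
    return (n_frames, n_atoms * numb_dos)


print(json.dumps(fitting_net, indent=2))
print(dos_label_shape(50, fitting_net["numb_dos"]))
print(dos_label_shape(50, fitting_net["numb_dos"], n_atoms=8))
```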
When type is set to
property: (Supported Backend: PyTorch)
- numb_fparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[property]/numb_fparamThe dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[property]/numb_aparamThe dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[standard]/fitting_net[property]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[property]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120], alias: n_neuronargument path:model[standard]/fitting_net[property]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/fitting_net[property]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[standard]/fitting_net[property]/resnet_dtWhether to use a “Timestep” in the skip connection
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/fitting_net[property]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/fitting_net[property]/seedRandom seed for parameter initialization of the fitting net
- task_dim:#
- type:
int, optional, default:1argument path:model[standard]/fitting_net[property]/task_dimThe dimension of outputs of fitting net
- intensive:#
- type:
bool, optional, default:Falseargument path:model[standard]/fitting_net[property]/intensiveWhether the fitting property is intensive
- distinguish_types:#
- type:
bool, optional, default:Trueargument path:model[standard]/fitting_net[property]/distinguish_typesWhether to distinguish atom types when computing output statistics.
- property_name:#
- type:
strargument path:model[standard]/fitting_net[property]/property_nameThe name of the fitting property, which should be consistent with the property name in the dataset.
- trainable:#
- type:
list[bool]|bool, optional, default:Trueargument path:model[standard]/fitting_net[property]/trainableWhether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool: Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
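Unlike most keys in the property fitting, property_name carries no default, so a configuration has to supply it. The sketch below shows a minimal block and a hypothetical check for that key; the property name band_gap is invented for the example and would need to match the name used in the dataset.

```python
import json

fitting_net = {
    "type": "property",
    "property_name": "band_gap",  # hypothetical name; must match the dataset
    "task_dim": 1,
    "intensive": True,
    "neuron": [120, 120, 120],
    "resnet_dt": True,
}


def missing_required(config, required=("property_name",)):
    """Return the required keys that are absent from a fitting_net block."""
    return [key for key in required if key not in config]


print(missing_required(fitting_net))
print(json.dumps(fitting_net, indent=2))
```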
When type is set to
polar: Fit an atomic polarizability model. Global polarizability labels or atomic polarizability labels for all selected atoms (see sel_type) should be provided by polarizability.npy in each data system. The file should have shape (n_frames, 9*n_selected) for atomic polarizability labels, or shape (n_frames, 9) for global polarizability labels. See loss parameter.
- numb_fparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[polar]/numb_fparam(Supported Backend: PyTorch) The dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[polar]/numb_aparam(Supported Backend: PyTorch) The dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[standard]/fitting_net[polar]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[polar]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120], alias: n_neuronargument path:model[standard]/fitting_net[polar]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/fitting_net[polar]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[standard]/fitting_net[polar]/resnet_dtWhether to use a “Timestep” in the skip connection
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/fitting_net[polar]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- fit_diag:#
- type:
bool, optional, default:Trueargument path:model[standard]/fitting_net[polar]/fit_diagFit the diagonal part of the rotational invariant polarizability matrix, which will be converted to normal polarizability matrix by contracting with the rotation matrix.
- scale:#
- type:
list[float]|float, optional, default:1.0argument path:model[standard]/fitting_net[polar]/scaleThe output of the fitting net (polarizability matrix) will be scaled by scale.
- shift_diag:#
- type:
bool, optional, default:Trueargument path:model[standard]/fitting_net[polar]/shift_diagWhether to shift the diagonal of polar, which is beneficial to training. Default is true.
- sel_type:#
- type:
int|NoneType|list[int], optional, alias: pol_typeargument path:model[standard]/fitting_net[polar]/sel_typeThe atom types for which the atomic polarizability will be provided. If not set, all types will be selected. (Supported Backend: TensorFlow)
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/fitting_net[polar]/seedRandom seed for parameter initialization of the fitting net
When type is set to
dipole: Fit an atomic dipole model. Global dipole labels or atomic dipole labels for all selected atoms (see sel_type) should be provided by dipole.npy in each data system. The file should have shape (n_frames, 3*n_selected) for atomic dipole labels, or shape (n_frames, 3) for global dipole labels. See loss parameter.
- numb_fparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[dipole]/numb_fparam(Supported Backend: PyTorch) The dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[dipole]/numb_aparam(Supported Backend: PyTorch) The dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[standard]/fitting_net[dipole]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[standard]/fitting_net[dipole]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120], alias: n_neuronargument path:model[standard]/fitting_net[dipole]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[standard]/fitting_net[dipole]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[standard]/fitting_net[dipole]/resnet_dtWhether to use a “Timestep” in the skip connection
- precision:#
- type:
str, optional, default:defaultargument path:model[standard]/fitting_net[dipole]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- sel_type:#
- type:
int|NoneType|list[int], optional, alias: dipole_typeargument path:model[standard]/fitting_net[dipole]/sel_typeThe atom types for which the atomic dipole will be provided. If not set, all types will be selected. (Supported Backend: TensorFlow)
- seed:#
- type:
int|NoneType, optionalargument path:model[standard]/fitting_net[dipole]/seedRandom seed for parameter initialization of the fitting net
- model_branch_alias:#
- type:
list[str], optional, default:[]argument path:model[standard]/model_branch_alias(Supported Backend: PyTorch) List of aliases for this model branch. Multiple aliases can be defined, and any alias can reference this branch throughout the model usage. Used only in multi-task models.
- info:#
- type:
dict, optional, default:{}argument path:model[standard]/info(Supported Backend: PyTorch) Dictionary of metadata for this model or model branch. Store arbitrary key-value pairs with model- or branch-specific information. Used in both single- and multi-task models.
When type is set to
dpa4 (or its aliases DPA4, SeZM, sezm): (Supported Backend: PyTorch) DPA4/SeZM model scaffold with fixed SeZM descriptor and fitting types.
- descriptor:#
- type:
dictargument path:model[dpa4]/descriptor(Supported Backend: PyTorch) Descriptor configuration for atomic environments. DPA4/SeZM uses the SeZM descriptor.
Depending on the value of type, different sub args are accepted.
- type:#
- type:
str(flag key)argument path:model[dpa4]/descriptor/typepossible choices:loc_frame,se_e2_a,dpa4,se_e3,se_a_tpe,se_e2_r,hybrid,se_atten,se_e3_tebd,se_atten_v2,dpa2,dpa3,se_a_ebd_v2,se_a_maskThe type of the descriptor.
loc_frame: (Supported Backend: TensorFlow) Defines a local frame at each atom, and computes the descriptor as local coordinates under this frame.
se_e2_a: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor.
dpa4: (Supported Backend: PyTorch) DPA4/SeZM descriptor implemented as the SeZM (Smooth Equivariant Zone-bridging Model) architecture.
se_e3: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Three-body embedding will be used by this descriptor.
se_a_tpe: (Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Type embedding will be used by this descriptor.
se_e2_r: Used by the smooth edition of Deep Potential. Only the distance between atoms is used to construct the descriptor.
hybrid: Concatenation of a list of descriptors into a new descriptor.
se_atten: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism will be used by this descriptor.
se_e3_tebd: (Supported Backend: PyTorch)
se_atten_v2: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Attention mechanism with new modifications will be used by this descriptor.
dpa2: (Supported Backend: PyTorch)
dpa3: (Supported Backend: PyTorch)
se_a_ebd_v2: (Supported Backend: TensorFlow)
se_a_mask: (Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. It can accept a variable number of atoms in a frame (Non-PBC system). aparam is required as an indicator matrix for the real/virtual sign of input atoms.
When type is set to
loc_frame: (Supported Backend: TensorFlow) Defines a local frame at each atom, and computes the descriptor as local coordinates under this frame.
- sel_a:#
- type:
list[int]argument path:model[dpa4]/descriptor[loc_frame]/sel_aA list of integers. The length of the list should be the same as the number of atom types in the system. sel_a[i] gives the selected number of type-i neighbors. The full relative coordinates of the neighbors are used by the descriptor.
- sel_r:#
- type:
list[int]argument path:model[dpa4]/descriptor[loc_frame]/sel_rA list of integers. The length of the list should be the same as the number of atom types in the system. sel_r[i] gives the selected number of type-i neighbors. Only the relative distances of the neighbors are used by the descriptor. sel_a[i] + sel_r[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius.
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[loc_frame]/rcutThe cut-off radius. The default value is 6.0
- axis_rule:#
- type:
list[int]argument path:model[dpa4]/descriptor[loc_frame]/axis_ruleA list of integers. The length should be 6 times the number of types.
axis_rule[i*6+0]: class of the atom defining the first axis of type-i atom. 0 for neighbors with full coordinates and 1 for neighbors only with relative distance.
axis_rule[i*6+1]: type of the atom defining the first axis of type-i atom.
axis_rule[i*6+2]: index of the axis atom defining the first axis. Note that the neighbors with the same class and type are sorted according to their relative distance.
axis_rule[i*6+3]: class of the atom defining the second axis of type-i atom. 0 for neighbors with full coordinates and 1 for neighbors only with relative distance.
axis_rule[i*6+4]: type of the atom defining the second axis of type-i atom.
axis_rule[i*6+5]: index of the axis atom defining the second axis. Note that the neighbors with the same class and type are sorted according to their relative distance.
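The six-entries-per-type layout of axis_rule can be unpacked as in the sketch below. The helper name decode_axis_rule, the AxisSpec tuple, and the two-type example list are illustrative assumptions and not part of the DeePMD-kit API; only the index layout comes from the description above.

```python
from typing import NamedTuple


class AxisSpec(NamedTuple):
    neighbor_class: int  # 0: neighbor with full coordinates, 1: relative distance only
    neighbor_type: int   # atom type of the neighbor that defines the axis
    neighbor_index: int  # rank among same-class, same-type neighbors sorted by distance


def decode_axis_rule(axis_rule: list[int], ntypes: int) -> list[tuple[AxisSpec, AxisSpec]]:
    """Split a flat axis_rule list into (first axis, second axis) per atom type."""
    if len(axis_rule) != 6 * ntypes:
        raise ValueError("axis_rule must contain 6 integers per atom type")
    decoded = []
    for i in range(ntypes):
        chunk = axis_rule[6 * i : 6 * i + 6]
        decoded.append((AxisSpec(*chunk[:3]), AxisSpec(*chunk[3:])))
    return decoded


# Hypothetical two-type example, chosen only to exercise the layout.
example_rule = [0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 0]
rules = decode_axis_rule(example_rule, ntypes=2)
```

Here rules[0] holds the two axis definitions for type-0 atoms and rules[1] those for type-1 atoms.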
When type is set to
se_e2_a (or its alias se_a): Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[dpa4]/descriptor[se_e2_a]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
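The "auto" rule described above can be sketched as follows. The function names parse_sel_factor and auto_sel, and the neighbor counts in the example, are assumptions made for illustration; the actual neighbor statistics are computed by the package from the training data, and its exact rounding may differ in edge cases.

```python
import math


def parse_sel_factor(sel: str) -> float:
    """Return the factor encoded in "auto" or "auto:factor"."""
    if sel == "auto":
        return 1.1  # "auto" is documented as equivalent to "auto:1.1"
    prefix, _, factor = sel.partition(":")
    if prefix != "auto" or not factor:
        raise ValueError(f"unsupported sel string: {sel!r}")
    return float(factor)


def auto_sel(max_neighbors: list[int], sel: str = "auto") -> list[int]:
    """Scale the per-type maximum neighbor counts and round up to a multiple of 4."""
    factor = parse_sel_factor(sel)
    return [4 * math.ceil(n * factor / 4) for n in max_neighbors]


# Hypothetical per-type maxima observed within the cutoff radius.
selected = auto_sel([46, 92])
```

With these hypothetical maxima, 46 * 1.1 = 50.6 is rounded up to 52 and 92 * 1.1 = 101.2 is rounded up to 104.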
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[se_e2_a]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[dpa4]/descriptor[se_e2_a]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed from rcut_smth to rcut.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[dpa4]/descriptor[se_e2_a]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
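As a rough illustration of the skip-connection rule for consecutive hidden layers, the sketch below classifies each layer transition. The function name and the labels "same", "double", and "none" are hypothetical; how the first hidden layer relates to the network input is not covered here.

```python
def skip_connection_kinds(neuron: list[int]) -> list[str]:
    """Classify each transition between consecutive hidden layers."""
    kinds = []
    for prev, cur in zip(neuron, neuron[1:]):
        if cur == prev:
            kinds.append("same")    # equal width: skip connection is built
        elif cur == 2 * prev:
            kinds.append("double")  # doubled width: skip connection is built
        else:
            kinds.append("none")    # no skip connection for this transition
    return kinds


default_kinds = skip_connection_kinds([10, 20, 40])
```

For the default [10, 20, 40], both transitions double the width, so both qualify for a skip connection under the stated rule.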
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[dpa4]/descriptor[se_e2_a]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[se_e2_a]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e2_a]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e2_a]/type_one_sideIf true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of centric atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[se_e2_a]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_e2_a]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[se_e2_a]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[se_e2_a]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
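One way to picture exclude_types is as a type-pair mask, as in the sketch below. The helper build_pair_mask is hypothetical, and treating each excluded pair as symmetric is an assumption of this sketch rather than something stated above.

```python
def build_pair_mask(ntypes: int, exclude_types: list[list[int]]) -> list[list[bool]]:
    """Return mask[i][j] == False for type pairs that should not interact."""
    mask = [[True] * ntypes for _ in range(ntypes)]
    for ti, tj in exclude_types:
        mask[ti][tj] = False
        mask[tj][ti] = False  # assumption: exclusion applies in both directions
    return mask


pair_mask = build_pair_mask(2, [[0, 1]])
```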
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[se_e2_a]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division by zero in the environment matrix calculation without this protection.
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e2_a]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
When type is set to
dpa4 (or its aliases DPA4, SeZM, sezm): (Supported Backend: PyTorch) DPA4/SeZM descriptor implemented as the SeZM (Smooth Equivariant Zone-bridging Model) architecture.
- sel:#
- type:
str|int|list[int], optional, default:autoargument path:model[dpa4]/descriptor[dpa4]/selThe maximum number of neighbors. It can be:
int: the total maximum number of neighbors within rcut (all types combined)
list[int]: sel[i] specifies the maximum number of type-i neighbors within rcut
str: Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail, it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiplies the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[dpa4]/rcutThe cut-off radius.
- env_exp:#
- type:
list[int], optional, default:[7, 5]argument path:model[dpa4]/descriptor[dpa4]/env_expC^3 cutoff envelope exponents [rbf_env_exp, edge_env_exp]. rbf_env_exp controls radial basis function envelope decay; edge_env_exp controls message passing edge weight envelope decay. Larger values give weaker suppression.
- channels:#
- type:
int, optional, default:64argument path:model[dpa4]/descriptor[dpa4]/channelsTotal channels per (l,m) coefficient.
- basis_type:#
- type:
str, optional, default:besselargument path:model[dpa4]/descriptor[dpa4]/basis_typeRadial basis type. Supported values are bessel and gaussian.
- n_radial:#
- type:
int, optional, default:16argument path:model[dpa4]/descriptor[dpa4]/n_radialNumber of radial basis functions.
- radial_mlp:#
- type:
list[int], optional, default:[0]argument path:model[dpa4]/descriptor[dpa4]/radial_mlpHidden layer sizes for radial networks. An output layer of size (l_schedule[0]+1)*channels will be automatically appended. Use 0 as a placeholder to be replaced by channels.
- use_env_seed:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa4]/use_env_seed(Supported Backend: PyTorch) If True, seed the initial node state with local-environment information: apply environment matrix FiLM conditioning on l=0 features using 4D [s, s*r_hat] representation, and enable the non-scalar geometric initial embedding when l_schedule[0] > 0. If False, the initial state contains only atom-local scalar features before message passing. Internal dimensions are derived from channels: embed_dim=min(channels, 128), axis_dim=min(4 if embed_dim < 64 else 8, embed_dim-1), type_dim=clamp(channels//4, 8, 32), rbf_out_dim=max(32, embed_dim-2*type_dim), hidden_dim=min(256, max(2*embed_dim, rbf_out_dim+2*type_dim)).
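The internal dimensions listed above follow directly from channels. The sketch below evaluates those formulas; the function name env_seed_dims is hypothetical, and clamp(x, lo, hi) is assumed to mean max(lo, min(x, hi)).

```python
def env_seed_dims(channels: int) -> dict[str, int]:
    """Evaluate the documented internal dimensions for a given channel count."""
    embed_dim = min(channels, 128)
    axis_dim = min(4 if embed_dim < 64 else 8, embed_dim - 1)
    type_dim = max(8, min(channels // 4, 32))  # clamp(channels // 4, 8, 32)
    rbf_out_dim = max(32, embed_dim - 2 * type_dim)
    hidden_dim = min(256, max(2 * embed_dim, rbf_out_dim + 2 * type_dim))
    return {
        "embed_dim": embed_dim,
        "axis_dim": axis_dim,
        "type_dim": type_dim,
        "rbf_out_dim": rbf_out_dim,
        "hidden_dim": hidden_dim,
    }


default_dims = env_seed_dims(64)  # channels defaults to 64
```

For the default channels=64 this gives embed_dim=64, axis_dim=8, type_dim=16, rbf_out_dim=32, and hidden_dim=128.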
- random_gamma:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa4]/random_gamma(Supported Backend: PyTorch) If True, apply a random roll about the edge-aligned local +Z axis before building Wigner-D blocks. The roll is sampled independently per edge and per forward call.
- lmax:#
- type:
int, optional, default:3argument path:model[dpa4]/descriptor[dpa4]/lmaxMaximum degree, only used when l_schedule is None.
- l_schedule:#
- type:
NoneType|list[int], optional, default:Noneargument path:model[dpa4]/descriptor[dpa4]/l_schedulePyramid schedule of lmax per block, e.g. [3, 3, 2]. Must be non-increasing. If set, lmax and n_blocks will be ignored.
- mmax:#
- type:
int|NoneType, optional, default:1argument path:model[dpa4]/descriptor[dpa4]/mmaxMaximum SO(2) order (|m|), only used when m_schedule is None. If None, defaults to the per-block lmax.
- m_schedule:#
- type:
NoneType|list[int], optional, default:Noneargument path:model[dpa4]/descriptor[dpa4]/m_scheduleSchedule of mmax per block. Must have the same length as l_schedule and satisfy m_schedule[i] <= l_schedule[i]. If set, mmax will be ignored.
- n_blocks:#
- type:
int, optional, default:3argument path:model[dpa4]/descriptor[dpa4]/n_blocksNumber of blocks (only used when l_schedule is None).
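The interplay of lmax, l_schedule, mmax, m_schedule, and n_blocks can be sketched as below. The function resolve_schedules is a hypothetical helper, not the package's implementation. In particular, how a scalar mmax larger than a block's lmax is handled is not stated above; this sketch simply rejects it.

```python
from typing import Optional


def resolve_schedules(
    lmax: int = 3,
    n_blocks: int = 3,
    l_schedule: Optional[list[int]] = None,
    mmax: Optional[int] = 1,
    m_schedule: Optional[list[int]] = None,
) -> tuple[list[int], list[int]]:
    """Return the per-block (l_schedule, m_schedule) implied by the options."""
    # l_schedule, when given, overrides lmax and n_blocks.
    l_sched = [lmax] * n_blocks if l_schedule is None else list(l_schedule)
    if any(b > a for a, b in zip(l_sched, l_sched[1:])):
        raise ValueError("l_schedule must be non-increasing")

    if m_schedule is not None:
        m_sched = list(m_schedule)   # m_schedule overrides mmax
    elif mmax is None:
        m_sched = list(l_sched)      # mmax=None defaults to the per-block lmax
    else:
        m_sched = [mmax] * len(l_sched)

    if len(m_sched) != len(l_sched):
        raise ValueError("m_schedule must have the same length as l_schedule")
    if any(m > l for m, l in zip(m_sched, l_sched)):
        raise ValueError("m_schedule[i] must not exceed l_schedule[i]")
    return l_sched, m_sched


defaults = resolve_schedules()
pyramid = resolve_schedules(l_schedule=[3, 3, 2], mmax=None)
```

With the documented defaults the result is three blocks with lmax 3 and mmax 1; the pyramid example uses the per-block lmax as mmax.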
- so2_norm:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa4]/so2_normIf True, apply intermediate ReducedEquivariantRMSNorm between SO(2) mixing layers. When False (default), no normalization is applied between layers.
- so2_layers:#
- type:
int, optional, default:4argument path:model[dpa4]/descriptor[dpa4]/so2_layersNumber of SO(2) mixing layers per block.
- so2_attn_res:#
- type:
str, optional, default:noneargument path:model[dpa4]/descriptor[dpa4]/so2_attn_res(Supported Backend: PyTorch) Depth-wise attention residual mode across the internal SO(2) layer history inside each interaction block. Must be one of none, independent, or dependent.
- radial_so2_mode:#
- type:
str, optional, default:degree_channelargument path:model[dpa4]/descriptor[dpa4]/radial_so2_mode(Supported Backend: PyTorch) Dynamic radial degree mixer mode inside SO(2) convolution. none applies elementwise radial modulation. degree uses an edge-conditioned cross-degree kernel W[l_in,l_out,|m|](r) shared by all channels. degree_channel uses W[l_in,l_out,|m|,c](r), optionally low-rank when radial_so2_rank > 0.
- radial_so2_rank:#
- type:
int, optional, default:1argument path:model[dpa4]/descriptor[dpa4]/radial_so2_rank(Supported Backend: PyTorch) Low-rank channel factorization rank for radial_so2_mode=degree_channel. 0 uses the full per-channel dynamic degree kernel.
- n_focus:#
- type:
int, optional, default:1argument path:model[dpa4]/descriptor[dpa4]/n_focusNumber of parallel focus streams used only inside the SO(2) convolution.
- focus_dim:#
- type:
int, optional, default:0argument path:model[dpa4]/descriptor[dpa4]/focus_dimHidden width per focus stream inside the SO(2) convolution. 0 means using channels.
- n_atten_head:#
- type:
int, optional, default:1argument path:model[dpa4]/descriptor[dpa4]/n_atten_headNumber of attention heads when aggregating messages in SO(2) convolution. 0 applies a plain envelope-weighted scatter-sum. When >0, the attention width must be divisible by n_atten_head, and envelope-gated grouped softmax attention with output-side head gate is applied. Attention uses w**2 * exp(logit) in the numerator and zeta + sum(w**2 * exp(logit)) in the denominator.
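The attention normalization quoted above can be written out numerically for a single head and a single receiving atom. This is a minimal sketch: the function name is hypothetical, and the value of zeta used in the example is an arbitrary assumption because its definition is not given here.

```python
import math


def envelope_attention_weights(
    logits: list[float], envelopes: list[float], zeta: float
) -> list[float]:
    """Compute w**2 * exp(logit) / (zeta + sum(w**2 * exp(logit))) per neighbor."""
    numerators = [w**2 * math.exp(s) for s, w in zip(logits, envelopes)]
    denominator = zeta + sum(numerators)
    return [n / denominator for n in numerators]


# Two neighbors with equal logits; the second sits at the cutoff (envelope 0).
weights = envelope_attention_weights([0.0, 0.0], [1.0, 0.0], zeta=1.0)
```

A neighbor whose envelope weight is zero receives zero attention, and a positive zeta keeps the weights from summing to one, unlike a plain softmax.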
- atten_f_mix:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa4]/atten_f_mix(Supported Backend: PyTorch) If True, merge all SO(2) focus streams into one attention stream after rotate-back. Attention heads split n_focus * focus_dim (or n_focus * channels when focus_dim=0) instead of each focus stream independently. The default False preserves per-focus attention.
- atten_v_proj:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa4]/atten_v_proj(Supported Backend: PyTorch) If True, apply an explicit degree-aware value projection inside SO(2) attention. The default False keeps the raw rotated message as the attention value.
- atten_o_proj:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa4]/atten_o_proj(Supported Backend: PyTorch) If True, apply an explicit degree-aware output projection after the SO(2) attention output gate. The default False keeps the legacy output path without this projection.
- ffn_neurons:#
- type:
int, optional, default:0argument path:model[dpa4]/descriptor[dpa4]/ffn_neuronsHidden width for block FFNs and the final scalar output FFN. >0 uses the same explicit width for both. 0 lets each path resolve its own width from channels: 4 * channels without GLU, (8 / 3) * channels with GLU, then round up to a multiple of 32.
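The width rule for ffn_neurons=0 can be checked with the sketch below. The function name resolve_ffn_width is hypothetical; exact rational arithmetic is used here so that the (8 / 3) * channels case is not affected by floating-point rounding.

```python
import math
from fractions import Fraction


def resolve_ffn_width(channels: int, ffn_neurons: int = 0, glu: bool = True) -> int:
    """Return the FFN hidden width implied by ffn_neurons, channels, and GLU use."""
    if ffn_neurons > 0:
        return ffn_neurons  # explicit width is used as-is
    raw = Fraction(8 * channels, 3) if glu else Fraction(4 * channels)
    return 32 * math.ceil(raw / 32)  # round up to a multiple of 32


width_glu = resolve_ffn_width(64, glu=True)
width_plain = resolve_ffn_width(64, glu=False)
```

For channels=64, the GLU path resolves 170.67 up to 192, while the non-GLU path gives 256.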
- grid_mlp:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa4]/grid_mlp(Supported Backend: PyTorch) If True, use the optional grid-MLP structure for the block-internal equivariant FFN. This does not change the final l=0 output head.
- ffn_blocks:#
- type:
int, optional, default:1argument path:model[dpa4]/descriptor[dpa4]/ffn_blocks(Supported Backend: PyTorch) Number of FFN sublayers per interaction block.
- sandwich_norm:#
- type:
list[bool], optional, default:[False, True, True, False]argument path:model[dpa4]/descriptor[dpa4]/sandwich_norm(Supported Backend: PyTorch) Pre/post-norm switches for residual branches. Use [so2_pre, so2_post, ffn_pre, ffn_post] to enable pre-norm before and post-norm after SO(2) and FFN operations.
- mlp_bias:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa4]/mlp_bias(Supported Backend: PyTorch) Whether to use bias in equivariant layers. When False, removes bias from:
SO3Linear: l=0 bias
SO2Linear: l=0 bias
GatedActivation: gate linear bias
DepthAttnRes: input-dependent query projection
EnvironmentInitialEmbedding MLPs: rbf_proj_layer1/2 and g_layer1/2
Attention logit and output-gate parameters in SO(2) convolution are always bias-free.
- layer_scale:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa4]/layer_scale(Supported Backend: PyTorch) If True, apply learnable LayerScale (init 1e-3) on residual branches: SO(2) branch uses per-focus-channel scales (shape (n_focus, focus_dim)) on each SO(2) mixing layer, and FFN branch uses per-channel scales (shape (channels,)) on each FFN residual branch.
- full_attn_res:#
- type:
str, optional, default:noneargument path:model[dpa4]/descriptor[dpa4]/full_attn_res(Supported Backend: PyTorch) Descriptor-level full attention residual mode over the unit history [x0, so2_0, ffn_0_0, ffn_0_1, …, so2_1, ffn_1_0, ffn_1_1, …]. independent uses learned query vectors, while dependent derives the query from the current SeZM state before the SO(2) unit, before each FFN unit, and before the final aggregation. Must be one of none, independent, or dependent. Cannot be enabled together with block_attn_res.
- block_attn_res:#
- type:
str, optional, default:noneargument path:model[dpa4]/descriptor[dpa4]/block_attn_res(Supported Backend: PyTorch) Descriptor-level block attention residual mode over block history [x0, b1, b2, …], where each block summary is the sum of the SO(2) unit output and all FFN unit outputs inside one interaction block. independent uses learned query vectors, while dependent derives queries from the current SeZM state before the SO(2) unit, before each FFN unit, and before the final block aggregation. Must be one of none, independent, or dependent. Cannot be enabled together with full_attn_res.
- s2_activation:#
- type:
list[bool], optional, default:[False, True]argument path:model[dpa4]/descriptor[dpa4]/s2_activation(Supported Backend: PyTorch) Two booleans [so2_enabled, ffn_enabled]. so2_enabled=true makes the SO(2) gated activation path use activation_function="silu". ffn_enabled=true makes the block-internal FFN path use activation_function="silu" and glu_activation=true. S2-grid resolutions are resolved automatically per block. The e3nn SO(2) grid is [2 * mmax + 4, ceil_even(3 * lmax + 2)], and the e3nn FFN grid is lifted to [max(R_phi, R_theta), max(R_phi, R_theta)]. Lebedev branches use the smallest packaged rule with precision at least 3 * lmax. The final scalar output FFN is unchanged.
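The e3nn grid sizes quoted above can be evaluated as in this sketch. The function names are hypothetical, and ceil_even is assumed to round up to the nearest even integer. Which entry of the pair is R_phi and which is R_theta is not stated, so the sketch keeps the documented order without naming them.

```python
def ceil_even(n: int) -> int:
    """Round n up to the nearest even integer (assumed meaning of ceil_even)."""
    return n + (n % 2)


def e3nn_s2_grids(lmax: int, mmax: int) -> tuple[list[int], list[int]]:
    """Return (SO(2) grid, FFN grid) resolutions for one block."""
    so2_grid = [2 * mmax + 4, ceil_even(3 * lmax + 2)]
    side = max(so2_grid)
    ffn_grid = [side, side]  # FFN grid is lifted to a square grid
    return so2_grid, ffn_grid


so2_grid, ffn_grid = e3nn_s2_grids(lmax=3, mmax=1)
```

For the default lmax=3 and mmax=1, the SO(2) grid is [6, 12] and the FFN grid is [12, 12].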
- lebedev_quadrature:#
- type:
bool|list[bool], optional, default:Trueargument path:model[dpa4]/descriptor[dpa4]/lebedev_quadrature(Supported Backend: PyTorch) Either one boolean applied to both S2 branches, or two booleans [so2_enabled, ffn_enabled] aligned with s2_activation. If a branch is enabled here, its S2 projector uses packaged Lebedev quadrature rules instead of the e3nn product grid. The default keeps the existing e3nn behavior.
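Because lebedev_quadrature accepts either one boolean or a pair, it can help to normalize it to the two-entry form before reading it. The helper below is a hypothetical sketch of that expansion, not the package's parser.

```python
from typing import Union


def normalize_lebedev(value: Union[bool, list[bool]]) -> list[bool]:
    """Expand lebedev_quadrature to [so2_enabled, ffn_enabled]."""
    if isinstance(value, bool):
        return [value, value]  # a single boolean applies to both S2 branches
    flags = list(value)
    if len(flags) != 2 or not all(isinstance(f, bool) for f in flags):
        raise ValueError("expected one boolean or a list of two booleans")
    return flags


both = normalize_lebedev(True)
ffn_only = normalize_lebedev([False, True])
```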
- activation_function:#
- type:
str, optional, default:siluargument path:model[dpa4]/descriptor[dpa4]/activation_functionBase activation function for helper MLPs, the SO(2) gated activation path, and the final scalar output FFN. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. It is overridden to “silu” only on paths whose s2_activation switch is enabled.
- glu_activation:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa4]/glu_activation(Supported Backend: PyTorch) Base GLU switch for FFN (e.g., silu -> swiglu, gelu -> geglu). The block-internal FFN overrides this to true when s2_activation[1]=true, while the final scalar output FFN keeps the user-provided value.
- use_amp:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa4]/use_ampIf True, use automatic mixed precision (AMP) with bfloat16 on CUDA during training. This can improve speed and reduce memory usage. Enabling this option is recommended on GPUs with native bfloat16 support. Disable it on GPUs without native bfloat16 support to avoid runtime errors or additional conversion overhead.
- add_chg_spin_ebd:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa4]/add_chg_spin_ebd(Supported Backend: PyTorch) Whether to add frame-level charge and spin conditions to the descriptor type embedding.
- default_chg_spin:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[dpa4]/descriptor[dpa4]/default_chg_spin(Supported Backend: PyTorch) Default frame-level charge and spin conditions [charge, spin]. This option is used only when add_chg_spin_ebd is enabled. If set, the value is used when explicit charge_spin data are not provided, including during .pt2 inference.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[dpa4]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1. When the SeZM descriptor is used inside a full SeZM model config, prefer the model-level pair_exclude_types; if both fields are provided, they must match.
- precision:#
- type:
str, optional, default:float32argument path:model[dpa4]/descriptor[dpa4]/precisionThe precision of the descriptor parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”.
- eps:#
- type:
float, optional, default:1e-07argument path:model[dpa4]/descriptor[dpa4]/eps(Supported Backend: PyTorch) Small epsilon for numerical stability in division and normalization.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa4]/trainableIf the parameters in the descriptor are trainable.
- seed:#
- type:
int|NoneType, optional, default:Noneargument path:model[dpa4]/descriptor[dpa4]/seedRandom seed for parameter initialization.
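As a minimal sketch, a dpa4 descriptor section can be assembled as a Python dictionary and serialized to JSON. The keys and values below are the documented defaults from the entries above; the check_attn_res helper is hypothetical and only mirrors the stated rule that full_attn_res and block_attn_res cannot both be enabled.

```python
import json

ATTN_RES_MODES = {"none", "independent", "dependent"}

# Documented defaults for a subset of the dpa4 descriptor arguments.
descriptor = {
    "type": "dpa4",
    "sel": "auto",
    "rcut": 6.0,
    "channels": 64,
    "n_radial": 16,
    "lmax": 3,
    "mmax": 1,
    "n_blocks": 3,
    "full_attn_res": "none",
    "block_attn_res": "none",
    "precision": "float32",
}


def check_attn_res(desc: dict) -> None:
    """Reject invalid modes and the full/block combination (hypothetical check)."""
    full = desc.get("full_attn_res", "none")
    block = desc.get("block_attn_res", "none")
    for name, mode in (("full_attn_res", full), ("block_attn_res", block)):
        if mode not in ATTN_RES_MODES:
            raise ValueError(f"{name} must be one of {sorted(ATTN_RES_MODES)}")
    if full != "none" and block != "none":
        raise ValueError("full_attn_res and block_attn_res cannot both be enabled")


check_attn_res(descriptor)
model_fragment = json.dumps({"model": {"type": "dpa4", "descriptor": descriptor}}, indent=2)
```

The resulting model_fragment string is only the model part of an input file; the remaining sections of the input are not shown.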
When type is set to
se_e3 (or its aliases se_at, se_a_3be, se_t): Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Three-body embedding will be used by this descriptor.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[dpa4]/descriptor[se_e3]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[se_e3]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[dpa4]/descriptor[se_e3]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed from rcut_smth to rcut.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[dpa4]/descriptor[se_e3]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[se_e3]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e3]/resnet_dtWhether to use a “Timestep” in the skip connection
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[se_e3]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_e3]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[se_e3]/seedRandom seed for parameter initialization
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e3]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[se_e3]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[se_e3]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division by zero in the environment matrix calculation without this protection.
When type is set to
se_a_tpe (or its alias se_a_ebd): (Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. Type embedding will be used by this descriptor.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[dpa4]/descriptor[se_a_tpe]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[se_a_tpe]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[dpa4]/descriptor[se_a_tpe]/rcut_smthWhere to start smoothing. For example, the 1/r term is smoothed from rcut_smth to rcut.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[dpa4]/descriptor[se_a_tpe]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[dpa4]/descriptor[se_a_tpe]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[se_a_tpe]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_a_tpe]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_a_tpe]/type_one_sideIf true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of centric atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[se_a_tpe]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_a_tpe]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[se_a_tpe]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[se_a_tpe]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[se_a_tpe]/env_protection(Supported Backend: PyTorch) Protection parameter that prevents division-by-zero errors during environment matrix calculations. For example, padded neighbors may have zero distance, which would cause a division by zero without this protection.
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_a_tpe]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
- type_nchanl:#
- type:
int, optional, default:4argument path:model[dpa4]/descriptor[se_a_tpe]/type_nchanlNumber of channels for type embedding.
- type_nlayer:#
- type:
int, optional, default:2argument path:model[dpa4]/descriptor[se_a_tpe]/type_nlayerNumber of hidden layers of the type embedding net.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[dpa4]/descriptor[se_a_tpe]/numb_aparamDimension of the atomic parameter. If set to a value > 0, the atomic parameters are embedded.
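The neuron entries in this section state that a skip connection is built when two consecutive hidden layers have the same size or when a layer is twice as large as the previous one. The following is a minimal sketch of that rule as a plain check. The function names skip_kind and skip_plan are hypothetical and are not part of DeePMD-kit; how the skip connection itself is wired is not shown.

```python
def skip_kind(prev_width: int, width: int) -> str:
    """Classify the connection between two consecutive hidden layers."""
    if width == prev_width:
        return "same"      # equal widths: a skip connection is built
    if width == 2 * prev_width:
        return "doubled"   # twice as wide: a skip connection is built
    return "none"          # any other ratio: no skip connection


def skip_plan(neuron: list[int]) -> list[str]:
    """Apply the rule to every pair of consecutive entries in `neuron`."""
    return [skip_kind(a, b) for a, b in zip(neuron, neuron[1:])]


print(skip_plan([10, 20, 40]))  # the documented default of `neuron`
print(skip_plan([25, 25, 60]))  # illustrative values only
```

With the default [10, 20, 40], both transitions double the width, so both qualify for a skip connection under the stated rule.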
When type is set to
se_e2_r (or its alias se_r): Used by the smooth edition of Deep Potential. Only the distance between atoms is used to construct the descriptor.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[dpa4]/descriptor[se_e2_r]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
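The “auto:factor” rule above can be sketched in a few lines. This is an illustration of the arithmetic only, not the DeePMD-kit implementation: the function name resolve_sel is hypothetical, and the per-type neighbor maxima are passed in by hand here, whereas in practice they would come from neighbor statistics of the training data.

```python
import math


def resolve_sel(sel, max_neighbors):
    """Turn a `sel` setting into one integer per neighbor type.

    `max_neighbors[i]` is the largest number of type-i neighbors found
    within the cutoff radius (supplied by hand: an assumption).
    """
    if isinstance(sel, list):
        return sel                      # explicit list[int]: used as given
    if sel == "auto":
        sel = "auto:1.1"                # documented equivalence
    prefix, _, factor_text = sel.partition(":")
    if prefix != "auto" or not factor_text:
        raise ValueError(f"unsupported sel value: {sel!r}")
    factor = float(factor_text)
    # Multiply each maximum by the factor, then round up to a multiple of 4.
    return [4 * math.ceil(n * factor / 4) for n in max_neighbors]


print(resolve_sel("auto", [46, 92]))
print(resolve_sel("auto:1.5", [46, 92]))
```

For maxima of 46 and 92 neighbors, “auto” gives 50.6 and 101.2 before rounding, which round up to 52 and 104.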
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[se_e2_r]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[dpa4]/descriptor[se_e2_r]/rcut_smthWhere to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[dpa4]/descriptor[se_e2_r]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[se_e2_r]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e2_r]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e2_r]/type_one_sideIf true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of central atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[se_e2_r]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_e2_r]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[se_e2_r]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[se_e2_r]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e2_r]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[se_e2_r]/env_protection(Supported Backend: PyTorch) Protection parameter that prevents division-by-zero errors during environment matrix calculations. For example, padded neighbors may have zero distance, which would cause a division by zero without this protection.
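The rcut and rcut_smth entries in this section only say that the 1/r term is smoothed between the two radii. As a rough illustration, the sketch below assumes the fifth-order polynomial switch given in the DeePMD-kit descriptor documentation; the function name switched_inverse_r is hypothetical, and the exact form used by a given backend should be checked against that documentation.

```python
def switched_inverse_r(r: float, rcut_smth: float = 0.5, rcut: float = 6.0) -> float:
    """Smoothed 1/r term (assumed polynomial switch, see lead-in)."""
    if r < rcut_smth:
        return 1.0 / r                      # untouched below rcut_smth
    if r < rcut:
        # x runs from 0 at rcut_smth to 1 at rcut
        x = (r - rcut_smth) / (rcut - rcut_smth)
        switch = x**3 * (-6.0 * x**2 + 15.0 * x - 10.0) + 1.0
        return switch / r                   # decays smoothly towards rcut
    return 0.0                              # no contribution beyond rcut


for r in (0.25, 3.25, 6.0):
    print(r, switched_inverse_r(r))
```

The switch factor equals 1 at rcut_smth and 0 at rcut, so the term joins the plain 1/r branch and the zero branch continuously.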
When type is set to
hybrid: Concatenation of a list of descriptors into a new descriptor.
- list:#
- type:
listargument path:model[dpa4]/descriptor[hybrid]/listA list of descriptor definitions
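As a sketch of how the list argument might be filled in, the snippet below builds a hybrid descriptor from two se_e2_r definitions with different cut-off radii and prints it as JSON. Only keys documented in this section are used; the values are illustrative assumptions, not recommended settings.

```python
import json

# Each entry is an ordinary descriptor definition (illustrative values).
short_range = {
    "type": "se_e2_r",
    "sel": "auto",
    "rcut": 4.0,
    "rcut_smth": 0.5,
    "neuron": [10, 20, 40],
}
long_range = {
    "type": "se_e2_r",
    "sel": "auto",
    "rcut": 6.0,
    "rcut_smth": 0.5,
    "neuron": [10, 20, 40],
}

# The hybrid descriptor holds the definitions under the "list" key.
descriptor = {"type": "hybrid", "list": [short_range, long_range]}

text = json.dumps(descriptor, indent=2)
print(text)
```

The printed fragment would sit under the descriptor key of the model section in the input file.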
When type is set to
se_atten (or its alias dpa1): Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. An attention mechanism is used by this descriptor.
- sel:#
- type:
str|int|list[int], optional, default:autoargument path:model[dpa4]/descriptor[se_atten]/selThis parameter sets the number of selected neighbors. Unlike in other descriptors, neighbors are not separated by atom type; only the total matters. Because this number strongly affects efficiency, it should not be made too large. Usually 200 or fewer is enough, well below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the summation of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[se_atten]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[dpa4]/descriptor[se_atten]/rcut_smthWhere to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[dpa4]/descriptor[se_atten]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[dpa4]/descriptor[se_atten]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[se_atten]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten]/type_one_sideIf ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘False’.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[se_atten]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[se_atten]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[se_atten]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[se_atten]/env_protection(Supported Backend: PyTorch) Protection parameter that prevents division-by-zero errors during environment matrix calculations. For example, padded neighbors may have zero distance, which would cause a division by zero without this protection.
- attn:#
- type:
int, optional, default:128argument path:model[dpa4]/descriptor[se_atten]/attnThe length of hidden vectors in attention layers
- attn_layer:#
- type:
int, optional, default:2argument path:model[dpa4]/descriptor[se_atten]/attn_layerThe number of attention layers. Note that, when tebd_input_mode is ‘strip’, model compression of se_atten works for any attn_layer value with the PyTorch backend; other backends still need attn_layer = 0 to compress. When attn_layer != 0, only the type embedding is compressed; the geometric parts are not.
- attn_dotr:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten]/attn_dotrWhether to do dot product with the normalized relative coordinates
- attn_mask:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten]/attn_maskWhether to mask the diagonal in the attention matrix
- stripped_type_embedding:#
- type:
bool|NoneType, optional, default:Noneargument path:model[dpa4]/descriptor[se_atten]/stripped_type_embedding(Deprecated, kept only for compatibility.) Whether to strip the type embedding into a separate embedding network. Setting this parameter to True is equivalent to setting tebd_input_mode to ‘strip’. Setting it to False is equivalent to setting tebd_input_mode to ‘concat’. The default value is None, which means the tebd_input_mode setting will be used instead.
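The mapping between the deprecated flag and tebd_input_mode described above can be written as a small helper. This is a sketch of the stated equivalence only: the function name resolve_tebd_input_mode is hypothetical, and letting an explicit True or False take precedence over tebd_input_mode is an assumption read from the description.

```python
from typing import Optional


def resolve_tebd_input_mode(
    stripped_type_embedding: Optional[bool],
    tebd_input_mode: str = "concat",
) -> str:
    """Return the effective type-embedding input mode."""
    if stripped_type_embedding is None:
        # Default: the deprecated flag is unset, so tebd_input_mode applies.
        return tebd_input_mode
    # True is equivalent to 'strip', False to 'concat'.
    return "strip" if stripped_type_embedding else "concat"


print(resolve_tebd_input_mode(None, "strip"))
print(resolve_tebd_input_mode(True))
print(resolve_tebd_input_mode(False))
```

New input files can leave stripped_type_embedding unset and choose the mode with tebd_input_mode directly.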
- smooth_type_embedding:#
- type:
bool, optional, default:False, alias: smooth_type_embddingargument path:model[dpa4]/descriptor[se_atten]/smooth_type_embeddingWhether to apply smoothing in the attention-weight calculation. (Supported Backend: TensorFlow) When using stripped type embedding, whether to multiply the type-embedding network output by the smooth factor to keep the network smooth, instead of setting set_davg_zero to True.
- set_davg_zero:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten]/set_davg_zeroSet the normalization average to zero. This option should be set when se_atten descriptor or atom_ener in the energy fitting is used
- trainable_ln:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten]/trainable_lnWhether to use trainable shift and scale weights in layer normalization.
- ln_eps:#
- type:
NoneType|float, optional, default:Noneargument path:model[dpa4]/descriptor[se_atten]/ln_epsThe epsilon value for layer normalization. The default is 1e-3 for TensorFlow, to stay consistent with Keras, and 1e-5 for the PyTorch and DP implementations.
- tebd_dim:#
- type:
int, optional, default:8argument path:model[dpa4]/descriptor[se_atten]/tebd_dim(Supported Backend: PyTorch) Dimension of the atom-type embedding (tebd).
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten]/use_econf_tebd(Supported Backend: PyTorch) Whether to use electronic configuration type embedding. For TensorFlow backend, please set use_econf_tebd in type_embedding block instead.
- use_tebd_bias:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
- tebd_input_mode:#
- type:
str, optional, default:concatargument path:model[dpa4]/descriptor[se_atten]/tebd_input_modeHow the atom-type embedding (tebd) is fed into the descriptor. Supported modes are [‘concat’, ‘strip’].
‘concat’: Concatenate the type embedding with the smoothed radial information as the combined input to the embedding network. When type_one_side is False, the input is input_ij = concat([r_ij, tebd_j, tebd_i]). When type_one_side is True, the input is input_ij = concat([r_ij, tebd_j]). The output is out_ij = embedding(input_ij) for the pair-wise representation of atom i with neighbor j.
‘strip’: Use a separate embedding network for the type embedding and combine its output with the radial embedding-network output. When type_one_side is False, the input is input_t = concat([tebd_j, tebd_i]). (Supported Backend: PyTorch) When type_one_side is True, the input is input_t = tebd_j. The output is out_ij = embedding_t(input_t) * embedding_s(r_ij) + embedding_s(r_ij) for the pair-wise representation of atom i with neighbor j.
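The two input modes above differ only in where the type embedding enters. The sketch below mirrors the stated formulas with plain Python lists. The function toy_net is a hypothetical stand-in for an embedding network, r_ij is simplified to a single number, and none of this is DeePMD-kit code.

```python
import math


def toy_net(inputs, width=4, shift=0.0):
    """Stand-in for an embedding network: maps a list to `width` values."""
    total = sum(inputs) + shift
    return [math.tanh(total * (k + 1) / width) for k in range(width)]


def concat_mode(r_ij, tebd_i, tebd_j, type_one_side=False):
    # input_ij = concat([r_ij, tebd_j, tebd_i]), or concat([r_ij, tebd_j])
    input_ij = [r_ij] + tebd_j + ([] if type_one_side else tebd_i)
    return toy_net(input_ij)              # out_ij = embedding(input_ij)


def strip_mode(r_ij, tebd_i, tebd_j, type_one_side=False):
    # input_t = concat([tebd_j, tebd_i]), or tebd_j alone
    input_t = tebd_j + ([] if type_one_side else tebd_i)
    emb_t = toy_net(input_t, shift=0.5)   # embedding_t: type network
    emb_s = toy_net([r_ij])               # embedding_s: radial network
    # out_ij = embedding_t(input_t) * embedding_s(r_ij) + embedding_s(r_ij)
    return [t * s + s for t, s in zip(emb_t, emb_s)]


tebd_i, tebd_j = [0.1, 0.2], [0.4, 0.5]
print(concat_mode(0.3, tebd_i, tebd_j))
print(strip_mode(0.3, tebd_i, tebd_j))
```

In ‘strip’ mode the type network does not depend on r_ij, so its output acts as a multiplicative correction on the radial embedding.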
- scaling_factor:#
- type:
float, optional, default:1.0argument path:model[dpa4]/descriptor[se_atten]/scaling_factor(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K). If temperature is None, the scaling of attention weights is (N_hidden_dim * scaling_factor)**0.5. Else, the scaling of attention weights is set to temperature.
- normalize:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten]/normalize(Supported Backend: PyTorch) Whether to normalize the hidden vectors during attention calculation.
- temperature:#
- type:
float, optionalargument path:model[dpa4]/descriptor[se_atten]/temperature(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K).
- concat_output_tebd:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten]/concat_output_tebd(Supported Backend: PyTorch) Whether to concatenate the type embedding to the descriptor output.
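The interplay of scaling_factor and temperature described above can be summarized in a short helper. This is a sketch of the documented rule only: the function name attention_scaling is hypothetical, taking N_hidden_dim to be the attn value is an assumption, and how the result is applied to matmul(Q, K) is not shown.

```python
from typing import Optional


def attention_scaling(
    hidden_dim: int,
    scaling_factor: float = 1.0,
    temperature: Optional[float] = None,
) -> float:
    """Return the attention-weight scaling as described in this section."""
    if temperature is None:
        # Documented rule: (N_hidden_dim * scaling_factor) ** 0.5
        return (hidden_dim * scaling_factor) ** 0.5
    # An explicit temperature replaces the computed value.
    return temperature


print(attention_scaling(128))                   # attn default, no temperature
print(attention_scaling(128, temperature=2.0))  # temperature takes over
```

Setting temperature therefore makes scaling_factor irrelevant under the stated rule.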
When type is set to
se_e3_tebd: (Supported Backend: PyTorch)
- sel:#
- type:
str|int|list[int], optional, default:autoargument path:model[dpa4]/descriptor[se_e3_tebd]/selThis parameter sets the number of selected neighbors. Unlike in other descriptors, neighbors are not separated by atom type; only the total matters. Because this number strongly affects efficiency, it should not be made too large. Usually 200 or fewer is enough, well below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the summation of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[se_e3_tebd]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[dpa4]/descriptor[se_e3_tebd]/rcut_smthWhere to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[dpa4]/descriptor[se_e3_tebd]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- tebd_dim:#
- type:
int, optional, default:8argument path:model[dpa4]/descriptor[se_e3_tebd]/tebd_dim(Supported Backend: PyTorch) Dimension of the atom-type embedding (tebd).
- tebd_input_mode:#
- type:
str, optional, default:concatargument path:model[dpa4]/descriptor[se_e3_tebd]/tebd_input_modeHow the atom-type embedding (tebd) is fed into the descriptor. Supported modes are [‘concat’, ‘strip’].
‘concat’: Concatenate the type embedding with the smoothed angular information as the combined input to the embedding network. The input is input_jk = concat([angle_jk, tebd_j, tebd_k]). The output is out_jk = embedding(input_jk) for the three-body representation of atom i with neighbors j and k.
‘strip’: Use a separate embedding network for the type embedding and combine its output with the angular embedding-network output. The input is input_t = concat([tebd_j, tebd_k]). The output is out_jk = embedding_t(input_t) * embedding_s(angle_jk) + embedding_s(angle_jk) for the three-body representation of atom i with neighbors j and k.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e3_tebd]/resnet_dtWhether to use a “Timestep” in the skip connection
- set_davg_zero:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_e3_tebd]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[se_e3_tebd]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[se_e3_tebd]/env_protection(Supported Backend: PyTorch) Protection parameter that prevents division-by-zero errors during environment matrix calculations. For example, padded neighbors may have zero distance, which would cause a division by zero without this protection.
- smooth:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_e3_tebd]/smoothWhether to apply smoothing when using stripped type embedding. If enabled, the type-embedding network output (out_jk) is multiplied by the smooth factor of both neighbors j and k to keep the network smooth, instead of setting set_davg_zero to True.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[se_e3_tebd]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[se_e3_tebd]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_e3_tebd]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[se_e3_tebd]/seedRandom seed for parameter initialization
- concat_output_tebd:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_e3_tebd]/concat_output_tebd(Supported Backend: PyTorch) Whether to concatenate the type embedding to the descriptor output.
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_e3_tebd]/use_econf_tebd(Supported Backend: PyTorch) Whether to use electronic configuration type embedding.
- use_tebd_bias:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_e3_tebd]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
When type is set to
se_atten_v2: Used by the smooth edition of Deep Potential. The full relative coordinates are used to construct the descriptor. An attention mechanism with new modifications is used by this descriptor.
- sel:#
- type:
str|int|list[int], optional, default:autoargument path:model[dpa4]/descriptor[se_atten_v2]/selThis parameter sets the number of selected neighbors. Unlike in other descriptors, neighbors are not separated by atom type; only the total matters. Because this number strongly affects efficiency, it should not be made too large. Usually 200 or fewer is enough, well below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the summation of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines sel automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[se_atten_v2]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[dpa4]/descriptor[se_atten_v2]/rcut_smthWhere to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[dpa4]/descriptor[se_atten_v2]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[dpa4]/descriptor[se_atten_v2]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[se_atten_v2]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten_v2]/resnet_dtWhether to use a “Timestep” in the skip connection
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten_v2]/type_one_sideIf ‘False’, type embeddings of both neighbor and central atoms are considered. If ‘True’, only type embeddings of neighbor atoms are considered. Default is ‘False’.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[se_atten_v2]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten_v2]/trainableWhether the parameters in the embedding net are trainable
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[se_atten_v2]/seedRandom seed for parameter initialization
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[se_atten_v2]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[se_atten_v2]/env_protection(Supported Backend: PyTorch) Protection parameter that prevents division-by-zero errors during environment matrix calculations. For example, padded neighbors may have zero distance, which would cause a division by zero without this protection.
- attn:#
- type:
int, optional, default:128argument path:model[dpa4]/descriptor[se_atten_v2]/attnThe length of hidden vectors in attention layers
- attn_layer:#
- type:
int, optional, default:2argument path:model[dpa4]/descriptor[se_atten_v2]/attn_layerThe number of attention layers. Note that, when tebd_input_mode is ‘strip’, model compression of se_atten works for any attn_layer value with the PyTorch backend; other backends still need attn_layer = 0 to compress. When attn_layer != 0, only the type embedding is compressed; the geometric parts are not.
- attn_dotr:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten_v2]/attn_dotrWhether to do dot product with the normalized relative coordinates
- attn_mask:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten_v2]/attn_maskWhether to mask the diagonal in the attention matrix
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten_v2]/set_davg_zeroSet the normalization average to zero. This option should be set when se_atten descriptor or atom_ener in the energy fitting is used
- trainable_ln:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten_v2]/trainable_lnWhether to use trainable shift and scale weights in layer normalization.
- ln_eps:#
- type:
NoneType|float, optional, default:Noneargument path:model[dpa4]/descriptor[se_atten_v2]/ln_epsThe epsilon value for layer normalization. The default is 1e-3 for TensorFlow, to stay consistent with Keras, and 1e-5 for the PyTorch and DP implementations.
- tebd_dim:#
- type:
int, optional, default:8argument path:model[dpa4]/descriptor[se_atten_v2]/tebd_dim(Supported Backend: PyTorch) Dimension of the atom-type embedding (tebd).
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten_v2]/use_econf_tebd(Supported Backend: PyTorch) Whether to use electronic configuration type embedding. For TensorFlow backend, please set use_econf_tebd in type_embedding block instead.
- use_tebd_bias:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_atten_v2]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
- scaling_factor:#
- type:
float, optional, default:1.0argument path:model[dpa4]/descriptor[se_atten_v2]/scaling_factor(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K). If temperature is None, the scaling of attention weights is (N_hidden_dim * scaling_factor)**0.5. Else, the scaling of attention weights is set to temperature.
- normalize:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten_v2]/normalize(Supported Backend: PyTorch) Whether to normalize the hidden vectors during attention calculation.
- temperature:#
- type:
float, optionalargument path:model[dpa4]/descriptor[se_atten_v2]/temperature(Supported Backend: PyTorch) The scaling factor of normalization in calculations of attention weights, which is used to scale the matmul(Q, K).
- concat_output_tebd:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_atten_v2]/concat_output_tebd(Supported Backend: PyTorch) Whether to concatenate the type embedding to the descriptor output.
When type is set to
dpa2: (Supported Backend: PyTorch)
- repinit:#
- type:
dictargument path:model[dpa4]/descriptor[dpa2]/repinitArguments for the repinit block, which builds the initial atom-wise representations before repformer.
- rcut:#
- type:
floatargument path:model[dpa4]/descriptor[dpa2]/repinit/rcutThe cut-off radius.
- rcut_smth:#
- type:
floatargument path:model[dpa4]/descriptor[dpa2]/repinit/rcut_smthWhere to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth.
- nsel:#
- type:
int|strargument path:model[dpa4]/descriptor[dpa2]/repinit/nselMaximally possible number of selected neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”, where “factor” is a float larger than 1. This option determines the value automatically: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, multiplies that maximum by “factor”, and rounds the result up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- neuron:#
- type:
list, optional, default:[25, 50, 100]argument path:model[dpa4]/descriptor[dpa2]/repinit/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:16argument path:model[dpa4]/descriptor[dpa2]/repinit/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- tebd_dim:#
- type:
int, optional, default:8argument path:model[dpa4]/descriptor[dpa2]/repinit/tebd_dimDimension of the atom-type embedding (tebd).
- tebd_input_mode:#
- type:
str, optional, default:concatargument path:model[dpa4]/descriptor[dpa2]/repinit/tebd_input_modeHow the atom-type embedding (tebd) is fed into the descriptor. Supported modes are [‘concat’, ‘strip’].
‘concat’: Concatenate the type embedding with the smoothed radial information as the combined input to the embedding network. When type_one_side is False, the input is input_ij = concat([r_ij, tebd_j, tebd_i]). When type_one_side is True, the input is input_ij = concat([r_ij, tebd_j]). The output is out_ij = embedding(input_ij) for the pair-wise representation of atom i with neighbor j.
‘strip’: Use a separate embedding network for the type embedding and combine its output with the radial embedding-network output. When type_one_side is False, the input is input_t = concat([tebd_j, tebd_i]). (Supported Backend: PyTorch) When type_one_side is True, the input is input_t = tebd_j. The output is out_ij = embedding_t(input_t) * embedding_s(r_ij) + embedding_s(r_ij) for the pair-wise representation of atom i with neighbor j.
- set_davg_zero:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repinit/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[dpa2]/repinit/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”.
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa2]/repinit/type_one_sideIf true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of centric atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa2]/repinit/resnet_dtWhether to use a “Timestep” in the skip connection.
- use_three_body:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa2]/repinit/use_three_bodyWhether to concatenate an additional three-body representation to the repinit output descriptor.
- three_body_neuron:#
- type:
list, optional, default:[2, 4, 8]argument path:model[dpa4]/descriptor[dpa2]/repinit/three_body_neuronNumber of neurons in each hidden layer of the three-body embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- three_body_rcut:#
- type:
float, optional, default:4.0argument path:model[dpa4]/descriptor[dpa2]/repinit/three_body_rcutThe cut-off radius in the three-body representation.
- three_body_rcut_smth:#
- type:
float, optional, default:0.5argument path:model[dpa4]/descriptor[dpa2]/repinit/three_body_rcut_smthWhere to start smoothing in the three-body representation. For example the 1/r term is smoothed from three_body_rcut to three_body_rcut_smth.
- three_body_sel:#
- type:
int|str, optional, default:40argument path:model[dpa4]/descriptor[dpa2]/repinit/three_body_selMaximally possible number of selected neighbors in the three-body representation. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
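The "auto" rule described above can be sketched as a small helper. The function name `resolve_auto_sel` and the `max_neighbors` argument (the largest neighbor count observed within the cutoff) are hypothetical; the real implementation gathers that count from the training data and may round differently in floating-point edge cases.

```python
import math


def resolve_auto_sel(spec, max_neighbors):
    """Turn 'auto' or 'auto:factor' into a selection number.

    Steps: multiply the observed maximum by the factor,
    then round up to a multiple of 4.
    """
    if spec == "auto":
        factor = 1.1  # "auto" is equivalent to "auto:1.1"
    elif spec.startswith("auto:"):
        factor = float(spec.split(":", 1)[1])
    else:
        raise ValueError(f"not an auto spec: {spec!r}")
    scaled = max_neighbors * factor
    return int(math.ceil(scaled / 4.0)) * 4


print(resolve_auto_sel("auto", 30))      # 30 * 1.1 = 33 -> 36
print(resolve_auto_sel("auto:1.5", 40))  # 40 * 1.5 = 60 -> 60
```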
- repformer:#
- type:
dictargument path:model[dpa4]/descriptor[dpa2]/repformerArguments for the repformer block, which refines the representations produced by repinit.
- rcut:#
- type:
floatargument path:model[dpa4]/descriptor[dpa2]/repformer/rcutThe cut-off radius.
- rcut_smth:#
- type:
floatargument path:model[dpa4]/descriptor[dpa2]/repformer/rcut_smthWhere to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth.
- nsel:#
- type:
int|strargument path:model[dpa4]/descriptor[dpa2]/repformer/nselMaximally possible number of selected neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- nlayers:#
- type:
int, optional, default:3argument path:model[dpa4]/descriptor[dpa2]/repformer/nlayersNumber of repformer layers.
- g1_dim:#
- type:
int, optional, default:128argument path:model[dpa4]/descriptor[dpa2]/repformer/g1_dimDimension of the g1 representation, i.e., the rotationally invariant single-atom representation.
- g2_dim:#
- type:
int, optional, default:16argument path:model[dpa4]/descriptor[dpa2]/repformer/g2_dimDimension of the g2 representation, i.e., the rotationally invariant pair-atom representation.
- axis_neuron:#
- type:
int, optional, default:4argument path:model[dpa4]/descriptor[dpa2]/repformer/axis_neuronSize of the submatrix used in the symmetrization operations.
- direct_dist:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa2]/repformer/direct_distWhether to use the direct distance as input to the embedding net when building g2, instead of the smoothed 1/r.
- update_g1_has_conv:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/update_g1_has_convWhether to include the convolution term when updating g1.
- update_g1_has_drrd:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/update_g1_has_drrdWhether to include the drrd term when updating g1.
- update_g1_has_grrg:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/update_g1_has_grrgWhether to include the grrg term when updating g1.
- update_g1_has_attn:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/update_g1_has_attnWhether to include localized self-attention when updating g1.
- update_g2_has_g1g1:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/update_g2_has_g1g1Whether to include the g1 x g1 term when updating g2.
- update_g2_has_attn:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/update_g2_has_attnWhether to include gated self-attention when updating g2.
- use_sqrt_nnei:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/use_sqrt_nneiWhether to normalize symmetrization_op by the square root of the number of neighbors instead of by the number of neighbors itself.
- g1_out_conv:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/g1_out_convWhether to keep the convolutional update of g1 as a separate branch outside the concatenated MLP update.
- g1_out_mlp:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/g1_out_mlpWhether to keep the self-MLP update of g1 as a separate branch outside the concatenated MLP update.
- update_h2:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa2]/repformer/update_h2Whether to update the h2 representation, i.e., the rotationally equivariant pair representation.
- attn1_nhead:#
- type:
int, optional, default:4argument path:model[dpa4]/descriptor[dpa2]/repformer/attn1_nheadNumber of heads in the localized self-attention used to update g1.
- attn2_nhead:#
- type:
int, optional, default:4argument path:model[dpa4]/descriptor[dpa2]/repformer/attn2_nheadNumber of heads in the gated self-attention used to update g2.
- attn2_has_gate:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa2]/repformer/attn2_has_gateWhether to use gating in the gated self-attention used to update g2.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[dpa2]/repformer/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”.
- update_style:#
- type:
str, optional, default:res_avgargument path:model[dpa4]/descriptor[dpa2]/repformer/update_styleStyle to update a representation. Supported options are:
‘res_avg’: Updates a rep u with: u = 1/sqrt{n+1} (u + u_1 + u_2 + … + u_n)
‘res_incr’: Updates a rep u with: u = u + 1/sqrt{n} (u_1 + u_2 + … + u_n)
‘res_residual’: Updates a rep u with: u = u + (r_1*u_1 + r_2*u_2 + … + r_n*u_n), where r_1, r_2, …, r_n are residual weights defined by update_residual and update_residual_init.
- update_residual:#
- type:
float, optional, default:0.001argument path:model[dpa4]/descriptor[dpa2]/repformer/update_residualThe initial std of the residual vector weights when updating in residual mode.
- update_residual_init:#
- type:
str, optional, default:normargument path:model[dpa4]/descriptor[dpa2]/repformer/update_residual_initThe initialization mode of the residual vector weights when updating in residual mode. Supported modes are: [‘norm’, ‘const’].
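The three update_style formulas can be written out directly. The sketch below uses scalars for readability; the function name and the `residual_weights` argument are hypothetical, and how the weights are initialized from update_residual and update_residual_init is not modelled.

```python
import math


def update_representation(u, updates, style, residual_weights=None):
    """Combine a representation u with n candidate updates u_1..u_n."""
    n = len(updates)
    if style == "res_avg":
        # u = 1/sqrt(n+1) * (u + u_1 + ... + u_n)
        return (u + sum(updates)) / math.sqrt(n + 1)
    if style == "res_incr":
        # u = u + 1/sqrt(n) * (u_1 + ... + u_n)
        return u + sum(updates) / math.sqrt(n)
    if style == "res_residual":
        # u = u + (r_1*u_1 + ... + r_n*u_n), one weight per update
        return u + sum(r * du for r, du in zip(residual_weights, updates))
    raise ValueError(f"unknown update_style: {style}")


print(update_representation(1.0, [1.0, 1.0, 1.0], "res_avg"))
print(update_representation(1.0, [2.0, 4.0], "res_residual", [0.5, 0.5]))
```

Small residual weights keep each layer close to the identity at the start of training, which is one plausible reason the residual style exposes its initial scale as a parameter.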
- set_davg_zero:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used.
- trainable_ln:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/repformer/trainable_lnWhether to use trainable shift and scale weights in layer normalization.
- ln_eps:#
- type:
NoneType|float, optional, default:Noneargument path:model[dpa4]/descriptor[dpa2]/repformer/ln_epsThe epsilon value for layer normalization. The default is 1e-3 in TensorFlow, for consistency with Keras, and 1e-5 in the PyTorch and DP implementations.
- concat_output_tebd:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/concat_output_tebdWhether to concatenate the type embedding to the descriptor output.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[dpa2]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- smooth:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/smoothWhether to use smoothness in processes such as attention weights calculation.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[dpa2]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[dpa2]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division by zero in the environment matrix calculation without protection.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa2]/trainableWhether the parameters in the embedding net are trainable.
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[dpa2]/seedRandom seed for parameter initialization.
- add_tebd_to_repinit_out:#
- type:
bool, optional, default:False, alias: repformer_add_type_ebd_to_seqargument path:model[dpa4]/descriptor[dpa2]/add_tebd_to_repinit_outWhether to add the type embedding to the output of repinit before passing it to repformer.
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa2]/use_econf_tebd(Supported Backend: PyTorch) Whether to use an electronic-configuration-based type embedding.
- use_tebd_bias:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa2]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
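A dpa2 descriptor fragment using the keys documented above might look like the following. The numeric values are illustrative rather than recommendations, and the repinit keys `rcut`, `rcut_smth`, and `nsel` are assumed to mirror the repformer ones. The dictionary is built in Python and serialized to JSON, the format of the input file.

```python
import json

# Illustrative values only; tune cutoffs and selection numbers for your system.
descriptor = {
    "type": "dpa2",
    "repinit": {
        "rcut": 6.0,        # assumed key, mirrors repformer/rcut
        "rcut_smth": 0.5,   # assumed key, mirrors repformer/rcut_smth
        "nsel": 120,        # assumed key, mirrors repformer/nsel
        "neuron": [25, 50, 100],
        "axis_neuron": 16,
        "tebd_dim": 8,
        "tebd_input_mode": "concat",
        "use_three_body": True,
        "three_body_rcut": 4.0,
        "three_body_rcut_smth": 0.5,
        "three_body_sel": 40,
    },
    "repformer": {
        "rcut": 4.0,
        "rcut_smth": 3.5,
        "nsel": 40,
        "nlayers": 3,
        "g1_dim": 128,
        "g2_dim": 16,
        "update_style": "res_residual",
        "update_residual": 0.001,
        "update_residual_init": "norm",
    },
    "concat_output_tebd": True,
    "precision": "default",
}

print(json.dumps(descriptor, indent=2))
```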
When type is set to
dpa3: (Supported Backend: PyTorch)
- repflow:#
- type:
dictargument path:model[dpa4]/descriptor[dpa3]/repflowArguments for the repflow block, which updates node, edge, and angle representations in DPA3.
- n_dim:#
- type:
int, optional, default:128argument path:model[dpa4]/descriptor[dpa3]/repflow/n_dimDimension of the node (atom-wise) representation.
- e_dim:#
- type:
int, optional, default:64argument path:model[dpa4]/descriptor[dpa3]/repflow/e_dimDimension of the edge (pair-wise) representation.
- a_dim:#
- type:
int, optional, default:64argument path:model[dpa4]/descriptor[dpa3]/repflow/a_dimDimension of the angle (three-body/angular) representation.
- nlayers:#
- type:
int, optional, default:6argument path:model[dpa4]/descriptor[dpa3]/repflow/nlayersNumber of repflow layers.
- e_rcut:#
- type:
floatargument path:model[dpa4]/descriptor[dpa3]/repflow/e_rcutThe edge cut-off radius.
- e_rcut_smth:#
- type:
floatargument path:model[dpa4]/descriptor[dpa3]/repflow/e_rcut_smthWhere to start smoothing for edge. For example the 1/r term is smoothed from e_rcut to e_rcut_smth.
- e_sel:#
- type:
int|strargument path:model[dpa4]/descriptor[dpa3]/repflow/e_selMaximally possible number of selected edge neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- a_rcut:#
- type:
floatargument path:model[dpa4]/descriptor[dpa3]/repflow/a_rcutThe angle cut-off radius.
- a_rcut_smth:#
- type:
floatargument path:model[dpa4]/descriptor[dpa3]/repflow/a_rcut_smthWhere to start smoothing for angle. For example the 1/r term is smoothed from a_rcut to a_rcut_smth.
- a_sel:#
- type:
int|strargument path:model[dpa4]/descriptor[dpa3]/repflow/a_selMaximally possible number of selected angle neighbors. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- a_compress_rate:#
- type:
int, optional, default:0argument path:model[dpa4]/descriptor[dpa3]/repflow/a_compress_rateThe compression rate for angular messages. The default value is 0, indicating no compression. If a non-zero integer c is provided, the node and edge dimensions will be compressed to a_dim/c and a_dim/2c, respectively, within the angular message.
- a_compress_e_rate:#
- type:
int, optional, default:1argument path:model[dpa4]/descriptor[dpa3]/repflow/a_compress_e_rateThe extra compression rate for edge in angular message compression. The default value is 1. When using angular message compression with a_compress_rate c and a_compress_e_rate c_e, the edge dimension will be compressed to (c_e * a_dim / 2c) within the angular message.
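The compressed widths follow from simple arithmetic. In this sketch the function name is hypothetical, `a_dim / 2c` is read as a_dim / (2 * c), and integer division is an assumption (the actual implementation may require exact divisibility).

```python
def angular_compress_dims(a_dim, a_compress_rate, a_compress_e_rate=1):
    """Return (node_dim, edge_dim) inside the angular message, or None if uncompressed."""
    if a_compress_rate == 0:
        return None  # 0 means no compression
    c = a_compress_rate
    node_dim = a_dim // c
    edge_dim = (a_compress_e_rate * a_dim) // (2 * c)
    return node_dim, edge_dim


print(angular_compress_dims(64, 2))     # (32, 16)
print(angular_compress_dims(64, 2, 2))  # (32, 32)
```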
- a_compress_use_split:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa3]/repflow/a_compress_use_splitWhether to split first sub-vectors instead of linear mapping during angular message compression. The default value is False.
- n_multi_edge_message:#
- type:
int, optional, default:1argument path:model[dpa4]/descriptor[dpa3]/repflow/n_multi_edge_messageNumber of heads in the multi-edge-message update of node features. Default is 1, i.e., a single edge-message head.
- axis_neuron:#
- type:
int, optional, default:4argument path:model[dpa4]/descriptor[dpa3]/repflow/axis_neuronSize of the submatrix used in the symmetrization operations.
- fix_stat_std:#
- type:
float, optional, default:0.3argument path:model[dpa4]/descriptor[dpa3]/repflow/fix_stat_stdIf non-zero (default is 0.3), use this constant as the normalization standard deviation instead of computing it from data statistics.
- skip_stat:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa3]/repflow/skip_stat(Deprecated, kept only for compatibility.) This parameter is obsolete and will be removed. If set to True, it forces fix_stat_std=0.3 for backward compatibility. Use the fix_stat_std parameter instead.
- update_angle:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa3]/repflow/update_angleWhether to update the angle representation. If False, only the node and edge representations are updated.
- update_style:#
- type:
str, optional, default:res_residualargument path:model[dpa4]/descriptor[dpa3]/repflow/update_styleStyle to update a representation. Supported options are:
‘res_avg’: Updates a rep u with: u = 1/sqrt{n+1} (u + u_1 + u_2 + … + u_n)
‘res_incr’: Updates a rep u with: u = u + 1/sqrt{n} (u_1 + u_2 + … + u_n)
‘res_residual’: Updates a rep u with: u = u + (r_1*u_1 + r_2*u_2 + … + r_n*u_n), where r_1, r_2, …, r_n are residual weights defined by update_residual and update_residual_init.
- update_residual:#
- type:
float, optional, default:0.1argument path:model[dpa4]/descriptor[dpa3]/repflow/update_residualThe initial std of the residual vector weights when updating in residual mode.
- update_residual_init:#
- type:
str, optional, default:constargument path:model[dpa4]/descriptor[dpa3]/repflow/update_residual_initThe initialization mode of the residual vector weights when updating in residual mode. Supported modes are: [‘norm’, ‘const’].
- optim_update:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa3]/repflow/optim_updateWhether to enable the optimized update method. Uses a more efficient implementation when enabled. Default is True.
- smooth_edge_update:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa3]/repflow/smooth_edge_updateWhether to make edge update smooth. If True, the edge update from angle message will not use self as padding.
- edge_init_use_dist:#
- type:
bool, optional, default:False, alias: edge_use_distargument path:model[dpa4]/descriptor[dpa3]/repflow/edge_init_use_distWhether to use direct distance r to initialize the edge features instead of 1/r. Note that when using this option, the activation function will not be used when initializing edge features.
- use_exp_switch:#
- type:
bool, optional, default:False, alias: use_env_envelopeargument path:model[dpa4]/descriptor[dpa3]/repflow/use_exp_switchWhether to use an exponential switch function instead of a polynomial one in the neighbor update. The exponential switch function ensures neighbor contributions smoothly diminish as the interatomic distance r approaches the cutoff radius rcut. Specifically, the function is defined as: s(r) = exp(-exp(20 * (r - rcut_smth) / rcut_smth)) for 0 < r ≤ rcut, and s(r) = 0 for r > rcut. Here, rcut_smth is an adjustable smoothing factor and should be chosen carefully according to rcut, ensuring s(r) approaches zero smoothly at the cutoff. Typical recommended values are rcut_smth = 5.3 for rcut = 6.0, and 3.5 for rcut = 4.0.
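The switch function can be evaluated directly to see why rcut_smth must be chosen to match rcut. The helper below is a literal transcription of the formula in the description, not the library's implementation; the function name is hypothetical.

```python
import math


def exp_switch(r, rcut, rcut_smth):
    """s(r) = exp(-exp(20 * (r - rcut_smth) / rcut_smth)) for 0 < r <= rcut, else 0."""
    if r <= 0:
        raise ValueError("r must be positive")
    if r > rcut:
        return 0.0
    return math.exp(-math.exp(20.0 * (r - rcut_smth) / rcut_smth))


# Recommended pairing from the description: rcut_smth = 5.3 for rcut = 6.0.
for r in (1.0, 5.3, 6.0, 6.5):
    print(r, exp_switch(r, rcut=6.0, rcut_smth=5.3))
```

At r = rcut_smth the value is exp(-1), and with the recommended pairing it has already dropped below 1e-5 at r = rcut, so the jump to zero beyond the cutoff is negligible.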
- use_dynamic_sel:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa3]/repflow/use_dynamic_selWhether to dynamically select neighbors within the cutoff radius. If True, the exact number of neighbors within the cutoff radius is used without padding to a fixed selection number. When enabled, users can safely set larger values for e_sel or a_sel (e.g., 1200 or 300, respectively) to guarantee capturing all neighbors within the cutoff radius. Note that when using dynamic selection, smooth_edge_update must be True.
- sel_reduce_factor:#
- type:
float, optional, default:10.0argument path:model[dpa4]/descriptor[dpa3]/repflow/sel_reduce_factorReduction factor applied to neighbor-scale normalization when use_dynamic_sel is True. In the dynamic selection case, neighbor-scale normalization will use e_sel / sel_reduce_factor or a_sel / sel_reduce_factor instead of the raw e_sel or a_sel values, accommodating larger selection numbers.
- sequential_update:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa3]/repflow/sequential_updateWhether to use sequential update mode within each repflow layer. When True, updates are applied sequentially: edge self → angle self (using updated edge) → edge angle (using updated angle) → node (using final edge), instead of the default parallel mode where all updates use original embeddings. Currently only supports update_style=’res_residual’ and requires update_angle=True.
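The constraints stated for use_dynamic_sel and sequential_update, and the scale derived from sel_reduce_factor, can be checked before training starts. Both helpers are hypothetical and rely only on the defaults listed in this section; how the scale enters the actual normalization is not specified here.

```python
def check_repflow_constraints(repflow):
    """Return a list of violated constraints for a repflow config dict."""
    problems = []
    if repflow.get("use_dynamic_sel", False) and not repflow.get("smooth_edge_update", False):
        problems.append("use_dynamic_sel requires smooth_edge_update=True")
    if repflow.get("sequential_update", False):
        if repflow.get("update_style", "res_residual") != "res_residual":
            problems.append("sequential_update requires update_style='res_residual'")
        if not repflow.get("update_angle", True):
            problems.append("sequential_update requires update_angle=True")
    return problems


def neighbor_norm_scale(sel, use_dynamic_sel, sel_reduce_factor=10.0):
    """Quantity used for neighbor-scale normalization, per the description."""
    return sel / sel_reduce_factor if use_dynamic_sel else float(sel)


print(check_repflow_constraints({"use_dynamic_sel": True}))
print(neighbor_norm_scale(1200, use_dynamic_sel=True))  # 120.0
```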
- concat_output_tebd:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa3]/concat_output_tebdWhether to concatenate the type embedding to the descriptor output.
- add_chg_spin_ebd:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa3]/add_chg_spin_ebdWhether to add charge and spin embedding to the descriptor. When enabled, the dedicated charge_spin input (shape [nframes, 2], [charge, spin]) is embedded and added to the type embedding. When charge_spin is missing in the input data, default_chg_spin is used as a fallback if provided.
- default_chg_spin:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[dpa4]/descriptor[dpa3]/default_chg_spinDefault charge and spin values used as fallback when charge_spin is not provided in the input data. Must be a list of length 2 [charge, spin]. Only used when add_chg_spin_ebd is True.
- activation_function:#
- type:
str, optional, default:siluargument path:model[dpa4]/descriptor[dpa3]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[dpa3]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[dpa3]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[dpa3]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division by zero in the environment matrix calculation without protection.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa3]/trainableWhether the parameters in the embedding net are trainable.
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[dpa3]/seedRandom seed for parameter initialization.
- use_econf_tebd:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa3]/use_econf_tebd(Supported Backend: PyTorch) Whether to use an electronic-configuration-based type embedding.
- use_tebd_bias:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[dpa3]/use_tebd_biasWhether to use a bias term in the type-embedding layer.
- use_loc_mapping:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[dpa3]/use_loc_mappingWhether to use local atom index mapping in training or non-parallel inference. When True, local indexing and mapping are applied to neighbor lists and embeddings during descriptor computation.
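A dpa3 descriptor fragment that combines the exponential switch with dynamic neighbor selection might look like this. The cutoff and selection values are the ones mentioned in the parameter descriptions; everything else uses the documented defaults. This is a sketch, not a recommended production setup.

```python
import json

descriptor = {
    "type": "dpa3",
    "repflow": {
        "n_dim": 128,
        "e_dim": 64,
        "a_dim": 64,
        "nlayers": 6,
        "e_rcut": 6.0,
        "e_rcut_smth": 5.3,   # pairing suggested for the exponential switch
        "e_sel": 1200,        # large value is acceptable with dynamic selection
        "a_rcut": 4.0,
        "a_rcut_smth": 3.5,
        "a_sel": 300,
        "use_exp_switch": True,
        "use_dynamic_sel": True,
        "smooth_edge_update": True,  # required when use_dynamic_sel is True
        "sel_reduce_factor": 10.0,
        "update_style": "res_residual",
        "update_residual": 0.1,
        "update_residual_init": "const",
    },
    "activation_function": "silu",
    "precision": "default",
    "use_loc_mapping": True,
}

print(json.dumps(descriptor, indent=2))
```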
When type is set to
se_a_ebd_v2 (or its alias se_a_tpe_v2): (Supported Backend: TensorFlow)
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[dpa4]/descriptor[se_a_ebd_v2]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- rcut:#
- type:
float, optional, default:6.0argument path:model[dpa4]/descriptor[se_a_ebd_v2]/rcutThe cut-off radius.
- rcut_smth:#
- type:
float, optional, default:0.5argument path:model[dpa4]/descriptor[se_a_ebd_v2]/rcut_smthWhere to start smoothing. For example the 1/r term is smoothed from rcut to rcut_smth.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[dpa4]/descriptor[se_a_ebd_v2]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[dpa4]/descriptor[se_a_ebd_v2]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[se_a_ebd_v2]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_a_ebd_v2]/resnet_dtWhether to use a “Timestep” in the skip connection.
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_a_ebd_v2]/type_one_sideIf true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of centric atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[se_a_ebd_v2]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_a_ebd_v2]/trainableWhether the parameters in the embedding net are trainable.
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[se_a_ebd_v2]/seedRandom seed for parameter initialization.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[se_a_ebd_v2]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- env_protection:#
- type:
float, optional, default:0.0argument path:model[dpa4]/descriptor[se_a_ebd_v2]/env_protection(Supported Backend: PyTorch) Protection parameter to prevent division-by-zero errors during environment matrix calculations. For example, when padding is used, some neighbor distances may be zero, which would cause a division by zero in the environment matrix calculation without protection.
- set_davg_zero:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_a_ebd_v2]/set_davg_zeroSet the normalization average to zero. This option should be set when atom_ener in the energy fitting is used.
When type is set to
se_a_mask: (Supported Backend: TensorFlow) Used by the smooth edition of Deep Potential. It can accept a variable number of atoms in a frame (non-PBC system). aparam is required as an indicator matrix marking each input atom as real or virtual.
- sel:#
- type:
str|list[int], optional, default:autoargument path:model[dpa4]/descriptor[se_a_mask]/selThis parameter sets the number of selected neighbors for each type of atom. It can be:
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. sel[i] is recommended to be larger than the maximally possible number of type-i neighbors in the cut-off radius. It is noted that the total sel value must be less than 4096 in a GPU environment.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option will automatically determine the sel. In detail it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiply the maximum by the “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
- neuron:#
- type:
list[int], optional, default:[10, 20, 40]argument path:model[dpa4]/descriptor[se_a_mask]/neuronNumber of neurons in each hidden layer of the embedding net. When two layers are of the same size or one layer is twice as large as the previous layer, a skip connection is built.
- axis_neuron:#
- type:
int, optional, default:4, alias: n_axis_neuronargument path:model[dpa4]/descriptor[se_a_mask]/axis_neuronSize of the submatrix of G (the embedding matrix) used to build the descriptor.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/descriptor[se_a_mask]/activation_functionThe activation function in the embedding net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_a_mask]/resnet_dtWhether to use a “Timestep” in the skip connection.
- type_one_side:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/descriptor[se_a_mask]/type_one_sideIf true, the embedding network parameters vary by types of neighbor atoms only, so there will be $N_\text{types}$ sets of embedding network parameters. Otherwise, the embedding network parameters vary by types of centric atoms and types of neighbor atoms, so there will be $N_\text{types}^2$ sets of embedding network parameters.
- exclude_types:#
- type:
list[list[int]], optional, default:[]argument path:model[dpa4]/descriptor[se_a_mask]/exclude_typesThe excluded pairs of types which have no interaction with each other. For example, [[0, 1]] means no interaction between type 0 and type 1.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/descriptor[se_a_mask]/precisionThe precision of the embedding net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- trainable:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/descriptor[se_a_mask]/trainableWhether the parameters in the embedding net are trainable.
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/descriptor[se_a_mask]/seedRandom seed for parameter initialization.
- fitting_net:#
- type:
dictargument path:model[dpa4]/fitting_net(Supported Backend: PyTorch) Fitting network configuration. DPA4/SeZM uses the dpa4_ener GLU energy fitting.
Depending on the value of type, different sub args are accepted.
- type:#
- type:
str(flag key), default:enerargument path:model[dpa4]/fitting_net/typeThe type of the fitting.
ener: Fit an energy model (potential energy surface).
dpa4_ener: (Supported Backend: PyTorch) Fit an energy model (potential energy surface).
dos: Fit a density of states model. The total density of states / site-projected density of states labels should be provided by dos.npy or atom_dos.npy in each data system. The file has a number of frames (rows) and a number of energy-grid columns (multiplied by the number of atoms in atom_dos.npy). See loss parameter.
property: (Supported Backend: PyTorch)
polar: Fit an atomic polarizability model. Global polarizability labels or atomic polarizability labels for all selected atoms (see sel_type) should be provided by polarizability.npy in each data system. The file should have shape (n_frames, 9*n_selected) for atomic polarizability labels, or shape (n_frames, 9) for global polarizability labels. See loss parameter.
dipole: Fit an atomic dipole model. Global dipole labels or atomic dipole labels for all selected atoms (see sel_type) should be provided by dipole.npy in each data system. The file should have shape (n_frames, 3*n_selected) for atomic dipole labels, or shape (n_frames, 3) for global dipole labels. See loss parameter.
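The label-file shapes quoted for the polar and dipole fittings can be computed with a small helper, which is handy for checking a data system before training. The function name and the convention of passing `n_selected=None` for global labels are assumptions made for this sketch.

```python
def expected_label_shape(kind, n_frames, n_selected=None):
    """Expected array shape of dipole.npy or polarizability.npy.

    kind: "dipole" (3 components) or "polar" (9 components).
    n_selected: number of selected atoms for atomic labels, None for global labels.
    """
    width = {"dipole": 3, "polar": 9}[kind]
    if n_selected is None:
        return (n_frames, width)
    return (n_frames, width * n_selected)


print(expected_label_shape("dipole", 100, n_selected=64))  # (100, 192)
print(expected_label_shape("polar", 100))                  # (100, 9)
```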
When type is set to
ener:Fit an energy model (potential energy surface).
- numb_fparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[ener]/numb_fparamThe dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[ener]/numb_aparamThe dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[dpa4]/fitting_net[ener]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[ener]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120], alias: n_neuronargument path:model[dpa4]/fitting_net[ener]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/fitting_net[ener]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/fitting_net[ener]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/fitting_net[ener]/resnet_dtWhether to use a “Timestep” in the skip connection
- trainable:#
- type:
list[bool]|bool, optional, default:Trueargument path:model[dpa4]/fitting_net[ener]/trainableWhether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool (Supported Backend: TensorFlow): Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
- rcond:#
- type:
NoneType|float, optional, default:Noneargument path:model[dpa4]/fitting_net[ener]/rcondThe condition number used to determine the initial energy shift for each type of atoms. See rcond in
numpy.linalg.lstsq() for more details.
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/fitting_net[ener]/seedRandom seed for parameter initialization of the fitting net
- atom_ener:#
- type:
list[float | None], optional, default:[]argument path:model[dpa4]/fitting_net[ener]/atom_enerSpecify the atomic energy in vacuum for each type
- layer_name:#
- type:
list[str], optionalargument path:model[dpa4]/fitting_net[ener]/layer_nameThe name of each layer. The length of this list should be equal to n_neuron + 1. If two layers, either in the same fitting or different fittings, have the same name, they will share the same neural network parameters. The shape of these layers should be the same. If null is given for a layer, parameters will not be shared.
- use_aparam_as_mask:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/fitting_net[ener]/use_aparam_as_maskWhether to use the aparam as a mask in input. If True, the aparam will not be used in the fitting net for embedding. When the descriptor is se_a_mask, the aparam is used as a mask to indicate whether each input atom is real or virtual, and use_aparam_as_mask should be set to True.
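The two accepted forms of trainable described above (a single bool, or one bool per layer) can be normalized to one flag per layer. The sketch below only illustrates the len(neuron)+1 rule; expand_trainable is a hypothetical helper name, not a DeePMD-kit function.

```python
def expand_trainable(trainable, neuron):
    """Return one trainable flag per layer (hidden layers plus the output layer).

    Hypothetical helper for illustration; not part of DeePMD-kit.
    """
    n_layers = len(neuron) + 1  # hidden layers followed by one output layer
    if isinstance(trainable, bool):
        # A single bool applies to every layer of the fitting net.
        return [trainable] * n_layers
    flags = list(trainable)
    if len(flags) != n_layers:
        raise ValueError(
            f"trainable has {len(flags)} entries, expected len(neuron)+1 = {n_layers}"
        )
    return flags


print(expand_trainable(True, [120, 120, 120]))
print(expand_trainable([False, False, True, True], [120, 120, 120]))
```

With the default neuron of [120, 120, 120], a per-layer list therefore needs four entries.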
When type is set to
dpa4_ener (or its alias sezm_ener): (Supported Backend: PyTorch) Fit an energy model (potential energy surface).
- numb_fparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[dpa4_ener]/numb_fparamDimension of frame parameters. If set to >0, each data system should provide fparam.npy.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[dpa4_ener]/numb_aparamDimension of atomic parameters. If set to >0, each data system should provide aparam.npy.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[dpa4]/fitting_net[dpa4_ener]/default_fparam(Supported Backend: PyTorch) Default frame parameters used when a data system does not provide fparam.npy.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[dpa4_ener]/dim_case_embd(Supported Backend: PyTorch) Dimension of the case embedding. For multitask training or fine-tuning with case embeddings, set this value to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[0], alias: n_neuronargument path:model[dpa4]/fitting_net[dpa4_ener]/neuronThe number of neurons in each hidden layer of the fitting net. Use 0 as an auto-width placeholder resolved from the descriptor width.
- activation_function:#
- type:
str, optional, default:siluargument path:model[dpa4]/fitting_net[dpa4_ener]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- precision:#
- type:
str, optional, default:float32argument path:model[dpa4]/fitting_net[dpa4_ener]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- resnet_dt:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/fitting_net[dpa4_ener]/resnet_dtWhether to use a “Timestep” in the skip connection
- trainable:#
- type:
list[bool]|bool, optional, default:Trueargument path:model[dpa4]/fitting_net[dpa4_ener]/trainableWhether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool (Supported Backend: TensorFlow): Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
- rcond:#
- type:
NoneType|float, optional, default:Noneargument path:model[dpa4]/fitting_net[dpa4_ener]/rcondThe condition number used to determine the initial energy shift for each type of atoms. See rcond in
numpy.linalg.lstsq() for more details.
- seed:#
- type:
int|NoneType, optional, default:Noneargument path:model[dpa4]/fitting_net[dpa4_ener]/seedRandom seed for parameter initialization of the fitting net
- atom_ener:#
- type:
list[float | None], optional, default:[]argument path:model[dpa4]/fitting_net[dpa4_ener]/atom_enerSpecify the atomic energy in vacuum for each type
- layer_name:#
- type:
list[str], optionalargument path:model[dpa4]/fitting_net[dpa4_ener]/layer_nameThe name of each layer. The length of this list should be equal to n_neuron + 1. If two layers, either in the same fitting or different fittings, have the same name, they will share the same neural network parameters. The shape of these layers should be the same. If null is given for a layer, parameters will not be shared.
- use_aparam_as_mask:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/fitting_net[dpa4_ener]/use_aparam_as_maskWhether to use the aparam as a mask in input. If True, the aparam will not be used in the fitting net for embedding. When the descriptor is se_a_mask, the aparam is used as a mask to indicate whether each input atom is real or virtual, and use_aparam_as_mask should be set to True.
- case_film_embd:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/fitting_net[dpa4_ener]/case_film_embd(Supported Backend: PyTorch) Whether to use case FiLM conditioning for shared DPA4/SeZM fitting. When enabled, the case embedding modulates fitting features instead of being concatenated to the fitting input.
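As an illustration, the dpa4_ener keys documented above can be assembled into a JSON fragment. The values below simply repeat the documented defaults, the variable names are placeholders, and the rest of the model section is omitted; treat it as a sketch rather than a recommended configuration.

```python
import json

# Values repeat the defaults documented above; adjust them for a real run.
fitting_net = {
    "type": "dpa4_ener",
    "neuron": [0],  # 0 = auto-width placeholder resolved from the descriptor width
    "activation_function": "silu",
    "precision": "float32",
    "resnet_dt": False,
    "numb_fparam": 0,
    "numb_aparam": 0,
    "dim_case_embd": 0,
    "case_film_embd": False,
}

# Serialize the fragment as it would appear under the model section.
fitting_net_json = json.dumps({"fitting_net": fitting_net}, indent=2)
print(fitting_net_json)
```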
When type is set to
dos:Fit a density of states model. The total density of states / site-projected density of states labels should be provided by dos.npy or atom_dos.npy in each data system. The file has a number of frames (rows) and a number of energy-grid columns (multiplied by the number of atoms in atom_dos.npy). See loss parameter.
- numb_fparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[dos]/numb_fparamThe dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[dos]/numb_aparamThe dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[dpa4]/fitting_net[dos]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[dos]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120]argument path:model[dpa4]/fitting_net[dos]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/fitting_net[dos]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- precision:#
- type:
str, optional, default:float64argument path:model[dpa4]/fitting_net[dos]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/fitting_net[dos]/resnet_dtWhether to use a “Timestep” in the skip connection
- trainable:#
- type:
list[bool]|bool, optional, default:Trueargument path:model[dpa4]/fitting_net[dos]/trainableWhether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool: Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
- rcond:#
- type:
NoneType|float, optional, default:Noneargument path:model[dpa4]/fitting_net[dos]/rcondThe condition number used to determine the initial energy shift for each type of atoms. See rcond in
numpy.linalg.lstsq() for more details.
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/fitting_net[dos]/seedRandom seed for parameter initialization of the fitting net
- numb_dos:#
- type:
int, optional, default:300argument path:model[dpa4]/fitting_net[dos]/numb_dosThe number of gridpoints on which the DOS is evaluated (NEDOS in VASP)
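The label layout described for the dos fitting can be checked with a small shape calculation. This is a sketch under the description above (one row per frame, numb_dos energy-grid columns, multiplied by the number of atoms for atom_dos.npy); dos_label_shape is a hypothetical helper, and the ordering of columns within atom_dos.npy is not specified here.

```python
def dos_label_shape(n_frames, numb_dos=300, n_atoms=None):
    """Expected 2D shape of dos.npy (n_atoms=None) or atom_dos.npy (n_atoms given).

    Hypothetical helper for illustration; not part of DeePMD-kit.
    """
    columns = numb_dos if n_atoms is None else numb_dos * n_atoms
    return (n_frames, columns)


print(dos_label_shape(100))              # total DOS labels
print(dos_label_shape(100, n_atoms=64))  # site-projected DOS labels
```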
When type is set to
property:(Supported Backend: PyTorch)
- numb_fparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[property]/numb_fparamThe dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[property]/numb_aparamThe dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[dpa4]/fitting_net[property]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[property]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120], alias: n_neuronargument path:model[dpa4]/fitting_net[property]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/fitting_net[property]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/fitting_net[property]/resnet_dtWhether to use a “Timestep” in the skip connection
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/fitting_net[property]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/fitting_net[property]/seedRandom seed for parameter initialization of the fitting net
- task_dim:#
- type:
int, optional, default:1argument path:model[dpa4]/fitting_net[property]/task_dimThe dimension of outputs of fitting net
- intensive:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/fitting_net[property]/intensiveWhether the fitting property is intensive
- distinguish_types:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/fitting_net[property]/distinguish_typesWhether to distinguish atom types when computing output statistics.
- property_name:#
- type:
strargument path:model[dpa4]/fitting_net[property]/property_nameThe names of fitting property, which should be consistent with the property name in the dataset.
- trainable:#
- type:
list[bool]|bool, optional, default:Trueargument path:model[dpa4]/fitting_net[property]/trainableWhether the parameters in the fitting net are trainable. This option can be
bool: True if all parameters of the fitting net are trainable, False otherwise.
list of bool: Specifies if each layer is trainable. Since the fitting net is composed of hidden layers followed by an output layer, the length of this list should be equal to len(neuron)+1.
When type is set to
polar:Fit an atomic polarizability model. Global polarizability labels or atomic polarizability labels for all selected atoms (see sel_type) should be provided by polarizability.npy in each data system. The file should have shape (n_frames, 9*n_selected) for atomic polarizability labels, or shape (n_frames, 9) for global polarizability labels. See loss parameter.
- numb_fparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[polar]/numb_fparam(Supported Backend: PyTorch) The dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[polar]/numb_aparam(Supported Backend: PyTorch) The dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[dpa4]/fitting_net[polar]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[polar]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120], alias: n_neuronargument path:model[dpa4]/fitting_net[polar]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/fitting_net[polar]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/fitting_net[polar]/resnet_dtWhether to use a “Timestep” in the skip connection
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/fitting_net[polar]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- fit_diag:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/fitting_net[polar]/fit_diagFit the diagonal part of the rotationally invariant polarizability matrix, which will be converted to the normal polarizability matrix by contracting with the rotation matrix.
- scale:#
- type:
list[float]|float, optional, default:1.0argument path:model[dpa4]/fitting_net[polar]/scaleThe output of the fitting net (polarizability matrix) will be scaled by
scale
- shift_diag:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/fitting_net[polar]/shift_diagWhether to shift the diagonal of polar, which is beneficial to training. Default is true.
- sel_type:#
- type:
int|NoneType|list[int], optional, alias: pol_typeargument path:model[dpa4]/fitting_net[polar]/sel_type(Supported Backend: TensorFlow) The atom types for which the atomic polarizability will be provided. If not set, all types will be selected.
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/fitting_net[polar]/seedRandom seed for parameter initialization of the fitting net
When type is set to
dipole:Fit an atomic dipole model. Global dipole labels or atomic dipole labels for all selected atoms (see sel_type) should be provided by dipole.npy in each data system. The file should have shape (n_frames, 3*n_selected) for atomic dipole labels, or shape (n_frames, 3) for global dipole labels. See loss parameter.
- numb_fparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[dipole]/numb_fparam(Supported Backend: PyTorch) The dimension of the frame parameter. If set to >0, file fparam.npy should be included to provide the input fparams.
- numb_aparam:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[dipole]/numb_aparam(Supported Backend: PyTorch) The dimension of the atomic parameter. If set to >0, file aparam.npy should be included to provide the input aparams.
- default_fparam:#
- type:
list[float]|NoneType, optional, default:Noneargument path:model[dpa4]/fitting_net[dipole]/default_fparam(Supported Backend: PyTorch) The default frame parameter. If set, when fparam.npy files are not included in the data system, this value will be used as the default value for the frame parameter in the fitting net.
- dim_case_embd:#
- type:
int, optional, default:0argument path:model[dpa4]/fitting_net[dipole]/dim_case_embd(Supported Backend: PyTorch) The dimension of the case embedding. When training or fine-tuning a multitask model with case embeddings, this number should be set to the number of model branches.
- neuron:#
- type:
list[int], optional, default:[120, 120, 120], alias: n_neuronargument path:model[dpa4]/fitting_net[dipole]/neuronThe number of neurons in each hidden layer of the fitting net. When two hidden layers are of the same size, a skip connection is built.
- activation_function:#
- type:
str, optional, default:tanhargument path:model[dpa4]/fitting_net[dipole]/activation_functionThe activation function in the fitting net. Supported activation functions are “gelu”, “gelu_tf”, “relu”, “silut”, “none”, “silu”, “tanh”, “softplus”, “sigmoid”, “linear”, “relu6”. Note that “gelu” denotes the custom operator version, and “gelu_tf” denotes the TF standard version. If you set “None” or “none” here, no activation function will be used.
- resnet_dt:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/fitting_net[dipole]/resnet_dtWhether to use a “Timestep” in the skip connection
- precision:#
- type:
str, optional, default:defaultargument path:model[dpa4]/fitting_net[dipole]/precisionThe precision of the fitting net parameters, supported options are “default”, “bfloat16”, “float64”, “float16”, “float32”. Default follows the interface precision.
- sel_type:#
- type:
int|NoneType|list[int], optional, alias: dipole_typeargument path:model[dpa4]/fitting_net[dipole]/sel_type(Supported Backend: TensorFlow) The atom types for which the atomic dipole will be provided. If not set, all types will be selected.
- seed:#
- type:
int|NoneType, optionalargument path:model[dpa4]/fitting_net[dipole]/seedRandom seed for parameter initialization of the fitting net
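The label shapes stated for the polar and dipole fittings, (n_frames, 9*n_selected) or (n_frames, 9) for polarizability and (n_frames, 3*n_selected) or (n_frames, 3) for dipole, depend on how many atoms sel_type selects. The sketch below computes them; tensor_label_shape and its arguments are hypothetical names, and atom_types is assumed to be a per-atom list of type indexes.

```python
def tensor_label_shape(n_frames, atom_types, components, sel_type=None, atomic=True):
    """Expected shape of dipole.npy (components=3) or polarizability.npy (components=9).

    Hypothetical helper for illustration; not part of DeePMD-kit.
    """
    if not atomic:
        # Global labels: one tensor per frame.
        return (n_frames, components)
    if sel_type is None:
        # No selection given: all atom types are selected.
        n_selected = len(atom_types)
    else:
        selected = {sel_type} if isinstance(sel_type, int) else set(sel_type)
        n_selected = sum(1 for t in atom_types if t in selected)
    return (n_frames, components * n_selected)


# Two water molecules with type 0 = O and type 1 = H (an assumed example system).
types = [0, 1, 1, 0, 1, 1]
print(tensor_label_shape(10, types, components=3, sel_type=[0]))
print(tensor_label_shape(10, types, components=9))
print(tensor_label_shape(10, types, components=3, atomic=False))
```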
- use_compile:#
- type:
bool, optional, default:Falseargument path:model[dpa4]/use_compile(Supported Backend: PyTorch) Experimental feature. If True, use compact sparse edges together with symbolic make_fx and torch.compile in the DPA4/SeZM model. This path may still expose PyTorch compiler bugs, but can improve training speed by roughly 2-3x on supported workloads. Requires torch==2.11. NVIDIA GPUs require CUDA >= 12.6. Apple Silicon Macs are also supported. Tested with Python 3.13.
- enable_tf32:#
- type:
bool, optional, default:Trueargument path:model[dpa4]/enable_tf32(Supported Backend: PyTorch) If True, enable TF32 matmul precision when use_compile=True.
- model_branch_alias:#
- type:
list[str], optional, default:[]argument path:model[dpa4]/model_branch_alias(Supported Backend: PyTorch) List of aliases for this model branch. Multiple aliases can be defined, and any alias can reference this branch throughout the model usage. Used only in multitask models.
- info:#
- type:
dict, optional, default:{}argument path:model[dpa4]/info(Supported Backend: PyTorch) Dictionary of metadata for this model branch. Store arbitrary key-value pairs with branch-specific information. Used only in multitask models.
- bridging_method:#
- type:
str, optional, default:Noneargument path:model[dpa4]/bridging_method(Supported Backend: PyTorch) Short-range bridging method. Currently supports ‘ZBL’. The value is case-insensitive; set it to ‘None’ to disable bridging.
- bridging_r_inner:#
- type:
float, optional, default:0.5argument path:model[dpa4]/bridging_r_inner(Supported Backend: PyTorch) Inner clamping radius in Å. ML descriptor distances below this radius are frozen. Only used when bridging_method is enabled. For ZBL bridging, set training.training_data.min_pair_dist to the same value so frames with atom pairs closer than bridging_r_inner are skipped during training.
- bridging_r_outer:#
- type:
float, optional, default:0.8argument path:model[dpa4]/bridging_r_outer(Supported Backend: PyTorch) Outer clamping radius in Å. The transition zone [bridging_r_inner, bridging_r_outer] uses a C^3-continuous septic Hermite polynomial. Only used when bridging_method is enabled.
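For reference, the unique degree-7 polynomial that rises from 0 to 1 on [0, 1] with vanishing first, second, and third derivatives at both ends is S(t) = 35t^4 - 84t^5 + 70t^6 - 20t^7. The sketch below evaluates it across the transition zone using the documented default radii. Whether DPA4/SeZM uses exactly this parameterization, and which quantity it blends, is an assumption here; septic_switch is a hypothetical name.

```python
def septic_switch(r, r_inner=0.5, r_outer=0.8):
    """Septic smoothstep: 0 at or below r_inner, 1 at or above r_outer.

    Assumed form for illustration; not taken from the DPA4/SeZM source.
    """
    if r <= r_inner:
        return 0.0
    if r >= r_outer:
        return 1.0
    t = (r - r_inner) / (r_outer - r_inner)  # map the transition zone onto [0, 1]
    return t**4 * (35.0 - 84.0 * t + 70.0 * t**2 - 20.0 * t**3)


for r in (0.5, 0.6, 0.65, 0.7, 0.8):
    print(r, septic_switch(r))
```

The polynomial is symmetric about the midpoint of the zone, so it takes the value 0.5 halfway between bridging_r_inner and bridging_r_outer.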
- lora:#
- type:
NoneType|dict, optional, default:Noneargument path:model[dpa4]/lora(Supported Backend: PyTorch) Low-rank adaptation for fine-tuning. Single-task only; setting this in a multi-task input (top-level or per-branch) raises an error in preprocess_shared_params because share_params links descriptor modules across branches to the same object, which would collapse per-branch LoRA into one shared adapter. When set, backbone SO3Linear and SO2Linear weights are frozen and low-rank A/B adapters are injected alongside them (the adapters share the base shape family so HybridMuon’s slice route applies identically). fitting_net, env_seed_embedding, radial_embedding, and small parameters (norm scales, LayerScale, FiLM strength, attention projections, bias terms) stay fully trainable; type embeddings, radial frequencies, and GatedActivation gate projections are frozen. Mid-training latest checkpoints include LoRA parameters so that training can be resumed; best checkpoints from full validation are saved with LoRA deltas folded into base weights, producing plain DPA4/SeZM checkpoints suitable for deployment.
- rank:#
- type:
intargument path:model[dpa4]/lora/rank(Supported Backend: PyTorch) LoRA rank; adapters are injected on every SO3Linear and SO2Linear.
- alpha:#
- type:
NoneType|float, optional, default:Noneargument path:model[dpa4]/lora/alpha(Supported Backend: PyTorch) LoRA scaling numerator; effective scaling is alpha / rank. When omitted, alpha defaults to rank (scaling = 1.0).
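The scaling rule above (alpha / rank, with alpha defaulting to rank) can be written out directly. The fold step below follows the conventional LoRA form W + scaling * (B A) on plain nested lists; how DPA4/SeZM applies it to SO3Linear and SO2Linear weights is not shown in this reference, so treat that part as an assumption. Both function names are hypothetical.

```python
def lora_scaling(rank, alpha=None):
    """Effective LoRA scaling: alpha / rank, with alpha defaulting to rank."""
    if rank <= 0:
        raise ValueError("rank must be a positive integer")
    if alpha is None:
        alpha = float(rank)
    return alpha / rank


def fold_lora(weight, a, b, rank, alpha=None):
    """Return weight + scaling * (b @ a) for nested lists (conventional LoRA fold).

    Shapes: weight is (out, in), b is (out, rank), a is (rank, in).
    """
    s = lora_scaling(rank, alpha)
    return [
        [
            weight[i][j] + s * sum(b[i][k] * a[k][j] for k in range(rank))
            for j in range(len(weight[0]))
        ]
        for i in range(len(weight))
    ]


print(lora_scaling(8))        # alpha omitted, so scaling is 1.0
print(lora_scaling(8, 16.0))  # scaling is 16 / 8
print(fold_lora([[1.0, 0.0], [0.0, 1.0]], a=[[3.0, 4.0]], b=[[1.0], [2.0]], rank=1))
```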
When type is set to
frozen:
- model_file:#
- type:
strargument path:model[frozen]/model_filePath to the frozen model file.
When type is set to
pairtab:(Supported Backend: TensorFlow) Pairwise tabulation energy model.
- tab_file:#
- type:
strargument path:model[pairtab]/tab_filePath to the tabulation file.
- rcut:#
- type:
floatargument path:model[pairtab]/rcutThe cut-off radius.
- sel:#
- type:
str|int|list[int]argument path:model[pairtab]/selThis parameter sets the number of selected neighbors. Note that it differs slightly from the sel parameter of other descriptors: atom types are not treated separately, and only the sum matters. This number strongly affects efficiency, so it should not be made too large. Usually 200 or less is enough, well below the GPU limit of 4096. It can be:
int. The maximum number of neighbor atoms to be considered. We recommend it to be less than 200.
list[int]. The length of the list should be the same as the number of atom types in the system. sel[i] gives the selected number of type-i neighbors. Only the sum of sel[i] matters, and it is recommended to be less than 200.
str. Can be “auto:factor” or “auto”. “factor” is a float number larger than 1. This option automatically determines sel: it counts the maximal number of neighbors within the cutoff radius for each type of neighbor, then multiplies the maximum by “factor”. Finally, the number is rounded up to a multiple of 4. The option “auto” is equivalent to “auto:1.1”.
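The “auto” rule for sel can be sketched as follows: parse the factor, scale the observed per-type neighbor maxima, and round up to a multiple of 4. The function names are hypothetical, the neighbor counts are made-up inputs, and the exact rounding of the scaled value before the multiple-of-4 step is an assumption.

```python
import math


def parse_auto_sel(spec):
    """Return the factor encoded in "auto" or "auto:factor"."""
    if spec == "auto":
        return 1.1  # "auto" is documented as equivalent to "auto:1.1"
    prefix, _, factor = spec.partition(":")
    if prefix != "auto" or not factor:
        raise ValueError(f"unsupported sel specification: {spec!r}")
    return float(factor)


def auto_sel(max_neighbors_per_type, spec="auto"):
    """Scale each per-type maximum by the factor and round up to a multiple of 4."""
    factor = parse_auto_sel(spec)
    return [4 * math.ceil(n * factor / 4) for n in max_neighbors_per_type]


print(auto_sel([46, 92]))        # made-up neighbor maxima for two atom types
print(auto_sel([10], "auto:1.5"))
```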
When type is set to
pairwise_dprc:(Supported Backend: TensorFlow)
- qm_model:#
- type:
dictargument path:model[pairwise_dprc]/qm_model
- qmmm_model:#
- type:
dictargument path:model[pairwise_dprc]/qmmm_model
When type is set to
linear_ener:(Supported Backend: TensorFlow)
- models:#
- type:
list|dictargument path:model[linear_ener]/modelsThe sub-models.
- weights:#
- type:
list|strargument path:model[linear_ener]/weightsIf the type is list of float, a list of weights for each model. If “mean”, the weights are set to be 1 / len(models). If “sum”, the weights are set to be 1.
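The three accepted forms of weights can be resolved into one coefficient per sub-model as sketched below. The function names are hypothetical, and combining the sub-model energies as a weighted sum is the reading suggested by the name linear_ener rather than something this reference states explicitly.

```python
def resolve_weights(weights, n_models):
    """Turn "mean", "sum", or an explicit list into one weight per sub-model."""
    if weights == "mean":
        return [1.0 / n_models] * n_models
    if weights == "sum":
        return [1.0] * n_models
    if isinstance(weights, str):
        raise ValueError(f"unsupported weights keyword: {weights!r}")
    values = [float(w) for w in weights]
    if len(values) != n_models:
        raise ValueError("need exactly one weight per sub-model")
    return values


def combine(energies, weights):
    """Weighted linear combination of sub-model energies (assumed semantics)."""
    return sum(w * e for w, e in zip(weights, energies))


print(resolve_weights("mean", 4))
print(combine([-1.0, -3.0], resolve_weights("mean", 2)))
print(combine([-1.0, -3.0], resolve_weights("sum", 2)))
```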
- learning_rate:#
- type:
dict, optionalargument path:learning_rateThe definition of learning rate
- start_lr:#
- type:
floatargument path:learning_rate/start_lrThe learning rate at the start of the training (after warmup).
- stop_lr:#
- type:
NoneType|float, optional, default:Noneargument path:learning_rate/stop_lrThe desired learning rate at the end of training. Mutually exclusive with stop_lr_ratio.
- stop_lr_ratio:#
- type:
NoneType|float, optional, default:Noneargument path:learning_rate/stop_lr_ratioThe ratio of stop_lr to start_lr. stop_lr = start_lr * stop_lr_ratio. Mutually exclusive with stop_lr.
- warmup_steps:#
- type:
int, optional, default:0argument path:learning_rate/warmup_stepsThe number of steps for learning rate warmup. During warmup, the learning rate increases linearly from warmup_start_factor * start_lr to start_lr. Mutually exclusive with warmup_ratio. Default is 0 (no warmup).
- warmup_ratio:#
- type:
NoneType|float, optional, default:Noneargument path:learning_rate/warmup_ratioThe ratio of warmup steps to total training steps. The actual number of warmup steps is int(warmup_ratio * num_steps). Mutually exclusive with warmup_steps.
- warmup_start_factor:#
- type:
float, optional, default:0.0argument path:learning_rate/warmup_start_factorThe factor of start_lr for the initial warmup learning rate. The warmup learning rate starts from warmup_start_factor * start_lr. Default is 0.0, meaning the learning rate starts from zero.
- scale_by_worker:#
- type:
str, optional, default:linearargument path:learning_rate/scale_by_workerHow to scale the learning rate when training in parallel or when the batch size is scaled. Valid values are linear (default), sqrt, or none.
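The relations among start_lr, stop_lr, stop_lr_ratio, and the warmup keys can be sketched as below. The helper names are hypothetical, and the exact step at which the ramp reaches start_lr (here, step == warmup_steps) is an assumption; only the linear ramp, int(warmup_ratio * num_steps), and the mutual exclusions are taken from the descriptions above.

```python
def resolve_stop_lr(start_lr, stop_lr=None, stop_lr_ratio=None):
    """Return the final learning rate; stop_lr and stop_lr_ratio are mutually exclusive."""
    if stop_lr is not None and stop_lr_ratio is not None:
        raise ValueError("set only one of stop_lr and stop_lr_ratio")
    if stop_lr_ratio is not None:
        return start_lr * stop_lr_ratio
    return stop_lr


def resolve_warmup_steps(num_steps, warmup_steps=0, warmup_ratio=None):
    """Return the warmup length; warmup_steps and warmup_ratio are mutually exclusive."""
    if warmup_ratio is not None:
        if warmup_steps:
            raise ValueError("set only one of warmup_steps and warmup_ratio")
        return int(warmup_ratio * num_steps)
    return warmup_steps


def warmup_lr(step, start_lr, warmup_steps, warmup_start_factor=0.0):
    """Linear ramp from warmup_start_factor * start_lr to start_lr."""
    if warmup_steps <= 0 or step >= warmup_steps:
        return start_lr
    frac = step / warmup_steps
    return start_lr * (warmup_start_factor + (1.0 - warmup_start_factor) * frac)


n_warm = resolve_warmup_steps(1000, warmup_ratio=0.25)
print(n_warm, warmup_lr(0, 1e-3, n_warm), warmup_lr(n_warm, 1e-3, n_warm))
```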
Depending on the value of type, different sub args are accepted.
- type:#
When type is set to
exp:
- decay_steps:#
- type:
int, optional, default:5000argument path:learning_rate[exp]/decay_stepsThe learning rate decays once every decay_steps training steps. If decay_steps exceeds the number of decay-phase steps (num_steps - warmup_steps) and decay_rate is not provided, decay_steps is automatically adjusted to a sensible default value.
- decay_rate:#
- type:
NoneType|float, optional, default:Noneargument path:learning_rate[exp]/decay_rateThe decay rate for the learning rate. If this is provided, it will be used directly as the decay rate for learning rate instead of calculating it through interpolation between start_lr and stop_lr.
- smooth:#
- type:
bool, optional, default:Falseargument path:learning_rate[exp]/smoothIf True, use smooth exponential decay (lr decays continuously). If False (default), use stepped decay (lr decays every decay_steps).
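The difference between stepped and smooth exponential decay can be illustrated as follows. The interpolation used for decay_rate (chosen so that the rate reaches stop_lr at the end of the decay phase) is the usual closed form and is an assumption about the implementation; the function names are hypothetical, and step is counted from the end of warmup.

```python
import math


def exp_decay_rate(start_lr, stop_lr, decay_phase_steps, decay_steps):
    """Rate such that start_lr * rate**(decay_phase_steps / decay_steps) == stop_lr."""
    return math.exp(math.log(stop_lr / start_lr) * decay_steps / decay_phase_steps)


def exp_lr(step, start_lr, decay_rate, decay_steps=5000, smooth=False):
    """Stepped decay changes every decay_steps; smooth decay changes every step."""
    exponent = step / decay_steps if smooth else step // decay_steps
    return start_lr * decay_rate**exponent


rate = exp_decay_rate(1.0, 0.01, decay_phase_steps=1000, decay_steps=100)
for step in (0, 200, 250, 1000):
    print(step, exp_lr(step, 1.0, rate, 100), exp_lr(step, 1.0, rate, 100, smooth=True))
```

With smooth set to False, steps 200 and 250 share the same learning rate; with smooth set to True, the rate at step 250 is already lower.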
When type is set to
cosine:
When type is set to
wsd:
- decay_phase_ratio:#
- type:
float, optional, default:0.1argument path:learning_rate[wsd]/decay_phase_ratioThe ratio of the decay phase to total training steps. The remaining post-warmup steps are used as the stable phase. Default is 0.1.
- decay_type:#
- type:
str, optional, default:inverse_linearargument path:learning_rate[wsd]/decay_typeThe decay rule used in the decay phase. Supported values are inverse_linear (default), cosine, and linear.
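The wsd schedule splits training into a warmup phase, a stable phase, and a decay phase. The sketch below only computes the phase lengths from the keys above; truncating the decay length with int() is an assumption, the decay rules themselves are not reproduced, and wsd_phase_lengths is a hypothetical name.

```python
def wsd_phase_lengths(num_steps, warmup_steps=0, decay_phase_ratio=0.1):
    """Return (warmup, stable, decay) step counts for a warmup-stable-decay schedule."""
    decay = int(decay_phase_ratio * num_steps)  # decay phase is a share of all steps
    stable = num_steps - warmup_steps - decay   # remaining post-warmup steps
    if stable < 0:
        raise ValueError("warmup and decay phases exceed the total number of steps")
    return warmup_steps, stable, decay


print(wsd_phase_lengths(1000, warmup_steps=50, decay_phase_ratio=0.25))
```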
- optimizer:#
- type:
dict, optionalargument path:optimizerThe definition of optimizer. Supported optimizer types depend on backend: TensorFlow/Paddle: Adam; PyTorch: Adam, AdamW, LKF, AdaMuon, HybridMuon.
Depending on the value of type, different sub args are accepted.
- type:#
- type:
str(flag key), default:Adamargument path:optimizer/typeThe type of optimizer to use.
AdamW: (Supported Backend: PyTorch)
LKF: (Supported Backend: PyTorch)
AdaMuon: (Supported Backend: PyTorch)
HybridMuon: (Supported Backend: PyTorch) HybridMuon optimizer (DeePMD-kit custom implementation). This is a Hybrid optimizer that automatically combines Muon and Adam. For matrix params: Muon update with Newton-Schulz based on selected muon_mode. For 1D params: Standard Adam. Name-based Adam routing is enabled: final effective parameter name segment containing ‘bias’ or starting with ‘adam_’ (case-insensitive) always uses Adam (no weight decay); segment starting with ‘adamw_’ (case-insensitive) uses AdamW-style decoupled decay. Trailing numeric ParameterList indices are ignored when deriving the effective segment. This is DIFFERENT from PyTorch’s torch.optim.Muon which ONLY supports 2D parameters.
When type is set to
Adam:- adam_beta1:#
- type:
float, optional, default:0.9argument path:optimizer[Adam]/adam_beta1Adam beta1 coefficient for first moment decay.
- adam_beta2:#
- type:
float, optional, default:0.999argument path:optimizer[Adam]/adam_beta2Adam beta2 coefficient for second moment decay.
- weight_decay:#
- type:
float, optional, default:0.0argument path:optimizer[Adam]/weight_decayWeight decay coefficient for Adam. In PyTorch and Paddle, this is an L2 penalty applied to gradients. TensorFlow does not support weight_decay and requires this value to be 0.
When type is set to
AdamW:(Supported Backend: PyTorch)
- adam_beta1:#
- type:
float, optional, default:0.9argument path:optimizer[AdamW]/adam_beta1(Supported Backend: PyTorch) AdamW beta1 coefficient for first moment decay.
- adam_beta2:#
- type:
float, optional, default:0.999argument path:optimizer[AdamW]/adam_beta2(Supported Backend: PyTorch) AdamW beta2 coefficient for second moment decay.
- weight_decay:#
- type:
float, optional, default:0.001argument path:optimizer[AdamW]/weight_decay(Supported Backend: PyTorch) Decoupled weight decay coefficient for AdamW optimizer (PyTorch only).
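The difference between the L2 penalty of Adam and the decoupled weight decay of AdamW can be seen in a scalar, single-step sketch. The function adam_step is a hypothetical illustration of the textbook update rules, not code taken from any backend.

```python
import math


def adam_step(w, g, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999,
              eps=1e-8, weight_decay=0.0, decoupled=False):
    """One scalar Adam/AdamW update (illustrative only)."""
    if weight_decay and not decoupled:
        # Adam with L2 penalty: the decay term is folded into the gradient,
        # so it is rescaled by the adaptive denominator below.
        g = g + weight_decay * w
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g * g
    m_hat = m / (1 - beta1 ** t)  # bias-corrected first moment
    v_hat = v / (1 - beta2 ** t)  # bias-corrected second moment
    if weight_decay and decoupled:
        # AdamW: the decay shrinks the weight directly, independent of g.
        w = w - lr * weight_decay * w
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v
```

With a zero gradient, w = 1 and weight_decay = 0.1, the L2 variant moves the weight by roughly lr, whereas the decoupled variant moves it by lr * weight_decay. This is why the two weight_decay settings are not interchangeable.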
When type is set to
LKF:(Supported Backend: PyTorch)
- kf_blocksize:#
- type:
int, optional, default:5120argument path:optimizer[LKF]/kf_blocksize(Supported Backend: PyTorch) The blocksize for the Kalman filter.
- kf_start_pref_e:#
- type:
float, optional, default:1.0argument path:optimizer[LKF]/kf_start_pref_e(Supported Backend: PyTorch) The prefactor of energy loss at the start of Kalman filter updates.
- kf_limit_pref_e:#
- type:
float, optional, default:1.0argument path:optimizer[LKF]/kf_limit_pref_e(Supported Backend: PyTorch) The prefactor of energy loss at the end of training for Kalman filter updates.
- kf_start_pref_f:#
- type:
float, optional, default:1.0argument path:optimizer[LKF]/kf_start_pref_f(Supported Backend: PyTorch) The prefactor of force loss at the start of Kalman filter updates.
- kf_limit_pref_f:#
- type:
float, optional, default:1.0argument path:optimizer[LKF]/kf_limit_pref_f(Supported Backend: PyTorch) The prefactor of force loss at the end of training for Kalman filter updates.
When type is set to
AdaMuon:(Supported Backend: PyTorch)
- momentum:#
- type:
float, optional, default:0.95, alias: muon_momentumargument path:optimizer[AdaMuon]/momentum(Supported Backend: PyTorch) Momentum coefficient for AdaMuon optimizer.
- adam_beta1:#
- type:
float, optional, default:0.9argument path:optimizer[AdaMuon]/adam_beta1(Supported Backend: PyTorch) Adam beta1 coefficient for AdaMuon optimizer.
- adam_beta2:#
- type:
float, optional, default:0.95argument path:optimizer[AdaMuon]/adam_beta2(Supported Backend: PyTorch) Adam beta2 coefficient for AdaMuon optimizer.
- weight_decay:#
- type:
float, optional, default:0.001argument path:optimizer[AdaMuon]/weight_decay(Supported Backend: PyTorch) Weight decay coefficient. Applied only to >=2D parameters (AdaMuon path).
- lr_adjust:#
- type:
float, optional, default:10.0argument path:optimizer[AdaMuon]/lr_adjust(Supported Backend: PyTorch) Learning rate adjustment factor for Adam (1D params). If lr_adjust <= 0: use match-RMS scaling (scale = lr_adjust_coeff * sqrt(max(m, n))), Adam uses lr directly. If lr_adjust > 0: use rectangular correction (scale = sqrt(max(1.0, m/n))), Adam uses lr/lr_adjust.
- lr_adjust_coeff:#
- type:
float, optional, default:0.2argument path:optimizer[AdaMuon]/lr_adjust_coeff(Supported Backend: PyTorch) Coefficient for match-RMS scaling. Only effective when lr_adjust <= 0.
When type is set to
HybridMuon:(Supported Backend: PyTorch) HybridMuon optimizer (DeePMD-kit custom implementation). This is a Hybrid optimizer that automatically combines Muon and Adam. For matrix params: Muon update with Newton-Schulz based on selected muon_mode. For 1D params: Standard Adam. Name-based Adam routing is enabled: final effective parameter name segment containing ‘bias’ or starting with ‘adam_’ (case-insensitive) always uses Adam (no weight decay); segment starting with ‘adamw_’ (case-insensitive) uses AdamW-style decoupled decay. Trailing numeric ParameterList indices are ignored when deriving the effective segment. This is DIFFERENT from PyTorch’s torch.optim.Muon which ONLY supports 2D parameters.
- momentum:#
- type:
float, optional, default:0.95, alias: muon_momentumargument path:optimizer[HybridMuon]/momentum(Supported Backend: PyTorch) Momentum coefficient for HybridMuon optimizer (>=2D params). Used in Nesterov momentum update: m_t = beta*m_{t-1} + (1-beta)*g_t.
- adam_beta1:#
- type:
float, optional, default:0.9argument path:optimizer[HybridMuon]/adam_beta1(Supported Backend: PyTorch) Adam beta1 coefficient for 1D parameters (biases, norms).
- adam_beta2:#
- type:
float, optional, default:0.95argument path:optimizer[HybridMuon]/adam_beta2(Supported Backend: PyTorch) Adam beta2 coefficient for 1D parameters (biases, norms).
- weight_decay:#
- type:
float, optional, default:0.001argument path:optimizer[HybridMuon]/weight_decay(Supported Backend: PyTorch) Weight decay coefficient. Applied to Muon-routed parameters and the AdamW-style decay path for matrix parameters.
- lr_adjust:#
- type:
float, optional, default:0.0argument path:optimizer[HybridMuon]/lr_adjust(Supported Backend: PyTorch) Learning rate adjustment mode for HybridMuon scaling and Adam learning rate. If lr_adjust <= 0: use match-RMS scaling (scale = coeff*sqrt(max(m,n))), Adam uses lr directly. If lr_adjust > 0: use rectangular correction (scale = sqrt(max(1, m/n))), Adam uses lr/lr_adjust. Default is 0.0 (match-RMS scaling).
- lr_adjust_coeff:#
- type:
float, optional, default:0.18argument path:optimizer[HybridMuon]/lr_adjust_coeff(Supported Backend: PyTorch) Coefficient for match-RMS scaling. Only effective when lr_adjust <= 0. Default 0.18 follows DeepSeek-V4’s calibration so Muon update RMS matches AdamW’s typical RMS; Moonlight’s original recipe uses 0.2.
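The two scaling modes selected by lr_adjust can be written down directly from the formulas in the descriptions above. In this sketch, m and n are the two dimensions of the matrix being updated; the helper names muon_scale and adam_lr are hypothetical.

```python
import math


def muon_scale(m, n, lr_adjust=0.0, lr_adjust_coeff=0.18):
    """Scale applied to the Muon update of an (m, n) matrix."""
    if lr_adjust <= 0:
        # match-RMS scaling
        return lr_adjust_coeff * math.sqrt(max(m, n))
    # rectangular correction
    return math.sqrt(max(1.0, m / n))


def adam_lr(lr, lr_adjust=0.0):
    """Learning rate seen by the Adam-routed parameters."""
    return lr if lr_adjust <= 0 else lr / lr_adjust
```

For a 256 x 64 matrix, the default match-RMS mode gives a scale of 0.18 * 16 = 2.88, while lr_adjust = 10.0 gives sqrt(256 / 64) = 2.0 and divides the Adam learning rate by 10.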
- muon_mode:#
- type:
str, optional, default:sliceargument path:optimizer[HybridMuon]/muon_mode(Supported Backend: PyTorch) Muon routing mode. ‘2d’: only effective-rank-2 params are eligible for Muon; effective rank >2 goes to AdamW-style decoupled decay path. ‘flat’: effective-rank >=2 params are flattened to matrix-view (prod(shape[:-1]), shape[-1]) for Muon. ‘slice’ (default): effective-rank >=3 params use per-slice Muon on the last two dimensions; no cross-slice mixing. Routing uses effective shape after removing singleton dimensions.
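The shape-based part of the routing can be approximated as follows. This sketch only follows the description of muon_mode above; the function name muon_route and its return values are hypothetical, and the name-based Adam/AdamW routing is deliberately left out.

```python
from math import prod


def muon_route(shape, muon_mode="slice"):
    """Return (route, matrix_view) for a parameter shape (illustrative only)."""
    # Routing uses the effective shape after removing singleton dimensions.
    eff = tuple(d for d in shape if d != 1)
    if len(eff) < 2:
        return "adam", None  # 1D (or scalar) parameters use standard Adam
    if len(eff) == 2:
        return "muon", eff   # plain matrices are eligible in every mode
    if muon_mode == "2d":
        return "adamw", None  # rank > 2 goes to the decoupled-decay path
    if muon_mode == "flat":
        # Flatten all leading dimensions into the row dimension.
        return "muon", (prod(eff[:-1]), eff[-1])
    if muon_mode == "slice":
        # One independent Muon update per slice of the last two dimensions.
        return "muon_per_slice", (prod(eff[:-2]), eff[-2], eff[-1])
    raise ValueError(f"unknown muon_mode: {muon_mode!r}")
```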
- enable_gram:#
- type:
bool, optional, default:Trueargument path:optimizer[HybridMuon]/enable_gram(Supported Backend: PyTorch) Enable the compiled Gram Newton-Schulz path for rectangular Muon matrices. Square matrices keep using the current standard Newton-Schulz path.
- flash_muon:#
- type:
bool, optional, default:Trueargument path:optimizer[HybridMuon]/flash_muon(Supported Backend: PyTorch) Enable triton-accelerated Newton-Schulz orthogonalization. Requires triton and CUDA. Falls back to PyTorch implementation when triton is unavailable or running on CPU. Ignored when enable_gram is true.
- magma_muon:#
- type:
bool, optional, default:Trueargument path:optimizer[HybridMuon]/magma_muon(Supported Backend: PyTorch) Enable Magma-lite damping on the Muon route only. When enabled, HybridMuon computes momentum-gradient alignment per Muon block, applies EMA smoothing, and rescales Muon updates to improve stability. Adam/AdamW routes are unchanged.
- loss:#
- type:
dict, optionalargument path:lossThe definition of loss function. The loss type should be set to tensor, ener, dens or left unset.
Depending on the value of type, different sub args are accepted.
- type:#
- type:
str(flag key), default:enerargument path:loss/typeThe type of the loss. When the fitting type is ener, the loss type should be set to ener, dens (Only DPA4/SeZM supported), or left unset. When the fitting type is dipole or polar, the loss type should be set to tensor.
When type is set to
ener:- start_pref_e:#
- type:
int|float, optional, default:0.02argument path:loss[ener]/start_pref_eThe prefactor of energy loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the energy label should be provided by file energy.npy in each data system. If both start_pref_e and limit_pref_e are set to 0, then the energy will be ignored.
- limit_pref_e:#
- type:
int|float, optional, default:1.0argument path:loss[ener]/limit_pref_eThe prefactor of energy loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_f:#
- type:
int|float, optional, default:1000argument path:loss[ener]/start_pref_fThe prefactor of force loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the force label should be provided by file force.npy in each data system. If both start_pref_f and limit_pref_f are set to 0, then the force will be ignored.
- limit_pref_f:#
- type:
int|float, optional, default:1.0argument path:loss[ener]/limit_pref_fThe prefactor of force loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
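How a prefactor moves from its start value to its limit value can be sketched as below. The assumption here, not stated in this section, is that the prefactor is interpolated linearly in the ratio of the current learning rate to the starting learning rate; the helper name loss_prefactor is hypothetical.

```python
def loss_prefactor(start_pref, limit_pref, lr, start_lr):
    """Prefactor at the current learning rate (assumed linear interpolation)."""
    # lr == start_lr gives start_pref; lr -> 0 gives limit_pref.
    return limit_pref + (start_pref - limit_pref) * lr / start_lr
```

Under this assumption, the defaults start_pref_f = 1000 and limit_pref_f = 1 weight the force term heavily early in training and relax it as the learning rate decays.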
- start_pref_v:#
- type:
int|float, optional, default:0.0argument path:loss[ener]/start_pref_vThe prefactor of virial loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the virial label should be provided by file virial.npy in each data system. If both start_pref_v and limit_pref_v are set to 0, then the virial will be ignored.
- limit_pref_v:#
- type:
int|float, optional, default:0.0argument path:loss[ener]/limit_pref_vThe prefactor of virial loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_h:#
- type:
int|float, optional, default:0.0argument path:loss[ener]/start_pref_hThe prefactor of hessian loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the hessian label should be provided by file hessian.npy in each data system. If both start_pref_h and limit_pref_h are set to 0, then the hessian will be ignored.
- limit_pref_h:#
- type:
int|float, optional, default:0.0argument path:loss[ener]/limit_pref_hThe prefactor of hessian loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_ae:#
- type:
int|float, optional, default:0.0argument path:loss[ener]/start_pref_aeThe prefactor of atomic energy loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the atom_ener label should be provided by file atom_ener.npy in each data system. If both start_pref_ae and limit_pref_ae are set to 0, then the atomic energy will be ignored.
- limit_pref_ae:#
- type:
int|float, optional, default:0.0argument path:loss[ener]/limit_pref_aeThe prefactor of atomic energy loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_pf:#
- type:
int|float, optional, default:0.0argument path:loss[ener]/start_pref_pfThe prefactor of atomic prefactor force loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the atom_pref label should be provided by file atom_pref.npy in each data system. If both start_pref_pf and limit_pref_pf are set to 0, then the atomic prefactor force will be ignored.
- limit_pref_pf:#
- type:
int|float, optional, default:0.0argument path:loss[ener]/limit_pref_pfThe prefactor of atomic prefactor force loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- use_default_pf:#
- type:
bool, optional, default:Falseargument path:loss[ener]/use_default_pfIf true, use default atom_pref of 1.0 for all atoms when atom_pref data is not provided. This allows using the prefactor force loss (pf) without requiring atom_pref.npy files in training data. When atom_pref.npy is provided, it will be used as-is regardless of this setting. Note: this option is only effective for the PyTorch/DPModel backends; the TensorFlow and Paddle backends raise NotImplementedError when set to true.
- relative_f:#
- type:
NoneType|float, optionalargument path:loss[ener]/relative_fIf provided, relative force error will be used in the loss. The difference of force will be normalized by the magnitude of the force in the label with a shift given by relative_f, i.e. DF_i / ( || F || + relative_f ) with DF denoting the difference between prediction and label and || F || denoting the L2 norm of the label.
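The normalization DF_i / ( || F || + relative_f ) can be illustrated per atom as follows. The helper name relative_force_diff is hypothetical, and taking || F || as the norm of each atom's label force vector is an assumption of this sketch.

```python
import math


def relative_force_diff(f_pred, f_label, relative_f):
    """Force differences normalized by the label force magnitude per atom."""
    out = []
    for pred, label in zip(f_pred, f_label):
        # L2 norm of the label force on this atom (assumed per-atom norm).
        norm = math.sqrt(sum(c * c for c in label))
        # The shift relative_f keeps the ratio finite for near-zero forces.
        out.append([(p - l) / (norm + relative_f) for p, l in zip(pred, label)])
    return out
```

A larger relative_f makes the loss behave more like the absolute force error for small forces, while large forces are still compared on a relative scale.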
- enable_atom_ener_coeff:#
- type:
bool, optional, default:Falseargument path:loss[ener]/enable_atom_ener_coeffIf true, the energy will be computed as sum_i c_i E_i. c_i should be provided by file atom_ener_coeff.npy in each data system, otherwise it’s 1.
- start_pref_gf:#
- type:
float, optional, default:0.0argument path:loss[ener]/start_pref_gfThe prefactor of generalized force loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the drdq label should be provided by file drdq.npy in each data system. If both start_pref_gf and limit_pref_gf are set to 0, then the generalized force will be ignored.
- limit_pref_gf:#
- type:
float, optional, default:0.0argument path:loss[ener]/limit_pref_gfThe prefactor of generalized force loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- numb_generalized_coord:#
- type:
int, optional, default:0argument path:loss[ener]/numb_generalized_coordThe dimension of generalized coordinates. Required when generalized force loss is used.
- use_huber:#
- type:
bool, optional, default:Falseargument path:loss[ener]/use_huberEnables Huber loss calculation for energy/force/virial terms with user-defined threshold delta (D). The loss function smoothly transitions between L2 and L1 loss:
For absolute prediction errors within D: quadratic loss 0.5 * (error**2)
For absolute errors exceeding D: linear loss D * (|error| - 0.5 * D)
Formula: loss = 0.5 * (error**2) if |error| <= D else D * (|error| - 0.5 * D).
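The Huber formula above translates directly into Python. The function name huber is illustrative; delta plays the role of D and defaults to the documented huber_delta default of 0.01.

```python
def huber(error, delta=0.01):
    """Huber loss of a single residual with threshold delta (D)."""
    abs_err = abs(error)
    if abs_err <= delta:
        return 0.5 * error ** 2            # quadratic (L2-like) branch
    return delta * (abs_err - 0.5 * delta)  # linear (L1-like) branch
```

Both branches give the same value at |error| = D, so the loss is continuous across the threshold.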
- loss_func:#
- type:
str, optional, default:mseargument path:loss[ener]/loss_funcLoss function type for energy, force, and virial terms. Options: ‘mse’ (Mean Squared Error, L2 loss, default) or ‘mae’ (Mean Absolute Error, L1 loss). MAE loss is less sensitive to outliers compared to MSE loss. Future extensions may support additional loss types.
- f_use_norm:#
- type:
bool, optional, default:Falseargument path:loss[ener]/f_use_normIf true, use L2 norm of force vectors for loss calculation when loss_func=’mae’ or use_huber is True. Instead of computing loss on individual force components, computes loss on ||F_pred - F_label||_2 for each atom. This treats the force vector as a whole rather than three independent components. Only effective when loss_func=’mae’ or use_huber=True.
- huber_delta:#
- type:
list[float]|float, optional, default:0.01argument path:loss[ener]/huber_deltaThe threshold delta (D) used for Huber loss, controlling transition between L2 and L1 loss. It can be either one float shared by all terms or a list of three values ordered as [energy, force, virial].
- intensive_ener_virial:#
- type:
bool, optional, default:Falseargument path:loss[ener]/intensive_ener_virialControls intensive normalization for energy and virial loss terms in the current implementation. For non-Huber MSE energy/virial terms, setting this to true uses 1/N^2 normalization instead of the legacy 1/N scaling. This matches per-atom-style reporting more closely for those terms. For MAE, the normalization remains 1/N. When use_huber=True, the residual is already scaled by 1/N before applying the Huber loss, so this flag may have limited or no effect for those terms. The default is false for backward compatibility with models trained using deepmd-kit <= 3.1.3.
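The effect of intensive_ener_virial on a non-Huber MSE energy term can be pictured with a one-line computation. This is a simplified sketch for a single frame with N atoms; the helper name energy_mse_term is hypothetical and the actual loss code may organise the normalization differently.

```python
def energy_mse_term(e_pred, e_label, natoms, intensive_ener_virial=False):
    """Squared energy error with legacy (1/N) or intensive (1/N^2) scaling."""
    sq_err = (e_pred - e_label) ** 2
    if intensive_ener_virial:
        return sq_err / natoms ** 2  # equals the squared per-atom error
    return sq_err / natoms           # legacy scaling
```

With the intensive setting, the term equals the square of the per-atom energy error, so its size no longer grows with the number of atoms for a fixed per-atom error.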
When type is set to
dens:- start_pref_e:#
- type:
int|float, optional, default:0.02argument path:loss[dens]/start_pref_eThe prefactor of energy loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the energy label should be provided by file energy.npy in each data system. If both start_pref_e and limit_pref_e are set to 0, then the energy will be ignored.
- limit_pref_e:#
- type:
int|float, optional, default:1.0argument path:loss[dens]/limit_pref_eThe prefactor of energy loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_f:#
- type:
int|float, optional, default:1000argument path:loss[dens]/start_pref_fThe prefactor of force loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the force label should be provided by file force.npy in each data system. If both start_pref_f and limit_pref_f are set to 0, then the force will be ignored.
- limit_pref_f:#
- type:
int|float, optional, default:1.0argument path:loss[dens]/limit_pref_fThe prefactor of force loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- loss_func:#
- type:
str, optional, default:maeargument path:loss[dens]/loss_funcLoss function type for energy and mixed direct-force / denoising supervision. Options: ‘mse’ (Mean Squared Error, component-wise force loss) or ‘mae’ (Mean Absolute Error, default). In dens mode, f_use_norm is not exposed: mae always uses per-atom force-vector L2 norms, while mse always uses component-wise squared errors.
- dens_prob:#
- type:
int|float, optional, default:0.5argument path:loss[dens]/dens_probProbability of switching one batch to the denoising-enhanced training path. When not selected, the dens head is still trained on clean direct forces.
- dens_fixed_noise_std:#
- type:
bool, optional, default:Trueargument path:loss[dens]/dens_fixed_noise_stdWhether to use a fixed Gaussian noise standard deviation. Only the fixed-noise path is supported in the initial SeZM dens integration.
- dens_std:#
- type:
int|float, optional, default:0.025argument path:loss[dens]/dens_stdStandard deviation of the Gaussian coordinate corruption used in the denoising path.
- dens_corrupt_ratio:#
- type:
int|NoneType|float, optional, default:0.5argument path:loss[dens]/dens_corrupt_ratioFraction of atoms corrupted within a denoising batch. If set to None, all atoms in the batch are corrupted.
- dens_denoising_pos_coefficient:#
- type:
int|float, optional, default:10.0argument path:loss[dens]/dens_denoising_pos_coefficientLoss multiplier applied to corrupted atoms whose target is the injected noise vector.
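The dens_* options describe a coordinate-corruption step that might look roughly like the sketch below. Everything here is an assumption made for illustration (the function name corrupt_coords, the rounding of the corrupted-atom count, the uniform choice of atoms); it is not the SeZM implementation.

```python
import random


def corrupt_coords(coords, dens_prob=0.5, dens_std=0.025,
                   dens_corrupt_ratio=0.5, rng=None):
    """Return (coords_out, noise, mask) for one batch of atomic coordinates."""
    rng = rng or random.Random()
    natoms = len(coords)
    noise = [[0.0, 0.0, 0.0] for _ in range(natoms)]
    mask = [False] * natoms
    # With probability 1 - dens_prob the batch stays clean.
    if rng.random() >= dens_prob:
        return [list(c) for c in coords], noise, mask
    # A ratio of None means that every atom is corrupted.
    ratio = 1.0 if dens_corrupt_ratio is None else dens_corrupt_ratio
    ncorrupt = round(ratio * natoms)  # rounding rule is an assumption
    for i in rng.sample(range(natoms), ncorrupt):
        mask[i] = True
        # Fixed-std Gaussian displacement; this vector is the denoising target.
        noise[i] = [rng.gauss(0.0, dens_std) for _ in range(3)]
    noisy = [[c + d for c, d in zip(xyz, dn)] for xyz, dn in zip(coords, noise)]
    return noisy, noise, mask
```

The mask marks the atoms to which a multiplier such as dens_denoising_pos_coefficient would apply; the remaining atoms keep their clean coordinates.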
When type is set to
ener_spin:- start_pref_e:#
- type:
int|float, optional, default:0.02argument path:loss[ener_spin]/start_pref_eThe prefactor of energy loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the energy label should be provided by file energy.npy in each data system. If both start_pref_e and limit_pref_e are set to 0, then the energy will be ignored.
- limit_pref_e:#
- type:
int|float, optional, default:1.0argument path:loss[ener_spin]/limit_pref_eThe prefactor of energy loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_fr:#
- type:
int|float, optional, default:1000argument path:loss[ener_spin]/start_pref_frThe prefactor of force_real_atom loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the force_real_atom label should be provided by file force_real_atom.npy in each data system. If both start_pref_fr and limit_pref_fr are set to 0, then the force_real_atom will be ignored.
- limit_pref_fr:#
- type:
int|float, optional, default:1.0argument path:loss[ener_spin]/limit_pref_frThe prefactor of force_real_atom loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_fm:#
- type:
int|float, optional, default:10000argument path:loss[ener_spin]/start_pref_fmThe prefactor of force_magnetic loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the force_magnetic label should be provided by file force_magnetic.npy in each data system. If both start_pref_fm and limit_pref_fm are set to 0, then the force_magnetic will be ignored.
- limit_pref_fm:#
- type:
int|float, optional, default:10.0argument path:loss[ener_spin]/limit_pref_fmThe prefactor of force_magnetic loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_v:#
- type:
int|float, optional, default:0.0argument path:loss[ener_spin]/start_pref_vThe prefactor of virial loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the virial label should be provided by file virial.npy in each data system. If both start_pref_v and limit_pref_v are set to 0, then the virial will be ignored.
- limit_pref_v:#
- type:
int|float, optional, default:0.0argument path:loss[ener_spin]/limit_pref_vThe prefactor of virial loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_ae:#
- type:
int|float, optional, default:0.0argument path:loss[ener_spin]/start_pref_aeThe prefactor of atom_ener loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the atom_ener label should be provided by file atom_ener.npy in each data system. If both start_pref_ae and limit_pref_ae are set to 0, then the atom_ener will be ignored.
- limit_pref_ae:#
- type:
int|float, optional, default:0.0argument path:loss[ener_spin]/limit_pref_aeThe prefactor of atom_ener loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_pf:#
- type:
int|float, optional, default:0.0argument path:loss[ener_spin]/start_pref_pfThe prefactor of atom_pref loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the atom_pref label should be provided by file atom_pref.npy in each data system. If both start_pref_pf and limit_pref_pf are set to 0, then the atom_pref will be ignored.
- limit_pref_pf:#
- type:
int|float, optional, default:0.0argument path:loss[ener_spin]/limit_pref_pfThe prefactor of atom_pref loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- relative_f:#
- type:
NoneType|float, optionalargument path:loss[ener_spin]/relative_fIf provided, relative force error will be used in the loss. The difference of force will be normalized by the magnitude of the force in the label with a shift given by relative_f, i.e. DF_i / ( || F || + relative_f ) with DF denoting the difference between prediction and label and || F || denoting the L2 norm of the label.
- enable_atom_ener_coeff:#
- type:
bool, optional, default:Falseargument path:loss[ener_spin]/enable_atom_ener_coeffIf true, the energy will be computed as sum_i c_i E_i. c_i should be provided by file atom_ener_coeff.npy in each data system, otherwise it’s 1.
- loss_func:#
- type:
str, optional, default:mseargument path:loss[ener_spin]/loss_funcLoss function type for energy, force, and virial terms. Options: ‘mse’ (Mean Squared Error, L2 loss, default) or ‘mae’ (Mean Absolute Error, L1 loss). MAE loss is less sensitive to outliers compared to MSE loss. Future extensions may support additional loss types.
- intensive_ener_virial:#
- type:
bool, optional, default:Falseargument path:loss[ener_spin]/intensive_ener_virialControls normalization of the energy and virial loss terms. For loss_func=’mse’, if true, energy and virial losses are computed as intensive quantities, normalized by the square of the number of atoms (1/N^2); if false (default), the legacy normalization (1/N) is used. For loss_func=’mae’, this option does not change the existing MAE formulations. The default is false for backward compatibility with models trained using deepmd-kit <= 3.1.3.
When type is set to
dos:- start_pref_dos:#
- type:
int|float, optional, default:0.0argument path:loss[dos]/start_pref_dosThe prefactor of the density of states (DOS) loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the DOS label should be provided as a .npy file in each data system. If both start_pref_dos and limit_pref_dos are set to 0, then the DOS loss will be ignored.
- limit_pref_dos:#
- type:
int|float, optional, default:0.0argument path:loss[dos]/limit_pref_dosThe prefactor of the density of states (DOS) loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_cdf:#
- type:
int|float, optional, default:0.0argument path:loss[dos]/start_pref_cdfThe prefactor of the cumulative distribution function (CDF, the cumulative integral of the DOS) loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the corresponding label should be provided as a .npy file in each data system. If both start_pref_cdf and limit_pref_cdf are set to 0, then the CDF loss will be ignored.
- limit_pref_cdf:#
- type:
int|float, optional, default:0.0argument path:loss[dos]/limit_pref_cdfThe prefactor of the cumulative distribution function (CDF, the cumulative integral of the DOS) loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_ados:#
- type:
int|float, optional, default:1.0argument path:loss[dos]/start_pref_adosThe prefactor of the atomic DOS (site-projected DOS) loss at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the atomic DOS label should be provided as a .npy file in each data system. If both start_pref_ados and limit_pref_ados are set to 0, then the atomic DOS loss will be ignored.
- limit_pref_ados:#
- type:
int|float, optional, default:1.0argument path:loss[dos]/limit_pref_adosThe prefactor of the atomic DOS (site-projected DOS) loss at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
- start_pref_acdf:#
- type:
int|float, optional, default:0.0argument path:loss[dos]/start_pref_acdfThe prefactor of the loss on the cumulative integral of the atomic DOS at the start of the training. Should be larger than or equal to 0. If set to a non-zero value, the corresponding label should be provided as a .npy file in each data system. If both start_pref_acdf and limit_pref_acdf are set to 0, then this loss will be ignored.
- limit_pref_acdf:#
- type:
int|float, optional, default:0.0argument path:loss[dos]/limit_pref_acdfThe prefactor of the loss on the cumulative integral of the atomic DOS at the limit of the training, i.e. as the training step goes to infinity. Should be larger than or equal to 0.
When type is set to
property:- loss_func:#
- type:
str, optional, default:smooth_maeargument path:loss[property]/loss_funcThe loss function to minimize, such as ‘mae’ or ‘smooth_mae’.
- metric:#
- type:
list, optional, default:['mae']argument path:loss[property]/metricThe metric for display. This list can include ‘smooth_mae’, ‘mae’, ‘mse’ and ‘rmse’.
- beta:#
- type:
int|float, optional, default:1.0argument path:loss[property]/betaThe ‘beta’ parameter in ‘smooth_mae’ loss.
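The role of beta can be illustrated with the usual smooth-L1 definition. That ‘smooth_mae’ follows exactly this form is an assumption of the sketch, and the function name smooth_mae is used here only for illustration.

```python
def smooth_mae(error, beta=1.0):
    """Smooth-L1 loss of a single residual (assumed form of 'smooth_mae')."""
    abs_err = abs(error)
    if abs_err < beta:
        return 0.5 * abs_err ** 2 / beta  # quadratic near zero
    return abs_err - 0.5 * beta           # linear for large residuals
```

A smaller beta makes the loss behave like plain MAE over a wider range of residuals.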
When type is set to
tensor:- pref:#
- type:
int|floatargument path:loss[tensor]/prefThe prefactor of the weight of global loss. It should be larger than or equal to 0. It controls the weight of loss corresponding to global label, i.e. polarizability.npy or dipole.npy, whose shape should be #frames x [9 or 3]. If it’s larger than 0.0, this npy should be included.
- pref_atomic:#
- type:
int|floatargument path:loss[tensor]/pref_atomicThe prefactor of the weight of atomic loss. It should be larger than or equal to 0. It controls the weight of loss corresponding to atomic label, i.e. atomic_polarizability.npy or atomic_dipole.npy, whose shape should be #frames x ([9 or 3] x #atoms). If it’s larger than 0.0, this npy should be included. Both pref and pref_atomic should be provided, and either can be set to 0.0.
- enable_atomic_weight:#
- type:
bool, optional, default:Falseargument path:loss[tensor]/enable_atomic_weightIf true, the atomic loss will be reweighted.
- training:#
- type:
dictargument path:trainingThe training options.
- training_data:#
- type:
dict, optionalargument path:training/training_dataConfigurations of training data.
- systems:#
- type:
list[str]|strargument path:training/training_data/systemsThe data systems for training. This key can be a list or a str. When provided as a string, it can be a system directory path (containing ‘type.raw’) or a parent directory path to recursively search for all system subdirectories. When provided as a list, each string item in the list is processed the same way as individual string inputs, i.e., each path can be a system directory or a parent directory to recursively search for all system subdirectories.
- rglob_patterns:#
- type:
list[str]|NoneType, optional, default:Noneargument path:training/training_data/rglob_patternsThe customized patterns used in rglob to collect all training systems. (Supported Backend: PyTorch)
- batch_size:#
- type:
str|int|list[int], optional, default:autoargument path:training/training_data/batch_sizeThis key can be
list: the length of which is the same as the systems. The batch size of each system is given by the elements of the list.
int: all systems use the same batch size.
string “auto”: automatically determines the batch size so that the batch_size times the number of atoms in the system is no less than 32.
string “auto:N”: automatically determines the batch size so that the batch_size times the number of atoms in the system is no less than N.
string “mixed:N”: the batch data will be sampled from all systems and merged into a mixed system with the batch size N. For the TensorFlow backend, only the se_atten descriptor is supported.
string “max:N”: automatically determines the batch size so that batch_size * natoms is at most N. natoms is the per-system atom count for npy data and the per-frame nloc for LMDB data. When a single system/frame already has more than N atoms, the batch size clamps to 1 and that batch will exceed N.
string “filter:N”: the same as “max:N” but additionally drops data whose atom count exceeds N. For npy data this removes whole systems with natoms > N; for LMDB data this removes individual frames with nloc > N.
If MPI is used, the value should be considered as the batch size per task.
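The “auto:N” and “max:N” rules above can be sketched in a few lines of Python. This is a minimal illustration, not DeePMD-kit's implementation: the function names auto_batch_size and max_batch_size are hypothetical, and the sketch assumes that “auto:N” picks the smallest batch size satisfying the stated bound.

```python
def auto_batch_size(natoms: int, n: int = 32) -> int:
    """Smallest batch size with batch_size * natoms >= n (the "auto:N" rule).

    Assumption: the smallest such integer is chosen.
    """
    return max(1, (n + natoms - 1) // natoms)  # integer ceiling of n / natoms


def max_batch_size(natoms: int, n: int) -> int:
    """Largest batch size with batch_size * natoms <= n (the "max:N" rule).

    When a single frame already has more than n atoms, the batch size
    clamps to 1, so that batch exceeds n.
    """
    return max(1, n // natoms)
```

For a system with 5 atoms per frame, the sketch returns a batch size of 7 for “auto” (7 * 5 = 35 atoms, no less than 32) and 6 for “max:32” (6 * 5 = 30 atoms, at most 32).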
- auto_prob:#
- type:
str, optional, default:prob_sys_size, alias: auto_prob_styleargument path:training/training_data/auto_probDetermine the probability of systems automatically. The method is assigned by this key and can be
“prob_uniform” : the probabilities of all the systems are equal, namely 1.0/self.get_nsystems()
“prob_sys_size” : the probability of a system is proportional to the number of batches in the system
“prob_sys_size;stt_idx:end_idx:weight;stt_idx:end_idx:weight;…” : the list of systems is divided into blocks. A block is specified by stt_idx:end_idx:weight, where stt_idx is the starting index of the system, end_idx is the ending (exclusive) index of the system, the probabilities of the systems in this block sum up to weight, and the relative probabilities within this block are proportional to the number of batches in each system.
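The block-wise rule can be illustrated with a short sketch. The function name sys_size_probs is hypothetical and the code is not taken from DeePMD-kit; it assumes the block weights already sum to 1 and that the blocks together cover every system.

```python
def sys_size_probs(nbatches, blocks=None):
    """Per-system sampling probabilities for the "prob_sys_size" styles.

    nbatches: number of batches in each system.
    blocks:   list of (stt_idx, end_idx, weight) with end_idx exclusive.
              None means a single block over all systems with weight 1.
    Assumption: the weights sum to 1 and the blocks cover all systems.
    """
    if blocks is None:
        blocks = [(0, len(nbatches), 1.0)]
    probs = [0.0] * len(nbatches)
    for stt_idx, end_idx, weight in blocks:
        block_total = sum(nbatches[stt_idx:end_idx])
        for i in range(stt_idx, end_idx):
            # Within a block, probability is proportional to the batch count;
            # the block as a whole sums to its weight.
            probs[i] = weight * nbatches[i] / block_total
    return probs
```

With four systems holding 10, 30, 20, and 20 batches, the setting “prob_sys_size;0:2:0.4;2:4:0.6” corresponds to sys_size_probs([10, 30, 20, 20], [(0, 2, 0.4), (2, 4, 0.6)]), which gives probabilities of approximately 0.1, 0.3, 0.3, and 0.3.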
- sys_probs:#
- type:
list[float]|NoneType, optional, default:None, alias: sys_weightsargument path:training/training_data/sys_probsA list of float if specified. Should be of the same length as systems, specifying the probability of each system.
- min_pair_dist:#
- type:
float, optional, default:0.0argument path:training/training_data/min_pair_dist(Supported Backend: PyTorch) Minimum pairwise atomic distance threshold in Å. Frames containing any atom pair closer than this distance are excluded from loss computation, as DFT labels for near-collision configurations are often unreliable. Set to 0 to disable (default). Under distributed training (DDP/FSDP), if ALL frames in a batch are filtered out on a given rank, one frame is retained to ensure every rank participates in collective communication (backward all-reduce). Note: enabling this adds an O(N²) distance check per frame in the DataLoader workers (CPU-side), which may slow down training for large systems. To avoid the overhead, consider pre-cleaning the dataset instead.
- validation_data:#
- type:
NoneType|dict, optional, default:Noneargument path:training/validation_dataConfigurations of validation data. Similar to that of training data, except that a numb_btch argument may additionally be configured.
- systems:#
- type:
list[str]|strargument path:training/validation_data/systemsThe data systems for validation. This key can be a list or a str. When provided as a string, it can be a system directory path (containing ‘type.raw’) or a parent directory path to recursively search for all system subdirectories. When provided as a list, each string item in the list is processed the same way as individual string inputs, i.e., each path can be a system directory or a parent directory to recursively search for all system subdirectories.
- rglob_patterns:#
- type:
list[str]|NoneType, optional, default:Noneargument path:training/validation_data/rglob_patternsThe customized patterns used in rglob to collect all validation systems. (Supported Backend: PyTorch)
- batch_size:#
- type:
str|int|list[int], optional, default:autoargument path:training/validation_data/batch_sizeThis key can be
list: the length of which is the same as the systems. The batch size of each system is given by the elements of the list.
int: all systems use the same batch size.
string “auto”: automatically determines the batch size so that the batch_size times the number of atoms in the system is no less than 32.
string “auto:N”: automatically determines the batch size so that the batch_size times the number of atoms in the system is no less than N.
string “max:N”: automatically determines the batch size so that batch_size * natoms is at most N. natoms is the per-system atom count for npy data and the per-frame nloc for LMDB data. When a single system/frame already has more than N atoms, the batch size clamps to 1 and that batch will exceed N.
string “filter:N”: the same as “max:N” but additionally drops data whose atom count exceeds N. For npy data this removes whole systems with natoms > N; for LMDB data this removes individual frames with nloc > N.
- auto_prob:#
- type:
str, optional, default:prob_sys_size, alias: auto_prob_styleargument path:training/validation_data/auto_probDetermine the probability of systems automatically. The method is assigned by this key and can be
“prob_uniform” : the probabilities of all the systems are equal, namely 1.0/self.get_nsystems()
“prob_sys_size” : the probability of a system is proportional to the number of batches in the system
“prob_sys_size;stt_idx:end_idx:weight;stt_idx:end_idx:weight;…” : the list of systems is divided into blocks. A block is specified by stt_idx:end_idx:weight, where stt_idx is the starting index of the system, end_idx is the ending (exclusive) index of the system, the probabilities of the systems in this block sum up to weight, and the relative probabilities within this block are proportional to the number of batches in each system.
- sys_probs:#
- type:
list[float]|NoneType, optional, default:None, alias: sys_weightsargument path:training/validation_data/sys_probsA list of float if specified. Should be of the same length as systems, specifying the probability of each system.
- numb_btch:#
- type:
int, optional, default:1, alias: numb_batchargument path:training/validation_data/numb_btchAn integer that specifies the number of batches to be sampled for each validation period.
- stat_file:#
- type:
str, optionalargument path:training/stat_file(Supported Backend: PyTorch) The file path for saving the data statistics results. If set, the results will be saved and directly loaded during the next training session, avoiding the need to recalculate the statistics. If the file extension is .h5 or .hdf5, an HDF5 file is used to store the statistics; otherwise, a directory containing NumPy binary files is used.
- mixed_precision:#
- type:
dict, optionalargument path:training/mixed_precisionConfigurations of mixed precision.
- output_prec:#
- type:
str, optional, default:float32argument path:training/mixed_precision/output_precThe precision for mixed precision params, i.e. the precision of the trainable variables during the mixed precision training process. Currently only float32 is supported.
- compute_prec:#
- type:
strargument path:training/mixed_precision/compute_precThe precision for mixed precision compute, i.e. the compute precision during the mixed precision training process. Currently float16 and bfloat16 are supported.
- numb_steps:#
- type:
int, optional, aliases: stop_batch, num_step, num_steps, numb_stepargument path:training/numb_stepsNumber of training steps (num_step). Each training step uses one batch of data. Mutually exclusive with num_epoch in single-task mode. In multi-task mode, this is mutually exclusive with num_epoch_dict. Accepted names: num_step, num_steps, numb_step, numb_steps, stop_batch.
- numb_epoch:#
- type:
int|float, optional, aliases: num_epochs, num_epoch, numb_epochsargument path:training/numb_epochNumber of training epochs (num_epoch; can be fractional) for single-task mode only. Because each step samples the dataset stochastically, this corresponds to an expected epoch count rather than a deterministic full pass. When num_step is not set, the total steps are computed as ceil(num_epoch * total_numb_batch). total_numb_batch is computed as ceil(max_i(n_bch_i / p_i)), where n_bch_i is the number of batches for system i and p_i is the sampling probability after sys_probs/auto_prob normalization. Mutually exclusive with num_step. For multi-task mode, use num_epoch_dict instead. Accepted names: num_epoch, num_epochs, numb_epoch, numb_epochs.
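The step count derived from num_epoch can be worked through with a small sketch of the two formulas stated above. The function name total_steps is hypothetical; the code only mirrors the formulas in this description and assumes every sampling probability is positive.

```python
import math


def total_steps(num_epoch, n_batches, probs):
    """Total training steps implied by an epoch count.

    n_batches: number of batches n_bch_i for each system i.
    probs:     normalized sampling probability p_i for each system i
               (assumed to be strictly positive).
    """
    # total_numb_batch = ceil(max_i(n_bch_i / p_i))
    total_numb_batch = math.ceil(max(n / p for n, p in zip(n_batches, probs)))
    # total steps = ceil(num_epoch * total_numb_batch)
    return math.ceil(num_epoch * total_numb_batch)
```

For two systems with 100 and 300 batches sampled with probabilities 0.25 and 0.75, total_numb_batch is 400, so num_epoch = 2.5 corresponds to 1000 steps. With uniform probabilities of 0.5 each, the larger system dominates (300 / 0.5 = 600), and one epoch corresponds to 600 steps.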
- seed:#
- type:
int|NoneType, optionalargument path:training/seedThe random seed for getting frames from the training data set.
- disp_file:#
- type:
str, optional, default:lcurve.outargument path:training/disp_fileThe file for printing learning curve.
- disp_freq:#
- type:
int, optional, default:1000argument path:training/disp_freqThe frequency of printing learning curve.
- save_freq:#
- type:
int, optional, default:1000argument path:training/save_freqThe frequency of saving check point.
- save_ckpt:#
- type:
str, optional, default:model.ckptargument path:training/save_ckptThe path prefix of saving check point files.
- max_ckpt_keep:#
- type:
int, optional, default:5argument path:training/max_ckpt_keepThe maximum number of checkpoints to keep. The oldest checkpoints will be deleted once the number of checkpoints exceeds max_ckpt_keep. Defaults to 5.
- enable_ema:#
- type:
bool, optional, default:Falseargument path:training/enable_ema(Supported Backend: PyTorch) Whether to maintain an exponential moving average (EMA) of model parameters during training and save periodic EMA checkpoints with an _ema suffix in the checkpoint prefix.
- ema_decay:#
- type:
float, optional, default:0.999argument path:training/ema_decay(Supported Backend: PyTorch) The decay factor used for the exponential moving average of model parameters. The EMA update is ema = ema_decay * ema + (1 - ema_decay) * param.
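The EMA update formula above can be written as a one-line function. This is only a sketch over plain Python floats; the name ema_update is hypothetical, and how the EMA is initialized or applied to real model tensors is not covered here.

```python
def ema_update(ema, params, ema_decay=0.999):
    """Apply one EMA step: ema = ema_decay * ema + (1 - ema_decay) * param."""
    return [ema_decay * e + (1.0 - ema_decay) * p for e, p in zip(ema, params)]
```

A decay close to 1 makes the average change slowly: with ema_decay = 0.999, each step moves the averaged value only 0.1% of the way toward the current parameter.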
- ema_ckpt_keep:#
- type:
int, optional, default:3argument path:training/ema_ckpt_keep(Supported Backend: PyTorch) The maximum number of periodic EMA checkpoints to keep. EMA checkpoints use the same prefix-based cleanup rule as regular training checkpoints, but with an EMA-specific checkpoint prefix.
- change_bias_after_training:#
- type:
bool, optional, default:Falseargument path:training/change_bias_after_trainingWhether to change the output bias after the last training step, by running predictions with the trained model on the training data and performing a least-squares fit on the errors to obtain the target shift added to the bias.
- disp_training:#
- type:
bool, optional, default:Trueargument path:training/disp_trainingDisplaying verbose information during training.
- time_training:#
- type:
bool, optional, default:Trueargument path:training/time_trainingTiming during training.
- disp_avg:#
- type:
bool, optional, default:Falseargument path:training/disp_avg(Supported Backend: PyTorch) Display the average loss over the display interval for training sets.
- profiling:#
- type:
bool, optional, default:Falseargument path:training/profilingExport the profiling results to the Chrome JSON file for performance analysis, driven by the legacy TensorFlow profiling API or PyTorch Profiler. The output file will be saved to profiling_file. In the PyTorch backend, when enable_profiler is True, this option is ignored, since the profiling results will be saved to the TensorBoard log.
- profiling_file:#
- type:
str, optional, default:timeline.jsonargument path:training/profiling_fileOutput file for profiling.
- enable_profiler:#
- type:
bool, optional, default:Falseargument path:training/enable_profilerExport the profiling results to the TensorBoard log for performance analysis, driven by TensorFlow Profiler (available in TensorFlow 2.3) or PyTorch Profiler. The log will be saved to tensorboard_log_dir.
- tensorboard:#
- type:
bool, optional, default:Falseargument path:training/tensorboardEnable tensorboard
- tensorboard_log_dir:#
- type:
str, optional, default:logargument path:training/tensorboard_log_dirThe log directory of tensorboard outputs
- tensorboard_freq:#
- type:
int, optional, default:1argument path:training/tensorboard_freqThe frequency of writing tensorboard events.
- gradient_max_norm:#
- type:
float, optionalargument path:training/gradient_max_norm(Supported Backend: PyTorch) Clips the gradient norm to a maximum value. If the gradient norm exceeds this value, it will be clipped to this limit. No gradient clipping will occur if set to 0.
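Norm-based clipping can be sketched as follows. This illustrates the general technique on a flat list of floats, not the backend's code path; the function name clip_by_norm is hypothetical, and the use of the L2 norm over all gradient values is an assumption.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale grads so that their L2 norm is at most max_norm.

    Assumption: the norm is the L2 norm over all gradient values.
    A max_norm of 0 disables clipping, as described above.
    """
    if not max_norm:
        return list(grads)
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm  # shrink every component by the same factor
    return [g * scale for g in grads]
```

Because every component is scaled by the same factor, clipping changes the length of the gradient but not its direction.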
- acc_freq:#
- type:
int, optional, default:1argument path:training/acc_freq(Supported Backend: Paddle) Gradient accumulation steps (number of steps to accumulate gradients before performing an update).
- zero_stage:#
- type:
int, optional, default:0argument path:training/zero_stage(Supported Backend: PyTorch) ZeRO optimization stage for distributed training memory reduction. 0: standard DDP, lowest communication overhead but highest memory usage (full optimizer states, gradients, and parameters replicated on every GPU). 1: DDP + ZeRO stage-1, shards optimizer states across GPUs via ZeroRedundancyOptimizer; same communication volume as DDP (2x model size) but reduces optimizer memory to 1/N per GPU. 2: FSDP2 stage-2, shards optimizer states and gradients; same communication volume as stage-1 but further reduces gradient memory to 1/N per GPU. Stages 2 and 3 require FSDP2, which is available in PyTorch >= 2.6. Note: FSDP2 introduces DTensor dispatch overhead that can slow down models with many small layers; use torch.compile to mitigate. 3: FSDP2 stage-3, shards parameters as well; maximum memory savings but 50% more communication (3x model size) due to parameter all-gather in both forward and backward passes. Default is 0. Requires distributed launch via torchrun. Currently supports single-task training; does not support LKF or change_bias_after_training.
- enable_compile:#
- type:
bool, optional, default:Falseargument path:training/enable_compile(Supported Backend: PyTorch Exportable) Enable torch.compile to accelerate training. Uses make_fx to decompose autograd into primitive ops, then compiles with torch.compile/Inductor for kernel fusion. The first training step will be slower due to one-time compilation.
- validating:#
- type:
dict, optional, default:{}argument path:validating(Supported Backend: PyTorch) Independent full validation options for single-task energy training.
- full_validation:#
- type:
bool, optional, default:Falseargument path:validating/full_validation(Supported Backend: PyTorch) Whether to run an additional full validation pass over the entire validation dataset during training. This flow is independent from the display-time validation controlled by training.disp_freq. Only single-task energy training is supported. Multi-task, spin-energy, and training.zero_stage >= 2 are not supported.
- ema_full_validation:#
- type:
bool, optional, default:Falseargument path:validating/ema_full_validation(Supported Backend: PyTorch) Whether to additionally run the same full validation flow on the EMA-smoothed model when validating.full_validation=true. This reuses the existing full validation schedule, metric, start step, and best-checkpoint settings, writes results to an EMA-specific validation log such as val_ema.log, and saves EMA best checkpoints with a best_ema.ckpt prefix. Requires training.enable_ema=true.
- validation_freq:#
- type:
int, optional, default:5000argument path:validating/validation_freq(Supported Backend: PyTorch) The frequency, in training steps, of running the full validation pass.
- save_best:#
- type:
bool, optional, default:Trueargument path:validating/save_best(Supported Backend: PyTorch) Whether to save an extra checkpoint when the selected full validation metric reaches a new best value.
- max_best_ckpt:#
- type:
int, optional, default:1argument path:validating/max_best_ckpt(Supported Backend: PyTorch) The maximum number of top-ranked best checkpoints to keep. The best checkpoints are ranked by the selected validation metric in ascending order. Default is 1.
- validation_metric:#
- type:
str, optional, default:E:MAEargument path:validating/validation_metric(Supported Backend: PyTorch) Metric used to determine the best checkpoint during full validation. Supported values are E:MAE, E:RMSE, F:MAE, F:RMSE, V:MAE, V:RMSE. The string is case-insensitive. E and V are per-atom metrics; F uses component-wise force errors, matching dp test. The corresponding loss prefactors must not both be 0.
- full_val_file:#
- type:
str, optional, default:val.logargument path:validating/full_val_file(Supported Backend: PyTorch) The file for writing full validation results only. This file is independent from training.disp_file.
- full_val_start:#
- type:
int|float, optional, default:0.5argument path:validating/full_val_start(Supported Backend: PyTorch) The starting point of full validation. 0 means the feature is active from the beginning and will trigger at every validation_freq steps. A value in (0, 1) is interpreted as a ratio of training.numb_steps. 1 disables the feature. A value larger than 1 is interpreted as the starting step after integer conversion.
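The four cases of full_val_start can be summarized in a small sketch. The function name full_val_start_step is hypothetical, and truncating the ratio case to an integer step is an assumption; the description above only specifies integer conversion for values larger than 1.

```python
def full_val_start_step(full_val_start, numb_steps):
    """Return the step from which full validation is active, or None if disabled."""
    if full_val_start < 0:
        raise ValueError("full_val_start must be non-negative")
    if full_val_start == 0:
        return 0  # active from the beginning
    if full_val_start < 1:
        # Ratio of training.numb_steps (truncation here is an assumption).
        return int(full_val_start * numb_steps)
    if full_val_start == 1:
        return None  # feature disabled
    return int(full_val_start)  # explicit starting step
```

With the default of 0.5 and training.numb_steps = 10000, full validation would become active at step 5000 and then run every validation_freq steps.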
- compiled_infer:#
- type:
bool, optional, default:Falseargument path:validating/compiled_infer(Supported Backend: PyTorch) Whether to route eval-time forwards (including full validation) through the DPA4/SeZM torch.compile path instead of eager. When true, this flag is translated into DP_COMPILE_INFER=1 at trainer startup before any model is constructed, which is the env var SeZM samples inside SeZMModel.__init__. A manually exported DP_COMPILE_INFER takes precedence over this option. Only meaningful when model.use_compile=true; has no effect on models that do not implement the SeZM-style eval compile path.
- nvnmd:#
- type:
dict, optionalargument path:nvnmdThe nvnmd options.
- version:#
- type:
intargument path:nvnmd/versionThe nvnmd version (0 | 1): 0 for 4 types, 1 for 32 types.
- max_nnei:#
- type:
intargument path:nvnmd/max_nneiThe maximum number of neighbors: 128 or 256 for version 0, 128 for version 1.
- net_size:#
- type:
intargument path:nvnmd/net_sizeThe number of nodes of fitting_net. It can only be set to 128.
- map_file:#
- type:
strargument path:nvnmd/map_fileA file containing the mapping tables to replace the calculation of embedding nets
- config_file:#
- type:
strargument path:nvnmd/config_fileA file containing the parameters about how to implement the model in certain hardware
- weight_file:#
- type:
strargument path:nvnmd/weight_fileA *.npy file containing the weights of the model
- enable:#
- type:
boolargument path:nvnmd/enableEnable the nvnmd training
- restore_descriptor:#
- type:
boolargument path:nvnmd/restore_descriptorWhether to restore the parameters of embedding_net from weight.npy
- restore_fitting_net:#
- type:
boolargument path:nvnmd/restore_fitting_netWhether to restore the parameters of fitting_net from weight.npy
- quantize_descriptor:#
- type:
boolargument path:nvnmd/quantize_descriptorEnable the quantization of the descriptor
- quantize_fitting_net:#
- type:
boolargument path:nvnmd/quantize_fitting_netEnable the quantization of fitting_net
5.4.1. Writing JSON files using Visual Studio Code#
When writing JSON files using Visual Studio Code, one can benefit from IntelliSense and validation by adding a JSON schema. To do so, in a VS Code workspace, one can generate a JSON schema file for the input file by running the following command:
dp doc-train-input --out-type json_schema > deepmd.json
Then one can map the schema by updating the workspace settings in the .vscode/settings.json file as follows:
{
    "json.schemas": [
        {
            "fileMatch": [
                "/**/*.json"
            ],
            "url": "./deepmd.json"
        }
    ]
}