deepmd.cluster package
Module that reads node resources and auto-detects whether it is running locally or on SLURM.
- deepmd.cluster.get_resource() → Tuple[str, List[str], Optional[List[int]]]
Get local or SLURM resources: nodename, nodelist, and gpus.
- Returns
- Tuple[str, List[str], Optional[List[int]]]
nodename, nodelist, and gpus
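The auto-detection described above can be sketched as follows. This is a minimal illustration, not the package source: the helper name `detect_backend` is hypothetical, and the use of the `SLURM_JOB_NODELIST` environment variable as the detection signal is an assumption.

```python
import os
from typing import Mapping, Optional


def detect_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return "slurm" if a SLURM job environment is detected, else "local".

    Assumption: the presence of SLURM_JOB_NODELIST marks a SLURM job.
    The real package may use a different signal.
    """
    if env is None:
        env = os.environ
    return "slurm" if "SLURM_JOB_NODELIST" in env else "local"


# Passing an explicit mapping keeps the example independent of the host.
print(detect_backend({"SLURM_JOB_NODELIST": "node01"}))  # slurm
print(detect_backend({}))  # local
```

With the package installed, the documented return value would be unpacked as `nodename, nodelist, gpus = deepmd.cluster.get_resource()`. Because the third element is `Optional[List[int]]`, callers should handle `gpus` being `None`.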
Submodules
deepmd.cluster.local module
Get local GPU resources.
deepmd.cluster.slurm module
Module to get resources on a SLURM cluster.
References
https://github.com/deepsense-ai/tensorflow_on_slurm
- deepmd.cluster.slurm.get_resource() → Tuple[str, List[str], Optional[List[int]]]
Get SLURM resources: nodename, nodelist, and gpus.
- Returns
- Tuple[str, List[str], Optional[List[int]]]
nodename, nodelist, and gpus
- Raises
- RuntimeError
if the number of nodes could not be retrieved
- ValueError
if the list of nodes is not the same length as the number of nodes
- ValueError
if the current nodename is not found in the node list
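The three error conditions listed above can be illustrated with a small stand-alone sketch. It is not the deepmd implementation. The function name `slurm_resource_sketch` is hypothetical, and the environment variables it reads (`SLURM_JOB_NODELIST`, `SLURM_JOB_NUM_NODES`, `SLURMD_NODENAME`, `CUDA_VISIBLE_DEVICES`) are assumptions about where the values come from. The sketch also assumes a plain comma-separated node list and integer GPU indices. Real SLURM node lists may use bracket ranges such as `node[01-04]`, which are not expanded here.

```python
from typing import List, Mapping, Optional, Tuple


def slurm_resource_sketch(
    env: Mapping[str, str],
) -> Tuple[str, List[str], Optional[List[int]]]:
    """Hypothetical re-creation of the documented checks (not the deepmd source)."""
    # Assumption: plain comma-separated node names, no bracket ranges.
    nodelist = [n for n in env.get("SLURM_JOB_NODELIST", "").split(",") if n]

    # RuntimeError: the number of nodes could not be retrieved.
    num_nodes = env.get("SLURM_JOB_NUM_NODES")
    if num_nodes is None:
        raise RuntimeError("Could not get SLURM number of nodes")

    # ValueError: the node list length disagrees with the node count.
    if len(nodelist) != int(num_nodes):
        raise ValueError(
            f"Node list has {len(nodelist)} entries, expected {num_nodes}"
        )

    # ValueError: the current node is missing from the node list.
    nodename = env.get("SLURMD_NODENAME")
    if nodename is None or nodename not in nodelist:
        raise ValueError(f"Node name {nodename!r} not found in {nodelist}")

    # Assumption: GPUs are given as comma-separated integer indices.
    visible = env.get("CUDA_VISIBLE_DEVICES")
    gpus = [int(i) for i in visible.split(",")] if visible else None
    return nodename, nodelist, gpus


example_env = {
    "SLURM_JOB_NODELIST": "node01,node02",
    "SLURM_JOB_NUM_NODES": "2",
    "SLURMD_NODENAME": "node02",
    "CUDA_VISIBLE_DEVICES": "0,1",
}
print(slurm_resource_sketch(example_env))
```

Taking the environment as an explicit mapping rather than reading `os.environ` directly keeps the checks easy to exercise outside a SLURM job.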