Launcher¶
The launch Decorator¶
The launcher allows launching multiple experiments on a cluster using hydra.
- mlxp.launcher.launch(config_path: str = 'configs', seeding_function: Callable[[Any], None] | None = None) Callable[[Callable[[Any], Any]], Any] [source]¶
Create a decorator of the main function to be executed.
launch allows composing configurations from multiple configuration files by leveraging hydra (see the hydra-core package). This function behaves similarly to hydra.main provided in the hydra-core package: https://github.com/facebookresearch/hydra/blob/main/hydra/main.py. It expects a path to a configuration file named config.yaml contained in the directory config_path and returns a decorator. The returned decorator expects functions with the following signature: main(ctx: mlxp.Context).
- Example:
import mlxp

@mlxp.launch(config_path='configs', seeding_function=set_seeds)
def main(ctx: mlxp.Context)->None:
    print(ctx.config)

if __name__ == "__main__":
    main()
Running the above Python code creates an object ctx of type mlxp.Context on the fly and passes it to the function main. This object stores information about the run. In particular, the field ctx.config stores the options contained in the config file 'config.yaml'. Additionally, ctx.logger provides a logger object of the class mlxp.Logger for logging results of the run. Just like in hydra, it is also possible to override the configs from the command line and to sweep over multiple values of a given configuration when executing Python code. See https://hydra.cc/docs/intro/ for complete documentation on how to use Hydra.
- This function is necessary to enable MLXP's functionalities, including:
Multiple submissions to a cluster queue using mlxpsub
Job versioning: creating a 'safe' working directory from which jobs are executed when submitted to a cluster queue, ensuring that each job is executed with a specific version of the code.
- Parameters:
config_path (str (default './configs')) – The config path, a directory where the default user configuration and MLXP settings are stored.
seeding_function (Union[Callable[[Any], None], None] (default None)) – A callable for setting the seed of random number generators. It is called with the value of the seed option 'ctx.config.seed' (see the sketch after this block).
- Returns:
A decorator of the main function to be executed.
- Return type:
Callable[[TaskFunction], Any]
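For illustration, here is a minimal sketch of a function that could be passed as seeding_function; the name set_seeds and the libraries it seeds (random, numpy) are assumptions, not part of MLXP:

import random

import numpy as np

def set_seeds(seed):
    # MLXP calls seeding_function with the value of ctx.config.seed.
    # Assumption: the project uses the random and numpy generators;
    # other frameworks would be seeded here in the same way.
    random.seed(seed)
    np.random.seed(seed)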
- class mlxp.launcher.Context(config: DictConfig | None = None, mlxp: DictConfig | None = None, info: DictConfig | None = None, logger: Logger | None = None)[source]¶
Bases: object
The context object passed to the decorated function when using the decorator mlxp.launch.
- config: ConfigDict¶
A structure containing project-specific options provided by the user. These options are loaded from a yaml file 'config.yaml' contained in the directory 'config_path' provided as an argument to the decorator mlxp.launch. Its content can be overridden from the command line.
- mlxp: ConfigDict¶
A structure containing MLXP's default settings for the project. Its content is loaded from a yaml file 'mlxp.yaml' located in the same directory as 'config.yaml'.
- info: ConfigDict¶
A structure containing information about the current run, such as its status, start time, and hostname.
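As a hedged illustration of the three fields above, a decorated function could simply print them; the config path './configs' is an assumption:

import mlxp

@mlxp.launch(config_path='./configs')
def main(ctx: mlxp.Context) -> None:
    print(ctx.config)    # user options loaded from ./configs/config.yaml
    print(ctx.mlxp)      # MLXP settings loaded from mlxp.yaml in the same directory
    print(ctx.info)      # run information: status, start time, hostname, ...
    logger = ctx.logger  # mlxp.Logger object for logging results

if __name__ == "__main__":
    main()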
- mlxp.launcher.instance_from_dict(class_name: str, arguments: Dict[str, Any]) T [source]¶
Create an instance of a class based on a dictionary of arguments.
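A hypothetical usage sketch, assuming class_name is a fully qualified name as in instantiate; the class 'my_module.MyModel' and its constructor arguments are made up for illustration:

from mlxp.launcher import instance_from_dict

# Hypothetical: builds an instance of my_module.MyModel, roughly
# equivalent to my_module.MyModel(num_units=100, dropout=0.1).
model = instance_from_dict('my_module.MyModel',
                           {'num_units': 100, 'dropout': 0.1})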
- mlxp.launcher.instantiate(class_name: str) T | Callable [source]¶
Dynamically imports a module and retrieves a class or function in it by name.
Given the fully qualified name of a class or function (in the form ‘module.submodule.ClassName’ or ‘module.submodule.function_name’), this function imports the module and returns a handle to the class or function.
- Parameters:
class_name (str) – The fully qualified name of the class or function to retrieve. This should include the module path and the name, e.g., ‘module.submodule.ClassName’ or ‘module.submodule.function_name’.
- Returns:
A handle (reference) to the class or function specified by class_name.
- Return type:
Type or Callable
- Raises:
ImportError – If the module cannot be imported.
AttributeError – If the class or function cannot be found in the module.
NameError – If the name cannot be evaluated after attempts to retrieve it.
Example:¶
>>> MyClass = instantiate('my_module.MyClass')
>>> my_instance = MyClass()
>>> my_function = instantiate('my_module.my_function')
>>> result = my_function()
The mlxpsub Command¶
- mlxp.mlxpsub.mlxpsub()[source]¶
A function for submitting a script to a job scheduler. Usage: mlxpsub <script.sh>
The ‘script.sh’ must contain the scheduler’s options defining the resource allocation for each individual job. Below is an example of ‘script.sh’
- Example:
#!/bin/bash

#OAR -l core=1, walltime=6:00:00
#OAR -t besteffort
#OAR -t idempotent
#OAR -p gpumem>'16000'

python main.py optimizer.lr=10.,1.,0.1 seed=1,2,3,4
python main.py model.num_units=100,200 seed=1,2,3,4
The command assumes the script contains at least one python command of the form: python <python_file_name.py> option_1=A,B,C option_2=X,Y, where <python_file_name.py> is a Python file that uses MLXP for launching.
MLXP creates a script for each job corresponding to an option setting. Each script is located in a directory of the form parent_log_dir/log_id, where log_id is automatically assigned by MLXP for each job.
Here is an example of the first created script in ‘logs/1/script.sh’
- Example:
#!/bin/bash

#OAR -n logs/1
#OAR -E /root/logs/1/log.stderr
#OAR -O /root/logs/1/log.stdout
#OAR -l core=1, walltime=6:00:00
#OAR -t besteffort
#OAR -t idempotent
#OAR -p gpumem>'16000'

cd /root/workdir/
python main.py optimizer.lr=10. seed=1
As you can see, MLXP automatically assigns values for the job's name and for the stdout and stderr file paths, so there is no need to specify those in the original script 'script.sh'. These scripts contain the same scheduler options as 'script.sh' and a single python command using one specific option setting: optimizer.lr=10. seed=1. Additionally, MLXP pre-processes the python command to extract its working directory and sets it explicitly in the newly created script before the python command.
Note
It is also possible to have other commands in 'script.sh', for instance to activate an environment (conda activate my_env). These commands will be copied from 'script.sh' to the newly created script and placed before the python command. Variable assignments and directory changes will be systematically ignored.
To use mlxpsub, MLXP must be installed on both the head node and all compute nodes. However, application-specific modules do not need to be installed on the head node. You can avoid installing them on the head node by ensuring that these modules are only imported within the function that is decorated with the mlxp.launch decorator.
In the following example, the mlxp.launch decorator is used in the file main.py to decorate the function train. The version of main.py below requires torch to be installed on the head node:
main.py¶
import torch
import mlxp

@mlxp.launch(config_path='./configs')
def train(ctx: mlxp.Context)->None:
    cfg = ctx.config
    logger = ctx.logger
    ...

if __name__ == "__main__":
    train()
To avoid installing torch on the head node, you can make the following simple modification to the main.py file:
main.py¶
import mlxp

@mlxp.launch(config_path='./configs')
def train(ctx: mlxp.Context)->None:
    import torch

    cfg = ctx.config
    logger = ctx.logger
    ...

if __name__ == "__main__":
    train()