Launcher¶
The launch Decorator¶
The launcher allows launching multiple experiments on a cluster using hydra.
- mlxp.launcher.launch(config_path: str = 'configs', seeding_function: Callable[[Any], None] | None = None) Callable[[Callable[[Any], Any]], Any][source]¶
Create a decorator of the main function to be executed.
launch allows composing configurations from multiple configuration files by leveraging hydra (see the hydra-core package). This function behaves similarly to hydra.main provided in the hydra-core package: https://github.com/facebookresearch/hydra/blob/main/hydra/main.py. It expects a path to a configuration file named config.yaml contained in the directory config_path and returns a decorator. The returned decorator expects functions with the following signature: main(ctx: mlxp.Context).- Example:
import mlxp

@mlxp.launch(config_path='configs', seeding_function=set_seeds)
def main(ctx: mlxp.Context)->None:
    print(ctx.config)

if __name__ == "__main__":
    main()
Running the above python code will create an object
ctx of type mlxp.Context on the fly and provide it to the function main. This object stores information about the run. In particular, the field ctx.config stores the options contained in the config file 'config.yaml'. Additionally, ctx.logger provides a logger object of the class mlxp.Logger for logging results of the run. Just like in hydra, it is also possible to override the configs from the command line and to sweep over multiple values of a given configuration when executing python code. See https://hydra.cc/docs/intro/ for complete documentation on how to use Hydra.- This function is necessary to enable MLXP’s functionalities, including:
Multiple submissions to a cluster queue using
mlxpsub
Job versioning: creating a ‘safe’ working directory from which jobs are executed when submitted to a cluster queue, to ensure each job is executed with a specific version of the code.
- Parameters:
config_path (str (default './configs')) – The config path, a directory where the default user configuration and MLXP settings are stored.
seeding_function (Union[Callable[[Any], None],None] (default None)) – A callable for setting the seed of random number generators. It is called with the value of the seed option ‘ctx.config.seed’.
- Returns:
A decorator of the main function to be executed.
- Type:
Callable[[TaskFunction], Any]
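Since seeding_function is called with the value of ctx.config.seed, a minimal sketch of such a function might look like the following (the name set_seeds and the use of the standard random module are illustrative assumptions; a real project would typically also seed numpy, torch, etc.):

```python
import random

def set_seeds(seed):
    # Illustrative seeding function: MLXP calls this with the value of
    # ctx.config.seed before executing the decorated main function.
    random.seed(seed)
    # If your project used numpy or torch, you would also seed them here,
    # e.g. numpy.random.seed(seed) and torch.manual_seed(seed).

# Seeding twice with the same value makes random draws reproducible.
set_seeds(42)
first = random.randint(0, 100)
set_seeds(42)
second = random.randint(0, 100)
print(first == second)  # → True
```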
- class mlxp.launcher.Context(config: DictConfig | None = None, mlxp: DictConfig | None = None, info: DictConfig | None = None, logger: Logger | None = None)[source]¶
Bases:
object
The context object passed to the decorated function when using the decorator mlxp.launch.
- config: ConfigDict¶
A structure containing project-specific options provided by the user. These options are loaded from a yaml file ‘config.yaml’ contained in the directory ‘config_path’ provided as argument to the decorator mlxp.launch. Its content can be overridden from the command line.
- mlxp: ConfigDict¶
A structure containing MLXP’s default settings for the project. Its content is loaded from a yaml file ‘mlxp.yaml’ located in the same directory as ‘config.yaml’.
- info: ConfigDict¶
A structure containing information about the current run, e.g. its status, start time, and hostname.
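For illustration, a hypothetical ‘config.yaml’ using option names like those in the sweep examples on this page might look as follows (the option names are assumptions, not MLXP defaults):

```yaml
# Hypothetical configs/config.yaml; each field appears under ctx.config,
# e.g. ctx.config.optimizer.lr, and can be overridden from the command line.
seed: 1
optimizer:
  lr: 0.1
model:
  num_units: 100
```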
- mlxp.launcher.instance_from_dict(class_name: str, arguments: Dict[str, Any]) T[source]¶
Create an instance of a class based on a dictionary of arguments.
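A rough sketch of what such a helper does, using only the standard library (this is an illustration of the idea, not MLXP’s actual implementation):

```python
import importlib

def instance_from_dict(class_name, arguments):
    # Split 'module.submodule.ClassName' into module path and attribute name.
    module_path, _, attr_name = class_name.rpartition('.')
    module = importlib.import_module(module_path)
    cls = getattr(module, attr_name)
    # Instantiate the class with the dictionary as keyword arguments.
    return cls(**arguments)

# Example: build a Fraction from a dictionary of constructor arguments.
frac = instance_from_dict('fractions.Fraction',
                          {'numerator': 3, 'denominator': 4})
print(frac)  # → 3/4
```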
- mlxp.launcher.instantiate(class_name: str) T | Callable[source]¶
Dynamically imports a module and retrieves a class or function in it by name.
Given the fully qualified name of a class or function (in the form ‘module.submodule.ClassName’ or ‘module.submodule.function_name’), this function imports the module and returns a handle to the class or function.
- Parameters:
class_name (str) – The fully qualified name of the class or function to retrieve. This should include the module path and the name, e.g., ‘module.submodule.ClassName’ or ‘module.submodule.function_name’.
- Returns:
A handle (reference) to the class or function specified by class_name.
- Return type:
Type or Callable
- Raises:
ImportError – If the module cannot be imported.
AttributeError – If the class or function cannot be found in the module.
NameError – If the name cannot be evaluated after attempts to retrieve it.
Example:¶
>>> MyClass = instantiate('my_module.MyClass')
>>> my_instance = MyClass()
>>> my_function = instantiate('my_module.my_function')
>>> result = my_function()
The mlxpsub Command¶
- mlxp.mlxpsub.mlxpsub()[source]¶
A function for submitting a script to a job scheduler. Usage: mlxpsub <script.sh>
The ‘script.sh’ must contain the scheduler’s options defining the resource allocation for each individual job. Below is an example of ‘script.sh’
- Example:
#!/bin/bash
#OAR -l core=1, walltime=6:00:00
#OAR -t besteffort
#OAR -t idempotent
#OAR -p gpumem>'16000'

python main.py optimizer.lr=10.,1.,0.1 seed=1,2,3,4
python main.py model.num_units=100,200 seed=1,2,3,4
The command assumes the script contains at least one python command of the form: python <python_file_name.py> option_1=A,B,C option_2=X,Y where <python_file_name.py> is a python file that uses MLXP for launching.
MLXP creates a script for each job corresponding to an option setting. Each script is located in a directory of the form parent_log_dir/log_id, where log_id is automatically assigned by MLXP for each job.
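To see why each job corresponds to one option setting, here is a small sketch (not MLXP’s actual implementation) of how comma-separated option values expand into the cross-product of settings:

```python
from itertools import product

# Hypothetical sketch: the first python command in 'script.sh' above
# sweeps over these comma-separated values.
options = {
    'optimizer.lr': ['10.', '1.', '0.1'],
    'seed': ['1', '2', '3', '4'],
}

# One job per element of the cross-product of option values.
jobs = [
    ' '.join(f'{key}={value}' for key, value in zip(options, values))
    for values in product(*options.values())
]

print(len(jobs))  # → 12 (3 learning rates x 4 seeds)
print(jobs[0])    # → optimizer.lr=10. seed=1
```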
Here is an example of the first created script in ‘logs/1/script.sh’
- Example:
#!/bin/bash
#OAR -n logs/1
#OAR -E /root/logs/1/log.stderr
#OAR -O /root/logs/1/log.stdout
#OAR -l core=1, walltime=6:00:00
#OAR -t besteffort
#OAR -t idempotent
#OAR -p gpumem>'16000'

cd /root/workdir/
python main.py optimizer.lr=10. seed=1
As you can see, MLXP automatically assigns values for the job’s name and the stdout and stderr file paths, so there is no need to specify those in the original script ‘script.sh’. These scripts contain the same scheduler options as ‘script.sh’ and a single python command using one specific option setting: optimizer.lr=10. seed=1. Additionally, MLXP pre-processes the python command to extract its working directory and sets it explicitly in the newly created script before the python command.
Note
It is also possible to have other commands in ‘script.sh’, for instance to activate an environment (conda activate my_env). These commands will be copied from ‘script.sh’ to the newly created script and placed before the python command. Variable assignments and directory changes will be systematically ignored.
To use
mlxpsub, MLXP must be installed on both the head node and all compute nodes. However, application-specific modules do not need to be installed on the head node. You can avoid installing them on the head node by ensuring that these modules are only imported within the function that is decorated with the mlxp.launch decorator.
In the following example, the
mlxp.launch decorator is used in the file main.py to decorate the function train. The version below of main.py requires torch to be installed on the head node:
main.py¶
import torch
import mlxp

@mlxp.launch(config_path='./configs')
def train(ctx: mlxp.Context)->None:
    cfg = ctx.config
    logger = ctx.logger
    ...

if __name__ == "__main__":
    train()
To avoid installing
torch on the head node, you can make the following simple modification to the main.py file:
main.py¶
import mlxp

@mlxp.launch(config_path='./configs')
def train(ctx: mlxp.Context)->None:
    import torch
    cfg = ctx.config
    logger = ctx.logger
    ...

if __name__ == "__main__":
    train()