Data structures

These are the data structures provided by MLXP to handle configuration options and data.

The schemas Classes

Structures for validating the configurations.

class mlxp.data_structures.schemas.ConfigVersionManager(name: str = '???')[source]

Bases: object

Structure of the config file for the version manager.

name: str

Name of the version manager’s class.

class mlxp.data_structures.schemas.ConfigGitVM(name: str = 'mlxp.GitVM', parent_work_dir: str = './.work_dir', compute_requirements: bool = False)[source]

Bases: ConfigVersionManager

Configs for using the GitVM version manager.

It inherits the structure of the class ConfigVersionManager.

name: str

Name of the version manager’s class.

parent_work_dir: str

The target parent directory of the new working directory returned by the version manager.

compute_requirements: bool

When set to true, the version manager stores a list of requirements and their versions.

class mlxp.data_structures.schemas.ConfigLogger(name: str = 'mlxp.DefaultLogger', parent_log_dir: str = './logs', forced_log_id: int = -1, log_streams_to_file: bool = False)[source]

Bases: object

Structure of the config file for the logs.

The outputs for each run are saved in a directory of the form ‘parent_log_dir/log_id’ which is stored in the variable ‘path’ during execution.

name: str

Class name of the logger to use. (default “DefaultLogger”)

parent_log_dir: str

Absolute path of the parent directory where the logs of a run are stored. (default “./logs”)

forced_log_id: int

An id optionally provided by the user for the run. If forced_log_id is positive, then the logs of the run will be stored under ‘parent_log_dir/forced_log_id’. Otherwise, the logs will be stored in a directory ‘parent_log_dir/log_id’ where ‘log_id’ is assigned uniquely for the run during execution.

log_streams_to_file: bool

If true, logs the system stdout and stderr of a run to files named “log.stdout” and “log.stderr” in the log directory.
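
The rule for choosing the log directory described under forced_log_id can be sketched as below. This is a minimal illustration and not MLXP's implementation: LoggerSettings is a hypothetical stand-in that mirrors the documented fields, resolve_log_dir and its next_free_id argument are assumed names, and the test for a "positive" id follows the wording above literally.

```python
import os
from dataclasses import dataclass


@dataclass
class LoggerSettings:
    """Hypothetical stand-in mirroring the documented ConfigLogger fields."""

    name: str = "mlxp.DefaultLogger"
    parent_log_dir: str = "./logs"
    forced_log_id: int = -1
    log_streams_to_file: bool = False


def resolve_log_dir(settings: LoggerSettings, next_free_id: int) -> str:
    """Return 'parent_log_dir/log_id' following the documented rule.

    next_free_id is an assumed input: the id that would be assigned
    automatically when the user does not force one.
    """
    if settings.forced_log_id > 0:
        # The user forced an id: use it as the directory name.
        log_id = settings.forced_log_id
    else:
        # No forced id: fall back to the automatically assigned one.
        log_id = next_free_id
    return os.path.join(settings.parent_log_dir, str(log_id))


default_dir = resolve_log_dir(LoggerSettings(), next_free_id=3)
forced_dir = resolve_log_dir(LoggerSettings(forced_log_id=7), next_free_id=3)
```

With the defaults, the run is stored under the automatically assigned id; with forced_log_id=7, it is stored under 7 regardless of next_free_id.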

class mlxp.data_structures.schemas.Info(status: str = 'STARTING', current_file_path: str = '', executable: str = '', hostname: str = '', process_id: int = -1, start_date: Any = '', start_time: Any = '', end_date: Any = '', end_time: Any = '', work_dir: str = '/tmp/tmpork1so9r/e5fe80d510d9ba75a2a3370b1e2f890bdf4d4ac0/docs', logger: Any | None = None, scheduler: Any | None = None, version_manager: Any | None = None)[source]

Bases: object

A structure storing general information about the run.

The following variables are assigned during execution.

status: str

Status of a job. The status can take the following values:

  • STARTING: The metadata for the run have been created.

  • RUNNING: The experiment is currently running.

  • COMPLETE: The run is complete and did not throw any error.

  • FAILED: The run stopped due to an error.

current_file_path: str

Name of the Python file being executed.

executable: str

Path to the Python executable used for executing the code.

hostname: str

Name of the host from which code is executed.

process_id: int

Id of the process assigned to the job during execution.

start_date: Any

Date at which job started.

start_time: Any

Time at which job started.

end_date: Any

Date at which job ended.

end_time: Any

Time at which job ended.

logger: Any

Logger info, whenever used.

scheduler: Any

Scheduler info, whenever used.

version_manager: Any

Version manager info, whenever used.

class mlxp.data_structures.schemas.MLXPConfig(logger: ~mlxp.data_structures.schemas.ConfigLogger = <factory>, version_manager: ~mlxp.data_structures.schemas.ConfigVersionManager = <factory>, use_version_manager: bool = False, use_scheduler: bool = False, use_logger: bool = True, interactive_mode: bool = True, resolve: bool = True, as_ConfigDict: bool = True)[source]

Bases: object

Default settings of MLXP.

logger: ConfigLogger

The logger’s settings. (default ConfigLogger)

version_manager: ConfigVersionManager

The version_manager’s settings. (default ConfigGitVM)

use_version_manager: bool

If true, uses the version manager. (default False)

use_scheduler: bool

If true, uses the scheduler. (default False)

use_logger: bool

If true, uses the logger. (default True)

interactive_mode: bool

A variable controlling MLXP’s interactive mode.

  1. If ‘interactive_mode==True’, MLXP uses the interactive mode whenever applicable:

    • When ‘use_version_manager==True’: Asks the user:

      • If untracked files should be added.

      • If uncommitted changes should be committed.

      • If a copy of the current repository based on the latest commit should be made (if not already existing) to execute the code from there. Otherwise, code is executed from the current directory.

  2. If ‘interactive_mode==False’, the user is not prompted and the following default behavior applies:

    • When ‘use_version_manager==True’:

      • Existing untracked files or uncommitted changes are ignored.

      • A copy of the code is made based on the latest commit (if not already existing) and code is executed from there.
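
The two modes above can be summarized as a small decision function. This is only a sketch of the documented behavior, not MLXP's code: plan_version_manager_actions, the ask callback, and the keys of the returned dictionary are hypothetical names introduced for illustration.

```python
from typing import Callable, Dict


def plan_version_manager_actions(
    interactive_mode: bool, ask: Callable[[str], bool]
) -> Dict[str, bool]:
    """Decide what to do when 'use_version_manager==True'.

    ask is an assumed callback that puts a yes/no question to the user.
    """
    if interactive_mode:
        # Interactive mode: every choice is delegated to the user.
        return {
            "add_untracked": ask("Add untracked files?"),
            "commit_changes": ask("Commit uncommitted changes?"),
            "run_from_copy": ask("Run from a copy of the latest commit?"),
        }
    # Non-interactive mode: untracked files and uncommitted changes are
    # ignored, and the code runs from a copy based on the latest commit.
    return {
        "add_untracked": False,
        "commit_changes": False,
        "run_from_copy": True,
    }


questions = []


def always_yes(question: str) -> bool:
    """Record the question and answer yes (stands in for a real prompt)."""
    questions.append(question)
    return True


interactive_plan = plan_version_manager_actions(True, always_yes)
batch_plan = plan_version_manager_actions(False, always_yes)
```

In the non-interactive branch the callback is never invoked, which matches the description that no question is asked.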

resolve: bool

If true, resolves the configurations prior to starting the job. (default True)

as_ConfigDict: bool

If true, converts the configurations from an omegaconf.dictconfig.DictConfig object to the custom mlxp.ConfigDict object. Once converted, the object becomes mutable and all its values are resolved. (default True)

class mlxp.data_structures.schemas.Metadata(info: ~mlxp.data_structures.schemas.Info = <factory>, mlxp: ~mlxp.data_structures.schemas.MLXPConfig = <factory>, config: ~typing.Any | None = None)[source]

Bases: object

The structure of the config file.

info: Info

Contains config information of the run (hostname, command, application, etc.). (default Info)

mlxp: MLXPConfig

Default settings of MLXP. (default MLXPConfig)

config: Any

Contains the user’s defined configs that are specific to the run.
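
The nesting of info, mlxp and config can be pictured with reduced stand-in dataclasses. The classes below are hypothetical sketches that keep only a few of the documented fields, and the "lr" option is an invented user setting; the default factories mirror the <factory> defaults shown in the signature above.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class InfoSketch:
    """Reduced stand-in for Info (only three of its fields)."""

    status: str = "STARTING"
    hostname: str = ""
    process_id: int = -1


@dataclass
class MLXPConfigSketch:
    """Reduced stand-in for MLXPConfig (only the three 'use_*' flags)."""

    use_logger: bool = True
    use_version_manager: bool = False
    use_scheduler: bool = False


@dataclass
class MetadataSketch:
    """Stand-in for Metadata: two structured parts plus the user's config."""

    # default_factory builds a fresh nested object for every instance.
    info: InfoSketch = field(default_factory=InfoSketch)
    mlxp: MLXPConfigSketch = field(default_factory=MLXPConfigSketch)
    config: Optional[Any] = None


meta = MetadataSketch(config={"lr": 0.1})  # "lr" is a made-up user option
as_plain = asdict(meta)  # nested plain dictionaries, convenient for inspection
```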

The config_dict Classes

A dictionary-like structure for storing the configurations.

class mlxp.data_structures.config_dict.ConfigDict(*args, **kwargs)[source]

Bases: dict

A subclass of the dict class containing the configuration options.

The value corresponding to a key can be accessed as an attribute: self.key

to_dict() → Dict[str, Any][source]

Convert the object into a simple dictionary.

Returns:

A dictionary containing the same information as self

Return type:

Dict[str,Any]

update(new_dict: Dict[str, Any]) → None[source]

Update the dictionary based on an input dictionary-like object.

Parameters:

new_dict (Dict[str, Any]) – Dictionary-like object.
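
Attribute-style access on a dict subclass, together with a conversion back to a plain dictionary, can be sketched as follows. AttrDict is a hypothetical stand-in written for illustration; it is not the source of mlxp.ConfigDict and may differ from it in details such as how nested values or missing keys are handled. The keys "optimizer", "lr" and "seed" are made-up options.

```python
from typing import Any, Dict


class AttrDict(dict):
    """Hypothetical dict subclass whose keys can be read as attributes."""

    def __getattr__(self, key: str) -> Any:
        # Called only when normal attribute lookup fails, so dict methods
        # such as items() or update() keep working as usual.
        try:
            return self[key]
        except KeyError as err:
            raise AttributeError(key) from err

    def __setattr__(self, key: str, value: Any) -> None:
        # Store attributes as dictionary entries.
        self[key] = value

    def to_dict(self) -> Dict[str, Any]:
        # Recursively replace nested AttrDict values with plain dicts.
        return {
            k: v.to_dict() if isinstance(v, AttrDict) else v
            for k, v in self.items()
        }


cfg = AttrDict(optimizer=AttrDict(lr=0.1), seed=0)
cfg.seed = 1                # same effect as cfg["seed"] = 1
learning_rate = cfg.optimizer.lr
plain = cfg.to_dict()
```

One consequence of this design is that a key sharing its name with a dict method (for example "items") cannot be read as an attribute in this sketch and has to be accessed with the bracket syntax.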

mlxp.data_structures.config_dict.convert_dict(src_dict: ~typing.Any, src_class: ~typing.Type = <class 'omegaconf.dictconfig.DictConfig'>, dst_class: ~typing.Type = <class 'mlxp.data_structures.config_dict.ConfigDict'>) → Any[source]

Convert a dictionary-like object from a source class to a dictionary-like object of a destination class.

Parameters:
  • src_dict (Any) – The source dictionary to be converted.

  • src_class (Type) – The type of the source dictionary.

  • dst_class (Type) – The destination type of the returned dictionary-like object.

Returns:

A dictionary-like instance of the dst_class copying the data from the src_dict.

Return type:

Any
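
A recursive conversion of this kind can be sketched as below. This is an illustration under stated assumptions rather than the library's implementation: convert_dict_sketch and Target are hypothetical names, the plain dict type is used as the source class so that the example does not depend on omegaconf, and descending into lists and tuples is an assumption about how nested containers might be treated.

```python
from typing import Any, Type


class Target(dict):
    """Hypothetical destination class; any dict subclass would do here."""


def convert_dict_sketch(
    src_dict: Any, src_class: Type = dict, dst_class: Type = Target
) -> Any:
    """Rebuild src_dict with every src_class mapping turned into dst_class."""
    if isinstance(src_dict, src_class):
        # Convert the values first, then wrap them in the destination class.
        return dst_class(
            {
                key: convert_dict_sketch(value, src_class, dst_class)
                for key, value in src_dict.items()
            }
        )
    if isinstance(src_dict, (list, tuple)):
        # Assumed behavior: also convert mappings nested inside sequences.
        return type(src_dict)(
            convert_dict_sketch(value, src_class, dst_class)
            for value in src_dict
        )
    # Leaves (numbers, strings, ...) are returned unchanged.
    return src_dict


converted = convert_dict_sketch({"a": {"b": 1}, "c": [{"d": 2}]})
```

The result compares equal to the input, but every mapping in it, including the one inside the list, is an instance of the destination class.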

The Artifact Class

Artifact objects that can be saved by a Logger object.

The data_dict Module