General Logging

When logging metrics, artifacts, and model versions, the user must specify the context of the information. The available contexts are:

  • TRAINING: adds the information to the training context
  • VALIDATION: adds the information to the validation context
  • TESTING: adds the information to the testing context
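
For illustration, a context value is passed explicitly to the logging calls described below. A minimal sketch, assuming yProv4ML is imported as prov4ml and an experiment has already been started:

import prov4ml

# record a validation metric under the VALIDATION context
prov4ml.log_metric("accuracy", 0.91, context=prov4ml.Context.VALIDATION)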

Log Parameters

To log arbitrary training parameters used during the execution of the experiment, the user can call the following function.

prov4ml.log_param(
    key: str, 
    value: str, 
)

| Parameter | Type | Description |
|-----------|------|-------------|
| key | string | Required. Name of the parameter |
| value | string | Required. Value of the parameter |
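
For example, a minimal sketch of recording two hypothetical run parameters (the names and values are illustrative, not required by the API):

prov4ml.log_param("batch_size", "64")
prov4ml.log_param("optimizer", "Adam")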

Log Metrics

To log metrics tracked during the execution of the experiment, the user can call the following function.

prov4ml.log_metric(
    key: str, 
    value: float, 
    context: Context, 
    step: Optional[int] = None, 
    source: LoggingItemKind = None, 
)

| Parameter | Type | Description |
|-----------|------|-------------|
| key | string | Required. Name of the metric |
| value | float | Required. Value of the metric |
| context | prov4ml.Context | Required. Context of the metric |
| step | int | Optional. Step of the metric |
| source | LoggingItemKind | Optional. Source of the metric |

The optional step parameter specifies the current time step of the experiment, for example the current epoch. The optional source parameter specifies where the metric comes from, for example which library produced the value; if omitted, yProv4ML tries to determine the origin automatically.
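
For example, a minimal sketch of recording a training loss at a given epoch (the metric name, value, and step are illustrative; source is left at its default so yProv4ML infers it):

prov4ml.log_metric(
    "loss",                            # name of the metric
    0.35,                              # value recorded at this step
    context=prov4ml.Context.TRAINING,  # training context
    step=3,                            # e.g. the current epoch
)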

Log Artifacts

To log artifacts, the user can call the following function.

prov4ml.log_artifact(
    artifact_path: str, 
    context: Context,
    step: Optional[int] = None, 
    timestamp: Optional[int] = None
)

| Parameter | Type | Description |
|-----------|------|-------------|
| artifact_path | string | Required. Path to the artifact |
| context | prov4ml.Context | Required. Context of the artifact |
| step | int | Optional. Step of the artifact |
| timestamp | int | Optional. Timestamp of the artifact |

The function logs the artifact in the current experiment. The artifact can be a file or a directory. All logged artifacts are saved in the artifacts directory of the current experiment, while the related information is saved in the PROV-JSON file, along with a reference to the file.
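
For example, a minimal sketch of logging a configuration file that is assumed to already exist at the given (illustrative) path:

prov4ml.log_artifact(
    "config.yaml",                     # file or directory to record
    context=prov4ml.Context.TRAINING,
    step=0,                            # optional: step at which the artifact was produced
)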

Log Models

To log a model, the user can call the following function.

prov4ml.log_model(
    model: Union[torch.nn.Module, Any], 
    model_name: str = "default", 
    log_model_info: bool = True, 
    log_as_artifact: bool = True, 
)

| Parameter | Type | Description |
|-----------|------|-------------|
| model | Union[torch.nn.Module, Any] | Required. The model to be logged |
| model_name | string | Optional. Name of the model |
| log_model_info | bool | Optional. Whether to log model information |
| log_as_artifact | bool | Optional. Whether to log the model as an artifact |

It sets the model for the current experiment and can be called anywhere before the end of the experiment. The same call also logs some model information, such as the number of parameters and the memory footprint of the model architecture. The saving of this information can be disabled by passing log_model_info=False. The model can also be saved as an artifact by setting log_as_artifact=True, which saves its parameters in the artifacts directory and references the file in the PROV-JSON file.
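
For example, a minimal sketch using a small placeholder PyTorch module (the architecture and name are arbitrary):

import torch

model = torch.nn.Sequential(
    torch.nn.Linear(16, 8),
    torch.nn.ReLU(),
    torch.nn.Linear(8, 1),
)

prov4ml.log_model(
    model,
    model_name="toy_regressor",        # arbitrary example name
    log_model_info=True,               # record parameter count and memory footprint
    log_as_artifact=True,              # also save the weights in the artifacts directory
)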

prov4ml.save_model_version(
    model: Union[torch.nn.Module, Any], 
    model_name: str, 
    context: Context, 
    step: Optional[int] = None, 
    timestamp: Optional[int] = None
)

The save_model_version function saves the state of a PyTorch model and logs it as an artifact, enabling version control and tracking within machine learning experiments.

| Parameter | Type | Description |
|-----------|------|-------------|
| model | Union[torch.nn.Module, Any] | Required. The PyTorch model to be saved |
| model_name | string | Required. The name under which to save the model |
| context | prov4ml.Context | Required. The context in which the model is saved |
| step | int | Optional. The step or epoch number associated with the saved model |
| timestamp | int | Optional. The timestamp associated with the saved model |

This function saves the model's state dictionary to a specified directory and logs the saved model file as an artifact for provenance tracking. It ensures that the directory for saving the model exists, creates it if necessary, and uses the torch.save method to save the model. It then logs the saved model file using log_artifact, associating it with the given context and optional step number.
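
For example, a minimal sketch of checkpointing the placeholder model from the previous sketch at the end of an epoch (the name and step are illustrative):

prov4ml.save_model_version(
    model,
    model_name="toy_regressor",
    context=prov4ml.Context.TRAINING,
    step=5,                            # e.g. the epoch after which the checkpoint is taken
)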

Log Datasets

yProv4ML offers helper functions to log information and statistics about specific datasets.

prov4ml.log_dataset(
    dataset: Union[DataLoader, Subset, Dataset], 
    label: str
)

| Parameter | Type | Description |
|-----------|------|-------------|
| dataset | Union[DataLoader, Subset, Dataset] | Required. The dataset to be logged |
| label | string | Required. The label of the dataset |

The function logs the dataset in the current experiment. The dataset can be a PyTorch DataLoader, Subset, or Dataset.
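
For example, a minimal sketch with a small in-memory PyTorch dataset (the data and label are illustrative):

import torch
from torch.utils.data import DataLoader, TensorDataset

train_ds = TensorDataset(torch.randn(100, 16), torch.randn(100, 1))
train_loader = DataLoader(train_ds, batch_size=32, shuffle=True)

prov4ml.log_dataset(train_loader, "train_dataset")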
