8. API Reference
- class mlsdk.MNDevice
Class to specify which device to use.
- Parameters:
device_name (str) –
Device name separated by ":" (e.g. "mncore2:auto"). The first string indicates which device to use, and the subsequent string modifies it. The available device_name values are as follows:
mncore, mncore2: Refer to the first and second generation of MN-Core, respectively. Each requires a modifier, either a device index or auto (e.g. "mncore2:0").
pfvm: Used to run MLSDK with backends other than MN-Core. Must be modified with cpu or cuda (e.g. "pfvm:cpu").
emu, emu2: Refer to emulators for the first and second generation of MN-Core, respectively. No modifier is required (e.g. "emu2").
- class mlsdk.Context(device: MNDevice)
- compile(function: Callable[[Dict[str, Tensor]], Dict[str, Tensor]], inputs: Mapping[str, Tensor | TensorProxy], codegen_dir: Path, *, options: Dict[str, Any] | None = None, cache_options: CacheOptions | None = None, num_compiler_threads: int | None = None, quiet: bool = True, exit_after_generate_codegen_dir: bool = False, optimizers: List[Optimizer] | None = None, export_kwargs: Dict[str, Any] | None = None, training: bool = True, initialize: bool | None = True, optimizer_spec: List[OptimizerSpecParamGroup] | None = None, optional_options: Set[str] | None = None, group: ProcessGroup | None = None, predefined_symbols: Dict[str, MNDeviceBuffer] | None = None) CompiledFunction
Compile a Python callable to a function that can be executed on the device.
- Parameters:
function – The Python callable to compile.
inputs – Sample inputs to the function.
codegen_dir – The directory to store intermediate and generated files.
options –
Compile options that control the compilation process. Predefined options (like O0.json through O4.json in the preset_options directory, /opt/pfn/pfcomp/codegen/preset_options/ in MLSDK) are available, with higher numbers indicating more advanced optimizations but longer compilation times.
A crucial setting is float_dtype, which prevents unintended precision degradation. This option controls the floating-point type assigned to torch.float32 tensors:
mixed (default): Uses half precision for GEMM operations (in/out) and float otherwise.
half, float, double: Assigns the specified type to all such tensors.
To avoid the default mixed precision, set float_dtype to float.
cache_options – Options for caching. See CacheOptions for details.
num_compiler_threads – The number of threads to use for compilation. If None, the number of threads will automatically be determined.
quiet – If True, suppress output from the compiler.
exit_after_generate_codegen_dir – For internal use only. If True, exit after generating the codegen directory. This is useful for decomposed-layer tests.
optimizers – For internal use only. A list of PyTorch optimizers to use for training.
export_kwargs – For internal use only. kwargs related to exporting the model to ONNX.
training – For internal use only. If True, the function is used for training.
optimizer_spec – For internal use only. The optimizer spec to use for training.
initialize – For internal use only. TODO (akirakawata): Add description.
optional_options – For internal use only. TODO (akirakawata): Add description.
group – For internal use only. TODO (akirakawata): Add description.
predefined_symbols – For internal use only. TODO (akirakawata): Add description.
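As a sketch of the float_dtype setting described for the options parameter, the helper below builds an options dictionary with a checked value. It assumes that float_dtype is passed as a key of the options dictionary; make_compile_options is a hypothetical helper and not part of MLSDK.

```python
from typing import Any, Dict

# Values listed for float_dtype in the description of the options parameter.
_FLOAT_DTYPES = ("mixed", "half", "float", "double")


def make_compile_options(float_dtype: str = "float") -> Dict[str, Any]:
    """Build an options dict that pins the type used for torch.float32 tensors.

    ASSUMPTION: "float_dtype" is a key of the options dict given to compile().
    """
    if float_dtype not in _FLOAT_DTYPES:
        raise ValueError(f"float_dtype must be one of {_FLOAT_DTYPES}")
    return {"float_dtype": float_dtype}
```

Under that assumption, the result would be passed as options=make_compile_options("float") to avoid the default mixed precision.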
- get_registered_value_proxy(value: Tensor) TensorProxy
Get the TensorProxy for the given value if it is registered in the context.
- Parameters:
value – The torch.Tensor to get the proxy for.
- Returns:
The TensorProxy for the given value.
- load_codegen_dir(codegen_dir: Path) CompiledFunction
Load a function that can be executed on the device from codegen_dir without validation.
- Parameters:
codegen_dir – The directory from which to load the compilation result files.
Note
This method will fail if the required compiled artifact, model.app.zst, is not found within the codegen_dir. Be aware that the returned function is strict; it requires an input dictionary with the exact same keys (variable names) and tensor shapes as the input used during the original compilation.
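Because load_codegen_dir performs no validation, it can help to check both conditions from the note before loading. The following is a minimal sketch in plain Python; both helpers are hypothetical, shapes are represented as plain tuples, and the artifact is assumed to sit directly in codegen_dir rather than in a subdirectory.

```python
from pathlib import Path
from typing import Mapping, Tuple

REQUIRED_ARTIFACT = "model.app.zst"  # artifact name taken from the note above

Shape = Tuple[int, ...]


def has_compiled_artifact(codegen_dir: Path) -> bool:
    """Return True if the compiled artifact exists in codegen_dir.

    ASSUMPTION: the artifact is located at the top level of codegen_dir.
    """
    return (Path(codegen_dir) / REQUIRED_ARTIFACT).is_file()


def check_inputs(expected: Mapping[str, Shape], actual: Mapping[str, Shape]) -> None:
    """Raise ValueError unless keys and shapes match the compile-time inputs."""
    if set(expected) != set(actual):
        missing = sorted(set(expected) - set(actual))
        unexpected = sorted(set(actual) - set(expected))
        raise ValueError(f"input keys differ: missing={missing}, unexpected={unexpected}")
    for name, shape in expected.items():
        if tuple(actual[name]) != tuple(shape):
            raise ValueError(f"shape mismatch for {name!r}: {actual[name]} != {shape}")
```

The expected mapping has to be recorded by the caller at compile time, since this sketch does not read it back from the codegen directory.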
- register_buffer(buffer: Tensor) None
Registers a buffer in the context.
Note
Before calling this method, you must set the name of the buffer using set_tensor_name_in_module or set_tensor_name.
- register_optimizer_buffers(optimizer: MNCoreOptimizer) None
Registers optimizer buffers in the context.
Note
Before calling this method, you must set the name of the buffer using set_buffer_name_in_optimizer or set_tensor_name.
- register_param(param: Parameter) None
Registers a parameter in the context.
Note
Before calling this method, you must set the name of the parameter using set_tensor_name_in_module or set_tensor_name.
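The ordering requirement in the notes above (name first, then register) can be sketched as follows. name_and_register is a hypothetical helper; it assumes that module is a torch.nn.Module and that registering every parameter and buffer individually is the intended usage.

```python
from typing import Any


def name_and_register(context: Any, module: Any, module_name: str) -> None:
    """Name a module's tensors, then register them in the context."""
    # Imported lazily so that this sketch can be imported without MLSDK installed.
    from mlsdk import set_tensor_name_in_module

    # Names must be set before any register_* call.
    set_tensor_name_in_module(module, module_name)

    # ASSUMPTION: module is a torch.nn.Module, so parameters() and buffers()
    # iterate over its tensors.
    for param in module.parameters():
        context.register_param(param)
    for buffer in module.buffers():
        context.register_buffer(buffer)
```

Optimizer buffers follow the same pattern with set_buffer_name_in_optimizer followed by register_optimizer_buffers.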
- static switch_context(new_context: Context) None
Switching contexts moves all tensors of the current context from the device back to host memory and loads the tensors of the new context onto the MN-Core device.
- synchronize() None
Synchronizes the context by moving tensors to the torch framework and marks the context for initialization.
This method performs the following steps:
Calls the synchronize() method of the device associated with the context.
Iterates over all tensor names in the registry and moves each tensor to the torch framework.
Note
Unlike torch.cuda.synchronize(), which only waits for all kernels in all streams on a CUDA device to complete, this function also moves all tensors in the context’s registry from the device to the host.
- class mlsdk.CompiledFunction(context: Context, code_block: _CompiledFunction, *, output_signature: ValueSignature | None = None)
- allocate_input_proxy() Dict[str, TensorProxy]
Allocate input proxies for the function.
- Returns:
A dictionary mapping input names to their corresponding TensorProxy objects.
- class mlsdk.TensorProxy(context: Context, codegen_data: TensorProxyCodegenData, *, is_input: bool = False)
- cpu() Tensor
Transfer the corresponding data to CPU (Host) to access as torch.Tensor.
- load_from(value: Tensor | TensorProxy, *, clone: bool = True) None
Load data from a torch.Tensor or another TensorProxy to this TensorProxy.
- Parameters:
value – The source tensor to copy data from.
clone – If True and value is a torch.Tensor, it will be cloned before copying, enabling the source tensor to be modified without affecting this TensorProxy.
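A possible round trip through allocate_input_proxy, load_from, and cpu is sketched below. It assumes that a CompiledFunction can be called with a dictionary of proxies and returns a dictionary of TensorProxy objects, which this reference does not state explicitly; run_with_proxies is a hypothetical helper.

```python
from typing import Any, Dict, Mapping


def run_with_proxies(compiled_function: Any, host_inputs: Mapping[str, Any]) -> Dict[str, Any]:
    """Copy host tensors into input proxies, run once, and fetch the results."""
    # One TensorProxy per input name of the compiled function.
    proxies = compiled_function.allocate_input_proxy()
    for name, proxy in proxies.items():
        # clone=True is the default, so the host tensor can be modified afterwards.
        proxy.load_from(host_inputs[name])

    # ASSUMPTION: the compiled function is callable with a dict of proxies
    # and returns a dict mapping output names to TensorProxy objects.
    outputs = compiled_function(proxies)

    # cpu() transfers each result to the host as a torch.Tensor.
    return {name: out.cpu() for name, out in outputs.items()}
```

The keys of host_inputs must match the input names used at compile time; a missing name surfaces here as a KeyError.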
- mlsdk.TensorLike = Union[torch.Tensor, TensorProxy]
- class mlsdk.CacheOptions(cache_dir_str: str, *, enable_app_cache: bool = True, enable_onnx_cache: bool = False, enable_codegen_cache: bool = False, enable_gpfn2obj_cache: bool = False)
- __init__(cache_dir_str: str, *, enable_app_cache: bool = True, enable_onnx_cache: bool = False, enable_codegen_cache: bool = False, enable_gpfn2obj_cache: bool = False) None
The options for specifying the cache directory and controlling cache behavior.
- Parameters:
cache_dir_str – A path string of the root directory to store cache.
enable_app_cache – If True, cache the GPFNApp files compiled from ONNX files. GPFNApp is the binary format that the MN-Core compiler uses.
enable_onnx_cache – If True, cache the ONNX files exported from the given function.
enable_codegen_cache – If True, cache the codegen compilation. This option is mainly for developers.
enable_gpfn2obj_cache – If True, cache the GPFN object data. This option is mainly for developers.
- class mlsdk.MNCoreOptimizer(params: Iterator[Parameter], defaults: Dict[str, Any])
- zero_grad(set_to_none: bool = True) None
Clear the gradient of the parameters.
- Parameters:
set_to_none (bool) – If True, the gradients will be set to None instead of zero.
- class mlsdk.MNCoreSGD(params: Iterator[Parameter], lr: float | Tensor = 0.001, momentum: float = 0, dampening: float = 0, weight_decay: float | Tensor = 0, nesterov: bool = False, *, maximize: bool = False, foreach: bool | None = None, differentiable: bool = False, fused: bool | None = None)
- step(closure=None) None
Perform a single optimization step to update the parameters.
- Parameters:
closure (Callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
- class mlsdk.MNCoreAdam(params: Iterator[Parameter], lr: float | Tensor = 0.001, betas: tuple[float | Tensor, float | Tensor] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0, amsgrad: bool = False, *, foreach: bool | None = None, maximize: bool = False, capturable: bool = False, differentiable: bool = False, fused: bool | None = None, decoupled_weight_decay: bool = False, chainer_use_torch: bool = True)
- step(closure=None) None
Perform a single optimization step to update the parameters.
- Parameters:
closure (Callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
- class mlsdk.MNCoreAdamW(params: Iterator[Parameter], lr: float | Tensor = 0.001, betas: tuple[float | Tensor, float | Tensor] = (0.9, 0.999), eps: float = 1e-08, weight_decay: float = 0.01, amsgrad: bool = False, *, maximize: bool = False, foreach: bool | None = None, capturable: bool = False, differentiable: bool = False, fused: bool | None = None)
- step(closure=None) None
Perform a single optimization step to update the parameters.
- Parameters:
closure (Callable) – A closure that reevaluates the model and returns the loss. Optional for most optimizers.
- class mlsdk.MNCoreLRScheduler(scheduler: LRScheduler, context: Context | None)
- step() None
Perform a step.
- mlsdk.set_buffer_name_in_optimizer(optimizer: MNCoreOptimizer, name: str) None
Set the buffer names in the optimizer.
This function sets the names of the tensors in the optimizer (i.e. buffers) according to the optimizer’s name, so that the ONNX exporter (FX2ONNX) can distinguish them later. You need to call this function before registering the optimizer buffers in the context.
- Parameters:
optimizer – The optimizer to set buffer names.
name – The name of the optimizer.
- mlsdk.get_tensor_name(tensor: Tensor) str | None
Get the name of the tensor.
This function returns the name of the tensor set by set_tensor_name_in_module or set_buffer_name_in_optimizer.
- Parameters:
tensor – The tensor to get the name of.
- Returns:
The name of the tensor.
- mlsdk.set_tensor_name_in_module(module: Module, module_name: str | None) None
Set the tensor names in the module.
This function sets the names of the tensors in the module (i.e. parameters and buffers such as BN stats), so that the ONNX exporter (FX2ONNX) can distinguish them later. You need to call this function before registering the module parameters and buffers in the context.
- Parameters:
module – The module to set tensor names.
module_name – The name of the module.
- storage.path(target: str) Path
- mlsdk.path(target: str) Path
- mlsdk.trace_scope(output_filename: str | Path | None, ignore_if_traced: bool = False) Iterator[None]