Hardware configuration

class optexp.hardwareconfig.HardwareConfig

Abstract base class for hardware configurations.

class optexp.hardwareconfig.StrictManualConfig(num_devices: int = 1, micro_batch_size: int | None = None, eval_micro_batch_size: int | None = None, num_workers: int = 0, device: Literal['cpu', 'cuda', 'auto'] = 'auto')

Manual configuration for hardware settings.

If you want to use multiple devices or if the batch size is too large to fit in memory, use this class to specify the hardware settings.

The _effective_ batch size, i.e. the number of samples used to compute each optimization step, is specified in Problem. This class specifies the micro batch size, i.e. the number of samples loaded at once. The micro batches are then combined through gradient accumulation to compute each optimization step.

Operations like drop_last are not yet supported. You will get an error if the batch size does not divide the dataset size.
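
The relationship between the effective batch size and the micro batch size can be illustrated with a small, library-independent sketch. It is not optexp's implementation: the toy model y = w * x, the squared-error loss, and the function names grad_full and grad_accumulated are assumptions made for illustration only. The point is that summing per-sample gradients over micro batches and dividing once by the full batch size gives the same gradient as processing the whole batch at once.

```python
import math


def grad_full(w, xs, ys):
    """Gradient of the mean squared error of y = w * x over the whole batch."""
    return sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def grad_accumulated(w, xs, ys, micro_batch_size):
    """Same gradient, accumulated over micro batches (hypothetical helper)."""
    batch_size = len(xs)
    if batch_size % micro_batch_size != 0:
        raise ValueError("micro_batch_size must evenly divide the batch size")
    total = 0.0
    for start in range(0, batch_size, micro_batch_size):
        micro_xs = xs[start:start + micro_batch_size]
        micro_ys = ys[start:start + micro_batch_size]
        # Accumulate the *sum* of per-sample gradients for this micro batch.
        total += sum(2 * (w * x - y) * x for x, y in zip(micro_xs, micro_ys))
    # Divide once by the effective batch size to obtain the mean gradient.
    return total / batch_size


# Effective batch size of 100, loaded 10 samples at a time.
xs = [float(i) for i in range(100)]
ys = [3.0 * x + 1.0 for x in xs]
full = grad_full(0.5, xs, ys)
accumulated = grad_accumulated(0.5, xs, ys, micro_batch_size=10)
```

Only the micro batch needs to fit in memory at any one time, which is why a smaller micro_batch_size helps when the effective batch size is too large.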

Parameters:
  • num_devices (int, optional) – number of devices (e.g. GPUs). Defaults to 1.

  • micro_batch_size (int, optional) – number of samples loaded at once during training. Must evenly divide the batch size. If not provided, the Problem batch size is used.

  • eval_micro_batch_size (int, optional) – number of samples loaded at once during evaluation, i.e. the size of the minibatches actually loaded. If not provided, the micro_batch_size is used.

  • num_workers (int, optional) – number of workers used to load samples. Defaults to 0.

  • device (Literal["cpu", "cuda", "auto"], optional) – device to use for training. Can be “cpu”, “cuda” or “auto”. Defaults to “auto”, which uses the GPU if available.
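
The fallback chain described for micro_batch_size and eval_micro_batch_size can be sketched as follows. The function resolve_micro_batch_sizes is a hypothetical helper written for this illustration, not part of optexp; it only mirrors the defaults stated in the parameter list above.

```python
def resolve_micro_batch_sizes(batch_size, micro_batch_size=None, eval_micro_batch_size=None):
    """Return (training, evaluation) micro batch sizes after applying the defaults.

    Hypothetical helper: mirrors the documented fallbacks, not optexp's code.
    """
    # If no micro batch size is given, the Problem batch size is used.
    train = batch_size if micro_batch_size is None else micro_batch_size
    # If no evaluation micro batch size is given, the training one is used.
    evaluation = train if eval_micro_batch_size is None else eval_micro_batch_size
    return train, evaluation


defaults = resolve_micro_batch_sizes(100)
train_only = resolve_micro_batch_sizes(100, micro_batch_size=10)
both = resolve_micro_batch_sizes(100, micro_batch_size=10, eval_micro_batch_size=50)
```

Because evaluation does not store gradients, a larger eval_micro_batch_size than micro_batch_size often fits in memory, which is one reason to set the two separately.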

Example

  1. Steps with a batch size of 100, loading 10 samples at a time:

    Problem(
        batch_size=100,
        hw_config=StrictManualConfig(
            num_devices=1,
            micro_batch_size=10
        ),
        ...,
    )
    
  2. Steps with a batch size of 100, loading 10 samples at a time with 2 GPUs:

    Problem(
        batch_size=100,
        hw_config=StrictManualConfig(
            num_devices=2,
            micro_batch_size=10,
            device="cuda",
        ),
        ...,
    )
    
  3. Invalid configuration: The batch size is not a multiple of the micro batch size:

    Problem(
        batch_size=100,
        hw_config=StrictManualConfig(
            num_devices=1,
            micro_batch_size=15
        ),
        ...,
    )
    
  4. Invalid configuration: The batch_size is not a multiple of the micro_batch_size * num_devices:

    Problem(
        batch_size=100,
        hw_config=StrictManualConfig(
            num_devices=3,
            micro_batch_size=50
        ),
        ...,
    )
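
The divisibility rule behind the four examples above can be checked with a short sketch. The helper accumulation_steps is hypothetical and not part of optexp. The rule that batch_size must be a multiple of micro_batch_size * num_devices comes from example 4; treating the quotient as the number of gradient-accumulation passes per optimization step is an assumption made here.

```python
def accumulation_steps(batch_size, num_devices, micro_batch_size):
    """Number of accumulation passes per optimization step (hypothetical helper)."""
    samples_per_pass = micro_batch_size * num_devices
    if batch_size % samples_per_pass != 0:
        raise ValueError(
            f"batch_size={batch_size} is not a multiple of "
            f"micro_batch_size * num_devices = {samples_per_pass}"
        )
    return batch_size // samples_per_pass


# (batch_size, num_devices, micro_batch_size) for examples 1 to 4 above.
examples = [(100, 1, 10), (100, 2, 10), (100, 1, 15), (100, 3, 50)]

results = []
for batch_size, num_devices, micro_batch_size in examples:
    try:
        results.append(accumulation_steps(batch_size, num_devices, micro_batch_size))
    except ValueError:
        # Invalid configuration: recorded as None instead of a step count.
        results.append(None)
```

Under these assumptions, examples 1 and 2 are accepted and examples 3 and 4 are rejected, matching the descriptions in the list.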