ultralytics

14 KiB

Raw Blame History

Argument	Default	Description
`model`	`None`	Specifies the model file for training. Accepts a path to either a `.pt` pretrained model or a `.yaml` configuration file. Essential for defining the model structure or initializing weights.
`data`	`None`	Path to the dataset configuration file (e.g., `coco8.yaml`). This file contains dataset-specific parameters, including paths to training and validation data, class names, and number of classes.
`epochs`	`100`	Total number of training epochs. Each epoch represents a full pass over the entire dataset. Adjusting this value can affect training duration and model performance.
`time`	`None`	Maximum training time in hours. If set, this overrides the `epochs` argument, allowing training to automatically stop after the specified duration. Useful for time-constrained training scenarios.
`patience`	`100`	Number of epochs to wait without improvement in validation metrics before early stopping the training. Helps prevent overfitting by stopping training when performance plateaus.
`batch`	`16`	Batch size, with three modes: set as an integer (e.g., `batch=16`), auto mode for 60% GPU memory utilization (`batch=-1`), or auto mode with specified utilization fraction (`batch=0.70`).
`imgsz`	`640`	Target image size for training. All images are resized to this dimension before being fed into the model. Affects model accuracy and computational complexity.
`save`	`True`	Enables saving of training checkpoints and final model weights. Useful for resuming training or model deployment.
`save_period`	`-1`	Frequency of saving model checkpoints, specified in epochs. A value of -1 disables this feature. Useful for saving interim models during long training sessions.
`cache`	`False`	Enables caching of dataset images in memory (`True`/`ram`), on disk (`disk`), or disables it (`False`). Improves training speed by reducing disk I/O at the cost of increased memory usage.
`device`	`None`	Specifies the computational device(s) for training: a single GPU (`device=0`), multiple GPUs (`device=0,1`), CPU (`device=cpu`), or MPS for Apple silicon (`device=mps`).
`workers`	`8`	Number of worker threads for data loading (per `RANK` if Multi-GPU training). Influences the speed of data preprocessing and feeding into the model, especially useful in multi-GPU setups.
`project`	`None`	Name of the project directory where training outputs are saved. Allows for organized storage of different experiments.
`name`	`None`	Name of the training run. Used for creating a subdirectory within the project folder, where training logs and outputs are stored.
`exist_ok`	`False`	If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs.
`pretrained`	`True`	Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance.
`optimizer`	`'auto'`	Choice of optimizer for training. Options include `SGD`, `Adam`, `AdamW`, `NAdam`, `RAdam`, `RMSProp` etc., or `auto` for automatic selection based on model configuration. Affects convergence speed and stability.
`seed`	`0`	Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations.
`deterministic`	`True`	Forces deterministic algorithm use, ensuring reproducibility but may affect performance and speed due to the restriction on non-deterministic algorithms.
`single_cls`	`False`	Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification.
`classes`	`None`	Specifies a list of class IDs to train on. Useful for filtering out and focusing only on certain classes during training.
`rect`	`False`	Enables rectangular training, optimizing batch composition for minimal padding. Can improve efficiency and speed but may affect model accuracy.
`cos_lr`	`False`	Utilizes a cosine learning rate scheduler, adjusting the learning rate following a cosine curve over epochs. Helps in managing learning rate for better convergence.
`close_mosaic`	`10`	Disables mosaic data augmentation in the last N epochs to stabilize training before completion. Setting to 0 disables this feature.
`resume`	`False`	Resumes training from the last saved checkpoint. Automatically loads model weights, optimizer state, and epoch count, continuing training seamlessly.
`amp`	`True`	Enables Automatic Mixed Precision (AMP) training, reducing memory usage and possibly speeding up training with minimal impact on accuracy.
`fraction`	`1.0`	Specifies the fraction of the dataset to use for training. Allows for training on a subset of the full dataset, useful for experiments or when resources are limited.
`profile`	`False`	Enables profiling of ONNX and TensorRT speeds during training, useful for optimizing model deployment.
`freeze`	`None`	Freezes the first N layers of the model or specified layers by index, reducing the number of trainable parameters. Useful for fine-tuning or transfer learning.
`lr0`	`0.01`	Initial learning rate (i.e. `SGD=1E-2`, `Adam=1E-3`) . Adjusting this value is crucial for the optimization process, influencing how rapidly model weights are updated.
`lrf`	`0.01`	Final learning rate as a fraction of the initial rate = (`lr0 * lrf`), used in conjunction with schedulers to adjust the learning rate over time.
`momentum`	`0.937`	Momentum factor for SGD or beta1 for Adam optimizers, influencing the incorporation of past gradients in the current update.
`weight_decay`	`0.0005`	L2 regularization term, penalizing large weights to prevent overfitting.
`warmup_epochs`	`3.0`	Number of epochs for learning rate warmup, gradually increasing the learning rate from a low value to the initial learning rate to stabilize training early on.
`warmup_momentum`	`0.8`	Initial momentum for warmup phase, gradually adjusting to the set momentum over the warmup period.
`warmup_bias_lr`	`0.1`	Learning rate for bias parameters during the warmup phase, helping stabilize model training in the initial epochs.
`box`	`7.5`	Weight of the box loss component in the loss function, influencing how much emphasis is placed on accurately predicting bounding box coordinates.
`cls`	`0.5`	Weight of the classification loss in the total loss function, affecting the importance of correct class prediction relative to other components.
`dfl`	`1.5`	Weight of the distribution focal loss, used in certain YOLO versions for fine-grained classification.
`pose`	`12.0`	Weight of the pose loss in models trained for pose estimation, influencing the emphasis on accurately predicting pose keypoints.
`kobj`	`2.0`	Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy.
`nbs`	`64`	Nominal batch size for normalization of loss.
`overlap_mask`	`True`	Determines whether object masks should be merged into a single mask for training, or kept separate for each object. In case of overlap, the smaller mask is overlayed on top of the larger mask during merge.
`mask_ratio`	`4`	Downsample ratio for segmentation masks, affecting the resolution of masks used during training.
`dropout`	`0.0`	Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training.
`val`	`True`	Enables validation during training, allowing for periodic evaluation of model performance on a separate dataset.
`plots`	`False`	Generates and saves plots of training and validation metrics, as well as prediction examples, providing visual insights into model performance and learning progression.

14 KiB Raw Blame History

14 KiB

Raw Blame History