@@ -145,27 +145,43 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration i
!!! example
=== "Python"
```{ .py .annotate }
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.export(
    format="engine",
    dynamic=True,  # (1)!
    batch=8,  # (2)!
    workspace=4,  # (3)!
    int8=True,
    data="coco.yaml",  # (4)!
)

# Load the exported TensorRT INT8 model
model = YOLO("yolov8n.engine", task="detect")

# Run inference
result = model.predict("https://ultralytics.com/images/bus.jpg")
```
1. Exports with dynamic axes. This is enabled by default when exporting with `int8=True`, even if it is not set explicitly. See [export arguments](../modes/export.md#arguments) for additional information.
2. Sets a maximum batch size of 8 for the exported model, which calibrates with `batch = 2 * 8` to avoid scaling errors during calibration.
3. Allocates 4 GiB of memory for the conversion process instead of the entire device.
4. Uses the [COCO dataset](../datasets/detect/coco.md) for calibration, specifically the images used for [validation](../modes/val.md) (5,000 total).
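
The arithmetic behind annotations 2 and 4 can be sketched as follows. This is plain Python and does not call the Ultralytics exporter; the helper name `calibration_batches` is hypothetical, and counting a final, partially filled batch is an assumption rather than documented exporter behavior.

```python
import math


def calibration_batches(num_images: int, export_batch: int) -> tuple[int, int]:
    """Return (calibration batch size, number of calibration batches).

    Hypothetical helper for illustration only; not part of the Ultralytics API.
    """
    # Annotation 2: calibration runs with twice the export batch size
    calib_batch = 2 * export_batch
    # Assumption: a final, partially filled batch still counts as one batch
    num_batches = math.ceil(num_images / calib_batch)
    return calib_batch, num_batches


# Values taken from the export call: batch=8 and the 5,000 COCO validation images
calib_batch, num_batches = calibration_batches(num_images=5_000, export_batch=8)
print(calib_batch, num_batches)
```

With the settings shown in the export call, this gives a calibration batch size of 16 and 313 batches if every validation image is used once.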
=== "CLI"
```bash
# Export a YOLOv8n PyTorch model to TensorRT format with INT8 quantization