TensorRT10 with JetPack 6.0 Docs update (#11779)

Co-authored-by: Glenn Jocher <glenn.jocher@ultralytics.com>
Co-authored-by: Lakshantha Dissanayake <lakshanthad@yahoo.com>
Authored by Burhan, committed by GitHub
parent 303579c35e
commit 10b3564a1b
1 changed file (78 lines changed):
    docs/en/integrations/tensorrt.md

@@ -145,27 +145,43 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration images
!!! example

    === "Python"

        ```{ .py .annotate }
        from ultralytics import YOLO

        model = YOLO("yolov8n.pt")
        model.export(
            format="engine",
            dynamic=True,  # (1)!
            batch=8,  # (2)!
            workspace=4,  # (3)!
            int8=True,
            data="coco.yaml",  # (4)!
        )

        # Load the exported TensorRT INT8 model
        model = YOLO("yolov8n.engine", task="detect")

        # Run inference
        result = model.predict("https://ultralytics.com/images/bus.jpg")
        ```

        1. Exports with dynamic axes; this is enabled by default when exporting with `int8=True` even when not explicitly set. See [export arguments](../modes/export.md#arguments) for additional information.
        2. Sets a max batch size of 8 for the exported model, which calibrates with `batch = 2 * 8` to avoid scaling errors during calibration.
        3. Allocates 4 GiB of memory instead of the entire device for the conversion process.
        4. Uses the [COCO dataset](../datasets/detect/coco.md) for calibration, specifically the images used for [validation](../modes/val.md) (5,000 total).

    === "CLI"

        ```bash
        # Export a YOLOv8n PyTorch model to TensorRT format with INT8 quantization
        yolo export model=yolov8n.pt format=engine batch=8 workspace=4 int8=True data=coco.yaml  # creates 'yolov8n.engine'

        # Run inference with the exported TensorRT quantized model
        yolo predict model=yolov8n.engine source='https://ultralytics.com/images/bus.jpg'
        ```
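After export, it is worth sanity-checking how much accuracy the INT8 quantization costs before deploying. A minimal sketch using the standard Ultralytics `val` mode, assuming the `yolov8n.engine` file produced above and the same `coco.yaml` dataset used for calibration:

```python
from ultralytics import YOLO

# Load the exported TensorRT INT8 engine (assumes the export above succeeded)
model = YOLO("yolov8n.engine", task="detect")

# Validate on the calibration dataset to gauge quantization accuracy loss
metrics = model.val(data="coco.yaml", batch=1, imgsz=640)
print(f"mAP50-95: {metrics.box.map:.3f}, mAP50: {metrics.box.map50:.3f}")
```

Comparing these numbers against an FP16 export of the same model is the quickest way to decide whether the latency gain justifies the accuracy drop.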
???+ warning "Calibration Cache"
@@ -240,12 +256,12 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration images
| Precision | Eval test        | mean<br>(ms) | min \| max<br>(ms) | top-1 | top-5 | `batch` | size<br><sup>(pixels) |
|-----------|------------------|--------------|--------------------|-------|-------|---------|-----------------------|
| FP32      | Predict          | 0.26         | 0.25 \| 0.28       |       |       | 8       | 640                   |
| FP32      | ImageNet<sup>val | 0.26         |                    | 0.35  | 0.61  | 1       | 640                   |
| FP16      | Predict          | 0.18         | 0.17 \| 0.19       |       |       | 8       | 640                   |
| FP16      | ImageNet<sup>val | 0.18         |                    | 0.35  | 0.61  | 1       | 640                   |
| INT8      | Predict          | 0.16         | 0.15 \| 0.57       |       |       | 8       | 640                   |
| INT8      | ImageNet<sup>val | 0.15         |                    | 0.32  | 0.59  | 1       | 640                   |
=== "Pose (COCO)"
@@ -338,19 +354,19 @@ Experimentation by NVIDIA led them to recommend using at least 500 calibration images
=== "Jetson Orin NX 16GB"
Tested with JetPack 6.0 (L4T 36.3) Ubuntu 22.04.4 LTS, `python 3.10.12`, `ultralytics==8.2.16`, `tensorrt==10.0.1`
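When reproducing these benchmarks, it helps to first confirm the environment matches the tested configuration. A minimal sketch, assuming `tensorrt` and `ultralytics` are installed as Python packages:

```python
# Print the versions that determine which benchmark results apply
import platform

import tensorrt
import ultralytics

print(f"Python:      {platform.python_version()}")
print(f"ultralytics: {ultralytics.__version__}")
print(f"tensorrt:    {tensorrt.__version__}")  # 10.x on JetPack 6.0
```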
!!! note

    Inference times are shown for `mean`, `min` (fastest), and `max` (slowest) for each test using pre-trained weights `yolov8n.engine`. A sketch for reproducing these statistics follows the table below.
| Precision | Eval test | mean<br>(ms) | min \| max<br>(ms) | mAP<sup>val<br>50(B) | mAP<sup>val<br>50-95(B) | `batch` | size<br><sup>(pixels) |
|-----------|--------------|--------------|--------------------|----------------------|-------------------------|---------|-----------------------|
| FP32 | Predict | 6.11 | 6.10 \| 6.29 | | | 8 | 640 |
| FP32 | COCO<sup>val | 6.17 | | 0.52 | 0.37 | 1 | 640 |
| FP16 | Predict | 3.18 | 3.18 \| 3.20 | | | 8 | 640 |
| FP16 | COCO<sup>val | 3.19 | | 0.52 | 0.37 | 1 | 640 |
| INT8 | Predict | 2.30 | 2.29 \| 2.35 | | | 8 | 640 |
| INT8 | COCO<sup>val | 2.32 | | 0.46 | 0.32 | 1 | 640 |
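The exact harness used to collect these statistics is not shown in the diff; a minimal sketch of one way to gather comparable `mean`/`min`/`max` inference times, assuming the exported `yolov8n.engine` and the `bus.jpg` image bundled with the `ultralytics` package:

```python
from ultralytics import YOLO
from ultralytics.utils import ASSETS

model = YOLO("yolov8n.engine", task="detect")

# Warm-up runs so one-time engine initialization doesn't skew the timings
for _ in range(5):
    model.predict(ASSETS / "bus.jpg", imgsz=640, verbose=False)

# Collect per-image inference times (milliseconds) over repeated runs
times = [model.predict(ASSETS / "bus.jpg", imgsz=640, verbose=False)[0].speed["inference"] for _ in range(50)]
print(f"mean {sum(times) / len(times):.2f} ms | min {min(times):.2f} ms | max {max(times):.2f} ms")
```

Note that the `Predict` rows above use `batch=8`, so single-image timings gathered this way will differ somewhat from the published per-image figures.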
!!! info
