# YOLOv8-TensorRT

`YOLOv8` accelerated with TensorRT!

---
[![Build Status](https://img.shields.io/endpoint.svg?url=https%3A%2F%2Factions-badge.atrox.dev%2Fatrox%2Fsync-dotenv%2Fbadge&style=flat)](https://github.com/triple-Mu/YOLOv8-TensorRT)
[![Python Version](https://img.shields.io/badge/Python-3.8--3.10-FFD43B?logo=python)](https://github.com/triple-Mu/YOLOv8-TensorRT)
[![img](https://badgen.net/badge/icon/tensorrt?icon=azurepipelines&label)](https://developer.nvidia.com/tensorrt)
[![C++](https://img.shields.io/badge/CPP-11%2F14-yellow)](https://github.com/triple-Mu/YOLOv8-TensorRT)
[![img](https://badgen.net/github/license/triple-Mu/YOLOv8-TensorRT)](https://github.com/triple-Mu/YOLOv8-TensorRT/blob/main/LICENSE)
[![img](https://badgen.net/github/prs/triple-Mu/YOLOv8-TensorRT)](https://github.com/triple-Mu/YOLOv8-TensorRT/pulls)
[![img](https://img.shields.io/github/stars/triple-Mu/YOLOv8-TensorRT?color=ccf)](https://github.com/triple-Mu/YOLOv8-TensorRT)

---


# Prepare the environment

1. Install `CUDA` by following the [`CUDA official website`](https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#download-the-nvidia-cuda-toolkit).

   🚀 RECOMMENDED `CUDA` >= 11.4

2. Install `TensorRT` by following the [`TensorRT official website`](https://developer.nvidia.com/nvidia-tensorrt-8x-download).

   🚀 RECOMMENDED `TensorRT` >= 8.4

3. Install the Python requirements.

   ``` shell
   pip install -r requirement.txt
   ```

4. Install the [`ultralytics`](https://github.com/ultralytics/ultralytics) package for ONNX export or TensorRT API building.

   ``` shell
   pip install ultralytics
   ```

5. Prepare your own PyTorch weights, such as `yolov8s.pt` or `yolov8s-seg.pt`.

***NOTICE:***

Please use the latest `CUDA` and `TensorRT` versions to get the fastest speed!

If you have to use an older version of `CUDA` or `TensorRT`, please read the relevant issues carefully first!
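
To check an installed version against the recommended minimums listed above, a small comparison helper can be handy. This is only a sketch: `parse_version` and `meets_minimum` are hypothetical names, and you have to supply the installed version string yourself (for example from `tensorrt.__version__` or the output of `nvcc --version`).

```python
def parse_version(text):
    """Turn '8.6.1' into (8, 6, 1). Parsing stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True when `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that (8, 4) and (8, 4, 0) compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Recommended minimums quoted from this README.
RECOMMENDED = {"CUDA": "11.4", "TensorRT": "8.4"}

# The version strings below are made-up examples, not detected values.
print(meets_minimum("8.6.1", RECOMMENDED["TensorRT"]))
print(meets_minimum("11.3", RECOMMENDED["CUDA"]))
```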

# Normal Usage

If you export the ONNX model with the original [`ultralytics`](https://github.com/ultralytics/ultralytics) repo, you have to build the engine yourself.

In that case, only the `c++` inference code can deserialize the engine and run inference. The other scripts won't work.

You can get more information in [`Normal.md`](docs/Normal.md)!

# Export End2End ONNX with NMS

You can export your ONNX model with the `ultralytics` API and add postprocessing, such as the bbox decoder and `NMS`, into the ONNX model at the same time.

``` shell
python3 export-det.py \
--weights yolov8s.pt \
--iou-thres 0.65 \
--conf-thres 0.25 \
--topk 100 \
--opset 11 \
--sim \
--input-shape 1 3 640 640 \
--device cuda:0
```

#### Description of all arguments

- `--weights` : The PyTorch model you trained.
- `--iou-thres` : IoU threshold for the NMS plugin.
- `--conf-thres` : Confidence threshold for the NMS plugin.
- `--topk` : Maximum number of detection bboxes.
- `--opset` : ONNX opset version. Default is 11.
- `--sim` : Whether to simplify your ONNX model.
- `--input-shape` : Input shape for your model. It should have 4 dimensions.
- `--device` : The CUDA device used for the export.

You will get an ONNX model whose file name prefix is the same as that of the input weights.
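
As a rough illustration of that naming rule, the output path can be derived from the weights path by swapping the suffix. This is a sketch under that assumption. `onnx_path_for` is a hypothetical helper and not part of `export-det.py`.

```python
from pathlib import Path


def onnx_path_for(weights):
    """Keep the directory and file stem, replace the suffix with .onnx."""
    return Path(weights).with_suffix(".onnx")


print(onnx_path_for("yolov8s.pt").name)
print(onnx_path_for("yolov8s-seg.pt").name)
```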

### Just Taste First

If you just want a quick taste first, you can download the ONNX models that were exported with the `YOLOv8` package and modified by me.

[**YOLOv8-n**](https://triplemu.oss-cn-beijing.aliyuncs.com/YOLOv8/ONNX/yolov8n_nms.onnx?OSSAccessKeyId=LTAI5tN1dgmZD4PF8AJUXp3J&Expires=1772936700&Signature=r6HgJTTcCSAxQxD9bKO9qBTtigQ%3D)

[**YOLOv8-s**](https://triplemu.oss-cn-beijing.aliyuncs.com/YOLOv8/ONNX/yolov8s_nms.onnx?OSSAccessKeyId=LTAI5tN1dgmZD4PF8AJUXp3J&Expires=1682936722&Signature=JjxQFx1YElcVdsCaMoj81KJ4a5s%3D)

[**YOLOv8-m**](https://triplemu.oss-cn-beijing.aliyuncs.com/YOLOv8/ONNX/yolov8m_nms.onnx?OSSAccessKeyId=LTAI5tN1dgmZD4PF8AJUXp3J&Expires=1682936739&Signature=IRKBELdVFemD7diixxxgzMYqsWg%3D)

[**YOLOv8-l**](https://triplemu.oss-cn-beijing.aliyuncs.com/YOLOv8/ONNX/yolov8l_nms.onnx?OSSAccessKeyId=LTAI5tN1dgmZD4PF8AJUXp3J&Expires=1682936763&Signature=RGkJ4G2XJ4J%2BNiki5cJi3oBkDnA%3D)

[**YOLOv8-x**](https://triplemu.oss-cn-beijing.aliyuncs.com/YOLOv8/ONNX/yolov8x_nms.onnx?OSSAccessKeyId=LTAI5tN1dgmZD4PF8AJUXp3J&Expires=1673936778&Signature=3o%2F7QKhiZg1dW3I6sDrY4ug6MQU%3D)

# Build End2End Engine from ONNX
### 1. Build Engine by TensorRT ONNX Python API

You can build a TensorRT engine from ONNX with [`build.py`](build.py).

Usage:

``` shell
python3 build.py \
--weights yolov8s.onnx \
--iou-thres 0.65 \
--conf-thres 0.25 \
--topk 100 \
--fp16  \
--device cuda:0
```

#### Description of all arguments

- `--weights` : The ONNX model you exported or downloaded.
- `--iou-thres` : IoU threshold for the NMS plugin.
- `--conf-thres` : Confidence threshold for the NMS plugin.
- `--topk` : Maximum number of detection bboxes.
- `--fp16` : Whether to export a half-precision engine.
- `--device` : The CUDA device used to build the engine.

You can adjust `iou-thres`, `conf-thres`, and `topk` yourself.
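
To get a feeling for what these three values control, here is a minimal pure-Python sketch of greedy, class-agnostic NMS. It is only an illustration of the general technique, not the implementation used by the TensorRT NMS plugin. The function names, the `(x1, y1, x2, y2)` box format, and the sample boxes are assumptions made for this example.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_thres=0.65, conf_thres=0.25, topk=100):
    """Return the indices of the kept boxes, highest score first."""
    # conf_thres: drop low-confidence candidates before anything else.
    order = [i for i, s in enumerate(scores) if s >= conf_thres]
    order.sort(key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # topk: stop once enough detections have been kept.
        if len(keep) >= topk:
            break
        # iou_thres: skip a box that overlaps a kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 10, 10)]
scores = [0.9, 0.8, 0.7, 0.1]
print(nms(boxes, scores))                 # default thresholds
print(nms(boxes, scores, iou_thres=0.7))  # tolerate more overlap
print(nms(boxes, scores, topk=1))         # keep only the best box
```

A higher `iou-thres` keeps more overlapping boxes, a higher `conf-thres` discards more weak detections, and `topk` caps the number of results.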

### 2. Export Engine by Trtexec Tools

You can export a TensorRT engine with the [`trtexec`](https://github.com/NVIDIA/TensorRT/tree/main/samples/trtexec) tool.

Usage:

``` shell
/usr/src/tensorrt/bin/trtexec \
--onnx=yolov8s.onnx \
--saveEngine=yolov8s.engine \
--fp16
```

**If you installed TensorRT from a Debian package, `trtexec` is installed at `/usr/src/tensorrt/bin/trtexec`.**

**If you installed TensorRT from a tar package, `trtexec` is in the `bin` folder under the path where you decompressed it.**
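
If you want to script this step, the following sketch looks for `trtexec` in the two locations described above and assembles the same command line. The helper names and the `tar_root` parameter are hypothetical. Only the `--onnx`, `--saveEngine`, and `--fp16` flags shown in the usage above are used.

```python
import shutil
from pathlib import Path

# Location used by the Debian package, as stated in this README.
DEB_TRTEXEC = "/usr/src/tensorrt/bin/trtexec"


def find_trtexec(tar_root=None):
    """Return a path to trtexec, or None if it cannot be found.

    tar_root: directory where a TensorRT tar package was decompressed
    (an assumption of this sketch, pass None if you used the Debian package).
    """
    candidates = []
    if tar_root is not None:
        candidates.append(Path(tar_root) / "bin" / "trtexec")
    candidates.append(Path(DEB_TRTEXEC))
    for candidate in candidates:
        if candidate.is_file():
            return str(candidate)
    # Last resort: whatever is on PATH.
    return shutil.which("trtexec")


def trtexec_command(trtexec, onnx, engine, fp16=True):
    """Build the argument list for the trtexec call shown above."""
    cmd = [trtexec, f"--onnx={onnx}", f"--saveEngine={engine}"]
    if fp16:
        cmd.append("--fp16")
    return cmd


cmd = trtexec_command(find_trtexec() or DEB_TRTEXEC, "yolov8s.onnx", "yolov8s.engine")
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where TensorRT is installed.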

# Build TensorRT Engine by TensorRT API

Please see [`API-Build.md`](docs/API-Build.md) for more information.

***Notice!!!*** The YOLOv8-seg model is not supported yet!

# Inference

## 1. Infer with python script

You can run inference on images with the engine using [`infer-det.py`](infer-det.py).

Usage:

``` shell
python3 infer-det.py \
--engine yolov8s.engine \
--imgs data \
--show \
--out-dir outputs \
--device cuda:0
```

#### Description of all arguments

- `--engine` : The engine you exported.
- `--imgs` : Path to the images you want to detect.
- `--show` : Whether to show the detection results.
- `--out-dir` : Where to save the detection result images. It has no effect when the `--show` flag is used.
- `--device` : The CUDA device to use.
- `--profile` : Profile the TensorRT engine.

## 2. Infer with C++

You can run inference with C++ using the code in [`csrc/detect/end2end`](csrc/detect/end2end).

### Build:

Please set your own library paths in [`CMakeLists.txt`](csrc/detect/end2end/CMakeLists.txt) and modify `CLASS_NAMES` and `COLORS` in [`main.cpp`](csrc/detect/end2end/main.cpp).

``` shell
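# remember the repository root
export root=${PWD}
cd csrc/detect/end2end
# configure and build in a separate build directory
mkdir -p build && cd build
cmake ..
make
# move the binary back to the repository root
mv yolov8 ${root}
cd ${root}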
```

Usage:

``` shell
# infer image
./yolov8 yolov8s.engine data/bus.jpg
# infer images
./yolov8 yolov8s.engine data
# infer video
./yolov8 yolov8s.engine data/test.mp4 # the video path
```

# TensorRT Segment Deploy

Please see more information in [`Segment.md`](docs/Segment.md)

# DeepStream Detection Deploy

See more in [`README.md`](csrc/deepstream/README.md)

# Jetson Deploy

Only tested on `Jetson-NX 4GB`.
See more in [`Jetson.md`](docs/Jetson.md)

# Profile your engine

If you want to profile the TensorRT engine:

Usage:

``` shell
python3 trt-profile.py --engine yolov8s.engine --device cuda:0
```

# Refuse To Use PyTorch for Model Inference!!!

If you want to drop the PyTorch dependency and run inference with TensorRT only,
see [`infer-det-without-torch.py`](infer-det-without-torch.py).
Its usage is the same as the PyTorch version, but its performance is much worse.

You can use `cuda-python` or `pycuda` for inference.
Install one of them with:

```shell
pip install cuda-python
# or
pip install pycuda
```

Usage:

``` shell
python3 infer-det-without-torch.py \
--engine yolov8s.engine \
--imgs data \
--show \
--out-dir outputs \
--method cudart
```

#### Description of all arguments

- `--engine` : The engine you exported.
- `--imgs` : Path to the images you want to detect.
- `--show` : Whether to show the detection results.
- `--out-dir` : Where to save the detection result images. It has no effect when the `--show` flag is used.
- `--method` : Choose `cudart` or `pycuda`, default is `cudart`.
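
Before choosing a `--method`, you may want to check which of the two packages is installed. The sketch below is one possible way to do that. It assumes that `cuda-python` is importable as `cuda` and `pycuda` as `pycuda`. `available_methods` and `pick_method` are hypothetical helpers and not part of this repo.

```python
import importlib.util

# Assumed import names: `cuda-python` -> `cuda`, `pycuda` -> `pycuda`.
BACKENDS = {"cudart": "cuda", "pycuda": "pycuda"}


def available_methods(find_spec=importlib.util.find_spec):
    """List the --method values whose package can be imported."""
    return [method for method, module in BACKENDS.items()
            if find_spec(module) is not None]


def pick_method(preferred="cudart", find_spec=importlib.util.find_spec):
    """Prefer `preferred`, fall back to any installed backend, else None."""
    found = available_methods(find_spec)
    if preferred in found:
        return preferred
    return found[0] if found else None


# Prints the usable method on this machine, or None if neither is installed.
print(pick_method())
```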