From aa463705f40645245316306bf970ab72edd13faf Mon Sep 17 00:00:00 2001 From: Lakshantha Dissanayake Date: Mon, 28 Oct 2024 14:46:36 -0700 Subject: [PATCH 01/46] Update NVIDIA Jetson Guide with YOLO11 (#17206) Co-authored-by: Glenn Jocher --- docs/en/guides/nvidia-jetson.md | 216 ++++++++++++++++---------------- 1 file changed, 108 insertions(+), 108 deletions(-) diff --git a/docs/en/guides/nvidia-jetson.md b/docs/en/guides/nvidia-jetson.md index 16793288a2..8a43d978b1 100644 --- a/docs/en/guides/nvidia-jetson.md +++ b/docs/en/guides/nvidia-jetson.md @@ -1,12 +1,12 @@ --- comments: true -description: Learn to deploy Ultralytics YOLOv8 on NVIDIA Jetson devices with our detailed guide. Explore performance benchmarks and maximize AI capabilities. -keywords: Ultralytics, YOLOv8, NVIDIA Jetson, JetPack, AI deployment, performance benchmarks, embedded systems, deep learning, TensorRT, computer vision +description: Learn to deploy Ultralytics YOLO11 on NVIDIA Jetson devices with our detailed guide. Explore performance benchmarks and maximize AI capabilities. +keywords: Ultralytics, YOLO11, NVIDIA Jetson, JetPack, AI deployment, performance benchmarks, embedded systems, deep learning, TensorRT, computer vision --- -# Quick Start Guide: NVIDIA Jetson with Ultralytics YOLOv8 +# Quick Start Guide: NVIDIA Jetson with Ultralytics YOLO11 -This comprehensive guide provides a detailed walkthrough for deploying Ultralytics YOLOv8 on [NVIDIA Jetson](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) devices. Additionally, it showcases performance benchmarks to demonstrate the capabilities of YOLOv8 on these small and powerful devices. +This comprehensive guide provides a detailed walkthrough for deploying Ultralytics YOLO11 on [NVIDIA Jetson](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) devices. Additionally, it showcases performance benchmarks to demonstrate the capabilities of YOLO11 on these small and powerful devices.


@@ -16,7 +16,7 @@ This comprehensive guide provides a detailed walkthrough for deploying Ultralyti allowfullscreen>
- Watch: How to Setup NVIDIA Jetson with Ultralytics YOLOv8 + Watch: How to Setup NVIDIA Jetson with Ultralytics YOLO11

NVIDIA Jetson Ecosystem @@ -77,7 +77,7 @@ The below table highlights NVIDIA JetPack versions supported by different NVIDIA ## Quick Start with Docker -The fastest way to get started with Ultralytics YOLOv8 on NVIDIA Jetson is to run with pre-built docker images for Jetson. Refer to the table above and choose the JetPack version according to the Jetson device you own. +The fastest way to get started with Ultralytics YOLO11 on NVIDIA Jetson is to run with pre-built docker images for Jetson. Refer to the table above and choose the JetPack version according to the Jetson device you own. === "JetPack 4" @@ -242,7 +242,7 @@ Out of all the model export formats supported by Ultralytics, TensorRT delivers ### Convert Model to TensorRT and Run Inference -The YOLOv8n model in PyTorch format is converted to TensorRT to run inference with the exported model. +The YOLO11n model in PyTorch format is converted to TensorRT to run inference with the exported model. !!! example @@ -251,14 +251,14 @@ The YOLOv8n model in PyTorch format is converted to TensorRT to run inference wi ```python from ultralytics import YOLO - # Load a YOLOv8n PyTorch model - model = YOLO("yolov8n.pt") + # Load a YOLO11n PyTorch model + model = YOLO("yolo11n.pt") # Export the model to TensorRT - model.export(format="engine") # creates 'yolov8n.engine' + model.export(format="engine") # creates 'yolo11n.engine' # Load the exported TensorRT model - trt_model = YOLO("yolov8n.engine") + trt_model = YOLO("yolo11n.engine") # Run inference results = trt_model("https://ultralytics.com/images/bus.jpg") @@ -267,11 +267,11 @@ The YOLOv8n model in PyTorch format is converted to TensorRT to run inference wi === "CLI" ```bash - # Export a YOLOv8n PyTorch model to TensorRT format - yolo export model=yolov8n.pt format=engine # creates 'yolov8n.engine' + # Export a YOLO11n PyTorch model to TensorRT format + yolo export model=yolo11n.pt format=engine # creates 'yolo11n.engine' # Run inference with the exported model - yolo 
predict model=yolov8n.engine source='https://ultralytics.com/images/bus.jpg' + yolo predict model=yolo11n.engine source='https://ultralytics.com/images/bus.jpg' ``` ### Use NVIDIA Deep Learning Accelerator (DLA) @@ -292,14 +292,14 @@ The following Jetson devices are equipped with DLA hardware: ```python from ultralytics import YOLO - # Load a YOLOv8n PyTorch model - model = YOLO("yolov8n.pt") + # Load a YOLO11n PyTorch model + model = YOLO("yolo11n.pt") # Export the model to TensorRT with DLA enabled (only works with FP16 or INT8) model.export(format="engine", device="dla:0", half=True) # dla:0 or dla:1 corresponds to the DLA cores # Load the exported TensorRT model - trt_model = YOLO("yolov8n.engine") + trt_model = YOLO("yolo11n.engine") # Run inference results = trt_model("https://ultralytics.com/images/bus.jpg") @@ -308,119 +308,119 @@ The following Jetson devices are equipped with DLA hardware: === "CLI" ```bash - # Export a YOLOv8n PyTorch model to TensorRT format with DLA enabled (only works with FP16 or INT8) - yolo export model=yolov8n.pt format=engine device="dla:0" half=True # dla:0 or dla:1 corresponds to the DLA cores + # Export a YOLO11n PyTorch model to TensorRT format with DLA enabled (only works with FP16 or INT8) + yolo export model=yolo11n.pt format=engine device="dla:0" half=True # dla:0 or dla:1 corresponds to the DLA cores # Run inference with the exported model on the DLA - yolo predict model=yolov8n.engine source='https://ultralytics.com/images/bus.jpg' + yolo predict model=yolo11n.engine source='https://ultralytics.com/images/bus.jpg' ``` !!! 
note Visit the [Export page](../modes/export.md#arguments) to access additional arguments when exporting models to different model formats -## NVIDIA Jetson Orin YOLOv8 Benchmarks +## NVIDIA Jetson Orin YOLO11 Benchmarks -YOLOv8 benchmarks were run by the Ultralytics team on 10 different model formats measuring speed and [accuracy](https://www.ultralytics.com/glossary/accuracy): PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, NCNN. Benchmarks were run on Seeed Studio reComputer J4012 powered by Jetson Orin NX 16GB device at FP32 [precision](https://www.ultralytics.com/glossary/precision) with default input image size of 640. +YOLO11 benchmarks were run by the Ultralytics team on 10 different model formats measuring speed and [accuracy](https://www.ultralytics.com/glossary/accuracy): PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, NCNN. Benchmarks were run on Seeed Studio reComputer J4012 powered by Jetson Orin NX 16GB device at FP32 [precision](https://www.ultralytics.com/glossary/precision) with default input image size of 640. ### Comparison Chart Even though all model exports are working with NVIDIA Jetson, we have only included **PyTorch, TorchScript, TensorRT** for the comparison chart below because, they make use of the GPU on the Jetson and are guaranteed to produce the best results. All the other exports only utilize the CPU and the performance is not as good as the above three. You can find benchmarks for all exports in the section after this chart.
- NVIDIA Jetson Ecosystem + NVIDIA Jetson Ecosystem
### Detailed Comparison Table -The below table represents the benchmark results for five different models (YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, YOLOv8x) across ten different formats (PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, NCNN), giving us the status, size, mAP50-95(B) metric, and inference time for each combination. +The below table represents the benchmark results for five different models (YOLO11n, YOLO11s, YOLO11m, YOLO11l, YOLO11x) across ten different formats (PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, TF SavedModel, TF GraphDef, TF Lite, PaddlePaddle, NCNN), giving us the status, size, mAP50-95(B) metric, and inference time for each combination. !!! performance - === "YOLOv8n" + === "YOLO11n" | Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) | |-----------------|--------|-------------------|-------------|------------------------| - | PyTorch | ✅ | 6.2 | 0.6381 | 14.3 | - | TorchScript | ✅ | 12.4 | 0.6117 | 13.3 | - | ONNX | ✅ | 12.2 | 0.6092 | 70.6 | - | OpenVINO | ✅ | 12.3 | 0.6092 | 104.2 | - | TensorRT (FP32) | ✅ | 16.1 | 0.6091 | 8.01 | - | TensorRT (FP16) | ✅ | 9.2 | 0.6093 | 4.55 | - | TensorRT (INT8) | ✅ | 5.9 | 0.2759 | 4.09 | - | TF SavedModel | ✅ | 30.6 | 0.6092 | 141.74 | - | TF GraphDef | ✅ | 12.3 | 0.6092 | 199.93 | - | TF Lite | ✅ | 12.3 | 0.6092 | 349.18 | - | PaddlePaddle | ✅ | 24.4 | 0.6030 | 555 | - | NCNN | ✅ | 12.2 | 0.6092 | 32 | - - === "YOLOv8s" + | PyTorch | ✅ | 5.4 | 0.6176 | 19.80 | + | TorchScript | ✅ | 10.5 | 0.6100 | 13.30 | + | ONNX | ✅ | 10.2 | 0.6082 | 67.92 | + | OpenVINO | ✅ | 10.4 | 0.6082 | 118.21 | + | TensorRT (FP32) | ✅ | 14.1 | 0.6100 | 7.94 | + | TensorRT (FP16) | ✅ | 8.3 | 0.6082 | 4.80 | + | TensorRT (INT8) | ✅ | 6.6 | 0.3256 | 4.17 | + | TF SavedModel | ✅ | 25.8 | 0.6082 | 185.88 | + | TF GraphDef | ✅ | 10.3 | 0.6082 | 256.66 | + | TF Lite | ✅ | 10.3 | 0.6082 | 284.64 | + | PaddlePaddle | ✅ | 20.4 | 0.6082 | 477.41 | + | NCNN 
| ✅ | 10.2 | 0.6106 | 32.18 | + + === "YOLO11s" | Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) | |-----------------|--------|-------------------|-------------|------------------------| - | PyTorch | ✅ | 21.5 | 0.6967 | 18 | - | TorchScript | ✅ | 43.0 | 0.7136 | 23.81 | - | ONNX | ✅ | 42.8 | 0.7136 | 185.55 | - | OpenVINO | ✅ | 42.9 | 0.7136 | 243.97 | - | TensorRT (FP32) | ✅ | 46.4 | 0.7136 | 14.01 | - | TensorRT (FP16) | ✅ | 24.2 | 0.722 | 7.16 | - | TensorRT (INT8) | ✅ | 13.7 | 0.4233 | 5.49 | - | TF SavedModel | ✅ | 107 | 0.7136 | 260.03 | - | TF GraphDef | ✅ | 42.8 | 0.7136 | 423.4 | - | TF Lite | ✅ | 42.8 | 0.7136 | 1046.64 | - | PaddlePaddle | ✅ | 85.5 | 0.7140 | 1464 | - | NCNN | ✅ | 42.7 | 0.7200 | 63 | - - === "YOLOv8m" + | PyTorch | ✅ | 18.4 | 0.7526 | 20.20 | + | TorchScript | ✅ | 36.5 | 0.7416 | 23.42 | + | ONNX | ✅ | 36.3 | 0.7416 | 162.01 | + | OpenVINO | ✅ | 36.4 | 0.7416 | 159.61 | + | TensorRT (FP32) | ✅ | 40.3 | 0.7416 | 13.93 | + | TensorRT (FP16) | ✅ | 21.7 | 0.7416 | 7.47 | + | TensorRT (INT8) | ✅ | 13.6 | 0.3179 | 5.66 | + | TF SavedModel | ✅ | 91.1 | 0.7416 | 316.46 | + | TF GraphDef | ✅ | 36.4 | 0.7416 | 506.71 | + | TF Lite | ✅ | 36.4 | 0.7416 | 842.97 | + | PaddlePaddle | ✅ | 72.5 | 0.7416 | 1172.57 | + | NCNN | ✅ | 36.2 | 0.7419 | 66.00 | + + === "YOLO11m" | Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) | |-----------------|--------|-------------------|-------------|------------------------| - | PyTorch | ✅ | 49.7 | 0.7370 | 36.4 | - | TorchScript | ✅ | 99.2 | 0.7285 | 53.58 | - | ONNX | ✅ | 99 | 0.7280 | 452.09 | - | OpenVINO | ✅ | 99.1 | 0.7280 | 544.36 | - | TensorRT (FP32) | ✅ | 102.4 | 0.7285 | 31.51 | - | TensorRT (FP16) | ✅ | 52.6 | 0.7324 | 14.88 | - | TensorRT (INT8) | ✅ | 28.6 | 0.3283 | 10.89 | - | TF SavedModel | ✅ | 247.5 | 0.7280 | 543.65 | - | TF GraphDef | ✅ | 99 | 0.7280 | 906.63 | - | TF Lite | ✅ | 99 | 0.7280 | 2758.08 | - | PaddlePaddle | ✅ | 197.9 | 0.7280 | 
3678 | - | NCNN | ✅ | 98.9 | 0.7260 | 135 | - - === "YOLOv8l" + | PyTorch | ✅ | 38.8 | 0.7595 | 36.70 | + | TorchScript | ✅ | 77.3 | 0.7643 | 50.95 | + | ONNX | ✅ | 76.9 | 0.7643 | 416.34 | + | OpenVINO | ✅ | 77.1 | 0.7643 | 370.99 | + | TensorRT (FP32) | ✅ | 81.5 | 0.7640 | 30.49 | + | TensorRT (FP16) | ✅ | 42.2 | 0.7658 | 14.93 | + | TensorRT (INT8) | ✅ | 24.3 | 0.4118 | 10.32 | + | TF SavedModel | ✅ | 192.7 | 0.7643 | 597.08 | + | TF GraphDef | ✅ | 77.0 | 0.7643 | 1016.12 | + | TF Lite | ✅ | 77.0 | 0.7643 | 2494.60 | + | PaddlePaddle | ✅ | 153.8 | 0.7643 | 3218.99 | + | NCNN | ✅ | 76.8 | 0.7691 | 192.77 | + + === "YOLO11l" | Format | Status | Size on disk (MB) | mAP50-95(B) | Inference time (ms/im) | |-----------------|--------|-------------------|-------------|------------------------| - | PyTorch | ✅ | 83.7 | 0.7768 | 61.3 | - | TorchScript | ✅ | 167.2 | 0.7554 | 87.9 | - | ONNX | ✅ | 166.8 | 0.7551 | 852.29 | - | OpenVINO | ✅ | 167 | 0.7551 | 1012.6 | - | TensorRT (FP32) | ✅ | 170.5 | 0.7554 | 49.79 | - | TensorRT (FP16) | ✅ | 86.1 | 0.7535 | 22.89 | - | TensorRT (INT8) | ✅ | 46.4 | 0.4048 | 14.61 | - | TF SavedModel | ✅ | 417.2 | 0.7551 | 990.45 | - | TF GraphDef | ✅ | 166.9 | 0.7551 | 1649.86 | - | TF Lite | ✅ | 166.9 | 0.7551 | 5652.37 | - | PaddlePaddle | ✅ | 333.6 | 0.7551 | 7114.67 | - | NCNN | ✅ | 166.8 | 0.7685 | 231.9 | - - === "YOLOv8x" + | PyTorch | ✅ | 49.0 | 0.7475 | 47.6 | + | TorchScript | ✅ | 97.6 | 0.7250 | 66.36 | + | ONNX | ✅ | 97.0 | 0.7250 | 532.58 | + | OpenVINO | ✅ | 97.3 | 0.7250 | 477.55 | + | TensorRT (FP32) | ✅ | 101.6 | 0.7250 | 38.71 | + | TensorRT (FP16) | ✅ | 52.6 | 0.7265 | 19.35 | + | TensorRT (INT8) | ✅ | 31.6 | 0.3856 | 13.50 | + | TF SavedModel | ✅ | 243.3 | 0.7250 | 895.24 | + | TF GraphDef | ✅ | 97.2 | 0.7250 | 1301.19 | + | TF Lite | ✅ | 97.2 | 0.7250 | 3202.93 | + | PaddlePaddle | ✅ | 193.9 | 0.7250 | 4206.98 | + | NCNN | ✅ | 96.9 | 0.7252 | 225.75 | + + === "YOLO11x" | Format | Status | Size on disk (MB) | mAP50-95(B) 
| Inference time (ms/im) | |-----------------|--------|-------------------|-------------|------------------------| - | PyTorch | ✅ | 130.5 | 0.7759 | 93 | - | TorchScript | ✅ | 260.7 | 0.7472 | 135.1 | - | ONNX | ✅ | 260.4 | 0.7479 | 1296.13 | - | OpenVINO | ✅ | 260.6 | 0.7479 | 1502.15 | - | TensorRT (FP32) | ✅ | 264.0 | 0.7469 | 80.01 | - | TensorRT (FP16) | ✅ | 133.3 | 0.7513 | 40.76 | - | TensorRT (INT8) | ✅ | 70.2 | 0.4277 | 22.08 | - | TF SavedModel | ✅ | 651.1 | 0.7479 | 1451.76 | - | TF GraphDef | ✅ | 260.5 | 0.7479 | 4029.36 | - | TF Lite | ✅ | 260.4 | 0.7479 | 8772.86 | - | PaddlePaddle | ✅ | 520.8 | 0.7479 | 10619.53 | - | NCNN | ✅ | 260.4 | 0.7646 | 376.38 | + | PyTorch | ✅ | 109.3 | 0.8288 | 85.60 | + | TorchScript | ✅ | 218.1 | 0.8308 | 121.67 | + | ONNX | ✅ | 217.5 | 0.8308 | 1073.14 | + | OpenVINO | ✅ | 217.8 | 0.8308 | 955.60 | + | TensorRT (FP32) | ✅ | 221.6 | 0.8307 | 75.84 | + | TensorRT (FP16) | ✅ | 113.1 | 0.8295 | 35.75 | + | TensorRT (INT8) | ✅ | 62.2 | 0.4783 | 22.23 | + | TF SavedModel | ✅ | 545.0 | 0.8308 | 1497.40 | + | TF GraphDef | ✅ | 217.8 | 0.8308 | 2552.42 | + | TF Lite | ✅ | 217.8 | 0.8308 | 7044.58 | + | PaddlePaddle | ✅ | 434.9 | 0.8308 | 8386.73 | + | NCNN | ✅ | 217.3 | 0.8304 | 486.36 | [Explore more benchmarking efforts by Seeed Studio](https://www.seeedstudio.com/blog/2023/03/30/yolov8-performance-benchmarks-on-nvidia-jetson-devices) running on different versions of NVIDIA Jetson hardware. 
@@ -435,25 +435,25 @@ To reproduce the above Ultralytics benchmarks on all export [formats](../modes/e ```python from ultralytics import YOLO - # Load a YOLOv8n PyTorch model - model = YOLO("yolov8n.pt") + # Load a YOLO11n PyTorch model + model = YOLO("yolo11n.pt") - # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all all export formats + # Benchmark YOLO11n speed and accuracy on the COCO8 dataset for all export formats results = model.benchmarks(data="coco8.yaml", imgsz=640) ``` === "CLI" ```bash - # Benchmark YOLOv8n speed and accuracy on the COCO8 dataset for all all export formats - yolo benchmark model=yolov8n.pt data=coco8.yaml imgsz=640 + # Benchmark YOLO11n speed and accuracy on the COCO8 dataset for all export formats + yolo benchmark model=yolo11n.pt data=coco8.yaml imgsz=640 ``` Note that benchmarking results might vary based on the exact hardware and software configuration of a system, as well as the current workload of the system at the time the benchmarks are run. For the most reliable results use a dataset with a large number of images, i.e. `data='coco8.yaml' (4 val images), or `data='coco.yaml'` (5000 val images). ## Best Practices when using NVIDIA Jetson -When using NVIDIA Jetson, there are a couple of best practices to follow in order to enable maximum performance on the NVIDIA Jetson running YOLOv8. +When using NVIDIA Jetson, there are a couple of best practices to follow in order to enable maximum performance on the NVIDIA Jetson running YOLO11. 1. Enable MAX Power Mode @@ -486,29 +486,29 @@ When using NVIDIA Jetson, there are a couple of best practices to follow in orde ## Next Steps -Congratulations on successfully setting up YOLOv8 on your NVIDIA Jetson! For further learning and support, visit more guide at [Ultralytics YOLOv8 Docs](../index.md)! +Congratulations on successfully setting up YOLO11 on your NVIDIA Jetson! For further learning and support, visit more guides at [Ultralytics YOLO11 Docs](../index.md)! 
## FAQ -### How do I deploy Ultralytics YOLOv8 on NVIDIA Jetson devices? +### How do I deploy Ultralytics YOLO11 on NVIDIA Jetson devices? -Deploying Ultralytics YOLOv8 on NVIDIA Jetson devices is a straightforward process. First, flash your Jetson device with the NVIDIA JetPack SDK. Then, either use a pre-built Docker image for quick setup or manually install the required packages. Detailed steps for each approach can be found in sections [Quick Start with Docker](#quick-start-with-docker) and [Start with Native Installation](#start-with-native-installation). +Deploying Ultralytics YOLO11 on NVIDIA Jetson devices is a straightforward process. First, flash your Jetson device with the NVIDIA JetPack SDK. Then, either use a pre-built Docker image for quick setup or manually install the required packages. Detailed steps for each approach can be found in sections [Quick Start with Docker](#quick-start-with-docker) and [Start with Native Installation](#start-with-native-installation). -### What performance benchmarks can I expect from YOLOv8 models on NVIDIA Jetson devices? +### What performance benchmarks can I expect from YOLO11 models on NVIDIA Jetson devices? -YOLOv8 models have been benchmarked on various NVIDIA Jetson devices showing significant performance improvements. For example, the TensorRT format delivers the best inference performance. The table in the [Detailed Comparison Table](#detailed-comparison-table) section provides a comprehensive view of performance metrics like mAP50-95 and inference time across different model formats. +YOLO11 models have been benchmarked on various NVIDIA Jetson devices showing significant performance improvements. For example, the TensorRT format delivers the best inference performance. The table in the [Detailed Comparison Table](#detailed-comparison-table) section provides a comprehensive view of performance metrics like mAP50-95 and inference time across different model formats. 
-### Why should I use TensorRT for deploying YOLOv8 on NVIDIA Jetson? +### Why should I use TensorRT for deploying YOLO11 on NVIDIA Jetson? -TensorRT is highly recommended for deploying YOLOv8 models on NVIDIA Jetson due to its optimal performance. It accelerates inference by leveraging the Jetson's GPU capabilities, ensuring maximum efficiency and speed. Learn more about how to convert to TensorRT and run inference in the [Use TensorRT on NVIDIA Jetson](#use-tensorrt-on-nvidia-jetson) section. +TensorRT is highly recommended for deploying YOLO11 models on NVIDIA Jetson due to its optimal performance. It accelerates inference by leveraging the Jetson's GPU capabilities, ensuring maximum efficiency and speed. Learn more about how to convert to TensorRT and run inference in the [Use TensorRT on NVIDIA Jetson](#use-tensorrt-on-nvidia-jetson) section. ### How can I install PyTorch and Torchvision on NVIDIA Jetson? To install PyTorch and Torchvision on NVIDIA Jetson, first uninstall any existing versions that may have been installed via pip. Then, manually install the compatible PyTorch and Torchvision versions for the Jetson's ARM64 architecture. Detailed instructions for this process are provided in the [Install PyTorch and Torchvision](#install-pytorch-and-torchvision) section. -### What are the best practices for maximizing performance on NVIDIA Jetson when using YOLOv8? +### What are the best practices for maximizing performance on NVIDIA Jetson when using YOLO11? -To maximize performance on NVIDIA Jetson with YOLOv8, follow these best practices: +To maximize performance on NVIDIA Jetson with YOLO11, follow these best practices: 1. Enable MAX Power Mode to utilize all CPU and GPU cores. 2. Enable Jetson Clocks to run all cores at their maximum frequency. 
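As a quick sanity check on the YOLO11n benchmark table added in this patch, the sketch below turns the reported latencies into speedup ratios relative to PyTorch. The dictionary and function names are illustrative assumptions, not part of the Ultralytics API; the numbers are copied from the YOLO11n tab of the table.

```python
# Inference times (ms/im) copied from the YOLO11n table in this patch.
# The names below are hypothetical helpers, not part of the Ultralytics API.
YOLO11N_MS = {
    "PyTorch": 19.80,
    "TorchScript": 13.30,
    "TensorRT (FP32)": 7.94,
    "TensorRT (FP16)": 4.80,
    "TensorRT (INT8)": 4.17,
}


def speedup_vs(baseline: str, timings: dict[str, float]) -> dict[str, float]:
    """Return how many times faster each format is than the baseline format."""
    base = timings[baseline]
    return {fmt: base / ms for fmt, ms in timings.items()}


if __name__ == "__main__":
    # Print one line per format, e.g. the ratio of PyTorch latency to its own latency.
    for fmt, ratio in speedup_vs("PyTorch", YOLO11N_MS).items():
        print(f"{fmt:<16} {ratio:.2f}x")
```

Latency is only half of the picture: in the same table the INT8 engine's mAP50-95 drops to 0.3256, so the fastest row is not automatically the best choice.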
From 04ed6f2e500cc24988fdfb762c9510f81f1dcecf Mon Sep 17 00:00:00 2001 From: Skillnoob <78843978+Skillnoob@users.noreply.github.com> Date: Mon, 28 Oct 2024 22:47:19 +0100 Subject: [PATCH 02/46] Fix EdgeTPU wrong PyTorch device (#17199) Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> Co-authored-by: Glenn Jocher --- ultralytics/nn/autobackend.py | 1 + 1 file changed, 1 insertion(+) diff --git a/ultralytics/nn/autobackend.py b/ultralytics/nn/autobackend.py index 0dcbb12f96..9e6d38b49f 100644 --- a/ultralytics/nn/autobackend.py +++ b/ultralytics/nn/autobackend.py @@ -345,6 +345,7 @@ class AutoBackend(nn.Module): model_path=w, experimental_delegates=[load_delegate(delegate, options={"device": device})], ) + device = "cpu" # Required, otherwise PyTorch will try to use the wrong device else: # TFLite LOGGER.info(f"Loading {w} for TensorFlow Lite inference...") interpreter = Interpreter(model_path=w) # load TFLite model From 542320c041114a5405729f9d3a0f54534812ba66 Mon Sep 17 00:00:00 2001 From: Burhan <62214284+Burhan-Q@users.noreply.github.com> Date: Mon, 28 Oct 2024 17:48:15 -0400 Subject: [PATCH 03/46] Adds permissions for stale workflow (#17183) Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> Co-authored-by: Glenn Jocher --- .github/workflows/stale.yml | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/.github/workflows/stale.yml b/.github/workflows/stale.yml index dd8503541e..991e0edd99 100644 --- a/.github/workflows/stale.yml +++ b/.github/workflows/stale.yml @@ -5,6 +5,10 @@ on: schedule: - cron: "0 0 * * *" # Runs at 00:00 UTC every day +permissions: + pull-requests: write + issues: write + jobs: stale: runs-on: ubuntu-latest From 6c12c1d69f060767df8945400d9978a9a8e73c72 Mon Sep 17 00:00:00 2001 From: Laughing <61612323+Laughing-q@users.noreply.github.com> Date: Tue, 29 Oct 2024 06:05:57 +0800 Subject: [PATCH 04/46] `ultralytics 8.3.24` SAM fix `pred_boxes` when no 
objects segmented (#17215) Co-authored-by: Muhammad Rizwan Munawar Co-authored-by: Glenn Jocher --- ultralytics/__init__.py | 2 +- ultralytics/models/sam/predict.py | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/ultralytics/__init__.py b/ultralytics/__init__.py index a48d3646c0..72a9396473 100644 --- a/ultralytics/__init__.py +++ b/ultralytics/__init__.py @@ -1,6 +1,6 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license -__version__ = "8.3.23" +__version__ = "8.3.24" import os diff --git a/ultralytics/models/sam/predict.py b/ultralytics/models/sam/predict.py index 4002e092b6..a83159080f 100644 --- a/ultralytics/models/sam/predict.py +++ b/ultralytics/models/sam/predict.py @@ -478,7 +478,7 @@ class Predictor(BasePredictor): results = [] for masks, orig_img, img_path in zip([pred_masks], orig_imgs, self.batch[0]): if len(masks) == 0: - masks = None + masks, pred_bboxes = None, torch.zeros((0, 6), device=pred_masks.device) else: masks = ops.scale_masks(masks[None].float(), orig_img.shape[:2], padding=False)[0] masks = masks > self.model.mask_threshold # to bool From b0c18b71900148c0598056c7ff79fc560aeab083 Mon Sep 17 00:00:00 2001 From: Francesco Mattioli Date: Tue, 29 Oct 2024 10:31:32 +0100 Subject: [PATCH 05/46] Fix arbitrary imgsz for TFLite (#17138) Co-authored-by: UltralyticsAssistant Co-authored-by: Burhan <62214284+Burhan-Q@users.noreply.github.com> Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> Co-authored-by: Laughing <61612323+Laughing-q@users.noreply.github.com> --- ultralytics/engine/exporter.py | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/ultralytics/engine/exporter.py b/ultralytics/engine/exporter.py index 49e84af9f5..5104de1cd1 100644 --- a/ultralytics/engine/exporter.py +++ b/ultralytics/engine/exporter.py @@ -890,8 +890,10 @@ class Exporter: tmp_file = f / "tmp_tflite_int8_calibration_images.npy" # int8 calibration images file if self.args.data: f.mkdir() - images 
= [batch["img"].permute(0, 2, 3, 1) for batch in self.get_int8_calibration_dataloader(prefix)] - images = torch.cat(images, 0).float() + images = [batch["img"] for batch in self.get_int8_calibration_dataloader(prefix)] + images = torch.nn.functional.interpolate(torch.cat(images, 0).float(), size=self.imgsz).permute( + 0, 2, 3, 1 + ) np.save(str(tmp_file), images.numpy().astype(np.float32)) # BHWC np_data = [["images", tmp_file, [[[[0, 0, 0]]]], [[[[255, 255, 255]]]]]] From 235f2d95af42f0cbbf5fb9ebae1718a298d25159 Mon Sep 17 00:00:00 2001 From: Yan_Mr <124339742+yawnBright@users.noreply.github.com> Date: Tue, 29 Oct 2024 20:12:15 +0800 Subject: [PATCH 06/46] Example ORT==2.0.0-rs.5 to support onnxruntime==1.19.x (#16962) Co-authored-by: Glenn Jocher Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> --- examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml | 11 +- examples/YOLOv8-ONNXRuntime-Rust/README.md | 2 +- examples/YOLOv8-ONNXRuntime-Rust/src/cli.rs | 2 +- examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs | 42 ++++ examples/YOLOv8-ONNXRuntime-Rust/src/main.rs | 2 +- examples/YOLOv8-ONNXRuntime-Rust/src/model.rs | 31 +-- .../src/ort_backend.rs | 183 ++++++++++-------- 7 files changed, 172 insertions(+), 101 deletions(-) diff --git a/examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml b/examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml index 8ac747e7e3..fcf1fb7974 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml +++ b/examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml @@ -9,11 +9,11 @@ edition = "2021" [dependencies] clap = { version = "4.2.4", features = ["derive"] } -image = { version = "0.24.7", default-features = false, features = ["jpeg", "png", "webp-encoder"] } -imageproc = { version = "0.23.0", default-features = false } -ndarray = { version = "0.15.6" } -ort = { version = "1.16.3", default-features = false, features = ["load-dynamic", "copy-dylibs", "half"] } -rusttype = { version = "0.9", default-features = false } +image = { version = 
"0.25.2"} +imageproc = { version = "0.25.0"} +ndarray = { version = "0.16" } +ort = { version = "2.0.0-rc.5", features = ["cuda", "tensorrt"]} +rusttype = { version = "0.9.3" } anyhow = { version = "1.0.75" } regex = { version = "1.5.4" } rand = { version = "0.8.5" } @@ -21,3 +21,4 @@ chrono = { version = "0.4.30" } half = { version = "2.3.1" } dirs = { version = "5.0.1" } ureq = { version = "2.9.1" } +ab_glyph = "0.2.29" diff --git a/examples/YOLOv8-ONNXRuntime-Rust/README.md b/examples/YOLOv8-ONNXRuntime-Rust/README.md index 48a3017ce8..9121c7dac7 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/README.md +++ b/examples/YOLOv8-ONNXRuntime-Rust/README.md @@ -5,7 +5,7 @@ This repository provides a Rust demo for performing YOLOv8 tasks like `Classific ## Recently Updated - Add YOLOv8-OBB demo -- Update ONNXRuntime to 1.17.x +- Update ONNXRuntime to 1.19.x Newly updated YOLOv8 example code is located in this repository (https://github.com/jamjamjon/usls/tree/main/examples/yolo) diff --git a/examples/YOLOv8-ONNXRuntime-Rust/src/cli.rs b/examples/YOLOv8-ONNXRuntime-Rust/src/cli.rs index 2ba0dd49ec..b5bc05a585 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/src/cli.rs +++ b/examples/YOLOv8-ONNXRuntime-Rust/src/cli.rs @@ -15,7 +15,7 @@ pub struct Args { /// device id #[arg(long, default_value_t = 0)] - pub device_id: u32, + pub device_id: i32, /// using TensorRT EP #[arg(long)] diff --git a/examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs b/examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs index 1af7f7c5e1..849801ee47 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs +++ b/examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs @@ -117,3 +117,45 @@ pub fn check_font(font: &str) -> rusttype::Font<'static> { let buffer = std::fs::read(font_path).unwrap(); rusttype::Font::try_from_vec(buffer).unwrap() } + + +use ab_glyph::FontArc; +pub fn load_font() -> FontArc{ + use std::path::Path; + let font_path = Path::new("./font/Arial.ttf"); + match font_path.try_exists() { + Ok(true) => { + let buffer 
= std::fs::read(font_path).unwrap(); + FontArc::try_from_vec(buffer).unwrap() + }, + Ok(false) => { + std::fs::create_dir_all("./font").unwrap(); + println!("Downloading font..."); + let source_url = "https://ultralytics.com/assets/Arial.ttf"; + let resp = ureq::get(source_url) + .timeout(std::time::Duration::from_secs(500)) + .call() + .unwrap_or_else(|err| panic!("> Failed to download font: {source_url}: {err:?}")); + + // read to buffer + let mut buffer = vec![]; + let total_size = resp + .header("Content-Length") + .and_then(|s| s.parse::<u64>().ok()) + .unwrap(); + let _reader = resp + .into_reader() + .take(total_size) + .read_to_end(&mut buffer) + .unwrap(); + // save + let mut fd = std::fs::File::create(font_path).unwrap(); + fd.write_all(&buffer).unwrap(); + println!("Font saved at: {:?}", font_path.display()); + FontArc::try_from_vec(buffer).unwrap() + }, + Err(e) => { + panic!("Failed to load font {}", e); + }, + } +} \ No newline at end of file diff --git a/examples/YOLOv8-ONNXRuntime-Rust/src/main.rs b/examples/YOLOv8-ONNXRuntime-Rust/src/main.rs index 8dd1567990..fd3845ced0 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/src/main.rs +++ b/examples/YOLOv8-ONNXRuntime-Rust/src/main.rs @@ -6,7 +6,7 @@ fn main() -> Result<(), Box<dyn std::error::Error>> { let args = Args::parse(); // 1. load image - let x = image::io::Reader::open(&args.source)? + let x = image::ImageReader::open(&args.source)? .with_guessed_format()? 
.decode()?; diff --git a/examples/YOLOv8-ONNXRuntime-Rust/src/model.rs b/examples/YOLOv8-ONNXRuntime-Rust/src/model.rs index 1c0e5e494d..e0c35f6c26 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/src/model.rs +++ b/examples/YOLOv8-ONNXRuntime-Rust/src/model.rs @@ -1,5 +1,6 @@ #![allow(clippy::type_complexity)] +use ab_glyph::FontArc; use anyhow::Result; use image::{DynamicImage, GenericImageView, ImageBuffer}; use ndarray::{s, Array, Axis, IxDyn}; @@ -7,7 +8,7 @@ use rand::{thread_rng, Rng}; use std::path::PathBuf; use crate::{ - check_font, gen_time_string, non_max_suppression, Args, Batch, Bbox, Embedding, OrtBackend, + load_font, gen_time_string, non_max_suppression, Args, Batch, Bbox, Embedding, OrtBackend, OrtConfig, OrtEP, Point2, YOLOResult, YOLOTask, SKELETON, }; @@ -36,10 +37,10 @@ impl YOLOv8 { let ep = if config.trt { OrtEP::Trt(config.device_id) } else if config.cuda { - OrtEP::Cuda(config.device_id) + OrtEP::CUDA(config.device_id) } else { - OrtEP::Cpu - }; + OrtEP::CPU + }; // batch let batch = Batch { @@ -330,12 +331,19 @@ impl YOLOv8 { // coefs * proto -> mask let coefs = Array::from_shape_vec((1, nm), coefs)?; // (n, nm) - let proto = proto.to_owned().into_shape((nm, nh * nw))?; // (nm, nh*nw) - let mask = coefs.dot(&proto).into_shape((nh, nw, 1))?; // (nh, nw, n) + + let proto = proto.to_owned(); + let proto = proto.to_shape((nm, nh * nw))?; // (nm, nh*nw) + let mask = coefs.dot(&proto); // (nh, nw, n) + let mask = mask.to_shape((nh, nw, 1))?; // build image from ndarray let mask_im: ImageBuffer<image::Luma<_>, Vec<f32>> = - match ImageBuffer::from_raw(nw as u32, nh as u32, mask.into_raw_vec()) { + match ImageBuffer::from_raw( + nw as u32, + nh as u32, + mask.to_owned().into_raw_vec_and_offset().0, + ) { Some(image) => image, None => panic!("can not create image from ndarray"), }; @@ -410,7 +418,7 @@ impl YOLOv8 { skeletons: Option<&[(usize, usize)]>, ) { // check font then load - let font = check_font("Arial.ttf"); + let font: FontArc = load_font(); for (_idb, (img0, 
y)) in xs0.iter().zip(ys.iter()).enumerate() { let mut img = img0.to_rgb8(); @@ -422,12 +430,13 @@ impl YOLOv8 { let legend_size = img.width().max(img.height()) / scale; let x = img.width() / 20; let y = img.height() / 20 + i as u32 * legend_size; + imageproc::drawing::draw_text_mut( &mut img, image::Rgb([0, 255, 0]), x as i32, y as i32, - rusttype::Scale::uniform(legend_size as f32 - 1.), + legend_size as f32, &font, &legend, ); @@ -454,7 +463,7 @@ impl YOLOv8 { image::Rgb(self.color_palette[bbox.id()].into()), bbox.xmin() as i32, (bbox.ymin() - legend_size as f32) as i32, - rusttype::Scale::uniform(legend_size as f32 - 1.), + legend_size as f32, &font, &legend, ); @@ -551,7 +560,7 @@ impl YOLOv8 { None => String::from(""), }, self.engine.ep(), - if let OrtEP::Cpu = self.engine.ep() { + if let OrtEP::CPU = self.engine.ep() { "" } else { "(May still fall back to CPU)" diff --git a/examples/YOLOv8-ONNXRuntime-Rust/src/ort_backend.rs b/examples/YOLOv8-ONNXRuntime-Rust/src/ort_backend.rs index 857baaebae..d88208dead 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/src/ort_backend.rs +++ b/examples/YOLOv8-ONNXRuntime-Rust/src/ort_backend.rs @@ -2,11 +2,13 @@ use anyhow::Result; use clap::ValueEnum; use half::f16; use ndarray::{Array, CowArray, IxDyn}; -use ort::execution_providers::{CUDAExecutionProviderOptions, TensorRTExecutionProviderOptions}; -use ort::tensor::TensorElementDataType; -use ort::{Environment, ExecutionProvider, Session, SessionBuilder, Value}; +use ort::{ + CPUExecutionProvider, CUDAExecutionProvider, ExecutionProvider, ExecutionProviderDispatch, + TensorRTExecutionProvider, +}; +use ort::{Session, SessionBuilder}; +use ort::{TensorElementType, ValueType}; use regex::Regex; - #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord, ValueEnum)] pub enum YOLOTask { // YOLO tasks @@ -19,9 +21,9 @@ pub enum YOLOTask { #[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)] pub enum OrtEP { // ONNXRuntime execution provider - Cpu, - Cuda(u32), - Trt(u32), + 
CPU, + CUDA(i32), + Trt(i32), } #[derive(Debug)] @@ -44,8 +46,9 @@ impl Default for Batch { #[derive(Debug, Default)] pub struct OrtInputs { // ONNX model inputs attrs - pub shapes: Vec>, - pub dtypes: Vec, + pub shapes: Vec>, + //pub dtypes: Vec, + pub dtypes: Vec, pub names: Vec, pub sizes: Vec>, } @@ -56,12 +59,19 @@ impl OrtInputs { let mut dtypes = Vec::new(); let mut names = Vec::new(); for i in session.inputs.iter() { - let shape: Vec = i + /* let shape: Vec = i .dimensions() .map(|x| if let Some(x) = x { x as i32 } else { -1i32 }) .collect(); - shapes.push(shape); - dtypes.push(i.input_type); + shapes.push(shape); */ + if let ort::ValueType::Tensor { ty, dimensions } = &i.input_type { + dtypes.push(ty.clone()); + let shape = dimensions.clone(); + shapes.push(shape); + } else { + panic!("unsupported data format, {} - {}", file!(), line!()); + } + //dtypes.push(i.input_type); names.push(i.name.clone()); } Self { @@ -97,12 +107,14 @@ pub struct OrtBackend { impl OrtBackend { pub fn build(args: OrtConfig) -> Result { // build env & session - let env = Environment::builder() - .with_name("YOLOv8") - .with_log_level(ort::LoggingLevel::Verbose) - .build()? - .into_arc(); - let session = SessionBuilder::new(&env)?.with_model_from_file(&args.f)?; + // in version 2.x environment is removed + /* let env = ort::EnvironmentBuilder + ::with_name("YOLOv8") + .build()? 
+ .into_arc(); */ + let sessionbuilder = SessionBuilder::new()?; + let session = sessionbuilder.commit_from_file(&args.f)?; + //let session = SessionBuilder::new(&env)?.with_model_from_file(&args.f)?; // get inputs let mut inputs = OrtInputs::new(&session); @@ -142,16 +154,19 @@ impl OrtBackend { // build provider let (ep, provider) = match args.ep { - OrtEP::Cuda(device_id) => Self::set_ep_cuda(device_id), + OrtEP::CUDA(device_id) => Self::set_ep_cuda(device_id), OrtEP::Trt(device_id) => Self::set_ep_trt(device_id, args.trt_fp16, &batch, &inputs), - _ => (OrtEP::Cpu, ExecutionProvider::CPU(Default::default())), + _ => ( + OrtEP::CPU, + ExecutionProviderDispatch::from(CPUExecutionProvider::default()), + ), }; // build session again with the new provider - let session = SessionBuilder::new(&env)? + let session = SessionBuilder::new()? // .with_optimization_level(ort::GraphOptimizationLevel::Level3)? .with_execution_providers([provider])? - .with_model_from_file(args.f)?; + .commit_from_file(args.f)?; // task: using given one or guessing let task = match args.task { @@ -185,57 +200,58 @@ impl OrtBackend { pub fn fetch_inputs_from_session( session: &Session, - ) -> (Vec>, Vec, Vec) { + ) -> (Vec>, Vec, Vec) { // get inputs attrs from ONNX model let mut shapes = Vec::new(); let mut dtypes = Vec::new(); let mut names = Vec::new(); for i in session.inputs.iter() { - let shape: Vec = i - .dimensions() - .map(|x| if let Some(x) = x { x as i32 } else { -1i32 }) - .collect(); - shapes.push(shape); - dtypes.push(i.input_type); + if let ort::ValueType::Tensor { ty, dimensions } = &i.input_type { + dtypes.push(ty.clone()); + let shape = dimensions.clone(); + shapes.push(shape); + } else { + panic!("unsupported data format, {} - {}", file!(), line!()); + } names.push(i.name.clone()); } (shapes, dtypes, names) } - pub fn set_ep_cuda(device_id: u32) -> (OrtEP, ExecutionProvider) { - // set CUDA - if ExecutionProvider::CUDA(Default::default()).is_available() { + pub fn set_ep_cuda(device_id: i32) 
-> (OrtEP, ExecutionProviderDispatch) { + let cuda_provider = CUDAExecutionProvider::default().with_device_id(device_id); + if let Ok(true) = cuda_provider.is_available() { ( - OrtEP::Cuda(device_id), - ExecutionProvider::CUDA(CUDAExecutionProviderOptions { - device_id, - ..Default::default() - }), + OrtEP::CUDA(device_id), + ExecutionProviderDispatch::from(cuda_provider), //PlantForm::CUDA(cuda_provider) ) } else { println!("> CUDA is not available! Using CPU."); - (OrtEP::Cpu, ExecutionProvider::CPU(Default::default())) + ( + OrtEP::CPU, + ExecutionProviderDispatch::from(CPUExecutionProvider::default()), //PlantForm::CPU(CPUExecutionProvider::default()) + ) } } pub fn set_ep_trt( - device_id: u32, + device_id: i32, fp16: bool, batch: &Batch, inputs: &OrtInputs, - ) -> (OrtEP, ExecutionProvider) { + ) -> (OrtEP, ExecutionProviderDispatch) { // set TensorRT - if ExecutionProvider::TensorRT(Default::default()).is_available() { - let (height, width) = (inputs.sizes[0][0], inputs.sizes[0][1]); + let trt_provider = TensorRTExecutionProvider::default().with_device_id(device_id); - // dtype match checking - if inputs.dtypes[0] == TensorElementDataType::Float16 && !fp16 { + //trt_provider. + if let Ok(true) = trt_provider.is_available() { + let (height, width) = (inputs.sizes[0][0], inputs.sizes[0][1]); + if inputs.dtypes[0] == TensorElementType::Float16 && !fp16 { panic!( "Dtype mismatch! Expected: Float32, got: {:?}. You should use `--fp16`", inputs.dtypes[0] ); } - // dynamic shape: input_tensor_1:dim_1xdim_2x...,input_tensor_2:dim_3xdim_4x...,... 
let mut opt_string = String::new(); let mut min_string = String::new(); @@ -251,17 +267,16 @@ impl OrtBackend { let _ = opt_string.pop(); let _ = min_string.pop(); let _ = max_string.pop(); + + let trt_provider = trt_provider + .with_profile_opt_shapes(opt_string) + .with_profile_min_shapes(min_string) + .with_profile_max_shapes(max_string) + .with_fp16(fp16) + .with_timing_cache(true); ( OrtEP::Trt(device_id), - ExecutionProvider::TensorRT(TensorRTExecutionProviderOptions { - device_id, - fp16_enable: fp16, - timing_cache_enable: true, - profile_min_shapes: min_string, - profile_max_shapes: max_string, - profile_opt_shapes: opt_string, - ..Default::default() - }), + ExecutionProviderDispatch::from(trt_provider), ) } else { println!("> TensorRT is not available! Try using CUDA..."); @@ -283,8 +298,8 @@ impl OrtBackend { pub fn run(&self, xs: Array, profile: bool) -> Result>> { // ORT inference match self.dtype() { - TensorElementDataType::Float16 => self.run_fp16(xs, profile), - TensorElementDataType::Float32 => self.run_fp32(xs, profile), + TensorElementType::Float16 => self.run_fp16(xs, profile), + TensorElementType::Float32 => self.run_fp32(xs, profile), _ => todo!(), } } @@ -300,14 +315,13 @@ impl OrtBackend { // h2d let t = std::time::Instant::now(); let xs = CowArray::from(xs); - let xs = vec![Value::from_array(self.session.allocator(), &xs)?]; if profile { println!("[ORT H2D]: {:?}", t.elapsed()); } // run let t = std::time::Instant::now(); - let ys = self.session.run(xs)?; + let ys = self.session.run(ort::inputs![xs.view()]?)?; if profile { println!("[ORT Inference]: {:?}", t.elapsed()); } @@ -315,21 +329,22 @@ impl OrtBackend { // d2h Ok(ys .iter() - .map(|x| { + .map(|(_k, v)| { // d2h let t = std::time::Instant::now(); - let x = x.try_extract::<_>().unwrap().view().clone().into_owned(); + let v = v.try_extract_tensor().unwrap(); + //let v = v.try_extract::<_>().unwrap().view().clone().into_owned(); if profile { println!("[ORT D2H]: {:?}", t.elapsed()); } 
// f16->f32 let t_ = std::time::Instant::now(); - let x = x.mapv(f16::to_f32); + let v = v.mapv(f16::to_f32); if profile { println!("[ORT f16->f32]: {:?}", t_.elapsed()); } - x + v }) .collect::>>()) } @@ -338,14 +353,13 @@ impl OrtBackend { // h2d let t = std::time::Instant::now(); let xs = CowArray::from(xs); - let xs = vec![Value::from_array(self.session.allocator(), &xs)?]; if profile { println!("[ORT H2D]: {:?}", t.elapsed()); } // run let t = std::time::Instant::now(); - let ys = self.session.run(xs)?; + let ys = self.session.run(ort::inputs![xs.view()]?)?; if profile { println!("[ORT Inference]: {:?}", t.elapsed()); } @@ -353,39 +367,44 @@ impl OrtBackend { // d2h Ok(ys .iter() - .map(|x| { + .map(|(_k, v)| { let t = std::time::Instant::now(); - let x = x.try_extract::<_>().unwrap().view().clone().into_owned(); + let v = v.try_extract_tensor::().unwrap().into_owned(); + //let x = x.try_extract::<_>().unwrap().view().clone().into_owned(); if profile { println!("[ORT D2H]: {:?}", t.elapsed()); } - x + v }) .collect::>>()) } - pub fn output_shapes(&self) -> Vec> { + pub fn output_shapes(&self) -> Vec> { let mut shapes = Vec::new(); - for o in &self.session.outputs { - let shape: Vec<_> = o - .dimensions() - .map(|x| if let Some(x) = x { x as i32 } else { -1i32 }) - .collect(); - shapes.push(shape); + for output in &self.session.outputs { + if let ValueType::Tensor { ty: _, dimensions } = &output.output_type { + let shape = dimensions.clone(); + shapes.push(shape); + } else { + panic!("not support data format, {} - {}", file!(), line!()); + } } shapes } - pub fn output_dtypes(&self) -> Vec { + pub fn output_dtypes(&self) -> Vec { let mut dtypes = Vec::new(); - self.session - .outputs - .iter() - .for_each(|x| dtypes.push(x.output_type)); + for output in &self.session.outputs { + if let ValueType::Tensor { ty, dimensions: _ } = &output.output_type { + dtypes.push(ty.clone()); + } else { + panic!("not support data format, {} - {}", file!(), line!()); + } + } 
dtypes } - pub fn input_shapes(&self) -> &Vec> { + pub fn input_shapes(&self) -> &Vec> { &self.inputs.shapes } @@ -393,11 +412,11 @@ impl OrtBackend { &self.inputs.names } - pub fn input_dtypes(&self) -> &Vec { + pub fn input_dtypes(&self) -> &Vec { &self.inputs.dtypes } - pub fn dtype(&self) -> TensorElementDataType { + pub fn dtype(&self) -> TensorElementType { self.input_dtypes()[0] } From c4dae56e1ab88da7a8060e414e0bbc885fa22db6 Mon Sep 17 00:00:00 2001 From: Mohammed Yasin <32206511+Y-T-G@users.noreply.github.com> Date: Tue, 29 Oct 2024 20:28:13 +0800 Subject: [PATCH 07/46] Update Triton Inference Server guide (#17059) Co-authored-by: UltralyticsAssistant Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> Co-authored-by: Burhan <62214284+Burhan-Q@users.noreply.github.com> Co-authored-by: Glenn Jocher --- docs/en/guides/triton-inference-server.md | 26 +++++++++++++++++++++-- 1 file changed, 24 insertions(+), 2 deletions(-) diff --git a/docs/en/guides/triton-inference-server.md b/docs/en/guides/triton-inference-server.md index 7395ccef11..09f7516b11 100644 --- a/docs/en/guides/triton-inference-server.md +++ b/docs/en/guides/triton-inference-server.md @@ -80,6 +80,28 @@ The Triton Model Repository is a storage location where Triton can access and lo # Create config file (triton_model_path / "config.pbtxt").touch() + + # (Optional) Enable TensorRT for GPU inference + # First run will be slow due to TensorRT engine conversion + import json + + data = { + "optimization": { + "execution_accelerators": { + "gpu_execution_accelerator": [ + { + "name": "tensorrt", + "parameters": {"key": "precision_mode", "value": "FP16"}, + "parameters": {"key": "max_workspace_size_bytes", "value": "3221225472"}, + "parameters": {"key": "trt_engine_cache_enable", "value": "1"}, + } + ] + } + } + } + + with open(triton_model_path / "config.pbtxt", "w") as f: + json.dump(data, f, indent=4) ``` ## Running Triton Inference Server @@ -94,7 +116,7 
@@ import time from tritonclient.http import InferenceServerClient # Define image https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver -tag = "nvcr.io/nvidia/tritonserver:23.09-py3" # 6.4 GB +tag = "nvcr.io/nvidia/tritonserver:24.09-py3" # 8.57 GB # Pull the image subprocess.call(f"docker pull {tag}", shell=True) @@ -187,7 +209,7 @@ Setting up [Ultralytics YOLO11](https://docs.ultralytics.com/models/yolov8/) wit from tritonclient.http import InferenceServerClient # Define image https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver - tag = "nvcr.io/nvidia/tritonserver:23.09-py3" + tag = "nvcr.io/nvidia/tritonserver:24.09-py3" subprocess.call(f"docker pull {tag}", shell=True) From ca5e9daed1b27f6b74c86ce74f9d4432e8b51741 Mon Sep 17 00:00:00 2001 From: Mohammed Yasin <32206511+Y-T-G@users.noreply.github.com> Date: Tue, 29 Oct 2024 20:57:07 +0800 Subject: [PATCH 08/46] Faster ONNX inference with bindings (#17184) Co-authored-by: UltralyticsAssistant Co-authored-by: Burhan <62214284+Burhan-Q@users.noreply.github.com> Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> Co-authored-by: Glenn Jocher --- ultralytics/nn/autobackend.py | 42 ++++++++++++++++++++++++++++++++--- 1 file changed, 39 insertions(+), 3 deletions(-) diff --git a/ultralytics/nn/autobackend.py b/ultralytics/nn/autobackend.py index 9e6d38b49f..b9312fefdb 100644 --- a/ultralytics/nn/autobackend.py +++ b/ultralytics/nn/autobackend.py @@ -189,10 +189,32 @@ class AutoBackend(nn.Module): check_requirements("numpy==1.23.5") import onnxruntime - providers = ["CUDAExecutionProvider", "CPUExecutionProvider"] if cuda else ["CPUExecutionProvider"] + providers = onnxruntime.get_available_providers() + if not cuda and "CUDAExecutionProvider" in providers: + providers.remove("CUDAExecutionProvider") + elif cuda and "CUDAExecutionProvider" not in providers: + LOGGER.warning("WARNING ⚠️ Failed to start ONNX Runtime session with CUDA. 
Falling back to CPU...") + device = torch.device("cpu") + cuda = False + LOGGER.info(f"Preferring ONNX Runtime {providers[0]}") session = onnxruntime.InferenceSession(w, providers=providers) output_names = [x.name for x in session.get_outputs()] metadata = session.get_modelmeta().custom_metadata_map + dynamic = isinstance(session.get_outputs()[0].shape[0], str) + if not dynamic: + io = session.io_binding() + bindings = [] + for output in session.get_outputs(): + y_tensor = torch.empty(output.shape, dtype=torch.float16 if fp16 else torch.float32).to(device) + io.bind_output( + name=output.name, + device_type=device.type, + device_id=device.index if cuda else 0, + element_type=np.float16 if fp16 else np.float32, + shape=tuple(y_tensor.shape), + buffer_ptr=y_tensor.data_ptr(), + ) + bindings.append(y_tensor) # OpenVINO elif xml: @@ -477,8 +499,22 @@ class AutoBackend(nn.Module): # ONNX Runtime elif self.onnx: - im = im.cpu().numpy() # torch to numpy - y = self.session.run(self.output_names, {self.session.get_inputs()[0].name: im}) + if self.dynamic: + im = im.cpu().numpy() # torch to numpy + y = self.session.run(self.output_names, {self.session.get_inputs()[0].name: im}) + else: + if not self.cuda: + im = im.cpu() + self.io.bind_input( + name="images", + device_type=im.device.type, + device_id=im.device.index if im.device.type == "cuda" else 0, + element_type=np.float16 if self.fp16 else np.float32, + shape=tuple(im.shape), + buffer_ptr=im.data_ptr(), + ) + self.session.run_with_iobinding(self.io) + y = self.bindings # OpenVINO elif self.xml: From 8a0fcbb89e2be09cc7269ee4a36a2e13826f2888 Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Tue, 29 Oct 2024 14:28:57 +0100 Subject: [PATCH 09/46] Notify only on first CI run (#17241) Signed-off-by: UltralyticsAssistant Co-authored-by: UltralyticsAssistant --- .github/workflows/ci.yaml | 2 +- .github/workflows/docker.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/ci.yaml 
b/.github/workflows/ci.yaml index 43f5d4cfeb..6963156ce9 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -351,7 +351,7 @@ jobs: if: always() # This ensures the job runs even if previous jobs fail steps: - name: Check for failure and notify - if: (needs.HUB.result == 'failure' || needs.Benchmarks.result == 'failure' || needs.Tests.result == 'failure' || needs.GPU.result == 'failure' || needs.RaspberryPi.result == 'failure' || needs.Conda.result == 'failure' ) && github.repository == 'ultralytics/ultralytics' && (github.event_name == 'schedule' || github.event_name == 'push') + if: (needs.HUB.result == 'failure' || needs.Benchmarks.result == 'failure' || needs.Tests.result == 'failure' || needs.GPU.result == 'failure' || needs.RaspberryPi.result == 'failure' || needs.Conda.result == 'failure' ) && github.repository == 'ultralytics/ultralytics' && (github.event_name == 'schedule' || github.event_name == 'push') && github.run_attempt == '1' uses: slackapi/slack-github-action@v1.27.0 with: payload: | diff --git a/.github/workflows/docker.yaml b/.github/workflows/docker.yaml index c299bc5bfd..ef7dd86e96 100644 --- a/.github/workflows/docker.yaml +++ b/.github/workflows/docker.yaml @@ -192,7 +192,7 @@ jobs: if: always() steps: - name: Check for failure and notify - if: needs.docker.result == 'failure' && github.repository == 'ultralytics/ultralytics' && github.event_name == 'push' + if: needs.docker.result == 'failure' && github.repository == 'ultralytics/ultralytics' && github.event_name == 'push' && github.run_attempt == '1' uses: slackapi/slack-github-action@v1.27.0 with: payload: | From aabd0136ec40223cf423847635a2a4de95bba63d Mon Sep 17 00:00:00 2001 From: Mohammed Yasin <32206511+Y-T-G@users.noreply.github.com> Date: Tue, 29 Oct 2024 22:11:49 +0800 Subject: [PATCH 10/46] Decrease default confidence threshold to start tracking new tracks (#17172) Co-authored-by: Glenn Jocher Co-authored-by: Burhan <62214284+Burhan-Q@users.noreply.github.com> 
Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> --- ultralytics/cfg/trackers/botsort.yaml | 4 ++-- ultralytics/cfg/trackers/bytetrack.yaml | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/ultralytics/cfg/trackers/botsort.yaml b/ultralytics/cfg/trackers/botsort.yaml index 01cebb6478..c15fbcd895 100644 --- a/ultralytics/cfg/trackers/botsort.yaml +++ b/ultralytics/cfg/trackers/botsort.yaml @@ -2,9 +2,9 @@ # Default YOLO tracker settings for BoT-SORT tracker https://github.com/NirAharon/BoT-SORT tracker_type: botsort # tracker type, ['botsort', 'bytetrack'] -track_high_thresh: 0.5 # threshold for the first association +track_high_thresh: 0.25 # threshold for the first association track_low_thresh: 0.1 # threshold for the second association -new_track_thresh: 0.6 # threshold for init new track if the detection does not match any tracks +new_track_thresh: 0.25 # threshold for init new track if the detection does not match any tracks track_buffer: 30 # buffer to calculate the time when to remove tracks match_thresh: 0.8 # threshold for matching tracks fuse_score: True # Whether to fuse confidence scores with the iou distances before matching diff --git a/ultralytics/cfg/trackers/bytetrack.yaml b/ultralytics/cfg/trackers/bytetrack.yaml index 49ab3f697b..7cdec59b33 100644 --- a/ultralytics/cfg/trackers/bytetrack.yaml +++ b/ultralytics/cfg/trackers/bytetrack.yaml @@ -2,9 +2,9 @@ # Default YOLO tracker settings for ByteTrack tracker https://github.com/ifzhang/ByteTrack tracker_type: bytetrack # tracker type, ['botsort', 'bytetrack'] -track_high_thresh: 0.5 # threshold for the first association +track_high_thresh: 0.25 # threshold for the first association track_low_thresh: 0.1 # threshold for the second association -new_track_thresh: 0.6 # threshold for init new track if the detection does not match any tracks +new_track_thresh: 0.25 # threshold for init new track if the detection does not match any tracks 
track_buffer: 30 # buffer to calculate the time when to remove tracks match_thresh: 0.8 # threshold for matching tracks fuse_score: True # Whether to fuse confidence scores with the iou distances before matching From 886d0c7127301fe52ea3aaeb94bf2a4fa4992baa Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Tue, 29 Oct 2024 23:02:41 +0100 Subject: [PATCH 11/46] Update publish.yml (#17251) --- .github/workflows/publish.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/publish.yml b/.github/workflows/publish.yml index 1ec1b9a93c..a41a908228 100644 --- a/.github/workflows/publish.yml +++ b/.github/workflows/publish.yml @@ -92,7 +92,7 @@ jobs: uses: slackapi/slack-github-action@v1.27.0 with: payload: | - {"text": " GitHub Actions success for ${{ github.workflow }} ✅\n\n\n*Repository:* https://github.com/${{ github.repository }}\n*Action:* https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}\n*Author:* ${{ github.actor }}\n*Event:* NEW '${{ github.repository }} ${{ steps.check_pypi.outputs.current_tag }}' pip package published 😃\n*Job Status:* ${{ job.status }}\n*Pull Request:* ${{ env.PR_TITLE }}\n"} + {"text": " GitHub Actions success for ${{ github.workflow }} ✅\n\n\n*Repository:* https://github.com/${{ github.repository }}\n*Action:* https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}\n*Author:* ${{ github.actor }}\n*Event:* NEW `${{ github.repository }} ${{ steps.check_pypi.outputs.current_tag }}` pip package published 😃\n*Job Status:* ${{ job.status }}\n*Pull Request:* ${{ env.PR_TITLE }}\n"} env: SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_YOLO }} - name: Notify on Slack (Failure) From 83404afff1536e0ce08c9926dbb1a1217c411914 Mon Sep 17 00:00:00 2001 From: Laughing <61612323+Laughing-q@users.noreply.github.com> Date: Wed, 30 Oct 2024 07:53:04 +0800 Subject: [PATCH 12/46] Pin `ray` `numpy<=2.0.0` test (#17245) Co-authored-by: Glenn Jocher Co-authored-by: 
UltralyticsAssistant --- ultralytics/utils/tuner.py | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/ultralytics/utils/tuner.py b/ultralytics/utils/tuner.py index c60022c0b8..e611fa9af8 100644 --- a/ultralytics/utils/tuner.py +++ b/ultralytics/utils/tuner.py @@ -1,6 +1,5 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license -import subprocess from ultralytics.cfg import TASK2DATA, TASK2METRIC, get_save_dir from ultralytics.utils import DEFAULT_CFG, DEFAULT_CFG_DICT, LOGGER, NUM_THREADS, checks @@ -39,7 +38,7 @@ def run_ray_tune( train_args = {} try: - subprocess.run("pip install ray[tune]".split(), check=True) # do not add single quotes here + checks.check_requirements(("ray[tune]", "numpy<2.0.0")) import ray from ray import tune From 6ffd8841fd8cdd86ab1ce1d102997f941f3c88e8 Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Wed, 30 Oct 2024 16:36:03 +0500 Subject: [PATCH 13/46] Update notebooks (#17260) Co-authored-by: UltralyticsAssistant --- examples/heatmaps.ipynb | 2 +- examples/object_counting.ipynb | 2 +- examples/object_tracking.ipynb | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/examples/heatmaps.ipynb b/examples/heatmaps.ipynb index c674ad4800..d0124df894 100644 --- a/examples/heatmaps.ipynb +++ b/examples/heatmaps.ipynb @@ -112,7 +112,7 @@ "heatmap_obj = solutions.Heatmap(\n", " colormap=cv2.COLORMAP_PARULA, # Color of the heatmap\n", " show=True, # Display the image during processing\n", - " model=yolo11n.pt, # Ultralytics YOLO11 model file\n", + " model=\"yolo11n.pt\", # Ultralytics YOLO11 model file\n", ")\n", "\n", "while cap.isOpened():\n", diff --git a/examples/object_counting.ipynb b/examples/object_counting.ipynb index 50168f262e..e742cff6a7 100644 --- a/examples/object_counting.ipynb +++ b/examples/object_counting.ipynb @@ -123,7 +123,7 @@ "counter = solutions.ObjectCounter(\n", " show=True, # Display the image during processing\n", " region=line_points, # Region of interest points\n", - " 
model=yolo11n.pt, # Ultralytics YOLO11 model file\n", + " model=\"yolo11n.pt\", # Ultralytics YOLO11 model file\n", " line_width=2, # Thickness of the lines and bounding boxes\n", ")\n", "\n", diff --git a/examples/object_tracking.ipynb b/examples/object_tracking.ipynb index 7691fce9cd..cc4d03add8 100644 --- a/examples/object_tracking.ipynb +++ b/examples/object_tracking.ipynb @@ -176,7 +176,7 @@ "\n", " # Annotate each mask with its corresponding tracking ID and color\n", " for mask, track_id in zip(masks, track_ids):\n", - " annotator.seg_bbox(mask=mask, mask_color=colors(track_id, True), track_label=str(track_id))\n", + " annotator.seg_bbox(mask=mask, mask_color=colors(int(track_id), True), label=str(track_id))\n", "\n", " # Write the annotated frame to the output video\n", " out.write(im0)\n", From e798dbf52e02c367b657daea85e90bf49c340f3c Mon Sep 17 00:00:00 2001 From: Laughing <61612323+Laughing-q@users.noreply.github.com> Date: Wed, 30 Oct 2024 19:37:56 +0800 Subject: [PATCH 14/46] Fix missing argument (#17253) --- ultralytics/models/sam/modules/sam.py | 1 + 1 file changed, 1 insertion(+) diff --git a/ultralytics/models/sam/modules/sam.py b/ultralytics/models/sam/modules/sam.py index 562314b2b9..7bfd716615 100644 --- a/ultralytics/models/sam/modules/sam.py +++ b/ultralytics/models/sam/modules/sam.py @@ -854,6 +854,7 @@ class SAM2Model(torch.nn.Module): mask_inputs, output_dict, num_frames, + track_in_reverse, prev_sam_mask_logits, ): """Performs a single tracking step, updating object masks and memory features based on current frame inputs.""" From b8c90baffee06b7b162cb29bd94383a693a42744 Mon Sep 17 00:00:00 2001 From: Mohammed Yasin <32206511+Y-T-G@users.noreply.github.com> Date: Wed, 30 Oct 2024 19:38:28 +0800 Subject: [PATCH 15/46] Update triton-inference-server.md (#17252) Co-authored-by: UltralyticsAssistant Co-authored-by: Glenn Jocher --- docs/en/guides/triton-inference-server.md | 43 ++++++++++++++--------- 1 file changed, 26 insertions(+), 17 
deletions(-) diff --git a/docs/en/guides/triton-inference-server.md b/docs/en/guides/triton-inference-server.md index 09f7516b11..0151cc078d 100644 --- a/docs/en/guides/triton-inference-server.md +++ b/docs/en/guides/triton-inference-server.md @@ -83,25 +83,34 @@ The Triton Model Repository is a storage location where Triton can access and lo # (Optional) Enable TensorRT for GPU inference # First run will be slow due to TensorRT engine conversion - import json - - data = { - "optimization": { - "execution_accelerators": { - "gpu_execution_accelerator": [ - { - "name": "tensorrt", - "parameters": {"key": "precision_mode", "value": "FP16"}, - "parameters": {"key": "max_workspace_size_bytes", "value": "3221225472"}, - "parameters": {"key": "trt_engine_cache_enable", "value": "1"}, - } - ] - } + data = """ + optimization { + execution_accelerators { + gpu_execution_accelerator { + name: "tensorrt" + parameters { + key: "precision_mode" + value: "FP16" + } + parameters { + key: "max_workspace_size_bytes" + value: "3221225472" + } + parameters { + key: "trt_engine_cache_enable" + value: "1" + } + parameters { + key: "trt_engine_cache_path" + value: "/models/yolo/1" + } } + } } + """ with open(triton_model_path / "config.pbtxt", "w") as f: - json.dump(data, f, indent=4) + f.write(data) ``` ## Running Triton Inference Server @@ -124,7 +133,7 @@ subprocess.call(f"docker pull {tag}", shell=True) # Run the Triton server and capture the container ID container_id = ( subprocess.check_output( - f"docker run -d --rm -v {triton_repo_path}:/models -p 8000:8000 {tag} tritonserver --model-repository=/models", + f"docker run -d --rm --gpus 0 -v {triton_repo_path}:/models -p 8000:8000 {tag} tritonserver --model-repository=/models", shell=True, ) .decode("utf-8") @@ -215,7 +224,7 @@ Setting up [Ultralytics YOLO11](https://docs.ultralytics.com/models/yolov8/) wit container_id = ( subprocess.check_output( - f"docker run -d --rm -v {triton_repo_path}/models -p 8000:8000 {tag} tritonserver 
--model-repository=/models", + f"docker run -d --rm --gpus 0 -v {triton_repo_path}/models -p 8000:8000 {tag} tritonserver --model-repository=/models", shell=True, ) .decode("utf-8") From 11b419434487a894656fe46d819d2ae868f25cc1 Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Wed, 30 Oct 2024 13:42:39 +0100 Subject: [PATCH 16/46] Disable Ray tests (#17266) Co-authored-by: UltralyticsAssistant --- .github/workflows/ci.yaml | 4 ++-- ultralytics/utils/tuner.py | 10 +++++++--- 2 files changed, 9 insertions(+), 5 deletions(-) diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index 6963156ce9..381e92c4c1 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -184,7 +184,7 @@ jobs: torch="torch==1.8.0 torchvision==0.9.0" fi if [[ "${{ github.event_name }}" =~ ^(schedule|workflow_dispatch)$ ]]; then - slow="pycocotools mlflow ray[tune]" + slow="pycocotools mlflow" fi pip install -e ".[export]" $torch $slow pytest-cov --extra-index-url https://download.pytorch.org/whl/cpu - name: Check environment @@ -247,7 +247,7 @@ jobs: - name: Install requirements run: | python -m pip install --upgrade pip wheel - pip install -e ".[export]" pytest mlflow pycocotools "ray[tune]" + pip install -e ".[export]" pytest mlflow pycocotools - name: Check environment run: | yolo checks diff --git a/ultralytics/utils/tuner.py b/ultralytics/utils/tuner.py index e611fa9af8..165c788a75 100644 --- a/ultralytics/utils/tuner.py +++ b/ultralytics/utils/tuner.py @@ -1,12 +1,16 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license - from ultralytics.cfg import TASK2DATA, TASK2METRIC, get_save_dir from ultralytics.utils import DEFAULT_CFG, DEFAULT_CFG_DICT, LOGGER, NUM_THREADS, checks def run_ray_tune( - model, space: dict = None, grace_period: int = 10, gpu_per_trial: int = None, max_samples: int = 10, **train_args + model, + space: dict = None, + grace_period: int = 10, + gpu_per_trial: int = None, + max_samples: int = 10, + **train_args, ): """ Runs hyperparameter tuning using 
Ray Tune. @@ -38,7 +42,7 @@ def run_ray_tune( train_args = {} try: - checks.check_requirements(("ray[tune]", "numpy<2.0.0")) + checks.check_requirements("ray[tune]") import ray from ray import tune From 9c72d94ba4a83e8911595341b0c3a1d30bbbe8a8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E7=8E=8B=E5=8F=AC=E5=BE=B7?= <8401806+wangzhaode@users.noreply.github.com> Date: Wed, 30 Oct 2024 20:59:48 +0800 Subject: [PATCH 17/46] `ultralytics 8.3.25` Alibaba MNN export and predict support (#16802) Co-authored-by: UltralyticsAssistant Co-authored-by: Francesco Mattioli Co-authored-by: Laughing <61612323+Laughing-q@users.noreply.github.com> Co-authored-by: Laughing-q <1185102784@qq.com> Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> Co-authored-by: Glenn Jocher --- .gitignore | 1 + docs/en/guides/model-deployment-options.md | 35 ++- docs/en/integrations/index.md | 6 +- docs/en/integrations/mnn.md | 342 +++++++++++++++++++++ docs/en/macros/export-table.md | 1 + docs/mkdocs_github_authors.yaml | 3 + mkdocs.yml | 3 +- tests/test_exports.py | 7 + ultralytics/__init__.py | 2 +- ultralytics/engine/exporter.py | 37 ++- ultralytics/engine/predictor.py | 1 + ultralytics/engine/validator.py | 1 + ultralytics/nn/autobackend.py | 58 +++- ultralytics/utils/benchmarks.py | 7 +- 14 files changed, 465 insertions(+), 39 deletions(-) create mode 100644 docs/en/integrations/mnn.md diff --git a/.gitignore b/.gitignore index 5cc365b4d2..4e0f0845b2 100644 --- a/.gitignore +++ b/.gitignore @@ -157,6 +157,7 @@ weights/ *.torchscript *.tflite *.h5 +*.mnn *_saved_model/ *_web_model/ *_openvino_model/ diff --git a/docs/en/guides/model-deployment-options.md b/docs/en/guides/model-deployment-options.md index a9efee17c9..1b97e31e43 100644 --- a/docs/en/guides/model-deployment-options.md +++ b/docs/en/guides/model-deployment-options.md @@ -258,25 +258,30 @@ NCNN is a high-performance neural network inference framework optimized for the - **Hardware 
Acceleration**: Tailored for ARM CPUs and GPUs, with specific optimizations for these architectures. +#### MNN + +MNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models and has industry-leading performance for inference and training on-device. In addition, MNN is also used on embedded devices, such as IoT. + ## Comparative Analysis of YOLO11 Deployment Options The following table provides a snapshot of the various deployment options available for YOLO11 models, helping you to assess which may best fit your project needs based on several critical criteria. For an in-depth look at each deployment option's format, please see the [Ultralytics documentation page on export formats](../modes/export.md#export-formats). -| Deployment Option | Performance Benchmarks | Compatibility and Integration | Community Support and Ecosystem | Case Studies | Maintenance and Updates | Security Considerations | Hardware Acceleration | -| ----------------- | ----------------------------------------------- | ---------------------------------------------- | --------------------------------------------- | ------------------------------------------ | ------------------------------------------- | ------------------------------------------------- | ---------------------------------- | -| PyTorch | Good flexibility; may trade off raw performance | Excellent with Python libraries | Extensive resources and community | Research and prototypes | Regular, active development | Dependent on deployment environment | CUDA support for GPU acceleration | -| TorchScript | Better for production than PyTorch | Smooth transition from PyTorch to C++ | Specialized but narrower than PyTorch | Industry where Python is a bottleneck | Consistent updates with PyTorch | Improved security without full Python | Inherits CUDA support from PyTorch | -| ONNX | Variable depending on runtime | High across different frameworks | Broad ecosystem, supported 
by many orgs | Flexibility across ML frameworks | Regular updates for new operations | Ensure secure conversion and deployment practices | Various hardware optimizations | -| OpenVINO | Optimized for Intel hardware | Best within Intel ecosystem | Solid in computer vision domain | IoT and edge with Intel hardware | Regular updates for Intel hardware | Robust features for sensitive applications | Tailored for Intel hardware | -| TensorRT | Top-tier on NVIDIA GPUs | Best for NVIDIA hardware | Strong network through NVIDIA | Real-time video and image inference | Frequent updates for new GPUs | Emphasis on security | Designed for NVIDIA GPUs | -| CoreML | Optimized for on-device Apple hardware | Exclusive to Apple ecosystem | Strong Apple and developer support | On-device ML on Apple products | Regular Apple updates | Focus on privacy and security | Apple neural engine and GPU | -| TF SavedModel | Scalable in server environments | Wide compatibility in TensorFlow ecosystem | Large support due to TensorFlow popularity | Serving models at scale | Regular updates by Google and community | Robust features for enterprise | Various hardware accelerations | -| TF GraphDef | Stable for static computation graphs | Integrates well with TensorFlow infrastructure | Resources for optimizing static graphs | Scenarios requiring static graphs | Updates alongside TensorFlow core | Established TensorFlow security practices | TensorFlow acceleration options | -| TF Lite | Speed and efficiency on mobile/embedded | Wide range of device support | Robust community, Google backed | Mobile applications with minimal footprint | Latest features for mobile | Secure environment on end-user devices | GPU and DSP among others | -| TF Edge TPU | Optimized for Google's Edge TPU hardware | Exclusive to Edge TPU devices | Growing with Google and third-party resources | IoT devices requiring real-time processing | Improvements for new Edge TPU hardware | Google's robust IoT security | Custom-designed for 
Google Coral | -| TF.js | Reasonable in-browser performance | High with web technologies | Web and Node.js developers support | Interactive web applications | TensorFlow team and community contributions | Web platform security model | Enhanced with WebGL and other APIs | -| PaddlePaddle | Competitive, easy to use and scalable | Baidu ecosystem, wide application support | Rapidly growing, especially in China | Chinese market and language processing | Focus on Chinese AI applications | Emphasizes data privacy and security | Including Baidu's Kunlun chips | -| NCNN | Optimized for mobile ARM-based devices | Mobile and embedded ARM systems | Niche but active mobile/embedded ML community | Android and ARM systems efficiency | High performance maintenance on ARM | On-device security advantages | ARM CPUs and GPUs optimizations | +| Deployment Option | Performance Benchmarks | Compatibility and Integration | Community Support and Ecosystem | Case Studies | Maintenance and Updates | Security Considerations | Hardware Acceleration | +| ----------------- | ----------------------------------------------- | ---------------------------------------------- | --------------------------------------------- | ------------------------------------------ | ---------------------------------------------- | ------------------------------------------------- | ---------------------------------- | +| PyTorch | Good flexibility; may trade off raw performance | Excellent with Python libraries | Extensive resources and community | Research and prototypes | Regular, active development | Dependent on deployment environment | CUDA support for GPU acceleration | +| TorchScript | Better for production than PyTorch | Smooth transition from PyTorch to C++ | Specialized but narrower than PyTorch | Industry where Python is a bottleneck | Consistent updates with PyTorch | Improved security without full Python | Inherits CUDA support from PyTorch | +| ONNX | Variable depending on runtime | High across 
different frameworks | Broad ecosystem, supported by many orgs | Flexibility across ML frameworks | Regular updates for new operations | Ensure secure conversion and deployment practices | Various hardware optimizations | +| OpenVINO | Optimized for Intel hardware | Best within Intel ecosystem | Solid in computer vision domain | IoT and edge with Intel hardware | Regular updates for Intel hardware | Robust features for sensitive applications | Tailored for Intel hardware | +| TensorRT | Top-tier on NVIDIA GPUs | Best for NVIDIA hardware | Strong network through NVIDIA | Real-time video and image inference | Frequent updates for new GPUs | Emphasis on security | Designed for NVIDIA GPUs | +| CoreML | Optimized for on-device Apple hardware | Exclusive to Apple ecosystem | Strong Apple and developer support | On-device ML on Apple products | Regular Apple updates | Focus on privacy and security | Apple neural engine and GPU | +| TF SavedModel | Scalable in server environments | Wide compatibility in TensorFlow ecosystem | Large support due to TensorFlow popularity | Serving models at scale | Regular updates by Google and community | Robust features for enterprise | Various hardware accelerations | +| TF GraphDef | Stable for static computation graphs | Integrates well with TensorFlow infrastructure | Resources for optimizing static graphs | Scenarios requiring static graphs | Updates alongside TensorFlow core | Established TensorFlow security practices | TensorFlow acceleration options | +| TF Lite | Speed and efficiency on mobile/embedded | Wide range of device support | Robust community, Google backed | Mobile applications with minimal footprint | Latest features for mobile | Secure environment on end-user devices | GPU and DSP among others | +| TF Edge TPU | Optimized for Google's Edge TPU hardware | Exclusive to Edge TPU devices | Growing with Google and third-party resources | IoT devices requiring real-time processing | Improvements for new Edge TPU hardware | 
Google's robust IoT security | Custom-designed for Google Coral | +| TF.js | Reasonable in-browser performance | High with web technologies | Web and Node.js developers support | Interactive web applications | TensorFlow team and community contributions | Web platform security model | Enhanced with WebGL and other APIs | +| PaddlePaddle | Competitive, easy to use and scalable | Baidu ecosystem, wide application support | Rapidly growing, especially in China | Chinese market and language processing | Focus on Chinese AI applications | Emphasizes data privacy and security | Including Baidu's Kunlun chips | +| MNN | High performance on mobile devices | Mobile and embedded ARM systems and x86-64 CPUs | Mobile/embedded ML community | Mobile systems efficiency | High performance maintenance on mobile devices | On-device security advantages | ARM CPUs and GPUs optimizations | +| NCNN | Optimized for mobile ARM-based devices | Mobile and embedded ARM systems | Niche but active mobile/embedded ML community | Android and ARM systems efficiency | High performance maintenance on ARM | On-device security advantages | ARM CPUs and GPUs optimizations | This comparative analysis gives you a high-level overview. For deployment, it's essential to consider the specific requirements and constraints of your project, and consult the detailed documentation and resources available for each option. diff --git a/docs/en/integrations/index.md b/docs/en/integrations/index.md index bdb8b9c907..f2859e8388 100644 --- a/docs/en/integrations/index.md +++ b/docs/en/integrations/index.md @@ -57,6 +57,8 @@ Welcome to the Ultralytics Integrations page! This page provides an overview of - [Weights & Biases (W&B)](weights-biases.md): Monitor experiments, visualize metrics, and foster reproducibility and collaboration on Ultralytics projects.
+- [VS Code](vscode.md): An extension for VS Code that provides code snippets for accelerating development workflows with Ultralytics and also for anyone looking for examples to help learn or get started with Ultralytics. + ## Deployment Integrations - [CoreML](coreml.md): CoreML, developed by [Apple](https://www.apple.com/), is a framework designed for efficiently integrating machine learning models into applications across iOS, macOS, watchOS, and tvOS, using Apple's hardware for effective and secure [model deployment](https://www.ultralytics.com/glossary/model-deployment). @@ -65,6 +67,8 @@ Welcome to the Ultralytics Integrations page! This page provides an overview of - [NCNN](ncnn.md): Developed by [Tencent](http://www.tencent.com/), NCNN is an efficient [neural network](https://www.ultralytics.com/glossary/neural-network-nn) inference framework tailored for mobile devices. It enables direct deployment of AI models into apps, optimizing performance across various mobile platforms. +- [MNN](mnn.md): Developed by [Alibaba](https://www.alibabagroup.com/), MNN is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models and has industry-leading performance for inference and training on-device. + - [Neural Magic](neural-magic.md): Leverage Quantization Aware Training (QAT) and pruning techniques to optimize Ultralytics models for superior performance and leaner size. - [ONNX](onnx.md): An open-source format created by [Microsoft](https://www.microsoft.com/) for facilitating the transfer of AI models between various frameworks, enhancing the versatility and deployment flexibility of Ultralytics models. @@ -87,8 +91,6 @@ Welcome to the Ultralytics Integrations page! 
This page provides an overview of - [TorchScript](torchscript.md): Developed as part of the [PyTorch](https://pytorch.org/) framework, TorchScript enables efficient execution and deployment of machine learning models in various production environments without the need for Python dependencies. -- [VS Code](vscode.md): An extension for VS Code that provides code snippets for accelerating development workflows with Ultralytics and also for anyone looking for examples to help learn or get started with Ultralytics. - ### Export Formats We also support a variety of model export formats for deployment in different environments. Here are the available formats: diff --git a/docs/en/integrations/mnn.md b/docs/en/integrations/mnn.md new file mode 100644 index 0000000000..5919373611 --- /dev/null +++ b/docs/en/integrations/mnn.md @@ -0,0 +1,342 @@ +--- +comments: true +description: Optimize YOLO11 models for mobile and embedded devices by exporting to MNN format. +keywords: Ultralytics, YOLO11, MNN, model export, machine learning, deployment, mobile, embedded systems, deep learning, AI models +--- + +# MNN Export for YOLO11 Models and Deploy + +## MNN + +

+ MNN architecture +

+ +[MNN](https://github.com/alibaba/MNN) is a highly efficient and lightweight deep learning framework. It supports inference and training of deep learning models and has industry-leading performance for inference and training on-device. At present, MNN has been integrated into more than 30 Alibaba apps, including Taobao, Tmall, Youku, DingTalk, and Xianyu, covering more than 70 usage scenarios such as live broadcast, short video capture, search recommendation, product search by image, interactive marketing, equity distribution, and security risk control. MNN is also used on embedded devices, such as IoT devices. + +## Export to MNN: Converting Your YOLO11 Model + +You can expand model compatibility and deployment flexibility by converting YOLO11 models to MNN format. + +### Installation + +To install the required packages, run: + +!!! tip "Installation" + + === "CLI" + + ```bash + # Install the required packages for YOLO11 and MNN + pip install ultralytics + pip install MNN + ``` + +### Usage + +All [Ultralytics YOLO11 models](../models/index.md) are available for exporting. Before following the usage instructions, you can confirm that the model you select supports export functionality [here](../modes/export.md). + +!!!
example "Usage" + + === "Python" + + ```python + from ultralytics import YOLO + + # Load the YOLO11 model + model = YOLO("yolo11n.pt") + + # Export the model to MNN format + model.export(format="mnn") # creates 'yolo11n.mnn' + + # Load the exported MNN model + mnn_model = YOLO("yolo11n.mnn") + + # Run inference + results = mnn_model("https://ultralytics.com/images/bus.jpg") + ``` + + === "CLI" + + ```bash + # Export a YOLO11n PyTorch model to MNN format + yolo export model=yolo11n.pt format=mnn # creates 'yolo11n.mnn' + + # Run inference with the exported model + yolo predict model='yolo11n.mnn' source='https://ultralytics.com/images/bus.jpg' + ``` + +For more details about supported export options, visit the [Ultralytics documentation page on deployment options](../guides/model-deployment-options.md). + +### MNN-Only Inference + +The following examples rely solely on MNN for YOLO11 preprocessing and inference. Both Python and C++ versions are provided for easy deployment in any scenario. + +!!!
example "MNN" + + === "Python" + + ```python + import argparse + + import MNN + import MNN.cv as cv2 + import MNN.numpy as np + + + def inference(model, img, precision, backend, thread): + config = {} + config["precision"] = precision + config["backend"] = backend + config["numThread"] = thread + rt = MNN.nn.create_runtime_manager((config,)) + # net = MNN.nn.load_module_from_file(model, ['images'], ['output0'], runtime_manager=rt) + net = MNN.nn.load_module_from_file(model, [], [], runtime_manager=rt) + original_image = cv2.imread(img) + ih, iw, _ = original_image.shape + length = max((ih, iw)) + scale = length / 640 + image = np.pad(original_image, [[0, length - ih], [0, length - iw], [0, 0]], "constant") + image = cv2.resize( + image, (640, 640), 0.0, 0.0, cv2.INTER_LINEAR, -1, [0.0, 0.0, 0.0], [1.0 / 255.0, 1.0 / 255.0, 1.0 / 255.0] + ) + input_var = np.expand_dims(image, 0) + input_var = MNN.expr.convert(input_var, MNN.expr.NC4HW4) + output_var = net.forward(input_var) + output_var = MNN.expr.convert(output_var, MNN.expr.NCHW) + output_var = output_var.squeeze() + # output_var shape: [84, 8400]; the 84 rows are [cx, cy, w, h] followed by 80 class probabilities + cx = output_var[0] + cy = output_var[1] + w = output_var[2] + h = output_var[3] + probs = output_var[4:] + # [cx, cy, w, h] -> [x0, y0, x1, y1] + x0 = cx - w * 0.5 + y0 = cy - h * 0.5 + x1 = cx + w * 0.5 + y1 = cy + h * 0.5 + boxes = np.stack([x0, y0, x1, y1], axis=1) + # get max prob and idx + scores = np.max(probs, 0) + class_ids = np.argmax(probs, 0) + result_ids = MNN.expr.nms(boxes, scores, 100, 0.45, 0.25) + print(result_ids.shape) + # nms result box, score, ids + result_boxes = boxes[result_ids] + result_scores = scores[result_ids] + result_class_ids = class_ids[result_ids] + for i in range(len(result_boxes)): + x0, y0, x1, y1 = result_boxes[i].read_as_tuple() + y0 = int(y0 * scale) + y1 = int(y1 * scale) + x0 = int(x0 * scale) + x1 = int(x1 * scale) + print(result_class_ids[i]) + cv2.rectangle(original_image, (x0, y0), (x1,
y1), (0, 0, 255), 2) + cv2.imwrite("res.jpg", original_image) + + + if __name__ == "__main__": + parser = argparse.ArgumentParser() + parser.add_argument("--model", type=str, required=True, help="the yolo11 model path") + parser.add_argument("--img", type=str, required=True, help="the input image path") + parser.add_argument("--precision", type=str, default="normal", help="inference precision: normal, low, high, lowBF") + parser.add_argument( + "--backend", + type=str, + default="CPU", + help="inference backend: CPU, OPENCL, OPENGL, NN, VULKAN, METAL, TRT, CUDA, HIAI", + ) + parser.add_argument("--thread", type=int, default=4, help="inference using thread: int") + args = parser.parse_args() + inference(args.model, args.img, args.precision, args.backend, args.thread) + ``` + + === "CPP" + + ```cpp + #include + #include + #include + #include + #include + #include + + #include + + using namespace MNN; + using namespace MNN::Express; + using namespace MNN::CV; + + int main(int argc, const char* argv[]) { + if (argc < 3) { + MNN_PRINT("Usage: ./yolo11_demo.out model.mnn input.jpg [forwardType] [precision] [thread]\n"); + return 0; + } + int thread = 4; + int precision = 0; + int forwardType = MNN_FORWARD_CPU; + if (argc >= 4) { + forwardType = atoi(argv[3]); + } + if (argc >= 5) { + precision = atoi(argv[4]); + } + if (argc >= 6) { + thread = atoi(argv[5]); + } + MNN::ScheduleConfig sConfig; + sConfig.type = static_cast(forwardType); + sConfig.numThread = thread; + BackendConfig bConfig; + bConfig.precision = static_cast(precision); + sConfig.backendConfig = &bConfig; + std::shared_ptr rtmgr = std::shared_ptr(Executor::RuntimeManager::createRuntimeManager(sConfig)); + if(rtmgr == nullptr) { + MNN_ERROR("Empty RuntimeManger\n"); + return 0; + } + rtmgr->setCache(".cachefile"); + + std::shared_ptr net(Module::load(std::vector{}, std::vector{}, argv[1], rtmgr)); + auto original_image = imread(argv[2]); + auto dims = original_image->getInfo()->dim; + int ih = dims[0]; + int 
iw = dims[1]; + int len = ih > iw ? ih : iw; + float scale = len / 640.0; + std::vector padvals { 0, len - ih, 0, len - iw, 0, 0 }; + auto pads = _Const(static_cast(padvals.data()), {3, 2}, NCHW, halide_type_of()); + auto image = _Pad(original_image, pads, CONSTANT); + image = resize(image, Size(640, 640), 0, 0, INTER_LINEAR, -1, {0., 0., 0.}, {1./255., 1./255., 1./255.}); + auto input = _Unsqueeze(image, {0}); + input = _Convert(input, NC4HW4); + auto outputs = net->onForward({input}); + auto output = _Convert(outputs[0], NCHW); + output = _Squeeze(output); + // output shape: [84, 8400]; the 84 rows are [cx, cy, w, h] followed by 80 class probabilities + auto cx = _Gather(output, _Scalar(0)); + auto cy = _Gather(output, _Scalar(1)); + auto w = _Gather(output, _Scalar(2)); + auto h = _Gather(output, _Scalar(3)); + std::vector startvals { 4, 0 }; + auto start = _Const(static_cast(startvals.data()), {2}, NCHW, halide_type_of()); + std::vector sizevals { -1, -1 }; + auto size = _Const(static_cast(sizevals.data()), {2}, NCHW, halide_type_of()); + auto probs = _Slice(output, start, size); + // [cx, cy, w, h] -> [x0, y0, x1, y1] + auto x0 = cx - w * _Const(0.5); + auto y0 = cy - h * _Const(0.5); + auto x1 = cx + w * _Const(0.5); + auto y1 = cy + h * _Const(0.5); + auto boxes = _Stack({x0, y0, x1, y1}, 1); + auto scores = _ReduceMax(probs, {0}); + auto ids = _ArgMax(probs, 0); + auto result_ids = _Nms(boxes, scores, 100, 0.45, 0.25); + auto result_ptr = result_ids->readMap(); + auto box_ptr = boxes->readMap(); + auto ids_ptr = ids->readMap(); + auto score_ptr = scores->readMap(); + for (int i = 0; i < 100; i++) { + auto idx = result_ptr[i]; + if (idx < 0) break; + auto x0 = box_ptr[idx * 4 + 0] * scale; + auto y0 = box_ptr[idx * 4 + 1] * scale; + auto x1 = box_ptr[idx * 4 + 2] * scale; + auto y1 = box_ptr[idx * 4 + 3] * scale; + auto class_idx = ids_ptr[idx]; + auto score = score_ptr[idx]; + rectangle(original_image, {x0, y0}, {x1, y1}, {0, 0, 255}, 2); + } + if (imwrite("res.jpg", original_image)) {
+ MNN_PRINT("result image write to `res.jpg`.\n"); + } + rtmgr->updateCache(); + return 0; + } + ``` + +## Summary + +This guide shows how to export an Ultralytics YOLO11 model to MNN format and run inference with MNN. + +For more usage details, please refer to the [MNN documentation](https://mnn-docs.readthedocs.io/en/latest). + +## FAQ + +### How do I export Ultralytics YOLO11 models to MNN format? + +To export your Ultralytics YOLO11 model to MNN format, follow these steps: + +!!! example "Export" + + === "Python" + + ```python + from ultralytics import YOLO + + # Load the YOLO11 model + model = YOLO("yolo11n.pt") + + # Export to MNN format + model.export(format="mnn") # creates 'yolo11n.mnn' with fp32 weights + model.export(format="mnn", half=True) # creates 'yolo11n.mnn' with fp16 weights + model.export(format="mnn", int8=True) # creates 'yolo11n.mnn' with int8 weights + ``` + + === "CLI" + + ```bash + yolo export model=yolo11n.pt format=mnn # creates 'yolo11n.mnn' with fp32 weights + yolo export model=yolo11n.pt format=mnn half=True # creates 'yolo11n.mnn' with fp16 weights + yolo export model=yolo11n.pt format=mnn int8=True # creates 'yolo11n.mnn' with int8 weights + ``` + +For detailed export options, check the [Export](../modes/export.md) page in the documentation. + +### How do I predict with an exported YOLO11 MNN model? + +To predict with an exported YOLO11 MNN model, use the `predict` function from the YOLO class. + +!!!
example "Predict" + + === "Python" + + ```python + from ultralytics import YOLO + + # Load the YOLO11 MNN model + model = YOLO("yolo11n.mnn") + + # Run inference with the MNN model + results = model("https://ultralytics.com/images/bus.jpg") # predict with `fp32` + results = model("https://ultralytics.com/images/bus.jpg", half=True) # predict with `fp16` if the device supports it + + for result in results: + result.show() # display to screen + result.save(filename="result.jpg") # save to disk + ``` + + === "CLI" + + ```bash + yolo predict model='yolo11n.mnn' source='https://ultralytics.com/images/bus.jpg' # predict with `fp32` + yolo predict model='yolo11n.mnn' source='https://ultralytics.com/images/bus.jpg' half=True # predict with `fp16` if the device supports it + ``` + +### What platforms are supported for MNN? + +MNN is versatile and supports various platforms: + +- **Mobile**: Android, iOS, Harmony. +- **Embedded Systems and IoT Devices**: Devices like Raspberry Pi and NVIDIA Jetson. +- **Desktop and Servers**: Linux, Windows, and macOS. + +### How can I deploy Ultralytics YOLO11 MNN models on Mobile Devices? + +To deploy your YOLO11 models on mobile devices: + +1. **Build for Android**: Follow the [MNN Android](https://github.com/alibaba/MNN/tree/master/project/android) project. +2. **Build for iOS**: Follow the [MNN iOS](https://github.com/alibaba/MNN/tree/master/project/ios) project. +3. **Build for Harmony**: Follow the [MNN Harmony](https://github.com/alibaba/MNN/tree/master/project/harmony) project.
diff --git a/docs/en/macros/export-table.md b/docs/en/macros/export-table.md index 7cda31963a..b7134f42b8 100644 --- a/docs/en/macros/export-table.md +++ b/docs/en/macros/export-table.md @@ -12,4 +12,5 @@ | [TF Edge TPU](../integrations/edge-tpu.md) | `edgetpu` | `{{ model_name or "yolo11n" }}_edgetpu.tflite` | ✅ | `imgsz` | | [TF.js](../integrations/tfjs.md) | `tfjs` | `{{ model_name or "yolo11n" }}_web_model/` | ✅ | `imgsz`, `half`, `int8`, `batch` | | [PaddlePaddle](../integrations/paddlepaddle.md) | `paddle` | `{{ model_name or "yolo11n" }}_paddle_model/` | ✅ | `imgsz`, `batch` | +| [MNN](../integrations/mnn.md) | `mnn` | `{{ model_name or "yolo11n" }}.mnn` | ✅ | `imgsz`, `batch`, `int8`, `half` | | [NCNN](../integrations/ncnn.md) | `ncnn` | `{{ model_name or "yolo11n" }}_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` | diff --git a/docs/mkdocs_github_authors.yaml b/docs/mkdocs_github_authors.yaml index 2e20921385..55ac6ec959 100644 --- a/docs/mkdocs_github_authors.yaml +++ b/docs/mkdocs_github_authors.yaml @@ -154,3 +154,6 @@ web@ultralytics.com: xinwang614@gmail.com: avatar: https://avatars.githubusercontent.com/u/17264618?v=4 username: GreatV +zhaode.wzd@alibaba-inc.com: + avatar: https://avatars.githubusercontent.com/u/8401806?v=4 + username: ZhaodeWang diff --git a/mkdocs.yml b/mkdocs.yml index a7157ec942..3ee15f83b0 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -398,11 +398,12 @@ nav: - JupyterLab: integrations/jupyterlab.md - Kaggle: integrations/kaggle.md - MLflow: integrations/mlflow.md - - NCNN: integrations/ncnn.md - Neural Magic: integrations/neural-magic.md - ONNX: integrations/onnx.md - OpenVINO: integrations/openvino.md - PaddlePaddle: integrations/paddlepaddle.md + - MNN: integrations/mnn.md + - NCNN: integrations/ncnn.md - Paperspace Gradient: integrations/paperspace.md - Ray Tune: integrations/ray-tune.md - Roboflow: integrations/roboflow.md diff --git a/tests/test_exports.py b/tests/test_exports.py index e6e2ec1598..12443fa30c 100644 --- 
a/tests/test_exports.py +++ b/tests/test_exports.py @@ -197,3 +197,10 @@ def test_export_ncnn(): """Test YOLO exports to NCNN format.""" file = YOLO(MODEL).export(format="ncnn", imgsz=32) YOLO(file)(SOURCE, imgsz=32) # exported model inference + + +@pytest.mark.slow +def test_export_mnn(): + """Test YOLO exports to MNN format.""" + file = YOLO(MODEL).export(format="mnn", imgsz=32) + YOLO(file)(SOURCE, imgsz=32) # exported model inference diff --git a/ultralytics/__init__.py b/ultralytics/__init__.py index 72a9396473..c847dd4d18 100644 --- a/ultralytics/__init__.py +++ b/ultralytics/__init__.py @@ -1,6 +1,6 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license -__version__ = "8.3.24" +__version__ = "8.3.25" import os diff --git a/ultralytics/engine/exporter.py b/ultralytics/engine/exporter.py index 5104de1cd1..ea8d03b468 100644 --- a/ultralytics/engine/exporter.py +++ b/ultralytics/engine/exporter.py @@ -16,6 +16,7 @@ TensorFlow Lite | `tflite` | yolo11n.tflite TensorFlow Edge TPU | `edgetpu` | yolo11n_edgetpu.tflite TensorFlow.js | `tfjs` | yolo11n_web_model/ PaddlePaddle | `paddle` | yolo11n_paddle_model/ +MNN | `mnn` | yolo11n.mnn NCNN | `ncnn` | yolo11n_ncnn_model/ Requirements: @@ -41,6 +42,7 @@ Inference: yolo11n.tflite # TensorFlow Lite yolo11n_edgetpu.tflite # TensorFlow Edge TPU yolo11n_paddle_model # PaddlePaddle + yolo11n.mnn # MNN yolo11n_ncnn_model # NCNN TensorFlow.js: @@ -109,6 +111,7 @@ def export_formats(): ["TensorFlow Edge TPU", "edgetpu", "_edgetpu.tflite", True, False], ["TensorFlow.js", "tfjs", "_web_model", True, False], ["PaddlePaddle", "paddle", "_paddle_model", True, True], + ["MNN", "mnn", ".mnn", True, True], ["NCNN", "ncnn", "_ncnn_model", True, True], ] return dict(zip(["Format", "Argument", "Suffix", "CPU", "GPU"], zip(*x))) @@ -190,7 +193,9 @@ class Exporter: flags = [x == fmt for x in fmts] if sum(flags) != 1: raise ValueError(f"Invalid export format='{fmt}'. 
Valid formats are {fmts}") - jit, onnx, xml, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle, ncnn = flags # export booleans + jit, onnx, xml, engine, coreml, saved_model, pb, tflite, edgetpu, tfjs, paddle, mnn, ncnn = ( + flags # export booleans + ) is_tf_format = any((saved_model, pb, tflite, edgetpu, tfjs)) # Device @@ -333,8 +338,10 @@ class Exporter: f[9], _ = self.export_tfjs() if paddle: # PaddlePaddle f[10], _ = self.export_paddle() + if mnn: # MNN + f[11], _ = self.export_mnn() if ncnn: # NCNN - f[11], _ = self.export_ncnn() + f[12], _ = self.export_ncnn() # Finish f = [str(x) for x in f if x] # filter out '' and None @@ -541,6 +548,32 @@ class Exporter: yaml_save(Path(f) / "metadata.yaml", self.metadata) # add metadata.yaml return f, None + @try_export + def export_mnn(self, prefix=colorstr("MNN:")): + """YOLOv8 MNN export using MNN https://github.com/alibaba/MNN.""" + f_onnx, _ = self.export_onnx() # get onnx model first + + check_requirements("MNN>=2.9.6") + import MNN # noqa + from MNN.tools import mnnconvert + + # Setup and checks + LOGGER.info(f"\n{prefix} starting export with MNN {MNN.version()}...") + assert Path(f_onnx).exists(), f"failed to export ONNX file: {f_onnx}" + f = str(self.file.with_suffix(".mnn")) # MNN model file + args = ["", "-f", "ONNX", "--modelFile", f_onnx, "--MNNModel", f, "--bizCode", json.dumps(self.metadata)] + if self.args.int8: + args.append("--weightQuantBits") + args.append("8") + if self.args.half: + args.append("--fp16") + mnnconvert.convert(args) + # remove scratch file for model convert optimize + convert_scratch = Path(self.file.parent / ".__convert_external_data.bin") + if convert_scratch.exists(): + convert_scratch.unlink() + return f, None + @try_export def export_ncnn(self, prefix=colorstr("NCNN:")): """YOLO NCNN export using PNNX https://github.com/pnnx/pnnx.""" diff --git a/ultralytics/engine/predictor.py b/ultralytics/engine/predictor.py index 16f12a88ea..fbe593e065 100644 --- 
a/ultralytics/engine/predictor.py +++ b/ultralytics/engine/predictor.py @@ -26,6 +26,7 @@ Usage - formats: yolov8n.tflite # TensorFlow Lite yolov8n_edgetpu.tflite # TensorFlow Edge TPU yolov8n_paddle_model # PaddlePaddle + yolov8n.mnn # MNN yolov8n_ncnn_model # NCNN """ diff --git a/ultralytics/engine/validator.py b/ultralytics/engine/validator.py index daa058a9de..1f6f6912c0 100644 --- a/ultralytics/engine/validator.py +++ b/ultralytics/engine/validator.py @@ -17,6 +17,7 @@ Usage - formats: yolov8n.tflite # TensorFlow Lite yolov8n_edgetpu.tflite # TensorFlow Edge TPU yolov8n_paddle_model # PaddlePaddle + yolov8n.mnn # MNN yolov8n_ncnn_model # NCNN """ diff --git a/ultralytics/nn/autobackend.py b/ultralytics/nn/autobackend.py index b9312fefdb..245e42c4ed 100644 --- a/ultralytics/nn/autobackend.py +++ b/ultralytics/nn/autobackend.py @@ -59,21 +59,22 @@ class AutoBackend(nn.Module): range of formats, each with specific naming conventions as outlined below: Supported Formats and Naming Conventions: - | Format | File Suffix | - |-----------------------|------------------| - | PyTorch | *.pt | - | TorchScript | *.torchscript | - | ONNX Runtime | *.onnx | - | ONNX OpenCV DNN | *.onnx (dnn=True)| - | OpenVINO | *openvino_model/ | - | CoreML | *.mlpackage | - | TensorRT | *.engine | - | TensorFlow SavedModel | *_saved_model | - | TensorFlow GraphDef | *.pb | - | TensorFlow Lite | *.tflite | - | TensorFlow Edge TPU | *_edgetpu.tflite | - | PaddlePaddle | *_paddle_model | - | NCNN | *_ncnn_model | + | Format | File Suffix | + |-----------------------|-------------------| + | PyTorch | *.pt | + | TorchScript | *.torchscript | + | ONNX Runtime | *.onnx | + | ONNX OpenCV DNN | *.onnx (dnn=True) | + | OpenVINO | *openvino_model/ | + | CoreML | *.mlpackage | + | TensorRT | *.engine | + | TensorFlow SavedModel | *_saved_model/ | + | TensorFlow GraphDef | *.pb | + | TensorFlow Lite | *.tflite | + | TensorFlow Edge TPU | *_edgetpu.tflite | + | PaddlePaddle | *_paddle_model/ | + | 
MNN | *.mnn | + | NCNN | *_ncnn_model/ | This class offers dynamic backend switching capabilities based on the input model format, making it easier to deploy models across various platforms. @@ -120,6 +121,7 @@ class AutoBackend(nn.Module): edgetpu, tfjs, paddle, + mnn, ncnn, triton, ) = self._model_type(w) @@ -403,6 +405,26 @@ class AutoBackend(nn.Module): output_names = predictor.get_output_names() metadata = w.parents[1] / "metadata.yaml" + # MNN + elif mnn: + LOGGER.info(f"Loading {w} for MNN inference...") + check_requirements("MNN") # requires MNN + import os + + import MNN + + config = {} + config["precision"] = "low" + config["backend"] = "CPU" + config["numThread"] = (os.cpu_count() + 1) // 2 + rt = MNN.nn.create_runtime_manager((config,)) + net = MNN.nn.load_module_from_file(w, [], [], runtime_manager=rt, rearrange=True) + + def torch_to_mnn(x): + return MNN.expr.const(x.data_ptr(), x.shape) + + metadata = json.loads(net.get_info()["bizCode"]) + # NCNN elif ncnn: LOGGER.info(f"Loading {w} for NCNN inference...") @@ -590,6 +612,12 @@ class AutoBackend(nn.Module): self.predictor.run() y = [self.predictor.get_output_handle(x).copy_to_cpu() for x in self.output_names] + # MNN + elif self.mnn: + input_var = self.torch_to_mnn(im) + output_var = self.net.onForward([input_var]) + y = [x.read() for x in output_var] + # NCNN elif self.ncnn: mat_in = self.pyncnn.Mat(im[0].cpu().numpy()) diff --git a/ultralytics/utils/benchmarks.py b/ultralytics/utils/benchmarks.py index 653f48d3a9..3ddd934db7 100644 --- a/ultralytics/utils/benchmarks.py +++ b/ultralytics/utils/benchmarks.py @@ -21,6 +21,7 @@ TensorFlow Lite | `tflite` | yolov8n.tflite TensorFlow Edge TPU | `edgetpu` | yolov8n_edgetpu.tflite TensorFlow.js | `tfjs` | yolov8n_web_model/ PaddlePaddle | `paddle` | yolov8n_paddle_model/ +MNN | `mnn` | yolov8n.mnn NCNN | `ncnn` | yolov8n_ncnn_model/ """ @@ -111,8 +112,8 @@ def benchmark( assert not isinstance(model, YOLOWorld), "YOLOWorldv2 Paddle exports not supported 
yet" assert not is_end2end, "End-to-end models not supported by PaddlePaddle yet" assert LINUX or MACOS, "Windows Paddle exports not supported yet" - if i in {12}: # NCNN - assert not isinstance(model, YOLOWorld), "YOLOWorldv2 NCNN exports not supported yet" + if i in {12, 13}: # MNN, NCNN + assert not isinstance(model, YOLOWorld), "YOLOWorldv2 MNN, NCNN exports not supported yet" if "cpu" in device.type: assert cpu, "inference not supported on CPU" if "cuda" in device.type: @@ -132,7 +133,7 @@ def benchmark( assert model.task != "pose" or i != 7, "GraphDef Pose inference is not supported" assert i not in {9, 10}, "inference not supported" # Edge TPU and TF.js are unsupported assert i != 5 or platform.system() == "Darwin", "inference only supported on macOS>=10.13" # CoreML - if i in {12}: + if i in {13}: assert not is_end2end, "End-to-end torch.topk operation is not supported for NCNN prediction yet" exported_model.predict(ASSETS / "bus.jpg", imgsz=imgsz, device=device, half=half) From bfa6f9a8e76a6bc1799d9505935c4ad116d8baa1 Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Thu, 31 Oct 2024 16:42:05 +0500 Subject: [PATCH 18/46] Update `sam.md` and `sam-2.md` (#17286) --- docs/en/models/sam-2.md | 4 ++-- docs/en/models/sam.md | 4 ++-- ultralytics/data/annotator.py | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-) diff --git a/docs/en/models/sam-2.md b/docs/en/models/sam-2.md index d5e8888e29..9083899ea6 100644 --- a/docs/en/models/sam-2.md +++ b/docs/en/models/sam-2.md @@ -250,13 +250,13 @@ To auto-annotate your dataset using SAM 2, follow this example: ```python from ultralytics.data.annotator import auto_annotate - auto_annotate(data="path/to/images", det_model="yolov8x.pt", sam_model="sam2_b.pt") + auto_annotate(data="path/to/images", det_model="yolo11x.pt", sam_model="sam2_b.pt") ``` | Argument | Type | Description | Default | | ------------ | ----------------------- | 
------------------------------------------------------------------------------------------------------- | -------------- | | `data` | `str` | Path to a folder containing images to be annotated. | | -| `det_model` | `str`, optional | Pre-trained YOLO detection model. Defaults to 'yolov8x.pt'. | `'yolov8x.pt'` | +| `det_model` | `str`, optional | Pre-trained YOLO detection model. Defaults to 'yolo11x.pt'. | `'yolov8x.pt'` | | `sam_model` | `str`, optional | Pre-trained SAM 2 segmentation model. Defaults to 'sam2_b.pt'. | `'sam2_b.pt'` | | `device` | `str`, optional | Device to run the models on. Defaults to an empty string (CPU or GPU, if available). | | | `output_dir` | `str`, `None`, optional | Directory to save the annotated results. Defaults to a 'labels' folder in the same directory as 'data'. | `None` | diff --git a/docs/en/models/sam.md b/docs/en/models/sam.md index f9acad72df..c38b06e355 100644 --- a/docs/en/models/sam.md +++ b/docs/en/models/sam.md @@ -205,13 +205,13 @@ To auto-annotate your dataset with the Ultralytics framework, use the `auto_anno ```python from ultralytics.data.annotator import auto_annotate - auto_annotate(data="path/to/images", det_model="yolov8x.pt", sam_model="sam_b.pt") + auto_annotate(data="path/to/images", det_model="yolo11x.pt", sam_model="sam_b.pt") ``` | Argument | Type | Description | Default | | ------------ | --------------------- | ------------------------------------------------------------------------------------------------------- | -------------- | | `data` | `str` | Path to a folder containing images to be annotated. | | -| `det_model` | `str`, optional | Pre-trained YOLO detection model. Defaults to 'yolov8x.pt'. | `'yolov8x.pt'` | +| `det_model` | `str`, optional | Pre-trained YOLO detection model. Defaults to 'yolo11x.pt'. | `'yolov8x.pt'` | | `sam_model` | `str`, optional | Pre-trained SAM segmentation model. Defaults to 'sam_b.pt'. | `'sam_b.pt'` | | `device` | `str`, optional | Device to run the models on. 
Defaults to an empty string (CPU or GPU, if available). | | | `output_dir` | `str`, None, optional | Directory to save the annotated results. Defaults to a 'labels' folder in the same directory as 'data'. | `None` | diff --git a/ultralytics/data/annotator.py b/ultralytics/data/annotator.py index 30d02d9d73..3880741d34 100644 --- a/ultralytics/data/annotator.py +++ b/ultralytics/data/annotator.py @@ -5,7 +5,7 @@ from pathlib import Path from ultralytics import SAM, YOLO -def auto_annotate(data, det_model="yolov8x.pt", sam_model="sam_b.pt", device="", output_dir=None): +def auto_annotate(data, det_model="yolo11x.pt", sam_model="sam_b.pt", device="", output_dir=None): """ Automatically annotates images using a YOLO object detection model and a SAM segmentation model. From 66adbd79ad4cdb87082cec0023588a1dc4b6d88c Mon Sep 17 00:00:00 2001 From: Compunet <117437050+dme-compunet@users.noreply.github.com> Date: Thu, 31 Oct 2024 13:46:46 +0200 Subject: [PATCH 19/46] Update examples/README.md (#17284) --- examples/README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/examples/README.md b/examples/README.md index 22da53f294..ab875b3ba8 100644 --- a/examples/README.md +++ b/examples/README.md @@ -8,7 +8,7 @@ This directory features a collection of real-world applications and walkthroughs | ----------------------------------------------------------------------------------------------------------------------------------------- | ------------------ | ----------------------------------------------------------------------------------------- | | [YOLO ONNX Detection Inference with C++](./YOLOv8-CPP-Inference) | C++/ONNX | [Justas Bartnykas](https://github.com/JustasBart) | | [YOLO OpenCV ONNX Detection Python](./YOLOv8-OpenCV-ONNX-Python) | OpenCV/Python/ONNX | [Farid Inawan](https://github.com/frdteknikelektro) | -| [YOLOv8 .NET ONNX ImageSharp](https://github.com/dme-compunet/YOLOv8) | C#/ONNX/ImageSharp | [Compunet](https://github.com/dme-compunet) | +| 
[YOLO C# ONNX-Runtime](https://github.com/dme-compunet/YoloSharp) | .NET/ONNX-Runtime | [Compunet](https://github.com/dme-compunet) | | [YOLO .Net ONNX Detection C#](https://www.nuget.org/packages/Yolov8.Net) | C# .Net | [Samuel Stainback](https://github.com/sstainba) | | [YOLOv8 on NVIDIA Jetson(TensorRT and DeepStream)](https://wiki.seeedstudio.com/YOLOv8-DeepStream-TRT-Jetson/) | Python | [Lakshantha](https://github.com/lakshanthad) | | [YOLOv8 ONNXRuntime Python](./YOLOv8-ONNXRuntime) | Python/ONNXRuntime | [Semih Demirel](https://github.com/semihhdemirel) | From b8783cad24dc751a33b708f454d9d745d91578e6 Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Thu, 31 Oct 2024 12:48:24 +0100 Subject: [PATCH 20/46] Patch MNN test order bug (#17290) Co-authored-by: UltralyticsAssistant --- tests/test_exports.py | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/tests/test_exports.py b/tests/test_exports.py index 12443fa30c..a05f0e0593 100644 --- a/tests/test_exports.py +++ b/tests/test_exports.py @@ -193,14 +193,14 @@ def test_export_paddle(): @pytest.mark.slow -def test_export_ncnn(): - """Test YOLO exports to NCNN format.""" - file = YOLO(MODEL).export(format="ncnn", imgsz=32) +def test_export_mnn(): + """Test YOLO exports to MNN format (WARNING: MNN test must precede NCNN test or CI error on Windows).""" + file = YOLO(MODEL).export(format="mnn", imgsz=32) YOLO(file)(SOURCE, imgsz=32) # exported model inference @pytest.mark.slow -def test_export_mnn(): - """Test YOLO exports to MNN format.""" - file = YOLO(MODEL).export(format="mnn", imgsz=32) +def test_export_ncnn(): + """Test YOLO exports to NCNN format.""" + file = YOLO(MODEL).export(format="ncnn", imgsz=32) YOLO(file)(SOURCE, imgsz=32) # exported model inference From c943a3b747cd445a31c116690823d17a5996b0b9 Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Thu, 31 Oct 2024 16:52:14 +0500 Subject: [PATCH 21/46] Case-insensitive optimizer name (#17287) Co-authored-by: 
UltralyticsAssistant Co-authored-by: Glenn Jocher --- ultralytics/engine/trainer.py | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/ultralytics/engine/trainer.py b/ultralytics/engine/trainer.py index 352067397f..e82aed9e08 100644 --- a/ultralytics/engine/trainer.py +++ b/ultralytics/engine/trainer.py @@ -791,6 +791,8 @@ class BaseTrainer: else: # weight (with decay) g[0].append(param) + optimizers = {"Adam", "Adamax", "AdamW", "NAdam", "RAdam", "RMSProp", "SGD", "auto"} + name = {x.lower(): x for x in optimizers}.get(name.lower(), None) if name in {"Adam", "Adamax", "AdamW", "NAdam", "RAdam"}: optimizer = getattr(optim, name, optim.Adam)(g[2], lr=lr, betas=(momentum, 0.999), weight_decay=0.0) elif name == "RMSProp": @@ -799,9 +801,8 @@ class BaseTrainer: optimizer = optim.SGD(g[2], lr=lr, momentum=momentum, nesterov=True) else: raise NotImplementedError( - f"Optimizer '{name}' not found in list of available optimizers " - f"[Adam, AdamW, NAdam, RAdam, RMSProp, SGD, auto]." - "To request support for addition optimizers please visit https://github.com/ultralytics/ultralytics." + f"Optimizer '{name}' not found in list of available optimizers {optimizers}. " + "Request support for addition optimizers at https://github.com/ultralytics/ultralytics." 
) optimizer.add_param_group({"params": g[0], "weight_decay": decay}) # add g0 with weight_decay From e8743f2ac9f43d83143e6575598282a8ec55cd88 Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Thu, 31 Oct 2024 17:35:26 +0500 Subject: [PATCH 22/46] Auto annotation new parameters for SAM models (#17288) Co-authored-by: UltralyticsAssistant --- docs/en/models/sam-2.md | 5 ++++- docs/en/models/sam.md | 5 ++++- ultralytics/data/annotator.py | 9 +++++++-- 3 files changed, 15 insertions(+), 4 deletions(-) diff --git a/docs/en/models/sam-2.md b/docs/en/models/sam-2.md index 9083899ea6..de5881c42e 100644 --- a/docs/en/models/sam-2.md +++ b/docs/en/models/sam-2.md @@ -256,9 +256,12 @@ To auto-annotate your dataset using SAM 2, follow this example: | Argument | Type | Description | Default | | ------------ | ----------------------- | ------------------------------------------------------------------------------------------------------- | -------------- | | `data` | `str` | Path to a folder containing images to be annotated. | | -| `det_model` | `str`, optional | Pre-trained YOLO detection model. Defaults to 'yolo11x.pt'. | `'yolov8x.pt'` | +| `det_model` | `str`, optional | Pre-trained YOLO detection model. Defaults to 'yolo11x.pt'. | `'yolo11x.pt'` | | `sam_model` | `str`, optional | Pre-trained SAM 2 segmentation model. Defaults to 'sam2_b.pt'. | `'sam2_b.pt'` | | `device` | `str`, optional | Device to run the models on. Defaults to an empty string (CPU or GPU, if available). | | +| `conf` | `float`, optional | Confidence threshold for detection model; default is 0.25. | `0.25` | +| `iou` | `float`, optional | IoU threshold for filtering overlapping boxes in detection results; default is 0.45. | `0.45` | +| `imgsz` | `int`, optional | Input image resize dimension; default is 640. | `640` | | `output_dir` | `str`, `None`, optional | Directory to save the annotated results. Defaults to a 'labels' folder in the same directory as 'data'. 
| `None` | This function facilitates the rapid creation of high-quality segmentation datasets, ideal for researchers and developers aiming to accelerate their projects. diff --git a/docs/en/models/sam.md b/docs/en/models/sam.md index c38b06e355..fe4c01bd8b 100644 --- a/docs/en/models/sam.md +++ b/docs/en/models/sam.md @@ -211,9 +211,12 @@ To auto-annotate your dataset with the Ultralytics framework, use the `auto_anno | Argument | Type | Description | Default | | ------------ | --------------------- | ------------------------------------------------------------------------------------------------------- | -------------- | | `data` | `str` | Path to a folder containing images to be annotated. | | -| `det_model` | `str`, optional | Pre-trained YOLO detection model. Defaults to 'yolo11x.pt'. | `'yolov8x.pt'` | +| `det_model` | `str`, optional | Pre-trained YOLO detection model. Defaults to 'yolo11x.pt'. | `'yolo11x.pt'` | | `sam_model` | `str`, optional | Pre-trained SAM segmentation model. Defaults to 'sam_b.pt'. | `'sam_b.pt'` | | `device` | `str`, optional | Device to run the models on. Defaults to an empty string (CPU or GPU, if available). | | +| `conf` | `float`, optional | Confidence threshold for detection model; default is 0.25. | `0.25` | +| `iou` | `float`, optional | IoU threshold for filtering overlapping boxes in detection results; default is 0.45. | `0.45` | +| `imgsz` | `int`, optional | Input image resize dimension; default is 640. | `640` | | `output_dir` | `str`, None, optional | Directory to save the annotated results. Defaults to a 'labels' folder in the same directory as 'data'. | `None` | The `auto_annotate` function takes the path to your images, with optional arguments for specifying the pre-trained detection and SAM segmentation models, the device to run the models on, and the output directory for saving the annotated results. 
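Note on the new `conf`, `iou` and `imgsz` arguments documented above: the sketch below shows one way a caller might assemble them before calling `auto_annotate`. It is an illustration under stated assumptions, not part of this patch. `resolve_output_dir`, `build_annotate_kwargs` and `run_auto_annotate` are hypothetical helper names, and the range checks are an assumption rather than documented behavior. The defaults (0.25, 0.45, 640) and the `<stem>_auto_annotate_labels` fallback directory are taken from the `annotator.py` hunk in this commit.

```python
from pathlib import Path


def resolve_output_dir(data, output_dir=None):
    """Return the label directory, mirroring the fallback seen in the annotator.py hunk."""
    data = Path(data)
    if output_dir is None:
        # Fallback naming from the diff: <parent>/<stem>_auto_annotate_labels
        return data.parent / f"{data.stem}_auto_annotate_labels"
    return Path(output_dir)


def build_annotate_kwargs(data, conf=0.25, iou=0.45, imgsz=640, output_dir=None):
    """Hypothetical helper: sanity-check values and collect keyword arguments for auto_annotate."""
    # These range checks are an assumption for illustration, not library behavior.
    if not 0.0 <= conf <= 1.0:
        raise ValueError(f"conf must be in [0, 1], got {conf}")
    if not 0.0 <= iou <= 1.0:
        raise ValueError(f"iou must be in [0, 1], got {iou}")
    if imgsz <= 0:
        raise ValueError(f"imgsz must be positive, got {imgsz}")
    return {
        "data": str(data),
        "det_model": "yolo11x.pt",  # default after this patch series
        "sam_model": "sam_b.pt",
        "conf": conf,
        "iou": iou,
        "imgsz": imgsz,
        "output_dir": str(resolve_output_dir(data, output_dir)),
    }


def run_auto_annotate(data, **overrides):
    """Call the real function; requires `ultralytics` and fetches model weights on first use."""
    # Imported lazily so the helpers above work without ultralytics installed.
    from ultralytics.data.annotator import auto_annotate

    auto_annotate(**build_annotate_kwargs(data, **overrides))
```

For example, `run_auto_annotate("path/to/images", conf=0.4, imgsz=1280)` would pass a stricter confidence threshold and a larger input size through to the detection model, leaving `iou` at its default.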
diff --git a/ultralytics/data/annotator.py b/ultralytics/data/annotator.py index 3880741d34..64ee9af6c0 100644 --- a/ultralytics/data/annotator.py +++ b/ultralytics/data/annotator.py @@ -5,7 +5,9 @@ from pathlib import Path from ultralytics import SAM, YOLO -def auto_annotate(data, det_model="yolo11x.pt", sam_model="sam_b.pt", device="", output_dir=None): +def auto_annotate( + data, det_model="yolo11x.pt", sam_model="sam_b.pt", device="", conf=0.25, iou=0.45, imgsz=640, output_dir=None +): """ Automatically annotates images using a YOLO object detection model and a SAM segmentation model. @@ -17,6 +19,9 @@ def auto_annotate(data, det_model="yolo11x.pt", sam_model="sam_b.pt", device="", det_model (str): Path or name of the pre-trained YOLO detection model. sam_model (str): Path or name of the pre-trained SAM segmentation model. device (str): Device to run the models on (e.g., 'cpu', 'cuda', '0'). + conf (float): Confidence threshold for detection model; default is 0.25. + iou (float): IoU threshold for filtering overlapping boxes in detection results; default is 0.45. + imgsz (int): Input image resize dimension; default is 640. output_dir (str | None): Directory to save the annotated results. If None, a default directory is created. 
Examples: @@ -36,7 +41,7 @@ def auto_annotate(data, det_model="yolo11x.pt", sam_model="sam_b.pt", device="", output_dir = data.parent / f"{data.stem}_auto_annotate_labels" Path(output_dir).mkdir(exist_ok=True, parents=True) - det_results = det_model(data, stream=True, device=device) + det_results = det_model(data, stream=True, device=device, conf=conf, iou=iou, imgsz=imgsz) for result in det_results: class_ids = result.boxes.cls.int().tolist() # noqa From f4e7756bff5c1b0c246afb491f42e9f9de84dc84 Mon Sep 17 00:00:00 2001 From: Laughing <61612323+Laughing-q@users.noreply.github.com> Date: Thu, 31 Oct 2024 20:36:27 +0800 Subject: [PATCH 23/46] `ultralytics 8.3.26` EdgeTPU Pose models fix (#17281) Co-authored-by: Glenn Jocher Co-authored-by: UltralyticsAssistant --- ultralytics/__init__.py | 2 +- ultralytics/nn/autobackend.py | 3 +++ ultralytics/nn/modules/head.py | 18 +++++++++++++++--- 3 files changed, 19 insertions(+), 4 deletions(-) diff --git a/ultralytics/__init__.py b/ultralytics/__init__.py index c847dd4d18..fedf8629a8 100644 --- a/ultralytics/__init__.py +++ b/ultralytics/__init__.py @@ -1,6 +1,6 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license -__version__ = "8.3.25" +__version__ = "8.3.26" import os diff --git a/ultralytics/nn/autobackend.py b/ultralytics/nn/autobackend.py index 245e42c4ed..cef05a3571 100644 --- a/ultralytics/nn/autobackend.py +++ b/ultralytics/nn/autobackend.py @@ -663,6 +663,9 @@ class AutoBackend(nn.Module): else: x[:, [0, 2]] *= w x[:, [1, 3]] *= h + if self.task == "pose": + x[:, 5::3] *= w + x[:, 6::3] *= h y.append(x) # TF segment fixes: export is reversed vs ONNX export and protos are transposed if len(y) == 2: # segment with (det, proto) output order reversed diff --git a/ultralytics/nn/modules/head.py b/ultralytics/nn/modules/head.py index 4bc1fa25e7..84c31709ca 100644 --- a/ultralytics/nn/modules/head.py +++ b/ultralytics/nn/modules/head.py @@ -246,9 +246,21 @@ class Pose(Detect): def kpts_decode(self, bs, kpts): """Decodes keypoints.""" 
ndim = self.kpt_shape[1] - if self.export: # required for TFLite export to avoid 'PLACEHOLDER_FOR_GREATER_OP_CODES' bug - y = kpts.view(bs, *self.kpt_shape, -1) - a = (y[:, :, :2] * 2.0 + (self.anchors - 0.5)) * self.strides + if self.export: + if self.format in { + "tflite", + "edgetpu", + }: # required for TFLite export to avoid 'PLACEHOLDER_FOR_GREATER_OP_CODES' bug + # Precompute normalization factor to increase numerical stability + y = kpts.view(bs, *self.kpt_shape, -1) + grid_h, grid_w = self.shape[2], self.shape[3] + grid_size = torch.tensor([grid_w, grid_h], device=y.device).reshape(1, 2, 1) + norm = self.strides / (self.stride[0] * grid_size) + a = (y[:, :, :2] * 2.0 + (self.anchors - 0.5)) * norm + else: + # NCNN fix + y = kpts.view(bs, *self.kpt_shape, -1) + a = (y[:, :, :2] * 2.0 + (self.anchors - 0.5)) * self.strides if ndim == 3: a = torch.cat((a, y[:, :, 2:3].sigmoid()), 2) return a.view(bs, self.nk, -1) From 1e70710f3eb3779bcbaa38fb428b1be53fc913eb Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Thu, 31 Oct 2024 22:18:28 +0500 Subject: [PATCH 24/46] Add model comparison graphs in `benchmark.md` (#17212) Co-authored-by: UltralyticsAssistant Co-authored-by: Glenn Jocher --- docs/en/modes/benchmark.md | 20 +++++++ docs/overrides/javascript/extra.js | 88 ++++++++++++++++++++++++++++++ 2 files changed, 108 insertions(+) diff --git a/docs/en/modes/benchmark.md b/docs/en/modes/benchmark.md index 3086e98ec6..00d851bea2 100644 --- a/docs/en/modes/benchmark.md +++ b/docs/en/modes/benchmark.md @@ -8,6 +8,26 @@ keywords: model benchmarking, YOLO11, Ultralytics, performance evaluation, expor Ultralytics YOLO ecosystem and integrations +## Benchmark Visualization + + + +!!! tip "Refresh Browser" + + You may need to refresh the page to view the graphs correctly due to potential cookie issues. + +
+ + + + + + + + +
+ + ## Introduction Once your model is trained and validated, the next logical step is to evaluate its performance in various real-world scenarios. Benchmark mode in Ultralytics YOLO11 serves this purpose by providing a robust framework for assessing the speed and [accuracy](https://www.ultralytics.com/glossary/accuracy) of your model across a range of export formats. diff --git a/docs/overrides/javascript/extra.js b/docs/overrides/javascript/extra.js index 5029ff4893..13f07397e1 100644 --- a/docs/overrides/javascript/extra.js +++ b/docs/overrides/javascript/extra.js @@ -147,3 +147,91 @@ document.addEventListener("DOMContentLoaded", () => { addInkeepWidget(); // initialize the widget }); }); + +// This object contains the benchmark data for various object detection models. +const data = { + 'YOLOv5': {s: {speed: 1.92, mAP: 37.4}, m: {speed: 4.03, mAP: 45.4}, l: {speed: 6.61, mAP: 49.0}, x: {speed: 11.89, mAP: 50.7}}, + 'YOLOv6': {n: {speed: 1.17, mAP: 37.5}, s: {speed: 2.66, mAP: 45.0}, m: {speed: 5.28, mAP: 50.0}, l: {speed: 8.95, mAP: 52.8}}, + 'YOLOv7': {l: {speed: 6.84, mAP: 51.4}, x: {speed: 11.57, mAP: 53.1}}, + 'YOLOv8': {n: {speed: 1.47, mAP: 37.3}, s: {speed: 2.66, mAP: 44.9}, m: {speed: 5.86, mAP: 50.2}, l: {speed: 9.06, mAP: 52.9}, x: {speed: 14.37, mAP: 53.9}}, + 'YOLOv9': {t: {speed: 2.30, mAP: 37.8}, s: {speed: 3.54, mAP: 46.5}, m: {speed: 6.43, mAP: 51.5}, c: {speed: 7.16, mAP: 52.8}, e: {speed: 16.77, mAP: 55.1}}, + 'YOLOv10': {n: {speed: 1.56, mAP: 39.5}, s: {speed: 2.66, mAP: 46.7}, m: {speed: 5.48, mAP: 51.3}, b: {speed: 6.54, mAP: 52.7}, l: {speed: 8.33, mAP: 53.3}, x: {speed: 12.2, mAP: 54.4}}, + 'PPYOLOE': {t: {speed: 2.84, mAP: 39.9}, s: {speed: 2.62, mAP: 43.7}, m: {speed: 5.56, mAP: 49.8}, l: {speed: 8.36, mAP: 52.9}, x: {speed: 14.3, mAP: 54.7}}, + 'YOLO11': {n: {speed: 1.55, mAP: 39.5}, s: {speed: 2.63, mAP: 47.0}, m: {speed: 5.27, mAP: 51.4}, l: {speed: 6.84, mAP: 53.2}, x: {speed: 12.49, mAP: 54.7}} +}; + +let chart = null; // chart 
variable will hold the reference to the current chart instance. + +// This function is responsible for updating the benchmarks chart. +function updateChart() { + // If a chart instance already exists, destroy it. + if (chart) { chart.destroy(); } + + // Get the selected algorithms from the checkboxes. + const selectedAlgorithms = [...document.querySelectorAll('input[name="algorithm"]:checked')].map(e => e.value); + + // Create the datasets for the selected algorithms. + const datasets = selectedAlgorithms.map((algorithm, index) => ({ + label: algorithm, // Label for the data points in the legend. + data: Object.entries(data[algorithm]).map(([version, point]) => ({ + x: point.speed, // Speed data points on the x-axis. + y: point.mAP, // mAP data points on the y-axis. + version: version.toUpperCase() // Store the version as additional data. + })), + fill: false, // Don't fill the chart. + borderColor: `hsl(${index * 90}, 70%, 50%)`, // Assign a unique color to each dataset. + tension: 0.3, // Smooth the line. + pointRadius: 5, // Increase the dot size. + pointHoverRadius: 10, // Increase the dot size on hover. + borderWidth: 2 // Set the line thickness. + })); + + // If there are no selected algorithms, return without creating a new chart. + if (datasets.length === 0) return; + + // Create a new chart instance. + chart = new Chart(document.getElementById('chart').getContext('2d'), { + type: 'line', // Set the chart type to line. + data: { datasets }, + options: { + plugins: { + legend: { display: true, position: 'top', labels: { color: '#111e68' } }, // Configure the legend. + tooltip: { + callbacks: { + label: (tooltipItem) => { + const { dataset, dataIndex } = tooltipItem; + const point = dataset.data[dataIndex]; + return `${dataset.label}${point.version.toLowerCase()}: Speed = ${point.x}, mAP = ${point.y}`; // Custom tooltip label. + } + }, + mode: 'nearest', + intersect: false + } // Configure the tooltip. 
+ }, + interaction: { mode: 'nearest', axis: 'x', intersect: false }, // Configure the interaction mode. + scales: { + x: { + type: 'linear', position: 'bottom', + title: { display: true, text: 'Latency T4 TensorRT10 FP16 (ms/img)', color: '#111e68' }, // X-axis title. + grid: { color: '#e0e0e0' }, // Grid line color. + ticks: { color: '#111e68' } // Tick label color. + }, + y: { + title: { display: true, text: 'mAP', color: '#111e68' }, // Y-axis title. + grid: { color: '#e0e0e0' }, // Grid line color. + ticks: { color: '#111e68' } // Tick label color. + } + } + } + }); +} + +// Add event listeners to the checkboxes to trigger the chart update. +document.addEventListener("DOMContentLoaded", () => { + document.querySelectorAll('input[name="algorithm"]').forEach(checkbox => + checkbox.addEventListener('change', updateChart) + ); + // Call updateChart on initial load + updateChart(); + console.log("DOM loaded, initial chart render attempted"); +}); From daaac35fffe0889ce3e6371fff0253434b5f0c9b Mon Sep 17 00:00:00 2001 From: Lakshantha Dissanayake Date: Thu, 31 Oct 2024 17:12:29 -0700 Subject: [PATCH 25/46] Skip MNN export for Raspberry Pi and NVIDIA Jetson (#17292) Co-authored-by: Glenn Jocher Co-authored-by: UltralyticsAssistant --- tests/test_exports.py | 1 + ultralytics/engine/exporter.py | 3 +++ 2 files changed, 4 insertions(+) diff --git a/tests/test_exports.py b/tests/test_exports.py index a05f0e0593..5a54b1afa6 100644 --- a/tests/test_exports.py +++ b/tests/test_exports.py @@ -193,6 +193,7 @@ def test_export_paddle(): @pytest.mark.slow +@pytest.mark.skipif(IS_RASPBERRYPI, reason="MNN not supported on Raspberry Pi") def test_export_mnn(): """Test YOLO exports to MNN format (WARNING: MNN test must precede NCNN test or CI error on Windows).""" file = YOLO(MODEL).export(format="mnn", imgsz=32) diff --git a/ultralytics/engine/exporter.py b/ultralytics/engine/exporter.py index ea8d03b468..223454f600 100644 --- a/ultralytics/engine/exporter.py +++ 
b/ultralytics/engine/exporter.py @@ -77,6 +77,7 @@ from ultralytics.utils import ( ARM64, DEFAULT_CFG, IS_JETSON, + IS_RASPBERRYPI, LINUX, LOGGER, MACOS, @@ -244,6 +245,8 @@ class Exporter: "WARNING ⚠️ INT8 export requires a missing 'data' arg for calibration. " f"Using default 'data={self.args.data}'." ) + if mnn and (IS_RASPBERRYPI or IS_JETSON): + raise SystemError("MNN export not supported on Raspberry Pi and NVIDIA Jetson") # Input im = torch.zeros(self.args.batch, 3, *self.imgsz).to(self.device) file = Path( From 7cb36d64b23e311eadd9a75f402d599598396893 Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Fri, 1 Nov 2024 05:34:03 +0500 Subject: [PATCH 26/46] Benchmark graph fix (#17296) Co-authored-by: Glenn Jocher --- docs/en/modes/benchmark.md | 22 +++++++++++--------- docs/overrides/javascript/extra.js | 33 ++++++++++++++++-------------- 2 files changed, 30 insertions(+), 25 deletions(-) diff --git a/docs/en/modes/benchmark.md b/docs/en/modes/benchmark.md index 00d851bea2..b562a979ec 100644 --- a/docs/en/modes/benchmark.md +++ b/docs/en/modes/benchmark.md @@ -16,17 +16,19 @@ keywords: model benchmarking, YOLO11, Ultralytics, performance evaluation, expor You may need to refresh the page to view the graphs correctly due to potential cookie issues. -
- - - - - - - - +
+
+
+
+
+
+
+
+
+ +
+
- ## Introduction diff --git a/docs/overrides/javascript/extra.js b/docs/overrides/javascript/extra.js index 13f07397e1..e2faf7986e 100644 --- a/docs/overrides/javascript/extra.js +++ b/docs/overrides/javascript/extra.js @@ -165,7 +165,7 @@ let chart = null; // chart variable will hold the reference to the current char // This function is responsible for updating the benchmarks chart. function updateChart() { // If a chart instance already exists, destroy it. - if (chart) { chart.destroy(); } + if (chart) chart.destroy(); // Get the selected algorithms from the checkboxes. const selectedAlgorithms = [...document.querySelectorAll('input[name="algorithm"]:checked')].map(e => e.value); @@ -195,7 +195,7 @@ function updateChart() { data: { datasets }, options: { plugins: { - legend: { display: true, position: 'top', labels: { color: '#111e68' } }, // Configure the legend. + legend: { display: true, position: 'top', labels: {color: '#808080'} }, // Configure the legend. tooltip: { callbacks: { label: (tooltipItem) => { @@ -212,26 +212,29 @@ function updateChart() { scales: { x: { type: 'linear', position: 'bottom', - title: { display: true, text: 'Latency T4 TensorRT10 FP16 (ms/img)', color: '#111e68' }, // X-axis title. + title: { display: true, text: 'Latency T4 TensorRT10 FP16 (ms/img)', color: '#808080'}, // X-axis title. grid: { color: '#e0e0e0' }, // Grid line color. - ticks: { color: '#111e68' } // Tick label color. + ticks: { color: '#808080' } // Tick label color. }, y: { - title: { display: true, text: 'mAP', color: '#111e68' }, // Y-axis title. + title: { display: true, text: 'mAP', color: '#808080'}, // Y-axis title. grid: { color: '#e0e0e0' }, // Grid line color. - ticks: { color: '#111e68' } // Tick label color. + ticks: { color: '#808080' } // Tick label color. } } } }); } -// Add event listeners to the checkboxes to trigger the chart update. 
-document.addEventListener("DOMContentLoaded", () => { - document.querySelectorAll('input[name="algorithm"]').forEach(checkbox => - checkbox.addEventListener('change', updateChart) - ); - // Call updateChart on initial load - updateChart(); - console.log("DOM loaded, initial chart render attempted"); -}); +// Poll for Chart.js to load, then initialize checkboxes and chart +function initializeApp() { + if (typeof Chart !== 'undefined') { + document.querySelectorAll('input[name="algorithm"]').forEach(checkbox => + checkbox.addEventListener('change', updateChart) + ); + updateChart(); + } else { + setTimeout(initializeApp, 100); // Retry every 100ms + } +} +document.addEventListener("DOMContentLoaded", initializeApp); // Initial chart rendering on page load From 3a4b65c347863e0bb1f1eb6b797a9bc59936bf3b Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Fri, 1 Nov 2024 01:42:51 +0100 Subject: [PATCH 27/46] `ultralytics 8.3.27` HUB timed training fix (#17298) Signed-off-by: UltralyticsAssistant Co-authored-by: UltralyticsAssistant --- docs/mkdocs_github_authors.yaml | 11 +++++++---- ultralytics/__init__.py | 2 +- ultralytics/engine/trainer.py | 2 +- ultralytics/utils/checks.py | 8 ++------ 4 files changed, 11 insertions(+), 12 deletions(-) diff --git a/docs/mkdocs_github_authors.yaml b/docs/mkdocs_github_authors.yaml index 55ac6ec959..3e6919e7fe 100644 --- a/docs/mkdocs_github_authors.yaml +++ b/docs/mkdocs_github_authors.yaml @@ -5,8 +5,8 @@ avatar: https://avatars.githubusercontent.com/u/116908874?v=4 username: jk4e 1185102784@qq.com: - avatar: null - username: null + avatar: https://avatars.githubusercontent.com/u/61612323?v=4 + username: Laughing-q 130829914+IvorZhu331@users.noreply.github.com: avatar: https://avatars.githubusercontent.com/u/130829914?v=4 username: IvorZhu331 @@ -137,8 +137,8 @@ rulosanti@gmail.com: avatar: null username: null shuizhuyuanluo@126.com: - avatar: null - username: null + avatar: https://avatars.githubusercontent.com/u/171016?v=4 + 
username: https://github.com/nihui sometimesocrazy@gmail.com: avatar: null username: null @@ -157,3 +157,6 @@ xinwang614@gmail.com: zhaode.wzd@alibaba-inc.com: avatar: https://avatars.githubusercontent.com/u/8401806?v=4 username: ZhaodeWang +davis.justin@mssm.org: + avatar: https://avatars.githubusercontent.com/u/23462437?v=4 + username: justincdavis diff --git a/ultralytics/__init__.py b/ultralytics/__init__.py index fedf8629a8..e24b210eda 100644 --- a/ultralytics/__init__.py +++ b/ultralytics/__init__.py @@ -1,6 +1,6 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license -__version__ = "8.3.26" +__version__ = "8.3.27" import os diff --git a/ultralytics/engine/trainer.py b/ultralytics/engine/trainer.py index e82aed9e08..068274a429 100644 --- a/ultralytics/engine/trainer.py +++ b/ultralytics/engine/trainer.py @@ -118,7 +118,7 @@ class BaseTrainer: self.save_period = self.args.save_period self.batch_size = self.args.batch - self.epochs = self.args.epochs + self.epochs = self.args.epochs or 100 # in case users accidentally pass epochs=None with timed training self.start_epoch = 0 if RANK == -1: print_args(vars(self.args)) diff --git a/ultralytics/utils/checks.py b/ultralytics/utils/checks.py index 9591d3dea2..3a8201a54e 100644 --- a/ultralytics/utils/checks.py +++ b/ultralytics/utils/checks.py @@ -23,7 +23,6 @@ from ultralytics.utils import ( AUTOINSTALL, IS_COLAB, IS_GIT_DIR, - IS_JUPYTER, IS_KAGGLE, IS_PIP_PACKAGE, LINUX, @@ -569,11 +568,8 @@ def check_yolo(verbose=True, device=""): from ultralytics.utils.torch_utils import select_device - if IS_JUPYTER: - if check_requirements("wandb", install=False): - os.system("pip uninstall -y wandb") # uninstall wandb: unwanted account creation prompt with infinite hang - if IS_COLAB: - shutil.rmtree("sample_data", ignore_errors=True) # remove colab /sample_data directory + if IS_COLAB: + shutil.rmtree("sample_data", ignore_errors=True) # remove colab /sample_data directory if verbose: # System info From 
591fdbd8b1a48eb820bd6dffe3d128db809f323d Mon Sep 17 00:00:00 2001 From: Laughing <61612323+Laughing-q@users.noreply.github.com> Date: Fri, 1 Nov 2024 21:08:30 +0800 Subject: [PATCH 28/46] Fix `Bboxes` numpy.reshape (#17301) --- ultralytics/utils/instance.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ultralytics/utils/instance.py b/ultralytics/utils/instance.py index f883895719..d18bdb612c 100644 --- a/ultralytics/utils/instance.py +++ b/ultralytics/utils/instance.py @@ -176,7 +176,7 @@ class Bboxes: length as the number of bounding boxes. """ if isinstance(index, int): - return Bboxes(self.bboxes[index].view(1, -1)) + return Bboxes(self.bboxes[index].reshape(1, -1)) b = self.bboxes[index] assert b.ndim == 2, f"Indexing on Bboxes with {index} failed to return a matrix!" return Bboxes(b) From 4ca50c8c377c5b7a63723777b6f91ccd0a836dc8 Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Fri, 1 Nov 2024 16:11:48 +0100 Subject: [PATCH 29/46] Fix MNN Raspberry Pi benchmark attempt (#17308) --- ultralytics/utils/benchmarks.py | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-) diff --git a/ultralytics/utils/benchmarks.py b/ultralytics/utils/benchmarks.py index 3ddd934db7..13d940780f 100644 --- a/ultralytics/utils/benchmarks.py +++ b/ultralytics/utils/benchmarks.py @@ -108,12 +108,16 @@ def benchmark( assert not isinstance(model, YOLOWorld), "YOLOWorldv2 TensorFlow exports not supported by onnx2tf yet" if i in {9, 10}: # TF EdgeTPU and TF.js assert not isinstance(model, YOLOWorld), "YOLOWorldv2 TensorFlow exports not supported by onnx2tf yet" - if i in {11}: # Paddle + if i == 11: # Paddle assert not isinstance(model, YOLOWorld), "YOLOWorldv2 Paddle exports not supported yet" assert not is_end2end, "End-to-end models not supported by PaddlePaddle yet" assert LINUX or MACOS, "Windows Paddle exports not supported yet" - if i in {12, 13}: # MNN, NCNN - assert not isinstance(model, YOLOWorld), "YOLOWorldv2 MNN, NCNN exports not supported yet" + 
if i == 12: # MNN + assert not isinstance(model, YOLOWorld), "YOLOWorldv2 MNN exports not supported yet" + assert not IS_RASPBERRYPI, "MNN export not supported on Raspberry Pi" + assert not IS_JETSON, "MNN export not supported on NVIDIA Jetson" + if i == 13: # NCNN + assert not isinstance(model, YOLOWorld), "YOLOWorldv2 NCNN exports not supported yet" if "cpu" in device.type: assert cpu, "inference not supported on CPU" if "cuda" in device.type: From 19d9f77cc291f5fb5e11b3229eb3db8f4fbb794b Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Sat, 2 Nov 2024 12:07:41 +0100 Subject: [PATCH 30/46] Fix mkdocs_github_authors.yaml (#17314) --- docs/mkdocs_github_authors.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/mkdocs_github_authors.yaml b/docs/mkdocs_github_authors.yaml index 3e6919e7fe..5240ff8e53 100644 --- a/docs/mkdocs_github_authors.yaml +++ b/docs/mkdocs_github_authors.yaml @@ -138,7 +138,7 @@ rulosanti@gmail.com: username: null shuizhuyuanluo@126.com: avatar: https://avatars.githubusercontent.com/u/171016?v=4 - username: https://github.com/nihui + username: nihui sometimesocrazy@gmail.com: avatar: null username: null @@ -156,7 +156,7 @@ xinwang614@gmail.com: username: GreatV zhaode.wzd@alibaba-inc.com: avatar: https://avatars.githubusercontent.com/u/8401806?v=4 - username: ZhaodeWang + username: wangzhaode davis.justin@mssm.org: avatar: https://avatars.githubusercontent.com/u/23462437?v=4 username: justincdavis From 788387831aa37e29c3fdf5dd62d47624b5db6dc6 Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Sat, 2 Nov 2024 12:59:48 +0100 Subject: [PATCH 31/46] Update mkdocs_github_authors.yaml (#17320) --- docs/mkdocs_github_authors.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/mkdocs_github_authors.yaml b/docs/mkdocs_github_authors.yaml index 5240ff8e53..f91a730b87 100644 --- a/docs/mkdocs_github_authors.yaml +++ b/docs/mkdocs_github_authors.yaml @@ -157,6 +157,9 @@ xinwang614@gmail.com: 
zhaode.wzd@alibaba-inc.com: avatar: https://avatars.githubusercontent.com/u/8401806?v=4 username: wangzhaode +8401806+wangzhaode@users.noreply.github.com: + avatar: https://avatars.githubusercontent.com/u/8401806?v=4 + username: wangzhaode davis.justin@mssm.org: avatar: https://avatars.githubusercontent.com/u/23462437?v=4 username: justincdavis From d28caa9a58dc720a71d4916d7a9c69a376ed7a6c Mon Sep 17 00:00:00 2001 From: Mohammed Yasin <32206511+Y-T-G@users.noreply.github.com> Date: Sat, 2 Nov 2024 20:00:05 +0800 Subject: [PATCH 32/46] Refactor TFLite example. Support FP32, Fp16, INT8 models (#17317) Co-authored-by: UltralyticsAssistant Co-authored-by: Glenn Jocher --- examples/README.md | 2 +- .../README.md | 65 ---- .../YOLOv8-OpenCV-int8-tflite-Python/main.py | 308 ------------------ examples/YOLOv8-TFLite-Python/README.md | 55 ++++ examples/YOLOv8-TFLite-Python/main.py | 221 +++++++++++++ 5 files changed, 277 insertions(+), 374 deletions(-) delete mode 100644 examples/YOLOv8-OpenCV-int8-tflite-Python/README.md delete mode 100644 examples/YOLOv8-OpenCV-int8-tflite-Python/main.py create mode 100644 examples/YOLOv8-TFLite-Python/README.md create mode 100644 examples/YOLOv8-TFLite-Python/main.py diff --git a/examples/README.md b/examples/README.md index ab875b3ba8..260ec2f51c 100644 --- a/examples/README.md +++ b/examples/README.md @@ -18,7 +18,7 @@ This directory features a collection of real-world applications and walkthroughs | [YOLOv8 Region Counter](https://github.com/RizwanMunawar/ultralytics/blob/main/examples/YOLOv8-Region-Counter/yolov8_region_counter.py) | Python | [Muhammad Rizwan Munawar](https://github.com/RizwanMunawar) | | [YOLOv8 Segmentation ONNXRuntime Python](./YOLOv8-Segmentation-ONNXRuntime-Python) | Python/ONNXRuntime | [jamjamjon](https://github.com/jamjamjon) | | [YOLOv8 LibTorch CPP](./YOLOv8-LibTorch-CPP-Inference) | C++/LibTorch | [Myyura](https://github.com/Myyura) | -| [YOLOv8 OpenCV INT8 TFLite 
Python](./YOLOv8-OpenCV-int8-tflite-Python) | Python | [Wamiq Raza](https://github.com/wamiqraza) | +| [YOLOv8 OpenCV INT8 TFLite Python](./YOLOv8-TFLite-Python) | Python | [Wamiq Raza](https://github.com/wamiqraza) | | [YOLOv8 All Tasks ONNXRuntime Rust](./YOLOv8-ONNXRuntime-Rust) | Rust/ONNXRuntime | [jamjamjon](https://github.com/jamjamjon) | | [YOLOv8 OpenVINO CPP](./YOLOv8-OpenVINO-CPP-Inference) | C++/OpenVINO | [Erlangga Yudi Pradana](https://github.com/rlggyp) | diff --git a/examples/YOLOv8-OpenCV-int8-tflite-Python/README.md b/examples/YOLOv8-OpenCV-int8-tflite-Python/README.md deleted file mode 100644 index ea14e4440e..0000000000 --- a/examples/YOLOv8-OpenCV-int8-tflite-Python/README.md +++ /dev/null @@ -1,65 +0,0 @@ -# YOLOv8 - Int8-TFLite Runtime - -Welcome to the YOLOv8 Int8 TFLite Runtime for efficient and optimized object detection project. This README provides comprehensive instructions for installing and using our YOLOv8 implementation. - -## Installation - -Ensure a smooth setup by following these steps to install necessary dependencies. - -### Installing Required Dependencies - -Install all required dependencies with this simple command: - -```bash -pip install -r requirements.txt -``` - -### Installing `tflite-runtime` - -To load TFLite models, install the `tflite-runtime` package using: - -```bash -pip install tflite-runtime -``` - -### Installing `tensorflow-gpu` (For NVIDIA GPU Users) - -Leverage GPU acceleration with NVIDIA GPUs by installing `tensorflow-gpu`: - -```bash -pip install tensorflow-gpu -``` - -**Note:** Ensure you have compatible GPU drivers installed on your system. - -### Installing `tensorflow` (CPU Version) - -For CPU usage or non-NVIDIA GPUs, install TensorFlow with: - -```bash -pip install tensorflow -``` - -## Usage - -Follow these instructions to run YOLOv8 after successful installation. 
- -Convert the YOLOv8 model to Int8 TFLite format: - -```bash -yolo export model=yolov8n.pt imgsz=640 format=tflite int8 -``` - -Locate the Int8 TFLite model in `yolov8n_saved_model`. Choose `best_full_integer_quant` or verify quantization at [Netron](https://netron.app/). Then, execute the following in your terminal: - -```bash -python main.py --model yolov8n_full_integer_quant.tflite --img image.jpg --conf-thres 0.5 --iou-thres 0.5 -``` - -Replace `best_full_integer_quant.tflite` with your model file's path, `image.jpg` with your input image, and adjust the confidence (conf-thres) and IoU thresholds (iou-thres) as necessary. - -### Output - -The output is displayed as annotated images, showcasing the model's detection capabilities: - -![image](https://github.com/wamiqraza/Attribute-recognition-and-reidentification-Market1501-dataset/blob/main/img/bus.jpg) diff --git a/examples/YOLOv8-OpenCV-int8-tflite-Python/main.py b/examples/YOLOv8-OpenCV-int8-tflite-Python/main.py deleted file mode 100644 index 46d7fb4272..0000000000 --- a/examples/YOLOv8-OpenCV-int8-tflite-Python/main.py +++ /dev/null @@ -1,308 +0,0 @@ -# Ultralytics YOLO 🚀, AGPL-3.0 license - -import argparse - -import cv2 -import numpy as np -from tflite_runtime import interpreter as tflite - -from ultralytics.utils import ASSETS, yaml_load -from ultralytics.utils.checks import check_yaml - -# Declare as global variables, can be updated based trained model image size -img_width = 640 -img_height = 640 - - -class LetterBox: - """Resizes and reshapes images while maintaining aspect ratio by adding padding, suitable for YOLO models.""" - - def __init__( - self, new_shape=(img_width, img_height), auto=False, scaleFill=False, scaleup=True, center=True, stride=32 - ): - """Initializes LetterBox with parameters for reshaping and transforming image while maintaining aspect ratio.""" - self.new_shape = new_shape - self.auto = auto - self.scaleFill = scaleFill - self.scaleup = scaleup - self.stride = stride - 
self.center = center # Put the image in the middle or top-left - - def __call__(self, labels=None, image=None): - """Return updated labels and image with added border.""" - if labels is None: - labels = {} - img = labels.get("img") if image is None else image - shape = img.shape[:2] # current shape [height, width] - new_shape = labels.pop("rect_shape", self.new_shape) - if isinstance(new_shape, int): - new_shape = (new_shape, new_shape) - - # Scale ratio (new / old) - r = min(new_shape[0] / shape[0], new_shape[1] / shape[1]) - if not self.scaleup: # only scale down, do not scale up (for better val mAP) - r = min(r, 1.0) - - # Compute padding - ratio = r, r # width, height ratios - new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) - dw, dh = new_shape[1] - new_unpad[0], new_shape[0] - new_unpad[1] # wh padding - if self.auto: # minimum rectangle - dw, dh = np.mod(dw, self.stride), np.mod(dh, self.stride) # wh padding - elif self.scaleFill: # stretch - dw, dh = 0.0, 0.0 - new_unpad = (new_shape[1], new_shape[0]) - ratio = new_shape[1] / shape[1], new_shape[0] / shape[0] # width, height ratios - - if self.center: - dw /= 2 # divide padding into 2 sides - dh /= 2 - - if shape[::-1] != new_unpad: # resize - img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR) - top, bottom = int(round(dh - 0.1)) if self.center else 0, int(round(dh + 0.1)) - left, right = int(round(dw - 0.1)) if self.center else 0, int(round(dw + 0.1)) - img = cv2.copyMakeBorder( - img, top, bottom, left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114) - ) # add border - if labels.get("ratio_pad"): - labels["ratio_pad"] = (labels["ratio_pad"], (left, top)) # for evaluation - - if len(labels): - labels = self._update_labels(labels, ratio, dw, dh) - labels["img"] = img - labels["resized_shape"] = new_shape - return labels - else: - return img - - def _update_labels(self, labels, ratio, padw, padh): - """Update labels.""" - labels["instances"].convert_bbox(format="xyxy") - 
labels["instances"].denormalize(*labels["img"].shape[:2][::-1]) - labels["instances"].scale(*ratio) - labels["instances"].add_padding(padw, padh) - return labels - - -class Yolov8TFLite: - """Class for performing object detection using YOLOv8 model converted to TensorFlow Lite format.""" - - def __init__(self, tflite_model, input_image, confidence_thres, iou_thres): - """ - Initializes an instance of the Yolov8TFLite class. - - Args: - tflite_model: Path to the TFLite model. - input_image: Path to the input image. - confidence_thres: Confidence threshold for filtering detections. - iou_thres: IoU (Intersection over Union) threshold for non-maximum suppression. - """ - self.tflite_model = tflite_model - self.input_image = input_image - self.confidence_thres = confidence_thres - self.iou_thres = iou_thres - - # Load the class names from the COCO dataset - self.classes = yaml_load(check_yaml("coco8.yaml"))["names"] - - # Generate a color palette for the classes - self.color_palette = np.random.uniform(0, 255, size=(len(self.classes), 3)) - - def draw_detections(self, img, box, score, class_id): - """ - Draws bounding boxes and labels on the input image based on the detected objects. - - Args: - img: The input image to draw detections on. - box: Detected bounding box. - score: Corresponding detection score. - class_id: Class ID for the detected object. 
- - Returns: - None - """ - # Extract the coordinates of the bounding box - x1, y1, w, h = box - - # Retrieve the color for the class ID - color = self.color_palette[class_id] - - # Draw the bounding box on the image - cv2.rectangle(img, (int(x1), int(y1)), (int(x1 + w), int(y1 + h)), color, 2) - - # Create the label text with class name and score - label = f"{self.classes[class_id]}: {score:.2f}" - - # Calculate the dimensions of the label text - (label_width, label_height), _ = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1) - - # Calculate the position of the label text - label_x = x1 - label_y = y1 - 10 if y1 - 10 > label_height else y1 + 10 - - # Draw a filled rectangle as the background for the label text - cv2.rectangle( - img, - (int(label_x), int(label_y - label_height)), - (int(label_x + label_width), int(label_y + label_height)), - color, - cv2.FILLED, - ) - - # Draw the label text on the image - cv2.putText(img, label, (int(label_x), int(label_y)), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 1, cv2.LINE_AA) - - def preprocess(self): - """ - Preprocesses the input image before performing inference. - - Returns: - image_data: Preprocessed image data ready for inference. - """ - # Read the input image using OpenCV - self.img = cv2.imread(self.input_image) - - print("image before", self.img) - # Get the height and width of the input image - self.img_height, self.img_width = self.img.shape[:2] - - letterbox = LetterBox(new_shape=[img_width, img_height], auto=False, stride=32) - image = letterbox(image=self.img) - image = [image] - image = np.stack(image) - image = image[..., ::-1].transpose((0, 3, 1, 2)) - img = np.ascontiguousarray(image) - # n, h, w, c - image = img.astype(np.float32) - return image / 255 - - def postprocess(self, input_image, output): - """ - Performs post-processing on the model's output to extract bounding boxes, scores, and class IDs. - - Args: - input_image (numpy.ndarray): The input image. 
- output (numpy.ndarray): The output of the model. - - Returns: - numpy.ndarray: The input image with detections drawn on it. - """ - # Transpose predictions outside the loop - output = [np.transpose(pred) for pred in output] - - boxes = [] - scores = [] - class_ids = [] - - # Vectorize extraction of bounding boxes, scores, and class IDs - for pred in output: - x, y, w, h = pred[:, 0], pred[:, 1], pred[:, 2], pred[:, 3] - x1 = x - w / 2 - y1 = y - h / 2 - boxes.extend(np.column_stack([x1, y1, w, h])) - - # Argmax and score extraction for all predictions at once - idx = np.argmax(pred[:, 4:], axis=1) - scores.extend(pred[np.arange(pred.shape[0]), idx + 4]) - class_ids.extend(idx) - - # Precompute gain and pad once - img_height, img_width = input_image.shape[:2] - gain = min(img_width / self.img_width, img_height / self.img_height) - pad = ( - round((img_width - self.img_width * gain) / 2 - 0.1), - round((img_height - self.img_height * gain) / 2 - 0.1), - ) - - # Non-Maximum Suppression (NMS) in one go - indices = cv2.dnn.NMSBoxes(boxes, scores, self.confidence_thres, self.iou_thres) - - # Process selected indices - for i in indices.flatten(): - box = boxes[i] - box[0] = (box[0] - pad[0]) / gain - box[1] = (box[1] - pad[1]) / gain - box[2] = box[2] / gain - box[3] = box[3] / gain - - score = scores[i] - class_id = class_ids[i] - - if score > 0.25: - # Draw the detection on the input image - self.draw_detections(input_image, box, score, class_id) - - return input_image - - def main(self): - """ - Performs inference using a TFLite model and returns the output image with drawn detections. - - Returns: - output_img: The output image with drawn detections. 
- """ - # Create an interpreter for the TFLite model - interpreter = tflite.Interpreter(model_path=self.tflite_model) - self.model = interpreter - interpreter.allocate_tensors() - - # Get the model inputs - input_details = interpreter.get_input_details() - output_details = interpreter.get_output_details() - - # Store the shape of the input for later use - input_shape = input_details[0]["shape"] - self.input_width = input_shape[1] - self.input_height = input_shape[2] - - # Preprocess the image data - img_data = self.preprocess() - img_data = img_data - # img_data = img_data.cpu().numpy() - # Set the input tensor to the interpreter - print(input_details[0]["index"]) - print(img_data.shape) - img_data = img_data.transpose((0, 2, 3, 1)) - - scale, zero_point = input_details[0]["quantization"] - img_data_int8 = (img_data / scale + zero_point).astype(np.int8) - interpreter.set_tensor(input_details[0]["index"], img_data_int8) - - # Run inference - interpreter.invoke() - - # Get the output tensor from the interpreter - output = interpreter.get_tensor(output_details[0]["index"]) - scale, zero_point = output_details[0]["quantization"] - output = (output.astype(np.float32) - zero_point) * scale - - output[:, [0, 2]] *= img_width - output[:, [1, 3]] *= img_height - print(output) - # Perform post-processing on the outputs to obtain output image. - return self.postprocess(self.img, output) - - -if __name__ == "__main__": - # Create an argument parser to handle command-line arguments - parser = argparse.ArgumentParser() - parser.add_argument( - "--model", type=str, default="yolov8n_full_integer_quant.tflite", help="Input your TFLite model." 
- ) - parser.add_argument("--img", type=str, default=str(ASSETS / "bus.jpg"), help="Path to input image.") - parser.add_argument("--conf-thres", type=float, default=0.5, help="Confidence threshold") - parser.add_argument("--iou-thres", type=float, default=0.5, help="NMS IoU threshold") - args = parser.parse_args() - - # Create an instance of the Yolov8TFLite class with the specified arguments - detection = Yolov8TFLite(args.model, args.img, args.conf_thres, args.iou_thres) - - # Perform object detection and obtain the output image - output_image = detection.main() - - # Display the output image in a window - cv2.imshow("Output", output_image) - - # Wait for a key press to exit - cv2.waitKey(0) diff --git a/examples/YOLOv8-TFLite-Python/README.md b/examples/YOLOv8-TFLite-Python/README.md new file mode 100644 index 0000000000..0156759fdb --- /dev/null +++ b/examples/YOLOv8-TFLite-Python/README.md @@ -0,0 +1,55 @@ +# YOLOv8 - TFLite Runtime + +This example shows how to run inference with YOLOv8 TFLite model. It supports FP32, FP16 and INT8 models. + +## Installation + +### Installing `tflite-runtime` + +To load TFLite models, install the `tflite-runtime` package using: + +```bash +pip install tflite-runtime +``` + +### Installing `tensorflow-gpu` (For NVIDIA GPU Users) + +Leverage GPU acceleration with NVIDIA GPUs by installing `tensorflow-gpu`: + +```bash +pip install tensorflow-gpu +``` + +**Note:** Ensure you have compatible GPU drivers installed on your system. + +### Installing `tensorflow` (CPU Version) + +For CPU usage or non-NVIDIA GPUs, install TensorFlow with: + +```bash +pip install tensorflow +``` + +## Usage + +Follow these instructions to run YOLOv8 after successful installation. + +Convert the YOLOv8 model to TFLite format: + +```bash +yolo export model=yolov8n.pt imgsz=640 format=tflite int8 +``` + +Locate the TFLite model in `yolov8n_saved_model`. 
Then, execute the following in your terminal: + +```bash +python main.py --model yolov8n_full_integer_quant.tflite --img image.jpg --conf 0.25 --iou 0.45 --metadata "metadata.yaml" +``` + +Replace `yolov8n_full_integer_quant.tflite` with the TFLite model path, `image.jpg` with the input image path, `metadata.yaml` with the one generated by `ultralytics` during export, and adjust the confidence (conf) and IoU thresholds (iou) as necessary. + +### Output + +The output would show the detections along with the class labels and confidences of each detected object. + +![image](https://github.com/wamiqraza/Attribute-recognition-and-reidentification-Market1501-dataset/blob/main/img/bus.jpg) diff --git a/examples/YOLOv8-TFLite-Python/main.py b/examples/YOLOv8-TFLite-Python/main.py new file mode 100644 index 0000000000..1fadd86b20 --- /dev/null +++ b/examples/YOLOv8-TFLite-Python/main.py @@ -0,0 +1,221 @@ +# Ultralytics YOLO 🚀, AGPL-3.0 license + +import argparse +from typing import Tuple, Union + +import cv2 +import numpy as np +import tensorflow as tf +import yaml + +from ultralytics.utils import ASSETS + +try: + from tflite_runtime.interpreter import Interpreter +except ImportError: + import tensorflow as tf + + Interpreter = tf.lite.Interpreter + + +class YOLOv8TFLite: + """ + YOLOv8TFLite. + + A class for performing object detection using the YOLOv8 model with TensorFlow Lite. + + Attributes: + model (str): Path to the TensorFlow Lite model file. + conf (float): Confidence threshold for filtering detections. + iou (float): Intersection over Union threshold for non-maximum suppression. + metadata (Optional[str]): Path to the metadata file, if any. + + Methods: + detect(img_path: str) -> np.ndarray: + Performs inference and returns the output image with drawn detections. + """ + + def __init__(self, model: str, conf: float = 0.25, iou: float = 0.45, metadata: Union[str, None] = None): + """ + Initializes an instance of the YOLOv8TFLite class. 
+ + Args: + model (str): Path to the TFLite model. + conf (float, optional): Confidence threshold for filtering detections. Defaults to 0.25. + iou (float, optional): IoU (Intersection over Union) threshold for non-maximum suppression. Defaults to 0.45. + metadata (Union[str, None], optional): Path to the metadata file or None if not used. Defaults to None. + """ + self.conf = conf + self.iou = iou + if metadata is None: + self.classes = {i: i for i in range(1000)} + else: + with open(metadata) as f: + self.classes = yaml.safe_load(f)["names"] + np.random.seed(42) + self.color_palette = np.random.uniform(128, 255, size=(len(self.classes), 3)) + + self.model = Interpreter(model_path=model) + self.model.allocate_tensors() + + input_details = self.model.get_input_details()[0] + + self.in_width, self.in_height = input_details["shape"][1:3] + self.in_index = input_details["index"] + self.in_scale, self.in_zero_point = input_details["quantization"] + self.int8 = input_details["dtype"] == np.int8 + + output_details = self.model.get_output_details()[0] + self.out_index = output_details["index"] + self.out_scale, self.out_zero_point = output_details["quantization"] + + def letterbox(self, img: np.ndarray, new_shape: Tuple = (640, 640)) -> Tuple[np.ndarray, Tuple[float, float]]: + """Resizes and reshapes images while maintaining aspect ratio by adding padding, suitable for YOLO models.""" + shape = img.shape[:2] # current shape [height, width] + + # Scale ratio (new / old) + r = min(new_shape[0] / shape[0], new_shape[1] / shape[1]) + + # Compute padding + new_unpad = int(round(shape[1] * r)), int(round(shape[0] * r)) + dw, dh = (new_shape[1] - new_unpad[0]) / 2, (new_shape[0] - new_unpad[1]) / 2 # wh padding + + if shape[::-1] != new_unpad: # resize + img = cv2.resize(img, new_unpad, interpolation=cv2.INTER_LINEAR) + top, bottom = int(round(dh - 0.1)), int(round(dh + 0.1)) + left, right = int(round(dw - 0.1)), int(round(dw + 0.1)) + img = cv2.copyMakeBorder(img, top, bottom, 
left, right, cv2.BORDER_CONSTANT, value=(114, 114, 114)) + + return img, (top / img.shape[0], left / img.shape[1]) + + def draw_detections(self, img: np.ndarray, box: np.ndarray, score: np.float32, class_id: int) -> None: + """ + Draws bounding boxes and labels on the input image based on the detected objects. + + Args: + img (np.ndarray): The input image to draw detections on. + box (np.ndarray): Detected bounding box in the format [x1, y1, width, height]. + score (np.float32): Corresponding detection score. + class_id (int): Class ID for the detected object. + + Returns: + None + """ + x1, y1, w, h = box + color = self.color_palette[class_id] + + cv2.rectangle(img, (int(x1), int(y1)), (int(x1 + w), int(y1 + h)), color, 2) + + label = f"{self.classes[class_id]}: {score:.2f}" + + (label_width, label_height), _ = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.5, 1) + + label_x = x1 + label_y = y1 - 10 if y1 - 10 > label_height else y1 + 10 + + cv2.rectangle( + img, + (int(label_x), int(label_y - label_height)), + (int(label_x + label_width), int(label_y + label_height)), + color, + cv2.FILLED, + ) + + cv2.putText(img, label, (int(label_x), int(label_y)), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 1, cv2.LINE_AA) + + def preprocess(self, img: np.ndarray) -> Tuple[np.ndarray, Tuple[float, float]]: + """ + Preprocesses the input image before performing inference. + + Args: + img (np.ndarray): The input image to be preprocessed. + + Returns: + Tuple[np.ndarray, Tuple[float, float]]: A tuple containing: + - The preprocessed image (np.ndarray). + - A tuple of two float values representing the padding applied (top/bottom, left/right). 
+ """ + img, pad = self.letterbox(img, (self.in_width, self.in_height)) + img = img[..., ::-1][None] # N,H,W,C for TFLite + img = np.ascontiguousarray(img) + img = img.astype(np.float32) + return img / 255, pad + + def postprocess(self, img: np.ndarray, outputs: np.ndarray, pad: Tuple[float, float]) -> np.ndarray: + """ + Performs post-processing on the model's output to extract bounding boxes, scores, and class IDs. + + Args: + img (numpy.ndarray): The input image. + outputs (numpy.ndarray): The output of the model. + pad (Tuple[float, float]): Padding used by letterbox. + + Returns: + numpy.ndarray: The input image with detections drawn on it. + """ + outputs[:, 0] -= pad[1] + outputs[:, 1] -= pad[0] + outputs[:, :4] *= max(img.shape) + + outputs = outputs.transpose(0, 2, 1) + outputs[..., 0] -= outputs[..., 2] / 2 + outputs[..., 1] -= outputs[..., 3] / 2 + + for out in outputs: + scores = out[:, 4:].max(-1) + keep = scores > self.conf + boxes = out[keep, :4] + scores = scores[keep] + class_ids = out[keep, 4:].argmax(-1) + + indices = cv2.dnn.NMSBoxes(boxes, scores, self.conf, self.iou).flatten() + + [self.draw_detections(img, boxes[i], scores[i], class_ids[i]) for i in indices] + + return img + + def detect(self, img_path: str) -> np.ndarray: + """ + Performs inference using a TFLite model and returns the output image with drawn detections. + + Args: + img_path (str): The path to the input image file. + + Returns: + np.ndarray: The output image with drawn detections. 
+ """ + img = cv2.imread(img_path) + x, pad = self.preprocess(img) + if self.int8: + x = (x / self.in_scale + self.in_zero_point).astype(np.int8) + self.model.set_tensor(self.in_index, x) + + self.model.invoke() + + y = self.model.get_tensor(self.out_index) + + if self.int8: + y = (y.astype(np.float32) - self.out_zero_point) * self.out_scale + + return self.postprocess(img, y, pad) + + +if __name__ == "__main__": + parser = argparse.ArgumentParser() + parser.add_argument( + "--model", + type=str, + default="yolov8n_saved_model/yolov8n_full_integer_quant.tflite", + help="Path to TFLite model.", + ) + parser.add_argument("--img", type=str, default=str(ASSETS / "bus.jpg"), help="Path to input image") + parser.add_argument("--conf", type=float, default=0.25, help="Confidence threshold") + parser.add_argument("--iou", type=float, default=0.45, help="NMS IoU threshold") + parser.add_argument("--metadata", type=str, default="yolov8n_saved_model/metadata.yaml", help="Metadata yaml") + args = parser.parse_args() + + detector = YOLOv8TFLite(args.model, args.conf, args.iou, args.metadata) + result = detector.detect(args.img)[..., ::-1] # use the --img argument instead of a hard-coded asset + + cv2.imshow("Output", result) + cv2.waitKey(0) From f95dc37311fc9eaf78e26cec69305e711247244c Mon Sep 17 00:00:00 2001 From: Jamjamjon <51357717+jamjamjon@users.noreply.github.com> Date: Sat, 2 Nov 2024 20:06:07 +0800 Subject: [PATCH 33/46] [Example] YOLO-Series(v5-11) ONNXRuntime Rust (#17311) Co-authored-by: UltralyticsAssistant Co-authored-by: Glenn Jocher --- examples/README.md | 1 + .../YOLO-Series-ONNXRuntime-Rust/Cargo.toml | 12 + .../YOLO-Series-ONNXRuntime-Rust/README.md | 94 +++++++ .../YOLO-Series-ONNXRuntime-Rust/src/main.rs | 236 ++++++++++++++++++ examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml | 2 +- examples/YOLOv8-ONNXRuntime-Rust/README.md | 27 +- examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs | 13 +- examples/YOLOv8-ONNXRuntime-Rust/src/model.rs | 6 +- 8 files changed, 362 insertions(+), 
29 deletions(-) create mode 
100644 examples/YOLO-Series-ONNXRuntime-Rust/Cargo.toml create mode 100644 examples/YOLO-Series-ONNXRuntime-Rust/README.md create mode 100644 examples/YOLO-Series-ONNXRuntime-Rust/src/main.rs diff --git a/examples/README.md b/examples/README.md index 260ec2f51c..76f078bde2 100644 --- a/examples/README.md +++ b/examples/README.md @@ -21,6 +21,7 @@ This directory features a collection of real-world applications and walkthroughs | [YOLOv8 OpenCV INT8 TFLite Python](./YOLOv8-TFLite-Python) | Python | [Wamiq Raza](https://github.com/wamiqraza) | | [YOLOv8 All Tasks ONNXRuntime Rust](./YOLOv8-ONNXRuntime-Rust) | Rust/ONNXRuntime | [jamjamjon](https://github.com/jamjamjon) | | [YOLOv8 OpenVINO CPP](./YOLOv8-OpenVINO-CPP-Inference) | C++/OpenVINO | [Erlangga Yudi Pradana](https://github.com/rlggyp) | +| [YOLOv5-YOLO11 ONNXRuntime Rust](./YOLO-Series-ONNXRuntime-Rust) | Rust/ONNXRuntime | [jamjamjon](https://github.com/jamjamjon) | ### How to Contribute diff --git a/examples/YOLO-Series-ONNXRuntime-Rust/Cargo.toml b/examples/YOLO-Series-ONNXRuntime-Rust/Cargo.toml new file mode 100644 index 0000000000..a795eea293 --- /dev/null +++ b/examples/YOLO-Series-ONNXRuntime-Rust/Cargo.toml @@ -0,0 +1,12 @@ +[package] +name = "YOLO-ONNXRuntime-Rust" +version = "0.1.0" +edition = "2021" +authors = ["Jamjamjon "] + +[dependencies] +anyhow = "1.0.92" +clap = "4.5.20" +tracing = "0.1.40" +tracing-subscriber = "0.3.18" +usls = { version = "0.0.19", features = ["auto"] } diff --git a/examples/YOLO-Series-ONNXRuntime-Rust/README.md b/examples/YOLO-Series-ONNXRuntime-Rust/README.md new file mode 100644 index 0000000000..ca05fbb180 --- /dev/null +++ b/examples/YOLO-Series-ONNXRuntime-Rust/README.md @@ -0,0 +1,94 @@ +# YOLO-Series ONNXRuntime Rust Demo for Core YOLO Tasks + +This repository provides a Rust demo for key YOLO-Series tasks such as `Classification`, `Segmentation`, `Detection`, `Pose Detection`, and `OBB` using ONNXRuntime. 
It supports various YOLO models (v5 - 11) across multiple vision tasks. + +## Introduction + +- This example leverages the latest versions of both ONNXRuntime and YOLO models. +- We utilize the [usls](https://github.com/jamjamjon/usls/tree/main) crate to streamline YOLO model inference, providing efficient data loading, visualization, and optimized inference performance. + +## Features + +- **Extensive Model Compatibility**: Supports `YOLOv5`, `YOLOv6`, `YOLOv7`, `YOLOv8`, `YOLOv9`, `YOLOv10`, `YOLO11`, `YOLO-world`, `RTDETR`, and others, covering a wide range of YOLO versions. +- **Versatile Task Coverage**: Includes `Classification`, `Segmentation`, `Detection`, `Pose`, and `OBB`. +- **Precision Flexibility**: Works with `FP16` and `FP32` ONNX models. +- **Execution Providers**: Accelerated support for `CPU`, `CUDA`, `CoreML`, and `TensorRT`. +- **Dynamic Input Shapes**: Dynamically adjusts to variable `batch`, `width`, and `height` dimensions for flexible model input. +- **Flexible Data Loading**: The `DataLoader` handles images, folders, videos, and video streams. +- **Real-Time Display and Video Export**: `Viewer` provides real-time frame visualization and video export functions, similar to OpenCV’s `imshow()` and `imwrite()`. +- **Enhanced Annotation and Visualization**: The `Annotator` facilitates comprehensive result rendering, with support for bounding boxes (HBB), oriented bounding boxes (OBB), polygons, masks, keypoints, and text labels. + +## Setup Instructions + +### 1. ONNXRuntime Linking + +
+You have two options to link the ONNXRuntime library: + +- **Option 1: Manual Linking** + + - For detailed setup, consult the [ONNX Runtime linking documentation](https://ort.pyke.io/setup/linking). + - **Linux or macOS**: + 1. Download the ONNX Runtime package from the [Releases page](https://github.com/microsoft/onnxruntime/releases). + 2. Set up the library path by exporting the `ORT_DYLIB_PATH` environment variable: + ```shell + export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.19.0 + ``` + +- **Option 2: Automatic Download** + - Use the `--features auto` flag to handle downloading automatically: + ```shell + cargo run -r --example yolo --features auto + ``` + +
+ +### 2. \[Optional\] Install CUDA, CuDNN, and TensorRT + +- The CUDA execution provider requires CUDA version `12.x`. +- The TensorRT execution provider requires both CUDA `12.x` and TensorRT `10.x`. + +### 3. \[Optional\] Install ffmpeg + +To view video frames and save video inferences, install `rust-ffmpeg`. For instructions, see: +[https://github.com/zmwangx/rust-ffmpeg/wiki/Notes-on-building#dependencies](https://github.com/zmwangx/rust-ffmpeg/wiki/Notes-on-building#dependencies) + +## Get Started + +```Shell +# customized +cargo run -r -- --task detect --ver v8 --nc 6 --model xxx.onnx # YOLOv8 + +# Classify +cargo run -r -- --task classify --ver v5 --scale s --width 224 --height 224 --nc 1000 # YOLOv5 +cargo run -r -- --task classify --ver v8 --scale n --width 224 --height 224 --nc 1000 # YOLOv8 +cargo run -r -- --task classify --ver v11 --scale n --width 224 --height 224 --nc 1000 # YOLOv11 + +# Detect +cargo run -r -- --task detect --ver v5 --scale n # YOLOv5 +cargo run -r -- --task detect --ver v6 --scale n # YOLOv6 +cargo run -r -- --task detect --ver v7 --scale t # YOLOv7 +cargo run -r -- --task detect --ver v8 --scale n # YOLOv8 +cargo run -r -- --task detect --ver v9 --scale t # YOLOv9 +cargo run -r -- --task detect --ver v10 --scale n # YOLOv10 +cargo run -r -- --task detect --ver v11 --scale n # YOLOv11 +cargo run -r -- --task detect --ver rtdetr --scale l # RTDETR + +# Pose +cargo run -r -- --task pose --ver v8 --scale n # YOLOv8-Pose +cargo run -r -- --task pose --ver v11 --scale n # YOLOv11-Pose + +# Segment +cargo run -r -- --task segment --ver v5 --scale n # YOLOv5-Segment +cargo run -r -- --task segment --ver v8 --scale n # YOLOv8-Segment +cargo run -r -- --task segment --ver v11 --scale n # YOLOv11-Segment +cargo run -r -- --task segment --ver v8 --model yolo/FastSAM-s-dyn-f16.onnx # FastSAM + +# OBB +cargo run -r -- --ver v8 --task obb --scale n --width 1024 --height 1024 --source images/dota.png # YOLOv8-Obb +cargo run -r -- --ver v11 --task 
obb --scale n --width 1024 --height 1024 --source images/dota.png # YOLOv11-Obb +``` + +**`cargo run -- --help` for more options** + +For more details, please refer to [usls-yolo](https://github.com/jamjamjon/usls/tree/main/examples/yolo). diff --git a/examples/YOLO-Series-ONNXRuntime-Rust/src/main.rs b/examples/YOLO-Series-ONNXRuntime-Rust/src/main.rs new file mode 100644 index 0000000000..3c71a25310 --- /dev/null +++ b/examples/YOLO-Series-ONNXRuntime-Rust/src/main.rs @@ -0,0 +1,236 @@ +use anyhow::Result; +use clap::Parser; + +use usls::{ + models::YOLO, Annotator, DataLoader, Device, Options, Viewer, Vision, YOLOScale, YOLOTask, + YOLOVersion, COCO_SKELETONS_16, +}; + +#[derive(Parser, Clone)] +#[command(author, version, about, long_about = None)] +pub struct Args { + /// Path to the ONNX model + #[arg(long)] + pub model: Option<String>, + + /// Input source path + #[arg(long, default_value_t = String::from("../../ultralytics/assets/bus.jpg"))] + pub source: String, + + /// YOLO Task + #[arg(long, value_enum, default_value_t = YOLOTask::Detect)] + pub task: YOLOTask, + + /// YOLO Version + #[arg(long, value_enum, default_value_t = YOLOVersion::V8)] + pub ver: YOLOVersion, + + /// YOLO Scale + #[arg(long, value_enum, default_value_t = YOLOScale::N)] + pub scale: YOLOScale, + + /// Batch size + #[arg(long, default_value_t = 1)] + pub batch_size: usize, + + /// Minimum input width + #[arg(long, default_value_t = 224)] + pub width_min: isize, + + /// Input width + #[arg(long, default_value_t = 640)] + pub width: isize, + + /// Maximum input width + #[arg(long, default_value_t = 1024)] + pub width_max: isize, + + /// Minimum input height + #[arg(long, default_value_t = 224)] + pub height_min: isize, + + /// Input height + #[arg(long, default_value_t = 640)] + pub height: isize, + + /// Maximum input height + #[arg(long, default_value_t = 1024)] + pub height_max: isize, + + /// Number of classes + #[arg(long, default_value_t = 80)] + pub nc: usize, + + /// Class confidence + 
#[arg(long)] + pub confs: Vec<f32>, + + /// Enable TensorRT support + #[arg(long)] + pub trt: bool, + + /// Enable CUDA support + #[arg(long)] + pub cuda: bool, + + /// Enable CoreML support + #[arg(long)] + pub coreml: bool, + + /// Use TensorRT half precision + #[arg(long)] + pub half: bool, + + /// Device ID to use + #[arg(long, default_value_t = 0)] + pub device_id: usize, + + /// Enable performance profiling + #[arg(long)] + pub profile: bool, + + /// Disable contour drawing, for saving time + #[arg(long)] + pub no_contours: bool, + + /// Show result + #[arg(long)] + pub view: bool, + + /// Do not save output + #[arg(long)] + pub nosave: bool, +} + +fn main() -> Result<()> { + let args = Args::parse(); + + // logger + if args.profile { + tracing_subscriber::fmt() + .with_max_level(tracing::Level::INFO) + .init(); + } + + // model path + let path = match &args.model { + None => format!( + "yolo/{}-{}-{}.onnx", + args.ver.name(), + args.scale.name(), + args.task.name() + ), + Some(x) => x.to_string(), + }; + + // saveout + let saveout = match &args.model { + None => format!( + "{}-{}-{}", + args.ver.name(), + args.scale.name(), + args.task.name() + ), + Some(x) => { + let p = std::path::PathBuf::from(&x); + p.file_stem().unwrap().to_str().unwrap().to_string() + } + }; + + // device + let device = if args.cuda { + Device::Cuda(args.device_id) + } else if args.trt { + Device::Trt(args.device_id) + } else if args.coreml { + Device::CoreML(args.device_id) + } else { + Device::Cpu(args.device_id) + }; + + // build options + let options = Options::new() + .with_model(&path)? 
+ .with_yolo_version(args.ver) + .with_yolo_task(args.task) + .with_device(device) + .with_trt_fp16(args.half) + .with_ixx(0, 0, (1, args.batch_size as _, 4).into()) + .with_ixx(0, 2, (args.height_min, args.height, args.height_max).into()) + .with_ixx(0, 3, (args.width_min, args.width, args.width_max).into()) + .with_confs(if args.confs.is_empty() { + &[0.2, 0.15] + } else { + &args.confs + }) + .with_nc(args.nc) + .with_find_contours(!args.no_contours) // find contours or not + // .with_names(&COCO_CLASS_NAMES_80) // detection class names + // .with_names2(&COCO_KEYPOINTS_17) // keypoints class names + // .exclude_classes(&[0]) + // .retain_classes(&[0, 5]) + .with_profile(args.profile); + + // build model + let mut model = YOLO::new(options)?; + + // build dataloader + let dl = DataLoader::new(&args.source)? + .with_batch(model.batch() as _) + .build()?; + + // build annotator + let annotator = Annotator::default() + .with_skeletons(&COCO_SKELETONS_16) + .without_masks(true) // no masks plotting when doing segment task + .with_bboxes_thickness(3) + .with_keypoints_name(false) // enable keypoints names + .with_saveout_subs(&["YOLO"]) + .with_saveout(&saveout); + + // build viewer + let mut viewer = if args.view { + Some(Viewer::new().with_delay(5).with_scale(1.).resizable(true)) + } else { + None + }; + + // run & annotate + for (xs, _paths) in dl { + let ys = model.forward(&xs, args.profile)?; + let images_plotted = annotator.plot(&xs, &ys, !args.nosave)?; + + // show image + match &mut viewer { + Some(viewer) => viewer.imshow(&images_plotted)?, + None => continue, + } + + // check out window and key event + match &mut viewer { + Some(viewer) => { + if !viewer.is_open() || viewer.is_key_pressed(usls::Key::Escape) { + break; + } + } + None => continue, + } + + // write video + if !args.nosave { + match &mut viewer { + Some(viewer) => viewer.write_batch(&images_plotted)?, + None => continue, + } + } + } + + // finish video write + if !args.nosave { + if let 
Some(viewer) = &mut viewer { + viewer.finish_write()?; + } + } + + Ok(()) +} diff --git a/examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml b/examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml index fcf1fb7974..39dff0313d 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml +++ b/examples/YOLOv8-ONNXRuntime-Rust/Cargo.toml @@ -12,7 +12,7 @@ clap = { version = "4.2.4", features = ["derive"] } image = { version = "0.25.2"} imageproc = { version = "0.25.0"} ndarray = { version = "0.16" } -ort = { version = "2.0.0-rc.5", features = ["cuda", "tensorrt"]} +ort = { version = "2.0.0-rc.5", features = ["cuda", "tensorrt", "load-dynamic", "copy-dylibs", "half"]} rusttype = { version = "0.9.3" } anyhow = { version = "1.0.75" } regex = { version = "1.5.4" } diff --git a/examples/YOLOv8-ONNXRuntime-Rust/README.md b/examples/YOLOv8-ONNXRuntime-Rust/README.md index 9121c7dac7..53a7da883e 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/README.md +++ b/examples/YOLOv8-ONNXRuntime-Rust/README.md @@ -7,7 +7,7 @@ This repository provides a Rust demo for performing YOLOv8 tasks like `Classific - Add YOLOv8-OBB demo - Update ONNXRuntime to 1.19.x -Newly updated YOLOv8 example code is located in this repository (https://github.com/jamjamjon/usls/tree/main/examples/yolo) +Newly updated YOLOv8 example code is located in [this repository](https://github.com/jamjamjon/usls/tree/main/examples/yolo) ## Features @@ -22,25 +22,16 @@ Newly updated YOLOv8 example code is located in this repository (https://github. Please follow the Rust official installation. (https://www.rust-lang.org/tools/install) -### 2. Install ONNXRuntime +### 2. ONNXRuntime Linking -This repository use `ort` crate, which is ONNXRuntime wrapper for Rust. (https://docs.rs/ort/latest/ort/) +- #### For detailed setup instructions, refer to the [ORT documentation](https://ort.pyke.io/setup/linking). 
-You can follow the instruction with `ort` doc or simply do this: - -- step1: Download ONNXRuntime(https://github.com/microsoft/onnxruntime/releases) -- setp2: Set environment variable `PATH` for linking. - -On ubuntu, You can do like this: - -```bash -vim ~/.bashrc - -# Add the path of ONNXRUntime lib -export LD_LIBRARY_PATH=/home/qweasd/Documents/onnxruntime-linux-x64-gpu-1.16.3/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}} - -source ~/.bashrc -``` +- #### For Linux or macOS Users: + - Download the ONNX Runtime package from the [Releases page](https://github.com/microsoft/onnxruntime/releases). + - Set up the library path by exporting the `ORT_DYLIB_PATH` environment variable: + ```shell + export ORT_DYLIB_PATH=/path/to/onnxruntime/lib/libonnxruntime.so.1.19.0 + ``` ### 3. \[Optional\] Install CUDA & CuDNN & TensorRT diff --git a/examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs b/examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs index 849801ee47..0084535ee5 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs +++ b/examples/YOLOv8-ONNXRuntime-Rust/src/lib.rs @@ -118,16 +118,15 @@ pub fn check_font(font: &str) -> rusttype::Font<'static> { rusttype::Font::try_from_vec(buffer).unwrap() } - use ab_glyph::FontArc; -pub fn load_font() -> FontArc{ +pub fn load_font() -> FontArc { use std::path::Path; let font_path = Path::new("./font/Arial.ttf"); match font_path.try_exists() { Ok(true) => { let buffer = std::fs::read(font_path).unwrap(); FontArc::try_from_vec(buffer).unwrap() - }, + } Ok(false) => { std::fs::create_dir_all("./font").unwrap(); println!("Downloading font..."); @@ -136,7 +135,7 @@ pub fn load_font() -> FontArc{ .timeout(std::time::Duration::from_secs(500)) .call() .unwrap_or_else(|err| panic!("> Failed to download font: {source_url}: {err:?}")); - + // read to buffer let mut buffer = vec![]; let total_size = resp @@ -153,9 +152,9 @@ pub fn load_font() -> FontArc{ fd.write_all(&buffer).unwrap(); println!("Font saved at: {:?}", font_path.display()); 
FontArc::try_from_vec(buffer).unwrap() - }, + } Err(e) => { panic!("Failed to load font {}", e); - }, + } } -} \ No newline at end of file +} diff --git a/examples/YOLOv8-ONNXRuntime-Rust/src/model.rs b/examples/YOLOv8-ONNXRuntime-Rust/src/model.rs index e0c35f6c26..95b2bdfffa 100644 --- a/examples/YOLOv8-ONNXRuntime-Rust/src/model.rs +++ b/examples/YOLOv8-ONNXRuntime-Rust/src/model.rs @@ -8,7 +8,7 @@ use rand::{thread_rng, Rng}; use std::path::PathBuf; use crate::{ - load_font, gen_time_string, non_max_suppression, Args, Batch, Bbox, Embedding, OrtBackend, + gen_time_string, load_font, non_max_suppression, Args, Batch, Bbox, Embedding, OrtBackend, OrtConfig, OrtEP, Point2, YOLOResult, YOLOTask, SKELETON, }; @@ -40,7 +40,7 @@ impl YOLOv8 { OrtEP::CUDA(config.device_id) } else { OrtEP::CPU - }; + }; // batch let batch = Batch { @@ -463,7 +463,7 @@ impl YOLOv8 { image::Rgb(self.color_palette[bbox.id()].into()), bbox.xmin() as i32, (bbox.ymin() - legend_size as f32) as i32, - legend_size as f32, + legend_size as f32, &font, &legend, ); From 7453a1c3fc5d469a55a2e0fd8c8e42d2195a46d2 Mon Sep 17 00:00:00 2001 From: Glenn Jocher Date: Sat, 2 Nov 2024 14:41:23 +0100 Subject: [PATCH 34/46] Fix Docker badges (#17321) Signed-off-by: UltralyticsAssistant Co-authored-by: UltralyticsAssistant --- README.md | 6 +++--- README.zh-CN.md | 6 +++--- docs/en/index.md | 21 +++++++++++---------- 3 files changed, 17 insertions(+), 16 deletions(-) diff --git a/README.md b/README.md index 51f13230ed..01277aff54 100644 --- a/README.md +++ b/README.md @@ -8,8 +8,8 @@
Ultralytics CI + Ultralytics Downloads Ultralytics YOLO Citation - Ultralytics Docker Pulls Ultralytics Discord Ultralytics Forums Ultralytics Reddit @@ -55,7 +55,7 @@ See below for a quickstart install and usage examples, and see our [Docs](https: Pip install the ultralytics package including all [requirements](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) in a [**Python>=3.8**](https://www.python.org/) environment with [**PyTorch>=1.8**](https://pytorch.org/get-started/locally/). -[![PyPI - Version](https://img.shields.io/pypi/v/ultralytics?logo=pypi&logoColor=white)](https://pypi.org/project/ultralytics/) [![Downloads](https://static.pepy.tech/badge/ultralytics)](https://pepy.tech/project/ultralytics) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ultralytics?logo=python&logoColor=gold)](https://pypi.org/project/ultralytics/) +[![PyPI - Version](https://img.shields.io/pypi/v/ultralytics?logo=pypi&logoColor=white)](https://pypi.org/project/ultralytics/) [![Ultralytics Downloads](https://static.pepy.tech/badge/ultralytics)](https://pepy.tech/project/ultralytics) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ultralytics?logo=python&logoColor=gold)](https://pypi.org/project/ultralytics/) ```bash pip install ultralytics @@ -63,7 +63,7 @@ pip install ultralytics For alternative installation methods including [Conda](https://anaconda.org/conda-forge/ultralytics), [Docker](https://hub.docker.com/r/ultralytics/ultralytics), and Git, please refer to the [Quickstart Guide](https://docs.ultralytics.com/quickstart/). 
-[![Conda Version](https://img.shields.io/conda/vn/conda-forge/ultralytics?logo=condaforge)](https://anaconda.org/conda-forge/ultralytics) [![Docker Image Version](https://img.shields.io/docker/v/ultralytics/ultralytics?sort=semver&logo=docker)](https://hub.docker.com/r/ultralytics/ultralytics) +[![Conda Version](https://img.shields.io/conda/vn/conda-forge/ultralytics?logo=condaforge)](https://anaconda.org/conda-forge/ultralytics) [![Docker Image Version](https://img.shields.io/docker/v/ultralytics/ultralytics?sort=semver&logo=docker)](https://hub.docker.com/r/ultralytics/ultralytics) [![Ultralytics Docker Pulls](https://img.shields.io/docker/pulls/ultralytics/ultralytics?logo=docker)](https://hub.docker.com/r/ultralytics/ultralytics) diff --git a/README.zh-CN.md b/README.zh-CN.md index d7665f166d..caf5e6b470 100644 --- a/README.zh-CN.md +++ b/README.zh-CN.md @@ -8,8 +8,8 @@
Ultralytics CI + Ultralytics Downloads Ultralytics YOLO Citation - Ultralytics Docker Pulls Ultralytics Discord Ultralytics Forums Ultralytics Reddit @@ -55,7 +55,7 @@ 在 [**Python>=3.8**](https://www.python.org/) 环境中使用 [**PyTorch>=1.8**](https://pytorch.org/get-started/locally/) 通过 pip 安装包含所有[依赖项](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) 的 ultralytics 包。 -[![PyPI - Version](https://img.shields.io/pypi/v/ultralytics?logo=pypi&logoColor=white)](https://pypi.org/project/ultralytics/) [![Downloads](https://static.pepy.tech/badge/ultralytics)](https://pepy.tech/project/ultralytics) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ultralytics?logo=python&logoColor=gold)](https://pypi.org/project/ultralytics/) +[![PyPI - Version](https://img.shields.io/pypi/v/ultralytics?logo=pypi&logoColor=white)](https://pypi.org/project/ultralytics/) [![Ultralytics Downloads](https://static.pepy.tech/badge/ultralytics)](https://pepy.tech/project/ultralytics) [![PyPI - Python Version](https://img.shields.io/pypi/pyversions/ultralytics?logo=python&logoColor=gold)](https://pypi.org/project/ultralytics/) ```bash pip install ultralytics @@ -63,7 +63,7 @@ pip install ultralytics 有关其他安装方法,包括 [Conda](https://anaconda.org/conda-forge/ultralytics)、[Docker](https://hub.docker.com/r/ultralytics/ultralytics) 和 Git,请参阅 [快速开始指南](https://docs.ultralytics.com/quickstart/)。 -[![Conda Version](https://img.shields.io/conda/vn/conda-forge/ultralytics?logo=condaforge)](https://anaconda.org/conda-forge/ultralytics) [![Docker Image Version](https://img.shields.io/docker/v/ultralytics/ultralytics?sort=semver&logo=docker)](https://hub.docker.com/r/ultralytics/ultralytics) +[![Conda Version](https://img.shields.io/conda/vn/conda-forge/ultralytics?logo=condaforge)](https://anaconda.org/conda-forge/ultralytics) [![Docker Image Version](https://img.shields.io/docker/v/ultralytics/ultralytics?sort=semver&logo=docker)](https://hub.docker.com/r/ultralytics/ultralytics) 
[![Ultralytics Docker Pulls](https://img.shields.io/docker/pulls/ultralytics/ultralytics?logo=docker)](https://hub.docker.com/r/ultralytics/ultralytics) diff --git a/docs/en/index.md b/docs/en/index.md index f796e4b483..ef1245f891 100644 --- a/docs/en/index.md +++ b/docs/en/index.md @@ -19,16 +19,17 @@ keywords: Ultralytics, YOLO, YOLO11, object detection, image segmentation, deep العربية

-Ultralytics CI -YOLO Citation -Docker Pulls -Discord -Ultralytics Forums -Ultralytics Reddit -
-Run on Gradient -Open In Colab -Open In Kaggle + Ultralytics CI + Ultralytics Downloads + Ultralytics YOLO Citation + Ultralytics Discord + Ultralytics Forums + Ultralytics Reddit +
+ Run Ultralytics on Gradient + Open Ultralytics In Colab + Open Ultralytics In Kaggle + Open Ultralytics In Binder
Introducing [Ultralytics](https://www.ultralytics.com/) [YOLO11](https://github.com/ultralytics/ultralytics), the latest version of the acclaimed real-time object detection and image segmentation model. YOLO11 is built on cutting-edge advancements in [deep learning](https://www.ultralytics.com/glossary/deep-learning-dl) and [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv), offering unparalleled performance in terms of speed and [accuracy](https://www.ultralytics.com/glossary/accuracy). Its streamlined design makes it suitable for various applications and easily adaptable to different hardware platforms, from edge devices to cloud APIs. From 2a1fabcf83df6e44f451065ab46b5b0e6fd3b601 Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Sun, 3 Nov 2024 04:54:06 +0500 Subject: [PATCH 35/46] Add ultralytics models publication notice in citations section (#17318) Co-authored-by: Glenn Jocher --- docs/en/models/yolo11.md | 8 ++++---- docs/en/models/yolov5.md | 8 ++++---- docs/en/models/yolov8.md | 8 ++++---- 3 files changed, 12 insertions(+), 12 deletions(-) diff --git a/docs/en/models/yolo11.md b/docs/en/models/yolo11.md index fe9115f2ed..dee9344b46 100644 --- a/docs/en/models/yolo11.md +++ b/docs/en/models/yolo11.md @@ -8,10 +8,6 @@ keywords: YOLO11, state-of-the-art object detection, YOLO series, Ultralytics, c ## Overview -!!! tip "Ultralytics YOLO11 Publication" - - Ultralytics has not published a formal research paper for YOLO11 due to the rapidly evolving nature of the models. We focus on advancing the technology and making it easier to use, rather than producing static documentation. For the most up-to-date information on YOLO architecture, features, and usage, please refer to our [GitHub repository](https://github.com/ultralytics/ultralytics) and [documentation](https://docs.ultralytics.com). 
- YOLO11 is the latest iteration in the [Ultralytics](https://www.ultralytics.com/) YOLO series of real-time object detectors, redefining what's possible with cutting-edge [accuracy](https://www.ultralytics.com/glossary/accuracy), speed, and efficiency. Building upon the impressive advancements of previous YOLO versions, YOLO11 introduces significant improvements in architecture and training methods, making it a versatile choice for a wide range of [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) tasks. ![Ultralytics YOLO11 Comparison Plots](https://raw.githubusercontent.com/ultralytics/assets/refs/heads/main/yolo/performance-comparison.png) @@ -132,6 +128,10 @@ Note that the example below is for YOLO11 [Detect](../tasks/detect.md) models fo ## Citations and Acknowledgements +!!! tip "Ultralytics YOLO11 Publication" + + Ultralytics has not published a formal research paper for YOLO11 due to the rapidly evolving nature of the models. We focus on advancing the technology and making it easier to use, rather than producing static documentation. For the most up-to-date information on YOLO architecture, features, and usage, please refer to our [GitHub repository](https://github.com/ultralytics/ultralytics) and [documentation](https://docs.ultralytics.com). + If you use YOLO11 or any other software from this repository in your work, please cite it using the following format: !!! quote "" diff --git a/docs/en/models/yolov5.md b/docs/en/models/yolov5.md index 91c562a44e..4d261df5c4 100644 --- a/docs/en/models/yolov5.md +++ b/docs/en/models/yolov5.md @@ -6,10 +6,6 @@ keywords: YOLOv5, YOLOv5u, object detection, Ultralytics, anchor-free, pre-train # Ultralytics YOLOv5 -!!! tip "Ultralytics YOLOv5 Publication" - - Ultralytics has not published a formal research paper for YOLOv5 due to the rapidly evolving nature of the models. We focus on advancing the technology and making it easier to use, rather than producing static documentation. 
For the most up-to-date information on YOLO architecture, features, and usage, please refer to our [GitHub repository](https://github.com/ultralytics/ultralytics) and [documentation](https://docs.ultralytics.com). - ## Overview YOLOv5u represents an advancement in [object detection](https://www.ultralytics.com/glossary/object-detection) methodologies. Originating from the foundational architecture of the [YOLOv5](https://github.com/ultralytics/yolov5) model developed by Ultralytics, YOLOv5u integrates the anchor-free, objectness-free split head, a feature previously introduced in the [YOLOv8](yolov8.md) models. This adaptation refines the model's architecture, leading to an improved accuracy-speed tradeoff in object detection tasks. Given the empirical results and its derived features, YOLOv5u provides an efficient alternative for those seeking robust solutions in both research and practical applications. @@ -96,6 +92,10 @@ This example provides simple YOLOv5 training and inference examples. For full do ## Citations and Acknowledgements +!!! tip "Ultralytics YOLOv5 Publication" + + Ultralytics has not published a formal research paper for YOLOv5 due to the rapidly evolving nature of the models. We focus on advancing the technology and making it easier to use, rather than producing static documentation. For the most up-to-date information on YOLO architecture, features, and usage, please refer to our [GitHub repository](https://github.com/ultralytics/ultralytics) and [documentation](https://docs.ultralytics.com). + If you use YOLOv5 or YOLOv5u in your research, please cite the Ultralytics YOLOv5 repository as follows: !!! quote "" diff --git a/docs/en/models/yolov8.md b/docs/en/models/yolov8.md index c8e4397d15..bb4f287a98 100644 --- a/docs/en/models/yolov8.md +++ b/docs/en/models/yolov8.md @@ -6,10 +6,6 @@ keywords: YOLOv8, real-time object detection, YOLO series, Ultralytics, computer # Ultralytics YOLOv8 -!!! 
tip "Ultralytics YOLOv8 Publication" - - Ultralytics has not published a formal research paper for YOLOv8 due to the rapidly evolving nature of the models. We focus on advancing the technology and making it easier to use, rather than producing static documentation. For the most up-to-date information on YOLO architecture, features, and usage, please refer to our [GitHub repository](https://github.com/ultralytics/ultralytics) and [documentation](https://docs.ultralytics.com). - ## Overview YOLOv8 is the latest iteration in the YOLO series of real-time object detectors, offering cutting-edge performance in terms of accuracy and speed. Building upon the advancements of previous YOLO versions, YOLOv8 introduces new features and optimizations that make it an ideal choice for various [object detection](https://www.ultralytics.com/glossary/object-detection) tasks in a wide range of applications. @@ -169,6 +165,10 @@ Note the below example is for YOLOv8 [Detect](../tasks/detect.md) models for obj ## Citations and Acknowledgements +!!! tip "Ultralytics YOLOv8 Publication" + + Ultralytics has not published a formal research paper for YOLOv8 due to the rapidly evolving nature of the models. We focus on advancing the technology and making it easier to use, rather than producing static documentation. For the most up-to-date information on YOLO architecture, features, and usage, please refer to our [GitHub repository](https://github.com/ultralytics/ultralytics) and [documentation](https://docs.ultralytics.com). + If you use the YOLOv8 model or any other software from this repository in your work, please cite it using the following format: !!! 
quote "" From bf1d076e20f3e169ff25f60fb10c6e05f82db353 Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Sun, 3 Nov 2024 04:55:55 +0500 Subject: [PATCH 36/46] Optimize Auto-Annotation with all args (#17315) Co-authored-by: UltralyticsAssistant Co-authored-by: Glenn Jocher --- docs/en/models/sam-2.md | 2 ++ docs/en/models/sam.md | 2 ++ ultralytics/data/annotator.py | 17 +++++++++++++++-- 3 files changed, 19 insertions(+), 2 deletions(-) diff --git a/docs/en/models/sam-2.md b/docs/en/models/sam-2.md index de5881c42e..86059422da 100644 --- a/docs/en/models/sam-2.md +++ b/docs/en/models/sam-2.md @@ -262,6 +262,8 @@ To auto-annotate your dataset using SAM 2, follow this example: | `conf` | `float`, optional | Confidence threshold for detection model; default is 0.25. | `0.25` | | `iou` | `float`, optional | IoU threshold for filtering overlapping boxes in detection results; default is 0.45. | `0.45` | | `imgsz` | `int`, optional | Input image resize dimension; default is 640. | `640` | +| `max_det` | `int`, optional | Limits detections per image to control outputs in dense scenes. | `300` | +| `classes` | `list`, optional | Filters predictions to specified class IDs, returning only relevant detections. | `None` | | `output_dir` | `str`, `None`, optional | Directory to save the annotated results. Defaults to a 'labels' folder in the same directory as 'data'. | `None` | This function facilitates the rapid creation of high-quality segmentation datasets, ideal for researchers and developers aiming to accelerate their projects. diff --git a/docs/en/models/sam.md b/docs/en/models/sam.md index fe4c01bd8b..d6f49792ea 100644 --- a/docs/en/models/sam.md +++ b/docs/en/models/sam.md @@ -217,6 +217,8 @@ To auto-annotate your dataset with the Ultralytics framework, use the `auto_anno | `conf` | `float`, optional | Confidence threshold for detection model; default is 0.25. 
| `0.25` | | `iou` | `float`, optional | IoU threshold for filtering overlapping boxes in detection results; default is 0.45. | `0.45` | | `imgsz` | `int`, optional | Input image resize dimension; default is 640. | `640` | +| `max_det` | `int`, optional | Limits detections per image to control outputs in dense scenes. | `300` | +| `classes` | `list`, optional | Filters predictions to specified class IDs, returning only relevant detections. | `None` | | `output_dir` | `str`, None, optional | Directory to save the annotated results. Defaults to a 'labels' folder in the same directory as 'data'. | `None` | The `auto_annotate` function takes the path to your images, with optional arguments for specifying the pre-trained detection and SAM segmentation models, the device to run the models on, and the output directory for saving the annotated results. diff --git a/ultralytics/data/annotator.py b/ultralytics/data/annotator.py index 64ee9af6c0..fc3b8d0765 100644 --- a/ultralytics/data/annotator.py +++ b/ultralytics/data/annotator.py @@ -6,7 +6,16 @@ from ultralytics import SAM, YOLO def auto_annotate( - data, det_model="yolo11x.pt", sam_model="sam_b.pt", device="", conf=0.25, iou=0.45, imgsz=640, output_dir=None + data, + det_model="yolo11x.pt", + sam_model="sam_b.pt", + device="", + conf=0.25, + iou=0.45, + imgsz=640, + max_det=300, + classes=None, + output_dir=None, ): """ Automatically annotates images using a YOLO object detection model and a SAM segmentation model. @@ -22,6 +31,8 @@ def auto_annotate( conf (float): Confidence threshold for detection model; default is 0.25. iou (float): IoU threshold for filtering overlapping boxes in detection results; default is 0.45. imgsz (int): Input image resize dimension; default is 640. + max_det (int): Limits detections per image to control outputs in dense scenes. + classes (list): Filters predictions to specified class IDs, returning only relevant detections. output_dir (str | None): Directory to save the annotated results. 
If None, a default directory is created. Examples: @@ -41,7 +52,9 @@ def auto_annotate( output_dir = data.parent / f"{data.stem}_auto_annotate_labels" Path(output_dir).mkdir(exist_ok=True, parents=True) - det_results = det_model(data, stream=True, device=device, conf=conf, iou=iou, imgsz=imgsz) + det_results = det_model( + data, stream=True, device=device, conf=conf, iou=iou, imgsz=imgsz, max_det=max_det, classes=classes + ) for result in det_results: class_ids = result.boxes.cls.int().tolist() # noqa From 2875c30072e840a3573504fd32ce8d3a9bb7698d Mon Sep 17 00:00:00 2001 From: Francesco Mattioli Date: Sun, 3 Nov 2024 01:38:00 +0100 Subject: [PATCH 37/46] New JupyterLab Dockerfile (#17071) Co-authored-by: Ultralytics Assistant <135830346+UltralyticsAssistant@users.noreply.github.com> Co-authored-by: Glenn Jocher --- .github/workflows/docker.yaml | 9 +++++++++ docker/Dockerfile-jupyter | 34 ++++++++++++++++++++++++++++++++++ 2 files changed, 43 insertions(+) create mode 100644 docker/Dockerfile-jupyter diff --git a/.github/workflows/docker.yaml b/.github/workflows/docker.yaml index ef7dd86e96..38f30bb1b6 100644 --- a/.github/workflows/docker.yaml +++ b/.github/workflows/docker.yaml @@ -170,6 +170,15 @@ jobs: docker build -f docker/Dockerfile-runner -t $t . docker push $t fi + if [[ "${{ matrix.tags }}" == "latest-python" ]]; then + t=ultralytics/ultralytics:latest-jupyter + v=ultralytics/ultralytics:${{ steps.get_version.outputs.version_tag }}-jupyter + docker build -f docker/Dockerfile-jupyter -t $t -t $v . 
+ docker push $t + if [[ "${{ steps.check_tag.outputs.new_release }}" == "true" ]]; then + docker push $v + fi + fi trigger-actions: runs-on: ubuntu-latest diff --git a/docker/Dockerfile-jupyter b/docker/Dockerfile-jupyter new file mode 100644 index 0000000000..e42639b9b7 --- /dev/null +++ b/docker/Dockerfile-jupyter @@ -0,0 +1,34 @@ +# Ultralytics YOLO 🚀, AGPL-3.0 license +# Builds ultralytics/ultralytics:latest-jupyter image on DockerHub https://hub.docker.com/r/ultralytics/ultralytics +# Image provides JupyterLab interface for interactive YOLO development and includes tutorial notebooks + +# Start from Python-based Ultralytics image for full Python environment +FROM ultralytics/ultralytics:latest-python + +# Install JupyterLab for interactive development +RUN /usr/local/bin/pip install jupyterlab + +# Create persistent data directory structure +RUN mkdir /data + +# Configure YOLO directory paths +RUN mkdir /data/datasets && /usr/local/bin/yolo settings datasets_dir="/data/datasets" +RUN mkdir /data/weights && /usr/local/bin/yolo settings weights_dir="/data/weights" +RUN mkdir /data/runs && /usr/local/bin/yolo settings runs_dir="/data/runs" + +# Start JupyterLab with tutorial notebook +ENTRYPOINT ["/usr/local/bin/jupyter", "lab", "--allow-root", "/ultralytics/examples/tutorial.ipynb"] + +# Usage Examples ------------------------------------------------------------------------------------------------------- + +# Build and Push +# t=ultralytics/ultralytics:latest-jupyter && sudo docker build -f docker/Dockerfile-jupyter -t $t . 
&& sudo docker push $t + +# Run +# t=ultralytics/ultralytics:latest-jupyter && sudo docker run -it --ipc=host -p 8888:8888 $t + +# Pull and Run +# t=ultralytics/ultralytics:latest-jupyter && sudo docker pull $t && sudo docker run -it --ipc=host -p 8888:8888 $t + +# Pull and Run with local volume mounted +# t=ultralytics/ultralytics:latest-jupyter && sudo docker pull $t && sudo docker run -it --ipc=host -p 8888:8888 -v "$(pwd)"/datasets:/data/datasets $t From 5f9911a44a086d99785a4d6a9e566b5a6a6e2f52 Mon Sep 17 00:00:00 2001 From: Mohammed Yasin <32206511+Y-T-G@users.noreply.github.com> Date: Tue, 5 Nov 2024 08:20:48 +0800 Subject: [PATCH 38/46] Update `overlap_mask` description. (#17324) Co-authored-by: UltralyticsAssistant --- docs/en/macros/train-args.md | 2 +- ultralytics/cfg/default.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/en/macros/train-args.md b/docs/en/macros/train-args.md index cb72bdeced..ede32f910b 100644 --- a/docs/en/macros/train-args.md +++ b/docs/en/macros/train-args.md @@ -43,7 +43,7 @@ | `kobj` | `2.0` | Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy. | | `label_smoothing` | `0.0` | Applies label smoothing, softening hard labels to a mix of the target label and a uniform distribution over labels, can improve generalization. | | `nbs` | `64` | Nominal batch size for normalization of loss. | -| `overlap_mask` | `True` | Determines whether segmentation masks should overlap during training, applicable in [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation) tasks. | +| `overlap_mask` | `True` | Determines whether object masks should be merged into a single mask for training, or kept separate for each object. In case of overlap, the smaller mask is overlaid on top of the larger mask during merge. | | `mask_ratio` | `4` | Downsample ratio for segmentation masks, affecting the resolution of masks used during training. 
| | `dropout` | `0.0` | Dropout rate for regularization in classification tasks, preventing overfitting by randomly omitting units during training. | | `val` | `True` | Enables validation during training, allowing for periodic evaluation of model performance on a separate dataset. | diff --git a/ultralytics/cfg/default.yaml b/ultralytics/cfg/default.yaml index 7922f63592..2ef1f4284f 100644 --- a/ultralytics/cfg/default.yaml +++ b/ultralytics/cfg/default.yaml @@ -36,7 +36,7 @@ profile: False # (bool) profile ONNX and TensorRT speeds during training for log freeze: None # (int | list, optional) freeze first n layers, or freeze list of layer indices during training multi_scale: False # (bool) Whether to use multiscale during training # Segmentation -overlap_mask: True # (bool) masks should overlap during training (segment train only) +overlap_mask: True # (bool) merge object masks into a single image mask during training (segment train only) mask_ratio: 4 # (int) mask downsample ratio (segment train only) # Classification dropout: 0.0 # (float) use dropout regularization (classify train only) From f5ce64c12887cc752bd8ef5bd3271b07ecb22c27 Mon Sep 17 00:00:00 2001 From: Jairaj Jangle <25704330+JairajJangle@users.noreply.github.com> Date: Tue, 5 Nov 2024 05:52:21 +0530 Subject: [PATCH 39/46] Generalized M1/M2 references to "Apple silicon" in train.md for broader inclusion (#17330) Co-authored-by: Glenn Jocher --- docs/en/modes/train.md | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/docs/en/modes/train.md b/docs/en/modes/train.md index 9cbe791991..276bd4f695 100644 --- a/docs/en/modes/train.md +++ b/docs/en/modes/train.md @@ -1,7 +1,7 @@ --- comments: true description: Learn how to efficiently train object detection models using YOLO11 with comprehensive instructions on settings, augmentation, and hardware utilization. 
-keywords: Ultralytics, YOLO11, model training, deep learning, object detection, GPU training, dataset augmentation, hyperparameter tuning, model performance, M1 M2 training +keywords: Ultralytics, YOLO11, model training, deep learning, object detection, GPU training, dataset augmentation, hyperparameter tuning, model performance, apple silicon training --- # Model Training with Ultralytics YOLO @@ -107,11 +107,11 @@ Multi-GPU training allows for more efficient utilization of available hardware r yolo detect train data=coco8.yaml model=yolo11n.pt epochs=100 imgsz=640 device=0,1 ``` -### Apple M1 and M2 MPS Training +### Apple Silicon MPS Training -With the support for Apple M1 and M2 chips integrated in the Ultralytics YOLO models, it's now possible to train your models on devices utilizing the powerful Metal Performance Shaders (MPS) framework. The MPS offers a high-performance way of executing computation and image processing tasks on Apple's custom silicon. +With the support for Apple silicon chips integrated in the Ultralytics YOLO models, it's now possible to train your models on devices utilizing the powerful Metal Performance Shaders (MPS) framework. The MPS offers a high-performance way of executing computation and image processing tasks on Apple's custom silicon. -To enable training on Apple M1 and M2 chips, you should specify 'mps' as your device when initiating the training process. Below is an example of how you could do this in Python and via the command line: +To enable training on Apple silicon chips, you should specify 'mps' as your device when initiating the training process. Below is an example of how you could do this in Python and via the command line: !!! 
example "MPS Training Example" @@ -134,7 +134,7 @@ To enable training on Apple M1 and M2 chips, you should specify 'mps' as your de yolo detect train data=coco8.yaml model=yolo11n.pt epochs=100 imgsz=640 device=mps ``` -While leveraging the computational power of the M1/M2 chips, this enables more efficient processing of the training tasks. For more detailed guidance and advanced configuration options, please refer to the [PyTorch MPS documentation](https://pytorch.org/docs/stable/notes/mps.html). +While leveraging the computational power of the Apple silicon chips, this enables more efficient processing of the training tasks. For more detailed guidance and advanced configuration options, please refer to the [PyTorch MPS documentation](https://pytorch.org/docs/stable/notes/mps.html). ### Resuming Interrupted Trainings @@ -335,9 +335,9 @@ To resume training from an interrupted session, set the `resume` argument to `Tr Check the section on [Resuming Interrupted Trainings](#resuming-interrupted-trainings) for more information. -### Can I train YOLO11 models on Apple M1 and M2 chips? +### Can I train YOLO11 models on Apple silicon chips? -Yes, Ultralytics YOLO11 supports training on Apple M1 and M2 chips utilizing the Metal Performance Shaders (MPS) framework. Specify 'mps' as your training device. +Yes, Ultralytics YOLO11 supports training on Apple silicon chips utilizing the Metal Performance Shaders (MPS) framework. Specify 'mps' as your training device. !!! 
example "MPS Training Example" @@ -349,7 +349,7 @@ Yes, Ultralytics YOLO11 supports training on Apple M1 and M2 chips utilizing the # Load a pretrained model model = YOLO("yolo11n.pt") - # Train the model on M1/M2 chip + # Train the model on Apple silicon chip (M1/M2/M3/M4) results = model.train(data="coco8.yaml", epochs=100, imgsz=640, device="mps") ``` @@ -359,7 +359,7 @@ Yes, Ultralytics YOLO11 supports training on Apple M1 and M2 chips utilizing the yolo detect train data=coco8.yaml model=yolo11n.pt epochs=100 imgsz=640 device=mps ``` -For more details, refer to the [Apple M1 and M2 MPS Training](#apple-m1-and-m2-mps-training) section. +For more details, refer to the [Apple Silicon MPS Training](#apple-silicon-mps-training) section. ### What are the common training settings, and how do I configure them? From 603fa84774376b10d8783ffa9017f6b9b9b84861 Mon Sep 17 00:00:00 2001 From: Abirami Vina Date: Tue, 5 Nov 2024 05:52:46 +0530 Subject: [PATCH 40/46] Add Albumentations Integrations Docs Page (#17297) Co-authored-by: UltralyticsAssistant Co-authored-by: Glenn Jocher Co-authored-by: Francesco Mattioli --- docs/en/integrations/albumentations.md | 160 +++++++++++++++++++++++++ docs/en/integrations/index.md | 2 + mkdocs.yml | 1 + 3 files changed, 163 insertions(+) create mode 100644 docs/en/integrations/albumentations.md diff --git a/docs/en/integrations/albumentations.md b/docs/en/integrations/albumentations.md new file mode 100644 index 0000000000..3c407093e9 --- /dev/null +++ b/docs/en/integrations/albumentations.md @@ -0,0 +1,160 @@ +--- +comments: true +description: Learn how to use Albumentations with YOLO11 to enhance data augmentation, improve model performance, and streamline your computer vision projects. 
+keywords: Albumentations, YOLO11, data augmentation, Ultralytics, computer vision, object detection, model training, image transformations, machine learning +--- + +# Enhance Your Dataset to Train YOLO11 Using Albumentations + +When you are building [computer vision models](../models/index.md), the quality and variety of your [training data](../datasets/index.md) can play a big role in how well your model performs. Albumentations offers a fast, flexible, and efficient way to apply a wide range of image transformations that can improve your model's ability to adapt to real-world scenarios. It easily integrates with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics) and can help you create robust datasets for [object detection](../tasks/detect.md), [segmentation](../tasks/segment.md), and [classification](../tasks/classify.md) tasks. + +By using Albumentations, you can boost your YOLO11 training data with techniques like geometric transformations and color adjustments. In this article, we’ll see how Albumentations can improve your [data augmentation](../guides/preprocessing_annotated_data.md) process and make your [YOLO11 projects](../solutions/index.md) even more impactful. Let’s get started! + +## Albumentations for Image Augmentation + +[Albumentations](https://albumentations.ai/) is an open-source image augmentation library created in [June 2018](https://arxiv.org/pdf/1809.06839). It is designed to simplify and accelerate the image augmentation process in [computer vision](https://www.ultralytics.com/blog/exploring-image-processing-computer-vision-and-machine-vision). Created with [performance](https://www.ultralytics.com/blog/measuring-ai-performance-to-weigh-the-impact-of-your-innovations) and flexibility in mind, it supports many diverse augmentation techniques, ranging from simple transformations like rotations and flips to more complex adjustments like brightness and contrast changes. 
Albumentations helps developers generate rich, varied datasets for tasks like [image classification](https://www.youtube.com/watch?v=5BO0Il_YYAg), [object detection](https://www.youtube.com/watch?v=5ku7npMrW40&t=1s), and [segmentation](https://www.youtube.com/watch?v=o4Zd-IeMlSY). + +You can use Albumentations to easily apply augmentations to images, [segmentation masks](https://www.ultralytics.com/glossary/image-segmentation), [bounding boxes](https://www.ultralytics.com/glossary/bounding-box), and [key points](../datasets/pose/index.md), and make sure that all elements of your dataset are transformed together. It works seamlessly with popular deep learning frameworks like [PyTorch](../integrations/torchscript.md) and [TensorFlow](../integrations/tensorboard.md), making it accessible for a wide range of projects. + +Also, Albumentations is a great option for augmentation whether you're handling small datasets or large-scale [computer vision tasks](../tasks/index.md). It ensures fast and efficient processing, cutting down the time spent on data preparation. At the same time, it helps improve [model performance](../guides/yolo-performance-metrics.md), making your models more effective in real-world applications. + +## Key Features of Albumentations + +Albumentations offers many useful features that simplify complex image augmentations for a wide range of [computer vision applications](https://www.ultralytics.com/blog/exploring-how-the-applications-of-computer-vision-work). Here are some of the key features: + +- **Wide Range of Transformations**: Albumentations offers over [70 different transformations](https://github.com/albumentations-team/albumentations?tab=readme-ov-file#list-of-augmentations), including geometric changes (e.g., rotation, flipping), color adjustments (e.g., brightness, contrast), and noise addition (e.g., Gaussian noise). Having multiple options enables the creation of highly diverse and robust training datasets. + +

+ Example of Image Augmentations +

+ +- **High Performance Optimization**: Built on OpenCV and NumPy, Albumentations uses advanced optimization techniques like SIMD (Single Instruction, Multiple Data), which processes multiple data points simultaneously to speed up processing. It handles large datasets quickly, making it one of the fastest options available for image augmentation. + +- **Three Levels of Augmentation**: Albumentations supports three levels of augmentation: pixel-level transformations, spatial-level transformations, and mixing-level transformation. Pixel-level transformations only affect the input images without altering masks, bounding boxes, or key points. Meanwhile, both the image and its elements, like masks and bounding boxes, are transformed using spatial-level transformations. Furthermore, mixing-level transformations are a unique way to augment data as it combines multiple images into one. + +![Overview of the Different Levels of Augmentations](https://github.com/ultralytics/docs/releases/download/0/levels-of-augmentation.avif) + +- **[Benchmarking Results](https://albumentations.ai/docs/benchmarking_results/)**: When it comes to benchmarking, Albumentations consistently outperforms other libraries, especially with large datasets. + +## Why Should You Use Albumentations for Your Vision AI Projects? + +With respect to image augmentation, Albumentations stands out as a reliable tool for computer vision tasks. Here are a few key reasons why you should consider using it for your Vision AI projects: + +- **Easy-to-Use API**: Albumentations provides a single, straightforward API for applying a wide range of augmentations to images, masks, bounding boxes, and keypoints. It’s designed to adapt easily to different datasets, making [data preparation](../guides/data-collection-and-annotation.md) simpler and more efficient. + +- **Rigorous Bug Testing**: Bugs in the augmentation pipeline can silently corrupt input data, often going unnoticed but ultimately degrading model performance. 
Albumentations addresses this with a thorough test suite that helps catch bugs early in development. + +- **Extensibility**: Albumentations can be used to easily add new augmentations and use them in computer vision pipelines through a single interface along with built-in transformations. + +## How to Use Albumentations to Augment Data for YOLO11 Training + +Now that we’ve covered what Albumentations is and what it can do, let’s look at how to use it to augment your data for YOLO11 model training. It’s easy to set up because it integrates directly into [Ultralytics’ training mode](../modes/train.md) and applies automatically if you have the Albumentations package installed. + +### Installation + +To use Albumentations with YOLOv11, start by making sure you have the necessary packages installed. If Albumentations isn’t installed, the augmentations won’t be applied during training. Once set up, you’ll be ready to create an augmented dataset for training, with Albumentations integrated to enhance your model automatically. + +!!! tip "Installation" + + === "CLI" + + ```bash + # Install the required packages + pip install albumentations ultralytics + ``` + +For detailed instructions and best practices related to the installation process, check our [Ultralytics Installation guide](../quickstart.md). While installing the required packages for YOLO11, if you encounter any difficulties, consult our [Common Issues guide](../guides/yolo-common-issues.md) for solutions and tips. + +### Usage + +After installing the necessary packages, you’re ready to start using Albumentations with YOLO11. When you train YOLOv11, a set of augmentations is automatically applied through its integration with Albumentations, making it easy to enhance your model’s performance. + +!!! 
example "Usage" + + === "Python" + + ```python + from ultralytics import YOLO + + # Load a pre-trained model + model = YOLO("yolo11n.pt") + + # Train the model + results = model.train(data="coco8.yaml", epochs=100, imgsz=640) + ``` + +Next, let’s take a closer look at the specific augmentations that are applied during training. + +### Blur + +The Blur transformation in Albumentations applies a simple blur effect to the image by averaging pixel values within a small square area, or kernel. This is done using OpenCV’s `cv2.blur` function, which helps reduce noise in the image, though it also slightly reduces image details. + +Here are the parameters and values used in this integration: + +- **blur_limit**: This controls the size range of the blur effect. The default range is (3, 7), meaning the kernel size for the blur can vary between 3 and 7 pixels, with only odd numbers allowed to keep the blur centered. + +- **p**: The probability of applying the blur. In the integration, p=0.01, so there’s a 1% chance that this blur will be applied to each image. The low probability allows for occasional blur effects, introducing a bit of variation to help the model generalize without over-blurring the images. + +An Example of the Blur Augmentation + +### Median Blur + +The MedianBlur transformation in Albumentations applies a median blur effect to the image, which is particularly useful for reducing noise while preserving edges. Unlike typical blurring methods, MedianBlur uses a median filter, which is especially effective at removing salt-and-pepper noise while maintaining sharpness around the edges. + +Here are the parameters and values used in this integration: + +- **blur_limit**: This parameter controls the maximum size of the blurring kernel. In this integration, it defaults to a range of (3, 7), meaning the kernel size for the blur is randomly chosen between 3 and 7 pixels, with only odd values allowed to ensure proper alignment. 
+ +- **p**: Sets the probability of applying the median blur. Here, p=0.01, so the transformation has a 1% chance of being applied to each image. This low probability ensures that the median blur is used sparingly, helping the model generalize by occasionally seeing images with reduced noise and preserved edges. + +The image below shows an example of this augmentation applied to an image. + +An Example of the MedianBlur Augmentation + +### Grayscale + +The ToGray transformation in Albumentations converts an image to grayscale, reducing it to a single-channel format and optionally replicating this channel to match a specified number of output channels. Different methods can be used to adjust how grayscale brightness is calculated, ranging from simple averaging to more advanced techniques for realistic perception of contrast and brightness. + +Here are the parameters and values used in this integration: + +- **num_output_channels**: Sets the number of channels in the output image. If this value is more than 1, the single grayscale channel will be replicated to create a multi-channel grayscale image. By default, it’s set to 3, giving a grayscale image with three identical channels. + +- **method**: Defines the grayscale conversion method. The default method, "weighted_average", applies a formula (0.299R + 0.587G + 0.114B) that closely aligns with human perception, providing a natural-looking grayscale effect. Other options, like "from_lab", "desaturation", "average", "max", and "pca", offer alternative ways to create grayscale images based on various needs for speed, brightness emphasis, or detail preservation. + +- **p**: Controls how often the grayscale transformation is applied. With p=0.01, there is a 1% chance of converting each image to grayscale, making it possible for a mix of color and grayscale images to help the model generalize better. + +The image below shows an example of this grayscale transformation applied. 
+ +An Example of the ToGray Augmentation + +### Contrast Limited Adaptive Histogram Equalization (CLAHE) + +The CLAHE transformation in Albumentations applies Contrast Limited Adaptive Histogram Equalization (CLAHE), a technique that enhances image contrast by equalizing the histogram in localized regions (tiles) instead of across the whole image. CLAHE produces a balanced enhancement effect, avoiding the overly amplified contrast that can result from standard histogram equalization, especially in areas with initially low contrast. + +Here are the parameters and values used in this integration: + +- **clip_limit**: Controls the contrast enhancement range. Set to a default range of (1, 4), it determines the maximum contrast allowed in each tile. Higher values are used for more contrast but may also introduce noise. + +- **tile_grid_size**: Defines the size of the grid of tiles, typically as (rows, columns). The default value is (8, 8), meaning the image is divided into an 8x8 grid. Smaller tile sizes provide more localized adjustments, while larger ones create effects closer to global equalization. + +- **p**: The probability of applying CLAHE. Here, p=0.01 introduces the enhancement effect only 1% of the time, ensuring that contrast adjustments are applied sparingly for occasional variation in training images. + +The image below shows an example of the CLAHE transformation applied. + +An Example of the CLAHE Augmentation + +## Keep Learning about Albumentations + +If you are interested in learning more about Albumentations, check out the following resources for more in-depth instructions and examples: + +- **[Albumentations Documentation](https://albumentations.ai/docs/)**: The official documentation provides a full range of supported transformations and advanced usage techniques. 
+ +- **[Ultralytics Albumentations Guide](https://docs.ultralytics.com/reference/data/augment/?h=albumentation#ultralytics.data.augment.Albumentations)**: Get a closer look at the details of the function that facilitate this integration. + +- **[Albumentations GitHub Repository](https://github.com/albumentations-team/albumentations/)**: The repository includes examples, benchmarks, and discussions to help you get started with customizing augmentations. + +## Key Takeaways + +In this guide, we explored the key aspects of Albumentations, a great Python library for image augmentation. We discussed its wide range of transformations, optimized performance, and how you can use it in your next YOLO11 project. + +Also, if you'd like to know more about other Ultralytics YOLO11 integrations, visit our [integration guide page](../integrations/index.md). You'll find valuable resources and insights there. diff --git a/docs/en/integrations/index.md b/docs/en/integrations/index.md index f2859e8388..05af439936 100644 --- a/docs/en/integrations/index.md +++ b/docs/en/integrations/index.md @@ -59,6 +59,8 @@ Welcome to the Ultralytics Integrations page! This page provides an overview of - [VS Code](vscode.md): An extension for VS Code that provides code snippets for accelerating development workflows with Ultralytics and also for anyone looking for examples to help learn or get started with Ultralytics. +- [Albumentations](albumentations.md): Enhance your Ultralytics models with powerful image augmentations to improve model robustness and generalization. + ## Deployment Integrations - [CoreML](coreml.md): CoreML, developed by [Apple](https://www.apple.com/), is a framework designed for efficiently integrating machine learning models into applications across iOS, macOS, watchOS, and tvOS, using Apple's hardware for effective and secure [model deployment](https://www.ultralytics.com/glossary/model-deployment). 
diff --git a/mkdocs.yml b/mkdocs.yml index 3ee15f83b0..20d8ec3bf1 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -417,6 +417,7 @@ nav: - TorchScript: integrations/torchscript.md - VS Code: integrations/vscode.md - Weights & Biases: integrations/weights-biases.md + - Albumentations: integrations/albumentations.md - HUB: - hub/index.md - Web: From d0abd95f95211851cb37004510f03191ac8d12be Mon Sep 17 00:00:00 2001 From: Mohammed Yasin <32206511+Y-T-G@users.noreply.github.com> Date: Tue, 5 Nov 2024 17:48:33 +0800 Subject: [PATCH 41/46] Fix error on TensorRT export with float `workspace` value (#17352) --- ultralytics/engine/exporter.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ultralytics/engine/exporter.py b/ultralytics/engine/exporter.py index 223454f600..e764dd4dc0 100644 --- a/ultralytics/engine/exporter.py +++ b/ultralytics/engine/exporter.py @@ -791,7 +791,7 @@ class Exporter: LOGGER.warning(f"{prefix} WARNING ⚠️ 'dynamic=True' model requires max batch size, i.e. 'batch=16'") profile = builder.create_optimization_profile() min_shape = (1, shape[1], 32, 32) # minimum input shape - max_shape = (*shape[:2], *(max(1, self.args.workspace) * d for d in shape[2:])) # max input shape + max_shape = (*shape[:2], *(int(max(1, self.args.workspace) * d) for d in shape[2:])) # max input shape for inp in inputs: profile.set_shape(inp.name, min=min_shape, opt=shape, max=max_shape) config.add_optimization_profile(profile) From da15e27a4db75dfc21090f5fc321c8cde53fcc50 Mon Sep 17 00:00:00 2001 From: Francesco Mattioli Date: Wed, 6 Nov 2024 13:34:06 +0100 Subject: [PATCH 42/46] Added Error for TFLite int8 end2end model export (#17360) Co-authored-by: Glenn Jocher --- ultralytics/engine/exporter.py | 2 ++ 1 file changed, 2 insertions(+) diff --git a/ultralytics/engine/exporter.py b/ultralytics/engine/exporter.py index e764dd4dc0..39d8d400bb 100644 --- a/ultralytics/engine/exporter.py +++ b/ultralytics/engine/exporter.py @@ -226,6 +226,8 @@ class Exporter: if 
self.args.optimize: assert not ncnn, "optimize=True not compatible with format='ncnn', i.e. use optimize=False" assert self.device.type == "cpu", "optimize=True not compatible with cuda devices, i.e. use device='cpu'" + if self.args.int8 and tflite: + assert not model.end2end, "TFLite INT8 export not supported for end2end models, please use half precision." if edgetpu: if not LINUX: raise SystemError("Edge TPU export only supported on Linux. See https://coral.ai/docs/edgetpu/compiler") From d049e22769bc3ba0ece62aa8b42ed96bd089411d Mon Sep 17 00:00:00 2001 From: Mahdi Amrollahi <44016758+M-Amrollahi@users.noreply.github.com> Date: Thu, 7 Nov 2024 02:33:15 +0330 Subject: [PATCH 43/46] Update kfold-cross-validation.md (#17332) Co-authored-by: Burhan <62214284+Burhan-Q@users.noreply.github.com> --- docs/en/guides/kfold-cross-validation.md | 1 + 1 file changed, 1 insertion(+) diff --git a/docs/en/guides/kfold-cross-validation.md b/docs/en/guides/kfold-cross-validation.md index 381aef0e2a..44ba8d82e8 100644 --- a/docs/en/guides/kfold-cross-validation.md +++ b/docs/en/guides/kfold-cross-validation.md @@ -263,6 +263,7 @@ fold_lbl_distrb.to_csv(save_path / "kfold_label_distribution.csv") for k in range(ksplit): dataset_yaml = ds_yamls[k] + model = YOLO(weights_path, task="detect") model.train(data=dataset_yaml, epochs=epochs, batch=batch, project=project) # include any train arguments results[k] = model.metrics # save output metrics for further analysis ``` From 3c976807b878af320e5bd85c6147bd321db26daa Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Thu, 7 Nov 2024 04:44:05 +0500 Subject: [PATCH 44/46] `ultralytics 8.3.28` new Solutions CLI commands (#17233) Signed-off-by: UltralyticsAssistant Co-authored-by: UltralyticsAssistant Co-authored-by: Glenn Jocher --- .github/workflows/ci.yaml | 3 + .github/workflows/docs.yml | 2 +- docs/en/guides/analytics.md | 16 ++- docs/en/guides/heatmaps.md | 15 ++- docs/en/guides/object-counting.md | 15 ++- 
docs/en/guides/queue-management.md | 15 ++- docs/en/guides/speed-estimation.md | 15 ++- docs/en/guides/workouts-monitoring.md | 15 ++- docs/en/integrations/albumentations.md | 18 +-- docs/en/reference/cfg/__init__.md | 4 + docs/en/solutions/index.md | 44 +++++--- docs/mkdocs_github_authors.yaml | 18 ++- ultralytics/__init__.py | 2 +- ultralytics/cfg/__init__.py | 146 ++++++++++++++++++++++++- ultralytics/solutions/ai_gym.py | 6 +- ultralytics/solutions/solutions.py | 23 +++- ultralytics/utils/__init__.py | 1 + 17 files changed, 310 insertions(+), 48 deletions(-) diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index 381e92c4c1..97b53a306e 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -325,6 +325,7 @@ jobs: yolo train model=yolo11n.pt data=coco8.yaml epochs=1 imgsz=32 yolo val model=yolo11n.pt data=coco8.yaml imgsz=32 yolo export model=yolo11n.pt format=torchscript imgsz=160 + yolo solutions - name: Test Python # Note this step must use the updated default bash environment, not a python environment run: | @@ -335,6 +336,8 @@ jobs: results = model.val(imgsz=160) results = model.predict(imgsz=160) results = model.export(format='onnx', imgsz=160) + from ultralytics.cfg import handle_yolo_solutions + handle_yolo_solutions(["show=False"]) " - name: PyTest run: | diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 360feead0c..d476d82835 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -39,7 +39,7 @@ jobs: uses: actions/checkout@v4 with: repository: ${{ github.event.pull_request.head.repo.full_name || github.repository }} - token: ${{ secrets.GITHUB_TOKEN }} + token: ${{ secrets._GITHUB_TOKEN }} ref: ${{ github.head_ref || github.ref }} fetch-depth: 0 - name: Set up Python diff --git a/docs/en/guides/analytics.md b/docs/en/guides/analytics.md index d073cd25b5..dec9b4cce8 100644 --- a/docs/en/guides/analytics.md +++ b/docs/en/guides/analytics.md @@ -33,9 +33,21 @@ This guide 
provides a comprehensive overview of three fundamental types of [data - Bar plots, on the other hand, are suitable for comparing quantities across different categories and showing relationships between a category and its numerical value. - Lastly, pie charts are effective for illustrating proportions among categories and showing parts of a whole. -!!! analytics "Analytics Examples" +!!! example "Analytics Examples" - === "Line Graph" + === "CLI" + + ```bash + yolo solutions analytics show=True + + # pass the source + yolo solutions analytics source="path/to/video/file.mp4" + + # generate the pie chart + yolo solutions analytics analytics_type="pie" show=True + ``` + + === "Python" ```python import cv2 diff --git a/docs/en/guides/heatmaps.md b/docs/en/guides/heatmaps.md index 7919bc7d94..66c26eaa01 100644 --- a/docs/en/guides/heatmaps.md +++ b/docs/en/guides/heatmaps.md @@ -36,7 +36,20 @@ A heatmap generated with [Ultralytics YOLO11](https://github.com/ultralytics/ult !!! example "Heatmaps using Ultralytics YOLO11 Example" - === "Heatmap" + === "CLI" + + ```bash + # Run a heatmap example + yolo solutions heatmap show=True + + # Pass a source video + yolo solutions heatmap source="path/to/video/file.mp4" + + # Pass a custom colormap + yolo solutions heatmap colormap=cv2.COLORMAP_INFERNO + ``` + + === "Python" ```python import cv2 diff --git a/docs/en/guides/object-counting.md b/docs/en/guides/object-counting.md index cefc9ae281..144555793d 100644 --- a/docs/en/guides/object-counting.md +++ b/docs/en/guides/object-counting.md @@ -48,7 +48,20 @@ Object counting with [Ultralytics YOLO11](https://github.com/ultralytics/ultraly !!! 
example "Object Counting using YOLO11 Example" - === "Count in Region" + === "CLI" + + ```bash + # Run a counting example + yolo solutions count show=True + + # Pass a source video + yolo solutions count source="path/to/video/file.mp4" + + # Pass region coordinates + yolo solutions count region=[(20, 400), (1080, 404), (1080, 360), (20, 360)] + ``` + + === "Python" ```python import cv2 diff --git a/docs/en/guides/queue-management.md b/docs/en/guides/queue-management.md index 32cb5b8a4e..2567a2f78a 100644 --- a/docs/en/guides/queue-management.md +++ b/docs/en/guides/queue-management.md @@ -35,7 +35,20 @@ Queue management using [Ultralytics YOLO11](https://github.com/ultralytics/ultra !!! example "Queue Management using YOLO11 Example" - === "Queue Manager" + === "CLI" + + ```bash + # Run a queue example + yolo solutions queue show=True + + # Pass a source video + yolo solutions queue source="path/to/video/file.mp4" + + # Pass queue coordinates + yolo solutions queue region=[(20, 400), (1080, 404), (1080, 360), (20, 360)] + ``` + + === "Python" ```python import cv2 diff --git a/docs/en/guides/speed-estimation.md b/docs/en/guides/speed-estimation.md index 48a9aa09eb..dd9660d149 100644 --- a/docs/en/guides/speed-estimation.md +++ b/docs/en/guides/speed-estimation.md @@ -40,7 +40,20 @@ keywords: Ultralytics YOLO11, speed estimation, object tracking, computer vision !!! 
example "Speed Estimation using YOLO11 Example" - === "Speed Estimation" + === "CLI" + + ```bash + # Run a speed example + yolo solutions speed show=True + + # Pass a source video + yolo solutions speed source="path/to/video/file.mp4" + + # Pass region coordinates + yolo solutions speed region=[(20, 400), (1080, 404), (1080, 360), (20, 360)] + ``` + + === "Python" ```python import cv2 diff --git a/docs/en/guides/workouts-monitoring.md b/docs/en/guides/workouts-monitoring.md index 34056da3fb..fac47d7ca1 100644 --- a/docs/en/guides/workouts-monitoring.md +++ b/docs/en/guides/workouts-monitoring.md @@ -36,7 +36,20 @@ Monitoring workouts through pose estimation with [Ultralytics YOLO11](https://gi !!! example "Workouts Monitoring Example" - === "Workouts Monitoring" + === "CLI" + + ```bash + # Run a workout example + yolo solutions workout show=True + + # Pass a source video + yolo solutions workout source="path/to/video/file.mp4" + + # Use keypoints for pushups + yolo solutions workout kpts=[6, 8, 10] + ``` + + === "Python" ```python import cv2 diff --git a/docs/en/integrations/albumentations.md b/docs/en/integrations/albumentations.md index 3c407093e9..e7b0d02c6c 100644 --- a/docs/en/integrations/albumentations.md +++ b/docs/en/integrations/albumentations.md @@ -8,7 +8,7 @@ keywords: Albumentations, YOLO11, data augmentation, Ultralytics, computer visio When you are building [computer vision models](../models/index.md), the quality and variety of your [training data](../datasets/index.md) can play a big role in how well your model performs. Albumentations offers a fast, flexible, and efficient way to apply a wide range of image transformations that can improve your model's ability to adapt to real-world scenarios. 
It easily integrates with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics) and can help you create robust datasets for [object detection](../tasks/detect.md), [segmentation](../tasks/segment.md), and [classification](../tasks/classify.md) tasks. -By using Albumentations, you can boost your YOLO11 training data with techniques like geometric transformations and color adjustments. In this article, we’ll see how Albumentations can improve your [data augmentation](../guides/preprocessing_annotated_data.md) process and make your [YOLO11 projects](../solutions/index.md) even more impactful. Let’s get started! +By using Albumentations, you can boost your YOLO11 training data with techniques like geometric transformations and color adjustments. In this article, we'll see how Albumentations can improve your [data augmentation](../guides/preprocessing_annotated_data.md) process and make your [YOLO11 projects](../solutions/index.md) even more impactful. Let's get started! ## Albumentations for Image Augmentation @@ -40,7 +40,7 @@ Albumentations offers many useful features that simplify complex image augmentat With respect to image augmentation, Albumentations stands out as a reliable tool for computer vision tasks. Here are a few key reasons why you should consider using it for your Vision AI projects: -- **Easy-to-Use API**: Albumentations provides a single, straightforward API for applying a wide range of augmentations to images, masks, bounding boxes, and keypoints. It’s designed to adapt easily to different datasets, making [data preparation](../guides/data-collection-and-annotation.md) simpler and more efficient. +- **Easy-to-Use API**: Albumentations provides a single, straightforward API for applying a wide range of augmentations to images, masks, bounding boxes, and keypoints. It's designed to adapt easily to different datasets, making [data preparation](../guides/data-collection-and-annotation.md) simpler and more efficient. 
- **Rigorous Bug Testing**: Bugs in the augmentation pipeline can silently corrupt input data, often going unnoticed but ultimately degrading model performance. Albumentations addresses this with a thorough test suite that helps catch bugs early in development. @@ -48,11 +48,11 @@ With respect to image augmentation, Albumentations stands out as a reliable tool ## How to Use Albumentations to Augment Data for YOLO11 Training -Now that we’ve covered what Albumentations is and what it can do, let’s look at how to use it to augment your data for YOLO11 model training. It’s easy to set up because it integrates directly into [Ultralytics’ training mode](../modes/train.md) and applies automatically if you have the Albumentations package installed. +Now that we've covered what Albumentations is and what it can do, let's look at how to use it to augment your data for YOLO11 model training. It's easy to set up because it integrates directly into [Ultralytics' training mode](../modes/train.md) and applies automatically if you have the Albumentations package installed. ### Installation -To use Albumentations with YOLOv11, start by making sure you have the necessary packages installed. If Albumentations isn’t installed, the augmentations won’t be applied during training. Once set up, you’ll be ready to create an augmented dataset for training, with Albumentations integrated to enhance your model automatically. +To use Albumentations with YOLOv11, start by making sure you have the necessary packages installed. If Albumentations isn't installed, the augmentations won't be applied during training. Once set up, you'll be ready to create an augmented dataset for training, with Albumentations integrated to enhance your model automatically. !!! tip "Installation" @@ -67,7 +67,7 @@ For detailed instructions and best practices related to the installation process ### Usage -After installing the necessary packages, you’re ready to start using Albumentations with YOLO11. 
When you train YOLOv11, a set of augmentations is automatically applied through its integration with Albumentations, making it easy to enhance your model’s performance. +After installing the necessary packages, you're ready to start using Albumentations with YOLO11. When you train YOLOv11, a set of augmentations is automatically applied through its integration with Albumentations, making it easy to enhance your model's performance. !!! example "Usage" @@ -83,17 +83,17 @@ After installing the necessary packages, you’re ready to start using Albumenta results = model.train(data="coco8.yaml", epochs=100, imgsz=640) ``` -Next, let’s take look a closer look at the specific augmentations that are applied during training. +Next, let's take look a closer look at the specific augmentations that are applied during training. ### Blur -The Blur transformation in Albumentations applies a simple blur effect to the image by averaging pixel values within a small square area, or kernel. This is done using OpenCV’s `cv2.blur` function, which helps reduce noise in the image, though it also slightly reduces image details. +The Blur transformation in Albumentations applies a simple blur effect to the image by averaging pixel values within a small square area, or kernel. This is done using OpenCV's `cv2.blur` function, which helps reduce noise in the image, though it also slightly reduces image details. Here are the parameters and values used in this integration: - **blur_limit**: This controls the size range of the blur effect. The default range is (3, 7), meaning the kernel size for the blur can vary between 3 and 7 pixels, with only odd numbers allowed to keep the blur centered. -- **p**: The probability of applying the blur. In the integration, p=0.01, so there’s a 1% chance that this blur will be applied to each image. The low probability allows for occasional blur effects, introducing a bit of variation to help the model generalize without over-blurring the images. 
+- **p**: The probability of applying the blur. In the integration, p=0.01, so there's a 1% chance that this blur will be applied to each image. The low probability allows for occasional blur effects, introducing a bit of variation to help the model generalize without over-blurring the images. An Example of the Blur Augmentation @@ -117,7 +117,7 @@ The ToGray transformation in Albumentations converts an image to grayscale, redu Here are the parameters and values used in this integration: -- **num_output_channels**: Sets the number of channels in the output image. If this value is more than 1, the single grayscale channel will be replicated to create a multi-channel grayscale image. By default, it’s set to 3, giving a grayscale image with three identical channels. +- **num_output_channels**: Sets the number of channels in the output image. If this value is more than 1, the single grayscale channel will be replicated to create a multi-channel grayscale image. By default, it's set to 3, giving a grayscale image with three identical channels. - **method**: Defines the grayscale conversion method. The default method, "weighted_average", applies a formula (0.299R + 0.587G + 0.114B) that closely aligns with human perception, providing a natural-looking grayscale effect. Other options, like "from_lab", "desaturation", "average", "max", and "pca", offer alternative ways to create grayscale images based on various needs for speed, brightness emphasis, or detail preservation. diff --git a/docs/en/reference/cfg/__init__.md b/docs/en/reference/cfg/__init__.md index 69652aa06c..92320b126b 100644 --- a/docs/en/reference/cfg/__init__.md +++ b/docs/en/reference/cfg/__init__.md @@ -47,6 +47,10 @@ keywords: Ultralytics, YOLO, configuration, cfg2dict, get_cfg, check_cfg, save_d



+## ::: ultralytics.cfg.handle_yolo_solutions + +



+ ## ::: ultralytics.cfg.handle_streamlit_inference



diff --git a/docs/en/solutions/index.md b/docs/en/solutions/index.md index e5187ed8d4..4046975de7 100644 --- a/docs/en/solutions/index.md +++ b/docs/en/solutions/index.md @@ -14,21 +14,39 @@ Ultralytics Solutions provide cutting-edge applications of YOLO models, offering Here's our curated list of Ultralytics solutions that can be used to create awesome [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) projects. -- [Object Counting](../guides/object-counting.md) 🚀 NEW: Learn to perform real-time object counting with YOLO11. Gain the expertise to accurately count objects in live video streams. -- [Object Cropping](../guides/object-cropping.md) 🚀 NEW: Master object cropping with YOLO11 for precise extraction of objects from images and videos. -- [Object Blurring](../guides/object-blurring.md) 🚀 NEW: Apply object blurring using YOLO11 to protect privacy in image and video processing. -- [Workouts Monitoring](../guides/workouts-monitoring.md) 🚀 NEW: Discover how to monitor workouts using YOLO11. Learn to track and analyze various fitness routines in real time. -- [Objects Counting in Regions](../guides/region-counting.md) 🚀 NEW: Count objects in specific regions using YOLO11 for accurate detection in varied areas. -- [Security Alarm System](../guides/security-alarm-system.md) 🚀 NEW: Create a security alarm system with YOLO11 that triggers alerts upon detecting new objects. Customize the system to fit your specific needs. -- [Heatmaps](../guides/heatmaps.md) 🚀 NEW: Utilize detection heatmaps to visualize data intensity across a matrix, providing clear insights in computer vision tasks. +- [Object Counting](../guides/object-counting.md) 🚀: Learn to perform real-time object counting with YOLO11. Gain the expertise to accurately count objects in live video streams. +- [Object Cropping](../guides/object-cropping.md) 🚀: Master object cropping with YOLO11 for precise extraction of objects from images and videos. 
+- [Object Blurring](../guides/object-blurring.md) 🚀: Apply object blurring using YOLO11 to protect privacy in image and video processing. +- [Workouts Monitoring](../guides/workouts-monitoring.md) 🚀: Discover how to monitor workouts using YOLO11. Learn to track and analyze various fitness routines in real time. +- [Objects Counting in Regions](../guides/region-counting.md) 🚀: Count objects in specific regions using YOLO11 for accurate detection in varied areas. +- [Security Alarm System](../guides/security-alarm-system.md) 🚀: Create a security alarm system with YOLO11 that triggers alerts upon detecting new objects. Customize the system to fit your specific needs. +- [Heatmaps](../guides/heatmaps.md) 🚀: Utilize detection heatmaps to visualize data intensity across a matrix, providing clear insights in computer vision tasks. - [Instance Segmentation with Object Tracking](../guides/instance-segmentation-and-tracking.md) 🚀 NEW: Implement [instance segmentation](https://www.ultralytics.com/glossary/instance-segmentation) and object tracking with YOLO11 to achieve precise object boundaries and continuous monitoring. -- [VisionEye View Objects Mapping](../guides/vision-eye.md) 🚀 NEW: Develop systems that mimic human eye focus on specific objects, enhancing the computer's ability to discern and prioritize details. -- [Speed Estimation](../guides/speed-estimation.md) 🚀 NEW: Estimate object speed using YOLO11 and object tracking techniques, crucial for applications like autonomous vehicles and traffic monitoring. -- [Distance Calculation](../guides/distance-calculation.md) 🚀 NEW: Calculate distances between objects using [bounding box](https://www.ultralytics.com/glossary/bounding-box) centroids in YOLO11, essential for spatial analysis. -- [Queue Management](../guides/queue-management.md) 🚀 NEW: Implement efficient queue management systems to minimize wait times and improve productivity using YOLO11. 
-- [Parking Management](../guides/parking-management.md) 🚀 NEW: Organize and direct vehicle flow in parking areas with YOLO11, optimizing space utilization and user experience. +- [VisionEye View Objects Mapping](../guides/vision-eye.md) 🚀: Develop systems that mimic human eye focus on specific objects, enhancing the computer's ability to discern and prioritize details. +- [Speed Estimation](../guides/speed-estimation.md) 🚀: Estimate object speed using YOLO11 and object tracking techniques, crucial for applications like autonomous vehicles and traffic monitoring. +- [Distance Calculation](../guides/distance-calculation.md) 🚀: Calculate distances between objects using [bounding box](https://www.ultralytics.com/glossary/bounding-box) centroids in YOLO11, essential for spatial analysis. +- [Queue Management](../guides/queue-management.md) 🚀: Implement efficient queue management systems to minimize wait times and improve productivity using YOLO11. +- [Parking Management](../guides/parking-management.md) 🚀: Organize and direct vehicle flow in parking areas with YOLO11, optimizing space utilization and user experience. - [Analytics](../guides/analytics.md) 📊 NEW: Conduct comprehensive data analysis to discover patterns and make informed decisions, leveraging YOLO11 for descriptive, predictive, and prescriptive analytics. -- [Live Inference with Streamlit](../guides/streamlit-live-inference.md) 🚀 NEW: Leverage the power of YOLO11 for real-time [object detection](https://www.ultralytics.com/glossary/object-detection) directly through your web browser with a user-friendly Streamlit interface. +- [Live Inference with Streamlit](../guides/streamlit-live-inference.md) 🚀: Leverage the power of YOLO11 for real-time [object detection](https://www.ultralytics.com/glossary/object-detection) directly through your web browser with a user-friendly Streamlit interface. + +## Solutions Usage + +!!! 
tip "Command Info" + + `yolo SOLUTIONS SOLUTION_NAME ARGS` + + - **SOLUTIONS** is a required keyword. + - **SOLUTION_NAME** (optional) is one of: `['count', 'heatmap', 'queue', 'speed', 'workout', 'analytics']`. + - **ARGS** (optional) are custom `arg=value` pairs, such as `show_in=True`, to override default settings. + + === "CLI" + + ```bash + yolo solutions count show=True # for object counting + + yolo solutions source="path/to/video/file.mp4" # specify video file path + ``` ## Contribute to Our Solutions diff --git a/docs/mkdocs_github_authors.yaml b/docs/mkdocs_github_authors.yaml index f91a730b87..6d91127d59 100644 --- a/docs/mkdocs_github_authors.yaml +++ b/docs/mkdocs_github_authors.yaml @@ -25,6 +25,9 @@ 17316848+maianumerosky@users.noreply.github.com: avatar: https://avatars.githubusercontent.com/u/17316848?v=4 username: maianumerosky +25704330+JairajJangle@users.noreply.github.com: + avatar: https://avatars.githubusercontent.com/u/25704330?v=4 + username: JairajJangle 32206511+Y-T-G@users.noreply.github.com: avatar: https://avatars.githubusercontent.com/u/32206511?v=4 username: Y-T-G @@ -40,6 +43,9 @@ 40165666+berry-ding@users.noreply.github.com: avatar: https://avatars.githubusercontent.com/u/40165666?v=4 username: berry-ding +44016758+M-Amrollahi@users.noreply.github.com: + avatar: https://avatars.githubusercontent.com/u/44016758?v=4 + username: M-Amrollahi 46103969+inisis@users.noreply.github.com: avatar: https://avatars.githubusercontent.com/u/46103969?v=4 username: inisis @@ -76,6 +82,9 @@ 79740115+0xSynapse@users.noreply.github.com: avatar: https://avatars.githubusercontent.com/u/79740115?v=4 username: 0xSynapse +8401806+wangzhaode@users.noreply.github.com: + avatar: https://avatars.githubusercontent.com/u/8401806?v=4 + username: wangzhaode 91465467+lalayants@users.noreply.github.com: avatar: https://avatars.githubusercontent.com/u/91465467?v=4 username: lalayants @@ -97,6 +106,9 @@ ayush.chaurarsia@gmail.com: chr043416@gmail.com: avatar: 
https://avatars.githubusercontent.com/u/62513924?v=4 username: RizwanMunawar +davis.justin@mssm.org: + avatar: https://avatars.githubusercontent.com/u/23462437?v=4 + username: justincdavis glenn.jocher@ultralytics.com: avatar: https://avatars.githubusercontent.com/u/26833433?v=4 username: glenn-jocher @@ -157,9 +169,3 @@ xinwang614@gmail.com: zhaode.wzd@alibaba-inc.com: avatar: https://avatars.githubusercontent.com/u/8401806?v=4 username: wangzhaode -8401806+wangzhaode@users.noreply.github.com: - avatar: https://avatars.githubusercontent.com/u/8401806?v=4 - username: wangzhaode -davis.justin@mssm.org: - avatar: https://avatars.githubusercontent.com/u/23462437?v=4 - username: justincdavis diff --git a/ultralytics/__init__.py b/ultralytics/__init__.py index e24b210eda..f6b1d2e783 100644 --- a/ultralytics/__init__.py +++ b/ultralytics/__init__.py @@ -1,6 +1,6 @@ # Ultralytics YOLO 🚀, AGPL-3.0 license -__version__ = "8.3.27" +__version__ = "8.3.28" import os diff --git a/ultralytics/cfg/__init__.py b/ultralytics/cfg/__init__.py index 0af93a37d3..c0675620b8 100644 --- a/ultralytics/cfg/__init__.py +++ b/ultralytics/cfg/__init__.py @@ -7,11 +7,15 @@ from pathlib import Path from types import SimpleNamespace from typing import Dict, List, Union +import cv2 + from ultralytics.utils import ( ASSETS, + ASSETS_URL, DEFAULT_CFG, DEFAULT_CFG_DICT, DEFAULT_CFG_PATH, + DEFAULT_SOL_DICT, IS_VSCODE, LOGGER, RANK, @@ -30,6 +34,17 @@ from ultralytics.utils import ( yaml_print, ) +# Define valid solutions +SOLUTION_MAP = { + "count": ("ObjectCounter", "count"), + "heatmap": ("Heatmap", "generate_heatmap"), + "queue": ("QueueManager", "process_queue"), + "speed": ("SpeedEstimator", "estimate_speed"), + "workout": ("AIGym", "monitor"), + "analytics": ("Analytics", "process_data"), + "help": None, +} + # Define valid tasks and modes MODES = {"train", "val", "predict", "export", "track", "benchmark"} TASKS = {"detect", "segment", "classify", "pose", "obb"} @@ -57,6 +72,31 @@ TASK2METRIC = 
{ MODELS = {TASK2MODEL[task] for task in TASKS} ARGV = sys.argv or ["", ""] # sometimes sys.argv = [] +SOLUTIONS_HELP_MSG = f""" + Arguments received: {str(['yolo'] + ARGV[1:])}. Ultralytics 'yolo solutions' usage overview: + + yolo SOLUTIONS SOLUTION ARGS + + Where SOLUTIONS (required) is a keyword + SOLUTION (optional) is one of {list(SOLUTION_MAP.keys())} + ARGS (optional) are any number of custom 'arg=value' pairs like 'show_in=True' that override defaults. + See all ARGS at https://docs.ultralytics.com/usage/cfg or with 'yolo cfg' + + 1. Call object counting solution + yolo solutions count source="path/to/video/file.mp4" region=[(20, 400), (1080, 404), (1080, 360), (20, 360)] + + 2. Call heatmaps solution + yolo solutions heatmap colormap=cv2.COLORMAP_PARAULA model=yolo11n.pt + + 3. Call queue management solution + yolo solutions queue region=[(20, 400), (1080, 404), (1080, 360), (20, 360)] model=yolo11n.pt + + 4. Call workouts monitoring solution for push-ups + yolo solutions workout model=yolo11n-pose.pt kpts=[6, 8, 10] + + 5. Generate analytical graphs + yolo solutions analytics analytics_type="pie" + """ CLI_HELP_MSG = f""" Arguments received: {str(['yolo'] + ARGV[1:])}. Ultralytics 'yolo' commands use the following syntax: @@ -78,19 +118,24 @@ CLI_HELP_MSG = f""" 4. Export a YOLO11n classification model to ONNX format at image size 224 by 128 (no TASK required) yolo export model=yolo11n-cls.pt format=onnx imgsz=224,128 - + 5. Streamlit real-time webcam inference GUI yolo streamlit-predict - - 6. Run special commands: + + 6. Ultralytics solutions usage + yolo solutions count or in {list(SOLUTION_MAP.keys())} source="path/to/video/file.mp4" + + 7. 
Run special commands: yolo help yolo checks yolo version yolo settings yolo copy-cfg yolo cfg + yolo solutions help Docs: https://docs.ultralytics.com + Solutions: https://docs.ultralytics.com/solutions/ Community: https://community.ultralytics.com GitHub: https://github.com/ultralytics/ultralytics """ @@ -568,6 +613,100 @@ def handle_yolo_settings(args: List[str]) -> None: LOGGER.warning(f"WARNING ⚠️ settings error: '{e}'. Please see {url} for help.") +def handle_yolo_solutions(args: List[str]) -> None: + """ + Processes YOLO solutions arguments and runs the specified computer vision solutions pipeline. + + Args: + args (List[str]): Command-line arguments for configuring and running the Ultralytics YOLO + solutions: https://docs.ultralytics.com/solutions/, It can include solution name, source, + and other configuration parameters. + + Returns: + None: The function processes video frames and saves the output but doesn't return any value. + + Examples: + Run people counting solution with default settings: + >>> handle_yolo_solutions(["count"]) + + Run analytics with custom configuration: + >>> handle_yolo_solutions(["analytics", "conf=0.25", "source=path/to/video/file.mp4"]) + + Notes: + - Default configurations are merged from DEFAULT_SOL_DICT and DEFAULT_CFG_DICT + - Arguments can be provided in the format 'key=value' or as boolean flags + - Available solutions are defined in SOLUTION_MAP with their respective classes and methods + - If an invalid solution is provided, defaults to 'count' solution + - Output videos are saved in 'runs/solution/{solution_name}' directory + - For 'analytics' solution, frame numbers are tracked for generating analytical graphs + - Video processing can be interrupted by pressing 'q' + - Processes video frames sequentially and saves output in .avi format + - If no source is specified, downloads and uses a default sample video + """ + full_args_dict = {**DEFAULT_SOL_DICT, **DEFAULT_CFG_DICT} # arguments dictionary + overrides = {} + + # 
check dictionary alignment + for arg in merge_equals_args(args): + arg = arg.lstrip("-").rstrip(",") + if "=" in arg: + try: + k, v = parse_key_value_pair(arg) + overrides[k] = v + except (NameError, SyntaxError, ValueError, AssertionError) as e: + check_dict_alignment(full_args_dict, {arg: ""}, e) + elif arg in full_args_dict and isinstance(full_args_dict.get(arg), bool): + overrides[arg] = True + check_dict_alignment(full_args_dict, overrides) # dict alignment + + # Get solution name + if args and args[0] in SOLUTION_MAP: + if args[0] != "help": + s_n = args.pop(0) # Extract the solution name directly + else: + LOGGER.info(SOLUTIONS_HELP_MSG) + else: + LOGGER.warning( + f"⚠️ No valid solution provided. Using default 'count'. Available: {', '.join(SOLUTION_MAP.keys())}" + ) + s_n = "count" # Default solution if none provided + + cls, method = SOLUTION_MAP[s_n] # solution class name, method name and default source + + from ultralytics import solutions # import ultralytics solutions + + solution = getattr(solutions, cls)(IS_CLI=True, **overrides) # get solution class i.e ObjectCounter + process = getattr(solution, method) # get specific function of class for processing i.e, count from ObjectCounter + + cap = cv2.VideoCapture(solution.CFG["source"]) # read the video file + + # extract width, height and fps of the video file, create save directory and initialize video writer + import os # for directory creation + from pathlib import Path + + from ultralytics.utils.files import increment_path # for output directory path update + + w, h, fps = (int(cap.get(x)) for x in (cv2.CAP_PROP_FRAME_WIDTH, cv2.CAP_PROP_FRAME_HEIGHT, cv2.CAP_PROP_FPS)) + if s_n == "analytics": # analytical graphs follow fixed shape for output i.e w=1920, h=1080 + w, h = 1920, 1080 + save_dir = increment_path(Path("runs") / "solutions" / "exp", exist_ok=False) + save_dir.mkdir(parents=True, exist_ok=True) # create the output directory + vw = cv2.VideoWriter(os.path.join(save_dir, "solution.avi"), 
cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h)) + + try: # Process video frames + f_n = 0 # frame number, required for analytical graphs + while cap.isOpened(): + success, frame = cap.read() + if not success: + break + frame = process(frame, f_n := f_n + 1) if s_n == "analytics" else process(frame) + vw.write(frame) + if cv2.waitKey(1) & 0xFF == ord("q"): + break + finally: + cap.release() + + def handle_streamlit_inference(): """ Open the Ultralytics Live Inference Streamlit app for real-time object detection. @@ -709,6 +848,7 @@ def entrypoint(debug=""): "logout": lambda: handle_yolo_hub(args), "copy-cfg": copy_default_cfg, "streamlit-predict": lambda: handle_streamlit_inference(), + "solutions": lambda: handle_yolo_solutions(args[1:]), } full_args_dict = {**DEFAULT_CFG_DICT, **{k: None for k in TASKS}, **{k: None for k in MODES}, **special} diff --git a/ultralytics/solutions/ai_gym.py b/ultralytics/solutions/ai_gym.py index 0d131bd9d6..68e3697627 100644 --- a/ultralytics/solutions/ai_gym.py +++ b/ultralytics/solutions/ai_gym.py @@ -19,7 +19,6 @@ class AIGym(BaseSolution): up_angle (float): Angle threshold for considering the 'up' position of an exercise. down_angle (float): Angle threshold for considering the 'down' position of an exercise. kpts (List[int]): Indices of keypoints used for angle calculation. - lw (int): Line width for drawing annotations. annotator (Annotator): Object for drawing annotations on the image. 
Methods: @@ -51,7 +50,6 @@ class AIGym(BaseSolution): self.up_angle = float(self.CFG["up_angle"]) # Pose up predefined angle to consider up pose self.down_angle = float(self.CFG["down_angle"]) # Pose down predefined angle to consider down pose self.kpts = self.CFG["kpts"] # User selected kpts of workouts storage for further usage - self.lw = self.CFG["line_width"] # Store line_width for usage def monitor(self, im0): """ @@ -84,14 +82,14 @@ class AIGym(BaseSolution): self.stage += ["-"] * new_human # Initialize annotator - self.annotator = Annotator(im0, line_width=self.lw) + self.annotator = Annotator(im0, line_width=self.line_width) # Enumerate over keypoints for ind, k in enumerate(reversed(tracks.keypoints.data)): # Get keypoints and estimate the angle kpts = [k[int(self.kpts[i])].cpu() for i in range(3)] self.angle[ind] = self.annotator.estimate_pose_angle(*kpts) - im0 = self.annotator.draw_specific_points(k, self.kpts, radius=self.lw * 3) + im0 = self.annotator.draw_specific_points(k, self.kpts, radius=self.line_width * 3) # Determine stage and count logic based on angle thresholds if self.angle[ind] < self.down_angle: diff --git a/ultralytics/solutions/solutions.py b/ultralytics/solutions/solutions.py index e43aba6441..20c2ce90b7 100644 --- a/ultralytics/solutions/solutions.py +++ b/ultralytics/solutions/solutions.py @@ -5,7 +5,7 @@ from collections import defaultdict import cv2 from ultralytics import YOLO -from ultralytics.utils import DEFAULT_CFG_DICT, DEFAULT_SOL_DICT, LOGGER +from ultralytics.utils import ASSETS_URL, DEFAULT_CFG_DICT, DEFAULT_SOL_DICT, LOGGER from ultralytics.utils.checks import check_imshow, check_requirements @@ -42,8 +42,12 @@ class BaseSolution: >>> solution.display_output(image) """ - def __init__(self, **kwargs): - """Initializes the BaseSolution class with configuration settings and YOLO model for Ultralytics solutions.""" + def __init__(self, IS_CLI=False, **kwargs): + """ + Initializes the `BaseSolution` class with configuration 
settings and the YOLO model for Ultralytics solutions. + + IS_CLI (optional): Enables CLI mode if set. + """ check_requirements("shapely>=2.0.0") from shapely.geometry import LineString, Point, Polygon @@ -63,9 +67,20 @@ class BaseSolution: ) # Store line_width for usage # Load Model and store classes names - self.model = YOLO(self.CFG["model"] if self.CFG["model"] else "yolov8n.pt") + if self.CFG["model"] is None: + self.CFG["model"] = "yolo11n.pt" + self.model = YOLO(self.CFG["model"]) self.names = self.model.names + if IS_CLI: # for CLI, download the source and init video writer + if self.CFG["source"] is None: + d_s = "solutions_ci_demo.mp4" if "-pose" not in self.CFG["model"] else "solution_ci_pose_demo.mp4" + LOGGER.warning(f"⚠️ WARNING: source not provided. using default source {ASSETS_URL}/{d_s}") + from ultralytics.utils.downloads import safe_download + + safe_download(f"{ASSETS_URL}/{d_s}") # download source from ultralytics assets + self.CFG["source"] = d_s # set default source + # Initialize environment and region setup self.env_check = check_imshow(warn=True) self.track_history = defaultdict(list) diff --git a/ultralytics/utils/__init__.py b/ultralytics/utils/__init__.py index d9cd96e3c4..a2540c6b85 100644 --- a/ultralytics/utils/__init__.py +++ b/ultralytics/utils/__init__.py @@ -37,6 +37,7 @@ ARGV = sys.argv or ["", ""] # sometimes sys.argv = [] FILE = Path(__file__).resolve() ROOT = FILE.parents[1] # YOLO ASSETS = ROOT / "assets" # default images +ASSETS_URL = "https://github.com/ultralytics/assets/releases/download/v0.0.0" # assets GitHub URL DEFAULT_CFG_PATH = ROOT / "cfg/default.yaml" DEFAULT_SOL_CFG_PATH = ROOT / "cfg/solutions/default.yaml" # Ultralytics solutions yaml path NUM_THREADS = min(8, max(1, os.cpu_count() - 1)) # number of YOLO multiprocessing threads From 8e5db0661c54ea21e2d89aea9f031427ae048350 Mon Sep 17 00:00:00 2001 From: Muhammad Rizwan Munawar Date: Thu, 7 Nov 2024 16:18:49 +0500 Subject: [PATCH 45/46] Docs and CI updates 
(#17386) Co-authored-by: Glenn Jocher --- .github/workflows/ci.yaml | 2 -- docs/en/guides/workouts-monitoring.md | 2 +- 2 files changed, 1 insertion(+), 3 deletions(-) diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index 97b53a306e..9b1c5364a6 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -336,8 +336,6 @@ jobs: results = model.val(imgsz=160) results = model.predict(imgsz=160) results = model.export(format='onnx', imgsz=160) - from ultralytics.cfg import handle_yolo_solutions - handle_yolo_solutions(["show=False"]) " - name: PyTest run: | diff --git a/docs/en/guides/workouts-monitoring.md b/docs/en/guides/workouts-monitoring.md index fac47d7ca1..949b298350 100644 --- a/docs/en/guides/workouts-monitoring.md +++ b/docs/en/guides/workouts-monitoring.md @@ -46,7 +46,7 @@ Monitoring workouts through pose estimation with [Ultralytics YOLO11](https://gi yolo solutions workout source="path/to/video/file.mp4" # Use keypoints for pushups - yolo solutions queue kpts=[6, 8, 10] + yolo solutions workout kpts=[6, 8, 10] ``` === "Python" From 6806f15396432fffb951250d650454b840eb4c28 Mon Sep 17 00:00:00 2001 From: Laughing <61612323+Laughing-q@users.noreply.github.com> Date: Thu, 7 Nov 2024 19:20:54 +0800 Subject: [PATCH 46/46] Fix `model.end2end` assert (#17391) Co-authored-by: UltralyticsAssistant Co-authored-by: Glenn Jocher --- ultralytics/engine/exporter.py | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ultralytics/engine/exporter.py b/ultralytics/engine/exporter.py index 39d8d400bb..00a7b6c7a7 100644 --- a/ultralytics/engine/exporter.py +++ b/ultralytics/engine/exporter.py @@ -227,7 +227,7 @@ class Exporter: assert not ncnn, "optimize=True not compatible with format='ncnn', i.e. use optimize=False" assert self.device.type == "cpu", "optimize=True not compatible with cuda devices, i.e. 
use device='cpu'" if self.args.int8 and tflite: - assert not model.end2end, "TFLite INT8 export not supported for end2end models, please use half precision." + assert not getattr(model, "end2end", False), "TFLite INT8 export not supported for end2end models." if edgetpu: if not LINUX: raise SystemError("Edge TPU export only supported on Linux. See https://coral.ai/docs/edgetpu/compiler")
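
The `model.end2end` change in the last patch (#17391) can be illustrated in isolation. The following is a minimal sketch, not the exporter's code: `LegacyModel`, `End2EndModel`, `is_end2end`, and `check_tflite_int8` are hypothetical names introduced here for illustration. It assumes the original assert could fail with `AttributeError` on models that never define `end2end`, which is what the `getattr(..., False)` default guards against.

```python
class LegacyModel:
    """Hypothetical stand-in for a model class that never defines `end2end`."""


class End2EndModel:
    """Hypothetical stand-in for a model that declares itself end-to-end."""

    end2end = True


def is_end2end(model) -> bool:
    """Return the model's `end2end` flag, treating a missing attribute as False."""
    # Plain `model.end2end` raises AttributeError when the attribute is absent;
    # getattr with a default falls back to False instead.
    return bool(getattr(model, "end2end", False))


def check_tflite_int8(model) -> None:
    """Mirror the patched assertion: reject end2end models, accept all others."""
    assert not is_end2end(model), "TFLite INT8 export not supported for end2end models."


if __name__ == "__main__":
    check_tflite_int8(LegacyModel())  # passes: missing attribute is treated as False
    try:
        check_tflite_int8(End2EndModel())
    except AssertionError as e:
        print(f"rejected: {e}")
```

With the default in place, a model lacking the attribute is treated as not end-to-end, so the check only trips for models that explicitly set `end2end = True`.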