description:Many issues are often related to dependency versions and hardware. Please provide the output of `yolo checks` (CLI) or `ultralytics.utils.checks.collect_system_info()` (Python) command to help us diagnose the problem.
placeholder: |
Paste output of `yolo checks` (CLI) or `ultralytics.utils.checks.collect_system_info()` (Python) command, i.e.:
```
Ultralytics 8.3.2 🚀 Python-3.11.2 torch-2.4.1 CPU (Apple M3)
```
Pip install the ultralytics package including all [requirements](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) in a [**Python>=3.8**](https://www.python.org/) environment with [**PyTorch>=1.8**](https://pytorch.org/get-started/locally/).
- **mAP<sup>val</sup>** values are for single-model single-scale on [COCO val2017](https://cocodataset.org/) dataset. <br>Reproduce by `yolo val segment data=coco.yaml device=0`
- **Speed** averaged over COCO val images using an [Amazon EC2 P4d](https://aws.amazon.com/ec2/instance-types/p4/) instance. <br>Reproduce by `yolo val segment data=coco.yaml batch=1 device=0|cpu`
- [COCO8-pose](pose/coco8-pose.md): A smaller dataset for pose estimation tasks, containing a subset of 8 COCO images with human pose annotations.
- [Tiger-pose](pose/tiger-pose.md): A compact dataset consisting of 263 images focused on tigers, annotated with 12 keypoints per tiger for pose estimation tasks.
- [Hand-Keypoints](pose/hand-keypoints.md): A concise dataset featuring over 26,000 images centered on human hands, annotated with 21 keypoints per hand, designed for pose estimation tasks.
- [Dog-pose](pose/dog-pose.md): A comprehensive dataset featuring approximately 8,500 images focused on dogs, annotated with 24 keypoints per dog, tailored for pose estimation tasks.
description: Discover the Dog-Pose dataset for pose detection. Featuring 6,773 training and 1,703 test images, it's a robust dataset for training YOLO11 models.
keywords: Dog-Pose, Ultralytics, pose detection dataset, YOLO11, machine learning, computer vision, training data
---
# Dog-Pose Dataset
## Introduction
The [Ultralytics](https://www.ultralytics.com/) Dog-pose dataset is a high-quality and extensive dataset specifically curated for dog keypoint estimation. With 6,773 training images and 1,703 test images, this dataset provides a solid foundation for training robust pose estimation models. Each annotated image includes 24 keypoints with 3 dimensions per keypoint (x, y, visibility), making it a valuable resource for advanced research and development in computer vision.
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com/) and [YOLO11](https://github.com/ultralytics/ultralytics).
## Dataset YAML
A YAML (YAML Ain't Markup Language) file is used to define the dataset configuration. It includes paths, keypoint details, and other relevant information. In the case of the Dog-pose dataset, the `dog-pose.yaml` file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dog-pose.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dog-pose.yaml).
!!! example "ultralytics/cfg/datasets/dog-pose.yaml"
```yaml
--8<--"ultralytics/cfg/datasets/dog-pose.yaml"
```
## Usage
To train a YOLO11n-pose model on the Dog-pose dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO("yolo11n-pose.pt") # load a pretrained model (recommended for training)
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the Dog-pose dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the Dog-pose dataset in your research or development work, please cite the following paper:
!!! quote ""
=== "BibTeX"
```bibtex
@inproceedings{khosla2011fgvc,
title={Novel dataset for Fine-Grained Image Categorization},
author={Aditya Khosla and Nityananda Jayadevaprakash and Bangpeng Yao and Li Fei-Fei},
booktitle={First Workshop on Fine-Grained Visual Categorization (FGVC), IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2011}
}
@inproceedings{deng2009imagenet,
title={ImageNet: A Large-Scale Hierarchical Image Database},
author={Jia Deng and Wei Dong and Richard Socher and Li-Jia Li and Kai Li and Li Fei-Fei},
booktitle={IEEE Computer Vision and Pattern Recognition (CVPR)},
year={2009}
}
```
We would like to acknowledge the Stanford team for creating and maintaining this valuable resource for the [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) community. For more information about the Dog-pose dataset and its creators, visit the [Stanford Dogs Dataset website](http://vision.stanford.edu/aditya86/ImageNetDogs/).
## FAQ
### What is the Dog-pose dataset, and how is it used with Ultralytics YOLO11?
The Dog-pose dataset features 6,773 training and 1,703 test images annotated with 24 keypoints per dog for pose estimation. Ideal for training and validating models with [Ultralytics YOLO11](https://docs.ultralytics.com/models/yolo11/), it supports applications like animal behavior analysis and veterinary studies.
### How do I train a YOLO11 model using the Dog-pose dataset in Ultralytics?
To train a YOLO11n-pose model on the Dog-pose dataset for 100 epochs with an image size of 640, follow these examples:
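For instance, a minimal Python sketch mirroring the Train Example above:

```python
from ultralytics import YOLO

# Load a pretrained YOLO11n-pose model and fine-tune it on the Dog-pose dataset
model = YOLO("yolo11n-pose.pt")
results = model.train(data="dog-pose.yaml", epochs=100, imgsz=640)
```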
For a comprehensive list of training arguments, refer to the model [Training](../../modes/train.md) page.
### What are the benefits of using the Dog-pose dataset?
The Dog-pose dataset offers several benefits:
- **Large and Diverse Dataset**: With 6,773 training and 1,703 test images, it provides a substantial amount of data covering a wide range of dog poses, breeds, and contexts, enabling robust model training and evaluation.
- **Pose-specific Annotations**: Offers detailed annotations for pose estimation, ensuring high-quality data for training pose detection models.
- **Real-World Scenarios**: Includes images from varied environments, enhancing the model's ability to generalize to real-world applications.
- **Model Performance Improvement**: The diversity and scale of the dataset help improve model accuracy and robustness, particularly for tasks involving fine-grained pose estimation.
For more about its features and usage, see the [Dataset Introduction](#introduction) section.
### How does mosaicing benefit the YOLO11 training process using the Dog-pose dataset?
Mosaicing, as illustrated in the sample images from the Dog-pose dataset, merges multiple images into a single composite, enriching the diversity of objects and scenes in each training batch. This approach enhances the model's capacity to generalize across different object sizes, aspect ratios, and contexts, leading to improved performance. For example images, refer to the [Sample Images and Annotations](#sample-images-and-annotations) section.
### Where can I find the Dog-pose dataset YAML file and how do I use it?
The Dog-pose dataset YAML file can be found [here](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dog-pose.yaml). This file defines the dataset configuration, including paths, classes, and other relevant information. Use this file with the YOLO11 training scripts as mentioned in the [Train Example](#how-do-i-train-a-yolo11-model-using-the-dog-pose-dataset-in-ultralytics) section.
For more FAQs and detailed documentation, visit the [Ultralytics Documentation](https://docs.ultralytics.com/).
The hand-keypoints dataset contains 26,768 images of hands annotated with keypoints, making it suitable for training models like Ultralytics YOLO for pose estimation tasks. The annotations were generated using the Google MediaPipe library, ensuring high [accuracy](https://www.ultralytics.com/glossary/accuracy) and consistency, and the dataset is compatible with [Ultralytics YOLO11](https://github.com/ultralytics/ultralytics) formats.
- **Usage**: Great for human hand pose estimation.
- [Read more about Hand Keypoints](hand-keypoints.md)
### Dog-Pose
- **Description**: The Dog Pose dataset contains approximately 8,500 images, providing a diverse and extensive resource for training and validation of dog pose estimation models.
- **Label Format**: Follows the Ultralytics YOLO format, with annotations for multiple keypoints specific to dog anatomy.
- **Number of Classes**: 1 (Dog).
- **Keypoints**: Includes 24 keypoints tailored to dog poses, such as limbs, joints, and head positions.
- **Usage**: Ideal for training models to estimate dog poses in various scenarios, from research to real-world applications.
- [Read more about Dog-Pose](dog-pose.md)
### Adding your own dataset
If you have your own dataset and would like to use it for training pose estimation models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
| ![People Counting in Different Region using Ultralytics YOLOv8](https://github.com/ultralytics/docs/releases/download/0/people-counting-different-region-ultralytics-yolov8.avif) | ![Crowd Counting in Different Region using Ultralytics YOLOv8](https://github.com/ultralytics/docs/releases/download/0/crowd-counting-different-region-ultralytics-yolov8.avif) |
| :----------------------------------------------------------: | :----------------------------------------------------------: |
| People Counting in Different Region using Ultralytics YOLOv8 | Crowd Counting in Different Region using Ultralytics YOLOv8 |
## Steps to Run
### Step 1: Install Required Libraries
Begin by cloning the Ultralytics repository, installing dependencies, and navigating to the local directory using the provided commands in Step 2.
The Ultralytics region counting module is available in our [examples section](https://github.com/ultralytics/ultralytics/blob/main/examples/YOLOv8-Region-Counter/yolov8_region_counter.py). You can explore this example for code customization and modify it to suit your specific use case.
For more options, visit the [Run Region Counting](https://github.com/ultralytics/ultralytics/blob/main/examples/YOLOv8-Region-Counter/readme.md) section.
### Why should I use Ultralytics YOLOv8 for object counting in regions?
### Can the defined regions be adjusted during video playback?
Yes, with Ultralytics YOLOv8, regions can be interactively moved during video playback. Simply click and drag with the left mouse button to reposition the region. This feature enhances flexibility for dynamic environments. Learn more in the tip section for [movable regions](https://github.com/ultralytics/ultralytics/blob/33cdaa5782efb2bc2b5ede945771ba647882830d/examples/YOLOv8-Region-Counter/yolov8_region_counter.py#L39).
### What are some real-world applications of object counting in regions?
!!! example "Streamlit Application"
=== "CLI"
```bash
yolo streamlit-predict
```
=== "Python"
```python
from ultralytics import solutions

solutions.inference()

### Make sure to run the file using command `streamlit run <file-name.py>`
```
=== "CLI"
```bash
yolo streamlit-predict
```
This will launch the Streamlit application in your default web browser. You will see the main title, subtitle, and the sidebar with configuration options. Select your desired YOLO11 model, set the confidence and NMS thresholds, and click the "Start" button to begin the real-time object detection.
You can optionally supply a specific model in Python:
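The snippet below is a minimal sketch of this option; the model path is illustrative and any local `.pt` file can be supplied.

```python
from ultralytics import solutions

# Launch the Streamlit app with an explicitly chosen model (path is illustrative)
solutions.inference(model="yolo11n.pt")
```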
- **Usage Metrics**: These include how often and in what ways the YOLO Python package is used, preferred features, and typical command-line arguments.
- **System Information**: General non-identifiable information about the computing environments where the package is run.
- **Performance Data**: Metrics related to the performance of models during training, validation, and inference.
This data helps us enhance user experience and optimize software performance. Learn more in the [Anonymized Google Analytics](#anonymized-google-analytics) section.
### How can I disable data collection in the Ultralytics YOLO package?
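Data collection is controlled by the `sync` setting. A minimal Python sketch of disabling it (assuming the `sync` key, which governs analytics and crash reporting):

```python
from ultralytics import settings

# Turn off anonymized analytics and crash reporting
settings.update({"sync": False})
```

The equivalent CLI command is `yolo settings sync=False`.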
Our security strategy includes GitHub's [CodeQL](https://docs.github.com/en/code-security/code-scanning/introduction-to-code-scanning/about-code-scanning-with-codeql) scanning. CodeQL delves deep into our codebase, identifying complex vulnerabilities like SQL injection and XSS by analyzing the code's semantic structure. This advanced level of analysis ensures early detection and resolution of potential security risks.
description: Explore Ultralytics HUB for easy training, analysis, preview, deployment and sharing of custom vision AI models using YOLO11. Start training today!
keywords: Ultralytics HUB, YOLO11, custom AI models, model training, model deployment, model analysis, vision AI
---
# Ultralytics HUB Models
!!! info
You can read more about the available [YOLO models](https://docs.ultralytics.com/models/) and architectures in our documentation.
By default, your model will use a pre-trained model (trained on the [COCO](https://docs.ultralytics.com/datasets/detect/coco/) dataset) to reduce training time. You can change this behavior and tweak your model's configuration by opening the **Advanced Model Configuration** accordion.
In this guide, we explored the key aspects of Albumentations, a great Python library for image augmentation. We discussed its wide range of transformations, optimized performance, and how you can use it in your next YOLO11 project.
Also, if you'd like to know more about other Ultralytics YOLO11 integrations, visit our [integration guide page](../integrations/index.md). You'll find valuable resources and insights there.
## FAQ
### How can I integrate Albumentations with YOLO11 for improved data augmentation?
Albumentations integrates seamlessly with YOLO11 and applies automatically during training if you have the package installed. Here's how to get started:
```python
# Install required packages
# !pip install albumentations ultralytics
from ultralytics import YOLO
# Load and train model with automatic augmentations
model = YOLO("yolo11n.pt")
model.train(data="coco8.yaml", epochs=100)
```
The integration includes optimized augmentations like blur, median blur, grayscale conversion, and CLAHE with carefully tuned probabilities to enhance model performance.
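For a rough idea of what those transforms look like on their own, here is a standalone Albumentations pipeline using the same transform types; the probabilities are illustrative rather than the exact values used inside the YOLO11 training loop.

```python
import albumentations as A
import numpy as np

# Pipeline mirroring the transform types named above (probabilities illustrative)
transform = A.Compose(
    [
        A.Blur(p=0.01),
        A.MedianBlur(p=0.01),
        A.ToGray(p=0.01),
        A.CLAHE(p=0.01),
    ]
)

image = np.random.randint(0, 255, (640, 640, 3), dtype=np.uint8)  # dummy image
augmented = transform(image=image)["image"]
print(augmented.shape)
```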
### What are the key benefits of using Albumentations over other augmentation libraries?
Albumentations stands out for several reasons:
1. Performance: Built on OpenCV and NumPy with SIMD optimization for superior speed
2. Flexibility: Supports 70+ transformations across pixel-level, spatial-level, and mixing-level augmentations
3. Compatibility: Works seamlessly with popular frameworks like [PyTorch](../integrations/torchscript.md) and [TensorFlow](../integrations/tensorboard.md)
4. Reliability: Extensive test suite prevents silent data corruption
5. Ease of use: Single unified API for all augmentation types
### What types of computer vision tasks can benefit from Albumentations augmentation?
Albumentations enhances various [computer vision tasks](../tasks/index.md) including:
- [Object Detection](../tasks/detect.md): Improves model robustness to lighting, scale, and orientation variations
- [Instance Segmentation](../tasks/segment.md): Enhances mask prediction accuracy through diverse transformations
- [Classification](../tasks/classify.md): Increases model generalization with color and geometric augmentations
- [Pose Estimation](../tasks/pose.md): Helps models adapt to different viewpoints and lighting conditions
The library's diverse augmentation options make it valuable for any vision task requiring robust model performance.
- [Albumentations](albumentations.md): Enhance your Ultralytics models with powerful image augmentations to improve model robustness and generalization.
- [SONY IMX500](sony-imx500.md): Optimize and deploy [Ultralytics YOLOv8](https://docs.ultralytics.com/models/yolov8/) models on Raspberry Pi AI Cameras with the IMX500 sensor for fast, low-power performance.
## Deployment Integrations
- [CoreML](coreml.md): CoreML, developed by [Apple](https://www.apple.com/), is a framework designed for efficiently integrating machine learning models into applications across iOS, macOS, watchOS, and tvOS, using Apple's hardware for effective and secure [model deployment](https://www.ultralytics.com/glossary/model-deployment).
- **Free Access to TPUs**: Speed up training with powerful TPUs without extra costs.
- **Comprehensive History**: Track changes over time with a detailed history of notebook commits.
- **Resource Availability**: Significant resources are provided for each notebook session, including 12 hours of execution time for CPU and GPU sessions.
For a comparison with Google Colab, refer to our [Google Colab guide](./google-colab.md).
### How can I revert to a previous version of my Kaggle notebook?
description: Learn to export Ultralytics YOLOv8 models to Sony's IMX500 format to optimize your models for efficient deployment.
keywords: Sony, IMX500, IMX 500, Atrios, MCT, model export, quantization, pruning, deep learning optimization, Raspberry Pi AI Camera, edge AI, PyTorch, IMX
---
# Sony IMX500 Export for Ultralytics YOLOv8
This guide covers exporting and deploying Ultralytics YOLOv8 models to Raspberry Pi AI Cameras that feature the Sony IMX500 sensor.
Deploying computer vision models on devices with limited computational power, such as [Raspberry Pi AI Camera](https://www.raspberrypi.com/products/ai-camera/), can be tricky. Using a model format optimized for faster performance makes a huge difference.
The IMX500 model format is designed to use minimal power while delivering fast performance for neural networks. It allows you to optimize your [Ultralytics YOLOv8](https://github.com/ultralytics/ultralytics) models for high-speed and low-power inferencing. In this guide, we'll walk you through exporting and deploying your models to the IMX500 format while making it easier for your models to perform well on the [Raspberry Pi AI Camera](https://www.raspberrypi.com/products/ai-camera/).
<palign="center">
<imgwidth="100%"src="https://github.com/ultralytics/assets/releases/download/v8.3.0/ai-camera.avif"alt="Raspberry Pi AI Camera">
</p>
## Why Should You Export to IMX500
Sony's [IMX500 Intelligent Vision Sensor](https://developer.aitrios.sony-semicon.com/en/raspberrypi-ai-camera) is a game-changing piece of hardware in edge AI processing. It's the world's first intelligent vision sensor with on-chip AI capabilities. This sensor helps overcome many challenges in edge AI, including data processing bottlenecks, privacy concerns, and performance limitations.
While other sensors merely pass along images and frames, the IMX500 tells a whole story. It processes data directly on the sensor, allowing devices to generate insights in real-time.
## Sony's IMX500 Export for YOLOv8 Models
The IMX500 is designed to transform how devices handle data directly on the sensor, without needing to send it off to the cloud for processing.
The IMX500 works with quantized models. Quantization makes models smaller and faster without losing much [accuracy](https://www.ultralytics.com/glossary/accuracy). It is ideal for the limited resources of edge computing, allowing applications to respond quickly by reducing latency and allowing for quick data processing locally, without cloud dependency. Local processing also keeps user data private and secure since it's not sent to a remote server.
**IMX500 Key Features:**
- **Metadata Output:** Instead of transmitting images only, the IMX500 can output both image and metadata (inference result), and can output metadata only for minimizing data size, reducing bandwidth, and lowering costs.
- **Addresses Privacy Concerns:** By processing data on the device, the IMX500 addresses privacy concerns, ideal for human-centric applications like person counting and occupancy tracking.
- **Real-time Processing:** Fast, on-sensor processing supports real-time decisions, perfect for edge AI applications such as autonomous systems.
**Before You Begin:** For best results, ensure your YOLOv8 model is well-prepared for export by following our [Model Training Guide](https://docs.ultralytics.com/modes/train/), [Data Preparation Guide](https://docs.ultralytics.com/datasets/), and [Hyperparameter Tuning Guide](https://docs.ultralytics.com/guides/hyperparameter-tuning/).
## Usage Examples
Export an Ultralytics YOLOv8 model to IMX500 format and run inference with the exported model.
!!! note
Here we perform inference just to make sure the model works as expected. However, for deployment and inference on the Raspberry Pi AI Camera, please jump to the [Using IMX500 Export in Deployment](#using-imx500-export-in-deployment) section.
!!! example
=== "Python"
```python
from ultralytics import YOLO
# Load a YOLOv8n PyTorch model
model = YOLO("yolov8n.pt")
# Export the model
model.export(format="imx") # exports with PTQ quantization by default
The export process will create an ONNX model for quantization validation, along with a directory named `<model-name>_imx_model`. This directory will include the `packerOut.zip` file, which is essential for packaging the model for deployment on the IMX500 hardware. Additionally, the `<model-name>_imx_model` folder will contain a text file (`labels.txt`) listing all the labels associated with the model.
```bash
yolov8n_imx_model
├── dnnParams.xml
├── labels.txt
├── packerOut.zip
├── yolov8n_imx.onnx
├── yolov8n_imx500_model_MemoryReport.json
└── yolov8n_imx500_model.pbtxt
```
## Arguments
When exporting a model to IMX500 format, you can specify various arguments:
| Argument | Default | Description |
| -------- | ------- | ----------- |
| `int8` | `True` | Enable INT8 quantization for the model (default: `True`) |
| `imgsz` | `640` | Image size for the model input (default: `640`) |
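These arguments can be passed straight to `export()`; a minimal sketch using the values from the table above:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# Export to IMX500 format with explicit INT8 quantization and input size
model.export(format="imx", int8=True, imgsz=640)
```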
## Using IMX500 Export in Deployment
After exporting an Ultralytics YOLOv8n model to IMX500 format, it can be deployed to the Raspberry Pi AI Camera for inference.
### Hardware Prerequisites
Make sure you have the following hardware:
1. Raspberry Pi 5 or Raspberry Pi 4 Model B
2. Raspberry Pi AI Camera
Connect the Raspberry Pi AI Camera to the 15-pin MIPI CSI connector on the Raspberry Pi and power on the Raspberry Pi.
### Software Prerequisites
!!! note
This guide has been tested with Raspberry Pi OS Bookworm running on a Raspberry Pi 5
Step 1: Open a terminal window and execute the following commands to update the Raspberry Pi software to the latest version.
```bash
sudo apt update && sudo apt full-upgrade
```
Step 2: Install the IMX500 firmware, which is required to operate the IMX500 sensor, along with a packager tool.
```bash
sudo apt install imx500-all imx500-tools
```
Step 3: Install the prerequisites to run the `picamera2` application. We will use this application later for the deployment process.
```bash
sudo apt install python3-opencv python3-munkres
```
Step 4: Reboot the Raspberry Pi for the changes to take effect.
```bash
sudo reboot
```
### Package Model and Deploy to AI Camera
After obtaining `packerOut.zip` from the IMX500 conversion process, you can pass this file into the packager tool to obtain an RPK file. This file can then be deployed directly to the AI Camera using `picamera2`.
Then you will be able to see live inference output as follows:
<palign="center">
<imgwidth="100%"src="https://github.com/ultralytics/assets/releases/download/v8.3.0/imx500-inference-rpi.avif"alt="Inference on Raspberry Pi AI Camera">
</p>
## Benchmarks
The YOLOv8 benchmarks below were run by the Ultralytics team on the Raspberry Pi AI Camera with the `imx` model format, measuring speed and accuracy.
| Model | Format | Status | Size (MB) | mAP50-95(B) | Inference time (ms/im) |
[Sony's Model Compression Toolkit (MCT)](https://github.com/sony/model_optimization) is a powerful tool for optimizing deep learning models through quantization and pruning. It supports various quantization methods and provides advanced algorithms to reduce model size and computational complexity without significantly sacrificing accuracy. MCT is particularly useful for deploying models on resource-constrained devices, ensuring efficient inference and reduced latency.
### Supported Features of MCT
Sony's MCT offers a range of features designed to optimize neural network models:
1. **Graph Optimizations**: Transforms models into more efficient versions by folding layers like batch normalization into preceding layers.
2. **Quantization Parameter Search**: Minimizes quantization noise using metrics like Mean-Square-Error, No-Clipping, and Mean-Average-Error.
MCT also supports various quantization schemes for weights and activations:
1. Power-of-Two (hardware-friendly)
2. Symmetric
3. Uniform
#### Structured Pruning
MCT introduces structured, hardware-aware model pruning designed for specific hardware architectures. This technique leverages the target platform's Single Instruction, Multiple Data (SIMD) capabilities by pruning SIMD groups. This reduces model size and complexity while optimizing channel utilization to align with the SIMD architecture, cutting the memory footprint of the model weights on the target hardware. Available via Keras and PyTorch APIs.
### IMX500 Converter Tool (Compiler)
The IMX500 Converter Tool is integral to the IMX500 toolset, allowing the compilation of models for deployment on Sony's IMX500 sensor (for instance, Raspberry Pi AI Cameras). This tool facilitates the transition of Ultralytics YOLOv8 models processed through Ultralytics software, ensuring they are compatible and perform efficiently on the specified hardware. The export procedure following model quantization involves the generation of binary files that encapsulate essential data and device-specific configurations, streamlining the deployment process on the Raspberry Pi AI Camera.
## Real-World Use Cases
Export to IMX500 format has wide applicability across industries. Here are some examples:
- **Edge AI and IoT**: Enable object detection on drones or security cameras, where real-time processing on low-power devices is essential.
- **Wearable Devices**: Deploy models optimized for small-scale AI processing on health-monitoring wearables.
- **Smart Cities**: Use IMX500-exported YOLOv8 models for traffic monitoring and safety analysis with faster processing and minimal latency.
- **Retail Analytics**: Enhance in-store monitoring by deploying optimized models in point-of-sale systems or smart shelves.
## Conclusion
Exporting Ultralytics YOLOv8 models to Sony's IMX500 format allows you to deploy your models for efficient inference on IMX500-based cameras. By leveraging advanced quantization techniques, you can reduce model size and improve inference speed without significantly compromising accuracy.
For more information and detailed guidelines, refer to Sony's [IMX500 website](https://developer.aitrios.sony-semicon.com/en/raspberrypi-ai-camera).
## FAQ
### How do I export a YOLOv8 model to IMX500 format for Raspberry Pi AI Camera?
To export a YOLOv8 model to IMX500 format, use either the Python API or CLI command:
```python
from ultralytics import YOLO
model = YOLO("yolov8n.pt")
model.export(format="imx") # Exports with PTQ quantization by default
```
The export process will create a directory containing the necessary files for deployment, including `packerOut.zip` which can be used with the IMX500 packager tool on Raspberry Pi.
### What are the key benefits of using the IMX500 format for edge AI deployment?
The IMX500 format offers several important advantages for edge deployment:
- On-chip AI processing reduces latency and power consumption
- Outputs both image and metadata (inference result) instead of images only
- Enhanced privacy by processing data locally without cloud dependency
- Real-time processing capabilities ideal for time-sensitive applications
- Optimized quantization for efficient model deployment on resource-constrained devices
### What hardware and software prerequisites are needed for IMX500 deployment?
For deploying IMX500 models, you'll need:
Hardware:
- Raspberry Pi 5 or Raspberry Pi 4 Model B
- Raspberry Pi AI Camera with IMX500 sensor
Software:
- Raspberry Pi OS Bookworm
- IMX500 firmware and tools (`sudo apt install imx500-all imx500-tools`)
- Python packages for `picamera2` (`sudo apt install python3-opencv python3-munkres`)
### What performance can I expect from YOLOv8 models on the IMX500?
Based on Ultralytics benchmarks on Raspberry Pi AI Camera:
- YOLOv8n achieves 66.66ms inference time per image
- mAP50-95 of 0.522 on COCO8 dataset
- Model size of only 2.9MB after quantization
This demonstrates that IMX500 format provides efficient real-time inference while maintaining good accuracy for edge AI applications.
### How do I package and deploy my exported model to the Raspberry Pi AI Camera?
- Adjust the `workspace` value according to your calibration needs and resource availability. While a larger `workspace` may increase calibration time, it allows TensorRT to explore a wider range of optimization tactics, potentially enhancing model performance and [accuracy](https://www.ultralytics.com/glossary/accuracy). Conversely, a smaller `workspace` can reduce calibration time but may limit the optimization strategies, affecting the quality of the quantized model.
- Default is `workspace=None`, which will allow for TensorRT to automatically allocate memory, when configuring manually, this value may need to be increased if calibration crashes (exits without warning).
- TensorRT will report `UNSUPPORTED_STATE` during export if the value for `workspace` is larger than the memory available to the device, which means the value for `workspace` should be lowered or set to `None`.
- If `workspace` is set to max value and calibration fails/crashes, consider using `None` for auto-allocation or by reducing the values for `imgsz` and `batch` to reduce memory requirements.
- <u><b>Remember</b> calibration for INT8 is specific to each device</u>, borrowing a "high-end" GPU for calibration, might result in poor performance when inference is run on another device.
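Putting these notes together, a minimal TensorRT INT8 export sketch that leaves `workspace` unset for auto-allocation; the calibration dataset and batch size are illustrative:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# INT8 TensorRT export; workspace=None lets TensorRT allocate memory automatically
model.export(
    format="engine",
    int8=True,
    data="coco8.yaml",  # calibration dataset (illustrative)
    batch=8,
    workspace=None,
)
```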

| Argument | Type | Default | Description |
| -------- | ---- | ------- | ----------- |
| `format` | `str` | `'torchscript'` | Target format for the exported model, such as `'onnx'`, `'torchscript'`, `'tensorflow'`, or others, defining compatibility with various deployment environments. |
| `imgsz` | `int` or `tuple` | `640` | Desired image size for the model input. Can be an integer for square images or a tuple `(height, width)` for specific dimensions. |
| `keras` | `bool` | `False` | Enables export to Keras format for [TensorFlow](https://www.ultralytics.com/glossary/tensorflow) SavedModel, providing compatibility with TensorFlow serving and APIs. |
| `dynamic` | `bool` | `False` | Allows dynamic input sizes for ONNX, TensorRT and OpenVINO exports, enhancing flexibility in handling varying image dimensions. |
| `simplify` | `bool` | `True` | Simplifies the model graph for ONNX exports with `onnxslim`, potentially improving performance and compatibility. |
| `opset` | `int` | `None` | Specifies the ONNX opset version for compatibility with different ONNX parsers and runtimes. If not set, uses the latest supported version. |
| `workspace` | `float` or `None` | `None` | Sets the maximum workspace size in GiB for TensorRT optimizations, balancing memory usage and performance; use `None` for auto-allocation by TensorRT up to device maximum. |
| `nms` | `bool` | `False` | Adds Non-Maximum Suppression (NMS) to the CoreML export, essential for accurate and efficient detection post-processing. |
| `batch` | `int` | `1` | Specifies export model batch inference size or the max number of images the exported model will process concurrently in `predict` mode. |
| `device` | `str` | `None` | Specifies the device for exporting: GPU (`device=0`), CPU (`device=cpu`), MPS for Apple silicon (`device=mps`) or DLA for NVIDIA Jetson (`device=dla:0` or `device=dla:1`). |
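As a quick illustration of how these export arguments combine, a short ONNX export sketch (values illustrative):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# ONNX export with dynamic input sizes and graph simplification
model.export(format="onnx", imgsz=640, dynamic=True, simplify=True)
```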

| Argument | Type | Default | Description |
| -------- | ---- | ------- | ----------- |
| `augment` | `bool` | `False` | Enables test-time augmentation (TTA) for predictions, potentially improving detection robustness at the cost of inference speed. |
| `agnostic_nms` | `bool` | `False` | Enables class-agnostic Non-Maximum Suppression (NMS), which merges overlapping boxes of different classes. Useful in multi-class detection scenarios where class overlap is common. |
| `classes` | `list[int]` | `None` | Filters predictions to a set of class IDs. Only detections belonging to the specified classes will be returned. Useful for focusing on relevant objects in multi-class detection tasks. |
| `retina_masks` | `bool` | `False` | Returns high-resolution segmentation masks. The returned masks (`masks.data`) will match the original image size if enabled. If disabled, they have the image size used during inference. |
| `embed` | `list[int]` | `None` | Specifies the layers from which to extract feature vectors or [embeddings](https://www.ultralytics.com/glossary/embeddings). Useful for downstream tasks like clustering or similarity search. |
| `project` | `str` | `None` | Name of the project directory where prediction outputs are saved if `save` is enabled. |
| `name` | `str` | `None` | Name of the prediction run. Used for creating a subdirectory within the project folder, where prediction outputs are stored if `save` is enabled. |

| Argument | Default | Description |
| -------- | ------- | ----------- |
| `exist_ok` | `False` | If True, allows overwriting of an existing project/name directory. Useful for iterative experimentation without needing to manually clear previous outputs. |
| `pretrained` | `True` | Determines whether to start training from a pretrained model. Can be a boolean value or a string path to a specific model from which to load weights. Enhances training efficiency and model performance. |
| `optimizer` | `'auto'` | Choice of optimizer for training. Options include `SGD`, `Adam`, `AdamW`, `NAdam`, `RAdam`, `RMSProp` etc., or `auto` for automatic selection based on model configuration. Affects convergence speed and stability. |
| `verbose` | `False` | Enables verbose output during training, providing detailed logs and progress updates. Useful for debugging and closely monitoring the training process. |
| `seed` | `0` | Sets the random seed for training, ensuring reproducibility of results across runs with the same configurations. |
| `deterministic` | `True` | Forces deterministic algorithm use, ensuring reproducibility but may affect performance and speed due to the restriction on non-deterministic algorithms. |
| `single_cls` | `False` | Treats all classes in multi-class datasets as a single class during training. Useful for binary classification tasks or when focusing on object presence rather than classification. |
| `dfl` | `1.5` | Weight of the distribution focal loss, used in certain YOLO versions for fine-grained classification. |
| `pose` | `12.0` | Weight of the pose loss in models trained for pose estimation, influencing the emphasis on accurately predicting pose keypoints. |
| `kobj` | `2.0` | Weight of the keypoint objectness loss in pose estimation models, balancing detection confidence with pose accuracy. |
| `nbs` | `64` | Nominal batch size for normalization of loss. |
| `overlap_mask` | `True` | Determines whether object masks should be merged into a single mask for training, or kept separate for each object. In case of overlap, the smaller mask is overlaid on top of the larger mask during merge. |
| `mask_ratio` | `4` | Downsample ratio for segmentation masks, affecting the resolution of masks used during training. |
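A short training sketch touching several of the arguments above; the dataset and epoch count are illustrative:

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Train with an explicit optimizer, fixed seed, and deterministic algorithms
model.train(
    data="coco8.yaml",  # illustrative dataset
    epochs=10,
    optimizer="AdamW",
    seed=0,
    deterministic=True,
    single_cls=False,
)
```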

| Argument | Type | Default | Description |
| -------- | ---- | ------- | ----------- |
| `device` | `str` | `None` | Specifies the device for validation (`cpu`, `cuda:0`, etc.). Allows flexibility in utilizing CPU or GPU resources. |
| `dnn` | `bool` | `False` | If `True`, uses the [OpenCV](https://www.ultralytics.com/glossary/opencv) DNN module for ONNX model inference, offering an alternative to [PyTorch](https://www.ultralytics.com/glossary/pytorch) inference methods. |
| `plots` | `bool` | `False` | When set to `True`, generates and saves plots of predictions versus ground truth for visual evaluation of the model's performance. |
| `rect` | `bool` | `True` | If `True`, uses rectangular inference for batching, reducing padding and potentially increasing speed and efficiency. |
| `split` | `str` | `val` | Determines the dataset split to use for validation (`val`, `test`, or `train`). Allows flexibility in choosing the data segment for performance evaluation. |
| `project` | `str` | `None` | Name of the project directory where validation outputs are saved. |
| `name` | `str` | `None` | Name of the validation run. Used for creating a subdirectory within the project folder, where validation logs and outputs are stored. |
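A brief validation sketch using a few of the arguments above (model and dataset are illustrative):

```python
from ultralytics import YOLO

model = YOLO("yolo11n.pt")

# Validate on the val split with plots enabled
metrics = model.val(data="coco8.yaml", split="val", plots=True, device="cpu")
print(metrics.box.map)  # mAP50-95
```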
- **Quantization-Friendly Basic Block:** Enhanced architecture that improves model performance with minimal [precision](https://www.ultralytics.com/glossary/precision) drop post quantization.
- **Sophisticated Training and Quantization:** Employs advanced training schemes and post-training quantization techniques.
- **AutoNAC Optimization and Pre-training:** Utilizes AutoNAC optimization and is pre-trained on prominent datasets like COCO, Objects365, and Roboflow 100.
These features contribute to its high accuracy, efficient performance, and suitability for deployment in production environments. Learn more in the [Key Features](#key-features) section.
### Which tasks and modes are supported by YOLO-NAS models?
!!! tip "Ultralytics YOLO11 Publication"
Ultralytics has not published a formal research paper for YOLO11 due to the rapidly evolving nature of the models. We focus on advancing the technology and making it easier to use, rather than producing static documentation. For the most up-to-date information on YOLO architecture, features, and usage, please refer to our [GitHub repository](https://github.com/ultralytics/ultralytics) and [documentation](https://docs.ultralytics.com/).
If you use YOLO11 or any other software from this repository in your work, please cite it using the following format:
!!! tip "Ultralytics YOLOv5 Publication"
Ultralytics has not published a formal research paper for YOLOv5 due to the rapidly evolving nature of the models. We focus on advancing the technology and making it easier to use, rather than producing static documentation. For the most up-to-date information on YOLO architecture, features, and usage, please refer to our [GitHub repository](https://github.com/ultralytics/ultralytics) and [documentation](https://docs.ultralytics.com/).
If you use YOLOv5 or YOLOv5u in your research, please cite the Ultralytics YOLOv5 repository as follows:
- **Dynamic Label Assignment**: Uses a coarse-to-fine lead guided method to assign dynamic targets for outputs across different branches, improving accuracy.
- **Extended and Compound Scaling**: Efficiently utilizes parameters and computation to scale the model for various real-time applications.
- **Efficiency**: Reduces parameter count by 40% and computation by 50% compared to other state-of-the-art models while achieving faster inference speeds.
For further details on these features, see the [YOLOv7 Overview](#overview) section.
!!! tip "Ultralytics YOLOv8 Publication"
Ultralytics has not published a formal research paper for YOLOv8 due to the rapidly evolving nature of the models. We focus on advancing the technology and making it easier to use, rather than producing static documentation. For the most up-to-date information on YOLO architecture, features, and usage, please refer to our [GitHub repository](https://github.com/ultralytics/ultralytics) and [documentation](https://docs.ultralytics.com/).
If you use the YOLOv8 model or any other software from this repository in your work, please cite it using the following format:
<imgwidth="1024"src="https://github.com/ultralytics/docs/releases/download/0/ultralytics-yolov8-ecosystem-integrations.avif"alt="Ultralytics YOLO ecosystem and integrations">

| Key | Default Value | Description |
| --- | ------------- | ----------- |
| `imgsz` | `640` | The input image size for the model. Can be a single integer for square images or a tuple `(width, height)` for non-square, e.g., `(640, 480)`. |
| `half` | `False` | Enables FP16 (half-precision) inference, reducing memory usage and possibly increasing speed on compatible hardware. Use `half=True` to enable. |
| `int8` | `False` | Activates INT8 quantization for further optimized performance on supported devices, especially useful for edge devices. Set `int8=True` to use. |
| `device` | `None` | Defines the computation device(s) for benchmarking, such as `"cpu"` or `"cuda:0"`. |
| `verbose` | `False` | Controls the level of detail in logging output. A boolean value; set `verbose=True` for detailed logs or a float for thresholding errors. |
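A minimal benchmarking sketch using these arguments; the model and dataset shown are illustrative:

```python
from ultralytics.utils.benchmarks import benchmark

# Benchmark a model across export formats on CPU
benchmark(model="yolo11n.pt", data="coco8.yaml", imgsz=640, half=False, device="cpu")
```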
## Export Formats
- **ONNX:** Provides up to 3x CPU speedup.
- **TensorRT:** Offers up to 5x GPU speedup.
- **OpenVINO:** Specifically optimized for Intel hardware.
These formats enhance both the speed and accuracy of your models, making them more efficient for various real-world applications. Visit the [Export](../modes/export.md) page for complete details.
### Why is benchmarking crucial in evaluating YOLO11 models?
- **Resource Allocation:** Gauge the performance across different hardware options.
- **Optimization:** Determine which export format offers the best performance for specific use cases.
- **Cost Efficiency:** Optimize hardware usage based on benchmark results.
Key metrics such as mAP50-95, Top-5 accuracy, and inference time help in making these evaluations. Refer to the [Key Metrics](#key-metrics-in-benchmark-mode) section for more information.
### Which export formats are supported by YOLO11, and what are their advantages?
- **TensorRT:** Ideal for GPU efficiency.
- **OpenVINO:** Optimized for Intel hardware.
- **CoreML & [TensorFlow](https://www.ultralytics.com/glossary/tensorflow):** Useful for iOS and general ML applications.
For a complete list of supported formats and their respective advantages, check out the [Supported Export Formats](#supported-export-formats) section.
### What arguments can I use to fine-tune my YOLO11 benchmarks?
- **int8:** Activate INT8 quantization for edge devices.
- **device:** Specify the computation device (e.g., "cpu", "cuda:0").
- **verbose:** Control the level of logging detail.
For a full list of arguments, refer to the [Arguments](#arguments) section.
Install the `ultralytics` package using pip, or update an existing installation by running `pip install -U ultralytics`. Visit the Python Package Index (PyPI) for more details on the `ultralytics` package: [https://pypi.org/project/ultralytics/](https://pypi.org/project/ultralytics/).
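Once installed, you can verify the environment from Python; this mirrors the `yolo checks` CLI command:

```python
import ultralytics

# Print software and hardware checks, similar to `yolo checks` on the CLI
ultralytics.checks()
```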
# Reference for `ultralytics/solutions/region_counter.py`
!!! note
This file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/solutions/region_counter.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/solutions/region_counter.py). If you spot a problem please help fix it by [contributing](https://docs.ultralytics.com/help/contributing/) a [Pull Request](https://github.com/ultralytics/ultralytics/edit/main/ultralytics/solutions/region_counter.py) 🛠️. Thank you 🙏!
{% include "macros/yolo-seg-perf.md" %}
- **mAP<sup>val</sup>** values are for single-model single-scale on [COCO val2017](https://cocodataset.org/) dataset. <br>Reproduce by `yolo val segment data=coco.yaml device=0`
- **Speed** averaged over COCO val images using an [Amazon EC2 P4d](https://aws.amazon.com/ec2/instance-types/p4/) instance. <br>Reproduce by `yolo val segment data=coco.yaml batch=1 device=0|cpu`
"Pip install `ultralytics` and [dependencies](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) and check software and hardware.\n",
"Pip install `ultralytics` and [dependencies](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) and check software and hardware.\n",
"Pip install `ultralytics` and [dependencies](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) and check software and hardware.\n",
"Pip install `ultralytics` and [dependencies](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) and check software and hardware.\n",
"Pip install `ultralytics` and [dependencies](https://github.com/ultralytics/ultralytics/blob/main/pyproject.toml) and check software and hardware.\n",