commit 653508d64f
81 changed files with 1333 additions and 727 deletions

@@ -0,0 +1,141 @@
---
comments: true
description: Discover the Dog-Pose dataset for pose detection. Featuring 6,773 training and 1,703 test images, it's a robust dataset for training YOLO11 models.
keywords: Dog-Pose, Ultralytics, pose detection dataset, YOLO11, machine learning, computer vision, training data
---

# Dog-Pose Dataset

## Introduction

The [Ultralytics](https://www.ultralytics.com/) Dog-pose dataset is a high-quality and extensive dataset specifically curated for dog keypoint estimation. With 6,773 training images and 1,703 test images, this dataset provides a solid foundation for training robust pose estimation models. Each annotated image includes 24 keypoints with 3 dimensions per keypoint (x, y, visibility), making it a valuable resource for advanced research and development in computer vision.

<img src="https://github.com/ultralytics/docs/releases/download/0/ultralytics-dogs.avif" alt="Ultralytics Dog-pose display image" width="800">

This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com/) and [YOLO11](https://github.com/ultralytics/ultralytics).

## Dataset YAML

A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It includes paths, keypoint details, and other relevant information. In the case of the Dog-pose dataset, the `dog-pose.yaml` file is available at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dog-pose.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dog-pose.yaml).

!!! example "ultralytics/cfg/datasets/dog-pose.yaml"

    ```yaml
    --8<-- "ultralytics/cfg/datasets/dog-pose.yaml"
    ```
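
To double-check the keypoint configuration before training, you can read this YAML directly with PyYAML. This is a minimal sketch; the local path assumes you are working from a clone of the `ultralytics` repository:

```python
import yaml  # PyYAML

# Path assumes a local clone of the ultralytics repository
with open("ultralytics/cfg/datasets/dog-pose.yaml") as f:
    cfg = yaml.safe_load(f)

print(cfg["kpt_shape"])  # [24, 3] -> 24 keypoints, each with (x, y, visibility)
print(cfg["names"])  # {0: 'dog'}
```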

## Usage

To train a YOLO11n-pose model on the Dog-pose dataset for 100 [epochs](https://www.ultralytics.com/glossary/epoch) with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo11n-pose.pt")  # load a pretrained model (recommended for training)

        # Train the model
        results = model.train(data="dog-pose.yaml", epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        # Start training from a pretrained *.pt model
        yolo pose train data=dog-pose.yaml model=yolo11n-pose.pt epochs=100 imgsz=640
        ```
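
Once training finishes, you can sanity-check the resulting weights on the validation split or on a single image. The snippet below is a minimal sketch; the weights path and image path are placeholders for your own run:

```python
from ultralytics import YOLO

# Placeholder path; point this at the weights produced by your training run
model = YOLO("runs/pose/train/weights/best.pt")

# Evaluate on the Dog-pose validation split
metrics = model.val(data="dog-pose.yaml", imgsz=640)
print(metrics.pose.map)  # pose mAP50-95

# Run inference on a single image (placeholder path)
results = model("path/to/dog.jpg")
results[0].show()
```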

## Sample Images and Annotations

Here are some examples of images from the Dog-pose dataset, along with their corresponding annotations:

<img src="https://github.com/ultralytics/docs/releases/download/0/mosaiced-training-batch-2-dog-pose.avif" alt="Dataset sample image" width="800">

- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.

The example showcases the variety and complexity of the images in the Dog-pose dataset and the benefits of using mosaicing during the training process.

## Citations and Acknowledgments

If you use the Dog-pose dataset in your research or development work, please cite the following paper:

!!! quote ""

    === "BibTeX"

        ```bibtex
        @inproceedings{khosla2011fgvc,
          title={Novel dataset for Fine-Grained Image Categorization},
          author={Aditya Khosla and Nityananda Jayadevaprakash and Bangpeng Yao and Li Fei-Fei},
          booktitle={First Workshop on Fine-Grained Visual Categorization (FGVC), IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
          year={2011}
        }
        @inproceedings{deng2009imagenet,
          title={ImageNet: A Large-Scale Hierarchical Image Database},
          author={Jia Deng and Wei Dong and Richard Socher and Li-Jia Li and Kai Li and Li Fei-Fei},
          booktitle={IEEE Computer Vision and Pattern Recognition (CVPR)},
          year={2009}
        }
        ```

We would like to acknowledge the Stanford team for creating and maintaining this valuable resource for the [computer vision](https://www.ultralytics.com/glossary/computer-vision-cv) community. For more information about the Dog-pose dataset and its creators, visit the [Stanford Dogs Dataset website](http://vision.stanford.edu/aditya86/ImageNetDogs/).

## FAQ

### What is the Dog-pose dataset, and how is it used with Ultralytics YOLO11?

The Dog-pose dataset features 6,773 training and 1,703 test images annotated with 24 keypoints for dog pose estimation. Ideal for training and validating models with [Ultralytics YOLO11](https://docs.ultralytics.com/models/yolo11/), it supports applications like animal behavior analysis and veterinary studies.

### How do I train a YOLO11 model using the Dog-pose dataset in Ultralytics?

To train a YOLO11n-pose model on the Dog-pose dataset for 100 epochs with an image size of 640, follow these examples:

!!! example "Train Example"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a model
        model = YOLO("yolo11n-pose.pt")

        # Train the model
        results = model.train(data="dog-pose.yaml", epochs=100, imgsz=640)
        ```

    === "CLI"

        ```bash
        yolo pose train data=dog-pose.yaml model=yolo11n-pose.pt epochs=100 imgsz=640
        ```

For a comprehensive list of training arguments, refer to the model [Training](../../modes/train.md) page.

### What are the benefits of using the Dog-pose dataset?

The Dog-pose dataset offers several benefits:

- **Large and Diverse Dataset**: With more than 8,400 images (6,773 for training and 1,703 for testing), it provides a substantial amount of data covering a wide range of dog poses, breeds, and contexts, enabling robust model training and evaluation.
- **Pose-specific Annotations**: Offers detailed annotations for pose estimation, ensuring high-quality data for training pose detection models.
- **Real-World Scenarios**: Includes images from varied environments, enhancing the model's ability to generalize to real-world applications.
- **Model Performance Improvement**: The diversity and scale of the dataset help improve model accuracy and robustness, particularly for tasks involving fine-grained pose estimation.

For more about its features and usage, see the [Dataset Introduction](#introduction) section.

### How does mosaicing benefit the YOLO11 training process using the Dog-pose dataset?

Mosaicing, as illustrated in the sample images from the Dog-pose dataset, merges multiple images into a single composite, enriching the diversity of objects and scenes in each training batch. This approach enhances the model's capacity to generalize across different object sizes, aspect ratios, and contexts, leading to improved performance. For example images, refer to the [Sample Images and Annotations](#sample-images-and-annotations) section.
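
Mosaic augmentation is controlled through standard training arguments, so no dataset changes are needed. A minimal sketch (the values shown are the usual defaults, not tuned settings):

```python
from ultralytics import YOLO

model = YOLO("yolo11n-pose.pt")

# mosaic=1.0 applies mosaic augmentation to every batch; close_mosaic=10 turns it off for the final 10 epochs
results = model.train(data="dog-pose.yaml", epochs=100, imgsz=640, mosaic=1.0, close_mosaic=10)
```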

### Where can I find the Dog-pose dataset YAML file and how do I use it?

The Dog-pose dataset YAML file can be found [here](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/dog-pose.yaml). This file defines the dataset configuration, including paths, classes, and other relevant information. Use this file with the YOLO11 training scripts as mentioned in the [Train Example](#how-do-i-train-a-yolo11-model-using-the-dog-pose-dataset-in-ultralytics) section.

For more FAQs and detailed documentation, visit the [Ultralytics Documentation](https://docs.ultralytics.com/).

@@ -0,0 +1,325 @@
---
comments: true
description: Learn to export Ultralytics YOLOv8 models to Sony's IMX500 format to optimize your models for efficient deployment.
keywords: Sony, IMX500, IMX 500, AITRIOS, MCT, model export, quantization, pruning, deep learning optimization, Raspberry Pi AI Camera, edge AI, PyTorch, IMX
---

# Sony IMX500 Export for Ultralytics YOLOv8

This guide covers exporting and deploying Ultralytics YOLOv8 models to Raspberry Pi AI Cameras that feature the Sony IMX500 sensor.

Deploying computer vision models on devices with limited computational power, such as the [Raspberry Pi AI Camera](https://www.raspberrypi.com/products/ai-camera/), can be tricky. Using a model format optimized for faster performance makes a huge difference.

The IMX500 model format is designed to use minimal power while delivering fast performance for neural networks. It allows you to optimize your [Ultralytics YOLOv8](https://github.com/ultralytics/ultralytics) models for high-speed and low-power inferencing. In this guide, we'll walk you through exporting and deploying your models to the IMX500 format while making it easier for your models to perform well on the [Raspberry Pi AI Camera](https://www.raspberrypi.com/products/ai-camera/).

<p align="center">
  <img width="100%" src="https://github.com/ultralytics/assets/releases/download/v8.3.0/ai-camera.avif" alt="Raspberry Pi AI Camera">
</p>

## Why Should You Export to IMX500?

Sony's [IMX500 Intelligent Vision Sensor](https://developer.aitrios.sony-semicon.com/en/raspberrypi-ai-camera) is a game-changing piece of hardware in edge AI processing. It's the world's first intelligent vision sensor with on-chip AI capabilities. This sensor helps overcome many challenges in edge AI, including data processing bottlenecks, privacy concerns, and performance limitations.
While other sensors merely pass along images and frames, the IMX500 tells a whole story. It processes data directly on the sensor, allowing devices to generate insights in real-time.

## Sony's IMX500 Export for YOLOv8 Models

The IMX500 is designed to transform how devices handle data directly on the sensor, without needing to send it off to the cloud for processing.

The IMX500 works with quantized models. Quantization makes models smaller and faster without losing much [accuracy](https://www.ultralytics.com/glossary/accuracy). It is ideal for the limited resources of edge computing, allowing applications to respond quickly by reducing latency and allowing for quick data processing locally, without cloud dependency. Local processing also keeps user data private and secure since it's not sent to a remote server.

**IMX500 Key Features:**

- **Metadata Output:** Instead of transmitting images only, the IMX500 can output both image and metadata (inference result), and can output metadata only for minimizing data size, reducing bandwidth, and lowering costs.
- **Addresses Privacy Concerns:** By processing data on the device, the IMX500 addresses privacy concerns, ideal for human-centric applications like person counting and occupancy tracking.
- **Real-time Processing:** Fast, on-sensor processing supports real-time decisions, perfect for edge AI applications such as autonomous systems.

**Before You Begin:** For best results, ensure your YOLOv8 model is well-prepared for export by following our [Model Training Guide](https://docs.ultralytics.com/modes/train/), [Data Preparation Guide](https://docs.ultralytics.com/datasets/), and [Hyperparameter Tuning Guide](https://docs.ultralytics.com/guides/hyperparameter-tuning/).

## Usage Examples

Export an Ultralytics YOLOv8 model to IMX500 format and run inference with the exported model.

!!! note

    Here we perform inference just to make sure the model works as expected. However, for deployment and inference on the Raspberry Pi AI Camera, please jump to the [Using IMX500 Export in Deployment](#using-imx500-export-in-deployment) section.

!!! example

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load a YOLOv8n PyTorch model
        model = YOLO("yolov8n.pt")

        # Export the model
        model.export(format="imx")  # exports with PTQ quantization by default

        # Load the exported model
        imx_model = YOLO("yolov8n_imx_model")

        # Run inference
        results = imx_model("https://ultralytics.com/images/bus.jpg")
        ```

    === "CLI"

        ```bash
        # Export a YOLOv8n PyTorch model to imx format with Post-Training Quantization (PTQ)
        yolo export model=yolov8n.pt format=imx

        # Run inference with the exported model
        yolo predict model=yolov8n_imx_model source='https://ultralytics.com/images/bus.jpg'
        ```

The export process will create an ONNX model for quantization validation, along with a directory named `<model-name>_imx_model`. This directory will include the `packerOut.zip` file, which is essential for packaging the model for deployment on the IMX500 hardware. Additionally, the `<model-name>_imx_model` folder will contain a text file (`labels.txt`) listing all the labels associated with the model.

```bash
yolov8n_imx_model
├── dnnParams.xml
├── labels.txt
├── packerOut.zip
├── yolov8n_imx.onnx
├── yolov8n_imx500_model_MemoryReport.json
└── yolov8n_imx500_model.pbtxt
```

## Arguments

When exporting a model to IMX500 format, you can specify various arguments:

| Key      | Value  | Description                                              |
| -------- | ------ | -------------------------------------------------------- |
| `format` | `imx`  | Format to export to (imx)                                 |
| `int8`   | `True` | Enable INT8 quantization for the model (default: `True`)  |
| `imgsz`  | `640`  | Image size for the model input (default: `640`)           |
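
Both arguments can also be passed explicitly through the Python API. The sketch below simply spells out the defaults from the table above:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")

# int8=True and imgsz=640 are already the defaults; shown here only to illustrate the arguments
model.export(format="imx", int8=True, imgsz=640)
```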

## Using IMX500 Export in Deployment

After exporting the Ultralytics YOLOv8n model to IMX500 format, it can be deployed to the Raspberry Pi AI Camera for inference.

### Hardware Prerequisites

Make sure you have the following hardware:

1. Raspberry Pi 5 or Raspberry Pi 4 Model B
2. Raspberry Pi AI Camera

Connect the Raspberry Pi AI Camera to the 15-pin MIPI CSI connector on the Raspberry Pi and power it on.

### Software Prerequisites

!!! note

    This guide has been tested with Raspberry Pi OS Bookworm running on a Raspberry Pi 5.

Step 1: Open a terminal window and execute the following commands to update the Raspberry Pi software to the latest version.

```bash
sudo apt update && sudo apt full-upgrade
```

Step 2: Install the IMX500 firmware, which is required to operate the IMX500 sensor, along with a packager tool.

```bash
sudo apt install imx500-all imx500-tools
```

Step 3: Install the prerequisites to run the `picamera2` application. We will use this application later for the deployment process.

```bash
sudo apt install python3-opencv python3-munkres
```

Step 4: Reboot the Raspberry Pi for the changes to take effect.

```bash
sudo reboot
```

### Package Model and Deploy to AI Camera

After obtaining `packerOut.zip` from the IMX500 conversion process, you can pass this file into the packager tool to obtain an RPK file. This file can then be deployed directly to the AI Camera using `picamera2`.

Step 1: Package the model into an RPK file

```bash
imx500-package -i <path to packerOut.zip> -o <output folder>
```

The above will generate a `network.rpk` file inside the specified output folder.

Step 2: Clone the `picamera2` repository, install it, and navigate to the imx500 examples

```bash
git clone -b next https://github.com/raspberrypi/picamera2
cd picamera2
pip install -e . --break-system-packages
cd examples/imx500
```

Step 3: Run YOLOv8 object detection using the `labels.txt` file generated during the IMX500 export.

```bash
python imx500_object_detection_demo.py --model <path to network.rpk> --fps 25 --bbox-normalization --ignore-dash-labels --bbox-order xy --labels <path to labels.txt>
```

Then you will be able to see live inference output as follows:

<p align="center">
  <img width="100%" src="https://github.com/ultralytics/assets/releases/download/v8.3.0/imx500-inference-rpi.avif" alt="Inference on Raspberry Pi AI Camera">
</p>

## Benchmarks

The YOLOv8 benchmarks below were run by the Ultralytics team on the Raspberry Pi AI Camera using the `imx` model format, measuring both speed and accuracy.

| Model   | Format | Status | Size (MB) | mAP50-95(B) | Inference time (ms/im) |
| ------- | ------ | ------ | --------- | ----------- | ---------------------- |
| YOLOv8n | imx    | ✅     | 2.9       | 0.522       | 66.66                  |

!!! note

    Validation for the above benchmark was done using the COCO8 dataset.
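
To run a comparable accuracy check yourself, the exported model can be validated through the Ultralytics API. This is a minimal sketch, assuming your Ultralytics version supports validating the exported `imx` format and reusing the export directory from the earlier example; absolute timings will differ by hardware:

```python
from ultralytics import YOLO

# Directory name assumes the yolov8n export shown earlier on this page
imx_model = YOLO("yolov8n_imx_model")

# Validate on the small COCO8 dataset used for the benchmark above
metrics = imx_model.val(data="coco8.yaml", imgsz=640)
print(metrics.box.map)  # mAP50-95(B)
```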

## What's Under the Hood?

<p align="center">
  <img width="640" src="https://github.com/ultralytics/assets/releases/download/v8.3.0/imx500-deploy.avif" alt="IMX500 deployment">
</p>

### Sony Model Compression Toolkit (MCT)

[Sony's Model Compression Toolkit (MCT)](https://github.com/sony/model_optimization) is a powerful tool for optimizing deep learning models through quantization and pruning. It supports various quantization methods and provides advanced algorithms to reduce model size and computational complexity without significantly sacrificing accuracy. MCT is particularly useful for deploying models on resource-constrained devices, ensuring efficient inference and reduced latency.

### Supported Features of MCT

Sony's MCT offers a range of features designed to optimize neural network models:

1. **Graph Optimizations**: Transforms models into more efficient versions by folding layers like batch normalization into preceding layers.
2. **Quantization Parameter Search**: Minimizes quantization noise using metrics like Mean-Square-Error, No-Clipping, and Mean-Average-Error.
3. **Advanced Quantization Algorithms**:
    - **Shift Negative Correction**: Addresses performance issues from symmetric activation quantization.
    - **Outliers Filtering**: Uses z-score to detect and remove outliers.
    - **Clustering**: Utilizes non-uniform quantization grids for better distribution matching.
    - **Mixed-Precision Search**: Assigns different quantization bit-widths per layer based on sensitivity.
4. **Visualization**: Use TensorBoard to observe model performance insights, quantization phases, and bit-width configurations.

#### Quantization

MCT supports several quantization methods to reduce model size and improve inference speed:

1. **Post-Training Quantization (PTQ)**:
    - Available via Keras and PyTorch APIs.
    - Complexity: Low
    - Computational Cost: Low (CPU minutes)
2. **Gradient-based Post-Training Quantization (GPTQ)**:
    - Available via Keras and PyTorch APIs.
    - Complexity: Medium
    - Computational Cost: Moderate (2-3 GPU hours)
3. **Quantization-Aware Training (QAT)**:
    - Complexity: High
    - Computational Cost: High (12-36 GPU hours)

MCT also supports various quantization schemes for weights and activations:

1. Power-of-Two (hardware-friendly)
2. Symmetric
3. Uniform

#### Structured Pruning

MCT introduces structured, hardware-aware model pruning designed for specific hardware architectures. This technique leverages the target platform's Single Instruction, Multiple Data (SIMD) capabilities by pruning SIMD groups. This reduces model size and complexity while improving channel utilization, keeping the pruned weights aligned with the target SIMD architecture and shrinking the weights memory footprint. Available via Keras and PyTorch APIs.

### IMX500 Converter Tool (Compiler)

The IMX500 Converter Tool is integral to the IMX500 toolset, allowing the compilation of models for deployment on Sony's IMX500 sensor (for instance, Raspberry Pi AI Cameras). This tool facilitates the transition of Ultralytics YOLOv8 models processed through Ultralytics software, ensuring they are compatible and perform efficiently on the specified hardware. The export procedure following model quantization involves the generation of binary files that encapsulate essential data and device-specific configurations, streamlining the deployment process on the Raspberry Pi AI Camera.

## Real-World Use Cases

Export to IMX500 format has wide applicability across industries. Here are some examples:

- **Edge AI and IoT**: Enable object detection on drones or security cameras, where real-time processing on low-power devices is essential.
- **Wearable Devices**: Deploy models optimized for small-scale AI processing on health-monitoring wearables.
- **Smart Cities**: Use IMX500-exported YOLOv8 models for traffic monitoring and safety analysis with faster processing and minimal latency.
- **Retail Analytics**: Enhance in-store monitoring by deploying optimized models in point-of-sale systems or smart shelves.

## Conclusion

Exporting Ultralytics YOLOv8 models to Sony's IMX500 format allows you to deploy your models for efficient inference on IMX500-based cameras. By leveraging advanced quantization techniques, you can reduce model size and improve inference speed without significantly compromising accuracy.

For more information and detailed guidelines, refer to Sony's [IMX500 website](https://developer.aitrios.sony-semicon.com/en/raspberrypi-ai-camera).

## FAQ

### How do I export a YOLOv8 model to IMX500 format for Raspberry Pi AI Camera?

To export a YOLOv8 model to IMX500 format, use the Python API shown below or the equivalent `yolo export model=yolov8n.pt format=imx` CLI command:

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")
model.export(format="imx")  # Exports with PTQ quantization by default
```

The export process will create a directory containing the necessary files for deployment, including `packerOut.zip`, which can be used with the IMX500 packager tool on the Raspberry Pi.

### What are the key benefits of using the IMX500 format for edge AI deployment?

The IMX500 format offers several important advantages for edge deployment:

- On-chip AI processing reduces latency and power consumption
- Outputs both image and metadata (inference result) instead of images only
- Enhanced privacy by processing data locally without cloud dependency
- Real-time processing capabilities ideal for time-sensitive applications
- Optimized quantization for efficient model deployment on resource-constrained devices

### What hardware and software prerequisites are needed for IMX500 deployment?

For deploying IMX500 models, you'll need:

Hardware:

- Raspberry Pi 5 or Raspberry Pi 4 Model B
- Raspberry Pi AI Camera with IMX500 sensor

Software:

- Raspberry Pi OS Bookworm
- IMX500 firmware and tools (`sudo apt install imx500-all imx500-tools`)
- Python packages for `picamera2` (`sudo apt install python3-opencv python3-munkres`)

### What performance can I expect from YOLOv8 models on the IMX500?

Based on Ultralytics benchmarks on the Raspberry Pi AI Camera:

- YOLOv8n achieves an inference time of 66.66 ms per image
- mAP50-95 of 0.522 on the COCO8 dataset
- Model size of only 2.9 MB after quantization

This demonstrates that the IMX500 format provides efficient real-time inference while maintaining good accuracy for edge AI applications.

### How do I package and deploy my exported model to the Raspberry Pi AI Camera?

After exporting to IMX500 format:

1. Use the packager tool to create an RPK file:

    ```bash
    imx500-package -i <path to packerOut.zip> -o <output folder>
    ```

2. Clone and install picamera2:

    ```bash
    git clone -b next https://github.com/raspberrypi/picamera2
    cd picamera2 && pip install -e . --break-system-packages
    ```

3. Run inference using the generated RPK file:

    ```bash
    python imx500_object_detection_demo.py --model <path to network.rpk> --fps 25 --bbox-normalization --labels <path to labels.txt>
    ```

@@ -0,0 +1,23 @@
# Ultralytics YOLO 🚀, AGPL-3.0 license
# Dogs dataset http://vision.stanford.edu/aditya86/ImageNetDogs/ by Stanford
# Documentation: https://docs.ultralytics.com/datasets/pose/dog-pose/
# Example usage: yolo train data=dog-pose.yaml
# parent
# ├── ultralytics
# └── datasets
#     └── dog-pose ← downloads here (337 MB)

# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/dog-pose # dataset root dir
train: train # train images (relative to 'path') 6773 images
val: val # val images (relative to 'path') 1703 images

# Keypoints
kpt_shape: [24, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)

# Classes
names:
  0: dog

# Download script/URL (optional)
download: https://github.com/ultralytics/assets/releases/download/v0.0.0/dog-pose.zip

@@ -0,0 +1,112 @@
# Ultralytics YOLO 🚀, AGPL-3.0 license

from ultralytics.solutions.solutions import BaseSolution
from ultralytics.utils.plotting import Annotator, colors


class RegionCounter(BaseSolution):
    """
    A class designed for real-time counting of objects within user-defined regions in a video stream.

    This class inherits from `BaseSolution` and offers functionalities to define polygonal regions in a video
    frame, track objects, and count those objects that pass through each defined region. This makes it useful
    for applications that require counting in specified areas, such as monitoring zones or segmented sections.

    Attributes:
        region_template (dict): A template for creating new counting regions with default attributes including
            the name, polygon coordinates, and display colors.
        counting_regions (list): A list storing all defined regions, where each entry is based on `region_template`
            and includes specific region settings like name, coordinates, and color.

    Methods:
        add_region: Adds a new counting region with specified attributes, such as the region's name, polygon points,
            region color, and text color.
        count: Processes video frames to count objects in each region, drawing regions and displaying counts
            on the frame. Handles object detection, region definition, and containment checks.
    """

    def __init__(self, **kwargs):
        """Initializes the RegionCounter class for real-time counting in different regions of the video streams."""
        super().__init__(**kwargs)
        self.region_template = {
            "name": "Default Region",
            "polygon": None,
            "counts": 0,
            "dragging": False,
            "region_color": (255, 255, 255),
            "text_color": (0, 0, 0),
        }
        self.counting_regions = []

    def add_region(self, name, polygon_points, region_color, text_color):
        """
        Adds a new region to the counting list based on the provided template with specific attributes.

        Args:
            name (str): Name assigned to the new region.
            polygon_points (list[tuple]): List of (x, y) coordinates defining the region's polygon.
            region_color (tuple): BGR color for region visualization.
            text_color (tuple): BGR color for the text within the region.
        """
        region = self.region_template.copy()
        region.update(
            {
                "name": name,
                "polygon": self.Polygon(polygon_points),
                "region_color": region_color,
                "text_color": text_color,
            }
        )
        self.counting_regions.append(region)

    def count(self, im0):
        """
        Processes the input frame to detect and count objects within each defined region.

        Args:
            im0 (numpy.ndarray): Input image frame where objects and regions are annotated.

        Returns:
            im0 (numpy.ndarray): Processed image frame with annotated counting information.
        """
        self.annotator = Annotator(im0, line_width=self.line_width)
        self.extract_tracks(im0)

        # Region initialization and conversion
        if self.region is None:
            self.initialize_region()
            regions = {"Region#01": self.region}
        else:
            regions = self.region if isinstance(self.region, dict) else {"Region#01": self.region}

        # Draw regions and process counts for each defined area
        for idx, (region_name, reg_pts) in enumerate(regions.items(), start=1):
            color = colors(idx, True)
            self.annotator.draw_region(reg_pts=reg_pts, color=color, thickness=self.line_width * 2)
            self.add_region(region_name, reg_pts, color, self.annotator.get_txt_color())

        # Prepare regions for containment check
        for region in self.counting_regions:
            region["prepared_polygon"] = self.prep(region["polygon"])

        # Process bounding boxes and count objects within each region
        for box, cls in zip(self.boxes, self.clss):
            self.annotator.box_label(box, label=self.names[cls], color=colors(cls, True))
            bbox_center = ((box[0] + box[2]) / 2, (box[1] + box[3]) / 2)

            for region in self.counting_regions:
                if region["prepared_polygon"].contains(self.Point(bbox_center)):
                    region["counts"] += 1

        # Display counts in each region
        for region in self.counting_regions:
            self.annotator.text_label(
                region["polygon"].bounds,
                label=str(region["counts"]),
                color=region["region_color"],
                txt_color=region["text_color"],
            )
            region["counts"] = 0  # Reset count for next frame

        self.display_output(im0)
        return im0
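

# Usage sketch (illustrative, not part of the original module): one way RegionCounter
# could be wired into a video loop. The keyword arguments (model, region, show) are
# assumptions based on other Ultralytics solutions and may differ in your version.
if __name__ == "__main__":
    import cv2

    # Two illustrative polygonal regions in pixel coordinates
    regions = {
        "Region#01": [(50, 60), (250, 60), (250, 360), (50, 360)],
        "Region#02": [(300, 60), (500, 60), (500, 360), (300, 360)],
    }

    counter = RegionCounter(model="yolo11n.pt", region=regions, show=True)

    cap = cv2.VideoCapture("path/to/video.mp4")  # placeholder video path
    while cap.isOpened():
        ok, frame = cap.read()
        if not ok:
            break
        annotated = counter.count(frame)  # frame returned with regions, boxes, and counts drawn
    cap.release()
    cv2.destroyAllWindows()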