added docs fixed yolo11 CI

mct-2.1.1
Francesco Mattioli 5 months ago
parent 629e2c59b1
commit f3afd842e4
  1. docs/en/integrations/sony-mct.md (108 changed lines)
  2. docs/en/macros/export-args.md (3 changed lines)
  3. docs/en/macros/export-table.md (1 changed line)
  4. ultralytics/engine/exporter.py (7 changed lines)

@ -0,0 +1,108 @@
---
comments: true
description:
keywords:
---
# Supported Features of MCT
The Model Compression Toolkit (MCT) provides a set of features for optimizing neural network models for efficient deployment on constrained hardware. The sections below summarize the supported features.
## Quantization
MCT supports several quantization methods, each with varying levels of complexity and computational cost:
- **Post-training quantization (PTQ)**:
    - Available via Keras API and PyTorch API.
    - Complexity: Low
    - Computational Cost: Low (order of minutes)
- **Gradient-based post-training quantization (GPTQ)**:
    - Available via Keras API and PyTorch API.
    - Complexity: Mild
    - Computational Cost: Mild (order of 2-3 hours)
- **Quantization-aware training (QAT)**:
    - Complexity: High
    - Computational Cost: High (order of 12-36 hours)

In addition, MCT supports various quantization schemes for weights and activations:
- Power-Of-Two (hardware-friendly)
- Symmetric
- Uniform
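
The three schemes differ only in how the quantization grid is chosen. The sketch below is a minimal, pure-Python illustration and not MCT's implementation; the function names (`quantize_symmetric`, `quantize_power_of_two`, `quantize_uniform`) and the default 8-bit width are assumptions made for this example.

```python
import math


def quantize_symmetric(values, threshold, n_bits=8):
    """Signed symmetric grid: integer levels in [-2^(n-1), 2^(n-1) - 1] scaled by threshold."""
    scale = threshold / (2 ** (n_bits - 1))
    q_min, q_max = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    return [min(max(round(v / scale), q_min), q_max) * scale for v in values]


def quantize_power_of_two(values, n_bits=8):
    """Symmetric quantization whose threshold is rounded up to a power of two."""
    max_abs = max(abs(v) for v in values)
    # Guard against an all-zero tensor, where log2 is undefined.
    threshold = 2.0 ** math.ceil(math.log2(max_abs)) if max_abs > 0 else 1.0
    return quantize_symmetric(values, threshold, n_bits), threshold


def quantize_uniform(values, range_min, range_max, n_bits=8):
    """Uniform grid of 2^n levels over an arbitrary [range_min, range_max] interval."""
    scale = (range_max - range_min) / (2 ** n_bits - 1)
    quantized = []
    for v in values:
        clipped = min(max(v, range_min), range_max)
        quantized.append(round((clipped - range_min) / scale) * scale + range_min)
    return quantized


weights = [-0.7, -0.1, 0.0, 0.3, 1.3]
pot_values, pot_threshold = quantize_power_of_two(weights)
sym_values = quantize_symmetric(weights, threshold=1.3)
uni_values = quantize_uniform(weights, range_min=-0.7, range_max=1.3)
print(pot_threshold, pot_values)
print(sym_values)
print(uni_values)
```

A power-of-two threshold turns the scale into a bit shift, which is why that scheme is labeled hardware-friendly, while the uniform scheme spends no levels on an unused negative or positive range.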
### Main Features
- **Graph Optimizations**: Transform models into more efficient versions (e.g., folding batch-normalization layers into preceding linear layers).
- **Quantization Parameter Search**: Minimize quantization noise using methods like Mean-Square-Error or other metrics like No-Clipping and Mean-Average-Error.
- **Advanced Quantization Algorithms**:
    - **Shift Negative Correction**: Addresses performance issues from symmetric activation quantization.
    - **Outliers Filtering**: Uses z-score to detect and remove outliers.
    - **Clustering**: Utilizes non-uniform quantization grids for better distribution matching.
    - **Mixed-Precision Search**: Assigns quantization bit-width per layer based on sensitivity to various bit-widths.
- **Visualization**: Use TensorBoard to observe model performance insights, like quantization phases and bit-width configurations.
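
To make the parameter search and outlier filtering items above more concrete, here is a minimal pure-Python sketch, assuming a signed power-of-two quantizer and a mean-square-error criterion. It is not MCT's implementation: the helper names (`filter_outliers`, `search_threshold`), the z-score cutoff of 3.0, and the 4-bit width are illustrative assumptions.

```python
import math
import statistics


def filter_outliers(samples, z_threshold=3.0):
    """Drop samples whose z-score magnitude exceeds z_threshold (illustrative cutoff)."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)
    if std == 0:
        return list(samples)
    return [s for s in samples if abs(s - mean) / std <= z_threshold]


def quantize(values, threshold, n_bits):
    """Signed symmetric quantization with the given clipping threshold."""
    scale = threshold / (2 ** (n_bits - 1))
    q_min, q_max = -(2 ** (n_bits - 1)), 2 ** (n_bits - 1) - 1
    return [min(max(round(v / scale), q_min), q_max) * scale for v in values]


def mse(a, b):
    """Mean squared error between two equally sized sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def search_threshold(samples, n_bits=8, n_candidates=6):
    """Pick the power-of-two threshold that minimizes the quantization MSE."""
    max_abs = max(abs(s) for s in samples)
    top_exp = math.ceil(math.log2(max_abs))
    candidates = [2.0 ** (top_exp - i) for i in range(n_candidates)]
    return min(candidates, key=lambda t: mse(samples, quantize(samples, t, n_bits)))


# Activations in [-0.5, 0.5] plus a single extreme value.
raw = [i / 100 for i in range(-50, 51)] + [40.0]
clean = filter_outliers(raw)

print("threshold with outlier:   ", search_threshold(raw, n_bits=4))
print("threshold without outlier:", search_threshold(clean, n_bits=4))
```

In this toy data the single outlier pushes the selected threshold far above the bulk of the distribution, so every ordinary value collapses onto the zero level; removing it first lets the search settle on a threshold that matches the data.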
### Enhanced Post-Training Quantization (EPTQ)
As part of GPTQ, MCT includes the Enhanced Post-Training Quantization (EPTQ) algorithm for advanced optimization. Details can be found in the paper "EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian". For usage instructions, refer to the [EPTQ guidelines](#).
### Structured Pruning
MCT supports structured, hardware-aware model pruning tailored to specific hardware architectures. The technique takes advantage of the target platform's Single Instruction, Multiple Data (SIMD) capabilities: it prunes whole SIMD groups of channels, which reduces model size and complexity while keeping channel utilization aligned with the SIMD architecture, so that the pruned model meets a target weights-memory footprint. Available via Keras API and PyTorch API.
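
The idea of pruning whole SIMD groups can be sketched as follows. This is a simplified illustration under stated assumptions, not MCT's algorithm: the function `prune_simd_groups` is hypothetical, groups are taken as contiguous channel blocks, per-channel importance scores are given as input, and a group's importance is the sum of its channel scores.

```python
def prune_simd_groups(channel_scores, simd_size, keep_ratio):
    """Return the sorted indices of channels kept after pruning whole SIMD groups.

    Assumes contiguous groups of `simd_size` channels and a precomputed
    importance score per channel (both are assumptions of this sketch).
    """
    if len(channel_scores) % simd_size != 0:
        raise ValueError("channel count must be a multiple of simd_size")

    n_groups = len(channel_scores) // simd_size
    groups = [list(range(g * simd_size, (g + 1) * simd_size)) for g in range(n_groups)]
    group_scores = [sum(channel_scores[i] for i in group) for group in groups]

    # Keep at least one group so the layer never ends up empty.
    n_keep = max(1, int(n_groups * keep_ratio))
    ranked = sorted(range(n_groups), key=lambda g: group_scores[g], reverse=True)
    kept_groups = sorted(ranked[:n_keep])
    return [i for g in kept_groups for i in groups[g]]


scores = [0.9, 0.8, 0.1, 0.2, 0.5, 0.6, 0.05, 0.05]
kept = prune_simd_groups(scores, simd_size=2, keep_ratio=0.5)
print(kept)  # [0, 1, 4, 5]
```

Because channels are removed in multiples of the SIMD width, the remaining channel count always fills the vector lanes completely, which is the motivation for pruning by group instead of by individual channel.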
## Exporting YOLO Models with MCT
### Installation
To install the required package, run:
!!! tip "Installation"

    === "CLI"

        ```bash
        # Install the required package for YOLO11
        pip install ultralytics
        ```

For detailed instructions and best practices related to the installation process, check our [YOLO11 Installation guide](../quickstart.md). While installing the required packages for YOLO11, if you encounter any difficulties, consult our [Common Issues guide](../guides/yolo-common-issues.md) for solutions and tips.
### Usage
Before diving into the usage instructions, be sure to check out the range of [YOLO11 models offered by Ultralytics](../models/index.md). This will help you choose the most appropriate model for your project requirements.
!!! example "Usage"

    === "Python"

        ```python
        from ultralytics import YOLO

        # Load the YOLO11 model
        model = YOLO("yolo11n.pt")

        # Export the model to MCT format
        model.export(format="mct")  # export with PTQ quantization by default
        # or
        # model.export(format="mct", gptq=True)  # export with GPTQ quantization

        # Load the exported MCT ONNX model
        mct_onnx_model = YOLO("yolo11n_mct_model.onnx")

        # Run inference
        results = mct_onnx_model("https://ultralytics.com/images/bus.jpg")
        ```

    === "CLI"

        ```bash
        # Export yolo11n to MCT format
        yolo export model=yolo11n.pt format=mct

        # Run inference with the exported model
        yolo predict model=yolo11n_mct_model.onnx source='https://ultralytics.com/images/bus.jpg'
        ```

For more details about the export process, visit the [Ultralytics documentation page on exporting](../modes/export.md).

@ -10,5 +10,6 @@
 | `simplify` | `bool` | `True` | Simplifies the model graph for ONNX exports with `onnxslim`, potentially improving performance and compatibility. |
 | `opset` | `int` | `None` | Specifies the ONNX opset version for compatibility with different ONNX parsers and runtimes. If not set, uses the latest supported version. |
 | `workspace` | `float` | `4.0` | Sets the maximum workspace size in GiB for TensorRT optimizations, balancing memory usage and performance. |
-| `nms` | `bool` | `False` | Adds Non-Maximum Suppression (NMS) to the CoreML export, essential for accurate and efficient detection post-processing. |
+| `nms` | `bool` | `False` | Adds Non-Maximum Suppression (NMS) to the CoreML and MCT export, essential for accurate and efficient detection post-processing. |
 | `batch` | `int` | `1` | Specifies export model batch inference size or the max number of images the exported model will process concurrently in `predict` mode. |
+| `gptq` | `bool` | `False` | Enables GPTQ quantization for Sony MCT export. |

@ -13,3 +13,4 @@
| [TF.js](../integrations/tfjs.md) | `tfjs` | `{{ model_name or "yolo11n" }}_web_model/` | ✅ | `imgsz`, `half`, `int8`, `batch` |
| [PaddlePaddle](../integrations/paddlepaddle.md) | `paddle` | `{{ model_name or "yolo11n" }}_paddle_model/` | ✅ | `imgsz`, `batch` |
| [NCNN](../integrations/ncnn.md) | `ncnn` | `{{ model_name or "yolo11n" }}_ncnn_model/` | ✅ | `imgsz`, `half`, `batch` |
| [Sony MCT](../integrations/sony-mct.md) | `mct` | `{{ model_name or "yolo11n" }}_mct_model/` | ✅ | `imgsz`, `gptq`, `nms` |

@ -17,6 +17,7 @@ TensorFlow Edge TPU | `edgetpu` | yolov8n_edgetpu.tflite
TensorFlow.js | `tfjs` | yolov8n_web_model/
PaddlePaddle | `paddle` | yolov8n_paddle_model/
NCNN | `ncnn` | yolov8n_ncnn_model/
Sony MCT | `mct` | yolov8n_mct_model.onnx
Requirements:
$ pip install "ultralytics[export]"
@ -42,6 +43,7 @@ Inference:
yolov8n_edgetpu.tflite # TensorFlow Edge TPU
yolov8n_paddle_model # PaddlePaddle
yolov8n_ncnn_model # NCNN
yolov8n_mct_model.onnx # Sony MCT
TensorFlow.js:
$ cd .. && git clone https://github.com/zldrobit/tfjs-yolov5-example.git && cd tfjs-yolov5-example
@ -1111,11 +1113,6 @@ class Exporter:
         ],
         16,
     )
-else:  # yolo11 model
-    bit_cfg.set_manual_activation_bit_width(
-        [NodeNameScopeFilter("sub")],
-        16,
-    )
 config = mct.core.CoreConfig(
     mixed_precision_config=mct.core.MixedPrecisionQuantizationConfig(num_of_images=10),
