scipy_removal
Glenn Jocher 2 years ago
commit fea03cf22d
  1. 2
      .github/ISSUE_TEMPLATE/config.yml
  2. 132
      .github/workflows/ci.yaml
  3. 129
      .github/workflows/docker.yaml
  4. 4
      .github/workflows/greetings.yml
  5. 4
      .github/workflows/links.yml
  6. 12
      .github/workflows/publish.yml
  7. 10
      .gitignore
  8. 10
      .pre-commit-config.yaml
  9. 2
      CONTRIBUTING.md
  10. 37
      README.md
  11. 35
      README.zh-CN.md
  12. 22
      docker/Dockerfile
  13. 11
      docker/Dockerfile-arm64
  14. 24
      docker/Dockerfile-cpu
  15. 19
      docker/Dockerfile-jetson
  16. 49
      docker/Dockerfile-python
  17. 2
      docs/CNAME
  18. 7
      docs/README.md
  19. 30
      docs/SECURITY.md
  20. 41
      docs/build_reference.py
  21. 81
      docs/datasets/classify/caltech101.md
  22. 78
      docs/datasets/classify/caltech256.md
  23. 80
      docs/datasets/classify/cifar10.md
  24. 80
      docs/datasets/classify/cifar100.md
  25. 79
      docs/datasets/classify/fashion-mnist.md
  26. 83
      docs/datasets/classify/imagenet.md
  27. 78
      docs/datasets/classify/imagenet10.md
  28. 113
      docs/datasets/classify/imagenette.md
  29. 84
      docs/datasets/classify/imagewoof.md
  30. 120
      docs/datasets/classify/index.md
  31. 86
      docs/datasets/classify/mnist.md
  32. 97
      docs/datasets/detect/argoverse.md
  33. 94
      docs/datasets/detect/coco.md
  34. 84
      docs/datasets/detect/coco8.md
  35. 91
      docs/datasets/detect/globalwheat2020.md
  36. 104
      docs/datasets/detect/index.md
  37. 92
      docs/datasets/detect/objects365.md
  38. 110
      docs/datasets/detect/open-images-v7.md
  39. 93
      docs/datasets/detect/sku-110k.md
  40. 92
      docs/datasets/detect/visdrone.md
  41. 95
      docs/datasets/detect/voc.md
  42. 97
      docs/datasets/detect/xview.md
  43. 66
      docs/datasets/index.md
  44. 129
      docs/datasets/obb/dota-v2.md
  45. 80
      docs/datasets/obb/index.md
  46. 95
      docs/datasets/pose/coco.md
  47. 84
      docs/datasets/pose/coco8-pose.md
  48. 128
      docs/datasets/pose/index.md
  49. 94
      docs/datasets/segment/coco.md
  50. 84
      docs/datasets/segment/coco8-seg.md
  51. 140
      docs/datasets/segment/index.md
  52. 30
      docs/datasets/track/index.md
  53. 19
      docs/guides/index.md
  54. 265
      docs/guides/kfold-cross-validation.md
  55. 62
      docs/help/CI.md
  56. 5
      docs/help/CLA.md
  57. 4
      docs/help/FAQ.md
  58. 4
      docs/help/code_of_conduct.md
  59. 14
      docs/help/contributing.md
  60. 37
      docs/help/environmental-health-safety.md
  61. 6
      docs/help/index.md
  62. 6
      docs/help/minimum_reproducible_example.md
  63. 116
      docs/hub.md
  64. 66
      docs/hub/app/android.md
  65. 30
      docs/hub/app/index.md
  66. 56
      docs/hub/app/ios.md
  67. 159
      docs/hub/datasets.md
  68. 42
      docs/hub/index.md
  69. 458
      docs/hub/inference_api.md
  70. 7
      docs/hub/integrations.md
  71. 213
      docs/hub/models.md
  72. 169
      docs/hub/projects.md
  73. 7
      docs/hub/quickstart.md
  74. 14
      docs/index.md
  75. 429
      docs/inference_api.md
  76. 61
      docs/integrations/index.md
  77. 271
      docs/integrations/openvino.md
  78. 174
      docs/integrations/ray-tune.md
  79. 186
      docs/models/fast-sam.md
  80. 66
      docs/models/index.md
  81. 109
      docs/models/mobile-sam.md
  82. 101
      docs/models/rtdetr.md
  83. 227
      docs/models/sam.md
  84. 127
      docs/models/yolo-nas.md
  85. 104
      docs/models/yolov3.md
  86. 71
      docs/models/yolov4.md
  87. 108
      docs/models/yolov5.md
  88. 114
      docs/models/yolov6.md
  89. 65
      docs/models/yolov7.md
  90. 91
      docs/models/yolov8.md
  91. 66
      docs/modes/benchmark.md
  92. 43
      docs/modes/export.md
  93. 14
      docs/modes/index.md
  94. 649
      docs/modes/predict.md
  95. 270
      docs/modes/track.md
  96. 288
      docs/modes/train.md
  97. 48
      docs/modes/val.md
  98. 2
      docs/overrides/partials/comments.html
  99. 26
      docs/overrides/partials/source-file.html
  100. 181
      docs/quickstart.md
  101. Some files were not shown because too many files have changed in this diff Show More

@ -7,5 +7,5 @@ contact_links:
url: https://community.ultralytics.com/
about: Ask on Ultralytics Community Forum
- name: 🎧 Discord
url: https://discord.gg/n6cFeSPZdD
url: https://ultralytics.com/discord
about: Ask on Ultralytics Discord

@ -10,17 +10,30 @@ on:
branches: [main]
schedule:
- cron: '0 0 * * *' # runs at 00:00 UTC every day
workflow_dispatch:
inputs:
hub:
description: 'Run HUB'
default: false
type: boolean
tests:
description: 'Run Tests'
default: false
type: boolean
benchmarks:
description: 'Run Benchmarks'
default: false
type: boolean
jobs:
HUB:
if: github.repository == 'ultralytics/ultralytics' && (github.event_name == 'schedule' || github.event_name == 'push')
if: github.repository == 'ultralytics/ultralytics' && (github.event_name == 'schedule' || github.event_name == 'push' || (github.event_name == 'workflow_dispatch' && github.event.inputs.hub == 'true'))
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest]
python-version: ['3.10']
model: [yolov5n]
python-version: ['3.11']
steps:
- uses: actions/checkout@v3
- uses: actions/setup-python@v4
@ -56,8 +69,26 @@ jobs:
hub.reset_model(model_id)
model = YOLO('https://hub.ultralytics.com/models/' + model_id)
model.train()
- name: Test HUB inference API
shell: python
env:
API_KEY: ${{ secrets.ULTRALYTICS_HUB_API_KEY }}
MODEL_ID: ${{ secrets.ULTRALYTICS_HUB_MODEL_ID }}
run: |
import os
import requests
import json
api_key, model_id = os.environ['API_KEY'], os.environ['MODEL_ID']
url = f"https://api.ultralytics.com/v1/predict/{model_id}"
headers = {"x-api-key": api_key}
data = {"size": 320, "confidence": 0.25, "iou": 0.45}
with open("ultralytics/assets/zidane.jpg", "rb") as f:
response = requests.post(url, headers=headers, data=data, files={"image": f})
assert response.status_code == 200, f'Status code {response.status_code}, Reason {response.reason}'
print(json.dumps(response.json(), indent=2))
Benchmarks:
if: github.event_name != 'workflow_dispatch' || github.event.inputs.benchmarks == 'true'
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
@ -75,12 +106,8 @@ jobs:
shell: bash # for Windows compatibility
run: |
python -m pip install --upgrade pip wheel
if [ "${{ matrix.os }}" == "macos-latest" ]; then
pip install -e '.[export]' --extra-index-url https://download.pytorch.org/whl/cpu
else
pip install -e '.[export]' --extra-index-url https://download.pytorch.org/whl/cpu
fi
yolo export format=tflite imgsz=32
pip install -e ".[export]" coverage --extra-index-url https://download.pytorch.org/whl/cpu
yolo export format=tflite imgsz=32 || true
- name: Check environment
run: |
echo "RUNNER_OS is ${{ runner.os }}"
@ -93,44 +120,44 @@ jobs:
pip --version
pip list
- name: Benchmark DetectionModel
shell: python
run: |
from ultralytics.yolo.utils.benchmarks import benchmark
benchmark(model='${{ matrix.model }}.pt', imgsz=160, half=False, hard_fail=0.20)
shell: bash
run: coverage run -a --source=ultralytics -m ultralytics.cfg.__init__ benchmark model='path with spaces/${{ matrix.model }}.pt' imgsz=160 verbose=0.26
- name: Benchmark SegmentationModel
shell: python
run: |
from ultralytics.yolo.utils.benchmarks import benchmark
benchmark(model='${{ matrix.model }}-seg.pt', imgsz=160, half=False, hard_fail=0.14)
shell: bash
run: coverage run -a --source=ultralytics -m ultralytics.cfg.__init__ benchmark model='path with spaces/${{ matrix.model }}-seg.pt' imgsz=160 verbose=0.30
- name: Benchmark ClassificationModel
shell: python
run: |
from ultralytics.yolo.utils.benchmarks import benchmark
benchmark(model='${{ matrix.model }}-cls.pt', imgsz=160, half=False, hard_fail=0.61)
shell: bash
run: coverage run -a --source=ultralytics -m ultralytics.cfg.__init__ benchmark model='path with spaces/${{ matrix.model }}-cls.pt' imgsz=160 verbose=0.36
- name: Benchmark PoseModel
shell: python
shell: bash
run: coverage run -a --source=ultralytics -m ultralytics.cfg.__init__ benchmark model='path with spaces/${{ matrix.model }}-pose.pt' imgsz=160 verbose=0.17
- name: Merge Coverage Reports
run: |
from ultralytics.yolo.utils.benchmarks import benchmark
benchmark(model='${{ matrix.model }}-pose.pt', imgsz=160, half=False, hard_fail=0.0)
coverage xml -o coverage-benchmarks.xml
- name: Upload Coverage Reports to CodeCov
uses: codecov/codecov-action@v3
with:
flags: Benchmarks
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
- name: Benchmark Summary
run: |
cat benchmarks.log
echo "$(cat benchmarks.log)" >> $GITHUB_STEP_SUMMARY
Tests:
if: github.event_name != 'workflow_dispatch' || github.event.inputs.tests == 'true'
timeout-minutes: 60
runs-on: ${{ matrix.os }}
strategy:
fail-fast: false
matrix:
os: [ubuntu-latest]
python-version: ['3.7', '3.8', '3.9', '3.10']
model: [yolov8n]
python-version: ['3.11']
torch: [latest]
include:
- os: ubuntu-latest
python-version: '3.8' # torch 1.7.0 requires python >=3.6, <=3.8
model: yolov8n
python-version: '3.8' # torch 1.8.0 requires python >=3.6, <=3.8
torch: '1.8.0' # min torch version CI https://pypi.org/project/torchvision/
steps:
- uses: actions/checkout@v3
@ -140,12 +167,12 @@ jobs:
cache: 'pip' # caching pip dependencies
- name: Install requirements
shell: bash # for Windows compatibility
run: |
run: | # CoreML must be installed before export due to protobuf error from AutoInstall
python -m pip install --upgrade pip wheel
if [ "${{ matrix.torch }}" == "1.8.0" ]; then
pip install -e . torch==1.8.0 torchvision==0.9.0 pytest --extra-index-url https://download.pytorch.org/whl/cpu
pip install -e . torch==1.8.0 torchvision==0.9.0 pytest-cov "coremltools>=7.0.b1" --extra-index-url https://download.pytorch.org/whl/cpu
else
pip install -e . pytest --extra-index-url https://download.pytorch.org/whl/cpu
pip install -e . pytest-cov "coremltools>=7.0.b1" --extra-index-url https://download.pytorch.org/whl/cpu
fi
- name: Check environment
run: |
@ -158,41 +185,16 @@ jobs:
python --version
pip --version
pip list
- name: Test Detect
shell: bash # for Windows compatibility
run: |
yolo detect train data=coco8.yaml model=yolov8n.yaml epochs=1 imgsz=32
yolo detect train data=coco8.yaml model=yolov8n.pt epochs=1 imgsz=32
yolo detect val data=coco8.yaml model=runs/detect/train/weights/last.pt imgsz=32
yolo detect predict model=runs/detect/train/weights/last.pt imgsz=32 source=ultralytics/assets/bus.jpg
yolo export model=runs/detect/train/weights/last.pt imgsz=32 format=torchscript
- name: Test Segment
shell: bash # for Windows compatibility
run: |
yolo segment train data=coco8-seg.yaml model=yolov8n-seg.yaml epochs=1 imgsz=32
yolo segment train data=coco8-seg.yaml model=yolov8n-seg.pt epochs=1 imgsz=32
yolo segment val data=coco8-seg.yaml model=runs/segment/train/weights/last.pt imgsz=32
yolo segment predict model=runs/segment/train/weights/last.pt imgsz=32 source=ultralytics/assets/bus.jpg
yolo export model=runs/segment/train/weights/last.pt imgsz=32 format=torchscript
- name: Test Classify
shell: bash # for Windows compatibility
run: |
yolo classify train data=imagenet10 model=yolov8n-cls.yaml epochs=1 imgsz=32
yolo classify train data=imagenet10 model=yolov8n-cls.pt epochs=1 imgsz=32
yolo classify val data=imagenet10 model=runs/classify/train/weights/last.pt imgsz=32
yolo classify predict model=runs/classify/train/weights/last.pt imgsz=32 source=ultralytics/assets/bus.jpg
yolo export model=runs/classify/train/weights/last.pt imgsz=32 format=torchscript
- name: Test Pose
shell: bash # for Windows compatibility
run: |
yolo pose train data=coco8-pose.yaml model=yolov8n-pose.yaml epochs=1 imgsz=32
yolo pose train data=coco8-pose.yaml model=yolov8n-pose.pt epochs=1 imgsz=32
yolo pose val data=coco8-pose.yaml model=runs/pose/train/weights/last.pt imgsz=32
yolo pose predict model=runs/pose/train/weights/last.pt imgsz=32 source=ultralytics/assets/bus.jpg
yolo export model=runs/pose/train/weights/last.pt imgsz=32 format=torchscript
- name: Pytest tests
shell: bash # for Windows compatibility
run: pytest tests
run: pytest --cov=ultralytics/ --cov-report xml tests/
- name: Upload Coverage Reports to CodeCov
if: github.repository == 'ultralytics/ultralytics' && matrix.os == 'ubuntu-latest' && matrix.python-version == '3.11'
uses: codecov/codecov-action@v3
with:
flags: Tests
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
Summary:
runs-on: ubuntu-latest
@ -201,7 +203,7 @@ jobs:
steps:
- name: Check for failure and notify
if: (needs.HUB.result == 'failure' || needs.Benchmarks.result == 'failure' || needs.Tests.result == 'failure') && github.repository == 'ultralytics/ultralytics' && (github.event_name == 'schedule' || github.event_name == 'push')
uses: slackapi/slack-github-action@v1.23.0
uses: slackapi/slack-github-action@v1.24.0
with:
payload: |
{"text": "<!channel> GitHub Actions error for ${{ github.workflow }} ❌\n\n\n*Repository:* https://github.com/${{ github.repository }}\n*Action:* https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}\n*Author:* ${{ github.actor }}\n*Event:* ${{ github.event_name }}\n"}

@ -6,12 +6,47 @@ name: Publish Docker Images
on:
push:
branches: [main]
workflow_dispatch:
inputs:
dockerfile:
type: choice
description: Select Dockerfile
options:
- Dockerfile-arm64
- Dockerfile-jetson
- Dockerfile-python
- Dockerfile-cpu
- Dockerfile
push:
type: boolean
description: Push image to Docker Hub
default: true
jobs:
docker:
if: github.repository == 'ultralytics/ultralytics'
name: Push Docker image to Docker Hub
name: Push
runs-on: ubuntu-latest
strategy:
fail-fast: false
max-parallel: 5
matrix:
include:
- dockerfile: "Dockerfile-arm64"
tags: "latest-arm64"
platforms: "linux/arm64"
- dockerfile: "Dockerfile-jetson"
tags: "latest-jetson"
platforms: "linux/arm64"
- dockerfile: "Dockerfile-python"
tags: "latest-python"
platforms: "linux/amd64"
- dockerfile: "Dockerfile-cpu"
tags: "latest-cpu"
platforms: "linux/amd64"
- dockerfile: "Dockerfile"
tags: "latest"
platforms: "linux/amd64"
steps:
- name: Checkout repo
uses: actions/checkout@v3
@ -28,40 +63,66 @@ jobs:
username: ${{ secrets.DOCKERHUB_USERNAME }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build and push arm64 image
uses: docker/build-push-action@v4
continue-on-error: true
with:
context: .
platforms: linux/arm64
file: docker/Dockerfile-arm64
push: true
tags: ultralytics/ultralytics:latest-arm64
- name: Retrieve Ultralytics version
id: get_version
run: |
VERSION=$(grep "^__version__ =" ultralytics/__init__.py | awk -F"'" '{print $2}')
echo "Retrieved Ultralytics version: $VERSION"
echo "version=$VERSION" >> $GITHUB_OUTPUT
- name: Build and push Jetson image
uses: docker/build-push-action@v4
continue-on-error: true
with:
context: .
platforms: linux/arm64
file: docker/Dockerfile-jetson
push: true
tags: ultralytics/ultralytics:latest-jetson
VERSION_TAG=$(echo "${{ matrix.tags }}" | sed "s/latest/${VERSION}/")
echo "Intended version tag: $VERSION_TAG"
echo "version_tag=$VERSION_TAG" >> $GITHUB_OUTPUT
- name: Build and push CPU image
uses: docker/build-push-action@v4
continue-on-error: true
with:
context: .
file: docker/Dockerfile-cpu
push: true
tags: ultralytics/ultralytics:latest-cpu
- name: Check if version tag exists on DockerHub
id: check_tag
run: |
RESPONSE=$(curl -s https://hub.docker.com/v2/repositories/ultralytics/ultralytics/tags/$VERSION_TAG)
MESSAGE=$(echo $RESPONSE | jq -r '.message')
if [[ "$MESSAGE" == "null" ]]; then
echo "Tag $VERSION_TAG already exists on DockerHub."
echo "exists=true" >> $GITHUB_OUTPUT
elif [[ "$MESSAGE" == *"404"* ]]; then
echo "Tag $VERSION_TAG does not exist on DockerHub."
echo "exists=false" >> $GITHUB_OUTPUT
else
echo "Unexpected response from DockerHub. Please check manually."
echo "exists=false" >> $GITHUB_OUTPUT
fi
env:
VERSION_TAG: ${{ steps.get_version.outputs.version_tag }}
- name: Build Image
if: github.event_name == 'push' || github.event.inputs.dockerfile == matrix.dockerfile
run: |
docker build --platform ${{ matrix.platforms }} -f docker/${{ matrix.dockerfile }} \
-t ultralytics/ultralytics:${{ matrix.tags }} \
-t ultralytics/ultralytics:${{ steps.get_version.outputs.version_tag }} .
- name: Run Tests
if: (github.event_name == 'push' || github.event.inputs.dockerfile == matrix.dockerfile) && matrix.platforms == 'linux/amd64' # arm64 images not supported on GitHub CI runners
run: docker run ultralytics/ultralytics:${{ matrix.tags }} /bin/bash -c "pip install pytest && pytest tests"
- name: Run Benchmarks
# WARNING: Dockerfile (GPU) error on TF.js export 'module 'numpy' has no attribute 'object'.
if: (github.event_name == 'push' || github.event.inputs.dockerfile == matrix.dockerfile) && matrix.platforms == 'linux/amd64' && matrix.dockerfile != 'Dockerfile' # arm64 images not supported on GitHub CI runners
run: docker run ultralytics/ultralytics:${{ matrix.tags }} yolo benchmark model=yolov8n.pt imgsz=160 verbose=0.26
- name: Push Docker Image with Ultralytics version tag
if: (github.event_name == 'push' || (github.event.inputs.dockerfile == matrix.dockerfile && github.event.inputs.push == 'true')) && steps.check_tag.outputs.exists == 'false'
run: |
docker push ultralytics/ultralytics:${{ steps.get_version.outputs.version_tag }}
- name: Push Docker Image with latest tag
if: github.event_name == 'push' || (github.event.inputs.dockerfile == matrix.dockerfile && github.event.inputs.push == 'true')
run: |
docker push ultralytics/ultralytics:${{ matrix.tags }}
- name: Build and push GPU image
uses: docker/build-push-action@v4
continue-on-error: true
- name: Notify on failure
if: github.event_name == 'push' && failure() # do not notify on cancelled() as cancelling is performed by hand
uses: slackapi/slack-github-action@v1.24.0
with:
context: .
file: docker/Dockerfile
push: true
tags: ultralytics/ultralytics:latest
payload: |
{"text": "<!channel> GitHub Actions error for ${{ github.workflow }} ❌\n\n\n*Repository:* https://github.com/${{ github.repository }}\n*Action:* https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}\n*Author:* ${{ github.actor }}\n*Event:* ${{ github.event_name }}\n"}
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_YOLO }}

@ -32,9 +32,11 @@ jobs:
If this is a custom training ❓ Question, please provide as much information as possible, including dataset image examples and training logs, and verify you are following our [Tips for Best Training Results](https://docs.ultralytics.com/yolov5/tutorials/tips_for_best_training_results/).
Join the vibrant [Ultralytics Discord](https://ultralytics.com/discord) 🎧 community for real-time conversations and collaborations. This platform offers a perfect space to inquire, showcase your work, and connect with fellow Ultralytics users.
## Install
Pip install the `ultralytics` package including all [requirements](https://github.com/ultralytics/ultralytics/blob/main/requirements.txt) in a [**Python>=3.7**](https://www.python.org/) environment with [**PyTorch>=1.7**](https://pytorch.org/get-started/locally/).
Pip install the `ultralytics` package including all [requirements](https://github.com/ultralytics/ultralytics/blob/main/requirements.txt) in a [**Python>=3.8**](https://www.python.org/) environment with [**PyTorch>=1.8**](https://pytorch.org/get-started/locally/).
```bash
pip install ultralytics

@ -28,7 +28,7 @@ jobs:
timeout_minutes: 5
retry_wait_seconds: 60
max_attempts: 3
command: lychee --accept 429,999 --exclude-loopback --exclude twitter.com --exclude-path '**/ci.yaml' --exclude-mail --github-token ${{ secrets.GITHUB_TOKEN }} './**/*.md' './**/*.html'
command: lychee --accept 429,999 --exclude-loopback --exclude 'https?://(www\.)?(linkedin\.com|twitter\.com|instagram\.com)' --exclude-path '**/ci.yaml' --exclude-mail --github-token ${{ secrets.GITHUB_TOKEN }} './**/*.md' './**/*.html'
- name: Test Markdown, HTML, YAML, Python and Notebook links with retry
if: github.event_name == 'workflow_dispatch'
@ -37,4 +37,4 @@ jobs:
timeout_minutes: 5
retry_wait_seconds: 60
max_attempts: 3
command: lychee --accept 429,999 --exclude-loopback --exclude twitter.com,url.com --exclude-path '**/ci.yaml' --exclude-mail --github-token ${{ secrets.GITHUB_TOKEN }} './**/*.md' './**/*.html' './**/*.yml' './**/*.yaml' './**/*.py' './**/*.ipynb'
command: lychee --accept 429,999 --exclude-loopback --exclude 'https?://(www\.)?(linkedin\.com|twitter\.com|instagram\.com|url\.com)' --exclude-path '**/ci.yaml' --exclude-mail --github-token ${{ secrets.GITHUB_TOKEN }} './**/*.md' './**/*.html' './**/*.yml' './**/*.yaml' './**/*.py' './**/*.ipynb'

@ -23,6 +23,8 @@ jobs:
steps:
- name: Checkout code
uses: actions/checkout@v3
with:
fetch-depth: "0" # pulls all commits (needed correct last updated dates in Docs)
- name: Set up Python environment
uses: actions/setup-python@v4
with:
@ -31,14 +33,14 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip wheel build twine
pip install -e '.[dev]' --extra-index-url https://download.pytorch.org/whl/cpu
pip install -e ".[dev]" --extra-index-url https://download.pytorch.org/whl/cpu
- name: Check PyPI version
shell: python
run: |
import os
import pkg_resources as pkg
import ultralytics
from ultralytics.yolo.utils.checks import check_latest_pypi_version
from ultralytics.utils.checks import check_latest_pypi_version
v_local = pkg.parse_version(ultralytics.__version__).release
v_pypi = pkg.parse_version(check_latest_pypi_version()).release
@ -61,7 +63,7 @@ jobs:
python -m twine upload dist/* -u __token__ -p $PYPI_TOKEN
- name: Deploy Docs
continue-on-error: true
if: (github.event_name == 'push' && steps.check_pypi.outputs.increment == 'True') || github.event.inputs.docs == 'true'
if: (github.event_name == 'push' || github.event.inputs.docs == 'true') && github.repository == 'ultralytics/ultralytics' && github.actor == 'glenn-jocher'
env:
PERSONAL_ACCESS_TOKEN: ${{ secrets.PERSONAL_ACCESS_TOKEN }}
run: |
@ -92,7 +94,7 @@ jobs:
echo "PR_TITLE=$PR_TITLE" >> $GITHUB_ENV
- name: Notify on Slack (Success)
if: success() && github.event_name == 'push' && steps.check_pypi.outputs.increment == 'True'
uses: slackapi/slack-github-action@v1.23.0
uses: slackapi/slack-github-action@v1.24.0
with:
payload: |
{"text": "<!channel> GitHub Actions success for ${{ github.workflow }} ✅\n\n\n*Repository:* https://github.com/${{ github.repository }}\n*Action:* https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}\n*Author:* ${{ github.actor }}\n*Event:* NEW 'ultralytics ${{ steps.check_pypi.outputs.version }}' pip package published 😃\n*Job Status:* ${{ job.status }}\n*Pull Request:* <https://github.com/${{ github.repository }}/pull/${{ env.PR_NUMBER }}> ${{ env.PR_TITLE }}\n"}
@ -100,7 +102,7 @@ jobs:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_YOLO }}
- name: Notify on Slack (Failure)
if: failure()
uses: slackapi/slack-github-action@v1.23.0
uses: slackapi/slack-github-action@v1.24.0
with:
payload: |
{"text": "<!channel> GitHub Actions error for ${{ github.workflow }} ❌\n\n\n*Repository:* https://github.com/${{ github.repository }}\n*Action:* https://github.com/${{ github.repository }}/actions/runs/${{ github.run_id }}\n*Author:* ${{ github.actor }}\n*Event:* ${{ github.event_name }}\n*Job Status:* ${{ job.status }}\n*Pull Request:* <https://github.com/${{ github.repository }}/pull/${{ env.PR_NUMBER }}> ${{ env.PR_TITLE }}\n"}

10
.gitignore vendored

@ -107,6 +107,7 @@ celerybeat.pid
# Environments
.env
.venv
.idea
env/
venv/
ENV/
@ -117,11 +118,15 @@ venv.bak/
.spyderproject
.spyproject
# VSCode project settings
.vscode/
# Rope project settings
.ropeproject
# mkdocs documentation
/site
mkdocs_github_authors.yaml
# mypy
.mypy_cache/
@ -135,7 +140,6 @@ dmypy.json
datasets/
runs/
wandb/
.DS_Store
# Neural Network weights -----------------------------------------------------------------------------------------------
@ -146,6 +150,7 @@ weights/
*.onnx
*.engine
*.mlmodel
*.mlpackage
*.torchscript
*.tflite
*.h5
@ -153,3 +158,6 @@ weights/
*_web_model/
*_openvino_model/
*_paddle_model/
# Autogenerated files for tests
/ultralytics/assets/

@ -22,7 +22,7 @@ repos:
- id: detect-private-key
- repo: https://github.com/asottile/pyupgrade
rev: v3.3.1
rev: v3.10.1
hooks:
- id: pyupgrade
name: Upgrade code
@ -34,7 +34,7 @@ repos:
name: Sort imports
- repo: https://github.com/google/yapf
rev: v0.32.0
rev: v0.40.0
hooks:
- id: yapf
name: YAPF formatting
@ -50,17 +50,17 @@ repos:
# exclude: "README.md|README.zh-CN.md|CONTRIBUTING.md"
- repo: https://github.com/PyCQA/flake8
rev: 6.0.0
rev: 6.1.0
hooks:
- id: flake8
name: PEP8
- repo: https://github.com/codespell-project/codespell
rev: v2.2.4
rev: v2.2.5
hooks:
- id: codespell
args:
- --ignore-words-list=crate,nd,strack,dota
- --ignore-words-list=crate,nd,strack,dota,ane,segway,fo
# - repo: https://github.com/asottile/yesqa
# rev: v1.4.0

@ -65,7 +65,7 @@ Here is an example:
```python
"""
What the function does. Performs NMS on given detection predictions.
What the function does. Performs NMS on given detection predictions.
Args:
arg1: The description of the 1st argument

@ -9,6 +9,7 @@
<div>
<a href="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml"><img src="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml/badge.svg" alt="Ultralytics CI"></a>
<a href="https://codecov.io/github/ultralytics/ultralytics"><img src="https://codecov.io/github/ultralytics/ultralytics/branch/main/graph/badge.svg?token=HHW7IIVFVY" alt="Ultralytics Code Coverage"></a>
<a href="https://zenodo.org/badge/latestdoi/264818686"><img src="https://zenodo.org/badge/264818686.svg" alt="YOLOv8 Citation"></a>
<a href="https://hub.docker.com/r/ultralytics/ultralytics"><img src="https://img.shields.io/docker/pulls/ultralytics/ultralytics?logo=docker" alt="Docker Pulls"></a>
<br>
@ -20,7 +21,7 @@
[Ultralytics](https://ultralytics.com) [YOLOv8](https://github.com/ultralytics/ultralytics) is a cutting-edge, state-of-the-art (SOTA) model that builds upon the success of previous YOLO versions and introduces new features and improvements to further boost performance and flexibility. YOLOv8 is designed to be fast, accurate, and easy to use, making it an excellent choice for a wide range of object detection and tracking, instance segmentation, image classification and pose estimation tasks.
We hope that the resources here will help you get the most out of YOLOv8. Please browse the YOLOv8 <a href="https://docs.ultralytics.com/">Docs</a> for details, raise an issue on <a href="https://github.com/ultralytics/ultralytics/issues/new/choose">GitHub</a> for support, and join our <a href="https://discord.gg/n6cFeSPZdD">Discord</a> community for questions and discussions!
We hope that the resources here will help you get the most out of YOLOv8. Please browse the YOLOv8 <a href="https://docs.ultralytics.com/">Docs</a> for details, raise an issue on <a href="https://github.com/ultralytics/ultralytics/issues/new/choose">GitHub</a> for support, and join our <a href="https://ultralytics.com/discord">Discord</a> community for questions and discussions!
To request an Enterprise License please complete the form at [Ultralytics Licensing](https://ultralytics.com/license).
@ -30,7 +31,7 @@ To request an Enterprise License please complete the form at [Ultralytics Licens
<a href="https://github.com/ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://www.linkedin.com/company/ultralytics" style="text-decoration:none;">
<a href="https://www.linkedin.com/company/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://twitter.com/ultralytics" style="text-decoration:none;">
@ -45,7 +46,7 @@ To request an Enterprise License please complete the form at [Ultralytics Licens
<a href="https://www.instagram.com/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://discord.gg/n6cFeSPZdD" style="text-decoration:none;">
<a href="https://ultralytics.com/discord" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/blob/main/social/logo-social-discord.png" width="2%" alt="" /></a>
</div>
</div>
@ -57,12 +58,16 @@ See below for a quickstart installation and usage example, and see the [YOLOv8 D
<details open>
<summary>Install</summary>
Pip install the ultralytics package including all [requirements](https://github.com/ultralytics/ultralytics/blob/main/requirements.txt) in a [**Python>=3.7**](https://www.python.org/) environment with [**PyTorch>=1.7**](https://pytorch.org/get-started/locally/).
Pip install the ultralytics package including all [requirements](https://github.com/ultralytics/ultralytics/blob/main/requirements.txt) in a [**Python>=3.8**](https://www.python.org/) environment with [**PyTorch>=1.8**](https://pytorch.org/get-started/locally/).
[![PyPI version](https://badge.fury.io/py/ultralytics.svg)](https://badge.fury.io/py/ultralytics) [![Downloads](https://static.pepy.tech/badge/ultralytics)](https://pepy.tech/project/ultralytics)
```bash
pip install ultralytics
```
For alternative installation methods including [Conda](https://anaconda.org/conda-forge/ultralytics), [Docker](https://hub.docker.com/r/ultralytics/ultralytics), and Git, please refer to the [Quickstart Guide](https://docs.ultralytics.com/quickstart).
</details>
<details open>
@ -93,18 +98,20 @@ model = YOLO("yolov8n.pt") # load a pretrained model (recommended for training)
model.train(data="coco128.yaml", epochs=3) # train the model
metrics = model.val() # evaluate model performance on the validation set
results = model("https://ultralytics.com/images/bus.jpg") # predict on an image
success = model.export(format="onnx") # export the model to ONNX format
path = model.export(format="onnx") # export the model to ONNX format
```
[Models](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models) download automatically from the latest Ultralytics [release](https://github.com/ultralytics/assets/releases). See YOLOv8 [Python Docs](https://docs.ultralytics.com/usage/python) for more examples.
[Models](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models) download automatically from the latest Ultralytics [release](https://github.com/ultralytics/assets/releases). See YOLOv8 [Python Docs](https://docs.ultralytics.com/usage/python) for more examples.
</details>
## <div align="center">Models</div>
All YOLOv8 pretrained models are available here. Detect, Segment and Pose models are pretrained on the [COCO](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/datasets/coco.yaml) dataset, while Classify models are pretrained on the [ImageNet](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/datasets/ImageNet.yaml) dataset.
YOLOv8 [Detect](https://docs.ultralytics.com/tasks/detect), [Segment](https://docs.ultralytics.com/tasks/segment) and [Pose](https://docs.ultralytics.com/tasks/pose) models pretrained on the [COCO](https://docs.ultralytics.com/datasets/detect/coco) dataset are available here, as well as YOLOv8 [Classify](https://docs.ultralytics.com/tasks/classify) models pretrained on the [ImageNet](https://docs.ultralytics.com/datasets/classify/imagenet) dataset. [Track](https://docs.ultralytics.com/modes/track) mode is available for all Detect, Segment and Pose models.
[Models](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models) download automatically from the latest Ultralytics [release](https://github.com/ultralytics/assets/releases) on first use.
<img width="1024" src="https://raw.githubusercontent.com/ultralytics/assets/main/im/banner-tasks.png">
All [Models](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models) download automatically from the latest Ultralytics [release](https://github.com/ultralytics/assets/releases) on first use.
<details open><summary>Detection</summary>
@ -186,6 +193,8 @@ See [Pose Docs](https://docs.ultralytics.com/tasks/pose) for usage examples with
## <div align="center">Integrations</div>
Our key integrations with leading AI platforms extend the functionality of Ultralytics' offerings, enhancing tasks like dataset labeling, training, visualization, and model management. Discover how Ultralytics, in collaboration with [Roboflow](https://roboflow.com/?ref=ultralytics), ClearML, [Comet](https://bit.ly/yolov8-readme-comet), Neural Magic and [OpenVINO](https://docs.ultralytics.com/integrations/openvino), can optimize your AI workflow.
<br>
<a href="https://bit.ly/ultralytics_hub" target="_blank">
<img width="100%" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png"></a>
@ -228,21 +237,21 @@ We love your input! YOLOv5 and YOLOv8 would not be possible without help from ou
## <div align="center">License</div>
YOLOv8 is available under two different licenses:
Ultralytics offers two licensing options to accommodate diverse use cases:
- **AGPL-3.0 License**: See [LICENSE](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) file for details.
- **Enterprise License**: Provides greater flexibility for commercial product development without the open-source requirements of AGPL-3.0. Typical use cases are embedding Ultralytics software and AI models in commercial products and applications. Request an Enterprise License at [Ultralytics Licensing](https://ultralytics.com/license).
- **AGPL-3.0 License**: This [OSI-approved](https://opensource.org/licenses/) open-source license is ideal for students and enthusiasts, promoting open collaboration and knowledge sharing. See the [LICENSE](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) file for more details.
- **Enterprise License**: Designed for commercial use, this license permits seamless integration of Ultralytics software and AI models into commercial goods and services, bypassing the open-source requirements of AGPL-3.0. If your scenario involves embedding our solutions into a commercial offering, reach out through [Ultralytics Licensing](https://ultralytics.com/license).
## <div align="center">Contact</div>
For YOLOv8 bug reports and feature requests please visit [GitHub Issues](https://github.com/ultralytics/ultralytics/issues), and join our [Discord](https://discord.gg/n6cFeSPZdD) community for questions and discussions!
For Ultralytics bug reports and feature requests please visit [GitHub Issues](https://github.com/ultralytics/ultralytics/issues), and join our [Discord](https://ultralytics.com/discord) community for questions and discussions!
<br>
<div align="center">
<a href="https://github.com/ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="3%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="" />
<a href="https://www.linkedin.com/company/ultralytics" style="text-decoration:none;">
<a href="https://www.linkedin.com/company/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="3%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="" />
<a href="https://twitter.com/ultralytics" style="text-decoration:none;">
@ -257,6 +266,6 @@ For YOLOv8 bug reports and feature requests please visit [GitHub Issues](https:/
<a href="https://www.instagram.com/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="3%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="" />
<a href="https://discord.gg/n6cFeSPZdD" style="text-decoration:none;">
<a href="https://ultralytics.com/discord" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/blob/main/social/logo-social-discord.png" width="3%" alt="" /></a>
</div>

@ -9,6 +9,7 @@
<div>
<a href="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml"><img src="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml/badge.svg" alt="Ultralytics CI"></a>
<a href="https://codecov.io/github/ultralytics/ultralytics"><img src="https://codecov.io/github/ultralytics/ultralytics/branch/main/graph/badge.svg?token=HHW7IIVFVY" alt="Ultralytics Code Coverage"></a>
<a href="https://zenodo.org/badge/latestdoi/264818686"><img src="https://zenodo.org/badge/264818686.svg" alt="YOLOv8 Citation"></a>
<a href="https://hub.docker.com/r/ultralytics/ultralytics"><img src="https://img.shields.io/docker/pulls/ultralytics/ultralytics?logo=docker" alt="Docker Pulls"></a>
<br>
@ -20,7 +21,7 @@
[Ultralytics](https://ultralytics.com) [YOLOv8](https://github.com/ultralytics/ultralytics) 是一款前沿、最先进(SOTA)的模型,基于先前 YOLO 版本的成功,引入了新功能和改进,进一步提升性能和灵活性。YOLOv8 设计快速、准确且易于使用,使其成为各种物体检测与跟踪、实例分割、图像分类和姿态估计任务的绝佳选择。
我们希望这里的资源能帮助您充分利用 YOLOv8。请浏览 YOLOv8 <a href="https://docs.ultralytics.com/">文档</a> 了解详细信息,在 <a href="https://github.com/ultralytics/ultralytics/issues/new/choose">GitHub</a> 上提交问题以获得支持,并加入我们的 <a href="https://discord.gg/n6cFeSPZdD">Discord</a> 社区进行问题和讨论!
我们希望这里的资源能帮助您充分利用 YOLOv8。请浏览 YOLOv8 <a href="https://docs.ultralytics.com/">文档</a> 了解详细信息,在 <a href="https://github.com/ultralytics/ultralytics/issues/new/choose">GitHub</a> 上提交问题以获得支持,并加入我们的 <a href="https://ultralytics.com/discord">Discord</a> 社区进行问题和讨论!
如需申请企业许可,请在 [Ultralytics Licensing](https://ultralytics.com/license) 处填写表格
@ -30,7 +31,7 @@
<a href="https://github.com/ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://www.linkedin.com/company/ultralytics" style="text-decoration:none;">
<a href="https://www.linkedin.com/company/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://twitter.com/ultralytics" style="text-decoration:none;">
@ -45,7 +46,7 @@
<a href="https://www.instagram.com/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://discord.gg/n6cFeSPZdD" style="text-decoration:none;">
<a href="https://ultralytics.com/discord" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/blob/main/social/logo-social-discord.png" width="2%" alt="" /></a>
</div>
</div>
@ -57,12 +58,16 @@
<details open>
<summary>安装</summary>
在一个 [**Python>=3.7**](https://www.python.org/) 环境中,使用 [**PyTorch>=1.7**](https://pytorch.org/get-started/locally/),通过 pip 安装 ultralytics 软件包以及所有[依赖项](https://github.com/ultralytics/ultralytics/blob/main/requirements.txt)。
使用Pip在一个[**Python>=3.8**](https://www.python.org/)环境中安装`ultralytics`包,此环境还需包含[**PyTorch>=1.8**](https://pytorch.org/get-started/locally/)。这也会安装所有必要的[依赖项](https://github.com/ultralytics/ultralytics/blob/main/requirements.txt)。
[![PyPI version](https://badge.fury.io/py/ultralytics.svg)](https://badge.fury.io/py/ultralytics) [![Downloads](https://static.pepy.tech/badge/ultralytics)](https://pepy.tech/project/ultralytics)
```bash
pip install ultralytics
```
如需使用包括[Conda](https://anaconda.org/conda-forge/ultralytics)、[Docker](https://hub.docker.com/r/ultralytics/ultralytics)和Git在内的其他安装方法,请参考[快速入门指南](https://docs.ultralytics.com/quickstart)。
</details>
<details open>
@ -96,15 +101,17 @@ results = model("https://ultralytics.com/images/bus.jpg") # 对图像进行预
success = model.export(format="onnx") # 将模型导出为 ONNX 格式
```
[模型](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models) 会自动从最新的 Ultralytics [发布版本](https://github.com/ultralytics/assets/releases)中下载。查看 YOLOv8 [Python 文档](https://docs.ultralytics.com/usage/python)以获取更多示例。
[模型](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models) 会自动从最新的 Ultralytics [发布版本](https://github.com/ultralytics/assets/releases)中下载。查看 YOLOv8 [Python 文档](https://docs.ultralytics.com/usage/python)以获取更多示例。
</details>
## <div align="center">模型</div>
所有的 YOLOv8 预训练模型都可以在此找到。检测、分割和姿态模型在 [COCO](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/datasets/coco.yaml) 数据集上进行预训练,而分类模型在 [ImageNet](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/datasets/ImageNet.yaml) 数据集上进行预训练
在[COCO](https://docs.ultralytics.com/datasets/detect/coco)数据集上预训练的YOLOv8 [检测](https://docs.ultralytics.com/tasks/detect),[分割](https://docs.ultralytics.com/tasks/segment)和[姿态](https://docs.ultralytics.com/tasks/pose)模型可以在这里找到,以及在[ImageNet](https://docs.ultralytics.com/datasets/classify/imagenet)数据集上预训练的YOLOv8 [分类](https://docs.ultralytics.com/tasks/classify)模型。所有的检测,分割和姿态模型都支持[追踪](https://docs.ultralytics.com/modes/track)模式
在首次使用时,[模型](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/models) 会自动从最新的 Ultralytics [发布版本](https://github.com/ultralytics/assets/releases)中下载。
<img width="1024" src="https://raw.githubusercontent.com/ultralytics/assets/main/im/banner-tasks.png">
所有[模型](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/models)在首次使用时会自动从最新的Ultralytics [发布版本](https://github.com/ultralytics/assets/releases)下载。
<details open><summary>检测</summary>
@ -185,6 +192,8 @@ success = model.export(format="onnx") # 将模型导出为 ONNX 格式
## <div align="center">集成</div>
我们与领先的AI平台的关键整合扩展了Ultralytics产品的功能,增强了数据集标签化、训练、可视化和模型管理等任务。探索Ultralytics如何与[Roboflow](https://roboflow.com/?ref=ultralytics)、ClearML、[Comet](https://bit.ly/yolov8-readme-comet)、Neural Magic以及[OpenVINO](https://docs.ultralytics.com/integrations/openvino)合作,优化您的AI工作流程。
<br>
<a href="https://bit.ly/ultralytics_hub" target="_blank">
<img width="100%" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png"></a>
@ -227,21 +236,21 @@ success = model.export(format="onnx") # 将模型导出为 ONNX 格式
## <div align="center">许可证</div>
YOLOv8 提供两种不同的许可证
Ultralytics 提供两种许可证选项以适应各种使用场景
- **AGPL-3.0 许可证**详细信息请参阅 [LICENSE](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) 文件。
- **企业许可证**为商业产品开发提供更大的灵活性,无需遵循 AGPL-3.0 的开源要求。典型的用例是将 Ultralytics 软件和 AI 模型嵌入商业产品和应用中。在 [Ultralytics 授权](https://ultralytics.com/license) 处申请企业许可证
- **AGPL-3.0 许可证**这个[OSI 批准](https://opensource.org/licenses/)的开源许可证非常适合学生和爱好者,可以推动开放的协作和知识分享。请查看[LICENSE](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) 文件以了解更多细节
- **企业许可证**专为商业用途设计,该许可证允许将 Ultralytics 的软件和 AI 模型无缝集成到商业产品和服务中,从而绕过 AGPL-3.0 的开源要求。如果您的场景涉及将我们的解决方案嵌入到商业产品中,请通过 [Ultralytics Licensing](https://ultralytics.com/license)与我们联系
## <div align="center">联系方式</div>
对于 YOLOv8 的错误报告和功能请求,请访问 [GitHub Issues](https://github.com/ultralytics/ultralytics/issues),并加入我们的 [Discord](https://discord.gg/n6cFeSPZdD) 社区进行问题和讨论!
对于 Ultralytics 的错误报告和功能请求,请访问 [GitHub Issues](https://github.com/ultralytics/ultralytics/issues),并加入我们的 [Discord](https://ultralytics.com/discord) 社区进行问题和讨论!
<br>
<div align="center">
<a href="https://github.com/ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="3%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="" />
<a href="https://www.linkedin.com/company/ultralytics" style="text-decoration:none;">
<a href="https://www.linkedin.com/company/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="3%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="" />
<a href="https://twitter.com/ultralytics" style="text-decoration:none;">
@ -255,6 +264,6 @@ YOLOv8 提供两种不同的许可证:
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="3%" alt="" />
<a href="https://www.instagram.com/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="3%" alt="" /></a>
<a href="https://discord.gg/n6cFeSPZdD" style="text-decoration:none;">
<a href="https://ultralytics.com/discord" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/blob/main/social/logo-social-discord.png" width="3%" alt="" /></a>
</div>

@ -3,15 +3,16 @@
# Image is CUDA-optimized for YOLOv8 single/multi-GPU training and inference
# Start FROM PyTorch image https://hub.docker.com/r/pytorch/pytorch or nvcr.io/nvidia/pytorch:23.03-py3
FROM pytorch/pytorch:2.0.0-cuda11.7-cudnn8-runtime
FROM pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
RUN pip install --no-cache nvidia-tensorrt --index-url https://pypi.ngc.nvidia.com
# Downloads to user config dir
ADD https://ultralytics.com/assets/Arial.ttf https://ultralytics.com/assets/Arial.Unicode.ttf /root/.config/Ultralytics/
# Install linux packages
# g++ required to build 'tflite_support' package
# g++ required to build 'tflite_support' and 'lap' packages, libusb-1.0-0 required for 'tflite_support' package
RUN apt update \
&& apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++
&& apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0
# RUN alias python=python3
# Security updates
@ -19,7 +20,6 @@ RUN apt update \
RUN apt upgrade --no-install-recommends -y openssl tar
# Create working directory
RUN mkdir -p /usr/src/ultralytics
WORKDIR /usr/src/ultralytics
# Copy contents
@ -29,10 +29,22 @@ ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt /u
# Install pip packages
RUN python3 -m pip install --upgrade pip wheel
RUN pip install --no-cache -e . albumentations comet tensorboard
RUN pip install --no-cache -e ".[export]" thop albumentations comet pycocotools
# Run exports to AutoInstall packages
RUN yolo export model=tmp/yolov8n.pt format=edgetpu imgsz=32
RUN yolo export model=tmp/yolov8n.pt format=ncnn imgsz=32
# Requires <= Python 3.10, bug with paddlepaddle==2.5.0
RUN pip install --no-cache paddlepaddle==2.4.2 x2paddle
# Fix error: `np.bool` was a deprecated alias for the builtin `bool`
RUN pip install --no-cache numpy==1.23.5
# Remove exported models
RUN rm -rf tmp
# Set environment variables
ENV OMP_NUM_THREADS=1
# Avoid DDP error "MKL_THREADING_LAYER=INTEL is incompatible with libgomp.so.1 library" https://github.com/pytorch/pytorch/issues/37377
ENV MKL_THREADING_LAYER=GNU
# Usage Examples -------------------------------------------------------------------------------------------------------

@ -9,12 +9,12 @@ FROM arm64v8/ubuntu:22.10
ADD https://ultralytics.com/assets/Arial.ttf https://ultralytics.com/assets/Arial.Unicode.ttf /root/.config/Ultralytics/
# Install linux packages
# g++ required to build 'tflite_support' and 'lap' packages, libusb-1.0-0 required for 'tflite_support' package
RUN apt update \
&& apt install --no-install-recommends -y python3-pip git zip curl htop gcc libgl1-mesa-glx libglib2.0-0 libpython3-dev
&& apt install --no-install-recommends -y python3-pip git zip curl htop gcc libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0
# RUN alias python=python3
# Create working directory
RUN mkdir -p /usr/src/ultralytics
WORKDIR /usr/src/ultralytics
# Copy contents
@ -24,7 +24,7 @@ ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt /u
# Install pip packages
RUN python3 -m pip install --upgrade pip wheel
RUN pip install --no-cache -e .
RUN pip install --no-cache -e . thop
# Usage Examples -------------------------------------------------------------------------------------------------------
@ -32,5 +32,8 @@ RUN pip install --no-cache -e .
# Build and Push
# t=ultralytics/ultralytics:latest-arm64 && sudo docker build --platform linux/arm64 -f docker/Dockerfile-arm64 -t $t . && sudo docker push $t
# Pull and Run
# Run
# t=ultralytics/ultralytics:latest-arm64 && sudo docker run -it --ipc=host $t
# Pull and Run with local volume mounted
# t=ultralytics/ultralytics:latest-arm64 && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/datasets:/usr/src/datasets $t

@ -3,19 +3,18 @@
# Image is CPU-optimized for ONNX, OpenVINO and PyTorch YOLOv8 deployments
# Start FROM Ubuntu image https://hub.docker.com/_/ubuntu
FROM ubuntu:22.10
FROM ubuntu:lunar-20230615
# Downloads to user config dir
ADD https://ultralytics.com/assets/Arial.ttf https://ultralytics.com/assets/Arial.Unicode.ttf /root/.config/Ultralytics/
# Install linux packages
# g++ required to build 'tflite_support' package
# g++ required to build 'tflite_support' and 'lap' packages, libusb-1.0-0 required for 'tflite_support' package
RUN apt update \
&& apt install --no-install-recommends -y python3-pip git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++
&& apt install --no-install-recommends -y python3-pip git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0
# RUN alias python=python3
# Create working directory
RUN mkdir -p /usr/src/ultralytics
WORKDIR /usr/src/ultralytics
# Copy contents
@ -23,15 +22,28 @@ WORKDIR /usr/src/ultralytics
RUN git clone https://github.com/ultralytics/ultralytics /usr/src/ultralytics
ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt /usr/src/ultralytics/
# Remove python3.11/EXTERNALLY-MANAGED or use 'pip install --break-system-packages' avoid 'externally-managed-environment' Ubuntu nightly error
RUN rm -rf /usr/lib/python3.11/EXTERNALLY-MANAGED
# Install pip packages
RUN python3 -m pip install --upgrade pip wheel
RUN pip install --no-cache -e . --extra-index-url https://download.pytorch.org/whl/cpu
RUN pip install --no-cache -e ".[export]" thop --extra-index-url https://download.pytorch.org/whl/cpu
# Run exports to AutoInstall packages
RUN yolo export model=tmp/yolov8n.pt format=edgetpu imgsz=32
RUN yolo export model=tmp/yolov8n.pt format=ncnn imgsz=32
# Requires <= Python 3.10, bug with paddlepaddle==2.5.0
# RUN pip install --no-cache paddlepaddle==2.4.2 x2paddle
# Remove exported models
RUN rm -rf tmp
# Usage Examples -------------------------------------------------------------------------------------------------------
# Build and Push
# t=ultralytics/ultralytics:latest-cpu && sudo docker build -f docker/Dockerfile-cpu -t $t . && sudo docker push $t
# Pull and Run
# Run
# t=ultralytics/ultralytics:latest-cpu && sudo docker run -it --ipc=host $t
# Pull and Run with local volume mounted
# t=ultralytics/ultralytics:latest-cpu && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/datasets:/usr/src/datasets $t

@ -9,13 +9,12 @@ FROM nvcr.io/nvidia/l4t-pytorch:r35.2.1-pth2.0-py3
ADD https://ultralytics.com/assets/Arial.ttf https://ultralytics.com/assets/Arial.Unicode.ttf /root/.config/Ultralytics/
# Install linux packages
# g++ required to build 'tflite_support' package
# g++ required to build 'tflite_support' and 'lap' packages, libusb-1.0-0 required for 'tflite_support' package
RUN apt update \
&& apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++
&& apt install --no-install-recommends -y gcc git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0
# RUN alias python=python3
# Create working directory
RUN mkdir -p /usr/src/ultralytics
WORKDIR /usr/src/ultralytics
# Copy contents
@ -23,10 +22,13 @@ WORKDIR /usr/src/ultralytics
RUN git clone https://github.com/ultralytics/ultralytics /usr/src/ultralytics
ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt /usr/src/ultralytics/
# Remove opencv-python from requirements.txt as it conflicts with opencv-python installed in base image
RUN grep -v '^opencv-python' requirements.txt > tmp.txt && mv tmp.txt requirements.txt
# Install pip packages manually for TensorRT compatibility https://github.com/NVIDIA/TensorRT/issues/2567
RUN python3 -m pip install --upgrade pip wheel
RUN pip install --no-cache tqdm matplotlib pyyaml psutil thop pandas onnx "numpy==1.23"
RUN pip install --no-cache -e . --no-deps
RUN pip install --no-cache tqdm matplotlib pyyaml psutil pandas onnx thop "numpy==1.23"
RUN pip install --no-cache -e .
# Set environment variables
ENV OMP_NUM_THREADS=1
@ -37,5 +39,8 @@ ENV OMP_NUM_THREADS=1
# Build and Push
# t=ultralytics/ultralytics:latest-jetson && sudo docker build --platform linux/arm64 -f docker/Dockerfile-jetson -t $t . && sudo docker push $t
# Pull and Run
# t=ultralytics/ultralytics:jetson && sudo docker pull $t && sudo docker run -it --runtime=nvidia $t
# Run
# t=ultralytics/ultralytics:latest-jetson && sudo docker run -it --ipc=host $t
# Pull and Run with NVIDIA runtime
# t=ultralytics/ultralytics:latest-jetson && sudo docker pull $t && sudo docker run -it --ipc=host --runtime=nvidia $t

@ -0,0 +1,49 @@
# Ultralytics YOLO 🚀, AGPL-3.0 license
# Builds ultralytics/ultralytics:latest-cpu image on DockerHub https://hub.docker.com/r/ultralytics/ultralytics
# Image is CPU-optimized for ONNX, OpenVINO and PyTorch YOLOv8 deployments
# Use the official Python 3.10 slim-bookworm as base image
FROM python:3.10-slim-bookworm
# Downloads to user config dir
ADD https://ultralytics.com/assets/Arial.ttf https://ultralytics.com/assets/Arial.Unicode.ttf /root/.config/Ultralytics/
# Install linux packages
# g++ required to build 'tflite_support' and 'lap' packages, libusb-1.0-0 required for 'tflite_support' package
RUN apt update \
&& apt install --no-install-recommends -y python3-pip git zip curl htop libgl1-mesa-glx libglib2.0-0 libpython3-dev gnupg g++ libusb-1.0-0
# RUN alias python=python3
# Create working directory
WORKDIR /usr/src/ultralytics
# Copy contents
# COPY . /usr/src/app (issues as not a .git directory)
RUN git clone https://github.com/ultralytics/ultralytics /usr/src/ultralytics
ADD https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8n.pt /usr/src/ultralytics/
# Remove python3.11/EXTERNALLY-MANAGED or use 'pip install --break-system-packages' avoid 'externally-managed-environment' Ubuntu nightly error
# RUN rm -rf /usr/lib/python3.11/EXTERNALLY-MANAGED
# Install pip packages
RUN python3 -m pip install --upgrade pip wheel
RUN pip install --no-cache -e ".[export]" thop --extra-index-url https://download.pytorch.org/whl/cpu
# Run exports to AutoInstall packages
RUN yolo export model=tmp/yolov8n.pt format=edgetpu imgsz=32
RUN yolo export model=tmp/yolov8n.pt format=ncnn imgsz=32
# Requires <= Python 3.10, bug with paddlepaddle==2.5.0
RUN pip install --no-cache paddlepaddle==2.4.2 x2paddle
# Remove exported models
RUN rm -rf tmp
# Usage Examples -------------------------------------------------------------------------------------------------------
# Build and Push
# t=ultralytics/ultralytics:latest-python && sudo docker build -f docker/Dockerfile-python -t $t . && sudo docker push $t
# Run
# t=ultralytics/ultralytics:latest-python && sudo docker run -it --ipc=host $t
# Pull and Run with local volume mounted
# t=ultralytics/ultralytics:latest-python && sudo docker pull $t && sudo docker run -it --ipc=host -v "$(pwd)"/datasets:/usr/src/datasets $t

@ -1 +1 @@
docs.ultralytics.com
docs.ultralytics.com

@ -1,3 +1,8 @@
---
description: Learn how to install Ultralytics in developer mode, build and serve it locally for testing, and deploy your documentation site on platforms like GitHub Pages, GitLab Pages, and Amazon S3.
keywords: Ultralytics, documentation, mkdocs, installation, developer mode, building, deployment, local server, GitHub Pages, GitLab Pages, Amazon S3
---
# Ultralytics Docs
Ultralytics Docs are deployed to [https://docs.ultralytics.com](https://docs.ultralytics.com).
@ -22,7 +27,7 @@ cd ultralytics
3. Install the package in developer mode using pip:
```bash
pip install -e '.[dev]'
pip install -e ".[dev]"
```
This will install the ultralytics package and its dependencies in developer mode, allowing you to make changes to the

@ -1,28 +1,26 @@
# Security Policy
---
description: Discover how Ultralytics ensures the safety of user data and systems. Check out the measures we have implemented, including Snyk and GitHub CodeQL Scanning.
keywords: Ultralytics, Security Policy, data security, open-source projects, Snyk scanning, CodeQL scanning, vulnerability detection, threat prevention
---
At [Ultralytics](https://ultralytics.com), the security of our users' data and systems is of utmost importance. To
ensure the safety and security of our [open-source projects](https://github.com/ultralytics), we have implemented
several measures to detect and prevent security vulnerabilities.
# Security Policy
[![ultralytics](https://snyk.io/advisor/python/ultralytics/badge.svg)](https://snyk.io/advisor/python/ultralytics)
At [Ultralytics](https://ultralytics.com), the security of our users' data and systems is of utmost importance. To ensure the safety and security of our [open-source projects](https://github.com/ultralytics), we have implemented several measures to detect and prevent security vulnerabilities.
## Snyk Scanning
We use [Snyk](https://snyk.io/advisor/python/ultralytics) to regularly scan the YOLOv8 repository for vulnerabilities
and security issues. Our goal is to identify and remediate any potential threats as soon as possible, to minimize any
risks to our users.
We use [Snyk](https://snyk.io/advisor/python/ultralytics) to regularly scan all Ultralytics repositories for vulnerabilities and security issues. Our goal is to identify and remediate any potential threats as soon as possible, to minimize any risks to our users.
[![ultralytics](https://snyk.io/advisor/python/ultralytics/badge.svg)](https://snyk.io/advisor/python/ultralytics)
## GitHub CodeQL Scanning
In addition to our Snyk scans, we also use
GitHub's [CodeQL](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/about-code-scanning-with-codeql)
scans to proactively identify and address security vulnerabilities.
In addition to our Snyk scans, we also use GitHub's [CodeQL](https://docs.github.com/en/code-security/code-scanning/automatically-scanning-your-code-for-vulnerabilities-and-errors/about-code-scanning-with-codeql) scans to proactively identify and address security vulnerabilities across all Ultralytics repositories.
[![CodeQL](https://github.com/ultralytics/ultralytics/actions/workflows/codeql.yaml/badge.svg)](https://github.com/ultralytics/ultralytics/actions/workflows/codeql.yaml)
## Reporting Security Issues
If you suspect or discover a security vulnerability in the YOLOv8 repository, please let us know immediately. You can
reach out to us directly via our [contact form](https://ultralytics.com/contact) or
via [security@ultralytics.com](mailto:security@ultralytics.com). Our security team will investigate and respond as soon
as possible.
If you suspect or discover a security vulnerability in any of our repositories, please let us know immediately. You can reach out to us directly via our [contact form](https://ultralytics.com/contact) or via [security@ultralytics.com](mailto:security@ultralytics.com). Our security team will investigate and respond as soon as possible.
We appreciate your help in keeping the YOLOv8 repository secure and safe for everyone.
We appreciate your help in keeping all Ultralytics open-source projects secure and safe for everyone.

@ -10,7 +10,8 @@ import os
import re
from collections import defaultdict
from pathlib import Path
from ultralytics.yolo.utils import ROOT
from ultralytics.utils import ROOT
NEW_YAML_DIR = ROOT.parent
CODE_DIR = ROOT
@ -21,8 +22,8 @@ def extract_classes_and_functions(filepath):
with open(filepath, 'r') as file:
content = file.read()
class_pattern = r"(?:^|\n)class\s(\w+)(?:\(|:)"
func_pattern = r"(?:^|\n)def\s(\w+)\("
class_pattern = r'(?:^|\n)class\s(\w+)(?:\(|:)'
func_pattern = r'(?:^|\n)def\s(\w+)\('
classes = re.findall(class_pattern, content)
functions = re.findall(func_pattern, content)
@ -33,9 +34,27 @@ def extract_classes_and_functions(filepath):
def create_markdown(py_filepath, module_path, classes, functions):
md_filepath = py_filepath.with_suffix('.md')
md_content = [f"# {class_name}\n---\n:::{module_path}.{class_name}\n<br><br>\n" for class_name in classes]
md_content.extend(f"# {func_name}\n---\n:::{module_path}.{func_name}\n<br><br>\n" for func_name in functions)
md_content = "\n".join(md_content)
# Read existing content and keep header content between first two ---
header_content = ''
if md_filepath.exists():
with open(md_filepath, 'r') as file:
existing_content = file.read()
header_parts = existing_content.split('---')
for part in header_parts:
if 'description:' in part or 'comments:' in part:
header_content += f'---{part}---\n\n'
module_name = module_path.replace('.__init__', '')
module_path = module_path.replace(".", "/")
url = f'https://github.com/ultralytics/ultralytics/blob/main/{module_path}.py'
title_content = (f'# Reference for `{module_path}.py`\n\n'
f'!!! note\n\n'
f' Full source code for this file is available at [{url}]({url}). Help us fix any issues you see by submitting a [Pull Request](https://docs.ultralytics.com/help/contributing/) 🛠. Thank you 🙏!\n\n')
md_content = [f'---\n## ::: {module_name}.{class_name}\n<br><br>\n' for class_name in classes]
md_content.extend(f'---\n## ::: {module_name}.{func_name}\n<br><br>\n' for func_name in functions)
md_content = header_content + title_content + '\n'.join(md_content)
if not md_content.endswith('\n'):
md_content += '\n'
os.makedirs(os.path.dirname(md_filepath), exist_ok=True)
with open(md_filepath, 'w') as file:
@ -71,11 +90,11 @@ def create_nav_menu_yaml(nav_items):
nav_tree_sorted = sort_nested_dict(nav_tree)
def _dict_to_yaml(d, level=0):
yaml_str = ""
indent = " " * level
yaml_str = ''
indent = ' ' * level
for k, v in d.items():
if isinstance(v, dict):
yaml_str += f"{indent}- {k}:\n{_dict_to_yaml(v, level + 1)}"
yaml_str += f'{indent}- {k}:\n{_dict_to_yaml(v, level + 1)}'
else:
yaml_str += f"{indent}- {k}: {str(v).replace('docs/', '')}\n"
return yaml_str
@ -89,7 +108,7 @@ def main():
nav_items = []
for root, _, files in os.walk(CODE_DIR):
for file in files:
if file.endswith(".py") and file != "__init__.py":
if file.endswith('.py'):
py_filepath = Path(root) / file
classes, functions = extract_classes_and_functions(py_filepath)
@ -103,5 +122,5 @@ def main():
create_nav_menu_yaml(nav_items)
if __name__ == "__main__":
if __name__ == '__main__':
main()

@ -0,0 +1,81 @@
---
comments: true
description: Learn about the Caltech-101 dataset, its structure and uses in machine learning. Includes instructions to train a YOLO model using this dataset.
keywords: Caltech-101, dataset, YOLO training, machine learning, object recognition, ultralytics
---
# Caltech-101 Dataset
The [Caltech-101](https://data.caltech.edu/records/mzrjq-6wc02) dataset is a widely used dataset for object recognition tasks, containing around 9,000 images from 101 object categories. The categories were chosen to reflect a variety of real-world objects, and the images themselves were carefully selected and annotated to provide a challenging benchmark for object recognition algorithms.
## Key Features
- The Caltech-101 dataset comprises around 9,000 color images divided into 101 categories.
- The categories encompass a wide variety of objects, including animals, vehicles, household items, and people.
- The number of images per category varies, with about 40 to 800 images in each category.
- Images are of variable sizes, with most images being medium resolution.
- Caltech-101 is widely used for training and testing in the field of machine learning, particularly for object recognition tasks.
## Dataset Structure
Unlike many other datasets, the Caltech-101 dataset is not formally split into training and testing sets. Users typically create their own splits based on their specific needs. However, a common practice is to use a random subset of images for training (e.g., 30 images per category) and the remaining images for testing.
## Applications
The Caltech-101 dataset is extensively used for training and evaluating deep learning models in object recognition tasks, such as Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. Its wide variety of categories and high-quality images make it an excellent dataset for research and development in the field of machine learning and computer vision.
## Usage
To train a YOLO model on the Caltech-101 dataset for 100 epochs, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='caltech101', epochs=100, imgsz=416)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=caltech101 model=yolov8n-cls.pt epochs=100 imgsz=416
```
## Sample Images and Annotations
The Caltech-101 dataset contains high-quality color images of various objects, providing a well-structured dataset for object recognition tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/239366386-44171121-b745-4206-9b59-a3be41e16089.png)
The example showcases the variety and complexity of the objects in the Caltech-101 dataset, emphasizing the significance of a diverse dataset for training robust object recognition models.
## Citations and Acknowledgments
If you use the Caltech-101 dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@article{fei2007learning,
title={Learning generative visual models from few training examples: An incremental Bayesian approach tested on 101 object categories},
author={Fei-Fei, Li and Fergus, Rob and Perona, Pietro},
journal={Computer vision and Image understanding},
volume={106},
number={1},
pages={59--70},
year={2007},
publisher={Elsevier}
}
```
We would like to acknowledge Li Fei-Fei, Rob Fergus, and Pietro Perona for creating and maintaining the Caltech-101 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the Caltech-101 dataset and its creators, visit the [Caltech-101 dataset website](https://data.caltech.edu/records/mzrjq-6wc02).

@ -0,0 +1,78 @@
---
comments: true
description: Explore the Caltech-256 dataset, a diverse collection of images used for object recognition tasks in machine learning. Learn to train a YOLO model on the dataset.
keywords: Ultralytics, YOLO, Caltech-256, dataset, object recognition, machine learning, computer vision, deep learning
---
# Caltech-256 Dataset
The [Caltech-256](https://data.caltech.edu/records/nyy15-4j048) dataset is an extensive collection of images used for object classification tasks. It contains around 30,000 images divided into 257 categories (256 object categories and 1 background category). The images are carefully curated and annotated to provide a challenging and diverse benchmark for object recognition algorithms.
## Key Features
- The Caltech-256 dataset comprises around 30,000 color images divided into 257 categories.
- Each category contains a minimum of 80 images.
- The categories encompass a wide variety of real-world objects, including animals, vehicles, household items, and people.
- Images are of variable sizes and resolutions.
- Caltech-256 is widely used for training and testing in the field of machine learning, particularly for object recognition tasks.
## Dataset Structure
Like Caltech-101, the Caltech-256 dataset does not have a formal split between training and testing sets. Users typically create their own splits according to their specific needs. A common practice is to use a random subset of images for training and the remaining images for testing.
## Applications
The Caltech-256 dataset is extensively used for training and evaluating deep learning models in object recognition tasks, such as Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. Its diverse set of categories and high-quality images make it an invaluable dataset for research and development in the field of machine learning and computer vision.
## Usage
To train a YOLO model on the Caltech-256 dataset for 100 epochs, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='caltech256', epochs=100, imgsz=416)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=caltech256 model=yolov8n-cls.pt epochs=100 imgsz=416
```
## Sample Images and Annotations
The Caltech-256 dataset contains high-quality color images of various objects, providing a comprehensive dataset for object recognition tasks. Here are some examples of images from the dataset ([credit](https://ml4a.github.io/demos/tsne_viewer.html)):
![Dataset sample image](https://user-images.githubusercontent.com/26833433/239365061-1e5f7857-b1e8-44ca-b3d7-d0befbcd33f9.jpg)
The example showcases the diversity and complexity of the objects in the Caltech-256 dataset, emphasizing the importance of a varied dataset for training robust object recognition models.
## Citations and Acknowledgments
If you use the Caltech-256 dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@article{griffin2007caltech,
title={Caltech-256 object category dataset},
author={Griffin, Gregory and Holub, Alex and Perona, Pietro},
year={2007}
}
```
We would like to acknowledge Gregory Griffin, Alex Holub, and Pietro Perona for creating and maintaining the Caltech-256 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the
Caltech-256 dataset and its creators, visit the [Caltech-256 dataset website](https://data.caltech.edu/records/nyy15-4j048).

@ -0,0 +1,80 @@
---
comments: true
description: Explore the CIFAR-10 dataset, widely used for training in machine learning and computer vision, and learn how to use it with Ultralytics YOLO.
keywords: CIFAR-10, dataset, machine learning, image classification, computer vision, YOLO, Ultralytics, training, testing, deep learning, Convolutional Neural Networks, Support Vector Machines
---
# CIFAR-10 Dataset
The [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) (Canadian Institute For Advanced Research) dataset is a collection of images used widely for machine learning and computer vision algorithms. It was developed by researchers at the CIFAR institute and consists of 60,000 32x32 color images in 10 different classes.
## Key Features
- The CIFAR-10 dataset consists of 60,000 images, divided into 10 classes.
- Each class contains 6,000 images, split into 5,000 for training and 1,000 for testing.
- The images are colored and of size 32x32 pixels.
- The 10 different classes represent airplanes, cars, birds, cats, deer, dogs, frogs, horses, ships, and trucks.
- CIFAR-10 is commonly used for training and testing in the field of machine learning and computer vision.
## Dataset Structure
The CIFAR-10 dataset is split into two subsets:
1. **Training Set**: This subset contains 50,000 images used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
## Applications
The CIFAR-10 dataset is widely used for training and evaluating deep learning models in image classification tasks, such as Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. The diversity of the dataset in terms of classes and the presence of color images make it a well-rounded dataset for research and development in the field of machine learning and computer vision.
## Usage
To train a YOLO model on the CIFAR-10 dataset for 100 epochs with an image size of 32x32, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='cifar10', epochs=100, imgsz=32)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=cifar10 model=yolov8n-cls.pt epochs=100 imgsz=32
```
## Sample Images and Annotations
The CIFAR-10 dataset contains color images of various objects, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://miro.medium.com/max/1100/1*SZnidBt7CQ4Xqcag6rd8Ew.png)
The example showcases the variety and complexity of the objects in the CIFAR-10 dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## Citations and Acknowledgments
If you use the CIFAR-10 dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@TECHREPORT{Krizhevsky09learningmultiple,
author={Alex Krizhevsky},
title={Learning multiple layers of features from tiny images},
institution={},
year={2009}
}
```
We would like to acknowledge Alex Krizhevsky for creating and maintaining the CIFAR-10 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the CIFAR-10 dataset and its creator, visit the [CIFAR-10 dataset website](https://www.cs.toronto.edu/~kriz/cifar.html).

@ -0,0 +1,80 @@
---
comments: true
description: Discover how to leverage the CIFAR-100 dataset for machine learning and computer vision tasks with YOLO. Gain insights on its structure, use, and utilization for model training.
keywords: Ultralytics, YOLO, CIFAR-100 dataset, image classification, machine learning, computer vision, YOLO model training
---
# CIFAR-100 Dataset
The [CIFAR-100](https://www.cs.toronto.edu/~kriz/cifar.html) (Canadian Institute For Advanced Research) dataset is a significant extension of the CIFAR-10 dataset, composed of 60,000 32x32 color images in 100 different classes. It was developed by researchers at the CIFAR institute, offering a more challenging dataset for more complex machine learning and computer vision tasks.
## Key Features
- The CIFAR-100 dataset consists of 60,000 images, divided into 100 classes.
- Each class contains 600 images, split into 500 for training and 100 for testing.
- The images are colored and of size 32x32 pixels.
- The 100 different classes are grouped into 20 coarse categories for higher level classification.
- CIFAR-100 is commonly used for training and testing in the field of machine learning and computer vision.
## Dataset Structure
The CIFAR-100 dataset is split into two subsets:
1. **Training Set**: This subset contains 50,000 images used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
## Applications
The CIFAR-100 dataset is extensively used for training and evaluating deep learning models in image classification tasks, such as Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. The diversity of the dataset in terms of classes and the presence of color images make it a more challenging and comprehensive dataset for research and development in the field of machine learning and computer vision.
## Usage
To train a YOLO model on the CIFAR-100 dataset for 100 epochs with an image size of 32x32, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='cifar100', epochs=100, imgsz=32)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=cifar100 model=yolov8n-cls.pt epochs=100 imgsz=32
```
## Sample Images and Annotations
The CIFAR-100 dataset contains color images of various objects, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/239363319-62ebf02f-7469-4178-b066-ccac3cd334db.jpg)
The example showcases the variety and complexity of the objects in the CIFAR-100 dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## Citations and Acknowledgments
If you use the CIFAR-100 dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@TECHREPORT{Krizhevsky09learningmultiple,
author={Alex Krizhevsky},
title={Learning multiple layers of features from tiny images},
institution={},
year={2009}
}
```
We would like to acknowledge Alex Krizhevsky for creating and maintaining the CIFAR-100 dataset as a valuable resource for the machine learning and computer vision research community. For more information about the CIFAR-100 dataset and its creator, visit the [CIFAR-100 dataset website](https://www.cs.toronto.edu/~kriz/cifar.html).

@ -0,0 +1,79 @@
---
comments: true
description: Learn how to use the Fashion-MNIST dataset for image classification with the Ultralytics YOLO model. Covers dataset structure, labels, applications, and usage.
keywords: Ultralytics, YOLO, Fashion-MNIST, dataset, image classification, machine learning, deep learning, neural networks, training, testing
---
# Fashion-MNIST Dataset
The [Fashion-MNIST](https://github.com/zalandoresearch/fashion-mnist) dataset is a database of Zalando's article images—consisting of a training set of 60,000 examples and a test set of 10,000 examples. Each example is a 28x28 grayscale image, associated with a label from 10 classes. Fashion-MNIST is intended to serve as a direct drop-in replacement for the original MNIST dataset for benchmarking machine learning algorithms.
## Key Features
- Fashion-MNIST contains 60,000 training images and 10,000 testing images of Zalando's article images.
- The dataset comprises grayscale images of size 28x28 pixels.
- Each pixel has a single pixel-value associated with it, indicating the lightness or darkness of that pixel, with higher numbers meaning darker. This pixel-value is an integer between 0 and 255.
- Fashion-MNIST is widely used for training and testing in the field of machine learning, especially for image classification tasks.
## Dataset Structure
The Fashion-MNIST dataset is split into two subsets:
1. **Training Set**: This subset contains 60,000 images used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
## Labels
Each training and test example is assigned to one of the following labels:
0. T-shirt/top
1. Trouser
2. Pullover
3. Dress
4. Coat
5. Sandal
6. Shirt
7. Sneaker
8. Bag
9. Ankle boot
## Applications
The Fashion-MNIST dataset is widely used for training and evaluating deep learning models in image classification tasks, such as Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. The dataset's simple and well-structured format makes it an essential resource for researchers and practitioners in the field of machine learning and computer vision.
## Usage
To train a CNN model on the Fashion-MNIST dataset for 100 epochs with an image size of 28x28, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='fashion-mnist', epochs=100, imgsz=28)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=fashion-mnist model=yolov8n-cls.pt epochs=100 imgsz=28
```
## Sample Images and Annotations
The Fashion-MNIST dataset contains grayscale images of Zalando's article images, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/239359139-ce0a434e-9056-43e0-a306-3214f193dcce.png)
The example showcases the variety and complexity of the images in the Fashion-MNIST dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## Acknowledgments
If you use the Fashion-MNIST dataset in your research or development work, please acknowledge the dataset by linking to the [GitHub repository](https://github.com/zalandoresearch/fashion-mnist). This dataset was made available by Zalando Research.

@ -0,0 +1,83 @@
---
comments: true
description: Understand how to use ImageNet, an extensive annotated image dataset for object recognition research, with Ultralytics YOLO models. Learn about its structure, usage, and significance in computer vision.
keywords: Ultralytics, YOLO, ImageNet, dataset, object recognition, deep learning, computer vision, machine learning, dataset training, model training, image classification, object detection
---
# ImageNet Dataset
[ImageNet](https://www.image-net.org/) is a large-scale database of annotated images designed for use in visual object recognition research. It contains over 14 million images, with each image annotated using WordNet synsets, making it one of the most extensive resources available for training deep learning models in computer vision tasks.
## Key Features
- ImageNet contains over 14 million high-resolution images spanning thousands of object categories.
- The dataset is organized according to the WordNet hierarchy, with each synset representing a category.
- ImageNet is widely used for training and benchmarking in the field of computer vision, particularly for image classification and object detection tasks.
- The annual ImageNet Large Scale Visual Recognition Challenge (ILSVRC) has been instrumental in advancing computer vision research.
## Dataset Structure
The ImageNet dataset is organized using the WordNet hierarchy. Each node in the hierarchy represents a category, and each category is described by a synset (a collection of synonymous terms). The images in ImageNet are annotated with one or more synsets, providing a rich resource for training models to recognize various objects and their relationships.
## ImageNet Large Scale Visual Recognition Challenge (ILSVRC)
The annual [ImageNet Large Scale Visual Recognition Challenge (ILSVRC)](http://image-net.org/challenges/LSVRC/) has been an important event in the field of computer vision. It has provided a platform for researchers and developers to evaluate their algorithms and models on a large-scale dataset with standardized evaluation metrics. The ILSVRC has led to significant advancements in the development of deep learning models for image classification, object detection, and other computer vision tasks.
## Applications
The ImageNet dataset is widely used for training and evaluating deep learning models in various computer vision tasks, such as image classification, object detection, and object localization. Some popular deep learning architectures, such as AlexNet, VGG, and ResNet, were developed and benchmarked using the ImageNet dataset.
## Usage
To train a deep learning model on the ImageNet dataset for 100 epochs with an image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='imagenet', epochs=100, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo train data=imagenet model=yolov8n-cls.pt epochs=100 imgsz=224
```
## Sample Images and Annotations
The ImageNet dataset contains high-resolution images spanning thousands of object categories, providing a diverse and extensive dataset for training and evaluating computer vision models. Here are some examples of images from the dataset:
![Dataset sample images](https://user-images.githubusercontent.com/26833433/239280348-3d8f30c7-6f05-4dda-9cfe-d62ad9faecc9.png)
The example showcases the variety and complexity of the images in the ImageNet dataset, highlighting the importance of a diverse dataset for training robust computer vision models.
## Citations and Acknowledgments
If you use the ImageNet dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@article{ILSVRC15,
author = {Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei},
title={ImageNet Large Scale Visual Recognition Challenge},
year={2015},
journal={International Journal of Computer Vision (IJCV)},
volume={115},
number={3},
pages={211-252}
}
```
We would like to acknowledge the ImageNet team, led by Olga Russakovsky, Jia Deng, and Li Fei-Fei, for creating and maintaining the ImageNet dataset as a valuable resource for the machine learning and computer vision research community. For more information about the ImageNet dataset and its creators, visit the [ImageNet website](https://www.image-net.org/).

@ -0,0 +1,78 @@
---
comments: true
description: Explore the compact ImageNet10 Dataset developed by Ultralytics. Ideal for fast testing of computer vision training pipelines and CV model sanity checks.
keywords: Ultralytics, YOLO, ImageNet10 Dataset, Image detection, Deep Learning, ImageNet, AI model testing, Computer vision, Machine learning
---
# ImageNet10 Dataset
The [ImageNet10](https://github.com/ultralytics/yolov5/releases/download/v1.0/imagenet10.zip) dataset is a small-scale subset of the [ImageNet](https://www.image-net.org/) database, developed by [Ultralytics](https://ultralytics.com) and designed for CI tests, sanity checks, and fast testing of training pipelines. This dataset is composed of the first image in the training set and the first image from the validation set of the first 10 classes in ImageNet. Although significantly smaller, it retains the structure and diversity of the original ImageNet dataset.
## Key Features
- ImageNet10 is a compact version of ImageNet, with 20 images representing the first 10 classes of the original dataset.
- The dataset is organized according to the WordNet hierarchy, mirroring the structure of the full ImageNet dataset.
- It is ideally suited for CI tests, sanity checks, and rapid testing of training pipelines in computer vision tasks.
- Although not designed for model benchmarking, it can provide a quick indication of a model's basic functionality and correctness.
## Dataset Structure
The ImageNet10 dataset, like the original ImageNet, is organized using the WordNet hierarchy. Each of the 10 classes in ImageNet10 is described by a synset (a collection of synonymous terms). The images in ImageNet10 are annotated with one or more synsets, providing a compact resource for testing models to recognize various objects and their relationships.
## Applications
The ImageNet10 dataset is useful for quickly testing and debugging computer vision models and pipelines. Its small size allows for rapid iteration, making it ideal for continuous integration tests and sanity checks. It can also be used for fast preliminary testing of new models or changes to existing models before moving on to full-scale testing with the complete ImageNet dataset.
## Usage
To test a deep learning model on the ImageNet10 dataset with an image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Test Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='imagenet10', epochs=5, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo train data=imagenet10 model=yolov8n-cls.pt epochs=5 imgsz=224
```
## Sample Images and Annotations
The ImageNet10 dataset contains a subset of images from the original ImageNet dataset. These images are chosen to represent the first 10 classes in the dataset, providing a diverse yet compact dataset for quick testing and evaluation.
![Dataset sample images](https://user-images.githubusercontent.com/26833433/239689723-16f9b4a7-becc-4deb-b875-d3e5c28eb03b.png)
The example showcases the variety and complexity of the images in the ImageNet10 dataset, highlighting its usefulness for sanity checks and quick testing of computer vision models.
## Citations and Acknowledgments
If you use the ImageNet10 dataset in your research or development work, please cite the original ImageNet paper:
!!! note ""
=== "BibTeX"
```bibtex
@article{ILSVRC15,
author = {Olga Russakovsky and Jia Deng and Hao Su and Jonathan Krause and Sanjeev Satheesh and Sean Ma and Zhiheng Huang and Andrej Karpathy and Aditya Khosla and Michael Bernstein and Alexander C. Berg and Li Fei-Fei},
title={ImageNet Large Scale Visual Recognition Challenge},
year={2015},
journal={International Journal of Computer Vision (IJCV)},
volume={115},
number={3},
pages={211-252}
}
```
We would like to acknowledge the ImageNet team, led by Olga Russakovsky, Jia Deng, and Li Fei-Fei, for creating and maintaining the ImageNet dataset. The ImageNet10 dataset, while a compact subset, is a valuable resource for quick testing and debugging in the machine learning and computer vision research community. For more information about the ImageNet dataset and its creators, visit the [ImageNet website](https://www.image-net.org/).

@ -0,0 +1,113 @@
---
comments: true
description: Learn about the ImageNette dataset and its usage in deep learning model training. Find code snippets for model training and explore ImageNette datatypes.
keywords: ImageNette dataset, Ultralytics, YOLO, Image classification, Machine Learning, Deep learning, Training code snippets, CNN, ImageNette160, ImageNette320
---
# ImageNette Dataset
The [ImageNette](https://github.com/fastai/imagenette) dataset is a subset of the larger [Imagenet](http://www.image-net.org/) dataset, but it only includes 10 easily distinguishable classes. It was created to provide a quicker, easier-to-use version of Imagenet for software development and education.
## Key Features
- ImageNette contains images from 10 different classes such as tench, English springer, cassette player, chain saw, church, French horn, garbage truck, gas pump, golf ball, parachute.
- The dataset comprises colored images of varying dimensions.
- ImageNette is widely used for training and testing in the field of machine learning, especially for image classification tasks.
## Dataset Structure
The ImageNette dataset is split into two subsets:
1. **Training Set**: This subset contains several thousands of images used for training machine learning models. The exact number varies per class.
2. **Validation Set**: This subset consists of several hundreds of images used for validating and benchmarking the trained models. Again, the exact number varies per class.
## Applications
The ImageNette dataset is widely used for training and evaluating deep learning models in image classification tasks, such as Convolutional Neural Networks (CNNs), and various other machine learning algorithms. The dataset's straightforward format and well-chosen classes make it a handy resource for both beginner and experienced practitioners in the field of machine learning and computer vision.
## Usage
To train a model on the ImageNette dataset for 100 epochs with a standard image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='imagenette', epochs=100, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=imagenette model=yolov8n-cls.pt epochs=100 imgsz=224
```
## Sample Images and Annotations
The ImageNette dataset contains colored images of various objects and scenes, providing a diverse dataset for image classification tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://docs.fast.ai/22_tutorial.imagenette_files/figure-html/cell-21-output-1.png)
The example showcases the variety and complexity of the images in the ImageNette dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## ImageNette160 and ImageNette320
For faster prototyping and training, the ImageNette dataset is also available in two reduced sizes: ImageNette160 and ImageNette320. These datasets maintain the same classes and structure as the full ImageNette dataset, but the images are resized to a smaller dimension. As such, these versions of the dataset are particularly useful for preliminary model testing, or when computational resources are limited.
To use these datasets, simply replace 'imagenette' with 'imagenette160' or 'imagenette320' in the training command. The following code snippets illustrate this:
!!! example "Train Example with ImageNette160"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model with ImageNette160
results = model.train(data='imagenette160', epochs=100, imgsz=160)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model with ImageNette160
yolo detect train data=imagenette160 model=yolov8n-cls.pt epochs=100 imgsz=160
```
!!! example "Train Example with ImageNette320"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model with ImageNette320
results = model.train(data='imagenette320', epochs=100, imgsz=320)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model with ImageNette320
yolo detect train data=imagenette320 model=yolov8n-cls.pt epochs=100 imgsz=320
```
These smaller versions of the dataset allow for rapid iterations during the development process while still providing valuable and realistic image classification tasks.
## Citations and Acknowledgments
If you use the ImageNette dataset in your research or development work, please acknowledge it appropriately. For more information about the ImageNette dataset, visit the [ImageNette dataset GitHub page](https://github.com/fastai/imagenette).

@ -0,0 +1,84 @@
---
comments: true
description: Explore the ImageWoof dataset, designed for challenging dog breed classification. Train AI models with Ultralytics YOLO using this dataset.
keywords: ImageWoof, image classification, dog breeds, machine learning, deep learning, Ultralytics, YOLO, dataset
---
# ImageWoof Dataset
The [ImageWoof](https://github.com/fastai/imagenette) dataset is a subset of the ImageNet consisting of 10 classes that are challenging to classify, since they're all dog breeds. It was created as a more difficult task for image classification algorithms to solve, aiming at encouraging development of more advanced models.
## Key Features
- ImageWoof contains images of 10 different dog breeds: Australian terrier, Border terrier, Samoyed, Beagle, Shih-Tzu, English foxhound, Rhodesian ridgeback, Dingo, Golden retriever, and Old English sheepdog.
- The dataset provides images at various resolutions (full size, 320px, 160px), accommodating for different computational capabilities and research needs.
- It also includes a version with noisy labels, providing a more realistic scenario where labels might not always be reliable.
## Dataset Structure
The ImageWoof dataset structure is based on the dog breed classes, with each breed having its own directory of images.
## Applications
The ImageWoof dataset is widely used for training and evaluating deep learning models in image classification tasks, especially when it comes to more complex and similar classes. The dataset's challenge lies in the subtle differences between the dog breeds, pushing the limits of model's performance and generalization.
## Usage
To train a CNN model on the ImageWoof dataset for 100 epochs with an image size of 224x224, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='imagewoof', epochs=100, imgsz=224)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=imagewoof model=yolov8n-cls.pt epochs=100 imgsz=224
```
## Dataset Variants
ImageWoof dataset comes in three different sizes to accommodate various research needs and computational capabilities:
1. **Full Size (imagewoof)**: This is the original version of the ImageWoof dataset. It contains full-sized images and is ideal for final training and performance benchmarking.
2. **Medium Size (imagewoof320)**: This version contains images resized to have a maximum edge length of 320 pixels. It's suitable for faster training without significantly sacrificing model performance.
3. **Small Size (imagewoof160)**: This version contains images resized to have a maximum edge length of 160 pixels. It's designed for rapid prototyping and experimentation where training speed is a priority.
To use these variants in your training, simply replace 'imagewoof' in the dataset argument with 'imagewoof320' or 'imagewoof160'. For example:
```python
# For medium-sized dataset
model.train(data='imagewoof320', epochs=100, imgsz=224)
# For small-sized dataset
model.train(data='imagewoof160', epochs=100, imgsz=224)
```
It's important to note that using smaller images will likely yield lower performance in terms of classification accuracy. However, it's an excellent way to iterate quickly in the early stages of model development and prototyping.
## Sample Images and Annotations
The ImageWoof dataset contains colorful images of various dog breeds, providing a challenging dataset for image classification tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/239357533-ec833254-4351-491b-8cb3-59578ea5d0b2.png)
The example showcases the subtle differences and similarities among the different dog breeds in the ImageWoof dataset, highlighting the complexity and difficulty of the classification task.
## Citations and Acknowledgments
If you use the ImageWoof dataset in your research or development work, please make sure to acknowledge the creators of the dataset by linking to the [official dataset repository](https://github.com/fastai/imagenette).
We would like to acknowledge the FastAI team for creating and maintaining the ImageWoof dataset as a valuable resource for the machine learning and computer vision research community. For more information about the ImageWoof dataset, visit the [ImageWoof dataset repository](https://github.com/fastai/imagenette).

@ -0,0 +1,120 @@
---
comments: true
description: Explore image classification datasets supported by Ultralytics, learn the standard dataset format, and set up your own dataset for training models.
keywords: Ultralytics, image classification, dataset, machine learning, CIFAR-10, ImageNet, MNIST, torchvision
---
# Image Classification Datasets Overview
## Dataset format
The folder structure for classification datasets in torchvision typically follows a standard format:
```
root/
|-- class1/
| |-- img1.jpg
| |-- img2.jpg
| |-- ...
|
|-- class2/
| |-- img1.jpg
| |-- img2.jpg
| |-- ...
|
|-- class3/
| |-- img1.jpg
| |-- img2.jpg
| |-- ...
|
|-- ...
```
In this folder structure, the `root` directory contains one subdirectory for each class in the dataset. Each subdirectory is named after the corresponding class and contains all the images for that class. Each image file is named uniquely and is typically in a common image file format such as JPEG or PNG.
** Example **
For example, in the CIFAR10 dataset, the folder structure would look like this:
```
cifar-10-/
|
|-- train/
| |-- airplane/
| | |-- 10008_airplane.png
| | |-- 10009_airplane.png
| | |-- ...
| |
| |-- automobile/
| | |-- 1000_automobile.png
| | |-- 1001_automobile.png
| | |-- ...
| |
| |-- bird/
| | |-- 10014_bird.png
| | |-- 10015_bird.png
| | |-- ...
| |
| |-- ...
|
|-- test/
| |-- airplane/
| | |-- 10_airplane.png
| | |-- 11_airplane.png
| | |-- ...
| |
| |-- automobile/
| | |-- 100_automobile.png
| | |-- 101_automobile.png
| | |-- ...
| |
| |-- bird/
| | |-- 1000_bird.png
| | |-- 1001_bird.png
| | |-- ...
| |
| |-- ...
```
In this example, the `train` directory contains subdirectories for each class in the dataset, and each class subdirectory contains all the images for that class. The `test` directory has a similar structure. The `root` directory also contains other files that are part of the CIFAR10 dataset.
## Usage
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='path/to/dataset', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=path/to/data model=yolov8n-cls.pt epochs=100 imgsz=640
```
## Supported Datasets
Ultralytics supports the following datasets with automatic download:
* [Caltech 101](caltech101.md): A dataset containing images of 101 object categories for image classification tasks.
* [Caltech 256](caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
* [CIFAR-10](cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
* [CIFAR-100](cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
* [Fashion-MNIST](fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
* [ImageNet](imagenet.md): A large-scale dataset for object detection and image classification with over 14 million images and 20,000 categories.
* [ImageNet-10](imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
* [Imagenette](imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
* [Imagewoof](imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
* [MNIST](mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.
### Adding your own dataset
If you have your own dataset and would like to use it for training classification models with Ultralytics, ensure that it follows the format specified above under "Dataset format" and then point your `data` argument to the dataset directory.

@ -0,0 +1,86 @@
---
comments: true
description: Detailed guide on the MNIST Dataset, a benchmark in the machine learning community for image classification tasks. Learn about its structure, usage and application.
keywords: MNIST dataset, Ultralytics, image classification, machine learning, computer vision, deep learning, AI, dataset guide
---
# MNIST Dataset
The [MNIST](http://yann.lecun.com/exdb/mnist/) (Modified National Institute of Standards and Technology) dataset is a large database of handwritten digits that is commonly used for training various image processing systems and machine learning models. It was created by "re-mixing" the samples from NIST's original datasets and has become a benchmark for evaluating the performance of image classification algorithms.
## Key Features
- MNIST contains 60,000 training images and 10,000 testing images of handwritten digits.
- The dataset comprises grayscale images of size 28x28 pixels.
- The images are normalized to fit into a 28x28 pixel bounding box and anti-aliased, introducing grayscale levels.
- MNIST is widely used for training and testing in the field of machine learning, especially for image classification tasks.
## Dataset Structure
The MNIST dataset is split into two subsets:
1. **Training Set**: This subset contains 60,000 images of handwritten digits used for training machine learning models.
2. **Testing Set**: This subset consists of 10,000 images used for testing and benchmarking the trained models.
## Extended MNIST (EMNIST)
Extended MNIST (EMNIST) is a newer dataset developed and released by NIST to be the successor to MNIST. While MNIST included images only of handwritten digits, EMNIST includes all the images from NIST Special Database 19, which is a large database of handwritten uppercase and lowercase letters as well as digits. The images in EMNIST were converted into the same 28x28 pixel format, by the same process, as were the MNIST images. Accordingly, tools that work with the older, smaller MNIST dataset will likely work unmodified with EMNIST.
## Applications
The MNIST dataset is widely used for training and evaluating deep learning models in image classification tasks, such as Convolutional Neural Networks (CNNs), Support Vector Machines (SVMs), and various other machine learning algorithms. The dataset's simple and well-structured format makes it an essential resource for researchers and practitioners in the field of machine learning and computer vision.
## Usage
To train a CNN model on the MNIST dataset for 100 epochs with an image size of 32x32, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-cls.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='mnist', epochs=100, imgsz=32)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
cnn detect train data=mnist model=yolov8n-cls.pt epochs=100 imgsz=28
```
## Sample Images and Annotations
The MNIST dataset contains grayscale images of handwritten digits, providing a well-structured dataset for image classification tasks. Here are some examples of images from the dataset:
![Dataset sample image](https://upload.wikimedia.org/wikipedia/commons/2/27/MnistExamples.png)
The example showcases the variety and complexity of the handwritten digits in the MNIST dataset, highlighting the importance of a diverse dataset for training robust image classification models.
## Citations and Acknowledgments
If you use the MNIST dataset in your
research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@article{lecun2010mnist,
title={MNIST handwritten digit database},
author={LeCun, Yann and Cortes, Corinna and Burges, CJ},
journal={ATT Labs [Online]. Available: http://yann.lecun.com/exdb/mnist},
volume={2},
year={2010}
}
```
We would like to acknowledge Yann LeCun, Corinna Cortes, and Christopher J.C. Burges for creating and maintaining the MNIST dataset as a valuable resource for the machine learning and computer vision research community. For more information about the MNIST dataset and its creators, visit the [MNIST dataset website](http://yann.lecun.com/exdb/mnist/).

@ -0,0 +1,97 @@
---
comments: true
description: Explore Argoverse, a comprehensive dataset for autonomous driving tasks including 3D tracking, motion forecasting and depth estimation used in YOLO.
keywords: Argoverse dataset, autonomous driving, YOLO, 3D tracking, motion forecasting, LiDAR data, HD maps, ultralytics documentation
---
# Argoverse Dataset
The [Argoverse](https://www.argoverse.org/) dataset is a collection of data designed to support research in autonomous driving tasks, such as 3D tracking, motion forecasting, and stereo depth estimation. Developed by Argo AI, the dataset provides a wide range of high-quality sensor data, including high-resolution images, LiDAR point clouds, and map data.
!!! note
The Argoverse dataset *.zip file required for training was removed from Amazon S3 after the shutdown of Argo AI by Ford, but we have made it available for manual download on [Google Drive](https://drive.google.com/file/d/1st9qW3BeIwQsnR0t8mRpvbsSWIo16ACi/view?usp=drive_link).
## Key Features
- Argoverse contains over 290K labeled 3D object tracks and 5 million object instances across 1,263 distinct scenes.
- The dataset includes high-resolution camera images, LiDAR point clouds, and richly annotated HD maps.
- Annotations include 3D bounding boxes for objects, object tracks, and trajectory information.
- Argoverse provides multiple subsets for different tasks, such as 3D tracking, motion forecasting, and stereo depth estimation.
## Dataset Structure
The Argoverse dataset is organized into three main subsets:
1. **Argoverse 3D Tracking**: This subset contains 113 scenes with over 290K labeled 3D object tracks, focusing on 3D object tracking tasks. It includes LiDAR point clouds, camera images, and sensor calibration information.
2. **Argoverse Motion Forecasting**: This subset consists of 324K vehicle trajectories collected from 60 hours of driving data, suitable for motion forecasting tasks.
3. **Argoverse Stereo Depth Estimation**: This subset is designed for stereo depth estimation tasks and includes over 10K stereo image pairs with corresponding LiDAR point clouds for ground truth depth estimation.
## Applications
The Argoverse dataset is widely used for training and evaluating deep learning models in autonomous driving tasks such as 3D object tracking, motion forecasting, and stereo depth estimation. The dataset's diverse set of sensor data, object annotations, and map information make it a valuable resource for researchers and practitioners in the field of autonomous driving.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. For the case of the Argoverse dataset, the `Argoverse.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/Argoverse.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/Argoverse.yaml).
!!! example "ultralytics/cfg/datasets/Argoverse.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/Argoverse.yaml"
```
## Usage
To train a YOLOv8n model on the Argoverse dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='Argoverse.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=Argoverse.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Data and Annotations
The Argoverse dataset contains a diverse set of sensor data, including camera images, LiDAR point clouds, and HD map information, providing rich context for autonomous driving tasks. Here are some examples of data from the dataset, along with their corresponding annotations:
![Dataset sample image](https://www.argoverse.org/assets/images/reference_images/av2_ground_height.png)
- **Argoverse 3D Tracking**: This image demonstrates an example of 3D object tracking, where objects are annotated with 3D bounding boxes. The dataset provides LiDAR point clouds and camera images to facilitate the development of models for this task.
The example showcases the variety and complexity of the data in the Argoverse dataset and highlights the importance of high-quality sensor data for autonomous driving tasks.
## Citations and Acknowledgments
If you use the Argoverse dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@inproceedings{chang2019argoverse,
title={Argoverse: 3D Tracking and Forecasting with Rich Maps},
author={Chang, Ming-Fang and Lambert, John and Sangkloy, Patsorn and Singh, Jagjeet and Bak, Slawomir and Hartnett, Andrew and Wang, Dequan and Carr, Peter and Lucey, Simon and Ramanan, Deva and others},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={8748--8757},
year={2019}
}
```
We would like to acknowledge Argo AI for creating and maintaining the Argoverse dataset as a valuable resource for the autonomous driving research community. For more information about the Argoverse dataset and its creators, visit the [Argoverse dataset website](https://www.argoverse.org/).

@ -0,0 +1,94 @@
---
comments: true
description: Learn how COCO, a leading dataset for object detection and segmentation, integrates with Ultralytics. Discover ways to use it for training YOLO models.
keywords: Ultralytics, COCO dataset, object detection, YOLO, YOLO model training, image segmentation, computer vision, deep learning models
---
# COCO Dataset
The [COCO](https://cocodataset.org/#home) (Common Objects in Context) dataset is a large-scale object detection, segmentation, and captioning dataset. It is designed to encourage research on a wide variety of object categories and is commonly used for benchmarking computer vision models. It is an essential dataset for researchers and developers working on object detection, segmentation, and pose estimation tasks.
## Key Features
- COCO contains 330K images, with 200K images having annotations for object detection, segmentation, and captioning tasks.
- The dataset comprises 80 object categories, including common objects like cars, bicycles, and animals, as well as more specific categories such as umbrellas, handbags, and sports equipment.
- Annotations include object bounding boxes, segmentation masks, and captions for each image.
- COCO provides standardized evaluation metrics like mean Average Precision (mAP) for object detection, and mean Average Recall (mAR) for segmentation tasks, making it suitable for comparing model performance.
## Dataset Structure
The COCO dataset is split into three subsets:
1. **Train2017**: This subset contains 118K images for training object detection, segmentation, and captioning models.
2. **Val2017**: This subset has 5K images used for validation purposes during model training.
3. **Test2017**: This subset consists of 20K images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7384) for performance evaluation.
## Applications
The COCO dataset is widely used for training and evaluating deep learning models in object detection (such as YOLO, Faster R-CNN, and SSD), instance segmentation (such as Mask R-CNN), and keypoint detection (such as OpenPose). The dataset's diverse set of object categories, large number of annotated images, and standardized evaluation metrics make it an essential resource for computer vision researchers and practitioners.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO dataset, the `coco.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml).
!!! example "ultralytics/cfg/datasets/coco.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco.yaml"
```
## Usage
To train a YOLOv8n model on the COCO dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='coco.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Images and Annotations
The COCO dataset contains a diverse set of images with various object categories and complex scenes. Here are some examples of images from the dataset, along with their corresponding annotations:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/236811818-5b566576-1e92-42fa-9462-4b6a848abe89.jpg)
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).

@ -0,0 +1,84 @@
---
comments: true
description: Discover the benefits of using the practical and diverse COCO8 dataset for object detection model testing. Learn to configure and use it via Ultralytics HUB and YOLOv8.
keywords: Ultralytics, COCO8 dataset, object detection, model testing, dataset configuration, detection approaches, sanity check, training pipelines, YOLOv8
---
# COCO8 Dataset
## Introduction
[Ultralytics](https://ultralytics.com) COCO8 is a small, but versatile object detection dataset composed of the first 8
images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and debugging
object detection models, or for experimenting with new detection approaches. With 8 images, it is small enough to be
easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training
larger datasets.
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com)
and [YOLOv8](https://github.com/ultralytics/ultralytics).
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO8 dataset, the `coco8.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8.yaml).
!!! example "ultralytics/cfg/datasets/coco8.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco8.yaml"
```
## Usage
To train a YOLOv8n model on the COCO8 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco8.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Images and Annotations
Here are some examples of images from the COCO8 dataset, along with their corresponding annotations:
<img src="https://user-images.githubusercontent.com/26833433/236818348-e6260a3d-0454-436b-83a9-de366ba07235.jpg" alt="Dataset sample image" width="800">
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO8 dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).

@ -0,0 +1,91 @@
---
comments: true
description: Understand how to utilize the vast Global Wheat Head Dataset for building wheat head detection models. Features, structure, applications, usage, sample data, and citation.
keywords: Ultralytics, YOLO, Global Wheat Head Dataset, wheat head detection, plant phenotyping, crop management, deep learning, outdoor images, annotations, YAML configuration
---
# Global Wheat Head Dataset
The [Global Wheat Head Dataset](http://www.global-wheat.com/) is a collection of images designed to support the development of accurate wheat head detection models for applications in wheat phenotyping and crop management. Wheat heads, also known as spikes, are the grain-bearing parts of the wheat plant. Accurate estimation of wheat head density and size is essential for assessing crop health, maturity, and yield potential. The dataset, created by a collaboration of nine research institutes from seven countries, covers multiple growing regions to ensure models generalize well across different environments.
## Key Features
- The dataset contains over 3,000 training images from Europe (France, UK, Switzerland) and North America (Canada).
- It includes approximately 1,000 test images from Australia, Japan, and China.
- Images are outdoor field images, capturing the natural variability in wheat head appearances.
- Annotations include wheat head bounding boxes to support object detection tasks.
## Dataset Structure
The Global Wheat Head Dataset is organized into two main subsets:
1. **Training Set**: This subset contains over 3,000 images from Europe and North America. The images are labeled with wheat head bounding boxes, providing ground truth for training object detection models.
2. **Test Set**: This subset consists of approximately 1,000 images from Australia, Japan, and China. These images are used for evaluating the performance of trained models on unseen genotypes, environments, and observational conditions.
## Applications
The Global Wheat Head Dataset is widely used for training and evaluating deep learning models in wheat head detection tasks. The dataset's diverse set of images, capturing a wide range of appearances, environments, and conditions, make it a valuable resource for researchers and practitioners in the field of plant phenotyping and crop management.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. For the case of the Global Wheat Head Dataset, the `GlobalWheat2020.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/GlobalWheat2020.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/GlobalWheat2020.yaml).
!!! example "ultralytics/cfg/datasets/GlobalWheat2020.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/GlobalWheat2020.yaml"
```
## Usage
To train a YOLOv8n model on the Global Wheat Head Dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='GlobalWheat2020.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=GlobalWheat2020.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Data and Annotations
The Global Wheat Head Dataset contains a diverse set of outdoor field images, capturing the natural variability in wheat head appearances, environments, and conditions. Here are some examples of data from the dataset, along with their corresponding annotations:
![Dataset sample image](https://i.ytimg.com/vi/yqvMuw-uedU/maxresdefault.jpg)
- **Wheat Head Detection**: This image demonstrates an example of wheat head detection, where wheat heads are annotated with bounding boxes. The dataset provides a variety of images to facilitate the development of models for this task.
The example showcases the variety and complexity of the data in the Global Wheat Head Dataset and highlights the importance of accurate wheat head detection for applications in wheat phenotyping and crop management.
## Citations and Acknowledgments
If you use the Global Wheat Head Dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@article{david2020global,
title={Global Wheat Head Detection (GWHD) Dataset: A Large and Diverse Dataset of High-Resolution RGB-Labelled Images to Develop and Benchmark Wheat Head Detection Methods},
author={David, Etienne and Madec, Simon and Sadeghi-Tehran, Pouria and Aasen, Helge and Zheng, Bangyou and Liu, Shouyang and Kirchgessner, Norbert and Ishikawa, Goro and Nagasawa, Koichi and Badhon, Minhajul and others},
journal={arXiv preprint arXiv:2005.02162},
year={2020}
}
```
We would like to acknowledge the researchers and institutions that contributed to the creation and maintenance of the Global Wheat Head Dataset as a valuable resource for the plant phenotyping and crop management research community. For more information about the dataset and its creators, visit the [Global Wheat Head Dataset website](http://www.global-wheat.com/).

@ -0,0 +1,104 @@
---
comments: true
description: Navigate through supported dataset formats, methods to utilize them and how to add your own datasets. Get insights on porting or converting label formats.
keywords: Ultralytics, YOLO, datasets, object detection, dataset formats, label formats, data conversion
---
# Object Detection Datasets Overview
Training a robust and accurate object detection model requires a comprehensive dataset. This guide introduces various formats of datasets that are compatible with the Ultralytics YOLO model and provides insights into their structure, usage, and how to convert between different formats.
## Supported Dataset Formats
### Ultralytics YOLO format
The Ultralytics YOLO format is a dataset configuration format that allows you to define the dataset root directory, the relative paths to training/validation/testing image directories or *.txt files containing image paths, and a dictionary of class names. Here is an example:
```yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco8 # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images
test: # test images (optional)
# Classes (80 COCO classes)
names:
0: person
1: bicycle
2: car
...
77: teddy bear
78: hair drier
79: toothbrush
```
Labels for this format should be exported to YOLO format with one `*.txt` file per image. If there are no objects in an image, no `*.txt` file is required. The `*.txt` file should be formatted with one row per object in `class x_center y_center width height` format. Box coordinates must be in **normalized xywh** format (from 0 to 1). If your boxes are in pixels, you should divide `x_center` and `width` by image width, and `y_center` and `height` by image height. Class numbers should be zero-indexed (start with 0).
<p align="center"><img width="750" src="https://user-images.githubusercontent.com/26833433/91506361-c7965000-e886-11ea-8291-c72b98c25eec.jpg"></p>
The label file corresponding to the above image contains 2 persons (class `0`) and a tie (class `27`):
<p align="center"><img width="428" src="https://user-images.githubusercontent.com/26833433/112467037-d2568c00-8d66-11eb-8796-55402ac0d62f.png"></p>
When using the Ultralytics YOLO format, organize your training and validation images and labels as shown in the example below.
<p align="center"><img width="700" src="https://user-images.githubusercontent.com/26833433/134436012-65111ad1-9541-4853-81a6-f19a3468b75f.png"></p>
## Usage
Here's how you can use these formats to train your model:
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco8.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Supported Datasets
Here is a list of the supported datasets and a brief description for each:
- [**Argoverse**](./argoverse.md): A collection of sensor data collected from autonomous vehicles. It contains 3D tracking annotations for car objects.
- [**COCO**](./coco.md): Common Objects in Context (COCO) is a large-scale object detection, segmentation, and captioning dataset with 80 object categories.
- [**COCO8**](./coco8.md): A smaller subset of the COCO dataset, COCO8 is more lightweight and faster to train.
- [**GlobalWheat2020**](./globalwheat2020.md): A dataset containing images of wheat heads for the Global Wheat Challenge 2020.
- [**Objects365**](./objects365.md): A large-scale object detection dataset with 365 object categories and 600k images, aimed at advancing object detection research.
- [**OpenImagesV7**](./open-images-v7.md): A comprehensive dataset by Google with 1.7M train images and 42k validation images.
- [**SKU-110K**](./sku-110k.md): A dataset containing images of densely packed retail products, intended for retail environment object detection.
- [**VisDrone**](./visdrone.md): A dataset focusing on drone-based images, containing various object categories like cars, pedestrians, and cyclists.
- [**VOC**](./voc.md): PASCAL VOC is a popular object detection dataset with 20 object categories including vehicles, animals, and furniture.
- [**xView**](./xview.md): A dataset containing high-resolution satellite imagery, designed for the detection of various object classes in overhead views.
### Adding your own dataset
If you have your own dataset and would like to use it for training detection models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
## Port or Convert Label Formats
### COCO Dataset Format to YOLO Format
You can easily convert labels from the popular COCO dataset format to the YOLO format using the following code snippet:
```python
from ultralytics.data.converter import convert_coco
convert_coco(labels_dir='../coco/annotations/')
```
This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format.
Remember to double-check if the dataset you want to use is compatible with your model and follows the necessary format conventions. Properly formatted datasets are crucial for training successful object detection models.

@ -0,0 +1,92 @@
---
comments: true
description: Discover the Objects365 dataset, a wide-scale, high-quality resource for object detection research. Learn to use it with the Ultralytics YOLO model.
keywords: Objects365, object detection, Ultralytics, dataset, YOLO, bounding boxes, annotations, computer vision, deep learning, training models
---
# Objects365 Dataset
The [Objects365](https://www.objects365.org/) dataset is a large-scale, high-quality dataset designed to foster object detection research with a focus on diverse objects in the wild. Created by a team of [Megvii](https://en.megvii.com/) researchers, the dataset offers a wide range of high-resolution images with a comprehensive set of annotated bounding boxes covering 365 object categories.
## Key Features
- Objects365 contains 365 object categories, with 2 million images and over 30 million bounding boxes.
- The dataset includes diverse objects in various scenarios, providing a rich and challenging benchmark for object detection tasks.
- Annotations include bounding boxes for objects, making it suitable for training and evaluating object detection models.
- Objects365 pre-trained models significantly outperform ImageNet pre-trained models, leading to better generalization on various tasks.
## Dataset Structure
The Objects365 dataset is organized into a single set of images with corresponding annotations:
- **Images**: The dataset includes 2 million high-resolution images, each containing a variety of objects across 365 categories.
- **Annotations**: The images are annotated with over 30 million bounding boxes, providing comprehensive ground truth information for object detection tasks.
## Applications
The Objects365 dataset is widely used for training and evaluating deep learning models in object detection tasks. The dataset's diverse set of object categories and high-quality annotations make it a valuable resource for researchers and practitioners in the field of computer vision.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. For the case of the Objects365 Dataset, the `Objects365.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/Objects365.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/Objects365.yaml).
!!! example "ultralytics/cfg/datasets/Objects365.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/Objects365.yaml"
```
## Usage
To train a YOLOv8n model on the Objects365 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='Objects365.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=Objects365.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Data and Annotations
The Objects365 dataset contains a diverse set of high-resolution images with objects from 365 categories, providing rich context for object detection tasks. Here are some examples of the images in the dataset:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/238215467-caf757dd-0b87-4b0d-bb19-d94a547f7fbf.jpg)
- **Objects365**: This image demonstrates an example of object detection, where objects are annotated with bounding boxes. The dataset provides a wide range of images to facilitate the development of models for this task.
The example showcases the variety and complexity of the data in the Objects365 dataset and highlights the importance of accurate object detection for computer vision applications.
## Citations and Acknowledgments
If you use the Objects365 dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@inproceedings{shao2019objects365,
title={Objects365: A Large-scale, High-quality Dataset for Object Detection},
author={Shao, Shuai and Li, Zeming and Zhang, Tianyuan and Peng, Chao and Yu, Gang and Li, Jing and Zhang, Xiangyu and Sun, Jian},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={8425--8434},
year={2019}
}
```
We would like to acknowledge the team of researchers who created and maintain the Objects365 dataset as a valuable resource for the computer vision research community. For more information about the Objects365 dataset and its creators, visit the [Objects365 dataset website](https://www.objects365.org/).

@ -0,0 +1,110 @@
---
comments: true
description: Dive into Google's Open Images V7, a comprehensive dataset offering a broad scope for computer vision research. Understand its usage with deep learning models.
keywords: Open Images V7, object detection, segmentation masks, visual relationships, localized narratives, computer vision, deep learning, annotations, bounding boxes
---
# Open Images V7 Dataset
[Open Images V7](https://storage.googleapis.com/openimages/web/index.html) is a versatile and expansive dataset championed by Google. Aimed at propelling research in the realm of computer vision, it boasts a vast collection of images annotated with a plethora of data, including image-level labels, object bounding boxes, object segmentation masks, visual relationships, and localized narratives.
![Open Images V7 classes visual](https://user-images.githubusercontent.com/26833433/258660358-2dc07771-ec08-4d11-b24a-f66e07550050.png)
## Key Features
- Encompasses ~9M images annotated in various ways to suit multiple computer vision tasks.
- Houses a staggering 16M bounding boxes across 600 object classes in 1.9M images. These boxes are primarily hand-drawn by experts ensuring high precision.
- Visual relationship annotations totaling 3.3M are available, detailing 1,466 unique relationship triplets, object properties, and human activities.
- V5 introduced segmentation masks for 2.8M objects across 350 classes.
- V6 introduced 675k localized narratives that amalgamate voice, text, and mouse traces highlighting described objects.
- V7 introduced 66.4M point-level labels on 1.4M images, spanning 5,827 classes.
- Encompasses 61.4M image-level labels across a diverse set of 20,638 classes.
- Provides a unified platform for image classification, object detection, relationship detection, instance segmentation, and multimodal image descriptions.
## Dataset Structure
Open Images V7 is structured in multiple components catering to varied computer vision challenges:
- **Images**: About 9 million images, often showcasing intricate scenes with an average of 8.3 objects per image.
- **Bounding Boxes**: Over 16 million boxes that demarcate objects across 600 categories.
- **Segmentation Masks**: These detail the exact boundary of 2.8M objects across 350 classes.
- **Visual Relationships**: 3.3M annotations indicating object relationships, properties, and actions.
- **Localized Narratives**: 675k descriptions combining voice, text, and mouse traces.
- **Point-Level Labels**: 66.4M labels across 1.4M images, suitable for zero/few-shot semantic segmentation.
## Applications
Open Images V7 is a cornerstone for training and evaluating state-of-the-art models in various computer vision tasks. The dataset's broad scope and high-quality annotations make it indispensable for researchers and developers specializing in computer vision.
## Dataset YAML
Typically, datasets come with a YAML (Yet Another Markup Language) file that delineates the dataset's configuration. For the case of Open Images V7, a hypothetical `OpenImagesV7.yaml` might exist. For accurate paths and configurations, one should refer to the dataset's official repository or documentation.
!!! example "OpenImagesV7.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/open-images-v7.yaml"
```
## Usage
To train a YOLOv8n model on the Open Images V7 dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! warning
The complete Open Images V7 dataset comprises 1,743,042 training images and 41,620 validation images, requiring approximately **561 GB of storage space** upon download.
Executing the commands provided below will trigger an automatic download of the full dataset if it's not already present locally. Before running the below example it's crucial to:
- Verify that your device has enough storage capacity.
- Ensure a robust and speedy internet connection.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Train the model on the Open Images V7 dataset
results = model.train(data='open-images-v7.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Train a COCO-pretrained YOLOv8n model on the Open Images V7 dataset
yolo detect train data=open-images-v7.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Data and Annotations
Illustrations of the dataset help provide insights into its richness:
![Dataset sample image](https://storage.googleapis.com/openimages/web/images/oidv7_all-in-one_example_ab.jpg)
- **Open Images V7**: This image exemplifies the depth and detail of annotations available, including bounding boxes, relationships, and segmentation masks.
Researchers can gain invaluable insights into the array of computer vision challenges that the dataset addresses, from basic object detection to intricate relationship identification.
## Citations and Acknowledgments
For those employing Open Images V7 in their work, it's prudent to cite the relevant papers and acknowledge the creators:
!!! note ""
=== "BibTeX"
```bibtex
@article{OpenImages,
author = {Alina Kuznetsova and Hassan Rom and Neil Alldrin and Jasper Uijlings and Ivan Krasin and Jordi Pont-Tuset and Shahab Kamali and Stefan Popov and Matteo Malloci and Alexander Kolesnikov and Tom Duerig and Vittorio Ferrari},
title = {The Open Images Dataset V4: Unified image classification, object detection, and visual relationship detection at scale},
year = {2020},
journal = {IJCV}
}
```
A heartfelt acknowledgment goes out to the Google AI team for creating and maintaining the Open Images V7 dataset. For a deep dive into the dataset and its offerings, navigate to the [official Open Images V7 website](https://storage.googleapis.com/openimages/web/index.html).

@ -0,0 +1,93 @@
---
comments: true
description: 'Explore the SKU-110k dataset: densely packed retail shelf images for object detection research. Learn how to use it with Ultralytics.'
keywords: SKU-110k dataset, object detection, retail shelf images, Ultralytics, YOLO, computer vision, deep learning models
---
# SKU-110k Dataset
The [SKU-110k](https://github.com/eg4000/SKU110K_CVPR19) dataset is a collection of densely packed retail shelf images, designed to support research in object detection tasks. Developed by Eran Goldman et al., the dataset contains over 110,000 unique store keeping unit (SKU) categories with densely packed objects, often looking similar or even identical, positioned in close proximity.
![Dataset sample image](https://github.com/eg4000/SKU110K_CVPR19/raw/master/figures/benchmarks_comparison.jpg)
## Key Features
- SKU-110k contains images of store shelves from around the world, featuring densely packed objects that pose challenges for state-of-the-art object detectors.
- The dataset includes over 110,000 unique SKU categories, providing a diverse range of object appearances.
- Annotations include bounding boxes for objects and SKU category labels.
## Dataset Structure
The SKU-110k dataset is organized into three main subsets:
1. **Training set**: This subset contains images and annotations used for training object detection models.
2. **Validation set**: This subset consists of images and annotations used for model validation during training.
3. **Test set**: This subset is designed for the final evaluation of trained object detection models.
## Applications
The SKU-110k dataset is widely used for training and evaluating deep learning models in object detection tasks, especially in densely packed scenes such as retail shelf displays. The dataset's diverse set of SKU categories and densely packed object arrangements make it a valuable resource for researchers and practitioners in the field of computer vision.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. For the case of the SKU-110K dataset, the `SKU-110K.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/SKU-110K.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/SKU-110K.yaml).
!!! example "ultralytics/cfg/datasets/SKU-110K.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/SKU-110K.yaml"
```
## Usage
To train a YOLOv8n model on the SKU-110K dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='SKU-110K.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=SKU-110K.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Data and Annotations
The SKU-110k dataset contains a diverse set of retail shelf images with densely packed objects, providing rich context for object detection tasks. Here are some examples of data from the dataset, along with their corresponding annotations:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/238215979-1ab791c4-15d9-46f6-a5d6-0092c05dff7a.jpg)
- **Densely packed retail shelf image**: This image demonstrates an example of densely packed objects in a retail shelf setting. Objects are annotated with bounding boxes and SKU category labels.
The example showcases the variety and complexity of the data in the SKU-110k dataset and highlights the importance of high-quality data for object detection tasks.
## Citations and Acknowledgments
If you use the SKU-110k dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@inproceedings{goldman2019dense,
author = {Eran Goldman and Roei Herzig and Aviv Eisenschtat and Jacob Goldberger and Tal Hassner},
title = {Precise Detection in Densely Packed Scenes},
booktitle = {Proc. Conf. Comput. Vision Pattern Recognition (CVPR)},
year = {2019}
}
```
We would like to acknowledge Eran Goldman et al. for creating and maintaining the SKU-110k dataset as a valuable resource for the computer vision research community. For more information about the SKU-110k dataset and its creators, visit the [SKU-110k dataset GitHub repository](https://github.com/eg4000/SKU110K_CVPR19).

@ -0,0 +1,92 @@
---
comments: true
description: Explore the VisDrone Dataset, a large-scale benchmark for drone-based image analysis, and learn how to train a YOLO model using it.
keywords: VisDrone Dataset, Ultralytics, drone-based image analysis, YOLO model, object detection, object tracking, crowd counting
---
# VisDrone Dataset
The [VisDrone Dataset](https://github.com/VisDrone/VisDrone-Dataset) is a large-scale benchmark created by the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China. It contains carefully annotated ground truth data for various computer vision tasks related to drone-based image and video analysis.
VisDrone is composed of 288 video clips with 261,908 frames and 10,209 static images, captured by various drone-mounted cameras. The dataset covers a wide range of aspects, including location (14 different cities across China), environment (urban and rural), objects (pedestrians, vehicles, bicycles, etc.), and density (sparse and crowded scenes). The dataset was collected using various drone platforms under different scenarios and weather and lighting conditions. These frames are manually annotated with over 2.6 million bounding boxes of targets such as pedestrians, cars, bicycles, and tricycles. Attributes like scene visibility, object class, and occlusion are also provided for better data utilization.
## Dataset Structure
The VisDrone dataset is organized into five main subsets, each focusing on a specific task:
1. **Task 1**: Object detection in images
2. **Task 2**: Object detection in videos
3. **Task 3**: Single-object tracking
4. **Task 4**: Multi-object tracking
5. **Task 5**: Crowd counting
## Applications
The VisDrone dataset is widely used for training and evaluating deep learning models in drone-based computer vision tasks such as object detection, object tracking, and crowd counting. The dataset's diverse set of sensor data, object annotations, and attributes make it a valuable resource for researchers and practitioners in the field of drone-based computer vision.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the Visdrone dataset, the `VisDrone.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VisDrone.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VisDrone.yaml).
!!! example "ultralytics/cfg/datasets/VisDrone.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/VisDrone.yaml"
```
## Usage
To train a YOLOv8n model on the VisDrone dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='VisDrone.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=VisDrone.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Data and Annotations
The VisDrone dataset contains a diverse set of images and videos captured by drone-mounted cameras. Here are some examples of data from the dataset, along with their corresponding annotations:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/238217600-df0b7334-4c9e-4c77-81a5-c70cd33429cc.jpg)
- **Task 1**: Object detection in images - This image demonstrates an example of object detection in images, where objects are annotated with bounding boxes. The dataset provides a wide variety of images taken from different locations, environments, and densities to facilitate the development of models for this task.
The example showcases the variety and complexity of the data in the VisDrone dataset and highlights the importance of high-quality sensor data for drone-based computer vision tasks.
## Citations and Acknowledgments
If you use the VisDrone dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@ARTICLE{9573394,
author={Zhu, Pengfei and Wen, Longyin and Du, Dawei and Bian, Xiao and Fan, Heng and Hu, Qinghua and Ling, Haibin},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Detection and Tracking Meet Drones Challenge},
year={2021},
volume={},
number={},
pages={1-1},
doi={10.1109/TPAMI.2021.3119563}}
```
We would like to acknowledge the AISKYEYE team at the Lab of Machine Learning and Data Mining, Tianjin University, China, for creating and maintaining the VisDrone dataset as a valuable resource for the drone-based computer vision research community. For more information about the VisDrone dataset and its creators, visit the [VisDrone Dataset GitHub repository](https://github.com/VisDrone/VisDrone-Dataset).

@ -0,0 +1,95 @@
---
comments: true
description: A complete guide to the PASCAL VOC dataset used for object detection, segmentation and classification tasks with relevance to YOLO model training.
keywords: Ultralytics, PASCAL VOC dataset, object detection, segmentation, image classification, YOLO, model training, VOC.yaml, deep learning
---
# VOC Dataset
The [PASCAL VOC](http://host.robots.ox.ac.uk/pascal/VOC/) (Visual Object Classes) dataset is a well-known object detection, segmentation, and classification dataset. It is designed to encourage research on a wide variety of object categories and is commonly used for benchmarking computer vision models. It is an essential dataset for researchers and developers working on object detection, segmentation, and classification tasks.
## Key Features
- VOC dataset includes two main challenges: VOC2007 and VOC2012.
- The dataset comprises 20 object categories, including common objects like cars, bicycles, and animals, as well as more specific categories such as boats, sofas, and dining tables.
- Annotations include object bounding boxes and class labels for object detection and classification tasks, and segmentation masks for the segmentation tasks.
- VOC provides standardized evaluation metrics like mean Average Precision (mAP) for object detection and classification, making it suitable for comparing model performance.
## Dataset Structure
The VOC dataset is split into three subsets:
1. **Train**: This subset contains images for training object detection, segmentation, and classification models.
2. **Validation**: This subset has images used for validation purposes during model training.
3. **Test**: This subset consists of images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [PASCAL VOC evaluation server](http://host.robots.ox.ac.uk:8080/leaderboard/displaylb.php) for performance evaluation.
## Applications
The VOC dataset is widely used for training and evaluating deep learning models in object detection (such as YOLO, Faster R-CNN, and SSD), instance segmentation (such as Mask R-CNN), and image classification. The dataset's diverse set of object categories, large number of annotated images, and standardized evaluation metrics make it an essential resource for computer vision researchers and practitioners.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the VOC dataset, the `VOC.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VOC.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/VOC.yaml).
!!! example "ultralytics/cfg/datasets/VOC.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/VOC.yaml"
```
## Usage
To train a YOLOv8n model on the VOC dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='VOC.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from
a pretrained *.pt model
yolo detect train data=VOC.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Images and Annotations
The VOC dataset contains a diverse set of images with various object categories and complex scenes. Here are some examples of images from the dataset, along with their corresponding annotations:
![Dataset sample image](https://github.com/ultralytics/ultralytics/assets/26833433/7d4c18f4-774e-43f8-a5f3-9467cda7de4a)
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the VOC dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the VOC dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@misc{everingham2010pascal,
title={The PASCAL Visual Object Classes (VOC) Challenge},
author={Mark Everingham and Luc Van Gool and Christopher K. I. Williams and John Winn and Andrew Zisserman},
year={2010},
eprint={0909.5206},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the PASCAL VOC Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the VOC dataset and its creators, visit the [PASCAL VOC dataset website](http://host.robots.ox.ac.uk/pascal/VOC/).

@ -0,0 +1,97 @@
---
comments: true
description: Explore xView, a large-scale, high resolution satellite imagery dataset for object detection. Dive into dataset structure, usage examples & its potential applications.
keywords: Ultralytics, YOLO, computer vision, xView dataset, satellite imagery, object detection, overhead imagery, training, deep learning, dataset YAML
---
# xView Dataset
The [xView](http://xviewdataset.org/) dataset is one of the largest publicly available datasets of overhead imagery, containing images from complex scenes around the world annotated using bounding boxes. The goal of the xView dataset is to accelerate progress in four computer vision frontiers:
1. Reduce minimum resolution for detection.
2. Improve learning efficiency.
3. Enable discovery of more object classes.
4. Improve detection of fine-grained classes.
xView builds on the success of challenges like Common Objects in Context (COCO) and aims to leverage computer vision to analyze the growing amount of available imagery from space in order to understand the visual world in new ways and address a range of important applications.
## Key Features
- xView contains over 1 million object instances across 60 classes.
- The dataset has a resolution of 0.3 meters, providing higher resolution imagery than most public satellite imagery datasets.
- xView features a diverse collection of small, rare, fine-grained, and multi-type objects with bounding box annotation.
- Comes with a pre-trained baseline model using the TensorFlow object detection API and an example for PyTorch.
## Dataset Structure
The xView dataset is composed of satellite images collected from WorldView-3 satellites at a 0.3m ground sample distance. It contains over 1 million objects across 60 classes in over 1,400 km² of imagery.
## Applications
The xView dataset is widely used for training and evaluating deep learning models for object detection in overhead imagery. The dataset's diverse set of object classes and high-resolution imagery make it a valuable resource for researchers and practitioners in the field of computer vision, especially for satellite imagery analysis.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the xView dataset, the `xView.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/xView.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/xView.yaml).
!!! example "ultralytics/cfg/datasets/xView.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/xView.yaml"
```
## Usage
To train a model on the xView dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='xView.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=xView.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Data and Annotations
The xView dataset contains high-resolution satellite images with a diverse set of objects annotated using bounding boxes. Here are some examples of data from the dataset, along with their corresponding annotations:
![Dataset sample image](https://github-production-user-asset-6210df.s3.amazonaws.com/26833433/238799379-bb3b02f0-dee4-4e67-80ae-4b2378b813ad.jpg)
- **Overhead Imagery**: This image demonstrates an example of object detection in overhead imagery, where objects are annotated with bounding boxes. The dataset provides high-resolution satellite images to facilitate the development of models for this task.
The example showcases the variety and complexity of the data in the xView dataset and highlights the importance of high-quality satellite imagery for object detection tasks.
## Citations and Acknowledgments
If you use the xView dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@misc{lam2018xview,
title={xView: Objects in Context in Overhead Imagery},
author={Darius Lam and Richard Kuzma and Kevin McGee and Samuel Dooley and Michael Laielli and Matthew Klaric and Yaroslav Bulatov and Brendan McCord},
year={2018},
eprint={1802.07856},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the [Defense Innovation Unit](https://www.diu.mil/) (DIU) and the creators of the xView dataset for their valuable contribution to the computer vision research community. For more information about the xView dataset and its creators, visit the [xView dataset website](http://xviewdataset.org/).

@ -0,0 +1,66 @@
---
comments: true
description: Explore various computer vision datasets supported by Ultralytics for object detection, segmentation, pose estimation, image classification, and multi-object tracking.
keywords: computer vision, datasets, Ultralytics, YOLO, object detection, instance segmentation, pose estimation, image classification, multi-object tracking
---
# Datasets Overview
Ultralytics provides support for various datasets to facilitate computer vision tasks such as detection, instance segmentation, pose estimation, classification, and multi-object tracking. Below is a list of the main Ultralytics datasets, followed by a summary of each computer vision task and the respective datasets.
## [Detection Datasets](detect/index.md)
Bounding box object detection is a computer vision technique that involves detecting and localizing objects in an image by drawing a bounding box around each object.
- [Argoverse](detect/argoverse.md): A dataset containing 3D tracking and motion forecasting data from urban environments with rich annotations.
- [COCO](detect/coco.md): A large-scale dataset designed for object detection, segmentation, and captioning with over 200K labeled images.
- [COCO8](detect/coco8.md): Contains the first 4 images from COCO train and COCO val, suitable for quick tests.
- [Global Wheat 2020](detect/globalwheat2020.md): A dataset of wheat head images collected from around the world for object detection and localization tasks.
- [Objects365](detect/objects365.md): A high-quality, large-scale dataset for object detection with 365 object categories and over 600K annotated images.
- [OpenImagesV7](detect/open-images-v7.md): A comprehensive dataset by Google with 1.7M train images and 42k validation images.
- [SKU-110K](detect/sku-110k.md): A dataset featuring dense object detection in retail environments with over 11K images and 1.7 million bounding boxes.
- [VisDrone](detect/visdrone.md): A dataset containing object detection and multi-object tracking data from drone-captured imagery with over 10K images and video sequences.
- [VOC](detect/voc.md): The Pascal Visual Object Classes (VOC) dataset for object detection and segmentation with 20 object classes and over 11K images.
- [xView](detect/xview.md): A dataset for object detection in overhead imagery with 60 object categories and over 1 million annotated objects.
## [Instance Segmentation Datasets](segment/index.md)
Instance segmentation is a computer vision technique that involves identifying and localizing objects in an image at the pixel level.
- [COCO](segment/coco.md): A large-scale dataset designed for object detection, segmentation, and captioning tasks with over 200K labeled images.
- [COCO8-seg](segment/coco8-seg.md): A smaller dataset for instance segmentation tasks, containing a subset of 8 COCO images with segmentation annotations.
## [Pose Estimation](pose/index.md)
Pose estimation is a technique used to determine the pose of the object relative to the camera or the world coordinate system.
- [COCO](pose/coco.md): A large-scale dataset with human pose annotations designed for pose estimation tasks.
- [COCO8-pose](pose/coco8-pose.md): A smaller dataset for pose estimation tasks, containing a subset of 8 COCO images with human pose annotations.
## [Classification](classify/index.md)
Image classification is a computer vision task that involves categorizing an image into one or more predefined classes or categories based on its visual content.
- [Caltech 101](classify/caltech101.md): A dataset containing images of 101 object categories for image classification tasks.
- [Caltech 256](classify/caltech256.md): An extended version of Caltech 101 with 256 object categories and more challenging images.
- [CIFAR-10](classify/cifar10.md): A dataset of 60K 32x32 color images in 10 classes, with 6K images per class.
- [CIFAR-100](classify/cifar100.md): An extended version of CIFAR-10 with 100 object categories and 600 images per class.
- [Fashion-MNIST](classify/fashion-mnist.md): A dataset consisting of 70,000 grayscale images of 10 fashion categories for image classification tasks.
- [ImageNet](classify/imagenet.md): A large-scale dataset for object detection and image classification with over 14 million images and 20,000 categories.
- [ImageNet-10](classify/imagenet10.md): A smaller subset of ImageNet with 10 categories for faster experimentation and testing.
- [Imagenette](classify/imagenette.md): A smaller subset of ImageNet that contains 10 easily distinguishable classes for quicker training and testing.
- [Imagewoof](classify/imagewoof.md): A more challenging subset of ImageNet containing 10 dog breed categories for image classification tasks.
- [MNIST](classify/mnist.md): A dataset of 70,000 grayscale images of handwritten digits for image classification tasks.
## [Oriented Bounding Boxes (OBB)](obb/index.md)
Oriented Bounding Boxes (OBB) is a method in computer vision for detecting angled objects in images using rotated bounding boxes, often applied to aerial and satellite imagery.
- [DOTAv2](obb/dota-v2.md): A popular OBB aerial imagery dataset with 1.7 million instances and 11,268 images.
## [Multi-Object Tracking](track/index.md)
Multi-object tracking is a computer vision technique that involves detecting and tracking multiple objects over time in a video sequence.
- [Argoverse](detect/argoverse.md): A dataset containing 3D tracking and motion forecasting data from urban environments with rich annotations for multi-object tracking tasks.
- [VisDrone](detect/visdrone.md): A dataset containing object detection and multi-object tracking data from drone-captured imagery with over 10K images and video sequences.

@ -0,0 +1,129 @@
---
comments: true
description: Delve into DOTA v2, an Oriented Bounding Box (OBB) aerial imagery dataset with 1.7 million instances and 11,268 images.
keywords: DOTA v2, object detection, aerial images, computer vision, deep learning, annotations, oriented bounding boxes, OBB
---
# DOTA v2 Dataset with OBB
[DOTA v2](https://captain-whu.github.io/DOTA/index.html) stands as a specialized dataset, emphasizing object detection in aerial images. Originating from the DOTA series of datasets, it offers annotated images capturing a diverse array of aerial scenes with Oriented Bounding Boxes (OBB).
![DOTA v2 classes visual](https://user-images.githubusercontent.com/26833433/259461765-72fdd0d8-266b-44a9-8199-199329bf5ca9.jpg)
## Key Features
- Collection from various sensors and platforms, with image sizes ranging from 800 × 800 to 20,000 × 20,000 pixels.
- Features more than 1.7M Oriented Bounding Boxes across 18 categories.
- Encompasses multiscale object detection.
- Instances are annotated by experts using arbitrary (8 d.o.f.) quadrilateral, capturing objects of different scales, orientations, and shapes.
## Dataset Versions
### DOTA-v1.0
- Contains 15 common categories.
- Comprises 2,806 images with 188,282 instances.
- Split ratios: 1/2 for training, 1/6 for validation, and 1/3 for testing.
### DOTA-v1.5
- Incorporates the same images as DOTA-v1.0.
- Very small instances (less than 10 pixels) are also annotated.
- Addition of a new category: "container crane".
- A total of 403,318 instances.
- Released for the DOAI Challenge 2019 on Object Detection in Aerial Images.
### DOTA-v2.0
- Collections from Google Earth, GF-2 Satellite, and other aerial images.
- Contains 18 common categories.
- Comprises 11,268 images with a whopping 1,793,658 instances.
- New categories introduced: "airport" and "helipad".
- Image splits:
- Training: 1,830 images with 268,627 instances.
- Validation: 593 images with 81,048 instances.
- Test-dev: 2,792 images with 353,346 instances.
- Test-challenge: 6,053 images with 1,090,637 instances.
## Dataset Structure
DOTA v2 exhibits a structured layout tailored for OBB object detection challenges:
- **Images**: A vast collection of high-resolution aerial images capturing diverse terrains and structures.
- **Oriented Bounding Boxes**: Annotations in the form of rotated rectangles encapsulating objects irrespective of their orientation, ideal for capturing objects like airplanes, ships, and buildings.
## Applications
DOTA v2 serves as a benchmark for training and evaluating models specifically tailored for aerial image analysis. With the inclusion of OBB annotations, it provides a unique challenge, enabling the development of specialized object detection models that cater to aerial imagery's nuances.
## Dataset YAML
Typically, datasets incorporate a YAML (Yet Another Markup Language) file detailing the dataset's configuration. For DOTA v2, a hypothetical `DOTAv2.yaml` could be used. For accurate paths and configurations, it's vital to consult the dataset's official repository or documentation.
!!! example "DOTAv2.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/DOTAv2.yaml"
```
## Usage
To train a model on the DOTA v2 dataset, you can utilize the following code snippets. Always refer to your model's documentation for a thorough list of available arguments.
!!! warning
Please note that all images and associated annotations in the DOTAv2 dataset can be used for academic purposes, but commercial use is prohibited. Your understanding and respect for the dataset creators' wishes are greatly appreciated!
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Create a new YOLOv8n-OBB model from scratch
model = YOLO('yolov8n-obb.yaml')
# Train the model on the DOTAv2 dataset
results = model.train(data='DOTAv2.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Train a new YOLOv8n-OBB model on the DOTAv2 dataset
yolo detect train data=DOTAv2.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Data and Annotations
Having a glance at the dataset illustrates its depth:
![Dataset sample image](https://captain-whu.github.io/DOTA/images/instances-DOTA.jpg)
- **DOTA v2**: This snapshot underlines the complexity of aerial scenes and the significance of Oriented Bounding Box annotations, capturing objects in their natural orientation.
The dataset's richness offers invaluable insights into object detection challenges exclusive to aerial imagery.
## Citations and Acknowledgments
For those leveraging DOTA v2 in their endeavors, it's pertinent to cite the relevant research papers:
!!! note ""
=== "BibTeX"
```bibtex
@article{9560031,
author={Ding, Jian and Xue, Nan and Xia, Gui-Song and Bai, Xiang and Yang, Wen and Yang, Michael and Belongie, Serge and Luo, Jiebo and Datcu, Mihai and Pelillo, Marcello and Zhang, Liangpei},
journal={IEEE Transactions on Pattern Analysis and Machine Intelligence},
title={Object Detection in Aerial Images: A Large-Scale Benchmark and Challenges},
year={2021},
volume={},
number={},
pages={1-1},
doi={10.1109/TPAMI.2021.3117983}
}
```
A special note of gratitude to the team behind DOTA v2 for their commendable effort in curating this dataset. For an exhaustive understanding of the dataset and its nuances, please visit the [official DOTA v2 website](https://captain-whu.github.io/DOTA/index.html).

@ -0,0 +1,80 @@
---
comments: true
description: Dive deep into various oriented bounding box (OBB) dataset formats compatible with Ultralytics YOLO models. Grasp the nuances of using and converting datasets to this format.
keywords: Ultralytics, YOLO, oriented bounding boxes, OBB, dataset formats, label formats, DOTA v2, data conversion
---
# Oriented Bounding Box (OBB) Datasets Overview
Training a precise object detection model with oriented bounding boxes (OBB) requires a thorough dataset. This guide explains the various OBB dataset formats compatible with Ultralytics YOLO models, offering insights into their structure, application, and methods for format conversions.
## Supported OBB Dataset Formats
### YOLO OBB Format
The YOLO OBB format designates bounding boxes by their four corner points with coordinates normalized between 0 and 1. It follows this format:
```bash
class_index, x1, y1, x2, y2, x3, y3, x4, y4
```
Internally, YOLO processes losses and outputs in the `xywhr` format, which represents the bounding box's center point (xy), width, height, and rotation.
<p align="center"><img width="800" src="https://user-images.githubusercontent.com/26833433/259471881-59020fe2-09a4-4dcc-acce-9b0f7cfa40ee.png"></p>
An example of a `*.txt` label file for the above image, which contains an object of class `0` in OBB format, could look like:
```bash
0 0.780811 0.743961 0.782371 0.74686 0.777691 0.752174 0.776131 0.749758
```
## Usage
To train a model using these OBB formats:
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Create a new YOLOv8n-OBB model from scratch
model = YOLO('yolov8n-obb.yaml')
# Train the model on the DOTAv2 dataset
results = model.train(data='DOTAv2.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Train a new YOLOv8n-OBB model on the DOTAv2 dataset
yolo detect train data=DOTAv2.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Supported Datasets
Currently, the following datasets with Oriented Bounding Boxes are supported:
- [**DOTA v2**](./dota-v2.md): DOTA (A Large-scale Dataset for Object Detection in Aerial Images) version 2, emphasizes detection from aerial perspectives and contains oriented bounding boxes with 1.7 million instances and 11,268 images.
### Incorporating your own OBB dataset
For those looking to introduce their own datasets with oriented bounding boxes, ensure compatibility with the "YOLO OBB format" mentioned above. Convert your annotations to this required format and detail the paths, classes, and class names in a corresponding YAML configuration file.
## Convert Label Formats
### DOTA Dataset Format to YOLO OBB Format
Transitioning labels from the DOTA dataset format to the YOLO OBB format can be achieved with this script:
```python
from ultralytics.data.converter import convert_dota_to_yolo_obb
convert_dota_to_yolo_obb('path/to/DOTA')
```
This conversion mechanism is instrumental for datasets in the DOTA format, ensuring alignment with the Ultralytics YOLO OBB format.
It's imperative to validate the compatibility of the dataset with your model and adhere to the necessary format conventions. Properly structured datasets are pivotal for training efficient object detection models with oriented bounding boxes.

@ -0,0 +1,95 @@
---
comments: true
description: Detailed guide on the special COCO-Pose Dataset in Ultralytics. Learn about its key features, structure, and usage in pose estimation tasks with YOLO.
keywords: Ultralytics YOLO, COCO-Pose Dataset, Deep Learning, Pose Estimation, Training Models, Dataset YAML, openpose, YOLO
---
# COCO-Pose Dataset
The [COCO-Pose](https://cocodataset.org/#keypoints-2017) dataset is a specialized version of the COCO (Common Objects in Context) dataset, designed for pose estimation tasks. It leverages the COCO Keypoints 2017 images and labels to enable the training of models like YOLO for pose estimation tasks.
![Pose sample image](https://user-images.githubusercontent.com/26833433/239691398-d62692dc-713e-4207-9908-2f6710050e5c.jpg)
## Key Features
- COCO-Pose builds upon the COCO Keypoints 2017 dataset which contains 200K images labeled with keypoints for pose estimation tasks.
- The dataset supports 17 keypoints for human figures, facilitating detailed pose estimation.
- Like COCO, it provides standardized evaluation metrics, including Object Keypoint Similarity (OKS) for pose estimation tasks, making it suitable for comparing model performance.
## Dataset Structure
The COCO-Pose dataset is split into three subsets:
1. **Train2017**: This subset contains a portion of the 118K images from the COCO dataset, annotated for training pose estimation models.
2. **Val2017**: This subset has a selection of images used for validation purposes during model training.
3. **Test2017**: This subset consists of images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7384) for performance evaluation.
## Applications
The COCO-Pose dataset is specifically used for training and evaluating deep learning models in keypoint detection and pose estimation tasks, such as OpenPose. The dataset's large number of annotated images and standardized evaluation metrics make it an essential resource for computer vision researchers and practitioners focused on pose estimation.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO-Pose dataset, the `coco-pose.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco-pose.yaml).
!!! example "ultralytics/cfg/datasets/coco-pose.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco-pose.yaml"
```
## Usage
To train a YOLOv8n-pose model on the COCO-Pose dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-pose.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='coco-pose.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco-pose.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Images and Annotations
The COCO-Pose dataset contains a diverse set of images with human figures annotated with keypoints. Here are some examples of images from the dataset, along with their corresponding annotations:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/239690150-a9dc0bd0-7ad9-4b78-a30f-189ed727ea0e.jpg)
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO-Pose dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO-Pose dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO-Pose dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).

@ -0,0 +1,84 @@
---
comments: true
description: Discover the versatile COCO8-Pose dataset, perfect for testing and debugging pose detection models. Learn how to get started with YOLOv8-pose model training.
keywords: Ultralytics, YOLOv8, pose detection, COCO8-Pose dataset, dataset, model training, YAML
---
# COCO8-Pose Dataset
## Introduction
[Ultralytics](https://ultralytics.com) COCO8-Pose is a small, but versatile pose detection dataset composed of the first
8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and
debugging object detection models, or for experimenting with new detection approaches. With 8 images, it is small enough
to be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before
training larger datasets.
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com)
and [YOLOv8](https://github.com/ultralytics/ultralytics).
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO8-Pose dataset, the `coco8-pose.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-pose.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-pose.yaml).
!!! example "ultralytics/cfg/datasets/coco8-pose.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco8-pose.yaml"
```
## Usage
To train a YOLOv8n-pose model on the COCO8-Pose dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-pose.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='coco8-pose.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco8-pose.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Images and Annotations
Here are some examples of images from the COCO8-Pose dataset, along with their corresponding annotations:
<img src="https://user-images.githubusercontent.com/26833433/236818283-52eecb96-fc6a-420d-8a26-d488b352dd4c.jpg" alt="Dataset sample image" width="800">
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO8-Pose dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).

@ -0,0 +1,128 @@
---
comments: true
description: Understand the YOLO pose dataset format and learn to use Ultralytics datasets to train your pose estimation models effectively.
keywords: Ultralytics, YOLO, pose estimation, datasets, training, YAML, keypoints, COCO-Pose, COCO8-Pose, data conversion
---
# Pose Estimation Datasets Overview
## Supported Dataset Formats
### Ultralytics YOLO format
** Label Format **
The dataset format used for training YOLO pose models is as follows:
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
2. One row per object: Each row in the text file corresponds to one object instance in the image.
3. Object information per row: Each row contains the following information about the object instance:
- Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
- Object center coordinates: The x and y coordinates of the center of the object, normalized to be between 0 and 1.
- Object width and height: The width and height of the object, normalized to be between 0 and 1.
- Object keypoint coordinates: The keypoints of the object, normalized to be between 0 and 1.
Here is an example of the label format for pose estimation task:
Format with Dim = 2
```
<class-index> <x> <y> <width> <height> <px1> <py1> <px2> <py2> ... <pxn> <pyn>
```
Format with Dim = 3
```
<class-index> <x> <y> <width> <height> <px1> <py1> <p1-visibility> <px2> <py2> <p2-visibility> <pxn> <pyn> <p2-visibility>
```
In this format, `<class-index>` is the index of the class for the object,`<x> <y> <width> <height>` are coordinates of boudning box, and `<px1> <py1> <px2> <py2> ... <pxn> <pyn>` are the pixel coordinates of the keypoints. The coordinates are separated by spaces.
### Dataset YAML format
The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training Detection Models. Here is an example of the YAML format used for defining a detection dataset:
```yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco8-pose # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images
test: # test images (optional)
# Keypoints
kpt_shape: [17, 3] # number of keypoints, number of dims (2 for x,y or 3 for x,y,visible)
flip_idx: [0, 2, 1, 4, 3, 6, 5, 8, 7, 10, 9, 12, 11, 14, 13, 16, 15]
# Classes dictionary
names:
0: person
```
The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
`names` is a dictionary of class names. The order of the names should match the order of the object class indices in the YOLO dataset files.
(Optional) if the points are symmetric then need flip_idx, like left-right side of human or face.
For example if we assume five keypoints of facial landmark: [left eye, right eye, nose, left mouth, right mouth], and the original index is [0, 1, 2, 3, 4], then flip_idx is [1, 0, 2, 4, 3] (just exchange the left-right index, i.e 0-1 and 3-4, and do not modify others like nose in this example).
## Usage
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-pose.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='coco128-pose.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco128-pose.yaml model=yolov8n-pose.pt epochs=100 imgsz=640
```
## Supported Datasets
This section outlines the datasets that are compatible with Ultralytics YOLO format and can be used for training pose estimation models:
### COCO-Pose
- **Description**: COCO-Pose is a large-scale object detection, segmentation, and pose estimation dataset. It is a subset of the popular COCO dataset and focuses on human pose estimation. COCO-Pose includes multiple keypoints for each human instance.
- **Label Format**: Same as Ultralytics YOLO format as described above, with keypoints for human poses.
- **Number of Classes**: 1 (Human).
- **Keypoints**: 17 keypoints including nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles.
- **Usage**: Suitable for training human pose estimation models.
- **Additional Notes**: The dataset is rich and diverse, containing over 200k labeled images.
- [Read more about COCO-Pose](./coco.md)
### COCO8-Pose
- **Description**: [Ultralytics](https://ultralytics.com) COCO8-Pose is a small, but versatile pose detection dataset composed of the first 8 images of the COCO train 2017 set, 4 for training and 4 for validation.
- **Label Format**: Same as Ultralytics YOLO format as described above, with keypoints for human poses.
- **Number of Classes**: 1 (Human).
- **Keypoints**: 17 keypoints including nose, eyes, ears, shoulders, elbows, wrists, hips, knees, and ankles.
- **Usage**: Suitable for testing and debugging object detection models, or for experimenting with new detection approaches.
- **Additional Notes**: COCO8-Pose is ideal for sanity checks and CI checks.
- [Read more about COCO8-Pose](./coco8-pose.md)
### Adding your own dataset
If you have your own dataset and would like to use it for training pose estimation models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
### Conversion Tool
Ultralytics provides a convenient conversion tool to convert labels from the popular COCO dataset format to YOLO format:
```python
from ultralytics.data.converter import convert_coco
convert_coco(labels_dir='../coco/annotations/', use_keypoints=True)
```
This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format. The `use_keypoints` parameter specifies whether to include keypoints (for pose estimation) in the converted labels.

@ -0,0 +1,94 @@
---
comments: true
description: Explore the possibilities of the COCO-Seg dataset, designed for object instance segmentation and YOLO model training. Discover key features, dataset structure, applications, and usage.
keywords: Ultralytics, YOLO, COCO-Seg, dataset, instance segmentation, model training, deep learning, computer vision
---
# COCO-Seg Dataset
The [COCO-Seg](https://cocodataset.org/#home) dataset, an extension of the COCO (Common Objects in Context) dataset, is specially designed to aid research in object instance segmentation. It uses the same images as COCO but introduces more detailed segmentation annotations. This dataset is a crucial resource for researchers and developers working on instance segmentation tasks, especially for training YOLO models.
## Key Features
- COCO-Seg retains the original 330K images from COCO.
- The dataset consists of the same 80 object categories found in the original COCO dataset.
- Annotations now include more detailed instance segmentation masks for each object in the images.
- COCO-Seg provides standardized evaluation metrics like mean Average Precision (mAP) for object detection, and mean Average Recall (mAR) for instance segmentation tasks, enabling effective comparison of model performance.
## Dataset Structure
The COCO-Seg dataset is partitioned into three subsets:
1. **Train2017**: This subset contains 118K images for training instance segmentation models.
2. **Val2017**: This subset includes 5K images used for validation purposes during model training.
3. **Test2017**: This subset encompasses 20K images used for testing and benchmarking the trained models. Ground truth annotations for this subset are not publicly available, and the results are submitted to the [COCO evaluation server](https://codalab.lisn.upsaclay.fr/competitions/7383) for performance evaluation.
## Applications
COCO-Seg is widely used for training and evaluating deep learning models in instance segmentation, such as the YOLO models. The large number of annotated images, the diversity of object categories, and the standardized evaluation metrics make it an indispensable resource for computer vision researchers and practitioners.
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO-Seg dataset, the `coco.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco.yaml).
!!! example "ultralytics/cfg/datasets/coco.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco.yaml"
```
## Usage
To train a YOLOv8n-seg model on the COCO-Seg dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-seg.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='coco-seg.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco-seg.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Images and Annotations
COCO-Seg, like its predecessor COCO, contains a diverse set of images with various object categories and complex scenes. However, COCO-Seg introduces more detailed instance segmentation masks for each object in the images. Here are some examples of images from the dataset, along with their corresponding instance segmentation masks:
![Dataset sample image](https://user-images.githubusercontent.com/26833433/239690696-93fa8765-47a2-4b34-a6e5-516d0d1c725b.jpg)
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This aids the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO-Seg dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO-Seg dataset in your research or development work, please cite the original COCO paper and acknowledge the extension to COCO-Seg:
!!! note ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We extend our thanks to the COCO Consortium for creating and maintaining this invaluable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).

@ -0,0 +1,84 @@
---
comments: true
description: 'Discover the COCO8-Seg: a compact but versatile instance segmentation dataset ideal for testing Ultralytics YOLOv8 detection approaches. Complete usage guide included.'
keywords: COCO8-Seg dataset, Ultralytics, YOLOv8, instance segmentation, dataset configuration, YAML, YOLOv8n-seg model, mosaiced dataset images
---
# COCO8-Seg Dataset
## Introduction
[Ultralytics](https://ultralytics.com) COCO8-Seg is a small, but versatile instance segmentation dataset composed of the
first 8 images of the COCO train 2017 set, 4 for training and 4 for validation. This dataset is ideal for testing and
debugging segmentation models, or for experimenting with new detection approaches. With 8 images, it is small enough to
be easily manageable, yet diverse enough to test training pipelines for errors and act as a sanity check before training
larger datasets.
This dataset is intended for use with Ultralytics [HUB](https://hub.ultralytics.com)
and [YOLOv8](https://github.com/ultralytics/ultralytics).
## Dataset YAML
A YAML (Yet Another Markup Language) file is used to define the dataset configuration. It contains information about the dataset's paths, classes, and other relevant information. In the case of the COCO8-Seg dataset, the `coco8-seg.yaml` file is maintained at [https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-seg.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/datasets/coco8-seg.yaml).
!!! example "ultralytics/cfg/datasets/coco8-seg.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco8-seg.yaml"
```
## Usage
To train a YOLOv8n-seg model on the COCO8-Seg dataset for 100 epochs with an image size of 640, you can use the following code snippets. For a comprehensive list of available arguments, refer to the model [Training](../../modes/train.md) page.
!!! example "Train Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-seg.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='coco8-seg.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco8-seg.yaml model=yolov8n.pt epochs=100 imgsz=640
```
## Sample Images and Annotations
Here are some examples of images from the COCO8-Seg dataset, along with their corresponding annotations:
<img src="https://user-images.githubusercontent.com/26833433/236818387-f7bde7df-caaa-46d1-8341-1f7504cd11a1.jpg" alt="Dataset sample image" width="800">
- **Mosaiced Image**: This image demonstrates a training batch composed of mosaiced dataset images. Mosaicing is a technique used during training that combines multiple images into a single image to increase the variety of objects and scenes within each training batch. This helps improve the model's ability to generalize to different object sizes, aspect ratios, and contexts.
The example showcases the variety and complexity of the images in the COCO8-Seg dataset and the benefits of using mosaicing during the training process.
## Citations and Acknowledgments
If you use the COCO dataset in your research or development work, please cite the following paper:
!!! note ""
=== "BibTeX"
```bibtex
@misc{lin2015microsoft,
title={Microsoft COCO: Common Objects in Context},
author={Tsung-Yi Lin and Michael Maire and Serge Belongie and Lubomir Bourdev and Ross Girshick and James Hays and Pietro Perona and Deva Ramanan and C. Lawrence Zitnick and Piotr Dollár},
year={2015},
eprint={1405.0312},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge the COCO Consortium for creating and maintaining this valuable resource for the computer vision community. For more information about the COCO dataset and its creators, visit the [COCO dataset website](https://cocodataset.org/#home).

@ -0,0 +1,140 @@
---
comments: true
description: Learn how Ultralytics YOLO supports various dataset formats for instance segmentation. This guide includes information on data conversions, auto-annotations, and dataset usage.
keywords: Ultralytics, YOLO, Instance Segmentation, Dataset, YAML, COCO, Auto-Annotation, Image Segmentation
---
# Instance Segmentation Datasets Overview
## Supported Dataset Formats
### Ultralytics YOLO format
** Label Format **
The dataset format used for training YOLO segmentation models is as follows:
1. One text file per image: Each image in the dataset has a corresponding text file with the same name as the image file and the ".txt" extension.
2. One row per object: Each row in the text file corresponds to one object instance in the image.
3. Object information per row: Each row contains the following information about the object instance:
- Object class index: An integer representing the class of the object (e.g., 0 for person, 1 for car, etc.).
- Object bounding coordinates: The bounding coordinates around the mask area, normalized to be between 0 and 1.
The format for a single row in the segmentation dataset file is as follows:
```
<class-index> <x1> <y1> <x2> <y2> ... <xn> <yn>
```
In this format, `<class-index>` is the index of the class for the object, and `<x1> <y1> <x2> <y2> ... <xn> <yn>` are the bounding coordinates of the object's segmentation mask. The coordinates are separated by spaces.
Here is an example of the YOLO dataset format for a single image with two object instances:
```
0 0.6812 0.48541 0.67 0.4875 0.67656 0.487 0.675 0.489 0.66
1 0.5046 0.0 0.5015 0.004 0.4984 0.00416 0.4937 0.010 0.492 0.0104
```
!!! tip "Tip"
- The length of each row does not have to be equal.
- Each segmentation label must have a **minimum of 3 xy points**: `<class-index> <x1> <y1> <x2> <y2> <x3> <y3>`
### Dataset YAML format
The Ultralytics framework uses a YAML file format to define the dataset and model configuration for training Detection Models. Here is an example of the YAML format used for defining a detection dataset:
```yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco8-seg # dataset root dir
train: images/train # train images (relative to 'path') 4 images
val: images/val # val images (relative to 'path') 4 images
test: # test images (optional)
# Classes (80 COCO classes)
names:
0: person
1: bicycle
2: car
...
77: teddy bear
78: hair drier
79: toothbrush
```
The `train` and `val` fields specify the paths to the directories containing the training and validation images, respectively.
`names` is a dictionary of class names. The order of the names should match the order of the object class indices in the YOLO dataset files.
## Usage
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n-seg.pt') # load a pretrained model (recommended for training)
# Train the model
results = model.train(data='coco128-seg.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model
yolo detect train data=coco128-seg.yaml model=yolov8n-seg.pt epochs=100 imgsz=640
```
## Supported Datasets
* [COCO](coco.md): A large-scale dataset designed for object detection, segmentation, and captioning tasks with over 200K labeled images.
* [COCO8-seg](coco8-seg.md): A smaller dataset for instance segmentation tasks, containing a subset of 8 COCO images with segmentation annotations.
### Adding your own dataset
If you have your own dataset and would like to use it for training segmentation models with Ultralytics YOLO format, ensure that it follows the format specified above under "Ultralytics YOLO format". Convert your annotations to the required format and specify the paths, number of classes, and class names in the YAML configuration file.
## Port or Convert Label Formats
### COCO Dataset Format to YOLO Format
You can easily convert labels from the popular COCO dataset format to the YOLO format using the following code snippet:
```python
from ultralytics.data.converter import convert_coco
convert_coco(labels_dir='../coco/annotations/', use_segments=True)
```
This conversion tool can be used to convert the COCO dataset or any dataset in the COCO format to the Ultralytics YOLO format.
Remember to double-check if the dataset you want to use is compatible with your model and follows the necessary format conventions. Properly formatted datasets are crucial for training successful object detection models.
## Auto-Annotation
Auto-annotation is an essential feature that allows you to generate a segmentation dataset using a pre-trained detection model. It enables you to quickly and accurately annotate a large number of images without the need for manual labeling, saving time and effort.
### Generate Segmentation Dataset Using a Detection Model
To auto-annotate your dataset using the Ultralytics framework, you can use the `auto_annotate` function as shown below:
```python
from ultralytics.data.annotator import auto_annotate
auto_annotate(data="path/to/images", det_model="yolov8x.pt", sam_model='sam_b.pt')
```
| Argument | Type | Description | Default |
|------------|---------------------|---------------------------------------------------------------------------------------------------------|--------------|
| data | str | Path to a folder containing images to be annotated. | |
| det_model | str, optional | Pre-trained YOLO detection model. Defaults to 'yolov8x.pt'. | 'yolov8x.pt' |
| sam_model | str, optional | Pre-trained SAM segmentation model. Defaults to 'sam_b.pt'. | 'sam_b.pt' |
| device | str, optional | Device to run the models on. Defaults to an empty string (CPU or GPU, if available). | |
| output_dir | str, None, optional | Directory to save the annotated results. Defaults to a 'labels' folder in the same directory as 'data'. | None |
The `auto_annotate` function takes the path to your images, along with optional arguments for specifying the pre-trained detection and [SAM segmentation models](https://docs.ultralytics.com/models/sam), the device to run the models on, and the output directory for saving the annotated results.
By leveraging the power of pre-trained models, auto-annotation can significantly reduce the time and effort required for creating high-quality segmentation datasets. This feature is particularly useful for researchers and developers working with large image collections, as it allows them to focus on model development and evaluation rather than manual annotation.

@ -0,0 +1,30 @@
---
comments: true
description: Understand multi-object tracking datasets, upcoming features and how to use them with YOLO in Python and CLI. Dive in now!.
keywords: Ultralytics, YOLO, multi-object tracking, datasets, detection, segmentation, pose models, Python, CLI
---
# Multi-object Tracking Datasets Overview
## Dataset Format (Coming Soon)
Multi-Object Detector doesn't need standalone training and directly supports pre-trained detection, segmentation or Pose models.
Support for training trackers alone is coming soon
## Usage
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
model = YOLO('yolov8n.pt')
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", conf=0.3, iou=0.5, show=True)
```
=== "CLI"
```bash
yolo track model=yolov8n.pt source="https://youtu.be/Zgi9g1ksQHc" conf=0.3, iou=0.5 show
```

@ -0,0 +1,19 @@
---
comments: true
description: In-depth exploration of Ultralytics' YOLO. Learn about the YOLO object detection model, how to train it on custom data, multi-GPU training, exporting, predicting, deploying, and more.
keywords: Ultralytics, YOLO, Deep Learning, Object detection, PyTorch, Tutorial, Multi-GPU training, Custom data training
---
# Comprehensive Tutorials to Ultralytics YOLO
Welcome to the Ultralytics' YOLO 🚀 Guides! Our comprehensive tutorials cover various aspects of the YOLO object detection model, ranging from training and prediction to deployment. Built on PyTorch, YOLO stands out for its exceptional speed and accuracy in real-time object detection tasks.
Whether you're a beginner or an expert in deep learning, our tutorials offer valuable insights into the implementation and optimization of YOLO for your computer vision projects. Let's dive in!
## Guides
Here's a compilation of in-depth guides to help you master different aspects of Ultralytics YOLO.
* [K-Fold Cross Validation](kfold-cross-validation.md) 🚀 NEW: Learn how to improve model generalization using K-Fold cross-validation technique.
Note: More guides about training, exporting, predicting, and deploying with Ultralytics YOLO are coming soon. Stay tuned!

@ -0,0 +1,265 @@
---
comments: true
description: An in-depth guide demonstrating the implementation of K-Fold Cross Validation with the Ultralytics ecosystem for object detection datasets, leveraging Python, YOLO, and sklearn.
keywords: K-Fold cross validation, Ultralytics, YOLO detection format, Python, sklearn, object detection
---
# K-Fold Cross Validation with Ultralytics
## Introduction
This comprehensive guide illustrates the implementation of K-Fold Cross Validation for object detection datasets within the Ultralytics ecosystem. We'll leverage the YOLO detection format and key Python libraries such as sklearn, pandas, and PyYaml to guide you through the necessary setup, the process of generating feature vectors, and the execution of a K-Fold dataset split.
<p align="center">
<img width="800" src="https://user-images.githubusercontent.com/26833433/258589390-8d815058-ece8-48b9-a94e-0e1ab53ea0f6.png" alt="K-Fold Cross Validation Overview">
</p>
Whether your project involves the Fruit Detection dataset or a custom data source, this tutorial aims to help you comprehend and apply K-Fold Cross Validation to bolster the reliability and robustness of your machine learning models. While we're applying `k=5` folds for this tutorial, keep in mind that the optimal number of folds can vary depending on your dataset and the specifics of your project.
Without further ado, let's dive in!
## Setup
- Your annotations should be in the [YOLO detection format](https://docs.ultralytics.com/datasets/detect/).
- This guide assumes that annotation files are locally available.
- For our demonstration, we use the [Fruit Detection](https://www.kaggle.com/datasets/lakshaytyagi01/fruit-detection/code) dataset.
- This dataset contains a total of 8479 images.
- It includes 6 class labels, each with its total instance counts listed below.
| Class Label | Instance Count |
|:------------|:--------------:|
| Apple | 7049 |
| Grapes | 7202 |
| Pineapple | 1613 |
| Orange | 15549 |
| Banana | 3536 |
| Watermelon | 1976 |
- Necessary Python packages include:
- `ultralytics`
- `sklearn`
- `pandas`
- `pyyaml`
- This tutorial operates with `k=5` folds. However, you should determine the best number of folds for your specific dataset.
1. Initiate a new Python virtual environment (`venv`) for your project and activate it. Use `pip` (or your preferred package manager) to install:
- The Ultralytics library: `pip install -U ultralytics`. Alternatively, you can clone the official [repo](https://github.com/ultralytics/ultralytics).
- Scikit-learn, pandas, and PyYAML: `pip install -U scikit-learn pandas pyyaml`.
2. Verify that your annotations are in the [YOLO detection format](https://docs.ultralytics.com/datasets/detect/).
- For this tutorial, all annotation files are found in the `Fruit-Detection/labels` directory.
## Generating Feature Vectors for Object Detection Dataset
1. Start by creating a new Python file and import the required libraries.
```python
import datetime
import shutil
from pathlib import Path
from collections import Counter
import yaml
import numpy as np
import pandas as pd
from ultralytics import YOLO
from sklearn.model_selection import KFold
```
2. Proceed to retrieve all label files for your dataset.
```python
dataset_path = Path('./Fruit-detection') # replace with 'path/to/dataset' for your custom data
labels = sorted(dataset_path.rglob("*labels/*.txt")) # all data in 'labels'
```
3. Now, read the contents of the dataset YAML file and extract the indices of the class labels.
```python
with open(yaml_file, 'r', encoding="utf8") as y:
classes = yaml.safe_load(y)['names']
cls_idx = sorted(classes.keys())
```
4. Initialize an empty `pandas` DataFrame.
```python
indx = [l.stem for l in labels] # uses base filename as ID (no extension)
labels_df = pd.DataFrame([], columns=cls_idx, index=indx)
```
5. Count the instances of each class-label present in the annotation files.
```python
for label in labels:
lbl_counter = Counter()
with open(label,'r') as lf:
lines = lf.readlines()
for l in lines:
# classes for YOLO label uses integer at first position of each line
lbl_counter[int(l.split(' ')[0])] += 1
labels_df.loc[label.stem] = lbl_counter
labels_df = labels_df.fillna(0.0) # replace `nan` values with `0.0`
```
6. The following is a sample view of the populated DataFrame:
```pandas
0 1 2 3 4 5
'0000a16e4b057580_jpg.rf.00ab48988370f64f5ca8ea4...' 0.0 0.0 0.0 0.0 0.0 7.0
'0000a16e4b057580_jpg.rf.7e6dce029fb67f01eb19aa7...' 0.0 0.0 0.0 0.0 0.0 7.0
'0000a16e4b057580_jpg.rf.bc4d31cdcbe229dd022957a...' 0.0 0.0 0.0 0.0 0.0 7.0
'00020ebf74c4881c_jpg.rf.508192a0a97aa6c4a3b6882...' 0.0 0.0 0.0 1.0 0.0 0.0
'00020ebf74c4881c_jpg.rf.5af192a2254c8ecc4188a25...' 0.0 0.0 0.0 1.0 0.0 0.0
... ... ... ... ... ... ...
'ff4cd45896de38be_jpg.rf.c4b5e967ca10c7ced3b9e97...' 0.0 0.0 0.0 0.0 0.0 2.0
'ff4cd45896de38be_jpg.rf.ea4c1d37d2884b3e3cbce08...' 0.0 0.0 0.0 0.0 0.0 2.0
'ff5fd9c3c624b7dc_jpg.rf.bb519feaa36fc4bf630a033...' 1.0 0.0 0.0 0.0 0.0 0.0
'ff5fd9c3c624b7dc_jpg.rf.f0751c9c3aa4519ea3c9d6a...' 1.0 0.0 0.0 0.0 0.0 0.0
'fffe28b31f2a70d4_jpg.rf.7ea16bd637ba0711c53b540...' 0.0 6.0 0.0 0.0 0.0 0.0
```
The rows index the label files, each corresponding to an image in your dataset, and the columns correspond to your class-label indices. Each row represents a pseudo feature-vector, with the count of each class-label present in your dataset. This data structure enables the application of K-Fold Cross Validation to an object detection dataset.
## K-Fold Dataset Split
1. Now we will use the `KFold` class from `sklearn.model_selection` to generate `k` splits of the dataset.
- Important:
- Setting `shuffle=True` ensures a randomized distribution of classes in your splits.
- By setting `random_state=M` where `M` is a chosen integer, you can obtain repeatable results.
```python
ksplit = 5
kf = KFold(n_splits=ksplit, shuffle=True, random_state=20) # setting random_state for repeatable results
kfolds = list(kf.split(labels_df))
```
2. The dataset has now been split into `k` folds, each having a list of `train` and `val` indices. We will construct a DataFrame to display these results more clearly.
```python
folds = [f'split_{n}' for n in range(1, ksplit + 1)]
folds_df = pd.DataFrame(index=indx, columns=folds)
for idx, (train, val) in enumerate(kfolds, start=1):
folds_df[f'split_{idx}'].loc[labels_df.iloc[train].index] = 'train'
folds_df[f'split_{idx}'].loc[labels_df.iloc[val].index] = 'val'
```
3. Now we will calculate the distribution of class labels for each fold as a ratio of the classes present in `val` to those present in `train`.
```python
fold_lbl_distrb = pd.DataFrame(index=folds, columns=cls_idx)
for n, (train_indices, val_indices) in enumerate(kfolds, start=1):
train_totals = labels_df.iloc[train_indices].sum()
val_totals = labels_df.iloc[val_indices].sum()
# To avoid division by zero, we add a small value (1E-7) to the denominator
ratio = val_totals / (train_totals + 1E-7)
fold_lbl_distrb.loc[f'split_{n}'] = ratio
```
The ideal scenario is for all class ratios to be reasonably similar for each split and across classes. This, however, will be subject to the specifics of your dataset.
4. Next, we create the directories and dataset YAML files for each split.
```python
save_path = Path(dataset_path / f'{datetime.date.today().isoformat()}_{ksplit}-Fold_Cross-val')
save_path.mkdir(parents=True, exist_ok=True)
images = sorted((dataset_path / 'images').rglob("*.jpg")) # change file extension as needed
ds_yamls = []
for split in folds_df.columns:
# Create directories
split_dir = save_path / split
split_dir.mkdir(parents=True, exist_ok=True)
(split_dir / 'train' / 'images').mkdir(parents=True, exist_ok=True)
(split_dir / 'train' / 'labels').mkdir(parents=True, exist_ok=True)
(split_dir / 'val' / 'images').mkdir(parents=True, exist_ok=True)
(split_dir / 'val' / 'labels').mkdir(parents=True, exist_ok=True)
# Create dataset YAML files
dataset_yaml = split_dir / f'{split}_dataset.yaml'
ds_yamls.append(dataset_yaml)
with open(dataset_yaml, 'w') as ds_y:
yaml.safe_dump({
'path': split_dir.as_posix(),
'train': 'train',
'val': 'val',
'names': classes
}, ds_y)
```
5. Lastly, copy images and labels into the respective directory ('train' or 'val') for each split.
- __NOTE:__ The time required for this portion of the code will vary based on the size of your dataset and your system hardware.
```python
for image, label in zip(images, labels):
for split, k_split in folds_df.loc[image.stem].items():
# Destination directory
img_to_path = save_path / split / k_split / 'images'
lbl_to_path = save_path / split / k_split / 'labels'
# Copy image and label files to new directory
# Might throw a SamefileError if file already exists
shutil.copy(image, img_to_path / image.name)
shutil.copy(label, lbl_to_path / label.name)
```
## Save Records (Optional)
Optionally, you can save the records of the K-Fold split and label distribution DataFrames as CSV files for future reference.
```python
folds_df.to_csv(save_path / "kfold_datasplit.csv")
fold_lbl_distrb.to_csv(save_path / "kfold_label_distribution.csv")
```
## Train YOLO using K-Fold Data Splits
1. First, load the YOLO model.
```python
weights_path = 'path/to/weights.pt'
model = YOLO(weights_path, task='detect')
```
2. Next, iterate over the dataset YAML files to run training. The results will be saved to a directory specified by the `project` and `name` arguments. By default, this directory is 'exp/runs#' where # is an integer index.
```python
results = {}
for k in range(ksplit):
dataset_yaml = ds_yamls[k]
model.train(data=dataset_yaml, *args, **kwargs) # Include any training arguments
results[k] = model.metrics # save output metrics for further analysis
```
## Conclusion
In this guide, we have explored the process of using K-Fold cross-validation for training the YOLO object detection model. We learned how to split our dataset into K partitions, ensuring a balanced class distribution across the different folds.
We also explored the procedure for creating report DataFrames to visualize the data splits and label distributions across these splits, providing us a clear insight into the structure of our training and validation sets.
Optionally, we saved our records for future reference, which could be particularly useful in large-scale projects or when troubleshooting model performance.
Finally, we implemented the actual model training using each split in a loop, saving our training results for further analysis and comparison.
This technique of K-Fold cross-validation is a robust way of making the most out of your available data, and it helps to ensure that your model performance is reliable and consistent across different data subsets. This results in a more generalizable and reliable model that is less likely to overfit to specific data patterns.
Remember that although we used YOLO in this guide, these steps are mostly transferable to other machine learning models. Understanding these steps allows you to apply cross-validation effectively in your own machine learning projects. Happy coding!

@ -0,0 +1,62 @@
---
comments: true
description: Learn how Ultralytics leverages Continuous Integration (CI) for maintaining high-quality code. Explore our CI tests and the status of these tests for our repositories.
keywords: continuous integration, software development, CI tests, Ultralytics repositories, high-quality code, Docker Deployment, Broken Links, CodeQL, PyPi Publishing
---
# Continuous Integration (CI)
Continuous Integration (CI) is an essential aspect of software development which involves integrating changes and testing them automatically. CI allows us to maintain high-quality code by catching issues early and often in the development process. At Ultralytics, we use various CI tests to ensure the quality and integrity of our codebase.
## CI Actions
Here's a brief description of our CI actions:
- **CI:** This is our primary CI test that involves running unit tests, linting checks, and sometimes more comprehensive tests depending on the repository.
- **Docker Deployment:** This test checks the deployment of the project using Docker to ensure the Dockerfile and related scripts are working correctly.
- **Broken Links:** This test scans the codebase for any broken or dead links in our markdown or HTML files.
- **CodeQL:** CodeQL is a tool from GitHub that performs semantic analysis on our code, helping to find potential security vulnerabilities and maintain high-quality code.
- **PyPi Publishing:** This test checks if the project can be packaged and published to PyPi without any errors.
### CI Results
Below is the table showing the status of these CI tests for our main repositories:
| Repository | CI | Docker Deployment | Broken Links | CodeQL | PyPi and Docs Publishing |
|-----------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [yolov3](https://github.com/ultralytics/yolov3) | [![YOLOv3 CI](https://github.com/ultralytics/yolov3/actions/workflows/ci-testing.yml/badge.svg)](https://github.com/ultralytics/yolov3/actions/workflows/ci-testing.yml) | [![Publish Docker Images](https://github.com/ultralytics/yolov3/actions/workflows/docker.yml/badge.svg)](https://github.com/ultralytics/yolov3/actions/workflows/docker.yml) | [![Check Broken links](https://github.com/ultralytics/yolov3/actions/workflows/links.yml/badge.svg)](https://github.com/ultralytics/yolov3/actions/workflows/links.yml) | [![CodeQL](https://github.com/ultralytics/yolov3/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/ultralytics/yolov3/actions/workflows/codeql-analysis.yml) | |
| [yolov5](https://github.com/ultralytics/yolov5) | [![YOLOv5 CI](https://github.com/ultralytics/yolov5/actions/workflows/ci-testing.yml/badge.svg)](https://github.com/ultralytics/yolov5/actions/workflows/ci-testing.yml) | [![Publish Docker Images](https://github.com/ultralytics/yolov5/actions/workflows/docker.yml/badge.svg)](https://github.com/ultralytics/yolov5/actions/workflows/docker.yml) | [![Check Broken links](https://github.com/ultralytics/yolov5/actions/workflows/links.yml/badge.svg)](https://github.com/ultralytics/yolov5/actions/workflows/links.yml) | [![CodeQL](https://github.com/ultralytics/yolov5/actions/workflows/codeql-analysis.yml/badge.svg)](https://github.com/ultralytics/yolov5/actions/workflows/codeql-analysis.yml) | |
| [ultralytics](https://github.com/ultralytics/ultralytics) | [![ultralytics CI](https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml/badge.svg)](https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml) | [![Publish Docker Images](https://github.com/ultralytics/ultralytics/actions/workflows/docker.yaml/badge.svg)](https://github.com/ultralytics/ultralytics/actions/workflows/docker.yaml) | [![Check Broken links](https://github.com/ultralytics/ultralytics/actions/workflows/links.yml/badge.svg)](https://github.com/ultralytics/ultralytics/actions/workflows/links.yml) | [![CodeQL](https://github.com/ultralytics/ultralytics/actions/workflows/codeql.yaml/badge.svg)](https://github.com/ultralytics/ultralytics/actions/workflows/codeql.yaml) | [![Publish to PyPI and Deploy Docs](https://github.com/ultralytics/ultralytics/actions/workflows/publish.yml/badge.svg)](https://github.com/ultralytics/ultralytics/actions/workflows/publish.yml) |
| [hub](https://github.com/ultralytics/hub) | [![HUB CI](https://github.com/ultralytics/hub/actions/workflows/ci.yaml/badge.svg)](https://github.com/ultralytics/hub/actions/workflows/ci.yaml) | | [![Check Broken links](https://github.com/ultralytics/hub/actions/workflows/links.yml/badge.svg)](https://github.com/ultralytics/hub/actions/workflows/links.yml) | | |
| [docs](https://github.com/ultralytics/docs) | | | | | [![pages-build-deployment](https://github.com/ultralytics/docs/actions/workflows/pages/pages-build-deployment/badge.svg)](https://github.com/ultralytics/docs/actions/workflows/pages/pages-build-deployment) |
Each badge shows the status of the last run of the corresponding CI test on the `main` branch of the respective repository. If a test fails, the badge will display a "failing" status, and if it passes, it will display a "passing" status.
If you notice a test failing, it would be a great help if you could report it through a GitHub issue in the respective repository.
Remember, a successful CI test does not mean that everything is perfect. It is always recommended to manually review the code before deployment or merging changes.
## Code Coverage
Code coverage is a metric that represents the percentage of your codebase that is executed when your tests run. It provides insight into how well your tests exercise your code and can be crucial in identifying untested parts of your application. A high code coverage percentage is often associated with a lower likelihood of bugs. However, it's essential to understand that code coverage doesn't guarantee the absence of defects. It merely indicates which parts of the code have been executed by the tests.
### Integration with [codecov.io](https://codecov.io/)
At Ultralytics, we have integrated our repositories with [codecov.io](https://codecov.io/), a popular online platform for measuring and visualizing code coverage. Codecov provides detailed insights, coverage comparisons between commits, and visual overlays directly on your code, indicating which lines were covered.
By integrating with Codecov, we aim to maintain and improve the quality of our code by focusing on areas that might be prone to errors or need further testing.
### Coverage Results
To quickly get a glimpse of the code coverage status of the `ultralytics` python package, we have included a badge and and sunburst visual of the `ultralytics` coverage results. These images show the percentage of code covered by our tests, offering an at-a-glance metric of our testing efforts. For full details please see https://codecov.io/github/ultralytics/ultralytics.
| Repository | Code Coverage |
|-----------------------------------------------------------|----------------------------------------------------------------------|
| [ultralytics](https://github.com/ultralytics/ultralytics) | [![codecov](https://codecov.io/gh/ultralytics/ultralytics/branch/main/graph/badge.svg?token=HHW7IIVFVY)](https://codecov.io/gh/ultralytics/ultralytics) |
In the sunburst graphic below, the inner-most circle is the entire project, moving away from the center are folders then, finally, a single file. The size and color of each slice is representing the number of statements and the coverage, respectively.
<a href="https://codecov.io/github/ultralytics/ultralytics">
<img src="https://codecov.io/gh/ultralytics/ultralytics/branch/main/graphs/sunburst.svg?token=HHW7IIVFVY" alt="Ultralytics Codecov Image">
</a>

@ -1,3 +1,8 @@
---
description: Understand terms governing contributions to Ultralytics projects including source code, bug fixes, documentation and more. Read our Contributor License Agreement.
keywords: Ultralytics, Contributor License Agreement, Open Source Software, Contributions, Copyright License, Patent License, Moral Rights
---
# Ultralytics Individual Contributor License Agreement
Thank you for your interest in contributing to open source software projects (“Projects”) made available by Ultralytics

@ -1,5 +1,7 @@
---
comments: true
description: Find solutions to your common Ultralytics YOLO related queries. Learn about hardware requirements, fine-tuning YOLO models, conversion to ONNX/TensorFlow, and more.
keywords: Ultralytics, YOLO, FAQ, hardware requirements, ONNX, TensorFlow, real-time detection, YOLO accuracy
---
# Ultralytics YOLO Frequently Asked Questions (FAQ)
@ -34,4 +36,4 @@ Improving the accuracy of a YOLO model may involve several strategies, such as:
Remember that there's often a trade-off between accuracy and inference speed, so finding the right balance is crucial for your specific application.
If you have any more questions or need assistance, don't hesitate to consult the Ultralytics documentation or reach out to the community through GitHub Issues or the official discussion forum.
If you have any more questions or need assistance, don't hesitate to consult the Ultralytics documentation or reach out to the community through GitHub Issues or the official discussion forum.

@ -1,5 +1,7 @@
---
comments: true
description: Explore Ultralytics community’s Code of Conduct, ensuring a supportive, inclusive environment for contributors & members at all levels. Find our guidelines on acceptable behavior & enforcement.
keywords: Ultralytics, code of conduct, community, contribution, behavior guidelines, enforcement, open source contributions
---
# Ultralytics Contributor Covenant Code of Conduct
@ -110,7 +112,7 @@ Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within

@ -1,20 +1,22 @@
---
comments: true
description: Learn how to contribute to Ultralytics YOLO projects – guidelines for pull requests, reporting bugs, code conduct and CLA signing.
keywords: Ultralytics, YOLO, open-source, contribute, pull request, bug report, coding guidelines, CLA, code of conduct, GitHub
---
# Contributing to Ultralytics Open-Source YOLO Repositories
First of all, thank you for your interest in contributing to Ultralytics open-source YOLO repositories! Your contributions will help improve the project and benefit the community. This document provides guidelines and best practices for contributing to Ultralytics YOLO repositories.
First of all, thank you for your interest in contributing to Ultralytics open-source YOLO repositories! Your contributions will help improve the project and benefit the community. This document provides guidelines and best practices to get you started.
## Table of Contents
- [Code of Conduct](#code-of-conduct)
- [Pull Requests](#pull-requests)
- [CLA Signing](#cla-signing)
- [Google-Style Docstrings](#google-style-docstrings)
- [GitHub Actions CI Tests](#github-actions-ci-tests)
- [CLA Signing](#cla-signing)
- [Google-Style Docstrings](#google-style-docstrings)
- [GitHub Actions CI Tests](#github-actions-ci-tests)
- [Bug Reports](#bug-reports)
- [Minimum Reproducible Example](#minimum-reproducible-example)
- [Minimum Reproducible Example](#minimum-reproducible-example)
- [License and Copyright](#license-and-copyright)
## Code of Conduct
@ -70,4 +72,4 @@ def example_function(arg1: int, arg2: str) -> bool:
### GitHub Actions CI Tests
Before your pull request can be merged, all GitHub Actions Continuous Integration (CI) tests must pass. These tests include linting, unit tests, and other checks to ensure that your changes meet the quality standards of the project. Make sure to review the output of the GitHub Actions and fix any issues
Before your pull request can be merged, all GitHub Actions Continuous Integration (CI) tests must pass. These tests include linting, unit tests, and other checks to ensure that your changes meet the quality standards of the project. Make sure to review the output of the GitHub Actions and fix any issues

@ -0,0 +1,37 @@
---
comments: false
description: Discover Ultralytics’ EHS policy principles and implementation measures. Committed to safety, environment, and continuous improvement for a sustainable future.
keywords: Ultralytics policy, EHS, environment, health and safety, compliance, prevention, continuous improvement, risk management, emergency preparedness, resource allocation, communication
---
# Ultralytics Environmental, Health and Safety (EHS) Policy
At Ultralytics, we recognize that the long-term success of our company relies not only on the products and services we offer, but also the manner in which we conduct our business. We are committed to ensuring the safety and well-being of our employees, stakeholders, and the environment, and we will continuously strive to mitigate our impact on the environment while promoting health and safety.
## Policy Principles
1. **Compliance**: We will comply with all applicable laws, regulations, and standards related to EHS, and we will strive to exceed these standards where possible.
2. **Prevention**: We will work to prevent accidents, injuries, and environmental harm by implementing risk management measures and ensuring all our operations and procedures are safe.
3. **Continuous Improvement**: We will continuously improve our EHS performance by setting measurable objectives, monitoring our performance, auditing our operations, and revising our policies and procedures as needed.
4. **Communication**: We will communicate openly about our EHS performance and will engage with stakeholders to understand and address their concerns and expectations.
5. **Education and Training**: We will educate and train our employees and contractors in appropriate EHS procedures and practices.
## Implementation Measures
1. **Responsibility and Accountability**: Every employee and contractor working at or with Ultralytics is responsible for adhering to this policy. Managers and supervisors are accountable for ensuring this policy is implemented within their areas of control.
2. **Risk Management**: We will identify, assess, and manage EHS risks associated with our operations and activities to prevent accidents, injuries, and environmental harm.
3. **Resource Allocation**: We will allocate the necessary resources to ensure the effective implementation of our EHS policy, including the necessary equipment, personnel, and training.
4. **Emergency Preparedness and Response**: We will develop, maintain, and test emergency preparedness and response plans to ensure we can respond effectively to EHS incidents.
5. **Monitoring and Review**: We will monitor and review our EHS performance regularly to identify opportunities for improvement and ensure we are meeting our objectives.
This policy reflects our commitment to minimizing our environmental footprint, ensuring the safety and well-being of our employees, and continuously improving our performance.
Please remember that the implementation of an effective EHS policy requires the involvement and commitment of everyone working at or with Ultralytics. We encourage you to take personal responsibility for your safety and the safety of others, and to take care of the environment in which we live and work.

@ -1,14 +1,18 @@
---
comments: true
description: Find comprehensive guides and documents on Ultralytics YOLO tasks. Includes FAQs, contributing guides, CI guide, CLA, MRE guide, code of conduct & more.
keywords: Ultralytics, YOLO, guides, documents, FAQ, contributing, CI guide, CLA, MRE guide, code of conduct, EHS policy, security policy
---
Welcome to the Ultralytics Help page! We are committed to providing you with comprehensive resources to make your experience with Ultralytics YOLO repositories as smooth and enjoyable as possible. On this page, you'll find essential links to guides and documents that will help you navigate through common tasks and address any questions you might have while using our repositories.
- [Frequently Asked Questions (FAQ)](FAQ.md): Find answers to common questions and issues faced by users and contributors of Ultralytics YOLO repositories.
- [Contributing Guide](contributing.md): Learn the best practices for submitting pull requests, reporting bugs, and contributing to the development of our repositories.
- [Continuous Integration (CI) Guide](CI.md): Understand the CI tests we perform for each Ultralytics repository and see their current statuses.
- [Contributor License Agreement (CLA)](CLA.md): Familiarize yourself with our CLA to understand the terms and conditions for contributing to Ultralytics projects.
- [Minimum Reproducible Example (MRE) Guide](minimum_reproducible_example.md): Understand how to create an MRE when submitting bug reports to ensure that our team can quickly and efficiently address the issue.
- [Code of Conduct](code_of_conduct.md): Learn about our community guidelines and expectations to ensure a welcoming and inclusive environment for all participants.
- [Environmental, Health and Safety (EHS) Policy](environmental-health-safety.md): Explore Ultralytics' dedicated approach towards maintaining a sustainable, safe, and healthy work environment for all our stakeholders.
- [Security Policy](../SECURITY.md): Understand our security practices and how to report security vulnerabilities responsibly.
We highly recommend going through these guides to make the most of your collaboration with the Ultralytics community. Our goal is to maintain a welcoming and supportive environment for all users and contributors. If you need further assistance, don't hesitate to reach out to us through GitHub Issues or the official discussion forum. Happy coding!
We highly recommend going through these guides to make the most of your collaboration with the Ultralytics community. Our goal is to maintain a welcoming and supportive environment for all users and contributors. If you need further assistance, don't hesitate to reach out to us through GitHub Issues or the official discussion forum. Happy coding!

@ -1,10 +1,12 @@
---
comments: true
description: Learn how to create minimum reproducible examples (MRE) for efficient bug reporting in Ultralytics YOLO repositories with this step-by-step guide.
keywords: Ultralytics, YOLO, minimum reproducible example, MRE, bug reports, guide, dependencies, code, troubleshooting
---
# Creating a Minimum Reproducible Example for Bug Reports in Ultralytics YOLO Repositories
When submitting a bug report for Ultralytics YOLO repositories, it's essential to provide a [minimum reproducible example](https://stackoverflow.com/help/minimal-reproducible-example) (MRE). An MRE is a small, self-contained piece of code that demonstrates the problem you're experiencing. Providing an MRE helps maintainers and contributors understand the issue and work on a fix more efficiently. This guide explains how to create an MRE when submitting bug reports to Ultralytics YOLO repositories.
When submitting a bug report for Ultralytics YOLO repositories, it's essential to provide a [minimum reproducible example](https://docs.ultralytics.com/help/minimum_reproducible_example/) (MRE). An MRE is a small, self-contained piece of code that demonstrates the problem you're experiencing. Providing an MRE helps maintainers and contributors understand the issue and work on a fix more efficiently. This guide explains how to create an MRE when submitting bug reports to Ultralytics YOLO repositories.
## 1. Isolate the Problem
@ -73,4 +75,4 @@ RuntimeError: Expected input[1, 0, 640, 640] to have 3 channels, but got 0 chann
In this example, the MRE demonstrates the issue with a minimal amount of code, uses a public model ('yolov8n.pt'), includes all necessary dependencies, and provides a clear description of the problem along with the error message.
By following these guidelines, you'll help the maintainers and contributors of Ultralytics YOLO repositories to understand and resolve your issue more efficiently.
By following these guidelines, you'll help the maintainers and contributors of Ultralytics YOLO repositories to understand and resolve your issue more efficiently.

@ -1,116 +0,0 @@
---
comments: true
---
# Ultralytics HUB
<a href="https://bit.ly/ultralytics_hub" target="_blank">
<img width="100%" src="https://github.com/ultralytics/assets/raw/main/im/ultralytics-hub.png"></a>
<br>
<div align="center">
<a href="https://github.com/ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://www.linkedin.com/company/ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://twitter.com/ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-twitter.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://youtube.com/ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-youtube.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://www.tiktok.com/@ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-tiktok.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://www.instagram.com/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-instagram.png" width="2%" alt="" /></a>
<br>
<br>
<a href="https://github.com/ultralytics/hub/actions/workflows/ci.yaml">
<img src="https://github.com/ultralytics/hub/actions/workflows/ci.yaml/badge.svg" alt="CI CPU"></a>
<a href="https://colab.research.google.com/github/ultralytics/hub/blob/master/hub.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
</div>
[Ultralytics HUB](https://hub.ultralytics.com) is a new no-code online tool developed
by [Ultralytics](https://ultralytics.com), the creators of the popular [YOLOv5](https://github.com/ultralytics/yolov5)
object detection and image segmentation models. With Ultralytics HUB, users can easily train and deploy YOLO models
without any coding or technical expertise.
Ultralytics HUB is designed to be user-friendly and intuitive, with a drag-and-drop interface that allows users to
easily upload their data and select their model configurations. It also offers a range of pre-trained models and
templates to choose from, making it easy for users to get started with training their own models. Once a model is
trained, it can be easily deployed and used for real-time object detection and image segmentation tasks. Overall,
Ultralytics HUB is an essential tool for anyone looking to use YOLO for their object detection and image segmentation
projects.
**[Get started now](https://hub.ultralytics.com)** and experience the power and simplicity of Ultralytics HUB for
yourself. Sign up for a free account and start building, training, and deploying YOLOv5 and YOLOv8 models today.
## 1. Upload a Dataset
Ultralytics HUB datasets are just like YOLOv5 🚀 datasets, they use the same structure and the same label formats to keep
everything simple.
When you upload a dataset to Ultralytics HUB, make sure to **place your dataset YAML inside the dataset root directory**
as in the example shown below, and then zip for upload to https://hub.ultralytics.com/. Your **dataset YAML, directory
and zip** should all share the same name. For example, if your dataset is called 'coco6' as in our
example [ultralytics/hub/coco6.zip](https://github.com/ultralytics/hub/blob/master/coco6.zip), then you should have a
coco6.yaml inside your coco6/ directory, which should zip to create coco6.zip for upload:
```bash
zip -r coco6.zip coco6
```
The example [coco6.zip](https://github.com/ultralytics/hub/blob/master/coco6.zip) dataset in this repository can be
downloaded and unzipped to see exactly how to structure your custom dataset.
<p align="center">
<img width="80%" src="https://user-images.githubusercontent.com/26833433/201424843-20fa081b-ad4b-4d6c-a095-e810775908d8.png" title="COCO6" />
</p>
The dataset YAML is the same standard YOLOv5 YAML format. See
the [YOLOv5 Train Custom Data tutorial](https://docs.ultralytics.com/yolov5/tutorials/train_custom_data) for full details.
```yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: # dataset root dir (leave empty for HUB)
train: images/train # train images (relative to 'path') 8 images
val: images/val # val images (relative to 'path') 8 images
test: # test images (optional)
# Classes
names:
0: person
1: bicycle
2: car
3: motorcycle
...
```
After zipping your dataset, sign in to [Ultralytics HUB](https://bit.ly/ultralytics_hub) and click the Datasets tab.
Click 'Upload Dataset' to upload, scan and visualize your new dataset before training new YOLOv5 models on it!
<img width="100%" alt="HUB Dataset Upload" src="https://user-images.githubusercontent.com/26833433/216763338-9a8812c8-a4e5-4362-8102-40dad7818396.png">
## 2. Train a Model
Connect to the Ultralytics HUB notebook and use your model API key to begin training!
<a href="https://colab.research.google.com/github/ultralytics/hub/blob/master/hub.ipynb" target="_blank">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
## 3. Deploy to Real World
Export your model to 13 different formats, including TensorFlow, ONNX, OpenVINO, CoreML, Paddle and many others. Run
models directly on your [iOS](https://apps.apple.com/xk/app/ultralytics/id1583935240) or
[Android](https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app) mobile device by downloading
the [Ultralytics App](https://ultralytics.com/app_install)!
## ❓ Issues
If you are a new [Ultralytics HUB](https://bit.ly/ultralytics_hub) user and have questions or comments, you are in the
right place! Please raise a [New Issue](https://github.com/ultralytics/hub/issues/new/choose) and let us know what we
can do to make your life better 😃!

@ -0,0 +1,66 @@
---
comments: true
description: Learn about the Ultralytics Android App, enabling real-time object detection using YOLO models. Discover in-app features, quantization methods, and delegate options for optimal performance.
keywords: Ultralytics, Android App, real-time object detection, YOLO models, TensorFlow Lite, FP16 quantization, INT8 quantization, CPU, GPU, Hexagon, NNAPI
---
# Ultralytics Android App: Real-time Object Detection with YOLO Models
The Ultralytics Android App is a powerful tool that allows you to run YOLO models directly on your Android device for real-time object detection. This app utilizes TensorFlow Lite for model optimization and various hardware delegates for acceleration, enabling fast and efficient object detection.
## Quantization and Acceleration
To achieve real-time performance on your Android device, YOLO models are quantized to either FP16 or INT8 precision. Quantization is a process that reduces the numerical precision of the model's weights and biases, thus reducing the model's size and the amount of computation required. This results in faster inference times without significantly affecting the model's accuracy.
### FP16 Quantization
FP16 (or half-precision) quantization converts the model's 32-bit floating-point numbers to 16-bit floating-point numbers. This reduces the model's size by half and speeds up the inference process, while maintaining a good balance between accuracy and performance.
### INT8 Quantization
INT8 (or 8-bit integer) quantization further reduces the model's size and computation requirements by converting its 32-bit floating-point numbers to 8-bit integers. This quantization method can result in a significant speedup, but it may lead to a slight reduction in mean average precision (mAP) due to the lower numerical precision.
!!! tip "mAP Reduction in INT8 Models"
The reduced numerical precision in INT8 models can lead to some loss of information during the quantization process, which may result in a slight decrease in mAP. However, this trade-off is often acceptable considering the substantial performance gains offered by INT8 quantization.
## Delegates and Performance Variability
Different delegates are available on Android devices to accelerate model inference. These delegates include CPU, [GPU](https://www.tensorflow.org/lite/android/delegates/gpu), [Hexagon](https://www.tensorflow.org/lite/android/delegates/hexagon) and [NNAPI](https://www.tensorflow.org/lite/android/delegates/nnapi). The performance of these delegates varies depending on the device's hardware vendor, product line, and specific chipsets used in the device.
1. **CPU**: The default option, with reasonable performance on most devices.
2. **GPU**: Utilizes the device's GPU for faster inference. It can provide a significant performance boost on devices with powerful GPUs.
3. **Hexagon**: Leverages Qualcomm's Hexagon DSP for faster and more efficient processing. This option is available on devices with Qualcomm Snapdragon processors.
4. **NNAPI**: The Android Neural Networks API (NNAPI) serves as an abstraction layer for running ML models on Android devices. NNAPI can utilize various hardware accelerators, such as CPU, GPU, and dedicated AI chips (e.g., Google's Edge TPU, or the Pixel Neural Core).
Here's a table showing the primary vendors, their product lines, popular devices, and supported delegates:
| Vendor | Product Lines | Popular Devices | Delegates Supported |
|-----------------------------------------|---------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------|
| [Qualcomm](https://www.qualcomm.com/) | [Snapdragon (e.g., 800 series)](https://www.qualcomm.com/snapdragon) | [Samsung Galaxy S21](https://www.samsung.com/global/galaxy/galaxy-s21-5g/), [OnePlus 9](https://www.oneplus.com/9), [Google Pixel 6](https://store.google.com/product/pixel_6) | CPU, GPU, Hexagon, NNAPI |
| [Samsung](https://www.samsung.com/) | [Exynos (e.g., Exynos 2100)](https://www.samsung.com/semiconductor/minisite/exynos/) | [Samsung Galaxy S21 (Global version)](https://www.samsung.com/global/galaxy/galaxy-s21-5g/) | CPU, GPU, NNAPI |
| [MediaTek](https://www.mediatek.com/) | [Dimensity (e.g., Dimensity 1200)](https://www.mediatek.com/products/smartphones) | [Realme GT](https://www.realme.com/global/realme-gt), [Xiaomi Redmi Note](https://www.mi.com/en/phone/redmi/note-list) | CPU, GPU, NNAPI |
| [HiSilicon](https://www.hisilicon.com/) | [Kirin (e.g., Kirin 990)](https://www.hisilicon.com/en/products/Kirin) | [Huawei P40 Pro](https://consumer.huawei.com/en/phones/p40-pro/), [Huawei Mate 30 Pro](https://consumer.huawei.com/en/phones/mate30-pro/) | CPU, GPU, NNAPI |
| [NVIDIA](https://www.nvidia.com/) | [Tegra (e.g., Tegra X1)](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems-dev-kits-modules/) | [NVIDIA Shield TV](https://www.nvidia.com/en-us/shield/shield-tv/), [Nintendo Switch](https://www.nintendo.com/switch/) | CPU, GPU, NNAPI |
Please note that the list of devices mentioned is not exhaustive and may vary depending on the specific chipsets and device models. Always test your models on your target devices to ensure compatibility and optimal performance.
Keep in mind that the choice of delegate can affect performance and model compatibility. For example, some models may not work with certain delegates, or a delegate may not be available on a specific device. As such, it's essential to test your model and the chosen delegate on your target devices for the best results.
## Getting Started with the Ultralytics Android App
To get started with the Ultralytics Android App, follow these steps:
1. Download the Ultralytics App from the [Google Play Store](https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app).
2. Launch the app on your Android device and sign in with your Ultralytics account. If you don't have an account yet, create one [here](https://hub.ultralytics.com/).
3. Once signed in, you will see a list of your trained YOLO models. Select a model to use for object detection.
4. Grant the app permission to access your device's camera.
5. Point your device's camera at objects you want to detect. The app will display bounding boxes and class labels in real-time as it detects objects.
6. Explore the app's settings to adjust the detection threshold, enable or disable specific object classes, and more.
With the Ultralytics Android App, you now have the power of real-time object detection using YOLO models right at your fingertips. Enjoy exploring the app's features and optimizing its settings to suit your specific use cases.

@ -1,8 +1,10 @@
---
comments: true
description: Explore the Ultralytics HUB App, offering the ability to run YOLOv5 and YOLOv8 models on your iOS and Android devices with optimized performance.
keywords: Ultralytics, HUB App, YOLOv5, YOLOv8, mobile AI, real-time object detection, image recognition, mobile device, hardware acceleration, Apple Neural Engine, Android GPU, NNAPI, custom model training
---
# Ultralytics HUB App for YOLOv8
# Ultralytics HUB App
<a href="https://bit.ly/ultralytics_hub" target="_blank">
<img width="100%" src="https://github.com/ultralytics/assets/raw/main/im/ultralytics-hub.png"></a>
@ -11,7 +13,7 @@ comments: true
<a href="https://github.com/ultralytics" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-github.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://www.linkedin.com/company/ultralytics" style="text-decoration:none;">
<a href="https://www.linkedin.com/company/ultralytics/" style="text-decoration:none;">
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-social-linkedin.png" width="2%" alt="" /></a>
<img src="https://github.com/ultralytics/assets/raw/main/social/logo-transparent.png" width="2%" alt="" />
<a href="https://twitter.com/ultralytics" style="text-decoration:none;">
@ -33,20 +35,18 @@ comments: true
<img src="https://raw.githubusercontent.com/ultralytics/assets/master/app/app-store.svg" width="15%" alt="" /></a>
</div>
Welcome to the Ultralytics HUB app, which is designed to demonstrate the power and capabilities of the YOLOv5 and YOLOv8
models. This app is available for download on
the [Apple App Store](https://apps.apple.com/xk/app/ultralytics/id1583935240) and
the [Google Play Store](https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app).
Welcome to the Ultralytics HUB App! We are excited to introduce this powerful mobile app that allows you to run YOLOv5 and YOLOv8 models directly on your [iOS](https://apps.apple.com/xk/app/ultralytics/id1583935240) and [Android](https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app) devices. With the HUB App, you can utilize hardware acceleration features like Apple's Neural Engine (ANE) or Android GPU and Neural Network API (NNAPI) delegates to achieve impressive performance on your mobile device.
**To install the app, simply scan the QR code provided above**. At the moment, the app features YOLOv5 models, with
YOLOv8 models set to be available soon.
## Features
With the YOLOv5 model, you can easily detect and classify objects in images and videos with high accuracy and speed. The
model has been trained on a vast dataset and can recognize a wide range of objects, including pedestrians, traffic
signs, and cars.
- **Run YOLOv5 and YOLOv8 models**: Experience the power of YOLO models on your mobile device for real-time object detection and image recognition tasks.
- **Hardware Acceleration**: Benefit from Apple ANE on iOS devices or Android GPU and NNAPI delegates for optimized performance.
- **Custom Model Training**: Train custom models with the Ultralytics HUB platform and preview them live using the HUB App.
- **Mobile Compatibility**: The HUB App supports both iOS and Android devices, bringing the power of YOLO models to a wide range of users.
Using this app, you can try out YOLOv5 on your images and videos, and observe how the model works in real-time.
Additionally, you can learn more about YOLOv5's functionality and how it can be integrated into real-world applications.
## App Documentation
We are confident that you will enjoy using YOLOv5 and be amazed at its capabilities. Thank you for choosing Ultralytics
for your AI solutions.
- [**iOS**](./ios.md): Learn about YOLO CoreML models accelerated on Apple's Neural Engine for iPhones and iPads.
- [**Android**](./android.md): Explore TFLite acceleration on Android mobile devices.
Get started today by downloading the Ultralytics HUB App on your mobile device and unlock the potential of YOLOv5 and YOLOv8 models on-the-go. Don't forget to check out our comprehensive [HUB Docs](../index.md) for more information on training, deploying, and using your custom models with the Ultralytics HUB platform.

@ -0,0 +1,56 @@
---
comments: true
description: Execute object detection in real-time on your iOS devices utilizing YOLO models. Leverage the power of the Apple Neural Engine and Core ML for fast and efficient object detection.
keywords: Ultralytics, iOS app, object detection, YOLO models, real time, Apple Neural Engine, Core ML, FP16, INT8, quantization
---
# Ultralytics iOS App: Real-time Object Detection with YOLO Models
The Ultralytics iOS App is a powerful tool that allows you to run YOLO models directly on your iPhone or iPad for real-time object detection. This app utilizes the Apple Neural Engine and Core ML for model optimization and acceleration, enabling fast and efficient object detection.
## Quantization and Acceleration
To achieve real-time performance on your iOS device, YOLO models are quantized to either FP16 or INT8 precision. Quantization is a process that reduces the numerical precision of the model's weights and biases, thus reducing the model's size and the amount of computation required. This results in faster inference times without significantly affecting the model's accuracy.
### FP16 Quantization
FP16 (or half-precision) quantization converts the model's 32-bit floating-point numbers to 16-bit floating-point numbers. This reduces the model's size by half and speeds up the inference process, while maintaining a good balance between accuracy and performance.
### INT8 Quantization
INT8 (or 8-bit integer) quantization further reduces the model's size and computation requirements by converting its 32-bit floating-point numbers to 8-bit integers. This quantization method can result in a significant speedup, but it may lead to a slight reduction in accuracy.
## Apple Neural Engine
The Apple Neural Engine (ANE) is a dedicated hardware component integrated into Apple's A-series and M-series chips. It's designed to accelerate machine learning tasks, particularly for neural networks, allowing for faster and more efficient execution of your YOLO models.
By combining quantized YOLO models with the Apple Neural Engine, the Ultralytics iOS App achieves real-time object detection on your iOS device without compromising on accuracy or performance.
| Release Year | iPhone Name | Chipset Name | Node Size | ANE TOPs |
|--------------|------------------------------------------------------|-------------------------------------------------------|-----------|----------|
| 2017 | [iPhone X](https://en.wikipedia.org/wiki/IPhone_X) | [A11 Bionic](https://en.wikipedia.org/wiki/Apple_A11) | 10 nm | 0.6 |
| 2018 | [iPhone XS](https://en.wikipedia.org/wiki/IPhone_XS) | [A12 Bionic](https://en.wikipedia.org/wiki/Apple_A12) | 7 nm | 5 |
| 2019 | [iPhone 11](https://en.wikipedia.org/wiki/IPhone_11) | [A13 Bionic](https://en.wikipedia.org/wiki/Apple_A13) | 7 nm | 6 |
| 2020 | [iPhone 12](https://en.wikipedia.org/wiki/IPhone_12) | [A14 Bionic](https://en.wikipedia.org/wiki/Apple_A14) | 5 nm | 11 |
| 2021 | [iPhone 13](https://en.wikipedia.org/wiki/IPhone_13) | [A15 Bionic](https://en.wikipedia.org/wiki/Apple_A15) | 5 nm | 15.8 |
| 2022 | [iPhone 14](https://en.wikipedia.org/wiki/IPhone_14) | [A16 Bionic](https://en.wikipedia.org/wiki/Apple_A16) | 4 nm | 17.0 |
Please note that this list only includes iPhone models from 2017 onwards, and the ANE TOPs values are approximate.
## Getting Started with the Ultralytics iOS App
To get started with the Ultralytics iOS App, follow these steps:
1. Download the Ultralytics App from the [App Store](https://apps.apple.com/xk/app/ultralytics/id1583935240).
2. Launch the app on your iOS device and sign in with your Ultralytics account. If you don't have an account yet, create one [here](https://hub.ultralytics.com/).
3. Once signed in, you will see a list of your trained YOLO models. Select a model to use for object detection.
4. Grant the app permission to access your device's camera.
5. Point your device's camera at objects you want to detect. The app will display bounding boxes and class labels in real-time as it detects objects.
6. Explore the app's settings to adjust the detection threshold, enable or disable specific object classes, and more.
With the Ultralytics iOS App, you can now leverage the power of YOLO models for real-time object detection on your iPhone or iPad, powered by the Apple Neural Engine and optimized with FP16 or INT8 quantization.

@ -0,0 +1,159 @@
---
comments: true
description: Learn how Ultralytics HUB datasets streamline your ML workflow. Upload, format, validate, access, share, edit or delete datasets for Ultralytics YOLO model training.
keywords: Ultralytics, HUB datasets, YOLO model training, upload datasets, dataset validation, ML workflow, share datasets
---
# HUB Datasets
Ultralytics HUB datasets are a practical solution for managing and leveraging your custom datasets.
Once uploaded, datasets can be immediately utilized for model training. This integrated approach facilitates a seamless transition from dataset management to model training, significantly simplifying the entire process.
## Upload Dataset
Ultralytics HUB datasets are just like YOLOv5 and YOLOv8 🚀 datasets. They use the same structure and the same label formats to keep
everything simple.
Before you upload a dataset to Ultralytics HUB, make sure to **place your dataset YAML file inside the dataset root directory** and that **your dataset YAML, directory and ZIP have the same name**, as shown in the example below, and then zip the dataset directory.
For example, if your dataset is called "coco8", as our [COCO8](https://docs.ultralytics.com/datasets/detect/coco8) example dataset, then you should have a `coco8.yaml` inside your `coco8/` directory, which will create a `coco8.zip` when zipped:
```bash
zip -r coco8.zip coco8
```
You can download our [COCO8](https://github.com/ultralytics/hub/blob/master/example_datasets/coco8.zip) example dataset and unzip it to see exactly how to structure your dataset.
<p align="center">
<img src="https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_1.jpg" alt="COCO8 Dataset Structure" width="80%" />
</p>
The dataset YAML is the same standard YOLOv5 and YOLOv8 YAML format.
!!! example "coco8.yaml"
```yaml
--8<-- "ultralytics/cfg/datasets/coco8.yaml"
```
After zipping your dataset, you should validate it before uploading it to Ultralytics HUB. Ultralytics HUB conducts the dataset validation check post-upload, so by ensuring your dataset is correctly formatted and error-free ahead of time, you can forestall any setbacks due to dataset rejection.
```py
from ultralytics.hub import check_dataset
check_dataset('path/to/coco8.zip')
```
Once your dataset ZIP is ready, navigate to the [Datasets](https://hub.ultralytics.com/datasets) page by clicking on the **Datasets** button in the sidebar.
![Ultralytics HUB screenshot of the Home page with an arrow pointing to the Datasets button in the sidebar](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_2.jpg)
??? tip "Tip"
You can also upload a dataset directly from the [Home](https://hub.ultralytics.com/home) page.
![Ultralytics HUB screenshot of the Home page with an arrow pointing to the Upload Dataset card](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_3.jpg)
Click on the **Upload Dataset** button on the top right of the page. This action will trigger the **Upload Dataset** dialog.
![Ultralytics HUB screenshot of the Dataset page with an arrow pointing to the Upload Dataset button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_4.jpg)
Upload your dataset in the _Dataset .zip file_ field.
You have the additional option to set a custom name and description for your Ultralytics HUB dataset.
When you're happy with your dataset configuration, click **Upload**.
![Ultralytics HUB screenshot of the Upload Dataset dialog with an arrow pointing to the Upload button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_5.jpg)
After your dataset is uploaded and processed, you will be able to access it from the Datasets page.
![Ultralytics HUB screenshot of the Datasets page with an arrow pointing to one of the datasets](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_6.jpg)
You can view the images in your dataset grouped by splits (Train, Validation, Test).
![Ultralytics HUB screenshot of the Dataset page with an arrow pointing to the Images tab](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_7.jpg)
??? tip "Tip"
Each image can be enlarged for better visualization.
![Ultralytics HUB screenshot of the Images tab inside the Dataset page with an arrow pointing to the expand icon](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_8.jpg)
![Ultralytics HUB screenshot of the Images tab inside the Dataset page with one of the images expanded](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_9.jpg)
Also, you can analyze your dataset by click on the **Overview** tab.
![Ultralytics HUB screenshot of the Dataset page with an arrow pointing to the Overview tab](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_10.jpg)
Next, [train a model](https://docs.ultralytics.com/hub/models/#train-model) on your dataset.
![Ultralytics HUB screenshot of the Dataset page with an arrow pointing to the Train Model button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_upload_dataset_11.jpg)
## Share Dataset
!!! info "Info"
Ultralytics HUB's sharing functionality provides a convenient way to share datasets with others. This feature is designed to accommodate both existing Ultralytics HUB users and those who have yet to create an account.
??? note "Note"
You have control over the general access of your datasets.
You can choose to set the general access to "Private", in which case, only you will have access to it. Alternatively, you can set the general access to "Unlisted" which grants viewing access to anyone who has the direct link to the dataset, regardless of whether they have an Ultralytics HUB account or not.
Navigate to the Dataset page of the dataset you want to share, open the dataset actions dropdown and click on the **Share** option. This action will trigger the **Share Dataset** dialog.
![Ultralytics HUB screenshot of the Dataset page with an arrow pointing to the Share option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_share_dataset_1.jpg)
??? tip "Tip"
You can also share a dataset directly from the [Datasets](https://hub.ultralytics.com/datasets) page.
![Ultralytics HUB screenshot of the Datasets page with an arrow pointing to the Share option of one of the datasets](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_share_dataset_2.jpg)
Set the general access to "Unlisted" and click **Save**.
![Ultralytics HUB screenshot of the Share Dataset dialog with an arrow pointing to the dropdown and one to the Save button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_share_dataset_3.jpg)
Now, anyone who has the direct link to your dataset can view it.
??? tip "Tip"
You can easily click on the dataset's link shown in the **Share Dataset** dialog to copy it.
![Ultralytics HUB screenshot of the Share Dataset dialog with an arrow pointing to the dataset's link](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_share_dataset_4.jpg)
## Edit Dataset
Navigate to the Dataset page of the dataset you want to edit, open the dataset actions dropdown and click on the **Edit** option. This action will trigger the **Update Dataset** dialog.
![Ultralytics HUB screenshot of the Dataset page with an arrow pointing to the Edit option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_edit_dataset_1.jpg)
??? tip "Tip"
You can also edit a dataset directly from the [Datasets](https://hub.ultralytics.com/datasets) page.
![Ultralytics HUB screenshot of the Datasets page with an arrow pointing to the Edit option of one of the datasets](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_edit_dataset_2.jpg)
Apply the desired modifications to your dataset and then confirm the changes by clicking **Save**.
![Ultralytics HUB screenshot of the Update Dataset dialog with an arrow pointing to the Save button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_edit_dataset_3.jpg)
## Delete Dataset
Navigate to the Dataset page of the dataset you want to delete, open the dataset actions dropdown and click on the **Delete** option. This action will delete the dataset.
![Ultralytics HUB screenshot of the Dataset page with an arrow pointing to the Delete option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_delete_dataset_1.jpg)
??? tip "Tip"
You can also delete a dataset directly from the [Datasets](https://hub.ultralytics.com/datasets) page.
![Ultralytics HUB screenshot of the Datasets page with an arrow pointing to the Delete option of one of the datasets](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_delete_dataset_2.jpg)
??? note "Note"
If you change your mind, you can restore the dataset from the [Trash](https://hub.ultralytics.com/trash) page.
![Ultralytics HUB screenshot of the Trash page with an arrow pointing to the Restore option of one of the datasets](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/datasets/hub_delete_dataset_3.jpg)

@ -0,0 +1,42 @@
---
comments: true
description: Gain seamless experience in training and deploying your YOLOv5 and YOLOv8 models with Ultralytics HUB. Explore pre-trained models, templates and various integrations.
keywords: Ultralytics HUB, YOLOv5, YOLOv8, model training, model deployment, pretrained models, model integrations
---
# Ultralytics HUB
<a href="https://bit.ly/ultralytics_hub" target="_blank">
<img width="100%" src="https://github.com/ultralytics/assets/raw/main/im/ultralytics-hub.png"></a>
<br>
<br>
<div align="center">
<a href="https://github.com/ultralytics/hub/actions/workflows/ci.yaml">
<img src="https://github.com/ultralytics/hub/actions/workflows/ci.yaml/badge.svg" alt="CI CPU"></a>
<a href="https://colab.research.google.com/github/ultralytics/hub/blob/master/hub.ipynb">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a>
</div>
<br>
👋 Hello from the [Ultralytics](https://ultralytics.com/) Team! We've been working hard these last few months to
launch [Ultralytics HUB](https://bit.ly/ultralytics_hub), a new web tool for training and deploying all your YOLOv5 and YOLOv8 🚀
models from one spot!
## Introduction
HUB is designed to be user-friendly and intuitive, with a drag-and-drop interface that allows users to
easily upload their data and train new models quickly. It offers a range of pre-trained models and
templates to choose from, making it easy for users to get started with training their own models. Once a model is
trained, it can be easily deployed and used for real-time object detection, instance segmentation and classification tasks.
We hope that the resources here will help you get the most out of HUB. Please browse the HUB <a href="https://docs.ultralytics.com/hub">Docs</a> for details, raise an issue on <a href="https://github.com/ultralytics/hub/issues/new/choose">GitHub</a> for support, and join our <a href="https://ultralytics.com/discord">Discord</a> community for questions and discussions!
- [**Quickstart**](./quickstart.md). Start training and deploying YOLO models with HUB in seconds.
- [**Datasets: Preparing and Uploading**](./datasets.md). Learn how to prepare and upload your datasets to HUB in YOLO format.
- [**Projects: Creating and Managing**](./projects.md). Group your models into projects for improved organization.
- [**Models: Training and Exporting**](./models.md). Train YOLOv5 and YOLOv8 models on your custom datasets and export them to various formats for deployment.
- [**Integrations: Options**](./integrations.md). Explore different integration options for your trained models, such as TensorFlow, ONNX, OpenVINO, CoreML, and PaddlePaddle.
- [**Ultralytics HUB App**](./app/index.md). Learn about the Ultralytics App for iOS and Android, which allows you to run models directly on your mobile device.
* [**iOS**](./app/ios.md). Learn about YOLO CoreML models accelerated on Apple's Neural Engine on iPhones and iPads.
* [**Android**](./app/android.md). Explore TFLite acceleration on mobile devices.
- [**Inference API**](./inference_api.md). Understand how to use the Inference API for running your trained models in the cloud to generate predictions.

@ -0,0 +1,458 @@
---
comments: true
description: Access object detection capabilities of YOLOv8 via our RESTful API. Learn how to use the YOLO Inference API with Python or CLI for swift object detection.
keywords: Ultralytics, YOLOv8, Inference API, object detection, RESTful API, Python, CLI, Quickstart
---
# YOLO Inference API
The YOLO Inference API allows you to access the YOLOv8 object detection capabilities via a RESTful API. This enables you to run object detection on images without the need to install and set up the YOLOv8 environment locally.
![Inference API Screenshot](https://github.com/ultralytics/ultralytics/assets/26833433/c0109ec0-7bb0-46e1-b0d2-bae687960a01)
Screenshot of the Inference API section in the trained model Preview tab.
## API URL
The API URL is the address used to access the YOLO Inference API. In this case, the base URL is:
```
https://api.ultralytics.com/v1/predict
```
## Example Usage in Python
To access the YOLO Inference API with the specified model and API key using Python, you can use the following code:
```python
import requests
# API URL, use actual MODEL_ID
url = f"https://api.ultralytics.com/v1/predict/MODEL_ID"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
```
In this example, replace `API_KEY` with your actual API key, `MODEL_ID` with the desired model ID, and `path/to/image.jpg` with the path to the image you want to analyze.
## Example Usage with CLI
You can use the YOLO Inference API with the command-line interface (CLI) by utilizing the `curl` command. Replace `API_KEY` with your actual API key, `MODEL_ID` with the desired model ID, and `image.jpg` with the path to the image you want to analyze:
```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
-H "x-api-key: API_KEY" \
-F "image=@/path/to/image.jpg" \
-F "size=640" \
-F "confidence=0.25" \
-F "iou=0.45"
```
## Passing Arguments
This command sends a POST request to the YOLO Inference API with the specified `MODEL_ID` in the URL and the `API_KEY` in the request `headers`, along with the image file specified by `@path/to/image.jpg`.
Here's an example of passing the `size`, `confidence`, and `iou` arguments via the API URL using the `requests` library in Python:
```python
import requests
# API URL, use actual MODEL_ID
url = f"https://api.ultralytics.com/v1/predict/MODEL_ID"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
```
In this example, the `data` dictionary contains the query arguments `size`, `confidence`, and `iou`, which tells the API to run inference at image size 640 with confidence and IoU thresholds of 0.25 and 0.45.
This will send the query parameters along with the file in the POST request. See the table below for a full list of available inference arguments.
| Inference Argument | Default | Type | Notes |
|--------------------|---------|---------|------------------------------------------------|
| `size` | `640` | `int` | valid range is `32` - `1280` pixels |
| `confidence` | `0.25` | `float` | valid range is `0.01` - `1.0` |
| `iou` | `0.45` | `float` | valid range is `0.0` - `0.95` |
| `url` | `''` | `str` | optional image URL if not image file is passed |
| `normalize` | `False` | `bool` | |
## Return JSON format
The YOLO Inference API returns a JSON list with the detection results. The format of the JSON list will be the same as the one produced locally by the `results[0].tojson()` command.
The JSON list contains information about the detected objects, their coordinates, classes, and confidence scores.
### Detect Model Format
YOLO detection models, such as `yolov8n.pt`, can return JSON responses from local inference, CLI API inference, and Python API inference. All of these methods produce the same JSON response format.
!!! example "Detect Model JSON Response"
=== "Local"
```python
from ultralytics import YOLO
# Load model
model = YOLO('yolov8n.pt')
# Run inference
results = model('image.jpg')
# Print image.jpg results in JSON format
print(results[0].tojson())
```
=== "CLI API"
```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
-H "x-api-key: API_KEY" \
-F "image=@/path/to/image.jpg" \
-F "size=640" \
-F "confidence=0.25" \
-F "iou=0.45"
```
=== "Python API"
```python
import requests
# API URL, use actual MODEL_ID
url = f"https://api.ultralytics.com/v1/predict/MODEL_ID"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
```
=== "JSON Response"
```json
{
"success": True,
"message": "Inference complete.",
"data": [
{
"name": "person",
"class": 0,
"confidence": 0.8359682559967041,
"box": {
"x1": 0.08974208831787109,
"y1": 0.27418340047200523,
"x2": 0.8706787109375,
"y2": 0.9887352837456598
}
},
{
"name": "person",
"class": 0,
"confidence": 0.8189555406570435,
"box": {
"x1": 0.5847355842590332,
"y1": 0.05813225640190972,
"x2": 0.8930277824401855,
"y2": 0.9903111775716146
}
},
{
"name": "tie",
"class": 27,
"confidence": 0.2909725308418274,
"box": {
"x1": 0.3433395862579346,
"y1": 0.6070465511745877,
"x2": 0.40964522361755373,
"y2": 0.9849439832899306
}
}
]
}
```
### Segment Model Format
YOLO segmentation models, such as `yolov8n-seg.pt`, can return JSON responses from local inference, CLI API inference, and Python API inference. All of these methods produce the same JSON response format.
!!! example "Segment Model JSON Response"
=== "Local"
```python
from ultralytics import YOLO
# Load model
model = YOLO('yolov8n-seg.pt')
# Run inference
results = model('image.jpg')
# Print image.jpg results in JSON format
print(results[0].tojson())
```
=== "CLI API"
```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
-H "x-api-key: API_KEY" \
-F "image=@/path/to/image.jpg" \
-F "size=640" \
-F "confidence=0.25" \
-F "iou=0.45"
```
=== "Python API"
```python
import requests
# API URL, use actual MODEL_ID
url = f"https://api.ultralytics.com/v1/predict/MODEL_ID"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
```
=== "JSON Response"
Note `segments` `x` and `y` lengths may vary from one object to another. Larger or more complex objects may have more segment points.
```json
{
"success": True,
"message": "Inference complete.",
"data": [
{
"name": "person",
"class": 0,
"confidence": 0.856913149356842,
"box": {
"x1": 0.1064866065979004,
"y1": 0.2798851860894097,
"x2": 0.8738358497619629,
"y2": 0.9894873725043403
},
"segments": {
"x": [
0.421875,
0.4203124940395355,
0.41718751192092896
...
],
"y": [
0.2888889014720917,
0.2916666567325592,
0.2916666567325592
...
]
}
},
{
"name": "person",
"class": 0,
"confidence": 0.8512625694274902,
"box": {
"x1": 0.5757311820983887,
"y1": 0.053943040635850696,
"x2": 0.8960096359252929,
"y2": 0.985154045952691
},
"segments": {
"x": [
0.7515624761581421,
0.75,
0.7437499761581421
...
],
"y": [
0.0555555559694767,
0.05833333358168602,
0.05833333358168602
...
]
}
},
{
"name": "tie",
"class": 27,
"confidence": 0.6485961675643921,
"box": {
"x1": 0.33911995887756347,
"y1": 0.6057066175672743,
"x2": 0.4081430912017822,
"y2": 0.9916408962673611
},
"segments": {
"x": [
0.37187498807907104,
0.37031251192092896,
0.3687500059604645
...
],
"y": [
0.6111111044883728,
0.6138888597488403,
0.6138888597488403
...
]
}
}
]
}
```
### Pose Model Format
YOLO pose models, such as `yolov8n-pose.pt`, can return JSON responses from local inference, CLI API inference, and Python API inference. All of these methods produce the same JSON response format.
!!! example "Pose Model JSON Response"
=== "Local"
```python
from ultralytics import YOLO
# Load model
model = YOLO('yolov8n-seg.pt')
# Run inference
results = model('image.jpg')
# Print image.jpg results in JSON format
print(results[0].tojson())
```
=== "CLI API"
```bash
curl -X POST "https://api.ultralytics.com/v1/predict/MODEL_ID" \
-H "x-api-key: API_KEY" \
-F "image=@/path/to/image.jpg" \
-F "size=640" \
-F "confidence=0.25" \
-F "iou=0.45"
```
=== "Python API"
```python
import requests
# API URL, use actual MODEL_ID
url = f"https://api.ultralytics.com/v1/predict/MODEL_ID"
# Headers, use actual API_KEY
headers = {"x-api-key": "API_KEY"}
# Inference arguments (optional)
data = {"size": 640, "confidence": 0.25, "iou": 0.45}
# Load image and send request
with open("path/to/image.jpg", "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, headers=headers, files=files, data=data)
print(response.json())
```
=== "JSON Response"
Note COCO-keypoints pretrained models will have 17 human keypoints. The `visible` part of the keypoints indicates whether a keypoint is visible or obscured. Obscured keypoints may be outside the image or may not be visible, i.e. a person's eyes facing away from the camera.
```json
{
"success": True,
"message": "Inference complete.",
"data": [
{
"name": "person",
"class": 0,
"confidence": 0.8439509868621826,
"box": {
"x1": 0.1125,
"y1": 0.28194444444444444,
"x2": 0.7953125,
"y2": 0.9902777777777778
},
"keypoints": {
"x": [
0.5058594942092896,
0.5103894472122192,
0.4920862317085266
...
],
"y": [
0.48964157700538635,
0.4643048942089081,
0.4465252459049225
...
],
"visible": [
0.8726999163627625,
0.653947651386261,
0.9130823612213135
...
]
}
},
{
"name": "person",
"class": 0,
"confidence": 0.7474289536476135,
"box": {
"x1": 0.58125,
"y1": 0.0625,
"x2": 0.8859375,
"y2": 0.9888888888888889
},
"keypoints": {
"x": [
0.778544008731842,
0.7976160049438477,
0.7530890107154846
...
],
"y": [
0.27595141530036926,
0.2378823608160019,
0.23644638061523438
...
],
"visible": [
0.8900790810585022,
0.789978563785553,
0.8974530100822449
...
]
}
}
]
}
```

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -0,0 +1,213 @@
---
comments: true
description: Learn how to use Ultralytics HUB models for efficient and user-friendly AI model training. For easy model creation, training, evaluation and deployment, follow our detailed guide.
keywords: Ultralytics, HUB Models, AI model training, model creation, model training, model evaluation, model deployment
---
# Ultralytics HUB Models
Ultralytics HUB models provide a streamlined solution for training vision AI models on your custom datasets.
The process is user-friendly and efficient, involving a simple three-step creation and accelerated training powered by Utralytics YOLOv8. During training, real-time updates on model metrics are available so that you can monitor each step of the progress. Once training is completed, you can preview your model and easily deploy it to real-world applications. Therefore, Ultralytics HUB offers a comprehensive yet straightforward system for model creation, training, evaluation, and deployment.
## Train Model
Navigate to the [Models](https://hub.ultralytics.com/models) page by clicking on the **Models** button in the sidebar.
![Ultralytics HUB screenshot of the Home page with an arrow pointing to the Models button in the sidebar](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_1.jpg)
??? tip "Tip"
You can also train a model directly from the [Home](https://hub.ultralytics.com/home) page.
![Ultralytics HUB screenshot of the Home page with an arrow pointing to the Train Model card](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_2.jpg)
Click on the **Train Model** button on the top right of the page. This action will trigger the **Train Model** dialog.
![Ultralytics HUB screenshot of the Models page with an arrow pointing to the Train Model button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_3.jpg)
The **Train Model** dialog has three simple steps, explained below.
### 1. Dataset
In this step, you have to select the dataset you want to train your model on. After you selected a dataset, click **Continue**.
![Ultralytics HUB screenshot of the Train Model dialog with an arrow pointing to a dataset and one to the Continue button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_4.jpg)
??? tip "Tip"
You can skip this step if you train a model directly from the Dataset page.
![Ultralytics HUB screenshot of the Dataset page with an arrow pointing to the Train Model button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_5.jpg)
### 2. Model
In this step, you have to choose the project in which you want to create your model, the name of your model and your model's architecture.
??? note "Note"
Ultralytics HUB will try to pre-select the project.
If you opened the **Train Model** dialog as described above, Ultralytics HUB will pre-select the last project you used.
If you opened the **Train Model** dialog from the Project page, Ultralytics HUB will pre-select the project you were inside of.
![Ultralytics HUB screenshot of the Project page with an arrow pointing to the Train Model button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_6.jpg)
In case you don't have a project created yet, you can set the name of your project in this step and it will be created together with your model.
![Ultralytics HUB screenshot of the Train Model dialog with an arrow pointing to the project name](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_7.jpg)
!!! info "Info"
You can read more about the available [YOLOv8](https://docs.ultralytics.com/models/yolov8) (and [YOLOv5](https://docs.ultralytics.com/models/yolov5)) architectures in our documentation.
When you're happy with your model configuration, click **Continue**.
![Ultralytics HUB screenshot of the Train Model dialog with an arrow pointing to a model architecture and one to the Continue button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_8.jpg)
??? note "Note"
By default, your model will use a pre-trained model (trained on the [COCO](https://docs.ultralytics.com/datasets/detect/coco) dataset) to reduce training time.
You can change this behaviour by opening the **Advanced Options** accordion.
### 3. Train
In this step, you will start training you model.
Ultralytics HUB offers three training options:
- Ultralytics Cloud **(COMING SOON)**
- Google Colab
- Bring your own agent
In order to start training your model, follow the instructions presented in this step.
![Ultralytics HUB screenshot of the Train Model dialog with an arrow pointing to each step](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_9.jpg)
??? note "Note"
When you are on this step, before the training starts, you can change the default training configuration by opening the **Advanced Options** accordion.
![Ultralytics HUB screenshot of the Train Model dialog with an arrow pointing to the Train Advanced Options](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_10.jpg)
??? note "Note"
When you are on this step, you have the option to close the **Train Model** dialog and start training your model from the Model page later.
![Ultralytics HUB screenshot of the Model page with an arrow pointing to the Start Training card](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_11.jpg)
To start training your model using Google Colab, simply follow the instructions shown above or on the Google Colab notebook.
<a href="https://colab.research.google.com/github/ultralytics/hub/blob/master/hub.ipynb" target="_blank">
<img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab">
</a>
When the training starts, you can click **Done** and monitor the training progress on the Model page.
![Ultralytics HUB screenshot of the Train Model dialog with an arrow pointing to the Done button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_12.jpg)
![Ultralytics HUB screenshot of the Model page of a model that is currently training](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_13.jpg)
??? note "Note"
In case the training stops and a checkpoint was saved, you can resume training your model from the Model page.
![Ultralytics HUB screenshot of the Model page with an arrow pointing to the Resume Training card](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_train_model_14.jpg)
## Preview Model
Ultralytics HUB offers a variety of ways to preview your trained model.
You can preview your model if you click on the **Preview** tab and upload an image in the **Test** card.
![Ultralytics HUB screenshot of the Preview tab (Test card) inside the Model page](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_preview_model_1.jpg)
You can also use our Ultralytics Cloud API to effortlessly [run inference](https://docs.ultralytics.com/hub/inference_api) with your custom model.
![Ultralytics HUB screenshot of the Preview tab (Ultralytics Cloud API card) inside the Model page](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_preview_model_2.jpg)
Furthermore, you can preview your model in real-time directly on your [iOS](https://apps.apple.com/xk/app/ultralytics/id1583935240) or [Android](https://play.google.com/store/apps/details?id=com.ultralytics.ultralytics_app) mobile device by [downloading](https://ultralytics.com/app_install) our [Ultralytics HUB Mobile Application](./app/index.md).
![Ultralytics HUB screenshot of the Deploy tab inside the Model page with arrow pointing to the Real-Time Preview card](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_preview_model_3.jpg)
## Deploy Model
You can export your model to 13 different formats, including ONNX, OpenVINO, CoreML, TensorFlow, Paddle and many others.
![Ultralytics HUB screenshot of the Deploy tab inside the Model page with all formats exported](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_deploy_model_1.jpg)
??? tip "Tip"
You can customize the export options of each format if you open the export actions dropdown and click on the **Advanced** option.
![Ultralytics HUB screenshot of the Deploy tab inside the Model page with an arrow pointing to the Advanced option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_deploy_model_2.jpg)
## Share Model
!!! info "Info"
Ultralytics HUB's sharing functionality provides a convenient way to share models with others. This feature is designed to accommodate both existing Ultralytics HUB users and those who have yet to create an account.
??? note "Note"
You have control over the general access of your models.
You can choose to set the general access to "Private", in which case, only you will have access to it. Alternatively, you can set the general access to "Unlisted" which grants viewing access to anyone who has the direct link to the model, regardless of whether they have an Ultralytics HUB account or not.
Navigate to the Model page of the model you want to share, open the model actions dropdown and click on the **Share** option. This action will trigger the **Share Model** dialog.
![Ultralytics HUB screenshot of the Model page with an arrow pointing to the Share option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_share_model_1.jpg)
??? tip "Tip"
You can also share a model directly from the [Models](https://hub.ultralytics.com/models) page or from the Project page of the project where your model is located.
![Ultralytics HUB screenshot of the Models page with an arrow pointing to the Share option of one of the models](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_share_model_2.jpg)
Set the general access to "Unlisted" and click **Save**.
![Ultralytics HUB screenshot of the Share Model dialog with an arrow pointing to the dropdown and one to the Save button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_share_model_3.jpg)
Now, anyone who has the direct link to your model can view it.
??? tip "Tip"
You can easily click on the models's link shown in the **Share Model** dialog to copy it.
![Ultralytics HUB screenshot of the Share Model dialog with an arrow pointing to the model's link](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_share_model_4.jpg)
## Edit Model
Navigate to the Model page of the model you want to edit, open the model actions dropdown and click on the **Edit** option. This action will trigger the **Update Model** dialog.
![Ultralytics HUB screenshot of the Model page with an arrow pointing to the Edit option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_edit_model_1.jpg)
??? tip "Tip"
You can also edit a model directly from the [Models](https://hub.ultralytics.com/models) page or from the Project page of the project where your model is located.
![Ultralytics HUB screenshot of the Models page with an arrow pointing to the Edit option of one of the models](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_edit_model_2.jpg)
Apply the desired modifications to your model and then confirm the changes by clicking **Save**.
![Ultralytics HUB screenshot of the Update Model dialog with an arrow pointing to the Save button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_edit_model_3.jpg)
## Delete Model
Navigate to the Model page of the model you want to delete, open the model actions dropdown and click on the **Delete** option. This action will delete the model.
![Ultralytics HUB screenshot of the Model page with an arrow pointing to the Delete option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_delete_model_1.jpg)
??? tip "Tip"
You can also delete a model directly from the [Models](https://hub.ultralytics.com/models) page or from the Project page of the project where your model is located.
![Ultralytics HUB screenshot of the Models page with an arrow pointing to the Delete option of one of the models](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_delete_model_2.jpg)
??? note "Note"
If you change your mind, you can restore the model from the [Trash](https://hub.ultralytics.com/trash) page.
![Ultralytics HUB screenshot of the Trash page with an arrow pointing to the Restore option of one of the models](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/models/hub_delete_model_3.jpg)

@ -0,0 +1,169 @@
---
comments: true
description: Learn how to manage Ultralytics HUB projects. Understand effective strategies to create, share, edit, delete, and compare models in an organized workspace.
keywords: Ultralytics, HUB projects, Create project, Edit project, Share project, Delete project, Compare Models, Model Management
---
# Ultralytics HUB Projects
Ultralytics HUB projects provide an effective solution for consolidating and managing your models. If you are working with several models that perform similar tasks or have related purposes, Ultralytics HUB projects allow you to group these models together.
This creates a unified and organized workspace that facilitates easier model management, comparison and development. Having similar models or various iterations together can facilitate rapid benchmarking, as you can compare their effectiveness. This can lead to faster, more insightful iterative development and refinement of your models.
## Create Project
Navigate to the [Projects](https://hub.ultralytics.com/projects) page by clicking on the **Projects** button in the sidebar.
![Ultralytics HUB screenshot of the Home page with an arrow pointing to the Projects button in the sidebar](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_create_project_1.jpg)
??? tip "Tip"
You can also create a project directly from the [Home](https://hub.ultralytics.com/home) page.
![Ultralytics HUB screenshot of the Home page with an arrow pointing to the Create Project card](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_create_project_2.jpg)
Click on the **Create Project** button on the top right of the page. This action will trigger the **Create Project** dialog, opening up a suite of options for tailoring your project to your needs.
![Ultralytics HUB screenshot of the Projects page with an arrow pointing to the Create Project button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_create_project_3.jpg)
Type the name of your project in the _Project name_ field or keep the default name and finalize the project creation with a single click.
You have the additional option to enrich your project with a description and a unique image, enhancing its recognizability on the Projects page.
When you're happy with your project configuration, click **Create**.
![Ultralytics HUB screenshot of the Create Project dialog with an arrow pointing to the Create button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_create_project_4.jpg)
After your project is created, you will be able to access it from the Projects page.
![Ultralytics HUB screenshot of the Projects page with an arrow pointing to one of the projects](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_create_project_5.jpg)
Next, [train a model](https://docs.ultralytics.com/hub/models/#train-model) inside your project.
![Ultralytics HUB screenshot of the Project page with an arrow pointing to the Train Model button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_create_project_6.jpg)
## Share Project
!!! info "Info"
Ultralytics HUB's sharing functionality provides a convenient way to share projects with others. This feature is designed to accommodate both existing Ultralytics HUB users and those who have yet to create an account.
??? note "Note"
You have control over the general access of your projects.
You can choose to set the general access to "Private", in which case, only you will have access to it. Alternatively, you can set the general access to "Unlisted" which grants viewing access to anyone who has the direct link to the project, regardless of whether they have an Ultralytics HUB account or not.
Navigate to the Project page of the project you want to share, open the project actions dropdown and click on the **Share** option. This action will trigger the **Share Project** dialog.
![Ultralytics HUB screenshot of the Project page with an arrow pointing to the Share option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_share_project_1.jpg)
??? tip "Tip"
You can also share a project directly from the [Projects](https://hub.ultralytics.com/projects) page.
![Ultralytics HUB screenshot of the Projects page with an arrow pointing to the Share option of one of the projects](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_share_project_2.jpg)
Set the general access to "Unlisted" and click **Save**.
![Ultralytics HUB screenshot of the Share Project dialog with an arrow pointing to the dropdown and one to the Save button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_share_project_3.jpg)
!!! warning "Warning"
When changing the general access of a project, the general access of the models inside the project will be changed as well.
Now, anyone who has the direct link to your project can view it.
??? tip "Tip"
You can easily click on the project's link shown in the **Share Project** dialog to copy it.
![Ultralytics HUB screenshot of the Share Project dialog with an arrow pointing to the project's link](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_share_project_4.jpg)
## Edit Project
Navigate to the Project page of the project you want to edit, open the project actions dropdown and click on the **Edit** option. This action will trigger the **Update Project** dialog.
![Ultralytics HUB screenshot of the Project page with an arrow pointing to the Edit option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_edit_project_1.jpg)
??? tip "Tip"
You can also edit a project directly from the [Projects](https://hub.ultralytics.com/projects) page.
![Ultralytics HUB screenshot of the Projects page with an arrow pointing to the Edit option of one of the projects](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_edit_project_2.jpg)
Apply the desired modifications to your project and then confirm the changes by clicking **Save**.
![Ultralytics HUB screenshot of the Update Project dialog with an arrow pointing to the Save button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_edit_project_3.jpg)
## Delete Project
Navigate to the Project page of the project you want to delete, open the project actions dropdown and click on the **Delete** option. This action will delete the project.
![Ultralytics HUB screenshot of the Project page with an arrow pointing to the Delete option](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_delete_project_1.jpg)
??? tip "Tip"
You can also delete a project directly from the [Projects](https://hub.ultralytics.com/projects) page.
![Ultralytics HUB screenshot of the Projects page with an arrow pointing to the Delete option of one of the projects](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_delete_project_2.jpg)
!!! warning "Warning"
When deleting a project, the the models inside the project will be deleted as well.
??? note "Note"
If you change your mind, you can restore the project from the [Trash](https://hub.ultralytics.com/trash) page.
![Ultralytics HUB screenshot of the Trash page with an arrow pointing to the Restore option of one of the projects](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_delete_project_3.jpg)
## Compare Models
Navigate to the Project page of the project where the models you want to compare are located. To use the model comparison feature, click on the **Charts** tab.
![Ultralytics HUB screenshot of the Project page with an arrow pointing to the Charts tab](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_compare_models_1.jpg)
This will display all the relevant charts. Each chart corresponds to a different metric and contains the performance of each model for that metric. The models are represented by different colors and you can hover over each data point to get more information.
![Ultralytics HUB screenshot of the Charts tab inside the Project page](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_compare_models_2.jpg)
??? tip "Tip"
Each chart can be enlarged for better visualization.
![Ultralytics HUB screenshot of the Charts tab inside the Project page with an arrow pointing to the expand icon](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_compare_models_3.jpg)
![Ultralytics HUB screenshot of the Charts tab inside the Project page with one of the charts expanded](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_compare_models_4.jpg)
??? tip "Tip"
You have the flexibility to customize your view by selectively hiding certain models. This feature allows you to concentrate on the models of interest.
![Ultralytics HUB screenshot of the Charts tab inside the Project page with an arrow pointing to the hide/unhide icon of one of the model](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_compare_models_5.jpg)
## Reorder Models
??? note "Note"
Ultralytics HUB's reordering functionality works only inside projects you own.
Navigate to the Project page of the project where the models you want to reorder are located. Click on the designated reorder icon of the model you want to move and drag it to the desired location.
![Ultralytics HUB screenshot of the Project page with an arrow pointing to the reorder icon](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_reorder_models_1.jpg)
## Transfer Models
Navigate to the Project page of the project where the model you want to mode is located, open the project actions dropdown and click on the **Transfer** option. This action will trigger the **Transfer Model** dialog.
![Ultralytics HUB screenshot of the Project page with an arrow pointing to the Transfer option of one of the models](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_transfer_models_1.jpg)
??? tip "Tip"
You can also transfer a model directly from the [Models](https://hub.ultralytics.com/models) page.
![Ultralytics HUB screenshot of the Models page with an arrow pointing to the Transfer option of one of the models](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_transfer_models_2.jpg)
Select the project you want to transfer the model to and click **Save**.
![Ultralytics HUB screenshot of the Transfer Model dialog with an arrow pointing to the dropdown and one to the Save button](https://raw.githubusercontent.com/ultralytics/assets/main/docs/hub/projects/hub_transfer_models_3.jpg)

@ -0,0 +1,7 @@
---
comments: true
---
# 🚧 Page Under Construction ⚒
This page is currently under construction! 👷Please check back later for updates. 😃🔜

@ -1,5 +1,7 @@
---
comments: true
description: Explore a complete guide to Ultralytics YOLOv8, a high-speed, high-accuracy object detection & image segmentation model. Installation, prediction, training tutorials and more.
keywords: Ultralytics, YOLOv8, object detection, image segmentation, machine learning, deep learning, computer vision, YOLOv8 installation, YOLOv8 prediction, YOLOv8 training, YOLO history, YOLO licenses
---
<div align="center">
@ -8,6 +10,7 @@ comments: true
<img width="1024" src="https://raw.githubusercontent.com/ultralytics/assets/main/yolov8/banner-yolov8.png"></a>
</p>
<a href="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml"><img src="https://github.com/ultralytics/ultralytics/actions/workflows/ci.yaml/badge.svg" alt="Ultralytics CI"></a>
<a href="https://codecov.io/github/ultralytics/ultralytics"><img src="https://codecov.io/github/ultralytics/ultralytics/branch/main/graph/badge.svg?token=HHW7IIVFVY" alt="Ultralytics Code Coverage"></a>
<a href="https://zenodo.org/badge/latestdoi/264818686"><img src="https://zenodo.org/badge/264818686.svg" alt="YOLOv8 Citation"></a>
<a href="https://hub.docker.com/r/ultralytics/ultralytics"><img src="https://img.shields.io/docker/pulls/ultralytics/ultralytics?logo=docker" alt="Docker Pulls"></a>
<br>
@ -23,7 +26,7 @@ Explore the YOLOv8 Docs, a comprehensive resource designed to help you understan
## Where to Start
- **Install** `ultralytics` with pip and get up and running in minutes &nbsp; [:material-clock-fast: Get Started](quickstart.md){ .md-button }
- **Predict** new images and videos with YOLOv8 &nbsp; [:octicons-image-16: Predict on Images](modes/predict.md){ .md-button }
- **Predict** new images and videos with YOLOv8 &nbsp; [:octicons-image-16: Predict on Images](modes/predict.md){ .md-button }
- **Train** a new YOLOv8 model on your own custom dataset &nbsp; [:fontawesome-solid-brain: Train a Model](modes/train.md){ .md-button }
- **Explore** YOLOv8 tasks like segment, classify, pose and track &nbsp; [:material-magnify-expand: Explore Tasks](tasks/index.md){ .md-button }
@ -38,3 +41,12 @@ Explore the YOLOv8 Docs, a comprehensive resource designed to help you understan
- [YOLOv6](https://github.com/meituan/YOLOv6) was open-sourced by [Meituan](https://about.meituan.com/) in 2022 and is in use in many of the company's autonomous delivery robots.
- [YOLOv7](https://github.com/WongKinYiu/yolov7) added additional tasks such as pose estimation on the COCO keypoints dataset.
- [YOLOv8](https://github.com/ultralytics/ultralytics) is the latest version of YOLO by Ultralytics. As a cutting-edge, state-of-the-art (SOTA) model, YOLOv8 builds on the success of previous versions, introducing new features and improvements for enhanced performance, flexibility, and efficiency. YOLOv8 supports a full range of vision AI tasks, including [detection](tasks/detect.md), [segmentation](tasks/segment.md), [pose estimation](tasks/pose.md), [tracking](modes/track.md), and [classification](tasks/classify.md). This versatility allows users to leverage YOLOv8's capabilities across diverse applications and domains.
## YOLO Licenses: How is Ultralytics YOLO licensed?
Ultralytics offers two licensing options to accommodate diverse use cases:
- **AGPL-3.0 License**: This [OSI-approved](https://opensource.org/licenses/) open-source license is ideal for students and enthusiasts, promoting open collaboration and knowledge sharing. See the [LICENSE](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) file for more details.
- **Enterprise License**: Designed for commercial use, this license permits seamless integration of Ultralytics software and AI models into commercial goods and services, bypassing the open-source requirements of AGPL-3.0. If your scenario involves embedding our solutions into a commercial offering, reach out through [Ultralytics Licensing](https://ultralytics.com/license).
Our licensing strategy is designed to ensure that any improvements to our open-source projects are returned to the community. We hold the principles of open source close to our hearts ❤, and our mission is to guarantee that our contributions can be utilized and expanded upon in ways that are beneficial to all.

@ -1,429 +0,0 @@
---
comments: true
---
# YOLO Inference API (UNDER CONSTRUCTION)
The YOLO Inference API allows you to access the YOLOv8 object detection capabilities via a RESTful API. This enables you to run object detection on images without the need to install and set up the YOLOv8 environment locally.
## API URL
The API URL is the address used to access the YOLO Inference API. In this case, the base URL is:
```
https://api.ultralytics.com/inference/v1
```
To access the API with a specific model and your API key, you can include them as query parameters in the API URL. The `model` parameter refers to the `MODEL_ID` you want to use for inference, and the `key` parameter corresponds to your `API_KEY`.
The complete API URL with the model and API key parameters would be:
```
https://api.ultralytics.com/inference/v1?model=MODEL_ID&key=API_KEY
```
Replace `MODEL_ID` with the ID of the model you want to use and `API_KEY` with your actual API key from [https://hub.ultralytics.com/settings?tab=api+keys](https://hub.ultralytics.com/settings?tab=api+keys).
## Example Usage in Python
To access the YOLO Inference API with the specified model and API key using Python, you can use the following code:
```python
import requests
api_key = "API_KEY"
model_id = "MODEL_ID"
url = f"https://api.ultralytics.com/inference/v1?model={model_id}&key={api_key}"
image_path = "image.jpg"
with open(image_path, "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, files=files)
print(response.text)
```
In this example, replace `API_KEY` with your actual API key, `MODEL_ID` with the desired model ID, and `image.jpg` with the path to the image you want to analyze.
## Example Usage with CLI
You can use the YOLO Inference API with the command-line interface (CLI) by utilizing the `curl` command. Replace `API_KEY` with your actual API key, `MODEL_ID` with the desired model ID, and `image.jpg` with the path to the image you want to analyze:
```commandline
curl -X POST -F image=@image.jpg "https://api.ultralytics.com/inference/v1?model=MODEL_ID&key=API_KEY"
```
## Passing Arguments
This command sends a POST request to the YOLO Inference API with the specified `model` and `key` parameters in the URL, along with the image file specified by `@image.jpg`.
Here's an example of passing the `model`, `key`, and `normalize` arguments via the API URL using the `requests` library in Python:
```python
import requests
api_key = "API_KEY"
model_id = "MODEL_ID"
url = "https://api.ultralytics.com/inference/v1"
# Define your query parameters
params = {
"key": api_key,
"model": model_id,
"normalize": "True"
}
image_path = "image.jpg"
with open(image_path, "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, files=files, params=params)
print(response.text)
```
In this example, the `params` dictionary contains the query parameters `key`, `model`, and `normalize`, which tells the API to return all values in normalized image coordinates from 0 to 1. The `normalize` parameter is set to `"True"` as a string since query parameters should be passed as strings. These query parameters are then passed to the `requests.post()` function.
This will send the query parameters along with the file in the POST request. Make sure to consult the API documentation for the list of available arguments and their expected values.
## Return JSON format
The YOLO Inference API returns a JSON list with the detection results. The format of the JSON list will be the same as the one produced locally by the `results[0].tojson()` command.
The JSON list contains information about the detected objects, their coordinates, classes, and confidence scores.
### Detect Model Format
YOLO detection models, such as `yolov8n.pt`, can return JSON responses from local inference, CLI API inference, and Python API inference. All of these methods produce the same JSON response format.
!!! example "Detect Model JSON Response"
=== "Local"
```python
from ultralytics import YOLO
# Load model
model = YOLO('yolov8n.pt')
# Run inference
results = model('image.jpg')
# Print image.jpg results in JSON format
print(results[0].tojson())
```
=== "CLI API"
```commandline
curl -X POST -F image=@image.jpg https://api.ultralytics.com/inference/v1?model=MODEL_ID,key=API_KEY
```
=== "Python API"
```python
import requests
api_key = "API_KEY"
model_id = "MODEL_ID"
url = "https://api.ultralytics.com/inference/v1"
# Define your query parameters
params = {
"key": api_key,
"model": model_id,
}
image_path = "image.jpg"
with open(image_path, "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, files=files, params=params)
print(response.text)
```
=== "JSON Response"
```json
[
{
"name": "person",
"class": 0,
"confidence": 0.8359682559967041,
"box": {
"x1": 0.08974208831787109,
"y1": 0.27418340047200523,
"x2": 0.8706787109375,
"y2": 0.9887352837456598
}
},
{
"name": "person",
"class": 0,
"confidence": 0.8189555406570435,
"box": {
"x1": 0.5847355842590332,
"y1": 0.05813225640190972,
"x2": 0.8930277824401855,
"y2": 0.9903111775716146
}
},
{
"name": "tie",
"class": 27,
"confidence": 0.2909725308418274,
"box": {
"x1": 0.3433395862579346,
"y1": 0.6070465511745877,
"x2": 0.40964522361755373,
"y2": 0.9849439832899306
}
}
]
```
### Segment Model Format
YOLO segmentation models, such as `yolov8n-seg.pt`, can return JSON responses from local inference, CLI API inference, and Python API inference. All of these methods produce the same JSON response format.
!!! example "Segment Model JSON Response"
=== "Local"
```python
from ultralytics import YOLO
# Load model
model = YOLO('yolov8n-seg.pt')
# Run inference
results = model('image.jpg')
# Print image.jpg results in JSON format
print(results[0].tojson())
```
=== "CLI API"
```commandline
curl -X POST -F image=@image.jpg https://api.ultralytics.com/inference/v1?model=MODEL_ID,key=API_KEY
```
=== "Python API"
```python
import requests
api_key = "API_KEY"
model_id = "MODEL_ID"
url = "https://api.ultralytics.com/inference/v1"
# Define your query parameters
params = {
"key": api_key,
"model": model_id,
}
image_path = "image.jpg"
with open(image_path, "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, files=files, params=params)
print(response.text)
```
=== "JSON Response"
Note `segments` `x` and `y` lengths may vary from one object to another. Larger or more complex objects may have more segment points.
```json
[
{
"name": "person",
"class": 0,
"confidence": 0.856913149356842,
"box": {
"x1": 0.1064866065979004,
"y1": 0.2798851860894097,
"x2": 0.8738358497619629,
"y2": 0.9894873725043403
},
"segments": {
"x": [
0.421875,
0.4203124940395355,
0.41718751192092896
...
],
"y": [
0.2888889014720917,
0.2916666567325592,
0.2916666567325592
...
]
}
},
{
"name": "person",
"class": 0,
"confidence": 0.8512625694274902,
"box": {
"x1": 0.5757311820983887,
"y1": 0.053943040635850696,
"x2": 0.8960096359252929,
"y2": 0.985154045952691
},
"segments": {
"x": [
0.7515624761581421,
0.75,
0.7437499761581421
...
],
"y": [
0.0555555559694767,
0.05833333358168602,
0.05833333358168602
...
]
}
},
{
"name": "tie",
"class": 27,
"confidence": 0.6485961675643921,
"box": {
"x1": 0.33911995887756347,
"y1": 0.6057066175672743,
"x2": 0.4081430912017822,
"y2": 0.9916408962673611
},
"segments": {
"x": [
0.37187498807907104,
0.37031251192092896,
0.3687500059604645
...
],
"y": [
0.6111111044883728,
0.6138888597488403,
0.6138888597488403
...
]
}
}
]
```
### Pose Model Format
YOLO pose models, such as `yolov8n-pose.pt`, can return JSON responses from local inference, CLI API inference, and Python API inference. All of these methods produce the same JSON response format.
!!! example "Pose Model JSON Response"
=== "Local"
```python
from ultralytics import YOLO
# Load model
model = YOLO('yolov8n-seg.pt')
# Run inference
results = model('image.jpg')
# Print image.jpg results in JSON format
print(results[0].tojson())
```
=== "CLI API"
```commandline
curl -X POST -F image=@image.jpg https://api.ultralytics.com/inference/v1?model=MODEL_ID,key=API_KEY
```
=== "Python API"
```python
import requests
api_key = "API_KEY"
model_id = "MODEL_ID"
url = "https://api.ultralytics.com/inference/v1"
# Define your query parameters
params = {
"key": api_key,
"model": model_id,
}
image_path = "image.jpg"
with open(image_path, "rb") as image_file:
files = {"image": image_file}
response = requests.post(url, files=files, params=params)
print(response.text)
```
=== "JSON Response"
Note COCO-keypoints pretrained models will have 17 human keypoints. The `visible` part of the keypoints indicates whether a keypoint is visible or obscured. Obscured keypoints may be outside the image or may not be visible, i.e. a person's eyes facing away from the camera.
```json
[
{
"name": "person",
"class": 0,
"confidence": 0.8439509868621826,
"box": {
"x1": 0.1125,
"y1": 0.28194444444444444,
"x2": 0.7953125,
"y2": 0.9902777777777778
},
"keypoints": {
"x": [
0.5058594942092896,
0.5103894472122192,
0.4920862317085266
...
],
"y": [
0.48964157700538635,
0.4643048942089081,
0.4465252459049225
...
],
"visible": [
0.8726999163627625,
0.653947651386261,
0.9130823612213135
...
]
}
},
{
"name": "person",
"class": 0,
"confidence": 0.7474289536476135,
"box": {
"x1": 0.58125,
"y1": 0.0625,
"x2": 0.8859375,
"y2": 0.9888888888888889
},
"keypoints": {
"x": [
0.778544008731842,
0.7976160049438477,
0.7530890107154846
...
],
"y": [
0.27595141530036926,
0.2378823608160019,
0.23644638061523438
...
],
"visible": [
0.8900790810585022,
0.789978563785553,
0.8974530100822449
...
]
}
}
]
```

@ -0,0 +1,61 @@
---
comments: true
description: Explore Ultralytics integrations with tools for dataset management, model optimization, ML workflows automation, experiment tracking, version control, and more. Learn about our support for various model export formats for deployment.
keywords: Ultralytics integrations, Roboflow, Neural Magic, ClearML, Comet ML, DVC, Ultralytics HUB, MLFlow, Neptune, Ray Tune, TensorBoard, W&B, model export formats, PyTorch, TorchScript, ONNX, OpenVINO, TensorRT, CoreML, TF SavedModel, TF GraphDef, TF Lite, TF Edge TPU, TF.js, PaddlePaddle, NCNN
---
# Ultralytics Integrations
Welcome to the Ultralytics Integrations page! This page provides an overview of our partnerships with various tools and platforms, designed to streamline your machine learning workflows, enhance dataset management, simplify model training, and facilitate efficient deployment.
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png">
## Datasets Integrations
- [Roboflow](https://roboflow.com/?ref=ultralytics): Facilitate seamless dataset management for Ultralytics models, offering robust annotation, preprocessing, and augmentation capabilities.
## Training Integrations
- [Comet ML](https://www.comet.ml/): Enhance your model development with Ultralytics by tracking, comparing, and optimizing your machine learning experiments.
- [ClearML](https://clear.ml/): Automate your Ultralytics ML workflows, monitor experiments, and foster team collaboration.
- [DVC](https://dvc.org/): Implement version control for your Ultralytics machine learning projects, synchronizing data, code, and models effectively.
- [Ultralytics HUB](https://hub.ultralytics.com): Access and contribute to a community of pre-trained Ultralytics models.
- [MLFlow](https://mlflow.org/): Streamline the entire ML lifecycle of Ultralytics models, from experimentation and reproducibility to deployment.
- [Neptune](https://neptune.ai/): Maintain a comprehensive log of your ML experiments with Ultralytics in this metadata store designed for MLOps.
- [Ray Tune](ray-tune.md): Optimize the hyperparameters of your Ultralytics models at any scale.
- [TensorBoard](https://tensorboard.dev/): Visualize your Ultralytics ML workflows, monitor model metrics, and foster team collaboration.
- [Weights & Biases (W&B)](https://wandb.ai/site): Monitor experiments, visualize metrics, and foster reproducibility and collaboration on Ultralytics projects.
## Deployment Integrations
- [Neural Magic](https://neuralmagic.com/): Leverage Quantization Aware Training (QAT) and pruning techniques to optimize Ultralytics models for superior performance and leaner size.
### Export Formats
We also support a variety of model export formats for deployment in different environments. Here are the available formats:
| Format | `format` Argument | Model | Metadata | Arguments |
|--------------------------------------------------------------------|-------------------|---------------------------|----------|-----------------------------------------------------|
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ | - |
| [TorchScript](https://pytorch.org/docs/stable/jit.html) | `torchscript` | `yolov8n.torchscript` | ✅ | `imgsz`, `optimize` |
| [ONNX](https://onnx.ai/) | `onnx` | `yolov8n.onnx` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `opset` |
| [OpenVINO](openvino.md) | `openvino` | `yolov8n_openvino_model/` | ✅ | `imgsz`, `half` |
| [TensorRT](https://developer.nvidia.com/tensorrt) | `engine` | `yolov8n.engine` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `workspace` |
| [CoreML](https://github.com/apple/coremltools) | `coreml` | `yolov8n.mlpackage` | ✅ | `imgsz`, `half`, `int8`, `nms` |
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model) | `saved_model` | `yolov8n_saved_model/` | ✅ | `imgsz`, `keras` |
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb` | `yolov8n.pb` | ❌ | `imgsz` |
| [TF Lite](https://www.tensorflow.org/lite) | `tflite` | `yolov8n.tflite` | ✅ | `imgsz`, `half`, `int8` |
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ | `imgsz` |
| [TF.js](https://www.tensorflow.org/js) | `tfjs` | `yolov8n_web_model/` | ✅ | `imgsz` |
| [PaddlePaddle](https://github.com/PaddlePaddle) | `paddle` | `yolov8n_paddle_model/` | ✅ | `imgsz` |
| [NCNN](https://github.com/Tencent/ncnn) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half` |
Explore the links to learn more about each integration and how to get the most out of them with Ultralytics.

@ -0,0 +1,271 @@
---
comments: true
description: Discover the power of deploying your Ultralytics YOLOv8 model using OpenVINO format for up to 10x speedup vs PyTorch.
keywords: ultralytics docs, YOLOv8, export YOLOv8, YOLOv8 model deployment, exporting YOLOv8, OpenVINO, OpenVINO format
---
<img width="1024" src="https://user-images.githubusercontent.com/26833433/252345644-0cf84257-4b34-404c-b7ce-eb73dfbcaff1.png" alt="OpenVINO Ecosystem">
**Export mode** is used for exporting a YOLOv8 model to a format that can be used for deployment. In this guide, we specifically cover exporting to OpenVINO, which can provide up to 3x [CPU](https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_supported_plugins_CPU.html) speedup as well as accelerating on other Intel hardware ([iGPU](https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_supported_plugins_GPU.html), [dGPU](https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_supported_plugins_GPU.html), [VPU](https://docs.openvino.ai/2022.3/openvino_docs_OV_UG_supported_plugins_VPU.html), etc.).
OpenVINO, short for Open Visual Inference & Neural Network Optimization toolkit, is a comprehensive toolkit for optimizing and deploying AI inference models. Even though the name contains Visual, OpenVINO also supports various additional tasks including language, audio, time series, etc.
## Usage Examples
Export a YOLOv8n model to OpenVINO format and run inference with the exported model.
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a YOLOv8n PyTorch model
model = YOLO('yolov8n.pt')
# Export the model
model.export(format='openvino') # creates 'yolov8n_openvino_model/'
# Load the exported OpenVINO model
ov_model = YOLO('yolov8n_openvino_model/')
# Run inference
results = ov_model('https://ultralytics.com/images/bus.jpg')
```
=== "CLI"
```bash
# Export a YOLOv8n PyTorch model to OpenVINO format
yolo export model=yolov8n.pt format=openvino # creates 'yolov8n_openvino_model/'
# Run inference with the exported model
yolo predict model=yolov8n_openvino_model source='https://ultralytics.com/images/bus.jpg'
```
## Arguments
| Key | Value | Description |
|----------|--------------|------------------------------------------------------|
| `format` | `'openvino'` | format to export to |
| `imgsz` | `640` | image size as scalar or (h, w) list, i.e. (640, 480) |
| `half` | `False` | FP16 quantization |
## Benefits of OpenVINO
1. **Performance**: OpenVINO delivers high-performance inference by utilizing the power of Intel CPUs, integrated and discrete GPUs, and FPGAs.
2. **Support for Heterogeneous Execution**: OpenVINO provides an API to write once and deploy on any supported Intel hardware (CPU, GPU, FPGA, VPU, etc.).
3. **Model Optimizer**: OpenVINO provides a Model Optimizer that imports, converts, and optimizes models from popular deep learning frameworks such as PyTorch, TensorFlow, TensorFlow Lite, Keras, ONNX, PaddlePaddle, and Caffe.
4. **Ease of Use**: The toolkit comes with more than [80 tutorial notebooks](https://github.com/openvinotoolkit/openvino_notebooks) (including [YOLOv8 optimization](https://github.com/openvinotoolkit/openvino_notebooks/tree/main/notebooks/230-yolov8-optimization)) teaching different aspects of the toolkit.
## OpenVINO Export Structure
When you export a model to OpenVINO format, it results in a directory containing the following:
1. **XML file**: Describes the network topology.
2. **BIN file**: Contains the weights and biases binary data.
3. **Mapping file**: Holds mapping of original model output tensors to OpenVINO tensor names.
You can use these files to run inference with the OpenVINO Inference Engine.
## Using OpenVINO Export in Deployment
Once you have the OpenVINO files, you can use the OpenVINO Runtime to run the model. The Runtime provides a unified API to inference across all supported Intel hardware. It also provides advanced capabilities like load balancing across Intel hardware and asynchronous execution. For more information on running the inference, refer to the [Inference with OpenVINO Runtime Guide](https://docs.openvino.ai/2023.0/openvino_docs_OV_UG_OV_Runtime_User_Guide.html).
Remember, you'll need the XML and BIN files as well as any application-specific settings like input size, scale factor for normalization, etc., to correctly set up and use the model with the Runtime.
In your deployment application, you would typically do the following steps:
1. Initialize OpenVINO by creating `core = Core()`.
2. Load the model using the `core.read_model()` method.
3. Compile the model using the `core.compile_model()` function.
4. Prepare the input (image, text, audio, etc.).
5. Run inference using `compiled_model(input_data)`.
For more detailed steps and code snippets, refer to the [OpenVINO documentation](https://docs.openvino.ai/) or [API tutorial](https://github.com/openvinotoolkit/openvino_notebooks/blob/main/notebooks/002-openvino-api/002-openvino-api.ipynb).
## OpenVINO YOLOv8 Benchmarks
YOLOv8 benchmarks below were run by the Ultralytics team on 4 different model formats measuring speed and accuracy: PyTorch, TorchScript, ONNX and OpenVINO. Benchmarks were run on Intel Flex and Arc GPUs, and on Intel Xeon CPUs at FP32 precision (with the `half=False` argument).
!!! note
The benchmarking results below are for reference and might vary based on the exact hardware and software configuration of a system, as well as the current workload of the system at the time the benchmarks are run.
All benchmarks run with `openvino` python package version [2023.0.1](https://pypi.org/project/openvino/2023.0.1/).
### Intel Flex GPU
The Intel® Data Center GPU Flex Series is a versatile and robust solution designed for the intelligent visual cloud. This GPU supports a wide array of workloads including media streaming, cloud gaming, AI visual inference, and virtual desktop Infrastructure workloads. It stands out for its open architecture and built-in support for the AV1 encode, providing a standards-based software stack for high-performance, cross-architecture applications. The Flex Series GPU is optimized for density and quality, offering high reliability, availability, and scalability.
Benchmarks below run on Intel® Data Center GPU Flex 170 at FP32 precision.
<div align="center">
<img width="800" src="https://user-images.githubusercontent.com/26833433/253741543-62659bf8-1765-4d0b-b71c-8a4f9885506a.jpg">
</div>
| Model | Format | Status | Size (MB) | mAP50-95(B) | Inference time (ms/im) |
|---------|-------------|--------|-----------|-------------|------------------------|
| YOLOv8n | PyTorch | ✅ | 6.2 | 0.3709 | 21.79 |
| YOLOv8n | TorchScript | ✅ | 12.4 | 0.3704 | 23.24 |
| YOLOv8n | ONNX | ✅ | 12.2 | 0.3704 | 37.22 |
| YOLOv8n | OpenVINO | ✅ | 12.3 | 0.3703 | 3.29 |
| YOLOv8s | PyTorch | ✅ | 21.5 | 0.4471 | 31.89 |
| YOLOv8s | TorchScript | ✅ | 42.9 | 0.4472 | 32.71 |
| YOLOv8s | ONNX | ✅ | 42.8 | 0.4472 | 43.42 |
| YOLOv8s | OpenVINO | ✅ | 42.9 | 0.4470 | 3.92 |
| YOLOv8m | PyTorch | ✅ | 49.7 | 0.5013 | 50.75 |
| YOLOv8m | TorchScript | ✅ | 99.2 | 0.4999 | 47.90 |
| YOLOv8m | ONNX | ✅ | 99.0 | 0.4999 | 63.16 |
| YOLOv8m | OpenVINO | ✅ | 49.8 | 0.4997 | 7.11 |
| YOLOv8l | PyTorch | ✅ | 83.7 | 0.5293 | 77.45 |
| YOLOv8l | TorchScript | ✅ | 167.2 | 0.5268 | 85.71 |
| YOLOv8l | ONNX | ✅ | 166.8 | 0.5268 | 88.94 |
| YOLOv8l | OpenVINO | ✅ | 167.0 | 0.5264 | 9.37 |
| YOLOv8x | PyTorch | ✅ | 130.5 | 0.5404 | 100.09 |
| YOLOv8x | TorchScript | ✅ | 260.7 | 0.5371 | 114.64 |
| YOLOv8x | ONNX | ✅ | 260.4 | 0.5371 | 110.32 |
| YOLOv8x | OpenVINO | ✅ | 260.6 | 0.5367 | 15.02 |
This table represents the benchmark results for five different models (YOLOv8n, YOLOv8s, YOLOv8m, YOLOv8l, YOLOv8x) across four different formats (PyTorch, TorchScript, ONNX, OpenVINO), giving us the status, size, mAP50-95(B) metric, and inference time for each combination.
### Intel Arc GPU
Intel® Arc™ represents Intel's foray into the dedicated GPU market. The Arc™ series, designed to compete with leading GPU manufacturers like AMD and Nvidia, caters to both the laptop and desktop markets. The series includes mobile versions for compact devices like laptops, and larger, more powerful versions for desktop computers.
The Arc™ series is divided into three categories: Arc™ 3, Arc™ 5, and Arc™ 7, with each number indicating the performance level. Each category includes several models, and the 'M' in the GPU model name signifies a mobile, integrated variant.
Early reviews have praised the Arc™ series, particularly the integrated A770M GPU, for its impressive graphics performance. The availability of the Arc™ series varies by region, and additional models are expected to be released soon. Intel® Arc™ GPUs offer high-performance solutions for a range of computing needs, from gaming to content creation.
Benchmarks below run on Intel® Arc 770 GPU at FP32 precision.
<div align="center">
<img width="800" src="https://user-images.githubusercontent.com/26833433/253741545-8530388f-8fd1-44f7-a4ae-f875d59dc282.jpg">
</div>
| Model | Format | Status | Size (MB) | metrics/mAP50-95(B) | Inference time (ms/im) |
|---------|-------------|--------|-----------|---------------------|------------------------|
| YOLOv8n | PyTorch | ✅ | 6.2 | 0.3709 | 88.79 |
| YOLOv8n | TorchScript | ✅ | 12.4 | 0.3704 | 102.66 |
| YOLOv8n | ONNX | ✅ | 12.2 | 0.3704 | 57.98 |
| YOLOv8n | OpenVINO | ✅ | 12.3 | 0.3703 | 8.52 |
| YOLOv8s | PyTorch | ✅ | 21.5 | 0.4471 | 189.83 |
| YOLOv8s | TorchScript | ✅ | 42.9 | 0.4472 | 227.58 |
| YOLOv8s | ONNX | ✅ | 42.7 | 0.4472 | 142.03 |
| YOLOv8s | OpenVINO | ✅ | 42.9 | 0.4469 | 9.19 |
| YOLOv8m | PyTorch | ✅ | 49.7 | 0.5013 | 411.64 |
| YOLOv8m | TorchScript | ✅ | 99.2 | 0.4999 | 517.12 |
| YOLOv8m | ONNX | ✅ | 98.9 | 0.4999 | 298.68 |
| YOLOv8m | OpenVINO | ✅ | 99.1 | 0.4996 | 12.55 |
| YOLOv8l | PyTorch | ✅ | 83.7 | 0.5293 | 725.73 |
| YOLOv8l | TorchScript | ✅ | 167.1 | 0.5268 | 892.83 |
| YOLOv8l | ONNX | ✅ | 166.8 | 0.5268 | 576.11 |
| YOLOv8l | OpenVINO | ✅ | 167.0 | 0.5262 | 17.62 |
| YOLOv8x | PyTorch | ✅ | 130.5 | 0.5404 | 988.92 |
| YOLOv8x | TorchScript | ✅ | 260.7 | 0.5371 | 1186.42 |
| YOLOv8x | ONNX | ✅ | 260.4 | 0.5371 | 768.90 |
| YOLOv8x | OpenVINO | ✅ | 260.6 | 0.5367 | 19 |
### Intel Xeon CPU
The Intel® Xeon® CPU is a high-performance, server-grade processor designed for complex and demanding workloads. From high-end cloud computing and virtualization to artificial intelligence and machine learning applications, Xeon® CPUs provide the power, reliability, and flexibility required for today's data centers.
Notably, Xeon® CPUs deliver high compute density and scalability, making them ideal for both small businesses and large enterprises. By choosing Intel® Xeon® CPUs, organizations can confidently handle their most demanding computing tasks and foster innovation while maintaining cost-effectiveness and operational efficiency.
Benchmarks below run on 4th Gen Intel® Xeon® Scalable CPU at FP32 precision.
<div align="center">
<img width="800" src="https://user-images.githubusercontent.com/26833433/253741546-dcd8e52a-fc38-424f-b87e-c8365b6f28dc.jpg">
</div>
| Model | Format | Status | Size (MB) | metrics/mAP50-95(B) | Inference time (ms/im) |
|---------|-------------|--------|-----------|---------------------|------------------------|
| YOLOv8n | PyTorch | ✅ | 6.2 | 0.3709 | 24.36 |
| YOLOv8n | TorchScript | ✅ | 12.4 | 0.3704 | 23.93 |
| YOLOv8n | ONNX | ✅ | 12.2 | 0.3704 | 39.86 |
| YOLOv8n | OpenVINO | ✅ | 12.3 | 0.3704 | 11.34 |
| YOLOv8s | PyTorch | ✅ | 21.5 | 0.4471 | 33.77 |
| YOLOv8s | TorchScript | ✅ | 42.9 | 0.4472 | 34.84 |
| YOLOv8s | ONNX | ✅ | 42.8 | 0.4472 | 43.23 |
| YOLOv8s | OpenVINO | ✅ | 42.9 | 0.4471 | 13.86 |
| YOLOv8m | PyTorch | ✅ | 49.7 | 0.5013 | 53.91 |
| YOLOv8m | TorchScript | ✅ | 99.2 | 0.4999 | 53.51 |
| YOLOv8m | ONNX | ✅ | 99.0 | 0.4999 | 64.16 |
| YOLOv8m | OpenVINO | ✅ | 99.1 | 0.4996 | 28.79 |
| YOLOv8l | PyTorch | ✅ | 83.7 | 0.5293 | 75.78 |
| YOLOv8l | TorchScript | ✅ | 167.2 | 0.5268 | 79.13 |
| YOLOv8l | ONNX | ✅ | 166.8 | 0.5268 | 88.45 |
| YOLOv8l | OpenVINO | ✅ | 167.0 | 0.5263 | 56.23 |
| YOLOv8x | PyTorch | ✅ | 130.5 | 0.5404 | 96.60 |
| YOLOv8x | TorchScript | ✅ | 260.7 | 0.5371 | 114.28 |
| YOLOv8x | ONNX | ✅ | 260.4 | 0.5371 | 111.02 |
| YOLOv8x | OpenVINO | ✅ | 260.6 | 0.5371 | 83.28 |
### Intel Core CPU
The Intel® Core® series is a range of high-performance processors by Intel. The lineup includes Core i3 (entry-level), Core i5 (mid-range), Core i7 (high-end), and Core i9 (extreme performance). Each series caters to different computing needs and budgets, from everyday tasks to demanding professional workloads. With each new generation, improvements are made to performance, energy efficiency, and features.
Benchmarks below run on 13th Gen Intel® Core® i7-13700H CPU at FP32 precision.
<div align="center">
<img width="800" src="https://user-images.githubusercontent.com/26833433/254559985-727bfa43-93fa-4fec-a417-800f869f3f9e.jpg">
</div>
| Model | Format | Status | Size (MB) | metrics/mAP50-95(B) | Inference time (ms/im) |
|---------|-------------|--------|-----------|---------------------|------------------------|
| YOLOv8n | PyTorch | ✅ | 6.2 | 0.4478 | 104.61 |
| YOLOv8n | TorchScript | ✅ | 12.4 | 0.4525 | 112.39 |
| YOLOv8n | ONNX | ✅ | 12.2 | 0.4525 | 28.02 |
| YOLOv8n | OpenVINO | ✅ | 12.3 | 0.4504 | 23.53 |
| YOLOv8s | PyTorch | ✅ | 21.5 | 0.5885 | 194.83 |
| YOLOv8s | TorchScript | ✅ | 43.0 | 0.5962 | 202.01 |
| YOLOv8s | ONNX | ✅ | 42.8 | 0.5962 | 65.74 |
| YOLOv8s | OpenVINO | ✅ | 42.9 | 0.5966 | 38.66 |
| YOLOv8m | PyTorch | ✅ | 49.7 | 0.6101 | 355.23 |
| YOLOv8m | TorchScript | ✅ | 99.2 | 0.6120 | 424.78 |
| YOLOv8m | ONNX | ✅ | 99.0 | 0.6120 | 173.39 |
| YOLOv8m | OpenVINO | ✅ | 99.1 | 0.6091 | 69.80 |
| YOLOv8l | PyTorch | ✅ | 83.7 | 0.6591 | 593.00 |
| YOLOv8l | TorchScript | ✅ | 167.2 | 0.6580 | 697.54 |
| YOLOv8l | ONNX | ✅ | 166.8 | 0.6580 | 342.15 |
| YOLOv8l | OpenVINO | ✅ | 167.0 | 0.0708 | 117.69 |
| YOLOv8x | PyTorch | ✅ | 130.5 | 0.6651 | 804.65 |
| YOLOv8x | TorchScript | ✅ | 260.8 | 0.6650 | 921.46 |
| YOLOv8x | ONNX | ✅ | 260.4 | 0.6650 | 526.66 |
| YOLOv8x | OpenVINO | ✅ | 260.6 | 0.6619 | 158.73 |
## Reproduce Our Results
To reproduce the Ultralytics benchmarks above on all export [formats](../modes/export.md) run this code:
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a YOLOv8n PyTorch model
model = YOLO('yolov8n.pt')
# Benchmark YOLOv8n speed and accuracy on the COCO128 dataset for all all export formats
results= model.benchmarks(data='coco128.yaml')
```
=== "CLI"
```bash
# Benchmark YOLOv8n speed and accuracy on the COCO128 dataset for all all export formats
yolo benchmark model=yolov8n.pt data=coco128.yaml
```
Note that benchmarking results might vary based on the exact hardware and software configuration of a system, as well as the current workload of the system at the time the benchmarks are run. For the most reliable results use a dataset with a large number of images, i.e. `data='coco128.yaml' (128 val images), or `data='coco.yaml'` (5000 val images).
## Conclusion
The benchmarking results clearly demonstrate the benefits of exporting the YOLOv8 model to the OpenVINO format. Across different models and hardware platforms, the OpenVINO format consistently outperforms other formats in terms of inference speed while maintaining comparable accuracy.
For the Intel® Data Center GPU Flex Series, the OpenVINO format was able to deliver inference speeds almost 10 times faster than the original PyTorch format. On the Xeon CPU, the OpenVINO format was twice as fast as the PyTorch format. The accuracy of the models remained nearly identical across the different formats.
The benchmarks underline the effectiveness of OpenVINO as a tool for deploying deep learning models. By converting models to the OpenVINO format, developers can achieve significant performance improvements, making it easier to deploy these models in real-world applications.
For more detailed information and instructions on using OpenVINO, refer to the [official OpenVINO documentation](https://docs.openvinotoolkit.org/latest/index.html).

@ -0,0 +1,174 @@
---
comments: true
description: Discover how to streamline hyperparameter tuning for YOLOv8 models with Ray Tune. Learn to accelerate tuning, integrate with Weights & Biases, and analyze results.
keywords: Ultralytics, YOLOv8, Ray Tune, hyperparameter tuning, machine learning optimization, Weights & Biases integration, result analysis
---
# Efficient Hyperparameter Tuning with Ray Tune and YOLOv8
Hyperparameter tuning is vital in achieving peak model performance by discovering the optimal set of hyperparameters. This involves running trials with different hyperparameters and evaluating each trial’s performance.
## Accelerate Tuning with Ultralytics YOLOv8 and Ray Tune
[Ultralytics YOLOv8](https://ultralytics.com) incorporates Ray Tune for hyperparameter tuning, streamlining the optimization of YOLOv8 model hyperparameters. With Ray Tune, you can utilize advanced search strategies, parallelism, and early stopping to expedite the tuning process.
### Ray Tune
<p align="center">
<img width="640" src="https://docs.ray.io/en/latest/_images/tune_overview.png" alt="Ray Tune Overview">
</p>
[Ray Tune](https://docs.ray.io/en/latest/tune/index.html) is a hyperparameter tuning library designed for efficiency and flexibility. It supports various search strategies, parallelism, and early stopping strategies, and seamlessly integrates with popular machine learning frameworks, including Ultralytics YOLOv8.
### Integration with Weights & Biases
YOLOv8 also allows optional integration with [Weights & Biases](https://wandb.ai/site) for monitoring the tuning process.
## Installation
To install the required packages, run:
!!! tip "Installation"
```bash
# Install and update Ultralytics and Ray Tune packages
pip install -U ultralytics "ray[tune]"
# Optionally install W&B for logging
pip install wandb
```
## Usage
!!! example "Usage"
```python
from ultralytics import YOLO
# Load a YOLOv8n model
model = YOLO("yolov8n.pt")
# Start tuning hyperparameters for YOLOv8n training on the COCO128 dataset
result_grid = model.tune(data="coco128.yaml")
```
## `tune()` Method Parameters
The `tune()` method in YOLOv8 provides an easy-to-use interface for hyperparameter tuning with Ray Tune. It accepts several arguments that allow you to customize the tuning process. Below is a detailed explanation of each parameter:
| Parameter | Type | Description | Default Value |
|-----------------|------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------|
| `data` | `str` | The dataset configuration file (in YAML format) to run the tuner on. This file should specify the training and validation data paths, as well as other dataset-specific settings. | |
| `space` | `dict, optional` | A dictionary defining the hyperparameter search space for Ray Tune. Each key corresponds to a hyperparameter name, and the value specifies the range of values to explore during tuning. If not provided, YOLOv8 uses a default search space with various hyperparameters. | |
| `grace_period` | `int, optional` | The grace period in epochs for the [ASHA scheduler](https://docs.ray.io/en/latest/tune/api/schedulers.html) in Ray Tune. The scheduler will not terminate any trial before this number of epochs, allowing the model to have some minimum training before making a decision on early stopping. | 10 |
| `gpu_per_trial` | `int, optional` | The number of GPUs to allocate per trial during tuning. This helps manage GPU usage, particularly in multi-GPU environments. If not provided, the tuner will use all available GPUs. | None |
| `max_samples` | `int, optional` | The maximum number of trials to run during tuning. This parameter helps control the total number of hyperparameter combinations tested, ensuring the tuning process does not run indefinitely. | 10 |
| `**train_args` | `dict, optional` | Additional arguments to pass to the `train()` method during tuning. These arguments can include settings like the number of training epochs, batch size, and other training-specific configurations. | {} |
By customizing these parameters, you can fine-tune the hyperparameter optimization process to suit your specific needs and available computational resources.
## Default Search Space Description
The following table lists the default search space parameters for hyperparameter tuning in YOLOv8 with Ray Tune. Each parameter has a specific value range defined by `tune.uniform()`.
| Parameter | Value Range | Description |
|-------------------|----------------------------|------------------------------------------|
| `lr0` | `tune.uniform(1e-5, 1e-1)` | Initial learning rate |
| `lrf` | `tune.uniform(0.01, 1.0)` | Final learning rate factor |
| `momentum` | `tune.uniform(0.6, 0.98)` | Momentum |
| `weight_decay` | `tune.uniform(0.0, 0.001)` | Weight decay |
| `warmup_epochs` | `tune.uniform(0.0, 5.0)` | Warmup epochs |
| `warmup_momentum` | `tune.uniform(0.0, 0.95)` | Warmup momentum |
| `box` | `tune.uniform(0.02, 0.2)` | Box loss weight |
| `cls` | `tune.uniform(0.2, 4.0)` | Class loss weight |
| `hsv_h` | `tune.uniform(0.0, 0.1)` | Hue augmentation range |
| `hsv_s` | `tune.uniform(0.0, 0.9)` | Saturation augmentation range |
| `hsv_v` | `tune.uniform(0.0, 0.9)` | Value (brightness) augmentation range |
| `degrees` | `tune.uniform(0.0, 45.0)` | Rotation augmentation range (degrees) |
| `translate` | `tune.uniform(0.0, 0.9)` | Translation augmentation range |
| `scale` | `tune.uniform(0.0, 0.9)` | Scaling augmentation range |
| `shear` | `tune.uniform(0.0, 10.0)` | Shear augmentation range (degrees) |
| `perspective` | `tune.uniform(0.0, 0.001)` | Perspective augmentation range |
| `flipud` | `tune.uniform(0.0, 1.0)` | Vertical flip augmentation probability |
| `fliplr` | `tune.uniform(0.0, 1.0)` | Horizontal flip augmentation probability |
| `mosaic` | `tune.uniform(0.0, 1.0)` | Mosaic augmentation probability |
| `mixup` | `tune.uniform(0.0, 1.0)` | Mixup augmentation probability |
| `copy_paste` | `tune.uniform(0.0, 1.0)` | Copy-paste augmentation probability |
## Custom Search Space Example
In this example, we demonstrate how to use a custom search space for hyperparameter tuning with Ray Tune and YOLOv8. By providing a custom search space, you can focus the tuning process on specific hyperparameters of interest.
!!! example "Usage"
```python
from ultralytics import YOLO
# Define a YOLO model
model = YOLO("yolov8n.pt")
# Run Ray Tune on the model
result_grid = model.tune(data="coco128.yaml",
space={"lr0": tune.uniform(1e-5, 1e-1)},
epochs=50)
```
In the code snippet above, we create a YOLO model with the "yolov8n.pt" pretrained weights. Then, we call the `tune()` method, specifying the dataset configuration with "coco128.yaml". We provide a custom search space for the initial learning rate `lr0` using a dictionary with the key "lr0" and the value `tune.uniform(1e-5, 1e-1)`. Finally, we pass additional training arguments, such as the number of epochs directly to the tune method as `epochs=50`.
# Processing Ray Tune Results
After running a hyperparameter tuning experiment with Ray Tune, you might want to perform various analyses on the obtained results. This guide will take you through common workflows for processing and analyzing these results.
## Loading Tune Experiment Results from a Directory
After running the tuning experiment with `tuner.fit()`, you can load the results from a directory. This is useful, especially if you're performing the analysis after the initial training script has exited.
```python
experiment_path = f"{storage_path}/{exp_name}"
print(f"Loading results from {experiment_path}...")
restored_tuner = tune.Tuner.restore(experiment_path, trainable=train_mnist)
result_grid = restored_tuner.get_results()
```
## Basic Experiment-Level Analysis
Get an overview of how trials performed. You can quickly check if there were any errors during the trials.
```python
if result_grid.errors:
print("One or more trials failed!")
else:
print("No errors!")
```
## Basic Trial-Level Analysis
Access individual trial hyperparameter configurations and the last reported metrics.
```python
for i, result in enumerate(result_grid):
print(f"Trial #{i}: Configuration: {result.config}, Last Reported Metrics: {result.metrics}")
```
## Plotting the Entire History of Reported Metrics for a Trial
You can plot the history of reported metrics for each trial to see how the metrics evolved over time.
```python
import matplotlib.pyplot as plt
for result in result_grid:
plt.plot(result.metrics_dataframe["training_iteration"], result.metrics_dataframe["mean_accuracy"], label=f"Trial {i}")
plt.xlabel('Training Iterations')
plt.ylabel('Mean Accuracy')
plt.legend()
plt.show()
```
## Summary
In this documentation, we covered common workflows to analyze the results of experiments run with Ray Tune using Ultralytics. The key steps include loading the experiment results from a directory, performing basic experiment-level and trial-level analysis and plotting metrics.
Explore further by looking into Ray Tune’s [Analyze Results](https://docs.ray.io/en/latest/tune/examples/tune_analyze_results.html) docs page to get the most out of your hyperparameter tuning experiments.

@ -0,0 +1,186 @@
---
comments: true
description: Explore FastSAM, a CNN-based solution for real-time object segmentation in images. Enhanced user interaction, computational efficiency and adaptable across vision tasks.
keywords: FastSAM, machine learning, CNN-based solution, object segmentation, real-time solution, Ultralytics, vision tasks, image processing, industrial applications, user interaction
---
# Fast Segment Anything Model (FastSAM)
The Fast Segment Anything Model (FastSAM) is a novel, real-time CNN-based solution for the Segment Anything task. This task is designed to segment any object within an image based on various possible user interaction prompts. FastSAM significantly reduces computational demands while maintaining competitive performance, making it a practical choice for a variety of vision tasks.
![Fast Segment Anything Model (FastSAM) architecture overview](https://user-images.githubusercontent.com/26833433/248551984-d98f0f6d-7535-45d0-b380-2e1440b52ad7.jpg)
## Overview
FastSAM is designed to address the limitations of the [Segment Anything Model (SAM)](sam.md), a heavy Transformer model with substantial computational resource requirements. The FastSAM decouples the segment anything task into two sequential stages: all-instance segmentation and prompt-guided selection. The first stage uses [YOLOv8-seg](../tasks/segment.md) to produce the segmentation masks of all instances in the image. In the second stage, it outputs the region-of-interest corresponding to the prompt.
## Key Features
1. **Real-time Solution:** By leveraging the computational efficiency of CNNs, FastSAM provides a real-time solution for the segment anything task, making it valuable for industrial applications that require quick results.
2. **Efficiency and Performance:** FastSAM offers a significant reduction in computational and resource demands without compromising on performance quality. It achieves comparable performance to SAM but with drastically reduced computational resources, enabling real-time application.
3. **Prompt-guided Segmentation:** FastSAM can segment any object within an image guided by various possible user interaction prompts, providing flexibility and adaptability in different scenarios.
4. **Based on YOLOv8-seg:** FastSAM is based on [YOLOv8-seg](../tasks/segment.md), an object detector equipped with an instance segmentation branch. This allows it to effectively produce the segmentation masks of all instances in an image.
5. **Competitive Results on Benchmarks:** On the object proposal task on MS COCO, FastSAM achieves high scores at a significantly faster speed than [SAM](sam.md) on a single NVIDIA RTX 3090, demonstrating its efficiency and capability.
6. **Practical Applications:** The proposed approach provides a new, practical solution for a large number of vision tasks at a really high speed, tens or hundreds of times faster than current methods.
7. **Model Compression Feasibility:** FastSAM demonstrates the feasibility of a path that can significantly reduce the computational effort by introducing an artificial prior to the structure, thus opening new possibilities for large model architecture for general vision tasks.
## Usage
### Python API
The FastSAM models are easy to integrate into your Python applications. Ultralytics provides a user-friendly Python API to streamline the process.
#### Predict Usage
To perform object detection on an image, use the `predict` method as shown below:
!!! example ""
=== "Python"
```python
from ultralytics import FastSAM
from ultralytics.models.fastsam import FastSAMPrompt
# Define an inference source
source = 'path/to/bus.jpg'
# Create a FastSAM model
model = FastSAM('FastSAM-s.pt') # or FastSAM-x.pt
# Run inference on an image
everything_results = model(source, device='cpu', retina_masks=True, imgsz=1024, conf=0.4, iou=0.9)
# Prepare a Prompt Process object
prompt_process = FastSAMPrompt(source, everything_results, device='cpu')
# Everything prompt
ann = prompt_process.everything_prompt()
# Bbox default shape [0,0,0,0] -> [x1,y1,x2,y2]
ann = prompt_process.box_prompt(bbox=[200, 200, 300, 300])
# Text prompt
ann = prompt_process.text_prompt(text='a photo of a dog')
# Point prompt
# points default [[0,0]] [[x1,y1],[x2,y2]]
# point_label default [0] [1,0] 0:background, 1:foreground
ann = prompt_process.point_prompt(points=[[200, 200]], pointlabel=[1])
prompt_process.plot(annotations=ann, output='./')
```
=== "CLI"
```bash
# Load a FastSAM model and segment everything with it
yolo segment predict model=FastSAM-s.pt source=path/to/bus.jpg imgsz=640
```
This snippet demonstrates the simplicity of loading a pre-trained model and running a prediction on an image.
#### Val Usage
Validation of the model on a dataset can be done as follows:
!!! example ""
=== "Python"
```python
from ultralytics import FastSAM
# Create a FastSAM model
model = FastSAM('FastSAM-s.pt') # or FastSAM-x.pt
# Validate the model
results = model.val(data='coco8-seg.yaml')
```
=== "CLI"
```bash
# Load a FastSAM model and validate it on the COCO8 example dataset at image size 640
yolo segment val model=FastSAM-s.pt data=coco8.yaml imgsz=640
```
Please note that FastSAM only supports detection and segmentation of a single class of object. This means it will recognize and segment all objects as the same class. Therefore, when preparing the dataset, you need to convert all object category IDs to 0.
### FastSAM official Usage
FastSAM is also available directly from the [https://github.com/CASIA-IVA-Lab/FastSAM](https://github.com/CASIA-IVA-Lab/FastSAM) repository. Here is a brief overview of the typical steps you might take to use FastSAM:
#### Installation
1. Clone the FastSAM repository:
```shell
git clone https://github.com/CASIA-IVA-Lab/FastSAM.git
```
2. Create and activate a Conda environment with Python 3.9:
```shell
conda create -n FastSAM python=3.9
conda activate FastSAM
```
3. Navigate to the cloned repository and install the required packages:
```shell
cd FastSAM
pip install -r requirements.txt
```
4. Install the CLIP model:
```shell
pip install git+https://github.com/openai/CLIP.git
```
#### Example Usage
1. Download a [model checkpoint](https://drive.google.com/file/d/1m1sjY4ihXBU1fZXdQ-Xdj-mDltW-2Rqv/view?usp=sharing).
2. Use FastSAM for inference. Example commands:
- Segment everything in an image:
```shell
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg
```
- Segment specific objects using text prompt:
```shell
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --text_prompt "the yellow dog"
```
- Segment objects within a bounding box (provide box coordinates in xywh format):
```shell
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --box_prompt "[570,200,230,400]"
```
- Segment objects near specific points:
```shell
python Inference.py --model_path ./weights/FastSAM.pt --img_path ./images/dogs.jpg --point_prompt "[[520,360],[620,300]]" --point_label "[1,0]"
```
Additionally, you can try FastSAM through a [Colab demo](https://colab.research.google.com/drive/1oX14f6IneGGw612WgVlAiy91UHwFAvr9?usp=sharing) or on the [HuggingFace web demo](https://huggingface.co/spaces/An-619/FastSAM) for a visual experience.
## Citations and Acknowledgements
We would like to acknowledge the FastSAM authors for their significant contributions in the field of real-time instance segmentation:
!!! note ""
=== "BibTeX"
```bibtex
@misc{zhao2023fast,
title={Fast Segment Anything},
author={Xu Zhao and Wenchao Ding and Yongqi An and Yinglong Du and Tao Yu and Min Li and Ming Tang and Jinqiao Wang},
year={2023},
eprint={2306.12156},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
The original FastSAM paper can be found on [arXiv](https://arxiv.org/abs/2306.12156). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/CASIA-IVA-Lab/FastSAM). We appreciate their efforts in advancing the field and making their work accessible to the broader community.

@ -1,5 +1,7 @@
---
comments: true
description: Learn about the YOLO family, SAM, MobileSAM, FastSAM, YOLO-NAS, and RT-DETR models supported by Ultralytics, with examples on how to use them via CLI and Python.
keywords: Ultralytics, documentation, YOLO, SAM, MobileSAM, FastSAM, YOLO-NAS, RT-DETR, models, architectures, Python, CLI
---
# Models
@ -8,28 +10,56 @@ Ultralytics supports many models and architectures with more to come in the futu
In this documentation, we provide information on four major models:
1. [YOLOv3](./yolov3.md): The third iteration of the YOLO model family, known for its efficient real-time object detection capabilities.
2. [YOLOv5](./yolov5.md): An improved version of the YOLO architecture, offering better performance and speed tradeoffs compared to previous versions.
3. [YOLOv8](./yolov8.md): The latest version of the YOLO family, featuring enhanced capabilities such as instance segmentation, pose/keypoints estimation, and classification.
4. [Segment Anything Model (SAM)](./sam.md): Meta's Segment Anything Model (SAM).
1. [YOLOv3](./yolov3.md): The third iteration of the YOLO model family originally by Joseph Redmon, known for its efficient real-time object detection capabilities.
2. [YOLOv4](./yolov3.md): A darknet-native update to YOLOv3 released by Alexey Bochkovskiy in 2020.
3. [YOLOv5](./yolov5.md): An improved version of the YOLO architecture by Ultralytics, offering better performance and speed tradeoffs compared to previous versions.
4. [YOLOv6](./yolov6.md): Released by [Meituan](https://about.meituan.com/) in 2022 and is in use in many of the company's autonomous delivery robots.
5. [YOLOv7](./yolov7.md): Updated YOLO models released in 2022 by the authors of YOLOv4.
6. [YOLOv8](./yolov8.md): The latest version of the YOLO family, featuring enhanced capabilities such as instance segmentation, pose/keypoints estimation, and classification.
7. [Segment Anything Model (SAM)](./sam.md): Meta's Segment Anything Model (SAM).
8. [Mobile Segment Anything Model (MobileSAM)](./mobile-sam.md): MobileSAM for mobile applications by Kyung Hee University.
9. [Fast Segment Anything Model (FastSAM)](./fast-sam.md): FastSAM by Image & Video Analysis Group, Institute of Automation, Chinese Academy of Sciences.
10. [YOLO-NAS](./yolo-nas.md): YOLO Neural Architecture Search (NAS) Models.
11. [Realtime Detection Transformers (RT-DETR)](./rtdetr.md): Baidu's PaddlePaddle Realtime Detection Transformer (RT-DETR) models.
You can use these models directly in the Command Line Interface (CLI) or in a Python environment. Below are examples of how to use the models with CLI and Python:
You can use many of these models directly in the Command Line Interface (CLI) or in a Python environment. Below are examples of how to use the models with CLI and Python:
## CLI Example
## Usage
```bash
yolo task=detect mode=train model=yolov8n.yaml data=coco128.yaml epochs=100
```
This example provides simple inference code for YOLO, SAM and RTDETR models. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using models with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
## Python Example
!!! example ""
```python
from ultralytics import YOLO
=== "Python"
model = YOLO("model.yaml") # build a YOLOv8n model from scratch
# YOLO("model.pt") use pre-trained model if available
model.info() # display model information
model.train(data="coco128.yaml", epochs=100) # train the model
```
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()`, `SAM()`, `NAS()` and `RTDETR()` classes to create a model instance in python:
For more details on each model, their supported tasks, modes, and performance, please visit their respective documentation pages linked above.
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
# Run inference with the YOLOv8n model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
=== "CLI"
CLI commands are available to directly run the models:
```bash
# Load a COCO-pretrained YOLOv8n model and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640
# Load a COCO-pretrained YOLOv8n model and run inference on the 'bus.jpg' image
yolo predict model=yolov8n.pt source=path/to/bus.jpg
```
For more details on each model, their supported tasks, modes, and performance, please visit their respective documentation pages linked above.

@ -0,0 +1,109 @@
---
comments: true
description: Learn more about MobileSAM, its implementation, comparison with the original SAM, and how to download and test it in the Ultralytics framework. Improve your mobile applications today.
keywords: MobileSAM, Ultralytics, SAM, mobile applications, Arxiv, GPU, API, image encoder, mask decoder, model download, testing method
---
![MobileSAM Logo](https://github.com/ChaoningZhang/MobileSAM/blob/master/assets/logo2.png?raw=true)
# Mobile Segment Anything (MobileSAM)
The MobileSAM paper is now available on [arXiv](https://arxiv.org/pdf/2306.14289.pdf).
A demonstration of MobileSAM running on a CPU can be accessed at this [demo link](https://huggingface.co/spaces/dhkim2810/MobileSAM). The performance on a Mac i5 CPU takes approximately 3 seconds. On the Hugging Face demo, the interface and lower-performance CPUs contribute to a slower response, but it continues to function effectively.
MobileSAM is implemented in various projects including [Grounding-SAM](https://github.com/IDEA-Research/Grounded-Segment-Anything), [AnyLabeling](https://github.com/vietanhdev/anylabeling), and [Segment Anything in 3D](https://github.com/Jumpat/SegmentAnythingin3D).
MobileSAM is trained on a single GPU with a 100k dataset (1% of the original images) in less than a day. The code for this training will be made available in the future.
## Adapting from SAM to MobileSAM
Since MobileSAM retains the same pipeline as the original SAM, we have incorporated the original's pre-processing, post-processing, and all other interfaces. Consequently, those currently using the original SAM can transition to MobileSAM with minimal effort.
MobileSAM performs comparably to the original SAM and retains the same pipeline except for a change in the image encoder. Specifically, we replace the original heavyweight ViT-H encoder (632M) with a smaller Tiny-ViT (5M). On a single GPU, MobileSAM operates at about 12ms per image: 8ms on the image encoder and 4ms on the mask decoder.
The following table provides a comparison of ViT-based image encoders:
| Image Encoder | Original SAM | MobileSAM |
|---------------|--------------|-----------|
| Parameters | 611M | 5M |
| Speed | 452ms | 8ms |
Both the original SAM and MobileSAM utilize the same prompt-guided mask decoder:
| Mask Decoder | Original SAM | MobileSAM |
|--------------|--------------|-----------|
| Parameters | 3.876M | 3.876M |
| Speed | 4ms | 4ms |
Here is the comparison of the whole pipeline:
| Whole Pipeline (Enc+Dec) | Original SAM | MobileSAM |
|--------------------------|--------------|-----------|
| Parameters | 615M | 9.66M |
| Speed | 456ms | 12ms |
The performance of MobileSAM and the original SAM are demonstrated using both a point and a box as prompts.
![Image with Point as Prompt](https://raw.githubusercontent.com/ChaoningZhang/MobileSAM/master/assets/mask_box.jpg?raw=true)
![Image with Box as Prompt](https://raw.githubusercontent.com/ChaoningZhang/MobileSAM/master/assets/mask_box.jpg?raw=true)
With its superior performance, MobileSAM is approximately 5 times smaller and 7 times faster than the current FastSAM. More details are available at the [MobileSAM project page](https://github.com/ChaoningZhang/MobileSAM).
## Testing MobileSAM in Ultralytics
Just like the original SAM, we offer a straightforward testing method in Ultralytics, including modes for both Point and Box prompts.
### Model Download
You can download the model [here](https://github.com/ChaoningZhang/MobileSAM/blob/master/weights/mobile_sam.pt).
### Point Prompt
!!! example ""
=== "Python"
```python
from ultralytics import SAM
# Load the model
model = SAM('mobile_sam.pt')
# Predict a segment based on a point prompt
model.predict('ultralytics/assets/zidane.jpg', points=[900, 370], labels=[1])
```
### Box Prompt
!!! example ""
=== "Python"
```python
from ultralytics import SAM
# Load the model
model = SAM('mobile_sam.pt')
# Predict a segment based on a box prompt
model.predict('ultralytics/assets/zidane.jpg', bboxes=[439, 437, 524, 709])
```
We have implemented `MobileSAM` and `SAM` using the same API. For more usage information, please see the [SAM page](./sam.md).
## Citations and Acknowledgements
If you find MobileSAM useful in your research or development work, please consider citing our paper:
!!! note ""
=== "BibTeX"
```bibtex
@article{mobile_sam,
title={Faster Segment Anything: Towards Lightweight SAM for Mobile Applications},
author={Zhang, Chaoning and Han, Dongshen and Qiao, Yu and Kim, Jung Uk and Bae, Sung Ho and Lee, Seungkyu and Hong, Choong Seon},
journal={arXiv preprint arXiv:2306.14289},
year={2023}
}
```

@ -0,0 +1,101 @@
---
comments: true
description: Discover the features and benefits of RT-DETR, Baidu’s efficient and adaptable real-time object detector powered by Vision Transformers, including pre-trained models.
keywords: RT-DETR, Baidu, Vision Transformers, object detection, real-time performance, CUDA, TensorRT, IoU-aware query selection, Ultralytics, Python API, PaddlePaddle
---
# Baidu's RT-DETR: A Vision Transformer-Based Real-Time Object Detector
## Overview
Real-Time Detection Transformer (RT-DETR), developed by Baidu, is a cutting-edge end-to-end object detector that provides real-time performance while maintaining high accuracy. It leverages the power of Vision Transformers (ViT) to efficiently process multiscale features by decoupling intra-scale interaction and cross-scale fusion. RT-DETR is highly adaptable, supporting flexible adjustment of inference speed using different decoder layers without retraining. The model excels on accelerated backends like CUDA with TensorRT, outperforming many other real-time object detectors.
![Model example image](https://user-images.githubusercontent.com/26833433/238963168-90e8483f-90aa-4eb6-a5e1-0d408b23dd33.png)
**Overview of Baidu's RT-DETR.** The RT-DETR model architecture diagram shows the last three stages of the backbone {S3, S4, S5} as the input to the encoder. The efficient hybrid encoder transforms multiscale features into a sequence of image features through intrascale feature interaction (AIFI) and cross-scale feature-fusion module (CCFM). The IoU-aware query selection is employed to select a fixed number of image features to serve as initial object queries for the decoder. Finally, the decoder with auxiliary prediction heads iteratively optimizes object queries to generate boxes and confidence scores ([source](https://arxiv.org/pdf/2304.08069.pdf)).
### Key Features
- **Efficient Hybrid Encoder:** Baidu's RT-DETR uses an efficient hybrid encoder that processes multiscale features by decoupling intra-scale interaction and cross-scale fusion. This unique Vision Transformers-based design reduces computational costs and allows for real-time object detection.
- **IoU-aware Query Selection:** Baidu's RT-DETR improves object query initialization by utilizing IoU-aware query selection. This allows the model to focus on the most relevant objects in the scene, enhancing the detection accuracy.
- **Adaptable Inference Speed:** Baidu's RT-DETR supports flexible adjustments of inference speed by using different decoder layers without the need for retraining. This adaptability facilitates practical application in various real-time object detection scenarios.
## Pre-trained Models
The Ultralytics Python API provides pre-trained PaddlePaddle RT-DETR models with different scales:
- RT-DETR-L: 53.0% AP on COCO val2017, 114 FPS on T4 GPU
- RT-DETR-X: 54.8% AP on COCO val2017, 74 FPS on T4 GPU
## Usage
You can use RT-DETR for object detection tasks using the `ultralytics` pip package. The following is a sample code snippet showing how to use RT-DETR models for training and inference:
!!! example ""
This example provides simple inference code for RT-DETR. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using RT-DETR with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
=== "Python"
```python
from ultralytics import RTDETR
# Load a COCO-pretrained RT-DETR-l model
model = RTDETR('rtdetr-l.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
# Run inference with the RT-DETR-l model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
=== "CLI"
```bash
# Load a COCO-pretrained RT-DETR-l model and train it on the COCO8 example dataset for 100 epochs
yolo train model=rtdetr-l.pt data=coco8.yaml epochs=100 imgsz=640
# Load a COCO-pretrained RT-DETR-l model and run inference on the 'bus.jpg' image
yolo predict model=rtdetr-l.pt source=path/to/bus.jpg
```
### Supported Tasks
| Model Type | Pre-trained Weights | Tasks Supported |
|---------------------|---------------------|------------------|
| RT-DETR Large | `rtdetr-l.pt` | Object Detection |
| RT-DETR Extra-Large | `rtdetr-x.pt` | Object Detection |
### Supported Modes
| Mode | Supported |
|------------|--------------------|
| Inference | :heavy_check_mark: |
| Validation | :heavy_check_mark: |
| Training | :heavy_check_mark: |
## Citations and Acknowledgements
If you use Baidu's RT-DETR in your research or development work, please cite the [original paper](https://arxiv.org/abs/2304.08069):
!!! note ""
=== "BibTeX"
```bibtex
@misc{lv2023detrs,
title={DETRs Beat YOLOs on Real-time Object Detection},
author={Wenyu Lv and Shangliang Xu and Yian Zhao and Guanzhong Wang and Jinman Wei and Cheng Cui and Yuning Du and Qingqing Dang and Yi Liu},
year={2023},
eprint={2304.08069},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to acknowledge Baidu and the [PaddlePaddle](https://github.com/PaddlePaddle/PaddleDetection) team for creating and maintaining this valuable resource for the computer vision community. Their contribution to the field with the development of the Vision Transformers-based real-time object detector, RT-DETR, is greatly appreciated.
*Keywords: RT-DETR, Transformer, ViT, Vision Transformers, Baidu RT-DETR, PaddlePaddle, Paddle Paddle RT-DETR, real-time object detection, Vision Transformers-based object detection, pre-trained PaddlePaddle RT-DETR models, Baidu's RT-DETR usage, Ultralytics Python API*

@ -1,36 +1,233 @@
---
comments: true
description: Explore the cutting-edge Segment Anything Model (SAM) from Ultralytics that allows real-time image segmentation. Learn about its promptable segmentation, zero-shot performance, and how to use it.
keywords: Ultralytics, image segmentation, Segment Anything Model, SAM, SA-1B dataset, real-time performance, zero-shot transfer, object detection, image analysis, machine learning
---
# Vision Transformers
# Segment Anything Model (SAM)
Vit models currently support Python environment:
Welcome to the frontier of image segmentation with the Segment Anything Model, or SAM. This revolutionary model has changed the game by introducing promptable image segmentation with real-time performance, setting new standards in the field.
```python
from ultralytics.vit import SAM
## Introduction to SAM: The Segment Anything Model
# from ultralytics.vit import MODEL_TYPe
The Segment Anything Model, or SAM, is a cutting-edge image segmentation model that allows for promptable segmentation, providing unparalleled versatility in image analysis tasks. SAM forms the heart of the Segment Anything initiative, a groundbreaking project that introduces a novel model, task, and dataset for image segmentation.
model = SAM("sam_b.pt")
model.info() # display model information
model.predict(...) # train the model
```
SAM's advanced design allows it to adapt to new image distributions and tasks without prior knowledge, a feature known as zero-shot transfer. Trained on the expansive [SA-1B dataset](https://ai.facebook.com/datasets/segment-anything/), which contains more than 1 billion masks spread over 11 million carefully curated images, SAM has displayed impressive zero-shot performance, surpassing previous fully supervised results in many cases.
# Segment Anything
![Dataset sample image](https://user-images.githubusercontent.com/26833433/238056229-0e8ffbeb-f81a-477e-a490-aff3d82fd8ce.jpg)
Example images with overlaid masks from our newly introduced dataset, SA-1B. SA-1B contains 11M diverse, high-resolution, licensed, and privacy protecting images and 1.1B high-quality segmentation masks. These masks were annotated fully automatically by SAM, and as verified by human ratings and numerous experiments, are of high quality and diversity. Images are grouped by number of masks per image for visualization (there are ∼100 masks per image on average).
## About
## Key Features of the Segment Anything Model (SAM)
## Supported Tasks
- **Promptable Segmentation Task:** SAM was designed with a promptable segmentation task in mind, allowing it to generate valid segmentation masks from any given prompt, such as spatial or text clues identifying an object.
- **Advanced Architecture:** The Segment Anything Model employs a powerful image encoder, a prompt encoder, and a lightweight mask decoder. This unique architecture enables flexible prompting, real-time mask computation, and ambiguity awareness in segmentation tasks.
- **The SA-1B Dataset:** Introduced by the Segment Anything project, the SA-1B dataset features over 1 billion masks on 11 million images. As the largest segmentation dataset to date, it provides SAM with a diverse and large-scale training data source.
- **Zero-Shot Performance:** SAM displays outstanding zero-shot performance across various segmentation tasks, making it a ready-to-use tool for diverse applications with minimal need for prompt engineering.
For an in-depth look at the Segment Anything Model and the SA-1B dataset, please visit the [Segment Anything website](https://segment-anything.com) and check out the research paper [Segment Anything](https://arxiv.org/abs/2304.02643).
## How to Use SAM: Versatility and Power in Image Segmentation
The Segment Anything Model can be employed for a multitude of downstream tasks that go beyond its training data. This includes edge detection, object proposal generation, instance segmentation, and preliminary text-to-mask prediction. With prompt engineering, SAM can swiftly adapt to new tasks and data distributions in a zero-shot manner, establishing it as a versatile and potent tool for all your image segmentation needs.
### SAM prediction example
!!! example "Segment with prompts"
Segment image with given prompts.
=== "Python"
```python
from ultralytics import SAM
# Load a model
model = SAM('sam_b.pt')
# Display model information (optional)
model.info()
# Run inference with bboxes prompt
model('ultralytics/assets/zidane.jpg', bboxes=[439, 437, 524, 709])
# Run inference with points prompt
model.predict('ultralytics/assets/zidane.jpg', points=[900, 370], labels=[1])
```
!!! example "Segment everything"
Segment the whole image.
=== "Python"
```python
from ultralytics import SAM
# Load a model
model = SAM('sam_b.pt')
# Display model information (optional)
model.info()
# Run inference
model('path/to/image.jpg')
```
=== "CLI"
```bash
# Run inference with a SAM model
yolo predict model=sam_b.pt source=path/to/image.jpg
```
- The logic here is to segment the whole image if you don't pass any prompts(bboxes/points/masks).
!!! example "SAMPredictor example"
This way you can set image once and run prompts inference multiple times without running image encoder multiple times.
=== "Prompt inference"
```python
from ultralytics.models.sam import Predictor as SAMPredictor
# Create SAMPredictor
overrides = dict(conf=0.25, task='segment', mode='predict', imgsz=1024, model="mobile_sam.pt")
predictor = SAMPredictor(overrides=overrides)
# Set image
predictor.set_image("ultralytics/assets/zidane.jpg") # set with image file
predictor.set_image(cv2.imread("ultralytics/assets/zidane.jpg")) # set with np.ndarray
results = predictor(bboxes=[439, 437, 524, 709])
results = predictor(points=[900, 370], labels=[1])
# Reset image
predictor.reset_image()
```
Segment everything with additional args.
=== "Segment everything"
```python
from ultralytics.models.sam import Predictor as SAMPredictor
# Create SAMPredictor
overrides = dict(conf=0.25, task='segment', mode='predict', imgsz=1024, model="mobile_sam.pt")
predictor = SAMPredictor(overrides=overrides)
# Segment with additional args
results = predictor(source="ultralytics/assets/zidane.jpg", crop_n_layers=1, points_stride=64)
```
- More additional args for `Segment everything` see [`Predictor/generate` Reference](../reference/models/sam/predict.md).
## Available Models and Supported Tasks
| Model Type | Pre-trained Weights | Tasks Supported |
|------------|---------------------|-----------------------|
| sam base | `sam_b.pt` | Instance Segmentation |
| sam large | `sam_l.pt` | Instance Segmentation |
| SAM base | `sam_b.pt` | Instance Segmentation |
| SAM large | `sam_l.pt` | Instance Segmentation |
## Supported Modes
## Operating Modes
| Mode | Supported |
|------------|--------------------|
| Inference | :heavy_check_mark: |
| Validation | :x: |
| Training | :x: |
## SAM comparison vs YOLOv8
Here we compare Meta's smallest SAM model, SAM-b, with Ultralytics smallest segmentation model, [YOLOv8n-seg](../tasks/segment.md):
| Model | Size | Parameters | Speed (CPU) |
|------------------------------------------------|----------------------------|------------------------|----------------------------|
| Meta's SAM-b | 358 MB | 94.7 M | 51096 ms/im |
| [MobileSAM](mobile-sam.md) | 40.7 MB | 10.1 M | 46122 ms/im |
| [FastSAM-s](fast-sam.md) with YOLOv8 backbone | 23.7 MB | 11.8 M | 115 ms/im |
| Ultralytics [YOLOv8n-seg](../tasks/segment.md) | **6.7 MB** (53.4x smaller) | **3.4 M** (27.9x less) | **59 ms/im** (866x faster) |
This comparison shows the order-of-magnitude differences in the model sizes and speeds between models. Whereas SAM presents unique capabilities for automatic segmenting, it is not a direct competitor to YOLOv8 segment models, which are smaller, faster and more efficient.
Tests run on a 2023 Apple M2 Macbook with 16GB of RAM. To reproduce this test:
!!! example ""
=== "Python"
```python
from ultralytics import FastSAM, SAM, YOLO
# Profile SAM-b
model = SAM('sam_b.pt')
model.info()
model('ultralytics/assets')
# Profile MobileSAM
model = SAM('mobile_sam.pt')
model.info()
model('ultralytics/assets')
# Profile FastSAM-s
model = FastSAM('FastSAM-s.pt')
model.info()
model('ultralytics/assets')
# Profile YOLOv8n-seg
model = YOLO('yolov8n-seg.pt')
model.info()
model('ultralytics/assets')
```
## Auto-Annotation: A Quick Path to Segmentation Datasets
Auto-annotation is a key feature of SAM, allowing users to generate a [segmentation dataset](https://docs.ultralytics.com/datasets/segment) using a pre-trained detection model. This feature enables rapid and accurate annotation of a large number of images, bypassing the need for time-consuming manual labeling.
### Generate Your Segmentation Dataset Using a Detection Model
To auto-annotate your dataset with the Ultralytics framework, use the `auto_annotate` function as shown below:
!!! example ""
=== "Python"
```python
from ultralytics.data.annotator import auto_annotate
auto_annotate(data="path/to/images", det_model="yolov8x.pt", sam_model='sam_b.pt')
```
| Argument | Type | Description | Default |
|------------|---------------------|---------------------------------------------------------------------------------------------------------|--------------|
| data | str | Path to a folder containing images to be annotated. | |
| det_model | str, optional | Pre-trained YOLO detection model. Defaults to 'yolov8x.pt'. | 'yolov8x.pt' |
| sam_model | str, optional | Pre-trained SAM segmentation model. Defaults to 'sam_b.pt'. | 'sam_b.pt' |
| device | str, optional | Device to run the models on. Defaults to an empty string (CPU or GPU, if available). | |
| output_dir | str, None, optional | Directory to save the annotated results. Defaults to a 'labels' folder in the same directory as 'data'. | None |
The `auto_annotate` function takes the path to your images, with optional arguments for specifying the pre-trained detection and SAM segmentation models, the device to run the models on, and the output directory for saving the annotated results.
Auto-annotation with pre-trained models can dramatically cut down the time and effort required for creating high-quality segmentation datasets. This feature is especially beneficial for researchers and developers dealing with large image collections, as it allows them to focus on model development and evaluation rather than manual annotation.
## Citations and Acknowledgements
If you find SAM useful in your research or development work, please consider citing our paper:
!!! note ""
=== "BibTeX"
```bibtex
@misc{kirillov2023segment,
title={Segment Anything},
author={Alexander Kirillov and Eric Mintun and Nikhila Ravi and Hanzi Mao and Chloe Rolland and Laura Gustafson and Tete Xiao and Spencer Whitehead and Alexander C. Berg and Wan-Yen Lo and Piotr Dollár and Ross Girshick},
year={2023},
eprint={2304.02643},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
We would like to express our gratitude to Meta AI for creating and maintaining this valuable resource for the computer vision community.
*keywords: Segment Anything, Segment Anything Model, SAM, Meta SAM, image segmentation, promptable segmentation, zero-shot performance, SA-1B dataset, advanced architecture, auto-annotation, Ultralytics, pre-trained models, SAM base, SAM large, instance segmentation, computer vision, AI, artificial intelligence, machine learning, data annotation, segmentation masks, detection model, YOLO detection model, bibtex, Meta AI.*

@ -0,0 +1,127 @@
---
comments: true
description: Explore detailed documentation of YOLO-NAS, a superior object detection model. Learn about its features, pre-trained models, usage with Ultralytics Python API, and more.
keywords: YOLO-NAS, Deci AI, object detection, deep learning, neural architecture search, Ultralytics Python API, YOLO model, pre-trained models, quantization, optimization, COCO, Objects365, Roboflow 100
---
# YOLO-NAS
## Overview
Developed by Deci AI, YOLO-NAS is a groundbreaking object detection foundational model. It is the product of advanced Neural Architecture Search technology, meticulously designed to address the limitations of previous YOLO models. With significant improvements in quantization support and accuracy-latency trade-offs, YOLO-NAS represents a major leap in object detection.
![Model example image](https://learnopencv.com/wp-content/uploads/2023/05/yolo-nas_COCO_map_metrics.png)
**Overview of YOLO-NAS.** YOLO-NAS employs quantization-aware blocks and selective quantization for optimal performance. The model, when converted to its INT8 quantized version, experiences a minimal precision drop, a significant improvement over other models. These advancements culminate in a superior architecture with unprecedented object detection capabilities and outstanding performance.
### Key Features
- **Quantization-Friendly Basic Block:** YOLO-NAS introduces a new basic block that is friendly to quantization, addressing one of the significant limitations of previous YOLO models.
- **Sophisticated Training and Quantization:** YOLO-NAS leverages advanced training schemes and post-training quantization to enhance performance.
- **AutoNAC Optimization and Pre-training:** YOLO-NAS utilizes AutoNAC optimization and is pre-trained on prominent datasets such as COCO, Objects365, and Roboflow 100. This pre-training makes it extremely suitable for downstream object detection tasks in production environments.
## Pre-trained Models
Experience the power of next-generation object detection with the pre-trained YOLO-NAS models provided by Ultralytics. These models are designed to deliver top-notch performance in terms of both speed and accuracy. Choose from a variety of options tailored to your specific needs:
| Model | mAP | Latency (ms) |
|------------------|-------|--------------|
| YOLO-NAS S | 47.5 | 3.21 |
| YOLO-NAS M | 51.55 | 5.85 |
| YOLO-NAS L | 52.22 | 7.87 |
| YOLO-NAS S INT-8 | 47.03 | 2.36 |
| YOLO-NAS M INT-8 | 51.0 | 3.78 |
| YOLO-NAS L INT-8 | 52.1 | 4.78 |
Each model variant is designed to offer a balance between Mean Average Precision (mAP) and latency, helping you optimize your object detection tasks for both performance and speed.
## Usage
Ultralytics has made YOLO-NAS models easy to integrate into your Python applications via our `ultralytics` python package. The package provides a user-friendly Python API to streamline the process.
The following examples show how to use YOLO-NAS models with the `ultralytics` package for inference and validation:
### Inference and Validation Examples
In this example we validate YOLO-NAS-s on the COCO8 dataset.
!!! example ""
This example provides simple inference and validation code for YOLO-NAS. For handling inference results see [Predict](../modes/predict.md) mode. For using YOLO-NAS with additional modes see [Val](../modes/val.md) and [Export](../modes/export.md). YOLO-NAS on the `ultralytics` package does not support training.
=== "Python"
PyTorch pretrained `*.pt` models files can be passed to the `NAS()` class to create a model instance in python:
```python
from ultralytics import NAS
# Load a COCO-pretrained YOLO-NAS-s model
model = NAS('yolo_nas_s.pt')
# Display model information (optional)
model.info()
# Validate the model on the COCO8 example dataset
results = model.val(data='coco8.yaml')
# Run inference with the YOLO-NAS-s model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
=== "CLI"
CLI commands are available to directly run the models:
```bash
# Load a COCO-pretrained YOLO-NAS-s model and validate it's performance on the COCO8 example dataset
yolo val model=yolo_nas_s.pt data=coco8.yaml
# Load a COCO-pretrained YOLO-NAS-s model and run inference on the 'bus.jpg' image
yolo predict model=yolo_nas_s.pt source=path/to/bus.jpg
```
### Supported Tasks
The YOLO-NAS models are primarily designed for object detection tasks. You can download the pre-trained weights for each variant of the model as follows:
| Model Type | Pre-trained Weights | Tasks Supported |
|------------|-----------------------------------------------------------------------------------------------|------------------|
| YOLO-NAS-s | [yolo_nas_s.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolo_nas_s.pt) | Object Detection |
| YOLO-NAS-m | [yolo_nas_m.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolo_nas_m.pt) | Object Detection |
| YOLO-NAS-l | [yolo_nas_l.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolo_nas_l.pt) | Object Detection |
### Supported Modes
The YOLO-NAS models support both inference and validation modes, allowing you to predict and validate results with ease. Training mode, however, is currently not supported.
| Mode | Supported |
|------------|--------------------|
| Inference | :heavy_check_mark: |
| Validation | :heavy_check_mark: |
| Training | :x: |
Harness the power of the YOLO-NAS models to drive your object detection tasks to new heights of performance and speed.
## Citations and Acknowledgements
If you employ YOLO-NAS in your research or development work, please cite SuperGradients:
!!! note ""
=== "BibTeX"
```bibtex
@misc{supergradients,
doi = {10.5281/ZENODO.7789328},
url = {https://zenodo.org/record/7789328},
author = {Aharon, Shay and {Louis-Dupont} and {Ofri Masad} and Yurkova, Kate and {Lotem Fridman} and {Lkdci} and Khvedchenya, Eugene and Rubin, Ran and Bagrov, Natan and Tymchenko, Borys and Keren, Tomer and Zhilko, Alexander and {Eran-Deci}},
title = {Super-Gradients},
publisher = {GitHub},
journal = {GitHub repository},
year = {2021},
}
```
We express our gratitude to Deci AI's [SuperGradients](https://github.com/Deci-AI/super-gradients/) team for their efforts in creating and maintaining this valuable resource for the computer vision community. We believe YOLO-NAS, with its innovative architecture and superior object detection capabilities, will become a critical tool for developers and researchers alike.
*Keywords: YOLO-NAS, Deci AI, object detection, deep learning, neural architecture search, Ultralytics Python API, YOLO model, SuperGradients, pre-trained models, quantization-friendly basic block, advanced training schemes, post-training quantization, AutoNAC optimization, COCO, Objects365, Roboflow 100*

@ -1,7 +1,107 @@
---
comments: true
description: Get an overview of YOLOv3, YOLOv3-Ultralytics and YOLOv3u. Learn about their key features, usage, and supported tasks for object detection.
keywords: YOLOv3, YOLOv3-Ultralytics, YOLOv3u, Object Detection, Inference, Training, Ultralytics
---
# 🚧Page Under Construction ⚒
# YOLOv3, YOLOv3-Ultralytics, and YOLOv3u
This page is currently under construction!👷Please check back later for updates. 😃🔜
## Overview
This document presents an overview of three closely related object detection models, namely [YOLOv3](https://pjreddie.com/darknet/yolo/), [YOLOv3-Ultralytics](https://github.com/ultralytics/yolov3), and [YOLOv3u](https://github.com/ultralytics/ultralytics).
1. **YOLOv3:** This is the third version of the You Only Look Once (YOLO) object detection algorithm. Originally developed by Joseph Redmon, YOLOv3 improved on its predecessors by introducing features such as multiscale predictions and three different sizes of detection kernels.
2. **YOLOv3-Ultralytics:** This is Ultralytics' implementation of the YOLOv3 model. It reproduces the original YOLOv3 architecture and offers additional functionalities, such as support for more pre-trained models and easier customization options.
3. **YOLOv3u:** This is an updated version of YOLOv3-Ultralytics that incorporates the anchor-free, objectness-free split head used in YOLOv8 models. YOLOv3u maintains the same backbone and neck architecture as YOLOv3 but with the updated detection head from YOLOv8.
![Ultralytics YOLOv3](https://raw.githubusercontent.com/ultralytics/assets/main/yolov3/banner-yolov3.png)
## Key Features
- **YOLOv3:** Introduced the use of three different scales for detection, leveraging three different sizes of detection kernels: 13x13, 26x26, and 52x52. This significantly improved detection accuracy for objects of different sizes. Additionally, YOLOv3 added features such as multi-label predictions for each bounding box and a better feature extractor network.
- **YOLOv3-Ultralytics:** Ultralytics' implementation of YOLOv3 provides the same performance as the original model but comes with added support for more pre-trained models, additional training methods, and easier customization options. This makes it more versatile and user-friendly for practical applications.
- **YOLOv3u:** This updated model incorporates the anchor-free, objectness-free split head from YOLOv8. By eliminating the need for pre-defined anchor boxes and objectness scores, this detection head design can improve the model's ability to detect objects of varying sizes and shapes. This makes YOLOv3u more robust and accurate for object detection tasks.
## Supported Tasks
YOLOv3, YOLOv3-Ultralytics, and YOLOv3u all support the following tasks:
- Object Detection
## Supported Modes
All three models support the following modes:
- Inference
- Validation
- Training
- Export
## Performance
Below is a comparison of the performance of the three models. The performance is measured in terms of the Mean Average Precision (mAP) on the COCO dataset:
TODO
## Usage
You can use YOLOv3 for object detection tasks using the Ultralytics repository. The following is a sample code snippet showing how to use YOLOv3 model for inference:
!!! example ""
This example provides simple inference code for YOLOv3. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv3 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
=== "Python"
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLOv3n model
model = YOLO('yolov3n.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
# Run inference with the YOLOv3n model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
=== "CLI"
CLI commands are available to directly run the models:
```bash
# Load a COCO-pretrained YOLOv3n model and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov3n.pt data=coco8.yaml epochs=100 imgsz=640
# Load a COCO-pretrained YOLOv3n model and run inference on the 'bus.jpg' image
yolo predict model=yolov3n.pt source=path/to/bus.jpg
```
## Citations and Acknowledgements
If you use YOLOv3 in your research, please cite the original YOLO papers and the Ultralytics YOLOv3 repository:
!!! note ""
=== "BibTeX"
```bibtex
@article{redmon2018yolov3,
title={YOLOv3: An Incremental Improvement},
author={Redmon, Joseph and Farhadi, Ali},
journal={arXiv preprint arXiv:1804.02767},
year={2018}
}
```
Thank you to Joseph Redmon and Ali Farhadi for developing the original YOLOv3.

@ -0,0 +1,71 @@
---
comments: true
description: Explore our detailed guide on YOLOv4, a state-of-the-art real-time object detector. Understand its architectural highlights, innovative features, and application examples.
keywords: ultralytics, YOLOv4, object detection, neural network, real-time detection, object detector, machine learning
---
# YOLOv4: High-Speed and Precise Object Detection
Welcome to the Ultralytics documentation page for YOLOv4, a state-of-the-art, real-time object detector launched in 2020 by Alexey Bochkovskiy at [https://github.com/AlexeyAB/darknet](https://github.com/AlexeyAB/darknet). YOLOv4 is designed to provide the optimal balance between speed and accuracy, making it an excellent choice for many applications.
![YOLOv4 architecture diagram](https://user-images.githubusercontent.com/26833433/246185689-530b7fe8-737b-4bb0-b5dd-de10ef5aface.png)
**YOLOv4 architecture diagram**. Showcasing the intricate network design of YOLOv4, including the backbone, neck, and head components, and their interconnected layers for optimal real-time object detection.
## Introduction
YOLOv4 stands for You Only Look Once version 4. It is a real-time object detection model developed to address the limitations of previous YOLO versions like [YOLOv3](./yolov3.md) and other object detection models. Unlike other convolutional neural network (CNN) based object detectors, YOLOv4 is not only applicable for recommendation systems but also for standalone process management and human input reduction. Its operation on conventional graphics processing units (GPUs) allows for mass usage at an affordable price, and it is designed to work in real-time on a conventional GPU while requiring only one such GPU for training.
## Architecture
YOLOv4 makes use of several innovative features that work together to optimize its performance. These include Weighted-Residual-Connections (WRC), Cross-Stage-Partial-connections (CSP), Cross mini-Batch Normalization (CmBN), Self-adversarial-training (SAT), Mish-activation, Mosaic data augmentation, DropBlock regularization, and CIoU loss. These features are combined to achieve state-of-the-art results.
A typical object detector is composed of several parts including the input, the backbone, the neck, and the head. The backbone of YOLOv4 is pre-trained on ImageNet and is used to predict classes and bounding boxes of objects. The backbone could be from several models including VGG, ResNet, ResNeXt, or DenseNet. The neck part of the detector is used to collect feature maps from different stages and usually includes several bottom-up paths and several top-down paths. The head part is what is used to make the final object detections and classifications.
## Bag of Freebies
YOLOv4 also makes use of methods known as "bag of freebies," which are techniques that improve the accuracy of the model during training without increasing the cost of inference. Data augmentation is a common bag of freebies technique used in object detection, which increases the variability of the input images to improve the robustness of the model. Some examples of data augmentation include photometric distortions (adjusting the brightness, contrast, hue, saturation, and noise of an image) and geometric distortions (adding random scaling, cropping, flipping, and rotating). These techniques help the model to generalize better to different types of images.
## Features and Performance
YOLOv4 is designed for optimal speed and accuracy in object detection. The architecture of YOLOv4 includes CSPDarknet53 as the backbone, PANet as the neck, and YOLOv3 as the detection head. This design allows YOLOv4 to perform object detection at an impressive speed, making it suitable for real-time applications. YOLOv4 also excels in accuracy, achieving state-of-the-art results in object detection benchmarks.
## Usage Examples
As of the time of writing, Ultralytics does not currently support YOLOv4 models. Therefore, any users interested in using YOLOv4 will need to refer directly to the YOLOv4 GitHub repository for installation and usage instructions.
Here is a brief overview of the typical steps you might take to use YOLOv4:
1. Visit the YOLOv4 GitHub repository: [https://github.com/AlexeyAB/darknet](https://github.com/AlexeyAB/darknet).
2. Follow the instructions provided in the README file for installation. This typically involves cloning the repository, installing necessary dependencies, and setting up any necessary environment variables.
3. Once installation is complete, you can train and use the model as per the usage instructions provided in the repository. This usually involves preparing your dataset, configuring the model parameters, training the model, and then using the trained model to perform object detection.
Please note that the specific steps may vary depending on your specific use case and the current state of the YOLOv4 repository. Therefore, it is strongly recommended to refer directly to the instructions provided in the YOLOv4 GitHub repository.
We regret any inconvenience this may cause and will strive to update this document with usage examples for Ultralytics once support for YOLOv4 is implemented.
## Conclusion
YOLOv4 is a powerful and efficient object detection model that strikes a balance between speed and accuracy. Its use of unique features and bag of freebies techniques during training allows it to perform excellently in real-time object detection tasks. YOLOv4 can be trained and used by anyone with a conventional GPU, making it accessible and practical for a wide range of applications.
## Citations and Acknowledgements
We would like to acknowledge the YOLOv4 authors for their significant contributions in the field of real-time object detection:
!!! note ""
=== "BibTeX"
```bibtex
@misc{bochkovskiy2020yolov4,
title={YOLOv4: Optimal Speed and Accuracy of Object Detection},
author={Alexey Bochkovskiy and Chien-Yao Wang and Hong-Yuan Mark Liao},
year={2020},
eprint={2004.10934},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
The original YOLOv4 paper can be found on [arXiv](https://arxiv.org/pdf/2004.10934.pdf). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/AlexeyAB/darknet). We appreciate their efforts in advancing the field and making their work accessible to the broader community.

@ -1,12 +1,24 @@
---
comments: true
description: Discover YOLOv5u, a boosted version of the YOLOv5 model featuring an improved accuracy-speed tradeoff and numerous pre-trained models for various object detection tasks.
keywords: YOLOv5u, object detection, pre-trained models, Ultralytics, Inference, Validation, YOLOv5, YOLOv8, anchor-free, objectness-free, real-time applications, machine learning
---
# YOLOv5u
# YOLOv5
## About
## Overview
Anchor-free YOLOv5 models with improved accuracy-speed tradeoff.
YOLOv5u represents an advancement in object detection methodologies. Originating from the foundational architecture of the [YOLOv5](https://github.com/ultralytics/yolov5) model developed by Ultralytics, YOLOv5u integrates the anchor-free, objectness-free split head, a feature previously introduced in the [YOLOv8](./yolov8.md) models. This adaptation refines the model's architecture, leading to an improved accuracy-speed tradeoff in object detection tasks. Given the empirical results and its derived features, YOLOv5u provides an efficient alternative for those seeking robust solutions in both research and practical applications.
![Ultralytics YOLOv5](https://raw.githubusercontent.com/ultralytics/assets/main/yolov5/v70/splash.png)
## Key Features
- **Anchor-free Split Ultralytics Head:** Traditional object detection models rely on predefined anchor boxes to predict object locations. However, YOLOv5u modernizes this approach. By adopting an anchor-free split Ultralytics head, it ensures a more flexible and adaptive detection mechanism, consequently enhancing the performance in diverse scenarios.
- **Optimized Accuracy-Speed Tradeoff:** Speed and accuracy often pull in opposite directions. But YOLOv5u challenges this tradeoff. It offers a calibrated balance, ensuring real-time detections without compromising on accuracy. This feature is particularly invaluable for applications that demand swift responses, such as autonomous vehicles, robotics, and real-time video analytics.
- **Variety of Pre-trained Models:** Understanding that different tasks require different toolsets, YOLOv5u provides a plethora of pre-trained models. Whether you're focusing on Inference, Validation, or Training, there's a tailor-made model awaiting you. This variety ensures you're not just using a one-size-fits-all solution, but a model specifically fine-tuned for your unique challenge.
## Supported Tasks
@ -22,20 +34,82 @@ Anchor-free YOLOv5 models with improved accuracy-speed tradeoff.
| Validation | :heavy_check_mark: |
| Training | :heavy_check_mark: |
??? Performance
!!! Performance
=== "Detection"
| Model | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
| ---------------------------------------------------------------------------------------- | --------------------- | -------------------- | ------------------------------ | ----------------------------------- | ------------------ | ----------------- |
| [YOLOv5nu](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5nu.pt) | 640 | 34.3 | 73.6 | 1.06 | 2.6 | 7.7 |
| [YOLOv5su](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5su.pt) | 640 | 43.0 | 120.7 | 1.27 | 9.1 | 24.0 |
| [YOLOv5mu](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5mu.pt) | 640 | 49.0 | 233.9 | 1.86 | 25.1 | 64.2 |
| [YOLOv5lu](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5lu.pt) | 640 | 52.2 | 408.4 | 2.50 | 53.2 | 135.0 |
| [YOLOv5xu](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5xu.pt) | 640 | 53.2 | 763.2 | 3.81 | 97.2 | 246.4 |
| | | | | | | |
| [YOLOv5n6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5n6u.pt) | 1280 | 42.1 | - | - | 4.3 | 7.8 |
| [YOLOv5s6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5s6u.pt) | 1280 | 48.6 | - | - | 15.3 | 24.6 |
| [YOLOv5m6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5m6u.pt) | 1280 | 53.6 | - | - | 41.2 | 65.7 |
| [YOLOv5l6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5l6u.pt) | 1280 | 55.7 | - | - | 86.1 | 137.4 |
| [YOLOv5x6u](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5x6u.pt) | 1280 | 56.8 | - | - | 155.4 | 250.7 |
| Model | YAML | size<br><sup>(pixels) | mAP<sup>val<br>50-95 | Speed<br><sup>CPU ONNX<br>(ms) | Speed<br><sup>A100 TensorRT<br>(ms) | params<br><sup>(M) | FLOPs<br><sup>(B) |
|---------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|-----------------------|----------------------|--------------------------------|-------------------------------------|--------------------|-------------------|
| [yolov5nu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5nu.pt) | [yolov5n.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 34.3 | 73.6 | 1.06 | 2.6 | 7.7 |
| [yolov5su.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5su.pt) | [yolov5s.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 43.0 | 120.7 | 1.27 | 9.1 | 24.0 |
| [yolov5mu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5mu.pt) | [yolov5m.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 49.0 | 233.9 | 1.86 | 25.1 | 64.2 |
| [yolov5lu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5lu.pt) | [yolov5l.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 52.2 | 408.4 | 2.50 | 53.2 | 135.0 |
| [yolov5xu.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5xu.pt) | [yolov5x.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5.yaml) | 640 | 53.2 | 763.2 | 3.81 | 97.2 | 246.4 |
| | | | | | | | |
| [yolov5n6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5n6u.pt) | [yolov5n6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 42.1 | 211.0 | 1.83 | 4.3 | 7.8 |
| [yolov5s6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5s6u.pt) | [yolov5s6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 48.6 | 422.6 | 2.34 | 15.3 | 24.6 |
| [yolov5m6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5m6u.pt) | [yolov5m6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 53.6 | 810.9 | 4.36 | 41.2 | 65.7 |
| [yolov5l6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5l6u.pt) | [yolov5l6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 55.7 | 1470.9 | 5.47 | 86.1 | 137.4 |
| [yolov5x6u.pt](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov5x6u.pt) | [yolov5x6.yaml](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/models/v5/yolov5-p6.yaml) | 1280 | 56.8 | 2436.5 | 8.98 | 155.4 | 250.7 |
## Usage
You can use YOLOv5u for object detection tasks using the Ultralytics repository. The following is a sample code snippet showing how to use YOLOv5u model for inference:
!!! example ""
This example provides simple inference code for YOLOv5. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv5 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
=== "Python"
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLOv5n model
model = YOLO('yolov5n.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
# Run inference with the YOLOv5n model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
=== "CLI"
CLI commands are available to directly run the models:
```bash
# Load a COCO-pretrained YOLOv5n model and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov5n.pt data=coco8.yaml epochs=100 imgsz=640
# Load a COCO-pretrained YOLOv5n model and run inference on the 'bus.jpg' image
yolo predict model=yolov5n.pt source=path/to/bus.jpg
```
## Citations and Acknowledgements
If you use YOLOv5 or YOLOv5u in your research, please cite the Ultralytics YOLOv5 repository as follows:
!!! note ""
=== "BibTeX"
```bibtex
@software{yolov5,
title = {Ultralytics YOLOv5},
author = {Glenn Jocher},
year = {2020},
version = {7.0},
license = {AGPL-3.0},
url = {https://github.com/ultralytics/yolov5},
doi = {10.5281/zenodo.3908559},
orcid = {0000-0001-5950-6979}
}
```
Special thanks to Glenn Jocher and the Ultralytics team for their work on developing and maintaining the YOLOv5 and YOLOv5u models.

@ -0,0 +1,114 @@
---
comments: true
description: Explore Meituan YOLOv6, a state-of-the-art object detection model striking a balance between speed and accuracy. Dive into features, pre-trained models, and Python usage.
keywords: Meituan YOLOv6, object detection, Ultralytics, YOLOv6 docs, Bi-directional Concatenation, Anchor-Aided Training, pretrained models, real-time applications
---
# Meituan YOLOv6
## Overview
[Meituan](https://about.meituan.com/) YOLOv6 is a cutting-edge object detector that offers remarkable balance between speed and accuracy, making it a popular choice for real-time applications. This model introduces several notable enhancements on its architecture and training scheme, including the implementation of a Bi-directional Concatenation (BiC) module, an anchor-aided training (AAT) strategy, and an improved backbone and neck design for state-of-the-art accuracy on the COCO dataset.
![Meituan YOLOv6](https://user-images.githubusercontent.com/26833433/240750495-4da954ce-8b3b-41c4-8afd-ddb74361d3c2.png)
![Model example image](https://user-images.githubusercontent.com/26833433/240750557-3e9ec4f0-0598-49a8-83ea-f33c91eb6d68.png)
**Overview of YOLOv6.** Model architecture diagram showing the redesigned network components and training strategies that have led to significant performance improvements. (a) The neck of YOLOv6 (N and S are shown). Note for M/L, RepBlocks is replaced with CSPStackRep. (b) The
structure of a BiC module. (c) A SimCSPSPPF block. ([source](https://arxiv.org/pdf/2301.05586.pdf)).
### Key Features
- **Bidirectional Concatenation (BiC) Module:** YOLOv6 introduces a BiC module in the neck of the detector, enhancing localization signals and delivering performance gains with negligible speed degradation.
- **Anchor-Aided Training (AAT) Strategy:** This model proposes AAT to enjoy the benefits of both anchor-based and anchor-free paradigms without compromising inference efficiency.
- **Enhanced Backbone and Neck Design:** By deepening YOLOv6 to include another stage in the backbone and neck, this model achieves state-of-the-art performance on the COCO dataset at high-resolution input.
- **Self-Distillation Strategy:** A new self-distillation strategy is implemented to boost the performance of smaller models of YOLOv6, enhancing the auxiliary regression branch during training and removing it at inference to avoid a marked speed decline.
## Pre-trained Models
YOLOv6 provides various pre-trained models with different scales:
- YOLOv6-N: 37.5% AP on COCO val2017 at 1187 FPS with NVIDIA Tesla T4 GPU.
- YOLOv6-S: 45.0% AP at 484 FPS.
- YOLOv6-M: 50.0% AP at 226 FPS.
- YOLOv6-L: 52.8% AP at 116 FPS.
- YOLOv6-L6: State-of-the-art accuracy in real-time.
YOLOv6 also provides quantized models for different precisions and models optimized for mobile platforms.
## Usage
You can use YOLOv6 for object detection tasks using the Ultralytics pip package. The following is a sample code snippet showing how to use YOLOv6 models for training:
!!! example ""
This example provides simple training code for YOLOv6. For more options including training settings see [Train](../modes/train.md) mode. For using YOLOv6 with additional modes see [Predict](../modes/predict.md), [Val](../modes/val.md) and [Export](../modes/export.md).
=== "Python"
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:
```python
from ultralytics import YOLO
# Build a YOLOv6n model from scratch
model = YOLO('yolov6n.yaml')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
# Run inference with the YOLOv6n model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
=== "CLI"
CLI commands are available to directly run the models:
```bash
# Build a YOLOv6n model from scratch and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov6n.yaml data=coco8.yaml epochs=100 imgsz=640
# Build a YOLOv6n model from scratch and run inference on the 'bus.jpg' image
yolo predict model=yolov6n.yaml source=path/to/bus.jpg
```
### Supported Tasks
| Model Type | Pre-trained Weights | Tasks Supported |
|------------|---------------------|------------------|
| YOLOv6-N | `yolov6-n.pt` | Object Detection |
| YOLOv6-S | `yolov6-s.pt` | Object Detection |
| YOLOv6-M | `yolov6-m.pt` | Object Detection |
| YOLOv6-L | `yolov6-l.pt` | Object Detection |
| YOLOv6-L6 | `yolov6-l6.pt` | Object Detection |
## Supported Modes
| Mode | Supported |
|------------|--------------------|
| Inference | :heavy_check_mark: |
| Validation | :heavy_check_mark: |
| Training | :heavy_check_mark: |
## Citations and Acknowledgements
We would like to acknowledge the authors for their significant contributions in the field of real-time object detection:
!!! note ""
=== "BibTeX"
```bibtex
@misc{li2023yolov6,
title={YOLOv6 v3.0: A Full-Scale Reloading},
author={Chuyi Li and Lulu Li and Yifei Geng and Hongliang Jiang and Meng Cheng and Bo Zhang and Zaidan Ke and Xiaoming Xu and Xiangxiang Chu},
year={2023},
eprint={2301.05586},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
```
The original YOLOv6 paper can be found on [arXiv](https://arxiv.org/abs/2301.05586). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/meituan/YOLOv6). We appreciate their efforts in advancing the field and making their work accessible to the broader community.

@ -0,0 +1,65 @@
---
comments: true
description: Explore the YOLOv7, a real-time object detector. Understand its superior speed, impressive accuracy, and unique trainable bag-of-freebies optimization focus.
keywords: YOLOv7, real-time object detector, state-of-the-art, Ultralytics, MS COCO dataset, model re-parameterization, dynamic label assignment, extended scaling, compound scaling
---
# YOLOv7: Trainable Bag-of-Freebies
YOLOv7 is a state-of-the-art real-time object detector that surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS. It has the highest accuracy (56.8% AP) among all known real-time object detectors with 30 FPS or higher on GPU V100. Moreover, YOLOv7 outperforms other object detectors such as YOLOR, YOLOX, Scaled-YOLOv4, YOLOv5, and many others in speed and accuracy. The model is trained on the MS COCO dataset from scratch without using any other datasets or pre-trained weights. Source code for YOLOv7 is available on GitHub.
![YOLOv7 comparison with SOTA object detectors](https://github.com/ultralytics/ultralytics/assets/26833433/5e1e0420-8122-4c79-b8d0-2860aa79af92)
**Comparison of state-of-the-art object detectors.** From the results in Table 2 we know that the proposed method has the best speed-accuracy trade-off comprehensively. If we compare YOLOv7-tiny-SiLU with YOLOv5-N (r6.1), our method is 127 fps faster and 10.7% more accurate on AP. In addition, YOLOv7 has 51.4% AP at frame rate of 161 fps, while PPYOLOE-L with the same AP has only 78 fps frame rate. In terms of parameter usage, YOLOv7 is 41% less than PPYOLOE-L. If we compare YOLOv7-X with 114 fps inference speed to YOLOv5-L (r6.1) with 99 fps inference speed, YOLOv7-X can improve AP by 3.9%. If YOLOv7-X is compared with YOLOv5-X (r6.1) of similar scale, the inference speed of YOLOv7-X is 31 fps faster. In addition, in terms the amount of parameters and computation, YOLOv7-X reduces 22% of parameters and 8% of computation compared to YOLOv5-X (r6.1), but improves AP by 2.2% ([Source](https://arxiv.org/pdf/2207.02696.pdf)).
## Overview
Real-time object detection is an important component in many computer vision systems, including multi-object tracking, autonomous driving, robotics, and medical image analysis. In recent years, real-time object detection development has focused on designing efficient architectures and improving the inference speed of various CPUs, GPUs, and neural processing units (NPUs). YOLOv7 supports both mobile GPU and GPU devices, from the edge to the cloud.
Unlike traditional real-time object detectors that focus on architecture optimization, YOLOv7 introduces a focus on the optimization of the training process. This includes modules and optimization methods designed to improve the accuracy of object detection without increasing the inference cost, a concept known as the "trainable bag-of-freebies".
## Key Features
YOLOv7 introduces several key features:
1. **Model Re-parameterization**: YOLOv7 proposes a planned re-parameterized model, which is a strategy applicable to layers in different networks with the concept of gradient propagation path.
2. **Dynamic Label Assignment**: The training of the model with multiple output layers presents a new issue: "How to assign dynamic targets for the outputs of different branches?" To solve this problem, YOLOv7 introduces a new label assignment method called coarse-to-fine lead guided label assignment.
3. **Extended and Compound Scaling**: YOLOv7 proposes "extend" and "compound scaling" methods for the real-time object detector that can effectively utilize parameters and computation.
4. **Efficiency**: The method proposed by YOLOv7 can effectively reduce about 40% parameters and 50% computation of state-of-the-art real-time object detector, and has faster inference speed and higher detection accuracy.
## Usage Examples
As of the time of writing, Ultralytics does not currently support YOLOv7 models. Therefore, any users interested in using YOLOv7 will need to refer directly to the YOLOv7 GitHub repository for installation and usage instructions.
Here is a brief overview of the typical steps you might take to use YOLOv7:
1. Visit the YOLOv7 GitHub repository: [https://github.com/WongKinYiu/yolov7](https://github.com/WongKinYiu/yolov7).
2. Follow the instructions provided in the README file for installation. This typically involves cloning the repository, installing necessary dependencies, and setting up any necessary environment variables.
3. Once installation is complete, you can train and use the model as per the usage instructions provided in the repository. This usually involves preparing your dataset, configuring the model parameters, training the model, and then using the trained model to perform object detection.
Please note that the specific steps may vary depending on your specific use case and the current state of the YOLOv7 repository. Therefore, it is strongly recommended to refer directly to the instructions provided in the YOLOv7 GitHub repository.
We regret any inconvenience this may cause and will strive to update this document with usage examples for Ultralytics once support for YOLOv7 is implemented.
## Citations and Acknowledgements
We would like to acknowledge the YOLOv7 authors for their significant contributions in the field of real-time object detection:
!!! note ""
=== "BibTeX"
```bibtex
@article{wang2022yolov7,
title={{YOLOv7}: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors},
author={Wang, Chien-Yao and Bochkovskiy, Alexey and Liao, Hong-Yuan Mark},
journal={arXiv preprint arXiv:2207.02696},
year={2022}
}
```
The original YOLOv7 paper can be found on [arXiv](https://arxiv.org/pdf/2207.02696.pdf). The authors have made their work publicly available, and the codebase can be accessed on [GitHub](https://github.com/WongKinYiu/yolov7). We appreciate their efforts in advancing the field and making their work accessible to the broader community.

@ -1,19 +1,32 @@
---
comments: true
description: Explore the thrilling features of YOLOv8, the latest version of our real-time object detector! Learn how advanced architectures, pre-trained models and optimal balance between accuracy & speed make YOLOv8 the perfect choice for your object detection tasks.
keywords: YOLOv8, Ultralytics, real-time object detector, pre-trained models, documentation, object detection, YOLO series, advanced architectures, accuracy, speed
---
# YOLOv8
## About
## Overview
YOLOv8 is the latest iteration in the YOLO series of real-time object detectors, offering cutting-edge performance in terms of accuracy and speed. Building upon the advancements of previous YOLO versions, YOLOv8 introduces new features and optimizations that make it an ideal choice for various object detection tasks in a wide range of applications.
![Ultralytics YOLOv8](https://raw.githubusercontent.com/ultralytics/assets/main/yolov8/yolo-comparison-plots.png)
## Key Features
- **Advanced Backbone and Neck Architectures:** YOLOv8 employs state-of-the-art backbone and neck architectures, resulting in improved feature extraction and object detection performance.
- **Anchor-free Split Ultralytics Head:** YOLOv8 adopts an anchor-free split Ultralytics head, which contributes to better accuracy and a more efficient detection process compared to anchor-based approaches.
- **Optimized Accuracy-Speed Tradeoff:** With a focus on maintaining an optimal balance between accuracy and speed, YOLOv8 is suitable for real-time object detection tasks in diverse application areas.
- **Variety of Pre-trained Models:** YOLOv8 offers a range of pre-trained models to cater to various tasks and performance requirements, making it easier to find the right model for your specific use case.
## Supported Tasks
| Model Type | Pre-trained Weights | Task |
|-------------|------------------------------------------------------------------------------------------------------------------|-----------------------|
| YOLOv8 | `yolov8n.pt`, `yolov8s.pt`, `yolov8m.pt`, `yolov8l.pt`, `yolov8x.pt` | Detection |
| YOLOv8-seg | `yolov8n-seg.pt`, `yolov8s-seg.pt`, `yolov8m-seg.pt`, `yolov8l-seg.pt`, `yolov8x-seg.pt` | Instance Segmentation |
| YOLOv8-pose | `yolov8n-pose.pt`, `yolov8s-pose.pt`, `yolov8m-pose.pt`, `yolov8l-pose.pt`, `yolov8x-pose.pt` ,`yolov8x-pose-p6` | Pose/Keypoints |
| YOLOv8-cls | `yolov8n-cls.pt`, `yolov8s-cls.pt`, `yolov8m-cls.pt`, `yolov8l-cls.pt`, `yolov8x-cls.pt` | Classification |
| Model Type | Pre-trained Weights | Task |
|-------------|---------------------------------------------------------------------------------------------------------------------|-----------------------|
| YOLOv8 | `yolov8n.pt`, `yolov8s.pt`, `yolov8m.pt`, `yolov8l.pt`, `yolov8x.pt` | Detection |
| YOLOv8-seg | `yolov8n-seg.pt`, `yolov8s-seg.pt`, `yolov8m-seg.pt`, `yolov8l-seg.pt`, `yolov8x-seg.pt` | Instance Segmentation |
| YOLOv8-pose | `yolov8n-pose.pt`, `yolov8s-pose.pt`, `yolov8m-pose.pt`, `yolov8l-pose.pt`, `yolov8x-pose.pt`, `yolov8x-pose-p6.pt` | Pose/Keypoints |
| YOLOv8-cls | `yolov8n-cls.pt`, `yolov8s-cls.pt`, `yolov8m-cls.pt`, `yolov8l-cls.pt`, `yolov8x-cls.pt` | Classification |
## Supported Modes
@ -23,7 +36,7 @@ comments: true
| Validation | :heavy_check_mark: |
| Training | :heavy_check_mark: |
??? Performance
!!! Performance
=== "Detection"
@ -65,3 +78,65 @@ comments: true
| [YOLOv8l-pose](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8l-pose.pt) | 640 | 67.6 | 90.0 | 784.5 | 2.59 | 44.4 | 168.6 |
| [YOLOv8x-pose](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x-pose.pt) | 640 | 69.2 | 90.2 | 1607.1 | 3.73 | 69.4 | 263.2 |
| [YOLOv8x-pose-p6](https://github.com/ultralytics/assets/releases/download/v0.0.0/yolov8x-pose-p6.pt) | 1280 | 71.6 | 91.2 | 4088.7 | 10.04 | 99.1 | 1066.4 |
## Usage
You can use YOLOv8 for object detection tasks using the Ultralytics pip package. The following is a sample code snippet showing how to use YOLOv8 models for inference:
!!! example ""
This example provides simple inference code for YOLOv8. For more options including handling inference results see [Predict](../modes/predict.md) mode. For using YOLOv8 with additional modes see [Train](../modes/train.md), [Val](../modes/val.md) and [Export](../modes/export.md).
=== "Python"
PyTorch pretrained `*.pt` models as well as configuration `*.yaml` files can be passed to the `YOLO()` class to create a model instance in python:
```python
from ultralytics import YOLO
# Load a COCO-pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Display model information (optional)
model.info()
# Train the model on the COCO8 example dataset for 100 epochs
results = model.train(data='coco8.yaml', epochs=100, imgsz=640)
# Run inference with the YOLOv8n model on the 'bus.jpg' image
results = model('path/to/bus.jpg')
```
=== "CLI"
CLI commands are available to directly run the models:
```bash
# Load a COCO-pretrained YOLOv8n model and train it on the COCO8 example dataset for 100 epochs
yolo train model=yolov8n.pt data=coco8.yaml epochs=100 imgsz=640
# Load a COCO-pretrained YOLOv8n model and run inference on the 'bus.jpg' image
yolo predict model=yolov8n.pt source=path/to/bus.jpg
```
## Citations and Acknowledgements
If you use the YOLOv8 model or any other software from this repository in your work, please cite it using the following format:
!!! note ""
=== "BibTeX"
```bibtex
@software{yolov8_ultralytics,
author = {Glenn Jocher and Ayush Chaurasia and Jing Qiu},
title = {Ultralytics YOLOv8},
version = {8.0.0},
year = {2023},
url = {https://github.com/ultralytics/ultralytics},
orcid = {0000-0001-5950-6979, 0000-0002-7603-6750, 0000-0003-3783-7069},
license = {AGPL-3.0}
}
```
Please note that the DOI is pending and will be added to the citation once it is available. The usage of the software is in accordance with the AGPL-3.0 license.

@ -1,5 +1,7 @@
---
comments: true
description: Learn how to profile speed and accuracy of YOLOv8 across various export formats; get insights on mAP50-95, accuracy_top5 metrics, and more.
keywords: Ultralytics, YOLOv8, benchmarking, speed profiling, accuracy profiling, mAP50-95, accuracy_top5, ONNX, OpenVINO, TensorRT, YOLO export formats
---
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png">
@ -23,50 +25,52 @@ full list of export arguments.
!!! example ""
=== "Python"
```python
from ultralytics.yolo.utils.benchmarks import benchmark
from ultralytics.utils.benchmarks import benchmark
# Benchmark on GPU
benchmark(model='yolov8n.pt', imgsz=640, half=False, device=0)
benchmark(model='yolov8n.pt', data='coco8.yaml', imgsz=640, half=False, device=0)
```
=== "CLI"
```bash
yolo benchmark model=yolov8n.pt imgsz=640 half=False device=0
yolo benchmark model=yolov8n.pt data='coco8.yaml' imgsz=640 half=False device=0
```
## Arguments
Arguments such as `model`, `imgsz`, `half`, `device`, and `hard_fail` provide users with the flexibility to fine-tune
Arguments such as `model`, `data`, `imgsz`, `half`, `device`, and `verbose` provide users with the flexibility to fine-tune
the benchmarks to their specific needs and compare the performance of different export formats with ease.
| Key | Value | Description |
|-------------|---------|----------------------------------------------------------------------|
| `model` | `None` | path to model file, i.e. yolov8n.pt, yolov8n.yaml |
| `imgsz` | `640` | image size as scalar or (h, w) list, i.e. (640, 480) |
| `half` | `False` | FP16 quantization |
| `int8` | `False` | INT8 quantization |
| `device` | `None` | device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu |
| `hard_fail` | `False` | do not continue on error (bool), or val floor threshold (float) |
| Key | Value | Description |
|-----------|---------|-----------------------------------------------------------------------|
| `model` | `None` | path to model file, i.e. yolov8n.pt, yolov8n.yaml |
| `data` | `None` | path to YAML referencing the benchmarking dataset (under `val` label) |
| `imgsz` | `640` | image size as scalar or (h, w) list, i.e. (640, 480) |
| `half` | `False` | FP16 quantization |
| `int8` | `False` | INT8 quantization |
| `device` | `None` | device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu |
| `verbose` | `False` | do not continue on error (bool), or val floor threshold (float) |
## Export Formats
Benchmarks will attempt to run automatically on all possible export formats below.
| Format | `format` Argument | Model | Metadata |
|--------------------------------------------------------------------|-------------------|---------------------------|----------|
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ |
| [TorchScript](https://pytorch.org/docs/stable/jit.html) | `torchscript` | `yolov8n.torchscript` | ✅ |
| [ONNX](https://onnx.ai/) | `onnx` | `yolov8n.onnx` | ✅ |
| [OpenVINO](https://docs.openvino.ai/latest/index.html) | `openvino` | `yolov8n_openvino_model/` | ✅ |
| [TensorRT](https://developer.nvidia.com/tensorrt) | `engine` | `yolov8n.engine` | ✅ |
| [CoreML](https://github.com/apple/coremltools) | `coreml` | `yolov8n.mlmodel` | ✅ |
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model) | `saved_model` | `yolov8n_saved_model/` | ✅ |
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb` | `yolov8n.pb` | ❌ |
| [TF Lite](https://www.tensorflow.org/lite) | `tflite` | `yolov8n.tflite` | ✅ |
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ |
| [TF.js](https://www.tensorflow.org/js) | `tfjs` | `yolov8n_web_model/` | ✅ |
| [PaddlePaddle](https://github.com/PaddlePaddle) | `paddle` | `yolov8n_paddle_model/` | ✅ |
See full `export` details in the [Export](https://docs.ultralytics.com/modes/export/) page.
| Format | `format` Argument | Model | Metadata | Arguments |
|--------------------------------------------------------------------|-------------------|---------------------------|----------|-----------------------------------------------------|
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ | - |
| [TorchScript](https://pytorch.org/docs/stable/jit.html) | `torchscript` | `yolov8n.torchscript` | ✅ | `imgsz`, `optimize` |
| [ONNX](https://onnx.ai/) | `onnx` | `yolov8n.onnx` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `opset` |
| [OpenVINO](https://docs.openvino.ai/latest/index.html) | `openvino` | `yolov8n_openvino_model/` | ✅ | `imgsz`, `half` |
| [TensorRT](https://developer.nvidia.com/tensorrt) | `engine` | `yolov8n.engine` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `workspace` |
| [CoreML](https://github.com/apple/coremltools) | `coreml` | `yolov8n.mlpackage` | ✅ | `imgsz`, `half`, `int8`, `nms` |
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model) | `saved_model` | `yolov8n_saved_model/` | ✅ | `imgsz`, `keras` |
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb` | `yolov8n.pb` | ❌ | `imgsz` |
| [TF Lite](https://www.tensorflow.org/lite) | `tflite` | `yolov8n.tflite` | ✅ | `imgsz`, `half`, `int8` |
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ | `imgsz` |
| [TF.js](https://www.tensorflow.org/js) | `tfjs` | `yolov8n_web_model/` | ✅ | `imgsz` |
| [PaddlePaddle](https://github.com/PaddlePaddle) | `paddle` | `yolov8n_paddle_model/` | ✅ | `imgsz` |
| [ncnn](https://github.com/Tencent/ncnn) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half` |
See full `export` details in the [Export](https://docs.ultralytics.com/modes/export/) page.

@ -1,5 +1,7 @@
---
comments: true
description: Step-by-step guide on exporting your YOLOv8 models to various format like ONNX, TensorRT, CoreML and more for deployment. Explore now!.
keywords: YOLO, YOLOv8, Ultralytics, Model export, ONNX, TensorRT, CoreML, TensorFlow SavedModel, OpenVINO, PyTorch, export model
---
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png">
@ -21,19 +23,19 @@ export arguments.
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load an official model
model = YOLO('path/to/best.pt') # load a custom trained
# Export the model
model.export(format='onnx')
```
=== "CLI"
```bash
yolo export model=yolov8n.pt format=onnx # export official model
yolo export model=path/to/best.pt format=onnx # export custom trained model
@ -58,8 +60,8 @@ the intended use case and can be used effectively in the target environment.
| `optimize` | `False` | TorchScript: optimize for mobile |
| `half` | `False` | FP16 quantization |
| `int8` | `False` | INT8 quantization |
| `dynamic` | `False` | ONNX/TF/TensorRT: dynamic axes |
| `simplify` | `False` | ONNX: simplify model |
| `dynamic` | `False` | ONNX/TensorRT: dynamic axes |
| `simplify` | `False` | ONNX/TensorRT: simplify model |
| `opset` | `None` | ONNX: opset version (optional, defaults to latest) |
| `workspace` | `4` | TensorRT: workspace size (GB) |
| `nms` | `False` | CoreML: add NMS |
@ -69,17 +71,18 @@ the intended use case and can be used effectively in the target environment.
Available YOLOv8 export formats are in the table below. You can export to any format using the `format` argument,
i.e. `format='onnx'` or `format='engine'`.
| Format | `format` Argument | Model | Metadata |
|--------------------------------------------------------------------|-------------------|---------------------------|----------|
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ |
| [TorchScript](https://pytorch.org/docs/stable/jit.html) | `torchscript` | `yolov8n.torchscript` | ✅ |
| [ONNX](https://onnx.ai/) | `onnx` | `yolov8n.onnx` | ✅ |
| [OpenVINO](https://docs.openvino.ai/latest/index.html) | `openvino` | `yolov8n_openvino_model/` | ✅ |
| [TensorRT](https://developer.nvidia.com/tensorrt) | `engine` | `yolov8n.engine` | ✅ |
| [CoreML](https://github.com/apple/coremltools) | `coreml` | `yolov8n.mlmodel` | ✅ |
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model) | `saved_model` | `yolov8n_saved_model/` | ✅ |
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb` | `yolov8n.pb` | ❌ |
| [TF Lite](https://www.tensorflow.org/lite) | `tflite` | `yolov8n.tflite` | ✅ |
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ |
| [TF.js](https://www.tensorflow.org/js) | `tfjs` | `yolov8n_web_model/` | ✅ |
| [PaddlePaddle](https://github.com/PaddlePaddle) | `paddle` | `yolov8n_paddle_model/` | ✅ |
| Format | `format` Argument | Model | Metadata | Arguments |
|--------------------------------------------------------------------|-------------------|---------------------------|----------|-----------------------------------------------------|
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ | - |
| [TorchScript](https://pytorch.org/docs/stable/jit.html) | `torchscript` | `yolov8n.torchscript` | ✅ | `imgsz`, `optimize` |
| [ONNX](https://onnx.ai/) | `onnx` | `yolov8n.onnx` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `opset` |
| [OpenVINO](https://docs.openvino.ai/latest/index.html) | `openvino` | `yolov8n_openvino_model/` | ✅ | `imgsz`, `half` |
| [TensorRT](https://developer.nvidia.com/tensorrt) | `engine` | `yolov8n.engine` | ✅ | `imgsz`, `half`, `dynamic`, `simplify`, `workspace` |
| [CoreML](https://github.com/apple/coremltools) | `coreml` | `yolov8n.mlpackage` | ✅ | `imgsz`, `half`, `int8`, `nms` |
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model) | `saved_model` | `yolov8n_saved_model/` | ✅ | `imgsz`, `keras` |
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb` | `yolov8n.pb` | ❌ | `imgsz` |
| [TF Lite](https://www.tensorflow.org/lite) | `tflite` | `yolov8n.tflite` | ✅ | `imgsz`, `half`, `int8` |
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ | `imgsz` |
| [TF.js](https://www.tensorflow.org/js) | `tfjs` | `yolov8n_web_model/` | ✅ | `imgsz` |
| [PaddlePaddle](https://github.com/PaddlePaddle) | `paddle` | `yolov8n_paddle_model/` | ✅ | `imgsz` |
| [ncnn](https://github.com/Tencent/ncnn) | `ncnn` | `yolov8n_ncnn_model/` | ✅ | `imgsz`, `half` |

@ -1,5 +1,7 @@
---
comments: true
description: From training to tracking, make the most of YOLOv8 with Ultralytics. Get insights and examples for each supported mode including validation, export, and benchmarking.
keywords: Ultralytics, YOLOv8, Machine Learning, Object Detection, Training, Validation, Prediction, Export, Tracking, Benchmarking
---
# Ultralytics YOLOv8 Modes
@ -8,12 +10,12 @@ comments: true
Ultralytics YOLOv8 supports several **modes** that can be used to perform different tasks. These modes are:
**Train**: For training a YOLOv8 model on a custom dataset.
**Val**: For validating a YOLOv8 model after it has been trained.
**Predict**: For making predictions using a trained YOLOv8 model on new images or videos.
**Export**: For exporting a YOLOv8 model to a format that can be used for deployment.
**Track**: For tracking objects in real-time using a YOLOv8 model.
**Benchmark**: For benchmarking YOLOv8 exports (ONNX, TensorRT, etc.) speed and accuracy.
- **Train**: For training a YOLOv8 model on a custom dataset.
- **Val**: For validating a YOLOv8 model after it has been trained.
- **Predict**: For making predictions using a trained YOLOv8 model on new images or videos.
- **Export**: For exporting a YOLOv8 model to a format that can be used for deployment.
- **Track**: For tracking objects in real-time using a YOLOv8 model.
- **Benchmark**: For benchmarking YOLOv8 exports (ONNX, TensorRT, etc.) speed and accuracy.
## [Train](train.md)

@ -1,104 +1,336 @@
---
comments: true
description: Discover how to use YOLOv8 predict mode for various tasks. Learn about different inference sources like images, videos, and data formats.
keywords: Ultralytics, YOLOv8, predict mode, inference sources, prediction tasks, streaming mode, image processing, video processing, machine learning, AI
---
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png">
YOLOv8 **predict mode** can generate predictions for various tasks, returning either a list of `Results` objects or a
memory-efficient generator of `Results` objects when using the streaming mode. Enable streaming mode by
passing `stream=True` in the predictor's call method.
YOLOv8 **predict mode** can generate predictions for various tasks, returning either a list of `Results` objects or a memory-efficient generator of `Results` objects when using the streaming mode. Enable streaming mode by passing `stream=True` in the predictor's call method.
!!! example "Predict"
=== "Return a list with `Stream=False`"
=== "Return a list with `stream=False`"
```python
inputs = [img, img] # list of numpy arrays
results = model(inputs) # list of Results objects
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # pretrained YOLOv8n model
# Run batched inference on a list of images
results = model(['im1.jpg', 'im2.jpg']) # return a list of Results objects
# Process results list
for result in results:
boxes = result.boxes # Boxes object for bbox outputs
masks = result.masks # Masks object for segmentation masks outputs
probs = result.probs # Class probabilities for classification outputs
keypoints = result.keypoints # Keypoints object for pose outputs
probs = result.probs # Probs object for classification outputs
```
=== "Return a generator with `Stream=True`"
=== "Return a generator with `stream=True`"
```python
inputs = [img, img] # list of numpy arrays
results = model(inputs, stream=True) # generator of Results objects
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # pretrained YOLOv8n model
# Run batched inference on a list of images
results = model(['im1.jpg', 'im2.jpg'], stream=True) # return a generator of Results objects
# Process results generator
for result in results:
boxes = result.boxes # Boxes object for bbox outputs
masks = result.masks # Masks object for segmentation masks outputs
probs = result.probs # Class probabilities for classification outputs
keypoints = result.keypoints # Keypoints object for pose outputs
probs = result.probs # Probs object for classification outputs
```
## Inference Sources
YOLOv8 can process different types of input sources for inference, as shown in the table below. The sources include static images, video streams, and various data formats. The table also indicates whether each source can be used in streaming mode with the argument `stream=True` ✅. Streaming mode is beneficial for processing videos or live streams as it creates a generator of results instead of loading all frames into memory.
!!! tip "Tip"
Streaming mode with `stream=True` should be used for long videos or large predict sources, otherwise results will accumuate in memory and will eventually cause out-of-memory errors.
## Sources
YOLOv8 can accept various input sources, as shown in the table below. This includes images, URLs, PIL images, OpenCV,
numpy arrays, torch tensors, CSV files, videos, directories, globs, YouTube videos, and streams. The table indicates
whether each source can be used in streaming mode with `stream=True` ✅ and an example argument for each source.
| source | model(arg) | type | notes |
|-------------|--------------------------------------------|----------------|------------------|
| image | `'im.jpg'` | `str`, `Path` | |
| URL | `'https://ultralytics.com/images/bus.jpg'` | `str` | |
| screenshot | `'screen'` | `str` | |
| PIL | `Image.open('im.jpg')` | `PIL.Image` | HWC, RGB |
| OpenCV | `cv2.imread('im.jpg')[:,:,::-1]` | `np.ndarray` | HWC, BGR to RGB |
| numpy | `np.zeros((640,1280,3))` | `np.ndarray` | HWC |
| torch | `torch.zeros(16,3,320,640)` | `torch.Tensor` | BCHW, RGB |
| CSV | `'sources.csv'` | `str`, `Path` | RTSP, RTMP, HTTP |
| video ✅ | `'vid.mp4'` | `str`, `Path` | |
| directory ✅ | `'path/'` | `str`, `Path` | |
| glob ✅ | `'path/*.jpg'` | `str` | Use `*` operator |
| YouTube ✅ | `'https://youtu.be/Zgi9g1ksQHc'` | `str` | |
| stream ✅ | `'rtsp://example.com/media.mp4'` | `str` | RTSP, RTMP, HTTP |
## Arguments
`model.predict` accepts multiple arguments that control the prediction operation. These arguments can be passed directly to `model.predict`:
Use `stream=True` for processing long videos or large datasets to efficiently manage memory. When `stream=False`, the results for all frames or data points are stored in memory, which can quickly add up and cause out-of-memory errors for large inputs. In contrast, `stream=True` utilizes a generator, which only keeps the results of the current frame or data point in memory, significantly reducing memory consumption and preventing out-of-memory issues.
| Source | Argument | Type | Notes |
|---------------|--------------------------------------------|-----------------|---------------------------------------------------------------------------------------------|
| image | `'image.jpg'` | `str` or `Path` | Single image file. |
| URL | `'https://ultralytics.com/images/bus.jpg'` | `str` | URL to an image. |
| screenshot | `'screen'` | `str` | Capture a screenshot. |
| PIL | `Image.open('im.jpg')` | `PIL.Image` | HWC format with RGB channels. |
| OpenCV | `cv2.imread('im.jpg')` | `np.ndarray` | HWC format with BGR channels `uint8 (0-255)`. |
| numpy | `np.zeros((640,1280,3))` | `np.ndarray` | HWC format with BGR channels `uint8 (0-255)`. |
| torch | `torch.zeros(16,3,320,640)` | `torch.Tensor` | BCHW format with RGB channels `float32 (0.0-1.0)`. |
| CSV | `'sources.csv'` | `str` or `Path` | CSV file containing paths to images, videos, or directories. |
| video ✅ | `'video.mp4'` | `str` or `Path` | Video file in formats like MP4, AVI, etc. |
| directory ✅ | `'path/'` | `str` or `Path` | Path to a directory containing images or videos. |
| glob ✅ | `'path/*.jpg'` | `str` | Glob pattern to match multiple files. Use the `*` character as a wildcard. |
| YouTube ✅ | `'https://youtu.be/Zgi9g1ksQHc'` | `str` | URL to a YouTube video. |
| stream ✅ | `'rtsp://example.com/media.mp4'` | `str` | URL for streaming protocols such as RTSP, RTMP, or an IP address. |
| multi-stream ✅ | `'list.streams'` | `str` or `Path` | `*.streams` text file with one stream URL per row, i.e. 8 streams will run at batch-size 8. |
Below are code examples for using each source type:
!!! example "Prediction sources"
=== "image"
Run inference on an image file.
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Define path to the image file
source = 'path/to/image.jpg'
# Run inference on the source
results = model(source) # list of Results objects
```
=== "screenshot"
Run inference on the current screen content as a screenshot.
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Define current screenshot as source
source = 'screen'
# Run inference on the source
results = model(source) # list of Results objects
```
=== "URL"
Run inference on an image or video hosted remotely via URL.
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Define remote image or video URL
source = 'https://ultralytics.com/images/bus.jpg'
# Run inference on the source
results = model(source) # list of Results objects
```
=== "PIL"
Run inference on an image opened with Python Imaging Library (PIL).
```python
from PIL import Image
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Open an image using PIL
source = Image.open('path/to/image.jpg')
# Run inference on the source
results = model(source) # list of Results objects
```
=== "OpenCV"
Run inference on an image read with OpenCV.
```python
import cv2
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Read an image using OpenCV
source = cv2.imread('path/to/image.jpg')
# Run inference on the source
results = model(source) # list of Results objects
```
=== "numpy"
Run inference on an image represented as a numpy array.
```python
import numpy as np
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Create a random numpy array of HWC shape (640, 640, 3) with values in range [0, 255] and type uint8
source = np.random.randint(low=0, high=255, size=(640, 640, 3), dtype='uint8')
# Run inference on the source
results = model(source) # list of Results objects
```
=== "torch"
Run inference on an image represented as a PyTorch tensor.
```python
import torch
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Create a random torch tensor of BCHW shape (1, 3, 640, 640) with values in range [0, 1] and type float32
source = torch.rand(1, 3, 640, 640, dtype=torch.float32)
# Run inference on the source
results = model(source) # list of Results objects
```
=== "CSV"
Run inference on a collection of images, URLs, videos and directories listed in a CSV file.
```python
import torch
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Define a path to a CSV file with images, URLs, videos and directories
source = 'path/to/file.csv'
# Run inference on the source
results = model(source) # list of Results objects
```
=== "video"
Run inference on a video file. By using `stream=True`, you can create a generator of Results objects to reduce memory usage.
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Define path to video file
source = 'path/to/video.mp4'
# Run inference on the source
results = model(source, stream=True) # generator of Results objects
```
=== "directory"
Run inference on all images and videos in a directory. To also capture images and videos in subdirectories use a glob pattern, i.e. `path/to/dir/**/*`.
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Define path to directory containing images and videos for inference
source = 'path/to/dir'
# Run inference on the source
results = model(source, stream=True) # generator of Results objects
```
=== "glob"
Run inference on all images and videos that match a glob expression with `*` characters.
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Define a glob search for all JPG files in a directory
source = 'path/to/dir/*.jpg'
# OR define a recursive glob search for all JPG files including subdirectories
source = 'path/to/dir/**/*.jpg'
# Run inference on the source
results = model(source, stream=True) # generator of Results objects
```
=== "YouTube"
Run inference on a YouTube video. By using `stream=True`, you can create a generator of Results objects to reduce memory usage for long videos.
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Define source as YouTube video URL
source = 'https://youtu.be/Zgi9g1ksQHc'
# Run inference on the source
results = model(source, stream=True) # generator of Results objects
```
=== "Streams"
Run inference on remote streaming sources using RTSP, RTMP, and IP address protocols. If mutliple streams are provided in a `*.streams` text file then batched inference will run, i.e. 8 streams will run at batch-size 8, otherwise single streams will run at batch-size 1.
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Single stream with batch-size 1 inference
source = 'rtsp://example.com/media.mp4' # RTSP, RTMP or IP streaming address
# Multiple streams with batched inference (i.e. batch-size 8 for 8 streams)
source = 'path/to/list.streams' # *.streams text file with one streaming address per row
# Run inference on the source
results = model(source, stream=True) # generator of Results objects
```
## Inference Arguments
`model.predict()` accepts multiple arguments that can be passed at inference time to override defaults:
!!! example
```
model.predict(source, save=True, imgsz=320, conf=0.5)
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Run inference on 'bus.jpg' with arguments
model.predict('bus.jpg', save=True, imgsz=320, conf=0.5)
```
All supported arguments:
| Key | Value | Description |
|------------------|------------------------|----------------------------------------------------------|
| `source` | `'ultralytics/assets'` | source directory for images or videos |
| `conf` | `0.25` | object confidence threshold for detection |
| `iou` | `0.7` | intersection over union (IoU) threshold for NMS |
| `half` | `False` | use half precision (FP16) |
| `device` | `None` | device to run on, i.e. cuda device=0/1/2/3 or device=cpu |
| `show` | `False` | show results if possible |
| `save` | `False` | save images with results |
| `save_txt` | `False` | save results as .txt file |
| `save_conf` | `False` | save results with confidence scores |
| `save_crop` | `False` | save cropped images with results |
| `hide_labels` | `False` | hide labels |
| `hide_conf` | `False` | hide confidence scores |
| `max_det` | `300` | maximum number of detections per image |
| `vid_stride` | `False` | video frame-rate stride |
| `line_thickness` | `3` | bounding box thickness (pixels) |
| `visualize` | `False` | visualize model features |
| `augment` | `False` | apply image augmentation to prediction sources |
| `agnostic_nms` | `False` | class-agnostic NMS |
| `retina_masks` | `False` | use high-resolution segmentation masks |
| `classes` | `None` | filter results by class, i.e. class=0, or class=[0,2,3] |
| `boxes` | `True` | Show boxes in segmentation predictions |
| Name | Type | Default | Description |
|----------------|----------------|------------------------|--------------------------------------------------------------------------------|
| `source` | `str` | `'ultralytics/assets'` | source directory for images or videos |
| `conf` | `float` | `0.25` | object confidence threshold for detection |
| `iou` | `float` | `0.7` | intersection over union (IoU) threshold for NMS |
| `imgsz` | `int or tuple` | `640` | image size as scalar or (h, w) list, i.e. (640, 480) |
| `half` | `bool` | `False` | use half precision (FP16) |
| `device` | `None or str` | `None` | device to run on, i.e. cuda device=0/1/2/3 or device=cpu |
| `show` | `bool` | `False` | show results if possible |
| `save` | `bool` | `False` | save images with results |
| `save_txt` | `bool` | `False` | save results as .txt file |
| `save_conf` | `bool` | `False` | save results with confidence scores |
| `save_crop` | `bool` | `False` | save cropped images with results |
| `hide_labels` | `bool` | `False` | hide labels |
| `hide_conf` | `bool` | `False` | hide confidence scores |
| `max_det` | `int` | `300` | maximum number of detections per image |
| `vid_stride` | `bool` | `False` | video frame-rate stride |
| `line_width` | `None or int` | `None` | The line width of the bounding boxes. If None, it is scaled to the image size. |
| `visualize` | `bool` | `False` | visualize model features |
| `augment` | `bool` | `False` | apply image augmentation to prediction sources |
| `agnostic_nms` | `bool` | `False` | class-agnostic NMS |
| `retina_masks` | `bool` | `False` | use high-resolution segmentation masks |
| `classes` | `None or list` | `None` | filter results by class, i.e. classes=0, or classes=[0,2,3] |
| `boxes` | `bool` | `True` | Show boxes in segmentation predictions |
## Image and Video Formats
YOLOv8 supports various image and video formats, as specified
in [yolo/data/utils.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/yolo/data/utils.py). See the
tables below for the valid suffixes and example predict commands.
YOLOv8 supports various image and video formats, as specified in [data/utils.py](https://github.com/ultralytics/ultralytics/blob/main/ultralytics/data/utils.py). See the tables below for the valid suffixes and example predict commands.
### Image Suffixes
### Images
The below table contains valid Ultralytics image formats.
| Image Suffixes | Example Predict Command | Reference |
|----------------|----------------------------------|-------------------------------------------------------------------------------|
@ -113,7 +345,9 @@ tables below for the valid suffixes and example predict commands.
| .webp | `yolo predict source=image.webp` | [WebP](https://en.wikipedia.org/wiki/WebP) |
| .pfm | `yolo predict source=image.pfm` | [Portable FloatMap](https://en.wikipedia.org/wiki/Netpbm#File_formats) |
### Video Suffixes
### Videos
The below table contains valid Ultralytics video formats.
| Video Suffixes | Example Predict Command | Reference |
|----------------|----------------------------------|----------------------------------------------------------------------------------|
@ -132,149 +366,280 @@ tables below for the valid suffixes and example predict commands.
## Working with Results
The `Results` object contains the following components:
- `Results.boxes`: `Boxes` object with properties and methods for manipulating bounding boxes
- `Results.masks`: `Masks` object for indexing masks or getting segment coordinates
- `Results.probs`: `torch.Tensor` containing class probabilities or logits
- `Results.orig_img`: Original image loaded in memory
- `Results.path`: `Path` containing the path to the input image
Each result is composed of a `torch.Tensor` by default, which allows for easy manipulation:
All Ultralytics `predict()` calls will return a list of `Results` objects:
!!! example "Results"
```python
results = results.cuda()
results = results.cpu()
results = results.to('cpu')
results = results.numpy()
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Run inference on an image
results = model('bus.jpg') # list of 1 Results object
results = model(['bus.jpg', 'zidane.jpg']) # list of 2 Results objects
```
### Boxes
`Results` objects have the following attributes:
| Attribute | Type | Description |
|--------------|-----------------------|------------------------------------------------------------------------------------------|
| `orig_img` | `numpy.ndarray` | The original image as a numpy array. |
| `orig_shape` | `tuple` | The original image shape in (height, width) format. |
| `boxes` | `Boxes, optional` | A Boxes object containing the detection bounding boxes. |
| `masks` | `Masks, optional` | A Masks object containing the detection masks. |
| `probs` | `Probs, optional` | A Probs object containing probabilities of each class for classification task. |
| `keypoints` | `Keypoints, optional` | A Keypoints object containing detected keypoints for each object. |
| `speed` | `dict` | A dictionary of preprocess, inference, and postprocess speeds in milliseconds per image. |
| `names` | `dict` | A dictionary of class names. |
| `path` | `str` | The path to the image file. |
`Results` objects have the following methods:
| Method | Return Type | Description |
|-----------------|-----------------|-------------------------------------------------------------------------------------|
| `__getitem__()` | `Results` | Return a Results object for the specified index. |
| `__len__()` | `int` | Return the number of detections in the Results object. |
| `update()` | `None` | Update the boxes, masks, and probs attributes of the Results object. |
| `cpu()` | `Results` | Return a copy of the Results object with all tensors on CPU memory. |
| `numpy()` | `Results` | Return a copy of the Results object with all tensors as numpy arrays. |
| `cuda()` | `Results` | Return a copy of the Results object with all tensors on GPU memory. |
| `to()` | `Results` | Return a copy of the Results object with tensors on the specified device and dtype. |
| `new()` | `Results` | Return a new Results object with the same image, path, and names. |
| `keys()` | `List[str]` | Return a list of non-empty attribute names. |
| `plot()` | `numpy.ndarray` | Plots the detection results. Returns a numpy array of the annotated image. |
| `verbose()` | `str` | Return log string for each task. |
| `save_txt()` | `None` | Save predictions into a txt file. |
| `save_crop()` | `None` | Save cropped predictions to `save_dir/cls/file_name.jpg`. |
| `tojson()` | `None` | Convert the object to JSON format. |
For more details see the `Results` class [documentation](../reference/engine/results.md).
`Boxes` object can be used to index, manipulate, and convert bounding boxes to different formats. Box format conversion
operations are cached, meaning they're only calculated once per object, and those values are reused for future calls.
### Boxes
- Indexing a `Boxes` object returns a `Boxes` object:
`Boxes` object can be used to index, manipulate, and convert bounding boxes to different formats.
!!! example "Boxes"
```python
results = model(img)
boxes = results[0].boxes
box = boxes[0] # returns one box
box.xyxy
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Run inference on an image
results = model('bus.jpg') # results list
# View results
for r in results:
print(r.boxes) # print the Boxes object containing the detection bounding boxes
```
- Properties and conversions
Here is a table for the `Boxes` class methods and properties, including their name, type, and description:
!!! example "Boxes Properties"
| Name | Type | Description |
|-----------|---------------------------|--------------------------------------------------------------------|
| `cpu()` | Method | Move the object to CPU memory. |
| `numpy()` | Method | Convert the object to a numpy array. |
| `cuda()` | Method | Move the object to CUDA memory. |
| `to()` | Method | Move the object to the specified device. |
| `xyxy` | Property (`torch.Tensor`) | Return the boxes in xyxy format. |
| `conf` | Property (`torch.Tensor`) | Return the confidence values of the boxes. |
| `cls` | Property (`torch.Tensor`) | Return the class values of the boxes. |
| `id` | Property (`torch.Tensor`) | Return the track IDs of the boxes (if available). |
| `xywh` | Property (`torch.Tensor`) | Return the boxes in xywh format. |
| `xyxyn` | Property (`torch.Tensor`) | Return the boxes in xyxy format normalized by original image size. |
| `xywhn` | Property (`torch.Tensor`) | Return the boxes in xywh format normalized by original image size. |
```python
boxes.xyxy # box with xyxy format, (N, 4)
boxes.xywh # box with xywh format, (N, 4)
boxes.xyxyn # box with xyxy format but normalized, (N, 4)
boxes.xywhn # box with xywh format but normalized, (N, 4)
boxes.conf # confidence score, (N, 1)
boxes.cls # cls, (N, 1)
boxes.data # raw bboxes tensor, (N, 6) or boxes.boxes
```
For more details see the `Boxes` class [documentation](../reference/engine/results.md).
### Masks
`Masks` object can be used index, manipulate and convert masks to segments. The segment conversion operation is cached.
`Masks` object can be used index, manipulate and convert masks to segments.
!!! example "Masks"
```python
results = model(inputs)
masks = results[0].masks # Masks object
masks.xy # x, y segments (pixels), List[segment] * N
masks.xyn # x, y segments (normalized), List[segment] * N
masks.data # raw masks tensor, (N, H, W) or masks.masks
from ultralytics import YOLO
# Load a pretrained YOLOv8n-seg Segment model
model = YOLO('yolov8n-seg.pt')
# Run inference on an image
results = model('bus.jpg') # results list
# View results
for r in results:
print(r.masks) # print the Masks object containing the detected instance masks
```
Here is a table for the `Masks` class methods and properties, including their name, type, and description:
| Name | Type | Description |
|-----------|---------------------------|-----------------------------------------------------------------|
| `cpu()` | Method | Returns the masks tensor on CPU memory. |
| `numpy()` | Method | Returns the masks tensor as a numpy array. |
| `cuda()` | Method | Returns the masks tensor on GPU memory. |
| `to()` | Method | Returns the masks tensor with the specified device and dtype. |
| `xyn` | Property (`torch.Tensor`) | A list of normalized segments represented as tensors. |
| `xy` | Property (`torch.Tensor`) | A list of segments in pixel coordinates represented as tensors. |
For more details see the `Masks` class [documentation](../reference/engine/results.md).
### Keypoints
`Keypoints` object can be used index, manipulate and normalize coordinates.
!!! example "Keypoints"
```python
from ultralytics import YOLO
# Load a pretrained YOLOv8n-pose Pose model
model = YOLO('yolov8n-pose.pt')
# Run inference on an image
results = model('bus.jpg') # results list
# View results
for r in results:
print(r.keypoints) # print the Keypoints object containing the detected keypoints
```
### probs
Here is a table for the `Keypoints` class methods and properties, including their name, type, and description:
| Name | Type | Description |
|-----------|---------------------------|-------------------------------------------------------------------|
| `cpu()` | Method | Returns the keypoints tensor on CPU memory. |
| `numpy()` | Method | Returns the keypoints tensor as a numpy array. |
| `cuda()` | Method | Returns the keypoints tensor on GPU memory. |
| `to()` | Method | Returns the keypoints tensor with the specified device and dtype. |
| `xyn` | Property (`torch.Tensor`) | A list of normalized keypoints represented as tensors. |
| `xy` | Property (`torch.Tensor`) | A list of keypoints in pixel coordinates represented as tensors. |
| `conf` | Property (`torch.Tensor`) | Returns confidence values of keypoints if available, else None. |
For more details see the `Keypoints` class [documentation](../reference/engine/results.md).
`probs` attribute of `Results` class is a `Tensor` containing class probabilities of a classification operation.
### Probs
`Probs` object can be used index, get `top1` and `top5` indices and scores of classification.
!!! example "Probs"
```python
results = model(inputs)
results[0].probs # cls prob, (num_class, )
from ultralytics import YOLO
# Load a pretrained YOLOv8n-cls Classify model
model = YOLO('yolov8n-cls.pt')
# Run inference on an image
results = model('bus.jpg') # results list
# View results
for r in results:
print(r.probs) # print the Probs object containing the detected class probabilities
```
Class reference documentation for `Results` module and its components can be found [here](../reference/yolo/engine/results.md)
Here's a table summarizing the methods and properties for the `Probs` class:
| Name | Type | Description |
|------------|---------------------------|-------------------------------------------------------------------------|
| `cpu()` | Method | Returns a copy of the probs tensor on CPU memory. |
| `numpy()` | Method | Returns a copy of the probs tensor as a numpy array. |
| `cuda()` | Method | Returns a copy of the probs tensor on GPU memory. |
| `to()` | Method | Returns a copy of the probs tensor with the specified device and dtype. |
| `top1` | Property (`int`) | Index of the top 1 class. |
| `top5` | Property (`list[int]`) | Indices of the top 5 classes. |
| `top1conf` | Property (`torch.Tensor`) | Confidence of the top 1 class. |
| `top5conf` | Property (`torch.Tensor`) | Confidences of the top 5 classes. |
For more details see the `Probs` class [documentation](../reference/engine/results.md).
## Plotting results
## Plotting Results
You can use `plot()` function of `Result` object to plot results on in image object. It plots all components(boxes,
masks, classification logits, etc.) found in the results object
You can use the `plot()` method of a `Result` objects to visualize predictions. It plots all prediction types (boxes, masks, keypoints, probabilities, etc.) contained in the `Results` object onto a numpy array that can then be shown or saved.
!!! example "Plotting"
```python
res = model(img)
res_plotted = res[0].plot()
cv2.imshow("result", res_plotted)
```
| Argument | Description |
|--------------------------------|----------------------------------------------------------------------------------------|
| `conf (bool)` | Whether to plot the detection confidence score. |
| `line_width (float, optional)` | The line width of the bounding boxes. If None, it is scaled to the image size. |
| `font_size (float, optional)` | The font size of the text. If None, it is scaled to the image size. |
| `font (str)` | The font to use for the text. |
| `pil (bool)` | Whether to use PIL for image plotting. |
| `example (str)` | An example string to display. Useful for indicating the expected format of the output. |
| `img (numpy.ndarray)` | Plot to another image. if not, plot to original image. |
| `labels (bool)` | Whether to plot the label of bounding boxes. |
| `boxes (bool)` | Whether to plot the bounding boxes. |
| `masks (bool)` | Whether to plot the masks. |
| `probs (bool)` | Whether to plot classification probability. |
from PIL import Image
from ultralytics import YOLO
# Load a pretrained YOLOv8n model
model = YOLO('yolov8n.pt')
# Run inference on 'bus.jpg'
results = model('bus.jpg') # results list
# Show the results
for r in results:
im_array = r.plot() # plot a BGR numpy array of predictions
im = Image.fromarray(im_array[..., ::-1]) # RGB PIL image
im.show() # show image
im.save('results.jpg') # save image
```
The `plot()` method supports the following arguments:
| Argument | Type | Description | Default |
|--------------|-----------------|--------------------------------------------------------------------------------|---------------|
| `conf` | `bool` | Whether to plot the detection confidence score. | `True` |
| `line_width` | `float` | The line width of the bounding boxes. If None, it is scaled to the image size. | `None` |
| `font_size` | `float` | The font size of the text. If None, it is scaled to the image size. | `None` |
| `font` | `str` | The font to use for the text. | `'Arial.ttf'` |
| `pil` | `bool` | Whether to return the image as a PIL Image. | `False` |
| `img` | `numpy.ndarray` | Plot to another image. if not, plot to original image. | `None` |
| `im_gpu` | `torch.Tensor` | Normalized image in gpu with shape (1, 3, 640, 640), for faster mask plotting. | `None` |
| `kpt_radius` | `int` | Radius of the drawn keypoints. Default is 5. | `5` |
| `kpt_line` | `bool` | Whether to draw lines connecting keypoints. | `True` |
| `labels` | `bool` | Whether to plot the label of bounding boxes. | `True` |
| `boxes` | `bool` | Whether to plot the bounding boxes. | `True` |
| `masks` | `bool` | Whether to plot the masks. | `True` |
| `probs` | `bool` | Whether to plot classification probability | `True` |
## Streaming Source `for`-loop
Here's a Python script using OpenCV (cv2) and YOLOv8 to run inference on video frames. This script assumes you have already installed the necessary packages (opencv-python and ultralytics).
Here's a Python script using OpenCV (`cv2`) and YOLOv8 to run inference on video frames. This script assumes you have already installed the necessary packages (`opencv-python` and `ultralytics`).
!!! example "Streaming for-loop"
```python
import cv2
from ultralytics import YOLO
# Load the YOLOv8 model
model = YOLO('yolov8n.pt')
# Open the video file
video_path = "path/to/your/video/file.mp4"
cap = cv2.VideoCapture(video_path)
# Loop through the video frames
while cap.isOpened():
# Read a frame from the video
success, frame = cap.read()
if success:
# Run YOLOv8 inference on the frame
results = model(frame)
# Visualize the results on the frame
annotated_frame = results[0].plot()
# Display the annotated frame
cv2.imshow("YOLOv8 Inference", annotated_frame)
# Break the loop if 'q' is pressed
if cv2.waitKey(1) & 0xFF == ord("q"):
break
else:
# Break the loop if the end of the video is reached
break
# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()
```
This script will run predictions on each frame of the video, visualize the results, and display them in a window. The loop can be exited by pressing 'q'.

@ -1,100 +1,286 @@
---
comments: true
description: Learn how to use Ultralytics YOLO for object tracking in video streams. Guides to use different trackers and customise tracker configurations.
keywords: Ultralytics, YOLO, object tracking, video streams, BoT-SORT, ByteTrack, Python guide, CLI guide
---
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png">
<img width="1024" src="https://user-images.githubusercontent.com/26833433/243418637-1d6250fd-1515-4c10-a844-a32818ae6d46.png">
Object tracking is a task that involves identifying the location and class of objects, then assigning a unique ID to
that detection in video streams.
Object tracking is a task that involves identifying the location and class of objects, then assigning a unique ID to that detection in video streams.
The output of tracker is the same as detection with an added object ID.
## Available Trackers
The following tracking algorithms have been implemented and can be enabled by passing `tracker=tracker_type.yaml`
Ultralytics YOLO supports the following tracking algorithms. They can be enabled by passing the relevant YAML configuration file such as `tracker=tracker_type.yaml`:
* [BoT-SORT](https://github.com/NirAharon/BoT-SORT) - `botsort.yaml`
* [ByteTrack](https://github.com/ifzhang/ByteTrack) - `bytetrack.yaml`
* [BoT-SORT](https://github.com/NirAharon/BoT-SORT) - Use `botsort.yaml` to enable this tracker.
* [ByteTrack](https://github.com/ifzhang/ByteTrack) - Use `bytetrack.yaml` to enable this tracker.
The default tracker is BoT-SORT.
## Tracking
Use a trained YOLOv8n/YOLOv8n-seg model to run tracker on video streams.
To run the tracker on video streams, use a trained Detect, Segment or Pose model such as YOLOv8n, YOLOv8n-seg and YOLOv8n-pose.
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load an official detection model
model = YOLO('yolov8n-seg.pt') # load an official segmentation model
model = YOLO('path/to/best.pt') # load a custom model
# Track with the model
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", show=True)
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", show=True, tracker="bytetrack.yaml")
# Load an official or custom model
model = YOLO('yolov8n.pt') # Load an official Detect model
model = YOLO('yolov8n-seg.pt') # Load an official Segment model
model = YOLO('yolov8n-pose.pt') # Load an official Pose model
model = YOLO('path/to/best.pt') # Load a custom trained model
# Perform tracking with the model
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", show=True) # Tracking with default tracker
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", show=True, tracker="bytetrack.yaml") # Tracking with ByteTrack tracker
```
=== "CLI"
```bash
yolo track model=yolov8n.pt source="https://youtu.be/Zgi9g1ksQHc" # official detection model
yolo track model=yolov8n-seg.pt source=... # official segmentation model
yolo track model=path/to/best.pt source=... # custom model
yolo track model=path/to/best.pt tracker="bytetrack.yaml" # bytetrack tracker
# Perform tracking with various models using the command line interface
yolo track model=yolov8n.pt source="https://youtu.be/Zgi9g1ksQHc" # Official Detect model
yolo track model=yolov8n-seg.pt source="https://youtu.be/Zgi9g1ksQHc" # Official Segment model
yolo track model=yolov8n-pose.pt source="https://youtu.be/Zgi9g1ksQHc" # Official Pose model
yolo track model=path/to/best.pt source="https://youtu.be/Zgi9g1ksQHc" # Custom trained model
# Track using ByteTrack tracker
yolo track model=path/to/best.pt tracker="bytetrack.yaml"
```
As in the above usage, we support both the detection and segmentation models for tracking and the only thing you need to
do is loading the corresponding (detection or segmentation) model.
As can be seen in the above usage, tracking is available for all Detect, Segment and Pose models run on videos or streaming sources.
## Configuration
### Tracking
### Tracking Arguments
Tracking configuration shares properties with Predict mode, such as `conf`, `iou`, and `show`. For further configurations, refer to the [Predict](https://docs.ultralytics.com/modes/predict/) model page.
Tracking shares the configuration with predict, i.e `conf`, `iou`, `show`. More configurations please refer
to [predict page](https://docs.ultralytics.com/modes/predict/).
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Configure the tracking parameters and run the tracker
model = YOLO('yolov8n.pt')
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", conf=0.3, iou=0.5, show=True)
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", conf=0.3, iou=0.5, show=True)
```
=== "CLI"
```bash
# Configure tracking parameters and run the tracker using the command line interface
yolo track model=yolov8n.pt source="https://youtu.be/Zgi9g1ksQHc" conf=0.3, iou=0.5 show
```
### Tracker
### Tracker Selection
Ultralytics also allows you to use a modified tracker configuration file. To do this, simply make a copy of a tracker config file (for example, `custom_tracker.yaml`) from [ultralytics/cfg/trackers](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/trackers) and modify any configurations (except the `tracker_type`) as per your needs.
We also support using a modified tracker config file, just copy a config file i.e `custom_tracker.yaml`
from [ultralytics/tracker/cfg](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/tracker/cfg) and modify
any configurations(expect the `tracker_type`) you need to.
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load the model and run the tracker with a custom configuration file
model = YOLO('yolov8n.pt')
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", tracker='custom_tracker.yaml')
results = model.track(source="https://youtu.be/Zgi9g1ksQHc", tracker='custom_tracker.yaml')
```
=== "CLI"
```bash
# Load the model and run the tracker with a custom configuration file using the command line interface
yolo track model=yolov8n.pt source="https://youtu.be/Zgi9g1ksQHc" tracker='custom_tracker.yaml'
```
Please refer to [ultralytics/tracker/cfg](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/tracker/cfg)
page
For a comprehensive list of tracking arguments, refer to the [ultralytics/cfg/trackers](https://github.com/ultralytics/ultralytics/tree/main/ultralytics/cfg/trackers) page.
## Python Examples
### Persisting Tracks Loop
Here is a Python script using OpenCV (`cv2`) and YOLOv8 to run object tracking on video frames. This script still assumes you have already installed the necessary packages (`opencv-python` and `ultralytics`).
!!! example "Streaming for-loop with tracking"
```python
import cv2
from ultralytics import YOLO
# Load the YOLOv8 model
model = YOLO('yolov8n.pt')
# Open the video file
video_path = "path/to/video.mp4"
cap = cv2.VideoCapture(video_path)
# Loop through the video frames
while cap.isOpened():
# Read a frame from the video
success, frame = cap.read()
if success:
# Run YOLOv8 tracking on the frame, persisting tracks between frames
results = model.track(frame, persist=True)
# Visualize the results on the frame
annotated_frame = results[0].plot()
# Display the annotated frame
cv2.imshow("YOLOv8 Tracking", annotated_frame)
# Break the loop if 'q' is pressed
if cv2.waitKey(1) & 0xFF == ord("q"):
break
else:
# Break the loop if the end of the video is reached
break
# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()
```
Please note the change from `model(frame)` to `model.track(frame)`, which enables object tracking instead of simple detection. This modified script will run the tracker on each frame of the video, visualize the results, and display them in a window. The loop can be exited by pressing 'q'.
### Plotting Tracks Over Time
Visualizing object tracks over consecutive frames can provide valuable insights into the movement patterns and behavior of detected objects within a video. With Ultralytics YOLOv8, plotting these tracks is a seamless and efficient process.
In the following example, we demonstrate how to utilize YOLOv8's tracking capabilities to plot the movement of detected objects across multiple video frames. This script involves opening a video file, reading it frame by frame, and utilizing the YOLO model to identify and track various objects. By retaining the center points of the detected bounding boxes and connecting them, we can draw lines that represent the paths followed by the tracked objects.
!!! example "Plotting tracks over multiple video frames"
```python
from collections import defaultdict
import cv2
import numpy as np
from ultralytics import YOLO
# Load the YOLOv8 model
model = YOLO('yolov8n.pt')
# Open the video file
video_path = "path/to/video.mp4"
cap = cv2.VideoCapture(video_path)
# Store the track history
track_history = defaultdict(lambda: [])
# Loop through the video frames
while cap.isOpened():
# Read a frame from the video
success, frame = cap.read()
if success:
# Run YOLOv8 tracking on the frame, persisting tracks between frames
results = model.track(frame, persist=True)
# Get the boxes and track IDs
boxes = results[0].boxes.xywh.cpu()
track_ids = results[0].boxes.id.int().cpu().tolist()
# Visualize the results on the frame
annotated_frame = results[0].plot()
# Plot the tracks
for box, track_id in zip(boxes, track_ids):
x, y, w, h = box
track = track_history[track_id]
track.append((float(x), float(y))) # x, y center point
if len(track) > 30: # retain 90 tracks for 90 frames
track.pop(0)
# Draw the tracking lines
points = np.hstack(track).astype(np.int32).reshape((-1, 1, 2))
cv2.polylines(annotated_frame, [points], isClosed=False, color=(230, 230, 230), thickness=10)
# Display the annotated frame
cv2.imshow("YOLOv8 Tracking", annotated_frame)
# Break the loop if 'q' is pressed
if cv2.waitKey(1) & 0xFF == ord("q"):
break
else:
# Break the loop if the end of the video is reached
break
# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()
```
### Multithreaded Tracking
Multithreaded tracking provides the capability to run object tracking on multiple video streams simultaneously. This is particularly useful when handling multiple video inputs, such as from multiple surveillance cameras, where concurrent processing can greatly enhance efficiency and performance.
In the provided Python script, we make use of Python's `threading` module to run multiple instances of the tracker concurrently. Each thread is responsible for running the tracker on one video file, and all the threads run simultaneously in the background.
To ensure that each thread receives the correct parameters (the video file and the model to use), we define a function `run_tracker_in_thread` that accepts these parameters and contains the main tracking loop. This function reads the video frame by frame, runs the tracker, and displays the results.
Two different models are used in this example: `yolov8n.pt` and `yolov8n-seg.pt`, each tracking objects in a different video file. The video files are specified in `video_file1` and `video_file2`.
The `daemon=True` parameter in `threading.Thread` means that these threads will be closed as soon as the main program finishes. We then start the threads with `start()` and use `join()` to make the main thread wait until both tracker threads have finished.
Finally, after all threads have completed their task, the windows displaying the results are closed using `cv2.destroyAllWindows()`.
!!! example "Streaming for-loop with tracking"
```python
import threading
import cv2
from ultralytics import YOLO
def run_tracker_in_thread(filename, model):
video = cv2.VideoCapture(filename)
frames = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
for _ in range(frames):
ret, frame = video.read()
if ret:
results = model.track(source=frame, persist=True)
res_plotted = results[0].plot()
cv2.imshow('p', res_plotted)
if cv2.waitKey(1) == ord('q'):
break
# Load the models
model1 = YOLO('yolov8n.pt')
model2 = YOLO('yolov8n-seg.pt')
# Define the video files for the trackers
video_file1 = 'path/to/video1.mp4'
video_file2 = 'path/to/video2.mp4'
# Create the tracker threads
tracker_thread1 = threading.Thread(target=run_tracker_in_thread, args=(video_file1, model1), daemon=True)
tracker_thread2 = threading.Thread(target=run_tracker_in_thread, args=(video_file2, model2), daemon=True)
# Start the tracker threads
tracker_thread1.start()
tracker_thread2.start()
# Wait for the tracker threads to finish
tracker_thread1.join()
tracker_thread2.join()
# Clean up and close windows
cv2.destroyAllWindows()
```
This example can easily be extended to handle more video files and models by creating more threads and applying the same methodology.

@ -1,16 +1,12 @@
---
comments: true
---
---
comments: true
description: Step-by-step guide to train YOLOv8 models with Ultralytics YOLO with examples of single-GPU and multi-GPU training. Efficient way for object detection training.
keywords: Ultralytics, YOLOv8, YOLO, object detection, train mode, custom dataset, GPU training, multi-GPU, hyperparameters, CLI examples, Python examples
---
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png">
**Train mode** is used for training a YOLOv8 model on a custom dataset. In this mode, the model is trained using the
specified dataset and hyperparameters. The training process involves optimizing the model's parameters so that it can
accurately predict the classes and locations of objects in an image.
**Train mode** is used for training a YOLOv8 model on a custom dataset. In this mode, the model is trained using the specified dataset and hyperparameters. The training process involves optimizing the model's parameters so that it can accurately predict the classes and locations of objects in an image.
!!! tip "Tip"
@ -18,26 +14,28 @@ accurately predict the classes and locations of objects in an image.
## Usage Examples
Train YOLOv8n on the COCO128 dataset for 100 epochs at image size 640. See Arguments section below for a full list of
training arguments.
Train YOLOv8n on the COCO128 dataset for 100 epochs at image size 640. See Arguments section below for a full list of training arguments.
!!! example ""
!!! example "Single-GPU and CPU Training Example"
Device is determined automatically. If a GPU is available then it will be used, otherwise training will start on CPU.
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.yaml') # build a new model from YAML
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
model = YOLO('yolov8n.yaml').load('yolov8n.pt') # build from YAML and transfer weights
# Train the model
model.train(data='coco128.yaml', epochs=100, imgsz=640)
results = model.train(data='coco128.yaml', epochs=100, imgsz=640)
```
=== "CLI"
```bash
# Build a new model from YAML and start training from scratch
yolo detect train data=coco128.yaml model=yolov8n.yaml epochs=100 imgsz=640
@ -49,57 +47,213 @@ training arguments.
yolo detect train data=coco128.yaml model=yolov8n.yaml pretrained=yolov8n.pt epochs=100 imgsz=640
```
### Multi-GPU Training
The training device can be specified using the `device` argument. If no argument is passed GPU `device=0` will be used if available, otherwise `device=cpu` will be used.
!!! example "Multi-GPU Training Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model with 2 GPUs
results = model.train(data='coco128.yaml', epochs=100, imgsz=640, device=[0, 1])
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model using GPUs 0 and 1
yolo detect train data=coco128.yaml model=yolov8n.pt epochs=100 imgsz=640 device=0,1
```
### Apple M1 and M2 MPS Training
With the support for Apple M1 and M2 chips integrated in the Ultralytics YOLO models, it's now possible to train your models on devices utilizing the powerful Metal Performance Shaders (MPS) framework. The MPS offers a high-performance way of executing computation and image processing tasks on Apple's custom silicon.
To enable training on Apple M1 and M2 chips, you should specify 'mps' as your device when initiating the training process. Below is an example of how you could do this in Python and via the command line:
!!! example "MPS Training Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load a pretrained model (recommended for training)
# Train the model with 2 GPUs
results = model.train(data='coco128.yaml', epochs=100, imgsz=640, device='mps')
```
=== "CLI"
```bash
# Start training from a pretrained *.pt model using GPUs 0 and 1
yolo detect train data=coco128.yaml model=yolov8n.pt epochs=100 imgsz=640 device=mps
```
While leveraging the computational power of the M1/M2 chips, this enables more efficient processing of the training tasks. For more detailed guidance and advanced configuration options, please refer to the [PyTorch MPS documentation](https://pytorch.org/docs/stable/notes/mps.html).
### Resuming Interrupted Trainings
Resuming training from a previously saved state is a crucial feature when working with deep learning models. This can come in handy in various scenarios, like when the training process has been unexpectedly interrupted, or when you wish to continue training a model with new data or for more epochs.
When training is resumed, Ultralytics YOLO loads the weights from the last saved model and also restores the optimizer state, learning rate scheduler, and the epoch number. This allows you to continue the training process seamlessly from where it was left off.
You can easily resume training in Ultralytics YOLO by setting the `resume` argument to `True` when calling the `train` method, and specifying the path to the `.pt` file containing the partially trained model weights.
Below is an example of how to resume an interrupted training using Python and via the command line:
!!! example "Resume Training Example"
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('path/to/last.pt') # load a partially trained model
# Resume training
results = model.train(resume=True)
```
=== "CLI"
```bash
# Resume an interrupted training
yolo train resume model=path/to/last.pt
```
By setting `resume=True`, the `train` function will continue training from where it left off, using the state stored in the 'path/to/last.pt' file. If the `resume` argument is omitted or set to `False`, the `train` function will start a new training session.
Remember that checkpoints are saved at the end of every epoch by default, or at fixed interval using the `save_period` argument, so you must complete at least 1 epoch to resume a training run.
## Arguments
Training settings for YOLO models refer to the various hyperparameters and configurations used to train the model on a
dataset. These settings can affect the model's performance, speed, and accuracy. Some common YOLO training settings
include the batch size, learning rate, momentum, and weight decay. Other factors that may affect the training process
include the choice of optimizer, the choice of loss function, and the size and composition of the training dataset. It
is important to carefully tune and experiment with these settings to achieve the best possible performance for a given
task.
| Key | Value | Description |
|-------------------|----------|-----------------------------------------------------------------------------|
| `model` | `None` | path to model file, i.e. yolov8n.pt, yolov8n.yaml |
| `data` | `None` | path to data file, i.e. coco128.yaml |
| `epochs` | `100` | number of epochs to train for |
| `patience` | `50` | epochs to wait for no observable improvement for early stopping of training |
| `batch` | `16` | number of images per batch (-1 for AutoBatch) |
| `imgsz` | `640` | size of input images as integer or w,h |
| `save` | `True` | save train checkpoints and predict results |
| `save_period` | `-1` | Save checkpoint every x epochs (disabled if < 1) |
| `cache` | `False` | True/ram, disk or False. Use cache for data loading |
| `device` | `None` | device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu |
| `workers` | `8` | number of worker threads for data loading (per RANK if DDP) |
| `project` | `None` | project name |
| `name` | `None` | experiment name |
| `exist_ok` | `False` | whether to overwrite existing experiment |
| `pretrained` | `False` | whether to use a pretrained model |
| `optimizer` | `'SGD'` | optimizer to use, choices=['SGD', 'Adam', 'AdamW', 'RMSProp'] |
| `verbose` | `False` | whether to print verbose output |
| `seed` | `0` | random seed for reproducibility |
| `deterministic` | `True` | whether to enable deterministic mode |
| `single_cls` | `False` | train multi-class data as single-class |
| `rect` | `False` | rectangular training with each batch collated for minimum padding |
| `cos_lr` | `False` | use cosine learning rate scheduler |
| `close_mosaic` | `0` | (int) disable mosaic augmentation for final epochs |
| `resume` | `False` | resume training from last checkpoint |
| `amp` | `True` | Automatic Mixed Precision (AMP) training, choices=[True, False] |
| `lr0` | `0.01` | initial learning rate (i.e. SGD=1E-2, Adam=1E-3) |
| `lrf` | `0.01` | final learning rate (lr0 * lrf) |
| `momentum` | `0.937` | SGD momentum/Adam beta1 |
| `weight_decay` | `0.0005` | optimizer weight decay 5e-4 |
| `warmup_epochs` | `3.0` | warmup epochs (fractions ok) |
| `warmup_momentum` | `0.8` | warmup initial momentum |
| `warmup_bias_lr` | `0.1` | warmup initial bias lr |
| `box` | `7.5` | box loss gain |
| `cls` | `0.5` | cls loss gain (scale with pixels) |
| `dfl` | `1.5` | dfl loss gain |
| `pose` | `12.0` | pose loss gain (pose-only) |
| `kobj` | `2.0` | keypoint obj loss gain (pose-only) |
| `label_smoothing` | `0.0` | label smoothing (fraction) |
| `nbs` | `64` | nominal batch size |
| `overlap_mask` | `True` | masks should overlap during training (segment train only) |
| `mask_ratio` | `4` | mask downsample ratio (segment train only) |
| `dropout` | `0.0` | use dropout regularization (classify train only) |
| `val` | `True` | validate/test during training |
Training settings for YOLO models refer to the various hyperparameters and configurations used to train the model on a dataset. These settings can affect the model's performance, speed, and accuracy. Some common YOLO training settings include the batch size, learning rate, momentum, and weight decay. Other factors that may affect the training process include the choice of optimizer, the choice of loss function, and the size and composition of the training dataset. It is important to carefully tune and experiment with these settings to achieve the best possible performance for a given task.
| Key | Value | Description |
|-------------------|----------|------------------------------------------------------------------------------------------------|
| `model` | `None` | path to model file, i.e. yolov8n.pt, yolov8n.yaml |
| `data` | `None` | path to data file, i.e. coco128.yaml |
| `epochs` | `100` | number of epochs to train for |
| `patience` | `50` | epochs to wait for no observable improvement for early stopping of training |
| `batch` | `16` | number of images per batch (-1 for AutoBatch) |
| `imgsz` | `640` | size of input images as integer |
| `save` | `True` | save train checkpoints and predict results |
| `save_period` | `-1` | Save checkpoint every x epochs (disabled if < 1) |
| `cache` | `False` | True/ram, disk or False. Use cache for data loading |
| `device` | `None` | device to run on, i.e. cuda device=0 or device=0,1,2,3 or device=cpu |
| `workers` | `8` | number of worker threads for data loading (per RANK if DDP) |
| `project` | `None` | project name |
| `name` | `None` | experiment name |
| `exist_ok` | `False` | whether to overwrite existing experiment |
| `pretrained` | `False` | whether to use a pretrained model |
| `optimizer` | `'auto'` | optimizer to use, choices=[SGD, Adam, Adamax, AdamW, NAdam, RAdam, RMSProp, auto] |
| `verbose` | `False` | whether to print verbose output |
| `seed` | `0` | random seed for reproducibility |
| `deterministic` | `True` | whether to enable deterministic mode |
| `single_cls` | `False` | train multi-class data as single-class |
| `rect` | `False` | rectangular training with each batch collated for minimum padding |
| `cos_lr` | `False` | use cosine learning rate scheduler |
| `close_mosaic` | `10` | (int) disable mosaic augmentation for final epochs (0 to disable) |
| `resume` | `False` | resume training from last checkpoint |
| `amp` | `True` | Automatic Mixed Precision (AMP) training, choices=[True, False] |
| `fraction` | `1.0` | dataset fraction to train on (default is 1.0, all images in train set) |
| `profile` | `False` | profile ONNX and TensorRT speeds during training for loggers |
| `freeze` | `None` | (int or list, optional) freeze first n layers, or freeze list of layer indices during training |
| `lr0` | `0.01` | initial learning rate (i.e. SGD=1E-2, Adam=1E-3) |
| `lrf` | `0.01` | final learning rate (lr0 * lrf) |
| `momentum` | `0.937` | SGD momentum/Adam beta1 |
| `weight_decay` | `0.0005` | optimizer weight decay 5e-4 |
| `warmup_epochs` | `3.0` | warmup epochs (fractions ok) |
| `warmup_momentum` | `0.8` | warmup initial momentum |
| `warmup_bias_lr` | `0.1` | warmup initial bias lr |
| `box` | `7.5` | box loss gain |
| `cls` | `0.5` | cls loss gain (scale with pixels) |
| `dfl` | `1.5` | dfl loss gain |
| `pose` | `12.0` | pose loss gain (pose-only) |
| `kobj` | `2.0` | keypoint obj loss gain (pose-only) |
| `label_smoothing` | `0.0` | label smoothing (fraction) |
| `nbs` | `64` | nominal batch size |
| `overlap_mask` | `True` | masks should overlap during training (segment train only) |
| `mask_ratio` | `4` | mask downsample ratio (segment train only) |
| `dropout` | `0.0` | use dropout regularization (classify train only) |
| `val` | `True` | validate/test during training |
## Logging
In training a YOLOv8 model, you might find it valuable to keep track of the model's performance over time. This is where logging comes into play. Ultralytics' YOLO provides support for three types of loggers - Comet, ClearML, and TensorBoard.
To use a logger, select it from the dropdown menu in the code snippet above and run it. The chosen logger will be installed and initialized.
### Comet
[Comet](https://www.comet.ml/site/) is a platform that allows data scientists and developers to track, compare, explain and optimize experiments and models. It provides functionalities such as real-time metrics, code diffs, and hyperparameters tracking.
To use Comet:
!!! example ""
=== "Python"
```python
# pip install comet_ml
import comet_ml
comet_ml.init()
```
Remember to sign in to your Comet account on their website and get your API key. You will need to add this to your environment variables or your script to log your experiments.
### ClearML
[ClearML](https://www.clear.ml/) is an open-source platform that automates tracking of experiments and helps with efficient sharing of resources. It is designed to help teams manage, execute, and reproduce their ML work more efficiently.
To use ClearML:
!!! example ""
=== "Python"
```python
# pip install clearml
import clearml
clearml.browser_login()
```
After running this script, you will need to sign in to your ClearML account on the browser and authenticate your session.
### TensorBoard
[TensorBoard](https://www.tensorflow.org/tensorboard) is a visualization toolkit for TensorFlow. It allows you to visualize your TensorFlow graph, plot quantitative metrics about the execution of your graph, and show additional data like images that pass through it.
To use TensorBoard in [Google Colab](https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/examples/tutorial.ipynb):
!!! example ""
=== "CLI"
```bash
load_ext tensorboard
tensorboard --logdir ultralytics/runs # replace with 'runs' directory
```
To use TensorBoard locally run the below command and view results at http://localhost:6006/.
!!! example ""
=== "CLI"
```bash
tensorboard --logdir ultralytics/runs # replace with 'runs' directory
```
This will load TensorBoard and direct it to the directory where your training logs are saved.
After setting up your logger, you can then proceed with your model training. All training metrics will be automatically logged in your chosen platform, and you can access these logs to monitor your model's performance over time, compare different models, and identify areas for improvement.

@ -1,12 +1,12 @@
---
comments: true
description: 'Guide for Validating YOLOv8 Models: Learn how to evaluate the performance of your YOLO models using validation settings and metrics with Python and CLI examples.'
keywords: Ultralytics, YOLO Docs, YOLOv8, validation, model evaluation, hyperparameters, accuracy, metrics, Python, CLI
---
<img width="1024" src="https://github.com/ultralytics/assets/raw/main/yolov8/banner-integrations.png">
**Val mode** is used for validating a YOLOv8 model after it has been trained. In this mode, the model is evaluated on a
validation set to measure its accuracy and generalization performance. This mode can be used to tune the hyperparameters
of the model to improve its performance.
**Val mode** is used for validating a YOLOv8 model after it has been trained. In this mode, the model is evaluated on a validation set to measure its accuracy and generalization performance. This mode can be used to tune the hyperparameters of the model to improve its performance.
!!! tip "Tip"
@ -14,20 +14,19 @@ of the model to improve its performance.
## Usage Examples
Validate trained YOLOv8n model accuracy on the COCO128 dataset. No argument need to passed as the `model` retains it's
training `data` and arguments as model attributes. See Arguments section below for a full list of export arguments.
Validate trained YOLOv8n model accuracy on the COCO128 dataset. No argument need to passed as the `model` retains it's training `data` and arguments as model attributes. See Arguments section below for a full list of export arguments.
!!! example ""
=== "Python"
```python
from ultralytics import YOLO
# Load a model
model = YOLO('yolov8n.pt') # load an official model
model = YOLO('path/to/best.pt') # load a custom model
# Validate the model
metrics = model.val() # no arguments needed, dataset and settings remembered
metrics.box.map # map50-95
@ -36,7 +35,7 @@ training `data` and arguments as model attributes. See Arguments section below f
metrics.box.maps # a list contains map50-95 of each category
```
=== "CLI"
```bash
yolo detect val model=yolov8n.pt # val official model
yolo detect val model=path/to/best.pt # val custom model
@ -44,18 +43,12 @@ training `data` and arguments as model attributes. See Arguments section below f
## Arguments
Validation settings for YOLO models refer to the various hyperparameters and configurations used to
evaluate the model's performance on a validation dataset. These settings can affect the model's performance, speed, and
accuracy. Some common YOLO validation settings include the batch size, the frequency with which validation is performed
during training, and the metrics used to evaluate the model's performance. Other factors that may affect the validation
process include the size and composition of the validation dataset and the specific task the model is being used for. It
is important to carefully tune and experiment with these settings to ensure that the model is performing well on the
validation dataset and to detect and prevent overfitting.
Validation settings for YOLO models refer to the various hyperparameters and configurations used to evaluate the model's performance on a validation dataset. These settings can affect the model's performance, speed, and accuracy. Some common YOLO validation settings include the batch size, the frequency with which validation is performed during training, and the metrics used to evaluate the model's performance. Other factors that may affect the validation process include the size and composition of the validation dataset and the specific task the model is being used for. It is important to carefully tune and experiment with these settings to ensure that the model is performing well on the validation dataset and to detect and prevent overfitting.
| Key | Value | Description |
|---------------|---------|--------------------------------------------------------------------|
| `data` | `None` | path to data file, i.e. coco128.yaml |
| `imgsz` | `640` | image size as scalar or (h, w) list, i.e. (640, 480) |
| `imgsz` | `640` | size of input images as integer |
| `batch` | `16` | number of images per batch (-1 for AutoBatch) |
| `save_json` | `False` | save results to JSON file |
| `save_hybrid` | `False` | save hybrid version of labels (labels + additional predictions) |
@ -68,23 +61,4 @@ validation dataset and to detect and prevent overfitting.
| `plots` | `False` | show plots during training |
| `rect` | `False` | rectangular val with each batch collated for minimum padding |
| `split` | `val` | dataset split to use for validation, i.e. 'val', 'test' or 'train' |
## Export Formats
Available YOLOv8 export formats are in the table below. You can export to any format using the `format` argument,
i.e. `format='onnx'` or `format='engine'`.
| Format | `format` Argument | Model | Metadata |
|--------------------------------------------------------------------|-------------------|---------------------------|----------|
| [PyTorch](https://pytorch.org/) | - | `yolov8n.pt` | ✅ |
| [TorchScript](https://pytorch.org/docs/stable/jit.html) | `torchscript` | `yolov8n.torchscript` | ✅ |
| [ONNX](https://onnx.ai/) | `onnx` | `yolov8n.onnx` | ✅ |
| [OpenVINO](https://docs.openvino.ai/latest/index.html) | `openvino` | `yolov8n_openvino_model/` | ✅ |
| [TensorRT](https://developer.nvidia.com/tensorrt) | `engine` | `yolov8n.engine` | ✅ |
| [CoreML](https://github.com/apple/coremltools) | `coreml` | `yolov8n.mlmodel` | ✅ |
| [TF SavedModel](https://www.tensorflow.org/guide/saved_model) | `saved_model` | `yolov8n_saved_model/` | ✅ |
| [TF GraphDef](https://www.tensorflow.org/api_docs/python/tf/Graph) | `pb` | `yolov8n.pb` | ❌ |
| [TF Lite](https://www.tensorflow.org/lite) | `tflite` | `yolov8n.tflite` | ✅ |
| [TF Edge TPU](https://coral.ai/docs/edgetpu/models-intro/) | `edgetpu` | `yolov8n_edgetpu.tflite` | ✅ |
| [TF.js](https://www.tensorflow.org/js) | `tfjs` | `yolov8n_web_model/` | ✅ |
| [PaddlePaddle](https://github.com/PaddlePaddle) | `paddle` | `yolov8n_paddle_model/` | ✅ |
|

@ -11,7 +11,7 @@
data-strict="0"
data-reactions-enabled="1"
data-emit-metadata="0"
data-input-position="bottom"
data-input-position="top"
data-theme="preferred_color_scheme"
data-lang="en"
crossorigin="anonymous"

@ -0,0 +1,26 @@
{% import "partials/language.html" as lang with context %}
<!-- taken from
https://github.com/squidfunk/mkdocs-material/blob/master/src/partials/source-file.html -->
<br>
<div class="md-source-file">
<small>
<!-- mkdocs-git-revision-date-localized-plugin -->
{% if page.meta.git_revision_date_localized %}
📅 {{ lang.t("source.file.date.updated") }}:
{{ page.meta.git_revision_date_localized }}
{% if page.meta.git_creation_date_localized %}
<br/>
🎂 {{ lang.t("source.file.date.created") }}:
{{ page.meta.git_creation_date_localized }}
{% endif %}
<!-- mkdocs-git-revision-date-plugin -->
{% elif page.meta.revision_date %}
📅 {{ lang.t("source.file.date.updated") }}:
{{ page.meta.revision_date }}
{% endif %}
</small>
</div>

@ -1,28 +1,85 @@
---
comments: true
description: Explore various methods to install Ultralytics using pip, conda, git and Docker. Learn how to use Ultralytics with command line interface or within your Python projects.
keywords: Ultralytics installation, pip install Ultralytics, Docker install Ultralytics, Ultralytics command line interface, Ultralytics Python interface
---
## Install
## Install Ultralytics
Install YOLOv8 via the `ultralytics` pip package for the latest stable release or by cloning
the [https://github.com/ultralytics/ultralytics](https://github.com/ultralytics/ultralytics) repository for the most
up-to-date version.
Ultralytics provides various installation methods including pip, conda, and Docker. Install YOLOv8 via the `ultralytics` pip package for the latest stable release or by cloning the [Ultralytics GitHub repository](https://github.com/ultralytics/ultralytics) for the most up-to-date version. Docker can be used to execute the package in an isolated container, avoiding local installation.
!!! example "Install"
=== "pip install (recommended)"
=== "Pip install (recommended)"
Install the `ultralytics` package using pip, or update an existing installation by running `pip install -U ultralytics`. Visit the Python Package Index (PyPI) for more details on the `ultralytics` package: [https://pypi.org/project/ultralytics/](https://pypi.org/project/ultralytics/).
[![PyPI version](https://badge.fury.io/py/ultralytics.svg)](https://badge.fury.io/py/ultralytics) [![Downloads](https://static.pepy.tech/badge/ultralytics)](https://pepy.tech/project/ultralytics)
```bash
# Install the ultralytics package using pip
pip install ultralytics
```
=== "git clone (for development)"
=== "Conda install"
Conda is an alternative package manager to pip which may also be used for installation. Visit Anaconda for more details at [https://anaconda.org/conda-forge/ultralytics](https://anaconda.org/conda-forge/ultralytics). Ultralytics feedstock repository for updating the conda package is at [https://github.com/conda-forge/ultralytics-feedstock/](https://github.com/conda-forge/ultralytics-feedstock/).
[![Conda Recipe](https://img.shields.io/badge/recipe-ultralytics-green.svg)](https://anaconda.org/conda-forge/ultralytics) [![Conda Downloads](https://img.shields.io/conda/dn/conda-forge/ultralytics.svg)](https://anaconda.org/conda-forge/ultralytics) [![Conda Version](https://img.shields.io/conda/vn/conda-forge/ultralytics.svg)](https://anaconda.org/conda-forge/ultralytics) [![Conda Platforms](https://img.shields.io/conda/pn/conda-forge/ultralytics.svg)](https://anaconda.org/conda-forge/ultralytics)
```bash
# Install the ultralytics package using conda
conda install -c conda-forge ultralytics
```
!!! note
If you are installing in a CUDA environment best practice is to install `ultralytics`, `pytorch` and `pytorch-cuda` in the same command to allow the conda package manager to resolve any conflicts, or else to install `pytorch-cuda` last to allow it override the CPU-specific `pytorch` package if necesary.
```bash
# Install all packages together using conda
conda install -c conda-forge -c pytorch -c nvidia ultralytics pytorch torchvision pytorch-cuda=11.8
```
=== "Git clone"
Clone the `ultralytics` repository if you are interested in contributing to the development or wish to experiment with the latest source code. After cloning, navigate into the directory and install the package in editable mode `-e` using pip.
```bash
# Clone the ultralytics repository
git clone https://github.com/ultralytics/ultralytics
# Navigate to the cloned directory
cd ultralytics
# Install the package in editable mode for development
pip install -e .
```
See the `ultralytics` [requirements.txt](https://github.com/ultralytics/ultralytics/blob/main/requirements.txt) file for a list of dependencies. Note that `pip` automatically installs all required dependencies.
=== "Docker"
Utilize Docker to execute the `ultralytics` package in an isolated container. By employing the official `ultralytics` image from [Docker Hub](https://hub.docker.com/r/ultralytics/ultralytics), you can avoid local installation. Below are the commands to get the latest image and execute it:
<a href="https://hub.docker.com/r/ultralytics/ultralytics"><img src="https://img.shields.io/docker/pulls/ultralytics/ultralytics?logo=docker" alt="Docker Pulls"></a>
```bash
# Set image name as a variable
t=ultralytics/ultralytics:latest
# Pull the latest ultralytics image from Docker Hub
sudo docker pull $t
# Run the ultralytics image in a container with GPU support
sudo docker run -it --ipc=host --gpus all $t
```
The above command initializes a Docker container with the latest `ultralytics` image. The `-it` flag assigns a pseudo-TTY and maintains stdin open, enabling you to interact with the container. The `--ipc=host` flag sets the IPC (Inter-Process Communication) namespace to the host, which is essential for sharing memory between processes. The `--gpus all` flag enables access to all available GPUs inside the container, which is crucial for tasks that require GPU computation.
Note: To work with files on your local machine within the container, use Docker volumes for mounting a local directory into the container:
```bash
# Mount local directory to a directory inside the container
sudo docker run -it --ipc=host --gpus all -v /path/on/host:/path/in/container $t
```
Alter `/path/on/host` with the directory path on your local machine, and `/path/in/container` with the desired path inside the Docker container for accessibility.
See the `ultralytics` [requirements.txt](https://github.com/ultralytics/ultralytics/blob/main/requirements.txt) file for a list of dependencies. Note that all examples above install all required dependencies.
!!! tip "Tip"
@ -32,13 +89,11 @@ See the `ultralytics` [requirements.txt](https://github.com/ultralytics/ultralyt
<img width="800" alt="PyTorch Installation Instructions" src="https://user-images.githubusercontent.com/26833433/228650108-ab0ec98a-b328-4f40-a40d-95355e8a84e3.png">
</a>
## Use Ultralytics with CLI
## Use with CLI
The YOLO command line interface (CLI) allows for simple single-line commands without the need for a Python environment.
The Ultralytics command line interface (CLI) allows for simple single-line commands without the need for a Python environment.
CLI requires no customization or Python code. You can simply run all tasks from the terminal with the `yolo` command. Check out the [CLI Guide](usage/cli.md) to learn more about using YOLOv8 from the command line.
!!! example
=== "Syntax"
@ -93,10 +148,9 @@ CLI requires no customization or Python code. You can simply run all tasks from
yolo cfg
```
!!! warning "Warning"
Arguments must be passed as `arg=val` pairs, split by an equals `=` sign and delimited by spaces ` ` between pairs. Do not use `--` argument prefixes or commas `,` beteen arguments.
Arguments must be passed as `arg=val` pairs, split by an equals `=` sign and delimited by spaces ` ` between pairs. Do not use `--` argument prefixes or commas `,` between arguments.
- `yolo predict model=yolov8n.pt imgsz=640 conf=0.25` &nbsp;
- `yolo predict model yolov8n.pt imgsz 640 conf 0.25` &nbsp;
@ -104,7 +158,7 @@ CLI requires no customization or Python code. You can simply run all tasks from
[CLI Guide](usage/cli.md){ .md-button .md-button--primary}
## Use with Python
## Use Ultralytics with Python
YOLOv8's Python interface allows for seamless integration into your Python projects, making it easy to load, run, and process the model's output. Designed with simplicity and ease of use in mind, the Python interface enables users to quickly implement object detection, segmentation, and classification in their projects. This makes YOLOv8's Python interface an invaluable tool for anyone looking to incorporate these functionalities into their Python projects.
@ -114,24 +168,111 @@ For example, users can load a model, train it, evaluate its performance on a val
```python
from ultralytics import YOLO
# Create a new YOLO model from scratch
model = YOLO('yolov8n.yaml')
# Load a pretrained YOLO model (recommended for training)
model = YOLO('yolov8n.pt')
# Train the model using the 'coco128.yaml' dataset for 3 epochs
results = model.train(data='coco128.yaml', epochs=3)
# Evaluate the model's performance on the validation set
results = model.val()
# Perform object detection on an image using the model
results = model('https://ultralytics.com/images/bus.jpg')
# Export the model to ONNX format
success = model.export(format='onnx')
```
[Python Guide](usage/python.md){.md-button .md-button--primary}
## Ultralytics Settings
The Ultralytics library provides a powerful settings management system to enable fine-grained control over your experiments. By making use of the `SettingsManager` housed within the `ultralytics.utils` module, users can readily access and alter their settings. These are stored in a YAML file and can be viewed or modified either directly within the Python environment or via the Command-Line Interface (CLI).
### Inspecting Settings
To gain insight into the current configuration of your settings, you can view them directly:
!!! example "View settings"
=== "Python"
You can use Python to view your settings. Start by importing the `settings` object from the `ultralytics` module. Print and return settings using the following commands:
```python
from ultralytics import settings
# View all settings
print(settings)
# Return a specific setting
value = settings['runs_dir']
```
=== "CLI"
Alternatively, the command-line interface allows you to check your settings with a simple command:
```bash
yolo settings
```
### Modifying Settings
Ultralytics allows users to easily modify their settings. Changes can be performed in the following ways:
!!! example "Update settings"
=== "Python"
Within the Python environment, call the `update` method on the `settings` object to change your settings:
```python
from ultralytics import settings
# Update a setting
settings.update({'runs_dir': '/path/to/runs'})
# Update multiple settings
settings.update({'runs_dir': '/path/to/runs', 'tensorboard': False})
# Reset settings to default values
settings.reset()
```
=== "CLI"
If you prefer using the command-line interface, the following commands will allow you to modify your settings:
```bash
# Update a setting
yolo settings runs_dir='/path/to/runs'
# Update multiple settings
yolo settings runs_dir='/path/to/runs' tensorboard=False
# Reset settings to default values
yolo settings reset
```
### Understanding Settings
The table below provides an overview of the settings available for adjustment within Ultralytics. Each setting is outlined along with an example value, the data type, and a brief description.
| Name | Example Value | Data Type | Description |
|--------------------|-----------------------|-----------|------------------------------------------------------------------------------------------------------------------|
| `settings_version` | `'0.0.4'` | `str` | Ultralytics _settings_ version (different from Ultralytics [pip](https://pypi.org/project/ultralytics/) version) |
| `datasets_dir` | `'/path/to/datasets'` | `str` | The directory where the datasets are stored |
| `weights_dir` | `'/path/to/weights'` | `str` | The directory where the model weights are stored |
| `runs_dir` | `'/path/to/runs'` | `str` | The directory where the experiment runs are stored |
| `uuid` | `'a1b2c3d4'` | `str` | The unique identifier for the current settings |
| `sync` | `True` | `bool` | Whether to sync analytics and crashes to HUB |
| `api_key` | `''` | `str` | Ultralytics HUB [API Key](https://hub.ultralytics.com/settings?tab=api+keys) |
| `clearml` | `True` | `bool` | Whether to use ClearML logging |
| `comet` | `True` | `bool` | Whether to use [Comet ML](https://bit.ly/yolov8-readme-comet) for experiment tracking and visualization |
| `dvc` | `True` | `bool` | Whether to use [DVC for experiment tracking](https://dvc.org/doc/dvclive/ml-frameworks/yolo) and version control |
| `hub` | `True` | `bool` | Whether to use [Ultralytics HUB](https://hub.ultralytics.com) integration |
| `mlflow` | `True` | `bool` | Whether to use MLFlow for experiment tracking |
| `neptune` | `True` | `bool` | Whether to use Neptune for experiment tracking |
| `raytune` | `True` | `bool` | Whether to use Ray Tune for hyperparameter tuning |
| `tensorboard` | `True` | `bool` | Whether to use TensorBoard for visualization |
| `wandb` | `True` | `bool` | Whether to use Weights & Biases logging |
As you navigate through your projects or experiments, be sure to revisit these settings to ensure that they are optimally configured for your needs.

Some files were not shown because too many files have changed in this diff Show More

Loading…
Cancel
Save