---
comments: true
description: Learn how to train YOLOv5 on your own custom datasets. A simple, up-to-date guide to collecting and organizing images, labelling, model training and deployment.
keywords: YOLOv5, train on custom dataset, image collection, model training, object detection, image labelling, Ultralytics, PyTorch, machine learning
---

📚 This guide explains how to train [YOLOv5](https://github.com/ultralytics/yolov5) 🚀 on your own **custom dataset**. UPDATED 7 June 2023.

## Before You Start

Clone repo and install [requirements.txt](https://github.com/ultralytics/yolov5/blob/master/requirements.txt) in a [**Python>=3.8.0**](https://www.python.org/) environment, including [**PyTorch>=1.8**](https://pytorch.org/get-started/locally/). [Models](https://github.com/ultralytics/yolov5/tree/master/models) and [datasets](https://github.com/ultralytics/yolov5/tree/master/data) download automatically from the latest YOLOv5 [release](https://github.com/ultralytics/yolov5/releases).

```bash
git clone https://github.com/ultralytics/yolov5  # clone
cd yolov5
pip install -r requirements.txt  # install
```

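Before training, it can help to confirm that the interpreter meets the minimum version stated above. The following is a minimal sketch and not part of YOLOv5: `meets_minimum` and `check_environment` are hypothetical helper names, and the PyTorch check only tests whether the package can be found, not which version is installed.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8, 0)  # minimum Python version stated in this guide


def meets_minimum(version, minimum=MIN_PYTHON):
    """Return True if a (major, minor, micro) tuple is at least `minimum`."""
    return tuple(version[:3]) >= tuple(minimum)


def check_environment():
    """Report whether Python is recent enough and whether torch can be found."""
    return {
        "python_ok": meets_minimum(sys.version_info),
        "torch_installed": importlib.util.find_spec("torch") is not None,
    }


if __name__ == "__main__":
    print(check_environment())
```

If `torch_installed` is reported as `False`, installing the requirements as shown above is the first thing to try.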
## Train On Custom Data

<a href="https://bit.ly/ultralytics_hub" target="_blank">
<img width="100%" src="https://github.com/ultralytics/assets/raw/main/im/integrations-loop.png"></a>
<br>
<br>

Creating a custom model to detect your objects is an iterative process: collect and organize images, label your objects of interest, train a model, deploy it into the wild to make predictions, and then use the deployed model to collect edge-case examples that feed the next round of improvement.

### 1. Create Dataset

YOLOv5 models must be trained on labelled data in order to learn classes of objects in that data. There are two options for creating your dataset before you start training:

<details open markdown>
<summary>Use <a href="https://roboflow.com/?ref=ultralytics">Roboflow</a> to create your dataset in YOLO format 🌟</summary>

!!! note

    Roboflow users can use Ultralytics under the [AGPL license](https://github.com/ultralytics/ultralytics/blob/main/LICENSE) or can request an [Enterprise license](https://ultralytics.com/license) directly from Ultralytics. Be aware that Roboflow does not provide Ultralytics licenses, and it is the responsibility of the user to ensure appropriate licensing.

### 1.1 Collect Images

Your model will learn by example. Training on images similar to the ones it will see in the wild is of the utmost importance. Ideally, you will collect a wide variety of images from the same configuration (camera, angle, lighting, etc.) in which you will ultimately deploy your project.

If this is not possible, you can start from [a public dataset](https://universe.roboflow.com/?ref=ultralytics) to train your initial model and then [sample images from the wild during inference](https://blog.roboflow.com/computer-vision-active-learning-tips/?ref=ultralytics) to improve your dataset and model iteratively.

### 1.2 Create Labels

Once you have collected images, you will need to annotate the objects of interest to create a ground truth for your model to learn from.

<p align="center"><a href="https://app.roboflow.com/?model=yolov5&ref=ultralytics" title="Create a Free Roboflow Account"><img width="450" src="https://uploads-ssl.webflow.com/5f6bc60e665f54545a1e52a5/6152a275ad4b4ac20cd2e21a_roboflow-annotate.gif" /></a></p>

[Roboflow Annotate](https://roboflow.com/annotate?ref=ultralytics) is a simple web-based tool for managing and labeling your images with your team and exporting them in [YOLOv5's annotation format](https://roboflow.com/formats/yolov5-pytorch-txt?ref=ultralytics).

### 1.3 Prepare Dataset for YOLOv5

Whether you [label your images with Roboflow](https://roboflow.com/annotate?ref=ultralytics) or not, you can use it to convert your dataset into YOLO format, create a YOLOv5 YAML configuration file, and host it for importing into your training script.

[Create a free Roboflow account](https://app.roboflow.com/?model=yolov5&ref=ultralytics)
and upload your dataset to a `Public` workspace, label any unannotated images, then generate and export a version of your dataset in `YOLOv5 Pytorch` format.

Note: YOLOv5 applies online augmentation during training, so we do not recommend applying any augmentation steps in Roboflow when training with YOLOv5. We do recommend applying the following preprocessing steps:

<p align="center"><img width="450" src="https://uploads-ssl.webflow.com/5f6bc60e665f54545a1e52a5/6152a273477fccf42a0fd3d6_roboflow-preprocessing.png" title="Recommended Preprocessing Steps" /></p>

* **Auto-Orient** - to strip EXIF orientation from your images.
* **Resize (Stretch)** - to the square input size of your model (640x640 is the YOLOv5 default).

Generating a version gives you a point-in-time snapshot of your dataset, so you can always go back and compare future model training runs against it, even if you add more images or change its configuration later.

<p align="center"><img width="450" src="https://uploads-ssl.webflow.com/5f6bc60e665f54545a1e52a5/6152a2733fd1da943619934e_roboflow-export.png" title="Export in YOLOv5 Format" /></p>

Export in `YOLOv5 Pytorch` format, then copy the snippet into your training script or notebook to download your dataset.

<p align="center"><img width="450" src="https://uploads-ssl.webflow.com/5f6bc60e665f54545a1e52a5/6152a273a92e4f5cb72594df_roboflow-snippet.png" title="Roboflow dataset download snippet" /></p>

Now continue with `2. Select a Model`.
</details>

<details markdown>
<summary>Or manually prepare your dataset</summary>

### 1.1 Create dataset.yaml

[COCO128](https://www.kaggle.com/ultralytics/coco128) is a small example tutorial dataset composed of the first 128 images in [COCO](http://cocodataset.org/#home) train2017. These same 128 images are used for both training and validation to verify that our training pipeline is capable of overfitting. [data/coco128.yaml](https://github.com/ultralytics/yolov5/blob/master/data/coco128.yaml), shown below, is the dataset config file that defines 1) the dataset root directory `path` and relative paths to `train` / `val` / `test` image directories (or `*.txt` files with image paths) and 2) a class `names` dictionary:

```yaml
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: ../datasets/coco128  # dataset root dir
train: images/train2017  # train images (relative to 'path') 128 images
val: images/train2017  # val images (relative to 'path') 128 images
test:  # test images (optional)

# Classes (80 COCO classes)
names:
  0: person
  1: bicycle
  2: car
  ...
  77: teddy bear
  78: hair drier
  79: toothbrush
```

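For your own data, a config with the same keys can be generated rather than written by hand. The sketch below is an illustration only: `build_dataset_yaml` is a hypothetical helper, the dataset path and class names in the usage lines are made up, and it assumes class names contain no characters that need YAML quoting (such as `:` or `#`).

```python
def build_dataset_yaml(root, train, val, names, test=""):
    """Render a dataset config with the same keys as coco128.yaml.

    `names` is a list of class names; list order defines the class IDs.
    """
    lines = [
        f"path: {root}  # dataset root dir",
        f"train: {train}  # train images (relative to 'path')",
        f"val: {val}  # val images (relative to 'path')",
        f"test: {test}  # test images (optional)",
        "",
        "# Classes",
        "names:",
    ]
    # Class IDs are zero-indexed, matching the label files
    lines += [f"  {index}: {name}" for index, name in enumerate(names)]
    return "\n".join(lines) + "\n"


# Hypothetical two-class dataset stored next to the yolov5 directory
text = build_dataset_yaml("../datasets/my_dataset", "images/train", "images/val", ["cat", "dog"])
print(text)
```

Save the returned text to a `.yaml` file and pass that file to `--data` when training.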
### 1.2 Create Labels

After using an annotation tool to label your images, export your labels to **YOLO format**, with one `*.txt` file per image (if an image contains no objects, no `*.txt` file is required). The `*.txt` file specifications are:

- One row per object
- Each row is `class x_center y_center width height` format.
- Box coordinates must be in **normalized xywh** format (from 0 to 1). If your boxes are in pixels, divide `x_center` and `width` by image width, and `y_center` and `height` by image height.
- Class numbers are zero-indexed (start from 0).

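The normalization rule in the list above can be sketched in a few lines of Python. This is an illustrative helper, not YOLOv5 code: `to_yolo_row` is a hypothetical name, and it assumes your annotation tool gives pixel boxes as corner coordinates (`x_min, y_min, x_max, y_max`).

```python
def to_yolo_row(class_id, x_min, y_min, x_max, y_max, img_w, img_h):
    """Convert a pixel corner box to a `class x_center y_center width height` row."""
    # x values are divided by image width, y values by image height
    x_center = (x_min + x_max) / 2 / img_w
    y_center = (y_min + y_max) / 2 / img_h
    width = (x_max - x_min) / img_w
    height = (y_max - y_min) / img_h
    values = (x_center, y_center, width, height)
    if not all(0.0 <= v <= 1.0 for v in values):
        raise ValueError("box lies outside the image bounds")
    return f"{class_id} " + " ".join(f"{v:.6f}" for v in values)


# A 200x200 pixel box in a 640x480 image, class 0
print(to_yolo_row(0, 100, 200, 300, 400, 640, 480))
# -> 0 0.312500 0.625000 0.312500 0.416667
```

Write one such row per object into the `*.txt` file that belongs to the image.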
<p align="center"><img width="750" src="https://user-images.githubusercontent.com/26833433/91506361-c7965000-e886-11ea-8291-c72b98c25eec.jpg"></p>

The label file corresponding to the above image contains 2 persons (class `0`) and a tie (class `27`):

<p align="center"><img width="428" src="https://user-images.githubusercontent.com/26833433/112467037-d2568c00-8d66-11eb-8796-55402ac0d62f.png"></p>

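A quick way to sanity-check a label file is to parse it and count the objects per class. The sketch below is a hypothetical helper (`count_classes`), and the coordinates in the example string are made-up values chosen only to mirror the "2 persons and a tie" structure; they are not the contents of the real label file shown above.

```python
from collections import Counter


def count_classes(label_text):
    """Count objects per class ID in the contents of one YOLO `*.txt` label file."""
    counts = Counter()
    for line_number, line in enumerate(label_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split()
        if len(fields) != 5:
            raise ValueError(f"line {line_number}: expected 5 fields, got {len(fields)}")
        class_id = int(fields[0])
        coords = [float(value) for value in fields[1:]]
        if not all(0.0 <= c <= 1.0 for c in coords):
            raise ValueError(f"line {line_number}: coordinates must be normalized to 0-1")
        counts[class_id] += 1
    return counts


# Illustrative rows only: two objects of class 0 and one of class 27
example = "0 0.48 0.63 0.69 0.71\n0 0.74 0.52 0.31 0.93\n27 0.36 0.79 0.07 0.40\n"
print(dict(count_classes(example)))  # -> {0: 2, 27: 1}
```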
### 1.3 Organize Directories

Organize your train and val images and labels according to the example below. YOLOv5 assumes `/coco128` is inside a `/datasets` directory **next to** the `/yolov5` directory. **YOLOv5 locates labels automatically for each image** by replacing the last instance of `/images/` in each image path with `/labels/`. For example:

```bash
../datasets/coco128/images/im0.jpg  # image
../datasets/coco128/labels/im0.txt  # label
```

<p align="center"><img width="700" src="https://user-images.githubusercontent.com/26833433/134436012-65111ad1-9541-4853-81a6-f19a3468b75f.png"></p>
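
The path rule described above can be sketched as follows. This is an illustration of the rule as stated in this guide, not a copy of YOLOv5's own code; `image_to_label_path` is a hypothetical name and the sketch assumes forward-slash paths.

```python
import posixpath


def image_to_label_path(image_path):
    """Swap the last `/images/` for `/labels/` and the file extension for `.txt`."""
    head, sep, tail = image_path.rpartition("/images/")
    if not sep:
        raise ValueError(f"no '/images/' directory in path: {image_path}")
    stem = posixpath.splitext(tail)[0]  # drop .jpg, .png, ...
    return f"{head}/labels/{stem}.txt"


print(image_to_label_path("../datasets/coco128/images/im0.jpg"))
# -> ../datasets/coco128/labels/im0.txt
```

Because only the last `/images/` is replaced, a dataset stored under a parent folder that is itself called `images` still resolves to the expected label path.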
</details>

### 2. Select a Model

Select a pretrained model to start training from. Here we select [YOLOv5s](https://github.com/ultralytics/yolov5/blob/master/models/yolov5s.yaml), the second-smallest and second-fastest model available. See our README [table](https://github.com/ultralytics/yolov5#pretrained-checkpoints) for a full comparison of all models.

<p align="center"><img width="800" alt="YOLOv5 Models" src="https://github.com/ultralytics/yolov5/releases/download/v1.0/model_comparison.png"></p>

### 3. Train

Train a YOLOv5s model on COCO128 by specifying the dataset, image size, number of epochs (and optionally batch size) and either pretrained `--weights yolov5s.pt` (recommended), or randomly initialized `--weights '' --cfg yolov5s.yaml` (not recommended). Pretrained weights are auto-downloaded from the [latest YOLOv5 release](https://github.com/ultralytics/yolov5/releases).

```bash
python train.py --img 640 --epochs 3 --data coco128.yaml --weights yolov5s.pt
```

!!! tip "Tip"

    💡 Add `--cache ram` or `--cache disk` to speed up training (requires significant RAM/disk resources).

!!! tip "Tip"

    💡 Always train from a local dataset. Mounted or network drives like Google Drive will be very slow.

All training results are saved to `runs/train/` with incrementing run directories, e.g. `runs/train/exp2`, `runs/train/exp3` etc. For more details see the Training section of our tutorial notebook. <a href="https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> <a href="https://www.kaggle.com/ultralytics/yolov5"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" alt="Open In Kaggle"></a>

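Since every run gets a new numbered directory, scripts that post-process results often need to find the most recent one. A minimal sketch, assuming the default naming where the first run is `exp` and later runs are `exp2`, `exp3`, and so on; `latest_run` is a hypothetical helper and not part of YOLOv5.

```python
import re
from pathlib import Path


def latest_run(train_root="runs/train"):
    """Return the `exp`, `exp2`, `exp3`, ... directory with the highest number, or None."""
    root = Path(train_root)
    if not root.is_dir():
        return None
    best_dir, best_number = None, -1
    for entry in root.iterdir():
        match = re.fullmatch(r"exp(\d*)", entry.name)
        if entry.is_dir() and match:
            number = int(match.group(1) or 1)  # plain `exp` counts as run 1
            if number > best_number:
                best_dir, best_number = entry, number
    return best_dir


if __name__ == "__main__":
    print(latest_run())  # None if no training run exists yet
```

Comparing the numeric suffix rather than sorting names alphabetically matters once there are ten or more runs, because `exp10` sorts before `exp2` as a string.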
### 4. Visualize

#### Comet Logging and Visualization 🌟 NEW

[Comet](https://bit.ly/yolov5-readme-comet) is now fully integrated with YOLOv5. Track and visualize model metrics in real time, save your hyperparameters, datasets, and model checkpoints, and visualize your model predictions with [Comet Custom Panels](https://bit.ly/yolov5-colab-comet-panels)! Comet makes sure you never lose track of your work and makes it easy to share results and collaborate across teams of all sizes!

Getting started is easy:

```shell
pip install comet_ml  # 1. install
export COMET_API_KEY=<Your API Key>  # 2. paste API key
python train.py --img 640 --epochs 3 --data coco128.yaml --weights yolov5s.pt  # 3. train
```

To learn more about all the supported Comet features for this integration, check out the [Comet Tutorial](https://docs.ultralytics.com/yolov5/tutorials/comet_logging_integration). If you'd like to learn more about Comet, head over to the Comet [documentation](https://bit.ly/yolov5-colab-comet-docs). Get started by trying out the Comet Colab Notebook:
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1RG0WOQyxlDlo5Km8GogJpIEJlg_5lyYO?usp=sharing)

<img width="1920" alt="yolo-ui" src="https://user-images.githubusercontent.com/26833433/202851203-164e94e1-2238-46dd-91f8-de020e9d6b41.png">

#### ClearML Logging and Automation 🌟 NEW

[ClearML](https://cutt.ly/yolov5-notebook-clearml) is completely integrated into YOLOv5 to track your experimentation, manage dataset versions and even remotely execute training runs. To enable ClearML:

- `pip install clearml`
- run `clearml-init` to connect to a ClearML server (**deploy your own open-source server [here](https://github.com/allegroai/clearml-server)**, or use our free hosted server [here](https://cutt.ly/yolov5-notebook-clearml))

You'll get all the features you would expect from an experiment manager: live updates, model upload, experiment comparison, etc. ClearML also tracks uncommitted changes and installed packages, for example. Thanks to that, ClearML Tasks (which is what we call experiments) are also reproducible on different machines! With only 1 extra line, we can schedule a YOLOv5 training task on a queue to be executed by any number of ClearML Agents (workers).

You can use ClearML Data to version your dataset and then pass it to YOLOv5 simply using its unique ID. This will help you keep track of your data without adding extra hassle. Explore the [ClearML Tutorial](https://docs.ultralytics.com/yolov5/tutorials/clearml_logging_integration) for details!

<a href="https://cutt.ly/yolov5-notebook-clearml">
<img alt="ClearML Experiment Management UI" src="https://github.com/thepycoder/clearml_screenshots/raw/main/scalars.jpg" width="1280"/></a>

#### Local Logging

Training results are automatically logged with [TensorBoard](https://www.tensorflow.org/tensorboard) and [CSV](https://github.com/ultralytics/yolov5/pull/4148) loggers to `runs/train`, with a new experiment directory created for each new training as `runs/train/exp2`, `runs/train/exp3`, etc.

This directory contains train and val statistics, mosaics, labels, predictions and augmented mosaics, as well as metrics and charts including precision-recall (PR) curves and confusion matrices.

<img alt="Local logging results" src="https://github.com/ultralytics/yolov5/releases/download/v1.0/image-local_logging.jpg" width="1280"/>

Results file `results.csv` is updated after each epoch, and then plotted as `results.png` (below) after training completes. You can also plot any `results.csv` file manually:

```python
from utils.plots import plot_results

plot_results('path/to/results.csv')  # plot 'results.csv' as 'results.png'
```

<p align="center"><img width="800" alt="results.png" src="https://github.com/ultralytics/yolov5/releases/download/v1.0/results.png"></p>

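Because `results.csv` is plain CSV, it can also be inspected without plotting. The sketch below finds the epoch with the highest value of one metric. `best_epoch` is a hypothetical helper, and the column names `epoch` and `metrics/mAP_0.5` are assumptions about the file layout: check the header row of your own `results.csv` and pass a different `metric` if it differs.

```python
import csv
import io


def best_epoch(csv_text, metric="metrics/mAP_0.5"):
    """Return (epoch, value) for the row with the highest `metric`."""
    reader = csv.reader(io.StringIO(csv_text))
    header = [name.strip() for name in next(reader)]  # headers may be space-padded
    epoch_col, metric_col = header.index("epoch"), header.index(metric)
    rows = [row for row in reader if row]  # skip empty lines
    best = max(rows, key=lambda row: float(row[metric_col]))
    return int(float(best[epoch_col])), float(best[metric_col])


# Made-up numbers, only to show the expected shape of the input
sample = (
    "     epoch,   metrics/mAP_0.5\n"
    "         0,              0.41\n"
    "         1,              0.57\n"
    "         2,              0.52\n"
)
print(best_epoch(sample))  # -> (1, 0.57)
```

For a real run, read the file first, for example with `open('path/to/results.csv').read()`, and pass the text to the function.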
## Next Steps

Once your model is trained you can use your best checkpoint `best.pt` to:

* Run [CLI](https://github.com/ultralytics/yolov5#quick-start-examples) or [Python](https://docs.ultralytics.com/yolov5/tutorials/pytorch_hub_model_loading) inference on new images and videos
* [Validate](https://github.com/ultralytics/yolov5/blob/master/val.py) accuracy on train, val and test splits
* [Export](https://docs.ultralytics.com/yolov5/tutorials/model_export) to TensorFlow, Keras, ONNX, TFLite, TF.js, CoreML and TensorRT formats
* [Evolve](https://docs.ultralytics.com/yolov5/tutorials/hyperparameter_evolution) hyperparameters to improve performance
* [Improve](https://docs.roboflow.com/adding-data/upload-api?ref=ultralytics) your model by sampling real-world images and adding them to your dataset

## Environments

YOLOv5 is designed to be run in the following up-to-date verified environments (with all dependencies including [CUDA](https://developer.nvidia.com/cuda)/[cuDNN](https://developer.nvidia.com/cudnn), [Python](https://www.python.org/) and [PyTorch](https://pytorch.org/) preinstalled):

- **Notebooks** with free GPU: <a href="https://bit.ly/yolov5-paperspace-notebook"><img src="https://assets.paperspace.io/img/gradient-badge.svg" alt="Run on Gradient"></a> <a href="https://colab.research.google.com/github/ultralytics/yolov5/blob/master/tutorial.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"></a> <a href="https://www.kaggle.com/ultralytics/yolov5"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" alt="Open In Kaggle"></a>
- **Google Cloud** Deep Learning VM. See [GCP Quickstart Guide](https://docs.ultralytics.com/yolov5/environments/google_cloud_quickstart_tutorial/)
- **Amazon** Deep Learning AMI. See [AWS Quickstart Guide](https://docs.ultralytics.com/yolov5/environments/aws_quickstart_tutorial/)
- **Docker Image**. See [Docker Quickstart Guide](https://docs.ultralytics.com/yolov5/environments/docker_image_quickstart_tutorial/) <a href="https://hub.docker.com/r/ultralytics/yolov5"><img src="https://img.shields.io/docker/pulls/ultralytics/yolov5?logo=docker" alt="Docker Pulls"></a>

## Status

<a href="https://github.com/ultralytics/yolov5/actions/workflows/ci-testing.yml"><img src="https://github.com/ultralytics/yolov5/actions/workflows/ci-testing.yml/badge.svg" alt="YOLOv5 CI"></a>

If this badge is green, all [YOLOv5 GitHub Actions](https://github.com/ultralytics/yolov5/actions) Continuous Integration (CI) tests are currently passing. CI tests verify correct operation of YOLOv5 [training](https://github.com/ultralytics/yolov5/blob/master/train.py), [validation](https://github.com/ultralytics/yolov5/blob/master/val.py), [inference](https://github.com/ultralytics/yolov5/blob/master/detect.py), [export](https://github.com/ultralytics/yolov5/blob/master/export.py) and [benchmarks](https://github.com/ultralytics/yolov5/blob/master/benchmarks.py) on macOS, Windows, and Ubuntu every 24 hours and on every commit.