OpenMMLab Detection Toolbox and Benchmark
https://mmdetection.readthedocs.io/
# 3: Train with customized models and standard datasets

In this note, you will learn how to train, test, and run inference with your own customized models on standard datasets. As an example, we train a customized Cascade Mask R-CNN R50 model on the cityscapes dataset: it uses [`AugFPN`](https://github.com/Gus-Guo/AugFPN) in place of the default `FPN` as the neck and adds `Rotate` or `Translate` as training-time auto augmentation.

The basic steps are as below:

1. Prepare the standard dataset
2. Prepare your own customized model
3. Prepare a config
4. Train, test, and run inference with models on the standard dataset.

## Prepare the standard dataset

In this note, we use the standard cityscapes dataset as an example.

It is recommended to symlink the dataset root to `$MMDETECTION/data`.
If your folder structure is different, you may need to change the corresponding paths in config files.

```none
mmdetection
├── mmdet
├── tools
├── configs
├── data
│   ├── coco
│   │   ├── annotations
│   │   ├── train2017
│   │   ├── val2017
│   │   ├── test2017
│   ├── cityscapes
│   │   ├── annotations
│   │   ├── leftImg8bit
│   │   │   ├── train
│   │   │   ├── val
│   │   ├── gtFine
│   │   │   ├── train
│   │   │   ├── val
│   ├── VOCdevkit
│   │   ├── VOC2007
│   │   ├── VOC2012
```
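
Before going further, it can help to confirm that the cityscapes part of this layout is in place. The following is a minimal sketch using only the standard library; the helper name `missing_cityscapes_dirs` and the default root `data/cityscapes` are assumptions made for illustration, and `annotations` is left out because the conversion command writes to it.

```python
from pathlib import Path

# Sub-directories of the cityscapes root shown in the tree above.
# `annotations` is omitted: the conversion command writes to it.
EXPECTED_SUBDIRS = (
    'leftImg8bit/train',
    'leftImg8bit/val',
    'gtFine/train',
    'gtFine/val',
)


def missing_cityscapes_dirs(root='data/cityscapes'):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [sub for sub in EXPECTED_SUBDIRS if not (root / sub).is_dir()]


if __name__ == '__main__':
    missing = missing_cityscapes_dirs()
    if missing:
        print('Missing directories:', ', '.join(missing))
    else:
        print('cityscapes layout looks complete')
```

If the dataset lives elsewhere, pass that location as `root` and adjust the paths in the config files accordingly.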

The cityscapes annotations have to be converted into the COCO format using `tools/convert_datasets/cityscapes.py`:

```shell
pip install cityscapesscripts
python tools/convert_datasets/cityscapes.py ./data/cityscapes --nproc 8 --out-dir ./data/cityscapes/annotations
```

Currently, the config files in `cityscapes` use COCO pre-trained weights for initialization.
If the network connection is unavailable or slow, download the pre-trained models in advance; otherwise, errors will occur at the beginning of training.
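
One way to fetch a checkpoint ahead of time is sketched below with the standard library only. The helper names `local_checkpoint_path` and `fetch_checkpoint` and the `checkpoints` directory are hypothetical choices for illustration, not part of MMDetection.

```python
import os
import urllib.request
from urllib.parse import urlparse


def local_checkpoint_path(url, cache_dir='checkpoints'):
    """Map a checkpoint URL to a file path inside `cache_dir`."""
    filename = os.path.basename(urlparse(url).path)
    return os.path.join(cache_dir, filename)


def fetch_checkpoint(url, cache_dir='checkpoints'):
    """Download `url` into `cache_dir` unless the file already exists."""
    path = local_checkpoint_path(url, cache_dir)
    if not os.path.exists(path):
        os.makedirs(cache_dir, exist_ok=True)
        urllib.request.urlretrieve(url, path)
    return path


# Example call (needs network access on the first run only):
# path = fetch_checkpoint('<checkpoint URL used for `load_from`>')
```

The returned local path can then be used as the value of `load_from` in the config in place of the URL.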

## Prepare your own customized model

The second step is to use your own module or training setting. Assume that we want to implement a new neck called `AugFPN` to replace the default `FPN` in the existing detector Cascade Mask R-CNN R50. The following implements `AugFPN` under MMDetection.

### 1. Define a new neck (e.g. AugFPN)

First, create a new file `mmdet/models/necks/augfpn.py`.

```python
import torch.nn as nn

from ..builder import NECKS


@NECKS.register_module()
class AugFPN(nn.Module):

    def __init__(self,
                 in_channels,
                 out_channels,
                 num_outs,
                 start_level=0,
                 end_level=-1,
                 add_extra_convs=False):
        # initialize nn.Module; layer construction is omitted
        super().__init__()

    def forward(self, inputs):
        # implementation is omitted
        pass
```

### 2. Import the module

You can either add the following line to `mmdet/models/necks/__init__.py`,

```python
from .augfpn import AugFPN
```

or alternatively add

```python
custom_imports = dict(
    imports=['mmdet.models.necks.augfpn'],
    allow_failed_imports=False)
```

to the config file, which avoids modifying the original code.
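
Note that entries in `imports` are dotted module paths, not file paths. As a rough illustration, assuming the strings are resolved with Python's standard `importlib.import_module`, the sketch below uses the standard-library module `json.decoder` as a stand-in for `mmdet.models.necks.augfpn` and shows that a trailing `.py` makes the import fail. The helper `try_import` is hypothetical.

```python
import importlib


def try_import(module_path):
    """Return True if `module_path` can be imported as a dotted module path."""
    try:
        importlib.import_module(module_path)
        return True
    except ImportError:
        return False


# `json.decoder` is only a stand-in for a real module such as
# `mmdet.models.necks.augfpn`.
print(try_import('json.decoder'))     # plain dotted module path
print(try_import('json.decoder.py'))  # `.py` is read as a submodule named `py`
```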

### 3. Modify the config file

```python
neck=dict(
    type='AugFPN',
    in_channels=[256, 512, 1024, 2048],
    out_channels=256,
    num_outs=5)
```
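
The link between the `type='AugFPN'` string in the config and the class decorated with `@NECKS.register_module()` is a name-to-class registry. The toy version below is only meant to illustrate the idea: the `Registry` class here is a simplified stand-in written for this note, not the library's implementation, and the `AugFPN` class merely stores its arguments.

```python
class Registry:
    """Toy name-to-class registry (illustration only)."""

    def __init__(self, name):
        self.name = name
        self._modules = {}

    def register_module(self):
        def decorator(cls):
            # Register the class under its own name, e.g. 'AugFPN'.
            self._modules[cls.__name__] = cls
            return cls
        return decorator

    def build(self, cfg):
        args = dict(cfg)  # copy so the caller's config is left untouched
        cls = self._modules[args.pop('type')]
        return cls(**args)


NECKS = Registry('neck')


@NECKS.register_module()
class AugFPN:
    # Stand-in class: stores its arguments instead of building layers.
    def __init__(self, in_channels, out_channels, num_outs):
        self.in_channels = in_channels
        self.out_channels = out_channels
        self.num_outs = num_outs


neck = NECKS.build(dict(
    type='AugFPN',
    in_channels=[256, 512, 1024, 2048],
    out_channels=256,
    num_outs=5))
print(type(neck).__name__, neck.num_outs)
```

This also shows why the module has to be imported before the config is built: a class that was never imported was never registered, so its name cannot be looked up.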

For more details on customizing your own models (e.g. implementing a new backbone, head, loss, etc.) and runtime training settings (e.g. defining a new optimizer, using gradient clipping, customizing training schedules and hooks, etc.), please refer to the guidelines [Customize Models](tutorials/customize_models.md) and [Customize Runtime Settings](tutorials/customize_runtime.md), respectively.

## Prepare a config

The third step is to prepare a config for your own training setting. Assume that we want to add `AugFPN` and the `Rotate` or `Translate` augmentation to the existing Cascade Mask R-CNN R50 to train on the cityscapes dataset, and that the config is placed under `configs/cityscapes/` and named `cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes.py`. The config is as below.

```python
# The new config inherits the base configs to highlight the necessary modifications
_base_ = [
    '../_base_/models/cascade_mask_rcnn_r50_fpn.py',
    '../_base_/datasets/cityscapes_instance.py', '../_base_/default_runtime.py'
]

model = dict(
    # set None to avoid loading the ImageNet pretrained backbone;
    # instead, we set `load_from` below to load a COCO pretrained detector.
    pretrained=None,
    # replace the default `FPN` neck with our newly implemented module `AugFPN`
    neck=dict(
        type='AugFPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    # We also need to change the num_classes in the heads from 80 to 8 to match the
    # cityscapes dataset's annotation. This modification involves `bbox_head` and `mask_head`.
    roi_head=dict(
        bbox_head=[
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                # change the number of classes from the COCO default to cityscapes
                num_classes=8,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.1, 0.1, 0.2, 0.2]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                # change the number of classes from the COCO default to cityscapes
                num_classes=8,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.05, 0.05, 0.1, 0.1]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0,
                               loss_weight=1.0)),
            dict(
                type='Shared2FCBBoxHead',
                in_channels=256,
                fc_out_channels=1024,
                roi_feat_size=7,
                # change the number of classes from the COCO default to cityscapes
                num_classes=8,
                bbox_coder=dict(
                    type='DeltaXYWHBBoxCoder',
                    target_means=[0., 0., 0., 0.],
                    target_stds=[0.033, 0.033, 0.067, 0.067]),
                reg_class_agnostic=True,
                loss_cls=dict(
                    type='CrossEntropyLoss',
                    use_sigmoid=False,
                    loss_weight=1.0),
                loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0))
        ],
        mask_head=dict(
            type='FCNMaskHead',
            num_convs=4,
            in_channels=256,
            conv_out_channels=256,
            # change the number of classes from the COCO default to cityscapes
            num_classes=8,
            loss_mask=dict(
                type='CrossEntropyLoss', use_mask=True, loss_weight=1.0))))

# overwrite `train_pipeline` to add the new `AutoAugment` training setting
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    dict(
        type='AutoAugment',
        policies=[
            [dict(
                 type='Rotate',
                 level=5,
                 img_fill_val=(124, 116, 104),
                 prob=0.5,
                 scale=1)
            ],
            [dict(type='Rotate', level=7, img_fill_val=(124, 116, 104)),
             dict(
                 type='Translate',
                 level=5,
                 prob=0.5,
                 img_fill_val=(124, 116, 104))
            ],
        ]),
    dict(
        type='Resize', img_scale=[(2048, 800), (2048, 1024)], keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]

# set the batch size per GPU and use the new training pipeline
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=3,
    # overwrite `pipeline` with the new training pipeline setting
    train=dict(dataset=dict(pipeline=train_pipeline)))

# Set optimizer
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
# Set customized learning policy
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=0.001,
    step=[8])
total_epochs = 10

# We can use the COCO pretrained Cascade Mask R-CNN R50 model for a more stable initialization
load_from = 'http://download.openmmlab.com/mmdetection/v2.0/cascade_rcnn/cascade_mask_rcnn_r50_fpn_1x_coco/cascade_mask_rcnn_r50_fpn_1x_coco_20200203-9d4dcb24.pth'
```
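
The config only lists what differs from the `_base_` files. As a simplified mental model (the `merge_config` helper below is hypothetical, is not the actual config loader, and ignores special keys such as `_delete_`), nested dicts are merged key by key, while other values, including lists, are replaced as a whole. Under that assumption, a dict such as `neck` could be overridden with only a few keys, whereas the `bbox_head` list has to be written out in full.

```python
def merge_config(base, override):
    """Recursively merge `override` into a copy of `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested dicts are merged key by key.
            merged[key] = merge_config(merged[key], value)
        else:
            # Everything else, including lists, replaces the base value.
            merged[key] = value
    return merged


# Trimmed-down stand-ins for the base model config and its overrides.
base = dict(
    neck=dict(type='FPN', in_channels=[256, 512, 1024, 2048],
              out_channels=256, num_outs=5),
    roi_head=dict(
        type='CascadeRoIHead',
        bbox_head=[dict(num_classes=80) for _ in range(3)]))
override = dict(
    neck=dict(type='AugFPN'),
    roi_head=dict(bbox_head=[dict(num_classes=8) for _ in range(3)]))

merged = merge_config(base, override)
print(merged['neck'])
print(merged['roi_head'])
```

In the merged result, `neck` keeps the inherited channel settings with the new `type`, `roi_head` keeps its inherited `type`, and the `bbox_head` list comes entirely from the override.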

## Train a new model

To train a model with the new config, you can simply run

```shell
python tools/train.py configs/cityscapes/cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes.py
```

For more detailed usage, please refer to [Case 1](1_exist_data_model.md).
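
To get a feel for what the `lr_config` in the config produces over the 10 epochs, the following sketch recomputes the schedule by hand. It is a simplified reading of a `step` policy with `linear` warmup, not the training code itself; the function name `learning_rate` and the decay factor `gamma=0.1` are assumptions that do not appear in the config.

```python
def learning_rate(epoch, iteration, base_lr=0.01, steps=(8,), gamma=0.1,
                  warmup_iters=500, warmup_ratio=0.001):
    """Approximate LR for a 0-based `epoch` and a global `iteration` count."""
    # Step decay: multiply by `gamma` once for every milestone already reached.
    num_decays = sum(epoch >= step for step in steps)
    lr = base_lr * gamma ** num_decays
    # Linear warmup: ramp from `warmup_ratio * lr` up to `lr`.
    if iteration < warmup_iters:
        k = (1 - iteration / warmup_iters) * (1 - warmup_ratio)
        lr = lr * (1 - k)
    return lr


for epoch, iteration in [(0, 0), (0, 250), (0, 500), (7, 20000), (8, 24000)]:
    print(epoch, iteration, learning_rate(epoch, iteration))
```

Under these assumptions the learning rate starts near 1e-5, reaches 0.01 after 500 iterations, and drops to about 0.001 from the ninth epoch (index 8) onward. The iteration counts in the loop are arbitrary sample points.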

## Test and inference

To test the trained model, you can simply run

```shell
python tools/test.py configs/cityscapes/cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes.py work_dirs/cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes/latest.pth --eval bbox segm
```

For more detailed usage, please refer to [Case 1](1_exist_data_model.md).
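
For inference on a single image, a minimal sketch is given below. It assumes an MMDetection 2.x installation that provides `init_detector` and `inference_detector` in `mmdet.apis`, and that training used the default work directory `work_dirs/<config file name without extension>`. The helpers `default_checkpoint` and `run_inference` are hypothetical names introduced here.

```python
import os.path as osp


def default_checkpoint(config_path, filename='latest.pth'):
    """Derive the checkpoint path inside the default work directory."""
    stem = osp.splitext(osp.basename(config_path))[0]
    return osp.join('work_dirs', stem, filename)


def run_inference(config_path, image_path, device='cuda:0'):
    """Build the detector from config + checkpoint and run it on one image."""
    # Imported lazily so the path helper works without MMDetection installed.
    from mmdet.apis import inference_detector, init_detector

    model = init_detector(
        config_path, default_checkpoint(config_path), device=device)
    return inference_detector(model, image_path)


# Example call (the image path is a placeholder):
# result = run_inference(
#     'configs/cityscapes/cascade_mask_rcnn_r50_augfpn_autoaug_10e_cityscapes.py',
#     'path/to/image.png')
```

If the module was registered through `custom_imports` rather than `__init__.py`, the same config file takes care of importing it when the detector is built.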