OpenMMLab Detection Toolbox and Benchmark https://mmdetection.readthedocs.io/


Backbones Trained by Self-Supervised Algorithms

Introduction

MMDetection supports using backbones pre-trained by different self-supervised methods in detection systems, and we provide their results on Mask R-CNN.

The pre-trained models are either converted from the MoCo release or downloaded from the SwAV release.

For SwAV, please cite

@article{caron2020unsupervised,
  title={Unsupervised Learning of Visual Features by Contrasting Cluster Assignments},
  author={Caron, Mathilde and Misra, Ishan and Mairal, Julien and Goyal, Priya and Bojanowski, Piotr and Joulin, Armand},
  booktitle={Proceedings of Advances in Neural Information Processing Systems (NeurIPS)},
  year={2020}
}

For MoCo, please cite

@Article{he2019moco,
  author  = {Kaiming He and Haoqi Fan and Yuxin Wu and Saining Xie and Ross Girshick},
  title   = {Momentum Contrast for Unsupervised Visual Representation Learning},
  journal = {arXiv preprint arXiv:1911.05722},
  year    = {2019},
}
@Article{chen2020mocov2,
  author  = {Xinlei Chen and Haoqi Fan and Ross Girshick and Kaiming He},
  title   = {Improved Baselines with Momentum Contrastive Learning},
  journal = {arXiv preprint arXiv:2003.04297},
  year    = {2020},
}

Usage

Using a self-supervised pre-trained backbone takes two steps:

1. Download the model and convert it to the PyTorch-style checkpoint supported by MMDetection.
2. Modify the config and change the training settings accordingly.

Convert model

For more general usage, we also provide the script selfsup2mmdet.py in the tools/model_converters directory. It converts the keys of models pre-trained by different self-supervised methods into the PyTorch-style checkpoints used in MMDetection.

python -u tools/model_converters/selfsup2mmdet.py ${PRETRAIN_PATH} ${STORE_PATH} --selfsup ${method}

This script converts the model at PRETRAIN_PATH and stores the converted model at STORE_PATH.

For example, to use a ResNet-50 backbone released by MoCo, you can download it from here and run the following command:

python -u tools/model_converters/selfsup2mmdet.py ./moco_v2_800ep_pretrain.pth.tar mocov2_r50_800ep_pretrain.pth --selfsup moco
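The key conversion can be illustrated with a minimal sketch. It is not the selfsup2mmdet.py script itself: it assumes that a MoCo checkpoint stores the query-encoder weights under the prefix `module.encoder_q.`, and the function name `strip_moco_prefix` and the toy keys are hypothetical. Plain strings stand in for tensors so the example runs without PyTorch.

```python
from collections import OrderedDict

# Assumption: query-encoder weights live under this prefix in a MoCo checkpoint.
MOCO_PREFIX = 'module.encoder_q.'


def strip_moco_prefix(src_state_dict, prefix=MOCO_PREFIX):
    """Keep only the keys under `prefix` and drop the prefix from each key."""
    converted = OrderedDict()
    for key, value in src_state_dict.items():
        if not key.startswith(prefix):
            continue  # skip entries outside the query encoder
        converted[key[len(prefix):]] = value
    return converted


# Toy stand-in for a real state dict; the values would normally be tensors.
toy_state_dict = OrderedDict([
    ('module.encoder_q.conv1.weight', 'w0'),
    ('module.encoder_q.layer1.0.conv1.weight', 'w1'),
    ('module.encoder_k.conv1.weight', 'k0'),
    ('module.queue', 'q'),
])

converted = strip_moco_prefix(toy_state_dict)
print(list(converted))
```

After conversion the remaining keys are in the form a plain ResNet backbone would expect, for example `conv1.weight` instead of `module.encoder_q.conv1.weight`.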

To use the ResNet-50 backbone released by SwAV, you can download it from here.

Modify config

The backbone requires SyncBN, and frozen_stages needs to be changed. A config that uses the MoCo backbone is shown below:

_base_ = [
    '../_base_/models/mask_rcnn_r50_fpn.py',
    '../_base_/datasets/coco_instance.py',
    '../_base_/schedules/schedule_1x.py', '../_base_/default_runtime.py'
]

model = dict(
    pretrained='./mocov2_r50_800ep_pretrain.pth',
    backbone=dict(
        frozen_stages=0,
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        norm_eval=False))
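In MMDetection versions that include the refactor from PR #5370, which moved model.pretrained to model.backbone.init_cfg, the checkpoint path would be set in the backbone's init_cfg instead. The fragment below is a sketch of that form, assuming the `Pretrained` initializer type and its `checkpoint` key; the `_base_` list is unchanged and omitted here.

```python
# Sketch of the same settings with the checkpoint path moved from
# `model.pretrained` to `model.backbone.init_cfg` (assumed init_cfg form).
model = dict(
    backbone=dict(
        frozen_stages=0,  # same value as in the config above
        norm_cfg=dict(type='SyncBN', requires_grad=True),
        norm_eval=False,
        init_cfg=dict(
            type='Pretrained',
            checkpoint='./mocov2_r50_800ep_pretrain.pth')))
```

Which form applies depends on the installed MMDetection version, so check the config files shipped with it before editing.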

Results

Method	Backbone	Style	Lr schd	box AP	mask AP	Config	Download
Mask R-CNN	R50 by MoCo v2	pytorch	1x	38.0	34.3	config	model | log
Mask R-CNN	R50 by MoCo v2	pytorch	multi-scale 2x	40.8	36.8	config	model | log
Mask R-CNN	R50 by SwAV	pytorch	1x	39.1	35.7	config	model | log
Mask R-CNN	R50 by SwAV	pytorch	multi-scale 2x	41.3	37.3	config	model | log

Notice

We only provide single-scale 1x and multi-scale 2x configs as examples of how to use backbones trained by self-supervised algorithms. We will try to reproduce the results reported in the corresponding papers using the released backbones in the future. Please stay tuned.