## About code isolation

This `downstream_mmdet` is isolated from the pre-training code. One can treat this `downstream_mmdet` as an independent codebase 🛠️.

## Fine-tuned ConvNeXt-B weights, log files, and performance

[[`weights (pre-trained by SparK)`](https://drive.google.com/file/d/1ZjWbqI1qoBcqeQijI5xX9E-YNkxpJcYV/view?usp=share_link)] [[`weights (fine-tuned on COCO)`](https://drive.google.com/file/d/1t10dmzg5KOO27o2yIglK-gQepB5gR4zR/view?usp=share_link)] [[`log.json`](https://drive.google.com/file/d/1TuNboXl1qwjf1tggZ3QOssI67uU7Jtig/view?usp=share_link)] [[`log`](https://drive.google.com/file/d/1JY5CkL_MX08zJ8P1FBIeC60OJsuIiyZc/view?usp=sharing)]
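
To read the validation results out of the `log.json` linked above, a short script can help. The sketch below assumes the file follows the usual MMDetection text-logger layout: one JSON object per line, with validation records carrying `"mode": "val"` and metric keys such as `bbox_mAP` and `segm_mAP`. The helper names `read_val_metrics` and `best_epoch` are illustrative and not part of this codebase.

```python
import json
from pathlib import Path


def read_val_metrics(log_path, keys=("bbox_mAP", "segm_mAP")):
    """Collect per-epoch validation metrics from an MMDetection-style log.json.

    Assumption: one JSON object per line, with validation records marked by
    "mode": "val". Lines that do not parse, or lack all requested keys, are skipped.
    """
    records = []
    for line in Path(log_path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate stray non-JSON lines
        if not isinstance(entry, dict) or entry.get("mode") != "val":
            continue
        metrics = {k: entry[k] for k in keys if k in entry}
        if metrics:
            records.append({"epoch": entry.get("epoch"), **metrics})
    return records


def best_epoch(records, key="bbox_mAP"):
    """Return the record with the highest value of `key`, or None if there is none."""
    scored = [r for r in records if key in r]
    return max(scored, key=lambda r: r[key]) if scored else None


if __name__ == "__main__":
    import sys

    # Usage: python read_log.py /path/to/log.json
    results = read_val_metrics(sys.argv[1])
    for rec in results:
        print(rec)
    print("best:", best_epoch(results))
```

If the downloaded log uses different metric names, pass them through the `keys` argument instead of editing the function.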

## Install [MMDetection at commit 6a979e2](https://github.com/SwinTransformer/Swin-Transformer-Object-Detection/tree/6a979e2164e3fb0de0ca2546545013a4d71b2f7d) before fine-tuning ConvNeXt on COCO

We refer to the codebases of [ConvNeXt](https://github.com/facebookresearch/ConvNeXt/tree/048efcea897d999aed302f2639b6270aedf8d4c8) and [Swin-Transformer](https://github.com/SwinTransformer/Swin-Transformer-Object-Detection/tree/6a979e2164e3fb0de0ca2546545013a4d71b2f7d).
Please refer to [README.md](https://github.com/SwinTransformer/Swin-Transformer-Object-Detection/blob/6a979e2164e3fb0de0ca2546545013a4d71b2f7d/README.md) for installation and dataset preparation instructions.

Note that the COCO dataset folder should be at `downstream_mmdet/data/coco`.
The folder should follow the directory structure required by `MMDetection`, which looks like this:

```
downstream_mmdet/data/coco:
    annotations/:
        captions_train2017.json  captions_val2017.json
        instances_train2017.json  instances_val2017.json
        person_keypoints_train2017.json  person_keypoints_val2017.json
    train2017/:
        a_lot_images.jpg
    val2017/:
        a_lot_images.jpg
```

### Training

To train a detector with pre-trained models, run:

```
# single-gpu training
python tools/train.py <CONFIG_FILE> --cfg-options model.pretrained=<PRETRAIN_MODEL> [other optional arguments]

# multi-gpu training
tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> --cfg-options model.pretrained=<PRETRAIN_MODEL> [other optional arguments]
```

For example, to train a Mask R-CNN model with a SparK-pretrained `ConvNeXt-B` backbone on 4 GPUs, run:

```
tools/dist_train.sh configs/convnext_spark/mask_rcnn_convnext_base_patch4_window7_mstrain_480-800_adamw_3x_coco_in1k.py 4 \
  --cfg-options model.pretrained=/some/path/to/official_convnext_base_1kpretrained.pth
```

The Mask R-CNN 3x fine-tuning config file can be found at [`configs/convnext_spark`](configs/convnext_spark).
This config is basically a copy of [https://github.com/facebookresearch/ConvNeXt/blob/main/object_detection/configs/convnext/mask_rcnn_convnext_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco_in1k.py](https://github.com/facebookresearch/ConvNeXt/blob/main/object_detection/configs/convnext/mask_rcnn_convnext_tiny_patch4_window7_mstrain_480-800_adamw_3x_coco_in1k.py).

### Inference

```
# single-gpu testing
python tools/test.py <CONFIG_FILE> <DET_CHECKPOINT_FILE> --eval bbox segm

# multi-gpu testing
tools/dist_test.sh <CONFIG_FILE> <DET_CHECKPOINT_FILE> <GPU_NUM> --eval bbox segm
```

## Acknowledgment

We appreciate these useful codebases:

- [MMDetection](https://github.com/open-mmlab/mmdetection)
- [ConvNeXt](https://github.com/facebookresearch/ConvNeXt)
- [Swin-Transformer-Object-Detection](https://github.com/SwinTransformer/Swin-Transformer-Object-Detection)