
## About code isolation

This `downstream_imagenet` folder is isolated from the pre-training code, so you can treat it as an independent codebase 🛠.

## Preparation for ImageNet-1k fine-tuning

See INSTALL.md to prepare pip dependencies and the ImageNet dataset.
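If you only need the Python packages, a minimal sketch of the setup (assuming the `requirements.txt` in this folder covers the pip dependencies that INSTALL.md refers to):

```shell
$ cd /path/to/SparK/downstream_imagenet
$ pip install -r requirements.txt
```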

Note: for the network definitions, we directly use `timm.models.ResNet` and the official ConvNeXt implementation.

## Fine-tuning on ImageNet-1k from pre-trained weights

Run `downstream_imagenet/main.sh`. To run fine-tuning you must specify your experiment name `<experiment_name>`, the ImageNet data folder `--data_path`, the model name `--model`, and the checkpoint file path `--resume_from`. All other configurations have default values, listed in `downstream_imagenet/arg.py#L13`; you can override any of them by passing keyword arguments (like `--bs=2048`) to `main.sh`.

Here is an example command for fine-tuning a ResNet-50 on a single machine with 8 GPUs:

```shell
$ cd /path/to/SparK/downstream_imagenet
$ bash ./main.sh <experiment_name> \
  --num_nodes=1 --ngpu_per_node=8 \
  --data_path=/path/to/imagenet \
  --model=resnet50 --resume_from=/some/path/to/timm_resnet50_1kpretrained.pth
```
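To override one of the defaults from `arg.py`, just append the corresponding keyword argument; for example, with the `--bs=2048` batch-size override mentioned above:

```shell
$ bash ./main.sh <experiment_name> --num_nodes=1 --ngpu_per_node=8 \
  --data_path=/path/to/imagenet --model=resnet50 \
  --resume_from=/some/path/to/timm_resnet50_1kpretrained.pth --bs=2048
```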

For multiple machines, change `--num_nodes` to the number of machines, and add these arguments:

```shell
--node_rank=<rank_starts_from_0> --master_address=<some_address> --master_port=<some_port>
```
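For example, a two-machine (16-GPU) run might look like the sketch below, where the address and port values are hypothetical placeholders and the command is launched once on each machine:

```shell
# on machine 0 (the address and port below are hypothetical placeholders):
$ bash ./main.sh <experiment_name> \
  --num_nodes=2 --ngpu_per_node=8 \
  --node_rank=0 --master_address=192.168.0.1 --master_port=30000 \
  --data_path=/path/to/imagenet \
  --model=resnet50 --resume_from=/some/path/to/timm_resnet50_1kpretrained.pth

# on machine 1: the same command, but with --node_rank=1
```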

Note that the first argument `<experiment_name>` is the name of your experiment; it will be used to create an output directory named `output_<experiment_name>`.

## Logging

Once an experiment starts running, the following files will be automatically created and updated in `output_<experiment_name>`:

- `<model>_1kfinetuned_last.pth`: the latest model weights

- `<model>_1kfinetuned_best.pth`: the model weights with the highest accuracy

- `<model>_1kfinetuned_best_ema.pth`: the EMA model weights with the highest accuracy

- `finetune_log.txt`: records important information such as:

  - `git_commit_id`: the git version
  - `cmd`: all arguments passed to the script

  It also reports the training loss/accuracy, the best evaluation accuracy, and the remaining time at each epoch.

These files make it easy to trace an experiment.
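For instance, you can follow the log from another shell while training runs (a standard `tail` invocation, using the file named above):

```shell
$ tail -f output_<experiment_name>/finetune_log.txt
```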

## Resuming

Add `--resume_from=path/to/<model>_1kfinetuned_last.pth` to resume from the latest saved checkpoint.
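A full resuming command would then look like this (a sketch assuming the single-machine ResNet-50 setup from above, whose last checkpoint lands in `output_<experiment_name>`):

```shell
$ bash ./main.sh <experiment_name> \
  --num_nodes=1 --ngpu_per_node=8 \
  --data_path=/path/to/imagenet \
  --model=resnet50 \
  --resume_from=output_<experiment_name>/resnet50_1kfinetuned_last.pth
```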