## About code isolation

This `downstream_imagenet` is isolated from the pre-training code. One can treat this `downstream_imagenet` as an independent codebase 🛠️.

## Preparation for ImageNet-1k fine-tuning

See [INSTALL.md](https://github.com/keyu-tian/SparK/blob/main/INSTALL.md) to prepare `pip` dependencies and the ImageNet dataset.

**Note: for network definitions, we directly use `timm.models.ResNet` and [official ConvNeXt](https://github.com/facebookresearch/ConvNeXt/blob/048efcea897d999aed302f2639b6270aedf8d4c8/models/convnext.py).**

## Fine-tuning on ImageNet-1k from pre-trained weights

Run [downstream_imagenet/main.sh](https://github.com/keyu-tian/SparK/blob/main/downstream_imagenet/main.sh).
It is **required** to specify your experiment name `<experiment_name>`, ImageNet data folder `--data_path`, model name `--model`, and checkpoint file path `--resume_from` to run fine-tuning.

All the other configurations have their default values, listed in [downstream_imagenet/arg.py#L13](https://github.com/keyu-tian/SparK/blob/main/downstream_imagenet/arg.py#L13). You can override any defaults by passing keyword arguments (like `--bs=2048`) to `main.sh`.

Here is an example command fine-tuning a ResNet50 on a single machine with 8 GPUs:

```shell script
$ cd /path/to/SparK/downstream_imagenet
$ bash ./main.sh <experiment_name> \
  --num_nodes=1 --ngpu_per_node=8 \
  --data_path=/path/to/imagenet \
  --model=resnet50 --resume_from=/some/path/to/timm_resnet50_1kpretrained.pth
```

For multiple machines, change `--num_nodes` to your machine count and add these arguments:

```shell script
--node_rank=<node_rank> --master_address=<master_address> --master_port=<master_port>
```

Note that the first argument `<experiment_name>` is the name of your experiment, which will be used to create an output directory named `output_<experiment_name>`.

## Logging

Once an experiment starts running, the following files will be automatically created and updated in `output_<experiment_name>`:

- `<model>_1kfinetuned_last.pth`: the latest model weights
- `<model>_1kfinetuned_best.pth`: model weights with the highest accuracy
- `<model>_1kfinetuned_best_ema.pth`: EMA weights with the highest accuracy
- `finetune_log.txt`: records some important information such as:
  - `git_commit_id`: git version
  - `cmd`: all arguments passed to the script

  It also reports the training loss/accuracy, the best evaluation accuracy, and the remaining time at each epoch.

These files can help you trace the experiment well.

## Resuming

Add `--resume_from=path/to/<model>_1kfinetuned_last.pth` to resume from the latest saved checkpoint.
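For example, here is a minimal sketch of a resume command, assuming a single-machine ResNet50 run whose experiment name was `resnet50_ft` (the experiment name and all paths are placeholders; substitute your own):

```shell script
$ cd /path/to/SparK/downstream_imagenet
# same command as the original run, but --resume_from now points at the
# latest checkpoint saved inside the experiment's own output directory
$ bash ./main.sh resnet50_ft \
  --num_nodes=1 --ngpu_per_node=8 \
  --data_path=/path/to/imagenet \
  --model=resnet50 \
  --resume_from=output_resnet50_ft/resnet50_1kfinetuned_last.pth
```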
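For multi-machine runs, the same command (whether starting fresh or resuming) extends with the rank/address arguments described earlier. Below is a hedged sketch of a two-machine launch; the experiment name `resnet50_2node`, the address `10.0.0.1`, and the port `30001` are all placeholder values:

```shell script
# on machine 0 (the one reachable at --master_address):
$ bash ./main.sh resnet50_2node --num_nodes=2 --ngpu_per_node=8 \
  --node_rank=0 --master_address=10.0.0.1 --master_port=30001 \
  --data_path=/path/to/imagenet \
  --model=resnet50 --resume_from=/some/path/to/timm_resnet50_1kpretrained.pth

# on machine 1 (identical arguments except --node_rank):
$ bash ./main.sh resnet50_2node --num_nodes=2 --ngpu_per_node=8 \
  --node_rank=1 --master_address=10.0.0.1 --master_port=30001 \
  --data_path=/path/to/imagenet \
  --model=resnet50 --resume_from=/some/path/to/timm_resnet50_1kpretrained.pth
```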