diff --git a/README.md b/README.md
index 4451759..83e983a 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# SparK✨: the first successful BERT-style pre-training on any convolutional nets [![arXiv](https://img.shields.io/badge/arXiv-2301.03580-b31b1b.svg)](https://arxiv.org/abs/2301.03580)
+# SparK✨: the first successful BERT-style pre-training on any convolutional networks [![arXiv](https://img.shields.io/badge/arXiv-2301.03580-b31b1b.svg)](https://arxiv.org/abs/2301.03580)
 
 This is an official implementation of the paper "Designing BERT for Convolutional Networks: ***Spar***se and Hierarchical Mas***k***ed Modeling".
@@ -23,7 +23,7 @@ This is an official implementation of the paper "Designing BERT for Convolutiona

-### 🔥 ConvNeXt gains more from BERT-style pre-training than Swin-Transformer, up to +3.5 points:
+### 🔥 ConvNeXt gains more from BERT-style pre-training than Swin Transformer, up to +3.5 points: