# SparK: the first successful BERT/MAE-style pretraining on any convolutional networks [](https://www.reddit.com/r/MachineLearning/comments/10ix0l1/r_iclr2023_spotlight_the_first_bertstyle/) [](https://twitter.com/keyutian/status/1616606179144380422)
This is the official implementation of the ICLR paper [Designing BERT for Convolutional Networks: ***Spar***se and Hierarchical Mas***k***ed Modeling](https://arxiv.org/abs/2301.03580).
We've tried our best to keep the codebase clean, short, easy to read, state-of-the-art, and reliant on only minimal dependencies.