
Efficient Multi-order Gated Aggregation Network

Abstract

Since the recent success of Vision Transformers (ViTs), explorations toward ViT-style architectures have triggered the resurgence of ConvNets. In this work, we explore the representation ability of modern ConvNets from a novel view of multi-order game-theoretic interaction, which reflects inter-variable interaction effects w.r.t. contexts of different scales based on game theory. Within the modern ConvNet framework, we tailor the two feature mixers with conceptually simple yet effective depthwise convolutions to facilitate middle-order information across spatial and channel spaces respectively. In this light, a new family of pure ConvNet architectures, dubbed MogaNet, is proposed, which shows excellent scalability and attains competitive results among state-of-the-art models with more efficient use of parameters on ImageNet and various typical vision benchmarks, including COCO object detection, ADE20K semantic segmentation, 2D & 3D human pose estimation, and video prediction. Notably, MogaNet hits 80.0% and 87.8% top-1 accuracy with 5.2M and 181M parameters on ImageNet, outperforming ParC-Net-S and ConvNeXt-L while saving 59% FLOPs and 17M parameters. The source code is available at https://github.com/Westlake-AI/MogaNet.
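The abstract describes MogaNet's spatial feature mixer: channels are split into groups, each aggregated by a depthwise convolution at a different dilation (capturing a different interaction order/scale), and the aggregated context is modulated by a learned gate. As a rough illustration only, here is a minimal NumPy sketch of such multi-order gated aggregation; the group ratios, kernel size, dilations, and random weights are illustrative assumptions, not the official MogaNet configuration (see the linked repo for the real implementation).

```python
import numpy as np

def depthwise_conv2d(x, w, dilation=1):
    """Per-channel 2D correlation with 'same' zero padding.

    x: (C, H, W); w: (C, k, k), one k x k kernel per channel.
    """
    C, H, W = x.shape
    k = w.shape[-1]
    pad = dilation * (k - 1) // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    out = np.zeros_like(x)
    for i in range(k):
        for j in range(k):
            di, dj = i * dilation, j * dilation
            out += w[:, i, j][:, None, None] * xp[:, di:di + H, dj:dj + W]
    return out

def pointwise_conv(x, w):
    """1x1 convolution, i.e. channel mixing. x: (C, H, W); w: (C_out, C_in)."""
    return np.einsum('oc,chw->ohw', w, x)

def silu(x):
    return x / (1.0 + np.exp(-x))

def moga_spatial_block(x, rng, ratios=(1, 3, 4), dilations=(1, 2, 3), k=5):
    """Sketch of multi-order gated spatial aggregation (illustrative only)."""
    C = x.shape[0]
    # Split channels into groups; each group is aggregated by a depthwise
    # conv with a different dilation (a different interaction order/scale).
    splits = np.cumsum([C * r // sum(ratios) for r in ratios])[:-1]
    groups = np.split(x, splits, axis=0)
    ctx = []
    for g, d in zip(groups, dilations):
        w = rng.standard_normal((g.shape[0], k, k)) * 0.1
        ctx.append(depthwise_conv2d(g, w, dilation=d))
    context = np.concatenate(ctx, axis=0)
    # Gating branch: a 1x1 conv + SiLU produces a per-position gate.
    gate = silu(pointwise_conv(x, rng.standard_normal((C, C)) * 0.1))
    # Gated aggregation, then a 1x1 projection back to C channels.
    return pointwise_conv(gate * context, rng.standard_normal((C, C)) * 0.1)

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 14, 14))
y = moga_spatial_block(x, rng)
print(y.shape)  # same (C, H, W) shape as the input
```

The key design point this sketch captures is that the block stays purely convolutional: multi-scale context comes from cheap depthwise convolutions at several dilations, while the multiplicative gate plays a role loosely analogous to attention.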

How to use it?

```python
from mmpretrain import inference_model

predict = inference_model('moganet-tiny_3rdparty_8xb128_in1k', 'demo/bird.JPEG')
print(predict['pred_class'])
print(predict['pred_score'])
```

Models and results

Image Classification on ImageNet-1k

| Model | Pretrain | Params (M) | Flops (G) | Top-1 (%) | Top-5 (%) | Config | Download |
| :------------------------------------- | :----------- | ---------: | --------: | --------: | --------: | :----- | :------- |
| `moganet-xtiny_3rdparty_8xb128_in1k`\*  | From scratch | 2.97   | 0.79  | 76.48 | 93.49 | config | model |
| `moganet-tiny_3rdparty_8xb128_in1k`\*   | From scratch | 5.20   | 1.09  | 77.24 | 93.51 | config | model |
| `moganet-small_3rdparty_8xb128_in1k`\*  | From scratch | 25.35  | 4.94  | 83.38 | 96.58 | config | model |
| `moganet-base_3rdparty_8xb128_in1k`\*   | From scratch | 43.72  | 9.88  | 84.20 | 96.77 | config | model |
| `moganet-large_3rdparty_8xb128_in1k`\*  | From scratch | 82.48  | 15.84 | 84.76 | 97.15 | config | model |
| `moganet-xlarge_3rdparty_16xb32_in1k`\* | From scratch | 180.8  | 34.43 | 85.11 | 97.38 | config | model |

Models with \* are converted from the official repo. The config files of these models are only for inference. We haven't reproduced the training results.

Citation

@article{Li2022MogaNet,
  title={Efficient Multi-order Gated Aggregation Network},
  author={Siyuan Li and Zedong Wang and Zicheng Liu and Cheng Tan and Haitao Lin and Di Wu and Zhiyuan Chen and Jiangbin Zheng and Stan Z. Li},
  journal={ArXiv},
  year={2022},
  volume={abs/2211.03295}
}