Pyramid Vision Transformer With Python

  • (2021/06/21) The code for PVTv2 is released! PVTv2 substantially improves on PVTv1 and outperforms Swin Transformer with ImageNet-1K pre-training.

The image is from Transformers: Revenge of the Fallen.

This repository contains the official implementation of PVTv1 & PVTv2 in image classification, object detection, and semantic segmentation tasks.

Model Zoo

Image Classification

For classification configs and weights, see >>>here<<<.

Method           Size  Acc@1  #Params (M)
PVTv2-B0         224   70.5   3.7
PVTv2-B1         224   78.7   14.0
PVTv2-B2-Linear  224   82.1   22.6
PVTv2-B2         224   82.0   25.4
PVTv2-B3

 
