Jittor implementation of Vision Transformer with Deformable Attention

This repository contains a simple implementation of Vision Transformer with Deformable Attention [arXiv].

Currently, we only release the model code; the training scripts, including advanced data augmentation and mixed-precision training, are still under development.

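Until the training scripts land, the released model code can be exercised with a plain forward pass. The snippet below is only a minimal sketch: the import path models.dat, the DAT class name, and the constructor arguments are assumptions for illustration, not the repository's actual API.

    import jittor as jt

    # Hypothetical import path -- adjust to wherever the repository defines the model.
    from models.dat import DAT  # assumption: a DAT classification model class

    jt.flags.use_cuda = jt.has_cuda  # run on GPU if Jittor detects CUDA

    # Constructor arguments are illustrative; consult the model code for the real ones.
    model = DAT(img_size=224, num_classes=1000)
    model.eval()

    # A single dummy 224x224 RGB image.
    x = jt.randn(1, 3, 224, 224)

    with jt.no_grad():
        logits = model(x)

    print(logits.shape)  # for an ImageNet-1k classifier this would be [1, 1000]
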
PyTorch version: GitHub

Dependencies

  • NVIDIA GPU + CUDA 11.1 + cuDNN 8.0.3
  • Python 3.7 (Anaconda is recommended)
  • jittor == 1.3.1.40
  • jimm

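A quick way to confirm that the environment matches the dependencies listed above is the small check below; it is a sketch of a typical Jittor setup rather than a project-specific requirement.

    import jittor as jt

    # Report the installed Jittor version (this repository targets 1.3.1.40).
    print("jittor version:", jt.__version__)

    # Enable CUDA if Jittor detected a usable GPU (CUDA 11.1 + cuDNN 8.0.3 recommended).
    jt.flags.use_cuda = jt.has_cuda
    print("CUDA available:", bool(jt.has_cuda))

    # A tiny computation to confirm the backend works end to end.
    y = jt.randn(2, 3).sum()
    print("sanity check:", y.item())
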
TODO

  • Training scripts with advanced data augmentation.

Citation

If you find our work useful in your research, please consider citing:

@misc{xia2022vision,
    title={Vision Transformer with Deformable Attention},
    author={Zhuofan Xia and Xuran Pan and Shiji Song and Li Erran Li and Gao Huang},
    year={2022},
    eprint={2201.00520},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}