All in One: Exploring Unified Video-Language Pre-training

Code for the paper "All in One: Exploring Unified Video-Language Pre-training" (arXiv).


Install

1. PyTorch Lightning

In this work, we use PyTorch Lightning for distributed training with mixed precision.
Install PyTorch and PyTorch Lightning first.

conda create -n allinone python=3.7
source activate allinone
conda install pytorch torchvision torchaudio cudatoolkit=10.2 -c pytorch
cd [Path_To_This_Code]
pip install -r requirements.txt
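
After the commands above finish, a quick sanity check can confirm that the environment is usable. The snippet below is only a sketch and not part of this repository: the helper `missing_packages` is hypothetical, and the import names `torch`, `torchvision`, and `pytorch_lightning` are assumptions about what the install provides (requirements.txt remains the authoritative list).

```python
import importlib.util

# Assumed import names; requirements.txt in the repository is authoritative.
REQUIRED = ("torch", "torchvision", "pytorch_lightning")


def missing_packages(names=REQUIRED):
    """Return the top-level packages in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        # Import torch only once we know it is installed.
        import torch

        print("PyTorch", torch.__version__)
        print("CUDA available:", torch.cuda.is_available())
```

Run it inside the activated allinone environment. If any package is reported missing, re-run the install commands before continuing.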

2. On-the-fly decoding

To speed up pre-training, we decode videos on the fly for fast I/O.
Install ffmpeg and pytorchvideo (used for data augmentation) as shown below.