Gaussian Multi-head Attention for Simultaneous Machine Translation

Source code for our ACL 2022 paper “Gaussian Multi-head Attention for Simultaneous Machine Translation” (PDF)

Our method is implemented on top of the open-source toolkit Fairseq.

The core code of Gaussian Multi-head Attention is in fairseq/modules/gaussian_multihead_attention.py.

Requirements and Installation

  • Python version = 3.6

  • PyTorch version = 1.7

  • Install fairseq:

    git clone https://github.com/ictnlp/GMA.git
    cd GMA
    pip install --editable ./

Quick Start

Data Pre-processing

We use the data of IWSLT15 English-Vietnamese (download here) and WMT15 German-English (download here). For WMT15 German-English, we apply BPE with 32K merge operations via subword_nmt/apply_bpe.py, as sketched below.
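
A minimal sketch of the BPE step, assuming the subword-nmt scripts are available and the tokenized files use hypothetical names such as train.tok.de / train.tok.en (not necessarily the paper's exact setup):

    # Learn a joint BPE model with 32K merge operations
    # (file names below are illustrative assumptions)
    cat train.tok.de train.tok.en | python subword_nmt/learn_bpe.py -s 32000 > codes.bpe

    # Apply the learned codes to each side of every split
    for lang in de en; do
        for split in train valid test; do
            python subword_nmt/apply_bpe.py -c codes.bpe < ${split}.tok.${lang} > ${split}.bpe.${lang}
        done
    done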

Then, we process the data into the fairseq format with fairseq-preprocess.
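
A hedged example of the binarization command, assuming the BPE files produced above and an output directory data-bin/wmt15_de_en (both names are illustrative):

    # Binarize the data into the fairseq format
    # (--trainpref train.bpe expects train.bpe.de / train.bpe.en, matching the sketch above)
    fairseq-preprocess --source-lang de --target-lang en \
        --trainpref train.bpe --validpref valid.bpe --testpref test.bpe \
        --destdir data-bin/wmt15_de_en \
        --workers 16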