A complete suite for training sequence-to-sequence models in PyTorch

This is a complete suite for training sequence-to-sequence models in PyTorch. It includes several models, along with code for training them and for running inference with them.

Using this code you can train:

Neural-machine-translation (NMT) models
Language models
Image to caption generation
Skip-thought sentence representations
And more…

Installation

git clone --recursive https://github.com/eladhoffer/seq2seq.pytorch
cd seq2seq.pytorch; python setup.py develop

Models

Models currently available:

Datasets

Datasets currently available:

All datasets can be tokenized using 3 available segmentation methods:

Character based segmentation
Word based segmentation
Byte-pair encoding (BPE), as suggested by bpe, with a selectable number of tokens.
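
To make the difference between the three segmentation methods concrete, here is a minimal, self-contained sketch in plain Python. It does not use this repository's tokenizer classes; the function names (`char_tokenize`, `word_tokenize`, `learn_bpe`, `bpe_tokenize`) are hypothetical and chosen only for illustration, and the BPE part is a toy version of the merge-learning idea, with `num_merges` standing in for the selectable number of tokens.

```python
from collections import Counter


def char_tokenize(text):
    """Character-based segmentation: every character is a token."""
    return list(text)


def word_tokenize(text):
    """Word-based segmentation: split on whitespace."""
    return text.split()


def apply_merge(symbols, pair):
    """Replace each adjacent occurrence of `pair` with one merged symbol."""
    merged, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return tuple(merged)


def learn_bpe(corpus, num_merges):
    """Learn up to `num_merges` merge rules from a list of sentences."""
    # Each word starts as a tuple of characters, weighted by its frequency.
    vocab = Counter(tuple(word) for line in corpus for word in line.split())
    merges = []
    for _ in range(num_merges):
        # Count how often each adjacent symbol pair occurs in the corpus.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for pair in zip(symbols, symbols[1:]):
                pairs[pair] += freq
        if not pairs:
            break
        # Merge the most frequent pair everywhere it appears.
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[apply_merge(symbols, best)] += freq
        vocab = new_vocab
    return merges


def bpe_tokenize(word, merges):
    """Segment a single word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    corpus = ["low lower lowest", "low low"]
    merges = learn_bpe(corpus, num_merges=2)
    print(char_tokenize("low"))
    print(word_tokenize("low lower lowest"))
    print(bpe_tokenize("lowest", merges))
```

A larger `num_merges` produces longer subword units and a larger vocabulary, so BPE sits between the character-level and word-level extremes.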

After choosing a tokenization method,
