FasySeq

A fast and easy implementation of the Transformer with PyTorch

FasySeq is shorthand for "Fast and easy sequential modeling toolkit". It aims to provide researchers and developers with seq2seq models that can be trained efficiently and modified easily. The toolkit is currently based on the Transformer (Vaswani et al., 2017); more seq2seq models will be added in the future.
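To make the seq2seq setup concrete, here is a minimal sketch of a Transformer encoder-decoder built from PyTorch's `nn.Transformer`. This is an illustration of the general architecture (Vaswani et al., 2017), not FasySeq's actual model code; all class and parameter names below are invented for the example.

```python
import torch
import torch.nn as nn

class Seq2SeqTransformer(nn.Module):
    """Toy seq2seq Transformer: embed -> nn.Transformer -> vocab logits."""

    def __init__(self, vocab_size, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True,
        )
        self.out = nn.Linear(d_model, vocab_size)  # project back to vocabulary

    def forward(self, src_ids, tgt_ids):
        # Causal mask so each target position attends only to earlier positions.
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt_ids.size(1))
        hidden = self.transformer(self.embed(src_ids), self.embed(tgt_ids),
                                  tgt_mask=tgt_mask)
        return self.out(hidden)  # per-token logits over the vocabulary

model = Seq2SeqTransformer(vocab_size=100)
src = torch.randint(0, 100, (2, 7))  # batch of 2 source sequences, length 7
tgt = torch.randint(0, 100, (2, 5))  # corresponding target prefixes, length 5
logits = model(src, tgt)             # shape: (batch, tgt_len, vocab_size)
```

Training then amounts to cross-entropy between `logits` and the shifted target tokens, which is the standard Transformer training objective.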

Dependency

PyTorch >= 1.4
NLTK

Result

Structure

To Be Updated

  • top-k and top-p sampling
  • multi-GPU inference
  • length penalty in beam search
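Since top-k and top-p (nucleus) sampling are on the roadmap, the sketch below shows what these decoding strategies do; it is a plain-Python illustration of the technique, not FasySeq's planned implementation, and the function name is invented for the example.

```python
import math
import random

def top_k_top_p_sample(logits, k=0, p=1.0, rng=random):
    """Sample a token index after top-k and/or top-p (nucleus) truncation."""
    # Numerically stable softmax over the raw logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = sorted(((i, e / total) for i, e in enumerate(exps)),
                   key=lambda ip: ip[1], reverse=True)

    if k > 0:            # top-k: keep only the k most likely tokens
        probs = probs[:k]
    if p < 1.0:          # top-p: smallest prefix whose mass reaches p
        kept, mass = [], 0.0
        for i, pr in probs:
            kept.append((i, pr))
            mass += pr
            if mass >= p:
                break
        probs = kept

    # Renormalize the truncated distribution and draw one sample.
    mass = sum(pr for _, pr in probs)
    r = rng.random() * mass
    for i, pr in probs:
        r -= pr
        if r <= 0:
            return i
    return probs[-1][0]
```

With `k=1` (or a very small `p`) this degenerates to greedy decoding; larger values trade determinism for diversity.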

Preprocess

Build Vocabulary

createVocab.py

| Named Arguments | Description |
| --- | --- |
| `-f/--file` | The files used to build the vocabulary. Type: List |
| `--vocab_num` | The maximum vocabulary size; words beyond it are discarded by frequency. Type: Int, Default: -1 |
| `--min_freq` | The minimum frequency a token needs to be kept in the vocabulary. |
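The interaction of `--vocab_num` and `--min_freq` can be sketched as follows. This is an illustrative reimplementation under assumed behavior (whitespace tokenization, frequency-ranked truncation), not the actual code of `createVocab.py`.

```python
from collections import Counter

def build_vocab(files, vocab_num=-1, min_freq=1):
    # Count token frequencies across all input files
    # (whitespace tokenization, assumed for illustration).
    counter = Counter()
    for path in files:
        with open(path, encoding="utf-8") as f:
            for line in f:
                counter.update(line.split())
    # Drop tokens rarer than min_freq, then keep the vocab_num most
    # frequent tokens (vocab_num = -1 means no size limit).
    tokens = [t for t, c in counter.most_common() if c >= min_freq]
    if vocab_num > 0:
        tokens = tokens[:vocab_num]
    return tokens
```

For example, on a corpus where "a" appears three times, "b" twice, and "c" once, `vocab_num=2` keeps `["a", "b"]`, and `min_freq=2` drops `"c"`.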
