Attention-Based Grapheme-to-Phoneme with Python
G2P
The G2P algorithm generates the most probable pronunciation for a word that is not in the pronunciation lexicon. It can be used as a preprocessing step in a text-to-speech system to produce pronunciations for out-of-vocabulary (OOV) words.
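As a minimal sketch of where G2P sits in such a pipeline, the snippet below looks a word up in a lexicon first and only falls back to a G2P function for OOV words. The names `LEXICON`, `spell_out`, and `pronounce` are hypothetical and not part of this repository, and `spell_out` is only a placeholder for a trained model.

```python
from typing import Callable, Dict, List

# Toy lexicon; the entries and ARPAbet-style symbols are illustrative only.
LEXICON: Dict[str, List[str]] = {
    "cat": ["K", "AE", "T"],
    "dog": ["D", "AO", "G"],
}


def spell_out(word: str) -> List[str]:
    """Hypothetical stand-in for a trained G2P model: one symbol per letter."""
    return [ch.upper() for ch in word]


def pronounce(
    word: str,
    lexicon: Dict[str, List[str]],
    g2p: Callable[[str], List[str]] = spell_out,
) -> List[str]:
    """Return the lexicon entry if present, otherwise ask the G2P fallback."""
    key = word.lower()
    if key in lexicon:
        return lexicon[key]
    # Out-of-vocabulary word: predict a pronunciation instead of failing.
    return g2p(key)


print(pronounce("cat", LEXICON))    # found in the lexicon
print(pronounce("blorf", LEXICON))  # OOV, handled by the fallback
```

In a real system the `g2p` argument would wrap the trained attention model rather than a letter-by-letter placeholder.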
Dependencies
The following libraries are used:
- pytorch
- tqdm
- matplotlib
Install dependencies using pip:
pip3 install -r requirements.txt
Dataset
Currently the following languages are supported:
- EN: English
- FA: Farsi
- RU: Russian
You can provide and use your own language-specific pronunciation dictionary for training G2P. More details about data preparation and contribution can be found in resources.
Feel free to provide resources for other languages.
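To illustrate what preparing a pronunciation dictionary might involve, here is a small sketch that parses word/phoneme lines into a Python dict. The line format (a word followed by space-separated phonemes, with `;;;` comment lines, as in CMUdict-style files) is an assumption and may differ from the format this repository expects. `parse_lexicon` and `SAMPLE` are hypothetical names.

```python
from typing import Dict, Iterable, List


def parse_lexicon(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Parse 'WORD PH1 PH2 ...' lines (assumed format) into a dict."""
    lexicon: Dict[str, List[str]] = {}
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and comment lines (assumed to start with ';;;').
        if not line or line.startswith(";;;"):
            continue
        word, *phonemes = line.split()
        if not phonemes:
            continue  # a word without a pronunciation is not usable
        # Keep the first pronunciation seen for each word.
        lexicon.setdefault(word.lower(), phonemes)
    return lexicon


SAMPLE = [
    ";;; illustrative entries only",
    "HELLO HH AH L OW",
    "WORLD W ER L D",
    "",
]
lexicon = parse_lexicon(SAMPLE)
print(lexicon)
```

Reading from a file would only require passing the open file object, since it also iterates over lines.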
Attention Model
Both a plain encoder-decoder seq2seq model and an attention model can handle the G2P problem. Here we train an attention-based model.
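The difference between the two can be sketched in a few lines: instead of relying on a single fixed encoder summary, an attention decoder scores every encoder state at each step and builds a weighted context vector. The dependency-free sketch below assumes dot-product scoring, which may not be the scoring function used in this repository. `softmax` and `attend` are hypothetical helper names, and the vectors are made-up numbers.

```python
import math
from typing import List, Tuple

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    decoder_state: Vector, encoder_states: List[Vector]
) -> Tuple[List[float], Vector]:
    """Dot-product attention: return (weights, context vector)."""
    # One score per input position (per grapheme in the G2P setting).
    scores = [
        sum(d * e for d, e in zip(decoder_state, enc)) for enc in encoder_states
    ]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    # Context is the attention-weighted sum of the encoder states.
    context = [
        sum(w * enc[i] for w, enc in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # one vector per grapheme
decoder_state = [0.0, 2.0]
weights, context = attend(decoder_state, encoder_states)
print(weights, context)
```

In a PyTorch model the same computation would typically be expressed with batched tensor operations, and the context vector would be fed to the decoder together with the previously emitted phoneme.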

The