Neural Lexicon Reader: Reduce Pronunciation Errors in End-to-end TTS by Leveraging External Textual Knowledge
This is an implementation of the paper, along with the pipeline and pretrained model using an open dataset. Audio samples of the paper is available here. This open pipeline uses the Databaker dataset. Please refer to our previous pipeline for dataset preprocessing, while only the Databaker dataset is used. Besides, you need to run lexicon/build_databaker.py to build the vocabulary, download the lexicon from zdic.net, and encode them with XLM-R. Feel free to change the target directory to save the data, […]
Read more