Chimera: Learning Shared Semantic Space for Speech-to-Text Translation

This is a PyTorch implementation of the “Chimera” paper, Learning Shared Semantic Space for Speech-to-Text Translation (accepted to Findings of ACL 2021). Chimera aims to bridge the modality gap by unifying the tasks of MT (textual Machine Translation) and ST (Speech-to-Text Translation). By utilizing an external MT corpus, it achieved new SOTA performance on all 8 language pairs in the MuST-C benchmark.

This repository is currently a nightly version and may contain bugs introduced by ongoing code refactoring. It has also not yet been fully tested on configurations other than the authors’ working environment. Nevertheless, we encourage you to look at the results and the model code first to get a general impression of what this project is about.

The code base is forked from



