A Gentle Introduction to MuRIL: Multilingual Representations for Indian Languages

This article was published as a part of the Data Science Blogathon

“MuRIL is a starting point of what we believe can be the next big evolution for Indian language understanding. We hope it will prove to be a better foundation for researchers, startups, students, and anyone else interested in building Indian language technologies,” said Partha Talukdar, Research Scientist, Google Research India.

What is MuRIL?

MuRIL, short for Multilingual Representations for Indian Languages, is a free and open-source machine learning tool designed specifically for Indian languages. Google Research India launched it in 2020. It provides a common framework for building technologies in local, vernacular languages.

Which languages does it support?

MuRIL currently supports the following 17 languages:

  1. Assamese
  2. Bengali
  3. English
  4. Gujarati
  5. Hindi
  6. Kannada
  7. Kashmiri
  8. Malayalam
  9. Marathi
  10. Nepali
  11. Oriya
  12. Punjabi
  13. Sanskrit
  14. Sindhi
  15. Tamil
  16. Telugu
  17. Urdu
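
When preparing text for MuRIL, it can help to check up front whether a document's language is one of the 17 listed above. The snippet below is a minimal sketch of such a check: the lookup table uses standard ISO 639-1 codes, while the names `MURIL_LANGUAGES` and `is_supported` are hypothetical helpers chosen for illustration and are not part of MuRIL itself.

```python
# ISO 639-1 codes for the 17 languages listed above.
# This table and the helper below are illustrative assumptions,
# not an API shipped with MuRIL.
MURIL_LANGUAGES = {
    "as": "Assamese",
    "bn": "Bengali",
    "en": "English",
    "gu": "Gujarati",
    "hi": "Hindi",
    "kn": "Kannada",
    "ks": "Kashmiri",
    "ml": "Malayalam",
    "mr": "Marathi",
    "ne": "Nepali",
    "or": "Oriya",
    "pa": "Punjabi",
    "sa": "Sanskrit",
    "sd": "Sindhi",
    "ta": "Tamil",
    "te": "Telugu",
    "ur": "Urdu",
}


def is_supported(lang_code: str) -> bool:
    """Return True if the ISO 639-1 code is one of the 17 listed languages."""
    return lang_code.strip().lower() in MURIL_LANGUAGES


if __name__ == "__main__":
    for code in ("ta", "HI", "fr"):
        print(code, is_supported(code))
```

A check like this could sit in front of a data pipeline so that text in unsupported languages is filtered out or routed elsewhere before it reaches the model.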

Why MuRIL?