Training BERT Text Classifier on Tensor Processing Unit (TPU)

Training Hugging Face's most famous model on a TPU for sentiment analysis of Tunisian Arabizi social media text.   Introduction Arabic speakers usually express themselves in a local dialect on social media, so Tunisians use Tunisian Arabizi, which consists of Arabic written in the Latin alphabet. Sentiment analysis relies on cultural knowledge and word sense along with contextual information. In this project, we will bring together the Arabizi dialect and sentiment analysis to solve the problem. The competition is hosted on Zindi, which […]
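
The article's full notebook isn't shown in this excerpt, but the usual TPU setup in TensorFlow looks like the minimal sketch below; the multilingual checkpoint and binary label count are assumptions for the Arabizi sentiment task, not necessarily the author's exact choices.

```python
# Minimal sketch of the usual TPU setup in TensorFlow 2.x (e.g. on Kaggle/Colab).
# The checkpoint and label count are assumptions, not the author's exact choices.
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification

# Locate and initialize the attached TPU, then build a distribution strategy.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

# Variables created inside the strategy scope are replicated across TPU cores.
with strategy.scope():
    model = TFAutoModelForSequenceClassification.from_pretrained(
        "bert-base-multilingual-cased",  # assumed checkpoint
        num_labels=2,                    # positive / negative sentiment
    )
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
# model.fit(...) then proceeds as usual, with a TPU-friendly batch size.
```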

Read more

Amazon Product Review Sentiment Analysis using BERT

This article was published as a part of the Data Science Blogathon Introduction Natural Language Processing, a sub-field of machine learning, has gained immense popularity over the last 5 years in both research and industrial applications, thanks to advances in deep learning and in the computational power of hardware systems. It is a set of techniques that lets computers understand how human languages work, drawing on computational linguistics and computer science. In recent years, […]
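
As a taste of what the article covers, here is a minimal sketch using the Hugging Face pipeline API; the nlptown checkpoint is a common choice for star-rated review sentiment, not necessarily the article's exact setup.

```python
# Minimal sketch of review sentiment with the Hugging Face pipeline API.
# The nlptown checkpoint is an assumed, commonly used choice.
from transformers import pipeline

classifier = pipeline(
    "sentiment-analysis",
    model="nlptown/bert-base-multilingual-uncased-sentiment",  # predicts 1-5 stars
)

review = "This blender is sturdy and easy to clean. Worth every penny."
print(classifier(review))  # e.g. [{'label': '5 stars', 'score': 0.87}]
```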

Read more

A Gentle Introduction to MuRIL: Multilingual Representations for Indian Languages

This article was published as a part of the Data Science Blogathon “MuRIL is a starting point of what we believe can be the next big evolution for Indian language understanding. We hope it will prove to be a better foundation for researchers, startups, students, and anyone else interested in building Indian language technologies” said Partha Talukdar, Research Scientist, Google Research India. What is MuRIL? MuRIL, short for Multilingual Representations for Indian Languages, is none other than a free and open-source […]
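
For a quick taste, MuRIL's published base checkpoint loads like any other BERT variant; this sketch assumes the google/muril-base-cased checkpoint on the Hugging Face Hub.

```python
# Hedged sketch: loading MuRIL, assuming the google/muril-base-cased checkpoint.
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("google/muril-base-cased")
model = AutoModel.from_pretrained("google/muril-base-cased")

# MuRIL covers Indian languages in native script as well as transliteration.
inputs = tokenizer("भारत एक विशाल देश है", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (1, seq_len, 768) for the base model
```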

Read more

Why and how to use BERT for NLP Text Classification?

This article was published as a part of the Data Science Blogathon Introduction NLP, or Natural Language Processing, is an exponentially growing field. In the “new normal” imposed by COVID-19, a significant proportion of educational material, news, and discussion passes through digital media platforms. This makes more text data available to work with! Originally, simple RNNs (Recurrent Neural Networks) were used for training on text data. But in recent years, many new research publications have delivered state-of-the-art results. One of […]
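
A minimal fine-tuning sketch with the Hugging Face Trainer is shown below; the IMDB dataset, the 2,000-example subset, and the hyperparameters are illustrative stand-ins, not the article's recipe.

```python
# Illustrative BERT text-classification fine-tuning with the HF Trainer.
# Dataset, subset size, and hyperparameters are stand-ins, not the article's.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

dataset = load_dataset("imdb")  # stand-in binary classification corpus
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Pad/truncate so every example has the same fixed length.
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)

dataset = dataset.map(tokenize, batched=True)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

args = TrainingArguments(output_dir="bert-clf",
                         per_device_train_batch_size=16,
                         num_train_epochs=2)
trainer = Trainer(model=model, args=args,
                  train_dataset=dataset["train"].shuffle(seed=42).select(range(2000)))
trainer.train()
```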

Read more

All You Need to Know About BERT

This article was published as a part of the Data Science Blogathon Introduction Machines understand language through language representations. These language representations take the form of vectors of real numbers. Proper language representation is necessary for a machine to understand language well. Language representations are of two types: (i) context-free language representations such as GloVe and Word2vec, where the embedding for each token in the vocabulary is constant and does not depend on the context of the word. […]
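
The contrast the excerpt draws can be seen directly in code: BERT assigns the same word different vectors in different contexts, something a static GloVe/Word2vec lookup table cannot do. A minimal sketch, assuming the bert-base-uncased checkpoint:

```python
# Show that BERT's embedding for "bank" depends on the surrounding sentence.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embedding_of(sentence, word):
    """Return the contextual vector BERT produces for `word` in `sentence`."""
    inputs = tokenizer(sentence, return_tensors="pt")
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, 768)
    return hidden[tokens.index(word)]

a = embedding_of("I deposited cash at the bank.", "bank")
b = embedding_of("We picnicked on the river bank.", "bank")
print(torch.cosine_similarity(a, b, dim=0))  # noticeably below 1.0
```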

Read more

Measuring Text Similarity Using BERT

This article was published as a part of the Data Science Blogathon BERT is impressively versatile, so this article will be touching on BERT and sequence relationships! Abstract A significant portion of NLP relies on similarity in high-dimensional spaces. Typically, an NLP pipeline takes some text, processes it to produce a large vector/array representing that text, and then performs certain transformations. It is high-dimensional magic. At a high level, there is not much more to it. We need to […]
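
A minimal sketch of the idea: mean-pool BERT's token embeddings into one vector per sentence, then compare the vectors with cosine similarity. The article itself may use different pooling or the sentence-transformers library.

```python
# Sentence similarity via mean-pooled BERT embeddings; an illustrative sketch,
# not necessarily the article's exact pooling strategy.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)            # mean-pool to a 768-d vector

a = embed("A man is playing a guitar.")
b = embed("Someone performs on an acoustic guitar.")
print(torch.cosine_similarity(a, b, dim=0))  # closer to 1.0 = more similar
```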

Read more

BERT for Natural Language Inference simplified in Pytorch!

This article was published as a part of the Data Science Blogathon Introduction to BERT: BERT stands for Bidirectional Encoder Representations from Transformers. It was introduced in 2018 by Google researchers. BERT achieved state-of-the-art performance on most of the NLP tasks of the time and drew the attention of the data science community worldwide. It is extensively used today by data science practitioners for various NLP tasks. Details about the working of the BERT model can be found here. Introduction to […]
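
NLI with BERT amounts to three-way classification over a [CLS] premise [SEP] hypothesis [SEP] pair. A minimal sketch follows; the freshly added 3-label head is untrained here and would still need fine-tuning on an NLI corpus such as SNLI or MNLI.

```python
# Hedged sketch: NLI framed as 3-way sentence-pair classification.
# The 3-label head below is randomly initialized and still needs fine-tuning.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)  # entailment / neutral / contradiction

premise = "A man is riding a horse through a field."
hypothesis = "A person is outdoors on an animal."

# BERT encodes the pair as [CLS] premise [SEP] hypothesis [SEP].
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # probabilities over the 3 NLI labels
```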

Read more

Summarize Twitter Live data using Pretrained NLP models

Introduction Twitter users spend an average of 4 minutes on Twitter, and roughly 1 of those minutes is spent re-reading the same content; in other words, users spend around 25% of their time reading the same stuff. Also, most tweets never appear on your dashboard: you may get to know the trending topics, but you miss the topics that are not trending. And within trending topics, you might only read the top 5 tweets and their comments. So, what are […]
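
A minimal sketch of the approach: collect related tweets, concatenate them into one document, and feed that to a pretrained seq2seq summarizer. The BART checkpoint is an illustrative choice, not necessarily the article's.

```python
# Summarize a batch of collected tweets with a pretrained seq2seq model.
# The checkpoint and the sample tweets are illustrative assumptions.
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

tweets = [
    "Big product launch today, the keynote starts at 10am PT.",
    "The new model reportedly doubles battery life over last year's.",
    "Pre-orders open Friday, with shipping expected next month.",
]
# Concatenate related tweets into one document before summarizing.
summary = summarizer(" ".join(tweets), max_length=40, min_length=10)
print(summary[0]["summary_text"])
```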

Read more

Demystifying BERT: A Comprehensive Guide to the Groundbreaking NLP Framework

Overview Google’s BERT has transformed the Natural Language Processing (NLP) landscape. Learn what BERT is, how it works, and the seismic impact it has made, among other things. We’ll also implement BERT in Python to give you a hands-on learning experience.   Introduction to the World of BERT Picture this: you’re working on a really cool data science project and have applied the latest state-of-the-art library to get a pretty good result. And boom! A few days later, there’s a […]

Read more

A Complete List of Important Natural Language Processing Frameworks you should Know (NLP Infographic)

Overview Here’s a list of the most important Natural Language Processing (NLP) frameworks of the last two years that you need to know. From Google AI’s Transformer to Facebook Research’s XLM/mBERT, we chart the rise of NLP through the lens of these seismic breakthroughs.   Introduction Have you heard about the latest Natural Language Processing framework that was released recently? I don’t blame you if you’re still catching up with the superb StanfordNLP library or the PyTorch-Transformers framework! There has been […]

Read more