Training BERT Text Classifier on Tensor Processing Unit (TPU)

Training Hugging Face's most famous model on a TPU for sentiment analysis of Tunisian Arabizi social media text. Introduction: Arabic speakers usually express themselves in their local dialect on social media, so Tunisians use Tunisian Arabizi, which is Arabic written in the Latin alphabet. Sentiment analysis relies on cultural knowledge and on word sense in context. In this project, we will work with both the Arabizi dialect and sentiment analysis. The competition is hosted on Zindi which […]

Spam Detection – An application of Deep Learning

This article was published as a part of the Data Science Blogathon. Every big tech company wants security and safety for its customers. By detecting spam in emails and messages, these companies aim to secure their networks and strengthen customer trust. Apple's official messaging app and Google's Gmail are strong examples of applications where spam detection and filtering work well to protect users […]

FuzzyWuzzy Python Library: Interesting Tool for NLP and Text Analytics

This article was published as a part of the Data Science Blogathon. Introduction: There are many ways to compare text in Python, but we often look for an easy one. Comparing text is needed for various text analytics and Natural Language Processing purposes. One of the easiest ways of comparing text in Python is the FuzzyWuzzy library, which returns a score out of 100 based on the similarity of the strings. Basically, we are given the similarity […]

Part 3: Topic Modeling and Latent Dirichlet Allocation (LDA) using Gensim and Sklearn

This article was published as a part of the Data Science Blogathon. Overview: In the previous two installments, we covered in detail the common text terms in Natural Language Processing (NLP), what topics are, what topic modeling is, why it is needed, its uses, and the types of models, and we delved into one important technique, Latent Dirichlet Allocation (LDA). In this last leg of the Topic Modeling and LDA series, we shall see how to extract topics through […]

Custom Text Classification on Android using TensorFlow Lite

This article was published as a part of the Data Science Blogathon. Introduction: Many social media platforms now use AI to classify vulgar and offensive posts and automatically take them down. I thought, why not try something similar? So I’ve put together this end-to-end tutorial that will help you build your own corpus for training a text classification model, and later export and deploy it in an Android app for you to use. […]

LSTM for Text Classification in Python

This article was published as a part of the Data Science Blogathon. With the emergence of deep learning, performing complex operations has become faster and easier. As you start exploring the field, you are bound to come across terms like neural networks, recurrent neural networks, LSTM, GRU, etc. This article explains LSTM and its use in text classification. So what is LSTM? And how can it be used? What is LSTM? LSTM stands for Long-Short Term […]

Resume Screening with Natural Language Processing in Python

For each opening, companies take out online ads, collect referrals, and go through the applications manually. Companies often receive thousands of resumes for every posting. When companies collect resumes through online advertisements, they categorize those resumes according to their requirements. After collecting resumes, companies close the advertisements and online application portals. Then they send the collected resumes to the hiring team(s). It becomes very difficult for the hiring teams to read every resume and select the ones that match the requirements; there is […]

Sentiment Analysis using NLTK – A Practical Approach

This article was published as a part of the Data Science Blogathon. Introduction: The ultimate goal of this blog is to predict the sentiment of a given text using Python and NLTK, the Natural Language Toolkit, a Python package made especially for text-based analysis. With a few lines of code, we can easily predict whether a sentence or a review (as used in the blog) is positive or negative. Before moving on to the implementation […]

Automated Spam E-mail Detection Model (Using common NLP tasks)

Hope you are all doing well! Welcome to my blog! Today we are going to learn the basics of NLP with the help of the Email Spam Detection dataset. We will look at some common NLP tasks that are easy to perform and at how to complete an end-to-end project. Whether you know NLP or not, this guide should serve as a ready reference. For the dataset used, click on the above link or here. Let’s get started. Natural Language […]

Text Preprocessing made easy!

This article was published as a part of the Data Science Blogathon. Introduction: We will learn the basics of text preprocessing in this article. Humans communicate using words and so generate a lot of text data for companies in the form of reviews, suggestions, feedback, social media posts, etc. Many valuable insights can be drawn from this text data, so companies apply various machine learning or deep learning models to it to gain actionable insights. Text […]
