Beginner’s Guide To Text Classification Using PyCaret

Introduction Have you ever solved a Machine Learning problem in just one go? Solving a problem using machine learning isn’t straightforward. It involves various steps to come up with an accurate solution. The process/steps to be followed for solving an ml problem is known as ML Pipeline/ML Cycle. ML Pipeline/ ML Cycle (Credits: https://medium.com/analytics-vidhya/machine-learning-development-life-cycle-dfe88c44222e) As shown in the figure, the Machine Learning pipeline consists of different steps like: Understand Problem Statement, Hypothesis Generation, Exploratory Data Analysis, Data Preprocessing, Feature Engineering, […]

Read more

Practical Guide to Word Embedding System

This article was published as a part of the Data Science Blogathon Pre-requisites – Basic knowledge of Python – Understanding of basics of NLP(Natural Language Processing)   Introduction In natural language processing, word embedding is used for the representation of words for Text Analysis, in the form of a vector that performs the encoding of the meaning of the word such that the words which are closer in that vector space are expected to have similar in mean. Consider, boy-men vs […]

Read more

Word2Vec For Word Embeddings -A Beginner’s Guide

This article was published as a part of the Data Science Blogathon Why are word embeddings needed? Let us consider the two sentences – “You can scale your business.” and “You can grow your business.”. These two sentences have the same meaning. If we consider a vocabulary considering these two sentences, it will constitute of these words: {You, can, scale, grow, your, business}. A one-hot encoding of these words would create a vector of length 6. The encodings for each of […]

Read more

Part 6: Step by Step Guide to Master NLP – Word2Vec

This article was published as a part of the Data Science Blogathon Introduction This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article of this series, we completed the statistical or frequency-based word embedding techniques, which are pre-word embedding era techniques. So, in this article, we will discuss the recent word-era embedding techniques. NOTE: In recent word-era embedding, there are many such techniques but in this article, we will discuss only the Word2Vec […]

Read more

Part 5: Step by Step Guide to Master NLP – Word Embedding and Text Vectorization

This article was published as a part of the Data Science Blogathon Introduction This article is part of an ongoing blog series on Natural Language Processing (NLP). Up to the previous part of this article series, we almost completed the necessary steps involved in text cleaning and normalization pre-processing. After that, we will convert the processed text to numeric feature vectors so that we can feed it to computers for Machine Learning applications. NOTE: Some concepts included in the pipeline of […]

Read more

Part 8: Step by Step Guide to Master NLP – Useful Natural Language Processing Tasks

This article was published as a part of the Data Science Blogathon Introduction This article is part of an ongoing blog series on Natural Language Processing (NLP). Up to part-7 of this series, we completed the most useful concepts in NLP. While going away in this series, let’s first discuss some of the useful tasks of NLP so that you have much clarity about what you can do by learning the NLP. After this part, we will start our discussion on […]

Read more

Part- 6: Step by Step Guide to Master Natural Language Processing (NLP) in Python

This article was published as a part of the Data Science Blogathon Introduction This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article of this series, we completed the statistical or frequency-based word embedding techniques, which are pre-word embedding era techniques. So, in this article, we will discuss the recent word-era embedding techniques. NOTE: In recent word-era embedding, there are many such techniques but in this article, we will discuss only the Word2Vec […]

Read more

Multilingualism in Natural Language Processing: Targeting Low Resource Indian Languages

Introduction A language is a systematic form of communication that can take a variety of forms. There are approximately 7,000 languages believed to be spoken across the globe. Despite this diversity, the majority of the world’s population speaks only a fraction of these languages. In Spite of such a rich diversity Languages are still evolving across time much like the society we live in. While the English language is uniform, having the distinct status of being the official language of […]

Read more

Multilingualism in Natural Language Processing targeting low resource Indian languages

Introduction Language is a systematic form of communication that can take a variety of forms. There are approximately 7,000 languages believed to be spoken across the globe. Despite this diversity, the majority of the world’s population speaks only a fraction of these languages. In Spite of such a rich diversity Languages are still evolving across time much like the society we live in. While the English language is uniform, having the distinct status of being the official language of multiple […]

Read more

A Step-by-Step NLP Guide to Learn ELMo for Extracting Features from Text

Introduction I work on different Natural Language Processing (NLP) problems (the perks of being a data scientist!). Each NLP problem is a unique challenge in its own way. That’s just a reflection of how complex, beautiful and wonderful the human language is. But one thing has always been a thorn in an NLP practitioner’s mind is the inability (of machines) to understand the true meaning of a sentence. Yes, I’m talking about context. Traditional NLP techniques and frameworks were great when […]

Read more
1 2