August 20, 2021 Advanced, Classification, Libraries, NLP, Project, Python, Text, Topic Modeling, Word Embeddings

Beginner’s Guide To Text Classification Using PyCaret

Introduction Have you ever solved a machine learning problem in just one go? Solving a problem with machine learning isn’t straightforward; it takes several steps to arrive at an accurate solution. The steps followed to solve an ML problem are known as the ML pipeline or ML cycle. Figure: ML Pipeline/ML Cycle (Credits: https://medium.com/analytics-vidhya/machine-learning-development-life-cycle-dfe88c44222e) As shown in the figure, the machine learning pipeline consists of steps such as: Understand Problem Statement, Hypothesis Generation, Exploratory Data Analysis, Data Preprocessing, Feature Engineering, […]

August 15, 2021 Advanced, Libraries, NLP, Project, Python, Text, Unsupervised

Getting started with NLP using NLTK Library

01010010 01101001 01110100 01101000 01101001 01101011 01100001 Did you understand the above binary code? If yes, then you’re a computer. If no, then you’re a human. 🙂 I know it’s difficult for us to read binary code the way computers do, because binary code is a machine-understandable language. Likewise, computers don’t understand human language. So how do we make computers understand human language? The answer is Natural Language Processing. With the help of NLP, we can teach computers […]
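As a small aside to the teaser above, the binary string can be decoded with a few lines of Python. This is only an illustrative sketch: the helper name `decode_binary` is hypothetical and not taken from the article, and it assumes each space-separated chunk is the code point of one ASCII character.

```python
# Decode a space-separated string of binary values into text.
# int(chunk, 2) accepts chunks with or without leading zeros,
# so "1010010" and "01010010" both decode to the same character.
bits = "01010010 01101001 01110100 01101000 01101001 01101011 01100001"


def decode_binary(bit_string: str) -> str:
    """Convert each binary chunk to its character and join the result."""
    return "".join(chr(int(chunk, 2)) for chunk in bit_string.split())


decoded = decode_binary(bits)
print(decoded)  # prints "Rithika"
```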

August 1, 2021 Advanced, Libraries, NLP, Python, Text

Why must text data be pre-processed?

This article was published as a part of the Data Science Blogathon Introduction Language is a structured medium we humans use to communicate with each other. Language can be in the form of speech or text. “Blah blah”, “Meh”, “zzzz…” Yup, we can understand these words. But the question is, “Can computers understand these?” Nope, machines can’t understand these. In fact, machines can’t understand any text data at all, be it the word “blah” or the word “machine”. They only understand numbers. […]
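To make the "machines only understand numbers" point concrete, here is a minimal bag-of-words sketch in plain Python. It is an assumption about one common way to turn text into numbers, not necessarily the method the article goes on to use; the sample sentences and the helper names `build_vocab` and `vectorize` are made up for illustration.

```python
from collections import Counter

# Two toy documents, invented for this illustration.
docs = ["blah blah meh", "machine blah"]


def build_vocab(documents):
    """Collect the unique lowercase words across all documents, sorted."""
    return sorted({word for doc in documents for word in doc.lower().split()})


def vectorize(doc, vocab):
    """Count how often each vocabulary word occurs in one document."""
    counts = Counter(doc.lower().split())
    return [counts[word] for word in vocab]


vocab = build_vocab(docs)
vectors = [vectorize(doc, vocab) for doc in docs]

print(vocab)    # ['blah', 'machine', 'meh']
print(vectors)  # [[2, 0, 1], [1, 1, 0]]
```

Each document becomes a list of word counts, which is a form a model can actually work with.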

July 17, 2021 Beginner, Data Science, Libraries, Machine Learning, NLP, Python, Text

NLTK: A Beginners Hands-on Guide to Natural Language Processing

This article was published as a part of the Data Science Blogathon Introduction: NLTK is a toolkit built for working with NLP in Python. It provides various text processing libraries along with many test datasets. A variety of tasks can be performed using NLTK, such as tokenizing and parse tree visualization. In this article, we will go through how to set up NLTK on our system and use it to perform various NLP tasks during the text processing […]

July 14, 2021 Advanced, Libraries, NLP, Project, Python, Structured Data, Text

FuzzyWuzzy Python Library: Interesting Tool for NLP and Text Analytics

This article was published as a part of the Data Science Blogathon Introduction There are many ways to compare text in Python, but often we look for an easy one. Comparing text is needed for various text analytics and Natural Language Processing purposes. One of the easiest ways of comparing text in Python is the FuzzyWuzzy library. It gives a score out of 100 based on the similarity of the strings. Basically, we are given the similarity […]
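The idea of a similarity score out of 100 can be sketched with Python's standard library alone. Note the assumption: this uses `difflib.SequenceMatcher` as a stand-in and is not the FuzzyWuzzy API itself; the function name `similarity_score` is hypothetical.

```python
from difflib import SequenceMatcher


def similarity_score(a: str, b: str) -> int:
    """Return a 0-100 similarity score for two strings.

    SequenceMatcher.ratio() returns a float in [0, 1]; scaling and
    rounding it gives a score on the 0-100 scale described above.
    """
    return round(100 * SequenceMatcher(None, a, b).ratio())


print(similarity_score("python", "python"))  # identical strings score 100
print(similarity_score("abc", "xyz"))        # no characters in common scores 0
print(similarity_score("fuzzy wuzzy", "fuzzy wuzy"))  # near-match, high score
```

Identical strings score 100 and strings with nothing in common score 0; everything else falls in between.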

July 14, 2021 Beginner, Data Science, Libraries, NLP, Python, Text

Text Analysis with Spacy to Master NLP techniques

This article was published as a part of the Data Science Blogathon Natural Language Processing (NLP) is a branch of Artificial Intelligence that deals with everyday human language. Have you ever wondered how Alexa, Siri, and Google Assistant understand our voice and respond to us? Human language is fuzzy and complex. When they receive text input, the text is first preprocessed, and many built-in techniques help them understand grammar. In this tutorial, we will study some techniques which are helpful […]

July 14, 2021 Data Science, Intermediate, Libraries, NLP, Python, Text, Word Embeddings

Practical Guide to Word Embedding System

This article was published as a part of the Data Science Blogathon Prerequisites: basic knowledge of Python; an understanding of the basics of NLP (Natural Language Processing). Introduction In natural language processing, a word embedding represents a word for text analysis as a vector that encodes the word’s meaning, such that words that are closer together in the vector space are expected to be similar in meaning. Consider, boy-men vs […]
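The notion of words being "closer" in a vector space is often measured with cosine similarity. The sketch below assumes that measure and uses tiny hand-made 3-dimensional vectors; the numbers and the word "banana" are invented for illustration and do not come from any trained embedding model or from the article.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings; real embeddings have hundreds of dimensions.
embeddings = {
    "boy": [0.9, 0.1, 0.3],
    "man": [0.8, 0.2, 0.35],
    "banana": [0.1, 0.9, 0.0],
}

related = cosine_similarity(embeddings["boy"], embeddings["man"])
unrelated = cosine_similarity(embeddings["boy"], embeddings["banana"])

# With these toy vectors, the related pair scores higher than the unrelated one.
print(related > unrelated)  # True
```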

July 9, 2021 Data Science, Libraries, NLP, Python, Text, Unstructured Data

Part 2: Topic Modeling and Latent Dirichlet Allocation (LDA) using Gensim and Sklearn

This article was published as a part of the Data Science Blogathon Introduction In the previous article, we started with the basic terminology of text in Natural Language Processing (NLP), what topic modeling is, its applications, the types of models, and the different topic modeling techniques available. Let’s continue from there and explore Latent Dirichlet Allocation (LDA), how LDA works, and its similarity to another very popular dimensionality reduction technique, Principal Component Analysis (PCA). Table of Contents A Little […]

July 9, 2021 Beginner, Libraries, NLP, Python, Text, Topic Modeling

Topic modeling With Naive Bayes Classifier

This article was published as a part of the Data Science Blogathon Introduction Naive Bayes is a powerful tool that leverages Bayes’ Theorem to understand and mimic complex data structures. In recent years, it has commonly been used for Natural Language Processing (NLP) tasks, such as text categorization. Today, we will be constructing a Naive Bayes text classifier for topic categorization. Before we move forward with the explanation, I want to emphasize that Naive Bayes is not the traditional method of […]

July 8, 2021 Advanced, Classification, Libraries, NLP, Project, Python, Structured Data, Supervised, Text

Sentiment Analysis using NLTK – A Practical Approach

This article was published as a part of the Data Science Blogathon Introduction The goal of this blog is to predict the sentiment of a given text using Python and NLTK (the Natural Language Toolkit), a Python package made especially for text-based analysis. With a few lines of code, we can easily predict whether a sentence or a review (reviews are used in this blog) is positive or negative. Before moving on to the implementation […]
