Bag-of-words vs TFIDF vectorization – A Hands-on Tutorial

This article was published as a part of the Data Science Blogathon. Whenever we apply an algorithm to textual data, we need to convert the text to a numeric form, so we need pre-processing techniques that turn text into numbers. Both bag-of-words (BOW) and TFIDF are pre-processing techniques that generate a numeric form from an input text. Bag-of-Words: The bag-of-words model converts text into fixed-length vectors by counting how many times each word appears. […]
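
The counting idea in the teaser above can be sketched in a few lines of plain Python. This is only a minimal illustration, not the linked article's code: the function name `bag_of_words`, the whitespace tokenization, the lowercasing, and the two sample sentences are all assumptions made for the example.

```python
from collections import Counter


def bag_of_words(documents):
    """Return (vocabulary, vectors) for a list of strings.

    Assumption: tokens are lowercased, whitespace-separated words.
    Every vector has one slot per vocabulary word, so all vectors
    share the same fixed length.
    """
    tokenized = [doc.lower().split() for doc in documents]
    # A sorted vocabulary gives each word a stable column index.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for words that do not occur in this document.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


# Hypothetical sample documents, for illustration only.
docs = ["the cat sat", "the cat saw the dog"]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)  # ['cat', 'dog', 'sat', 'saw', 'the']
print(vectors)     # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Both vectors have five entries because the vocabulary has five words; the second document's final entry is 2 because "the" appears twice in it.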

NLTK: A Beginners Hands-on Guide to Natural Language Processing

This article was published as a part of the Data Science Blogathon. Introduction: NLTK is a toolkit built for working with NLP in Python. It provides various text processing libraries along with many test datasets. A variety of tasks can be performed using NLTK, such as tokenizing and parse tree visualization. In this article, we will go through how to set up NLTK on our system and use it for performing various NLP tasks during the text processing […]

Text Analysis with Spacy to Master NLP techniques

This article was published as a part of the Data Science Blogathon. Natural Language Processing (NLP) is a branch of Artificial Intelligence that deals with everyday language. Have you ever wondered how Alexa, Siri, and Google Assistant understand our voice and respond to us? Human language is fuzzy and complex. When these assistants receive text input, the text is preprocessed first, and many embedded techniques let them understand grammar. In this tutorial, we will study some techniques which are helpful […]

Part 7: Step by Step Guide to Master NLP – Word Embedding in Detail

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous articles (parts 5 and 6), we covered the different text vectorization and word embedding techniques in detail. In this article, we will first discuss the co-occurrence matrix, which is also a word vectorization technique, and then discuss new concepts related to word embedding, including: Applications of […]

Practical Guide to Word Embedding System

This article was published as a part of the Data Science Blogathon. Pre-requisites: basic knowledge of Python; understanding of the basics of NLP (Natural Language Processing). Introduction: In natural language processing, word embedding is used to represent words for text analysis in the form of a vector that encodes the meaning of the word, such that words that are closer in that vector space are expected to be similar in meaning. Consider, boy-men vs […]

A simple start with Natural Language Processing!

This article was published as a part of the Data Science Blogathon. Introduction to NLP: After I got acquainted with machine learning concepts, I was wary of venturing into NLP. To me, NLP seemed a complicated subject area. But after my first encounter with it, I have come to realize that although it is hard to master, its concepts are easy to follow. I am presenting some basic NLP concepts and how they work. NLP or Natural […]

Build your own AI chatbot from scratch!

This article was published as a part of the Data Science Blogathon. Introduction: It’s pretty simple! Today we will learn to create an AI chatbot from scratch using intent matching and NLP algorithms. Let’s see what we are going to do:
* Prepare our dataset with questions (keywords) and their respective intents.
* Prepare a JSON file containing replies for each intent.
* Transform our data into Tf-Idf vectors.
* Use a Deep Neural Network to classify the user’s question into one of the intents […]

Topic extraction From Prime Minister Modi’s Speech

This article was published as a part of the Data Science Blogathon. Introduction: Artificial Intelligence (AI) has been a trendy term for many years. Earlier, when we heard the term “AI”, we could only think about robots. However, AI is not limited to robots, and nowadays many electronic devices we use have AI associated with them, be it smartphones, smart TVs, refrigerators, or air conditioners. AI basically means a machine can make its own decisions without human intervention. […]

Part 6: Step by Step Guide to Master NLP – Word2Vec

This article was published as a part of the Data Science Blogathon. Introduction: This article is part of an ongoing blog series on Natural Language Processing (NLP). In the previous article of this series, we covered the statistical or frequency-based word embedding techniques, which predate the word-embedding era. So, in this article, we will discuss the recent-era word embedding techniques. NOTE: There are many such recent-era techniques, but in this article, we will discuss only the Word2Vec […]

Topic modeling With Naive Bayes Classifier

This article was published as a part of the Data Science Blogathon. Introduction: Naive Bayes is a powerful tool that leverages Bayes’ Theorem to understand and mimic complex data structures. In recent years, it has commonly been used for Natural Language Processing (NLP) tasks, such as text categorization. Today, we will construct a Naive Bayes text classifier for topic categorization. Before we move forward with the explanation, I want to emphasize that Naive Bayes is not the traditional method of […]
