NLP Essentials: Removing Stopwords and Performing Text Normalization using NLTK and spaCy in Python

Overview: Learn how to remove stopwords and perform text normalization in Python, an essential Natural Language Processing (NLP) read. We will explore the different methods to remove stopwords as well as talk about text normalization techniques like stemming and lemmatization. Put your theory into practice by performing stopwords removal and text normalization in Python using the popular NLTK, spaCy and Gensim libraries.

Introduction: Don’t you love how wonderfully diverse Natural Language Processing (NLP) is? Things we never imagined […]


Sentiment Analysis in Python With TextBlob

Introduction: State-of-the-art technologies in NLP allow us to analyze natural languages on different layers, from simple segmentation of textual information to more sophisticated methods of sentiment categorization. However, this does not necessarily mean that you need advanced programming skills to implement high-level tasks such as sentiment analysis in Python.

Sentiment Analysis: The algorithms of sentiment analysis mostly focus on identifying opinions, attitudes, and even emoticons in a corpus of texts. The range of established sentiments varies significantly from […]


Python for NLP: Tokenization, Stemming, and Lemmatization with SpaCy Library

In the previous article, we started our discussion about how to do natural language processing with Python. We saw how to read and write text and PDF files. In this article, we will start working with the spaCy library to perform a few more basic NLP tasks such as tokenization, stemming and lemmatization.

Introduction to SpaCy: The spaCy library is one of the most popular NLP libraries, along with NLTK. The basic difference between the two libraries is the fact […]


Text Summarization with NLTK in Python

Introduction: As I write this article, 1,907,223,370 websites are active on the internet and 2,722,460 emails are being sent per second. This is an unbelievably huge amount of data. It is impossible for a user to get insights from such huge volumes of data. Furthermore, a large portion of this data is either redundant or doesn’t contain much useful information. The most efficient way to get access to the most important parts of the data, without having to sift through […]


Implementing Word2Vec with Gensim Library in Python

Introduction: Humans have a natural ability to understand what other people are saying and what to say in response. This ability is developed by consistently interacting with other people and society over many years. Language plays a very important role in how humans interact. Languages that humans use for interaction are called natural languages. The rules of various natural languages are different. However, natural languages have one thing in common: flexibility and evolution. Natural languages are […]
