Part 2: Topic Modeling and Latent Dirichlet Allocation (LDA) using Gensim and Sklearn

This article was published as a part of the Data Science Blogathon. Introduction: In the previous article, we covered the basic terminology of text in Natural Language Processing (NLP), what topic modeling is, its applications, the types of models, and the different topic modeling techniques available. Let’s continue from there and explore Latent Dirichlet Allocation (LDA), how LDA works, and its similarity to another very popular dimensionality reduction technique, Principal Component Analysis (PCA). Table of Contents: A Little […]

Ultimate guide to deal with Text Data (using Python) – for Data Scientists and Engineers

Introduction: One of the biggest breakthroughs required for achieving any level of artificial intelligence is having machines that can process text data. Thankfully, the amount of text data being generated has exploded in the last few years. It has become imperative for an organization to have a structure in place to mine actionable insights from the text being generated. From social media analytics to risk management and cybercrime protection, dealing with text data has never […]

How to Get Started with NLP – 6 Unique Methods to Perform Tokenization

Overview: Looking to get started with Natural Language Processing (NLP)? Here’s the perfect first step. Learn how to perform tokenization, a key aspect of preparing your data for building NLP models. We present 6 different ways to perform tokenization on text data. Introduction: Are you fascinated by the amount of text data available on the internet? Are you looking for ways to work with this text data but aren’t sure where to begin? Machines, after all, recognize numbers, […]

NLP Essentials: Removing Stopwords and Performing Text Normalization using NLTK and spaCy in Python

Overview: Learn how to remove stopwords and perform text normalization in Python, an essential Natural Language Processing (NLP) read. We will explore the different methods to remove stopwords as well as talk about text normalization techniques like stemming and lemmatization. Put your theory into practice by performing stopwords removal and text normalization in Python using the popular NLTK, spaCy and Gensim libraries. Introduction: Don’t you love how wonderfully diverse Natural Language Processing (NLP) is? Things we never imagined […]

An NLP Approach to Mining Online Reviews using Topic Modeling (with Python codes)

Introduction: E-commerce has revolutionized the way we shop. That phone you’ve been saving up to buy for months? It’s just a search and a few clicks away. Items are delivered within a matter of days (sometimes even the next day!). For online retailers, there are no constraints related to inventory management or space management. They can sell as many different products as they want. Brick and mortar stores can keep only a limited number of products due to the finite space […]
