Beginners Tutorial for Regular Expressions in Python

Importance of Regular Expressions: In the last few years, there has been a dramatic shift toward using general-purpose programming languages for data science and machine learning. This was not always the case; a decade ago the idea would have met a lot of skeptical eyes! More people and organizations are now using tools like Python and JavaScript to solve their data needs. This is where regular expressions become very useful. Regular expressions are normally the default way […]

Sentiment Analysis of Twitter Posts on Chennai Floods using Python

Introduction: The best way to learn data science is to do data science. No second thoughts about it! One of the ways I do this is to continuously look for interesting work done by other community members. Once I understand a project, I redo or improve it on my own. Honestly, I can’t think of a better way to learn data science. As part of my search, I came across a study on sentiment analysis of the Chennai floods on Analytics Vidhya. […]

Python for NLP: Working with Text and PDF Files

This is the first article in my series of articles on Python for Natural Language Processing (NLP). In this article, we will start with the basics of Python for NLP. We will see how we can work with simple text files and PDF files using Python. Working with Text Files: Text files are probably the most basic type of file that you are going to encounter in your NLP endeavors. In this section, we will see how to read from […]

Python for NLP: Tokenization, Stemming, and Lemmatization with SpaCy Library

In the previous article, we started our discussion about how to do natural language processing with Python. We saw how to read and write text and PDF files. In this article, we will start working with the spaCy library to perform a few more basic NLP tasks such as tokenization, stemming and lemmatization. Introduction to spaCy: The spaCy library is one of the most popular NLP libraries along with NLTK. The basic difference between the two libraries is the fact […]

Python for NLP: Vocabulary and Phrase Matching with SpaCy

This is the third article in this series of articles on Python for Natural Language Processing. In the previous article, we saw how Python’s NLTK and spaCy libraries can be used to perform simple NLP tasks such as tokenization, stemming and lemmatization. We also saw how to perform parts of speech tagging, named entity recognition and noun-parsing. However, all of these operations are performed on individual words. In this article, we will move a step further and explore vocabulary and […]

Python for NLP: Parts of Speech Tagging and Named Entity Recognition

This is the fourth article in my series of articles on Python for NLP. In my previous article, I explained how the spaCy library can be used to perform tasks like vocabulary and phrase matching. In this article, we will study parts of speech tagging and named entity recognition in detail. We will see how the spaCy library can be used to perform these two tasks. Parts of Speech (POS) Tagging: Parts of speech tagging simply refers to assigning parts […]

Python for NLP: Sentiment Analysis with Scikit-Learn

This is the fifth article in the series of articles on NLP for Python. In my previous article, I explained how Python’s spaCy library can be used to perform parts of speech tagging and named entity recognition. In this article, I will demonstrate how to do sentiment analysis of Twitter data using the Scikit-Learn library. Sentiment analysis refers to analyzing opinions or feelings about almost anything, using data such as text or images. Sentiment analysis helps companies in […]

Python for NLP: Topic Modeling

This is the sixth article in my series of articles on Python for NLP. In my previous article, I talked about how to perform sentiment analysis of Twitter data using Python’s Scikit-Learn library. In this article, we will study topic modeling, which is another very important application of NLP. We will see how to do topic modeling with Python. What is Topic Modeling? Topic modeling is an unsupervised technique that intends to analyze large volumes of text data by clustering […]

Python for NLP: Introduction to the TextBlob Library

Introduction: This is the seventh article in my series of articles on Python for NLP. In my previous article, I explained how to perform topic modeling using Latent Dirichlet Allocation and Non-Negative Matrix Factorization. We used the Scikit-Learn library to perform topic modeling. In this article, we will explore TextBlob, which is another extremely powerful NLP library for Python. TextBlob is built upon NLTK and provides an easy-to-use interface to the NLTK library. We will see how TextBlob […]

Python for NLP: Introduction to the Pattern Library

This is the eighth article in my series of articles on Python for NLP. In my previous article, I explained how Python’s TextBlob library can be used to perform a variety of NLP tasks ranging from tokenization to POS tagging, and text classification to sentiment analysis. In this article, we will explore Python’s Pattern library, which is another extremely useful Natural Language Processing library. The Pattern library is a multipurpose library capable of handling the following tasks: Natural Language Processing: […]
