Text Mining Simplified – IPL 2020 Tweet Analysis with R

This article was published as a part of the Data Science Blogathon. Introduction Text mining utilizes different AI technologies to automatically process data and generate valuable insights, enabling companies to make data-driven decisions. Text mining identifies facts, relationships, and assertions that would otherwise remain buried in the mass of textual big data. Once extracted, this information is converted into a structured form that can be further analyzed, or presented directly using clustered HTML tables, mind maps, charts, etc. Advantages of […]

Read more

Beginners Guide to Topic Modeling in Python

Introduction Analytics Industry is all about obtaining the “Information” from the data. With the growing amount of data in recent years, that too mostly unstructured, it’s difficult to obtain the relevant and desired information. But, technology has developed some powerful methods which can be used to mine through the data and fetch the information that we are looking for. One such technique in the field of text mining is Topic Modelling. As the name suggests, it is a process to […]

Read more

A Step-by-Step NLP Guide to Learn ELMo for Extracting Features from Text

Introduction I work on different Natural Language Processing (NLP) problems (the perks of being a data scientist!). Each NLP problem is a unique challenge in its own way. That’s just a reflection of how complex, beautiful and wonderful the human language is. But one thing has always been a thorn in an NLP practitioner’s mind is the inability (of machines) to understand the true meaning of a sentence. Yes, I’m talking about context. Traditional NLP techniques and frameworks were great when […]

Read more

An Introduction to Text Summarization using the TextRank Algorithm (with Python implementation)

Introduction Text Summarization is one of those applications of Natural Language Processing (NLP) which is bound to have a huge impact on our lives. With growing digital media and ever growing publishing – who has the time to go through entire articles / documents / books to decide whether they are useful or not? Thankfully – this technology is already here. Have you come across the mobile app inshorts? It’s an innovative news app that converts news articles into a […]

Read more

Build a Natural Language Generation (NLG) System using PyTorch

Overview Introduction to Natural Language Generation (NLG) and related things- Data Preparation Training Neural Language Models Build a Natural Language Generation System using PyTorch Introduction In the last few years, Natural language processing (NLP) has seen quite a significant growth thanks to advancements in deep learning algorithms and the availability of sufficient computational power. However, feed-forward neural networks are not considered optimal for modeling a language or text. This is because the feed-forward network does not take into consideration the […]

Read more

Building a Recommendation System using Word2vec: A Unique Tutorial with Case Study in Python

Overview Recommendation engines are ubiquitous nowadays and data scientists are expected to know how to build one Word2vec is an ultra-popular word embeddings used for performing a variety of NLP tasks We will use word2vec to build our own recommendation system. Curious how NLP and recommendation engines combine? Let’s find out!   Introduction Be honest – how many times have you used the ‘Recommended for you’ section on Amazon? Ever since I found out a few years back that machine […]

Read more

An Essential Guide to Pretrained Word Embeddings for NLP Practitioners

Overview Understand the importance of pretrained word embeddings Learn about the two popular types of pretrained word embeddings – Word2Vec and GloVe Compare the performance of pretrained word embeddings and learning embeddings from scratch   Introduction How do we make machines understand text data? We know that machines are supremely adept at dealing and working with numerical data but they become sputtering instruments if we feed raw text data to them. The idea is to create a representation of words […]

Read more

Introductory guide to Information Retrieval using kNN and KDTree

Introduction I love cricket as much as I love data science. A few years back (on 16 November 2013 to be precise), my favorite cricketer – Sachin Tendulkar retired from International Cricket. I spent that entire day reading articles and blogs about him on the web. By the end of the day, I had read close to 50 articles about him. Interestingly, while I was reading these articles – none of the websites suggested me articles outside of Sachin or cricket. […]

Read more

Text Mining 101: A Stepwise Introduction to Topic Modeling using Latent Semantic Analysis (using Python)

Introduction Have you ever been inside a well-maintained library? I’m always incredibly impressed with the way the librarians keep everything organized, by name, content, and other topics. But if you gave these librarians thousands of books and asked them to arrange each book on the basis of their genre, they will struggle to accomplish this task in a day, let alone an hour! However, this won’t happen to you if these books came in a digital format, right? All the […]

Read more

Elon Musk AI Text Generator with LSTMs in Tensorflow 2

Introduction Elon Musk has become an internet sensation over the past couple of years, with his views about the future, funny personality along with his passion for technology. By now everyone knows him, either as that electric car guy, or that guy who builds flamethrowers. He is mostly active on his Twitter, where he shares everything, Even memes! He inspires a lot of young people in the IT industry, and I wanted to do a fun little project, where I […]

Read more
1 2 3