Step by step guide to extract insights from free text (unstructured data)

Text Mining is one of the most complex analysis in the industry of analytics. The reason for this is that, while doing text mining, we deal with unstructured data. We do not have clearly defined observation and variables (rows and columns). Hence, for doing any kind of analytics, you need to first convert this unstructured data into a structured dataset and then proceed with normal modelling framework. The additional step of converting an unstructured data into a structured format is […]

Read more

Text Mining hack: Subject Extraction made easy using Google API

Let’s do a simple exercise. You need to identify the subject and the sentiment in following sentences: Google is the best resource for any kind of information. I came across a fabulous knowledge portal – Analytics Vidhya Messi played well but Argentina still lost the match Opera is not the best browser Yes, like UAE will win the Cricket World Cup. Was this exercise simple? Even if this looks like a simple exercise, now imagine creating an algorithm to do this? How does that […]

Read more

How Search Engines like Google Retrieve Results: Introduction to Information Extraction using Python and spaCy

Overview How do search engines like Google understand our queries and provide relevant results? Learn about the concept of information extraction We will apply information extraction in Python using the popular spaCy library – so a lot of hands-on learning is ahead!   Introduction I rely heavily on search engines (especially Google) in my daily role as a data scientist. My search results span a variety of queries – Python code questions, machine learning algorithms, comparison of Natural Language Processing […]

Read more