Measuring Audience Sentiments about Movies using Twitter and Text Analytics

Introduction The practice of using analytics to measure a movie’s success is not a new phenomenon. Most of these predictive models are based on structured data with input variables such as cost of production, genre of the movie, actor, director, production house, marketing expenditure, number of distribution platforms, etc. However, with the advent of social media platforms, young demographics, digital media and the increasing adoption of platforms like Twitter and Facebook to express views and opinions, social media has become a […]

Read more

A Must-Read Introduction to Sequence Modelling (with use cases)

Introduction Artificial Neural Networks (ANNs) were supposed to replicate the architecture of the human brain, yet until about a decade ago, the only common feature between ANNs and our brain was the nomenclature of their entities (for instance, the neuron). These neural networks were almost useless, as they had very low predictive power and few practical applications. But thanks to the rapid advancement in technology in the last decade, we have seen the gap being bridged to the […]

Read more

Top 5 Data Science GitHub Repositories and Reddit Discussions (January 2019)

Introduction There’s nothing quite like GitHub and Reddit for data science. Both platforms have been of immense help to me in my data science journey. GitHub is the ultimate one-stop platform for hosting your code. It excels at easing the collaboration process between team members. Most leading data scientists and organizations use GitHub to open-source their libraries and frameworks. So not only do we stay up-to-date with the latest developments in our field, we get to replicate their models on our […]

Read more

6 Practices to enhance the performance of a Text Classification Model

Introduction A few months back, I was working on creating a sentiment classifier for Twitter data. After trying the common approaches, I was still struggling to get good accuracy on the results. Text classification problems and algorithms have been around for a while now. They are widely used for email spam filtering by the likes of Google and Yahoo, for conducting sentiment analysis of Twitter data, and for automatic news categorization in Google Alerts. However, while dealing with an enormous amount of text […]

Read more

An NLP Approach to Mining Online Reviews using Topic Modeling (with Python codes)

Introduction E-commerce has revolutionized the way we shop. That phone you’ve been saving up to buy for months? It’s just a search and a few clicks away. Items are delivered within a matter of days (sometimes even the next day!). For online retailers, there are no constraints related to inventory management or space management. They can sell as many different products as they want. Brick and mortar stores can keep only a limited number of products due to the finite space […]

Read more

Text Mining 101: A Stepwise Introduction to Topic Modeling using Latent Semantic Analysis (using Python)

Introduction Have you ever been inside a well-maintained library? I’m always incredibly impressed with the way the librarians keep everything organized, by name, content, and other topics. But if you gave these librarians thousands of books and asked them to arrange each book on the basis of its genre, they would struggle to accomplish this task in a day, let alone an hour! However, this wouldn’t happen to you if these books came in a digital format, right? All the […]

Read more

10 Powerful Applications of Linear Algebra in Data Science (with Multiple Resources)

Overview: Linear algebra powers various and diverse data science algorithms and applications. Here, we present 10 such applications where linear algebra will help you become a better data scientist. We have categorized these applications into various fields: Basic Machine Learning, Dimensionality Reduction, Natural Language Processing, and Computer Vision. Introduction If Data Science were Batman, Linear Algebra would be Robin. This faithful sidekick is often ignored. But in reality, it powers major areas of Data Science including the hot […]

Read more

Beginners Tutorial for Regular Expressions in Python

Importance of Regular Expressions In the last few years, there has been a dramatic shift in the usage of general-purpose programming languages for data science and machine learning. This was not always the case; a decade back, this thought would have met a lot of skeptical eyes! This means that more people and organizations are using tools like Python and JavaScript for solving their data needs. This is where regular expressions become super useful. Regular expressions are normally the default way […]

Read more

Sentiment Analysis of Twitter Posts on Chennai Floods using Python

Introduction The best way to learn data science is to do data science. No second thoughts about it! One of the ways I do this is to continuously look for interesting work done by other community members. Once I understand the project, I redo or improve the project on my own. Honestly, I can’t think of a better way to learn data science. As part of my search, I came across a study on sentiment analysis of Chennai Floods on Analytics Vidhya. […]

Read more

What is AWS? Why Every Data Science Professional Should Learn Amazon Web Services

Overview: Amazon Web Services (AWS) is the leading cloud platform for deploying machine learning solutions. Every data science professional should learn how AWS works. Introduction “Your machine ran out of memory.” Sound familiar? It certainly does to me, especially anytime I try to run a complex machine learning algorithm on my personal machine. It’s quite a frustrating experience that a lot of data science professionals share. We don’t have the unlimited computing power of the tech behemoths – […]

Read more