February 13, 2021 Advanced, Classification, NLP, Project, R, Social Media, Technique, Text, Unstructured Data, Unsupervised

Customer Sentiments Analysis of Pepsi and Coca-Cola using Twitter Data in R

This article was published as a part of the Data Science Blogathon. Introduction: Coca-Cola and PepsiCo are well-established names in the soft drink industry, and both are in the Fortune 500. The two companies own a wide spectrum of product lines in a highly competitive market, have a fierce rivalry with each other, and are constantly competing for market share in almost all product verticals. We will analyze the sentiment of customers of these two companies with the help of 5000 […]

November 10, 2020 Classification, Intermediate, Machine Learning, NLP, Python, R, Supervised, Technique, Text, Unstructured Data

Complete tutorial on Text Classification using Conditional Random Fields Model (in Python)

Introduction: The amount of text data being generated in the world is staggering. Google processes more than 40,000 searches EVERY second! According to a Forbes report, every single minute we send 16 million text messages and post 510,000 comments on Facebook. For a layman, it is difficult to even grasp the sheer magnitude of data out there. News sites and other online media alone generate tons of text content on an hourly basis. Analyzing patterns in that data can become […]

November 4, 2020 Big data, Business Analytics, Data Exploration, Intermediate, NLP, Project, R, Sports, Text, Unstructured Data

Who is the world cheering for? 2014 FIFA WC winner predicted using Twitter feed (in R)

Sports are filled with emotions! The cheering of the audience and reactions to events on various media channels are among the factors that make a huge impact on the minds of the players. If people support you, your chances of winning are greatly enhanced. A live example of this is the record of the Indian cricket team at home and abroad: its win rate in India is approximately twice its win rate abroad. Football, too, is a game driven largely by emotions. […]

November 4, 2020 Classification, Intermediate, Machine Learning, NLP, Project, R, Supervised, Text, Unstructured Data

Kaggle Solution: What’s Cooking? (Text Mining Competition)

Introduction: Tutorial on Text Mining, XGBoost and Ensemble Modeling in R. I came across the What’s Cooking competition on Kaggle last week. At first, I was intrigued by its name. I checked it and realized that the competition was about to finish. My bad! It was a text mining competition. It was live for 103 days and ended on 20th December 2015. Still, I decided to test my skills. I downloaded the data set, built a model and managed to get a score of […]

November 4, 2020 Data Science, Intermediate, Machine Learning, NLP, Project, R

Measuring Audience Sentiments about Movies using Twitter and Text Analytics

Introduction: The practice of using analytics to measure a movie’s success is not a new phenomenon. Most of these predictive models are based on structured data, with input variables such as cost of production, genre of the movie, actor, director, production house, marketing expenditure, number of distribution platforms, etc. However, with the advent of social media platforms, young demographics, digital media, and the increasing adoption of platforms like Twitter and Facebook to express views and opinions, social media has become a […]

October 31, 2020 Banking, Big data, Business Analytics, Data Visualization, Intermediate, NLP, R, Technique, Text, Unstructured Data

Build a word cloud using text mining tools of R

This is what a word cloud of our entire website looks like! A word cloud is a graphical representation of frequently used words in a collection of text files. The height of each word in this picture indicates how frequently the word occurs in the entire text. By the end of this article, you will be able to make a word cloud using R on any given set of text files. Such diagrams are very useful when doing […]

October 29, 2020 Big data, Business Analytics, Intermediate, NLP, R, Technique, Text, Unstructured Data

Hacks to perform faster Text Mining in R

Introduction: Data science demands versatility. Move away from your regular methods, challenge your ways of working, explore new ways of doing things more efficiently. Reminiscing about my initial years in data science, I recall that I too got trapped by this devil of ‘complacency’. At one point, I was not challenging myself enough. I wasn’t experimenting with ways of doing work. I accepted things as they were, until I realized ‘Complacency is a state of mind […]