The Top Skills for a Career in Data Science in 2021

Data science is exploding in popularity because it is tied to the future of technology, to strong demand for high-paying jobs, and to the leading edge of corporate culture, startups and innovation. Students from South and East Asia in particular can fast-track lucrative technology careers with data science, as tech startups in those regions grow with increased foreign funding. Think carefully: would you consider becoming a data scientist? According to Coursera, a data scientist might do the following […]

Step by step guide to extract insights from free text (unstructured data)

Text mining is one of the most complex kinds of analysis in the analytics industry. The reason is that text mining deals with unstructured data: we do not have clearly defined observations and variables (rows and columns). Hence, for any kind of analytics, you first need to convert the unstructured data into a structured dataset and then proceed with the normal modelling framework. The additional step of converting unstructured data into a structured format is […]
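
As a rough illustration of the conversion mentioned in this excerpt, the sketch below turns free text into rows and columns by counting words per document (a document-term matrix). It is not taken from the article itself; the function names `tokenize` and `document_term_matrix` and the sample sentences are assumptions made for this example, and it uses only the Python standard library.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def document_term_matrix(documents):
    """Return (vocabulary, rows): one row of word counts per document."""
    counts = [Counter(tokenize(doc)) for doc in documents]
    # Columns are the sorted set of all words seen in any document.
    vocabulary = sorted(set().union(*counts))
    rows = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, rows


# Hypothetical sample documents, used only for illustration.
docs = ["The service was slow", "Slow delivery, great service"]
vocabulary, rows = document_term_matrix(docs)
print(vocabulary)
for row in rows:
    print(row)
```

Each document becomes an observation (row) and each distinct word a variable (column), which is the structured form that ordinary modelling tools expect.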

Framework to build a niche dictionary for text mining

Having the right dictionary is at the heart of any text mining analysis. A dictionary for text mining can be compared to a map when travelling in a new city. The more precise and accurate the map, the faster you reach your destination. On the other hand, a wrong or incomplete map can end up confusing the traveler. A dictionary helps us convert unstructured text into structured data. The more precise the dictionary you have for the analysis, the more accurate […]
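
To make the idea of a dictionary turning text into structured data concrete, here is a minimal sketch. The dictionary contents, the category names and the function `tag_text` are all hypothetical and chosen only for illustration; the article's own framework may differ.

```python
import re

# Hypothetical niche dictionary: category -> terms that signal it.
DICTIONARY = {
    "pricing": {"price", "cost", "expensive", "cheap"},
    "service": {"staff", "support", "service", "rude"},
}


def tag_text(text, dictionary):
    """Count, per category, how many tokens of the text appear in its term set."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {
        category: sum(1 for token in tokens if token in terms)
        for category, terms in dictionary.items()
    }


comment = "Support staff were rude and the price was too expensive"
print(tag_text(comment, DICTIONARY))
```

The output is one number per category for each piece of text, so a more complete and precise dictionary translates directly into more informative columns.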

Who is the world cheering for? 2014 FIFA WC winner predicted using Twitter feed (in R)

Sports are filled with emotion! The cheering of the audience and reactions to events on various media channels are among the factors that have a huge impact on the minds of the players. If people support you, your chances of winning are greatly enhanced. A live example of this is the record of the Indian cricket team at home and abroad: its win rate in India is approximately twice its win rate abroad. Football, too, is a game driven largely by emotions. […]

Build a word cloud using text mining tools of R

This is what a word cloud of our entire website looks like! A word cloud is a graphical representation of the frequently used words in a collection of text files. The height of each word in the picture indicates how often the word occurs in the entire text. By the end of this article, you will be able to make a word cloud using R on any given set of text files. Such diagrams are very useful when doing […]
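
The article builds its word cloud in R; the sketch below only illustrates the underlying calculation, in Python, under stated assumptions. The stopword list, the size range and the function names `word_frequencies` and `font_sizes` are invented for this example, and the linear scaling is one possible choice rather than what any particular word cloud tool does.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stopword list for illustration.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "is", "to"}


def word_frequencies(texts):
    """Count non-stopword tokens across a collection of texts."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


def font_sizes(counts, min_size=10, max_size=60):
    """Scale counts linearly so the most frequent word gets max_size."""
    if not counts:
        return {}
    top = max(counts.values())
    span = max_size - min_size
    return {word: min_size + span * count / top for word, count in counts.items()}


# Hypothetical sample texts standing in for a set of text files.
texts = ["Data science is the science of data", "Text mining in R"]
sizes = font_sizes(word_frequencies(texts))
for word, size in sorted(sizes.items(), key=lambda item: -item[1]):
    print(f"{word}: {size:.1f}")
```

A plotting step would then draw each word at its computed height, which is what makes frequent words stand out in the picture.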

Hacks to perform faster Text Mining in R

Introduction: Data science demands versatility. Move away from your regular methods, challenge your ways of working, and explore new ways of doing things more efficiently. Looking back on my initial years in data science, I too was trapped by the devil of ‘complacency’. At one point, I was not challenging myself enough. I wasn’t experimenting with ways of working. I accepted things as they were, until I realized ‘Complacency is a state of mind […]

Random Forests Algorithm

One of the most popular methods or frameworks used by data scientists at the Rose Data Science Professional Practice Group is Random Forests. The Random Forests algorithm is one of the best among classification algorithms, able to classify large amounts of data with accuracy. Random Forests are an ensemble learning method (also thought of as a form of nearest neighbor predictor) for classification and regression that constructs a number of decision trees at training time and outputs the class that is […]

Plotly Beta: Graphing and Analytics Platform

Hey Data Scientists, I wanted to reach out about Plot.ly, a new startup for analyzing and beautifully visualizing data. We just launched a beta. It is built for math, science, and data applications. We’d love your thoughts. Overview: You can import data from anywhere, and analyze it in our grid with stats, fits, functions, and more. Our plotting APIs (R, Python, MATLAB, Arduino, REST, Julia, Perl) and grid make interactive, web-ready, publication-quality graphs. We have a Python Shell, and interactive graphs […]

Data Science – learn R or Python?

Hi Folks, I have a query about whether to learn R from scratch or to leverage my basic Python knowledge and extend into data science with scikit, numpy and pandas. So I am a bit confused … I am not shy about learning a new programming language like R, but I really need to know which one has the edge in the market. Maybe I should learn R along with Python, so your valuable opinion matters. Also I […]

Data Science In The Cloud With DataJoy

DataJoy is an unbelievably fantastic way for a working data scientist to have their favorite tools at hand. I am a minimalist when it comes to being mobile, whether working on the road, traveling for leisure, or sometimes both. I do not like to keep files on my laptop and I do not, for the most part, like to worry about keeping the applications on my laptop updated. I have tried as much as possible to push my life into the […]
