Top 15 Open-Source Datasets of 2020 That Every Data Scientist Should Add to Their Portfolio!

Overview: Here is a list of the top 15 datasets for 2020 that we feel every data scientist should practice on. The article contains 5 datasets each for machine learning, computer vision, and NLP. By no means is this list exhaustive; feel free to add other datasets in the comments below.

Introduction: "For the things we have to learn before we can do them, we learn by doing them." (Aristotle) I am sure everyone can attest to this saying. No […]


Maximizing Sales with Market Basket Analysis

Sales data analysis can provide a wealth of insights for any business, but the underlying data is rarely made available to the public. In 2018, however, a retail chain provided Black Friday sales data on Kaggle as part of a Kaggle competition. Although the store and product lines are anonymized, the dataset presents a great opportunity to learn how to find business insights. In this post, we’ll cover how to prepare data, perform basic analysis, and glean additional insights via a technique called Market […]
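For readers new to the technique named in the title, here is a minimal sketch of the pairwise metrics that market basket analysis typically reports: support, confidence, and lift. It is not the method used in the post itself. The transactions are invented toy data (not the Kaggle Black Friday dataset), and `pair_metrics` is a hypothetical helper name chosen for this illustration.

```python
from collections import Counter
from itertools import combinations

# Toy transactions invented for illustration; NOT the Kaggle Black Friday data.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_metrics(transactions):
    """Return {(a, b): (support, confidence of a -> b, lift)} for co-occurring pairs."""
    n = len(transactions)
    # How many baskets contain each single item.
    item_counts = Counter(item for basket in transactions for item in basket)
    # How many baskets contain each (alphabetically ordered) pair of items.
    pair_counts = Counter(
        pair
        for basket in transactions
        for pair in combinations(sorted(basket), 2)
    )
    metrics = {}
    for (a, b), count in pair_counts.items():
        support = count / n                       # share of baskets with both items
        confidence = count / item_counts[a]       # share of a-baskets that also hold b
        lift = confidence / (item_counts[b] / n)  # confidence relative to b's base rate
        metrics[(a, b)] = (support, confidence, lift)
    return metrics


metrics = pair_metrics(transactions)
for pair, (support, confidence, lift) in sorted(metrics.items()):
    print(pair, round(support, 2), round(confidence, 2), round(lift, 2))
```

A lift above 1 suggests the two items appear together more often than their individual frequencies would predict, and a lift below 1 suggests the opposite. On real data, a dedicated library is usually a better choice than this brute-force pair count.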


25 Open Datasets for Deep Learning Every Data Scientist Must Work With

Introduction: The key to getting better at deep learning (or most fields in life) is practice. Practice on a variety of problems, from image processing to speech recognition. Each of these problems has its own unique nuance and approach. But where can you get this data? Many of the research papers you see these days use proprietary datasets that are usually not released to the general public. This becomes a problem if you want to learn and apply your […]
