How to Set Axis Range (xlim, ylim) in Matplotlib

Introduction: Matplotlib is one of the most widely used data visualization libraries in Python. Much of Matplotlib’s popularity comes from its customization options: you can tweak just about any element from its hierarchy of objects. In this tutorial, we’ll take a look at how to set the axis range (xlim, ylim) in Matplotlib, to truncate or expand the view to specific limits.

Creating a Plot: Let’s first create a simple plot:

    import matplotlib.pyplot as plt
    import numpy as np
    […]
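Since the excerpt stops before any limits are set, here is a minimal sketch of the idea. It assumes a sine curve as sample data, the limits (2 to 8 and -0.5 to 0.5) are arbitrary illustrative values, and the non-interactive "Agg" backend is chosen only so the snippet runs without a display; none of these come from the tutorial itself.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

# Illustrative data (an assumption, not taken from the tutorial)
x = np.arange(0, 10, 0.1)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y)

# Truncate the visible range on each axis
ax.set_xlim(2, 8)
ax.set_ylim(-0.5, 0.5)

# Read the limits back to confirm what the Axes will display
xlim = ax.get_xlim()
ylim = ax.get_ylim()
```

With the pyplot interface, `plt.xlim(2, 8)` and `plt.ylim(-0.5, 0.5)` apply the same limits to the current Axes.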

Matplotlib Scatter Plot – Tutorial and Examples

Introduction: Matplotlib is one of the most widely used data visualization libraries in Python. From simple to complex visualizations, it’s the go-to library for most. In this tutorial, we’ll take a look at how to plot a scatter plot in Matplotlib.

Import Data: We’ll be using the Ames Housing dataset and visualizing correlations between features from it. Let’s import Pandas and load in the dataset:

    import pandas as pd
    df = pd.read_csv('AmesHousing.csv')

Plot a Scatter Plot in Matplotlib: Now, with […]

How I used NLP (Spacy) to screen Data Science Resumes

Writing a resume is tricky. A candidate faces many dilemmas: whether to describe a project at length or mention only the bare minimum; whether to list many skills or just a core competency; whether to mention many programming languages or cite only a few; whether to restrict the resume to 2 pages or 1 page. These dilemmas are equally hard for Data Scientists looking for a change and for aspiring Data Scientists. Now before you wonder where […]

Quick Guide: Steps To Perform Text Data Cleaning in Python

Introduction: Twitter has become an unavoidable channel for brand management. It has compelled brands to become more responsive to their customers. On the other hand, the damage it can cause can’t be undone. The 140-character tweet has become a powerful tool for customers and users to convey messages directly to brands. For companies, these tweets carry a lot of information, such as sentiment, engagement, reviews, and features of their products. However, mining these tweets isn’t easy. Why? Because before you mine this data, you need […]

10 Powerful Applications of Linear Algebra in Data Science (with Multiple Resources)

Overview: Linear algebra powers a wide range of data science algorithms and applications. Here, we present 10 such applications where linear algebra will help you become a better data scientist. We have categorized these applications into various fields: Basic Machine Learning, Dimensionality Reduction, Natural Language Processing, and Computer Vision.

Introduction: If Data Science were Batman, Linear Algebra would be Robin. This faithful sidekick is often ignored. But in reality, it powers major areas of Data Science including the hot […]

Kernel Density Estimation in Python Using Scikit-Learn

Introduction: This article is an introduction to kernel density estimation using Python’s machine learning library scikit-learn. Kernel density estimation (KDE) is a non-parametric method for estimating the probability density function of a given random variable. It is also referred to by its traditional name, the Parzen-Rosenblatt Window method, after its discoverers. Given a sample of independent, identically distributed (i.i.d.) observations \((x_1, x_2, \ldots, x_n)\) of a random variable from an unknown source distribution, the kernel density estimate is given by: $$p(x) = \frac{1}{nh} […]
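The excerpt cuts off partway through the formula. In its standard form, the estimate is \(p(x) = \frac{1}{nh} \sum_{j=1}^{n} K\left(\frac{x - x_j}{h}\right)\), where \(K\) is the kernel and \(h\) is the bandwidth. The sketch below evaluates that sum directly in plain Python rather than with scikit-learn, and assumes a Gaussian kernel; the function names, sample values, and bandwidth are hypothetical and chosen only for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel K (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h):
    """Evaluate the kernel density estimate at x with bandwidth h."""
    n = len(samples)
    if n == 0 or h <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    # Sum of kernels centred on each observation, scaled by 1 / (n * h)
    return sum(gaussian_kernel((x - x_j) / h) for x_j in samples) / (n * h)


# Hypothetical observations and bandwidth, for illustration only
samples = [1.0, 1.5, 2.0, 4.0, 4.5]
density_at_2 = kde(2.0, samples, h=0.5)
```

A smaller `h` gives a spikier estimate that follows individual observations, while a larger `h` smooths them together.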

Change Figure Size in Matplotlib

Introduction: Matplotlib is one of the most widely used data visualization libraries in Python. Much of Matplotlib’s popularity comes from its customization options: you can tweak just about any element from its hierarchy of objects. In this tutorial, we’ll take a look at how to change a figure size in Matplotlib.

Creating a Plot: Let’s first create a simple plot in a figure:

    import matplotlib.pyplot as plt
    import numpy as np
    x = np.arange(0, 10, 0.1)
    y = np.sin(x)
    […]
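As a minimal sketch of where the tutorial is heading, the snippet below reuses the `x` and `y` arrays from the excerpt and sets the figure size in two ways: at creation with `figsize`, and afterwards with `set_size_inches`. The sizes themselves are arbitrary illustrative values, and the "Agg" backend is an assumption made so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(0, 10, 0.1)
y = np.sin(x)

# Set the size (width, height in inches) when the figure is created
fig = plt.figure(figsize=(8, 3))
plt.plot(x, y)

# Resize the existing figure afterwards
fig.set_size_inches(10, 4)

width, height = fig.get_size_inches()
```

Sizes are given in inches; the pixel dimensions of a saved image also depend on the figure's DPI.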

How I Became a Data Science Competition Master from Scratch

Overview: Winning data science competitions can be a complex process, but you can crack the top 3 if you have a framework to follow. Hear from a top data science hackathon expert about how he went from scratch to winning data science competitions.

Introduction: There is no alternative to learning through experience. Especially in the data science industry! I recently won the top prize in Zindi’s Zimnat Insurance Recommendation challenge – an achievement that ranks top among my […]

6 Key Points you Should Focus on for your Next Data Science Interview

Overview: Preparing for your next data science interview? You need to ensure you’re covering your basics. Here are 6 key points we’ve taken from our data science interview experience that you should focus on.

Introduction: You’ve finally done it! You have landed an interview for a data science role. Now, a day before your interview, you’re not sure what to study. The day is almost here but there is so much to cover! Sound familiar? Interviews can be daunting […]

The Best Data Science Libraries in Python

Preface: Python is the most commonly used programming language in the field of Data Science these days. While Python provides a lot of functionality, the availability of various multi-purpose, ready-to-use libraries is what makes the language the top choice for Data Scientists. Some of these libraries are well known and widely used, while others are not so common. In this article I have tried to compile a list of Python libraries and categorize them according to […]
