Top 5 Machine Learning GitHub Repositories & Reddit Discussions (October 2018)

Introduction: “Should I use GitHub for my projects?” I’m often asked this question by aspiring data scientists. There’s only one answer: “Absolutely!” GitHub is an invaluable platform for data scientists looking to stand out from the crowd. It’s an online resume for displaying your code to recruiters and fellow professionals. The fact that GitHub hosts open-source projects from top tech behemoths like Google, Facebook, IBM, and NVIDIA is what adds to the gloss of […]

How do Transformers Work in NLP? A Guide to the Latest State-of-the-Art Models

Overview: The Transformer model in NLP has truly changed the way we work with text data. The Transformer is behind the recent NLP developments, including Google’s BERT. Learn how the Transformer idea works, how it’s related to language modeling and sequence-to-sequence modeling, and how it enables Google’s BERT model. Introduction: I love being a data scientist working in Natural Language Processing (NLP) right now. The breakthroughs and developments are occurring at an unprecedented pace. From the super-efficient ULMFiT framework to Google’s […]

Innoplexus Sentiment Analysis Hackathon: Top 3 Out-of-the-Box Winning Approaches

Overview: Hackathons are a wonderful opportunity to gauge your data science knowledge and compete to win lucrative prizes and job opportunities. Here are the top 3 approaches from the Innoplexus Sentiment Analysis Hackathon, a superb NLP challenge. Introduction: I’m a big fan of hackathons. I’ve learned so much about data science from participating in them over the past few years. I’ll admit it: I have gained a lot of knowledge through this medium and this, in […]

How to use a Machine Learning Model to Make Predictions on Streaming Data using PySpark

Overview: Streaming data is a thriving concept in the machine learning space. Learn how to use a machine learning model (such as logistic regression) to make predictions on streaming data using PySpark. We’ll cover the basics of Streaming Data and Spark Streaming, and then dive into the implementation part. Introduction: Picture this: every second, more than 8,500 Tweets are sent, more than 900 photos are uploaded on Instagram, more than 4,200 Skype calls are made, more than 78,000 […]

Seaborn Distribution/Histogram Plot – Tutorial and Examples

Introduction: Seaborn is one of the most widely used data visualization libraries in Python, built as an extension to Matplotlib. It offers a simple, intuitive, yet highly customizable API for data visualization. In this tutorial, we’ll take a look at how to plot a histogram in Seaborn. We’ll cover how to plot a histogram with Seaborn, how to change histogram bin sizes, how to plot Kernel Density Estimation plots on top of histograms, and how to show distribution data instead of […]

Matplotlib Bar Plot – Tutorial and Examples

Introduction: Matplotlib is one of the most widely used data visualization libraries in Python. From simple to complex visualizations, it’s the go-to library for most. In this tutorial, we’ll take a look at how to create a bar plot in Matplotlib. Bar graphs display numerical quantities on one axis and categorical variables on the other, letting you see how many occurrences there are for the different categories. Bar charts can be used for visualizing a time series, as well as […]
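
The idea of categories on one axis and counts on the other can be sketched as follows. This is a minimal illustration, not the tutorial's own code: the `observations` list is made-up data, and the `Agg` backend is an assumption so the sketch runs without a display.

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt

# Made-up categorical data, used only for illustration.
observations = ["cat", "dog", "cat", "bird", "dog", "cat"]

# Count how many times each category occurs.
counts = Counter(observations)
categories = list(counts)
values = [counts[c] for c in categories]

fig, ax = plt.subplots()
# Categories go on the x-axis, occurrence counts on the y-axis.
bars = ax.bar(categories, values)
ax.set_xlabel("Category")
ax.set_ylabel("Occurrences")
fig.canvas.draw()  # render; use plt.show() instead in an interactive session
```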

Matplotlib: Change Scatter Plot Marker Size

Introduction: Matplotlib is one of the most widely used data visualization libraries in Python. Much of Matplotlib’s popularity comes from its customization options: you can tweak just about any element from its hierarchy of objects. In this tutorial, we’ll take a look at how to change the marker size in a Matplotlib scatter plot. Import Data: We’ll use the World Happiness dataset, and compare the Happiness Score against varying features to see what influences perceived happiness in the world: […]
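
The excerpt ends before the marker-size step, so here is a hedged sketch of one common way to do it: the `s` argument of `scatter`, which is given in points squared. The data is randomly generated for illustration and is not the World Happiness dataset; the `Agg` backend is an assumption so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np

# Synthetic points, for illustration only.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 30)
y = rng.uniform(0, 10, 30)

fig, ax = plt.subplots()

# A single number gives every marker the same size (in points squared).
small = ax.scatter(x, y, s=10, label="s=10")

# An array with one entry per point gives each marker its own size.
sizes = 20 * (1 + np.arange(30))
scaled = ax.scatter(x + 12, y, s=sizes, label="per-point sizes")

ax.legend()
fig.canvas.draw()  # render; use plt.show() instead in an interactive session
```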

Matplotlib Histogram Plot – Tutorial and Examples

Introduction: Matplotlib is one of the most widely used data visualization libraries in Python. From simple to complex visualizations, it’s the go-to library for most. In this tutorial, we’ll take a look at how to plot a histogram in Matplotlib. Histogram plots are a great way to visualize distributions of data. In a histogram, each bar groups numbers into ranges. Taller bars show that more data falls in that range. A histogram displays the shape and spread of […]
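
To make the "each bar groups numbers into ranges" idea concrete, here is a small pure-Python sketch of the binning step a histogram performs. The helper `histogram_counts` and the `ages` sample are hypothetical names and data introduced only for this illustration; a plotting library would normally do this for you.

```python
from collections import Counter


def histogram_counts(values, bin_width, start=0):
    """Count how many values fall into each [left, left + bin_width) range."""
    counts = Counter()
    for v in values:
        # Index of the bin containing v, converted to that bin's left edge.
        index = int((v - start) // bin_width)
        counts[start + index * bin_width] += 1
    return dict(sorted(counts.items()))


ages = [3, 7, 12, 15, 18, 21, 22, 25, 29, 41]  # made-up sample data
counts = histogram_counts(ages, bin_width=10)

for left, n in counts.items():
    # One text "bar" per range; a longer bar means more values in that range.
    print(f"{left:>2}-{left + 10:<2} {'#' * n}")
```

Ranges that contain no values simply do not appear in the result, which is why no bar is printed for 30-40 here.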

Rotate Axis Labels in Matplotlib

Introduction: Matplotlib is one of the most widely used data visualization libraries in Python. Much of Matplotlib’s popularity comes from its customization options: you can tweak just about any element from its hierarchy of objects. In this tutorial, we’ll take a look at how to rotate axis text/labels in a Matplotlib plot. Creating a Plot: Let’s create a simple plot first:

    import matplotlib.pyplot as plt
    import numpy as np

    x = np.arange(0, 10, 0.1)
    y = np.sin(x)
    plt.plot(x, y)
    […]
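
The excerpt stops before the rotation step itself. As a hedged sketch of one way it could be done (not necessarily the approach the full tutorial takes), the same sine plot can have its tick labels rotated with `Axes.tick_params` and its `labelrotation` parameter. The `Agg` backend and the chosen angles are assumptions made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np

# Same data as the simple plot in the excerpt.
x = np.arange(0, 10, 0.1)
y = np.sin(x)

fig, ax = plt.subplots()
ax.plot(x, y)

# Rotate the tick labels on each axis; angles are in degrees.
ax.tick_params(axis="x", labelrotation=45)
ax.tick_params(axis="y", labelrotation=90)

fig.canvas.draw()  # render; use plt.show() instead in an interactive session
```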

How to Plot Inline and With Qt – Matplotlib with IPython/Jupyter Notebooks

Introduction: There are a number of different data visualization libraries for Python. Of all of them, however, Matplotlib is easily the most popular and widely used. With Matplotlib you can create both simple and complex visualizations. Jupyter notebooks are one of the most popular ways of sharing data science and data analysis projects, code, and visualizations. Although you may know how to visualize data with Matplotlib, you may not know how to use Matplotlib in a Jupyter […]
