How to use a Machine Learning Model to Make Predictions on Streaming Data using PySpark

Overview

  • Streaming data is a fast-growing area of the machine learning space
  • Learn how to use a machine learning model (such as logistic regression) to make predictions on streaming data using PySpark
  • We’ll cover the basics of streaming data and Spark Streaming, and then dive into the implementation

 

Introduction

Picture this – every second, more than 8,500 tweets are sent, more than 900 photos are uploaded to Instagram, more than 4,200 Skype calls are made, more than 78,000 Google searches happen, and more than 2 million emails are sent (according to Internet Live Stats).

We are generating data at an unprecedented pace and scale right now. What a great time to be working in the data science space! But with great data come equally complex challenges.

Primarily – how do we collect data at this scale? How do we ensure that our machine learning pipeline keeps producing results as soon as the data is generated and collected? These are significant challenges the industry is facing.

 

 

 
