Apache Kafka + KSQL + TensorFlow for Data Scientists via Python + Jupyter Notebook

Why would a data scientist use Kafka Jupyter Python KSQL TensorFlow all together in a single notebook?

There is an impedance mismatch between model development using Python and its Machine Learning tool stack and a scalable, reliable data platform. The former is what you need for quick and easy prototyping to build analytic models. The latter is what you need to use for data ingestion, preprocessing, model deployment and monitoring at scale. It requires low latency, high throughput, zero data loss and 24/7 availability requirements.

This is the main reason I see in the field why companies struggle to bring analytic models into production to add business value. Python in practice is not the most well-known technology for large scale and performant, reliable environments. However, it is a great tool for data scientist and a great client of a data platform like Apache Kafka.

Therefore, I created a project to demonstrate how this impedance mismatch can be solved. A much more detailed blog post about this topic will come on Confluent Blog soon. In this blog post here, I want to discuss and share my Github project:

Making Machine Learning Simple and Scalable

 

To finish reading, please visit source site