How to Save and Reuse Data Preparation Objects in Scikit-Learn

Last Updated on June 30, 2020

It is critical that any data preparation performed on a training dataset is also performed on a new dataset in the future.

This may include a test dataset when evaluating a model or new data from the domain when using a model to make predictions.

Typically, the model fit on the training dataset is saved for later use. The correct solution to preparing new data for the model in the future is to also save any data preparation objects, like data scaling methods, to file along with the model.
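As a minimal sketch of this idea, the example below fits a scaler and a model on training data, saves both with pickle, then loads them to prepare and predict on held-out data. The synthetic dataset, the choice of MinMaxScaler and LogisticRegression, and the file names model.pkl and scaler.pkl are assumptions made for illustration, not requirements.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic two-class dataset (an assumption, used only for illustration)
X, y = make_blobs(n_samples=100, centers=2, n_features=2, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=1
)

# Fit the scaler on the training data only, then fit the model on scaled data
scaler = MinMaxScaler()
X_train_scaled = scaler.fit_transform(X_train)
model = LogisticRegression()
model.fit(X_train_scaled, y_train)

# Save both objects to file (hypothetical file names in a temporary directory)
out_dir = Path(tempfile.mkdtemp())
with open(out_dir / "model.pkl", "wb") as f:
    pickle.dump(model, f)
with open(out_dir / "scaler.pkl", "wb") as f:
    pickle.dump(scaler, f)

# Later: load both objects and reuse the saved scaler on new data
with open(out_dir / "model.pkl", "rb") as f:
    loaded_model = pickle.load(f)
with open(out_dir / "scaler.pkl", "rb") as f:
    loaded_scaler = pickle.load(f)

# transform() only, never fit(): the scaling learned from training data is reused
X_test_scaled = loaded_scaler.transform(X_test)
yhat = loaded_model.predict(X_test_scaled)
print("Predictions made for %d rows" % len(yhat))
```

Note that the loaded scaler is only asked to transform the new data; it is not fit again.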

In this tutorial, you will discover how to save a model and data preparation object to file for later use.

After completing this tutorial, you will know:

  • The challenge of correctly preparing test data and new data for a machine learning model.
  • The solution of saving the model and data preparation objects to file for later use.
  • How to save, and later load and use, a machine learning model and its data preparation object on new data.
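The challenge in the first point above can be sketched with a small example: a scaler fit on the training data and a scaler fit afresh on new data map the same raw values to different scaled values. The tiny arrays and the use of MinMaxScaler are assumptions chosen to make the difference easy to see.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training data with a range of 0 to 100
X_train = np.array([[0.0], [50.0], [100.0]])
# Hypothetical new data that covers only part of that range
X_new = np.array([[40.0], [60.0]])

# Correct: reuse the scaler fit on the training data
train_scaler = MinMaxScaler().fit(X_train)
correct = train_scaler.transform(X_new)

# Incorrect: fit a fresh scaler on the new data alone
wrong = MinMaxScaler().fit_transform(X_new)

print("Scaled with the training scaler:", correct.ravel())
print("Scaled with a refit scaler:", wrong.ravel())
```

With the training scaler the new values land at roughly 0.4 and 0.6, where the model expects them; the refit scaler stretches them to the ends of the range, so the model would see inputs on a different scale than it was trained on.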

Kick-start your project with my new book Data Preparation for Machine Learning, including step-by-step tutorials and the Python source code files for all examples.

Let’s get started.