How to Make Predictions with scikit-learn

Last Updated on January 10, 2020 How to predict classification or regression outcomes with scikit-learn models in Python. Once you choose and fit a final machine learning model in scikit-learn, you can use it to make predictions on new data instances. There is some confusion amongst beginners about how exactly to do this. I often see questions such as: How do I make predictions with my model in scikit-learn? In this tutorial, you will discover exactly how you can make classification […]
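
As a rough illustration of the idea in this excerpt, the sketch below fits a final model and then predicts on new instances. The synthetic dataset, the choice of LogisticRegression, and the values in X_new are assumptions made for this example and are not taken from the tutorial itself.

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

# Synthetic two-class dataset, used here only for illustration (assumption).
X, y = make_blobs(n_samples=100, centers=2, n_features=2, random_state=1)

# Fit the final model on all available data.
model = LogisticRegression()
model.fit(X, y)

# New instances must be 2D: one row per instance, one column per feature.
# These values are hypothetical.
X_new = [[-0.5, 4.0], [-10.0, -3.0]]

# Class label predictions, one per row of X_new.
y_new = model.predict(X_new)

# Class membership probabilities, one row per instance, one column per class.
proba = model.predict_proba(X_new)

for row, label, p in zip(X_new, y_new, proba):
    print(row, label, p)
```

A regression model follows the same pattern: fit on the training data, then call predict() on a 2D array of new rows.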

How to Develop a Framework to Spot-Check Machine Learning Algorithms in Python

Last Updated on August 28, 2020 Spot-checking algorithms is a technique in applied machine learning designed to quickly and objectively provide a first set of results on a new predictive modeling problem. Unlike grid searching and other types of algorithm tuning that seek the optimal algorithm or optimal configuration for an algorithm, spot-checking is intended to evaluate a diverse set of algorithms rapidly and provide a rough first-cut result. This first cut result may be used to get an idea […]

Your First Machine Learning Project in Python Step-By-Step

Last Updated on August 19, 2020 Do you want to do machine learning using Python, but you’re having trouble getting started? In this post, you will complete your first machine learning project using Python. In this step-by-step tutorial you will: Download and install Python SciPy and get the most useful package for machine learning in Python. Load a dataset and understand its structure using statistical summaries and data visualization. Create 6 machine learning models, pick the best and build confidence […]

How to Fix FutureWarning Messages in scikit-learn

Last Updated on August 21, 2019 Upcoming changes to the scikit-learn library for machine learning are reported through the use of FutureWarning messages when the code is run. Warning messages can be confusing to beginners as it looks like there is a problem with the code or that they have done something wrong. Warning messages are also not good for operational code as they can obscure errors and program output. There are many ways to handle a warning message, including […]
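
The excerpt is cut off before it lists its approaches, so the following is only one possible sketch, using Python's standard warnings module. The legacy_call function is a hypothetical stand-in for a library call that emits a FutureWarning; it is not a scikit-learn function.

```python
import warnings


def legacy_call():
    # Hypothetical stand-in for a library function that announces
    # an upcoming change through a FutureWarning.
    warnings.warn("a default value will change in a future release", FutureWarning)
    return 42


# Option 1: silence FutureWarning only inside this block.
# The previous warning filters are restored when the block exits.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=FutureWarning)
    result = legacy_call()

# Option 2: record warnings instead of printing them, so they can be
# inspected and the underlying cause fixed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_call()

for w in caught:
    print(w.category.__name__, w.message)
```

Silencing is usually a stopgap. Where possible, updating the code to address the change that the message describes removes the warning at its source.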

How to Save a NumPy Array to File for Machine Learning

Last Updated on August 19, 2020 Developing machine learning models in Python often requires the use of NumPy arrays. NumPy arrays are efficient data structures for working with data in Python, and machine learning models like those in the scikit-learn library, and deep learning models like those in the Keras library, expect input data in the format of NumPy arrays and make predictions in the format of NumPy arrays. As such, it is common to need to save NumPy arrays […]
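
A minimal sketch of saving and reloading an array, assuming NumPy is installed. The file names and the temporary directory are choices made for this example only.

```python
import os
import tempfile

import numpy as np

# Small example array (illustrative values).
data = np.array([[0.1, 0.2], [0.3, 0.4]])

with tempfile.TemporaryDirectory() as tmp:
    # Binary .npy format: compact and preserves dtype and shape.
    npy_path = os.path.join(tmp, "data.npy")
    np.save(npy_path, data)
    loaded = np.load(npy_path)

    # Plain-text CSV format: readable by other tools, but larger and slower.
    csv_path = os.path.join(tmp, "data.csv")
    np.savetxt(csv_path, data, delimiter=",")
    loaded_csv = np.loadtxt(csv_path, delimiter=",")

print(loaded)
print(loaded_csv)
```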

How to Connect Model Input Data With Predictions for Machine Learning

Last Updated on August 19, 2020 Fitting a model to a training dataset is so easy today with libraries like scikit-learn. A model can be fit and evaluated on a dataset in just a few lines of code. It is so easy that it has become a problem. The same few lines of code are repeated again and again and it may not be obvious how to actually use the model to make a prediction. Or, if a prediction is […]

Tune Hyperparameters for Classification Machine Learning Algorithms

Last Updated on August 28, 2020 Machine learning algorithms have hyperparameters that allow you to tailor the behavior of the algorithm to your specific dataset. Hyperparameters are different from parameters, which are the internal coefficients or weights for a model found by the learning algorithm. Unlike parameters, hyperparameters are specified by the practitioner when configuring the model. Typically, it is challenging to know what values to use for the hyperparameters of a given algorithm on a given dataset, therefore it […]

Best Results for Standard Machine Learning Datasets

Last Updated on August 28, 2020 It is important that beginner machine learning practitioners practice on small real-world datasets. So-called standard machine learning datasets contain actual observations, fit into memory, and are well studied and well understood. As such, they can be used by beginner practitioners to quickly test, explore, and practice data preparation and modeling techniques. A practitioner can confirm whether they have the data skills required to achieve a good result on a standard machine learning dataset. A […]

4 Distance Measures for Machine Learning

Last Updated on August 19, 2020 Distance measures play an important role in machine learning. They provide the foundation for many popular and effective machine learning algorithms like k-nearest neighbors for supervised learning and k-means clustering for unsupervised learning. Different distance measures must be chosen and used depending on the types of the data. As such, it is important to know how to implement and calculate a range of different popular distance measures and the intuitions for the resulting scores. […]
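
The excerpt does not name the four measures the post covers, so the sketch below shows three commonly used ones (Euclidean, Manhattan, and Hamming) as an assumption, written with only the standard library.

```python
from math import sqrt


def euclidean(a, b):
    # Straight-line distance between two real-valued vectors.
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute differences, also called city-block distance.
    return sum(abs(x - y) for x, y in zip(a, b))


def hamming(a, b):
    # Fraction of positions that differ; suited to binary or categorical vectors.
    return sum(x != y for x, y in zip(a, b)) / len(a)


print(euclidean([0, 0], [3, 4]))
print(manhattan([0, 0], [3, 4]))
print(hamming([0, 1, 1, 0], [0, 1, 0, 1]))
```

Which measure is appropriate depends on the data type: Euclidean and Manhattan apply to real-valued features, while Hamming applies to vectors of discrete values of equal length.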

10 Clustering Algorithms With Python

Last Updated on August 20, 2020 Clustering or cluster analysis is an unsupervised learning problem. It is often used as a data analysis technique for discovering interesting patterns in data, such as groups of customers based on their behavior. There are many clustering algorithms to choose from and no single best clustering algorithm for all cases. Instead, it is a good idea to explore a range of clustering algorithms and different configurations for each algorithm. In this tutorial, you will […]
