How to Develop Random Forest Ensembles With XGBoost

The XGBoost library provides an efficient implementation of gradient boosting that can be configured to train random forest ensembles. Random forest is a simpler algorithm than gradient boosting, and the XGBoost library allows random forest models to be trained in a way that repurposes and harnesses the computational efficiencies implemented in the library. In this tutorial, you will discover how to use the XGBoost library to develop random forest ensembles. After completing this tutorial, you will know: […]
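
As a quick taste (a minimal sketch, not code from the tutorial itself), XGBoost exposes a scikit-learn-style XGBRFClassifier for this purpose; the dataset and hyperparameter values below are illustrative assumptions:

```python
# Minimal sketch: a random forest trained via XGBoost's XGBRFClassifier.
# Dataset and hyperparameter values are illustrative assumptions.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from xgboost import XGBRFClassifier

# synthetic binary classification problem
X, y = make_classification(n_samples=1000, n_features=20, random_state=7)

# random-forest-style configuration: many trees, row subsampling per tree,
# and column subsampling per split node
model = XGBRFClassifier(n_estimators=100, subsample=0.9, colsample_bynode=0.2)

# estimate performance with 10-fold cross-validation
scores = cross_val_score(model, X, y, cv=10, scoring='accuracy')
print('Mean accuracy: %.3f' % scores.mean())
```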

Read more

Blending Ensemble Machine Learning With Python

Blending is an ensemble machine learning algorithm. It is a colloquial name for stacked generalization, or stacking ensemble, where instead of fitting the meta-model on out-of-fold predictions made by the base models, it is fit on predictions made on a holdout dataset. Blending was used to describe stacking models that combined many hundreds of predictive models by competitors in the $1M Netflix machine learning competition, and as such, it remains a popular technique and name for stacking in competitive machine learning […]
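
A minimal sketch of the idea (the base models, meta-model, and split sizes are illustrative assumptions, not the post's exact code): base models are fit on a training set, and the meta-model is fit on their predictions over a holdout set rather than on out-of-fold predictions.

```python
# Minimal sketch of a blending ensemble under assumed model choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
# split into train (for base models), holdout (for the meta-model), and test
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=1)
X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=1)

base_models = [KNeighborsClassifier(), DecisionTreeClassifier(random_state=1)]

# fit base models on the training set and collect their holdout predictions
meta_X = list()
for model in base_models:
    model.fit(X_train, y_train)
    meta_X.append(model.predict_proba(X_val)[:, 1])
meta_X = np.column_stack(meta_X)

# fit the meta-model on holdout predictions (the defining step of blending)
meta_model = LogisticRegression()
meta_model.fit(meta_X, y_val)

# evaluate the blend on the unseen test set
test_meta_X = np.column_stack([m.predict_proba(X_test)[:, 1] for m in base_models])
yhat = meta_model.predict(test_meta_X)
print('Blend accuracy: %.3f' % accuracy_score(y_test, yhat))
```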

Read more

How to Develop a Random Subspace Ensemble With Python

Random Subspace Ensemble is a machine learning algorithm that combines the predictions from multiple decision trees trained on different subsets of columns in the training dataset. Randomly varying the columns used to train each contributing member of the ensemble has the effect of introducing diversity into the ensemble and, in turn, can lift performance over using a single decision tree. It is related to other ensembles of decision trees, such as bootstrap aggregation (bagging), which creates trees using different samples […]
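
As a minimal sketch (parameter values are illustrative assumptions), scikit-learn's BaggingClassifier can be configured as a random subspace ensemble by disabling row bootstrapping and sampling a fraction of the columns per tree. Note the `estimator` keyword assumes scikit-learn 1.2 or later; older versions used `base_estimator`.

```python
# Minimal sketch: random subspace ensemble via BaggingClassifier.
# Each tree sees all rows (bootstrap=False) but only a random
# subset of columns (max_features=0.5). Values are assumptions.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=5)
model = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=100,
    bootstrap=False,   # use the whole training set for each tree...
    max_features=0.5,  # ...but only half of the columns
    random_state=5,
)
scores = cross_val_score(model, X, y, cv=10, scoring='accuracy')
print('Mean accuracy: %.3f' % scores.mean())
```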

Read more

Error-Correcting Output Codes (ECOC) for Machine Learning

Machine learning algorithms, like logistic regression and support vector machines, are designed for two-class (binary) classification problems. As such, these algorithms must either be modified for multi-class (more than two) classification problems or not used at all. The Error-Correcting Output Codes method is a technique that allows a multi-class classification problem to be reframed as multiple binary classification problems, allowing native binary classification models to be used directly. Unlike one-vs-rest and one-vs-one methods that offer a similar […]
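
A minimal sketch using scikit-learn's OutputCodeClassifier, which implements this idea; wrapping a binary logistic regression handles a three-class problem, and the `code_size` value (binary problems created per class) below is an illustrative assumption:

```python
# Minimal sketch: error-correcting output codes via OutputCodeClassifier.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.multiclass import OutputCodeClassifier

# multi-class dataset with three classes
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_classes=3, random_state=2)

# code_size controls how many binary subproblems are created per class
model = OutputCodeClassifier(LogisticRegression(), code_size=2.0, random_state=2)
scores = cross_val_score(model, X, y, cv=10, scoring='accuracy')
print('Mean accuracy: %.3f' % scores.mean())
```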

Read more

Why Use Ensemble Learning?

What are the Benefits of Ensemble Methods for Machine Learning? Ensembles are predictive models that combine predictions from two or more other models. Ensemble learning methods are popular and the go-to technique when the best performance on a predictive modeling project is the most important outcome. Nevertheless, they are not always the most appropriate technique to use, and beginners in the field of applied machine learning have the expectation that ensembles or a specific ensemble method are always the best method […]

Read more

A Gentle Introduction to Ensemble Learning

Many decisions we make in life are based on the opinions of multiple other people. This includes choosing a book to read based on reviews, choosing a course of action based on the advice of multiple medical doctors, and determining guilt. Often, decision making by a group of individuals results in a better outcome than a decision made by any one member of the group. This is generally referred to as the wisdom of the crowd. We can achieve a […]

Read more

6 Books on Ensemble Learning

Ensemble learning involves combining the predictions from multiple machine learning models. The effect can be both improved predictive performance and lower variance of the predictions made by the model. Ensemble methods are covered in most textbooks on machine learning; nevertheless, there are books dedicated to the topic. In this post, you will discover the top books on the topic of ensemble machine learning. After reading this post, you will know: Books on ensemble learning, including their table of contents and […]

Read more

How to Reduce Variance in a Final Machine Learning Model

A final machine learning model is one trained on all available data and then used to make predictions on new data. A problem with most final models is that they suffer from variance in their predictions. This means that each time you fit a model, you get a slightly different set of parameters that in turn will make slightly different predictions, sometimes more and sometimes less skillful than you expected. This can be […]
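
One common remedy, sketched below under assumed model and seed choices (not the post's exact code), is to fit several copies of the final model that differ only in random seed on all the data and average their predictions:

```python
# Minimal sketch: reduce final-model variance by averaging over seeds.
# The model type and number of members are illustrative assumptions.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor

X, y = make_regression(n_samples=1000, n_features=20, noise=0.3, random_state=4)

# an ensemble of final models that differ only in random seed
members = [MLPRegressor(max_iter=500, random_state=seed).fit(X, y)
           for seed in range(10)]

# average the member predictions for new data (training rows used for brevity)
yhat = np.mean([m.predict(X) for m in members], axis=0)
print(yhat[:5])
```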

Read more

How to Use Out-of-Fold Predictions in Machine Learning

Machine learning algorithms are typically evaluated using resampling techniques such as k-fold cross-validation. During the k-fold cross-validation process, predictions are made on test sets comprised of data not used to train the model. These predictions are referred to as out-of-fold predictions, a type of out-of-sample prediction. Out-of-fold predictions play an important role in machine learning, both in estimating the performance of a model when making predictions on new data in the future, the so-called […]
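
A minimal sketch of collecting out-of-fold predictions with scikit-learn's cross_val_predict (the dataset and model are illustrative assumptions): each row's prediction comes from a model that did not see that row during training.

```python
# Minimal sketch: out-of-fold predictions via cross_val_predict.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_predict

X, y = make_classification(n_samples=1000, n_features=20, random_state=3)
model = LogisticRegression()

# 10-fold cross-validation; yhat holds one out-of-fold prediction per row
yhat = cross_val_predict(model, X, y, cv=10)
print('Out-of-fold accuracy: %.3f' % accuracy_score(y, yhat))
```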

Read more

How to Develop Super Learner Ensembles in Python

Selecting a machine learning algorithm for a predictive modeling problem involves evaluating many different models and model configurations using k-fold cross-validation. The super learner is an ensemble machine learning algorithm that combines all of the models and model configurations that you might investigate for a predictive modeling problem and uses them to make a prediction as good as or better than any single model that you may have investigated. The super learner algorithm is an application […]
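
A minimal sketch in the spirit of the super learner (not a dedicated super learner library): scikit-learn's StackingClassifier fits a meta-model on out-of-fold predictions from a list of candidate models, which is the core of the algorithm. The candidate list and parameter values here are illustrative assumptions.

```python
# Minimal sketch: stacking over candidate models with out-of-fold predictions.
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=6)

# candidate models you might investigate for the problem
candidates = [
    ('knn', KNeighborsClassifier()),
    ('cart', DecisionTreeClassifier(random_state=6)),
]

# the meta-model is fit on out-of-fold predictions from the candidates (cv=5)
model = StackingClassifier(estimators=candidates,
                           final_estimator=LogisticRegression(), cv=5)
scores = cross_val_score(model, X, y, cv=10, scoring='accuracy')
print('Mean accuracy: %.3f' % scores.mean())
```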

Read more