How to Develop a Feature Selection Subspace Ensemble in Python

Random subspace ensembles consist of the same model fit on different randomly selected groups of input features (columns) in the training dataset. There are many ways to choose groups of features in the training dataset, and feature selection is a popular class of data preparation techniques designed specifically for this purpose. The features selected by different configurations of the same feature selection method, or by entirely different feature selection methods, can be used as the basis for ensemble learning. In this […]
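As a minimal sketch of the idea (not the tutorial's code), one member pipeline can be built per feature selection configuration in scikit-learn and the members combined with voting; the SelectKBest scoring function, the k values, and the decision tree base model below are illustrative assumptions:

# sketch: voting ensemble over differently configured feature selection pipelines
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import cross_val_score

# synthetic classification dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10, random_state=1)

# one pipeline per configuration: each member is trained on a different
# subset of columns chosen by SelectKBest
members = [
    (f'k={k}', Pipeline([
        ('select', SelectKBest(score_func=f_classif, k=k)),
        ('model', DecisionTreeClassifier(random_state=1)),
    ]))
    for k in (4, 8, 12, 16)
]

# combine the members with hard voting and estimate performance
ensemble = VotingClassifier(estimators=members, voting='hard')
scores = cross_val_score(ensemble, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())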

Read more

Extreme Gradient Boosting (XGBoost) Ensemble in Python

Extreme Gradient Boosting (XGBoost) is an open-source library that provides an efficient and effective implementation of the gradient boosting algorithm. Although other open-source implementations of the approach existed before XGBoost, the release of XGBoost appeared to unleash the power of the technique and made the applied machine learning community take notice of gradient boosting more generally. Shortly after its development and initial release, XGBoost became the go-to method and often the key component in winning solutions for classification and regression […]
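For orientation, a minimal example of fitting and evaluating an XGBoost classifier on a synthetic dataset is sketched below; it assumes the xgboost package is installed, and the hyperparameter values are illustrative defaults rather than tuned settings:

# sketch: gradient boosting ensemble with the xgboost library
from xgboost import XGBClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# synthetic classification dataset split into train and test sets
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

# fit a boosted ensemble of 100 trees
model = XGBClassifier(n_estimators=100, eval_metric='logloss')
model.fit(X_train, y_train)

# evaluate on the held-out test set
yhat = model.predict(X_test)
print('Accuracy: %.3f' % accuracy_score(y_test, yhat))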

Read more

How to Develop a Light Gradient Boosted Machine (LightGBM) Ensemble

Light Gradient Boosted Machine, or LightGBM for short, is an open-source library that provides an efficient and effective implementation of the gradient boosting algorithm. LightGBM extends the gradient boosting algorithm by adding a type of automatic feature selection as well as focusing on boosting examples with larger gradients. This can result in a dramatic speedup of training and improved predictive performance. As such, LightGBM has become a de facto algorithm for machine learning competitions when working with tabular data for […]
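The LightGBM library exposes a scikit-learn-compatible wrapper, so a minimal usage sketch mirrors the XGBoost example above; it assumes the lightgbm package is installed and uses illustrative hyperparameters:

# sketch: gradient boosting ensemble with the lightgbm library
from lightgbm import LGBMClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# synthetic classification dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# fit and evaluate a LightGBM ensemble of 100 trees
model = LGBMClassifier(n_estimators=100)
scores = cross_val_score(model, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())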

Read more

How to Develop Random Forest Ensembles With XGBoost

The XGBoost library provides an efficient implementation of gradient boosting that can be configured to train random forest ensembles. Random forest is a simpler algorithm than gradient boosting. The XGBoost library allows random forest models to be trained in a way that repurposes and harnesses the computational efficiencies implemented in the library. In this tutorial, you will discover how to use the XGBoost library to develop random forest ensembles. After completing this tutorial, you will know: […]
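As a hedged sketch, the xgboost package provides an XGBRFClassifier class for exactly this purpose; the subsample and colsample_bynode values below are illustrative choices that mimic the row and column sampling of a conventional random forest:

# sketch: random forest ensemble trained via the xgboost library
from xgboost import XGBRFClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# synthetic classification dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# random forest of 100 trees: rows subsampled per tree,
# columns subsampled per split node
model = XGBRFClassifier(n_estimators=100, subsample=0.9, colsample_bynode=0.2)
scores = cross_val_score(model, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())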

Read more

Blending Ensemble Machine Learning With Python

Blending is an ensemble machine learning algorithm. It is a colloquial name for stacked generalization, or stacking, in which the meta-model is fit on predictions made on a holdout dataset rather than on out-of-fold predictions made by the base models. The term was used by competitors in the $1M Netflix machine learning competition to describe stacked models that combined many hundreds of predictive models, and as such it remains a popular technique and name for stacking in competitive machine learning […]
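A minimal sketch of the procedure (not the tutorial's full implementation) follows: base models are fit on a training set, a logistic regression meta-model is fit on their predicted probabilities for a separate holdout set, and the blend is scored on a test set; the base models and split sizes are assumptions for illustration:

# sketch: blending ensemble with a holdout set for the meta-model
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# synthetic dataset split into train, holdout (validation), and test sets
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)
X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.5, random_state=1)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.5, random_state=1)

# fit the base models on the training set only
base_models = [DecisionTreeClassifier(random_state=1), KNeighborsClassifier()]
for model in base_models:
    model.fit(X_train, y_train)

# meta-features: base-model predicted probabilities on the holdout set
meta_X_val = np.column_stack([m.predict_proba(X_val)[:, 1] for m in base_models])
meta_model = LogisticRegression()
meta_model.fit(meta_X_val, y_val)

# evaluate the blend on the unseen test set
meta_X_test = np.column_stack([m.predict_proba(X_test)[:, 1] for m in base_models])
yhat = meta_model.predict(meta_X_test)
print('Blended accuracy: %.3f' % accuracy_score(y_test, yhat))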

Read more

How to Develop a Random Subspace Ensemble With Python

Random Subspace Ensemble is a machine learning algorithm that combines the predictions from multiple decision trees trained on different subsets of columns in the training dataset. Randomly varying the columns used to train each contributing member of the ensemble has the effect of introducing diversity into the ensemble and, in turn, can lift performance over using a single decision tree. It is related to other ensembles of decision trees such as bootstrap aggregation (bagging) that creates trees using different samples […]
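One common way to realize a random subspace ensemble is scikit-learn's BaggingClassifier with row bootstrapping disabled and column sampling enabled; the sketch below assumes this approach with illustrative hyperparameters (scikit-learn versions before 1.2 name the estimator argument base_estimator):

# sketch: random subspace ensemble via BaggingClassifier
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score

# synthetic classification dataset
X, y = make_classification(n_samples=1000, n_features=20, random_state=1)

# bootstrap=False keeps all rows for every tree;
# max_features < 1.0 trains each tree on a random subset of columns
model = BaggingClassifier(
    estimator=DecisionTreeClassifier(),
    n_estimators=50,
    bootstrap=False,
    max_features=0.5,
    random_state=1,
)
scores = cross_val_score(model, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())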

Read more

Error-Correcting Output Codes (ECOC) for Machine Learning

Machine learning algorithms, like logistic regression and support vector machines, are designed for two-class (binary) classification problems. As such, these algorithms must either be modified for multi-class (more than two) classification problems or not used at all. The Error-Correcting Output Codes method is a technique that allows a multi-class classification problem to be reframed as multiple binary classification problems, allowing native binary classification models to be used directly. Unlike one-vs-rest and one-vs-one methods that offer a similar […]
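scikit-learn ships an OutputCodeClassifier that implements this reframing; the sketch below wraps a logistic regression in ECOC on a synthetic three-class problem, with the code_size value (binary problems created per class) chosen purely for illustration:

# sketch: error-correcting output codes around a binary classifier
from sklearn.datasets import make_classification
from sklearn.multiclass import OutputCodeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# synthetic multi-class (3-class) dataset
X, y = make_classification(n_samples=1000, n_features=20, n_informative=10,
                           n_classes=3, random_state=1)

# reframe the multi-class problem as multiple binary problems
model = OutputCodeClassifier(estimator=LogisticRegression(max_iter=1000),
                             code_size=2, random_state=1)
scores = cross_val_score(model, X, y, cv=5)
print('Mean accuracy: %.3f' % scores.mean())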

Read more

Why Use Ensemble Learning?

What are the Benefits of Ensemble Methods for Machine Learning? Ensembles are predictive models that combine predictions from two or more other models. Ensemble learning methods are popular and are the go-to technique when the best performance on a predictive modeling project is the most important outcome. Nevertheless, they are not always the most appropriate technique to use, and beginners in the field of applied machine learning often have the expectation that ensembles, or a specific ensemble method, are always the best method […]

Read more

A Gentle Introduction to Ensemble Learning

Many decisions we make in life are based on the opinions of multiple other people. This includes choosing a book to read based on reviews, choosing a course of action based on the advice of multiple medical doctors, and determining guilt. Often, decision making by a group of individuals results in a better outcome than a decision made by any one member of the group. This is generally referred to as the wisdom of the crowd. We can achieve a […]

Read more

6 Books on Ensemble Learning

Ensemble learning involves combining the predictions from multiple machine learning models. The effect can be both improved predictive performance and lower variance of the predictions made by the model. Ensemble methods are covered in most textbooks on machine learning; nevertheless, there are books dedicated to the topic. In this post, you will discover the top books on the topic of ensemble machine learning. After reading this post, you will know: Books on ensemble learning, including their table of contents and […]

Read more