A Gentle Introduction to Multiple-Model Machine Learning

An ensemble learning method involves combining the predictions from multiple contributing models. Nevertheless, not all techniques that make use of multiple machine learning models are ensemble learning algorithms. It is common to divide a prediction problem into subproblems. For example, some problems naturally subdivide into independent but related subproblems and a machine learning model can be prepared for each. It is less clear whether these represent examples of ensemble learning, although we might distinguish these methods from ensembles given the […]

Read more

A Gentle Introduction to Ensemble Diversity for Machine Learning

Ensemble learning combines the predictions from machine learning models for classification and regression. We pursue using ensemble methods to achieve improved predictive performance, and it is this improvement over any of the contributing models that defines whether an ensemble is good or not. A property that is present in a good ensemble is the diversity of the predictions made by contributing models. Diversity is a slippery concept as it has not been precisely defined; nevertheless, it provides a useful practical […]

Read more

Essence of Bootstrap Aggregation Ensembles

Bootstrap aggregation, or bagging, is a popular ensemble method that fits a decision tree on different bootstrap samples of the training dataset. It is simple to implement and effective on a wide range of problems, and importantly, modest extensions to the technique result in ensemble methods that are among some of the most powerful techniques, like random forest, that perform well on a wide range of predictive modeling problems. As such, we can generalize the bagging method to a framework […]

Read more

Histogram-Based Gradient Boosting Ensembles in Python

Gradient boosting is an ensemble of decision trees algorithms. It may be one of the most popular techniques for structured (tabular) classification and regression predictive modeling problems given that it performs so well across a wide range of datasets in practice. A major problem of gradient boosting is that it is slow to train the model. This is particularly a problem when using the model on large datasets with tens of thousands of examples (rows). Training the trees that are […]

Read more

Ensemble Learning Algorithm Complexity and Occam’s Razor

Occam’s razor suggests that in machine learning, we should prefer simpler models with fewer coefficients over complex models like ensembles. Taken at face value, the razor is a heuristic that suggests more complex hypotheses make more assumptions that, in turn, will make them too narrow and not generalize well. In machine learning, it suggests complex models like ensembles will overfit the training dataset and perform poorly on new data. In practice, ensembles are almost universally the type of model chosen […]

Read more

What Is Meta-Learning in Machine Learning?

Meta-learning in machine learning refers to learning algorithms that learn from other learning algorithms. Most commonly, this means the use of machine learning algorithms that learn how to best combine the predictions from other machine learning algorithms in the field of ensemble learning. Nevertheless, meta-learning might also refer to the manual process of model selecting and algorithm tuning performed by a practitioner on a machine learning project that modern automl algorithms seek to automate. It also refers to learning across […]

Read more

Dynamic Classifier Selection Ensembles in Python

Dynamic classifier selection is a type of ensemble learning algorithm for classification predictive modeling. The technique involves fitting multiple machine learning models on the training dataset, then selecting the model that is expected to perform best when making a prediction, based on the specific details of the example to be predicted. This can be achieved using a k-nearest neighbor model to locate examples in the training dataset that are closest to the new example to be predicted, evaluating all models […]

Read more

Develop an Intuition for How Ensemble Learning Works

Ensembles are a machine learning method that combine the predictions from multiple models in an effort to achieve better predictive performance. There are many different types of ensembles, although all approaches have two key properties: they require that the contributing models are different so that they make different errors and they combine the predictions in an attempt to harness what each different model does well. Nevertheless, it is not clear how ensembles manage to achieve this, especially in the context […]

Read more

Multivariate Adaptive Regression Splines (MARS) in Python

Multivariate Adaptive Regression Splines, or MARS, is an algorithm for complex non-linear regression problems. The algorithm involves finding a set of simple linear functions that in aggregate result in the best predictive performance. In this way, MARS is a type of ensemble of simple linear functions and can achieve good performance on challenging regression problems with many input variables and complex non-linear relationships. In this tutorial, you will discover how to develop Multivariate Adaptive Regression Spline models in Python. After […]

Read more

Develop a Bagging Ensemble with Different Data Transformations

Bootstrap aggregation, or bagging, is an ensemble where each model is trained on a different sample of the training dataset. The idea of bagging can be generalized to other techniques for changing the training dataset and fitting the same model on each changed version of the data. One approach is to use data transforms that change the scale and probability distribution of input variables as the basis for the training of contributing members to a bagging-like ensemble. We can refer […]

Read more
1 2 3 4 5