How To Choose The Right Test Options When Evaluating Machine Learning Algorithms

Last Updated on June 21, 2016

The test options you use when evaluating machine learning algorithms can mean the difference between over-learning, a mediocre result, and a usable state-of-the-art result that you can confidently shout from the rooftops (you really do feel like doing that sometimes). In this post you will discover the standard test options you can use in your algorithm evaluation test harness and how to choose the right options next time. Randomness The root of the […]

A Simple Intuition for Overfitting, or Why Testing on Training Data is a Bad Idea

Last Updated on August 21, 2016

When you first start out with machine learning you load a dataset and try models. You might think to yourself, why can’t I just build a model with all of the data and evaluate it on the same dataset? It seems reasonable. More data to train the model is better, right? Evaluating the model and reporting results on the same dataset will tell you how good the model is, right? Wrong. In this post […]

Classification Accuracy is Not Enough: More Performance Measures You Can Use

Last Updated on June 20, 2019

When you build a model for a classification problem you almost always want to look at the accuracy of that model: the number of correct predictions out of all predictions made. This is the classification accuracy. In a previous post, we looked at evaluating the robustness of a model for making predictions on unseen data using cross-validation and multiple cross-validation, where we used classification accuracy and average classification accuracy. Once you have a […]

Machine Learning Tips from a World Class Practitioner: Phil Brierley

Last Updated on June 7, 2016

Phil Brierley won the Heritage Health Prize Kaggle machine learning competition. Phil was trained as a mechanical engineer and has a background in data mining with his company Tiberius Data Mining. He is heavily into R these days and keeps a blog at Another Data Mining Blog. In October 2013 he presented to the Melbourne Users of R special interest group. The title of his talk was “Techniques to improve the accuracy of your Predictive Models” and you can […]

BigML Review: Discover the Clever Features in This Machine Learning as a Service Platform

Last Updated on August 16, 2020

Machine learning has been commoditized into a service. This is a recent trend that looks like it will develop into the mainstream, like commoditized storage and virtualization. It is the natural next step. In this review you will learn about BigML, which provides commoditized machine learning as a service for business analysts and application integration. About BigML BigML was co-founded by a group of five guys in 2011. Francisco Martin seems to be active […]

BigML Tutorial: Develop Your First Decision Tree and Make Predictions

Last Updated on June 7, 2016

BigML is a new and interesting machine learning as a service company based out of Corvallis, Oregon, USA. In a previous post, we reviewed the BigML service, the key features and the ways in which you could use this service in your business, on your side project or to present to clients. In this tutorial we will walk step by step through developing a predictive model using the BigML platform and use […]

The Seductive Trap of Black-Box Machine Learning

Last Updated on April 4, 2018

For as long as I have been participating in data mining and machine learning competitions, I have thought about automating my participation. Maybe it shows that I want to solve the problem of building the tool more than I want to solve the problem at hand. When working on a dataset, I typically spend a disproportionate amount of time thinking about algorithm tuning and running tuning experiments. I am prone to performing post-competition analysis […]

How to Layout and Manage Your Machine Learning Project

Last Updated on June 7, 2016

Project layout is critical for machine learning projects just as it is for software development projects. I think of it like language. A project layout organizes thoughts and gives you context for ideas just like knowing the names for things gives you the basis for thinking. In this post I want to highlight some considerations in the layout and management of your machine learning project. This is very much related to the goals of […]

Model Prediction Accuracy Versus Interpretation in Machine Learning

Last Updated on August 15, 2020

In their book Applied Predictive Modeling, Kuhn and Johnson comment early on the trade-off of model prediction accuracy versus model interpretation. For a given problem, it is critical to have a clear idea of which is the priority, accuracy or explainability, so that this trade-off can be made explicitly rather than implicitly. In this post you will discover and consider this important trade-off. Model Accuracy vs Explainability (photo by Donald Hobern, some rights reserved) […]

Clever Application Of A Predictive Model

Last Updated on August 15, 2020

What if you could use a predictive model to find new combinations of attributes that do not exist in the data but could be valuable? In Chapter 10 of Applied Predictive Modeling, Kuhn and Johnson provide a case study that does just this. It’s a fascinating and creative example of how to use a predictive model. In this post we will discover this less obvious use of a predictive model and the types of […]
