Articles About Machine Learning

14 Different Types of Learning in Machine Learning

Last Updated on November 11, 2019 Machine learning is a large field of study that overlaps with and inherits ideas from many related fields such as artificial intelligence. The focus of the field is learning, that is, acquiring skills or knowledge from experience. Most commonly, this means synthesizing useful concepts from historical data. As such, there are many different types of learning that you may encounter as a practitioner in the field of machine learning: from whole fields of study […]

Read more

How to Save a NumPy Array to File for Machine Learning

Last Updated on August 19, 2020 Developing machine learning models in Python often requires the use of NumPy arrays. NumPy arrays are efficient data structures for working with data in Python, and machine learning models like those in the scikit-learn library, and deep learning models like those in the Keras library, expect input data in the format of NumPy arrays and make predictions in the format of NumPy arrays. As such, it is common to need to save NumPy arrays […]

Read more

How to Connect Model Input Data With Predictions for Machine Learning

Last Updated on August 19, 2020 Fitting a model to a training dataset is so easy today with libraries like scikit-learn. A model can be fit and evaluated on a dataset in just a few lines of code. It is so easy that it has become a problem. The same few lines of code are repeated again and again and it may not be obvious how to actually use the model to make a prediction. Or, if a prediction is […]

Read more

What Does Stochastic Mean in Machine Learning?

Last Updated on July 24, 2020 The behavior and performance of many machine learning algorithms are referred to as stochastic. Stochastic refers to a variable process where the outcome involves some randomness and has some uncertainty. It is a mathematical term and is closely related to “randomness” and “probabilistic” and can be contrasted to the idea of “deterministic.” The stochastic nature of machine learning algorithms is an important foundational concept in machine learning and is required to be understand in […]

Read more

How to Save and Reuse Data Preparation Objects in Scikit-Learn

Last Updated on June 30, 2020 It is critical that any data preparation performed on a training dataset is also performed on a new dataset in the future. This may include a test dataset when evaluating a model or new data from the domain when using a model to make predictions. Typically, the model fit on the training dataset is saved for later use. The correct solution to preparing new data for the model in the future is to also […]

Read more

3 Ways to Encode Categorical Variables for Deep Learning

Last Updated on August 27, 2020 Machine learning and deep learning models, like those in Keras, require all input and output variables to be numeric. This means that if your data contains categorical data, you must encode it to numbers before you can fit and evaluate a model. The two most popular techniques are an integer encoding and a one hot encoding, although a newer technique called learned embedding may provide a useful middle ground between these two methods. In […]

Read more

How to Perform Feature Selection with Categorical Data

Last Updated on August 18, 2020 Feature selection is the process of identifying and selecting a subset of input features that are most relevant to the target variable. Feature selection is often straightforward when working with real-valued data, such as using the Pearson’s correlation coefficient, but can be challenging when working with categorical data. The two most commonly used feature selection methods for categorical input data when the target variable is also categorical (e.g. classification predictive modeling) are the chi-squared […]

Read more

How to Choose a Feature Selection Method For Machine Learning

Last Updated on August 20, 2020 Feature selection is the process of reducing the number of input variables when developing a predictive model. It is desirable to reduce the number of input variables to both reduce the computational cost of modeling and, in some cases, to improve the performance of the model. Statistical-based feature selection methods involve evaluating the relationship between each input variable and the target variable using statistics and selecting those input variables that have the strongest relationship […]

Read more

How to Use an Empirical Distribution Function in Python

Last Updated on August 28, 2020 An empirical distribution function provides a way to model and sample cumulative probabilities for a data sample that does not fit a standard probability distribution. As such, it is sometimes called the empirical cumulative distribution function, or ECDF for short. In this tutorial, you will discover the empirical probability distribution function. After completing this tutorial, you will know: Some data samples cannot be summarized using a standard distribution. An empirical distribution function provides a […]

Read more

A Gentle Introduction to Model Selection for Machine Learning

Given easy-to-use machine learning libraries like scikit-learn and Keras, it is straightforward to fit many different machine learning models on a given predictive modeling dataset. The challenge of applied machine learning, therefore, becomes how to choose among a range of different models that you can use for your problem. Naively, you might believe that model performance is sufficient, but should you consider other concerns, such as how long the model takes to train or how easy it is to explain […]

Read more
1 179 180 181 182 183 203