Articles About Machine Learning

Develop a Model for the Imbalanced Classification of Good and Bad Credit

Last Updated on August 28, 2020

Misclassification errors on the minority class are more important than other types of prediction errors for some imbalanced classification tasks. One example is the problem of classifying bank customers by whether or not they should receive a loan. Giving a loan to a bad customer marked as a good customer results in a greater cost to the bank than denying a loan to a good customer marked as a bad customer. This requires […]

Imbalanced Classification Model to Detect Mammography Microcalcifications

Last Updated on August 21, 2020

Cancer detection is a popular example of an imbalanced classification problem because there are often significantly more cases of non-cancer than actual cancer. A standard imbalanced classification dataset is the mammography dataset that involves detecting breast cancer from radiological scans, specifically the presence of clusters of microcalcifications that appear bright on a mammogram. This dataset was constructed by scanning the images, segmenting them into candidate objects, and using computer vision techniques to describe each […]

Predictive Model for the Phoneme Imbalanced Classification Dataset

Last Updated on August 21, 2020

Many binary classification tasks do not have an equal number of examples from each class, i.e. the class distribution is skewed or imbalanced. Nevertheless, accuracy is equally important in both classes. An example is the classification of vowel sounds from European languages as either nasal or oral for speech recognition, where there are many more examples of nasal than oral vowels. Classification accuracy is important for both classes, although accuracy as a metric cannot […]

Imbalanced Classification with the Adult Income Dataset

Last Updated on August 21, 2020

Many binary classification tasks do not have an equal number of examples from each class, i.e. the class distribution is skewed or imbalanced. A popular example is the adult income dataset that involves predicting personal income levels as above or below $50,000 per year based on personal details such as relationship and education level. There are many more cases of incomes less than $50K than above $50K, although the skew is not severe. This […]

Step-By-Step Framework for Imbalanced Classification Projects

Last Updated on March 19, 2020

Classification predictive modeling problems involve predicting a class label for a given set of inputs. It is a challenging problem in general, especially if little is known about the dataset, as there are tens, if not hundreds, of machine learning algorithms to choose from. The problem is made significantly more difficult if the distribution of examples across the classes is imbalanced. This requires the use of specialized methods to either change the dataset or […]

Imbalanced Classification with the Fraudulent Credit Card Transactions Dataset

Last Updated on August 21, 2020

Fraud is a major problem for credit card companies, both because of the large volume of transactions that are completed each day and because many fraudulent transactions look a lot like normal transactions. Identifying fraudulent credit card transactions is a common type of imbalanced binary classification where the focus is on the positive (is fraud) class. As such, metrics like precision and recall can be used to summarize model performance in terms of […]

Imbalanced Multiclass Classification with the Glass Identification Dataset

Last Updated on August 21, 2020

Multiclass classification problems are those where a label must be predicted, but there are more than two labels that may be predicted. These are challenging predictive modeling problems because a sufficiently representative number of examples of each class is required for a model to learn the problem. The problem is made more challenging when the number of examples in each class is imbalanced, or skewed toward one or a few of the classes with very few […]

Imbalanced Multiclass Classification with the E.coli Dataset

Last Updated on August 21, 2020

Multiclass classification problems are those where a label must be predicted, but there are more than two labels that may be predicted. These are challenging predictive modeling problems because a sufficiently representative number of examples of each class is required for a model to learn the problem. The problem is made more challenging when the number of examples in each class is imbalanced, or skewed toward one or a few of the classes with very few […]

Neural Networks are Function Approximation Algorithms

Last Updated on August 27, 2020

Supervised learning in machine learning can be described in terms of function approximation. Given a dataset composed of inputs and outputs, we assume that there is an unknown underlying function that consistently maps inputs to outputs in the target domain and that produced the dataset. We then use supervised learning algorithms to approximate this function. Neural networks are an example of a supervised machine learning algorithm that is perhaps best understood in […]

How to Perform Data Cleaning for Machine Learning with Python

Last Updated on June 30, 2020

Data cleaning is a critically important step in any machine learning project. In tabular data, there are many different statistical analysis and data visualization techniques you can use to explore your data in order to identify data cleaning operations you may want to perform. Before jumping to the sophisticated methods, there are some very basic data cleaning operations that you probably should perform on every single machine learning project. These are so basic that […]
