A Gentle Introduction to Sparse Matrices for Machine Learning

Last Updated on August 9, 2019

Matrices that contain mostly zero values are called sparse, distinct from matrices where most of the values are non-zero, called dense.

Large sparse matrices are common in general and especially in applied machine learning, such as in data that contains counts, data encodings that map categories to counts, and even in whole subfields of machine learning such as natural language processing.

It is computationally expensive to represent and work with sparse matrices as though they were dense: memory is spent storing zeros and time is spent operating on them. Substantial performance gains can be achieved by using representations and operations that specifically handle the matrix sparsity.
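To make the cost concrete, here is a minimal sketch that compares the memory used by a dense NumPy array with the same data stored in SciPy's compressed sparse row (CSR) format. The matrix size, the number of non-zero entries, and the random seed are arbitrary choices for illustration, not values from this tutorial.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Illustrative assumption: a 1000 x 1000 matrix with at most 1000 non-zero entries.
rng = np.random.default_rng(0)
dense = np.zeros((1000, 1000), dtype=np.float64)
rows = rng.integers(0, 1000, size=1000)
cols = rng.integers(0, 1000, size=1000)
dense[rows, cols] = 1.0

# Convert to CSR, which stores only the non-zero values and their positions.
sparse = csr_matrix(dense)

# The dense array stores every element, zero or not.
dense_bytes = dense.nbytes
# CSR keeps three arrays: the values, their column indices, and row pointers.
sparse_bytes = sparse.data.nbytes + sparse.indices.nbytes + sparse.indptr.nbytes

print("dense bytes: ", dense_bytes)
print("sparse bytes:", sparse_bytes)
```

The dense array pays for all one million elements, while the CSR version pays only for the non-zero entries plus a small amount of index bookkeeping, so the gap widens as the matrix becomes sparser.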

In this tutorial, you will discover sparse matrices, the issues they present, and how to work with them directly in Python.

After completing this tutorial, you will know:

That sparse matrices contain mostly zero values and are distinct from dense matrices.
The many areas where you are likely to encounter sparse matrices: in data, in data preparation, and in subfields of machine learning.
That there are many efficient ways to store and work with sparse matrices, and that SciPy provides implementations you can use directly.
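As a small preview of the last point, the sketch below converts a dense NumPy array to a SciPy CSR matrix and back, measures its sparsity as the fraction of zero-valued elements, and multiplies it by a vector. The example matrix and the variable names are made up for illustration, and CSR is only one of several sparse formats SciPy offers.

```python
import numpy as np
from scipy.sparse import csr_matrix

# A small dense matrix that is mostly zeros (illustrative values).
A = np.array([[1, 0, 0, 1, 0, 0],
              [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])

# Dense -> sparse (CSR), and back to dense again.
S = csr_matrix(A)
B = S.toarray()

# Sparsity here is the fraction of elements that are zero.
sparsity = 1.0 - np.count_nonzero(A) / A.size

# Sparse matrices support the usual linear algebra operations directly.
v = np.ones(A.shape[1])
product = S @ v

print(S)
print(B)
print("sparsity:", sparsity)
print("S @ v:", product)
```

The round trip through `toarray()` returns the original values, which makes it easy to check a sparse pipeline against a dense one on small inputs before scaling up.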
