Matrices that mostly contain zeroes are said to be sparse.
Sparse matrices are common in applied machine learning (for example, in data encodings that map categories to counts) and in whole subfields of machine learning such as natural language processing (NLP).
Because a sparse matrix contains only a few non-zero values, storing it in a dense two-dimensional array wastes space on zeros. Working with it as though it were dense is also computationally expensive, since most of the work is spent on zero entries. Representations and operations that specifically handle sparsity can significantly improve both memory use and performance.
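As a rough illustration of the space savings, the sketch below compares the storage used by a dense NumPy array with that of its CSR counterpart. The 1000 x 1000 diagonal matrix is a hypothetical example chosen for this sketch, and the exact byte counts depend on the value and index dtypes SciPy selects on your platform.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Hypothetical example: a 1000 x 1000 matrix whose only non-zero
# entries are the 1000 values on the main diagonal.
dense = np.zeros((1000, 1000), dtype=np.float64)
np.fill_diagonal(dense, 1.0)

sparse = csr_matrix(dense)

# Fraction of entries that are zero.
sparsity = 1.0 - sparse.nnz / dense.size

# A dense array stores every entry; CSR stores only the non-zero values
# plus two index arrays (column indices and row pointers).
dense_bytes = dense.nbytes
sparse_bytes = sparse.data.nbytes + sparse.indices.nbytes + sparse.indptr.nbytes

print("Sparsity:", sparsity)
print("Dense storage (bytes):", dense_bytes)
print("CSR storage (bytes):", sparse_bytes)
```

The saving grows with the share of zeros; for a matrix that is mostly non-zero, the extra index arrays can make the sparse form larger than the dense one.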
Python’s SciPy library provides tools for creating sparse matrices in several storage formats, as well as tools for converting a dense matrix to a sparse one. When printed, a sparse matrix lists the (row, column) position of each non-zero entry along with its value.
import numpy as np
from scipy.sparse import csr_matrix

# create a 2-D representation of the matrix
A = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])
print("Dense matrix representation: \n", A)

# convert to sparse matrix representation
S = csr_matrix(A)
print("Sparse matrix: \n", S)

# convert back to 2-D representation of the matrix
B = S.todense()
print("Dense matrix: \n", B)
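To see what the sparse representation actually stores, the following sketch inspects the same matrix in two of SciPy's formats. It is illustrative rather than exhaustive: the variable names `C` and `coords` are introduced here for the example, and SciPy offers other formats that are not shown.

```python
import numpy as np
from scipy.sparse import csr_matrix

A = np.array([[1, 0, 0, 0, 0, 0],
              [0, 0, 2, 0, 0, 1],
              [0, 0, 0, 2, 0, 0]])
S = csr_matrix(A)

# CSR (compressed sparse row) keeps three flat arrays.
print("data:   ", S.data)     # non-zero values, stored row by row
print("indices:", S.indices)  # column index of each stored value
print("indptr: ", S.indptr)   # row i occupies data[indptr[i]:indptr[i + 1]]

# COO (coordinate) format stores explicit (row, column, value) triples,
# which matches the row-column listing shown when a sparse matrix is printed.
C = S.tocoo()
coords = list(zip(C.row.tolist(), C.col.tolist(), C.data.tolist()))
for row, col, value in coords:
    print((row, col), value)
```

CSR is generally a good fit for row slicing and matrix-vector products, while COO is convenient for building a matrix from a list of coordinates before converting it.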