Gradients of Matrices

Learn how to calculate the gradients of matrices.

Gradient with respect to matrices

Machine learning objectives, such as the loss function of a linear regression, can often be written using matrices and vectors, which makes them compact and easy to understand. To make the associated computations easier, it is therefore worthwhile to understand how gradients are computed when matrices are involved.

The gradient of a matrix with respect to a vector (or another matrix) can be computed analogously to the Jacobian of a vector-valued function. The Jacobian can be thought of as a multi-dimensional tensor that collects all the partial derivatives. For example, the gradient of an $m \times n$ matrix $A$ with respect to a $p \times q$ matrix $B$ will be an $(m \times n) \times (p \times q)$ Jacobian $J$ whose entries are given as follows:

$$J_{ijkl} = \frac{\partial A_{ij}}{\partial B_{kl}}$$
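As a concrete sketch of this idea, the four-index Jacobian can be estimated numerically with central finite differences: perturb one entry $B_{kl}$ at a time and record how every entry of the output matrix changes. The function names (`jacobian`, `matmul`) and the example $F(B) = AB$ below are illustrative choices, not part of the original text.

```python
import copy

def matmul(A, B):
    """Plain-Python matrix product of A (m x p) and B (p x q)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def jacobian(f, B, eps=1e-6):
    """Estimate J[i][j][k][l] = dF[i][j] / dB[k][l] for F = f(B),
    using central finite differences on each entry of B."""
    F0 = f(B)
    m, n = len(F0), len(F0[0])
    p, q = len(B), len(B[0])
    # J has shape (m x n) x (p x q), matching the text above.
    J = [[[[0.0] * q for _ in range(p)] for _ in range(n)] for _ in range(m)]
    for k in range(p):
        for l in range(q):
            Bp = copy.deepcopy(B); Bp[k][l] += eps
            Bm = copy.deepcopy(B); Bm[k][l] -= eps
            Fp, Fm = f(Bp), f(Bm)
            for i in range(m):
                for j in range(n):
                    J[i][j][k][l] = (Fp[i][j] - Fm[i][j]) / (2 * eps)
    return J

# Example: F(B) = A @ B. Analytically, dF[i][j]/dB[k][l] = A[i][k] * delta(j, l),
# so the numerical Jacobian should match that pattern.
A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
J = jacobian(lambda X: matmul(A, X), B)
```

Because $F(B) = AB$ is linear in $B$, the finite-difference estimate here is essentially exact; for nonlinear functions the same routine gives an approximation that is useful for checking hand-derived gradients.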
