Loss
Understand the steps in computing the appropriate loss, and explore the binary cross-entropy loss.
Defining the appropriate loss
We already have a model, and now we need to define an appropriate loss for it. A binary classification problem calls for the binary cross-entropy (BCE) loss, which is sometimes known as log loss.
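As a minimal sketch, assuming PyTorch (the framework itself is an assumption here, as are the tensor values), the BCE loss is available out of the box as `nn.BCELoss`:

```python
import torch
import torch.nn as nn

# Binary cross-entropy loss; expects probabilities in [0, 1] and float labels
loss_fn = nn.BCELoss()

# Hypothetical predicted probabilities (post-sigmoid) and true labels
dummy_probs = torch.tensor([0.9, 0.2, 0.7])
dummy_labels = torch.tensor([1.0, 0.0, 1.0])

loss = loss_fn(dummy_probs, dummy_labels)
print(loss)  # mean BCE over the three points
```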
The BCE loss requires the predicted probabilities, as returned by the sigmoid function, and the true labels ($y$) for its computation. For each data point $i$ in the training set, it starts by computing the error corresponding to the point's true class.
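Here is a sketch of that per-point computation, assuming the standard BCE definition (positive points use the log of the predicted probability, negative points use the log of its complement, and the result is the negated average); the tensors are hypothetical:

```python
import torch

# Hypothetical predicted probabilities and true labels
probs = torch.tensor([0.9, 0.2, 0.7])
y = torch.tensor([1.0, 0.0, 1.0])

# Error corresponding to each point's true class:
# log(prob) if the point is positive, log(1 - prob) if it is negative
errors = torch.where(y == 1.0, torch.log(probs), torch.log(1 - probs))

# BCE is the negated average of the per-point errors
bce = -errors.mean()
print(bce)  # matches nn.BCELoss()(probs, y)
```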
If the data point belongs to the positive class ($y = 1$), we would like our model to predict a probability close to one, right? A perfect prediction of one would result in the logarithm of one, which is zero. It makes sense: a perfect prediction means zero loss. It goes like this:
...
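In the standard BCE formulation (a sketch; the notation $\hat{y}_i$ for the predicted probability is an assumption here), this positive-class error is simply the logarithm of the predicted probability:

$$
\text{error}_i = \log\left(\hat{y}_i\right) \quad \text{if } y_i = 1
$$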