Loss

Understand the steps in computing the appropriate loss, and explore the binary cross-entropy loss.

Defining the appropriate loss

We already have a model, and now we need to define an appropriate loss for it. A binary classification problem calls for the binary cross-entropy (BCE) loss, which is sometimes known as log loss.

Computing the BCE loss requires two ingredients: the predicted probabilities, as returned by the sigmoid function, and the true labels (y). For each data point i in the training set, the computation starts with the error corresponding to the point's true class.
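Before walking through the error itself, here is a minimal sketch of those two ingredients, assuming NumPy; the logits and labels are hypothetical values, not taken from the course:

```python
import numpy as np

def sigmoid(z):
    # Squash raw model outputs (logits) into probabilities in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical model outputs (logits) for three training points.
logits = np.array([1.2, -0.8, 3.1])
probs = sigmoid(logits)        # predicted probabilities P(y_i = 1)
labels = np.array([1, 0, 1])   # true labels y_i
```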

If the data point belongs to the positive class (y = 1), we would like our model to predict a probability close to one, right? A perfect prediction of one would result in the logarithm of one, which is zero. It makes sense: a perfect prediction means zero loss. It goes like this:

$$y_i = 1 \implies \text{error}_i = \log\big(P(y_i = 1)\big)$$
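In code, this per-point error looks like the sketch below, continuing the hypothetical probs and labels from the earlier snippet (rounded). For a point in the negative class (y_i = 0), the same idea applies to the probability of that class, P(y_i = 0) = 1 − P(y_i = 1), and the BCE loss is the negative mean of the per-point errors:

```python
import numpy as np

# Continuing the hypothetical values from the previous sketch (rounded).
probs = np.array([0.77, 0.31, 0.96])   # P(y_i = 1), as returned by the sigmoid
labels = np.array([1, 0, 1])           # true labels y_i

# Per-point error, matching each point's true class:
# log(P(y_i = 1)) for positive points, log(1 - P(y_i = 1)) for negative ones.
errors = np.where(labels == 1, np.log(probs), np.log(1.0 - probs))

# The BCE loss is the negative mean of these errors.
bce = -errors.mean()
print(bce)  # roughly 0.224
```

If the model is built in PyTorch, torch.nn.BCELoss computes this same quantity directly from the predicted probabilities and float labels.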
