Binary Cross-Entropy Calculator
Loss function for binary classification.
Overview
Binary Cross-Entropy measures the divergence between two probability distributions: here, the true labels and the predicted probabilities in a binary classification task. It produces a loss that penalizes predictions increasingly severely as they diverge from the actual class value, growing without bound for a fully confident but incorrect prediction.
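For a single example with true label y and predicted probability p of the positive class, the loss is

L = -(y · ln(p) + (1 - y) · ln(1 - p))

so it reduces to L = -ln(p) when y = 1 and to L = -ln(1 - p) when y = 0.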
Variables
- L = loss value
- y = actual label (0 or 1)
- p = predicted probability of the positive class
When To Use
When to use: This equation is the standard loss function for binary classification problems where the output is a single probability between 0 and 1. It is most effective when paired with a sigmoid activation function in the final layer of a neural network.
Why it matters: It provides a smooth loss surface that is convex in the predicted probability (and in the weights of a logistic regression model), so gradient descent can update model weights effectively; with a deep network the overall surface is no longer convex, but the loss itself remains smooth and well-behaved. By heavily penalizing confident but incorrect predictions, it forces the model to learn more distinct boundaries between classes.
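A minimal NumPy sketch of this sigmoid-plus-BCE pairing (the function names and the epsilon clipping are illustrative choices, not from any particular library; the clipping is explained under Common Mistakes below):

```python
import numpy as np

def sigmoid(z):
    # Squash raw model scores (logits) into probabilities in (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def binary_cross_entropy(y_true, p_pred, eps=1e-12):
    # Clip predictions away from exactly 0 and 1 so ln() stays finite.
    p = np.clip(p_pred, eps, 1.0 - eps)
    # Mean of -(y*ln(p) + (1-y)*ln(1-p)) over the batch.
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

# Logits from a final linear layer, squashed by sigmoid, then scored.
logits = np.array([2.0, -1.0, 0.5])
labels = np.array([1.0, 0.0, 1.0])
print(binary_cross_entropy(labels, sigmoid(logits)))
```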
Common Mistakes
- Using p = 0 or p = 1 directly: ln(0) is undefined, so a single fully confident wrong prediction makes the loss infinite. Implementations clip p into [eps, 1 - eps] first (see the sketch after this list).
- Forgetting the (1 - y) term: without it, the loss ignores every negative example (y = 0), so the model is never penalized for false positives.
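A small sketch of the first trap and its standard fix (the epsilon value is an illustrative choice):

```python
import numpy as np

y, p = 0.0, 1.0  # fully confident prediction of the wrong class

# Naive evaluation: ln(1 - p) = ln(0) -> -inf, so the loss is infinite
# (NumPy also emits a divide-by-zero RuntimeWarning here).
naive = -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Clipped evaluation keeps the loss large but finite (about 27.6).
eps = 1e-12
p_safe = np.clip(p, eps, 1 - eps)
safe = -(y * np.log(p_safe) + (1 - y) * np.log(1 - p_safe))

print(naive, safe)  # inf 27.63...
```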
Practice Problem
A machine learning model identifies a transaction as fraudulent (y = 1). The model's predicted probability of fraud is 0.85. Calculate the binary cross-entropy loss for this specific prediction.
Solve for: L
Hint: When y = 1, the formula simplifies to L = -ln(p).
The full worked solution stays in the interactive walkthrough.
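To check your answer numerically, a one-line sketch using Python's standard library:

```python
import math

# y = 1, so the loss reduces to -ln(p) with p = 0.85.
print(round(-math.log(0.85), 4))
```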
Sources
- Wikipedia: Cross-entropy
- Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep Learning. MIT Press. (Chapter 6, Section 6.2.2.2)
- Bishop, C. M. (2006). Pattern Recognition and Machine Learning. Springer. (Chapter 4, Section 4.3.4)
- Standard curriculum — Machine Learning (Classification Losses)