SparseCategoricalCrossentropy
- Loss function for multiclass classification; replaces the BinaryCrossentropy loss used in binary classification
- “Categorical” because we’re classifying into categories (here 10 of them, the digits 0-9)
- “Sparse” because each example belongs to exactly one category, so its label is a single integer class index (e.g. 7) rather than a one-hot vector
- Example: A digit image is either 0, 1, 2, …, or 9, but not multiple digits simultaneously
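To make the bullets above concrete, here is a minimal pure-Python sketch of what a sparse categorical cross-entropy computes: for each example, take the predicted probability of the true class and average the negative logs. The function name, the toy 3-class data, and the choice of mean reduction are assumptions for illustration; this is not the Keras implementation.

```python
import math

def sparse_categorical_crossentropy(y_true, y_prob):
    """Mean of -log(p[label]) over all examples.

    y_true: integer class indices (the "sparse" labels), e.g. [2, 0]
    y_prob: one list of per-class probabilities per example
    """
    losses = [-math.log(probs[label]) for label, probs in zip(y_true, y_prob)]
    return sum(losses) / len(losses)

# Hypothetical toy data: 3 classes instead of 10 to keep it readable.
labels = [2, 0]            # each example has exactly one true class
probs = [
    [0.1, 0.2, 0.7],       # model is fairly confident in class 2 (correct)
    [0.5, 0.3, 0.2],       # model is less sure about class 0 (correct)
]

loss = sparse_categorical_crossentropy(labels, probs)
print(loss)
```

Only the probability assigned to the true class enters the loss, which is why a single integer label per example is enough. In Keras, the corresponding built-in loss would typically be passed to `model.compile` as `loss=tf.keras.losses.SparseCategoricalCrossentropy()`.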