Difference between SVM Loss and Softmax Loss
With both, we first multiply the weight matrix W by the input vector xi and add a bias b to get a vector of class scores. The difference between the two is in how we interpret these scores:
- With SVM Loss, we only care that the true class score is higher than the rest by some margin.
- With Softmax loss, we turn the scores into a probability distribution over the classes, then take -log(P) of the correct class as the loss.
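
To make the contrast concrete, here is a minimal sketch in plain Python that computes the scores once and then interprets them both ways. The function names, the margin of 1.0, and the small W, x, b values are illustrative assumptions, not part of the original note.

```python
import math

def scores(W, x, b):
    # s = W * x + b: one score per class (W is a list of rows, x and b are lists).
    return [sum(w * xj for w, xj in zip(row, x)) + bi for row, bi in zip(W, b)]

def svm_loss(s, y, margin=1.0):
    # Hinge loss: penalize every wrong class whose score comes within
    # `margin` of the true class score s[y]. margin=1.0 is an assumed default.
    return sum(max(0.0, sj - s[y] + margin) for j, sj in enumerate(s) if j != y)

def softmax_loss(s, y):
    # Cross-entropy: -log of the softmax probability of the true class.
    # Subtracting the max score first keeps exp() numerically stable.
    m = max(s)
    log_sum = math.log(sum(math.exp(sj - m) for sj in s))
    return -((s[y] - m) - log_sum)

# Hypothetical toy problem: 3 classes, 2 input features, true class 0.
W = [[1.0, 2.0], [0.5, -1.0], [0.0, 0.5]]
x = [1.0, 2.0]
b = [0.0, 0.5, -1.0]
y = 0

s = scores(W, x, b)        # [5.0, -1.0, 0.0]
print(svm_loss(s, y))      # 0.0: the true class already wins by more than the margin
print(softmax_loss(s, y))  # small but still positive
```

In this toy case the SVM loss is exactly zero because the margin is satisfied, while the Softmax loss stays slightly above zero: it keeps shrinking as the correct class's probability approaches 1 but never reaches zero.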