Optimization - Gradient Descent
http://cs231n.github.io/optimization-1/
Slope: in multiple dimensions, the gradient is the vector of partial derivatives, and the slope along any direction is the dot product of that direction with the gradient.
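A minimal sketch of that statement (the function, point, and direction are illustrative, not from the notes):

```python
import numpy as np

# Illustrative function f(x, y) = x**2 + 3*y; at (1, 2) the gradient is (2, 3)
grad = np.array([2.0, 3.0])

# A unit direction (here, the diagonal between the x and y axes)
direction = np.array([1.0, 1.0]) / np.sqrt(2)

# Slope along this direction = dot product of the direction with the gradient
slope = np.dot(direction, grad)
print(slope)  # ~3.54
```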
Ways to compute Gradients
Finite differences (a terrible idea: approximate and super slow) -
NEVER DO THIS
For each individual weight in W, nudge it by a small step h and re-evaluate the loss to estimate that entry of dW numerically. Terribly inefficient because it iterates over one weight at a time - hopelessly slow for millions of weights.
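A minimal sketch of the finite-difference (numerical) gradient, assuming a generic loss function f of a weight array W; all names here are illustrative:

```python
import numpy as np

def numerical_gradient(f, W, h=1e-5):
    """Estimate dW by nudging each weight by a small h, one at a time."""
    grad = np.zeros_like(W)
    it = np.nditer(W, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        old = W[idx]
        W[idx] = old + h                 # nudge this single weight up
        f_plus = f(W)
        W[idx] = old - h                 # nudge it down
        f_minus = f(W)
        W[idx] = old                     # restore
        grad[idx] = (f_plus - f_minus) / (2 * h)  # centered difference
        it.iternext()
    return grad
```

Note that this costs two full loss evaluations per weight, which is exactly why it is hopeless for millions of weights.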
Calculus to compute an analytic gradient -
DO THIS!
Use calculus to derive an analytic expression for the gradient, then evaluate dW in one step. Exact and much faster.
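As an illustration (not the course's actual loss), take a simple squared-error loss L(W) = ||Wx - y||^2; calculus gives the exact gradient dL/dW = 2(Wx - y)x^T, computed in a single pass:

```python
import numpy as np

def loss(W, x, y):
    # Illustrative squared-error loss: L(W) = ||Wx - y||^2
    err = W.dot(x) - y
    return np.sum(err ** 2)

def analytic_grad(W, x, y):
    # dL/dW = 2 * (Wx - y) * x^T, derived once with calculus; exact and cheap
    err = W.dot(x) - y
    return 2.0 * np.outer(err, x)
```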
Q: How do you make sure your analytic gradient is correct?
A: Do a gradient check: scale the problem down, compute the numerical (finite-difference) gradient, and compare it against your analytic gradient to make sure the two agree.
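A minimal gradient check under the same illustrative loss, reusing the sketches above; small relative error suggests the analytic expression is correct:

```python
import numpy as np

# Small random problem, scaled down so the slow numerical gradient is affordable
np.random.seed(0)
W = np.random.randn(3, 4)
x = np.random.randn(4)
y = np.random.randn(3)

g_analytic = analytic_grad(W, x, y)
g_numeric = numerical_gradient(lambda W_: loss(W_, x, y), W)

# Relative error between the two gradients; values around 1e-7 or smaller are a good sign
rel_error = np.abs(g_analytic - g_numeric) / np.maximum(1e-8, np.abs(g_analytic) + np.abs(g_numeric))
print(rel_error.max())
```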