Gradient Descent (Vanilla)
At every timestep, evaluate the gradient, then take a step in the direction of that gradient for each dimension/weight. The amount of step you take is determined by your step size.
Step size is a hyperparameter - and one of the most important hyperparameters you must set. In practice , figuring out step size isthe first hyperparameter you try to set.