Neural Networks
Before: linear score function f = Wx
We've been using this as the function we want to optimize.
Now: 2-layer neural network
Stack two of these together:
f = W2 max(0, W1 x)
3-layer neural network:
f = W3 max(0, W2 max(0, W1 x))
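A minimal NumPy sketch of these formulas (the layer sizes are made up for illustration, not from the lecture):

```python
import numpy as np

def relu(z):
    # elementwise nonlinearity: max(0, z)
    return np.maximum(0, z)

# invented dimensions: 3072-dim input (e.g. a 32x32x3 image),
# 100 hidden units, 10 classes
rng = np.random.default_rng(0)
x = rng.standard_normal(3072)
W1 = rng.standard_normal((100, 3072)) * 0.01
W2 = rng.standard_normal((10, 100)) * 0.01

# 2-layer network: f = W2 max(0, W1 x)
h = relu(W1 @ x)   # intermediate template scores
f = W2 @ h         # final class scores, shape (10,)

# 3-layer network: f = W3 max(0, W2 max(0, W1 x))
W2b = rng.standard_normal((100, 100)) * 0.01
W3 = rng.standard_normal((10, 100)) * 0.01
f3 = W3 @ relu(W2b @ relu(W1 @ x))
```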
Each row of a weight matrix is something like a template
Q: So on the bottom row you have image templates for W1; would you also have images for W2?
A: Since W1 is directly connected to the input x, its rows are what is most interpretable as image templates. h is going to be a vector of scores for how much of each template is present in the input.
Q: If our input x is a left-facing horse, and in W1 we have templates for both a left-facing and a right-facing horse, then what's happening?
A: In h, you might have a really high score for your left-facing horse template and a lower score for your right-facing one. W2 then takes a weighted sum of these template scores, so whether you have a really high score for one template or a medium score for both, all of these combinations are going to give a high class score. In the end you get something that generally scores high when you have a horse of any kind. A front-facing horse might give a medium value for both the left and right templates.
Q: Is W2 or h doing the weighting?
A: h is the vector of intermediate scores, one for each template in W1 — how much of each template is present in the input. W2 then weights all of these intermediate scores to get your final scores for each class.
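As a toy numeric illustration of the horse example above (the numbers and the W2 row are invented): suppose h holds just two template scores, left-facing and right-facing horse, and one row of W2 weights both positively.

```python
import numpy as np

# hypothetical row of W2 for the "horse" class: weight both orientations positively
w_horse = np.array([0.5, 0.5])

h_left  = np.array([10.0, 1.0])   # left-facing horse: left template fires strongly
h_right = np.array([1.0, 10.0])   # right-facing horse: right template fires strongly
h_front = np.array([5.0, 5.0])    # front-facing horse: medium value for both

# each dot product is the final horse score; all three come out high
print(w_horse @ h_left)    # 5.5
print(w_horse @ h_right)   # 5.5
print(w_horse @ h_front)   # 5.0
```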
Q: Where does the nonlinear thing come in?
A: h is the value after the nonlinearity, i.e. h = max(0, W1 x).
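Concretely (with a tiny invented W1 and x), h is what comes out after the max — any negative pre-activation gets clipped to zero:

```python
import numpy as np

W1 = np.array([[ 1.0, -1.0],
               [-1.0,  1.0]])
x  = np.array([2.0, 0.5])

pre = W1 @ x              # [1.5, -1.5] before the nonlinearity
h   = np.maximum(0, pre)  # [1.5, 0.0]  after: negatives clipped to zero
print(h)
```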