Regularization

The whole idea of regularization is that you add a term to the loss that penalizes the complexity of the model, rather than having the model solely try to fit the training data.

With regularization, you’re penalizing the parameters of the model to force it towards a simpler solution. There are different methods of implementing regularization, shown below.
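In other words, the full objective becomes the data loss plus a weighted penalty term, λ·R(W). A minimal NumPy sketch of this idea (the function names here are illustrative, not from any particular library):

```python
import numpy as np

def l2_penalty(W):
    # L2 regularization: sum of squared weights
    return np.sum(W * W)

def l1_penalty(W):
    # L1 regularization: sum of absolute weights
    return np.sum(np.abs(W))

def total_loss(data_loss, W, lam=0.1, penalty=l2_penalty):
    # Full objective: fit the data AND keep the weights simple.
    # lam (lambda) is a hyperparameter trading off data fit
    # against model complexity.
    return data_loss + lam * penalty(W)
```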

Types of regularization

Q: How does L2 regularization measure the complexity of the model?

A: Example:

W1 = [1, 0, 0, 0], W2 = [0.25, 0.25, 0.25, 0.25], and x = [1, 1, 1, 1] (a [4 x 1] vector).

When we’re doing linear classification, we’re really taking the dot product between our x and our W. For this x, the two W’s are equivalent: both dot products give the same score (W1ᵀx = W2ᵀx = 1). So which W would L2 regularization prefer?

  • L2 regularization would prefer W2 because it prefers to spread the weight influence across all the elements of x rather than concentrating it on specific elements. (This way, as x varies, every input dimension can still influence the prediction.) W2 also has the smaller L2 penalty: 0.25 vs. 1 for W1, as the numeric check below shows.
  • L1 regularization would generally prefer W1 because, conversely to L2, L1 treats model simplicity as sparsity: the fewer nonzero entries in the weight vector, the simpler the model. (In this particular example the two vectors happen to have the same L1 norm, but L1’s penalty tends to drive solutions toward sparse vectors like W1.)
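
A quick numeric check of the example above (a NumPy sketch; the variable names are just for illustration):

```python
import numpy as np

x  = np.array([1.0, 1.0, 1.0, 1.0])
W1 = np.array([1.0, 0.0, 0.0, 0.0])
W2 = np.array([0.25, 0.25, 0.25, 0.25])

# Both weight vectors produce the same score on this x:
print(W1 @ x, W2 @ x)                          # 1.0 1.0

# L2 penalty (sum of squares): W2 is "simpler" under L2.
print(np.sum(W1**2), np.sum(W2**2))            # 1.0 0.25

# L1 penalty (sum of absolute values): the two tie here,
# but L1's geometry generally pushes weights toward sparse
# solutions like W1.
print(np.sum(np.abs(W1)), np.sum(np.abs(W2)))  # 1.0 1.0
```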
