L2 regularization and Ridge
L2 regularization prevents the weights {wi} from spreading too far apart. The smaller weights assigned to non-correlated, yet potentially meaningful, features will not become insignificant compared to the weights associated with the important correlated features. L2 regularization enforces a similar scaling of the weights. A direct consequence is that it reduces the negative impact of collinearity, since the weights can no longer diverge from one another.
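To make this effect concrete, the following minimal sketch (an illustration added here, not taken from the original text; the synthetic data and the alpha value are assumptions) fits an unregularized linear model and an L2-regularized one on two nearly collinear features and compares the resulting weights:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge

# Synthetic data with two almost collinear features (assumed for illustration).
rng = np.random.RandomState(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)   # nearly identical to x1
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.1, size=200)

# Without regularization the collinear weights can drift to large
# opposite-signed values; with an L2 penalty they stay close together.
ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=1.0).fit(X, y)

print("OLS weights:  ", ols.coef_)
print("Ridge weights:", ridge.coef_)
```

Running the snippet typically shows the unregularized weights taking large values of opposite sign, while the L2-regularized weights remain comparable in magnitude.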
The Stochastic Gradient Descent algorithm with L2 regularization is known as the Ridge algorithm.
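As a hedged sketch of this setup with scikit-learn (the toy data, alpha, and other parameter values are illustrative assumptions, not prescriptions from the text), an L2 penalty can be added to Stochastic Gradient Descent training through SGDRegressor's penalty parameter:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Toy data (assumed for illustration).
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -1.0]) + rng.normal(scale=0.1, size=200)

# Stochastic Gradient Descent with an L2 penalty; alpha controls the
# regularization strength. Feature scaling is recommended for SGD training.
model = make_pipeline(
    StandardScaler(),
    SGDRegressor(penalty="l2", alpha=0.01, max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(X, y)
print(model.named_steps["sgdregressor"].coef_)
```

Increasing alpha strengthens the L2 penalty and pulls the weights closer to zero; decreasing it moves the solution toward the unregularized fit.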