Although regularization is mainly a tool to prevent overfitting and
yield models that generalize better, it also makes problems easier to optimize by
improving the condition number $\kappa$.
The gradient and the Hessian of this objective can be readily derived; the Hessian is $X^T X + \alpha I$.
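Let $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_n$ be the eigenvalues of
$X^T X$ sorted in descending order. Because $X^T X$ is
positive semi-definite, we know that all eigenvalues are non-negative, and
the matrix has the eigenvalue decomposition
$X^T X = Q \Lambda Q^{-1}$ with $\Lambda = \mathrm{diag}(\lambda_1, \dots, \lambda_n)$.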
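In the case that
$\alpha = 0$, the eigenvalues of the Hessian are simply the eigenvalues of
$X^T X$, and the condition number is given by
$\frac{\lambda_1}{\lambda_n}$ (assuming $\lambda_n > 0$). For the case that $\alpha > 0$, we can
decompose the Hessian as follows:

$$X^T X + \alpha I = Q \Lambda Q^{-1} + \alpha Q Q^{-1} = Q (\Lambda + \alpha I) Q^{-1}.$$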
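The eigenvalues of the regularized objective's Hessian are therefore given by
$\tilde{\lambda}_i = \lambda_i + \alpha$, and the condition number
of the regularized objective is given by
$\frac{\lambda_1 + \alpha}{\lambda_n + \alpha}$. To conclude that regularization improves the condition number, we show
that

$$\frac{\lambda_1}{\lambda_n} > \frac{\lambda_1 + \alpha}{\lambda_n + \alpha}.$$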
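
The claim can also be sanity-checked numerically. The following is a minimal sketch, assuming NumPy is available; the design matrix `X`, the random seed, and the value of `alpha` are illustrative choices and do not come from the text. It compares the eigenvalues and the condition number of $X^T X$ with those of $X^T X + \alpha I$ for a deliberately ill-conditioned `X`.

```python
import numpy as np

rng = np.random.default_rng(0)            # fixed seed, illustrative only
X = rng.normal(size=(50, 5))              # hypothetical design matrix
# Make two columns nearly collinear so that X^T X is ill-conditioned.
X[:, 4] = X[:, 3] + 1e-3 * rng.normal(size=50)
alpha = 0.1                               # hypothetical regularization strength

gram = X.T @ X
# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eigvals = np.linalg.eigvalsh(gram)
shifted = np.linalg.eigvalsh(gram + alpha * np.eye(gram.shape[0]))

# Adding alpha * I shifts every eigenvalue by exactly alpha.
assert np.allclose(shifted, eigvals + alpha)

# Condition number = largest eigenvalue / smallest eigenvalue.
kappa_plain = eigvals[-1] / eigvals[0]
kappa_ridge = shifted[-1] / shifted[0]
print(f"condition number without regularization: {kappa_plain:.3e}")
print(f"condition number with alpha = {alpha}: {kappa_ridge:.3e}")
```

Under these assumptions the regularized condition number should come out smaller than the unregularized one, and the gap grows as the smallest eigenvalue of $X^T X$ approaches zero.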