Steps 1 to 4 should be repeated many times (e.g., 100 or 200) to obtain a
stable estimate of the mean slope (from step 4). The slope will generally
be less than one, reflecting the "overfitting" in the data under
study. Empirically, the slope has been found to be similar in
pre-specified models and stepwise selected models. Linear shrinkage
may be referred to as "shrinkage after fitting", since the regression
coefficients are first estimated, and subsequently shrunk.
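
The resampling loop described above can be sketched in Python. This is a minimal illustration, not the authors' code. It assumes that steps 1 to 4 consist of drawing a bootstrap sample, fitting the model in that sample, computing the linear predictor in the original sample, and regressing the original outcome on that linear predictor. For brevity it uses an ordinary least-squares model rather than a generalized linear model. The function name `bootstrap_shrinkage_slope` and the simulated data are hypothetical.

```python
import numpy as np

def bootstrap_shrinkage_slope(X, y, n_boot=200, seed=0):
    """Mean slope of the original outcome on bootstrap-model predictions."""
    rng = np.random.default_rng(seed)
    n = len(y)
    Xd = np.column_stack([np.ones(n), X])        # design matrix with intercept
    slopes = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)         # step 1 (assumed): resample with replacement
        beta, *_ = np.linalg.lstsq(Xd[idx], y[idx], rcond=None)  # step 2: fit in bootstrap sample
        lp = Xd @ beta                           # step 3: linear predictor in original sample
        slopes.append(np.polyfit(lp, y, 1)[0])   # step 4: slope of y on the linear predictor
    return float(np.mean(slopes))

# Simulated example: 50 subjects, 8 predictors, only two with a true effect.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 8))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=2.0, size=50)
slope = bootstrap_shrinkage_slope(X, y)
print(f"mean slope over bootstrap replicates: {slope:.2f}")
```

The mean slope can then serve as a multiplier for the fitted regression coefficients, which is the "shrinkage after fitting" idea.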

Ridge regression was proposed in the 1970s to reduce variability
between estimated regression coefficients (Hoerl and Kennard, 1970).
For generalized linear models,
penalized maximum likelihood estimation has been proposed more
recently (Harrell et al., 1996). A penalty factor is included in
the likelihood, which reduces the spread in predictions. The penalty
factor may be chosen by comparing an AIC measure, which weighs the
adequacy of the fit to the data against the effective degrees of
freedom, across varying values of the penalty factor. Ridge regression
may be referred to as "shrinkage during fitting", since the
regression coefficients are shrunk during the fitting process.
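
A small Python sketch may make the role of the penalty factor and the effective degrees of freedom concrete. It is only an illustration under stated assumptions: it uses a linear model with a closed-form ridge solution instead of penalized maximum likelihood for a generalized linear model, and it uses a Gaussian AIC of the form n log(RSS/n) + 2 df, which is an assumption rather than the AIC measure used in the cited work. The names `ridge_fit` and `choose_penalty` are hypothetical.

```python
import numpy as np

def ridge_fit(X, y, penalty):
    """Ridge coefficients, effective degrees of freedom, and a Gaussian AIC."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)    # standardize predictors
    yc = y - y.mean()                            # center the outcome
    n, p = Xs.shape
    A = Xs.T @ Xs + penalty * np.eye(p)          # penalized cross-product matrix
    beta = np.linalg.solve(A, Xs.T @ yc)         # coefficients shrunk during fitting
    edf = float(np.trace(Xs @ np.linalg.solve(A, Xs.T)))  # trace of the hat matrix
    rss = float(np.sum((yc - Xs @ beta) ** 2))
    aic = n * np.log(rss / n) + 2.0 * edf        # assumed AIC form (lower is better)
    return beta, edf, aic

def choose_penalty(X, y, grid):
    """Pick the penalty on the grid with the lowest AIC."""
    return min(grid, key=lambda lam: ridge_fit(X, y, lam)[2])

# Simulated example with a grid of candidate penalty factors.
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 5))
y = X[:, 0] + rng.normal(scale=2.0, size=60)
grid = [0.0, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
best = choose_penalty(X, y, grid)
print("selected penalty:", best)
```

With a penalty of zero the effective degrees of freedom equal the number of predictors, and they decrease as the penalty grows.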

The Lasso is a recently proposed method (Tibshirani, 1997). It is
a variant of ridge regression in which the penalty works on the
absolute values of the regression coefficients of standardized
predictors. As a result, some coefficients are shrunk to exactly
zero. Since the covariables
with zero coefficients can be omitted, selection of covariables
is obtained. For this reason, the Lasso may be referred to as
"selection through shrinkage."