When modeling decisions
are based on the data under study, we usually pick up patterns
that are more extreme than those present in the underlying
population. Hence, p-values are underestimated, and regression
coefficients are overestimated (in absolute value). Regression to
the mean should therefore be expected when the model is validated
in new patients: the effects observed there will tend to be less
extreme than those estimated in the development data.
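
The following simulation is a minimal sketch of this selection effect, not a method taken from the text. It assumes a simple linear regression with one predictor, a hypothetical true coefficient of 0.2, and 50 patients per study; the function name, the parameter values, and the normal-approximation cutoff |t| > 1.96 (standing in for p < 0.05) are all illustrative choices. Each simulated study keeps the predictor only if it is "significant" in its own data, and the selected coefficients are then re-estimated in new patients.

```python
import numpy as np


def ols_slope_and_se(x, y):
    """Row-wise simple linear regression: slope and its standard error."""
    n = x.shape[1]
    xc = x - x.mean(axis=1, keepdims=True)
    yc = y - y.mean(axis=1, keepdims=True)
    sxx = (xc ** 2).sum(axis=1)
    slope = (xc * yc).sum(axis=1) / sxx
    resid = yc - slope[:, None] * xc
    sigma2 = (resid ** 2).sum(axis=1) / (n - 2)
    return slope, np.sqrt(sigma2 / sxx)


def simulate_selection_bias(true_beta=0.2, n=50, n_studies=5000, seed=0):
    """Hypothetical illustration: select on significance, then validate."""
    rng = np.random.default_rng(seed)

    # Development data: one row per simulated study.
    x_dev = rng.normal(size=(n_studies, n))
    y_dev = true_beta * x_dev + rng.normal(size=(n_studies, n))
    beta_dev, se_dev = ols_slope_and_se(x_dev, y_dev)

    # Data-driven modeling decision: keep the predictor only if |t| > 1.96
    # (assumed normal approximation to p < 0.05).
    selected = np.abs(beta_dev / se_dev) > 1.96

    # Validation data: new patients drawn from the same population.
    x_val = rng.normal(size=(n_studies, n))
    y_val = true_beta * x_val + rng.normal(size=(n_studies, n))
    beta_val, _ = ols_slope_and_se(x_val, y_val)

    return {
        "true_beta": true_beta,
        "fraction_selected": selected.mean(),
        "mean_all": beta_dev.mean(),
        "mean_selected": beta_dev[selected].mean(),
        "mean_validation": beta_val[selected].mean(),
    }


if __name__ == "__main__":
    for name, value in simulate_selection_bias().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the average coefficient over all studies should sit close to the true value, the average among the selected studies should be clearly larger, and the same selected studies should fall back toward the true value when re-estimated in new patients. That pattern is the regression to the mean described above.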