Chapter 8: Statistical Models for Prognostication: Development of Regression Models
        
Estimation of Regression Coefficients

The standard methods for estimating regression coefficients are least squares estimation for linear regression models and maximum likelihood estimation for generalized linear regression models (such as logistic or Cox regression). These methods provide the coefficient estimates that best fit the data under study. If the model is pre-specified, the estimates are (almost) unbiased. However, when the model specification is based on the data, the estimates are too extreme, that is, biased away from zero: positive coefficients are overestimated and negative coefficients are estimated as too negative (Steyerberg et al., 1999).
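The bias from data-driven model specification can be illustrated with a small simulation. The sketch below is not taken from the cited paper; it assumes a single predictor, a hypothetical true coefficient of 0.2, samples of 30 observations, and a simple selection rule (keep the coefficient only when it is approximately significant at the 5% level). All names and settings are illustrative.

```python
import random
import statistics


def simulate_selection_bias(true_beta=0.2, n=30, n_sim=2000, seed=1):
    """Compare the mean of all least squares estimates with the mean of
    only those estimates that pass a significance-based selection step."""
    rng = random.Random(seed)
    all_estimates = []
    selected_estimates = []
    for _ in range(n_sim):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [true_beta * xi + rng.gauss(0, 1) for xi in x]
        sxx = sum(xi * xi for xi in x)
        # Least squares slope for a model without intercept.
        b = sum(xi * yi for xi, yi in zip(x, y)) / sxx
        rss = sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))
        se = (rss / (n - 1) / sxx) ** 0.5
        all_estimates.append(b)
        # Data-driven selection: keep the predictor only if |b / se| > 1.96
        # (an approximate two-sided 5% test; an assumption of this sketch).
        if abs(b / se) > 1.96:
            selected_estimates.append(b)
    return statistics.mean(all_estimates), statistics.mean(selected_estimates)


mean_all, mean_selected = simulate_selection_bias()
print(f"true coefficient:            0.2")
print(f"mean of all estimates:       {mean_all:.3f}")
print(f"mean of selected estimates:  {mean_selected:.3f}")
```

Under these assumptions, the average over all simulated estimates stays close to the true value, while the average over the selected estimates is clearly larger, because only the samples with an unusually large estimate pass the selection step.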

Linear Predictor and Shrinkage

To obtain predictions, regression coefficients and covariable values are multiplied and summed in the linear predictor (see Central Concepts in Predictive Modeling: The Linear Predictor). The linear predictor tends to give predictions that are too extreme: low predictions are too low and high predictions are too high. This holds even when the model is completely pre-specified (Copas, 1983). It may be explained by the uncertainty in the coefficients, which are estimated from the data rather than known as fixed constants (Van Houwelingen and Le Cessie, 1990).
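As a minimal sketch of how a linear predictor is formed and turned into a prediction, the example below assumes a logistic regression model. The intercept, coefficients, and covariable values are hypothetical and chosen only for illustration.

```python
import math


def linear_predictor(intercept, coefficients, covariables):
    """Sum of the intercept and each coefficient times its covariable value."""
    if len(coefficients) != len(covariables):
        raise ValueError("coefficients and covariables must have equal length")
    return intercept + sum(b * x for b, x in zip(coefficients, covariables))


def logistic_risk(lp):
    """Convert a linear predictor to a predicted probability (logistic model)."""
    return 1.0 / (1.0 + math.exp(-lp))


# Hypothetical model with two predictors and one hypothetical patient.
lp = linear_predictor(intercept=-2.0, coefficients=[0.8, -0.5], covariables=[1.0, 2.0])
risk = logistic_risk(lp)
print(f"linear predictor: {lp:.2f}, predicted risk: {risk:.3f}")
```

Because every coefficient enters the sum directly, any error in the estimated coefficients carries over to the linear predictor and hence to the predicted risk.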

Also, note that construction of the linear predictor is a ranking procedure, where patients at high risk are distinguished from those at low risk. Ranking based on a limited number of observations will suffer from regression to the mean; extreme predictions will be too extreme. The extremeness of predictions can be reduced by the application of "shrinkage" methods.

QUESTION 7.5

Extreme predictions from a regression model can be prevented through:

A. Application of stepwise selection methods.
B. Application of shrinkage methods.

Simple Shrinkage Method

The simplest shrinkage method is to apply a linear (or uniform) shrinkage factor for all regression coefficients. This shrinkage factor may be estimated by heuristic formulas. For a linear model, the shrinkage factor is estimated as (Copas, 1983):

R²adj / R², where the adjusted R² is estimated as 1 − (1 − R²)(n − 1)/(n − p − 1), with n the number of observations and p the number of predictors in the model.
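The heuristic formula for the linear model can be sketched in a few lines. The function names and the example values (R² of 0.30, 100 observations, 5 predictors) are hypothetical and serve only to show the arithmetic.

```python
def adjusted_r2(r2, n, p):
    """Adjusted R-squared: 1 - (1 - R2) * (n - 1) / (n - p - 1)."""
    if n - p - 1 <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


def linear_shrinkage_factor(r2, n, p):
    """Heuristic uniform shrinkage factor for a linear model: R2adj / R2."""
    if r2 <= 0:
        raise ValueError("R2 must be positive")
    return adjusted_r2(r2, n, p) / r2


# Hypothetical example values.
s = linear_shrinkage_factor(r2=0.30, n=100, p=5)
print(f"shrinkage factor: {s:.3f}")  # about 0.876
```

With the same R², fewer observations or more predictors give a smaller factor, that is, stronger shrinkage.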

For a generalized linear model, the shrinkage factor can be estimated as (Copas, 1983; Van Houwelingen and Le Cessie, 1990):

(Model chi-square - (df-1)) / Model chi-square

Where:
  • model chi-square is the difference in -2 log likelihood of the model and a null model that includes only an intercept (i.e. the likelihood ratio chi-square statistic for testing the joint influence of all predictors simultaneously), and
  • df indicates the degrees of freedom of the predictors in the model.
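The formula for generalized linear models, and its use as a uniform shrinkage factor, can be sketched as follows. The model chi-square, degrees of freedom, and coefficients are hypothetical, and the sketch assumes that shrinkage is applied to the predictor coefficients only, leaving the intercept aside.

```python
def glm_shrinkage_factor(model_chi2, df):
    """Heuristic shrinkage factor: (model chi-square - (df - 1)) / model chi-square."""
    if model_chi2 <= 0:
        raise ValueError("model chi-square must be positive")
    return (model_chi2 - (df - 1)) / model_chi2


def shrink_coefficients(coefficients, s):
    """Multiply every predictor coefficient by the same shrinkage factor."""
    return [s * b for b in coefficients]


# Hypothetical fitted model: likelihood ratio chi-square of 40 on 8 degrees of freedom.
s = glm_shrinkage_factor(model_chi2=40.0, df=8)
shrunken = shrink_coefficients([0.8, -0.5], s)
print(f"shrinkage factor: {s:.3f}")  # 0.825
print("shrunken coefficients:", [round(b, 4) for b in shrunken])
```

A factor below 1 pulls every coefficient toward zero by the same proportion, which makes low predictions somewhat higher and high predictions somewhat lower.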

Advanced Shrinkage Methods

In the following, we discuss three more advanced shrinkage methods:
