Chapter 8: Statistical Models for Prognostication: Development of Regression Models
        

Continuous covariables are sometimes treated as categorical variables by applying cut-off values. For example, age as a predictor of mortality may be dichotomized at 65 years. Cut-off values are sometimes chosen by searching for the value that is optimal according to some statistical criterion. Such a data-driven search biases the estimated regression coefficient for the categorized predictor: the effect will be overestimated, because the largest coefficient is selected from the set of coefficients corresponding to all candidate cut-offs. The p-value for the categorized predictor is affected by the same search and can be adjusted by advanced statistical procedures (Mazumdar and Glassman, 2000). We prefer (transformed) continuous covariables over categorized covariables, since categorization loses information; see, for example, the relationship between age and mortality in Figure 5.1.
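The overestimation caused by searching for an optimal cut-off can be made concrete with a small simulation. The sketch below is illustrative only and rests on assumptions that are not part of the text: ages are drawn uniformly between 40 and 90 years, 30-day mortality is 10% and unrelated to age (so the true coefficient is 0 at every cut-off), and the candidate cut-offs are 50, 55, ..., 80 years. The function names `log_odds_ratio` and `simulate` are hypothetical. For a single dichotomized predictor the logistic regression coefficient is the log odds ratio of the 2x2 table, which is approximated here with 0.5 added to each cell.

```python
import math
import random


def log_odds_ratio(xs, ys, cutoff):
    """Log odds ratio of the outcome for x >= cutoff versus x < cutoff.

    This approximates the logistic regression coefficient of the
    dichotomized predictor. 0.5 is added to every cell so the estimate
    stays finite when a cell happens to be empty.
    """
    a = b = c = d = 0.5
    for x, y in zip(xs, ys):
        if x >= cutoff:
            if y:
                a += 1  # above cut-off, died
            else:
                b += 1  # above cut-off, survived
        else:
            if y:
                c += 1  # below cut-off, died
            else:
                d += 1  # below cut-off, survived
    return math.log((a * d) / (b * c))


def simulate(n_patients=300, n_trials=200, seed=1):
    """Compare a prespecified cut-off (65) with a data-driven 'optimal' one.

    Returns the mean absolute coefficient for the prespecified cut-off and
    for the cut-off with the largest absolute coefficient in each data set.
    """
    rng = random.Random(seed)
    candidates = range(50, 81, 5)  # 50, 55, ..., 80; includes 65
    fixed, selected = [], []
    for _ in range(n_trials):
        # Assumed data-generating model: mortality does not depend on age.
        ages = [rng.uniform(40, 90) for _ in range(n_patients)]
        died = [rng.random() < 0.10 for _ in range(n_patients)]
        coefs = {c: log_odds_ratio(ages, died, c) for c in candidates}
        fixed.append(abs(coefs[65]))
        selected.append(max(abs(v) for v in coefs.values()))
    return sum(fixed) / n_trials, sum(selected) / n_trials


mean_fixed, mean_selected = simulate()
print("mean |coefficient|, prespecified cut-off at 65:", round(mean_fixed, 3))
print("mean |coefficient|, 'optimal' cut-off:         ", round(mean_selected, 3))
```

Because the prespecified cut-off is one of the candidates, the selected coefficient can never be smaller in absolute value than the prespecified one within a data set, so the average over data sets lies further from the true value of 0.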

Figure 5.1: Age and 30-day Mortality Relationship
Illustration of the relationship between age and 30-day mortality after acute
myocardial infarction. Data from the GUSTO-I trial (Lee et al., 1995) were
analyzed with age as a linear, continuous variable (thick line) and with a
dichotomized version of age (<65 years versus ≥65 years). With the
dichotomized version of age, there is an unnaturally large step in predicted
risk between age 64 and age 65, and no difference in predicted risk among
patients younger than 65 or among those aged 65 years and older.
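The step described in the caption can be reproduced with a minimal sketch. The intercept and slope below are hypothetical values chosen for illustration; they are not the GUSTO-I estimates. The sketch also assumes one patient at each whole year of age from 40 to 90, and it represents the dichotomized model by the mean risk within each age group, which is what a model with a single binary predictor would predict on average. The function names are illustrative.

```python
import math


def risk_continuous(age, intercept=-7.5, slope=0.08):
    """Hypothetical logistic model: logit(risk) = intercept + slope * age."""
    return 1.0 / (1.0 + math.exp(-(intercept + slope * age)))


def risk_dichotomized(age, ages, cutoff=65):
    """Mean continuous-model risk within the patient's own age group.

    Every patient on the same side of the cut-off receives the same
    predicted risk, regardless of exact age.
    """
    group = [a for a in ages if (a >= cutoff) == (age >= cutoff)]
    return sum(risk_continuous(a) for a in group) / len(group)


# Assumed population: one patient at each whole year of age from 40 to 90.
ages = list(range(40, 91))

for age in (50, 64, 65, 80):
    print(age,
          round(risk_continuous(age), 3),
          round(risk_dichotomized(age, ages), 3))
```

Under these assumptions the dichotomized prediction is identical for a 50-year-old and a 64-year-old, jumps between 64 and 65, and is again identical for everyone aged 65 and older, whereas the continuous prediction rises smoothly with age.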

 
