For continuous variables, linear regression models can be used. Sometimes the outcome scale is transformed, for example by taking the logarithm, to better satisfy the model assumptions.
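
As a rough illustration of the log transformation, the sketch below fits a one-covariable linear regression to the logarithm of the outcome. The data are hypothetical and noise-free, and the helper `fit_simple_ols` is an illustrative name, not taken from any particular statistics package; it implements ordinary least squares in closed form.

```python
import math

def fit_simple_ols(x, y):
    """Ordinary least squares for one covariable: returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical, noise-free data: the outcome grows multiplicatively with x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [math.exp(0.5 + 0.2 * xi) for xi in x]

# Fit the linear model on the log scale: log(y) = a + b * x.
log_y = [math.log(yi) for yi in y]
a, b = fit_simple_ols(x, log_y)
```

With data this clean, the fit on the log scale recovers the intercept and slope used to generate the outcome. In this setup, `math.exp(b)` can be read as the multiplicative change in the outcome per unit increase in `x`.
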
For dichotomous variables, logistic regression models are popular (Hosmer and Lemeshow, 1989). They have largely replaced discriminant models in medical applications. A logistic link function is used, such that the regression formula can be written as:
$$\operatorname{logit}[\operatorname{Prob}(\text{outcome})] = a + b_1 x_1 + b_2 x_2 + \dots + b_i x_i.$$
Here, $a$ is the intercept and $b_1$ to $b_i$ are the regression coefficients for the $i$ covariables $x_1$ to $x_i$, as in other regression models. The logit is the natural logarithm of the odds of the outcome, $\log(p/(1-p))$, where $p$ is the probability that the outcome occurs. Odds ratios can be calculated by exponentiating the coefficients: $\mathrm{OR} = \exp(b_i)$. The relationship between the probability of the outcome and the logit follows a characteristic S-shaped curve.
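
The relationships above can be checked numerically. The following is a minimal sketch, assuming made-up values for the intercept and coefficients; the function names (`logit`, `inverse_logit`, `predicted_probability`, `odds_ratio`) are illustrative and do not come from a specific library.

```python
import math

def logit(p):
    """Natural logarithm of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1.0 - p))

def inverse_logit(z):
    """Probability that corresponds to a logit value z."""
    return 1.0 / (1.0 + math.exp(-z))

def predicted_probability(a, coefs, xs):
    """Probability implied by the linear predictor a + b1*x1 + ... + bi*xi."""
    linear_predictor = a + sum(b * x for b, x in zip(coefs, xs))
    return inverse_logit(linear_predictor)

def odds_ratio(b):
    """Odds ratio for a one-unit increase in the covariable with coefficient b."""
    return math.exp(b)

# Hypothetical intercept and coefficients for two covariables (assumed values).
a = -1.0
coefs = [0.7, -0.3]

# Predicted probabilities when the first covariable is 0 and when it is 1,
# with the second covariable held at 0.
p0 = predicted_probability(a, coefs, [0.0, 0.0])
p1 = predicted_probability(a, coefs, [1.0, 0.0])

# The ratio of the two odds equals exp(b1).
odds0 = p0 / (1.0 - p0)
odds1 = p1 / (1.0 - p1)
ratio_of_odds = odds1 / odds0
```

Here `ratio_of_odds` agrees with `odds_ratio(coefs[0])` up to floating-point error, and `inverse_logit(0.0)` gives a probability of 0.5, the midpoint of the curve.
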
Figure 4.1: Logistic Link Function
Illustration of the logistic link function. The relationship between the probability of an outcome and the logit of the probability is a characteristic curve. The logit is calculated as: ln(probability/(1-probability)). When the logit is 0, the probability is 50%.