Complex planned experiments and observational studies using survey data
share a common need: to assign significance levels that reflect the
complexity of the hypotheses, the multiple measurements, and the variety
of test statistics involved. Kirk (1968) notes that significance levels
can be associated with different conceptual units. These might include an
individual comparison (between two rates), a hypothesis (looking for the
same effect in all strata), a family of comparisons (a preponderance of
evidence in a meta-analysis), or the entire experiment or study (the
lifetime earnings of people with disabilities are less than 50% of those
of people without disabilities).

A significance level of 0.05 means that, over the long run, in about one
comparison (of any kind) out of 20 the population value will not fall
within the 95% confidence limits around the sample statistic. If a study
using survey data has a single hypothesis that is tested with a single
test statistic, then the significance level and the error rate are the
same. In other words, when the null hypothesis is actually true, an
investigator's decision based on the significance level will be in error
5% of the time. Because virtually all studies seek to reject null
hypotheses, the investigator will act on the basis of the statistical test.
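
The "one in 20" reading of a 0.05 significance level can be checked with a small simulation. The sketch below is illustrative only. It assumes normally distributed data with a known standard deviation, and the function name `coverage_rate`, the sample size, and the population values are all hypothetical choices, not taken from any particular survey.

```python
import random
import statistics


def coverage_rate(trials=4000, n=50, mu=10.0, sigma=2.0, z=1.96, seed=1):
    """Return the fraction of simulated samples whose 95% confidence
    interval around the sample mean contains the population mean mu.

    Assumption: sigma is known, so the interval is mean +/- z*sigma/sqrt(n).
    """
    rng = random.Random(seed)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        if mean - half_width <= mu <= mean + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"intervals containing the population mean: {rate:.3f}")
print(f"intervals missing it: {1 - rate:.3f}")
```

Under these assumptions the miss rate should come out close to 0.05, which is the single-test error rate described above.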

However, most real-world studies involve numerous measurements, test
statistics, strata, and contingent hypotheses. They offer the opportunity
to perform so many statistical significance tests that some of them will
incorrectly indicate that the null hypothesis should be rejected -- and
the investigator will be happy to oblige! If a hundred tests were
performed and the null hypothesis was true in every one (no difference in
the population), then using a significance level of 0.05 for each test
would lead you to the wrong decision in about five of them.
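
The arithmetic behind the hundred-test example can be written out directly. The sketch below computes the expected number of false rejections and, as an added illustration, the chance of at least one false rejection. The second quantity assumes the tests are independent, which the text does not state and which real survey analyses often violate. Both function names are hypothetical.

```python
def expected_false_rejections(num_tests, alpha):
    """Expected number of wrongly rejected null hypotheses when every
    null hypothesis is true and each test uses significance level alpha."""
    return num_tests * alpha


def familywise_error_rate(num_tests, alpha):
    """Probability of at least one false rejection across num_tests tests.

    Assumption: the tests are independent of one another.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


print(expected_false_rejections(100, 0.05))        # 100 * 0.05 = 5 expected errors
print(round(familywise_error_rate(100, 0.05), 3))  # about 0.994 under independence
```

So although each test has only a 5% error rate, under the independence assumption it is almost certain that at least one of the hundred tests will reject a true null hypothesis.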

There is no single best approach to the multiple comparison problem.
Statisticians have developed special tests with names like Least
Significant Difference (LSD), Honestly Significant Difference (HSD),
Scheffé's Method, Duncan's Multiple Range Test, and the Simultaneous Test
Procedure (STP) to address this issue (Kirk, 1968). Many of the major
statistical packages can present all of these tests and more when you
perform analysis of variance and similar procedures.