Evaluation of health
care services involves a number of methodological issues that
differentiate controlled evaluations of multi-faceted health care
system innovations from the traditional randomized controlled
trial. A full discussion of these issues is beyond the scope of
this chapter. However, awareness of these issues can help broaden
the scientific perspectives of researchers who are familiar with
the traditional clinical trial, but not with evaluation of multi-faceted
health care system innovations.
Single vs. multi-faceted interventions
The traditional perspective
holds that evaluation of a multi-faceted intervention is not scientifically
valid because it is difficult to isolate the factor that produced
observed effects. This may be true if the research objective is
to determine and explain the efficacy of a treatment. In health
services research, however, there is growing evidence that single interventions
often have minimal effects, while multi-faceted interventions
may have larger effects. McCulloch
and colleagues (2000) recently demonstrated the positive
effects -- in improved retinal screening, foot exam rates and
hemoglobin A1c testing rates -- of a multi-faceted quality improvement
intervention involving automated registries, reminders, patient
self-management support, integration of specialist expertise into
primary care and use of group visits. Hence, there is a need for
novel research designs and innovative approaches to meta-analysis
that permit evaluating the effectiveness of multi-faceted interventions.
Appropriate control conditions
In the paradigm of
the randomized controlled trial, a placebo control group with
double blinding is considered the gold standard. In evaluating
the effectiveness of multi-faceted health care innovations, "usual
care" is often the most informative control condition, and "blinding"
is usually not possible. Fortunately, there are many unblinded
evaluations of multi-faceted health care interventions with usual
care control groups published in leading medical journals (Aubert
et al., 1998; Sadur
et al., 1999; Rosenqvist
et al., 1988; Gulliford
and Mahabir, 1999). These studies provide examples
of how such research can be designed and implemented, and attest
to the ability of such studies to pass rigorous peer review.
Unit of assignment to experimental and control conditions
In the traditional
randomized controlled trial, the unit of assignment to intervention
or control groups is usually the patient. While patient-level
randomization is often used in evaluations of multi-faceted health
care innovations, it is increasingly common for the practice or
clinic to be the unit of assignment, where groups
of patients in a given practice are "cluster randomized"
(Rothman and
Greenland, 1998) to receive a given treatment (Donohoe
et al., 2000; Walker
et al., 2000; Carlson
and Rosenqvist, 1991). This is particularly true when
intervention delivery requires training providers and making changes
in care delivery that will affect all patients in the practice
setting. Simply randomizing individual
patients within a clinic would likely contaminate the
comparison, because providers would be asked, unrealistically, to deliver
both the intervention and usual care. Elements of the
intervention may then "seep into" the care provided to
the usual care group.
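
As a rough illustration of assignment at the clinic level, the sketch below randomizes whole clinics and then gives every patient the arm of his or her clinic, so that no provider delivers both conditions. The function names, clinic labels, and the equal split between arms are assumptions made for this example, not a procedure taken from the studies cited above.

```python
import random


def cluster_randomize(clinic_ids, seed=0):
    """Randomly assign whole clinics to 'intervention' or 'usual_care'.

    Hypothetical helper: the first half of the shuffled clinics gets the
    intervention, the remainder continues usual care.
    """
    rng = random.Random(seed)      # fixed seed so the allocation is reproducible
    shuffled = list(clinic_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        clinic: ("intervention" if i < half else "usual_care")
        for i, clinic in enumerate(shuffled)
    }


def assign_patients(patients, clinic_assignment):
    """Give each patient the arm of the clinic where care is received.

    `patients` is an iterable of (patient_id, clinic_id) pairs.
    """
    return {pid: clinic_assignment[clinic] for pid, clinic in patients}


if __name__ == "__main__":
    clinics = ["A", "B", "C", "D", "E", "F"]   # made-up clinic labels
    allocation = cluster_randomize(clinics, seed=1)
    patients = [("p1", "A"), ("p2", "A"), ("p3", "B")]
    print(allocation)
    print(assign_patients(patients, allocation))
```

Because the arm is a property of the clinic, patients p1 and p2 in this example always share an arm, which is what guards against the contamination described above.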
Randomization
When the unit of assignment
is the practice or clinic, it is often not possible to employ
a design in which practices or clinics are randomly assigned to
implement the intervention or to continue usual care. When randomization
is not possible, it is important to devise a control group of
comparable practices or clinics. Whether the practices or clinics
are randomized or not, if the unit of assignment to intervention
or control groups is the practice or clinic, then the methods
of data analysis must account for the intraclass correlation of patients
within the same setting (Campbell
et al., 2000a, 2000b;
Wood
and Freemantle, 1999). Failure to account for intraclass
correlation can lead to badly biased (typically underestimated) variance
estimates and to tests of significance that are not valid.
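
To give a sense of scale, the sketch below uses the standard design effect for equal-sized clusters, 1 + (m - 1) x ICC, where m is the number of patients per clinic and ICC is the intraclass correlation. The numbers (20 clinics of 50 patients, ICC of 0.05) are hypothetical values chosen for illustration and are not drawn from the studies cited above.

```python
import math


def design_effect(cluster_size, icc):
    """Variance inflation from clustering, assuming equal cluster sizes."""
    return 1.0 + (cluster_size - 1.0) * icc


def effective_sample_size(n_patients, cluster_size, icc):
    """Number of independent patients carrying the same information."""
    return n_patients / design_effect(cluster_size, icc)


if __name__ == "__main__":
    n_clinics, cluster_size, icc = 20, 50, 0.05   # hypothetical values
    n_patients = n_clinics * cluster_size
    deff = design_effect(cluster_size, icc)
    print("design effect:", round(deff, 2))
    print("effective sample size:",
          round(effective_sample_size(n_patients, cluster_size, icc)))
    print("standard error inflation:", round(math.sqrt(deff), 2))
```

Under these assumed values, 1,000 clustered patients carry about as much information as roughly 290 independent ones, so an analysis that treats them as independent would understate standard errors by a factor of about 1.9. In practice, mixed-effects models or generalized estimating equations are commonly used to account for this clustering.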