Detailed Evaluation Questions for Primary Papers
December 07, 2006
Note: For background information and definitions of terms used
below, see resources cited in WWWeb Epidemiology &
Evidence-based Medicine Sources for Veterinarians and in Clinical Epidemiology & Evidence-Based Medicine Glossary.
Questions to Apply to Structured Abstracts:
(adapted from Ann Int Med (1990) 113:69-76,
(1993) 118:731-737, and Can Med Assoc (1994) 150:1611-1615)
The abstract is crucial as it is often the only section beyond the title that is read by
the busy clinician. The following are the recommended headings for structured abstracts
and are questions directed at the content of the abstract. Note that narrative abstracts
of clinical studies should contain all of these points as well but they are harder to
identify because of the absence of the headings.
- Is the primary question or objective of the study clearly and explicitly stated?
- Is the type of study design given (e.g. RBCT (randomized blinded controlled trial),
cohort, case-control, survey, case series)?
- Are the critical technical points for the design included (e.g., random allocation,
stratified random sampling, convenience, blinding, reference or "gold"
- Is the type of clinical (primary, referral, or academic), laboratory, or field
setting and level of care given? Does this setting differ from mine sufficiently that
study generalizability is reduced?
Patients or Subjects:
- Are the number of subjects, the technical descriptors (random, systematic or
haphazard) of how they were selected, the duration of enrollment, and the number of
refusals or non-responders given?
- Are their key demographic characteristics, (age, sex, breed) and the spectrum of the
clinical disorder sufficiently described? Are the subjects similar enough to those in my
- If an RBCT (Randomized, blinded, controlled trial), are inclusion and exclusion criteria
- If matching was used, are the criteria stated?
- If follow-up is involved, is the duration of follow-up given and the number lost or the
number withdrawn due to adverse effects given?
Measurements and Main Results:
- Are the method, duration, dosage, and common clinical names of the drugs in
- Are the unfamiliar procedures in question briefly described?
- Are clinically relevant measurements and main results given? Are they briefly
described if unfamiliar to the intended readership?
- Are the methods of controlling bias given for subjective measurements?
- Are 95% confidence limits and assessment of chance (p-values) given for numerical
- Are implications of the results and their clinical application including limitations
- Are the conclusions consistent with the results?
- If further study is needed before the results are applicable in the clinical setting, is
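The question above about 95% confidence limits and p-values can be made concrete with a small calculation. The following is a minimal sketch, assuming a two-group comparison of proportions (for example, recovery rates in treated and control animals) and a large-sample normal approximation. The function name and the example counts are hypothetical and not taken from the cited sources; small samples would call for an exact method instead.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(events_a, n_a, events_b, n_b, level=0.95):
    """Wald confidence interval and two-sided p-value for a difference
    in proportions (normal approximation; illustrative only)."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b

    # Unpooled standard error, used for the confidence interval.
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    lower = diff - z_crit * se
    upper = diff + z_crit * se

    # Pooled standard error, used for the test of "no difference".
    pooled = (events_a + events_b) / (n_a + n_b)
    se0 = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se0)))

    return diff, lower, upper, p_value


# Hypothetical counts: 30 of 50 treated and 20 of 50 controls recovered.
diff, lower, upper, p_value = risk_difference_ci(30, 50, 20, 50)
print(f"difference={diff:.2f}, 95% CI=({lower:.3f}, {upper:.3f}), p={p_value:.3f}")
```

An abstract that reports only the p-value hides the width of this interval, which is what tells the clinician how small or large the true effect could plausibly be.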
Questions to Apply to Primary Papers:
adapted from Am J Physiology
(1995)13:S21-S25, Med J Australia (1992) 157:389-394 and BMJ
(see "Questions to Apply to
Materials and Methods:
- Is sufficient (but not excessive) background material present with references to
provide a conceptual and theoretical basis for the study and to indicate why the study
question is worth addressing?
- Are the relevant favorable (supporting) and unfavorable (non-supporting) papers cited?
- Are the exact question or hypothesis and the specific aims being addressed clearly stated
with measurable objectives? Are they clinically relevant?
- Are the study design and laboratory methods appropriate to address the study
question? Are limitations and potential problems with study methods acknowledged in this
or the discussion section?
- Are references for standard methods or "gold" standards provided and are
modifications described in sufficient detail that others familiar with the study area
could repeat or extend the study?
- If new methods or instrumentation are used, are they completely described so others can
replicate them? Are the repeatability and reliability of new methods and instruments
- Is symmetry established and preserved where possible in the design and execution of the
study (controls, blinding, randomization)?
- Are the methods (e.g., blinding, replication) for assuring the validity,
reproducibility, and quality control of measurements addressed, particularly for
subjective clinical observations?
- If sampling from a group is involved, are the source population and the spectrum of
disease sufficiently described (i.e., can you determine if a given patient of yours would
or would not be eligible for the study)? Are the sampling procedures, inclusion /
exclusion criteria, and the sample size sufficient?
- If controls are involved, is the method of selecting controls in observational studies
or for allocating controls in experimental studies sufficient? Are the two groups
sufficiently comparable at the outset? If controls aren't used, why not?
- Are company sources of drugs, chemicals, and devices given? Do any present a strong
potential for conflict of interest?
- Are appropriate statistical procedures used with references given for those that are
Discussion and Conclusions:
- Are the outcomes or endpoints appropriate for the clinical study question? Are
clinically important side effects addressed?
- Are the results presented in a fashion (clear tables, graphs or text) such that they
make sense? Is information about the distributions of relevant variables presented?
- Are potential confounders or the effects of other co-treatments accounted for
- Are deviations from the research protocol addressed (e.g., unexpected death of subjects,
breaking of blinding, mislabeling of samples)?
- Are results analyzed statistically? Are the statistical procedures executed correctly?
- Are statistically significant results also clinically or biologically significant? Is
the distinction between statistical significance and clinical or biological significance
- If the results are statistically insignificant, is the power of the study to obtain the
minimum biologically or clinically significant results given?
- If subjects are followed over time, is the duration and degree of follow-up sufficient
and are losses to follow-up explained sufficiently?
- Could selection bias, measurement bias, confounding bias, or chance better account for
the results than what is stated?
- Is there evidence of "data dredging" (desperately searching for significance
in a sea of insignificance)?
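The concern about "data dredging" follows from simple arithmetic: the more comparisons a paper tests, the more likely at least one reaches p < 0.05 by chance alone. The sketch below assumes independent tests with every null hypothesis true, which real analyses rarely satisfy exactly. The function names are hypothetical, and the Bonferroni threshold is shown as one common, conservative correction rather than the method the cited sources prescribe.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false-positive result among n_tests
    independent tests when no true effect exists."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(n_tests, alpha=0.05):
    """Per-test significance threshold that keeps the familywise
    error rate at or below alpha (Bonferroni correction)."""
    return alpha / n_tests


for k in (1, 5, 20):
    print(k, round(familywise_error_rate(k), 3), bonferroni_alpha(k))
```

With 20 such comparisons the chance of at least one spurious "significant" finding is about 0.64, so a paper reporting one significant result out of many unplanned comparisons deserves skepticism.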
References and Acknowledgements:
- Does the discussion logically relate the findings to a sufficient range of
previously published information? Are important recent primary references, particularly
contrary ones, included? Do the citations include only a minimal amount of unpublished
or weak primary information (e.g., meeting abstracts and theses)?
- Are limitations of the study methods addressed?
- If findings differ from previously published findings, are the potential reasons for the
discrepancy discussed sufficiently?
- If results are not statistically significant, is the minimum biologically or clinically
significant effect that the study had the power to detect discussed?
- Are the potential effects of protocol deviations discussed?
- If a survey, is the proportion of non-respondents and their similarity to respondents
addressed? If a prospective study, are the potential effects of the loss to follow-up
- Is the relevance of the findings discussed and a minimum of unwarranted speculation or
overgeneralization presented? Is the degree of generalization appropriate for the size,
setting, and strength of the study?
- Are recent, relevant primary papers cited? Are the ones on which the work is based,
such as those describing the materials and methods used and those on which interpretation
of the results rests, primary papers from refereed scientific journals?
- Are sources of support acknowledged? Do any of these present a potential conflict of
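The question above about non-significant results asks what minimum effect the study had the power to detect. As a rough sketch, assuming two equal-sized groups, a continuous outcome with a known common standard deviation, and a normal approximation, that minimum detectable difference can be estimated as follows. The function name, the 80% power default, and the example numbers are assumptions for illustration only.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_difference(n_per_group, sd, alpha=0.05, power=0.80):
    """Smallest true difference in means that a two-group study of this
    size would detect with the stated power (two-sided test, normal
    approximation, equal group sizes and equal standard deviations)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return (z_alpha + z_power) * sd * sqrt(2 / n_per_group)


# Hypothetical study: 20 animals per group, outcome SD of 10 units.
print(round(minimum_detectable_difference(20, 10), 2))
```

If the smallest clinically important difference is well below this value, a non-significant result says little about whether the treatment works.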
Additional Questions to
Apply to Specific Study Designs:
(Modified from pages 267-270 in: Fletcher RH,
Fletcher SW, Wagner EH (1996). Clinical Epidemiology: The Essentials, 3rd ed.,
Williams and Wilkins ISBN 0-683-03269-0.)
Cross-sectional (Prevalence) Studies:
- Are the criteria for being a case clearly described? What are they? For example,
the presence of a titer may indicate prior infection, passive protection, or active
protection from vaccination, but it may or may not indicate clinical disease.
- What is the population in which the cases were found? Is this population likely to have
a higher or lower prevalence than other populations of interest to the clinician?
- Are the study subjects an unbiased sample of the population to which the results are
- Were cases entered at the onset of disease (incident cases rather than prevalent
- Were controls similar to cases in other important respects (e.g., age, breed) except for
the exposures of interest?
- Was the ascertainment of exposure information similar in the cases and controls,
preserving symmetry (Recall bias, measurement bias from non-blinded observers)?
- Do the subjects represent a disease spectrum (acute vs. chronic, mild vs. severe)
similar to that of your clients?
Cohort Studies and Randomized Clinical Trials:
- Entered at a clinically useful point (inception cohort), such as after first diagnosis
of the problem or after initial treatment? Enrolling subjects after follow-up time has
passed means that prevalent rather than incident cases are enrolled, likely missing those that
either do very well (recover rapidly and completely) or very poorly (die).
- At risk of developing the outcome? Those subjects who already have the condition or
can't acquire it (e.g., immune from previous exposure, physiologically incapable of
acquiring the condition such as pregnancy) should not be in the study.
- At a similar point in the course of the exposure or disease? Prognosis often varies with
duration of an exposure or the time since exposure or the disease occurred.
Randomized Trials: (In addition to above)
- Was the follow-up of all subjects complete? Those lost to follow-up can bias a study if
they do better or worse than those that remain and more are lost from one group than
- Were all subjects examined with equal intensity? Were observers doing the measurements
blinded, preserving symmetry and reducing systematic measurement bias?
- Were factors affecting the outcome, other than the one of interest (e.g., age,
gender, breed) either equally distributed in the outcome groups or controlled by the
method of analysis? If not, could the difference in these other factors account for the
- Were subjects randomly allocated to treatment and control groups? Were they
allocated as individuals or as groups?
- Are the results from subjects analyzed on the same basis that the subjects were
allocated to treatments? If allocated by group membership, are the results analyzed by
group rather than by individuals (error of inflated precision or pseudoreplication)?
- Were the observers (e.g., owners, clinicians, researchers, technicians) blinded (masked)
to subjects' group allocation?
- Were the co-interventions (e.g., other treatments) the same in both groups, preserving
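The error of inflated precision (pseudoreplication) mentioned above can be illustrated with the standard design-effect formula for group-allocated studies. This is a minimal sketch, assuming equal-sized groups (for example, pens or herds) and a known intracluster correlation coefficient (ICC). The ICC value and the function names are hypothetical; the source does not specify an analysis method.

```python
def design_effect(group_size, icc):
    """Variance inflation when animals within a group resemble each
    other: 1 + (m - 1) * ICC for groups of equal size m."""
    return 1 + (group_size - 1) * icc


def effective_sample_size(n_groups, group_size, icc):
    """Number of independent observations that the grouped sample is
    approximately worth."""
    total = n_groups * group_size
    return total / design_effect(group_size, icc)


# Hypothetical trial: 10 pens of 20 animals, allocated by pen, ICC = 0.1.
print(round(effective_sample_size(10, 20, 0.1), 1))
```

Under these assumptions 200 animals carry roughly the information of 69 independent animals, so analyzing them as 200 individuals overstates the precision of the result.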