Perspectives
The history and fate of the gold standard
For the past half-century, physicians and clinical researchers have
remained confident that randomised controlled trials (RCTs) provide the
most rigorous test of preventive, diagnostic, and therapeutic
interventions. They are ubiquitously referred to as the “gold standard”
of empirical biomedical investigation, to the point where this status is
often presented as a self-evident starting point in diagnostic or
therapeutic evaluation. However, this status has long been contested,
ever more so now with the emergence of “big data”, randomised registry
trials, and other modes of knowledge production in medicine. In an era
of increasing methodological self-reflection, it is useful to step back
and examine how and when RCTs became the gold standard and what our
aspirations for gold standards reveal about our deeper medical identity.
RCTs had a complicated prehistory, entailing attempts to ensure equivalent
active and control groups, the occasional blinding of researchers or
subjects, and the development of statistical methods of comparison. They
took on their recognisably modern form with the British Medical
Research Council's landmark 1948 trial of streptomycin for pulmonary
tuberculosis. As statisticians and clinical pharmacologists attempted to
make sense of the pharmaceutical revolution after World War 2, the
power of RCTs seemed critical. When the 1962 Kefauver-Harris amendments to the
US Federal Food, Drug, and Cosmetic Act mandated proof of efficacy through
“well-controlled” studies—namely, RCTs—before new drug approval, the US
Government set the stage for the avalanche of pharmaceutical trials that
followed. In the UK throughout the 1960s and 1970s, Archibald Cochrane
advocated for the utility of RCTs to sort therapeutic wheat from chaff.
His work set the stage for such worldwide champions of rational
therapeutic assessment as Thomas Chalmers, Iain Chalmers, David Sackett,
and their colleagues.
But when did RCTs become the “gold
standard”? The first instance we have found of the phrase “gold
standard” to refer to RCTs came in the pages of The New England Journal of Medicine (NEJM)
in December, 1982, in an article written by Alvan Feinstein and Ralph
Horwitz. We were surprised that the first usage came so late.
Despite extensive searching, we have found no earlier occurrence of
“gold standard” in reference to RCTs. We are eager to be proven wrong,
but until all textbooks, conference proceedings, journals, and archival
collections have been digitised and made full-text searchable, the gold
standard of historical research itself remains elusive. Of interest,
Feinstein and Horwitz described RCTs not as a gold standard that all
research must strive to attain, but as an elusive ideal in many
circumstances. Their article was actually a brief in support of the
rigorous conduct of other clinical epidemiological research designs. As
they remarked, “epidemiological research has become increasingly
important because it offers a substitute for the unattainable scientific
gold standard of a randomized experimental trial”.
It is worth looking more closely at how and when this notion entered the
medical literature. The phrase “gold standard” has a long prehistory in
medical journals, but with different meanings. It first appeared in The Lancet
in 1870 in a discussion of international coinage and efforts to restore
the value of the guinea. Over the next 60 years it occurred repeatedly,
in discussions of the actual gold standard: the technique of
international finance that links the value of a nation's currency to a
set amount of gold, facilitating exchange between different currencies.
During the 1930s the term gained a new usage, in discussions of
pharmacological use of gold, whether for tuberculosis (unsuccessfully,
in 1934) or rheumatoid arthritis (successfully, in 1937). It first
appeared in NEJM in 1933, in a humorous riff by Harvey Cushing
about the state of surgery and dentistry in the USA. The next five
references, through 1959, all referred to the financial gold standard.
Of course, the financial gold standard was rarely seen as a “gold
standard”. Instead, it proved controversial for much of its history.
Isaac Newton put Britain on a gold standard in 1717, an arrangement
formalised by the Royal Mint in 1816. Many other countries followed
suit. The system broke down during the economic turmoil of the early
20th century. Inflation during World War 1 forced the UK and other
countries off the gold standard. A variant was restored in 1925, but
that too had to be abandoned in 1931 during the Great Depression. After
preliminary moves by the USA in the 1930s, Richard Nixon finally took
the USA off its gold standard in 1971. In 1976 the US Government revised
its definition of the dollar to remove all references to gold.
It was, ironically, in this setting, the final abandonment of the
financial gold standard in the USA, that the phrase began to appear in The Lancet and NEJM as something valuable, not merely as a standard of exchange but as the definitive exemplar of quality and reliability. A 1975 Lancet
review of new diagnostic criteria described how they set the “gold
standard”, providing a new “esperanto of liver disease”. Writing in NEJM
in 1979, Victor McKusick called the presentations given by residents at
Grand Rounds at Johns Hopkins the “gold standard” for medical
communication. Book reviewers described new textbooks as the “gold
standard” for their fields. By the early 1980s, clinical researchers
described specific procedures as the diagnostic or therapeutic gold
standard (eg, adrenal vein catheterisation in The Lancet in 1980, cardiac catheterisation in NEJM in 1981, or haemodialysis in The Lancet
in 1982). After the first occurrence in 1982 of gold standard in
association with RCTs, the phrase became commonplace, appearing less
often within quotation marks, and by the 1990s paralleling the rise of
the term evidence-based medicine.
What can we make of the irony that this usage entered medicine in the
years after the gold standard itself was abandoned as a tool of
international finance in the 1970s? The phrase appeared
in diverse therapeutic and diagnostic contexts, reflecting the broad
aspirations in medicine for evidentiary solid ground and standardisation
throughout this era. But many of its uses in relation to RCTs were
critiques, reflecting a legacy of the controversies that had long
ensnared those who would claim the epistemic hegemony of RCTs. The
debates about RCTs, and about the notion of a medical “gold standard”
more generally, often took on religious overtones. Angry about
cardiologists' demands that coronary artery bypass grafting be subjected
to RCTs, Lawrence Bonchek encouraged surgeons in 1979 to “resist the
almost religious fervor of those who would sanctify randomized studies
as the only means of learning the truth”. Writing in 1992, P Finbarr
Duggan complained that the phrase “gold standard” itself “smacks of
dogma” and should be abandoned.
The religious language
here may not be coincidental. Arthur Kleinman and others have argued
that the emergence of biomedicine within the monotheistic traditions of
Europe and the Middle East imbued medicine with a commitment to
universal truths, unitary paradigms, and a “single-minded approach to
illness and care”. The idea of a gold standard, that there is one best
way to do something, whether conduct clinical research, diagnose a
disease, or treat a patient, emerges from this underlying commitment.
While the desire to base clinical decisions on the best possible
evidence reflects a genuine effort to improve the quality of medical
care, commitment to a gold standard does more than that. Allegiance to a
single approach provides a focus around which communities can organise
and rally. But critics have pointed to the dangers of such medical
monotheism. As pioneering cardiac surgeon René Favaloro wrote in 1998,
reflecting on three decades of debate about bypass grafting, “Randomized
trials have developed such high scientific stature and acceptance that
they are accorded an almost religious sanctification…If relied on
exclusively they may be dangerous.” Quoting Feinstein, Favaloro argued
that medical decisions often had to be made without guidance from
clinical trials: “To acknowledge this reality requires no loss of
reverence, allegiance, or respect for the primacy of randomized trials
as a ‘gold standard’ in scientific research.” Favaloro saw this as a
particular challenge for surgery, but physicians in all specialties have
at times resented the yoke of evidence-based medicine.
Figure: Weighing ingots on the Chancellor Balance at the Royal Mint, Tower Hill, London, UK, in the early 20th century (Print Collector/Getty Images)
The past several years have seen increasing calls for an ecumenical
approach to clinical research, with more flexible standards for what
counts as acceptable study designs. Physicians have developed new
methods to extract robust analyses from patient registries and from the
ever-growing databases provided by electronic medical records. Will this
erode the status of RCTs as a gold standard? The rise of personalised
medicine, meanwhile, might make it more difficult to defend gold
standards in diagnostic and therapeutic practice. Personalised medicine
refocuses clinical attention away from the “typical” patients analysed
by RCTs and onto the idiosyncrasies, genetic or otherwise, of individual
patients. Has the phrase outlived its usefulness in medicine? It is too
soon to tell. Yet even as some physicians turn away from their
commitment to medical gold standards, some politicians, newly wary about
global financial turbulence, talk of restoring the financial gold
standard. Gold standards, whether actual or figurative, represent
structures of exchange and aspirations toward stability, despite
developments that threaten both.
Further reading
Bonchek, LI. Are randomized trials appropriate for evaluating new operations? N Engl J Med. 1979; 301: 44–45
Chalmers, I. The development of fair tests of treatment. Lancet. 2014; 383: 1713–1714
Daly, J. Evidence-based medicine and the search for a science of clinical care. University of California Press, Berkeley; 2005
Duggan, PF. Time to abolish “gold standard”. BMJ. 1992; 304: 1568–1569
Favaloro, RG. Critical analysis of coronary artery bypass graft surgery: a 30-year journey. J Am Coll Cardiol. 1998; 31: 1B–63B
Feinstein, AR and Horwitz, RI. Double standards, scientific methods, and epidemiologic research. N Engl J Med. 1982; 307: 1611–1617
Kleinman, A. What is specific to biomedicine? in: Writing at the margin: discourse between anthropology and medicine. University of California Press, Berkeley; 1995: 21–40
Timmermans, S and Berg, M. The gold standard: the challenge of evidence-based medicine. Temple University Press, Philadelphia, PA; 2003