Predicting Harms and Benefits in Translational Trials: Ethics, Evidence, and Uncertainty
Published in the journal:
PLoS Med 8(3): e1001010. doi:10.1371/journal.pmed.1001010
Category:
Essay
doi:
https://doi.org/10.1371/journal.pmed.1001010
Summary Points
- Ethical judgments about risk, benefit, and patient eligibility in clinical trials hinge on predictions about harm, therapeutic response, and clinical promise.
- Predictions for novel interventions in preclinical stages of development suffer from two problems: insufficient attention to threats to validity in preclinical research and a reliance on an overly narrow base of evidence that includes only animal and clinical studies of the intervention in question (“evidential conservatism”).
- To improve ethical and scientific decision-making in early phase studies, decision-makers should explicitly attend to reporting quality and methodological features in preclinical experiments that address threats to internal, construct, and external validity.
- Decision-makers should also use evidence that sheds light on the reliability of causal claims embedded within a proposed trial. This evidence can be gathered from outcomes of previous trials involving agents targeting related biological pathways (“reference classes”).
Introduction
First-in-human clinical trials represent a critical juncture in the translation of laboratory discoveries. However, because they involve the greatest degree of uncertainty at any point in the drug development process, their initiation is beset by a series of nettlesome ethical questions [1]: has clinical promise been sufficiently demonstrated in animals? Should trial access be restricted to patients with refractory disease? Should trials be viewed as therapeutic? Have researchers adequately minimized risks?
The resolution of such ethical questions inevitably turns on claims about future events like harms, therapeutic response, and clinical translation. Recurrent failures in clinical translation, like Eli Lilly's Alzheimer candidate semagacestat, highlight the severe limitations of current methods of prediction. In this case, patients in the active arm of the placebo-controlled trial had earlier onset of dementia and elevated rates of skin cancer [2].
Various authoritative accounts of human research ethics state that decision-making about risk and benefit should be careful, systematic, and non-arbitrary [3]–[5]. Yet, these sources provide little guidance about what kinds of evidence stakeholders should use to ensure their estimates of such events ground responsible ethical decisions. In this article, we suggest that investigators, oversight bodies, and sponsors often base their predictions on a flawed and inappropriately narrow preclinical evidence base.
Prediction and Ethical Decision-Making
According to the core tenets of human research ethics, investigators, sponsors, and institutional review boards (IRBs) are obligated to ensure that risks to volunteers are minimized and balanced favorably with anticipated benefits to society and, if applicable, to the volunteers themselves [4],[6]. Accurate prediction plays a critical role in this process. When research teams underestimate the probability of favorable clinical or translational outcomes, they undermine health care systems by impeding clinical translation. When investigators overestimate the probability of favorable outcomes, they potentially expose individuals to unjustified burdens, which may be considerable for phase 1 studies involving unproven drugs. In both cases, misestimation threatens the integrity of the scientific enterprise, because it frustrates prudent allocation of research resources [7].
Naturally, there are limits to the reliability with which forecasts based on experimental evidence predict clinical outcomes. However, in late stages of clinical development, forecasts underwriting ethical and scientific decision-making have proven fairly reliable. Several analyses of cancer randomized controlled trials indicate that new interventions are just as likely to prove more effective than comparator ones as they are to prove inferior [8]–[10]. Similar findings have been reported for other indications [11]. In the aggregate at least, researchers and review committees neither overestimate nor underestimate the medical benefits of allocating some patients to new interventions and others to standard drugs.
Whether decision-makers utilize evidence as effectively when predicting outcomes in early phase research has not been systematically investigated. Nevertheless, there are grounds for concern such that a systematic investigation is overdue. Highly promising preclinical findings in cancer, stroke, HIV vaccines, and neurodegenerative diseases frequently fail clinical translation. In cancer, only 5% of products entering trials are eventually licensed [12],[13]. In one study, approximately 5% of high impact basic science reports were clinically translated within 10 years [14]. We suggest that these disappointments partly reflect two problems in the way evidence is used in predicting clinical outcomes.
Preclinical Reporting and Validity
First, decision-makers may not be adequately responsive to problems in preclinical research practice [15]. Systematic reviews repeatedly demonstrate that many animal studies do not enable reliable causal inference and clinical generalization because they do not address important threats to internal, construct, and external validity. With respect to the first, one recent analysis of animal studies showed that only 12% used random allocation and 14% used blinded outcome assessment [16]. Construct validity concerns the relationship between clinical implementation of an intervention and implementations evaluated in preclinical studies. A recent review found that preclinical studies of cardiac arrest interventions applied treatment significantly sooner after cardiac events than clinical studies did [17]. In the case of AstraZeneca's failed stroke drug NXY-059, use of normotensive rodents in preclinical development may have led to spurious predictions of clinical activity [18]. Preclinical studies do not always test the extent to which cause and effect relationships hold up under varied conditions (external validity). In a systematic review of neuroprotective agents in phase 2 and 3 trials, only two of ten agents were tested in both rodents and higher order species [19]. Finally, deficiencies in reporting and aggregation of preclinical evidence deprive decision-makers of crucial evidence. In one recent analysis, publication bias in preclinical stroke studies led to a 30% overestimation of treatment effect size [20]. Clearly, preclinical researchers should endeavor to follow reporting guidelines [21] such as the recently proposed Animals In Research: Reporting In Vivo Experiments Guidelines (ARRIVE; http://www.nc3rs.org.uk/page.asp?id=1357) [22], and clinical predictions following from animal studies should take into account deficiencies in design and reporting.
In the case of semagacestat, it has been over 5 years since the drug was first tested in human beings, and preclinical studies have yet to be published. However, narrative reviews by Eli Lilly scientists indicate trials were launched on the basis of molecular, rather than behavioral, endpoints [23]. Although the absence of publication makes any assessment of animal study quality difficult, the use of molecular endpoints raises questions about the construct validity of clinical generalizations drawn from preclinical experiments.
Evidential Conservatism
A second concern about forecasting outcomes in translational trials relates to a tendency to base clinical inferences on a relatively narrow class of evidence: those preclinical studies that involve the particular agent. We call this “evidential conservatism.” Such evidential conservatism is reflected in various policies. For example, the American Society of Clinical Oncology states that “the decision to move an agent into phase I evaluation is based… central[ly on]… the observation of sufficient preclinical antitumor activity, such that a therapeutic effect in human cancer is anticipated” [24],[25]. International Conference on Harmonisation policy requires investigators to furnish ethics review committees with only a narrow type of preclinical evidence [26]. Similarly, some commentators argue that risk-benefit decisions in early phase trials should be driven by mechanistic evidence about an agent [27].
Evidential conservatism, however, fails to address the higher-order question of the reliability of forecasts made from such a narrow evidence base. This higher-order question is of special relevance for early phase research because agents that do not enjoy the support of promising preclinical results will not be plausible candidates for translation. Yet when agents are supported by equally promising preclinical results they may be differentiated by the maturity of the knowledge surrounding a nexus of variables concerning the relationship between test and target populations.
For instance, although neuroprotective stroke treatments have moved to translation on the basis of very encouraging preclinical studies, they have consistently failed randomized trials. Estimates of the risks and benefits of any particular neuroprotective compound that are based solely on preclinical evaluation of that compound will be less reliable than those that incorporate information about the relative success of neuroprotective compounds as a class. In part, this is because the success or failure of other interventions in this reference class provides evidence about the degree to which clinical development is guided by a reliable working knowledge of relevant disease processes.
Our claim that decision-makers need to use a broader base of evidence for evaluating early phase research is consistent with a recent call for incorporating whole research program outcomes into systematic reviews of particular agents [28].
Assessing Relevant Evidence
How might researchers depart from evidential conservatism in a way that is open to scrutiny and amenable to assessment, revision, and improvement? Decision-makers who make forecasts about agent activity in early phase research must identify reference classes that are relevant to the decision at hand. Delimiting the reference class of relevant evidence poses a challenge in that interventions possess limitless characteristics. A drug might be classed within neuroprotective compounds, stroke drugs, and drugs beginning with the letter “n.” Decision-makers thus confront the timeless problem of selecting those characteristics most salient for prediction.
There are no simple formulas here. In some cases, choice of reference classes will be straightforward (e.g., a new, small molecule HMG-CoA reductase inhibitor); in other cases, consensus may be elusive. Nevertheless, we suggest that the very act of attending to reference class identity would be a departure from evidential conservatism. As a starting place, decision-makers should identify reference classes that index the maturity of knowledge regarding central causal premises embedded within a protocol. In an era in which basic science heavily informs product development, drug developers themselves often class their agents according to explicit ambitions about causal pathways. Asserting that a drug targets a particular pathophysiologic process should prompt us to look at how other drugs that target the same process performed in clinical translation. We can then base our estimates of the maturity of knowledge about these causal premises on the success or failure of past attempts at redeeming these ambitions. Decision-makers should therefore adjust their confidence in clinical generalizations on the basis of outcomes with previous interventions that addressed the same pathological processes.
Semagacestat was screened and designed to target amyloid-β production, which is believed to be a key step in dementia onset. Eight other anti-amyloid drugs have either failed randomized trials or been abandoned due to toxicity (Table 1) [29],[30]. Although a variant of this approach may eventually succeed, promising preclinical evidence supporting semagacestat should have been tempered by the accumulation of data about outcomes in the same reference class.
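The tempering described above can be illustrated with a toy calculation. The sketch below is not the authors' method. It assumes a uniform Beta(1, 1) prior and treats each earlier agent in a reference class as an independent attempt with the same chance of success, both of which are simplifying assumptions. The function name `reference_class_estimate` is hypothetical. The count of eight earlier anti-amyloid agents without success is taken from the text.

```python
from fractions import Fraction


def reference_class_estimate(successes, attempts, prior_a=1, prior_b=1):
    """Posterior mean probability of success for the next agent in a class.

    Assumes a Beta(prior_a, prior_b) prior and independent, exchangeable
    attempts; both are illustrative assumptions, not claims from the essay.
    """
    if attempts < 0 or not 0 <= successes <= attempts:
        raise ValueError("need 0 <= successes <= attempts")
    # Mean of the Beta posterior after observing the reference class outcomes.
    return Fraction(successes + prior_a, attempts + prior_a + prior_b)


# Eight earlier anti-amyloid agents, none successful (count from the text).
estimate = reference_class_estimate(successes=0, attempts=8)
print(float(estimate))  # 0.1
```

Under these assumptions, a ninth agent in the class would start with roughly a one in ten chance of success before its own preclinical data are weighed. A different prior, or a differently drawn reference class, would give a different figure, which is why the choice of class needs to be explicit and open to scrutiny.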
Practical Implications
To illustrate how our suggestions interface with ethical decision-making, consider recent proposals to reinitiate trials of fetal-derived tissues for Parkinson's disease [31]. Previous trials involved treatment-refractory patients, but investigators are now proposing trials involving patients with recent onset. The rationale is that fetal-derived tissues can only protect dopaminergic neurons to the extent that the latter remain intact. However, the risk-benefit balance is contentious, because the trial will expose patients who can manage symptoms with standard treatments to the risks of neurosurgery, immunosuppression, and cell transplantation.
According to evidential conservatism, investigators and ethics bodies should evaluate the risk-benefit balance by consulting preclinical studies and the biological rationale for patient-subject selection. One commentator notes that, on the basis of preclinical studies showing the intervention is designed to address early disease processes, performing studies in patients with advanced disease would be unethical [27]. We think this way of using evidence in ethical evaluation is misguided.
Our proposal directs decision-makers to make risk-benefit decisions in light of two additional factors. First, to what degree do the preclinical studies incorporate design elements that support reliable inferences about clinical activity? This directs stakeholders to attend to those methodological features of the preclinical studies that support credible claims of internal, construct, and external validity. As these preclinical studies are presently underway, researchers have an opportunity to overcome past limitations in addressing validity threats in Parkinson's disease models [32].
Second, our proposal directs stakeholders to consider evidence that sheds light on the maturity of the knowledge relating to key causal claims presupposed by therapeutic predictions. As investigators propose to intervene in degenerative processes, a claim of therapeutic action would need to be evaluated in light of outcomes in previous Parkinson's trials involving surgically delivered neuroprotective agents and/or transplanted tissues. No such strategies have produced positive randomized trials (Table 2). Accordingly, even with carefully collected preclinical evidence, decision-makers should approach new trials with modest therapeutic expectations.
Thoughtful commentators have argued that, before initiating cell-based dopamine replacement, strategies should be “clinically competitive” with standard of care [33]. However, this may present an unworkable standard [34]. Previous unsuccessful attempts at translation betray profound uncertainty concerning risks and benefits for research volunteers. Given the preliminary nature of such interventions, the ethical justification for their administration in early phase trials should not hinge on the prospect of benefit for volunteers. It should rest instead on a compelling claim of knowledge value and on the reduction of avoidable risks. The latter entails pursuing trials in patients less likely to suffer opportunity costs from study participation, and maintaining a background of medical management that does not fall below standard of care. Rather than telling volunteers that the approach is comparable to standard of care, the consent process should emphasize that clinical benefit is unlikely.
Conclusion
Systematic study of preclinical research has centered on stroke and practices focused on internal validity. Our proposal makes clear the need to broaden the scope of this research agenda to cover a wider range of preclinical research, and to expand its focus to include issues of construct and external validity. A key component of this process will involve creating databases for aggregating translational outcomes according to relevant reference classes.
Some may worry that such an analysis might produce less optimistic predictions, and hence stymie product development. However, we do not see how medicine is advanced by forging ahead on the basis of predictions of dubious reliability. Moreover, there are many productive ways in which stakeholders may respond to less optimistic projections. For instance, review of relevant information may prompt researchers to test certain hypotheses before moving ahead with human trials. Investigators might adjust the design of translational studies to align the risk profile with ethical judgments. Or, investigators might decide that moving forward with a protocol represents the best way to advance a particular scientific initiative, but that risks can only be justified by appealing to the value of the knowledge sought, rather than the product's therapeutic activity.
Stakeholders might already adjust their predictions in light of intuitions about validity or experiences with success or failure for similar agents. If so, they do so on the basis of private beliefs, and often without the data needed to make these adjustments systematically. Our approach provides a more publicly accessible basis for making and adjudicating risk-benefit predictions. We suggest that this would better cohere with a sage prescription offered by the National Commission: “there should first be a determination of the validity of the presuppositions of the research…. The method of ascertaining risks should be explicit… It should also be determined whether an investigator's estimates of the probability of harm or benefits are reasonable, as judged by known facts or other available studies” [3].
References
1. Kimmelman J (2010) Gene transfer and the ethics of first-in-human research: lost in translation. Cambridge: Cambridge University Press.
2. Extance A (2010) Alzheimer's failure raises questions about disease-modifying strategies. Nat Rev Drug Discov 9: 749-751.
3. The National Commission for the Protection of Human Subjects of Biomedical and Behavioural Research (1979) The Belmont report: ethical principles and guidelines for the protection of human subjects of research. Bethesda: Department of Health Education and Welfare.
4. World Medical Association (1964) Declaration of Helsinki. Helsinki: 18th World Medical Assembly.
5. Mann H (2010) ASSERT: a standard for the review and monitoring of randomized clinical trials. Available: http://www.assert-statement.org/. Accessed 31 January 2011.
6. Department of Health and Human Services (2005) Protection of human subjects: criteria for IRB approval of research. Title 45 CFR 46.111(a)(1). 1-12.
7. London AJ, Kimmelman J, Emborg ME (2010) Research ethics. Beyond access vs. protection in trials of innovative therapies. Science 328: 829-830.
8. Djulbegovic B, Kumar A, Soares HP, Hozo I, Bepler G (2008) Treatment success in cancer: new cancer treatment successes identified in phase 3 randomized controlled trials conducted by the National Cancer Institute-sponsored cooperative oncology groups, 1955 to 2006. Arch Intern Med 168: 632-642.
9. Kumar A, Soares H, Wells R, Clarke M, Hozo I (2005) Are experimental treatments for cancer in children superior to established treatments? Observational study of randomised controlled trials by the Children's Oncology Group. BMJ 331: 1295.
10. Soares HP, Kumar A, Daniels S, Swann S, Cantor A (2005) Evaluation of new treatments in radiation oncology: are they better than standard treatments? JAMA 293: 970-978.
11. Gross CP, Krumholz HM, Van Wye G, Emanuel EJ, Wendler D (2006) Does random treatment assignment cause harm to research participants? PLoS Med 3: e188. doi:10.1371/journal.pmed.0030188
12. Kola I, Landis J (2004) Can the pharmaceutical industry reduce attrition rates? Nat Rev Drug Discov 3: 711-715.
13. Pangalos MN, Schechter LE, Hurko O (2007) Drug development for CNS disorders: strategies for balancing risk and reducing attrition. Nat Rev Drug Discov 6: 521-532.
14. Contopoulos-Ioannidis DG, Ntzani E, Ioannidis JP (2003) Translation of highly promising basic science research into clinical applications. Am J Med 114: 477-484.
15. van der Worp HB, Howells DW, Sena ES, Porritt MJ, Rewell S (2010) Can animal models of disease reliably inform human studies? PLoS Med 7: e1000245. doi:10.1371/journal.pmed.1000245
16. Kilkenny C, Parsons N, Kadyszewski E, Festing MF, Cuthill IC (2009) Survey of the quality of experimental design, statistical analysis and reporting of research using animals. PLoS ONE 4: e7824. doi:10.1371/journal.pone.0007824
17. Reynolds JC, Rittenberger JC, Menegazzi JJ (2007) Drug administration in animal studies of cardiac arrest does not reflect human clinical experience. Resuscitation 74: 13-26.
18. Bath PM, Gray LJ, Bath AJ, Buchan A, Miyata T (2009) Effects of NXY-059 in experimental stroke: an individual animal meta-analysis. Br J Pharmacol 157: 1157-1171.
19. Philip M, Benatar M, Fisher M, Savitz SI (2009) Methodological quality of animal studies of neuroprotective agents currently in phase II/III acute ischemic stroke trials. Stroke 40: 577-581.
20. Sena ES, van der Worp HB, Bath PM, Howells DW, Macleod M (2010) Publication bias in reports of animal stroke studies leads to major overstatement of efficacy. PLoS Biol 8: e1000344. doi:10.1371/journal.pbio.1000344
21. Macleod MR, Fisher M, O'Collins VE, Sena ES, Dirnagl U (2009) Good laboratory practice. Preventing introduction of bias at the bench. Stroke 40: e50-e52.
22. Kilkenny C, Browne W, Cuthill I, Emerson M, Altman D (2010) Improving bioscience research reporting: the ARRIVE guidelines for reporting animal research. PLoS Biol 8: e1000412. doi:10.1371/journal.pbio.1000412
23. Henley DB, May PC, Dean RA, Siemers ER (2009) Development of semagacestat (LY450139), a functional gamma-secretase inhibitor, for the treatment of Alzheimer's disease. Expert Opin Pharmacother 10: 1657-1664.
24. American Society of Clinical Oncology (1997) Critical role of phase I clinical trials in cancer treatment. J Clin Oncol 15: 853-859.
25. Christian M, Shoemaker D (2002) The investigator's handbook: a manual for participants in clinical trials of investigational agents sponsored by DCTD, NCI. Bethesda: Cancer Therapy Evaluation Program.
26. International Conference on Harmonisation of Technical Requirements for Registration of Pharmaceuticals for Human Use (ICH) (1996) ICH Harmonized Tripartite Guideline. Guideline for Good Clinical Practice E6(R1).
27. Lowenstein PR (2008) A call for physiopathological ethics. Mol Ther 16: 1771-1772.
28. Ioannidis JP, Karassa FB (2010) The need to consider the wider agenda in systematic reviews and meta-analyses: breadth, timing, and depth of the evidence. BMJ 341: c4875.
29. Mangialasche F, Solomon A, Winblad B, Mecocci P, Kivipelto M (2010) Alzheimer's disease: clinical trials and drug development. Lancet Neurol 9: 702-716.
30. Cummings J (2010) What can be inferred from the interruption of the semagacestat trial for treatment of Alzheimer's disease? Biol Psychiatry 68: 876-878.
31. Holden C (2009) Neuroscience. Fetal cells again? Science 326: 358-359.
32. Kimmelman J, London AJ, Ravina B, Ramsay T, Bernstein M (2009) Launching invasive, first-in-human trials against Parkinson's disease: ethical considerations. Mov Disord 24: 1893-1901.
33. Lindvall O, Kokaia Z (2010) Stem cells in human neurodegenerative disorders - time for clinical translation? J Clin Invest 120: 29-40.
34. Anderson JA, Kimmelman J (2010) Extending clinical equipoise to phase 1 trials involving patients: unresolved problems. Kennedy Inst Ethics J 20: 75-98.
35. Gilman S, Koller M, Black RS, Jenkins L, Griffith SG (2005) Clinical effects of Abeta immunization (AN1792) in patients with AD in an interrupted trial. Neurology 64: 1553-1562.
36. Feldman HH, Doody RS, Kivipelto M, Sparks DL, Waters DD (2010) Randomized controlled trial of atorvastatin in mild to moderate Alzheimer disease: LEADe. Neurology 74: 956-964.
37. Elan Corporation (2010) Elan and Transition Therapeutics announce topline summary results of Phase 2 study and plans for Phase 3 for ELND005 (Scyllo-inositol) [press release].
38. Salloway S, Sperling R, Gilman S, Fox NC, Blennow K (2009) A phase 2 multiple ascending dose trial of bapineuzumab in mild to moderate Alzheimer disease. Neurology 73: 2061-2070.
39. Winblad B, Giacobini E, Frolich L, Friedhoff LT, Bruinsma G (2010) Phenserine efficacy in Alzheimer's disease. J Alzheimers Dis 22: 1201-1208.
40. Gold M, Alderton C, Zvartau-Hind M, Egginton S, Saunders AM (2010) Rosiglitazone monotherapy in mild-to-moderate Alzheimer's disease: results from a randomized, double-blind, placebo-controlled phase III study. Dement Geriatr Cogn Disord 30: 131-146.
41. Green RC, Schneider LS, Amato DA, Beelen AP, Wilcock G (2009) Effect of tarenflurbil on cognitive decline and activities of daily living in patients with mild Alzheimer disease: a randomized controlled trial. JAMA 302: 2557-2564.
42. Bellus Health Inc (2008) Neurochem announces results from Tramiprosate (ALZHEMED(TM)) North American Phase III clinical trial.
43. Marks WJ Jr, Bartus RT, Siffert J, Davis CS, Lozano A (2010) Gene delivery of AAV2-neurturin for Parkinson's disease: a double-blind, randomised, controlled trial. Lancet Neurol 9: 1164-1172.
44. Nutt J, Burchiel KJ, Comella CL, Jankovic J, Lang AE (2003) Randomized, double-blind trial of glial cell line-derived neurotrophic factor (GDNF) in PD. Neurology 60: 69-73.
45. Lang AE, Gill S, Patel NK, Lozano A, Nutt JG (2006) Randomized controlled trial of intraputamenal glial cell line-derived neurotrophic factor infusion in Parkinson disease. Ann Neurol 59: 459-466.
46. Freed CR, Greene PE, Breeze RE, Tsai WY, DuMouchel W (2001) Transplantation of embryonic dopamine neurons for severe Parkinson's disease. N Engl J Med 344: 710-719.
47. Olanow CW, Goetz CG, Kordower JH, Stoessl AJ, Sossi V (2003) A double-blind controlled trial of bilateral fetal nigral transplantation in Parkinson's disease. Ann Neurol 54: 403-414.
48. Watts RL, Freeman TB, Hauser RA, Bakay RAE, Ellias SA (2001) A double-blind, randomised, controlled, multicenter clinical trial of the safety and efficacy of stereotaxic intrastriatal implantation of fetal porcine ventral mesencephalic tissue (Neurocelli-PD) vs. imitation surgery in patients with Parkinson's disease (PD). Parkinsonism Relat Disord 7 (Suppl): S87.
49. Olanow CW, Stern MB, Sethi K (2009) The scientific and clinical basis for the treatment of Parkinson disease. Neurology 72 (21 Suppl 4): S1-S136.
Article published in the journal: PLOS Medicine, 2011, Issue 3