This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Optimizing the study design of clinical trials to identify the efficacy of artificial intelligence tools in clinical practices – Authors' reply
Citations: 0
Authors: 4
Year: 2019
Abstract
We are pleased to respond to the concerns regarding the study design of our randomized controlled trial (RCT) assessing the efficacy of CC-Cruiser in clinical practice [1]. We appreciate that Qian Zhou and colleagues regarded the paper as "one of the first published RCTs comparing the diagnostic efficacy of artificial intelligence (AI) against experts". However, we would like to clarify the subject of our paper: to explore the real-world performance of AI diagnosis, including its utility and acceptability, using unfiltered clinical data at the current stage. After significance testing, we acknowledged that AI is inferior to doctors.

Qian Zhou and colleagues raised two main concerns: (1) a non-inferiority design would be more appropriate to confirm the efficacy of AI; and (2) a single-arm diagnostic accuracy trial design could avoid the trial effect in the senior-consultant group. We respond point by point.

First, we agree that a non-inferiority design can effectively assess AI performance in clinical practice in some cases. Nevertheless, a design is appropriate only if it is applicable to the specific study. Our study is not a phase 3 clinical trial, as Qian Zhou and colleagues assumed, in which a superiority, equivalence, or non-inferiority design would be needed. We claimed that AI was inferior to doctors but can assist them as a screening tool, given its acceptable diagnostic accuracy, sensitivity, and specificity in the clinical setting, rather than replace them in making the diagnosis. Moreover, an acceptable non-inferiority margin of 5% in diagnostic accuracy is hard to set in our study, because the expected difference between the two groups is at least 5% (not at most 5%, as a non-inferiority design would require), and previous studies provide no comparable reference for choosing the margin [2]. Therefore, a non-inferiority design is improper for our study.

Second, the experts who provided the gold-standard diagnoses, masked to group assignments, as mentioned in Zhou's letter, are not the testing clinicians in the senior-consultant group of our study. Indeed, the trial effect on the testing clinicians may not be avoidable in this RCT. However, the centre effect (confirmed in our analysis) in this multicentre trial may also influence the accuracy results. In addition, a single-arm design cannot assess the clinical differences between the diagnostic setting of medical AI and that of traditional ophthalmic clinics in the real world [3]. We were also interested in comparing the mean time to receive a diagnosis and the level of patient satisfaction, which are important metrics of the diagnostic efficacy of AI tools, and doing so requires a two-arm trial.

In conclusion, this diagnostic RCT is the more suitable choice for this trial and can be regarded as the final frontier for evaluating the clinical difference between the AI diagnostic procedure using CC-Cruiser and traditional eye clinics [4]. We also believe that guidelines on appropriate study designs for AI applications are needed.

Author contributions
Ruiyang Li: Writing - review & editing. Lanqin Zhao: Writing - review & editing. Dongyuan Yun: Writing - review & editing. Haotian Lin: Writing - review & editing.

Declaration of interests
The authors declare no competing financial interests.

Acknowledgments
We thank Qian Zhou and colleagues for their comments on our study and on the improvement of AI applications.

References
[1] Lin H, Li R, Liu Z, et al. Diagnostic efficacy and therapeutic decision-making capacity of an artificial intelligence platform for childhood cataracts in eye clinics: a multicentre randomized controlled trial. EClinicalMedicine. 2019;9:52-59.
[2] Hou Y, Wu XY, Li K. Issues on the selection of non-inferiority margin in clinical trials. Chin Med J (Engl). 2009;122:466-470.
[3] Sambucini V. Comparison of single-arm vs. randomized phase II clinical trials: a Bayesian approach. J Biopharm Stat. 2015;25:474-489.
[4] Rodger M, Ramsay T, Fergusson D. Diagnostic randomized controlled trials: the final frontier. Trials. 2012;13:137.
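The margin argument in the letter can be illustrated with a minimal sketch of the standard one-sided z-test for non-inferiority of two proportions. All numbers below are hypothetical, chosen only to show the mechanics; they are not data from the trial. The point is arithmetic: when the true accuracy gap already equals or exceeds the margin, the test statistic is driven negative and non-inferiority cannot be claimed.

```python
from math import sqrt

def noninferiority_z(p_ai, p_doc, n_ai, n_doc, margin):
    """One-sided z-statistic for non-inferiority of two proportions.

    H0: p_ai is worse than p_doc by at least `margin`.
    A large positive z rejects H0, supporting non-inferiority;
    a negative z means non-inferiority cannot be claimed.
    """
    diff = p_ai - p_doc
    # Unpooled standard error of the difference in proportions.
    se = sqrt(p_ai * (1 - p_ai) / n_ai + p_doc * (1 - p_doc) / n_doc)
    return (diff + margin) / se

# Hypothetical accuracies: AI 87% vs. doctors 99%, 350 patients per arm,
# non-inferiority margin of 5 percentage points.
z = noninferiority_z(p_ai=0.87, p_doc=0.99, n_ai=350, n_doc=350, margin=0.05)
print(round(z, 2))  # → -3.73 (negative: non-inferiority cannot be claimed)
```

With a true gap of 12 points against a 5-point margin, no sample size rescues the test; this is why a 5% margin is unworkable when the expected between-group difference is itself at least 5%.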