This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Generative artificial intelligence as a source of breast cancer information for patients: Proceed with caution
Citations: 13
Authors: 17
Year: 2024
Abstract
BACKGROUND: This study evaluated the accuracy, clinical concordance, and readability of the chatbot interface generative pretrained transformer (ChatGPT) 3.5 as a source of breast cancer information for patients.
METHODS: Twenty questions that patients are likely to ask ChatGPT were identified by breast cancer advocates. These were posed to ChatGPT 3.5 in July 2023 and were repeated three times. Responses were graded in two domains: accuracy (4-point Likert scale, 4 = worst) and clinical concordance (information is clinically similar to physician response; 5-point Likert scale, 5 = not similar at all). The concordance of responses with repetition was estimated using the intraclass correlation coefficient (ICC) of word counts. Response readability was calculated using the Flesch-Kincaid readability scale. References were requested and verified.
RESULTS: The overall average accuracy was 1.88 (range, 1.0-3.0; 95% confidence interval [CI], 1.42-1.94), and clinical concordance was 2.79 (range, 1.0-5.0; 95% CI, 1.94-3.64). The average word count was 310 words per response (range, 146-441 words per response) with high concordance (ICC, 0.75; 95% CI, 0.59-0.91; p < .001). The average readability was poor at 37.9 (range, 18.0-60.5) with high concordance (ICC, 0.73; 95% CI, 0.57-0.90; p < .001). There was a weak correlation between ease of readability and better clinical concordance (-0.15; p = .025). Accuracy did not correlate with readability (0.05; p = .079). The average number of references was 1.97 (range, 1-4; total, 119). ChatGPT cited peer-reviewed articles only once and often referenced nonexistent websites (41%).
CONCLUSIONS: Because ChatGPT 3.5 responses were incorrect 24% of the time and did not provide real references 41% of the time, patients should be cautioned about using ChatGPT for medical information.
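The readability metric used in the study is the Flesch Reading Ease score, which penalizes long sentences and long words; scores below 50 (such as the study's average of 37.9) are considered difficult, college-level text. A minimal sketch of the computation follows; the syllable counter here is a naive vowel-group heuristic of my own (real tools use pronunciation dictionaries), so scores will only approximate published implementations.

```python
import re


def count_syllables(word):
    # Naive heuristic: count runs of consecutive vowels. Real syllable
    # counters handle silent 'e', diphthongs, etc.; this is an assumption
    # for illustration only.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_reading_ease(text):
    # Flesch Reading Ease:
    #   206.835 - 1.015 * (words/sentences) - 84.6 * (syllables/words)
    # Higher is easier to read; ~60-70 is plain English, below 50 is difficult.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

Short, simple sentences score far higher than dense clinical prose, which is why patient-facing material is typically targeted at a score of 60 or above.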
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,693 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,598 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 8,124 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,871 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations