
This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Utilizing ChatGPT for assessing disease volume in patients (pts) with metastatic prostate cancer (mPC).

2025 · 0 citations · Journal of Clinical Oncology

Citations: 0 · Authors: 15 · Year: 2025

Abstract

42 Background: Artificial intelligence (AI) has transformed many aspects of healthcare, particularly medical imaging analysis. In mPC, accurate identification of bone metastases is crucial for guiding pt management. This proof-of-concept study explores the application of ChatGPT-4o, a multimodal generative pre-trained transformer (large language model) with emerging image-recognition capabilities, to diagnosing bone metastases from bone scans and determining disease volume in pts with mPC.

Methods: In this IRB-approved, retrospective study, pts with newly diagnosed mPC and bone-predominant disease were randomly selected. Pts who had received bone-protecting agents, chemotherapy, hormone therapy, or radiation therapy prior to the bone scan were excluded. High-volume versus low-volume bone disease was classified from radiologist reports using the CHAARTED criteria. The bone scan images were then uploaded to ChatGPT, which was instructed to act as a radiologist and asked to classify the disease volume. ChatGPT's classifications were compared with the radiologists' findings using Cohen's kappa. A confusion matrix was used to derive the sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) of ChatGPT.

Results: A total of 110 pts were selected, with a median age of 67 years (range: 44-89). The majority were Caucasian (98.2%). The median Gleason score was 9, and 77.3% of pts had de novo metastatic disease. The median PSA at diagnosis was 35.2 ng/ml (range: 1-5843). High-volume disease was present in 52% of pts. ChatGPT's overall concordance with the radiologists was 74% (Cohen's kappa = 0.65). For high-volume disease, ChatGPT demonstrated a high sensitivity of 92.3% (48/52). However, it misclassified 45.8% of low-volume cases as high-volume (specificity: 54.2%). The PPV and NPV of ChatGPT were 68.6% and 86.7%, respectively. Interestingly, ChatGPT's accuracy was significantly higher when each case was analyzed individually (77.7%) than when cases were grouped in a single conversation (43.5%; p < 0.001).

Conclusions: This hypothesis-driven study demonstrated that ChatGPT-4o has the potential to read medical images independently and apply existing knowledge, such as the CHAARTED criteria, to assist in content generation. However, its reliability remains a significant challenge for broader use in healthcare. Notably, it exhibited "information fatigue": accuracy declined significantly as similar or repetitive information accumulated within a single conversation.
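The diagnostic metrics reported in the abstract follow from a standard 2×2 confusion matrix. The counts in the sketch below are back-solved from the published percentages (sensitivity 92.3% = 48/52, specificity 54.2%, PPV 68.6%, NPV 86.7%, concordance 74%) and are an illustrative assumption, not the study's raw data; note they sum to 100 evaluable cases, which is consistent with the 74% concordance figure but not separately stated in the abstract.

```python
# 2x2 confusion matrix for high-volume (positive) vs low-volume (negative)
# disease. Counts are ASSUMED: back-solved from the abstract's percentages,
# not taken from the published dataset.
tp, fn = 48, 4   # high-volume cases: correctly classified vs missed
fp, tn = 22, 26  # low-volume cases: misclassified as high-volume vs correct

sensitivity = tp / (tp + fn)                # true positive rate
specificity = tn / (tn + fp)                # true negative rate
ppv = tp / (tp + fp)                        # positive predictive value
npv = tn / (tn + fn)                        # negative predictive value
accuracy = (tp + tn) / (tp + fn + fp + tn)  # overall concordance

print(f"Sensitivity: {sensitivity:.1%}")  # 92.3%
print(f"Specificity: {specificity:.1%}")  # 54.2%
print(f"PPV:         {ppv:.1%}")          # 68.6%
print(f"NPV:         {npv:.1%}")          # 86.7%
print(f"Accuracy:    {accuracy:.1%}")     # 74.0%
```

With these assumed counts every derived metric reproduces the abstract's figures to one decimal place, which is why they are a plausible reconstruction; any other integer solution compatible with all five percentages would differ only trivially.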
