This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Abstract 4370282: The Latest Large Language Model, Grok: Can It Provide Education about Atrial Fibrillation for Diverse Populations?
Citations: 0
Authors: 11
Year: 2025
Abstract
Background: Large language models (LLMs) are used by atrial fibrillation patients. ChatGPT (OpenAI, San Francisco) and Grok (X.ai, San Francisco) have 450 million and 35 million monthly users, respectively. Grok is the newest LLM; it is open-sourced, uses Mixture of Experts algorithms, has 314 billion parameters, and is known for its STEM answers and its use of X (formerly Twitter) as a data source. Grok was meant to be conversational in tone. LLMs are trained on data sets that are initially prepared by software engineers and later, in part or exclusively, by AI. It is not known whether Grok's responses to atrial fibrillation queries differ by patient gender and race/ethnicity.
Methods: We used the query: “I am a 68-year-old [ethnic/racial group] [male/female] with atrial fibrillation. I had a heart attack 2 years ago with stents. What can I expect from my cardiologist?” We varied the query across three ethnic groups (White, African American, and Latinx) and male/female gender, yielding six prompt variants. Responses were analyzed by Word Count (WC) and Flesch-Kincaid Grade Level (FK). ChatGPT4.5 reviewed the LLM responses for cultural sensitivity.
Results: Average WC: ChatGPT = 312.5±110.5, Grok = 830.7±104.7. Average FK: ChatGPT = 10.7±0.9, Grok = 10.3±1.0. Grok showed high cultural sensitivity for African American female and Latinx users, e.g., regarding diet and cardiovascular risk factors. Male and female prompts were treated equitably in tone, depth, and scope. However, Grok did not incorporate culturally relevant content for White male or female users. For the Hispanic prompt, Grok mentioned the existence of “language services” but provided no website links or related organizations for further help. CHA2DS2-VASc was mentioned by both ChatGPT and Grok. Grok had a lower reading grade level than ChatGPT for White males, Black females, and Hispanic males, which may reflect its use of X (formerly Twitter) data. Grok gave its longest response for Black females, compared with all other groups in this small study.
Conclusion: Grok, the latest LLM, competes well with ChatGPT in the thoroughness and factual content of its medical education answers. Reading level, however, varies by racial/ethnic group and gender.
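The two response metrics named in the Methods, Word Count and Flesch-Kincaid Grade Level, can be illustrated with a short sketch. The abstract does not say which tool the authors used, so everything below is an assumption for illustration only: the function names are hypothetical, words are counted as whitespace-separated tokens, sentences are split on terminal punctuation, and syllables are estimated with a rough vowel-group heuristic. The grade formula is the standard Flesch-Kincaid one, 0.39 × (words / sentences) + 11.8 × (syllables / words) − 15.59. The sketch also expands the query template into the six group/gender variants.

```python
import re
from itertools import product

# Query template from the Methods; the placeholder names are our own.
PROMPT_TEMPLATE = (
    "I am a 68-year-old {group} {gender} with atrial fibrillation. "
    "I had a heart attack 2 years ago with stents. "
    "What can I expect from my cardiologist?"
)
GROUPS = ["White", "African American", "Latinx"]
GENDERS = ["male", "female"]


def build_prompts():
    """Return the six prompt variants (3 groups x 2 genders)."""
    return [
        PROMPT_TEMPLATE.format(group=group, gender=gender)
        for group, gender in product(GROUPS, GENDERS)
    ]


def count_words(text):
    """Word Count (WC): whitespace-separated tokens (an assumption)."""
    return len(text.split())


def count_sentences(text):
    """Count non-empty chunks between '.', '!' and '?' (heuristic)."""
    return len([s for s in re.split(r"[.!?]+", text) if s.strip()])


def count_syllables(word):
    """Rough syllable estimate: vowel groups, minus a silent final 'e'."""
    letters = re.sub(r"[^a-z]", "", word.lower())
    if not letters:
        return 1  # treat numbers and symbols as one syllable
    n = len(re.findall(r"[aeiouy]+", letters))
    if letters.endswith("e") and not letters.endswith("le") and n > 1:
        n -= 1
    return max(n, 1)


def flesch_kincaid_grade(text):
    """Standard Flesch-Kincaid Grade Level formula on heuristic counts."""
    words = text.split()
    sentences = count_sentences(text)
    if not words or sentences == 0:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / sentences + 11.8 * syllables / len(words) - 15.59


if __name__ == "__main__":
    for prompt in build_prompts():
        print(count_words(prompt), round(flesch_kincaid_grade(prompt), 1), prompt)
```

Because the syllable and sentence counters are heuristics, grade levels from this sketch will differ somewhat from those of dedicated readability tools, so it should not be expected to reproduce the values reported in the Results.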
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,250 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,109 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,482 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,434 citations
Authors
Institutions
- Clovis Oncology (United States) (US)
- Universidad para la Cooperación Internacional (CR)
- Nova Southeastern University (US)
- Palo Alto University (US)
- Stanford University (US)
- University of California, San Diego (US)
- University of California, Davis (US)
- Boston University (US)
- San Francisco State University (US)
- Santa Clara University (US)