This is an overview page with metadata for this scientific article. The full article is available from the publisher.
Decoding ChatGPT’s ‘impact’ on the future of healthcare
Citations: 11
Authors: 2
Year: 2023
Abstract
Neural networks with many computational layers have been used as deep learning models to solve real-world problems.[1] Mostly composed of networks of artificial neurons, these are trained on existing datasets to learn novel patterns in applications ranging from image processing and speech recognition to human-like navigation of autonomous cars.[2] One area that has seen immense progress is natural language processing, which uses a specific class of deep learning model, known as a transformer,[3] that is scalable with respect to model size and training data. These language models have outperformed other deep learning architectures such as recurrent neural networks,[1] which were traditionally used for natural language processing tasks.[4] More recently, language models have formed the backbone of generative artificial intelligence (AI) tools, which are trained on huge, often publicly available datasets to generate human-like responses on diverse topics. Two prominent examples are ChatGPT, which generates human-like responses to text prompts, and Stable Diffusion, which generates images based on text prompts. ChatGPT is a large language model developed by OpenAI, designed to understand natural language and communicate with humans via a text-based conversational interface.[5] Derived from OpenAI's GPT-3.5, it has been trained on a tremendous corpus of textual data extracted from the Internet, including books, articles, websites, and software repositories, and then refined using a technique known as reinforcement learning from human feedback, whereby human supervisors encouraged the model to follow instructions and be more conversational. Presumably because of this broad training corpus and the billions of parameters making up the underlying GPT models, ChatGPT has demonstrated significant capabilities across a diverse range of tasks.
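The generation process described above is autoregressive: the model repeatedly predicts the most likely next token given what it has produced so far. The following toy sketch illustrates only that loop; the hand-written bigram table and all of its probabilities are invented stand-ins for what a real transformer learns from billions of parameters.

```python
# Toy illustration of autoregressive text generation, the core loop behind
# language models such as GPT. The bigram table below is a made-up stand-in
# for a trained model; every probability here is invented for illustration.
BIGRAM_PROBS = {
    "the": {"patient": 0.6, "doctor": 0.4},
    "patient": {"was": 1.0},
    "doctor": {"was": 1.0},
    "was": {"treated": 0.7, "seen": 0.3},
}

def generate(prompt_token, max_new_tokens=3):
    """Greedily emit the most likely next token until no continuation exists."""
    tokens = [prompt_token]
    for _ in range(max_new_tokens):
        choices = BIGRAM_PROBS.get(tokens[-1])
        if not choices:
            break
        # Greedy decoding: always pick the highest-probability continuation.
        tokens.append(max(choices, key=choices.get))
    return " ".join(tokens)

print(generate("the"))  # the patient was treated
```

Real systems replace the lookup table with a learned network and sample from the predicted distribution rather than always taking the argmax, but the outer loop is the same.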
Due to the strength of its prose, ChatGPT is capable of passing tests and examination questions from the law, medical, programming,[6–8] and creative writing and literature domains. Applications may even be built atop it, for instance using its outputs to control robotics.[9,10] These capabilities are making ChatGPT an extremely popular tool: since its release in November 2022, it is estimated that ChatGPT has already reached more than 100 million unique users,[11] making it possibly the fastest-growing consumer application of the last 20 years. Such statistics cannot be ignored: the rise of generative AI such as ChatGPT has been meteoric. This naturally poses the question: what does this mean for professionals such as those who work in the healthcare sector? The answer was explored by Parikh et al.,[12] who surveyed 210 professionals (157, or 74.8%, from healthcare) to gauge their opinion on this matter by circulating a short questionnaire comprising nine multiple-choice questions. They found that within this group there was already significant awareness of ChatGPT and its potential (approx. 63%), and a sizeable minority (approx. 42%) had already tried asking it questions. When asked how much ChatGPT would revolutionize their fields, the majority from both groups expected the change to be less than 50%. Characterizing this change further, medical professionals were more likely to believe that any impact of ChatGPT would be positive (approx. 52%) than other professionals (approx. 38%). Few participants planned to make any significant changes to their career plans in 2023 based on the impact of ChatGPT-like generative AI. We may start by pointing out some positives of this study. First, it shows that there is considerable awareness of generative AI among healthcare professionals.
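The headline proportion from the Parikh et al. survey can be re-derived from the two counts reported above (N = 210 respondents, 157 from healthcare); the raw counts behind the "approx." percentages are not given here, so only the healthcare share is checked exactly.

```python
# Re-deriving the reported healthcare share of the Parikh et al. sample.
N_TOTAL = 210        # total survey respondents
N_HEALTHCARE = 157   # respondents from healthcare

def share(count, total):
    """Return a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)

print(share(N_HEALTHCARE, N_TOTAL))  # 74.8, matching the figure in the text
```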
Second, an early publication on this topic in a medical journal will help to raise awareness among readers of this journal, who are most likely healthcare professionals. However, the major challenges in such a study lie in careful consideration of inclusion and exclusion criteria, identification of suitable multiple-choice questions and a rationale for their selection, and finally ethics approval. All of these were lacking in the study by Parikh et al. Hence, we are concerned about the validity of the findings, and we elaborate on these concerns in the following paragraphs. The first limitation of this study concerns the relatively small, self-selected sample. Given that ChatGPT already has more than 100 million users, a group of N = 210 drawn from "a survey link… shared among healthcare and other professionals who had previously participated in our academic activities" could be distorting the results.[13] Additionally, as there were no well-thought-out exclusion criteria, the study questions appeared somewhat ad hoc in nature. For example, if the study had been advertised widely, a lack of familiarity with ChatGPT could have served as an exclusion criterion. Having established this point, questions 2–4 could have been eliminated in favor of questions more relevant to the study design. Additionally, question 9 seems random and irrelevant to the conclusions drawn: how many people in the study were actually familiar with the work of Stephen Hawking to which this question referred? Instead, some questions could have been crafted about the effectiveness of ChatGPT in helping individuals in their profession. Questions probing awareness that ChatGPT can provide biased, inaccurate, and misleading answers could also have been included.
Second, with any opinion-based study like this, it is very difficult to quantify what the impact might be, especially given the broad range of responsibilities across professionals in the medical field. How should someone quantify an impact, especially as an arbitrary percentage? For instance, imagine a hypothetical scenario in which ChatGPT takes over the initial triage of patients based on how they describe their symptoms. Such a process would substantially change patient flow through a hospital emergency department. How would one quantify this change as a percentage? If you were a dentist, the impact of this change on you might be low; if you were an emergency room trauma nurse, it would perhaps be higher. Even then, what would low or high mean when trying to score an opinion-based impact on an arbitrary above-or-below-50% scale? Is changing or augmenting the duties of your job a big impact, or is that reserved for cases where careers are fundamentally changed? While there is considerable reason to remain skeptical about any such potential application of ChatGPT in the medical field, consider a different domain where generative AI is proving genuinely disruptive. Few programmers even considered that AI tools might be able to write code; a widely cited survey of opinions on AI trends from 2018[14] does not even mention code-writing as a potential skill. Yet in 2021, just 3 years later, GitHub Copilot was released: a generative AI platform that now writes up to 40% of the code for the 1.2 million developers using it.[15,16] Is it so far-fetched to imagine that the medical field may face such a change in the future? As noted, ChatGPT has already demonstrated the ability to retain and present medical information on par with that of medical students.[17,18] Unlike programming, however, healthcare is a highly regulated domain due to the critical nature of medical decisions.
A wrong medication, a false diagnosis, or a delayed treatment resulting from an AI-based decision could have catastrophic consequences. Hence, drugs and devices go through many years of testing and validation even before embarking on clinical trials involving human subjects. Any clinical trial, whether involving human or animal subjects, must have suitable ethics approval. Furthermore, any medical device with safety implications needs safety certification by agencies such as the Food and Drug Administration. Hence, the use of generative AI in a domain with serious safety implications needs careful consideration, including the need for suitable regulation.[18] Of course, at this time the discussion remains firmly hypothetical: given the propensity of models like ChatGPT to 'hallucinate',[19] putting them in charge of potentially life-or-death situations seems considerably premature. Still, as we are already seeing substantial demand for the adoption of other kinds of AI throughout the medical setting,[20–24] it seems logical to assume that ChatGPT and its peer models will also soon be in demand, and further research will no doubt explore their applications, especially once such AI can be verified for its clinical efficacy.[25] ChatGPT itself has an "opinion" on this topic. When prompted with, "Given that you are an artificial AI which knows data about the medical field, what do you think the impact of ChatGPT and other generative AI will be if applied to the medical field?" it responded that it has the potential to revolutionize the field in many ways.
By leveraging vast amounts of medical and clinical data from a variety of sources, it claims it will be able to recognize patterns and make connections that help identify diseases, improve diagnosis and treatment, develop new drugs and treatments, personalize medicine, and assist in medical education and research, concluding, "Overall, generative AI has the potential to improve healthcare outcomes, speed up medical research, and reduce healthcare costs. However, it is important to note that AI is not a substitute for human expertise and judgment. AI should be used to augment the work of healthcare professionals, not replace them." Where do we go from here? It is quite clear that the AI "genie" is out of the "bottle." Generative AI models capable of producing vast quantities of human-like prose are proliferating, ChatGPT included. With the release of its application programming interface,[1] it is clear that companies like OpenAI are making large commercial bets on the successful adoption of these tools across industries. They certainly seem to believe that there will be an impact; it remains to be seen what this impact will be. The leaders of the healthcare community need to come to terms with this technology urgently, so that careful regulation may be introduced to govern how AI can be verified and used for healthcare-related purposes.
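For readers unfamiliar with the application programming interface mentioned above, the sketch below shows how a single-turn prompt, such as the one posed to ChatGPT earlier, would be packaged for OpenAI's chat completions endpoint. This is a minimal illustration: the model name and endpoint reflect the API at the time of writing and may change, and since an actual call requires an API key, the request is only constructed here, not sent.

```python
import json

# Endpoint for OpenAI's chat completions API at the time of writing.
ENDPOINT = "https://api.openai.com/v1/chat/completions"

def build_request(prompt, model="gpt-3.5-turbo"):
    """Assemble the JSON body for a single-turn chat completion request."""
    return {
        "model": model,
        "messages": [
            {"role": "user", "content": prompt},
        ],
    }

payload = build_request(
    "What do you think the impact of generative AI will be "
    "if applied to the medical field?"
)
print(json.dumps(payload, indent=2))
```

In practice the payload would be POSTed to the endpoint with an `Authorization` header carrying the caller's API key, and the model's reply would arrive in the response body.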