This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Risky or rigorous? Developing trustworthiness criteria for AI-supported qualitative data analysis

2025 · 1 citation · Anatomical Sciences Education · Open Access

Abstract

The artificial intelligence (AI) movement, which some link to the 4th industrial revolution,1, 2 is infiltrating and impacting many aspects of academia, including teaching and assessment,3 lab-based research, and pertinent to this special issue, qualitative research.4 There are examples of AI in anatomy education that are shaping the way we teach and learn,3 and thus there is interest in how AI could also be used to support the education research process within the fields of anatomy and health professions education. There are articles debating AI ethics in qualitative research and how to engage both commercially available and proprietary AI at different points in the research process.4-7 For example, computer scientist Luca Longo4 presents a detailed illustration of the potential roles of AI in the qualitative research process, examining how AI can be integrated across research stages, including participant recruitment, methodological design, data analysis, and dissemination. In this brave new world, one could picture AI bots running an online focus group, utilizing neural networks that dynamically adjust questions to probe deeper based on participant responses. Large language models (LLMs) could then be used to develop a rigorous codebook, classifying themes and patterns with a level of consistency that human analysts might struggle to maintain.8, 9 Simultaneously, an LLM could generate a written analysis, synthesizing and visualizing findings into a polished report.10 Rigorous qualitative data analysis is known to take considerable researcher time, so there is strong interest in how AI can support, augment, and expedite this aspect of the research process.5, 6 But does the hype represented in the literature match the reality in practice?
While AI LLM technologies like GPT-4, DeepSeek, and Gemini offer promising features to enhance the efficiency and scalability of qualitative data analysis, concerns persist regarding AI's ability to interpret data in a way that is meaningful for humans, potential biases, methodological transparency (including ethical considerations for data use and storage), and propensity for output variability and hallucinations (i.e., incorrect or misleading information presented as facts).4 This means that human expertise remains crucial in qualitative data analysis, playing an essential role in ensuring contextual accuracy, addressing ethical considerations, and providing critical interpretation of AI-generated outputs. Thus, integrating AI into qualitative research should be approached as a collaborative effort, combining the strengths of AI and human judgment. Ultimately, the goal is to achieve insightful research outputs and a quality approach to the research process—known in qualitative work as rigor or trustworthiness.11, 12 However, as humans shift their roles toward assessing AI outputs rather than generating insights themselves, emerging evidence suggests that this reliance may inadvertently diminish critical thinking in various domains and could negatively impact not only research findings but also their adoption and application into practice.13 With such risk in mind, how can researchers use AI to support qualitative data analysis in a rigorous and trustworthy manner? We draw on our experiences using AI to support qualitative data analysis (Box 1), combined with the literature on rigorous approaches to qualitative data analysis, to provide recommendations for researchers interested in AI-supported qualitative data analysis. 
Our research engaged AI in the form of a reflexive expressions language parser, which identifies linguistic patterns within reflective writing based on an established conceptual framework.14 This AI reflexive expressions language parser was developed and refined by one of the co-authors (AG), specifically for the purpose of identifying reflexive expressions. The parser is not a commercially available resource (see Box 1), thus reducing some of the concerns related to rigor, ethics, and logistics that occur when using for-profit developed AI in the research process.15 Further details on the development of the language parser have been described previously.14 These reflexive expressions can be positional (e.g., how someone positions themselves within the narrative) or expressive (e.g., how someone describes themselves within the narrative). For instance, a diary entry that stated that the author ‘had done a lot’ would indicate a positional reflexive expression, and if the same diary described how the author felt while engaging in these prior events, the reflexive expression would be coded to an expressive theme. We also used epistemic network analysis (ENA), a technological adjunct to this qualitative data analysis, to explore the co-occurrence of reflexive expression codes across the dataset (https://app.epistemicnetwork.org/login.html). ENA was originally developed to model the patterns of associations among cognitive elements, such as knowledge, skills, and meaning, that characterize the thinking of individuals.16 More recently, ENA has been adopted to analyze wider scenarios, where the co-occurrence of codes is expected to capture more subtle insights compared with the single occurrence of codes in isolation.17 This is similar to how qualitative data analysis software can look for co-occurrence of codes (e.g., cross-tabulation or matrix analyses).
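As a rough illustration of what a plain cross-tabulation of code co-occurrence involves, the sketch below counts how often pairs of codes appear in the same entry. It is not ENA and not the authors' pipeline; the function name, the toy entries, and the placeholder label `code_c` are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def code_cooccurrence(coded_entries):
    """Count how often each pair of codes appears together in one entry."""
    counts = Counter()
    for codes in coded_entries:
        # A sorted set means each pair is counted once per entry,
        # regardless of order or repetition within that entry.
        for pair in combinations(sorted(set(codes)), 2):
            counts[pair] += 1
    return counts


# Hypothetical coded diary entries (labels are illustrative, not study data).
entries = [
    ["positional", "expressive"],
    ["positional", "expressive", "code_c"],
    ["expressive"],
]
cooccurrence = code_cooccurrence(entries)
```

A table like this only reports raw counts per pair of codes; it says nothing about statistical differences between groups or about network structure.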
ENA, however, can provide statistical analysis of this co-occurrence and develop network diagrams to represent these connections across codes, allowing the human researcher to develop a deeper understanding of the patterns of code co-occurrences. We employed ENA to analyze writing samples, specifically reflexive diary entries, which were coded by the reflexive expressions language parser, to facilitate the researchers' understanding of how students used reflexive expressions when reflecting on experiences of certainty and uncertainty in health professions education (Box 1).

Using this case study, this viewpoint commentary guides readers to consider the potential roles for AI in qualitative research and methods for ensuring rigor in such work. This case study illustrates how AI is challenging us to redefine and reshape what the field considers rigor for qualitative research. Approaches to ensuring rigor vary according to worldview. While many worldviews may lend themselves to AI-integrated research and ENA, this article focuses on interpretivist worldviews and “Big Q” qualitative research.18 For the purposes of this article, we will use the term ‘researchers’ to refer to humans and will identify AI through either AI in general or specific types (e.g., LLMs).

Box 1

Background

This case study is based on a secondary analysis of reflective diary entries completed by medical students (n = 41) at an Australian medical school undertaking clinical rotations in 2020. Participants were asked to reflect on their experiences of uncertainty and certainty at six timepoints across an academic year, and record these in audio, typed, or handwritten diary entries (n = 230). Primary analysis identified stimuli of uncertainty,19 factors influencing or moderating participants' experiences of uncertainty,20 and how they described responding to uncertainty.21 Based on these analyses, we identified an important role for critical reflection in developing learners' skills for managing uncertainty.20, 21 These studies did not provide clarity, however, on how best to support learners' critical reflection. For instance, authors GCS and MDL noted that reflections varied in depth, quality, and focus, and thus wanted to more effectively consider how to support learners engaging in high-quality critical reflection. This led to MDL and GCS reaching out to colleagues in other fields, including critical reflection (AG) and learning analytics (LZ & RM). In doing so, we ‘partnered’ with the AI language parser as well as the statistical approach of ENA to gain a deeper understanding of the patterns in this population's critical reflections.

Research question

What are the patterns in medical students' reflexive expressions within reflective diary entries focused on uncertainty and certainty?

Worldview

Interpretivism, where knowledge and knowing is socially co-constructed by those experiencing it and may include multiple perspectives.

Study design

The dataset comprised the uncoded diary entries (n = 230) submitted as part of the prior studies.19-21 An AI model created with machine learning techniques and specialized in detecting reflexive expressions, termed a reflexive expressions language parser, coded these reflections.14 ENA was then used to cluster coded diaries into groups according to patterns of co-occurrence of reflexive expressions. Following ENA clustering, randomly selected diaries were reviewed (n = 15/cluster) by MDL and GCS for reflexive expressions using deductive thematic analysis based on Gibson et al.,14 with cluster themes identified.

AI reflexive expressions language parser

Reflexive expressions analysis uses a theoretically grounded computational model that identifies reflexive n-grams (groups of words) associated with eight categories.14 Data for its development comprised 13,841 short written reflections on personal experience, written by undergraduate and postgraduate students. N-grams that were common in the British National Corpus (a repository of hundreds of writing samples22) were removed during the modeling process to ensure that remaining n-grams were particularly characteristic of reflection. The resulting reflexive expressions were topic independent, representing reflective aspects of the text rather than the content of the text. For example, in the text “I've been thinking about my recent prac at XX High School,” the model would represent “I've been thinking about my,” but not “recent prac” or “XX High School.” This feature differentiates reflexive expression analysis from other text analysis techniques, which tend to focus on the content of the text.

Summary of study findings

Three distinct coding patterns were identified among reflective diary clusters: “superficially-,” “partially-,” and “deeply-” reflective. Participants' clustering patterns were static throughout the study, indicating that individuals' reflective depth did not significantly change over time. Reflections focused on uncertainty had a greater frequency of expressive codes across all three clusters, suggesting that asking students to focus on uncertainty in critical reflections may support their engagement in the reflective process.

While the case study (Box 1) led to some valuable insights into critical reflection (publication in process), the study also inspired us to consider some key questions about these methods and tools in the context of qualitative rigor.
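The n-gram filtering step described for the language parser in Box 1 (removing n-grams that are common in a general reference corpus) can be sketched in a few lines. This is a minimal illustration only, not the parser itself: the tokenizer, the function names, the toy sentence, and the small `common` set standing in for a reference-corpus frequency list are all assumptions.

```python
import re


def ngrams(text, n):
    """Return the word n-grams of a text as lowercase strings."""
    words = re.findall(r"[a-z']+", text.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def characteristic_ngrams(text, n, common_ngrams):
    """Keep only n-grams that are absent from a set of common n-grams."""
    return [g for g in ngrams(text, n) if g not in common_ngrams]


# Hypothetical stand-in for n-grams that are frequent in a general corpus.
common = {"at the end", "the end of", "end of the", "of the week"}
text = "I've been thinking about my placement at the end of the week"
kept = characteristic_ngrams(text, 3, common)
```

In this toy case the generic phrases are dropped and phrases such as "i've been thinking" remain. Separating reflective wording from topic content, as the published model does, requires more than this filter.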
Typical criteria for qualitative research rigor in the interpretivist paradigm include credibility, dependability, confirmability, transferability, and reflexivity. We explore how AI may relate to qualitative research rigor by considering this case study in more detail in the following sections. Critically, all of these elements need to be considered at the study outset and throughout the research process. A summary of these considerations relevant to researchers, reviewers, and editors is provided in Table 1. Credibility considers all elements of the research process holistically and how each element contributes to plausible findings.11, 23, 24 A core concept that can help establish credibility is internal coherence, or the alignment of philosophy, methodology, and methods of research.24 Failure to consider internal coherence could result in researchers overlooking or missing details that are required to answer their research questions. The research paradigm for the larger project from which our case study is drawn (Box 1) was interpretivism, which privileges multiple perspectives and viewpoints on a topic. To explore these, we therefore utilized qualitative longitudinal methodology,25 reflective diary entries as our method, and deductive thematic analysis using reflexive expressions.18 When considering credibility in the context of AI-supported qualitative data analysis, researchers should examine how AI usage supports the internal coherence of the study. While human analysts are adept at identifying semantic content in small quantities of documents, it can be challenging to consistently identify lexical patterns over large numbers of documents. 
This is where a computational analysis can provide significant assistance.14 For our study, we chose AI that could identify different approaches to reflection and assist with identifying patterns within our large and complicated dataset.14 Through the lens of credibility, the nature of the dataset (e.g., large and complicated) could mean that AI would enhance credibility as, depending on its programming and training, it may be better equipped to manage this type of data than a human brain. However, reflexivity (see below) will still be a vital component of data analysis. Further approaches to establishing credibility that are used in specific research methodologies and methods are included in checklists for reporting qualitative research.26, 27 Such approaches include audit trails, member checking, and triangulation or crystallization. Audit trails involve researchers documenting the research process from inception to completion, including decisions made during data analysis.11 In the context of AI-supported qualitative data analysis, audit trails could include which parts of the analysis were researcher or AI led and, where relevant to the AI used, the prompts researchers used to elicit AI responses. Member checking involves discussing preliminary study findings with participants to explore how their perspectives align with those of the researchers.28 For studies using AI, member checking could involve the researchers sharing and discussing researcher-evaluated AI outputs with participants. 
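One way to make the audit-trail idea above concrete is a structured log that records, for each analytic step, whether it was researcher or AI led, the decision taken, and any prompt used. The sketch below is an assumption about how such a log might look; the field names and example entries are hypothetical and do not come from the study.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class AuditEntry:
    step: str                     # stage of the analysis
    led_by: str                   # "researcher" or "AI"
    decision: str                 # what was done or decided, and why
    prompt: Optional[str] = None  # prompt text, where the AI tool uses prompts


def to_jsonl(entries):
    """Serialize audit entries as one JSON object per line."""
    return "\n".join(json.dumps(asdict(e)) for e in entries)


# Hypothetical entries for illustration only.
trail = [
    AuditEntry("coding", "AI", "Diary entries coded by the language parser"),
    AuditEntry("review", "researcher", "Sampled diaries reviewed per cluster"),
]
log_text = to_jsonl(trail)
```

A line-per-entry format keeps the log append-only and easy to share as supplementary material.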
Triangulation is a concept drawn from positivist paradigms where multiple ‘experiments’, or data collection methods, are used collectively to support findings.28 This concept is debated in interpretivist and Big Q qualitative research, with crystallization proposed as an alternative.28 Rather than confirming findings across different forms of data, crystallization supports using multiple data collection approaches to build a richer, more nuanced understanding of the phenomenon being studied. We considered crystallization in the larger study from which this case study was drawn, wherein participants completed semi-structured interviews that explored ideas initially expressed in diary entries.19-21 In the present case study, AI contributed to crystallization as a source of data analysis, which, in with the researcher may a level of to the by not be if AI were running as it is the that crystallization of interviews were not completed for the present study, approaches that more explore preliminary findings of AI-supported may be an approach to support the credibility of research engaging AI for qualitative data analysis. 
To dependability, researchers need to provide detail in the study methods to researcher to similar research in their This details about data such as how participants were the approach to data analysis, In considering for our study (Box 1), we also which details to be provided about our use of AI, such as our in using the reflexive expressions language parser, of how it was developed and and how the language parser dataset compared with our study We that the reflexive expressions language parser was to support our research to the alignment our and specific research questions and the This reflexive expressions language parser is different from commercially available (e.g., GPT-4, DeepSeek, in that the development and of the parser are described.14 The reflexive expressions language parser model was using a process of clustering, human and machine learning (see Box for more When considering in to AI usage in qualitative data analysis, researchers should also consider and detail what the AI While researchers knowledge and based on our personal including (e.g., and contextual AI's are The knowledge that AI has and can use is based on the (e.g., and and learning the AI is through its this has compared with The reflexive expressions language parser of how it was and the data it was on (see Box 1). 
examining what the AI and in what this the research forms an important part of reflexivity (see In our case study, the of this were when the language parser reflexive diaries that included When the researchers, reviewed these however, we noted that the were in rather than by the students An was a participant described other students were of during learning Our engagement as researchers was therefore vital to ensure that AI outputs could be refined to best answer our research In to the language parser we many proprietary AI to be AI its in a way that can what the AI is how it is doing and it will In our we to use AI for a identify linguistic patterns based on an conceptual which is in previously at the of are not to identify all the elements required for dependability, with this aspect of rigor. In to the details about the AI qualitative researchers should also detail how they with the support of is the process of examining throughout the research process, the details of which are in a reflexivity within the methods of a research all elements of it is the one that is challenging to engage with and detail for & reflexivity in their article on reflexive thematic analysis in this special issue, that is important to reflexivity when researchers can develop insights into how they the research to design, data and data analysis and The term in this context is where the researcher considers their and how this to the research methods and context being studied. 
For more on in quality research, et in this special This is an when with AI, as AI may the of the In considering researchers need to consider their with the researchers similar experiences to so, the researchers may be but if the researchers may in their reflexivity that they are In the case of the of AI may the research process as the is a its of contextual Failure to consider for instance, result in researchers to AI outputs as being of examining how outputs align with different including those of interpretivist or “Big Q” qualitative research of researcher (i.e., a source of potential according to positivist and research and researcher during data is thus vital to the factors that researchers' interpretation of the In other human researchers reflect on and what positivist researchers would consider their However, researchers may not be to identify the within the that AI is on and how these data analysis at transparency about Thus, for qualitative human researchers using AI-supported qualitative data analysis, critical reflexivity and and of potential influencing AI outputs are a should include an of the researchers' reflexivity and AI to the of a a summary could include a that the researchers' role in and points at which AI was integrated into this research process, as well as the researchers' Such a would support the paradigm of or rather the where and reflexivity that the of the researchers is not by the which is a human A can help the research process, which elements of (e.g., similar to the approach of an audit However, reflexivity will need to include such as and experiences noted on the of more details on reflexivity could also be included in in to a summary within the methods of a research The concept of considers how qualitative research findings may be relevant in those In qualitative research, is by considering in relevant conceptual and that to elements of the The same could be considered when researchers AI into the research process. 
For the case study, the AI reflexive expressions language parser on for reflection developed from an of In this case study considered in the of the we chose AI that was from data and analysis related to the of the research and a that only has to the it was on and those we of also study context and potential with other and For instance, the case study included medical participants in and their experiences of to the of uncertainty in findings may be to other health professions Thus, this case study illustrates how and conceptual can be used to enhance in both the of the AI and the of the on researchers the data and study In qualitative research in this is by researchers sharing data such as to in their outputs. In some larger parts or may be (e.g., with participant as ethical as part of online data or the context of AI engagement for data analysis, may be by providing outputs such as coded In our case study, the language parser a coded including and to the coding 1), of which could form to support While of the for-profit AI hype emerging technologies as research for their and approach in the qualitative research process, this can result in the of core of qualitative research rigor if elements of rigor are not considered in the research process. qualitative researchers are to consider how of rigor need to to the engagement and of AI may be the we need to more consider the elements of rigor in qualitative research more and the critical role that humans our in the qualitative research process. project writing writing and Data writing and writing and writing and data project writing and writing The authors the of the and the and as the of the on which many of us work and and We that technological including can negatively impact and that and the technological which are in with The authors also to participants from the study from which this case study is drawn, and those with the studies that the AI language parser this study. 
We also would like to for the critical of the by as part of the the of Australian is a and of the for in the of and of and at is also the for the for in research focuses on understanding how anatomy education can the clinical anatomy of the undergraduate entry medical at is a research in the of at in the of research focus is using learning analytics to facilitate teaching and learning in such as the is an information scientist the of time, and personal is a in within the of of of research and of cognitive and involves both conceptual and computational This work such as the nature of the of and in cognitive the of and the to and and cognitive reflexivity for is an of learning analytics and AI and of the for at research learning and data to enhance learning and in domains such as and education. work computational and qualitative methods to and support and learning at is a in the for in the of and of and at clinical anatomy with a focus on developing skills through research focuses on health professions learners' development of uncertainty and the of and

Topics

Artificial Intelligence in Healthcare and Education · Ethics and Social Impacts of AI