This is an overview page with metadata for this scholarly article. The full article is available from the publisher.
Addressing Generative AI Detection in Doctoral Nursing Education
Citations: 0
Authors: 7
Year: 2025
Abstract
Integrating generative artificial intelligence (AI) tools such as ChatGPT presents new challenges to academic integrity in doctor of nursing practice (DNP) education. The American Association of Colleges of Nursing (AACN) 2021 Essentials call for developing practice-ready nurses through scholarship and competency-based education.1 Meanwhile, the emergence of tools such as ChatGPT, launched by OpenAI in 2022, introduces new challenges for nurse educators.2-4 These generative AI technologies can produce human-like written content, potentially undermining traditional methods of assessing DNP student competency through written assignments. Nurse educators must find equitable ways to evaluate student work while addressing unethical generative AI use.

Problem

DNP programs traditionally rely on written work to demonstrate critical thinking, scholarship, and synthesis. While AI can support student grammar and productivity, it does not substitute for the original thought, synthesis, and personal insight that doctoral-level writing demands. With generative AI becoming increasingly accessible, its impact on academic writing in nursing education cannot be ignored. DNP programs must emphasize ethical AI use and implement fair, consistent methods to evaluate suspected misuse. Various AI detection software platforms have emerged in recent years but are known to have high false-positive rates.2 Over-reliance on flawed detection tools can result in unjust penalties for students and erode trust between students and faculty.2 Nurse educators are called to adapt their pedagogical practices, supporting innovation while maintaining academic integrity in the face of evolving AI technology.4

Purpose

This project aimed to develop a standardized method for identifying and evaluating potential AI-generated content in DNP student writing assignments. A tool was designed to support instructor evaluations by promoting fairness and consistency.
The survey aimed to identify objective indicators of potential generative AI use in student work and to reduce individual instructor bias by applying the same criteria across all cases. A student interview template was developed to allow students to explain their approach, clarify misunderstandings, and present evidence of their original work.

Approach

The School of Nursing used current literature and ChatGPT to design a survey and a follow-up interview tool for addressing student assignments flagged as having a >25% AI match by generative AI detection software.2 Themes from the literature and insights from ChatGPT were reviewed and organized.3 The initial survey included 6 questions targeting common signs of generative AI use in writing:

1. A score exceeding 25% from AI detection software
2. The use of generic language or a lack of original and creative ideas
3. An absence of domain-specific knowledge or contextual insights, such as professional experience
4. Citations that reference non-existent literature or contain formatting errors
5. Noticeable formatting quirks or stylistic markers, such as boxes around text or inconsistent fonts, that suggest the use of copy-paste functions
6. Clear inconsistencies when compared with the student's previous work

If at least 2 of the 6 findings were present on a submitted written assignment, faculty were advised to proceed with the structured interview with the student. The interview addressed whether an approved editing service was used and facilitated various student requests, including drafts/phases of their work and copies of the articles on the reference list. Lastly, faculty were prompted to ask the student whether generative AI was used for the assignment. Both tools were reviewed and approved by faculty before the project pilot. The pilot started in Fall 2024 and continued through the Spring 2025 trimester.
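As a minimal illustration (not the project's actual instrument), the 2-of-6 screening rule described above can be expressed as a simple scoring function. The item keys and function name are hypothetical; the paper specifies only that raters answer yes/no/not applicable and that 2 or more "yes" responses trigger the structured interview.

```python
# Hypothetical sketch of the 2-of-6 screening rule; item names are
# illustrative stand-ins for the survey's 6 questions, not the actual wording.

CRITERIA = [
    "ai_detection_score_over_25_percent",
    "generic_language_or_lack_of_original_ideas",
    "missing_domain_specific_or_contextual_insight",
    "nonexistent_or_malformed_citations",
    "copy_paste_formatting_quirks",
    "inconsistent_with_prior_work",
]

def needs_interview(responses: dict[str, str]) -> bool:
    """Return True when at least 2 of the 6 items are marked 'yes'.

    `responses` maps each criterion to 'yes', 'no', or 'na';
    missing items are treated as 'na'.
    """
    yes_count = sum(1 for c in CRITERIA if responses.get(c, "na") == "yes")
    return yes_count >= 2
```

In the pilot, this threshold was applied per rater; an assignment with 2 or more "yes" responses was referred to the standardized interview.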
When a primary teaching faculty member received a student assignment with >25% AI detection, they completed the survey. They then forwarded the assignment to up to 5 DNP faculty members, who reviewed it and completed the survey. Faculty raters responded yes, no, or not applicable to each item. If 2 or more yes responses were recorded, the student was referred to the standardized interview process conducted by the primary teaching faculty. The interview responses were not included in the data collection for this project; however, faculty perceptions of the interview process were noted through informal conversations. Because this was a pilot project, the survey and interview findings did not affect students' grades. All student work was evaluated according to existing grading policies for the course. Institutional Review Board review was deferred because the University classified this project as a quality improvement project. Ethical standards were maintained, and a codebook was created to de-identify students.

Data Analysis

Fourteen assignments across 3 DNP courses were selected for inclusion. Multiple faculty raters (2-5) completed the survey for each flagged assignment; assignments with only 1 rater were excluded from the reliability analysis. The dataset included 62 responses. Question 6 (Does the content align with the student's prior work?) was marked N/A in 79% of responses and was therefore excluded from reliability testing. The interview threshold was met consistently across raters in 9 of the 14 assignments. Variability was observed in the remaining 5 cases, underscoring the challenges of subjective interpretation and emphasizing the need for more precise and objective criteria. A statistician employed by the University supported the project's data analysis. As part of the evaluation, 2 statistical tests were conducted to assess the reliability of the survey results.
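For readers unfamiliar with the two reliability statistics used here, a minimal sketch of how each is computed follows. This is standard textbook math, not the project's analysis code, and the example data are invented for illustration only.

```python
# Sketch of the two reliability statistics used in the analysis.
# Formulas are the standard definitions; the data passed in are illustrative.
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for internal consistency.

    `scores` is a list of respondents, each a list of numeric item scores.
    Assumes at least 2 items and that total scores vary across respondents.
    """
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

def cohens_kappa(rater1, rater2):
    """Cohen's kappa for agreement between two raters.

    `rater1`/`rater2` are equal-length lists of categorical ratings.
    Assumes the raters do not both use a single identical category
    throughout (which would make expected agreement 1).
    """
    n = len(rater1)
    categories = sorted(set(rater1) | set(rater2))
    observed = sum(a == b for a, b in zip(rater1, rater2)) / n
    expected = sum(
        (rater1.count(c) / n) * (rater2.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)
```

Values of alpha and kappa at or above roughly 0.70 are conventionally read as acceptable consistency and substantial agreement, which is the benchmark the project applied.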
Cronbach’s alpha was calculated to determine how well the individual survey items measured the same underlying concept. While values above 0.70 would indicate acceptable reliability, this project yielded a value of 0.59, indicating that the survey did not demonstrate sufficient internal consistency. To determine how consistently different evaluators rated the same student work, Cohen’s kappa was used to assess inter-rater agreement. Across the assignments, Cohen’s kappa values ranged from a perfect agreement score of 1.0 to a negative value of −0.09, which reflects poor agreement. Notably, 2 assignments achieved a kappa equal to or greater than 0.70, which is typically interpreted as substantial agreement between raters for those specific assignments.

Discussion

While psychometric standards were not met, the model showed potential to reduce bias and foster transparency. Combined with ethical guidance, this approach supports innovation while preserving academic integrity, offering a faculty-driven framework for navigating potential generative AI use in academic writing. The findings suggest that while the faculty-developed survey provides a structured alternative to AI detection software alone, it is not yet reliable enough to determine AI use conclusively. Survey responses were based, in part, on perception and experience, introducing variability in judgment. Nonetheless, the approach offers a promising model for standardized review that could minimize faculty bias and protect student rights. Although not formally analyzed, the interview process proved valuable in fostering student-faculty dialogue. Faculty noted increased student accountability when students were asked to explain their writing process and provide supporting materials. This structured follow-up encourages transparency, promotes academic integrity, and creates a collaborative space for learning and reflection. AI detection tools lack transparency and yield false positives.
Until more accurate systems are developed, schools of nursing must craft balanced, human-centered approaches for assessing potential generative AI content.

Conclusion

Integrating generative AI into higher education is not optional; it is already here. DNP programs must now emphasize ethical AI use and redesign curricula with assessment measures beyond narrative written assignments that allow students to demonstrate learning clearly. The partnership between faculty and students requires the development of fair methods to assess suspected AI misuse. While this pilot faculty review tool did not meet psychometric standards for internal consistency, it represents a step forward in faculty-led innovation for evaluating generative AI use in academic writing. More data are needed to refine the AI detection survey, improve internal consistency and inter-rater agreement, and validate its use in academic integrity cases. In the interim, a standardized process combining detection, review, and interview promotes fairness, student engagement, and alignment with the AACN 2021 Essentials.
Similar Works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations