This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Evaluation of an Explainable LLM-Powered Chat-Ops for CI/CD Pipeline Diagnostics and Developer Support
Citations: 0
Authors: 2
Year: 2026
Abstract
Modern software delivery relies on Continuous Integration and Continuous Deployment (CI/CD) pipelines, yet diagnosing failed jobs still demands intensive manual inspection of logs, configuration files, and environment settings. This study presents an explainable LLM-powered ChatOps diagnostic assistant that integrates Google Gemini with Discord and GitHub Actions to automatically detect, analyze, and explain CI/CD pipeline failures. The prototype uses structured prompt engineering to generate human-readable diagnostic reports that separate Root Cause, Solution Steps, and Preventive Recommendations, while an explainability layer highlights supporting log evidence and adapts explanations to developer roles. Evaluation combines expert-based assessment with semantic benchmarking using Sentence-BERT across five representative CI/CD failure scenarios. Six domain experts reported high Readability (4.87) and Usefulness (4.83) on a five-point Likert scale, indicating that the explanations were clear, coherent, and actionable. SBERT-based comparison with ground-truth solutions yielded an average F1 score of 0.57 and a cosine similarity of 0.79, demonstrating moderate semantic alignment. These triangulated findings show that the assistant can provide accurate, comprehensible guidance for CI/CD failure diagnostics, while also revealing the need for better grounding to improve precision and contextual specificity. The work contributes a validated framework for integrating explainable AI into DevOps ChatOps workflows and establishes an empirical basis for LLM-driven fault diagnosis in CI/CD environments.
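To make the semantic benchmarking step concrete, the following is a minimal sketch of how an SBERT-based comparison between a generated diagnostic report and a ground-truth solution can be computed with the sentence-transformers library. The model checkpoint (all-MiniLM-L6-v2) and the example texts are illustrative assumptions; the abstract does not specify which SBERT checkpoint or ground-truth phrasing the study used.

```python
# Sketch of SBERT-based scoring: embed a generated diagnostic report and a
# ground-truth solution, then measure their semantic alignment via cosine
# similarity (the study reports an average of 0.79 across five scenarios).
from sentence_transformers import SentenceTransformer, util

# Checkpoint is an assumption; any SBERT model could stand in here.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Hypothetical example texts, not taken from the paper.
generated = (
    "Root Cause: the build job failed because the Node.js version in the "
    "workflow does not match the engine required by package.json."
)
ground_truth = (
    "Pin the Node.js version in the GitHub Actions workflow to the version "
    "declared in package.json."
)

# Encode both texts as dense vectors and compute cosine similarity in [-1, 1].
emb_gen = model.encode(generated, convert_to_tensor=True)
emb_ref = model.encode(ground_truth, convert_to_tensor=True)
cosine = util.cos_sim(emb_gen, emb_ref).item()
print(f"cosine similarity: {cosine:.2f}")
```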
Related Works
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,777 citations
Generative Adversarial Nets
2023 · 19,896 citations
Visualizing and Understanding Convolutional Networks
2014 · 15,328 citations
"Why Should I Trust You?"
2016 · 14,594 citations
Generative adversarial networks
2020 · 13,208 citations