This is an overview page with metadata for this scientific work. The full article is available from the publisher.
Machine Learning-Assisted Code Generation: Exploring The Role Of Large Language Models In Automating Software Development Tasks
0
Citations
6
Authors
2026
Year
Abstract
This paper explores the capabilities of large language models (LLMs) to automate software development tasks, combining quantitative benchmarks with human-in-the-loop evidence. Background: LLMs hold the promise of translating natural-language intent into code, but concerns remain over reliability, maintainability, and trust in real workflows. Procedure: We performed a mixed-methods evaluation of GPT-4, Code Llama, and StarCoder on MBPP (Python) and a curated C++ dataset, comparing Pass@k accuracy, execution success, and unit-test coverage, and administered a 21-item survey (n=100) on adoption, perceived productivity, trust, and experience effects. Results: GPT-4 consistently had the highest Pass@5 and execution success on both datasets; Code Llama was in the middle; StarCoder fell behind. Adoption is still patchy (51% had not used AI coding tools at all). The perceived gains in productivity were modest (median 3/5; mean 3.2/5), trust was concentrated at "Medium" (~38%), and those with 1-3 years of experience reported the largest benefits (mean approx. 3.58), indicating that LLMs are likely to help juniors with routine work but seniors with complex design work. Recommendation: LLMs work best as a supplementary facilitator within existing hybrid, test-gated workflows; training should focus on problem formulation, troubleshooting, and critiquing AI output. Limitations are the use of function-level benchmarks only, the Python/C++ scope, and self-reported survey data; future research must address industrial-scale repositories, other languages, longitudinal user studies, iterative human-in-the-loop debugging, and improved explainability and provenance controls.
Similar works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,551 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,443 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,942 citations
BioBERT: a pre-trained biomedical language representation model for biomedical text mining
2019 · 6,792 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,781 citations