This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Predicting ChatGPT’s Ability to Solve Complex Programming Challenges
Citations: 0
Authors: 7
Year: 2024
Abstract
The recent emergence of Large Language Model (LLM)-based tools such as OpenAI's ChatGPT and Google's Gemini has sparked excitement across the software development industry and promises to transform the software development process. Despite the enthusiasm, it remains uncertain whether these tools are already good enough at coding to replace software developers. To date, no study has examined which characteristics of a programming task affect an LLM's performance, or how to predict an LLM's performance on new programming challenges. In this work, we address these gaps by first creating a data collection framework to gather 3,323 programming tasks from Kattis, a widely used programming challenge platform. We then use OpenAI's ChatGPT to solve these programming tasks. The solutions obtained from ChatGPT are submitted back to Kattis to evaluate their correctness and effectiveness. Next, we use the collected data, including both problem and solution information, to analyze the task characteristics that significantly influence ChatGPT's performance. Building on this analysis, we develop predictive models that forecast the efficacy of ChatGPT on new programming problems. Our analysis indicates that factors such as the difficulty level of a programming challenge and the readability of a problem description can significantly affect the efficacy of ChatGPT. Finally, the experimental results show that our predictive model can correctly predict ChatGPT's performance with an accuracy of up to 90% for easy problems and up to 79% for difficult problems.
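The prediction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the abstract does not say which model family or which features were used. The sketch assumes a plain logistic regression over two hypothetical features (a difficulty rating and a readability score, both rescaled to roughly 0..1). The rows in `TOY_FEATURES` are synthetic and invented purely for illustration. They are not data from the paper or from Kattis.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(rows, labels, lr=0.5, epochs=5000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(rows[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(rows, labels):
            # Prediction error for this row: predicted probability minus label.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        # Average the gradient over the batch and take one step.
        weights = [w - lr * g / len(rows) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(rows)
    return weights, bias


def predict_proba(weights, bias, x):
    """Estimated probability that the task is solved."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Synthetic rows (assumption, not real data): (difficulty, readability_score).
TOY_FEATURES = [
    (0.10, 0.20), (0.20, 0.30), (0.15, 0.50), (0.30, 0.25),
    (0.70, 0.60), (0.80, 0.40), (0.90, 0.80), (0.75, 0.70),
]
# 1 = solution accepted, 0 = solution rejected (also synthetic).
TOY_SOLVED = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = train_logistic(TOY_FEATURES, TOY_SOLVED)
p_easy = predict_proba(weights, bias, (0.10, 0.20))
p_hard = predict_proba(weights, bias, (0.90, 0.80))
print(f"easy task: {p_easy:.2f}, hard task: {p_hard:.2f}")
```

On this toy data the fitted weight on difficulty comes out negative, so the estimated probability of success falls as difficulty rises. That matches the direction of the effect the abstract reports, but says nothing about its actual size.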
Related works
Explainable Artificial Intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI
2019 · 8,260 citations
Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead
2019 · 8,116 citations
High-performance medicine: the convergence of human and artificial intelligence
2018 · 7,493 citations
Proceedings of the 19th International Joint Conference on Artificial Intelligence
2005 · 5,776 citations
Peeking Inside the Black-Box: A Survey on Explainable Artificial Intelligence (XAI)
2018 · 5,438 citations