OpenAlex · Updated hourly · Last updated: 30.03.2026, 13:03

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Exploring ChatGPT Efficiency in Automatic Test Generation for Python: A Comparative Analysis

2025 · 0 citations · 2 authors · Open Access

Open full text at publisher

Abstract

Context: Large language models (LLMs) like ChatGPT have gained attention in automated software testing. This study evaluates ChatGPT-3.5-turbo’s ability to generate test sets for Python programs, comparing it with Pynguin and pre-existing test sets. Problem: Automated testing remains challenging for dynamically typed languages like Python, which require tools adaptable to diverse code structures. Solution: We assessed ChatGPT-3.5-turbo’s test generation under different prompt configurations and temperature settings. Method: For 40 Python programs, we generated Pytest-compliant tests via the OpenAI API, varying the temperature from 0.0 to 1.0. Tests were validated with Pytest, and coverage and mutation scores were measured with Coverage, MutPy, and Cosmic-Ray; Pynguin-generated and pre-existing test sets served as baselines. Summary of Results: ChatGPT-3.5-turbo successfully generated valid tests for simpler programs at low cost, but averaged below 28% overall. Higher temperatures (0.5–1.0) improved results, and combining test cases from all temperatures introduced enough diversity for the LLM-generated test sets to outperform both Pynguin and the pre-existing test sets in decision coverage and mutation score.
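The temperature sweep and cross-temperature merging described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the helper names (`build_request`, `combine_test_sets`), the prompt wording, and the five sampled temperature values are assumptions; only the model name, the 0.0–1.0 range, and the idea of pooling test cases across temperatures come from the abstract.

```python
# Illustrative sketch of the study's setup: one request per temperature,
# then a merge of the resulting test cases across all temperatures.

TEMPERATURES = [0.0, 0.25, 0.5, 0.75, 1.0]  # assumed sample points in 0.0-1.0


def build_request(source_code: str, temperature: float) -> dict:
    """Assemble chat-completion parameters asking for Pytest-compliant tests.

    The prompt text here is a hypothetical stand-in for the paper's
    actual prompt configurations.
    """
    prompt = (
        "Write Pytest-compliant unit tests for the following Python code:\n"
        + source_code
    )
    return {
        "model": "gpt-3.5-turbo",
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }


def combine_test_sets(per_temperature_tests: dict) -> list:
    """Merge test cases from all temperatures, dropping exact duplicates.

    Pooling tests generated at different temperatures is what the abstract
    credits with adding diversity to the LLM-generated test sets.
    """
    combined, seen = [], set()
    for temp in sorted(per_temperature_tests):
        for test in per_temperature_tests[temp]:
            if test not in seen:
                seen.add(test)
                combined.append(test)
    return combined
```

Each dictionary returned by `build_request` could be passed to the OpenAI chat-completions endpoint; the validated responses would then be run under Pytest, with Coverage, MutPy, and Cosmic-Ray scoring the merged set.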

Similar works

Authors

Institutions

Topics

Software Testing and Debugging Techniques · Artificial Intelligence in Healthcare and Education · Machine Learning and Data Classification