OpenAlex · Updated hourly · Last updated: 14.03.2026, 14:33

This is an overview page with metadata for this scholarly work. The full article is available from the publisher.

Enriching Conversations: Empowering ChatGPT with Image Caption Generation

2024 · 3 citations · 5 authors

Abstract

Image captioning stands as a pivotal technique for providing contextual descriptions of visual content, promising substantial enhancement in the capabilities of conversational AI systems. This work delves into the integration of image captioning methodologies into ChatGPT, aiming to fortify its capacity in understanding and responding to visual information. The study extensively explores the application of deep learning models, encompassing ResNet50, LSTM, DenseNet121, MobileNet, and MobileNetV2, in the domain of image captioning. Specifically, a comprehensive investigation is conducted into a Recurrent Neural Network employing LSTM as a decoder and a Convolutional Neural Network utilizing ResNet as an encoder. This fusion harnesses vocabulary and image features to craft precise and meaningful descriptions of visual content. Furthermore, this study pioneers an approach to identify and relate at least two salient features within any given image, forming a coherent caption that binds the relationship between these identified features. This novel capability not only refines image captioning techniques but also empowers ChatGPT to comprehend complex visual contexts within conversational settings. The outcomes of this work offer profound insights into augmenting AI capabilities, facilitating a deeper understanding and more effective interaction with visual information across various domains, thereby advancing the field of conversational AI integration with visual context.
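The encoder-decoder arrangement named in the abstract (a CNN that turns an image into a feature vector, and an LSTM that turns that vector into words) can be illustrated with a minimal sketch. This is not the authors' implementation, which this page does not describe. Everything below is an assumption made for illustration: the toy vocabulary, the dimensions, the random untrained weights, the omission of bias terms, the use of the image features to initialise the hidden state, and greedy decoding. The CNN encoder is replaced by a placeholder feature vector so that the example runs with the Python standard library alone.

```python
import math
import random

# Hypothetical toy vocabulary and sizes, chosen only to keep the sketch small.
VOCAB = ["<start>", "<end>", "a", "dog", "chases", "ball"]
FEATURE_DIM = 8   # stand-in for the CNN feature size
HIDDEN_DIM = 6
EMBED_DIM = 4


def rand_matrix(rows, cols, rng):
    """Random weights; a real model would learn these from captioned images."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class ToyCaptionDecoder:
    """LSTM decoder whose initial hidden state is computed from image features."""

    def __init__(self, seed=0):
        rng = random.Random(seed)
        self.embed = rand_matrix(len(VOCAB), EMBED_DIM, rng)
        self.init_h = rand_matrix(HIDDEN_DIM, FEATURE_DIM, rng)
        # One weight matrix per gate, applied to [word embedding ; previous hidden state].
        # Bias terms are omitted for brevity.
        self.gates = {
            name: rand_matrix(HIDDEN_DIM, EMBED_DIM + HIDDEN_DIM, rng)
            for name in "ifog"
        }
        self.out = rand_matrix(len(VOCAB), HIDDEN_DIM, rng)

    def step(self, token_id, h, c):
        """One LSTM step: returns vocabulary logits and the updated (h, c)."""
        x = self.embed[token_id] + h  # list concatenation
        i = [sigmoid(z) for z in matvec(self.gates["i"], x)]    # input gate
        f = [sigmoid(z) for z in matvec(self.gates["f"], x)]    # forget gate
        o = [sigmoid(z) for z in matvec(self.gates["o"], x)]    # output gate
        g = [math.tanh(z) for z in matvec(self.gates["g"], x)]  # candidate cell state
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return matvec(self.out, h), h, c

    def greedy_caption(self, features, max_len=10):
        """Pick the highest-scoring word at each step until <end> or max_len."""
        h = [math.tanh(z) for z in matvec(self.init_h, features)]
        c = [0.0] * HIDDEN_DIM
        token = VOCAB.index("<start>")
        words = []
        for _ in range(max_len):
            logits, h, c = self.step(token, h, c)
            token = max(range(len(VOCAB)), key=logits.__getitem__)
            if VOCAB[token] == "<end>":
                break
            words.append(VOCAB[token])
        return words


# Placeholder for the encoder output; a real system would take this from a CNN.
features = [0.1 * k for k in range(FEATURE_DIM)]
caption = ToyCaptionDecoder(seed=0).greedy_caption(features)
print(" ".join(caption))
```

Because the weights are random, the printed word sequence is meaningless. The sketch shows only the data flow: image features condition the decoder state, and each generated word is fed back in as the next input.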

Topics

Multimodal Machine Learning Applications · Artificial Intelligence in Healthcare and Education · COVID-19 diagnosis using AI