This is an overview page with metadata for this scientific work. The full article is available from the publisher.
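Um Novo Modelo Generativo para Descrições Textuais de Imagens Médicas Utilizando Transformadores Realçados com Redes Neurais Convolucionais [A Novel Generative Model for Textual Descriptions of Medical Images Using Transformers Enhanced with Convolutional Neural Networks]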
Citations: 0
Authors: 1
Year: 2023
Abstract
The automatic generation of descriptions for medical images has attracted growing interest in healthcare because it can assist professionals in interpreting and analyzing clinical exams. This work develops and evaluates a generalist generative model for medical image descriptions using the publicly available Radiology Objects in COntext (ROCO) dataset. A review of the state of the art identified approaches based on natural language processing and machine learning, along with several gaps in the literature: few studies examine the performance of specific models for medical description generation, the quality of generated descriptions is rarely evaluated objectively, models generalize poorly across image modalities and medical conditions, and models lack transparency and interpretability. To address these issues, the adopted methodology combines natural language processing techniques with image recognition models: relevant features are extracted from the medical images and fed into a neural-network-based generative model, with the aim of generalizing across image modalities and medical conditions. The model was trained on an annotated dataset and evaluated with accuracy and Bilingual Evaluation Understudy (BLEU) metrics. The results are promising, with an accuracy of 0.7628 and a BLEU-1 score of 0.5387. However, the quality of the generated descriptions may still be limited: they can contain semantic errors or omit specific relevant details. These limitations can be attributed to the availability and representativeness of the data in the ROCO dataset, as well as to the techniques used for description generation.
Future research should further explore the influence of different techniques and approaches, such as more advanced neural network architectures, interpretability methods for generative models, and larger and more diverse datasets. Clinical validation is also recommended, to deepen the comparative analysis between generated descriptions and those provided by experts.
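For readers unfamiliar with the BLEU-1 metric reported in the abstract, the following is a minimal sketch of a sentence-level BLEU-1 score: clipped unigram precision multiplied by a brevity penalty. It is an illustration only, not the authors' evaluation code. The abstract does not state the tokenization, whether scores were computed per sentence or over the whole corpus, or which BLEU implementation was used, so lowercasing, whitespace tokenization, and the function name `bleu1` are assumptions, and the example captions are hypothetical rather than taken from ROCO.

```python
import math
from collections import Counter


def bleu1(candidate, references):
    """Sentence-level BLEU-1 (illustrative sketch, not the paper's code).

    Assumes lowercased, whitespace-split tokens; the paper's actual
    tokenization is not given in the abstract.
    """
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand or not refs:
        return 0.0

    # For each token, the highest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in refs:
        for token, count in Counter(ref).items():
            max_ref_counts[token] = max(max_ref_counts[token], count)

    # Clip candidate counts so repeated tokens cannot inflate precision.
    clipped = sum(
        min(count, max_ref_counts[token])
        for token, count in Counter(cand).items()
    )
    precision = clipped / len(cand)

    # Brevity penalty uses the reference length closest to the candidate
    # length (ties resolved toward the shorter reference).
    ref_len = min(
        (len(r) for r in refs),
        key=lambda n: (abs(n - len(cand)), n),
    )
    if len(cand) > ref_len:
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - ref_len / len(cand))

    return brevity_penalty * precision


if __name__ == "__main__":
    # Hypothetical captions, for illustration only.
    generated = "chest x-ray showing right pleural effusion"
    expert = ["chest x-ray showing a large right pleural effusion"]
    print(round(bleu1(generated, expert), 4))
```

In this hypothetical example every generated token appears in the expert caption, so the unigram precision is 1 and the score is lowered only by the brevity penalty for the shorter output. This also illustrates the limitation noted in the abstract: a caption can score well on unigram overlap while still omitting clinically relevant details.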