This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Pathology Public Datasets for Artificial Intelligence: A Systematic Review
Citations: 0
Authors: 3
Year: 2026
Abstract
Histopathology plays a crucial role in the diagnosis of many diseases, especially cancers, for which the correct classification of tissue samples significantly influences patient treatment. The growing use of artificial intelligence (AI) in digital pathology offers opportunities to improve the speed, accuracy, and scalability of the diagnostic process. The availability of well-structured, annotated histopathology datasets is essential to this advancement. This study provides an extensive overview of publicly available datasets tailored specifically for histopathology-related AI and machine learning research. Our review yielded 151 datasets across tissue types and cancers (gastrointestinal, brain glioma, lung adenocarcinoma, and others). We also categorize the datasets by number of patients, organ, staining, magnification, scanner, size, collection method, year, and resolution. We analyze several key and popular datasets, including but not limited to CAMELYON, TUPAC, MIDOG, MoNuSeg, and BreakHis. We believe this review will help computational histopathology research by providing a comprehensive understanding of the available datasets, their structures, and their specific applications. By documenting and evaluating these resources, the review allows researchers to choose relevant datasets more effectively when creating AI models suited to particular tasks, such as cancer diagnosis, treatment response prediction, and tissue classification. Within the field, standardizing images across these datasets can facilitate collaboration among data-generating experts, pathologists, and AI model developers, and can support reproducibility, benchmark testing, and evaluation. Furthermore, this evaluation will help identify gaps in the availability of current datasets, for example datasets that combine histopathological, radiological, and genomic data. It will also help identify the need for additional, more diverse datasets that incorporate multimodal data. 
Closing these gaps will be essential to creating AI models that are applicable to different types of institutions and patient groups.