This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Toward a Neurosymbolic Understanding of Hidden Neuron Activations
Citations: 0
Authors: 9
Year: 2026
Abstract
With the widespread adoption of deep learning techniques, the need for explainability and trustworthiness is increasingly critical, especially in safety-sensitive applications and for improved debugging, given the black-box nature of these models. The explainable AI (XAI) literature offers various helpful techniques; however, many approaches use a secondary deep learning-based model to explain the primary model's decisions or require domain expertise to interpret the explanations. A relatively new approach involves explaining models using high-level, human-understandable concepts. While these methods have proven effective, an intriguing area of exploration lies in using a white-box technique to explain the probing model. We present a novel, model-agnostic, post hoc XAI method that provides meaningful interpretations for hidden neuron activations. Our approach leverages a Wikipedia-derived concept hierarchy, encompassing approximately 2 million classes, as background knowledge and uses deductive-reasoning-based concept induction to generate explanations. Our method demonstrates competitive performance across various evaluation metrics, including statistical evaluation, concept activation analysis, and benchmarking against contemporary methods. Additionally, a specialized study with large language models (LLMs) highlights how LLMs can serve as explainers in a manner similar to our method, showing comparable performance with some trade-offs. Furthermore, we have developed a tool called ConceptLens, enabling users to test custom images and obtain explanations for model decisions. Finally, we introduce an entirely reproducible, end-to-end system that makes it straightforward to replicate our setup and results.
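The abstract's starting point, extracting hidden neuron activations so they can later be mapped to human-understandable concepts, can be illustrated with a short sketch. The code below is not the authors' implementation: it only shows a common first step, hooking a hidden layer of an off-the-shelf CNN (a torchvision ResNet-50 is an assumed stand-in, as are the layer choice and helper names) and ranking a batch of inputs by how strongly they activate a chosen neuron. The deductive concept-induction step over the Wikipedia-derived hierarchy is deliberately left out.

# A minimal sketch, not the authors' implementation: record hidden-layer
# activations with a PyTorch forward hook and rank a batch of images by how
# strongly they activate one neuron. Model, layer, and names are assumptions.
import torch
import torchvision.models as models

# weights=None keeps the sketch runnable offline; in practice a pretrained
# checkpoint (e.g. ResNet50_Weights.DEFAULT) would be loaded.
model = models.resnet50(weights=None).eval()

activations = {}

def save_activation(module, inputs, output):
    # Stash the layer output for the batch that just passed through.
    activations["hidden"] = output.detach()

# Hook the global-average-pooling layer; any hidden layer works the same way.
handle = model.avgpool.register_forward_hook(save_activation)

def top_activating(images: torch.Tensor, neuron_idx: int, k: int = 5):
    """Indices of the k inputs that most strongly activate `neuron_idx`."""
    with torch.no_grad():
        model(images)
    acts = activations["hidden"].flatten(1)   # shape: (batch, num_neurons)
    return acts[:, neuron_idx].topk(k).indices.tolist()

# Usage: a random batch stands in for preprocessed 224x224 images.
batch = torch.randn(16, 3, 224, 224)
print(top_activating(batch, neuron_idx=7))
handle.remove()

In the paper's setting, the labels of such top-activating images would then be handed to the reasoning-based concept-induction component, which generalizes them against the background concept hierarchy into a concept-level explanation for the neuron.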
Similar Works
Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization
2017 · 20,299 citations
Generative Adversarial Nets
2014 · 19,841 citations
Visualizing and Understanding Convolutional Networks
2014 · 15,236 citations
"Why Should I Trust You?": Explaining the Predictions of Any Classifier
2016 · 14,198 citations
On a Method to Measure Supervised Multiclass Model’s Interpretability: Application to Degradation Diagnosis (Short Paper)
2024 · 13,098 citations