This is an overview page with metadata for this scientific paper. The full article is available from the publisher.
Latent Dirichlet allocation
Citations: 26,930
Authors: 3
Year: 2003
Abstract
We describe latent Dirichlet allocation (LDA), a generative probabilistic model for collections of discrete data such as text corpora. LDA is a three-level hierarchical Bayesian model, in which each item of a collection is modeled as a finite mixture over an underlying set of topics. Each topic is, in turn, modeled as an infinite mixture over an underlying set of topic probabilities. In the context of text modeling, the topic probabilities provide an explicit representation of a document. We present efficient approximate inference techniques based on variational methods and an EM algorithm for empirical Bayes parameter estimation. We report results in document modeling, text classification, and collaborative filtering, comparing to a mixture of unigrams model and the probabilistic LSI model.
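The generative view described in the abstract can be illustrated with a small sampling sketch. This is not the authors' code: the function names `sample_dirichlet` and `generate_corpus`, the hyperparameter values, the fixed document length, and the choice to draw each topic's word distribution from a Dirichlet prior are all assumptions made for illustration. It covers only the sampling direction (topics to words), not the variational inference or EM estimation mentioned in the abstract.

```python
import random


def sample_dirichlet(rng, alphas):
    """Draw one sample from a Dirichlet by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(n_docs, doc_len, n_topics, vocab_size,
                    alpha=0.5, eta=0.5, seed=0):
    """Sample a toy corpus from an LDA-style generative process (hypothetical helper)."""
    rng = random.Random(seed)
    # Assumption: each topic's word distribution is itself drawn from a Dirichlet.
    topics = [sample_dirichlet(rng, [eta] * vocab_size) for _ in range(n_topics)]
    docs, thetas = [], []
    for _ in range(n_docs):
        # Per-document topic proportions: a finite mixture over the topics.
        theta = sample_dirichlet(rng, [alpha] * n_topics)
        doc = []
        for _ in range(doc_len):
            # Pick a topic for this word position, then a word id from that topic.
            z = rng.choices(range(n_topics), weights=theta)[0]
            w = rng.choices(range(vocab_size), weights=topics[z])[0]
            doc.append(w)
        docs.append(doc)
        thetas.append(theta)
    return docs, thetas, topics


docs, thetas, topics = generate_corpus(n_docs=3, doc_len=20, n_topics=4, vocab_size=30)
```

Here each entry of `thetas` plays the role of the per-document topic probabilities that the abstract describes as an explicit representation of a document, and each document is a list of integer word ids.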
Similar works
MizAR 60 for Mizar 50
2023 · 73,962 citations
AI-Assisted Pipeline for Dynamic Generation of Trustworthy Health Supplement Content at Scale
2018 · 45,360 citations
GloVe: Global Vectors for Word Representation
2014 · 33,283 citations
2019 · 31,360 citations
Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation
2014 · 23,810 citations