Please use this identifier to cite or link to this item: http://hdl.handle.net/2080/5150
Title: Augmentation of Topic Modeling: Comparing The Traditional and Their Word Embedding Oriented Approaches
Authors: Das, Gobind Kumar; Bhattacharjee, Panthadeep
Keywords: Topic Modeling; Latent Dirichlet Allocation; Non-negative Matrix Factorization; Word2Vec embedding
Issue Date: Feb-2025
Citation: 3rd International Conference on Intelligent Systems, Advanced Computing, and Communication (ISACC), Assam University, Silchar, 27-28 February 2025
Abstract: This paper compares two classical topic modeling approaches, Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), with their word-embedding-enhanced counterparts. LDA and NMF rely on word frequency and co-occurrence patterns but ignore the semantic similarity between words. To address this limitation, the proposed methods augment LDA and NMF with Word2Vec embeddings to capture semantic relationships between words, and the resulting change in topic coherence is observed. The study thus examines the comparative strengths of these approaches and shows how word embeddings can improve topic modeling overall. The experimental results show that Word2Vec-enhanced LDA and NMF achieve higher coherence scores than their traditional counterparts.
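The abstract does not state which coherence metric the authors use, nor how Word2Vec is combined with LDA and NMF. As a rough illustration only, the sketch below scores a topic's top words by their mean pairwise cosine similarity in an embedding space, which is one common embedding-based notion of coherence. The function names and the toy three-dimensional vectors are hypothetical and are not taken from the paper; in practice the vectors would come from a trained Word2Vec model.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def embedding_coherence(top_words, vectors):
    """Mean pairwise cosine similarity of a topic's top words.

    Words without a vector are skipped; fewer than two known words gives 0.0.
    This is an assumed, illustrative metric, not necessarily the paper's.
    """
    known = [w for w in top_words if w in vectors]
    pairs = list(combinations(known, 2))
    if not pairs:
        return 0.0
    return sum(cosine(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


# Hypothetical toy embeddings standing in for trained Word2Vec vectors.
TOY_VECTORS = {
    "doctor": [0.9, 0.1, 0.0],
    "nurse": [0.8, 0.2, 0.1],
    "hospital": [0.85, 0.15, 0.05],
    "guitar": [0.0, 0.1, 0.9],
}

coherent_topic = ["doctor", "nurse", "hospital"]
mixed_topic = ["doctor", "nurse", "guitar"]

print(embedding_coherence(coherent_topic, TOY_VECTORS))
print(embedding_coherence(mixed_topic, TOY_VECTORS))
```

A topic whose top words are semantically close scores higher than one that mixes unrelated words, even when a frequency-based model ranks the words equally; this is the kind of signal that word counts alone do not capture.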
Description: Copyright belongs to the proceedings publisher.
URI: http://hdl.handle.net/2080/5150
Appears in Collections: Conference Papers

Files in This Item:
File: 2025_ISACC_GKDas_TopicModeling.pdf
Size: 756.42 kB
Format: Adobe PDF

