Please use this identifier to cite or link to this item:
http://hdl.handle.net/2080/5150
Title: | Augmentation of Topic Modeling: Comparing The Traditional and Their Word Embedding Oriented Approaches |
Authors: | Das, Gobind Kumar; Bhattacharjee, Panthadeep |
Keywords: | Topic Modeling; Latent Dirichlet Allocation; Non-negative Matrix Factorization; Word2Vec embedding |
Issue Date: | Feb-2025 |
Citation: | 3rd International Conference on Intelligent Systems, Advanced Computing, and Communication (ISACC), Assam University, Silchar, 27-28 February 2025 |
Abstract: | This paper compares the efficacy of two classical topic modeling approaches, Latent Dirichlet Allocation (LDA) and Non-Negative Matrix Factorization (NMF), with their respective word-embedding-enhanced variants. LDA and NMF are based on word frequency and co-occurrence patterns but ignore the semantic similarity between words. To address this limitation, the proposed methods enhance LDA and NMF with Word2Vec embeddings, which capture semantic relationships between words, and observe the resulting improvement in topic coherence. The study thus examines the comparative strengths of these approaches and shows how word embeddings can improve overall topic modeling. Experimental results show that the Word2Vec-enhanced LDA and NMF achieve higher coherence scores than their traditional counterparts. |
Description: | Copyright belongs to the proceeding publisher. |
URI: | http://hdl.handle.net/2080/5150 |
Appears in Collections: | Conference Papers |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
2025_ISACC_GKDas_TopicModeling.pdf | | 756.42 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.