Please use this identifier to cite or link to this item:
http://hdl.handle.net/2080/4632
Title: | NLP-Driven Malware Classification: A Jaccard Similarity Approach |
Authors: | Gond, Bishwajit Prasad; Shahnawaz, Md; Rajneekant; Mohapatra, Durga Prasad |
Keywords: | Malware; Malware classifier; n-grams; Jaccard similarity; Portable executable |
Issue Date: | Jun-2024 |
Citation: | IEEE International Conference on Information Technology, Electronics and Intelligent Communication Systems (ICITEICS) Karnataka, India. Jun 28-29, 2024 |
Abstract: | Malware classification is a critical task in cybersecurity, essential for identifying and mitigating threats. This paper presents an approach to malware classification using Natural Language Processing (NLP) techniques coupled with Jaccard similarity. We propose utilizing n-grams of API call sequences, comprising API names and their arguments, to represent the behaviour of malware samples. By computing the Jaccard similarity between these n-grams, we can effectively capture the similarities and differences in malware behaviour. Our experiments reveal that different n-grams exhibit varying classification abilities, with some performing better for specific types of malware. Moreover, we observe that increasing the value of n in n-grams leads to improved evaluation metrics, indicating the effectiveness of our approach. Overall, our method offers a promising approach to malware classification, leveraging NLP and Jaccard similarity to enhance accuracy and effectiveness in identifying malware variants. |
Description: | Copyright belongs to proceeding publisher |
URI: | http://hdl.handle.net/2080/4632 |
Appears in Collections: | Conference Papers |
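The abstract describes comparing malware samples through the Jaccard similarity of their API-call n-grams, where the Jaccard similarity of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below illustrates only that general idea and is not the authors' implementation: the token format (API name with its arguments folded into a single string), the example traces, and the function names `ngrams` and `jaccard` are assumptions made for illustration.

```python
from typing import Sequence, Set, Tuple


def ngrams(calls: Sequence[str], n: int) -> Set[Tuple[str, ...]]:
    """Return the set of contiguous n-grams in a sequence of API-call tokens."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # A sequence shorter than n yields an empty set.
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}


def jaccard(a: Set, b: Set) -> float:
    """Jaccard similarity |A & B| / |A | B|; taken as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical API-call traces; each token is an API name plus its arguments.
trace_a = ["CreateFileW(path)", "WriteFile(handle)",
           "CloseHandle(handle)", "RegSetValueExW(key)"]
trace_b = ["CreateFileW(path)", "WriteFile(handle)",
           "CloseHandle(handle)", "DeleteFileW(path)"]

# Compare the two traces using bigrams (n = 2).
similarity = jaccard(ngrams(trace_a, 2), ngrams(trace_b, 2))
print(similarity)  # 0.5: two shared bigrams out of four distinct bigrams
```

One possible way to turn this into a classifier, again as an assumption rather than the paper's stated procedure, is to assign an unknown sample the label of the known sample or family whose n-gram set gives the highest similarity.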
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
2024_IEEE_DPMohapatra_NLP.pdf | | 1.06 MB | Adobe PDF | |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.