Please use this identifier to cite or link to this item:
http://hdl.handle.net/2080/3972
Title: | Multi-branch Multi-Scale Attention Network for Facial Expression Recognition (FER) in-the-Wild |
Authors: | Ghadai, Chakrapani; Patra, Dipti |
Keywords: | Facial expression recognition CNN Multi-scale Attention receptive field kernel size |
Issue Date: | Jan-2023 |
Citation: | 4th International Conference on Advances in Distributed Computing and Machine Learning (ICADCML), NIT Rourkela, Odisha, 15-16 January 2023 |
Abstract: | The challenges in facial expression recognition (FER) are mostly caused by high intra-class variation, subtle inter-class visual differences, and small datasets. In real-world settings, pose, illumination, and partial occlusion aggravate these intra-class and inter-class variations and significantly degrade FER performance. Multi-scale and attention-based networks are widely used to address these challenges. In most previous approaches, either lower-level features at smaller scales are propagated to higher levels to construct features at larger scales, or convolutions at different resolutions are used for multi-scale feature representation. These methods increase depth but lack width, and they are inadequate for representing features at the granular level needed to precisely capture important facial expression features. Here, we introduce a novel multi-branch multi-scale attention network (MSA-Net) for FER. MSA-Net is a deeper and wider network that extracts multi-scale features at different receptive fields in a parallel network structure. Moreover, to enlarge the effective receptive field and extract diverse features, different kernel sizes are used in each parallel branch. Further, the multi-scale features are passed through attention networks to focus on important regions and make the features more discriminative. MSA-Net can thus extract sufficiently diverse, attention-enhanced multi-scale features from the different parallel paths, which can lessen the effect of intra-class and inter-class variations due to external factors. Finally, features at different receptive fields from each parallel path are combined to reduce the effect of pose and partial occlusion. Experimental results show that the proposed method achieves competitive results on widely used in-the-wild public datasets. |
Description: | Copyright belongs to the proceedings publisher |
URI: | http://hdl.handle.net/2080/3972 |
Appears in Collections: | Conference Papers |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
2023_ICADCML_Multi-branch_CGhadai.pdf | | 447.98 kB | Adobe PDF | View/Open |