Multi-branch Multi-Scale Attention Network for Facial Expression Recognition (FER) in-the-Wild

Please use this identifier to cite or link to this item: http://hdl.handle.net/2080/3972

Title:	Multi-branch Multi-Scale Attention Network for Facial Expression Recognition (FER) in-the-Wild
Authors:	Ghadai, Chakrapani; Patra, Dipti
Keywords:	Facial expression recognition; CNN; Multi-scale; Attention; receptive field; kernel size
Issue Date:	Jan-2023
Citation:	4th International Conference on Advances in Distributed Computing and Machine Learning (ICADCML), NIT Rourkela, Odisha, 15-16 January 2023
Abstract:	The challenges in facial expression recognition (FER) are mostly caused by high intra-class variations, subtle inter-class visual changes, and small datasets. In the real world, pose, illumination, and partial occlusion aggravate the intra-class and inter-class variations, which degrades FER performance significantly. Multi-scale and attention-based networks are widely used to address these challenges. In most previous approaches, lower-level features at smaller scales progress towards higher levels to construct features at larger scales, or convolutions at different resolutions are used for multi-scale feature representations. These methods increase depth but lack width, and they are inadequate at representing features at granular levels to precisely capture important facial expression features. Here, we introduce a novel multi-branch multi-scale attention network (MSA-Net) for FER. MSA-Net is a deeper and wider network that extracts multi-scale features at different receptive fields in a parallel network structure. Moreover, to improve the effective receptive field and extract diverse features, different kernel sizes are used in each parallel branch. Further, to focus on important regions and make the features more discriminative, the multi-scale features are passed through attention networks. MSA-Net can extract sufficiently diverse attention-enhanced multi-scale features from different parallel paths, which can lessen the effect of intra-class and inter-class variations due to external factors. Finally, features at different receptive fields from each parallel path are combined to reduce the effect of pose and partial occlusion. The experimental findings show that the proposed method achieves competitive results on widely used in-the-wild public datasets.
Description:	Copyright belongs to the proceedings publisher
URI:	http://hdl.handle.net/2080/3972
Appears in Collections:	Conference Papers

Files in This Item:

File	Description	Size	Format
2023_ICADCML_Multi-branch_CGhadai.pdf		447.98 kB	Adobe PDF
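
The abstract above describes parallel branches with different kernel sizes, attention applied to each branch, and a final combination of the branch outputs. The following is only a toy sketch of that general idea and not the authors' MSA-Net: fixed mean filters stand in for learned convolutions, a simple sigmoid gate stands in for the attention networks, and per-pixel stacking stands in for the fusion step. The function names, the kernel sizes (3, 5, 7), and all of these substitutions are assumptions made for illustration.

```python
import math


def box_filter(image, k):
    """Mean filter with an odd k x k window and zero padding.

    Assumption: a fixed mean filter replaces a learned convolution kernel.
    A larger k gives each output pixel a larger receptive field.
    """
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:  # zero padding outside
                        total += image[y][x]
            out[i][j] = total / (k * k)
    return out


def spatial_gate(feature):
    """Scale each position by a sigmoid gate centred on the map's mean.

    Assumption: this hand-written gate replaces a learned attention module;
    it simply emphasises above-average responses.
    """
    h, w = len(feature), len(feature[0])
    mean = sum(map(sum, feature)) / (h * w)
    return [[v / (1.0 + math.exp(-(v - mean))) for v in row] for row in feature]


def multi_branch_features(image, kernel_sizes=(3, 5, 7)):
    """Run one branch per kernel size, gate it, then stack branches per pixel.

    Assumption: stacking (channel concatenation) is used as the fusion step.
    """
    branches = [spatial_gate(box_filter(image, k)) for k in kernel_sizes]
    h, w = len(image), len(image[0])
    return [[[b[i][j] for b in branches] for j in range(w)] for i in range(h)]


if __name__ == "__main__":
    # 8 x 8 toy "image" with a single bright pixel.
    img = [[0.0] * 8 for _ in range(8)]
    img[3][4] = 1.0
    fused = multi_branch_features(img)
    print(len(fused), len(fused[0]), len(fused[0][0]))
```

Each output pixel carries one value per branch, so downstream layers would see the same location at several receptive-field sizes at once. In a real network the filters and the attention weights would be learned from data.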