Please use this identifier to cite or link to this item: http://hdl.handle.net/2080/5378
Title: Towards Accurate In-Bed Human Pose Estimation Using Swin Transformer
Authors: Mondal, Shiladitya
Chatterjee, Saptarshi
Keywords: Human pose estimation
Swin transformer
Uni-modal
LWIR
Issue Date: Nov-2025
Citation: 4th IEEE Conference on Applied Signal Processing (ASPCON), Jadavpur University, Kolkata, 21-22 November 2025
Abstract: In-bed human pose estimation is a challenging computer vision task due to factors such as low-light environments and occlusions caused by bedding covers. Convolutional neural networks (CNNs) are commonly used for such vision tasks but struggle to capture long-range dependencies. To address this, we propose a transformer-based deep learning model that combines a pre-trained Swin Transformer with a pose estimation head. This design allows the model to integrate multi-scale features effectively and to model spatial dependencies among joints. The Simultaneously-collected multimodal Lying Pose (SLP) dataset is used for training and testing of our method. Our approach is uni-modal, relying solely on the long-wave infrared (LWIR) modality to predict 2D joint positions, without requiring additional modalities such as depth or pressure data. Experiments show that our approach surpasses most prior methods in in-bed human pose estimation accuracy, highlighting its effectiveness.
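The abstract states that the network predicts 2D joint positions from LWIR input. A common way such a pose estimation head produces coordinates is by emitting one confidence heatmap per joint and taking each map's peak; the sketch below illustrates only that generic decoding step, not the paper's actual implementation. The joint count (14, as in SLP-style annotations) and all shapes are assumptions.

```python
import numpy as np

def decode_joints(heatmaps: np.ndarray) -> np.ndarray:
    """Return a (J, 2) array of (x, y) peak locations, one per joint heatmap.

    heatmaps: (J, H, W) array of per-joint confidence maps.
    """
    num_joints, h, w = heatmaps.shape
    flat = heatmaps.reshape(num_joints, -1)
    idx = flat.argmax(axis=1)          # flat index of the max response per joint
    ys, xs = np.divmod(idx, w)         # recover (row, col) of each peak
    return np.stack([xs, ys], axis=1).astype(float)

# Toy example: 14 hypothetical joint heatmaps with a planted peak for joint 0.
rng = np.random.default_rng(0)
maps = rng.random((14, 64, 48)) * 0.1  # low background noise
maps[0, 10, 20] = 1.0                  # peak at row 10, col 20
coords = decode_joints(maps)
print(coords[0])                       # → [20. 10.]
```

Argmax decoding quantizes to the heatmap grid; published systems often refine the peak (e.g. with a sub-pixel offset), but the grid-level version above conveys the idea.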
Description: Copyright belongs to the proceeding publisher.
URI: http://hdl.handle.net/2080/5378
Appears in Collections: Conference Papers

Files in This Item:
File: 2025_ASPCON_SMondal_Towards.pdf
Size: 863.85 kB
Format: Adobe PDF

