Please use this identifier to cite or link to this item: http://hdl.handle.net/2080/5378
Title: Towards Accurate In-Bed Human Pose Estimation Using Swin Transformer
Authors: Mondal, Shiladitya
Chatterjee, Saptarshi
Keywords: Human pose estimation
Swin transformer
Uni-modal
LWIR
Issue Date: Nov-2025
Citation: 4th IEEE Conference on Applied Signal Processing (ASPCON), Jadavpur University, Kolkata, 21-22 November 2025
Abstract: In-bed human pose estimation is a challenging computer vision task due to factors such as low-light environments and occlusions caused by bedding covers. Convolutional neural networks (CNNs) are commonly used for such vision tasks but struggle to capture long-range dependencies. To address this, we propose a transformer-based deep learning model that combines a pre-trained Swin Transformer with a pose estimation head. This design allows the model to integrate multi-scale features effectively and to model spatial dependencies among joints. The Simultaneously-collected multimodal Lying Pose (SLP) dataset is used for training and testing of our method. Our approach is uni-modal, relying solely on the long-wave infrared (LWIR) modality to predict 2D joint positions, without requiring additional modalities such as depth or pressure data. Experiments show that our approach surpasses most prior methods in in-bed human pose estimation accuracy, highlighting its effectiveness.
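The abstract states that the network predicts 2D joint positions from LWIR input. A common way such a pose estimation head produces coordinates is by emitting one confidence heatmap per joint and taking each map's peak; the sketch below illustrates only that generic decoding step, not the paper's actual implementation. The joint count (14, as in SLP-style annotations) and all shapes are assumptions.

```python
import numpy as np

def decode_joints(heatmaps: np.ndarray) -> np.ndarray:
    """Return a (J, 2) array of (x, y) peak locations, one per joint heatmap.

    heatmaps: (J, H, W) array of per-joint confidence maps.
    """
    num_joints, h, w = heatmaps.shape
    flat = heatmaps.reshape(num_joints, -1)
    idx = flat.argmax(axis=1)          # flat index of the max response per joint
    ys, xs = np.divmod(idx, w)         # recover (row, col) of each peak
    return np.stack([xs, ys], axis=1).astype(float)

# Toy example: 14 hypothetical joint heatmaps with a planted peak for joint 0.
rng = np.random.default_rng(0)
maps = rng.random((14, 64, 48)) * 0.1  # low background noise
maps[0, 10, 20] = 1.0                  # peak at row 10, col 20
coords = decode_joints(maps)
print(coords[0])                       # → [20. 10.]
```

Argmax decoding quantizes to the heatmap grid; published systems often refine the peak (e.g. with a sub-pixel offset), but the grid-level version above conveys the idea.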
Description: Copyright belongs to the proceeding publisher.
URI: http://hdl.handle.net/2080/5378
Appears in Collections: Conference Papers

Files in This Item:
File: 2025_ASPCON_SMondal_Towards.pdf
Size: 863.85 kB
Format: Adobe PDF

