Please use this identifier to cite or link to this item:
http://hdl.handle.net/2080/5378

| Title: | Towards Accurate In-Bed Human Pose Estimation Using Swin Transformer |
| Authors: | Mondal, Shiladitya; Chatterjee, Saptarshi |
| Keywords: | Human pose estimation; Swin transformer; Uni-modal; LWIR |
| Issue Date: | Nov-2025 |
| Citation: | 4th IEEE Conference on Applied Signal Processing (ASPCON), Jadavpur University, Kolkata, 21-22 November 2025 |
| Abstract: | In-bed human pose estimation is a challenging computer-vision task due to factors such as low-light environments and occlusions caused by covers. Convolutional neural networks (CNNs) are commonly used in such vision tasks but struggle to capture long-range dependencies. To address this, we propose a transformer-based deep learning model that combines a pre-trained Swin Transformer with a pose estimation head. This design allows the model to integrate multi-scale features effectively and to model spatial dependencies among joints. The Simultaneously-collected multimodal Lying Pose (SLP) dataset is used for training and testing. Our approach is uni-modal, relying solely on the long-wave infrared (LWIR) modality to predict 2D joint positions, without the need for additional modalities such as depth or pressure data. Experiments show that our approach surpasses most prior methods in in-bed human pose estimation accuracy, highlighting its effectiveness. |
| Description: | Copyright belongs to the proceeding publisher. |
| URI: | http://hdl.handle.net/2080/5378 |
| Appears in Collections: | Conference Papers |
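The abstract describes predicting 2D joint positions from LWIR images via a pose-estimation head. A common final step in such heads is decoding joint coordinates from per-joint confidence heatmaps; the sketch below illustrates that decoding step only. It is not the authors' code, and the 14-joint count (the SLP annotation format) and 64×64 heatmap size are assumptions for illustration.

```python
# Hedged sketch (not the paper's implementation): decoding 2D joint
# coordinates from per-joint heatmaps via a per-channel argmax, as is
# typical for heatmap-based pose-estimation heads.
import numpy as np

def decode_joints(heatmaps):
    """heatmaps: (J, H, W) array, one confidence map per joint.
    Returns a (J, 2) array of (x, y) pixel coordinates."""
    j, h, w = heatmaps.shape
    flat = heatmaps.reshape(j, -1).argmax(axis=1)   # peak index per joint
    ys, xs = np.unravel_index(flat, (h, w))          # back to 2D indices
    return np.stack([xs, ys], axis=1)

# Toy example: 14 joints (SLP-style) on an assumed 64x64 heatmap grid.
hm = np.zeros((14, 64, 64))
for k in range(14):
    hm[k, k + 10, k + 20] = 1.0  # plant one synthetic peak per joint
coords = decode_joints(hm)
print(coords[0])  # → [20 10]
```

In practice the heatmaps would come from the network's output layer, and sub-pixel refinement (e.g. a quarter-pixel offset toward the second-highest neighbor) is often applied after the argmax; the sketch omits that for brevity.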
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| 2025_ASPCON_SMondal_Towards.pdf | | 863.85 kB | Adobe PDF |
