Video Anomaly Detection Using Self-Attention-Enabled Convolutional Spatiotemporal Autoencoder

Please use this identifier to cite or link to this item: http://hdl.handle.net/2080/4104

Full metadata record

DC Field	Value	Language
dc.contributor.author	Nayak, Rashmiranjan	-
dc.contributor.author	Pati, Umesh Chandra	-
dc.contributor.author	Das, Santos Kumar	-
dc.date.accessioned	2023-11-20T05:55:54Z	-
dc.date.available	2023-11-20T05:55:54Z	-
dc.date.issued	2023-10	-
dc.identifier.citation	International Symposium on Communications and Information Technologies (ISCIT), Sydney, Australia, 16-18 October 2023	en_US
dc.identifier.uri	http://hdl.handle.net/2080/4104	-
dc.description	Copyright belongs to the proceedings publisher	en_US
dc.description.abstract	Video anomaly detection is the process of automatically detecting abnormal video patterns within an intelligent surveillance framework. It is challenging because of inherent difficulties such as the equivocal nature of anomalies, data imbalance, data scarcity, and the complexity of the entities involved in an anomaly. Hence, a self-attention-enabled convolutional spatiotemporal autoencoder is proposed to detect video anomalies efficiently. The proposed Self-Attention-enabled Convolutional Long-Short-Term-Memory Auto-Encoder (SA-ConvLSTM2DAE)-based video anomaly detector comprises three sequential stages: a spatial encoder that learns spatial (appearance) features of individual frames, a temporal encoder-decoder that learns temporal (motion) features from the encoded spatial features, and a spatial decoder that decodes the encoded spatial features to reconstruct the individual frames. The self-attention mechanism is embedded into the convolutional Long Short-Term Memory (ConvLSTM) block of the temporal encoder-decoder, yielding a Self-Attention-enabled ConvLSTM block that learns better spatiotemporal features. An efficient threshold selection criterion is implemented that picks the point on the Receiver Operating Characteristic (ROC) curve maximizing the geometric mean of sensitivity and specificity. The model is trained only on video frame sequences corresponding to normal incidents. Consequently, it reconstructs test frame sequences containing anomalies poorly, because anomalous samples are never seen during training. Hence, an anomaly is said to be detected when the anomaly score of an individual frame exceeds the selected optimum threshold.	en_US
dc.subject	Auto-encoders	en_US
dc.subject	Convolutional LSTM	en_US
dc.subject	Convolutional spatiotemporal autoencoder	en_US
dc.subject	Self-attention	en_US
dc.subject	Video anomaly detection	en_US
dc.title	Video Anomaly Detection Using Self-Attention-Enabled Convolutional Spatiotemporal Autoencoder	en_US
dc.type	Article	en_US
Appears in Collections:	Conference Papers
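
The threshold selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `gmean_threshold` and `detect` are hypothetical, the per-frame anomaly scores are assumed to be already computed (the abstract does not say how), and the ROC operating points are assumed to be enumerated by trying each observed score as a candidate threshold.

```python
import math


def gmean_threshold(scores, labels):
    """Pick the threshold that maximizes sqrt(sensitivity * specificity).

    scores: per-frame anomaly scores (assumed: higher means more anomalous)
    labels: ground-truth frame labels, 1 = anomalous, 0 = normal
    A frame is flagged when its score strictly exceeds the threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need both normal and anomalous frames")

    best_threshold, best_gmean = None, -1.0
    # Each distinct score is one candidate operating point on the ROC curve.
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= threshold)
        sensitivity = tp / positives   # true positive rate
        specificity = tn / negatives   # true negative rate
        gmean = math.sqrt(sensitivity * specificity)
        if gmean > best_gmean:
            best_threshold, best_gmean = threshold, gmean
    return best_threshold, best_gmean


def detect(scores, threshold):
    """Flag every frame whose anomaly score exceeds the threshold."""
    return [s > threshold for s in scores]


if __name__ == "__main__":
    # Toy scores and labels, invented purely for illustration.
    scores = [0.1, 0.2, 0.3, 0.8, 0.9]
    labels = [0, 0, 0, 1, 1]
    threshold, gmean = gmean_threshold(scores, labels)
    print(threshold, gmean, detect(scores, threshold))
```

The geometric mean weights sensitivity and specificity equally, which is one reason it is often preferred when anomalous frames are much rarer than normal ones.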

Files in This Item:

File	Description	Size	Format
2023_ISCIT_RNayak_Video.pdf		973.41 kB	Adobe PDF