Video anomaly detection is a challenging task because of the diversity of abnormal events. For this task, methods based on reconstruction and prediction are widely used in recent works. They are built on the assumption that, after learning only on normal data, anomalies cannot be reconstructed or predicted as well as normal patterns, and therefore yield larger errors. In this paper, we propose to discriminate anomalies from normal events by the duality of normality-granted optical flow, which is conducive to predicting normal frames but adverse to predicting abnormal ones. The normality-granted optical flow is predicted from a single frame, to keep the motion knowledge focused on normal patterns. Meanwhile, we extend the appearance-motion correspondence scheme from frame reconstruction to prediction, which not only helps to learn knowledge about object appearances and their correlated motion, but also matches the fact that motion is the transformation between appearances. We also introduce a margin loss to enhance the learning of frame prediction. Experiments on standard benchmark datasets demonstrate the impressive performance of our approach.
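The abstract does not specify the exact form of the margin loss or of the test-time anomaly score. The snippet below is a minimal sketch under assumptions: a hinge-style margin over frame prediction errors and an MSE-based prediction-error score. All names (margin_prediction_loss, pred_frame, gt_frame, neg_frame) and the margin value are illustrative assumptions, not the authors' implementation.

    # Illustrative sketch only: a margin-style objective that pushes the predicted
    # future frame closer to the ground truth than to a negative reference frame
    # (e.g., a shuffled or temporally distant frame), plus the usual prediction-error
    # score used at test time (larger error => more anomalous).
    import torch
    import torch.nn.functional as F

    def margin_prediction_loss(pred_frame, gt_frame, neg_frame, margin=0.1):
        """Hinge loss: require d(pred, gt) < d(pred, neg) by at least `margin`."""
        d_pos = F.mse_loss(pred_frame, gt_frame)
        d_neg = F.mse_loss(pred_frame, neg_frame)
        return torch.clamp(d_pos - d_neg + margin, min=0.0)

    def anomaly_score(pred_frame, gt_frame):
        """Test-time score based on prediction error."""
        return F.mse_loss(pred_frame, gt_frame).item()

Used this way, the margin term only sharpens training of the frame predictor; at test time the score still reduces to the prediction error assumed by reconstruction- and prediction-based methods.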
Video anomaly detection is commonly used in many applications such as security surveillance and is very challenging. A majority of recent video anomaly detection approaches utilize deep reconstruction models, but their performance is often suboptimal …
This survey article summarizes research trends on the topic of anomaly detection in video feeds of a single scene. We discuss the various problem formulations, publicly available datasets and evaluation criteria. We categorize and situate past research …
Recently, researchers have tried to use a few anomalies for video anomaly detection (VAD) instead of only normal data during training. A side effect of data imbalance arises when a small amount of abnormal data faces a vast amount of normal data. The latest VAD …
In this paper, we introduce a novel task, referred to as Weakly-Supervised Spatio-Temporal Anomaly Detection (WSSTAD) in surveillance video. Specifically, given an untrimmed video, WSSTAD aims to localize a spatio-temporal tube (i.e., a sequence of bounding boxes) …
Appearance and motion are two key components for depicting and characterizing video content. Currently, two-stream models have achieved state-of-the-art performance on video classification. However, extracting motion information, specifically in the …