ﻻ يوجد ملخص باللغة العربية
Anomaly detection at scale is an extremely challenging problem of great practicality. When data is large and high-dimensional, it can be difficult to detect which observations do not fit the expected behaviour. Recent work has coalesced on variations of (random) $k$emph{d-trees} to summarise data for anomaly detection. However, these methods rely on ad-hoc score functions that are not easy to interpret, making it difficult to asses the severity of the detected anomalies or select a reasonable threshold in the absence of labelled anomalies. To solve these issues, we contextualise these methods in a probabilistic framework which we call the Mondrian Polya{} Forest for estimating the underlying probability density function generating the data and enabling greater interpretability than prior work. In addition, we develop a memory efficient variant able to operate in the modern streaming environments. Our experiments show that these methods achieves state-of-the-art performance while providing statistically interpretable anomaly scores.
Classification algorithms have been widely adopted to detect anomalies for various systems, e.g., IoT, cloud and face recognition, under the common assumption that the data source is clean, i.e., features and labels are correctly set. However, data c
Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? Existing approaches aim to detect individually surprisin
Given a stream of entries in a multi-aspect data setting i.e., entries having multiple dimensions, how can we detect anomalous activities in an unsupervised manner? For example, in the intrusion detection setting, existing work seeks to detect anomal
Despite the superior performance in modeling complex patterns to address challenging problems, the black-box nature of Deep Learning (DL) methods impose limitations to their application in real-world critical domains. The lack of a smooth manner for
Anomaly Detection is an unsupervised learning task aimed at detecting anomalous behaviours with respect to historical data. In particular, multivariate Anomaly Detection has an important role in many applications thanks to the capability of summarizi