Sketch-Based Streaming Anomaly Detection in Dynamic Graphs

Published by: Siddharth Bhatia

Publication date: 2021

Research field: Informatics Engineering

Language: English

Authors: Siddharth Bhatia - Mohit Wadhwa - Philip S. Yu

Data Structures and Algorithms, Artificial Intelligence, Machine Learning

Given a stream of graph edges from a dynamic graph, how can we assign anomaly scores to edges and subgraphs in an online manner, for the purpose of detecting unusual behavior, using constant time and memory? For example, in intrusion detection, existing work seeks to detect either anomalous edges or anomalous subgraphs, but not both. In this paper, we first extend the count-min sketch data structure to a higher-order sketch. This higher-order sketch has the useful property of preserving the dense subgraph structure (dense subgraphs in the input turn into dense submatrices in the data structure). We then propose four online algorithms that utilize this enhanced data structure, which (a) detect both edge and graph anomalies; (b) process each edge and graph in constant memory and constant update time per newly arriving edge; and (c) outperform state-of-the-art baselines on four real-world datasets. Our method is the first streaming approach that incorporates dense subgraph search to detect graph anomalies in constant memory and time.
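The higher-order sketch idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `HigherOrderSketch`, its parameters, and the use of salted calls to Python's built-in `hash` in place of independent hash functions are all assumptions made for illustration. Each edge (u, v) increments the cell at row h(u), column h(v) in every matrix, so all edges between a set of sources and a set of destinations fall inside one submatrix.

```python
import random


class HigherOrderSketch:
    """Illustrative matrix-valued count-min sketch for edge streams (hypothetical)."""

    def __init__(self, num_matrices=2, num_buckets=8, seed=0):
        rng = random.Random(seed)
        self.num_buckets = num_buckets
        # One salt per matrix stands in for an independent hash function (assumption).
        self.salts = [rng.getrandbits(32) for _ in range(num_matrices)]
        self.counts = [
            [[0] * num_buckets for _ in range(num_buckets)]
            for _ in range(num_matrices)
        ]

    def bucket(self, node, salt):
        # The same hash maps sources to rows and destinations to columns.
        return hash((salt, node)) % self.num_buckets

    def add_edge(self, u, v, weight=1):
        # Constant work per edge: one cell update in each matrix.
        for matrix, salt in zip(self.counts, self.salts):
            matrix[self.bucket(u, salt)][self.bucket(v, salt)] += weight

    def estimate(self, u, v):
        # Collisions only add counts, so the minimum never underestimates.
        return min(
            matrix[self.bucket(u, salt)][self.bucket(v, salt)]
            for matrix, salt in zip(self.counts, self.salts)
        )


sketch = HigherOrderSketch(num_matrices=3, num_buckets=16, seed=1)
sources, destinations = (1, 2, 3), (4, 5, 6)
for u in sources:
    for v in destinations:
        sketch.add_edge(u, v)  # a complete bipartite (dense) subgraph

# All nine edges land in the submatrix indexed by the hashed sources and destinations.
salt0 = sketch.salts[0]
rows = {sketch.bucket(u, salt0) for u in sources}
cols = {sketch.bucket(v, salt0) for v in destinations}
submatrix_mass = sum(sketch.counts[0][r][c] for r in rows for c in cols)
```

Memory is fixed by `num_matrices` and `num_buckets` regardless of stream length, which is the property the abstract relies on for constant-memory processing.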

Yixin Liu, Shirui Pan, Yu Guang Wang, 2021

Detecting anomalies in dynamic graphs has drawn increasing attention due to their wide applications in social networks, e-commerce, and cybersecurity. Recent deep learning-based approaches have shown promising results over shallow methods. However, they fail to address two core challenges of anomaly detection in dynamic graphs: the lack of informative encoding for unattributed nodes and the difficulty of learning discriminative knowledge from coupled spatial-temporal dynamic graphs. To overcome these challenges, in this paper, we present a novel Transformer-based Anomaly Detection framework for DYnamic graphs (TADDY). Our framework constructs a comprehensive node encoding strategy to better represent each node's structural and temporal roles in an evolving graph stream. Meanwhile, TADDY captures informative representations from dynamic graphs with coupled spatial-temporal patterns via a dynamic graph transformer model. Extensive experimental results demonstrate that our proposed TADDY framework outperforms the state-of-the-art methods by a large margin on four real-world datasets.

Machine Learning

Isconna: Streaming Anomaly Detection with Frequency and Patterns

Rui Liu, Siddharth Bhatia, Bryan Hooi, 2021

An edge stream is a common form of presentation of dynamic networks. It can evolve with time, with new types of nodes or edges being continuously added. Existing methods for anomaly detection rely on edge occurrence counts or compare pattern snippets found in historical records. In this work, we propose Isconna, which focuses on both the frequency and the pattern of edge records. The burst detection component targets anomalies between individual timestamps, while the pattern detection component highlights anomalies across segments of timestamps. These two components together produce three intermediate scores, which are aggregated into the final anomaly score. Isconna does not actively explore or maintain pattern snippets; it instead measures the consecutive presence and absence of edge records. Isconna is an online algorithm; it does not keep the original information of edge records, and only statistical values are maintained in a few count-min sketches (CMS). Isconna's space complexity $O(rc)$ is determined by two user-specified parameters, which set the size of the CMSs. In the worst case, Isconna's time complexity can be up to $O(rc)$, but it can be amortized in practice. Experiments show that Isconna outperforms five state-of-the-art frequency- and/or pattern-based baselines on six real-world datasets with up to 20 million edge records.
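To make the $O(rc)$ space bound concrete, here is a minimal sketch of a plain count-min sketch with r rows and c columns. It illustrates only the underlying data structure, not Isconna's scoring; the class name `CountMinSketch`, its parameters, and the salted use of Python's built-in `hash` are assumptions made for this example.

```python
import random


class CountMinSketch:
    """Plain count-min sketch: r rows of c counters, so O(rc) space (illustrative)."""

    def __init__(self, rows=2, cols=64, seed=0):
        rng = random.Random(seed)
        self.cols = cols
        # One salt per row stands in for an independent hash function (assumption).
        self.salts = [rng.getrandbits(32) for _ in range(rows)]
        self.table = [[0] * cols for _ in range(rows)]

    def add(self, item, count=1):
        # One counter per row is updated for each arriving record.
        for row, salt in zip(self.table, self.salts):
            row[hash((salt, item)) % self.cols] += count

    def query(self, item):
        # Collisions can only inflate counters, so the minimum is an upper-biased estimate.
        return min(
            row[hash((salt, item)) % self.cols]
            for row, salt in zip(self.table, self.salts)
        )


cms = CountMinSketch(rows=3, cols=32, seed=7)
for _ in range(3):
    cms.add((1, 2))  # the edge 1 -> 2 seen three times
cms.add((5, 9))

num_counters = sum(len(row) for row in cms.table)  # rows * cols
```

The original edge records are never stored; only the r-by-c counter table persists, which matches the abstract's statement that only statistical values are kept.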

Machine Learning, Artificial Intelligence

A Memory-Efficient Sketch Method for Estimating High Similarities in Streaming Sets

Pinghui Wang, Yiyan Qi, Yuanming Zhang, 2019

Estimating set similarity and detecting highly similar sets are fundamental problems in areas such as databases, machine learning, and information retrieval. MinHash is a well-known technique for approximating Jaccard similarity of sets and has been successfully used for many applications such as similarity search and large scale learning. Its two compress

Data Structures and Algorithms

Dynamic Graph-Based Anomaly Detection in the Electrical Grid

Shimiao Li, Amritanshu Pandey, Bryan Hooi, 2020

Given sensor readings over time from a power grid, how can we accurately detect when an anomaly occurs? A key part of achieving this goal is to use the network of power grid sensors to quickly detect, in real time, when any unusual events, whether natural faults or malicious attacks, occur on the power grid. Existing bad-data detectors in the industry lack the sophistication to robustly detect broad types of anomalies, especially those due to emerging cyber-attacks, since they operate on a single measurement snapshot of the grid at a time. New ML methods are more widely applicable, but generally do not consider the impact of topology change on sensor measurements and thus cannot accommodate regular topology adjustments in historical data. Hence, we propose DYNWATCH, a domain-knowledge-based and topology-aware algorithm for anomaly detection using sensors placed on a dynamic grid. Our approach is accurate, outperforming existing approaches by 20% or more (F-measure) in experiments; and fast, running in less than 1.7 ms on average per time tick per sensor on a 60K+ branch case using a laptop computer, and scaling linearly in the size of the graph.

Machine Learning

Multi-Perspective Anomaly Detection

Peter Jakob, Manav Madan, Tobias Schmid-Schirling, 2021

Anomaly detection is a critical problem in the manufacturing industry. In many applications, images of objects to be analyzed are captured from multiple perspectives, which can be exploited to improve the robustness of anomaly detection. In this work, we build upon the deep support vector data description algorithm and address multi-perspective anomaly detection using three different fusion techniques, i.e., early fusion, late fusion, and late fusion with multiple decoders. We employ different augmentation techniques with a denoising process to deal with scarce one-class data, which further improves the performance (ROC AUC $= 80\%$). Furthermore, we introduce the dices dataset, which consists of over 2000 grayscale images of falling dices from multiple perspectives, with 5% of the images containing rare anomalies (e.g., drill holes, sawing, or scratches). We evaluate our approach on the new dices dataset using images from two different perspectives and also benchmark on the standard MNIST dataset. Extensive experiments demonstrate that our proposed multi-perspective approach exceeds the state-of-the-art single-perspective anomaly detection on both the MNIST and dices datasets. To the best of our knowledge, this is the first work that focuses on addressing multi-perspective anomaly detection in images by jointly using different perspectives together with one single objective function for anomaly detection.

Computer Vision and Pattern Recognition, Artificial Intelligence, Machine Learning