ترغب بنشر مسار تعليمي؟ اضغط هنا

Anomaly Detection for Aggregated Data Using Multi-Graph Autoencoder

101   0   0.0 ( 0 )
 نشر من قبل Tomer Meirman
 تاريخ النشر 2021
  مجال البحث الهندسة المعلوماتية
والبحث باللغة English




اسأل ChatGPT حول البحث

In data systems, activities or events are continuously collected in the field to trace their proper executions. Logging, which means recording sequences of events, can be used for analyzing system failures and malfunctions, and identifying the causes and locations of such issues. In our research we focus on creating an Anomaly detection models for system logs. The task of anomaly detection is identifying unexpected events in dataset, which differ from the normal behavior. Anomaly detection models also assist in data systems analysis tasks. Modern systems may produce such a large amount of events monitoring every individual event is not feasible. In such cases, the events are often aggregated over a fixed period of time, reporting the number of times every event has occurred in that time period. This aggregation facilitates scaling, but requires a different approach for anomaly detection. In this research, we present a thorough analysis of the aggregated data and the relationships between aggregated events. Based on the initial phase of our research we present graphs representations of our aggregated dataset, which represent the different relationships between aggregated instances in the same context. Using the graph representation, we propose Multiple-graphs autoencoder MGAE, a novel convolutional graphs-autoencoder model which exploits the relationships of the aggregated instances in our unique dataset. MGAE outperforms standard graph-autoencoder models and the different experiments. With our novel MGAE we present 60% decrease in reconstruction error in comparison to standard graph autoencoder, which is expressed in reconstructing high-degree relationships.



قيم البحث

اقرأ أيضاً

350 - Tong Zhao , Bo Ni , Wenhao Yu 2020
The proliferation of web platforms has created incentives for online abuse. Many graph-based anomaly detection techniques are proposed to identify the suspicious accounts and behaviors. However, most of them detect the anomalies once the users have p erformed many such behaviors. Their performance is substantially hindered when the users observed data is limited at an early stage, which needs to be improved to minimize financial loss. In this work, we propose Eland, a novel framework that uses action sequence augmentation for early anomaly detection. Eland utilizes a sequence predictor to predict next actions of every user and exploits the mutual enhancement between action sequence augmentation and user-action graph anomaly detection. Experiments on three real-world datasets show that Eland improves the performance of a variety of graph-based anomaly detection methods. With Eland, anomaly detection performance at an earlier stage is better than non-augmented methods that need significantly more observed data by up to 15% on the Area under the ROC curve.
Dimensionality reduction is a crucial first step for many unsupervised learning tasks including anomaly detection and clustering. Autoencoder is a popular mechanism to accomplish dimensionality reduction. In order to make dimensionality reduction eff ective for high-dimensional data embedding nonlinear low-dimensional manifold, it is understood that some sort of geodesic distance metric should be used to discriminate the data samples. Inspired by the success of geodesic distance approximators such as ISOMAP, we propose to use a minimum spanning tree (MST), a graph-based algorithm, to approximate the local neighborhood structure and generate structure-preserving distances among data points. We use this MST-based distance metric to replace the Euclidean distance metric in the embedding function of autoencoders and develop a new graph regularized autoencoder, which outperforms a wide range of alternative methods over 20 benchmark anomaly detection datasets. We further incorporate the MST regularizer into two generative adversarial networks and find that using the MST regularizer improves the performance of anomaly detection substantially for both generative adversarial networks. We also test our MST regularized autoencoder on two datasets in a clustering application and witness its superior performance as well.
Unsupervised learning can leverage large-scale data sources without the need for annotations. In this context, deep learning-based auto encoders have shown great potential in detecting anomalies in medical images. However, state-of-the-art anomaly sc ores are still based on the reconstruction error, which lacks in two essential parts: it ignores the model-internal representation employed for reconstruction, and it lacks formal assertions and comparability between samples. We address these shortcomings by proposing the Context-encoding Variational Autoencoder (ceVAE) which combines reconstruction- with density-based anomaly scoring. This improves the sample- as well as pixel-wise results. In our experiments on the BraTS-2017 and ISLES-2015 segmentation benchmarks, the ceVAE achieves unsupervised ROC-AUCs of 0.95 and 0.89, respectively, thus outperforming state-of-the-art methods by a considerable margin.
Graph-based anomaly detection has been widely used for detecting malicious activities in real-world applications. Existing attempts to address this problem have thus far focused on structural feature engineering or learning in the binary classificati on regime. In this work, we propose to leverage graph contrastive coding and present the supervised GCCAD model for contrasting abnormal nodes with normal ones in terms of their distances to the global context (e.g., the average of all nodes). To handle scenarios with scarce labels, we further enable GCCAD as a self-supervised framework by designing a graph corrupting strategy for generating synthetic node labels. To achieve the contrastive objective, we design a graph neural network encoder that can infer and further remove suspicious links during message passing, as well as learn the global context of the input graph. We conduct extensive experiments on four public datasets, demonstrating that 1) GCCAD significantly and consistently outperforms various advanced baselines and 2) its self-supervised version without fine-tuning can achieve comparable performance with its fully supervised version.
Recently, the graph neural network (GNN) has shown great power in matrix completion by formulating a rating matrix as a bipartite graph and then predicting the link between the corresponding user and item nodes. The majority of GNN-based matrix compl etion methods are based on Graph Autoencoder (GAE), which considers the one-hot index as input, maps a user (or item) index to a learnable embedding, applies a GNN to learn the node-specific representations based on these learnable embeddings and finally aggregates the representations of the target users and its corresponding item nodes to predict missing links. However, without node content (i.e., side information) for training, the user (or item) specific representation can not be learned in the inductive setting, that is, a model trained on one group of users (or items) cannot adapt to new users (or items). To this end, we propose an inductive matrix completion method using GAE (IMC-GAE), which utilizes the GAE to learn both the user-specific (or item-specific) representation for personalized recommendation and local graph patterns for inductive matrix completion. Specifically, we design two informative node features and employ a layer-wise node dropout scheme in GAE to learn local graph patterns which can be generalized to unseen data. The main contribution of our paper is the capability to efficiently learn local graph patterns in GAE, with good scalability and superior expressiveness compared to previous GNN-based matrix completion methods. Furthermore, extensive experiments demonstrate that our model achieves state-of-the-art performance on several matrix completion benchmarks. Our official code is publicly available.

الأسئلة المقترحة

التعليقات
جاري جلب التعليقات جاري جلب التعليقات
سجل دخول لتتمكن من متابعة معايير البحث التي قمت باختيارها
mircosoft-partner

هل ترغب بارسال اشعارات عن اخر التحديثات في شمرا-اكاديميا