Graph-based Incident Aggregation for Large-Scale Online Service Systems

Added by Zhuangbin Chen

Publication date 2021

fields Informatics Engineering

Research language: English

Authors Zhuangbin Chen - Jinyang Liu - Yuxin Su

Machine Learning Software Engineering

As online service systems continue to grow in complexity and volume, how service incidents are managed will significantly impact company revenue and user trust. Due to the cascading effect, cloud failures often come with an overwhelming number of incidents from dependent services and devices. To pursue efficient incident management, related incidents should be quickly aggregated to narrow down the problem scope. To this end, we propose GRLIA, an incident aggregation framework based on graph representation learning over the cascading graph of cloud failures. A representation vector is learned for each unique type of incident in an unsupervised and unified manner, and it simultaneously encodes the topological and temporal correlations among incidents. Thus, it can be easily employed for online incident aggregation. In particular, to learn the correlations more accurately, we try to recover the complete scope of a failure's cascading impact by leveraging fine-grained system monitoring data, i.e., Key Performance Indicators (KPIs). The proposed framework is evaluated with real-world incident data collected from a large-scale online service system of Huawei Cloud. The experimental results demonstrate that GRLIA is effective and outperforms existing methods. Furthermore, our framework has been successfully deployed in industrial practice.
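To make the idea of online aggregation over learned incident vectors concrete, here is a minimal sketch. It is not GRLIA's actual procedure: the embedding values, the cosine-similarity threshold, the time window, and the function names `cosine` and `aggregate` are all assumptions chosen for illustration. It only shows how a per-type representation vector could be used to group incoming incidents.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def aggregate(incidents, embeddings, sim_threshold=0.8, window=600):
    """Greedily group a time-sorted stream of (timestamp_seconds, incident_type).

    embeddings maps each incident type to its (assumed, pre-learned) vector.
    An incident joins the most similar group that is still active, i.e. whose
    latest incident is at most `window` seconds old; otherwise it opens a new group.
    """
    groups = []
    for ts, itype in incidents:
        vec = embeddings[itype]
        best, best_sim = None, sim_threshold
        for group in groups:
            last_ts, _ = group[-1]
            if ts - last_ts > window:
                continue  # group is too old to absorb new incidents
            sim = max(cosine(vec, embeddings[t]) for _, t in group)
            if sim >= best_sim:
                best, best_sim = group, sim
        if best is None:
            groups.append([(ts, itype)])
        else:
            best.append((ts, itype))
    return groups

# Hypothetical two-dimensional embeddings for three incident types.
embeddings = {
    "disk_full": [1.0, 0.0],
    "io_latency": [0.9, 0.1],
    "login_fail": [0.0, 1.0],
}
incidents = [(0, "disk_full"), (30, "io_latency"), (60, "login_fail"), (5000, "disk_full")]
groups = aggregate(incidents, embeddings)
```

With these made-up vectors, the first two incidents share a group because their vectors are close, while the unrelated type and the much later incident each start their own group.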

A Graph Computation based Sequential Power Flow Calculation for Large-Scale ACDC Systems

Wei Feng, Jingjin Wu, Chen Yuan 2019

This paper proposes a graph computation based sequential power flow calculation method for Line Commutated Converter (LCC) based large-scale AC/DC systems to achieve high computing performance. Based on graph theory, the complex AC/DC system is first converted to a graph model and stored in a graph database. Then, the hybrid system is divided into several isolated areas with a graph partition algorithm by decoupling the AC and DC networks. Thus, the power flow analysis can be executed in parallel for each independent area with the newly selected slack buses. Furthermore, for each area, the node-based parallel computing (NPC) and hierarchical parallel computing (HPC) used in graph computation are employed to speed up fast decoupled power flow (FDPF). Comprehensive case studies on the IEEE 300-bus system, the polished South Carolina 12,000-bus system, and a China 11,119-bus system are performed to demonstrate the accuracy and efficiency of the proposed method.
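The decoupling step can be pictured with a small sketch. This is an assumption-laden illustration, not the paper's graph-database partition algorithm: it simply ignores DC lines and takes the connected components of the remaining AC network as the isolated areas. The function name `split_areas` and the `(bus_a, bus_b, kind)` line format are hypothetical.

```python
from collections import defaultdict, deque

def split_areas(buses, lines):
    """Return the AC-connected areas of a hybrid AC/DC network.

    lines is a list of (bus_a, bus_b, kind) with kind "AC" or "DC".
    DC lines are dropped, so two buses land in the same area only if
    they are connected through AC lines alone.
    """
    adj = defaultdict(set)
    for a, b, kind in lines:
        if kind == "AC":
            adj[a].add(b)
            adj[b].add(a)

    seen = set()
    areas = []
    for bus in buses:
        if bus in seen:
            continue
        # Breadth-first search collects one AC-connected component.
        queue = deque([bus])
        seen.add(bus)
        area = []
        while queue:
            cur = queue.popleft()
            area.append(cur)
            for nxt in adj[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        areas.append(sorted(area))
    return areas

# Toy network: buses 2 and 3 are linked only by a DC line.
buses = [1, 2, 3, 4, 5]
lines = [(1, 2, "AC"), (2, 3, "DC"), (3, 4, "AC"), (4, 5, "AC")]
areas = split_areas(buses, lines)
```

Each resulting area could then be handed to its own power flow solve; choosing a slack bus per area is left out here because the abstract does not say how it is selected.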

Distributed Parallel and Cluster Computing

Adversarial Attack on Large Scale Graph

Jintang Li, Tao Xie, Liang Chen 2020

Recent studies have shown that graph neural networks (GNNs) are vulnerable to perturbations due to a lack of robustness and can therefore be easily fooled. Currently, most works on attacking GNNs mainly use gradient information to guide the attack and achieve outstanding performance. However, their high time and space complexity makes them unmanageable for large-scale graphs and becomes the major bottleneck that prevents practical usage. We argue that the main reason is that they have to use the whole graph for attacks, so time and space complexity increase as the data scale grows. In this work, we propose an efficient Simplified Gradient-based Attack (SGA) method to bridge this gap. SGA can cause GNNs to misclassify specific target nodes through a multi-stage attack framework, which needs only a much smaller subgraph. In addition, we present a practical metric named Degree Assortativity Change (DAC) to measure the impact of adversarial attacks on graph data. We evaluate our attack method on four real-world graph networks by attacking several commonly used GNNs. The experimental results demonstrate that SGA can achieve significant time and memory efficiency improvements while maintaining competitive attack performance compared to state-of-the-art attack techniques. Code is available at: https://github.com/EdisonLeeeee/SGAttack.
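The abstract does not give the formula for DAC, so the sketch below should be read as an assumption: it computes the standard degree assortativity coefficient (the Pearson correlation of endpoint degrees over all edges, counted in both directions) and reports the absolute difference before and after a perturbation. The function names and the toy graphs are hypothetical, and the paper's exact normalization may differ.

```python
import math
from collections import Counter

def degree_assortativity(edges):
    """Pearson correlation of endpoint degrees for an undirected simple graph.

    edges is a list of (u, v) pairs. Raises ValueError when all endpoint
    degrees are equal, because the coefficient is undefined in that case.
    """
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    # Count every edge in both directions so the measure is symmetric.
    xs, ys = [], []
    for u, v in edges:
        xs += [deg[u], deg[v]]
        ys += [deg[v], deg[u]]
    n = len(xs)
    mean = sum(xs) / n
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean * mean
    var = sum(x * x for x in xs) / n - mean * mean
    if math.isclose(var, 0.0, abs_tol=1e-12):
        raise ValueError("degree assortativity is undefined for this graph")
    return cov / var

def assortativity_change(clean_edges, attacked_edges):
    """Absolute change in degree assortativity (an assumed stand-in for DAC)."""
    return abs(degree_assortativity(attacked_edges) - degree_assortativity(clean_edges))

# Toy example: a 4-node path, then the same path with one adversarial edge added.
clean = [(1, 2), (2, 3), (3, 4)]
attacked = clean + [(1, 3)]
change = assortativity_change(clean, attacked)
```

A larger change suggests the perturbation altered the graph's degree mixing pattern more noticeably, which is the kind of side effect such a metric is meant to expose.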

Machine Learning Artificial Intelligence Cryptography and Security

GIST: Distributed Training for Large-Scale Graph Convolutional Networks

Cameron R. Wolfe, Jingkang Yang, Arindam Chowdhury 2021

The graph convolutional network (GCN) is a go-to solution for machine learning on graphs, but its training is notoriously difficult to scale in terms of both graph size and the number of model parameters. Although some work has explored training on large-scale graphs (e.g., GraphSAGE, ClusterGCN), we pioneer efficient training of large-scale GCN models (i.e., ultra-wide, overparameterized models) with the proposal of a novel distributed training framework. Our proposed training methodology, called GIST, disjointly partitions the parameters of a GCN model into several smaller sub-GCNs that are trained independently and in parallel. In addition to being compatible with any GCN architecture, GIST improves model performance, scales to training on arbitrarily large graphs, significantly decreases wall-clock training time, and enables the training of markedly overparameterized GCN models. Remarkably, with GIST, we train an astonishingly wide 32,768-dimensional GraphSAGE model, which exceeds the capacity of a single GPU by a factor of 8X, to SOTA performance on the Amazon2M dataset.
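A disjoint parameter partition of this kind can be sketched as follows. The details are assumptions made for illustration rather than GIST's documented procedure: hidden-unit indices are shuffled and dealt out to the sub-GCNs, and each sub-GCN keeps only the matching rows and columns of a weight matrix. The helper names `partition_hidden_units` and `slice_weight` are hypothetical.

```python
import random

def partition_hidden_units(hidden_dim, num_sub_gcns, seed=0):
    """Split hidden-unit indices 0..hidden_dim-1 into disjoint, near-equal parts."""
    rng = random.Random(seed)
    indices = list(range(hidden_dim))
    rng.shuffle(indices)
    # Dealing indices round-robin keeps the parts disjoint and balanced.
    return [sorted(indices[i::num_sub_gcns]) for i in range(num_sub_gcns)]

def slice_weight(weight, row_idx, col_idx):
    """Extract the sub-matrix of `weight` (a list of rows) for one sub-GCN."""
    return [[weight[r][c] for c in col_idx] for r in row_idx]

# Example: an 8-unit hidden layer split across 4 sub-GCNs.
parts = partition_hidden_units(hidden_dim=8, num_sub_gcns=4)

# A toy 3x3 weight matrix; one sub-model keeps rows 0 and 2 and column 1.
weight = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 9]]
sub_weight = slice_weight(weight, row_idx=[0, 2], col_idx=[1])
```

Because the parts do not overlap, each sub-GCN can be trained independently and its updated slice written back into the full matrix without conflicts.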

Machine Learning Artificial Intelligence Distributed Parallel and Cluster Computing

CoSimGNN: Towards Large-scale Graph Similarity Computation

Haoyan Xu, Runjian Chen, Yunsheng Bai 2020

The ability to compute similarity scores between graphs based on metrics such as Graph Edit Distance (GED) is important in many real-world applications, such as 3D action recognition and biological molecular identification. Computing exact GED values is typically an NP-hard problem, and traditional algorithms usually achieve an unsatisfactory trade-off between accuracy and efficiency. Recently, Graph Neural Networks (GNNs) have provided a data-driven solution for this task that is more efficient while maintaining prediction accuracy for similarity computation on small graphs (around 10 nodes per graph). Existing GNN-based methods either embed the two graphs separately (lacking low-level cross-graph interactions) or deploy cross-graph interactions for whole graph pairs (redundant and time-consuming), and they are still unable to achieve competitive results as the number of nodes in the graphs increases. In this paper, we focus on similarity computation for large-scale graphs and propose the embedding-coarsening-matching framework, which first embeds and coarsens large graphs into coarsened graphs with denser local topology and then deploys fine-grained interactions on the coarsened graphs to produce the final similarity scores.

Machine Learning Social and Information Networks

Training like Playing: A Reinforcement Learning And Knowledge Graph-based framework for building Automatic Consultation System in Medical Field

Yining Huang, Meilian Chen, Keke Tang 2021

We introduce a framework for an AI-based medical consultation system with knowledge graph embedding and reinforcement learning components, together with its implementation. Our implementation of this framework leverages knowledge organized as a graph to make diagnoses according to evidence collected from patients recurrently and dynamically. In the experiment we designed to evaluate its performance, it achieves a good result. More importantly, to obtain better performance, researchers can build on this framework with their own innovative ideas, well-designed experiments, and even clinical trials.

Machine Learning Software Engineering
