Network device syslogs are ubiquitous and abundant in modern data centers, with most large data centers producing millions of messages per day. Yet the operational information reflected in syslogs and their implications for diagnosis and management tasks are poorly understood. Prevalent approaches to understanding syslogs focus on simple correlation and abnormality detection; they are often limited to detection and provide little insight toward diagnosis and resolution. To improve data center operations, we propose and implement Log-Prophet, a system that applies a toolbox of statistical techniques and domain-specific models to mine detailed diagnoses. Log-Prophet infers causal relationships between syslog lines and constructs succinct but valuable problem graphs that summarize root causes and their locality, including cascading problems. We validate Log-Prophet using problem tickets and operator interviews. To demonstrate the strength of Log-Prophet, we perform an initial longitudinal study of a large online service provider's data center. Our study demonstrates that Log-Prophet significantly reduces the number of alerts while highlighting interesting operational issues.