
Real-Time Anomaly Detection in Data Centers for Log-based Predictive Maintenance using an Evolving Fuzzy-Rule-Based Approach

Published by: Daniel Leite
Publication date: 2020
Research field: Informatics Engineering
Paper language: English





Detection of anomalous behaviors in data centers is crucial to predictive maintenance and data safety. By data center, we mean any computer network that allows users to transmit and exchange data and information. In particular, we focus on the Tier-1 data center of the Italian Institute for Nuclear Physics (INFN), which supports the high-energy physics experiments at the Large Hadron Collider (LHC) in Geneva. The center provides the resources and services needed for data processing, storage, analysis, and distribution. Log records in the data center are stochastic and non-stationary in nature. We propose a real-time approach to monitor and classify log records based on sliding time windows and a time-varying evolving fuzzy-rule-based classification model. The most frequent log pattern according to a control chart is taken as the normal system status. We extract attributes from time windows to gradually develop and update an evolving Gaussian Fuzzy Classifier (eGFC) on the fly. The real-time anomaly monitoring system provides encouraging results in terms of accuracy, compactness, and real-time operation.
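The window-based monitoring loop can be made concrete with a short sketch. The following is a minimal illustration, not the paper's exact eGFC equations: it assumes per-window attribute vectors (e.g., logs per minute and error fraction, both hypothetical here) arriving as a stream, an assumed membership threshold rho for spawning new rules, and simple recursive mean/spread updates for the Gaussian granules.

import numpy as np

class EvolvingGaussianFuzzyClassifier:
    """Streaming classifier built from Gaussian fuzzy rules (granules)."""

    def __init__(self, rho=0.3, init_sigma=1.0):
        self.rho = rho                # membership threshold to spawn a rule
        self.init_sigma = init_sigma  # initial spread of a new granule
        self.rules = []               # each rule: dict(center, sigma, n, label)

    def _membership(self, rule, x):
        # Product-form Gaussian membership of x in the rule's granule
        z = (x - rule["center"]) / rule["sigma"]
        return float(np.exp(-0.5 * np.dot(z, z)))

    def update(self, x, label):
        """Learn one window's attribute vector x with its class label."""
        x = np.asarray(x, dtype=float)
        if self.rules:
            mships = [self._membership(r, x) for r in self.rules]
            best = int(np.argmax(mships))
            if mships[best] >= self.rho:
                # Recursive (Welford-style) update of the winning granule
                r = self.rules[best]
                r["n"] += 1
                delta = x - r["center"]
                r["center"] = r["center"] + delta / r["n"]
                var = ((r["n"] - 1) * r["sigma"] ** 2
                       + delta * (x - r["center"])) / r["n"]
                r["sigma"] = np.maximum(np.sqrt(var), 1e-6)
                r["label"] = label
                return
        # No granule is similar enough: create a new rule on the fly
        self.rules.append({"center": x.copy(),
                           "sigma": np.full_like(x, self.init_sigma),
                           "n": 1,
                           "label": label})

    def predict(self, x):
        """Class of the granule with the highest membership for x."""
        x = np.asarray(x, dtype=float)
        best = max(self.rules, key=lambda r: self._membership(r, x))
        return best["label"]

clf = EvolvingGaussianFuzzyClassifier(rho=0.3)
clf.update([120.0, 0.4], "normal")   # e.g., [logs/min, error fraction]
clf.update([950.0, 0.9], "anomaly")
print(clf.predict([110.0, 0.35]))    # -> "normal"

A new rule is created whenever no existing granule explains the current window well enough, which is what lets the rule base track a non-stationary log stream without retraining from scratch.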




Read also

Log-based predictive maintenance of computing centers is a major concern for the worldwide computing grid that supports the CERN (European Organization for Nuclear Research) physics experiments. A log, as event-oriented ad hoc information, is quite often given as unstructured big data. Log data processing is a time-consuming computational task. The goal is to grab essential information from a continuously changing grid environment to construct a classification model. Evolving granular classifiers are suited to learn from time-varying log streams and, therefore, perform online classification of the severity of anomalies. We formulated a 4-class online anomaly classification problem and employed time windows between landmarks and two granular computing methods, namely Fuzzy-set-Based evolving Modeling (FBeM) and evolving Granular Neural Network (eGNN), to model and monitor the logging activity rate. The results of classification are of utmost importance for predictive maintenance because priority can be given to specific time intervals in which the classifier indicates the existence of high- or medium-severity anomalies.
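As a rough illustration of the windowed "logging activity rate" feature this abstract refers to, the sketch below bins timestamped log records into fixed windows and maps each rate to one of four severity classes. The window length, thresholds, and function names are hypothetical; the actual FBeM/eGNN models learn the classes rather than thresholding them.

import numpy as np

def activity_rate_per_window(timestamps, window_s=60.0):
    """Count log records per fixed window between landmarks (events/s)."""
    timestamps = np.sort(np.asarray(timestamps, dtype=float))
    edges = np.arange(timestamps[0], timestamps[-1] + window_s, window_s)
    counts, _ = np.histogram(timestamps, bins=edges)
    return counts / window_s

def severity_class(rate, thresholds=(1.0, 10.0, 100.0)):
    """Map a rate to one of 4 severity classes (0 = normal ... 3 = high)."""
    return int(np.searchsorted(thresholds, rate))

rates = activity_rate_per_window([0.5, 3.2, 61.0, 62.4, 62.5], window_s=60.0)
print([severity_class(r) for r in rates])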
The increasing use of Internet-of-Things (IoT) devices for monitoring a wide spectrum of applications, along with the challenges of big-data streaming support they often require for data analysis, is nowadays pushing for increased attention to the emerging edge computing paradigm. In particular, smart approaches to managing and analyzing data directly on the network edge are increasingly investigated, and Artificial Intelligence (AI) powered edge computing is envisaged to be a promising direction. In this paper, we focus on Data Centers (DCs) and Supercomputers (SCs), where a new generation of high-resolution monitoring systems is being deployed, opening new opportunities for analyses such as anomaly detection and security, but introducing new challenges in handling the vast amount of data produced. In detail, we report on a novel lightweight and scalable approach to increase the security of DCs/SCs that involves AI-powered edge computing on high-resolution power consumption. The method -- called pAElla -- targets real-time Malware Detection (MD), runs on an out-of-band IoT-based monitoring system for DCs/SCs, and involves the Power Spectral Density of power measurements, along with AutoEncoders. Results are promising, with an F1-score close to 1, and a False Alarm and Malware Miss rate close to 0%. We compare our method with state-of-the-art MD techniques and show that, in the context of DCs/SCs, pAElla can cover a wider range of malware, significantly outperforming SoA approaches in terms of accuracy. Moreover, we propose a methodology for online training suitable for DCs/SCs in production, and release an open dataset and code.
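A loose sketch of the pAElla-style pipeline follows: Welch power spectral density of a power-consumption window as the feature vector, and an autoencoder whose reconstruction error flags suspicious windows. The MLP used as a stand-in autoencoder, the synthetic benign traces, and the 3-sigma threshold are assumptions for illustration, not the published architecture or dataset.

import numpy as np
from scipy.signal import welch
from sklearn.neural_network import MLPRegressor

def psd_features(power_trace, fs=1000.0, nperseg=256):
    """Log-scaled Welch PSD of one raw power-measurement window."""
    _, pxx = welch(power_trace, fs=fs, nperseg=nperseg)
    return np.log10(pxx + 1e-12)

# Stand-in benign traces; in practice these come from the monitoring system
rng = np.random.default_rng(0)
benign_windows = [rng.normal(size=4096) for _ in range(64)]

# Train the autoencoder to reconstruct benign PSDs only
benign = np.stack([psd_features(w) for w in benign_windows])
ae = MLPRegressor(hidden_layer_sizes=(32, 8, 32), max_iter=2000, random_state=0)
ae.fit(benign, benign)

# Flag windows whose reconstruction error is far above the benign norm
errors = np.mean((ae.predict(benign) - benign) ** 2, axis=1)
threshold = errors.mean() + 3 * errors.std()

def looks_malicious(window):
    x = psd_features(window)[None, :]
    return float(np.mean((ae.predict(x) - x) ** 2)) > threshold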
Research in anomaly detection lacks a unified definition of what represents an anomalous instance. Discrepancies in the very nature of an anomaly lead to multiple paradigms of algorithm design and experimentation. Predictive maintenance is a special case, where the anomaly represents a failure that must be prevented. Related time-series research, such as outlier and novelty detection or time-series classification, does not apply to the concept of an anomaly in this field, because anomalies here are not single previously-unseen points and may not be precisely annotated. Moreover, due to the lack of annotated anomalous data, many benchmarks are adapted from supervised scenarios. To address these issues, we generalise the concept of positive and negative instances to intervals, so that unsupervised anomaly detection algorithms can be evaluated. We also preserve the imbalance scheme for evaluation through the proposal of the Preceding Window ROC, a generalisation of the calculation of ROC curves to time-series scenarios. We further adapt the mechanism of an established time-series anomaly detection benchmark to the proposed generalisations to reward early detection. The proposal therefore represents a flexible evaluation framework for the different scenarios. To show the usefulness of this definition, we include a case study of Big Data algorithms on a real-world time-series problem provided by the company ArcelorMittal, and compare the proposal with an existing evaluation method.
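To make the interval generalisation concrete, here is a minimal scorer in the spirit described: a detection counts as a true positive if it lands inside an annotated anomaly interval or within a preceding window before its start (rewarding early detection), and uncovered intervals count as misses. This illustrates the idea only; it is not the exact Preceding Window ROC computation from the paper.

def evaluate_intervals(detections, anomaly_intervals, preceding=60.0):
    """detections: detection timestamps; anomaly_intervals: (start, end) pairs."""
    covered = set()
    tp = fp = 0
    for t in detections:
        matched = False
        for i, (start, end) in enumerate(anomaly_intervals):
            # Early detections inside the preceding window also count
            if start - preceding <= t <= end:
                covered.add(i)
                matched = True
                break
        tp += matched
        fp += not matched
    fn = len(anomaly_intervals) - len(covered)
    return {"tp": tp, "fp": fp, "fn": fn}

# Example: the early detection at t=95 is credited to the (100, 200) interval
print(evaluate_intervals([95.0, 400.0], [(100.0, 200.0), (300.0, 320.0)]))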
We introduce the DROW detector, a deep-learning-based detector for 2D range data. Laser scanners are lighting invariant, provide accurate range data, and typically cover a large field of view, making them interesting sensors for robotics applications. So far, research on detection in laser range data has been dominated by hand-crafted features and boosted classifiers, potentially losing performance due to suboptimal design choices. We propose a Convolutional Neural Network (CNN) based detector for this task. We show how to effectively apply CNNs for detection in 2D range data, and propose a depth preprocessing step and voting scheme that significantly improve CNN performance. We demonstrate our approach on wheelchairs and walkers, obtaining state-of-the-art detection results. Apart from the training data, none of our design choices limits the detector to these two classes, though. We provide a ROS node for our detector and release our dataset containing 464k laser scans, of which 24k were annotated.
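As a toy counterpart to the CNN detector described, the sketch below classifies fixed-length "cutouts" of consecutive range values into three classes (background, wheelchair, walker) with a small 1D CNN, and normalizes each cutout by its center range as one plausible depth-preprocessing step. Layer sizes, the window length, and the normalization are illustrative guesses, not the published DROW architecture.

import torch
import torch.nn as nn

class RangeCNN(nn.Module):
    def __init__(self, n_classes=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),
            nn.Flatten(),
            nn.Linear(32, n_classes),
        )

    def forward(self, cutouts):       # cutouts: (batch, 1, window_len)
        return self.net(cutouts)      # (batch, n_classes) logits

def normalize_cutout(cutout):
    """Divide each cutout by its center range so scale is depth-invariant."""
    mid = cutout.shape[-1] // 2
    return cutout / cutout[..., mid:mid + 1].clamp(min=1e-3)

model = RangeCNN()
x = normalize_cutout(torch.rand(8, 1, 48))   # 8 cutouts of 48 range readings
logits = model(x)                            # shape (8, 3)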
Considering the high heterogeneity of the ontologies published on the web, ontology matching is a crucial issue whose aim is to establish links between an entity of a source ontology and one or several entities from a target ontology. Perfectible similarity measures, considered as sources of information, are combined to establish these links. The theory of belief functions is a powerful mathematical tool for combining such uncertain information. In this paper, we introduce a decision process based on a distance measure to identify the best possible matching entities for a given source entity.
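Since the abstract leans on the theory of belief functions, a compact example helps: below is a textbook implementation of Dempster's rule of combination for two mass functions over the same frame of discernment, of the kind used to fuse similarity measures. The example masses and entity names are made up; this is a generic sketch, not the paper's specific decision process.

from itertools import product

def dempster_combine(m1, m2):
    """m1, m2: dicts mapping frozenset (focal element) -> mass, summing to 1."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb   # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: masses cannot be combined")
    # Normalize the remaining mass over non-empty intersections
    return {s: w / (1.0 - conflict) for s, w in combined.items()}

# Example: two similarity sources weighing candidate target entities e1, e2
m1 = {frozenset({"e1"}): 0.6, frozenset({"e1", "e2"}): 0.4}
m2 = {frozenset({"e2"}): 0.3, frozenset({"e1", "e2"}): 0.7}
print(dempster_combine(m1, m2))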
