Starting from the identification of a drawback in the Isolation Forest (IF) algorithm that limits its use in the scope of anomaly detection, we propose two extensions: the first overcomes the aforementioned limitation, and the second provides the algorithm with some supervised learning capability. The resulting Hybrid Isolation Forest (HIF) is first evaluated on a synthetic dataset to analyze the effect of the newly introduced meta-parameters and to verify that the addressed limitation of the IF algorithm is effectively overcome. We then compare the two algorithms on the ISCX benchmark dataset in the context of a network intrusion detection application. Our experiments show that HIF not only outperforms IF but also challenges the one-class and two-class SVM baselines while remaining computationally efficient.
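For context, a minimal sketch of the baseline IF scoring that HIF extends, using scikit-learn's IsolationForest; the hybrid extensions described above are not part of this public API, so only the standard anomaly score is shown, on assumed synthetic data.

```python
# Minimal sketch of baseline Isolation Forest scoring (the starting point
# that HIF extends); the hybrid/supervised extensions are NOT in this API.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(0)
X_train = rng.normal(size=(1000, 4))                  # nominal data
X_test = np.vstack([rng.normal(size=(10, 4)),         # nominal samples
                    rng.uniform(-6, 6, size=(10, 4))])  # likely anomalies

forest = IsolationForest(n_estimators=100, random_state=0).fit(X_train)
# score_samples is higher for more "normal" points; negate it to obtain
# an anomaly score where larger means more anomalous.
anomaly_score = -forest.score_samples(X_test)
print(anomaly_score)
```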
Anomaly Detection is an unsupervised learning task aimed at detecting anomalous behaviours with respect to historical data. In particular, multivariate Anomaly Detection plays an important role in many applications, thanks to its capability of summarizing the status of a complex system or observed phenomenon with a single indicator (typically called an 'Anomaly Score') and thanks to the unsupervised nature of the task, which does not require human tagging. The Isolation Forest is one of the most commonly adopted algorithms in the field of Anomaly Detection, due to its proven effectiveness and low computational complexity. A major problem affecting the Isolation Forest is its lack of interpretability, an effect of the inherent randomness governing the splits performed by the Isolation Trees, the building blocks of the Isolation Forest. In this paper we propose effective, yet computationally inexpensive, methods to define feature importance scores at both the global and the local level for the Isolation Forest. Moreover, we define a procedure to perform unsupervised feature selection for Anomaly Detection problems based on our interpretability method; this procedure also serves the purpose of tackling the challenging task of feature importance evaluation in unsupervised anomaly detection. We assess the performance on several synthetic and real-world datasets, including comparisons against state-of-the-art interpretability techniques, and make the code publicly available to enhance reproducibility and foster research in the field.
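To illustrate the general idea of path-based local feature importance for an Isolation Forest, the sketch below credits the features used along a point's isolation paths, weighting short paths (easy isolation, the hallmark of an anomaly) more heavily. This is a simplified illustration of the concept, not the scoring method proposed in the paper.

```python
# Hedged sketch: local feature importance from isolation paths. Walks a
# point down every tree of a fitted sklearn IsolationForest and credits the
# split features it passes, weighted by 1/path_length. Illustrative only.
import numpy as np
from sklearn.ensemble import IsolationForest

def local_importance(forest, x):
    scores = np.zeros(x.shape[0])
    for est in forest.estimators_:
        t, node, path_feats = est.tree_, 0, []
        while t.feature[node] >= 0:          # feature == -2 marks a leaf
            f = t.feature[node]
            path_feats.append(f)
            node = (t.children_left[node] if x[f] <= t.threshold[node]
                    else t.children_right[node])
        for f in path_feats:                 # short paths weigh more
            scores[f] += 1.0 / len(path_feats)
    return scores / scores.sum()

rng = np.random.RandomState(0)
X = rng.normal(size=(500, 5))
forest = IsolationForest(n_estimators=100, random_state=0).fit(X)
outlier = np.array([0.0, 0.0, 8.0, 0.0, 0.0])  # anomalous in feature 2
print(local_importance(forest, outlier))
```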
This paper proposes an intrusion detection and prediction system based on uncertain and imprecise inference networks, together with its implementation. Given a history of sessions, the goal is a supervised learning method, coupled with a classifier, that extracts the knowledge needed to determine whether a session contains an intrusion and, if so, to recognize its type and predict the intrusions likely to follow it. The proposed system accounts for the uncertainty and imprecision that can affect the statistical data of the session history. Systematically representing this kind of knowledge with a single probability distribution presupposes subjective information that is too rich and risks being partly arbitrary. One of the first objectives of this work was therefore to ensure consistency between the way information is represented and the information actually available.
Huge datasets in cyber security, such as network traffic logs, can be analyzed using machine learning and data mining methods. However, the amount of collected data keeps increasing, which makes analysis more difficult. Many machine learning methods were not designed for big datasets and are consequently slow and difficult to understand. We address the issue of efficient network traffic classification by creating an intrusion detection framework that applies dimensionality reduction and conjunctive rule extraction. The system can perform unsupervised anomaly detection and use this information to create conjunctive rules that classify huge amounts of traffic in real time. We test the implemented system with the widely used KDD Cup 99 dataset and real-world network logs to confirm that the performance is satisfactory. The system is transparent rather than a black box, making it intuitive for domain experts such as network administrators.
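A hedged sketch of the pipeline shape described above: project traffic features into a low-dimensional space, flag anomalies there without supervision, and express the flagged region as a conjunctive rule (an AND of per-dimension interval tests) that can be matched against new traffic cheaply. The synthetic data and the 97th-percentile cut-off are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch: dimensionality reduction + conjunctive rule extraction.
# Illustrative data and thresholds; not the paper's exact method.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
normal = rng.normal(size=(1000, 20))           # stand-in traffic features
attack = rng.normal(loc=4.0, size=(30, 20))    # stand-in attack traffic
X = np.vstack([normal, attack])

Z = PCA(n_components=2).fit_transform(X)       # dimensionality reduction
dist = np.linalg.norm(Z - Z.mean(axis=0), axis=1)
flagged = Z[dist > np.percentile(dist, 97)]    # unsupervised anomaly flags

# Conjunctive rule: (lower_i <= z_i <= upper_i) AND-ed over all dimensions.
lower, upper = flagged.min(axis=0), flagged.max(axis=0)
rule = np.all((Z >= lower) & (Z <= upper), axis=1)
print(f"rule matches {rule.sum()} of {len(Z)} records")
```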
Ubiquitous anomalies constantly endanger the security of our systems. They may cause irreversible damage to the system and leak private information, so detecting them promptly is of vital importance. Traditional supervised methods such as Decision Trees and Support Vector Machines (SVM) are used to classify normal and abnormal behaviour. However, abnormal instances are often far rarer than normal ones, which biases the decisions of these methods. Generative adversarial networks (GANs) have been proposed to handle this case: thanks to their strong generative ability, they only need to learn the distribution of normal data and can identify abnormal samples by their gap from the learned distribution. Nevertheless, existing GAN-based models are ill-suited to data with discrete values, which severely degrades detection performance. To cope with discrete features, in this paper we propose an efficient GAN-based model with a specifically designed loss function. Experimental results show that our model outperforms state-of-the-art models on discrete datasets and remarkably reduces the overhead.
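One common way to let a GAN generator emit discrete (categorical) features is a Gumbel-softmax relaxation of its output, sketched below in PyTorch. This illustrates the general technique only; the dimensions (NOISE_DIM, N_CATEGORIES) are assumptions and the paper's specifically designed loss is not reproduced here.

```python
# Hedged sketch: a GAN generator for discrete features via Gumbel-softmax,
# which keeps the sampling step differentiable so the discriminator's
# gradients can reach the generator. Sizes below are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

NOISE_DIM, N_CATEGORIES = 16, 10   # hypothetical sizes

class DiscreteGenerator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(NOISE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, N_CATEGORIES))

    def forward(self, z, tau=0.5):
        logits = self.net(z)
        # Differentiable one-hot sample over the discrete categories.
        return F.gumbel_softmax(logits, tau=tau, hard=True)

G = DiscreteGenerator()
z = torch.randn(4, NOISE_DIM)
print(G(z))   # 4 one-hot rows, usable by a discriminator end-to-end
```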
Internet of Things (IoT) devices are becoming increasingly popular and are influencing many application domains such as healthcare and transportation. These devices are used for real-world applications such as sensor monitoring and real-time control. In this work, we look at differentially private (DP) neural network (NN) based network intrusion detection systems (NIDS) to detect intrusion attacks on networks of such IoT devices. Existing NN training solutions in this domain either ignore privacy considerations or assume that the privacy requirements are homogeneous across all users. We show that the performance of existing differentially private stochastic methods degrades for clients with non-identical data distributions when clients' privacy requirements are heterogeneous. We define a cohort-based $(\epsilon,\delta)$-DP framework that models the more practical setting of IoT device cohorts with non-identical clients and heterogeneous privacy requirements. We propose two novel continual-learning based DP training methods that are designed to improve model performance in the aforementioned setting. To the best of our knowledge, ours is the first system that employs a continual learning-based approach to handle heterogeneity in client privacy requirements. We evaluate our approach on real datasets and show that our techniques outperform the baselines. We also show that our methods are robust to hyperparameter changes. Lastly, we show that one of our proposed methods can easily adapt to post-hoc relaxations of client privacy requirements.
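For reference, a minimal sketch of the core DP-SGD step that $(\epsilon,\delta)$-DP training methods build on: per-example gradient clipping followed by calibrated Gaussian noise. The model, clip norm, and noise multiplier are illustrative placeholders; the cohort-based continual-learning variants proposed in the paper are not reproduced here.

```python
# Hedged sketch of one DP-SGD step: clip each example's gradient to
# CLIP_NORM, sum, add Gaussian noise, then update. Hypothetical settings.
import torch
import torch.nn as nn

CLIP_NORM, NOISE_MULT, LR = 1.0, 1.1, 0.05   # hypothetical settings
model = nn.Linear(10, 2)                     # stand-in for the NIDS model
loss_fn = nn.CrossEntropyLoss()

def dp_sgd_step(batch_x, batch_y):
    summed = [torch.zeros_like(p) for p in model.parameters()]
    for x, y in zip(batch_x, batch_y):       # per-example gradients
        model.zero_grad()
        loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
        grads = [p.grad for p in model.parameters()]
        norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = min(1.0, CLIP_NORM / (norm + 1e-12))  # clip to CLIP_NORM
        for s, g in zip(summed, grads):
            s += g * scale
    with torch.no_grad():                    # noisy averaged update
        for p, s in zip(model.parameters(), summed):
            noise = torch.randn_like(s) * NOISE_MULT * CLIP_NORM
            p -= LR * (s + noise) / len(batch_x)

dp_sgd_step(torch.randn(8, 10), torch.randint(0, 2, (8,)))
```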