No Arabic abstract
Intrusion detection is only a starting step in securing IT infrastructure. Prediction of intrusions is the next step to provide an active defense against incoming attacks. Current intrusion prediction methods focus mainly on prediction of either intrusion type or intrusion category and do not use or provide contextual information such as source and target IP address. In addition most of them are dependant on domain knowledge and specific scenario knowledge. The proposed algorithm employs a bag-of-words model together with a hidden Markov model which not depend on specific domain knowledge. Since this algorithm depends on a training process it is adaptable to different conditions. A key advantage of the proposed algorithm is the inclusion of contextual data such as source IP address, destination IP range, alert type and alert category in its prediction, which is crucial for an eventual response. Experiments conducted using a public data set generated over 2500 alert predictions and achieved accuracy of 81% and 77% for single step and five step predictions respectively for prediction of the next alert cluster. It also achieved an accuracy of prediction of 95% and 92% for single step and five step predictions respectively for prediction of the next alert category. The proposed methods achieved a prediction accuracy improvement of 5% for alert category over existing variable length Markov chain intrusion prediction methods, while providing more information for a possible defense.
Network intrusion detection sensors are usually built around low level models of network traffic. This means that their output is of a similarly low level and as a consequence, is difficult to analyze. Intrusion alert correlation is the task of automating some of this analysis by grouping related alerts together. Attack graphs provide an intuitive model for such analysis. Unfortunately alert flooding attacks can still cause a loss of service on sensors, and when performing attack graph correlation, there can be a large number of extraneous alerts included in the output graph. This obscures the fine structure of genuine attacks and makes them more difficult for human operators to discern. This paper explores modified correlation algorithms which attempt to minimize the impact of this attack.
Attack graphs (AG) are used to assess pathways availed by cyber adversaries to penetrate a network. State-of-the-art approaches for AG generation focus mostly on deriving dependencies between system vulnerabilities based on network scans and expert knowledge. In real-world operations however, it is costly and ineffective to rely on constant vulnerability scanning and expert-crafted AGs. We propose to automatically learn AGs based on actions observed through intrusion alerts, without prior expert knowledge. Specifically, we develop an unsupervised sequence learning system, SAGE, that leverages the temporal and probabilistic dependence between alerts in a suffix-based probabilistic deterministic finite automaton (S-PDFA) -- a model that accentuates infrequent severe alerts and summarizes paths leading to them. AGs are then derived from the S-PDFA. Tested with intrusion alerts collected through Collegiate Penetration Testing Competition, SAGE produces AGs that reflect the strategies used by participating teams. The resulting AGs are succinct, interpretable, and enable analysts to derive actionable insights, e.g., attackers tend to follow shorter paths after they have discovered a longer one.
This paper proposes an intrusion detection and prediction system based on uncertain and imprecise inference networks and its implementation. Giving a historic of sessions, it is about proposing a method of supervised learning doubled of a classifier permitting to extract the necessary knowledge in order to identify the presence or not of an intrusion in a session and in the positive case to recognize its type and to predict the possible intrusions that will follow it. The proposed system takes into account the uncertainty and imprecision that can affect the statistical data of the historic. The systematic utilization of an unique probability distribution to represent this type of knowledge supposes a too rich subjective information and risk to be in part arbitrary. One of the first objectives of this work was therefore to permit the consistency between the manner of which we represent information and information which we really dispose.
We use Hidden Markov Models to motivate a quantitative compositional semantics for noninterference-based security with iteration, including a refinement- or implements relation that compares two programs with respect to their information leakage; and we propose a program algebra for source-level reasoning about such programs, in particular as a means of establishing that an implementation program leaks no more than its specification program. <p>This joins two themes: we extend our earlier work, having iteration but only qualitative, by making it quantitative; and we extend our earlier quantitative work by including iteration. <p>We advocate stepwise refinement and source-level program algebra, both as conceptual reasoning tools and as targets for automated assistance. A selection of algebraic laws is given to support this view in the case of quantitative noninterference; and it is demonstrated on a simple iterated password-guessing attack.
With the growing amount of cyber threats, the need for development of high-assurance cyber systems is becoming increasingly important. The objective of this paper is to address the challenges of modeling and detecting sophisticated network attacks, such as multiple interleaved attacks. We present the interleaving concept and investigate how interleaving multiple attacks can deceive intrusion detection systems. Using one of the important statistical machine learning (ML) techniques, Hidden Markov Models (HMM), we develop two architectures that take into account the stealth nature of the interleaving attacks, and that can detect and track the progress of these attacks. These architectures deploy a database of HMM templates of known attacks and exhibit varying performance and complexity. For performance evaluation, in the presence of multiple multi-stage attack scenarios, various metrics are proposed which include (1) attack risk probability, (2) detection error rate, and (3) the number of correctly detected stages. Extensive simulation experiments are used to demonstrate the efficacy of the proposed architectures.