Machine Learning based Malicious Payload Identification in Software-Defined Networking


Abstract in English

Deep packet inspection (DPI) has been extensively investigated in software-defined networking (SDN) because sophisticated attacks can inject malicious payloads into packets in ways that are hard to detect. Existing proprietary pattern-based or port-based third-party DPI tools can struggle to process large volumes of data traffic efficiently. In this paper, a novel OpenFlow-enabled deep packet inspection (OFDPI) approach is proposed based on the SDN paradigm to provide adaptive and efficient packet inspection. First, OFDPI performs early detection at flow-level granularity by checking the IP addresses of each new flow via the OpenFlow protocol. Then, OFDPI performs deep packet inspection at packet-level granularity: (i) for unencrypted packets, OFDPI extracts features of the accessible payloads, including tri-gram frequencies weighted by Term Frequency and Inverse Document Frequency (TF-IDF) and linguistic features. These features are concatenated into a sparse matrix representation and used to train a binary logistic regression classifier, rather than being matched against specific pattern combinations. To balance detection accuracy against the performance bottleneck of the SDN controller, OFDPI introduces an adaptive packet sampling window based on linear prediction; and (ii) for encrypted packets, OFDPI extracts notable features of the packets and trains a binary decision tree classifier, rather than decrypting the traffic, which would weaken user privacy. A prototype of OFDPI is implemented on the Ryu SDN controller and the Mininet platform. The performance and overhead of the proposed solution are assessed experimentally using real-world datasets. The numerical results indicate that OFDPI can provide a significant improvement in detection accuracy with acceptable overhead.
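
The tri-gram TF-IDF feature extraction described for unencrypted payloads can be illustrated with a minimal sketch. It assumes character-level tri-grams and a smoothed IDF term; the abstract specifies neither, and the function names `char_trigrams` and `tfidf_vectors` are hypothetical, not taken from the OFDPI prototype. The resulting sparse vectors are the kind of input a logistic regression classifier could be trained on.

```python
import math
from collections import Counter


def char_trigrams(payload):
    """Return the overlapping character tri-grams of a payload string."""
    return [payload[i:i + 3] for i in range(len(payload) - 2)]


def tfidf_vectors(payloads):
    """Map each payload to a sparse {tri-gram: TF-IDF weight} dictionary.

    Assumption: a smoothed IDF, log((1 + N) / (1 + df)) + 1, is used so that
    tri-grams present in every payload still get a non-zero weight.
    """
    docs = [Counter(char_trigrams(p)) for p in payloads]
    n_docs = len(docs)

    # Document frequency: number of payloads containing each tri-gram.
    df = Counter()
    for counts in docs:
        df.update(counts.keys())

    vectors = []
    for counts in docs:
        total = sum(counts.values())
        vec = {}
        for gram, count in counts.items():
            tf = count / total  # relative frequency within this payload
            idf = math.log((1 + n_docs) / (1 + df[gram])) + 1.0
            vec[gram] = tf * idf
        vectors.append(vec)
    return vectors


if __name__ == "__main__":
    samples = ["GET /index.html", "GET /etc/passwd", "' OR 1=1 --"]
    for payload, vec in zip(samples, tfidf_vectors(samples)):
        top = sorted(vec.items(), key=lambda kv: -kv[1])[:3]
        print(repr(payload), top)
```

In this sketch, tri-grams shared by many payloads (such as `GET`) receive a lower weight than tri-grams that appear in only one payload, which is the property that makes TF-IDF useful for separating unusual payload content from common protocol text.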
