Detection and classification of malicious network streams in honeynets : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Palmerston North, New Zealand
Date
2013
Publisher
Massey University
Rights
The Author
Abstract
Variants of malware and exploits are emerging on the global canvas at an ever-increasing
rate. There is a need to automate their detection by observing their malicious footprints
over network streams. Misuse-based intrusion detection systems alone cannot cope with
the dynamic nature of the security threats faced today by organizations globally, nor
can anomaly-based systems and models that rely solely on packet header information,
without considering the payload or content.
In this thesis we approach intrusion detection as a classification problem and describe
a system using exemplar-based learning to correctly classify known classes of malware
and their variants, using supervised learning techniques, and detect novel or unknown
classes using unsupervised learning techniques. This is facilitated by an exemplar selection
algorithm that selects the most suitable exemplars and their thresholds for any
given class, and by novelty detection and classification algorithms that are able
to detect, learn and classify unknown malicious streams into their respective novel
classes. The similarity between malicious network streams is determined by a proposed
technique that uses string and information-theoretic metrics to evaluate the relative
similarity or level of maliciousness between different categories of malicious network
streams. This is measured by quantifying sections of analogous information or entropy
between incoming network streams and reference malicious samples. Honeynets are
deployed to capture these malicious streams and create labelled datasets. Clustering
and classification methods are used to cluster similar groups of streams from the
datasets. This technique is then evaluated using a large dataset and the correctness
of the classifier is verified by using "area under the receiver operating characteristic
curves" (ROC AUC) measures across various string metric-based classifiers. Different
clustering algorithms are also compared and evaluated on a large dataset.
The outcomes of this research can be applied to aid existing intrusion detection systems
(IDS) to detect and classify known and unknown malicious network streams by utilizing
information-theoretic and machine learning based approaches.
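
As a rough illustration of the exemplar-and-threshold idea summarised above, the sketch below scores an incoming stream against labelled reference payloads and treats anything outside every threshold as a candidate novel class. It is only a sketch under stated assumptions: the normalized compression distance (NCD) is used here as a stand-in information-theoretic measure, not as the metric proposed in the thesis, and the names `ncd`, `classify`, the exemplar tuple layout and the threshold values are all hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, a crude proxy for information content."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar inputs, near 1 for unrelated ones.

    Assumption: NCD stands in for the thesis's own similarity measure.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def classify(stream: bytes, exemplars):
    """Return the label of the closest exemplar whose threshold admits the stream.

    `exemplars` is a list of (label, payload, threshold) tuples (hypothetical layout).
    Returns None when no exemplar is close enough, i.e. a candidate novel class.
    """
    best_label, best_dist = None, float("inf")
    for label, payload, threshold in exemplars:
        dist = ncd(stream, payload)
        if dist <= threshold and dist < best_dist:
            best_label, best_dist = label, dist
    return best_label
```

A stream that returns `None` would be handed to the unsupervised stage (for example, clustered with other unmatched streams) rather than discarded. In practice the per-class thresholds would have to be selected from labelled honeynet data, not fixed by hand as in this sketch.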
Keywords
Malware (Computer software), Prevention, Intrusion detection systems, Computer security