Few-shot learning for malware detection : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Auckland, New Zealand

Massey University
The Author
The volume of malware is growing as electronic devices proliferate, a problem exacerbated by malware’s diversity and uniqueness. Against this background, detection based on artificial intelligence (AI) has recently emerged and attracted research attention. Current AI models trained on large-scale data can effectively detect known types of malicious attacks. However, malware differs from the natural images on which many other AI models are trained. In particular, zero-day attacks are scarce in number and frequently updated, and they are generally packaged with obfuscation techniques to avoid detection. This thesis presents several novel approaches based on advanced AI techniques to overcome these challenges. Our first study investigates few-shot learning for malware detection with scarce data, using a Siamese Neural Network (SNN) that operates in a metric space. The model addresses the tendency to overfit during few-shot training by optimizing the feature embedding space with a binary cross-entropy loss as the objective function, which improves detection accuracy. We then explore the differences among malware samples when obfuscation techniques alter the malware signature, and propose a novel Task-Aware Meta-Learning-based Siamese Neural Network that generates task-specific weights based on entropy values. With weights specific to the different classes, this model efficiently captures the unique signatures of different malware families. Building on this initial success in few-shot learning for malware detection, we take into account the characteristics of malicious signatures in entropy patterns. We first propose a model that uses entropy features obtained directly from binary ransomware files to retain more fine-grained features associated with different ransomware signatures.
Benefiting from these robust features, a pre-trained network (e.g., VGG-16) combined with an SNN boosts feature representation along the frequency of malware signatures and achieves competitive results compared with traditional deep learning methods for malware detection. Next, we propose a triage approach using a Task Memory based on the Meta-Transfer Learning framework, which quantifies the malware threat level within the few-shot learning mechanism to prioritize different classes and can also flag suspicious software for human decision-making. Finally, we propose a novel SNN that replaces distance scores with relation-aware embeddings, which output better similarity probabilities based on semantics across different malware samples. With entropy images as inputs, the proposed model captures better structural information and subtle differences in malware signatures despite the noise introduced by different obfuscation techniques.
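The entropy features mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the thesis computes them, so everything below is an assumption made for illustration: the function names `shannon_entropy` and `entropy_profile`, the 256-byte window, and the use of non-overlapping windows are hypothetical choices, not the author's method. The sketch computes the Shannon entropy of each window of a binary file's bytes; a sequence of such values is one plausible starting point for an entropy image.

```python
import math
from collections import Counter


def shannon_entropy(chunk: bytes) -> float:
    """Shannon entropy of a byte string in bits per byte (range 0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)  # occurrences of each byte value
    # H = sum over byte values of p * log2(1 / p), with p = n / total
    return sum((n / total) * math.log2(total / n) for n in counts.values())


def entropy_profile(data: bytes, window: int = 256) -> list[float]:
    """Entropy of each consecutive, non-overlapping window of `data`.

    The window size is an illustrative assumption, not a value from the thesis.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        shannon_entropy(data[i:i + window])
        for i in range(0, len(data), window)
    ]


if __name__ == "__main__":
    # Synthetic input: one window of uniformly distributed bytes followed by
    # one window of constant bytes (no real malware sample is used here).
    sample = bytes(range(256)) + b"\x00" * 256
    print(entropy_profile(sample))
```

Packed or encrypted regions of a file tend to have entropy close to 8 bits per byte, while padding and plain text score much lower, which is why a per-window profile of this kind can expose structure that obfuscation leaves behind.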
cyber security, machine learning, deep learning, Siamese Neural Network, malware detection