Few-shot learning for malware detection : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Computer Science at Massey University, Auckland, New Zealand

Zhu, Jinting

Files

ZhuPhDThesis.pdf (6.64 MB)

Date

2024-01-29

Authors

Zhu, Jinting

Publisher

Massey University

Rights

The Author

Abstract

The volume of malware is growing as electronic devices proliferate, a problem exacerbated by malware’s diversity and uniqueness. Under these circumstances, efficient detection using artificial intelligence (AI) has recently emerged and drawn researchers' attention to the field. At present, AI models trained on large-scale data can effectively detect known types of malicious attacks. However, malware differs from the natural images that other AI models are trained on. In particular, zero-day attacks are scarce in number and frequently updated, and they are generally packaged with obfuscation techniques to avoid detection. This thesis presents several novel approaches based on advanced AI techniques to overcome these challenges. Our first study investigates a few-shot learning model for malware detection with scarce data, using a Siamese Neural Network (SNN) based on a metric space. The model addresses the tendency to overfit during few-shot training: the feature embedding space is optimized with a binary cross-entropy loss as the objective function to improve detection accuracy. We then explore the specificity between malware samples in the presence of obfuscation techniques that affect the malware signature, and propose a novel Task-Aware Meta-Learning-based Siamese Neural Network that generates task-specific weights based on entropy values. With weights that contribute to the different classes, this model efficiently captures the unique signatures of different malware families. Building on this initial success in few-shot learning for malware detection, we take into account the characteristics of malicious signatures in entropy patterns. We first propose a model that uses entropy features obtained directly from binary ransomware files to retain more fine-grained features associated with different ransomware signatures. Benefiting from these robust features, a pre-trained network (e.g., VGG-16) combined with an SNN boosts feature representation along the frequency of malware signatures and achieves results competitive with traditional deep learning methods applied to malware detection. Next, we propose a triage approach using a Task Memory based on the Meta-Transfer Learning framework, which quantifies the malware threat level within the few-shot learning mechanism to prioritize different classes and can also flag suspicious software for human decision-making. Finally, we propose a novel SNN that replaces distance scores with relation-aware embeddings, which output better similarity probabilities based on semantics across different malware samples. With entropy images as inputs, the proposed model captures better structural information and subtle differences in malware signatures despite the noise introduced by different obfuscation techniques.

Keywords

cyber security, machine learning, deep learning, Siamese Neural Network, malware detection

URI

https://mro.massey.ac.nz/handle/10179/69416

Collections

Theses and Dissertations
