Massey Documents by Type

Permanent URI for this community: https://mro.massey.ac.nz/handle/10179/294

Search Results

Now showing 1 - 5 of 5
  • Item
    Improving the robustness and privacy of HTTP cookie-based tracking systems within an affiliate marketing context : a thesis presented in fulfilment of the requirements for the degree of Doctor of Philosophy at Massey University, Albany, New Zealand
    (Massey University, 2021) Amarasekara, Bede Ravindra
    E-commerce activities provide a global reach for enterprises large and small. Third parties generate visitor traffic for a fee through affiliate marketing, search engine marketing, keyword bidding and organic search, amongst other channels. Improving the robustness of the underlying tracking and state management techniques is therefore a vital requirement for the growth and stability of e-commerce. In an inherently stateless ecosystem such as the Internet, HTTP cookies have been the de facto tracking vector for decades. In a previous study, the thesis author exposed circumstances under which cookie-based tracking systems can fail, some due to technical glitches, others due to manipulations made for monetary gain by fraudulent actors. Following a design science research paradigm, this research explores alternative tracking vectors discussed in previous research studies within a cross-domain tracking environment. It evaluates their efficacy in the current context and demonstrates how they can be used to improve the robustness of existing tracking techniques. Research outputs include methods, instantiations and a privacy model artefact based on the information-seeking behaviour of different categories of tracking software and their resulting privacy intrusion levels. This privacy model provides clarity and is useful for practitioners and regulators in creating regulatory frameworks that do not hinder technological advancement but instead curtail privacy-intrusive tracking practices on the Internet. The method artefacts are instantiated as functional prototypes, available publicly on the Internet, to demonstrate the efficacy and utility of the methods through live tests. The research contributes to the theoretical knowledge base through generalisation of empirical findings, and to industry through problem-solving design artefacts.
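    The cross-domain tracking examined in this thesis rests on the third-party HTTP cookie mechanics sketched below: an affiliate network drops a tracking cookie on a click-redirect endpoint and reads it back on a conversion postback. This is a minimal, hedged illustration assuming Flask and hypothetical endpoint, parameter and cookie names; it is not the thesis author's prototype.

    ```python
    # Generic sketch of affiliate click tracking via a third-party HTTP cookie.
    # Endpoint paths, parameter names and the cookie name are hypothetical.
    from flask import Flask, request, redirect
    import uuid

    app = Flask(__name__)

    @app.route("/click")
    def click():
        # Record the click and redirect the visitor to the merchant site.
        click_id = str(uuid.uuid4())
        merchant_url = request.args.get("to", "https://merchant.example/")
        resp = redirect(merchant_url)
        # Third-party tracking cookie: the vector whose robustness the thesis examines.
        resp.set_cookie("aff_click_id", click_id,
                        max_age=30 * 24 * 3600, secure=True, samesite="None")
        return resp

    @app.route("/conversion")
    def conversion():
        # Conversion postback: attribute the sale to the stored click, if the cookie survived.
        click_id = request.cookies.get("aff_click_id")
        return {"attributed": click_id is not None, "click_id": click_id}
    ```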
  • Item
    Building privacy-preservation models for distributed processing platforms : a thesis submitted in partial fulfilment of the requirements for the degree of Doctor of Philosophy (Ph.D.) in Computer Science, Massey University, New Zealand
    (Massey University, 2020) Bazai, Sibghat Ullah
    The widespread proliferation of data collection has raised serious privacy concerns in recent years. Data anonymization has been proposed as a technique for preserving the privacy of data. However, most existing data anonymization approaches were designed to work with small datasets in a single-machine environment and are thus often not suitable for big data. To overcome these limitations, many scalable data anonymization solutions that work with distributed processing platforms (e.g., MapReduce and Spark) have emerged to take advantage of the scalability and other support required for big data. However, due to the lack of inherent support for the algorithms involved in data anonymization techniques, these existing proposals often encounter implementation and performance bottlenecks. In the studies presented in this thesis, we propose a set of novel data anonymization approaches that work well on the most popular distributed processing platforms for big data, MapReduce and Spark. Our first set of studies addresses the privacy concerns of the MapReduce platform, which processes sensitive data without appropriate privacy protection and may therefore allow adversaries to break two important security principles: data confidentiality and integrity. Firstly, we propose a privacy-preservation platform as an extra layer on MapReduce that provides a set of privacy services to produce different sets of privacy-preserving anonymized datasets that can be safely processed by MapReduce. Secondly, we offer a privacy-preserving k-NN based classifier for MapReduce. Instead of working with plaintext, our k-NN classifier can work on anonymized datasets, protecting the privacy of the input data while still providing accurate classification results. In our second set of studies, we address Apache Spark's lack of appropriate support for many popular data anonymization techniques. We first investigate the types of support required by data anonymization approaches, which often demand multiple read and write operations. We argue that existing approaches fail to cache intermediate data in memory, which contributes to performance degradation. To address this problem, we propose a Resilient Distributed Dataset (RDD) based data anonymization model that avoids expensive disk I/O. We also argue that many existing methods do not support the iteration-intensive operations that appear in data anonymization techniques such as subtree generalization. We propose a generic approach for implementing subtree-based data anonymization techniques on Spark that provides more effective support for iteration-intensive operations. Extending from this, we also provide a novel hybrid approach that can more effectively apply different data anonymization techniques to multi-dimensional data. We argue that our hybrid approach offers much better control over expensive RDD creation and the size of the partitions attached to each RDD, and is therefore much better suited to reducing overheads such as re-computation, shuffle operations, message exchange, and cache management. The experimental studies confirm that our novel privacy-preserving models, implemented on both MapReduce and Spark, provide high performance and scalability while supporting high levels of data privacy and utility.
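    The RDD-based model described above hinges on keeping intermediate data in memory across repeated anonymization passes. The PySpark sketch below illustrates that idea with a toy k-anonymity-style generalisation of quasi-identifiers; the attribute names, records and k value are hypothetical, and this is an illustration of the general technique rather than the thesis models.

    ```python
    # Toy RDD-based anonymization step: generalise quasi-identifiers, cache the
    # intermediate RDD in memory, and release only equivalence classes of size >= K.
    from pyspark import SparkContext, StorageLevel

    sc = SparkContext(appName="rdd-anonymization-sketch")
    K = 2  # required equivalence-class size (hypothetical)

    records = sc.parallelize([
        ("1965", "2140", "F"), ("1965", "2140", "F"), ("1965", "2141", "M"),
        ("1972", "3015", "M"), ("1972", "3016", "F"),
    ])

    def generalise(rec):
        # Coarsen quasi-identifiers: birth year -> decade, postcode -> 2-digit prefix.
        year, postcode, sex = rec
        return (year[:3] + "*", postcode[:2] + "**", sex)

    # persist() keeps the intermediate data in memory, avoiding disk I/O on re-use.
    generalised = records.map(generalise).persist(StorageLevel.MEMORY_ONLY)

    counts = generalised.map(lambda r: (r, 1)).reduceByKey(lambda a, b: a + b)
    released = counts.filter(lambda kv: kv[1] >= K).keys().collect()
    print(released)
    sc.stop()
    ```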
  • Item
    RanDeter : using novel statistical and physical controls to deter ransomware attacks : a thesis presented in partial fulfillment of the requirements for the degree of Master of Information Sciences in Software Engineering at Massey University, Auckland, New Zealand
    (Massey University, 2018) McIntosh, Timothy Raymond
    Crypto-ransomware is a type of extortion-based malware that encrypts victims' personal files with strong encryption algorithms and blackmails victims into paying a ransom to recover their files. The recurrent episodes of high-profile ransomware attacks like WannaCry and Petya, particularly on healthcare, government agencies and large corporations, have highlighted the immediate demand for effective defense mechanisms. In this thesis, RANDETER is introduced as a novel anti-crypto-ransomware solution that deters ransomware activities using novel statistical and physical controls inspired by police anti-terrorism practice. Police maintain public safety by keeping a constant presence in key public areas, identifying suspects who exhibit out-of-the-ordinary characteristics, and restricting access to protected areas. Ransomware is in many ways like terrorism: attacks are unexpected, malicious and aim for the largest number of victims. Crypto-ransomware can be detected and deterred by maintaining constant surveillance on the potential victims, namely the MBR and user files, especially documents and photos. RANDETER is implemented as two compatible and complementary modules: PARTITION GUARD and FILE PATROL. PARTITION GUARD blocks modifications to the MBR area of the booting disk. FILE PATROL checks all file activities in directories protected by RANDETER against a list of Recognized Processes with Multi-Tier Security Rules. Upon detecting a violation of these rules, which FILE PATROL judges may have been initiated by crypto-ransomware, FILE PATROL freezes access to the monitored directories, terminates the offending processes, and then restores access to those directories. Our evaluation demonstrated that RANDETER ensured less, and often no, irrecoverable file damage from current ransomware families, while imposing lower disk performance overheads than existing anti-ransomware implementations like CRYPTOLOCK, SHIELDFS and REDEMPTION. In addition, RANDETER was shown to be resilient against masquerading attacks and ransomware polymorphism.
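    The statistical side of the surveillance described above can be pictured with a simple entropy check: files rewritten by crypto-ransomware tend toward near-random byte distributions. The sketch below is a generic illustration of that kind of control, assuming a hypothetical protected directory and threshold; it does not reproduce RANDETER's actual Multi-Tier Security Rules, process whitelisting or directory-freezing behaviour.

    ```python
    # Generic entropy-based check for a protected directory: flag files whose
    # contents look encrypted (close to 8 bits of entropy per byte).
    import math
    import os

    def shannon_entropy(data: bytes) -> float:
        if not data:
            return 0.0
        counts = [0] * 256
        for b in data:
            counts[b] += 1
        n = len(data)
        return -sum((c / n) * math.log2(c / n) for c in counts if c)

    def suspicious_files(protected_dir: str, threshold: float = 7.5):
        # Entropy near 8 bits/byte suggests encrypted or compressed content.
        flagged = []
        for root, _dirs, files in os.walk(protected_dir):
            for name in files:
                path = os.path.join(root, name)
                with open(path, "rb") as fh:
                    sample = fh.read(64 * 1024)  # sample the first 64 KiB
                if shannon_entropy(sample) >= threshold:
                    flagged.append(path)
        return flagged

    # A real deterrent would pair this with recognized-process rules, freezing
    # access to the directory and terminating the offending process.
    if __name__ == "__main__":
        print(suspicious_files(os.path.expanduser("~/Documents")))
    ```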
  • Item
    IPeMS : a Digital Rights Management framework for learning objects, 31 July 2006 : a thesis contributing to a Master of Information Science degree, Information Systems Department, Massey University, Palmerston North, New Zealand
    (Massey University, 2006) Hill, Margaret
    The Internet has long been acclaimed for providing a medium for easy sharing of ideas and collaboration, and has huge potential for academic and training organisations to share learning resources. However, there are no formal mechanisms for managing intellectual property (IP), and tensions remain today between the freedom to share and ownership of creativity. Theories around land property rights have contributed to the rights of IP as we know them today. Creating digital IP, however, is not physical labour like tilling the land: it does not preclude the owner from retaining a copy, and copying the IP does not make it scarcer or rivalrous to possess. Management of IP rights is about finding a balance between overzealous enforcement and 'free' use of IP. Protection of IP can be achieved by law and technology, and a mechanism for managing the use of digital learning objects would require a digital rights management (DRM) framework. The architecture of XML (eXtensible Markup Language) Web Services is emerging as a standardised approach to dynamic component connectivity and interoperability that relies on self-describing components and on open and emerging connectivity standards, including IP (Internet Protocol), SOAP (Simple Object Access Protocol), WSDL (Web Services Description Language) and UDDI (Universal Description, Discovery and Integration). XML Web Services technologies have great potential as the underlying technology for establishing a DRM framework for learning objects (LOs) on the Internet. An initial survey, with its findings endorsed by experts in Information and Communication Technologies (ICT) in education, identifies the components of an online contract that would license an educator to use LOs. A framework is proposed, and a prototype of an intellectual property electronic management system (IPeMS) is designed and developed. Web Services operations authenticate teachers and enable them to search for LOs. The teachers can view the permissions and constraints of use of the LOs, and can create a contract, with or without payment as the conditions dictate, that, once agreed to, licenses them to use one or more learning objects. A second evaluation survey completes the research study, giving feedback about IPeMS with respect to its application in an educational environment to license an educator to use digital LOs.
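    At the heart of the framework above is a contract that licenses a teacher to use learning objects subject to permissions, constraints and payment conditions. The sketch below models such a contract as a plain data structure with a licence check; the class and field names are hypothetical illustrations, not the actual IPeMS design or its Web Services interface.

    ```python
    # Hypothetical model of an online licence contract for learning objects (LOs).
    from dataclasses import dataclass
    from datetime import date
    from typing import List

    @dataclass
    class LearningObject:
        lo_id: str
        title: str
        permissions: List[str]          # e.g. ["display", "print"]
        requires_payment: bool = False

    @dataclass
    class Contract:
        teacher_id: str
        learning_objects: List[LearningObject]
        agreed_on: date
        expires_on: date
        paid: bool = False

        def licenses(self, lo_id: str, use: str, today: date) -> bool:
            # A use is licensed only within the contract period, with payment
            # settled where the LO's conditions require it.
            if not (self.agreed_on <= today <= self.expires_on):
                return False
            for lo in self.learning_objects:
                if lo.lo_id == lo_id and use in lo.permissions:
                    return self.paid or not lo.requires_payment
            return False
    ```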
  • Item
    Realism in synthetic data generation : a thesis presented in fulfilment of the requirements for the degree of Master of Philosophy in Science, School of Engineering and Advanced Technology, Massey University, Palmerston North, New Zealand
    (Massey University, 2017) McLachlan, Scott
    There are many situations where researchers cannot make use of real data, either because the data does not exist in the required format or because privacy and confidentiality concerns prevent its release. The work presented in this thesis has been undertaken in the context of security and privacy for the Electronic Healthcare Record (EHR). In these situations, synthetic data generation (SDG) methods are sought to create a replacement for real data. To be a proper replacement, the synthetic data must be realistic, yet no method currently exists to develop and validate realism in a unified way. This thesis investigates the problem of characterising, achieving and validating realism in synthetic data generation. A comprehensive domain analysis provides the basis for new characterisation and classification methods for synthetic data, as well as for a previously undescribed but consistently applied generic SDG approach. To achieve realism, an existing knowledge discovery in databases approach is extended to discover realistic elements inherent in real data. This approach is validated through a case study, which demonstrates the realism characterisation and validation approaches and establishes whether or not the synthetic data is a realistic replacement. This thesis presents the ATEN framework, which incorporates three primary contributions: (1) the THOTH approach to SDG; (2) the RA approach to characterising the elements and qualities of realism for use in SDG; and (3) the HORUS approach to validating realism in synthetic data. The ATEN framework is significant in that it allows researchers to substantiate claims of success and realism in their synthetic data generation projects. The THOTH approach is significant in providing a new structured way of engaging in SDG. The RA approach is significant in enabling a researcher to discover and specify the realism characteristics that must be achieved synthetically. The HORUS approach is significant in providing a new, practical and systematic validation method for substantiating and justifying claims of success and realism in SDG work. Future efforts will focus on further validation of the ATEN framework through a controlled multi-stream synthetic data generation process.
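    The generic SDG pipeline characterised above, which discovers realistic elements in real data, generates synthetic data, and then validates realism, can be pictured with a deliberately simple sketch: fit a marginal from real values, sample synthetic ones, and compare the marginals. The values and the single-attribute model are hypothetical; this is a generic illustration, not the THOTH, RA or HORUS approaches.

    ```python
    # Toy single-attribute SDG step with a crude realism check on the marginals.
    import random
    import statistics

    real_ages = [34, 51, 29, 62, 45, 38, 57, 41, 49, 33]   # stand-in for an EHR attribute

    # "Knowledge discovery" step: characterise the real data.
    mu, sigma = statistics.mean(real_ages), statistics.stdev(real_ages)

    # Generation step: sample synthetic values from the fitted model.
    synthetic_ages = [round(random.gauss(mu, sigma)) for _ in range(1000)]

    # Validation step: does the synthetic marginal resemble the real one?
    synth_mu = statistics.mean(synthetic_ages)
    synth_sigma = statistics.stdev(synthetic_ages)
    print(f"real mean/sd: {mu:.1f}/{sigma:.1f}, "
          f"synthetic mean/sd: {synth_mu:.1f}/{synth_sigma:.1f}")
    ```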