Massey Documents by Type

Permanent URI for this community: https://mro.massey.ac.nz/handle/10179/294

Search Results

Now showing 1 - 10 of 15
  • Item
    Semantic integrity in data warehousing : a framework for understanding : a thesis presented in partial fulfilment of the requirements for the degree of Masters of Business Studies in Information Systems at Massey University, Palmerston North, New Zealand
    (Massey University, 2002) Sampson, Jennifer Jane
    Data modelling has gathered increasing attention from data warehouse developers as they come to realise that important implementation decisions, such as data integrity, performance and metadata management, depend on the quality of the underlying data model. Not all organisations model their data, but where they do, Entity-Relationship (E-R) modelling, or more correctly relational modelling, has been widely used. An alternative, dimensional modelling, has been gaining acceptance in recent years and has been adopted by many practitioners. Consequently, there is much debate over which form of modelling is the most appropriate and effective. However, the dimensional model is in fact based on the relational model, and the two models are not so different that a debate is necessary. Perhaps the real focus should be on how to abstract meaning out of the data model. This research explores the importance of semantic integrity during data warehouse design and its impact on the successful use of the implemented warehouse. This has been achieved through a detailed case study, from which a conceptual framework for describing semantic integrity has been developed. The purpose of the framework is to provide a theoretical basis for explaining how a data model is interpreted through the meaning levels of understanding, connotation and generation, and also how a data model is created from an existing meaning structure by intention, generation and action. The result of this exploration is the recognition that implementing a data warehouse may not, in itself, provide a detailed understanding of the warehouse's semantic content. (A brief sketch of a star schema expressed as plain relational tables appears after this listing.)
  • Item
    Functional dependencies for XML : axiomatisation and normal form in the presence of frequencies and identifiers : a thesis presented in partial fulfilment of the requirements for the degree of Master of Sciences in Information Sciences at Massey University, Palmerston North, New Zealand
    (Massey University, 2005) Trinh, Diem-Thu
    XML has gained popularity as a markup language for publishing and exchanging data on the web. Nowadays, there is also ongoing interest in using XML for representing and actually storing data. In particular, much effort has been directed towards turning XML into a real data model by improving the semantics that can be expressed about XML documents. Various works have addressed the definition of different classes of integrity constraints and the development of a normalisation theory for XML. One area which received little to no attention from the research community up to five years ago is the study of functional dependencies in the context of XML [37]. Since then, there has been increasing research into functional dependencies in XML. Nevertheless, a comprehensive dependency theory and normalisation theory for XML have yet to emerge. Functional dependencies are an integral part of database theory in the relational data model (RDM). In particular, functional dependencies have been vital in the investigation of how to design "good" relational database schemas which avoid or minimise problems relating to data redundancy and data inconsistency. Since the same problems can be shown to exist in poorly designed XML schemas, there is a need to investigate how these problems can be eliminated in the context of XML. We believe that the study of an analogue of relational functional dependencies in the context of XML is equally significant for designing "good" XML schemas. [FROM INTRODUCTION] (A brief sketch of a relational functional dependency and the redundancy it signals appears after this listing.)
  • Item
    A database with enterprise application for mining astronomical data obtained by MOA : a thesis submitted in partial fulfilment of the requirements for the degree of the Master of Information Science in Computer Science, Massey University at Albany, Auckland, New Zealand
    (Massey University, 2007) Xu, Huawei
    The MOA (Microlensing Observations in Astrophysics) Project is one of a new generation of modern astronomy endeavours that generates huge volumes of data. These data have enormous scientific data mining potential. However, it is common for astronomers to deal with millions and even billions of records, and the challenge of managing these large data sets is an important issue for researchers. A good database management system is vital for the research. With the modern observation equipment in use, MOA faces a growing volume of data, and a database management solution is needed. This study analysed modern database and enterprise application technology. After analysing the data mining requirements of MOA, a prototype data management system based on the MVC pattern was developed. Furthermore, the application supports sharing MOA findings and scientific data on the Internet. It was tested on a 7 GB subset of the archived MOA data set. After testing, it was found that the application could query data efficiently and support data mining.
  • Item
    Applying knowledge management in education : teaching database normalization : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Information Science at Massey University
    (Massey University, 2004) Zhang, Lei
    In tertiary education, Information Science has been attracting more attention in both teaching and learning. However, in courses on database design theory, learners often find it hard to grasp database normalisation and to apply the different normal forms when designing a database. This results in poor database construction and difficulties in database maintenance. In regard to this teaching and learning dilemma, academic teaching staff should, on the one hand, pay more attention to organising different teaching resources on database normalisation concepts and making the best use of existing and newly developed resources so as to make the teaching environment more adaptive and more sharable, and on the other hand apply different teaching methods to different students according to their knowledge levels, by understanding the nature of each learner's behaviour, interests and preferences concerning the existing learning resources. However, at present there is no effective Information Technology tool that takes into account the dynamic nature of knowledge discovery, creation, transfer, utilisation and reuse in this area. This provides an opportunity to examine the potential of applying knowledge management in education, with a focus on teaching database normalisation, in terms of knowledge discovery, sharing, utilisation and reuse. This thesis contains a review of knowledge management and web mining technologies in the education environment, presents a dynamic knowledge management framework for better utilising teaching resources in the area of database normalisation, and diagnoses students' learning patterns and behaviours to assist effective teaching and learning. It is argued that knowledge management-supported education can work as a value-added process which supports the different needs of teachers and learners. (A brief sketch of a third-normal-form decomposition appears after this listing.)
  • Item
    Associative access in persistent object stores : a thesis presented in partial fulfilment of the requirements for the degree of Master of Information Sciences in Information Systems at Massey University
    (Massey University, 2004) Nusdin, Weena
    The overall aim of the thesis is to study associative access in a Persistent Object Store (POS) providing the necessary object storage and retrieval capabilities to an Object Oriented Database System (OODBS) (Delis, Kanitkar & Kollios, 1998, cited in Kirchberg & Tretiakov, 2002). Associative access in an OODBS often includes navigational access to referenced or referencing objects of the object being accessed (Kim, Kim & Dale, 1989). The thesis reviews several existing approaches proposed to support associative and navigational access in an OODBS. It was found that the existing approaches proposed for associative access could not perform well when queries involve multiple paths or inheritance hierarchies. The thesis studies how associative access can be supported in a POS regardless of the paths or inheritance hierarchies involved in a query. The thesis proposes extensions to a model of a POS such that approaches proposed for navigational access can be used to support associative access in the extended POS. The extensions include (1) approaches to cluster storage objects in a POS by their storage classes or attribute values, and (2) approaches to distinguish references between storage objects in a POS based on criteria such as reference type (inheritance or association), the storage classes of the referenced or referencing storage objects, and reference names. The thesis implements the Matrix-Index Coding (MIC) approach with the extended POS using several coding techniques. The implementation demonstrates that (1) a model of a POS extended by the proposed extensions is capable of supporting associative access in an OODBS, and (2) the MIC implemented with the extended POS can support a query that requires associative access in an OODBS and involves multiple paths or inheritance hierarchies. The implementation also provides proof of the concepts suggested by Kirchberg & Tretiakov (2002) that (1) the MIC can be made independent of the coding technique, and (2) data compression techniques should be considered as appropriate alternatives for implementing the MIC because they can reduce the storage size required. (A brief sketch contrasting navigational and associative access appears after this listing.)
  • Item
    Facilitating evolution in relational database design : a procedure to evaluate and refine novice database designers' schemata : a thesis presented in partial fulfilment of the requirements for the degree of Master of Business Studies in Information Systems at Massey University
    (Massey University, 1996) Ryder, Michael Robert
    Relational database management systems (RDBMSs) have become widely used by many industries in recent years. Latterly these systems have begun to expand their market by becoming readily available at minimal cost to most users of modern computing technology. The quality of applications developed from RDBMSs, however, is largely dependent upon the quality of the underlying schema. This research looks at the area of schema design, and in particular at schemata designed by people who have a minimal understanding of relational concepts. It uses a survey and case studies to help define some of the issues involved in the area. A procedure to modify existing schemata is described, and the schema from one of the case studies is used to apply the schema re-design procedure to a real database design. The results are compared to the original schema as well as to a schema designed using a conventional application of the NIAM analysis and design methodology. The research supports the hypothesis that database applications based on schemata designed by lay-persons are currently being used to support business data management requirements. The utility, reliability and longevity of these applications depend to some extent on the quality of the underlying schema and its ability to store the required data and maintain that data's integrity. The application of the schema re-design procedure presented in this thesis reveals refinements to the original schema and provides a method for lay-persons to evaluate and improve existing database designs. A number of issues and questions related to the focus of this research are raised and, although outside the scope of the research, are noted as suggestions for further work.
  • Item
    Developing a courseware database for the AudioGraph : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science at Massey University
    (Massey University, 2000) Pan, Jun
    The goal of this project is to investigate and prototype a database-driven server for the editing and delivery of multimedia courseware. The project required the analysis, design and construction of a client/server based, distributed educational system. The components of the project are a relational database server, with a particular database schema that can be downloaded or distributed with an existing project, and the AudioGraph. The AudioGraph is a multimedia application for publishing university lectures, tutorials or training material on the Web. The front-end interface is a Java application that lets lecturers and students interact with the database. The system can be used to keep track of the various stages of courseware development and web publishing. The overall aim was a flexible and adaptive system that maintains the current lecture development environment. The system may be deployed on Windows NT, Unix and Macintosh platforms, and so is portable, extensible and platform-independent. The background and technology employed in the project are introduced, and each stage of the project is explained in terms of the development lifecycle of the system. A limitation imposed by multi-platform compatibility is discussed, and the outcome is illustrated with screenshots. Throughout the report, the file structure, runtime environment, inter-process communication, user interface and server access are explained.
  • Item
    Distribution design in object oriented databases : a thesis presented in partial fulfilment of the requirements for the degree of Master of Information Science in Information Systems
    (Massey University, 2003) Ma, Hui
    The advanced development of object oriented database systems has attracted much research; however, very little of it contributes to the distribution design of object oriented databases. The main tasks of distribution design are fragmenting the database schema and allocating the fragments to different sites of a network. The aim of fragmentation and allocation is to improve the performance and increase the availability of a database system. Even though much research has been done on distributed databases, that research almost always refers to the relational data model (RDM); very few efforts provide distribution design techniques for distributed object oriented databases. The aim of this work is to generalise distribution design techniques from relational databases to object oriented databases. First, the characteristics of distributed databases in general and the techniques used for fragmentation and allocation in the RDM are reviewed. Then, fragmentation operations for a rather generic object oriented data model (OODM) are developed. As with the RDM, these operations include horizontal and vertical fragmentation; a third operation, named splitting, is also introduced for the OODM. Finally, normal predicates are introduced for the OODM, and a heuristic procedure for horizontal fragmentation of OODBs is presented. The adaptation of horizontal fragmentation techniques from relational databases to object oriented databases is the main result of this work. (A brief sketch of horizontal and vertical fragmentation appears after this listing.)
  • Item
    Conceptual data modelling for geographical information systems : a thesis presented in partial fulfilment of the requirements for the degree of Master of Arts (Information Systems) at Massey University
    (Massey University, 1994) Bekesi, Erzsebet
    This thesis sets out to find an answer to the question: does an appropriate conceptual data model exist for practitioners of Geographical Information Systems (GIS) database design? It aims to investigate and answer the question by:
    • Finding a workable data model to solve a database design problem (Manawatu-Wanganui Regional Council, Palmerston North, Natural Resources Management, Groundwater Section database).
    • Analysing the user's data requirements and producing a feasible conceptual schema.
    Usage of Geographical Information Systems applications is a recognised need in a growing number of organisations in New Zealand, but many factors stand in the way of this relatively new technology. One of these factors is the lack of well-designed databases to support the data needs of these non-traditional applications. One school of thought adopts general data modelling techniques for every database design problem, while another group of researchers suggests that specialised data models are necessary to model data in various problem domains. This thesis summarises the "specialities" pertaining to the GIS database domain. The most important are the special data needs of GIS applications and the problem of the placement of spatial data models in the traditional taxonomy of database models. It chooses the objectives of conceptual data modelling as the evaluation criteria which the selected data model must satisfy, i.e. to model reality and to form the basis for database schema design. This thesis reviews a group of published papers, selected from proponents of the entity-relationship and the object-oriented data modelling paradigms and the applications of these data modelling techniques in a spatial context. It compares various extensions to the original entity relationship model, and a comparison of the main data modelling paradigms is included. Data modelling shortcomings encountered in the literature are also summarised. The literature review concludes that not appreciating the conceptual data modelling objectives leads to unsatisfactory conceptual database design. The selected data model, the spatially extended entity relationship (SEER) model, is described and applied to the database design problem of a local authority to produce conceptual schemas. Findings are summarised and issues for future research are identified. The conclusions reached are that further evaluative work on the applied SEER model would be useful, and that clear directions showing the guiding principles of conceptual data modelling in a spatial context are essential for practitioners.
  • Item
    Every picture tells a story : an investigation of data models as tools of communication : a thesis presented in partial fulfillment of the requirements for the degree of Master of Information Sciences at Massey University
    (Massey University, 2000) Somrutai, Malaipong
    Data models are important in information system (IS) development, particularly as tools for expressing and communicating business information requirements. The ability to understand the information content of data models is a fundamental skill required by anyone involved with them. The aim of data modelling is usually the creation of a database design, but unless everyone has a clear understanding of what the data model 'says', the quality of the design may suffer. When we build a model, we obviously want it to be understood, but that understanding depends on the ability of the model to communicate its meaning: the better the model is as a vehicle of communication, the clearer the understanding will be. This research report explores this important aspect of data models and investigates the research that has been undertaken in this area, including the NaLER (Natural Language for E-R) technique for reading data models. It describes an experiment conducted to explore this aspect. Subjects were tested on their ability to accurately and comprehensively interpret, or 'read', a data model both before and after learning the NaLER technique. Measurement was carried out using a questionnaire consisting of two types of questions. The results show that when the subjects used NaLER, they improved their scores on the difficult questions but not on the simple and medium questions. In addition, the results show that after learning NaLER, subjects' confidence in their ability to understand a model increased, even though they actually scored less well overall. (A brief sketch of reading a data model as natural-language sentences appears after this listing.)
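
Illustrative sketches

The short Python sketches below illustrate concepts mentioned in several of the items above; none of the code, data or names is taken from the theses themselves.

For Sampson (2002): the abstract notes that the dimensional model is in fact based on the relational model. A minimal sketch, with invented table and column names, of a "star schema" written as ordinary relational tables, where one fact table's foreign keys reference the dimension tables:

    import sqlite3

    # A dimensional "star schema" expressed as ordinary relational tables:
    # dimension tables hold descriptive attributes, the fact table holds
    # measures plus foreign keys into the dimensions.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE dim_date    (date_id INTEGER PRIMARY KEY, calendar_date TEXT);
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
        CREATE TABLE fact_sales (
            date_id    INTEGER REFERENCES dim_date(date_id),
            product_id INTEGER REFERENCES dim_product(product_id),
            quantity   INTEGER,
            amount     REAL
        );
        INSERT INTO dim_date    VALUES (1, '2002-03-01');
        INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Print');
        INSERT INTO fact_sales  VALUES (1, 1, 3, 29.85), (1, 2, 1, 9.95);
    """)
    # A typical dimensional query is still plain relational SQL: join the fact
    # table to a dimension and aggregate the measure.
    print(conn.execute("""
        SELECT p.category, SUM(f.amount)
        FROM fact_sales f JOIN dim_product p ON f.product_id = p.product_id
        GROUP BY p.category
    """).fetchall())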
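
For Trinh (2005): the abstract appeals to the role of relational functional dependencies in signalling redundancy. A minimal sketch, with an invented relation and attributes, of a dependency that holds in a small relation and the repetition it causes:

    # In enrolment(student_id, student_name, course), student_id -> student_name holds,
    # so student_name is repeated for every course a student takes.
    enrolment = [
        {"student_id": 1, "student_name": "Ann", "course": "157.221"},
        {"student_id": 1, "student_name": "Ann", "course": "157.331"},
        {"student_id": 2, "student_name": "Ben", "course": "157.221"},
    ]

    def fd_holds(rows, lhs, rhs):
        """Check whether the functional dependency lhs -> rhs holds in rows."""
        seen = {}
        for r in rows:
            key = tuple(r[a] for a in lhs)
            val = tuple(r[a] for a in rhs)
            if seen.setdefault(key, val) != val:
                return False
        return True

    print(fd_holds(enrolment, ["student_id"], ["student_name"]))  # True: the FD holds
    # Because the FD holds but student_id is not a key of this relation, "Ann" is stored
    # twice; decomposing into student(student_id, student_name) and takes(student_id, course)
    # removes the redundancy. Poorly designed XML schemas can exhibit the same repetition.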
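
For Zhang (2004): a teaching-style sketch, with invented data, of the kind of normalisation step learners find hard to grasp: a transitive dependency is removed by decomposing a relation into third normal form.

    # order(order_id, customer_id, customer_city): order_id -> customer_id -> customer_city,
    # so customer_city depends transitively on order_id and is repeated per order.
    orders = [
        {"order_id": 10, "customer_id": "C1", "customer_city": "Palmerston North"},
        {"order_id": 11, "customer_id": "C1", "customer_city": "Palmerston North"},
        {"order_id": 12, "customer_id": "C2", "customer_city": "Auckland"},
    ]

    def project(rows, attrs):
        """Project rows onto attrs, dropping duplicate tuples (relational projection)."""
        seen, out = set(), []
        for r in rows:
            t = tuple(r[a] for a in attrs)
            if t not in seen:
                seen.add(t)
                out.append(dict(zip(attrs, t)))
        return out

    # 3NF decomposition: one relation per determinant.
    order_3nf    = project(orders, ["order_id", "customer_id"])
    customer_3nf = project(orders, ["customer_id", "customer_city"])
    print(order_3nf)
    print(customer_3nf)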
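
For Nusdin (2004): a generic contrast of navigational and associative access, not the thesis's Matrix-Index Coding approach or its POS extensions; the classes and data are invented.

    class Department:
        def __init__(self, name):
            self.name = name

    class Employee:
        def __init__(self, name, salary, dept):
            # dept is an object reference, as in an object oriented database
            self.name, self.salary, self.dept = name, salary, dept

    hr, it = Department("HR"), Department("IT")
    store = [Employee("Ann", 70000, hr), Employee("Ben", 55000, it), Employee("Cy", 80000, it)]

    # Navigational access: follow references from an object already in hand.
    print(store[0].dept.name)  # from an Employee to its Department

    # Associative access: retrieve objects by attribute value. In a persistent object
    # store this is typically supported by clustering objects by class or attribute,
    # or by an index, here sketched as a plain dictionary keyed on department name.
    by_dept = {}
    for e in store:
        by_dept.setdefault(e.dept.name, []).append(e.name)
    print(by_dept["IT"])  # ['Ben', 'Cy'] without navigating from any particular object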
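
For Ma (2003): a relational sketch, with an invented relation and predicates, of the two classic fragmentation operations the thesis generalises to object oriented databases: horizontal fragmentation by a selection predicate and vertical fragmentation by projection onto attribute subsets that share the key.

    customers = [
        {"id": 1, "name": "Ann", "region": "North", "credit": 500},
        {"id": 2, "name": "Ben", "region": "South", "credit": 900},
        {"id": 3, "name": "Cy",  "region": "North", "credit": 300},
    ]

    # Horizontal fragmentation: split the rows by a predicate (here, by region),
    # so each site stores the tuples it queries most often.
    north = [r for r in customers if r["region"] == "North"]
    south = [r for r in customers if r["region"] == "South"]

    # Vertical fragmentation: split the columns, repeating the key in every fragment
    # so the original relation can be reconstructed by a join on "id".
    names   = [{"id": r["id"], "name": r["name"]} for r in customers]
    credits = [{"id": r["id"], "credit": r["credit"]} for r in customers]
    print(len(north), len(south), names[0], credits[0])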
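
For Somrutai (2000): a toy rendering of the general idea of 'reading' a data model as English sentences so that stakeholders can check it; this is not the NaLER technique itself, and the model, verbs and wording are invented.

    relationships = [
        # (entity_a, cardinality_a, verb, entity_b, cardinality_b)
        ("Customer", "one", "places", "Order", "many"),
        ("Order", "one", "contains", "OrderLine", "many"),
    ]

    def phrase(cardinality, entity):
        """Render a cardinality and entity name as an English noun phrase."""
        return f"exactly one {entity}" if cardinality == "one" else f"one or more {entity}s"

    for a, card_a, verb, b, card_b in relationships:
        print(f"Each {a} {verb} {phrase(card_b, b)}.")
        print(f"Each {b} is associated with {phrase(card_a, a)}.")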