Massey Documents by Type

Permanent URI for this community: https://mro.massey.ac.nz/handle/10179/294

Search Results

Now showing 1 - 10 of 13
  • Item
    Context-sensitive interpretation of natural language location descriptions : a thesis submitted in partial fulfilment of the requirements for the award of Doctor of Philosophy in Information Technology at Massey University, Auckland, New Zealand
    (Massey University, 2022) Aflaki, Niloofar
    People frequently describe the locations of objects using natural language. Location descriptions may be either structured, such as 26 Victoria Street, Auckland, or unstructured. Relative location descriptions (e.g., building near Sky Tower) are a common form of unstructured location description, and use qualitative terms to describe the location of one object relative to another (e.g., near, close to, in, next to). Understanding the meaning of these terms is easy for humans, but much more difficult for machines since the terms are inherently vague and context sensitive. In this thesis, we study the semantics (or meaning) of qualitative, geospatial relation terms, specifically geospatial prepositions. Prepositions are one of the most common forms of geospatial relation term, and they are commonly used to describe the location of objects in the geographic (geospatial) environment, such as rivers, mountains, buildings, and towns. A thorough understanding of the semantics of geospatial relation terms is important because it enables more accurate automated georeferencing of text location descriptions than the use of place names alone. Location descriptions that use geospatial prepositions are found in social media, web sites, blogs, and academic reports, and georeferencing can allow mapping of health, disaster and biological data that is currently inaccessible to the public. Such descriptions have an unstructured format, so their analysis is not straightforward. The specific research questions that we address are: RQ1. Which geospatial prepositions (or groups of prepositions) and senses are semantically similar? RQ2. Is the role of context important in the interpretation of location descriptions? RQ3. Is the object distance associated with geospatial prepositions across a range of geospatial scenes and scales accurately predictable using machine learning methods? RQ4. Is human annotation a reliable form of annotation for the analysis of location descriptions? 
To address RQ1, we determine the nature and degree of similarity among geospatial prepositions by analysing data collected with a human subjects experiment, using clustering, extensional mapping and t-distributed stochastic neighbour embedding (t-SNE) plots to form a semantic similarity matrix. In addition to calculating similarity scores among prepositions, we identify the senses of three groups of geospatial prepositions using Venn diagrams, t-SNE plots and density-based clustering, and define the relationships between the senses. Furthermore, we use two text mining approaches to identify the degree of similarity among geospatial prepositions: bag of words and GloVe embeddings. By using these methods and further analysis, we identify semantically similar groups of geospatial prepositions including: (1) beside, close to, near, next to, outside and adjacent to; (2) across, over and through; and (3) beyond, past, by and off. The prepositions within these groups also share senses. Through is recognised as a specialisation of both across and over. Proximity and adjacency prepositions also have similar senses that express orientation and overlapping relations. Past, off and by share a proximal sense, but beyond has a different sense from these, representing on the other side. Another finding is that close to is used more frequently for pairs of linear objects, whereas near is used more frequently for non-linear ones. Also, next to is used to describe proximity more than touching (in contrast to other prepositions like adjacent to). Our application of text mining to identify semantically similar prepositions confirms that a geospatial corpus (NCGL) provides a better representation of the semantics of geospatial prepositions than a general corpus. Also, we found that GloVe embeddings provide adequate semantic similarity measures for more specialised geospatial prepositions, but less so for those that have more generalised applications and multiple senses. 
We explore the role of context (RQ2) by studying three sites that vary in size, nature, and context in London: Trafalgar Square, Buckingham Palace, and Hyde Park. We use the Google search engine to extract location descriptions that contain these three sites with 9 different geospatial prepositions (in, on, at, next to, close to, adjacent to, near, beside, outside) and calculate their acceptance profiles (the profile of the use of a preposition at different distances from the reference object) and acceptance thresholds (maximum distance from a reference object at which a preposition can acceptably be used). We use these to compare prepositions, and to explore the influence of different contexts. Our results show that near, in and outside are used for larger distances, while beside, adjacent to and at are used for smaller distances. Also, the acceptance threshold for close to is higher than for other proximity/adjacency prepositions such as next to, adjacent to and beside. The acceptance threshold of next to is larger than that of adjacent to, which confirms the finding in Chapter 2 that next to describes a proximity rather than a touching spatial relation. We also found that relatum characteristics such as image schema affect the use of prepositions such as in, on and at. We address RQ3 by developing a machine learning regression model (using the SMOReg algorithm) to predict the distance associated with the use of geospatial prepositions in specific expressions. We incorporate a wide range of input variables including the similarity matrix of geospatial prepositions (RQ1); preposition senses; semantic information in the form of embeddings; characteristics of the located and reference objects in the expression, including their liquidity/solidity, scale and geometry type; and contextual factors such as the density of features of different types in the surrounding area. 
We evaluate the model on two different datasets, with a 25% improvement against the best baseline. Finally, we consider the importance of annotation of geospatial location descriptions (RQ4). As annotated data is essential for the successful study of automated interpretation of natural language descriptions, we study the impact and accuracy of human annotation on different geospatial elements. Agreement scores show that human annotators can annotate geospatial relation terms (e.g., geospatial prepositions) with higher agreement than other geospatial elements. This thesis advances understanding of the semantics of geospatial prepositions, particularly considering their semantic similarity and the impact of context on their interpretation. We quantify the semantic similarity of a set of 24 geospatial prepositions; identify senses and the relationships among them for 13 geospatial prepositions; compare the acceptance thresholds of 9 geospatial prepositions and describe the influence of context on them; and demonstrate that richer semantic and contextual information can be incorporated in predictive models to interpret relative geospatial location descriptions more accurately.
  • Item
    Assessing the effectiveness of crowdsourced geographic information for solid waste management in Timor-Leste : a thesis presented in partial fulfilment of the requirements for the degree of Master of Information Sciences (Information Technology) at Massey University, Albany, New Zealand
    (Massey University, 2019) da Conceição Baptista, Elizabeth
    Dili, the capital city of Timor-Leste, has been faced with serious solid waste problems in recent years. Responding to this issue, the government has adopted various policies including setting up solid waste collection sites in community areas and outsourcing collection to the private sector to collect waste directly from homes in several areas. Despite these efforts, waste is still found scattered on the roads and disposed of in rivers and open lands. A proper solid waste management strategy is necessary to transform the city into a clean city. In order to develop an effective solid waste management strategy, reliable data and public participation are required. This study, therefore, investigated whether crowdsourcing, in particular Volunteered Geographic Information (VGI), can effectively be used to collect data about solid waste disposal and collection practices in Dili and raise awareness of the impact of waste disposal practices among the public. The study results demonstrated that crowdsourcing is a viable method for collecting solid waste data. Challenges such as collecting accurate location-specific data still remain; hence, the crowdsourced dataset may not entirely substitute for the traditional dataset. At this stage, however, the collected data can still be utilized as a supplementary data source. In the future, by improving data collection methodologies, such as using smaller rewards or providing necessary facilities, a crowdsourcing-based data collection method could be utilized as an adequate substitute for traditional data sources because of its ability to collect data in real-time with lower operational costs. This approach is feasible for a developing country such as Timor-Leste, where critical areas such as waste management have lower priority for funding.
  • Item
    Geospatial threat measurement : an analysis of the threat the diatom Didymosphenia geminata poses to Canterbury, New Zealand : a thesis submitted in fulfilment of the requirements for the degree of Master of Philosophy in Geographic Information Systems in Massey University, Palmerston North
    (Massey University, 2009) Thornley, Norman John
    This thesis provides an analysis of the threat Didymosphenia geminata poses to the Canterbury Conservancy of the Department of Conservation. More specifically, it examines the relationship between Values, Risk and Hazard to measure the degree of threat posed by the diatom. This is the first time this type of Threat Analysis has been applied to such a problem in this region, and so it will provide an important insight into the validity of applying this methodology to an alien invasive threat. Moreover, it is the first time Values, Risk and Hazard have been modelled together to give an overall threat classification in this context. Risk mitigation is one of the variables that can be measured, managed and priced; factoring this into the model is also discussed. Qualitative and quantitative Values and Risk information is provided by Department of Conservation staff; some from their local knowledge and some from biodiversity datasets which have been collected over time. The Risk data is supplemented by fishing access data supplied by the two local Fish and Game Council Offices. Where available, further Values and Risk data has been gleaned from existing datasets in order to supplement the existing data. The Hazard data is taken from the work done by NIWA in 2005 and 2007; the latter being generated after field surveys were conducted on D. geminata infected sites in the South Island.
  • Item
    The development of a Java based GIS viewing tool : a thesis presented in partial fulfilment of the requirements for the degree of Masters of Technology in Information Engineering at Massey University
    (Massey University, 1999) Maguire, James
    Geographic Information Systems (GIS) industry sources quote the ratio of power users to casual users at 1000:1; within New Zealand this figure has been found to be 30:1. The casual user is often under-supported, with slow and cumbersome viewing tools. This project implements a full data download system in Java for use with Genasys (New Zealand) GIS software. Three components were developed: a vector data handler, an image download system, and a database client. These components were integrated to form a powerful client that offered a significant performance increase over the "server based" client. The image download system outperformed the "server based" client by over 400%. The vector data handler outperformed the "server based" client by over 50%, while the database client was over 250% quicker. GIS users rated all components to be of significant benefit, offering improved performance over their current GIS viewing tools. The work completed in this thesis provides Genasys (New Zealand) with a useful tool to enable powerful, fast and stable Java based GIS viewing clients. Keywords: GIS, Java, computer graphics, image pyramid.
  • Item
    A user friendly geographic information system for soil conservation planners : a thesis presented in partial fulfilment of the requirements for the degree of Master of Agricultural Science in Soil Science at Massey University
    (Massey University, 1994) Savitri, Endang
    Soil conservation is an important activity for sustainable, productive land use. To ensure sound, effective soil conservation planning, the people who are involved in this activity - the planners and the decision makers - should know (among other things) how best to use a land resource inventory database which has been stored in a computer. Using Geographic Information Systems (GIS) to analyse such data is a technique which is being widely advocated. Unfortunately, most GIS computer programs are too difficult for people like soil conservation planners, who usually have little knowledge of computers. To help them understand GIS and then use GIS for their planning, a user friendly interface to the GIS was created. Two systems were created for the Pijiharjo sub-watershed, Indonesia: one with a popup menu, the other with a pulldown menu. Both interfaces were created using the SML (Simple Macro Language) command which is available under pc ARC/Info version 3.4D Plus. Although they looked different to the user, both used the same commands to execute the various operations. Once the initial design was completed, an evaluation was held to check whether the design was satisfactory from the user's point of view. The result of the evaluation showed that both systems were simple and easy to understand. However, there were some aspects that should be revised, such as the HELP facility. Similar databases from other areas could be analysed using these interfaces, with the only requirement being a modification to the introductory remarks. Ideas for the future development of such systems are also discussed.
  • Item
    Stoat trap tunnel location : GIS predictive modelling to identify the best tunnel location : a thesis submitted in fulfillment of the requirements for the degree of Master of Philosophy in Geographic Information Systems in Massey University
    (Massey University, 2009) Day, A. Mark
    Stoats are recognised as one of the biggest threats to New Zealand's threatened species. They are difficult to control because of their biological characteristics. Currently, trapping is the most common type of control technique that has a proven success rate. Research studies have shown that some traps catch more stoats than others. However, the reason for this is not well documented. The effectiveness of a trap set is difficult to determine because not all trap locations are the same and not all people have the same ability to select the best location for a trap. This study uses GIS to spatially analyse stoat capture data from a control operation on Secretary Island in conjunction with commonly available vegetation, habitat, diet and home range spatial data to see if there are consistent patterns that could be used as variables in a model that would predict the best place to locate a stoat trap tunnel. The model would then be tested against a similar dataset from Resolution Island. The Department of Conservation supplied the stoat capture data from the control operations on both islands. Standard spatial analysis techniques were used to generate surfaces that combined the capture data with the vegetation, habitat, diet and home range surfaces to produce predictive surfaces. The key finding from the research was that it is possible to produce a predictive model, although one was not created because the spatial datasets were not of a high enough resolution to provide conclusive evidence that could be confidently used as a variable in a model. The spatial analysis also indicated that stoats on both islands were caught mainly in the warmer northwestern parts of the islands, although the study could not determine why there was a preference for these areas. In rugged terrain like that found on both islands, the location of the track network will influence where the majority of stoats will be caught.
  • Item
    Conceptual data modelling for geographical information systems : a thesis presented in partial fulfilment of the requirements for the degree of Master of Arts (Information Systems) at Massey University
    (Massey University, 1994) Bekesi, Erzsebet
    This thesis sets out to find an answer to the question: does an appropriate conceptual data model exist for the practitioners of Geographical Information Systems database design? It aims to investigate and answer the question by: (1) finding a workable data model to solve a database design problem (Manawatu-Wanganui Regional Council, Palmerston North, Natural Resources Management, Groundwater Section database); and (2) analysing the user's data requirements and producing a feasible conceptual schema. Usage of Geographical Information Systems applications is a recognised need in a growing number of organisations in New Zealand, but many factors block the way of this relatively new technology. One of these factors is the lack of well-designed databases to support the data needs of these non-traditional applications. One school of thought adopts general data modelling techniques for every database design problem; another group of researchers suggests that specialised data models are necessary to model data in various problem domains. This thesis summarises the "specialities" pertaining to the GIS database domain. The most important are the special data needs of GIS applications and the problem of the placement of spatial data models in the traditional taxonomy of database models. It chooses the objectives of conceptual data modelling as the evaluation criteria which the selected data model must satisfy, i.e. to model reality and to form the basis for database schema design. This thesis reviews a group of published papers, selected from proponents of the entity-relationship and the object-oriented data modelling paradigms and the applications of these data modelling techniques in a spatial context. It compares various extensions to the original entity relationship model, and a comparison of the main data modelling paradigms is included. Data modelling shortcomings encountered in the literature are also summarised. 
The literature reviewed concludes that not appreciating the conceptual data modelling objectives leads to unsatisfactory conceptual database design. The selected data model, the spatially extended entity relationship (SEER) model is described and applied to the database design problem of a local authority to produce conceptual schemas. Findings are summarised and issues for future research are identified. Conclusions reached are: further evaluative work on the applied spatially extended entity relationship (SEER) model would be useful and clear directions are essential for practitioners showing the guiding principles of conceptual data modelling in a spatial context.
  • Item
    A development and application of GIS in Whanganui Catchment based river environment classification system : a thesis presented in partial fulfilment of the requirements for the degree of Master in Resource and Environmental Planning, Massey University
    (Massey University, 2002) Zhai, Qian
    This thesis concerns the development and implementation of a Geographical Information System (GIS) for the New Zealand Whanganui catchment, based on a new methodology for river environment classification systems in New Zealand. The Ministry for the Environment (MfE) and National Institute of Water and Atmospheric Research (NIWA) are developing this system with assistance from regional councils. The river habitat classification is sometimes called river "ecotyping". It describes the process of dividing rivers into similar or different physical classes based on the habitat requirements of the plants and animals that live there (Murray McLea, 1999). This project focuses on generating a Digital Terrain Model (DTM) for the Whanganui river catchment to determine Whanganui catchment boundaries and a series of hydrology parameters such as catchment patterns and channel slopes. It comprises layers of elevation, rainfall, geology, land-cover and additional ecotyping related attributes for classification of each arc of the Whanganui River. There are five sections in this thesis. The first section introduces the basic concept of hydrology in environmental and ecological aspects. It reviews the hydrology model with GIS and DTM. It also briefly describes the river environment classification system – ecotyping methodology. Finally, it describes the aims and achievements of this project. The second section focuses on the ARC/INFO software environment, using different ways to generate the DTMs and presenting criteria that will be used to test and analyse the accuracy of DTMs. The Whanganui catchment and catchment boundaries are also determined. The third section focuses on the river analysis. The main target is to test whether the 1:50000 topographic data can be used to determine the channel slope and channel sinuosity for river sections other than reaches (Snelder et al. 1999). 
The fourth section describes the method of using ecotyping parameters and classification rules to classify each arc of the river into a database. These rules are introduced in the article "Further development and application of a GIS based river environment classification system" (Snelder et al. 1999). The last section, as a conclusion of the thesis, summarises the achievements, the methodology of the processing and the results of the application of this research.
  • Item
    The use of a geographic information system to investigate soil slip distribution and the land use capability classification in the East Coast region, New Zealand : a thesis presented in partial fulfilment of the requirements for the degree of Master of Applied Science in Soil Science at Massey University
    (Massey University, 1995) Hendriksen, Sheryl Denise
    The land of the North Island East Coast region has such a severe erosion problem that in some places the current land use cannot be sustained. The expansion of exotic forestry in the region will provide protection for the land, regional growth and development, and employment, but it also brings competition for good land. The New Zealand Resource Management Act, 1991, aims to promote sustainable use of our resources and requires regulatory authorities to monitor the state of their natural resources and to follow the principles set in the RMA when developing land use policies. Remotely sensed data provides a timely and accurate assessment of surface features. Aerial photography provides a better delineation of soil slip erosion than satellite imagery. Geographic Information Systems facilitate the storage and display of resource information. Through manipulation of GIS data layers, relationships between the distribution of soil slip erosion following Cyclone Bola, 1988, and other physical factors are investigated. The density of soil slip increases with increasing slope angle to a maximum on slopes of 30°. The amount of soil slip depends on the underlying rock type with jointed mudstone having the highest density. Most soil slip erosion occurs on NE, N, NW, and E facing slopes, but the reason for this cannot be attributed to either slope angle or rock type. The Land Use Capability classification is currently used by land use managers and planners to describe the land in terms of its limitation to productive uses. The detail of information in the New Zealand Land Resource Inventory LUC classification can be improved by incorporating more detailed slope angle and slope aspect information derived from digital contour data.
  • Item
    The use of a geographic information system (GIS) for farm soil conservation planning : a thesis presented in partial fulfilment of the requirements for the degree of Master of Agricultural Science in Soil Science at Massey University
    (Massey University, 1993) Priyono, Cyprianus Nugroho Sulistyo
    The use of a Geographic Information System (PC ARC/INFO) for farm soil conservation planning was demonstrated in several neighbouring properties in the Apiti district, Manawatu. The area (775 ha) was mainly steep and strongly rolling hill country where the dominant land use was pastoral grazing by sheep and cattle. The main objective of this study was to utilize the GIS at each step of the farm soil conservation planning process. The planning process began with a land resources inventory (LRI) where information on basic physical resources relevant to land management and soil conservation was collected and stored in a database before further processing. Factors collected in the LRI included primary factors (soil type, soil depth, slope, rock type and elevation) and secondary factors (existing erosion, land use, fence lines and ownership, and drainage condition). A digital elevation model (DEM) was developed to display landforms. Field observations were also used and local farmers were given the opportunity to become involved in the planning process. The next step involved delineating areas of similar land use capability and potential land use. The areas were also assessed in terms of potential erosion and conservation needs. These operations were undertaken by combining the LRI factors in various ways. Results of these assessments were matched to define land units which have similar physical characteristics. Recommendations for management practices were then made by considering combinations of the factors. The plan was displayed as maps showing the management options available for farmers. Both map overlay procedures and database analyses were carried out at each step of the planning process. As the map overlay is a unique operation in the GIS, it was used to combine necessary factors from the LRI based on a set of criteria. Database analyses were then carried out using macro commands which were developed according to the criteria. 
The ability of the GIS for database analyses distinguishes the GIS from other systems whose primary objective is map production. The use of database analyses in this study was a particular example for making recommendations in soil conservation planning. However, the techniques are applicable to many different conditions and different purposes. The maps presented in this study are examples of how it is possible to show the results of analyses. Advantages and constraints of such procedures at each step of the planning process were discussed.