Massey Documents by Type

Permanent URI for this community: https://mro.massey.ac.nz/handle/10179/294


Search Results

Now showing 1 - 9 of 9
  • Item
    Contributions to food safety acceptance sampling plans : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics, Massey University, School of Mathematical and Computer Sciences
    (Massey University, 2023) Thevaraja, Mayooran
    An appropriate sampling inspection method is an essential tool for risk assessment in the food industry. A representative sampling approach helps to reduce risk while minimising sampling costs. Consequently, food manufacturers employ efficient sampling approaches to assure food safety. In the food safety field, microbiological or other contamination often spreads unevenly across a production lot. Many factors are involved in microbial risk assessment, such as (1) the amount of sample used for inspection, (2) the sampling methods applied, (3) laboratory testing procedures, (4) physical sampling of materials from lots/batches of products and (5) the mixing of initially collected samples. This study focusses on improved sampling inspection approaches to reduce microbiological risk in food products. Part of this research also included developing open-source R packages to generate graphical displays for probabilistic risk assessment for practitioners. A single “wrapper” package is also provided to install all the newly developed packages in a single step.
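The acceptance-probability calculations underlying such sampling plans can be illustrated with a minimal sketch. The following code (Python rather than R, purely for illustration; the plan parameters are hypothetical, not taken from the thesis) computes points on the operating characteristic curve of a two-class attributes plan:

```python
from math import comb

def oc_probability(n: int, c: int, p: float) -> float:
    """P(accept) for an (n, c) two-class attributes plan: accept the lot
    when at most c of the n sampled units test positive, assuming
    independent units each contaminated with probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c + 1))

# A zero-tolerance plan: n = 5 analytical units, c = 0 positives allowed.
for p in (0.01, 0.05, 0.10, 0.20):
    print(f"p = {p:.2f}  P(accept) = {oc_probability(5, 0, p):.3f}")
```

Tightening n or c shifts the whole curve, which is the trade-off between consumer protection and sampling cost that such plans must balance.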
  • Item
    Some diagnostic techniques for small area estimation : with applications to poverty mapping : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, Palmerston North, New Zealand
    (Massey University, 2019) Livingston, Alison
    Small area estimation (SAE) techniques borrow strength via auxiliary variables to provide reliable estimates at finer geographical levels. An important application is poverty mapping, whereby aid organisations distribute millions of dollars every year based on small area estimates of poverty measures. Diagnostics are therefore an important tool to ensure that estimates are reliable and funding is distributed to the most impoverished communities. Small area models can be large and complex; however, even the most complex models are of little use if they do not have predictive power at the small area level. This motivated a variable importance measure for SAE that considers each auxiliary variable’s ability to explain the variation in the dependent variable, as well as its ability to distinguish between the relative levels in the small areas. A core question addressed is how candidate survey-based models might be simplified without losing accuracy or introducing bias in the small area estimates. When a small area estimate appears to be biased or unusual, it is important to investigate and, if necessary, remedy the situation. A diagnostic is proposed that quantifies the relative effect of each variable, allowing identification of any variables within an area that have a larger than expected influence on the small area estimate for that area. This highlights possible errors which need to be checked and, if necessary, corrected. Additionally, in SAE it is essential that the estimates are at an acceptable level of precision in order to be useful. A measure is proposed that takes the ratio of the variability in the small areas to the uncertainty of the small area estimates. This measure is then used to assist in determining the minimum level of precision needed to maintain meaningful estimates. The diagnostics developed cover a wide range of small area estimation methods, including those based on survey data only and those which combine survey and census data. By way of illustration, the proposed methods are applied to SAE for poverty measures in Cambodia and Nepal.
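A precision measure of this kind can be sketched in a few lines. This is one plausible reading of the abstract's description, the between-area variance of the estimates divided by their average squared standard error; the exact definition used in the thesis may differ:

```python
from statistics import mean, pvariance

def precision_ratio(estimates, std_errors):
    """Between-area variability of the small area estimates relative to
    their average squared standard error. Ratios well above 1 suggest the
    estimates genuinely distinguish areas; ratios near 1 suggest they
    mostly reflect estimation noise. (Hypothetical formulation, for
    illustration only.)"""
    return pvariance(estimates) / mean(se ** 2 for se in std_errors)

# Invented example: three area estimates with equal standard errors.
print(precision_ratio([10.0, 20.0, 30.0], [1.0, 1.0, 1.0]))
```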
  • Item
    Study of managing and developing procedures to calculate the retail packaging waste in New Zealand : a thesis presented in partial fulfilment for the degree of Master of Technology in Packaging Technology at Massey University, Palmerston North, New Zealand
    (Massey University, 2003) Sheridan, Carl
    The purpose of this research was to develop a system for estimating the total volume of retail packaging in New Zealand's waste streams, and to document the procedures required to repeat the exercise in future years. The objectives of the project were to complete a thorough investigation of the literature on packaging fundamentals, sampling statistics, regression analysis and project management, to support a sound data-collection methodology. Woolworths, Pak N Save, The Warehouse and Liquor King stores were used, as these were believed to best represent the retail market: The Warehouse for general merchandise, the two grocery stores for the food market, and the bottle store as one of the foremost users of packaging innovation in retail. The largest step of the project was collecting the data. Since not every piece of retail packaging can be assessed, the sample was rationalised to an accurate and reliable representation of the retail market. A simple net/gross weight methodology was then used to assess a proportion of the packages and develop conversion equations, which could then be applied to all the packages. Secondary packaging was also assessed, using a slightly different process that investigated the recycling of cardboard from all the stores. This information was then extrapolated to represent the complete retail market on a market-share basis. For example, Woolworths represents 30% of New Zealand's grocery market, so the weight and volume of packaging from that store could be scaled up to represent the whole grocery market. From this, a procedures manual and a future plan were produced to keep the information up to date.
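The market-share extrapolation step can be sketched in a few lines; the weight figure below is a hypothetical placeholder (only the 30% Woolworths share comes from the abstract):

```python
def extrapolate_by_market_share(store_totals_kg, market_shares):
    """Scale each surveyed store's packaging weight up to its whole market
    segment using the store's market share, then sum the segments."""
    return sum(kg / market_shares[store] for store, kg in store_totals_kg.items())

# Hypothetical survey result: 1200 kg of packaging measured at a store
# holding 30% of its market segment, so the segment total is about 4000 kg.
print(extrapolate_by_market_share({"grocery": 1200.0}, {"grocery": 0.30}))
```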
  • Item
    The performance of techniques for estimating the number of eligible signatories to a large petition on the basis of a sample of signatures : a thesis presented in partial fulfilment of the requirements for the degree of Master of Science in Statistics at Massey University, Palmerston North, New Zealand
    (Massey University, 2002) Hedderley, Duncan
    The New Zealand Citizens' Initiated Referenda Act, 1993, states that if a petition signed by at least 10 percent of eligible electors is presented to the House of Representatives, then parliament is required to hold an indicative referendum on the petition. Normal practice at present is to check a sample of the signatures and from that estimate the number of eligible electors who have signed a petition, making allowance for signatories who are not eligible and multiple signatures from eligible electors. We review a number of techniques used for similar problems, such as estimating the size of a population through capture-recapture studies, or estimating the number of duplicate entries in a mailing list. One suitable estimator was developed by Goodman (1949), and a number of variants on it are reported by Smith-Cayama & Thomas (1999). An estimator proposed by Esty (1985) was found to give unreasonable estimates, and so a modification was developed. To test the performance of the modified estimator, simulations were performed, drawing repeated samples from artificial petitions with known distributions of multiple signatures. The simulation results allowed us to investigate bias in the estimators and the accuracy of the variance estimates proposed by Haas & Stokes (1998). The effect of sampling fraction on bias, variability and estimated variance of the estimators was also investigated. The simulation program was modified to include ineligible signatures. Results of these simulations showed that estimating the number of ineligible signatures added to the variability of the overall estimate of the number of eligible signatories. Although Smith-Cayama & Thomas (1999) mention that the estimated number of multiple eligible signatures and the estimated number of ineligible signatures are correlated, the simulations suggest the correlation is small and makes little difference to the final estimate of variability.
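The simulation approach described, drawing samples from an artificial petition with a known number of duplicate signatures, can be sketched as follows. This toy version assumes every signature is eligible and nobody signs more than twice, and uses a simple hypergeometric scale-up for duplicate pairs rather than the thesis's estimators:

```python
import random

def estimate_distinct_signatories(petition, n, rng):
    """Estimate the number of distinct signatories from a simple random
    sample of n of the N signatures. Each elector who signed twice
    contributes one duplicate pair; both members of a pair land in the
    sample with probability n(n-1)/(N(N-1)), which gives the scale-up."""
    N = len(petition)
    sample = rng.sample(petition, n)           # sample signatures, not electors
    dup_pairs_seen = len(sample) - len(set(sample))
    est_total_pairs = dup_pairs_seen * N * (N - 1) / (n * (n - 1))
    return N - est_total_pairs                 # distinct = signatures - pairs

rng = random.Random(1)
# Artificial petition: 90,000 electors sign once, 5,000 sign twice.
petition = list(range(90_000)) + [e for e in range(90_000, 95_000) for _ in (0, 1)]
estimates = [estimate_distinct_signatories(petition, 10_000, rng) for _ in range(200)]
print(sum(estimates) / len(estimates), "vs true value", 95_000)
```

Repeating the sampling, as above, is exactly how bias and variability of such estimators can be studied against a known truth.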
  • Item
    The effect of clustering on the precision of estimation : a thesis presented in partial fulfilment of the requirements for the degree of Master of Business Studies in Marketing at Massey University
    (Massey University, 1997) Guan, Zhengping
    The effect of clustering interval on design effect may be important when selecting between alternative sampling designs by evaluating their cost-efficiency in the context of face-to-face interview surveys. There has been little work investigating this effect in New Zealand. This study investigates it using data from a two-stage face-to-face interview survey. Seventeen simulated samples are generated. A simple method, design effect = msb/ms, is developed to estimate design effects for 81 variables for both the simulated samples and the original sample. These estimated design effects are used to investigate the effect of clustering interval. The study also investigates the effect of cluster size. The results indicate that clustering interval has little influence on design effect, but cluster size has a substantial influence. The evaluation of cost-efficiency at alternative clustering intervals is discussed. As an improvement in the efficiency of a sample design from an increase in clustering interval cannot be justified by the increase in cost, the sample design with the smallest clustering interval appears to be the best. An alternative method, design effect ≈ mr², is also discussed and tested for estimating design effects. The results indicate that the applicability of design effect ≈ mr² is the same as that of design effect = msb/ms.
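The msb/ms idea can be sketched on synthetic data: compute the between-cluster mean square over the overall sample variance for equal-sized clusters. This is one plausible reading of the abstract's formula, with invented data and parameters; for such data the ratio should track the familiar 1 + (m − 1)ρ design effect:

```python
import random

def deff_msb_over_ms(clusters):
    """Design effect estimated as msb/ms: the between-cluster mean square
    divided by the overall sample variance (equal-sized clusters)."""
    k, m = len(clusters), len(clusters[0])
    values = [y for cluster in clusters for y in cluster]
    n = k * m
    grand = sum(values) / n
    msb = m * sum((sum(c) / m - grand) ** 2 for c in clusters) / (k - 1)
    ms = sum((y - grand) ** 2 for y in values) / (n - 1)
    return msb / ms

rng = random.Random(42)
m, k, rho = 10, 400, 0.2               # cluster size, clusters, intraclass correlation
sb, sw = rho ** 0.5, (1 - rho) ** 0.5  # between/within SDs yielding this rho
clusters = []
for _ in range(k):
    b = rng.gauss(0, sb)               # shared cluster effect
    clusters.append([b + rng.gauss(0, sw) for _ in range(m)])
print(deff_msb_over_ms(clusters))      # theory: 1 + (m - 1) * rho = 2.8
```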
  • Item
    Acceptance sampling for food quality assurance : this dissertation is submitted for the degree of Doctor of Philosophy in Statistics, Institute of Fundamental Sciences, Massey University
    (Massey University, 2017) Santos-Fernández, Edgar
    Acceptance sampling plays a crucial role in food quality assurance. However, safety inspection represents a substantial economic burden due to the testing costs and the number of quality characteristics involved. This thesis presents six pieces of work on the design of attributes and variables sampling inspection plans for food safety and quality. Several sampling plans are introduced with the aims of providing better protection for consumers and reducing sample sizes. The effect of factors such as the spatial distribution of microorganisms and the analytical unit amount is discussed. The quality in accepted batches has also been studied, which is relevant for assessing the impact of the product on the public health system. Optimum design of sampling plans for bulk materials is considered, and different scenarios in terms of mixing efficiency are evaluated. Single and two-stage sampling plans based on compressed limits are introduced. Other issues, such as the effect of imperfect testing and the robustness of the plan, are also discussed. The use of the techniques is illustrated with practical examples. We considered numerous probability models for fitting aerobic plate counts and presence-absence data from milk powder samples. The suggested techniques have been found to provide a substantial sampling economy, reducing sample sizes by between 20% and 80% when compared to plans recommended by the International Commission on Microbiological Specifications for Food (ICMSF) and the Codex Alimentarius. Free software and apps have been published, allowing practitioners to design more stringent sampling plans.
    Keywords: Bulk material, Composite samples, Compressed limit, Consumer protection, Double sampling plan, Food safety, Measurement errors, Microbiological testing, Sampling inspection plan.
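A variables plan of the kind discussed can be sketched as follows (known sigma, single upper specification limit; all parameter values are invented for illustration, not taken from the thesis):

```python
from math import erf, sqrt

def phi(x: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def p_accept(n: int, k: float, mu: float, sigma: float, upper: float) -> float:
    """Acceptance probability of a known-sigma variables plan: sample n
    units and accept the lot when xbar + k*sigma <= upper. Since xbar is
    N(mu, sigma^2 / n), P(accept) = Phi(sqrt(n) * ((upper - mu)/sigma - k))."""
    return phi(sqrt(n) * ((upper - mu) / sigma - k))

# Invented example: upper limit 3.0 log10 cfu/g, plan n = 10, k = 1.5.
for mu in (1.0, 1.5, 2.0):
    print(f"lot mean {mu:.1f}: P(accept) = {p_accept(10, 1.5, mu, 0.8, 3.0):.3f}")
```

Because the plan uses the measured values rather than a pass/fail count, variables plans of this form typically need smaller n than attributes plans for the same protection, which is the source of the sampling economy discussed above.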
  • Item
    Analysis of complex surveys : a thesis presented in partial fulfillment of the requirements for the degree of Masterate in Science in Statistics at Massey University
    (Massey University, 1997) Young, Jane
    Complex surveys are surveys which involve a survey design other than simple random sampling. In practice, sample surveys require a complex design due to many factors such as cost, time and the nature of the population. Standard statistical methods such as linear regression, contingency tables and multivariate analyses are based on data which are independently and identically distributed (IID); that is, the data are assumed to have been selected by a simple random sampling design. The assumptions underlying standard statistical methods are generally not met when the data come from a complex design. A measure of the efficiency of a design is given by the ratio of the variance under the actual design to the variance under a simple random sample of the same sample size. This is known as the design effect (deff). There are two forms of design effect: one proposed by Kish (1965) and another termed the misspecification effect (meff) by Skinner et al. (1989). Throughout the thesis, the design effect referred to is Skinner et al.'s (1989) misspecification effect. Cluster sampling generally yields a deff greater than one, and stratified sampling yields a deff less than one. Some researchers have adopted a model-based approach to parameter estimation rather than the traditional design-based approach. The model-based approach is one in which each possible respondent has a distribution of possible values, often leading to the equivalent of an infinite background population, called the superpopulation. Both approaches are discussed throughout the thesis. Most of the standard computing packages available have been developed for simple random sample data. Specialized packages are needed to analyse complex survey data correctly; PC CARP and SUDAAN are two such packages. Three examples of statistical analyses of complex sample surveys were explored using the specialized statistical packages, and their output was compared to that of a standard statistical package, The SAS System. It was found that although SAS produced the correct estimates, the standard errors were much smaller than those from SUDAAN. This led, in regression for example, to a much higher number of variables appearing to be significant when they were not. The examples illustrated the consequences of using a standard statistical package on complex data. Statisticians have long argued the need for appropriate statistics for complex surveys.
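The finding that design-naive standard errors understate the truth under clustering can be reproduced on synthetic data. This sketch compares the simple-random-sampling standard error of a mean with a basic cluster-level (primary-sampling-unit) standard error; it is an illustration of the effect, not either package's actual computation:

```python
import random

rng = random.Random(7)
k, m, rho = 200, 25, 0.3               # clusters, cluster size, intraclass corr.
sb, sw = rho ** 0.5, (1 - rho) ** 0.5
data, cluster_means = [], []
for _ in range(k):
    b = rng.gauss(0, sb)               # shared cluster effect
    cluster = [b + rng.gauss(0, sw) for _ in range(m)]
    data.extend(cluster)
    cluster_means.append(sum(cluster) / m)

n = k * m
grand = sum(data) / n
# Standard error a package assuming simple random sampling would report:
s2 = sum((y - grand) ** 2 for y in data) / (n - 1)
se_srs = (s2 / n) ** 0.5
# Design-based standard error treating clusters as primary sampling units:
v_between = sum((cm - grand) ** 2 for cm in cluster_means) / (k - 1)
se_cluster = (v_between / k) ** 0.5
print(f"SRS SE = {se_srs:.4f}, cluster-based SE = {se_cluster:.4f}")
```

With a positive intraclass correlation the cluster-based standard error is substantially larger, which is why variables can falsely appear significant under the naive analysis.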
  • Item
    Open population mark-recapture models including ancillary sightings : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University
    (Massey University, 1995) Barker, Richard J
    A model is proposed for a mark-recapture experiment with ancillary observations obtained from marked animals any time between capture periods and throughout the geographic range of the animals. The model allows three types of emigration from the site where recaptures are made: (1) random emigration, where the probability an animal is at risk of capture at i does not depend on whether it was at risk of capture at i - 1, (2) permanent emigration where animals can leave the area where they are at risk of capture but not return, and (3) Markov emigration, where the probability an animal is at risk of capture at i depends on whether it was at risk of capture at i - 1. Under random emigration the likelihood function can be factored into a set of conditionally independent binomial terms used to estimate the parameters and a set of conditionally independent multihypergeometric terms that do not involve the parameters. Closed-form maximum likelihood estimators are derived under random emigration for models with age-dependence and a temporary marking effect. Contingency table based goodness-of-fit tests are derived from the multihypergeometric terms in the likelihood function. Contingency table tests of the age-dependence and temporary marking effect models are also derived. Explicit estimators do not appear to exist for permanent or Markov emigration. It is shown that the estimator suggested by Jolly (Biometrika 52:239, 1965), and as a consequence the estimator suggested by Buckland (Biometrics 36:419-435, 1980), is only valid if there is no emigration from the study area or if emigration is random. The estimator suggested by Mardekian and McDonald (Journal of Wildlife Management 45:484-488, 1981) for joint analysis of recapture and tag-recovery data is also only valid under no emigration or random emigration. 
    By making appropriate constraints on parameters, the models reduce to previously published models, including the Jolly-Seber model (with age-dependence and a temporary marking effect), tag-resight models, tag-recovery models, and joint live-recapture/tag-recovery models. Thus, the model provides a common framework for most widely used mark-recapture models and allows simultaneous analysis of data obtained in several ways. Advantages of the new models include improved precision of parameter estimates and the ability to distinguish between different types of emigration. FORTRAN programmes are developed for fitting the models to data, with an application to a data set for brown trout (Salmo trutta) tagged in spawning tributaries of Lake Brunner, Westland, between 1987 and 1991.
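The distinction between random and Markov emigration can be made concrete with a small simulation: under random emigration, capture at occasion i is independent of capture at occasion i − 1, while under Markov emigration it is not. This is an illustrative check of that property only, not the thesis's likelihood-based models:

```python
import random

def conditional_capture_rates(gamma_stay, gamma_return, p, periods, animals, seed):
    """Simulate capture histories where the 'at risk of capture' state is a
    two-state Markov chain: gamma_stay = P(at risk at i | at risk at i-1),
    gamma_return = P(at risk at i | not at risk at i-1). Capture occurs with
    probability p when at risk. Returns P(captured at i | captured at i-1)
    and P(captured at i | not captured at i-1); random emigration is the
    special case gamma_stay == gamma_return."""
    rng = random.Random(seed)
    caught_after_caught = occ_after_caught = 0
    caught_after_missed = occ_after_missed = 0
    for _ in range(animals):
        at_risk, prev = True, None
        for _ in range(periods):
            caught = at_risk and rng.random() < p
            if prev is not None:
                if prev:
                    occ_after_caught += 1
                    caught_after_caught += caught
                else:
                    occ_after_missed += 1
                    caught_after_missed += caught
            prev = caught
            at_risk = rng.random() < (gamma_stay if at_risk else gamma_return)
    return (caught_after_caught / occ_after_caught,
            caught_after_missed / occ_after_missed)

print(conditional_capture_rates(0.7, 0.7, 0.5, 10, 50_000, 0))  # random: rates agree
print(conditional_capture_rates(0.9, 0.3, 0.5, 10, 50_000, 0))  # Markov: rates differ
```

The agreement of the two conditional rates under random emigration is what allows the likelihood to factor into the simple binomial terms described above.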
  • Item
    Small area estimation via generalized linear models : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Statistics at Massey University, Palmerston North, New Zealand
    (Massey University, 2003) Noble, Alasdair D. L.
    Survey information is commonly collected to yield estimates of quantities for large geographic areas, for example, complete countries. However, estimates of those quantities for much smaller geographic areas are often of interest, and the sample sizes in these areas are generally too small to give useful results. Small area estimation is used to make inference about those small areas with greater precision than the direct estimates, either by exploiting similarities between different small areas or by accessing additional information, often from administrative records. The majority of the traditional small area estimation methods are examples of a simple linear model (Marker, 1999), and this work begins by extending the model to a generalized linear model (GLM) (Nelder and Wedderburn, 1972) and then including structure preserving estimation (SPREE) in the classification. This had not been done previously. SPREE had previously been fitted using the iterative proportional fitting (IPF) algorithm (Deming and Stephan, 1940), which could be described as a "black box" approach. By expressing SPREE in terms of a GLM, an alternative algorithm for fitting the method is developed which elucidates the underlying concepts. This new approach allows the method to be extended from the contingency table with categorical variables, which IPF could fit, to continuous variables and random effects models. An example including a continuous variable is given. SPREE is a method which uses auxiliary information as well as survey data. In the past, assumptions about appropriate auxiliary information have been made with little theoretical support. The new approach allows these assumptions to be examined, and they are found to be wanting in some cases. An example based on a national survey in New Zealand for unemployment statistics is used extensively throughout the thesis. These data have characteristics that make analysis in the Bayesian paradigm appropriate. This paradigm has been applied, and a conditional autoregressive error structure is considered. Finally, relative risk models are considered. It is shown that these could have been fitted using the IPF algorithm, but the new approach allows combinations of other modelling techniques which are not available using IPF.
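The iterative proportional fitting algorithm that SPREE has traditionally relied on is compact enough to sketch; the seed table and margins below are invented for illustration:

```python
def ipf(seed_table, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Iterative proportional fitting (Deming & Stephan, 1940): rescale a
    positive seed table until its row and column sums match the target
    margins, preserving the seed's interaction (odds-ratio) structure."""
    table = [row[:] for row in seed_table]
    for _ in range(max_iter):
        for i, target in enumerate(row_targets):          # fit row margins
            s = sum(table[i])
            table[i] = [x * target / s for x in table[i]]
        for j, target in enumerate(col_targets):          # fit column margins
            s = sum(row[j] for row in table)
            for row in table:
                row[j] *= target / s
        if all(abs(sum(row) - t) < tol for row, t in zip(table, row_targets)):
            break
    return table

# Invented example: census-style seed counts, survey-based margins.
fitted = ipf([[40.0, 10.0], [20.0, 30.0]], [60.0, 40.0], [55.0, 45.0])
print([[round(x, 2) for x in row] for row in fitted])
```

Treating the same problem as a GLM, as the thesis does, makes the implicit model behind this "black box" iteration explicit and so opens it up to continuous covariates and random effects.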