Massey Documents by Type

Permanent URI for this community: https://mro.massey.ac.nz/handle/10179/294

Search Results

  • Item
    Manipulating the alpha level cannot cure significance testing – comments on "Redefine statistical significance"
    (PeerJ Preprints, 2017-11-14) Trafimow D; Amrhein V; Areshenkoff CN; Barrera-Causil C; Beh EJ; Bilgiç Y; Bono R; Bradley MT; Briggs WM; Cepeda-Freyre HA; Chaigneau SE; Ciocca DR; Correa JC; Cousineau D; de Boer MR; Dhar SS; Dolgov I; Gómez-Benito J; Grendar M; Grice J; Guerrero-Gimenez ME; Gutiérrez A; Huedo-Medina TB; Jaffe K; Janyan A; Karimnezhad A; Korner-Nievergelt F; Kosugi K; Lachmair M; Ledesma R; Limongi R; Liuzza MT; Lombardo R; Marks M; Meinlschmidt G; Nalborczyk L; Nguyen HT; Ospina R; Perezgonzalez JD; Pfister R; Rahona JJ; Rodríguez-Medina DA; Romão X; Ruiz-Fernández S; Suarez I; Tegethoff M; Tejo M; van de Schoot R; Vankov I; Velasco-Forero S; Wang T; Yamada Y; Zoppino FCM; Marmolejo-Ramos F
    We argue that depending on p-values to reject null hypotheses, including a recent call for changing the canonical alpha level for statistical significance from .05 to .005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable criterion levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and determining sample sizes much more directly than significance testing does; but none of the statistical tools should replace significance testing as the new magic method giving clear-cut mechanical answers. Inference should not be based on single studies at all, but on cumulative evidence from multiple independent studies. When evaluating the strength of the evidence, we should consider, for example, auxiliary assumptions, the strength of the experimental design, or implications for applications. To boil all this down to a binary decision based on a p-value threshold of .05, .01, .005, or anything else, is not acceptable.
  • Item
    Fisher, Neyman-Pearson or NHST? A tutorial for teaching data testing.
    (FRONTIERS RESEARCH FOUNDATION, 2015) Perezgonzalez JD
    Despite frequent calls for the overhaul of null hypothesis significance testing (NHST), this controversial procedure remains ubiquitous in behavioral, social and biomedical teaching and research. Little change seems possible once the procedure becomes well ingrained in the minds and current practice of researchers; thus, the optimal opportunity for such change is at the time the procedure is taught, be this at undergraduate or at postgraduate levels. This paper presents a tutorial for the teaching of data testing procedures, often referred to as hypothesis testing theories. The first procedure introduced is Fisher's approach to data testing (tests of significance); the second is Neyman-Pearson's approach (tests of acceptance); the final procedure is the incongruent combination of the previous two theories into the current approach (NHST). For those researchers sticking with the latter, two compromise solutions on how to improve NHST conclude the tutorial.
  • Item
    Open letter to The Independent - Pilots 'very likely' to misjudge flying conditions due to irrational decisions, revisited
    (Figshare, 2016-12-22) Perezgonzalez JD
    Staufenberg’s news article (2016) comments on research reported by Walmsley and Gilbey (2016). An interview with the corresponding author also yielded extra information, especially the verbalization that practically all pilots fell prey to cognitive biases and the hint that pilots were making irrational decisions. In reality, Walmsley and Gilbey’s own results do not support many of the conclusions drawn. I have further expanded on information which is specific to Staufenberg’s news article, especially information about the minimum meteorological conditions for visual flight rules (VFR) flying in the UK, as well as a breakdown of the percentage of pilots in Walmsley and Gilbey’s study which contradicts the information provided.
  • Item
    Confidence intervals and tests are two sides of the same research question.
    (FRONTIERS RESEARCH FOUNDATION, 2015) Perezgonzalez JD
  • Item
    Sorry to say, but pilots’ decisions were not irrational
    (British Psychological Society Digest Research, 2016-12-16) Perezgonzalez JD
    Fradera’s Digest (2016) makes for interesting reading for aviators and cognitive psychologists alike. Fradera reports on a research article by Walmsley and Gilbey (2016), and the Digest seems fairly accurate with respect to the contents commented upon (thus, in a way, whatever praise or criticism is raised applies equally to the latter article). The Digest is interesting because what it says is quite relevant in principle but rather misleading in practice. That is, the actual results reported by Walmsley and Gilbey do not seem to support the portrayal of pilots as biased and irrational, a portrayal which originates in the interpretation of those results based on a flawed statistical technique: null hypothesis significance testing, or NHST. In a nutshell, Fradera opted to summarize the interpretation of (some) outputs made by Walmsley and Gilbey instead of re-interpreting those outputs anew within the context of the methodology and the results described in the original article, as I shall argue.