TY  - JOUR
A1  - Trafimow, David
A1  - Amrhein, Valentin
A1  - Areshenkoff, Corson N.
A1  - Barrera-Causil, Carlos J.
A1  - Beh, Eric J.
A1  - Bilgiç, Yusuf K.
A1  - Bono, Roser
A1  - Bradley, Michael T.
A1  - Briggs, William M.
A1  - Cepeda-Freyre, Héctor A.
A1  - Chaigneau, Sergio E.
A1  - Ciocca, Daniel R.
A1  - Correa, Juan C.
A1  - Cousineau, Denis
A1  - de Boer, Michiel R.
A1  - Dhar, Subhra S.
A1  - Dolgov, Igor
A1  - Gómez-Benito, Juana
A1  - Grendar, Marian
A1  - Grice, James W.
A1  - Guerrero-Gimenez, Martin E.
A1  - Gutiérrez, Andrés
A1  - Huedo-Medina, Tania B.
A1  - Jaffe, Klaus
A1  - Janyan, Armina
A1  - Karimnezhad, Ali
A1  - Korner-Nievergelt, Fränzi
A1  - Kosugi, Koji
A1  - Lachmair, Martin
A1  - Ledesma, Rubén D.
A1  - Limongi, Roberto
A1  - Liuzza, Marco T.
A1  - Lombardo, Rosaria
A1  - Marks, Michael J.
A1  - Meinlschmidt, Gunther
A1  - Nalborczyk, Ladislas
A1  - Nguyen, Hung T.
A1  - Ospina, Raydonal
A1  - Perezgonzalez, Jose D.
A1  - Pfister, Roland
A1  - Rahona, Juan J.
A1  - Rodríguez-Medina, David A.
A1  - Romão, Xavier
A1  - Ruiz-Fernández, Susana
A1  - Suarez, Isabel
A1  - Tegethoff, Marion
A1  - Tejo, Mauricio
A1  - van de Schoot, Rens
A1  - Vankov, Ivan I.
A1  - Velasco-Forero, Santiago
A1  - Wang, Tonghui
A1  - Yamada, Yuki
A1  - Zoppino, Felipe C. M.
A1  - Marmolejo-Ramos, Fernando
T1  - Manipulating the Alpha Level Cannot Cure Significance Testing
JF  - Frontiers in Psychology
N2  - We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = 0.05 to p = 0.005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable alpha levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and sample size much more directly than significance testing does; but none of the statistical tools should be taken as the new magic method giving clear-cut mechanical answers. Inference should not be based on single studies at all, but on cumulative evidence from multiple independent studies. When evaluating the strength of the evidence, we should consider, for example, auxiliary assumptions, the strength of the experimental design, and implications for applications. To boil all this down to a binary decision based on a p-value threshold of 0.05, 0.01, 0.005, or anything else, is not acceptable.
KW  - statistical significance
KW  - null hypothesis testing
KW  - p-value
KW  - significance testing
KW  - decision making
Y1  - 2018
U6  - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-189973
SN  - 1664-1078
VL  - 9
IS  - 699
ER  -