TY - JOUR
A1 - Trafimow, David
A1 - Amrhein, Valentin
A1 - Areshenkoff, Corson N.
A1 - Barrera-Causil, Carlos J.
A1 - Beh, Eric J.
A1 - Bilgiç, Yusuf K.
A1 - Bono, Roser
A1 - Bradley, Michael T.
A1 - Briggs, William M.
A1 - Cepeda-Freyre, Héctor A.
A1 - Chaigneau, Sergio E.
A1 - Ciocca, Daniel R.
A1 - Correa, Juan C.
A1 - Cousineau, Denis
A1 - de Boer, Michiel R.
A1 - Dhar, Subhra S.
A1 - Dolgov, Igor
A1 - Gómez-Benito, Juana
A1 - Grendar, Marian
A1 - Grice, James W.
A1 - Guerrero-Gimenez, Martin E.
A1 - Gutiérrez, Andrés
A1 - Huedo-Medina, Tania B.
A1 - Jaffe, Klaus
A1 - Janyan, Armina
A1 - Karimnezhad, Ali
A1 - Korner-Nievergelt, Fränzi
A1 - Kosugi, Koji
A1 - Lachmair, Martin
A1 - Ledesma, Rubén D.
A1 - Limongi, Roberto
A1 - Liuzza, Marco T.
A1 - Lombardo, Rosaria
A1 - Marks, Michael J.
A1 - Meinlschmidt, Gunther
A1 - Nalborczyk, Ladislas
A1 - Nguyen, Hung T.
A1 - Ospina, Raydonal
A1 - Perezgonzalez, Jose D.
A1 - Pfister, Roland
A1 - Rahona, Juan J.
A1 - Rodríguez-Medina, David A.
A1 - Romão, Xavier
A1 - Ruiz-Fernández, Susana
A1 - Suarez, Isabel
A1 - Tegethoff, Marion
A1 - Tejo, Mauricio
A1 - van de Schoot, Rens
A1 - Vankov, Ivan I.
A1 - Velasco-Forero, Santiago
A1 - Wang, Tonghui
A1 - Yamada, Yuki
A1 - Zoppino, Felipe C. M.
A1 - Marmolejo-Ramos, Fernando
T1 - Manipulating the Alpha Level Cannot Cure Significance Testing
JF - Frontiers in Psychology
N2 - We argue that making accept/reject decisions on scientific hypotheses, including a recent call for changing the canonical alpha level from p = 0.05 to p = 0.005, is deleterious for the finding of new discoveries and the progress of science. Given that blanket and variable alpha levels both are problematic, it is sensible to dispense with significance testing altogether. There are alternatives that address study design and sample size much more directly than significance testing does; but none of the statistical tools should be taken as the new magic method giving clear-cut mechanical answers. Inference should not be based on single studies at all, but on cumulative evidence from multiple independent studies. When evaluating the strength of the evidence, we should consider, for example, auxiliary assumptions, the strength of the experimental design, and implications for applications. To boil all this down to a binary decision based on a p-value threshold of 0.05, 0.01, 0.005, or anything else, is not acceptable.
KW - statistical significance
KW - null hypothesis testing
KW - p-value
KW - significance testing
KW - decision making
Y1 - 2018
U6 - http://nbn-resolving.de/urn/resolver.pl?urn:nbn:de:bvb:20-opus-189973
SN - 1664-1078
VL - 9
IS - 699
ER -