August 24, 2016

Conducting a self-audit on publications→

August 24, 2016/ Jonathan

Great example.

Like some of my more courageous colleagues, I conducted an audit of all my published individual studies since that started in January of 2009.

August 22, 2016

The principle of assumed error→

August 22, 2016/ Jonathan

Nice post from Russ Poldrack.

The principle is that whenever one finds something using a computational analysis that fits with one’s predictions or seems like a “cool” finding, they should assume that it’s due to an error in the code rather than reflecting reality. Having made this assumption, one should then do everything they can to find out what kind of error could have resulted in the effect. This is really no different from the strategy that experimental scientists use (in theory), in which upon finding an effect they test every conceivable confound in order to rule them out as a cause of the effect. However, I find that this kind of thinking is much less common in computational analyses. Instead, when something “works” (i.e. gives us an answer we like) we run with it, whereas when the code doesn’t give us a good answer then we dig around for different ways to do the analysis that give a more satisfying answer. Because we will be more likely to accept errors that fit our hypotheses than those that do not due to confirmation bias, this procedure is guaranteed to increase the overall error rate of our research. If this sounds a lot like p-hacking, that’s because it is

May 24, 2016

Statistically Controlling for Confounding Constructs Is Harder than You Think→

May 24, 2016/ Jonathan

Social scientists often seek to demonstrate that a construct has incremental validity over and above other related constructs. However, these claims are typically supported by measurement-level models that fail to consider the effects of measurement (un)reliability. We use intuitive examples, Monte Carlo simulations, and a novel analytical framework to demonstrate that common strategies for establishing incremental construct validity using multiple regression analysis exhibit extremely high Type I error rates under parameter regimes common in many psychological domains. Counterintuitively, we find that error rates are highest—in some cases approaching 100%—when sample sizes are large and reliability is moderate. Our findings suggest that a potentially large proportion of incremental validity claims made in the literature are spurious. We present a web application (http://jakewestfall.org/ivy/) that readers can use to explore the statistical properties of these and other incremental validity arguments. We conclude by reviewing SEM-based statistical approaches that appropriately control the Type I error rate when attempting to establish incremental validity.
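To get a feel for why unreliable measurement inflates these error rates, here is a minimal simulation sketch. It is not the authors' analytical framework or their web application; the function names, the noise levels, and the single-latent-variable setup are all assumptions made for illustration. Two noisy measures (x1, x2) and an outcome (y) all reflect one latent construct, so x2 has no real incremental validity, yet it usually looks "significant" after controlling for a noisy x1.

```python
import math
import random


def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def false_positive_rate(error_sd, n=200, n_sims=200, seed=1):
    """Share of simulated studies in which x2 appears incrementally valid.

    error_sd is the measurement noise in the control measure x1
    (0 means x1 measures the latent construct perfectly).
    """
    rng = random.Random(seed)
    critical_t = 1.97  # approximate two-sided 5% cutoff for about 200 df
    hits = 0
    for _ in range(n_sims):
        latent = [rng.gauss(0, 1) for _ in range(n)]
        x1 = [t + rng.gauss(0, error_sd) for t in latent]
        x2 = [t + rng.gauss(0, 1) for t in latent]
        y = [t + rng.gauss(0, 1) for t in latent]
        r_y1 = correlation(y, x1)
        r_y2 = correlation(y, x2)
        r_12 = correlation(x1, x2)
        # Partial correlation of y and x2 controlling for x1; testing it is
        # equivalent to testing x2's coefficient in the regression y ~ x1 + x2.
        partial = (r_y2 - r_y1 * r_12) / math.sqrt(
            (1 - r_y1 ** 2) * (1 - r_12 ** 2)
        )
        t_stat = partial * math.sqrt((n - 3) / (1 - partial ** 2))
        if abs(t_stat) > critical_t:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("noisy control measure:  ", false_positive_rate(error_sd=1.0))
    print("perfect control measure:", false_positive_rate(error_sd=0.0))
```

With a noisy control measure the spurious "incremental" effect is detected in nearly every simulated study, while a perfectly reliable control brings the rate back to roughly the nominal 5%.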

April 26, 2016

The ASA's statement on p-values: context, process, and purpose→

April 26, 2016/ Jonathan

Good summary and interesting background on the ASA's statement on p-values.

Let’s be clear. Nothing in the ASA statement is new. Statisticians and others have been sounding the alarm about these matters for decades, to little avail. We hoped that a statement from the world’s largest professional association of statisticians would open a fresh discussion and draw renewed and vigorous attention to changing the practice of science with regards to the use of statistical inference.

January 26, 2016

Dorothy Bishop on "ghost variables" and why researchers need to understand poker→

January 26, 2016/ Jonathan

Good reminder (and a good analogy) from Dorothy Bishop on reporting all variables we test:

Quite simply p-values are only interpretable if you have the full context: if you pull out the 'significant' variables and pretend you did not test the others, you will be fooling yourself - and other people - by mistaking chance fluctuations for genuine effects.
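A quick back-of-the-envelope sketch of the problem (my own illustration, not from Bishop's post): if every tested variable is truly null and the tests are independent, both of which are simplifying assumptions, the chance of at least one "significant" result grows quickly with the number of variables tested.

```python
def familywise_error_rate(n_tests, alpha=0.05):
    """Probability of at least one false positive among n_tests
    independent tests when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


if __name__ == "__main__":
    for n_tests in (1, 5, 20):
        print(n_tests, round(familywise_error_rate(n_tests), 3))
```

Reporting only the one variable that "worked" hides the value of n_tests, which is exactly the context a reader needs to interpret the p-value.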

January 26, 2016

Guess the correlation→

January 26, 2016/ Jonathan

Fun and informative stats game.

January 02, 2016

AsPredicted.org: Pre-registration made easy→

January 02, 2016/ Jonathan

Nice, short introduction to the benefits of preregistration (with a nod to the authors' website, which seems great).

December 07, 2015

Great paper looking at the use of parametric vs. permutation tests in group-level fMRI analysis→

December 07, 2015/ Jonathan

The most widely used task fMRI analyses use parametric methods that depend on a variety of assumptions. While individual aspects of these fMRI models have been evaluated, they have not been evaluated in a comprehensive manner with empirical data. In this work, a total of 2 million random task fMRI group analyses have been performed using resting state fMRI data, to compute empirical familywise error rates for the software packages SPM, FSL and AFNI, as well as a standard non-parametric permutation method. While there is some variation, for a nominal familywise error rate of 5% the parametric statistical methods are shown to be conservative for voxel-wise inference and invalid for cluster-wise inference; in particular, cluster size inference with a cluster defining threshold of p = 0.01 generates familywise error rates up to 60%. We conduct a number of follow up analyses and investigations that suggest the cause of the invalid cluster inferences is spatial auto correlation functions that do not follow the assumed Gaussian shape. By comparison, the non-parametric permutation test, which is based on a small number of assumptions, is found to produce valid results for voxel as well as cluster wise inference. Using real task data, we compare the results between one parametric method and the permutation test, and find stark differences in the conclusions drawn between the two using cluster inference. These findings speak to the need of validating the statistical methods being used in the neuroimaging field.
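For readers who have not used permutation tests, here is a minimal sketch of the core idea for a one-sample group test: under the null hypothesis each subject's value is equally likely to be positive or negative, so we can flip signs and see how often the flipped data look as extreme as the observed data. This is only a toy version for a single value per subject. It is not the pipeline from the paper, which works on whole brain maps and cluster statistics, and the function name is my own.

```python
from itertools import product


def sign_flip_p_value(values):
    """Exact two-sided sign-flipping test of whether the mean differs from zero.

    Enumerates all 2**n sign assignments, so it is only practical for small n.
    """
    observed = abs(sum(values))
    n_extreme = 0
    n_total = 0
    for signs in product((1, -1), repeat=len(values)):
        flipped = abs(sum(s * v for s, v in zip(signs, values)))
        # Small tolerance so floating-point ties count as "at least as extreme".
        if flipped >= observed - 1e-12:
            n_extreme += 1
        n_total += 1
    return n_extreme / n_total


if __name__ == "__main__":
    print(sign_flip_p_value([1, 2, 3, 4]))
```

The appeal is that the null distribution comes from the data themselves rather than from an assumed distributional shape.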

November 05, 2015

What's the probability that a significant p-value indicates a true effect?→

November 05, 2015/ Jonathan

Great post and interactive demonstrations from Felix Schönbrodt.
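The basic arithmetic behind the question can be sketched in a few lines. This is the textbook positive predictive value calculation and may not match everything in the post's demonstrations; the parameter names are my own, and the sketch ignores bias and p-hacking.

```python
def positive_predictive_value(prior, power, alpha=0.05):
    """Probability that a significant result reflects a true effect.

    prior: proportion of tested hypotheses that are actually true
    power: probability of detecting a true effect
    alpha: false positive rate when there is no effect
    """
    true_positives = power * prior
    false_positives = alpha * (1 - prior)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    print(round(positive_predictive_value(prior=0.5, power=0.8), 3))
    print(round(positive_predictive_value(prior=0.1, power=0.35), 3))
```

With well-powered tests of plausible hypotheses, most significant results are true; with low power and unlikely hypotheses, fewer than half are.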

October 10, 2015

The Fallacy of Placing Confidence in Confidence Intervals→

October 10, 2015/ Jonathan

Seems like an important paper, and a very cool website.

In this paper, we argue that advocacy of CIs is based on a folk understanding rather than a principled understanding of CI theory. We outline three fallacies underlying the folk theory of CIs, and place these in the philosophical and historical context of CI theory proper. Through an accessible example adapted from the statistical literature, we show how CI theory differs from the folk theory of CIs. Finally, we show the fallacies of confidence in the context of a CI advocated and commonly used for ANOVA and regression analysis, and discuss the implications of the mismatch between CI theory and the folk theory of CIs.

August 31, 2015

The Bayesian Reproducibility Project→

August 31, 2015/ Jonathan

Alexander Etz on why we need a better metric for "success" in reproducibility.

Based on these two metrics, the headlines are accurate: Over half of the replications “failed”. But these two reproducibility metrics are either invalid (comparing significance levels across experiments) or very vague (confidence interval agreement). They also only offer binary answers: A replication either “succeeds” or “fails”, and this binary thinking leads to absurd conclusions in some cases like those mentioned above. Is replicability really so black and white? I will explain below how I think we should measure replicability in a Bayesian way, with a continuous measure that can find reasonable answers with replication effects near zero with wide CIs, effects near the original with tight CIs, effects near zero with tight CIs, replication effects that go in the opposite direction, and anything in between.

August 27, 2015

Interactive visualization of correlations is great (try dragging data around)→

August 27, 2015/ Jonathan

Very nice.

May 18, 2015

Daniel Lakens: The perfect t-test→

May 18, 2015/ Jonathan

Great idea from Daniel Lakens—an R script that helps you properly compare two groups.

The goal of this script is to examine whether more researcher-centered statistical tools (i.e., a one-click analysis script that checks normality assumptions, calculates effect sizes and their confidence intervals, creates good figures, calculates Bayesian and robust statistics, and writes the results section) increases the use of novel statistical procedures. Download the script here: https://github.com/Lakens/Perfect-t-test.

April 29, 2015

Reminder: There's more to improving science than p values→

April 29, 2015/ Jonathan

Good reminder that there is a lot more to improving the quality of science than p values.

P values are an easy target: being widely used, they are widely abused. But, in practice, deregulating statistical significance opens the door to even more ways to game statistics — intentionally or unintentionally — to get a result. Replacing P values with Bayes factors or another statistic is ultimately about choosing a different trade-off of true positives and false positives. Arguing about the P value is like focusing on a single misspelling, rather than on the faulty logic of a sentence.

April 23, 2015

Beyond bar and line graphs: Time for a new data presentation paradigm→

April 23, 2015/ Jonathan

Different data can lead to the same summary statistics - hence, authors should try to show as much data as possible (e.g., scatter plots rather than bar graphs of means). Good advice.
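A tiny made-up example of the point (the numbers are invented for illustration and are not from the paper): two samples with identical means and standard deviations but clearly different shapes, which a bar graph of means with error bars would render identically.

```python
from statistics import mean, pstdev

# Invented data: one sample spread around a single center, one split in two.
clustered = [2, 4, 4, 4, 5, 5, 7, 9]
bimodal = [3, 3, 3, 3, 7, 7, 7, 7]


def summarize(data):
    """Return the (mean, population standard deviation) of a sample."""
    return mean(data), pstdev(data)


if __name__ == "__main__":
    print("clustered:", summarize(clustered))
    print("bimodal:  ", summarize(bimodal))
```

Only a plot of the individual points reveals that the second sample consists of two separate groups.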

March 12, 2015

John Ioannidis on scientific accuracy→

March 12, 2015/ Jonathan

Interesting interview from Vox with the author of "Why most published research findings are false" (and many other articles), including personal tidbits:

He even has a mythical origin story. He was raised in Greece, the home of Pythagoras and Euclid, by physician-researchers who instilled in him a love of mathematics. By seven, he quantified his affection for family members with a "love numbers" system. ("My mother was getting 1,024.42," he said. "My grandmother, 173.73.")

and thoughts on how to improve science:

Recently there’s increasing emphasis on trying to have post-publication review. Once a paper is published, you can comment on it, raise questions or concerns. But most of these efforts don’t have an incentive structure in place that would help them take off. There’s also no incentive for scientists or other stakeholders to make a very thorough and critical review of a study, to try to reproduce it, or to probe systematically and spend real effort on re-analysis. We need to find ways people would be rewarded for this type of reproducibility or bias checks.

January 19, 2015

Machine Learning: Exceeding Chance Level By Chance→

January 19, 2015/ Jonathan

A possible warning for machine learning studies: the accuracy a classifier can reach by "chance" depends on the number of samples. With a finite sample, it can be noticeably higher than the theoretical chance level.

UPDATE: The general consensus seems to be that permutation testing, as commonly done, is valid and avoids the potential problems laid out by the authors.
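To see how sample size changes what counts as better than chance, here is a simple binomial sketch. It is my own illustration rather than the authors' analysis, and it assumes two balanced classes and independent test samples. It finds the smallest number of correct classifications that a guessing classifier would reach or exceed less than 5% of the time.

```python
from math import comb


def tail_probability(n, k, p=0.5):
    """P(at least k correct out of n) for a classifier guessing with accuracy p."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def min_significant_correct(n, p=0.5, alpha=0.05):
    """Smallest number of correct answers whose tail probability is <= alpha.

    Returns None if even a perfect score is not that unlikely under guessing.
    """
    for k in range(n + 1):
        if tail_probability(n, k, p) <= alpha:
            return k
    return None


if __name__ == "__main__":
    for n in (20, 100, 1000):
        k = min_significant_correct(n)
        print(n, k, round(k / n, 3))
```

With 20 test samples, 75% accuracy is needed before the result is unlikely under guessing; the required accuracy drops toward 50% only as the sample grows.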

Jonathan Peelle
