Understanding Family-Wise Error Rate

Timothy Gardner

In a related blog post, we discussed False Discovery Rate (FDR) calculations for dealing with multiple testing error. In my view, FDR is generally more aligned with what you want as an experimenter. But you could take another, generally more stringent, approach to reduce your false findings in multiple testing situations.

The alternate approach is to control the Family-Wise Error Rate (FWER). It’s a scary-sounding term, but don’t be deterred. It’s a simple concept: the probability that one or more of your “family” of multiple tests produces a false positive. This is what we calculated in our related blog post when we explained that there is a 91% chance that one or more of your findings over a year are false if you perform one experimental test each week for 48 weeks, assuming the tests are independent and each is accepted at P≤0.05.
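For readers who want to reproduce that 91% figure, here is a minimal Python sketch. It assumes the 48 tests are independent, that no real effect exists in any of them, and that each is accepted at a threshold of 0.05. The function name `family_wise_error_rate` is just an illustrative choice, not something from the related post.

```python
def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n_tests independent tests.

    Assumes every null hypothesis is true and each test uses threshold alpha.
    """
    # (1 - alpha) is the chance a single test avoids a false positive;
    # raising it to n_tests gives the chance that all of them do.
    return 1.0 - (1.0 - alpha) ** n_tests


fwer_one_year = family_wise_error_rate(alpha=0.05, n_tests=48)
print(f"FWER over 48 weekly tests: {fwer_one_year:.1%}")  # roughly 91%
```

Changing `n_tests` shows how quickly the problem grows: a few tests carry a modest risk, while a full year of weekly tests makes at least one false positive close to certain.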

So what P-value acceptance threshold should we use if we want our FWER to be less than 5% over a year? A typical FWER approach used in the scientific literature is a Bonferroni correction (one of many FWER methods). Bonferroni is super simple: just divide your original acceptance threshold (P≤0.05) by the number of tests you are analyzing. You then accept only results below that new threshold. For example, we would set a threshold of P<0.05/48 = 0.0010417 to ensure that we have a less than 5% chance of accepting a false positive over the course of one year. In other words, this correction method says you must increase your confidence in each week’s experiment to 99.9%.
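The Bonferroni rule is short enough to sketch in a few lines of Python. The helper names and the four sample P-values below are made up purely for illustration; only the 0.05 starting threshold and the 48 tests come from the example above.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test acceptance threshold under a Bonferroni correction."""
    return alpha / n_tests


def bonferroni_accept(p_values, alpha: float, n_tests: int):
    """Flag which P-values fall below the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p < threshold for p in p_values]


threshold = bonferroni_threshold(alpha=0.05, n_tests=48)
print(f"Corrected threshold: {threshold:.7f}")

# Hypothetical P-values from four of the 48 weekly experiments (invented data).
sample_p_values = [0.0004, 0.012, 0.03, 0.0011]
accepted = bonferroni_accept(sample_p_values, alpha=0.05, n_tests=48)
print(accepted)
```

Note that `n_tests` is the size of the whole family (48), not the number of P-values you happen to be looking at. Only the first of the invented values clears the corrected threshold, even though all four would have passed at P≤0.05.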

Now let’s verify that this works. We’ll use the new Bonferroni-corrected P-value threshold to calculate the probability of one or more false findings in a year, just like we did in the related blog post: 100% - (99.89583%)^48 = 4.88%. Cool! It works!!!
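If you would rather see that number emerge from simulated data than from a formula, the sketch below is one way to check it. It assumes that when no real effect exists, each weekly P-value is an independent draw from a uniform distribution between 0 and 1 (the standard behavior for a continuous test statistic). The function name, the number of simulated years, and the random seed are arbitrary choices made for this illustration.

```python
import random


def simulate_fwer(threshold: float, n_tests: int = 48,
                  n_years: int = 20_000, seed: int = 0) -> float:
    """Estimate the FWER by simulating many years of null-only experiments."""
    rng = random.Random(seed)
    years_with_false_positive = 0
    for _ in range(n_years):
        # Under the null, each P-value is uniform on [0, 1), so it falls
        # below the threshold with probability equal to the threshold.
        if any(rng.random() < threshold for _ in range(n_tests)):
            years_with_false_positive += 1
    return years_with_false_positive / n_years


corrected = 0.05 / 48
exact = 1 - (1 - corrected) ** 48
estimate = simulate_fwer(corrected)

print(f"Exact FWER:     {exact:.2%}")
print(f"Simulated FWER: {estimate:.2%}")
```

The simulated value should land close to the exact 4.88%, with a little noise from the finite number of simulated years.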

Er, wait. Maybe it’s not so cool. How many times will your experiment produce a result that has such a huge effect that you’re 99.9% confident it’s right? Not very often. If you made this Bonferroni adjustment, you’d almost certainly miss every potential true discovery lying there among your data. In other words, you would have an unacceptably high false negative rate—and you would throw out the baby with the bathwater.

So what can you do? Most of the time, you’d rather use False Discovery Rate calculations instead of FWER. They give you a better estimate of where to draw the line between baby and bathwater. Alternatively, you could use a more sophisticated FWER method than Bonferroni where your experimental design fits one, like the Tukey procedure (for all pairwise comparisons among group means) or Dunnett’s correction (for comparing several treatments against a single control). As usual, Wikipedia provides an excellent starting point for learning about these and other methods.[1]


[1] Family-Wise Error Rate (2017 Oct) Retrieved from: http://en.wikipedia.org/wiki/Family-wise_error_rate