Evaluating Continuous Data with Parametric and Nonparametric Tests
Continuous data consists of measurements recorded on a scale, such as white blood cell count, blood pressure, or temperature. There are two types of statistical tests that are appropriate for continuous data -- parametric tests and nonparametric tests.
Parametric tests assume that the data follow a particular distribution, most commonly the normal distribution.
Nonparametric tests are based on the ranks of the data values rather than on the values themselves. Because of this, they do not depend on the scale of the measurements and do not require the data to follow a particular distribution.
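To illustrate why rank-based methods do not depend on scale, here is a minimal sketch in Python. The white blood cell counts are made up for illustration, and the helper average_ranks is a hypothetical name, not part of any library. It shows that ranks stay the same when the values are logged or converted to other units.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

# Hypothetical white blood cell counts, made up for illustration.
wbc = [4.5, 11.2, 6.8, 6.8, 25.0, 5.1]

ranks_raw = average_ranks(wbc)
ranks_log = average_ranks([math.log(x) for x in wbc])   # change of scale
ranks_scaled = average_ranks([1000 * x for x in wbc])   # change of units

print(ranks_raw)
print(ranks_raw == ranks_log == ranks_scaled)
```

Any test that uses only these ranks will give the same answer on all three versions of the data, which is not true of a test based on means.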
Choosing Between Parametric and Nonparametric Tests
Deciding whether to use a parametric or nonparametric test depends on the normality of the data that you are working with.
Therefore, the first step in making this decision is to check normality.
One option is to perform a simple check based on a histogram.
A histogram is simply a frequency plot of the values observed in a dataset. For example, if researchers were interested in temperature, they could examine a histogram that displays how often each temperature occurs in their sample data.
If your histogram is roughly symmetrical and bell-shaped, it is reasonable to treat the data as approximately normally distributed, and a parametric test will be appropriate.
If the histogram is clearly skewed or otherwise far from bell-shaped, then a nonparametric test will be more appropriate.
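The histogram check can be sketched in a few lines of Python. Everything here is an assumption for illustration: the data are simulated, the helper names text_histogram and sample_skewness are hypothetical, and the skewness value is only a rough numeric companion to the visual check, not a formal normality test.

```python
import random
import statistics

def text_histogram(values, bins=8, width=40):
    """Print a crude text histogram and return the per-bin counts."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that the maximum value lands in the last bin.
        idx = min(int((v - lo) / step), bins - 1)
        counts[idx] += 1
    peak = max(counts)
    for b, c in enumerate(counts):
        bar = "#" * round(width * c / peak)
        print(f"{lo + b * step:8.2f} | {bar} {c}")
    return counts

def sample_skewness(values):
    """Moment-based skewness: near 0 if symmetric, above 0 for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

random.seed(0)
# Simulated body temperatures (roughly normal) and simulated waiting times
# (right-skewed). Both samples are invented for this example.
temps = [random.gauss(36.8, 0.4) for _ in range(500)]
waits = [random.expovariate(1 / 3.0) for _ in range(500)]

temp_counts = text_histogram(temps)
temp_skew = sample_skewness(temps)
print("temperature skewness:", round(temp_skew, 2))

wait_counts = text_histogram(waits)
wait_skew = sample_skewness(waits)
print("waiting time skewness:", round(wait_skew, 2))
```

The first histogram should look roughly bell-shaped with skewness near zero, while the second should pile up on the left with a long right tail.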
It should be noted that checking normality of data produced by smaller samples can be difficult.
Sometimes with a small sample, the data displayed in a histogram will be obviously asymmetrical, but there are certainly occasions in which it is impossible to tell.
This is because with a small sample, the histogram may not be smooth even if the data are normal. There might not be any clear evidence of symmetry or asymmetry, which can make it difficult to determine whether the data are normal or not.
However, one way to get around this obstacle is to look for an earlier study in which the same measurements were taken from a larger sample.
If the data were found to be normal in previous studies using larger samples, it is reasonable to assume that your data come from a normal distribution as well.
If Your Data is Not Normal, There Are Steps You Can Take
If your data is not normal, there are a few steps you can take prior to performing a nonparametric test.
If your data has a generally skewed distribution, you could consider a transformation of the data.
When data is strongly skewed in one direction, applying the same mathematical function to every value can make the distribution more symmetrical; for right-skewed data, a logarithm or square root is a common choice. You can then plot a histogram of the transformed values to check whether they look closer to normal, and run the parametric test on the transformed scale.
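As a rough sketch of what a transformation does, the Python example below takes the natural logarithm of simulated right-skewed data and compares skewness before and after. The data are generated for illustration, the helper sample_skewness is a hypothetical name, and the log is only one possible choice of transformation.

```python
import math
import random
import statistics

def sample_skewness(values):
    """Moment-based skewness: near 0 if symmetric, above 0 for a long right tail."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

random.seed(1)
# Simulated right-skewed measurements (log-normal), invented for this example.
raw = [random.lognormvariate(1.0, 0.8) for _ in range(500)]

# The log requires strictly positive values, which holds for this sample.
logged = [math.log(v) for v in raw]

skew_raw = sample_skewness(raw)
skew_logged = sample_skewness(logged)

print("skewness before transformation:", round(skew_raw, 2))
print("skewness after log transformation:", round(skew_logged, 2))
```

Keep in mind that results from a test on transformed data describe the transformed scale, so a difference in mean log values is not the same thing as a difference in raw means.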
If non-normality is due to outliers, it’s best to consider whether or not their deletion could be justified.
If the outliers are far from the rest of the data, it’s best to investigate them further to see whether they result from an error in your data collection, or whether they are relevant to your study at all. If they are errors or are not relevant, it’s reasonable to remove them from your dataset, which may bring the distribution closer to normal.
Another option here is to perform your analysis twice, once with the outliers and once without them, and compare the results. If the conclusions agree, the outliers have little influence on your findings.
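Running an analysis with and without outliers might look like the following Python sketch. The temperatures are invented, the helper names are hypothetical, and the 1.5 x IQR rule used to flag outliers is a common convention assumed here rather than something this article prescribes. Flagging a value this way does not by itself justify deleting it.

```python
import statistics

def flag_outliers(values, k=1.5):
    """Split values into (kept, flagged) using the k * IQR fence convention."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    kept = [v for v in values if low <= v <= high]
    flagged = [v for v in values if v < low or v > high]
    return kept, flagged

def summarize(values):
    """Return the mean and median, two summaries that react differently to outliers."""
    return statistics.fmean(values), statistics.median(values)

# Hypothetical temperatures in degrees Celsius; 41.5 may be a recording error.
temps = [36.5, 36.7, 36.8, 36.6, 36.9, 37.0, 36.7, 36.8, 36.6, 41.5]

kept, flagged = flag_outliers(temps)
mean_all, median_all = summarize(temps)
mean_kept, median_kept = summarize(kept)

print("flagged values:", flagged)
print("with outliers:    mean", round(mean_all, 2), "median", round(median_all, 2))
print("without outliers: mean", round(mean_kept, 2), "median", round(median_kept, 2))
```

In this example the mean moves much more than the median when the flagged value is dropped, which is the kind of sensitivity the comparison is meant to reveal.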
If the normality of your data is clearly in doubt, parametric tests can lead to misleading results. In that case, it’s time to turn to nonparametric tests.
If Your Data is Normal, Use a Parametric Test
The following statistical analyses can be applied to data that is assumed to have a normal distribution:
- Regression Analysis
- Pearson Correlation Coefficient
If Your Data Is Not Normal, Use a Nonparametric Test
The following statistical analyses can be applied to data that does not have a normal distribution:
- 1-sample sign test
- 1-sample Wilcoxon Signed Rank test
- Friedman test
- Kruskal-Wallis test
- Mann-Whitney test
- Mood’s Median test
- Spearman Rank Correlation
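To give a feel for how one of these tests works, here is a minimal Python sketch of the statistic behind the Mann-Whitney test. The two samples are made up, and the function name mann_whitney_u is hypothetical. The sketch computes only the U statistic, not a p-value; in practice a statistics package (for example, scipy.stats.mannwhitneyu in SciPy) would handle that step.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b): U_a counts the pairs in which the value from sample_a
    exceeds the value from sample_b, with ties counted as one half."""
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b

# Hypothetical measurements from two independent groups, invented for illustration.
treated = [5.1, 6.3, 7.8, 9.4, 12.0]
control = [4.2, 4.9, 5.0, 5.6, 6.1]

u_treated, u_control = mann_whitney_u(treated, control)
print("U (treated):", u_treated)
print("U (control):", u_control)
```

Because U depends only on which value in each pair is larger, it uses the order of the observations and not their actual size, which is why no normality assumption is needed.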
Key Takeaways to Remember About Parametric and Nonparametric Tests
The decision of whether to use a parametric or nonparametric test often depends on whether the mean or median more accurately represents the center of your data set’s distribution.
If the mean more accurately represents the center of the distribution of your data, and your sample size is large enough, use a parametric test.
If the median more accurately represents the center of the distribution of your data, use a nonparametric test even if you have a large sample size.
Lastly, if you are forced to use a small sample size, you might also be forced to use a nonparametric test. You’ll need to consider going out and collecting further data if you are set on using parametric tests!