Non-parametric methods are used with data measured on an ordered scale (for example, film reviews rated from one to four stars). They may also be needed whenever the data have no clear numerical interpretation, as with ranks or categories. Because non-parametric approaches make fewer assumptions, they apply far more broadly than the corresponding parametric methods. In particular, non-parametric tests are applied in situations where less is known about the experiment in question. Specifically, non-parametric methods were developed for cases in which the researcher knows little about the parameters of the variable of interest in the population. In more technical terms, non-parametric methods do not rely on the estimation of parameters (such as the mean or the standard deviation) describing the distribution of the variable of interest in the population. This paper analyzes different sets of data using non-parametric methods.
Part A. Questions about non-parametric procedures
1. What are the most common reasons you would select a non-parametric test over the parametric alternative?
Nonparametric, or distribution-free, tests are so called because the assumptions underlying their use are fewer and weaker than those associated with parametric tests. To put it another way, nonparametric tests require few if any assumptions about the shapes of the underlying population distributions. For this reason, they are often used in place of parametric tests when the researcher feels that the assumptions of the parametric test have been too grossly violated. Another reason is that the outcome is a rank or a score and the population is clearly not Gaussian. Sometimes, some values are 'off the scale', that is, too high or too low to measure. Even if the population is Gaussian, it is impossible to analyze such data with a parametric test since the researcher does not know all of the values. A further reason arises when the data are measurements and the researcher is sure that the population is not distributed in a Gaussian manner.
2. Discuss the issue of statistical power in non-parametric tests (as compared to their parametric counterparts). Which type tends to be more powerful? Why?
The statistical power of non-parametric tests is generally lower than that of their parametric counterparts when the assumptions of the parametric test hold, because non-parametric tests use only the ranks or signs of the observations and discard information about their magnitudes. Nonparametric tests especially lack statistical power with small samples. When a nonparametric test is used with small samples drawn from a Gaussian population, the P values tend to be too high. In general, the most important component affecting statistical power is sample size, in the sense that the most frequently asked question in practice is how many observations need to be collected. In fact, there is little room to change the test size (significance level), since the conventional .05 and .01 levels are widely used. It is difficult to control effect sizes in many cases. It is, of course, costly and time-consuming to collect more observations. However, if too many observations are used (or if a test is too powerful because of a large sample size), even a trivial effect will be detected as statistically significant. Thus, virtually anything can be shown to be significant regardless of the practical importance of the effect. By contrast, if too few observations are used, a hypothesis test will be weak and less convincing, and there may be little chance of detecting a meaningful effect even when it exists. Statistical power analysis answers two questions. What is the statistical power of a test, given N, effect size, and test size? How many observations, at minimum, are needed for a test, given effect size, test size, and statistical power?
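The two power-analysis questions above can be illustrated with a minimal sketch. It assumes a two-sided, two-sample z-test with equal group sizes and a standardized effect size (Cohen's d), which is only a normal approximation and not the procedure SPSS uses; the function names are illustrative.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def approx_power(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test (equal group sizes)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic when the true effect is effect_size.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability that the statistic falls in either rejection region.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_needed(effect_size, power=0.80, alpha=0.05):
    """Smallest equal group size reaching the target power (normal approximation)."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_crit + z_power) / effect_size) ** 2)


if __name__ == "__main__":
    print(n_per_group_needed(0.5))          # group size for d = 0.5
    print(round(approx_power(40, 0.5), 3))  # power with 40 per group
```

Under this approximation, a medium effect (d = 0.5) at the .05 level with a target power of .80 calls for 63 observations per group. A rank-based test would typically need somewhat more observations to reach the same power when the data really are Gaussian.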
3. Which non-parametric test should be used?
Non-parametric tests do not assume an underlying Normal (bell-shaped) distribution. Therefore, there are two general situations when non-parametric tests are used:
a. Data is nominal or ordinal (variance cannot be calculated).
b. The data does not satisfy other assumptions underlying parametric tests.
The researcher can use the following table as a guide when choosing which non-parametric test to perform:
Part B. SPSS Activity
In this part the researcher will perform the non-parametric versions of the tests used in previous activities. In each case, the assumption is that the researcher opts for the non-parametric equivalent rather than the parametric test.
1. Activity 5a: Sign test and Wilcoxon's matched pairs test.
The first section gives the descriptive statistics for the dependent variable for each level of the independent variable. In this example, there were 40 people (N) in each condition. The pre-test students gave a mean liking rating of 40.15 with a standard deviation of 8.304 (although this number may not be meaningful in this example, as standard deviation is not a valid statistic for an ordinally scaled variable). The post-test students gave a mean liking rating of 43.35 with a standard deviation of 9.598.
The second section of the output shows the ranks for the Wilcoxon test. It gives the number of observations (N), 9, in which the post-test students scored lower than their matched counterparts in the pre-test (the Negative Ranks row). It also gives the number of observations, 28, in which the post-test students did better than their matched counterparts (the Positive Ranks row). Finally, it gives the number of observations, 3, in which the post-test students got the same score as their matched counterparts in the pre-test (the Ties row).
The third section of the output gives the values of the Wilcoxon test. The p value associated with the Wilcoxon test is given at the intersection of the row labeled Asymp. Sig. (2-tailed) (asymptotic significance, 2-tailed) and the column labeled with the difference of the variables that correspond to the means in the hypothesis (e.g. Liking Rating for post-test - Liking Rating for pre-test). In this example, the p value for the Wilcoxon test is .001.
This section of the output is similar to the ranks section. It is produced for the sign test, while the ranks section is produced for the Wilcoxon test. It gives the number of observations (N), 9, in which the post-test students scored lower than their matched counterparts in the pre-test (the Negative Differences row). It also gives the number of observations, 28, in which the post-test students did better than their matched counterparts (the Positive Differences row). Finally, it gives the number of observations, 3, in which the post-test students got the same scores as their matched counterparts in the pre-test (the Ties row).
The final section of the output gives the values of the Sign test. The p value associated with the sign test is given at the intersection of the row labeled Exact Sig. (2-tailed) and the column labeled with the difference of the variables that correspond to the means in the hypothesis (e.g. Liking Rating for post-test - Liking Rating for pre-test). In this example, the p value for the sign test is .003.
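The Negative Ranks, Positive Ranks, and Ties rows described above can be reproduced by hand. The following is a minimal sketch under stated assumptions: the scores are made-up illustrative values (not the study data), the function names are hypothetical, and only the rank sums are computed, not the p value.

```python
def midranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def wilcoxon_rank_summary(pre, post):
    """Counts and rank sums for the differences post - pre (zero differences dropped)."""
    diffs = [after - before for before, after in zip(pre, post)]
    nonzero = [d for d in diffs if d != 0]
    ranks = midranks([abs(d) for d in nonzero])  # rank the absolute differences
    return {
        "negative_n": sum(d < 0 for d in nonzero),
        "positive_n": sum(d > 0 for d in nonzero),
        "ties": len(diffs) - len(nonzero),
        "sum_negative_ranks": sum(r for d, r in zip(nonzero, ranks) if d < 0),
        "sum_positive_ranks": sum(r for d, r in zip(nonzero, ranks) if d > 0),
    }


if __name__ == "__main__":
    # Hypothetical matched scores, for illustration only.
    pre_scores = [10, 12, 9, 14, 11]
    post_scores = [12, 12, 8, 18, 14]
    print(wilcoxon_rank_summary(pre_scores, post_scores))
```

The smaller of the two rank sums is the quantity conventionally compared against the Wilcoxon reference distribution.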
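The sign test p value can be checked directly from the counts of negative and positive differences, because under the null hypothesis each non-tied pair is equally likely to go either way. The sketch below assumes the exact two-sided binomial form of the test; the function name is illustrative, and the counts 9 and 28 are the ones reported in the output described above.

```python
from math import comb


def sign_test_p_value(negative, positive):
    """Exact two-sided sign test p value; ties are excluded before calling."""
    n = negative + positive
    k = min(negative, positive)
    # Probability of k or fewer of the rarer sign under Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)  # double the tail, capped at 1


if __name__ == "__main__":
    print(round(sign_test_p_value(9, 28), 3))
```

With 9 negative and 28 positive differences the value rounds to .003, in agreement with the figure quoted above.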
2. Activity 5b: non-parametric version of the independent t-test using Mann-Whitney test.
The first section gives the descriptive statistics for the dependent variable and (less usefully) for the independent variable. In this example, there were 80 observations (N) in total across the pre-test and post-test groups. They have a mean score of 41.75 with a standard deviation of 9.062 (although this number may not be meaningful in this example, as standard deviation is not a valid statistic for an ordinally scaled variable).
The second section of the output shows the number (N) of people in each condition (40 pre-test and 40 post-test) and the mean rank and sum of ranks for each group (useful if the researcher were calculating the U statistic by hand).
The final section of the output gives the values of the Mann-Whitney U test (and several other tests as well). The observed Mann-Whitney U value is given at the intersection of the row labeled Mann-Whitney U and the column labeled with the dependent variable (score received in test). In this example, the Mann-Whitney U value is 629.0. There are two p values given, one on the row labeled Asymp. Sig. (2-tailed) and the other on the row labeled Exact Sig. [2*(1-tailed Sig.)]. Typically, the researcher will use the exact significance, although if the sample size is large, the asymptotic significance value can be used to gain a little statistical power.
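Calculating the U statistic by hand from the sum of ranks, as mentioned above, can be sketched as follows. This is a minimal illustration under stated assumptions: the data are made-up values (not the study scores), the function names are hypothetical, and the smaller of the two U values is returned, which is the usual convention.

```python
def midranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def mann_whitney_u(group_a, group_b):
    """Mann-Whitney U computed from the rank sum of group_a in the pooled sample."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = midranks(list(group_a) + list(group_b))  # rank both groups together
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2
    u_b = n_a * n_b - u_a  # the two U values always add up to n_a * n_b
    return min(u_a, u_b)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(mann_whitney_u([1, 2, 2], [2, 3, 4]))
```

A U of 0 means the two groups do not overlap at all, while a U near half of n_a * n_b means the groups are thoroughly intermixed.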
3. Activity 5c: non-parametric version of the single factor ANOVA.
The descriptive statistics table is very useful, as it can be used to present descriptive statistics in the researcher's results section for each of the time points or conditions (depending on the study design) for the dependent variable.
The Friedman Test compares the mean ranks of the related groups, and this table indicates how the groups differed; it is included for this reason. However, the researcher is unlikely to report these values in the results section and will more likely report the median value for each related group.
The above table provides the test statistic (χ²) value (Chi-square), degrees of freedom (df) and the significance level (Asymp. Sig.), which is all the researcher needs to report the result of the Friedman Test. There is an overall statistically significant difference between the mean ranks of the related groups. It is important to note that the Friedman Test is an omnibus test like its parametric alternative; that is, it tells the researcher whether there are overall differences but does not pinpoint which groups in particular differ from each other. To do this the researcher needs to run post-hoc tests.
The researcher can report the Friedman Test result as follows: There was a statistically significant difference in perceived systolic blood pressure depending on where the test was taken, χ²(2) = 60, p < .001 (SPSS displays this as .000). The median values could also be included for each of the related groups. However, at this stage, the researcher only knows that there are differences somewhere between the related groups but does not know exactly where those differences lie.
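The Friedman statistic itself is computed from the within-subject ranks. The following is a minimal sketch assuming the basic formula without a correction for ties; the data are made-up values (not the study data) and the function names are illustrative.

```python
def midranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


def friedman_chi_square(rows):
    """Friedman chi-square and df; each row holds one subject's k related scores."""
    n, k = len(rows), len(rows[0])
    rank_sums = [0.0] * k
    for row in rows:
        # Rank the k conditions within each subject, then add up per condition.
        for condition, rank in enumerate(midranks(row)):
            rank_sums[condition] += rank
    statistic = 12 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3 * n * (k + 1)
    return statistic, k - 1


if __name__ == "__main__":
    # Hypothetical scores for 3 subjects under 3 conditions, for illustration only.
    scores = [[1, 2, 3], [2, 4, 5], [3, 6, 9]]
    print(friedman_chi_square(scores))
```

When every subject orders the conditions identically, the statistic reaches its maximum for that design; when the rank sums are equal across conditions, it is zero.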
4. Activity 6: non-parametric version of the factorial ANOVA
(Note) According to the IBM Support Department at https://www-304.ibm.com/support/docview.wss?uid=swg21487416 there are no options for nonparametric factorial ANOVA models in SPSS. One possible workaround would be to separate the groups and perform the non-parametric equivalent of the independent one-way ANOVA on each group, using the Kruskal-Wallis analysis of ranks and the Median test.
Part C. Contingency Tables
Sometimes a researcher is interested only in whether two variables are dependent on one another (e.g. are death and smoking dependent variables? Are SAT scores and high school grades independent variables?). To test this type of claim the researcher uses a contingency table, with the null hypothesis being that the variables are independent. In a contingency table the rows represent one variable and the columns another. In contingency table analysis (also called the chi-square test of independence) the researcher determines how closely the observed count in each cell coincides with the expected value of that cell if the two variables were independent.
The following contingency table lists the response to a bill pertaining to gun control.
Cell 1 indicates that 10 people in the Northeast were in favor of the bill.
In the previous contingency table, 40 out of 160 (1/4) of those surveyed were from the Northeast. If the two variables were independent, the proportion in favor in the Northeast would match the proportion in favor overall; the expected count for a cell is (row total × column total) / grand total. Assuming half of all respondents favored the bill, the researcher would expect 1/2 of that amount (20) to be in favor. The researcher would be checking to see if the observed value of 10 was significantly different from the expected value of 20.
To determine how close the expected values are to the actual values, the test statistic chi-square is determined. Small values of chi-square support the claim of independence between the two variables. That is, chi-square will be small when observed and expected frequencies are close. Large values of chi-square would cause the null hypothesis to be rejected and reflect significant differences between observed and expected frequencies.
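The comparison of observed and expected frequencies can be sketched as below. Only the Northeast "in favor" count (10), the Northeast total (40), and the overall total (160) come from the text; every other count is a hypothetical value chosen for illustration, and the function name is illustrative as well.

```python
def chi_square_independence(observed):
    """Expected counts, chi-square statistic, and df for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # Expected count under independence: row total * column total / grand total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return expected, statistic, df


if __name__ == "__main__":
    # Rows: Northeast, all other regions combined (hypothetical).
    # Columns: in favor, opposed. Only 10, 40, and 160 are taken from the text.
    observed = [[10, 30], [70, 50]]
    expected, statistic, df = chi_square_independence(observed)
    print(expected[0][0], round(statistic, 3), df)
```

With these illustrative counts the expected Northeast "in favor" cell is 20, and the gap between 10 observed and 20 expected is one of the contributions that make the statistic large.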
The Case Processing Summary shows what percentage of cases was included in the analysis, since some data are missing. In this survey, only 65.3% of the cases were included, or 927 participants out of 1419. The missing cases (34.7%) either did not answer the questions or did not participate, and were not included in the analysis.
The crosstabulation table shows the cell counts for each combination of the respondent's highest degree and whether life is exciting or dull. The researcher can see which group has the highest count for an exciting life (High School), a routine life (High School) and a dull life (High School). Respondents whose highest degree is high school have the largest counts in all three categories, which largely reflects the size of that group, so the raw counts alone cannot establish an association between level of education and life being exciting or dull. The researcher can also see the counts per education level within each group. The total column helps the researcher determine the number of cases per category: the largest group was high school, N = 483, and the smallest was junior college, N = 59. Because the group sizes differ this much, the Chi-Square Tests are necessary to assess the null hypothesis.
The Pearson Chi-Square has a value of 39.428. The Sig. value being less than .05 indicates a statistically significant association between level of education and whether life is perceived as exciting or dull. It is statistically improbable that the difference the researcher sees has occurred by chance. The test evaluates a null hypothesis stating that the frequency distribution of the events observed in a sample is consistent with a particular theoretical distribution, here the one expected under independence. The events considered are mutually exclusive and have total probability 1. Therefore, the association seen in the sample is likely to hold in the larger population as well. The researcher rejects the null hypothesis that education and perception of life are independent.