Chi Square And Nonparametric Techniques Accounting Essay

In many cases, the assumptions required for parametric techniques are not fulfilled. This opens the door to nonparametric techniques. Although nonparametric tests are less powerful than parametric tests, they give better results in many cases. The Chi square test in particular is widely used in routine research.

SPSS provides a number of non parametric techniques, as shown in the following figure.

Although assumption testing is less critical for nonparametric techniques, some generic assumptions still apply to them.

There should be random sampling.

The distributions should have similar shape and variability.

Independence - for between subjects designs, independence (i.e., subjects appear in only one group and groups are not related in any way) must be ensured.

Chi Square

Chi square is used for finding significant relations. It is used to determine if categorical data shows dependency or the two classifications are independent. This test can also be used to make comparisons between theoretical populations and actual data when categories are used. There are two popular types of Chi square tests.

Chi square test for goodness of fit - analysis of a single categorical variable. Here the Chi square test is used to find the bias of respondents regarding various related factors.

Chi square test for independence or relatedness - analysis of relationship between two categorical variables.

Note: We can also test homogeneity or the significance of population variance through Chi Square test.

The Chi square test is a non-parametric test which assumes that the data analyzed:

Consist of nominal or ordinal category variables (i.e. each case can only be in one category or another).

Consist of entire populations or be randomly sampled from the population.

No expected frequency should be zero.

80% of the expected frequencies should be 5 or more. However, the observed frequencies can be any value, including zero.

Chi square result analysis: If significance value is less than 0.05, reject null hypothesis at 95% level of confidence.

Note: If you want to test at the 99% level of confidence, as in most cases involving medicine, emergencies or sensitive products, reject the null hypothesis if the significance value is less than 0.01.
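The expected-frequency guideline and the decision rule above can be sketched in a few lines of Python. This is only an illustration outside SPSS; the function names `expected_count_check` and `reject_null` are hypothetical and not part of any package.

```python
def expected_count_check(expected, minimum=5.0, required_share=0.80):
    """Return (share of expected counts >= minimum, whether the 80% guideline is met)."""
    if not expected:
        raise ValueError("expected must not be empty")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    share_ok = sum(1 for e in expected if e >= minimum) / len(expected)
    return share_ok, share_ok >= required_share


def reject_null(p_value, alpha=0.05):
    """Decision rule: reject the null hypothesis when the significance value is below alpha."""
    return p_value < alpha


# 29 respondents split equally over 5 categories gives 5.8 per category.
print(expected_count_check([29 / 5] * 5))
# A significance value of 0.034 is rejected at the 5% level but not at the 1% level.
print(reject_null(0.034), reject_null(0.034, alpha=0.01))
```

Using alpha=0.01 corresponds to the 99% level of confidence mentioned in the note above.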

Working Example 1: Chi Square test for Goodness of fit (Based on individual scores)

Dharam Gupta wants to know whether internet has influence on cost /price comparison of products. He also wants to know the influence of internet in case of online ordering. The level of significance will be 5%. He categorizes the responses based on 5 point scale of Never, Occasionally, Considerably, Almost Always and Always. He carries out Chi Square test for results.

Null Hypothesis 1 : There is not much influence of internet in case of cost / price comparison.

Alternate Hypothesis 1 : There is much influence of internet in case of cost / price comparison.

Null Hypothesis 2 : There is not much influence of internet in case of online ordering.

Alternate Hypothesis 2 : There is much influence of internet in case of online ordering.

The variable view is shown in the figure below for practical use.

Make the data file with two variables as shown in the figure above.

Categorize in the value column as following:

Influence of Internet in Cost / Price Comparison - 1(Never), 2(Occasionally), 3(Considerably), 4(Almost always), 5(Always)

Influence of Internet in Online Ordering - 1(Never), 2(Occasionally), 3(Considerably), 4(Almost always), 5(Always)

Enter the data of 29 respondents in data view, as shown in the following figure. The figure outlines the responses of the 29 people for the two variables (cost/price comparison and online ordering).

Note: Only 29 respondents are taken into consideration for analysis, in order to have data view available for practical application for the user.

Select Analyze menu → Nonparametric Tests → Chi-Square….

Chi-Square dialogue box will be opened as shown in the figure below.

Select the variable online ordering in the left box and click right arrow button to transfer the variable to Test Variable List. Similarly, transfer other variable, i.e., cost/price comparison as shown in the figure below.

Click the Options push button to open the Options sub dialogue box. Click the Descriptive check box and click Continue. The previous Chi-Square dialogue box will reappear. Click OK to see the output.

The Output

NPAR TESTS

/CHISQUARE=CostPrice OnlineOrdr

/EXPECTED=EQUAL

/STATISTICS DESCRIPTIVES

/MISSING ANALYSIS.

Descriptive Statistics

                          N    Mean   Std. Deviation   Minimum   Maximum
Cost / price comparison   29   3.62   .775             1         5
Online Ordering           29   3.97   .499             2         5

The figure above shows the descriptive statistics of the data. It shows the minimum and maximum values along with total responses, mean and standard deviation.

The figure below shows the observed, expected and residual values of Cost/price comparison variable.

Cost / price comparison

                Observed N   Expected N   Residual
Never           1            5.8          -4.8
Occasionally    1            5.8          -4.8
Considerably    7            5.8          1.2
Almost Always   19           5.8          13.2
Always          1            5.8          -4.8
Total           29

The figure below shows the observed, expected and residual values of Online Ordering variable.

Online Ordering

                Observed N   Expected N   Residual
Occasionally    1            7.3          -6.3
Considerably    1            7.3          -6.3
Almost Always   25           7.3          17.8
Always          2            7.3          -5.3
Total           29

Test Statistics

              Cost / price comparison   Online Ordering
Chi-Square    42.207a                   58.034b
df            4                         3
Asymp. Sig.   .000                      .000

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 5.8.

b. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 7.3.

The table above shows that the Chi square value of 42.207 (df=4, N=29) is significant at 4 degrees of freedom (p<0.05), showing that there is a significant difference between the expected and observed frequencies. As such, we reject Null Hypothesis 1 and accept Alternate Hypothesis 1, that is, there is much influence of internet in case of cost/price comparison.

Similarly, the Chi square value of 58.034 (df=3, N=29) is significant at 3 degrees of freedom (p<0.05), showing that there is a significant difference between the expected and observed frequencies. As such, we reject Null Hypothesis 2 and accept Alternate Hypothesis 2, that is, there is much influence of internet in case of online ordering.
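The two Chi square values can be checked by hand from the observed counts in the tables above. The following is a minimal sketch in Python rather than SPSS; `chi_square_gof` is a hypothetical helper name, and the 5% critical values are the usual printed chi-square table values, used here in place of the exact significance that SPSS reports.

```python
def chi_square_gof(observed, expected=None):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E, with df = categories - 1."""
    n = sum(observed)
    if expected is None:
        # Equal expected frequencies, as with /EXPECTED=EQUAL.
        expected = [n / len(observed)] * len(observed)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


# Upper 5% critical values of the chi-square distribution (standard table values).
CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

samples = {
    "Cost / price comparison": [1, 1, 7, 19, 1],
    "Online Ordering": [1, 1, 25, 2],
}
for name, counts in samples.items():
    stat, df = chi_square_gof(counts)
    print(f"{name}: chi-square = {stat:.3f}, df = {df}, "
          f"reject H0 at 5%: {stat > CRITICAL_05[df]}")
```

Both statistics lie far above their critical values, which agrees with the Asymp. Sig. of .000 in the output.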

Working Example 2: Chi Square test for Goodness of fit (Based on weighted cases)

Dharam Gupta again wants to know whether internet has influence on cost /price comparison of products. The level of significance will be 5%. He categorizes the responses based on 5 point scale of Never, Occasionally, Considerably, Almost Always and Always. He carries out Chi Square test for results.

Null Hypothesis 3 : There is not much influence of internet in case of cost / price comparison.

Alternate Hypothesis 3 : There is much influence of internet in case of cost / price comparison.

Make a data file as shown in the figures above and below. Enter the frequencies of internet influence as shown below.

Internet Influence   Frequency
Never(1)             1
Occasionally(2)      1
Considerably(3)      7
Almost Always(4)     19
Always(5)            1

After entering the values in the data file as shown in the figure above, select Data menu → Weight Cases.

The Weight Cases dialogue box will be opened as shown in the figure below.

Click on the Weight cases by radio button. Select the variable you require, in this case Cost/price comparison, and click the right arrow button to shift it into the Frequency Variable box. Press OK to close the dialogue box.

The message Weight On should appear on the status bar at the bottom right of the window, as shown in the figure below.

Select the Analyze menu → Nonparametric Tests → Chi-Square, as shown in the figure below.

Chi-Square Test dialogue box will be opened as shown in the figure below.

Select the variable you require, in this case Internet Influence, and click the right arrow button to shift the variable into the Test Variable List. You may also choose the Options button and select the various options available. Click OK to close the dialogue box and see the results in the output viewer.

The Output :

WEIGHT BY CostPrice.

NPAR TESTS

/CHISQUARE=InternetInfluence

/EXPECTED=EQUAL

/MISSING ANALYSIS.

The figure below shows the observed, expected and residual values of Internet Influence variable.

Internet Influence

        Observed N   Expected N   Residual
1.00    1            5.8          -4.8
2.00    1            5.8          -4.8
3.00    7            5.8          1.2
4.00    19           5.8          13.2
5.00    1            5.8          -4.8
Total   29

Test Statistics

              Internet Influence
Chi-Square    42.207a
df            4
Asymp. Sig.   .000

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 5.8.

You can see that this exercise gives the same result as the previous exercise for cost/price comparison. The table above shows that the Chi square value of 42.207 (df=4, N=29) is significant at 4 degrees of freedom (p<0.05), showing that there is a significant difference between the expected and observed frequencies. As such, we reject Null Hypothesis 3 and accept Alternate Hypothesis 3, that is, there is much influence of internet in case of cost/price comparison.
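The reason both exercises agree is that weighting simply treats each row of the frequency table as that many individual cases. A small Python sketch of the idea follows; it is an illustration only, and `expand_weighted` is a hypothetical helper, not an SPSS function.

```python
from collections import Counter

# Category code -> frequency, as entered in the weighted data file.
frequency_table = {1: 1, 2: 1, 3: 7, 4: 19, 5: 1}


def expand_weighted(table):
    """Repeat each category code as many times as its frequency (weight)."""
    cases = []
    for category, weight in table.items():
        cases.extend([category] * weight)
    return cases


cases = expand_weighted(frequency_table)
counts = Counter(cases)
# The expanded file has 29 individual cases with the same observed counts.
print(len(cases), sorted(counts.items()))
```

Because the observed counts are identical, any statistic computed from them, including the Chi square value, is identical as well.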

In this case, the expected frequencies represent a 1/5th split across the categories, i.e., each category has an expected frequency of 5.8 (29/5). In many cases, expected frequencies are predefined and not evenly balanced across categories. Suppose the expected frequencies for the five categories are 4, 4, 6, 10 and 5. SPSS allows you to enter predefined expected frequencies. The same example will be used for predefined expected frequencies.

Open the Chi-Square dialogue box in the same data file by clicking Analyze menu → Nonparametric Tests → Chi-Square.

Select the variable, in this case Internet Influence, and send it to the Test Variable List.

In the Expected Values box, click the Values radio button. Enter 4 in the Values box and click Add. Similarly add 4, 6, 10 and 5 one by one, as shown in the figure below. Now press OK to see the output.

The Output :

WEIGHT BY CostPrice.

NPAR TESTS

/CHISQUARE=InternetInfluence

/EXPECTED= 4 4 6 10 5

/MISSING ANALYSIS.

The figure below shows the observed, expected and residual values of Internet Influence variable.

Internet Influence

        Observed N   Expected N   Residual
1.00    1            4.0          -3.0
2.00    1            4.0          -3.0
3.00    7            6.0          1.0
4.00    19           10.0         9.0
5.00    1            5.0          -4.0
Total   29

Test Statistics

              Internet Influence
Chi-Square    15.967a
df            4
Asymp. Sig.   .003

a. 2 cells (40.0%) have expected frequencies less than 5. The minimum expected cell frequency is 4.0.

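The table above shows that the Chi square value of 15.967 (df=4, N=29) is significant at 4 degrees of freedom (p<0.05), showing that there is a significant difference between the expected and observed frequencies. As such, we reject Null Hypothesis 3 and accept Alternate Hypothesis 3, that is, there is much influence of internet in case of cost/price comparison. Note, however, that footnote a reports 2 cells (40.0%) with expected frequencies below 5, so the guideline that 80% of expected frequencies should be 5 or more is not met here and the result should be read with caution.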
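The value of 15.967 can be reproduced by hand with the predefined expected frequencies. The sketch below is plain Python, not SPSS; `chi_square_with_expected` is a hypothetical name, and rescaling the supplied values so that they sum to N is an assumption made here for generality (in this example they already sum to 29, so nothing changes).

```python
def chi_square_with_expected(observed, expected_values):
    """Chi-square statistic against user-supplied expected values.

    Assumption: the supplied values are rescaled to sum to the sample size.
    """
    if len(observed) != len(expected_values):
        raise ValueError("observed and expected_values need the same length")
    n = sum(observed)
    total = sum(expected_values)
    expected = [n * value / total for value in expected_values]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected


observed = [1, 1, 7, 19, 1]
stat, expected = chi_square_with_expected(observed, [4, 4, 6, 10, 5])
small_cells = sum(1 for e in expected if e < 5)

print(f"chi-square = {stat:.3f}, df = {len(observed) - 1}")
print(f"{small_cells} of {len(expected)} expected frequencies are below 5")
```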

Working Example 3: Chi Square test for independence or relatedness (Based on weighted cases)

Ram Gupta is a media researcher. He wants to know whether movie preference is dependent on the location of the respondent. The responses indicate that 75 respondents each have seen the movies 3 Idots and Dabangg, with 71 respondents from Delhi and 79 respondents from Mumbai.

Make a data file as shown in figure below.

Enter the values in values column as shown below.

Movie - 1(3 Idots), 2(Dabangg)

Place - 1(Delhi), 2(Mumbai)

Enter the frequencies and other variables in data view as shown in the table below.

Movie   Place   Frequency
1       1       42
1       2       33
2       1       29
2       2       46

After entering the values in the data file as shown in the figure above, select Data menu → Weight Cases…. The Weight Cases dialogue box will be opened.

Click on the Weight cases by radio button. Select the variable you require, in this case Frequency, and click the right arrow button. The Frequency variable will move into the Frequency Variable box. Click OK to close the dialogue box.

Now select the Analyze menu → Descriptive Statistics → Crosstabs, as shown in the figure below. The Crosstabs dialogue box will appear on the screen.

Select a row variable, in this case (3 Idots / Dabangg) and a column variable, in this case (Delhi / Mumbai) as shown in the figure below.

Click Statistics… button. A Statistics sub dialogue box will be opened, as shown below. Click on Chi-square check box. Now click continue. Previous dialogue box will appear again.

Note: For additional results, if you want to measure the strength of association between the two variables in the cross tabulation, you may select Contingency coefficient, Phi and Cramer's V, Lambda and Uncertainty coefficient under the Nominal heading.

The contingency coefficient lies between 0 and 1. It can be used when the number of rows and columns in the cross tabulation is equal. Please also note that it cannot attain the maximum value of 1. In a 2x2 matrix the maximum value is .707 and in a 4x4 matrix the maximum value is .87. The Phi correlation coefficient is used mainly for a 2x2 matrix, where the result is easy to interpret. Cramer's V is a variation of the Phi correlation coefficient. It can have a maximum value of 1 and is not restricted to a 2x2 matrix. The asymmetric Lambda coefficient measures the reduction in error in predicting the value (category) of one variable if we know the category of the other variable. For example, if Lambda is 0.25 (for the column variable, given the row variable), the reduction in error in predicting the column variable value is 0.25 or 25 percent. The reverse can also be computed. Further, the symmetric Lambda can be computed as a weighted average of the two asymmetric Lambda values for the column and row variables.

Generally, one or two of the above measures (in the note section) are sufficient to find out the association between the row and column variables in the cross tabulation. If the value is close to 0, the association is weak and if the value is close to 1, the association is strong.
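The chi-square based measures in the note can be written out directly from their textbook formulas. The sketch below is an illustration in Python with hypothetical function names; it uses the Pearson Chi-Square of 4.520 with N=150 that this example reports in its output, and does not cover Lambda or the uncertainty coefficient.

```python
import math


def phi_coefficient(chi2, n):
    """Phi = sqrt(chi-square / N), mainly for 2x2 tables."""
    return math.sqrt(chi2 / n)


def contingency_coefficient(chi2, n):
    """C = sqrt(chi-square / (chi-square + N)); always below 1."""
    return math.sqrt(chi2 / (chi2 + n))


def cramers_v(chi2, n, rows, cols):
    """V = sqrt(chi-square / (N * min(rows - 1, cols - 1)))."""
    return math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))


def max_contingency(k):
    """Upper limit of C for a k x k table: sqrt((k - 1) / k)."""
    return math.sqrt((k - 1) / k)


chi2, n = 4.520, 150
print(f"phi = {phi_coefficient(chi2, n):.3f}")
print(f"C   = {contingency_coefficient(chi2, n):.3f}")
print(f"V   = {cramers_v(chi2, n, 2, 2):.3f}")
print(f"max C: 2x2 = {max_contingency(2):.3f}, 4x4 = {max_contingency(4):.3f}")
```

For a 2x2 table Phi and Cramer's V coincide, and the limits for C match the .707 and .87 quoted above.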

Click on Cells. Select the Observed and Expected check boxes in the Counts box and the Row, Column and Total check boxes in the Percentages box. Click on Continue and then click OK to see the output viewer.

The Output :

CROSSTABS

/TABLES=Movie BY Place

/FORMAT=AVALUE TABLES

/STATISTICS=CHISQ

/CELLS=COUNT EXPECTED ROW COLUMN TOTAL

/COUNT ROUND CELL.

Case Processing Summary

                                     Cases
                                     Valid           Missing         Total
                                     N     Percent   N     Percent   N     Percent
3 Idots / Dabangg * Delhi / Mumbai   150   100.0%    0     .0%       150   100.0%

3 Idots / Dabangg * Delhi / Mumbai Crosstabulation

                                                     Delhi / Mumbai
3 Idots / Dabangg                                    Delhi    Mumbai   Total
3 Idots   Count                                      42       33       75
          Expected Count                             35.5     39.5     75.0
          % within 3 Idots / Dabangg                 56.0%    44.0%    100.0%
          % within Delhi / Mumbai                    59.2%    41.8%    50.0%
          % of Total                                 28.0%    22.0%    50.0%
Dabangg   Count                                      29       46       75
          Expected Count                             35.5     39.5     75.0
          % within 3 Idots / Dabangg                 38.7%    61.3%    100.0%
          % within Delhi / Mumbai                    40.8%    58.2%    50.0%
          % of Total                                 19.3%    30.7%    50.0%
Total     Count                                      71       79       150
          Expected Count                             71.0     79.0     150.0
          % within 3 Idots / Dabangg                 47.3%    52.7%    100.0%
          % within Delhi / Mumbai                    100.0%   100.0%   100.0%
          % of Total                                 47.3%    52.7%    100.0%

Chi-Square Tests

                               Value    df   Asymp. Sig. (2-sided)   Exact Sig. (2-sided)   Exact Sig. (1-sided)
Pearson Chi-Square             4.520a   1    .034
Continuity Correctionb         3.851    1    .050
Likelihood Ratio               4.543    1    .033
Fisher's Exact Test                                                  .049                   .025
Linear-by-Linear Association   4.489    1    .034
N of Valid Cases               150

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 35.50.

b. Computed only for a 2x2 table

To interpret the output, look at the Pearson Chi-Square row in the table above. The Pearson Chi-Square value of 4.520 (df=1, N=150) is significant at 1 degree of freedom (p<0.05), showing that there is a significant association between location and movie preference. Moreover, the minimum expected cell frequency is 35.50, which is greater than 5, so this assumption of the Chi square test has not been violated. Based on the output, we can conclude that 3 Idots was more popular in Delhi and Dabangg was preferred more in Mumbai.
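The expected counts and the Pearson Chi-Square in the output can be verified from the four observed frequencies. The following is a minimal Python sketch, not SPSS; `chi_square_independence` is a hypothetical helper, and the `yates` option applies the usual continuity correction that SPSS reports for 2x2 tables.

```python
def chi_square_independence(table, yates=False):
    """Pearson chi-square for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    expected = []
    for i, row in enumerate(table):
        expected_row = []
        for j, obs in enumerate(row):
            # Expected count = row total * column total / grand total.
            e = row_totals[i] * col_totals[j] / n
            diff = abs(obs - e)
            if yates:
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / e
            expected_row.append(e)
        expected.append(expected_row)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, expected


# Rows: 3 Idots, Dabangg. Columns: Delhi, Mumbai.
movie_by_place = [[42, 33], [29, 46]]
stat, df, expected = chi_square_independence(movie_by_place)
corrected, _, _ = chi_square_independence(movie_by_place, yates=True)
print(f"Pearson chi-square = {stat:.3f}, df = {df}")
print(f"Continuity corrected = {corrected:.3f}")
print("Expected counts:", expected)
```

With df=1 the 5% critical value is 3.841, so the Pearson value is significant while the continuity-corrected value sits right at the boundary, as the .050 in the output suggests.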

Spearman's rank-order correlation

When the parametric bivariate correlation (Pearson's r) cannot be performed, we can use Spearman's rank order correlation, also known as Spearman's rho (ρ).

Working Example :

Pushpa Gupta wants to see the relationship between monthly household income and retail purchase for 20 respondents through Spearman's rank order correlation. The data violates the assumptions of Pearson's r, so we perform Spearman's rank order correlation.

Make a data file as shown in the figure below.

Enter the data as shown in the figure below.

Click Analyze menu → Descriptive Statistics → Crosstabs. The Crosstabs dialogue box will open.

In the Crosstabs dialogue box, enter the row and column variables, in this case household income and retail purchase, respectively.

Click on Statistics button to open Statistics sub dialogue box. Select the Correlations check box. Click on continue and then OK to see the output viewer.

The Output:

CROSSTABS

/TABLES=HIncome BY Purchase

/FORMAT=AVALUE TABLES

/STATISTICS=CORR

/CELLS=COUNT

/COUNT ROUND CELL.

Case Processing Summary

                                                                     Cases
                                                                     Valid          Missing        Total
                                                                     N    Percent   N    Percent   N    Percent
Monthly Household Income (Rs. Lac) * Retail Purchase (Rs.thousand)   20   100.0%    0    .0%       20   100.0%

Monthly Household Income (Rs. Lac) * Retail Purchase (Rs.thousand) Crosstabulation

Count

Monthly Household     Retail Purchase (Rs.thousand)
Income (Rs. Lac)      2.00   3.00   6.00   8.00   10.00   20.00   25.00   30.00   35.00   60.00   100.00   150.00   Total
.10                   1      0      0      0      0       0       0       0       0       0       0        0        1
.15                   1      0      0      0      0       0       0       0       0       0       0        0        1
.20                   0      1      0      0      0       0       0       0       0       0       0        0        1
.30                   0      0      1      0      0       0       0       0       0       0       0        0        1
.40                   0      0      0      1      0       0       0       0       0       0       0        0        1
.50                   0      0      0      0      1       0       0       0       0       0       0        0        1
.80                   0      0      0      0      1       0       0       0       0       0       0        0        1
1.00                  0      0      0      0      2       0       0       0       0       0       0        0        2
3.00                  0      0      0      0      0       0       0       1       1       0       0        0        2
5.00                  0      0      0      0      0       0       1       1       0       0       0        0        2
6.00                  0      0      0      0      0       0       0       1       0       0       0        0        1
7.00                  0      0      0      0      0       1       0       0       0       0       1        0        2
8.00                  0      0      0      0      0       0       0       0       0       1       1        0        2
10.00                 0      0      0      0      0       0       0       0       0       0       1        1        2
Total                 2      1      1      1      4       1       1       3       1       1       3        1        20

Symmetric Measures

                                              Value   Asymp. Std. Errora   Approx. Tb   Approx. Sig.
Interval by Interval   Pearson's R            .864    .052                 7.273        .000c
Ordinal by Ordinal     Spearman Correlation   .942    .044                 11.902       .000c
N of Valid Cases                              20

a. Not assuming the null hypothesis.

b. Using the asymptotic standard error assuming the null hypothesis.

c. Based on normal approximation.

The output shows that Spearman's rank order correlation is significant, r (20) = 0.942, p<0.05. As such, we can say that higher monthly household income is associated with more retail purchases.
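Spearman's rho is simply Pearson's r computed on ranks, with tied values sharing their average rank. The sketch below illustrates this in plain Python; the function names are hypothetical, and the 20 (income, purchase) pairs are read off the non-zero cells of the crosstabulation above rather than taken from the original data file, which is shown only as a figure.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    return pearson_r(average_ranks(x), average_ranks(y))


# (income in Rs. Lac, purchase in Rs. thousand), one pair per respondent.
pairs = [(0.10, 2), (0.15, 2), (0.20, 3), (0.30, 6), (0.40, 8),
         (0.50, 10), (0.80, 10), (1.00, 10), (1.00, 10), (3.00, 30),
         (3.00, 35), (5.00, 25), (5.00, 30), (6.00, 30), (7.00, 20),
         (7.00, 100), (8.00, 60), (8.00, 100), (10.00, 100), (10.00, 150)]
income = [p[0] for p in pairs]
purchase = [p[1] for p in pairs]
rho = spearman_rho(income, purchase)
print(f"Spearman's rho = {rho:.3f}")
```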

You can also find Spearman's rho another way, by clicking through Analyze menu → Correlate → Bivariate…, as shown in the figure below.

In the Bivariate Correlations dialogue box, enter the variables, in this case monthly income and retail purchase. Click on the Spearman check box and click OK to see the output viewer.

The Output:

NONPAR CORR

/VARIABLES=HIncome Purchase

/PRINT=SPEARMAN TWOTAIL NOSIG

/MISSING=PAIRWISE.

Correlations

                                                                                Monthly Household   Retail Purchase
                                                                                Income (Rs. Lac)    (Rs.thousand)
Spearman's rho   Monthly Household Income (Rs. Lac)   Correlation Coefficient   1.000               .942**
                                                      Sig. (2-tailed)           .                   .000
                                                      N                         20                  20
                 Retail Purchase (Rs.thousand)        Correlation Coefficient   .942**              1.000
                                                      Sig. (2-tailed)           .000                .
                                                      N                         20                  20

**. Correlation is significant at the 0.01 level (2-tailed).

Here also, the output shows that Spearman's rank order correlation is significant, r (20) = 0.942, p<0.05 (the output flags it as significant at the 0.01 level). As such, we can say that higher monthly household income is associated with more retail purchases.

Mann-Whitney U Test (Wilcoxon rank sum W test)

The Mann-Whitney U Test is used to check the hypothesis that two independent samples come from populations having same distribution. This non parametric test is equivalent to parametric t-test of independent groups.

Working Example

The sales of two retail stores of Delhi(Store 1) and Mumbai (Store 2) are compared by Utkarsh. The sales are in Rs. Lacs. There are 20 responses, 10 from each store. The data violates the assumptions of an independent groups t-test. As such Mann-Whitney U test is performed.

Make a data file as shown in the figure below.

Enter the data as shown in the figure below.

Click Analyze menu → Nonparametric Tests → 2 Independent Samples…. The 2 Independent Samples dialogue box will be opened.

Select dependent variable (i.e. Sale) in Test Variable List and RetailStore in Grouping Variable list. Select Mann-Whitney U check box in Test Type box.

Click Define Groups button. A sub dialogue box will open. Enter the values as shown below.

The Grouping Variable updates from RetailStore (? ?) to RetailStore(1 2), as shown in the figure below. Click OK to see the output.

The Output

NPAR TESTS

/M-W= Sale BY RetailStore(1 2)

/MISSING ANALYSIS.

Ranks

                     Retail Store   N    Mean Rank   Sum of Ranks
Sale (In Rs. Lacs)   Delhi Store    10   11.80       118.00
                     Mumbai Store   10   9.20        92.00
                     Total          20

Test Statisticsb

                                 Sale (In Rs. Lacs)
Mann-Whitney U                   37.000
Wilcoxon W                       92.000
Z                                -.987
Asymp. Sig. (2-tailed)           .324
Exact Sig. [2*(1-tailed Sig.)]   .353a

a. Not corrected for ties.

b. Grouping Variable: Retail Store

To interpret this test, consider the Z-score and the two-tailed p-value, which have been corrected for ties. The output reveals that, after correction for ties and Z-score conversion, the result is not significant (z=-.987, p>.05), so no significant difference exists between the sales of the two retail stores.
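The Mann-Whitney U and Wilcoxon W in the output follow directly from the rank sums in the Ranks table. The following Python sketch uses the standard formulas with a hypothetical function name; it omits the correction for ties, so its z is slightly smaller in magnitude than the -.987 reported by SPSS.

```python
import math


def mann_whitney_from_rank_sums(r1, n1, r2, n2):
    """Return (U, z) from the two rank sums, without a tie correction."""
    u1 = r1 - n1 * (n1 + 1) / 2
    u2 = r2 - n2 * (n2 + 1) / 2
    u = min(u1, u2)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return u, (u - mean_u) / sd_u


# Delhi store: sum of ranks 118, Mumbai store: sum of ranks 92, 10 stores each.
u, z = mann_whitney_from_rank_sums(118.0, 10, 92.0, 10)
print(f"U = {u:.3f}, z (no tie correction) = {z:.3f}")
```

The smaller rank sum, 92, is the Wilcoxon W shown in the output, and |z| is well below 1.96, which agrees with the non-significant result.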

Wilcoxon signed-rank test

The Wilcoxon signed-rank test (sometimes called the Wilcoxon t-test) is the nonparametric equivalent of the paired t-test and is used with repeated measures. When the same participants perform under each level of the independent variable, we use the Wilcoxon signed-rank test.

Working Example:

A Showroom manager compares the laptop sales for the year in two parts. He wants to compare the sales of first half and second half of the year. He recorded the sales from 20 showrooms and saved their sales in Rs. Lacs. The data violates the assumptions of a paired t-test, so Wilcoxon signed-rank test is used.

Make the data file as shown in figure below.

Enter the data in the file as shown in figure below.

Select Analyze menu → Nonparametric Tests → 2 Related Samples…. The Two Related Samples Tests dialogue box will be opened.

Select Laptop Sale (Jan-June) in variable 1 and Laptop Sale (July-Dec) in variable 2 on the left side box as shown in the figure below. Also notice that Wilcoxon check box should be selected.

Click Options to open Options sub dialogue. Click Descriptive check box and click Continue. You will see the previous dialogue box. Click Ok to see the output viewer.

The Output

NPAR TESTS

/WILCOXON=Sale1 WITH Sale2 (PAIRED)

/STATISTICS DESCRIPTIVES

/MISSING ANALYSIS.

Descriptive Statistics

                                  N    Mean      Std. Deviation   Minimum   Maximum
Laptop Sale (Jan-June) Rs.Lacs    20   45.1000   23.15145         10.00     100.00
Laptop Sale (July-Dec) Rs. Lacs   20   61.3000   29.05005         20.00     120.00

Ranks

Laptop Sale (July-Dec) Rs. Lacs - Laptop Sale (Jan-June) Rs.Lacs

                 N     Mean Rank   Sum of Ranks
Negative Ranks   5a    10.10       50.50
Positive Ranks   15b   10.63       159.50
Ties             0c
Total            20

a. Laptop Sale (July-Dec) Rs. Lacs < Laptop Sale (Jan-June) Rs.Lacs

b. Laptop Sale (July-Dec) Rs. Lacs > Laptop Sale (Jan-June) Rs.Lacs

c. Laptop Sale (July-Dec) Rs. Lacs = Laptop Sale (Jan-June) Rs.Lacs

Test Statisticsb

                         Laptop Sale (July-Dec) Rs. Lacs - Laptop Sale (Jan-June) Rs.Lacs
Z                        -2.036a
Asymp. Sig. (2-tailed)   .042

a. Based on negative ranks.

b. Wilcoxon Signed Ranks Test

The output shows that there is a significant difference in showroom sales between the first and second half of the year, z=-2.036, p<.05. The Ranks table also shows that sales were higher in the second half of the year (15 positive ranks against 5 negative ranks).
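The Z value is based on the sum of the negative ranks (50.50) and the number of non-tied pairs (20). A rough check in Python with the standard normal approximation is sketched below; `wilcoxon_z` is a hypothetical name, and the formula ignores the correction for tied ranks, so it gives about -2.035 rather than exactly -2.036.

```python
import math


def wilcoxon_z(rank_sum, n):
    """Normal approximation for the signed-rank sum of n non-tied pairs."""
    mean_t = n * (n + 1) / 4
    sd_t = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (rank_sum - mean_t) / sd_t


negative_sum, positive_sum, n_pairs = 50.5, 159.5, 20
z = wilcoxon_z(negative_sum, n_pairs)
print(f"z (no tie correction) = {z:.3f}")
# The two rank sums always add up to n(n+1)/2.
print(negative_sum + positive_sum == n_pairs * (n_pairs + 1) / 2)
```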

Kruskal-Wallis test

The Kruskal-Wallis test allows possible differences between two or more groups to be examined and is equivalent to one-way ANOVA between groups.

Working Example :

The SPSS Vice President-Sales has developed three new training programs for increasing sales effectiveness. There are 60 new hires in the organization. He has randomly assigned 20 people to each training program. After one year, he decides to see the results. Because the assumptions of one-way ANOVA are violated, the Kruskal-Wallis test is performed.

Input Data :

Program Sale

1 10.0

1 12.0

1 30.0

1 50.0

1 20.0

1 15.0

1 19.0

1 25.0

1 60.0

1 10.0

1 2.0

1 25.0

1 60.0

1 40.0

1 50.0

1 20.0

1 30.0

1 40.0

1 50.0

1 60.0

2 9.0

2 90.0

2 80.0

2 110.0

2 150.0

2 120.0

2 111.0

2 50.0

2 20.0

2 30.0

2 90.0

2 70.0

2 50.0

2 90.0

2 250.0

2 300.0

2 200.0

2 150.0

2 165.0

2 158.0

3 10.0

3 20.0

3 30.0

3 10.0

3 20.0

3 30.0

3 60.0

3 50.0

3 40.0

3 90.0

3 80.0

3 70.0

3 88.0

3 57.0

3 67.0

3 400.0

3 90.0

3 70.0

3 44.0

3 62.0

Click Analyze menu → Nonparametric Tests → K Independent Samples…. This will open the Tests for Several Independent Samples dialogue box.

Enter the dependent variable, in this example Sales (Rs. Crores), in the Test Variable List. Enter the independent variable, in this example Sales Training Program, in the Grouping Variable box, as shown in the figure below. Make sure the Kruskal-Wallis H check box is selected in the Test Type box.

Click on Define Range. This will open a sub dialogue box. Enter the values as shown below. In our example, we have 3 training programs starting from 1. So, we select minimum as 1 and maximum as 3.

Click Continue. You will come back to previous dialogue box. Here you may choose Options button, in case you would like to add any results like descriptives, quartiles, etc. Click OK to see the output viewer.

The Output:

NPAR TESTS

/K-W=Sales BY Program(1 3)

/MISSING ANALYSIS.

Ranks

                    Sales Training Program   N    Mean Rank
Sales (Rs.Crores)   1                        20   18.10
                    2                        20   42.93
                    3                        20   30.48
                    Total                    60

Test Statisticsa,b

              Sales (Rs.Crores)
Chi-Square    20.277
df            2
Asymp. Sig.   .000

a. Kruskal Wallis Test

b. Grouping Variable: Sales Training Program

The Chi-Square value χ²(df=2, N=60) = 20.277, p<0.05 indicates that sales differ significantly across the three training programs. The second training program delivers the best results (mean rank 42.93), followed by the third training program (30.48). The first training program delivers the least favourable results (18.10).
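The Kruskal-Wallis statistic can be approximated from the mean ranks alone. The sketch below uses the textbook H formula in Python with a hypothetical function name; because it works from the rounded mean ranks and applies no correction for ties, it gives roughly 20.21 rather than the tie-corrected 20.277 in the output.

```python
def kruskal_wallis_h(mean_ranks, group_sizes):
    """H = 12 / (N (N + 1)) * sum of n_i * (mean rank_i - (N + 1) / 2)^2."""
    n = sum(group_sizes)
    grand_mean_rank = (n + 1) / 2
    spread = sum(size * (rank - grand_mean_rank) ** 2
                 for rank, size in zip(mean_ranks, group_sizes))
    return 12 / (n * (n + 1)) * spread


h = kruskal_wallis_h([18.10, 42.93, 30.48], [20, 20, 20])
print(f"H (no tie correction) = {h:.3f}, df = 2")
```

With df=2 the 5% critical value of chi-square is 5.991, so the difference between programs is significant either way.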

Friedman test

The Friedman test is equivalent to repeated measures or within subjects ANOVA. This test is used to compare two or more related samples.

Working Example:

The consumption times in minutes (for 50 beverage bottles per individual) of 20 individuals were measured under normal weather conditions, in cold weather and in hot weather.

Make variables in new data file as shown in figure below.

Enter data in the data file as shown in figure below.

Select Analyze → Nonparametric Tests → K Related Samples…. This will open the Tests for Several Related Samples dialogue box, as shown in the figure ahead.

Enter all the three variables in Test Variables list. You may click Statistics button for extra analysis. See that Friedman check box is selected in the Test Type box. Click OK to see the output viewer.

The Output:

NPAR TESTS

/FRIEDMAN=Normal Hot Cold

/MISSING LISTWISE.

Ranks

                 Mean Rank
Normal Weather   2.18
Hot Weather      1.28
Cold Weather     2.55

Test Statisticsa

N             20
Chi-Square    17.844
df            2
Asymp. Sig.   .000

a. Friedman Test

The Chi square, χ²(df=2, N=20) = 17.844, p<.05 shows that significant differences exist in consumption times across the three weather conditions. The mean ranks also show that consumption of beverages was slowest in cold weather, while in hot weather beverages were consumed in the least time compared to the other weather conditions.
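As with the Kruskal-Wallis test, the Friedman statistic can be approximated from the mean ranks. The Python sketch below uses the textbook formula with a hypothetical function name; it works from the rounded mean ranks and ignores ties within individuals, so it gives about 17.07 rather than the tie-corrected 17.844 in the output.

```python
def friedman_chi_square(mean_ranks, n):
    """Chi-square = 12 n / (k (k + 1)) * sum of (mean rank_j - (k + 1) / 2)^2."""
    k = len(mean_ranks)
    centre = (k + 1) / 2
    return 12 * n / (k * (k + 1)) * sum((rank - centre) ** 2
                                        for rank in mean_ranks)


# Mean ranks for normal, hot and cold weather, 20 individuals.
stat = friedman_chi_square([2.18, 1.28, 2.55], 20)
print(f"Friedman chi-square (no tie correction) = {stat:.3f}, df = 2")
```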

SPSS Procedure for Nonparametric Techniques

After the input data has been entered under the variables the problem requires, proceed with the following steps for each case.

Chi Square test for Goodness of fit (Based on individual scores)

Select Analyze menu → Nonparametric Tests → Chi-Square….

Chi-Square dialogue box will be opened.

Select the variable in the left box and click the right arrow button to transfer the variable to the Test Variable List. Similarly, transfer the other variables, if more than one variable is to be analyzed.

Click the Options push button to open the Options sub dialogue box. Click the Descriptive check box and click Continue. The previous Chi-Square dialogue box will reappear. Click OK to see the output.

Chi Square test for Goodness of fit (Based on weighted cases)

Select Data menu → Weight Cases. The Weight Cases dialogue box will be opened.

Click on the Weight cases by radio button. Select the variable you require, click the right arrow button to shift it into the Frequency Variable box and press OK to close the dialogue box. The message Weight On should appear on the status bar at the bottom right of the window.

Select the Analyze menu → Nonparametric Tests → Chi-Square. The Chi-Square Test dialogue box will be opened.

Select the variable you require and click the right arrow button to shift the variable in test variable list. You may also choose Options button and select the various options available. Click OK to close the dialogue box and see the results in the output viewer.

In many cases, expected frequencies are predefined and not evenly balanced across categories. Suppose, the expected frequency for each category is 4, 4, 6, 10, 5. SPSS allows you to allot the predefined expected frequencies. The same example will be used for predefined expected frequencies.

Open the Chi-Square dialogue box in the same data file by clicking Analyze menu → Nonparametric Tests → Chi-Square.

Select the variable and send it to test variable list.

In the Expected values box, click Values radio button. Enter 4 in the Values box and click Add. Similarly add 4,6,10,5 one by one. Now press OK to see the output.

Chi Square test for independence or relatedness (Based on weighted cases)

Select Data menu → Weight Cases…. The Weight Cases dialogue box will be opened.

Click on the Weight cases by radio button. Select the variable you require and click the right arrow button. The variable will move into the Frequency Variable box. Click OK to close the dialogue box.

Now select the Analyze menu → Descriptive Statistics → Crosstabs. The Crosstabs dialogue box will appear on the screen.

Select a row variable and a column variable.

Click Statistics… button. A Statistics sub dialogue box will be opened, as shown below. Click on Chi-square check box. Now click continue. Previous dialogue box will appear again.

Click on Cells. Select, observed and expected check boxes in counts box and row, column, total check boxes in percentages. Click on continue and then click OK to see the output viewer.

Spearman's rank-order correlation

Click Analyze menu → Descriptive Statistics → Crosstabs. The Crosstabs dialogue box will open.

In the Crosstabs dialogue box, enter the row and column.

Click on Statistics button to open Statistics sub dialogue box. Select the Correlations check box. Click on continue and then OK to see the output viewer.

You can also find Spearman's rho another way, by clicking through Analyze menu → Correlate → Bivariate….

In the Bivariate Correlations dialogue box, enter the variables. In this case, monthly income and retail purchase. Click on spearman check box and click OK to see the output viewer.

Mann-Whitney U Test (Wilcoxon rank sum W test)

Click Analyze menu → Nonparametric Tests → 2 Independent Samples…. The 2 Independent Samples dialogue box will be opened.

Select dependent variable in Test Variable List and other variable in Grouping Variable list. Select Mann-Whitney U check box in Test Type box.

Click Define Groups button. A sub dialogue box will open. Enter the values accordingly. Click OK to see the output.

Wilcoxon signed-rank test

Select Analyze menu → Nonparametric Tests → 2 Related Samples…. The Two Related Samples Tests dialogue box will be opened.

Select variable 1 and variable 2 on the left side box. Also notice that Wilcoxon check box should be selected.

Click Options to open Options sub dialogue. Click Descriptive check box and click Continue. You will see the previous dialogue box. Click Ok to see the output viewer.

Kruskal-Wallis test

Click Analyze menu → Nonparametric Tests → K Independent Samples…. This will open the Tests for Several Independent Samples dialogue box.

Enter the dependent variable in the Test Variable List. Enter the independent variable in the Grouping Variable box. Make sure the Kruskal-Wallis H check box is selected in the Test Type box.

Click on Define Range. This will open a sub dialogue box. Enter the values accordingly.

Click Continue. You will come back to previous dialogue box. Here you may choose Options button, in case you would like to add any results like descriptives, quartiles, etc. Click OK to see the output viewer.

Friedman test

Select Analyze → Nonparametric Tests → K Related Samples…. This will open the Tests for Several Related Samples dialogue box.

Enter all the three variables in Test Variables list. You may click Statistics button for extra analysis. See that Friedman check box is selected in the Test Type box. Click OK to see the output viewer.

Problems

1. The owner of Berger Paints is concerned about the Berger brand. He wants to know whether the brand is unevenly distributed across the four metro cities, viz. New Delhi, Kolkata, Mumbai and Chennai. A random sample of 120 retail consumers in each city was taken, as per the following table.

               New Delhi   Kolkata   Mumbai   Chennai   Total
Berger Paint   75          70        50       55        250
Other Paint    45          50        70       65        230
Total          120         120       120      120       480

Develop a table of observed and expected frequencies for this problem.

State Null and alternate hypothesis.

Calculate Chi square value and provide solution at 5 percent level of significance.

2. Neha Gupta wants to see whether males or females drink more beer in Delhi.

                   Female   Male   Total
Drink Beer         399      1563   1962
Don't Drink Beer   401      937    1338
Total              800      2500   3300

Is there any significant difference between females and males in drinking beer?

State Null and alternate hypothesis.

Calculate Chi square value and provide solution at 5 percent level of significance.

3. Madhu Gupta wants to know whether serial preference is dependent on the location of the respondent. The responses indicate that 75 respondents each have seen the serials Crorepati and Big Boss, with 66 respondents from Delhi and 84 respondents from Mumbai. The frequency table is shown below.

Serial      Place    Frequency
Crorepati   Delhi    40
Crorepati   Mumbai   35
Big Boss    Delhi    26
Big Boss    Mumbai   49

Conduct a Chi Square test for independence or relatedness (Based on weighted cases).