International Use Of Standardized Tests Economics Essay


This paper examines six reports on the Third International Mathematics and Science Study (TIMSS), a comprehensive international standardized test of math and science for grades 4, 8 and 12. I took an in-depth look at the goals, context, methods and results of the tests for each participating country, and concluded that standardized tests can be useful, but only for specific purposes and not for blanket application.

I went on to ask why students in each country scored the way they did. I asked whether the methodology accounted for the results and, if not, what was responsible for the disparity in scores. Since the scores could shed some light on the education systems of other countries, I asked why some countries scored so high and others so low. Since the key issues in education include students' socioeconomic status and school funding, I attempted to answer some of the questions relating to performance by comparing these common issues across participating countries.

What are the effects of child poverty, income inequality and public spending on academic achievement? I examine the correlation between the overall level of inequality, as represented by the Gini coefficient, in each of the 21 countries under study and the relative poverty rates of children in those countries. In addition, I study the impact of primary and secondary education spending as a percentage of the Gross National Product.

Analyzing data from the Third International Mathematics and Science Study (TIMSS), the World Bank's World Development Indicators (WDI), the United Nations Development Program's Human Development Index (HDI) and the Luxembourg Income Study (LIS), I found that, on average, the higher the level of income inequality in a country, the lower that country's math and science scores, and the lower the inequality, the higher the scores. Although this analysis focuses on twelfth grade math and science, a preliminary analysis of eighth grade data shows the same correlation. There is also a significant correlation between eighth grade scores and twelfth grade scores within countries. Not surprisingly, relative child poverty rates are strongly correlated with income inequality. Finally, no significant correlation is found between the TIMSS scores and public spending on education.

TIMSS Achievement Scores

The TIMSS study comprises five components: curriculum analyses, questionnaires, case studies, achievement tests and a video study. This paper focuses only on the achievement tests. The TIMSS study represents the most extensive and highly detailed comparison of student performance ever attempted. For the twelfth grade, the objective of the TIMSS assessment was to measure the combined outcome of primary and secondary schooling in each of the participating countries. In other words, if the tenth grade represents the end of secondary school in a country, then tenth grade students were included; likewise, if secondary school ends with a fourteenth grade, those students qualified to participate. Consequently, the sampling criteria left room for considerable variation in the population of students taking the "final year" test. Nevertheless, TIMSS sample designs are nationally representative of the target age group. All the countries except Latvia, Lithuania, Italy and Israel achieved 100% coverage of the national population (TIMSS, User Guide).

The TIMSS study group faced a difficult problem in selecting students at all levels. The starting age for elementary school differs across countries, and the graduation age is even more varied. Students were grouped into three populations: Population 1 represents those halfway through their primary education, Population 2 those halfway through their secondary education, and Population 3 those completing their secondary education. Variation in the age of the Population 3 cohort should therefore not be surprising. In any event, most of the students in Population 3 were found to be in grade 12 (TIMSS, Report). Depending on the country, this might be viewed as an advantage or a disadvantage, inasmuch as age or grade level is considered a determining factor in the acquisition of math and science knowledge. As indicated in the TIMSS report, the age factor could perhaps explain why the US scored relatively lower than other countries (U.S. Dept. of Education). This factor alone, however, is not enough to explain the disparity in scores. For instance, some countries presented students who were on average younger than those of the United States (Australia 17.7, Czech Republic 17.8, New Zealand 17.6, Russia 16.9 and Hungary 17.6). Those students were also in grades lower than 12th (with the exception of Australia), yet on average scored higher than the United States.

Critics have pointed out the incomparability of test scores, drawing on a whole range of issues such as school curricula and participation rates. As much as these might appear to bear on the comparability of the TIMSS test scores across countries, a closer look prompts the question: what is the standard body of knowledge expected of students at the end of secondary education? TIMSS conducted a 'test-curriculum matching analysis' (TIMSS, User Guide) to ensure that test materials have relevance to students' real-life situations. It has been pointed out, however, that American students who had only studied pre-calculus were tested inappropriately on their knowledge of calculus. This certainly represents a weakness in the management of TIMSS by the US TIMSS team, and it raises the further question of how far the team allowed technical input from other experts in the field. However, this is hardly a reason to question the validity of the test itself. The experts in charge of TIMSS say that American students ought to have mastered a level of calculus that would have enabled them to answer the test questions correctly. Critics say that students could not have been expected to respond correctly, given that they were only at the pre-calculus level.

One could argue that the US compensated for this drawback by presenting only 14% of the age cohort for the advanced tests, while countries such as Slovenia (75%), France (20%), Denmark (21%), Australia (16%) and Canada (16%) presented higher percentages of their students and still scored higher than the United States. There are indeed eyebrow-raising inconsistencies; a vivid example is Cyprus, which ranked second to last in both the general math and science tests but ranked sixth in advanced math and eighth in physics. This issue is mainly a problem in the advanced math and science tests. Nevertheless, this and other issues have generated heated discussion among scholars and will continue to do so for some time to come (Berliner & Biddle, 1995; Stedman, 1996).

What about the general knowledge portion? These scores appear to be quite consistent across the board, with a few exceptions. Almost all the participating countries failed to meet the TIMSS criteria for participation and exclusion rates. Yet TIMSS remains the only source of data that covers a wide range of participants and an extensive collection of information about primary and secondary schools in the world. Lessons learned from previous studies have helped TIMSS improve sampling considerably (TIMSS report, p. 23).

Definitions and Data Source

Poverty and Income Inequality

Poverty is difficult to define because it has slightly different meanings in different places. Generally, poverty can be defined as a state of deprivation: an insufficient supply of the basic means of survival such as food, shelter, health and education. There is poverty in every country, but it is most prevalent in developing countries. Even in developed countries with a high income per capita, there are segments of society that fall within the range of poverty. The poor everywhere must struggle in their state of deprivation to succeed in academic life. When there are extreme disparities in income within a society, the feeling of poverty is generated in relation to the amount of wealth that is perceived to be available. This form of poverty can also be measured in terms of income inequality. This disparity, as shown by high income inequality rates, has been on the rise (Daly & Duncan, 1998).

The Gini Coefficient

To compare the distributive quality of income in education, the Gini coefficient is more useful than income per capita or the percentage of public spending on education. Per capita income does not reveal the pattern of inequality, nor does public spending on education tell us how much goes to the various segments of society. The measure of income inequality, by contrast, provides a more accurate understanding of the share of national income controlled by the different segments of the population. The three child poverty variables used in this study represent the relative poverty rates at 20, 40 and 60 percent: the rate at 20% is shown as Ch_povr1, at 40% as Ch_povr2 and at 60% as Ch_povr3. These data were taken from the Luxembourg Income Study, which calculated child poverty rates from household income by controlling for households with children up to age 18.
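To make the inequality measure concrete, the following is a minimal sketch of how a Gini coefficient can be computed from a list of individual incomes, using the standard rank-weighted formula. It is not the World Bank's estimation procedure, which works from survey data; the function name `gini` and the sample incomes are hypothetical and serve only as illustration.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes.

    Returns 0.0 for perfect equality and approaches 1.0 as a single
    person holds all income. Illustrative sketch only.
    """
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one positive income")
    # Rank-weighted sum: the i-th smallest income gets weight i (1-based).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical incomes, not data from the study.
equal_society = [10, 10, 10, 10]
unequal_society = [0, 0, 0, 40]

print(gini(equal_society))
print(gini(unequal_society))
```

The World Development Indicators report the Gini index on a 0 to 100 scale, so a coefficient computed this way would be multiplied by 100 to be comparable.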

Data Source

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

I examine five independent variables and two dependent variables. The income inequality variable, as measured by the Gini index, is the overall income inequality within each of the participating countries studied, except Cyprus and Iceland, whose figures were not available. The income inequality data were taken from the World Bank (World Development Indicators, 2000), which is the most extensive and complete data bank of income inequality available. The science and mathematics scores were taken from the National Center for Education Statistics (NCES). I obtained relative child poverty rates from the Luxembourg Income Study, and the percentage of public spending on primary and secondary education was extracted from the Human Development Index (UNDP). Since the TIMSS dates back to 1998, I have opted to use available data from around the same period for comparability.

To analyze the data, descriptive statistics were generated to provide baseline information on each variable (mean, standard deviation, variance, minimum, maximum and the number of cases), as shown in Table 1. Case summaries are shown in Table 2, while Table 3 was generated through a bivariate correlation procedure that calculates Pearson's correlation coefficient and its level of significance in order to measure how the variables relate to one another.
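The two procedures described above can be sketched in a few lines of Python. This is only an illustration of the technique, not a reproduction of the statistical package actually used; the helper names `describe` and `pearson_r` and the five data points are hypothetical placeholders, not values from Tables 1 to 3.

```python
import math
import statistics


def describe(values):
    """Baseline statistics for one variable (sample SD and variance)."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),
        "variance": statistics.variance(values),
        "min": min(values),
        "max": max(values),
    }


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length, at least 2 items")
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical placeholder values, one entry per country.
gini_index = [25.0, 30.0, 35.0, 40.0, 45.0]
math_score = [550, 540, 500, 480, 470]

print(describe(gini_index))
print(pearson_r(gini_index, math_score))
```

A negative coefficient from `pearson_r` corresponds to the pattern examined in this paper, in which scores fall as inequality rises.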

Analyses and Findings

Income Inequality

Pearson's correlation coefficient (Table 3) for income inequality and the TIMSS math test score shows a negative relationship of -.655 at a significance level of .002 (p < .01): the math test score decreases as income inequality rises. The relationship between income inequality and science is also negative, at -.631 with a significance level of .004 (p < .01): the science test score likewise decreases as income inequality rises. Income inequality has very little relationship with government spending on primary and secondary education. Child poverty rates are significantly and positively correlated with income inequality at .787, .792 and .742, with significance levels of .000, .000 and .001.
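The significance level attached to a Pearson coefficient follows from a t-test with n - 2 degrees of freedom, where t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below applies that test to the reported math coefficient. The sample size n = 19 is an assumption (the 21 countries less Cyprus and Iceland, whose Gini figures were unavailable); the paper does not state the n behind each coefficient. The function names are hypothetical, and the t distribution is integrated numerically so that only the standard library is needed.

```python
import math


def t_statistic(r, n):
    """t statistic for testing a Pearson r against zero, with n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (
        math.sqrt(df * math.pi) * math.gamma(df / 2)
    )
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps is even)."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * t_pdf(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return 1 - 2 * central


# r is the reported math coefficient; n = 19 is an assumed sample size.
r_math, n_assumed = -0.655, 19
t = t_statistic(r_math, n_assumed)
p = two_tailed_p(t, n_assumed - 2)
print(round(t, 2), p)
```

Under that assumed n, the statistic comes out near t = -3.57 on 17 degrees of freedom, and the resulting p-value can be compared with the significance level reported above.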

Math and Science Scores

Math and science scores are shown in Table 3 to be positively and significantly correlated. On average, a country that scores low on one test scores low on the other, and by the same token a country scoring high in math scores high in science. This is important because the consistency in scores from one subject to the other reinforces, to some extent, the notion that the TIMSS test is a good measure of student performance.

As discussed above, the countries with the most unequal distribution of income, South Africa and the United States, scored very low on both the math and science tests. A preliminary look at the eighth grade scores also shows a significant negative correlation (Table 5b); these data are not analyzed or discussed further in this paper. Finally, Table 3 indicates that math and science scores are not as strongly related to child poverty as they are to income inequality, and that the relationship varies according to the percentile of children in poverty. The figures show that math is significantly correlated with child poverty rates at the 20 and 40 percent levels. The statistical findings are presented below.

** Correlation is significant at the 0.01 level (2-tailed).

* Correlation is significant at the 0.05 level (2-tailed).


To date, an enormous amount of research effort has been aimed at discovering the key elements responsible for student achievement. While some reviews of TIMSS have focused on critiquing the methodology and validity of the tests, others have focused on the quality of education in participating countries, assigning rankings based on scores. In evaluating the TIMSS report, considerable caution must be exercised in drawing conclusions on issues that might not be comparable across countries or across variables. The vast differences between cultures and national educational values are enough to illustrate that what works for Japan is not necessarily what is needed in France.

Operating from the assumption that every society is mirrored by its education system, it seems to make sense that a national indicator of student performance that is representative of the total population would capture both high and low academic achievers within a given country. An aggregate national achievement score does not necessarily measure the quality of the education system (Rotberg, 1995); rather, it measures the make-up of high and low scoring students within that system. Because many factors affect student performance, it seems imprudent to attribute all the credit or all the blame to schools. Research has shown, however, that the performance of poor black students is lower than that of middle-class white students (Coleman, 1966). Not only are there achievement differences across socioeconomic levels, there is also unequal achievement across racial lines. One would think that increased spending should help to attenuate the differences, yet some researchers show that school spending has very little correlation with increased student achievement (Hanushek, 1994; Monk, Nakib, Odden & Picus, 1995).

When we examine the main factors that affect the achievement of students, we find that these factors are at best only a part of a much bigger picture, one that is mostly determined by the students' out-of-school environment (Traub, 2000). One of the most important findings so far is that socioeconomic condition has a definite impact on school performance. A measure of student performance at the macro level using the TIMSS math and science scores has shown a correlation between income inequality and countries' aggregate scores.

This study is by no means exhaustive, nor does it pretend to have examined every possible angle. It is nevertheless an initial effort to explain the common denominator between countries and their respective test scores. There is no doubt that extensive research remains to be conducted. Below is a description of how the data were collected for the study.