Racial Bias in IQ Testing

The Science and Politics of IQ: The Race Concept


The ideology of race has existed for millennia, and indeed depictions of race relations in ancient Egyptian paintings stretch as far back as 1350 BC (1). The empires of the ancient world – Egyptian, Greek, Roman, and later the Muslim empire – included peoples with vastly different skin colours, hair textures, and facial features. However, history shows that Africans in Europe were assimilated into those societies wherever they were found without significant social differences being linked to their physical variations (2). Moving into the sixteenth century, and lasting until the 1700s, race was a folk idea, or a general categorizing term viewed as interchangeable with terms such as type, kind, sort, or breed (2).

Nearing the end of this time period, Europeans began encountering non-European civilizations with increasing frequency, and Enlightenment scientists and philosophers began ascribing a biological meaning to race (1). Notably, Carl Linnaeus, a Swedish physician, botanist, and zoologist well-known for establishing the taxonomic bases used to classify and name organisms, put forth that species varied by culture and place (1). The term race began to be applied to plants, animals, and humans as a taxonomic subclassification within a species, and as a result, race became understood as a biological, or natural, categorization system of Homo sapiens (1). During the Revolutionary era, as European colonization of the West expanded and slavery became commonplace, the concept of race was used to legitimize and prescribe the exploitation and domination of, and violence against, non-White peoples (1). After all, this was a time in which the dominant political philosophy was that of equality, civil rights, democracy, and justice for all human beings, so the only way leaders of American colonies could justify slavery was by demoting Blacks and other racialized groups to nonhuman status (2).

Scientific investigation into racial differences began during the second half of the 18th century, and the 19th century saw the initiation of attempts to quantify race differences through the measurement of heads and other anatomical structures (2). By the end of the 1800s, greater attention was being paid to the size and contents of the braincase, and scientists thus arrived at the final criterion by which they believed race differences could be measured: the function of the brain (2). And so, in the early 20th century, intelligence tests became the dominant focus of scientists interested in documenting the differences between races, especially between Blacks and Whites (2).

In 1905, French psychologist Alfred Binet and his colleague Theodore Simon published the first intelligence test, known as the Binet test (3). In 1916, Lewis Terman published his revised and Westernized Stanford-Binet test, with the intention of “discover[ing] enormously significant racial differences in general intelligence” (4). These tests were used to diagnose “defective genes”, prescribe mass sterilizations of “feebleminded individuals”, and further racial inequalities in resource access (5-8). This use of science to legitimize the abuse of IQ testing demonstrates Richard Lewontin’s idea that science is “a supremely social institution, reflecting the dominant values and view of society” (9).

The consensus among most of today’s academics in fields such as evolutionary biology and anthropology is that the concept of race suffers three critical flaws: race differences are not genetically discrete, they are not reliably measured, and they lack scientific meaning (2). Given that our species is at most a few hundred thousand years old, there has not been enough evolutionary time to allow for distinct species with different genetic capacities to evolve (2, 10, 11). In other words, Homo sapiens has no extant subspecies (2, 10, 11). Despite the current understanding that race is a malleable social construct largely informed by the social power hierarchies of a given time period, research investigating racial differences still continues (1, 12, 13). More than that, IQ testing continues to play a major allocative role in structuring social institutions and reflects its racially-motivated origin (7, 14, 15).

There is little debate that IQ testing reflects racial differences (16-18); however, this paper seeks to use the wealth of research conducted over the past century to investigate the cause of these differences. This work then proposes next steps that educators, educational psychologists, and policymakers can consider and implement in order to make more meaningful strides towards closing the racial IQ gap.

Past Research

In 1914, two years before North American IQ tests were standardized by Terman, an American scientist named Henry Goddard investigated the heritability of “feeblemindedness” (6). He hypothesized that this was a simple recessive Mendelian trait; however, his theory was largely based on genetic models riddled with definitional issues, such as highly diverse categorizations, pedigrees whose creation relied on hearsay and subjective judgments, and a selective analysis and pre-screening of data (5). These are critical methodological limitations, but they enabled Goddard to find that feebleminded individuals disproportionately belonged to minority groups (19). Such was the political and ideological climate in which IQ testing was becoming embroiled.

In 1936, Jenkins asked if Black individuals with a significant degree of European heritage were more likely to have high IQ scores (16). He reasoned that, when looking at a plot of IQ scores, the extreme high-end tail of the distribution – the cluster of points representing high scores – should be especially telling: if IQ is largely genetically based, and if White individuals do in fact have higher IQs than Black individuals, then people at the high-end tail should have substantial European ancestry (16). Working with Black Chicago schoolchildren, Jenkins identified 63 children with IQs of 125 or higher, and 28 with IQs of 140 and above (16). He assessed their degree of European ancestry using self-reports about parents and grandparents, and categorized children as being all Black, mostly Black, exactly half Black and half White, or mostly White (16). Contrary to his hypothesis, Jenkins found that the identified children were actually slightly less likely to have substantial European ancestry than was estimated to be characteristic of the United States Black population at the time (16). He ultimately concluded that the results were consistent with a model of zero genetic contribution to the Black-White IQ gap (16).

But findings such as Jenkins’ were outliers at that time. For instance, Audrey Shuey published a seminal literature review, The Testing of Negro Intelligence, in 1958, that summarized the findings of 288 publications on race and IQ that had been conducted in the previous forty years (20). Although her work found that, for example, IQ scores and skin colour were barely correlated (with a correlation coefficient of only 0.1), Shuey insisted that skin colour was simply a weak proxy for genomic ancestry, and that, had these studies used more rigorous measures, results would “inevitably point to the presence of native differences” between races (20).

Despite Shuey’s assertions, her work was taken with a grain of salt, and as time has worn on, academics have voiced serious concerns about the statistical and methodological inconsistencies in Shuey’s analysis of data (11, 21, 22). To begin, Brown and Loehlin point out that Shuey had considered all the included studies as being equally valid (21, 22). Doing so neglects the methodological differences that exist between these investigations, and the use of varied, non-genomic proxies for ancestry (skin colour, nose width, hair type, and blood type, to name a few) presents a serious limitation to the studies’ internal validity (21, 22). Loehlin extends his criticism by arguing that the sample sizes in some of these publications were too small to support statistically significant interpretations or accurate conclusions (22). In 2016, in an attempt to clarify the genetic basis for racial differences in IQ, Kirkegaard published a paper examining the use of genomic ancestry as a predictor of cognitive ability and socioeconomic performance (12). His results led him to conclude that there was no genetic basis for race differences in IQ, directly contradicting Shuey’s claim (12).

As can be seen from Brown and Loehlin’s criticisms of Shuey’s publication, the 1970s brought with them more research arguing for an environmental origin of the Black-White IQ gap (21, 22). In 1974, Willerman, Naylor, and Myrianthopoulos looked at interracial children born to either a Black or a White mother (23). The idea was that if the Black-White IQ gap is largely hereditary, then children with one Black and one White parent should have, on average, the same IQ, no matter which parent is Black (23). If, however, mothers were particularly important to the intellectual socialization of their children, and the socialization practices of White mothers were more favourable than those of Black mothers, then children of White mothers and Black fathers should have higher IQs than children of Black mothers and White fathers (23). It emerged that children of White mothers and Black fathers had IQs 9 points higher than children of Black mothers and White fathers, suggesting that most of the Black-White IQ gap is due to environmental factors (23).

Scarr and Weinberg performed a study in 1976, in which they investigated the IQ test performance of Black and interracial children adopted by White families in Minnesota (17). Over 100 families participated, with a total of 321 children four years old or older (17). Of those children, approximately half were the biological White children of adoptive parents, while the other half were the Black and interracial adopted children (17). Intellectual, personality, and attitudinal tests were administered to both the parents and the children, extensive interviews were conducted with the parents, and ratings of home environment were made (17). Based on their analyses, the authors saw that socially-classified Black children’s IQ scores varied by at least one standard deviation, or fifteen IQ points, in White environments compared with Black environments (17). They further found that interracial children scored an estimated 12 points higher than those with two Black parents, but it is important to note that they attributed that difference to confounding variables such as large differences in maternal education and pre-placement history (17). Based on these results, Scarr and Weinberg ultimately concluded that “if all [B]lack children had environments such as those provided by the adoptive families in this study … their IQ scores would be 10-20 points higher than their scores are under current rearing conditions” (17).

Contemporary Research

Although the majority of literature investigating racial differences in IQ was published in the latter half of the twentieth century, there are also several important contemporary works that merit discussion. In 2003, Philippe Rushton published Brain Size, IQ, and Racial-Group Differences: Evidence from Musculoskeletal Traits (13). It is important to note that from 2002 until his death, Rushton was the head of the Pioneer Fund, an American non-profit research foundation widely criticized for holding ties to eugenics and being rooted in racist ideology (24-27). There seem to be a number of methodological inconsistencies in the manner in which Rushton interpreted his results, and the conclusion that follows is not unlike those drawn in his previous works (28-33). Here, Rushton suggests that evolution exerted different selection pressures on geographically isolated populations, which led to a wide array of racial differences (13). Among those differences was brain size, and it is the racial variations therein that have led to racial differences in IQ (13). However, closer scrutiny of Rushton’s work reveals that the evidence he offers to support the claims that (a) brain size is positively correlated with intelligence and (b) brain size differs between races is woefully inadequate.

Rushton relied on two resources to substantiate his first claim that brain size is positively correlated with IQ (13). The first was a paper by Van Valen, which gives the experimental correlation of brain size and IQ as being between 0.08 and 0.22, and provides an estimated maximum correlation of 0.3 (34). But rather than use Van Valen’s experimental results, Rushton opted to use the estimated value of 0.3 because it was higher (35). This decision calls into question Rushton’s scientific integrity, especially since he does not provide any explanation or reasoning to support his choice. The second paper Rushton uses quite heavily is Brain Size and Intelligence in Man by Passingham (13, 36). In this publication, Passingham establishes an extremely weak correlation in males between brain size and occupation, which he uses as a proxy for IQ (36). Yet the impression that Rushton conveys is that these data strongly support his own claim that brain size and intelligence are correlated (35).
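
To put the correlations quoted above on a common scale, a correlation coefficient r can be squared to give the share of variance in one variable that is linearly associated with the other. The following is a minimal Python sketch of that conversion; it is not drawn from Van Valen's or Rushton's analyses, and the function name `variance_explained` is a hypothetical helper introduced only for illustration.

```python
def variance_explained(r: float) -> float:
    """Return r squared: the proportion of variance shared under a simple linear model."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")
    return r * r


# Correlations quoted in the text: Van Valen's experimental range (0.08 to 0.22)
# and the estimated maximum (0.3) that Rushton chose to use.
for label, r in [
    ("experimental, low end", 0.08),
    ("experimental, high end", 0.22),
    ("estimated maximum", 0.30),
]:
    print(f"{label}: r = {r:.2f}, r^2 = {variance_explained(r):.1%}")
```

On this reading, even the estimated maximum of 0.3 would correspond to roughly 9% of shared variance, and the experimental values to less than 5%.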

In defending his second assertion, that there are racial differences in brain size, Rushton uses the works of five sets of authors; however, only one of those provides primary data (13, 37). The remaining four sources were secondary review articles interpreting the correlations between brain size and IQ originally presented by others (38-41). All of these publications are highly suspect, as they lack both crucial test controls and detailed information about data collection methods (35). Coon’s 1982 publication measures the external skull dimensions rather than skull capacities (39). Molnar’s 1983 publication does not explicitly mention the source of data and lacks controls for age, sex, and body height (38). Tobias’s 1970 publication also lacks controls for sex and body size, age at death, nutritional status in early life, and lapse of time after death (41, 42). As the total race difference in skull capacity is less than 4%, a failure to control for any one of these variables presents enough of a confound to more than explain this difference (35, 41). Rushton’s reliance on Gould’s 1978 paper is also troubling, given that it appears that Rushton manipulated data by organizing racial cranial capacities in a manner consistent with his theory. A similar issue occurs in Rushton’s presentation of Ho et al.’s data, in which he averages brain weights from various studies, leading to significant statistical problems. Taking into consideration the superficial analyses and deviations from appropriate scientific methodology that seem to be characteristic of Rushton’s work, there is firm ground to be dubious of his claims.

A critical approach to Rushton’s research is especially important because, in the same year, Turkheimer and his colleagues investigated the relationship between SES and IQ heritability in young children, interpreting SES as a measure of the quality of the environment in which children were born and raised (18). They studied 319 pairs of twins, both mono- and dizygotic, and the racial split between White and Black was approximately half and half (18). Using models that allowed the components attributable to the additive effects of genotype, shared environment, and non-shared environment to interact with SES, measured as a continuous variable, the authors concluded that in impoverished families, 60% of the variance in IQ is accounted for by the shared environment, while the contribution of genes is nearly nonexistent (18). Meanwhile, in affluent families, the result was almost exactly the reverse: a large proportion of IQ variance resulted from genes and nearly none was due to shared environment (18). This is important, as at the time, the majority of individuals living at or below the poverty line were racialized rather than White (43).
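
The partition of variance into genetic, shared-environment, and non-shared-environment components can be illustrated with a much simpler technique than the one the study used. The sketch below swaps in the classical Falconer approximation, which estimates the three components from monozygotic and dizygotic twin correlations alone; it is not Turkheimer and colleagues' model, which allowed the components to vary with SES. The function name `falconer_ace` and the twin correlations are hypothetical, chosen only to mimic the two patterns described above, and are not data from the study.

```python
def falconer_ace(r_mz: float, r_dz: float) -> dict:
    """Classical Falconer approximation of variance components from twin correlations.

    A: additive genetic share, C: shared environment, E: non-shared environment.
    """
    a = 2.0 * (r_mz - r_dz)   # MZ twins share about twice the segregating genes of DZ twins
    c = 2.0 * r_dz - r_mz     # similarity not accounted for by genes
    e = 1.0 - r_mz            # whatever makes MZ twins differ
    return {"A": a, "C": c, "E": e}


# Hypothetical correlations (assumptions for illustration only).
low_ses = falconer_ace(r_mz=0.60, r_dz=0.60)     # MZ and DZ twins equally alike
high_ses = falconer_ace(r_mz=0.75, r_dz=0.375)   # MZ twins twice as alike as DZ twins

for label, parts in [("low SES", low_ses), ("high SES", high_ses)]:
    print(label, {k: round(v, 2) for k, v in parts.items()})
```

When monozygotic and dizygotic twins resemble each other equally, the approximation attributes the resemblance to shared environment; when monozygotic twins are twice as alike, it attributes the resemblance to genes.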

Even with research such as Turkheimer’s, an argument still common at the time was that IQ was genetically based because the Black-White IQ gap had remained constant over the course of the past thirty years (44). This constancy had prevailed even in the face of positive environmental changes for Black individuals, for instance gains in occupational status and school funding (44). However, this argument was being asserted despite the lack of empirical data to measure trends in Black IQ, and Dickens and Flynn sought to remedy this (44). Obtaining test results of Black individuals for four different IQ tests spanning from 1972 to 2002, and standardizing White IQ at 100, Dickens and Flynn found that Black IQ had increased by an average of 0.188 points per year (44). In other words, the Black-White IQ gap had decreased from as high as 18 points in 1980, to 12 points in 2002 (44). Because the Black-White IQ gap was not constant, Dickens and Flynn concluded that environment had to play some role in the difference between races (44).
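
The arithmetic behind this trend can be checked with a short calculation. The following Python sketch assumes, purely for illustration, that the reported average gain of 0.188 points per year applied uniformly across the whole 1972 to 2002 window; the function name `cumulative_gain` is hypothetical, and the linear form is a simplification rather than Dickens and Flynn's own method.

```python
GAIN_PER_YEAR = 0.188  # average yearly gain in Black IQ reported in the text


def cumulative_gain(start_year: int, end_year: int, rate: float = GAIN_PER_YEAR) -> float:
    """Total gain in IQ points over the interval, assuming a constant yearly rate."""
    if end_year < start_year:
        raise ValueError("end_year must not precede start_year")
    return rate * (end_year - start_year)


gain = cumulative_gain(1972, 2002)
print(f"Cumulative gain, 1972 to 2002: {gain:.2f} points")
```

Under that assumption the cumulative gain is between five and six points, which is of the same order as the narrowing from as high as 18 points to 12 points described above.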

Dickens and Flynn were met with a great deal of criticism, especially from Rushton and Jensen (2006), who wrote a paper titled The Totality of Available Evidence Shows the Race IQ Gap Still Remains (29). Rushton and Jensen explained that they examined ten categories of technical research, and ultimately concluded that the mean Black-White IQ difference in the United States is roughly 80% heritable (29). Two things are important to note: first, Dickens and Flynn did not argue that the race IQ gap no longer exists, simply that it is smaller than it has been in the past. Second, and more important, Rushton and Jensen (2006) provide no real evidence in support of their assertion of the heritability of IQ.

The State of IQ Testing Today

In his influential 1974 book, The Science and Politics of IQ, Leo Kamin writes that “the politics of intelligence testing and the science of intelligence testing are inseparable” (45). From a public policy perspective, IQ testing remains an issue – despite the breadth of research surrounding it – because these tests play a major allocative role in educational opportunities (46). Thus, the use of science to validate the methodology of IQ testing has far-reaching consequences for the organization and hierarchy of social institutions (15, 46, 47). Given the complex and tangled nature of IQ and politics, the politicization of IQ should be used to arrive at the most equitable outcomes, involving the democratization of the allocation of “human resources” to various professions within society, and a meritocratic system for assessing professional suitability (48). However, racial gaps in IQ testing today suggest that IQ tests have largely failed as a means for achieving a more democratic society through the unbiased search for ability.

Today, the racial gap in standardized IQ testing is reflected in score differences in which failures (as defined by standardized tests) are disproportionately concentrated in minority, low SES communities (46). This racial gap disenfranchises minority groups by presenting access barriers to education and workplace opportunities (18, 46, 47). Although 79% of educators, educational policymakers and specialists, and developmental psychologists believe that IQ tests pose at least some racial bias and 84% believe that they reflect some SES bias, the overwhelming majority still think that these tests should be used for students of all races (48). Researchers have therefore concluded that the current utility of IQ tests outweighs the potential negative impacts that they have on minority groups (48). On a first pass, this reasoning seems to support the continued racial disparity in professional and workplace opportunities, but it makes sense if utility is judged on a relative rather than absolute basis (48). That is, researchers argue that racial problems introduced by IQ testing would be amplified in the absence of standardized testing, rather than attenuated (48). Furthermore, researchers extend these judgments to the validity of a variety of standardized admissions tests, including the SAT, ACT, GRE, MCAT, LSAT, and GMAT (48).

Although the problems with IQ testing manifest as many biases when considered on a local level, a more global perspective reveals that high-stakes, standardized tests exacerbate racial inequalities (7). Research suggests that the aggregation of low standardized test scores in Black communities distorts students’ locus of control and attributions of failure, such that students are more likely to internalize their inability to achieve desired test scores as being their fault, while at the same time considering their test results to be outside of their control (7, 49). This combination of an external locus of control and internal attributions leads to a situation aptly summarized in the metaphor of a leaky pipeline, which reflects that Black students are more likely to drop out of educational programs than their White counterparts, and that this situation is exacerbated by low standardized test scores (49).

Moving Forward

Although a wide range of measures have been proposed to improve the racial disparity in standardized IQ tests, researchers agree that much work needs to be done before equality in education is reflected in the way we evaluate students (7, 15, 46-48). This paper suggests four strategies that have the potential to alleviate this racial disparity in IQ testing.

The first is the acceptance of the scientific consensus on the genetic basis of racial differences in IQ, together with a stronger push to address the environmental barriers faced by students of minority races struggling for equitable education opportunities. On a state and federal level, the political discussion must move away from policies motivated by the idea that there is no use in increasing resource access and educational opportunities in minority groups, as purported by Jensen in his famous 1969 publication How Much Can We Boost IQ and Scholastic Achievement? (50). Instead, elected government officials need to maintain and expand the funding of social welfare programs designed to address these issues. It is reasonable to suggest that a concerted effort to alleviate racial disparities in environmental factors would serve as a fundamental basis for leveling out the academic playing field.

At an institutional level, graduate schools need to modify their admissions requirements to reflect the racial disparity in IQ testing that extends to minority groups. By diversifying the admissions requirements of professional schools, admissions councils have the chance to relax the disproportionate pressures exerted on minority groups through standardized testing, and ensure that students of lower SES still have the chance to participate in the graduate selection process (14, 51, 52). Given that Black students are more likely to incur the negative impacts of test anxiety on performance during standardized tests, educational policymakers need to determine more appropriate ways of overcoming this disproportionate effect should they seek to alleviate minority testing biases.

Finally, a growing body of literature suggests that teachers in low SES neighborhoods tend to divert a greater proportion of classroom time and resources away from academic goals and towards increasing student self-esteem (14, 53). Aligning teachers’ responses with students’ academic needs in low SES communities provides a promising avenue through which to improve student performance.

Although there were – and continue to be – racial differences in IQ (16-18), and the vast majority of research looking into these differences was conducted in the 1900s (a time period known for blatant and rampant racism), it is evident that even then, scientists were more-or-less consistently finding that genetics were of smaller consequence to IQ than environment (16, 17). And yet, certain researchers, like Shuey, Rushton, and Jensen, were (and are) adamant that DNA is to blame (20, 28-33). This seems to indicate that these authors approached their research with an inherent bias, and their conclusions are less a reflection of the validity of their results, and more about their own ideologies and political agendas.

As we move forward, it is important to keep in mind that science is often a reflection of the political climate of the time. Especially when considering race and IQ, consider an author’s motivations and personal opinions. Think critically about their results. Take their claims with a grain (or several) of salt.

