Example Statistics Essay
Human Variability in Social Science Datasets
Human variability is an important component of social science datasets. How do social researchers account for this variability when drawing conclusions from data? Describe two situations in which the basis for these conclusions is undermined.
The information contained in social science datasets aims to provide an accurate description of the social world (Byrne, 1998, p. 126). However, individuals within any society are inherently highly variable due to human nature.
Haslam et al. (2009, p. 55) suggest that humanness is ascribed to members of our species in a manner which is largely taken for granted. Precisely what constitutes humanness, or human nature, is difficult to define (Schein, 2010, p. 143). One way of simplifying human variability is to understand it in terms of its genetic basis. The human genome is constructed of more than 35 000 genes; with the exception of monozygotic twins, these are unlikely to be identical for any two individuals on the planet (Naylor & Chen, 2010, p. 275). However, variability is likely to be further increased by the influence of social factors, as human nature is formed by a combination of these biological and social influences (Kundu, 2009, p. 16).
The concept of human nature and its underlying composition therefore opens up the notion of human variability. A general definition of variability would be “striking deviations from…patterns” where patterns are observed within a given population (Gould, 2004, p. 2). However, it could be argued that this very general definition is insufficient to describe human variability, as the concept of human nature already discussed indicates that we are all different from one another, making it difficult to ascribe any general patterns (Gould, 2004, p. 5; Cohen, 2007, p. 71). Even within the cultural model of human nature there is wide variability between individuals (Schein, 2010, p. 143). In this essay, human variability is therefore taken to constitute not only differences from the norm, but also differences simply from one another.
It is impossible for any research study to sample all of this variability adequately (OECD, 2000, p. 177). Nevertheless, in order to draw meaningful conclusions from studies in social science, it is important that researchers take human variability into consideration. This essay discusses the approaches taken by social science researchers to account for this variability, then presents two situations in which the basis for these conclusions may be undermined.
Accounting for Variability
It has been suggested that many of the models which social sciences rely on to explain human social phenomena may present a somewhat limited representation due to their not taking adequate account of human variability (Tanner, 2008, p. 2). However, some have gone so far as to argue that the presence of human variability means that there can be no general principles regarding human life and social interactions (Gould, 2004, p. 5; Cohen, 2007, p. 71). This is based on the observation that the human psyche only becomes organized as a result of external influences. This therefore implies that the human being is plastically variable, making it impossible to apply firm general laws to it (Cohen, 2007, p. 71). It has also been supported by the work of the German statistician Wilhelm Lexis, who found that dispersions of human behaviour from statistical models were much greater than predicted by chance (Gould, 2004, p. 5). Despite the possible validity of this argument, it is still desirable to attempt to construct general rules for the purpose of understanding the impact of different actions on the population, for example health care interventions and social policy.
In attempting to construct such models through analysis of social datasets, human variability may present an issue to social science researchers on two counts. The first is that within any study population there is likely to be a significant level of variability between members of that population at any one point in time. The second is that, since the variability is ongoing and continuous, the conclusions drawn from any piece of research may be inextricably linked to the specific circumstances which existed at the point in time at which the research was conducted (Tanner, 2008, p. 2).
One example of the first of these issues is given by Tanner (2008, p. 132), who discusses the reaction of individuals to religious gatherings. The complexity of human nature means that different values and motivations most likely result in every individual experiencing religious practice in a slightly different way. While this in itself often forms the focus of qualitative research, it may be extremely difficult to take account of this variability in quantitative studies. Another example is discussed by Byrne (1998, p. 126), who suggests that quantitative research conducted within one school is unlikely to be directly applicable within another. Byrne argues that the data collected are unlikely to be transferable, as they are likely to be significantly influenced by the school’s social dynamics, which are unlikely to be identical in any other school. A third useful example is the study of the placebo effect in medicine, where marked variation among individuals has been shown, making it difficult to derive any general trend (Lyby et al., 2011, p. 2405).
From such studies it is, however, possible to see how social science researchers attempt to account for variability when deriving conclusions from their datasets.
One approach is to limit the scope of the conclusions drawn from a study to a specific subsection of the population, as discussed by Byrne (1998, p. 126). This may involve performing primary research within the specific population to which the findings are to be applied. For example, if evidence is needed on which to base school policy, social science researchers may choose to conduct research specifically within that school. Here, however, there is still likely to be variation within the sample, in spite of shared social characteristics, as this approach would not account for other factors of human nature, such as genetic or personality differences, both of which may have a significant impact on behaviour and academic performance at school (Furnham et al., 2009, p. 769). The conclusions could therefore be inappropriately applied to those not fitting the original norm.
A similar situation may also arise from another approach, which is to remove outliers from the data, that is, those values which vary markedly from the mean (Motulsky & Christopoulos, 2004, p. 23). Although this would appear to limit the usefulness of any study in the larger social science context, it may be necessary in order to derive any meaningful predictive trends from the quantitative data. Many of the statistical tests used to analyse social science datasets are disrupted by high levels of variance. For example, ANOVA attempts to explain the variance in one variable within the population according to the presence or absence of other factors. Yet if these other factors are themselves too variable, the results are likely to be distorted. Additionally, the variable of interest must not depart markedly from the normal distribution, or this too will lead to inaccurate conclusions being drawn from the analysis (Richards, 2009, p. 14). This issue may be overcome in many instances by using a larger sample size, so that extreme values are more likely to lie in the tails of an approximately normal distribution (Gorard, 2003, p. 62).
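The effect of a single outlier on an ANOVA can be illustrated with a short sketch. The scores, the group names and the two-standard-deviation cutoff below are all hypothetical assumptions chosen for illustration; they are not taken from any of the studies cited, and the functions are a minimal hand-written one-way ANOVA rather than the method of any particular author.

```python
from statistics import mean, stdev

# Hypothetical test scores for three teaching approaches (illustrative only).
# The value 98 in "approach_a" is a deliberately planted outlier.
SCORES = {
    "approach_a": [62, 65, 63, 66, 64, 63, 65, 98],
    "approach_b": [70, 72, 71, 69, 73, 70, 72],
    "approach_c": [78, 80, 79, 77, 81, 79, 80],
}


def drop_outliers(values, cutoff=2.0):
    """Keep values lying within `cutoff` sample standard deviations of the mean.

    The cutoff of 2.0 is an assumption made for this sketch, not a standard.
    """
    centre = mean(values)
    spread = stdev(values)
    return [v for v in values if abs(v - centre) <= cutoff * spread]


def one_way_anova_f(groups):
    """Return the F statistic of a one-way ANOVA for a list of groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = mean(all_values)
    k = len(groups)        # number of groups
    n = len(all_values)    # total number of observations

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individuals around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within


if __name__ == "__main__":
    raw = list(SCORES.values())
    trimmed = [drop_outliers(group) for group in raw]
    print(f"F with outlier:    {one_way_anova_f(raw):.2f}")
    print(f"F without outlier: {one_way_anova_f(trimmed):.2f}")
```

In this sketch the single extreme score inflates the within-group variance, so the F statistic is far smaller before trimming than after it. The trimmed analysis gives a clearer trend, but only by discarding an individual who genuinely belongs to the population, which is the trade-off described above.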
An alternative approach is instead to limit actively the variation in the data analysed, as this may produce results which are more readily generalizable to the study population (Gorard, 2003, p. 61). One way to achieve this is to strip individuals within any dataset down to shared basic characteristics and then select participants who match the desired set of characteristics. This involves reducing aspects of culture down to common elements (Shore, 2012, p. 148). For example, in the study by Lyby et al. (2011, p. 2405) participants were selected on the basis of certain shared aspects of their medical history and care, so that the conclusions drawn would be targeted towards this specific subset of the population, and would not be applicable to those who differ in these characteristics. This approach may be useful when considering individualized medical interventions, but may be less useful when considering wider social initiatives, where the population will invariably stray from tightly shared characteristics.
Neighbourhoods and Communities
One area in which conclusions drawn from social datasets may be undermined is the study of neighbourhoods and communities for the purposes of policy formation. An example of this is the policy, a decade ago, of introducing street wardens in the UK. Studies conducted in individual neighbourhoods in which street wardens were introduced showed subsequent reductions in crime and other negative outcomes. The conclusions from these studies attempted to account for variability by suggesting that the results be applied on a limited level (University of Leeds, 2005, p. 5; Sin, 2008, p. 389). However, when these conclusions were applied more widely, very varied results were seen across different neighbourhoods (Sin, 2008, p. 389).
This could be due to issues similar to those discussed by Byrne (1998, p. 126) in the context of schools. The characteristics of communities in different areas of the UK are likely to be highly varied due to individual, local social and environmental factors. These communities are therefore likely to react very differently to the same social intervention. Here it could be suggested that the usual approaches taken to account for variation in dataset analysis may not be adequate. For example, it would be very difficult to obtain large sample sizes if each community is considered as an individual unit. It would also be difficult to remove outliers, as the different communities would be expected to vary so markedly that it would be difficult to establish a norm. Even if a norm could be established, removing outliers would then severely limit the applicability of the analysis when considering national policy. One of the best solutions would instead be to collect data from each individual area and tailor policy on a local level according to these findings.
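The way a pooled result can hide variation between communities can be sketched as follows. The area names and percentage changes are entirely hypothetical and are not drawn from the street warden studies cited; the sketch only illustrates the reasoning, under the assumption that each community is treated as a single unit of analysis.

```python
from statistics import mean, stdev

# Hypothetical percentage change in recorded crime after an intervention
# (negative values are reductions). Illustrative figures only.
CRIME_CHANGE_BY_AREA = {
    "Area A": -18.0,
    "Area B": -12.0,
    "Area C": 3.0,
    "Area D": -15.0,
    "Area E": 6.0,
}


def summarise(changes):
    """Summarise the pooled effect and how much individual areas depart from it."""
    values = list(changes.values())
    return {
        # The single figure a national-level analysis might report.
        "pooled_mean": mean(values),
        # Sample standard deviation across areas, treating each area as one unit.
        "spread": stdev(values),
        # Areas where the outcome moved in the opposite direction.
        "areas_worse": sorted(area for area, change in changes.items() if change > 0),
    }


if __name__ == "__main__":
    summary = summarise(CRIME_CHANGE_BY_AREA)
    print(f"Pooled mean change: {summary['pooled_mean']:.1f}%")
    print(f"Spread across areas: {summary['spread']:.1f} percentage points")
    print("Areas where crime rose:", ", ".join(summary["areas_worse"]))
```

With these illustrative figures the pooled mean suggests an overall reduction, yet the spread across areas is larger than the pooled effect itself and two areas moved in the opposite direction. Reporting each area separately, as suggested above, keeps this variation visible.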
A second example of an area in which social dataset conclusions may be undermined by human variation is adult alcohol consumption patterns and their effects (Gould, 2004, p. 7). Alcohol consumption is now understood to be determined by a multitude of factors, including genetic factors, environmental factors, social circumstances and personality characteristics (Dick et al., 2011, p. 2152; Kendler et al., 2011, p. 1507). This in itself clearly opens this behaviour up to wide variation between individuals due to the complex interaction of these factors, a consequence of human variability as discussed in the introduction. However, it also means that the precise nature of alcohol consumption may be open to change over time. Many of the shared aspects of humanity within any social group could be considered to be evolutionary in nature (Shore, 2012, p. 149). Environmental and social variables change, for example, but this evolutionary concept would imply that humans may also change in their very nature over time (Gould, 2004, p. 7).
This may be further explained taking the example of a research study by Hingson et al. (2009, p. 783) which examined the influence of age of drinking onset on physical injuries, motor vehicle crashes and physical fights after drinking. The study concluded that drinking at an early age was associated with greater odds of each of these consequences and that delaying the age of drinking onset should be promoted to reduce the risk to the individual. Yet human variability could undermine these conclusions in a number of ways.
For example, there is evidence that rates of violence have increased in general over time (Eisner, 2008, online). Although this could in itself be attributed to increasing trends in alcohol consumption, this may be unlikely given data which indicate a decline in drinking in the US (Kerr et al., 2009, p. 27). However, a number of research studies have indicated that personality traits possibly linked to violent behaviour have changed over the past few decades (Twenge et al., 2008, p. 875). This indicates that the risk of violence is linked to variation in human nature over time and could change in the near future, which was not accounted for when discussing the future implications of this research. It is therefore entirely possible that the suggested interventions could have little impact on the risk of violence.
Conclusion
The very nature of humanity means that individuals vary markedly from each other, due to the influence of different genetic, psychological, social and environmental factors. The overwhelming potential for different combinations of these factors means that within any social setting it is likely that significant variability will exist, even though some common factors may be shared by members of that population. Although this variability is one of the main sources of interest in the study of sociology, it may present a challenge when conducting analysis on quantitative social datasets. It may be accounted for when drawing conclusions by ensuring that the analysis reduces the variability in the dataset, or by limiting the application of the conclusions outside of the sample from which they were drawn.
References
Byrne, D.S. (1998) Complexity Theory and the Social Sciences: An Introduction. London: Routledge, p. 126.
Cohen, E. (2007) The Mind Possessed: The Cognition of Spirit Possession in an Afro-Brazilian Religious Tradition. Oxford: Oxford University Press, p. 71.
Dick, D.M., Meyers, J.L., Rose, R.J., Kaprio, J. & Kendler, K.S. (2011) Measures of current alcohol consumption and problems: Two independent twin studies suggest a complex genetic architecture. Alcoholism: Clinical and Experimental Research, 35(12), 2152-2161.
Eisner, M. (2008) Modernity strikes back? A historical perspective on the latest increase in interpersonal violence (1960-1990). International Journal of Conflict and Violence, 2(2). Available [online] from: http://www.ijcv.org/index.php/ijcv/article/viewArticle/41 [Accessed 28/11/2011].
Furnham, A., Monsen, J. & Ahmetoglu, G. (2009) Typical intellectual engagement, Big Five personality traits, approaches to learning and cognitive ability predictors of academic performance. British Journal of Educational Psychology, 79(4), 769-782.
Gorard, S. (2003) Quantitative Methods in Social Science. London: Continuum, pp. 61-62.
Gould, R. (2004) Variability: One statistician’s view. Available [online] from: http://escholarship.org/uc/item/5013f27n;jsessionid=94421E4A96F2309060D1074A216A2591#page-1 [Accessed 25/11/2011].
Haslam, N., Loughnan, S., Kashima, Y. & Bain, P. (2009) Attributing and denying humanness to others. European Review of Social Psychology, 19(1), 55-85.
Hingson, R.W., Edwards, E.M., Heeren, T. & Rosenbloom, D. (2009) Age of drinking onset and injuries, motor vehicle crashes, and physical fights after drinking and when not drinking. Alcoholism: Clinical and Experimental Research, 33(5), 783-790.
Kendler, K.S., Gardner, C. & Dick, D.M. (2011) Predicting alcohol consumption in adolescence from alcohol-specific and general externalizing genetic risk factors, key environmental exposures and their interaction. Psychological Medicine, 41, 1507-1516.
Kerr, W.C., Greenfield, T.K., Bond, J., Ye, Y. & Rehm, J. (2009) Age-period-cohort modeling of alcohol volume and heavy drinking days in the US National Alcohol Surveys: Divergence in younger and older adult trends. Addiction, 104(1), 27-37.
Kundu, A. (2009) Social Sciences: Methodology and Perspectives. New Delhi: Dorling Kindersley, p. 16.
Lyby, P.S., Aslaksen, P.M. & Flaten, M.A. (2011) Variability in placebo analgesia and the role of fear of pain – an ERP study. Pain, 152(10), 2405-2412.
Motulsky, H. & Christopoulos, A. (2004) Fitting Models to Biological Data Using Linear and Nonlinear Regression. Oxford: Oxford University Press, p. 23.
Naylor, S. & Chen, J.Y. (2010) Unraveling human complexity and disease with systems biology and personalized medicine. Personalized Medicine, 7(3), 275-289.
OECD (2000) Social Sciences for a Digital World. Paris: OECD, p. 177.
Richards, G. (2009) Psychology: The Key Concepts. Milton Park: Routledge, p. 14.
Schein, E.H. (2010) Organizational Culture and Leadership. San Francisco, CA: Jossey-Bass, p. 143.
Shore, B. (2012) Unconsilience: Rethinking the two-cultures conundrum in anthropology. In E. Slingerland & M. Collard (Eds.) Creating Consilience: Integrating the Sciences and the Humanities. Oxford: Oxford University Press, pp. 140-160.
Sin, C.H. (2008) The introduction of Street Wardens as a social policy intervention in Britain targeting the regeneration of local communities: Theory and practice. Journal of Urban Regeneration and Renewal, 1(4), 389-400.
Tanner, R.E.S. (2008) Contemporary Social Science Research. New Delhi: Concept Publishing Company, pp. 2, 132.
Twenge, J.M., Konrath, S., Foster, J.D., Campbell, W.K. & Bushman, B.J. (2008) Egos inflating over time: A cross-temporal meta-analysis of the Narcissistic Personality Inventory. Journal of Personality, 76(4), 875-902.
University of Leeds (2005) Criminal Justice Review. Available [online] from: http://www.leeds.ac.uk/law/ccjs/an_reps/17rep.pdf#page=39 [Accessed 28/11/2011].