Behaviourism 1: Skinner's 'reinforcement' and 'conditioning' theories lecture
This chapter introduces the broad theory of Behaviourism, explaining where it comes from and what it means in terms of ideas and methods. It then describes Skinner's ideas and explains key terms such as 'reinforcement' and 'conditioning' in this context. The chapter then discusses how this theory applies to educational research and what this means for educational practice. Examples are provided to illustrate how the theory works out in various educational contexts. The strengths and limitations of this theory are explored. There are prompts for reflection which focus on key points and help you to relate this material to your own knowledge and experience.
Learning objectives for this chapter
By the end of this chapter, you should be able to:
- understand and explain clearly what Behaviourism means
- understand and explain clearly what Skinner means by concepts such as 'reinforcement' and 'conditioning'
- explain how Skinner's work on behaviourism is applied to education
- critically evaluate and discuss the strengths and limitations of this theory
- link this theory to educational practice
What is Behaviourism?
Behaviourism is a branch of Psychology which began in the late nineteenth century with the work of Russian psychologist Ivan Pavlov (1849-1936), and was further developed in the United States by Edward L. Thorndike (1874-1949), John B. Watson (1878-1958) and B. F. Skinner (1904-1990). One of its most famous principles is the idea that all behaviour, in both animals and humans, can be traced back to a neurological association between stimulus and response.
Pavlov was at first interested in physiological processes such as digestion. He conducted experiments with dogs, and noticed by chance that when technicians entered the room, the dogs started salivating. He deduced that this was an involuntary, automatic reflex, which occurred even before any sight or smell of food was evident. The dogs had learned to associate the entry of the technicians with the imminent arrival of food. This led to further research into the relationship between a stimulus and the behavioural responses to that stimulus. It was noted that dogs could be trained to salivate at the sound of a ringing bell, if food was consistently brought to them when a bell was rung. On the basis of this empirical work, Pavlov developed the theory of conditioned reflexes, sometimes known as classical conditioning, which explored how previous conditions influence behaviour in the present. Further experiments with animals were carried out, in which different stimuli were offered or withdrawn. By manipulating these variables, he demonstrated how a conditioned reflex is formed. Animals were conditioned through repeated actions to respond in a certain way.
This idea has been highly influential across many fields, since it offers a way of training animals and humans to respond in a predictable way by creating associations between a stimulus and a response, even when there has previously been no direct link between the stimulus and the response. In schools, for example, a bell is rung to signal the end of one period and the start of another. Teachers often observe children reaching for their coats and bags as soon as the first tone is heard. Cattle or pigs in a field will gather at the sound of the feed bucket being shaken by the farmer, so long as they have had experiences in the past that have indicated a link between this sound and the arrival of food. Babies learn to open their mouths when they see a spoon coming towards them, and many sensory experiences have the power to change our behaviour every day, simply because we have been exposed to them in the past, and this earlier experience conditions how we will respond to a similar stimulus now and in the future. This principle is very obvious when there is a direct link between a primary stimulus, such as food, and an organism's behaviour.
Edward L. Thorndike developed a behaviourist theory of learning that he called 'connectionism', based on the idea that "the most fundamental type of learning involves the forming of associations (connections) between sensory experiences (perceptions of stimuli or events) and neural impulses (responses) that manifest themselves behaviorally" (Schunk, 2012, p. 31). This theory assumes that animals can learn by themselves, without human intervention, through trial and error. The animal will carry out a series of random responses which are bound to result in some successful and some unsuccessful outcomes but gradually the unsuccessful outcomes are rejected, and the successful outcomes are retained. The connections are made through frequent repetition, until finally the desired goal is achieved. Cats, birds, primates and many other animals can be observed trying out various swipes, pecks and jumps in the hope of attaining some immediate goal such as opening a gate, or accessing some food item that is stuck or out of reach.
This theory can be used to explain the type of learning that can be seen in the case of a small child learning to catch a ball. At first the movement of the hands is clumsy, and the child does not respond quickly or skilfully enough to place her hands at the right spot to catch the ball. With practice, however, hand-eye co-ordination improves, and the child's response becomes faster and more accurate. Skilled players of ball sports can catch a ball without even thinking about it. When they see someone throwing a ball, they just intuitively reach for the moving object and catch it. There is little or no cognitive awareness involved, but visual and mechanical processes work together in harmony. It is the active engagement of the whole body in the task of catching the ball that helps the child to learn. Mistakes in technique are eliminated one by one, until the child becomes competent in the task of catching the ball.
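The trial-and-error process described above can be illustrated with a very small simulation. This is only a sketch under stated assumptions, not Thorndike's own model: the function name `trial_and_error`, the list of responses and the multipliers 1.2 and 0.9 are all hypothetical choices made for illustration. It shows how a response that leads to success gradually comes to dominate, while unsuccessful responses fade.

```python
import random


def trial_and_error(responses, successful, trials=200, seed=0):
    """Simulate trial-and-error learning with a simple weighting rule.

    The multipliers 1.2 and 0.9 are illustrative assumptions, not values
    taken from Thorndike's work.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    weights = {response: 1.0 for response in responses}
    for _ in range(trials):
        # Pick a response at random, favouring those with higher weights.
        choice = rng.choices(list(weights), weights=list(weights.values()))[0]
        if choice == successful:
            weights[choice] *= 1.2  # successful outcome: response is retained
        else:
            weights[choice] *= 0.9  # unsuccessful outcome: gradually rejected
    return weights


if __name__ == "__main__":
    # Hypothetical responses of an animal trying to open a gate.
    result = trial_and_error(["swipe", "peck", "jump", "press latch"], "press latch")
    for response, weight in result.items():
        print(response, round(weight, 4))
```

In this sketch nothing "tells" the learner which response is correct. The successful response becomes more likely only because of the outcomes that follow each attempt, which is the point of the theory.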
John Watson was an American scientist who applied these Behaviourist ideas to humans. He developed a method of rigorous observation and focused more on what a person does than on what he or she is thinking at the time. From pure observation, he saw evidence of stimulus/response behaviour in humans. This basic relationship between stimuli and behaviour lies at the heart of the theory of Behaviourism.
The power of early experiences should not be underestimated, since it can be difficult to erase the associations that children make when very young. These ideas are important far beyond child development studies. Adults also struggle with these unwanted responses: "the notion that learned associations can be dormant but reappear when a trigger is reintroduced has implications for those trying to overcome phobias, drug, alcohol and gambling addictions, or learning healthy eating habits" (Gray and MacBlain, 2015, p. 31). Many behaviourists conceive of memory as "neurological connections between behaviours and external stimuli" (Schunk, 2012, p. 23) and believe that the firing of these connections creates habitual responses over time. It is argued that because these responses are wired into the body through reflexes, and not just the result of cognitive activity, they can be particularly difficult to erase.
Think about pets, or domesticated animals and how they are trained to respond to stimuli so that they can live comfortably and safely alongside human beings. Observe the interaction between a dog and its owner.
What steps do people take to housetrain a dog, or teach it to obey simple commands? (If you have no experience of training a dog or similar animal, talk to someone who does have this experience, or watch some online videos on how to train animals.)
List the stimuli that the animal responds to. Observe how the animal behaves. Is the animal's behaviour instinctive and automatic, or is it learned? How can you tell?
Skinner's concept of 'reinforcement'
Skinner noted that the relationship between stimulus and response in humans was more complex than a straightforward reaction, as in the case of the automatic responses that Pavlov had observed in dogs. Skinner noticed in his experiments that if a behaviour is rewarded in some way, then it is likely to be repeated, while if the behaviour is punished in some way, it is less likely to be repeated. Humans, and some animals, can learn that there are consequences of their behaviour, and then modify that behaviour to avoid negative consequences and gain positive rewards.
Skinner conducted some famous experiments with rats and pigeons, which were placed in a cage with a lever that automatically delivered food when pressed. The rats quickly learned to press the lever in order to obtain the food. This link is called 'reinforcement'. In fact, Skinner theorised that there were three types of responses from the environment that can influence behaviour: neutral operants, which neither increase nor decrease the probability that a behaviour will be repeated; reinforcers, which increase the probability that a behaviour will be repeated; and punishers, which decrease the probability that a behaviour will be repeated.
Skinner distinguished between two different kinds of reinforcers: positive reinforcers and negative reinforcers. A positive reinforcer is something (like a treat, or a word of praise for example) that rewards the subject for demonstrating the desired behaviour, while a negative reinforcer is something unpleasant that is removed when the subject demonstrates the desired behaviour. An example of a negative reinforcer is a rule that says workers who arrive late in the morning must go and speak to their supervisor. A worker who is habitually late in the morning may change his behaviour, and therefore remove the necessity to have an awkward discussion with his supervisor. Having this onerous obligation removed from him helps to motivate him to get up earlier and arrive on time to work. The referral to the supervisor rule is a negative reinforcer. Remember, positive and negative reinforcers always increase the likelihood that a particular response will continue to be made in the future. They have the same ultimate effect, even though they use different means to achieve that effect.
Whitebread (2012) cites the example of gambling on a fruit machine to illustrate the principle of reinforcement. People gamble on such machines in the hope of obtaining a reward. The goal of the gambler is to gain back more money than the amount that she has paid into the game, and the enjoyment comes from that primary reward of more money, but also from the anticipation of winning that reward.
The chance of obtaining the reward is unpredictable, so that the gambler must continually play the machine in the hope of winning, even though most of the time she is likely to lose. When people win, they experience many positive feelings such as euphoria, a sense of achievement and a feeling of wellbeing that comes from having more money. When people are losing, they experience negative feelings of failure, but they patiently keep playing in the hope of having a positive, winning experience again.
In Skinner's terms, the reward system that comes with any gambling game or fruit machine could be called a 'variable schedule of reinforcement' and Whitebread (2012, p. 114) observes that this kind of unpredictable reinforcement "was a far more powerful motivator to maintain a learnt behaviour than a fixed or regular schedule". If the fruit machine always paid out according to a regular schedule, or the poker player knew in advance exactly how much he would win or lose at any table, there would be no fun in playing. It is the hope of a large reward, and the challenge of beating the odds to make a big win rather than breaking even or making a small or larger loss, that is so motivating for the gambler. The experience of winning is a stronger motivator to continue than the experience of losing.
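The difference between a fixed and a variable schedule can be made concrete with a short sketch. It does not model how motivating each schedule is; it only shows when rewards arrive. The function names, the seed and the rule of rewarding each response with probability 1 / `mean_every` are assumptions made for illustration, not part of Skinner's or Whitebread's account.

```python
import random


def fixed_schedule(n_responses, every):
    """Reward exactly every `every`-th response (a regular schedule)."""
    return [i % every == 0 for i in range(1, n_responses + 1)]


def variable_schedule(n_responses, mean_every, seed=0):
    """Reward each response with probability 1 / mean_every.

    Rewards arrive about as often as on the fixed schedule on average,
    but the learner cannot predict which response will pay out.
    This probabilistic rule is an illustrative assumption.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    return [rng.random() < 1 / mean_every for _ in range(n_responses)]


def gaps(schedule):
    """Number of responses between one reward and the next."""
    positions = [i for i, rewarded in enumerate(schedule, start=1) if rewarded]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


if __name__ == "__main__":
    print("fixed gaps:   ", gaps(fixed_schedule(50, 5)))
    print("variable gaps:", gaps(variable_schedule(50, 5)))
```

On the fixed schedule every gap is the same, so the next reward is fully predictable. On the variable schedule the gaps differ from one reward to the next, which is the unpredictability that the fruit machine example relies on.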
From all of his observations of behaviour in response to different kinds of reinforcers, Skinner concluded that rewards are a much more effective way of guiding behaviour than punishment. They also lead to a more pleasant social atmosphere in which everyone can express themselves. If this atmosphere of anticipating rewards is repeated, it will then, in itself, become a further positive reinforcement on those who are present. Punishment might successfully stop the worst excesses of bad behaviour, but it is not a very effective way of fostering good behaviour.
One complicating factor in this theory is the observation that responses to stimuli are not always consistent, and you cannot always tell what an individual will regard as desirable. Offering a tennis ball as a reward might work very well with your pet labrador but it is not likely to impress your teenage son. Skinner (1965, pp. 72-73) notes that: "the only way to tell whether or not a given event is reinforcing to a given organism under given conditions is to make a direct test." In other words, the effectiveness of any reward system is highly context-dependent, and it is not possible to make a generic prediction about what will act as a reinforcer for particular behaviours. Furthermore, a reward that motivates an individual on one occasion may not work so well on another occasion.
Skinner's theory, then, demonstrates that rewards are a fundamental tool for anyone who wants to modify behaviour. There are individual differences between people, however, and many variables in the environment and in human experiences, which mean that the type of reward has to be chosen to suit the context and the individual(s) involved in each case. The schedule of reinforcement is also important, and it seems that variability is more effective than a steady and predictable system of rewards.
Skinner's concept of 'operant conditioning'
The technical name that Skinner gave to the learned stimulus/response process is 'operant conditioning', though it is sometimes also known as 'instrumental conditioning'. These terms are used to distinguish learned behaviours from the involuntary behaviours of classical conditioning. In his later work, Skinner theorised that by varying the kinds of stimulus that a person is exposed to in a carefully planned way, it is possible to modify that person's behaviour. It is important to note that many aspects of the environment can be used to achieve this purpose, including financial reward systems for completing tasks well and fines for failure to complete tasks, verbal praise, approval and status for certain behaviours that are considered desirable by the group or organisation, and criticism or disapproval for other behaviours that are not considered acceptable. In a scientific experiment, it is quite easy to reduce the number of variables in the environment and observe any changes that follow on from reinforcing or punishing the subject's actions. It is more difficult to do this in natural human society, because there are so many aspects of the environment that can act as reinforcement or punishment.
Human beings are not robots programmed to behave in the same way all the time. They are very complex, and they may be subjected to conflicting stimuli, as well as a range of past experiences and present emotions, all of which can influence behaviour in many different ways.
Think about the role of a parent in terms of Skinner's theory of operant conditioning.
How do parents try to condition the behaviour of their children?
Do you think it is possible to condition children only using positive reinforcement? Or do you think punishment is necessary? Can you think of any examples in your own life where you have been subject to operant conditioning?
Look up the term 'operant conditioning' in the library, or on the internet, and write down a definition of this term and some examples of its use in a) prisons and b) psychiatric hospitals or any other medical/clinical context.
How does Skinner's work apply to Education?
Skinner's work has been very influential in education, and many of his concepts have been translated into teaching and learning strategies that are based on the core ideas of reward and punishment. Schunk (2012, p. 25) notes that "behavioral theories seem best suited to explain simpler forms of learning that involve associations, such as multiplication facts, foreign language word meanings, and state capital cities". Teachers use techniques such as flash cards, for example, to create associations between images and words. Or teachers might use games and songs to practise letters and sounds. The idea is that learners will automatically make these associations using neurological pathways, rather than by having to use cognitive processes such as comparing things, or making deductions. Mental arithmetic is a slow and difficult cognitive task for young children. Frequent repetition, and many associations with sounds and music can help children to respond instantly and intuitively to simple arithmetical tasks.
Mark is a six-year-old boy who does not like clowns. If he sees anyone in a clown costume, or wearing a mask that reminds him of a clown's face, he immediately starts to cry and runs away. He even reacts this way to images of clowns in books, or on television, including cartoon clowns. Mark's teacher asked his mother if there had been anything in Mark's past experience that might have contributed to this behaviour. Mark's mother explained that he had been taken to a pantomime at the age of four, in which a man dressed as a clown had frightened him, and ever since then he could not bear the sight of clowns.
This is an example of classical conditioning. A certain stimulus causes a reaction of fear, and this explains Mark's behaviour. The teacher responded to this situation by encouraging the whole class to wear masks for Halloween, and Mark chose a Spiderman costume, while his friends wore a very wide variety of different costumes, including alien, witch, skeleton and dragon masks as well as superhero costumes. This encouraged Mark to understand the role of masks in imaginative and creative play, and by wearing his own mask, and observing his classmates enjoying the experience of wearing their masks, he gradually learned to experience masks and costumes in a pleasant way, and so he eventually overcame his fear of clowns and masks.
Feedback is the most common reinforcer used in classrooms, and it serves to encourage some behaviours and discourage others. Teachers reward learners in many ways when they behave in a way that is appropriate and helpful, while also enforcing a small but clearly defined set of classroom rules which will lead to negative consequences if they are broken. Most behaviour management systems involve some degree of behaviourist thinking, as indeed the very term "behaviour management" suggests.
See if you can locate (or at least remember in some detail) a school behaviour management policy. Can you identify any behaviourist elements in this policy? Are there any positive and/or negative reinforcers? Is there a regular or variable rewards system? Does punishment play any part in it? How well does this policy integrate with the rest of the school curriculum?
From your own experience, do you think such policies are effective? Why, or why not? Using Skinner's Behaviourist theory, see if you can improve the policy you are thinking about, or perhaps you have ideas on how to design a more effective policy using Behaviourist principles?
Another of Skinner's ideas that has proved very popular in education is that of the 'token economy'. It is impractical to keep up a constant stream of primary reinforcers such as food or money to motivate learners in an educational context. Skinner suggested that it is possible to use tokens instead, such as stickers, stars, ticks, grades, or team points. When these specific symbolic rewards are linked with attention, smiles, and praise from the teacher, they act as generalised reinforcers. They can also be linked to tangible rewards such as treats or privileges, and learners can be encouraged to collect the tokens and perhaps exchange them later for these real rewards. These real rewards must be something that the learners find desirable, and there must be a balance between having to work hard to achieve them, and making them achievable. If learners do not value the reward, or if they think it is beyond their reach, the reward will not act as a reinforcer and may even encourage resentment and reduced commitment from the learner.
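The bookkeeping behind a token economy can be sketched in a few lines. This is a hypothetical illustration only: the class name `TokenEconomy`, the learner label, the reward name and the token cost are invented for the example and do not come from Skinner or from any particular school scheme.

```python
from dataclasses import dataclass, field


@dataclass
class TokenEconomy:
    """Minimal token economy: learners collect tokens and exchange them.

    `reward_costs` maps each tangible reward to the number of tokens it
    costs. All names and values used with this class are hypothetical.
    """
    reward_costs: dict
    balances: dict = field(default_factory=dict)

    def award(self, learner, tokens=1):
        """Give tokens (stickers, stars, points) for a desired behaviour."""
        self.balances[learner] = self.balances.get(learner, 0) + tokens

    def exchange(self, learner, reward):
        """Swap tokens for a reward; return False if the learner has too few."""
        cost = self.reward_costs[reward]
        if self.balances.get(learner, 0) < cost:
            return False
        self.balances[learner] -= cost
        return True


if __name__ == "__main__":
    economy = TokenEconomy({"extra break time": 3})
    economy.award("learner A")  # one token: not yet enough
    print(economy.exchange("learner A", "extra break time"))
    economy.award("learner A", 2)  # now three tokens in total
    print(economy.exchange("learner A", "extra break time"))
```

The cost attached to each reward is where the balance described above is set: too high and the reward seems out of reach, too low and it requires no real effort.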
Sam is a thirteen-year-old boy who has just started his second year at high school. Sam is articulate, confident and has many friends, but teachers are worried about his poor concentration and low academic achievement in core subjects. He is often disruptive in class, and does not always follow the teachers' instructions. This is distracting for other learners and not helping Sam to make progress in his schoolwork.
In the staffroom one day, several teachers are discussing Sam's behaviour. The Maths teacher says that he has to resort to sarcasm to counter Sam's frequent interruptions, and says that Sam is quickly made to be quiet if the teacher asks him to come to the front and take over the lesson, or if the teacher asks him a question that he knows Sam cannot answer. Sam's results in Maths are very weak. Another teacher says that she immediately tells him to stop talking and keeps a very close eye on Sam throughout the whole lesson. She is worried, however, that this is taking up too much of her time, and thinks it is unfair that the other children have to wait while she "sorts him out". In the Drama class, there is a competition with an end of term prize which involves a trip to London and although Sam is good at this subject, he seems not to be interested in taking part in the competition, or winning the prize. He plays the fool and makes people laugh, but he does not apply this skill to his Drama lessons. The teacher is frustrated with him and says that the more she praises him and encourages him, the more disruptive he becomes. Finally, the English teacher says that she simply ignores him when he tries to distract other learners, but looks at him and nods when he makes helpful comments or joins in with the rest of the class. She says praising him or telling him off are not very effective, because he likes being the 'bad boy'. He does respond to her silent approval, however, and so she says that this is the way to reinforce his compliance and attention.
This case illustrates the different ways in which the teachers are interpreting and applying Behaviourist principles. Individuals respond differently to the same reward or punishment, and so teachers must have a range of strategies that they can use in different situations, and with different learners.
What are the strengths and limitations of this theory?
The main strength of this theory is its contribution to our understanding of the way all animals, including humans, have both automatic and learned responses to the environment. The observations carried out by Skinner helped psychologists to develop programmes and treatments for all kinds of behaviour modification. Behaviourism in the sense it was understood by Skinner has also been extended into many different areas of education, and it has made him one of the most famous psychologists of all, on a par with Freud in terms of the influence he has had on other theorists. Another strength of this theory is the useful range of experimental methods that it has brought into use, including not just animal tests, but also exercises and tests designed to explore the responses of learners to different kinds of teaching-related stimuli.
Although the strengths mentioned above can be useful in education as strategies to encourage and motivate learners, behaviourism has some serious limitations as a theory of learning. According to Whitebread (2012, p. 115) "the fundamental problem with the behaviourist approach was that it characterised learning as an essentially passive process, consisting of forming simple associations between events, and being dependent upon external rewards or reinforcements". In other words, behaviourism may be an adequate model for describing the way rats and pigeons behave, but it cannot account for the much greater range of creative, playful and sometimes obstructive behaviours that primates and humans exhibit, when exposed to stimuli that are intended to provoke certain behaviours. Every adult who deals with young children knows that their response to the same stimulus can vary enormously from one occasion to the next. Sometimes, it is possible to work out why a child will follow instructions to complete a writing task on one day, but refuse to do so the next. She may be anticipating a school trip to the park on the first day, and eager to please her teacher so that the trip can begin sooner, for example, and she may be feeling tired on the next day. Perhaps she feels this task is too boring, or she wants to continue chatting with her neighbour, rather than focusing on her writing. Sometimes, however, there is no obvious reason why a learner behaves in a particular way. Any number of factors can influence how she feels, or what she is thinking, and these factors influence how she behaves. This means that Skinner's method of close observation can only ever be a partial explanation of human behaviour. There are other dimensions, factors and processes, including especially emotional and cognitive processes, that this theory cannot explain.
Another weakness of behaviourism in education lies in the way it emphasises the accomplishment of tasks in an automatic way, without any requirement for deeper understanding. This can be problematic in subjects like Mathematics when learners move on from elementary levels and start to tackle more advanced levels. Simple concepts often build into more and more complex ideas. Learners who can correctly carry out addition or division, for example, but who do not really understand what this means, can quickly become lost when these activities become part of more complex calculations.
Based on your observation of children and young people, how convincing do you find Skinner's version of Behaviourism, with its emphasis on observed behaviour in response to stimuli from the environment?
What are the similarities and differences between this theory, and Constructivism?
Can you think of any moral or ethical issues around the use of operant conditioning in an educational context?
How can this theory be linked to practice?
Behaviourist theory can be applied to many areas of teaching practice. Most teachers find that using patterns, drills and repetition can help to establish a regular routine and consolidate basic skills. Even adults can benefit from regular question and answer routines with instant feedback, or from quizzes and tests which check whether some learning content has been mastered. Some learners are motivated by good marks in such exercises, and competition between learners can also encourage hard work and good behaviour. Other learners are fearful of tests, however, and instant feedback can be negative for them, because they have had past experiences of failure.
Educators must use behaviourist principles with care because as we have seen above, the same stimulus can have very different effects on different learners. It has been noted, for example, that "those who benefit most from approaches based on behaviourist notions are those who are less well motivated, have high anxiety or a history of failure … Bright children can find programmed instruction or simplistic drill and practice situations unsatisfying and even boring" (Pritchard, 2014, p. 12).
The two most obvious ways in which behaviourism is linked to practice in schools today are in the field of behaviour management, which was mentioned above, and in computer and web-assisted learning programmes (Dede, 2008). The ability to programme all kinds of feedback and rewards into a digital learning tool can make it very useful as an aid to modifying behaviour.
A class of ten-year-old children is learning about cycling and road safety. They have been reading the highway code, and learning what the different road signs mean, and how this code applies to cyclists. The teacher is satisfied that the learners have a good general awareness of the rules, but she knows that they do not always follow these rules when they are cycling, especially at weekends and on holidays, when they may venture beyond the daily route that they take to school, and when there is no adult supervision. She knows that peer pressure and inattention while cycling with friends can be a distraction, and the children may not always apply the knowledge that they have.
The class are now using some software which tests their cycling skills. When an error is made, the user is ejected from the game, but when the user cycles safely through the hazards on the screen, they receive virtual badges and trophies. After using the software, the learners write up what they have learned, and how they will modify their cycling in the future to maximise safety for themselves and others.
Observe a teaching context (either from memory or preferably by observing a classroom in real life or online).
Can you see any learner behaviours that are not conducive to effective learning?
Can you see any teacher interventions that are aimed at modifying these unhelpful learner behaviours? How would you classify these interventions using Skinner's theory?
Is there anything in the environment that you could alter to help modify such unhelpful learner behaviours?
Think about the way you give feedback to learners. Do you remember to offer praise and encouragement? Do you know what kind of rewards are effective or not very effective for your learners? What do you think about obvious punishments such as time out or sending a learner to see the head teacher?
This chapter has shown that Behaviourism was notable for its insistence on using scientific methods of observation, and it dominated educational theory in the first half of the twentieth century. Skinner is one of the most famous psychologists of the twentieth century, and he developed behaviourism as a theory for modifying human behaviour. This theory stresses the role of the environment, and the associations that are made between the individual and events in that environment. It is useful for teaching classroom routines and basic material, but it has many limitations, since it does not foster deep understanding, and reinforcers do not always have the same effects. Behaviourist approaches have experienced something of a revival in recent years due to the popularity of computer-based learning programmes which use various kinds of feedback and rewards.
Having finished reading this chapter, and thinking about the issues raised in the examples and reflection sections, you should:
- understand what 'behaviourism' means in the context of educational research
- understand and be able to use the technical terminology associated with this theory, such as conditioning and the different kinds of reinforcement, and use these terms to explain how behaviourism applies to education
- relate this theory to educational practice at different levels from infancy to adulthood.
Now you should complete the 'hands-on scenario' at the end of this chapter. Use what you have learned in this chapter to complete the short task described there.
Dede, C. (2008) Theoretical perspectives influencing the use of information technology in teaching and learning. In J. Voogt and G. Knezek (Eds.), International Handbook of Information Technology in Primary and Secondary Education: Part One. Enschede: Springer, pp. 43-62.
Gray, C. and MacBlain, S. (2015) Learning Theories in Childhood. Second edition. London: Sage.
Pritchard, A. (2014) Ways of Learning: Learning Theories and Learning Styles in the Classroom. Third Edition. Abingdon: Routledge.
Schunk, D. H. (2012) Learning Theories: An Educational Perspective. Sixth edition. Boston, MA: Pearson.
Skinner, B. F. (1965) Science and Human Behavior. New York: The Free Press. Originally published by Macmillan.
Whitebread, D. (2012) Developmental Psychology and Early Childhood Education. London: Sage.