The goal of this paper is to review the principal approaches to Music Education with a focus on Artificial Intelligence (AI). Music is a domain that requires creativity, problem-seeking and problem-solving from both learner and teacher, and is therefore a challenging domain for Artificial Intelligence. It is argued that remedial intelligent tutoring systems are inadequate for teaching a subject that requires open-ended thinking. Traditional classroom methods are sometimes favoured because tutors can focus on individual differences and enhance creativity and motivation.

However, it can also be argued that AI is a mechanism which enables those without traditional musical skills to ‘create’ music. Almost the only goal that applies to music composition in general is ‘compose something interesting’ (Levitt, 1985). This paper will review different approaches to AI in Music Education. The approaches considered are: Intelligent Tutoring Systems in Music; AI-based Music Tools; and highly interactive interfaces that employ AI theories.

1. Introduction

This paper will review some of the approaches to using Artificial Intelligence in Music Education. This field is highly interdisciplinary and involves contributions from education, music, artificial intelligence (AI), the psychology of music, cognitive psychology, human-computer interaction, philosophy, computer science and many other fields. AI in education is itself a very broad field, dating from around 1970 (Carbonell, 1970), and has its own theories, methodologies and technologies. For brevity, we will abbreviate Artificial Intelligence in Education to AI-ED, following a standard convention.


The scope of AI in Education (AI-ED) is not sharply defined, so it will be useful to consider some definitions. A common definition is: any application of AI techniques or methodologies to educational systems. Other definitions focus more narrowly, for example: any computer-based learning system which has some degree of autonomous decision-making with respect to some aspect of its interaction with its users (Holland, 1995). This definition implies a requirement that AI techniques be used to reason at the point of interaction with the user.

This reasoning might concern the best teaching approach, the subject being taught, or any misconceptions or gaps in the student’s knowledge. However, AI-ED in a wider context is sometimes defined as: ‘the use of AI methodologies and AI ways of thinking applied to discovering insights and methods for use in education, whether AI programs are involved at the point of delivery or not’ (Naughton, 1986). In practice, these contrasting approaches form a continuum.

Music: An open-ended domain

A useful distinction in AI-ED is between formalised domains and more open-ended domains (‘domain’ here means the subject area to be taught). In domains such as mathematics and Newtonian dynamics there are clear targets, correct answers and a reasonably clear and concise structure to follow for success. In open-ended domains such as music composition, by contrast, there are in general no clear goals, no set criteria to follow and no correct answers.

As mentioned earlier, almost the only general goal is ‘compose something interesting’ (Levitt, 1985). Rittel and Webber (1984) describe problems of this kind as ‘wicked problems’. In such domains there cannot be a definitive formulation of the problem or of the answer. Wicked domains such as music composition require learners not just to solve problems but also to seek them (Cook, 1994). The term ‘problem seeking’ is used in a number of disciplines, such as animal behaviour (Menzel, 1991). Cook (1994) imported the term into AI in Education, with particular reference to the sense used by the philosopher Lipman (1991). In this sense, Cook (1994) characterises ‘problem seeking’ as follows:

  • Problems are treated as ill-defined and open-ended
  • There is a continual intertwining of problem specification and solution
  • Criteria for completion are very limited
  • Context greatly affects the interpretation of the problem
  • Problems are always open to re-interpretation and re-conceptualisation

In the expressive performing arts and in music composition there is no pre-given goal or problem to be solved. The learner must find or create goals and problems, which may then need to be revised, modified or rejected according to his or her taste.

2. Computer-Aided Instruction

As background to AI approaches in education, it is worth briefly considering music education programs that make little or no use of AI. Historically, computers used in music, as in most other subjects, were associated with the behaviourist theory of learning. These systems (branching teaching programs) stepped through the following algorithm (O’Shea and Holland, 1983):

  • Present a ‘frame’ to the student, i.e. present the student with pre-stored material (textual or audio-visual)
  • Solicit a response from the student
  • Compare the response with pre-stored alternative responses
  • Give any pre-stored comment associated with the response
  • Look up the next frame to present on the basis of the response
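The steps above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not a reconstruction of GUIDO or of any historical system: the `Frame` structure, the `run_program` function and the two-frame interval exercise are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One pre-stored frame: material plus anticipated responses."""
    material: str
    # Maps each anticipated response to (comment, name of next frame or None).
    branches: dict = field(default_factory=dict)
    # Comment given when the response matches no pre-stored alternative.
    fallback: str = "Please try again."


def run_program(frames, start, responses):
    """Step through the frames, returning a list of (material, comment)."""
    transcript = []
    current = start
    answers = iter(responses)
    while current is not None:
        frame = frames[current]                  # present the frame
        response = next(answers, None)           # solicit a response
        if response is None:                     # the student has stopped
            break
        if response in frame.branches:           # compare with alternatives
            comment, current = frame.branches[response]
        else:
            comment = frame.fallback             # stay on the same frame
        transcript.append((frame.material, comment))
    return transcript


# Hypothetical two-frame interval exercise.
FRAMES = {
    "q1": Frame("Which interval is C up to G?",
                {"fifth": ("Correct.", "q2"),
                 "fourth": ("No, C up to F is a fourth.", "q1")}),
    "q2": Frame("Which interval is C up to E?",
                {"third": ("Correct.", None)}),
}
```

Every comment and every branch is fixed in advance by the author of the frames, which is exactly the limitation discussed in the text: a response the author did not anticipate can only receive the generic fallback.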

An example of this kind of system was the GUIDO ear-training system (Hofstetter, 1981). Branching teaching programs tend to respond to the user in a manner that has been more or less explicitly pre-planned by the author. This tends to limit the approach to a relatively simple treatment of the subject.

  • Multimedia and Hypermedia

Multimedia and hypermedia have had a great impact on music education and have transformed music education software, giving it a different emphasis from the earlier behaviourist programs. Recent educational music programs such as Seventh Heaven, Ear Trainer, Interval and Listen aim to provide practice in recognising or reproducing intervals, chords or melodies. MacGAMUT is a classroom simulation program that dictates exercises and provides a detailed marking scheme.

Other programs such as MiBAC Music Lessons, Perceive and Practica Musica offer comprehensive ear training covering scales, durations, modes and tuning. See Yavlow (1982) for information on the aforementioned programs. Because the domain is relatively clear-cut and unproblematic, ear training and music theory are popular subjects for non-AI music education programs. There are also many useful musical computer tools applicable to education, such as music editors, sequencers, computer-aided composition tools, multimedia reference tools on CD-ROM Masterworks and much more.

3. Intelligent Tutoring Systems for Music: A ‘Classical’ Approach

The history of AI in education can be divided into two periods, the ‘classical’ period (1970 – 1987) and the ‘modern’ period (1987 to the present day). In the classical period, the three-component ‘traditional’ model of an Intelligent Tutoring System (ITS) was the most common and influential idea. This model was sometimes extended to a four-component model. After 1987, attention shifted to finding alternatives to the traditional model. However, progress was limited by the research available at the time, and the traditional model remains influential and is still used today.

Each of the three components of the traditional model can be considered a separate ‘expert system’. The traditional ITS model (Sleeman and Brown, 1982) consists of three AI components, each an expert in its own area. The first component, the domain model, is an expert in the subject being taught. In the case of a vocal tutor, for example, the domain expert would itself be able to perform vocal tasks. This requirement is essential if the system is to be able to answer unforeseen questions about the task in hand.

The second component is the student model. Its purpose is to build a model of the student’s knowledge, capabilities and attitudes, which allows the system to vary its approach according to the individual student. In its simplest form, the student model can be viewed as a checklist of skills. This is sometimes modelled as an overlay, i.e. a tick list of the elements held in the domain model. More sophisticated versions treat the student model as a deliberately distorted domain model, or faulty ‘expert system’, whose errors are intended to mirror the student’s misconceptions.
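The overlay idea can be illustrated with a small sketch: the student model is simply a set of ticked-off elements laid over the set of elements in the domain. The class name, method names and skill labels below are hypothetical and are not taken from any particular tutoring system.

```python
class OverlayStudentModel:
    """A tick list of domain elements the student is believed to have mastered."""

    def __init__(self, domain_elements):
        self.domain = set(domain_elements)
        self.mastered = set()

    def tick(self, element):
        """Record that the student has demonstrated a domain element."""
        if element not in self.domain:
            raise ValueError(f"{element!r} is not in the domain model")
        self.mastered.add(element)

    def gaps(self):
        """Domain elements not yet ticked, in alphabetical order."""
        return sorted(self.domain - self.mastered)

    def coverage(self):
        """Fraction of the domain that has been ticked."""
        return len(self.mastered) / len(self.domain) if self.domain else 0.0
```

An overlay of this kind can only record what is missing. It has no way to represent a wrong belief, which is why the more sophisticated models mentioned above introduce deliberate distortions of the domain model.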

A fair diagnosis of a student’s knowledge, skills, capabilities and beliefs is often a hard problem in AI. One partial way around the diagnosis problem would be to ask the student about their capabilities, beliefs, previous experience and so on. A more stringent approach is to set the student tasks specifically designed to analyse their skills. The results can then be used to construct the student model.

The third component of the traditional ITS model is the teaching model. Typically, this may embody teaching strategies ranging from Socratic tutoring, coaching and teaching by analogy (Elsom-Cook, 1990) to simply allowing the student to explore the available materials unhindered, with or without the guidance of a human teacher. The fourth component, where one is used, is an interactive user interface for the tasks mentioned. Note that not all Intelligent Tutoring Systems include all three core components. It is common to focus on one or perhaps two components and to omit, or greatly simplify, the others.

In particular, most ITSs in music focus on the expert or student model. Irrespective of the emphasis, ITS models require explicit, formalisable knowledge of the task. However, many skills in music correspond to wicked problems and are resistant to explicit formalisation. This narrows the range of areas in music education to which ITS models can be applied. One such area is harmonisation, one of the few musical topics for which relatively detailed rules of thumb can be found in textbooks. But even here, the traditional ITS model may not be effective.

Two systems from the classical ITS period, Vivace and MacVoice, are good examples of the potential and limitations of the ITS approach in music.

3.1 Vivace: An expert system

Vivace is a four-part chorale writing system created by Thomas (1985). Vivace is not an ITS in itself, but it has formed the basis of one. It takes an eighteenth-century chorale melody and writes a bass line and two inner voices that fit the melody. It employs rules and guidelines for harmonisation taken from textbooks, which abstract them from the practice of past composers. These rules can be categorised into four types: firm requirements, preferences, firm prohibitions and less firm prohibitions.

Three specific problems can be identified for any human or machine trying to harmonise on the basis of the rules. The first problem, common in beginners’ classes, is that it is possible to satisfy all the formal rules and still produce a composition which is correct but aesthetically unsatisfactory. The second problem is that most of the guidelines are prohibitions rather than positive suggestions. Milton Babbitt observes that ‘the rules…are not intended to tell you what to do, but what not to do’ (Pierce, 1983).

In other words, if we view harmonisation as a typical AI ‘generate and test’ problem, the rules offer some help in the testing phase, but little help in well-focused generation. The third problem is that it is usually impossible to satisfy all of the preferences at once, so some preference rules may have to be broken. Traditional descriptions do not assign a clear order of importance to the preference rules; in fact, it is not at all clear that any fixed order would make sense.
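A minimal sketch of ‘generate and test’ with prohibitions and preferences is given below. It is not Vivace’s rule base: the two-voice example, the parallel-fifths prohibition and both preference rules are toy stand-ins, pitches are assumed to be MIDI note numbers, and the preference weights are arbitrary assumptions, since, as noted above, traditional descriptions assign no fixed order of importance.

```python
def generate_and_test(candidates, prohibitions, preferences):
    """Filter candidates by prohibitions, then rank survivors by preferences.

    prohibitions: predicates returning True when a candidate is forbidden.
    preferences:  (weight, predicate) pairs; True means the preference is met.
    """
    survivors = [c for c in candidates if not any(bad(c) for bad in prohibitions)]

    def score(candidate):
        return sum(weight for weight, met in preferences if met(candidate))

    return sorted(survivors, key=score, reverse=True)


# Toy setting: the soprano moves from G4 (67) to A4 (69) over a bass on C3 (48).
PREV_BASS, PREV_SOP, NEXT_SOP = 48, 67, 69


def parallel_fifths(bass):
    """Forbidden: a fifth between the voices both before and after the move."""
    return (PREV_SOP - PREV_BASS) % 12 == 7 and (NEXT_SOP - bass) % 12 == 7


def contrary_motion(bass):
    """Preferred: the bass falls while the soprano rises."""
    return bass < PREV_BASS


def small_leap(bass):
    """Preferred: the bass moves by no more than a perfect fourth."""
    return abs(bass - PREV_BASS) <= 5


ranked = generate_and_test(
    candidates=[50, 53, 57, 45],
    prohibitions=[parallel_fifths],
    preferences=[(2, contrary_motion), (1, small_leap)],
)
```

The prohibition removes one candidate but says nothing about which of the others to prefer; all of the ranking comes from the assumed weights. Changing the weights changes the order, which illustrates why a fixed order of preferences is hard to justify.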

However, it is possible to write a rule-based system that implements textbook rules. In principle, a traditional ITS can use these rules to criticise students’ work and to serve as a model of the expertise they are supposed to acquire. Given the limitations just described, how useful or effective would such a tutor be? Thomas used Vivace to illuminate the limitations of the theory, and was able to establish that textbook rules are an inadequate characterisation of the task when it is performed at expert level.

Thomas discovered that, using only conventional rules about range and movement, the tenor voice would almost invariably move to the top of its range and stay there. Thomas suggested that there must be a set of missing rules and meta-rules to fill these gaps, and used Vivace as an experimental tool to identify them. In each experiment Thomas had to use her intuition to decide whether the results were musically viable or not. She found that many of the traditional rules were overstated or needed redefining. She also uncovered new guidelines and came to understand the task at a more strategic level. With the assistance of her human pupils, Thomas formulated a number of heuristics for ‘what to do’ rather than ‘what not to do’.

Experiments with Vivace led Thomas to realise the need to make human pupils aware of high-level phrase structure before detailed chord writing. Having gained new knowledge about the task by ‘teaching’ her expert system, Thomas was able to write a new teaching textbook based on her findings. Part of this knowledge was used in a simple commercial ITS, MacVoice, which criticises students’ voice-leading.

3.2 MacVoice

MacVoice criticises voice-leading aspects of four-part harmonisation. It is a Macintosh program based on the expert system Vivace, and includes a music editor as part of its interface. Notes can be entered one at a time, a chord at a time, a voice at a time, or in any disconnected fashion. As soon as a note is placed on the stave, MacVoice displays its guess as to the function of the corresponding chord in the form of an annotated Roman numeral.

There are two important limitations of this system: firstly, all chords must form homophonic blocks (all notes must be of the same duration); and secondly, the piece must be in a single key. There is one other menu function, called ‘voice-leading’. This function checks the harmonisation against a set of basic rules for voice-leading and indicates any errors. MacVoice is quite flexible to use.

MacVoice has been used in practice at Carnegie Mellon University. It does not give positive strategic advice; it only points out errors. Nor does it assess the effectiveness or other merits of the chord sequences involved. Further research on this topic might include a visual display of what the voice-leading constraints are, or of the possible preferred outcomes.

3.3 Lasso

Lasso (Newcomb, 1985) is an intelligent tutoring system for 16th-century counterpoint, as formalised by Fux (1725), and is limited to two voices. Newcomb’s approach aims to provide simple and consistent guidelines that help students know what is required to pass exams. The codification of the necessary knowledge goes beyond textbook rules and guidance. Like Thomas, Newcomb was aware of this, but approached the problem probabilistically, analysing scores to find out such facts as ‘the allowable ratio of skip to non-skip melodic intervals’ and ‘how many eighth note passages can be expected to be found in a piece of a given length’ (Newcomb, 1985).
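A statistic of the first kind quoted above could be computed along the following lines. This is an illustrative sketch only, not Lasso’s code: the function name is hypothetical, melodies are assumed to be lists of MIDI note numbers, and a ‘skip’ is assumed to mean any melodic interval larger than a whole tone (more than two semitones), with everything else counted as a non-skip.

```python
def skip_ratio(pitches):
    """Ratio of skips to non-skips among successive melodic intervals.

    Returns None when the melody contains no non-skip intervals,
    since the ratio is then undefined.
    """
    intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    skips = sum(1 for size in intervals if size > 2)
    non_skips = len(intervals) - skips
    if non_skips == 0:
        return None
    return skips / non_skips


# A hypothetical eight-note melody with one skip (F up to A).
melody = [62, 64, 65, 69, 67, 65, 64, 62]
ratio = skip_ratio(melody)
```

Gathering such a ratio over a corpus of scores would give a range of typical values, against which a student’s melody could then be compared.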

The knowledge used for criticising students’ work is coded as branching procedural code. There are also fixed, canned error messages, help messages and congratulatory messages, which are intended to assist students and offer some motivation. Lasso is a very impressive system. It has a high-quality music editor, tackles a complex musical paradigm and has been used in real teaching contexts. However, there are some intrinsic problems. The rules are at a very low level, and there are a large number of them. There is a system rule which prevents more than one hundred comments being made about any one attempt to complete an exercise. Typical remarks made by Lasso include:

“A melodic interval of a third is followed by stepwise motion in the same direction.”

“Accented quarter passing note? The dissonant quarter note is not preceded by a descending step.” (Newcomb, 1985).

The quantity of help text required to put the myriad low-level criticisms in context could easily overwhelm students. Students complained that it was so difficult to meet Lasso’s demands that they were forced to revise the same task repeatedly. One solution to this problem would be to incorporate general principles that govern the low-level rules. Codifying such principles would reduce the number of comments required and allow observations to be generalised.

3.4 Concluding remarks on Intelligent Tutoring Systems: A ‘Classical’ Approach

The traditional Intelligent Tutoring System approach assumes an objectivist approach to knowledge. Such systems depend on the assumption that there is a well-defined body of knowledge to be taught, which can be expressed as precise concepts and relationships. This works with four-part harmonisation and 16th-century counterpoint. However, in a more open-ended context, an objectivist approach can be very limited. Even in artificially limited domains, teaching rules drawn from practical experience tends not to be a very good approach.

Verbal definitions are of limited use in teaching a musical concept, and fall well short of the knowledge an experienced musician draws on to grasp what those definitions really mean. It is all very well to define a chord such as a dominant seventh in terms of its interval pattern and to provide general rules, but to an experienced musician the ‘meaning’ of a dominant seventh depends much more on the context. Being able to manipulate musical structures intelligently is far more important than merely being able to understand and obey a set of rules. Rather than just a set of explanations, a student needs a structured set of experiences that make them more aware of musical structures, better able to manipulate them intelligently and, most importantly, more capable of formulating sensible musical goals to pursue.

4. Open-ended Microworlds: The Logo Philosophy

A contrasting idea from the classical period of AI in education, just as influential as the notion of an ITS, is the Logo approach (Papert, 1980). The Logo philosophy has particular attractions for open-ended domains such as music. It centres on the idea of an educational microworld: an open-ended environment for learning, with no specific built-in lessons. The Logo approach and its associated microworlds need not involve much, or indeed any, AI at the point of delivery.

However, their designs tend to be strongly influenced by AI methodologies and tools. A simplified version of an AI programming language is used to build microworlds, and students are encouraged to write or modify programs as a means of exploring the domain. Logo doubles as the name of a programming language, based on Lisp, used for just this purpose. There are three distinct elements in the Logo approach: Logo (and similar languages) as a programming tool; Logo as a vehicle for expressing various AI theories for educational purposes; and Logo as an educational philosophy.

Firstly, we will briefly explore Logo as an educational philosophy. In early work, Logo was mainly used for learning mathematics, poetry and music. One version encouraged children to produce new melodies by rearranging and modifying melodic phrases. The learning philosophy aimed to give children a better understanding of a concept by having them envision or pre-hear a result, work out how to achieve it, and discover the reason behind any unexpected result. This philosophy was derived from a number of sources, including the psychologist Piaget’s notions of how children construct their own knowledge through play.

The Logo approach in relation to microworlds can be somewhat complex. Students are sometimes provided with a simplified version of an AI model in some problem domain. For example, in the case of music composition, fragments of illustrative material can be generated using generative grammars as models of particular composition techniques. The supplied programs can be used by students to explore, criticise and refine their own (or someone else’s) model of the process.

Notice that none of the three components of the ITS model is required in the Logo approach. In practice, students need some form of guidance from teachers in order to realise the full potential of Logo systems. Without guidance from a teacher, students risk learning only a technique, without appreciating the wider possibilities or understanding what it means to be an experienced musician. The educational philosophy associated with Logo has been applied to a number of systems in music at different levels and in different ways, as described below.

4.1 Music Logo System: Bamberger’s System

Jeanne Bamberger’s Music Logo System (1986, 1991) can be used with sound cards or synthesisers. It uses programming elements called functions to structure and control musical sounds. Music Logo’s central data structure is a list of integers representing a sequence of pitches or durations. Pitch lists and duration lists can be stored and manipulated separately before being played by a synthesiser. For example, to play A above middle C for 30 beats, then middle C for 20 beats, then G for 20 beats, the following expression might be used:

Play [a c g] [30 20 20]

Programming constructs such as repeat are easily understood by beginners and can be put to musical work. Using arithmetic and list manipulation functions, note and duration lists can be manipulated separately. Features such as recursion and random number generators can be used to build complex musical structures. Common musical operations are provided as list manipulation functions.

For example, one function takes a duration list and a pitch list and generates a number of repetitions of the phrase, shifted at each repetition by a constant pitch increment, creating a simple sequence (in the musical sense of the term). Bamberger’s Music Logo System also provides other musical functions, such as retrograde (reverses a pitch list), invert (maps a pitch list to the complementary values within an octave), and fill (makes a list of all intermediate pitches between two specified pitches).
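The operations described above can be sketched as ordinary list functions. The following is a Python illustration, not Music Logo syntax, and rests on several assumptions: pitches are integers counted in semitones from a reference C (so the phrase a, c, g becomes 9, 0, 7), `invert` reads ‘complementary values within an octave’ as mapping each value p to (12 - p) mod 12, `fill` includes both end points, and the name `sequence` is hypothetical.

```python
def retrograde(pitches):
    """Return the pitch list in reverse order."""
    return list(reversed(pitches))


def invert(pitches):
    """Map each pitch to its complement within the octave (assumed reading)."""
    return [(12 - p) % 12 for p in pitches]


def fill(start, end):
    """List every pitch from start to end inclusive, rising or falling."""
    step = 1 if end >= start else -1
    return list(range(start, end + step, step))


def sequence(pitches, durations, repetitions, increment):
    """Repeat a phrase, shifting it by a constant pitch increment each time."""
    out_pitches, out_durations = [], []
    for i in range(repetitions):
        out_pitches.extend(p + i * increment for p in pitches)
        out_durations.extend(durations)
    return out_pitches, out_durations


# The phrase from the Play example, with a = 9, c = 0, g = 7.
phrase_pitches = [9, 0, 7]
phrase_durations = [30, 20, 20]
backwards = retrograde(phrase_pitches)
rising = sequence(phrase_pitches, phrase_durations, repetitions=3, increment=2)
```

Because pitches and durations are held in separate lists, a student can reverse or transpose the pitches while leaving the rhythm untouched, or the other way round, and then listen to the effect.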

Bamberger suggests many simple exercises in which students either manipulate lists and procedures and try to predict the musical outcome or, conversely, iteratively manipulate list representations to try to reproduce something previously imagined. These techniques in many ways reflect educational techniques suggested by Laurillard (1993) for general use in higher education. Bamberger suggests two particular classes of phenomena that emphasise the importance of ‘shocks’ as learning experiences.

Firstly, the perception of phrase boundaries in melodic and rhythmic fragments can depend on small manipulations of the duration list. Secondly, there is an unpredictable relationship between the degree of change in the data structure and the degree of perceived change produced. In principle, the Logo system allows students to focus on manipulating any kind of musical structuring technique. However, in practice the focus tends to be on simple, small-scale structures such as motives and their transformations.

4.2 A series of microworlds: Loco

Peter Desain and Henkjan Honing developed a series of microworlds and tools applying the Logo philosophy. The first was LOCO (Desain and Honing, 1986, 1992). The second was POCO (Honing, 1990), followed by Expresso (Honing, 1992) and LOCO-Sonnet (Desain and Honing, 1996). All of these microworlds carefully reflect the thinking behind AI methodologies and how they can be applied to music education.

LOCO is similar to Bamberger’s Music Logo in that it also focuses on music composition. The central component is a set of tools for representing sequences of musical events, which can be interfaced with any output device or instrument. It is also flexible enough to take input from practically any composition system.

The microworlds provided each offer tools for useful style-independent composition techniques, particularly stochastic processes and context-free music grammars. Essentially only two kinds of musical object are provided: ‘rests’ and ‘notes’. LOCO’s time structuring mechanism is simple and elegant. There are two relations, Parallel and Sequential, used to combine arbitrary musical objects. Sequential is a function which causes the musical objects in its argument list to be played one after another, whereas Parallel is a function which causes its arguments to be played simultaneously.
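The two relations can be sketched as nested data structures from which onset times are computed. This is a Python illustration and not LOCO’s own interface: the names `Note`, `Rest`, `Sequential` and `Parallel` follow the text, but the `duration` and `events` methods, the use of MIDI note numbers and the (onset, pitch, duration) event format are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Note:
    pitch: int
    dur: float

    def duration(self):
        return self.dur

    def events(self, start=0.0):
        return [(start, self.pitch, self.dur)]


@dataclass
class Rest:
    dur: float

    def duration(self):
        return self.dur

    def events(self, start=0.0):
        return []          # a rest takes up time but sounds nothing


class Sequential:
    """Plays its parts one after another."""

    def __init__(self, *parts):
        self.parts = parts

    def duration(self):
        return sum(part.duration() for part in self.parts)

    def events(self, start=0.0):
        out = []
        for part in self.parts:
            out.extend(part.events(start))
            start += part.duration()
        return out


class Parallel:
    """Plays all of its parts from the same starting time."""

    def __init__(self, *parts):
        self.parts = parts

    def duration(self):
        return max((part.duration() for part in self.parts), default=0.0)

    def events(self, start=0.0):
        out = []
        for part in self.parts:
            out.extend(part.events(start))
        return sorted(out)


# A Sequential nested inside a Parallel nested inside a Sequential.
piece = Sequential(
    Note(60, 1.0),
    Parallel(Note(64, 2.0), Sequential(Rest(1.0), Note(67, 1.0))),
    Note(72, 1.0),
)
```

Since `piece` is ordinary data until `events` is called, it can be built, inspected and transformed by a program before anything is played.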

It is quite simple to nest a parallel structure within a sequential structure, and vice versa. Sequential and Parallel objects are treated as data which can be computed and manipulated before they are played. As a result, arbitrary time structuring can be applied with great flexibility. As mentioned earlier, LOCO provides a basis for composing using stochastic processes and context-free grammars. Various effects can be produced, depending on how variables are defined, including:

  • A random choice among its possible values
  • A choice weighted by a probability distribution
  • A random choice in which previous values cannot recur until all other values have been chosen
  • Selection of a value in a fixed circular order

The above are easily put together using composition (in the mathematical sense) of functions. For example, the value of an increment could be specified as a stochastic variable. This can produce a variable that performs a Brownian random walk. Brownian variables can be used, for example, as arguments in commands to instruments within a time-structured framework. These techniques can be used to construct concise, easy-to-read programs for transition nets and other stochastic processes. In each case, the operation of a program can be modified using the general programming language. See Ames (1989) for more information on the compositional uses of Markov chains.
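The four kinds of variable listed above, and the Brownian walk built by composing them, can be sketched with Python generators. The function names are hypothetical and this is not LOCO code. The third behaviour is implemented here as repeated shuffled passes through the values, which is one possible reading of the description.

```python
import itertools
import random


def random_choice(values, rng):
    """A random choice among the possible values."""
    while True:
        yield rng.choice(values)


def weighted_choice(values, weights, rng):
    """A choice weighted by a probability distribution."""
    while True:
        yield rng.choices(values, weights=weights, k=1)[0]


def without_repetition(values, rng):
    """No value recurs until all the others have been chosen."""
    while True:
        bag = list(values)
        rng.shuffle(bag)
        yield from bag


def circular(values):
    """Selection of a value in a fixed circular order."""
    yield from itertools.cycle(values)


def brownian(start, increments):
    """Random walk: each value is the previous one plus the next increment."""
    value = start
    for step in increments:
        yield value
        value += step


# Composition: the increment of the walk is itself a stochastic variable.
rng = random.Random(1)   # seeded only so that a run can be repeated
walk = brownian(60, random_choice([-2, -1, 1, 2], rng))
first_notes = list(itertools.islice(walk, 16))
```

Swapping `random_choice` for `weighted_choice` or `circular` in the last lines changes the character of the walk without touching `brownian` itself, which is the point of building these variables by composition.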

The primary design goals of LOCO include ease of use by everyone from non-programmers to experts. A more recent version, LOCO-Sonnet, mirrors LOCO but also includes a graphical front end. This is drawn from Sonnet (Jameson, 1992), a domain-independent data flow language originally designed for adding sound to user interfaces. It is designed for use by novices and experts alike. LOCO has been used in workshops for novices and professionals, and courseware is available.

4.3 Concluding comments on the Logo approach

The Logo approach is associated with constructivism. As a view of knowledge and learning, constructivism suggests that even in cases where ‘objectively true’ knowledge exists, simply presenting it to a student limits the effectiveness of their learning. It is based on the assumption that learning arises from learners interacting with the world, which leads them to construct their own knowledge.

The resulting ‘knowledge’ will vary between individuals, giving rise to unique ideas and outcomes. This fits very well with open-ended domains such as music, where the basis of knowledge is learning how to create your ‘own’ masterpiece.

Unlike classical Intelligent Tutoring Systems, Logo requires intensive support from a human teacher. This can be viewed as both a weakness and a strength of the approach. Intelligent Tutoring Systems and the Logo approach were both influential ideas in the early years of AI in education. As the strengths and limitations of each were noted over the years, combining characteristics of the two became a prime focus of research, which led to Interactive Learning Environments (ILE). We will return to these after a brief discussion of AI-based tools.

5. Applications in Education: focus on AI-based tools

There are a number of application tools that employ AI but whose purpose is not primarily educational. It is nevertheless useful to consider some of these systems, as they have clear educational applications. There are quite a few programming languages, based on AI languages such as LISP and CLOS, that are technically quite similar to the Music Logo systems described earlier. However, the philosophy of use may be quite different. The commercial system Symbolic Composer (for Macintosh and Atari) is one example of this difference.

It has a vast library of functions, including neural net facilities, for processing, generating and transforming musical data and processes, built on Lisp. The system is primarily aimed at composers and researchers. Another culture which offers an educational paradigm with many links to AI culture is the Smalltalk culture. An example of such a system is Pachet’s (1994) MusES environment, implemented in Smalltalk-80. It is aimed at experimenting with knowledge representation techniques in tonal music.

MusES includes systems for harmonisation, analysis and improvisation. Finally, an example of a commercial program is Band in a Box (Binary Designs, 1996). It takes a chord sequence as input and plays, as output, an accompaniment based on the chord sequence in a wide variety of styles. At one time this would have been regarded as requiring AI techniques, but today it is considered conventional.

6. Supporting learning with Computational Models of Creativity

6.1 A cognitive support framework: constraint-based model of creativity

“I noticed that the [drawing] teacher didn’t tell people much….Instead, he tried to inspire us to experiment with new approaches. I thought of how we teach physics: we have so many techniques-so many mathematical methods – that we never stop telling the students how to do things. On the other hand, the drawing teacher is afraid to teach you anything.

If your lines are very heavy, the teacher can’t say “your lines are too heavy” because some artist has figured out a way of making great pictures using heavy lines. The teacher doesn’t want to push you in some particular direction. So the drawing teacher has this problem of communicating how to draw by osmosis and not by instruction, while the physics teacher has the problem of always teaching techniques, rather than the spirit, of how to go about solving physical problems.”

Feynman (1986)

“John and I….were quite happy to nick things off people, because…you start off with the nicked piece and it gets into the song…and when you’ve put it all together…of course it does make something original”

Paul McCartney quoted in (Moore, 1992)

Both of the traditional AI approaches in education described earlier (ITS and Logo) have limitations. ITSs do not work very well in problem-seeking domains, and Logo-type approaches require support from a human teacher in order to be effective. One approach to these problems is MC (Holland, 1989, 1991; Holland and Elsom-Cook, 1990). ‘MC’ stands for both ‘Meta Constraints’ and ‘Master of Ceremonies’, and is a general framework for interactive learning environments in open-ended domains. We will focus on the domain model rather than the teaching model.

The current version is designed to teach ab initio students to compose tonal chord sequences, with particular reference to popular music and jazz harmony. It uses a cognitive theory of harmony (Balzano, 1980). Two elements of Johnson-Laird’s definition of creativity are relevant here: 1) the assumption that creative tasks cannot proceed from nothing, and that some initial building blocks are required; 2) the assumption that a hallmark of a creative task is that there is no precise goal, but only some pre-existing constraints or criteria that must be met (Johnson-Laird, 1988a).

From this, the act of creation can be characterised in terms of the iterative posing and satisfaction of constraints by the artist. New constraints may be added by the artist to the weaker pre-existing criteria. Results are tested at each iteration against acceptance criteria. It may be acceptable to sacrifice a pre-existing constraint or criterion in order to accommodate new constraints introduced by the artist. This is explained clearly by Sloboda (1985):

“..we will find composers breaking…rules [specifying the permissible compositional options] from time to time when they consider some other organisational principal to tale precedence.”

(Sloboda, 1985)

6.1.1 The type of constraints in music

There are three types of constraints in music: 1) those fostering perceptual and cognitive conditions for effective communication; 2) those arising from cultural consensus; 3) those introduced from scratch by the artist. The first kind of constraint appears in research on Western tonal music, such as Balzano (1980) and Minsky (1981). The focus is on harmony and, amongst other things, voice-leading. This research examines how features of music help to foster the perceptual and cognitive conditions for an effective communication structure.

Cultural consensus is the second kind of constraint. It depends on the listeners’ familiarity with materials used in previous compositions. When listeners hear a new piece of music, cognitive theories of listening posit that the music must be chunked in various ways to cope with memory and processing limitations (Sloboda, 1985). The kind of chunking that can be done depends on the listeners’ familiarity with the material. Levitt explains the connection between stylistic constraints and constraints introduced by a composer:

“Effective communication requires musicians to repeat structures frequently within a piece and collectively over many pieces. Usually we view ‘musical style’ and ‘theme and variation’ as utterly different. Computationally and socially they are similar things with different time spans; style tries to exploit long term ‘cultural memory’, while theme and variation exploits (sic) recent events. In either case, the considerate composer uses an idea of what is already in the audiences head to make the piece understandable.”

(Levitt, 1981)

6.1.2 A computational model of creativity

Using the information mentioned earlier, we can assume that creative activities can be modelled using the following process:

Given a set of building blocks,

Choose a goal.

Select constraints.

Iterate the following process:

  • apply the constraints to generate a result,
  • test the result against acceptance criteria,
  • adjust constraints until acceptance criteria are sufficiently closely met.
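The loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the MC implementation: the function name `create`, the toy building blocks (roots I, IV and V written as scale degrees), the individual rules, and the adjustment policy (dropping the most recently added constraint when nothing is accepted) are all hypothetical choices made for the example.

```python
from itertools import product

def create(building_blocks, length, constraints, accept, max_rounds=5):
    """Generate-and-test loop with a very simple constraint adjustment."""
    active = list(constraints)
    for _ in range(max_rounds):
        # Apply the constraints to generate candidate results.
        candidates = [
            seq for seq in product(building_blocks, repeat=length)
            if all(rule(seq) for rule in active)
        ]
        # Test each result against the acceptance criteria.
        accepted = [seq for seq in candidates if accept(seq)]
        if accepted or not active:
            return accepted, active
        # Adjust the constraints: here, drop the most recently added one.
        active.pop()
    return [], active

# Toy domain (hypothetical): roots I, IV and V as scale degrees.
starts_home = lambda seq: seq[0] == 1
ends_home = lambda seq: seq[-1] == 1
all_distinct = lambda seq: len(set(seq)) == len(seq)  # clashes with the two above
uses_both = lambda seq: 4 in seq and 5 in seq         # acceptance criterion

results, remaining = create(
    (1, 4, 5), 4, [starts_home, ends_home, all_distinct], uses_both
)
print(results, len(remaining))
```

In this toy run the first iteration yields nothing, because a four-chord sequence cannot both start and end on I and use four distinct roots. The sketch then sacrifices the newest constraint and accepts the sequences I IV V I and I V IV I.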

6.1.3 MC framework and its components

The MC model provides a set of interacting components. These components act as a cognitive support tool for creative processes. For the domain of composing tonal chord sequences, the components are:

  • A constraint-based planner (PLANC)
  • Constraint-based representations of basic musical materials for use as raw materials by the planner
  • A family of harmonic plans that can be used with the planner to generate prototypes of harmony sequences. Each plan has a number of variables, which are musical materials, strategies and techniques. Depending on the values chosen for these variables, a plan produces a different chord sequence, or often many different sequences, each with a family resemblance.
  • An extensible body of existing pieces. Each one is linked to one or more plans which generate its chord sequence, and to relevant styles.
  • A highly interactive direct manipulation microworld based on a cognitive theory of harmony. This allows users to manipulate the different elements of harmony used in the pieces (intervals, voicing, chord sequences etc.) and can also be used by beginners (Holland, 1989, 1994).

The interactive microworld associated with MC allows beginners to manipulate and become familiar with harmonic forms derived from Balzano’s (1980) theory of tonal harmony. The harmonic plans correspond loosely to harmonic patterns such as those noted by Moore (1992) and Pratt (1984). PLANC uses a functionally based harmony notation; however, it does not depend on the standard rules of classical functional harmony.

The ‘return home’ plan is a typical harmonic plan. This involves establishing a tonic (realised differently depending on whether the context is tonal or modal); moving to another root; and then moving back home in a direction that depends on the choice of mode. Stating the home chord at the beginning of a chord sequence may not always be necessary, depending on what other musical material is available to communicate its presence.
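The outline of this plan can be sketched as a small root generator. This is a hypothetical reading for illustration only and does not reproduce PLANC's interface: the function `return_home`, its parameters, the idea that the direction of travel is a fixed number of scale steps (3 for motion around the diatonic cycle of fifths, 1 for scalewise motion), and the treatment of a restricted alphabet as "skip any root not in the set" are all assumptions.

```python
ROMAN = ["I", "II", "III", "IV", "V", "VI", "VII"]

def return_home(first_root, step, alphabet=range(1, 8), state_home_first=True):
    """Roots (scale degrees 1..7) from first_root back to the tonic.

    step is the number of scale steps moved at each chord change (an
    illustrative parameter); roots outside `alphabet` are skipped.
    """
    allowed = set(alphabet) | {1}
    roots = [1] if state_home_first else []
    degree = first_root
    for _ in range(8):  # seven degrees, so home is reached within 7 moves
        if degree in allowed:
            roots.append(degree)
        if degree == 1:
            return roots
        degree = (degree - 1 + step) % 7 + 1
    raise ValueError("this step size never reaches the tonic")

def as_roman(roots):
    """Write scale degrees as Roman numerals."""
    return " ".join(ROMAN[d - 1] for d in roots)

print(as_roman(return_home(4, 3)))                          # full cycle of fifths
print(as_roman(return_home(4, 3, alphabet={1, 4, 5})))      # restricted alphabet
print(as_roman(return_home(6, 1, state_home_first=False)))  # scalewise, home unstated
```

Under these assumptions the three calls give I IV VII III VI II V I, I IV V I and VI VII I respectively, so a handful of variables is enough to produce root sequences with a family resemblance.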

The ‘return home’ plan is a simple plan that involves no modulation. The following are examples of return home chord sequences:

“Abracadabra” (Steve Miller) (Minor) I IV V I (restricted alphabet of roots in force)

“The Lady is a Tramp” (Rodgers and Hart) (Major) I IV VII III VI II V I (in scaletone sevenths)

“Easy Lover” (Phil Collins) (Aeolian) VI VII I (in scaletone triads)

“Isn’t she lovely” (Stevie Wonder) (Major) VI II V I (in scaletone ninths)

(See paragraph above for details on chord notations)
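To make annotations such as ‘in scaletone triads’ or ‘in scaletone sevenths’ concrete, the following sketch builds a chord on a scale degree by stacking thirds using only the notes of the mode. It is an illustrative assumption rather than the system's own representation: the names `MODES`, `NOTE_NAMES` and `scale_tone_chord` are hypothetical, only two modes are included, the tonic is arbitrary, and note names are spelled with sharps only for brevity.

```python
# Semitone offsets of the seven scale degrees for two modes.
MODES = {
    "major":   [0, 2, 4, 5, 7, 9, 11],
    "aeolian": [0, 2, 3, 5, 7, 8, 10],
}
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def scale_tone_chord(degree, mode="major", tonic=0, notes=3):
    """Stack `notes` scale tones in thirds on scale degree 1..7.

    notes=3 gives a scaletone triad, 4 a seventh, 5 a ninth.
    """
    scale = MODES[mode]
    pitch_classes = [
        (tonic + scale[(degree - 1 + 2 * i) % 7]) % 12 for i in range(notes)
    ]
    return [NOTE_NAMES[pc] for pc in pitch_classes]

# VI VII I in scaletone triads, taking A (pitch class 9) as an arbitrary tonic.
for degree in (6, 7, 1):
    print(degree, scale_tone_chord(degree, mode="aeolian", tonic=9))

# II in scaletone sevenths in C major.
print(scale_tone_chord(2, mode="major", notes=4))
```

Because the chord quality falls out of the mode and the degree, the same Roman numeral sequence can be realised in triads, sevenths or ninths by changing a single parameter.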

These characterisations of chord sequences represent only a few of the viewpoints offered by the system. Some of the sequences can be characterised differently when used with other plans. ‘Interesting’ chord sequences are often those which can be characterised by a number of plans at once. Each viewpoint on a song gives a different tree of ‘nearest neighbours’ in the body of pieces, emphasising different features.

This ability to illustrate a number of viewpoints, coupled with the ability to modify and generate new pieces, is an important source of power in the system. Plans are represented in a manner similar to that of Levitt (1985). A satisfactory way of solving constraint satisfaction problems (CSPs) using logic programming languages is explained by Van Hentenryck and Dincbas (1987). They suggest that “Given a particular CSP, it is sufficient to associate a logic program with each kind of constraints (sic) and to provide a generator of values for the variables.”

PLANC provides such generators, supplying default values for the variables of any plan. Beginners are then free to experiment with the elements of a plan. This allows exploration of possible effects regardless of whether the student has extensive knowledge of the elements. The interaction between constraints and the default value generators is such that the defaults selected vary widely between specifications. The planner permits the student to work bottom-up or top-down.

This means that when generating sequences, it is possible to specify low-level matters and leave high-level choices until later, and vice versa. This reflects the varied ways in which composers seem to like to work (Sloboda, 1985). MC provides a suitable infrastructure for a wide range of teaching strategies appropriate to problem-seeking domains (Holland, 1989). To understand how the MC components interact, however, it is helpful to focus on one of the simplest open-ended, user-directed learning strategies that MC supports.
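The idea of pairing each constraint with a test and each variable with a generator of values can be illustrated with a plain generate-and-test sketch. It is written in Python rather than a logic programming language, and everything in it is hypothetical: the variable names, their default values, and the single illustrative constraint are invented for the example and are not taken from PLANC.

```python
from itertools import product

# Hypothetical plan variables, each with a generator of default values.
DEFAULT_GENERATORS = {
    "mode": lambda: iter(["major", "aeolian"]),
    "first_root": lambda: iter([4, 6, 2]),
    "chord_size": lambda: iter([3, 4]),
}

# Each constraint is a predicate over a complete assignment of variables.
CONSTRAINTS = [
    # Illustrative rule only: this toy plan uses four-note chords only in major.
    lambda spec: spec["chord_size"] == 3 or spec["mode"] == "major",
]

def specifications(fixed=None):
    """Yield every assignment that satisfies all constraints.

    Variables named in `fixed` keep the student's chosen value; the rest
    are filled in from their default generators.
    """
    fixed = fixed or {}
    names = list(DEFAULT_GENERATORS)
    domains = [
        [fixed[name]] if name in fixed else list(DEFAULT_GENERATORS[name]())
        for name in names
    ]
    for values in product(*domains):
        spec = dict(zip(names, values))
        if all(rule(spec) for rule in CONSTRAINTS):
            yield spec

# Top-down: fix the mode first and let defaults supply the details.
print(list(specifications({"mode": "aeolian"})))
# Bottom-up: fix a low-level detail and let the constraint decide the mode.
print(list(specifications({"chord_size": 4})))
```

Fixing a high-level variable or a low-level one uses the same mechanism, which is one simple way to picture working top-down or bottom-up. A real constraint solver would prune values during the search rather than filter complete assignments afterwards.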

6.1.4 Analysis by recomposition

One way of using the system is to take an interesting piece and allow the student to ‘recompose’ it in a variety of ways. This is very beneficial for beginners. Each piece is linked to one or more harmonic plans and a variety of specifications for each plan, which generate the piece. These plans provide an analysis of each piece. By selecting a viewpoint and making a limited number of changes to the specifications, the user can recompose and regenerate the piece.

The new result may correspond to an existing piece in the body, which the system can check and remark on, or it may be a new piece. Changing the strategies used at different levels via recomposition is a step towards exploring the nearby neighbours in the plan trees. The piece may be seen to have different nearby neighbours depending on the viewpoint chosen. The object of the exercise is not to harvest the new pieces generated or to learn the harmonic plans.

More importantly, the interplay of elements (the multiple viewpoints, the corpus, the styles, and the new pieces generated) provides a musically rich context in which the interaction of musical materials and existing pieces can be explored. The plan trees can be navigated to gain a better understanding of songs, viewpoints or musical materials, depending on the student’s preferred focus.

The links between each of these components allow composition and analysis of existing pieces to be entwined flexibly. One of the key principles behind the MC framework is that the meaning of musical materials is not presented in one particular way, but in a variety of ways, depending on how they are used in different existing pieces as seen from different viewpoints. Arguably, one of the most fundamental uses of musical analysis is to compare similarities and differences between real pieces.

The three layers of knowledge in the planner (the net of constraints, the generators, and procedural code fragments) interact to produce musically intelligent behaviour. Knowledge is recorded independently of any particular use. This allows musical knowledge to be reused in different contexts. Implementing the musical plans in this way shows how little musical knowledge is actually required for surprising competence in the domain. PLANC shows that the three layers of knowledge are adequate to yield competent yet flexible musical behaviour. The MC framework is generally designed to allow novices to begin exploring interesting, motivating tasks as soon as possible.
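The recomposition cycle described in this section (take a piece's specification, change a little, regenerate, compare with the body of pieces) can be sketched as follows. The sketch is an assumption-laden toy and not the MC implementation: the `Spec` fields, the `generate` rule, and the two-entry `BODY` are hypothetical, and pieces are compared by their Roman numeral roots only, ignoring mode and chord quality.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Spec:
    """Hypothetical specification of one plan for one piece."""
    first_root: int                            # root moved to after leaving home
    step: int                                  # scale steps per chord change
    roots: frozenset = frozenset(range(1, 8))  # permitted alphabet of roots

def generate(spec):
    """Regenerate a root sequence (scale degrees) from a specification."""
    if spec.step % 7 == 0:
        raise ValueError("step must move to a different scale degree")
    sequence, degree = [1], spec.first_root
    while degree != 1:
        if degree in spec.roots:
            sequence.append(degree)
        degree = (degree - 1 + spec.step) % 7 + 1
    return tuple(sequence + [1])

# A tiny stand-in for the body of existing pieces, keyed by root sequence.
BODY = {
    (1, 4, 7, 3, 6, 2, 5, 1): "The Lady is a Tramp",
    (1, 4, 5, 1): "Abracadabra",
}

def recompose(spec, **changes):
    """Apply a few changes to a specification and regenerate the piece."""
    result = generate(replace(spec, **changes))
    return result, BODY.get(result, "a new piece")

original = Spec(first_root=4, step=3)
print(recompose(original))                               # unchanged
print(recompose(original, roots=frozenset({1, 4, 5})))   # restrict the alphabet
print(recompose(original, first_root=6))                 # move to a different root
```

Restricting the alphabet of roots lands on a sequence already in the toy body, while changing the first root yields a sequence that is not, which mirrors the two outcomes of recomposition mentioned above.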

6.1.5 The limitations of the current MC implementation

Several plans, some styles, numerous pieces, the planner and a microworld for harmony have been implemented. The student model (a semantic net to allow construction of explanations and teaching modules) has not yet been implemented. MC is not a practical system; it is a prototype used to demonstrate all of the key principles involved. However, the framework is compatible with the standard teaching and modelling techniques developed by Baker (1994) and Cook (1994), discussed later in this paper.

The current version of the MC framework focuses on harmony, with some attention to details of metre and bass lines. However, the design of the program is quite broad and domain-neutral, and can be applied to other areas of music. Levitt (1985) suggests a basis for extending the work into melody. Watson (1990) and Kane (1991) have carried out preliminary investigations into rhythm-based work. The following section describes work with a similar framework that focuses on melody.

6.2 A constraint based learning tool for exploring melody

MOTIVE (Smith and Holland, 1994) is a constraint-based learning tool that explores melody in a similar manner to the MC framework, with a focus on Narmour’s (1989) cognitive theory of melody. The goal of MOTIVE is to allow ab initio beginners to explore the composition of melody. The work achieves potentially very general applicability to melody, irrespective of genre, by virtue of being based on the most fundamental psychologically grounded theory of melody currently available (Narmour, 1989).

Narmour’s theory has very little competition as a psychologically based theory of melody; however, it does have its problems and limitations (Cumming, 1992). Using simple extensions to low-level processing theory for melodic notes, it can predict how a listener will break a melody into groups of contiguous notes and which notes will be perceived as more important than others. This gives rise to hierarchical parse trees which recursively reduce the melody to simpler versions, roughly similar to Lerdahl and Jackendoff’s (1983) TSR trees. In order to make use of Narmour’s theory computationally, Smith refined and implemented the model.

This enabled Smith to test Narmour’s analyses for consistency against the computational version. The test results showed the theory to be internally coherent, with some gaps but no fatal internal flaws. This computational model then became the main component of MOTIVE. It uses a constraint-based planner and is able to navigate plan trees to re-plan or recompose melodies.

Thus the ‘analysis by recomposition’ strategy introduced in the MC system can be applied, as can other teaching strategies. The status of Narmour’s theory is unclear, and whether it can be applied in practical terms to support beginners has yet to be decided. Irrespective of the result, it is likely that Smith’s work will be a useful computational tool to help explore and refine Narmour’s theory.

7. Interactive interfaces based on AI theory

One interactive direct manipulation tool (figure 4) for learning harmony is Harmony Space (Holland 1989, 1992, 1994; Holland and Elsom-Cook, 1990). Its design employs Longuet-Higgins’ (1962) and Steedman’s (1972) artificial intelligence theories. It also employs Balzano’s (1980) cognitive theory. The version of the theory used is dependent upon which tool is used, and vice versa.

Longuet-Higgins’ (1962) theory shows how harmonic phenomena can be described very concisely by reformulating the tonal pitch system and harmonic relationships in terms of a three-dimensional co-ordinate system. Balzano’s competing theory takes a different viewpoint: it is also related to a three-dimensional co-ordinate system for pitch, but characterises pitch differently (in terms of group theory as opposed to frequency ratios).
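The group-theoretic characterisation can be illustrated with a small sketch. It assumes a simplified two-axis layout in which octaves are ignored, one axis steps in major thirds (4 semitones) and the other in minor thirds (3 semitones), so that each of the 12 pitch classes gets exactly one position on a 3 by 4 grid. The function names are hypothetical and the layout is a simplified reading, not the Harmony Space implementation.

```python
def thirds_coordinates(pitch_class):
    """Map a pitch class 0..11 to (major-third steps, minor-third steps).

    Solves pitch_class = 4*a + 3*b (mod 12) with a in 0..2 and b in 0..3.
    """
    for a in range(3):
        for b in range(4):
            if (4 * a + 3 * b) % 12 == pitch_class % 12:
                return a, b
    raise AssertionError("unreachable: every pitch class has coordinates")

def chord_shape(pitch_classes):
    """Coordinates of each chord note relative to the first note (the root)."""
    root_a, root_b = thirds_coordinates(pitch_classes[0])
    shape = []
    for pc in pitch_classes:
        a, b = thirds_coordinates(pc)
        shape.append(((a - root_a) % 3, (b - root_b) % 4))
    return shape

print(chord_shape([0, 4, 7]))    # C major triad
print(chord_shape([7, 11, 2]))   # G major triad
print(chord_shape([9, 0, 4]))    # A minor triad
```

In this layout every major triad has the same shape wherever its root lies, and minor triads share a second shape. This is the kind of regularity that lets chords be treated as movable objects in a direct manipulation interface.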

In Harmony Space, direct manipulation theory (Hutchins, Hollan, and Norman, 1986) is applied to these cognitive models to produce interactive interfaces in which notes, chords etc. can be manipulated. The tool has been used by Howard (1994) for teaching harmonisation of Bach chorales and analysis of Mozart pieces.

Harmony Space is not just a single tool: its interface comes in a number of variants addressing harmonisation and composition, microtonal pitch systems, harmonic analysis and interfacing with the MC cognitive framework. Simple versions have been shown to be effective for practical one-to-one teaching and for teaching in small groups (Whitelock, Holland and Howard, 1994, 1995); however, students require assistance from a tutor or courseware in order to use the program effectively.

Harmony Space is suitable for both beginners and trained musicians. It allows the user to explore different harmonies without any prior experience. It is firmly based on AI representations and theories, but a great deal of knowledge is not required at the point of delivery. The system is coded using object-oriented programming techniques; nevertheless, its methodologies are strongly AI based.

8. Methods of teaching: Negotiation, Dialogue and Reflection

It is wrong to assume that a music tutoring system knows better than a student. In such an open-ended domain, creativity is the key to the process of learning. Baker’s system (1990) takes a slightly different view from the systems mentioned so far, addressing the problem of teaching a subject in which neither the teacher’s nor the student’s knowledge is more complete than the other’s. We will explore this with a focus on the expressive performance of tonal music.

Using a computer to assist in teaching musical expression allows students who cannot yet play an instrument to explore and create contrasting expressive performances. Sundberg et al. (1989) looked at the problem of trying to determine ‘musically appropriate’ performances from a score. Their theory specifies simple actions, such as a pause between two notes or an ascending run, focusing mainly on surface musical features. Baker suggests that the way a performer perceives the grouping structure of a piece has an influence on their performance.

Expert listeners may not agree on the grouping structure of a piece; however, they do agree on whether a given structure is plausible. AI theories of grouping are imperfect. Thus AI systems sometimes produce groupings which few expert listeners would find plausible.

This can be a problem for AI in education systems designed to help students with expressive performance. Baker devised mechanisms and representations that allow the system and the student to interact and negotiate with each other. These can then be applied to teaching methods. The system and the student may negotiate learning styles, which strategies to use, what to do next etc. Such a system is said to support ‘learning by dialogue’. Baker’s work is primarily a theory of teaching by negotiation rather than a practical system for use in a classroom.

8.1 Music composition: Supporting reflection

Cook’s work (Cook, 1994; Cook and Morgan, 1995) takes a similar outlook to Baker’s theory. It focuses on learning through dialogue, and on finding a framework which describes the internal dialogues of learner and teacher. The main purpose is to find a basis for describing learning and teaching processes in music composition. There are two applications for this: an Interactive Learning Environment, and a method for analysing protocols to understand teacher-student interaction.

Cook’s system COLERIDGE (Composition Learning Environment For Reflection about Intentions and Dialogue Goals in Education) mainly focuses on higher-level creative activities. The system is designed to help learners gain skills that support reflection and problem-seeking. Dewey (1916) states that reflection is the ‘intentional endeavour to discover specific connections between something which we do and the consequences which result.’ Students are encouraged not just to seek out problems but also to find solutions and to improve their ability to find a solution.

Reflection and problem-seeking are highly relevant in terms of music education. For example, a good composer will not only reflect on and question her composition but also try to re-think and re-design it in order to make it her own, rather than merely applying compositional methods and techniques. From an AI in education point of view, reflection and problem-seeking are a very interesting approach to an open-ended domain such as music.

9. Summary and Conclusions

In open-ended areas such as music there are no rules to adhere to - creativity is the key. However, there is a vast difference between beginners and experts. There are a variety of ways to apply AI in music education, such as ear training, music composition, music theory and listening. This paper has mainly focused on composition, with some attention to performance and listening. As I am a musician myself and enjoy the ‘art’ of composition I have a particular interest in this field.

The effectiveness of the Logo approach has been questioned. Critics state that it requires highly skilled and experienced individuals to make it work, and that it is unclear where credit should be assigned: to the system or to the individual. There are many opportunities to extend the work of the Logo approach into other areas of music. Intelligent Tutoring Systems are best suited to areas which have hard rules and set goals, and in which systematic errors can be defined.

Harmony Space is a human-computer interaction system built on AI theories of a domain in music (harmony) and on their methodologies. This system is quite powerful. It uses these theories as the basis for a human-computer interface, to which direct manipulation techniques are then applied so that the domain can be manipulated when and where necessary. Balzano’s (1980) theory is also quite an interesting viewpoint. It would be interesting to explore whether a similar interface could be used in other domains, not necessarily related to music.

The MC framework can be applied to any open-ended problem seeking domain. In terms of harmony it is quite a powerful tool, for a number of reasons which include: employing the representations for harmony used in Harmony Space; and generalising over modal and tonal properties. MOTIVE demonstrates that these views can also be applied to other areas of music.

Negotiation is an important part of AI in music education. It has various applications in dealing with the limitations of AI theories in music and in exploring human-machine co-operation. However, given the amount of domain knowledge available to systems, and the direct manipulation and visualisation techniques available, the extent to which negotiation can be applied practically in music education is unclear or very limited.

Reflection is a very useful tool in terms of problem-seeking. It allows the user to re-compose a piece using their ability to solve a problem. However, there is still much work needed before it can be applied practically to music education.

10. References

Baker, M. (1989). An artificial intelligence approach to musical grouping analysis. Contemporary Music Review, 2, 43-68.

Baker, M. (1990). Arguing with the Tutor. In M. Elsom-Cook (Ed.), Guided Discovery Tutoring. London: Paul Chapman Publishing.

Baker, M. (1994). A Model for Negotiation in Learning Teaching Dialogues. Journal of Artificial Intelligence in Education, 5(2).

Balzano, G. J. (1980). The Group-theoretic Description of 12-fold and Microtonal Pitch Systems. Computer Music Journal, 4(4).

Bamberger, J. (1986). Music Logo. Cambridge, Mass.: Terrapin Inc.

Carbonell, J. R. (1970). AI in CAI: an artificial intelligence approach to computer-aided instruction. IEEE Transactions on Man-Machine Systems(MMS), 11(4), 190-202.

Cook, J. (1994). Agent Reflection in an Intelligent Learning Environment Architecture for Composition. In M. Smith, A. Smaill, & G. A. Wiggins (Eds.), Music Education: An Artificial Intelligence Approach. London: Springer Verlag.

Cook, J. and Morgan, N. (1995). COLERIDGE: Composition Learning Environment for Reflection about Intentions and Dialogue Goals in Education. In A. Smaill (Ed.), International Congress in Music and Artificial Intelligence, University of Edinburgh, Department of Music.

Cork, C. (1988). Harmony by LEGO bricks.

Cumming, N. (1992). Eugene Narmour’s Theory of Melody. Music Analysis, 11(2/3), 354-374.

Desain, P. & Honing, H. (1992). Music, Mind and Machine. Studies in Computer Music, Music Cognition and Artificial Intelligence. Amsterdam: Thesis Publishers.

Desain, P., & Honing, H. (1996). LOCO-SONNET: a graphical dataflow language for algorithmic composition.

Elsom-Cook, M. (Ed.). (1990). Guided Discovery Tutoring. London: Paul Chapman Publishing.

Fenton, A. (1989). The Design of an Intelligent Tutoring System for Music. Musicus, 1(2), 125-143.

Holland, S. (1994). Learning about harmony with Harmony Space: an overview. In M. Smith, A. Smaill, & G. A. Wiggins (Eds.), Music Education: An Artificial Intelligence Approach. London: Springer Verlag.

Hutchins, E. L., Hollan, J. D., & Norman, D. A. (1986). Direct Manipulation Interfaces. In D. A. Norman & S. Draper (Eds.), User Centred System Design: New Perspectives on Human Computer Interaction. Hillsdale, NJ: Erlbaum.

Johnson-Laird, P. N. (1988a). The Computer and the Mind. London: Fontana.

Laurillard, D. (1993). Rethinking university teaching: a framework for the effective use of educational technology. London: Routledge.

Lerdahl, F., & Jackendoff, R. (1983). A Generative Theory of Tonal Music. London: MIT Press.

Menzel, E. W. (1991). Chimpanzees (Pan troglodytes): Problem seeking versus bird-in-hand, least effort strategy. Primates, 32, 497-508.

Moore, A. (1992). Patterns of harmony. Popular Music, 11(1), 73-106.

Morgan, N., & Tolonen, P. (1995). Symbolic Composer Professional – a software application for Apple Macintosh computers.

Newcomb, S. R. (1985). LASSO: An intelligent Computer Based Tutorial in Sixteenth Century Counterpoint. Computer Music Journal, 9(4).

O’Shea, T., & Self, J. (1983). Learning and Teaching with Computers. London: Prentice-Hall.

Pachet, F. (1994). The MusES System: an environment for experimenting with knowledge representation techniques in tonal harmony. In First Brazilian Symposium on Computer Music, SBC&M, (pp. 195-201). Caxambu, Minas Gerais, Brazil.

Sleeman, D., & Brown, J. S. (Eds.). (1982). Intelligent Tutoring Systems. London: Academic Press.

Sloboda, J. A. (1985). The Musical Mind: The Cognitive Psychology of Music. Oxford: Clarendon Press.

Todd, N. (1989). A Computational Model of Rubato. In E. Clarke & S. Emmerson (Eds.), Music, Mind and Structure.

Wenger, E. (1987). Artificial Intelligence and Tutoring Systems. Los Altos, California: Morgan Kaufmann Publishers Inc.

Yavelow, C. (1992). Macworld Music and Sound Bible. San Mateo, California: IDG Books.