Cultural Approaches To Music Cultural Studies Essay

For the past several decades, cognitive researchers have looked at music as a biological phenomenon (Cross and Morley 2010: 61). On the other hand, most ethnomusicologists, having rejected the evolutionary approach espoused by scholars in comparative musicology, have considered music primarily as a cultural phenomenon. [1] Biomusicologists Brown, Merker, and Wallin [2] charge that this position, which ignores questions about the origins of music, ultimately leaves the calling of ethnomusicologists nullified. [3] Should ethnomusicologists reconsider biological questions about music in the face of the burgeoning research in biomusicology? [4] If so, how can scholars navigate the apparent impasse between biological and cultural approaches? This paper focuses on three areas of potential congruence between biomusicology and ethnomusicology: the relevance of musilanguage, the role of neophilia or aesthetic play, and the significance of the debate about whether music is an adaptation or a technological spin-off of another cognitive function. By demonstrating areas of mutual interest between biomusicology and ethnomusicology, it is hoped that there will be new opportunities to reconcile the two seemingly disparate approaches to music.

Musilanguage: Implications for Ethnomusicology

One of the most provocative ideas in the neuroscientific study of music is the theory concerning a protolanguage that synergistically embodied both linguistic and musical antecedents in the early hominid line. According to Brown (2000: 271-300) and Mithen (2006: 172-76, 191-204), this protolanguage was a precursor for both music and language: 'a communication system that had the characteristics that are now shared by music and language, but that split into two systems at some date in our evolutionary history' (Mithen 2006: 26).

As one of the first scientists to consider a prototypical 'musilanguage', Brown explains that the musilanguage model uses the common features of both music and language as its starting point, thereby avoiding 'the endless semantic qualifications as to what constitutes an ancestral musical property versus . . . an ancestral linguistic property . . . The model forgoes this by saying that the common features of these two systems are neither musical nor linguistic but musilinguistic, and that these properties evolved first' (2000: 277). By contrast, Brown argues that the distinctive features of music and language 'occurred evolutionarily later. They are specializations that evolved out of a common precursor and are like the various digits that develop out of a common limb bud during ontogeny of the hand' (ibid.).

Linguist Alison Wray also proposes the idea of a protolanguage as a precursor to modern human language (2002, 2008), and Steven Mithen expands further on Brown's and Wray's insights in his book The Singing Neanderthals (2006). Mithen claims,

'From my reading of their work, neither Brown nor Wray has fully appreciated the true significance of their separate insights. It is my task in this book to reveal that significance-not only to explain the origin of music and language, but also to provide a more accurate picture of the life and thought of our human ancestors' (2006: 26). Mithen feels that neither 'musilanguage' nor 'proto-language' appropriately describes this precursor, however, and prefers to use the acronym 'Hmmmmm,' which stands for a 'holistic, manipulative, multi-modal, musical and mimetic communication system,' strongly emphasizing bodily gesture (2006: 172). Mithen continues, 'the type of communication system I am suggesting for Early Humans is one that still lacks words and grammar, and which continues to follow Alison Wray's arguments for the nature of proto-language' (2006: 172). However, Mithen argues that the protolanguage was 'a more elaborate communication system than [Wray] proposed-one that use[d] not only gesture but also dance, and one with extensive use of rhythm, melody, timbre and pitch, especially for the expression of emotional states' (ibid.).

While Mithen's book-length discussion of 'Hmmmmm' in the lives of early hominids persuasively makes the case for this communication system as a precursor for both language and music, Brown's chapter in The Origins of Music addresses some specific points about the precursor that are particularly helpful for ethnomusicologists who study orally performed traditions. Brown contends that music and language 'differ more in emphasis than in kind, and are better represented as fitting along a spectrum instead of occupying two discrete, but partly overlapping universes' (2000: 279). If this theory were true, it would explain the ubiquity of human oral communication that features both language and music-descendants of the musilanguage that diverged but later recombined as vocal performance in human communication. Mithen further explains the ultimate power of the rebinding of music and language as follows:

[T]he two products, music and language, are only being recombined after a period of independent evolution into their fully evolved forms. Consequently, song [and I would add other forms of vocal performance] benefits from a superior means of information transmission, compositional language in the form of lyrics, than ever existed in [the precursor], combined with a degree of emotional expression, from music, that cannot be found in compositional language alone (2006: 273).

Thus, the recombination of music and language into human vocal performance represents a far more expressive communicative form than its precursor, but still reflects the interrelatedness between the musical and linguistic characteristics of its evolutionary antecedent.

A New Perspective for Orality Research: A Two-Dimensional Model

Neuroscientific research also reinforces the notion that modern language and music are cognitively inter-related, thereby indirectly supporting the proto-language model as an evolutionary prototype for the inextricable relationship between music and language in modern human brains. As Falk explains, 'Results from brain imaging studies . . . [imply] that music and language are part of one large, vastly complicated, distributed neurological system for processing sound in the largest-brained primate' (2000: 212).

What are the ramifications of this latest research on music and language for ethnomusicologists involved in orality research? One of the major benefits is recognizing the parity of language and music in oral performance. With the exception of a few landmark studies on the musical dimensions of orality (Austerlitz 2005, Erdeley 1995, Nettl 1998, Reichl 2000, Tokumaru and Yamaguti 1986, Treitler 2003), most orality research has heretofore favoured the linguistic over the musical dimension in understanding this cognitive phenomenon (Lawson 2010: 429). However, by recognizing the complex interdependency of language and music in current evolutionary musicological and neuroscientific research, we now find substantial justification for changing the research paradigm to include a balance between the linguistic and the musical in orality studies.

Although the recent emphasis on the synergistic relationship between music and language might initially appear to complicate orality research, I submit that understanding the role of musicality in oral communication will ultimately expose an inherent weakness of the oral-literate paradigm that has continually obscured our understanding of orality. Instead of merely considering a uni-dimensional phenomenon reflecting oral versus literate elements, the oral-literate spectrum (graphed in Figure 1 as the y axis) can be expanded to include an intersecting 'musilinguistic' (linguistic-musical) spectrum (the x axis) on a Cartesian coordinate system, creating a two-dimensional model that allows for graphing all the different possibilities for music and language in all of their oral and literate forms into four quadrants: I, the musical-oral; II, the linguistic-oral; III, the linguistic-literate; and IV, the musical-literate.


Figure 1: Two-Dimensional Model for Orality Research with Approximate Classifications for Eight Genres

Most significantly, this proposed model has the advantage of illustrating the literate dimensions of musicality, an area that has heretofore been neglected in the research (Lawson 2010: 436-442). While Ong has articulated the concept of 'technologizing the word' through writing (2006: 7), one must also recognize the technologizing of musical sound through notation. Scholars like McLuhan have intimated that there are significant 'complications of operations' in moving from an oral-aural to a visual-literate perspective in language (2002: 93), but scholars have only barely begun to assess the impact of using visual notation in musical performance (Falk 2000: 202), let alone in performance traditions that feature both language and music (Lawson 2012).

For example, instrumental genres that employ exclusively auditory techniques (as opposed to musical notation) in learning and performing, such as traditional Hindustani classical instrumental performance, would be graphed in the musical-oral quadrant, as indicated in Figure 1. [5] Instrumental genres that rely primarily on auditory communication, but use some kind of bare-bones written prescriptive notation, such as jazz, would also be graphed in the musical-oral quadrant, but more towards the literate end of the model, as shown. Western European and North American Art Music (hereafter WENAM), which relies heavily on written notation, would be graphed at the far lower-right side of the musical-literate quadrant, and literature would be graphed at the far lower-left side of the linguistic-literate quadrant. Finally, genres like straight storytelling (no singing) that do not rely on a script would be graphed in the far upper-left side of the linguistic-oral quadrant; and vocal genres that are created musically while using a text as a kind of prescriptive notation, such as the banqiangti forms that will be discussed shortly, would be graphed towards the middle of the linguistic-musical axis, as indicated. With the potential help of brain imaging techniques, future researchers may be able to see the influence of visual orthographies on musical as well as linguistic performance. This could lead to a more sophisticated system in which actual values might be ascertained for different degrees of musicality, language, orality, and literacy, rather than merely the approximations shown in Figure 1.
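The quadrant logic of the model can also be expressed as a minimal sketch in code. The example below is only an illustration under stated assumptions: the numeric coordinates are hypothetical placements chosen to match the verbal descriptions above, not values read from Figure 1, and the names GENRES, quadrant, and classify are invented for this sketch. The x value runs from -1.0 (linguistic) to +1.0 (musical), and the y value from -1.0 (literate) to +1.0 (oral).

```python
from typing import Dict, Tuple

# Hypothetical coordinates for illustration only; they are NOT measured
# values and are NOT taken from Figure 1.
# x: -1.0 (linguistic) ... +1.0 (musical)
# y: -1.0 (literate)   ... +1.0 (oral)
GENRES: Dict[str, Tuple[float, float]] = {
    "Hindustani classical instrumental": (0.8, 0.8),
    "jazz": (0.8, 0.3),            # musical-oral, closer to the literate end
    "WENAM": (0.9, -0.9),          # far lower-right
    "literature": (-0.9, -0.9),    # far lower-left
    "straight storytelling": (-0.9, 0.9),  # far upper-left
}


def quadrant(x: float, y: float) -> str:
    """Return the quadrant label of the two-dimensional model for a point."""
    if x == 0 or y == 0:
        return "on an axis"
    if x > 0 and y > 0:
        return "I (musical-oral)"
    if x < 0 and y > 0:
        return "II (linguistic-oral)"
    if x < 0 and y < 0:
        return "III (linguistic-literate)"
    return "IV (musical-literate)"


def classify(genres: Dict[str, Tuple[float, float]]) -> Dict[str, str]:
    """Map each genre name to its quadrant label."""
    return {name: quadrant(x, y) for name, (x, y) in genres.items()}


if __name__ == "__main__":
    for name, label in classify(GENRES).items():
        print(f"{name}: {label}")
```

Treating points that fall exactly on an axis as a separate case is a design choice of this sketch; it leaves room for genres that sit at the boundary between two quadrants rather than forcing them into one.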

With a new understanding of the musical and linguistic bases for oral performance, researchers are in a position to address the cognitive activity of orality in a more comprehensive way, including discussing the degrees of musicality and language as well as the degrees of literacy and orality. What has heretofore been considered oral performance may now be re-examined as a product that reflects stubborn traces of the ancient musilinguistic phenomenon: a modern rebinding of music and language that seems to persist despite the divergence of language and music in modern humans. In other words, we appear to be hardwired to express ourselves musilinguistically.

Implications of Ethnomusicological Research for Musilanguage

While there is no way to prove whether the complex stages of the musilanguage model did indeed evolve as Brown explains it (2000: 278-91), one can look at musical and linguistic evidence of modern and historically documented performance traditions and surmise whether the model seems convincing. [6] The example I use to ascertain the credibility of the musilanguage theory comes from my own research on language-music relationships in northern Chinese narrative forms known as shuochang, [7] which literally means 'speaking singing'. Although shuochang is an especially interesting example of a descendant of musilanguage and all the many subtle cultural messages that are conveyed beyond semanticity, Chinese narrative traditions are certainly not the only modern offspring of the musilanguage precursor. As Nettl suggests, all societies apparently have some kind of 'sound communication that they distinguish from ordinary speech . . . [which] could be a kind of baseline for music' (2000: 466). However, it also appears that there is little consensus from culture to culture as to how speech differs from music and all the other intermediary forms. Since there is no agreement on a definition of music among ethnomusicologists, [8] finding a place to begin a conversation between biomusicology and ethnomusicology is daunting. Indeed, the failure to propose a definition of music that is acceptable among scientific and humanistic disciplines reflects the depth of the impasse between biological and cultural approaches to music described by Cross (2010: 62). However, if one considers that music and language may be best represented as a single parameter on a musilinguistic spectrum, and that each culture may exhibit a variety of musilinguistic forms on that spectrum, then the graph proposed in Figure 1 might be used as a foundation for understanding music and language, providing a model that suggests further research into musical and linguistic cognition, and that can potentially be aided by the ongoing technological developments of brain imaging.

In some significant ways, shuochang features characteristics of what Brown claims to be the ancient prototype of modern music and language. Shuochang represents the northern variety of a Chinese narrative tradition that has over 150 variants, each differing according to geographical (and therefore linguistic/musical) area. Two particularly important counterparts to shuochang from other parts of China are pingtan, from the Shanghai-Suzhou area, and nanyin, from Guangzhou, but there are scores of other varieties throughout the country (Zhang 1983: 11-13). In each Chinese narrative genre, the emphasis is on the interplay between the local dialect and indigenous popular melodies: an aesthetic manipulation of music and language that entertains through the telling of stories and through appreciating the sonic and structural beauty of the musical rendition, since most of the genres are sung. Brown's suggestion that musilanguage would be best understood on a music-language spectrum (2000: 278) applies equally well to shuochang.

Brown's discussion of the musilanguage model becomes even more interesting when he describes the first of three phases of musilanguage, which involved lexical tone, or the use of pitch to convey semantic meaning (2000: 279). While it is difficult to prove Brown's suggested sequence (from intonation to combinatoriality to expressive phrasing) in the development of his musilanguage model, his comments about the importance of lexical tone as the first step in the development of musilanguage are significant with regard to Chinese narrative traditions. All Chinese narrative forms are based on tonal languages, and the forms include the largest number and greatest diversity of any narrative traditions in the world today, implying that China is an ancestral place for musilanguage. Brown suggests that 'language evolved as a tonal system from its inception, and that the evolutionary emergence of nontonal languages (intonation languages) occurred due to loss of lexical tone. In other words, this hypothesis states that tonality is the ancestral state of language' (2000: 284-85). Furthermore, Brown stresses that 'the notion of lexical tone implies that pitch can and does play an essential role in language, not just as a prosodic or paralinguistic device, but as a semantic device' (2000: 281). Moreover, 'the notion of lexical tone, with its underlying level tones and semantically meaningful pitch movements, satisfies the criterion for being a joint feature of language and music' (2000: 284-85).

Significantly, shuochang is based on a tonal language and indeed features both the semanticity of language and the melodic play of music. It is no surprise that Chinese, as both an ancient and a living language, has preserved this primordial feature of lexical tone by conjointly featuring both language and music in performance. As such, shuochang lends credence to Brown's theory by demonstrating the tenacity of lexical tone and its musical play in the language-music relationships of a modern genre.

However, shuochang challenges Brown's idea that 'concepts such as musical language . . . and speech melody are never taken beyond the domain of metaphor into the domain of mechanism' (2000: 272). Shuochang clearly demonstrates how speaking and singing operate within the domain of mechanism in a living tradition that reveals inextricable, synergistic relationships between music and language. For example, many of the northern shuochang genres are divided into two major categories: those that use melody to generate a text and those that use a poetic text to generate the melodic rendering of a story. The first category is qupaiti, which refers to the way a pre-existing tune dictates lyrics, and the second category is banqiangti, a system in which singers apply adaptable musical formulas to a newly written text (Lawson 2011: 62-76, 82-95). The permeability of language and music in the minds of performers and aficionados is evident in frequently heard descriptive phrases like 'singing while speaking' (changzhe shuo) and 'speaking while singing' (shuozhe chang). Chinese audiences delight in the ways in which language manipulates music and music modifies language in the telling of a story (ibid.: 7-14).

Given the fact that the Mandarin Chinese dialect used in shuochang is a tonal language, my primary research question entering the field was how much musical play was allowed in rendering any given story-if, indeed, the communication of stories was the pre-eminent feature of these genres. The answer to that question is that every genre offers a unique example of the interplay between language and music, each genre differing primarily according to which of the two text-setting processes was employed. Each of these two processes (banqiangti and qupaiti) represents the complement to the other, and yet, each process preserves the dynamic relationship of both the shuo (or spoken element) and the chang (or musical element). Audiences find each process fascinating in the ways speaking and singing are negotiated differently within the linguistic and musical context of each piece. Plotting these genres in Figure 1, the qupaiti genres are graphed within the musical-oral quadrant. The banqiangti genres, while still musically oriented, are graphed on the x-axis (to the left and below the qupaiti genres) because of the greater importance of lyrics and textual prescriptions. Other genres that are more akin to spoken language would be plotted even further towards the linguistic end of the continuum (Lawson 2011: 51-6).

In discussing the nature of the two ends of his musical-linguistic gamut for the musilanguage model, Brown explains that the music pole represents 'music's vehicle mode of action, in which language's referentiality and music's sound emotion function come together in a complex union of reenactment rituals, musical symbolism, musical narration, acoustic depiction, and the like' (2000: 278-79). By contrast, Brown explains that the language pole represents features like heightened speech, Sprechstimme, rapping and other uses of melody and rhythm to convey linguistic meaning. He argues that the purpose of his model is 'to describe a system containing both rudimentary referential and sound emotion properties such that it might be a reasonable precursor for the evolution of both music and language, and such that divergence from this precursor stage can be seen as an intensification of emphasis rather than the creation of new worlds' (ibid.). While shuochang genres can be easily plotted on such a gamut (as can scores of genres not mentioned in this paper), I beg to differ with Brown concerning the two ends of the continuum presented in the musilanguage model. Although I agree that the language end certainly favours the idea of semanticity, the evidence I have from studying the more musically-oriented genres demonstrates the significance of creative play with musical form.

Neophilia: The Need for Aesthetic Novelty

The aesthetic manipulation of form is one of the most important features of shuochang. Whittall explains that definitions of form have commonly 'given priority to the need for form to be unified and integrated, with contrasts and diversities subordinate rather than predominant. Moreover, form has generally been theorized as implying not simply organization, but organicism-with frequent recourse to biological or botanical analogies' (2002: 92). Musical analyses of shuochang performances demonstrate an amazing level of formal complexity in the ways music and language display mutual interdependence. In these genres, the musical 'contrasts and diversities' are indeed subordinate to the needs of the words being communicated, but there is also an inherent organicism in the way musical form adapts to any given narrative situation.

For example, in one genre, the complexity involved in using certain melodies in standard and abbreviated forms demonstrates a mathematically precise sophistication in setting texts of different lengths (Lawson 2011: 64-7). And while most consultants are not consciously aware of the level of complexity in the setting of texts, aficionados clearly know the difference between a skilled artist and an amateur performer (Lawson 2011: 10-11). If anything, the expression of emotion is experienced as much in the appreciation of the semanticity of beautifully written lyrics as in the musical rendering of a piece; indeed, emotionally commanding performances occur in genres on both ends of the spectrum. Hence, I would argue that the two poles of the shuochang gamut represent semanticity and aesthetic play rather than semanticity and emotion, since powerful emotions can be experienced on all parts of the spectrum.

The play of musical form is powerful precisely because it is the opposing complement to semanticity on the shuochang spectrum. As Cross and Morley explain, 'Music embodies and exploits an essential ambiguity, and in this respect, language and music may be at complementary poles of a communicative continuum' (2010: 69). The delight of musical play lies in its inherent ambiguity, an ambiguity that allows melody and rhythm to play with the semanticity of lexical tone within the bounds of comprehension. Any musical manipulation that distorts lexical tone to the point of miscomprehension has gone too far; however, musical play of lexical tone within the limits of comprehension represents the highest form of artistry. As a group of 'musilinguistic' genres, shuochang embodies both the 'semantically decomposable propositions' of language (Cross and Morley 2010: 69) and the essential ambiguity of music within one multi-modal experience. Indeed, the play between language's semanticity and music's 'floating intentionality' (Cross and Morley's description of the way music can gather meaning from contexts as it simultaneously contributes meaning to those contexts) is a source of pleasure for shuochang aficionados.

Imberty also mentions the importance of aesthetic play implied by musical performance when he claims that 'music is not communication but a representation of our ability to communicate; it is a stylized game for our opening to the world, it is communication without an object to communicate' (2000: 461). In other words, musical play is a kind of jouissance in witnessing and perceiving the aesthetic manipulation of the form and style of a performance; when the aesthetic play particularly complements the communication of semantic meaning, the jouissance is further enhanced.

The following brief examples demonstrate the two processes for creating pleasure through aesthetic manipulation of music and language in shuochang. The first example features the beauty of an elegantly written text, aesthetically rendered by a musical style that only mildly challenges semanticity; the second features the overpowering beauty of a melody, which clearly dominates and sometimes even obscures the text and its message. The pleasure experienced in performance lies in simultaneously appreciating the melody that emerges from and plays with the lexical tonality of the language and the semanticity of the language that inspires the musical setting of the text.

Singing While Speaking

As mentioned, banqiangti is the system for setting texts in which musical considerations are subsidiary to the text, and one of the best examples of this type of genre is Beijing Drumsong. The text is written by an author specializing in the genre, who begins writing couplets of poetry organized loosely into stanzas; musical formulae are then used to set each line according to the tonal and rhythmic demands of the text and the aesthetic preferences of the singer (Lawson 2011: 82-93). The singer, known as the second author, musically renders every textual line, carefully preserving the essential pitch structure, characteristic melodic movements, and cadential patterns of the system. As a result, no two pieces composed according to the same banqiangti will sound alike because different texts demand individual settings. In other words, although each text will inspire a unique musical rendering by the singer, implicit in this type of poetry and embedded in the poetic texts are musical and rhythmic formulae which are rendered by a knowledgeable singer who has spent years singing the genre and knows how to employ the formulae (Lawson 2011: 82-83).

For example, in Figure 2 each word in the piece At Break of Day demonstrates the direct correlation between word tone and the melodic direction according to the four tone marks of the Mandarin dialect over each syllable (ibid.: 131-32). This piece consists of nine couplets, where the first half of the couplet is designated as a T (top) line and the second half of the couplet is designated as a B (bottom) line (Stevens 1975: 112). In this example, the melodic rendering of the text in T lines 3 through 9 is clearly subservient to the meaning; in other words, the melodic formula (pingqiang) used to set these lines follows the lexical tone carefully, leading to high linguistic comprehension. [9] Note the basic similarity in the melodic arch of all these lines, where the differences in melody reflect the lexical tone of each syllable, and the simple 4/4 rhythmic structure characteristic of manban, which is the most common rhythmic formula used to set most Beijing Drumsong lyrics. Rhythmic differences from line to line enhance the meanings of the text and add variety to the performance.


Figure 2: T-3 through T-9 in At Break of Day

Although the T lines demonstrate a fairly simple and syllabic rendering of lexical tone, the B lines of the couplets are treated differently, as shown in Figure 3. Once the lexical information is conveyed melodically in the first part of the B lines, the rest of the B lines exhibit a more ornate, melismatic rendering of the text. At the ends of these lines-after most of the essential narrative information has been expressed-musical considerations take over, and melodic ornamentation dominates. Since the important lexical information has already been expressed, the melodic excursions at the end serve as an aesthetic complement to the more syllabic rendering of the first part of the phrase. This is in keeping with a phrase frequently heard among consultants: 'First communicate the words, then sing the melody' (Xian nianzi hou chang qiang).


Figure 3: B-6 and B-7 in At Break of Day

Significantly, the lyrics, which are considered semi-literary, are the most important feature of this genre, and the lexical tone of the lyrics carefully dictates the movement of the melody. After the basic semantic information has been conveyed, however, aesthetic demands predominate. Audiences love this genre because the beautifully written texts and delicate melodic rendering represent the elegance of Chinese culture (Lawson 2011: 93-95).

Speaking While Singing

One of the other popular genres of shuochang is called Tianjin Popular Tunes. As the center for the narrative arts in northern China, Tianjin traditionally has had one of the largest professional troupes of narrative artists in the country and currently houses the Academy for the Northern Chinese Narrative Arts, a regional school that teaches all the narrative forms performed throughout northern China (Lawson 2011: 4, 25, 28). While most of the genres performed in Tianjin are shared with and performed regularly in other areas in northern China, Tianjin Popular Tunes is indigenous to the city and reflects local Tianjin pride.

Tianjin Popular Tunes is also more musically (or chang) oriented than Beijing Drumsong. Using the previously described qupaiti format for setting texts, the tunes employed in text setting are not only native to and highly popular in Tianjin, but are also extremely ornamented, allowing for virtuosic singing on the part of the performer. Audiences love these melodies and cheer loudly when a local performer assumes the stage and begins to sing one of these beloved tunes, even if the lyrics are not always clearly understood (Lawson 2011: 70).

Figure 4 illustrates a few measures of two pieces sung to the same qupai. The lyrics on the upper line come from a piece entitled Dropping the Watermelon, and the lyrics on the bottom line come from Autumn Scenery; all the syllables indicated in parentheses are non-lexical syllables sung to fill up the extra musical 'space' of the qupai, and are as numerous as the textual syllables (a phenomenon that will be discussed shortly). Comparing the upper and lower lines, one can clearly see how the word tone of different textual syllables is rendered by using grace notes or by making slight variations in the melody to accommodate the direction of word tone. For example, the first syllable of Watermelon is third-tone jie (low rising), while the first syllable of Autumn is the high, level first-tone tiar. Similar discrepancies between different word tones for different syllables of the two pieces can be seen clearly throughout this example.


Figure 4: Comparison of Two Texts Sung to the Same Tune

The frequent use of non-lexical syllables sung to identical segments of melody in different pieces but written according to the same qupaiti raises an interesting issue. In her recent work Formulaic Language: Pushing the Boundaries, Wray explains that formulaic language is language that 'operates beyond its normal scope . . . where language users . . . favor previously assembled output over something more spontaneous . . . [representing] the boundaries . . . of language behavior, of communication potential, and of linguistic theory' (2008: 5). The non-lexical syllables in Figure 4 are examples of formulaic language, representing a boundary between language and music that might well be plotted towards the musical end of the musical-linguistic axis in Figure 1. In the case of this genre, the non-lexical syllables are sung only to preserve the underlying melody, and they are linguistically less important than the textual syllables. Indeed, the number of non-lexical syllables can even distort the semantic intelligibility of the lyrics (Lawson 2011: 70).

Chinese consultants unanimously proclaim that this genre is musically 'weighty' (yinyuexing hen qiang), and that the lyrics are generally considered subservient to the popular tunes to which the lyrics are sung (Lawson 2011: 69-3). Interestingly, in evaluating Wray's research, Mithen posits that 'the formulaic aspects of language suggest a greater similarity with music than might initially be apparent' (2006: 19), and, by extension, I would argue that the presence of non-lexical syllables-an example of formulaic language-in musically recurring sections of qupai lends musical stability to the piece. Textual syllables demand slight melodic alterations to accommodate lexical tone, but formulaic, non-lexical syllables allow for more musical similarity from piece to piece written in the same qupai, thereby temporarily suspending semantic intelligibility and creating more musical interest for the listener. Certainly in Chinese, where lexical tone is essential for semantic communication, the absence of lexical tone denotes the dominance of musical expression that is not otherwise possible in vocal music dictated primarily by the lexical tone of the lyrics.

The kind of musical expression associated with sections of non-lexical syllables is best described by Herbert: music can offer 'an alternative mental "space" where the interaction between perceiver and stimulus does not have to constitute an effortful decoding of informationally precise meaning' (2011: 197). Within this non-lexical space, the singer displays her most ornamented and musically elaborate singing-a clear opportunity for creative play. This creative play also occurs after singing each of the lexical syllables carefully according to lexical tone, since the singer relishes the opportunity to ornament the melody once the basic lexical information is communicated. The ornamented melody is called the 'wer' or 'flavor' of the genre, and is, as consultants explained, the focal point of the performance (Lawson 2011: 63-64).

By contrast, the emphasis on musical dominance in this genre is balanced by the presence of a narrative section that often appears between the first and second couplets of the piece (Lawson 2011: 75). Significantly, this section is rendered in a semi-spoken delivery style in local Tianjin dialect, as opposed to the standard, Beijing Mandarin dialect used for the rest of the lyrics in the piece. To complement the musically-oriented sections, the performer also communicates in a less musically ornate style in the local dialect to ensure complete comprehension of the lyrics. Hence, in both the musically-oriented as well as the linguistically-oriented portions of the piece, expressions of Tianjin pride are ultimately the most important meta-messages communicated in performance: local Tianjin melodies prevail over Mandarin lyrics for most of the piece, and Tianjin dialect predominates in the melodically simpler narrative section (Lawson 2011: 69-76).

Thus, the Chinese narrative arts steadfastly maintain the lexical tonality that Brown feels is the ancestral state of language. Additionally, shuochang, whose very name reflects the musilanguage of our human predecessors, rebinds the two products of the precursor in a way that reemphasizes the musilinguistic significance of the original prototype, but creates a semantically and musically more powerful form than was ever possible in musilanguage. The interest in shuochang lies simultaneously in both the semantic beauty of the lyrics and the playful melodic manipulation of the lexical tone, creating a masterpiece of musical and linguistic charm.

Biomusicological insights on neophilia

The notion of aesthetic play has also been a topic of interest in biomusicology, partly due to the ramifications of the medical and psychological research since the late 1960s on mother-infant relationships (Malloch and Trevarthen 2010: 1). It has been demonstrated that the universal interactions between human mothers and infants not only include the 'lilting, simplified utterances' of mothers to their infants, but facial and bodily movements as well (Dissanayake 2010: 22). This 'communicative musicality'-a term coined by Malloch and Trevarthen (2010)-involves 'melodic vocal contours, rhythmic and regularized vocalizations and body movements, and expressive dynamic contrasts and variations in space . . . and time . . . , with behavioral "rests" or silences between bouts' (Dissanayake 2010: 22).

Furthermore, Dissanayake posits that music was indispensable to humans not because of male competition or adult courtship-the more common considerations in proving adaptive behavior-but because of the 'affiliative interactions between mothers and infants' (389).

The trend toward increasingly helpless infants surely created intense selective pressure for proximate physiological and cognitive mechanisms to ensure longer and better maternal care. I suggest that the solution to this problem was accomplished by coevolution in infants and mothers of rhythmic, temporally patterned, jointly maintained communicative interactions that produced and sustained positive affect-psychobiological brain states of interest and joy-by displaying and imitating emotions and motivations of affiliation, and thereby sharing, communicating, and reinforcing them (2000: 389-90).

Additionally, research overwhelmingly indicates that these musical and dance-like movements of the mother are in direct response to the positive responses of the baby in 'eliciting and preferring precisely these kinds of signals' (Dissanayake 2010: 22). As infants encourage mothers to interact with them in a way that simplifies, repeats, exaggerates, and elaborates common communicative signals, maternal behaviour is reinforced, and this reinforced behaviour 'would have been adaptive for both maternal reproductive success and infant survival' (ibid.: 23).

Cross and Morley continue the argument by explaining that a salient feature of the mother-infant bond, due to increasing hominid altriciality, was the vocal play associated with juveniles (2010: 73). Citing research on vocal play among pygmy marmosets (Elowson et al. 1998), Cross and Morley compare the kind of babbling among juvenile pygmy marmosets with their caregivers to the kinds of vocalizations between mothers and infants in early hominids, explaining that 'an association between vocal play and a positive caregiving response privilege[d] the social function of these types of play' (2010: 74). Furthermore, Cross and Morley argue that, given their survival value, group behaviours like babbling between juveniles and caregivers are likely to have some adaptive value: 'Music can be interpreted as one of those mechanisms, emerging under the selection pressures of the progressive extension and stage-differentiation of the juvenile period in the later hominid lineage' (74). Thus, the potential to rehearse and refine social interactions through these vocalizations was 'built on subsequently to become part of music and language in the fully symbolic culture that emerged in modern humans' (76). Hence, these 'rehearsals' of early proto-musical and proto-linguistic vocalizations were essential to the survival of both mother and infant, and, according to Cross and Morley, may well have been the precursor of human music and language.

In addition to the notion that play was essential to the risk-free development of survival skills, including language and music, Dissanayake adds another dimension to the argument by explaining how play also satisfies the apparent deep-rooted need for novelty: 'Both human and primate young show "neophilia," or attraction to the novel. Even from birth, the human infant begins actively to seek sensory and cognitive stimulation. In humans and in some animals, mild fear and frustration may even be welcomed or sought . . .' (1988: 125). She continues by suggesting that humans regularly seek ways to enliven an ordinary routine, often endowing it with spiritual values. The resulting creativity is 'one consequence of the desire for adventure, even though the corresponding desire for familiarity and predictability is equally strong. The oscillation between these two poles of being and becoming may in itself be creative . . .' (ibid.).

Zoomusicological insights on play

The search for novelty and creativity is also a topic of interest among zoomusicologists, who study 'the aesthetic use of sound communication among animals' (Martinelli 2009: 2). Mâche explains that, '[l]ike man, the animal occasionally plays with sounds' (1992: 158-9). One of the most interesting animal studies regarding neophilia involves male humpback whales, whose creativity in singing has been the subject of research collected over a period of thirty-two years. In studying this vast amount of data, Katharine Payne asserts that male humpback whales in a given population sing basically the same song, which differs from songs of other populations of male humpback whales (2000: 135). Significantly, however, the song of each population 'evolves continuously, progressively, and so rapidly that nonreversing changes can be measured from month to month in a singing season. Such changes, which affect the songs at all levels, seem to arise through improvisation and imitation rather than through accident or as conveyors of information' (2000: 135). She further explains how rhyme-like structures (repetitive structures at the ends of phrases) and thematic material perhaps serve as 'a mnemonic device in the context of a rapidly changing oral culture,' speculating that 'sexual selection is the driving evolutionary force behind song changing' (2000: 135).

Moreover, Payne suggests that the creative processes she studied in listening to this vast amount of material are due to improvisatory skills. 'Like improvisation in human music, changes seem to be generated by an internal process, and as in music, the imitation that then occurs reveals listening and learning. Song changing in whales seems to be a clear example of cultural evolution in a nonhuman animal' (2000: 142). She proposes that the kinds of changes observed in whale song may be similar to a concept in human psychology known as optimal mismatch, which balances conformity with originality in the process of change in the songs of these 'composers' (2000: 142).

The irrepressible need to create in whale and bird song is, according to Martinelli, an example of what he feels is deeply adaptive and dramatically evident. Emphasizing the need for neophilia, he explains that creative display in song is inherently different from ordinary communication, which 'tends to economy of expression' (2009: 195). Martinelli argues that,

[a]n animal utters as many signs as are the things she/he wants to communicate. Not a single sign in excess. It is thus very intriguing that an aesthetic form of communication is, in human and nonhuman animals, so rich and 'wasteful.' All of a sudden, the demand for economic signs disappears, and we see the message flourishing, becoming redundant, creative, and playful. All the ars retorica that animals spare during ordinary communication explodes in the aesthetic communication, and displays a potential that says a lot about the how and why of music (195).

If, as Mâche also suggests, there are also avian behaviours that appear to represent an intrinsic pleasure in singing-behaviours that overshoot the mark for the primary biological purpose of the song (2000: 478)-then creative play in song may be one of the elements that links the human and animal worlds. As Bickerton says, 'It is . . . possible that music may turn out to contain elements specific to our species mingled with other elements that may be much more widely shared' (2000: 156). Discussing the significance of elaborate avian vocal displays, Mâche also hints at the idea of jouissance for birds and, by implication, for humans.

The luxurious display of some of the best singers suggests that they go far beyond the signals that would be necessary for keeping a territory or mating. Could we interpret birdsong, and consequently music, as a case of hypertelia? [This] implies that the whole elaboration of a culture, meaning a collective structure of symbolic imagination, might stem from this lavishness of nature exceeding its limited basic purposes. Diversity in song may at first have allowed an individual to prevail over a competitor, before gradually overshooting the mark. In that case, the excess would have turned not into a disadvantage but an unexpected pleasure (2000: 478).

And, as Martinelli argues, the pleasure associated with neophilia is indeed a biological advantage and, therefore, a survival mechanism for all species (2009: 191).

Adaptation versus Technological Spin-off

In addition to the direct value of theories about musilanguage and neophilia for ethnomusicologists and scholars of oral traditions, other biomusicological issues are also potentially interesting to music scholars, notably the debate about whether music is an adaptation or a technology. Miller asks, 'Is human music a legitimate, complex, biological adaptation? If it is not, it might be explicable as a side effect of other evolutionary or cultural processes. But if it is, the rules change: complex adaptation can evolve only through natural selection or sexual selection . . . That's it. There are no other options' (2000: 334).

Is it possible that music could be both a technology and an adaptation, depending upon the kind of music under discussion? As Herbert suggests, '[D]ifficulties in making the case for the adaptive value of music [arise when] . . . music is regarded as an exceptional trait rather than a general capacity' (2011: 171). The model proposed earlier in this paper allows one to see that music and language constructs in the far upper portions of quadrants I and II are non-technologized sounds characteristic of societies where music is considered a general capacity and might be considered adaptive, whereas societies that have technologized music and language through notation and writing (see towards the bottom of quadrants III and IV) might consider music and literature as technologies created by exceptional members of society. Since the scientific burden is to prove that music is an adaptation rather than a technology (Patel 2008: 400), I will proceed first by looking at the research that seems to point to the idea of music as an adaptation, using ethnomusicological and other musical sources to illustrate my points.

Based on the musilanguage model, Brown argues that human music evolved because 'groups of musical hominids outsurvived groups of nonmusical hominids due to a host of factors related to group-level cooperation and coordination' (2000: 297). Moreover, he argues that the ability to create pitches and perform rhythmically allowed for group coordination and promoted 'interpersonal entrainment, cooperative movement, and teamwork' (2000: 297).

Dissanayake also believes that music is adaptive, but, as mentioned earlier in the paper, uses a different argument than Brown (2000: 389-90), suggesting that the origin of music is the early hominid bonding between mother and infant that has been so persuasively demonstrated through modern research in mother-infant studies (Dissanayake 2000, 2010). Nonetheless, Dissanayake's theory about mother-infant interactions as the origin of music also supports Mithen's views about infant directed speech (IDS) and, ultimately, Brown's theory of musilanguage. Mithen explains that 'the usual melodic and rhythmic features of spoken language-prosody-are highly exaggerated so that our utterances adopt an explicitly musical character . . . [Hence] on the basis of child development, it appears that the neural networks for language are built upon or replicate those for music' (2006: 69). In other words, Mithen is suggesting a biologically-driven role for music that actually precedes and contributes to the development of language, which is generally believed to be an adaptation.

Significantly, Cross and Morley (2010), Brown (2000), Dissanayake (2000, 2010), and Mithen (2006) all argue for a highly significant evolutionary role for music. Herbert summarizes the reasons to support the adaptive function of music and the arts as follows: '[T]heir universality, long historical lineage, the way they articulate important aspects of life . . . , the large amount of time and resources devoted to them . . . , the appearance of proto-artistic behaviours in infants, and the fact that, like other adaptive behaviours, they afford pleasure' (2011: 168).

Music as a Technology

In contrast to the argument that music is adaptive, Patel argues that, based on current evidence, music does not seem to be a biological adaptation (2008: 400). Steven Pinker offers the following evaluation of music:

As far as biological cause and effect are concerned, music is useless. It shows no signs of design for attaining a goal such as long life, grandchildren, or accurate perception and prediction of the world. Compared with language, vision, social reasoning, and physical know-how, music could vanish from our species and the rest of our lifestyle would be virtually unchanged. Music appears to be pure pleasure technology, a cocktail of recreational drugs that we ingest through the ear to stimulate a mass of pleasure circuits at once (2009: 528).

However, Patel emphatically states that music is not simply a 'frill, a hedonic diversion that tickles our sense and that could easily be dispensed with' (2008: 400). Rather, he insists that the choice is not between adaptation and superfluous indulgence, but between adaptation and technology.

Homo sapiens is unique among all living organisms in terms of its ability to invent things that transform its own existence. Written language is a good example: This technology makes it possible to share complex thoughts across space and time and to accumulate knowledge in a way that transcends the limits of any single human mind . . . I believe music can sensibly be thought of in this framework . . . as something that we invented that transforms human life. Just as with other transformative technologies, once invented and experienced, it becomes virtually impossible to give it up (2008: 400-401).

If written language is a good example of a transformative technology (assuming that spoken language is a biological adaptation), could notated music be a comparable technology? The distinction between notated versus oral/aural musics might well provide some of the initial clues to unlock our understanding about the possible adaptive role of music. As Nettl explains, 'Identification of universals depends on definitions of music, of musical units analogous to culture units, and on an interculturally valid concept of music, all problematic issues' (2000: 463). The failure to recognize the technologizing of music has problematized biomusicological research, hindering the discussion of music as an adaptation or technology.

I submit that one of the primary difficulties in defining music is rooted in the fact that the most researched music in academia is WENAM, which happens to be unique among world music cultures because it has been more technologized through notation than any other musical tradition. Without explaining exactly why WENAM is different, Molino argues that WENAM should not be considered in research about the evolutionary origins of music. He explains that '[o]ur conception of music, based on the production, perception, and theory of "great" European classical music, distances ourselves irremediably from the anthropological foundations of human music in general' (2000: 170). But how can we simply excise WENAM from a discussion of the biological roots of music? Surely a theoretical model for human music must explain WENAM.

The answer to this dilemma may be in recognizing that WENAM is an extreme example of a musical-literate tradition from a musically stratified society-a clear case of music as a technology. WENAM has over a thousand years of notational history, representing a notationally technologized sound that does for music what literature has done for language. While the vast majority of languages are oral and only a certain percentage have ever been written (Ong 2006: 7), the situation with music is even more dramatic. There are even fewer musics that are notated and therefore perceived and taught differently; the vast majority of musics are primarily oral/aural. And what is the effect of notation on the cognitive processes used in learning and transmitting notated music?

Historical musicologists who study the medieval period are in the best position to address the earliest documented connections between oral performance and the use of notation. Treitler (2003), Boynton (2003), Levy (1998), Jeffery (1992), and other medievalists have written extensively on the changes from oral/aural to written traditions. Busse Berger's book, Medieval Music and the Art of Memory (2005), explains this fundamental shift in musical performance due to the emergence of visual notation. She writes, 'Rhythmic notation led to a new way of composition. It led to what Jack Goody would call "visual perception of musical phenomena" . . . just as writing led to word games and crossword puzzles, notation led to notational games' (2005: 250). The transition from oral/aural performance to performance that utilizes notated scores began a process whereby music writing became increasingly complex and, therefore, the learning and transmitting of music was fundamentally changed. While we cannot scan the brains of monks who learned to sing with notation and compare them to those who sang without notation, one can assume that the auditory processes involved in learning and transmitting music were affected when the technology of notation was introduced.

Mithen furthers this argument by paraphrasing Blacking's (1974) statement that technological development also leads to musical complexity and then exclusion. 'When the technical level of what is defined as musicality is raised, some people will become branded as unmusical, and the very nature of music will become defined to serve the needs of an emergent musical elite . . . who have mistakenly come to believe that this type of musical activity is music, rather than just one type of musical activity' (2006: 270-271). Thus, the musical activity of WENAM is an example of a transformative technology that represents a musical elite. Lehmann, Sloboda, and Woody (2007) raise an interesting point that reinforces the differences between what I would call a 'musically-stratified' society, characterized by musical complexity and exclusion, and a 'musically egalitarian' (my term) society in which musical activity is shared by all members.

Anthropologists and ethnologists have provided rich descriptions of musical behaviors that differ from our Western experiences. Compare, for example, the average level of performance of a South African and a German adult. The African adult might know many songs (some even polyphonic), can perform rather complicated rhythms vocally and in dance, and has no qualms about participating in a public musical event . . . In contrast, the average German adult will have learned to play simple tunes on the recorder in primary school, will know very few songs, will likely hand clap on beats 1 and 3 regardless of the music, and will be scared to death to perform in public (2007: 17).

What are the characteristics of musics in a musically-egalitarian society? And, in contrast, what is the significance of the process of learning and performing notated music among musical elites in a musically-stratified society? How is notated music different from music produced with no visual notation? Or music with only a bare-bones visual notation? Or music with a text with musical information embedded in the poetic forms? By carefully distinguishing musical and linguistic forms representing all parts of the model, scholars can begin to differentiate which forms of music and language are being considered and why and how they are different from one another.

The cognitive basis for musical forms that appear to be adaptive, compared to those that appear to be technologies, is explained by Molino as follows:

On the one hand, there are mental modules that underlie the capacities that appeared during human evolution as adaptations to the environment. On the other hand, we can turn to cultural transmission, which, although dependent on the biological evolution of mental modules produced through environmental adaptation, deals with objects that are susceptible to directed evolution (2000: 167-8).

Most of the examples from musically-stratified cultures that use notation would qualify as cultural products susceptible to directed evolution and independent of the rules that would qualify them as an adaptation (Miller 2000: 334). Examples of musical forms from musically-egalitarian societies (and particularly forms that do not use notation) would qualify as examples reflecting the mental modules that underlie the apparently adaptive capacities which appeared during human evolution.


Contemporary ethnomusicologists could rightly be charged with ignoring the evolutionary roots of human music. Instead of continuing in the traditions of comparative musicology, which was fraught with many racialist ideas concerning the superiority of European classical music vis-à-vis non-European 'primitive' musics (a bias that continues today), modern ethnomusicologists have chosen to create a rich culturally-based database of research. However, I believe that biomusicological research provides a number of intriguing theories that ethnomusicologists might consider.

First, biomusicological insights regarding the relationships between language and music are revolutionary for ethnomusicologists interested in oral traditions. The implied evolutionary role for music and its inextricability with language encourage a new paradigm for orality research, which has heretofore minimised the role of music in the ancient and ubiquitous form of oral/aural communication. Additionally, neuroscientific research that stresses the cognitive inter-relationship of language and music further supports the expansion of the oral-literate paradigm.

Second, biomusicological and zoomusicological research on creative play among animals lends credence to the importance of creative play in human music. If, indeed, it can be demonstrated that creative play is adaptive not only for animals but for humans as well (Mâche 1992: 123-4), then neophilia-the irrepressible desire for novelty and creativity-may be necessary for human survival. As Martinelli reminds us,

[h]aving fun and pleasure during a given activity and practicing an aesthetic inclination of some kind are biological advantages . . . It looks like music, or in general an aesthetic attitude towards sounds, is a nearly endless source of welfare both for human and nonhuman animals (2009: 191).

While biomusicological theories clearly offer both challenges and insights for culturally-based musical research, biomusicological theories could also be both strengthened and challenged by looking at examples of language-music constructs in living, human cultures and in music history. The rich database provided by ethnomusicological research has yet to be fully explored by scholars in biomusicology, and, as a corollary to this point, the development of any theory that purports to explain music as a human phenomenon is obscured and delayed by ignoring WENAM and the historical and notational records of music. By understanding how and why WENAM and other notationally-based musical systems are different from primarily aurally-based musical traditions, scholars will be in a better position to participate in the debate between adaptation and technology.

Hence, biomusicologists can benefit from considering the vast ethnographic and historical records on musical data to test, refine, and expand upon their theories about the origins of music. Music scholars can also benefit from the insights into musilanguage, neophilia, and other theories that can validate and enhance existing research in music-language issues, and also offer new directions for music scholars. Most importantly, if music is found to be an adaptation, the notion that music is 'a cocktail of recreational drugs that we ingest through the ear to stimulate a mass of pleasure circuits at once' (Pinker 2009: 528) would be, as Mithen claims, 'spectacularly wrong' (Mannes 2009). Recognizing a deeply-rooted, cognitive, and biological basis for music would mean that society generally, and musical and academic institutions specifically, would have to alter the current view of music as an intriguing but ultimately non-essential area of human culture-a change that would have far-reaching implications for performers, composers, scholars, and listeners.

Finally, the creative behaviour of whales and birds points to a communicative form that may well be linked to human musical creativity. While I am suggesting a need to distinguish between the musics that appear to be biologically-adaptive and those that are technologically- and culturally-directed, I believe that the underlying importance of neophilia is evident in all forms of music-oral/aural, notated, and every variant in between-and the neophilic instinct may be one of the most important connections between the human and animal worlds. It is the jouissance of creative imagination that links us to the hypertelia of bird song and the artistry of whale song, suggesting that musical joy may be necessary for both human and animal well-being.