Corpus Driven Language Learning English Language Essay

On 25 June 2009, the American singer Michael Jackson died after suffering cardiac arrest. Jackson, referred to as the King of Pop, is recognized as the most successful entertainer of all time. His contribution to music, dance and fashion, along with a much-publicized personal life, made him a global figure in popular culture for over four decades. Despite his contribution to pop culture, his personal life, including his changing appearance, personal relationships and behavior, generated controversy. He was accused of child sexual abuse in 1993 and 2005. His sudden death in 2009 triggered a global outpouring of grief.

The study aims to find out public attitudes towards the death of Michael Jackson. I analyze the corpus using Sinclair's (2004) model of five categories of co-selection of a lexical item to find out whether public comments on his death are positive or not. The study follows the corpus-driven approach.

Sinclair's (2004) five categories of co-selection are applied to analyze the extended meanings of the lexical item. The research question of this study is: what are the extended meanings (i.e. lexical, grammatical, semantic and functional) associated with the most frequent lexical word in the corpus under study? "The principal unit of meaning is called the lexical item" (Sinclair, 2005). This suggests that words establish a meaningful relation with the words around them. Whenever a word is used repeatedly and in varied contexts, its meaning is likely to ramify and shift. Thus, a word acquires new meaning in a new context.

The corpus

The corpus is self-compiled. The data were obtained from Wisenews and consist of newspaper articles about the death of Michael Jackson. The keywords entered in Wisenews were "death", "Michael" and "Jackson". A total of 57 newspaper articles dated 26 June 2009, the day after Michael Jackson's death, were collected to compose the corpus. The corpus contains 38,562 words.

Analysis of lexical item of co-selection

A frequency wordlist was generated to find the most frequent lexical word. The most frequent lexical word is Jackson, with 570 instances (1.48% of all words in the corpus); it ranks tenth in the wordlist. A total of 83 concordance lines were randomly selected in order to describe the lexical item in the corpus. Most uses of Jackson refer to the singer Michael Jackson (74 instances, 89.2%), while the others (9 instances, 10.8%) refer to the Jackson 5, Michael Jackson's former band.

To identify the most frequent lexical word associated with the corpus, WordSmith 5.0 was used to generate a word frequency list. Table 1 lists the ten most frequently occurring words, which are the likely source of lexical cohesion both within and across the texts in the corpus.
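The word frequency list in this study was generated with WordSmith 5.0. As a rough illustration of the same idea, the following Python sketch counts word tokens and reports each word's rank, frequency and share of the corpus. The function names, the simple tokenization rule and the toy text are assumptions made for illustration; they are not WordSmith's procedure and the toy text is not part of the corpus.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into word tokens (letters, digits, apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())

def frequency_list(text, top_n=10):
    """Return (rank, word, count, percent of all tokens) for the top_n words."""
    tokens = tokenize(text)
    total = len(tokens)
    counts = Counter(tokens)
    return [
        (rank, word, count, round(100 * count / total, 2))
        for rank, (word, count) in enumerate(counts.most_common(top_n), start=1)
    ]

# Toy text standing in for the corpus (invented for illustration only).
sample = "Jackson was a singer. The world remembers Jackson and the music of Jackson."
for row in frequency_list(sample, top_n=3):
    print(row)
```

The percentage column is computed in the same way as the figure reported for Jackson: 570 instances out of 38,562 words gives 1.48%.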

Table 1. The ten most frequently occurring words in the corpus

[Table contents missing from the source; only the column heading "Rank Order" survives.]

After identifying the most frequent lexical word in the corpus from the word frequency list, I examine the corpus further by reading the concordance lines of that word in order to describe the lexical item. Sinclair's five categories of co-selection are the basis for the description. The study begins by examining the concordance lines containing the most frequent lexical word, Jackson. The value of reading concordance lines is that "the origin of meaning is in the text, the selection and co-selection of words."
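Concordance lines of the kind read in this study can be sketched in Python as follows. The code builds key-word-in-context (KWIC) lines for a node word and draws a random sample from them, as was done with the 83 lines analyzed here. The function names, the context width, the seed and the toy text are assumptions made for illustration, and the sketch ignores sentence boundaries.

```python
import random
import re

def kwic(text, node, width=4):
    """Return one concordance line (left context, node, right context) per hit."""
    tokens = re.findall(r"[A-Za-z0-9']+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

def sample_lines(lines, k, seed=0):
    """Randomly select up to k concordance lines; the seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(lines, min(k, len(lines)))

# Toy text standing in for the corpus (invented for illustration only).
sample = "Fans said Michael Jackson was a great entertainer. Jackson died on Thursday."
for left, node, right in kwic(sample, "Jackson", width=3):
    print(f"{left:>25}  {node}  {right}")
```

Aligning the node word in a central column, as the print statement does, is what allows the words to its left and right to be compared line by line.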

As Table 1 shows, Jackson is the most frequent lexical word in the word frequency list: it ranks tenth, with 570 instances in the corpus. Since all of the news articles report the death of Michael Jackson immediately after it occurred, the dominance of this proper noun is not surprising. Figure 1 shows sample concordance lines for Jackson.

Figure 1. Sample concordance lines for Jackson

Most of the uses of Jackson (74 out of 83 instances, 89.2%) are the surname of the American singer Michael Jackson. The study focuses on this usage in describing the five categories of co-selection. The other instances (9 out of 83, 10.8%) refer to the Jackson 5, the band Michael Jackson belonged to in his early days. This usage is not analyzed in the study.

Collocational pattern

A collocational pattern can be observed in the combination of the most frequent lexical word, "Jackson", and its collocates. The basis for the observation is the frequency of co-occurrence: a combination must recur more than twice to form a pattern. A collocate co-occurs with the word in a relatively fixed linear position. Studying the R1 and L1 positions of "Jackson" (the words immediately to its right and left), it is observed that there is no strong collocational pattern with the word. The words in the vicinity of "Jackson" are "Michael" (172 times out of 570, 30.2%), "was" (66 times, 11.6%), "his" (14 times, 2.5%), "died" (10 times, 1.8%) and "dead" (5 times, 0.9%).
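Counting the words at the L1 and R1 positions can be sketched as below. The code tallies the word found at a fixed offset from each occurrence of the node and expresses each count as a percentage of the node's frequency, which is how figures such as 172 out of 570 (30.2%) are obtained. The function names, the min_count parameter and the toy text are assumptions made for illustration, and the sketch does not stop at sentence boundaries.

```python
import re
from collections import Counter

def position_collocates(text, node, offset):
    """Count the words found at a fixed offset from the node (-1 = L1, +1 = R1)."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        j = i + offset
        if token == node.lower() and 0 <= j < len(tokens):
            counts[tokens[j]] += 1
    return counts

def with_percentages(counts, node_frequency, min_count=1):
    """Return (word, count, percent of node frequency), keeping counts >= min_count."""
    return [
        (word, count, round(100 * count / node_frequency, 1))
        for word, count in counts.most_common()
        if count >= min_count
    ]

# Toy text standing in for the corpus (invented for illustration only).
sample = "Michael Jackson was a star. Michael Jackson died. Fans said Jackson was magical."
l1 = position_collocates(sample, "Jackson", -1)
r1 = position_collocates(sample, "Jackson", +1)
print(with_percentages(l1, node_frequency=3))
print(with_percentages(r1, node_frequency=3))
```

Setting min_count=3 would keep only the collocates that recur more than twice, the threshold used above for a pattern.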

Colligational pattern

A colligational pattern refers to the co-occurrence of grammatical choices. When no regularities are found in lexical choice, the observation moves to a higher level of abstraction, word class. Colligation concerns the interrelationship between meaning and grammatical choice. It is observed that there is no strong colligational pattern with the word "Jackson". At the R1 position, it is often followed by a form of the verb "to be", such as "was" (66 times, 11.6%) and "is" (10 times, 1.8%). It is also followed by other verbs, for example "had" (14 times, 2.5%), "died" (10 times, 1.8%) and "has" (7 times, 1.2%). At the L1 position, the word "Jackson" is often preceded by a verb. These verbs are "said" (14 times, 2.5%), "meet" (4 times, 0.7%), "remember" (4 times, 0.7%), "see" (4 times, 0.7%) and "like" (3 times, 0.5%).
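The move from individual words to word classes can be illustrated with a short Python sketch that sums collocate frequencies by word class. The counts are the R1 figures reported above; the word-class lookup table is a hypothetical, hand-written one covering only the collocates mentioned in this section, and the function name is likewise an assumption. A full analysis would need a part-of-speech tagger rather than a fixed list.

```python
from collections import Counter

# Hypothetical hand-written lookup, covering only the collocates reported in this section.
WORD_CLASS = {
    "was": "verb 'to be'", "is": "verb 'to be'",
    "had": "verb", "died": "verb", "has": "verb",
    "said": "verb", "meet": "verb", "remember": "verb", "see": "verb", "like": "verb",
}

def colligation_counts(collocate_counts, lexicon=WORD_CLASS):
    """Sum collocate frequencies by word class; unknown words go under 'other'."""
    totals = Counter()
    for word, count in collocate_counts.items():
        totals[lexicon.get(word, "other")] += count
    return totals

# R1 counts as reported in the text.
r1_counts = {"was": 66, "is": 10, "had": 14, "died": 10, "has": 7}
print(colligation_counts(r1_counts))
```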

Semantic preferences

Semantic preferences refer to regularities in the semantic meaning of a lexical item. When no obvious regularities are found in the colligational patterns, observation moves to a higher level of abstraction, where regularity in meaning is sought. The basis for semantic preference is the frequency with which a meaning recurs. It concerns the interrelationship between meanings.

"Jackson" is often associated with phrases such as "a phenomenal talent", "a great entertainer", "one of the greatest singer", "musical genius", "most popular pop personage of his era", "a perfectionist", "reclusive superstar", "totally professional" and "magical". This shows that "Jackson" in the corpus has the semantic preference of musical icon.

"Jackson" is also associated with phrases such as "the best concert", "groundbreaking performance", "musical and pop cultural accomplishment", "cut through musical racial lines", "fused classic R&B to pop", "massive success", "made meaningful art", "inspiration to rock, pop, R&B singers, dancers", "brought dance into everybody's home" and "too important to the history of pop music". This shows that "Jackson" in the corpus also has the semantic preference of memories and accomplishments.

Semantic prosody

Semantic prosody refers to the emotional and attitudinal cue of a discourse. It is not explicitly stated, but implied in the discourse. The scope of semantic prosody is the whole discourse rather than individual words. It is the highest level of abstraction, as it concerns the interrelationship between meaning and intention. The initial choice of semantic prosody is the functional choice which links meaning to purpose. All subsequent choices within the lexical item relate back to the prosody (Sinclair, p. 34).

Examples of words surrounding "Jackson" in the concordance lines are "should not be dead", "can't be dead", "meant a lot to many", "so wrong that he's gone", "never think", "still remember", "honor", "recalled", "shed tears", "pay tribute", "passed so suddenly", "unbelievable", "dead at this young of an age", "a surreal moment" and "a very sad day". The effect of such words is to underscore that the death of Michael Jackson was shocking to the public. Recalling his accomplishments made people grieve. Thus, shock and grief form the semantic prosody associated with "Jackson".

Taken together, the semantic preference and prosody of "Jackson" show that the death of the musical icon was shocking to the public. Michael Jackson's death was a grievous loss to many people as they recalled his memories and accomplishments.

The core

Since "Jackson" has no strong collocates, "Jackson" itself remains the core of the lexical item.


This is a corpus-driven study which aims to describe the most frequent lexical word in the corpus by applying Sinclair's (2004) five categories of co-selection. The study has shown that, in the corpus, the lexical item becomes cumulatively associated with shock at the death of a musical icon, Michael Jackson. It also becomes associated with grief as the memories and accomplishments of Michael Jackson are recalled.