Literature Review Of The Cloze Test

Published: 6th Dec 2019


Four types of cloze tests used in the present study were investigated by comparing the original C-Test with the New C-Test (the NC-Test), and the original Modified C-Test (the MC-Test) with the New Modified C-Test (the NMC-Test) in measuring the English-language proficiency of EFL university students. The test-taking strategies obtained from group interviews were also examined so as to see how these test-takers at different language proficiency levels responded to each type of cloze test. In this chapter, further details of theoretical issues and related research articles are reviewed to give a clear understanding of the present study.

2.1 Background of the Cloze Test

The cloze test, initiated by Taylor (1953, cited in Anderson, 1976), is a kind of integrative test in which entire words are deleted either rationally or randomly. The word “cloze” was derived from “closure” in Gestalt psychology, indicating that humans are able to fill in what is missing by using their prior knowledge or their experience (Heaton, 1975; Oller, 1979; Sawatdirakpong, 1980). The students’ language proficiency can then be determined by measuring how accurately they can restore the deleted parts of the original passage (Hughes, 2003; Spolsky, 1973). Language teachers have widely used cloze tests to determine reading ability and English-language proficiency, since the cloze test is economical, easy to score, and not time-consuming (McNamara, 2000; Oller, 1979).

In the construction of cloze tests, language teachers should select a suitable passage which responds to the target students’ language abilities and the goals of the language tests. Oller (1979) clearly emphasizes that the most appropriate cloze passage should depend on the students’ ability level in a class if the purpose of the language teachers is to measure English-language proficiency. The following are guidelines on how to choose appropriate texts for constructing cloze tests: (1) the selected passage should not contain any bias, such as religion or politics; (2) the selected passage should not contain any specialized terms or content; (3) the difficulty level of the selected passage should be suitable for the target students; (4) the length of the selected passage should be sufficient for the number of items; and (5) the selected text should be complete in itself (Oller, 1979; Raatz & Klein-Braley, 2003).

For text difficulty level, many researchers and practitioners in language testing (Crawley & Mountain, 1995; Leu & Kinzer, 1995; Vacca & Vacca, 2003) recommend that a checklist or a readability formula should be employed to estimate whether the selected passage is suitable for the student’s reading level. Fry’s readability graph was proposed by Fry (1977) to estimate the readability of a reading text by using the average word length and sentence length of three sample 100-word passages from the selected passages (see Appendix A). Fry’s readability graph has been widely used due to the fact that it provides a wide range of reading grade levels from the first grade to the seventeenth grade levels (Anderson, 1983; Duffelmeyer, 1983; Fry, 1977, 1989, 1990, 2002; Leu & Kinzer, 1995; Saetung, 1984; Standal & Betza, 1990). Therefore, Fry’s readability graph enables the language teacher to adjust and develop texts before constructing and administering tests for the target students.

Regarding the deletion procedure, there are two deletion methods used in the cloze test: the systematic deletion (the fixed-ratio deletion) and the unsystematic deletion (the rational deletion) (Bachman, 1985; Chapelle & Abraham, 1990; Cohen, 1980; Klein-Braley, 1997; Oller, 1979). The former method refers to every nth word deletion which is suitable for assessing general language abilities because “all classes and types of words have an equal chance of being deleted” (Steinman, 2002, p. 293). The latter method is specific word deletion which is appropriate for a particular purpose, such as testing prepositions. Example 6 is presented in order to provide a clear understanding of both deletion procedures. In addition, Bachman (1985) found that both deletion techniques were equally reliable, although systematic deletion was more difficult than the unsystematic deletion. Nonetheless, different deletion rates affect the validity and the measurement of the cloze test (Alderson, 1979, 1980, 1983, 2000; Jafapur, 1995). For example, changing the rate of deletion in the cloze test makes it measure different language abilities (Jafapur, 1995; Weir, 1990).

Example 6: Unsystematic deletion and systematic deletion

UNSYSTEMATIC DELETION

Fill in each blank with a/an/the or no article.

People today are quite astonished by _____ rapid improvements in medicine. Doctors are becoming more specialized, and new drugs are appearing on _____ market daily. At _____ same time, _____ people are dismayed by _____ inaccessibility of doctors when they are needed.

(Adapted from Cohen, 1994, p. 234)

SYSTEMATIC DELETION

Fill in the missing word.

People today are quite astonished by the rapid improvements in medicine. Doctors _____ becoming more specialized, and _____ drugs are appearing on the ______ daily. At the same time, _____ are dismayed by the inaccessibility _____ doctors when they are needed.

(Cohen, 1994, p. 234)

Many factors, including the deletion procedure, the text length, the number of items, and the goal of the language test, influence the form of a cloze test. Oller (1979) suggests that the cloze test should generally provide 50 deleted items with a minimum length of 250 words for the passage. However, language teachers are sometimes confused about the difference between a gap-filling test and a cloze test. Both tests delete words, but they differ in how much context surrounds each deletion. In a gap-filling test, the deleted item is provided within one sentence, whereas the deleted word in the cloze test is given in a paragraph or a passage (Bailey, 1998). In addition, the gap-filling test is suitable for assessing specific language ability, such as grammar or vocabulary, while the cloze test can measure the language proficiency of the target students (Alderson, 2000). The following are six forms of cloze tests.

The fixed-ratio cloze (the random cloze): Every nth word is deleted, which makes this form suitable for assessing overall language abilities (Alderson, 2000; Bachman, 1985; Oller, 1979; Steinman, 2002). The following is an example of a sixth-word deletion cloze test.

Example 7: Fixed-ratio cloze test

FIXED-RATIO CLOZE

Fill in the missing words.

People today are quite astonished by the rapid improvements in medicine. Doctors 1)_____ becoming more specialized, and 2)_____ drugs are appearing on the 3)______ daily. At the same time, 4)_____ are dismayed by the inaccessibility 5)_____ doctors when they are needed. 6)_____ doctors’ fees are constantly on 7) _____ rise, the quality of medical 8)_____ has reached an abysmal low.

(Adapted from Cohen, 1994, p. 234)
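The fixed-ratio procedure described above can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from the cited sources: the function name fixed_ratio_cloze, its parameters, and the choice to keep punctuation attached to the blank are all assumptions made for the example.

```python
import re


def fixed_ratio_cloze(text, n=6, start=1, blank="_____"):
    """Blank out every n-th word, beginning at 0-based word index `start`.

    Returns the gapped text and the answer key (deleted words in order).
    Hypothetical helper for illustration only.
    """
    tokens = text.split()
    answers = []
    for i in range(start, len(tokens), n):
        # Split the token into leading punctuation, the word, trailing punctuation.
        lead, word, trail = re.match(r"^(\W*)(.*?)(\W*)$", tokens[i]).groups()
        if not word:
            continue  # token was pure punctuation; nothing to delete
        answers.append(word)
        # Number each blank the way Example 7 does: 1)_____, 2)_____, ...
        tokens[i] = f"{lead}{len(answers)}){blank}{trail}"
    return " ".join(tokens), answers


gapped, key = fixed_ratio_cloze(
    "one two three four five six seven eight nine ten", n=3, start=2
)
print(gapped)
print(key)
```

Choosing a later value for start leaves the opening words of the passage intact, as in Example 7, where the first sentence carries no blanks.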

The rational cloze: Only specific words are deleted, which makes this form appropriate for a particular purpose, such as testing grammar, reading comprehension, or vocabulary (Bachman, 1985). Example 8 presents a rational cloze test in which function words are deleted to assess grammar.

Example 8: Rational cloze test

RATIONAL CLOZE

Fill in the missing words.

Typically, when trying to test overall understanding 1) ______ the text, a tester will delete those words 2) ______ seem to carry the main ideas, or 3) ______ cohesive devices that make connections 4) ______ texts, including anaphoric references, connectors, and so 5) ______. However, the tester then needs 6) ______ check, having deleted key words, that they 7) ______ indeed restorable from the remaining context.

(Extracted from Alderson, 2000, p. 210)

The conversational cloze: Some words or sentences are deleted to determine the communicative language skills of native and non-native test-takers (Hughes, 2003). The student is required to fill in the blanks, as shown in Example 9.

Example 9: Conversational cloze

CONVERSATIONAL CLOZE

David: Hello, Mike. How are you?

Mike: Not too bad, David, and you?

David: O. K. You know, (1) ______________ been trying to work out (2) ______________ to go on holiday this year. (3) ______________ a real problem. I really can’t decide where to go. Any ideas?

Mike: Well, I suppose you could try the South (4) ______________ France.

David: No, I don’t really think so. I don’t know why, exactly. Maybe it’s (5) ______________ bit expensive down there.

(D. Brown, 1983, p. 159)

The multiple-choice cloze: Every nth word or specific words are deleted, and approximately two to five choices are provided for each deleted part. Because it offers choices, the multiple-choice cloze (see Example 10) is easier than traditional cloze tests (Chapelle & Abraham, 1990), although its construction seems to be complicated (Hinofotis & Snow, 1980). The multiple-choice cloze test can nevertheless be utilized for testing both specific skills and language proficiency.

Example 10: Multiple-choice cloze

MULTIPLE-CHOICE CLOZE

A farmer’s daughter had been out to milk the cows and was returning home, carrying her pail of milk on her head. As she walked along, she 1) ___________ thinking:

(A) started

(B) had to

(C) prepared

(D) began to be

(Extracted from Kaczmarek, 1980, p. 152)

The matching cloze: Each deleted word, with or without additional distractors, is usually provided in alphabetical order and put in a column on the right of the cloze passage. This form of language test, featuring ease of construction and scoring, is suitable for measuring specific knowledge of English language such as vocabulary, grammar, and reading comprehension for native and non-native elementary students (Baldauf & Propst, 1979). The students are required to match the correct word provided in the right column with the numbered blanks, as shown in Example 11.

Example 11: Matching cloze

MATCHING CLOZE

Ken and Tom like dogs.

(1) ______ like big brown dogs

(2) ______ little white dogs.

Tom (3) ______ a brown dog.

He likes (4) ______ with his dog.

His (5) ______ is running.

It’s going to school with him.

Items: a. and   b. dog   c. has   d. playing   e. they   f. Tom

(adapted from Baldauf & Propst, 1979, p. 323)

The cloze elide: Irrelevant words are added to the original text, and the students’ task is to find these additional words and delete them (Alderson, 2000; Steinman, 2002). The cloze elide test (see Example 12) is suitable for assessing reading speed, although it is very difficult to construct (Alderson, 2000).

Example 12: Cloze elide

CLOZE ELIDE

Tests are actually a very difficult to construct in this way. One has to be sure over that the inserted words do not belong with: that it is not possible to interpret great the text (albeit in some of different way) with the added words. If so, candidates will not be therefore able to identify the insertions.

(Alderson, 2000, p. 226)

The techniques of scoring cloze tests can be divided into two types: exact word scoring and acceptable word scoring. “The exact word method counts only the word deleted from the original passage as correct, whereas the acceptable word method usually counts any contextually acceptable answer as correct” (Brown, 1980, p. 311). Many researchers and practitioners in language testing (Brown, 1980; Lange & Clausing, 1981; Oller, 1972, 1979; Weir, 1990) support using exact word scoring in the cloze test because it is quick and easy to use; nonetheless, this scoring method seems to be too strict for test-takers for whom English is not the first language (Oller, 1979). Others (Abraham & Chapelle, 1992; Hinofotis, 1980; Lange & Clausing, 1981; Oller, 1979) state that acceptable word scoring is suitable for measuring the English language proficiency of EFL students, although this scoring technique is more expensive and time-consuming. Therefore, language teachers should take the testing situation and the test purpose into consideration in order to choose the most appropriate scoring method for the language test (Bachman & Palmer, 1996; Brown, 1980, 1996; Hughes, 2003).
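The contrast between the two scoring methods can be made concrete with a short Python sketch. The function score_cloze and its parameters are hypothetical names introduced for illustration. Ignoring case and surrounding spaces is an assumption of the sketch rather than part of either method as defined by Brown (1980), and the list of acceptable alternatives would in practice have to be drawn up by the test constructor.

```python
def score_cloze(responses, answer_key, acceptable=None):
    """Count correct answers under exact-word or acceptable-word scoring.

    responses:  the test-taker's answers, one per blank
    answer_key: the words deleted from the original passage
    acceptable: optional list of sets of extra contextually acceptable
                answers, one set per blank; None means exact-word scoring
    """
    def norm(word):
        # Assumption: case and surrounding spaces are ignored.
        return word.strip().lower()

    score = 0
    for i, (given, original) in enumerate(zip(responses, answer_key)):
        allowed = {norm(original)}
        if acceptable is not None:
            allowed |= {norm(w) for w in acceptable[i]}
        if norm(given) in allowed:
            score += 1
    return score


key = ["are", "new", "market"]
answers = ["are", "modern", "market"]
exact = score_cloze(answers, key)
lenient = score_cloze(answers, key, acceptable=[set(), {"modern"}, set()])
print(exact, lenient)
```

In this invented example the response "modern" is rejected under exact word scoring but credited under acceptable word scoring, which is why the latter requires a judged list of alternatives and takes longer to apply.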

Many previous studies of the cloze test indicate that the cloze test is an effective instrument which has great reliability and validity in measuring general language proficiency (Aitken, 1977; Anderson, 1976; J. D. Brown, 1983; Chavanachat, 1986; Fotos, 1991; Jonz, 1987, 1990; McKenna & Layton, 1990; Oller & Conrad, 1971; Oller, 1979; Stubbs & Tucker, 1974; Weir, 1990) and specific knowledge of the target language such as grammar and vocabulary (Alderson, 1979, 1980; Cohen, 1980; Mullen, 1979; Oller & Conrad, 1971; Oller & Inal, 1971). Other studies, however, claim that the cloze test does not measure language ability beyond the sentence level (Alderson, 1979, 1980, 1983; Bachman, 1982; J. D. Brown, 1983; Markham, 1985, 1987; Shanahan et al., 1982). Klein-Braley (1997) also points out that the construction of the cloze test requires a long passage. These problems have led to the development of a new form of the cloze test, which is called the C-Test.

2.2 The C-Test vs. the NC-Test

The C-Test, invented by Raatz and Klein-Braley (1981), is a test in which the second half of every second word is deleted and the student’s task is to restore the deleted parts, as shown in Example 13. The original C-Test was constructed as an alternative to cloze tests for testing English language proficiency. In the C-Test, exactly the second half of a word is deleted if the word contains an even number of letters, such as “m a n y” (4 letters). For a word with an odd number of letters, the larger part is deleted, such as “e x c e l l e n t” (9 letters, of which the last five are deleted). Many research findings also show that the C-Test is more effective and more reliable than the traditional cloze tests in assessing the students’ language proficiency (Babaii & Ansary, 2001; Cohen, Segal, & Weiss, 1984; Connelly, 1997; Dörnyei & Katona, 1992, 1993; Klein-Braley, 1985, 1997) and that it is easy to construct and to score (Babaii & Ansary, 2001; Connelly, 1997; Dörnyei & Katona, 1992, 1993; Weir, 1990).

Example 13: The original C-Test

THE C-TEST

Many foreigners find that Thailand is a very pleasant place to have a holiday. They disc _ _ _ _ that th _ _ _ are ma _ _ interesting thi _ _ _ to d _ and t _ see. Th _ _ say th _ _ the bea _ _ _ _ are cl _ _ _ and t _ _ scenery i _ beautiful. Ma _ _ say th _ _ the hot _ _ _ are exce _ _ _ _ _ and n _ _ too expe _ _ _ _ _. They exper _ _ _ _ _ with diff _ _ _ _ _ kinds o _ Thai fo _ _ and fi _ _ that i _ tastes deli _ _ _ _ _.

(Boonsathorn, 1990, p. 48)
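The C-Test deletion rule illustrated in Example 13 can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure of Raatz and Klein-Braley (1981): the names c_test_word and c_test are hypothetical, and the sketch leaves one-letter words intact while still counting them as positions, which is a simplification.

```python
import re


def c_test_word(word):
    """Keep the first part of a word and blank the rest.

    Even length: the second half is blanked.
    Odd length: the larger (second) part is blanked.
    """
    keep = len(word) // 2
    return word[:keep] + " _" * (len(word) - keep)


def c_test(text, every=2, start=1):
    """Mutilate every `every`-th word, beginning at 0-based word index `start`."""
    tokens = text.split()
    for i in range(start, len(tokens), every):
        # Separate punctuation so that only the letters of the word are blanked.
        lead, word, trail = re.match(r"^(\W*)(.*?)(\W*)$", tokens[i]).groups()
        if len(word) > 1:  # assumption: one-letter words are left intact
            tokens[i] = lead + c_test_word(word) + trail
    return " ".join(tokens)


print(c_test("They discover that there are many interesting things to do and to see."))
```

Applied to the second sentence of the passage, the sketch reproduces the mutilated sentence shown in Example 13. Setting every=3 gives third-word deletion in place of second-word deletion.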

Some other studies report problems in using the C-Test to measure proficiency in the target language. For example, the C-Test does not assess language abilities beyond the sentence level (Cohen, Segal, & Weiss, 1984; Sigott & Köberl, 1993), and seems to measure the intelligence quotient (IQ) or spelling ability rather than general language skills (Jafapur, 1995). Some C-Test items, especially the function words, were reported to have low discrimination power (Cleary, 1988; Jafapur, 1999; Wolter, 2002) and to lack validity (Bradshaw, 1990; Grotjahn, 1986; Jafapur, 1995). Dörnyei and Katona (1992) add that the C-Test is too difficult for EFL students at the secondary level. Consequently, Klein-Braley and Raatz (1984) proposed the following criteria to make the C-Test more reliable and valid in measuring the target language proficiency: (1) the C-Test should contain at least 100 items; (2) the deletion rate and the starting points should be fixed; (3) only exact-word scoring should be employed; (4) the C-Test should contain various passages; (5) native speakers should get a perfect score on the C-Test; and (6) the words affected by the deletion should be a representative sample of the test (p. 136).

The previous research on the C-Test focuses on its validation and its measurement of the target language, such as English. In the original study of the C-Test, Raatz & Klein-Braley (1981) examined the use of English and German C-Tests to find out whether the C-Test could be an alternative in assessing the target language. The subjects of the study were divided into two groups. The first group was composed of English native speakers, English-native-speaking schoolchildren, and non-native speakers of English. These students were requested to take the English C-Test. The second group taking the German C-Test consisted of German-native speakers, non-native speakers of German, and German-native-speaking schoolchildren at the third grade level. The results showed that the C-Test had great reliability and validity in assessing the target language of the non-native and the native test-takers.

Dörnyei and Katona (1992) also show that the C-Test is highly reliable and effective in assessing the English language proficiency of Hungarian EFL learners. Their investigation was conducted in order to validate this type of language test for EFL students. The subjects of the study were 102 Hungarian university students and 53 Hungarian secondary students, all of whom were requested to take the C-Test. The results of this study show that the C-Test is suitable for measuring the language proficiency of students for whom English is not the first language, although this C-Test was reported to be too difficult for Hungarian students at the secondary level.

Connelly (1997) supported using the C-Test to measure the general language proficiency of high-level students studying English as a foreign language. His study examined the English C-Test with non-native postgraduate students studying at the Asian Institute of Technology (AIT) in Bangkok, Thailand. The C-Test with 100 deleted items was administered to EFL postgraduate students from six different countries: Thailand, Vietnam, Taiwan, Indonesia, Japan, and Cambodia. The results of this investigation indicate that the C-Test is highly reliable and has concurrent validity in assessing English language proficiency within EFL contexts. However, the C-Test seems to be less effective for EFL students in the lower levels (Cleary, 1988; Connelly, 1997; Dörnyei & Katona, 1992).

This has led to further development of the original C-Test to make it more suitable for measuring the English language proficiency of non-native students. Thongsa-nga (1998) proposed the New C-Test (the NC-Test), in which the second half or the second part of every third word is deleted in order to provide more clues for Thai students at the upper secondary level. Thongsa-nga’s investigation examined the effect of different starting points in the NC-Tests and students’ attitudes towards the measurement using these language tests. In this study, the three forms of the NC-Test, with third, fourth, and fifth starting points, were administered to 97 Mathayom Suksa six students at Srakaew School. These participants were also requested to answer a research questionnaire about what skills the NC-Test measured: vocabulary, grammar, reading comprehension, or English language proficiency. Her findings reveal that the NC-Test with the third starting point is the most reliable form for measuring the English language proficiency of Thai Mathayom Suksa six students; nonetheless, the majority of these students reported that the NC-Test seemed to measure vocabulary and reading skills. Thongsa-nga (1998) also adds that the different starting points had an influence on the discrimination power of these three forms of the NC-Tests. Example 14 shows a comparison of C-Test and NC-Test deletion.

Therefore, the present study continues investigating the original C-Test and the NC-Test in order to determine whether these two language tests are suitable for assessing the general language proficiency of first-year undergraduate science students studying English as a foreign language.

Example 14: A comparison of word deletion between the original C-Test and the NC-Test

THE C-TEST

Many foreigners find that Thailand is a very pleasant place to have a holiday. They disc _ _ _ _ that th _ _ _ are ma _ _ interesting thi _ _ _ to d _ and t _ see.

(Boonsathorn, 1990, p. 48)

THE NC-TEST

There is a dark shadow over schools and colleges where students are now facing the enormous problem of drugs. There seems t _ be an incr _ _ _ _ in the u _ _ of alcohol, tob _ _ _ _ and other dr _ _ _ by students.

(Thongsa-nga, 1998, p. 43)

2.3 The MC-Test vs. the NMC-Test

The Modified C-Test (the MC-Test), also known as the X-Test, was initiated by Boonsathorn (1987). The original MC-Test is a test in which the first half of every second word is deleted and the students are requested to fill in all the deleted parts, as can be seen in Example 15. In the MC-Test, if the word contains an even number of letters, exactly the first half of the word is deleted, such as “i n c ome” (6 letters, of which the first three are deleted). For a word with an odd number of letters, the larger part is removed, such as “o b v i ous” (7 letters, of which the first four are deleted). In addition, some research findings report that the MC-Test had high reliability and validity in measuring grammatical competence (Prapphal, 1996) and the English language proficiency of non-native-speaking test-takers (Boonsathorn, 1987; Wonghiransombat, 1998). Nonetheless, Sigott and Köberl (1993) point out that the MC-Test does not measure language abilities beyond the sentence level and seems to be too difficult for EFL test-takers.

Example 15: The original MC-Test

THE MC-TEST

Many foreigners find that Thailand is a very pleasant place to have a holiday. They _ _ _ _ over that

_ _ _ re are _ _ ny interesting _ _ _ ngs to _ o and _ o see. _ _ ey say _ _ at the _ _ _ _ hes are _ _ _ an and _ _ e scenery _ s beautiful. _ _ ny say _ _ at the _ _ _ els are _ _ _ _ _ lent and _ _ t too _ _ _ _ _ sive. They _ _ _ _ _ iment with _ _ _ _ _ rent kinds _ f Thai _ _ od and _ _ nd that _ t tastes _ _ _ _ _ ious. They _ _ e delighted _ _ th Thai _ _ _ ic and _ _ _ _ _ nated by _ _ ai dancing. Visitors from all countries often say that Thai people are warm and friendly.

(Boonsathorn, 1990, p. 49)
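The MC-Test deletion rule mirrors that of the C-Test, removing the beginning of each affected word instead of its end. The following Python sketch is an illustration only: the names mc_test_word and mc_test are hypothetical, and the handling of one-letter words (left intact but still counted as positions) is an assumption of the sketch, not part of Boonsathorn’s (1987) procedure.

```python
import re


def mc_test_word(word):
    """Blank the first part of a word and keep the rest.

    Even length: the first half is blanked.
    Odd length: the larger (first) part is blanked.
    """
    cut = (len(word) + 1) // 2  # number of leading letters removed
    return "_ " * cut + word[cut:]


def mc_test(text, every=2, start=1):
    """Mutilate every `every`-th word, beginning at 0-based word index `start`."""
    tokens = text.split()
    for i in range(start, len(tokens), every):
        # Separate punctuation so that only the letters of the word are blanked.
        lead, word, trail = re.match(r"^(\W*)(.*?)(\W*)$", tokens[i]).groups()
        if len(word) > 1:  # assumption: one-letter words are left intact
            tokens[i] = lead + mc_test_word(word) + trail
    return " ".join(tokens)


print(mc_test("They discover that there are many interesting things to do and to see."))
```

Run on the second sentence of the passage, the sketch yields the same mutilated sentence as Example 15. Passing every=3 deletes the first half of every third word instead.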

In the original investigation of the MC-Test, Boonsathorn (1987) compared the MC-Test with the C-Test in measuring language proficiency in English, and examined the reading strategies of ESL students. The subjects of the study included 389 native English-speaking high school (L1) students and 104 ESL adult learners (L2) in Alberta, Canada. The two types of language tests were administered to both groups. His study showed that the MC-Test and the C-Test were highly reliable as English tests for native and non-native learners, even though the MC-Test was reported to be more difficult and to discriminate better than the C-Test. For reading strategies, only 28 ESL adult students were interviewed to report what reading strategies they used while taking these tests. Boonsathorn (1987) added that the ESL learners taking the MC-Test required more strategies than those taking the C-Test.

Other previous studies also support the use of the MC-Test in assessing English language proficiency. Köberl and Sigott (1996) compared the scores of the MC-Test with the scores of the C-Test taken by 82 native English-speaking students in the United Kingdom and 42 German learners of English, and investigated whether the “item facilities of these two tests were not influenced by whether the subjects were native and non-native” (p. 53). In addition, the results show that the item facilities in the MC-Test and the C-Test were highly correlated for both subject groups. For this reason, these two language tests are equally appropriate in measuring language proficiency in English for native and non-native test-takers.

Prapphal (1996) also constructed two MC-Tests by using General English and Academic English texts in order to find out whether the MC-Tests in the study could better measure lexical competence or grammatical competence. Both the General English MC-Test and the Academic English MC-Test were administered to 48 third-year Thai students in the science program at Chulalongkorn University. The results reveal that an MC-Test constructed from General English or Academic English is highly reliable and has concurrent validity in measuring grammatical competence; however, Prapphal (1996) claimed that these two MC-Tests seem to measure lexical competence rather than syntactic competence.

In order to adjust the original MC-Test to be more suitable for measuring English language proficiency in EFL contexts, Wonghiransombat (1998) proposed the New Modified C-Test (the NMC-Test), in which the first half of every third word is deleted, providing more clues for non-native-speaking students. A comparison between text deletion in the MC-Test and the NMC-Test is shown in Example 16. The investigation of Wonghiransombat was designed to compare the original MC-Test to the NMC-Tests with different starting points (the NMC-Test with the third starting point, the NMC-Test with the fourth starting point, and the NMC-Test with the fifth starting point). The subjects were 84 postgraduate students studying at the National Institute of Development Administration (NIDA) in Bangkok, Thailand. They were requested to take the original MC-Test and one of the three forms of the NMC-Tests. The findings indicate that the NMC-Tests with different deletions were considered to be easier than the original MC-Test and highly reliable as an alternative assessment of the overall language skills of Thai postgraduate students. Wonghiransombat (1998) also reports that the different starting points did not affect the reliability, the validity, or the difficulty of the three NMC-Tests.

The present study therefore further examines whether the original MC-Test and the NMC-Test with third-word deletion can be an alternative in measuring the English language proficiency of non-native-speaking tertiary students in the science program.

Example 16: A comparison of word deletion between the original MC-Test and the NMC-Test

MC-TEST

As inflation denotes changes in the general price levels pervading the whole economy, a number of distortions occur. One _ _ _ _ ous effect _ f inflation _ s in _ _ _ ome and _ _ _ lth distribution.

(Wonghiransombat, 1998, p. 53)

NMC-TEST

Acid rain, endangered species, lead poisoning, the destruction of the ozone layer, waste disposal – the list of environmental problems today seems endless. It’s a _ _ _ tty grim picture; _ _ _ _ ver, we must _ _ _ ept the challenge; _ _ _ re is hope.

(Wonghiransombat, 1998, p. 58)

2.4 Test-Taking Strategies

Test-taking strategies can be defined as “the processes that the test takers make use of in order to produce acceptable answers to questions and tasks, as well as the perceptions that they have about these questions and tasks before, during, and after responding to them” (Cohen, 1998, p. 216). Generally, the processes of taking language tests are divided into two types: “the process of responding” and “the reactions to items and subtests” (Cohen, 1984, pp. 71-72). The former involves the strategies that the students use while taking the language tests. For example, some students use context clues to restore the deleted parts in the cloze tests (Babaii & Ansary, 2001). The latter focuses on the test-takers’ attitudes towards the language tests. For instance, some students prefer to take the C-Test, which provides a chance of guessing, rather than to take the traditional cloze test (Weir, 1990).

In order to identify the test-taking strategies used by the target respondents, investigations can be done by observation, performance analysis, questionnaires, and interviews (Cohen, 1994). The results of test-taking strategies also enable the language teachers to validate the language test and to determine what language abilities this language test can measure (Cohen, 1994, 1998). Many previous studies of test-taking strategies concentrated on the measurement of cloze tests rather than the completion processes. Some studies (Chàvez-Oller et al., 1985; Fotos, 1991; Jonz, 1987, 1990; McKenna & Layton, 1990; Oller, 1973; Oller & Conrad, 1971; Sasaki, 2000; Storey, 1997; Yamashita, 2003) found that cloze tests were highly reliable in assessing overall language skills. The students required both syntactic and semantic information in order to fill in the cloze passage (Oller, 1979). Nonetheless, some others (Alderson, 1979, 1983; Bachman, 1982; J. D. Brown, 1983; Markham, 1985, 1987; Shanahan et al., 1982) reported that the cloze tests did not measure language abilities beyond the sentence level, because the students sometimes were able to fill in the cloze items by using only lexical competence (Alderson, 1979, 1983, 2000).

Many researchers and practitioners in language testing (Bachman, 1985; J. D. Brown, 1983; Jonz, 1990; Markham, 1985, 1987; Yamashita, 2003) were aware that cloze tests were sensitive to text-level constraints. For example, cloze items did not all contain the same information; this depended on the types of words deleted (Alderson, 1980; Bachman, 1982, 1985; Jonz, 1990; Yamashita, 2003). Some items, such as prepositions and articles, could be restored by using only “linguistic knowledge” while some others, such as anaphora, lexical repetition, and conjunction, required “textual understanding” (Yamashita, 2003, p. 268). Therefore, students probably use different strategies based on the type of deleted words. Bachman (1985) developed a framework dividing cloze items into four categories according to the textual information that students use while answering each cloze item: (1) Within Clause; (2) Across Clause, Within Sentence; (3) Across Sentences, Within Text; and (4) Extratextual.

Bachman (1985) studied performance on fixed-ratio and rational cloze tests in order to examine what language skills cloze tests with different deletions measured. These cloze tests were administered to 910 participants (native- and non-native-speaking college students) in Illinois. The results show that these students frequently used the ‘Extratextual’ strategy for fixed-ratio cloze tests, while the students taking a rational cloze test mostly employed the ‘Across Sentences, Within Text’ strategy. Both fixed-ratio and rational cloze tests thus have high reliability in assessing English language proficiency, although not all cloze items measured the same language abilities. Bachman also suggests that there should be further studies on the test items in the different forms of the cloze tests.

Other researchers (Sasaki, 2000; Yamashita, 2003), however, examined test-taking strategies by using verbal reports to see how the students responded to the language tests. Sasaki (2000) studied the effects of cultural schemata on how Japanese EFL students responded to unfamiliar and familiar fixed-ratio cloze passages. The subjects were 60 Japanese EFL students with the same English reading proficiency level. These students were divided into two groups; each group was required to complete culturally familiar or culturally unfamiliar cloze passages (Sasaki, 2000). The students were asked to report their test-taking strategies, which were then categorized using a modified version of Bachman’s (1985) categorization of cloze test-taking strategies. The results show that the students reading the familiar cloze passage could answer more items and used these three categories of test-taking strategies: ‘Within Clause’, ‘Across Clause, Within Sentenc