This paper explores the weakened validity of criterion-test's reading in the New English Curriculum (NEC). Many studies have been conducted on criterion-test's reading in NEC, but hardly any of them examines its validity. This study covers validity theory, criterion-test's reading and NEC. Its significance lies in providing test writers with constructive suggestions for improving the validity of criterion-test's reading in NEC.

Key Words: validity, criterion-test's reading, NEC, TBLT

1. Introduction

Reading plays an important role in learning English, and much information can be obtained through reading. As a result, the study of the reading process and of reading abilities is a significant subject in language teaching. A great deal of research has been done in this area since the 1970s, and books on it are easy to find, such as Progress in understanding reading (Stanovich, 2000), Success in English Teaching (Davies, 2000), Measuring reading abilities: concepts, sources and applications (Pumfrey, 1977), Reading comprehension (Orasanu, 1986), More than meets the eye: foreign language learner reading (Barnett, 1989), Interactive Processes in Reading (Perfetti, 1981), Exploring second language reading issues and strategies (Anderson, 2004) and Teaching reading skills in a foreign language (Nuttall, 2002). Of these books, only Interactive Processes in Reading and Measuring reading abilities consider reading abilities.

What's more, the techniques and principles of constructing English reading test papers are well explained in many works, such as Testing for language teachers (Hughes, 1989), An introduction to language testing (Zou, 2000), Fundamental considerations in language testing (Bachman, 1990), Testing in practice (Bachman, 1999), Principles of language (Davies, 1990), Testing as a foreign language (Kong, 2005), Language testing and its methods (Liu, 2000) and Language testing theories and methods (Shu, 1999).

This paper explores the weakened validity of criterion-test's reading in NEC. The author intends to improve the validity of the reading test so that it suits both NEC and TBLT.

2. Literature Review

2.1 Reading abilities required of senior high school students

To evaluate students' receptive skills, teachers and testers have to design items which can measure the students' (Ss') abilities in reading English. It is therefore very important to specify the relevant reading abilities as accurately and completely as possible. According to the teaching syllabus (Teaching syllabus, 2002:87), middle school learners should have the following reading abilities before finishing senior high school.

Macro skills:

(1) Skimming: processing a text selectively to get the main idea(s) as efficiently as possible.

(2) Scanning: locating specifically required information. The reader looks quickly through a text, not necessarily following its linear order, to locate a specific symbol or group of symbols. The purpose of scanning is to find specific information which is relevant to an established need. It involves both quick and careful reading.

(3) Search reading: looking through a text for information on a topic, a selective process which may involve careful reading once relevant information is located.

(4) Skipping: Going from one part to another quickly by omitting some parts to find the details.

(5) Understanding explicitly stated information.

(6) Generalizing and drawing conclusions.

(7) Understanding information not explicitly stated by making inferences.

(8) Anticipating and predicting what will come next in the text.

(9) Distinguishing the main idea from supporting details.

(10) Understanding the message of graphics.

(11) Recognizing the author's purpose and attitude.

(12) Evaluating the reading materials.

(13) Identifying the main idea of the passage and selecting a proper title for the material.

Micro skills:

(1) Guessing the meaning of unfamiliar words.

(2) Understanding the reference of pronouns.

2.2 The interactive model of reading

The interactive model of reading holds that bottom-up and top-down processes interact and take place at the same time during reading.

The interactive model was proposed by Rumelhart in 1977. In this model, three elements determine a reader's reading ability: style, content and linguistic knowledge. Linguistic knowledge is the extent to which the reader can understand the structures, sentences and words. Content means his familiarity with the reading material. Style means how much the reader gets from the topic and the style of the reading material. Grabe put forward two conceptions of the interactive model. The first concerns the interaction between bottom-up and top-down processes. Stanovich summarizes the interactive model as "processes at any level can compensate for deficiencies at any other level... Higher processes can actually compensate for deficiencies in lower-level processes" (1980:36). The second concerns the interaction between the reader and the text: meaning lies in the interaction between the reader's background knowledge and the text, which helps the reader to understand the task he faces. Therefore, Murtagh claims that the best second language readers are those who can "efficiently integrate" both bottom-up and top-down processes (1989:102).

2.3 The New English Curriculum

The New English Curriculum (NEC) has been introduced all over China, and it strongly recommends Task-based Language Teaching. The NEC states clearly that the general goal of the English curriculum at the stage of basic education is to develop students' integrated language ability. It sets the goals and requirements of each teaching level in terms of what students are "capable of doing", and it emphasizes that "teachers should avoid using a single teaching method and take full advantage of Task-based Language Teaching in their teaching activities". Teachers should make sure that students have a clear task goal and that they develop their language and thinking abilities as well as their communicative and cooperative abilities, which in the end improves their integrated language ability. As a central issue in second language acquisition and English teaching research, Task-based Language Teaching has a broad basis in theory and practice, and it has had a significant impact on language teaching patterns. It emphasizes learning to communicate through communication and bringing authentic materials into the learning environment. Students pay attention not only to the language but also to the learning process itself. What's more, they should bring their own life experience to the class as an important resource and link in-class and out-of-class language learning together.

2.3.1 Task-based Language Teaching

Task-based Language Teaching is an analytic approach to syllabus design and methodology in which chains of information-gathering, problem-solving and evaluative tasks are used to organize language teaching and learning; these interdependent pedagogical tasks, which combine insights from socio-linguistics and psycholinguistic research, are designed to methodologically simulate the communicative events which learners encounter in specific second language-using environments. (Markee, 1994)

As a language teaching methodology, Task-based Language Teaching (TBLT) was developed in the 1980s. It is based on research and practice in communicative language teaching theory and language acquisition. In recent years TBLT has become well known in many countries throughout the world. The New Curriculum Standard for High School English (Tentative Edition), issued in September 2002, also clearly endorses it for classroom teaching.

What is a task?

A task is a piece of classroom work which involves learners in comprehending, manipulating, producing or interacting in the target language while their attention is principally focused on meaning rather than form. The task should also have a sense of completeness, being able to stand alone as a communicative act in its own right (Nunan, 1989).

A classroom task

A classroom task is not the same as a task in general; it should have the following features. (1) It is meaning-centered: a classroom task is designed to practise meaningful language forms. (2) It focuses on dealing with a certain communicative problem taken from the real world. Such tasks should relate closely to students' daily life, learning experiences and social reality, and should arouse Ss' interest and willingness to take part. (3) Task completion is highly valued; that is to say, whether the communicative problem is solved decides whether the task succeeds. This raises another question: what is a communicative task? A communicative task is a piece of classroom work which involves learners in comprehending, manipulating, producing or interacting in the target language while their attention is focused on mobilizing their grammatical knowledge in order to convey meaning rather than to manipulate form (Tian, 2006).

2.4 Validity

Every book on English testing gives a definition of validity. Here are some taken from different books.

(1) Generally speaking, the validity of a test is whether it tests what it is intended to measure, and whether the result of the test can measure the language abilities of the testees (Wu, 2002:16).

(2) Frankly speaking, the validity of a test is the degree to which the test measures what it is intended to measure (Zou, 2000:38).

(3) Chapin (1989: 103) describes validity as "an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores".

(4) Validity is the extent to which a test measures what it is intended to measure. Validity shows the degree of accuracy of either predictions or inferences that are based upon the scores of a test (Gao, 1996:76).

(5) Briefly, the validity of a test is the extent to which it measures what it is supposed to measure and nothing else (Heaton, 1975:159).

The definitions above share one point: whether a test is valid depends on whether it accurately measures what it is intended to measure.

Pumfrey (1977:50) offered a vivid metaphor for the validity of a reading test. He wrote as follows:

In one sense, a reading test is like a gun aimed at target area. We know that the gun is directing its missiles in intended direction. By analogy, if the reading test items are pertinent to the purpose we have in mind, the reading test is "on target". However, by itself this is not enough. Despite being aimed at the center of a target, the bullets shoot from a close cluster to a wide scatter. This spread of shots is mainly a function of the characteristics of the gun and the bullets. If it provides a very close cluster of bullet holes in the target, this close cluster indicates that the weapon is reliable in an important sense. But what if the cluster is near the edge of the target rather than at the center? In this case, the gun is reliable as its bullets consistently give the same "score" (a low one), but the weapon is not valid for the purpose of directing shots to the center of the target. It does not meet the user's requirements. Again, by analogy, it is possible to administer a reading test to a group of children and obtain consistent results, but the results may not be pertinent to the purpose you have in mind.

What we can conclude from the above statement is that a test is useless without validity. We should therefore make sure that a reading test effectively measures the reading abilities of the test-takers.

2.4.1 Types of validity

Since validity has many different aspects, a wide range of names and definitions has sprung up like mushrooms. Fortunately, they can be divided into rational (content), empirical and construct validity. The four kinds of validity given by Arthur Hughes are discussed in the following.

Content validity

Content validity shows the extent to which a test reflects its purpose and yields the results expected. For example, a bus company wants to employ bus drivers who can use English. If the test contains far too little of the English that the drivers will use later, we can judge that the test lacks content validity, because the test result cannot match the goal of the test.

Construct validity

A test has construct validity if it reflects the structure of an ability as described in theory. Here the test constructor has to try his best to show that the items in the reading test are relevant and sufficient samples of reading behavior.

Concurrent validity

When we want to see how far the results of a test agree with a separate assessment of the candidates' ability, we examine its concurrent validity.

Two things should be ensured before establishing concurrent validity. First, the researcher must collect enough data to make sure the result is sound. Second, the criterion test and the paper compared with it should be similar in difficulty.
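
The agreement between two sets of scores is commonly summarized as a correlation coefficient. The following is only a minimal sketch of that idea and is not a procedure taken from the works cited here. The function name pearson_r and the two score lists are hypothetical and invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("scores must vary within each list")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores of six candidates (invented data, not real results).
reading_test = [12, 15, 9, 18, 14, 11]   # paper under study
criterion = [55, 68, 40, 80, 60, 52]     # criterion assessment

coefficient = pearson_r(reading_test, criterion)
print(round(coefficient, 2))
```

A coefficient close to 1 would suggest that the two assessments rank the candidates in much the same way. With only a handful of candidates, as in this toy data, the figure says little, which is why enough data must be collected first.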

Face validity

A test is said to have face validity if it looks as if it measures what it is supposed to measure. For instance, a test which tests vocabulary ability but which does not require the candidate to write might be thought to lack face validity.

3. The weakened validity of criterion-test's reading in NEC

Since nearly 100% of reading tests are composed of multiple-choice (M-C) items, the validity of the M-C items is in effect the validity of the reading test. The validity of the M-C items in a reading test therefore refers to the degree to which the items cover reading comprehension ability and the required reading abilities.

3.1 Advantages of M-C items

A testee's score on M-C items is the same whoever scores the test, which makes the test objective and superior in reliability.

An M-C comprehension test covers more reading tasks than other types of reading test in a given period of time.

It can be used for Ss at every level of reading development.

M-C items limit Ss' chances to "bluff" or "dress up" their answers, because the only answers allowed are "A, B, C, D".

The difficulties a student has can be found simply from the distracter he picks.

The format of the M-C test set by the tester is so clear that the candidates can see what the tester intends.

The content of M-C tests can cover language knowledge, words and phrases, skills, and reading and listening comprehension. The M-C test can therefore address a greater variety of learning targets than other response-choice formats.

The tester can adjust the difficulty of the items as required.

The M-C test is very easy to pre-test as a whole. As a consequence, the difficulty level of each item can be estimated in advance.

Ss can see the aim of the test because each item shows its test point.

The M-C test is rapid and economical because it can be scored by machine.
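
Two of the advantages listed above, pre-testing to estimate item difficulty and reading a student's difficulties from the distracter he picks, can be illustrated with a short calculation. This is a minimal sketch under stated assumptions: the proportion of correct answers is used as the difficulty estimate (often called the facility value in the testing literature), and the function names and the ten responses are hypothetical, invented for illustration.

```python
from collections import Counter


def facility_value(responses, key):
    """Proportion of candidates who chose the keyed (correct) option."""
    if not responses:
        raise ValueError("no responses to analyse")
    return sum(1 for r in responses if r == key) / len(responses)


def distracter_counts(responses, key):
    """How often each wrong option was chosen."""
    return Counter(r for r in responses if r != key)


# Hypothetical pre-test data: ten candidates' choices on one item keyed "B".
item_responses = ["B", "B", "C", "B", "A", "B", "C", "B", "C", "B"]

fv = facility_value(item_responses, "B")
wrong = distracter_counts(item_responses, "B")

print(fv)                    # share of candidates answering correctly
print(wrong.most_common())   # distracters ordered by popularity
```

In this invented example option C attracts most of the wrong answers and option D attracts none, so a tester might look at what C has in common with the passage and consider replacing D.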

3.2 Disadvantages of M-C items

M-C items do not follow the principles of the interactive teaching method. That is to say, they cannot cultivate Ss' interactive ability, and the more we use them the worse the backwash they bring to Ss.

Cheating or copying other Ss' answers is easy in an M-C test.

Guessing may have an unknown effect on M-C test scores. By blind guessing alone, anybody can expect to score about 25% to 33% on a 100-item test, depending on whether each item has four or three options. To some extent, Ss may come to rely on blind guessing to choose the correct answers.

In an M-C test Ss have almost no chance to write about the topics they are learning, because the only thing they need to do is choose the answers.

When Ss choose a wrong answer, it is not easy to judge which part of the passage or the question they did not understand.

Using only M-C items for important or high-stakes assessments may have a bad effect on the evaluation.

M-C items depart from real-life reading: in real reading the reader has to find the answers by himself, whereas an M-C item offers answers to choose from.

Suitable distracters for an M-C item are not always available.

Writing successful M-C items takes a lot of time, even for highly skilled testers. Testers also have to pay much attention to pre-testing and to statistical analysis of performance on the items.
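
The 25% to 33% figure in the point about guessing follows from simple arithmetic: with k equally likely options per item, a blind guess is right with probability 1/k. The sketch below only illustrates this calculation; the function name is hypothetical, and it assumes every item has the same number of options and exactly one correct answer.

```python
def expected_chance_score(n_items, n_options):
    """Expected number of items answered correctly by blind guessing alone."""
    if n_items < 0 or n_options < 2:
        raise ValueError("need a non-negative item count and at least 2 options")
    return n_items / n_options


# A 100-item test with three-option and with four-option items.
for options in (3, 4):
    score = expected_chance_score(100, options)
    print(f"{options} options: about {score:.0f} of 100 items by guessing")
```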

3.3 Why do we still choose M-C items?

Ss' reading abilities can be properly tested with the help of the M-C technique. Unlike the testing of listening, speaking and writing skills, the testing of reading focuses on many reading abilities. Within a limited time, testees are asked to pick the correct answer in each M-C item, which meets the tester's need to test many reading abilities.

The M-C technique matches the present teaching situation in China well. M-C tests are widely used in other subjects, and Ss are familiar with them.

Some disadvantages of the M-C technique can be made up for. For example, by arranging Ss' seats or giving them two papers with the same content arranged in different orders, teachers can prevent Ss from cheating. What's more, by using the M-C technique together with other techniques in one reading test paper, testers can simply avoid a range of disadvantages of M-C items. At the same time, the M-C test paper will no longer be regarded as time-consuming and difficult to construct once it tests the testees' reading abilities validly.
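
Producing a second paper with the same content in a different order is easy to automate. The following is a minimal sketch, not a procedure described in the sources cited here. The function name, the seed value and the item labels are hypothetical. It assumes the items are independent of one another, so that reordering does not separate a passage from its questions.

```python
import random


def make_reordered_form(items, seed=0):
    """Return the same items in a different order.

    Needs at least two distinct items, otherwise no different order exists.
    """
    if len(set(items)) < 2:
        raise ValueError("need at least two distinct items to reorder")
    rng = random.Random(seed)      # fixed seed so the form can be reproduced
    reordered = list(items)
    while reordered == list(items):
        rng.shuffle(reordered)     # reshuffle until the order really differs
    return reordered


# Hypothetical item labels for paper A.
form_a = ["Q1", "Q2", "Q3", "Q4", "Q5"]
form_b = make_reordered_form(form_a, seed=2024)
print(form_b)
```

Keeping the seed with the paper lets the tester regenerate the same order later, which helps when mapping paper B answers back to the paper A key.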

5. Conclusion

This paper has analyzed the weakened validity of criterion-test's reading in NEC. Here I want to share my understanding of the item types used in IELTS reading, which may be of some help.

Short answer questions

Short answer questions in IELTS reading require Ss to give their answers in three to five words; similarly, the College Entrance Exam reading requires fewer than ten words. Short answer questions aim at having Ss answer the given questions in brief words rather than in complete sentences.

The advantages of using short answer questions are:

(1) High authenticity and face validity of the task.

(2) Brief keys, which make it possible to increase the number of items. At the same time, short answer questions can test different kinds of intensive reading and fast-reading abilities, with great positive backwash and limited opportunity for blind guessing.

(3) Suitability for many testing conditions, structures and levels of Ss. For this reason, short answer questions are considered to have great content validity and theoretical validity. Compared with gap-filling/cloze, short answer questions measure Ss' macro-comprehension ability, whereas gap-filling/cloze measures Ss' partial understanding ability.

Information Transfer

As a variant of short answer questions, Information Transfer usually requires Ss to pick out useful information from the text to fill in the blanks of a diagram, flow chart, table or map, etc. The answers are limited to three to five words. The advantage of Information Transfer lies in its high task authenticity, which means we can use it to gain a better picture of Ss' understanding of important information, such as the main idea, crucial details, text structure and structure words. Information Transfer is suitable for many testing conditions, structures and levels of Ss. For this reason, it is also considered to have great content validity and theoretical validity.

Matching Techniques

Matching Techniques include many types of question. The one most welcomed by IELTS is the "best heading" type, and the Matriculation English Shanghai Volume has used this type for 5 consecutive years. Matching Techniques can examine both macro-level summarizing abilities and micro-level detail-finding abilities. They can test Ss' ability to access information in the given task, which is closely related to TBLT in the New English Curriculum.

Gapped Summary

Gapped Summary requires Ss to read a passage and then fill in the blanks in a summary of it. Gapped Summary shares many characteristics with gap-filling/cloze, which means it can make Ss understand the item and can measure their ability to discard the dross, select the essential and seize the main idea.

True or False

True or False items are common in IELTS. The task gives several sentences whose meaning is similar to or different from that of the text passage and asks Ss to make a judgement. For True or False items it is convenient to choose the text and write the propositions, the scoring is objective, and the format is suitable for testing multiple levels of reading ability.