How To Test English Proficiency English Language Essay



A lack of competency in the English language is often blamed for the many challenges that overseas students face in their university study. However, differences in assessment cultures may be of equal, if not greater, importance in explaining these challenges. This essay will first discuss which test performs better in assessing candidates' English proficiency. Following this, it will compare the tests qualitatively and quantitatively to identify the merits and demerits of each. Finally, it will look at how the tests affect the English proficiency of students.

Which test performs better in assessing candidates' English Proficiency?

As there are many tests, it is difficult to determine which one performs better in assessing candidates' English proficiency. A test can be used to predict the language proficiency of the test takers, to diagnose potential problems, and for purposes such as placement and promotion. Another important role of test score analysis is to let test designers and other researchers judge the test itself: whether it has tested what it is meant to test, and whether it discriminates well between the "good" and the "poor" testees, to name just two questions. Therefore, it is necessary to analyze the test results further. A general view of the performance of the whole group can be obtained by first working out the mode, median and mean. Here the mode refers to the score which most candidates obtained; the median is the score gained by the middle candidate in the order of merit. The mean score of any test is the arithmetical average. In order to get a better picture of the group performance, the standard deviation is calculated too. It measures the degree to which the group of scores deviates from the mean; in other words, it shows how all the scores are spread out and thus gives a fuller description of test scores (Heaton, 1988).
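The four summary measures described above can be sketched in a few lines of Python. This is only an illustration: the scores are invented, the function name `describe_scores` is hypothetical, and the standard deviation is computed in its population form (dividing by the number of candidates), which is an assumption rather than something the sources specify.

```python
from statistics import mean, median, multimode, pstdev


def describe_scores(scores):
    """Return the mode(s), median, mean and standard deviation of a score list."""
    return {
        # multimode returns a list because several scores can tie for most frequent
        "mode": multimode(scores),
        # score of the middle candidate (average of the two middle ones if even)
        "median": median(scores),
        # arithmetical average
        "mean": mean(scores),
        # population standard deviation: spread of the scores around the mean
        "sd": pstdev(scores),
    }


# Hypothetical scores for eight candidates (invented for illustration only)
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe_scores(scores)
print(summary)
```

If the group is regarded as a sample of a larger population, `statistics.stdev` (which divides by one less than the number of candidates) could be used in place of `pstdev`.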

Traditionally, two measures are calculated for each objective test item: the facility value (F.V.) and the discrimination index (D.I.). The facility value measures the level of difficulty of an item, and the discrimination index measures the extent to which the results of an individual item correlate with the results of the whole test. The facility value of an item is the proportion of students who answer it correctly. If there are 100 students and 50 of them get the item right, the F.V. of the item is 0.5. If the F.V. is very high, the item may be too easy; conversely, if the F.V. is very low, the item may be too difficult. Generally speaking, items with an F.V. of around 0.5 will make up the major part of a test, but test constructors can manipulate the test contents by selecting items with the appropriate F.V. so that the test population achieves the required mean score.
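The facility value calculation can be illustrated with a short Python sketch. The function name `facility_value` and the way responses are represented (one true/false value per student) are assumptions made for this example; the figures reproduce the 100-student case mentioned above.

```python
def facility_value(responses):
    """Proportion of students who answered a single item correctly.

    `responses` holds one truthy/falsy value per student for that item.
    """
    responses = list(responses)
    if not responses:
        raise ValueError("at least one response is required")
    correct = sum(1 for r in responses if r)
    return correct / len(responses)


# 100 students, 50 of whom get the item right
responses = [True] * 50 + [False] * 50
fv = facility_value(responses)
print(fv)  # 0.5
```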

As well as knowing how difficult an item is, it is important to know how well it discriminates, that is, how well it distinguishes between students at different levels of ability. In a reading test, item discrimination is the ability of an item to distinguish between good and poor readers. An item claimed to have good discriminating power should be answered correctly by more good readers than poor readers. If an item is working well, we should expect more of the top-scoring students than of the low-scoring students to know the answer.

Although an average facility value of 0.5 is desirable for many tests, the facility values of individual items will vary considerably (Heaton, 1988). Thus, in practice, test writers generally prepare items with facility values between 0.3 and 0.7.

The index of discrimination tells us whether those students who achieve high scores on the whole test do well or badly on each item in the test. If the top students tend to do well on an item and the poor students badly, the item is a good one because it distinguishes the "good" from the "bad" in accordance with scores on the whole test. Items showing a discrimination index below 0.3 are of doubtful use because they fail to discriminate effectively.
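One common way of estimating the discrimination index is to rank the candidates by their whole-test score, take an upper and a lower group of equal size, and divide the difference in correct answers by the group size. The Python sketch below assumes this upper-lower group method, with the groups set to the top and bottom halves; the essay itself does not give a formula, and the function name and data are invented for illustration.

```python
def discrimination_index(item_correct, total_scores, fraction=0.5):
    """Upper-lower group estimate of an item's discrimination index.

    item_correct[i] is 1 if candidate i answered the item correctly, else 0.
    total_scores[i] is candidate i's score on the whole test.
    fraction is the share of candidates placed in each group (an assumption).
    """
    if len(item_correct) != len(total_scores):
        raise ValueError("both lists must cover the same candidates")
    group_size = int(len(total_scores) * fraction)
    if group_size == 0:
        raise ValueError("not enough candidates to form the groups")
    # Rank candidate indices from highest to lowest whole-test score
    ranked = sorted(range(len(total_scores)),
                    key=lambda i: total_scores[i], reverse=True)
    upper = ranked[:group_size]
    lower = ranked[-group_size:]
    correct_upper = sum(item_correct[i] for i in upper)
    correct_lower = sum(item_correct[i] for i in lower)
    return (correct_upper - correct_lower) / group_size


# Hypothetical data for eight candidates (invented for illustration only)
total_scores = [38, 35, 33, 30, 22, 20, 17, 12]
item_correct = [1, 1, 1, 0, 1, 0, 0, 0]
di = discrimination_index(item_correct, total_scores)
print(di)  # 0.5
```

Here three of the four top scorers and one of the four bottom scorers answer correctly, so the index is (3 - 1) / 4 = 0.5, which is above the 0.3 threshold mentioned above.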

Explain the differences of the tests concerning qualitative and quantitative comparisons to find out the merits and demerits of each test.

The respective testing specifications of the different tests, such as PETS-5, CSC-ET and IELTS, make it clear that all of them can in principle be used to test outbound scholarship winners. They all claim to be based on communicative testing theory, the most popular theory in the field of language testing in recent years. Yet the validity of such a claim needs further study and analysis, which will be done in the following parts.

Materials chosen for these three tests are generally from authentic sources, with variation in topic orientation and text length. Since academic reading ability is the major concern, the texts chosen should represent the features of general academic reading materials, which also calls for further exploration. However, since the test takers come from different specialized fields, in both arts and sciences, subjects and topics should not be too narrow or technical, in order to avoid possible test bias. In each of the tests, more than one item type is adopted. Yet, at least judging from the testing specification or syllabus, the mere number suggests that IELTS and CSC-ET use more item types than PETS-5. Several sources provided the data used in this report for these jurisdictions: Broekhuizen (2002, 2004); Brown (2002, 2003); Cummings (2005); Laturno (2001, 2002); Pablo & Koki (1999); Rao (2005); and Taitague (2006).

In terms of skills tested, specific skills are listed with slight differences in terminology. The most commonly identified skills, such as finding main ideas, identifying specific information, making inferences, skimming and scanning, are all clearly pointed out. In the following parts, efforts will be made to look more closely into those specific areas. Such a brief analysis of the specifications and syllabuses tells us that the three tests share many similarities in terms of testing purposes, thus laying the foundation for further comparisons.

Look at how the tests affect the English proficiency of students.

It is known that there is no one best method for assessing a person's reading ability. In fact, different task types can tap different reading skills of a reader. Therefore, a variety of question types offers a better way to assess overall reading competence. As the disadvantages of multiple-choice questions have been pointed out, relying solely on such a method will greatly damage the validity of the whole paper. Meanwhile, as far as authenticity is concerned, such a method seems largely irrelevant to what people actually do in their daily lives. Since the purpose of the test is to assess academic reading ability, of which critical and evaluative reading is an important part, the use of too many multiple-choice questions cannot achieve such an effect. To be useful, a test must provide consistently accurate measurements. It must ensure that "scores obtained on a test on a particular occasion are likely to be very similar to those which would have been obtained if it had been administered to the same test takers with the same ability, but at a different time" (Hughes, 1989). However, due to various factors, it is sometimes hard to achieve a high degree of reliability. To ensure the maximum degree of reliability, we need to consider the reliability coefficient, the standard error of measurement and the true score, as well as scorer reliability.
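The relationship between the reliability coefficient, the standard error of measurement and the true score can be shown with a small Python sketch. It assumes the standard formula, in which the standard error of measurement equals the standard deviation of the scores multiplied by the square root of one minus the reliability coefficient. The function names and the figures (a standard deviation of 10, a reliability of 0.91 and an observed score of 60) are invented for illustration.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability coefficient must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def true_score_band(observed, sem, n_sem=1):
    """Range within n_sem standard errors of the observed score."""
    return (observed - n_sem * sem, observed + n_sem * sem)


# Hypothetical test: standard deviation 10, reliability coefficient 0.91
sem = standard_error_of_measurement(sd=10.0, reliability=0.91)
low, high = true_score_band(observed=60, sem=sem)
print(round(sem, 2), round(low, 2), round(high, 2))
```

With these figures the standard error is about 3 points, so a candidate who scores 60 probably has a true score somewhere between roughly 57 and 63. The less reliable the test, the wider this band becomes and the less weight a single score can bear.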

Of course, one always needs to keep in mind that there is a trade-off between reliability and validity. For instance, a high degree of reliability can be achieved if we simply use multiple-choice questions to test grammatical structure, but if the same test is used to measure one's ability to communicate effectively, it is said to have very low validity. In our efforts to make tests reliable, we must be wary of reducing their validity. This is also the reason why objective test types such as multiple-choice questions have been the target of much criticism in this respect.


In conclusion, the trend in testing today is toward communicative tests, as can be seen from the respective specifications of the tests in question. However, the claims in a test specification or syllabus cannot guarantee that the test is a truly communicative test of students' English proficiency. Only by carefully reviewing the current thinking on communicative competence as well as on communicative language testing can we get a relatively clear picture of what communicative ability is and therefore know how to assess it. After such a comparison, some implications can be drawn as to how to write a test of a communicative nature. Communicative language tests make an effort to test language in a way that reflects language use in real communication. It is, of course, not always possible to make language tests communicative, but it may be possible to give them communicative elements. The communicative approach to language testing is mainly characterized by its authenticity and interactiveness. In terms of reading comprehension, this means that the text should at least exhibit the features of true discourse: being coherent and clearly organized. In any case, it should be authentic. To make a test communicative and avoid potential test bias, it is necessary to bear in mind that texts should be of a general academic nature but written for a non-specialist audience, and that their topics should be familiar to all students so as to avoid possible bias caused by topic familiarity.