This assignment evaluates the Test of Writing of the International Legal English Certificate (ILEC). ILEC is an examination produced by Cambridge ESOL (English for Speakers of Other Languages) in collaboration with Translegal, a firm of lawyer-linguists. The target candidature for ILEC is legal professionals and law students operating in the area of international commercial law who need to demonstrate their language proficiency in English. The assignment will first consider relevant issues for the development of tests for specific purposes and then examine validity aspects of the ILEC Writing paper in detail. (1)
1.1 Tests in Language for Specific Purposes
Testing Language for Specific Purposes (LSP), such as a test of English for the legal profession, refers to language assessment in which the test content arises from an analysis of specific target language use situations: these often (but not always) correspond to the language needs of a particular occupational group. Devising LSP tests presents test developers with a number of issues, including the relationship of test specificity to test generalisability; the importance of ensuring authenticity of test content; the interaction between background content knowledge and language knowledge; and, for some domains, the difficulty in gaining access to relevant information on the nature of language use in that domain. (2)
1.2 Specificity vs Generalisability
LSP tests have often been directly contrasted with general purpose tests. This is now, however, generally acknowledged to be an oversimplification of the issue and there is growing consensus that tests do not fall into one grouping (specific purpose) or the other (general purpose), but that, in the words of Douglas (2000:1), '... there is a continuum of specificity from the very general to the very specific...': all tests are devised for some purpose and fall at some point along the specificity spectrum. The concept of a spectrum or continuum of specificity raises the question of where on the continuum a test should be placed and the related issue of how generalisable the LSP test is intended to be. Generalisability is often held to decrease in proportion to the specificity of the test: the more specific a test (such as English for Air Traffic Controllers), the less possible it is to generalise from performance on it to other language use situations. This is accepted as a fundamental issue in LSP, to which there are no straightforward answers. (3)
1.3 Background content knowledge
In general purpose language testing, background knowledge of topic or cultural content is viewed as a 'confounding variable', which should be minimised as it has the potential to lead to measurement error. For LSP tests, however, subject specific content is arguably a defining feature of the test. Nonetheless, the question of 'separability', that is, how to distinguish between language knowledge and specific background knowledge in analysing candidates' results on a specific purpose language test, has been a recurring concern. Bachman and Palmer (1996) argued, in relation to a test for trainee doctors, that it should be possible to control for background medical knowledge in interpretation of performance on a language test by, for example, the administration of 'knowledge tests' alongside the LSP test. The difficulty in assessing the extent of the test taker's background knowledge and its interaction with language proficiency has been addressed by Clapham (1996), who concluded that background knowledge was undoubtedly a significant factor in the process of testing reading, but that the extent varied with the specificity of the test and the language proficiency of the candidate. There has more recently
been an acceptance that until more is known about how the mind deals cognitively with ability and knowledge, specific background knowledge and language performance need to be treated as being 'inextricably linked' (Douglas 2000:39). (4)
1.4 Access to information on language use within the domain
With an increase, in the second part of the 20th century, in the number of people needing to learn English for education, technology and commerce, the main drive behind the development of LSP was practical rather than theoretical. As a result, LSP itself may be said to have suffered from a lack of theoretical underpinning. A key analytical tool has been the use of Needs Analysis to assess the linguistic requirements of a particular target group. Some analyses resulted in long, detailed lists of needs for which empirical verification was held to be lacking. Widdowson, for example, described many LSP Needs Analyses as being made up of 'observational lists with no basis in theory' (Widdowson 1983:8). Alderson, Davies and others have raised similar concerns (Alderson 1988, Davies 1990, Skehan 1984). A further criticism of some needs analyses was that they lacked objectivity, were influenced by the ideological perceptions of the analysts (Robinson 1991:7) and took insufficient account of the students themselves. Nonetheless, assessment of language needs can still inform LSP course and test design. As Clapham has said, 'We now know that such analyses can become too detailed, and also paradoxically, too limited in scope. However, this does not mean that they are unnecessary' (Clapham 1996:5). Analysis of texts and spoken discourse from particular target language use situations is important in revealing how the target language use (TLU) community communicates and disseminates information. The growth of corpus linguistics and the corresponding development of electronic databases of texts can help in enabling the identification of specific syntactic patterns and use of specific lexis among particular occupational groups or discourse communities. At present, however, only a limited number of such corpora are available, and genre analysis therefore plays an important role when considering communication between members of the occupational group or discourse community in question.
According to Swales (1990), texts belonging to a particular genre share common features with regard to the organisation of information, rhetorical conventions and lexico-grammatical patterns which practitioners within that discourse community need to access and use in order to operate with any degree of effectiveness. Bhatia (1993) developed earlier work by Swales and has extensively researched language use in professional contexts, particularly discourse within business settings. Nonetheless, due to the confidential nature of the work done by some occupational groups (such as lawyers), access to texts from those domains may not be easily acquired. Swales (1996, cited in Flowerdew and Wan 2006) refers to such texts as 'occluded' genres, to which access is normally denied to those outside the participating discourse community. One task for the test developer in such circumstances therefore lies in obtaining subject-specific assistance and advice. Bhatia (1993) reports on how the subject specialist or 'specialist informant' has played a role within LSP genre analysis. (5)
2. The ILEC Writing Test: considering the validity issues
A copy of the ILEC Writing Test is attached in Appendix 1. The test will be evaluated according to its context, theory-based, scoring and consequential validity. (6)
2.1 Context Validity
The term 'Content Validity' was traditionally used to refer to the authenticity and content coverage of the task. 'Context Validity' is now a more widely used term, as it takes into account the discoursal, social and cultural contexts as well as the linguistic content. Context validity in the case of writing tasks also relates to the particular performance conditions under which the operations required for task fulfilment are performed, such as purpose of task, time available, length and specified addressees. (7)
2.1.1 Authenticity of task and content coverage
Authenticity of task means that
" â€¦ the LSP test tasks should share critical features of tasks in the target language use situation of interest to the test takers" (Douglas 2000:2).
Bachman and Palmer (1996:23) describe a relatively authentic task as one "whose characteristics correspond to those of the Target Language Use (TLU) domain tasks" and define authenticity as "the degree of correspondence of the characteristics of a given language test task to the features of a TLU task" (1996:23). In terms of the TLU situation, ILEC is a test of English in an international, commercial law context, the design of which is based on the following characteristics of the language environment of the target candidates:
Areas of the law: law of associations; contract law; sale of goods; debtor-creditor law; commercial paper; employment law; intellectual property law; property law; remedies; civil procedure; administrative law; public international law; family law.
Types of lawyer: lawyers practising (and law students who intend to practise) in a commercial law context with elements of international commercial business dealings.
Types of environments that target lawyers work in: business law firms and other law firms with international dealings; in-house corporate counsel; governmental organisations; international organisations.
Types of people that target lawyers must communicate with in English: other international lawyers; members of the international business community; governmental representatives; clients from other countries.
The choice of materials in the Writing Test is based on an analysis of the kinds of tasks that the target lawyers are likely to encounter in their working environment. (8)
A legal writing test, for example, must engage the test taker in writing tasks which are authentically representative of the situations they might plausibly encounter. The technical characteristics of language employed in a legal professional context have very specific features that lawyers operating in the field of law must control:
"There are lexical, semantic, syntactic, and even phonological characteristics of language peculiar to any field, and these characteristics allow for people in that field to speak and write more precisely about aspects of the field that outsiders sometimes find impenetrable" (Douglas 2000:7).
Interestingly, Douglas goes on to provide an example of 'legalese' - characterised by
"… the arcane lexis, the convoluted syntax, the use of Latin terminology, and the interminable cross-references to previous laws and cases in legal texts" (2000:8)
- as an example of the requirement for precise, specific purpose language. Clearly, such language has consciously evolved, developed by the legal fraternity to enable its members to engage dynamically with each other in an attempt to communicate the exact meaning of the law effectively. (9)
A legal test also needs to identify and cover its relevant content domain. Coverage of the appropriate domains of language use is attained through the employment of relevant topics, tasks, text types and contexts. The domains, therefore, need to be specified with reference to the characteristics of the test taker, and to the characteristics of the relevant language use contexts. This is the case with the ILEC Writing paper. (10)
2.1.2 Interactional and Situational Authenticity
As a general principle it is now argued that language tests should, as far as is practicable, place the same requirements on test takers as are involved in writers' responses to communicative settings in non-test "real-life" situations. The purpose for writing in this paradigm is essentially communication rather than accuracy (Hyland 2002:8), "emphasising validity, particularly the psychological reality of the task, rather than statistical reliability" (ibid:230).
These views on writing reflect a concern with authenticity which has been a dominant theme in recent years for adherents of a communicative testing approach as they attempt to develop tests that approximate to the "reality" of non-test language use (real life performance) (see Hawkey 2004, Morrow 1979, Weir 1993 and Weir 2003). The 'Real-Life' (RL) approach (Bachman 1990:41) has proved useful as a means of guiding practical test development. It is most useful in situations in which the domain of language use is relatively homogeneous and identifiable (see O'Sullivan 2006 on the development of Cambridge Business English examinations). Its primary limitation, however, is that it cannot provide very much information about language ability and hence cannot demonstrate validity in the broadest sense.
The RL approach has been regarded as encapsulating the notion of communicative testing as it seeks to develop tests that mirror the "reality" of non-test language use (real life performance). Its prime concerns are:
the appearance or perception of the test and how this may affect test performance and test use (face validity); and
the accuracy with which test performance predicts non-test performance (predictive validity).
A number of attempts have been made to characterise communicative tests (Morrow 1979, Alderson 1981, Porter 1983). Weir (1988), however, points out that there are inherent problems involved in basing test specifications on empirical research and observes that:
"the more specific the tasks one identifies the less one can generalise from performance on its realisation in a test".
The concern with situational authenticity requires writers to make use of texts, situational contexts, and tasks which simulate "real-life" without trying to replicate it exactly. The interactional authenticity (IA) approach is concerned with the extent to which test performance reflects language abilities. In other words, the concern is with construct validity. Bachman (1989) summarises the IA approach arguing that it encapsulates the essential characteristics of communicative language use by reflecting the interactive relationship that exists between the language user, the context and the discourse. The major consideration shifts from that of attempting to sample actual instances of non-test language use, to that of determining the most appropriate combination of test method characteristics. For Bachman, an interactionally authentic test involves the following:
some language function in addition to that of demonstrating the test taker's language knowledge;
the test taker's language knowledge;
the test taker's language schemata;
the test taker's meta-cognitive strategies. (11)
2.1.3 Purpose of task
Task setting (such as Purpose, Response Format, Weighting, Known Criteria, Order of Items, Time Constraints) and Linguistic Demands (such as Channel, Discourse Mode, Text length, Writer-reader Relationship, etc.) are normally conveyed through the rubric/instructions supplied to the candidates. It is generally accepted that the presentation of information in the task rubric should be made as explicit as possible in terms of the production demands required of the test taker. (12)
The writing task rubric must present candidates with clear, precise and unequivocal information regarding the purpose for completing the writing task and the target audience for it. This purpose should provide a reason for completing the task that goes beyond a ritual display of knowledge for assessment. It may well involve suspension of disbelief, but having a clear and acceptable communicative purpose in mind is thought to enhance performance. The way the prompt is worded has been shown to affect what the candidate sees as the purpose of the task (Hamp-Lyons 1991 and Moore and Morton 1999). For example, a term like "discuss" is open to different interpretations unless further specified (see Evans 1988). (13)
The ILEC Writing test gives a clear role to the candidate in each task (eg 'You are a lawyer representing Ms Sandra Meyer.') and a clear purpose and target audience for the task (eg 'Write a letter to Robert Woodly on behalf of your client, Ms Meyer.' 'Write a memorandum to your colleague to brief him on the case.') (14)
2.1.4 Time Constraints
In writing we are concerned with the time available for task completion: the speed at which processing must take place; the length of time available to write; whether it is an exam or a hand-in assignment; and the number of revisions/drafts allowed (process element). Outside of examination essays, in the real world, writing tasks would not necessarily be timed (although there is a case for speed writing in a working context on occasions, especially in a legal or professional setting where deadlines must be met). Where time in the workplace is not of the essence, students would be allowed maximum opportunity and access to resources for demonstrating their writing abilities. However, considerations such as time constraints and reliability issues make longer, process-oriented tests impractical in most situations. (15)
Weir (2004) points out that the texts we get candidates to produce obviously have to be long enough for them to be scored in a valid manner. If we want to establish whether a student can organize a written product into a coherent whole, length is obviously a key factor. He notes that as regards an appropriate time for completion of product-oriented writing tasks in an actual examination setting, Jacobs et al. (1981:19), in their research on the Michigan Composition Test, found that a time allowance of thirty minutes probably gave most students enough time to produce an adequate sample of their writing ability for the purpose of assessment. (16)
One might reasonably expect that time-restricted test tasks cannot represent what writers are capable of in normal written discourse, where time constraints are less severe. Kroll (1990:140-154) reports on research comparing timed classroom essays and essays written at home over a 10-14 day period. Contrary to what one might have expected, the study indicated that, in general, time does not buy very much for students in either their control over syntax - the distribution of specific language errors being remarkably similar in both - or in their organisational skills. (17)
In the case of ILEC, common tasks are presented to a candidature comprising both B2 and C1 candidates who must complete the test in 1 hour and 15 minutes. (18)
2.1.5 Text Length
Text length potentially has an important effect in terms of what Weir (2005) calls the executive resources that will be called into play in cognitive processing. These resources are both linguistic and experiential and need to be as similar as possible to those demanded by equivalent tasks in real life language use for us to generalise from test performance to language use in the domain of interest. ILEC Writing comprises two tasks, one of between 120 and 180 words and one of between 200 and 250 words. (19)
2.2 Theory-based validity
Theory-based validity involves collecting evidence, through the piloting and trialling of a test before it is made available to candidates, on the cognitive processing activated by the test tasks. (20)
Theory-based validity of a test of writing is a function of how closely it represents the cognitive processing involved in performing the construct in real life. Weir (2005) details how establishing theory-based validity for a writing task involves producing evidence on the nature of the executive resources and executive processing activated by the task. 'Executive resources' involve linguistic resources and content knowledge. Content knowledge may already be possessed by the candidate or might be available in information supplied through task input. The 'Executive process' refers to cognitive processing and includes the procedures of goal setting, topic & genre modifying, generating, organizing, translating and reviewing. (21)
Planning relates to a number of stages in the writing process: macro-planning; organisation; micro-planning (Field 2004). Macro-planning entails assembling a set of ideas and drawing upon world knowledge. The writer initially establishes what the goal of the piece of writing is to be. This includes consideration of the target readership, of the genre of the text (earlier experience as a reader may assist) and of style (level of formality). Grabe and Kaplan (1996) refer to this stage as Goal Setting. Goal setting involves setting goals and purposes, offering an initial draft of task representation and connecting 'context' with 'verbal working memory' (1996:226).
During the Organisation stage the writer provisionally organises the ideas, still in abstract form, a) in relation to the text as a whole and b) in relation to each other. The ideas are evaluated in terms of their relative importance, and decisions made as to their relative prominence in the text. The outcome may be a set of rough notes. Grabe and Kaplan (1996:226) describe Organizing as grouping, categorizing ideas, establishing new concepts and putting ideas in suitable order.
At the micro-planning level, the writer shifts to a different level and begins to plan conceptually at sentence and paragraph level. Throughout this stage, constant reference back to two sets of criteria is made: to decisions taken at earlier stages and to the manner in which the text has progressed so far. Account is taken of the overall goals of the text; of the organisational plan and the direction in which the text is currently tending; and of the content of the immediately preceding sentence or paragraph. At this stage, the writer needs to give consideration to whether an individual piece of information is or is not shared with the reader a) by virtue of shared world knowledge or b) as a result of earlier mention in the text.
These processing procedures are described in detail by Hayes and Flower (1980), Bereiter & Scardamalia (1987), and Grabe & Kaplan (1996). (22)
ILEC Writing tasks require candidates to undertake writing tasks which engage these processing abilities. The Needs Analysis revealed that correspondence between legal firms and clients is a written form of communication frequently needed by professionals. Furthermore, correspondence is often in the form of a response to an earlier letter and includes reference both to this text and to other documents or texts, such as tax statements, procedural documents and company accounts. This reflects the concept of intertextuality as identified by Kristeva (1980:69); research by others (Flowerdew and Wan 2006) has confirmed the prevalence of the interaction between texts in the corporate world. To reflect the findings of the ILEC Needs Analysis (see Appendix 2), one task on the Test of Writing requires candidates to draw on a previous text and compose a response to it with the use of notes. Composing the response requires the candidate to use a range of functions including clarifying, refuting, requesting information and referring the target reader to other documentation. (23)
2.3 Scoring Validity
Scoring Validity is linked directly to both context and theory-based validity and accounts for the extent to which test scores are based on appropriate criteria, exhibit consensual agreement in their marking, are as free as possible from measurement error, are stable over time, are consistent in terms of their content sampling and engender confidence as reliable decision-making indicators. (24)
The assessment criteria for ILEC Writing (see Appendix 3) are based on those of a General English test at the same levels related to the CEFR. As Douglas points out: 'contrary to the cases of LSP test content and method, LSP assessment criteria have not usually been derived from an analysis of the TLU situation' (Douglas 2001:174). In the same article, he goes on to make a case for basing LSP assessment criteria on an empirical analysis of the TLU situation. It is also the case with ILEC that examiners for both the Writing and Speaking papers are not required to have a background in Legal English*. It may be argued that this is a weakness in the underpinning scoring validity of the ILEC Writing paper, as assessment by a subject specialist may differ from that of the layperson (ie the general marker). (25)
Jacobs et al. (1981:3) identify aspects of this relating to cognitive process and social interaction:
The direct testing of writing emphasizes the communicative purpose of writing … (it) utilizes the important intuitive, albeit subjective, resources of other participants in the communicative process - the readers of written discourse, who must be the ultimate judges of the success or failure of the writer's communicative efforts.
If candidates' self-assessments of their language abilities, or ratings of the candidate by teachers, subject specialists, or other informants (Alderson et al 1995), differ from the assessment of the non-specialist Examiner, predictive validity may be compromised. (26)
2.4 Consequential Validity
Messick (1989:18) argues that 'For a fully unified view of validity, it must … be recognised that the appropriateness, meaningfulness, and usefulness of score-based inferences depend as well on the social consequences of the testing. Therefore social values and social consequences cannot be ignored in considerations of validity'. Consequential Validity relates to the way in which the implementation of a test can affect the interpretability of test scores; the practical consequences of the introduction of a test (McNamara 2000). Shohamy (1993:37) argues that 'Testers must begin to examine the consequences of the tests they develop … often … they do not find it necessary to observe the actual use of the test.'
* personal information from ILEC Writing subject staff
Weir (2005) provides a comprehensive treatment of these key elements within the Socio-Cognitive Validation framework. (27)
ILEC has achieved recognition by a number of different legal entities, including universities and law practices in 36 countries (see Appendix 4). Furthermore, the initial market research and viability study was administered to a number of stakeholders in the field, including international and local law firms; large companies with their own legal departments; university law faculties; and legal training providers and language schools. Although the exam fee may be considered costly, which arguably bears on the 'social consequences' of testing, it may be argued that within the domain of corporate/commercial law the consequential validity of the test in this respect is not unsound. (28)
This assignment has examined the ILEC Test of Writing. The development of ILEC saw collaboration between assessment specialists and legal content specialists, with each bringing expertise to the process. This has arguably resulted in a test which authentically simulates the TLU situation and, as a result, it may be concluded that the test is sound in terms of Context, Theory-based and Consequential validity. Where the test is arguably less strong is in the area of Scoring Validity (and the resulting impact this issue may be said to have on Consequential Validity), in its use of assessment criteria and examining personnel unrelated to the TLU and the specific LSP domain. (29)
Word Count: 4,125