Student evaluations of lecturers and faculties were first introduced in America in the 1960s by enterprising college students (Cahn, 1986). Since then the practice has spread across America, to the point where almost all colleges and universities have implemented it. It is also probably the main source of information used in evaluating teaching performance (Cave et al., 1997). The same trend has spread to the Far East. In an effort to give university students a greater voice, universities in China in particular have given students the ability to rate the performance of their lecturers and of their respective faculties through Student Evaluation of Faculty (SEF) forms. It is hoped that this form of evaluation will help spur lecturers' development in universities (Kong, 2010).
This form of evaluation has since been met with mixed reviews, with age-old questions of validity and reliability being raised. So firstly, what exactly is a Student Evaluation of Faculty form? An SEF form is basically a form containing a series of questions, each with a rating scale, on which students indicate how well their lecturer has performed in key areas of teaching throughout the semester. This paper seeks to understand the usefulness of student evaluations and to analyze the shortcomings that come with them. I begin with a trend analysis of the situation in China with respect to universities and the cultural setting, then highlight the arguments that favour the SEF, then analyze its intricacies, and finally draw a conclusion.
A brief history of education in China suggests that teachers are the central pillar of education. This is echoed at all levels of education, be it primary, secondary or tertiary. According to Geng (2007), there are serious malfunctions in terms of relationships, enthusiasm and adjustment to modern times. Under the influence of Confucian teaching, Chinese teachers make all the decisions in class. They decide what to teach and learn, how to teach and learn, when to teach and learn and where to teach and learn. Even when the decisions are harmful to the learners, no one dares to challenge teachers. The standard mode of the authoritarian style is command and control, with no regard for diversity or for the efficiency of teaching and learning. Under an authoritarian style, learners are to be controlled, manipulated and occasionally pacified like little children. They are motivated by fear rather than by enthusiasm or passion for learning. They are expected to do what they are told without questioning. The main criterion for progress is high scores in examinations rather than competence and commitment (Geng, 2007). What student evaluation of lecturers does, then, is provide an avenue for students to voice either their support or their disapproval.
However, because Chinese lecturers have had so much power over the last decade to dictate how a class should be conducted, in my opinion lecturers now feel somewhat more vulnerable. The lecturer's position is no longer seen as an unmovable seat; rather, it has become one that is subject to greater scrutiny. This in turn might be one reason why lecturers were apprehensive about the implementation of the SEF. Most universities in China, however, provided rewards for lecturers who were rated well by their students. Students' evaluations of teachers became an input into the faculty's appraisal of a lecturer's performance, and thus helped determine academic promotions or demotions as well as salary increases in the following academic year (Singh, 2003: 4953). In other words, students' assessment of teachers is employed as an effective tool to connect teacher performance with teachers' welfare.
SEF and its benefits
An SEF evaluation is normally carried out once a semester, at the end of the semester, or in some cases twice a semester, at mid-semester and at the end. Studies by Marsh and Roche (1997) suggest that evaluations conducted twice a semester benefit both teachers and students. Students in their study found that, after evaluations were carried out mid-semester, their lecturers made greater changes to the way they taught in order to suit the students' needs. The improvement was greatest when (a) the professor's self-evaluation was very different from the students' evaluation, (b) the professor received professional consultation on the interpretation of the evaluations, and (c) the student evaluation forms included specific items (such as, "Professor gives preliminary overview of lecture"), as opposed to vague items such as, "How well planned are lessons?".
Huemer contends that most researchers agree that SEF are highly reliable, in that students tend to agree with each other in their ratings of an instructor, and that they are at least moderately valid, in that student ratings of course quality correlate positively with other measures of teaching effectiveness. In one type of study, multiple sections of the same course are taught by different instructors, but there is a common final exam. The ratings instructors receive turn out to be positively correlated with the performance of their students on the exam. The correlation is in the neighborhood of .4 to .5, meaning that 16 to 25% of the variance in one variable can be explained by variance in the other. SEF also tend to correlate well with retrospective evaluations by alumni; in other words, former students rarely change their evaluations of their teachers as the years pass (Centra, 1993).
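The step from a correlation of .4 to .5 to "16 to 25% of the variance" is simply the square of the correlation coefficient. The short sketch below illustrates that arithmetic; the function name `variance_explained` is a hypothetical helper introduced here for illustration and does not come from any of the cited studies.

```python
def variance_explained(r: float) -> float:
    """Return r squared: the share of variance two variables have in common,
    given their Pearson correlation coefficient r (illustrative helper)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r * r


# The two ends of the range quoted in the text.
for r in (0.4, 0.5):
    print(f"r = {r:.1f} -> {variance_explained(r):.0%} of variance shared")
# prints:
# r = 0.4 -> 16% of variance shared
# r = 0.5 -> 25% of variance shared
```

Read the other way round, this also means that roughly 75 to 84% of the variance in exam performance is not accounted for by the ratings.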
Furthermore, students' assessment of teachers is an effective quality control instrument through which teachers' performance can be monitored. It differs from other methods of teaching evaluation, which are carried out by a very small number of observers whose bias is inevitable. Even if the observers agree with each other in their ratings of instruction, it does not mean that their judgments and descriptions are accurate or fair (Roberts, 1998:168). Student ratings, by contrast, tend to involve all the students in the class. So even if some students are biased against a teacher, it does not follow that all the students will give that teacher unjust feedback and marks. The implication is that students' assessment draws on far more raters than other assessment methods such as peer observation and classroom observation.
More and more universities in China are embracing the idea of teachers being evaluated by students (Chen, 2004). Since students' assessment of teachers was implemented, university students in China have shown their appreciation of the opportunity they have been given and have actively participated in the assessment procedures. In his survey, Wang (2008) found that the assessment is usually popular with students in China, because it reflects a democratic interaction between teachers and students. Students in this process are no longer passive recipients of classroom learning (Wang, 2008); they can now freely express their ideas and suggestions to teachers via the assessment worksheets. These suggestions are delivered to teachers, who might then make changes to the content of their lessons. In this way, students in China are not only bringing their voice into their own learning but can also bring about real changes through it. According to Rudduck & Flutter (2004), students can find it motivating to be consulted about how they can best be helped to learn and to be treated as active, responsible members of the organization.
Finally, students' learning is likely to improve if teachers provide a satisfying learning experience. A satisfying learning experience here means one in which students are willing to participate in classroom activities and are motivated to manage their own learning. No one can be productive in an unsatisfying working or learning environment, and students are no exception. To find out the reasons for any dissatisfaction, universities need a mechanism for identifying those reasons and resolving them (Nuhfer, 2003).
Criticisms and Implications
While many Chinese lecturers support the implementation of the SEF, many others contend that there are shortcomings that need to be addressed before the SEF can be implemented successfully. Several authors (Jiang, 2004; Li, 2004; Wang & Sui, 2005; Zhang, 2004) contend that many students fail to give teachers objective and valuable feedback because of psychological and educational factors. For instance, students might not understand the purpose and importance of the evaluation, so instead of treating the assessment as a way to enhance their learning, some students might use it as a weapon to take revenge on teachers.
Another prevailing problem recognized by teachers is that the results of the assessment are closely connected with teachers' bonuses, salary increases and promotion (Liu & Teddlie, 2007). This might lead lecturers to work very hard at pleasing students rather than at improving their teaching, because they know that if they fail to get the top mark or the pass mark from students, they will on the one hand lose a lot of money and on the other face the prospect of being fired. This is echoed in other instances internationally, as highlighted by Huemer. According to him, because lecturers try to keep students happy all the time, they resort to dumbing down their courses. In one survey, 38% of professors admitted to making their courses easier in response to SEF (Ryan et al., 1980).
Peter Sacks provides a more detailed, though anecdotal, picture. Sacks reports having almost lost his job because of low teaching evaluations from his students. He was able to raise his teaching evaluations dramatically and gain tenure, he says, by becoming utterly undemanding and uncritical of his students, giving out easy grades, and teaching to the lowest common denominator. Sacks claims that this behavior is not unusual but is rather the norm at his college, where students are king and entertainment is all that matters. An excerpt from Sacks' book:
"And so, in my mind, I became a teaching teddy bear. In the metaphorical sandbox I created, students could do no wrong, and I did almost anything possible to keep all of them happy, all of the time, no matter how childish or rude their behavior, no matter how poorly they performed in the course, no matter how little effort they gave. If they wanted their hands held, I would hold them. If they wanted a stapler (or a Kleenex) and I didn't have one, I'd apologize. If they wanted to read the newspaper while I was addressing the class or if they wanted to get up and leave in the middle of a lecture, go for it. Call me spineless. I confess. But in the excessively accommodative culture that I found myself in, "our students" as many of my colleagues called them, had too much power for me to afford irritating them with demands and challenges I had previously thought were part and parcel of the collegiate experience." (Sacks, 1986)
Thirdly, one of the most common criticisms of the SEF is that it potentially creates a grading leniency bias. Students tend to give higher ratings when they expect higher grades in the course. Thus, SEF seem to be as much a measure of an instructor's leniency in grading as they are of teaching effectiveness. Many believe that this causes rampant grade inflation (Sacks, 1986). These and other facts are explained by the leniency bias hypothesis: people tend to like those who praise them (particularly if the praise is greater than expected) and dislike those who criticize them. The instructor who grades leniently in effect praises the students, who then like the instructor more. They then reward the instructor with higher ratings in general (Greenwald & Gillmore, 1997).
Educational Seduction and the Dr. Fox Effect
Abrami et al. (1982) found that instructor expressiveness had a substantial impact on student ratings but a small impact on student achievement. In contrast, lecture content had a substantial impact on student achievement but a small impact on student ratings. Many feel that this calls the reliability of student ratings into question, because ratings are usually affected more by the personal style of the instructor than by the instructor's ability to convey instructional material. Many have come to refer to the influence of an instructor's personality on student evaluations as the "Dr. Fox effect" or "educational seduction."
In the original Dr. Fox study, Naftulin, Ware, and Donnelly (1973) found that an entertaining, charismatic lecturer who spoke deliberate nonsense received surprisingly high evaluations from an audience of educators and mental health professionals. Because the lecturer, actually a professional actor, was introduced as Dr. Myron L. Fox, the phenomenon became known as the "Dr. Fox" effect. Naftulin et al. concluded that a lecturer's authority, wit, and personality can "seduce" students into the illusion of having learned, even when the educational content of the lecture is missing. Later studies have obtained similar results, showing that audience ratings of a lecture are more strongly influenced by superficial stylistic matters than by content (Abrami et al., 1982). Of course, many variables could have affected this test of the Dr. Fox effect, and many contend that they render its findings unreliable.
However, there was still much to take away from this line of research. Ware and Williams (1977) found that the higher the amount of lecture content, the more the students learned and the more highly students rated the instructor. In addition, the high-expressive instructor earned much higher ratings and produced somewhat higher achievement than the low-expressive instructor. Finally, expressiveness significantly interacted with content on student ratings: for low expressiveness, high content produced higher ratings than low content; for high expressiveness, content did not affect ratings even though it did affect achievement. These findings led Ware and Williams to suggest that student ratings should not be used to make decisions about faculty promotion and tenure, because charismatic and enthusiastic faculty can receive favorable student ratings regardless of how well they know their subject matter and regardless of how much their students learn.