Setting performance standards to categorize examinees is an indispensable part of assessment. Researchers have examined this area from various perspectives, including the theoretical foundations, concepts, and common methods of standard setting (Cizek, 2012; Cizek & Bunch, 2007; Hambleton, 2001; Hambleton & Pitoniak, 2006). Others have applied various standard setting methods to operational large-scale assessments (Tannenbaum & Katz, 2008; Yin, Schulz, & Sconing, 2005; Hsieh, 2013). Beyond standard setting for large-scale assessments, however, students’ performance on school-based assessments deserves attention as well. In Chinese universities, students’ performance on final examinations not only influences their grade point average (GPA) but also determines whether they must retake a course or even repeat a year. Setting appropriate performance standards for final examinations is therefore worth exploring.
Apart from English majors, most Chinese college students study English as a foreign language (EFL). They must take an English final at the end of each semester, and the most common passing standard is a score of 60 out of 100. Nevertheless, even within the same university or college, students from different majors vary in English ability. Setting the standard, or cutting-score, for English finals flexibly rather than at a rigid 60 is therefore both reasonable and beneficial.
Among the existing standard setting methods, this study uses an extended Angoff method to set cutting-scores for students from various majors at a Chinese university. Twelve teachers who currently teach these students will be invited to serve as panelists and to decide on the cutting-scores in a three-round meeting held over two days. To evaluate the reliability of the cutting-scores, a multivariate generalizability (G) study will be performed within the framework of generalizability theory to quantify the variation among panelists and to estimate the standard errors of the cutting-scores. Further, several decision (D) studies will be conducted to examine the dependability of the cutting-scores under different hypothesized conditions.
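To make the logic of the G and D studies concrete, the following is a minimal sketch rather than the analysis used in this study. It assumes a simplified univariate design in which panelists are fully crossed with items and items are treated as fixed (the same test form), whereas the study itself uses a multivariate design. The rating matrix is hypothetical, and the function names `g_study` and `d_study_se` are illustrative only.

```python
import numpy as np


def g_study(ratings):
    """Estimate variance components for a crossed panelists x items design.

    `ratings` is a 2-D array (panelists in rows, items in columns) holding
    each panelist's Angoff-type estimate for each item on a 0-1 scale.
    """
    x = np.asarray(ratings, dtype=float)
    n_p, n_i = x.shape
    grand = x.mean()
    p_means = x.mean(axis=1)  # one mean per panelist
    i_means = x.mean(axis=0)  # one mean per item

    # Mean squares from the two-way ANOVA with one observation per cell
    ms_p = n_i * np.sum((p_means - grand) ** 2) / (n_p - 1)
    ms_i = n_p * np.sum((i_means - grand) ** 2) / (n_i - 1)
    resid = x - p_means[:, None] - i_means[None, :] + grand
    ms_res = np.sum(resid ** 2) / ((n_p - 1) * (n_i - 1))

    # Expected-mean-square estimators; negative estimates are set to zero
    return {
        "panelist": max((ms_p - ms_res) / n_i, 0.0),
        "item": max((ms_i - ms_res) / n_p, 0.0),
        "residual": ms_res,
    }


def d_study_se(components, n_panelists, n_items):
    """Standard error of the mean cutting-score for a hypothesized panel size.

    Assumes panelists are a random facet and items are fixed.
    """
    error_var = (
        components["panelist"] + components["residual"] / n_items
    ) / n_panelists
    return float(np.sqrt(error_var))


# Hypothetical final-round ratings: 4 panelists x 5 items
ratings = np.array([
    [0.60, 0.70, 0.50, 0.80, 0.65],
    [0.55, 0.75, 0.45, 0.85, 0.60],
    [0.70, 0.80, 0.60, 0.90, 0.75],
    [0.50, 0.65, 0.40, 0.75, 0.55],
])

components = g_study(ratings)
cut_score = ratings.mean() * 100  # cutting-score on a 100-point scale

# D studies: how the standard error changes with the number of panelists
for n in (4, 8, 12):
    se = 100 * d_study_se(components, n, ratings.shape[1])
    print(f"{n} panelists: cutting-score {cut_score:.2f}, SE {se:.2f}")
```

Under these assumptions, the standard error for the observed panel size equals the standard deviation of the panelists' mean ratings divided by the square root of the number of panelists, and it shrinks as hypothesized panels grow larger.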
The results of the study will yield policy suggestions regarding performance standards for English examinations at this university and at other universities in China. Similar studies can be conducted at universities around the world that enroll international students. When setting standards on placement tests, universities should ensure that the standards are appropriate for their students and that the reliability of the standards has been examined.