gning the HSK test (汉语水平考试) with the Chinese Language Proficiency Scales for Speakers of Other Languages (国际汉语能力标准): from the consequential validity perspective
The national standardized test of Chinese language proficiency for non-native speakers, 汉语水平考试 (HSK), literally "Chinese Proficiency Test," has played a vital role in certifying language proficiency for higher education and professional purposes. According to Hanban, it is designed based on the Chinese Language Proficiency Scales for Speakers of Other Languages (abbreviated as the Scales), an official document that provides guidelines for CSL teaching and learning. The multiple uses of the HSK have generated growing concerns about its validity. Employing Bachman and Palmer's (2010) Assessment Use Argument (AUA) framework, this mixed-methods sequential exploratory study aligned the HSK with the Scales from the perspective of micro- and macro-level consequential validity. In Phase I, the official HSK documents were examined through content analysis; interviews with 12 test stakeholders were then conducted and analyzed using a two-cycle coding approach. In Phase II, participants (including 136 CSL teachers, 512 test-takers, and 35 score users who use the HSK to inform academic and employment decisions) answered questionnaires, and the data were analyzed quantitatively (e.g., descriptive statistics, exploratory factor analysis, and structural equation modeling). The results indicated that the revision of the HSK reflects the standards set in the Scales. They also highlighted the complexity of the HSK's consequences and washback effects in different contexts. In general, HSK scores and other related information provided users with relevant, useful, and meaningful data for candidate selection. Overall, based on the AUA conceptual framework, the findings provided evidence that Claim 1 (Consequences), Claim 2 (Decisions), and Claim 3 (Interpretations) were partially supported, in that the test developers' intended goal of aligning the HSK with the Scales was achieved only to a certain degree.
This study helps describe the consequential validity of the HSK in the CSL context, sheds light on the alignment of the HSK with the Scales, and points to implications for HSK developers and future research.