Although the validity of multiple-choice questions has been doubted, they are widely used in English listening tests at home and abroad, and this trend is unlikely to change. It is therefore necessary to find ways to improve their validity.
Multiple-choice questions fall into two types: global questions and local questions. According to the literature, question type affects the validity of listening tests. However, few studies have examined question types, and the existing findings are contradictory: some studies report that question type has a significant effect on listening comprehension, while others report no significant effect. Moreover, these studies do not explain their results from the perspective of test validity. It is therefore necessary to compare the construct validity of global and local questions and to identify the reasons for any difference.
The current study uses Buck's (2001) framework for describing listening constructs to compare the construct validity of global and local questions in English tests for junior high school students. First, a comparison based on the literature indicates that global questions measure more listening abilities, which suggests that their construct validity is better than that of local questions. The hypothesis of the present study is therefore as follows: because global questions measure more constructs, they are more difficult than local questions, and scores on global questions are lower than scores on local questions. To test this hypothesis, the study collects both quantitative and qualitative empirical data. The purpose is to determine which question type has better validity in English tests for junior high school students.
Study A is a listening test comprising global and local questions. Because the authenticity of test tasks underpins the validity of listening tests, especially construct validity, this study adopts the question stem preview format, in which test takers may preview only the question stems before listening. The test was administered to 100 junior high school students. Analysis of their scores indicated that question type has a significant effect on listening comprehension: scores on global questions were lower than scores on local questions. The hypothesis is thus supported from a quantitative perspective. To test whether the effect of question type is moderated by language proficiency, the 100 subjects were divided by language ability into a high language proficiency group and a low language proficiency group of 50 subjects each. The results show no interaction effect between question type and language proficiency, so the validity of the question types does not depend on language proficiency.
Study B is a think-aloud task whose purpose is to further test and interpret the results of Study A. Analysis of the verbal reports of three subjects from the high language proficiency group and two subjects from the low language proficiency group shows that the quantitative data of Study B are consistent with those of Study A. When compared against Buck's (2001) framework for describing listening constructs, the results of the think-aloud task match those of the literature-based comparison. Subjects in both language proficiency groups do not need to apply pragmatic knowledge, sociolinguistic knowledge, or the strategies of inferencing and summarization when answering local questions, so scores on local questions are higher than scores on global questions. Subjects in the high language proficiency group apply all the constructs in Buck's framework when answering global questions. Thus, the construct validity of global questions is better than that of local questions, and global questions measure listeners’ listening ability more accurately.
In view of the authenticity of language tests, the test format that allows test takers to preview only the question stems before listening should be strongly promoted. In addition, because global questions have better construct validity than local questions, test developers can increase the number or proportion of global questions when designing listening tests, in order to measure listening ability more accurately and improve the construct validity of those tests.