Jorge Beltrán / Teachers College, Columbia University
Feedback based on speaking assessment performance can potentially be used to guide oral proficiency development. Such feedback can also provide insight into the development and validation of standards used to evaluate oral performance. The goal of this study was to understand what kinds of feedback teachers give on second language spoken performance in an assessment context, and whether the nature of the feedback differs by (a) learner proficiency level and (b) the teacher's teaching context and first language background. To this end, we developed a prototype online feedback system that allows users to review spoken performances and give feedback by following a two-step template. A total of 48 experienced English language teachers were invited to review speaking performances in response to TOEFL iBT speaking tasks. Half of the participants (n = 24) were native English speakers living in the U.S.; the other half (n = 24) were teachers from China, Korea, and Japan who indicated that English was not their first language. During data collection, the teachers were first asked to provide an analytic rating on each of three language domains (delivery, language use, and content/organization). They were then asked to identify, for each domain, the area for improvement that they felt was most noticeable and posed the greatest obstacle to the learner's reaching a higher score level, and then to give oral and/or written feedback. We coded the teacher feedback in terms of linguistic focus (e.g., pausing) and feedback type (general vs. targeted). The majority of the most frequently observed linguistic categories meaningfully differentiated learners across proficiency levels. Across the two teacher groups, the native English speaker teachers tended to provide more fine-grained and targeted reviews than their counterparts, consistent with findings from previous research.
The study demonstrated a data-driven approach to identifying critical linguistic features that differentiate learners across proficiency levels. The critical linguistic features identified by teachers can potentially be used in constructing human scoring rubrics as well as in informing the development of automated scoring models. The results can also shed light on the development of level-specific speaking instructional materials for language learners. Finally, the study points to the need to provide teachers with support and training so that their feedback is targeted, actionable, and beneficial to learning.