A clear understanding of what each administered test item measures echoes the idea of evidence-centered design (Mislevy, Almond, & Lukas, 2003), which holds that a test should be designed to provide evidence of students' mastery of knowledge. This is especially true in digitalized assessment, where flexibility in presentation formats introduces noise into assessment outcomes, rendering items vulnerable to construct-irrelevant format effects.
The Alo7 Knowledge Framework (AKF) was constructed to secure the construct validity and manage the content coverage of items in the Alo7 digital learning program's item bank. It is a generic knowledge framework for English language learning, built by content experts on the basis of extensive experience as English tutors and textbook editors. The framework branches into three categories (vocabulary, grammar, and skills) and further develops into a hierarchical structure covering knowledge concepts down to the finest granularity.
The purpose of this study is twofold: to validate the AKF knowledge construct and to validate the item bank. To this end, the linear logistic test model (LLTM) is employed. As an extension of the Rasch model, the LLTM decomposes the item difficulty parameter into a linear combination of cognitive operations (COs) and measures the contribution of each (Baghaei & Hohensinn, 2017). For an individual item, a weight is assigned to each CO reflecting its presence and relative importance. Fit for the LLTM is determined by the discrepancy between the Rasch and LLTM item parameters and by the relative fit of the two models.
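This decomposition can be written out in the standard LLTM form (the source does not give the equation; the notation below is assumed):

```latex
P(X_{vi}=1 \mid \theta_v) = \frac{\exp(\theta_v - \beta_i)}{1 + \exp(\theta_v - \beta_i)},
\qquad
\beta_i = \sum_{k=1}^{K} q_{ik}\,\eta_k + c,
```

where \(\theta_v\) is the ability of person \(v\), \(\beta_i\) is the difficulty of item \(i\), \(q_{ik}\) is the weight of cognitive operation \(k\) on item \(i\), \(\eta_k\) is the difficulty contribution of that operation, and \(c\) is a normalization constant. Model fit can then be assessed by comparing the reconstructed \(\beta_i\) with the freely estimated Rasch difficulties.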
In this study, content experts tag items with AKF knowledge points, which serve as the COs, and assign the corresponding weights used in model estimation. The analysis results are then used as evidence to modify the AKF structure. To date, approximately 1,500 knowledge points have been identified; they will be tagged onto 366 items and analyzed with response data collected from 70,000 students. This process iterates until the model converges satisfactorily, yielding an empirically supported framework for English language learning and precisely on-target items for specific knowledge points at the test creator's disposal.
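The tagging step described above amounts to building a weight matrix that maps items to knowledge points, from which LLTM-implied item difficulties follow as a linear combination. The sketch below (not the authors' code; all names and values are illustrative) shows this structure with numpy:

```python
# Minimal sketch of the LLTM design-matrix structure, assuming
# illustrative weights; Q and eta are hypothetical, not from the study.
import numpy as np

# Q[i, k] = weight of cognitive operation k (an AKF knowledge point)
# on item i, as assigned by content experts during tagging
Q = np.array([
    [1, 0, 1],   # item 1 draws on knowledge points 1 and 3
    [0, 1, 1],   # item 2 draws on knowledge points 2 and 3
    [1, 1, 0],   # item 3 draws on knowledge points 1 and 2
])

# eta[k] = estimated difficulty contribution of each knowledge point
eta = np.array([0.5, -0.2, 0.8])

# LLTM-implied item difficulties: a linear combination of CO effects
beta = Q @ eta
print(beta)  # approximately [1.3, 0.6, 0.3]
```

In the actual study the weights would be estimated from the student response data rather than fixed, and the resulting `beta` would be compared against freely estimated Rasch difficulties to judge model fit.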
This study expands the application of the LLTM in language assessment by applying the model to a generic framework covering a broad spectrum of English language learning. Supported by a considerably large item and response sample, the current analysis also provides an example of using a measurement model in an industrial setting.