Please use this identifier to cite or link to this item:
doi:10.22028/D291-26944
Title: | Reliability and validity of PIRLS and TIMSS: does the response format matter? |
Author(s): | Schult, Johannes; Sparfeldt, Jörn R. |
Language: | English |
Journal title: | European Journal of Psychological Assessment |
Publisher/Platform: | Hogrefe |
Year of Publication: | 2016 |
Free key words: | Response format; Multiple-choice; Constructed-response; Item response theory; Validity |
DDC notations: | 150 Psychology; 370 Education |
Publication type: | Journal Article |
Abstract: | Academic achievements are often assessed in written exams and tests using selection-type (e.g., multiple-choice; MC) and supply-type (e.g., constructed-response; CR) item response formats. The present article examines how MC items and CR items differ with regard to reliability and criterion validity in two educational large-scale assessments with fourth-graders. The reading items of PIRLS 2006 were compiled into MC scales, CR scales, and mixed scales. Scale reliabilities were estimated according to item response theory (international PIRLS sample; n = 119,413). MC showed smaller standard errors than CR around the reading proficiency mean, whereas CR was more reliable for low and high proficiency levels. In the German sample (n = 7,581), there was no format-specific differential validity (criterion: German grades, r ≈ .5; Δr = 0.01). The mathematics items of TIMSS 2007 (n = 160,922) showed similar reliability patterns. MC validity was slightly larger than CR validity (criterion: mathematics grades; n = 5,111; r ≈ .5, Δr = –0.02). Effects of format-specific test extensions were very small in both studies. It seems that in PIRLS and TIMSS, reliability and validity do not depend substantially on response formats. Consequently, other response format characteristics (like the cost of development, administration, and scoring) should be considered when choosing between MC and CR. |
DOI of the first publication: | 10.1027/1015-5759/a000338 |
Link to this record: | urn:nbn:de:bsz:291-scidok-ds-269443; hdl:20.500.11880/26920; http://dx.doi.org/10.22028/D291-26944 |
Date of registration: | 22-Dec-2017 |
Third-party funds sponsorship: | This research was prepared with the support of the German funds “Bund-Länder-Programm für bessere Studienbedingungen und mehr Qualität in der Lehre (‘Qualitätspakt Lehre’)” [the joint program of the Federal and States Government for better study conditions and the quality of teaching in higher education (“the Teaching Quality Pact”)] at Saarland University (funding code: 01PL11012). The authors developed the topic and the content of this manuscript independently from this funding. We thank the Institute for School Development Research (IFS) at Technical University Dortmund / the Max Planck Institute for Human Development (MPIB) Berlin / the Standing Conference of the Ministers of Education and Cultural Affairs (KMK) as well as the Research Data Centre (FDZ) at the Institute for Educational Quality Improvement (IQB) for providing the raw data. |
Sponsorship ID: | 01PL11012 |
Faculty: | HW - Fakultät für Empirische Humanwissenschaften und Wirtschaftswissenschaft |
Department: | HW - Bildungswissenschaften |
Collections: | SciDok - Der Wissenschaftsserver der Universität des Saarlandes |
Files for this record:
File | Description | Size | Format |
---|---|---|---|
Schult-Sparfeldt-PIRLS-TIMSS-2016.pdf | Schult & Sparfeldt (2016) Reliability and validity of PIRLS and TIMSS (post-print manuscript) | 494.38 kB | Adobe PDF |
Items in SciDok are protected by copyright, with all rights reserved, unless otherwise indicated.