Please use this reference to cite this resource:
doi:10.22028/D291-26944
Files for this record:
File | Description | Size | Format | |
---|---|---|---|---|
Schult-Sparfeldt-PIRLS-TIMSS-2016.pdf | Schult & Sparfeldt (2016), Reliability and validity of PIRLS and TIMSS (post-print manuscript) | 494.38 kB | Adobe PDF | Open/View |
Title: | Reliability and validity of PIRLS and TIMSS: does the response format matter?
Author(s): | Schult, Johannes; Sparfeldt, Jörn R.
Language: | English
In: |
Title: | European Journal of Psychological Assessment
Publisher/Platform: | Hogrefe
Year of publication: | 2016
Free keywords: | Response format; Multiple-choice; Constructed-response; Item response theory; Validity
DDC subject group: | 150 Psychology; 370 Education
Document type: | Journal article
Abstract: | Academic achievements are often assessed in written exams and tests using selection-type (e.g., multiple-choice; MC) and supply-type (e.g., constructed-response; CR) item response formats. The present article examines how MC items and CR items differ with regard to reliability and criterion validity in two educational large-scale assessments with fourth-graders. The reading items of PIRLS 2006 were compiled into MC scales, CR scales, and mixed scales. Scale reliabilities were estimated according to item response theory (international PIRLS sample; n = 119,413). MC showed smaller standard errors than CR around the reading proficiency mean, whereas CR was more reliable for low and high proficiency levels. In the German sample (n = 7,581), there was no format-specific differential validity (criterion: German grades; r ≈ .5; Δr = 0.01). The mathematics items of TIMSS 2007 (n = 160,922) showed similar reliability patterns. MC validity was slightly larger than CR validity (criterion: mathematics grades; n = 5,111; r ≈ .5, Δr = –0.02). Effects of format-specific test extensions were very small in both studies. It seems that in PIRLS and TIMSS, reliability and validity do not depend substantially on response formats. Consequently, other response format characteristics (such as the costs of development, administration, and scoring) should be considered when choosing between MC and CR.
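As background to the abstract's reliability comparison, the following is a minimal sketch of the standard IRT textbook relation, not an estimator quoted from the article itself: the precision of a proficiency estimate varies along the proficiency scale with the test information function, which is how MC and CR scales can trade places in reliability at different proficiency levels.

```latex
% Minimal sketch, assuming the standard IRT relation (not taken from the
% article): test information is the sum of the item information functions,
% and the conditional standard error is its inverse square root.
\[
  I(\theta) \;=\; \sum_{i=1}^{k} I_i(\theta),
  \qquad
  \operatorname{SE}\bigl(\hat{\theta}\bigr) \;=\; \frac{1}{\sqrt{I(\theta)}}
\]
% Items that concentrate information near the proficiency mean (as reported
% for MC) yield a smaller SE there; items that remain informative in the
% tails (as reported for CR) yield a smaller SE at low and high levels.
```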
DOI of the first publication: | 10.1027/1015-5759/a000338
Link to this record: | urn:nbn:de:bsz:291-scidok-ds-269443 hdl:20.500.11880/26920 http://dx.doi.org/10.22028/D291-26944
Date of entry: | 22-Dec-2017
Third-party funds / Funding: | This research was prepared with the support of the German funds “Bund-Länder-Programm für bessere Studienbedingungen und mehr Qualität in der Lehre (‘Qualitätspakt Lehre’)” [the joint program of the Federal and State Governments for better study conditions and the quality of teaching in higher education (“the Teaching Quality Pact”)] at Saarland University (funding code: 01PL11012). The authors developed the topic and the content of this manuscript independently of this funding. We thank the Institute for School Development Research (IFS) at Technical University Dortmund, the Max Planck Institute for Human Development (MPIB) Berlin, and the Standing Conference of the Ministers of Education and Cultural Affairs (KMK), as well as the Research Data Centre (FDZ) at the Institute for Educational Quality Improvement (IQB), for providing the raw data.
Funding code: | 01PL11012
Faculty: | HW - Fakultät für Empirische Humanwissenschaften und Wirtschaftswissenschaft
Department: | HW - Bildungswissenschaften
Collection: | SciDok - Der Wissenschaftsserver der Universität des Saarlandes
All resources in this repository are protected by copyright.