Please use this reference to cite this resource:
doi:10.22028/D291-26944
Files for this record:
File | Description | Size | Format | |
---|---|---|---|---|
Schult-Sparfeldt-PIRLS-TIMSS-2016.pdf | Schult & Sparfeldt (2016), Reliability and validity of PIRLS and TIMSS (post-print manuscript) | 494.38 kB | Adobe PDF | Open/View |
Title: | Reliability and validity of PIRLS and TIMSS: does the response format matter?
Author(s): | Schult, Johannes; Sparfeldt, Jörn R.
Language: | English
In: |
Title: | European Journal of Psychological Assessment
Publisher/Platform: | Hogrefe
Year of publication: | 2016
Free keywords: | Response format; Multiple-choice; Constructed-response; Item response theory; Validity
DDC subject group: | 150 Psychology; 370 Education
Document type: | Journal article
Abstract: | Academic achievements are often assessed in written exams and tests using selection-type (e.g., multiple-choice; MC) and supply-type (e.g., constructed-response; CR) item response formats. The present article examines how MC items and CR items differ with regard to reliability and criterion validity in two educational large-scale assessments with fourth-graders. The reading items of PIRLS 2006 were compiled into MC scales, CR scales, and mixed scales. Scale reliabilities were estimated according to item response theory (international PIRLS sample; n = 119,413). MC showed smaller standard errors than CR around the reading proficiency mean, whereas CR was more reliable for low and high proficiency levels. In the German sample (n = 7,581), there was no format-specific differential validity (criterion: German grades; r ≈ .5; Δr = 0.01). The mathematics items of TIMSS 2007 (n = 160,922) showed similar reliability patterns. MC validity was slightly larger than CR validity (criterion: mathematics grades; n = 5,111; r ≈ .5, Δr = –0.02). Effects of format-specific test extensions were very small in both studies. It seems that in PIRLS and TIMSS, reliability and validity do not depend substantially on response formats. Consequently, other response format characteristics (such as the costs of development, administration, and scoring) should be considered when choosing between MC and CR.
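As background to the abstract's reliability comparison, the following is a minimal sketch of the standard IRT textbook relation, not an estimator quoted from the article itself: the precision of a proficiency estimate varies along the proficiency scale with the test information function, which is how MC and CR scales can trade places in reliability at different proficiency levels.

```latex
% Minimal sketch, assuming the standard IRT relation (not taken from the
% article): test information is the sum of the item information functions,
% and the conditional standard error is its inverse square root.
\[
  I(\theta) \;=\; \sum_{i=1}^{k} I_i(\theta),
  \qquad
  \operatorname{SE}\bigl(\hat{\theta}\bigr) \;=\; \frac{1}{\sqrt{I(\theta)}}
\]
% Items that concentrate information near the proficiency mean (as reported
% for MC) yield a smaller SE there; items that remain informative in the
% tails (as reported for CR) yield a smaller SE at low and high levels.
```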
DOI of the first publication: | 10.1027/1015-5759/a000338
Link to this record: | urn:nbn:de:bsz:291-scidok-ds-269443 hdl:20.500.11880/26920 http://dx.doi.org/10.22028/D291-26944
Date of entry: | 22-Dec-2017
Third-party funds / Funding: | This research was prepared with the support of the German funds “Bund-Länder-Programm für bessere Studienbedingungen und mehr Qualität in der Lehre (‘Qualitätspakt Lehre’)” [the joint program of the Federal and State Governments for better study conditions and the quality of teaching in higher education (“the Teaching Quality Pact”)] at Saarland University (funding code: 01PL11012). The authors developed the topic and the content of this manuscript independently of this funding. We thank the Institute for School Development Research (IFS) at Technical University Dortmund, the Max Planck Institute for Human Development (MPIB) Berlin, and the Standing Conference of the Ministers of Education and Cultural Affairs (KMK), as well as the Research Data Centre (FDZ) at the Institute for Educational Quality Improvement (IQB), for providing the raw data.
Funding code: | 01PL11012
Faculty: | HW - Fakultät für Empirische Humanwissenschaften und Wirtschaftswissenschaft
Department: | HW - Bildungswissenschaften
Collection: | SciDok - Der Wissenschaftsserver der Universität des Saarlandes
All resources in this repository are protected by copyright.