Please use this identifier to cite or link to this item: doi:10.22028/D291-37904
Title: Modeling intra-textual variation with entropy and surprisal: topical vs. stylistic patterns
Author(s): Degaetano-Ortlieb, Stefania; Teich, Elke
Language: English
Title of the proceedings: Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Publisher/Platform: Association for Computational Linguistics
Year of Publication: 2017
Place of publication: Stroudsburg, PA
Place of the conference: Vancouver, Canada
DDC notations: 400 Language, linguistics
Publication type: Conference Paper
Abstract: We present a data-driven approach to investigate intra-textual variation by combining entropy and surprisal. With this approach we detect linguistic variation based on phrasal lexico-grammatical patterns across sections of research articles. Entropy is used to detect patterns typical of specific sections. Surprisal is used to differentiate between more and less informationally-loaded patterns as well as type of information (topical vs. stylistic). While we here focus on research articles in biology/genetics, the methodology is especially interesting for digital humanities scholars, as it can be applied to any text type or domain and combined with additional variables (e.g. time, author or social group).
DOI of the first publication: 10.18653/v1/W17-22
Link to this record: urn:nbn:de:bsz:291--ds-379049
hdl:20.500.11880/34402
http://dx.doi.org/10.22028/D291-37904
ISBN: 978-1-945626-58-6
Date of registration: 18-Nov-2022
Third-party funds sponsorship: This work is funded by Deutsche Forschungsgemeinschaft (DFG) under grants SFB 1102: Information Density and Linguistic Encoding (www.sfb1102.uni-saarland.de) and EXC 284: Multimodal Computing and Interaction (www.mmci.uni-saarland.de).
Faculty: P - Philosophische Fakultät
Department: P - Sprachwissenschaft und Sprachtechnologie
Professorship: P - Prof. Dr. Elke Teich
Collections: SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:
File: W17-2209.pdf
Size: 573.61 kB
Format: Adobe PDF


This item is licensed under a Creative Commons License.