Towards wider coverage script knowledge for NLP

Zhai, Fangzhou

Please use this identifier to cite or link to this item: doi:10.22028/D291-41495

Title:	Towards wider coverage script knowledge for NLP
Author(s):	Zhai, Fangzhou
Language:	English
Year of Publication:	2023
DDC notations:	400 Language, linguistics
Publication type:	Dissertation
Abstract:	This thesis focuses on acquiring wide coverage script knowledge. Script knowledge constitutes a category of common sense knowledge that delineates the procedural aspects of daily activities, such as taking a train and going grocery shopping. It is believed to reside in human memory and is generally assumed by all conversational parties. Conversational utterances often omit details assumed to be known by listeners, who, in turn, comprehend these concise expressions based on their shared understanding, with common sense knowledge forming the basis. Common sense knowledge is indispensable for both the production and comprehension of conversation. As outlined in Chapters 2 and 3, Natural Language Processing (NLP) applications experience significant enhancements with access to script knowledge. Notably, various NLP tasks demonstrate substantial performance improvements when script knowledge is accessible, suggesting that these applications are not fully cognizant of script knowledge. However, acquiring high-quality script knowledge is costly, resulting in limited resources that cover only a few scenarios. Consequently, the practical utility of existing resources is constrained due to insufficient coverage of script knowledge. This thesis is dedicated to developing cost-effective methods for acquiring script knowledge to augment NLP applications and expand the coverage of explicit script knowledge. Previous resources have been generated through intricate manual annotation pipelines. In this work, we introduce automated methods to streamline the annotation process. Specifically, we propose a zero-shot script parser in Chapter 5. By leveraging representation learning, we extract script annotations from existing resources and employ this knowledge to automatically annotate texts from unknown scenarios. 
When applied to parallel descriptions of unknown scenarios, the acquired script knowledge proves adequate to support NLP applications, such as story generation (Chapter 6). In Chapter 7, we explore the potential of pretrained language models as a source of script knowledge.
Link to this record:	urn:nbn:de:bsz:291--ds-414959 hdl:20.500.11880/37341 http://dx.doi.org/10.22028/D291-41495
Advisor:	Koller, Alexander
Date of oral examination:	23-Nov-2023
Date of registration:	7-Mar-2024
Faculty:	P - Philosophische Fakultät
Department:	P - Sprachwissenschaft und Sprachtechnologie
Professorship:	P - Prof. Dr. Alexander Koller
Collections:	SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:

File	Description	Size	Format
thesis_p.pdf		3.76 MB	Adobe PDF

This item is licensed under a Creative Commons License