Please use this identifier to cite or link to this item: doi:10.22028/D291-28109
Title: Augmenting Situated Spoken Language Interaction with Listener Gaze
Author(s): Mitev, Nikolina
Language: English
Year of Publication: 2019
Free key words: Spoken Interaction
DDC notations: 400 Language, linguistics
Publication type: Dissertation
Abstract: Collaborative task solving in a shared environment requires referential success. Human speakers follow the listener’s behavior in order to monitor language comprehension (Clark, 1996). Furthermore, a natural language generation (NLG) system can exploit listener gaze to realize an effective interaction strategy by responding to it with verbal feedback in virtual environments (Garoufi, Staudte, Koller, & Crocker, 2016). We augment situated spoken language interaction with listener gaze and investigate its role in human-human and human-machine interactions. Firstly, we evaluate its impact on the prediction of reference resolution using a multimodal corpus collected in virtual environments. Secondly, we explore if and how a human speaker uses listener gaze in an indoor guidance task while spontaneously referring to real-world objects in a real environment. Thirdly, we consider an object identification task for assembly under system instruction. We developed a multimodal interactive system and two NLG systems that integrate listener gaze into the generation mechanisms. The NLG system “Feedback” reacts to gaze with verbal feedback, either underspecified or contrastive. The NLG system “Installments” uses gaze to incrementally refer to an object in the form of installments. Our results showed that gaze features improved the accuracy of automatic prediction of reference resolution. Further, we found that human speakers are very good at producing referring expressions, and showing listener gaze did not improve performance but elicited more negative feedback. In contrast, we showed that an NLG system that exploits listener gaze benefits the listener’s understanding. Specifically, combining a short, ambiguous instruction with contrastive feedback resulted in faster interactions compared to underspecified feedback, and even outperformed following long, unambiguous instructions.
Moreover, alternating the underspecified and contrastive responses in an interleaved manner led to better engagement with the system and more efficient information uptake, and resulted in equally good performance. Somewhat surprisingly, when gaze was incorporated more indirectly in the generation procedure and used to trigger installments, the non-interactive approach that outputs an instruction all at once was more effective. However, if the spatial expression was mentioned first, referring in gaze-driven installments was as efficient as following an exhaustive instruction. In sum, we provide a proof of concept that listener gaze can effectively be used in situated human-machine interaction. An assistance system using gaze cues is more attentive and adapts to listener behavior to ensure communicative success.
Link to this record: urn:nbn:de:bsz:291--ds-281093
Advisor: Staudte, Maria
Date of oral examination: 6-Jun-2019
Date of registration: 5-Jul-2019
Faculty: P - Philosophische Fakultät
Department: P - Sprachwissenschaft und Sprachtechnologie
Collections: SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:
File: mitev_nikolina_phd_thesis_veröffentlich_compressed.pdf (10,01 MB, Adobe PDF)

Items in SciDok are protected by copyright, with all rights reserved, unless otherwise indicated.