Please use this identifier to cite or link to this item: doi:10.22028/D291-38657
Title: Jointly Improving Language Understanding and Language Generation with Quality-Weighted Weak Supervision of Automatic Labeling
Author(s): Chang, Ernie
Demberg, Vera
Marin, Alex
Language: English
Publisher/Platform: arXiv
Year of Publication: 2021
DDC notations: 400 Language, linguistics
Publication type: Other
Abstract: Neural natural language generation (NLG) and understanding (NLU) models are data-hungry and require massive amounts of annotated data to be competitive. Recent frameworks address this bottleneck with generative models that synthesize weak labels at scale: a small number of training labels are expert-curated and the rest of the data is annotated automatically. We follow that approach by automatically constructing a large-scale weakly labeled dataset with a fine-tuned GPT-2, and employ a semi-supervised framework to jointly train the NLG and NLU models. The proposed framework adapts the models' parameter updates according to the estimated label quality. On both the E2E and Weather benchmarks, we show that this weakly supervised training paradigm is effective in low-resource scenarios and outperforms benchmark systems on both datasets when 100% of the training data is used.
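For illustration, below is a minimal sketch of the quality-weighting idea described in the abstract: per-example losses on weakly labeled data are scaled by an estimated label-quality score before the gradient step, so noisy weak labels contribute less to the parameter update. The function and variable names (quality_weighted_loss, quality_scores) are hypothetical, and the paper's actual weighting and quality-estimation scheme may differ.

import torch
import torch.nn as nn

def quality_weighted_loss(logits, targets, quality_scores):
    """Cross-entropy averaged with per-example weights given by
    estimated label quality (all names here are illustrative)."""
    # Per-example loss, no reduction yet.
    per_example = nn.functional.cross_entropy(logits, targets, reduction="none")
    # Normalize quality scores so they act as a weighted average.
    weights = quality_scores / quality_scores.sum()
    # Low-quality weak labels are down-weighted in the update.
    return (weights * per_example).sum()

# Toy usage: 4 weakly labeled examples over 3 classes.
logits = torch.randn(4, 3, requires_grad=True)
targets = torch.tensor([0, 2, 1, 0])
quality = torch.tensor([0.9, 0.4, 0.7, 0.2])  # e.g. from a label-quality estimator
loss = quality_weighted_loss(logits, targets, quality)
loss.backward()
print(loss.item())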
URL of the first publication: https://arxiv.org/abs/2102.03551
Link to this record: urn:nbn:de:bsz:291--ds-386571
hdl:20.500.11880/34848
http://dx.doi.org/10.22028/D291-38657
Date of registration: 4-Jan-2023
Notes: Preprint
Faculty: MI - Fakultät für Mathematik und Informatik
Department: MI - Informatik
Professorship: MI - Prof. Dr. Vera Demberg
Collections: SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:
There are no files associated with this item.


Items in SciDok are protected by copyright, with all rights reserved, unless otherwise indicated.