Please use this identifier to cite or link to this item: doi:10.22028/D291-31939
Title: Speech Synthesis: Text-To-Speech Conversion and Artificial Voices
Author(s): Trouvain, Jürgen
Möbius, Bernd
Editor(s): Brunn, Stanley D.
Kehrein, Roland
Language: English
Book title: Handbook of the Changing World Language Map
Startpage: 1
Endpage: 15
Publisher/Platform: Springer
Year of Publication: 2019
Place of publication: Cham
Publication type: Book Chapter
Abstract: The artificial generation of speech has fascinated mankind since ancient times. The robotic-sounding artificial voices of the last century have now been replaced with more natural-sounding voices based on pre-recorded human speech. Significant progress in data processing has led to qualitative leaps in intelligibility and naturalness. Apart from sizable data from the voice donor, a fully fledged text-to-speech (TTS) synthesizer requires further linguistic resources and components of natural language processing, including dictionaries with information on pronunciation and word prosody, morphological structure, and parts of speech, but also procedures for automatically chunking texts into smaller parts and for morpho-syntactic parsing. TTS technology can be used in many different application domains, for instance, as a communicative aid for those who cannot speak and those who cannot see, and in situations characterized as “hands busy, eyes busy,” often as a part of spoken dialog systems. One big remaining challenge is the evaluation of the quality of synthetic speech output and its appropriateness for the needs of the user. There are also promising developments in speech synthesis that go beyond the pure acoustic channel. Multimodal synthesis includes the visual channel, e.g., in talking heads, whereas silent-speech interfaces and brain-to-speech conversion convert articulatory gestures and brain waves, respectively, to spoken output. Although there has been much progress in quality in the last decade, often achieved by processing enormous amounts of data, TTS today is available only for relatively few languages (probably fewer than 50, with a dominance of English). Thus, a major task will be to find or create linguistic resources and make them available for more languages and language varieties.
DOI of the first publication: 10.1007/978-3-319-73400-2_168-1
URL of the first publication: https://link.springer.com/referenceworkentry/10.1007/978-3-319-73400-2_168-1
Link to this record: hdl:20.500.11880/29547
http://dx.doi.org/10.22028/D291-31939
ISBN: 978-3-319-73400-2
Date of registration: 19-Aug-2020
Faculty: P - Philosophische Fakultät
Department: P - Sprachwissenschaft und Sprachtechnologie
Professorship: P - Prof. Dr. Bernd Möbius
Collections: SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:
There are no files associated with this item.

