Please use this identifier to cite or link to this item: doi:10.22028/D291-25056
Title: Text skimming as a part in paper document understanding
Author(s): Bleisinger, Rainer
Gores, Klaus-Peter
Language: English
Year of Publication: 1994
OPUS Source: Kaiserslautern ; Saarbrücken : DFKI, 1994
SWD key words: Künstliche Intelligenz
DDC notations: 004 Computer science, internet
Publikation type: Report
Abstract: In our document understanding project ALV we analyse incoming paper mail in the domain of single-sided German business letters. These letters are scanned and after several analysis steps the text is recognized. The result may contain gaps, word alternatives, and even illegal words. The subject of this paper is the subsequent phase which concerns the extraction of important information predefined in our "message type model". An expectation driven partial text skimming analysis is proposed focussing on the kernel module, the so-called "predictor". In contrast to traditional text skimming the following aspects are important in our approach. Basically, the input data are fragmentary texts. Rather than having one text analysis module ("substantiator") only, our predictor controls a set of different and partially alternative substantiators. With respect to the usually proposed three working phases of a predictor - start, discrimination, and instantiation - the following differences are remarkable. The starting problem of text skimming is solved by applying specialized substantiators for classifying a business letter into message types. In order to select appropriate expectations within the message type hypotheses a twofold discrimination is performed. A coarse discrimination reduces the number of message type alternatives, and a fine discrimination chooses one expectation within one or a few previously selected message types. According to the expectation selected substantiators are activated. Several rules are applied both for the verification of the substantiator results and for error recovery if the results are insufficient.
Link to this record: urn:nbn:de:bsz:291-scidok-38979
Series name: Technical memo / Deutsches Forschungszentrum für Künstliche Intelligenz [ISSN 0946-0071]
Series volume: 94-01
Date of registration: 8-Jul-2011
Faculty: SE - Sonstige Einrichtungen
Department: SE - DFKI Deutsches Forschungszentrum für Künstliche Intelligenz
Collections:SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files for this record:
File Description SizeFormat 
TM_94_01.pdf226,9 kBAdobe PDFView/Open

Items in SciDok are protected by copyright, with all rights reserved, unless otherwise indicated.