Document analysis at DFKI. - Part 1: Image analysis and text recognition

Ali, Majdi Ben Hadj; Fein, Frank; Hönes, Frank; Jäger, Thorsten; Weigel, Achim

Please use this identifier to cite or link to this resource: doi:10.22028/D291-25005

Title:	Document analysis at DFKI. - Part 1: Image analysis and text recognition
Author(s):	Ali, Majdi Ben Hadj; Fein, Frank; Hönes, Frank; Jäger, Thorsten; Weigel, Achim
Language:	English
Year of publication:	1995
Source:	Kaiserslautern ; Saarbrücken : DFKI, 1995
Controlled keywords:	Artificial intelligence
DDC subject group:	004 Computer science
Document type:	Research report (report on research projects)
Abstract:	Document analysis is an essential contributor to progress in office automation. This paper is part of an overview of the combined research efforts in document analysis at DFKI. Common to all document analysis projects is the global goal of providing a high-level electronic representation of documents in terms of iconic, structural, textual, and semantic information. These symbolic document descriptions enable "intelligent" access to a document database. Currently there are three ongoing document analysis projects at DFKI: INCA, OMEGA, and PASCAL2000/PASCAL+. Though the projects pursue different goals in different application domains, they all share the same problems, which have to be resolved with similar techniques. For that reason the activities in these projects are bundled to avoid redundant work. At DFKI we have divided the problem of document analysis into two main tasks, text recognition and text analysis, which are themselves divided into a set of subtasks. The work of the document analysis and office automation department at DFKI is presented in a series of three research reports. The first report discusses the problem of text recognition, the second that of text analysis. In a third report we describe our concept for a specialized document analysis knowledge representation language. The present report describes the activities dealing with the text recognition task. Text recognition covers the phase that starts with capturing a document image and ends with identifying the written words. This comprises the following subtasks: preprocessing the pictorial information; segmenting into blocks, lines, words, and characters; classifying characters; and identifying the input words. For each subtask several competing solution algorithms, called specialists or knowledge sources, may exist. To control and organize these specialists efficiently, an intelligent situation-based planning component is necessary, which is also described in this report. It should be mentioned that the planning component is responsible for controlling the overall document analysis system, not only the text recognition phase.
Link to this record:	urn:nbn:de:bsz:291-scidok-38150 hdl:20.500.11880/25061 http://dx.doi.org/10.22028/D291-25005
Series:	Research report / Deutsches Forschungszentrum für Künstliche Intelligenz [ISSN 0946-008x]
Volume:	95-02
Date of entry:	5-Jul-2011
Faculty:	SE - Other institutions
Department:	SE - DFKI Deutsches Forschungszentrum für Künstliche Intelligenz
Collection:	SciDok - Der Wissenschaftsserver der Universität des Saarlandes

Files in this record:

File	Description	Size	Format
RR_95_02.pdf		277.81 kB	Adobe PDF


All resources in this repository are protected by copyright.