
© 2009 ABBYY. All rights reserved.
Documents of various types can be processed in a single batch, and you can set up the system to
process documents of mixed types. The document type affects only the way the document template
is created; it does not affect the work of an operator.
Let us analyze the various document types that can be processed using ABBYY FlexiCapture 8.0
Professional.
Structured documents. Structured documents are documents containing a certain number of
specific data fields whose location and marking are identical on all copies of the document. Such
documents are called “fixed forms”. Questionnaires, surveys, and application forms are usually
fixed forms and are typically paper forms that must be filled out by hand. To identify a specific
form from a flow of various documents and to extract data from such a form, you need to create
a uniform fixed template that tells the program where to find the necessary data fields. Certain
fixed forms are processed more efficiently because they were created to meet specific data
capture requirements. Such forms are called “machine-readable forms”. ABBYY FormDesigner
8.0 is an efficient tool for creating machine-readable forms and is supplied together with
ABBYY FlexiCapture 8.0 Professional. For more information on creating forms using ABBYY
FormDesigner 8.0, please read the ABBYY FormDesigner 8.0 Help file and other
documentation. In this guide, the main steps for creating a template are described using
structured documents as the example.
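The idea of a fixed template can be illustrated with a minimal Python sketch. It is not the template format or API of ABBYY FlexiCapture or ABBYY FormDesigner; the names `FieldRegion`, `FIXED_TEMPLATE`, and `extract_fixed` are hypothetical, and the page is assumed to be already recognized into words with coordinates. The sketch only shows why identical field locations on every copy make extraction straightforward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldRegion:
    """A rectangular area of the page, in pixels (hypothetical model)."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, x: int, y: int) -> bool:
        return self.left <= x <= self.right and self.top <= y <= self.bottom


# Hypothetical fixed template: every field has the same location on all copies.
FIXED_TEMPLATE = {
    "last_name": FieldRegion(100, 200, 400, 240),
    "birth_date": FieldRegion(100, 260, 300, 300),
}


def extract_fixed(words, template):
    """Collect recognized words that fall inside each field region.

    `words` is a list of (text, x, y) tuples, assumed to come from a
    recognition step that is outside the scope of this sketch.
    """
    result = {}
    for name, region in template.items():
        inside = [text for text, x, y in words if region.contains(x, y)]
        result[name] = " ".join(inside)
    return result


if __name__ == "__main__":
    page = [("Smith", 120, 220), ("1980-01-02", 110, 280), ("noise", 500, 500)]
    print(extract_fixed(page, FIXED_TEMPLATE))
```

Because the regions never move, the same template applies to every copy of the form; a word outside all regions is simply ignored.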
Semi-structured documents. These are documents containing a number of data fields whose
quantity, marking, and location may vary on different copies of the document. For example,
invoices are semi-structured documents because invoices received from different companies
often differ with regard to the number of data fields and their format. All invoices have an
invoice number and a total payment amount, but these data fields may be located in different
places on the document. To identify semi-structured documents and to extract data from them,
ABBYY FlexiCapture 8.0 Professional uses flexible templates (FlexiLayouts). To create a
flexible template, you must use the special ABBYY FlexiLayout Studio module. For detailed
information about this module, please refer to the ABBYY FlexiLayout Studio Help file and
User Guide. Flexible document processing differs from fixed document processing only with
regard to the creation and attachment of a template.
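The contrast with a fixed template can be sketched in Python as follows. This is an illustrative assumption about how a field might be located relative to a keyword rather than at fixed coordinates; it is not how FlexiLayouts are defined in ABBYY FlexiLayout Studio, and the function name `find_field_by_keyword` and the `line_tolerance` parameter are hypothetical.

```python
def find_field_by_keyword(words, keyword, line_tolerance=10):
    """Return the word closest to the right of `keyword` on the same line.

    `words` is a list of (text, x, y) tuples from an assumed recognition
    step. Two words are treated as being on the same line when their y
    coordinates differ by at most `line_tolerance` pixels.
    """
    anchors = [(x, y) for text, x, y in words if text.lower() == keyword.lower()]
    if not anchors:
        return None  # keyword not present on this page
    anchor_x, anchor_y = anchors[0]
    candidates = [
        (x, text)
        for text, x, y in words
        if x > anchor_x and abs(y - anchor_y) <= line_tolerance
    ]
    if not candidates:
        return None
    # The candidate with the smallest x is the nearest one to the right.
    return min(candidates)[1]


if __name__ == "__main__":
    # Two invoices with the total amount in different places on the page.
    invoice_a = [("Total:", 50, 700), ("123.45", 150, 702)]
    invoice_b = [("Total:", 300, 400), ("99.00", 380, 398), ("Thanks", 50, 600)]
    print(find_field_by_keyword(invoice_a, "Total:"))
    print(find_field_by_keyword(invoice_b, "Total:"))
```

The same search rule finds the total on both invoices even though its position differs, which is the property that a fixed set of coordinates cannot provide.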
Non-structured documents. ABBYY FlexiCapture 8.0 Professional can be used to process non-
structured documents such as contracts, letters, or orders, where the information is presented in a
free-form style. The program can automatically identify non-structured documents as
attachments to fixed or semi-structured documents, or with the help of a flexible template. These
documents can then be exported to searchable PDF files or to graphical files. Data from the index
fields of non-structured documents can be extracted manually, or automatically using a flexible
template. A typical scenario in which non-structured documents are processed is the conversion
of a paper archive into electronic format and the extraction of several index fields for subsequent
quick attribute search.
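The archive scenario can be pictured with a small Python sketch of attribute search over index fields. The field names, file names, and the `build_index` function are hypothetical and serve only as an illustration; the sketch does not describe how ABBYY FlexiCapture stores or exports index data.

```python
from collections import defaultdict


def build_index(documents):
    """Map each (field name, value) pair to the set of documents that carry it.

    `documents` maps a document identifier to its extracted index fields.
    """
    index = defaultdict(set)
    for doc_id, fields in documents.items():
        for name, value in fields.items():
            index[(name, value)].add(doc_id)
    return index


if __name__ == "__main__":
    # Hypothetical archive: scanned contracts with two index fields each.
    archive = {
        "scan_001.pdf": {"contract_no": "A-17", "year": "2008"},
        "scan_002.pdf": {"contract_no": "B-02", "year": "2008"},
    }
    index = build_index(archive)
    print(sorted(index[("year", "2008")]))
```

A lookup by attribute then costs one dictionary access instead of a scan through every archived document.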
The following chapters describe how to set up ABBYY FlexiCapture 8.0 Professional to process
documents of various types: how to automate the data capture process, how to improve recognition
quality, and how to organize data export.
The capture of structured documents is described in the greatest detail, and all processing stages
are illustrated using fixed forms. The specifics of other document types, and the differences in
creating templates for them, are also explained.