INDICIUS 6.0 for KOFAX Capture Getting Started Guide (Free-Form) 10300775-000 Rev 1.
© 2006-2008 Kofax, Inc., 16245 Laguna Canyon Road, Irvine, California 92618, U.S.A. All rights reserved. Portions, copyright 1997-2006 Kofax Development UK Ltd. All Rights Reserved. Use is subject to license terms. Third-party software is copyrighted and licensed from Kofax’s suppliers. This product is protected by U.S. Patent No. 5,159,667. THIS SOFTWARE CONTAINS CONFIDENTIAL INFORMATION AND TRADE SECRETS OF KOFAX, INC.
Contents
How to Use This Guide
    Introduction
    How do I Use This Guide?
    Related Documentation
    Data Extraction
2 Installing INDICIUS
    Introduction
    Installing INDICIUS for the First Time
    Regular Expression
5 Configuring Recognition
    Overview
    Recognition Configuration Methodology
    Assigning Scripted Validations
    Data Transformation Using Scripts
    Setting Autoskip Functionality
    Viewing the Completion Script
    Assigning the Template to a Batch Class
How to Use This Guide

Introduction

This guide introduces INDICIUS and describes how it is used to process free-form (semi-structured or unstructured) documents. It starts with brief installation instructions, followed by a tutorial. The tutorial guides you through processing batches using the pre-installed Solicitors Letters example.
Related Documentation

The following documentation is included with INDICIUS. To open a PDF guide, click Start on the taskbar to display the menu and select All Programs | INDICIUS | Documentation. The INDICIUS Help can be opened from the same menu or from the Help menu within the tools. Pressing F1 within Definer and Script Editor opens the topic for the feature being used.

Installation Guide (.
Getting Started (Fixed-Form) (.pdf)

The Getting Started Guide (Fixed-Form) (.pdf) focuses on configuring a solution to extract data from fixed-form (structured) documents. The guide explains:
- How to extract data from single page documents of a known document type, using the installed Order Forms example.
- How the tools, concepts and configuration files relate to the setup in Kofax Capture.
- How to replicate the Order Forms configuration by following detailed procedures.
INDICIUS Help

The INDICIUS Help is written for those configuring a solution and for system administrators. It assumes that readers have read the Getting Started Guides or attended an INDICIUS training course, so that it can provide accurate and detailed information across every aspect of the product. The INDICIUS Help explains:
- How to configure the INDICIUS modules to process a document set.
Chapter 1 Overview

Introduction

This chapter introduces some of the concepts of data capture and key points of INDICIUS.

What Does INDICIUS Add to Kofax Capture?

INDICIUS is a set of modules that provide additional automatic recognition (classification, separation and extraction) as well as advanced keying (indexing and validation) functionality to Kofax Capture. Kofax Capture scans paper-based documents, creating a series of scanned image files.
Table 1-1. INDICIUS Modules

Module: Recognition
Function: Automatically classifies and extracts data from each document, searching intelligently for data when necessary. May also automatically separate a stream of pages into documents. The standard instance of Recognition is installed by default. Additional instances can be registered with Kofax Capture using an AEX file.
Table 1-3. INDICIUS Tools

Tool: BatchCompare
Function: A utility for detailed evaluation of classification and separation performance.

Tool: Definer
Function: Configures Recognition (in conjunction with Transformation Studio) and Correction for a specific document set.

Tool: Document Review Project Editor
Function: Configures Document Review for a specific document set. Also configures document separation (in conjunction with Transformation Studio).
Low Volume, Single Station Environment

In lower volume environments it is possible to run batches through all the modules on a single station, using Kofax Capture Batch Manager.

Free-form Processing Overview

Traditional data capture solutions rely upon the presence of a template to specify the location of the data fields to be captured from each document. If there are multiple document types then a separate template must be defined for each type.
the presence of one or more dates and monetary amounts. The document classification can be used, for example, to automatically route the document to the correct location within a workflow system.

Document Separation

As an extension of classification, multi-page documents can be separated automatically by using free-form technology to detect the beginning and end of each separate document in a stream of scanned images.
Chapter 2 Installing INDICIUS

Introduction

This chapter provides instructions for installing INDICIUS using the installation wizard (standard installation). To install INDICIUS the following items are required:
1 A computer satisfying the system requirements as described in the Installation Guide (.pdf).
2 An INDICIUS installation CD.
3 A Kofax Capture license hardware key with INDICIUS features enabled.

Note: Kofax Capture must be pre-installed and licensed.
Installing INDICIUS for the First Time

Standard Installation

To install INDICIUS
1 Place the INDICIUS installation CD into the CD-ROM drive. The main installation screen will display.
2 Select INDICIUS and follow the on-screen instructions.
3 To install Document Review, select INDICIUS Document Review and follow the on-screen instructions.
4 To install Transformation Studio, select Transformation Studio and follow the on-screen instructions.
Chapter 3 Processing Tutorial

The Solicitors’ Letters Example

The INDICIUS installation includes an example configuration that demonstrates some of the features of INDICIUS. The example uses a pre-defined batch class configured to capture data from a set of example images, as well as configuration files for Recognition, Completion and Scripted Export. The example uses the following modules: Kofax Capture Scan, INDICIUS Recognition, INDICIUS Completion and INDICIUS Scripted Export.
Figure 3-1.
Installing the Example Batch Class

Installation

To install the example batch class
1 Start Administration by clicking Start on the taskbar to display the menu, and selecting: All Programs | Kofax Capture 8.0 | Administration.
2 Select File | Import… to display a file selection window.
3 Select the following batch class: \examples\Solicitors Letters\INDICIUS Solicitors Letters Example.cab.
4 Click Open to unpack the batch class.
Troubleshooting

In a client-server installation of Kofax Capture, an error may be generated when the batch class is published. The batch class is configured to store images in the Kofax Capture images folder, and an error is raised if this folder exists elsewhere on the network. If this is the case, use the procedure below to specify a different images folder.

To specify a different images folder
1 On the Batch panel, select the “INDICIUS Solicitors Letters” batch class.
Viewing the Modules

To view the modules included in the example batch class
1 On the Batch panel, select the “INDICIUS Solicitors Letters” batch class.
2 Right click on the selection to display the menu, and select Properties.
3 The Batch Class Properties window is displayed.
4 Select the Queues tab. The modules included in the batch class are displayed in the Selected Queues list.
5 Click OK.
Running the Solicitors Letters Example

The example can be tested on a batch of example images. The typical test process (which you will work through later) is as follows:
Step 1: Create a batch.
Step 2: Add images to the batch and establish document boundaries using Kofax Capture Scan.
Step 3: Run the batch through INDICIUS Recognition to produce raw captured data.
Step 4: Correct, complete and validate the batch with INDICIUS Completion.
Step 5: Reformat the data using INDICIUS Scripted Export.
Figure 3-2. Create Batch Window

4 Enter a name for the new batch in the “Name:” box, for example “Solicitors Letters 1”.
5 Click Save.
6 Click Close. Your batch is displayed in the list. The Queue column indicates that the batch is ready to be processed by Kofax Capture Scan.

To import images and assemble documents
1 Make sure the name of the new batch is highlighted and select File | Process Batch or click Process Batch on the toolbar.
5 Select Batch | Close and click Yes to the message box. In Batch Manager, the Queue column indicates that the batch is ready to be processed by INDICIUS Recognition.

Note: The Solicitors Letters example processes only a single type of document, using the Scan module to establish fixed document boundaries before the first INDICIUS module.

To recognize the data, click Process Batch on the toolbar in Batch Manager. Recognition will automatically begin processing the batch.
To complete the data
1 Click Process Batch on the toolbar in Batch Manager.
2 Wait for the batch to be automatically loaded into Completion. Every document is displayed in this example. In production, only documents with missing or invalid data would be displayed. Document 1 has no errors.

Figure 3-3. Completion Window for the first document in the Example Batch

3 Use Tab/Shift+Tab to navigate around the fields on Document 1.
4 Press F12 to move to the next document.
Figure 3-4. Completion Window Displaying a Rejected Field

5 Press Enter to accept this rejected value. As this document has no further errors, you will automatically move to the next document.
6 Repeat for the remaining documents, pressing F12 to progress from documents with no rejected fields and Enter to accept the dates on Documents 3 and 5.
Note: The focus area changes depending on where the data has been found. The “Transfer Date” field has a large focus area on this document. This is because the target searched for is actually the word “on” or “at” followed by the date, and on this image the value wraps across two lines. The focus box surrounds the whole area.
To export the batch, click Process Batch on the toolbar in Batch Manager. Scripted Export will automatically begin processing the batch. Information messages will be displayed and the “Docs Processed” should increment. When “Docs Processed” reaches 9, Scripted Export will close and the batch is complete. In a production system Kofax Capture Release would run after Scripted Export, copying the data stored in index fields to the back-end system.
6 Click Open to import the images and establish document boundaries.
7 Select Batch | Close and click Yes to the message box. The Create Batch window will display again.
8 Click Cancel to finish creating batches.
9 Click Batch | Exit to close Kofax Capture Scan.

To recognize the data
1 Open Recognition by clicking Start on the taskbar to display the menu, and selecting: All Programs | INDICIUS | Recognition.
2 Select Session | Select Batch.
To export the batch
1 Open Scripted Export by clicking Start on the taskbar to display the menu, and selecting: All Programs | INDICIUS | Scripted Export.
2 Select Session | Select Batch.
3 Select your batch from the list.
4 Click OK. Scripted Export will begin processing the batch. Information messages will be displayed and the “Docs Processed” should increment. When “Docs Processed” reaches 9, the status bar will display “Idle”.
5 Select Session | Exit to close Scripted Export.
Chapter 4 Configuration Concepts

Processing Configuration and Document Configuration

The behavior of INDICIUS when processing a particular document set is controlled by a set of configuration files as well as settings within the batch class. Configuration is done using both Kofax Capture Administration and the set of specialized INDICIUS configuration tools.
Configuration Files

The following table specifies the configuration files used by each INDICIUS module, whether they are mandatory for free-form extraction and which tools are used to generate them.

Table 4-4. Configuration Files (Free-Form Extraction, Single Document Type)

Module: Recognition
Configuration File: Definition File (.idf) - mandatory. Specifies the options for the full-page read, which provides the raw data used by the free-form technology. Generated by Definer.
Configuration File: Script File (.
Module: Verification
Configuration File: Template File (.kfi) - mandatory. Specifies the fields to be verified and behavior of the interface. Also specifies the validation to be used for each field. Generated by Template Editor.
Configuration File: Script File (.ifv) - optional. Used to transform the data when it is output. Also used to validate data as it is keyed. Generated by Script Editor.

Module: Scripted Export
Configuration File: Script File (.
4 Click OK.
5 Repeat the previous steps for Completion and Scripted Export to view the configuration files.
Some Important Terminology

This section details specific terminology used by INDICIUS.

Field
Fields can be of the following types: single line text, multi-line text, mark sense, mark grid, table or barcode. In most free-form processing configurations, you only require text fields to generate full-page reads. Occasionally you may require a field to read an area smaller than the full page. The raw data from these fields is used by the searches (see below).
Target
A target is the data being searched for.

Anchor
An anchor is a heading/keyword near which one or more targets will be found.

Target Search
A target search looks for targets independently of anchors. This is used when the target being sought is a fixed value or of a fixed pattern and generally has a strong validation.

Anchor-Target Search
An anchor-target search looks for targets in close proximity to anchors, in other words, targets near headings or keywords.
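The difference between the two search types can be illustrated outside INDICIUS with a minimal Python sketch. It is not the product's implementation: INDICIUS works with positional zones on the page image, whereas this sketch approximates "close proximity" with a character window in a single line of text. The function names, the `window` parameter and the sample text are all hypothetical.

```python
import re

def target_search(text, target_pattern):
    """Target search: return every match of the pattern, regardless of context."""
    return [m.group(0) for m in re.finditer(target_pattern, text)]

def anchor_target_search(text, anchor_pattern, target_pattern, window=15):
    """Anchor-target search: keep targets that start within `window`
    characters after the end of an anchor (a stand-in for a search zone)."""
    targets = list(re.finditer(target_pattern, text))
    results = []
    for anchor in re.finditer(anchor_pattern, text):
        for target in targets:
            if anchor.end() <= target.start() <= anchor.end() + window:
                results.append(target.group(0))
    return results

# Made-up sample text: two eight-digit numbers, only one near the anchor.
sample = "Ref 12345678. Account No: 87654321 dated 01/02/2008"
all_targets = target_search(sample, r"\b\d{8}\b")
near_anchor = anchor_target_search(sample, r"Account No", r"\b\d{8}\b")
print(all_targets)   # both eight-digit numbers
print(near_anchor)   # only the number that follows the anchor
```

The target search returns both numbers because the pattern alone cannot tell them apart; the anchor-target search returns only the one that follows the heading.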
Chapter 5 Configuring Recognition

Overview

Recognition fields and free-form searches are configured using Definer. The main view in Definer is used to configure the fields to read; the Free-form Processing window is used to configure the data searches. Definer creates both a definition file (containing the fields) and a Recognition script file (containing the searches). Both files need to be assigned in the Recognition setup dialog.
Step 2: Search for Data
During the second stage searches are run on the recognized text in order to extract data. Before the search results are returned they can be formatted and validated using free-form technology.

Step 3: Analysis and Output of Data
Further analysis of the search results can be performed using free-form technology, followed by the output of results to current or new fields in the XML.

The following diagram shows these stages in more detail.

Figure 5-5.
Important: If any of these stages fails or is inaccurate, the stages that follow may also fail or be inaccurate. It is therefore important to ensure that every stage of the process operates at its best, particularly the earlier ones.
Creating the Definition File

Field reads can be captured by the full page engine, the standard engine or using multi-voting. Multiple fields can be used in searches, and fields can span multiple pages in a document. The steps to configure the field in the example follow. In this case, a full page read is used. For detailed information on creating definition files and configuring fields, refer to the INDICIUS Help.
Creating the Recognition Script File

The Recognition script file is configured using the Free-form Processing window in Definer. Unlike the configuration of definition files, free-form scripts are based on a Recognition output data file, not an image. This XML data file can be generated within Definer or loaded if it has already been created.
7 Exceptions Properties Tab
8 Text Read Properties
9 Results Tab
10 Image Tab
11 Script View/Edit Tab

In order to test searches during configuration, Recognition output data needs to be loaded. This data can be created automatically within Definer.

2 Create a data file from the loaded definition file and image by clicking the Create Recognition Output File button. The loaded definition file and image will automatically be selected in the window, along with a default data filename.
Figure 5-8. New Search Window

2 Enter “Date” as the name of the search. This uniquely identifies the set of SearchOptions (the options in script which define the search). The “Fields” panel lists the fields to base the search on. Data from these fields will be used in the search for the date. By default, the fields listed will be those in the definition file in Definer, but they can also be imported from Recognition output files or can be added manually by clicking Add Field.
Figure 5-9. Free-form Processing Window with the Date Search
1 List of searches
2 Test Current Search

Note: Field names are case-sensitive.

Once a search is created, the fields it is based on can be edited by selecting the Text Read tab on the right and clicking Select Fields. It is possible to set the area within which the search is run. If any of the fields selected are not in this area, the text from them will not be used when running the search.
To configure a target search for the date
1 Select the Target property on the Main panel on the right and click on the button to open the integrated Regular Expression Builder.

Figure 5-10.
Note that numeric dates must have one of /\- or . as a separator. Dates including alpha months can also have spaces.
4 Click OK to close the builder. The Target property is automatically populated with the regular expression. You will need to ensure the regular expression does not fail because of spaces in the data (numeric values can be variably spaced and additional spaces can occasionally be found).
5 On the Basic tab under Target, set the value of the Ignore Spaces property to True.
Figure 5-11. Free-form Processing Window with Results
1 Results shown on image
2 Search details
3 Information on the result selected in 4
4 Search results (list of results found)

When a result is selected on the Search Results panel, it is also shown and selected on the image (with a black border) and information on the result is shown in the information panel.

Figure 5-12.
3 Page that the match is found on
4 Position of match
5 Confidence character for each character of the match (1 = confident, 0 = unconfident/rejected)

To set an exception
In this example, two dates have been found. In order to eliminate the second value, exceptions need to be defined. If a target is found near an exception heading or keyword, the target will be disregarded.
As the exceptions “on” and “at” will always precede the date, the exception zones need to be changed to search only to the left of the target matches.
4 On the Exceptions tab, select the Target Exception Zones property.
5 Click on the button to open the Target Exception Zones window.

Figure 5-14.
It will turn white. Only zone 4 should be on (green) because the exception always precedes the target for this particular search.

Figure 5-15. Select Search Zones

7 Click OK to close the window and save the changes to the new property.
8 Run the search again. Only a single zone will be drawn on the image, to the left of each date.
9 Save the script file as: \config\MySolicitorsLetters.ifv.

Note: \b indicates a word boundary.
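The effect of an exception to the left of a target, and of the \b word boundary, can be sketched in Python. This is an illustration only, not the INDICIUS implementation: the numeric date pattern, the `zone` width in characters and the sample sentence are assumptions, and the product evaluates exception zones on the page image, not on character offsets.

```python
import re

# Hypothetical numeric-date pattern using the separators named in the guide: / \ - .
DATE = re.compile(r"\b\d{1,2}[/\\.-]\d{1,2}[/\\.-]\d{2,4}\b")
# Exceptions "on" and "at" as whole words only.
EXCEPTION = re.compile(r"\b(on|at)\b")

def dates_without_exceptions(line, zone=3):
    """Drop any date that has an exception word ending within `zone`
    characters to its left; keep all other dates."""
    exception_ends = [e.end() for e in EXCEPTION.finditer(line)]
    kept = []
    for date in DATE.finditer(line):
        if not any(0 <= date.start() - end <= zone for end in exception_ends):
            kept.append(date.group(0))
    return kept

# Made-up sample sentence with a letter date and a transfer date.
line = "Dated 12/03/2008. Funds were transferred on 15/03/2008."
kept_dates = dates_without_exceptions(line)
print(kept_dates)
```

Without \b, the pattern (on|at) would also match the "at" inside "Dated" and wrongly discard the first date; with the word boundaries only the standalone word "on" counts as an exception.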
To test on other documents
1 Select Data | Create Recognition Output File.
2 To the right of the “Image File” box, click Select.
3 Select the following image: \examples\Solicitors Letters\images\Solicitor2.tif.
4 Click Yes at the prompt to change the XML file name.
5 Click Create Output File.
6 Click OK then Close to exit the window.
7 Click Yes to load the data.
8 Run the search and check the results.
3 Click OK to add the search.
4 On the Main panel, set the value of the Search Type property to “2 – Anchor–Target”.
5 For the Anchor property, enter all the different values for the anchor separated by pipes (|), as shown below: Account Number|Account No|Roll Number|Roll No.
6 Select the Target property and click the button to open the Regular Expression Builder.
7 Build up an expression by adding the following:

Table 5-5.
Figure 5-16. Anchor-Target Search Results

11 Save the script.

Note: Anchors can be entered with or without spaces. In this case, they are entered with spaces. If no spaces are entered, the property Ignore Spaces should be changed to True for the anchor. This is set on the Basic panel. By default, the target is searched for below or to the right of the anchor. This can be changed by editing the Anchor-Target Zones.
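The pipe-separated anchor list behaves like a regular expression alternation. The Python sketch below shows that idea and one plausible reading of Ignore Spaces (removing spaces from both the anchors and the text before matching). How INDICIUS applies Ignore Spaces internally is not described here, so that part is an assumption, and `find_anchor` is a hypothetical helper.

```python
import re

# Anchor values exactly as entered in the procedure above.
ANCHORS = "Account Number|Account No|Roll Number|Roll No"

def find_anchor(text, anchors=ANCHORS, ignore_spaces=False):
    """Return the first anchor found in the text, or None.
    With ignore_spaces, spaces are stripped from both sides first
    (an assumed interpretation of the Ignore Spaces property)."""
    if ignore_spaces:
        text = text.replace(" ", "")
        anchors = anchors.replace(" ", "")
    match = re.search(anchors, text)
    return match.group(0) if match else None

exact = find_anchor("Your Account Number is 12345")
missed = find_anchor("Roll  No 778")                      # two spaces in the data
found = find_anchor("Roll  No 778", ignore_spaces=True)   # spacing no longer matters
print(exact, missed, found)
```

In Python's regular expression engine the alternatives are tried left to right, which is why listing "Account Number" before "Account No" returns the longer heading when both would match.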
The account number anchor is found twice but no target is found. This is because the target falls outside the zone (shown with a blue rectangle).
3 To rectify this, select the Anchor-Target Zones property on the Main panel and click the button to open the Anchor-Target Zones window.
4 Change the zone width to 4cm. This is the width of any selected zones to the left and right of the anchor.
5 Click OK to save the property and close the window.
3 Click OK to add the search. The Search Type will default to “1 – Target.”
4 Select the Target property.
5 Click the button to open the Regular Expression Builder.
6 Enter a regular expression for the date preceded by either “on” or “at.”

Table 5-6. Entering a Regular Expression

Element: “on” or “at”
How to Enter: Enter (on|at) directly into the regular expression, with no spaces.
Figure 5-17. Anchor-Target Search Results for the Transfer Date

10 The transfer date is found, but will need formatting to remove the “on” from it. This will be covered in the later section Formatting and Validating Matches.
11 Save the script.

To test the search on other documents
1 Load the Recognition output file for image Solicitor2.tif.
2 Test the search and check the results. No value is found. This is because the “at” is on one line and the date on the next.
6 Repeat the above steps for the Recognition output files for the images Solicitor3.tif to Solicitor9.tif.
7 Reload the Recognition output file for image Solicitor1.tif.
8 Run the search and check the results (as the configuration changed in the steps above).

Configure the “Zip Code” Field

To create and test a target search for the zip code
1 Create a new search by clicking New Search.
2 Name the search “ZipCode” and base it on the field “Page1.
10 Build up an expression by adding the following:

Table 5-8. Entering a Regular Expression

Element: \b
How to Enter: Enter \b directly into the regular expression

Element: Numerical value of length 5
How to Enter: Select “Numeric” and specify a “Fixed length” of 5 and click Insert

Element: \b
How to Enter: Enter \b directly into the regular expression

The regular expression created should be: \b\d{5}\b

11 Click OK to close the window and save the property.
Figure 5-18. Anchor-Target Search Results for Zip Code

Two zip codes are found. The first is the address of the company processing the correspondence. The next section will describe how to ignore this using a validate function.

15 Test the search on the Recognition output files for each of the remaining images, Solicitor2.tif to Solicitor9.tif.
16 Save the script.
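The behavior of the zip code expression \b\d{5}\b can be checked outside Definer with a short Python sketch. The sample text is made up for illustration; only the regular expression itself comes from the procedure above.

```python
import re

ZIP = re.compile(r"\b\d{5}\b")  # the expression built in Table 5-8

# Made-up text with two five-digit codes, a one-digit house number
# and a seven-digit reference.
text = ("Harris LLP, 1 Main Street, Boston MA 02111. Ref 1234567. "
        "Client address: Springfield IL 62704")
zip_codes = ZIP.findall(text)
print(zip_codes)
```

The word boundaries stop the pattern from matching five digits inside the longer reference number, so only the two standalone five-digit values are returned. Both are still reported, which is why a further validation step is needed to discard the processing company's own zip code.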
Formatting and Validating Matches

Prior to matches being returned in the match collections, they can be formatted and validated. Setting up formats and validations is done in two stages:
Step 1: Set the properties to run the format or validation script on the search.
Step 2: Write the format and validation scripts.
The following sections describe these steps in more detail.
Though some areas within the script may need editing in order to implement custom validations, the InitialiseSearchOptions function should not be edited.

Formatting Matches

Matches can be formatted prior to being validated (optional) and returned in the match collection. In order to format matches a single script function is required: MatchValue_Format

The function takes the search name (for example, “Date”) and the match value as arguments.
Function MatchValue_Format(sSearchName, sValue)
    Select Case sSearchName
        Case "Date"
            'Insert formatting for search 'Date' here...
            MatchValue_Format = sValue
        Case "AccountNo"
            'Insert formatting for search 'AccountNo' here...
            MatchValue_Format = sValue
        Case "TransferDate"
            'Insert formatting for search 'TransferDate' here...
            MatchValue_Format = sValue
        Case "ZipCode"
            'Insert formatting for search 'ZipCode' here...
5 Update the script as shown in Figure 5-21.
Function MatchValue_Format(sSearchName, sValue)
    Dim lRet
    Select Case sSearchName
        Case "Date"
            'Convert the date to MMDDYYYY format
            'The date (sValue) is reformatted within the function and returned as sValue
            'If lRet is < 0 an error has occurred
            lRet = LibFreeForm_FormatDate(sValue)
            If lRet < 0 Then
                Recog.LogMessage "LibFreeForm_FormatDate failed.
Function MatchValue_Format(sSearchName, sValue)
    Dim lRet
    Select Case sSearchName
        Case "Date"
            'Convert the date to MMDDYYYY format
            'The date (sValue) is reformatted within the function and returned as sValue
            'If lRet is < 0 an error has occurred
            lRet = LibFreeForm_FormatDate(sValue)
            If lRet < 0 Then
                Recog.LogMessage "LibFreeForm_FormatDate failed.
Figure 5-23. Search Results

5 Repeat the previous steps for the “TransferDate” search.
6 Save the script.
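The formatting step above relies on the library function LibFreeForm_FormatDate, whose internals are not shown in this guide. As a rough illustration of the same idea (reformat a matched date to MMDDYYYY and signal failure with a negative return code), the following Python sketch may help. The accepted input formats and the day-first ordering are assumptions, and `format_date` is a hypothetical stand-in, not the library function.

```python
from datetime import datetime

# Assumed day-first numeric input formats; the real library may accept others.
INPUT_FORMATS = ("%d/%m/%Y", "%d-%m-%Y", "%d.%m.%Y")

def format_date(value):
    """Return (0, 'MMDDYYYY') on success, or (-1, original value)
    when none of the assumed formats matches."""
    for fmt in INPUT_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue  # try the next assumed format
        return 0, parsed.strftime("%m%d%Y")
    return -1, value

ok_code, ok_value = format_date("15/03/2008")
bad_code, bad_value = format_date("not a date")
print(ok_code, ok_value)
print(bad_code, bad_value)
```

Returning the original value together with a negative code mirrors the pattern in the script above, where a failure is logged and the unformatted match can still be inspected.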
Validating Matches

Matches can be validated after being (optionally) formatted and prior to being returned in the match collection. This is one way of “filtering” out matches before they are returned for final analysis. In order to validate matches a single script function is required: MatchValue_Validate

The function takes the search name (for example, “Date”) and the match value as arguments. It then returns True if the value is valid and False if not.
Function MatchValue_Validate(sSearchName, sValue)
    Select Case sSearchName
        Case "Date"
            'Insert Validation for search 'Date' here...
            MatchValue_Validate = True
        Case "AccountNo"
            'Insert Validation for search 'AccountNo' here...
            MatchValue_Validate = True
        Case "TransferDate"
            'Insert Validation for search 'TransferDate' here...
            MatchValue_Validate = True
        Case "ZipCode"
            'Insert Validation for search 'ZipCode' here...
Function MatchValue_Validate(sSearchName, sValue)
    Const RECIPIENT_ZIPCODE = "02111"
    Select Case sSearchName
        Case "Date"
            'Insert Validation for search 'Date' here...
            MatchValue_Validate = True
        Case "AccountNo"
            'Insert Validation for search 'AccountNo' here...
            MatchValue_Validate = True
        Case "TransferDate"
            'Insert Validation for search 'TransferDate' here...
Figure 5-26. Validated Zip Code Search Results

2 Test the search on the Recognition output files for the remaining images.
3 Save the script.
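The filtering role of a validate function can be sketched in Python. The constant value "02111" is taken from the example script; the comparison itself is an assumption about how the ZipCode case rejects the recipient's own zip code, and `match_value_validate` is a hypothetical Python counterpart of MatchValue_Validate, not product code.

```python
RECIPIENT_ZIPCODE = "02111"  # value from the guide's example script

def match_value_validate(search_name, value):
    """Return True if the match should be kept, False to filter it out.
    Only the ZipCode search has a rule here; other searches accept everything."""
    if search_name == "ZipCode":
        return value != RECIPIENT_ZIPCODE  # assumed rule: reject the recipient's zip
    return True

# Made-up match list: the recipient's zip code plus one other value.
matches = ["02111", "62704"]
valid_matches = [m for m in matches if match_value_validate("ZipCode", m)]
print(valid_matches)
```

Because the function receives the search name, a single function can hold a different rule for each search while leaving searches without a rule untouched.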
Integration with Recognition

At this stage, all searches have been configured but the matches found are not being analyzed or output. Though the searches run in the Free-form Processing window, they will not yet run in Recognition because they are not called from within the standard Recognition function for each document (process_doc). The following sections describe how to link the searches to text from fields, run and analyze the searches, and finally output the results.
this window will display again. The functions requiring replacement can then be selected using the check boxes and then re-inserted.
4 Click OK to insert all the script functions.
5 Save the script.

The Process Document Function

A process_doc function will be created which calls each of the other functions. If this function already exists, the code will be inserted into the current function. The inserted code should never need updating as it is independent of the searches being run.
Function process_doc()
    'Recognition Process Document Function
    'This function is called once for each document in the batch.
    'Return values <>0 cause batch to be aborted.
    'If there is an error, use Recog.LogMessage to log the error to the application window.
    '
    ''The following lines were created automatically by Definer.
Step 2: Run the Searches

Each search (defined by a set of SearchOptions and referenced by name) is run using a TextRead object, referenced by name. Two match collections are returned for each search, the anchor and the target match collections (named MCAnchor_ and MCTarget_ by default). Target searches do not require an anchor match collection to be defined. However, to simplify the configuration both are automatically included in case search types are changed.
Function DoSearches()
    Dim lret
    '*** Search: Date ***
    lret = Textread.DoSearch("Date", "TextRead1", "MCAnchor_Date", "MCTarget_Date")
    If lret < 0 Then
        Recog.LogMessage "TextRead.DoSearch failed for search: Date. Error code:" & lret, 1
        DoSearches = lret
        Exit Function
    End If
    '*** Search: AccountNo ***
    lret = Textread.DoSearch("AccountNo", "TextRead1", "MCAnchor_AccountNo", "MCTarget_AccountNo")
    If lret < 0 Then
        Recog.LogMessage "TextRead.DoSearch failed for search: AccountNo.
'*** Search: Date ***
'Get the number of matches. There are always the same number of anchors and targets.
lMatches = Textread.GetNumberOfMatches("MCTarget_Date")
If lMatches = 1 Then
    'Set the value in the output field (confidence etc are preserved)
    lret = Textread.CreateXMLFieldFromMatch("MCTarget_Date", 1, "Date", False, False)
    If lret < 0 Then
        Recog.LogMessage "TextRead.CreateXMLFieldFromMatch failed.
To test on multiple documents
1 Select the Image tab.
2 Select Test | Test Multiple Documents. The Test Multiple Documents window is displayed.
3 On the Documents panel, click the button and select all nine document images.
4 On the Files to Test panel, select the “Run definition files (if there is no output file already) and Recognition script” option. This will only generate the raw output from Recognition if it does not already exist, but the searches will always be run.
is a cut-down version of the Recognition Test Tool, full help for which is included in the INDICIUS Help.
7 Click Close to exit test mode.
8 Save the script file and exit Definer.

Testing a Configuration

So far, testing of the Recognition definition and script files has been limited to a few documents in the Definer Free-form Processing window’s test mode.
This produces the same results as you saw for the Definer test mode, but in a real-life configuration process you would use the high-volume capabilities of the Recognition Test Tool to test on a larger number of images.
10 Select File | Save Project and save the project as: \Test Projects\MySolicitorsLetters.rtp.
11 Select File | Exit to close Recognition Test Tool.
7 Click Open.
8 In the Recognition Script File list, click Clear to remove the existing entry.
9 Click Select Script File to display a file selection window.
10 Select the following file: \config\MySolicitorsLetters.ifv.
11 Click Open.
12 Click OK.
13 On the Batch tab, select the “INDICIUS Solicitors Letters” batch class.
14 Select File | Publish. The Publish window is displayed.
15 Click Publish.
Chapter 6 Configuring Completion

Overview

This tutorial shows how the Template Editor tool can be used to generate a template file (.kfi) for Completion. A template file contains parameters that specify:
The layout of the Completion interface for each document.
The position and type of the data entry fields.
How document images are to be displayed.
Simple validation rules and references to more complex validation scripts.
Creating a Template

The typical process to create a template (which you will work through later) is as follows:
Step 1: Create a new template with a single data entry tab and set the basic layout.
Step 2: Load a sample document image or images.
8 Select File | Save Template As… to display a file window.
9 Save the template file as: \templates\MySolicitorsLetters.kfi.

Figure 6-33.
To load an example image
1 Select Image | Open Image. An Open Image window is displayed.
2 Select the following image: \Examples\Solicitors Letters\Images\Solicitor1.tif.
3 Click Open. The example image should now be displayed in the image viewing area.

To create tabs
Tabs are used to group together data fields at a high level or to separate data for different pages of a multi-page document.
4 Select the new frame field.
5 Set the values of the following properties in the Properties panel:

Table 6-9. Resizing Frame Field
Property    Value
Left:       105
Top:        0
Width:      4400
Height:     6250

6 Save the template file.

Note: The position and size of the field can also be set using the mouse.

Creating Data Entry Fields

Data entry fields are where data captured using Recognition will be displayed. The user will then have the opportunity to edit the data in these fields.
Important: The following steps specify the properties for the fields and labels in the order in which they appear on the Properties panel. However, you should specify the value of the Width property before that of the Left property. If you do not, the Left property will try to position the object off the template and an error message will be displayed.

To create data entry fields
1 Click the (New Text Field) button on the toolbar. This adds a text field to the template.
Table 6-12. Create New Text Field – Account Number
Property    Value
Left:       2450
Top:        1525
Width:      1540
Height:     360

Table 6-13. Create New Text Field – Transfer Date
Property        Value
Name:           TransferDate
MultiLine:      No
Mandatory:      Yes
Allow Spaces:   No
Left:           3065
Top:            2530
Width:          930
Height:         360

Table 6-14.
Table 6-14. Create New Text Field – Zip Code
Property    Value
Top:        3520
Width:      930
Height:     360

4 Save the template file.

Note: All field names must match the names of the searches defined for Recognition. This is because the search names are used as the field names output by Recognition, and data is linked between modules using the field names.

To add labels to the fields
Label fields are used to add a description to a data field on the interface.
Table 6-16. Additional New Label Field – Account Number
Property    Value
Name:       Account No:
Left:       2465
Top:        1205
Width:      1530
Height:     290

Table 6-17. Additional New Label Field – Transfer Date
Property    Value
Name:       Transfer Date:
Left:       2720
Top:        2175
Width:      1275
Height:     290

Table 6-18.
Table 6-19. Create Additional Label for the Solicitor’s Address
Property      Value
Name          Solicitor’s Details:
Left:         300
Top:          4075
Width:        3695
Height:       1995
Backcolor     White
Appearance    1 – Flat
Border        Yes
Label ID      Details

8 Save the template file.

Controlling Image Display

In Completion, when a field is selected in the data entry area, the image display area shows the relevant part of the image, allowing the data to be read by eye.
Completion also highlights the actual data field using a blue box. This is known as the Focus Area. Completion can be configured to set the coordinates of the Focus Area to the coordinates that were found by Recognition for the field. Alternatively, they can be set explicitly. The coordinates of fields are retained when they are created from the output of a search. Use these coordinates if the data was found. If it was not, it is important to show the likely location on the image.
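The choice between found and explicit coordinates can be sketched as follows. This is plain Python for illustration, not INDICIUS script or template syntax; the Rect type, the choose_focus_area function and the example numbers are all hypothetical.

```python
from typing import NamedTuple, Optional


class Rect(NamedTuple):
    """A rectangle on the image (hypothetical units and layout)."""
    left: int
    top: int
    width: int
    height: int


def choose_focus_area(found: Optional[Rect], likely: Rect) -> Rect:
    """Use the coordinates found by Recognition if there are any;
    otherwise fall back to the explicitly configured likely location."""
    return found if found is not None else likely


if __name__ == "__main__":
    likely = Rect(100, 200, 800, 120)  # explicit fallback area (made-up values)
    print(choose_focus_area(Rect(140, 230, 300, 40), likely))  # data was found
    print(choose_focus_area(None, likely))                     # data was not found
```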
9 Set the View and Focus Areas to be the whole page for the “ZipCode” field.
10 Save the template file.

Setting Field Order

As new fields are created in Template Editor, they are automatically assigned a field index. The index for the first field created is 1, and the fields are numbered incrementally thereafter.
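The default numbering can be pictured with a minimal sketch. It is plain Python, not part of Template Editor, and the function name assign_field_indexes and the field names in the demo are illustrative assumptions.

```python
def assign_field_indexes(field_names):
    """Number fields in the given order, starting at 1."""
    return {name: index for index, name in enumerate(field_names, start=1)}


if __name__ == "__main__":
    # Fields listed in the order they were created (illustrative names).
    created = ["AccountNo", "TransferDate", "ZipCode"]
    print(assign_field_indexes(created))

    # Changing the field order amounts to renumbering a reordered list.
    reordered = ["ZipCode", "AccountNo", "TransferDate"]
    print(assign_field_indexes(reordered))
```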
2 Make sure the field order matches the order shown in Figure 6-34. Adjust the field order by selecting a field to move and clicking the arrows on the right.
3 Click OK.

Setting Simple Validations

Template Editor allows simple validations to be specified for a field according to character type, data type (name, address etc.), or according to length and value. Fields can also be set as mandatory. Validations are performed:
As each document is loaded into Completion.
Entering [#] is a shorthand method for entering all numerics (0123456789). The other characters represent themselves and do not have any special meaning.
10 Set the values of “Minimum Length” and “Maximum Length” to 12.
11 Click OK.
12 Select the “TransferDate” field.
13 Select the Validation property on the Properties panel.
14 Click to display a Field Validation window.
15 For the Validation Type, select “MMDDYY” and click OK.
16 Save the template file.
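The effect of these simple validations can be approximated in a few lines. The sketch below is plain Python, not the Template Editor implementation: the function names are hypothetical, and reading two-digit years as 2000-2099 is an assumption made only so that the day-of-month check has a concrete year to work with.

```python
import datetime

DIGITS = "0123456789"


def expand_allowed(spec):
    """Expand the [#] shorthand into the ten numeric characters."""
    return spec.replace("[#]", DIGITS)


def check_text(value, allowed_spec, min_len, max_len):
    """Check length limits and that every character is in the allowed set."""
    allowed = set(expand_allowed(allowed_spec))
    return min_len <= len(value) <= max_len and all(ch in allowed for ch in value)


def check_mmddyy(value):
    """Check that value is six digits forming a real MMDDYY date.

    Assumption: two-digit years are read as 2000-2099.
    """
    if len(value) != 6 or not all(ch in DIGITS for ch in value):
        return False
    month, day, year = int(value[0:2]), int(value[2:4]), 2000 + int(value[4:6])
    try:
        datetime.date(year, month, day)
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    # A 12-character, digits-only field, as configured in the steps above.
    print(check_text("123456789012", "[#]", 12, 12))
    print(check_text("12345678901A", "[#]", 12, 12))
    print(check_mmddyy("123108"), check_mmddyy("311208"))
```

Setting the minimum and maximum length to the same value, as in step 10, makes the length check an exact-length check.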
8 Double-click on the “Function” column and select the function “zipcode_validate”. The “zipcode_validate” function checks that the value exists in a database. If it does, it also populates the Solicitor’s Details label with the solicitor’s name and address.

Figure 6-35. Zip Code Validation

9 Save the template file.
Table 6-20. Script Hooks
Script Hook    Runs When…
Validate       A document is loaded into Completion and when the user attempts to leave the field, prior to the exit hook. Field values should not be changed in this hook.
Exit           The user navigates away from a field.

There are also four global script hooks that apply to all fields on a document:

Table 6-21. Global Script Hooks
Script Hook          Runs When…
Start of Document    Runs when the document is loaded before the data is validated.
To apply data transformation using scripts
1 On the Hooks tab, select the line for the Global Start of Document hook.
2 Click the button on the “File” column.
3 Select “SolicitorsLetter.ifv”.
4 Click the button on the “Function” column and select “Document_start”. This function contains code that will transform all dates to “DDMMYY” format before display and will run the zip code database search.
5 Save the template file.
Setting Autoskip Functionality

Completion can be configured to automatically present the user with the next data field once the entry in a particular field is complete (a length validation must be specified for the field). The keystroke saving achieved by implementing Autoskip can be considerable.

To specify autoskip behavior
1 Select the “Date” field.
2 Select the Properties tab to display the Properties panel.
3 Set the value of the Autoskip property to Yes.
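The Autoskip idea can be modeled in a few lines. This is a plain-Python sketch, not Completion's actual logic: the function name next_focus is hypothetical, treating "entry is complete" as "entry has reached the maximum length" is an assumption based on the length-validation requirement above, and staying on the last field is also an assumption.

```python
def next_focus(field_order, current, value, max_length, autoskip):
    """Return the field that should have focus after a keystroke.

    With Autoskip on, focus moves to the next field as soon as the entry
    reaches the field's maximum length; otherwise it stays where it is.
    Assumption: focus stays on the last field when there is no next field.
    """
    position = field_order.index(current)
    is_last = position == len(field_order) - 1
    if autoskip and len(value) >= max_length and not is_last:
        return field_order[position + 1]
    return current


if __name__ == "__main__":
    order = ["AccountNo", "Date", "ZipCode"]  # illustrative field order
    print(next_focus(order, "Date", "1231", 6, True))    # entry incomplete
    print(next_focus(order, "Date", "123108", 6, True))  # entry complete
```

Without Autoskip the user would need an extra keystroke, such as Tab, to leave each completed field, which is where the keystroke saving comes from.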
Viewing the Completion Script

The script used for validation and data transformation can be viewed in Script Editor, a tool that can be used to generate scripts for all the modules. For an introduction to Script Editor, see the chapter Configuring Scripted Export.

Assigning the Template to a Batch Class

Now that a new template file has been created, it can be assigned to the example batch class.
The Publish window is displayed.
11 Click Publish. The progress of the publishing operation is logged in the Results panel.
12 When publishing has been completed, click Close.
13 Select File | Exit to close Kofax Capture Administration.

Processing a Batch

Create a new batch using Kofax Capture Scan to import the example images, and then process the images through the modules (if necessary, refer to the Processing Tutorial chapter for instructions).
Chapter 7 Configuring Scripted Export

Overview

Script Editor is used to generate script files (.ifv) for Scripted Export, transforming the data from any previous INDICIUS module ready for release. Script Editor can also be used to develop scripts for transforming data at the moment it is output from Recognition, Completion or Verification and for validating data in Correction, Completion and Verification.
Figure 7-36. Script Editor

2 Select File | Open File… from the main menu. An Open Script File window is displayed.
3 Select the following file: \examples\Solicitors Letters\scriptex\ScriptEx_XMLtoCSV.ifv. There is a single script function, “process_doc”, defined in this file.
4 Scroll in the code window to view the script contents.
5 Select File | Exit to close Script Editor.
To view the script used by Completion
1 Open Script Editor by clicking Start on the taskbar to display the menu, and selecting All Programs | INDICIUS | Tools | Script Editor.
2 Select File | Open File… to display an Open Script File window.
3 Select the following file: \examples\Solicitors Letters\templates\SolicitorsLetter.ifv.
4 Scroll in the code window to view the script contents, or select the functions from the pull-down Declarations list.