LEGAL NOTICES Copyright © 2002 ScanSoft, Inc. All rights reserved. The software described in this book is furnished under license and may be used or copied only in accordance with the terms of such license. IMPORTANT NOTICE ScanSoft, Inc. provides this publication "as is" without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability or fitness for a particular purpose.
CONTENTS

WELCOME 7
Using this Guide 8
Getting online Help 9
Online HTML Help 9
Context-Sensitive Help 9
Tech Notes 10
Glossary 10
OmniPage SE 10

INSTALLATION AND SETUP 11
System requirements 12
Installing OmniPage SE 13
Setting up your scanner with OmniPage SE 14
How to start the program 16
Registering your software 17
New features in OmniPage Pro 12 17
OmniPage SE and OmniPage Pro 12 19

INTRODUCTION 21
What is optical character recognition 22
OmniPage S
The Toolbars 25
The Image Panel 26
The Text Editor 26
The OmniPage Toolbox 27
Managing documents
Thumbnails 28
Document Manager 29
Customizing Document Manager columns 30
Deleting pages from a document 30
Printing a document 31
Closing a document 31
OmniPage Documents 32
How to save to OPD 32 33

PROCESSING DOCUMENTS 35
Quick Start Guide 36
Loading and recognizing sample image files 36
Scanning and recognizing a single page 36
Processing overview 38
Automatic processing 40
Defining the source of page images 50
Input from image files 50
Input from scanner 51
Scanning with an ADF 52
Scanning without an ADF 53
Describing the layout of the document 53
Zones and backgrounds 55
Automatic zoning 55
Manual zoning 56
Zone types and properties 57
Working with zones 59
Table grids in the image 61
Using zone templates 63

PROOFING AND EDITING 65
The editor display and views 66
Proofreading OCR results 67
Verifying text 68
User dictionaries 70
Trai
Selecting a formatting level 83
Selecting advanced saving options 84
Saving to PDF 86
Copying pages to Clipboard 86
Sending pages by mail 87

TECHNICAL INFORMATION 89
Troubleshooting 90
Solutions to try first 90
Testing OmniPage SE 91
Increasing memory resources 92
Increasing disk space 92
Text does not get recognized properly 93
Problems with fax recognition 94
System or performance problems during OCR 94
ODMA support 95
Advanced features in Schedule OCR 95
Supported file type
Welcome Welcome to OmniPage® SE, and thank you for using our software! The following documentation has been provided to help you get started and give you an overview of the program. This User’s Guide This guide introduces you to using OmniPage SE (Special Edition). It includes installation and setup instructions, a description of the program’s commands and working areas, task-oriented instructions, ways to customize and control processing, and technical information.
Using this Guide
This guide is written with the assumption that you know how to work in the Microsoft Windows environment. Please refer to your Windows documentation if you have questions about how to use dialog boxes, menu commands, scroll bars, drag and drop functionality, shortcut menus, and so on. We also assume you are familiar with your scanner and its supporting software, and that the scanner is installed and working correctly before it is set up with OmniPage SE.
Getting online Help In addition to using this guide, you can use OmniPage SE’s online Help to learn about features, settings, and procedures. Online Help is available after you install OmniPage SE. Online HTML Help Open OmniPage SE’s online Help at its top level by choosing Help Topics at the top of the Help menu. This allows you to see topics arranged in a Table of Contents, search an alphabetical list of keywords or make full-text searches through the topics.
Tech Notes Commonly reported issues using OmniPage® are presented on ScanSoft’s web site at www.scansoft.com. Web pages may also offer assistance on the installation process and troubleshooting. Glossary This guide does not include a glossary. The online Help has a comprehensive glossary, with its own alphabetical index and a table of contents. Please consult it if you want to find the meaning of a term used in this guide or in the program.
Chapter 1 Installation and setup This chapter provides information on installing and starting OmniPage SE.
System requirements
Your system must meet the following minimum requirements to install and run OmniPage SE 2.0:
• A computer with a Pentium or higher processor
• Microsoft Windows 98 (from second edition), Windows Me, Windows NT 4.
Installing OmniPage SE
OmniPage SE’s installation program takes you through installation with instructions on every screen. Before installing OmniPage SE:
• Close all other applications, especially anti-virus programs.
• Log in to your computer with administrator privileges if you are installing on Windows NT, 2000 or XP.
• If you have previous ScanSoft OCR software on your system, the installer will ask for your consent to uninstall that software first.
To install OmniPage SE:
1.
Setting up your scanner with OmniPage SE All files needed for scanner setup and support are copied automatically during the program’s installation. Before using OmniPage SE for scanning, your scanner should be installed with its own scanner driver software and tested for correct functionality. Scanner driver software is not included with OmniPage SE. Scanner installation and setup are done through the Scanner Wizard. You can start this yourself, as described below.
• Click on Scan to begin the sample scan.
• If necessary, click on Inverse Image… or Missing Image… and make the appropriate selections.
• Once the image appears correctly in the window, click on Next.
• Select the item that most appropriately describes your scanner, then click on Next.
• Click on Next to proceed to page size.
• The page sizes that the Scanner Wizard believes your scanner to support are listed in the window.
How to start the program
To start OmniPage SE do one of the following:
• Click Start in the Windows taskbar and choose Programs > ScanSoft OmniPage SE 2.0 > OmniPage SE 2.0.
• Double-click the OmniPage SE icon in the program’s installation folder or on the Windows desktop if you placed it there.
• Double-click an OmniPage Document (OPD) icon or file name; the clicked document is loaded into the program. See “OmniPage Documents” on page 31.
Registering your software
ScanSoft’s registration Wizard runs at the end of installation. We provide an easy electronic form that can be completed in less than five minutes. When the form is filled in and you click Send, the program will search for an Internet connection to perform the registration online immediately. If you did not register the software during installation, you will be periodically invited to register later. You can also go to www.scansoft.com to register online.
page 76. Page backgrounds are defined as process (auto-zone) or ignore, so all zoning instructions appear on the page and can be saved to zone templates. See page 55. Irregular zones can be drawn and zones split and joined more simply, without the need for separate tools. See page 59.
• Better proofing and verifying
The Proofing dialog box now shows suspect words in a wider context. A dynamic verifier can stay open as text is being checked, with the image display and window tracking the editing position.
OmniPage SE and OmniPage Pro 12
This list documents features that are not incorporated in OmniPage SE, but can become available by upgrading to OmniPage Pro 12:
• Significant improvement in recognition accuracy
• Access to training, IntelliTrain and training files
• Ability to open and read the contents of PDF files
• Ability to save recognized documents to PDF format
• Support for two-page scanning to scan books more easily
• Flowing page output formatting level for superior page retention
• Sch
Chapter 2 Introduction You probably use your computer for business correspondence, preparing reports, handling data and an ever-increasing number of other uses. The challenge is that, in spite of the digital revolution, certain sources of information still circulate in printed, paper form and cannot be used immediately in a computer.
What is optical character recognition Optical character recognition is the process of extracting text from an image. This image can result from scanning a paper document or opening an electronic image file. Images do not have editable text characters; they have many tiny dots (pixels) that together form character shapes. These present a picture of the text on a page. During OCR, OmniPage SE analyzes the character shapes in an image and defines solutions to produce editable text.
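To make the idea of pixels forming character shapes concrete, here is a toy sketch in Python. It is only an illustration of the general principle and is not how OmniPage SE works internally: the three tiny letter templates, the `recognize` function and the pixel-counting rule are all assumptions made up for this example.

```python
# Toy illustration only: real OCR engines are far more sophisticated.
# Each glyph is a 5-row by 3-column grid; '#' is ink, '.' is background.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def count_matching_pixels(shape, template):
    """Count the pixel positions where the scanned shape agrees with a template."""
    return sum(
        1
        for shape_row, template_row in zip(shape, template)
        for shape_pixel, template_pixel in zip(shape_row, template_row)
        if shape_pixel == template_pixel
    )


def recognize(shape):
    """Return the letter whose template agrees with the shape on the most pixels."""
    return max(
        TEMPLATES,
        key=lambda letter: count_matching_pixels(shape, TEMPLATES[letter]),
    )


# A scanned 'T' with one stray dark pixel in the second row.
scanned = ["###", "##.", ".#.", ".#.", ".#."]
print(recognize(scanned))
```

The stray pixel shows why an image alone is not editable text: the program has to decide which character a slightly imperfect shape most likely represents.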
Documents in OmniPage SE
OmniPage SE handles documents one at a time. When you acquire your first image (from scanner or from file) a new document is started. Further acquired images are added to the same document, until you save and close it. A document in OmniPage SE consists of one image for each document page. After you perform OCR, the document will also contain recognized text, displayed in the Text Editor, possibly along with graphics and tables. See “The OmniPage Desktop” on page 24.
The OmniPage Desktop
The OmniPage Desktop has a title bar and a menu bar along the top and a status bar along the bottom. It has three main working areas, separated by splitters: the Document Manager, the Image Panel and the Text Editor. Each has close, maximize and restore buttons at the top right. The Image Panel has an Image toolbar and the Text Editor has a Formatting toolbar.
[Figure: the OmniPage Desktop, with callouts for the Standard toolbar, the OmniPage Toolbox and the Formatting toolbar.]
Thumbnails show a picture of each page in the document.
We show the program with a three-page document. Page one is the current page, which has been recognized and proofed. Page two has been recognized but not proofed yet. Page three has been acquired and manually zoned, but not recognized yet. The icons at the bottom of the thumbnail images show page status. Status bar buttons let you show or hide the main screen areas and move to other pages in the document.
The Image Panel When this displays the current page image, the Image toolbar is available. All page images have a background value: process or ignore. Zones can be manually drawn on page images, or can be placed automatically after recognition. There are five zone types: Process, Ignore, Text, Table, Graphics. Areas inside process zones and on a process background outside other zones have zones automatically drawn and their zone types determined during processing. See “Zones and backgrounds” on page 55.
The OmniPage Toolbox
This Toolbox lets you drive the processing. By default it is located along the top of the OmniPage Desktop, just above the working areas. It can be floated, and can also be docked along the bottom of the desktop.
[Figure: the OmniPage Toolbox, with callouts for the Start button, Get Pages drop-down list, Get Page button, Perform OCR button, Layout Description drop-down list, Export Results button and Export Results drop-down list.]
Automatic processing is started, and can be stopped and re-started, with the Start (1-2-3) button.
Managing documents Document management can be done by thumbnails in the Image Panel or by the Document Manager, situated along the bottom of the OmniPage Desktop. Both summarize the pages in the document and are synchronized. Our pictures show these with the same seven-page document. Pages 1 and 2 are selected and page 4 is the current page, that is, the one shown in the Image Panel. Page status is shown as follows: Page Status Icon Page image has been...
the Ctrl key as you click thumbnails to add pages to a selection one by one. Then you can move or delete the selected pages as a group, or send them to (re)recognition. You can also export selected pages. Get information on an input image by hovering the cursor over its thumbnail (so long as ToolTips are enabled). A popup text displays the image size in pixels and the program’s unit of measurement. Image resolution is also shown.
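The relationship between the pixel size, the resolution and the physical size reported for an image is simple arithmetic: physical size equals pixels divided by resolution. The following minimal Python sketch assumes the unit of measurement is inches and the resolution is given in dots per inch (dpi); the function name is hypothetical and is not part of OmniPage SE.

```python
def image_size_in_inches(width_px, height_px, dpi):
    """Convert an image size in pixels to inches, given its resolution in dpi."""
    if dpi <= 0:
        raise ValueError("resolution must be a positive number of dots per inch")
    return width_px / dpi, height_px / dpi


# A page scanned at 300 dpi that measures 2550 x 3300 pixels.
width_in, height_in = image_size_in_inches(2550, 3300, 300)
print(f"{width_in:.2f} x {height_in:.2f} inches")  # 8.50 x 11.00 inches
```

Scanning the same sheet at a higher resolution produces more pixels but the same physical size.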
When multiple pages are being selected, the page set as current does not change. All selected pages are highlighted. Customizing Document Manager columns You can specify which columns of information you want to see in the Document Manager. Click Customize Columns... in the View menu for the following dialog box: This item is highlighted. Click a checkbox to select the item. Image sizes are expressed in pixels. Highlight an item and use these arrows to change the order of columns.
Printing a document
You can print the document with the Print item in the File menu. Choose whether to print images or text (that is, recognition results as they appear in the Text Editor). You can print all pages or a range of pages. The Print tool in the Standard toolbar prints images or text, depending on whether the Image Panel or the Text Editor is active.
Closing a document
Choose Close in the File menu to close a document.
An OmniPage Document created and saved in OmniPage SE will not include training data. Any training in an OPD file you open will be ignored.
Why save to OPD
You do not have to save your documents to the OPD file type. You would typically do this for the following reasons:
• You cannot finish working with the document in the current session.
• You want to pass the document to other users who have OmniPage SE or OmniPage Pro. For example, you can pass an OPD file to a specialist for proofing.
The title bar shows the file name of the most recent whole-document save.
Settings
The Options dialog box is the central location for OmniPage SE settings. Access it from the Standard toolbar or the Tools menu. Context-sensitive help provides information on each setting. In overview, the settings panels are:
OCR
Use this to specify recognition languages, a user or professional dictionary, a reject character and font matching. Click the checkbox before a language to select or deselect it.
scanning for handling books, and other settings. You can change the interface language here. OmniPage SE does not support two-page scanning. Proofing Use this to define whether proofreading should begin automatically after recognition. Define also whether IntelliTrain should run, and use it to load or work with a training file. See “Proofreading OCR results” on page 67. The references to IntelliTrain and training files do not apply to OmniPage SE.
Chapter 3 Processing documents This tutorial chapter describes different ways you can process a document and also provides information on key parts of this processing.
Quick Start Guide This topic takes you step-by-step through the basic OCR process. Loading and recognizing sample image files You will find sample image files in the program folder, both single-page and multi-page files. First try reading these files using the procedure presented below, except for the references to a scanner. See “Input from image files” on page 50. The results provide you with a benchmark of the recognition quality you should expect from your own files of comparable quality.
What you do, and what happens:
1. Set up your scanner using the Scanner Wizard, if this is not already done.
   What happens: Configures OmniPage SE to work with your scanner.
2. Select Start > Programs > ScanSoft OmniPage SE 2.0 > OmniPage SE 2.0.
3. Place the document correctly in your scanner.
4. From the Get Page drop-down list, select a scan option for your document: black-and-white, grayscale or color.
   What happens: Allows you to determine how pictures or colored texts and backgrounds will look in the exported document.
Processing overview
The following flow diagram summarizes the processing steps:
[Flow diagram]
Get Pages: from file (page 50), from scanner (page 51)
Describe page layout (page 53): Apply a template (page 63), Autozoning (page 55), Manual zoning (page 56)
Perform OCR with current settings (page 33)
Proofread (page 67), Verify and edit (page 68)
Export pages: to file (page 81), to Clipboard (page 86), via Mail (page 87)
Here is an overview of the processing methods you can use.
Using the OCR Wizard
The OCR Wizard guides you through the selection of settings and commands by asking you questions. It then launches automatic processing. This is a good way to get started if you are new to OmniPage SE.
In other applications
You can use the Direct OCR feature to call on the recognition services of OmniPage SE while working in your usual word-processor or similar application.
Automatic processing
Automatic processing provides an efficient way of handling documents, especially larger ones. First you select all settings needed, then you can use the Start button in the OmniPage Toolbox to process a new document from start to finish or to restart and finish processing on an open document.
[Figure: the OmniPage Toolbox, with callouts for the Start button, Get Page button, Get Pages drop-down list, Perform OCR button, Export Results button, Export Results drop-down list and Layout Description drop-down list.]
1.
4. Click the Options tool in the Standard toolbar or choose Options in the Tools menu and check that settings are appropriate for your document. You can, for instance, specify recognition languages and whether you want to proofread the document or not. See “Settings” on page 33.
5. Click the Start button or choose Start auto-processing in the Process menu. Each page of the document is processed and finished one after the other.
Manual processing Manual processing gives you more precise control over the way your pages are handled. You can process the document page-by-page with different settings for each page. The program also stops between each step: acquiring images, performing recognition, exporting. This lets you, for instance, change the page background and draw zones manually on each page. You start each step in the process by clicking the three numbered buttons on the OmniPage Toolbox. 1.
6. Select a value for the Perform OCR button. You describe the layout of the incoming pages. This value has an influence if auto-zoning runs on any pages. See “Describing the layout of the document” on page 53. You can also select a template to have its zones placed on the current page. See “Using zone templates” on page 63.
7. Click the Perform OCR button to have the current page recognized.
can process it automatically and view results in the Text Editor. You can determine which pages are in order, and which need different settings or some manual zoning. After adjusting settings and/or modifying zones, use manual processing to re-recognize just those pages. 1. Prepare the document and perform automatic processing, as already described. 2. If you close or finish proofing you will be invited to save the document. This is recommended, even if it is not in its final form. 3.
3. Manually zone pages where you want to process only part of the page or if you want to give precise zoning instructions. Use ignore backgrounds or zones to exclude areas from processing. Use process backgrounds or zones to specify areas to be auto-zoned.
4. Click the Start button, then choose Finish Processing Existing Pages in the Automatic Processing dialog box.
5. After proofing (if requested) you can save or export the document.
5. The last panel asks you to define the export choice: saving to file or copying to Clipboard. After setting the choice, click Finish to close the Wizard and start the automatic processing.
6. If you requested proofing and the text contains suspect words, the OCR Proofreader dialog box will appear. When proofing is finished or closed, the Copy to Clipboard or Save As dialog box lets you specify file export settings, including a page range and a formatting level.
7. The document remains in OmniPage SE.
How to set up Direct OCR
1. Start the application you want connected to OmniPage SE. Start OmniPage SE, open the Options dialog box at the Direct OCR panel and select Enable Direct OCR.
2. Select process options for proofing and zoning. These function for future Direct OCR work until you change them again; they are not applied when OmniPage SE is used on its own.
3. The Unregistered panel displays running or previously registered applications. Select the desired one(s) and click Add.
If OmniPage SE is running when Direct OCR is called from a target application, a second instance of OmniPage SE is launched. See the Direct OCR topics in online Help for more information. These include a topic Direct OCR Questions and Answers. The Readme file and the ScanSoft web site may present more recent information relating to specific target applications. How to use OmniPage SE with PaperPort PaperPort® is a paper management software product from ScanSoft.
Processing with Schedule OCR
OmniPage SE does not support Schedule OCR. The following text applies to OmniPage Pro only. You can schedule OCR jobs to be performed automatically at any time within the following eight days. The job pages can come from a scanner with an ADF or from image files. You do not have to be present at your computer at job start time, nor does OmniPage Pro have to be running.
Defining the source of page images There are two possible image sources: from image files and from a scanner. There are two main types of scanners: flatbed or sheetfed. A scanner may have a built-in or added Automatic Document Feeder (ADF), which makes it easier to scan multi-page documents. The images from scanned documents can be input directly into OmniPage SE or may be saved with the scanner’s own software to an image file, which OmniPage SE can later open.
Normally the Add button places each file at the bottom of the file list. To place a file at a different location, highlight a file in the list. The new file will be added immediately below the lowest highlighted file.
Input from scanner
You must have a functioning, supported scanner correctly installed with OmniPage SE. See “Setting up your scanner with OmniPage SE” on page 14. You have a choice of scanning modes.
Brightness and contrast Good brightness and contrast settings play an important role in OCR accuracy. Set these in the Scanner panel of the Options dialog box or in your scanner’s interface. The diagram illustrates an optimum brightness setting. After loading an image, check its appearance. If characters are thick and touching, lighten the brightness. If characters are thin and broken, darken it. Then rescan the page.
You can scan double-sided documents with an ADF. A duplex scanner will manage this automatically. For non-duplex scanners, select Scan double-sided pages in the Scanner panel of the Options dialog box. Then you can scan the document in just a few passes, with even pages grouped together and odd pages also grouped. OmniPage SE will merge the pages for you.
Scanning without an ADF
You can scan multi-page documents efficiently from a flatbed scanner, even without an ADF.
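For double-sided scanning with a non-duplex ADF, the merging of odd and even pages can be pictured as a simple interleaving. The Python sketch below illustrates that idea only; it is not OmniPage SE's actual implementation. The function name and the `evens_reversed` flag are assumptions: the flag covers the case where flipping the stack over means the even pages arrive in reverse order.

```python
def merge_double_sided(odd_pages, even_pages, evens_reversed=False):
    """Interleave separately scanned odd and even pages into reading order.

    evens_reversed is a hypothetical option for stacks whose back sides
    were fed through the ADF last page first.
    """
    if evens_reversed:
        even_pages = even_pages[::-1]
    merged = []
    for index, odd_page in enumerate(odd_pages):
        merged.append(odd_page)
        # The last sheet may have a blank back, so there can be one
        # fewer even page than odd pages.
        if index < len(even_pages):
            merged.append(even_pages[index])
    return merged


fronts = ["page 1", "page 3", "page 5"]
backs = ["page 6", "page 4", "page 2"]  # scanned after turning the stack over
print(merge_double_sided(fronts, backs, evens_reversed=True))
```

Keeping the two groups as separate passes is what lets a single-sided feeder handle a double-sided document.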
Single column, no table Choose this setting if your pages contain only one column of text and no table. Business letters or pages from a book are normally like this. Choose it also for a page with words or numbers arranged in columns if you do not want these placed in a table or decolumnized or treated as separate columns. Graphics may be detected.
Zones and backgrounds
Zones define areas on the page to be processed or ignored. Zones are rectangular or irregular, with vertical and horizontal sides. Page images in a document have a background value: process or ignore (the latter is more typical). Background values can be changed with the tools shown.
Auto-zone a page background Acquire a page. It appears with a process background. Draw a zone. The background changes to ignore. Draw text, table or graphic zones to enclose areas you want manually zoned. Click the Process background tool (shown) to set a process background. Draw ignore zones over parts of the page you do not need. After recognition the page will return with an ignore background and new zones round all elements found on the background.
No.  Type                What happens:
1    Text zone           OCR runs and generates text.
2    Table zone          OCR runs, text is placed in a table grid.
3    Graphic zone        Image is embedded in recognized page.
4    Process zone        Auto-zoning creates one or more zones, decides their types and processes their contents.
5    Process background  Auto-zoning creates one or more zones, decides their types and processes their contents.
process zones on an ignore background. Draw a process zone to enclose columns of text to have them handled automatically. They will be decolumnized in the Text Editor’s NF view and RFP view, but kept in columns in True Page view. Ignore zone (gray) Use this to draw an ignore zone, to define a page area you do not want transferred to the Text Editor. Auto-zoning will not place zones here. To exclude a given page area from many pages (for example a header or page numbers), place an ignore zone in a template.
Working with zones
The Image toolbar provides zone editing tools. One is always selected. When you no longer want the service of a tool, click a different tool. Some tools on this toolbar are grouped. Only the last selected tool from the group is visible. To select a visible tool, click it. To select a hidden tool, hold down the mouse button on the triangle at the bottom right of the visible tool until the additional tools appear, then click the tool you want.
Join two zones of the same type
Draw an overlapping zone of the same type.
[Figure: existing zones, new zone, resulting zone]
Make an irregular zone by subtraction
Draw an overlapping zone of the same type as the background (in this example, on an ignore background).
[Figure: existing zone on an ignore background, new ignore zone, resulting zone]
Split a zone
Draw a splitting zone of the same type as the background (in this example, on a process background).
The following zone shapes are prohibited:
• Indented along the bottom
• Indented along the top
• Hole in the middle
To expand a zone more quickly than using its resizing handles, draw a zone of the same type to completely enclose it. The smaller zone is replaced by the larger one. To replace a set of zones of whatever type with a single zone, draw a larger zone of the desired type to completely enclose them. All the smaller zones are replaced by the larger one.
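The rule that a newly drawn zone replaces every zone it completely encloses can be sketched in a few lines of Python. This is a simplified model for illustration, not OmniPage SE's code: the `Zone` class and `draw_enclosing_zone` function are hypothetical names, zones are assumed to be plain rectangles in pixel coordinates, and partial overlaps (which the program treats as joins, subtractions or splits) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    kind: str  # "text", "table", "graphic", "process" or "ignore"
    left: int
    top: int
    right: int
    bottom: int

    def encloses(self, other):
        """True if the other rectangle lies completely inside this one."""
        return (
            self.left <= other.left
            and self.top <= other.top
            and self.right >= other.right
            and self.bottom >= other.bottom
        )


def draw_enclosing_zone(zones, new_zone):
    """Add new_zone, dropping every existing zone it completely encloses."""
    kept = [zone for zone in zones if not new_zone.encloses(zone)]
    kept.append(new_zone)
    return kept


page_zones = [
    Zone("text", 10, 10, 50, 50),
    Zone("table", 60, 10, 90, 50),
    Zone("graphic", 200, 200, 300, 300),
]
result = draw_enclosing_zone(page_zones, Zone("text", 0, 0, 100, 100))
print([zone.kind for zone in result])  # ['graphic', 'text']
```

In the example the small text and table zones disappear regardless of their type, while the graphic zone outside the new rectangle is untouched.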
Use the table tools and their cursors as follows: Insert row dividers Click the tool then click at the location in a table zone where you want to place a row divider. Avoid placing a divider so it cuts through text. Insert column dividers Click the tool then click at the location in a table zone where you want to place a column divider. Move dividers Click the tool and move the cursor to the row or column divider to be moved. It displays a double-headed arrow. Drag the divider as desired.
Using zone templates
A template contains a page background value and a set of zones and their properties, stored in a file. A zone template file can be loaded to have template zones used during recognition. Load a template file in the Layout Description drop-down list or from the Tools menu.
How to unload a template Select a non-template setting in the Layout Description drop-down list. The template zones are not removed from the current or existing pages, but template zones will no longer be used for future processing. You can also open the Zone Template Files dialog box, select [none] and click the Set As Current button. In this case, the layout description setting returns to Automatic.
Chapter 4 Proofing and editing Recognition results are placed in the Text Editor. These can be recognized texts, tables and embedded graphics.
The editor display and views
The Text Editor displays recognized texts and can mark words that were suspected during recognition with wavy underlines:
• Green – Non-dictionary words: These were recognized confidently, but are not found in any active dictionary: standard, user or professional.
• Blue – Words with suspect characters: These contain unrecognized characters or are dictionary-approved words containing characters recognized with lower confidence.
True Page view
True Page® view tries to conserve as much of the formatting of the original document as possible. Character and paragraph styling is retained. All page elements, including columns, are placed in boxes and frames. Reading order can be displayed by arrows. See from page 74. The formatting level for export is chosen separately at export time.
Proofreading OCR results
After a page is recognized, the recognition results appear in the Text Editor.
3. If the recognized word is correct, click Ignore or Ignore All to move to the next suspect word. Click Add to add it to the current user dictionary and move to the next suspect word. 4. If the recognized word is not correct, modify the word in the Edit panel or select a dictionary suggestion. Click Change or Change All to implement the change and move to the next suspect word. Click Add to add the changed word to the current user dictionary and move to the next suspect word. 5.
To do this:                                 Use this:
Turn verifier on                            F9 or verifier tool
Turn verifier off                           Esc or F9 or verifier tool
Turn verifier on/off temporarily            F8: press and hold down
Show verifier until next keystroke          Double-click on word
Zoom display in                             Alt + Num + or click in verifier
Zoom display out                            Alt + Num – or click in verifier
Make verifier dynamic or docked/floating    Alt + Num /
Dynamic context (scroll through 3 values)   Alt + Num *
The verifier tool is in the Formatting toolbar.
User dictionaries The program has built-in dictionaries for many languages. These assist during recognition and may offer suggestions during proofing. They can be supplemented by user dictionaries. You can save any number of user dictionaries, but only one can be loaded at a time. Your user dictionaries from Microsoft Word are also available; a dictionary called Custom is the default user dictionary for Microsoft Word.
Training
Training, IntelliTrain and training files are not supported in OmniPage SE. They are available in OmniPage Pro 12. Any training data included in an OPD file will be ignored when it is opened in OmniPage SE. Training is the process of changing the OCR solutions assigned to character shapes in the image. It is useful for uniformly degraded documents or when an unusual typeface is used throughout a document. Training will be less useful for texts with random distortions.
the current OCR solution. Change this to the desired solution and click OK. The program takes this training and examines the rest of the page. If it finds candidate words to change, the Check Training dialog box lists these. Incorrect words should be re-trained before the list is approved. For guidance on using the Train Character and Check Training dialog boxes, please consult their context-sensitive help or the online help topic Manual training and its related topics.
IntelliTrain remembers the training data it collects, and adds it to any manual training you have done. This training can be saved to a training file for future use with similar documents.
Training files
If you want to be prompted to save your unsaved training data when you close the document, select that option in the Proofing panel of the Options dialog box. Unsaved training data is stored in an OmniPage Document.
You are editing your unsaved training. This frame is grayed. It has been deleted. To undelete it, select it again and press the Delete key. Characters marked as deleted are really deleted when you close the dialog box. Double-click a frame or press Enter to change its OCR solution. Enter the new solution in the text box that appears and press Enter. Changed assignations appear in red. This frame is selected. The top part shows the shape from the image. The bottom part shows the assigned OCR solution.
between paragraphs. The Text Editor’s horizontal ruler lets you define indent and tab positions easily. Advanced tab settings are done in the Tabs dialog box from the Format menu.
Paragraph styles
Paragraph styles are auto-detected during recognition. A list of styles is built up and presented in a selection box on the left of the Formatting toolbar. Use this to assign a style to selected paragraphs.
Frames have gray borders and enclose one or more boxes. They are placed when a visible border is detected in an image. Format frame and table borders and shading with a shortcut menu or by choosing Table... in the Format menu. Text box shading can be specified from its shortcut menu. To call up a shortcut menu, right-click inside an element away from a marked word. Multicolumn areas have pink borders and enclose one or more boxes.
Click the on-the-fly tool with a green signal. The zoning changes will cause changes in the Text Editor. Click the Perform OCR button to have the whole page (re)recognized, including your zone changes. For details on how changes are handled in on-the-fly zoning and their effects in the Text Editor views, see On-the-fly processing in online Help.

Reading text aloud

The Text-to-Speech facility is not included in OmniPage SE. It is available in OmniPage Pro 12.
The Text-to-Speech facility is enabled or disabled with the Tools menu item Speech Mode or with the F5 key. A second menu item Speech Settings... allows you to select a voice (for example, male or female for a given language), a reading speed and the volume. The three basic speech keys are grouped together on the numeric keypad.
Chapter 5

Saving and exporting

Once you have acquired at least one image for a document, you can export the image(s) to file. Once you have recognized at least one page, you can export recognition results – a single page, selected pages or the whole document – to a target application by saving to file, copying to Clipboard or sending to a mailing application. Saving as an OmniPage Document is always possible.
page is recognized (or proofread, if that was requested), an exporting dialog box appears. You can specify export any time the program is not busy. If you ask to export a document with unrecognized pages, you will be asked whether they should be recognized first. If you answer No, only results from recognized pages will be exported. If zones have been modified on recognized pages, you will be invited to re-recognize those pages before exporting.
Saving recognition results

You can save recognized pages to disk in a wide variety of file types. See “File types for saving recognition results” on page 97.

1. Choose Save As... in the File menu, or click the Export Results button in the OmniPage Toolbox with Save as File selected in the drop-down list.
2. The Save As dialog box appears, as shown in its expanded form. Click Advanced to open the lower panel and Basic to close it.
5. Click OK. The document is saved to disk as specified. If Save and Launch is selected, the exported file will appear in its target application; that is, the one associated with the selected file type in your Windows system or in the advanced saving options for your selected file type converter.
The Save As dialog box lists available file types in its Save as Type dropdown list. The OmniPage Document is the last format in the list. If you first save the document as an OmniPage Document (for instance as memo.opd), then modify it and later save it to a text file (for instance as memo.txt), then modify it again and click Save, the recent changes are saved to the memo.txt file, not to the OPD.
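The Save behaviour in the memo.opd and memo.txt example can be pictured with a small model. This is only an illustrative sketch in Python, not OmniPage code: the class `Document` and its `content`, `target` and `saved` attributes are hypothetical names, and the only assumption is that Save always writes to the file most recently chosen through Save As.

```python
class Document:
    """Toy model (hypothetical) of the 'current save target' behaviour."""

    def __init__(self):
        self.content = ""   # current state of the document
        self.target = None  # file most recently chosen in Save As
        self.saved = {}     # filename -> content as last written to it

    def save_as(self, filename):
        # Save As writes the document and makes this file the target.
        self.target = filename
        self.saved[filename] = self.content

    def save(self):
        # Save writes to the most recent Save As target only.
        if self.target is None:
            raise ValueError("no file chosen yet; use save_as first")
        self.saved[self.target] = self.content


doc = Document()
doc.content = "first version"
doc.save_as("memo.opd")
doc.content = "second version"
doc.save_as("memo.txt")
doc.content = "third version"
doc.save()  # goes to memo.txt; memo.opd keeps the first version
```

In this model the OPD file still holds the first version after the final Save, which is why it is worth saving to the OmniPage Document again if you want it to contain your latest changes.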
does not happen when text boxes are used. Flowing Page export is not offered in OmniPage SE. It is available only in OmniPage Pro.

True Page (TP)
This keeps the original layout of the pages, including columns. This is done with text, picture and table boxes and frames. This is offered only for target applications capable of handling these.

Spreadsheet
This exports recognition results in tabular form, suitable for use in spreadsheet applications.
Click Defaults to have all settings returned to the default values for the current file type. Click Save to have the changed settings applied to the current save and also stored as the settings to be applied in future whenever this file type is selected again for saving. The program currently associated with the chosen file type for the Save and Launch feature is displayed at the bottom of the dialog box. Click the three dots button to specify a different program.
Saving to PDF

This section does not apply to OmniPage SE. In OmniPage Pro 12, you have five choices when saving to Portable Document Format (PDF) files.

PDF (Normal): Pages are exported as they appeared in the Text Editor in True Page view. The PDF file can be viewed and searched in a PDF viewer and edited in a PDF editor.

PDF Edited: Use this if you have made significant editing changes in the recognition results. You have three formatting level choices, including True Page.
Text formatting, such as bold and italics, is retained when you paste into an application that supports RTF 6.0/95 information. Otherwise, only plain or Unicode text will be pasted. Graphics are retained if the application supports insertion of images.

To copy pages to the Clipboard:

• With automatic processing, select Copy to Clipboard as the setting in the Export Results drop-down list on the OmniPage Toolbox or in the OCR Wizard.
At any time the program is not busy, choose Send as Mail in the File menu to call up the Send as Mail dialog box.

1. This dialog box lets you specify a file type, a page range, a formatting level and attachment options: one attachment for all pages, one attachment per page, new attachment at each blank page or one attachment for each input file. Set all options and click OK.
2. Log into your mail application if you are prompted to do so.
3.
Chapter 6

Technical information

This chapter provides troubleshooting and other technical information about using OmniPage SE. Please also read the online Readme file and other help topics, or visit the ScanSoft web site. Its scanner section contains detailed and regularly updated information about scanner setup and support. The Readme file contains last-minute information relating to OmniPage SE. Access to the Readme file and to ScanSoft’s web pages is provided in the Help menu.
Troubleshooting

Although OmniPage SE is designed to be easy to use, problems sometimes occur. Many of the error messages contain self-explanatory descriptions of what to do – check connections, close other applications to free up memory, and so on. Sometimes that is all the troubleshooting help you need. Please see your Windows documentation for information on optimizing your system and application performance.
Testing OmniPage SE

Restarting Windows 98, Me, 2000 or XP in safe mode or Windows NT in VGA mode allows you to test OmniPage SE on a simplified system. This is recommended when you cannot resolve crashing problems or if OmniPage SE has stopped running altogether. See Windows online Help for more information. Your scanner will not run with OmniPage SE in safe mode or VGA mode, so do not test scanner problems in this configuration.
5. Launch OmniPage SE and try performing OCR on an image. Use a known image file such as one of the supplied sample files.

You can also run OmniPage SE from a command line in its own safe mode. Choose Run from the Start menu, browse for the file OmniPage.exe and add the command line option /safe. This starts the program, but ignores previously stored settings and does not try to recover a document from an abnormal termination.

Increasing memory resources

OmniPage SE may run poorly under low-memory conditions.
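Returning to the command-line safe mode described under Testing OmniPage SE: if you prefer to start it from a script rather than from the Run dialog box, the same command can be assembled as below. This is a minimal sketch in Python. Only the file name OmniPage.exe and the option /safe come from this guide; the function names and the installation folder shown in the comment are assumptions, so substitute the actual location on your system.

```python
import subprocess


def safe_mode_command(exe_path):
    """Return the command line that starts OmniPage in its own safe mode."""
    # /safe is the option named in this guide; exe_path must point to
    # OmniPage.exe wherever it is installed on your system.
    return [exe_path, "/safe"]


def launch_safe_mode(exe_path):
    """Start the program with /safe and return the process handle."""
    return subprocess.Popen(safe_mode_command(exe_path))


# Example call (the folder is a hypothetical install location):
# launch_safe_mode(r"C:\Program Files\ScanSoft\OmniPage\OmniPage.exe")
```

Passing the command as a list keeps a path that contains spaces intact without manual quoting.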
Text does not get recognized properly

Try these solutions if any part of the original document is not converted to text properly during OCR:

• Look at the original page image and ensure that all text areas are enclosed by text zones. If an area is not enclosed by a zone, it is generally ignored during OCR. See the section on creating and modifying zones, “Working with zones” on page 59.
• Make sure text zones are identified correctly.
OmniPage SE only recognizes machine-printed text characters such as typewritten or laser-printed text. It can handle dot-matrix characters, though accuracy may be lower on draft-quality texts. It cannot read handprint or handwriting. However, it can retain signatures or other handwritten text as a graphic.

Problems with fax recognition

Try these solutions to improve OCR accuracy on fax images:

• Ask senders to use clean, original documents if possible.
ODMA support

This does not apply to OmniPage SE. If your local network includes a Document Management System (DMS) that supports ODMA clients, OmniPage Pro may be able to work with it. Then an ODMA panel will appear in the Options dialog box allowing you to specify permissible file types and other settings. An ODMA interface will replace the Load Image File and Open OmniPage Document (OPD) dialog boxes.
Supported file types

The program supports a wide range of file types for images and text.
File types for saving recognition results

This table shows which formatting levels are available for each file type.

File type                          Extension
eBook (see note 1)                 opf
Excel 97, 2000                     xls
Excel 3.0 to 7.0                   xls
FrameMaker 5.5.3                   mif
Freelance Graphics                 txt
Harvard Graphics                   txt
HTML 4.0 (see notes 1 and 2)       htm
HTML 3.2 (see note 2)              htm
Microsoft PowerPoint 97            rtf
Microsoft Publisher 98             rtf
Microsoft Word 6.0, 97, 2000, XP   doc
PageMaker 6.5.2                    doc
Quattro Pro for Windows 4.
Tables

File type supports tables in grids; no table handling choices at export time.
File type supports tables; choose to use grids or tab-separated columns.
File type does not support table grids; choose to convert to tab- or space-separated columns.

1 These output formats and Flowing Page are not supported by OmniPage SE.
2 When saving to HTML, all graphics are saved as separate JPEG image files.
3 Recognition results are sent to Clipboard in RTF 95/6.
INDEX

A

Accuracy improvement, 51, 71, 93 influence of brightness, 52 influence of training, 71 scanning mode influence, 51 Acquire Text menu items, 47 Acquired pages, 28 Acquiring images, 23, 42 Adding pages to a document, 41 to zones, 60 training to training files, 73 words to a user dictionary, 68 ADF, 33, 50, 52 Advanced saving options, 84 Advice on problems, 90 Alphanumeric zone, 57 Attachments to mail messages, 87 Auto-detect layout, 53 Automatic Document Feeder (ADF), 33, 50, 52 Automatic proce
to target applications, 23, 42, 80 True Page, 84 F Fax recognition, 94 Features OmniPage SE compared to OmniPage Pro, 8, 10, 19 Features, new, 17 Files as export target, 80 as image source, 50 retained on uninstalling, 98 separation options, 81, 88 types, 81 types for export, 83, 97 types supported, 96 Finding non-dictionary words, 67 suspect words, 67 Finishing a document, 41 Floating toolbars, 25 Flowing Page, 83 Folder input for Schedule OCR, 95 Formatting levels, 49, 66, 97 Formatting levels and file t
settings for Direct OCR, 46 Wizard, 39, 45, 46 ODMA support, 95 OmniPage Desktop, 24 OmniPage Documents contents of, 82 definition, 31 purpose of OPD files, 32 saving as, 32, 82 OmniPage Pro new features of, 17 OmniPage SE documents in, 23 earlier versions, 13 features compared to OmniPage Pro, 8, 10, 19 installing, 13 registering, 17 reinstalling, 98 starting, 14 testing, 91 uninstalling, 98 OmniPage Toolbox, 24, 27, 40 Online HTML Help, 9 registration, 17 On-the-fly editing and zoning, 77 OPD files defini
drivers, 14 duplex, 53 setting up, 14 Scanning black-and-white, 51 books, 33 brightness, 33, 52 color, 51 contrast, 33 grayscale, 51 input from, 51 pictures, 51 Wizard, 14 Schedule OCR, 49 input from folders, 95 watched folders, 95 Searching PDF output, 86 Selecting multiple pages, 28 Send Mail dialog box, 87 Sending pages by mail, 87 Setting up a scanner, 14 Setting up Direct OCR, 47 Settings Acquire Text, 47 effect of settings, 34 for Direct OCR, 47 in OCR Wizard, 46 in Options dialog box, 33 zone types,