Award-winning OCR: over 70 top industry awards worldwide. Recognized leader: greater accuracy, better performance, easier to use.
Optical Character Recognition Program ABBYY FineReader Version 6.
Information in this document is subject to change without notice and does not bear any commitment on the part of ABBYY Software House. The software described in this document is supplied under a license agreement. The software may only be used or copied in strict accordance with the terms of the agreement.
ABBYY FineReader 6.0 User’s Guide
Contents
Chapter 1 Installing and Starting ABBYY FineReader
    Software and Hardware Requirements
    Installing ABBYY FineReader
Chapter 6 Recognition
    General Information on Recognition
    Recognition Language
WELCOME! Thank you for choosing ABBYY FineReader! We all need to input text into our computers from time to time, whether it be newspaper/magazine articles, contracts, business letters, faxes, price lists, or questionnaires. For years there was only one way to input printed documents – you had to type them in from the keyboard.
User’s Guide
The User’s Guide introduces you to the basics of using ABBYY FineReader. Each chapter starts with a short summary description and a list of the chapter’s contents.
Online Help
FineReader's online Help contains basic and advanced information on program features, settings and dialogs. Online Help is provided in HTML format and has been designed for quick and easy information retrieval.
Chapter 1 Installing and Starting ABBYY FineReader This chapter deals with ABBYY FineReader installation procedures and related subjects, such as system requirements and workstation/network installation. A special installation program carries out the set up of FineReader. Always use the diskette/CD-ROM supplied as part of your software package. Installation is not possible using copied files.
Software and Hardware Requirements
For ABBYY FineReader to function correctly, your computer must meet the following system requirements:
1. PC with an Intel® Pentium® 200 MHz processor or higher
2. Microsoft® Windows® XP, Microsoft® Windows® 2000, Windows® NT® Workstation 4.0 with Service Pack 6 or greater, Windows® 95/98/Me
3. 64 MB (Windows XP/2000), 32 MB (Windows Me/98/NT 4.
Note: An Installation Code is required to complete installation if one of the following applies to your computer:
● there is no 3.5" floppy disk drive present;
● installation is being carried out using non-original or corrupted media;
● applications have been installed that are in conflict with ABBYY FineReader.
Installation on a Network Workstation
If ABBYY FineReader 6.0 Corporate Edition has been installed on a network server, the setup program can be run directly from the server.
To install ABBYY FineReader 6.0 Corporate Edition on a workstation:
● Run Setup.exe from the network folder containing ABBYY FineReader Corporate Edition 6.0. Follow the installation instructions.
Note: 1.
Chapter 2 Quick Start In this chapter you will learn how to input a document without having to know anything about the way in which ABBYY FineReader works! You will also learn which windows and toolbars are contained within FineReader. If you already have experience of working with FineReader, you may wish to skip this chapter altogether and go directly to New features of ABBYY FineReader 6.0 in chapter 3.
How to Input a Document in Less than a Minute
1. Turn on the scanner if its power source is separate from your PC's. Note: Many scanner models have to be turned on before you turn on the computer.
2. Turn on the computer and start FineReader (Start/Programs/ABBYY FineReader Professional 6.0 or Corporate Edition 6.0). The FineReader main window will appear on your screen.
3. Place the page you want read onto the scanner.
4.
Main window
● Standard toolbar
● Formatting toolbar
● Wizard Bar - provides tools for full text processing: Scanning, Recognition, Spellcheck and Saving
● Text window - displays the recognized text for checking and editing
● Image window - displays the scanned image for viewing and drawing blocks
● Zoom window - displays the zoomed-in image of the text line you edit or part of an image you are working on
● Image Tools toolbar - provides tools for drawing and editing blocks, zoom tools and tool for editing
Scan&Read
● Scan&Read Wizard - launches Scan&Read mode. FineReader guides you through document processing and advises you on how best to obtain the desired result.
● Scan&Read - starts scanning and reading a document using the current options.
● Scan&Read Multiple Images - scans and reads several consecutive images.
● Open&Read - opens and reads the images selected in the Open dialog.
1-Scan
● Open Image - adds image(s) to the batch.
The Standard toolbar
The Standard toolbar features file and image tools (undo/redo an action, scroll the batch pages, clean and rotate the image) and the list of Recognition languages.
Note: Block creation and editing buttons can also be used in the Zoom and Image windows.
Setting up the toolbar
Note: The appearance of the FineReader main window, or more precisely, the number of buttons displayed on FineReader’s toolbars, depends on your monitor’s resolution. To display all available buttons you need to increase your monitor’s resolution.
Chapter 3 General Features of ABBYY FineReader FineReader provides you with all the tools you need for inputting documents into your computer. Just click on the Scan&Read button once and all the rest is done for you - so you don't have to spend hours studying the user’s guide beforehand.
What is an OCR System?
An OCR (Optical Character Recognition) system enables you to input printed documents into your computer automatically via a scanner. FineReader is an omnifont optical text recognition system. As a result it can recognize texts set in practically any font without any prior training.
Image processing
● Printing of scanned images and recognized text.
● Automatic and manual splitting of dual-page and business card scans.
Recognition
● 177 recognition languages. See the full list under Supported languages in ABBYY FineReader Help.
● An improved algorithm for the recognition of poor print quality documents.
Supported Image Formats
ABBYY FineReader opens image files in the following formats:
PDF: Files in PDF format (Version 1.3 or earlier).
BMP:
PCX, DCX:
Chapter 4 Acquiring the Image Recognition quality depends greatly on the quality of the source image. In this chapter you will learn how to scan documents correctly, how to open and read saved images (see the list of supported image formats under Supported Image Formats in the ABBYY FineReader Help section), and how to process images and improve recognition quality (by eliminating scanning "dust") etc.
Scanning
FineReader "talks" with scanners via the TWAIN interface. This is a universal standard adopted in 1992 to unify the interaction of computer image inputting devices (such as scanners) and external applications.
Tip: To start recognition immediately after the source images have been scanned, use the Scan&Read or Scan&Read Multiple Images option: click the arrow to the right of the Scan&Read button and select either the Scan&Read or the Scan&Read Multiple Images item in the local menu. FineReader will scan and read the images. The Image window displaying a "photograph" of the scanned page and the Text window displaying the recognition results will appear in FineReader’s main window.
Tips on Brightness Tuning
The scanned image has to be legible. To check its legibility, view the image in the Zoom window.
[Figure: an example of a good image (from an OCR point of view)]
If you see that the scanned image is far from perfect (characters are glued or torn), consult the table below to find out how you can improve image quality.
pause value (in seconds) in the Scanner Settings dialog (Tools>Scanner Settings menu). As a result, the scanner won’t begin scanning the next page until the specified number of seconds has elapsed, thus allowing you sufficient time to place the next page onto the scanner. After the pause, scanning continues automatically.
● Select the Stop between pages option in the Scanner Settings dialog (Tools>Scanner Settings menu).
Tip: If you want the opened images to be recognized right away, select Open&Read mode:
1. Select the Open&Read item in the Process menu or just press CTRL+SHIFT+D. The Open dialog will open.
2. Select the images for recognition in the Open dialog.
Scanning Dual Pages
When scanning a book, although it is easier to scan both the left and right pages (i.e.
Working with the Image
● Despeckle image
● Invert image
● Rotate or flip image
● Clear block
● Increase/Decrease the image scale
● Get image information
● Print image
● Undo the last action
1. Despeckle image
The recognized image may have a large amount of "dust" present on it, i.e. a large number of excess dots.
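As an illustration only, the idea of removing "dust" can be sketched in Python as deleting small isolated groups of black pixels from a black-and-white image. This is not FineReader's actual algorithm; the function name despeckle, the max_speck_size threshold, and the list-of-lists image representation are all assumptions made for this sketch.

```python
from collections import deque


def despeckle(image, max_speck_size=2):
    """Return a copy of a binary image (1 = black, 0 = white) in which every
    4-connected group of black pixels containing max_speck_size pixels or
    fewer has been erased. The threshold is a hypothetical parameter."""
    rows, cols = len(image), len(image[0])
    result = [row[:] for row in image]
    seen = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if image[r][c] != 1 or seen[r][c]:
                continue
            # Collect one connected group of black pixels (breadth-first).
            group = []
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                group.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and image[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Small groups are treated as dust and whitened.
            if len(group) <= max_speck_size:
                for y, x in group:
                    result[y][x] = 0
    return result
```

A larger threshold removes more dust but risks erasing genuine small marks such as periods or the dots on "i", which is why the sketch exposes it as a parameter.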
To flip the image:
● horizontally (around the vertical axis) - select the Flip Horizontal item in the Image menu,
● vertically (around the horizontal axis) - select the Flip Vertical item in the Image menu.
4. Clear block
If you do not wish a certain image area to be recognized or if you have large areas of dust present on the image, you can simply erase them.
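For readers who want to see what the two flip directions described above mean geometrically, here is a minimal Python sketch. It treats an image as a list of pixel rows; the function names are hypothetical and are not part of FineReader.

```python
def flip_horizontal(image):
    """Mirror the image around its vertical axis: each row is reversed,
    so the left edge becomes the right edge."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image around its horizontal axis: the order of the rows
    is reversed, so the top edge becomes the bottom edge."""
    return [row[:] for row in image[::-1]]
```

Applying either function twice returns the original image, which is a quick way to check that a flip has been applied correctly.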
If you are scanning a large number of double-sided pages according to page number:
1. Select the Ask for page number before adding page to the batch item on the Scan/Open Image tab (Tools>Options).
2. Specify the number of the first scanned page in the Page number dialog, then select the Odd and even separately option in the Page numbering field.
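The effect of numbering odd and even pages separately can be pictured with a short Python sketch. It assumes that, with the option selected, each newly scanned page receives a number two higher than the previous one, so that one pass over the stack fills the odd numbers and a second pass fills the even ones. That behavior, the function name, and its parameters are assumptions made for illustration, not a description of the program's internals.

```python
def page_numbers(first_page, page_count, odd_and_even_separately=True):
    """Return the page numbers assigned to page_count consecutively scanned
    pages, starting at first_page. With odd_and_even_separately set, numbers
    advance in steps of two (assumed behavior); otherwise in steps of one."""
    step = 2 if odd_and_even_separately else 1
    return [first_page + i * step for i in range(page_count)]
```

Under these assumptions, scanning the front sides with first page 1 and then the reverse sides with first page 2 produces two interleaving number sequences that together cover the whole document.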
Chapter 5 Page Layout Analysis FineReader must know which image areas it needs to recognize before starting the recognition process. Page layout analysis provides it with this information by identifying text blocks, picture blocks, table blocks, and barcode blocks (note: the latter are only available in the Corporate Edition).
General Information on Page Layout Analysis
Page layout analysis can be carried out both automatically and manually. In most cases, FineReader manages the complex task of page layout analysis by itself. Start automatic analysis by clicking on the 2-Read button. Recognition and layout analysis are performed simultaneously.
Note: A stand-alone page layout analysis procedure is also available (Process>Analyze Layout menu).
Note: It is possible to have barcode analysis and recognition carried out automatically, but this option is not set by default. To enable this option, select the Look for barcodes item on the Recognition tab (Tools>Options menu).
Physical Phenomenon       Degrees, Centigrade
Water boiling point       100
Water freezing point      0
(This table has more than one line of text per cell.)
2. Use the No merged cells in table option if your table has no merged cells.
2. Position the mouse at the point where you want a corner of your block to be. Hold down the left mouse button and drag the mouse pointer to the point where you want the opposite block corner to be.
3. Release the mouse button. A frame will enclose the image area selected. You may then change the block type. The drawn block type may be one of the following: Recognition Area, Text, Table, Picture, or Barcode.
To select a block or a group of blocks:
● Select the tool and click on the desired block or press the left mouse button and draw a rectangle around all the blocks you wish to select.
Note: You can select one or more blocks using the usual block drawing tools. To select several blocks at once, hold down SHIFT or CTRL with one of the tools activated and drag the arrow over the blocks you want to select. To invert the selection (i.e.
If the table cell only contains a picture, select the Treat cell as a picture item in the Block Properties dialog (View>Properties menu). If the table cell contains both text and pictures, draw a separate picture block (or blocks) inside the cell.
To merge table cells or rows:
● Select the Merge Table Cells or Merge Table Rows item in the Edit menu.
Note: You can split previously merged cells using the Split Table Cells command (Edit menu).
Chapter 6 Recognition The aim of OCR is to read text from a source image and retain the source page layout. Before this can be done, however, the main recognition parameters – recognition language, source text print type, and document type – need to be set. This chapter deals with these parameters and other important recognition issues, including the use of different recognition settings etc.
General Information on Recognition
Note: Always ensure that the following options have been correctly set before you start recognition: recognition language, source text print type, and document type.
You may:
1. Recognize a block or several blocks drawn on an image.
2. Recognize an open page or all pages selected in the Batch Window.
3. Recognize all unrecognized batch pages.
4. Recognize all pages in background mode.
To recognize a multilingual document:
1. Select the Select multiple languages item in the language list on the Standard toolbar. The Recognition language dialog will open.
2. Select the languages of your choice in the Recognition language dialog.
Note: 1. If you find that you often use a certain language combination, you can create a new language group that includes the languages you most often use. 2.
[Figure: an example of draft mode dot matrix text. Character lines are made up of individual dots.]
[Figure: an example of typewritten text. All letters are of equal width (compare, for example, "w" and "a").]
To change print type:
● Select the print type of your choice on the Recognition tab in the Options dialog (Tools>Options menu).
Note: On multiprocessor systems, running Background mode only increases recognition speed if the batch being processed contains a large number of pages.
To stop Background Recognition:
● Select the Stop Background Recognition item in the Process menu.
Note: Background recognition mode uses currently active recognition options.
Recognition with Training
As previously stated, FineReader can read texts set in practically any font regardless of print quality.
2. Click the 2-Read button.
3. Train your pattern - recognize one or more pages in Train user pattern mode. Trained characters are saved in the default pattern. Once you have completed training the pattern, FineReader will save the pattern (Default.pat) in the current batch folder.
4. Edit your pattern.
5. Deactivate training mode (click the Use user pattern radio button on the Recognition tab).
6. Recognize the rest of the text - click the 2-Read button.
Note: 1.
buttons move the frame border as well (and are useful for training italic symbols - see below). Once you have positioned the frame correctly, type in the character and click the Train button.
Note: 1. You may only train the system to read characters included in the alphabet.
To edit a user pattern:
1. Select the Pattern Editor item in the Tools menu. The Pattern Editor dialog will open.
2. Select the relevant pattern and click the Edit button in the dialog. The User Pattern dialog will open.
3. Select a character and click the Properties button to edit the character caption and set the correct typeface: italic, bold, subscript or superscript. Click on the Delete button to remove any incorrectly trained characters from the batch.
Set the following language parameters for the new language (all parameters are entered in the Simple Language Properties dialog):
1. The new language name.
2. The basic alphabet to be used by the language. This parameter is set in the Alphabet field. If necessary, edit the alphabet by clicking the button.
3. The dictionary to be used by the application (for both recognition and spell check purposes).
2. The Language Group Properties dialog will open. Set the following new language group parameters (all parameters are set in the Language Group Properties dialog):
1. Group name.
2. Languages contained in the group.
Note: 1. If you know that your text will not contain certain characters, you may wish to specify these as prohibited characters in the relevant language group’s properties. Prohibiting such characters can increase both recognition speed and quality.
Chapter 7 Checking and Editing Text Once recognition is over, you will see the recognized text displayed in the Text window. The Text window is ABBYY FineReader's built-in editor, used to check recognition results and edit any recognized text. The FineReader text editor has two distinctive features: 1. A built-in spell check system (see the list of languages with spell check support under Supported Languages in ABBYY FineReader Help). 2.
Checking Text in ABBYY FineReader
Uncertainly recognized characters and words not found in the dictionary are highlighted in different colors. By default, light blue is used for uncertain characters and pink for words not found in the dictionary.
To change the colors used:
● Select the Uncertain Character (or Not in Dictionary word) item followed by the color of your choice in the Color item on the View tab (Tools>Options menu) in the Appearance group.
made for the word in the Suggestions window, you can enter one yourself in the middle window. (Important: when you switch to edit mode, certain buttons may change function and adopt new captions). Click the Confirm (Confirm All) button to change the current word (or all such words) in the text and move to the next uncertainly recognized word.
● Click Add... to add a word to the dictionary.
Ignore words with digits and other non-alphabetic characters
The spell check treats all words containing digits and other characters not included in the recognition language as correct unless they also contain uncertain characters.
Correct spaces before and after punctuation marks
The spell check does not stop if it comes across incorrect spacing before or after punctuation marks; it simply corrects it automatically.
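The kind of correction described above can be sketched with two regular expressions in Python. The rules used here (no space before a punctuation mark, one space after it when a letter follows) are assumptions chosen for illustration; the guide does not specify the rules FineReader applies, and the function name is hypothetical.

```python
import re


def correct_punctuation_spacing(text):
    """Apply two assumed spacing rules to recognized text:
    1. remove spaces or tabs that come directly before , . ; : ! ?
    2. insert one space where such a mark is directly followed by a letter."""
    text = re.sub(r"[ \t]+([,.;:!?])", r"\1", text)
    text = re.sub(r"([,.;:!?])(?=[A-Za-z])", r"\1 ", text)
    return text
```

Because the second rule only fires when a letter follows, numbers such as 3.14 are left alone. Abbreviations such as "e.g." would be split by this simple sketch, so a real implementation would need dictionary support or an exception list.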
If the word you wish to add is already present in the dictionary, a notice to this effect will be issued. You may then wish to view its paradigm. If you think the existing paradigm is incorrect (this is often the case with homonymous words, for example), construct another one (click the Add button in the Add Word dialog).
Tip: 1. FineReader allows you to import user dictionaries created by previous versions (3.0, 4.0 and 5.0). 2.
[Figure: toolbar button labels: Font, Font size, Bold, Italic, Underlined, Superscript, Subscript, Align left, Center, Align right, Justify, Display nonprinted characters, Previous error, Next error]
following editing), parts of other inactive blocks may become invisible. If this is the case, the borders of the block(s) concerned will be colored red. When a block is active, its borders are enlarged so as to display the entire block text.
Search and replace
To find a word or phrase in the text you are editing:
1. Either select the Find item in the Edit menu, or press CTRL+F.
2. The Search dialog will open. Type the word or phrase you wish to find in the Find what line of the dialog and set the search parameters.
Note: To search for the same word again using the same parameters, press F3.
To search and replace a word or phrase in the text you are editing:
1.
Editing Tables
The table editor provides you with tools to carry out the following:
● Merge cell or row contents
● Split cell contents
● Split row/column contents
● Delete cell contents
To merge cell or row contents:
● Hold down the CTRL key and select the cells or rows you wish to merge, followed by the Merge Table Cells or Merge Table Rows item in the Edit menu.
To split cell contents:
● Select the Split Table Cells item in the Edit menu.
Chapter 8 Saving into External Applications and Formats Recognition results can be saved to a file, sent to an external application without saving, copied to the clipboard, or sent via e-mail. All pages or selected ones only may be saved. FineReader can export recognition results to the following applications: Microsoft Word 6.0, 7.0, 97 (8.0), 2000 (9.0) and 2002 (10.0); Microsoft Excel 6.0, 7.0, 97 (8.0), 2000 (9.0) and 2002 (10.0); Corel WordPerfect 7.0, 8.0, 9.0 and 2002 (10.0); Lotus Word Pro 9.
General Information on Saving Recognized Text
You may:
● save recognized text using the Save Wizard,
● save open or selected pages to file or send them to an external application,
● save all batch pages to a file or export them into an external application,
● save the page image.
Click the 4-Save button to export recognition results to the application of your choice or save them to file. The icon’s appearance will depend on the currently active save mode.
Retain pictures
If you choose this option, pictures will be saved together with recognized text. The option is only available in the case of RTF, DOC, and HTML formats.
Image resolution (RTF/DOC, PDF, and HTML formats)
Sometimes you may wish to reduce image resolution. For example, HTML files are normally viewed using browsers, and high-resolution files, due to their size, are usually unwelcome on the Internet.
● Create a new file at each blank page - the whole batch is treated as a set of page groups, with each group ending with a blank page. Pages from different groups are saved into different files with file names consisting of the user-specified name and index number: -1, -2, -3 etc.
● Create a single file for all pages - all (or all selected) batch pages are saved as a single file.
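The grouping rule for the blank-page option can be sketched in Python as follows. The sketch assumes that a page counts as blank when its recognized text is empty or whitespace only, and that the blank separator pages themselves are not written to any file; neither detail is stated in this guide. The function name and the dictionary return value are likewise hypothetical.

```python
def split_at_blank_pages(pages, base_name):
    """Group page texts into files, starting a new group after each blank
    page. Returns a dict mapping file names (base_name plus -1, -2, -3 ...)
    to the list of pages saved in that file. Blank pages are dropped
    (an assumption of this sketch)."""
    groups = []
    current = []
    for page in pages:
        if page.strip() == "":
            # A blank page closes the current group, if it has any pages.
            if current:
                groups.append(current)
                current = []
        else:
            current.append(page)
    if current:
        groups.append(current)
    return {f"{base_name}-{i}": group for i, group in enumerate(groups, start=1)}
```

For example, a six-page batch in which the third and fifth pages are blank would be saved under this scheme as three files named with the suffixes -1, -2 and -3.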
Note: 1. A special Replace uncertain words with images option is available if you use Text and pictures only or Text over the page image mode. If you select this option, all uncertain words will be replaced with their images. Set this option on the PDF tab in the Formats Settings dialog. 2. If you wish to edit recognized text before exporting it in PDF format, we recommend you pay special attention to preserving the original line division (i.e.
To set the HTML format of your choice:
● Click the relevant radio button on the HTML tab in the Formats Settings dialog (Tools>Formats menu) in the Formats group.
Note: The application detects the code page automatically. To change code page, select the code page of your choice in the Code page field on the HTML tab in the Formats Settings dialog.
Saving the Page Image
1. Select a batch page.
2. Select the Save Image As item in the File menu.
Chapter 9 Working with Batches The batch is the main ABBYY FineReader data depository: scanned images, recognized text and other data are all kept in the batch. The majority of FineReader settings are batch settings: scanning, recognition, saving options, etc. User patterns, user languages and user language groups are also batch "property". When you create a new batch, you may use the default batch settings, the settings of the current batch, or settings saved in an *.fbt file.
General Information on Working with Batches
When FineReader starts for the first time, it opens the batch located in the FineReader folder. You can choose to work with this batch or create a new one. A batch may contain up to 9999 pages.
Tip: You may find it useful to save similar-type pages (e.g. pages from the same book, written in the same language, or with a similar layout) in the same batch. This makes your work much easier to find later.
Note: To save batch settings in a file, click the Save button on the General tab (Tools>Options menu). A Save Batch Template As dialog will open. Enter the file name. The following settings will be saved: the Recognition, Scan/Open Image, Formatting, and Check Spelling tab settings, as well as all Formats Settings dialog tab settings. User languages, user language groups and user patterns will also be saved in this file.
Note: If you double-click a page number, the page concerned will be opened.
To renumber pages in the Renumber Pages dialog:
1. Select a single page or several pages.
2. Select the Renumber Pages item in the Batch menu.
3. Set the new number for the first page selected (the page with the lowest number).
Note: 1. To renumber all batch pages, select the All Pages item in the Renumber Pages dialog. 2.
Full-Text Search in Recognized Batch Pages (FineReader Corporate Edition only)
You can search through all recognized pages for words in all of their grammatical forms. The search pattern may consist of one or several words. The words may be in any form (for languages with dictionary support), and they may be located at any distance from each other in the text and in any order.
To carry out a full-text search:
1.
Chapter 10 Network Document Processing The ABBYY FineReader Corporate Edition is especially designed for network document processing. Each computer involved in network processing must have a separate copy of FineReader installed (for more information on network installation of FineReader, see under Installation on a Network Server and on a Network Workstation). With ABBYY FineReader Corporate Edition you can: 1.
Work with the Same Batch Over a Network (FineReader Corporate Edition only)
1. Create/Open a batch and set up the required scanning and recognition options.
2. Run FineReader and open the relevant batch on all computers that are to process it.
3. Run background recognition (Process>Start background recognition) on all computers involved in recognizing the batch.
4. Start the scanning on a computer equipped with an ADF scanner.
Once setup is complete, save the batch settings in a batch template file (*.fbt):
● Click the Save button on the Options>General tab (Tools>Options). In the Save Batch Template As dialog, open the folder and enter the file name.
Before several users can work with the same user languages and dictionaries stored in a new batch, each of them will need to load the batch settings from the previously saved *.fbt file. Select the Batch template (.
Appendix
The Batch Menu
To:                                                          Press:
Open the next batch page                                     ALT+Down
Open the previous batch page                                 ALT+Up
Open a page with specified number                            CTRL+G
Close the current page                                       CTRL+4
Delete the recognized text in the Text window                CTRL+SHIFT+Del
Delete all blocks in the Image window and all
recognized text in the Text window                           CTRL+Del
Update page list                                             F5

The Process Menu
To:                                                          Press:
Scan and read an image                                       CTRL+D
Open and read an image                                       CTRL+SHIFT+D
Start Scan&Read Wizard                                       CTRL+W
Analyze layout