Optical Character Recognition Program ABBYY FineReader Version 6.0 User’s Guide ©2002 ABBYY Software House.
Information in this document is subject to change without notice and does not bear any commitment on the part of ABBYY Software House. The software described in this document is supplied under a license agreement. The software may only be used or copied in strict accordance with the terms of the agreement.
Contents
Chapter 1 Installing and Starting ABBYY FineReader
  Software and Hardware Requirements
  Installing ABBYY FineReader
  Network Server/Workstation Installation
Chapter 5 Page Layout Analysis
  General Information on Page Layout Analysis
  Block Types
  Automatic Page Layout Analysis Options
Chapter 9 Working with Batches
  General Information on Working with Batches
  Creating a New Batch
  Opening a Batch
Welcome! Thank you for choosing ABBYY FineReader! We all need to input text into our computers from time to time, whether it be newspaper/magazine articles, contracts, business letters, faxes, price lists, or questionnaires. For years there was only one way to input printed documents – you had to type them in from the keyboard.
User’s Guide
The User’s Guide introduces you to the basics of using ABBYY FineReader. Each chapter starts with a short summary description and a list of the chapter’s contents.
Online Help
FineReader’s online Help contains basic and advanced information on program features, settings and dialogs. Online Help is provided in HTML format and has been designed for quick and easy information retrieval.
Readme file
The Readme file contains the latest information on the software.
Chapter 1 Installing and Starting ABBYY FineReader
This chapter deals with ABBYY FineReader installation procedures and related subjects, such as system requirements and workstation/network installation. A special installation program carries out the setup of FineReader. Always use the diskette/CD-ROM supplied as part of your software package. Installation is not possible using copied files.
Software and hardware requirements
For ABBYY FineReader to function correctly your computer must meet the following system requirements:
1. PC with an Intel® Pentium® 200 MHz processor or higher
2. Microsoft® Windows® XP, Microsoft® Windows® 2000, Windows® NT® Workstation 4.0 with Service Pack 6 or greater, Windows® 95/98/ME
3. 64 MB (Windows XP/2000), 32 MB (Windows Me/98/NT 4.
3. Click the Start button on the Taskbar and select the Settings/Control Panel item.
4. Double-click the Add/Remove Programs icon.
5. Select the Install/Uninstall tab and click the Install button.
6. Follow the installation instructions.
If your software package contains only a CD-ROM, proceed as follows:
1. Insert the CD-ROM into the CD-ROM drive.
2.
Additional licenses
Following installation on a network server, you will need to add serial numbers if FineReader is to be used by more than one user simultaneously:
1. Run LicSetup.exe from the folder Program files\ABBYY FineReader 6.0 where ABBYY FineReader 6.0 Corporate Edition was installed. The Add License dialog will be displayed.
2. Enter a new serial number and click the Add button.
Note: 1.
Starting ABBYY FineReader
To start ABBYY FineReader:
• Select the ABBYY FineReader 6.0 Professional (Corporate Edition) item in the Start/Programs menu.
Note: Make sure your scanner is connected to your computer, plugged in, and turned on before you start FineReader. If your scanner has yet to be installed, please consult the user guide supplied with the scanner for instructions on how to install it.
Chapter 2 Quick Start
In this chapter you will learn how to input a document without having to know anything about the way in which ABBYY FineReader works! You will also learn which windows and toolbars FineReader contains. If you already have experience of working with FineReader, you may wish to skip this chapter altogether and go directly to the part entitled New features of ABBYY FineReader 6.0.
How to input a document in less than a minute
1. Turn on the scanner if it has a separate power source from your PC.
Note: Many scanner models have to be turned on before you turn on the computer.
2. Turn on the computer and start FineReader (Start/Programs/ABBYY FineReader 6.0 Professional or Corporate Edition). The FineReader main window will appear on your screen.
3. Place the page you want read on the scanner.
4.
When you start FineReader for the first time, the default batch is opened. You can choose to work with the default batch or create a new batch of your own. See “General Information on Working with Batches” for more information.
The Batch window is always displayed in the Main window. Three more windows may also be displayed: the Image, Zoom and Text windows. The Image, Zoom and Text windows are interconnected: when you double-click a certain image area in the Image window, the respective area is displayed in the Zoom window, and the pointer in the Text window is moved to the position clicked on (if text has already been recognized on the page).
The WizardBar toolbar
The WizardBar buttons launch the main FineReader functions: Scanning, Reading, Checking and Saving the recognition results. The numbers on the buttons indicate the order in which the respective document input actions should be performed. You may perform each action separately or combine them into one by clicking the Scan&Read Wizard button. In the latter case, the Scan&Read Wizard will then perform the full document processing cycle automatically.
3 Check Spelling
Check Spelling – searches the text for misspelt and uncertain words (i.e. ones containing uncertainly recognized characters).
Options – opens the Check Spelling tab (Options dialog) to allow spellcheck options to be set.
4 Save
Save Wizard – opens the Save Wizard to allow saving options and the destination application to be selected.
Save Text to File – saves the recognized text to a disk file.
[Figure: toolbar buttons labeled Font, Font size, Bold, Italic, Underlined, Superscript, Subscript, Align left, Center, Align right, Justify, Display nonprinted characters, Previous error, Next error]
The Image Tools bar
The Image Tools bar features page layout analysis (e.g. block creation and editing) tools, as well as tools for increasing/decreasing the image scale and image editing (e.g. eraser).
Note: Block creation and editing buttons can be used both in the Zoom and Image windows.
FineReader allows you to customize the Standard, Image and Formatting toolbars: application command buttons can be added and removed at will. Each menu item has its own icon. See the full list of commands and their respective buttons in the Customize (Tools>Customize menu) dialog in the Commands list.
To add a button to a toolbar:
1. Select the category of your choice in the Categories field.
Chapter 3 General Features of ABBYY FineReader
FineReader provides you with all the tools you need for inputting documents into your computer. Just click on the Scan&Read button once and all the rest is done for you – so you don’t have to spend hours studying the User’s Guide beforehand.
What is an OCR system?
An OCR (Optical Character Recognition) system enables you to input printed documents into your computer automatically via a scanner. FineReader is an omnifont optical text recognition system. As a result it can recognize texts set in practically any font without any prior training.
New features of ABBYY FineReader 6.0
General features
• Now you can open and read PDF files in FineReader. PDF is one of the standard formats used for publishing documents on the Internet, as well as for document archiving, etc. You can open, read, and edit any PDF file in FineReader, and then save it in either PDF or any other format supported by FineReader.
• Integration with Windows Explorer.
Professional features
• Shared group mode for the use of user languages, user dictionaries, and user dictionaries for predefined languages (FineReader Corporate Edition only).
• Full text and individual searches for words in any form can be carried out in any document (Edit>Advanced Search). Available in FineReader Corporate Edition only.
Chapter 4 Acquiring the Image
Recognition quality depends greatly on the quality of the source image. In this chapter you will learn how to scan documents correctly, how to open and read saved images (see the list of supported image formats under “Supported Image Formats” in the ABBYY FineReader Help section), and how to process images and improve recognition quality (by eliminating scanning “dust”), etc.
Scanning
FineReader “talks” with scanners via the TWAIN interface. This is a universal standard adopted in 1992 to unify the interaction of computer image inputting devices (such as scanners) and external applications.
To start scanning:
Click the 1 Scan button or select the Scan item in the File menu. The Image window containing a “photograph” of the scanned page will appear in FineReader’s Main window. If you wish to scan several pages, click the arrow to the right of the 1 Scan button and select the Scan Multiple Images item.
If scanning does not start right away, one of the following two dialogs will open:
• The scanner’s TWAIN Source dialog.
adverse effect on recognition quality in the case of documents of medium to low print quality.
• Scan mode – color. If you scan color documents that contain pictures, colored text, or colored backgrounds, you may wish to retain the original colors in your electronic document. Use the color scan mode in this case. Otherwise use gray scan mode.
• Brightness – a medium brightness value of around 50% should suffice for most cases.
Your image looks like this: characters are “torn” or very light
Possible remedy:
• Try lowering the brightness (this will make the image darker).
• Try scanning it in gray mode (brightness autotuning will then be used).
Non-ADF scanning:
1. If you are using the FineReader interface
• Select the Scan Multiple Images item in the File menu.
If you are using a flatbed scanner without an ADF, to increase productivity try using one of the following two methods:
• Set a pause value, i.e. the time that is to elapse between the scanning of one page and the next.
Opening images
Even if you don’t have a scanner, you can still recognize image files (see the list of supported image formats under “Supported Image Formats”).
To open an image:
• Click on the arrow to the right of the 1 Scan button and select the Open Image item in the local menu. The appearance of the 1 Scan button icon will change – the Scan caption will be replaced with the Open caption.
• Select the Open image item in the File menu.
Note: If a dual page has been split incorrectly, clear the Split dual pages checkbox, scan the dual page again, or re-add the respective image to the batch and try to split the image manually using the Split Image dialog (Image>Split Image).
(ABBYY FineReader 6.0 Corporate Edition only)
Adding business card images to a batch
When inputting business cards, it makes sense to input as many as you can fit onto your scanner.
1. Despeckle image
The recognized image may have a large amount of “dust” present on it, i.e. a large number of excess dots. The dots arise in the case of documents of medium to low print quality, and dots located close to character outlines may have an adverse effect on recognition quality.
To decrease the number of dots:
• Select the Despeckle image item in the Image menu.
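For readers who want to see what removing “dust” amounts to, the following Python sketch deletes isolated dots from a black-and-white image. It is an illustration only and not FineReader’s actual despeckling method: the `despeckle` function, the list-of-rows image representation, and the rule that a dot counts as dust when none of its eight neighbours is black are all assumptions made for this example.

```python
def despeckle(image):
    """Return a copy of a binary image with isolated black pixels removed.

    `image` is a list of rows; 1 is a black (ink) pixel, 0 is white.
    Assumption for this sketch: a black pixel is "dust" when none of its
    8 neighbours is black.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            # Look at the surrounding 3x3 window, clipped to the image edges.
            has_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                cleaned[y][x] = 0
    return cleaned
```

Pixels that touch other black pixels, such as parts of a character outline, are left alone; only lone dots are erased. A real despeckling filter would usually also remove small clusters, which this sketch does not attempt.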
To flip the image:
• horizontally (around the vertical axis) – select the Flip Horizontal item in the Image menu,
• vertically (around the horizontal axis) – select the Flip Vertical item in the Image menu.
4. Clear block
If you do not wish a certain image area to be recognized or if you have large areas of dust present on the image, you can simply erase them.
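The flip and clear operations described above are simple to picture on a small grid of pixels. The Python sketch below is only an illustration: the function names `flip_horizontal`, `flip_vertical` and `clear_block`, the list-of-rows image format, and the rectangle coordinates are assumptions of this example, not part of FineReader.

```python
def flip_horizontal(image):
    """Mirror the image around its vertical axis (left and right swap)."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror the image around its horizontal axis (top and bottom swap)."""
    return [row[:] for row in image[::-1]]


def clear_block(image, left, top, right, bottom, background=0):
    """Return a copy with the rectangle erased.

    The rectangle covers columns left..right-1 and rows top..bottom-1;
    parts of it that fall outside the image are ignored.
    """
    cleared = [row[:] for row in image]
    for y in range(max(0, top), min(len(cleared), bottom)):
        for x in range(max(0, left), min(len(cleared[y]), right)):
            cleared[y][x] = background
    return cleared
```

All three functions return a new image and leave the original untouched, so an unwanted result can simply be discarded.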
Page numbering
Each scanned page is given a number. The number given by default is the number of the last batch page plus one. You can also set page numbers manually. You might wish to do this if, for example, you wish to retain the original page numbers or scan pages according to page number:
• Select the Ask for page number before adding page to the batch item on the Scan/Open Image tab (Tools>Options menu).
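The default numbering rule described above can be written down in a few lines. The following Python sketch is an illustration under stated assumptions: the function name `next_page_number` is invented for this example, “the last batch page” is taken to mean the page with the highest number, and rejecting a manual number that is already in use is a guess about sensible behavior rather than documented FineReader behavior.

```python
def next_page_number(batch_page_numbers, requested=None):
    """Pick the number for a page that is about to be added to a batch.

    batch_page_numbers: numbers of the pages already in the batch.
    requested: a number typed in manually, or None to use the default.
    """
    if requested is not None:
        # Assumption: a manually chosen number must not already be taken.
        if requested in batch_page_numbers:
            raise ValueError(f"page number {requested} is already in use")
        return requested
    # Default rule: number of the last batch page plus one
    # (an empty batch starts at page 1).
    return max(batch_page_numbers, default=0) + 1
```

For a batch holding pages 1, 2 and 7, the default number for the next page would be 8, while a manually requested 10 would be accepted as given.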
Chapter 5 Page Layout Analysis
FineReader must know which image areas it needs to recognize before starting the recognition process. Page layout analysis provides it with this information by identifying text blocks, picture blocks, table blocks, and barcode blocks (note: the latter are only available in the Corporate Edition).
General information on page layout analysis
Page layout analysis can be carried out both automatically and manually. In most cases, FineReader manages the complex task of page layout analysis by itself. Start automatic analysis by clicking on the 2 Read button. Recognition and layout analysis are performed simultaneously.
Note: A stand-alone page layout analysis procedure is also available (Process>Analyze Layout menu).
rators inside the block to form a table. This block is represented as a table in the output text. You can draw and edit tables manually.
Picture – this block type is used for image areas containing pictures. A block of this type may enclose an actual picture or any other object (e.g. a section of text) you wish displayed as a picture in the recognized text.
Barcode (Corporate Edition only) – this block type is used for barcode image areas.
Single column – The text is formatted into one column. Use this option if automatic page layout analysis incorrectly determines the text type as multi-column.
Plain text formatted with spaces – The text is formatted into one column and set in a monospaced font that is uniform in size throughout.
Note: Do not select One line of text per cell and/or No merged cells in table options if there are tables with differing structures in your text. Selecting these options may result in errors being made during layout analysis and have an adverse effect on recognition quality.
You may then change the block type. The drawn block type may be one of the following: Recognition Area, Text, Table, Picture, or Barcode.
To change block type:
• Right-click the block and select the Block Type item followed by the corresponding block type in the local menu.
Modifying blocks
To move the block borders:
1. Click the block border and hold down the left mouse button. The mouse pointer will become a two-headed arrow.
2.
between the two upper or lower corners, the application will cut the right block corner (upper or lower) regardless. It will also forbid certain operations if they involve moving the segments forming the block borders.
To select a block or a group of blocks:
• Select the tool and click on the desired block or press the left mouse button and draw a rectangle around all the blocks you wish to select.
Manual table layout analysis
Tip: If automatic table layout analysis has resulted in table rows and columns being drawn incorrectly, try editing the automatic analysis results instead of deleting all the blocks and drawing them manually again. Almost invariably this proves less time-consuming.
To create a block template:
1. Open an image and draw the blocks automatically or manually.
2. Select the Save Blocks item in the Image menu. The Save Blocks as dialog will open. Type a file name for the block template in the dialog.
To load a block template:
1. Click the Batch window and select the pages you wish to apply the block template to.
2. Select the Load Blocks item in the Image menu. The Open Blocks dialog will open.
3.
Chapter 6 Recognition
The aim of OCR is to read text from a source image and retain the source page layout. Before this can be done, however, the main recognition parameters (recognition language, source text print type, and document type) need to be set. This chapter deals with these parameters and other important recognition issues, including the use of different recognition settings, etc.
General information on recognition
Note: Always ensure that the following options have been correctly set before you start recognition: recognition language, source text print type, and document type.
You may:
1. Recognize a block or several blocks drawn on an image.
2. Recognize an open page or all pages selected in the Batch window.
3. Recognize all unrecognized batch pages.
4. Recognize all pages in background mode.
Note:
1. If you find that you often use a certain language combination, you can create a new language group that includes the languages you most often use.
2. Increasing the number of the recognition languages used simultaneously may have an adverse effect on recognition quality. A reasonable number of languages to use simultaneously is 2–3.
3.
Source text print type
As a rule source text print type is determined automatically. To ensure that this is the case, select Autodetect in the Print Type group (Tools>Options menu, Recognition tab).
Inverted or flipped block
If the application incorrectly recognizes a block containing inverted or flipped text (a text block, a table cell, or a whole table):
• Right-click the block concerned and select the Properties item in the local menu. The Block properties dialog will open. Select the Inverted or Flipped item in the dialog and re-recognize the image.
3. recognizing large volumes (more than a hundred pages) of texts of low print quality.
Tip: Use Train User Pattern mode only if one of the above applies. In other cases you may obtain a slight increase in recognition quality, but the time and effort involved will probably outweigh the benefit received.
Pattern training works as follows. One or two pages are recognized in training mode, and a pattern is then created.
Note:
1. To create several patterns for the same batch, use the Pattern Editor dialog (click the Pattern Editor button on the Recognition tab or select the Tools>Pattern Editor menu item). Create a new pattern (click the New button in the dialog) and select it (click the Set Active button). Working with a created pattern is no different to working with a default pattern (see steps 1–5). Keep in mind, however, that only one pattern may be active at any one time.
2.
Training to recognize a character:
The frame in the top dialog window should enclose a single character, and this character must be fully enclosed by the frame. If the frame encloses only part of a character or more than one character, click the frame borders and move them so that the above stated requirements are met. The [icon] and [icon] buttons move the frame border as well (and are useful for training italic symbols – see below).
apostrophes are treated as one character – the straight apostrophe. Thus, you will never see right and left apostrophes in recognized text, even if you attempt to train FineReader into recognizing them.
2. The way in which certain characters are recognized depends on their environment.
you can create a new language consisting only of the numbers and letters used in the codes to be applied when recognizing documents of this type.
• documents set in capitals only. Recognition quality is increased if you create a language in which all lowercase letters are prohibited.
You should create a language group if you use a particular language combination often.
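One way to see why a restricted alphabet helps, as in the code-only and capitals-only languages described above, is to imagine the recognizer choosing between several readings of the same word. The Python sketch below is purely illustrative: the `pick_candidate` function, the idea of an ordered candidate list, and the two example alphabets are assumptions made for this example and say nothing about how FineReader works internally.

```python
def pick_candidate(candidates, alphabet):
    """Return the first candidate spelled only with characters from `alphabet`.

    `candidates` is assumed to be ordered from most to least likely.
    Returns None when no candidate fits the restricted alphabet.
    """
    allowed = set(alphabet)
    for candidate in candidates:
        if candidate and all(ch in allowed for ch in candidate):
            return candidate
    return None


# Hypothetical product-code language: digits plus the letters A and B only.
CODE_ALPHABET = "AB0123456789"
# Hypothetical capitals-only language: every lowercase letter is prohibited.
CAPITALS_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "
```

With the code alphabet, a reading such as "ABl05" (with a lowercase L) is rejected because "l" is not an allowed character, so the reading "AB105" is chosen instead.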
• Built-in (the dictionary supplied with FineReader)
• User dictionary
To add words to the dictionary or to use an existing user dictionary or text file in Windows (ANSI) or Unicode encoding (the only requirement is that words be separated by spaces or other non-alphabetic characters), click the Edit Dictionary button.
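The requirement that words be separated by spaces or other non-alphabetic characters can be illustrated with a short Python sketch that splits a piece of text into candidate dictionary words. The function name `extract_words` is hypothetical, the text is assumed to have been decoded from its file encoding already, and dropping repeated words is a choice made for this example only.

```python
import re

# One or more letters: any word character that is not a digit or underscore.
_LETTERS = re.compile(r"[^\W\d_]+")


def extract_words(text):
    """Split text on non-alphabetic characters and return the unique words.

    Words are returned in order of first appearance; spaces, digits and
    punctuation all act as separators.
    """
    words = _LETTERS.findall(text)
    return list(dict.fromkeys(words))
```

For instance, the text "price-list, OCR 2002 OCR" would yield the words price, list and OCR, since the hyphen, comma, space and digits all count as separators.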
To create a recognition language group:
1. Select the Language Editor item in the Tools menu and click the New button. A dialog will open. Select the Create a new group of languages item in the dialog.
2. The Language Group Properties dialog will open. Set the following new language group parameters (all parameters are set in the Language Group Properties dialog):
1. Group name.
2. Languages contained in the group.
Note: 1.
Chapter 7 Checking and Editing Text
Once recognition is over, you will see the recognized text displayed in the Text window. The Text window is ABBYY FineReader’s built-in editor, used to check recognition results and edit any recognized text. The FineReader text editor has two distinctive features:
1. A built-in spell check system (see the list of languages with spell check support under “Supported Languages” in ABBYY FineReader Help).
2.
Checking text in ABBYY FineReader
Uncertainly recognized characters and words not found in the dictionary are highlighted in different colors. By default, light blue is used for uncertain characters and pink for words not found in the dictionary.
4. If words have been misspelt, you can do one of the following:
• Click the Ignore button to leave the word unchanged.
• Click the Ignore All button to leave all such words in the text unchanged.
Note: When you click the Ignore or Ignore All button, the “uncertain” flag is removed from the word, i.e. the system assumes that the word no longer contains any unrecognized or uncertain characters and no longer needs to be highlighted.
• Stop at words not found in dictionary
• Stop at compound words
• Ignore words with digits and other non-alphabetic characters
• Correct spaces before and after punctuation marks
Error display level
The Error display level option allows you to select the degree to which errors are highlighted:
• None – no recognition errors are highlighted.
• Standard – unrecognized and uncertainly recognized characters are highlighted.
makes sense to add new words that are likely to come up frequently (e.g. specialized terms, abbreviations, names etc.) to the user dictionary. A distinctive feature of FineReader’s spell check system is that a word is added to the dictionary not only in its original form but together with its paradigm (i.e. the set of all of its forms). As a result, FineReader is able to recognize a word in all its forms once it has been entered.
Tip:
1. FineReader allows you to import user dictionaries created by previous versions (3.0, 4.0 and 5.0).
2. FineReader also allows you to import user dictionaries (*.dic) created using Microsoft Word 6.0, 7.0, 97, and 2000.
To import a dictionary:
1. Select the View Dictionaries item in the Tools menu, select the dictionary language, and click the View button.
2. Click the Import button in the View Dictionaries dialog and select files with *.
The FineReader editor features two document viewing modes: full mode (the full layout is displayed) and draft mode. In full mode blocks with recognized text, tables and pictures are displayed exactly as they are to be found on the original image. The complete original layout, therefore, is retained: columns, tables, pictures, and dropped capitals (oversized letters that take up several lines of space in a paragraph).
Font effects
1. Click the word or highlight the text whose font is to be changed.
2. Perform one of the following actions:
• Either click the font effect button (e.g. [icon]) of your choice on the Formatting bar, or
• Right-click the Text window and select Character Properties in the local menu. The Character dialog will open.
Editing tables
The table editor provides you with tools to carry out the following:
• Merge cell or row contents
• Split cell contents
• Split row/column contents
• Delete cell contents
To merge cell or row contents:
• Hold down the CTRL key and select the cells or rows you wish to merge, then select the Merge Table Cells or Merge Table Rows item in the Edit menu.
To split cell contents:
• Select the Split Table Cells item in the Edit menu.
Chapter 8 Saving into External Applications and Formats
Recognition results can be saved to a file, sent to an external application without saving, copied to the clipboard, or sent via e-mail. All pages or selected ones only may be saved. FineReader can export recognition results to the following applications: Microsoft Word 6.0, 7.0, 97 (8.0), 2000 (9.0) and 2002 (10.0); Microsoft Excel 6.0, 7.0, 97 (8.0), 2000 (9.0) and 2002 (10.0); Corel WordPerfect 7.0, 8.0, 9.0 and 2002 (10.0); Lotus Word Pro 9.
General information on saving recognized text
You may:
• save recognized text using the Save Wizard,
• save open or selected pages to file or send them to an external application,
• save all batch pages to file or export them into an external application,
• save the page image.
Click the 4 Save button to export recognition results to the application of your choice or save them to file.
• Retain font and font size – table structure, paragraph arrangement, font, and font size are all retained.
• Remove all formatting – only table structure and paragraph arrangement are retained.
Note: Some additional options may become available depending on the export format chosen.
different JPEG values, and open it in an image viewing application. The JPEG quality value is set on the Formats>PDF (HTML) tab.
Fonts to use (when saving in RTF, DOC, or HTML format)
By default the fonts specified on the Formatting tab are used when saving in RTF, DOC, or HTML format. You can, however, change the fonts that are used.
Saving the recognized text in RTF and DOC formats
Layout retention modes are set on the Formatting tab in the Options dialog (Tools>Options menu).
Note: When you save text in RTF or DOC formats, the fonts used are those set on the Formatting tab in the Options dialog (Tools>Options menu) or those set during text editing in the Text window.
3. When you save texts that use a non-Latin code page (e.g. Cyrillic, Greek, Czech, etc.), ABBYY FineReader will save them using ParaType company fonts (www.paratype.com/shop).
4. If, during PDF export, a message appears informing you that your text contains a number of non-standard font characters, you must then select Type 1 working mode and corresponding Type 1 fonts.
HTML format is supported by all browsers (Netscape Navigator, Internet Explorer 3.0 and later).
3. Auto (saves Full and Simple formats in a single file with autoselection depending on browser type) – both formats (Simple and Full) are saved to the same file. The browser you use will determine the format that is used.
Chapter 9 Working with Batches
The batch is the main ABBYY FineReader data depository: scanned images, recognized text and other data are all kept in the batch. The majority of FineReader settings are batch settings: scanning, recognition, saving options, etc. User patterns, user languages and user language groups are also batch “property”. When you create a new batch, you may use the default batch settings, the settings of the current batch, or settings saved in an *.fbt file.
General information on working with batches
When FineReader starts for the first time, it opens the batch located in the FineReader folder. You can choose to work with this batch or create a new one. A batch may contain up to 9999 pages.
Tip: You may find it useful to save pages of a similar type (e.g. pages from the same book, written in the same language, or with a similar layout) in the same batch. By doing this you will find your work much easier.
You may select several different pages, a number of consecutive pages, or all batch pages:
• To select a number of pages in a row, hold down the SHIFT key and click the first and last page of the group you wish to select.
• To select several pages, hold down the CTRL key and click the pages of your choice.
• To select all batch pages, activate the Batch window and choose the Select All item in the Edit menu or press CTRL+A.
Note: Batches can be opened directly from Windows Explorer:
• Right-click the batch folder (represented by the icon) and select the Open with FineReader item in the local menu. FineReader will be started and the chosen batch opened.

Adding images to a batch

• Select the Open Image item in the File menu or press CTRL+O.
• Select the image(s) you wish to open in the Open Image dialog.
Note:
1. To renumber all batch pages, select the All Pages item in the Renumber Pages dialog.
2. To renumber only part of a batch:
• Select the pages you wish to renumber in the Batch window.
• Select the Selected pages item in the Renumber Pages dialog.
3. If you want selected pages to be renumbered continuously, select the Continuous page numbering option.
Full text search in recognized batch pages (FineReader Corporate Edition only)

You can search through all recognized pages for words in all of their grammatical forms. The search pattern may consist of one word or several words. The words may be in any grammatical form (for languages with dictionary support), and they may appear in the text at any distance from each other and in any order.
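The "any distance, any order" rule can be illustrated with a small sketch: a page matches if every word of the search pattern occurs somewhere on it. This is not how FineReader is implemented. The function names and the `pages` dictionary (page number mapped to recognized text) are hypothetical, and the sketch compares exact lowercase words only, so it does not model the dictionary-based matching of grammatical forms.

```python
import re


def page_matches(page_text, pattern):
    """Return True if every word of the pattern occurs somewhere on the page.

    Word order and the distance between words are ignored. Assumption: words
    are compared literally (case-insensitive); grammatical forms are not handled.
    """
    words_on_page = set(re.findall(r"\w+", page_text.lower()))
    wanted = re.findall(r"\w+", pattern.lower())
    return bool(wanted) and all(word in words_on_page for word in wanted)


def search_batch(pages, pattern):
    """Return the sorted numbers of all pages whose text matches the pattern."""
    return sorted(number for number, text in pages.items()
                  if page_matches(text, pattern))
```

With `pages = {1: "The scanner reads the printed page.", 2: "Printed text is recognized after the scanner finishes."}`, the pattern `"printed scanner"` matches both pages even though the two words appear in a different order on each.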
Chapter 10 Network Document Processing

The ABBYY FineReader Corporate Edition is specially designed for network document processing. Each computer involved in network processing must have a separate copy of FineReader installed (for more information on network installation of FineReader, see "Installation on a Network Server and on a Network Workstation"). The ABBYY FineReader Corporate Edition allows you to do the following:
1.
Work with the same batch over a network (FineReader Corporate Edition only)

1. Create/Open a batch and set up the required scanning and recognition options.
2. Run FineReader and open the relevant batch on all computers that are to process it.
3. Run background recognition (Process>Start background recognition) on all computers involved in recognizing the batch.
4. Start the scanning on a computer equipped with an ADF scanner.
Note: If your batch contains a large number of pages, recognition is faster if you use “Background mode” in combination with a multiprocessor system.

Group work with the same user languages and dictionaries (FineReader Corporate Edition only)

Create a batch and set up the required scanning and recognition options. All the user languages and dictionaries you attach will be stored in one folder.
Note:
1. Before you can use the dictionaries contained in a particular folder, you must have read/write access to that folder.
2. When a user language is used simultaneously by several users, it is available as “read only”, i.e. its existing parameters cannot be changed. However, entries can still be added to or removed from the user dictionary of this language.
Appendix
The Batch menu

To:                                               Press:
Open the next batch page                          ALT+Down
Open the previous batch page                      ALT+Up
Open page with specified number                   CTRL+G
Close the current page                            CTRL+F4
Delete the recognized text in the Text window     CTRL+SHIFT+Del
Delete all blocks in the Image window and all     CTRL+Del
recognized text in the Text window
Update page list                                  F5

The Process menu

To:                                               Press:
Scan and read an image
Open and read an image
Start Scan&Read Wizard
Analyze layout
Analyze layout on all batch pages
Read active o