® ABBYY FineReader Version 9.0 User’s Guide © 2007 ABBYY. All rights reserved.
ABBYY FineReader 9.0 User’s Guide Information in this document is subject to change without notice and does not bear any commitment on the part of ABBYY. The software described in this document is supplied under a license agreement. The software may only be used or copied in strict accordance with the terms of the agreement.
Contents
Introducing ABBYY FineReader
  What Is ABBYY FineReader?
  What's New in ABBYY FineReader 9.0
  Checking Spelling
  User Dictionary: Adding and Removing Words
  Using Styles
Appendix
  Supported Saving Formats
  Supported Image Formats
Introducing ABBYY FineReader
This chapter provides an overview of ABBYY FineReader and its features.
Chapter Contents
● What Is ABBYY FineReader?
● What's New in ABBYY FineReader 9.0

What Is ABBYY FineReader?
ABBYY FineReader, an Optical Character Recognition (OCR) application, converts printed and PDF documents and document images into editable computer files.
What's New in ABBYY FineReader 9.0
Version 9.0 of ABBYY FineReader provides a number of major enhancements and features. Some features (as specified below) are specific to ABBYY FineReader 9.0 Corporate Edition or ABBYY FineReader 9.0 Site License Edition.
ABBYY FineReader 9.0 has been officially certified for Windows Vista devices and software. The Windows Vista Certified logo ensures the program's compatibility with the advanced features of the Windows Vista operating system.
Working with ABBYY FineReader 9.0
This chapter will teach you to use ABBYY FineReader 9.0 to get an editable electronic version of your paper or PDF documents.
Chapter Contents
● Using ABBYY FineReader 9.
Note: The author of a PDF file may choose to restrict access to it. For example, the author may create a password or restrict certain features, such as the ability to extract text and graphics. To adhere to copyright guidelines, ABBYY FineReader will ask you for a password to open such files.

Photographing Documents with a Digital Camera
ABBYY FineReader can perform OCR on images created with a digital camera.
1. Take a picture of your document.
Saving the Recognized Text
Recognized text can be saved to a file, sent to an application of your choice, copied to the Clipboard, or sent by e–mail in any of the supported saving formats. You can save either the entire document or only the selected pages.
Important! Be careful to select the appropriate saving options before clicking Save.
To save the recognized text:
1.
5. Soon, a new Microsoft Word document containing the recognized text will open automatically.
To change some program settings, such as saving options, make the necessary changes prior to running the Convert PDF/Images to Microsoft Word QuickTask.
Note: You can also create a Microsoft Word document by setting up and running each processing step manually.
1. Launch ABBYY FineReader.
2. In the Quick Tasks dialog box, select Scan to Image File.
The image creation process will begin, using the current program settings.
You may also get and save document images manually.
1. Scan your paper documents. The program will save the resulting images to the current document.
2.
Improving OCR Quality
This chapter offers practical advice on choosing the best scanning and OCR settings to maximize results on non–standard documents.
Taking Into Account Some of the Features of Your Paper Document
OCR quality greatly depends on the quality of the source image. Consider the following elements to ascertain whether you will get the scanning results you desire:
● Print Type
  Various devices may be used to produce printed documents, and the output of some (e.g. dot matrix printers and typewriters) is more difficult to recognize. To maximize results, you need to choose the correct OCR options.
Poor–quality documents are best scanned in grayscale. When scanning in grayscale, the program will select the optimal brightness value automatically. Grayscale color mode retains more information about the letters in the scanned text to achieve better OCR results when recognizing documents of medium to poor quality. You can also correct some print defects using the tools in the Edit Image dialog box.
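To illustrate why grayscale retains more information than black and white, here is a minimal sketch. It assumes that a black-and-white scan amounts to comparing each gray value with a single threshold. That is an illustrative assumption, not a description of how ABBYY FineReader or any scanner driver actually works, and the function name `binarize` and the sample values are hypothetical.

```python
def binarize(gray_row, threshold):
    """Map 0-255 gray values to 1 (ink) or 0 (paper) using one threshold."""
    return [1 if value < threshold else 0 for value in gray_row]


# Hypothetical gray values across a faint stroke: paper, light ink, dark ink, paper.
row = [250, 170, 60, 245]

print(binarize(row, 128))  # the faint pixel (170) is lost: [0, 0, 1, 0]
print(binarize(row, 200))  # a higher threshold keeps it:   [0, 1, 1, 0]
```

The grayscale row still contains the value 170, so the decision about that pixel can be revisited. Once the row has been reduced to black and white, that information is gone.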
Selecting a Scanning Interface
ABBYY FineReader can communicate with a scanner in two ways:
● via the ABBYY FineReader interface
  In this case, select scanning options (including resolution, brightness, and color mode) from the ABBYY FineReader dialog box.
characters are "torn" or very light
● Lower the brightness to make the image darker.
● Scan in grayscale. Brightness will be tuned automatically.
characters are distorted, glued together, or filled
● Increase the brightness to make the image brighter.
● Scan in grayscale. Brightness will be tuned automatically.

Adjusting Image Resolution
Image resolution shows the fineness of detail that can be distinguished in an image and is measured in dots per inch (dpi).
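As a worked example of what a dpi value means in practice, the sketch below converts a physical length into a pixel count. The A4 sheet size (210 x 297 mm) and the resolutions in the loop are illustrative values chosen here, not recommendations taken from this guide, and `pixels` is a hypothetical helper name.

```python
MM_PER_INCH = 25.4


def pixels(length_mm, dpi):
    """Number of pixels needed to cover a length at a given resolution."""
    return round(length_mm / MM_PER_INCH * dpi)


# An A4 sheet measures 210 x 297 mm.
for dpi in (150, 300, 600):
    print(dpi, pixels(210, dpi), pixels(297, dpi))
```

Doubling the resolution doubles the pixel count along each side, so the total number of pixels in the image grows roughly fourfold.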
Note: Straightening text lines may take some time.

Editing Images
If your scanned document is noisy or has distorted lines or inverted colors, you can correct these defects manually.
To edit an image:
1. Select Page>Edit Page Image…
2. In the Edit Image dialog box, use the image editing tools to:
  ● deskew and straighten lines
  ● rotate the image
  ● split the image
  ● crop the image
  ● invert the image
  ● change the image resolution
  ● erase a part of the image
3.
Positioning the Camera
If possible, use a tripod. Position the lens parallel to the plane of the document and point it toward the center of the text. At full optical zoom, the distance between the camera and the document must be sufficient to fit the entire document into the frame. Usually this distance will be 50–60 cm.

Flash
Whenever possible, turn off the flash to avoid glare and sharp shadows on the page.
Auto focus may not work properly in poor lighting or when photographing at a close distance. In poor lighting conditions, try using an additional light source. When photographing a document up close, try using the Macro (or Close–Up) mode. Otherwise, if possible, focus the camera manually. If only a part of the picture is blurred, try reducing the aperture value. Increase the distance between the document and the camera and use maximum zoom.
lines and tables with color cells).
  Note: Compared to the Fast mode, the Thorough mode takes more time but ensures better recognition quality.
  ● Fast reading
    This mode is recommended for processing large documents with simple layouts and good quality images.
  Select the mode that best suits your needs.
● Table processing
  Select how tables should be handled.
  ● Only find tables with explicit separators
    Select this option to recognize only tables with explicit separators.
Complex Structure of Paper Document Not Reproduced in Electronic Document
Before ABBYY FineReader performs OCR on your document, it detects areas containing text, pictures, tables, and barcodes. The program then relies on this analysis to determine the areas and order of recognition. This information is also used to recreate the original formatting. When new pages are added to an ABBYY FineReader document, the program automatically analyzes their formatting.
● Options dialog box
  To mark each line of text as a separate table cell:
  1. Select Tools>Options… and click the 2. Read tab.
  2. Under Table processing, select One line of text per cell in table.
  3. Re–launch the OCR process.
Note: You may also need to adjust the results of automatic table analysis if your table contains cells with vertical text in them.

Picture Not Detected
Picture areas mark the pictures contained in your document.
Industrial 2 of 5
UCC–128
UPC–A
UPC–E
PDF417

Vertical or Inverted Text Not Recognized Properly
A fragment of recognized text may contain a large number of errors if the orientation of the fragment was detected incorrectly or if the text in the fragment is inverted (i.e. light text is printed on a dark background).
To solve this problem:
1. In the Image window, select the area or the table cell that contains vertical or inverted text.
2.
Adjusting area borders
1. Click the area border and hold down the left mouse button. The mouse pointer will become a two–headed arrow.
2. Drag the pointer in the desired direction.
3. Release the mouse button.
Note: If you click an area corner, you can move both horizontal and vertical borders of the area simultaneously.

Adding/removing area parts
1. Select the / tool.
2. Place the mouse pointer inside the area and draw a rectangle.
● User Dictionary: Adding and Removing Words
● Using Styles
● Editing Hyperlinks
● Editing Tables
● Editing Headers, Footers, and Footnotes

Checking the Text in the Text Window
You can check and edit the recognized text in the Text window. The text formatting tools and saving options are located on the toolbar at the top of the Text window.
To add a word to the dictionary while checking the spelling:
1. In the Check Spelling dialog box, click the Add… button.
2. In the Primary form dialog box enter the following information:
  ● Part of speech (Noun, Adjective, Verb, Uninflected)
  ● If the word is always capitalized, select the Sentence case item
  ● The primary form of the word
3. Click OK. The Create Paradigm dialog box will open.
2. Click the button on the toolbar at the top of the Text window.
3. In the Edit Hyperlink dialog box, make the necessary changes in the Text to display field.
4. In the same dialog box, specify the type of address in the Link to group:
  ● Select Web page to link to an Internet page. In the Address field, specify the protocol and the URL of the page (e.g. http://www.abbyy.com)
  ● Select Local file to link to a file.
Saving: General
Once you have performed OCR on a document, you can save the results to disk or send them to an application of your choice. The corresponding commands can be found on the File menu:
● File>Save FineReader Document>
  Saves the current ABBYY FineReader document on your hard disk to allow later modification. Both the recognized text and the page images are saved.
● File>Save As>
  Saves the recognized text on your hard disk in a format of your choice.
Default paper size
You can select the paper size to be used for saving in RTF, DOC, WordML or DOCX format from the Default paper size drop–down list.
Tip: To ensure the recognized text fits the paper size, select the Increase paper size if content does not fit option. ABBYY FineReader will automatically select the most suitable paper size when saving.
● Convert numeric values to numbers
  Converts numbers into the "Numbers" format in the XLS file. Microsoft Excel may perform arithmetical operations on cells of this format.
● Keep headers and footers
  Preserves headers and footers in the output document.

Saving in PDF
To save your text in PDF:
1. On the toolbar at the top of the Text window, select PDF Document (*.pdf) from the drop–down list next to the Save button.
● Use standard fonts
  If this option is selected, the PDF file refers to the standard Acrobat fonts: Times New Roman, Arial, and Courier New.
● Use system fonts
  If this option is selected, the PDF file refers to the standard fonts installed on your computer.

Security
You can use passwords to protect your PDF document against unauthorized opening, printing, or editing:
● Click the PDF Security Settings… button and, in the dialog box, select the desired security settings.
● Exact copy
  Produces a document that maintains the formatting of the original. This option is recommended for documents with complex layouts, such as promotion booklets. Note, however, that this option limits the ability to change the text and formatting of the output document.
● Formatted text
  Retains fonts, font sizes, and paragraphs, but does not retain the exact locations of the objects on the page or the spacing. The resulting text will be left–aligned.
● Select Custom… to specify picture settings manually. In the Custom Picture Settings dialog box, select the desired settings and click OK.
Important! When saving results in PPT, ABBYY FineReader creates special HTML files that contain the different parts of the presentation. To save the presentation as a single file, re–save it using PowerPoint (select Save As from the File menu and specify PPT as the saving format).

Saving in TXT
To save your text in TXT:
1.
● Append to end of existing file
  Appends the text to the end of an existing CSV file.
● Insert page break character (#12) as page separator
  Saves the original page arrangement.
● Field separator
  Selects the character that will separate the data columns in the CSV file.

Character encoding
ABBYY FineReader detects the code page automatically.
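The CSV options described above (appending to an existing file, the #12 page separator, the field separator, and the code page) can be pictured with the following minimal sketch. It uses Python's standard `csv` module and is only an illustration of the concepts; it does not reproduce ABBYY FineReader's actual output. The helper names `pages_to_csv` and `append_to_file`, the sample rows, and the `cp1251` default are all hypothetical choices made for this example.

```python
import csv
import io

PAGE_BREAK = "\x0c"  # character #12 (form feed), used here as the page separator


def pages_to_csv(pages, field_separator=","):
    """Serialize pages (lists of rows) to CSV text, separating pages with #12."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=field_separator, lineterminator="\n")
    for index, rows in enumerate(pages):
        if index > 0:
            buffer.write(PAGE_BREAK)  # mark where the next page starts
        writer.writerows(rows)
    return buffer.getvalue()


def append_to_file(path, text, code_page="cp1251"):
    """Append text to the end of a file, encoded in the given code page."""
    with open(path, "a", encoding=code_page, newline="") as handle:
        handle.write(text)


pages = [[["Item", "Price"], ["Pen", "1.50"]], [["Total", "1.50"]]]
print(repr(pages_to_csv(pages, field_separator=";")))
```

A reader of such a file can split the text on the form feed character to recover the original page arrangement, then parse each page with the same field separator.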
ABBYY FineReader supports the following compression methods:
● ZIP is a compression method suitable for images with large areas of the same color (e.g. screenshots). ZIP is a lossless method, i.e. it does not affect the quality of resulting images.
● JPEG is a compression method that is usually used for grayscale and color images, such as photographs. JPEG is a lossy compression method which can greatly reduce the size of an image file.
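The claim that a lossless method works best on large areas of the same color can be checked with a small sketch. It uses Python's `zlib` module, which implements DEFLATE, the algorithm commonly used in ZIP archives; whether ABBYY FineReader uses exactly this implementation is an assumption. The byte strings stand in for image data and are made up for the example.

```python
import random
import zlib


def deflate_size(data):
    """Size in bytes of the data after DEFLATE compression."""
    return len(zlib.compress(data))


# 10,000 identical bytes: a stand-in for a large area of one color.
flat = bytes([255]) * 10_000

# 10,000 pseudo-random bytes: a stand-in for a noisy, photograph-like area.
rng = random.Random(0)
noisy = bytes(rng.getrandbits(8) for _ in range(10_000))

# Lossless: decompressing returns exactly the original bytes.
assert zlib.decompress(zlib.compress(flat)) == flat
assert zlib.decompress(zlib.compress(noisy)) == noisy

print("flat: ", len(flat), "->", deflate_size(flat))
print("noisy:", len(noisy), "->", deflate_size(noisy))
```

The uniform data shrinks to a tiny fraction of its size, while the noisy data barely shrinks at all. This is why a lossy method such as JPEG is usually preferred for photographs.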
Advanced Features
Chapter Contents
● Customizing the Workspace
● Using Area Templates
● User Languages and Language Groups
● ABBYY FineReader Document
● Recognition with Training
● ABBYY FineReader Automated Tasks
● Group Work in a LAN
● ABBYY Hot Folder & Scheduling

Customizing the Workspace
You can customize the ABBYY FineReader workspace to suit your needs.
● To customize the Document, Image, Text, and Zoom windows, click the button on the toolbar at the bottom of the Document window and, in the Options dialog box, click the View tab.
● To make the Quick Access Bar visible, select View>Toolbars and then select Quick Access Bar.

Document window
● To switch between Thumbnails and Details views in the Document window, click window and select the desired view from the menu.
To create an area template:
1. Open an image and either let the program analyze the layout automatically or draw the desired areas manually.
2. From the Areas menu, select the Save Area Template… command. In the saving dialog box, provide a name for your template and click Save.
Important! To be able to use an area template, you must scan all the documents in the set using the same resolution value.
Applying an area template:
1.
abc    abc, Abc, ABC
Abc    abc, Abc, ABC
ABC    abc, Abc, ABC
aBc    aBc, abc, Abc, ABC
● Regular expression
  You can use a regular expression to create a new language.
● Advanced…
  Opens the Advanced Language Properties dialog box, where you can specify more advanced properties for your language:
  ● non–letter characters that may occur at the beginning or at the end of words
  ● standalone non–letter characters (punctuation marks, etc.
ABBYY FineReader Document: General
At launch, ABBYY FineReader creates a new document automatically. You can either continue working with this document or open another document.
All the pages of a document are displayed in the Document window. To view a page, click its thumbnail in the Document window or double–click its number. The image of the page will be displayed in the Image window and the recognized text will be displayed in the Text window.
● Save the current document options
  To save the current document options to a file:
  1. Select Tools>Options… and click the Advanced tab.
  2. Click the Save Options… button.
     Note: To restore the default options, click Reset to Defaults.
  3. In the Save Options dialog box, type in a name for your file and specify a storage location.
  The following document options will be saved:
  ● the options selected on the Document, 1. Scan/Open, 2. Read, 3.
7. On the toolbar at the top of the Image window, click Read. Now if ABBYY FineReader encounters an unknown character, a Pattern Training dialog box will display the unknown character.
8. Teach new characters and ligatures. A ligature is a combination of two or three "glued" characters (for example, fi, fl, ffi, etc.). These characters are difficult to separate because they are "glued" during printing.
Editing a User Pattern
You may wish to edit your newly created pattern before launching the OCR process. An incorrectly trained pattern may adversely affect OCR quality. A pattern should contain only whole characters or ligatures. Characters with cut edges and characters with incorrect letter correspondences should be removed from the pattern.
1. From the Tools menu, select Pattern Editor….
2.
● The tasks that are shipped with ABBYY FineReader are marked with a distinct icon. You cannot delete or modify these tasks. However, you can copy a task and then modify it.
● The custom tasks created by the user are marked with a distinct icon. To rename a custom task, right–click the task and select Rename… from the shortcut menu.
● The tasks that, for some reason, cannot be run on your computer are marked with a distinct icon.
4. Click Change… to change the properties of the step. Click Delete to delete a step from your automated task.
   The choice of available steps depends on which steps have been selected earlier. Therefore, not every step can be deleted on its own. For example, if you add a Read Document step to your automated task, you will not be able to delete the Analyze Layout step. However, you can use the << Back button to roll back the automated task.
5.
2. Analyzing the layout
   This is an optional step where you may specify which area templates should be used.
   ● Load Areas Template
     Provides the path to the area template file to be used.
   ● Analyze Layout
     Once ABBYY FineReader has acquired the images, it will analyze them and draw the necessary areas. To draw the areas manually, select the Draw areas manually option.
3. OCR
   At this step, ABBYY FineReader performs OCR on the images.
● A separate copy of ABBYY FineReader 9.0 should be installed on each computer.
● All the users must have full access to the ABBYY FineReader document.
● Each user may add pages to the document and modify them. If a user adds new pages and launches the OCR process for them, the program will process the entire document anew.
ABBYY FineReader includes ABBYY Hot Folder & Scheduling, a scheduling agent which allows you to select a folder with images and set the time for ABBYY FineReader to process the images contained in the folder. For example, you can schedule your computer to recognize images overnight. To process images in a folder automatically, create a processing task for that folder and specify the image opening, OCR, and saving options.
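The idea of a hot folder, a folder whose image files are picked up for processing at a scheduled time, can be pictured with the short sketch below. It is not ABBYY's implementation and calls no ABBYY API. The function name `pending_images` is hypothetical, and the extension set is limited to a few extensions that appear in this guide's format list (tif, tiff, pdf, gif) purely for illustration.

```python
from pathlib import Path

# Illustrative subset of image extensions; not the program's full list.
IMAGE_EXTENSIONS = {".tif", ".tiff", ".pdf", ".gif"}


def pending_images(hot_folder):
    """Return the image files found directly inside the folder, sorted by path."""
    folder = Path(hot_folder)
    return sorted(
        path
        for path in folder.iterdir()
        if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
    )


if __name__ == "__main__":
    # List candidate images in the current directory.
    for image in pending_images("."):
        print(image.name)
```

A scheduler would call such a function at the configured time and hand each file to the opening, OCR, and saving steps defined for the task.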
Note: By default, task files are stored in %Userprofile%\Local Settings\Application Data\ABBYY\HotFolder\9.00. (In Microsoft Windows Vista: %Userprofile%\AppData\Local\ABBYY\HotFolder\9.00).
The ABBYY Hot Folder & Scheduling main window displays a list of set–up tasks. For each task, the full path to the corresponding hot folder is displayed, together with its current status and the scheduled processing time.
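The two default task locations in the note above differ only in the part of the path that follows %Userprofile%. The sketch below builds both paths from a given profile folder. The function name `hot_folder_task_dir`, the `vista` flag, and the sample profile folder are hypothetical; only the path components come from the note.

```python
from pathlib import PureWindowsPath


def hot_folder_task_dir(user_profile, vista=False):
    """Build the default task folder path for a given user profile folder."""
    profile = PureWindowsPath(user_profile)
    if vista:
        local = profile / "AppData" / "Local"
    else:
        local = profile / "Local Settings" / "Application Data"
    return local / "ABBYY" / "HotFolder" / "9.00"


# Sample profile folder, made up for the example.
print(hot_folder_task_dir(r"C:\Users\anna", vista=True))
```

`PureWindowsPath` is used so that the sketch produces Windows-style paths on any operating system without touching the file system.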
  ● Thorough (in this mode, ABBYY FineReader will read even poor quality images) or
  ● Fast (recommended only for good quality images)
● Under Hyperlinks, select the Highlight hyperlinks option to highlight detected hyperlinks in the recognized text and then select a color in the Color field.
3. Click Next. In the Hot Folder – Step 3 of 3: Save Document dialog box, specify where the recognized text should be saved and in which format.
Appendix
Chapter Contents
● Supported Saving Formats
● Supported Image Formats
● Regular Expressions
● Glossary
● Keyboard Shortcuts

Supported Saving Formats
ABBYY FineReader saves recognized texts in the following formats:
● Microsoft Word Document (*.DOC)
● Microsoft Office Word 2007 Document (*.DOCX)
● Rich Text Format (*.RTF)
● Microsoft Office WordML Document (*.XML)
● Adobe Acrobat Document (*.PDF)
● HTML Document (*.HTM)
● Microsoft PowerPoint Presentation (*.
TIFF, Gray, ZIP compression      tif, tiff    +    +
TIFF, Gray, LZW compression      tif, tiff    +    +
TIFF, Color, Unpacked            tif, tiff    +    +
TIFF, Color, Packbits            tif, tiff    +    +
TIFF, Color, JPEG compression    tif, tiff    +    +
TIFF, Color, ZIP compression     tif, tiff    +    +
TIFF, Color, LZW compression     tif, tiff    +    +
PDF pdf PDF v. 1.6 or earlier pdf + + + +
GIF                              gif          +    –
XPS (Microsoft .NET Framework 3.
Capital Cyrillic letter    [А–Я]
Small Cyrillic letter      [а–я]
Digit                      [0–9]
Space                      \s
@                          Reserved.
Note:
1. To use a regular expression symbol as a normal character, precede it with a backslash. For example, [t–v]x+ stands for tx, txx, txxx, etc., ux, uxx, etc., but \[t–v\]x+ stands for [t–v]x, [t–v]xx, [t–v]xxx, etc.
2. To group regular expression elements, use parentheses. For example, (a|b)+|c stands for c or any combination such as abbbaaabbb, ababab, etc.
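The two notes above (escaping with a backslash and grouping) can be tried out with Python's standard `re` module, as in the sketch below. This is an analogy only: Python's regular expression dialect is not the same as the one ABBYY FineReader uses, the character ranges are written with a plain ASCII hyphen as Python requires, and the helper name `matches` is hypothetical.

```python
import re

# [t-v]x+ : one of the letters t, u, v followed by one or more x
unescaped = re.compile(r"[t-v]x+")

# \[t-v\]x+ : the literal text "[t-v]" followed by one or more x
escaped = re.compile(r"\[t-v\]x+")

# (a|b)+|c : either a non-empty run of a and b characters, or a single c
grouped = re.compile(r"(a|b)+|c")


def matches(pattern, text):
    """True if the whole text matches the compiled pattern."""
    return pattern.fullmatch(text) is not None


for text in ("tx", "uxx", "[t-v]x"):
    print(text, matches(unescaped, text), matches(escaped, text))

for text in ("abbbaaabbb", "c", "ac"):
    print(text, matches(grouped, text))
```

`fullmatch` is used so that a string counts as a match only when the pattern describes it in full, which mirrors how the examples in the notes are phrased.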
code page
  A table that sets the interrelation between the character codes and the characters themselves. Users can select the characters they need from the set available in the code page.
color mode
  A scanning parameter that determines whether an image must be scanned in black and white, grayscale, or color.
tagged PDF
  A PDF document which contains information about the document structure such as its logical parts, pictures, tables, etc. This structure is encoded in PDF tags. A PDF file equipped with the tags may be reflowed to fit different screen sizes and will display well on handheld devices.
text area
  An area that contains text. Note that text areas should only contain single–column text.
How to Buy an ABBYY Product
You can buy ABBYY products from our online store or from our partners (see http://www.abbyy.com for the list of ABBYY partners).
For detailed information about ABBYY products, please
● visit our Web site at http://www.abbyy.com
● call us at +7 495 783 37 00 or send us a fax at +7 495 783 26 63
● write to us at sales@abbyy.com.
Additional fonts for various languages can be purchased from www.paratype.com/shop/.
Support e–mail: support@abbyy.ru
Web: http://www.abbyy.ru
http://www.abbyy.
Technical Support
If you have any questions regarding the use of ABBYY FineReader, please consult all the documentation you have (the User's Guide and Help) before contacting our technical support service. You may also wish to browse the technical support section on the ABBYY Web site at www.abbyy.com/support, where you may find the answer to your question.