ABBYY® FineReader 14 User’s Guide
Regular expressions
The table below lists the regular expressions that can be used to create a dictionary for a custom
language.
Item name                 Conventional regular   Usage examples and explanations
                          expression symbol
Any character             .                      c.t denotes "cat," "cot," etc.
Character from group      []                     [b-d]ell denotes "bell," "cell," and "dell";
                                                 [ty]ell denotes "tell" and "yell"
Character not from group  [^]                    [^y]ell denotes "dell," "cell," "tell," but forbids "yell";
                                                 [^n-s]ell denotes "bell," "cell," but forbids "nell,"
                                                 "oell," "pell," "qell," "rell," and "sell"
Or                        |                      c(a|u)t denotes "cat" and "cut"
0 or more matches         *                      10* denotes numbers 1, 10, 100, 1000, etc.
1 or more matches         +                      10+ allows numbers 10, 100, 1000, etc.
Letter or digit           [0-9a-zA-Zа-яА-Я]      [0-9a-zA-Zа-яА-Я] allows any single letter or digit;
                                                 [0-9a-zA-Zа-яА-Я]+ allows any word
Capital Latin letter      [A-Z]
Small Latin letter        [a-z]
Capital Cyrillic letter   [А-Я]
Small Cyrillic letter     [а-я]
Digit                     [0-9]
                          @                      Reserved.
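The usage examples in the table can be checked with a short script. This is only a sketch: it uses Python's re module rather than FineReader's own dictionary engine, and it assumes that the basic symbols listed above behave the same way in both and that a pattern must match the whole word. The names TABLE_EXAMPLES and check_table_examples, and the rejected sample words not listed in the table, are illustrative and do not come from FineReader.

```python
import re

# Patterns copied from the table above.
# Each entry maps a pattern to (words it should allow, words it should forbid).
# The sample words beyond those in the table are illustrative assumptions.
TABLE_EXAMPLES = {
    r"c.t": (["cat", "cot"], ["ct", "coat"]),
    r"[b-d]ell": (["bell", "cell", "dell"], ["tell"]),
    r"[ty]ell": (["tell", "yell"], ["bell"]),
    r"[^y]ell": (["dell", "cell", "tell"], ["yell"]),
    r"[^n-s]ell": (["bell", "cell"], ["nell", "pell", "sell"]),
    r"c(a|u)t": (["cat", "cut"], ["cot"]),
    r"10*": (["1", "10", "100"], ["0", "11"]),
    r"10+": (["10", "100"], ["1"]),
    r"[0-9a-zA-Zа-яА-Я]+": (["word", "abc123", "привет"], ["", "two words"]),
}


def check_table_examples(examples=TABLE_EXAMPLES):
    """Return a list of (pattern, word, expected, actual) for every mismatch."""
    mismatches = []
    for pattern, (allowed, forbidden) in examples.items():
        for word in allowed + forbidden:
            expected = word in allowed
            # fullmatch: the whole word must match (an assumption about how
            # a dictionary entry is applied, not documented behavior).
            actual = re.fullmatch(pattern, word) is not None
            if actual != expected:
                mismatches.append((pattern, word, expected, actual))
    return mismatches


if __name__ == "__main__":
    problems = check_table_examples()
    print("all table examples behave as described" if not problems else problems)
```

An empty result from check_table_examples() means every word was allowed or forbidden as the table describes.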
Note:
1. To use a regular expression symbol as a normal character, precede it with a backslash. For
example, [t-v]x+ stands for tx, txx, etc., ux, uxx, etc., and vx, vxx, etc., but \[t-v\]x+ stands for [t-v]x,
[t-v]xx, [t-v]xxx, etc.
2. To group regular expression elements, use parentheses. For example, (a|b)+|c stands for c or any
combination such as abbbaaabbb, ababab, etc. (a word of any non-zero length containing any number
of a's and b's in any order), while a|b+|c stands for a, c, b, bb, bbb, etc.
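The two notes above can be illustrated in the same way. As before, this is a sketch in Python's re module, on the assumption that escaping and grouping work there as they do in FineReader dictionaries. The helper name matches is hypothetical.

```python
import re


def matches(pattern, word):
    """True if the whole word matches the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, word) is not None


# Note 1: a backslash turns a regular expression symbol into a normal character.
UNESCAPED = r"[t-v]x+"    # t, u, or v followed by one or more x
ESCAPED = r"\[t-v\]x+"    # the literal text "[t-v]" followed by one or more x

# Note 2: parentheses group elements, which changes what + and | apply to.
GROUPED = r"(a|b)+|c"     # c, or any non-empty mix of a's and b's
UNGROUPED = r"a|b+|c"     # a, or c, or one or more b's

if __name__ == "__main__":
    for pattern, word in [
        (UNESCAPED, "uxx"),
        (UNESCAPED, "[t-v]x"),
        (ESCAPED, "[t-v]xx"),
        (ESCAPED, "tx"),
        (GROUPED, "abbbaaabbb"),
        (UNGROUPED, "ab"),
        (UNGROUPED, "bbb"),
    ]:
        print(pattern, repr(word), matches(pattern, word))
```

The difference between the grouped and ungrouped patterns shows up on a word such as ab: the grouped pattern allows it, while the ungrouped pattern allows only a single a, a single c, or a run of b's.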
Examples