User Guide

Acrobat SDK User’s Guide

Searching and Indexing

When creating a replacement search plug-in for Acrobat, you must decide what indexes your search plug-in will use. You can either create your own indexes (see “Extracting and Highlighting Text”) or search the Lextek indexes created by the Acrobat 7 Catalog plug-in.

Indexing PDF Documents

You can use the Acrobat SDK to create a full-text index of a set of PDF documents. A full-text index is a searchable database of all the text in the documents. After building an index, you can search the entire set of documents quickly.
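
To make the idea of a full-text index concrete, the following is a minimal sketch of an inverted index in plain JavaScript. It is not how Catalog stores its indexes; the function names `buildIndex` and `search` and the sample file names are assumptions made for illustration only.

```javascript
// Minimal inverted index: maps each lowercased word to the set of document
// names that contain it. The document names and texts are made-up sample data.
function buildIndex(docs) {
  var index = {};
  Object.keys(docs).forEach(function (name) {
    // Split the text into lowercase alphanumeric words.
    var words = docs[name].toLowerCase().match(/[a-z0-9]+/g) || [];
    words.forEach(function (word) {
      if (!Object.prototype.hasOwnProperty.call(index, word)) {
        index[word] = {};
      }
      index[word][name] = true;
    });
  });
  return index;
}

// Return the sorted names of the documents that contain the given word.
function search(index, word) {
  var key = word.toLowerCase();
  if (!Object.prototype.hasOwnProperty.call(index, key)) {
    return [];
  }
  return Object.keys(index[key]).sort();
}

// Hypothetical sample data standing in for text extracted from two PDF files.
var sampleIndex = buildIndex({
  "guide.pdf": "Indexing PDF documents with Catalog",
  "notes.pdf": "Catalog builds a full-text index"
});
```

Because the word-to-document map is built once, each later lookup is a single property access rather than a scan of every document, which is why searching an index is fast.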

You can build and manipulate indexes from a plug-in, from Acrobat JavaScript, or from an external application using interapplication communication (IAC) calls (DDE or Apple events).

Extracting and Highlighting Text

For indexing PDF files, Acrobat provides text extraction APIs. Text extraction also supplies position information that can be used to highlight search hits in the original PDF file. The text extraction tools are provided as calls in the plug-in API on the Acrobat platforms (Mac OS and Windows).

You can extract ASCII text from a PDF file using a plug-in or using Acrobat JavaScript. You can also save the PDF document as text or rich text.
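
As a rough sketch of extraction from Acrobat JavaScript, the function below walks the words of a document and records the position data for each match. It assumes that `doc` is an Acrobat JavaScript Doc object exposing `numPages`, `getPageNumWords`, `getPageNthWord`, and `getPageNthWordQuads`; the function name `findHits` and the shape of the returned records are hypothetical.

```javascript
// Collect every occurrence of a search term in a document, together with the
// quads (corner coordinates) that a highlighter would need.
// Assumption: `doc` behaves like an Acrobat JavaScript Doc object.
function findHits(doc, term) {
  var hits = [];
  var wanted = term.toLowerCase();
  for (var page = 0; page < doc.numPages; page++) {
    var count = doc.getPageNumWords(page);
    for (var i = 0; i < count; i++) {
      // Passing true as the third argument asks for the word with
      // punctuation and whitespace stripped.
      var word = doc.getPageNthWord(page, i, true);
      if (word.toLowerCase() === wanted) {
        hits.push({
          page: page,
          wordIndex: i,
          quads: doc.getPageNthWordQuads(page, i)
        });
      }
    }
  }
  return hits;
}
```

In a document-level script the call might look like `findHits(this, "index")`. The function takes the document as a parameter, so it can also be exercised outside Acrobat with a stand-in object.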

Indexing and Acrobat JavaScript

You can extend and customize indexes for multiple PDF documents using the Acrobat JavaScript Catalog, CatalogJob, and Index objects. These objects can be used to build, retrieve, or remove indexes.

The Index object represents a Catalog-generated index and contains a build method that is used to create an index.
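
The following sketch shows how these objects might fit together. It assumes that the Catalog object offers a `getIndex` method that returns an Index object for a given index path, and that the `build` method of the Index object returns a CatalogJob object; check both against the Acrobat JavaScript reference before relying on them. The function name `queueIndexBuild` and the example path are hypothetical.

```javascript
// Queue a build for a Catalog index and return the resulting job.
// Assumptions: `catalogObj` behaves like the Acrobat JavaScript catalog
// object, and `indexPath` is a path to a Catalog index (.pdx) file.
function queueIndexBuild(catalogObj, indexPath) {
  var idx = catalogObj.getIndex(indexPath); // Index object for the given path
  if (!idx) {
    return null; // No Index object was returned for this path.
  }
  return idx.build(); // Job object describing the queued build
}
```

Inside Acrobat the call might look like `queueIndexBuild(catalog, "/c/mydocuments/index.pdx")`, where the path is only an example. Such calls may be restricted to privileged contexts such as the console or a batch sequence, depending on the Acrobat version and security settings.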

For more information, see the Acrobat JavaScript Scripting Guide.

The Acrobat Catalog Plug-in

Acrobat Catalog is a plug-in that allows you to create a full-text index of a set of PDF documents. The Catalog plug-in exports an HFT (host function table) consisting of several methods that plug-in developers can import and use. In addition, Catalog supports DDE and broadcasts several Windows messages.

For more information, see the Acrobat and PDF Library API Overview.