HP IAP Version 2.1 User Guide, March 2011

What the IAP indexes
The following sections explain what the IAP indexes and, as a result, what you can search for in
the Web Interface.
Indexed file types
You can search the contents of email messages and the file types listed below. For a complete list of
the file types that are indexed by the IAP, see Indexed file types and MIME types on page 63.
Plain text files
Rich text files (.rtf)
Documents in TNEF format (for Microsoft Outlook and Exchange)
HTML (HyperText Markup Language) files
Files used by the following Microsoft Office programs: Word, Excel, and PowerPoint
We support indexing of these file types for Microsoft Office 2007 and prior releases of Office.
See Additional indexing detail for Microsoft Office on page 65 and Limitations for Microsoft
Office items on page 16 for more information on what is and is not indexed in Microsoft Office
documents.
Files used by the following Corel WordPerfect Office programs: WordPerfect, Quattro Pro, and
Presentations
We support indexing of these file types for WordPerfect Office X3 and prior releases of
WordPerfect Office.
PDF (Portable Document Format) files viewed with Adobe Acrobat Reader
ZIP files
Embedded messages (RFC 822 messages)
For ZIP files and embedded messages, the content inside the files is expanded and indexed.
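This guide does not describe how the IAP expands archives internally. As a rough illustration of what "expanding" a ZIP file means, the following Python sketch unpacks an archive in memory so that each contained file could be handed to an indexer. The function names expand_zip and build_sample_zip and the sample file names are hypothetical and are not part of the IAP.

```python
import io
import zipfile


def expand_zip(data):
    """Return a dict of member name -> raw bytes for every file in a ZIP archive."""
    members = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content to index
            members[info.filename] = archive.read(info)
    return members


def build_sample_zip():
    """Create a small in-memory archive to expand (illustrative data only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("notes.txt", "quarterly budget review")
        archive.writestr("docs/readme.txt", "archive policy")
    return buffer.getvalue()


expanded = expand_zip(build_sample_zip())
for name, content in sorted(expanded.items()):
    print(name, len(content), "bytes")
```

In this sketch, each entry in expanded stands for one file inside the archive whose text would then be indexed like any other attachment.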
Message MIME types
An email message can contain message parts of different MIME (Multipurpose Internet Mail Extensions)
content-types. The IAP indexes the content-types shown in Indexed file types and MIME
types on page 63. Each content-type corresponds to one of the indexed file types.
An email message that is entirely plain text, not MIME, is also indexed.
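To make the idea of message parts and content-types concrete, the following Python sketch uses the standard email package to list the content-type of each part of a message. It only illustrates the MIME structure; it is not how the IAP itself is implemented, and the function name list_part_types and the sample messages are hypothetical.

```python
from email import message_from_string
from email.message import EmailMessage


def list_part_types(raw_message):
    """Return the content-type of every non-container part in a raw message."""
    message = message_from_string(raw_message)
    return [
        part.get_content_type()
        for part in message.walk()
        if not part.is_multipart()  # skip the multipart containers themselves
    ]


# Hypothetical sample: a MIME message with plain-text and HTML alternatives.
sample = EmailMessage()
sample["From"] = "sender@example.com"
sample["Subject"] = "Status report"
sample.set_content("Plain-text body")
sample.add_alternative("<p>HTML body</p>", subtype="html")
mime_types = list_part_types(sample.as_string())

# Hypothetical sample: a message with no MIME headers at all.
plain_types = list_part_types("From: sender@example.com\nSubject: Hi\n\nHello\n")

print(mime_types)
print(plain_types)
```

A message without MIME headers is treated by the parser as a single text/plain part, which mirrors the statement above that an entirely plain-text message is also indexed.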
What the IAP does not index
The following sections explain what the IAP does not index.
Document markup and formatting
Separators (such as punctuation) between words are ignored during indexing. Invisible source-code
words, such as HTML markup tags, are also ignored in indexing.
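The effect of ignoring markup and separators can be sketched in a few lines of Python. This is a simplified illustration, not the IAP's indexing logic: the function name index_terms is hypothetical, and treating every run of letters, digits, or underscores as one word is an assumption made only for this example.

```python
import re
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collect only the text between tags; the tags themselves are discarded."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def index_terms(html_text):
    """Return the visible words of an HTML fragment, without tags or punctuation."""
    extractor = VisibleTextExtractor()
    extractor.feed(html_text)
    extractor.close()
    visible = " ".join(extractor.chunks)
    # Assumption for this sketch: anything that is not a word character
    # (punctuation, whitespace) acts as a separator and is dropped.
    return re.findall(r"\w+", visible)


terms = index_terms("<p>Hello, <b>world</b>!</p>")
print(terms)
```

Here the tags p and b and the punctuation never reach the list of terms; only the words a reader would see remain.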
Document formatting usually has no bearing on indexing; only the words you see in an email or
file are candidates for indexing. However, when a dropped cap (a large initial capital letter) is used
in a Microsoft Word document, the word with the dropped cap is indexed as two separate words. This
is because Word puts a dropped cap into a text box to set it off from the surrounding paragraph.
For example, a word such as "Once" set with a dropped-cap "O" would be indexed as "O" and "nce".