HP Integrated Archive Platform Version 2.1 User Guide
Includes information about using the Integrated Archive Platform (IAP) Web Interface (Content Search and Retrieve interface). For additional user information on Email Archiving software for Microsoft Exchange and IBM Lotus Domino, see the HP Email Archiving software for Microsoft Exchange User Guide and HP Email Archiving software for IBM Lotus Domino User Guide contained in those products.
Legal and notice information © Copyright 2004–2011 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
Contents
About this guide, page 9
Intended audience, page 9
Prerequisites, page 9
Related documentation
Accessing saved results
Sending search results
Exporting search results
Using quarantine repositories
A Indexed file types and MIME types, page 63
B Additional indexing detail for Microsoft Office, page 65
Microsoft Office supported features, page 65
Microsoft Office supported file properties, page 66
Index
Figures
1 IAP Web Interface toolbar, page 18
2 Simple Search page, page 21
3 Advanced Search page (email content type), page 24
4 Advanced Search page (document content type), page 27
5 Query Results page (email content type)
Tables
1 Document conventions, page 9
2 Applications for users, page 13
3 Toolbar buttons, IAP Web Interface, page 18
4 IAP Web Interface tasks
About this guide This guide provides information about using the IAP Web Interface to search for and retrieve archived documents. For additional information on using Email Archiving software for Microsoft Exchange and IBM Domino, see the HP Email Archiving software for Microsoft Exchange User Guide and HP Email Archiving software for IBM Domino User Guide contained in those products. Intended audience This guide is intended for users of the IAP Web Interface.
Convention | Element
Monospace text | • File and directory names • System output • Code • Commands, their arguments, and argument values
Monospace, italic text | • Code variables • Command variables
Monospace, bold text | Emphasized monospace text
CAUTION: Indicates that failure to follow directions could result in damage to equipment or data.
IMPORTANT: Provides clarifying information or specific instructions.
NOTE: Provides additional information.
TIP: Provides helpful hints and shortcuts.
To find more information about access levels, go to http://support.openview.hp.com/new_access_levels.jsp.
Subscription service
HP strongly recommends that customers register online using the Subscriber's choice Web site: http://www.hp.com/go/e-updates. Subscribing to this service provides you with email updates on the latest product enhancements, newest driver versions, and firmware documentation updates as well as instant access to numerous other product resources.
1 IAP overview This section describes the HP Integrated Archive Platform (IAP) and the IAP Web Interface, which you will use to search for email or files archived on the system.
Application: IAP Web Interface
Tasks:
• Use your Web browser to search for and view archived email and files on the IAP.
• Send email to your email account or download files to a local computer or network folder.
• Compliance officers: Place a legal hold on documents. View the compliance system log (AuditLog).
• (Customer option) Export email and files from the IAP when the appropriate export utility is installed on your system.
The IAP Web Interface is available to all users.
What the IAP indexes
The following sections explain what is indexed by the IAP and can be searched for in the Web Interface.
Indexed file types
You can search the contents of email messages and the file types listed below. For a complete list of the file types that are indexed by the IAP, see “Indexed file types and MIME types” on page 63.
• Plain text files
• Rich text files (.
In the example below, XYZcorp would be indexed as “X” and “yzcorp.” Non-indexed file types The following file types are not indexed: • Graphic files • Music files • Video files Depending on the way your IAP system is configured, these types of files might be archived, but they are not indexed. You can search for them only by using external identifying information, such as the file name or file extension.
2 Introducing the IAP Web Interface
Use the Web-based search tool to search for email and documents archived in the HP Integrated Archive Platform (IAP) system. This section explains the basics of the IAP Web Interface.
• Logging in and out, page 17
• Understanding the user interface, page 17
• Common tasks, page 19
• Changing your password, page 20
• Changing your language, page 20
Logging in and out
Before logging in for the first time, see your system administrator for the URL to use.
Using the toolbar Each page of the IAP Web Interface has a toolbar at the top. Figure 1 IAP Web Interface toolbar . The following table describes each button: Table 3 Toolbar buttons, IAP Web Interface Button New Search Query Manager Description Click to display the Simple Search page, where you can submit a query. See “Completing simple searches” on page 21. To display the Advanced Search page, point to this button and click Advanced search from the menu. See “Completing advanced searches” on page 22.
Searching indexed contents The Search for field is checked for a match against the indexed contents of documents. • For email messages, the Search for field searches for words in the message body, but not in other message fields such as From or To. The Search for field can also be used to search indexed message attachments; for example, the text in a Microsoft Word file. • For files, the Search for field applies only to indexed document files.
Changing your password NOTE: This option does not appear if you cannot change your password from the Web Interface. Depending on how your system is configured, your password is usually the same as your Windows or your Lotus Notes password. Users who have a local account created in the IAP, such as system administrators or compliance officers, have the option of managing their password through the Web Interface. To change your password for accessing the IAP Web Interface: 1.
3 Searching for email or files This section explains how to conduct simple searches or more advanced searches for documents that are archived in the IAP. • Completing simple searches, page 21 • Completing advanced searches, page 22 • Using other search page options, page 28 Completing simple searches Use the Simple Search page to search for email or files using a pre-set time frame and words entered in the Search for field.
2. Search using the following fields on the Simple Search page: • Content Type: • Use email to search for email messages. • Use document to search for files that were migrated using the HP File Archiving software (formerly known as FMA). Using “document” does not search for attachments in email sent by Microsoft Outlook or Lotus Notes. Use “email” to search for email attachments. Compliance officers: For instructions on searching the AuditLog repository, see “The AuditLog repository” on page 39.
2. Search using the following fields on the Advanced Search page: • Content Type: • Use email to search for email messages. • Use document to search for files that were migrated using the HP File Archiving software (formerly known as FMA). Using “document” does not search for attachments in email sent by Microsoft Outlook or Lotus Notes. Use “email” to search for email attachments. Compliance officers: For instructions on searching the AuditLog repository, see “The AuditLog repository” on page 39.
Email Content Type Figure 3 Advanced Search page (email content type) .
Table 5 Additional advanced search query fields (email) Query field Matches Subject Enter one or more words from the email's Subject field. From Enter the email address of the person who sent the message. You can use a partial email address with a wildcard. The From field is disabled if the Address search field is completed. Enter the email address of a person who received the message or was copied in the message. You can use a partial email address with a wildcard.
Query field Matches NOTE: This field only appears if your IAP system records the name of the Outlook folder where an email was located. The Folder Name field is not available for messages archived from Lotus Notes. To complete the field, enter the name of the Outlook folder. Use quotation marks to search for a folder that has spaces in the name. For example: “my folder” or “Sent Items”. (This does not apply to Japanese, Chinese, or Korean email folder names.
Query field: MessageID
Matches: Enter the MessageID (message identification number) from Outlook. (Not all messages have MessageIDs.) This field is used mainly for compliance searches.
To display the MessageID field in Outlook:
1. Double-click to open the message in its own window.
2. Select View > Options. The Message Options dialog box is displayed. If the message has a MessageID, the field is shown in the Internet headers field of the Message Options dialog box.
Table 6 Additional advanced search query fields (documents)
Query field: Document Name
Matches: Enter the file name, not including the file extension.
Query field: Document Path
Matches: Enter the document's original file path. As for any other text query field, separators such as slash ( / ), backslash ( \ ), and colon ( : ) are ignored, and the query words are searched in any order. For example, query text c:\abc\xyz will match path abc:\xyz\c, as well as path c:\abc\xyz.
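The Document Path behavior described above can be illustrated with a short sketch. This is not the IAP implementation; it only mimics the documented rule that slash, backslash, and colon are ignored and that query words match in any order. The function names `path_words` and `path_matches` are hypothetical, and the sketch assumes that a path matches when it contains every query word, compared without regard to case.

```python
import re


def path_words(text):
    """Split a path or query into lowercase words.

    Slash, backslash, colon, and white space are treated as separators
    (a simplification; the guide defines a larger separator set).
    """
    return {word.lower() for word in re.split(r"[\\/:\s]+", text) if word}


def path_matches(query, document_path):
    """Return True when every query word occurs in the path, in any order."""
    return path_words(query) <= path_words(document_path)


if __name__ == "__main__":
    print(path_matches(r"c:\abc\xyz", r"abc:\xyz\c"))  # same words, different order
    print(path_matches(r"c:\abc\xyz", r"c:\abc"))      # the word xyz is missing
```

Because word order is ignored, a Document Path query narrows results by the folder and drive names it contains rather than by an exact location.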
4 Working with search results
After you complete a search, you can work with the results in several ways. This section explains how to save the items you find, save the criteria for the search, send message copies to your email account, and create quarantine repositories for legal holds.
Figure 6 Query Results page (document content type) . You can complete any of the following tasks from the Query Results page: • To display the contents of an email in the viewing pane, click the item from the list once. Clicking the item twice will open the preview pane as a new window. • To display the contents of a document (file), click the item in the list, click Download, and then click Open in the File Download dialog box. • To download a file, click the item in the list and then click Download.
Query results navigation bar When search results are retrieved, the most recent documents are displayed first. Documents that were archived within the past two hours might not appear in the results, depending on your system's configuration. Fifty results (maximum) are shown on the Query Results page. You can use the query results navigation bar to display different groups of 50 results. Figure 7 Query results navigation bar .
Saving search criteria
After you submit a search, you can save the search criteria. To save criteria:
1. Display the Query Results page by completing one of the following tasks:
• Submit a simple or advanced search (see “Completing simple searches” on page 21 or “Completing advanced searches” on page 22).
• Access previously saved results (see “Accessing saved results” on page 34).
2. From the Query Results page, click More Options, and then click Save Current Search Criteria.
NOTE: Deleting search results does not delete the items on the IAP. The actual items remain on the IAP according to the retention period set by your system administrator. If a search locates a large number of documents, saving the results is useful. The query is resubmitted as a background process. Because the query runs in the background, you can continue to use the IAP Web Interface (for example, to submit other queries) and then retrieve the results at a later time. To save results: 1. 2.
Accessing saved criteria If you save the criteria used to define a query, you can access the saved criteria from the Query Manager page. Each item listed shows the name of the saved criteria and the date when you saved the criteria. Twenty sets of query criteria can be saved at any one time. After that, a set must be deleted before you can save a new set of query criteria. To access saved criteria: 1. Click Query Manager in the toolbar.
To access saved results: 1. Click Query Manager in the toolbar. (You can also access this feature by right-clicking inside the Simple Search or Advanced Search page and selecting Query Manager.) The default Query Manager page displays all saved results. You can also access this view by clicking Saved Results on the Query Manager page. Figure 11 Saved Results view, Query Manager page . 2. Complete any of the following tasks: • To display the results, click Reload. The Query Results page is displayed.
Exporting search results An export utility is required to export archived email or files from the IAP. Email Content Type • Exporting email to Microsoft Outlook: The HP EAs Exchange Outlook Plug-In must be installed on your computer. Microsoft .NET Framework version 2.0 or later is required for the utility to be installed. • Exporting email to IBM Lotus Notes: Two methods are supported.
Using quarantine repositories Documents are kept in the IAP according to the retention policy specified by your organization. After an item's retention period has expired, it is automatically deleted from the system. Documents that are placed in a special repository called a quarantine repository are not subject to automatic deletion. As long as items are in this repository, they are preserved in a legal hold and cannot be deleted when their retention period expires.
Deleting the contents of a quarantine repository Once the legal hold is not required, you can delete the quarantine repository. The documents in the repository return to the control of the retention manager. If an item's retention period has not expired, it remains archived in the IAP. To delete a quarantine repository: 1. Click Query Manager in the toolbar to access the saved results. 2. In the Query Name column, locate the row with the saved results. 3.
5 Using the AuditLog This section describes how to work with the AuditLog, the IAP's compliance system log. Major topics include: • The AuditLog repository, page 39 • Searching the AuditLog, page 40 • Viewing AuditLogs, page 44 The AuditLog repository NOTE: AuditLog repositories are usually available only to legal or compliance officers. If the AuditLog repository does not appear in the Where to Search list on the search page, you do not have access.
Searching the AuditLog You can complete either a simple or advanced search when searching the AuditLog repository. To perform a search: 1. Click New Search in the toolbar and select the simple or advanced search form. TIP: If you need to search for multiple criteria, complete an advanced search. 2. From the Content Type list, select document. NOTE: An email search on an AuditLog repository will produce no results. You must perform a document search. 3.
Completing the Search for field TIP: You can use Boolean operators AND, OR, and NOT to enter multiple criteria in this field. For example, jdoe AND Search; Mail OR Export; jsmith* NOT jsmithson. For more information on Boolean expressions, see “Boolean query expressions” on page 58. You can also use wildcards (* and ?). See “Matching words” on page 54.
Logged action | Description
Actions performed to delete email
Login | IAP Admin login to the Platform Control Center (PCC), for the session in which Administrative Delete authority is granted.
Admin Delete Authority | Administrative Delete authority granted to a user (from the PCC).
Delete by Admin | Email deleted from the IAP by the Administrative Delete user. (Performed from the Web Interface.
Figure 12 AuditLog advanced search . Table 9 Additional fields for AuditLog repository searches Query field Matches Name of the component generating the AuditLog.
Query field: Author
Matches: Search for:
• Actions performed by a specific user. Use one of the following forms to identify the user:
  • User ID: Enter the login name of the user, such as jdoe.
  • First Name: Enter the user's first name, such as John, as shown in the LDAP directory (address book).
  • Last Name: Enter the user's last name, such as Doe, as shown in the LDAP directory.
• Messages merged during a Duplicate Manager job: Enter Merge Manager.
Each logged session contains the following additional information: • User Name: Name of the user performing the logged actions. For a Duplicate Manager job, the user is shown as MergeManager. • Session ID: The ID of the logged session. • Session Start Time: Start time of the session. • Session End Time: End time of the session. • User ID: Login ID of the user performing the logged actions. (Correlates with User Name.) • Operation: Name of the logged action. • TimeStamp: Date and time of the logged action.
Duplicate Manager jobs
Figure 15 AuditLog for Duplicate Manager job
Administrative Delete actions
IAP administrator actions in PCC
Figure 16 AuditLog for Administrative Delete actions in PCC
Administrative Delete actions in Web Interface
Figure 17 AuditLog entry for email deletion
6 Troubleshooting problems in the Web Interface
Use the topics in this section to troubleshoot problems that can occur when using the Web Interface.
• Unable to display saved results, page 47
• Corrupted documents, page 47
• Wildcard errors in queries, page 47
• Query and display issues in Japanese, Chinese, and Korean documents, page 48
• Problems exporting email, page 49
• Problems exporting files, page 49
Unable to display saved results
Search results are saved for one week and then deleted.
Wildcards in phrase searches Wildcards are not allowed in phrase queries (queries with spaces between words). For example, accounts* department or accounts? department are not valid queries. Query and display issues in Japanese, Chinese, and Korean documents Extra space appears in email Subject text When the text in an email Subject field includes non-ASCII characters, the Web Interface may sometimes display a space in the text. The space is not present in the archived email.
Japanese underscore character not displayed If your computer runs Microsoft Windows XP Service Pack 2 or later, the underscore (“_”) Japanese multi-byte character might not be visible in the Web Interface under the following conditions: • Your computer's language option for non-Unicode languages (Control Panel > Regional and Language options > Advanced) is set to English and the language in the Web Interface is set to Japanese • The non-Unicode language is set to Japanese and the Web Interface language is s
7 Query expression syntax and matching This section describes the IAP Web Interface syntax used to search for archived documents (files or email messages), and explains how queries are matched against documents.
Topics include: • Word characters and separators, page 52 • Regular expression definition of English word characters, page 52 Word characters and separators Word characters include all uppercase and lowercase letters, digits, and the following additional characters: • _ (underscore) • # (number/pound/hash sign) • & (ampersand) All other characters are separators (except in queries, wildcards ? and *, and special query characters ~, ", -, and !). However, && by itself is not a word.
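The word and separator rules above can be sketched as a small tokenizer. This is an illustration only, not the regular expression the IAP uses: it covers ASCII letters and digits plus the three additional word characters, whereas the guide counts all uppercase and lowercase letters as word characters. The name `index_words` is hypothetical, and lowercasing stands in for case-insensitive matching.

```python
import re

# Word characters in this sketch: ASCII letters, digits, underscore,
# number sign, and ampersand. Every other character is a separator.
WORD_RE = re.compile(r"[A-Za-z0-9_#&]+")


def index_words(text):
    """Return the words found in text, lowercased, in document order.

    A run consisting only of && is dropped, because the guide states
    that && by itself is not a word.
    """
    return [word.lower() for word in WORD_RE.findall(text) if word != "&&"]


if __name__ == "__main__":
    for word in index_words("Q&A: see item #42, file_name.txt && more"):
        print(word)
```

In this sketch the period in file_name.txt acts as a separator, so the name and the extension become separate words.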
Letters and digits in files Although all letters and digits are word characters, their treatment in files (including email message attachments) depends on the character encoding used. You can search for any words in email message bodies and headers, regardless of the encoding. You can search for words in files (including email body, header, attachments, and indexed documents) provided the character encoding is shown in the following table. The IAP does not perform automatic character set detection.
Supported character set | Description
BIG5-HKSCS | Chinese (Hong Kong)
EUC-KR | Korean
ISO-2022-KR | Korean
Johab | Korean
KS_C_5601-1987 | Korean
ISO-2022-JP | Japanese
EUC-JP | Japanese
SHIFT-JIS | Japanese
Matching words
Matching words is not case-sensitive: cat, Cat, cAt, and CAT all match. Corresponding uppercase and lowercase letters, such as A and a, are treated the same in all respects.
word. For example, the fuzzy word define~ matches the similar words defined and definite, but does not match defining, definition, indefinite, or pine. It also matches define itself.
NOTE: Do not use wildcards in fuzzy searches. For example, foo*~ and foo?~ are not valid queries.
Measuring word similarity
The edit distance (also called Levenshtein distance) between two words is the number of single-character operations (deletion, replacement, or insertion) required to change one word into the other word.
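As an illustration of edit distance, the following sketch computes it with the standard dynamic-programming method. It shows only the distance itself; how the IAP turns a distance into a match or non-match decision is not assumed here. The function name `edit_distance` is hypothetical.

```python
def edit_distance(first, second):
    """Return the Levenshtein distance between two words."""
    # previous[j] holds the distance between the processed prefix of
    # `first` and the first j characters of `second`.
    previous = list(range(len(second) + 1))
    for i, char_a in enumerate(first, start=1):
        current = [i]
        for j, char_b in enumerate(second, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # replacement, or no change
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    for candidate in ("define", "defined", "definite", "defining", "pine"):
        print(candidate, edit_distance("define", candidate))
```

For the fuzzy word define~, the words that the guide lists as matches (defined, definite) are at distance 1 and 2, while defining and pine are both at distance 3.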
Proximity word sequences You can use simple word sequences to search for words separated by separators but not by other words. To search for document words that are in an ordered sequence, but might be separated by other words, use a proximity word sequence. To write a proximity word sequence, use the same syntax as a simple word sequence, but append a tilde (~) character to the second quote, and follow that with a numeric proximity value.
Spreadsheets Look at the external representation of the following spreadsheet example.
And the following queries would not match: "John Tyler" "Quincy Adams" "John Quincy Adams" "John Adams 1797–1801" PDF documents PDF documents are another case where the internal text representation can vary widely from the visible presentation in PDF readers. The following issues can arise: • Text sequences can appear out of order on the same page depending on how the page was composed.
Using the NOT operator The Boolean NOT operator excludes every term after NOT in a query. For example, the query beta* NOT beta2 would look for beta1 or beta05, but not beta2. For IAP searches, the NOT operator must always connect at least two terms or sub-expressions (such as yes NOT no), but the query cannot consist solely of negative criteria (such as (NOT yes) OR (NOT no)).
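The beta* NOT beta2 example can be sketched as follows. This is a simplified model rather than the IAP query engine: it assumes that * stands for any run of characters and ? for a single character, and it uses Python's `fnmatch` module for the wildcard comparison. The names `word_matches` and `matches_a_not_b` are hypothetical.

```python
from fnmatch import fnmatchcase


def word_matches(pattern, word):
    """Compare a query word (which may contain * or ?) with a document word.

    Both sides are lowercased because word matching is not case-sensitive.
    """
    return fnmatchcase(word.lower(), pattern.lower())


def matches_a_not_b(include, exclude, document_words):
    """Evaluate `include NOT exclude` against the words of one document."""
    has_included = any(word_matches(include, w) for w in document_words)
    has_excluded = any(word_matches(exclude, w) for w in document_words)
    return has_included and not has_excluded


if __name__ == "__main__":
    print(matches_a_not_b("beta*", "beta2", ["release", "beta1"]))
    print(matches_a_not_b("beta*", "beta2", ["release", "beta2"]))
```

A document that contains both beta1 and beta2 is rejected in this model, because the excluded term is present.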
Separators Boolean operators must be surrounded by one or more separators, typically white space. For example, the query peas&&carrots is not equivalent to the query peas && carrots; peas&&carrots is a single word. Negation operators (- and !) are exceptions to this rule. They must be preceded by a separator, but they need not be followed by a separator. For example, carrot-a6 is a single query word, but carrot -a6, like carrot (- a6), is equivalent to the Boolean expression carrot NOT a6.
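The separator rule can be illustrated with a sketch that splits a query on white space and labels each piece as an operator or a word. It is a simplification under stated assumptions: operators are recognized only in uppercase, && is treated as AND, a leading - or ! is treated as NOT, and parentheses are not handled. The names `OPERATORS` and `classify` are hypothetical.

```python
# Operator spellings recognized by this sketch (uppercase only).
OPERATORS = {"AND": "AND", "&&": "AND", "OR": "OR", "NOT": "NOT"}


def classify(query):
    """Label each white-space-separated piece of a query."""
    labeled = []
    for token in query.split():
        if token in OPERATORS:
            labeled.append(("operator", OPERATORS[token]))
        elif len(token) > 1 and token[0] in "-!":
            # A negation operator need not be followed by a separator.
            labeled.append(("operator", "NOT"))
            labeled.append(("word", token[1:].lower()))
        else:
            labeled.append(("word", token.lower()))
    return labeled


if __name__ == "__main__":
    for query in ("peas&&carrots", "peas && carrots", "carrot-a6", "carrot -a6"):
        print(query, "->", classify(query))
```

Only the spaced forms produce an operator; peas&&carrots and carrot-a6 each remain a single query word.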
Query expression examples
The following are examples of query expressions.
Table 14 Query expression examples
Query expression | Finds documents with ...
peace OR quiet | Either peace or quiet, or both, in either order.
peace quiet, peace AND quiet, or peace && quiet | Both peace and quiet, in either order.
peace&&quiet | The single word peace&&quiet.
peace or quiet | The three words peace, or, and quiet, in any order. “or” is a word. The OR operator must be uppercase.
not quiet | The words not and quiet.
A Indexed file types and MIME types
The following file types and MIME content-types are indexed by IAP 2.1. You can search the contents of archived files or email attachments if their file type is listed in this table.
Table 15 IAP indexed file types and MIME types
File extension | File type | MIME content-type
.xml | XML document | text/xml
.txt | Plain text file; treated as ISO8859-1 unless otherwise specified | text/plain
.htm, .html, .stm | HTML document | text/html, rtf/html
.
File extension | File type | MIME content-type
.xlsm | Microsoft Excel 2007 macro-enabled workbook | application/vnd.ms-excel.sheet.macroEnabled.12
.xltx | Microsoft Excel 2007 template | application/vnd.openxmlformats-officedocument.spreadsheetml.template
.xltm | Microsoft Excel 2007 macro-enabled workbook template | application/vnd.ms-excel.template.macroEnabled.12
.xlam | Microsoft Excel 2007 add-in | application/vnd.ms-excel.addin.macroEnabled.12
.pptx | Microsoft PowerPoint 2007 presentation | application/vnd.
B Additional indexing detail for Microsoft Office The tables below list the Microsoft Office features and document properties that are indexed (and not indexed) by IAP 2.1. Indexed features and properties can be used in Web Interface searches. The tables cover Microsoft Office versions 97–2003 and 2007.
Feature Microsoft Word Microsoft PowerPoint Sheet's name N/A N/A Microsoft Excel Excel 97-2003: Yes Excel 2007: No Microsoft Office supported file properties Table 17 Microsoft Office supported properties Type Property Microsoft Word Microsoft PowerPoint Microsoft Excel Author Yes Yes Yes Title Yes Yes Yes Subject Yes Yes Yes Keywords Yes Yes Yes Category Yes Yes Yes Status Yes Yes Yes Comments Yes Yes Yes Location No No No Type No No No Location No No No
Type Advanced Properties: Custom Property Microsoft Word Microsoft PowerPoint Microsoft Excel Last saved by Yes Yes Yes Revision number Word 97-2003: No PowerPoint 97-2003: No Excel 97-2003: No Word 2007: Yes PowerPoint 2007: Yes Statistics No No No Checked by Yes Yes Yes Client Yes Yes Yes Date completed Yes Yes Yes Department Yes Yes Yes Destination Yes Yes Yes Disposition Yes Yes Yes Division Yes Yes Yes Document number Yes Yes Yes Editor Yes Yes Yes
Type Advanced Properties: Contents Advanced Properties: Summary Property Microsoft Word Microsoft PowerPoint Microsoft Excel Reference Yes Yes Yes Source Yes Yes Yes Status Yes Yes Yes Telephone number Yes Yes Yes Typist Yes Yes Yes Document Contents No No No Title Yes Yes Yes Subject Yes Yes Yes Author Yes Yes Yes Manager Yes No Yes Company Yes No Yes Category Yes Yes Yes Keywords Yes Yes Yes Comments Yes Yes Yes Word 2007: Yes PowerPoint 97-2
Index Symbols && in Boolean query expressions, 58 , 20 A access control list (ACL) definition, 14 AND queries, 58 archiving definition, 14 audit queries, 33 AuditLog repository accessing, 39 B Boolean queries AND and OR operators, 58 characters, 51 expressions, 58 nested, 60 NOT operator, 59 separators, 60 sub-expressions, 60 C cannot display saved results, 47 case sensitivity Boolean queries, 58 matching words, 54 word characters, 51 changing IAP Web Interface language, 20 IAP Web Interface password, 20
exporting email, 36 files, 36 expressions, query about, 51 Boolean, 58 examples, 61 languages, 52, 53 letters and digits, 52 matching words, 54 separators, 51 sequences, 55 sequences, matching, 55 word characters, 51 F File Export, 13 file type indexing support, 15, 63 furigana, 48 fuzzy words, 54 H half-width Katakana, 48 Hankaku-kana, 48 help, obtaining, 10 HP Subscriber's choice Web site, 11 technical support, 10 HTML formatting, 15 I IAP definition, 13 IAP Web Interface, 14 advanced searches, 22 basi
PowerPoint, 15, 65, 66 Preferences button IAP Web Interface, 18 Q quarantine repositories, 32, 37 creating, 37 deleting, 38 Quattro Pro, 15, 63 query criteria deleting, 34 displaying saved criteria, 34 saving, 32 query expressions about, 51 Boolean, 58, 60 examples, 61 languages, 52, 53 letters and digits, 52 separators, 51 sequences, matching, 55 word characters, 51 query issues Japanese, Chinese, Korean documents, 48 Query Manager button IAP Web Interface, 18 query problems wildcards, 47 query results di
wildcard characters, 54 Word, 15, 65, 66 word sequences matching, 56 proximity, 56 WordPerfect, 15, 48, 63 WordPerfect Office applications, 15, 63 words, query characters and separators, 51 fuzzy, 54 letters and digits, 52 literal, 54 matching, 54, 55 sequences, 55 Z ZIP files, 15, 63 72