Installation
Table Of Contents
- Planning for Search Appliance Installation
- About This Document
- How Does the Search Appliance Work?
- About the End User License Agreement
- How Do I Plan My Installation?
- What Character Encoding Should Content Files and Feeds Use?
- What Hardware and Software Do I Need?
- What Does the Google Search Appliance Shipping Box Contain?
- What File Types Can Be Indexed?
- What File Sizes Can Be Indexed?
- What Content Locations Can Be Crawled or Traversed?
- How Many URLs Can Be Crawled?
- How Do I Control Security?
- Can the Search Appliance Use a Dedicated Network Interface Card for Administration?
- What Ports Does the Search Appliance Use?
- What User Accounts Do I Need?
- How are Administration Accounts Authenticated?
- How Do I Obtain Technical Support?
- How is Power Supplied to the Search Appliance?
- How is Data Destroyed on a Returned Search Appliance?
- What Values Do I Need for the Installation Process?
- What Tasks Do I Need to Perform Before I Install?
- Electrical and Other Technical Requirements

Google Search Appliance: Planning for Search Appliance Installation
How Many URLs Can Be Crawled?
The number of URLs that your search appliance can crawl depends on the model and license limit. The
following table lists, for each model, the maximum number of URLs matching the crawl patterns you
define that the search appliance can crawl.
How Do I Control Security?
Your business may require you to restrict access to certain enterprise content. You might want to
restrict what content is crawled and indexed, and you might want to restrict which users have access to
particular content. The Google Search Appliance supports various security models:
• You can exclude content from the index by storing the content in locations that are not crawled.
• You can exclude content from the index by using a robots.txt file to prevent particular locations
from being crawled.
• You can require the search appliance to provide credentials before crawling particular locations.
• You can design an authentication model under which users who cannot be authenticated are not
able to see particular content.
• You can design an authorization model that defines which users are authorized to perform certain
functions on particular documents.
The search appliance supports a range of authentication and authorization methods, including HTTP
Basic, Windows NT LAN Manager Authentication (NTLM), HTML forms-based authentication, certificate
authentication, Lightweight Directory Access Protocol (LDAP) directory servers, and the Authentication
and Authorization SPI.
For information on how to configure crawl for your security model, see Administering Crawl. For
information on how to integrate your search appliance with different authentication and authorization
models, see Managing Search for Controlled-Access Content.
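As a rough illustration of the robots.txt option listed above, the following Python sketch uses the standard library's urllib.robotparser to check which URLs a given robots.txt file would exclude from crawling. The robots.txt rules, the host name intranet.example.com, and the user-agent string "example-crawler" are all assumptions made for the example. They are not values taken from the search appliance, and the sketch only mirrors how a standards-following crawler reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "example-crawler" is a placeholder user agent, not the appliance's own.
    for url in (
        "http://intranet.example.com/public/index.html",
        "http://intranet.example.com/private/report.html",
    ):
        verdict = "allowed" if is_crawlable(ROBOTS_TXT, "example-crawler", url) else "blocked"
        print(f"{url}: {verdict}")
```

Running a check like this against a draft robots.txt file before deployment can help confirm that the locations you intend to keep out of the index are actually disallowed.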
How Can My Security Model Improve Performance?
Using policy ACLs and per-URL ACLs to control which users have access to content located in particular
URLs speeds up the process of authorization and improves search appliance performance. For more
information on ACLs, see Managing Search for Controlled-Access Content.
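The idea behind this performance gain can be sketched in Python. The data model below is purely hypothetical and does not reflect the search appliance's internal implementation or its ACL format: it assumes per-URL ACLs are stored as a mapping from URL to permitted users, and policy ACLs as rules keyed by URL prefix. The point it illustrates is that an authorization decision becomes a local lookup, so the content server does not have to be contacted for every result.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AclStore:
    """Hypothetical in-memory ACL store (illustrative only)."""

    # Per-URL ACLs: exact URL -> users permitted to view it.
    per_url: dict[str, set[str]] = field(default_factory=dict)
    # Policy ACLs: (URL prefix, permitted users), checked in order.
    policy: list[tuple[str, set[str]]] = field(default_factory=list)

    def is_authorized(self, user: str, url: str) -> Optional[bool]:
        """Return True or False if an ACL decides the request, else None.

        None means no ACL covers the URL, so a slower check against the
        content source would be needed (not modeled here).
        """
        if url in self.per_url:
            return user in self.per_url[url]
        for prefix, allowed in self.policy:
            if url.startswith(prefix):
                return user in allowed
        return None


if __name__ == "__main__":
    store = AclStore(
        per_url={"http://intranet.example.com/hr/salaries.html": {"alice"}},
        policy=[("http://intranet.example.com/eng/", {"alice", "bob"})],
    )
    print(store.is_authorized("bob", "http://intranet.example.com/hr/salaries.html"))
    print(store.is_authorized("bob", "http://intranet.example.com/eng/design.html"))
    print(store.is_authorized("bob", "http://intranet.example.com/other.html"))
```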
Search Appliance Model   Maximum License Limit   Maximum Number of URLs that Match Crawl Patterns
GB-7007                  10 million              ~13.6 million
GB-9009                  30 million              ~40 million
G100                     20 million              ~26 million
G500                     100 million             ~133 million
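For capacity planning, the table above can be encoded as a small lookup. The Python sketch below is a planning aid only, not an appliance tool: the function name check_capacity and the estimated URL counts are assumptions for illustration, and the crawl-pattern maxima are the approximate values from the table.

```python
# Limits copied from the table above: model -> (maximum license limit,
# approximate maximum number of URLs that match crawl patterns).
LIMITS = {
    "GB-7007": (10_000_000, 13_600_000),
    "GB-9009": (30_000_000, 40_000_000),
    "G100": (20_000_000, 26_000_000),
    "G500": (100_000_000, 133_000_000),
}


def check_capacity(model: str, matching_urls: int) -> tuple[bool, bool]:
    """Compare an estimated count of URLs matching your crawl patterns
    against both columns of the table for the given model.

    Returns (within_license_limit, within_crawl_pattern_maximum).
    """
    license_limit, crawl_pattern_max = LIMITS[model]
    return matching_urls <= license_limit, matching_urls <= crawl_pattern_max


if __name__ == "__main__":
    # Hypothetical estimate: 25 million URLs match the planned crawl patterns.
    within_license, within_max = check_capacity("G100", 25_000_000)
    print(f"Within license limit: {within_license}")
    print(f"Within crawl-pattern maximum: {within_max}")
```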