PART I: Introduction
CHAPTER 1: What Is Enterprise Search?
CHAPTER 2: Developing a Strategy – The Business Plan of Search
CHAPTER 3: Overview of Microsoft Enterprise Search Products
1
What Is Enterprise Search?

WHAT’S IN THIS CHAPTER?
• Defining Enterprise Search, and how it differs from Internet search portals
• Giving an overview of Enterprise Search architecture and the Microsoft Search lineup
• Characterizing the use of search within an organization
• Exploring Search ROI and SCOE
• Answering common questions about Enterprise Search

Many people assume that “Enterprise Search” refers to search behind a corporate firewall.
x CHAPTER 1 WHAT IS ENTERPRISE SEARCH? But if you own it (or lease it), and it’s not working, you can change it. You can adjust it, tweak it, audit it, enhance it, or rip the whole darn thing out and start over! Ownership equals control! Broadly, Enterprise Search could be thought of as all search engines except the public Yahoo!, Google, and MSN ones, since you do own and control the search engine that powers your public website or online store.
How Enterprise Search Differs from Web Search x 5 responded to these differences with enhancements to their enterprise offerings. However, the underlying architecture and design may prove to be a fundamental mismatch for some specific search applications. Technical Differences in Search Requirements and Technologies Aside from data volume, there are a number of other technical differences between a company’s private intranet and the Internet.
Every Intranet Search Project Is Unique
Although some engines were not created for the Internet, they were still usually targeted at specific business applications. For example, imagine an engine that was created to serve a complex parts database. Perhaps a spider and HTML filters were later added, but that was not its genesis.
Enterprise Search Technology Overview
[Figure: Indexing. Stages: Determine Document Language; Separate into Paragraphs, Sentences, and Words; Calculate Stemming, Thesaurus, etc.; Write to Full-Text Index. Index structures: Full-Text Index; Word Inversion Index; Special Indexes (i.e., Soundex, Casedex, etc.)]
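To make the indexing stages in the figure more concrete, here is a minimal sketch of the "separate into words" and "word inversion index" steps. It is an illustration only: the function names (`tokenize`, `build_inverted_index`, `search`) and the sample documents are hypothetical, and language detection, stemming, thesaurus expansion, and the special indexes are left out.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_inverted_index(docs):
    """Map each word to the sorted list of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}

def search(index, query):
    """Return the IDs of documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return []
    result = set(index.get(words[0], []))
    for word in words[1:]:
        result &= set(index.get(word, []))
    return sorted(result)

# Hypothetical sample documents, keyed by document ID.
docs = {
    1: "Enterprise search behind the firewall",
    2: "Internet search portals",
    3: "Enterprise content management",
}
index = build_inverted_index(docs)
matches = search(index, "enterprise search")
```

A production engine would store word positions and weights as well, but the core idea of looking words up rather than scanning documents is the same.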
Scalability
In the early days of search software, if you needed to handle more data, you upgraded the machine’s memory or hard drives, or upgraded to a faster machine. Most modern engines scale by adding more machines and then dividing the work among them. This division of labor is usually done by distributing the engine’s subsystems across those machines, so this is an additional motivation for you to understand the various subsystems in your engine.
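As a rough illustration of dividing work among machines, the sketch below splits a document set across several hypothetical index nodes and merges the per-node answers at query time. Note the assumption: the text describes distributing subsystems, whereas this example shows partitioning documents by a hash of their ID, which is a different but commonly used way to spread load. The names `shard_for`, `partition`, and `scatter_gather` are invented for this example.

```python
import zlib

def shard_for(doc_id, num_shards):
    """Pick a shard with a stable hash, so a document always maps to the same machine."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards

def partition(docs, num_shards):
    """Split a {doc_id: text} mapping into one mapping per shard."""
    shards = [{} for _ in range(num_shards)]
    for doc_id, text in docs.items():
        shards[shard_for(doc_id, num_shards)][doc_id] = text
    return shards

def scatter_gather(shards, word):
    """Ask every shard for documents containing the word, then merge the answers."""
    hits = []
    for shard in shards:  # a real deployment would query the shards in parallel
        hits.extend(
            doc_id for doc_id, text in shard.items()
            if word.lower() in text.lower().split()
        )
    return sorted(hits)

# Hypothetical documents; each shard stands in for one machine.
docs = {
    "doc-1": "enterprise search overview",
    "doc-2": "intranet spider configuration",
    "doc-3": "search relevancy tuning",
    "doc-4": "parts database export",
}
shards = partition(docs, num_shards=3)
found = scatter_gather(shards, "search")
```

Whichever shard a document lands on, the merged result is the same as searching a single machine, which is what lets capacity grow by adding nodes.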
Categorizing Your Organization’s Use of Search — Examples x 9 Microsoft is also positioning these products according to employee- versus customer-facing uses, although these are not hard and fast rules. Microsoft’s Explanation Microsoft will continue to embed search in specific products such as Windows and Office applications. For server-based search, however, Microsoft will offer different products designed for customer- and employee-facing applications.
search will never be exactly the same. However, honestly evaluating your company’s use of search might help with decision making down the line. Instead of spouting all kinds of abstract rules, let’s dive into some concrete examples. The following are examples of companies where search is absolutely key:
• Internet search engines or yellow pages (a “no-brainer”)
• eCommerce sites (shopping, travel, B2B, etc.
The ROI of Search x 11 searches that would run constantly against all inbound information and keep them up to date without pestering them. This is a very tall order. The global search could just as easily bring back obsolete junk, and saved searches could turn the already beeping cell phone into a nightmare. This implementation could be done very badly. If it were done well, however, it could enable that sales office to run like an incredibly efficient machine.
Predicting and accurately measuring how much money you’ll earn or save can be difficult.
2. Financial benefits/ROI: These can be the result of the direct generation of additional revenue and cost savings, improving efficiency, and so forth (hard ROI versus soft ROI). We talked about this in the previous section. Although the ROI of search gets mentioned quite a bit in the press, we don’t think it always justifies new search projects to management, except in the case of a customer-facing B2C or B2B commerce site.
3.
When justifying search projects, we encourage clients to think in terms of all three levels of benefits. When thinking about the BI benefits of search, try to include additional stakeholders in the earliest parts of planning. Most companies already involve IT and site designers in their planning process. These BI benefits, however, will also be of interest to upper management, content creators, customer service/tech support, and marketing.
The Search Center of Excellence x 15 At the same time, Internet retailers realized that, without really good search, customers would abandon their site. Jakob Nielsen, in his well-known 1999 study, found that half of site visitors would use search if there were a search box. We also all learned that a poor search results page is probably the last thing a frustrated site user sees before he or she leaves the site for one with better search.
• Maintaining corporate knowledge of existing systems and the content they serve
• Serving as a sounding board for reported problems or newly proposed projects
• Cataloguing agreed-to search best practices
• Maintaining relationships with key vendors
• General industry awareness
• In-house training
• Helping to maintain controlled vocabularies or taxonomies

SCOE Staffing and Skills
Here are some general areas that an ideal SCOE would have covered:
WHY NOT JUST BUY A GOOGLE APPLIANCE?
Even if you’re not seriously considering a Google appliance, or believe you understand fully why you’d select a Microsoft/FAST engine instead, make sure you understand what the Google appliance represents. We can almost guarantee that at some point in the purchasing process, some high-level executive at your company is going to ask about it, possibly more than once, and possibly very late in the process.
Business Considerations
There are differences in licensing between the Google Appliance and the FAST/Microsoft offerings. This is not to say that Google is necessarily more expensive, and as most corporate buyers know, the prices for hardware, software, and services can vary because of many factors.
As with what we’ve said about the Google Appliance, this certainly isn’t about “good” or “bad” engines. These open source engines are very impressive for what they do! In their current incarnations, however, these offerings are aimed squarely at programmers and tinkerers, not busy IT departments. We refer to this as “enterprise packaging.
stumbling block for some of the open source solutions. Getting data out of databases and other content repositories can be a challenge, depending on the engine and the corporate infrastructure. Other things that tend to multiply the seeming complexity include:
• Staff that’s not particularly familiar with web and database technologies.
• Staff that’s not familiar with search engine terms and technology.
various parameters and try to quantify the improvement. Presenting “before and after” five-digit decimal numbers to management, based on these measurements, may not be very compelling, but the staff involved with the search engines should be learning from these measurements. Another possibility is that, although money was spent and vendors were swapped out, the new engine isn’t really much better.
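The text does not say which measurements produce those “before and after” numbers. As one possible example, and purely as an assumption, the sketch below computes precision at k: the fraction of the top k results that a reviewer judged relevant. The function name, document IDs, and relevance judgments are all hypothetical.

```python
def precision_at_k(results, relevant, k):
    """Fraction of the top-k results that appear in the set of relevant documents."""
    if k <= 0:
        return 0.0
    top = results[:k]
    hits = sum(1 for doc_id in top if doc_id in relevant)
    return hits / k

# Hypothetical judgments for one test query.
relevant = {"d1", "d2", "d3"}

# Hypothetical result lists before and after a tuning change.
before = ["d7", "d2", "d9", "d4", "d1"]
after = ["d1", "d2", "d3", "d9", "d4"]

score_before = precision_at_k(before, relevant, 5)
score_after = precision_at_k(after, relevant, 5)
improved = score_after > score_before
```

Averaging a score like this over a fixed set of test queries gives staff a repeatable way to see whether a parameter change or a vendor swap actually helped.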
And as a quick review, generally a taxonomy attempts to organize data in a hierarchy. Yahoo! is probably the most widely known example. A taxonomy can be organized by subject, as Yahoo! and your local library’s card catalog are; this would be a subject-based, or subject “domain”-based, taxonomy. Alternatively, it can be arranged by grouping and subgrouping the data you already have into different segments.
In summary, if your source data is very structured, such as that from a database or XML feed, then you should consider facets. They will provide a good navigation aid for users. If your source data is not structured, but it does have people, places, and other well-defined items within its text, then consider using entity extraction to mine that text to form data that facets can run against. We do suggest using a POC.
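To show why structured data lends itself to facets, here is a minimal sketch that counts the values of a field across a result set and then narrows the results by one value. The record fields, brand names, and function names are hypothetical; a real engine would compute these counts inside the index rather than in application code.

```python
from collections import Counter

def facet_counts(records, field):
    """Count how many records carry each value of a structured field."""
    return Counter(r[field] for r in records if field in r)

def filter_by_facet(records, field, value):
    """Keep only the records whose field matches the chosen facet value."""
    return [r for r in records if r.get(field) == value]

# Hypothetical structured records, such as rows from a product database.
records = [
    {"title": "Laptop A", "brand": "Contoso", "color": "black"},
    {"title": "Laptop B", "brand": "Contoso", "color": "silver"},
    {"title": "Tablet C", "brand": "Fabrikam", "color": "black"},
]

brand_facet = facet_counts(records, "brand")
black_items = filter_by_facet(records, "color", "black")
```

With unstructured text there is no `brand` or `color` field to count, which is the gap entity extraction is meant to fill.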
If you run through the numbers, you can see this very clearly. If a large public website with 1 billion visitors gets 1% of them to tag documents, that’s 10 million taggers, which is great. A private search application with 1,000 daily users, however, might have only 10 active taggers. You might get more than 1% of users to tag something once in a while, but the numbers still tend to be small.
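The arithmetic above can be reproduced in a few lines. This is only a worked example of the figures in the text; the function name `expected_taggers` is made up, and the 1% participation rate is the text's own illustrative figure, not a measured value.

```python
def expected_taggers(users, participation_percent):
    """Number of users expected to tag, using integer math to avoid rounding noise."""
    return users * participation_percent // 100

public_site = expected_taggers(1_000_000_000, 1)  # large public website
private_app = expected_taggers(1_000, 1)          # private search application

print(f"Public site taggers: {public_site:,}")
print(f"Private app taggers: {private_app:,}")
```

The same participation rate yields a crowd on the public site and only a handful of people inside the firewall.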
of service goals can also get very expensive. Fortunately, virtualization and cloud technologies may keep these costs from growing unbounded. How long it will take is perhaps easier for us to answer here: longer than you’d expect. However, a phased approach mitigates this to some extent. Gathering data ahead of time can also speed things up. However, we feel there is a diminishing return on specifying applications in a vacuum, before vendors and integrators are brought in.
Many of the other roles surrounding search can be performed as a part-time activity by staff with other duties, as long as they are done consistently. You’ll want to have contact with administrators for the repositories you’ll be searching. You’ll typically want one or two businesspeople to be involved, looking at the search activity reports. If you’ll be getting involved with taxonomies or a specialized vocabulary, you’ll need somebody to work with that.
Summary
• Generally, these lists don’t directly compare things like implementation time, pricing, or customer service metrics. These are usually huge factors for a successful project.
• As we’ve said, most search engines on the market these days are at least decent, if configured and maintained properly. Using the Gartner Magic Quadrant to justify random vendor changes is a recipe for disaster, and a very expensive one at that.