Datasheet

582
Choosing the Right Indexes
Indexes can improve the performance of database retrievals dramatically,
for several reasons. One reason is that an index tends to be small compared
with the table that it’s indexing. This fact means that the index is likely to
be in the cache, which is accessible at semiconductor-memory speed rather
than on disk — a million-to-one performance advantage right there. Other
reasons depend on the type of query being performed and on whether the
index is clustered. I discuss clustering in the next section.
Avoiding unnecessary indexes
Because maintaining indexes carries an overhead cost, you don’t want
to create any indexes that won’t improve the performance of any of your
retrieval or update queries. To decide which database tables shouldn’t be
indexed, consult the workload description you created as the first step in
the design process (refer to “Analyzing the Workload,” earlier in this chap-
ter). This description contains a list of queries and their frequencies.
Here’s a no-brainer: If a table has only a small number of rows, there’s no
point in indexing it. A sequential scan through relatively few rows executes
quickly.
For larger tables, the best candidates for indexes are columns that appear in
the query’s WHERE clause. The WHERE clause determines which table rows
are to be selected.
It’s likely — particularly in a system in which a large number of different
queries are run — that some queries are more important than others. Those
queries are run more often, or they’re run against more and larger tables, or
getting results quickly is critical for some reason. Whatever the case, priori-
tize your queries, with the most important coming first. For the most impor-
tant query, create indexes that give the best performance. Then move down
the line, adding indexes that help the progressively less-important queries.
Your DBMS’s query optimizer chooses the best execution plan available to it
based on the indexes that are present.
Different kinds of indexes exist, each with its own structure. One kind of
index is better for some retrievals; another kind is better for others. The
most common index types are B+ tree, hash, and ISAM (see “Choosing an
index type,” later in this chapter). Theoretically, for any given query, the
query optimizer chooses the best index type available. Most of the time,
practice follows theory.
Choosing a column to index
Any column that appears in a query’s WHERE clause is a candidate for
indexing. If the WHERE clause contains an exact-match selection, such as
EMPLOYEE.DepartmentID = DEPARTMENT.DepartmentID, a hash index
Choosing the Right Indexes
40_9780470929964-bk07ch01.indd 58240_9780470929964-bk07ch01.indd 582 2/24/11 3:45 PM2/24/11 3:45 PM