Datasheet

584

Consider the following example:

SELECT DeptNo

FROM EMPLOYEE

WHERE EMPLOYEE.Age > 29 ;

You can use a B+ tree index on Age to retrieve only the rows in which

employee age is greater than 29. Whether this method is worthwhile

depends on the age distribution of the employees. If most employees are 30

or older, the indexed retrieval won’t do much better than a sequential scan.

Suppose that only 10 percent of the employees are more than 29 years old.

If the index on Age is clustered, you gain a substantial improvement over a

sequential scan. If the index is unclustered, however — as it’s likely to be —

it could require a buffer-page swap for every qualifying employee and will

likely be more expensive than a sequential scan. I say that an index on Age

is likely to be unclustered based on the assumption that at least one column

in the EMPLOYEE table is more deserving of a clustered index than the Age

column.

You can see from this example that choosing whether to create an index

for a table column isn’t a simple matter. Doing an effective job of choosing

requires detailed knowledge of the data as well as of the queries that are run

on it.

Figure 1-1 compares the costs of using a clustered index, an unclustered

index, and a sequential scan to retrieve rows from a table.

Figure 1-1:

The cost of

retrievals

with and

without an

index.

Cost

Percentage of rows retrieved

100O

Clustered index

Sequential scan

Unclustered index

Choosing the Right Indexes

40_9780470929964-bk07ch01.indd 58440_9780470929964-bk07ch01.indd 584 2/24/11 3:45 PM2/24/11 3:45 PM