Datasheet

11
Chapter 1: What’s in a Data Warehouse?
2. The team in charge of building the data warehouse assigns a group of
business users and other key individuals within the company to play the
role of subject-matter experts.
Together, the data warehousing team and subject-matter experts com-
pile a list of different types of information that can enable them to use
the data warehouse to help track sales activity (or whatever the focus is
for the project).
3. The group then goes through the list of information (data assets), item
by item, and figures out where the data warehouse can obtain that par-
ticular piece of data (raw material).
In most cases, the group can get the data from at least one internal
(within the company) database or file, such as the one that the applica-
tion uses to process orders over the Internet or the master database
of all customers and their current addresses. In other cases, a piece
of information isn’t available from within the company’s computer
applications, but you could obtain it by purchasing it from some other
company. Although a bank doesn’t have the credit ratings and total
outstanding debt for all its customers internally, for example, it can
purchase that information from a third party — a credit bureau.
4. After completing the details of where the business can get each piece of
information, the data warehousing team creates extraction programs.
Extraction programs collect data from various internal databases and
files, copy certain data to a staging area (a work area outside the data
warehouse), cleanse the data to ensure that the data has no errors, and
then copy the higher-quality data (data assets) into the data warehouse.
Extraction programs are created either by hand (custom-coded) or by
using specialized data warehousing products — ETL (extract, transform,
and load) tools.
You can build a successful data warehouse by spending adequate time on the
first two steps in the preceding list (analyzing the need for a data warehouse
and how you should use it), which makes the next two steps (designing
and implementing the data warehouse to make it ready to use) much easier
to perform.
Interestingly, the analysis steps (determining the focus of the data warehouse
and working closely with business users to figure out what information is
important) are nearly identical to the steps for any other type of computer
application. Most computer applications create data as a result of a transac-
tion or set of transactions while a particular application is being used to
run the business, such as filling a customer’s order. The primary difference
between run-the-business applications and a data warehouse is that a
data warehouse relies exclusively on data obtained from other applications
and sources. Figure 1-1 shows the difference between these two types of
environments.
05_407479-ch01.indd 1105_407479-ch01.indd 11 1/26/09 7:23:41 PM1/26/09 7:23:41 PM