4 TE RI AL Designing a Relational Database GH TE D MA Chapter 1 introduced you to databases and databases management systems. As you’ll recall from that discussion, a database is a collection of related data organized and classified in a structured format that is defined by metadata. Not all databases are structured the same, though, as can be attested to by the different data models that have emerged over the years.
Chapter 4 the ACM, Volume 13, Number 6 (June 1970). In this paper, Codd introduced a relational data structure that was based on the mathematical principles of set theory and predicate logic. The data structure allowed data to be manipulated in a manner that was predictable and resistant to error. To this end, the relational model would enforce data accuracy and consistency, support easy data manipulation and retrieval, and provide a structure independent of the applications accessing the data.
Designing a Relational Database Data Normalization One of the concepts most important to a relational database is that of normalized data. Normalized data is organized into a structure that preserves the integrity of the data while minimizing redundant data. The goal of all normalized data is to prevent lost data and inconsistent data, while minimizing redundant data. A normalized database is one whose tables are structured according to the rules of normalization.
Chapter 4 Another way in which the table violates the first normal form is found in the second and third rows, which are identical. Duplicate rows can exist for a number of reasons, and without the necessary data to distinguish them, you cannot tell whether this is an error in data entry or whether there are supposed to be two records for this one book.
Designing a Relational Database IDs for authors and books in a way that supports books written by multiple authors, authors who have written multiple books, and multiple authors who have written multiple books. Had you tried to match the authors to their books in one of the two tables, the table would potentially fill with enormous amounts of redundant data, which would defeat one of the purposes of the relational database.
Chapter 4 You can see how the primary key works in the fourth and fifth rows, which are related to the author Annie Proulx. Although both rows are concerned with the same author, they refer to different books. As a result, the values Proulx and Postcards identify one row, and the values Proulx and The Shipping News identify the second row.
Designing a Relational Database In addition, a primary key has been defined on the AuthID and BookID columns of the AuthorBook table. As a result, any of the primary key columns in a row, when taken as a whole, must be unique from all other rows. Because there are no other columns in this table, the issue of dependent columns is not a concern, so you can assume that this table also conforms to the second normal form.
Chapter 4 By separating the data into two tables, each column is now dependent on its respective primary key, and no columns are dependent on nonkey columns.
Designing a Relational Database One-to-One Relationships A one-to-one relationship can exist between any two tables in which a row in the first table can be related to only one row in the second table and a row in the second table can be related to only one row in the first table. The following example demonstrates how this works. In Figure 4-8, a one-to-one relationship exists between the Authors table and the AuthorsBios table.
Chapter 4 Authors AuthID 1006 1007 1008 1009 1010 AuthFN Hunter Rainer John Annie Nelson AuthorBook AuthID BookID 1006 14356 1007 12786 1008 17695 1009 19264 1009 19354 1010 16284 AuthMN S.
Designing a Relational Database Many-to-Many Relationships A many-to-many relationship can exist between any two tables in which a row in the first table can be related to one or more rows in the second table, but a row in the second table can be related to one or more rows in the first table. Take a look at an example to help illustrate how this relationship works. In Figure 4-10, you can see three tables: Authors, AuthorBook, and Books.
Chapter 4 An important part of the database design process is the development of a data model. A data model is a physical representation of the components that make up the database as well as the relationships between those components.
Designing a Relational Database Authors AuthID:SMALLINT AuthorBook AuthID:SMALLINT-FK1 BookID:SMALLINT-FK2 AuthFN:VARCHAR(20) AuthMN:VARCHAR(20) AuthLN:VARCHAR(20) Born:YEAR Died:YEAR Publishers PubD:SMALLINT PubName:VARCHAR(40) PubCity:VARCHAR(20) PubState:CHAR(2) BookPublisher AuthID:SMALLINT-FK1 BookID:SMALLINT-FK2 Books BookID:SMALLINT BookTitle:VARCHAR(60) Copyright:YEAR PubID:SMALLINT-FK1 Figure 4-11 One other aspect to notice in the Books table is the PubID column, which is followed by FK1.
Chapter 4 Regardless of which data modeling system you use, there are generally four steps that you should follow when developing a data model: ❑ Identifying the potential entities that will exist in the database ❑ Normalizing the data design to support data storage in the identified entities ❑ Identifying the relationships that exist between tables ❑ Refining the data model to ensure full normalization Whenever you’re developing a data model, you’re generally following the business rules that have
Designing a Relational Database ❑ One author can write one or more books, one or more authors can write one book, and one or more authors can write one or more books. ❑ One publisher can publish one or more books, and one or more publishers can publish one book. From this information, you should be able to create a list of objects that require data storage. For example, the author’s first name is an example of one of the objects that exists in this scenario.
Chapter 4 entities in this way provides you with a foundation to begin normalizing the data structure, which is the next step in the data modeling process. Normalizing Data Once you define and categorize the primary entities and attributes in your data model, you can begin normalizing that structure, which results in the initial tables that make up your database.
Designing a Relational Database ❑ Names can contain any alphanumeric characters that are included in the default character set. ❑ Names can include underscores (_) and dollar signs ($). ❑ Names can start with any acceptable character, including digits. ❑ Names cannot be made up entirely of digits. ❑ Names cannot include a period (.). ❑ Names cannot include an operating system’s pathname separator, such as a backslash (\) or forward slash (/).
Chapter 4 Because your data model contains three tables, three possible relationships can exist among them: ❑ Authors/Books ❑ Authors/Publishers ❑ Books/Publishers First take a look at the Authors/Books table set. As you’ll recall from the business rules outlined for this example, one author can write one or more books, one or more authors can write one book, and one or more authors can write one or more books.
Designing a Relational Database with multiple books. The BookPublisher table works the same way as the AuthorBook table. The BookPublisher table includes two foreign keys: one on the PubID column and one on the BookID column. Together these two columns form the primary key. Once you add the junction table and indicate which columns are foreign keys, you must properly show the two one-to-many relationships that result from adding the table.
Chapter 4 The business rules provided here are not meant to be an exhaustive listing of all the specifications that would be required to create a database, particularly if those specifications were to include a front-end application that would be interfacing with the database. The business rules shown here, however, are enough to get you started in creating your data model. Later in the book, as you add other elements to your database, the necessary business rules are provided.
Designing a Relational Database 5. The next step is to organize the objects into entities. You’ve already done some of the work simply by listing the objects. For example, the DVDs for rent group of objects can be left together as you have them, although you can simplify the category name to “DVDs.” The Transactions/ orders group of entities also provides you with a natural category.
Chapter 4 How It Works In this exercise, you identified the potential entities and attributes that help form the foundation for your data model. First, you organized your objects according to the business rules in which they appeared. From there, you separated the objects into logical categories that grouped together entities and attributes in a meaningful way.
Designing a Relational Database The table should include a column that references the customer ID and a column that references the employee ID. The table should also include a primary key column. Because you’re tracking the customer and the employee at the order level, you do not need to include them at the transaction level. As a result, your Transactions table does not need to reference the employee or customer, but it does need to reference the order.
Chapter 4 How It Works As part of the data modeling process, you identified the tables included in the database and you listed the columns in each table. Normally, you would also assign data types to the columns, but for now, these were assigned for you. In determining how to set up the tables in your data model, you had to apply the rules of normalization to the entities and their attributes.
Designing a Relational Database Draw the relationship on your data model. Use the three-prong end on both sides. 5. Next look at the Transactions and Orders tables. Every order can contain one or more transactions, but each transaction can be part of only one order, so a one-to-many relationship exists between these two tables, with the Transactions table being the many side of the relationship. Draw the relationship on your data model. 6.
Chapter 4 DVDs Participants Roles DVDID:SMALLINT PartID:SMALLINT RoleID:VARCHAR(4) DVDName:VARCHAR(60) NumDisks:TINYINT YearRlsd:YEAR MTypeID:VARCHAR(4)-FK1 StudID:VARCHAR(4)-FK2 RatingID:VARCHAR(4)-FK3 FormID:VARCHAR(2)-FK4 StatID:CHAR(3)-FK5 PartFN:VARCHAR(20) PartMN:VARCHAR(20) PartLN:VARCHAR(20) RoleID:VARCHAR(4)-FK1 RoleDescrip:VARCHAR(430) MovieTypes MTypeID:VARCHAR(4) Studios StudID:VARCHAR(4) Status MTypeDescrip:VARCHAR(30) StudDescrip:VARCHAR(40) StatusID:CHAR(3) StatDescrip:VARCHAR(20
Designing a Relational Database the primary key on all three columns. Also, be sure to mark the one-to-many relationship on your model, and remove the RoleID column from the Participants table. The column is no longer needed in the Participants table, because it is included in the DVDParticipant table. Your data model should now look like the model illustrated in Figure 4-20.
Chapter 4 How It Works In this exercise, you took the final step in completing your data model by adding a junction table that bridged the many-to-many relationship between the DVDs and Participants table and the many-to-many relationship between the Participants and the Roles tables. The junction table — DVDParticipant — allows you to associate a DVD with a participant and a role.
Designing a Relational Database you can create additional constraints in the database that meet any additional specifications outlined in your business requirements. Once you’ve created your database, you can add, retrieve, and manipulate data. Chapter 5 shows you how to assign data types and how to create and manage databases, tables, and indexes. Exercises The following questions are provided as a way for you to better acquaint yourself with the material covered in this chapter.