Q-Built Solutions Makes Quick, Quality-Built Software For Your Business Needs!
 

RDB Design Boot Camp

Place your ad here! This domain gets 400 visitors per day and has Google Page Rank of 3 or 4.  Get great ad rates!

Place your ad here! This domain gets 400 visitors per day and has Google Page Rank of 3 or 4.  Get great ad rates!


 
Please help us write more free software and tips for this site by donating.
Please donate today!


Q-Built Solutions Web Statistics
.

 
Statistics As Of:
26 Mar. '07

Number of unique visitors since 20 March '04:
204,793

Number of Web pages served since 20 March '04:
429,683

Our Most Popular Web Pages:

 1.  How-To Tips
 2. 
Technical Articles
 3. 
Gem Tips
 4. 
VBA
 5. 
Free Stuff
 6. 
Links
 7. 
Free Downloads
 8. 
Our Custom Microsoft Access Products
 9.
Forms
10. 
FAQ's
 

 

 

 

 

 

 

Relational Database Design

Boot Camp

Training Day 1 (TD1):  Introduction to Relational Database Design

You don't know who I am, so I will introduce myself. I am the Gunny.  Gunny is short for Gunnery Sergeant, United States Marine Corps.  That means you'll always get straight, no-nonsense information, without wasting your time or mine. The Gunny will always give you The Word.

If you've ever spent time in a military uniform, you'll understand me when I say that there are only two ways to do something:  the Marine Corps way, and the wrong way. If you pay attention, I'll teach you how to design relational databases that don't waste users' time and effort. That's the Marine Corps way.

You may have heard -- or you may think -- that designing databases is difficult. That's just scuttlebutt.  If a Jarhead can learn how to design relational databases, then you can, too. If you pay attention I will teach you how, using the Marine Corps way.

The Marine Corps way of teaching is quite efficient. First, I'll list your objectives for each lesson. Next, I'll tell you about a concept, then illustrate the concept, then tell you what concept you just learned.  It will be quick and painless, like all Marine Corps training.


Intimidation (or "Why You Shouldn't Be Intimidated")

Do any of the terms entity-relationship, Normal Form, transisitive dependency, primary key, foreign key or referential integrity sound confusing? These are concepts of relational database design that are important, but can be intimidating at first.  Relational database theory was invented by academics with PhD's, so the terminology and concepts can easily sound intimidating for most folks who work in the real world.

As every Leatherneck knows, one of the keys to success is to let the enemy be intimidated, without being intimidated yourself. Marines are not intimidated because they know the "trick." The "trick" in this case is that these intellectuals, with all of their complex theories and jargon, are merely following Gunny's Rules.

That's right.  They're following Gunny's Rules.  You should, too.  Learn them. Live by them.

Here they are:


Gunny's Rules:

  1. Don't waste the Gunny's time.
  2. Make retrieving data as quick and as easy as possible.
  3. Make entering and updating data as quick and as easy as possible.
  4. Make storing data as quick and as easy as possible.
  5. The computer works for you, not the other way around.

Do any of these rules sound difficult to you?  Didn't think so.

And how do you implement Gunny's Rules?  You implement them by following Gunny's Guidelines.


Gunny's Guidelines:

  • Never store your data in the same arrangement that you intend to display it, unless you have a very simple table that will never change.  (Hint:  Your data and display requirements will never change, except on Monday through Saturday, and sometimes on Sunday, too.)
  • Always design for changes.
  • Always design for exceptions.
  • Always check your work.
  • Always keep things squared away (clean up after yourself).
  • Make things as simple as possible, but no simpler. (Albert Einstein said this, so even the most intellectual of people have agreed with the Marine Corps way.)

Do any of these guidelines sound difficult to you? Didn't think so.

I'm not going to ask you if these rules and guidelines make sense to you just yet. I'm going to illustrate them for you over the next several lessons, so you can see why these rules and guidelines work for efficient relational database design.
 

Lesson Objectives

Read lesson objectives here.


Art Imitates Life (or "Databases Model The Real World")

The most common use of a relational database is for simple CRUD (create, read, update, and delete) purposes.  For example, a customer's new order (saved as a record) can be created.  Once created, this order can be read, changed, or deleted as needed. Using computers to automate this process improves productivity by eliminating the man-hours required to do this manually, but the real value lies in how that data in a database is used. If you don't plan to leverage the advantages and power of a relational database, then you might as well store the data in a text file or a spreadsheet.

Databases store data, but they are also models of the real world.  These models compartmentalize and organize processes and information in an efficient manner and employ appropriate "business logic" to govern how the data is stored and used.  At least that's the goal. And just like the real world, employing a successful model can bring quick, easy solutions, but relying on a poorly designed model can bring a plague of problems and waste time or money.  Databases that aren't built correctly may show inaccurate data, or they may be slow and take up too much space or network bandwidth.  A properly designed database will be an efficient storage container for data, without wasting precious (and costly) development time when building database applications or when using the data.

Experienced database developers can easily tell the difference between a successful model and a poorly designed model. Unfortunately, inexperienced managers often can't tell the difference -- until after it's too late.  Many organizations mistakenly rely on a poor database design and make important business decisions or future plans based upon inaccurate data in databases (or spreadsheets) that they've built themselves.  The good news is that experience is a very effective teacher. The bad news is that disasterous experiences can be very costly teachers as well.

Effective database design is both an art and a science.  The art portion relies on intuition, but the intuition needed isn't the unpredictable "you just know it when you see it." It's the type of intuition a Marine uses before waterproofing his Deuce gear:  "Is my Deuce gear likely to get wet today?" -- in which case, he only needs to check the training schedule and the weather forecast for the insight to make a good decision. The science portion of effective database design relies on well-established techniques for preventing the common problems with storage and usage of data. These techniques, coupled with well-thought out planning, will reduce the need for costly changes to the database, the software applications that use it, and the equipment users need to access the database when new situations arise for using this data.

Before we dig into relational database design, let's learn a few definitions we'll be using.
 

Definitions:

  • Relation:  a two-dimensional table.
  • Think of these dimensions as horizontal (width) and vertical (height). The horizontal dimension expands as each column (an attribute or field) is added to the table.  The vertical dimension expands as each new entity (record, row, or tuple -- rhymes with "couple") is added to the table, with each row containing its own value for each attribute in the table.  So a relation is merely a two-dimensional table describing a set of entities (for entities think of nouns such as "people" or "customer orders"), and their attributes ("first name," "last name," or "order date").

  • Column:  an attribute or field describing the entity.
  • Each attribute is a separate column. An entity can have multiple attributes.  For example, first name, last name, date of birth, and place of birth can be four columns in the table describing people. Don't try to glue the attributes together by consolidating them into a single column (such as first name and last name), because it will take extra time later to unglue them when one or another attribute is needed, which is in violation of Gunny's Rule #2.
     

  • Row: a record or tuple describing a single entity with all its attributes.
  • Most importantly, each record in each table should represent a single theme. For example, a table of records describing a set of people can use first name, last name and date of birth to describe each person, but adding attributes such as academic courses enrolled in, sporting awards, cars owned, et cetera, would be describing multiple themes that each person may or may not be involved in (even if some of the people -- records -- listed in the table are involved in them).  These additional themes should be separated and an additional table for each individual theme should be created.
     

  • Primary key: a group of one or more attributes that uniquely identifies a row.
  • Each row has a column or group of columns that have unique values that no other row in the table has.  For example, a table that contains the names of the 50 States of the United States and their State capitols could use the column "StateName" or "StateAbbreviation" to uniquely identify each row, since no two States (or abbreviations) are the same. On the other hand, a table containing people's names and using the "FirstName" column and "LastName" column can't be reasonably assured that there won't be repeated entries.  For example, if George Foreman (former World Heavyweight Champion) and his five sons were listed, then there would be six records with George in the "FirstName" column and Foreman in the "LastName" column -- hardly unique.

  • Foreign key: the primary key of one relation stored in another (foreign) relation.
  • A row in the current table may be logically related to a record in another (foreign) table, and the primary key of that record in the foreign table is stored as the foreign key for this row in the current table. For example, if the current table stores records about people and one of the attributes describing each person is place of birth, then the primary key (two-letter abbreviation for the State) for that related record in the foreign table ("tblStates") may be placed in the "PlaceOfBirth" (foreign key) column in the current table. And while values in the column(s) in the foreign table must be unique, there can be many records in the current table that use this same foreign key. In other words, more than one person can use "TX" in the "PlaceOfBirth" (foreign key) column in the current table.

So you understand these definitions. Correct? Then get ready for the next part, because we're almost done with this lesson.


The Home Stretch

Relations

For a table to be a relation, it must meet all of the following restrictions:

    1.)  The cells of the table must be of a single value; and
    2.)  Neither repeating groups nor arrays are allowed as values; and
    3.)  All entries in the same column must be of the same kind; and
    4.)  Each column has a unique name; and
    5.)  The order of the columns is insignificant; and
    6.)  No two rows in a table may be identical; and
    7.)  The order of the rows is insignificant.

Two of these restrictions (that the column order and row order don't matter) are actually permissions.  Go ahead and store the data with the columns (attributes of the entity) in any order and the records in any order (alphabetically, chronologically, or even "scattered"). We'll be able to easily retrieve the data from the table later on by using the names of the columns (that's why each column has a unique name) and the unique primary key for the row (that's why rows can't be duplicates of each other).

We don't glue multiple values together in the cells of the table, because it would take extra time to unglue these values when we later need only one of them.  We ensure that all entries in a column are of the same type, so that when we read the data later, we know what type to expect. For example, if we had a column named "DateOfBirth," and we usually placed a date in this column but occasionally only placed the day of the week (such as Thursday), when we later retrieved the data, we would have no idea which Thursday a person was born on.  Storing the complete date as a date data type for entries in the "DateOfBirth" column (instead of a word or phrase) will prevent errors (and guesswork).

Explaining Gunny's Guidelines:

Gunny's Guideline #1:  Why don't we store data in the same arrangement that we intend to display it?  Each record in each table should represent a single theme, but we usually display data with multiple themes, such as information about a  purchase order along with information about the vendor who delivered the purchase.  If we store information about the vendor, such as the vendor's address and phone number, in each purchase record, then when the vendor changes his phone number, we'll have to update his phone number in every single record in order to ensure that future reports have the correct information.  This would violate Gunny's Rule #3, "Make entering and updating data as quick and as easy as possible."  Store all of the vendor's information in one place only:  in the vendors table. That way, we'll only have to update each piece of vendor's information in one place.

When we have to make updates in multiple places, the chances are high that we will either introduce a typographical error or miss at least one record that needs to be updated.  What if the Gunny needed to contact a vendor, so opened your database and read one of the records that contained the "old" phone number, because it was one of the records that hadn't been updated in the purchase orders table?  Gunny would be wasting time dialing the wrong number, in violation of Gunny's Rule #1.  Whatever you do, don't waste the Gunny's time, or you'll find yourself on Mess Duty with the rest of the maggots or running laps around Lake Bandini, equipped with a 125 lbs. ALICE pack on your back and carrying all the extra bandoliers of ammo for your entire fire team.


What You Just Learned

You've learned that databases model the processes and information of the real world and, if built correctly, can offer quick and efficient means of making informed business and organizational decisions. If designed incorrectly, databases can offer a plague of problems that may be costly to fix or replace.

You've learned that a database is made up of tables to store the data and  appropriate logic for governing how that data is stored and used.  You've learned that each properly designed table is a relation. You've learned that a relation uses columns and rows as building blocks for a table that describes a set of entities (nouns such as "people" or "customer orders"). You've learned the seven restrictions required to make a two-dimensional table a relation.  (Okay.  Five restrictions and two permissions.)

You've learned that each record can be uniquely identified by its primary key, and that a related record in a foreign table can be identified by storing the foreign key as a column (attribute) for the current record in the table.

 

That's it for today's lesson.  Come back for Training Day 2.

 

You can read about your favorite expert contributors.

Copyright © 2005 - 2006 Q-Built Solutions. All rights reserved.

 Top

Disclaimer:  This article is not an official curriculum of the U.S. Marine Corps or any military service.  The information contained in this article is solely the work of the author and is intended to provide readers with instructions on how to build a relational database with the proper techniques to avoid common problems that plague poor database design.

 

Visitors since 26 Nov. '05: 

 

[MS Access] [Free Stuff] [Articles] [Gem Tips] [How-To Tips] [Links] [Products] [Scorecard] [About Us] [Search]

Sign up for PayPal and start accepting credit card payments instantly.