IT Lecture Notes by Mark Kelly, McKinnon Secondary College
Databases for Dunderheads |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
To avoid confusion, I'll use the term Database Management Software (DBMS) to refer to database applications such as Access, Filemaker, SQL. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
Database Management Software (DBMS) are programs that let you
large quantities of data. Modern business would be impossible without the data-managing power of the modern electronic database. Popular databases used by average citizens include:
Less popular (yet) , but free include:
More technically-minded people often use SQL (Structured Query Language) products such as the free and powerful MySQL. Other brands are Foxpro, dBase. There are industrial-strength corporate databases with massive capacity and huge numbers of simultaneous users ; such million-dollar database manufacturers include Oracle, Lotus, Sybase. You are already on a dozen or so databases, starting when you were born and increasingly as you deal with organisations. Let's look at some data basics... |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
DATABASICS |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
The concept of a database is really simple: it's a collection of data in a table. The simplest database is a list of your friends...
Each column contains a particular type of data for each person (e.g. for any row of this database, column 1 will always contain the person's first name.) Each column corresponds to a database field. Of course, you must put the right type of data in the right field : putting a person's birthday in the first name column would make the database useless. A full collection of fields about a single person (in other words, each row of the table) is called a record. So Alice's data is stored in record 1. Database rule 1 - each field contains the same type of data (e.g. phone number) in every record. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Capabilities and limitations of databases Capabilities: DBMS can
Limitations: DBMS have trouble with
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
For those who always get rows and columns confused, remember... COLUMNS go up and down like the COLUMNS that hold up a roof. They are database fields. ROWS go from side to side like ROWS of seats. They are database records. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
If names were stored in a single field and sorted, you get a useless list like this, because the names are sorted by first name rather than surname...
Storing the name parts separately also allows you to create reports that say things like "Dear Sharon" or "Dear Miss Shingle" (rather than nasty constructions like "Dear Sharon Shingle" using the whole name.) Database Rule 2 - store people's first and last names in separate fields. If a title is stored (e.g. "Miss", "Doctor", "Sir") it goes into a third separate field.Database Rule 3 - only store one piece of data in a field. Don't combine different data in one field (e.g. addresses should be stored in 3 fields - number + street, suburb, postcode.)
When you create a new database, you have to define its fields before you're allowed to enter data. After all, where can the data be stored without fields? Typically, you have to tell the DBMS the following information about its fields: 1. The field name e.g. firstname, DOB, PayRate
2. The field type: commonly text, numeric, date/time, Boolean or calculated. Specifying the field type lets the DBMS efficiently allocate storage space. Type tips:
3. Text field length - If a field is specified as type text, you need to specify a maximum length it can be. Text length tips:
4. Validation options - you can set rules about how the value is entered. These typically include setting that the field:
Different DBMSs offer luxury options such as automatically entering values for new fields in new records such as a serial number, a timestamp, the current date, the value from the previous record, fixed data (e.g. auto-setting the customer's country to "Australia" since that will be the case for 99.9% of your customers).
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Planning a databaseAs you might have guessed already, creating a database takes some thought. It would be a big pain to create a database with a thousand people's names in one field, then decide to separate them into two fields, and have to go through 1000 names cutting and pasting. Plan the structure first, get it right, then start entering the data. It's a bit like building a house - you don't want to have to tear down a half-built house because you got the foundations wrong. A very handy tool for creating databases is a data dictionary. Another handy tool, for defining calculated fields, is an IPO chart. Details of these design tools are here.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
TABLES |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
In database talk, a table like the ones shown above is (surprisingly) called a table. A database needs at least one table. A database with a single table is called a flat-file database. Sometimes that's all you need, but databases often grow... Let's consider a business that buys parts, builds things and sells those things to customers. They know that storing data in an electronic database is far better than keeping paper records so they create a single database to record purchases by customers like so...
Already you should be able to see a problem developing... lots of repetition. The same customer name and address is being repeated, even the name of the ordered goods is repeating. This can lead to really bad wasted space, making the database much bigger than it needs to be. A more serious problem is - what happens if Goldfish Pty Ltd changes its address? To keep the database up to date and accurate, you would need to change every occurrence of their address in the database. This could be a massive and tiresome job, and any missed occurrence could result in orders not being found, or being sent to the wrong place. So, what to do? Here's the way you would probably do it if you were creating a pen and paper list...
Now, when an order comes in, you look up the customer in the customers table, find their ID and enter that in the orders table. Notice that the repetition in the orders table has been largely removed. Also, each customer's details are recorded only once in one place - the customers table. If a customer's address changes, for example, it only has to be changed once - in the customers table. But how do you know that a customer in the orders table matches a customer in the customers table? You tell the database that a field in one table is related to (or matches) a field in another table. By telling this two-table database that a customer ID in the orders table is related to a given ID in the ID field in the customers table, the database can join them together. It knows that the ID "00002" in the orders table refers to Supernova Ltd because "00002" in the ID field of the customers table lines up with "Supernova Ltd". Notice we are using the ID field in both tables to create the relationship. An ID in one table always exactly matches the same ID in the other table. This ID field is therefore known as a key field, and it has to be unique. That means that every customer must have an ID all to itself that it will always keep and never be used for any other customer. This is why on most bills you will see a customer number or account number - it identifies you uniquely in their database and regardless of how many "Fred Smiths" there are in the database, each can be uniquely identified by the table's key field: the unique identifying customer account number. Database rule 4 - key fields must be unique. A value in a key field (e.g. 00002) may only appear once in the entire table. When you have a database like this with two or more related tables, you have (drumroll please...) a relational database! The process of removing duplication of data by creating related tables is called normalisation. Don't ask me why. Just remember it. Is our sample database fully normalised yet? Has all repetition been removed? Not quite... look at the goods ordered field. There's repetition there too that can be removed by ... yep. Another related table. Let's create a new table called products and every product we sell is listed once, and only once. We need another key field to match a product in the orders table with a product in the products table. We'll call it ProdID.
Ah! How nice it is to be fully normalised! No unnecessary repetition! Using the orders table, you can get all the information you need about an order by looking up the customer's details in the customers table (using the ID field to match customers between the tables) and you get product info by looking up the product's ProdID in the products table. Easy as pie! Tip: Notice that the field in the orders table is called "Product ID" but the matching field in the products table is called "prodID"? This is perfectly OK, but some people like to use the same field name in both related tables just to make it clear that the fields are in fact related. Note: Microsoft's Access database is so concerned about having unique key fields in tables that when you create a table it insists on you nominating a key field that is unique for every record in the table! If you choose not to name one, it creates one for you! Access is a demanding wife when it comes to relationships! Still with me? Good. Let's wander onward...
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
DEFINING A TABLE |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
|
When you create a table in a database, you have to tell the database some stuff so it can create the fields. These can be called field properties. Typical properties you define are
*Field Length - some field types have automatic lengths, such as Date, Byte, Short Integer and Yes/No data types. Others, such as text, can vary. Setting a maximum length for a field's data lets the database store data more efficiently and not assign lots of disk space for tiny pieces of data. **Calculated fields are not entered as data - they are automatically calculated by the database using existing date, much like Excel can calculate new values based on existing values. You need to enter a formula to make the calculation happen. The formulas actually look a lot like those in Excel or Visual BASIC. (The "&" operator joins pieces of text together to create new text values.) It is really calculated fields that give databases their massive power and usefulness.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
ACCESS DBMS |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Access databases contain objects called queries and reports.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
FILEMAKER PRO DBMS |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Filemaker Pro databases approach similar tasks slightly differently.
|
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Back to the last page you visited
Created March 29, 2006
Last changed: April 30, 2008 10:20 AM
IT Lecture notes (c) Mark Kelly, McKinnon Secondary College