Relational databases
October 26, 2005 1:06 PM   Subscribe

Resources for teaching myself about relational databases?

I work with natural history collections. I have reached the limits of what I can do with spreadsheets. I have only a vague understanding of what relational databases are, but when I tell people about the trouble I'm having juggling spreadsheets they tell me that's what I need. I am happy to learn to write code, and have some none-zero experience with PERL. I use OSX, but I have some access to pc's.
posted by Eothele to Computers & Internet (18 answers total) 2 users marked this as a favorite
Are you looking for instructional materials or books or dbs to use?

I always give a simple analogy, which might not work if these commercials are localized:

In the commercials, you see a couple standing there and the womans says something like "I want a car". All around her, 10,000 cars spring from the ground. The guy says "Red" and most of the cars disappear back into the ground and a fleet of red cars spring up. Then they voice some qualifiers "Under 30,000 miles!", "V6!", etc, while cars appear and disappear, with the pool getting smaller and smaller.

At the end, they are presented with the one car that presumably meets all their needs.

In a nutshell, that's a database query.
posted by unixrat at 1:27 PM on October 26, 2005

unixrats: That's a bunch of queries. Should be all give me a car WHERE the color's red AND the mileage is under 30,000 AND the engine's v6.

Of course you gotta bellow the words in caps to get the real effect.
posted by xmutex at 1:30 PM on October 26, 2005

Anyways, you can easily get MySQL for Windows (FOSS, $0), Learning SQL by O'Reilly, and using MS Query/Excel as a frontend/viewer.

That should give you all the tools that you'll need for some time.
posted by unixrat at 1:31 PM on October 26, 2005

xmutex: The AND is implied? It's a useful analogy, because those ))%(#)%#( AutoTrader commercials run every 10 minutes around here. :)
posted by unixrat at 1:32 PM on October 26, 2005

Depending on what you need to do, something like Microsoft Acess may be sufficient - it offers more flexibility than spreadsheets, integrates with spreadhseet tools, has graphical tools and is essentially Relational Database Lite.
posted by vacapinta at 1:40 PM on October 26, 2005

For a beginner with a bit of cash to spend, I'd suggest going with the Developer version of MS SQL Server 2000 (~$45) and Beginning SQL. "Beginning SQL" is aimed at someone that wants to do database design and simple programming, if you are looking more toward DBA stuff, I'd pick a different title.

MySQL is good, but makes learning more difficult. If you are going to bother learning to write SQL, MS Access is not worth the effort. Generally what you learn for SQL Server can be applied to any other database system you might consider once you are proficient.
posted by McGuillicuddy at 1:44 PM on October 26, 2005

I've said it before and I'll say it again, SQL Zoo is a site that teaches SQL by example.

It explains the concepts to you, then tells you to make a query and run it right on their site. You get automatic feedback about any problems with the query until you get the correct results.

I think it's one of the cleverer tech-education sites online.
posted by revgeorge at 1:49 PM on October 26, 2005

Microsoft Access is probably the most "user-friendly" database tool. One advantage of Access is that you can easily import/export Excel spreadsheets to/from your database. One spreadsheet page is equivalent to one database table, and all a relational database does is link tables together where the data in different tables relates to each other.

MySQL is free to install and use (and available for OSX), but it doesn't have the graphical hand-holding that Access does.

SQL is the language used to make queries against these databases. Google "SQL tutorial" or buy the O'Reilly book mentioned above, and while you're reading that, try some things out with Access or MySQL. I don't think it's possible to "get" relational databases without trying things out for yourself and seeing how they work.
posted by jellicle at 1:55 PM on October 26, 2005

Phil Greenspun is fun: here (RDBMS) and here ("SQL for Web Nerds").
posted by i_am_joe's_spleen at 2:01 PM on October 26, 2005

If you want to try SQL with minimal installation, SQLite may be your friend. It came installed on my iBook (as a Terminal application), and is a trivial install on the PC.

Best of all, it holds everything in one file, which seems to be portable across platforms. You can also talk to it through Perl's DBI layer.
posted by scruss at 2:02 PM on October 26, 2005

PS: you're obviously smart and have an academic bent. Go to a second hand bookstore near a university and buy a textbook on database management systems. Learn about normalisation. Learn about alternatives to SQL and alternatives to the relational model. Some CS theory provides a necessary counterweight to the Learn X in 15 seconds how-to books.

Also, if you have some Perl, you may get some value out of Alligator Descarte's Perl DBI book published by O'Reilly.
posted by i_am_joe's_spleen at 2:08 PM on October 26, 2005

Microsoft Access is a user interface (Access) and a database (Microsoft's Jet Database).

Although MySQL just provides the database part it can be paired with similarly user-friendly interfaces such as MySQL Front or MySQL Administrator.

(technically you could even use Access as the interface to MySQL, but then you'd lose access to many features).

Considering you've just moved from a spreadsheet, any database will be a big step up. I don't think you need to bite the bullet and go straight to SQL Server / PostgreSQL or worry about being tainted by poor database habits.

So far as learning databases, normalizing databases is the concept you'll need to learn. This is all about learning one-to-one, one-to-many and many-to-many relationships. You then insert and select records across tables with a language called SQL.

Really though, it's easier to learn in person so -- like when people ask about lprogramming -- find someone willing to teach you and use the software they'll help you learn.
posted by holloway at 2:26 PM on October 26, 2005

Find a book called Database Design for Mere Mortals.
posted by matildaben at 2:39 PM on October 26, 2005

Definitely go for MySQL - you could use SQLite but you never know how much information you'll need to store and the latest MySQL 5.0 has a lot of new features and improvements.

This forum has a section for SQL/MySQL newbies - reading things here might help you understand what kind of difficulty you may run into when you first start setting up/using MySQL.

Also MySQL Cookbook from O'Reilley is the best companion you can ever own - not only does it teach you about database, it provides solution to most common problems users run into.

If you want a basic SQL introduction book, try Sam's Teach yourself SQL in whatever time-frame-limit series (they have one for SQL in 10 minutes, SQL in 24 hours, and SQL in 21 days...don't ask me why you should buy SQL in 21 days when you can master SQL in 10 minutes). Or try SQL: Visual QuickStart Guide by Chris Fehily - this book is recommended for someone with no or very little programming background.
posted by grafholic at 2:53 PM on October 26, 2005

You might look at Microsoft SQL Server 2000 Developer Edition. It's $43 and you get the whole SQL server product to play around with (it's just not licensed for commercial use). You can use Access as a front end to SQL Server. In any case, knowing how more than one DB works is good and this is an affordable way to get some experience on Microsoft's platform.
posted by kindall at 2:59 PM on October 26, 2005

FileMaker seems to be the main mac application for database clients. I'm quite fond of web applications though. For example Maypole ( will do what you want, it's free, perl based and works out-of-the-box on mac osx tiger (not earlier), once you've installed mysql, or other database server thingy, enabled mod_perl and installed the various CPAN modules. The learning curve is rather steep though, although the vanilla installation is sufficient for many purposes.

Avoid Microsoft Access like the plague. It looks compellingly simple, but in actuality it's a horrendus dogs breakfast.
posted by singingfish at 3:07 PM on October 26, 2005

Second the normalizing tables thing.

In a nutshell: if you've got a spreadsheet with columns like

institution_name, exhibit_details_1, exhibit_details_2, exhibit_details_3, exhibit_details_4

you *could* just make that into a database table with the same fields. The trouble with this structure is that it's hard to search for exhibit details, it's troublesome to extend to more than four exhibits per institution, extra troublesome to add more per-exhibit details, and downright painful if exhibits are shared across institutions.

To normalize that database, you'd do these things:

1. Add a guaranteed-unique ID field to the Institutions table and remove ALL the exhibit_n fields, like so:

institution_id, institution_name

2. Make a new, separate table called Exhibits:

exhibit_id, exhibit_details

Put every item that used to be in an exhibit_details_n column in the original spreadsheet into its own row in this table; delete any duplicates.

3. Make a third table called Institution_exhibits, that exists solely to relate the other two:

institution_exhibit_id, institution_id, exhibit_id

Note that the only things that exist in the third table are (a) a unique ID to identify the table entry - this, like the unique ID's from the first two tables, is a Primary Key in database-speak; (b) sets of ID's from other tables: these, in database-speak, are Foreign Keys.

In a strict translation of your original spreadsheet, you'd end up with four rows in Institution_exhibits for each row in Institutions. Each of those four rows would have the same institution_id, and the details_id on each would identify the appropriate row in the Details table.

Now that the database is normalized, you can easily search by exhibit, or by institution, and it's no problem if an institution has only one exhibit or ninety-leventeen.

You can spot a properly normalized database by the fact that no information is duplicated except foreign keys; there's always exactly one Right Place for any given piece of real information to be.

The magic of relational databases is that you can now treat this normalized underlying structure pretty much AS IF it were your original spreadsheet - your spreadheet (or any other conceivable arrangement of your data) can get rebuilt on-the-fly as the result of a database query.

Despite its dog's breakfastness, Access is actually a reasonable place to start exploring this stuff, just because there's so much tutorial material available for it on the Web - and its form generator, report generator and query generators do work very well, and let you get some quite slick stuff going fairly quickly.

Given your programming background, by the time you've learned enough to spot the assorted ways in which Access does suck, you'll be well placed to choose another database engine and frontend that suit your needs better.
posted by flabdablet at 4:05 PM on October 26, 2005

I didn't know about SQLZoo, looks useful.

W3Schools also has a live SQL database you can play with in your browser:
posted by AmbroseChapel at 5:40 PM on October 26, 2005

« Older Ipod Video   |   searching for advertisers Newer »
This thread is closed to new comments.