Why reinvent the SQL wheel?
December 15, 2006 8:22 AM   Subscribe

Software design question - Why do people still write PC archiving and cataloging software?

A number of recent AskMe threads have inquired about cataloging or archiving software for use on their computers to catalog their CDs, DVDs, books, etc. My question, and I'm hoping for a technical answer, not chatfilter, is why do people write this kind of software the "old way" with specialized data structures, file formats, Windows or Mac interfaces, etc.?

Unlike the software inquired about here, which requires processing in real-time, archiving software amounts to just an interface to a database, right? Wouldn't it simply be easier to take an open source database backend, like SQLite, and interface directly with that? Then the app could include a data dictionary of what is stored where in the database that would then allow users to extract the data and generate whatever reports they want using ordinary SQL.

Furthermore, why don't PC's run tiny little local webservers, so that the interface could be through the browser. This way, it you wanted to put your catalog on the web (and interface to your blog, whatever) it would be a trivial matter.

Is there a specific technical software development reason that this is not done?
posted by Pastabagel to Computers & Internet (8 answers total) 3 users marked this as a favorite
Plenty of these applications do use a database backend; it's just that they don't necessarily expose that fact to the user.

Real clientside apps can provide richer interfaces than a web app coming through a browser, particularly for things like scanning your CDs or adding cover art images.
posted by chrismear at 8:26 AM on December 15, 2006

Some companies don't want you to have easy access to data for reports and the like. That helps promote lockin and gives them a chance to sell you a reporting solution.

Another reason applications haven't adopted your choosen architecture: inertia.
posted by mmascolino at 8:48 AM on December 15, 2006

Way back in the day (Y2k! Eek!) when I was taking my databases course, the first thing my instructor told us was this: the #1 competitor to your average database system is a flat file. Why? Because you don't have to worry about setting up a database system, you don't have to come up with a database schema to make it do exactly what you want to do (when you can just write a class for it), and most importantly - you don't need to know anything about databases! Many (I'm not going to say most, but I'd say it's a large percentage) programmers don't know squat about SQL, and a fairly big chunk of those who DO know anything about SQL, probably don't know enough to do it well. Add that to mmascolino's answer, and you've probably got the basic reasons why companies don't use databases for their little dinky programs.
posted by antifuse at 9:17 AM on December 15, 2006

A DB and/or a webserver is a pretty heavy load if you just want to keep track of a few hundred records. How much memory does SQLite require and how does that compare to VI's footprint?

I can program a database but I use flat files all the time for light weight datastorage. A flat text file is also infinitely portable, you'd be hard pressed to find a system that can't open and write to it.
posted by Mitheral at 9:44 AM on December 15, 2006

Best answer: Macs contain SQLite and Apache. Some applications use them. Some apps you might expect to use them don't. For example, Apple's Mail writes your emails to files so that Spotlight can index the mail contents.

Data has to be written, stored, and retrieved in different ways for different needs. Even seemingly-similar data sets (the catalog of a collection of books and a catalog of a collection of CDs) can have vastly different data structures if the data is going to be easily savable and findable. Not only is there rarely a single best way to save the records of a stack of books, there might not be a single best way save the records of a stack of books and a stack of CDs in the same space. Scholarly books may have different metadata than fiction books. Jazz, pop, and classical CDs all have different metadata.

The problem isn't really, 'why do all these apps roll their own databases?' The scratch file an app temporarily writes an array to is, in a sense, a database table. A file system is, in a sense, a database. Any application that writes to files is using a common database (such as Apple's Mail, which is doing it for the benefit of a different database). And there you go.

But when speaking less broadly about the problem, and keeping in mind the books vs. CDs example above, every app may have a common need (storing and retrieving structured data) but have to express that need in highly specialized ways. The application might be large, but why begrudge an extra couple megs on your 100 gb disk? At that scale, storage is pretty close to free, and directly manipulating the data in a file is usually more efficient than sending those requests through a proxy. For an existing app, ditching something that works for a general-purpose DB like SQLite means losing some control and increasing complexity, as well as dealing with SQLite's own constraints.

As for handing off the application interface to a web browser, that's a ten year old flame war. One way to consider the problem in context of your question: Is it more efficient to develop an application within the native constraints of another application within an OS? Sometimes, the answer is 'yes'. But frequently the answer is 'no', and it's better to work within the operating system, where there are available libraries and toolkits for windows, behaviors, common tasks, and so on.
posted by ardgedee at 9:44 AM on December 15, 2006

Other reasons why your "just put it on the web" solution is more difficult than it sounds:

* People don't generally understand how to set up their machine as a server, from a networking and security point of view.
* A lot of broadband providers officially discourage using a home machine as a server.
* If you want your machine to have a reasonable DNS name, that's one more complicated thing to configure.
* If you're behind a NAT device, which many people are, you need to configure that device too.
* Have to leave your machine on all the time, which wastes energy.
* What happens when multiple apps want to all serve up stuff on port 80? You need some way of coordinating between them. Or, you could use a different port, but then the user would have to memorize the port number.

It's certainly possible to do all this stuff, it's just a logistical nightmare to do it in an automated way.
posted by xil at 10:37 AM on December 15, 2006

This question, I think, tends to boil down to a long-running debate within the information sciences, between using flat files and application-specific schema within them, or using a system-wide database to save structured data in a standard format. Both approaches have their advantages.

Some big iron OSes (VMS comes to mind) implemented database-like filesystems for saving structured data that many applications in the system could use, with the operating system handling things down to the record level. However, UNIX never went this route, and preferred flat (often text) files that the OS didn't get into, allowing each program to define their own format to suit their own needs. Record-level locking, etc., is all implemented on the application level and not the OS one.

I have heard (though this may be apocryphal) that the difference in approaches stemmed from UNIX being a minicomputer OS, where data was typically input from paper tape and thus tended to favor unstructured stream or 'bitbucket' files, and VMS's legacy in mainframes where information was often input from punch-cards, which favor structured data. That's before my time, but it's an interesting theory at least.

So it's far from a stupid question; I think the difficulty in database-like filesystems is making one that's flexible enough to be useful for all the different types of structured data that programmers will want to store, while still making it easier to use than just rolling your own flat-file format.
posted by Kadin2048 at 3:52 PM on December 15, 2006

I have an RSS feed reading program that does just what you suggest. (creatively called "feed reader")
posted by delmoi at 9:11 AM on September 6, 2007

« Older Save my job, please!   |   Best Collaborative Content Management Software Newer »
This thread is closed to new comments.