What are some standard databases used for websites?
December 17, 2010 7:17 AM   Subscribe

What are some standard ways for a website to interact with a large database? I'm looking for suggestions of tools or programming languages.

I'm writing a specification for a small job in a university and don't know the exact terms to use. The job will require someone to make a few large databases that can be queried by a website. The website is quite basic and uses html and javascript.

I was thinking of writing something like this:

"Looking for a student to work on developing several large databases that will be used to visualize data on a webpage. Experience with Xdatabase, Ydatabase or similar, and their interaction with javascript is preferred."

What would Xdatabase or Ydatabase be? It is all numeric data, and probably will be a few hundred thousand entries. The data is currently in ascii file format (similar to an image, or huge matrix). I don't know if creating something like this is feasible or crazy as I don't have much experience in this field.

thanks hivemind!
posted by a womble is an active kind of sloth to Computers & Internet (13 answers total) 2 users marked this as a favorite
mysql, oracle would be some databases.

but honestly, with just a few hundred thousand records, simply querying the flatfile will keep things simple and should be plenty adequate performance wise assuming the programmer knows their game.
posted by H. Roark at 7:20 AM on December 17, 2010

Where will the data come from? Will it be updated often? How?

If it's just a static dataset, I'd check to see what happens when you load it all as JSON into a browser, and assuming that was fine, seconding 'flat file'. You can serve the JSON from the same site as the website, and have the client's browser pull it down whenever doing a visualization.

There aren't really good options for matrix-like data. It doesn't fit well into a relational model, it's not "object oriented", it doesn't belong in a NoSQL key-value store, really, for data like that, you want an underlying representation that matches the data, either a matrix or a sparse matrix. I'd love to see small, OSS DBs that provide a matrix data type.
posted by doteatop at 7:33 AM on December 17, 2010 [1 favorite]

You can read the flatfile just fine. An actual database wouldn't really be needed unless you plan to make updates to the dataset through the web site, or change it frequently.

The SQL database systems are standard, and you'll make your choice based on what system your server is running. If it's a Windows server, you can use MSSQL; if you're on Linux or Mac OS go with MySQL. The languages used by both of these are virtually identical.

You're not going to be doing database queries (or flatfile reads) in JavaScript alone. You'll want a server-side application of some sort. If you're on Windows this could be ASP.Net or ColdFusion (for example); on Linux or Mac OS you might go with PHP or Perl. It sounds like you've got a server already running, so you should really figure out what system it uses and which environments are most compatible with it.
posted by The Winsome Parker Lewis at 7:33 AM on December 17, 2010 [1 favorite]

Response by poster: I currently am just using data stored in JSON format for testing functionality, but I had assumed that for a large amount of data, this would be become quite slow as I would need to load it all when the page loads (perhaps this is not correct). Is it possible to have a flat file that is queried remotely? I do want to limit how the queries run so that all of the data cannot be downloaded at once, as there are some restrictions in how the data can be shared (this is a more minor concern, but one I need to consider).

It is a static dataset - there will be occasional updates, but at the weekly/biweekly scale rather than seconds.

I have a test server, but plan on getting hosting specifically for this project.
posted by a womble is an active kind of sloth at 7:50 AM on December 17, 2010

Don't use the term "large" when referring to the database. Those numbers you list make them "medium" at best.

MySQL or Microsoft SQL Server are pretty common. The first one is free, the second is usually covered under a Microsoft services agreement that already would exist at most universities.
posted by blue_beetle at 7:52 AM on December 17, 2010

Find out what your final host will support. If the project is hosted by the university, find out what you have available to you. This will save a lot of headaches later. You'd hate to get a ton of work done and end up with everything written in something they won't install.

As for databases, I'm partial to PostgresSQL. MySQL is fine for most purposes, but we really like PostGIS for geographical information. It's also nice to be running a database not owned by a company known for crazy licensing costs, sadly this now includes MySQL.
posted by advicepig at 8:17 AM on December 17, 2010

Microsoft SQL Server is surprisingly nice, and there's a free version that's sufficient for 99% of what you'd want to do.

(And I say this as a PHP programmer who prefers working with Unix systems)

If you're crunching this data outside of a web-based environment, MATLAB is almost certainly the easiest way to do it. It's a language designed for manipulating and visualizing large matrices. However, it's somewhat cumbersome to build web interfaces for MATLAB projects.
posted by schmod at 8:24 AM on December 17, 2010

How about: store the data in JSON still, but write a simple nodejs access layer for the server side that will serve slices of the data in response to a client request. Pure javascript website (server and client), keep the JSON storage, lighter on the browser, and it addresses your concerns about sharing the whole datafile at once.
posted by doteatop at 8:54 AM on December 17, 2010

I use informix, mysql and microsoft SQL server, all on a daily basis. If I was choosing a new database platform I'd probably plump for MySQL - it's free, but you probably want some sort of support option - and it's probably worth paying for this to make sure you have backups and adequate recovery plans.

A few hundred thousand records is at best a medium sized database, and MySQL should be able to cope with that with absolutely no problems at all. Our informix database probably handles around 8 million new records per month by comparison. (I love informix and most of the people who end up with informix love it, but it's hard to recommend something that few people use as you won't be able to find people with the skillset).

In terms of what you use to access that database, I'd probably go with something like php - Yes, it's simple, but it works because a website is just what it's been designed for.
posted by BigCalm at 8:56 AM on December 17, 2010

Another vote for using PHP to access the database. As far as importing the data into a mysql database, for example, you can import an ascii file preferably as CSV(comma separated values).
posted by JJ86 at 9:09 AM on December 17, 2010

Best answer: Experience with Xdatabase, Ydatabase or similar, and their interaction with javascript is preferred."

In-browser javascript doesn't interact directly with a database. What you'd need in addition to the database itself is some serverside gateway code to pass information back and forth from the database to the client (and convert it into JSON along the way.) That gateway could be in perl, PHP, ruby, or really any number of other languages (even serverside javascript, if you use Node or the like); it doesn't really make a tremendous amount of difference. The decision should be based on how easy it will be to find others able to maintain the code, which means you should choose something popular and simple, not something awesome and cutting edge. Especially if you're hiring students. (This probably rules out Node, for example.)

If you're on a unix host, I'd agree with many above that MySQL and PHP are your best bets. If you're on a windows host, I am unqualified to make good recommendations.

Is it possible to have a flat file that is queried remotely?

One big flat file, not really, unless your data set is really small.

It's certainly possible to break up your data into a bunch of separate flat files, one per record, and do ajax requests for them individually as needed -- whether this would be a reasonable strategy here depends on what kind of queries you need to do against the data. (If the requests are all "get this single easily identifiable chunk of data and do something to it in the browser", it'd be fine. If it's more like "find all the records that match X,Y, and Z but not W", then you probably need a real database.)
posted by ook at 9:55 AM on December 17, 2010 [1 favorite]

You should only use things that solve problems. It doesn't sound like a database solves any problems for you, and creates more.
posted by gjc at 8:53 AM on December 18, 2010 [1 favorite]

Response by poster: Thanks everyone, for your answers. I've a better idea how to frame this now.
posted by a womble is an active kind of sloth at 8:00 AM on December 20, 2010

« Older Excel sort for relative placement   |   It's not an idiom. And it's not a phrase. And it's... Newer »
This thread is closed to new comments.