Best way to store arbitrary form data?
January 24, 2008 5:23 PM   Subscribe

What are best practices for storing arbitrary form data? More inside...

Let's say you have a bunch of online forms - I'm talking thousands here - and they could all have different fields that make them up. Is there any good way to store all this data in a database without making tables specifically for each form? By good I suppose I mean a few things:

1) Easy and efficient to report on that data in meaningful ways.
2) Not a horrible waste of database space
3) Not horrible db best practices (potentially creating thousands of tables on the fly doesn't sound so good to me, but maybe it's the way to go)

Speculation on how sites like Wufoo do it is welcome. (just using them as much as I'd like to is not an option for me) I feel stuck in a relational database state of mind. Here are my thoughts so far.

a) 1 "results" table to store all form responses, mapping a large varchar field containg the data in maybe csv, or json format to a particular form schema.
b) 1 "results" table to store all form responses, using 1 table with some large # of columns, all varchars, each column containing 1 form value, the row itself mapping back to a form schema. (sounds sucky from a data storage perspective and not so nice for reporting)
c) create a "results" table for each form schema - will result in as many result tables as there are forms. benefits are the data all will get stored in appropriately sized columns, and reporting should be easy.

What is this problem called? Surely smarter people than me have wanted to do this before and succeeded. Right? Googling around has gotten me not much besides enterprise solutions like InfoPath/InfoJet and Documentum but I am not at all interested in looking at these. I feel like I don't know the right term to search for.

Thanks for your thoughts/experiences/help! I am open into looking into non-relational storage solutions like couchdb if anyone thinks it might be useful, as long as they aren't too enterprisey.
posted by zackola to Technology (6 answers total) 1 user marked this as a favorite
 
Best answer: The database tables don't need to be associated with the forms directly--the form information can be embeded at a higher level. For example, suppose you had the following tables:

Form
form_id
form_name

TextField
textfield_id
textfield_name
form_id

TextValue
val: string
textfield_id


NumberField
numberfield_id
numberfield_name
form_id

NumberField
val: int
numberfield_id

This would let you build forms with any number of different field types and then associate values with those fields.
posted by null terminated at 5:56 PM on January 24, 2008


(in the above case you'd also need a FormValue table that allowed specific values to be associated with one instance (or result) of a form submission).
posted by null terminated at 5:57 PM on January 24, 2008


The flexible way to store arbitrary user-submitted form data is to store each response value in a separate row, associated with the form response it's part of and the field it was entered in. This SQL has some Postgres-specific bits but should demonstrate the idea:
create table forms ( 
  form_id serial primary key, 
  form_title varchar(128)
);

create table fields ( 
  field_id serial primary key, 
  form_id integer references forms, 
  field_title varchar(128), 
  field_type varchar(16)
);

create table responses (
  response_id serial primary key,
  time_submitted timestamp default now(),
  ip_address inet
);

create table values (
  field_id integer references fields, 
  response_id integer references fields default currval('responses_response_id_seq'),
  integer_value integer, 
  real_value real, 
  text_value text
);
What you lose is the ability to use SQL queries normally in the reporting as if each form had its own custom table.
posted by nicwolff at 8:57 PM on January 24, 2008


nicwolff - actually, you could get that ability with the right join, or?

SELECT values.response_id
FROM forms, fields, values
WHERE form_title='my form'
AND field_title='my field'
AND text_value LIKE 'searchterm%'
AND values.field_id=fields.field_id
AND fields.form_id=forms.form_id

Seems like that should work for querying for a search term in 'my field' in 'my form' where value is like 'searchterm%', or?
posted by syzygy at 10:02 AM on January 25, 2008


Ack - I hate that ", or" habit of mine. Comes from being immersed in German for the last 7 years, ja?
posted by syzygy at 10:03 AM on January 25, 2008


Oh sure syz you can still use normal SQL to get the response values one-by-one or in simple groups, but it gets trickier to select, say, all the records whose answer to question 1 was < 10 and whose answer to question 2 was "blue". Not impossible, but not fun.
posted by nicwolff at 10:51 AM on January 25, 2008


« Older Help me hack in to my Font folder!   |   Need cell phone advice for a trip to russia. Newer »
This thread is closed to new comments.