Data overload
May 30, 2013 8:49 AM
I'm designing and managing a very ambitious data visualization app for the arts and culture sector, and am trying to find the best way to organize the data types and attributes. Excel just ain't doin' it anymore.
We have a multitude of data sources, each with its own set of attributes. The data types interact w/ each other in the (web-based) software. (For example, you might want to filter Data Type A by an attribute of Data Type B.)
Each data type comes from a different source and has unique characteristics/considerations. Some data types relate to each other, and some don't. In some cases, we have data that goes back years, all of which will be accessible to the user; in other cases, we don't.
My task is to figure out how to make this data available to the user in an intuitive interface, which means that I have to understand in great detail how it all relates to each other. Spreadsheets can only show relationships between two things well, but I need to look at relationships between three or more pieces/types of data, plus somehow show that some individual data types & attributes have the capability of being displayed through time.
A relational database seems like overkill to me, not the least because what I really need to be able to do is *see* all of these data types and attributes together, but if there's a GUI that will allow me to do that, that might work. I've never used MS Access before, but I do have access (har) to it and the will to learn.
If you have suggestions or ideas that are even tangentially related, it would be helpful. Or pointers to articles that discuss a similar problem. Or, really, anything. Thanks!
Best answer: "A relational database seems like overkill to me"
With the caveat that I work with them every day, what you're describing sounds exactly like what a relational database is good at: taking disparate data sets and tying them together where they match up, so you can look at "Type A only", "Type B only", "Type A and B" or any other combination of A-Z you care to tie together. Again, this assumes a decent database design, but it's hard to give any help with that without knowing what at least some of the data is and what format it's in right now.
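To make "tying them together where they match up" concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `venues` and `events` tables, their columns, and the sample rows are all made up for illustration; they stand in for "Data Type B" and "Data Type A" from the question and say nothing about what the real data looks like.

```python
import sqlite3

# In-memory database; every table, column and row here is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE venues (
    venue_id INTEGER PRIMARY KEY,
    name     TEXT,
    city     TEXT
);
CREATE TABLE events (
    event_id INTEGER PRIMARY KEY,
    venue_id INTEGER REFERENCES venues(venue_id),
    title    TEXT,
    year     INTEGER
);
""")
conn.executemany(
    "INSERT INTO venues VALUES (?, ?, ?)",
    [(1, "Gallery One", "Chicago"), (2, "Stage Two", "Boston")],
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [(10, 1, "Print Fair", 2012),
     (11, 2, "Jazz Night", 2012),
     (12, 1, "Sculpture Show", 2013)],
)

def events_in_city(city):
    """Filter 'Type A' (events) by an attribute of 'Type B' (venue city)."""
    return conn.execute(
        """
        SELECT e.title, e.year
        FROM events AS e
        JOIN venues AS v ON v.venue_id = e.venue_id
        WHERE v.city = ?
        ORDER BY e.year, e.title
        """,
        (city,),
    ).fetchall()

print(events_in_city("Chicago"))
```

The shared `venue_id` column is the only thing that relates the two data types; data types that don't relate to each other simply have no such column and are queried on their own.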
posted by yerfatma at 9:29 AM on May 30, 2013
Response by poster: I just yanked my poor husband, who does this kind of thing all the time, from his work to ask him about this, and he said something similar to Sublimity. I'm going to define a semantic schema, place each data attribute into that schema, and then figure out what sort of visualization each combination of the schema can/will generate.
Once the schema is in place and I know that all the data fits nicely in it, engineers will implement this in a relational database.
posted by nosila at 9:51 AM on May 30, 2013
Looks like you need to be reading about Database Normalization.
posted by oceanjesse at 11:22 AM on May 30, 2013
What you are describing is a 'Data Warehouse'.
If you have a limited budget, you can build a data warehouse using standard database tech. Just be aware that much of what you read about relational database design will be wrong when applied to a data warehouse, i.e. you should denormalise the data instead of normalising it.
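As a rough illustration of what denormalising means here, the sketch below (Python's built-in sqlite3 again) copies two normalised tables into one wide table so that reporting queries need no joins. The table names, columns and rows are hypothetical, and a real warehouse design would involve more than this.

```python
import sqlite3

# In-memory database; every table, column and row here is hypothetical.
wh = sqlite3.connect(":memory:")
wh.executescript("""
CREATE TABLE venues (venue_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE events (event_id INTEGER PRIMARY KEY, venue_id INTEGER,
                     title TEXT, year INTEGER);
""")
wh.executemany(
    "INSERT INTO venues VALUES (?, ?, ?)",
    [(1, "Gallery One", "Chicago"), (2, "Stage Two", "Boston")],
)
wh.executemany(
    "INSERT INTO events VALUES (?, ?, ?, ?)",
    [(10, 1, "Print Fair", 2012),
     (11, 2, "Jazz Night", 2012),
     (12, 1, "Sculpture Show", 2013),
     (13, 1, "Photo Week", 2013)],
)

# Denormalise: repeat the venue attributes on every event row.
wh.execute("""
CREATE TABLE event_facts AS
SELECT e.event_id, e.title, e.year,
       v.name AS venue_name, v.city AS venue_city
FROM events AS e
JOIN venues AS v ON v.venue_id = e.venue_id
""")

def events_per_city_year():
    """Count events by city and year using only the wide table (no join)."""
    return wh.execute("""
        SELECT venue_city, year, COUNT(*)
        FROM event_facts
        GROUP BY venue_city, year
        ORDER BY venue_city, year
    """).fetchall()

print(events_per_city_year())
```

The trade-off is duplicated venue data on every row in exchange for simpler, faster read queries, which is usually what a visualization front end wants.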
posted by Lanark at 2:11 PM on May 30, 2013
posted by Sublimity at 9:05 AM on May 30, 2013