How can I have my computer analyze a small dataset to find patterns?
Here is a very crude, semi-entertaining analogy of what I'm trying to do (completely unrelated to my actual project, trust me).
Say I'm sitting in a cubicle and I keep smelling an odor. I decide to graph a bunch of factors that might be related. The data would look like this, commas separating what was noted each hour:
(Time): 10am, 11am, 12pm, 1pm, 2pm, 3pm, 4pm, 5pm
Toilet flushes heard: 2, 1, 1, 6, 2, 1, 4, 3
Taco Bell meals brought in: 1, 1, 3, 4, 2, 1, 1
Boss's dog in room: 1, 0, 0, 1, 1, 0, 1, 1 (1=in room, 0=not)
Coworkers in room: 5, 2, 1, 6, 3, 2, 2, 4
Febreze squirts heard: 2, 4, 1, 1, 3, 2, 3, 5
Odor strength: 5, 2, 6, 3, 0, 3, 6, 4
I would want to have the computer match the different sets with one another, trying different normalizations and so forth, to determine what factor results in the smallest standard deviation with "odor strength"... i.e. what factors might be most at play (especially in combination). Maybe it's the Nacho Supremes and no Febreze, not the activity in the restroom.
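For illustration, here is a minimal sketch of that kind of comparison in plain Python. It assumes the simplest reading of "match the sets with one another": compute the Pearson correlation of each factor with odor strength and rank the factors by the absolute value. The helper names (`pearson`, `rank_factors`) are made up for this example. The Taco Bell row above lists only seven values, so the sketch skips it rather than guessing which hour is missing.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)

def rank_factors(factors, target):
    """Return (ranked, skipped): factors sorted by |correlation| with target,
    and the names of factors whose length does not match the target."""
    ranked, skipped = [], []
    for name, values in factors.items():
        if len(values) != len(target):
            skipped.append(name)  # cannot align hour by hour, so leave it out
            continue
        ranked.append((name, pearson(values, target)))
    ranked.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return ranked, skipped

# Hourly observations copied from the post (10am through 5pm).
odor = [5, 2, 6, 3, 0, 3, 6, 4]
factors = {
    "Toilet flushes": [2, 1, 1, 6, 2, 1, 4, 3],
    "Taco Bell meals": [1, 1, 3, 4, 2, 1, 1],  # only 7 values in the post
    "Boss's dog in room": [1, 0, 0, 1, 1, 0, 1, 1],
    "Coworkers in room": [5, 2, 1, 6, 3, 2, 2, 4],
    "Febreze squirts": [2, 4, 1, 1, 3, 2, 3, 5],
}

ranked, skipped = rank_factors(factors, odor)
for name, r in ranked:
    print(f"{name:20s} r = {r:+.2f}")
for name in skipped:
    print(f"{name:20s} skipped (length mismatch)")
```

A value near +1 or -1 means the factor rises or falls with the odor, and a value near 0 means no linear relationship. With only eight hourly observations, any ranking like this is suggestive at best, and it looks at one factor at a time rather than at combinations.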
Obviously in Excel I can look at all this data and graph it, but it's tough to compare sets. Is there a cheap shareware program that can do stuff like this? I'm not looking for Mathematica or anything pricey... this is just for an experiment I'm doing.
(and yes, the correct solution for the analogy is "quit your job!")
posted by rolypolyman to science & nature (27 comments total)
Weka is a suite of AI tools you can use, and it's open source. I've only really heard of people using it as a platform for other AI research, although it does include a lot of basic functionality.
posted by delmoi at 2:20 PM on March 14, 2006