Join 3,512 readers in helping fund MetaFilter (Hide)


Best way to organize a list of tags and their hierarchy in a database of some sort? (More details & examples inside)
October 24, 2011 1:16 PM   Subscribe

Best way to organize a list of tags and their hierarchy in a database of some sort? (More details & examples inside)

I'm working on a project that is focused on tagging images and things (don't worry too much about this). Essentially I want to be able to save my list of tags so that at some point I can use it in conjunction with a MYSQL database. However! I don't know how to program, so right now everything is just in a text file. But I'd like to not have to retype everything down the line, so is there some sort of tagging program (or is there a simple way I could make a database with the right parameters)?

Essentially, it could boil down to something like this:

Level 1: Type
Level 2: Detailed Type
Level 3: Very Detailed Type

For example:

1: Tomato
2: Red
3: Heirloom

or

1: Lamborghini
2: Gallardo
3: Superleggera

There may even be a fourth level for some items.

Eventually, I want people to be able to enter this information from dropdown menus that are already populated with the tags that I'm currently writing in a text file. For all I know, using excel and various levels of columns is what I need to do, but maybe there is a better answer.
posted by darkgroove to Computers & Internet (5 answers total) 2 users marked this as a favorite
 
I don't know what you're asking. Are you asking for an appropriate data structure for organizing this sort of information, or a specific method for storing these things in MySQL?

The data you want to store is hierarchical which makes a relational database (like mysql) not ideal for it, which isn't to say it can't be done that way.

In MySQL you could do it in a single table with three columns labelled "tagIDNumber", "tagName" and "parentTagIDNumber"

Your top level tags are retrieved with (I forget exact SQL syntax for this):
select tagIDNumber, tagName from tagTable where parentTagIDNumber == NULL

Then for each successive level down the tree, you do:
select tagIDNumber, tagName from tagTable where parentTagIDNumber == (tagIDNumber that you just looked up in the previous level.)

Alternatively, using a hierarchical model you just do:
Object {
  Car {
    Lamborghini {
      Gallardo {
        Superleggera {}
      }
    }
  }
  Tomato {
    Red {
      Heirloom (which aren't all red, BTW) {}
    }
  }
}
And at each level you create a dropdown menu showing all the items visible at that level of the tree.
posted by tylerkaraszewski at 2:27 PM on October 24, 2011


There's a lot of work you COULD do to support what you are talking about. For instance: are people allowed to enter new categories / subcategories? Are you ever going to rename them? Are you ever going to reorganize them? Are you ever going to want to show them in languages other than English? Let's just answer no to all of those for now and move on.

I would encourage you to keep it simple while you figure out your project, but keep the data structured. A text file is fine, but if you keep it in some format (ANY structured format) it will be more useful later. Most notably, your primary goal seems to be "keep a list, but never have to retype it". A noble goal. I suggest JSON. It's super easy to parse, is practically a csv already, and will be useful for javascript and other web technologies.

Your top level categories can just be a list
["Tomato", "Lamborghini", "Cats on the Internet", "Metafilter"]
Then for each top level category create your subcategories. I would do this in a dictionary, so:
{"Tomato": ["red", "green", "black"],
  "Lamborghini": ["diablo", "aventador", "gallardo", "vroom"],
  "Cats on the Internet": ["black", "cute", "cuter", "kitttieeeeeeee!"],
  "Metafilter", []}
Sub-sub categories get trickier. You could have the same format as subcategories, but then what about collisions? For instance, both tomatoes and cats may be black (ps throw out black tomatoes immediately). If you care about this, then you need a more formal hierarchy. As tyler has alluded to, if you're storing this data in mysql, maybe every category and subcategory has an ID. But for now, you're just writing a text file. So I suggest you either choose "don't care about this" or concatenate your categories with a character you won't be using like "."

So now we have:
{"Tomato": ["red", "green", "black"],
  "Lamborghini": ["diablo", "aventador", "gallardo", "vroom"],
  "Cats on the Internet": ["black", "cute", "cuter", "kitttieeeeeeee!"],
  "Metafilter", [],
  "Tomato.red": ["overripe","large","small"],
  "Tomato.black": ["rotten"],
  "Cats on the Internet.black": ["in the sun", "in shadow"],
}
Now you can keep this file in whatever order is convenient while editing it, and it's practically already javascript. Link to json documentation, but all you need to know is: a list is ["value1", "value2"], a dictionary is {"key":"value", "key":"value"} and whitespace isn't important.
posted by Phredward at 2:49 PM on October 24, 2011 [2 favorites]


I'm going to assume that you only need 3 or 4 levels of heirarchy and every Red - Heirloom will always be labelled as a Tomato first. If thats the case you can just build 3 or 4 tables in whatever database you are going to use:

Table_MainType:
MainType_id, MainType
01, Tomato
02, Lamborghini

Table_DetailType:
DetailType_id,DetailType,MainType_id
100, Gallardo,02
101, Red,01

Table_VeryDetailType:
VeryDetailType_id,VeryDetailType,DetailType_id
50, Superleggera,100
51, Heirloom,101

I suggest you start building this in MS Access, you can always export the data to MySQL later.
Notice that (unlike Excel) you don't have enter the data in any particular order.
To join the tables together in a query link
MainType_id = MainType_id
and
DetailType_id = DetailType_id

I would avoid using the terminology 'tagging' what you have here are structured categories, when people talk about Tags (like the tags when posting to MeFi) they are usually referring to unstructured data, i.e. one big table with all the tags in it.
posted by Lanark at 3:09 PM on October 24, 2011


Sub-sub categories get trickier.

They don't if you do it right:
  Tomato {
    Red {
      Heirloom (which aren't all red, BTW) {}
      Cherry {
        TastyAndRound {}
        NotSoTastyAndShapedLikeAPumpkin {}
      }
      KindThatGoesOnHamburgers {}
    }
  }

  Colors {
    Red {
      Maroon {}
      Pink {}
      RockInUtah {}
      SovietFlagBackground {}
      Cherry {
        ActualColorOfAFruit {}
        ColorOfA1950sCar {}
      }
    }
  }

posted by tylerkaraszewski at 3:16 PM on October 24, 2011


I really wish I stuck with computer programming. I just had to put that out there.

Based on what tylerkaraszewski said, it looks like MYSQL may not be the best database. The only reason I picked that one is because 1) I know it is 'a database' and 2) I don't know any other databases.

I'm going to assume that you only need 3 or 4 levels of heirarchy and every Red - Heirloom will always be labelled as a Tomato first. If thats the case you can just build 3 or 4 tables in whatever database you are going to use.

Correct. I honestly don't think I would need four levels for all but the rarest of occasions. And you would not select something from the second or third level unless you selected the first level. So if someone was going to tag a photo as a red heirloom, they would first have to tag it as a Tomato, and only then would they be given the option to add more detail with further levels of tags.

There's a lot of work you COULD do to support what you are talking about. For instance: are people allowed to enter new categories / subcategories? Are you ever going to rename them? Are you ever going to reorganize them? Are you ever going to want to show them in languages other than English? Let's just answer no to all of those for now and move on.


Eventually I would like to do all those things. But a programer can bake it into the code later I'm sure.

It sounds like JSON could be the winning formula here. I'm sure there will be javascript on the site to pull this information as well, so that adds another point for JSON. I will look into it. Thank you!
posted by darkgroove at 7:42 AM on October 25, 2011


« Older Please help me identify the ar...   |  Will the enhanced Home Afforda... Newer »
This thread is closed to new comments.