Meta Descriptions and Keywords for MediaWiki
October 2, 2008 8:04 PM

I have a MediaWiki wiki that I've started. Unfortunately, it doesn't seem to automatically generate meta keywords and meta descriptions from the articles. What is the best solution for this? I tried to find out how Wikipedia does it, but my Google Fu is weak.
posted by entropicamericana to Computers & Internet (1 answer total)
 
I think you're hung up on the "automatically." Computers do not understand content. They've come a long way and can parse a grammatically sound sentence into a structure, but extracting meaning is still hard.

Being MediaWiki, that sort of thing is probably done by a human, or several humans, working in tandem, like everything else at Wikipedia. Right now, they seem to just pick the categories for the meta keywords, which seems kinda weak to me.

If you wanted to do it algorithmically, I'd start by comparing the words in the article in question against the body (corpus) of words in all of the other articles, looking for the words that are rare across the corpus. Then, from that list of rare words, I'd pick the ones that occur most often in the article itself. Words that are frequent in the article but infrequent in the whole corpus would give you keywords.
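
For what it's worth, here is a rough Python sketch of that idea: score each word by how often it appears in the article, weighted down by how many other articles contain it (essentially a TF-IDF-style score, which is my substitution for the informal description above). The function names `tokenize` and `keywords`, the regex, and the smoothing in the logarithm are all my own assumptions, not anything MediaWiki or Wikipedia provides.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple alphabetic words."""
    return re.findall(r"[a-z']+", text.lower())


def keywords(article, corpus, top_n=5):
    """Return words that are frequent in `article` but rare in `corpus`.

    `corpus` is a list of the *other* articles' text (hypothetical input;
    how you pull text out of your wiki is up to you).
    """
    # How often each word occurs in this article.
    article_counts = Counter(tokenize(article))

    # In how many corpus articles each word occurs at least once.
    doc_freq = Counter()
    for other in corpus:
        doc_freq.update(set(tokenize(other)))

    n_docs = len(corpus)
    scores = {
        # Smoothed weighting: words found in every article score 0.
        word: count * math.log((1 + n_docs) / (1 + doc_freq[word]))
        for word, count in article_counts.items()
    }

    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_n]
```

You'd call it as `keywords(article_text, other_article_texts)` and join the result with commas for the meta keywords tag. Expect to tweak the scoring a lot before the output looks sensible.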

Next, retune the hell out of the algorithm. You'd probably want to add a list of stopwords, words which are thrown out of any list: a, an, and, to, from, just, like, or, and so forth.
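
A stopword filter is about the simplest part of this. A minimal sketch in Python, assuming you already have a list of candidate words; the set below is just the handful named above, and `drop_stopwords` is a made-up helper name, so a real list would be much longer:

```python
# Only the example stopwords from the post; extend this for real use.
STOPWORDS = {"a", "an", "and", "to", "from", "just", "like", "or"}


def drop_stopwords(words, stopwords=STOPWORDS):
    """Return `words` with any stopwords removed, ignoring case."""
    return [w for w in words if w.lower() not in stopwords]
```

Run the candidate keyword list through something like this before picking the top few.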

A description would be far harder. That would basically be the same thing as asking a computer to, given a PDF, generate a book report for you. If you don't see anyone else doing it yet, that's probably in the "hard" category.

I don't see a meta description in a few random pages from Wikipedia.
posted by adipocere at 4:18 AM on October 3, 2008

