February 27, 2011 6:46 AM

There are "poetry generator" engines available on the Web; is there anything that comes close to taking (English) poetry as input and computer-generating the scansion pattern that might result from somebody reading it?

I dare to dream of a free tool on the web that would be able to break prose down into iambs, trochees, anapests, dactyls, spondees, and pyrrhics. Maybe there's something out there that, if you type in "the rain in spain falls mainly on the plain", could return "the RAIN in SPAIN falls MAIN-ly on the PLAIN", or even a half-baked suggestion for emphasis patterns. Computer-poets, know of any scansion tools?
posted by shipbreaker to Technology (6 answers total) 7 users marked this as a favorite
I'm not aware of any polished software that does scansion, and having once worked on the problem myself I can tell you that, while it's doable, dividing lines "correctly" into feet (in a way that matches a good reader's intuitive ear for meter) is not as trivial a task as you might think — there are a surprising number of debatable boundary cases, and no clear best algorithm for doing it. Charles Hartman's book Virtual Muse has a chapter on computer scansion which may be the closest you'll come to a description of how to do it, though I recall finding his actual algorithm a bit dubious.

Given that it's not a fully solved problem, if you want scansion software, your best bet might be to try to interest a computational linguistics grad student in the problem; while doing a really good job of it is hard, an unpolished rough cut should be easy enough to whip up.
posted by RogerB at 10:27 AM on February 27, 2011

Ray Kurzweil's Cybernetic Poet "reads" poetry by one or more authors and creates a model used to generate original poetry based on language analysis techniques and a variation on Markov modeling, helping you find rhymes, alliterations, turns of phrase. It doesn't look like it's been updated in a while, but it may be worth downloading the free version for experimentation:

posted by iconoclast at 11:31 AM on February 27, 2011

Link to previous...

Cybernetic Poet
posted by iconoclast at 11:36 AM on February 27, 2011

Since it's still pretty difficult even to break input text into syllables perfectly, I have a feeling you're going to be waiting a while for something that goes further and marks the stresses on those syllables. There is a Ruby implementation of some of these features, but...
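To illustrate why syllable splitting is hard to get perfect, here is a minimal Python sketch of one common shortcut: counting runs of vowel letters. It is not the Ruby implementation mentioned above; the function name and the choice of example words are assumptions made for illustration.

```python
import re


def naive_syllable_count(word: str) -> int:
    """Count runs of consecutive vowel letters (a, e, i, o, u, y).

    This is a deliberately crude stand-in for real syllabification.
    """
    return len(re.findall(r"[aeiouy]+", word.lower()))


# Words taken from the Wordsworth poem quoted later in this thread.
for word in ["motion", "diurnal", "course"]:
    print(word, naive_syllable_count(word))
```

The shortcut gives 2 for all three words. That is right for "motion", but "diurnal" has three syllables (the "iu" run hides a syllable break) and "course" has one (the final "e" is silent), so it misses in both directions.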
posted by rhizome at 12:44 PM on February 27, 2011

Stress patterns for polysyllabic words are mostly known quantities. It's easy to find electronic sources that mark them, and since they're the backbone of a line, they don't tend to vary much from the dictionary form, making them tractable to search and replace except for numerous cases where stress distinguishes minimal pairs (e.g. ally n. vs. ally v.). A bigger problem is words of one syllable, which in poetry can seriously go either way. A lexicon like this one will tend to mark them either all stressed or all unstressed, which is useless.

Consider this Wordsworth poem:
A SLUMBER did my spirit seal;	 
  I had no human fears:	 
She seem'd a thing that could not feel	 
  The touch of earthly years.	 
No motion has she now, no force;
  She neither hears nor sees;	 
Roll'd round in earth's diurnal course	 
  With rocks, and stones, and trees.
I have a 30-line Perl script that gets lines 4, 6, 7, and 8 right. "Seem'd" (3) and "roll'd" (7) aren't recognized, but they aren't marked incorrectly, so we just need a bigger dictionary for those. It's the function words that are a problem. Basically, my script assumes that every word on a short list of monosyllabic function words is unstressed, while all monosyllabic content words are stressed, and it lets the lexical database do the rest. But as a result, "did" (1), "had" (2), "could" (3), and "has" (5) are all the opposite of what they should be, which is kind of a mess to sort out.
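For anyone who wants to try the idea, here is a rough Python sketch of that kind of heuristic. It is not the Perl script itself: the tiny LEXICON, the FUNCTION_WORDS list, and the scan_line name are all hypothetical, and any word found in neither table is simply assumed to be a monosyllabic content word.

```python
import re

# Hypothetical mini-lexicon for polysyllabic words:
# one character per syllable, "1" = stressed, "0" = unstressed.
LEXICON = {"slumber": "10", "spirit": "10", "earthly": "10"}

# Hypothetical short list of monosyllabic function words, always unstressed.
FUNCTION_WORDS = {"a", "the", "of", "my", "did"}


def scan_line(line: str) -> str:
    """Return one stress mark per syllable for the whole line."""
    marks = []
    for word in re.findall(r"[a-z']+", line.lower()):
        if word in LEXICON:
            marks.append(LEXICON[word])   # dictionary stress pattern
        elif word in FUNCTION_WORDS:
            marks.append("0")             # function word: unstressed
        else:
            marks.append("1")             # assumed monosyllabic content word
    return "".join(marks)


print(scan_line("The touch of earthly years."))    # 010101
print(scan_line("A slumber did my spirit seal;"))  # 01000101
```

Line 4 comes out as clean iambs, while line 1 comes out as 01000101 instead of the iambic 01010101, because "did" sits on the function-word list.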

Presumably, you could be more tentative in marking monosyllabic words, try forcing particular meters onto them, and see which ones result in a best fit to arrive at a reasonable guess. But you'd always have trouble with content words like "falls" in your Spain example (my script gets that one wrong, and I think "on" should be stressed there, which it gets wrong too). And once you start getting half-way reasonable answers, you'll often find good poems are intentionally doing unexpected things with words and meter that don't fit and that would have been more fun to discover for yourself.
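The best-fit idea could be sketched roughly as follows, in Python. This is an assumption-laden illustration, not a tested scansion method: the FEET table, the "?" convention for unmarked monosyllables, and the function names are all made up for the example, and the meters are limited to strict repetitions of a single foot.

```python
# Candidate meters, each given as one repeating foot ("1" = stressed).
FEET = {"iambic": "01", "trochaic": "10", "anapestic": "001", "dactylic": "100"}


def template(foot: str, length: int) -> str:
    """Repeat a foot and cut it to the given number of syllables."""
    return (foot * (length // len(foot) + 1))[:length]


def best_fit(partial: str):
    """Fit a partially marked line to each meter and keep the best one.

    partial has one character per syllable: "0", "1", or "?" for a
    monosyllable left undecided. Ties go to the meter listed first in FEET.
    """
    scores = {}
    for name, foot in FEET.items():
        meter = template(foot, len(partial))
        # Count only the syllables we claimed to know and got wrong.
        scores[name] = sum(
            1 for p, m in zip(partial, meter) if p != "?" and p != m
        )
    best = min(scores, key=scores.get)
    meter = template(FEET[best], len(partial))
    filled = "".join(m if p == "?" else p for p, m in zip(partial, meter))
    return best, filled


# "A slumber did my spirit seal": only "slumber" and "spirit" are marked.
print(best_fit("?10??10?"))  # ('iambic', '01010101')
```

With every monosyllable left open, the Wordsworth line fits the iambic template with no conflicts. The Spain line ("?????10???") fits both the iambic and the anapestic template equally well, which shows how little the dictionary alone pins down when almost every word has one syllable.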

But maybe these observations help you in doing it by hand ... Basically, start by marking the polysyllabic words, and then the monosyllabic content words, and then fill the others in appropriately as a pattern emerges. An imperfect script really doesn't save you time.
posted by Monsieur Caution at 3:30 PM on February 27, 2011 [2 favorites]

CMU Pronouncing Dictionary? I did a bit of Googling around the Python library Natural Language Toolkit, which includes the CMUdict.
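As a rough illustration of what that dictionary provides, here is a small Python sketch that reads entries in CMUdict's plain-text format, where each vowel phoneme ends in a stress digit (1 primary, 2 secondary, 0 unstressed). The SAMPLE entries were typed by hand for this example and should be checked against the real dictionary file; the function names are hypothetical, and the sketch deliberately avoids depending on NLTK.

```python
# Hand-typed sample in CMUdict's format (an assumption; verify against
# the actual file before relying on these pronunciations).
SAMPLE = """\
;;; comment lines start with three semicolons
SLUMBER  S L AH1 M B ER0
MAINLY  M EY1 N L IY0
DIURNAL  D AY0 ER1 N AH0 L
"""


def parse_cmudict(text: str) -> dict:
    """Map each lowercased word to its list of phonemes."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blanks and comments
        word, *phonemes = line.split()
        entries[word.lower()] = phonemes
    return entries


def stress_pattern(phonemes: list) -> str:
    """Keep only the stress digits, one per vowel phoneme (so one per syllable)."""
    return "".join(p[-1] for p in phonemes if p[-1].isdigit())


lexicon = parse_cmudict(SAMPLE)
for word in ["slumber", "mainly", "diurnal"]:
    print(word, stress_pattern(lexicon[word]))
```

This covers the polysyllabic words well enough; the monosyllables discussed above would all come back with a single digit, which is where the dictionary stops being helpful.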
posted by asymptotic at 6:07 AM on February 28, 2011
