CAN COMPUTERS READ POETRY PROPERLY?
February 27, 2011 6:46 AM
If there are "poetry generator" engines available on the Web, is there anything that comes close to taking (English) poetry as input and generating the scansion pattern that might result from somebody reading it aloud?
I dare to dream of a free tool on the web that would be able to break prose down into information about iamb, trochee, anapest, dactyl, spondee & pyrrhic parts. Maybe there's something out there that, if you type in "the rain in spain falls mainly on the plain", could return "the RAIN in SPAIN falls MAIN-ly on the PLAIN", or even a half-baked suggestion for emphasis patterns. Computer-poets, know of any scansion tools?
Ray Kurzweil's Cybernetic Poet "reads" poetry by one or more authors and creates a model used to generate original poetry based on language analysis techniques and a variation on Markov modeling, helping you find rhymes, alliterations, and turns of phrase. It doesn't look like it's been updated in a while, but it may be worth downloading the free version for experimentation.
posted by iconoclast at 11:31 AM on February 27, 2011
Since it's pretty difficult so far even to break input text into syllables perfectly, I have a feeling you're going to be waiting a while for something that even touches stresses on those syllables. There is a Ruby implementation of some of these features, but...
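To give a feel for why syllable splitting is hard, here is a rough Python sketch of the most obvious heuristic: count runs of vowel letters. The function name `naive_syllable_count` is made up for illustration and is not taken from any existing tool.

```python
import re

def naive_syllable_count(word):
    """Count runs of vowel letters (a, e, i, o, u, y).

    A deliberately crude heuristic: it knows nothing about silent
    vowels or about adjacent vowels that belong to separate syllables.
    """
    return len(re.findall(r"[aeiouy]+", word.lower()))

# Words where the heuristic happens to agree with the usual syllable count.
agrees = {w: naive_syllable_count(w) for w in ["rain", "mainly", "spirit"]}

# Words where it goes wrong: "poem" and "diurnal" come out one short,
# and the silent "e" in "seemed" adds a syllable that is not pronounced.
disagrees = {w: naive_syllable_count(w) for w in ["poem", "diurnal", "seemed"]}
```

Patching each failure with a special case is possible, but the exceptions pile up quickly, which is why a pronouncing dictionary is usually the more reliable starting point.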
posted by rhizome at 12:44 PM on February 27, 2011
Stress patterns for polysyllabic words are mostly known quantities. It's easy to find electronic sources that mark them, and since they're the backbone of a line, they don't tend to vary much from the dictionary form, making them tractable to search and replace except for numerous cases where stress distinguishes minimal pairs (e.g. ally n. vs. ally v.). A bigger problem is words of one syllable, which in poetry can seriously go either way. A lexicon like this one will tend to mark them either all stressed or all unstressed, which is useless.
Consider this Wordsworth poem:
A SLUMBER did my spirit seal;
I had no human fears:
She seem'd a thing that could not feel
The touch of earthly years.
No motion has she now, no force;
She neither hears nor sees;
Roll'd round in earth's diurnal course
With rocks, and stones, and trees.
I have a 30-line perl script that gets lines 4, 6, 7, and 8 right. "Seem'd" (3) and "roll'd" (7) aren't recognized, but they aren't marked incorrectly, so we just need a bigger dictionary for those. It's the function words that are a problem. Basically, my script assumes that a short list of monosyllabic function words are unstressed, while all monosyllabic content words are stressed, and it lets the lexical database do the rest. But as a result, "did" (1), "had" (2), "could" (3), and "has" (5) are all the opposite of what they should be, which is kind of a mess to sort out.
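For anyone who wants to play with the idea, here is a minimal Python sketch of that heuristic. It is not the perl script itself: the tiny `LEXICON`, the `FUNCTION_WORDS` list, and the function names are all hypothetical stand-ins, with 1 meaning stressed, 0 unstressed, and "?" meaning the word was not recognized.

```python
import re

# Hypothetical mini-lexicon (an assumption for this sketch): one digit per
# syllable, 1 = stressed, 0 = unstressed. A real run would use a full
# lexical database instead.
LEXICON = {
    "slumber": "10", "spirit": "10", "human": "10",
    "seal": "1", "fears": "1", "thing": "1", "feel": "1",
}

# Hypothetical short list of monosyllabic function words, always unstressed.
FUNCTION_WORDS = {"a", "i", "my", "no", "she", "that", "not",
                  "did", "had", "could"}

def scan_word(word):
    """Return the stress digits for one word, or '?' if it is unknown."""
    w = word.lower()
    if w in FUNCTION_WORDS:
        return "0"
    return LEXICON.get(w, "?")

def scan_line(line):
    """Scan each word of a line, ignoring punctuation."""
    return [scan_word(w) for w in re.findall(r"[A-Za-z']+", line)]

line1 = "".join(scan_line("A SLUMBER did my spirit seal;"))
line3 = scan_line("She seem'd a thing that could not feel")
```

Here `line1` comes out as "01000101", while a strictly iambic reading would be "01010101"; the one mismatch is "did". In `line3`, "seem'd" comes back as "?", so it is left unmarked instead of being marked wrongly.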
Presumably, you could be more tentative in marking monosyllabic words, try forcing particular meters onto them, and see which ones result in a best fit to arrive at a reasonable guess. But you'd always have trouble with content words like "falls" in your Spain example (my script gets that one wrong, and I think "on" should be stressed there, which it gets wrong too). And once you start getting half-way reasonable answers, you'll often find good poems are intentionally doing unexpected things with words and meter that don't fit and that would have been more fun to discover for yourself.
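A rough Python sketch of that best-fit idea, under some assumptions of mine: only four candidate feet are tried, the line is assumed to use one foot throughout, and uncertain monosyllables are written as "?" so they never count against a meter. All names here (`FEET`, `template`, `mismatches`, `best_meter`) are made up for illustration.

```python
# Candidate feet, one digit per syllable: 1 = stressed, 0 = unstressed.
FEET = {"iamb": "01", "trochee": "10", "anapest": "001", "dactyl": "100"}

def template(foot, n):
    """Repeat a foot and cut the result to n syllables."""
    return (foot * (n // len(foot) + 1))[:n]

def mismatches(pattern, foot):
    """Count known syllables that disagree with the meter; '?' is a wildcard."""
    meter = template(foot, len(pattern))
    return sum(1 for p, m in zip(pattern, meter) if p != "?" and p != m)

def best_meter(pattern):
    """Return (name, filled-in pattern, all scores) for the closest meter."""
    scores = {name: mismatches(pattern, foot) for name, foot in FEET.items()}
    name = min(scores, key=scores.get)
    return name, template(FEET[name], len(pattern)), scores

# "the rain in spain falls mainly on the plain", marked tentatively:
# function words are "?", content monosyllables are 1, "mainly" is 10.
name, filled, scores = best_meter("?1?1110??1")
```

With this input the iamb wins with a single mismatch, which is "falls", and the filled-in pattern "0101010101" puts a stress on "on". Ties between meters are possible when few syllables are known, so the result is a guess to check by ear.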
But maybe these observations help you in doing it by hand ... Basically, start by marking the polysyllabic words, and then the monosyllabic content words, and then fill the others in appropriately as a pattern emerges. An imperfect script really doesn't save you time.
posted by Monsieur Caution at 3:30 PM on February 27, 2011 [2 favorites]
CMU Pronouncing Dictionary? I did a bit of Googling around the Python library Natural Language Toolkit, which includes the CMUdict.
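In case it helps, CMUdict puts a stress digit on each vowel phoneme (0 = no stress, 1 = primary, 2 = secondary), so a stress pattern can be read straight off an entry. The Python sketch below uses a few entries typed in by hand in the shape that `nltk.corpus.cmudict.dict()` returns (word mapped to a list of pronunciations); they should be checked against the real dictionary, and the names `SAMPLE`, `stress_pattern`, and `stresses` are my own.

```python
# Hand-typed sample entries (an assumption, to be checked against the real
# CMUdict): word -> list of pronunciations, each a list of ARPAbet phonemes.
SAMPLE = {
    "rain": [["R", "EY1", "N"]],
    "spain": [["S", "P", "EY1", "N"]],
    "mainly": [["M", "EY1", "N", "L", "IY0"]],
}

def stress_pattern(phones):
    """Keep only the stress digits, which appear at the end of vowel phonemes."""
    return "".join(p[-1] for p in phones if p[-1].isdigit())

def stresses(word, entries=SAMPLE):
    """Return one stress pattern per listed pronunciation, or [] if unknown."""
    return [stress_pattern(phones) for phones in entries.get(word.lower(), [])]

mainly = stresses("mainly")
unknown = stresses("seem'd")
```

With NLTK installed and the cmudict corpus downloaded, passing `nltk.corpus.cmudict.dict()` as `entries` should work the same way. Note that this only reports dictionary stress, so monosyllables still need a separate decision.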
posted by asymptotic at 6:07 AM on February 28, 2011
This thread is closed to new comments.
Given that it's not a fully solved problem, if you want scansion software, your best bet might be to try to interest a computational linguistics grad student in the problem; while doing a really good job of it is hard, an unpolished rough cut should be easy enough to whip up.
posted by RogerB at 10:27 AM on February 27, 2011