Is speech recognition software worthwhile?
September 14, 2004 7:41 PM
Is speech recognition software worthwhile for a slow typist? I've heard that even the best programs have annoyingly high rates of error.
I would also bet that unless you're a very slow typist, or don't have to type very much at a stretch, your average WPM typing would exceed that of speaking.
posted by kenko at 8:51 PM on September 14, 2004
All of the speech recognition software I've ever used has mangled my sentences completely. Yeah, I'd recommend getting whatever the latest learn-to-type software is instead and spending a few hours with it -- you'd spend that long reading stuff to train speech recognition anyway.
posted by reklaw at 9:06 PM on September 14, 2004
I am in the position majick described, and unfortunately he's probably right. For me, voice recognition (DragonDictate) is a godsend. My friends and family have watched me use it, entranced, and most have tried it themselves repeatedly, but none have managed to adopt it. Voice recognition systems take a lot of patience to operate effectively, and most people simply don't want to deal with the constant correcting of errors when all they have to do is reach for the keyboard. If memory serves, something like 50% of new users give up on it.
Having said that, I should point out that the technology really works. I'm typing this answer completely hands-free. If your typing really and truly sucks for reasons beyond your control (dyslexia, carpal tunnel, etc.) and you're willing to put in the effort to train and consistently correct the voice recognition program, you'll be happy with it.
posted by Soliloquy at 9:12 PM on September 14, 2004
I'm not disabled, but occasionally get severe forearm pain and wrist numbness. ViaVoice is one of the tools I use to help manage this. I bought it very cheaply for OS X around the time I started working as an online news editor for a website; my job was to pull and rewrite 10 or more news items daily, mostly off the web, as well as developing features for the print version of the publication.
It was a huge pain to train the software, but once it was trained, it was a great boon. It was never as fast as typing, but I am an error-prone typist anyway, so my first drafts must always be extensively worked over. It made my job massively easier to do.
I found it to be of the greatest help in transcription, although I would caution against this if the transcription is for direct publication. The error rate necessarily made the transcription quite inaccurate, but it was certainly good enough to find the sense and then relisten for the exact quote.
I have geekily amused myself as well by occasionally working toward full voice control of the user interface, but it's pretty aggravating. I have also used it to directly chat into iChat with reasonable results.
Also, from what I understand, be certain to invest in a good headset mic. My copy of ViaVoice came with a USB headset mic that is certainly better than any of the computer mics I had lying around the house.
And finally, if you missed it: IBM open-sources voice-rec.
posted by mwhybark at 9:42 PM on September 14, 2004
Speech recognition software came with my tablet PC, but I agree with majick--not good enough to be useful. Here's an example, just now created:
She should have died here after;
There would have been a time for such a word.
Tomorrow, and tomorrow, and tomorrow,
Creeps in this at the pace from day to day
To the last syllable of recorded time,
And all our yesterdays have lighted fools
The way to test the depths. Out, out, briefed and all!
Life's been walking shadow, but for player
That struts and frets his HR upon the stage
And then is heard no more: it is a tale
Told by an idiot, full of sound and fury,
Signifying nothing.
(The original I was trying to recreate here, about halfway down the page.)
This was with about an hour or so of initial training when I first got the tablet. Actually pretty impressive, when you consider all the processing that must go into it, but at the same time not really good enough to be a practical alternative to a keyboard, an onscreen keyboard, or even the handwriting recognition.
posted by DevilsAdvocate at 9:42 PM on September 14, 2004
I've used ViaVoice (and, earlier, Dragon NaturallySpeaking) and I find that they add "color" to any actual writing I would try to do - if they even worked tolerably after extensive training. I find myself making concessions to the software to simplify things, and just make it go.
Not unlike what happens in SMS messages, or on unfamiliar keyboards, or on dinky thumb pads. It happens in IM clients, too, but it feels more transparent to me - probably because of long use of IM itself or similar modes of interactive texting, from BBS chat to IRC and onward.
I think I can sense a smidgen of this "color" in Soliloquy's post, but such a smidgen I wouldn't have guessed he was using voice recognition without being tipped off as we were. At most I would have scratched my head a little over the syntax - if I had cared to actually inspect the structure carefully rather than simply read and comprehend it at a typical level.
Then again, I'm not really a shining example of typical or precise syntax.
I would love good voice recognition, but I still want to eventually be able to plug my brain straight into a computer to record stream of consciousness, be it text, texture, sound, color, or whatever. Internal and external.
I highly recommend finding a keyboard you like and learning to touch type. Experiment with keyboard pitches and throws. Some people love the ergonomic split 'boards. Some can only stand straight. Some like more stagger in the rows, some like specific function key positions and groups.
A touch typing trick I use is to emboss dots into specific keys to help my fingers find them. A pin or wire held in pliers and heated over a candle and lightly applied can put a subtle but tactile bump or pattern of bumps on a well known key to establish orientation in that key group. I've used a dot on tab, home, F5, and others depending on the keyboard. And better (or older) keyboards come with such tactile cues built in.
posted by loquacious at 11:20 PM on September 14, 2004
For what it's worth, IBM contributed some of their speech recognition technology to the open source Apache Project yesterday. Perhaps getting this code in the hands of even more talented developers will speed improvements in the field.
posted by Songdog at 8:18 AM on September 15, 2004
In a (futile) effort to Stop The Pain I've taken a run at Dragon voice recognition twice (successive 'generations'). I found it very tough to adapt to. In following some of the voice recognition and RSI lists, I also learned (and to some extent demonstrated to myself) that if you are prone to RSI with keyboards/mouse you may well be prone to develop a voice-related version as well. I think a parallel may be 'addictive personality' where if it isn't one thing, it's another.
If it comes down to it in the end, I will give yet another tilt at VR. Meantime, with a Kinesis keyboard I accidentally doubled my typing speed so now I'm able to just do more in a burst, and then rest.
posted by cairnish at 10:30 AM on September 15, 2004
If you are a heavy computer user, and you would have to be to be interested in voice recognition, one possible downside is hoarseness and throat pain. You'll be talking a lot more than you're used to. Doesn't affect some people, but I have at least one former colleague who suffered from this unexpected consequence.
posted by i_am_joe's_spleen at 12:20 PM on September 15, 2004
Slightly unrelated.
Bill Gates talks speech recognition for a while.
If you've never tried to transcribe a person's speech on-the-fly, give it a shot sometime. I worked as a telephone relay operator for the deaf last summer, and a lot of people don't realize how quick the average conversational voice is. When speech recognition gets to a good point, there are going to be a few hundred telecom workers out of business, thousands of happier people in the deaf and senior communities, and a significant productivity increase for anyone who types anything.
posted by rfordh at 12:59 PM on September 15, 2004
This thread is closed to new comments.
posted by majick at 8:05 PM on September 14, 2004