Tuesday, October 4, 2011

Speak and be heard

20111004 speechcast
Obviously, there are issues in the text. A few dropped words here and there, and "nano rhino" is an excellent malapropism. I would prefer numbers and ordinals to be spelled out ("2 daughters", "but 1st"). This feels like it could be a configuration option somewhere, but frankly, I haven't even begun to plumb the depths of the documentation. I'm also not sure how to introduce capitalization in the middle of a sentence, though the software is clever enough to pick up some proper nouns on its own: Google, and Elizabeth, and its own name.

Speaking of names, there was a question of how it would handle proper names, especially character names. Elizabeth suggested using uncommon substitute words in the text ("rutabaga") and then running a find/replace operation afterwards. This isn't a bad idea. Just to test things, here's a list of baby names in the U.S., pulled from the Social Security Administration's web site:

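| Popularity | Male name | Dictated  | Different? | Female name | Dictated  | Different? |
|------------|-----------|-----------|------------|-------------|-----------|------------|
| 1          | Jacob     | Jacob     |            | Isabella    | Isabella  |            |
| 2          | Ethan     | Ethan     |            | Sophia      | Sophia    |            |
| 3          | Michael   | Michael   |            | Emma        | Emma      |            |
| 4          | Jayden    | Jason     | Yes        | Olivia      | Olivia    |            |
| 5          | William   | William   |            | Ava         | Ava       |            |
| 6          | Alexander | Alexander |            | Emily       | Emily     |            |
| 7          | Noah      | Noah      |            | Abigail     | Abigail   |            |
| 8          | Daniel    | Daniel    |            | Madison     | Madison   |            |
| 9          | Aiden     | Stephen   | Yes        | Chloe       | Chloe     |            |
| 10         | Anthony   | Anthony   |            | Mia         | Nina      | Yes        |

| 100        | Brian     | Brian     |            | Rachel      | Rachel    |            |
| 101        | Bentley   | Bentley   |            | Mya         | Maye      | Yes        |
| 102        | Alejandro | Alejandra | Yes        | Rylee       | Riley     | Yes        |
| 103        | Sean      | Sean      |            | Katelyn     | Caitlin   | Yes        |
| 104        | Nolan     | Nolan     |            | Ellie       | Ellie     |            |
| 105        | Riley     | Riley     |            | Isabelle    | Isabel    | Yes        |
| 106        | Kaden     | Kayden,   | Yes        | Vanessa     | Vanessa   |            |
| 107        | Kyle      | L         | Yes        | Lilly       | Lily      | Yes        |
| 108        | Micah     | Mica      | Yes        | London      | London    |            |
| 109        | Vincent   | Vincent   |            | Mary        | Mary      |            |
| 110        | Antonio   | Antonia   | Yes        | Kennedy     | Kennedy   |            |

| 250        | Corbin    | Corbin    |            | Alondra     | A longer  | Yes        |
| 251        | Simon     | Simon     |            | Jazmin      | jazzman   | Yes        |
| 252        | Clayton   | Clayton   |            | Breanna     | Rihanna   | Yes        |
| 253        | Myles     | Miles     | Yes        | Quinn       | Quinn     |            |
| 254        | Xander    | Xander    |            | Christina   | Christina |            |
| 255        | Dante     | Dante     |            | Kyla        | Kyler     | Yes        |
| 256        | Erik      | Eric      | Yes        | Adalyn      | paddling  | Yes        |
| 257        | Rafael    | Rafael    |            | Fiona       | Fiona     |            |
| 258        | Martin    | Martin    |            | Kaydence    | cadence   | Yes        |
| 259        | Dominick  | Dominick  |            | Allyson     | Alison    | Yes        |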

In general, it did better than I expected, and may do better still with additional training on my part. Homophones will surely give the software fits, though. It claims to be context-aware, which may be why it did so well on these name lists. But unless you're planning on writing about a longer jazzman named Breanna Jazmin, search/replace should be your friend.
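For anyone who wants to script the substitute-word trick rather than run find/replace by hand, here is a minimal sketch in Python. The placeholder words and the character names they map to are made up for illustration, and `restore_names` is a hypothetical helper, not anything that ships with the dictation software.

```python
import re

# Hypothetical mapping: the word you dictate -> the name you actually want.
PLACEHOLDERS = {
    "rutabaga": "Alondra",
    "kumquat": "Jazmin",
}

def restore_names(text, placeholders=PLACEHOLDERS):
    """Replace each whole-word placeholder with its character name, ignoring case."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(word) for word in placeholders) + r")\b",
        re.IGNORECASE,
    )
    # Look the matched word up in lowercase so "Rutabaga" and "rutabaga" both work.
    return pattern.sub(lambda m: placeholders[m.group(1).lower()], text)

draft = "Rutabaga smiled at kumquat. The rutabagas were ready."
print(restore_names(draft))
```

The word-boundary match means the plural "rutabagas" in the sample is left alone, which is the argument for choosing placeholders that would never appear in the story on their own.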


Cameron said...

Mike, this is a fascinating concept. I just checked out the website and the software looks promising.

James has always wanted to write a Southern Gothic Novel, as he's from south Georgia and is positively FULL of interesting stories. I think that Dragon Dictate could help a great deal with his writing endeavors. Thanks for sharing this!

mpclemens said...

Cameron, absolutely see if you can get James to join up with NaNoWriMo this year. Having a deadline really helps me make those writing projects happen, and not having one -- like when transcribing -- kills my motivation.

Elizabeth H. said...

*Love* the "nano rhino," wrong though it may be. Too funny! That term deserves to be heavily promoted in the coming weeks.

Seriously, though, this text doesn't look too bad at all. And though I'm not altogether keen on the idea of hearing my own words aloud, it *is* an excellent editing tool. Reading aloud can really help pinpoint awkward sentences.

Thanks for the review!

mpclemens said...

I think the blog entry would have been even more lucid and useful if I'd been able to hear myself think. Note to self: dictate novels early in the morning.

It is possible to correct on-the-fly, too. Say "STRIKE THAT" and the software backs up to your last pause, and gives you a chance to re-enter. There are also commands for word-level selection and navigation, but I haven't played with them yet. Just getting a rough copy electronically will be awesome.

Ryan Adney said...

Interesting. Very interesting. On another completely unrelated note, Mike, I completely lost your email address and I wanted to send you a little missive. Would you mind terribly shooting off an email to tryanpa@cox.net? I have a question.

Mike Speegle said...

Ha! Until I reached the end of the paragraph, I thought that "nano rhino" was intentional. Kinda like an elephant in the room?

I'm impressed, though, that voice-rec programs have come so far. I remember the one packaged in with Msoft Office a few years ago, and how abominable it was.

notagain said...

When I was doing Nano Rhino ;-) last year, the chef at the next desk liked the idea of it and thought he would dictate into software while exercising. I immediately thought of the toughest part - punctuation. How is it with punctuating dialogue? Could it be trained to recognize Victor Borge's Phonetic Punctuation?

mpclemens said...

@notagain: Punctuation is said aloud. The software comes with a number of pre-set commands, including those to control the computer ("Save file", "Open Browser", and so on). To dictate:

"Oh, John!" said Martha.
"Oh, Martha!" said John.

you say:

Open quote Oh comma John exclamation mark close quote said Martha period new paragraph Open quote Oh comma Martha exclamation mark close quote said John period

It seems like a lot, but it goes pretty quickly once you get used to reading that way. Most notably, there's no need to pause between words: just read naturally (although clearly) and plan for a cleanup pass later.
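To make the mapping between the spoken form and the finished text concrete, here is a toy sketch in Python. It is only an illustration of the idea, not how the dictation software works internally. The command list is limited to the phrases used in the example above, and `render` is a hypothetical name.

```python
# Spoken commands from the example above; the real software has many more.
# Each entry maps to (kind, text): "open" marks hug the next word,
# "close" marks hug the previous word, "break" starts a new line.
COMMANDS = {
    ("open", "quote"): ("open", '"'),
    ("close", "quote"): ("close", '"'),
    ("exclamation", "mark"): ("close", "!"),
    ("new", "paragraph"): ("break", "\n"),
    ("comma",): ("close", ","),
    ("period",): ("close", "."),
}

def render(spoken):
    """Turn a string of spoken words and punctuation commands into text."""
    words = spoken.split()
    out = []
    glue_next = True  # True when the next token attaches without a leading space
    i = 0
    while i < len(words):
        pair = tuple(w.lower() for w in words[i:i + 2])
        if pair in COMMANDS:            # two-word command (or a final one-word one)
            kind, text = COMMANDS[pair]
            i += 2
        elif pair[:1] in COMMANDS:      # one-word command
            kind, text = COMMANDS[pair[:1]]
            i += 1
        else:                           # ordinary dictated word
            kind, text = "word", words[i]
            i += 1
        if kind in ("close", "break") or glue_next:
            out.append(text)
        else:
            out.append(" " + text)
        glue_next = kind in ("open", "break")
    return "".join(out)

spoken = ("Open quote Oh comma John exclamation mark close quote said Martha period "
          "new paragraph Open quote Oh comma Martha exclamation mark close quote "
          "said John period")
print(render(spoken))
```

A toy like this treats every "period" or "comma" as a command, so it cannot take those as literal words.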

I don't know about dictating while exercising, but it would certainly be an efficient use of time!

Elizabeth H. said...

"Could it be trained to recognize Victor Borge's Phonetic Punctuation?"

Notagain--I asked Mike a similar question in the original Google+ conversation. Love Victor Borge!

One thing I wonder is if it can distinguish between different types of dashes and hyphens. I'm addicted to em dashes, and while it'd probably do me good to lighten up on them, I'm not sure I'd want to go cold turkey.

mpclemens said...

Not sure how the software handles hyphens, either (or em-dashes, en-dashes, and so on). There does appear to be a customization mode, which addresses the question of teaching the software how to properly recognize special names and words. If you were sadistic, I suppose you could train it with the phonetic punctuation.

To be perfectly honest, I am a terrible transcriber, and the pages that I keyed in were still filled with errors and strangeness that required multiple passes to clean them up later. The software looks to be on par with that experience. It won't be turning out press-ready pages, but it should reduce the hassle of keying everything in, and also address the issue of trying to get a computer to read hand-scribbled notes and marginalia, which I am prone to use on my original draft. OCR can't possibly process words that I myself can barely read, and as I have a tendency to spread my edits between lines and across pages, entering them into the computer is a non-linear process.

I'm still going to do a scan on our office copier as a backup of the draft, pre-edits, but now I'm not dreading the aftercare project nearly as much. I might actually have something digitized and readable by Spring.