Can A Dragon Roar Remotely As A Bear In The Woods?

by Shepard Gorman

Dragon NaturallySpeaking 11.5 Premium is the latest iteration of a product that started in Waltham, Massachusetts 14 years ago. After experiencing steady growth, the original company was bought out by a Belgian holding company that engaged in questionable accounting practices and was eventually forced to sell it to what is now Nuance Communications, a publisher of voice recognition software, optical character recognition software and paperwork management systems.

This new edition has quite a number of advantages. First and foremost, its publisher claims an improved accuracy rate of 99%. This means that experienced users will get only one or two errors in a given page of dictation, not counting any homophones like “bawled” and “bald”. From a user's perspective, the strangest part of using this software is that one quickly forgets whether the user is training the program to recognize speech or whether the program is training the user to talk to it. The improved accuracy of this version can be seen in the initial training session, in which the program is familiarized with the user's voice. The original version of the program required more than an hour of continuous reading. Fourteen years later, Version 11.5 can be up and working quite well with about five minutes of dictation. Moreover, the accuracy improves even further with continued use.

The NaturallySpeaking program has continuously extended its usefulness, becoming a virtual hands-free control system not only for dictation but also for the whole Windows™ operating system. A handy, context-sensitive sidebar, introduced with the last version, makes the recently expanded set of available commands even easier to use. For example, the operating system can be commanded by voice to open a subfolder like "My Pictures".

What does “remote roaring” mean? How about a low-effort typed transcript of your entire lecture? For this somewhat lazy peripatetic professor, the ability to transcribe a lecture is very valuable. How else could every single one of my brilliant bon mots be committed to the written record without great tedium? No longer is tethering oneself to a computer necessary in order to review a previously oral presentation. Reviewing a lecture to transform it into something more permanent, like soporific PowerPoint slides, also takes much less effort, since reading is still much faster than listening. It is obvious as well that using a typescript of a lecture for review and rewriting takes considerably less effort than dictating it twice or keyboarding it into a word processor or presentation package.

NaturallySpeaking has always worked most accurately when a direct, wired microphone is plugged into a microphone jack or a USB port. In fact, microphones are usually bundled with the program. However, the program can work with other types of input. For example, you could patch a recording made with a digital voice recorder into the program. About a decade ago, one of the first portable digital audio recorders was included in one version of the program in an effort to encourage this type of use. In this reviewer's opinion, it was a wretched disaster. The transcription accuracy was extremely poor, and the effort of importing audio files and having them recognized as real information rated somewhere between bailing the ocean and counting the grains of sand on the beach. Happily, all that seems to have changed with this version. First, several manufacturers, notably Sony, Samson and Olympus, produce broadcast-quality, high-fidelity digital audio recorders currently priced under $200. Second, significant changes in the program interface have made this process much easier, if not quite effortless. In fact, NaturallySpeaking Version 11.5 even accepts the seemingly ubiquitous iPhone and iPad as remote wireless microphones.

Having said all of this, the process is not "seamless". A number of program features could still use improvement. In my opinion, the biggest stumbling block is the inputting of files from a digital audio recorder or DAR (no connection whatever with a certain women's historical organization). The article you are currently reading was dictated entirely either using a USB headset microphone or via transcription from a Sony ICD SX 712 DAR (about $120 street price). We have also had good success with the Samson Zoom products, which are similarly priced.

To date, the process of importing files for transcription is as follows:

1. Open Dragon, if it is not already in its normal toolbar mode.

2. Open Sony's Sound Organizer program.

3. Connect the SX 712 audio recorder.

4. Find the correct folder and file on the recorder.

5. Right-click and choose "Open in Dragon".

6. Click "Start voice recognition".

7. Transcribe.

8. If you don't like the basic WordPad Windows application for *.rtf files, open a word processor to "clean up" the files, so "two", "too" and "to" are seen as English words rather than as an arithmetic problem.

While this process is seemingly lengthy, practice makes these steps quite rapid. But the results of the transcription from the digital recorder lack the accuracy that experienced voice recognition software users have come to expect from a wired microphone. The most frustrating part is that there is no way to really improve the recognition accuracy after the initial setup. Dragon uses customized user information files it calls "profiles" to improve its understanding of the dictator's voice, style, etc. Herein lies the difference between the wired microphone and the wireless devices. All devices undergo "profile" training during the initial setup, which consists of reading a known passage to the device so that the program can reduce transcription errors by learning more about the user's grammar, vocabulary and cadence. While the training for a wired microphone seems the same as for a recorder, the big distinction is that the user can always "retrain" the wired microphone, but the "profile" for the DAR cannot be so refined. In short, without a profile that can be refined, the program cannot improve its "intelligence" and stop repeating the errors unique to the user and the dictation device. So, while this program is certainly not a bear to use, remedying this shortcoming would go a long way to allow this Dragon to roar wherever it wanted. Not to polarize things, but a few improvements in "profiling" would make this program a honey!




