Help Plan for Text to Speech Dictation Program

Hi All,

As you know, I’ve been recording and uploading dictation files as I need them. I spend a few hours doing as many as I can before life intervenes. Some days (after four attempts to record a passage at 60 wpm), the computer text-to-speech program is very tempting. The current computer method works, but I’d like to automate it further.

The plan is for you to get your own copy of the speech engine, Cepstral. This solves all the licensing problems. It’s $30 for personal use, or free if you don’t mind “Please register me” messages.

Would you be comfortable downloading a program from us? You input text and a speed, and it calls Cepstral to create the sound files. This gives the most options for speed calibration.

It will probably be an old-fashioned text-only question-and-answer interface. It’s faster to program and the resulting program file size is much smaller.

The other option is for you to upload a passage to our site. We create several text files, including a .bat file, and you download them to your own machine. You double-click on the .bat and it calls Cepstral to do the rest. All the files would be readable by any text editor or word processor, but only a programmer would be able to understand them. (Probably only a programmer over 40, since .bat files aren’t popular these days.)
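For the curious, here is a rough sketch of how the site might build that .bat file. It is only an illustration, not the actual program. It assumes Cepstral's command-line tool is called "swift" and takes -n (voice), -f (text file), -o (output file) and -p speech/rate=... options; please check those against the Cepstral documentation. The voice name, file names and rate numbers are made up for the example, and the rate is in the engine's own units, which may not be words per minute.

```python
# Assumption: the engine's command-line tool is "swift" and accepts
# -n (voice), -f (input text), -o (output wav) and -p (parameter).
# Verify these against the Cepstral documentation before use.
ENGINE = "swift"


def batch_lines(passage_file, voice, rates):
    """Return the lines of a .bat file that renders one passage at several rates."""
    stem = passage_file.rsplit(".", 1)[0]
    lines = ["@echo off"]
    for rate in rates:
        out_file = f"{stem}_{rate}.wav"
        lines.append(
            f'{ENGINE} -n {voice} -f "{passage_file}" -o "{out_file}" '
            f"-p speech/rate={rate}"
        )
    # Keep the window open so the user can read any error messages.
    lines.append("pause")
    return lines


def write_batch(path, lines):
    """Write the lines with Windows (CRLF) line endings."""
    with open(path, "w", newline="") as handle:
        handle.write("\r\n".join(lines) + "\r\n")


if __name__ == "__main__":
    # Hypothetical passage file, voice name and rate settings.
    for line in batch_lines("lesson12.txt", "Allison", [150, 170, 190]):
        print(line)
```

Because the result is plain text, you could open the .bat in any editor and see exactly which commands it will run before double-clicking it.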

How important is exact speed? So far, it’s trial-and-error, and voices vary by 20%. We have several options.
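One way to replace the trial-and-error is sketched below. It is a guess at an approach, not what we have built. It assumes the spoken speed changes roughly in proportion to the engine's rate setting, so each measurement suggests a corrected setting. The measure function is a stand-in: in a real program it would create the sound file at a given setting and return its length in seconds.

```python
def actual_wpm(word_count, duration_seconds):
    """Words per minute for a passage of word_count words lasting duration_seconds."""
    return word_count * 60.0 / duration_seconds


def next_rate(current_rate, measured_wpm, target_wpm):
    """Scale the rate setting, assuming speed is roughly proportional to it."""
    return current_rate * target_wpm / measured_wpm


def calibrate(measure, word_count, target_wpm, start_rate,
              tolerance=0.02, max_tries=5):
    """Adjust the rate until the measured speed is within tolerance of the target.

    measure(rate) must return the duration in seconds of the generated file.
    If max_tries runs out, the last proposed (untested) rate is returned.
    """
    rate = start_rate
    for _ in range(max_tries):
        wpm = actual_wpm(word_count, measure(rate))
        if abs(wpm - target_wpm) <= tolerance * target_wpm:
            break
        rate = next_rate(rate, wpm, target_wpm)
    return rate


if __name__ == "__main__":
    # Pretend voice that speaks 20% faster than its rate setting claims.
    def fake_measure(rate, words=120):
        return words * 60.0 / (rate * 1.2)

    print(calibrate(fake_measure, word_count=120, target_wpm=60, start_rate=60))
```

With a voice that runs 20% fast, the first try at a setting of 60 comes out at about 72 wpm, and the correction lands on a setting of about 50, which is then confirmed on the second try.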

(by Cricket for everyone)


5 comments
  1. More thinking about calibration. The easiest and most accurate approach is for you to download the program. It could test several parameters for each passage.

    The online version can't call the speech engine directly, so you'd have to stand in the middle and tell it the results of each group of experiments. We'd also keep records of the test results, so eventually it would get it right the first time.

  2. Here's how I've been doing it. I read from the shorthand, only after reading it a few times for accuracy, into a Sony digital recorder. I then play this back in Express Scribe, which was free software. This allows me to play the dictation at a comfortable speed. I have found that I read at a certain pace, which I work out from the numbers at the end of the lesson and the length of the file. I can then choose the playback percentage. I also have a transcription pedal if I need it.

    My concern with what I am seeing in your post is the amount of energy going into proper dictation files that could be used for actual dictation. It seems that the goal would be to move well beyond those speeds ASAP. Maybe I am misunderstanding, though.

  3. Oooh, Express Scribe looks nice.

    How low a playback percentage does it have?

    Will it save sound files for a player, or would I be tied to the computer? Most of my shorthand time is away from the computer.

    The actual TTS dictation program is probably another 2 hours of work for my husband, including user testing, or 4 for me if I learn another programming language, which is a good exercise for me. (They keep discontinuing the ones I learn.) That assumes we don't keep adding features. After that, I can generate dictation files almost as fast as I can type, and do it in a noisy environment. Currently, it takes recording time (including retakes) plus another 5 minutes per passage to get all the speeds.
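The playback-percentage arithmetic from comment 2 can be sketched as follows. This is an illustration only: it assumes the numbers at the end of the lesson are word counts, and that the player's percentage scales speed linearly (100% is the recording as read). The figures in the example are made up.

```python
def natural_wpm(word_count, recording_seconds):
    """The reader's own pace, from the word count and the length of the recording."""
    return word_count * 60.0 / recording_seconds


def playback_percent(word_count, recording_seconds, target_wpm):
    """Playback percentage that turns the natural pace into the target pace."""
    return 100.0 * target_wpm / natural_wpm(word_count, recording_seconds)


if __name__ == "__main__":
    # Hypothetical passage: 200 words read aloud in 2 minutes (100 wpm).
    print(playback_percent(200, 120, target_wpm=60))
```

A 200-word passage read in two minutes is 100 wpm, so dictation at 60 wpm means playing it back at 60%.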
