Archive for July, 2005

7/28/2005: 11:49 pm: Robert: Speech, VoiceXML

Many months ago, IBM announced that they were open sourcing and donating their Reusable Dialog Components library to the Jakarta project at the Apache Software Foundation. Finally, version 1.0 of the RDC has been released.

The RDC is a JSP tag library that simplifies the development of server-side code that generates VoiceXML documents for voice and multimodal applications. The RDC originated as a bunch of static VoiceXML files. Nearly two years ago, someone in the Speech group at IBM told me that they had decided to switch to using JSPs to generate the VoiceXML documents dynamically. Dynamic generation is the only way to go for complex VoiceXML applications.
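To make "dynamic generation" concrete, here is a minimal sketch in Python rather than JSP. It is not how the RDC works internally; the function name, form id, field name, and prompt text are all made up for illustration. The point is only that the document is built from data at request time instead of being stored as a static file.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_team_form(teams):
    """Return a VoiceXML 2.0 document (as a string) asking for one of `teams`.

    Hypothetical helper: the ids and prompt wording are illustrative only.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "pick_team"})
    field = ET.SubElement(form, "field", {"name": "team"})

    prompt = ET.SubElement(field, "prompt")
    prompt.text = "Which team?"

    # One <option> per team, so the list of choices comes from data
    # rather than being hard-coded in a static file.
    for name in teams:
        option = ET.SubElement(field, "option")
        option.text = name

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_team_form(["Chivas USA", "Real Salt Lake"]))
```

Adding a team then means changing the list passed in, not editing markup by hand.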

Although my SoccerPhone and PhoneBlogger projects use static VoiceXML files, a lot of the work in those applications is being done in Python. The dialog portions of those apps are fairly simple. I’ll probably port part or all of one of these apps to use RDC to get a feel for what it’s like to develop with.

7/12/2005: 11:50 pm: Robert: Everything Else, Soccer, SoccerPhone, Speech, VoiceXML

It’s been ages since I’ve written about SoccerPhone, or even about anything at all. The last few weeks have just been too hectic. But I did find time this weekend to make a few updates to SoccerPhone, which is an automated speech application I built a few years ago so I could receive live Major League Soccer scores by phone.

One update of questionable merit was to replace some of the prompts that were being synthesized by a text-to-speech engine with audio recordings of my own voice. Using myself as the voice talent is a rather dodgy decision, and there is still quite a bit of TTS left. I’m not sure the recording effort really improved the quality of the app that much, if at all. It was fun to do the recordings, though.

Speaking of TTS, I switched from a female voice to one of the male voices that BeVocal supports. I’m now using Reed, which is a Nuance Vocalizer voice. Not only does the app sound better due to no longer switching back and forth between genders, but the TTS engine used to synthesize the Reed voice does a much better job of pronouncing names than the TTS engine used to synthesize the Jennifer voice.

I also finally got around to adding Chivas USA and Real Salt Lake to the grammar, so you can now say them at the Team Name prompt. I added FC Dallas to the grammar, but also left in their old name, the Dallas Burn.
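Keeping an old name next to a new one boils down to letting several spoken phrases map to one team. A rough sketch of that idea in Python follows; the alias table, the function names, and the choice to emit an SRGS XML grammar are my assumptions for illustration, not the actual SoccerPhone grammar.

```python
import xml.etree.ElementTree as ET

SRGS_NS = "http://www.w3.org/2001/06/grammar"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"  # serialized as xml:lang

# Hypothetical alias table: canonical team name -> phrases a caller may say.
TEAM_ALIASES = {
    "FC Dallas": ["FC Dallas", "Dallas Burn"],
    "Chivas USA": ["Chivas USA"],
    "Real Salt Lake": ["Real Salt Lake"],
}


def build_team_grammar(aliases):
    """Return an SRGS XML grammar string listing every spoken form."""
    grammar = ET.Element(
        "grammar",
        {"xmlns": SRGS_NS, "version": "1.0", "root": "team", XML_LANG: "en-US"},
    )
    rule = ET.SubElement(grammar, "rule", {"id": "team"})
    one_of = ET.SubElement(rule, "one-of")
    for phrases in aliases.values():
        for phrase in phrases:
            ET.SubElement(one_of, "item").text = phrase
    return ET.tostring(grammar, encoding="unicode")


def canonical_team(spoken, aliases):
    """Map a recognized phrase back to its canonical team name, or None."""
    wanted = spoken.strip().lower()
    for team, phrases in aliases.items():
        if wanted in (p.lower() for p in phrases):
            return team
    return None


if __name__ == "__main__":
    print(build_team_grammar(TEAM_ALIASES))
    print(canonical_team("Dallas Burn", TEAM_ALIASES))
```

Doing the old-name-to-new-name mapping on the backend keeps the grammar itself a flat list of phrases, which sidesteps platform-specific semantic tag formats.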

Another minor update was to add a dummy recognition block just before the backend query. Without this, the confirmation prompt from the previous dialog wasn’t being played until the HTTP fetch completed. Since it sometimes takes more than five seconds to get the response back, the confirmation sounded sort of odd when it was played so late.
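For anyone curious what such a dummy recognition step might look like, here is one possible shape, generated from Python. This is a guess at the general technique, not the actual SoccerPhone markup: the form id, the field name `flush`, the placeholder option, the one-second timeout, and the use of `<submit>` for the backend query are all assumptions. The idea is that VoiceXML queues prompts and plays them once the interpreter starts waiting for input, so a short-lived field ahead of the fetch gives the queued confirmation a chance to play.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"


def build_confirm_then_fetch(confirmation, scores_url):
    """Return a VoiceXML string: confirmation, dummy field, then the fetch.

    Hypothetical sketch; element ids and attribute values are illustrative.
    """
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "lookup"})

    # 1. Queue the confirmation prompt.
    confirm = ET.SubElement(form, "block")
    ET.SubElement(confirm, "prompt", {"bargein": "false"}).text = confirmation

    # 2. Dummy field: visiting it makes the interpreter wait for input,
    #    which is when queued prompts are played.
    flush = ET.SubElement(form, "field", {"name": "flush"})
    ET.SubElement(flush, "property", {"name": "timeout", "value": "1s"})
    ET.SubElement(flush, "option").text = "placeholder"
    # On silence or a mismatch, mark the field as done so the form moves on.
    for event in ("noinput", "nomatch"):
        handler = ET.SubElement(flush, event)
        ET.SubElement(handler, "assign", {"name": "flush", "expr": "true"})

    # 3. Only now issue the slow backend request.
    fetch = ET.SubElement(form, "block")
    ET.SubElement(fetch, "submit", {"next": scores_url})

    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(build_confirm_then_fetch("Getting scores for FC Dallas.",
                                   "http://example.com/scores"))
```

The cost of this approach is the extra second of silence after the confirmation, which is still much shorter than waiting out a five-second fetch before hearing anything.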