Microsoft Speech Server

By | July 14, 2003

Microsoft Beta to Make ’em Talk

Microsoft has released a public beta of their Speech Server and a new beta of their Speech Application SDK.

Microsoft had previously teamed up primarily with Intel to propose a new standard called SALT that is somewhat competitive with VoiceXML. As of now, I contrast the two as:

SALT: useful for speech-enabling web applications
VoiceXML: useful for web-enabling speech applications

While this is an oversimplification, it reasonably reflects their current usage. Both VoiceXML and SALT-based speech applications follow a similar pattern.

  1. Prompt the user
  2. Interpret the user’s response
  3. Act on the response

The action will often be to play/speak a new prompt.

The VoiceXML or SALT prompting tag will specify either a recorded audio file or text that is synthesized by a text-to-speech engine. The user’s response is always interpreted in the context of a grammar, which specifies the allowable responses. Multiple utterances (yeah, uh-huh, sure, yep) will often be treated as the same response (yes). Other VoiceXML and SALT tags (although SALT relies much more on existing HTML tags) act like a decision tree to determine the following action. A series of these prompts and responses is called a dialog.
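To make the prompt/interpret/act pattern concrete, here is a rough sketch in Python rather than in VoiceXML or SALT markup. It is not how either standard is implemented. The names (`YES_NO_GRAMMAR`, `Step`, `interpret`, `run_dialog`) and the prompts are all made up for illustration, and plain text strings stand in for what a speech recognizer would return.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional

# Hypothetical grammar: each allowable utterance maps to one canonical response.
YES_NO_GRAMMAR: Dict[str, str] = {
    "yes": "yes", "yeah": "yes", "uh-huh": "yes", "sure": "yes", "yep": "yes",
    "no": "no", "nope": "no", "nah": "no",
}


def interpret(utterance: str, grammar: Dict[str, str]) -> Optional[str]:
    """Return the canonical response, or None if the utterance is not in the grammar."""
    return grammar.get(utterance.strip().lower())


@dataclass
class Step:
    prompt: str                                                 # text to play or synthesize
    grammar: Dict[str, str] = field(default_factory=dict)       # allowable responses
    transitions: Dict[str, str] = field(default_factory=dict)   # response -> next step name


# Hypothetical two-level dialog: one question, two possible closing prompts.
DIALOG: Dict[str, Step] = {
    "ask": Step("Do you want to continue?", YES_NO_GRAMMAR,
                {"yes": "continue", "no": "goodbye"}),
    "continue": Step("Okay, continuing."),
    "goodbye": Step("Goodbye."),
}


def run_dialog(dialog: Dict[str, Step], start: str,
               utterances: Iterable[str]) -> List[str]:
    """Walk the dialog and return the prompts played, in order."""
    played: List[str] = []
    heard = iter(utterances)
    name = start
    while True:
        step = dialog[name]
        played.append(step.prompt)                     # 1. prompt the user
        if not step.transitions:
            break                                      # closing prompt, dialog ends
        utterance = next(heard, None)
        if utterance is None:
            break                                      # no more input (caller hung up)
        response = interpret(utterance, step.grammar)  # 2. interpret the response
        if response is None:
            continue                                   # out of grammar: repeat the prompt
        name = step.transitions[response]              # 3. act on the response
    return played
```

Calling `run_dialog(DIALOG, "ask", ["maybe", "uh-huh"])` plays the question twice, because "maybe" is not in the grammar, and then plays the "continue" prompt, because "uh-huh" is treated as "yes".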

SALT is used primarily to mark up documents that are interpreted in a web browser on a client-side device. SALT consists of a very small set of tags that add multimodality to HTML/XHTML-based web applications.

VoiceXML is primarily used to create speech applications that run on a server and are accessed via a telephone. Although plenty of proprietary speech application languages preceded VoiceXML, it was the first widely accepted and implemented standard, and it greatly simplified the integration of speech applications with existing server-side web applications.

With Speech Server, Microsoft is clearly moving SALT onto VoiceXML’s turf. At the same time, IBM, Motorola, and Opera are proposing XHTML+Voice (a.k.a. X+V) as multimodal extensions to VoiceXML that would enable it to support the kinds of browser-based applications that SALT now supports. Although Microsoft and IBM have been teaming up a lot on web services, they are very much in opposition with respect to the important speech technology standards.

Microsoft has developed their own speech recognition engine, but is partnering with SpeechWorks to supply a text-to-speech engine. In my experience with a previous version of the Microsoft speech recognition engine, I found it to be mediocre. Its only redeeming quality was that it was a free download.

Until now, third-party interest in server-side development with SALT has been extremely tepid in comparison with VoiceXML. I wonder if Microsoft will weave some of their developer magic with this server, or if it will be like one of their many other failed experiments. Of course, they’re big enough that they can survive quite a few failures, as long as they occasionally hit the big home run. I think they will end up being a big player in speech technologies in the future, but I very much doubt that SALT will become a commonly accepted standard in its current form.

11 thoughts on “Microsoft Speech Server”

  1. syed afsar ali

    hi there,

It’s a nice document, but how can we register or become a member of MSS, and how can we download MSS or make use of it?

  2. Robert

    You can download beta 4 of the 1.0 release of the Speech Application SDK here.

    They aren’t accepting any new beta customers for Microsoft Speech Server, but you can sign up for info about it here.

  3. sadit zia khan

    hello!

    how can i free download the mss or what are the alternative of mss to make call or dail extension or receive call?

  4. Robert

    I don’t know whether or not it is a free download. I suggest searching the Microsoft website.

  5. Shan

    You can download the speech sdk from microsoft.com/speech.
    I don’t think MSS can currently be downloaded for free.
    You can send your queries to the microsoft.public.speech_tech and microsoft.public.netspeechsdk newsgroups.
    sadit, I didn’t get the second part of your question “..what are the alternative of mss to make call or dail extension or receive call”

  6. sadit zia khan

    Does anyone know of an alternative to MSS for making calls, receiving calls, dialing extensions, etc.?

  7. Robert

    At SpeechTEK today, Kai-Fu Lee (VP for Microsoft Speech Technologies) announced that they would offer a free 180-day trial version of Speech Server Standard Edition and Enterprise Edition. Info on buying the starter kit is on their website. While the Speech Server trial is free, you need telephony equipment to use it for phone calls. That’s not free.

  8. Ranganath

    We already have the Dialogic software, so we just want to try Microsoft Speech Server. But I didn’t find any free download on the Microsoft website (though Microsoft is claiming that it is free for a 180-day trial).
