More details of Microsoft's entry into the speech-recognition market have been unveiled prior to its official launch next week.
Speech Server will offer speech recognition for the masses, the company claims, although it appears to have made few allowances for the more experienced players already in the market, or for the difficulty of producing such complex technology.
"Our goal is to make speech recognition technologies mainstream," said James Mastan, director of marketing for Speech Server. Microsoft intends to do that by making the product cheaper than competing offerings, as well as easier to deploy, manage, develop for and maintain, he said.
As you would expect with any Microsoft software, the system interconnects only with the software giant's own products. Developers can add speech capabilities to existing Web applications, based on Microsoft's ASP application framework, by adding code based on XML and SALT (Speech Application Language Tags) technologies using Visual Studio .Net. Speech Server takes calls and communicates with the Web server through XML and SALT and makes applications offered online available through the phone, Mastan said.
Speech Server will only run on Windows Server 2003. The Enterprise Edition needs to run on a separate physical server while the Standard Edition, designed for small and medium-sized installations, can be placed on the same hardware as the Web server. Microsoft will recommend configurations and resellers will offer fully configured systems, Mastan said.
Users will like Speech Server because it is familiar, Mastan reasoned. Developers can use Visual Studio and it runs just like any other Microsoft server product. "It is not some black box in a call center that you have to program for in some weird language and you can't maintain yourself because you don't know how it works," he said.
Bill Gates will formally launch Speech Server 2004 at the SpeechTEK conference in San Francisco, and step into a market dominated by IBM, Nuance and ScanSoft.
It won't be an easy task. "Speech applications and a voice user interface are pretty tricky to do. That may well get lost in the first version of the Microsoft marketing hype that will go out there," said Steve Cramoysan, a principal analyst at Gartner. "If you're going to use Microsoft Speech Server, use professional services people who know exactly what they're doing."
Still, Microsoft's entry into the speech recognition market is a significant event, Cramoysan said. "Microsoft will certainly shake up this market, but I think we're going to be looking at the second and third version of this product when they will become much more competitive than with this first release of the product," he said.
Nuance - fingered by Mastan as Microsoft's chief rival - agrees with the analysts and goes a step further. "Microsoft is developing an inexpensive and easy way for developers to design really bad applications," said Nuance's principal product manager, Kevin Chatow. Adding speech to Web applications may not result in usable applications, he said.
While Microsoft may like to position Nuance's product as obscure, Chatow pointed out that Nuance supports VoiceXML 2.0, a recognized standard, and not SALT, which is still making its way through the standards process. Furthermore, the Nuance product isn't tied to Microsoft technologies but also works with Java application servers.
On Tuesday, Nuance plans to announce the third major release of its Nuance Voice Platform product. Release 3.0 adds support for Linux, in addition to Windows and Solaris, and a new application design and deployment environment that promises to cut development costs by about a third, the company said in a statement.
While the Nuance Voice Platform may cost more to acquire than Microsoft's Speech Server, the Microsoft offering may end up costing more overall, and taking longer to pay for itself, because of technology upgrades, development quirks and other costs associated with setting up and running the product, Chatow said.
Gartner's Cramoysan said that while Microsoft does plan to offer its product at a lower price, that does not mean it will cost less over a longer timeframe. "Although Microsoft is talking about fairly aggressive pricing, it is an unproven product. We would caution people in terms of assuming that it would be lower cost in terms of total cost of ownership," Cramoysan said.
Pricing for Microsoft's Speech Server products will be "an order of magnitude lower" than competing products, Mastan said. Details will be announced next week. One analyst has estimated a 30 percent cut.
Microsoft will offer a free 180-day trial version of the software, currently only available in US English. Those wishing to use VoIP will have to wait for third-party vendors to develop interfacing software. General availability of the software is expected a few weeks after the launch.