Microsoft releases open source toolkit used to build human-level speech recognition


Last week, Microsoft announced a speech recognition breakthrough: a transcription system that can match humans, with a word error rate of 5.9 percent for conversational speech. The system is built on an open source toolkit that Microsoft had already developed. A major update to that toolkit, now called the Cognitive Toolkit, was released today in beta.
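The article does not spell out how a word error rate is calculated. As commonly defined, it is the minimum number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The Python sketch below illustrates that standard definition only; it is not Microsoft's scoring pipeline, and the function name `word_error_rate` and the sample sentences are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (make -> bake) and one deletion (an) over 7 reference words.
    wer = word_error_rate(
        "how do you make an apple pie",
        "how do you bake apple pie",
    )
    print(f"{wer:.1%}")  # prints 28.6%
```

On this scale, a 5.9 percent rate corresponds to roughly six word errors for every 100 words in the reference transcript.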

Formerly called the Computational Network Toolkit (CNTK), the MIT-licensed, GitHub-hosted project gives researchers some of the building blocks, such as neural networks, to develop their own machine learning systems. These machine learning applications can run on both CPUs and GPUs, and the toolkit has support for compute clusters. This scalability has already made CNTK strongly competitive with other popular frameworks, including Google’s TensorFlow.

The Computational Network Toolkit was originally built for speech applications but has since grown to accommodate other machine learning use cases. The Bing team uses it to make inferences about search terms. For example, a search for “how do you make an apple pie?” is a search for recipes, even though it doesn’t include the word “recipe.” The new version of the toolkit adds features such as support for Python scripting, along with new algorithms, to extend its reach to these more diverse applications.


Technology Lab – Ars Technica