What open-sourced and accurate speech-to-text engines and APIs currently exist?


Wikipedia covers speech-to-text (STT) under “Speech recognition” [1], also known as automatic speech recognition (ASR).  It also has a higher-level “Category:Speech recognition” [2], under which you can find a “List of speech recognition software” [3].  For open source specifically, Wikipedia has an entry on “Speech recognition in Linux” [4].

On the commercial side, Nuance is generally regarded as the industry leader.  AT&T is also promoting its “Watson Voice Recognition Technology & Speech API” [5].  Red Shift Company [6] offers the “RASR Speech Recognizer”.  The Koemei API [7] offers speech-to-text for video transcription.  Google also has an undocumented HTML5 Chrome speech API.  And on Windows, there is the Microsoft Speech API [8].

In terms of open source, CMUSphinx and its PocketSphinx [9] are probably the most popular.  There is also an iOS version of PocketSphinx, called OpenEars [10].  There is an open source JavaScript SpeechAPI [11], similar to the MIT WAMI (Web-Accessible Multimodal Applications) toolkit [12].  Julius [13] is an open source example of “large vocabulary continuous speech recognition” (LVCSR).

[1] http://en.wikipedia.org/wiki/Speech_recognition

[2] http://en.wikipedia.org/wiki/Category:Speech_recognition

[3] http://en.wikipedia.org/wiki/List_of_speech_recognition_software

[4] http://en.wikipedia.org/wiki/Speech_recognition_in_Linux

[5] http://www.research.att.com/projects/WATSON

[6] http://www.redshiftcompany.com

[7] http://koemei.com/api

[8] http://en.wikipedia.org/wiki/Microsoft_Speech_API

[9] http://cmusphinx.sourceforge.net/?s=pocketsphinx

[10] http://www.politepix.com/openears

[11] http://speechapi.com

[12] http://wami.csail.mit.edu

[13] http://en.wikipedia.org/wiki/Julius_(software)