Speech recognition software is available for many computing platforms, operating systems, use models, and software licenses. The listings below group such software by category.
The following list presents notable speech recognition software engines with a brief synopsis of characteristics.
| Application name | Description | Open-source | License | Operating system | Programming language | Supported language, note | Offline or online |
|---|---|---|---|---|---|---|---|
| CMU Sphinx | HMM | Yes | BSD style | Cross-platform | Java | English, German, French, Mandarin, Russian | Offline |
| HTK | HMM neural net | No | HTK specific | Cross-platform | C | English; version 3.5 released December 2015 | |
| Julius | HMM trigrams | Yes | BSD style, non-commercial | Cross-platform | C | Japanese, English; [2] | Offline |
| Kaldi | Neural net | Yes | Apache | Cross-platform | C++ | English | |
| RWTH ASR | RWTH Aachen University | No | RWTH ASR, non-commercial use only | Linux, macOS | C++ | English | |
| Whisper | Encoder/decoder transformer | Yes | MIT license | Cross-platform | Python | Multilingual | Online (through API) and offline |
| Application name | Description | Open-source | License | Price | Note |
|---|---|---|---|---|---|
| Dragon for Mac (discontinued 2018) | macOS; by Nuance | No | Proprietary | ||
| Dragon Dictate (discontinued) | macOS; by Nuance | No | Proprietary | ||
| MacSpeech Scribe (discontinued) | Transcription from recorded audio; acquired by Nuance | ||||
| iListen (discontinued) | PowerPC Macintosh; discontinued by MacSpeech; acquired by Nuance | ||||
| Speakable items | Included with macOS | ||||
| ViaVoice (discontinued) | IBM Product; acquired by Nuance | ||||
| Voice Navigator | Original GUI voice control; 1989 | ||||
The following list presents notable speech recognition software that operates in a Chrome browser as a web app. These apps make use of the HTML5 Web Speech API.[1]
| Application name | Description | Open-source | License | Price | Note |
|---|---|---|---|---|---|
| Speechmatics[2] | Cloud-based and on-premises automatic speech recognition | No | Proprietary | From £0.06 per minute of audio | |
Many mobile phone handsets, including feature phones and smartphones such as iPhones and BlackBerrys, have basic dial-by-voice features built in. Many third-party apps have implemented natural-language speech recognition support, including:
| Application name | Description | Open-source | License | Price | Note |
|---|---|---|---|---|---|
| Assistant.ai | Assistant for Android, iOS and Windows Phone | No | Proprietary, freeware | Free | Discontinued |
| Dragon Dictation | | No | Proprietary, freeware | Free | |
| Google Now | Android voice search | No | Proprietary, freeware | Free | |
| Google Voice Search | | No | Proprietary, freeware | Free | |
| Microsoft Cortana | Microsoft voice search | No | Proprietary, freeware | Free | |
| Siri Personal Assistant | Apple's virtual personal assistant | No | Proprietary, freeware | Free | |
| Alexa – Amazon Echo | Amazon's personal assistant | No | Proprietary | ||
| SILVIA | Android and iOS | No | |||
| Vlingo | | | | | |
Windows Speech Recognition version 8.0, by Microsoft, is built into Windows Vista, Windows 7, Windows 8, and Windows 10. It is available only in English, French, Spanish, German, Japanese, Simplified Chinese, and Traditional Chinese, and only in the corresponding language version of Windows: the speech recognition engine for one language cannot be used on a version of Windows in another language. Windows 7 Ultimate and Windows 8 Pro allow the system language to be changed, which in turn changes which speech engine is available. Windows Speech Recognition evolved into Cortana, a personal assistant included in Windows 10.
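The language restriction described above can be modelled as a simple lookup. The following is only an illustrative sketch in Python, not an interface to Windows: the function names `available_engine` and `can_recognize` and the short language tags are assumptions made for this example, and only the list of seven languages comes from the text.

```python
# Languages listed above for Windows Speech Recognition.
# The short tags are labels chosen for this sketch (an assumption),
# not identifiers read from Windows itself.
SUPPORTED = {
    "en": "English",
    "fr": "French",
    "es": "Spanish",
    "de": "German",
    "ja": "Japanese",
    "zh-Hans": "Simplified Chinese",
    "zh-Hant": "Traditional Chinese",
}


def available_engine(display_language):
    """Return the engine language for a Windows display language, or None.

    Hypothetical helper: models the rule that the engine follows the
    language version of Windows.
    """
    return SUPPORTED.get(display_language)


def can_recognize(display_language, speech_language):
    """True only if the speech language is supported and matches the
    language version of Windows (hypothetical helper)."""
    return speech_language in SUPPORTED and display_language == speech_language


if __name__ == "__main__":
    # French speech on an English version of Windows is not possible.
    print(can_recognize("en", "fr"))
    # Changing the system language to French makes the French engine available.
    print(available_engine("fr"))
```

Under this model, changing the system language (as Windows 7 Ultimate and Windows 8 Pro permit) corresponds to calling the helpers with a different `display_language`.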
The following are interactive voice response (IVR) systems: