Machine Learning in Linux: Speech Note
Speech Note is a GUI frontend for several speech and language processing engines. For speech to text it uses Coqui STT, Vosk, and Whisper. Whisper is our highest rated speech recognition tool and features in our award-winning Top 100 CLI apps study. It’s that good. Coqui STT is also highly recommended, although it’s no longer actively maintained.
For text to speech, Speech Note uses espeak-ng, MBROLA, Piper, RHVoice, and Coqui TTS. Machine translation is handled by Bergamot Translator.
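To make the division of labour concrete, the engines named above can be summarised as a simple task-to-engine lookup. This is only an illustrative sketch: the `ENGINES` table and the `engines_for` helper are hypothetical names of our own, and they say nothing about how Speech Note itself organises its backends.

```python
# Illustrative only: a task-to-engine table built from the engines named in
# this article. ENGINES and engines_for are hypothetical names, not part of
# Speech Note.
ENGINES = {
    "speech_to_text": ["Coqui STT", "Vosk", "Whisper"],
    "text_to_speech": ["espeak-ng", "MBROLA", "Piper", "RHVoice", "Coqui TTS"],
    "translation": ["Bergamot Translator"],
}


def engines_for(task: str) -> list[str]:
    """Return the engines this article lists for the given task."""
    try:
        return ENGINES[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None


if __name__ == "__main__":
    # Print one line per task with its engines.
    for task, names in ENGINES.items():
        print(f"{task}: {', '.join(names)}")
```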
This is free and open source software written in C++.