Whisper is an open-source AI model for multilingual speech recognition, developed by OpenAI and trained on 680,000 hours of audio in almost 100 languages. With moderate computational requirements, it achieves accuracy close to the state of the art among both open-source models and commercial systems.
AutoSubs.net is a web app for automatic transcription, translation and subtitling. It chains three steps into a single automatic workflow: transcription with the Whisper model, translation through services such as DeepL or Google Translate, and embedding of the generated subtitles in the video. It also provides voice-over capabilities via speech synthesis at autosubs.net/voice.
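
To make the transcribe, translate and subtitle workflow more concrete, here is a minimal Python sketch of the middle of such a pipeline: turning timed transcript segments into an SRT subtitle file, with an optional translation step. It is an illustration under stated assumptions, not AutoSubs.net's actual implementation. The names `format_timestamp` and `segments_to_srt` and the `translate` callback are hypothetical; the segment shape (`start`, `end`, `text`) is assumed to match what the open-source `openai-whisper` package returns.

```python
from typing import Callable, Dict, List

# Assumed segment shape: {"start": float, "end": float, "text": str},
# i.e. timings in seconds plus the recognized text.
Segment = Dict[str, object]


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(
    segments: List[Segment],
    translate: Callable[[str], str] = lambda text: text,
) -> str:
    """Build SRT text from timed segments.

    `translate` is a hypothetical hook standing in for a call to a
    translation service; by default the text is left unchanged.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        text = translate(str(seg["text"]).strip())
        start = format_timestamp(float(seg["start"]))
        end = format_timestamp(float(seg["end"]))
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # SRT cues are separated by a blank line.
    return "\n".join(blocks)
```

With the `openai-whisper` package installed, the segments could come from `whisper.load_model("base").transcribe("video.mp4")["segments"]`, and the resulting SRT file could then be burned into the video with a tool such as ffmpeg (for example, its `subtitles` filter). Both are one possible choice for a sketch like this; the source does not say which tools AutoSubs.net uses for the embedding step.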