Seamless is a web-based service for Windows users that converts speech to text. It is built on a foundational AI model developed by Meta for automatic transcription and translation. Seamless is still under development, and a demo version is available on the official website.

Underlying technology

The platform is based on a multimodal neural network that supports over 100 input and output languages. It can perform tasks such as speech recognition and text translation with a high degree of accuracy. The model is designed to reduce the number of errors in the output and to speed up the transcription process.

How it works

The demo version lets you record microphone input and recognize the speech automatically, as in Mu Voice. The next step is to select up to three target languages and wait until the operation is finished. An embedded audio player lets you listen to the results.
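
The selection step can be pictured with a short Python sketch. It is not part of Seamless and does not call any official API: the names TranslationRequest, build_request and MAX_TARGET_LANGUAGES, as well as the language codes, are hypothetical and only illustrate the three-language limit mentioned above.

```python
from dataclasses import dataclass, field

MAX_TARGET_LANGUAGES = 3  # limit described for the demo; the constant name is an assumption


@dataclass
class TranslationRequest:
    """Hypothetical container for one demo request (not an official API)."""
    audio_path: str
    target_languages: list[str] = field(default_factory=list)


def build_request(audio_path: str, target_languages: list[str]) -> TranslationRequest:
    """Validate the language selection and return a request object."""
    # Normalize the codes and drop duplicates while keeping the chosen order.
    unique = list(dict.fromkeys(code.strip().lower() for code in target_languages))
    if not unique:
        raise ValueError("select at least one target language")
    if len(unique) > MAX_TARGET_LANGUAGES:
        raise ValueError(f"select at most {MAX_TARGET_LANGUAGES} target languages")
    return TranslationRequest(audio_path=audio_path, target_languages=unique)
```

Calling build_request("clip.wav", ["fra", "spa"]) returns a request with two target languages, while passing four distinct codes raises a ValueError.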

There is an option to synthesize a virtual voice. The self-supervised speech encoder learns to find structure and meaning in speech and splits the audio signal into smaller segments.
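
As a rough illustration of segmentation, the Python sketch below cuts a list of audio samples into overlapping fixed-length frames. This is a generic signal-processing technique, not the encoder used by Seamless, which learns its own internal representation; the function name split_into_frames and its parameters are assumptions made for the example.

```python
def split_into_frames(samples, frame_length, hop_length):
    """Split a sequence of audio samples into overlapping fixed-length frames.

    frame_length is the number of samples per frame and hop_length is the
    step between the starts of consecutive frames. Trailing samples that do
    not fill a whole frame are dropped.
    """
    if frame_length <= 0 or hop_length <= 0:
        raise ValueError("frame_length and hop_length must be positive")
    frames = []
    # Stop at the last position where a full frame still fits.
    for start in range(0, len(samples) - frame_length + 1, hop_length):
        frames.append(list(samples[start:start + frame_length]))
    return frames
```

With ten samples, a frame length of 4 and a hop of 2, the function produces four frames, each sharing two samples with its neighbor.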

Features

free to download and use;
provides tools for voice recognition, transcription and translation;
converts speech to text automatically;
supports over 100 different languages;
compatible with all modern versions of Windows.