Transcribers (Speech-to-Text)
Currently, Vapi supports two providers for speech-to-text transcription:
Deepgram (nova-family models)
Talkscriber (whisper model)
Voice (Text-to-Speech)
Once you have set your transcriber and corresponding language, you can choose a voice for text-to-speech in that language. For example, you can choose a voice with a Spanish accent if needed. Vapi currently supports the following providers for text-to-speech:
PlayHT
11labs
Rime-ai
Deepgram
OpenAI
Azure
Lmnt
Neets
Each provider offers varying degrees of language support. Azure, for instance, supports the most languages, with approximately 400 prebuilt voices across 140 languages and variants. You can also create your own custom languages with other providers.
Multilingual Support
For multilingual support, you can choose providers like Eleven Labs or Azure, which have models and voices designed for this purpose. This allows your voice assistant to understand and respond in multiple languages, enhancing the user experience for non-English speakers.
To set up multilingual support, you no longer need to specify the desired language when configuring the voice assistant. This configuration in the voice section is deprecated. Instead, you directly choose a voice that supports the desired language from your voice provider. This can be done when you are setting up or modifying your voice assistant.
For example, to set up a voice assistant that speaks Spanish, you can use the voice es-ES-ElviraNeural from the provider azure, which supports Spanish. You can replace es-ES-ElviraNeural with the ID of any other voice that supports your desired language.
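The Spanish voice selection described above can be sketched as a small configuration builder. This is a minimal illustration, not the documented request format: the helper build_voice_config, the field names "provider" and "voiceId", and the lowercase provider identifiers are assumptions made for this example. Only the provider azure and the voice ID es-ES-ElviraNeural come from the text.

```python
import json

# Text-to-speech providers named on this page. The lowercase spellings
# are an assumption about how they would be written in a configuration.
TTS_PROVIDERS = {
    "playht", "11labs", "rime-ai", "deepgram",
    "openai", "azure", "lmnt", "neets",
}


def build_voice_config(provider: str, voice_id: str) -> dict:
    """Build a voice section that selects a voice by ID.

    No separate language field is set: the language follows from the
    chosen voice, as described above.
    """
    if provider not in TTS_PROVIDERS:
        raise ValueError(f"unknown text-to-speech provider: {provider!r}")
    # "provider" and "voiceId" are hypothetical field names for illustration.
    return {"voice": {"provider": provider, "voiceId": voice_id}}


# A Spanish-speaking assistant using the Azure voice from the example.
spanish_assistant = build_voice_config("azure", "es-ES-ElviraNeural")
print(json.dumps(spanish_assistant, indent=2))
```

To switch languages under this sketch, only the voice ID changes: pass the ID of another voice that supports the language you need.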
By leveraging Vapi’s multilingual support, you can make your voice assistant more accessible and user-friendly, reaching a wider audience and providing a better user experience.