English female text-to-speech model trained on the LJSpeech dataset at 22050 Hz. It synthesizes English speech.
Model Description
This English female text-to-speech model is trained on the LJSpeech dataset at 22050 Hz and synthesizes English speech. The model is based on the FastPitch architecture (model name fast_pitch).
pip install tts
tts --text "Hello, world!" --model_name tts_models/en/ljspeech/fast_pitch
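The same command can be driven from a script. The following is a minimal sketch, not an official API: it shells out to the `tts` command shown above and then reads the sample rate from the resulting file header. The helper names (`build_tts_command`, `synthesize`, `wav_sample_rate`) are hypothetical, and the sketch assumes the `--out_path` flag is available in the installed version and that the output is a plain PCM WAV file.

```python
import shutil
import subprocess
import wave

# Model name taken from the command above.
MODEL_NAME = "tts_models/en/ljspeech/fast_pitch"


def build_tts_command(text, out_path, model_name=MODEL_NAME):
    """Return the argument list for the `tts` command-line tool."""
    # Assumption: --out_path is supported by the installed CLI version.
    return ["tts", "--text", text, "--model_name", model_name, "--out_path", out_path]


def synthesize(text, out_path, model_name=MODEL_NAME):
    """Run the `tts` CLI and raise if it is missing or exits with an error."""
    if shutil.which("tts") is None:
        raise RuntimeError("the `tts` command is not on PATH; install the package first")
    subprocess.run(build_tts_command(text, out_path, model_name), check=True)
    return out_path


def wav_sample_rate(path):
    """Read the sample rate (in Hz) from a PCM WAV file header."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getframerate()
```

Calling `synthesize("Hello, world!", "hello.wav")` followed by `wav_sample_rate("hello.wav")` should report a rate consistent with the 22050 Hz training data, provided the assumptions above hold.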
Voice Samples
default (F)
English
English is a West Germanic language that originated in England and is now one of the most widely spoken languages in the world. It belongs to the Indo-European language family and is closely related to German and Dutch. English has a diverse vocabulary and is known for its global influence as a lingua franca. Modern English is written in the 26-letter Latin alphabet; Old English also used additional letters such as ð and þ. English features a complex phonetic system with a wide range of vowel and consonant sounds.
LJSpeech Dataset
The LJSpeech dataset is a public-domain English speech dataset of 13,100 short recordings from a single female speaker, totaling about 24 hours of audio. It is commonly used for training and evaluating text-to-speech (TTS) models.
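To illustrate how the dataset's transcripts are typically consumed, here is a small sketch that parses transcript rows. It assumes the pipe-delimited layout of the dataset's `metadata.csv` (clip ID, raw transcription, normalized transcription). The function name and the sample rows are made up for illustration and are not taken from the dataset.

```python
import csv


def parse_ljspeech_metadata(lines):
    """Yield (clip_id, raw_text, normalized_text) from pipe-delimited rows."""
    # QUOTE_NONE keeps quotation marks inside transcripts as literal text.
    reader = csv.reader(lines, delimiter="|", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed rows
        clip_id, raw_text = row[0], row[1]
        # Fall back to the raw text when no normalized column is present.
        normalized_text = row[2] if len(row) > 2 else raw_text
        yield clip_id, raw_text, normalized_text


# Hypothetical rows in the assumed format, for illustration only.
SAMPLE_ROWS = [
    "EX001-0001|Chapter 2 begins here.|Chapter two begins here.",
    'EX001-0002|She said "hello" twice.|She said "hello" twice.',
]

if __name__ == "__main__":
    for clip_id, raw_text, normalized_text in parse_ljspeech_metadata(SAMPLE_ROWS):
        print(clip_id, "->", normalized_text)
```

With a real copy of the dataset, the same function could be given an open file handle instead of `SAMPLE_ROWS`.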
FastPitch
FastPitch is a non-autoregressive text-to-speech (TTS) model. It predicts a duration and a pitch value for each input symbol and then generates the mel-spectrogram for all output frames in parallel, rather than one frame at a time. This significantly reduces inference time compared to autoregressive models while maintaining high-quality results, which makes it suitable for real-time applications, interactive systems, and scenarios where low-latency speech synthesis is desired.
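The role of duration modeling can be sketched in a few lines. Once a duration (in frames) is known for each input symbol, the symbol-level features can be expanded to frame level in a single pass, so no frame has to wait for the previous one. The code below is a simplified illustration of that expansion step only, not the model's actual implementation; the function names are hypothetical, and the hop length of 256 samples is an assumed value, while 22050 Hz comes from the model description.

```python
def length_regulate(token_features, durations):
    """Repeat each token's feature by its duration to get frame-level features."""
    if len(token_features) != len(durations):
        raise ValueError("need exactly one duration per token")
    frames = []
    for feature, duration in zip(token_features, durations):
        if duration < 0:
            raise ValueError("durations must be non-negative")
        frames.extend([feature] * duration)
    return frames


def frames_to_seconds(frame_count, hop_length=256, sample_rate=22050):
    """Convert a frame count to seconds (hop_length is an assumed value)."""
    return frame_count * hop_length / sample_rate


if __name__ == "__main__":
    # Toy input: three symbols with made-up durations of 2, 3, and 1 frames.
    frames = length_regulate(["h", "e", "y"], [2, 3, 1])
    print(frames)
    print(frames_to_seconds(len(frames)))
```

Because the total number of frames is known up front, a model built this way can compute every output frame in parallel, which is where the latency advantage comes from.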