Catalan TTS Model: 257 Voices, VITS, Trained on a Custom Dataset at 22050 Hz

A Catalan (català) text-to-speech model with 257 voices, trained on a custom dataset at 22050 Hz, for synthesizing Catalan speech.

Model Description

This Catalan (català) text-to-speech model contains 257 voices and was trained on a custom dataset at a sampling rate of 22050 Hz. It synthesizes Catalan speech and is based on the VITS architecture.

pip install tts
tts --text "Hello, world!" --model_name tts_models/ca/custom/vits
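
Because the model contains many voices, a call will usually need to name one of them. The sketch below builds such a command in Python. The `--speaker_idx` and `--out_path` flags are part of the `tts` command-line tool, but the helper name `build_tts_command`, the Catalan sample sentence, and the choice of the speaker `pau` are illustrative assumptions. Whether the IDs in the voice list match the IDs the model expects should be checked with `tts --model_name tts_models/ca/custom/vits --list_speaker_idxs`.

```python
import shlex

# Model name taken from the command above.
MODEL_NAME = "tts_models/ca/custom/vits"


def build_tts_command(text, speaker=None, out_path="output.wav"):
    """Return the argument list for a `tts` command-line call.

    `speaker` is assumed to be one of the IDs reported by
    `--list_speaker_idxs`; it is not validated here.
    """
    cmd = ["tts", "--text", text, "--model_name", MODEL_NAME, "--out_path", out_path]
    if speaker is not None:
        cmd += ["--speaker_idx", speaker]
    return cmd


# Hypothetical example: Catalan text spoken by the voice listed as "pau".
cmd = build_tts_command("Bon dia, com estàs?", speaker="pau")
print(shlex.join(cmd))  # shell-quoted command, ready to paste into a terminal
```

The list returned by `build_tts_command` can also be passed directly to `subprocess.run(cmd, check=True)` once the `tts` package is installed.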

Voice Samples

00236e350cc8 (M)

00459 (M)

00762 (M)

00983a845f95 (M)

01591 (F)

02452 (F)

02689 (M)

02992 (M)

02f7d61edf50 (M)

03115 (M)

03386 (F)

03655 (F)

03944 (F)

04247 (M)

04484 (M)

04787 (M)

04910 (M)

05147 (M)

056d7638d714 (F)

05739 (M)

06008 (F)

06042 (F)

06279 (M)

06311 (F)

06582 (M)

06705 (M)

06942 (F)

06c6d2e09362 (F)

07140 (M)

07245 (F)

07803 (F)

08001 (M)

08106 (F)

085503e68b07 (F)

08664 (M)

08935 (M)

08967 (M)

09204 (F)

09598 (M)

09901 (F)

0befb1084ad0 (M)

0c6bf6782176 (M)

0d0a943d348b (M)

0da83aed1427 (M)

0ff19536d614 (F)

125d9d1721de (M)

1378866a4d2b (M)

14bc32c10eb2 (M)

151fcb1168f4 (M)

1610e2960395 (M)

1887c37f4187 (M)

1add23d44d2d (F)

1b7fc0c4e437 (M)

1b8354b1fe92 (F)

1be6c773da63 (M)

1c7af1cc1357 (M)

1c7f19a7fa0b (M)

1c80e9d982aa (F)

2256cc5ee6c6 (M)

238532dddf77 (M)

241ca4fdf212 (M)

2421aa51a089 (M)

24d967d0e8b8 (M)

25911630ab15 (M)

26099adbc4db (F)

28e2fe1944a5 (F)

2b59e9f830e5 (M)

2bc2a177bf56 (F)

2ce84c6ea6aa (M)

2d84f39c2cca (F)

2e6ccdf9f0a7 (M)

2f92b4704080 (M)

2fb95c3b786f (M)

30b1f81c5797 (M)

31535cb2ece4 (M)

31e6f3a01166 (F)

32550810ba55 (M)

336f82b4645b (M)

35b962b08846 (M)

3637902e0d19 (M)

3723bd65a05a (M)

373d86f9fa3a (M)

379d321bff71 (M)

37c12c700c95 (M)

3a4a32c7cff1 (M)

404ecea5ae8e (M)

41e5e21b3a3b (F)

464d9ac63f79 (M)

4869d94d4936 (M)

496b66c9cb70 (M)

49a765407153 (M)

4b6c7e4e9bde (F)

4bce212aca40 (F)

4cedaa8d9643 (M)

4d7e2548403c (M)

4de9f262eee7 (M)

4e5e58a6ec7d (M)

4ec8f1e81d7a (M)

4f57d1abde33 (M)

503dbbe83f01 (M)

51795e8ea8fa (M)

52cfac480c0c (M)

537e815df933 (M)

547dd49c2cbe (M)

54f344faa37d (M)

56071bfe30e9 (F)

57e5f7cc5fac (M)

5a9a6481f136 (F)

5ba168675a3f (M)

5da56ed89657 (M)

5ebf04dfec6c (M)

620b0d4c3be9 (F)

6323ec0401b2 (F)

633e7303eae4 (F)

6688b60c24d0 (F)

6745c47d0bd5 (F)

6892c6ba9f66 (F)

689a213fd2d6 (M)

696e88087171 (M)

6bdec6b6f7e6 (M)

6e5948f904b3 (F)

7115c00371f8 (M)

71b67ba5ec75 (M)

72a3d5bde83f (M)

73d3685f3e78 (M)

74a679bf6c4a (M)

7638395f7d47 (M)

76383f56d997 (M)

77cd12af0a3d (M)

7834da277192 (M)

79a830901c1b (F)

7b7593f44cc6 (M)

7c7d917d9741 (F)

7d19dccf4811 (M)

7d8d6fa22ff7 (M)

7e36be2204fe (M)

7ff908cc2a18 (M)

8154716e77ac (M)

8162d651b621 (F)

8348c81a2530 (F)

84b101db8d07 (M)

853fb95e0f01 (M)

85c9e13ccfc0 (M)

85ea0b349a8d (F)

88673d4f24d0 (M)

88ec4ff5a1b0 (M)

892bf89bd3a0 (M)

894bd433b4b0 (F)

896256329fbe (M)

897c3401b4a3 (M)

89e6f6a865ab (F)

8b707d4f8f32 (F)

8e98d00c5d11 (M)

90bb7c91281b (M)

911c26cf8283 (F)

92862e616dce (F)

92a15e2cbd0c (M)

97679def7032 (F)

97e29f9edfe7 (F)

9b5f9ebc9614 (M)

9b847b5006ea (F)

9cdf4ab91c8e (M)

9fb127fbe465 (F)

9fe6ba948da2 (M)

a1afb2eae495 (M)

a2b06b546791 (M)

a2b503bc78bd (M)

a359c15185b6 (F)

a35dea43a67c (M)

a4b1eb406ff2 (F)

a4b8fa949865 (M)

a6bc3c6beffd (M)

aabfdbdc2115 (F)

af506d21ee14 (M)

b04a1d5062f2 (M)

b0a3c5148905 (M)

b1a0cbb91459 (M)

b47a96b489f4 (F)

b52e493e5049 (F)

b5419f6ea89d (M)

b570d19edbda (M)

baff09432cff (M)

bc0b544f1c13 (M)

bc3886ba087d (M)

bd609b6955a6 (M)

bet (F)

bf64f21ff129 (M)

bfe8d96ce71f (M)

c088e98f02d3 (M)

c1bafe50eb70 (M)

c1e166044d77 (M)

c21ee3641607 (M)

c3f1018eb1f7 (M)

c4d740361d5f (M)

c5d4c712e060 (F)

c777d3358a0a (M)

c96c4e97012d (F)

c9774fae6c0a (M)

cb557116fa7b (M)

cc3b30ba0f73 (M)

ccd85fb40538 (M)

cd1226e73c82 (M)

cdc5df38351e (M)

ce31dc5dfa61 (M)

cefa12e7ac99 (M)

cf5b890eb74b (F)

cf8c583b1282 (F)

d0cd44fcdae6 (M)

d15bfc3278de (F)

d3d64ab67746 (M)

d647b73602a3 (M)

d98d182c89b4 (M)

dafd89491990 (M)

db6932752693 (M)

db8eecd1ac9b (M)

dbe9efadf636 (M)

dca1aa77f919 (M)

dee065b956b9 (M)

df52eb2c24a6 (M)

dfc8721858bd (M)

e249989b0c39 (M)

e364856fe22a (M)

e37d85b60af5 (M)

e41b679ec144 (M)

e61565e75d63 (M)

e6a64aa839b9 (F)

e751d2f83310 (M)

e7847a5814b8 (M)

e82ba384934a (M)

e9da05b6d590 (F)

ea8456e0667e (F)

eb415e110eaf (M)

eb5078bcb64f (M)

ed5c9e654bfb (M)

edba91511ccf (M)

ee216d2d13cb (F)

eli (F)

eva (F)

f1812dbb566e (M)

f26a63e5171e (M)

f2f359ea473c (F)

f35ce011f75f (F)

f4df4a067fec (M)

f56a47b89ebd (M)

f61bdd3abb2d (M)

f62196a11f50 (M)

f8e4bf2dd4f9 (M)

f980d152d5c1 (F)

fa8641fb64db (M)

fdde8cdd2fa5 (F)

jan (M)

mar (F)

ona (F)

pau (M)

pep (M)

pol (M)

teo (M)
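
The entries above follow a simple `ID (M)` or `ID (F)` pattern, where the letter presumably marks a male or female voice. As a minimal sketch, assuming the list has been saved as plain text with one entry per line, it can be parsed and filtered like this. The function name `parse_voice_entries` is hypothetical, and the sample lines are copied from the list.

```python
import re
from collections import Counter

# One entry per line: a speaker ID, a space, then "(M)" or "(F)".
ENTRY = re.compile(r"^(?P<speaker>\S+) \((?P<gender>[MF])\)$")


def parse_voice_entries(lines):
    """Return (speaker_id, gender) pairs, skipping lines that do not match."""
    voices = []
    for line in lines:
        match = ENTRY.match(line.strip())
        if match:
            voices.append((match["speaker"], match["gender"]))
    return voices


# A few entries copied from the list above.
sample = ["00236e350cc8 (M)", "01591 (F)", "", "bet (F)", "pau (M)"]
voices = parse_voice_entries(sample)

counts = Counter(gender for _, gender in voices)      # voices per gender marker
female = [spk for spk, gender in voices if gender == "F"]  # IDs marked (F)
print(counts, female)
```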

Catalan (català)

Catalan is a Romance language primarily spoken in Catalonia, a region in northeastern Spain, as well as in the Balearic Islands, Valencia, and Andorra. It is also spoken by communities in the Roussillon region of France and the city of Alghero in Sardinia, Italy. Catalan has its roots in the Vulgar Latin spoken in the early Middle Ages. It is known for its distinct phonetic features, including the presence of voiced and voiceless alveolar sibilants and a contrast between dental and alveolar consonants.

Custom Dataset

"Custom dataset" here means a dataset assembled specifically for this model rather than a standard public corpus. A dataset of this kind typically contains speech recordings together with metadata tailored to the project's requirements.

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech)

VITS, short for Variational Inference with adversarial learning for end-to-end Text-to-Speech, is a neural text-to-speech architecture that generates a waveform directly from text in a single model, without a separate vocoder stage. It combines a conditional variational autoencoder, normalizing flows, and adversarial training, and it uses a stochastic duration predictor so that the same text can be spoken with different rhythms. During training, the model learns a compact latent representation of speech that is aligned with the input text. At synthesis time, it samples from that representation and decodes the sample into audio. Multi-speaker variants, such as this model, additionally condition on a speaker identity so that one model can produce many voices. Text-to-speech models of this kind have a wide range of applications, from voice assistants to tools that help people with speech impairments communicate more effectively.
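
For reference, the VITS paper (Kim et al., 2021) frames training as maximizing a variational lower bound on the likelihood of the audio given the text. In the sketch below, x stands for the target speech, c for the text condition, and z for the latent variable. The full training objective also includes duration and adversarial terms, which are omitted here.

```latex
\log p_\theta(x \mid c) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[
  \log p_\theta(x \mid z)
  \;-\; \log \frac{q_\phi(z \mid x)}{p_\theta(z \mid c)}
\right]
```

The first term rewards accurate reconstruction of the audio from the latent variable. The second term keeps the latent distribution inferred from the audio close to the prior predicted from the text, which is what lets the model synthesize from text alone.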
