Catalan TTS Model: 257 Voices, VITS Encoding, Trained on Custom Dataset at 22050 Hz

A Catalan (català) text-to-speech model with 257 voices, trained on a custom dataset at 22050 Hz, for synthesizing speech in the Catalan language.

Model Description

This Catalan (català) text-to-speech model contains 257 voices and was trained on a custom dataset at a sample rate of 22050 Hz. It synthesizes speech in the Catalan language and is based on the VITS architecture.

pip install TTS
tts --text "Hello, world!" --model_name tts_models/ca/custom/vits --speaker_idx pau --out_path output.wav
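
Because the model bundles 257 voices, a synthesis call normally has to name one of them. The following is a minimal Python sketch, not an official recipe, that builds one tts command line per voice. It assumes the command-line tool accepts the --speaker_idx and --out_path options (as the Coqui TTS CLI does) and that the identifiers listed under Voice Samples, such as pau and ona, are valid speaker names. The helper build_tts_command, the Catalan sample sentence, and the output file names are hypothetical.

```python
import shlex

MODEL_NAME = "tts_models/ca/custom/vits"  # model name used on this page


def build_tts_command(text, speaker, out_path):
    """Return the argument list for one tts invocation (hypothetical helper)."""
    return [
        "tts",
        "--text", text,
        "--model_name", MODEL_NAME,
        "--speaker_idx", speaker,  # assumed option for choosing a voice
        "--out_path", out_path,    # assumed option for the output WAV file
    ]


# Build one command per voice; the IDs are taken from the Voice Samples list.
commands = [
    build_tts_command("Bon dia, món!", speaker, f"{speaker}.wav")
    for speaker in ("pau", "ona")
]

for cmd in commands:
    # shlex.join quotes each argument so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

The sketch only prints the commands. To run them, each list could be passed to subprocess.run(cmd, check=True) once the package is installed.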

Voice Samples

00236e350cc8 (M)

00459 (M)

00762 (M)

00983a845f95 (M)

01591 (F)

02452 (F)

02689 (M)

02992 (M)

02f7d61edf50 (M)

03115 (M)

03386 (F)

03655 (F)

03944 (F)

04247 (M)

04484 (M)

04787 (M)

04910 (M)

05147 (M)

056d7638d714 (F)

05739 (M)

06008 (F)

06042 (F)

06279 (M)

06311 (F)

06582 (M)

06705 (M)

06942 (F)

06c6d2e09362 (F)

07140 (M)

07245 (F)

07803 (F)

08001 (M)

08106 (F)

085503e68b07 (F)

08664 (M)

08935 (M)

08967 (M)

09204 (F)

09598 (M)

09901 (F)

0befb1084ad0 (M)

0c6bf6782176 (M)

0d0a943d348b (M)

0da83aed1427 (M)

0ff19536d614 (F)

125d9d1721de (M)

1378866a4d2b (M)

14bc32c10eb2 (M)

151fcb1168f4 (M)

1610e2960395 (M)

1887c37f4187 (M)

1add23d44d2d (F)

1b7fc0c4e437 (M)

1b8354b1fe92 (F)

1be6c773da63 (M)

1c7af1cc1357 (M)

1c7f19a7fa0b (M)

1c80e9d982aa (F)

2256cc5ee6c6 (M)

238532dddf77 (M)

241ca4fdf212 (M)

2421aa51a089 (M)

24d967d0e8b8 (M)

25911630ab15 (M)

26099adbc4db (F)

28e2fe1944a5 (F)

2b59e9f830e5 (M)

2bc2a177bf56 (F)

2ce84c6ea6aa (M)

2d84f39c2cca (F)

2e6ccdf9f0a7 (M)

2f92b4704080 (M)

2fb95c3b786f (M)

30b1f81c5797 (M)

31535cb2ece4 (M)

31e6f3a01166 (F)

32550810ba55 (M)

336f82b4645b (M)

35b962b08846 (M)

3637902e0d19 (M)

3723bd65a05a (M)

373d86f9fa3a (M)

379d321bff71 (M)

37c12c700c95 (M)

3a4a32c7cff1 (M)

404ecea5ae8e (M)

41e5e21b3a3b (F)

464d9ac63f79 (M)

4869d94d4936 (M)

496b66c9cb70 (M)

49a765407153 (M)

4b6c7e4e9bde (F)

4bce212aca40 (F)

4cedaa8d9643 (M)

4d7e2548403c (M)

4de9f262eee7 (M)

4e5e58a6ec7d (M)

4ec8f1e81d7a (M)

4f57d1abde33 (M)

503dbbe83f01 (M)

51795e8ea8fa (M)

52cfac480c0c (M)

537e815df933 (M)

547dd49c2cbe (M)

54f344faa37d (M)

56071bfe30e9 (F)

57e5f7cc5fac (M)

5a9a6481f136 (F)

5ba168675a3f (M)

5da56ed89657 (M)

5ebf04dfec6c (M)

620b0d4c3be9 (F)

6323ec0401b2 (F)

633e7303eae4 (F)

6688b60c24d0 (F)

6745c47d0bd5 (F)

6892c6ba9f66 (F)

689a213fd2d6 (M)

696e88087171 (M)

6bdec6b6f7e6 (M)

6e5948f904b3 (F)

7115c00371f8 (M)

71b67ba5ec75 (M)

72a3d5bde83f (M)

73d3685f3e78 (M)

74a679bf6c4a (M)

7638395f7d47 (M)

76383f56d997 (M)

77cd12af0a3d (M)

7834da277192 (M)

79a830901c1b (F)

7b7593f44cc6 (M)

7c7d917d9741 (F)

7d19dccf4811 (M)

7d8d6fa22ff7 (M)

7e36be2204fe (M)

7ff908cc2a18 (M)

8154716e77ac (M)

8162d651b621 (F)

8348c81a2530 (F)

84b101db8d07 (M)

853fb95e0f01 (M)

85c9e13ccfc0 (M)

85ea0b349a8d (F)

88673d4f24d0 (M)

88ec4ff5a1b0 (M)

892bf89bd3a0 (M)

894bd433b4b0 (F)

896256329fbe (M)

897c3401b4a3 (M)

89e6f6a865ab (F)

8b707d4f8f32 (F)

8e98d00c5d11 (M)

90bb7c91281b (M)

911c26cf8283 (F)

92862e616dce (F)

92a15e2cbd0c (M)

97679def7032 (F)

97e29f9edfe7 (F)

9b5f9ebc9614 (M)

9b847b5006ea (F)

9cdf4ab91c8e (M)

9fb127fbe465 (F)

9fe6ba948da2 (M)

a1afb2eae495 (M)

a2b06b546791 (M)

a2b503bc78bd (M)

a359c15185b6 (F)

a35dea43a67c (M)

a4b1eb406ff2 (F)

a4b8fa949865 (M)

a6bc3c6beffd (M)

aabfdbdc2115 (F)

af506d21ee14 (M)

b04a1d5062f2 (M)

b0a3c5148905 (M)

b1a0cbb91459 (M)

b47a96b489f4 (F)

b52e493e5049 (F)

b5419f6ea89d (M)

b570d19edbda (M)

baff09432cff (M)

bc0b544f1c13 (M)

bc3886ba087d (M)

bd609b6955a6 (M)

bet (F)

bf64f21ff129 (M)

bfe8d96ce71f (M)

c088e98f02d3 (M)

c1bafe50eb70 (M)

c1e166044d77 (M)

c21ee3641607 (M)

c3f1018eb1f7 (M)

c4d740361d5f (M)

c5d4c712e060 (F)

c777d3358a0a (M)

c96c4e97012d (F)

c9774fae6c0a (M)

cb557116fa7b (M)

cc3b30ba0f73 (M)

ccd85fb40538 (M)

cd1226e73c82 (M)

cdc5df38351e (M)

ce31dc5dfa61 (M)

cefa12e7ac99 (M)

cf5b890eb74b (F)

cf8c583b1282 (F)

d0cd44fcdae6 (M)

d15bfc3278de (F)

d3d64ab67746 (M)

d647b73602a3 (M)

d98d182c89b4 (M)

dafd89491990 (M)

db6932752693 (M)

db8eecd1ac9b (M)

dbe9efadf636 (M)

dca1aa77f919 (M)

dee065b956b9 (M)

df52eb2c24a6 (M)

dfc8721858bd (M)

e249989b0c39 (M)

e364856fe22a (M)

e37d85b60af5 (M)

e41b679ec144 (M)

e61565e75d63 (M)

e6a64aa839b9 (F)

e751d2f83310 (M)

e7847a5814b8 (M)

e82ba384934a (M)

e9da05b6d590 (F)

ea8456e0667e (F)

eb415e110eaf (M)

eb5078bcb64f (M)

ed5c9e654bfb (M)

edba91511ccf (M)

ee216d2d13cb (F)

eli (F)

eva (F)

f1812dbb566e (M)

f26a63e5171e (M)

f2f359ea473c (F)

f35ce011f75f (F)

f4df4a067fec (M)

f56a47b89ebd (M)

f61bdd3abb2d (M)

f62196a11f50 (M)

f8e4bf2dd4f9 (M)

f980d152d5c1 (F)

fa8641fb64db (M)

fdde8cdd2fa5 (F)

jan (M)

mar (F)

ona (F)

pau (M)

pep (M)

pol (M)

teo (M)
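
The voice list can be turned into structured data when a script needs to pick voices automatically. The sketch below is only an illustration: it assumes each entry has the form "identifier (M)" or "identifier (F)", and that M and F mark male and female voices, which this page does not state explicitly. Only four entries are copied in, and the names parse_voices and VOICE_LISTING are hypothetical.

```python
import re
from collections import Counter

# A few entries copied from the list above; paste the full list to cover all voices.
VOICE_LISTING = """
00236e350cc8 (M)
01591 (F)
bet (F)
pau (M)
"""

# One voice per line: an identifier, a space, then (M) or (F).
ENTRY = re.compile(r"^(?P<voice_id>\S+) \((?P<label>[MF])\)$")


def parse_voices(listing):
    """Return (voice_id, label) pairs for every well-formed line."""
    voices = []
    for line in listing.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            voices.append((match["voice_id"], match["label"]))
    return voices


voices = parse_voices(VOICE_LISTING)
counts = Counter(label for _, label in voices)
female_ids = [voice_id for voice_id, label in voices if label == "F"]

print(counts)
print(female_ids)
```

Blank lines and anything that does not match the pattern are skipped, so the same function can be applied to the full listing as copied from this page.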

Catalan (català)

Catalan is a Romance language primarily spoken in Catalonia, a region in northeastern Spain, as well as in the Balearic Islands, Valencia, and Andorra. It is also spoken by communities in the Roussillon region of France and the city of Alghero in Sardinia, Italy. Catalan has its roots in the Vulgar Latin spoken in the early Middle Ages. It is known for its distinct phonetic features, including the presence of voiced and voiceless alveolar sibilants and a contrast between dental and alveolar consonants.

Custom Dataset

A custom dataset is one assembled for a particular task or project rather than taken from a standard public corpus. For a text-to-speech model it typically consists of speech recordings with matching transcripts and speaker metadata, tailored to the requirements of the project.

VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech)

VITS, short for Variational Inference with adversarial learning for end-to-end Text-to-Speech, is a neural architecture for speech synthesis. Unlike two-stage systems that first predict a spectrogram and then pass it to a separate vocoder, VITS generates the audio waveform directly from text in a single model. It combines a conditional variational autoencoder with normalizing flows and adversarial training, and it uses a stochastic duration predictor so that the same text can be spoken with different rhythms. The result is speech that sounds close to natural human recordings. Multi-speaker versions of VITS, such as this model, condition the network on a speaker identity to switch between voices. Applications range from voice assistants to tools that help people with speech impairments communicate more effectively.
