Indic Parler - Bhili TTS
A fine-tuned version of ai4bharat/indic-parler-tts, trained on 2 hours of Bhili conversational speech data.
Installation
pip install git+https://github.com/huggingface/parler-tts.git
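Before running inference, it can help to confirm that every module the snippet imports is actually available. The following is a minimal sketch, not part of the original card: the module names are taken from the imports in the Inference section, and `find_missing` is a hypothetical helper introduced here for illustration.

```python
import importlib.util

# Module names taken from the import statements in the Inference section.
REQUIRED_MODULES = ("torch", "parler_tts", "transformers", "soundfile")


def find_missing(modules=REQUIRED_MODULES):
    """Return the names of top-level modules that cannot be located."""
    # find_spec returns None for a top-level module that is not installed;
    # it does not import the module itself.
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

If anything is reported missing, install it with pip before continuing.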
Inference
import torch
from parler_tts import ParlerTTSForConditionalGeneration
from transformers import AutoTokenizer
import soundfile as sf

# Use the first GPU when available, otherwise fall back to CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Load the fine-tuned model and the tokenizer for the text to be spoken.
model = ParlerTTSForConditionalGeneration.from_pretrained("sanjay73/indic-parler-bhili-tts").to(device)
tokenizer = AutoTokenizer.from_pretrained("sanjay73/indic-parler-bhili-tts")
# The voice description is tokenized separately.
description_tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")

# Bhili text to synthesize, and a description of the desired voice.
prompt = "चाला आपुऊ आमी बाजार केरा जाहूं"
description = "A male speaker delivers speech at a moderate speed with a moderate pitch. The recording is of good quality."

desc_ids = description_tokenizer(description, return_tensors="pt").to(device)
prompt_ids = tokenizer(prompt, return_tensors="pt").to(device)

# The description is passed as input_ids; the text to speak is passed as prompt_input_ids.
generation = model.generate(
    input_ids=desc_ids.input_ids,
    attention_mask=desc_ids.attention_mask,
    prompt_input_ids=prompt_ids.input_ids,
    prompt_attention_mask=prompt_ids.attention_mask,
)

# Move the waveform to the CPU and save it at the model's sampling rate.
audio = generation.cpu().numpy().squeeze()
sf.write("output.wav", audio, model.config.sampling_rate)
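The description string in the snippet above is free-form text that tells the model what voice to produce. As a rough sketch, the same sentence can be assembled from a few named attributes so that variants are easier to try. `build_description` and its parameter names are hypothetical, introduced here for illustration. This card does not say which attribute values the fine-tuned model responds to, so treat anything other than the defaults as an experiment.

```python
def build_description(gender="male", speed="moderate", pitch="moderate", quality="good"):
    """Assemble a voice description in the same shape as the example above.

    The defaults reproduce the description string used in the inference
    snippet; other values are untested assumptions.
    """
    return (
        f"A {gender} speaker delivers speech at a {speed} speed "
        f"with a {pitch} pitch. The recording is of {quality} quality."
    )


if __name__ == "__main__":
    # With the defaults, this prints the description used in the snippet above.
    print(build_description())
```

The returned string can be assigned to `description` in the inference code in place of the hard-coded sentence.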