IndicWav2Vec-Hindi

This is a Wav2Vec2-style ASR model for Hindi, trained in fairseq and ported to Hugging Face. More details on the datasets, training setup, and conversion to the Hugging Face format can be found in the IndicWav2Vec repo.
Note: This model does not support inference with a language model; the script below uses greedy decoding only.

Script to Run Inference

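import torch
import torchaudio.functional as F
from datasets import load_dataset
from transformers import AutoModelForCTC, AutoProcessor

DEVICE_ID = "cuda" if torch.cuda.is_available() else "cpu"
MODEL_ID = "ai4bharat/indicwav2vec-hindi"
TARGET_SAMPLE_RATE = 16000  # rate the audio is resampled to before inference

# Stream a single Hindi test sample from Common Voice
sample = next(iter(load_dataset("common_voice", "hi", split="test", streaming=True)))

# Resample from the clip's own sampling rate (48 kHz for Common Voice) to 16 kHz
waveform = torch.tensor(sample["audio"]["array"], dtype=torch.float32)
resampled_audio = F.resample(waveform, sample["audio"]["sampling_rate"], TARGET_SAMPLE_RATE).numpy()

model = AutoModelForCTC.from_pretrained(MODEL_ID).to(DEVICE_ID)
processor = AutoProcessor.from_pretrained(MODEL_ID)

# Convert the raw waveform into model inputs
input_values = processor(resampled_audio, sampling_rate=TARGET_SAMPLE_RATE, return_tensors="pt").input_values

# Forward pass without gradient tracking; logits have shape (batch, frames, vocab)
with torch.no_grad():
    logits = model(input_values.to(DEVICE_ID)).logits.cpu()

# Greedy decoding: pick the most likely token per frame, then let the processor collapse them
prediction_ids = torch.argmax(logits, dim=-1)
output_str = processor.batch_decode(prediction_ids)[0]
print(f"Greedy Decoding: {output_str}")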
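The last two steps of the script (argmax over the logits, then batch_decode) amount to greedy CTC decoding. The following is a minimal sketch of what that collapse step does conceptually, not the library's actual implementation. The function name ctc_greedy_collapse, the toy vocabulary, the blank id of 0, and the "|" word delimiter are all assumptions made for illustration; the real model's vocabulary and special tokens may differ.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_ids, id_to_token, blank_id=0, word_delimiter="|"):
    """Collapse per-frame token ids into text using the greedy CTC rule.

    Hypothetical helper for illustration only:
    1. merge consecutive repeated ids,
    2. drop the blank id,
    3. map ids to tokens and turn the word delimiter into spaces.
    """
    # Step 1: merge runs of identical ids (e.g. [2, 2, 0, 2] -> [2, 0, 2])
    merged = [token_id for token_id, _ in groupby(frame_ids)]
    # Step 2: drop blanks; a blank between two equal ids keeps them distinct
    tokens = [id_to_token[i] for i in merged if i != blank_id]
    # Step 3: join characters and replace the delimiter with a space
    return "".join(tokens).replace(word_delimiter, " ").strip()


# Toy vocabulary (assumed, not the model's real one)
toy_vocab = {0: "<pad>", 1: "|", 2: "a", 3: "b"}
print(ctc_greedy_collapse([2, 2, 0, 2, 3, 3, 1, 1, 3, 0], toy_vocab))  # prints "aab b"
```

Note that the blank between the two runs of id 2 is what allows the repeated "a" to survive the merge step; without it, "aa" would collapse to a single "a".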

About AI4Bharat

Website: https://ai4bharat.org/
Code: https://github.com/AI4Bharat
HuggingFace: https://huggingface.co/ai4bharat
