rop-bible-audio-aligned-speecht5

This model is a fine-tuned version of microsoft/speecht5_tts. The training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.0511

Model description

More information needed

Intended uses & limitations

More information needed
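The card provides no usage example. Below is a minimal inference sketch using the standard transformers SpeechT5 API. The zero-vector speaker embedding is a placeholder assumption: SpeechT5 normally expects a 512-dimensional x-vector for the target voice, and this checkpoint's intended embedding source is not documented here.

```python
# Hedged inference sketch for this checkpoint. The speaker embedding below is
# a placeholder; substitute a real 512-dim x-vector for the target speaker.
import torch
from transformers import SpeechT5ForTextToSpeech, SpeechT5HifiGan, SpeechT5Processor

MODEL_ID = "sil-ai/rop-bible-audio-aligned-speecht5"


def synthesize(text: str) -> torch.Tensor:
    """Return a 1-D waveform tensor (16 kHz) synthesized from `text`."""
    processor = SpeechT5Processor.from_pretrained(MODEL_ID)
    model = SpeechT5ForTextToSpeech.from_pretrained(MODEL_ID)
    vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

    inputs = processor(text=text, return_tensors="pt")
    # Placeholder speaker embedding (assumption) -- replace with an x-vector.
    speaker_embeddings = torch.zeros(1, 512)
    with torch.no_grad():
        waveform = model.generate_speech(
            inputs["input_ids"], speaker_embeddings, vocoder=vocoder
        )
    return waveform
```

The returned tensor can be written to disk at the model's 16 kHz sample rate (for example with soundfile or torchaudio).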

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 3407
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: fused AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 4000
  • training_steps: 40000
  • mixed_precision_training: Native AMP
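For reproduction, the hyperparameters above map onto the usual transformers Seq2SeqTrainingArguments keyword names roughly as follows. This is a sketch inferred from the card, not the actual training script, which is not published; note that the effective train batch size is 8 × 4 = 32.

```python
# Sketch of the training configuration implied by the card's hyperparameters.
# Keyword names follow transformers' Seq2SeqTrainingArguments conventions;
# treat this as an approximation of the unpublished training script.
training_kwargs = dict(
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=3407,
    gradient_accumulation_steps=4,  # effective train batch = 8 * 4 = 32
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=4000,
    max_steps=40000,
    fp16=True,  # "Native AMP" mixed-precision training
)

effective_batch = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch)  # 32
```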

Training results

Training Loss   Epoch     Step    Validation Loss
0.0787          1.1892     1000   0.0626
0.0745          2.3785     2000   0.0590
0.0701          3.5677     3000   0.0557
0.0689          4.7569     4000   0.0594
0.0667          5.9461     5000   0.0567
0.0664          7.1345     6000   0.0548
0.0644          8.3237     7000   0.0567
0.0631          9.5129     8000   0.0551
0.0614         10.7022     9000   0.0542
0.0607         11.8914    10000   0.0545
0.0612         13.0797    11000   0.0549
0.0582         14.2690    12000   0.0529
0.0600         15.4582    13000   0.0542
0.0590         16.6474    14000   0.0519
0.0563         17.8367    15000   0.0524
0.0570         19.0250    16000   0.0519
0.0569         20.2142    17000   0.0523
0.0559         21.4035    18000   0.0513
0.0563         22.5927    19000   0.0517
0.0554         23.7819    20000   0.0512
0.0550         24.9711    21000   0.0516
0.0537         26.1595    22000   0.0521
0.0537         27.3487    23000   0.0510
0.0544         28.5379    24000   0.0509
0.0540         29.7272    25000   0.0512
0.0520         30.9164    26000   0.0509
0.0517         32.1047    27000   0.0508
0.0522         33.2940    28000   0.0509
0.0516         34.4832    29000   0.0509
0.0518         35.6724    30000   0.0509
0.0503         36.8616    31000   0.0506
0.0516         38.0500    32000   0.0514
0.0514         39.2392    33000   0.0510
0.0524         40.4284    34000   0.0512
0.0497         41.6177    35000   0.0510
0.0508         42.8069    36000   0.0510
0.0512         43.9961    37000   0.0510
0.0491         45.1845    38000   0.0511
0.0512         46.3737    39000   0.0509
0.0504         47.5629    40000   0.0511
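Although the dataset size is not stated, the log table lets us back it out: 40000 optimizer steps correspond to about 47.56 epochs, so one epoch is roughly 841 steps, or about 27k examples at the effective batch size of 32. This is a rough estimate, assuming every step consumes a full batch.

```python
# Back-of-the-envelope estimate of the training-set size from the log table.
# Assumes each optimizer step consumes a full effective batch of 32 examples.
total_steps = 40000
final_epoch = 47.5629       # epoch recorded at step 40000
effective_batch_size = 32   # 8 per device * 4 gradient-accumulation steps

steps_per_epoch = total_steps / final_epoch
approx_examples = steps_per_epoch * effective_batch_size
print(round(approx_examples))  # roughly 27k training examples
```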

Framework versions

  • Transformers 4.57.1
  • PyTorch 2.8.0+cu128
  • Datasets 4.2.0
  • Tokenizers 0.22.2
Model format

  • Safetensors, ~0.1B parameters, F32 tensors
Model tree for sil-ai/rop-bible-audio-aligned-speecht5

  • Fine-tuned from microsoft/speecht5_tts