mt5-small-finetuned-amazon-en-es

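This model is a fine-tuned version of google/mt5-small. The card does not name the training dataset (the dataset field is recorded as None). It achieves the following results on the evaluation set, taken from the final epoch of the training results table:

Loss: 3.0253
Rouge1: 16.8311
Rouge2: 8.3984
Rougel: 16.5907
Rougelsum: 16.4502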

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 5.6e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 8
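
As a rough illustration, the hyperparameters above can be written as keyword arguments using the parameter names of `transformers.Seq2SeqTrainingArguments`. This is a sketch under stated assumptions, not the original training script: mapping the batch sizes to the per-device arguments assumes single-device training, and the warmup setting is assumed to be zero because the card does not list one. The helper `linear_lr` is a hypothetical name that mirrors how a linear schedule decays the learning rate to zero over the 9672 training steps reported in the results table.

```python
# Hyperparameters from the list above, keyed by argument names accepted by
# transformers.Seq2SeqTrainingArguments (pass as Seq2SeqTrainingArguments(
# output_dir=..., **training_kwargs) if transformers is installed).
training_kwargs = {
    "learning_rate": 5.6e-5,
    "per_device_train_batch_size": 8,  # assumption: single device
    "per_device_eval_batch_size": 8,   # assumption: single device
    "seed": 42,
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 8,
}

TOTAL_STEPS = 9672  # last step in the training results table (8 epochs x 1209 steps)
WARMUP_STEPS = 0    # assumption: the card lists no warmup setting


def linear_lr(step, base_lr=training_kwargs["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Print the scheduled learning rate at the start, midpoint, and end of training.
    for step in (0, 4836, 9672):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 5.6e-05, is roughly halved by the end of epoch 4, and reaches zero at the final step.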

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum
3.8823	1.0	1209	3.1692	16.7942	8.8845	16.2074	16.1565
3.5477	2.0	2418	3.0972	17.8143	9.0129	17.3397	17.2233
3.3697	3.0	3627	3.0647	16.1625	7.6303	15.8452	15.8262
3.265	4.0	4836	3.0604	16.6999	8.2476	16.4754	16.2985
3.1856	5.0	6045	3.0429	16.3401	7.7399	16.0389	15.8367
3.1291	6.0	7254	3.0369	17.2926	8.604	16.8836	16.7412
3.0953	7.0	8463	3.0254	17.3864	8.538	17.0815	16.9494
3.067	8.0	9672	3.0253	16.8311	8.3984	16.5907	16.4502
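
The table shows validation loss and ROUGE moving differently across epochs, so the choice of "best" checkpoint depends on the criterion. The sketch below copies the rows into plain Python tuples and picks the best epoch under two criteria; the variable names are illustrative and not part of the original training code.

```python
# Rows copied from the training results table above:
# (epoch, validation_loss, rouge1, rouge2, rougel, rougelsum)
results = [
    (1, 3.1692, 16.7942, 8.8845, 16.2074, 16.1565),
    (2, 3.0972, 17.8143, 9.0129, 17.3397, 17.2233),
    (3, 3.0647, 16.1625, 7.6303, 15.8452, 15.8262),
    (4, 3.0604, 16.6999, 8.2476, 16.4754, 16.2985),
    (5, 3.0429, 16.3401, 7.7399, 16.0389, 15.8367),
    (6, 3.0369, 17.2926, 8.6040, 16.8836, 16.7412),
    (7, 3.0254, 17.3864, 8.5380, 17.0815, 16.9494),
    (8, 3.0253, 16.8311, 8.3984, 16.5907, 16.4502),
]

# Lowest validation loss (column index 1).
best_by_loss = min(results, key=lambda row: row[1])

# Highest ROUGE-1 (column index 2).
best_by_rouge1 = max(results, key=lambda row: row[2])

print("Best epoch by validation loss:", best_by_loss[0])
print("Best epoch by ROUGE-1:", best_by_rouge1[0])
```

Validation loss is lowest at epoch 8, while all four ROUGE scores peak at epoch 2. If checkpoint selection by ROUGE is preferred, `TrainingArguments` offers `load_best_model_at_end` and `metric_for_best_model`; whether the original run used them is not stated in the card.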

Model size: 0.3B params (Safetensors, tensor type F32)
