Upload model.safetensors.index.json
Hi there!
While trying to load PORTULAN/gervasio-8b-portuguese-ptpt-decoder I ran into this error:
File ".../transformers/modeling_utils.py", line 1243, in _get_resolved_checkpoint_files
raise OSError(
OSError: PORTULAN/gervasio-8b-portuguese-ptpt-decoder does not appear to have a file named
pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt or flax_model.msgpack.
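For context, this is roughly the call that triggers the error (a minimal sketch; apart from the model ID, everything here is just standard from_pretrained usage):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PORTULAN/gervasio-8b-portuguese-ptpt-decoder"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Raises the OSError above while the repository has the .safetensors shards
# but no model.safetensors.index.json describing them.
model = AutoModelForCausalLM.from_pretrained(model_id)
```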
After a bit of digging I discovered the cause:
the weights are split across four .safetensors shards, but the repository doesn’t include the
index file (model.safetensors.index.json) that tells HF Transformers how to stitch those shards
together.
This PR simply adds that index file. Feel free to regenerate it yourself or keep this one—either
way the model loads correctly once the file is present. 🚀
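In case it is useful, here is a minimal sketch of how such an index can be regenerated from the shards themselves (it assumes the shards have been downloaded locally and follow the standard model-0000X-of-00004.safetensors naming; the glob pattern and output path are just placeholders):

```python
import glob
import json

from safetensors import safe_open

weight_map = {}
total_size = 0

for shard in sorted(glob.glob("model-*-of-*.safetensors")):
    with safe_open(shard, framework="pt") as f:
        for name in f.keys():
            # Record which shard each tensor lives in.
            weight_map[name] = shard
            tensor = f.get_tensor(name)
            # Count tensor bytes (parameter count times dtype size).
            total_size += tensor.numel() * tensor.element_size()

index = {"metadata": {"total_size": total_size}, "weight_map": weight_map}
with open("model.safetensors.index.json", "w") as fp:
    json.dump(index, fp, indent=2)
```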
Follow-up question
Do you plan to release a Brazilian-Portuguese variant (something like gervasio-8b-portuguese-ptbr-decoder), similar to what you did for the 7B model?
Or does the 8B checkpoint already cover both pt-PT and pt-BR?
I noticed the 7B models were deprecated and was a bit unsure about the naming.
Thanks—and great work on these models!
Thank you for reporting the error. I missed that file when uploading the model to Hugging Face.
I have now included the original model.safetensors.index.json in the repository. For some reason the value in metadata/total_size is slightly different from the one in your file. Maybe it was computed in a different way.
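Just a guess about the total_size difference: it may depend on whether the number is taken from the tensors themselves or from the shard files on disk, since the files also include the safetensors headers and therefore come out slightly larger. A quick way to compare the two, assuming the shards and the index file are local (file names below are the standard ones):

```python
import glob
import json
import os

# Value stored in the index file.
with open("model.safetensors.index.json") as fp:
    index_total = json.load(fp)["metadata"]["total_size"]

# Sum of the shard file sizes on disk (includes the safetensors headers).
file_total = sum(os.path.getsize(p) for p in glob.glob("model-*-of-*.safetensors"))

print(index_total, file_total, file_total - index_total)
```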
Thanks once again.
About your follow-up question:
Since there are several pt-BR models available on Hugging Face, we are currently focusing on pt-PT models and have no plans to release pt-BR versions of Gervásio soon. However, I wouldn't be surprised if the current 8B Gervásio pt-PT performed better than the old/deprecated 7B Gervásio pt-BR on pt-BR tasks/datasets.
Best regards and thanks for your interest in Gervásio models!
Luís