Upload model.safetensors.index.json
Hi there!
While trying to load PORTULAN/gervasio-8b-portuguese-ptpt-decoder I ran into this error:
File ".../transformers/modeling_utils.py", line 1243, in _get_resolved_checkpoint_files
raise OSError(
OSError: PORTULAN/gervasio-8b-portuguese-ptpt-decoder does not appear to have a file named
pytorch_model.bin, model.safetensors, tf_model.h5, model.ckpt or flax_model.msgpack.
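For context, this is roughly the call that triggers the error (a minimal sketch; apart from the model ID, everything here is just standard from_pretrained usage):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PORTULAN/gervasio-8b-portuguese-ptpt-decoder"

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Raises the OSError above while the repository has the .safetensors shards
# but no model.safetensors.index.json describing them.
model = AutoModelForCausalLM.from_pretrained(model_id)
```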
After a bit of digging I discovered the cause:
the weights are split across four .safetensors shards, but the repository doesn’t include the
index file (model.safetensors.index.json) that tells HF Transformers how to stitch those shards
together.
This PR simply adds that index file. Feel free to regenerate it yourself or keep this one—either
way the model loads correctly once the file is present. 🚀
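In case it is useful, here is a minimal sketch of how such an index can be regenerated from the shards themselves (it assumes the shards have been downloaded locally and follow the standard model-0000X-of-00004.safetensors naming; the glob pattern and output path are just placeholders):

```python
import glob
import json

from safetensors import safe_open

weight_map = {}
total_size = 0

for shard in sorted(glob.glob("model-*-of-*.safetensors")):
    with safe_open(shard, framework="pt") as f:
        for name in f.keys():
            # Record which shard each tensor lives in.
            weight_map[name] = shard
            tensor = f.get_tensor(name)
            # Count tensor bytes (parameter count times dtype size).
            total_size += tensor.numel() * tensor.element_size()

index = {"metadata": {"total_size": total_size}, "weight_map": weight_map}
with open("model.safetensors.index.json", "w") as fp:
    json.dump(index, fp, indent=2)
```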
Follow-up question
Do you plan to release a Brazilian-Portuguese variant (something like gervasio-8b-portuguese-ptbr-decoder), similar to what you did for the 7B model?
Or does the 8B checkpoint already cover both pt-PT and pt-BR?
I noticed the 7B models were deprecated and was a bit unsure about the naming.
Thanks—and great work on these models!
Thank you for reporting the error. I missed that file when uploading the model to Hugging Face.
I have now included the original model.safetensors.index.json in the repository. For some reason the value in metadata/total_size is slightly different from the one in your file. Maybe it was computed in a different way.
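Just a guess about the total_size difference: it may depend on whether the number is taken from the tensors themselves or from the shard files on disk, since the files also include the safetensors headers and therefore come out slightly larger. A quick way to compare the two, assuming the shards and the index file are local (file names below are the standard ones):

```python
import glob
import json
import os

# Value stored in the index file.
with open("model.safetensors.index.json") as fp:
    index_total = json.load(fp)["metadata"]["total_size"]

# Sum of the shard file sizes on disk (includes the safetensors headers).
file_total = sum(os.path.getsize(p) for p in glob.glob("model-*-of-*.safetensors"))

print(index_total, file_total, file_total - index_total)
```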
Thanks once again.
About your follow-up question:
Since there are several pt-BR models available on Hugging Face, we are currently focusing on pt-PT models and have no plans to release pt-BR versions of Gervásio soon. However, I wouldn't be surprised if the current 8B Gervásio pt-PT performed better than the old/deprecated 7B Gervásio pt-BR on pt-BR tasks/datasets.
Best regards and thanks for your interest in Gervásio models!
Luís