Different max_position_embeddings and rope_theta in OpenR1-Qwen-7B-SFT and its base Qwen2.5-Math-7B-Instruct?
#3
by
zhuzhuyue - opened
How can I use the open-r1 project to reproduce this model?
I found that "max_position_embeddings" of Qwen2.5-Math-7B-Instruct is 4096 and "rope_theta" is 10000.0, but in OpenR1-Qwen-7B, "max_position_embeddings" is 32768 and "rope_theta" is 300000.0. Why are these values different, and how were they obtained?
If I use the demo config at recipes/openr1-qwen-7b/sft/config.yaml, the values stay the same as the base model's.
Hello, you need to:
- download the model locally (for instance with `huggingface-cli download Qwen/Qwen2.5-Math-7B-Instruct --local-dir <your-local-directory>`)
- change `rope_theta` and `max_position_embeddings` in your local folder (in the `config.json` file)
- replace `model_name_or_path` in the `config.yaml` with the path to the local model
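The second step above can be scripted instead of editing `config.json` by hand. Below is a minimal sketch; the helper name `patch_rope_config` is made up for illustration, and the default values are the ones reported for OpenR1-Qwen-7B in this thread.

```python
import json
from pathlib import Path


def patch_rope_config(model_dir, rope_theta=300000.0, max_position_embeddings=32768):
    """Overwrite rope_theta and max_position_embeddings in a local
    model folder's config.json (hypothetical helper, not part of open-r1).

    The defaults match the values observed in OpenR1-Qwen-7B's config.
    """
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text())
    config["rope_theta"] = rope_theta
    config["max_position_embeddings"] = max_position_embeddings
    # Write the patched config back so the trainer picks up the new values.
    config_path.write_text(json.dumps(config, indent=2))
    return config
```

After running this on your local copy, point `model_name_or_path` in the recipe's `config.yaml` at that folder and launch SFT as usual.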