Transformers
Safetensors
unsloth

tau train medium 100 sample

🏆 Average Reward: 0.5175

📈 Pass^k Metrics:
k=1: 0.518
k=2: 0.421

tau train medium 200 sample

🏆 Average Reward: 0.5526

📈 Pass^k Metrics:
k=1: 0.553
k=2: 0.482

tau train medium 300 sample

🏆 Average Reward: 0.5175

📈 Pass^k Metrics:
k=1: 0.518
k=2: 0.421
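
For readers unfamiliar with the Pass^k numbers reported above, the following is a minimal sketch of how such a metric is commonly estimated. It assumes the tau-bench style definition: for each task with c successful trials out of n, the chance that k trials drawn without replacement all succeed is C(c, k) / C(n, k), and the result is averaged over tasks. The function name `pass_hat_k` and the sample data are hypothetical and are not taken from this model's evaluation logs.

```python
from math import comb


def pass_hat_k(successes_per_task, num_trials, k):
    """Estimate pass^k as the mean over tasks of C(c, k) / C(n, k).

    successes_per_task: number of successful trials for each task (hypothetical data).
    num_trials: trials run per task (n).
    k: number of trials that must all succeed.
    """
    if not 1 <= k <= num_trials:
        raise ValueError("k must satisfy 1 <= k <= num_trials")
    if not successes_per_task:
        raise ValueError("at least one task is required")
    total = comb(num_trials, k)
    # comb(c, k) is 0 when c < k, so tasks with too few successes contribute 0.
    return sum(comb(c, k) / total for c in successes_per_task) / len(successes_per_task)


# Hypothetical example: 4 tasks, 2 trials each.
successes = [2, 1, 0, 2]
print(pass_hat_k(successes, num_trials=2, k=1))  # 0.625
print(pass_hat_k(successes, num_trials=2, k=2))  # 0.5
```

Under this definition k=1 reduces to the plain success rate, which matches the way each k=1 value above equals the rounded Average Reward, and larger k can only lower the score because every one of the k trials must succeed.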


Dataset used to train amityco/Qwen3-4B-Thinking-2507-tau-train-v0.7-300