rvienne/layton-eval
All layton-eval related datasets
Note: Dataset containing the layton-eval riddles.
Note: Dataset containing everything needed to compute the PPI-based benchmark score.
Note: Final benchmark results for several frontier models.