timhua/1p_1000_examples_8b-threshold_0.57-RM-n_examples_100-probe_linear_layers_10 Text Generation • 8B • Updated 23 days ago • 15
Steering Evaluation-Aware Language Models to Act Like They Are Deployed Paper • 2510.20487 • Published Oct 23 • 1