kureha295/deepseek-ai-DeepSeek-R1-Distill-Llama-8B-ortho-baseline-layer-11 • 8B • Updated 15 days ago • 11
kureha295/deepseek-ai-DeepSeek-R1-Distill-Qwen-7B-ortho-baseline-layer-17 • 8B • Updated 15 days ago • 10
Bochkov/growing-transformers-model-frozen-16-bit-baseline-monolyth-181m • Text Generation • Updated 5 days ago • 20
Bochkov/growing-transformers-model-frozen-unicode-baseline-monolyth-247m • Text Generation • Updated 5 days ago • 17
Bochkov/growing-transformers-model-unfrozen-baseline-monolyth-247m • Text Generation • Updated 5 days ago • 10