Article: The Heterogeneous Feature of RoPE-based Attention in Long-Context LLMs • Published 3 days ago
Paper: LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls • 2511.09148 • Published 6 days ago
Paper: Routing Manifold Alignment Improves Generalization of Mixture-of-Experts LLMs • 2511.07419 • Published 8 days ago
Paper: Too Good to be Bad: On the Failure of LLMs to Role-Play Villains • 2511.04962 • Published 11 days ago
Collection: SYNTH • Fully generalist synthetic dataset and SOTA small reasoners • 3 items • Updated 8 days ago
Collection: Common Pile v0.1 • All resources related to Common Pile v0.1, an 8 TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6
Model: autoweeb/Qwen-Image-Edit-2509-Photo-to-Anime • Image-to-Image • Updated 7 days ago
Space: The Ultra-Scale Playbook 🌌 • The ultimate guide to training LLMs on large GPU clusters