So far so good but the CoT rambling is way too much + question

#2
by SerialKicked - opened

Congrats on the release! I like the writing style; it feels a lot less artificial than most other recent models. The model seems okay for its size, though obviously I haven't had a chance to really test it much yet. But so far I like it.

My only pet peeve is the CoT. Either I'm using the wrong sampling settings (I tested a few) or you haven't tuned against endless / looping CoT. It too often runs to a thousand tokens (sometimes several thousand) for simple tasks, which is really wasteful. I thought Qwen was bad, but this is a whole level above it. Was it trained with a native way to disable CoT, like /nothink in Qwen? Normally I'd prefill the opening and closing think tags with nothing in between, but sadly that doesn't work most of the time with your model (it produces a blank generation). Or is there a reasoning "effort" setting?
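For clarity, this is roughly what I mean by prefilling empty think tags; a minimal sketch with transformers, where the model id and the exact tag strings are just placeholders for whatever the real chat template uses:

```python
# Minimal sketch of the "empty think tags" prefill (model id and tag strings
# are placeholders; adjust to the actual chat template).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-32b-reasoning-model"  # hypothetical id
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What's the capital of France?"}]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
prompt += "<think>\n\n</think>\n\n"  # pre-close the reasoning block

inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```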

Another thing: looking at the jinja template, you have an "environment" role set up. Was it used during training, and if so, what's its purpose? Is it a form of system message?

Edit: Prefilling responses with the line below seems to help modulate the reasoning effort. No idea how damaging it is to the CoT quality, but it's worth sharing.

```
<think>\nOkay, I'll keep my thinking short.
```
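In case anyone wants to reproduce it, here's roughly how I inject that prefill through a local OpenAI-compatible completions endpoint (the URL, model name, and chat markup are just my local setup, not anything from the official template):

```python
# Rough sketch of the "keep my thinking short" prefill against a local
# OpenAI-compatible server (llama.cpp / vLLM style). URL, model name, and
# chat markup below are assumptions about my own setup.
import requests

PREFILL = "<think>\nOkay, I'll keep my thinking short."

def ask(question: str) -> str:
    # Placeholder chat markup; swap in the model's real template tokens.
    prompt = f"<|user|>\n{question}\n<|assistant|>\n{PREFILL}"
    resp = requests.post(
        "http://localhost:8080/v1/completions",
        json={
            "model": "local-32b",
            "prompt": prompt,
            "max_tokens": 512,
            "temperature": 0.6,
        },
        timeout=120,
    )
    return PREFILL + resp.json()["choices"][0]["text"]

print(ask("Summarize the plot of Hamlet in two sentences."))
```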

We're working on this for future versions, @SerialKicked! I agree it's a big-time yapper, especially for easy queries. Its verbosity is more in line with other reasoning models on math, coding, and reasoning queries.

natolambert changed discussion status to closed
natolambert changed discussion status to open

For what it's worth, the longer the total context is, the worse it seems to get. It's not as bad in one-shot interactions.

Still, nice work; it's quite an achievement. A fully open 32B CoT model wasn't on my bingo card this year. Good luck to you all.
