daily driver

#3
by starsinwinter - opened

Good at everything, including graphic NSFW, more so than models advertised for it, in my experience. I almost wish it wasn't this good, since it just can't match the tone of some settings because of how anime it is, but other models can't match its strengths.

Thank you for the kind words! Glad it's working well.

Not to spam your board, but: same. I've tried most Painted* models, and this 2026 version is really solid.

Adding back some actual logic / assistant prompts into the dataset really helped. While your previous PaintedFantasy models were quite decent prose-wise for this size, they were RP-only models with little ability to extend to more mixed use cases. This one feels a LOT more solid and consistent. It passed my usual testing bench (summarization, ad hoc function calling, generating web queries from a chat session, completing existing RP chatlogs, and so on). It's also less prone (though not immune) to massive hallucinations.

Good job, the first good fine-tune (that I'm aware of) of the year in my book.

Always good to hear more feedback!

The assistant data has definitely made a difference in making the model more stable.

Out of curiosity, how does v4 compare (good or bad)? That's the only model I've released so far this year (v3 was actually last year), it has a few experimental changes in it as always. One of them is the model seeing more assistant data during both stages of the training.

Oh shit! I just realized I'm writing in v3's board, not v4's, lol. I would have made a new post otherwise. 😅

My bad! Oops! So, yeah... I was talking about v4.

Just for reference, I'm using a very atypical setup: it's running on my own private front-end, at Q6K with 24K context. The models also have to contend with very long system prompts and with system messages scattered all over the chatlog, so you shouldn't really generalize from my experience here. I also haven't tried toggling on "thinking" / CoT mode yet.

To me, I'd classify the notable MS24 2025 models like this:
- Painted 3 = strong roleplay, but lower "intelligence"
- Pinecone Sage = superior intelligence, but not that great at roleplay (which is unexpected given it's a merge)
- Cydonia (the good releases; it was quite uneven last year) = middle ground

This time it seems closer to the middle ground. It's working much better than v3 in the mixed 'assistant / RP partner / agent' role I normally use models for.

I don't think I've caught it falling into the "Oh $User..." repetition pattern either, which is normally so common in Mistral models that I added a regex to delete it in my front-end :D Might just be a matter of time, though.
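For anyone curious, a filter like the one described above is simple to sketch. This is a hypothetical minimal version, not the commenter's actual regex: it assumes the front-end has already substituted the user's name (here `USER_NAME = "Alice"` as a placeholder) and that the interjection appears at the start of a line.

```python
import re

# Placeholder: whatever name your front-end substitutes for $User.
USER_NAME = "Alice"

# Matches an "Oh <name>..." / "Oh, <name>," opener at the start of a line.
OH_USER_RE = re.compile(
    rf"^Oh,?\s+{re.escape(USER_NAME)}\s*(?:\.\.\.|…|,)\s*",
    flags=re.IGNORECASE | re.MULTILINE,
)

def strip_oh_user(text: str) -> str:
    """Delete the repetitive 'Oh $User...' opener from model output."""
    return OH_USER_RE.sub("", text)
```

Example: `strip_oh_user("Oh Alice... you never listen.")` returns `"you never listen."`, while text without the opener passes through unchanged.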

I'd have to use it exclusively for a bit to be sure it's not a fluke, but behavior-wise, yeah, a few things jumped out at me. One of the characters I built for the app is supposed to be stubborn. Under v3, this trait wouldn't actually do anything in practice. In v4, it works a lot better: going through the same "dialogue tree", v3 changed his mind, but v4 didn't.

Language and descriptions also seem more direct. v3 was very verbose, and the signal-to-noise ratio was a bit too low (for me at least). Here, it was just fine, at least in my limited experience.

That's about all I can say for now :)
