repetition
Did anyone notice bad repetition when roleplaying? Phrases are repeated across multiple messages. I've tried many sampler settings, including the recommended ones with presence penalty. I've also heard of many other people having this issue, so it seems like a model problem.
Yes, the problem is too severe to keep going in vLLM. I'll temporarily switch back to QwQ-32B until a fix is available.
Repetition always happens when I use the model on long contexts with YaRN enabled. Did you also use YaRN?
Unusably repetitive at just 8k context, no matter what settings are used. It refuses to drive the story, just repeats the same descriptions/dialogue with a few subtle changes, and reuses the same words and phrases from one story in another despite a different setting and scenario. It also forgets details immediately, even when nowhere near the max context length: a character is drinking a silver liquid out of a tumbler, then the liquid is amber, then they're drinking out of a wineglass; an object that was placed in a desk falls out of a character's pocket, etc.
Similar observation here. If the prompt and expected output are too long, it starts to repeat; it doesn't repeat for short prompts. I wonder if it's something wrong with RoPE + YaRN. Does anyone have an idea how to fix it?
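For anyone double-checking their setup: here's a minimal sketch of how static YaRN is typically enabled in vLLM. The model name and scaling values are assumptions taken from a typical model card, not a confirmed fix. Also worth noting: static YaRN applies the scaling factor to all inputs, including short ones, and some model cards warn this can degrade quality, so if you don't actually need long context, try serving without it and see whether the repetition goes away.

```python
from vllm import LLM

# Sketch only: static YaRN rope scaling in vLLM.
# Model name and scaling values are assumptions -- take the real ones
# from your model's card. Drop rope_scaling entirely for short-context runs.
llm = LLM(
    model="Qwen/Qwen3-32B",  # hypothetical; substitute your model
    max_model_len=131072,
    rope_scaling={
        "rope_type": "yarn",
        "factor": 4.0,  # roughly target_len / original_len
        "original_max_position_embeddings": 32768,
    },
)
```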
Any fix? This is really annoying for me.
Minimize the context window to clear the pattern / add variance. I've lowered the context window down to a minimal number of exchanges/tokens (5 exchanges / 75 tokens), did a few turns of breaking the pattern in different ways, then added the context back in slowly (if needed). If the UI allows, break off from the point of repetition or erase the repeating messages (a sketch of the trimming step is below).
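A minimal sketch of that trimming step, assuming an OpenAI-style messages list (the function name and the 5-exchange cutoff are just illustrations):

```python
# Sketch: keep the system prompt plus only the last N user/assistant
# exchanges, so the repeating pattern falls out of context.
def trim_history(messages, keep_exchanges=5):
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]
    # One exchange = one user message + one assistant reply.
    return system + turns[-2 * keep_exchanges:]
```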
Break the repetition pattern - identify what it's locked into. Do a few turns asking it to produce output in varying structures:
- Forced structure -> "Give me a grocery list for lasagna in paragraph form"
- Disrupt measurement pattern - demand the unmeasurable -> "Tell me the texture of waiting in one line"
- Pull into pure contradiction - break the logic chain -> "Hold still and run simultaneously"
Parameter settings: models that repeat seem to need temperature in the 0.6-0.8 range, top_p 0.9-1.0, a low frequency penalty (~0.1), and a repetition penalty within 1-2. Refer to the model card; an example request is sketched below.
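Roughly what that looks like as a request against an OpenAI-compatible endpoint such as vLLM's server. This is a sketch under assumptions: the base URL, model name, and exact values are placeholders, so check your model card for the real numbers.

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

resp = client.chat.completions.create(
    model="Qwen/Qwen3-32B",  # hypothetical model name
    messages=[{"role": "user", "content": "Continue the scene."}],
    temperature=0.7,         # within 0.6-0.8
    top_p=0.95,              # within 0.9-1.0
    frequency_penalty=0.1,   # light frequency penalty
    # repetition_penalty is a vLLM-specific extension passed via extra_body;
    # the tip's range is 1-2, but values near 1.0 are usually safest.
    extra_body={"repetition_penalty": 1.1},
)
print(resp.choices[0].message.content)
```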
Soften overly strong/absolute commands like "You must be concise" or "Always do XX" - these get interpreted too literally and the model gets stuck on the concept. Add a line that allows variance, like "Provide varied and engaging responses."
Negation "echo chamber" - Negation often reinforces the very content you’re trying to avoid. The model latches onto the semantic content, drops the negation and creates an echo chamber. Reframe positively.
- LLMs don't understand "absence" and cannot truly process negation
- They're required to generate tokens - they can't generate "nothing"
- Reward/attention/training mechanisms focus on and are biased toward tokens that are present ("toward an event happening"), not absent ones
"Don't be scared" → model hears "be scared"
"She didn't run" → Model thinks "she ran"
"It did not happened" → Model generates "it happened"