Systematic Logic Failures (Negation Blindness, Semantic Drift, & Prior Overwrite)

#3
by Repaltoofficial - opened

I’ve been stress-testing the model on Logical Constraint Adherence and Instruction Following. While the visual fidelity (lighting, texture) is high, I found three consistent logic failures where the model prioritizes training priors over prompt instructions.

Here is the breakdown of the failures I mapped today:

1. Negation Blindness ("Without" Bug):
The model treats "without" as a stop word and ignores the negative constraint.
Test: "Cup without handle" → Generated a handle.
Test: "Elephant without tail" → Generated a tail.

2. Semantic Drift (Association Bias):
The model struggles to decouple specific objects from their most common training pairs.
Test: "Xbox 360 Console" → Generated a Controller (likely due to high co-occurrence in the dataset).

3. Prior Overwrite (Anatomical Rigidity):
The model's internal priors override explicit numerical instructions.
Test: "Cat with 3 legs" → Generated a standard 4-legged anatomy, refusing to deviate from the biological norm.

The Diagnosis:
It appears the attention mechanism is latching onto noun tokens (Handle, Tail, Xbox, Cat) while bypassing modifier tokens (Without, Console, 3 legs).

My team at Repalto specializes in creating adversarial datasets to fix these specific logic gaps. We can construct a targeted "Constraint Logic" benchmark to help fine-tune this behavior. Happy to share that data if useful.
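For illustration only, here is a minimal sketch of how the four test prompts above could be encoded as checkable benchmark cases. It is not the actual Repalto dataset format. The `ConstraintCase` class, the category names, and the idea of scoring against per-image object counts (for example from an object detector or a human annotator) are all assumptions made for this example. The Xbox case also assumes the intended result is one console and no controller.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class ConstraintCase:
    """One prompt plus the object counts a correct image must show (hypothetical schema)."""
    prompt: str
    category: str
    exact_counts: dict  # label -> exact number of instances required in the image


# The four prompts from this post, encoded as constraints.
CASES = [
    ConstraintCase("Cup without handle", "negation", {"handle": 0}),
    ConstraintCase("Elephant without tail", "negation", {"tail": 0}),
    # Assumption: a correct image shows one console and no controller.
    ConstraintCase("Xbox 360 Console", "semantic_drift", {"console": 1, "controller": 0}),
    ConstraintCase("Cat with 3 legs", "prior_overwrite", {"leg": 3}),
]


def passes(case, detected):
    """True if every constrained label appears exactly the required number of times.

    `detected` maps label -> count for one generated image; absent labels count as 0.
    """
    return all(detected.get(label, 0) == n for label, n in case.exact_counts.items())


def pass_rate_by_category(cases, detections):
    """Fraction of passing cases per category.

    `detections` maps prompt -> {label: count}. A prompt with no entry is scored
    as a failure, so a missing annotation never counts as a pass.
    """
    totals = defaultdict(int)
    passed = defaultdict(int)
    for case in cases:
        totals[case.category] += 1
        detected = detections.get(case.prompt)
        if detected is not None and passes(case, detected):
            passed[case.category] += 1
    return {category: passed[category] / totals[category] for category in totals}


# Hand-written counts that mirror the failures described in this post.
# These are NOT measured detector output.
reported = {
    "Cup without handle": {"cup": 1, "handle": 1},
    "Elephant without tail": {"elephant": 1, "tail": 1},
    "Xbox 360 Console": {"controller": 1},
    "Cat with 3 legs": {"cat": 1, "leg": 4},
}

if __name__ == "__main__":
    print(pass_rate_by_category(CASES, reported))
```

Scoring on exact counts keeps all three failure types in one schema: negation becomes "count must be 0", and the numeric case becomes "count must be 3". How the counts are obtained is left open here, since that depends on the evaluation setup.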
