Arcade-3B: SLM Optimization via Orthogonal Decoupling of Latent State Spaces

Community Article Published March 15, 2026

In parameter-constrained Small Language Models (SLMs), it is often difficult for the model to effectively distinguish between "task state representation" and "underlying logical constraints" within high-dimensional search spaces. Traditional fine-tuning methods frequently lead to coupling conflicts between these two in the Latent Space, which limits the model's convergence ceiling.

Arcade-3B introduces the SC-OrthFine architecture, with the core objective of achieving decoupling of state-space search: it forcibly projects the model's search behavior into mutually orthogonal State Vectors and Constraint Vectors.

1. The Coupling Dilemma in State-Space Search

In a 3B-scale model, the hidden state output $H$ carries extremely high information density. During gradient backpropagation, the traditional cross-entropy loss $L_{ce}$ adjusts weights indiscriminately to fit the target distribution. However, when handling logical reasoning (e.g., GSM8K) or code generation (e.g., HumanEval), the model must simultaneously process:

  1. Semantic State ($S$): generating the contextual representation of the current token.
  2. Logical Constraints ($C$): adhering to syntax, mathematical rules, and long-range structural dependencies.

When these two overlap on the same manifold, the search behavior suffers significant interference.


2. SC-Orthogonal: Orthogonal Projection Decoupling Mechanism

To address the issues mentioned above, we designed the SC-Orthogonal Optimization Loop. Its core logic involves splitting the hidden state $H \in \mathbb{R}^{B \times L \times D}$ along the feature dimension to define two independent subspaces:

  • State Projection Half (State Half, $S$): focuses on the feature representation for instantaneous prediction.
  • Constraint Projection Half (Constraint Half, $C$): carries global logical boundaries and structural constraints.
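The split described above can be sketched in a few lines of PyTorch. This is an illustrative reconstruction, not the released Arcade-3B code; the tensor shapes follow the $H \in \mathbb{R}^{B \times L \times D}$ definition, and the toy sizes are arbitrary:

```python
import torch

# Toy dimensions: batch, sequence length, hidden size (D must be even).
B, L, D = 2, 8, 16
H = torch.randn(B, L, D)  # stand-in for the model's hidden state output

# Split H along the feature dimension into the State half S
# and the Constraint half C, each of width D/2.
S, C = H.split(D // 2, dim=-1)

print(S.shape, C.shape)  # torch.Size([2, 8, 8]) torch.Size([2, 8, 8])
```

Concatenating `S` and `C` back along the last dimension recovers `H` exactly, so the split itself adds no parameters; the decoupling comes entirely from the loss term defined next.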

Mathematical Definition and Loss Function

To ensure the decoupling of search behavior, we introduce an orthogonality constraint. By minimizing the inner product of $S$ and $C$, we force them to maintain $90^\circ$ orthogonality in a geometric sense:

$$\text{Dot} = S \cdot C = \sum_{i=1}^{D/2} S_i C_i$$

To implement this constraint during the training process, we define the orthogonality loss function $L_{orth}$:

$$L_{orth} = \frac{1}{B \cdot L} \sum_{b,l} \left(S_{b,l} \cdot C_{b,l}\right)^2$$

The final joint optimization objective function is:

$$L_{total} = L_{ce} + \lambda \cdot L_{orth}$$

By introducing the orthogonal penalty term regulated by $\lambda$, the model is forced to perform parameter searches within mutually independent subspaces, thereby avoiding feature collapse.
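The two loss terms above can be combined in a short PyTorch sketch. This is a minimal illustration of the formulas, assuming a standard token-level cross-entropy over logits; the function and variable names (`orthogonality_loss`, `lam`) are my own, not from the Arcade-3B release:

```python
import torch
import torch.nn.functional as F

def orthogonality_loss(H: torch.Tensor) -> torch.Tensor:
    """L_orth = mean over (b, l) of (S_{b,l} . C_{b,l})^2,
    where S and C are the two halves of H along the feature dim."""
    S, C = H.split(H.size(-1) // 2, dim=-1)
    dot = (S * C).sum(dim=-1)   # inner product at each (b, l) position
    return dot.pow(2).mean()    # average over batch and sequence

# Toy tensors standing in for model outputs (V = vocab size).
B, L, D, V = 2, 8, 16, 100
H = torch.randn(B, L, D, requires_grad=True)
logits = torch.randn(B, L, V, requires_grad=True)
targets = torch.randint(0, V, (B, L))

lam = 0.1  # illustrative value for the penalty weight lambda
l_ce = F.cross_entropy(logits.reshape(-1, V), targets.reshape(-1))
l_total = l_ce + lam * orthogonality_loss(H)
l_total.backward()  # gradients flow through both loss terms
```

Since $L_{orth}$ is a squared quantity, it is non-negative and vanishes exactly when every per-position inner product $S_{b,l} \cdot C_{b,l}$ is zero, i.e., when the two halves are pointwise orthogonal.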


3. Experimental Analysis: Performance Gains from Decoupling

Experimental results indicate that this state-space decoupling is particularly prominent in logic-intensive tasks:

  • Robustness in Logical Reasoning: On the GSM8K benchmark, Arcade-3B achieves an accuracy of 62.9%, indicating that the orthogonal constraint helps the model isolate mathematical logical constraints from language-generation states, reducing "hallucination" interference during reasoning.
  • Coding Efficiency: On HumanEval, the score of 41.5% significantly outperforms same-scale models that do not employ orthogonal decoupling (e.g., Qwen1.5-1.8B at 27.4%), suggesting that orthogonal subspaces offer higher search efficiency for complex structured data.

| Benchmark | Arcade-3B | Gemma-2-2B | Llama-2-7B |
|-----------|-----------|------------|------------|
| MMLU      | 52.9%     | 52.4%      | 45.3%      |
| GSM8K     | 62.9%     | 50.9%      | 14.6%      |
| HumanEval | 41.5%     | 32.3%      | 12.8%      |

Conclusion

The technical path of Arcade-3B demonstrates that for small-parameter models, simply increasing data volume or distilling logits from a larger teacher is insufficient. Through the underlying mathematical constraints of SC-OrthFine, decoupling the state-space search from a geometric perspective is an effective means of enhancing a model's "logical density."
