# mdts-circuit-mlp-only

Circuit MLP-only fine-tuned cross-encoder: only the MLP output sublayers of layers 8-11 are trained (2.36M trainable parameters). This is Strategy A from the MDTS circuit fine-tuning experiments on SciFact.
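The selective-training setup can be sketched as follows. This is an illustrative reconstruction, not the authors' training code; it assumes 0-based layer indices and that "MLP output sublayer" means each transformer layer's `output.dense` projection, which is consistent with the reported 2.36M parameter count (4 × (1536 × 384 + 384) weights for this architecture):

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "cross-encoder/ms-marco-MiniLM-L-12-v2"
)

# Freeze everything, then unfreeze only the MLP output projections
# of encoder layers 8-11 (Strategy A; layer indices assumed 0-based).
for param in model.parameters():
    param.requires_grad = False

for layer_idx in (8, 9, 10, 11):
    mlp_out = model.bert.encoder.layer[layer_idx].output.dense
    for param in mlp_out.parameters():
        param.requires_grad = True

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable params: {trainable / 1e6:.2f}M")
```

Everything else (attention, intermediate MLP projections, embeddings, classifier head) stays frozen, so the optimizer touches only ~7% of the model.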

## Base Model

`cross-encoder/ms-marco-MiniLM-L-12-v2`

## Training Data

SciFact (BEIR benchmark) with BM25 hard negatives.

## Results on SciFact (NDCG@10)

| Strategy | Params | NDCG@10 | Delta |
|---|---|---|---|
| A: Circuit MLP-only | 2.36M | 0.6545 | +0.0110 |
| B: Last-4 Layers | 7.10M | 0.6686 | +0.0251 |
| C: Full Fine-Tuning | 33.36M | 0.6879 | +0.0444 |
| D: Circuit-Full (BM25) | 4.73M | 0.6707 | +0.0272 |
| E: Circuit-Full (Mixed) | 4.73M | 0.6622 | +0.0187 |

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("JAYADIR/mdts-circuit-mlp-only")
model = AutoModelForSequenceClassification.from_pretrained("JAYADIR/mdts-circuit-mlp-only")

query = "What fertilizer is best for wheat?"
passage = "Wheat requires nitrogen-rich fertilizer during early growth stages."

inputs = tokenizer(query, passage, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"Relevance score: {score:.4f}")
```
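For reranking, several candidate passages can be scored in a single batch and sorted by score (a straightforward extension of the single-pair usage; the passages below are illustrative):

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("JAYADIR/mdts-circuit-mlp-only")
model = AutoModelForSequenceClassification.from_pretrained("JAYADIR/mdts-circuit-mlp-only")
model.eval()

query = "What fertilizer is best for wheat?"
passages = [
    "Wheat requires nitrogen-rich fertilizer during early growth stages.",
    "The Eiffel Tower was completed in 1889.",
    "Phosphorus supports root development in cereal crops.",
]

# Score every (query, passage) pair in one forward pass.
inputs = tokenizer(
    [query] * len(passages), passages,
    return_tensors="pt", padding=True, truncation=True, max_length=512,
)
with torch.no_grad():
    scores = model(**inputs).logits.squeeze(-1)

# Rerank passages by descending relevance score.
for score, passage in sorted(zip(scores.tolist(), passages), reverse=True):
    print(f"{score:8.4f}  {passage}")
```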