# mdts-circuit-mlp-only

A cross-encoder fine-tuned with the circuit MLP-only strategy: only the MLP output sublayers of layers 8-11 are trained (2.36M parameters), with all other weights frozen. This is Strategy A from the MDTS circuit fine-tuning experiments on SciFact.
## Base Model

`cross-encoder/ms-marco-MiniLM-L-12-v2`
## Training Data

SciFact (BEIR benchmark) with BM25 hard negatives.
## Results on SciFact (NDCG@10)

| Strategy | Trainable Params | NDCG@10 | Δ vs. base |
|---|---|---|---|
| A: Circuit MLP-only | 2.36M | 0.6545 | +0.0110 |
| B: Last-4 Layers | 7.10M | 0.6686 | +0.0251 |
| C: Full Fine-Tuning | 33.36M | 0.6879 | +0.0444 |
| D: Circuit-Full (BM25) | 4.73M | 0.6707 | +0.0272 |
| E: Circuit-Full (Mixed) | 4.73M | 0.6622 | +0.0187 |
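Strategy A's parameter selection can be sketched as a name filter over the checkpoint's parameters. This is a minimal, hypothetical sketch assuming BERT-style parameter naming (as used by MiniLM checkpoints), where the MLP output sublayer of layer *i* is `encoder.layer.i.output.dense`; the exact selection code used in the experiments is not published here.

```python
import re

# Strategy A: train only the MLP output sublayers of layers 8-11.
# With the MiniLM-L12-H384 geometry (hidden=384, intermediate=1536),
# each output.dense holds 1536*384 + 384 = 590,208 parameters,
# so 4 layers give ~2.36M trainable parameters, matching the table.
CIRCUIT_LAYERS = {8, 9, 10, 11}
PATTERN = re.compile(r"encoder\.layer\.(\d+)\.output\.dense\.(weight|bias)$")

def is_circuit_param(name: str) -> bool:
    """True if this parameter belongs to the trained circuit."""
    m = PATTERN.search(name)
    return bool(m) and int(m.group(1)) in CIRCUIT_LAYERS

# Applied to a loaded model before training:
# for name, p in model.named_parameters():
#     p.requires_grad = is_circuit_param(name)
```

Note that the pattern deliberately excludes `attention.output.dense` (the attention output projection) and `intermediate.dense` (the MLP input projection); only the MLP's output projection is unfrozen.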
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

tokenizer = AutoTokenizer.from_pretrained("JAYADIR/mdts-circuit-mlp-only")
model = AutoModelForSequenceClassification.from_pretrained("JAYADIR/mdts-circuit-mlp-only")
model.eval()

query = "What fertilizer is best for wheat?"
passage = "Wheat requires nitrogen-rich fertilizer during early growth stages."

# Cross-encoders score the (query, passage) pair jointly.
inputs = tokenizer(query, passage, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
print(f"Relevance score: {score:.4f}")
```
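To rerank a list of candidate passages, the same pair-scoring can be batched. The `rerank` helper below is an illustrative sketch, not part of the checkpoint; it assumes a tokenizer and model loaded as in the snippet above.

```python
import torch

def rerank(query, passages, tokenizer, model, batch_size=32):
    """Score (query, passage) pairs with the cross-encoder and
    return (passage, score) pairs sorted by descending relevance."""
    scores = []
    model.eval()
    with torch.no_grad():
        for i in range(0, len(passages), batch_size):
            batch = passages[i:i + batch_size]
            inputs = tokenizer(
                [query] * len(batch), batch,
                return_tensors="pt", padding=True,
                truncation=True, max_length=512,
            )
            # logits has shape (batch, 1); squeeze to a flat score list
            scores.extend(model(**inputs).logits.squeeze(-1).tolist())
    return sorted(zip(passages, scores), key=lambda x: x[1], reverse=True)
```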
## Model Tree

Fine-tuned from `cross-encoder/ms-marco-MiniLM-L-12-v2`, which is itself derived from `microsoft/MiniLM-L12-H384-uncased`.