LLaMA2-13B-RankLLaMA-Teacher

Model Description

LLaMA2-13B-RankLLaMA-Teacher is a 13B parameter teacher model designed for neural reranking tasks. This model serves as the foundation for knowledge distillation in the DeAR framework, generating Chain-of-Thought (CoT) reasoning to guide smaller student models.

Model Details

  • Model Type: Sequence Classification (Reranking)
  • Base Model: LLaMA-2-13B
  • Parameters: 13 billion
  • Training Data: MS MARCO Passage Ranking
  • Purpose: Teacher model for knowledge distillation
  • Output: Relevance scores for query-document pairs

Intended Use

This model is intended to:

  • Generate training signals for student reranker models
  • Provide Chain-of-Thought reasoning for reranking tasks
  • Serve as a baseline for evaluating distilled models

Usage

import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load model
model_name = "abdoelsayed/llama2-13b-rankllama-teacher"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# LLaMA tokenizers may ship without a pad token; reuse EOS so padded inputs work
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForSequenceClassification.from_pretrained(
    model_name, 
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
model.config.pad_token_id = tokenizer.pad_token_id  # needed for batched scoring
model.eval()

# Score a query-document pair
query = "What is machine learning?"
document = "Machine learning is a subset of artificial intelligence that focuses on training algorithms to learn patterns from data."

inputs = tokenizer(
    f"query: {query}",
    f"document: {document}",
    return_tensors="pt",
    truncation=True,
    max_length=512,
    padding=True
)
inputs = {k: v.to(model.device) for k, v in inputs.items()}

with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()
    
print(f"Relevance score: {score}")

Training Details

Training Data

  • Dataset: MS MARCO Passage Ranking
  • Training Samples: Large-scale query-document pairs with relevance labels

Training Configuration

  • Base Model: meta-llama/Llama-2-13b-hf
  • Objective: Pointwise ranking with sequence classification (see the loss sketch after this list)
  • Hardware: Multi-GPU training (4x A100)
  • Precision: Mixed precision (bfloat16)
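
The card specifies only "pointwise ranking with sequence classification". One common instantiation, shown below as an assumption rather than the authors' exact loss, is binary cross-entropy on the scalar relevance logit of each query-document pair.

import torch
import torch.nn.functional as F

def pointwise_ranking_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy on the scalar relevance logit (assumed formulation).

    logits: (batch, 1) raw scores from the classification head
    labels: (batch,)  1 for relevant pairs, 0 for non-relevant pairs
    """
    return F.binary_cross_entropy_with_logits(logits.squeeze(-1), labels.float())

# Toy check with random scores and labels
print(pointwise_ranking_loss(torch.randn(8, 1), torch.randint(0, 2, (8,))).item())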

Hyperparameters

  • Learning Rate: 2e-5
  • Batch Size: 8 per device
  • Epochs: 3
  • Max Sequence Length: 512
  • Warmup Steps: 1000
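
The hyperparameters above map naturally onto a Hugging Face TrainingArguments configuration; the sketch below shows that mapping. The output directory, logging and saving choices, and anything not listed above are assumptions.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./rankllama-13b-teacher",  # assumed path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    num_train_epochs=3,
    warmup_steps=1000,
    bf16=True,                             # mixed precision (bfloat16)
    logging_steps=100,                     # assumed
    save_strategy="epoch",                 # assumed
)
# Note: the 512-token max sequence length is applied at tokenization time, not here.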

Evaluation

This teacher model is evaluated on standard IR benchmarks:

Dataset        NDCG@10
MS MARCO Dev   72.5
TREC DL19      73.8
TREC DL20      71.2
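
NDCG@10 compares the ranking induced by the reranker's scores against the ideal ordering by graded relevance. The reference implementation below is a minimal sketch, not the official TREC evaluation tooling, and uses the exponential-gain variant of DCG.

import math

def ndcg_at_k(ranked_relevances, k=10):
    """NDCG@k for one query; `ranked_relevances` lists the graded relevance
    of each document in the order the reranker returned them."""
    def dcg(rels):
        return sum((2**r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = dcg(sorted(ranked_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0

# Graded relevance (0-3) of the top-ranked documents for one example query
print(ndcg_at_k([3, 2, 0, 1, 0, 0, 2, 0, 0, 0]))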

Model Architecture

LLaMA2-13B
    ↓
[Transformer Layers]
    ↓
[Classification Head]
    ↓
Relevance Score
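
With AutoModelForSequenceClassification, the LLaMA backbone feeds the final hidden state of the last non-padding token into a linear classification head that produces a single relevance logit. The lines below, reusing `model` from the Usage section, show how to inspect that head; in the Transformers LLaMA implementation it is exposed as `model.score`, and the expected shapes noted in the comments are assumptions about this checkpoint.

# Inspect the classification head (reuses `model` from the Usage section)
print(model.config.num_labels)  # expected: 1, i.e. a single relevance logit
print(model.score)              # expected: Linear(in_features=5120, out_features=1, bias=False)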

Distillation

This teacher model generates soft labels and CoT reasoning used to train:

  • DeAR-8B models (RankNet, CE, Listwise)
  • DeAR-3B models (RankNet, CE)

The distillation process uses:

  • Temperature: 2.0
  • Alpha (KD weight): 0.1
  • CoT Dataset: DeAR-COT
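
The temperature and alpha above describe a standard soft-label distillation setup: teacher and student scores over a candidate list are softened with the temperature, compared with a KL term, and blended with the student's own ranking loss weighted by alpha. The sketch below is an assumed formulation for illustration, not necessarily the exact DeAR loss.

import torch
import torch.nn.functional as F

def distillation_loss(student_scores, teacher_scores, hard_loss,
                      temperature=2.0, alpha=0.1):
    """Blend a temperature-scaled KD term with the student's ranking loss.

    student_scores, teacher_scores: (num_candidates,) scores for one query's list
    hard_loss: the student's supervised loss (e.g. RankNet / CE / listwise)
    The exact blending used in DeAR may differ; this weighting is an assumption.
    """
    soft_student = F.log_softmax(student_scores / temperature, dim=-1)
    soft_teacher = F.softmax(teacher_scores / temperature, dim=-1)
    kd = F.kl_div(soft_student, soft_teacher, reduction="sum") * temperature ** 2
    return alpha * kd + (1.0 - alpha) * hard_loss

# Toy example for a 10-candidate list
print(distillation_loss(torch.randn(10), torch.randn(10), hard_loss=torch.tensor(0.7)).item())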

Limitations

  • Computational Cost: 13B parameters require significant GPU memory (>26GB); see the quantized-loading sketch after this list
  • Inference Speed: Slower than distilled student models
  • Domain Specificity: Trained primarily on MS MARCO, may require fine-tuning for other domains
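
The memory requirement noted above can be reduced at inference time with quantized loading. The snippet below is a sketch using 4-bit quantization through Transformers and bitsandbytes (an extra dependency and an assumption, not something this card prescribes); expect some loss in score fidelity.

import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig

# 4-bit quantized loading (requires the bitsandbytes package and a CUDA GPU)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model_4bit = AutoModelForSequenceClassification.from_pretrained(
    "abdoelsayed/llama2-13b-rankllama-teacher",
    quantization_config=bnb_config,
    device_map="auto",
)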

Citation

If you use this model, please cite:

@article{abdallah2025dear,
  title={DeAR: Dual-Stage Document Reranking with Reasoning Agents via LLM Distillation},
  author={Abdallah, Abdelrahman and Mozafari, Jamshid and Piryani, Bhawna and Jatowt, Adam},
  journal={arXiv preprint arXiv:2508.16998},
  year={2025}
}

License

MIT License

Related Models

Student Models (distilled from this teacher):

  • 8B Models: DeAR-8B variants (RankNet, CE, Listwise)
  • 3B Models: DeAR-3B variants (RankNet, CE)

Dataset:

  • DeAR-COT (Chain-of-Thought distillation data)
