bge-m3-bank-it

A fine-tuned version of BAAI/bge-m3 for domain-specific retrieval and reranking.

Produces dense, sparse (lexical), and ColBERT embeddings simultaneously.
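The ColBERT mode scores a query–document pair by late interaction: each query token vector takes its maximum dot product over all document token vectors, and those maxima are summed (MaxSim). A minimal sketch with invented toy vectors, not the library's internals:

```python
# Toy late-interaction (MaxSim) scoring, as used by ColBERT-style models.
# The real model emits one normalized vector per token; these are made up.

def maxsim_score(query_vecs, doc_vecs):
    """Sum over query tokens of the max dot product with any doc token."""
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

query_vecs = [[1.0, 0.0], [0.0, 1.0]]            # two query tokens
doc_vecs = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # three document tokens

print(round(maxsim_score(query_vecs, doc_vecs), 6))  # 0.9 + 0.8 -> 1.7
```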

Benchmark Results

Evaluation on a held-out test set (20% split; queries never seen during training):

Dense Retrieval

| Metric    | Base (bge-m3) | Fine-tuned | Delta   |
|-----------|---------------|------------|---------|
| Recall@1  | 29.5%         | 85.0%      | ↑ 55.5% |
| Recall@5  | 63.0%         | 96.5%      | ↑ 33.5% |
| Recall@10 | 76.0%         | 99.0%      | ↑ 23.0% |
| MRR       | 45.0%         | 90.4%      | ↑ 45.4% |
| NDCG@10   | 51.8%         | 92.5%      | ↑ 40.6% |
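The metrics above can be computed from the rank at which each query's gold document is retrieved. A hedged sketch with invented ranks (not the actual evaluation harness), assuming one relevant document per query:

```python
# Recall@k, MRR, and NDCG@k for single-relevant-document retrieval.
# The ranks below are invented purely to exercise the functions.
import math

def recall_at_k(rank, k):
    """1 if the gold doc appears in the top k, else 0 (rank is 1-based)."""
    return 1.0 if rank <= k else 0.0

def reciprocal_rank(rank):
    return 1.0 / rank

def ndcg_at_k(rank, k):
    """With a single relevant doc, the ideal DCG is 1 (hit at rank 1)."""
    return 1.0 / math.log2(rank + 1) if rank <= k else 0.0

ranks = [1, 3, 12]  # rank of the gold document for three toy queries

recall5 = sum(recall_at_k(r, 5) for r in ranks) / len(ranks)
mrr = sum(reciprocal_rank(r) for r in ranks) / len(ranks)
print(round(recall5, 3), round(mrr, 3))
```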

Multi-Mode Reranking

| Metric   | Base (bge-m3) | Fine-tuned | Delta   |
|----------|---------------|------------|---------|
| Accuracy | 37.5%         | 94.0%      | ↑ 56.5% |
| MRR      | 57.3%         | 96.6%      | ↑ 39.4% |

Usage

```python
from FlagEmbedding import BGEM3FlagModel

model = BGEM3FlagModel("Sophia-AI/bge-m3-bank-it", device="cuda", use_fp16=True)

# Embeddings (dense + sparse; pass return_colbert_vecs=True for ColBERT vectors)
output = model.encode(["Your query here"], return_dense=True, return_sparse=True)

# Reranking: weights blend the dense, sparse, and ColBERT scores
scores = model.compute_score(
    [["query", "document"]],
    weights_for_different_modes=[0.30, 0.65, 0.05],
)
```
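compute_score blends the three per-mode scores with the given weights. A hedged sketch of that weighted combination, assuming the weight order is [dense, sparse, colbert]; the per-mode scores here are invented, not values produced by the model:

```python
# Combine dense, sparse (lexical), and ColBERT scores with the same
# weights passed to compute_score. Per-mode scores below are made up.

def combined_score(dense, sparse, colbert, weights=(0.30, 0.65, 0.05)):
    w_dense, w_sparse, w_colbert = weights
    return w_dense * dense + w_sparse * sparse + w_colbert * colbert

score = combined_score(dense=0.82, sparse=0.40, colbert=0.91)
print(round(score, 4))  # 0.30*0.82 + 0.65*0.40 + 0.05*0.91
```

Because the weights sum to 1, the combined score stays on the same scale as the per-mode scores.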

Fine-Tune Your Own

This model was fine-tuned using bge-auto-tune:

```shell
pip install bge-auto-tune

bge-auto-tune generate --collection your_collection --min-pairs 2000
bge-auto-tune finetune --dataset bge_m3_training.jsonl --epochs 4
bge-auto-tune test --model ./bge-m3-finetuned
bge-auto-tune publish --repo your-user/your-model-name
```
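For reference, bge-m3 training data is conventionally one JSON object per line with a query, positive passages, and hard negatives (the query/pos/neg field names follow the FlagEmbedding convention; the banking-flavoured content below is invented):

```python
# Write one line of the query/pos/neg JSONL training format.
import json

example = {
    "query": "How do I block a lost debit card?",
    "pos": ["You can block a lost or stolen card instantly from the app."],
    "neg": ["Our savings accounts offer a variable interest rate."],
}

line = json.dumps(example, ensure_ascii=False)
print(line)
```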