---
license: apache-2.0
language:
- en
tags:
- sentence-transformers
- embeddings
- retrieval
- agents
- memory
- rag
- semantic-search
library_name: transformers
pipeline_tag: sentence-similarity
datasets:
- custom
metrics:
- mrr
- recall
- ndcg
model-index:
- name: agentrank-small
  results:
  - task:
      type: retrieval
      name: Agent Memory Retrieval
    metrics:
    - type: mrr
      value: 0.6375
      name: MRR
    - type: recall
      value: 0.4460
      name: Recall@1
    - type: recall
      value: 0.9740
      name: Recall@5
    - type: ndcg
      value: 0.6797
      name: NDCG@10
---

# AgentRank-Small: Embedding Model for AI Agent Memory Retrieval

<p align="center">
  <img src="https://img.shields.io/badge/MRR-0.6375-brightgreen" alt="MRR">
  <img src="https://img.shields.io/badge/Recall%405-97.4%25-blue" alt="Recall@5">
  <img src="https://img.shields.io/badge/Parameters-33M-orange" alt="Parameters">
  <img src="https://img.shields.io/badge/License-Apache%202.0-green" alt="License">
</p>

**AgentRank** is the first embedding model family specifically designed for AI agent memory retrieval. Unlike general-purpose embedders, AgentRank models temporal context, memory type, and importance, all of which matter for agents that need to recall past interactions.

## Key Results

| Model | MRR | Recall@1 | Recall@5 | NDCG@10 |
|-------|-----|----------|----------|---------|
| **AgentRank-Small** | **0.6375** | **0.4460** | **0.9740** | **0.6797** |
| all-MiniLM-L6-v2 | 0.5297 | 0.3720 | 0.7520 | 0.6370 |
| all-mpnet-base-v2 | 0.5351 | 0.3660 | 0.7960 | 0.6335 |

**+20.4% relative MRR improvement over the base `all-MiniLM-L6-v2` model.**

## Why AgentRank?

AI agents need memory retrieval that can answer questions like these:

| Challenge | General Embedders | AgentRank |
|-----------|-------------------|-----------|
| "What did I say **yesterday**?" | ❌ No temporal awareness | ✅ Temporal embeddings |
| "What's my **preference**?" | ❌ Mixes preferences with events | ✅ Memory type awareness |
| "What's **most important**?" | ❌ No priority | ✅ Importance prediction |

## Installation

```bash
pip install transformers torch
```

## Usage

### Basic Usage

```python
from transformers import AutoModel, AutoTokenizer
import torch

# Load model and tokenizer from the Hugging Face Hub
model = AutoModel.from_pretrained("vrushket/agentrank-small")
tokenizer = AutoTokenizer.from_pretrained("vrushket/agentrank-small")

def encode(texts):
    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    # Mean-pool token embeddings, using the attention mask to skip padding tokens
    mask = inputs["attention_mask"].unsqueeze(-1).float()
    embeddings = (outputs.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)
    # L2-normalize so that dot products equal cosine similarities
    return torch.nn.functional.normalize(embeddings, p=2, dim=1)

# Encode memories and query
memories = [
    "User prefers Python over JavaScript",
    "User asked about machine learning yesterday",
    "User is working on a web project",
]
query = "What programming language does the user like?"

memory_embeddings = encode(memories)
query_embedding = encode([query])

# Compute cosine similarities, shape (1, len(memories))
similarities = torch.mm(query_embedding, memory_embeddings.T)
best = similarities.argmax().item()
print(f"Most relevant: {memories[best]}")
# Expected output: Most relevant: User prefers Python over JavaScript
```
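
Agents usually want the top few memories rather than a single best match. The following is a minimal sketch of that ranking step in plain Python; the `top_k_memories` helper and the example scores are hypothetical and only illustrate the idea, they are not model output.

```python
from typing import List, Sequence, Tuple


def top_k_memories(
    memories: Sequence[str], scores: Sequence[float], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (memory, score) pairs, best first."""
    if len(memories) != len(scores):
        raise ValueError("memories and scores must have the same length")
    ranked = sorted(zip(memories, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Illustrative scores only; real values come from the similarity computation.
example_memories = [
    "User prefers Python over JavaScript",
    "User asked about machine learning yesterday",
    "User is working on a web project",
]
example_scores = [0.71, 0.32, 0.45]

for text, score in top_k_memories(example_memories, example_scores, k=2):
    print(f"{score:.2f}  {text}")
```

With the basic usage snippet, the scores could be obtained as `similarities[0].tolist()`.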

### With Temporal & Memory Type Metadata (Full Power)

Temporal and memory-type conditioning require the `agentrank` package, which has not been released yet. The planned interface looks like this:

```python
# Requires the upcoming `agentrank` package (not yet published):
# pip install agentrank

from agentrank import AgentRankEmbedder

model = AgentRankEmbedder.from_pretrained("vrushket/agentrank-small")

# Encode a memory together with its metadata
embedding = model.encode(
    "User mentioned they prefer morning meetings",
    days_ago=3,              # Memory is 3 days old
    memory_type="semantic",  # It's a preference, not an event
)
```

## Architecture

AgentRank-Small is based on `all-MiniLM-L6-v2` with novel additions:

```
┌──────────────────────────────────────────────┐
│    MiniLM Transformer Encoder (6 layers)     │
└──────────────────────────────────────────────┘
                       │
      ┌────────────────┼────────────────┐
      │                │                │
┌────────────┐   ┌────────────┐   ┌────────────┐
│  Temporal  │   │   Memory   │   │ Importance │
│  Position  │   │    Type    │   │ Prediction │
│   Embed    │   │   Embed    │   │    Head    │
└────────────┘   └────────────┘   └────────────┘
      │                │                │
      └────────────────┼────────────────┘
                       │
             ┌───────────────────┐
             │   L2 Normalized   │
             │ 384-dim Embedding │
             └───────────────────┘
```

**Novel Features:**
- **Temporal Position Embeddings**: 10 learnable buckets (today, 1-3 days, week, month, etc.)
- **Memory Type Embeddings**: Episodic, Semantic, Procedural
- **Importance Prediction Head**: Auxiliary task during training
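
To make the metadata features concrete, here is a minimal sketch of how a memory's age and type might be mapped to embedding indices. The bucket edges, the `TEMPORAL_BUCKET_EDGES` and `MEMORY_TYPES` names, and the `metadata_indices` helper are all assumptions for illustration; this card only names a few of the 10 buckets and does not give their exact boundaries.

```python
from bisect import bisect_right
from typing import Tuple

# Hypothetical upper edges (in days, exclusive) of the first 9 buckets.
# The 10th bucket catches everything older. The real boundaries may differ.
TEMPORAL_BUCKET_EDGES = [1, 4, 8, 15, 31, 91, 181, 366, 731]

# Hypothetical index assignment for the three memory types listed above.
MEMORY_TYPES = {"episodic": 0, "semantic": 1, "procedural": 2}


def metadata_indices(days_ago: float, memory_type: str) -> Tuple[int, int]:
    """Map a memory's age and type to (temporal_bucket, memory_type_id)."""
    if days_ago < 0:
        raise ValueError("days_ago must be non-negative")
    if memory_type not in MEMORY_TYPES:
        raise ValueError(f"unknown memory type: {memory_type!r}")
    # bisect_right counts the edges <= days_ago, which is the bucket index.
    bucket = bisect_right(TEMPORAL_BUCKET_EDGES, days_ago)
    return bucket, MEMORY_TYPES[memory_type]


print(metadata_indices(0, "episodic"))      # (0, 0): "today" bucket
print(metadata_indices(3, "semantic"))      # (1, 1): "1-3 days" bucket
print(metadata_indices(400, "procedural"))  # (8, 2)
```

In a model, each index would typically select a row of a learnable embedding table, for example `torch.nn.Embedding(10, 384)` for the temporal buckets. How AgentRank combines those vectors with the encoder output is not described in this card.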

## Training

- **Dataset**: 500K synthetic agent memory samples
- **Memory Types**: Episodic (40%), Semantic (35%), Procedural (25%)
- **Loss**: Multiple Negatives Ranking Loss + Importance MSE
- **Hard Negatives**: 5 types (temporal, type confusion, topic drift, etc.)
- **Hardware**: NVIDIA RTX 6000 Ada (48 GB) with FP16
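
The combined objective can be sketched in plain Python. This is an illustration under stated assumptions, not the training code: it assumes in-batch negatives (the positive for query `i` sits on the diagonal of the similarity matrix), and the `scale` and `importance_weight` values are hypothetical, since this card does not say how the two terms are weighted.

```python
import math
from typing import List, Sequence


def mnr_loss(sim: Sequence[Sequence[float]], scale: float = 20.0) -> float:
    """Multiple Negatives Ranking Loss over a square similarity matrix.

    sim[i][j] is the cosine similarity between query i and memory j.
    Row i is treated as a softmax classification problem with target j == i.
    """
    total = 0.0
    for i, row in enumerate(sim):
        logits = [scale * s for s in row]
        peak = max(logits)  # subtract the max for numerical stability
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_z - logits[i]  # cross-entropy for the correct column
    return total / len(sim)


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predicted and labeled importance."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def combined_loss(
    sim: Sequence[Sequence[float]],
    importance_pred: Sequence[float],
    importance_true: Sequence[float],
    importance_weight: float = 0.1,  # assumed weighting, not from the card
    scale: float = 20.0,             # assumed similarity scale
) -> float:
    return mnr_loss(sim, scale) + importance_weight * mse(importance_pred, importance_true)


# Toy batch of two (query, memory) pairs with illustrative numbers.
toy_sim: List[List[float]] = [[0.9, 0.2], [0.1, 0.8]]
print(combined_loss(toy_sim, [0.7, 0.3], [1.0, 0.0]))
```

Hard negatives would appear as extra columns in each row of the matrix, which this sketch omits for brevity.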

## Benchmarks

Evaluated on AgentMemBench (500 test samples, 8 candidates each):

| Metric | AgentRank-Small | all-MiniLM-L6-v2 | Relative improvement |
|--------|-----------------|------------------|----------------------|
| MRR | 0.6375 | 0.5297 | **+20.4%** |
| Recall@1 | 0.4460 | 0.3720 | **+19.9%** |
| Recall@5 | 0.9740 | 0.7520 | **+29.5%** |
| NDCG@10 | 0.6797 | 0.6370 | **+6.7%** |
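
For readers who want to reproduce this kind of table on their own data, here is a minimal sketch of the metrics. It assumes exactly one relevant memory per query (the card does not confirm this for AgentMemBench), so each query reduces to the 1-based rank of that memory; all function names are hypothetical.

```python
import math
from typing import Sequence


def mrr(ranks: Sequence[int]) -> float:
    """Mean reciprocal rank of the relevant memory."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def recall_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of queries whose relevant memory appears in the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


def ndcg_at_k(ranks: Sequence[int], k: int) -> float:
    """NDCG@k with a single relevant item, so the ideal DCG is 1."""
    gains = (1.0 / math.log2(r + 1) if r <= k else 0.0 for r in ranks)
    return sum(gains) / len(ranks)


def relative_improvement(new: float, base: float) -> float:
    """Percentage change of `new` relative to `base`."""
    return (new - base) / base * 100.0


# Toy ranks for four queries (illustrative, not benchmark data).
toy_ranks = [1, 2, 4, 8]
print(mrr(toy_ranks), recall_at_k(toy_ranks, 5), ndcg_at_k(toy_ranks, 10))

# The improvement column follows from the reported scores:
print(f"{relative_improvement(0.6375, 0.5297):.1f}%")  # 20.4%
```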

## Coming Soon

- **AgentRank-Base**: 110M parameters, targeting higher retrieval quality
- **AgentRank-Reranker**: Cross-encoder for top-k refinement
- **Python Package**: `pip install agentrank`

## Citation

```bibtex
@misc{agentrank2024,
  author = {Vrushket More},
  title = {AgentRank: Embedding Models for AI Agent Memory Retrieval},
  year = {2024},
  publisher = {HuggingFace},
  url = {https://huggingface.co/vrushket/agentrank-small}
}
```

## License

Apache 2.0, which permits commercial use.

## Acknowledgments

Built on top of [sentence-transformers](https://www.sbert.net/) and [MiniLM](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2).