---
license: mit
language:
  - en
base_model:
  - intfloat/e5-base-v2
pipeline_tag: sentence-similarity
---

## Introduction

This is the Agentic-R model trained in our paper: *Agentic-R: Learning to Retrieve for Agentic Search* (📝arXiv). Please refer to our 🧩github repository for detailed usage of Agentic-R.

## Usage

Our Agentic-R query encoder is designed for agentic search scenarios.
For queries, the input format is `query: <original_question> [SEP] <agent_query>`. Passages use the standard `passage:` prefix, following E5.
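A minimal helper for assembling the query string in this format (the helper name is ours, not part of the released code):

```python
def build_query(original_question: str, agent_query: str) -> str:
    # Join the user's initial question and the agent's intermediate query
    # with the [SEP] token, under the E5-style "query:" prefix.
    return f"query: {original_question} [SEP] {agent_query}"

print(build_query("Who wrote The Old Man and the Sea?", "Old Man and the Sea"))
```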

Below is an example of how to compute embeddings using `sentence-transformers`:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("liuwenhan/Agentic-R_e5")

input_texts = [
    # Query encoder input: original_question [SEP] agent_query
    "query: Who wrote The Old Man and the Sea? [SEP] Old Man and the Sea",

    # Passages
    "passage: The Old Man and the Sea is a short novel written by the American author Ernest Hemingway in 1951.",
    "passage: Ernest Hemingway was an American novelist, short-story writer, and journalist, born in 1899."
]

# Normalize so that cosine similarity reduces to a dot product.
embeddings = model.encode(
    input_texts,
    normalize_embeddings=True
)
```

Notes:

- `original_question` refers to the user's initial question.
- `agent_query` refers to the intermediate query generated during the agent's reasoning process.
- Always include `[SEP]` to separate the two parts of the query.
- We recommend setting `normalize_embeddings=True` for cosine similarity–based retrieval.
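With normalized embeddings, cosine similarity is just a dot product of the query embedding with each passage embedding. A toy NumPy sketch (the 2-d vectors stand in for real `model.encode(...)` output):

```python
import numpy as np

# Toy unit-norm embeddings standing in for model.encode(...) output.
q = np.array([[0.6, 0.8]])           # one query embedding
p = np.array([[0.6, 0.8],
              [0.8, -0.6]])          # two passage embeddings

# With normalize_embeddings=True, cosine similarity reduces to a dot product.
scores = q @ p.T                     # shape (1, num_passages)
ranking = np.argsort(-scores[0])     # passage indices, most similar first
print(scores, ranking)
```

The same ranking step applies unchanged to the real embeddings produced above: slice off the query row, dot it against the passage rows, and sort.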