---
tags:
- sentence-transformers
- embeddings
- litert
- tflite
- edge
- on-device
license: mit
base_model: intfloat/multilingual-e5-small
pipeline_tag: feature-extraction
---

# multilingual-e5-small - LiteRT

This is a [LiteRT](https://ai.google.dev/edge/litert) (formerly TensorFlow Lite) conversion of [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) for efficient on-device inference.

## Model Details

| Property | Value |
|----------|-------|
| **Original Model** | [intfloat/multilingual-e5-small](https://huggingface.co/intfloat/multilingual-e5-small) |
| **Format** | LiteRT (.tflite) |
| **File Size** | 449.0 MB |
| **Task** | Multilingual Sentence Embeddings (100 languages) |
| **Max Sequence Length** | 512 |
| **Output Dimension** | 384 |
| **Pooling Mode** | Mean Pooling |

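Mean pooling, the pooling mode listed in the table, averages the per-token hidden states over the non-padding positions. The NumPy sketch below illustrates the idea on toy data; `mean_pool` and the arrays are illustrative names, not part of this repository. The `(384,)` output shape in the Quick Start suggests the converted graph already performs this step internally, so the function is for understanding rather than something you need to call.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors over positions where attention_mask == 1.

    token_embeddings: (seq_len, hidden_dim) float array
    attention_mask:   (seq_len,) array of 0/1 values
    """
    mask = attention_mask.astype(token_embeddings.dtype)[:, None]  # (seq_len, 1)
    summed = (token_embeddings * mask).sum(axis=0)                  # (hidden_dim,)
    count = max(float(mask.sum()), 1e-9)                            # guard against an all-zero mask
    return summed / count

# Toy example: 4 positions, 3 dimensions, last position is padding.
tokens = np.array([[1.0, 2.0, 3.0],
                   [3.0, 2.0, 1.0],
                   [2.0, 2.0, 2.0],
                   [9.0, 9.0, 9.0]])
mask = np.array([1, 1, 1, 0])
pooled = mean_pool(tokens, mask)
print(pooled)
```

The padded row contributes nothing to the result, which is why padding every input to the full sequence length does not distort the embedding.
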
## Performance

Benchmarked on AMD CPU (WSL2):

| Metric | Value |
|--------|-------|
| **Inference Latency** | 91.9 ms |
| **Throughput** | 10.9 inferences/sec |
| **Cosine Similarity vs Original** | 1.0000 ✅ |

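The latency and throughput figures are consistent with one forward pass per measurement: 1000 ms / 91.9 ms ≈ 10.9 per second. The sketch below shows one way such numbers could be measured. It is not the script used for the table; `benchmark` is a hypothetical helper and the warm-up and run counts are arbitrary choices.

```python
import statistics
import time
from typing import Callable, Dict

def benchmark(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> Dict[str, float]:
    """Time repeated calls to fn; report median latency and the derived throughput."""
    for _ in range(warmup):              # warm-up calls are not timed
        fn()
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    latency_ms = max(statistics.median(samples_ms), 1e-9)  # guard against a zero reading
    return {
        "latency_ms": latency_ms,
        "inferences_per_sec": 1000.0 / latency_ms,
    }

# Stand-in workload so the sketch runs without the model file.
stats = benchmark(lambda: sum(range(10_000)))
print(f"{stats['latency_ms']:.3f} ms, {stats['inferences_per_sec']:.1f} inferences/sec")
```

To time the model itself, pass a callable that wraps the Quick Start function, for example `lambda: get_embedding("query: Hello, world!")`.
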
## Quick Start

```python
import numpy as np
from ai_edge_litert.interpreter import Interpreter
from transformers import AutoTokenizer

# Load model and tokenizer
interpreter = Interpreter(model_path="intfloat_multilingual-e5-small.tflite")
interpreter.allocate_tensors()
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

tokenizer = AutoTokenizer.from_pretrained("intfloat/multilingual-e5-small")

def get_embedding(text: str) -> np.ndarray:
    """Get sentence embedding for input text."""
    # Pad or truncate every input to the model's fixed length of 512 tokens
    encoded = tokenizer(
        text,
        padding="max_length",
        max_length=512,
        truncation=True,
        return_tensors="np"
    )

    # Assumes input 0 is input_ids and input 1 is attention_mask;
    # check input_details[i]["name"] if the results look wrong.
    interpreter.set_tensor(input_details[0]["index"], encoded["input_ids"].astype(np.int64))
    interpreter.set_tensor(input_details[1]["index"], encoded["attention_mask"].astype(np.int64))
    interpreter.invoke()

    # Drop the batch dimension
    return interpreter.get_tensor(output_details[0]["index"])[0]

# Example (E5 models expect a "query: " or "passage: " prefix on every input)
embedding = get_embedding("query: Hello, world!")
print(f"Embedding shape: {embedding.shape}")  # (384,)
```

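The upstream multilingual-e5-small model card states that each input should start with `query: ` or `passage: `. The sketch below shows a small prefix helper and a cosine-similarity ranking step for semantic search. `add_prefix` and `rank_by_cosine` are hypothetical names, and toy vectors stand in for real embeddings so the block runs without the `.tflite` file.

```python
import numpy as np

def add_prefix(text: str, kind: str = "query") -> str:
    """Prepend the E5 role prefix ("query: " or "passage: ") to a text."""
    if kind not in ("query", "passage"):
        raise ValueError("kind must be 'query' or 'passage'")
    return f"{kind}: {text}"

def rank_by_cosine(query_vec: np.ndarray, passage_vecs: np.ndarray) -> np.ndarray:
    """Return passage indices sorted from most to least similar to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    p = passage_vecs / np.linalg.norm(passage_vecs, axis=1, keepdims=True)
    scores = p @ q                       # cosine similarity per passage
    return np.argsort(-scores)

# Toy vectors stand in for model outputs.
query_vec = np.array([1.0, 0.0, 0.0])
passage_vecs = np.array([[0.0, 1.0, 0.0],    # orthogonal to the query
                         [2.0, 0.1, 0.0],    # nearly parallel to the query
                         [1.0, 1.0, 0.0]])   # 45 degrees away
order = rank_by_cosine(query_vec, passage_vecs)
print(order)
```

With the real model, the query vector would come from `get_embedding(add_prefix(question, "query"))` and each passage vector from `get_embedding(add_prefix(doc, "passage"))`.
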
## Files

- `intfloat_multilingual-e5-small.tflite` - The LiteRT model file

## Conversion Details

- **Conversion Tool**: [ai-edge-torch](https://github.com/google-ai-edge/ai-edge-torch)
- **Conversion Date**: 2026-01-12
- **Source Framework**: PyTorch → LiteRT
- **Validation**: Cosine similarity 1.0000 vs original

## Intended Use

- **Mobile Applications**: On-device semantic search, RAG systems
- **Edge Devices**: IoT, embedded systems, Raspberry Pi
- **Offline Processing**: Privacy-preserving inference
- **Low-latency Applications**: Real-time processing

## Limitations

- Fixed sequence length: inputs are padded or truncated to 512 tokens
- Runs on CPU by default (using a GPU delegate requires additional setup)
- The tokenizer is not bundled and must be loaded separately from the original model
- Float32 precision (not quantized)

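Because inputs are cut off at 512 tokens, any text beyond that limit is ignored. One common workaround, which this model card does not itself describe, is to split long inputs into overlapping windows, embed each window, and average the results. The sketch below covers only the windowing step on a plain list of token ids. `window_token_ids` is a hypothetical helper, and the window size of 500 is an assumed budget that leaves room for special tokens and the E5 prefix.

```python
from typing import List

def window_token_ids(ids: List[int], window: int = 500, overlap: int = 50) -> List[List[int]]:
    """Split a token-id list into overlapping windows of at most `window` ids."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must be in [0, window)")
    step = window - overlap
    windows = []
    for start in range(0, len(ids), step):
        windows.append(ids[start:start + window])
        if start + window >= len(ids):   # this window already reaches the end
            break
    return windows

# 1200 fake token ids stand in for a long tokenized document.
chunks = window_token_ids(list(range(1200)))
print([len(c) for c in chunks])
```

Each window would then be decoded back to text, given its prefix, and embedded on its own. Whether averaging window embeddings preserves retrieval quality for a given dataset is something to verify empirically.
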
## License

This model inherits the license from the original:
- **License**: MIT ([source](https://huggingface.co/intfloat/multilingual-e5-small))

## Citation

```bibtex
@article{wang2024multilingual,
  title={Multilingual E5 Text Embeddings: A Technical Report},
  author={Wang, Liang and Yang, Nan and Huang, Xiaolong and others},
  journal={arXiv preprint arXiv:2402.05672},
  year={2024}
}
```

## Acknowledgments

- Original model by [intfloat](https://huggingface.co/intfloat)
- Conversion using [ai-edge-torch](https://github.com/google-ai-edge/ai-edge-torch)

---

*Converted by [Bombek1](https://huggingface.co/Bombek1)*