SentenceTransformer based on thebajajra/RexBERT-base

This is a sentence-transformers model finetuned from thebajajra/RexBERT-base on the nomic-embed-unsupervised-data dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: thebajajra/RexBERT-base
  • Maximum Sequence Length: 1024 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: nomic-embed-unsupervised-data

Model Sources

  • Documentation: Sentence Transformers documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
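The Pooling module above produces a single 768-dimensional sentence embedding by mean-pooling the token embeddings, weighted by the attention mask. As a reference point, here is a minimal sketch of that computation using plain transformers; loading this repository directly with AutoModel and the example sentence are assumptions, and the SentenceTransformer API shown below remains the recommended path:

import torch
from transformers import AutoTokenizer, AutoModel

# Sketch only: load the underlying encoder weights directly (assumes the
# repository can be loaded with AutoModel, as is typical for these cards).
tokenizer = AutoTokenizer.from_pretrained("thebajajra/RexBERT-base-embed-pf-v0.1")
encoder = AutoModel.from_pretrained("thebajajra/RexBERT-base-embed-pf-v0.1")

batch = tokenizer(["An example sentence."], padding=True, truncation=True,
                  max_length=1024, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state   # (batch, seq_len, 768)

# Mean pooling: average the token embeddings, ignoring padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()        # (batch, seq_len, 1)
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)                             # torch.Size([1, 768])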

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
queries = [
    "Where do you guys go to find used camper shells?",
]
documents = [
    "I've got a newly acquired 1st gen 2005 silvee Toyota tundra trd and am looking for an used camper shell.  Craigslist hasnt been very useful....where do you guys go?\n\nThanks!",
    "I work at a convenience store and the number of Newports I sell a day is insane. Considering buying a couple cartons of em and maybe some parliament menthols if the FDA goes through with this. Should be able to throw em up on craigslist or ebay a week or two later and it'll be like steaks in a piranha pond",
    "Hey guys what is the most optimal tool for pulling long staples out from hardwood flooring? I'm trying to find the most optimal way to do it because I have thousands to pull! Fence pliers did not work too well on account the pointy tip was too thick get in and roll them out and when i tried the gripping/cutting part it broke the staples.\n\nI'm thinking round nose vice grips or a car gasket puller?\n\nThanks",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.8108, 0.2481, 0.1200]])
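Since the similarity function is cosine similarity, the score matrix can be used directly for ranking. A small illustrative continuation of the snippet above that sorts the documents for each query (variable names reuse the snippet; the 60-character truncation is just for display):

# Rank documents for each query by similarity (continues the snippet above).
ranking = similarities.argsort(dim=1, descending=True)
for q_idx, (query, order) in enumerate(zip(queries, ranking)):
    print(f"Query: {query}")
    for rank, doc_idx in enumerate(order.tolist(), start=1):
        score = similarities[q_idx, doc_idx].item()
        print(f"  {rank}. score={score:.4f}  {documents[doc_idx][:60]}...")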

Training Details

Training Dataset

nomic-embed-unsupervised-data

  • Dataset: nomic-embed-unsupervised-data at 917bae6
  • Size: 222,490,215 training samples
  • Columns: query and document
  • Approximate statistics based on the first 1000 samples:
    • query: string, min: 6 tokens, mean: 16.83 tokens, max: 62 tokens
    • document: string, min: 12 tokens, mean: 162.25 tokens, max: 1024 tokens
  • Samples:
    I became a US citizen early this year and this is going to be my first 4th of July as an American! Because of the current situation, my citizen oath ceremony felt more like a pick up order... Got my certificate, and no guests allowed, so I couldn’t bring anybody to join my ceremony, also no pictures.

    Anyway... I want to celebrate big time this 4th of July, and I’m already planning it! (Any ideas are super welcome!). I say big time but I just really want to do something fun at home with my family. 😊
    "The Kingdom of God for Jesus"; I know you guys know how to answer this overrated question. Basically what we're talking about is that the "kingdom" of god according to jesus are:

    * "the kingdom as good news (where the kingdom is on earth, whereas by living a beautiful, meaningful life on earth is the meaning of salvation)"
    * "the kingdom is offered to all"
    * etc.

    and finally, the question goes like this: "The Kingdom Does Not Ask for Performance; It is a gift, an offer. We can only inherit it. So, what is the point of being good?"
    So I made a "size" chart to go with my weight infograph, all based off that "Relative champ weight/height" thread. Here's the weight chart I did the other day



    And here's the size chart I did today.



    *Anivia, Skarner and Shyvanna (dragon form) are "Dimensions" instead of an actual "height", but I think you can get the jist.

    The original thread this is based off of is located via the link below. I am using these numbers (and my own conversions), so I'm not always sure where they got the numbers!

  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
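For intuition, MultipleNegativesRankingLoss treats every other document in the batch as a negative for a given query: the query-to-document cosine similarity matrix is multiplied by scale (20.0 here) and a cross-entropy loss pushes each query toward its paired document on the diagonal. A minimal, self-contained sketch of that computation (the function name and toy tensors are illustrative, not the library implementation):

import torch
import torch.nn.functional as F

def mnrl_sketch(query_emb: torch.Tensor, doc_emb: torch.Tensor, scale: float = 20.0):
    # (B, B) cosine similarity matrix between every query and every document.
    scores = scale * F.cosine_similarity(query_emb.unsqueeze(1), doc_emb.unsqueeze(0), dim=-1)
    # Row i should score highest at column i (its paired document); the rest act as negatives.
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

# Toy usage: a batch of 4 (query, document) pairs with 768-dimensional embeddings.
loss = mnrl_sketch(torch.randn(4, 768), torch.randn(4, 768))
print(loss.item())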
    

Evaluation Dataset

nomic-embed-unsupervised-data

  • Dataset: nomic-embed-unsupervised-data at 917bae6
  • Size: 222,727 evaluation samples
  • Columns: query and document
  • Approximate statistics based on the first 1000 samples:
    • query: string, min: 6 tokens, mean: 16.41 tokens, max: 66 tokens
    • document: string, min: 15 tokens, mean: 164.47 tokens, max: 1024 tokens
  • Samples:
    Do you subscribe to any horror magazines? I get most of my horror news from blogs and websites and such, but i do subscribe to a bunch of horror mags. With everything being so digital these days, something about flipping through a magazine and reading articles about both classic and upcoming horror movies is refreshing. I get a lot of great recommendations from them, and theres a lot of interesting interviews and behind the scenes stuff that i dont see on the popular websites.
    Missing PDS Laundry Card :( This is an absolute long shot but I must've accidentally left my laundry card in the dryer card slot because I cant find it anywhere. If someone found a card in there, please DM me. I've already bought a card but I'd like to have my original card back :(
    Talking Bad will be terrible Talking Dead is horrible and this will be to. Chris Hardwick and the cast of random no name celebrities offer nothing new to the discussion. The only good thing about Breaking Bad ending is that Talking Bad will end soon as well.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 128
  • learning_rate: 2e-06
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • bf16: True
  • batch_sampler: no_duplicates
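
A minimal sketch of how these non-default values map onto SentenceTransformerTrainingArguments; the output directory is a placeholder and all omitted arguments keep their defaults, so this is not the exact configuration used:

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/rexbert-base-embed",       # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=256,
    per_device_eval_batch_size=128,
    learning_rate=2e-6,
    num_train_epochs=4,
    warmup_ratio=0.1,
    bf16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,    # avoids duplicate texts within a batch
)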

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss
0.0009 100 4.4714 -
0.0018 200 4.4457 -
0.0028 300 4.4007 -
0.0037 400 4.336 -
0.0046 500 4.2476 -
0.0055 600 4.1406 -
0.0064 700 4.0049 -
0.0074 800 3.8434 -
0.0083 900 3.6393 -
0.0092 1000 3.3763 -
0.0101 1100 3.0541 -
0.0110 1200 2.6362 -
0.0120 1300 2.1226 -
0.0129 1400 1.6113 -
0.0138 1500 1.2565 -
0.0147 1600 1.029 -
0.0156 1700 0.846 -
0.0166 1800 0.7111 -
0.0175 1900 0.5967 -
0.0184 2000 0.488 -
0.0193 2100 0.4138 -
0.0203 2200 0.3565 -
0.0212 2300 0.3129 -
0.0221 2400 0.2827 -
0.0230 2500 0.2557 -
0.0239 2600 0.2379 -
0.0249 2700 0.2234 -
0.0258 2800 0.2055 -
0.0267 2900 0.1926 -
0.0276 3000 0.1843 -
0.0285 3100 0.175 -
0.0295 3200 0.1647 -
0.0304 3300 0.157 -
0.0313 3400 0.1512 -
0.0322 3500 0.146 -
0.0331 3600 0.1412 -
0.0341 3700 0.1352 -
0.0350 3800 0.1295 -
0.0359 3900 0.1261 -
0.0368 4000 0.122 -
0.0377 4100 0.1171 -
0.0387 4200 0.1147 -
0.0396 4300 0.1103 -
0.0405 4400 0.1073 -
0.0414 4500 0.1053 -
0.0423 4600 0.1016 -
0.0433 4700 0.0991 -
0.0442 4800 0.0981 -
0.0451 4900 0.0935 -
0.0460 5000 0.0928 -
0.0469 5100 0.0895 -
0.0479 5200 0.0877 -
0.0488 5300 0.0853 -
0.0497 5400 0.0829 -
0.0506 5500 0.0818 -
0.0515 5600 0.0805 -
0.0525 5700 0.0785 -
0.0534 5800 0.0769 -
0.0543 5900 0.0746 -
0.0552 6000 0.0754 -
0.0562 6100 0.0715 -
0.0571 6200 0.0707 -
0.0580 6300 0.0699 -
0.0589 6400 0.0678 -
0.0598 6500 0.0659 -
0.0608 6600 0.0659 -
0.0617 6700 0.0646 -
0.0626 6800 0.0627 -
0.0635 6900 0.0627 -
0.0644 7000 0.0604 -
0.0654 7100 0.0592 -
0.0663 7200 0.059 -
0.0672 7300 0.0577 -
0.0681 7400 0.0568 -
0.0690 7500 0.0558 -
0.0700 7600 0.0552 -
0.0709 7700 0.0542 -
0.0718 7800 0.0531 -
0.0727 7900 0.0528 -
0.0736 8000 0.0526 -
0.0746 8100 0.0509 -
0.0755 8200 0.05 -
0.0764 8300 0.0495 -
0.0773 8400 0.0486 -
0.0782 8500 0.0482 -
0.0792 8600 0.048 -
0.0801 8700 0.0468 -
0.0810 8800 0.0461 -
0.0819 8900 0.0459 -
0.0828 9000 0.0453 -
0.0838 9100 0.0442 -
0.0847 9200 0.0443 -
0.0856 9300 0.0437 -
0.0865 9400 0.0435 -
0.0874 9500 0.0426 -
0.0884 9600 0.042 -
0.0893 9700 0.0423 -
0.0902 9800 0.0406 -
0.0911 9900 0.0405 -
0.0920 10000 0.0397 -
0.0930 10100 0.0401 -
0.0939 10200 0.0392 -
0.0948 10300 0.0396 -
0.0957 10400 0.0391 -
0.0967 10500 0.0384 -
0.0976 10600 0.0377 -
0.0985 10700 0.0379 -
0.0994 10800 0.0372 -
0.1003 10900 0.0364 -
0.1013 11000 0.0367 -
0.1022 11100 0.0359 -
0.1031 11200 0.0355 -
0.1040 11300 0.0358 -
0.1049 11400 0.035 -
0.1059 11500 0.0353 -
0.1068 11600 0.0341 -
0.1077 11700 0.0343 -
0.1086 11800 0.034 -
0.1095 11900 0.0334 -
0.1105 12000 0.0337 -
0.1114 12100 0.0332 -
0.1123 12200 0.0323 -
0.1132 12300 0.0323 -
0.1141 12400 0.0322 -
0.1151 12500 0.0312 -
0.1160 12600 0.0307 -
0.1169 12700 0.0314 -
0.1178 12800 0.0309 -
0.1187 12900 0.0313 -
0.1197 13000 0.0306 -
0.1206 13100 0.0303 -
0.1215 13200 0.0301 -
0.1224 13300 0.0302 -
0.1233 13400 0.0296 -
0.1243 13500 0.029 -
0.1252 13600 0.0288 -
0.1261 13700 0.0286 -
0.1270 13800 0.0291 -
0.1279 13900 0.0287 -
0.1289 14000 0.0284 -
0.1298 14100 0.0276 -
0.1307 14200 0.028 -
0.1316 14300 0.0275 -
0.1326 14400 0.0269 -
0.1335 14500 0.027 -
0.1344 14600 0.0273 -
0.1353 14700 0.0267 -
0.1362 14800 0.0263 -
0.1372 14900 0.0264 -
0.1381 15000 0.0263 -
0.1390 15100 0.0262 -
0.1399 15200 0.0256 -
0.1408 15300 0.0254 -
0.1418 15400 0.0257 -
0.1427 15500 0.0251 -
0.1436 15600 0.0253 -
0.1445 15700 0.0247 -
0.1454 15800 0.0251 -
0.1464 15900 0.0245 -
0.1473 16000 0.0246 -
0.1482 16100 0.024 -
0.1491 16200 0.0241 -
0.1500 16300 0.0243 -
0.1510 16400 0.0235 -
0.1519 16500 0.024 -
0.1528 16600 0.0236 -
0.1537 16700 0.0233 -
0.1546 16800 0.0237 -
0.1556 16900 0.023 -
0.1565 17000 0.0233 -
0.1574 17100 0.0229 -
0.1583 17200 0.0227 -
0.1592 17300 0.023 -
0.1602 17400 0.0232 -
0.1611 17500 0.0221 -
0.1620 17600 0.0217 -
0.1629 17700 0.0224 -
0.1638 17800 0.0217 -
0.1648 17900 0.0219 -
0.1657 18000 0.0216 -
0.1666 18100 0.0214 -
0.1675 18200 0.0213 -
0.1685 18300 0.0215 -
0.1694 18400 0.0211 -
0.1703 18500 0.0213 -
0.1712 18600 0.0211 -
0.1721 18700 0.0212 -
0.1731 18800 0.0204 -
0.1740 18900 0.0206 -
0.1749 19000 0.021 -
0.1758 19100 0.0208 -
0.1767 19200 0.0202 -
0.1777 19300 0.0199 -
0.1786 19400 0.0204 -
0.1795 19500 0.0199 -
0.1804 19600 0.0196 -
0.1813 19700 0.0198 -
0.1823 19800 0.0199 -
0.1832 19900 0.0194 -
0.1841 20000 0.0191 -
0.1850 20100 0.0193 -
0.1859 20200 0.0193 -
0.1869 20300 0.0192 -
0.1878 20400 0.0192 -
0.1887 20500 0.0188 -
0.1896 20600 0.0183 -
0.1905 20700 0.0186 -
0.1915 20800 0.0182 -
0.1924 20900 0.0184 -
0.1933 21000 0.0187 -
0.1942 21100 0.0184 -
0.1951 21200 0.0183 -
0.1961 21300 0.0181 -
0.1970 21400 0.0178 -
0.1979 21500 0.0179 -
0.1988 21600 0.018 -
0.1997 21700 0.0185 -
0.2000 21728 - 0.0098
0.2007 21800 0.0176 -
0.2016 21900 0.0183 -
0.2025 22000 0.0174 -
0.2034 22100 0.0179 -
0.2044 22200 0.0175 -
0.2053 22300 0.0175 -
0.2062 22400 0.0172 -
0.2071 22500 0.0173 -
0.2080 22600 0.017 -
0.2090 22700 0.0167 -
0.2099 22800 0.0164 -
0.2108 22900 0.0167 -
0.2117 23000 0.0165 -
0.2126 23100 0.0171 -
0.2136 23200 0.0169 -
0.2145 23300 0.0164 -
0.2154 23400 0.0162 -
0.2163 23500 0.0164 -
0.2172 23600 0.0164 -
0.2182 23700 0.0166 -
0.2191 23800 0.0163 -
0.2200 23900 0.0164 -
0.2209 24000 0.0165 -
0.2218 24100 0.0163 -
0.2228 24200 0.0162 -
0.2237 24300 0.0163 -
0.2246 24400 0.0157 -
0.2255 24500 0.0157 -
0.2264 24600 0.0158 -
0.2274 24700 0.0153 -
0.2283 24800 0.0156 -
0.2292 24900 0.0155 -
0.2301 25000 0.0156 -
0.2310 25100 0.0154 -
0.2320 25200 0.0151 -
0.2329 25300 0.0153 -
0.2338 25400 0.015 -
0.2347 25500 0.0153 -
0.2356 25600 0.015 -
0.2366 25700 0.0152 -
0.2375 25800 0.0147 -
0.2384 25900 0.0148 -
0.2393 26000 0.0148 -
0.2402 26100 0.0144 -
0.2412 26200 0.0146 -
0.2421 26300 0.0143 -
0.2430 26400 0.0143 -
0.2439 26500 0.0145 -
0.2449 26600 0.0142 -
0.2458 26700 0.0142 -
0.2467 26800 0.0143 -
0.2476 26900 0.0139 -
0.2485 27000 0.0141 -
0.2495 27100 0.0141 -
0.2504 27200 0.0143 -
0.2513 27300 0.0141 -
0.2522 27400 0.014 -
0.2531 27500 0.0137 -
0.2541 27600 0.014 -
0.2550 27700 0.0139 -
0.2559 27800 0.0138 -
0.2568 27900 0.0141 -
0.2577 28000 0.0138 -
0.2587 28100 0.0138 -
0.2596 28200 0.0134 -
0.2605 28300 0.0135 -
0.2614 28400 0.0131 -
0.2623 28500 0.0133 -
0.2633 28600 0.0132 -
0.2642 28700 0.0133 -
0.2651 28800 0.0131 -
0.2660 28900 0.013 -
0.2669 29000 0.0131 -
0.2679 29100 0.013 -
0.2688 29200 0.0135 -
0.2697 29300 0.0131 -
0.2706 29400 0.0134 -
0.2715 29500 0.0131 -
0.2725 29600 0.0129 -
0.2734 29700 0.0127 -
0.2743 29800 0.0128 -
0.2752 29900 0.0125 -
0.2761 30000 0.0127 -
0.2771 30100 0.0126 -
0.2780 30200 0.0124 -
0.2789 30300 0.0126 -
0.2798 30400 0.0126 -
0.2808 30500 0.0122 -
0.2817 30600 0.0124 -
0.2826 30700 0.0123 -
0.2835 30800 0.0126 -
0.2844 30900 0.0123 -
0.2854 31000 0.012 -
0.2863 31100 0.012 -
0.2872 31200 0.0123 -
0.2881 31300 0.0122 -
0.2890 31400 0.0121 -
0.2900 31500 0.0124 -
0.2909 31600 0.0117 -
0.2918 31700 0.0118 -
0.2927 31800 0.0121 -
0.2936 31900 0.0119 -
0.2946 32000 0.0115 -
0.2955 32100 0.0117 -
0.2964 32200 0.012 -
0.2973 32300 0.0118 -
0.2982 32400 0.0117 -
0.2992 32500 0.0119 -
0.3001 32600 0.0118 -
0.3010 32700 0.0115 -
0.3019 32800 0.012 -
0.3028 32900 0.0119 -
0.3038 33000 0.0113 -
0.3047 33100 0.0117 -
0.3056 33200 0.0117 -
0.3065 33300 0.0113 -
0.3074 33400 0.0113 -
0.3084 33500 0.0113 -
0.3093 33600 0.0117 -
0.3102 33700 0.0111 -
0.3111 33800 0.0112 -
0.3120 33900 0.0113 -
0.3130 34000 0.0111 -
0.3139 34100 0.0113 -
0.3148 34200 0.0115 -
0.3157 34300 0.0114 -
0.3167 34400 0.0109 -
0.3176 34500 0.0112 -
0.3185 34600 0.0109 -
0.3194 34700 0.011 -
0.3203 34800 0.0108 -
0.3213 34900 0.0108 -
0.3222 35000 0.0107 -
0.3231 35100 0.0109 -
0.3240 35200 0.0108 -
0.3249 35300 0.0108 -
0.3259 35400 0.0108 -
0.3268 35500 0.0105 -
0.3277 35600 0.0106 -
0.3286 35700 0.0105 -
0.3295 35800 0.0104 -
0.3305 35900 0.0107 -
0.3314 36000 0.0105 -
0.3323 36100 0.0103 -
0.3332 36200 0.0105 -
0.3341 36300 0.0103 -
0.3351 36400 0.0107 -
0.3360 36500 0.0101 -
0.3369 36600 0.0102 -
0.3378 36700 0.0102 -
0.3387 36800 0.0102 -
0.3397 36900 0.01 -
0.3406 37000 0.0103 -
0.3415 37100 0.0103 -
0.3424 37200 0.01 -
0.3433 37300 0.0103 -
0.3443 37400 0.0103 -
0.3452 37500 0.0104 -
0.3461 37600 0.0098 -
0.3470 37700 0.0099 -
0.3479 37800 0.0102 -
0.3489 37900 0.0102 -
0.3498 38000 0.01 -
0.3507 38100 0.0101 -
0.3516 38200 0.01 -
0.3526 38300 0.0098 -
0.3535 38400 0.0097 -
0.3544 38500 0.0096 -
0.3553 38600 0.01 -
0.3562 38700 0.0097 -
0.3572 38800 0.0101 -
0.3581 38900 0.0099 -
0.3590 39000 0.0099 -
0.3599 39100 0.01 -
0.3608 39200 0.0094 -
0.3618 39300 0.0096 -
0.3627 39400 0.0095 -
0.3636 39500 0.0094 -
0.3645 39600 0.0094 -
0.3654 39700 0.0094 -
0.3664 39800 0.0096 -
0.3673 39900 0.0095 -
0.3682 40000 0.0096 -
0.3691 40100 0.0096 -
0.3700 40200 0.0094 -
0.3710 40300 0.0093 -
0.3719 40400 0.0092 -
0.3728 40500 0.0095 -
0.3737 40600 0.0091 -
0.3746 40700 0.0098 -
0.3756 40800 0.0094 -
0.3765 40900 0.0092 -
0.3774 41000 0.0094 -
0.3783 41100 0.0092 -
0.3792 41200 0.0093 -
0.3802 41300 0.0092 -
0.3811 41400 0.0095 -
0.3820 41500 0.0094 -
0.3829 41600 0.0089 -
0.3838 41700 0.009 -
0.3848 41800 0.0092 -
0.3857 41900 0.009 -
0.3866 42000 0.0089 -
0.3875 42100 0.0091 -
0.3884 42200 0.0087 -
0.3894 42300 0.0091 -
0.3903 42400 0.0089 -
0.3912 42500 0.0089 -
0.3921 42600 0.0089 -
0.3931 42700 0.0087 -
0.3940 42800 0.009 -
0.3949 42900 0.0087 -
0.3958 43000 0.0089 -
0.3967 43100 0.0088 -
0.3977 43200 0.0088 -
0.3986 43300 0.0089 -
0.3995 43400 0.0088 -
0.4000 43456 - 0.0047

Framework Versions

  • Python: 3.11.10
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.4.1+cu121
  • Accelerate: 1.11.0
  • Datasets: 4.3.0
  • Tokenizers: 0.22.1
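
To reproduce this environment, the versions above can be pinned at install time (the CUDA 12.1 PyTorch build may additionally require the matching extra index URL):

pip install sentence-transformers==5.1.2 transformers==4.57.1 accelerate==1.11.0 datasets==4.3.0 tokenizers==0.22.1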

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}