SentenceTransformer based on thebajajra/RexBERT-base

This is a sentence-transformers model finetuned from thebajajra/RexBERT-base on the nomic-embed-unsupervised-data dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: thebajajra/RexBERT-base
  • Maximum Sequence Length: 1024 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset: nomic-embed-unsupervised-data

Model Sources

  • Documentation: Sentence Transformers documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
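The Pooling module above produces a single 768-dimensional sentence embedding by mean-pooling the token embeddings, weighted by the attention mask. As a reference point, here is a minimal sketch of that computation using plain transformers; loading this repository directly with AutoModel and the example sentence are assumptions, and the SentenceTransformer API shown below remains the recommended path:

import torch
from transformers import AutoTokenizer, AutoModel

# Sketch only: load the underlying encoder weights directly (assumes the
# repository can be loaded with AutoModel, as is typical for these cards).
tokenizer = AutoTokenizer.from_pretrained("thebajajra/RexBERT-base-embed-pf-v0.1")
encoder = AutoModel.from_pretrained("thebajajra/RexBERT-base-embed-pf-v0.1")

batch = tokenizer(["An example sentence."], padding=True, truncation=True,
                  max_length=1024, return_tensors="pt")
with torch.no_grad():
    token_embeddings = encoder(**batch).last_hidden_state   # (batch, seq_len, 768)

# Mean pooling: average the token embeddings, ignoring padding positions.
mask = batch["attention_mask"].unsqueeze(-1).float()        # (batch, seq_len, 1)
sentence_embedding = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1)
print(sentence_embedding.shape)                             # torch.Size([1, 768])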

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
queries = [
    "Where do you guys go to find used camper shells?",
]
documents = [
    "I've got a newly acquired 1st gen 2005 silvee Toyota tundra trd and am looking for an used camper shell.  Craigslist hasnt been very useful....where do you guys go?\n\nThanks!",
    "I work at a convenience store and the number of Newports I sell a day is insane. Considering buying a couple cartons of em and maybe some parliament menthols if the FDA goes through with this. Should be able to throw em up on craigslist or ebay a week or two later and it'll be like steaks in a piranha pond",
    "Hey guys what is the most optimal tool for pulling long staples out from hardwood flooring? I'm trying to find the most optimal way to do it because I have thousands to pull! Fence pliers did not work too well on account the pointy tip was too thick get in and roll them out and when i tried the gripping/cutting part it broke the staples.\n\nI'm thinking round nose vice grips or a car gasket puller?\n\nThanks",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 768] [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[0.8108, 0.2481, 0.1200]])
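Since the similarity function is cosine similarity, the score matrix can be used directly for ranking. A small illustrative continuation of the snippet above that sorts the documents for each query (variable names reuse the snippet; the 60-character truncation is just for display):

# Rank documents for each query by similarity (continues the snippet above).
ranking = similarities.argsort(dim=1, descending=True)
for q_idx, (query, order) in enumerate(zip(queries, ranking)):
    print(f"Query: {query}")
    for rank, doc_idx in enumerate(order.tolist(), start=1):
        score = similarities[q_idx, doc_idx].item()
        print(f"  {rank}. score={score:.4f}  {documents[doc_idx][:60]}...")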

Training Details

Training Dataset

nomic-embed-unsupervised-data

  • Dataset: nomic-embed-unsupervised-data at 917bae6
  • Size: 222,490,215 training samples
  • Columns: query and document
  • Approximate statistics based on the first 1000 samples:
    • query: string, min: 6 tokens, mean: 16.83 tokens, max: 62 tokens
    • document: string, min: 12 tokens, mean: 162.25 tokens, max: 1024 tokens
  • Samples:
    I became a US citizen early this year and this is going to be my first 4th of July as an American! Because of the current situation, my citizen oath ceremony felt more like a pick up order... Got my certificate, and no guests allowed, so I couldn’t bring anybody to join my ceremony, also no pictures.

    Anyway... I want to celebrate big time this 4th of July, and I’m already planning it! (Any ideas are super welcome!). I say big time but I just really want to do something fun at home with my family. 😊
    "The Kingdom of God for Jesus"; I know you guys know how to answer this overrated question. Basically what we're talking about is that the "kingdom" of god according to jesus are:

    * "the kingdom as good news (where the kingdom is on earth, whereas by living a beautiful, meaningful life on earth is the meaning of salvation)"
    * "the kingdom is offered to all"
    * etc.

    and finally, the question goes like this: "The Kingdom Does Not Ask for Performance; It is a gift, an offer. We can only inherit it. So, what is the point of being good?"
    So I made a "size" chart to go with my weight infograph, all based off that "Relative champ weight/height" thread. Here's the weight chart I did the other day



    And here's the size chart I did today.



    *Anivia, Skarner and Shyvanna (dragon form) are "Dimensions" instead of an actual "height", but I think you can get the jist.

    The original thread this is based off of is located via the link below. I am using these numbers (and my own conversions), so I'm not always sure where they got the numbers!

  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
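For intuition, MultipleNegativesRankingLoss treats every other document in the batch as a negative for a given query: the query-to-document cosine similarity matrix is multiplied by scale (20.0 here) and a cross-entropy loss pushes each query toward its paired document on the diagonal. A minimal, self-contained sketch of that computation (the function name and toy tensors are illustrative, not the library implementation):

import torch
import torch.nn.functional as F

def mnrl_sketch(query_emb: torch.Tensor, doc_emb: torch.Tensor, scale: float = 20.0):
    # (B, B) cosine similarity matrix between every query and every document.
    scores = scale * F.cosine_similarity(query_emb.unsqueeze(1), doc_emb.unsqueeze(0), dim=-1)
    # Row i should score highest at column i (its paired document); the rest act as negatives.
    labels = torch.arange(scores.size(0), device=scores.device)
    return F.cross_entropy(scores, labels)

# Toy usage: a batch of 4 (query, document) pairs with 768-dimensional embeddings.
loss = mnrl_sketch(torch.randn(4, 768), torch.randn(4, 768))
print(loss.item())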
    

Evaluation Dataset

nomic-embed-unsupervised-data

  • Dataset: nomic-embed-unsupervised-data at 917bae6
  • Size: 222,727 evaluation samples
  • Columns: query and document
  • Approximate statistics based on the first 1000 samples:
    • query: string, min: 6 tokens, mean: 16.41 tokens, max: 66 tokens
    • document: string, min: 15 tokens, mean: 164.47 tokens, max: 1024 tokens
  • Samples:
    Do you subscribe to any horror magazines? I get most of my horror news from blogs and websites and such, but i do subscribe to a bunch of horror mags. With everything being so digital these days, something about flipping through a magazine and reading articles about both classic and upcoming horror movies is refreshing. I get a lot of great recommendations from them, and theres a lot of interesting interviews and behind the scenes stuff that i dont see on the popular websites.
    Missing PDS Laundry Card :( This is an absolute long shot but I must've accidentally left my laundry card in the dryer card slot because I cant find it anywhere. If someone found a card in there, please DM me. I've already bought a card but I'd like to have my original card back :(
    Talking Bad will be terrible Talking Dead is horrible and this will be to. Chris Hardwick and the cast of random no name celebrities offer nothing new to the discussion. The only good thing about Breaking Bad ending is that Talking Bad will end soon as well.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim",
        "gather_across_devices": false
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 128
  • learning_rate: 2e-06
  • num_train_epochs: 4
  • warmup_ratio: 0.1
  • bf16: True
  • batch_sampler: no_duplicates
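
A minimal sketch of how these non-default values map onto SentenceTransformerTrainingArguments; the output directory is a placeholder and all omitted arguments keep their defaults, so this is not the exact configuration used:

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output/rexbert-base-embed",       # placeholder path
    eval_strategy="steps",
    per_device_train_batch_size=256,
    per_device_eval_batch_size=128,
    learning_rate=2e-6,
    num_train_epochs=4,
    warmup_ratio=0.1,
    bf16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,    # avoids duplicate texts within a batch
)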

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 128
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: True
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • parallelism_config: None
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • project: huggingface
  • trackio_space_id: trackio
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: no
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: True
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

Epoch Step Training Loss Validation Loss
0.0009 100 4.4714 -
0.0018 200 4.4457 -
0.0028 300 4.4007 -
0.0037 400 4.336 -
0.0046 500 4.2476 -
0.0055 600 4.1406 -
0.0064 700 4.0049 -
0.0074 800 3.8434 -
0.0083 900 3.6393 -
0.0092 1000 3.3763 -
0.0101 1100 3.0541 -
0.0110 1200 2.6362 -
0.0120 1300 2.1226 -
0.0129 1400 1.6113 -
0.0138 1500 1.2565 -
0.0147 1600 1.029 -
0.0156 1700 0.846 -
0.0166 1800 0.7111 -
0.0175 1900 0.5967 -
0.0184 2000 0.488 -
0.0193 2100 0.4138 -
0.0203 2200 0.3565 -
0.0212 2300 0.3129 -
0.0221 2400 0.2827 -
0.0230 2500 0.2557 -
0.0239 2600 0.2379 -
0.0249 2700 0.2234 -
0.0258 2800 0.2055 -
0.0267 2900 0.1926 -
0.0276 3000 0.1843 -
0.0285 3100 0.175 -
0.0295 3200 0.1647 -
0.0304 3300 0.157 -
0.0313 3400 0.1512 -
0.0322 3500 0.146 -
0.0331 3600 0.1412 -
0.0341 3700 0.1352 -
0.0350 3800 0.1295 -
0.0359 3900 0.1261 -
0.0368 4000 0.122 -
0.0377 4100 0.1171 -
0.0387 4200 0.1147 -
0.0396 4300 0.1103 -
0.0405 4400 0.1073 -
0.0414 4500 0.1053 -
0.0423 4600 0.1016 -
0.0433 4700 0.0991 -
0.0442 4800 0.0981 -
0.0451 4900 0.0935 -
0.0460 5000 0.0928 -
0.0469 5100 0.0895 -
0.0479 5200 0.0877 -
0.0488 5300 0.0853 -
0.0497 5400 0.0829 -
0.0506 5500 0.0818 -
0.0515 5600 0.0805 -
0.0525 5700 0.0785 -
0.0534 5800 0.0769 -
0.0543 5900 0.0746 -
0.0552 6000 0.0754 -
0.0562 6100 0.0715 -
0.0571 6200 0.0707 -
0.0580 6300 0.0699 -
0.0589 6400 0.0678 -
0.0598 6500 0.0659 -
0.0608 6600 0.0659 -
0.0617 6700 0.0646 -
0.0626 6800 0.0627 -
0.0635 6900 0.0627 -
0.0644 7000 0.0604 -
0.0654 7100 0.0592 -
0.0663 7200 0.059 -
0.0672 7300 0.0577 -
0.0681 7400 0.0568 -
0.0690 7500 0.0558 -
0.0700 7600 0.0552 -
0.0709 7700 0.0542 -
0.0718 7800 0.0531 -
0.0727 7900 0.0528 -
0.0736 8000 0.0526 -
0.0746 8100 0.0509 -
0.0755 8200 0.05 -
0.0764 8300 0.0495 -
0.0773 8400 0.0486 -
0.0782 8500 0.0482 -
0.0792 8600 0.048 -
0.0801 8700 0.0468 -
0.0810 8800 0.0461 -
0.0819 8900 0.0459 -
0.0828 9000 0.0453 -
0.0838 9100 0.0442 -
0.0847 9200 0.0443 -
0.0856 9300 0.0437 -
0.0865 9400 0.0435 -
0.0874 9500 0.0426 -
0.0884 9600 0.042 -
0.0893 9700 0.0423 -
0.0902 9800 0.0406 -
0.0911 9900 0.0405 -
0.0920 10000 0.0397 -
0.0930 10100 0.0401 -
0.0939 10200 0.0392 -
0.0948 10300 0.0396 -
0.0957 10400 0.0391 -
0.0967 10500 0.0384 -
0.0976 10600 0.0377 -
0.0985 10700 0.0379 -
0.0994 10800 0.0372 -
0.1003 10900 0.0364 -
0.1013 11000 0.0367 -
0.1022 11100 0.0359 -
0.1031 11200 0.0355 -
0.1040 11300 0.0358 -
0.1049 11400 0.035 -
0.1059 11500 0.0353 -
0.1068 11600 0.0341 -
0.1077 11700 0.0343 -
0.1086 11800 0.034 -
0.1095 11900 0.0334 -
0.1105 12000 0.0337 -
0.1114 12100 0.0332 -
0.1123 12200 0.0323 -
0.1132 12300 0.0323 -
0.1141 12400 0.0322 -
0.1151 12500 0.0312 -
0.1160 12600 0.0307 -
0.1169 12700 0.0314 -
0.1178 12800 0.0309 -
0.1187 12900 0.0313 -
0.1197 13000 0.0306 -
0.1206 13100 0.0303 -
0.1215 13200 0.0301 -
0.1224 13300 0.0302 -
0.1233 13400 0.0296 -
0.1243 13500 0.029 -
0.1252 13600 0.0288 -
0.1261 13700 0.0286 -
0.1270 13800 0.0291 -
0.1279 13900 0.0287 -
0.1289 14000 0.0284 -
0.1298 14100 0.0276 -
0.1307 14200 0.028 -
0.1316 14300 0.0275 -
0.1326 14400 0.0269 -
0.1335 14500 0.027 -
0.1344 14600 0.0273 -
0.1353 14700 0.0267 -
0.1362 14800 0.0263 -
0.1372 14900 0.0264 -
0.1381 15000 0.0263 -
0.1390 15100 0.0262 -
0.1399 15200 0.0256 -
0.1408 15300 0.0254 -
0.1418 15400 0.0257 -
0.1427 15500 0.0251 -
0.1436 15600 0.0253 -
0.1445 15700 0.0247 -
0.1454 15800 0.0251 -
0.1464 15900 0.0245 -
0.1473 16000 0.0246 -
0.1482 16100 0.024 -
0.1491 16200 0.0241 -
0.1500 16300 0.0243 -
0.1510 16400 0.0235 -
0.1519 16500 0.024 -
0.1528 16600 0.0236 -
0.1537 16700 0.0233 -
0.1546 16800 0.0237 -
0.1556 16900 0.023 -
0.1565 17000 0.0233 -
0.1574 17100 0.0229 -
0.1583 17200 0.0227 -
0.1592 17300 0.023 -
0.1602 17400 0.0232 -
0.1611 17500 0.0221 -
0.1620 17600 0.0217 -
0.1629 17700 0.0224 -
0.1638 17800 0.0217 -
0.1648 17900 0.0219 -
0.1657 18000 0.0216 -
0.1666 18100 0.0214 -
0.1675 18200 0.0213 -
0.1685 18300 0.0215 -
0.1694 18400 0.0211 -
0.1703 18500 0.0213 -
0.1712 18600 0.0211 -
0.1721 18700 0.0212 -
0.1731 18800 0.0204 -
0.1740 18900 0.0206 -
0.1749 19000 0.021 -
0.1758 19100 0.0208 -
0.1767 19200 0.0202 -
0.1777 19300 0.0199 -
0.1786 19400 0.0204 -
0.1795 19500 0.0199 -
0.1804 19600 0.0196 -
0.1813 19700 0.0198 -
0.1823 19800 0.0199 -
0.1832 19900 0.0194 -
0.1841 20000 0.0191 -
0.1850 20100 0.0193 -
0.1859 20200 0.0193 -
0.1869 20300 0.0192 -
0.1878 20400 0.0192 -
0.1887 20500 0.0188 -
0.1896 20600 0.0183 -
0.1905 20700 0.0186 -
0.1915 20800 0.0182 -
0.1924 20900 0.0184 -
0.1933 21000 0.0187 -
0.1942 21100 0.0184 -
0.1951 21200 0.0183 -
0.1961 21300 0.0181 -
0.1970 21400 0.0178 -
0.1979 21500 0.0179 -
0.1988 21600 0.018 -
0.1997 21700 0.0185 -
0.2000 21728 - 0.0098
0.2007 21800 0.0176 -
0.2016 21900 0.0183 -
0.2025 22000 0.0174 -
0.2034 22100 0.0179 -
0.2044 22200 0.0175 -
0.2053 22300 0.0175 -
0.2062 22400 0.0172 -
0.2071 22500 0.0173 -
0.2080 22600 0.017 -
0.2090 22700 0.0167 -
0.2099 22800 0.0164 -
0.2108 22900 0.0167 -
0.2117 23000 0.0165 -
0.2126 23100 0.0171 -
0.2136 23200 0.0169 -
0.2145 23300 0.0164 -
0.2154 23400 0.0162 -
0.2163 23500 0.0164 -
0.2172 23600 0.0164 -
0.2182 23700 0.0166 -
0.2191 23800 0.0163 -
0.2200 23900 0.0164 -
0.2209 24000 0.0165 -
0.2218 24100 0.0163 -
0.2228 24200 0.0162 -
0.2237 24300 0.0163 -
0.2246 24400 0.0157 -
0.2255 24500 0.0157 -
0.2264 24600 0.0158 -
0.2274 24700 0.0153 -
0.2283 24800 0.0156 -
0.2292 24900 0.0155 -
0.2301 25000 0.0156 -
0.2310 25100 0.0154 -
0.2320 25200 0.0151 -
0.2329 25300 0.0153 -
0.2338 25400 0.015 -
0.2347 25500 0.0153 -
0.2356 25600 0.015 -
0.2366 25700 0.0152 -
0.2375 25800 0.0147 -
0.2384 25900 0.0148 -
0.2393 26000 0.0148 -
0.2402 26100 0.0144 -
0.2412 26200 0.0146 -
0.2421 26300 0.0143 -
0.2430 26400 0.0143 -
0.2439 26500 0.0145 -
0.2449 26600 0.0142 -
0.2458 26700 0.0142 -
0.2467 26800 0.0143 -
0.2476 26900 0.0139 -
0.2485 27000 0.0141 -
0.2495 27100 0.0141 -
0.2504 27200 0.0143 -
0.2513 27300 0.0141 -
0.2522 27400 0.014 -
0.2531 27500 0.0137 -
0.2541 27600 0.014 -
0.2550 27700 0.0139 -
0.2559 27800 0.0138 -
0.2568 27900 0.0141 -
0.2577 28000 0.0138 -
0.2587 28100 0.0138 -
0.2596 28200 0.0134 -
0.2605 28300 0.0135 -
0.2614 28400 0.0131 -
0.2623 28500 0.0133 -
0.2633 28600 0.0132 -
0.2642 28700 0.0133 -
0.2651 28800 0.0131 -
0.2660 28900 0.013 -
0.2669 29000 0.0131 -
0.2679 29100 0.013 -
0.2688 29200 0.0135 -
0.2697 29300 0.0131 -
0.2706 29400 0.0134 -
0.2715 29500 0.0131 -
0.2725 29600 0.0129 -
0.2734 29700 0.0127 -
0.2743 29800 0.0128 -
0.2752 29900 0.0125 -
0.2761 30000 0.0127 -
0.2771 30100 0.0126 -
0.2780 30200 0.0124 -
0.2789 30300 0.0126 -
0.2798 30400 0.0126 -
0.2808 30500 0.0122 -
0.2817 30600 0.0124 -
0.2826 30700 0.0123 -
0.2835 30800 0.0126 -
0.2844 30900 0.0123 -
0.2854 31000 0.012 -
0.2863 31100 0.012 -
0.2872 31200 0.0123 -
0.2881 31300 0.0122 -
0.2890 31400 0.0121 -
0.2900 31500 0.0124 -
0.2909 31600 0.0117 -
0.2918 31700 0.0118 -
0.2927 31800 0.0121 -
0.2936 31900 0.0119 -
0.2946 32000 0.0115 -
0.2955 32100 0.0117 -
0.2964 32200 0.012 -
0.2973 32300 0.0118 -
0.2982 32400 0.0117 -
0.2992 32500 0.0119 -
0.3001 32600 0.0118 -
0.3010 32700 0.0115 -
0.3019 32800 0.012 -
0.3028 32900 0.0119 -
0.3038 33000 0.0113 -
0.3047 33100 0.0117 -
0.3056 33200 0.0117 -
0.3065 33300 0.0113 -
0.3074 33400 0.0113 -
0.3084 33500 0.0113 -
0.3093 33600 0.0117 -
0.3102 33700 0.0111 -
0.3111 33800 0.0112 -
0.3120 33900 0.0113 -
0.3130 34000 0.0111 -
0.3139 34100 0.0113 -
0.3148 34200 0.0115 -
0.3157 34300 0.0114 -
0.3167 34400 0.0109 -
0.3176 34500 0.0112 -
0.3185 34600 0.0109 -
0.3194 34700 0.011 -
0.3203 34800 0.0108 -
0.3213 34900 0.0108 -
0.3222 35000 0.0107 -
0.3231 35100 0.0109 -
0.3240 35200 0.0108 -
0.3249 35300 0.0108 -
0.3259 35400 0.0108 -
0.3268 35500 0.0105 -
0.3277 35600 0.0106 -
0.3286 35700 0.0105 -
0.3295 35800 0.0104 -
0.3305 35900 0.0107 -
0.3314 36000 0.0105 -
0.3323 36100 0.0103 -
0.3332 36200 0.0105 -
0.3341 36300 0.0103 -
0.3351 36400 0.0107 -
0.3360 36500 0.0101 -
0.3369 36600 0.0102 -
0.3378 36700 0.0102 -
0.3387 36800 0.0102 -
0.3397 36900 0.01 -
0.3406 37000 0.0103 -
0.3415 37100 0.0103 -
0.3424 37200 0.01 -
0.3433 37300 0.0103 -
0.3443 37400 0.0103 -
0.3452 37500 0.0104 -
0.3461 37600 0.0098 -
0.3470 37700 0.0099 -
0.3479 37800 0.0102 -
0.3489 37900 0.0102 -
0.3498 38000 0.01 -
0.3507 38100 0.0101 -
0.3516 38200 0.01 -
0.3526 38300 0.0098 -
0.3535 38400 0.0097 -
0.3544 38500 0.0096 -
0.3553 38600 0.01 -
0.3562 38700 0.0097 -
0.3572 38800 0.0101 -
0.3581 38900 0.0099 -
0.3590 39000 0.0099 -
0.3599 39100 0.01 -
0.3608 39200 0.0094 -
0.3618 39300 0.0096 -
0.3627 39400 0.0095 -
0.3636 39500 0.0094 -
0.3645 39600 0.0094 -
0.3654 39700 0.0094 -
0.3664 39800 0.0096 -
0.3673 39900 0.0095 -
0.3682 40000 0.0096 -
0.3691 40100 0.0096 -
0.3700 40200 0.0094 -
0.3710 40300 0.0093 -
0.3719 40400 0.0092 -
0.3728 40500 0.0095 -
0.3737 40600 0.0091 -
0.3746 40700 0.0098 -
0.3756 40800 0.0094 -
0.3765 40900 0.0092 -
0.3774 41000 0.0094 -
0.3783 41100 0.0092 -
0.3792 41200 0.0093 -
0.3802 41300 0.0092 -
0.3811 41400 0.0095 -
0.3820 41500 0.0094 -
0.3829 41600 0.0089 -
0.3838 41700 0.009 -
0.3848 41800 0.0092 -
0.3857 41900 0.009 -
0.3866 42000 0.0089 -
0.3875 42100 0.0091 -
0.3884 42200 0.0087 -
0.3894 42300 0.0091 -
0.3903 42400 0.0089 -
0.3912 42500 0.0089 -
0.3921 42600 0.0089 -
0.3931 42700 0.0087 -
0.3940 42800 0.009 -
0.3949 42900 0.0087 -
0.3958 43000 0.0089 -
0.3967 43100 0.0088 -
0.3977 43200 0.0088 -
0.3986 43300 0.0089 -
0.3995 43400 0.0088 -
0.4000 43456 - 0.0047

Framework Versions

  • Python: 3.11.10
  • Sentence Transformers: 5.1.2
  • Transformers: 4.57.1
  • PyTorch: 2.4.1+cu121
  • Accelerate: 1.11.0
  • Datasets: 4.3.0
  • Tokenizers: 0.22.1
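
To reproduce this environment, the versions above can be pinned at install time (the CUDA 12.1 PyTorch build may additionally require the matching extra index URL):

pip install sentence-transformers==5.1.2 transformers==4.57.1 accelerate==1.11.0 datasets==4.3.0 tokenizers==0.22.1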

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}