Update README.md
README.md CHANGED

@@ -46,8 +46,6 @@ widget:
 - Un gato está mirando hacia la cámara también.
 - '"Sí, no deseo estar presente durante este testimonio", declaró tranquilamente
 Peterson, de 31 años, al juez cuando fue devuelto a su celda.'
-datasets:
-- clibrain/stsb_multi_es_aug_gpt3.5-turbo_2
 pipeline_tag: sentence-similarity
 library_name: sentence-transformers
 metrics:
@@ -190,7 +188,7 @@ model-index:
 
 # SentenceTransformer based on nomic-ai/modernbert-embed-base
 
-This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) on the
+This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [nomic-ai/modernbert-embed-base](https://huggingface.co/nomic-ai/modernbert-embed-base) on the stsb_multi_es_augmented (private) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
 
 ## Model Details
 
@@ -201,9 +199,7 @@ This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [n
 - **Output Dimensionality:** 768 dimensions
 - **Similarity Function:** Cosine Similarity
 - **Training Dataset:**
--
-<!-- - **Language:** Unknown -->
-<!-- - **License:** Unknown -->
+- Private stsb dataset
 
 ### Model Sources
 
@@ -307,9 +303,8 @@ You can finetune this model on your own dataset.
 
 ### Training Dataset
 
-####
+#### stsb_multi_es_augmented (private)
 
-* Dataset: [stsb_multi_es_aug_gpt3.5-turbo_2](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2) at [3567b77](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2/tree/3567b77024bc5cc6372e058c9f05107deb361664)
 * Size: 2,697 training samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
 * Approximate statistics based on the first 1000 samples:
@@ -347,9 +342,8 @@ You can finetune this model on your own dataset.
 
 ### Evaluation Dataset
 
-####
+#### stsb_multi_es_augmented (private)
 
-* Dataset: [stsb_multi_es_aug_gpt3.5-turbo_2](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2) at [3567b77](https://huggingface.co/datasets/clibrain/stsb_multi_es_aug_gpt3.5-turbo_2/tree/3567b77024bc5cc6372e058c9f05107deb361664)
 * Size: 697 evaluation samples
 * Columns: <code>sentence1</code>, <code>sentence2</code>, and <code>score</code>
 * Approximate statistics based on the first 697 samples: