Update README.md
README.md
CHANGED
@@ -4,13 +4,9 @@ Naming pattern:
 2. `GPL/${dataset}-tsdae-msmarco-distilbert-gpl`: Model with training order of (1) TSDAE on ${dataset} -> (2) MarginMSE on MSMARCO -> (3) GPL on ${dataset};
 3. `GPL/msmarco-distilbert-margin-mse`: Model trained on MSMARCO with MarginMSE;
 4. `GPL/${dataset}-tsdae-msmarco-distilbert-margin-mse`: Model with training order of (1) TSDAE on ${dataset} -> (2) MarginMSE on MSMARCO;
+5. `GPL/${dataset}-distilbert-tas-b-gpl-self_miner`: Starting from the [tas-b model](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-tas-b), the models were trained with GPL on the target corpus ${dataset} with the base model itself as the negative miner (here noted as "self_miner").
 
 Actually, models in 1. and 2. are built on top of 3. and 4., respectively.
 
-** NEW **
-
-1. `GPL/${dataset}-distilbert-tas-b-gpl-self_miner`: Starting from the [tas-b model](https://huggingface.co/sentence-transformers/msmarco-distilbert-base-tas-b), the models were trained with GPL on the target corpus ${dataset} with the base model itself as the negative miner (here noted as "self_miner"). Query generation setup: The total number of generated queries is kept at 250K and the default `queries_per_passage` is set to 3. If the corpus size is larger than 250K/3=83.3K, the corpus is truncated to 83.3K; if the corpus size is below 250K/3=83.3K, `queries_per_passage` is increased accordingly to meet the 250K total (e.g. for a 50K-sized corpus, `queries_per_passage` is set to 250K/50K=5).
-
-
 
 
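The query generation setup described for the `self_miner` models (250K generated queries in total, a default `queries_per_passage` of 3) comes down to a small piece of arithmetic. The sketch below is only an illustration of that rule: `plan_query_generation` is a hypothetical helper, not a function from the GPL codebase, and rounding up for small corpora is an assumption, since the README does not state the rounding rule.

```python
import math

TOTAL_QUERIES = 250_000  # total query budget stated in the README
DEFAULT_QPP = 3          # default `queries_per_passage` stated in the README


def plan_query_generation(corpus_size, total_queries=TOTAL_QUERIES, default_qpp=DEFAULT_QPP):
    """Return (passages_used, queries_per_passage) for a corpus of the given size.

    Hypothetical helper for illustration; not part of the GPL codebase.
    """
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    # Largest corpus that fits the budget at the default rate: 250K / 3 -> 83,333.
    max_passages = total_queries // default_qpp
    if corpus_size >= max_passages:
        # Large corpus: truncate it and keep the default queries per passage.
        return max_passages, default_qpp
    # Small corpus: raise queries per passage so the total reaches the budget.
    # Assumption: round up; the README does not specify the rounding rule.
    return corpus_size, math.ceil(total_queries / corpus_size)


print(plan_query_generation(1_000_000))  # (83333, 3)
print(plan_query_generation(50_000))     # (50000, 5)
```

The second call reproduces the README's own example: a 50K-sized corpus gets 250K/50K=5 queries per passage.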