Update README.md
README.md
CHANGED
@@ -42,11 +42,6 @@ Five conditional benchmarks, using [lm-evaluation-harness](https://github.com/El
 - MMLU: 0-shot, report normalized accuracy
 - TruthfulQA: 3-shot, report accuracy of single-true mc1 setting
 
-One open-ended benchmark, using official [alpaca_eval](https://github.com/tatsu-lab/alpaca_eval/):
-- AlpacaEval2: win rate (%) judged by GPT-4-turbo between the model's outputs vs. the GPT-4-turbo's response
-- LC AlpacaEval2: length-debiased win rate (%) of AlpacaEval2
-- Length in Tokens: the average output length of AlpacaEval2, calculated in tokens with Llama3's tokenizer
-
 ## Input Format
 
 The model is trained to use the following format:
@@ -61,11 +56,10 @@ The model is trained to use the following format:
 
 ## Training hyperparameters
 
-The following hyperparameters were used during
+The following hyperparameters were used during training:
 - learning_rate: 1e-5
 - total_train_batch_size: 16
 - optimizer: AdamW with beta1 0.9, beta2 0.999 and epsilon 1e-8
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_ratio: 0.04
-- num_epochs: 1.0
-- Specifically add above input format over training samples
+- num_epochs: 1.0
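The hyperparameter list in the README gives a peak learning rate of 1e-5, a cosine scheduler, and a warmup ratio of 0.04. It does not say which scheduler implementation was used. As a rough illustration only, the sketch below assumes the common shape of linear warmup followed by cosine decay to zero. The function name `lr_at_step` and the total step count are hypothetical, since the README does not state the dataset size or number of optimizer steps.

```python
import math

# Values taken from the README's hyperparameter list.
LEARNING_RATE = 1e-5
WARMUP_RATIO = 0.04


def lr_at_step(step: int, total_steps: int,
               peak_lr: float = LEARNING_RATE,
               warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step`, assuming linear warmup then cosine decay to zero.

    This is an illustrative assumption, not the authors' confirmed schedule.
    """
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase that has elapsed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Half-cosine decay from peak_lr down to 0.
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 1000  # hypothetical; not stated in the README
    for step in (0, 20, 40, 520, 1000):
        print(step, lr_at_step(step, total_steps))
```

With a hypothetical 1000 total steps, the first 40 steps are warmup, the peak is reached at step 40, and the rate decays to zero by the final step.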