error577 committed on
Commit e4c7d87 · verified · 1 Parent(s): d469a0e

End of training

Files changed (2):
  1. README.md +7 -3
  2. adapter_model.safetensors +1 -1
README.md CHANGED
@@ -67,7 +67,8 @@ lora_r: 128
  lora_target_linear: true
  lr_scheduler: cosine
  max_grad_norm: 1.0
- max_steps: 500
+ max_steps: 1000
+ auto_resume_from_checkpoints: true
  micro_batch_size: 2
  mlflow_experiment_name: /tmp/b4d50b4ebb62bd26_train_data.json
  model_type: AutoModelForCausalLM
@@ -104,7 +105,7 @@ xformers_attention: null
 
  This model is a fine-tuned version of [unsloth/SmolLM-1.7B-Instruct](https://huggingface.co/unsloth/SmolLM-1.7B-Instruct) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.6911
+ - Loss: 0.7097
 
  ## Model description
 
@@ -132,7 +133,7 @@ The following hyperparameters were used during training:
  - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: cosine
  - lr_scheduler_warmup_steps: 10
- - training_steps: 500
+ - training_steps: 1000
 
  ### Training results
 
@@ -149,6 +150,9 @@ The following hyperparameters were used during training:
  | 0.7743 | 0.0111 | 400 | 0.7006 |
  | 0.507 | 0.0125 | 450 | 0.6939 |
  | 0.5217 | 0.0139 | 500 | 0.6911 |
+ | 0.7122 | 0.0153 | 550 | 0.7409 |
+ | 0.6526 | 0.0167 | 600 | 0.7192 |
+ | 0.5694 | 0.0181 | 650 | 0.7097 |
 
 
  ### Framework versions
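The keys changed above belong to the training configuration reproduced in the model card, which appears to follow an axolotl-style YAML layout. As a minimal sketch of just the affected slice after this commit, with values copied from the hunk above and all other keys omitted:

```yaml
# Sketch of the changed training-config keys (values copied from the diff above;
# the neighboring keys are only the immediate context from the same hunk).
lr_scheduler: cosine
max_grad_norm: 1.0
max_steps: 1000                      # raised from 500
auto_resume_from_checkpoints: true   # newly added: resume from the latest saved checkpoint
micro_batch_size: 2
```

The combination of a higher step budget and checkpoint resumption lets the earlier 500-step run continue rather than restart, which is consistent with the additional evaluation rows (steps 550-650) added to the results table.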
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:91ea2e833e395bd896b7338e0b159889cb0a9805a20a0ba81249634cf8be6acb
+ oid sha256:c6519a65749b3640a198541c50a2d98674828ec6c9d6a4383408f62deb368efa
  size 578859568