End of training
Files changed:
- README.md +7 -3
- adapter_model.safetensors +1 -1

README.md CHANGED
@@ -67,7 +67,8 @@ lora_r: 128
 lora_target_linear: true
 lr_scheduler: cosine
 max_grad_norm: 1.0
-max_steps:
+max_steps: 1000
+auto_resume_from_checkpoints: true
 micro_batch_size: 2
 mlflow_experiment_name: /tmp/b4d50b4ebb62bd26_train_data.json
 model_type: AutoModelForCausalLM
@@ -104,7 +105,7 @@ xformers_attention: null
 
 This model is a fine-tuned version of [unsloth/SmolLM-1.7B-Instruct](https://huggingface.co/unsloth/SmolLM-1.7B-Instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.
+- Loss: 0.7097
 
 ## Model description
 
@@ -132,7 +133,7 @@ The following hyperparameters were used during training:
 - optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
 - lr_scheduler_type: cosine
 - lr_scheduler_warmup_steps: 10
-- training_steps:
+- training_steps: 1000
 
 ### Training results
 
@@ -149,6 +150,9 @@ The following hyperparameters were used during training:
 | 0.7743 | 0.0111 | 400 | 0.7006 |
 | 0.507 | 0.0125 | 450 | 0.6939 |
 | 0.5217 | 0.0139 | 500 | 0.6911 |
+| 0.7122 | 0.0153 | 550 | 0.7409 |
+| 0.6526 | 0.0167 | 600 | 0.7192 |
+| 0.5694 | 0.0181 | 650 | 0.7097 |
 
 
 ### Framework versions
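For readers reproducing the run, the hyperparameters above (cosine schedule, 10 warmup steps, 1000 training steps, `OptimizerNames.ADAMW_BNB`) map onto standard `transformers`/`bitsandbytes` calls. A minimal sketch of that pairing; the learning rate and the stand-in module are illustrative assumptions, since neither appears in this diff:

```python
# Sketch of the optimizer/scheduler pair described in the card.
# ADAMW_BNB corresponds to bitsandbytes' 8-bit AdamW; lr=2e-4 and the
# Linear module are placeholder assumptions, not values from this diff.
import bitsandbytes as bnb
import torch
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(8, 8)  # stand-in for the LoRA-wrapped model
optimizer = bnb.optim.AdamW8bit(
    model.parameters(),
    lr=2e-4,             # assumed; not stated in the diff
    betas=(0.9, 0.999),  # betas=(0.9,0.999) from the card
    eps=1e-8,            # epsilon=1e-08 from the card
)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=10,      # lr_scheduler_warmup_steps: 10
    num_training_steps=1000,  # training_steps: 1000
)
```

`get_cosine_schedule_with_warmup` ramps the learning rate linearly over the first 10 steps, then follows a cosine decay to zero at step 1000, matching `lr_scheduler: cosine` in the config.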
adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:c6519a65749b3640a198541c50a2d98674828ec6c9d6a4383408f62deb368efa
 size 578859568
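The adapter weights change only through this git-lfs pointer: the payload stays at 578859568 bytes while the sha256 object id is updated. A short sketch, assuming the file has already been downloaded under its repository filename, of checking a local copy against the pointer:

```python
# Verify a downloaded adapter_model.safetensors against the git-lfs
# pointer above; the local path is an assumption for illustration.
import hashlib
import os

path = "adapter_model.safetensors"  # assumed local download location
expected_oid = "c6519a65749b3640a198541c50a2d98674828ec6c9d6a4383408f62deb368efa"
expected_size = 578859568

assert os.path.getsize(path) == expected_size, "size mismatch"

digest = hashlib.sha256()
with open(path, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)
assert digest.hexdigest() == expected_oid, "sha256 mismatch"
print("file matches the LFS pointer")
```

Once verified, the adapter would typically be applied on top of the base checkpoint with `peft.PeftModel.from_pretrained(base_model, adapter_dir)`.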