FatCat87/6d89a915-cad3-42ab-8d1a-5e9e9e98151c

@@ -1,12 +1,12 @@
 ---
-license: llama3
 library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
-base_model: unsloth/Qwen2.5-3B
 model-index:
-- name: 080fd5a4-0250-4b62-86a9-7eef387d5b80
   results: []
 ---
@@ -19,19 +19,19 @@ should probably proofread and complete it, then remove this comment. -->
 axolotl version: `0.4.1`
 ```yaml
 adapter: lora
-base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
 bf16: auto
 datasets:
 - data_files:
-  - 8da150510918d7cc_train_data.json
   ds_type: json
   format: custom
-  path: 8da150510918d7cc_train_data.json
   type:
     field: null
     field_input: null
-    field_instruction: paper_title
-    field_output: paper_abstract
     field_system: null
     format: null
     no_input_format: null
@@ -51,7 +51,7 @@ fsdp_config: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
-hub_model_id: FatCat87/080fd5a4-0250-4b62-86a9-7eef387d5b80
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
@@ -71,9 +71,10 @@ pad_to_sequence_len: true
 resume_from_checkpoint: null
 sample_packing: true
 saves_per_epoch: 1
-seed: 31014
 sequence_len: 4096
-special_tokens: null
 strict: false
 tf32: false
 tokenizer_type: AutoTokenizer
@@ -82,9 +83,9 @@ val_set_size: 0.1
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
-wandb_name: 080fd5a4-0250-4b62-86a9-7eef387d5b80
 wandb_project: subnet56
-wandb_runid: 080fd5a4-0250-4b62-86a9-7eef387d5b80
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
@@ -94,12 +95,12 @@ xformers_attention: null
 </details><br>
-[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/5fyifub3)
-# 080fd5a4-0250-4b62-86a9-7eef387d5b80
-This model is a fine-tuned version of [scb10x/llama-3-typhoon-v1.5-8b-instruct](https://huggingface.co/scb10x/llama-3-typhoon-v1.5-8b-instruct) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 2.2622
 ## Model description
@@ -121,7 +122,7 @@ The following hyperparameters were used during training:
 - learning_rate: 0.0002
 - train_batch_size: 2
 - eval_batch_size: 2
-- seed: 31014
 - distributed_type: multi-GPU
 - num_devices: 2
 - gradient_accumulation_steps: 4
@@ -129,17 +130,16 @@ The following hyperparameters were used during training:
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 2
 - num_epochs: 1
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
-| 2.5948        | 0.0215 | 1    | 2.5959          |
-| 2.2907        | 0.2581 | 12   | 2.3007          |
-| 2.2559        | 0.5161 | 24   | 2.2711          |
-| 2.2303        | 0.7742 | 36   | 2.2622          |
 ### Framework versions

 ---
+license: apache-2.0
 library_name: peft
 tags:
 - axolotl
 - generated_from_trainer
+base_model: EleutherAI/pythia-70m-deduped
 model-index:
+- name: 6d89a915-cad3-42ab-8d1a-5e9e9e98151c
   results: []
 ---
 axolotl version: `0.4.1`
 ```yaml
 adapter: lora
+base_model: EleutherAI/pythia-70m-deduped
 bf16: auto
 datasets:
 - data_files:
+  - c0e356afd17a58f1_train_data.json
   ds_type: json
   format: custom
+  path: c0e356afd17a58f1_train_data.json
   type:
     field: null
     field_input: null
+    field_instruction: ruby_text
+    field_output: text
     field_system: null
     format: null
     no_input_format: null
 gradient_accumulation_steps: 4
 gradient_checkpointing: true
 group_by_length: false
+hub_model_id: FatCat87/6d89a915-cad3-42ab-8d1a-5e9e9e98151c
 learning_rate: 0.0002
 load_in_4bit: false
 load_in_8bit: true
 resume_from_checkpoint: null
 sample_packing: true
 saves_per_epoch: 1
+seed: 26260
 sequence_len: 4096
+special_tokens:
+  pad_token: <|endoftext|>
 strict: false
 tf32: false
 tokenizer_type: AutoTokenizer
 wandb_entity: fatcat87-taopanda
 wandb_log_model: null
 wandb_mode: online
+wandb_name: 6d89a915-cad3-42ab-8d1a-5e9e9e98151c
 wandb_project: subnet56
+wandb_runid: 6d89a915-cad3-42ab-8d1a-5e9e9e98151c
 wandb_watch: null
 warmup_ratio: 0.05
 weight_decay: 0.0
 </details><br>
+[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/fatcat87-taopanda/subnet56/runs/51bg82x5)
+# 6d89a915-cad3-42ab-8d1a-5e9e9e98151c
+This model is a fine-tuned version of [EleutherAI/pythia-70m-deduped](https://huggingface.co/EleutherAI/pythia-70m-deduped) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 32.2953
 ## Model description
 - learning_rate: 0.0002
 - train_batch_size: 2
 - eval_batch_size: 2
+- seed: 26260
 - distributed_type: multi-GPU
 - num_devices: 2
 - gradient_accumulation_steps: 4
 - total_eval_batch_size: 4
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: cosine
 - num_epochs: 1
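
The batch-size entries in the hyperparameter list above are related by simple multiplication. As a minimal sketch, assuming the usual convention that the total batch size is the per-device batch size times the number of devices times the gradient accumulation steps, the values can be reproduced as follows. The helper name `effective_batch_size` is hypothetical and is not part of axolotl or transformers.

```python
def effective_batch_size(per_device: int, num_devices: int, accumulation_steps: int = 1) -> int:
    """Number of samples contributing to one step (hypothetical helper)."""
    return per_device * num_devices * accumulation_steps


# Values copied from the hyperparameter list in the diff.
train_batch_size = 2
eval_batch_size = 2
num_devices = 2
gradient_accumulation_steps = 4

# Training accumulates gradients over several micro-batches per optimizer step.
total_train_batch_size = effective_batch_size(
    train_batch_size, num_devices, gradient_accumulation_steps
)
# Evaluation does not accumulate, so only the device count multiplies in.
total_eval_batch_size = effective_batch_size(eval_batch_size, num_devices)

print(total_train_batch_size)  # 16
print(total_eval_batch_size)   # 4, matching total_eval_batch_size in the list
```
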
 ### Training results
 | Training Loss | Epoch  | Step | Validation Loss |
 |:-------------:|:------:|:----:|:---------------:|
+| 48.9386       | 0.0635 | 1    | 45.6835         |
+| 47.7809       | 0.2540 | 4    | 44.7200         |
+| 34.2772       | 0.5079 | 8    | 38.5010         |
+| 30.6758       | 0.7619 | 12   | 32.2953         |
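
To compare the two runs in this diff, the relative drop in validation loss between the first and last logged evaluation can be computed from the table rows. This is a rough sketch: the function name `relative_drop` is hypothetical, and the numbers are copied from the removed and added table rows. The absolute losses are not directly comparable, since the two runs use different base models and datasets.

```python
def relative_drop(first: float, last: float) -> float:
    """Fractional decrease from the first to the last value (hypothetical helper)."""
    return (first - last) / first


# (step, validation loss) pairs copied from the removed rows of the table.
removed_run = [(1, 2.5959), (12, 2.3007), (24, 2.2711), (36, 2.2622)]
# (step, validation loss) pairs copied from the added rows of the table.
added_run = [(1, 45.6835), (4, 44.7200), (8, 38.5010), (12, 32.2953)]

for label, run in (("removed", removed_run), ("added", added_run)):
    first_loss = run[0][1]
    last_loss = run[-1][1]
    drop = relative_drop(first_loss, last_loss)
    print(f"{label}: {first_loss} -> {last_loss} ({drop:.1%} lower)")
```
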
 ### Framework versions

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8c06a24f56b4e9d832307fa8e5f4e193bff9487f97556b67e442b1844f74995c
-size 184419648

 version https://git-lfs.github.com/spec/v1
+oid sha256:eb0b4cd4d9927ffd90e5c7c4fe44fcfb21156e57107514aba930818b3f037eb2
+size 6309118
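
The `adapter_model.bin` change only swaps a Git LFS pointer, so the size difference between the two adapters can be read from the `size` fields. The following is a minimal sketch, assuming each pointer line has the form key, space, value as shown above. `parse_lfs_pointer` is a hypothetical helper, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse 'key value' lines of an LFS pointer into a dict (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is given in bytes
    return fields


# Pointer contents copied from the removed and added sides of the diff.
old_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:8c06a24f56b4e9d832307fa8e5f4e193bff9487f97556b67e442b1844f74995c
size 184419648
"""
new_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:eb0b4cd4d9927ffd90e5c7c4fe44fcfb21156e57107514aba930818b3f037eb2
size 6309118
"""

old = parse_lfs_pointer(old_pointer)
new = parse_lfs_pointer(new_pointer)
ratio = old["size"] / new["size"]
print(f"{old['size']} -> {new['size']} bytes ({ratio:.1f}x smaller)")
```

The much smaller adapter file plausibly reflects the switch to the smaller `EleutherAI/pythia-70m-deduped` base model, although the diff itself does not state the LoRA rank or target modules.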