Commit 01b70ca by krmk90 · verified · 1 Parent(s): f988383

Model save

README.md CHANGED
@@ -36,11 +36,11 @@ More information needed
  
  The following hyperparameters were used during training:
  - learning_rate: 0.0002
- - train_batch_size: 3
+ - train_batch_size: 2
  - eval_batch_size: 8
  - seed: 42
- - gradient_accumulation_steps: 8
- - total_train_batch_size: 24
+ - gradient_accumulation_steps: 64
+ - total_train_batch_size: 128
  - optimizer: Use adamw_torch_fused with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: constant
  - lr_scheduler_warmup_ratio: 0.03
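
The three changed values in this hunk are linked: the reported `total_train_batch_size` equals the per-device batch size multiplied by the gradient accumulation steps and the number of devices. A minimal sketch of that arithmetic follows. It assumes a single training device, which is not stated in the diff and is only inferred from 3 × 8 = 24 and 2 × 64 = 128; `effective_batch_size` is a hypothetical helper name.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of samples that contribute to one optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values before and after this commit (num_devices=1 is an assumption).
old_total = effective_batch_size(3, 8)
new_total = effective_batch_size(2, 64)

print(old_total, new_total)   # 24 128
print(new_total / old_total)  # ratio of samples per optimizer step
```

With `learning_rate` and `lr_scheduler_type` unchanged, each optimizer step after this commit averages gradients over roughly 5.3 times as many samples as before, so one pass over the same data takes proportionally fewer optimizer steps.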
runs/Mar12_18-55-21_ip-10-246-124-97/events.out.tfevents.1741805731.ip-10-246-124-97.4081.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:c289b0a8210c7a31432089d3051a76ab4fbc1d98315048370756036c1e1d1977
- size 7097
+ oid sha256:1863d0b7bdf58ffe13813659aba8ce41ca22dd4786e45d551cd868acd4389788
+ size 7445
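
The TensorBoard event file is stored through Git LFS, so this diff shows only its pointer file: a spec version line, a SHA-256 object ID, and the size in bytes. A minimal sketch of parsing such a pointer and checking a downloaded blob against it is shown below. `parse_lfs_pointer` and `matches_pointer` are hypothetical helper names, not part of any Git LFS tooling.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """True if the blob has the size and SHA-256 digest named by the pointer."""
    return (len(data) == pointer["size"]
            and hashlib.sha256(data).hexdigest() == pointer["digest"])


new_pointer = parse_lfs_pointer("""\
version https://git-lfs.github.com/spec/v1
oid sha256:1863d0b7bdf58ffe13813659aba8ce41ca22dd4786e45d551cd868acd4389788
size 7445
""")

# The old pointer recorded 7097 bytes, so the event file grew by 348 bytes.
print(new_pointer["size"] - 7097)  # 348
```

The pointer changes on every commit that rewrites the event file, even when the logged metrics differ only slightly, because the object ID is a hash of the full file contents.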