hassen228 committed
Commit 728ae13 · verified · 1 parent: 713a72d

End of training

Files changed (5)
  1. README.md +19 -20
  2. config.json +11 -2
  3. model.safetensors +2 -2
  4. tokenizer.json +1 -1
  5. training_args.bin +1 -1
README.md CHANGED
@@ -18,10 +18,10 @@ should probably proofread and complete it, then remove this comment. -->
  
  This model is a fine-tuned version of [xlm-roberta-large](https://huggingface.co/xlm-roberta-large) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.6560
- - F1 Macro: 0.8840
- - F1 Weighted: 0.8867
- - Accuracy: 0.8867
+ - Loss: 0.8214
+ - Accuracy: 0.6439
+ - F1 Macro: 0.6444
+ - F1 Weighted: 0.6438
  
  ## Model description
  
@@ -40,10 +40,12 @@ More information needed
  ### Training hyperparameters
  
  The following hyperparameters were used during training:
- - learning_rate: 1e-05
+ - learning_rate: 2e-05
  - train_batch_size: 16
  - eval_batch_size: 16
  - seed: 42
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 32
  - optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  - lr_scheduler_type: linear
  - lr_scheduler_warmup_ratio: 0.1
@@ -52,23 +54,20 @@ The following hyperparameters were used during training:
  
  ### Training results
  
- | Training Loss | Epoch | Step | Validation Loss | F1 Macro | F1 Weighted | Accuracy |
- |:-------------:|:-----:|:----:|:---------------:|:--------:|:-----------:|:--------:|
- | 0.6484 | 1.0 | 164 | 0.3818 | 0.8347 | 0.8402 | 0.8430 |
- | 0.3531 | 2.0 | 328 | 0.5118 | 0.8365 | 0.8377 | 0.8369 |
- | 0.2805 | 3.0 | 492 | 0.5171 | 0.8408 | 0.8423 | 0.8415 |
- | 0.1972 | 4.0 | 656 | 0.6697 | 0.8630 | 0.8650 | 0.8643 |
- | 0.1422 | 5.0 | 820 | 0.5505 | 0.8781 | 0.8816 | 0.8826 |
- | 0.0984 | 6.0 | 984 | 0.6658 | 0.8853 | 0.8875 | 0.8872 |
- | 0.0607 | 7.0 | 1148 | 0.8728 | 0.8834 | 0.8858 | 0.8857 |
- | 0.0483 | 8.0 | 1312 | 0.9191 | 0.8704 | 0.8725 | 0.8720 |
- | 0.0266 | 9.0 | 1476 | 0.9519 | 0.8670 | 0.8694 | 0.8689 |
- | 0.0298 | 10.0 | 1640 | 0.9161 | 0.8694 | 0.8721 | 0.8720 |
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 Macro | F1 Weighted |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:-----------:|
+ | 1.1168 | 1.0 | 121 | 0.9948 | 0.5421 | 0.4476 | 0.4564 |
+ | 0.9225 | 2.0 | 242 | 0.8052 | 0.6410 | 0.6464 | 0.6474 |
+ | 0.8084 | 3.0 | 363 | 0.8085 | 0.6961 | 0.6760 | 0.6807 |
+ | 0.6813 | 4.0 | 484 | 0.7785 | 0.6670 | 0.6698 | 0.6676 |
+ | 0.5894 | 5.0 | 605 | 0.8427 | 0.6681 | 0.6723 | 0.6709 |
+ | 0.4791 | 6.0 | 726 | 0.9016 | 0.6722 | 0.6643 | 0.6663 |
+ | 0.4063 | 7.0 | 847 | 0.9857 | 0.6712 | 0.6715 | 0.6732 |
  
  
  ### Framework versions
  
- - Transformers 4.57.1
- - Pytorch 2.8.0+cu126
- - Datasets 4.0.0
+ - Transformers 4.57.3
+ - Pytorch 2.9.0+cu126
+ - Datasets 4.4.1
  - Tokenizers 0.22.1
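The updated hyperparameters map directly onto `transformers` `TrainingArguments`. Below is a minimal sketch of how this configuration might be expressed; the output directory and epoch count are assumptions (the new results table stops at epoch 7), since the commit does not include the actual training script.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="xlm-roberta-large-finetuned",  # hypothetical name
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    gradient_accumulation_steps=2,  # effective train batch size: 16 * 2 = 32
    seed=42,
    optim="adamw_torch_fused",      # OptimizerNames.ADAMW_TORCH_FUSED
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=7,             # assumption: the results table ends at epoch 7
)
```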
config.json CHANGED
@@ -10,8 +10,18 @@
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 1024,
+ "id2label": {
+   "0": "N",
+   "1": "NEU",
+   "2": "P"
+ },
  "initializer_range": 0.02,
  "intermediate_size": 4096,
+ "label2id": {
+   "N": 0,
+   "NEU": 1,
+   "P": 2
+ },
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "xlm-roberta",
@@ -20,8 +30,7 @@
  "output_past": true,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
- "problem_type": "single_label_classification",
- "transformers_version": "4.57.1",
+ "transformers_version": "4.57.3",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 250002
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:88acffeb20c81cecae797ad39b36fbc852d802a510d02e3d088d3279d24105f8
- size 2239618672
+ oid sha256:2f8b249837d9e9353ee4e276bd1299c53b528da37769f2e0c48f5b644d7f33b8
+ size 2239622772
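The entries above are Git LFS pointer files: the repo itself stores only the `oid sha256:...` digest and `size` of each large file. After fetching the real weights (e.g. via `git lfs pull` or `huggingface_hub`), the download can be checked against the pointer's hash. A minimal sketch, assuming the file sits in the current working directory:

```python
import hashlib

# sha256 from the new model.safetensors pointer above
expected = "2f8b249837d9e9353ee4e276bd1299c53b528da37769f2e0c48f5b644d7f33b8"

h = hashlib.sha256()
with open("model.safetensors", "rb") as f:
    # Hash in 1 MiB chunks to avoid loading the 2.2 GB file into memory.
    for chunk in iter(lambda: f.read(1 << 20), b""):
        h.update(chunk)

assert h.hexdigest() == expected, "checksum mismatch"
```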
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:93189c5d9a15db043017cfd920e00cf72fe9a4220bd74b460b635f6aa85a61a2
+ oid sha256:3c088c06cf975b7097e469bd69630cdb0d675c6db1ce3af1042b6e19c6d01f22
  size 17082999
training_args.bin CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:69c36a33a2bd1d4531d988e4fd71d43a45bc7ff721dfad4a5f2c3e34b3280a5e
+ oid sha256:407e0060e33a5fde8c99602a34a54143913cdf1a562e8f6b2dcbbb107fb536c1
  size 5841
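`training_args.bin` is the pickled `TrainingArguments` object that the `Trainer` saves alongside a run, which is why it changes whenever the hyperparameters do. A minimal sketch of inspecting it; note that unpickling executes code, so only load files you trust:

```python
import torch

# weights_only=False is required on PyTorch >= 2.6, where the default
# flipped to True and TrainingArguments is not an allowlisted class.
args = torch.load("training_args.bin", weights_only=False)
print(args.learning_rate, args.gradient_accumulation_steps)
```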