wjbmattingly committed
Commit 5667f20 · verified · 1 Parent(s): 8086dc2

Upload folder using huggingface_hub

README.md CHANGED
@@ -3,16 +3,16 @@ base_model:
  - LiquidAI/LFM2-VL-450M
  ---

- # final

  ## Model Description

  This model is a fine-tuned version of **LiquidAI/LFM2-VL-450M** using the brute-force-training package.

  - **Base Model**: LiquidAI/LFM2-VL-450M
- - **Training Status**: Complete
- - **Generated**: 2025-08-18 19:51:23
- - **Training Steps**: 10,000

  ## Training Details
 
@@ -24,19 +24,19 @@ This model is a fine-tuned version of **LiquidAI/LFM2-VL-450M** using the brute-
  ### Training Configuration
  - **Max Steps**: 10,000
  - **Batch Size**: 2
- - **Learning Rate**: 5e-06
- - **Gradient Accumulation**: 2 steps
- - **Evaluation Frequency**: Every 1,000 steps

  ### Current Performance
- - **Training Loss**: 0.918973
- - **Evaluation Loss**: 3.461486

  ## Pre-Training Evaluation

  **Initial Model Performance (before training):**
- - **Loss**: 6.255835
- - **Perplexity**: 521.04
  - **Character Accuracy**: 27.7%
  - **Word Accuracy**: 11.6%
 
@@ -45,33 +45,37 @@ This model is a fine-tuned version of **LiquidAI/LFM2-VL-450M** using the brute-
  ### All Checkpoint Evaluations
  | Step | Checkpoint Type | Loss | Perplexity | Char Acc | Word Acc | Improvement vs Pre |
  |------|----------------|------|------------|----------|----------|--------------------|
- | Pre | pre_training | 6.2558 | 521.04 | 27.7% | 11.6% | +0.0% |
- | 1,000 | checkpoint | 4.4533 | 85.91 | 33.8% | 16.1% | +28.8% |
- | 2,000 | checkpoint | 4.1024 | 60.49 | 32.5% | 15.1% | +34.4% |
- | 3,000 | checkpoint | 3.9043 | 49.62 | 33.2% | 16.9% | +37.6% |
- | 4,000 | checkpoint | 3.7561 | 42.78 | 30.9% | 14.2% | +40.0% |
- | 5,000 | checkpoint | 3.6675 | 39.15 | 33.5% | 17.0% | +41.4% |
- | 6,000 | checkpoint | 3.6180 | 37.26 | 31.8% | 15.1% | +42.2% |
- | 7,000 | checkpoint | 3.5651 | 35.34 | 32.2% | 15.6% | +43.0% |
- | 8,000 | checkpoint | 3.5113 | 33.49 | 30.6% | 14.2% | +43.9% |
- | 9,000 | checkpoint | 3.4908 | 32.81 | 33.4% | 16.9% | +44.2% |
- | 10,000 | final | 3.4615 | 31.86 | 32.5% | 16.5% | +44.7% |

  ## Training Progress

  ### Recent Training Steps (Loss Only)
  | Step | Training Loss | Timestamp |
  |------|---------------|-----------|
- | 9,991 | 0.764521 | 2025-08-18T19:50 |
- | 9,992 | 0.948460 | 2025-08-18T19:50 |
- | 9,993 | 0.758166 | 2025-08-18T19:50 |
- | 9,994 | 0.898506 | 2025-08-18T19:50 |
- | 9,995 | 0.784889 | 2025-08-18T19:50 |
- | 9,996 | 0.786168 | 2025-08-18T19:50 |
- | 9,997 | 0.674831 | 2025-08-18T19:50 |
- | 9,998 | 0.950868 | 2025-08-18T19:50 |
- | 9,999 | 0.960045 | 2025-08-18T19:50 |
- | 10,000 | 0.918973 | 2025-08-18T19:50 |

  ## Training Visualizations
 
@@ -95,8 +99,8 @@ This model is a fine-tuned version of **LiquidAI/LFM2-VL-450M** using the brute-
  from transformers import AutoModelForCausalLM, AutoTokenizer
  # For vision-language models, use appropriate imports

- model = AutoModelForCausalLM.from_pretrained("./final")
- tokenizer = AutoTokenizer.from_pretrained("./final")

  # Your inference code here
  ```
@@ -108,9 +112,9 @@ tokenizer = AutoTokenizer.from_pretrained("./final")
  "dataset_name": "CATMuS/medieval",
  "model_name": "LiquidAI/LFM2-VL-450M",
  "max_steps": 10000,
- "eval_steps": 1000,
- "num_accumulation_steps": 2,
- "learning_rate": 5e-06,
  "train_batch_size": 2,
  "val_batch_size": 2,
  "train_select_start": 0,
@@ -136,4 +140,4 @@ tokenizer = AutoTokenizer.from_pretrained("./final")

  ---

- *This model card was automatically generated by brute-force-training on 2025-08-18 19:51:23*
 
  - LiquidAI/LFM2-VL-450M
  ---

+ # model_step_7000

  ## Model Description

  This model is a fine-tuned version of **LiquidAI/LFM2-VL-450M** using the brute-force-training package.

  - **Base Model**: LiquidAI/LFM2-VL-450M
+ - **Training Status**: 🔄 In Progress
+ - **Generated**: 2025-08-18 20:39:32
+ - **Training Steps**: 7,000

  ## Training Details
 
  ### Training Configuration
  - **Max Steps**: 10,000
  - **Batch Size**: 2
+ - **Learning Rate**: 1e-05
+ - **Gradient Accumulation**: 1 steps
+ - **Evaluation Frequency**: Every 500 steps

  ### Current Performance
+ - **Training Loss**: 0.619257
+ - **Evaluation Loss**: 0.722366

  ## Pre-Training Evaluation

  **Initial Model Performance (before training):**
+ - **Loss**: 1.297430
+ - **Perplexity**: 3.66
  - **Character Accuracy**: 27.7%
  - **Word Accuracy**: 11.6%
 
  ### All Checkpoint Evaluations
  | Step | Checkpoint Type | Loss | Perplexity | Char Acc | Word Acc | Improvement vs Pre |
  |------|----------------|------|------------|----------|----------|--------------------|
+ | Pre | pre_training | 1.2974 | 3.66 | 27.7% | 11.6% | +0.0% |
+ | 500 | checkpoint | 0.9454 | 2.57 | 39.4% | 19.9% | +27.1% |
+ | 1,000 | checkpoint | 0.8644 | 2.37 | 38.7% | 19.1% | +33.4% |
+ | 1,500 | checkpoint | 0.8402 | 2.32 | 38.4% | 18.9% | +35.2% |
+ | 2,000 | checkpoint | 0.8139 | 2.26 | 37.9% | 19.8% | +37.3% |
+ | 2,500 | checkpoint | 0.7890 | 2.20 | 38.5% | 18.9% | +39.2% |
+ | 3,000 | checkpoint | 0.7793 | 2.18 | 39.3% | 19.5% | +39.9% |
+ | 3,500 | checkpoint | 0.7639 | 2.15 | 42.7% | 21.4% | +41.1% |
+ | 4,000 | checkpoint | 0.7483 | 2.11 | 41.2% | 20.4% | +42.3% |
+ | 4,500 | checkpoint | 0.7466 | 2.11 | 37.3% | 18.8% | +42.5% |
+ | 5,000 | checkpoint | 0.7358 | 2.09 | 40.4% | 20.5% | +43.3% |
+ | 5,500 | checkpoint | 0.7321 | 2.08 | 38.1% | 18.9% | +43.6% |
+ | 6,000 | checkpoint | 0.7276 | 2.07 | 38.8% | 17.6% | +43.9% |
+ | 6,500 | checkpoint | 0.7190 | 2.05 | 41.5% | 18.9% | +44.6% |
+ | 7,000 | checkpoint | 0.7224 | 2.06 | 41.6% | 18.7% | +44.3% |
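The Perplexity and "Improvement vs Pre" columns in both versions of this table are consistent with being derived directly from the Loss column: perplexity as exp(loss) and improvement as the relative loss reduction against the pre-training row. A quick check against the step-7,000 row (this relationship is inferred from the numbers, not documented by brute-force-training):

```python
import math

# Inferred relationships (not documented by brute-force-training):
#   perplexity  = exp(loss)
#   improvement = (pre_training_loss - loss) / pre_training_loss
pre_loss = 1.2974      # "Pre" row of the table above
loss_7000 = 0.7224     # step 7,000 row

perplexity = math.exp(loss_7000)                  # ~2.06, matches the table
improvement = (pre_loss - loss_7000) / pre_loss   # ~0.443 -> +44.3%
print(f"perplexity={perplexity:.2f}  improvement=+{improvement:.1%}")
```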
 
  ## Training Progress

  ### Recent Training Steps (Loss Only)
  | Step | Training Loss | Timestamp |
  |------|---------------|-----------|
+ | 6,991 | 0.846698 | 2025-08-18T20:39 |
+ | 6,992 | 0.538150 | 2025-08-18T20:39 |
+ | 6,993 | 0.721188 | 2025-08-18T20:39 |
+ | 6,994 | 0.819544 | 2025-08-18T20:39 |
+ | 6,995 | 0.925656 | 2025-08-18T20:39 |
+ | 6,996 | 0.724563 | 2025-08-18T20:39 |
+ | 6,997 | 0.738329 | 2025-08-18T20:39 |
+ | 6,998 | 0.658910 | 2025-08-18T20:39 |
+ | 6,999 | 0.439738 | 2025-08-18T20:39 |
+ | 7,000 | 0.619257 | 2025-08-18T20:39 |
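Per-step training losses are noisy (0.44 to 0.93 across the ten steps above), so individual values are best read as samples rather than a trend. Averaging the ten listed steps gives roughly 0.70, which tracks the 0.7224 evaluation loss more closely than any single step; a small check using the values copied from the table:

```python
# Losses copied from the "Recent Training Steps" table above (steps 6,991-7,000).
recent_losses = [
    0.846698, 0.538150, 0.721188, 0.819544, 0.925656,
    0.724563, 0.738329, 0.658910, 0.439738, 0.619257,
]
mean_loss = sum(recent_losses) / len(recent_losses)
print(f"mean of last 10 steps: {mean_loss:.4f}")  # ~0.7032
```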
 
  ## Training Visualizations
 
 
  from transformers import AutoModelForCausalLM, AutoTokenizer
  # For vision-language models, use appropriate imports

+ model = AutoModelForCausalLM.from_pretrained("./model_step_7000")
+ tokenizer = AutoTokenizer.from_pretrained("./model_step_7000")

  # Your inference code here
  ```
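The usage snippet above comes from the auto-generated card and keeps the text-only AutoModelForCausalLM/AutoTokenizer imports; since LFM2-VL-450M is a vision-language model, those classes will not accept images. A minimal transcription sketch, assuming the checkpoint loads through AutoProcessor and AutoModelForImageTextToText and reusing the prompt from the training config ("Transcribe this medieval manuscript line."); class names and the image path are assumptions, not verified against this checkpoint:

```python
from PIL import Image
from transformers import AutoProcessor, AutoModelForImageTextToText

model_dir = "./model_step_7000"
processor = AutoProcessor.from_pretrained(model_dir)
model = AutoModelForImageTextToText.from_pretrained(model_dir)

# One cropped manuscript line (placeholder path).
image = Image.open("line.png")
messages = [
    {"role": "user", "content": [
        {"type": "image"},
        {"type": "text", "text": "Transcribe this medieval manuscript line."},
    ]}
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(images=[image], text=prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=128)
# Decoded text includes the prompt; the transcription follows it.
print(processor.batch_decode(output, skip_special_tokens=True)[0])
```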
 
  "dataset_name": "CATMuS/medieval",
  "model_name": "LiquidAI/LFM2-VL-450M",
  "max_steps": 10000,
+ "eval_steps": 500,
+ "num_accumulation_steps": 1,
+ "learning_rate": 1e-05,
  "train_batch_size": 2,
  "val_batch_size": 2,
  "train_select_start": 0,
 

  ---

+ *This model card was automatically generated by brute-force-training on 2025-08-18 20:39:32*
evaluation_comparison.png CHANGED

Git LFS Details (old)

  • SHA256: b3328db6b598bc3ef5ccff86e00b1281d3aa556b8e0121c6871bdd7d6320a148
  • Pointer size: 131 Bytes
  • Size of remote file: 472 kB

Git LFS Details (new)

  • SHA256: fe603c2e98f3932cc919781c80cf5602539065c13dc5b570e5ef68365d8a409a
  • Pointer size: 131 Bytes
  • Size of remote file: 512 kB
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:7d1c7bf9a67a65d0305165d7d04b3a5275ece0b8040925556407941f0b971da8
  size 901692416

  version https://git-lfs.github.com/spec/v1
+ oid sha256:4d05714a3806120122abbeed7fd3c4930fe07972e5ed14a4ce6bf7554cf1b5ca
  size 901692416
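model.safetensors is stored through Git LFS, so the diff above only swaps the pointer file: the oid is the SHA-256 of the actual weights (unchanged in size at 901,692,416 bytes). A sketch for verifying a downloaded copy against the new pointer, assuming the full file is present locally:

```python
import hashlib

# SHA-256 from the new LFS pointer above.
expected = "4d05714a3806120122abbeed7fd3c4930fe07972e5ed14a4ce6bf7554cf1b5ca"

h = hashlib.sha256()
with open("model.safetensors", "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):  # read in 1 MiB chunks
        h.update(chunk)

print(h.hexdigest() == expected)  # True if the download matches the pointer
```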
model_card_metadata.json CHANGED
@@ -1,16 +1,16 @@
  {
  "base_model": "LiquidAI/LFM2-VL-450M",
  "training_framework": "brute-force-training",
- "training_date": "2025-08-18T19:51:23.156386",
- "training_steps": 10000,
  "dataset": "CATMuS/medieval",
  "training_config": {
  "dataset_name": "CATMuS/medieval",
  "model_name": "LiquidAI/LFM2-VL-450M",
  "max_steps": 10000,
- "eval_steps": 1000,
- "num_accumulation_steps": 2,
- "learning_rate": 5e-06,
  "train_batch_size": 2,
  "val_batch_size": 2,
  "train_select_start": 0,
@@ -24,8 +24,8 @@
  "user_text": "Transcribe this medieval manuscript line.",
  "max_image_size": 200
  },
- "final_training_loss": 0.9189727306365967,
- "final_evaluation_loss": 3.461486041545868,
- "final_char_accuracy": 0.3254617025073297,
- "final_word_accuracy": 0.16456672762721147
  }
 
  {
  "base_model": "LiquidAI/LFM2-VL-450M",
  "training_framework": "brute-force-training",
+ "training_date": "2025-08-18T20:39:32.801669",
+ "training_steps": 7000,
  "dataset": "CATMuS/medieval",
  "training_config": {
  "dataset_name": "CATMuS/medieval",
  "model_name": "LiquidAI/LFM2-VL-450M",
  "max_steps": 10000,
+ "eval_steps": 500,
+ "num_accumulation_steps": 1,
+ "learning_rate": 1e-05,
  "train_batch_size": 2,
  "val_batch_size": 2,
  "train_select_start": 0,

  "user_text": "Transcribe this medieval manuscript line.",
  "max_image_size": 200
  },
+ "final_training_loss": 0.6192573308944702,
+ "final_evaluation_loss": 0.7223658800125122,
+ "final_char_accuracy": 0.41553718527220956,
+ "final_word_accuracy": 0.18711077811077811
  }
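The keys updated in this file hold the raw values behind the rounded figures in the README: 0.41553… surfaces as 41.6% character accuracy and 0.18711… as 18.7% word accuracy at step 7,000. A minimal readback, assuming the file is valid JSON as rendered here:

```python
import json

# Read the generated metadata and reproduce the headline numbers from the card.
with open("model_card_metadata.json") as f:
    meta = json.load(f)

print(meta["training_steps"])                    # 7000
print(round(meta["final_evaluation_loss"], 4))   # 0.7224
print(f"{meta['final_char_accuracy']:.1%}")      # 41.6%
print(f"{meta['final_word_accuracy']:.1%}")      # 18.7%
```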
training_curves.png CHANGED

Git LFS Details (old)

  • SHA256: 1d301c3edc0615f896061d9b927c4988efda607eb1418c8b9538aba173501aaf
  • Pointer size: 131 Bytes
  • Size of remote file: 447 kB

Git LFS Details (new)

  • SHA256: a910353bf459e6ed0637d3a450b89190af5fea7d249af026692d4c55215098d2
  • Pointer size: 131 Bytes
  • Size of remote file: 511 kB
training_metrics.json CHANGED
The diff for this file is too large to render. See raw diff