model_step_13000

Model Description

This model is a fine-tuned version of LiquidAI/LFM2-VL-450M using the brute-force-training package.

  • Base Model: LiquidAI/LFM2-VL-450M
  • Training Status: 🔄 In Progress
  • Generated: 2025-08-19 10:41:14
  • Training Steps: 13,000

Training Details

Dataset

  • Dataset: johnlockejrr/yiddish_synth_v2
  • Training Examples: 100,000
  • Validation Examples: 4,999

Training Configuration

  • Max Steps: 100,000
  • Batch Size: 15
  • Learning Rate: 7e-05
  • Gradient Accumulation: 1 step
  • Evaluation Frequency: Every 1,000 steps
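With no gradient accumulation, the effective batch size equals the per-device batch size, so by step 13,000 the run has made roughly two passes over the training set. A quick sanity check (a sketch using the step and batch values from this card):

```python
# Values from the training configuration above.
batch_size = 15
grad_accum = 1
steps_so_far = 13_000
train_examples = 100_000

effective_batch = batch_size * grad_accum       # examples consumed per step
examples_seen = effective_batch * steps_so_far  # total examples processed
epochs = examples_seen / train_examples         # approximate epochs completed

print(effective_batch, examples_seen, round(epochs, 2))  # 15 195000 1.95
```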

Current Performance

  • Training Loss: 0.124526
  • Evaluation Loss: 0.189137

Pre-Training Evaluation

Initial Model Performance (before training):

  • Loss: 2.626098
  • Perplexity: 13.82
  • Character Accuracy: 31.1%
  • Word Accuracy: 12.9%
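Perplexity here is simply the exponential of the cross-entropy loss, which is why a loss of 2.626098 corresponds to a perplexity of about 13.82. A minimal check:

```python
import math

pre_training_loss = 2.626098           # from the metrics above
perplexity = math.exp(pre_training_loss)
print(round(perplexity, 2))            # 13.82
```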

Evaluation History

All Checkpoint Evaluations

| Step | Checkpoint Type | Loss | Perplexity | Char Acc | Word Acc | Loss Improvement vs Pre |
|---|---|---|---|---|---|---|
| Pre | pre_training | 2.6261 | 13.82 | 31.1% | 12.9% | +0.0% |
| 1,000 | checkpoint | 0.9395 | 2.56 | 20.1% | 4.1% | +64.2% |
| 2,000 | checkpoint | 0.8058 | 2.24 | 21.2% | 4.0% | +69.3% |
| 3,000 | checkpoint | 0.7305 | 2.08 | 23.0% | 6.1% | +72.2% |
| 4,000 | checkpoint | 0.6669 | 1.95 | 20.6% | 3.4% | +74.6% |
| 5,000 | checkpoint | 0.5341 | 1.71 | 21.4% | 3.6% | +79.7% |
| 6,000 | checkpoint | 0.4656 | 1.59 | 20.9% | 3.8% | +82.3% |
| 7,000 | checkpoint | 0.3917 | 1.48 | 21.4% | 3.5% | +85.1% |
| 8,000 | checkpoint | 0.3310 | 1.39 | 21.6% | 4.8% | +87.4% |
| 9,000 | checkpoint | 0.2892 | 1.34 | 20.7% | 4.0% | +89.0% |
| 10,000 | checkpoint | 0.2566 | 1.29 | 20.9% | 4.7% | +90.2% |
| 11,000 | checkpoint | 0.2199 | 1.25 | 20.2% | 4.9% | +91.6% |
| 12,000 | checkpoint | 0.2033 | 1.23 | 20.3% | 3.2% | +92.3% |
| 13,000 | checkpoint | 0.1891 | 1.21 | 19.4% | 3.4% | +92.8% |
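The improvement column appears to track the relative reduction in evaluation loss versus the pre-training baseline (it keeps rising even as character and word accuracy stay flat or decline). A sketch of that computation, assuming this interpretation:

```python
pre_loss = 2.6261          # pre-training evaluation loss
step_13000_loss = 0.1891   # latest checkpoint evaluation loss

# Relative loss reduction versus the pre-training baseline, in percent.
improvement = (pre_loss - step_13000_loss) / pre_loss * 100
print(f"+{improvement:.1f}%")   # +92.8%
```

Note that a shrinking loss with flat accuracy usually means the model is becoming better calibrated on the evaluation distribution even where exact-match transcription accuracy has not yet caught up.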

Training Progress

Recent Training Steps (Loss Only)

| Step | Training Loss | Timestamp |
|---|---|---|
| 12,991 | 0.154684 | 2025-08-19T10:40 |
| 12,992 | 0.183019 | 2025-08-19T10:40 |
| 12,993 | 0.157314 | 2025-08-19T10:40 |
| 12,994 | 0.168899 | 2025-08-19T10:40 |
| 12,995 | 0.116096 | 2025-08-19T10:40 |
| 12,996 | 0.122316 | 2025-08-19T10:40 |
| 12,997 | 0.149480 | 2025-08-19T10:40 |
| 12,998 | 0.166267 | 2025-08-19T10:40 |
| 12,999 | 0.152927 | 2025-08-19T10:40 |
| 13,000 | 0.124526 | 2025-08-19T10:40 |
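Per-step losses are noisy, so a short moving average gives a better picture of where training sits; averaging the ten steps shown above lands around 0.150. A sketch:

```python
# The ten most recent per-step training losses from the table above.
recent_losses = [
    0.154684, 0.183019, 0.157314, 0.168899, 0.116096,
    0.122316, 0.149480, 0.166267, 0.152927, 0.124526,
]
mean_loss = sum(recent_losses) / len(recent_losses)
print(round(mean_loss, 4))   # 0.1496
```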

Training Visualizations

Training Progress and Evaluation Metrics

Training Curves

This chart shows the training loss progression, character accuracy, word accuracy, and perplexity over time. Red dots indicate evaluation checkpoints.

Evaluation Comparison Across All Checkpoints

Evaluation Comparison

Comprehensive comparison of all evaluation metrics across training checkpoints. Red=Pre-training, Blue=Checkpoints, Green=Final.

Available Visualization Files:

  • training_curves.png - 4-panel view: Training loss with eval points, Character accuracy, Word accuracy, Perplexity
  • evaluation_comparison.png - 4-panel comparison: Loss, Character accuracy, Word accuracy, Perplexity across all checkpoints

Usage

```python
from transformers import AutoModelForImageTextToText, AutoProcessor

# LFM2-VL is a vision-language model, so load it with the
# image-text-to-text classes rather than AutoModelForCausalLM/AutoTokenizer.
model = AutoModelForImageTextToText.from_pretrained("./model_step_13000")
processor = AutoProcessor.from_pretrained("./model_step_13000")

# Your inference code here
```
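The training configuration below uses a fixed transcription prompt; at inference time the same chat-style payload can be handed to the processor. A sketch of constructing that message structure (the prompt string is copied from the config; `build_transcription_messages` is a hypothetical helper, and the exact chat format LFM2-VL expects should be checked against the base model's card):

```python
# Hypothetical helper: builds the chat-style message list that
# processors with a chat template typically consume.
def build_transcription_messages(user_text: str):
    return [
        {
            "role": "user",
            "content": [
                {"type": "image"},                    # manuscript image slot
                {"type": "text", "text": user_text},  # transcription prompt
            ],
        }
    ]

prompt = (
    "Please transcribe all the Yiddish text you see in this historical "
    "manuscript image. Provide only the transcribed text without any "
    "additional commentary or description."
)
messages = build_transcription_messages(prompt)
```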

Training Configuration

{
  "dataset_name": "johnlockejrr/yiddish_synth_v2",
  "model_name": "LiquidAI/LFM2-VL-450M",
  "max_steps": 100000,
  "eval_steps": 1000,
  "num_accumulation_steps": 1,
  "learning_rate": 7e-05,
  "train_batch_size": 15,
  "val_batch_size": 1,
  "train_select_start": 0,
  "train_select_end": 100000,
  "val_select_start": 100001,
  "val_select_end": 105000,
  "train_field": "train",
  "val_field": "train",
  "image_column": "image",
  "text_column": "text",
  "user_text": "Please transcribe all the Yiddish text you see in this historical manuscript image. Provide only the transcribed text without any additional commentary or description.",
  "max_image_size": 250
}
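The split bounds above also explain the slightly odd validation count: with half-open index ranges, 100001–105000 yields 4,999 examples, matching the "Validation Examples" figure. A quick check, assuming `datasets`-style `select(range(start, end))` slicing:

```python
train_start, train_end = 0, 100_000
val_start, val_end = 100_001, 105_000

# Half-open ranges, as datasets.Dataset.select(range(start, end)) would use.
n_train = train_end - train_start   # 100000
n_val = val_end - val_start         # 4999
print(n_train, n_val)
```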

Model Card Metadata

  • Base Model: LiquidAI/LFM2-VL-450M
  • Training Framework: brute-force-training
  • Training Type: Fine-tuning
  • License: Inherited from base model
  • Language: Inherited from base model

This model card was automatically generated by brute-force-training on 2025-08-19 10:41:14

Model Files

  • Repository: wjbmattingly/lfm2-vl-450M-yiddish
  • Format: Safetensors
  • Model size: 0.5B params
  • Tensor type: BF16