model_step_13000

Model Description

This model is a fine-tuned version of LiquidAI/LFM2-VL-450M using the brute-force-training package.

  • Base Model: LiquidAI/LFM2-VL-450M
  • Training Status: 🔄 In Progress
  • Generated: 2025-08-19 10:41:14
  • Training Steps: 13,000

Training Details

Dataset

  • Dataset: johnlockejrr/yiddish_synth_v2
  • Training Examples: 100,000
  • Validation Examples: 4,999

Training Configuration

  • Max Steps: 100,000
  • Batch Size: 15
  • Learning Rate: 7e-05
  • Gradient Accumulation: 1 step
  • Evaluation Frequency: Every 1,000 steps
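With no gradient accumulation, the effective batch size equals the per-device batch size, so by step 13,000 the run has made roughly two passes over the training set. A quick sanity check (a sketch using the step and batch values from this card):

```python
# Values from the training configuration above.
batch_size = 15
grad_accum = 1
steps_so_far = 13_000
train_examples = 100_000

effective_batch = batch_size * grad_accum       # examples consumed per step
examples_seen = effective_batch * steps_so_far  # total examples processed
epochs = examples_seen / train_examples         # approximate epochs completed

print(effective_batch, examples_seen, round(epochs, 2))  # 15 195000 1.95
```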

Current Performance

  • Training Loss: 0.124526
  • Evaluation Loss: 0.189137

Pre-Training Evaluation

Initial Model Performance (before training):

  • Loss: 2.626098
  • Perplexity: 13.82
  • Character Accuracy: 31.1%
  • Word Accuracy: 12.9%
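Perplexity here is simply the exponential of the cross-entropy loss, which is why a loss of 2.626098 corresponds to a perplexity of about 13.82. A minimal check:

```python
import math

pre_training_loss = 2.626098           # from the metrics above
perplexity = math.exp(pre_training_loss)
print(round(perplexity, 2))            # 13.82
```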

Evaluation History

All Checkpoint Evaluations

| Step | Checkpoint Type | Loss | Perplexity | Char Acc | Word Acc | Loss Improvement vs Pre |
|---|---|---|---|---|---|---|
| Pre | pre_training | 2.6261 | 13.82 | 31.1% | 12.9% | +0.0% |
| 1,000 | checkpoint | 0.9395 | 2.56 | 20.1% | 4.1% | +64.2% |
| 2,000 | checkpoint | 0.8058 | 2.24 | 21.2% | 4.0% | +69.3% |
| 3,000 | checkpoint | 0.7305 | 2.08 | 23.0% | 6.1% | +72.2% |
| 4,000 | checkpoint | 0.6669 | 1.95 | 20.6% | 3.4% | +74.6% |
| 5,000 | checkpoint | 0.5341 | 1.71 | 21.4% | 3.6% | +79.7% |
| 6,000 | checkpoint | 0.4656 | 1.59 | 20.9% | 3.8% | +82.3% |
| 7,000 | checkpoint | 0.3917 | 1.48 | 21.4% | 3.5% | +85.1% |
| 8,000 | checkpoint | 0.3310 | 1.39 | 21.6% | 4.8% | +87.4% |
| 9,000 | checkpoint | 0.2892 | 1.34 | 20.7% | 4.0% | +89.0% |
| 10,000 | checkpoint | 0.2566 | 1.29 | 20.9% | 4.7% | +90.2% |
| 11,000 | checkpoint | 0.2199 | 1.25 | 20.2% | 4.9% | +91.6% |
| 12,000 | checkpoint | 0.2033 | 1.23 | 20.3% | 3.2% | +92.3% |
| 13,000 | checkpoint | 0.1891 | 1.21 | 19.4% | 3.4% | +92.8% |
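The improvement column appears to track the relative reduction in evaluation loss versus the pre-training baseline (it keeps rising even as character and word accuracy stay flat or decline). A sketch of that computation, assuming this interpretation:

```python
pre_loss = 2.6261          # pre-training evaluation loss
step_13000_loss = 0.1891   # latest checkpoint evaluation loss

# Relative loss reduction versus the pre-training baseline, in percent.
improvement = (pre_loss - step_13000_loss) / pre_loss * 100
print(f"+{improvement:.1f}%")   # +92.8%
```

Note that a shrinking loss with flat accuracy usually means the model is becoming better calibrated on the evaluation distribution even where exact-match transcription accuracy has not yet caught up.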

Training Progress

Recent Training Steps (Loss Only)

| Step | Training Loss | Timestamp |
|---|---|---|
| 12,991 | 0.154684 | 2025-08-19T10:40 |
| 12,992 | 0.183019 | 2025-08-19T10:40 |
| 12,993 | 0.157314 | 2025-08-19T10:40 |
| 12,994 | 0.168899 | 2025-08-19T10:40 |
| 12,995 | 0.116096 | 2025-08-19T10:40 |
| 12,996 | 0.122316 | 2025-08-19T10:40 |
| 12,997 | 0.149480 | 2025-08-19T10:40 |
| 12,998 | 0.166267 | 2025-08-19T10:40 |
| 12,999 | 0.152927 | 2025-08-19T10:40 |
| 13,000 | 0.124526 | 2025-08-19T10:40 |
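Per-step losses are noisy, so a short moving average gives a better picture of where training sits; averaging the ten steps shown above lands around 0.150. A sketch:

```python
# The ten most recent per-step training losses from the table above.
recent_losses = [
    0.154684, 0.183019, 0.157314, 0.168899, 0.116096,
    0.122316, 0.149480, 0.166267, 0.152927, 0.124526,
]
mean_loss = sum(recent_losses) / len(recent_losses)
print(round(mean_loss, 4))   # 0.1496
```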

Training Visualizations

Training Progress and Evaluation Metrics

Training Curves

This chart shows the training loss progression, character accuracy, word accuracy, and perplexity over time. Red dots indicate evaluation checkpoints.

Evaluation Comparison Across All Checkpoints

Evaluation Comparison

Comprehensive comparison of all evaluation metrics across training checkpoints. Red=Pre-training, Blue=Checkpoints, Green=Final.

Available Visualization Files:

  • training_curves.png - 4-panel view: Training loss with eval points, Character accuracy, Word accuracy, Perplexity
  • evaluation_comparison.png - 4-panel comparison: Loss, Character accuracy, Word accuracy, Perplexity across all checkpoints

Usage

```python
from transformers import AutoModelForImageTextToText, AutoProcessor

# LFM2-VL is a vision-language model, so load it with the
# image-text-to-text classes rather than AutoModelForCausalLM/AutoTokenizer.
model = AutoModelForImageTextToText.from_pretrained("./model_step_13000")
processor = AutoProcessor.from_pretrained("./model_step_13000")

# Your inference code here
```
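The training configuration below uses a fixed transcription prompt; at inference time the same chat-style payload can be handed to the processor. A sketch of constructing that message structure (the prompt string is copied from the config; `build_transcription_messages` is a hypothetical helper, and the exact chat format LFM2-VL expects should be checked against the base model's card):

```python
# Hypothetical helper: builds the chat-style message list that
# processors with a chat template typically consume.
def build_transcription_messages(user_text: str):
    return [
        {
            "role": "user",
            "content": [
                {"type": "image"},                    # manuscript image slot
                {"type": "text", "text": user_text},  # transcription prompt
            ],
        }
    ]

prompt = (
    "Please transcribe all the Yiddish text you see in this historical "
    "manuscript image. Provide only the transcribed text without any "
    "additional commentary or description."
)
messages = build_transcription_messages(prompt)
```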

Training Configuration

{
  "dataset_name": "johnlockejrr/yiddish_synth_v2",
  "model_name": "LiquidAI/LFM2-VL-450M",
  "max_steps": 100000,
  "eval_steps": 1000,
  "num_accumulation_steps": 1,
  "learning_rate": 7e-05,
  "train_batch_size": 15,
  "val_batch_size": 1,
  "train_select_start": 0,
  "train_select_end": 100000,
  "val_select_start": 100001,
  "val_select_end": 105000,
  "train_field": "train",
  "val_field": "train",
  "image_column": "image",
  "text_column": "text",
  "user_text": "Please transcribe all the Yiddish text you see in this historical manuscript image. Provide only the transcribed text without any additional commentary or description.",
  "max_image_size": 250
}
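The split bounds above also explain the slightly odd validation count: with half-open index ranges, 100001–105000 yields 4,999 examples, matching the "Validation Examples" figure. A quick check, assuming `datasets`-style `select(range(start, end))` slicing:

```python
train_start, train_end = 0, 100_000
val_start, val_end = 100_001, 105_000

# Half-open ranges, as datasets.Dataset.select(range(start, end)) would use.
n_train = train_end - train_start   # 100000
n_val = val_end - val_start         # 4999
print(n_train, n_val)
```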

Model Card Metadata

  • Base Model: LiquidAI/LFM2-VL-450M
  • Training Framework: brute-force-training
  • Training Type: Fine-tuning
  • License: Inherited from base model
  • Language: Inherited from base model

This model card was automatically generated by brute-force-training on 2025-08-19 10:41:14

Model Files

  • Repository: wjbmattingly/lfm2-vl-450M-yiddish
  • Format: Safetensors
  • Model size: 0.5B params
  • Tensor type: BF16