DeepSeek-OCR Dhivehi

This model is a fine-tuned version of unsloth/DeepSeek-OCR, trained for Dhivehi multi-line sentence recognition. It was fine-tuned on 20,000 samples from the alakxender/dhivehi-vrd-images dataset.

Base model: unsloth/DeepSeek-OCR
More info on the model: deepseek-ai/DeepSeek-OCR
Dataset: alakxender/vrd-images-224x224
Samples used: 20k multi-line Dhivehi sentences

Usage

Inference uses Hugging Face transformers on NVIDIA GPUs. Requirements were tested on Python 3.12.9 with CUDA 11.8:

from transformers import AutoModel, AutoTokenizer
import torch
import os
os.environ["CUDA_VISIBLE_DEVICES"] = '0'
model_name = 'alakxender/deepseek-ocr-3b-vrd-dhivehi-20k-ml'

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModel.from_pretrained(model_name, _attn_implementation='flash_attention_2', trust_remote_code=True, use_safetensors=True)
model = model.eval().cuda().to(torch.bfloat16)

# Free-form OCR prompt; "<image>" is the placeholder for the input image.
prompt = "<image>\nFree OCR. "
image_file = 'sl.png'
output_path = 'your/output/dir'

# infer(self, tokenizer, prompt='', image_file='', output_path = ' ', base_size = 1024, image_size = 640, crop_mode = True, test_compress = False, save_results = False):

# Tiny: base_size = 512, image_size = 512, crop_mode = False
# Small: base_size = 640, image_size = 640, crop_mode = False
# Base: base_size = 1024, image_size = 1024, crop_mode = False
# Large: base_size = 1280, image_size = 1280, crop_mode = False

# Gundam: base_size = 1024, image_size = 640, crop_mode = True

res = model.infer(tokenizer, prompt=prompt, image_file=image_file, output_path = output_path, base_size = 1024, image_size = 640, crop_mode=True, save_results = True, test_compress = True)
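
The resolution modes listed in the comments above can be kept in one lookup table so that a mode is selected by name rather than by retyping three values. The following is a minimal sketch, not part of the DeepSeek-OCR API: the names `MODE_PRESETS` and `infer_kwargs` are hypothetical helpers introduced here for illustration, and the values are copied from the mode comments in the usage code.

```python
# Hypothetical helper (not part of DeepSeek-OCR): maps the mode names from
# the usage comments to the keyword arguments accepted by model.infer().
MODE_PRESETS = {
    "tiny":   {"base_size": 512,  "image_size": 512,  "crop_mode": False},
    "small":  {"base_size": 640,  "image_size": 640,  "crop_mode": False},
    "base":   {"base_size": 1024, "image_size": 1024, "crop_mode": False},
    "large":  {"base_size": 1280, "image_size": 1280, "crop_mode": False},
    "gundam": {"base_size": 1024, "image_size": 640,  "crop_mode": True},
}


def infer_kwargs(mode: str = "gundam") -> dict:
    """Return a copy of the size/crop settings for the named mode."""
    key = mode.lower()
    if key not in MODE_PRESETS:
        valid = ", ".join(sorted(MODE_PRESETS))
        raise ValueError(f"unknown mode {mode!r}; expected one of: {valid}")
    # Copy so callers can modify the result without changing the presets.
    return dict(MODE_PRESETS[key])


# Assumed usage with the objects defined in the snippet above:
# res = model.infer(tokenizer, prompt=prompt, image_file=image_file,
#                   output_path=output_path, save_results=True,
#                   **infer_kwargs("gundam"))
```

With this in place, switching from Gundam to, say, Base mode is a one-word change, and a mistyped mode name fails early with a `ValueError` instead of running with unintended sizes.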
