ReaderLM-v2 MLX (Full Precision)

This is a full-precision (bfloat16) MLX conversion of jinaai/ReaderLM-v2, optimized for Apple Silicon.

Model Details

  • Original Model: jinaai/ReaderLM-v2
  • Parameters: 1.5B
  • Precision: bfloat16 (full precision)
  • Format: MLX safetensors
  • Size: ~2.9GB
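
The ~2.9GB footprint follows directly from the parameter count and precision; a quick back-of-the-envelope check:

```python
# Rough size estimate: 1.5B parameters at 2 bytes each (bfloat16).
params = 1.5e9
bytes_per_param = 2  # bfloat16 is 16 bits = 2 bytes
size_gib = params * bytes_per_param / 2**30
print(f"{size_gib:.1f} GiB")  # ≈ 2.8 GiB, consistent with the ~2.9GB on disk
```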

Why Full Precision?

While quantized versions (4-bit, 8-bit) exist for faster inference, this full-precision version offers:

  • Highest accuracy for HTML-to-markdown conversion
  • Best quality for complex document structures
  • No quantization artifacts in output

Usage

````python
from mlx_lm import load, generate

# Download and load the converted weights from the Hugging Face Hub
model, tokenizer = load("roboalchemist/ReaderLM-v2-mlx-fp16")

html_content = "<html><body><h1>Hello World</h1><p>This is a test.</p></body></html>"
prompt = f"Extract the main content from the following HTML and convert it to Markdown format:\n```html\n{html_content}\n```"

response = generate(model, tokenizer, prompt=prompt, max_tokens=2048)
print(response)
````
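
Real-world pages carry a lot of markup the model does not need. A common pre-processing step (this helper and its name are illustrative, not part of the model's API) is to strip scripts, styles, and comments before building the prompt, which shortens the input and leaves more of the token budget for content:

```python
import re

# Hypothetical helper: strip non-content markup from raw HTML before prompting.
SCRIPT_RE = re.compile(r"<script[\s\S]*?</script>", re.IGNORECASE)
STYLE_RE = re.compile(r"<style[\s\S]*?</style>", re.IGNORECASE)
COMMENT_RE = re.compile(r"<!--[\s\S]*?-->")

def clean_html(html: str) -> str:
    """Remove script/style blocks and comments, then collapse whitespace runs."""
    for pattern in (SCRIPT_RE, STYLE_RE, COMMENT_RE):
        html = pattern.sub("", html)
    return re.sub(r"\s{2,}", " ", html).strip()
```

Pass `clean_html(raw_html)` in place of `html_content` above; the prompt format stays the same.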

Conversion Details

Converted using:

```shell
mlx_lm.convert --hf-path jinaai/ReaderLM-v2 --mlx-path ./ReaderLM-v2-mlx-fp16
```
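
If you need a smaller, faster variant, the same tool can quantize during conversion via its `-q` flag (the output path below is a hypothetical name, not a published repo):

```shell
# Sketch: produce a 4-bit quantized conversion instead of full precision
mlx_lm.convert --hf-path jinaai/ReaderLM-v2 --mlx-path ./ReaderLM-v2-mlx-4bit -q --q-bits 4
```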

License

This model inherits the CC-BY-NC-4.0 license from the original ReaderLM-v2.

Acknowledgments

  • Jina AI for the original ReaderLM-v2 model
  • MLX team at Apple for the framework