ReaderLM-v2 MLX (Full Precision)

This is a full-precision (bfloat16) MLX conversion of jinaai/ReaderLM-v2, optimized for Apple Silicon.

Model Details

  • Original Model: jinaai/ReaderLM-v2
  • Parameters: 1.5B
  • Precision: bfloat16 (full precision)
  • Format: MLX safetensors
  • Size: ~2.9GB
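
The ~2.9GB footprint follows directly from the parameter count and precision; a quick back-of-the-envelope check:

```python
# Rough size estimate: 1.5B parameters at 2 bytes each (bfloat16).
params = 1.5e9
bytes_per_param = 2  # bfloat16 is 16 bits = 2 bytes
size_gib = params * bytes_per_param / 2**30
print(f"{size_gib:.1f} GiB")  # ≈ 2.8 GiB, consistent with the ~2.9GB on disk
```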

Why Full Precision?

While quantized versions (4-bit, 8-bit) exist for faster inference, this full-precision version offers:

  • Highest accuracy for HTML-to-markdown conversion
  • Best quality for complex document structures
  • No quantization artifacts in output

Usage

````python
from mlx_lm import load, generate

# Download and load the converted weights from the Hugging Face Hub
model, tokenizer = load("roboalchemist/ReaderLM-v2-mlx-fp16")

html_content = "<html><body><h1>Hello World</h1><p>This is a test.</p></body></html>"
prompt = f"Extract the main content from the following HTML and convert it to Markdown format:\n```html\n{html_content}\n```"

response = generate(model, tokenizer, prompt=prompt, max_tokens=2048)
print(response)
````
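
Real-world pages carry a lot of markup the model does not need. A common pre-processing step (this helper and its name are illustrative, not part of the model's API) is to strip scripts, styles, and comments before building the prompt, which shortens the input and leaves more of the token budget for content:

```python
import re

# Hypothetical helper: strip non-content markup from raw HTML before prompting.
SCRIPT_RE = re.compile(r"<script[\s\S]*?</script>", re.IGNORECASE)
STYLE_RE = re.compile(r"<style[\s\S]*?</style>", re.IGNORECASE)
COMMENT_RE = re.compile(r"<!--[\s\S]*?-->")

def clean_html(html: str) -> str:
    """Remove script/style blocks and comments, then collapse whitespace runs."""
    for pattern in (SCRIPT_RE, STYLE_RE, COMMENT_RE):
        html = pattern.sub("", html)
    return re.sub(r"\s{2,}", " ", html).strip()
```

Pass `clean_html(raw_html)` in place of `html_content` above; the prompt format stays the same.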

Conversion Details

Converted using:

```shell
mlx_lm.convert --hf-path jinaai/ReaderLM-v2 --mlx-path ./ReaderLM-v2-mlx-fp16
```
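
If you need a smaller, faster variant, the same tool can quantize during conversion via its `-q` flag (the output path below is a hypothetical name, not a published repo):

```shell
# Sketch: produce a 4-bit quantized conversion instead of full precision
mlx_lm.convert --hf-path jinaai/ReaderLM-v2 --mlx-path ./ReaderLM-v2-mlx-4bit -q --q-bits 4
```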

License

This model inherits the CC-BY-NC-4.0 license from the original ReaderLM-v2.

Acknowledgments

  • Jina AI for the original ReaderLM-v2 model
  • MLX team at Apple for the framework