# Qwen3-8B-Herald-SFT
A QLoRA fine-tune of Qwen/Qwen3-8B for natural-language-to-Lean-4 autoformalization (NL→FL), trained on the FrenzyMath/Herald_statements dataset as a supervised cold-start before GRPO reinforcement learning.
Given a natural-language mathematical statement, the model produces a structured Lean 4 theorem stub (with `sorry` placeholders) in a `#### {"header": ..., "formal_statement": ...}` JSON format that is directly consumable by a downstream GRPO trainer with a Lean 4 compiler reward.
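A minimal sketch of how a downstream consumer might extract that JSON payload. The function name `parse_model_output` is a hypothetical helper for illustration, not part of the released repo; it only assumes the format described above (a `####` marker followed by a JSON object with `header` and `formal_statement` keys):

```python
import json
from typing import Optional


def parse_model_output(response: str) -> Optional[dict]:
    """Extract the JSON object following the '####' marker, or None on failure."""
    marker = response.find("####")
    if marker == -1:
        return None
    payload = response[marker + len("####"):].strip()
    try:
        obj = json.loads(payload)
    except json.JSONDecodeError:
        return None
    # Require both expected keys before accepting the output.
    if "header" in obj and "formal_statement" in obj:
        return obj
    return None


example = '#### {"header": "import Mathlib", "formal_statement": "theorem t : True := by sorry"}'
print(parse_model_output(example))
```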
## Model Details

### Model Description
- Developed by: xiaolesu / Osmosis AI
- Model type: Causal LM (decoder-only transformer), LoRA adapter
- Finetuned from: Qwen/Qwen3-8B
- Language(s): English (natural language input) → Lean 4 (formal output)
- License: Apache 2.0
- Training dataset: FrenzyMath/Herald_statements — 50k shuffled examples (subset of 580k)
### Model Sources
- Repository: xiaolesu/Qwen3-8B-Herald-SFT-ckpt2000
- Base model: https://huggingface.co/Qwen/Qwen3-8B
## Uses

### Direct Use
This model is designed as a Lean 4 autoformalization engine. Given a natural language math problem, it translates it into a formal Lean 4 theorem statement using Mathlib, with `sorry` as the proof placeholder. It does not write proofs — only the theorem signature.
Example use case: automatically generating a corpus of Lean 4 theorem stubs from informal mathematical textbooks, competition problems, or research papers.
### Downstream Use
This is intended as an SFT cold-start checkpoint for a GRPO reinforcement learning phase, where the Lean 4 compiler provides binary reward signals (type-checks or not). The structured `####` JSON output format is specifically designed for easy parsing in a GRPO reward function.
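A sketch of the format-checking half of such a reward. `format_reward` is a hypothetical name for illustration; the actual Lean 4 compiler call (which would supply the type-check signal) is elided here:

```python
import json


def format_reward(response: str) -> float:
    """Return 1.0 if the response is well-formed per the card's output spec, else 0.0.

    A real GRPO reward would additionally run the Lean 4 compiler on the
    assembled header + statement and only pay out if it type-checks; that
    step is omitted in this sketch.
    """
    marker = response.find("####")
    if marker == -1:
        return 0.0
    try:
        obj = json.loads(response[marker + len("####"):].strip())
    except json.JSONDecodeError:
        return 0.0
    stmt = obj.get("formal_statement", "")
    # The spec requires every statement to end in ":= by sorry" or ":= sorry".
    if stmt.endswith(":= by sorry") or stmt.endswith(":= sorry"):
        return 1.0
    return 0.0
```

Keeping the format check binary and separate from the compile check makes it easy to debug which stage of the reward a rollout failed.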
### Out-of-Scope Use
- Writing complete Lean 4 proofs (the model only generates stubs ending in `:= by sorry`)
- Informal mathematical reasoning or problem solving in natural language
- Lean 3 or other proof assistants (Coq, Isabelle, etc.)
- General-purpose coding tasks
## How to Get Started with the Model
```python
import json

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

BASE_MODEL = "Qwen/Qwen3-8B"
ADAPTER = "xiaolesu/Qwen3-8B-Herald-SFT-ckpt2000"

SYSTEM_PROMPT = """
You are a Lean4 auto-formalization engine.

Task
Given a statement of a math problem, translate it into formal Lean 4 (v4.11.0) code with Mathlib. Do not write any proof steps. Use "sorry" as a placeholder for proofs. All statements must end with ":= by sorry" or ":= sorry".

Output Format
Output must start with #### followed by a JSON object:
{"header": "", "formal_statement": " := by sorry"}

Now formalize the following:"""

# Load the tokenizer from the adapter repo, then attach the LoRA adapter
# to the bfloat16 base model.
tokenizer = AutoTokenizer.from_pretrained(ADAPTER)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, ADAPTER)
model.eval()

nl = "Prove that for all natural numbers n, n + 0 = n"
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": json.dumps({"nl_statement": nl})},
]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    out = model.generate(
        **inputs, max_new_tokens=512, temperature=0.6, top_p=0.95, do_sample=True
    )

# Decode only the newly generated tokens.
response = tokenizer.decode(
    out[0][inputs.input_ids.shape[1]:], skip_special_tokens=True
)
print(response)
```

Example output:

```
#### {"header": "import Mathlib", "formal_statement": "theorem nat_add_zero (n : ℕ) : n + 0 = n := by sorry"}
```
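For reference, assembling the `header` and `formal_statement` from the example output above yields the following Lean 4 stub (assuming a Mathlib-enabled project):

```lean
import Mathlib

-- Stub emitted by the model: statement only, proof left as `sorry`.
theorem nat_add_zero (n : ℕ) : n + 0 = n := by sorry
```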