RRULE Extractor (NuExtract 8B Fine-tuned)
This model is a fine-tuned and quantized version of the NuExtract 8B parameter model, designed specifically for structured extraction of schedule data. It converts unstructured or ambiguous schedule descriptions into structured JSON encoding iCalendar (RFC 5545) RRULE recurrence rules.
Model Description
This model is a working proof of concept for a meaningful hypothesis: that the semantically rich, human-stewarded data maintained by some 211 networks (especially in rural areas) is not limited by its lack of machine-readable structure. With targeted fine-tuning, AI can bridge that gap — preserving the accuracy and nuance of human curation while producing the structured outputs that governments, hospitals, researchers, and software providers need to build effective solutions around the social safety net.
This model was trained as a follow-up to our initial experiments with the Osmosis 0.6B model. While the 0.6B model successfully learned the schema fields and could handle basic cases, it struggled with the complex realities of RFC 5545 RRULE schedule data. To overcome this limitation and ensure high-quality, strict structural adherence, we fine-tuned the highly capable NuExtract 8B model.
The result is a highly accurate extraction model that:
- Nails complex standard schedule formats.
- Successfully processes some of the most complex schedules found in the 211 datasets.
- Gracefully handles misspellings and ambiguous or unclear descriptions.
- Is quantized for more efficient inference and deployment (see QAT techniques in our experiment logs).
Intended Use
Primary Use Case: Extracting unstructured text descriptions of operating hours or schedules and converting them into strict, machine-readable JSON schedules that adhere to the RFC 5545 specification.
Example Input
"Open M-F 9-5 but closed on holidays"
Example Output
(The model outputs a valid JSON array of schedule objects containing fields such as `opens_at`, `closes_at`, `freq`, `interval`, and `byday`.)
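To make the output format concrete, here is a minimal sketch of consuming the model's JSON and rendering it as an RFC 5545 RRULE string. The field names (`opens_at`, `closes_at`, `freq`, `interval`, `byday`) come from the schema described above; the exact JSON values and the `to_rrule` helper are illustrative assumptions, not the model's verbatim output.

```python
import json

# Hypothetical model output for "Open M-F 9-5 but closed on holidays".
# Field names follow the schema above; exact values are illustrative.
raw_output = """
[
  {
    "opens_at": "09:00",
    "closes_at": "17:00",
    "freq": "WEEKLY",
    "interval": 1,
    "byday": ["MO", "TU", "WE", "TH", "FR"]
  }
]
"""

def to_rrule(schedule: dict) -> str:
    """Render one schedule object as an RFC 5545 RRULE string."""
    parts = [f"FREQ={schedule['freq']}", f"INTERVAL={schedule['interval']}"]
    if schedule.get("byday"):
        parts.append("BYDAY=" + ",".join(schedule["byday"]))
    return "RRULE:" + ";".join(parts)

schedules = json.loads(raw_output)
print(to_rrule(schedules[0]))
# RRULE:FREQ=WEEKLY;INTERVAL=1;BYDAY=MO,TU,WE,TH,FR
```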
Available Model Files
This repository includes both the full-precision safetensors weights and a quantized GGUF for flexible usage:
| File(s) | Format | Precision | Size | Use Case |
|---|---|---|---|---|
| `model-0000[1-4]-of-00004.safetensors` | SafeTensors | F16 | ~21 GB | Further fine-tuning / research |
| `model-q4_k_m.gguf` | GGUF | Q4_K_M | ~4.7 GB | Inference / deployment |
Full-Precision Weights (SafeTensors)
The four `.safetensors` shards contain the complete model weights at 16-bit (F16) precision — the native precision at which the model was trained and fine-tuned. F16 was chosen over BF16 to keep the door open for future reinforcement learning experiments, should we revisit this project when deploying this model (or a smaller QAT fine-tuned 4B parameter model) at scale. Either way, the safetensors are provided for researchers and practitioners who want to:
- Continue fine-tuning on additional or domain-specific schedule data.
- Experiment with alternative quantization schemes.
- Run evaluations at full precision.
Quantized GGUF (Q4_K_M)
The model-q4_k_m.gguf file is the recommended file for most inference use cases. It was produced using Quantization-Aware Training (QAT) — a technique that simulates the effects of quantization during the training process, allowing the model to adapt its weights to minimize accuracy loss before the final quantization step is applied. This is in contrast to post-training quantization (PTQ), which quantizes a fully trained model without any opportunity for weight adaptation.
The practical result is that the Q4_K_M model retains the vast majority of the full-precision model's accuracy at a fraction of the memory footprint, making it well-suited for local inference and production deployment. For a deeper technical explanation of how QAT enables low-precision accuracy recovery, see NVIDIA's published overview of QAT.
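The core mechanism of QAT described above can be sketched in a few lines: during training, weights are "fake-quantized" (rounded to a low-precision grid, then dequantized) in the forward pass, so the loss already reflects quantization error and gradient updates steer weights toward values that survive rounding. This is a simplified symmetric, per-tensor scheme for illustration; real 4-bit formats like Q4_K_M use grouped scales and are handled by the training framework, not hand-rolled code like this.

```python
def fake_quantize(weights, num_bits=4):
    """Quantize to a symmetric integer grid, then dequantize back to floats.

    This is the quantize-dequantize round trip QAT inserts into the forward
    pass so training 'sees' the rounding error PTQ only meets after training.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 7 levels per side for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    return [round(w / scale) * scale for w in weights]

weights = [0.82, -0.31, 0.05, -0.77]
simulated = fake_quantize(weights)
# The training loss is computed on `simulated`, so the optimizer shifts the
# underlying float weights toward values that round cleanly.
errors = [abs(w - q) for w, q in zip(weights, simulated)]
print(simulated, max(errors))
```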
Training Data
The model was fine-tuned on examples of 211 schedule data consisting of unstructured schedule strings annotated with corresponding valid, iCalendar-compliant (RFC 5545) RRULEs.
Limitations
- Designed primarily for English language schedule descriptions.
- Output should be validated by a JSON parser to ensure strict downstream compatibility; the model was trained to emit valid JSON natively, but validation provides a guarantee.
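The validation step recommended above can be a thin wrapper around the standard-library JSON parser plus a required-field check. This is a minimal sketch; the `REQUIRED_FIELDS` set is an assumed subset of the schema described earlier, and a production deployment would validate against the full schema.

```python
import json

# Assumed subset of the schedule schema; adjust to the full field list.
REQUIRED_FIELDS = {"opens_at", "closes_at", "freq"}

def validate_schedules(model_output: str) -> list:
    """Parse model output and check each schedule object for required fields.

    Raises ValueError on malformed JSON, a non-array top level, or a
    schedule object missing required fields.
    """
    try:
        schedules = json.loads(model_output)
    except json.JSONDecodeError as e:
        raise ValueError(f"model did not emit valid JSON: {e}") from e
    if not isinstance(schedules, list):
        raise ValueError("expected a JSON array of schedule objects")
    for i, s in enumerate(schedules):
        missing = REQUIRED_FIELDS - s.keys()
        if missing:
            raise ValueError(f"schedule {i} missing fields: {sorted(missing)}")
    return schedules

ok = validate_schedules(
    '[{"opens_at": "09:00", "closes_at": "17:00", "freq": "WEEKLY"}]'
)
```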
Future Work & Significance
At 8B parameters, this model carries capabilities well beyond what schedule extraction requires. Its generalized architecture handles the task reliably, but we believe similar extraction quality is achievable with a much smaller model — potentially in the 1–4B range — trained on the same labeled dataset. A purpose-built smaller model would dramatically reduce inference cost, latency, and memory footprint, which matters at the scale 211 networks operate at.
Looking ahead, the ~10k labeled examples used here could be expanded to an estimated 80k+ training examples, and the same fine-tuning methodology applied to other structured fields across 211 datasets beyond schedules.
More broadly, this model supports the hypothesis stated above: with targeted fine-tuning, AI can preserve the accuracy and nuance of human-curated 211 data while producing the structured outputs that governments, hospitals, researchers, and software providers need to build effective solutions for society's biggest problems. The bottleneck is not the data. The bottleneck is tooling, and that bottleneck is solvable.