DY-Star-LLMs (Collection)
LLMs for automating Dolev-Yao-Star proof generation.
This model is a parameter-efficient, SFTTrainer fine-tuned version of Microsoft's BitNet b1.58 2B LLM. It specializes in generating code in the F* formal verification language, including function definitions, lemmas, and automated formal proofs.
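To make the intended use concrete, here is a minimal, hypothetical prompt builder for asking the model to complete an F* lemma. The exact prompt format expected by the fine-tuned model is an assumption, not documented above; the training examples in microsoft/FStarDataSet-V2 define the real format.

```python
def build_fstar_prompt(goal: str, context: str = "") -> str:
    """Wrap an F* proof goal in a plain completion-style prompt.

    Hypothetical helper for illustration only; the fine-tuned model's
    actual expected prompt format should be taken from the training data.
    """
    parts = []
    if context:
        # Optional surrounding F* definitions the proof may depend on.
        parts.append(f"(* Context *)\n{context}")
    parts.append(f"(* Complete the following F* lemma with a proof *)\n{goal}")
    return "\n\n".join(parts)

prompt = build_fstar_prompt(
    "val rev_involutive: l:list 'a -> Lemma (List.rev (List.rev l) == l)"
)
print(prompt)
```

The resulting string would then be passed to the model's tokenizer and generation loop in the usual causal-LM fashion.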
The base model is microsoft/bitnet-b1.58-2B-4T (bf16). After training, the model was evaluated on the validation split of microsoft/FStarDataSet-V2.
Metrics were computed using the TRL SFTTrainer.evaluate() routine.
| Metric | Value | Description |
|---|---|---|
| Evaluation loss | 1.5682 | Cross-entropy loss on validation set |
| Evaluation entropy | 1.5745 | Average token-level entropy |
| Mean token accuracy | 0.7127 | Fraction of tokens correctly predicted |
| Validation perplexity | ≈ 4.80 (exp(1.5682)) | Derived from evaluation loss |
| Evaluation runtime | 163.39 s | Total time for validation |
| Samples / s | 11.97 | Evaluation throughput |
| Steps / s | 5.99 | Evaluation step rate |
| Evaluated tokens | 138,017,925 | Total tokens processed |
| Epoch | 3 | Training completed over 3 full epochs |
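"Mean token accuracy" in the table is the fraction of positions where the model's argmax prediction matches the reference token. The sketch below illustrates that definition on toy token IDs; it mirrors the metric's meaning, not TRL's exact implementation.

```python
def mean_token_accuracy(pred_ids, label_ids, ignore_index=-100):
    """Fraction of non-ignored positions where prediction == label."""
    # Positions labeled with ignore_index (e.g. padding) are excluded,
    # matching the usual convention in Hugging Face training code.
    kept = [(p, l) for p, l in zip(pred_ids, label_ids) if l != ignore_index]
    correct = sum(1 for p, l in kept if p == l)
    return correct / len(kept)

# All three non-ignored tokens match, so accuracy is 1.0:
print(mean_token_accuracy([5, 2, 9, 7], [5, 2, 9, -100]))  # -> 1.0
```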
The fine-tuned BitNet b1.58 2B (bf16) model achieved an evaluation loss of 1.5682 (perplexity ≈ 4.80) and a mean token accuracy of 0.7127, indicating stable convergence and improved code-generation alignment compared to the original baseline.
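The reported perplexity is derived directly from the cross-entropy evaluation loss (in nats), as the table notes:

```python
import math

# Perplexity is the exponential of the mean cross-entropy loss.
eval_loss = 1.5682
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # -> 4.8
```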
Base model: microsoft/bitnet-b1.58-2B-4T