DY-Star-LLMs (Collection)
LLMs for automating Dolev-Yao-Star proof generation.
This model is a parameter-efficient, SFTTrainer fine-tuned version of Microsoft's BitNet b1.58 2B LLM. It specializes in generating code in the F* formal verification language, including function definitions, lemmas, and automated formal proofs.
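To make the intended use concrete, here is a minimal, hypothetical prompt builder for asking the model to complete an F* lemma. The exact prompt format expected by the fine-tuned model is an assumption, not documented above; the training examples in microsoft/FStarDataSet-V2 define the real format.

```python
def build_fstar_prompt(goal: str, context: str = "") -> str:
    """Wrap an F* proof goal in a plain completion-style prompt.

    Hypothetical helper for illustration only; the fine-tuned model's
    actual expected prompt format should be taken from the training data.
    """
    parts = []
    if context:
        # Optional surrounding F* definitions the proof may depend on.
        parts.append(f"(* Context *)\n{context}")
    parts.append(f"(* Complete the following F* lemma with a proof *)\n{goal}")
    return "\n\n".join(parts)

prompt = build_fstar_prompt(
    "val rev_involutive: l:list 'a -> Lemma (List.rev (List.rev l) == l)"
)
print(prompt)
```

The resulting string would then be passed to the model's tokenizer and generation loop in the usual causal-LM fashion.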
The base model is microsoft/bitnet-b1.58-2B-4T (bf16). After training, the model was evaluated on the validation split of microsoft/FStarDataSet-V2.
Metrics were computed using the TRL SFTTrainer.evaluate() routine.
| Metric | Value | Description |
|---|---|---|
| Evaluation loss | 1.5682 | Cross-entropy loss on validation set |
| Evaluation entropy | 1.5745 | Average token-level entropy |
| Mean token accuracy | 0.7127 | Fraction of tokens correctly predicted |
| Validation perplexity | ≈ 4.80 (exp(1.5682)) | Derived from evaluation loss |
| Evaluation runtime | 163.39 s | Total time for validation |
| Samples / s | 11.97 | Evaluation throughput |
| Steps / s | 5.99 | Evaluation step rate |
| Evaluated tokens | 138,017,925 | Total tokens processed |
| Epoch | 3 | Training completed over 3 full epochs |
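"Mean token accuracy" in the table is the fraction of positions where the model's argmax prediction matches the reference token. The sketch below illustrates that definition on toy token IDs; it mirrors the metric's meaning, not TRL's exact implementation.

```python
def mean_token_accuracy(pred_ids, label_ids, ignore_index=-100):
    """Fraction of non-ignored positions where prediction == label."""
    # Positions labeled with ignore_index (e.g. padding) are excluded,
    # matching the usual convention in Hugging Face training code.
    kept = [(p, l) for p, l in zip(pred_ids, label_ids) if l != ignore_index]
    correct = sum(1 for p, l in kept if p == l)
    return correct / len(kept)

# All three non-ignored tokens match, so accuracy is 1.0:
print(mean_token_accuracy([5, 2, 9, 7], [5, 2, 9, -100]))  # -> 1.0
```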
The fine-tuned BitNet b1.58 2B (bf16) model achieved an evaluation loss of 1.5682 (perplexity ≈ 4.80) and a mean token accuracy of 0.7127, indicating stable convergence and improved code-generation alignment compared to the original baseline.
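The reported perplexity is derived directly from the cross-entropy evaluation loss (in nats), as the table notes:

```python
import math

# Perplexity is the exponential of the mean cross-entropy loss.
eval_loss = 1.5682
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # -> 4.8
```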
Base model: microsoft/bitnet-b1.58-2B-4T