# HistoryGPT

A Vietnamese history AI assistant fine-tuned with RLHF (PPO).
## Training Details
- Base Model: khanhrill/HistoryGPT
- Fine-tuning: PPO with human feedback from OpenWebUI
- Last Updated: 2025-12-12
- Version: 20251212_0806
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("khanhrill/HistoryGPT")
tokenizer = AutoTokenizer.from_pretrained("khanhrill/HistoryGPT")

# Prompt: "Tell me about the history of Vietnam"
prompt = "Hãy kể về lịch sử Việt Nam"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Training Pipeline

This model was trained using an automated RLHF pipeline:

1. Collect user feedback from OpenWebUI
2. Train a reward model on preference pairs
3. Fine-tune the model with PPO, using the reward model for scoring
4. Deploy the result to the HuggingFace Hub
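Step 2 above, fitting a reward model on preference pairs, is commonly done with the Bradley-Terry pairwise loss, `-log(sigmoid(r_chosen - r_rejected))`. The sketch below is a toy illustration only: the features, example pairs, and training loop are hypothetical stand-ins, whereas the real pipeline scores token sequences with a neural reward head on the collected OpenWebUI feedback.

```python
import math

# Toy reward model: a linear score over two hypothetical hand-crafted
# features (response length and question-mark count). Purely illustrative.
def features(text: str) -> list[float]:
    return [len(text) / 100.0, text.count("?")]

def reward(w: list[float], text: str) -> float:
    return sum(wi * xi for wi, xi in zip(w, features(text)))

def train(pairs: list[tuple[str, str]], steps: int = 200, lr: float = 0.5) -> list[float]:
    """Fit weights by gradient descent on the Bradley-Terry loss
    -log(sigmoid(r_chosen - r_rejected)) over (chosen, rejected) pairs."""
    w = [0.0, 0.0]
    for _ in range(steps):
        for chosen, rejected in pairs:
            margin = reward(w, chosen) - reward(w, rejected)
            # d/d(margin) of -log(sigmoid(margin)) is -1 / (1 + exp(margin))
            grad_scale = -1.0 / (1.0 + math.exp(margin))
            fc, fr = features(chosen), features(rejected)
            for i in range(len(w)):
                w[i] -= lr * grad_scale * (fc[i] - fr[i])
    return w

# Hypothetical preference pairs: (chosen answer, rejected answer).
pairs = [
    ("A detailed answer about the Tran dynasty with dates and sources.", "Short reply"),
    ("A long, sourced explanation of the August Revolution of 1945.", "No idea?"),
]
w = train(pairs)
# After training, the reward model ranks each chosen answer above its rejected one.
assert all(reward(w, c) > reward(w, r) for c, r in pairs)
```

In the PPO stage (step 3), this learned reward replaces human labels: the policy model generates responses, the reward model scores them, and PPO updates the policy to raise that score while a KL penalty keeps it close to the base model.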