# Llama 3 8B - ShareGPT 10K Voice AI
This is a LoRA-finetuned version of Meta-Llama-3-8B-Instruct, trained on 10,887 high-quality conversations from the ShareGPT52K dataset.
## Model Overview
- Base Model: meta-llama/Meta-Llama-3-8B-Instruct
- Training Method: LoRA (Low-Rank Adaptation)
- Quantization: 4-bit (bitsandbytes)
- Dataset: RyokoAI/ShareGPT52K
- Training Conversations: 10,887
- Training Steps: 2,043
- Training Epochs: 3
- Adapter Size: ~161 MB
## Training Configuration
- Learning Rate: 2e-4
- Batch Size: 4 (effective: 16 with gradient accumulation)
- Gradient Accumulation Steps: 4
- LoRA Rank (r): 16
- LoRA Alpha: 32
- LoRA Dropout: 0.05
- Optimizer: paged_adamw_32bit
- Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Trainable Parameters: 41,943,040 (0.52% of total)
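For reference, the settings above map roughly onto `peft`'s `LoraConfig` and `transformers`' `TrainingArguments` as sketched below. This is a minimal sketch, not the actual training script (which is not published with this card); the NF4 quantization type, fp16 compute dtype, and output directory are assumptions.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model in 4-bit (quant type / compute dtype are assumed, not published)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# LoRA settings listed above
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # reports ~41.9M trainable parameters

# Optimizer and batch settings listed above
training_args = TrainingArguments(
    output_dir="./llama3-sharegpt-lora",  # example path
    learning_rate=2e-4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,        # effective batch size 16
    num_train_epochs=3,
    optim="paged_adamw_32bit",
    save_steps=100,                       # checkpoints every 100 steps
)
```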
## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model
base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    device_map="auto",
    torch_dtype=torch.float16,
)

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "owenergy/llama3-sharegpt-10k-voice-ai")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

# Build the prompt with the Llama 3 chat template
messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": "Explain quantum computing in simple terms."},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Generate a response
outputs = model.generate(
    inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
)

# Decode only the newly generated tokens
response = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(response)
```
## Training Data
The model was trained on 10,887 diverse conversations from ShareGPT52K, covering:
- Technical assistance and programming
- General knowledge and education
- Creative writing and storytelling
- Problem-solving and troubleshooting
- Natural conversation and dialogue
- A wide range of other topics and domains
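The exact preprocessing pipeline is not published with this card. As a rough illustration, ShareGPT-style records are typically converted into chat messages before applying the Llama 3 chat template; the `conversations` / `from` / `value` field names below follow the common ShareGPT export format and are assumptions here.

```python
# Map ShareGPT speaker labels to chat-template roles (assumed label set)
ROLE_MAP = {"system": "system", "human": "user", "gpt": "assistant"}

def sharegpt_to_messages(record: dict) -> list[dict]:
    """Convert one ShareGPT-style record into a list of chat messages."""
    messages = []
    for turn in record.get("conversations", []):
        role = ROLE_MAP.get(turn.get("from"))
        if role is None:
            continue  # drop turns with unrecognized speaker labels
        messages.append({"role": role, "content": turn["value"]})
    return messages

# The resulting messages can then be rendered into training text with
# tokenizer.apply_chat_template(messages, tokenize=False)
```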
## What Makes This Model Special
- High-Quality Data: Trained exclusively on ShareGPT conversations, known for natural and helpful responses
- Optimized for Voice AI: Designed for conversational applications with natural dialogue flow
- Efficient LoRA: Only a ~161 MB adapter that plugs into the base model (it can also be merged into the base weights; see the sketch below)
- Well-Balanced: 3 full epochs ensure good learning without overfitting
- Conversational Excellence: Excels at multi-turn dialogues and context understanding
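If you prefer a single standalone checkpoint over loading the adapter at runtime, the LoRA weights can be merged into the base model with PEFT. A minimal sketch (the output path is just an example):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    device_map="auto",
    torch_dtype=torch.float16,
)
model = PeftModel.from_pretrained(base, "owenergy/llama3-sharegpt-10k-voice-ai")

# Fold the LoRA weights into the base model and drop the adapter wrappers
merged = model.merge_and_unload()

# Save a standalone checkpoint (example path)
merged.save_pretrained("./llama3-sharegpt-10k-merged")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
tokenizer.save_pretrained("./llama3-sharegpt-10k-merged")
```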
## Model Strengths
- Natural, human-like conversational responses
- Strong context retention across multiple turns
- Helpful and informative explanations
- Creative problem-solving approaches
- Adaptable to various conversation styles
## Limitations
- Requires base Llama 3 8B model (this is just the LoRA adapter)
- Trained primarily on English conversations
- May inherit biases present in training data
- Requires proper chat template formatting
- Best performance with conversational use cases
## Performance
The model shows improved performance over the base model on:
- Conversational coherence
- Response helpfulness
- Natural dialogue flow
- Context understanding
- Multi-turn conversations
## Use Cases
Perfect for:
- Voice assistants
- Chatbots
- Customer service AI
- Interactive AI applications
- Mobile AI assistants
- Web-based chat interfaces
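For latency-sensitive applications such as voice assistants, tokens can be streamed to the caller as they are generated instead of waiting for the full response. A minimal sketch using `transformers`' `TextIteratorStreamer`; it assumes `model`, `tokenizer`, and `inputs` from the Quick Start example above.

```python
from threading import Thread
from transformers import TextIteratorStreamer

# Stream decoded text as it is produced (skip the prompt and special tokens)
streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)

generation_kwargs = dict(
    inputs=inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    streamer=streamer,
)

# Run generation in a background thread so the main thread can consume chunks
thread = Thread(target=model.generate, kwargs=generation_kwargs)
thread.start()

for chunk in streamer:
    # Forward each chunk to a TTS engine or chat UI as soon as it arrives
    print(chunk, end="", flush=True)
thread.join()
```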
## Model Files
- `adapter_model.safetensors` - LoRA adapter weights (~161 MB)
- `adapter_config.json` - Adapter configuration
- `tokenizer_config.json` - Tokenizer settings
- `special_tokens_map.json` - Special tokens
- Checkpoints saved every 100 steps
## Hardware Requirements
For Inference:
- GPU: 12GB+ VRAM (with 4-bit quantization)
- RAM: 16GB+ system memory
- Storage: ~5GB (base model) + 161MB (adapter)
Recommended Setup:
- GPU: RTX 3090, RTX 4090, A100, or similar
- With 4-bit quantization: can run on consumer GPUs (see the loading sketch below)
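A minimal sketch of loading the base model in 4-bit with bitsandbytes before attaching the adapter; the NF4 quantization type and fp16 compute dtype are common choices, not settings published with this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

# 4-bit quantization config (values shown are an assumed, typical setup)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "owenergy/llama3-sharegpt-10k-voice-ai")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
```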
## Citation
```bibtex
@misc{llama3-sharegpt-10k-voice-ai,
  author = {owenergy},
  title = {Llama 3 8B ShareGPT 10K Voice AI},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/owenergy/llama3-sharegpt-10k-voice-ai},
  note = {LoRA finetuned on 10,887 ShareGPT conversations}
}
```
## License
This model inherits the Llama 3 license from Meta. Please review the Llama 3 License Agreement.
## Acknowledgments
- Meta for the Llama 3 base model
- RyokoAI for the ShareGPT52K dataset
- HuggingFace for the transformers and PEFT libraries
## Contact
For questions, issues, or feedback, please open an issue on HuggingFace.
Model Card by: owenergy
Date: December 2025
Status: Production Ready