# Saad's AI Twin
This is a personalized AI twin fine-tuned using LoRA on Microsoft Phi-3 Mini (3.8B parameters).
## Model Details
- Base Model: microsoft/Phi-3-mini-4k-instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training: Google Colab with T4 GPU
- Purpose: Personality replication for conversational AI
## Usage

### With Transformers + PEFT
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "SaadAx/saad-twin")

# Generate a response using the Phi-3 chat format
prompt = "<|user|>\nHow do you handle stress?<|end|>\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
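If you later want to serve the model without a PEFT dependency (for example on a dedicated endpoint), you can merge the adapter into the base weights first. A minimal sketch continuing from the example above; the target repo name `SaadAx/saad-twin-merged` is a hypothetical placeholder:

```python
# Fold the LoRA weights into the base model so the result is a plain
# transformers model (no PEFT needed at inference time).
merged_model = model.merge_and_unload()

# Hypothetical repo name -- replace with your own.
merged_model.push_to_hub("SaadAx/saad-twin-merged")
tokenizer.push_to_hub("SaadAx/saad-twin-merged")
```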
### Via Hugging Face Inference API
```python
import requests

API_URL = "https://api-inference.huggingface.co/models/SaadAx/saad-twin"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "<|user|>\nHello!<|end|>\n<|assistant|>\n",
    "parameters": {"max_new_tokens": 200, "temperature": 0.8},
})
print(output)
```
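On the serverless API, a cold model returns HTTP 503 while it loads, usually with an `estimated_time` hint in the JSON body. A hedged retry wrapper building on the `query` example above:

```python
import time

def query_with_retry(payload, retries=5):
    for _ in range(retries):
        response = requests.post(API_URL, headers=headers, json=payload)
        if response.status_code == 503:
            # Model is still loading; back off for the server's estimate.
            wait = response.json().get("estimated_time", 20)
            time.sleep(wait)
            continue
        return response.json()
    raise RuntimeError("Model did not load within the retry budget")
```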
## Training Details
- Dataset: Custom personality questionnaire responses
- Training Time: ~25 minutes
- LoRA Rank: 16
- Target Modules: q_proj, v_proj, k_proj, o_proj
- Learning Rate: 3e-4
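These hyperparameters correspond roughly to a PEFT configuration like the one below. This is a sketch rather than the exact training script; `lora_alpha` and `lora_dropout` are assumed values that are not stated in this card.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                          # LoRA rank, as listed above
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_alpha=32,                 # assumed; not stated in the card
    lora_dropout=0.05,             # assumed; not stated in the card
    task_type="CAUSAL_LM",
)
# The 3e-4 learning rate is passed to the trainer, e.g.:
# TrainingArguments(learning_rate=3e-4, ...)
```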
## Intended Use
This model is designed for:
- Demonstrating personality-based AI fine-tuning
- Educational purposes
- Research in personalized AI systems
## Limitations
- May not perfectly capture all personality nuances
- Requires the Phi-3 prompt format
- Limited to English
- 4K context window
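Because the model expects the Phi-3 prompt format, the safest way to build prompts is the tokenizer's built-in chat template rather than hand-writing the special tokens. A minimal sketch, reusing `model` and `tokenizer` from the Usage section:

```python
messages = [{"role": "user", "content": "How do you handle stress?"}]

# apply_chat_template inserts the <|user|>/<|assistant|>/<|end|> markers
# that Phi-3 was trained with.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```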
## License
MIT License: free for commercial and non-commercial use.
**Click "Commit changes to main"**
---
### **Step 3: Wait 5-10 Minutes**
After updating the README:
- Hugging Face needs to reprocess your model
- Refresh the page after 10 minutes
- Look for the **"Hosted inference API"** section to appear
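If you'd rather not refresh manually, you can poll the serverless API's status from Python. A sketch using `huggingface_hub`'s `InferenceClient.get_model_status`; note this reflects the serverless Inference API and may not match the widget exactly:

```python
from huggingface_hub import InferenceClient

client = InferenceClient(token="YOUR_HF_TOKEN")

# Reports whether the serverless Inference API can serve this model.
status = client.get_model_status("SaadAx/saad-twin")
print(status.state, status.loaded)
```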
---
### **Step 4: Test If It Works**
After 10 minutes, try this in your browser:
Go to: https://huggingface.co/SaadAx/saad-twin
Look for the **inference widget** on the right side. If you see it, type a test message!
---
## **OPTION 2: Use Inference Endpoints (If Free Doesn't Work)**
If free inference never becomes available for your model, you can deploy a dedicated (paid) endpoint instead.
### **Step 1: Go to Inference Endpoints**
1. Click your profile (top right) → **Settings**
2. In left sidebar: Click **Inference Endpoints** (or go to https://ui.endpoints.huggingface.co/)
### **Step 2: Create New Endpoint**
1. Click **"+ New endpoint"**
2. Fill in:
- **Model Repository:** `SaadAx/saad-twin`
- **Endpoint name:** `saad-twin-api`
- **Cloud Provider:** AWS (or Azure/GCP)
- **Region:** us-east-1 (or closest to you)
- **Instance Type:**
- **CPU:** `cpu.small` (cheapest, slow)
- **GPU:** `gpu.small` (recommended for Phi-3)
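Alternatively, the same endpoint can be created programmatically with `huggingface_hub`. A sketch; the `instance_size`/`instance_type` values below are assumptions, so verify the exact labels offered to your account in the UI:

```python
from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    "saad-twin-api",
    repository="SaadAx/saad-twin",
    framework="pytorch",
    task="text-generation",
    vendor="aws",
    region="us-east-1",
    accelerator="gpu",            # or "cpu" for the cheaper option
    instance_size="small",        # assumed label; verify in the UI
    instance_type="g4dn.xlarge",  # assumed label; verify in the UI
    type="protected",
)
endpoint.wait()  # block until the endpoint is running

# Pause when idle to stop hourly billing.
endpoint.pause()
```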
### **Pricing (if you go this route):**
- **CPU (slow):** ~$0.03/hour ≈ $20/month
- **GPU (fast):** ~$0.60/hour ≈ $400/month