# Saad's AI Twin
This is a personalized AI twin fine-tuned using LoRA on Microsoft Phi-3 Mini (3.8B parameters).
## Model Details
- Base Model: microsoft/Phi-3-mini-4k-instruct
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Training: Google Colab with T4 GPU
- Purpose: Personality replication for conversational AI
## Usage

### With Transformers + PEFT
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model and tokenizer
base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(base_model, "SaadAx/saad-twin")

# Generate a response using the Phi-3 chat format
prompt = "<|user|>\nHow do you handle stress?<|end|>\n<|assistant|>\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
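If you later want to serve the model without a PEFT dependency (for example on a dedicated endpoint), you can merge the adapter into the base weights first. A minimal sketch continuing from the example above; the target repo name `SaadAx/saad-twin-merged` is a hypothetical placeholder:

```python
# Fold the LoRA weights into the base model so the result is a plain
# transformers model (no PEFT needed at inference time).
merged_model = model.merge_and_unload()

# Hypothetical repo name -- replace with your own.
merged_model.push_to_hub("SaadAx/saad-twin-merged")
tokenizer.push_to_hub("SaadAx/saad-twin-merged")
```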
### Via Hugging Face Inference API
```python
import requests

API_URL = "https://api-inference.huggingface.co/models/SaadAx/saad-twin"
headers = {"Authorization": "Bearer YOUR_HF_TOKEN"}

def query(payload):
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()

output = query({
    "inputs": "<|user|>\nHello!<|end|>\n<|assistant|>\n",
    "parameters": {"max_new_tokens": 200, "temperature": 0.8},
})
print(output)
```
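On the serverless API, a cold model returns HTTP 503 while it loads, usually with an `estimated_time` hint in the JSON body. A hedged retry wrapper building on the `query` example above:

```python
import time

def query_with_retry(payload, retries=5):
    for _ in range(retries):
        response = requests.post(API_URL, headers=headers, json=payload)
        if response.status_code == 503:
            # Model is still loading; back off for the server's estimate.
            wait = response.json().get("estimated_time", 20)
            time.sleep(wait)
            continue
        return response.json()
    raise RuntimeError("Model did not load within the retry budget")
```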
## Training Details
- Dataset: Custom personality questionnaire responses
- Training Time: ~25 minutes
- LoRA Rank: 16
- Target Modules: q_proj, v_proj, k_proj, o_proj
- Learning Rate: 3e-4
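These hyperparameters correspond roughly to a PEFT configuration like the one below. This is a sketch rather than the exact training script; `lora_alpha` and `lora_dropout` are assumed values that are not stated in this card.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,                          # LoRA rank, as listed above
    target_modules=["q_proj", "v_proj", "k_proj", "o_proj"],
    lora_alpha=32,                 # assumed; not stated in the card
    lora_dropout=0.05,             # assumed; not stated in the card
    task_type="CAUSAL_LM",
)
# The 3e-4 learning rate is passed to the trainer, e.g.:
# TrainingArguments(learning_rate=3e-4, ...)
```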
## Intended Use
This model is designed for:
- Demonstrating personality-based AI fine-tuning
- Educational purposes
- Research in personalized AI systems
## Limitations
- May not perfectly capture all personality nuances
- Requires the Phi-3 prompt format
- Limited to English
- 4K context window
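Because the model expects the Phi-3 prompt format, the safest way to build prompts is the tokenizer's built-in chat template rather than hand-writing the special tokens. A minimal sketch, reusing `model` and `tokenizer` from the Usage section:

```python
messages = [{"role": "user", "content": "How do you handle stress?"}]

# apply_chat_template inserts the <|user|>/<|assistant|>/<|end|> markers
# that Phi-3 was trained with.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=200, do_sample=True, temperature=0.8)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```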
## License
MIT License: free for commercial and non-commercial use.
**Click "Commit changes to main"**
---
### **Step 3: Wait 5-10 Minutes**
After updating the README:
- Hugging Face needs to reprocess your model
- Refresh the page after 10 minutes
- Look for the **"Hosted inference API"** section to appear
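If you'd rather not refresh manually, you can poll the serverless API's status from Python. A sketch using `huggingface_hub`'s `InferenceClient.get_model_status`; note this reflects the serverless Inference API and may not match the widget exactly:

```python
from huggingface_hub import InferenceClient

client = InferenceClient(token="YOUR_HF_TOKEN")

# Reports whether the serverless Inference API can serve this model.
status = client.get_model_status("SaadAx/saad-twin")
print(status.state, status.loaded)
```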
---
### **Step 4: Test If It Works**
After 10 minutes, try this in your browser:
Go to: https://huggingface.co/SaadAx/saad-twin
Look for the **inference widget** on the right side. If you see it, type a test message!
---
## **OPTION 2: Use Inference Endpoints (If Free Doesn't Work)**
If free inference never becomes available for your model, you can deploy a dedicated (paid) endpoint instead.
### **Step 1: Go to Inference Endpoints**
1. Click your profile (top right) → **Settings**
2. In left sidebar: Click **Inference Endpoints** (or go to https://ui.endpoints.huggingface.co/)
### **Step 2: Create New Endpoint**
1. Click **"+ New endpoint"**
2. Fill in:
- **Model Repository:** `SaadAx/saad-twin`
- **Endpoint name:** `saad-twin-api`
- **Cloud Provider:** AWS (or Azure/GCP)
- **Region:** us-east-1 (or closest to you)
- **Instance Type:**
- **CPU:** `cpu.small` (cheapest, slow)
- **GPU:** `gpu.small` (recommended for Phi-3)
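Alternatively, the same endpoint can be created programmatically with `huggingface_hub`. A sketch; the `instance_size`/`instance_type` values below are assumptions, so verify the exact labels offered to your account in the UI:

```python
from huggingface_hub import create_inference_endpoint

endpoint = create_inference_endpoint(
    "saad-twin-api",
    repository="SaadAx/saad-twin",
    framework="pytorch",
    task="text-generation",
    vendor="aws",
    region="us-east-1",
    accelerator="gpu",            # or "cpu" for the cheaper option
    instance_size="small",        # assumed label; verify in the UI
    instance_type="g4dn.xlarge",  # assumed label; verify in the UI
    type="protected",
)
endpoint.wait()  # block until the endpoint is running

# Pause when idle to stop hourly billing.
endpoint.pause()
```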
### **Pricing (if you go this route):**
- **CPU (slow):** ~$0.03/hour ≈ $20/month
- **GPU (fast):** ~$0.60/hour ≈ $400/month