# SanctumAI-v1
The intelligent coding co-pilot built for CodeSanctum, an offline-first collaborative hackathon platform.
SanctumAI is a fine-tuned Phi-3-mini-4k-instruct model, specialized for hackathon-style code generation, debugging, project scaffolding, and task planning. It understands the CodeSanctum platform context and produces immediately usable, production-quality code.
## Model Details
| Property | Value |
|---|---|
| Base Model | microsoft/Phi-3-mini-4k-instruct |
| Fine-tuning Method | QLoRA (4-bit NF4 quantization) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Target Modules | qkv_proj, o_proj, gate_up_proj, down_proj |
| Training Examples | 22 curated examples across 7 categories |
| Epochs | 5 |
| Learning Rate | 5e-5 (cosine scheduler) |
| Optimizer | Paged AdamW 8-bit |
| Max Sequence Length | 2048 tokens |
| Quantization | 4-bit (NF4, double quantization) |
| GPU | Google Colab T4 (free tier compatible) |
| Training Time | ~20-40 minutes |
| Framework | Transformers + PEFT + TRL |
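The hyperparameters above can be assembled into a QLoRA training configuration roughly like the following sketch. Only the values in the table are published; the LoRA dropout and output directory are assumptions, and dataset loading and the trainer itself are omitted.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization, as in the table
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA (r=16, alpha=32) targeting Phi-3's fused attention/MLP projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    lora_dropout=0.05,  # assumption: not stated in the table
    task_type="CAUSAL_LM",
)

# 5 epochs, 5e-5 with cosine decay, paged 8-bit AdamW
training_args = TrainingArguments(
    num_train_epochs=5,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    output_dir="sanctumai-v1",  # assumption
)
```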
## Capabilities

SanctumAI is trained to excel at:

- Code Completion – Context-aware code generation across JavaScript, TypeScript, Python, Java, C++, HTML, CSS
- Bug Detection – Root-cause analysis with concrete fixes
- Code Review – Quality assessment with actionable improvement suggestions
- Project Scaffolding – Full project structure generation from descriptions
- Task Generation – Breaking down projects into actionable sprint tasks
- Code Explanation – Clear, concise explanations of complex code
- Refactoring – Modernizing and optimizing existing code
## Training Data Categories

- `code_completion` – Context-aware code generation
- `bug_detection` – Finding and fixing bugs with explanations
- `code_review` – Quality assessment and improvement suggestions
- `project_scaffold` – Full project structure generation
- `task_generation` – Sprint task breakdown from descriptions
- `code_explain` – Code explanation and documentation
- `identity` – SanctumAI personality and platform knowledge
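The actual dataset schema is not published; a single training record per category might look like the following hypothetical JSONL sketch, assuming the chat-style format commonly used for supervised fine-tuning of Phi-3.

```python
import json

# Hypothetical record; the "category" field and message layout are
# illustrative assumptions, not the published dataset format.
record = {
    "category": "bug_detection",
    "messages": [
        {"role": "system", "content": "You are SanctumAI, the intelligent coding co-pilot built for CodeSanctum."},
        {"role": "user", "content": "Why does this loop never terminate?\n\nlet i = 0;\nwhile (i < 10) { console.log(i); }"},
        {"role": "assistant", "content": "The loop never increments `i`, so `i < 10` stays true forever. Add `i++` inside the loop body."},
    ],
}

# JSONL: one JSON object per line, as most SFT pipelines expect
line = json.dumps(record)
parsed = json.loads(line)
```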
## Architecture

SanctumAI runs as part of a multi-tier inference system:

```
┌─────────────────────────────────────────────┐
│            CodeSanctum Frontend             │
│        (React + Monaco Code Editor)         │
└──────────────────────┬──────────────────────┘
                       │
┌──────────────────────┴──────────────────────┐
│          Node.js Backend (Express)          │
│      Routes: /api/ai/* + /api/agent/*       │
└──────────────────────┬──────────────────────┘
                       │
┌──────────────────────┴──────────────────────┐
│          SanctumAI Inference Server         │
│       (FastAPI, OpenAI-compatible API)      │
│                                             │
│  Priority Chain:                            │
│  1. SanctumAI (Colab via ngrok)             │
│  2. Groq Fallback (llama-3.3-70b-versatile) │
│  3. Local Phi-3 + QLoRA adapter             │
└─────────────────────────────────────────────┘
```
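The priority chain can be sketched as a simple try-in-order fallback; this is an illustrative sketch, not the actual server code, and the backend names and stub functions are hypothetical.

```python
from typing import Callable

# A backend takes a list of chat messages and returns a reply string
Backend = Callable[[list], str]

def make_chain(backends: list) -> Callable:
    """Try each (name, backend) pair in priority order; fall through on failure."""
    def chat(messages):
        errors = []
        for name, backend in backends:
            try:
                return name, backend(messages)
            except Exception as exc:  # a real server would narrow this
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all backends failed: " + "; ".join(errors))
    return chat

# Simulate the chain when the Colab tunnel is down: falls back to Groq
def colab(messages): raise ConnectionError("ngrok tunnel unreachable")
def groq(messages): return "response from groq"
def local(messages): return "response from local phi-3"

chat = make_chain([("sanctumai-colab", colab), ("groq", groq), ("local-phi3", local)])
used, reply = chat([{"role": "user", "content": "hi"}])
# used == "groq": the first backend raised, the second succeeded
```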
## Usage

### Loading the Adapter

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit (NF4, double quantization)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Attach the SanctumAI LoRA adapter
model = PeftModel.from_pretrained(base_model, "CodeSanctum/SanctumAI-v1")
```
### Inference

```python
messages = [
    {"role": "system", "content": "You are SanctumAI, the intelligent coding co-pilot built for CodeSanctum."},
    {"role": "user", "content": "Create a REST API with Express.js for a todo app with CRUD operations"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        temperature=0.7,
        top_p=0.95,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
### Via the Inference Server

```bash
# Start the FastAPI server
cd sanctum-ai
python server.py

# Call the OpenAI-compatible endpoint
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Write a React hook for debouncing"}
    ],
    "max_tokens": 512
  }'
```
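Because the endpoint is OpenAI-compatible, the same request can be built from Python with the standard library alone. This is a minimal sketch: the path and body mirror the curl example above, while the response shape (reading `choices[0].message.content`) is assumed from the standard OpenAI chat-completions format.

```python
import json
from urllib import request

def build_request(messages, max_tokens=512, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    body = json.dumps({"messages": messages, "max_tokens": max_tokens}).encode()
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request([{"role": "user", "content": "Write a React hook for debouncing"}])
# Send with request.urlopen(req), then read choices[0].message.content
# from the decoded JSON reply (assumed OpenAI response shape).
```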
## About CodeSanctum

CodeSanctum is a full-stack collaborative hackathon platform with:

- Real-time Collaborative Code Editor – Monaco-based with multi-cursor and live sync via Socket.IO
- AI Code Assistant – SanctumAI for code completion, bug detection, and review
- Team Management – Invite codes, roles, and sprint task boards
- Video Calling – HD team calls via the 100ms SDK
- Offline Collaboration – LAN Mode with WebRTC P2P + Direct Connect (zero-internet relay)
- Encrypted Vault – Secure credential storage for team secrets
- Whiteboard – Real-time collaborative diagramming
- Documents – Shared specs and notes
- Chill Zone – Embedded games for team breaks
- Spotify Integration – Shared listening for focused coding sessions

Tech Stack: React, Node.js/Express, MongoDB, Socket.IO, FastAPI, Phi-3, QLoRA
## Team
Built by the CodeSanctum Team for hackathon collaboration.
## License
This adapter is released under the Apache 2.0 License, consistent with the Phi-3 base model license.
## Citation

```bibtex
@software{sanctumai2025,
  title={SanctumAI: A Fine-tuned Phi-3 Coding Co-pilot for Hackathon Collaboration},
  author={CodeSanctum Team},
  year={2025},
  url={https://huggingface.co/CodeSanctum/SanctumAI-v1}
}
```