# SanctumAI-v1
The intelligent coding co-pilot built for CodeSanctum, an offline-first collaborative hackathon platform.
SanctumAI is a fine-tuned Phi-3-mini-4k-instruct model, specialized for hackathon-style code generation, debugging, project scaffolding, and task planning. It understands the CodeSanctum platform context and produces immediately usable, production-quality code.
## Model Details
| Property | Value |
|---|---|
| Base Model | microsoft/Phi-3-mini-4k-instruct |
| Fine-tuning Method | QLoRA (4-bit NF4 quantization) |
| LoRA Rank (r) | 16 |
| LoRA Alpha | 32 |
| Target Modules | qkv_proj, o_proj, gate_up_proj, down_proj |
| Training Examples | 22 curated examples across 7 categories |
| Epochs | 5 |
| Learning Rate | 5e-5 (cosine scheduler) |
| Optimizer | Paged AdamW 8-bit |
| Max Sequence Length | 2048 tokens |
| Quantization | 4-bit (NF4, double quantization) |
| GPU | Google Colab T4 (free tier compatible) |
| Training Time | ~20-40 minutes |
| Framework | Transformers + PEFT + TRL |
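The hyperparameters above can be assembled into a QLoRA training configuration roughly like the following sketch. Only the values in the table are published; the LoRA dropout and output directory are assumptions, and dataset loading and the trainer itself are omitted.

```python
import torch
from transformers import BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig

# 4-bit NF4 quantization with double quantization, as in the table
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA (r=16, alpha=32) targeting Phi-3's fused attention/MLP projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["qkv_proj", "o_proj", "gate_up_proj", "down_proj"],
    lora_dropout=0.05,  # assumption: not stated in the table
    task_type="CAUSAL_LM",
)

# 5 epochs, 5e-5 with cosine decay, paged 8-bit AdamW
training_args = TrainingArguments(
    num_train_epochs=5,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    optim="paged_adamw_8bit",
    output_dir="sanctumai-v1",  # assumption
)
```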
## Capabilities

SanctumAI is trained to excel at:

- Code Completion – Context-aware code generation across JavaScript, TypeScript, Python, Java, C++, HTML, CSS
- Bug Detection – Root-cause analysis with concrete fixes
- Code Review – Quality assessment with actionable improvement suggestions
- Project Scaffolding – Full project structure generation from descriptions
- Task Generation – Breaking down projects into actionable sprint tasks
- Code Explanation – Clear, concise explanations of complex code
- Refactoring – Modernizing and optimizing existing code
## Training Data Categories

- `code_completion` – Context-aware code generation
- `bug_detection` – Finding and fixing bugs with explanations
- `code_review` – Quality assessment and improvement suggestions
- `project_scaffold` – Full project structure generation
- `task_generation` – Sprint task breakdown from descriptions
- `code_explain` – Code explanation and documentation
- `identity` – SanctumAI personality and platform knowledge
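The actual dataset schema is not published; a single training record per category might look like the following hypothetical JSONL sketch, assuming the chat-style format commonly used for supervised fine-tuning of Phi-3.

```python
import json

# Hypothetical record; the "category" field and message layout are
# illustrative assumptions, not the published dataset format.
record = {
    "category": "bug_detection",
    "messages": [
        {"role": "system", "content": "You are SanctumAI, the intelligent coding co-pilot built for CodeSanctum."},
        {"role": "user", "content": "Why does this loop never terminate?\n\nlet i = 0;\nwhile (i < 10) { console.log(i); }"},
        {"role": "assistant", "content": "The loop never increments `i`, so `i < 10` stays true forever. Add `i++` inside the loop body."},
    ],
}

# JSONL: one JSON object per line, as most SFT pipelines expect
line = json.dumps(record)
parsed = json.loads(line)
```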
## Architecture

SanctumAI runs as part of a multi-tier inference system:

```
┌─────────────────────────────────────────────┐
│            CodeSanctum Frontend             │
│        (React + Monaco Code Editor)         │
└──────────────────────┬──────────────────────┘
                       │
┌──────────────────────┴──────────────────────┐
│          Node.js Backend (Express)          │
│      Routes: /api/ai/* + /api/agent/*       │
└──────────────────────┬──────────────────────┘
                       │
┌──────────────────────┴──────────────────────┐
│          SanctumAI Inference Server         │
│       (FastAPI, OpenAI-compatible API)      │
│                                             │
│  Priority Chain:                            │
│  1. SanctumAI (Colab via ngrok)             │
│  2. Groq Fallback (llama-3.3-70b-versatile) │
│  3. Local Phi-3 + QLoRA adapter             │
└─────────────────────────────────────────────┘
```
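The priority chain can be sketched as a simple try-in-order fallback; this is an illustrative sketch, not the actual server code, and the backend names and stub functions are hypothetical.

```python
from typing import Callable

# A backend takes a list of chat messages and returns a reply string
Backend = Callable[[list], str]

def make_chain(backends: list) -> Callable:
    """Try each (name, backend) pair in priority order; fall through on failure."""
    def chat(messages):
        errors = []
        for name, backend in backends:
            try:
                return name, backend(messages)
            except Exception as exc:  # a real server would narrow this
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all backends failed: " + "; ".join(errors))
    return chat

# Simulate the chain when the Colab tunnel is down: falls back to Groq
def colab(messages): raise ConnectionError("ngrok tunnel unreachable")
def groq(messages): return "response from groq"
def local(messages): return "response from local phi-3"

chat = make_chain([("sanctumai-colab", colab), ("groq", groq), ("local-phi3", local)])
used, reply = chat([{"role": "user", "content": "hi"}])
# used == "groq": the first backend raised, the second succeeded
```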
## Usage

### Loading the Adapter

```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model in 4-bit (NF4, double quantization)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

base_model = AutoModelForCausalLM.from_pretrained(
    "microsoft/Phi-3-mini-4k-instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Attach the SanctumAI LoRA adapter
model = PeftModel.from_pretrained(base_model, "CodeSanctum/SanctumAI-v1")
```
### Inference

```python
messages = [
    {"role": "system", "content": "You are SanctumAI, the intelligent coding co-pilot built for CodeSanctum."},
    {"role": "user", "content": "Create a REST API with Express.js for a todo app with CRUD operations"},
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=1024,
        temperature=0.7,
        top_p=0.95,
        do_sample=True,
    )

# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)
```
### Via the Inference Server

```bash
# Start the FastAPI server
cd sanctum-ai
python server.py

# Call the OpenAI-compatible endpoint
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      {"role": "user", "content": "Write a React hook for debouncing"}
    ],
    "max_tokens": 512
  }'
```
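Because the endpoint is OpenAI-compatible, the same request can be built from Python with the standard library alone. This is a minimal sketch: the path and body mirror the curl example above, while the response shape (reading `choices[0].message.content`) is assumed from the standard OpenAI chat-completions format.

```python
import json
from urllib import request

def build_request(messages, max_tokens=512, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat endpoint."""
    body = json.dumps({"messages": messages, "max_tokens": max_tokens}).encode()
    return request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request([{"role": "user", "content": "Write a React hook for debouncing"}])
# Send with request.urlopen(req), then read choices[0].message.content
# from the decoded JSON reply (assumed OpenAI response shape).
```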
## About CodeSanctum

CodeSanctum is a full-stack collaborative hackathon platform with:

- Real-time Collaborative Code Editor – Monaco-based with multi-cursor and live sync via Socket.IO
- AI Code Assistant – SanctumAI for code completion, bug detection, and review
- Team Management – Invite codes, roles, and sprint task boards
- Video Calling – HD team calls via the 100ms SDK
- Offline Collaboration – LAN Mode with WebRTC P2P + Direct Connect (zero-internet relay)
- Encrypted Vault – Secure credential storage for team secrets
- Whiteboard – Real-time collaborative diagramming
- Documents – Shared specs and notes
- Chill Zone – Embedded games for team breaks
- Spotify Integration – Shared listening for focused coding sessions

Tech Stack: React, Node.js/Express, MongoDB, Socket.IO, FastAPI, Phi-3, QLoRA
## Team
Built by the CodeSanctum Team for hackathon collaboration.
## License
This adapter is released under the Apache 2.0 License, consistent with the Phi-3 base model license.
## Citation

```bibtex
@software{sanctumai2025,
  title={SanctumAI: A Fine-tuned Phi-3 Coding Co-pilot for Hackathon Collaboration},
  author={CodeSanctum Team},
  year={2025},
  url={https://huggingface.co/CodeSanctum/SanctumAI-v1}
}
```