---
license: apache-2.0
base_model: openai/gpt-oss-20b
tags:
- cybersecurity
- security
- gpt-oss
- openai
- fine-tuned
- merged
- text-generation
- moe
datasets:
- Trendyol/Trendyol-Cybersecurity-Instruction-Tuning-Dataset
- AlicanKiraz0/Cybersecurity-Dataset-Fenrir-v2.0
- trendmicro-ailab/Primus-Instruct
language:
- en
pipeline_tag: text-generation
library_name: transformers
inference: true
---

# GPT-OSS-Cybersecurity-20B-Merged

A fine-tuned version of **openai/gpt-oss-20b** (21B total parameters, 3.6B active, MoE) specialized for **cybersecurity** tasks. The LoRA adapter has been merged into the base weights, so the model deploys like a standard checkpoint (a sketch of a typical merge step appears at the end of this card).

## Model Description

GPT-OSS-20B is a Mixture of Experts (MoE) model: each token is routed to only a subset of experts, so inference compute scales with the 3.6B active parameters rather than the full 21B.

- **Total Parameters**: 21B
- **Active Parameters**: 3.6B (only the routed experts run per token)
- **Architecture**: MoE (Mixture of Experts)

This model was trained on ~50,000 cybersecurity instruction-response pairs from:

- Trendyol Cybersecurity Dataset (35K samples)
- Fenrir v2.0 Dataset (12K samples)
- Primus-Instruct (3x upsampled)

## Training Details

| Parameter | Value |
|-----------|-------|
| Base Model | openai/gpt-oss-20b |
| Architecture | MoE (21B total, 3.6B active) |
| Training Samples | ~50,000 |
| Epochs | 2 |
| LoRA Rank | 16 |
| LoRA Alpha | 32 |
| Learning Rate | 2e-4 |
| Max Sequence Length | 1024 |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged",
    trust_remote_code=True
)

# GPT-OSS is a chat model, so format the prompt with the tokenizer's chat template.
messages = [{"role": "user", "content": "What are the indicators of a ransomware attack?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt", return_dict=True
).to(model.device)

# do_sample=True is required for temperature to take effect.
outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.7, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## API Usage

```python
import requests

# Replace with the URL of your deployed OpenAI-compatible endpoint.
API_URL = "https://YOUR_ENDPOINT_URL/v1/chat/completions"

response = requests.post(API_URL, json={
    "model": "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged",
    "messages": [{"role": "user", "content": "What is SQL injection?"}],
    "max_tokens": 300
})
print(response.json()["choices"][0]["message"]["content"])
```

## Cybersecurity Capabilities

- 🔍 Threat analysis and classification
- 🚨 Security alert triage
- 📋 Incident response guidance
- 🦠 Malware analysis
- 📊 MITRE ATT&CK mapping
- 🔐 Vulnerability assessment
- 💉 SQL injection detection
- 🎣 Phishing analysis
- 🔑 CVE knowledge
- 🛡️ Security best practices

## Hardware Requirements

Recommended setup for the 21B-parameter (MoE) model:

- **GPU**: A100 40GB+ or equivalent
- **VRAM**: 40GB+ for BF16 inference
- For smaller GPUs, use 4-bit quantization (see the sketch below)
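
## 4-Bit Quantization (Sketch)

A minimal sketch of 4-bit loading through the standard `transformers` + `bitsandbytes` quantization config. It assumes `bitsandbytes` is installed and that its NF4 kernels handle this checkpoint's MoE layers; the config values below are common defaults, not settings validated against this model.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
import torch

# Common 4-bit settings (NF4 + double quantization); adjust for your hardware.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged",
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(
    "sainikhiljuluri2015/GPT-OSS-Cybersecurity-20B-Merged",
    trust_remote_code=True
)
```

Generation then works exactly as in the Usage section above, with lower VRAM use and some loss of output quality.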
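
## LoRA Merge (Sketch)

The exact training and merge script for this model is not published here. The sketch below shows the typical `peft` pattern for producing a merged checkpoint like this one; the adapter path is hypothetical.

```python
from transformers import AutoModelForCausalLM
from peft import PeftModel
import torch

ADAPTER_PATH = "path/to/cybersecurity-lora-adapter"  # hypothetical adapter location

# Load the base model and attach the LoRA adapter.
base = AutoModelForCausalLM.from_pretrained(
    "openai/gpt-oss-20b",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True
)
model = PeftModel.from_pretrained(base, ADAPTER_PATH)

# Fold the LoRA deltas into the base weights, drop the adapter wrappers, and save.
merged = model.merge_and_unload()
merged.save_pretrained("GPT-OSS-Cybersecurity-20B-Merged")
```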