🩺 Pars-Medical-o1-Llama-FFT-GGUF

⚠️ This is a beta version and cannot yet respond correctly when run locally on its own. ⚠️

⚠️ The full version will be released in the Pars-Medical application. ⚠️

Quantized Version for Local Inference

📦 About This Repo

This repo contains GGUF quantized versions of Pars-Medical-o1-Llama-FFT.

They are suitable for local inference with Ollama, LM Studio, or llama.cpp.

🔗 Looking for the full model? See Pars-Medical-o1-Llama-FFT (BF16, 6.4 GB).

📊 Available Quantizations

Quantized on an NVIDIA A100 80GB SXM using llama.cpp.

File	Size	VRAM	Quality	Best For
`Pars-Medical-o1-Llama-FFT-Q4_K_M.gguf`	~2.0 GB	~4 GB	⭐⭐⭐ Good	💻 Most users, CPU/low VRAM
`Pars-Medical-o1-Llama-FFT-Q5_K_M.gguf`	~2.5 GB	~5 GB	⭐⭐⭐⭐ Better	⚖️ Balanced quality/speed
`Pars-Medical-o1-Llama-FFT-Q8_0.gguf`	~3.5 GB	~6 GB	⭐⭐⭐⭐⭐ Best	🎯 High quality inference

💡 Which one to choose?

Q4_K_M → Smallest file, suitable for low-end systems
Q5_K_M → Balance between quality and speed
Q8_0 → Best quality, closest to the original model
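
The selection rule above can be sketched as a small helper. This is only an illustration: the figures are the approximate values from the table, and the function names `pick_quantization` and `model_filename` are hypothetical, not part of any library.

```python
# Approximate figures copied from the table: (name, file size in GB, VRAM in GB),
# ordered from highest to lowest quality.
QUANTIZATIONS = [
    ("Q8_0", 3.5, 6.0),
    ("Q5_K_M", 2.5, 5.0),
    ("Q4_K_M", 2.0, 4.0),
]


def pick_quantization(available_vram_gb: float) -> str:
    """Return the highest-quality quantization whose listed VRAM fits."""
    for name, _size_gb, vram_gb in QUANTIZATIONS:
        if available_vram_gb >= vram_gb:
            return name
    # Nothing fits: fall back to the smallest file, which the table
    # recommends for CPU and low-VRAM systems.
    return "Q4_K_M"


def model_filename(quant: str) -> str:
    """Build the GGUF file name used in this repo for a quantization."""
    return f"Pars-Medical-o1-Llama-FFT-{quant}.gguf"
```

For example, `model_filename(pick_quantization(5.5))` selects the Q5_K_M file. Treat the VRAM thresholds as rough guidance, since real memory use also depends on context length.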

🚀 Quick Start

1️⃣ Ollama (Recommended)

# Direct from HuggingFace
ollama run hf.co/HexQuant/Pars-Medical-o1-Llama-FFT-GGUF

# Or with specific quantization
ollama run hf.co/HexQuant/Pars-Medical-o1-Llama-FFT-GGUF:Q4_K_M
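
Once the model has been pulled, it can also be queried from a script through Ollama's local HTTP API. The sketch below assumes the Ollama server is running on its default port (11434) and that the model tag matches the one used in the command above; the helper names `build_chat_payload` and `ask` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is serving on its default local address.
OLLAMA_URL = "http://localhost:11434/api/chat"
MODEL = "hf.co/HexQuant/Pars-Medical-o1-Llama-FFT-GGUF:Q4_K_M"


def build_chat_payload(question: str, model: str = MODEL) -> dict:
    """Build a non-streaming chat request body for Ollama's /api/chat."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "stream": False,  # ask for one JSON object instead of a stream
    }


def ask(question: str) -> str:
    """Send the question to the local Ollama server and return the reply text."""
    data = json.dumps(build_chat_payload(question)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["message"]["content"]
```

Calling `ask("What are the symptoms of diabetes?")` returns the model's reply as a string, provided the server is up and the model is available locally.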

2️⃣ LM Studio

Open LM Studio.
Go to the Search tab.
Search for HexQuant/Pars-Medical-o1-Llama-FFT-GGUF.
Download your preferred quantization.
Start chatting! 💬

3️⃣ llama.cpp

# Download the model
wget https://huggingface.co/HexQuant/Pars-Medical-o1-Llama-FFT-GGUF/resolve/main/Pars-Medical-o1-Llama-FFT-Q4_K_M.gguf

# Run inference
./llama-cli \
  -m Pars-Medical-o1-Llama-FFT-Q4_K_M.gguf \
  -p "What are the symptoms of diabetes?" \
  -n 512 \
  --temp 0.7
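
If the same invocation needs to be scripted, it can be wrapped with Python's `subprocess` module. This is a minimal sketch, assuming `llama-cli` sits in the current directory as in the command above; `build_llama_cli_args` and `run_llama_cli` are hypothetical helper names.

```python
import subprocess


def build_llama_cli_args(
    model_path: str,
    prompt: str,
    n_predict: int = 512,
    temperature: float = 0.7,
    binary: str = "./llama-cli",
) -> list[str]:
    """Build the argument list for the llama-cli call shown above."""
    return [
        binary,
        "-m", model_path,          # path to the GGUF file
        "-p", prompt,              # prompt text
        "-n", str(n_predict),      # maximum number of tokens to generate
        "--temp", str(temperature),
    ]


def run_llama_cli(model_path: str, prompt: str) -> str:
    """Run llama-cli once and return whatever it writes to stdout."""
    result = subprocess.run(
        build_llama_cli_args(model_path, prompt),
        check=True,           # raise if llama-cli exits with an error
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain quotes or non-ASCII text.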

4️⃣ Python (llama-cpp-python)

from llama_cpp import Llama

# Download and load the model from HuggingFace
# (from_pretrained requires the huggingface-hub package)
llm = Llama.from_pretrained(
    repo_id="HexQuant/Pars-Medical-o1-Llama-FFT-GGUF",
    filename="Pars-Medical-o1-Llama-FFT-Q4_K_M.gguf",
    n_ctx=4096,       # context window in tokens
    n_gpu_layers=-1,  # offload all layers to the GPU (use 0 for CPU only)
)

# Chat
response = llm.create_chat_completion(
    messages=[
        # Persian prompt: "What are the symptoms of type 2 diabetes?"
        {"role": "user", "content": "علائم دیابت نوع ۲ چیست؟"}
    ],
    temperature=0.7,
    max_tokens=512,
)

print(response["choices"][0]["message"]["content"])

💬 Examples

🇮🇷 Persian Example (translated)

Question:

What are the symptoms of hypothyroidism?

Model Response:

<think> Let's review the symptoms of hypothyroidism:

Reduced body metabolism

Effects on multiple body systems </think>

Main symptoms of hypothyroidism:

Chronic fatigue - a constant feeling of tiredness

Weight gain - despite a normal diet

Cold intolerance - feeling colder than usual

Constipation - reduced bowel movements

Dry skin - dry, rough skin

Hair loss - thinning hair

🇺🇸 English Example

Question:

What are the warning signs of a heart attack?

Model Response:

<think> Let me outline the classic and atypical presentations of myocardial infarction. </think>

Warning Signs of Heart Attack:

Classic Symptoms:

Chest pain/pressure (may radiate to arm, jaw, back)

Shortness of breath

Cold sweats

Nausea/vomiting

Atypical Symptoms (more common in women):

Unusual fatigue

Indigestion-like discomfort

Dizziness

⚠️ If you experience these symptoms, call emergency services immediately!
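
As the examples show, responses open with a reasoning section wrapped in `<think>` tags. An application that only wants to display the final answer could separate the two parts. The following is a minimal sketch, assuming the tags always appear as a matched `<think>...</think>` pair; `split_reasoning` is a hypothetical helper name.

```python
import re

# Non-greedy match so that multiple <think> blocks are handled separately;
# DOTALL lets the reasoning span several lines.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(text: str) -> tuple[str, str]:
    """Split a model response into (reasoning, answer)."""
    reasoning = "\n".join(part.strip() for part in THINK_BLOCK.findall(text))
    answer = THINK_BLOCK.sub("", text).strip()
    return reasoning, answer
```

With llama-cpp-python, the input would be `response["choices"][0]["message"]["content"]`. A response without any `<think>` block is returned unchanged as the answer, with an empty reasoning string.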

🔧 Technical Details

Parameter	Value
Original Model	Pars-Medical-o1-Llama-FFT
Base Architecture	Llama-3.2-3B-Instruct
Parameters	3.2B
Context Length	4096 tokens
Quantization Tool	llama.cpp
Quantization Hardware	NVIDIA A100 80GB SXM
Languages	English, Persian (Farsi)
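
The file sizes listed for each quantization follow roughly from the parameter count and the number of bits stored per weight. The sketch below is a back-of-the-envelope estimate only. BF16 uses 16 bits per weight and Q8_0 uses 8.5 (32 int8 weights plus one 16-bit scale per block); the K-quant figures are assumed approximations, and real GGUF files mix tensor types, so actual sizes differ somewhat.

```python
PARAMS = 3.2e9  # parameter count from the table above

# Bits stored per weight. The K-quant values are approximate assumptions.
BITS_PER_WEIGHT = {
    "BF16": 16.0,
    "Q8_0": 8.5,     # 34 bytes per block of 32 weights
    "Q5_K_M": 5.7,   # assumed approximation
    "Q4_K_M": 4.85,  # assumed approximation
}


def estimated_size_gb(quant: str, params: float = PARAMS) -> float:
    """Estimate file size in GB as params * bits_per_weight / 8 bytes."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 1e9


for quant in BITS_PER_WEIGHT:
    print(f"{quant}: ~{estimated_size_gb(quant):.2f} GB")
```

The estimates land close to the sizes listed for this repo, including the 6.4 GB figure for the BF16 model.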

⚠️ Disclaimer

⚠️ This model is for educational and research purposes only.

❌ Not a substitute for professional medical advice.

✅ Always consult with a qualified healthcare provider.

🔗 Related Links

Resource	Link
🤗 Main Model (BF16)	Pars-Medical-o1-Llama-FFT
📦 This Repo (GGUF)	Pars-Medical-o1-Llama-FFT-GGUF
🦙 Ollama	ollama.ai
💻 LM Studio	lmstudio.ai
🔧 llama.cpp	GitHub

🔗 Connect

Made with ❤️ by TheRealPourya

If you find this model useful, please ⭐ star the repo and share it!
