---
license: apache-2.0
language:
- ko
- en
base_model:
- snuh/hari-q3
tags:
- medical
- clinical
- QA
- benchmark
- healthcare
- korean
- reasoning
---

🧠 **Korean Medical LLM (QA-Finetuned) by Healthcare AI Research Institute of Seoul National University Hospital**

**Jun-y00/hari-q3-bnb-4bit** is a **BitsAndBytes 4-bit quantized** version of the Korean-language medical LLM developed by the Healthcare AI Research Institute (HARI) of Seoul National University Hospital. Its main purpose is to support medical question answering (QA) and clinical reasoning.

## 🚀 Model Overview

- **Model Name**: `Jun-y00/hari-q3-bnb-4bit`
- **Architecture**: Large Language Model (LLM)
- **Fine-tuning Objective**: Medical QA (Question–Answer) style generation
- **Primary Languages**: English, Korean
- **Domain**: Clinical Medicine
- **Performance**: Achieves **84.14% accuracy** on the **Korean Medical Licensing Examination (KMLE)**, matching the KorMedMCQA Doctor score in reasoning mode reported below
- **Key Applications**:
  - Clinical decision support (QA-style)
  - Medical education and self-assessment tools
  - Automated medical reasoning and documentation aid

---

## 📊 Training Data & Benchmark

This model was fine-tuned using a curated corpus of Korean medical QA-style data derived from **publicly available, de-identified sources**. The training data includes clinical guidelines, academic publications, exam-style questions, and synthetic prompts reflecting real-world clinical reasoning.

- **Training Data Characteristics**:
  - Focused on Korean-language question–answering formats relevant to clinical settings.
  - Includes guideline-derived questions, de-identified case descriptions, and physician-crafted synthetic queries.
  - Designed to reflect realistic diagnostic, therapeutic, and decision-making scenarios.
- **Benchmark Evaluation**:
  - **KMLE-style QA benchmark (KorMedMCQA)**
    - non-reasoning
      - Doctor: 70.57%
      - Nurse: 81.66%
      - Pharm: 76.61%
      - Dentist: 62.27%
    - reasoning
      - Doctor: 84.14%
      - Nurse: 88.50%
      - Pharm: 85.42%
      - Dentist: 68.56%
  - All evaluations were conducted on de-identified, non-clinical test sets, with no real patient data involved.

> ⚠️ These benchmarks are provided for research purposes only and do not imply clinical safety or efficacy.

---

## 🔐 Privacy & Ethical Compliance

We strictly adhere to ethical AI development and privacy protection:

- ✅ The model was trained exclusively on **publicly available and de-identified data**.
- 🔒 It does **not include any real patient data or personally identifiable information (PII)**.
- ⚖️ Designed for **safe, responsible, and research-oriented** use in healthcare AI.

> ⚠️ This model is intended for **research and educational purposes only** and should **not** be used to make clinical decisions.

---

## 🏥 About HARI – Healthcare AI Research Institute

The **Healthcare AI Research Institute (HARI)** is a pioneering research group within **Seoul National University Hospital**, driving innovation in medical AI.

### 🌍 Vision & Mission

- **Vision**: Shaping a sustainable and healthy future through pioneering AI research.
- **Mission**:
  - Develop clinically useful, trustworthy AI technologies.
  - Foster cross-disciplinary collaboration in medicine and AI.
  - Lead global healthcare AI commercialization and policy frameworks.
  - Educate the next generation of AI-powered medical professionals.

---

## 🧪 Research Platforms & Infrastructure

- **Platforms**: SUPREME, SNUHUB, DeView, VitalDB, NSTRI Global Data Platform
- **Computing**: NVIDIA H100 / A100 GPUs, Quantum AI Infrastructure
- **Projects**:
  - Clinical note summarization
  - AI-powered diagnostics
  - EHR automation
  - Real-time monitoring via AI pipelines

---

## 🎓 AI Education Programs

- **Basic AI for Healthcare**: Designed for clinicians and students
- **Advanced AI Research**: Targeting senior researchers and specialists in clinical AI validation and deep learning

---

## 🤝 Collaborate with Us

We welcome collaboration with:

- AI research institutions and medical universities
- Healthcare startups and technology partners
- Policymakers shaping AI regulation in medicine

📧 **Contact**: [help-ds@snuh.org](mailto:help-ds@snuh.org)

🌐 **Website**: [Seoul National University Hospital](https://www.snuh.org)

---

## 🤗 Model Usage Example

```python
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load tokenizer and model.
# Note: this loads the base model listed under base_model,
# not the 4-bit checkpoint named in this card.
model_name = "snuh/hari-q3"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# English gloss of the Korean prompt below:
#   Instruction: You are a capable, trustworthy Korean-language medical assistant
#   with clinical knowledge. Suggest possible diagnoses based on accurate, careful
#   clinical reasoning. Weigh every clue (age, symptoms, test results, pain
#   location) and present both the reasoning and the diagnosis. Use precise
#   medical terms, adding lay explanations where helpful.
#   Question: A 60-year-old man presents with abdominal pain and fever. Blood tests
#   show an elevated white blood cell count, and right lower quadrant tenderness
#   is found. What is the most likely diagnosis?
prompt = '''
### Instruction:
당신은 임상 지식을 갖춘 유능하고 신뢰할 수 있는 한국어 기반 의료 어시스턴트입니다.
사용자의 질문에 대해 정확하고 신중한 임상 추론을 바탕으로 진단 가능성을 제시해 주세요.
반드시 환자의 연령, 증상, 검사 결과, 통증 부위 등 모든 단서를 종합적으로 고려하여 추론 과정과 진단명을 제시해야 합니다.
의학적으로 정확한 용어를 사용하되, 필요하다면 일반인이 이해하기 쉬운 용어도 병행해 설명해 주세요.

### Question:
60세 남성이 복통과 발열을 호소하며 내원하였습니다.
혈액 검사 결과 백혈구 수치가 상승했고, 우측 하복부 압통이 확인되었습니다.
가장 가능성이 높은 진단명은 무엇인가요?
'''.strip()

# Wrap the prompt as a single user turn and render it with the chat template.
# enable_thinking is forwarded to the model's chat template.
messages = [
    {"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=4096
)

# Drop the prompt tokens so that only the newly generated tokens are decoded.
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

---

## 📄 License

**Apache 2.0 License** – Free for research and commercial use with attribution.

---

## 📢 Citation

If you use this model in your work, please cite:

```
@misc{hari-q3,
  title = {hari-q3},
  url = {https://huggingface.co/snuh/hari-q3},
  author = {Healthcare AI Research Institute(HARI) of Seoul National University Hospital(SNUH)},
  month = {May},
  year = {2025}
}
```

---

## 🚀 Together, we are shaping the future of AI-driven healthcare.
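
---

## 🧩 Loading the 4-bit Checkpoint (Sketch)

The usage example above loads `snuh/hari-q3`. A minimal sketch of loading the 4-bit checkpoint named in this card might look like the following. It rests on assumptions the card does not confirm: that the `Jun-y00/hari-q3-bnb-4bit` repository stores its BitsAndBytes settings in its config, so `from_pretrained` can pick them up without an explicit quantization config, and that `bitsandbytes`, `accelerate`, and a CUDA GPU are available. The helper names `build_prompt`, `load_quantized`, and `generate_answer` are hypothetical and introduced here only for illustration.

```python
def build_prompt(instruction: str, question: str) -> str:
    """Join an instruction and a question in the
    '### Instruction / ### Question' layout used in the usage example."""
    return (
        "### Instruction:\n"
        f"{instruction.strip()}\n\n"
        "### Question:\n"
        f"{question.strip()}"
    )


def load_quantized(model_name: str = "Jun-y00/hari-q3-bnb-4bit"):
    """Load tokenizer and model.

    Assumption: the repository's config already carries the 4-bit
    BitsAndBytes settings, so no explicit quantization config is passed.
    """
    # Imported lazily so build_prompt stays usable without transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
    return tokenizer, model


def generate_answer(tokenizer, model, prompt: str, max_new_tokens: int = 512) -> str:
    """Render the prompt with the chat template and return only the new text."""
    text = tokenizer.apply_chat_template(
        [{"role": "user", "content": prompt}],
        tokenize=False,
        add_generation_prompt=True,
    )
    inputs = tokenizer([text], return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Slice off the prompt tokens before decoding.
    new_tokens = output_ids[0][inputs.input_ids.shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `tokenizer, model = load_quantized()` followed by `generate_answer(tokenizer, model, build_prompt(instruction, question))` would download the checkpoint on first use. If the repository does not ship its quantization settings, an explicit quantization config would be needed instead.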