Model Card for Suyash1/SIH-TL-NabhaWillRemeberUs

This model is a fine-tuned version of TinyLlama-1.1B-Chat-v1.0, adapted for medical reasoning using the FreedomIntelligence/medical-o1-reasoning-SFT dataset. It is intended to generate responses to medical questions from provided symptoms and conditions, producing step-by-step reasoning in the style of the training data.

Model Details

Model Description

This model is a PEFT (Parameter-Efficient Fine-Tuning) LoRA adaptation of the TinyLlama-1.1B-Chat-v1.0 base model. It was fine-tuned on the English split of the FreedomIntelligence/medical-o1-reasoning-SFT dataset, which contains medical questions and detailed reasoning responses. The fine-tuning was performed using the SFTTrainer from the TRL library with 4-bit quantization. The model is intended to demonstrate the capability of fine-tuning smaller language models for specialized tasks like medical reasoning in resource-constrained environments like Google Colab.

  • Developed by: Suyash Gupta
  • Funded by [optional]: NIT Durgapur
  • Shared by [optional]: Suyash Gupta
  • Model type: Causal Language Model (Fine-tuned LoRA adapter)
  • Language(s) (NLP): English
  • License: Apache-2.0 (inherited from the TinyLlama base model)
  • Finetuned from model [optional]: TinyLlama/TinyLlama-1.1B-Chat-v1.0


Uses

Direct Use

This model can be used for generating text responses to medical questions, particularly those formatted in a question-response structure similar to the training data. It can be used for exploring medical reasoning capabilities of fine-tuned smaller language models.
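
A minimal inference sketch, loading the LoRA adapter on top of the base model with `peft`. The generation settings and example prompt are assumptions; this requires downloading both the base model and the adapter:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "Suyash1/SIH-TL-NabhaWillRemeberUs"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the fine-tuned LoRA adapter to the base model.
model = PeftModel.from_pretrained(model, adapter_id)

# Format the question with the base model's chat template.
messages = [
    {
        "role": "user",
        "content": "A patient presents with fever and a productive cough. "
                   "What are possible causes?",
    }
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

As noted throughout this card, any output should be treated as research output only, not medical advice.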

Downstream Use [optional]

The fine-tuned adapter can be integrated into applications or workflows that generate medical text from prompts. It can serve as a component in larger systems for medical information retrieval or question answering, though the model's small size may limit its accuracy for complex or critical applications.

Out-of-Scope Use

This model is not intended for providing medical advice, diagnosis, or treatment recommendations. It is a research project and should not be used as a substitute for professional medical consultation. Its responses may be inaccurate, incomplete, or misleading. Use in critical medical applications is strongly discouraged. Any use that could lead to harm or misinform individuals about their health is strictly out of scope.

Bias, Risks, and Limitations

  • Limited Medical Knowledge: While fine-tuned on a medical dataset, the base model's general knowledge and the limited size of the adapter may result in incomplete or inaccurate medical information.
  • Data Bias: The model's responses are influenced by the biases present in the training data. The dataset's coverage and phrasing may introduce biases in the generated reasoning and responses.
  • Hallucination: Like all language models, this model may generate information that is factually incorrect or not supported by the input or training data.
  • Safety: The model has not undergone rigorous safety training for medical applications. Its responses should be treated with caution.

Recommendations

Users should be aware of the significant limitations of this model for medical tasks. It is recommended to:

  • Verify Information: Always cross-reference any information generated by the model with reliable medical sources.
  • Do Not Use for Diagnosis or Treatment: This model is not a medical professional and should not be used for making health decisions.
  • Understand Data Limitations: Be aware that the model's performance is limited by the size and nature of the dataset it was fine-tuned on.
  • Responsible Deployment: If deploying this model in an application, include clear disclaimers about its limitations and out-of-scope uses, emphasizing that it is not a substitute for professional medical advice.