---
library_name: peft
base_model: meta-llama/Meta-Llama-3-8B-Instruct
pipeline_tag: text-generation
widget:
- example_title: Hi, I am Menstrual Chatbot. How may I help you?
  messages:
  - role: user
    content: Hey! What are common symptoms of menstruation?
inference:
  parameters:
    max_new_tokens: 200
    stop:
    - <|end_of_text|>
    - <|eot_id|>
---

## Model Details

LCS2 developed and released Menstrual-LLaMA, a generative text model with 8B parameters. It is optimized for dialogue use cases. In developing this model, we took great care to optimize helpfulness and safety.

### Model Description

- **Developed by:** LCS2, IIT Delhi
- **Language(s) (NLP):** Multilingual
- **License:** LLaMA3
- **Finetuned from model:** Meta-Llama-3-8B-Instruct
- **Dataset:** https://huggingface.co/datasets/proadhikary/MENST
- **Cite:**

```
Adhikary P, Motiyani I, Oke G, Joshi M, Pathak K, Singh S, Chakraborty T
Menstrual Health Education Using a Specialized Large Language Model in India: Development and Evaluation Study of MenstLLaMA
J Med Internet Res 2025;27:e71977
URL: https://www.jmir.org/2025/1/e71977
DOI: 10.2196/71977
```

## Uses

### Intended Use

This model is fine-tuned on a menstrual health dataset to provide accurate and sensitive responses to queries related to menstrual health.

### Downstream Use

- **Primary Use:** Menstrual health Q&A.
- **Secondary Use:** Educational resources, support groups, and health awareness initiatives.

### Out-of-Scope Use

- **Not Recommended For:** Comprehensive sexual healthcare chatbot functionality and prescribing medications.

## Bias, Risks, and Limitations

While this model strives for accuracy and sensitivity, it is essential to note the following:

- **Biases:** The model might reflect existing biases in the training data.
- **Limitations:** It may not always suggest accurate medications or treatments; professional verification is advised.

### Recommendations

Users, both direct and downstream, should be aware of the model's biases, risks, and limitations. It is recommended to use the model as a supplementary tool rather than a definitive source.

## How to Get Started with the Model

Use the following code snippet to get started with the model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Request access to the model from us, create a token
# (https://huggingface.co/docs/hub/en/security-tokens), and place it here.
# If access is delayed, please drop a mail at proadhikary@ee.iitd.ac.in!
access_token = "hf_..."

model_path = "proadhikary/Menstrual-LLaMA-8B"
tokenizer = AutoTokenizer.from_pretrained(model_path, token=access_token)
model = AutoModelForCausalLM.from_pretrained(model_path, token=access_token)


def model_output(question):
    """Generate a short answer to a menstrual health question."""
    messages = [
        {"role": "system", "content": "Act as an advisor for menstrual health. Do not answer out of Domain(Menstrual Health) question. Generate only short and complete response!"},
        {"role": "user", "content": question},
    ]

    # Render the chat with the Llama 3 template and move it to the model's device.
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)

    # Stop at either the end-of-text token or the end-of-turn token.
    terminators = [tokenizer.eos_token_id, tokenizer.convert_tokens_to_ids("<|eot_id|>")]

    # Disable the memory-efficient and flash scaled-dot-product attention kernels.
    torch.backends.cuda.enable_mem_efficient_sdp(False)
    torch.backends.cuda.enable_flash_sdp(False)

    outputs = model.generate(
        input_ids,
        pad_token_id=tokenizer.pad_token_id,
        max_new_tokens=200,
        eos_token_id=terminators,
        do_sample=True,
        temperature=0.6,
        top_p=0.9,
    )

    # Keep only the newly generated tokens (drop the prompt) and decode them.
    response = outputs[0][input_ids.shape[-1]:]
    return tokenizer.decode(response, skip_special_tokens=True)


# Example usage
input_text = "My mother said not to sleep on bed when I am menstruating, why?"
response = model_output(input_text)
print(response)
```

### Training Data

The dataset used for fine-tuning includes diverse, multilingual content focused on menstrual health. This dataset will be released soon.

#### Preprocessing

Special tokens were added following the Meta Llama 3 prompt format: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/
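
To make the special tokens concrete, here is a minimal sketch of how a conversation is laid out in the Llama 3 Instruct prompt format. The helper name `build_llama3_prompt` is hypothetical and is not part of this repository; in practice `tokenizer.apply_chat_template` does this for you, and the exact preprocessing used for fine-tuning may have differed in detail.

```python
def build_llama3_prompt(messages, add_generation_prompt=True):
    """Render chat messages as a Llama 3 Instruct prompt string.

    Illustrative helper (not from this repository): each turn is wrapped in
    header tokens and closed with <|eot_id|>.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues with its answer.
        parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


example_messages = [
    {"role": "system", "content": "Act as an advisor for menstrual health."},
    {"role": "user", "content": "What are common symptoms of menstruation?"},
]
print(build_llama3_prompt(example_messages))
```

The `<|eot_id|>` token that closes each turn is the same token used as a stop sequence in the widget configuration and as a terminator in the generation snippet.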

#### Training Hyperparameters

- **Training regime:** [learning_rate=2e-4, weight_decay=0.001, fp16=False, bf16=False, max_grad_norm=0.3, max_steps=-1, warmup_ratio=0.03, group_by_length=True, lr_scheduler_type="constant"]
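
The values listed above can be collected into a dictionary and passed to `transformers.TrainingArguments`, as in the sketch below. This is an assumption about how the settings might be wired up, not the training script actually used: the `output_dir` default is a placeholder, the card does not state the number of epochs or the batch size, and the sketch assumes a `transformers` version whose `TrainingArguments` accepts every key shown.

```python
# Hyperparameters copied from the "Training regime" entry in this card.
TRAINING_HPARAMS = {
    "learning_rate": 2e-4,
    "weight_decay": 0.001,
    "fp16": False,
    "bf16": False,
    "max_grad_norm": 0.3,
    "max_steps": -1,  # -1 means training length is set by the epoch count
    "warmup_ratio": 0.03,
    "group_by_length": True,
    "lr_scheduler_type": "constant",
}


def build_training_arguments(output_dir="./results"):
    """Create TrainingArguments from the values above.

    output_dir is a placeholder path chosen for illustration.
    """
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **TRAINING_HPARAMS)
```

With both `fp16` and `bf16` set to `False`, mixed-precision training is disabled.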

Hope this helps! <3