Model Description
This model is a fine-tuned version of LLaMA-3 8B Instruct (4-bit quantized), optimized using Direct Preference Optimization (DPO) for answering legal questions related to Brazil’s Consolidation of Labor Laws (CLT). The fine-tuning process leveraged a curated dataset of 736 human-preference triplets, annotated by HR specialists and legal experts, to align the model with domain-specific expectations for accuracy and compliance.
Intended Use
The model is designed for legal question answering in the context of Brazilian labor law, supporting HR departments, compliance teams, and legal professionals. It aims to provide factually accurate and semantically aligned responses to CLT-related queries.
Training Details
- Base Model: LLaMA-3 8B (4-bit quantized)
- Fine-tuning Method: Direct Preference Optimization (DPO)
- Dataset: 736 validated human-preference entries on CLT-related questions
- Hyperparameters:
- Batch size: 2
- Gradient accumulation: 3
- Epochs: 1
- Learning rate: 5e-6
- Optimizer: AdamW 8-bit
Performance Summary
Compared to the base model, this DPO-tuned model achieved:
- +11% improvement in factual accuracy
- Higher semantic similarity scores
- Slight trade-off in fluency and argumentative structure
Ethical Considerations
- Legal Disclaimer: This model does not replace professional legal advice. Users should consult qualified professionals for critical decisions.
- Risk of Misinterpretation: Responses may omit nuances or context-specific interpretations of labor law.
- Data Privacy: The model was trained on synthetic and curated datasets, not on personal or confidential data.
Bias and Fairness
- The dataset was curated by HR and legal experts to minimize bias, but:
- Regional Bias: Focused exclusively on Brazilian CLT; not applicable to other jurisdictions.
- Interpretation Bias: Human annotators’ preferences may reflect subjective interpretations of legal norms.
Limitations
- Domain-specific; performance may degrade outside CLT-related queries.
- BLEU and ROUGE scores remain low due to metric limitations in legal contexts.
- Limited training data may affect generalization to complex or ambiguous cases.
Citation
soon
- Downloads last month
- 3
Model tree for ai-eldorado/Brazilian_CLT_DPO
Base model
meta-llama/Meta-Llama-3-8B-Instruct