Model Description
This model is a fine-tuned version of LLaMA-3 8B Instruct (4-bit quantized), optimized using Direct Preference Optimization (DPO) for answering legal questions related to Brazil’s Consolidation of Labor Laws (CLT). The fine-tuning process leveraged a curated dataset of 736 human-preference triplets, annotated by HR specialists and legal experts, to align the model with domain-specific expectations for accuracy and compliance.
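DPO trains on preference triplets of the form (prompt, chosen answer, rejected answer). The entry below is a hypothetical illustration of that structure only; the field names and content are assumptions, not samples from the actual dataset.

```python
# Hypothetical DPO preference triplet. The real dataset's schema and
# content are not published on this card; this only illustrates the
# (prompt, chosen, rejected) structure that DPO consumes.
example_triplet = {
    "prompt": "How many vacation days is a CLT employee entitled to per year?",
    "chosen": (
        "Under Article 130 of the CLT, an employee is entitled to 30 "
        "calendar days of vacation after each 12-month qualifying period, "
        "provided they have no more than 5 unjustified absences."
    ),
    "rejected": "CLT employees get 15 vacation days per year.",
}
```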

Intended Use
The model is designed for legal question answering in the context of Brazilian labor law, supporting HR departments, compliance teams, and legal professionals. It aims to provide factually accurate and semantically aligned responses to CLT-related queries.
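A minimal inference sketch with the Hugging Face transformers library follows. It assumes this model's repository id (ai-eldorado/Brazilian_CLT_DPO), loads the weights in 4-bit via bitsandbytes to match the quantization described above, and applies the LLaMA-3 chat template. The exact prompt format used during fine-tuning is not documented, so treat this as a starting point rather than the canonical usage.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "ai-eldorado/Brazilian_CLT_DPO"

# Load in 4-bit, matching the quantization described in this card.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# LLaMA-3 Instruct models use a chat template; apply it to the question.
messages = [
    {"role": "user", "content": "How many vacation days does a CLT employee accrue per year?"}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```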

Training Details

  • Base Model: LLaMA-3 8B Instruct (4-bit quantized)
  • Fine-tuning Method: Direct Preference Optimization (DPO)
  • Dataset: 736 validated human-preference entries on CLT-related questions
  • Hyperparameters (a training sketch follows this list):
    • Batch size: 2
    • Gradient accumulation: 3
    • Epochs: 1
    • Learning rate: 5e-6
    • Optimizer: AdamW 8-bit
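As a rough reconstruction, the hyperparameters above map onto TRL's DPOTrainer as sketched below. The base-model id, dataset file, and output directory are assumptions; the card does not publish the actual training script, and the TRL API (e.g., the processing_class argument) varies across versions.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

# Assumed identifiers; neither the exact base checkpoint nor the dataset
# file is published in machine-readable form on this card.
base_model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
# DPOTrainer expects "prompt", "chosen", and "rejected" columns.
dataset = load_dataset("json", data_files="clt_preferences.jsonl", split="train")

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(base_model_id)

# Hyperparameters copied from the list above.
config = DPOConfig(
    output_dir="clt-dpo",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=3,
    num_train_epochs=1,
    learning_rate=5e-6,
    optim="adamw_bnb_8bit",  # bitsandbytes 8-bit AdamW ("AdamW 8-bit" above)
)

trainer = DPOTrainer(
    model=model,
    args=config,
    train_dataset=dataset,
    processing_class=tokenizer,  # named `tokenizer` in older TRL releases
)
trainer.train()
```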

Performance Summary
Compared to the base model, this DPO-tuned model achieved:

  • +11% improvement in factual accuracy
  • Higher semantic similarity scores (see the measurement sketch after this list)
  • Slight trade-off in fluency and argumentative structure
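The card does not state how semantic similarity was measured. A common approach, shown here as a hedged sketch, is cosine similarity between sentence embeddings; the sentence-transformers model used below is an illustrative choice, not the card's stated method.

```python
from sentence_transformers import SentenceTransformer, util

# Any general-purpose embedding model works for this illustration.
embedder = SentenceTransformer("all-MiniLM-L6-v2")

reference = "An employee is entitled to 30 calendar days of vacation per year."
candidate = "Under the CLT, workers accrue 30 days of annual vacation."

emb = embedder.encode([reference, candidate], convert_to_tensor=True)
score = util.cos_sim(emb[0], emb[1]).item()
print(f"semantic similarity: {score:.3f}")
```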

Ethical Considerations

  • Legal Disclaimer: This model does not replace professional legal advice. Users should consult qualified professionals for critical decisions.
  • Risk of Misinterpretation: Responses may omit nuances or context-specific interpretations of labor law.
  • Data Privacy: The model was trained on synthetic and curated datasets, not on personal or confidential data.

Bias and Fairness

  • The dataset was curated by HR and legal experts to minimize bias, but:
    • Regional Bias: Focused exclusively on Brazilian CLT; not applicable to other jurisdictions.
    • Interpretation Bias: Human annotators’ preferences may reflect subjective interpretations of legal norms.

Limitations

  • Domain-specific; performance may degrade outside CLT-related queries.
  • BLEU and ROUGE scores remain low, reflecting the limitations of n-gram overlap metrics in legal contexts (a computation example follows this list).
  • Limited training data may affect generalization to complex or ambiguous cases.
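For completeness, BLEU and ROUGE can be computed with the Hugging Face evaluate library as shown below; the texts are hypothetical and the snippet is a generic illustration, not the evaluation pipeline used for this model.

```python
import evaluate

bleu = evaluate.load("bleu")
rouge = evaluate.load("rouge")

# Hypothetical prediction/reference pair; real evaluation would iterate
# over a held-out set of CLT questions and gold answers.
predictions = ["Employees accrue 30 calendar days of vacation per year."]
references = ["An employee is entitled to 30 calendar days of vacation each year."]

# BLEU allows multiple references per prediction, hence the nested list.
print(bleu.compute(predictions=predictions, references=[[r] for r in references]))
print(rouge.compute(predictions=predictions, references=references))
```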

Citation

Citation information will be added soon.
