XLM-RoBERTa-base fine-tuned for Vietnamese NLI
A Vietnamese Natural Language Inference (NLI) model that predicts the relation between a premise and a hypothesis as one of:
`c` (contradiction), `n` (neutral), `e` (entailment)
This model is xlm-roberta-base fine-tuned on a stratified 80/10/10 split, with training optimized to run on a single GPU (Kaggle T4/P100).
Model Details
- Developed by: Lê Lý (MoMo Talent 2025)
- Model type: XLM-RoBERTa encoder for sequence classification (3 labels)
- Languages: Vietnamese (vi)
- License: Inherits from upstream xlm-roberta-base (set the model page license accordingly)
- Finetuned from: `xlm-roberta-base`
Model Sources
- Base model: XLM-RoBERTa (Conneau et al., 2020)
- Training script: Included below in this card (Kaggle-ready)
Uses
Direct Use
- Vietnamese NLI inference for research, demos, or as a component in larger systems (e.g., retrieval/ranking, dialog consistency checks).
Downstream Use
- Fine-tune further on domain-specific VN NLI or related tasks (stance detection, contradiction detection in QA/assistants).
Out-of-Scope Use
- Non-VN text without adaptation.
- Safety-critical decisions without human oversight.
- Open-domain factual verification (this is NLI, not a fact-checker).
Bias, Risks, and Limitations
- Trained on a VN NLI dataset; distributional shift (domain, register, slang, figurative language) may degrade performance.
- NLI labels can be sensitive to annotation style/instructions; avoid over-interpreting borderline cases.
Recommendations: Evaluate on your target domain; monitor confusion between `n` and `e`/`c`; consider calibration or confidence thresholding if the model feeds an automated pipeline (see the sketch below).
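For pipeline use, one simple option is to act only on confident predictions and route the rest to a human or a fallback path. A minimal sketch: the repo id is the same placeholder used in the quick-start below, and the 0.8 threshold and the `abstain` label are illustrative values, not tuned ones.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_id = "YOUR_USERNAME/xlmr-vinli-finetune"  # replace with your repo id
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForSequenceClassification.from_pretrained(model_id)

def predict_or_abstain(premise: str, hypothesis: str, threshold: float = 0.8):
    """Return (label, confidence); label is 'abstain' when confidence < threshold."""
    enc = tok(premise, hypothesis, return_tensors="pt", truncation=True, max_length=256)
    with torch.no_grad():
        probs = mdl(**enc).logits.softmax(-1).squeeze(0)
    conf, idx = probs.max(dim=-1)
    label = mdl.config.id2label[idx.item()]
    return (label if conf.item() >= threshold else "abstain"), conf.item()
```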
How to Get Started
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_id = "YOUR_USERNAME/xlmr-vinli-finetune"  # replace with your repo id
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForSequenceClassification.from_pretrained(model_id)
id2label = mdl.config.id2label  # {0: 'c', 1: 'n', 2: 'e'}

text = {"premise": "Trời đang mưa rất to.", "hypothesis": "Bên ngoài khô ráo và không có mưa."}
enc = tok(text["premise"], text["hypothesis"], return_tensors="pt", truncation=True, max_length=256)
with torch.no_grad():
    logits = mdl(**enc).logits
pred = logits.softmax(-1).argmax(-1).item()
print("Prediction:", id2label[pred])
```
Training Details
Data
- Path (Kaggle): `/kaggle/input/nli-vietnam/full_data_true.json`
- Labels: `{"c": 0, "n": 1, "e": 2}`
- Split: Stratified ~80/10/10 (train/val/test)
Ensure the JSON has the fields `id`, `premise`, `hypothesis`, `label` (labels in {c, n, e}).
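A sketch of how such a stratified split could be produced, assuming the JSON file is a top-level list of records; the random seed and the two-step 80/20 then 50/50 split are assumptions, not values taken from the training run.

```python
import json
from sklearn.model_selection import train_test_split

label2id = {"c": 0, "n": 1, "e": 2}

with open("/kaggle/input/nli-vietnam/full_data_true.json", encoding="utf-8") as f:
    records = json.load(f)

labels = [label2id[r["label"]] for r in records]

# First carve out 20% of the data, then split it half-and-half into val/test,
# stratifying on the label at each step to keep the c/n/e ratios.
train, rest, y_train, y_rest = train_test_split(
    records, labels, test_size=0.2, stratify=labels, random_state=42
)
val, test, y_val, y_test = train_test_split(
    rest, y_rest, test_size=0.5, stratify=y_rest, random_state=42
)
print(len(train), len(val), len(test))
```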
Procedure
Preprocessing
- Tokenizer: `XLMRobertaTokenizerFast` with `max_length=256` and truncation
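A sketch of the pair encoding with `datasets`, assuming the `train` list from the split sketch above; variable names are illustrative.

```python
from datasets import Dataset
from transformers import XLMRobertaTokenizerFast

tokenizer = XLMRobertaTokenizerFast.from_pretrained("xlm-roberta-base")
label2id = {"c": 0, "n": 1, "e": 2}

def tokenize(batch):
    # Premise and hypothesis are encoded as a sentence pair, truncated to 256 tokens.
    return tokenizer(batch["premise"], batch["hypothesis"], truncation=True, max_length=256)

train_ds = (
    Dataset.from_list(train)                           # `train` from the split sketch above
    .map(lambda ex: {"label": label2id[ex["label"]]})  # map c/n/e to 0/1/2
    .map(tokenize, batched=True)
)
```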
Hyperparameters
- Epochs: 4
- Optim: AdamW (via HF Trainer)
- LR: 2e-5
- Weight decay: 0.01
- Warmup ratio: 0.06
- Scheduler: linear
- Batch size: `per_device_train_batch_size=8`, `per_device_eval_batch_size=32`
- Gradient accumulation: 2 (effective train batch ~16)
- Precision: `bf16` if available (Ampere+), else `fp16`
- Label smoothing: 0.05
- Early stopping: patience 2
- Gradient checkpointing: enabled
- Checkpointing: `save_safetensors=True`, `load_best_model_at_end=True` on `f1_macro`
Compute
- Hardware: Single NVIDIA T4/P100 16GB (Kaggle)
- DataLoader: `dataloader_num_workers=2`, `pin_memory=True` (collected with the hyperparameters in the `TrainingArguments` sketch below)
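Taken together, the hyperparameters and dataloader settings above correspond roughly to the following `TrainingArguments`; the `output_dir` and the epoch-level evaluation/save strategy are assumptions not stated in the card.

```python
import torch
from transformers import TrainingArguments

use_bf16 = torch.cuda.is_available() and torch.cuda.is_bf16_supported()

args = TrainingArguments(
    output_dir="xlmr-vinli-finetune",   # assumed output path
    num_train_epochs=4,
    learning_rate=2e-5,
    weight_decay=0.01,
    warmup_ratio=0.06,
    lr_scheduler_type="linear",
    per_device_train_batch_size=8,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=2,      # effective train batch ~16
    bf16=use_bf16,                      # bf16 on Ampere+, else fp16
    fp16=torch.cuda.is_available() and not use_bf16,
    label_smoothing_factor=0.05,
    gradient_checkpointing=True,
    eval_strategy="epoch",              # assumed; the card does not state the interval
    save_strategy="epoch",
    save_safetensors=True,
    load_best_model_at_end=True,
    metric_for_best_model="f1_macro",
    dataloader_num_workers=2,
    dataloader_pin_memory=True,
)
```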
Speeds, Sizes, Times
- Checkpoint size: standard `xlm-roberta-base` plus the classification head
- Exact wall-clock time depends on the GPU; a typical Kaggle session completes within normal time limits.
Evaluation
Metrics & Factors
- Metrics: Accuracy, Macro F1
- Factors: Per-label performance (c, n, e)
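A sketch of the metric computation as a `Trainer` callback; the `f1_macro` key matches `metric_for_best_model` in the `TrainingArguments` sketch above.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1_macro": f1_score(labels, preds, average="macro"),
    }
```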
Results (Test)
Accuracy: 0.9901
Macro F1: 0.9878
Support: 1113 samples (c=429, n=108, e=576)
Classification Report:

| label | precision | recall | f1-score | support |
|---|---|---|---|---|
| c | 0.9930 | 0.9883 | 0.9907 | 429 |
| n | 0.9815 | 0.9815 | 0.9815 | 108 |
| e | 0.9896 | 0.9931 | 0.9913 | 576 |
| weighted avg | 0.9901 | 0.9901 | 0.9901 | 1113 |
Confusion Matrix (rows = true label, columns = predicted label, order c/n/e):
```
[[424   0   5]
 [  1 106   1]
 [  2   2 572]]
```
Note: Reproduced numbers may vary slightly due to randomness/hardware.
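To regenerate the report and confusion matrix on your own split, predictions from a trained `Trainer` can be fed to scikit-learn; `trainer` and `test_ds` are placeholder names for objects built as in the sketches above.

```python
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

pred_out = trainer.predict(test_ds)   # `trainer`/`test_ds`: placeholders from the sketches above
y_pred = np.argmax(pred_out.predictions, axis=-1)
y_true = pred_out.label_ids

print(classification_report(y_true, y_pred, target_names=["c", "n", "e"]))
print(confusion_matrix(y_true, y_pred))
```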
Environmental Impact
- Hardware: Single T4/P100 16GB (Kaggle)
- Cloud Provider/Region: Kaggle (unspecified)
- Hours used: Not logged
- Carbon Emitted: Not estimated
- You can estimate with the MLCO2 Impact calculator.
Technical Specifications
Architecture & Objective
- Backbone: XLM-RoBERTa Base
- Head: Linear classification (3 labels)
- Objective: Cross-entropy with label smoothing (0.05); optional class weighting (off by default; see the sketch below)
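Class weighting is off by default; if needed (e.g. because `n` has far fewer examples than `c`/`e`), one way to enable it is a small `Trainer` subclass. A sketch only; the weight values shown are placeholders, not values used for this model.

```python
import torch
from torch import nn
from transformers import Trainer

class WeightedLossTrainer(Trainer):
    """Trainer variant that applies per-class weights in the cross-entropy loss."""

    def __init__(self, *args, class_weights=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.class_weights = class_weights  # e.g. torch.tensor([1.0, 2.0, 1.0]) for c/n/e

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        weight = None
        if self.class_weights is not None:
            weight = self.class_weights.to(outputs.logits.device)
        loss_fct = nn.CrossEntropyLoss(weight=weight, label_smoothing=0.05)
        loss = loss_fct(outputs.logits, labels)
        return (loss, outputs) if return_outputs else loss
```

Because this overrides `compute_loss`, the default `Trainer` label smoothing is bypassed, so the smoothing factor is passed to the loss directly here.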
Software
- `transformers==4.43.3`
- `datasets==2.21.0`
- `accelerate==0.33.0`
- `evaluate==0.4.2`
- `scikit-learn==1.5.1`
- `torch` (CUDA)
Citation
XLM-RoBERTa
@inproceedings{conneau2020unsupervised,
title={Unsupervised Cross-lingual Representation Learning at Scale},
author={Conneau, Alexis and Khandelwal, Kartikay and Goyal, Naman and Chaudhary, Vishrav and Wenzek, Guillaume and Guzm{\'a}n, Francisco and Grave, Edouard and Ott, Myle and Zettlemoyer, Luke and Stoyanov, Veselin},
booktitle={ACL},
year={2020}
}
Contact
Author: Lê Lý