---
library_name: transformers
base_model: UBC-NLP/AraT5v2-base-1024
pipeline_tag: text-classification
tags:
- arabic
- darija
- sentiment-analysis
- text-classification
- tashkeel
- arabict5
---

# AraT5v2-Darja-Sentiment

Fine-tuned version of [`UBC-NLP/AraT5v2-base-1024`](https://huggingface.co/UBC-NLP/AraT5v2-base-1024) for **sentiment analysis** of texts written in **Algerian Arabic (Darja)**, with or without **Tashkīl** (diacritics).

---

## Dataset

The model was trained on a custom dataset containing:

- `tweet`: the original short text in Algerian Arabic
- `text_catt`: the same text with Tashkīl (diacritics) added
- `label`: one of `positive`, `neutral`, `negative`

> The input format used during training:
> `sentiment: [Darja]: <tweet> [Tashkīl]: <text_catt>`

The data comes from the SemEval Task 12 `arq` (Algerian Arabic) subset.

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("Noanihio/arat5v2-darja-sentiment")
tokenizer = AutoTokenizer.from_pretrained("Noanihio/arat5v2-darja-sentiment")

input_text = "sentiment: [Darja]: والله غير كي شفتو فرحت [Tashkīl]: وَاللَّهِ غَيْرُ كَيْ شَفْتُهُ فَرِحْتُ"
inputs = tokenizer(input_text, return_tensors="pt")
outputs = model.generate(**inputs)
label = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(label)  # ➜ positive
```

## Training details

- Model: `UBC-NLP/AraT5v2-base-1024`
- Hardware: Google Colab Pro, T4 GPU
- Epochs: 3
- Batch size: 8
- Learning rate: 5e-5
- Framework: `transformers.Trainer`, full fine-tuning (no LoRA)

## Intended Use

This model is designed for:

- Automatic sentiment classification in Arabic dialects
- Evaluating emotional tone in Darja tweets and messages
- Research in NLP for underrepresented languages (Algerian Arabic)

## Limitations

- The model may be biased toward informal/digital Darja
- Limited generalization to other Arabic dialects
- Tashkīl input can improve results, but is optional

## Acknowledgements

Fine-tuned by @Noanihio with the help of Faiza Belbachir.
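As a minimal sketch of how the training input format described in the Dataset section can be assembled from the dataset fields (`build_input` is a hypothetical helper, not part of the released code):

```python
def build_input(tweet: str, text_catt: str) -> str:
    # Mirror the training prompt: task prefix, raw Darja text,
    # then the diacritized (Tashkīl) version of the same text.
    return f"sentiment: [Darja]: {tweet} [Tashkīl]: {text_catt}"

# Example row from the dataset schema (tweet, text_catt):
example = build_input(
    "والله غير كي شفتو فرحت",
    "وَاللَّهِ غَيْرُ كَيْ شَفْتُهُ فَرِحْتُ",
)
print(example)
```

The resulting string is what gets tokenized and fed to `model.generate`, as in the Usage example above.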