---
datasets:
- dynabench/dynasent
- devs9/sst3
language:
- en
metrics:
- accuracy
- f1
base_model:
- answerdotai/ModernBERT-base
tags:
- sentiment-analysis
- flash-attention
- modernbert
- trilstm-attn
license: apache-2.0
---
# ModernBERT-TRABSA-CE

Three-way sentiment classifier (negative · neutral · positive) built on ModernBERT-base and fine-tuned with the TRABSA head (mean-pool ➜ BiLSTM ➜ token-attention ➜ MLP) using cross-entropy loss.
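A minimal PyTorch sketch of a head in this spirit, for readers who want to see the shape of the computation: a BiLSTM over the encoder's token states, additive token-attention pooling, then an MLP classifier. The `TRABSAHead` name, all dimensions, and the exact placement of the pooling stages are assumptions for illustration, not the released implementation.

```python
import torch
import torch.nn as nn

class TRABSAHead(nn.Module):
    """Hypothetical sketch: BiLSTM over encoder token states,
    additive token-attention pooling, then an MLP classifier."""
    def __init__(self, hidden=768, lstm_hidden=256, n_classes=3):
        super().__init__()
        self.bilstm = nn.LSTM(hidden, lstm_hidden, batch_first=True,
                              bidirectional=True)
        self.attn = nn.Linear(2 * lstm_hidden, 1)  # per-token attention scores
        self.mlp = nn.Sequential(
            nn.Linear(2 * lstm_hidden, 256), nn.GELU(),
            nn.Linear(256, n_classes))

    def forward(self, token_states, attention_mask):
        out, _ = self.bilstm(token_states)               # (B, T, 2H)
        scores = self.attn(out).squeeze(-1)              # (B, T)
        scores = scores.masked_fill(attention_mask == 0, -1e4)  # ignore padding
        weights = scores.softmax(-1).unsqueeze(-1)       # (B, T, 1)
        pooled = (weights * out).sum(1)                  # (B, 2H)
        return self.mlp(pooled)                          # (B, n_classes)

# smoke test with random "encoder" states
x = torch.randn(2, 16, 768)
mask = torch.ones(2, 16, dtype=torch.long)
logits = TRABSAHead()(x, mask)
print(logits.shape)  # torch.Size([2, 3])
```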
## Model Details

| | |
|---|---|
| **Developer** | I. Bachelis |
| **Model type** | Encoder with task head |
| **Languages** | English |
| **License** | Apache-2.0 |
| **Finetuned from** | answerdotai/ModernBERT-base |
| **Params** | 110 M (backbone) + ≈3 M (head) |
| **Precision** | fp16 (FlashAttention) |
| **Token limit** | 128 |
## Intended Uses

| Use-case | Users |
|---|---|
| Sentiment scoring of short English texts (tweets, reviews) | Practitioners, researchers |
| Feature extractor for downstream ABSA / stance tasks | NLP developers |
### Out-of-scope

- Non-English text; texts longer than 128 tokens; hateful or toxic-speech detection.
## Bias • Risks • Limitations

- The training data come from Yelp-style reviews and Rotten-Tomatoes snippets, so the model is biased toward informal, review-style language.
- Neutral vs. negative remains the weakest decision boundary (see the confusion matrix).
- FlashAttention accelerates convergence; training beyond 2 epochs hurts F1.

**Recommendation:** before deploying on a new domain, run a small domain-adaptive fine-tune and monitor neutral/negative confusion.
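The monitoring step above can be sketched with a plain confusion matrix. The label ids follow this card's mapping; the gold labels and predictions below are invented placeholders, not real evaluation data:

```python
import numpy as np

# 0 = negative, 1 = neutral, 2 = positive (this card's label mapping)
# hypothetical gold labels and model predictions on a held-out domain sample
y_true = np.array([0, 0, 1, 1, 1, 2, 2, 0, 1, 2])
y_pred = np.array([0, 1, 1, 0, 1, 2, 2, 0, 0, 2])

# 3x3 confusion matrix: rows = gold label, columns = predicted label
cm = np.zeros((3, 3), dtype=int)
for t, p in zip(y_true, y_pred):
    cm[t, p] += 1

# the two error rates worth tracking, per the limitation above
neu_as_neg = cm[1, 0] / cm[1].sum()   # true neutral predicted negative
neg_as_neu = cm[0, 1] / cm[0].sum()   # true negative predicted neutral
print(cm)
print(f"neutral->negative: {neu_as_neg:.2f}, negative->neutral: {neg_as_neu:.2f}")
```

If either rate drifts upward on new-domain traffic, that is the signal to run the domain-adaptive fine-tune.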
## How to Use

```python
from transformers import AutoTokenizer, AutoModel
import torch

m = "iabachelis/ModernBERT-TRABSA-CE"
tok = AutoTokenizer.from_pretrained(m)
model = AutoModel.from_pretrained(m, trust_remote_code=True).eval()

text = "The film is visually stunning, but painfully slow."
# The model was trained with a 128-token limit, so truncate accordingly.
inputs = tok(text, return_tensors="pt", truncation=True, max_length=128)
with torch.no_grad():
    probs = model(**inputs).logits.softmax(-1).squeeze()

id2cls = {0: "negative", 1: "neutral", 2: "positive"}
print({id2cls[i]: float(p) for i, p in enumerate(probs)})
```