---
library_name: transformers
tags:
- Bias Detection
- Text Classification
language:
- en
Author:
- Himel Ghosh
---

# Model Card for roberta-babe

This is a RoBERTa-based binary classification model fine-tuned on the BABE dataset (https://huggingface.co/datasets/mediabiasgroup/BABE) for bias detection in English news statements.
The model predicts whether a given sentence contains biased language (LABEL_1) or is unbiased (LABEL_0).
It is intended for applications in media bias analysis, content moderation, and social computing research.

## Citation

Please cite this Hugging Face model page:

```bibtex
@misc{himel7robertaBabe2025,
  title={RoBERTa Bias Detector (fine-tuned on BABE)},
  author={Himel Ghosh},
  howpublished={\url{https://huggingface.co/himel7/roberta-babe}},
  year={2025}
}
```

## Model Details

### Model Description

This model is a fine-tuned version of roberta-base trained to detect linguistic bias in English-language news statements.
The task is framed as binary classification: the model outputs LABEL_1 for biased statements and LABEL_0 for non-biased statements.

Fine-tuning was performed on the BABE dataset, which contains annotated news snippets across various topics and political leanings.
The annotations focus on whether the language used expresses subjective bias rather than factual reporting.

The model aims to assist in detecting subtle forms of bias in media content, such as emotionally loaded language, stereotypical phrasing, or exaggerated claims. It can be useful in journalistic analysis, media monitoring, or NLP research into framing and stance.

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** Himel Ghosh
- **Language(s) (NLP):** English
- **Finetuned from model:** roberta-base

## Uses

This model is intended to support the detection and analysis of biased language in English news content. It can be used as a tool by:

- **Media researchers** and **social scientists** studying framing, bias, or political discourse.
- **Journalists and editors** aiming to assess the neutrality of their writing or compare outlets.
- **Developers** integrating bias detection into NLP pipelines for content moderation, misinformation detection, or AI-assisted writing tools.

### Foreseeable Uses

- Annotating datasets for bias.
- Measuring bias across different news outlets or topics (see the sketch after this list).
- Serving as an assistive tool in editorial decision-making or media monitoring.
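
As a rough illustration of the outlet-comparison use case, the sketch below scores a handful of sentences per outlet and reports the share flagged as biased. The outlet names and sentences are placeholders, and the checkpoint name himel7/roberta-babe is taken from the citation above.

```python
# Illustrative sketch: estimate the share of sentences flagged as biased per outlet.
# Outlet names and sentences below are placeholders, not real data.
from transformers import pipeline

classifier = pipeline("text-classification", model="himel7/roberta-babe")

samples = {
    "outlet_a": ["Sentence one from outlet A.", "Sentence two from outlet A."],
    "outlet_b": ["Sentence one from outlet B."],
}

biased_share = {}
for outlet, sentences in samples.items():
    preds = classifier(sentences)  # list of {"label": "LABEL_0"/"LABEL_1", "score": float}
    biased_share[outlet] = sum(p["label"] == "LABEL_1" for p in preds) / len(sentences)

print(biased_share)
```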

### Users Affected

- **Content creators** whose work may be labeled as biased or unbiased.
- **End-users** of applications powered by this model (e.g., fact-checking platforms, moderation systems).
- **Marginalized communities**, depending on how bias is defined, interpreted, and acted upon.

The model should be used with care in high-stakes contexts, as bias is inherently subjective and culturally contextual.
It is not intended to replace human judgment but to assist in surfacing potentially biased expressions.

### Direct Use

This model can be used directly for binary classification of English-language news statements to determine whether they exhibit biased language.
It returns one of two labels:

- **LABEL_0** → Non-biased
- **LABEL_1** → Biased

Example usage with Hugging Face's pipeline (the repository name himel7/roberta-babe is taken from the citation above):

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="himel7/roberta-babe")
result = classifier("Immigrants are criminals.")
print(result)  # [{"label": ..., "score": ...}]
```
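
If class probabilities are needed rather than a single label, the model can also be loaded without the pipeline. This is a minimal sketch, again assuming the checkpoint name himel7/roberta-babe and the label mapping LABEL_0 = index 0, LABEL_1 = index 1.

```python
# Minimal sketch of direct inference; checkpoint name and label order are
# assumptions based on this model card, not verified against the repository.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "himel7/roberta-babe"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

inputs = tokenizer("Immigrants are criminals.", return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1).squeeze()
print({"LABEL_0 (non-biased)": round(probs[0].item(), 4),
       "LABEL_1 (biased)": round(probs[1].item(), 4)})
```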

## Bias, Risks, and Limitations

While this model is designed to detect linguistic bias, it carries several limitations and risks, both technical and sociotechnical:

- The model was fine-tuned on the BABE dataset, whose annotations are based on human judgments that may reflect specific cultural or political perspectives.
- It may not generalize well to non-news text or out-of-domain content (e.g., social media, informal writing).
- Subtle forms of bias, sarcasm, irony, or coded language may not be reliably detected.
- Bias is inherently subjective: what one annotator considers biased may be seen as neutral by another. The model reflects those subjective judgments.
- The model does not detect factual correctness or misinformation; it only flags linguistic bias cues.
- Labeling a text as “biased” may have reputational or ethical implications, especially if used in moderation, censorship, or journalistic evaluations.

### Recommendations

Users (both direct and downstream) should be made aware of the risks, biases, and limitations of the model. Its predictions should be treated as signals for human review rather than as definitive judgments, particularly in moderation or editorial settings.

## Training Details

### Training Data

The model was fine-tuned on the BABE dataset: https://huggingface.co/datasets/mediabiasgroup/BABE
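
For reference, the dataset can be loaded with the 🤗 datasets library as sketched below; the exact splits and column names are not documented here and should be checked against the dataset card.

```python
# Sketch of loading the BABE dataset for inspection; splits and column
# names are not specified in this card and should be verified.
from datasets import load_dataset

babe = load_dataset("mediabiasgroup/BABE")
print(babe)  # shows available splits, column names, and row counts
```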

## Evaluation

The model was evaluated on the test split of the BABE dataset and yielded the following metrics:

- **Accuracy: 0.8520**
- **Precision: 0.9237**
- **Recall: 0.8014**
- **F1 Score: 0.8582**

#### Summary

The model achieved 85.2% accuracy on the BABE test split, with a precision of 92.37% and a recall of 80.14%.
The high precision means that statements flagged as biased are rarely false positives, while the lower recall means that roughly one in five biased statements goes undetected.
This means the model predicts very few false positives and detects the biases that are actually biases. |