Model Card for mDeBERTa Policy Detection
A multilingual policy detection model fine-tuned for detecting policy mentions directed towards specific groups in political text.
Model Details
Model Description
This model is a fine-tuned mDeBERTa-v3-base that performs policy classification using Natural Language Inference (NLI) to determine whether political text contains specific policy proposals directed towards target groups.
- Developed by: Will Horne, Alona O. Dolinsky and Lena Maria Huber
- Model type: Sequence Classification (NLI-based policy detection)
- Language(s) (NLP): English, German (multilingual)
- Finetuned from model: MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
Model Sources
- Repository: rwillh11/mdeberta_NLI_policy_noContext
- Base Model: MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7
Uses
Direct Use
The model is designed for researchers analyzing whether political discourse contains policy proposals directed towards specific groups. It takes a political text and a target group as input and classifies whether the text contains a policy directed towards that group (policy/no policy).
Note that the model does not categorize the texts into designated policy areas (e.g., healthcare, education) but rather identifies the presence of any policy directed at the specified group.
Downstream Use
This model can be integrated into larger political text analysis pipelines for:
- Political manifesto analysis
- Policy proposal detection in political communication
- Comparative political research across countries and languages
- Group-targeted policy analysis
Out-of-Scope Use
This model should not be used for:
- General policy detection (not group-specific)
- Categorization of policies into specific policy areas
- Real-time social media monitoring without human oversight
- Making decisions about individuals or groups
- Content moderation without additional validation
Bias, Risks, and Limitations
Technical Limitations
- Trained specifically on political manifesto text; performance may vary on other text types
- Focal sentences without context may lack nuance present in full paragraphs
- Limited to two policy categories (policy, no policy)
Bias Considerations
- Training data consists of political manifestos from specific countries and time periods
- May reflect biases present in political discourse of training data
- Policy detection may vary across different political contexts and group types
Recommendations
Users should be aware that this model:
- Is designed for research purposes in political science
- Should be validated on specific domains before deployment
- May require human oversight for sensitive applications
- May perform differently across different types of groups and political contexts
How to Get Started with the Model
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "rwillh11/mdeberta_NLI_policy_noContext"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example usage
text = "We will increase funding for schools to better support students."
target_group = "students"

# Create hypotheses for each policy class
hypotheses = {
    "policy": f"The text contains a policy directed towards {target_group}.",
    "no policy": f"The text does not contain a policy directed towards {target_group}.",
}

# Get predictions for each hypothesis
results = {}
for policy_class, hypothesis in hypotheses.items():
    # Encode the (premise, hypothesis) pair as a single NLI input
    inputs = tokenizer(text, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=-1)
    entailment_prob = probs[0][0].item()  # Probability of entailment (index 0)
    results[policy_class] = entailment_prob

# Select policy class with highest entailment probability
predicted_class = max(results, key=results.get)
print(f"Predicted policy classification for '{target_group}': {predicted_class}")
Training Details
Training Data
The model was trained on political manifesto data containing:
- Languages: English and German
- Text Type: Political manifesto sentences (focal sentences without context)
- Labels: Two-class policy classification (policy, no policy)
- Groups: Various political target groups (citizens, specific demographics, professions, etc.)
- Original dataset: 7,546 text-group pairs
- English: 4,066 text-group pairs
- German: 3,480 text-group pairs
- Training Size: ~6,037 original texts (80% split)
- Test Size: ~1,509 original texts (20% split)
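The reported split sizes follow from the pair counts above. The sketch below reproduces the arithmetic with a seeded random split; the helper `split_indices` is hypothetical, and the card does not say whether the actual split was stratified or which seed it used, so this is an illustration rather than the authors' procedure.

```python
import random

# Pair counts as reported in this card
N_ENGLISH = 4066
N_GERMAN = 3480
n_total = N_ENGLISH + N_GERMAN  # 7,546 text-group pairs


def split_indices(n, train_fraction=0.8, seed=42):
    """Shuffle indices 0..n-1 with a fixed seed and cut them into train/test.

    Hypothetical helper: a plain random split, not necessarily the one used
    to train the model.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    cut = round(n * train_fraction)
    return indices[:cut], indices[cut:]


train_idx, test_idx = split_indices(n_total)
print(len(train_idx), len(test_idx))  # 6037 1509
```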
Training Procedure
Preprocessing
- Texts tokenized using mDeBERTa tokenizer with max length 512
- NLI format: premise (political text) + hypothesis (policy towards group)
- Each text paired with both true and false hypotheses for binary classification
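A minimal sketch of how one annotated text-group pair could be expanded into two NLI training examples, as described in the list above. The hypothesis wording is taken from the usage example in this card; the function name `build_nli_pairs`, the dictionary keys, and the label id for non-entailment are assumptions made for illustration.

```python
ENTAILMENT = 0      # matches the usage example, which reads index 0 as entailment
NOT_ENTAILMENT = 1  # assumed id for the non-entailment class


def build_nli_pairs(text, target_group, has_policy):
    """Return two NLI examples for one annotated text-group pair.

    The hypothesis that agrees with the annotation is labelled as entailed,
    the opposite hypothesis as not entailed.
    """
    policy_hyp = f"The text contains a policy directed towards {target_group}."
    no_policy_hyp = f"The text does not contain a policy directed towards {target_group}."
    if has_policy:
        true_hyp, false_hyp = policy_hyp, no_policy_hyp
    else:
        true_hyp, false_hyp = no_policy_hyp, policy_hyp
    return [
        {"premise": text, "hypothesis": true_hyp, "label": ENTAILMENT},
        {"premise": text, "hypothesis": false_hyp, "label": NOT_ENTAILMENT},
    ]


pairs = build_nli_pairs(
    "We will increase funding for schools to better support students.",
    "students",
    has_policy=True,
)
for pair in pairs:
    print(pair["label"], pair["hypothesis"])
```

Each premise and hypothesis would then be passed to the tokenizer together, as in the usage example.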
Training Hyperparameters
- Training regime: Mixed precision training
- Optimizer: AdamW with weight decay
- Learning rate: Optimized via Optuna (range: 1e-5 to 4e-5)
- Weight decay: Optimized via Optuna (range: 0.01 to 0.3)
- Warmup ratio: Optimized via Optuna (range: 0.0 to 0.1)
- Epochs: 10 per trial
- Batch size: 16 (train and eval)
- Trials: 20 total
- Metric for selection: F1 Macro
- Seed: 42 (deterministic training)
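The search ranges above could be expressed as an Optuna search space roughly as follows. This is a sketch under assumptions: the card does not state whether values were sampled uniformly or on a log scale (uniform is used here), `suggest_hyperparameters` is a hypothetical name, and `train_and_evaluate` in the comments is a placeholder for the actual training loop.

```python
# Search ranges as reported in this card: (low, high) per hyperparameter
SEARCH_SPACE = {
    "learning_rate": (1e-5, 4e-5),
    "weight_decay": (0.01, 0.3),
    "warmup_ratio": (0.0, 0.1),
}


def suggest_hyperparameters(trial):
    """Sample one configuration from SEARCH_SPACE.

    `trial` is expected to behave like an optuna.trial.Trial, i.e. to expose
    suggest_float(name, low, high).
    """
    return {
        name: trial.suggest_float(name, low, high)
        for name, (low, high) in SEARCH_SPACE.items()
    }


# With Optuna installed, the function would typically be called inside an
# objective that returns the validation F1 Macro to maximize:
#
#   import optuna
#
#   def objective(trial):
#       params = suggest_hyperparameters(trial)
#       return train_and_evaluate(params)  # hypothetical training function
#
#   study = optuna.create_study(
#       direction="maximize",
#       sampler=optuna.samplers.TPESampler(seed=42),
#   )
#   study.optimize(objective, n_trials=20)
```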
Training Infrastructure
- Hardware: CUDA-enabled GPU
- Framework: Transformers, PyTorch
- Hyperparameter optimization: Optuna
- Deterministic training: All random seeds fixed
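Fixing all random seeds usually means seeding every random number generator a training run touches. The helper below is a hedged sketch of that idea, not the authors' code: `seed_everything` is a hypothetical name, and the NumPy and PyTorch calls run only if those libraries are installed.

```python
import os
import random


def seed_everything(seed=42):
    """Seed the Python, NumPy and PyTorch random number generators."""
    random.seed(seed)
    # Only affects interpreters started after this point (e.g. subprocesses)
    os.environ["PYTHONHASHSEED"] = str(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # safe to call without a GPU
    except ImportError:
        pass


seed_everything(42)
first = random.random()
seed_everything(42)
second = random.random()
print(first == second)  # True: the same seed reproduces the same draw
```

The Transformers library also provides `transformers.set_seed(seed)`, which covers the same generators in one call.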
Evaluation
Testing Data, Factors & Metrics
Testing Data
- 20% holdout from original dataset
- Multilingual political manifesto sentences
Factors
The model was evaluated across:
- Languages: English and German text
- Additional validation on held out sets: English, German, Dutch, Danish, Spanish, French, Italian, Swedish

- Policy classes: Policy, no policy
- Group types: Various socio-demographic groups
Metrics
Primary metrics used for evaluation:
- F1 Macro: Primary optimization metric (treats all classes equally)
- Balanced Accuracy: Accounts for class imbalance
- Precision/Recall (Macro): Detailed performance measures
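To make the macro-averaged metrics concrete, the sketch below computes them from scratch on a toy example. It is illustrative only: the labels and predictions are made up, `macro_metrics` is a hypothetical helper, and in practice scikit-learn's `f1_score(..., average="macro")` and `balanced_accuracy_score` compute the same quantities.

```python
def macro_metrics(y_true, y_pred):
    """Compute accuracy and macro-averaged precision, recall and F1.

    Assumes every class appears at least once in y_true. Under that
    assumption, balanced accuracy equals macro recall.
    """
    labels = sorted(set(y_true))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true),
        "precision_macro": sum(precisions) / n,
        "recall_macro": sum(recalls) / n,
        "balanced_accuracy": sum(recalls) / n,  # mean per-class recall
        "f1_macro": sum(f1s) / n,  # unweighted mean of per-class F1
    }


# Toy data: 1 = policy, 0 = no policy (made up for illustration)
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
scores = macro_metrics(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

Because each class contributes equally to the macro average, the minority class weighs as much as the majority class, which is why the macro scores here sit below plain accuracy.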
Results
Best Model Performance (Trial 10, Epoch 9):
- Accuracy: 0.872
- Balanced Accuracy: 0.874
- Precision: 0.869
- Recall: 0.874
- F1 Macro: 0.870
Additional validation on held-out sets returns the following metrics:
English
- Accuracy: 0.860
- Precision: 0.864
- Recall: 0.860
- F1 Macro: 0.859
German (using texts translated from English)
- Accuracy: 0.814
- Precision: 0.814
- Recall: 0.814
- F1 Macro: 0.814
Dutch (using texts translated from English)
- Accuracy: 0.847
- Precision: 0.847
- Recall: 0.847
- F1 Macro: 0.847
Danish (using texts translated from English)
- Accuracy: 0.837
- Precision: 0.837
- Recall: 0.837
- F1 Macro: 0.836
Spanish (using texts translated from English)
- Accuracy: 0.860
- Precision: 0.861
- Recall: 0.860
- F1 Macro: 0.860
French (using texts translated from English)
- Accuracy: 0.837
- Precision: 0.837
- Recall: 0.837
- F1 Macro: 0.837
Italian (using texts translated from English)
- Accuracy: 0.845
- Precision: 0.847
- Recall: 0.845
- F1 Macro: 0.845
Swedish (using texts translated from English)
- Accuracy: 0.863
- Precision: 0.864
- Recall: 0.863
- F1 Macro: 0.863
On the 20% test split the model reaches an F1 Macro of 0.870, and F1 Macro on the additional held-out sets ranges from 0.814 (German) to 0.863 (Swedish). Results are deterministic, as confirmed through multiple prediction runs.
Model Examination
The model uses Natural Language Inference to transform policy detection into a binary entailment task:
- For each text-group pair, generates two hypotheses (policy/no policy)
- Selects the hypothesis with highest entailment probability
- This approach leverages pre-trained NLI capabilities for policy classification
Environmental Impact
Training involved hyperparameter optimization with 20 trials, each training for 10 epochs.
- Hardware Type: CUDA-enabled GPU
- Hours used: Estimated 10-15 hours (including hyperparameter search)
- Cloud Provider: Google Colab
- Compute Region: Variable
- Carbon Emitted: Not precisely measured
Technical Specifications
Model Architecture and Objective
- Base Architecture: mDeBERTa-v3-base (278M parameters)
- Task: Natural Language Inference for policy detection
- Input: Text pair (political sentence + policy hypothesis)
- Output: Binary classification (entailment/non-entailment)
- Objective: Cross-entropy loss, with F1 Macro used as the model selection metric
Compute Infrastructure
Hardware
- GPU-accelerated training (CUDA)
- Mixed precision training support
Software
- Transformers library
- PyTorch framework
- Optuna for hyperparameter optimization
- scikit-learn for metrics
Citation
If you use this model in your research, please cite:
BibTeX:
@misc{mdeberta_policy_nocontext,
title={mDeBERTa Policy Detection Model for Political Group Appeals},
author={Will Horne and Alona O. Dolinsky and Lena Maria Huber},
year={2024},
url={https://huggingface.co/rwillh11/mdeberta_NLI_policy_noContext}
}
Model Card Authors
Research team studying group appeals in political discourse.
Model Card Contact
For questions about this model, please open an issue in the repository or contact the research team through appropriate academic channels.