Model Card for mDeBERTa Policy Detection

A multilingual model fine-tuned to detect policy mentions directed towards specific groups in political text.

Model Details

Model Description

This model is a fine-tuned mDeBERTa-v3-base that performs policy classification using Natural Language Inference (NLI) to determine whether political text contains specific policy proposals directed towards target groups.

  • Developed by: Will Horne, Alona O. Dolinsky and Lena Maria Huber
  • Model type: Sequence Classification (NLI-based policy detection)
  • Language(s) (NLP): English, German (multilingual)
  • Finetuned from model: MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7

Uses

Direct Use

The model is designed for researchers analyzing whether political discourse contains policy proposals directed towards specific groups. It takes a political text and a target group as input and classifies whether the text contains a policy directed towards that group (policy / no policy).
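
For example, the sentence "We will increase funding for schools to better support students," paired with the target group "students," should be classified as policy, whereas a purely rhetorical appeal to students would be classified as no policy.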

Note that the model does not categorize the texts into designated policy areas (e.g., healthcare, education) but rather identifies the presence of any policy directed at the specified group.

Downstream Use

This model can be integrated into larger political text analysis pipelines for:

  • Political manifestos analysis
  • Policy proposal detection in political communication
  • Comparative political research across countries and languages
  • Group-targeted policy analysis

Out-of-Scope Use

This model should not be used for:

  • General policy detection (not group-specific)
  • Categorization of policies into specific policy areas
  • Real-time social media monitoring without human oversight
  • Making decisions about individuals or groups
  • Content moderation without additional validation

Bias, Risks, and Limitations

Technical Limitations

  • Trained specifically on political manifesto text; performance may vary on other text types
  • Focal sentences are classified without surrounding context, so nuance present in full paragraphs may be lost
  • Limited to two policy categories (policy, no policy)

Bias Considerations

  • Training data consists of political manifestos from specific countries and time periods
  • May reflect biases present in political discourse of training data
  • Policy detection may vary across different political contexts and group types

Recommendations

Users should be aware that this model:

  • Is designed for research purposes in political science
  • Should be validated on specific domains before deployment
  • May require human oversight for sensitive applications
  • Performance may vary across different types of groups and political contexts

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load model and tokenizer
model_name = "rwillh11/mdeberta_NLI_policy_noContext"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Example usage
text = "We will increase funding for schools to better support students."
target_group = "students"

# Create hypotheses for each policy class
hypotheses = {
    "policy": f"The text contains a policy directed towards {target_group}.",
    "no policy": f"The text does not contain a policy directed towards {target_group}."
}

# Get predictions for each hypothesis
results = {}
for policy_class, hypothesis in hypotheses.items():
    inputs = tokenizer(text, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = model(**inputs)
        probs = torch.softmax(outputs.logits, dim=-1)
        entailment_prob = probs[0][0].item()  # Probability of entailment (label index 0)
        results[policy_class] = entailment_prob

# Select policy class with highest entailment probability
predicted_class = max(results, key=results.get)
print(f"Predicted policy classification for '{target_group}': {predicted_class}")

Training Details

Training Data

The model was trained on political manifesto data containing:

  • Languages: English and German
  • Text Type: Political manifesto sentences (focal sentences without context)
  • Labels: Two-class policy classification (policy, no policy)
  • Groups: Various political target groups (citizens, specific demographics, professions, etc.)
  • Original dataset: 7,546 text-group pairs
    • English: 4,066 text-group pairs
    • German: 3,480 text-group pairs
  • Training size: ~6,037 text-group pairs (80% split)
  • Test size: ~1,509 text-group pairs (20% split)

Training Procedure

Preprocessing

  • Texts tokenized using mDeBERTa tokenizer with max length 512
  • NLI format: premise (political text) + hypothesis (policy towards group)
  • Each text was paired with both a true and a false hypothesis for binary classification (sketched below)
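
A minimal sketch of this pairing step (not the authors' exact preprocessing script; the function name and label mapping are assumptions):

def make_nli_examples(text, group, has_policy):
    """Expand one labeled text-group pair into two NLI training examples."""
    policy_hyp = f"The text contains a policy directed towards {group}."
    no_policy_hyp = f"The text does not contain a policy directed towards {group}."
    # Assumed label mapping: 0 = entailment, 1 = non-entailment
    true_hyp, false_hyp = (
        (policy_hyp, no_policy_hyp) if has_policy else (no_policy_hyp, policy_hyp)
    )
    return [
        {"premise": text, "hypothesis": true_hyp, "label": 0},
        {"premise": text, "hypothesis": false_hyp, "label": 1},
    ]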

Training Hyperparameters

  • Training regime: Mixed precision training
  • Optimizer: AdamW with weight decay
  • Learning rate: Optimized via Optuna (range: 1e-5 to 4e-5)
  • Weight decay: Optimized via Optuna (range: 0.01 to 0.3)
  • Warmup ratio: Optimized via Optuna (range: 0.0 to 0.1)
  • Epochs: 10 per trial
  • Batch size: 16 (train and eval)
  • Trials: 20 total (search setup sketched below)
  • Metric for selection: F1 Macro
  • Seed: 42 (deterministic training)
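
A minimal sketch of this search using the Trainer's built-in Optuna backend. The dataset variables, the binary head configuration, and the eval_f1_macro metric key are assumptions; the authors' actual training script may differ:

from transformers import (
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

def hp_space(trial):
    # Search ranges as listed above
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 4e-5, log=True),
        "weight_decay": trial.suggest_float("weight_decay", 0.01, 0.3),
        "warmup_ratio": trial.suggest_float("warmup_ratio", 0.0, 0.1),
    }

def model_init():
    # Assumed: base NLI checkpoint re-headed for binary entailment
    return AutoModelForSequenceClassification.from_pretrained(
        "MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7",
        num_labels=2,
        ignore_mismatched_sizes=True,
    )

args = TrainingArguments(
    output_dir="hp_search",
    num_train_epochs=10,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    fp16=True,  # mixed precision
    seed=42,
    evaluation_strategy="epoch",
)
trainer = Trainer(
    model_init=model_init,
    args=args,
    train_dataset=train_ds,  # assumed: tokenized premise-hypothesis pairs
    eval_dataset=eval_ds,
    compute_metrics=compute_metrics,  # see the Metrics section below
)
best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    backend="optuna",
    n_trials=20,
    direction="maximize",
    compute_objective=lambda m: m["eval_f1_macro"],
)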

Training Infrastructure

  • Hardware: CUDA-enabled GPU
  • Framework: Transformers, PyTorch
  • Hyperparameter optimization: Optuna
  • Deterministic training: All random seeds fixed

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • 20% holdout from original dataset
  • Multilingual political manifesto sentences

Factors

The model was evaluated across:

  • Languages: English and German text
  • Additional validation on held-out sets: English, German, Dutch, Danish, Spanish, French, Italian, Swedish
  • Policy classes: policy, no policy
  • Group types: various socio-demographic groups

Metrics

Primary metrics used for evaluation:

  • F1 Macro: Primary optimization metric (treats all classes equally)
  • Balanced Accuracy: Accounts for class imbalance
  • Precision/Recall (Macro): Detailed per-class performance measures (computation sketched below)
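
These can be computed with scikit-learn, for example as a compute_metrics function for the Trainer (a sketch; the metric key names are assumptions, chosen to match the eval_f1_macro key used in the hyperparameter-search sketch above, since the Trainer prefixes these keys with eval_ when reporting):

from sklearn.metrics import (
    balanced_accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = logits.argmax(axis=-1)  # predicted class per example
    return {
        "f1_macro": f1_score(labels, preds, average="macro"),
        "balanced_accuracy": balanced_accuracy_score(labels, preds),
        "precision_macro": precision_score(labels, preds, average="macro"),
        "recall_macro": recall_score(labels, preds, average="macro"),
    }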

Results

Best Model Performance (Trial 10, Epoch 9):

  • Accuracy: 0.872
  • Balanced Accuracy: 0.874
  • Precision: 0.869
  • Recall: 0.874
  • F1 Macro: 0.870

Additional validation on the held-out sets yielded the following metrics:

Language    Accuracy    Precision    Recall    F1 Macro
English     0.860       0.864        0.860     0.859
German      0.814       0.814        0.814     0.814
Dutch       0.847       0.847        0.847     0.847
Danish      0.837       0.837        0.837     0.836
Spanish     0.860       0.861        0.860     0.860
French      0.837       0.837        0.837     0.837
Italian     0.845       0.847        0.845     0.845
Swedish     0.863       0.864        0.863     0.863

All non-English validation sets use texts translated from English.

The model demonstrates strong performance across policy categories with deterministic results confirmed through multiple prediction runs.

Model Examination

The model uses Natural Language Inference to transform policy detection into a binary entailment task:

  • For each text-group pair, generates two hypotheses (policy/no policy)
  • Selects the hypothesis with highest entailment probability
  • This approach leverages pre-trained NLI capabilities for policy classification (a compact sketch follows)
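
A compact, batched variant of the snippet from "How to Get Started with the Model," scoring both hypotheses in a single forward pass (a sketch; it assumes, as that snippet does, that label index 0 is entailment):

import torch

def classify(text, group, tokenizer, model):
    """Score both hypotheses for one text-group pair in one batch."""
    hypotheses = [
        f"The text contains a policy directed towards {group}.",
        f"The text does not contain a policy directed towards {group}.",
    ]
    enc = tokenizer(
        [text, text], hypotheses, return_tensors="pt", truncation=True, padding=True
    )
    with torch.no_grad():
        probs = torch.softmax(model(**enc).logits, dim=-1)
    # Compare the entailment probability (assumed index 0) of each hypothesis
    return "policy" if probs[0, 0] > probs[1, 0] else "no policy"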

Environmental Impact

Training involved hyperparameter optimization with 20 trials, each training for 10 epochs.

  • Hardware Type: CUDA-enabled GPU
  • Hours used: Estimated 10-15 hours (including hyperparameter search)
  • Cloud Provider: Google Colab
  • Compute Region: Variable
  • Carbon Emitted: Not precisely measured

Technical Specifications

Model Architecture and Objective

  • Base Architecture: mDeBERTa-v3-base (278M parameters; see the check below)
  • Task: Natural Language Inference for policy detection
  • Input: Text pair (political sentence + policy hypothesis)
  • Output: Binary classification (entailment/non-entailment)
  • Objective: Cross-entropy loss, with F1 Macro used for model selection
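
A quick way to verify the parameter count (a sanity check, not from the original card):

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "rwillh11/mdeberta_NLI_policy_noContext"
)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")  # expected: roughly 278M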

Compute Infrastructure

Hardware

  • GPU-accelerated training (CUDA)
  • Mixed precision training support

Software

  • Transformers library
  • PyTorch framework
  • Optuna for hyperparameter optimization
  • scikit-learn for metrics

Citation

If you use this model in your research, please cite:

BibTeX:

@misc{mdeberta_policy_nocontext,
  title={mDeBERTa Policy Detection Model for Political Group Appeals},
  author={Will Horne and Alona O. Dolinsky and Lena Maria Huber},
  year={2024},
  url={https://huggingface.co/rwillh11/mdeberta_NLI_policy_noContext}
}

Model Card Authors

Research team studying group appeals in political discourse.

Model Card Contact

For questions about this model, please open an issue in the repository or contact the research team through appropriate academic channels.
