---
license: cc-by-4.0
language:
- en
- es
- fr
- de
metrics:
- accuracy
- f1
tags:
- misinformation
- engineering
- robustness
- adversarial
datasets:
- Stevebankz/Emc
---
# Model Card for Engineering Misinformation Detection Models
This repository provides a suite of models trained on the **Engineering Misinformation Corpus (EMC)**, introduced in our paper *"The Brittleness of Transformer Feature Fusion: A Comparative Study of Model Robustness in Engineering Misinformation Detection"* (2025).
The models are designed to detect **AI-generated misinformation** in **safety-critical engineering documents**. Each represents a different paradigm: traditional feature-based learning, end-to-end Transformers, and hybrid fusion models.
---
## Model Details
### Models Included
1. **XLM-R (Text Only)**
- Transformer-based classifier (XLM-RoBERTa) trained end-to-end on raw text.
- Files: `config.json`, `model.safetensors`, `sentencepiece.bpe.model`, `tokenizer_config.json`, `tokenizer.json`.
2. **Simple Fusion (Text + Features, Concatenation)**
- Combines XLM-R embeddings with a 12-dimensional engineered feature set via naive concatenation.
- Files: `fusion_simple.pt`, `scaler.pkl`.
3. **Gated Fusion (Text + Features, Dynamic Arbitration)**
- Combines XLM-R embeddings with engineered features using a gating mechanism to dynamically arbitrate signals.
- Files: `fusion_gated.pt`, `scaler.pkl`.
4. **XGBoost + Features (Feature-Only Baseline)**
- Lightweight tree-based model trained solely on 12D engineered features.
- Files: `xgb_model.json`, `scaler.pkl`.
- **Developed by:** Steve Nwaiwu and collaborators
- **Model type:** Comparative benchmark (feature-based, Transformer-based, and hybrid fusion approaches)
- **Languages:** Primarily English, with experimental subsets in ES/FR/DE
- **License:** CC-BY 4.0
- **Finetuned from:** `xlm-roberta-base` (for text-based components)
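
The per-model files listed above can be fetched individually from the Hugging Face Hub. The snippet below is a minimal download sketch using `huggingface_hub`; the repository id `your-username/emc-models` is the same placeholder used in the examples later in this card, and the flat file layout is an assumption to be adjusted to the actual repository structure.

```python
# Download sketch (placeholder repo id and assumed flat file layout).
from huggingface_hub import hf_hub_download

repo_id = "your-username/emc-models"  # placeholder, replace with the real repo id

scaler_path = hf_hub_download(repo_id=repo_id, filename="scaler.pkl")
xgb_path = hf_hub_download(repo_id=repo_id, filename="xgb_model.json")
gated_path = hf_hub_download(repo_id=repo_id, filename="fusion_gated.pt")
print(scaler_path, xgb_path, gated_path)
```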
---
## Uses
### Direct Use
- **Detect AI-generated misinformation** in engineering and technical documentation.
- Benchmark adversarial robustness across modeling paradigms.
- Educational use for comparing model interpretability and brittleness.
### Downstream Use
- Adaptation to domain-specific technical corpora (e.g., biomedical, aerospace, safety reports).
- Fusion-based approaches can be extended with domain-specific structured features.
### Out-of-Scope Use
- Not suitable for **general misinformation detection** (e.g., politics, social media).
- Not validated for **non-technical or informal text domains**.
- Should not be used as a standalone safety system without human oversight.
---
## Bias, Risks, and Limitations
- **Bias in training corpus**: While curated for engineering, domain coverage may be uneven across subfields.
- **Fusion brittleness**: Naive feature fusion models fail under semantic adversarial attacks (e.g., synonym swaps).
- **Language limits**: Non-English results were confounded by dataset artifacts; multilingual robustness is not yet validated.
### Recommendations
- Use **XLM-R** when robustness to adversarial perturbations is critical.
- Use **XGBoost** in resource-constrained environments, but pair with human-in-the-loop oversight.
- Avoid **naive fusion** in safety-critical deployment.
---
## Training Details
### Training Data
- **Dataset**: Engineering Misinformation Corpus (EMC)
- **Splits**: Train / Validation / Test
- Includes both real engineering documents and AI-generated misinformation.
### Training Procedure
- **XLM-R models**: fine-tuned with the AdamW optimizer, learning-rate warmup, and a maximum sequence length of 256 tokens (see the sketch after this list).
- **Fusion models**: trained by jointly optimizing the Transformer encoder and the fusion classifier.
- **XGBoost**: trained with grid-searched hyperparameters (tree depth, learning rate, number of estimators).
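
The snippet below is a minimal sketch of the XLM-R fine-tuning setup described above (AdamW, linear warmup, max length 256), not the exact training script; the learning rate, warmup ratio, step count, and dummy batch are illustrative assumptions.

```python
# Illustrative fine-tuning setup: AdamW + linear warmup, max sequence length 256.
# Hyperparameter values (lr, warmup ratio, step count) are assumptions.
import torch
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          get_linear_schedule_with_warmup)

model_name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

num_training_steps = 1000  # depends on dataset size and batch size
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * num_training_steps),
    num_training_steps=num_training_steps,
)

# One dummy training step to show the update loop.
batch = tokenizer(["Example engineering sentence."], truncation=True,
                  padding="max_length", max_length=256, return_tensors="pt")
labels = torch.tensor([0])

model.train()
loss = model(**batch, labels=labels).loss  # cross-entropy computed internally
loss.backward()
optimizer.step()
scheduler.step()
optimizer.zero_grad()
```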
---
## Technical Specifications
### Architectures
- **XLM-RoBERTa**: Transformer encoder (base size)
- **Simple Fusion**: concatenation of the Transformer pooled embedding with the 12D feature vector
- **Gated Fusion**: Transformer pooled embedding combined with a gated feature projection (64D)
- **XGBoost**: gradient-boosted decision trees
### Software
- PyTorch, Hugging Face Transformers, scikit-learn, XGBoost
---
## Evaluation
### Metrics
- Macro F1, Micro F1, and Average Precision (AP); a small scoring sketch follows below.
### Adversarial Robustness
Robustness was tested under:
- Structural perturbations (typos, casing)
- Semantic/cue perturbations (synonym swap, term masking)
- Feature-targeted perturbations (numeric and unit corruption)
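
As a quick reference for the metrics above, they can be computed with scikit-learn as shown below; this is not the paper's evaluation harness, and the labels, predictions, and scores are dummy values.

```python
# Metric sketch with dummy data: macro F1, micro F1, and average precision (AP).
import numpy as np
from sklearn.metrics import f1_score, average_precision_score

y_true = np.array([0, 1, 1, 0, 1])              # dummy gold labels
y_pred = np.array([0, 1, 0, 0, 1])              # dummy hard predictions
y_score = np.array([0.1, 0.9, 0.4, 0.2, 0.8])   # dummy P(misinformation)

print("Macro F1:", f1_score(y_true, y_pred, average="macro"))
print("Micro F1:", f1_score(y_true, y_pred, average="micro"))
print("AP:", average_precision_score(y_true, y_score))
```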
---
## How to Get Started with the Models
### Example: Load XGBoost + Features
```python
import joblib, xgboost as xgb
import numpy as np
# Load scaler + model
scaler = joblib.load("scaler.pkl")
model = xgb.XGBClassifier()
model.load_model("xgb_model.json")
# Example features (12D vector extracted with your pipeline)
x = np.array([[120, 30, 5, 0.0, 0.0, 0.05, 10, 3, 1, 0, 12.5, 0.33]])
x_scaled = scaler.transform(x)
pred = model.predict(x_scaled)
print("Predicted class:", pred[0])
### Example: Load XLM-R (Text Only)
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
tokenizer = AutoTokenizer.from_pretrained("your-username/emc-models", subfolder="xlmr_text_only")
model = AutoModelForSequenceClassification.from_pretrained("your-username/emc-models", subfolder="xlmr_text_only")
text = "For safety, follow the shutdown procedure."
inputs = tokenizer(text, return_tensors="pt")
outputs = model(**inputs)
print(outputs.logits.softmax(-1))
```

### Example: Load Gated Fusion (Text + Features)
```python
import torch
import torch.nn as nn
import joblib
import numpy as np
from transformers import AutoTokenizer, XLMRobertaModel

# ---- Load tokenizer and scaler ----
tokenizer = AutoTokenizer.from_pretrained("your-username/emc-models", subfolder="xlmr_text_only")
scaler = joblib.load("scaler.pkl")

# ---- Define Gated Fusion model (must match training definition) ----
class GatedFusion(nn.Module):
    def __init__(self, model_name: str, n_feats: int = 12, n_labels: int = 2, feat_proj: int = 64):
        super().__init__()
        self.encoder = XLMRobertaModel.from_pretrained(model_name)
        H = self.encoder.config.hidden_size
        self.fe_proj = nn.Sequential(nn.Linear(n_feats, feat_proj), nn.ReLU())
        self.gate = nn.Sequential(
            nn.Linear(H + n_feats, 64), nn.ReLU(),
            nn.Linear(64, 1), nn.Sigmoid()
        )
        self.dropout = nn.Dropout(0.1)
        self.classifier = nn.Linear(H + feat_proj, n_labels)

    def forward(self, input_ids, attention_mask, feats):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask, return_dict=True)
        mask = attention_mask.unsqueeze(-1).float()
        pooled = (out.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-6)
        alpha = self.gate(torch.cat([pooled, feats], dim=1))
        ef = self.fe_proj(feats)
        fused = torch.cat([pooled, alpha * ef], dim=1)
        return self.classifier(self.dropout(fused))
# ---- Load trained weights ----
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = GatedFusion("xlm-roberta-base", n_feats=12, n_labels=2).to(device)
model.load_state_dict(torch.load("fusion_gated.pt", map_location=device), strict=False)
model.eval()
# ---- Example input ----
text = "For safety, follow the shutdown procedure."
features = np.array([[120, 30, 5, 0.0, 0.0, 0.05, 10, 3, 1, 0, 12.5, 0.33]]) # replace with your feature extractor
features_scaled = scaler.transform(features)
# ---- Tokenize text ----
enc = tokenizer(text, truncation=True, padding="max_length", max_length=256, return_tensors="pt")
input_ids = enc["input_ids"].to(device)
attn_mask = enc["attention_mask"].to(device)
feats = torch.tensor(features_scaled, dtype=torch.float32).to(device)
# ---- Run model ----
with torch.no_grad():
    logits = model(input_ids, attn_mask, feats)
    probs = torch.softmax(logits, dim=-1).cpu().numpy()[0]

print("Prediction probs:", probs)
print("Predicted class:", probs.argmax())  # 0 = Real, 1 = Misinformation
```
---
## Citation
```bibtex
@article{NWAIWU2025107783,
  title   = {The brittleness of transformer feature fusion: A comparative study of model robustness in engineering misinformation detection},
  journal = {Results in Engineering},
  volume  = {28},
  pages   = {107783},
  year    = {2025},
  issn    = {2590-1230},
  doi     = {https://doi.org/10.1016/j.rineng.2025.107783},
  url     = {https://www.sciencedirect.com/science/article/pii/S2590123025038368},
  author  = {Steve Nwaiwu and Nipat Jongsawat and Anucha Tungkasthan}
}
```