tejas_v3_retrain
Model Summary
tejas_v3_retrain is a continued fine-tuning of shravan1nala/tejas-fake-news-model-v3,
which is itself based on DistilBERT (distilbert-base-uncased).
This retraining focuses on improving generalization, stability, and evaluation quality by:
- selecting the best checkpoint using validation loss
- applying early stopping to prevent overfitting
- evaluating with precision, recall, and F1-score, not accuracy alone
The model is designed to act as a classification signal within an evidence-based fact-checking system rather than a standalone decision-maker.
Evaluation Results (Test Set)
| Metric | Score |
|---|---|
| Loss | 0.2811 |
| Accuracy | 0.8822 |
| Precision | 0.9028 |
| Recall | 0.8528 |
| F1-score | 0.8771 |
On the held-out test set the model reaches an F1-score of 0.8771, with precision (0.9028) higher than recall (0.8528). This makes it suitable as a signal in downstream verification pipelines.
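As a quick consistency check, the reported F1-score can be recomputed from the reported precision and recall, since F1 is their harmonic mean. The helper name `f1_score` below is only illustrative and is not taken from the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Test-set precision and recall as reported in the table above.
test_f1 = f1_score(0.9028, 0.8528)
print(round(test_f1, 4))  # 0.8771
```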
Model Description
- Architecture: DistilBERT (encoder-only Transformer)
- Task: Binary text classification (REAL vs FAKE)
- Labels:
  - 0 → REAL / SUPPORTED
  - 1 → FAKE / MISINFORMATION
- Input: Short factual claims, headlines, or summarized news text
- Output: Class probabilities used as a signal for fact verification
This model does not generate explanations or final verdicts.
It is intended to be combined with external evidence retrieval and rule-based decision logic.
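A minimal sketch of how the class probabilities might be obtained is shown below. The label mapping follows this card; the helpers `to_signal` and `classify` are hypothetical names, and `model_id` is left as a parameter because the exact Hub repository ID is an assumption the reader must supply. `classify` needs `torch` and `transformers` installed and downloads the weights when called.

```python
import math

# Label mapping as stated in this card.
ID2LABEL = {0: "REAL", 1: "FAKE"}


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_signal(logits):
    """Turn the two raw logits into a {label: probability} dict."""
    return {ID2LABEL[i]: p for i, p in enumerate(softmax(logits))}


def classify(text, model_id):
    """Hypothetical helper: score one claim with a sequence-classification model."""
    # Imported lazily so the pure-Python helpers above work without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()
    inputs = tokenizer(text, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits[0].tolist()
    return to_signal(logits)
```

The returned dict is meant to be passed on to the evidence and rule layers, not shown to users as a verdict.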
Intended Uses
Primary Use
- Acting as a classifier signal in an automated fake-news / fact-checking system
- Assisting evidence-based verification pipelines
- Supporting ranking, confidence estimation, and fallback decisions
Secondary Use
- Academic research on misinformation detection
- Educational demonstrations of NLP-based classification
Limitations
- The model may be overconfident on time-sensitive or emerging claims
- It does not perform evidence retrieval or reasoning
- Predictions should not be treated as final truth judgments
- Performance may degrade on domains significantly different from training data
For reliable fact-checking, model predictions should always be combined with trusted external evidence.
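One way to combine the classifier with external evidence is a small rule layer in which retrieved evidence always takes precedence and the model only qualifies claims that lack evidence. The sketch below is illustrative only: the function `decide`, the verdict strings, and the 0.9 threshold are all assumptions, not part of the released system.

```python
def decide(fake_prob, evidence_verdict=None, threshold=0.9):
    """Hypothetical rule layer combining model signal and evidence.

    fake_prob        -- model probability for the FAKE class (0.0 to 1.0)
    evidence_verdict -- "SUPPORTED", "REFUTED", or None when no trusted
                        evidence was retrieved (assumed vocabulary)
    threshold        -- assumed confidence cut-off for the model-only fallback
    """
    # Trusted evidence overrides the classifier in both directions.
    if evidence_verdict in ("SUPPORTED", "REFUTED"):
        return evidence_verdict
    # Without evidence, the model can only produce a hedged label.
    if fake_prob >= threshold:
        return "LIKELY_FAKE_UNVERIFIED"
    if fake_prob <= 1.0 - threshold:
        return "LIKELY_REAL_UNVERIFIED"
    return "UNVERIFIED"
```

Keeping the model-only outcomes explicitly marked as unverified reflects the limitation that predictions are not final truth judgments.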
Training and Evaluation Data
The model was trained on a unified and normalized dataset published as:
shravan1nala/tejas-fake-news-dataset-v3
This dataset was constructed from multiple well-known misinformation sources, including:
Datasets Used
LIAR-2 Dataset
- Short political claims
- Multi-class labels normalized to binary
- Human-annotated factuality judgments
FakeNewsNet
- News headlines and social-style misinformation
- Binary real/fake labels
- Diverse topics and writing styles
ErfanMoosavi Fake News Dataset
- Longer news articles
- Binary fake/real classification
- News-oriented misinformation patterns
All datasets were:
- normalized to a binary label schema
- deduplicated
- split into train / validation / test sets
- balanced to avoid class bias
No external or proprietary data was used.
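The preprocessing steps listed above could look roughly like the following sketch. It is not the actual pipeline: the six label names follow the original LIAR convention, the binary cut-off (in particular where "half-true" lands), the 80/10/10 split, and balancing by downsampling are all assumptions made for illustration.

```python
import random

# ASSUMPTION: cut-off between REAL (0) and FAKE (1) for multi-class labels.
LIAR_TO_BINARY = {
    "true": 0, "mostly-true": 0, "half-true": 0,
    "barely-true": 1, "false": 1, "pants-fire": 1,
}


def deduplicate(rows):
    """Keep the first row for each case- and whitespace-normalized text."""
    seen, unique = set(), []
    for row in rows:
        key = " ".join(row["text"].lower().split())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def balance(rows, rng):
    """Downsample every class to the size of the smallest class."""
    by_label = {}
    for row in rows:
        by_label.setdefault(row["label"], []).append(row)
    n = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, n))
    return balanced


def split(rows, rng, train=0.8, val=0.1):
    """Shuffle, then cut into train / validation / test lists."""
    rows = list(rows)
    rng.shuffle(rows)
    n_train = int(len(rows) * train)
    n_val = int(len(rows) * val)
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]


rng = random.Random(42)  # fixed seed so the split is reproducible
```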
Training Procedure
Training Strategy
- Continued fine-tuning from a previously trained V3 checkpoint
- Best model selected using lowest validation loss
- Early stopping applied to prevent overfitting
- Evaluation performed on a held-out test set
Training Hyperparameters
- Learning rate: 2e-5
- Train batch size: 16
- Evaluation batch size: 16
- Epochs: up to 3 (early stopped at epoch 2)
- Optimizer: AdamW (fused)
- Weight decay: 0.01
- Learning rate scheduler: Linear
- Random seed: 42
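For readers who want to reproduce a similar setup, the hyperparameters above map onto `transformers.TrainingArguments` roughly as sketched below. The numeric values come from the list; the evaluation and save strategies, the output directory, the early-stopping patience of 1, and the reading of batch size as per-device are assumptions, and `build_trainer` is a hypothetical helper rather than the original training script.

```python
# Keys are transformers.TrainingArguments parameter names.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,  # ASSUMPTION: batch size is per device
    "per_device_eval_batch_size": 16,
    "num_train_epochs": 3,
    "optim": "adamw_torch_fused",
    "weight_decay": 0.01,
    "lr_scheduler_type": "linear",
    "seed": 42,
    # ASSUMPTION: per-epoch evaluation and saving, which best-checkpoint
    # selection and early stopping both require.
    "eval_strategy": "epoch",
    "save_strategy": "epoch",
    "load_best_model_at_end": True,
    "metric_for_best_model": "eval_loss",
    "greater_is_better": False,
}


def build_trainer(model, train_dataset, eval_dataset,
                  output_dir="tejas_v3_retrain", patience=1):
    """Hypothetical helper: wire the settings into a Trainer."""
    # Imported lazily so the config dict can be inspected without transformers.
    from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

    args = TrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
    return Trainer(
        model=model,
        args=args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        # ASSUMPTION: stop after `patience` evaluations without improvement.
        callbacks=[EarlyStoppingCallback(early_stopping_patience=patience)],
    )
```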
Training Results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|
| 0.2134 | 1.0 | 3369 | 0.3031 | 0.8654 | 0.8851 | 0.8352 | 0.8594 |
| 0.1583 | 2.0 | 6738 | 0.3898 | 0.8602 | 0.8767 | 0.8334 | 0.8545 |
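The final model corresponds to the checkpoint with the lowest validation loss, which in the table above is epoch 1 (validation loss 0.3031). Validation loss rose to 0.3898 at epoch 2, at which point training stopped.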
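The selection rule can be illustrated directly on the logged values from the table. This is a small standalone sketch, not the mechanism used during training; the dictionary keys are illustrative.

```python
# Epoch, step, and validation loss copied from the training results table.
log = [
    {"epoch": 1.0, "step": 3369, "val_loss": 0.3031},
    {"epoch": 2.0, "step": 6738, "val_loss": 0.3898},
]

# Pick the logged row with the smallest validation loss.
best = min(log, key=lambda row: row["val_loss"])
print(best["epoch"], best["step"])  # 1.0 3369
```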
Framework Versions
- Transformers: 4.57.3
- PyTorch: 2.9.0+cu126
- Datasets: 4.0.0
- Tokenizers: 0.22.1
Citation
If you use this model, please cite it as: