tejas_v3_retrain

Model Summary

tejas_v3_retrain is a continued fine-tuning of shravan1nala/tejas-fake-news-model-v3, which is itself based on DistilBERT (distilbert-base-uncased).

This retraining focuses on improving generalization, stability, and evaluation quality by:

- selecting the best checkpoint using validation loss
- applying early stopping to prevent overfitting
- evaluating with precision, recall, and F1-score, not accuracy alone

The model is designed to act as a classification signal within an evidence-based fact-checking system rather than a standalone decision-maker.

Evaluation Results (Test Set)

Metric	Score
Loss	0.2811
Accuracy	0.8822
Precision	0.9028
Recall	0.8528
F1-score	0.8771

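Precision is higher than recall (0.9028 vs 0.8528), so positive predictions are comparatively reliable while roughly 15% of positive examples are missed. This profile suits downstream verification pipelines that treat the model output as one signal among several.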
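As a quick consistency check, the reported F1-score can be recomputed from the reported precision and recall. The sketch below uses the standard harmonic-mean definition; the function name is illustrative and not taken from the training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Test-set precision and recall as reported in the table above.
test_f1 = f1_score(0.9028, 0.8528)
print(round(test_f1, 4))  # prints 0.8771
```

The result agrees with the F1-score in the table to four decimal places.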

Model Description

Architecture: DistilBERT (encoder-only Transformer)
Task: Binary text classification (REAL vs FAKE)
Labels:
- 0 → REAL / SUPPORTED
- 1 → FAKE / MISINFORMATION
Input: Short factual claims, headlines, or summarized news text
Output: Class probabilities used as a signal for fact verification

This model does not generate explanations or final verdicts.
It is intended to be combined with external evidence retrieval and rule-based decision logic.
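To illustrate what "class probabilities used as a signal" can look like, here is a minimal sketch that turns a pair of raw logits into per-label probabilities using the label mapping above. The logits are made-up values, and `to_signal` is a hypothetical helper name; in practice the logits would come from the model's classification head.

```python
import math

# Label mapping taken from the model card (0 = REAL, 1 = FAKE).
ID2LABEL = {0: "REAL", 1: "FAKE"}


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable form)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_signal(logits):
    """Return per-label probabilities plus the most likely label."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return {
        "probabilities": {ID2LABEL[i]: p for i, p in enumerate(probs)},
        "label": ID2LABEL[best],
    }


# Hypothetical logits for a single claim, for illustration only.
signal = to_signal([-1.2, 2.3])
print(signal["label"])  # prints FAKE
```

Passing the full probability dictionary downstream, rather than only the top label, lets the surrounding system apply its own thresholds.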

Intended Uses

Primary Use

- Acting as a classifier signal in an automated fake-news / fact-checking system
- Assisting evidence-based verification pipelines
- Supporting ranking, confidence estimation, and fallback decisions

Secondary Use

- Academic research on misinformation detection
- Educational demonstrations of NLP-based classification

Limitations

- The model may be overconfident on time-sensitive or emerging claims
- It does not perform evidence retrieval or reasoning
- Predictions should not be treated as final truth judgments
- Performance may degrade on domains significantly different from training data

For reliable fact-checking, model predictions should always be combined with trusted external evidence.
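One way to combine the model with external evidence is a small rule layer in which evidence always takes priority and the classifier is used only as a fallback. The sketch below is an assumption-heavy illustration: the `decide` function, the verdict strings, and the 0.9 threshold are all hypothetical and are not part of this model or its pipeline.

```python
from typing import Optional


def decide(p_fake: float, evidence_supports: Optional[bool],
           threshold: float = 0.9) -> str:
    """Combine the classifier probability with an external evidence check.

    evidence_supports is True if retrieved evidence supports the claim,
    False if it contradicts the claim, and None if nothing was found.
    """
    # Trusted evidence overrides the classifier in both directions.
    if evidence_supports is True:
        return "SUPPORTED"
    if evidence_supports is False:
        return "REFUTED"
    # Fallback: no evidence, so rely on the model only when it is confident.
    if p_fake >= threshold:
        return "LIKELY_FAKE"
    if p_fake <= 1 - threshold:
        return "LIKELY_REAL"
    return "UNVERIFIED"


print(decide(0.95, None))  # prints LIKELY_FAKE
print(decide(0.95, True))  # prints SUPPORTED
```

Keeping an explicit "UNVERIFIED" outcome avoids forcing a verdict when neither evidence nor a confident prediction is available.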

Training and Evaluation Data

The model was trained on a unified and normalized dataset published as:

👉 shravan1nala/tejas-fake-news-dataset-v3

This dataset was constructed from multiple well-known misinformation sources, including:

Datasets Used

LIAR-2 Dataset
- Short political claims
- Multi-class labels normalized to binary
- Human-annotated factuality judgments
FakeNewsNet
- News headlines and social-style misinformation
- Binary real/fake labels
- Diverse topics and writing styles
ErfanMoosavi Fake News Dataset
- Longer news articles
- Binary fake/real classification
- News-oriented misinformation patterns

All datasets were:

- normalized to a binary label schema
- deduplicated
- split into train / validation / test sets
- balanced to avoid class bias

No external or proprietary data was used.
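The preparation steps listed above can be sketched roughly as follows. This is not the actual preprocessing script: the label mapping for LIAR-style classes is a hypothetical choice (the card does not say where the binary cut was made), balancing is shown as simple downsampling, and the train / validation / test split is omitted.

```python
import random
from collections import defaultdict

# Hypothetical mapping to the binary schema (0 = REAL, 1 = FAKE).
LIAR_TO_BINARY = {
    "true": 0, "mostly-true": 0, "half-true": 0,
    "barely-true": 1, "false": 1, "pants-fire": 1,
}


def normalize(rows, mapping):
    """Map source labels to 0/1, dropping rows with unknown labels."""
    return [(text, mapping[label]) for text, label in rows if label in mapping]


def deduplicate(rows):
    """Keep the first occurrence of each text, ignoring case and extra spaces."""
    seen, unique = set(), []
    for text, label in rows:
        key = " ".join(text.lower().split())
        if key not in seen:
            seen.add(key)
            unique.append((text, label))
    return unique


def balance(rows, seed=42):
    """Downsample every class to the size of the smallest class."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[1]].append(row)
    if not by_label:
        return []
    n = min(len(group) for group in by_label.values())
    rng = random.Random(seed)
    balanced = [row for group in by_label.values() for row in rng.sample(group, n)]
    rng.shuffle(balanced)
    return balanced


# Toy rows for illustration only.
rows = [("Claim A", "true"), ("claim  a", "true"),
        ("Claim B", "false"), ("Claim C", "pants-fire")]
clean = balance(deduplicate(normalize(rows, LIAR_TO_BINARY)))
print(len(clean))  # prints 2
```

In the toy example, the near-duplicate of "Claim A" is dropped and the FAKE class is downsampled so that each class contributes one row.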

Training Procedure

Training Strategy

- Continued fine-tuning from a previously trained V3 checkpoint
- Best model selected using lowest validation loss
- Early stopping applied to prevent overfitting
- Evaluation performed on a held-out test set

Training Hyperparameters

Learning rate: 2e-5
Train batch size: 16
Evaluation batch size: 16
Epochs: up to 3 (early stopped at epoch 2)
Optimizer: AdamW (fused)
Weight decay: 0.01
Learning rate scheduler: Linear
Random seed: 42
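For readers who want to reproduce a similar setup, the hyperparameters above can be written as keyword arguments using `transformers.TrainingArguments` parameter names. This is a sketch, not the original training script: the output directory is a hypothetical path, and the per-epoch evaluation and saving settings are inferred from the one-row-per-epoch results table rather than stated in the card.

```python
# Hyperparameters from the list above, keyed by TrainingArguments names.
training_kwargs = {
    "output_dir": "tejas_v3_retrain",      # hypothetical output path
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "num_train_epochs": 3,                 # upper bound; training stopped early
    "optim": "adamw_torch_fused",          # AdamW (fused)
    "weight_decay": 0.01,
    "lr_scheduler_type": "linear",
    "seed": 42,
    # Assumptions: evaluate and save once per epoch, keep the best checkpoint.
    "eval_strategy": "epoch",
    "save_strategy": "epoch",
    "load_best_model_at_end": True,
    "metric_for_best_model": "eval_loss",
    "greater_is_better": False,
}

for name, value in training_kwargs.items():
    print(f"{name} = {value}")
```

These could be passed as `TrainingArguments(**training_kwargs)`, with `EarlyStoppingCallback` added to the `Trainer` callbacks; the patience value used for this model is not stated in the card.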

Training Results

Training Loss	Epoch	Step	Validation Loss	Accuracy	Precision	Recall	F1
0.2134	1.0	3369	0.3031	0.8654	0.8851	0.8352	0.8594
0.1583	2.0	6738	0.3898	0.8602	0.8767	0.8334	0.8545

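The final model corresponds to the checkpoint with the lowest validation loss, which is the epoch 1 checkpoint (step 3369, validation loss 0.3031). Validation loss rose to 0.3898 at epoch 2, at which point training was stopped.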
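The selection rule can be replayed on the validation losses logged in the table above. The sketch below is a plain-Python illustration of early stopping with best-checkpoint tracking; the patience of 1 is an assumption that happens to be consistent with stopping at epoch 2, and it is not confirmed by the card.

```python
def select_checkpoint(val_losses, patience=1):
    """Replay early stopping over per-epoch validation losses.

    Returns (best_epoch, stopped_epoch), both 1-indexed.
    """
    best_epoch, best_loss, bad_epochs = None, float("inf"), 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_epoch, best_loss, bad_epochs = epoch, loss, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return best_epoch, epoch
    return best_epoch, len(val_losses)


# Validation losses for epochs 1 and 2 from the training results table.
best, stopped = select_checkpoint([0.3031, 0.3898])
print(best, stopped)  # prints 1 2
```

With these two losses, the epoch 1 checkpoint is kept and training halts after epoch 2.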

Framework Versions

Transformers: 4.57.3
PyTorch: 2.9.0+cu126
Datasets: 4.0.0
Tokenizers: 0.22.1

Citation

If you use this model, please cite it as:

Safetensors
Model size: 67M params
Tensor type: F32

Model tree for shravan1nala/fake_news_detection_model_v3.1
Base model: distilbert/distilbert-base-uncased
Finetuned: shravan1nala/tejas-fake-news-model-v3