the_poli

the_poli is a transformer-based NLP classification model developed as part of the s0m3m0 research project.
The model is designed to analyse political and socio-political text, primarily from online and social media sources, and generate structured predictions for analytical and experimental purposes.

This repository contains only the trained model artifacts (weights, configuration, and tokenizer files).
The full data pipeline and application code are maintained separately.

Model Overview

Model type: Transformer-based text classification
Framework: Hugging Face Transformers
Primary language: English
Domain: Political and social media text
Use case: Research, analysis, and experimentation

The model is intended to assist in identifying patterns and signals in text rather than making authoritative judgments.

Intended Use

The model is suitable for:

Academic and research-based NLP experiments
Political and social discourse analysis
Text classification pipeline prototyping
Educational demonstrations of NLP systems

Not Intended For

Political persuasion or targeting
Surveillance or profiling of individuals
Automated decision-making in real-world political contexts
High-stakes or safety-critical applications

Example Usage

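import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "d42kw01f/the_poli"

# Load the tokenizer and the classification model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

text = "Example political statement for analysis"
# Truncate inputs that exceed the model's maximum sequence length
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Gradients are not needed for inference
with torch.no_grad():
    outputs = model(**inputs)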
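
The `outputs` object above holds raw logits (in `outputs.logits`), not probabilities. The following is a minimal sketch of one common way to turn them into a labelled score, written in plain Python so it runs without downloading the model. The logit values and the `LABEL_n` names are hypothetical placeholders; the model's actual label set is not documented here and would have to be read from `model.config.id2label`.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical values for illustration only. With the real model, the label
# map would come from model.config.id2label and the logits from
# outputs.logits[0].tolist().
id2label = {0: "LABEL_0", 1: "LABEL_1", 2: "LABEL_2"}
logits = [1.2, -0.3, 0.4]

probs = softmax(logits)
best = max(range(len(probs)), key=probs.__getitem__)  # index of the top score
prediction = {"label": id2label[best], "score": probs[best]}
print(prediction)
```

The same result can be obtained on tensors with `torch.softmax(outputs.logits, dim=-1)`; the pure-Python version is used here only to keep the sketch dependency-free.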

Training Data

Trained on curated datasets derived from publicly available sources
Data was preprocessed and filtered for research purposes
No private, sensitive, or non-consensual data was intentionally included

Dataset details are intentionally limited to reduce misuse risk.

Limitations & Bias

Model performance depends on the quality and balance of the training data
May reflect biases present in source datasets
Not robust to domain shifts, sarcasm, or adversarial input
Outputs should be treated as probabilistic signals, not factual conclusions
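
One way to act on the last point is to report a label only when its probability clears a confidence threshold, and to flag everything else for human review. The sketch below is an illustration under assumptions, not part of the released model: the `interpret` helper, the `0.7` threshold, and the `LABEL_n` names are all hypothetical, and a suitable threshold would need to be chosen on held-out data.

```python
def interpret(probs, labels, threshold=0.7):
    """Return the top label only if its probability reaches the threshold.

    `threshold` is a hypothetical default, not a value tuned for this model.
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        # Abstain rather than report a weak signal as a prediction
        return {"label": None, "score": probs[best],
                "note": "low confidence, needs human review"}
    return {"label": labels[best], "score": probs[best],
            "note": "model signal, not a verified fact"}

labels = ["LABEL_0", "LABEL_1"]  # hypothetical label names
confident = interpret([0.92, 0.08], labels)
uncertain = interpret([0.55, 0.45], labels)
print(confident)
print(uncertain)
```

Abstaining does not fix bias or domain shift; it only keeps the weakest scores out of downstream analysis.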

Ethical Considerations

This model is released strictly for research and educational use. Users are responsible for:

Ensuring ethical deployment
Respecting platform terms of service
Avoiding harmful, misleading, or manipulative applications

Related Project

Code repository: https://github.com/d42kw01f/s0m3m0
Project name: s0m3m0

Author

Dakshitha Navodya Perera
AI • Cybersecurity • Data Engineering
Sri Lanka

Safetensors

Model size: 0.1B params
Tensor type: F32
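
As a rough guide to what these figures mean in practice, the size of the weights can be estimated from the parameter count and the tensor type. The sketch below assumes 4 bytes per F32 parameter and treats the listed 0.1B as approximate, since it is a rounded figure; the `weight_size_mb` helper is hypothetical, and the estimate excludes activations and framework overhead.

```python
# Bytes needed to store one parameter in each tensor type
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2}

def weight_size_mb(num_params, dtype="F32"):
    """Estimate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e6

approx_params = 0.1e9  # rounded count from the model card, not exact
size_f32 = weight_size_mb(approx_params, "F32")
size_f16 = weight_size_mb(approx_params, "F16")
print(f"F32: ~{size_f32:.0f} MB, F16: ~{size_f16:.0f} MB")
```

Under these assumptions the F32 weights come to roughly 400 MB, and a half-precision copy would need about half of that.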