# My Awesome Model

A fine-tuned BERT model that classifies child-protection-related text as harmful or safe.
## Model Description

This model is a fine-tuned 🤗 Transformers model for text classification.
It was trained to classify text into predefined categories, making it useful for applications such as content moderation, sentiment analysis, or other natural language understanding tasks.
- Developed by: Erin Clarke
- Shared by: Erin Clarke
- Model type: Text Classification (Sequence Classification)
- Language(s): English
- License: Apache 2.0
- Finetuned from model: [e.g., `bert-base-uncased`, `distilroberta-base`]
## Intended Uses
This model is intended for research and practical use in classifying text into categories. Example use cases include:
- Detecting harmful or safe content
- Sentiment classification
- Intent recognition
Not for: production deployment in safety-critical environments without thorough evaluation.
## Training Data
- Dataset: Custom dataset of online chat and text messages labeled for child protection, with two categories: `harmful` vs. `safe`.
- Size: 12,000 examples total
- Split: 9,600 train / 1,200 validation / 1,200 test
- Source: Collected from publicly available, de-identified text datasets and synthetic examples designed for safety research.
- Preprocessing: Text was cleaned by removing personally identifiable information (PII), lowercasing, and normalizing special characters.
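The cleaning scripts themselves are not published with this card; the sketch below illustrates the preprocessing steps listed above, assuming regex-based masking of emails and phone numbers for the PII removal and Unicode NFKC for the character normalization (both are assumptions, not the documented pipeline):

```python
import re
import unicodedata

def preprocess(text: str) -> str:
    """Clean one message: scrub PII, normalize characters, lowercase."""
    # Assumed PII scrubbing: mask email addresses and phone-like digit runs
    text = re.sub(r"\S+@\S+", "[EMAIL]", text)
    text = re.sub(r"\+?\d[\d\s().-]{7,}\d", "[PHONE]", text)
    # Assumed normalization: NFKC folds compatibility/special characters
    text = unicodedata.normalize("NFKC", text)
    return text.lower()
```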
## Training Procedure

- Framework: 🤗 Transformers (Trainer API)
- Optimizer: AdamW
- Batch size: 16
- Learning rate: 5e-5
- Epochs: 3
- Hardware: 1× NVIDIA Tesla T4 GPU (Google Colab environment)
- Loss Function: CrossEntropyLoss (binary classification)
- Early Stopping: Not used (fixed epochs)
- Gradient Clipping: Applied at 1.0 to prevent exploding gradients
- Mixed Precision: FP16 training enabled for efficiency
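These settings map directly onto the 🤗 Trainer API. A minimal sketch of the configuration, assuming `model`, `train_ds`, and `val_ds` are already prepared, and with `output_dir` as a placeholder:

```python
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="out",               # placeholder
    per_device_train_batch_size=16,
    learning_rate=5e-5,
    num_train_epochs=3,
    max_grad_norm=1.0,              # gradient clipping at 1.0
    fp16=True,                      # mixed-precision training
)

# AdamW and cross-entropy loss are the Trainer defaults for sequence
# classification, so they need no explicit configuration here.
trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=val_ds)
trainer.train()
```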
## Evaluation Results
The model was evaluated on the held-out test set of 1,200 examples.
Metrics are reported for the binary classification task (harmful vs safe).
| Metric | Score |
|---|---|
| Accuracy | 0.87 |
| Precision | 0.85 |
| Recall | 0.84 |
| F1 Score | 0.84 |
| ROC AUC | 0.90 |
- Precision indicates the proportion of predicted harmful messages that were truly harmful.
- Recall indicates the proportion of actual harmful messages that the model correctly identified.
- F1 Score balances precision and recall for overall effectiveness.
- ROC AUC reflects how well the model separates harmful from safe text across all thresholds; 0.90 indicates strong discrimination.
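For reference, these metrics can be computed with scikit-learn. In the sketch below, `y_true` (gold labels) and `y_prob` (predicted harmful-class probabilities) are illustrative names, and the 0.5 decision threshold is an assumption rather than a documented choice:

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

# y_true: gold labels (assuming 0 = safe, 1 = harmful)
# y_prob: model-assigned probability of the harmful class per example
y_pred = [int(p >= 0.5) for p in y_prob]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("f1       :", f1_score(y_true, y_pred))
print("roc_auc  :", roc_auc_score(y_true, y_prob))  # uses scores, not hard labels
```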
## Limitations & Bias
- Bias in Data: The dataset may reflect biases from the sources used. Certain slang, cultural contexts, or dialects may be underrepresented, which can affect model performance for specific groups of users.
- False Negatives: The model may occasionally classify harmful text as safe, especially if the harmful content is subtle, coded, or context-dependent.
- False Positives: Harmless text may sometimes be flagged as harmful, particularly if it contains strong language used in a non-threatening context.
- Generalization: The model has only been tested on the training and test datasets provided. Performance on live, real-world child protection data may differ.
- Sensitive Use Case: This model is not a replacement for human review. It should only be used as an assistive tool and not as the sole decision-maker in child safety contexts.
- Ethical Note: Any deployment should include monitoring, continuous evaluation, and safeguards to prevent misuse.
## How to Use
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("MangoScooter/my-awesome-model")
model = AutoModelForSequenceClassification.from_pretrained("MangoScooter/my-awesome-model")

inputs = tokenizer("Your text here", return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)

# Convert logits to probabilities and take the most likely class
predicted_class = torch.softmax(outputs.logits, dim=-1).argmax(dim=-1).item()
```
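The human-readable name for each class index is available via `model.config.id2label` (which falls back to generic `LABEL_0`/`LABEL_1` names if custom labels were not set).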
## Citation
If you use this model, please cite:
```bibtex
@misc{my-awesome-model,
  author       = {Erin Clarke},
  title        = {My Awesome Model},
  year         = {2025},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/MangoScooter/my-awesome-model}}
}
```