---
license: mit
language:
- en
metrics:
- accuracy
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
tags:
- text-classification
- ai-detection
- academic-text
- ai-generated-text-detection
model-index:
- name: bert-ai-text-detector
  results:
  - task:
      type: text-classification
      name: AI-Generated Text Detection
    dataset:
      name: Custom Academic Text Dataset
      type: custom
    metrics:
    - type: accuracy
      value: 0.9957
    - type: f1
      value: 0.9958
    - type: precision
      value: 0.9923
    - type: recall
      value: 0.9994
---
# BERT-based AI-Generated Academic Text Detector

A BERT-based classifier for detecting AI-generated academic text, reaching **99.57% accuracy** on paragraph-level test samples.

## Online Demo

🌐 **Try the model online**: [https://followsci.com/ai-detection](https://followsci.com/ai-detection)

Free web interface with real-time detection; no installation or API key required.

## Model Details

### Model Description

- **Model Type**: BERT-base-uncased fine-tuned for binary text classification
- **Architecture**: BERT-base-uncased (110M parameters)
- **Task**: Binary classification (human-written vs. AI-generated text)
- **Input**: Academic text paragraphs (up to 512 tokens)
- **Output**: Binary label (0 = human-written, 1 = AI-generated) with confidence scores

### Training Information

- **Training Samples**: 1,487,400 paragraph-level samples
- **Validation Samples**: 185,930 paragraph-level samples
- **Test Samples**: 185,930 paragraph-level samples
- **Total Dataset**: 1,859,260 paragraphs
- **Training Data**:
  - Human-written: academic papers from arXiv
  - AI-generated: text produced by various large language models (GPT, Claude, etc.)

## Performance

### Test Set Results

| Metric | Value |
|--------|-------|
| **Accuracy** | **99.57%** |
| **F1-Score** | **99.58%** |
| Precision | 99.23% |
| Recall | 99.94% |
| False Positive Rate | 0.82% |
| False Negative Rate | 0.06% |

### Confusion Matrix (Test Set)

| | Predicted: Human | Predicted: AI |
|---|---|---|
| **Actual: Human** | 89,740 (TN) | 740 (FP) |
| **Actual: AI** | 60 (FN) | 95,390 (TP) |
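
For reference, all of the headline metrics follow directly from this confusion matrix. A quick sanity check in Python:

```python
# Test-set confusion matrix counts (from the table above).
tn, fp, fn, tp = 89_740, 740, 60, 95_390
total = tn + fp + fn + tp                            # 185,930

accuracy  = (tp + tn) / total                        # 0.9957
precision = tp / (tp + fp)                           # 0.9923
recall    = tp / (tp + fn)                           # 0.9994
f1  = 2 * precision * recall / (precision + recall)  # 0.9958
fpr = fp / (fp + tn)                                 # 0.0082
fnr = fn / (fn + tp)                                 # 0.0006

print(f"acc={accuracy:.4f} p={precision:.4f} r={recall:.4f} "
      f"f1={f1:.4f} fpr={fpr:.4f} fnr={fnr:.4f}")
```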

**Inference Speed:** ~20,900 samples/second on RTX 3090 (batch size 64)
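
Throughput of this order can be reproduced with a simple timing loop. A minimal sketch, assuming a CUDA GPU is available; actual numbers vary with hardware, sequence length, and padding:

```python
import time
import torch
from transformers import BertTokenizer, BertForSequenceClassification

model_name = "followsci/bert-ai-text-detector"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name).eval().cuda()

# One fixed batch of 64, mirroring the benchmark setting above.
batch = ["A sample academic paragraph for benchmarking."] * 64
inputs = tokenizer(batch, return_tensors="pt", truncation=True,
                   max_length=512, padding=True)
inputs = {k: v.cuda() for k, v in inputs.items()}

with torch.no_grad():
    for _ in range(3):        # warm-up iterations
        model(**inputs)
    torch.cuda.synchronize()
    start = time.time()
    n_iters = 50
    for _ in range(n_iters):
        model(**inputs)
    torch.cuda.synchronize()
    elapsed = time.time() - start

print(f"{n_iters * len(batch) / elapsed:,.0f} samples/second")
```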

## Usage

### Quick Start

```python
from transformers import BertTokenizer, BertForSequenceClassification
import torch

# Load model and tokenizer
model_name = "followsci/bert-ai-text-detector"
tokenizer = BertTokenizer.from_pretrained(model_name)
model = BertForSequenceClassification.from_pretrained(model_name)
model.eval()

# Detect AI text
text = "Your academic paragraph here..."
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    ai_prob = probs[0][1].item() * 100
    human_prob = probs[0][0].item() * 100

print(f"AI-generated probability: {ai_prob:.1f}%")
print(f"Human-written probability: {human_prob:.1f}%")

if ai_prob > 50:
    print("Prediction: AI-generated")
else:
    print("Prediction: Human-written")
```

### Batch Processing

```python
texts = [
    "First paragraph...",
    "Second paragraph...",
    # ... more texts
]

inputs = tokenizer(
    texts,
    return_tensors="pt",
    truncation=True,
    max_length=512,
    padding=True
)

with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)

for i, prob in enumerate(probs):
    ai_prob = prob[1].item() * 100
    print(f"Text {i+1}: AI probability = {ai_prob:.1f}%")
```
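
For large workloads, batch inference is much faster on a GPU: assuming one is available, move the model and tensors over first (`model.cuda()` and `inputs = {k: v.cuda() for k, v in inputs.items()}`), then run the same loop, as in the throughput sketch above. Splitting very long lists into fixed-size chunks (e.g. 64, matching the benchmark setting) keeps memory use bounded.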

### Using with Transformers Pipeline

```python
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="followsci/bert-ai-text-detector",
    tokenizer="followsci/bert-ai-text-detector"
)

result = classifier("Your text here...")
print(result)
```
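
The pipeline returns a list of dictionaries with `label` and `score` fields. Note that the label strings come from the `id2label` mapping in the model's config; if that mapping has not been customized, they appear as the generic `LABEL_0` (human-written) and `LABEL_1` (AI-generated).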

## Training Details

### Training Configuration

- **Base Model**: `bert-base-uncased`
- **Batch Size**: 64
- **Learning Rate**: 5e-5 (with linear warmup)
- **Warmup Steps**: 5,000
- **Max Sequence Length**: 512
- **Optimizer**: AdamW
- **Epochs**: 3
- **Training Time**: ~11 hours (on RTX 3090)
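
For anyone wanting to approximate this setup, here is a minimal fine-tuning sketch using the 🤗 `Trainer` API with the hyperparameters listed above. The dataset loader (`load_paragraph_dataset`) is a hypothetical placeholder, since the training data itself is not published:

```python
from transformers import (BertTokenizer, BertForSequenceClassification,
                          Trainer, TrainingArguments)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # binary: 0 = human, 1 = AI

# Hypothetical helper standing in for the unreleased paragraph dataset.
train_ds, val_ds = load_paragraph_dataset(tokenizer, max_length=512)

args = TrainingArguments(
    output_dir="bert-ai-text-detector",
    per_device_train_batch_size=64,
    learning_rate=5e-5,      # Trainer's default schedule is linear with warmup
    warmup_steps=5_000,
    num_train_epochs=3,
    # Trainer optimizes with AdamW by default.
)

trainer = Trainer(model=model, args=args,
                  train_dataset=train_ds, eval_dataset=val_ds)
trainer.train()
```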

### Dataset Distribution

| Split | Total Samples | Human (Label 0) | AI (Label 1) |
|-------|--------------|-----------------|--------------|
| Train | 1,487,400 | 723,780 (48.7%) | 763,620 (51.3%) |
| Validation | 185,930 | 90,470 (48.7%) | 95,460 (51.3%) |
| Test | 185,930 | 90,480 (48.7%) | 95,450 (51.3%) |

## Limitations

1. **Domain Specificity**: The model is trained primarily on academic text. Performance may degrade on:
   - Casual text or social media content
   - Technical documentation
   - Creative writing

2. **Binary Classification**: The model only distinguishes between "human" and "AI" text, without:
   - Identifying which AI model generated the text
   - Providing confidence intervals
   - Detecting partially AI-assisted text

3. **Paragraph-Level Detection**: The model is optimized for paragraph-level samples:
   - Performance on sentence-level or full-document inputs may vary
   - Best results are achieved with structured academic paragraphs

4. **False Positives**: A false positive rate of roughly 0.82% means some human-written text will be flagged as AI-generated.

## Ethical Considerations

- **Use Case**: This model is intended as a tool for academic integrity and research purposes
- **Bias**: The model may reflect biases present in the training data
- **Misuse**: It should not be used as the sole criterion for academic misconduct decisions
- **Transparency**: Results should be interpreted with context and domain expertise

## License

This model is licensed under the MIT License.

## Contact

- **Email**: raffoduanedonnenfeld@gmail.com

---

<p align="center">
Made with ❤️ for Academic Integrity
</p>