# my-QA-model
This model is a fine-tuned version of distilbert-base-uncased on the SQuAD v1.1 dataset.
## Model description
This is a transformer-based extractive Question Answering (QA) model fine-tuned on the Stanford Question Answering Dataset (SQuAD v1.1).
It takes a context paragraph and a natural language question as input and returns the most probable span in the text that answers the question.
- Architecture: DistilBERT
- Dataset: SQuAD v1.1 (~100k question-answer pairs)
- Task Type: Extractive Question Answering
- Training Objective: Predict start and end token positions of the answer span
- Evaluation Metrics: Exact Match (EM) and F1 Score
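The start/end objective above can be illustrated with a small, self-contained sketch (plain Python with toy logits, not real model outputs): the predicted answer is the span (i, j) with i ≤ j that maximizes `start_logits[i] + end_logits[j]`.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Pick the (start, end) token pair with the highest combined score,
    subject to start <= end and a maximum answer length."""
    best = (0, 0)
    best_score = float("-inf")
    for i, s in enumerate(start_logits):
        for j in range(i, min(i + max_answer_len, len(end_logits))):
            score = s + end_logits[j]
            if score > best_score:
                best_score = score
                best = (i, j)
    return best

# Toy logits for a 6-token context: the model is most confident
# that the answer starts at token 2 and ends at token 3.
start = [0.1, 0.2, 3.0, 0.5, 0.1, 0.0]
end   = [0.0, 0.1, 0.4, 2.5, 0.3, 0.1]
print(best_span(start, end))  # (2, 3)
```

Real QA post-processing additionally maps token indices back to character offsets in the context; this sketch shows only the span-scoring step.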
## Intended uses & limitations
This model is designed for extractive question answering where the answer exists within a provided context.
It can be applied in reading comprehension tasks, chatbots, document search, automated quiz generation, educational tools, and research on transformer-based QA systems.
However, the model has limitations:
- It can only answer questions if the answer is present in the given text.
- It struggles with multi-hop reasoning, abstract inference, and answers requiring outside knowledge.
- Ambiguous or vague questions may result in incorrect spans.
- Performance may degrade on domains that differ significantly from Wikipedia (SQuAD’s source).
- It may reflect biases in the training data.
## Training and evaluation data
The model was fine-tuned on the Stanford Question Answering Dataset (SQuAD v1.1), a large-scale reading comprehension dataset consisting of over 100,000 question–answer pairs on Wikipedia articles.
- Training set: ~87,599 examples
- Validation set: ~10,570 examples
- Each example contains a context paragraph, a question, and the corresponding answer span within the paragraph.
Evaluation was performed on the SQuAD v1.1 validation set using Exact Match (EM) and F1 score metrics.
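The EM and F1 metrics can be sketched in a few lines (a simplified version of the official SQuAD evaluation logic; the official script additionally strips articles and punctuation during normalization):

```python
from collections import Counter

def normalize(text):
    # Simplified normalization: lowercase and collapse whitespace.
    return " ".join(text.lower().split())

def exact_match(prediction, gold):
    """1 if the normalized prediction equals the normalized gold answer."""
    return int(normalize(prediction) == normalize(gold))

def f1_score(prediction, gold):
    """Token-overlap F1 between prediction and gold answer."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Denver Broncos", "denver broncos"))            # 1
print(round(f1_score("the Denver Broncos", "Denver Broncos"), 2))  # 0.8
```

On the full validation set, both metrics are averaged over examples, taking the maximum score across the gold answers for each question.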
## Training procedure
- Base Model: The pre-trained distilbert-base-uncased transformer from Hugging Face.
- Tokenization: Used the model's corresponding tokenizer with:
  - `max_length=384`
  - `truncation='only_second'`
  - `stride=128` for a sliding window over long contexts
- Optimization:
  - Optimizer: AdamW
  - Learning rate: 3e-5
  - Weight decay: 0.01
  - Batch size: 16–32 (depending on GPU memory)
  - Epochs: 2–3
- Loss Function: Cross-entropy loss over start and end token positions.
- Evaluation: Computed Exact Match (EM) and F1 score after each epoch.
- Checkpointing: Best model saved based on highest F1 score on validation set.
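The sliding-window behavior can be illustrated without a tokenizer: when the tokenized (question, context) pair exceeds `max_length`, consecutive windows share `stride` tokens, so each new window advances by `max_length - stride` tokens (in practice, `tokenizer(..., truncation='only_second', stride=128, return_overflowing_tokens=True)` produces these chunks; the sketch below is a toy reconstruction of the offsets).

```python
def window_offsets(n_tokens, max_length=384, stride=128):
    """Start offsets of overlapping windows over a token sequence.
    Consecutive windows share `stride` tokens, mirroring the
    overflow behavior of Hugging Face fast tokenizers."""
    step = max_length - stride  # each new window advances by this much
    offsets = []
    start = 0
    while True:
        offsets.append(start)
        if start + max_length >= n_tokens:
            break  # this window already reaches the end of the sequence
        start += step
    return offsets

# A 1000-token context is split into four overlapping windows:
print(window_offsets(1000))  # [0, 256, 512, 768]
# A short context fits in one window:
print(window_offsets(300))   # [0]
```

During training, windows that do not contain the gold answer span are labeled with the start and end positions pointing at the CLS token.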
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
- mixed_precision_training: Native AMP
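The hyperparameters above map onto a `transformers.Trainer` setup along these lines (a sketch, not the exact training script; argument names follow the current `TrainingArguments` API, and `output_dir` is a placeholder):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="my-QA-model",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    weight_decay=0.01,
    lr_scheduler_type="linear",
    seed=42,
    fp16=True,  # Native AMP mixed precision
)
```

These arguments, together with a model, tokenizer, and tokenized dataset, are passed to `Trainer` for fine-tuning.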
### Training results
The model achieved the following results on the SQuAD v1.1 validation set:
| Metric | Score |
|---|---|
| Exact Match (EM) | 51% |
| F1 Score | 70.2% |
| Training Loss (final) | 0.64 |
These results demonstrate working extractive question answering, though they fall below typical DistilBERT fine-tuning baselines on SQuAD v1.1 (roughly 77 EM / 85 F1).
### Framework versions
- Transformers 4.55.0
- PyTorch 2.6.0+cu124
- Datasets 4.0.0
- Tokenizers 0.21.4