DistilBERT-based Music Era Classifier

This repository contains a fine-tuned text classification model based on distilbert-base-uncased. The model classifies short text descriptions of classical music eras into one of four historical eras, represented by the numeric labels 0, 1, 2, and 3.

Model Architecture & Training

The model was trained with the Hugging Face Trainer API. It uses the pre-trained distilbert-base-uncased model with a classification head on top.

Tokenizer: AutoTokenizer.from_pretrained("distilbert-base-uncased")
Model: AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")
Training Arguments:
Learning Rate: 2e-05
Epochs: 5
Batch Size: 8
Evaluation Strategy: Per epoch
Metric: accuracy
Optimizer: AdamW
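
The setup listed above can be sketched as follows. This is a minimal sketch, not the author's training script: `num_labels=4` is an assumption inferred from the four era labels, and `build_model_and_tokenizer` and `tokenize_batch` are hypothetical helper names. The `transformers` import sits inside the function so that the constants can be inspected without the library installed.

```python
BASE_CHECKPOINT = "distilbert-base-uncased"
NUM_LABELS = 4  # assumption: one output per era label (0, 1, 2, 3)


def build_model_and_tokenizer(checkpoint=BASE_CHECKPOINT, num_labels=NUM_LABELS):
    """Load the pre-trained encoder and attach a new classification head."""
    # Imported lazily; needs the `transformers` and `torch` packages.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return model, tokenizer


def tokenize_batch(tokenizer, texts):
    """Tokenize a list of descriptions, truncating to the model's max length."""
    return tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
```

The classification head created this way is randomly initialized, so predictions are only meaningful after fine-tuning.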

music-text-distilbert-predictor

This model is a fine-tuned version of distilbert-base-uncased on the samder03/2025-24679-text-dataset. It achieves the following results on the evaluation set:

Loss: 0.0495
Accuracy: 1.0
F1: 1.0
Precision: 1.0
Recall: 1.0
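
The card does not say how precision, recall, and F1 are averaged across the four classes. In the training results table, recall equals accuracy in every row, which is what support-weighted averaging produces, so the sketch below assumes weighted averaging. It is a plain-Python illustration of how the four numbers relate, not the metric code used for this model; `weighted_metrics` is a hypothetical name.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy and support-weighted precision, recall, and F1."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    n = len(y_true)
    support = Counter(y_true)      # number of true examples per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

    accuracy = sum(correct.values()) / n
    precision = recall = f1 = 0.0
    for label, count in support.items():
        p = correct[label] / predicted[label] if predicted[label] else 0.0
        r = correct[label] / count
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = count / n         # each class is weighted by its share of y_true
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

With weighted averaging, recall always equals accuracy, so a perfect score on one implies a perfect score on the other.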

Limitations

This model's primary limitations are:

Numerical Labels: The model outputs a numerical label (0, 1, 2, or 3). An external lookup table is required to map these numbers to their corresponding musical era names.

Language & Casing: As the model is based on distilbert-base-uncased, it is designed for English-language text and does not differentiate between uppercase and lowercase letters. It is not expected to work reliably for other languages.
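
For the numerical-label limitation above, a lookup table could look like the sketch below. The era names are placeholders, since this card does not list them; take the real names from the dataset documentation. `ID2LABEL` and `decode_label` are hypothetical names. The helper also accepts the `LABEL_<id>` strings that Hugging Face pipelines emit when a model has no label names configured.

```python
# Placeholder names: replace with the real era names from the dataset documentation.
ID2LABEL = {0: "era_0", 1: "era_1", 2: "era_2", 3: "era_3"}


def decode_label(raw_label, id2label=ID2LABEL):
    """Map a model output such as 2, "2", or "LABEL_2" to an era name."""
    if isinstance(raw_label, str):
        # Strip the default pipeline prefix if present (Python 3.9+).
        raw_label = raw_label.removeprefix("LABEL_")
    class_id = int(raw_label)
    if class_id not in id2label:
        raise KeyError(f"unknown class id: {class_id}")
    return id2label[class_id]
```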

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 8
eval_batch_size: 8
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH_FUSED) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 5
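
The hyperparameters above map onto `TrainingArguments` roughly as sketched below. This is a sketch under assumptions: the batch sizes are treated as per-device values on a single device, `eval_strategy="epoch"` reflects the per-epoch evaluation mentioned earlier in this card, and the `output_dir` value and the `build_training_args` name are hypothetical.

```python
HYPERPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 8,  # assumption: single device
    "per_device_eval_batch_size": 8,
    "num_train_epochs": 5,
    "seed": 42,
    "lr_scheduler_type": "linear",
    "optim": "adamw_torch_fused",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "eval_strategy": "epoch",
}


def build_training_args(output_dir="music-text-distilbert-predictor"):
    """Create TrainingArguments from the hyperparameters listed in this card."""
    # Imported lazily; needs the `transformers` and `torch` packages.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)
```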

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1	Precision	Recall
0.6387	1.0	80	0.5111	0.9563	0.9562	0.9574	0.9563
0.0833	2.0	160	0.1052	0.9812	0.9812	0.9814	0.9812
0.0221	3.0	240	0.0585	0.9812	0.9812	0.9814	0.9812
0.0122	4.0	320	0.0629	0.9812	0.9812	0.9814	0.9812
0.011	5.0	400	0.0614	0.9812	0.9812	0.9814	0.9812
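
The step counts in the table give a rough estimate of the training-set size. The sketch below assumes a single device, no gradient accumulation, and that a partial final batch is kept; none of these is stated in the card, and `estimate_train_examples` is a hypothetical helper.

```python
def estimate_train_examples(total_steps, num_epochs, batch_size):
    """Return the (min, max) training-set size implied by the step count."""
    steps_per_epoch, remainder = divmod(total_steps, num_epochs)
    if remainder:
        raise ValueError("total_steps is not a whole multiple of num_epochs")
    # The last batch of an epoch may be partial, holding 1..batch_size examples.
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


# 400 steps over 5 epochs at batch size 8 is 80 steps per epoch.
print(estimate_train_examples(total_steps=400, num_epochs=5, batch_size=8))  # (633, 640)
```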

Framework versions

Transformers 4.56.1
PyTorch 2.8.0+cu126
Datasets 4.0.0
Tokenizers 0.22.0

Potential Errors

There may be a data leakage problem, since accuracy on the evaluation set is 100%. The model was trained on the augmented data, which is derived from the original data, so the original dataset is not a true holdout set. The model is effectively being tested on data that it has already seen and, in some cases, memorized.
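
One common way to reduce this kind of leakage is to split by source example, so that an original text and all of its augmented variants land in the same split. The sketch below illustrates that idea; it is not how this model was trained. It assumes each record carries a hypothetical `source_id` field identifying the original it was derived from, which the dataset may not provide.

```python
import random
from collections import defaultdict


def group_split(records, group_key="source_id", eval_fraction=0.2, seed=42):
    """Split records so that every group lands entirely in train or eval."""
    groups = defaultdict(list)
    for record in records:
        groups[record[group_key]].append(record)

    # Shuffle group ids, not individual records, to keep variants together.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)

    n_eval = max(1, round(len(group_ids) * eval_fraction))
    train = [r for gid in group_ids[n_eval:] for r in groups[gid]]
    evaluation = [r for gid in group_ids[:n_eval] for r in groups[gid]]
    return train, evaluation
```

Evaluating on groups the model never saw in any form gives a more honest estimate than evaluating on originals whose augmented copies were in the training data.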

Model size

67M params, F32 tensors, stored as Safetensors
