DistilBERT-based Music Era Classifier

This repository contains a fine-tuned text classification model based on distilbert-base-uncased. The model classifies short text descriptions of classical music into one of four historical musical eras, encoded as the numerical labels 0, 1, 2, and 3.

Model Architecture & Training

The model was trained with the Hugging Face Trainer API. It uses a distilbert-base-uncased pre-trained model with a classification head on top; a minimal loading sketch follows the list below.

  • Tokenizer: AutoTokenizer.from_pretrained("distilbert-base-uncased")

  • Model: AutoModelForSequenceClassification.from_pretrained("distilbert-base-uncased")

  • Training Arguments: Learning rate 2e-5

  • Epochs: 5

  • Batch Size: 8

  • Evaluation Strategy: Per epoch

  • Metric: accuracy

  • Optimizer: AdamW
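
A minimal loading sketch in Python, assuming four labels and a `text` column holding the era descriptions (both details are assumptions, not stated on this card):

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the pre-trained backbone; num_labels=4 attaches a fresh classification
# head for the four era labels (0-3).
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=4
)

def tokenize(batch):
    # "text" is assumed to be the dataset column with the era descriptions.
    return tokenizer(batch["text"], truncation=True, padding=True)
```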

music-text-distilbert-predictor

This model is a fine-tuned version of distilbert-base-uncased on the samder03/2025-24679-text-dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the metrics):

  • Loss: 0.0495
  • Accuracy: 1.0
  • F1: 1.0
  • Precision: 1.0
  • Recall: 1.0
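
To query the fine-tuned checkpoint, here is a hedged inference sketch using the Transformers pipeline API; the example sentence and the commented output are illustrative only:

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub.
classifier = pipeline(
    "text-classification",
    model="its-zion-18/music-text-distilbert-predictor",
)

# Illustrative input; any short English description of a musical style works.
print(classifier("Ornate counterpoint and harpsichord continuo dominate the texture."))
# Output is a list like [{'label': ..., 'score': ...}]; the label is the numeric
# era id (or LABEL_<id> if no id2label mapping is stored with the model).
```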

Limitations

This model's primary limitations are:

Numerical Labels: The model outputs a numerical label (0, 1, 2, or 3). An external lookup table is required to map these numbers to their corresponding musical era names.
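
A minimal sketch of such a lookup table; the era names and their ordering below are hypothetical and must be replaced with the mapping actually used when the dataset was labeled:

```python
# Hypothetical id-to-era mapping -- the real assignment is defined by the
# training dataset and may differ from the order shown here.
ID2ERA = {0: "Baroque", 1: "Classical", 2: "Romantic", 3: "Modern"}

def to_era_name(predicted_id: int) -> str:
    # Convert the model's numeric label into a human-readable era name.
    return ID2ERA[predicted_id]
```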

Language & Casing: As the model is based on distilbert-base-uncased, it is designed for English-language text and does not differentiate between uppercase and lowercase letters. It will not work for other languages.

Training hyperparameters

The following hyperparameters were used during training (a Trainer sketch using these values follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
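
A sketch of how these values map onto the Trainer API, assuming the `model` from the loading sketch above and a tokenized dataset with `train`/`test` splits (the split names are assumptions):

```python
import numpy as np
import evaluate
from transformers import TrainingArguments, Trainer

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=preds, references=labels)

args = TrainingArguments(
    output_dir="music-text-distilbert-predictor",
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=5,
    eval_strategy="epoch",  # evaluate once per epoch
    seed=42,
)

trainer = Trainer(
    model=model,                       # from the loading sketch above
    args=args,
    train_dataset=tokenized["train"],  # assumed tokenized DatasetDict splits
    eval_dataset=tokenized["test"],
    compute_metrics=compute_metrics,
)
trainer.train()
```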

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 0.6387        | 1.0   | 80   | 0.5111          | 0.9563   | 0.9562 | 0.9574    | 0.9563 |
| 0.0833        | 2.0   | 160  | 0.1052          | 0.9812   | 0.9812 | 0.9814    | 0.9812 |
| 0.0221        | 3.0   | 240  | 0.0585          | 0.9812   | 0.9812 | 0.9814    | 0.9812 |
| 0.0122        | 4.0   | 320  | 0.0629          | 0.9812   | 0.9812 | 0.9814    | 0.9812 |
| 0.011         | 5.0   | 400  | 0.0614          | 0.9812   | 0.9812 | 0.9814    | 0.9812 |

Framework versions

  • Transformers 4.56.1
  • Pytorch 2.8.0+cu126
  • Datasets 4.0.0
  • Tokenizers 0.22.0

Potential Errors

There may be a data leakage problem, since accuracy reaches 100%. Because the model was trained on augmented data, which is a derivative of the original data, the original dataset is not a true holdout set: the model is effectively being tested on data it has already seen and, in some cases, memorized.
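
One way to avoid this is to split the original examples into training and holdout sets before augmentation and to augment only the training split. A minimal sketch, assuming the dataset exposes a single `train` split and using a placeholder `augment` function:

```python
from datasets import load_dataset

def augment(example):
    # Placeholder -- substitute the project's actual augmentation step here.
    return example

# Split the *original* examples first, then augment only the training half,
# so the evaluation set remains a true holdout.
original = load_dataset("samder03/2025-24679-text-dataset", split="train")
splits = original.train_test_split(test_size=0.2, seed=42)

train_augmented = splits["train"].map(augment)
holdout = splits["test"]  # never augmented, never seen during training
```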

Model size

  • 67M parameters (F32, Safetensors)