auto-g-nano

This is a minimal, decoder-only Transformer (nanoGPT-style) trained from scratch on the Tiny Shakespeare dataset.

Model Details

  • Architecture: Decoder-only Transformer
  • Parameters: ~10.8M
  • Vocabulary Size: 65
  • Embedding Dimension: 384
  • Heads: 6
  • Layers: 6
  • Block Size: 256
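The quoted parameter count can be sanity-checked from the hyperparameters above. The sketch below assumes nanoGPT's standard layout (tied input/output embeddings, 4x MLP expansion, learned positional embeddings) and ignores bias terms, so it is a slight underestimate:

```python
# Rough parameter count for a nanoGPT-style decoder-only Transformer.
# Assumes tied embeddings, 4x MLP expansion, and learned positional
# embeddings (standard nanoGPT layout); bias terms are ignored.
vocab_size, n_embd, n_layer, block_size = 65, 384, 6, 256

tok_emb = vocab_size * n_embd          # token embedding (tied with lm_head)
pos_emb = block_size * n_embd          # learned positional embedding

attn = 4 * n_embd * n_embd             # Q, K, V projections + output projection
mlp = 2 * n_embd * (4 * n_embd)        # up- and down-projection of the 4x MLP
ln = 2 * n_embd                        # two LayerNorm weight vectors per block
per_block = attn + mlp + ln

total = tok_emb + pos_emb + n_layer * per_block + n_embd  # + final LayerNorm
print(f"~{total / 1e6:.1f}M parameters")
```

This lands at roughly 10.7M weights, consistent with the ~10.8M figure once biases are included.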

How to Use

You can use this model directly with the GPT class from this repository:

import torch

from model import GPT

model = GPT.from_pretrained("geoffsee/auto-g-nano")
model.eval()

# Generate text from an empty (single zero-token) context
context = torch.zeros((1, 1), dtype=torch.long)
tokens = model.generate(context, max_new_tokens=100)
print(tokens)  # token ids; decode with the repository's character-level vocabulary

Training Data

Trained from scratch on the Tiny Shakespeare dataset (roughly 1 MB of concatenated Shakespeare text), tokenized at the character level; the 65 unique characters in the corpus form the model's vocabulary.
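With a character-level vocabulary, encoding and decoding reduce to lookup tables over the sorted set of unique characters, in the style of nanoGPT's Tiny Shakespeare preparation. A minimal sketch (the sample string stands in for the full corpus, whose real vocabulary has 65 characters):

```python
# Character-level tokenization sketch; `text` stands in for the full corpus.
text = "First Citizen:\nBefore we proceed any further, hear me speak.\n"

chars = sorted(set(text))                      # unique characters, sorted
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> token id
itos = {i: ch for i, ch in enumerate(chars)}   # token id -> char

def encode(s: str) -> list[int]:
    return [stoi[c] for c in s]

def decode(ids: list[int]) -> str:
    return "".join(itos[i] for i in ids)

assert decode(encode("hear me speak")) == "hear me speak"  # lossless round-trip
```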
