# auto-g-nano
This is a minimal, decoder-only Transformer (nanoGPT-style) trained from scratch on the Tiny Shakespeare dataset.
## Model Details

- Architecture: Decoder-only Transformer
- Parameters: ~10.8M
- Vocabulary Size: 65 (character-level)
- Embedding Dimension: 384
- Attention Heads: 6
- Layers: 6
- Block Size (context length): 256
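The ~10.8M figure is consistent with the hyperparameters above. Here is a rough, hedged count assuming the standard nanoGPT layout (tied input/output embeddings, learned positional embeddings, 4x MLP expansion, biases on linear layers and LayerNorms); these layout details are an assumption, not stated in this card:

```python
# Back-of-the-envelope parameter count for the configuration above,
# assuming the standard nanoGPT layout (an assumption, not stated here).
vocab_size, n_embd, n_layer, block_size = 65, 384, 6, 256

tok_emb = vocab_size * n_embd      # token embedding (weight tied with lm_head)
pos_emb = block_size * n_embd      # learned positional embedding
ln = 2 * n_embd                    # one LayerNorm: weight + bias

attn = n_embd * 3 * n_embd + 3 * n_embd   # fused QKV projection
attn += n_embd * n_embd + n_embd          # attention output projection
mlp = n_embd * 4 * n_embd + 4 * n_embd    # MLP up-projection
mlp += 4 * n_embd * n_embd + n_embd       # MLP down-projection
block = 2 * ln + attn + mlp               # per-layer total

total = tok_emb + pos_emb + n_layer * block + ln  # + final LayerNorm
print(f"{total:,} parameters (~{total / 1e6:.1f}M)")  # → 10,770,816 (~10.8M)
```

Under these assumptions the count lands at ~10.77M, matching the advertised ~10.8M.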
## How to Use

You can load the model with the `GPT` class from this repository:

```python
import torch
from model import GPT

model = GPT.from_pretrained("geoffsee/auto-g-nano")
model.eval()

# Generate text, starting from a single zero token as context
context = torch.zeros((1, 1), dtype=torch.long)
print(model.generate(context, max_new_tokens=100))
```

`generate` returns token IDs; decode them back to characters with the same character-level vocabulary used during training.
## Training Data

Trained from scratch on the Tiny Shakespeare dataset, a ~1 MB plain-text corpus of Shakespeare's plays, tokenized at the character level (65 unique characters).
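A vocabulary size of 65 suggests the usual character-level tokenization of Tiny Shakespeare: the vocab is simply the sorted set of unique characters in the corpus. A minimal sketch (the `text` string below is a stand-in excerpt; the real vocab is built from the full dataset file):

```python
# Character-level tokenizer sketch. `text` is a stand-in excerpt;
# in training, the vocab is built from the entire Tiny Shakespeare file.
text = "First Citizen:\nBefore we proceed any further, hear me speak.\n"
chars = sorted(set(text))
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> token id
itos = {i: ch for ch, i in stoi.items()}       # token id -> char

encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: "".join(itos[i] for i in ids)

print(decode(encode("hear me")))  # lossless round trip → "hear me"
```

On the full corpus this procedure yields exactly 65 tokens, matching the vocabulary size listed above.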