---
language: [en]
license: apache-2.0
tags:
- gpt2
- physics
- ibdp
- education
- tutor
datasets:
- custom
widget:
- text: "Explain Newton’s second law for IB Physics HL."
model-index:
- name: IB-Physics-Mini-GPT
  results: []
---

# IB-Physics-Mini-GPT (from-scratch tiny GPT-2)

A small GPT-2-style causal language model trained from scratch on a compact IB Physics HL corpus,
then lightly instruction-tuned for short Q&A. Purpose: to demonstrate end-to-end skill
(tokenizer → pretrain → SFT → eval → deploy on a Hugging Face Space).

**Why small?** It fits a student budget. **Why physics?** A narrow domain allows good coverage with little data.

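To make "small" concrete, the sketch below counts the parameters of a GPT-2-style decoder from its shape. The function name `gpt2_param_count` and the tiny configuration values are assumptions chosen for illustration; this README does not state the model's actual dimensions. The formula assumes the standard GPT-2 layout (learned position embeddings, fused QKV projection, 4x MLP, output layer tied to the token embeddings).

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Count trainable parameters of a GPT-2-style decoder with tied output embeddings."""
    d = n_embd
    embeddings = vocab_size * d + n_positions * d  # token + learned position tables
    attention = (d * 3 * d + 3 * d) + (d * d + d)  # fused QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)    # 4x expansion, then contraction
    layer_norms = 2 * (2 * d)                      # two LayerNorms per block (scale + bias)
    per_block = attention + mlp + layer_norms
    final_norm = 2 * d                             # LayerNorm after the last block
    return embeddings + n_layer * per_block + final_norm


# Sanity check against the GPT-2 small shape.
print(gpt2_param_count(50257, 1024, 768, 12))  # 124439808

# Hypothetical tiny configuration (assumed values, not taken from this repo).
tiny = gpt2_param_count(vocab_size=8000, n_positions=256, n_embd=256, n_layer=4)
print(f"{tiny / 1e6:.1f}M parameters")  # 5.3M parameters
```

The number of attention heads does not appear in the formula because splitting the embedding dimension across heads leaves the weight shapes unchanged. At this scale the vocabulary size dominates the budget, which is one reason a domain-specific tokenizer with a small vocabulary suits a tiny model.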
## Quickstart
```bash
pip install -r requirements.txt
# 1) prepare data
python train/prepare_corpus.py
python train/build_tokenizer.py
# 2) pretrain (tiny)
python train/pretrain.py
# 3) sft
python train/sft.py
# 4) sample
python train/gen_sample.py --prompt "Explain inertia in one sentence."
# 5) push to Hugging Face
python scripts/push_to_hf.py --repo your-username/ib-physics-mini-gpt
```

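If you prefer to run the Quickstart steps from a single script, one possible driver is sketched below. The script paths and the `--prompt` / `--repo` flags come from the Quickstart; the helper names `pipeline_commands` and `run_pipeline` are hypothetical and not part of this repo. Dependencies are assumed to be installed already.

```python
import subprocess
import sys


def pipeline_commands(prompt, repo):
    """Return the Quickstart steps, in order, as argument lists."""
    py = sys.executable  # reuse the interpreter that runs this driver
    return [
        [py, "train/prepare_corpus.py"],
        [py, "train/build_tokenizer.py"],
        [py, "train/pretrain.py"],
        [py, "train/sft.py"],
        [py, "train/gen_sample.py", "--prompt", prompt],
        [py, "scripts/push_to_hf.py", "--repo", repo],
    ]


def run_pipeline(commands, runner=subprocess.run):
    """Run each step in order; check=True stops at the first failing step."""
    for cmd in commands:
        print("running:", " ".join(cmd))
        runner(cmd, check=True)
```

Calling `run_pipeline(pipeline_commands("Explain inertia in one sentence.", "your-username/ib-physics-mini-gpt"))` from the repository root would execute the six scripts sequentially. Passing argument lists rather than a shell string avoids quoting problems with prompts that contain spaces or quotes.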
## Demo Space
This repo includes a Gradio app (`space_app/app.py`). To deploy it, create a Hugging Face Space,
point it at this folder, and set the Space SDK to Gradio (Python backend).

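The contents of `space_app/app.py` are not reproduced here. As a rough idea of what a minimal Gradio Q&A app for this kind of model could look like, here is a sketch that assumes the model has been pushed to the Hub and loads with the `transformers` text-generation pipeline. The function names and the repo id `your-username/ib-physics-mini-gpt` are placeholders, not the repo's actual code.

```python
def load_generator(repo_id="your-username/ib-physics-mini-gpt"):
    """Load a text-generation pipeline for the (placeholder) model repo."""
    from transformers import pipeline  # imported lazily so the helpers below stay light
    return pipeline("text-generation", model=repo_id)


def answer(question, generator, max_new_tokens=64):
    """Generate a short answer and strip the echoed prompt from the output."""
    question = question.strip()
    if not question:
        return "Please enter a question."
    outputs = generator(question, max_new_tokens=max_new_tokens, do_sample=False)
    text = outputs[0]["generated_text"]
    # By default the pipeline returns the prompt followed by the continuation.
    if text.startswith(question):
        text = text[len(question):]
    return text.strip()


def build_demo(generator):
    """Wrap `answer` in a single-textbox Gradio interface."""
    import gradio as gr
    return gr.Interface(
        fn=lambda q: answer(q, generator),
        inputs=gr.Textbox(label="Question"),
        outputs=gr.Textbox(label="Answer"),
        title="IB-Physics-Mini-GPT",
    )
```

In a Space, the app file would end with something like `build_demo(load_generator()).launch()`. Greedy decoding (`do_sample=False`) is used here only to keep the demo deterministic; sampling settings are a matter of taste for a model this small.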
## Notes
- Educational demo; not for safety-critical use.
- Inspired by classic GPT papers and hands-on books/videos.