---
library_name: transformers
license: apache-2.0
datasets:
- HuggingFaceFW/fineweb-edu
language:
- en
---

# Model Details

This model is a 1B-parameter Llama 3 model pretrained from scratch with torchtitan on fineweb-edu using the AdamW optimizer. Training followed the ~20x Chinchilla rule, for 20B tokens seen.

# How to use

```python
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="kz919/llama3_1b_chinchilla_8132025",
)
print(pipe("The key to life is"))
```

# Downstream Eval

## ARC, Hellaswag, Lambada_OpenAI, OpenbookQA, PIQA

```shell
lm_eval --model hf \
  --model_args pretrained=kz919/llama3_1b_chinchilla_8142025,dtype="bfloat16",add_bos_token=True \
  --tasks lambada_openai,hellaswag,piqa,arc_easy,arc_challenge,openbookqa \
  --device cuda:7 \
  --batch_size 8
```

| Tasks |Version|Filter|n-shot| Metric | | Value | |Stderr|
|--------------|------:|------|-----:|----------|---|------:|---|-----:|
|arc_challenge | 1|none | 0|acc |↑ | 0.2688|± |0.0130|
| | |none | 0|acc_norm |↑ | 0.2875|± |0.0132|
|arc_easy | 1|none | 0|acc |↑ | 0.6082|± |0.0100|
| | |none | 0|acc_norm |↑ | 0.5412|± |0.0102|
|hellaswag | 1|none | 0|acc |↑ | 0.3459|± |0.0047|
| | |none | 0|acc_norm |↑ | 0.4169|± |0.0049|
|lambada_openai| 1|none | 0|acc |↑ | 0.3311|± |0.0066|
| | |none | 0|perplexity|↓ |38.2983|± |1.5427|
|openbookqa | 1|none | 0|acc |↑ | 0.2340|± |0.0190|
| | |none | 0|acc_norm |↑ | 0.3500|± |0.0214|
|piqa | 1|none | 0|acc |↑ | 0.6795|± |0.0109|
| | |none | 0|acc_norm |↑ | 0.6774|± |0.0109|

## MMLU

| Groups |Version|Filter|n-shot|Metric| |Value | |Stderr|
|------------------|------:|------|------|------|---|-----:|---|-----:|
|mmlu | 2|none | |acc |↑ |0.2529|± |0.0037|
| - humanities | 2|none | |acc |↑ |0.2459|± |0.0063|
| - other | 2|none | |acc |↑ |0.2424|± |0.0077|
| - social sciences| 2|none | |acc |↑ |0.2697|± |0.0080|
| - stem | 2|none | |acc |↑ |0.2572|± |0.0078|
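# Token Budget

As a sanity check on the training setup above, the Chinchilla compute-optimal rule of roughly 20 training tokens per model parameter gives the 20B-token budget for a 1B-parameter model. A minimal sketch of that arithmetic (the variable names here are illustrative, not from the training code):

```python
# Chinchilla-style compute-optimal budget: ~20 training tokens per parameter.
params = 1_000_000_000        # 1B model parameters
tokens_per_param = 20         # Chinchilla ratio
token_budget = params * tokens_per_param

print(f"Token budget: {token_budget / 1e9:.0f}B")  # Token budget: 20B
```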