TaiPhone: A Phone-Scale LLM Rooted in Taiwanese Knowledge


TaiPhone is a low-cost, lightweight language model built for Traditional Chinese, with a strong focus on Taiwanese language, culture, and context. Trained on just 0.7 billion carefully curated tokens and enhanced with chat vector techniques, TaiPhone outperforms similarly sized open-source LLMs at the 1B and 3B scales. TaiPhone shows that, with the right data, effective and culturally aware models can be built at a fraction of the usual cost.
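The chat vector technique grafts chat ability onto a continually pretrained (CP) checkpoint by adding the weight difference between an instruction-tuned model and its base model. Below is a minimal sketch of that arithmetic; the CP checkpoint path (`path/to/taiphone-cp`) is a hypothetical placeholder, and this illustrates the general method rather than the authors' exact recipe:

```python
# Chat vector sketch: theta_chat = theta_cp + (theta_instruct - theta_base).
# Assumes all three checkpoints share the same architecture and parameter
# ordering; "path/to/taiphone-cp" is a hypothetical local CP checkpoint.
import torch
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B", torch_dtype=torch.bfloat16)
instruct = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-1B-Instruct", torch_dtype=torch.bfloat16)
cp = AutoModelForCausalLM.from_pretrained("path/to/taiphone-cp", torch_dtype=torch.bfloat16)

with torch.no_grad():
    for p_cp, p_base, p_inst in zip(cp.parameters(), base.parameters(), instruct.parameters()):
        # Add the "chat vector" (instruct minus base) to the CP weights in place.
        p_cp.add_(p_inst - p_base)

cp.save_pretrained("taiphone-chat-vector")
```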

Model Information

  • Base model: https://huggingface.co/meta-llama/Llama-3.2-1B
  • Context length: 16k
  • Training details:
    • Number of tokens: 0.7B
    • Continual pretraining (CP) epochs: 2
    • Fine-tuning (FT) epochs: 3
    • CP learning rate: 5e-5 with cosine scheduler
    • FT learning rate: 1e-5 with cosine scheduler
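
For reference, a minimal inference sketch using transformers; the prompt and generation settings below are illustrative defaults, not a recommended configuration:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "aqweteddy/Llama3.2-TaiPhone-1B-Instruct-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# Example Traditional Chinese prompt: "Introduce Taiwan's night market culture."
messages = [{"role": "user", "content": "請用繁體中文介紹台灣夜市文化。"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```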

Benchmark

MCQ Evaluation Process

  1. The model is prompted to answer each multiple-choice question in free form, without being constrained to a specific output format.
  2. A lightweight LLM (e.g., GPT-4.1-nano) is then used to extract the model's final selected option from its response.
  3. Accuracy is calculated by comparing the extracted answers against the correct choices (a sketch of this pipeline follows).
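
A sketch of this two-stage scoring loop, assuming an OpenAI-compatible client for the extractor model; the extraction prompt is illustrative, since the exact prompt used here is not published:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def extract_choice(free_form_answer: str, options: list[str]) -> str:
    """Step 2: ask a lightweight LLM to map a free-form answer to one option letter."""
    prompt = (
        "Given the model response below, reply with only the letter of the "
        f"selected option ({', '.join(options)}). Reply 'X' if none is chosen.\n\n"
        f"Response:\n{free_form_answer}"
    )
    result = client.chat.completions.create(
        model="gpt-4.1-nano",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return result.choices[0].message.content.strip()

def accuracy(responses: list[str], gold: list[str], options=("A", "B", "C", "D")) -> float:
    """Step 3: compare extracted letters against the answer key."""
    extracted = [extract_choice(r, list(options)) for r in responses]
    return sum(e == g for e, g in zip(extracted, gold)) / len(gold)
```

Using a separate extractor scores models on what they chose rather than penalizing them for deviating from a rigid answer format.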

Score

  • 1B Scale

    | Model | TW-MCQ | MMLU-Redux |
    | --- | --- | --- |
    | LLaMA3.2-1B-Instruct | 0.305 | 0.403 |
    | LLaMA3.2-1B-it-chinese-kyara | 0.360 | 0.405 |
    | LLaMA3.2-TaiPhone-1B-Instruct-v0.1 (Ours) | 0.375 | 0.421 |

  • 3B Scale

    | Model | TW-MCQ | MMLU-Redux |
    | --- | --- | --- |
    | LLaMA3.2-3B-Instruct | 0.442 | 0.569 |
    | LLaMA3.2-3B-it-chinese-kyara | 0.462 | 0.405 |
    | Llama-3.2-3B-F1-Instruct | 0.458 | 0.548 |
    | LLaMA3.2-TaiPhone-3B-Instruct-v0.1 (Ours) | 0.502 | 0.578 |