# Edge-Quant/EXAONE-4.0-1.2B-Q4_K_M-GGUF

This model was converted to GGUF format from `LGAI-EXAONE/EXAONE-4.0-1.2B` using llama.cpp via ggml.ai's GGUF-my-repo space.
Refer to the original model card for more details on the model.
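For reference, GGUF-my-repo performs the standard llama.cpp conversion flow. A minimal sketch of that flow, assuming a local llama.cpp checkout and the original weights downloaded to `./EXAONE-4.0-1.2B` (paths and filenames here are illustrative, not necessarily those used by the space):

```bash
# Convert the HF checkpoint to an F16 GGUF, then quantize to Q4_K_M.
python convert_hf_to_gguf.py ./EXAONE-4.0-1.2B --outfile exaone-4.0-1.2b-f16.gguf
./llama-quantize exaone-4.0-1.2b-f16.gguf exaone-4.0-1.2b-q4_k_m.gguf Q4_K_M
```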
## Benchmarks

### 1.2B Reasoning Mode
| | EXAONE 4.0 1.2B | EXAONE Deep 2.4B | Qwen 3 0.6B | Qwen 3 1.7B | SmolLM 3 3B |
|---|---|---|---|---|---|
| Model Size | 1.28B | 2.41B | 596M | 1.72B | 3.08B |
| Hybrid Reasoning | ✅ | | ✅ | ✅ | ✅ |
| **World Knowledge** | | | | | |
| MMLU-Redux | 71.5 | 68.9 | 55.6 | 73.9 | 74.8 |
| MMLU-Pro | 59.3 | 56.4 | 38.3 | 57.7 | 57.8 |
| GPQA-Diamond | 52.0 | 54.3 | 27.9 | 40.1 | 41.7 |
| **Math/Coding** | | | | | |
| AIME 2025 | 45.2 | 47.9 | 15.1 | 36.8 | 36.7 |
| HMMT Feb 2025 | 34.0 | 27.3 | 7.0 | 21.8 | 26.0 |
| LiveCodeBench v5 | 44.6 | 47.2 | 12.3 | 33.2 | 27.6 |
| LiveCodeBench v6 | 45.3 | 43.1 | 16.4 | 29.9 | 29.1 |
| **Instruction Following** | | | | | |
| IFEval | 67.8 | 71.0 | 59.2 | 72.5 | 71.2 |
| Multi-IF (EN) | 53.9 | 54.5 | 37.5 | 53.5 | 47.5 |
| **Agentic Tool Use** | | | | | |
| BFCL-v3 | 52.9 | N/A | 46.4 | 56.6 | 37.1 |
| Tau-Bench (Airline) | 20.5 | N/A | 22.0 | 31.0 | 37.0 |
| Tau-Bench (Retail) | 28.1 | N/A | 3.3 | 6.5 | 5.4 |
| **Multilinguality** | | | | | |
| KMMLU-Pro | 42.7 | 24.6 | 21.6 | 38.3 | 30.5 |
| KMMLU-Redux | 46.9 | 25.0 | 24.5 | 38.0 | 33.7 |
| KSM | 60.6 | 60.9 | 22.8 | 52.9 | 49.7 |
| MMMLU (ES) | 62.4 | 51.4 | 48.8 | 64.5 | 64.7 |
| MATH500 (ES) | 88.8 | 84.5 | 70.6 | 87.9 | 87.5 |
### 1.2B Non-Reasoning Mode
| | EXAONE 4.0 1.2B | Qwen 3 0.6B | Gemma 3 1B | Qwen 3 1.7B | SmolLM 3 3B |
|---|---|---|---|---|---|
| Model Size | 1.28B | 596M | 1.00B | 1.72B | 3.08B |
| Hybrid Reasoning | ✅ | ✅ | | ✅ | ✅ |
| **World Knowledge** | | | | | |
| MMLU-Redux | 66.9 | 44.6 | 40.9 | 63.4 | 65.0 |
| MMLU-Pro | 52.0 | 26.6 | 14.7 | 43.7 | 43.6 |
| GPQA-Diamond | 40.1 | 22.9 | 19.2 | 28.6 | 35.7 |
| **Math/Coding** | | | | | |
| AIME 2025 | 23.5 | 2.6 | 2.1 | 9.8 | 9.3 |
| HMMT Feb 2025 | 13.0 | 1.0 | 1.5 | 5.1 | 4.7 |
| LiveCodeBench v5 | 26.4 | 3.6 | 1.8 | 11.6 | 11.4 |
| LiveCodeBench v6 | 30.1 | 6.9 | 2.3 | 16.6 | 20.6 |
| **Instruction Following** | | | | | |
| IFEval | 74.7 | 54.5 | 80.2 | 68.2 | 76.7 |
| Multi-IF (EN) | 62.1 | 37.5 | 32.5 | 51.0 | 51.9 |
| **Long Context** | | | | | |
| HELMET | 41.2 | 21.1 | N/A | 33.8 | 38.6 |
| RULER | 77.4 | 55.1 | N/A | 65.9 | 66.3 |
| LongBench v1 | 36.9 | 32.4 | N/A | 41.9 | 39.9 |
| **Agentic Tool Use** | | | | | |
| BFCL-v3 | 55.7 | 44.1 | N/A | 52.2 | 47.3 |
| Tau-Bench (Airline) | 10.0 | 31.5 | N/A | 13.5 | 38.0 |
| Tau-Bench (Retail) | 21.7 | 5.7 | N/A | 4.6 | 6.7 |
| **Multilinguality** | | | | | |
| KMMLU-Pro | 37.5 | 24.6 | 9.7 | 29.5 | 27.6 |
| KMMLU-Redux | 40.4 | 22.8 | 19.4 | 29.8 | 26.4 |
| KSM | 26.3 | 0.1 | 22.8 | 16.3 | 16.1 |
| Ko-LongBench | 69.8 | 16.4 | N/A | 57.1 | 15.7 |
| MMMLU (ES) | 54.6 | 39.5 | 35.9 | 54.3 | 55.1 |
| MATH500 (ES) | 71.2 | 38.5 | 41.2 | 66.0 | 62.4 |
| WMT24++ (ES) | 65.9 | 58.2 | 76.9 | 76.7 | 84.0 |
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```
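As a quick sanity check that the binaries are on your PATH:

```bash
# Prints the llama.cpp build version if the install succeeded.
llama-cli --version
```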
Invoke the llama.cpp server or the CLI.

CLI:

```bash
llama-cli --hf-repo Edge-Quant/EXAONE-4.0-1.2B-Q4_K_M-GGUF --hf-file exaone-4.0-1.2b-q4_k_m.gguf -p "The meaning to life and the universe is"
```
Server:

```bash
llama-server --hf-repo Edge-Quant/EXAONE-4.0-1.2B-Q4_K_M-GGUF --hf-file exaone-4.0-1.2b-q4_k_m.gguf -c 2048
```
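Once the server is up (it listens on port 8080 by default), you can query its OpenAI-compatible chat endpoint. A minimal example with curl (the prompt is illustrative):

```bash
# Send a chat request to the local llama-server instance.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "messages": [
          {"role": "user", "content": "Explain GGUF quantization in one sentence."}
        ],
        "temperature": 0.7
      }'
```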
Note: You can also use this checkpoint directly through the usage steps listed in the llama.cpp repo.
Step 1: Clone llama.cpp from GitHub.

```bash
git clone https://github.com/ggerganov/llama.cpp
```
Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag along with other hardware-specific flags (e.g., `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).

```bash
cd llama.cpp && LLAMA_CURL=1 make
```
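Note that recent llama.cpp checkouts have replaced the Makefile build with CMake, so the `make` invocation above may no longer work. Assuming a current checkout, the equivalent build would be:

```bash
# CURL support enables the --hf-repo download feature;
# add -DGGML_CUDA=ON for NVIDIA GPUs.
cmake -B build -DLLAMA_CURL=ON
cmake --build build --config Release
```

With CMake, the binaries land in `build/bin/` rather than the repo root.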
Step 3: Run inference through the main binary.

```bash
./llama-cli --hf-repo Edge-Quant/EXAONE-4.0-1.2B-Q4_K_M-GGUF --hf-file exaone-4.0-1.2b-q4_k_m.gguf -p "The meaning to life and the universe is"
```

or

```bash
./llama-server --hf-repo Edge-Quant/EXAONE-4.0-1.2B-Q4_K_M-GGUF --hf-file exaone-4.0-1.2b-q4_k_m.gguf -c 2048
```
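If you prefer to manage the download yourself rather than rely on `--hf-repo`, you can fetch the GGUF file first and point the binary at it with `-m`. A sketch using `huggingface-cli`:

```bash
# Download the quantized file to the current directory, then run it locally.
huggingface-cli download Edge-Quant/EXAONE-4.0-1.2B-Q4_K_M-GGUF \
  exaone-4.0-1.2b-q4_k_m.gguf --local-dir .
./llama-cli -m exaone-4.0-1.2b-q4_k_m.gguf -p "The meaning to life and the universe is"
```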