manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF

These models was converted to GGUF format from manthilaffs/Gamunu-4B-Instruct-Alpha using llama.cpp. Refer to the original model card for more details on the model.

Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

brew install llama.cpp

Invoke the llama.cpp server or the CLI.

llama-cli --hf-repo manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF --hf-file gamunu-4b-instruct-alpha-BF16.gguf -p "Hello! how are you?"

llama-cli --hf-repo manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF --hf-file gamunu-4b-instruct-alpha-q8_0.gguf -p "Hello! how are you?"

llama-server --hf-repo manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF --hf-file gamunu-4b-instruct-alpha-BF16.gguf -c 2048

llama-server --hf-repo manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF --hf-file gamunu-4b-instruct-alpha-q8_0.gguf -c 2048

GGUF

Model size

4B params

Architecture

gemma3

Hardware compatibility

8-bit

16-bit

Base model

Finetuned

Quantized

(3)

this model