manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF

These models was converted to GGUF format from manthilaffs/Gamunu-4B-Instruct-Alpha using llama.cpp. Refer to the original model card for more details on the model.

Use with llama.cpp

Install llama.cpp through brew (works on Mac and Linux)

brew install llama.cpp

Invoke the llama.cpp server or the CLI.

CLI:

llama-cli --hf-repo manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF --hf-file gamunu-4b-instruct-alpha-BF16.gguf -p "Hello! how are you?"
llama-cli --hf-repo manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF --hf-file gamunu-4b-instruct-alpha-q8_0.gguf -p "Hello! how are you?"

Server:

llama-server --hf-repo manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF --hf-file gamunu-4b-instruct-alpha-BF16.gguf -c 2048
llama-server --hf-repo manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF --hf-file gamunu-4b-instruct-alpha-q8_0.gguf -c 2048
Downloads last month
434
GGUF
Model size
4B params
Architecture
gemma3
Hardware compatibility
Log In to view the estimation

8-bit

16-bit

Inference Providers NEW
This model isn't deployed by any Inference Provider. ๐Ÿ™‹ Ask for provider support

Model tree for manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF

Quantized
(3)
this model

Collection including manthilaffs/Gamunu-4B-Instruct-Alpha-GGUF