To run this model, check out OpenArc.
Muse-12B-int4_asym-ov
This model was converted to OpenVINO IR using weight-only compression to int4_asym.
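For reference, here is a minimal sketch of this kind of conversion using optimum-intel; the source repo and export settings below are assumptions, not the exact recipe used for this model.

```python
# Hypothetical conversion sketch; the source repo and settings below
# are assumptions, not the exact recipe used for this model.
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

# bits=4 with sym=False gives asymmetric int4 weight-only compression
quant_config = OVWeightQuantizationConfig(bits=4, sym=False)

model = OVModelForCausalLM.from_pretrained(
    "LatitudeGames/Muse-12B",        # assumed source repo
    export=True,                     # convert to OpenVINO IR on load
    quantization_config=quant_config,
)
model.save_pretrained("Muse-12B-int4_asym-ov")
```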
Muse-12B has been exceptional so far.
It doesn't shy away from intense topics, and refusals haven't been a problem. At this time we don't have access to all of the samplers recommended by LatitudeGames, but I haven't seen major degradation. Long-context performance remains strong, and with some scaffolding it could be a reliable workhorse, though it's sometimes a bit verbose.
Another interesting use case has been tinkering inside talk_to_llm.py. This demo hooks Muse-12B up with Whisper and Kokoro through the OpenArc server.
It's a very interesting way to experience a text adventure.
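As a rough sketch of driving the text side of that loop yourself, assuming OpenArc exposes an OpenAI-compatible chat endpoint (the base URL, port, and served model name below are assumptions):

```python
# Rough sketch of a text-adventure turn against an OpenAI-compatible
# server; base_url, api_key handling, and model name are assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="Muse-12B-int4_asym-ov",  # assumed served model name
    messages=[
        {"role": "system", "content": "You are the narrator of a text adventure."},
        {"role": "user", "content": "I push open the rusted door."},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```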
Performance on A770
Results were captured using openarc bench.
Very nice.
openarc bench selects input tokens by sampling the entire vocabulary using a similar approach to llama-bench.
input tokens: [512]
max tokens: [128]
runs: 5
Muse-12B-int4_asym-ov
| run | p   | n   | ttft (s) | tpot (ms) | prefill (t/s) | decode (t/s) | duration (s) |
|-----|-----|-----|----------|-----------|---------------|--------------|--------------|
| 1   | 512 | 128 | 0.20     | 28.92     | 2623.5        | 34.6         | 3.87         |
| 2   | 512 | 128 | 0.16     | 28.92     | 3134.4        | 34.6         | 3.84         |
| 3   | 512 | 128 | 0.17     | 28.89     | 3007.7        | 34.6         | 3.84         |
| 4   | 512 | 128 | 0.17     | 28.88     | 3045.6        | 34.6         | 3.84         |
| 5   | 512 | 128 | 0.17     | 28.91     | 2998.3        | 34.6         | 3.84         |
Total: 5 runs
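To sanity-check numbers like these outside of OpenArc, openvino-genai reports per-generation performance metrics. A minimal sketch, assuming the IR folder sits in the current directory:

```python
# Minimal sketch of collecting similar latency metrics with openvino-genai;
# the model path and prompt are placeholders.
import openvino_genai as ov_genai

pipe = ov_genai.LLMPipeline("Muse-12B-int4_asym-ov", "GPU")
result = pipe.generate("Once upon a time", max_new_tokens=128)

metrics = result.perf_metrics
print(f"ttft:       {metrics.get_ttft().mean / 1000:.2f} s")  # time to first token
print(f"tpot:       {metrics.get_tpot().mean:.2f} ms")        # time per output token
print(f"throughput: {metrics.get_throughput().mean:.1f} t/s") # decode throughput
```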
System:
- Xeon W-2255
- 128GB DDR4 ECC
- ASRock A770
- Ubuntu 24.04 (kernel 6.14.4-061404-generic)
- openvino 2025.3.0
- openvino-genai 2025.3.0.0
Base model: mistralai/Mistral-Nemo-Base-2407