GGUF
vllm
mistral-common
unsloth
conversational

Thinking not enabled in Ollama

#4
by erat-verbum - opened

I tried pulling:

ollama run hf.co/unsloth/Magistral-Small-2509-GGUF:UD-Q4_K_XL

but it looks like its Modelfile does not list thinking as one of its capabilities:

ollama show hf.co/unsloth/Magistral-Small-2509-GGUF:UD-Q4_K_XL
  Model
    architecture        llama      
    parameters          23.6B      
    context length      131072     
    embedding length    5120       
    quantization        unknown    

  Capabilities
    completion    
    tools         
    vision        

  Projector
    architecture        clip       
    parameters          438.96M    
    embedding length    1024       
    dimensions          5120       

  Parameters
    stop              "</s>"    
    temperature       0.7       
    min_p             0.01      
    repeat_penalty    1         
    top_p             0.95      

  System
    First draft your thinking process (inner monologue) until you arrive at a response. Format your         
      response using Markdown, and use LaTeX for any mathematical equations. Write both your thoughts and     
      the response in the same language as the input.                                                         
    Your thinking process must follow the template below:[THINK]Your thoughts or/and draft, like working    
      through an exercise on scratch paper. Be as casual and as long as you want until you are confident      
      to generate the response. Use the same language as the input.[/THINK]Here, provide a self-contained     
      response.           
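A quick way to confirm this programmatically is to parse the Capabilities section of the `ollama show` output above. A minimal sketch (the `capabilities` helper is hypothetical, written only for illustration; the sample text is an abridged copy of the output pasted in this discussion):

```python
# Minimal sketch: extract the entries listed under the "Capabilities"
# heading of `ollama show` output and check whether "thinking" appears.

SHOW_OUTPUT = """\
  Model
    architecture        llama
    parameters          23.6B

  Capabilities
    completion
    tools
    vision

  Projector
    architecture        clip
"""

def capabilities(show_text: str) -> list[str]:
    """Return the entries listed under the Capabilities heading."""
    caps, in_caps = [], False
    for line in show_text.splitlines():
        stripped = line.strip()
        if not stripped:
            in_caps = False          # a blank line ends the section
        elif stripped == "Capabilities":
            in_caps = True
        elif in_caps:
            caps.append(stripped)
    return caps

print(capabilities(SHOW_OUTPUT))              # ['completion', 'tools', 'vision']
print("thinking" in capabilities(SHOW_OUTPUT))  # False
```

For this model the check comes back `False`: only `completion`, `tools`, and `vision` are advertised, so Ollama will not run the model in thinking mode.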

Without thinking enabled, the key feature that distinguishes Magistral from Mistral is kind of missing 🀷
