How to use the model correctly?

#1
by limcheekin - opened

Hi there,

Thanks for sharing the model.

I'm serving the model with llama.cpp's server using the following command:

./bin/llama-server -m ./models/RolmOCR.Q4_K_M.gguf -c 4096
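For context, vision GGUF models served with llama.cpp generally also need the companion multimodal projector file passed via `--mmproj`; a sketch of a fuller invocation (the projector filename here is an assumption, not from the original post):

```shell
# Sketch only: llama.cpp vision models usually ship with a separate
# mmproj GGUF that must be loaded alongside the main model weights.
# The mmproj filename below is a guess; port 8080 is llama-server's default.
./bin/llama-server \
  -m ./models/RolmOCR.Q4_K_M.gguf \
  --mmproj ./models/mmproj-RolmOCR.gguf \
  -c 4096 \
  --host 127.0.0.1 --port 8080
```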

I'm invoking the API using the `openai` Python package with the following code:

    from typing import Any, Dict, List

    # client (an OpenAI instance), prompt, base64_image, model, and
    # max_tokens are defined elsewhere in the script.
    messages_payload: List[Dict[str, Any]] = [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": f"data:image/png;base64,{base64_image}"
                    }
                }
            ]
        }
    ]

    response = client.chat.completions.create(
        model=model,
        messages=messages_payload,
        max_tokens=max_tokens,
        timeout=90
    )
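For reference, the payload construction above can be written as a small self-contained sketch (the helper name, file path, dummy API key, and port are assumptions, not from the original post):

```python
import base64
from typing import Any, Dict, List


def build_vision_messages(prompt: str, image_bytes: bytes) -> List[Dict[str, Any]]:
    """Build an OpenAI-style chat payload with one text part and one inline image."""
    # OpenAI-compatible chat APIs accept inline images as base64 data URLs.
    b64_image = base64.b64encode(image_bytes).decode("utf-8")
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/png;base64,{b64_image}"},
                },
            ],
        }
    ]


if __name__ == "__main__":
    # Assumed usage against a local llama-server instance; 8080 is
    # llama-server's default port and the api_key value is a dummy.
    from openai import OpenAI

    client = OpenAI(base_url="http://localhost:8080/v1", api_key="sk-no-key")
    with open("page.png", "rb") as f:  # hypothetical input file
        messages = build_vision_messages("Transcribe this page.", f.read())
    response = client.chat.completions.create(
        model="RolmOCR", messages=messages, max_tokens=1024, timeout=90
    )
    print(response.choices[0].message.content)
```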

The server throws a 500 error. May I know how to use it correctly?

Please advise. Thank you.
