ValueError: Must flatten tensors with uniform dtype but got torch.bfloat16 and torch.float8_e4m3fn
#82, opened by ajtakto
How do you deal with the fact that different layers in ds are in different data types? I am trying to run the model on GPUs with 60 GB of memory each and need to use FSDP.
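For anyone hitting the same error, here is a minimal sketch for finding where the dtypes are mixed. The error message says the tensors being flattened together must share one dtype, so it may help to first see which modules hold parameters of more than one dtype. The helper names `dtypes_by_module` and `mixed_dtype_modules` are hypothetical. The code assumes only that the model exposes the standard PyTorch `named_parameters()` method, and it groups parameters by their immediate parent module, which is not necessarily how FSDP groups them in your wrapping policy. It diagnoses the problem; it does not fix it.

```python
from collections import defaultdict


def dtypes_by_module(model):
    """Map each parent module path to the set of dtypes of its direct parameters.

    `model` is assumed to be a torch.nn.Module (anything with named_parameters()).
    """
    groups = defaultdict(set)
    for name, param in model.named_parameters():
        # "layers.0.mlp.weight" -> parent path "layers.0.mlp"
        parent = name.rpartition(".")[0]
        groups[parent].add(str(param.dtype))
    return dict(groups)


def mixed_dtype_modules(model):
    """Return only the modules whose direct parameters have more than one dtype."""
    return {
        module: sorted(dtypes)
        for module, dtypes in dtypes_by_module(model).items()
        if len(dtypes) > 1
    }
```

Calling `mixed_dtype_modules(model)` on the loaded model, before wrapping it, should list the module paths that combine, for example, `torch.bfloat16` and `torch.float8_e4m3fn` parameters. If the result is empty, the mix is happening across modules that are flattened together rather than inside a single module, and `dtypes_by_module(model)` shows the dtype of each one.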