Error with transformers 4.51.3: RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same
Using pixtral-12b with transformers 4.51.3 gives the following error: `RuntimeError: Input type (torch.cuda.FloatTensor) and weight type (torch.cuda.HalfTensor) should be the same`
Downgrading to transformers 4.48.3 fixes it, but I'd like to be able to use the latest transformers version.
@david-crynge
Can you share your inference code? It seems you need to cast the inputs to the correct dtype with `inputs = inputs.to(torch.float16)` before calling generate.
I am facing the same issue when using QLoRA with 4-bit NF4 quantization, float16 compute and quant-storage dtypes, and double quantization enabled. The weights are stored as half tensors, as they should be, but the input stays float32 despite `compute_dtype=float16`. Any comment on how to fix this @RaushanTurganbay?
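For clarity, the quantization setup described above corresponds to a `BitsAndBytesConfig` like the following (a config sketch, not the poster's actual code); with `compute_dtype=float16`, float32 processor outputs trigger the same mismatch, so the same explicit cast of the inputs applies:

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit NF4 quantization, fp16 compute and quant storage, double quant enabled
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_storage=torch.float16,
    bnb_4bit_use_double_quant=True,
)
# Quantized layers compute in float16, so float32 inputs raise the
# Input/weight dtype mismatch; cast them first:
#   inputs = inputs.to(torch.float16)
```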