How to run this model

#1
by nguyenquivinhquang - opened

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon-H1-1.5B-Instruct-GPTQ-Int8")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon-H1-1.5B-Instruct-GPTQ-Int8")

input_text = "Explain quantum computing simply."
# The tokenized inputs are moved to CUDA, but the model is left on its default device.
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)

I tried the code above, but I realized that the model itself was never moved to CUDA (only the tokenized inputs were).
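
A minimal sketch of one possible fix (not the official recipe from the model card): pass device_map="cuda" when loading so the quantized weights are placed on the GPU, then keep the inputs on model.device. This assumes a CUDA-capable GPU, accelerate installed for device_map, and the GPTQ dependencies this checkpoint requires; max_new_tokens=128 is only an illustrative value.

# Sketch: load the GPTQ checkpoint directly onto the GPU so the model and
# the tokenized inputs end up on the same device.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1-1.5B-Instruct-GPTQ-Int8"
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda")
tokenizer = AutoTokenizer.from_pretrained(model_id)

input_text = "Explain quantum computing simply."
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# max_new_tokens=128 is an arbitrary example value.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))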
