How to run this model
by nguyenquivinhquang
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon-H1-1.5B-Instruct-GPTQ-Int8")
tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon-H1-1.5B-Instruct-GPTQ-Int8")

input_text = "Explain quantum computing simply."
input_ids = tokenizer(input_text, return_tensors="pt").to("cuda")

outputs = model.generate(**input_ids)
```
I tried the code above, but I noticed that the model itself is never moved to CUDA (only the tokenized inputs are), so generation does not run on the GPU.
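Is something like the following the right way to put the model on the GPU? This is only a sketch assuming a CUDA device is available and the `accelerate` package is installed (needed for `device_map`); the `max_new_tokens` value is an arbitrary choice for the example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1-1.5B-Instruct-GPTQ-Int8"

# Load the quantized model directly onto the GPU (device_map needs `accelerate`).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="cuda")
tokenizer = AutoTokenizer.from_pretrained(model_id)

input_text = "Explain quantum computing simply."
# Put the tokenized inputs on the same device as the model.
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)

# max_new_tokens is an arbitrary value for this example.
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```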