# Llama-3.1-8B LoRA — Movie Rating (4-bit)

A lightweight LoRA adapter on top of `unsloth/Meta-Llama-3.1-8B-bnb-4bit` that predicts a film's TMDB `vote_average` (0–10, one decimal place) from a structured "movie-features" prompt.
## Model Details
| Field | Value |
|---|---|
| Developed by | YijingOlivia |
| Base model | unsloth/Meta-Llama-3.1-8B-bnb-4bit |
| LoRA rank / targets | r = 16, targets q_proj, v_proj |
| Trainable params | ≈ 2 M |
| Quantisation | 4-bit NF4 (bitsandbytes) |
| Language | English prompt / numeric output |
| License | Apache-2.0 |
| Hardware | Google Colab T4 (15 GB VRAM) |
| Precision | bf16 activations + 8-bit optimiser |
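
The table values correspond to a standard PEFT + bitsandbytes setup. The snippet below is a minimal sketch of how an adapter with this configuration could be reproduced; the `lora_alpha`, `lora_dropout`, and `output_dir` values are illustrative assumptions, not the exact settings used in training.

```python
import torch
from transformers import AutoModelForCausalLM, TrainingArguments
from peft import LoraConfig, get_peft_model

# The unsloth checkpoint already ships in 4-bit NF4 (bitsandbytes).
base_model = AutoModelForCausalLM.from_pretrained(
    "unsloth/Meta-Llama-3.1-8B-bnb-4bit",
    device_map="auto",
    torch_dtype=torch.bfloat16,  # bf16 activations, as in the table
)

# LoRA rank and target modules from the table; alpha/dropout are assumptions.
lora_cfg = LoraConfig(
    r=16,
    target_modules=["q_proj", "v_proj"],
    lora_alpha=32,      # assumption, not documented in the table
    lora_dropout=0.05,  # assumption, not documented in the table
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(base_model, lora_cfg)
model.print_trainable_parameters()  # should roughly match the "Trainable params" row

# 8-bit paged optimiser + bf16, per the "Precision" row (other arguments omitted).
train_args = TrainingArguments(output_dir="outputs", optim="paged_adamw_8bit", bf16=True)
```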
### Model Sources

- **Repository:** [YijingOlivia/llama3-movie-rating-lora](https://huggingface.co/YijingOlivia/llama3-movie-rating-lora)
## Uses
### Direct Use
Feed the model a prompt that follows the training template, generate at most 8 new tokens, and take the first floating-point number in the completion as the predicted rating (see the parsing sketch after the example below).
## Example & How to Get Started
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base = "unsloth/Meta-Llama-3.1-8B-bnb-4bit"
lora = "YijingOlivia/llama3-movie-rating-lora"

# Load the base model in 4-bit NF4, matching the training quantisation.
bnb_cfg = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")

tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=bnb_cfg,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(base_model, lora).eval()

# The prompt must follow the training template exactly.
prompt = (
    "### MOVIE FEATURES\n"
    "title: The Bourne Identity\n"
    "budget: 60000000\n"
    "runtime: 119\n"
    "genres_names: Action, Thriller\n"
    "actor_1_name: Matt Damon\n"
    "director_1_name: Doug Liman\n"
    "imdb_rating: 7.9\n\n"
    "### TASK\n"
    "Predict this movie's TMDB rating on a 0–10 scale (one decimal place).\n\n"
    "### ANSWER\n"
)

ids = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**ids, max_new_tokens=8)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
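
To apply the "first floating-point number" rule from Direct Use, a small post-processing step turns the raw completion into a rating. This is a minimal sketch; the regex and the slicing of prompt tokens are illustrative, not part of the released code.

```python
import re

# Decode only the newly generated tokens, then extract the first number.
completion = tokenizer.decode(out[0][ids["input_ids"].shape[1]:], skip_special_tokens=True)
match = re.search(r"\d+(?:\.\d+)?", completion)
rating = float(match.group()) if match else None
print(rating)  # predicted TMDB rating as a float, e.g. on the 0-10 scale
```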