# Llama-3-8B LoRA — Movie Rating (4-bit)

A lightweight LoRA adapter on top of `unsloth/Meta-Llama-3.1-8B-bnb-4bit` that predicts a film’s TMDB `vote_average` (0–10, one decimal place) from a structured “movie-features” prompt.

## Model Details

| Field | Value |
|---|---|
| Developed by | YijingOlivia |
| Base model | `unsloth/Meta-Llama-3.1-8B-bnb-4bit` |
| LoRA rank / targets | r = 16; targets `q_proj`, `v_proj` |
| Trainable params | ≈ 2 M |
| Quantisation | 4-bit NF4 (bitsandbytes) |
| Language | English prompt / numeric output |
| License | Apache-2.0 |
| Hardware | Google Colab T4 (15 GB VRAM) |
| Precision | bf16 activations + 8-bit optimiser |
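
These hyper-parameters correspond roughly to the following PEFT setup. This is a sketch, not the exact training script: `lora_alpha`, `lora_dropout`, the choice of paged AdamW as the 8-bit optimiser, and the batch size are assumptions not stated in the card.

```python
from peft import LoraConfig
from transformers import TrainingArguments

# LoRA settings from the table: rank 16 on the attention query/value projections.
lora_cfg = LoraConfig(
    r=16,
    target_modules=["q_proj", "v_proj"],
    lora_alpha=32,       # assumption: alpha is not stated in the card
    lora_dropout=0.0,    # assumption: dropout is not stated in the card
    bias="none",
    task_type="CAUSAL_LM",
)

# Precision row: bf16 activations with an 8-bit optimiser
# (assuming paged AdamW 8-bit, a common bitsandbytes choice).
train_args = TrainingArguments(
    output_dir="llama3-movie-rating-lora",  # placeholder path
    bf16=True,
    optim="paged_adamw_8bit",
    per_device_train_batch_size=2,          # placeholder: sized for a 15 GB T4
)
```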

## Model Sources

- **Repository:** <https://huggingface.co/YijingOlivia/llama3-movie-rating-lora>

## Uses

### Direct Use

Feed the model a prompt that follows the training template, generate at most 8 new tokens, and take the first floating-point number in the completion as the predicted rating (see the extraction snippet after the example below).

## Example & How to Get Started

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig
from peft import PeftModel

base = "unsloth/Meta-Llama-3.1-8B-bnb-4bit"
lora = "YijingOlivia/llama3-movie-rating-lora"

# 4-bit NF4 quantisation; bf16 compute matches the precision noted above
bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base, trust_remote_code=True)
base_model = AutoModelForCausalLM.from_pretrained(
    base,
    quantization_config=bnb_cfg,
    device_map="auto",
    trust_remote_code=True,
)

# Attach the LoRA adapter and switch to inference mode
model = PeftModel.from_pretrained(base_model, lora).eval()

# The prompt must follow the training template exactly
prompt = (
    "### MOVIE FEATURES\n"
    "title: The Bourne Identity\n"
    "budget: 60000000\n"
    "runtime: 119\n"
    "genres_names: Action, Thriller\n"
    "actor_1_name: Matt Damon\n"
    "director_1_name: Doug Liman\n"
    "imdb_rating: 7.9\n\n"
    "### TASK\n"
    "Predict this movie's TMDB rating on a 0–10 scale (one decimal place).\n\n"
    "### ANSWER\n"
)

ids = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**ids, max_new_tokens=8, do_sample=False)  # greedy decoding for a stable number
print(tokenizer.decode(out[0], skip_special_tokens=True))
```
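
The decoded text contains the prompt plus the generated answer, so pull the first floating-point number out of the part after the `### ANSWER` marker. A minimal sketch, continuing from the example above; `extract_rating` is an illustrative helper, not part of this repo:

```python
import re

def extract_rating(text: str) -> float | None:
    """Return the first floating-point number after the ### ANSWER marker."""
    answer = text.split("### ANSWER", 1)[-1]
    match = re.search(r"\d+(?:\.\d+)?", answer)
    return float(match.group()) if match else None

rating = extract_rating(tokenizer.decode(out[0], skip_special_tokens=True))
print(rating)  # predicted TMDB vote_average, e.g. a value like 7.1
```

With greedy decoding (`do_sample=False`), the same prompt always yields the same rating.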