# Gemma 3 270M - Kiliki Language Fine-tuned Model
This model is a fine-tuned version of `google/gemma-3-270m-it`, trained with QLoRA (Quantized Low-Rank Adaptation) for English-to-Kiliki translation.
## Model Details

- Base Model: `google/gemma-3-270m-it`
- Fine-tuning Method: QLoRA (4-bit quantization + LoRA adapters)
- Training Dataset: `kiliki_dataset_10k.csv` (7,528 unique English-Kiliki translation pairs)
- Model Size: adapter weights only (a few MB); the base model is required for inference
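The adapter stays small because a LoRA pair for a `d_in x d_out` projection at rank `r` adds only `r * (d_in + d_out)` parameters. A rough back-of-the-envelope sketch (the 640x640 shape below is illustrative, not Gemma's actual layer dimensions):

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Parameters added by one LoRA pair: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)

# Hypothetical 640x640 projection at rank 64
params = lora_param_count(640, 640, 64)
print(params)                  # 81920
print(params * 2 / 1e6, "MB")  # bf16 = 2 bytes/param -> ~0.16 MB per adapted matrix
```

Summed over every targeted projection in every layer, this still lands in the low megabytes, versus hundreds of MB for the full model.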
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

# Load the base model with 4-bit NF4 quantization
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3-270m-it",
    quantization_config=bnb_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3-270m-it")

# Attach the QLoRA adapter
model = PeftModel.from_pretrained(model, "droidnext/gemma_3_270m_kiliki_language")

# Generate a translation
messages = [{"role": "user", "content": "Hello"}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50)

# Decode only the newly generated tokens (slice off the prompt)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True))
```
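For reference, `apply_chat_template` turns the message list into a plain string before tokenization. A minimal hand-rolled sketch of what that string looks like for a single user turn, assuming the Gemma-family `<start_of_turn>`/`<end_of_turn>` format (the tokenizer normally also prepends `<bos>` during tokenization):

```python
def gemma_prompt(user_message: str) -> str:
    """Sketch of a single-turn Gemma-style chat prompt (illustrative, not the official template)."""
    return (
        f"<start_of_turn>user\n{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

print(gemma_prompt("Hello"))
```

In practice, always use the tokenizer's own `apply_chat_template` so the prompt matches what the model saw during fine-tuning.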
## Training Details
- Training Dataset: 7,528 English-Kiliki translation pairs
- Training Split: 80% train / 20% test
- Method: QLoRA (4-bit quantization with LoRA rank 64)