# Gemma-3-270M Dhivehi — Instruction-Tuned
Compact Dhivehi (ދިވެހި) assistant based on `google/gemma-3-270m-it`, tuned to follow an instruction plus an optional input (context) and produce concise, helpful answers in Dhivehi.
## Model details
- Base: `google/gemma-3-270m-it`
- Language: Dhivehi
- Style: Helpful, friendly, concise
- Context format (see the sketch below):
  - `instruction`: the user request/question
  - `input` (optional): supporting context the model should rely on
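For illustration, here is a minimal sketch of how the two fields are combined into a single user turn. It mirrors the message-list example below; the placeholder strings are hypothetical, and the `instruction:`/`input:` labels are plain text, not special tokens. The instruction-only layout is an assumption (the examples below always provide an input).

```python
# Minimal sketch of the prompt layout (placeholders, not real Dhivehi text).
instruction = "<Dhivehi question or request>"
input_text = "<optional grounding context>"

# With context: label both fields, separated by a blank line.
user_message = f"instruction: {instruction}\n\ninput: {input_text}"

# Without context (assumed layout): send the labeled instruction on its own.
user_message_no_context = f"instruction: {instruction}"
```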
## Intended use
- Dhivehi Q&A with short to medium answers grounded in a provided context
- Simple rewriting and news-style drafting when an instruction is given
- General assistant responses in Dhivehi
Not intended for: open-domain factual lookup without context, legal/medical advice, or long multi-turn planning.
## How to use
### Message list API
Pass a fresh conversation each time (no history needed). Extract only the assistant’s content.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_path = "alakxender/gemma-3-270m-dhivehi-ctx"

model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype="auto",
    device_map="auto",
    attn_implementation="eager",
)
tok = AutoTokenizer.from_pretrained(model_path)
pipe = pipeline("text-generation", model=model, tokenizer=tok)

system_prompt = "You are a Dhivehi assistant. You are helpful and friendly."

instruction = "ދިވެހިރާއްޖޭގެ ޖޯގްރަފީ އާއި ބެހޭ ގޮތުން ލިޔެފައިވާ މުހިންމު ނުކުތާތަކަކީ ކޮބާ؟"

input_text = """ދިވެހިރާއްޖެއަކީ އިންޑިޔާ ކަނޑުގައި އޮންނަ ޖަޒީރާ ޤައުމެކެވެ. ދިވެހިރާއްޖެ އޮންނަނީ ސްރިލަންކާގެ ހުޅުނަގުގައި ދެކުނު އޭޝިޔާ ބައްރުގައެވެ. ދިވެހިރާއްޖެއަކީ މިނިވަންކަމާއި އިސްތިޤުލާލު ދިފާޢުކޮށް ރިއްކާތެރިކުރަމުން އަންނަ، ދިވެހީންގެ މިލްކުވެރި ސަރަހައްދެވެ."""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user", "content": f"instruction: {instruction}\n\ninput: {input_text}"},
]

do_sample = False

if do_sample:
    print("Using sampling parameters")
    # Use sampling parameters only when do_sample is True
    gen_kwargs = {
        "max_new_tokens": 256,
        "temperature": 0.2,
        "top_p": 0.2,
        "top_k": 5,
        "do_sample": True,
        "disable_compile": True,
    }
else:
    print("Using greedy decoding")
    # Use only valid parameters for greedy decoding
    gen_kwargs = {
        "max_new_tokens": 256,
        "disable_compile": True,
    }

out = pipe(messages, **gen_kwargs)
assistant = next((m["content"] for m in out[0]["generated_text"] if m["role"] == "assistant"), "")
print(assistant)
# Response: ދިވެހިރާއްޖެ އަކީ އިންޑިޔާ ކަނޑުގައި އޮންނަ ޖަޒީރާއެއް ކަމަށާއި، ދިވެހިރާއްޖެ އަކީ މުޅި ދިވެހިރާއްޖެއަށް ބޮޑެތި ތަރައްޤީތަކެއް ލިބިފައިވާ ޤައުމެއް ކަމަށެވެ.
```
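When the pipeline is called with a list of chat messages, `out[0]["generated_text"]` comes back as the full message list with the newly generated assistant turn appended, which is why the snippet filters for `role == "assistant"` instead of slicing a string.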
### Using the chat template directly
Build a single prompt string from the user message and read only the generated suffix.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_path = "alakxender/gemma-3-270m-dhivehi-ctx"

model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype="auto", device_map="auto", attn_implementation="eager"
)
tok = AutoTokenizer.from_pretrained(model_path)
gen = pipeline("text-generation", model=model, tokenizer=tok)

instruction = """ދިވެހިންގެ މައިގަނޑު ސިފަތަކަކީ ކޮބައިހެއްޔެވެ؟"""

input_text = """ދިވެހިންނަކީ ދިވެހިރާއްޖޭގެ އަހުލުވެރިންނެވެ. ކޮންމެ ދިވެއްސަކީ މުސްލިމެކެވެ. ދިވެހީންގެ ނަސްލަކީ އެތައް ސަތޭކަ ޤަރުނެއް ކަނޑައްތުކޮށް ދިވެހި ކާބާފައިންގެ ފަރާތުން ވާރުތަވެފައިވާ މުސްތަޤިއްލު ނަސްލެކެވެ. ދިވެހިންނަކީ އަމިއްލަ މިނިވަން ބަހަކުން ވާހަކަދައްކަމުންދާ، ޞައްޙަ އިސްލާމީ ޢަޤީދާގެ މަތީގައި ތިބި، އަމިއްލަ ބަހެއް ލިބިފައިވާ ބައެކެވެ."""

user_only = [{"role": "user", "content": f"{instruction}\n\n{input_text}"}]
prompt = tok.apply_chat_template(user_only, tokenize=False, add_generation_prompt=True)

do_sample = False

if do_sample:
    # Use sampling parameters only when do_sample is True
    gen_kwargs = {
        "max_new_tokens": 256,
        "temperature": 0.2,
        "top_p": 0.2,
        "top_k": 5,
        "do_sample": True,
        "disable_compile": True,
    }
else:
    # Use only valid parameters for greedy decoding
    gen_kwargs = {
        "max_new_tokens": 256,
        "disable_compile": True,
    }

resp = gen(prompt, **gen_kwargs)
answer = resp[0]["generated_text"][len(prompt):].strip()
print(answer)
# Response: ދިވެހިންނަކީ ދިވެހިރާއްޖޭގެ އަހުލުވެރިންނެވެ. ދިވެހީންގެ ނަސްލަކީ އެތައް ސަތޭކަ ޤަރުނެއް ކަނޑައްތުކޮށް ދިވެހި ކާބާފައިންގެ ފަރާތުން ވާރުތަވެފައިވާ މުސްތަޤިއްލު ނަސްލެކެވެ. ދިވެހިންނަކީ އަމިއްލަ މިނިވަން ބަހަކުން ވާހަކަދައްކަމުންދާ، ޞައްޙަ އިސްލާމީ ޢަޤީދާގެ މަތީގައި ތިބި ބައެކެވެ.
```
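Because the pipeline is called with a plain prompt string here, `generated_text` is returned as the prompt followed by the continuation; slicing off `len(prompt)` characters leaves only the model's answer.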
## Generation tips
- Prefer `do_sample=False` for faithful, deterministic answers grounded in the provided context.
- If you must sample: `temperature` ≈ 0.2–0.7, `top_p` ≈ 0.2–0.9 (see the sketch after this list).
- Keep `max_new_tokens` modest (e.g., 128–384) for focused outputs.
- Dhivehi is RTL; use an environment that renders RTL text correctly.
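As a sketch only, a sampled call might look like the following. The specific values are illustrative picks from the ranges above, not tuned settings, and `pipe`/`messages` are the objects built in the message-list example.

```python
# Illustrative sampling setup within the recommended ranges; not a tuned configuration.
sampling_kwargs = {
    "max_new_tokens": 256,   # modest budget keeps answers focused
    "do_sample": True,
    "temperature": 0.4,      # somewhere in ~0.2–0.7
    "top_p": 0.9,            # somewhere in ~0.2–0.9
    "disable_compile": True,
}
out = pipe(messages, **sampling_kwargs)
```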
## Prompting patterns
### Contextual QA

- `instruction`: Ask a direct question
- `input`: Provide the grounding paragraph or bullet points

### Rewrite/Summarize

- `instruction`: Specify the transformation (summarize, simplify, rephrase)
- `input`: Paste the original text

### Constrained format

- `instruction`: State length or style constraints
- `input`: Provide facts; the model should not add external information
Note: These prompting patterns are guidelines only; not all are fully tested, and results may vary. The sketch below shows one way to phrase the Rewrite/Summarize pattern.
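This is an untested sketch of the Rewrite/Summarize pattern: the placeholder strings stand in for real Dhivehi text, and `pipe` is the pipeline from the message-list example.

```python
# Rewrite/Summarize pattern: the instruction names the transformation, the input carries the text.
instruction = "<Dhivehi instruction, e.g. 'summarize the text below in two sentences'>"
input_text = "<original Dhivehi text to transform>"

messages = [
    {"role": "system", "content": "You are a Dhivehi assistant. You are helpful and friendly."},
    {"role": "user", "content": f"instruction: {instruction}\n\ninput: {input_text}"},
]
out = pipe(messages, max_new_tokens=256, disable_compile=True)
summary = next((m["content"] for m in out[0]["generated_text"] if m["role"] == "assistant"), "")
print(summary)
```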
## Limitations
- May hallucinate if no input is given or if the instruction demands external facts.
- Domain expertise is limited to patterns seen in the training mixture; verify critical claims.
- Long multi-section documents may exceed the context window; provide only the necessary excerpts (see the sketch below).
- The model is not trained to follow instructions written in languages other than Dhivehi.
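One way to keep excerpts within budget (an assumption, not something the card prescribes) is to check the tokenized length with the same tokenizer (`tok` from the examples above) before building the prompt:

```python
# Hypothetical pre-check: measure the excerpt's token count before adding it as input.
excerpt = "<long Dhivehi excerpt>"
n_tokens = len(tok.encode(excerpt))
if n_tokens > 1024:  # illustrative budget; choose a limit that fits the model's context window
    print(f"Excerpt is {n_tokens} tokens; trim it to the passages the question actually needs.")
```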