Gemma-3-270M Dhivehi — Instruction-Tuned

Compact Dhivehi (ދިވެހި) assistant based on google/gemma-3-270m-it, tuned to follow an instruction plus an optional input (context) and produce concise, helpful answers in Dhivehi.

Model details

  • Base: google/gemma-3-270m-it
  • Language: Dhivehi
  • Style: Helpful, friendly, concise
  • Context format:
    • instruction: the user request/question
    • input (optional): supporting context the model should rely on
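
Both fields are packed into a single user message. The first example below labels them explicitly ("instruction:", "input:"), while the second simply concatenates them. A minimal sketch of the labeled form, with placeholder strings:

# Hypothetical composition of the user message from the two fields
instruction = "<Dhivehi request>"
input_text  = "<optional grounding context>"
user_message = f"instruction: {instruction}\n\ninput: {input_text}"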

Intended use

  • Dhivehi Q&A with short to medium answers grounded in a provided context
  • Simple rewriting and news-style drafting when an instruction is given
  • General assistant responses in Dhivehi

Not intended for: open-domain factual lookup without context, legal/medical advice, or long multi-turn planning.

How to use

Message list API

Pass a fresh conversation each time (no history needed). Extract only the assistant’s content.

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_path = "alakxender/gemma-3-270m-dhivehi-ctx"

model = AutoModelForCausalLM.from_pretrained(
    model_path, 
    torch_dtype="auto", 
    device_map="auto", 
    attn_implementation="eager"
)
tok = AutoTokenizer.from_pretrained(model_path)
pipe = pipeline("text-generation", model=model, tokenizer=tok)

system_prompt = "You are a Dhivehi assistant. You are helpful and friendly."
instruction   = "ދިވެހިރާއްޖޭގެ ޖޯގްރަފީ އާއި ބެހޭ ގޮތުން ލިޔެފައިވާ މުހިންމު ނުކުތާތަކަކީ ކޮބާ؟"
input_text    = """ދިވެހިރާއްޖެއަކީ އިންޑިޔާ ކަނޑުގައި އޮންނަ ޖަޒީރާ ޤައުމެކެވެ. ދިވެހިރާއްޖެ އޮންނަނީ ސްރިލަންކާގެ ހުޅުނަގުގައި ދެކުނު އޭޝިޔާ ބައްރުގައެވެ. ދިވެހިރާއްޖެއަކީ މިނިވަންކަމާއި އިސްތިޤުލާލު ދިފާޢުކޮށް ރިއްކާތެރިކުރަމުން އަންނަ، ދިވެހީންގެ މިލްކުވެރި ސަރަހައްދެވެ."""

messages = [
    {"role": "system", "content": system_prompt},
    {"role": "user",   "content": f"instruction: {instruction}\n\ninput: {input_text}"},
]

do_sample = False
if do_sample:
    print("Using sampling parameters")
    # Use sampling parameters only when do_sample is True
    gen_kwargs = {
        "max_new_tokens": 256,
        "temperature": 0.2,
        "top_p": 0.2,
        "top_k": 5,
        "do_sample": True,
        "disable_compile": True
    }
else:
    print("Using greedy decoding")
    # Use only valid parameters for greedy decoding
    gen_kwargs = {
        "max_new_tokens": 256,
        "disable_compile": True
    }
out = pipe(messages, **gen_kwargs)
assistant = next((m["content"] for m in out[0]["generated_text"] if m["role"] == "assistant"), "")
print(assistant)

# Response: ދިވެހިރާއްޖެ އަކީ އިންޑިޔާ ކަނޑުގައި އޮންނަ ޖަޒީރާއެއް ކަމަށާއި، ދިވެހިރާއްޖެ އަކީ މުޅި ދިވެހިރާއްޖެއަށް ބޮޑެތި ތަރައްޤީތަކެއް ލިބިފައިވާ ޤައުމެއް ކަމަށެވެ.
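
Because no history is needed, the same pipeline can serve several independent requests in a loop. A minimal sketch reusing pipe, system_prompt, and gen_kwargs from the example above:

# Each request is a fresh two-message conversation; nothing carries over.
questions = [
    (instruction, input_text),  # append more (instruction, input) pairs here
]
for ins, ctx in questions:
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user",   "content": f"instruction: {ins}\n\ninput: {ctx}"},
    ]
    out = pipe(messages, **gen_kwargs)
    # The assistant's reply is appended as the last message of the returned conversation.
    print(out[0]["generated_text"][-1]["content"])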

Using the chat template directly

Build a single prompt string from the user message and read only the generated suffix.

from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_path = "alakxender/gemma-3-270m-dhivehi-ctx"

model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype="auto", device_map="auto", attn_implementation="eager"
)
tok = AutoTokenizer.from_pretrained(model_path)
gen = pipeline("text-generation", model=model, tokenizer=tok)

instruction   = """ދިވެހިންގެ މައިގަނޑު ސިފަތަކަކީ ކޮބައިހެއްޔެވެ؟"""
input_text    = """ދިވެހިންނަކީ ދިވެހިރާއްޖޭގެ އަހުލުވެރިންނެވެ. ކޮންމެ ދިވެއްސަކީ މުސްލިމެކެވެ. ދިވެހީންގެ ނަސްލަކީ އެތައް ސަތޭކަ ޤަރުނެއް ކަނޑައްތުކޮށް ދިވެހި ކާބާފައިންގެ ފަރާތުން ވާރުތަވެފައިވާ މުސްތަޤިއްލު ނަސްލެކެވެ. ދިވެހިންނަކީ އަމިއްލަ މިނިވަން ބަހަކުން ވާހަކަދައްކަމުންދާ، ޞައްޙަ އިސްލާމީ ޢަޤީދާގެ މަތީގައި ތިބި، އަމިއްލަ ބަހެއް ލިބިފައިވާ ބައެކެވެ."""

user_only = [{"role": "user", "content": f"{instruction}\n\n{input_text}"}]
prompt = tok.apply_chat_template(user_only, tokenize=False, add_generation_prompt=True)

do_sample = False
if do_sample:
    # Use sampling parameters only when do_sample is True
    gen_kwargs = {
        "max_new_tokens": 256,
        "temperature": 0.2,
        "top_p": 0.2,
        "top_k": 5,
        "do_sample": True,
        "disable_compile": True
    }
else:
    # Use only valid parameters for greedy decoding
    gen_kwargs = {
        "max_new_tokens": 256,
        "disable_compile": True
    }

resp = gen(prompt, **gen_kwargs)
answer = resp[0]["generated_text"][len(prompt):].strip()
print(answer)

# Response: ދިވެހިންނަކީ ދިވެހިރާއްޖޭގެ އަހުލުވެރިންނެވެ. ދިވެހީންގެ ނަސްލަކީ އެތައް ސަތޭކަ ޤަރުނެއް ކަނޑައްތުކޮށް ދިވެހި ކާބާފައިންގެ ފަރާތުން ވާރުތަވެފައިވާ މުސްތަޤިއްލު ނަސްލެކެވެ. ދިވެހިންނަކީ އަމިއްލަ މިނިވަން ބަހަކުން ވާހަކަދައްކަމުންދާ، ޞައްޙަ އިސްލާމީ ޢަޤީދާގެ މަތީގައި ތިބި ބައެކެވެ.

Generation tips

  • Prefer do_sample=False for faithful, deterministic answers grounded in the provided context.
  • If you must sample: temperature≈0.2–0.7, top_p≈0.2–0.9.
  • Keep max_new_tokens modest (e.g., 128–384) for focused outputs.
  • Dhivehi is RTL; use an environment that renders RTL text correctly.
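
Putting these together, a mid-range sampling configuration could look like this (values are illustrative, chosen from the ranges above):

# Hypothetical sampling settings; pass as pipe(messages, **sampling_kwargs)
sampling_kwargs = {
    "max_new_tokens": 256,  # modest budget keeps answers focused
    "do_sample": True,
    "temperature": 0.5,     # within the suggested 0.2-0.7 range
    "top_p": 0.7,           # within the suggested 0.2-0.9 range
    "disable_compile": True,
}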

Prompting patterns

  • Contextual QA
    • instruction: Ask a direct question
    • input: Provide the grounding paragraph or bullet points
  • Rewrite/Summarize (illustrated below)
    • instruction: Specify the transformation (summarize, simplify, rephrase)
    • input: Paste the original text
  • Constrained format
    • instruction: State length or style constraints
    • input: Provide facts; the model should not add external information

Note: Prompting patterns are guidelines only — not all are fully tested, and results may vary.
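
As one illustration, a Rewrite/Summarize request could be sent like this (reusing pipe from the examples above; the bracketed strings are placeholders and should be written in Dhivehi, per the Limitations below):

# Sketch of the Rewrite/Summarize pattern; strings are placeholders.
messages = [
    {"role": "system", "content": "You are a Dhivehi assistant. You are helpful and friendly."},
    {"role": "user",   "content": "instruction: <Dhivehi: summarize in two sentences>\n\ninput: <original Dhivehi text>"},
]
out = pipe(messages, max_new_tokens=192, disable_compile=True)
print(out[0]["generated_text"][-1]["content"])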

Limitations

  • May hallucinate if no input is given or if the instruction demands external facts.
  • Domain expertise is limited to patterns seen in the training mixture; verify critical claims.
  • Long multi-section documents may exceed the context window; provide only the necessary excerpts (see the token-count sketch after this list).
  • The model is not trained to follow instructions that are not in Dhivehi.
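
For the context-length point, a quick pre-flight check with the tokenizer from the examples above (a sketch; the attribute lookup is hedged, since the config field name can vary across model classes):

# Count prompt tokens and compare against the model's context window.
n_tokens = len(tok(prompt)["input_ids"])
limit = getattr(model.config, "max_position_embeddings", None)
if limit is not None and n_tokens > limit - 256:  # leave headroom for max_new_tokens
    print(f"Prompt is {n_tokens} tokens; trim the input to the relevant excerpt.")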