Anki-SLM-Siksha

Transform any study material into perfect Anki flashcards — instantly.

Anki-SLM-Siksha is a fine-tuned small language model built by AnkTechsol on top of meta-llama/Llama-3.2-3B-Instruct. It is purpose-trained to convert raw study notes, definitions, and textbook content into high-quality, exam-ready Anki flashcards in structured JSON format — making it the ideal AI backbone for any spaced-repetition learning application.

Siksha (Sanskrit: शिक्षा) means education — a fitting name for a model built to make learning smarter.


Model Details

  • Base Model: meta-llama/Llama-3.2-3B-Instruct
  • Model Size: 3B parameters
  • Fine-Tuning Method: LoRA (rank=16) via Unsloth
  • Quantization: 4-bit during training; released weights in BF16
  • Task: Flashcard Generation (Q&A)
  • Training Examples: 150 curated conversations
  • Context Window: 128K tokens
  • Organization: AnkTechsol
  • License: Apache 2.0

What It Does

Given any piece of study material, Anki-SLM-Siksha outputs a single, well-structured flashcard as a JSON object:

{
  "front": "What is osmosis?",
  "back": "The movement of water molecules through a semipermeable membrane from lower to higher solute concentration."
}

The model is trained to:

  • Extract the single most important concept from any input text
  • Generate a clear, concise question on the front optimized for active recall
  • Write a precise, exam-ready answer on the back
  • Output only valid JSON — no extra commentary, no hallucinated content
  • Handle topics across science, math, CS, history, business, AI/ML, and more
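Because the model is trained to emit only a JSON object with exactly "front" and "back" fields, downstream code should still validate that contract before storing a card. A minimal sketch (the `parse_flashcard` helper is illustrative, not part of the model's tooling):

```python
import json


def parse_flashcard(raw: str) -> dict:
    """Parse and validate a model response as a single flashcard.

    Raises ValueError if the output is not a JSON object with exactly
    non-empty 'front' and 'back' string fields.
    """
    card = json.loads(raw)
    if not isinstance(card, dict) or set(card) != {"front", "back"}:
        raise ValueError(f"expected exactly 'front' and 'back' fields, got: {raw!r}")
    if not all(isinstance(card[k], str) and card[k].strip() for k in ("front", "back")):
        raise ValueError("'front' and 'back' must be non-empty strings")
    return card
```

Validation like this lets an application reject the occasional malformed generation and retry instead of saving a broken card.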

Quick Start

With Unsloth (Recommended)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="anktechsol/anki-slm-siksha",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

SYSTEM_PROMPT = """You are anki-slm-siksha, an AI assistant that converts study material into high-quality Anki flashcards. Always output exactly one flashcard as a valid JSON object with 'front' and 'back' fields only."""

def generate_flashcard(note: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Create a flashcard:\n\n{note}"}
    ]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to("cuda")
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
    )
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True
    )

print(generate_flashcard("The mitochondria is the powerhouse of the cell."))
# Output: {"front": "What is the powerhouse of the cell?", "back": "The mitochondrion."}
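Small models sometimes wrap the JSON in markdown fences or stray whitespace. A defensive sketch that extracts the first JSON object from the raw generation before parsing (the `extract_json` helper is an assumption of ours, not shipped with the model):

```python
import json
import re


def extract_json(text: str) -> dict:
    """Extract the first JSON object from model output.

    Strips markdown fences or surrounding text that a small model may
    occasionally add around the flashcard object.
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError(f"no JSON object found in: {text!r}")
    return json.loads(match.group(0))
```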

With HuggingFace Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("anktechsol/anki-slm-siksha")
model = AutoModelForCausalLM.from_pretrained(
    "anktechsol/anki-slm-siksha",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "user", "content": "Create a flashcard:\n\nDNA stands for deoxyribonucleic acid and carries genetic information."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))

Training Details

Dataset

  • 150 handcrafted conversations spanning 10+ subject domains
  • Topics include: Biology, Chemistry, Physics, Mathematics, Computer Science, Networking, Machine Learning, Economics, Business, and Cybersecurity
  • Each example follows the user → assistant format with structured JSON output

Fine-Tuning Configuration

# LoRA Config
r = 16
lora_alpha = 16
lora_dropout = 0
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]

# Training Config
num_train_epochs = 3
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
learning_rate = 2e-4
optimizer = "adamw_8bit"
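From these numbers the run's step count can be worked out directly (our arithmetic, not a logged value): the effective batch size is per-device batch × gradient accumulation, and with 150 examples that gives roughly 19 optimizer steps per epoch, about 57 in total over 3 epochs.

```python
# Derived from the training config above.
num_examples = 150
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
num_train_epochs = 3

effective_batch = per_device_train_batch_size * gradient_accumulation_steps  # 8
steps_per_epoch = -(-num_examples // effective_batch)  # ceiling division -> 19
total_steps = steps_per_epoch * num_train_epochs       # 57
print(effective_batch, steps_per_epoch, total_steps)
```

A step count this small is consistent with the ~6-minute training time reported below.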

Infrastructure

  • Trained on Google Colab T4 GPU (15GB VRAM) — free tier
  • Training time: ~6 minutes
  • Framework: Unsloth + HuggingFace TRL
  • Generated via TuneKit

System Prompt

For best results, use this system prompt at inference time:

You are anki-slm-siksha, an AI assistant that converts study material into high-quality Anki flashcards. Always:
- Read the user's text carefully and identify the single most important concept or fact.
- Output exactly one flashcard per request as a valid JSON object with two fields: "front" and "back".
- Make the "front" a clear, concise question or cloze deletion suitable for active recall.
- Make the "back" a precise, exam-ready answer, avoiding unnecessary extra details.
- Use simple, student-friendly language while preserving technical accuracy.
- Do not include any explanation, commentary, or formatting outside the JSON object.

Example Outputs

Input: Photosynthesis is the process by which green plants use sunlight to synthesize nutrients from CO2 and water.
Front: What is photosynthesis?
Back: Photosynthesis is the process by which green plants use sunlight to synthesize nutrients from CO2 and water.

Input: A neural network is a computational model inspired by the brain, made of interconnected nodes.
Front: What is a neural network in AI?
Back: A computational model inspired by the brain, made of interconnected nodes called neurons.

Input: Encryption is the process of converting information into a coded form to prevent unauthorized access.
Front: What is encryption?
Back: The process of converting information into a coded form to prevent unauthorized access.

Input: A primary key is a unique identifier for each record in a database table.
Front: What is a primary key in a database?
Back: A unique identifier for each record in a table.
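Cards like these can be imported into Anki as tab-separated text (File → Import). A minimal sketch of the export step, assuming cards are already parsed into dicts with "front" and "back" keys (the `cards_to_tsv` helper is illustrative):

```python
import csv
import io


def cards_to_tsv(cards: list[dict]) -> str:
    """Serialize flashcards to tab-separated text that Anki can import."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for card in cards:
        writer.writerow([card["front"], card["back"]])
    return buf.getvalue()
```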

Use Cases

  • Students — Instantly convert lecture notes and textbook chapters into revision-ready flashcard decks
  • Educators — Auto-generate quiz banks and study guides at scale
  • EdTech Apps — Embed as an AI backend in spaced-repetition learning tools
  • Self-Learners — Supercharge any learning workflow with AI-assisted card creation
  • LMS Integrations — Connect to platforms like Anki, Mochi, RemNote, or any custom learning system

Limitations

  • Optimized for English-language inputs only
  • Best performance on factual, definitional content — creative or ambiguous notes may produce lower-quality cards
  • Output is designed to be a single flashcard; multi-card generation requires prompt-level iteration
  • As with all LLMs, occasional hallucination is possible — always review generated cards before adding to a production deck
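Since the model emits one card per request, multi-card generation means chunking the source material and calling the model once per chunk (for example with the generate_flashcard helper from Quick Start). A minimal paragraph-based splitter, as one possible approach:

```python
def split_notes(text: str, max_chars: int = 400) -> list[str]:
    """Split study notes into paragraph-sized chunks, one per flashcard.

    Paragraphs are packed greedily up to max_chars; each resulting chunk
    is sent to the model as a separate request.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```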

About AnkTechsol

AnkTechsol is an AI and Data Engineering startup building intelligent tools for education, automation, and enterprise intelligence.

Key Products:

  • Vidyantrik — AI-powered learning platform
  • Career10x — AI-driven career acceleration
  • DataDhan — Enterprise data solutions
  • AnkiGPT / Anki-SLM-Siksha — Open-source AI for spaced repetition

Follow our work on LinkedIn | GitHub


Citation

If you use this model in your research or product, please cite:

@misc{anktechsol2026ankislmsiksha,
  title        = {Anki-SLM-Siksha: A Fine-Tuned LLM for Spaced-Repetition Flashcard Generation},
  author       = {AnkTechsol},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/anktechsol/anki-slm-siksha}},
  note         = {Fine-tuned from meta-llama/Llama-3.2-3B-Instruct using LoRA via Unsloth}
}

Built with love for learners everywhere. Siksha means education — and education should be effortless.
