Anki-SLM-Siksha

Transform any study material into perfect Anki flashcards — instantly.

Anki-SLM-Siksha is a fine-tuned small language model built by AnkTechsol on top of meta-llama/Llama-3.2-3B-Instruct. It is purpose-trained to convert raw study notes, definitions, and textbook content into high-quality, exam-ready Anki flashcards in structured JSON format — making it the ideal AI backbone for any spaced-repetition learning application.

Siksha (Sanskrit: शिक्षा) means education — a fitting name for a model built to make learning smarter.


Model Details

  • Base Model: meta-llama/Llama-3.2-3B-Instruct
  • Model Size: 3B parameters
  • Fine-Tuning Method: LoRA (rank=16) via Unsloth
  • Quantization: 4-bit during training; released weights in BF16
  • Task: Flashcard Generation (Q&A)
  • Training Examples: 150 curated conversations
  • Context Window: 128K tokens
  • Organization: AnkTechsol
  • License: Apache 2.0

What It Does

Given any piece of study material, Anki-SLM-Siksha outputs a single, well-structured flashcard as a JSON object:

{
  "front": "What is osmosis?",
  "back": "The movement of water molecules through a semipermeable membrane from lower to higher solute concentration."
}

The model is trained to:

  • Extract the single most important concept from any input text
  • Generate a clear, concise question on the front optimized for active recall
  • Write a precise, exam-ready answer on the back
  • Output only valid JSON — no extra commentary, no hallucinated content
  • Handle topics across science, math, CS, history, business, AI/ML, and more
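Because the model is trained to emit only a JSON object with exactly "front" and "back" fields, downstream code should still validate that contract before storing a card. A minimal sketch (the `parse_flashcard` helper is illustrative, not part of the model's tooling):

```python
import json


def parse_flashcard(raw: str) -> dict:
    """Parse and validate a model response as a single flashcard.

    Raises ValueError if the output is not a JSON object with exactly
    non-empty 'front' and 'back' string fields.
    """
    card = json.loads(raw)
    if not isinstance(card, dict) or set(card) != {"front", "back"}:
        raise ValueError(f"expected exactly 'front' and 'back' fields, got: {raw!r}")
    if not all(isinstance(card[k], str) and card[k].strip() for k in ("front", "back")):
        raise ValueError("'front' and 'back' must be non-empty strings")
    return card
```

Validation like this lets an application reject the occasional malformed generation and retry instead of saving a broken card.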

Quick Start

With Unsloth (Recommended)

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="anktechsol/anki-slm-siksha",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

SYSTEM_PROMPT = """You are anki-slm-siksha, an AI assistant that converts study material into high-quality Anki flashcards. Always output exactly one flashcard as a valid JSON object with 'front' and 'back' fields only."""

def generate_flashcard(note: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Create a flashcard:\n\n{note}"}
    ]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to("cuda")
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
    )
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True
    )

print(generate_flashcard("The mitochondria is the powerhouse of the cell."))
# Output: {"front": "What is the powerhouse of the cell?", "back": "The mitochondrion."}
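Small models sometimes wrap the JSON in markdown fences or stray whitespace. A defensive sketch that extracts the first JSON object from the raw generation before parsing (the `extract_json` helper is an assumption of ours, not shipped with the model):

```python
import json
import re


def extract_json(text: str) -> dict:
    """Extract the first JSON object from model output.

    Strips markdown fences or surrounding text that a small model may
    occasionally add around the flashcard object.
    """
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        raise ValueError(f"no JSON object found in: {text!r}")
    return json.loads(match.group(0))
```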

With HuggingFace Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("anktechsol/anki-slm-siksha")
model = AutoModelForCausalLM.from_pretrained(
    "anktechsol/anki-slm-siksha",
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

messages = [
    {"role": "user", "content": "Create a flashcard:\n\nDNA stands for deoxyribonucleic acid and carries genetic information."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))

Training Details

Dataset

  • 150 handcrafted conversations spanning 10+ subject domains
  • Topics include: Biology, Chemistry, Physics, Mathematics, Computer Science, Networking, Machine Learning, Economics, Business, and Cybersecurity
  • Each example follows the user → assistant format with structured JSON output

Fine-Tuning Configuration

# LoRA Config
r = 16
lora_alpha = 16
lora_dropout = 0
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"]

# Training Config
num_train_epochs = 3
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
learning_rate = 2e-4
optimizer = "adamw_8bit"
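From these numbers the run's step count can be worked out directly (our arithmetic, not a logged value): the effective batch size is per-device batch × gradient accumulation, and with 150 examples that gives roughly 19 optimizer steps per epoch, about 57 in total over 3 epochs.

```python
# Derived from the training config above.
num_examples = 150
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
num_train_epochs = 3

effective_batch = per_device_train_batch_size * gradient_accumulation_steps  # 8
steps_per_epoch = -(-num_examples // effective_batch)  # ceiling division -> 19
total_steps = steps_per_epoch * num_train_epochs       # 57
print(effective_batch, steps_per_epoch, total_steps)
```

A step count this small is consistent with the ~6-minute training time reported below.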

Infrastructure

  • Trained on Google Colab T4 GPU (15GB VRAM) — free tier
  • Training time: ~6 minutes
  • Framework: Unsloth + HuggingFace TRL
  • Generated via TuneKit

System Prompt

For best results, use this system prompt at inference time:

You are anki-slm-siksha, an AI assistant that converts study material into high-quality Anki flashcards. Always:
- Read the user's text carefully and identify the single most important concept or fact.
- Output exactly one flashcard per request as a valid JSON object with two fields: "front" and "back".
- Make the "front" a clear, concise question or cloze deletion suitable for active recall.
- Make the "back" a precise, exam-ready answer, avoiding unnecessary extra details.
- Use simple, student-friendly language while preserving technical accuracy.
- Do not include any explanation, commentary, or formatting outside the JSON object.

Example Outputs

Input: Photosynthesis is the process by which green plants use sunlight to synthesize nutrients from CO2 and water.
Front: What is photosynthesis?
Back: Photosynthesis is the process by which green plants use sunlight to synthesize nutrients from CO2 and water.

Input: A neural network is a computational model inspired by the brain, made of interconnected nodes.
Front: What is a neural network in AI?
Back: A computational model inspired by the brain, made of interconnected nodes called neurons.

Input: Encryption is the process of converting information into a coded form to prevent unauthorized access.
Front: What is encryption?
Back: The process of converting information into a coded form to prevent unauthorized access.

Input: A primary key is a unique identifier for each record in a database table.
Front: What is a primary key in a database?
Back: A unique identifier for each record in a table.
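Cards like these can be imported into Anki as tab-separated text (File → Import). A minimal sketch of the export step, assuming cards are already parsed into dicts with "front" and "back" keys (the `cards_to_tsv` helper is illustrative):

```python
import csv
import io


def cards_to_tsv(cards: list[dict]) -> str:
    """Serialize flashcards to tab-separated text that Anki can import."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for card in cards:
        writer.writerow([card["front"], card["back"]])
    return buf.getvalue()
```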

Use Cases

  • Students — Instantly convert lecture notes and textbook chapters into revision-ready flashcard decks
  • Educators — Auto-generate quiz banks and study guides at scale
  • EdTech Apps — Embed as an AI backend in spaced-repetition learning tools
  • Self-Learners — Supercharge any learning workflow with AI-assisted card creation
  • LMS Integrations — Connect to platforms like Anki, Mochi, RemNote, or any custom learning system

Limitations

  • Optimized for English-language inputs only
  • Best performance on factual, definitional content — creative or ambiguous notes may produce lower-quality cards
  • Output is designed to be a single flashcard; multi-card generation requires prompt-level iteration
  • As with all LLMs, occasional hallucination is possible — always review generated cards before adding to a production deck
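Since the model emits one card per request, multi-card generation means chunking the source material and calling the model once per chunk (for example with the generate_flashcard helper from Quick Start). A minimal paragraph-based splitter, as one possible approach:

```python
def split_notes(text: str, max_chars: int = 400) -> list[str]:
    """Split study notes into paragraph-sized chunks, one per flashcard.

    Paragraphs are packed greedily up to max_chars; each resulting chunk
    is sent to the model as a separate request.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```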

About AnkTechsol

AnkTechsol is an AI and Data Engineering startup building intelligent tools for education, automation, and enterprise intelligence.

Key Products:

  • Vidyantrik — AI-powered learning platform
  • Career10x — AI-driven career acceleration
  • DataDhan — Enterprise data solutions
  • AnkiGPT / Anki-SLM-Siksha — Open-source AI for spaced repetition

Follow our work on LinkedIn | GitHub


Citation

If you use this model in your research or product, please cite:

@misc{anktechsol2026ankislmsiksha,
  title        = {Anki-SLM-Siksha: A Fine-Tuned LLM for Spaced-Repetition Flashcard Generation},
  author       = {AnkTechsol},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/anktechsol/anki-slm-siksha}},
  note         = {Fine-tuned from meta-llama/Llama-3.2-3B-Instruct using LoRA via Unsloth}
}

Built with love for learners everywhere. Siksha means education — and education should be effortless.
