# Anki-SLM-Siksha
Transform any study material into perfect Anki flashcards — instantly.
Anki-SLM-Siksha is a fine-tuned small language model built by AnkTechsol on top of meta-llama/Llama-3.2-3B-Instruct. It is purpose-trained to convert raw study notes, definitions, and textbook content into high-quality, exam-ready Anki flashcards in structured JSON format — making it the ideal AI backbone for any spaced-repetition learning application.
Siksha (Sanskrit: शिक्षा) means education — a fitting name for a model built to make learning smarter.
## Model Details
| Property | Value |
|---|---|
| Base Model | meta-llama/Llama-3.2-3B-Instruct |
| Model Size | 3B parameters |
| Fine-Tuning Method | LoRA (rank=16) via Unsloth |
| Quantization | 4-bit during training, BF16 weights |
| Task | Flashcard Generation (Q&A) |
| Training Examples | 150 curated conversations |
| Context Window | 128K tokens |
| Organization | AnkTechsol |
| License | Apache 2.0 |
## What It Does

Given any piece of study material, Anki-SLM-Siksha outputs a single, well-structured flashcard as a JSON object:

```json
{
  "front": "What is osmosis?",
  "back": "The movement of water molecules through a semipermeable membrane from lower to higher solute concentration."
}
```
The model is trained to:
- Extract the single most important concept from any input text
- Generate a clear, concise question on the front optimized for active recall
- Write a precise, exam-ready answer on the back
- Output only valid JSON — no extra commentary, no hallucinated content
- Handle topics across science, math, CS, history, business, AI/ML, and more
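Because the model is trained to emit only a JSON object, downstream code can parse and check its output directly. A minimal sketch of such a check (the `validate_flashcard` helper is illustrative, not part of the model's API):

```python
import json

def validate_flashcard(raw: str) -> dict:
    """Parse model output and verify it is a well-formed flashcard.

    Raises ValueError if the output is not a JSON object with exactly
    the 'front' and 'back' string fields.
    """
    card = json.loads(raw)
    if not isinstance(card, dict) or set(card) != {"front", "back"}:
        raise ValueError("output must be a JSON object with exactly 'front' and 'back' fields")
    if not all(isinstance(card[k], str) and card[k].strip() for k in card):
        raise ValueError("'front' and 'back' must be non-empty strings")
    return card

card = validate_flashcard(
    '{"front": "What is osmosis?", "back": "Movement of water across a semipermeable membrane."}'
)
print(card["front"])  # What is osmosis?
```

A guard like this lets an application retry or discard a generation before a malformed card ever reaches a deck.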
## Quick Start

### With Unsloth (Recommended)
```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="anktechsol/anki-slm-siksha",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

SYSTEM_PROMPT = """You are anki-slm-siksha, an AI assistant that converts study material into high-quality Anki flashcards. Always output exactly one flashcard as a valid JSON object with 'front' and 'back' fields only."""

def generate_flashcard(note: str) -> str:
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": f"Create a flashcard:\n\n{note}"},
    ]
    text = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    inputs = tokenizer(text, return_tensors="pt").to("cuda")
    outputs = model.generate(
        **inputs,
        max_new_tokens=256,
        temperature=0.7,
        do_sample=True,
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(
        outputs[0][inputs["input_ids"].shape[1]:],
        skip_special_tokens=True,
    )

print(generate_flashcard("The mitochondria is the powerhouse of the cell."))
# Output: {"front": "What is the powerhouse of the cell?", "back": "The mitochondrion."}
```
### With HuggingFace Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

tokenizer = AutoTokenizer.from_pretrained("anktechsol/anki-slm-siksha")
model = AutoModelForCausalLM.from_pretrained(
    "anktechsol/anki-slm-siksha",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Create a flashcard:\n\nDNA stands for deoxyribonucleic acid and carries genetic information."}
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
# Decode only the tokens generated after the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```
## Training Details

### Dataset
- 150 handcrafted conversations spanning 10+ subject domains
- Topics include: Biology, Chemistry, Physics, Mathematics, Computer Science, Networking, Machine Learning, Economics, Business, and Cybersecurity
- Each example follows the `user → assistant` format with structured JSON output
### Fine-Tuning Configuration

```python
# LoRA Config
r = 16
lora_alpha = 16
lora_dropout = 0
target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                  "gate_proj", "up_proj", "down_proj"]

# Training Config
num_train_epochs = 3
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
learning_rate = 2e-4
optimizer = "adamw_8bit"
```
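With 150 training examples, the configuration above implies the following schedule. This is simple arithmetic from the listed values, not a figure taken from the training logs, and the exact step count depends on how the trainer handles the final partial batch:

```python
examples = 150
per_device_train_batch_size = 2
gradient_accumulation_steps = 4
num_train_epochs = 3

# Sequences consumed per optimizer step on a single GPU.
effective_batch = per_device_train_batch_size * gradient_accumulation_steps  # 8
# Ceiling division: the last partial batch still counts as a step.
steps_per_epoch = -(-examples // effective_batch)  # 19
total_steps = steps_per_epoch * num_train_epochs   # 57

print(effective_batch, steps_per_epoch, total_steps)  # 8 19 57
```

A schedule this short is consistent with the reported ~6-minute training time on a free-tier T4.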
### Infrastructure
- Trained on Google Colab T4 GPU (15GB VRAM) — free tier
- Training time: ~6 minutes
- Framework: Unsloth + HuggingFace TRL
- Generated via TuneKit
## System Prompt

For best results, use this system prompt at inference time:

```
You are anki-slm-siksha, an AI assistant that converts study material into high-quality Anki flashcards. Always:
- Read the user's text carefully and identify the single most important concept or fact.
- Output exactly one flashcard per request as a valid JSON object with two fields: "front" and "back".
- Make the "front" a clear, concise question or cloze deletion suitable for active recall.
- Make the "back" a precise, exam-ready answer, avoiding unnecessary extra details.
- Use simple, student-friendly language while preserving technical accuracy.
- Do not include any explanation, commentary, or formatting outside the JSON object.
```
## Example Outputs
| Input Note | Front | Back |
|---|---|---|
| Photosynthesis is the process by which green plants use sunlight to synthesize nutrients from CO2 and water. | What is photosynthesis? | Photosynthesis is the process by which green plants use sunlight to synthesize nutrients from CO2 and water. |
| A neural network is a computational model inspired by the brain, made of interconnected nodes. | What is a neural network in AI? | A computational model inspired by the brain, made of interconnected nodes called neurons. |
| Encryption is the process of converting information into a coded form to prevent unauthorized access. | What is encryption? | The process of converting information into a coded form to prevent unauthorized access. |
| A primary key is a unique identifier for each record in a database table. | What is a primary key in a database? | A unique identifier for each record in a table. |
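Cards like these can be batch-imported into Anki through its tab-separated text import. A minimal sketch of writing generated cards to such a file; the `front<TAB>back` field order is an assumption matching Anki's default Basic note type, and `deck.txt` is just an example filename:

```python
import csv
import io

cards = [
    {"front": "What is encryption?",
     "back": "The process of converting information into a coded form to prevent unauthorized access."},
    {"front": "What is a primary key in a database?",
     "back": "A unique identifier for each record in a table."},
]

def cards_to_tsv(cards: list[dict]) -> str:
    """Render cards as tab-separated lines, one card per line."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    for card in cards:
        writer.writerow([card["front"], card["back"]])
    return buf.getvalue()

with open("deck.txt", "w", encoding="utf-8") as f:
    f.write(cards_to_tsv(cards))
```

Using the `csv` writer rather than naive string joins keeps fields containing quotes or separators from corrupting the import.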
## Use Cases
- Students — Instantly convert lecture notes and textbook chapters into revision-ready flashcard decks
- Educators — Auto-generate quiz banks and study guides at scale
- EdTech Apps — Embed as an AI backend in spaced-repetition learning tools
- Self-Learners — Supercharge any learning workflow with AI-assisted card creation
- LMS Integrations — Connect to platforms like Anki, Mochi, RemNote, or any custom learning system
## Limitations
- Optimized for English-language inputs only
- Best performance on factual, definitional content — creative or ambiguous notes may produce lower-quality cards
- Output is designed to be a single flashcard; multi-card generation requires prompt-level iteration
- As with all LLMs, occasional hallucination is possible — always review generated cards before adding to a production deck
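Since the model emits one card per request, building a whole deck means iterating over notes at the prompt level. A minimal sketch of that loop, with a stub standing in for the real `generate_flashcard` call from Quick Start, that skips any output which fails to parse:

```python
import json

def generate_flashcard(note: str) -> str:
    # Stub for illustration; in practice this is the model call
    # shown in the Quick Start section.
    return json.dumps({"front": f"What does this note state? ({note[:20]}...)",
                       "back": note})

def build_deck(notes: list[str]) -> list[dict]:
    """Generate one card per note, dropping malformed outputs."""
    deck = []
    for note in notes:
        raw = generate_flashcard(note)
        try:
            card = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip and flag for manual review
        if isinstance(card, dict) and {"front", "back"} <= set(card):
            deck.append(card)
    return deck

deck = build_deck([
    "Osmosis is the movement of water across a semipermeable membrane.",
    "A primary key uniquely identifies each record in a table.",
])
print(len(deck))  # 2
```

The skip-on-parse-failure branch is the programmatic counterpart of the advice above: review generated cards rather than trusting every output.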
## About AnkTechsol
AnkTechsol is an AI and Data Engineering startup building intelligent tools for education, automation, and enterprise intelligence.
Key Products:
- Vidyantrik — AI-powered learning platform
- Career10x — AI-driven career acceleration
- DataDhan — Enterprise data solutions
- AnkiGPT / Anki-SLM-Siksha — Open-source AI for spaced repetition
Follow our work on LinkedIn | GitHub
## Citation

If you use this model in your research or product, please cite:

```bibtex
@misc{anktechsol2026ankislmsiksha,
  title        = {Anki-SLM-Siksha: A Fine-Tuned LLM for Spaced-Repetition Flashcard Generation},
  author       = {AnkTechsol},
  year         = {2026},
  howpublished = {\url{https://huggingface.co/anktechsol/anki-slm-siksha}},
  note         = {Fine-tuned from meta-llama/Llama-3.2-3B-Instruct using LoRA via Unsloth}
}
```
Built with love for learners everywhere. Siksha means education — and education should be effortless.