
Heting Mao

IkanRiddle

AI & ML interests

None yet

Recent Activity

reacted to kanaria007's post with ❤️ and 🤗 about 18 hours ago
✅ New Article: *Measuring What Matters in Learning* (v0.1)

Title: 📏 Measuring What Matters in Learning: GCS and Metrics for Support Systems
🔗 https://huggingface.co/blog/kanaria007/measuring-what-matters-in-learning

---

Summary:
Most “AI for education” metrics measure *grades, time-on-task, and engagement*. That’s not enough for *support systems* (tutors, developmental assistants, social-skills coaches), where the real failure mode is: *the score goes up while the learner breaks*.

This guide reframes learning evaluation as *multi-goal contribution*, tracked as a *GCS vector* (mastery, retention, wellbeing/load, self-efficacy, autonomy, fairness, safety) — and shows how to operationalize it without falling into classic metric traps.

> If you can’t measure wellbeing, fairness, and safety,
> you’re not measuring learning — you’re measuring extraction.

---

Why It Matters:
• Moves beyond “grading” into *support metrics* designed for real learners
• Makes *wellbeing, autonomy, fairness, and safety* first-class (not afterthoughts)
• Separates *daily ops metrics* vs *research evaluation* vs *governance/safety*
• Turns “explainability” into *answerable questions* (“why this intervention, now?”)

---

What’s Inside:
• A practical *GCS vector* for learning & developmental support
• How core metrics translate into education contexts (plan consistency, trace coverage, rollback health)
• A tiered metric taxonomy: *Ops / Research / Safety*
• Parent-facing views that avoid shaming, leaderboards, and over-monitoring
• Pitfalls and failure patterns: “optimize test scores”, “maximize engagement”, “ignore fairness”, etc.

---

📖 Structured Intelligence Engineering Series
Formal contracts live in the evaluation/spec documents; this is the *how-to-think / how-to-use* layer.
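The post's core idea is tracking learning as a vector of per-axis scores rather than one number. Below is a minimal Python sketch of that idea, assuming the seven axes named in the post; the `GCSVector` class name, 0..1 scaling, and per-axis floor are illustrative assumptions, not the article's actual specification.

```python
# Hypothetical sketch of the GCS-vector idea: track each goal axis
# separately instead of collapsing everything into a single score.
from dataclasses import dataclass, fields

@dataclass
class GCSVector:
    # Axes listed in the post; names and 0..1 scaling are assumptions here.
    mastery: float
    retention: float
    wellbeing_load: float   # higher = healthier load, not more load
    self_efficacy: float
    autonomy: float
    fairness: float
    safety: float

    def failing_axes(self, floor: float = 0.5) -> list[str]:
        """Axes below a per-axis floor. A single weak axis flags the
        session even if mastery is rising — the 'score goes up while
        the learner breaks' failure mode described in the post."""
        return [f.name for f in fields(self)
                if getattr(self, f.name) < floor]

snapshot = GCSVector(mastery=0.9, retention=0.8, wellbeing_load=0.3,
                     self_efficacy=0.6, autonomy=0.7, fairness=0.8,
                     safety=0.9)
print(snapshot.failing_axes())  # ['wellbeing_load'] — don't celebrate the 0.9 mastery
```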
reacted to kanaria007's post with 🤗 about 18 hours ago
✅ New Article: *Designing, Safeguarding, and Evaluating Learning Companions* (v0.1)

Title: 🛡️ Designing, Safeguarding, and Evaluating SI-Core Learning Companions
🔗 https://huggingface.co/blog/kanaria007/designing-safeguarding-and-evaluating

---

Summary:
Most “AI tutoring” talks about prompts, content, and engagement graphs. But real learning companions—especially for children / ND learners—fail in quieter ways: *the system “works” while stress rises, agency drops, or fairness erodes.*

This article is a practical playbook for building SI-Core–wrapped learning companions that are *goal-aware (GCS surfaces), safety-bounded (ETH guardrails), and honestly evaluated (PoC → real-world studies)*—without collapsing everything into a single score.

> Mastery is important, but not the only axis.
> *Wellbeing, autonomy, and fairness must be first-class.*

---

Why It Matters:
• Replaces “one number” optimization with *goal surfaces* (and explicit anti-goals)
• Treats *child/ND safety* as a runtime policy problem, not a UX afterthought
• Makes oversight concrete: *safe-mode, human-in-the-loop, and “Why did it do X?” explanations*
• Shows how to evaluate impact without fooling yourself: *honest PoCs, heterogeneity, effect sizes, ethics of evaluation*

---

What’s Inside:
• A practical definition of a “learning companion” under SI-Core ([OBS]/[ID]/[ETH]/[MEM]/PLB loop)
• GCS decomposition + *age/context goal templates* (and “bad but attractive” optima)
• Safety playbook: threat model, *ETH policies*, ND/age extensions, safe-mode patterns
• Teacher/parent ops: onboarding, dashboards, contestation/override, downtime playbooks, comms
• Red-teaming & drills: scenario suites by age/context, *measuring safety over time*
• Evaluation design: “honest PoC”, day-to-day vs research metrics, ROI framing, analysis patterns
• Interpreting results: *effect size vs p-value*, “works for whom?”, go/no-go and scale-up stages

---

📖 Structured Intelligence Engineering Series
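This post frames safety as a runtime policy gate with a safe-mode fallback. Here is a minimal sketch of that shape, assuming the stage names ([OBS], [ETH]) from the post; the policy rules, thresholds, and function names are illustrative assumptions, not the article's actual API.

```python
# Minimal sketch of a safety-bounded companion step: observe, gate the
# proposed action through a runtime ETH policy, fall back to safe-mode.
from dataclasses import dataclass

@dataclass
class Observation:           # [OBS]: what the runtime sees this turn
    learner_age: int
    stress_signal: float     # 0..1; source (self-report, pacing) is assumed
    proposed_action: str

def eth_gate(obs: Observation) -> tuple[bool, str]:
    """[ETH]: runtime policy check, not a UX afterthought.
    Returns (allowed, reason) so 'Why did it do X?' stays answerable."""
    if obs.stress_signal > 0.8:
        return False, "stress above threshold; escalate to human-in-the-loop"
    if obs.learner_age < 13 and obs.proposed_action == "extend_session":
        return False, "age policy: no session extension for under-13s"
    return True, "within policy"

def step(obs: Observation) -> str:
    allowed, reason = eth_gate(obs)
    if not allowed:
        # Safe-mode pattern: degrade gracefully and keep the reason on record.
        return f"SAFE-MODE ({reason})"
    return f"EXECUTE {obs.proposed_action} ({reason})"

print(step(Observation(learner_age=10, stress_signal=0.9,
                       proposed_action="extend_session")))
# -> SAFE-MODE (stress above threshold; escalate to human-in-the-loop)
```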

Organizations

None yet