Free, CAPS-aligned tutoring · South Africa

A tutor that actually learns every child — and it's free.

DigitalRoot turns the CAPS curriculum into a living model of what a learner knows, skill by skill, and teaches to the exact gap. Private tutoring, for everyone who can't afford it.

3 mastery dimensions
R0 cost to the learner
CAPS · curriculum-native
POPIA · private by design
Amahle · Grade 7 · Mathematics (live model)
Procedural: 0.88
Conceptual: 0.54
Application: 0.37
Fluent, but shaky on "why"
The heart of it

The learner model — try it yourself

Everything in DigitalRoot orbits one thing: a durable, structured picture of what a learner knows. Below you can walk the curriculum into it, then watch the model react to real answers — live.

1 · How CAPS becomes the model

Click down through the curriculum. Each level narrows to a precise, measurable skill — that's the unit the model tracks.


The curriculum spine

CAPS maps cleanly onto five levels. The model anchors every lesson and question to the bottom one.

Subject: Mathematics, taught Gr 1–12
Offering: Subject + grade + language
Unit (KU): A CAPS topic for the term
Skill (KC): One thing a learner can master
Content: Lessons + questions, per skill
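
As a rough illustration of the five levels, the spine could be held in nested records like the ones below. This is a minimal sketch, not the project's schema: the class and field names are assumptions, Subject is folded into the offering for brevity, and the unit topic, term and language in the sample data are made up.

```python
from dataclasses import dataclass, field


@dataclass
class Skill:
    """Knowledge Component (KC): one thing a learner can master."""
    name: str
    lessons: list[str] = field(default_factory=list)    # Content level
    questions: list[str] = field(default_factory=list)  # Content level


@dataclass
class Unit:
    """Knowledge Unit (KU): a CAPS topic for the term."""
    topic: str
    term: int
    skills: list[Skill] = field(default_factory=list)


@dataclass
class Offering:
    """Subject + grade + language."""
    subject: str
    grade: int
    language: str
    units: list[Unit] = field(default_factory=list)


def skills_in(offering: Offering) -> list[Skill]:
    """Flatten an offering to the skills the model tracks."""
    return [skill for unit in offering.units for skill in unit.skills]


# Illustrative data only: the unit topic, term and language are hypothetical.
offering = Offering(
    subject="Mathematics", grade=7, language="English",
    units=[Unit(topic="Whole numbers", term=1,
                skills=[Skill("Prime-factorise a 3-digit number")])],
)
print([s.name for s in skills_in(offering)])
```

Anchoring lessons and questions on `Skill` mirrors the statement that every lesson and question hangs off the bottom level.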

2 · What we measure — feed it answers

Mastery isn't one score. It's three: can they do it, do they understand it, can they apply it. Press the buttons and watch the model — and the tutor's next move — change.

Prime-factorise a 3-digit number
Knowledge Component · mastery threshold 0.85 per dimension
Procedural (can do it): 0.30
Conceptual (understands why): 0.22
Application (uses it in context): 0.18
Just starting out · router → Continue
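
One way to picture the three-dimension check: a skill counts as mastered only when every dimension reaches the 0.85 threshold, and the tutor targets the first dimension that falls short. The sketch below assumes a fixed teaching order (procedural, then conceptual, then application); that ordering and all names are illustrative assumptions, not the actual router.

```python
from dataclasses import dataclass

THRESHOLD = 0.85  # mastery threshold per dimension, as stated on the page
ORDER = ("procedural", "conceptual", "application")  # assumed teaching order


@dataclass
class Mastery:
    procedural: float
    conceptual: float
    application: float

    def gaps(self) -> list[str]:
        """Dimensions still below the threshold, in teaching order."""
        return [d for d in ORDER if getattr(self, d) < THRESHOLD]

    def is_mastered(self) -> bool:
        """Mastered only when no dimension is below the threshold."""
        return not self.gaps()


amahle = Mastery(procedural=0.88, conceptual=0.54, application=0.37)
beginner = Mastery(procedural=0.30, conceptual=0.22, application=0.18)
print(amahle.gaps())    # first gap is the "why"
print(beginner.gaps())  # everything still to build
```

With the values shown for Amahle, the first gap is conceptual, which matches the "fluent, but shaky on why" label.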

3 · What this unlocks

The point

One tutor, every learner, no fee.

A model this precise runs for one child or a million — the marginal cost of teaching the next learner is near zero. That's how you give South Africa private-quality tutoring for free.

∞ learners

Teaches the right gap

Procedurally fluent but doesn't understand? It teaches the "why", not more drills.

Fixes misconceptions

Catches an actively-wrong rule and corrects it surgically, instead of grinding practice.

Reach-back

A Grade 5 gap blocking Grade 7 work? It steps back, fixes it, resumes.

Parents in the loop

Live view of progress, and a way to confirm work done at home.

Improves itself

The teaching algorithms evolve from real learner data over time.

The system

How the pieces fit

The learner model is the system of record. Content flows in; a tutoring front end runs the conversation. Flip the toggle to compare the wiring as it stands today with the recommended shape.

Today: scrapers feed the tutor's own curriculum, bypassing the system of record. Four seams marked in red.
[Diagram. Components: raw CAPS content (PDFs · DBE · WCED); scrapers + lake (harvest + provenance); deeptutor-fork (tutoring front end · LLM agents); learner-model (system of record · .NET 10 · vector mastery · misconceptions · reach-back · coherence routing); SpacetimeDB → parent-ui (live dashboard); mastery-evolution (ShinkaEvolve). Edge labels: YAML curriculum; curriculum drafts; per-turn HTTP; serves move; outbox events / AST. Seams numbered 1–4.]
Legend: built · prototype · spec / design only · seam to fix
The supporting engines

What each part unlocks when it's built right

Three systems sit around the learner model. Here's what they are and what they make possible.

The self-improving engine

ShinkaEvolve

Most tutors hand-tune their learning algorithms once and freeze them. ShinkaEvolve (from Sakana AI) lets those algorithms keep getting better against real data.

  • What it does. An evolutionary search loop — it proposes algorithm variants with an LLM, scores each against a fitness function, keeps the best across generations, and explores in parallel populations.
  • What it evolves here. The rule that updates mastery from each answer, and the selector that picks the next skill. It replays the real learner event log to test candidates.
  • Shipped safely. A winner is emitted as a constrained, inspectable tree — a fixed operator set, not arbitrary code. The learner model hot-loads it; versioned, auditable, reversible.
  • Fitness = predicting the learner. Candidates are judged on how well they predict performance on held-out events, so the winner models this population best.
1 · Replay real learner events
2 · Evolve candidate algorithms
3 · Score on held-out data
4 · Emit winner as a safe artifact
5 · Learner model hot-loads it
↻ each cycle, sharper
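
To make the "constrained, inspectable tree" and the held-out fitness idea concrete, here is a small sketch. It does not use ShinkaEvolve or reproduce its LLM-driven search; it only shows a rule stored as a tree over a fixed operator set, scored by how well it predicts held-out answers. The operator names, the candidate rule `m' = m + rate * (y - m)`, and the event log are all assumptions made for illustration.

```python
# Fixed operator set: anything outside this table is rejected.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "min": min,
    "max": max,
}


def evaluate(node, env):
    """Evaluate a rule tree against the variables in env."""
    kind = node[0]
    if kind == "const":
        return float(node[1])
    if kind == "var":
        return env[node[1]]
    if kind not in OPS:
        raise ValueError(f"operator not allowed: {kind}")
    return OPS[kind](evaluate(node[1], env), evaluate(node[2], env))


def make_rule(rate):
    """Candidate update rule as a tree: m' = m + rate * (y - m)."""
    return ("add", ("var", "m"),
            ("mul", ("const", rate), ("sub", ("var", "y"), ("var", "m"))))


def fitness(rule, warmup, held_out, start=0.5):
    """Negative mean squared error when predicting held-out outcomes."""
    m = start
    for y in warmup:
        m = evaluate(rule, {"m": m, "y": y})
    error = 0.0
    for y in held_out:
        error += (m - y) ** 2  # predict before seeing the answer
        m = evaluate(rule, {"m": m, "y": y})
    return -error / len(held_out)


# Made-up event log: 1 = correct answer, 0 = incorrect.
log = [0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1]
warmup, held_out = log[:8], log[8:]
candidates = {rate: make_rule(rate) for rate in (0.05, 0.2, 0.5)}
best_rate = max(candidates,
                key=lambda r: fitness(candidates[r], warmup, held_out))
print("best rate:", best_rate)
```

Because the winner is plain data rather than code, it can be versioned, diffed and rolled back, which is the property the "shipped safely" point relies on.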
What we achieve

Pedagogy that compounds — the longer it runs, the better it models how learners actually learn. Needs a real event corpus first, so it's a later phase.

Why a live database

SpacetimeDB

Parents should see learning happen in real time. SpacetimeDB is a database clients subscribe to — updates are pushed live — which is exactly what a real-time dashboard needs and what a normal database does awkwardly.

  • The need. A guardian watching a child's mastery move as they work, from another device, sometimes adding their own input. A real-time, multi-client problem.
  • A derived cache, never the truth. The learner model writes to Postgres and adds an outbox row in one transaction; a relay pushes deltas into SpacetimeDB; dashboards update in under 200 ms. Rebuildable from Postgres anytime.
  • POPIA by design. Self-hosted on the South African VPS, no personal data stored (only anonymous IDs and numbers), and erasure includes a purge step here too.
1 · learner model writes PG + outbox
2 · relay drains the outbox
3 · SpacetimeDB applies the delta
4 · parent dashboard updates live
<200 ms to live
ZA data-resident
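
The write-then-relay flow above is the transactional outbox pattern. The sketch below shows it with Python's built-in `sqlite3` standing in for Postgres and a plain list standing in for SpacetimeDB; the table names, columns and the anonymous ID are assumptions for illustration only.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for Postgres
db.executescript("""
CREATE TABLE mastery (
    learner_id TEXT, skill_id TEXT, dimension TEXT, value REAL,
    PRIMARY KEY (learner_id, skill_id, dimension)
);
CREATE TABLE outbox (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    payload TEXT NOT NULL,
    sent INTEGER NOT NULL DEFAULT 0
);
""")


def record_mastery(learner_id, skill_id, dimension, value):
    """Write the new value and its outbox row in one transaction."""
    with db:  # commits both statements together, or rolls both back
        db.execute(
            "INSERT OR REPLACE INTO mastery VALUES (?, ?, ?, ?)",
            (learner_id, skill_id, dimension, value),
        )
        delta = {"learner": learner_id, "skill": skill_id,
                 "dimension": dimension, "value": value}
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps(delta),))


def drain_outbox(push):
    """Relay: push each unsent delta, then mark it sent. Returns the count."""
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        push(json.loads(payload))  # stand-in for the SpacetimeDB write
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)


live_view = []  # stand-in for the parent dashboard's subscription
record_mastery("anon-7f3a", "prime-factors", "procedural", 0.42)
first = drain_outbox(live_view.append)
second = drain_outbox(live_view.append)
print(first, second, live_view)
```

Pushing before marking a row as sent gives at-least-once delivery, so the receiving side should apply deltas idempotently. Since the cache holds nothing the system of record lacks, it can be rebuilt at any time.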
What we achieve

Parents watching mastery rise live as their child learns — real engagement and trust — without ever compromising the system of record or data residency.

Bringing parents into the loop

The parent dashboard

South African learner outcomes are strongly driven by parental involvement. The parent UI makes a guardian an active, safely-scoped part of the learning loop.

  • Live mastery view. Per-skill mastery across all three dimensions, active misconceptions, and the current session — updating live.
  • Parent confirmations. A guardian can attest "she did this correctly at home." It's a lower-weighted evidence source that flows through the same machinery and nudges mastery.
  • Consent and control. Granular, revocable POPIA consent — data processing, AI processing, confirmations, live view — owned by the guardian.
  • Scoped by identity. A login plus signed-token bridge ensures a guardian only ever sees their own children.
Amahle · Grade 7 · live
Whole numbers: mastered
Common fractions: in progress
Prime factors: misconception
This week: 3 sessions · 47 min
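
A parent confirmation is described above as a lower-weighted evidence source that flows through the same machinery. One simple way to read that is a single update function with a per-source weight, sketched below. The weights, the learning rate and the update formula are assumptions chosen for illustration, not the model's real rule.

```python
# Hypothetical weights: a parent attestation counts for less than an
# answer observed directly by the tutor.
SOURCE_WEIGHT = {"tutor_answer": 1.0, "parent_confirmation": 0.3}
LEARNING_RATE = 0.2  # hypothetical


def apply_evidence(mastery: float, correct: bool, source: str) -> float:
    """Move mastery toward 1.0 (correct) or 0.0 (incorrect), scaled by source."""
    target = 1.0 if correct else 0.0
    step = LEARNING_RATE * SOURCE_WEIGHT[source]
    return mastery + step * (target - mastery)


from_tutor = apply_evidence(0.5, True, "tutor_answer")
from_parent = apply_evidence(0.5, True, "parent_confirmation")
print(from_tutor, from_parent)
```

Both sources pass through the same function; only the step size differs, so a confirmation nudges mastery without being able to outweigh observed answers.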
What we achieve

Parents who can see progress in real time and add to it — turning the home into part of the tutoring loop, safely and privately.

Where the work is

What's built vs. what's planned

Honest status. The design is ahead of the code — the right order. One component is genuinely built; the rest are prototypes or specs.

learner-model · .NET 10 · system of record
Domain, schema, migrations, integration tests with proven PII encryption. API, events and projections still to come.
BUILT · FOUNDATION
data pipeline · scrapers + mapper
Scrapers run and a mapper emits curriculum — but aimed at the tutor's loader, not the learner model, and items are placeholders.
PROTOTYPE
CAPS content · ATPs · Siyavula · DBE
Real source material and site maps of ~36,585 WCED resources on hand. Not yet parsed into real lessons and questions.
RAW
deeptutor-fork · tutoring runtime + chat
Spec targets upstream modules that have since been deprecated. Needs retargeting to a thin client adapter.
SPEC · STALE
mastery-evolution · ShinkaEvolve
Evolves the mastery rule and selector into safe artifacts. Needs a learner-event corpus before it can run.
SPEC ONLY
parent-ui + live state · SpacetimeDB · graph
Real-time guardian dashboard and graph-based reach-back. Net-new operational surface — a candidate to defer past Phase 1.
DESIGN ONLY
Where the assembly is loose

Four seams to tighten

The parts were specified independently. Where they meet — who owns the curriculum, who owns the turn, what content is real — the joins need work. These are the red badges on the diagram.

SEAM 01

Two curricula

The scraper feeds the tutor's own loader, not the learner model — two sources of truth, and the scraped one bypasses the system of record.

SEAM 02

Hollow content

The generated items are placeholders, not teachable material. The real content is still locked inside the source PDFs, unparsed.

SEAM 03

Two brains

The tutor has its own agentic loop; the learner model wants to own sequencing and per-turn routing. Nobody has decided which is subordinate.

SEAM 04

Stale fork

The fork spec targets upstream modules that were deprecated and partly deleted. The "patch it in place" plan has no surface to attach to.

Fresh-eyes rebuild

Three moves that tighten everything

Keep every differentiated idea — vector mastery, misconceptions, reach-back, routing, event sourcing. Move the heavy infrastructure later, and fix the joins.

1 · One curriculum, one brain

The learner model is the only source of truth. The tutor becomes a thin render + LLM client that asks "what's the next move?" each turn and reports events back. It reads curriculum from the learner model — never authored twice.

Resolves seams 1, 3 & 4 at once
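
The thin-client shape described in this move can be sketched as a two-call contract: the tutor asks for the next move, renders it, and reports what happened. The stub below is only an illustration. `next_move`, `report_event`, the `Move` fields and the toy "advance after two correct answers" policy are all hypothetical names and rules, not the learner model's API.

```python
from dataclasses import dataclass, field


@dataclass
class Move:
    kind: str   # e.g. "practice" or "advance" (hypothetical move kinds)
    skill: str


@dataclass
class LearnerModelStub:
    """Stand-in for the system of record; it alone owns sequencing."""
    skill: str = "prime-factors"
    events: list = field(default_factory=list)

    def next_move(self, learner_id: str) -> Move:
        correct = sum(1 for e in self.events
                      if e["learner"] == learner_id and e["correct"])
        # Toy policy: advance after two correct answers.
        return Move("advance" if correct >= 2 else "practice", self.skill)

    def report_event(self, learner_id: str, skill: str, correct: bool) -> None:
        self.events.append(
            {"learner": learner_id, "skill": skill, "correct": correct})


def run_turn(model, learner_id, render, get_answer):
    """One tutor turn: ask for the move, render it, report the outcome."""
    move = model.next_move(learner_id)
    render(move)
    if move.kind == "practice":
        model.report_event(learner_id, move.skill, get_answer(move))
    return move


model = LearnerModelStub()
shown = []
for _ in range(3):
    run_turn(model, "anon-7f3a", shown.append, lambda move: True)
print([m.kind for m in shown])
```

The tutor side holds no curriculum and no sequencing state, which is what lets one design change address the two-curricula, two-brains and stale-fork seams together.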
2 · Split scraping from authoring

A dumb content lake (scrapers) feeds an LLM-assisted, teacher-validated authoring pipeline that drafts real lessons and questions into the learner model. That pipeline — not the scraper — is the actual moat.

Resolves seam 2 · builds the defensible asset
3 · Phase the infrastructure

Get the core teaching loop running on plain Postgres — recursive queries and a polling dashboard. Defer SpacetimeDB, the graph database, and algorithm evolution until there's scale and data to justify them.

Dogfooding loop months sooner
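
Reach-back on plain Postgres can be done with a recursive query over a prerequisite table. The sketch below runs the same `WITH RECURSIVE` shape on Python's built-in `sqlite3` so that it is self-contained; table names, skill IDs and mastery values are made up, and mastery is collapsed to a single number per skill for brevity.

```python
import sqlite3

db = sqlite3.connect(":memory:")  # stand-in for Postgres
db.executescript("""
CREATE TABLE prerequisite (skill TEXT, requires TEXT);
CREATE TABLE mastery (skill TEXT PRIMARY KEY, value REAL);
INSERT INTO prerequisite VALUES
    ('gr7-prime-factorisation', 'gr6-factors-and-multiples'),
    ('gr6-factors-and-multiples', 'gr5-multiplication-facts');
INSERT INTO mastery VALUES
    ('gr6-factors-and-multiples', 0.90),
    ('gr5-multiplication-facts', 0.40);
""")

# Walk the prerequisite chain upward, then keep skills below the threshold,
# deepest (earliest-grade) gap first.
REACH_BACK = """
WITH RECURSIVE chain(skill, depth) AS (
    SELECT requires, 1 FROM prerequisite WHERE skill = ?
    UNION
    SELECT p.requires, c.depth + 1
    FROM prerequisite p JOIN chain c ON p.skill = c.skill
)
SELECT c.skill, c.depth
FROM chain c JOIN mastery m ON m.skill = c.skill
WHERE m.value < ?
ORDER BY c.depth DESC
"""


def blocking_gaps(skill, threshold=0.85):
    """Return (prerequisite, depth) pairs that are below the threshold."""
    return db.execute(REACH_BACK, (skill, threshold)).fetchall()


gaps = blocking_gaps("gr7-prime-factorisation")
print(gaps)  # [('gr5-multiplication-facts', 2)]
```

On Postgres the query text carries over, with the driver's own placeholder syntax in place of `?`. A graph projection only becomes worth its operational cost once prerequisite chains grow deep or wide enough for this to be slow.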
Sequencing

What ships when

The teaching loop first, scale second, self-improvement last — once there's data to learn from.

Phase 1 · Core loop
Prove it on one learner
  • learner-model API + event store
  • vector mastery + misconceptions
  • reach-back via Postgres queries
  • tutor thin-client adapter
  • ~12 hand-authored real skills
  • guardian view by polling
Phase 2 · Scale
Real content, more learners
  • LLM-assisted authoring pipeline
  • teacher-validated content at volume
  • graph projection for reach-back
  • SpacetimeDB live dashboard
Phase 3 · Self-improve
Compounding quality
  • ShinkaEvolve on a real corpus
  • multi-subject & multi-grade
  • multilingual content
On teachers: the design models them as content validators (every lesson and question can be teacher-validated), and a teacher portal is a documented later phase. Paying teachers for authoring or validation work is not yet specified anywhere in the current documents — a deliberate open question, not an omission.
The fork points

Decisions to make next

Resolve these six and the rebuild plan writes itself.

1 · Curriculum ownership: learner model is the sole source of truth; tutor reads from it
Recommend yes
2 · Turn-loop authority: learner model decides the move; tutor renders it
Recommend yes
3 · Fork strategy: retarget to a thin client adapter on current DeepTutor
Recommend yes
4 · Phase-1 infra cut: defer SpacetimeDB, the graph DB and evolution
Recommend yes
5 · Content reality: real items need teacher-validated authoring, not scraping
Recommend yes
6 · Mastery shape: keep the vector model, which is built and a real differentiator
Recommend keep