DigitalRoot — A tutor that learns every child

The heart of it

The learner model — try it yourself

Everything in DigitalRoot orbits one thing: a durable, structured picture of what a learner knows. Below you can walk the curriculum into it, then watch the model react to real answers — live.

1 · How CAPS becomes the model

Click down through the curriculum. Each level narrows to a precise, measurable skill — that's the unit the model tracks.


The curriculum spine

CAPS (South Africa's Curriculum and Assessment Policy Statement) maps cleanly onto five levels. The model anchors every lesson and question to the bottom one.

Subject: Mathematics, taught Gr 1–12

Offering: Subject + grade + language

Unit (KU): A CAPS topic for the term

Skill (KC): One thing a learner can master

Content: Lessons + questions, per skill
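
As an illustration only, the five levels could be modelled as nested records. This is a minimal sketch: every class name, field name, and identifier below is an assumption made for the example, not the learner model's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Skill:
    """Skill (KC): one thing a learner can master."""
    code: str   # hypothetical identifier
    title: str


@dataclass
class Unit:
    """Unit (KU): a CAPS topic for the term."""
    title: str
    term: int
    skills: list[Skill] = field(default_factory=list)


@dataclass
class Offering:
    """Offering: subject + grade + language."""
    subject: str
    grade: int
    language: str
    units: list[Unit] = field(default_factory=list)


@dataclass(frozen=True)
class ContentItem:
    """Content: a lesson or question, anchored to exactly one skill."""
    skill_code: str
    kind: str   # "lesson" or "question"
    body: str


def skills_in(offering: Offering) -> list[Skill]:
    """Flatten an offering down to the level the model tracks."""
    return [skill for unit in offering.units for skill in unit.skills]


prime = Skill("kc-prime-factorise-3-digit", "Prime-factorise a 3-digit number")
# The unit title and term number are made up for the example.
offering = Offering("Mathematics", 7, "English", [Unit("Prime factors", 1, [prime])])
item = ContentItem(prime.code, "question", "Write 180 as a product of primes.")
```

The point of the shape is that content never attaches to a subject or unit directly, only to a skill.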

2 · What we measure — feed it answers

Mastery isn't one score. It's three: can they do it, do they understand it, can they apply it. Press the buttons and watch the model — and the tutor's next move — change.

Prime-factorise a 3-digit number

Knowledge Component · mastery threshold 0.85 per dimension

Procedural (can do it): 0.30

Conceptual (understands why): 0.22

Application (uses it in context): 0.18

Just starting out · router → Continue
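
To make the three-dimension idea concrete, here is a minimal sketch of a mastery vector and a router over it. The 0.85 threshold and the starting values come from the panel above. The routing rules and every move name other than "Continue" are assumptions for illustration, not the real router.

```python
from dataclasses import dataclass

THRESHOLD = 0.85  # mastery threshold per dimension, as shown in the panel


@dataclass(frozen=True)
class Mastery:
    procedural: float
    conceptual: float
    application: float

    def mastered(self) -> bool:
        # A skill counts as mastered only when every dimension clears the bar.
        return min(self.procedural, self.conceptual, self.application) >= THRESHOLD


def route(m: Mastery) -> str:
    """Hypothetical router: choose the tutor's next move from the vector."""
    if m.mastered():
        return "Advance"
    if m.procedural >= THRESHOLD and m.conceptual < THRESHOLD:
        return "Teach the why"      # fluent but does not understand
    if m.procedural >= THRESHOLD and m.conceptual >= THRESHOLD:
        return "Apply in context"   # only application is lagging
    return "Continue"


start = Mastery(procedural=0.30, conceptual=0.22, application=0.18)
```

With the starting vector, `route(start)` returns "Continue", matching the panel. A single blended score could not distinguish the "Teach the why" case from the "Continue" case.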

Simulate a learner event

3 · What this unlocks

The point

One tutor, every learner, no fee.

A model this precise runs for one child or a million — the marginal cost of teaching the next learner is near zero. That's how you give South Africa private-quality tutoring for free.

∞ learners

Teaches the right gap

Procedurally fluent but doesn't understand? It teaches the "why", not more drills.

Fixes misconceptions

Catches an actively-wrong rule and corrects it surgically, instead of grinding practice.

Reach-back

A Grade 5 gap blocking Grade 7 work? It steps back, fixes it, resumes.

Parents in the loop

Live view of progress, and a way to confirm work done at home.

Improves itself

The teaching algorithms evolve from real learner data over time.

The system

How the pieces fit

The learner model is the system of record. Content flows in; a tutoring front end runs the conversation. Flip the toggle to compare the wiring as it stands today with the recommended shape.

Today: scrapers feed the tutor's own curriculum, bypassing the system of record. Four seams marked in red.

Legend: built · prototype · spec / design only · seam to fix

The supporting engines

What each part unlocks when it's built right

Three systems sit around the learner model. Here's what they are and what they make possible.

The self-improving engine

ShinkaEvolve

Most tutors hand-tune their learning algorithms once and freeze them. ShinkaEvolve (from Sakana AI) lets those algorithms keep getting better against real data.

What it does. An evolutionary search loop — it proposes algorithm variants with an LLM, scores each against a fitness function, keeps the best across generations, and explores in parallel populations.
What it evolves here. The rule that updates mastery from each answer, and the selector that picks the next skill. It replays the real learner event log to test candidates.
Shipped safely. A winner is emitted as a constrained, inspectable tree — a fixed operator set, not arbitrary code. The learner model hot-loads it; versioned, auditable, reversible.
Fitness = predicting the learner. Candidates are judged on how well they predict performance on held-out events, so the winner models this population best.
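
One way to picture "a constrained, inspectable tree" is a small expression tree evaluated by an interpreter that knows a fixed set of operators and rejects everything else. The sketch below is an assumption about what such an artifact could look like. The operator set, the JSON encoding, and the variable names `prior` and `outcome` are all hypothetical.

```python
import json

# The fixed operator set: anything not listed here cannot run.
OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
    "min": min,
    "max": max,
}


def evaluate(node, env):
    """Evaluate a tree of ["op", left, right], ["const", x] or ["var", name]."""
    kind = node[0]
    if kind == "const":
        return float(node[1])
    if kind == "var":
        return env[node[1]]
    if kind not in OPS:
        raise ValueError(f"operator not allowed: {kind!r}")
    return OPS[kind](evaluate(node[1], env), evaluate(node[2], env))


# A hypothetical mastery-update rule: prior + 0.3 * (outcome - prior).
ARTIFACT = json.loads("""
["add", ["var", "prior"],
        ["mul", ["const", 0.3],
                ["sub", ["var", "outcome"], ["var", "prior"]]]]
""")

updated = evaluate(ARTIFACT, {"prior": 0.30, "outcome": 1.0})
```

Because the artifact is plain data, it can be versioned, diffed, and rolled back like any other record, and loading a new one never executes code from outside the operator set.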

1 · Replay real learner events

2 · Evolve candidate algorithms

3 · Score on held-out data

4 · Emit winner as a safe artifact

5 · Learner model hot-loads it

↻ each cycle, sharper
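
The cycle above can be sketched as a toy evolutionary loop. This is not ShinkaEvolve's API. Random mutation stands in for the LLM proposal step, the "algorithm" is reduced to a single update rate, and the event log is synthetic. Only the shape is the point: replay, mutate, score on held-out events, keep the best.

```python
import random


def replay(rate, events, start=0.3):
    """Replay events with a simple update rule; return (mean squared error, final mastery)."""
    mastery, error = start, 0.0
    for outcome in events:
        error += (mastery - outcome) ** 2     # predict before seeing the answer
        mastery += rate * (outcome - mastery)
    return error / len(events), mastery


def fitness(rate, train, held_out):
    """Higher is better: negative prediction error on events not used for warm-up."""
    _, mastery = replay(rate, train)
    error, _ = replay(rate, held_out, start=mastery)
    return -error


def evolve(train, held_out, generations=20, population=8, seed=0):
    rng = random.Random(seed)
    pool = [rng.uniform(0.01, 0.9) for _ in range(population)]
    for _ in range(generations):
        pool.sort(key=lambda r: fitness(r, train, held_out), reverse=True)
        parents = pool[: population // 2]     # keep the best half unchanged
        children = [min(0.99, max(0.01, p + rng.gauss(0, 0.05))) for p in parents]
        pool = parents + children
    return max(pool, key=lambda r: fitness(r, train, held_out))


# Synthetic event log (1 = correct, 0 = incorrect), invented for the example.
events = [0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1]
train, held_out = events[:8], events[8:]
best = evolve(train, held_out)
```

Keeping the parents unchanged each generation means the best score never gets worse, which is the "keeps the best across generations" property in miniature.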

What we achieve

Pedagogy that compounds — the longer it runs, the better it models how learners actually learn. Needs a real event corpus first, so it's a later phase.

Why a live database

SpacetimeDB

Parents should see learning happen in real time. SpacetimeDB is a database clients subscribe to — updates are pushed live — which is exactly what a real-time dashboard needs and what a normal database does awkwardly.

The need. A guardian watching a child's mastery move as they work, from another device, sometimes adding their own input. A real-time, multi-client problem.
A derived cache, never the truth. The learner model writes to Postgres and adds an outbox row in the same transaction; a relay pushes deltas into SpacetimeDB; dashboards update within roughly 200ms. Rebuildable from Postgres anytime.
POPIA by design. POPIA is South Africa's Protection of Personal Information Act. Self-hosted on the South African VPS, no personal data stored (only anonymous IDs and numbers), and erasure includes a purge step here too.

1 · learner model writes PG + outbox

2 · relay drains the outbox

3 · SpacetimeDB applies the delta

4 · parent dashboard updates live
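
The four steps follow the transactional outbox pattern. Below is a minimal sketch of it, using Python's built-in `sqlite3` as a stand-in for Postgres and a plain list as a stand-in for SpacetimeDB. The table names, column names, and the anonymous ID are assumptions for illustration.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE mastery (learner_id TEXT, skill TEXT, dimension TEXT, value REAL,
                      PRIMARY KEY (learner_id, skill, dimension));
CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT,
                     payload TEXT NOT NULL,
                     sent INTEGER NOT NULL DEFAULT 0);
""")


def record_mastery(learner_id, skill, dimension, value):
    """Step 1: write the state change and its outbox row in one transaction."""
    with db:  # commits on success, rolls back both writes on an exception
        db.execute("INSERT OR REPLACE INTO mastery VALUES (?, ?, ?, ?)",
                   (learner_id, skill, dimension, value))
        db.execute("INSERT INTO outbox (payload) VALUES (?)",
                   (json.dumps({"learner": learner_id, "skill": skill,
                                "dimension": dimension, "value": value}),))


def drain_outbox(push):
    """Step 2: the relay pushes unsent deltas in order, then marks them sent."""
    rows = db.execute(
        "SELECT id, payload FROM outbox WHERE sent = 0 ORDER BY id").fetchall()
    for row_id, payload in rows:
        push(json.loads(payload))
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (row_id,))
    return len(rows)


live_cache = []  # stand-in for the live store that dashboards subscribe to
record_mastery("anon-7f3a", "prime-factorise-3-digit", "procedural", 0.42)
drain_outbox(live_cache.append)
```

If the relay crashes after pushing but before marking a row as sent, that delta is pushed again on the next run, so the receiving side should apply deltas idempotently. The payload carries only an anonymous ID and numbers, in line with the residency note above.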

<200ms to live

ZA data-resident

What we achieve

Parents watching mastery rise live as their child learns — real engagement and trust — without ever compromising the system of record or data residency.

Bringing parents into the loop

The parent dashboard

South African learner outcomes are strongly driven by parental involvement. The parent UI makes a guardian an active, safely-scoped part of the learning loop.

Live mastery view. Per-skill mastery across all three dimensions, active misconceptions, and the current session — updating live.
Parent confirmations. A guardian can attest "she did this correctly at home." It's a lower-weighted evidence source that flows through the same machinery and nudges mastery.
Consent and control. Granular, revocable POPIA consent — data processing, AI processing, confirmations, live view — owned by the guardian.
Scoped by identity. A login plus signed-token bridge ensures a guardian only ever sees their own children.
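
The "lower-weighted evidence source" idea can be sketched as one update function with a per-source weight. The weights, the base rate, and the update rule itself are assumptions for illustration. The source says only that parent confirmations count for less and flow through the same machinery.

```python
# Assumed weights: a parent attestation moves mastery less than a tutor-observed answer.
SOURCE_WEIGHT = {"tutor_answer": 1.0, "parent_confirmation": 0.4}
BASE_RATE = 0.2  # assumed step size for a full-weight event


def apply_evidence(mastery: float, correct: bool, source: str) -> float:
    """Nudge mastery toward 1.0 (correct) or 0.0 (incorrect), scaled by source weight."""
    step = BASE_RATE * SOURCE_WEIGHT[source]
    target = 1.0 if correct else 0.0
    return mastery + step * (target - mastery)


from_tutor = apply_evidence(0.30, True, "tutor_answer")
from_parent = apply_evidence(0.30, True, "parent_confirmation")
```

Both kinds of evidence go through the same function; only the weight differs, so `from_parent` lands between the prior value and `from_tutor`.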

Amahle · Grade 7 · live

Whole numbers: mastered

Common fractions: in progress

Prime factors: misconception

This week: 3 sessions · 47 min

What we achieve

Parents who can see progress in real time and add to it — turning the home into part of the tutoring loop, safely and privately.

Where the work is

What's built vs. what's planned

Honest status. The design is ahead of the code — the right order. One component is genuinely built; the rest are prototypes or specs.

learner-model: .NET 10 · system of record

Domain, schema, migrations, integration tests with proven PII encryption. API, events and projections still to come.

BUILT · FOUNDATION

data pipeline: scrapers + mapper

Scrapers run and a mapper emits curriculum — but aimed at the tutor's loader, not the learner model, and items are placeholders.

PROTOTYPE

CAPS content: ATPs · Siyavula · DBE

Real source material and site maps of ~36,585 WCED resources on hand. Not yet parsed into real lessons and questions.

RAW

deeptutor-fork: tutoring runtime + chat

Spec targets upstream modules that have since been deprecated. Needs retargeting to a thin client adapter.

SPEC · STALE

mastery-evolution: ShinkaEvolve

Evolves the mastery rule and selector into safe artifacts. Needs a learner-event corpus before it can run.

SPEC ONLY

parent-ui + live state: SpacetimeDB · graph

Real-time guardian dashboard and graph-based reach-back. Net-new operational surface — a candidate to defer past Phase 1.

DESIGN ONLY

Where the assembly is loose

Four seams to tighten

The parts were specified independently. Where they meet — who owns the curriculum, who owns the turn, what content is real — the joins need work. These are the red badges on the diagram.

SEAM 01

Two curricula

The scraper feeds the tutor's own loader, not the learner model — two sources of truth, and the scraped one bypasses the system of record.

SEAM 02

Hollow content

The generated items are placeholders, not teachable material. The real content is still locked inside the source PDFs, unparsed.

SEAM 03

Two brains

The tutor has its own agentic loop; the learner model wants to own sequencing and per-turn routing. Nobody decided which is subordinate.

SEAM 04

Stale fork

The fork spec targets upstream modules that were deprecated and partly deleted. The "patch it in place" plan has no surface to attach to.

Fresh-eyes rebuild

Three moves that tighten everything

Keep every differentiated idea — vector mastery, misconceptions, reach-back, routing, event sourcing. Move the heavy infrastructure later, and fix the joins.

1 · One curriculum, one brain

The learner model is the only source of truth. The tutor becomes a thin render + LLM client that asks "what's the next move?" each turn and reports events back. It reads curriculum from the learner model — never authored twice.

Resolves seams 1, 3 & 4 at once
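
The "thin render + LLM client" contract can be sketched as a two-call interface: ask for the next move, then report what happened. The type names, method names, move kinds, and the toy in-memory model below are all assumptions for illustration, not the learner model's real API.

```python
from dataclasses import dataclass, field
from typing import Protocol


@dataclass(frozen=True)
class Move:
    kind: str      # hypothetical kinds: "ask_question", "explain"
    skill: str
    content: str


@dataclass(frozen=True)
class Event:
    learner_id: str
    skill: str
    correct: bool


class LearnerModel(Protocol):
    """The only two calls the tutor needs."""
    def next_move(self, learner_id: str) -> Move: ...
    def report(self, event: Event) -> None: ...


@dataclass
class InMemoryLearnerModel:
    """Toy stand-in for the system of record."""
    events: list[Event] = field(default_factory=list)

    def next_move(self, learner_id: str) -> Move:
        mine = [e for e in self.events if e.learner_id == learner_id]
        if mine and not mine[-1].correct:
            return Move("explain", mine[-1].skill, "Let's look at why a factor tree works.")
        return Move("ask_question", "prime-factorise-3-digit",
                    "Write 180 as a product of primes.")

    def report(self, event: Event) -> None:
        self.events.append(event)


def tutor_turn(model: LearnerModel, learner_id: str, answer_is_correct: bool) -> Move:
    """Thin client: fetch the move, render it, report the outcome."""
    move = model.next_move(learner_id)
    # Rendering and the LLM call would go here; no sequencing decisions are made client-side.
    model.report(Event(learner_id, move.skill, answer_is_correct))
    return move
```

In this shape the tutor holds no curriculum and no sequencing state of its own, so there is nothing for a second "brain" to disagree about.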

2 · Split scraping from authoring

A dumb content lake (scrapers) feeds an LLM-assisted, teacher-validated authoring pipeline that drafts real lessons and questions into the learner model. That pipeline — not the scraper — is the actual moat.

Resolves seam 2 · builds the defensible asset

3 · Phase the infrastructure

Get the core teaching loop running on plain Postgres — recursive queries and a polling dashboard. Defer SpacetimeDB, the graph database, and algorithm evolution until there's scale and data to justify them.

Dogfooding loop months sooner
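
As a sketch of what "recursive queries" could mean for reach-back, the example below walks a prerequisite chain with a recursive common table expression and returns the unmastered prerequisites, deepest first. It uses Python's built-in `sqlite3` as a stand-in; the `WITH RECURSIVE` form is also valid in Postgres, though placeholder syntax depends on the driver. Table names, skill codes, and the single mastery value per skill are assumptions made to keep the example short.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE prerequisite (skill TEXT, requires TEXT);
CREATE TABLE mastery (skill TEXT PRIMARY KEY, value REAL);
INSERT INTO prerequisite VALUES
  ('g7-prime-factorise', 'g6-factors'),
  ('g6-factors', 'g5-multiplication-facts'),
  ('g5-multiplication-facts', 'g4-skip-counting');
INSERT INTO mastery VALUES
  ('g6-factors', 0.90),
  ('g5-multiplication-facts', 0.40),
  ('g4-skip-counting', 0.95);
""")

REACH_BACK = """
WITH RECURSIVE chain(skill, depth) AS (
    SELECT requires, 1 FROM prerequisite WHERE skill = ?
    UNION
    SELECT p.requires, c.depth + 1
    FROM prerequisite AS p JOIN chain AS c ON p.skill = c.skill
)
SELECT c.skill, c.depth
FROM chain AS c LEFT JOIN mastery AS m ON m.skill = c.skill
WHERE COALESCE(m.value, 0.0) < ?   -- a skill with no record counts as a gap
ORDER BY c.depth DESC
"""

# Which prerequisites of the Grade 7 skill sit below the 0.85 threshold?
gaps = db.execute(REACH_BACK, ("g7-prime-factorise", 0.85)).fetchall()
```

Here `gaps` is `[('g5-multiplication-facts', 2)]`: the Grade 5 skill two steps back is the one blocking the Grade 7 work. The query assumes the prerequisite data has no cycles; a cyclic graph would need an explicit depth limit or a visited-path check.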

Sequencing

What ships when

The teaching loop first, scale second, self-improvement last — once there's data to learn from.

Phase 1 · Core loop

Prove it on one learner

learner-model API + event store
vector mastery + misconceptions
reach-back via Postgres queries
tutor thin-client adapter
~12 hand-authored real skills
guardian view by polling

Phase 2 · Scale

Real content, more learners

LLM-assisted authoring pipeline
teacher-validated content at volume
graph projection for reach-back
SpacetimeDB live dashboard

Phase 3 · Self-improve

Compounding quality

ShinkaEvolve on a real corpus
multi-subject & multi-grade
multilingual content

On teachers: the design models them as content validators (every lesson and question can be teacher-validated), and a teacher portal is a documented later phase. Paying teachers for authoring or validation work is not yet specified anywhere in the current documents — a deliberate open question, not an omission.

The fork points

Decisions to make next

Resolve these six and the rebuild plan writes itself.

1 · Curriculum ownership: learner model is the sole source of truth; tutor reads from it

Recommend yes

2 · Turn-loop authority: learner model decides the move; tutor renders it

Recommend yes

3 · Fork strategy: retarget to a thin client adapter on current DeepTutor

Recommend yes

4 · Phase-1 infra cut: defer SpacetimeDB, the graph DB and evolution

Recommend yes

5 · Content reality: real items need teacher-validated authoring, not scraping

Recommend yes

6 · Mastery shape: keep the vector model, which is built and a real differentiator

Recommend keep