Architecture¶

The plugin has two layers with deliberately different natures:

The soft trigger is the model's judgment. There is no keyword detector. The ground-first skill carries a keyword-free policy description; the agent self-elects it from its own understanding of whether a task is complex/irreversible — in any domain. (Measured: precision 1.0 / recall 0.97 across infra + diverse domains; a keyword classifier got 0.28 recall off-infra. EXPERIMENTS.md Exp 6–7.) This is the plugin's thesis applied to itself: trigger the model's training, don't hand-author static domain knowledge.
The hard gate is deterministic and self-arming. A safety floor must not depend on the model it gates, so it stays pattern-based.

  user request
       │  the agent elects the ground-first skill from its OWN understanding
       ▼  (no keyword list, any domain)
  RECOGNIZE ─▶ RECONSTRUCT ─▶ (probes, read-only) ─▶ tmt-ground commit
                 Grounding Brief: split KNOW vs ASSUME; decision points;
                 silent failure modes; unknowns tagged PROBE/ASK/ASSUME;
                 verify the 3 riskiest assumptions
       │
       ▼  PreToolUse hook (bin/tmt_enforce.py), classify_tool(tool):
  GATE (self-arming, hard):  readonly / write-local -> ALLOW
                             mutating + ungrounded   -> DENY  (marks session)
                             grounded                -> ALLOW

Components¶

Stage	Surface	File	Role
Recognize	skill (model-elected)	`skills/ground-first/SKILL.md`	the soft trigger — policy in the description, brief in the body
Gate	`PreToolUse`	`bin/tmt_enforce.py`	deny destructive tools until grounded; self-arming
Reconcile	`PostToolUseFailure`	`bin/tmt_reconcile.py`	a failed tool is a falsified assumption — re-ground, don't retry blindly
Probe log	`PostToolUse`	`bin/tmt_log.py`	record read-only probes (evidence for `tmt-ground commit`)
Prune	`SessionStart`	`bin/tmt_session.py`	clear stale per-session state
Release	bin CLI	`bin/tmt-ground`	mark the session grounded after probing
Reversibility manifest	library	`bin/tmt_lib.py`	`classify_tool` (the gate's deterministic classifier)

Session state¶

Per-session JSON under $TMT_DATA / $CLAUDE_PLUGIN_DATA / ~/.tmt/data. Key fields: grounded (gate released), probes_run (probe evidence), required (set lazily by the gate on first denial so tmt-ground can find the session).

Findings¶

Both the grounding content and the trigger were tuned falsification-first — compact beats heavy, replicate before believing, the model judges better than keywords. Full ledger in EXPERIMENTS.md.