Incident Response

When something goes wrong — detect, yank, improve.

INCIDENT RESPONSE — 5-MIN YANK ACTIVITY (DD-107)

DETECTION

Detection

Two channels: automated trip-wires (anomalous install patterns, capability-usage spikes, version-churn matching attack signatures) and user reports via arc report or the registry UI.

A trusted steward (highest trust tier) can also flag a blueprint they originally sponsored — this carries elevated priority because they have context the system doesn’t.

Keep it simple — for a two-person team, detection often means “you notice something is wrong.”

Future: The observability stack (structured logging, security event alerting, health monitoring) is designed and will power the automated trip-wires described above — replacing manual observation with structured, machine-readable detection. See the operational infrastructure design spec (I-500, I-501, I-502).
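As an illustration of the automated trip-wires mentioned above, here is a minimal sketch of a spike detector for one signal, such as hourly installs of a single blueprint. Everything in it is an assumption for illustration: the function name, the z-score approach, the threshold of 3 standard deviations, and the minimum baseline size are not taken from the design spec.

```python
from statistics import mean, stdev


def is_spike(history, current, threshold=3.0, min_samples=5, min_sigma=1.0):
    """Return True if `current` sits more than `threshold` standard
    deviations above the mean of `history`.

    All parameters are hypothetical defaults, not values from the spec.
    """
    if len(history) < min_samples:
        # Too little baseline data to judge; stay quiet rather than page.
        return False
    mu = mean(history)
    # Floor the deviation so a perfectly flat baseline does not turn
    # every tiny increase into an alert.
    sigma = max(stdev(history), min_sigma)
    return (current - mu) / sigma > threshold


# Hypothetical hourly install counts for one blueprint.
baseline = [10, 12, 11, 9, 10, 11]
print(is_spike(baseline, 12))  # ordinary fluctuation
print(is_spike(baseline, 80))  # sudden surge
```

A real trip-wire would likely combine several such signals (install patterns, capability usage, version churn) before raising an alert; this sketch covers only one.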

YANK

The yank

A trusted steward can trigger a yank directly from the admin UI — no ceremony required at this team size. The steward confirms the incident, issues the yank, and the system signs and publishes a new revocation list.
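The confirm, yank, sign, publish sequence could look roughly like the sketch below. It is not the system's actual implementation: the payload fields, function names, and blueprint ID are hypothetical, and HMAC-SHA256 from the Python standard library stands in for whatever signature scheme the registry really uses (a production registry would more plausibly use an asymmetric signature so that clients hold only a public key).

```python
import hashlib
import hmac
import json


def canonical_bytes(payload):
    # Deterministic serialization so signer and verifier hash the same bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_revocation_list(payload, key):
    # ASSUMPTION: HMAC is a stand-in for the real signing scheme.
    signature = hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_revocation_list(signed, key):
    expected = hmac.new(key, canonical_bytes(signed["payload"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])


def issue_yank(previous_payload, blueprint_id, steward, reason, key, now):
    """Build and sign a new revocation list that includes `blueprint_id`."""
    yanked = dict(previous_payload["yanked"])  # copy; never mutate the old list
    yanked[blueprint_id] = {"by": steward, "reason": reason, "at": now}
    payload = {
        "version": previous_payload["version"] + 1,
        "issued_at": now,
        "yanked": yanked,
    }
    return sign_revocation_list(payload, key)


# Hypothetical usage: one steward yanks one blueprint.
empty = {"version": 0, "issued_at": 0, "yanked": {}}
signed = issue_yank(empty, "acme/widget@1.2.0", "steward-a",
                    "malicious install script", b"demo-key", 1_700_000_000)
print(verify_revocation_list(signed, b"demo-key"))
```

Bumping a version number on every yank gives clients a simple way to reject a stale or replayed list, though whether the real revocation list carries one is not stated here.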

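The revocation propagates globally via Workers KV, typically within seconds; because KV is eventually consistent, some edge locations may lag by up to a minute. Clients poll the revocation list every 5 minutes, so within one polling interval every arc install and arc upgrade will refuse the yanked blueprint.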
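On the client side, the polling behavior described above might be sketched as a small cache that refetches the revocation list once it is older than 5 minutes. The class and function names are hypothetical, the `fetch` callable stands in for the real network request, and signature verification of the fetched list is omitted for brevity.

```python
import time

POLL_INTERVAL_SECONDS = 5 * 60  # the 5-minute polling interval from the text


class RevocationCache:
    """Caches the set of yanked blueprint IDs and refreshes it when stale."""

    def __init__(self, fetch, clock=time.time):
        self._fetch = fetch      # hypothetical: returns an iterable of yanked IDs
        self._clock = clock      # injectable so the logic can be tested
        self._yanked = frozenset()
        self._fetched_at = None

    def _refresh_if_stale(self):
        now = self._clock()
        if self._fetched_at is None or now - self._fetched_at >= POLL_INTERVAL_SECONDS:
            self._yanked = frozenset(self._fetch())
            self._fetched_at = now

    def is_yanked(self, blueprint_id):
        self._refresh_if_stale()
        return blueprint_id in self._yanked


def install(blueprint_id, cache):
    """Refuse to install anything that appears on the revocation list."""
    if cache.is_yanked(blueprint_id):
        raise PermissionError(f"{blueprint_id} has been yanked; refusing to install")
    return f"installed {blueprint_id}"
```

With this shape, the worst-case exposure after a yank is roughly one polling interval plus propagation time, since a client that fetched just before the yank keeps its cached copy until the interval expires.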

In parallel, the publisher’s submission privileges are frozen until the incident is resolved.

As the team and trust network grow, this can evolve to require dual control (two stewards must both approve the yank). The architecture supports it, but requiring it now would mean nobody can act fast when it matters most.
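One way to keep that evolution cheap is to treat the number of required approvals as configuration, so moving from single-steward to dual-control yanks is a one-value change. The sketch below assumes a hypothetical `YankRequest` type; none of these names come from the actual architecture.

```python
from dataclasses import dataclass, field


@dataclass
class YankRequest:
    """A pending yank that becomes authorized once enough distinct
    stewards have approved it. All names here are hypothetical."""

    blueprint_id: str
    required_approvals: int = 1  # 1 today; 2 would give dual control
    approvals: set = field(default_factory=set)

    def approve(self, steward_id):
        # A set means the same steward approving twice counts once.
        self.approvals.add(steward_id)
        return self.is_authorized()

    def is_authorized(self):
        return len(self.approvals) >= self.required_approvals


# Today: a single steward can act immediately.
solo = YankRequest("acme/widget@1.2.0")
print(solo.approve("steward-a"))

# Later: the same flow with dual control switched on.
dual = YankRequest("acme/widget@1.2.0", required_approvals=2)
print(dual.approve("steward-a"), dual.approve("steward-b"))
```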

AFTER

After the yank

Within a week: write up what happened. How did the blueprint get through the pipeline? Which gate should have caught it? Was the detection path optimal?

Within a month: ship at least one concrete improvement — a new clean room rule, a new attribute gate check, a detection signal, or a process change.

Every incident should leave the system better than it found it. The write-up and the improvement are both recorded in the audit log so there’s an auditable chain from incident to fix.
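To make the chain from incident to fix checkable rather than just recorded, the audit log could be queried for outstanding follow-ups. The sketch below assumes a hypothetical event shape (`incident_id`, `kind`, `at`) and approximates "a month" as 30 days; the real audit log format is not described here.

```python
from datetime import datetime, timedelta

WRITE_UP_DEADLINE = timedelta(days=7)     # "within a week"
IMPROVEMENT_DEADLINE = timedelta(days=30)  # "within a month", approximated


def follow_up_status(audit_log, incident_id, now):
    """Report whether the write-up and the improvement are done,
    pending, or overdue for one incident."""
    events = {e["kind"]: e["at"] for e in audit_log if e["incident_id"] == incident_id}
    yanked_at = events["yank"]  # the clock starts at the yank

    def state(kind, deadline):
        if kind in events:
            return "done"
        return "overdue" if now > yanked_at + deadline else "pending"

    return {
        "write_up": state("write_up", WRITE_UP_DEADLINE),
        "improvement": state("improvement", IMPROVEMENT_DEADLINE),
    }


# Hypothetical log: yank on 1 Jan, write-up filed on 5 Jan, no improvement yet.
log = [
    {"incident_id": "INC-1", "kind": "yank", "at": datetime(2024, 1, 1)},
    {"incident_id": "INC-1", "kind": "write_up", "at": datetime(2024, 1, 5)},
]
print(follow_up_status(log, "INC-1", datetime(2024, 1, 20)))
```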