Incident Response

When something goes wrong — detect, yank, improve.

Figure: Incident response, 5-minute yank activity (DD-107). Sheet A / 02.

Trigger: an incident report arrives (reviewer, user, anomaly, or researcher).

Phase 1, Containment (0-5 min): confirm triage using the SOP-14 decision tree; the DD-107 elapsed clock for the 5-minute yank target starts at triage. A yank request carrying the SHA-256 goes to the admin worker. Revoking the signed-list entry requires dual control (DD-112): two key halves, two operators, so no single person can yank alone. The Reg Signer signs with both key halves, the yank decision is appended as a hash-chained audit log entry, and revocation-list v+1 is signed, published, and distributed. Yank confirmed (≤ 5 min).

Phase 2, Propagation (5-30 min): consumer-side enforcement (the next poll returns the signed list with the revocation, and arc upgrade refuses the revoked SHA-256); the publisher's submission privilege is frozen under dual control per DD-76; stewards are notified through a steward channel alert that broadcasts the incident details.

Phase 3, Forensics (30 min to hours): blast-radius analysis (DD-109). Four dimensions must be computed before disclosure can proceed: (1) install footprint, how many installs are affected; (2) capability exposure, what permissions were granted; (3) bundle propagation, transitive dependencies (DD-111); (4) time exposure, how long it was live. The blast-radius block is then appended.

Phase 4, Communication (same day): targeted arc notification sent directly to affected consumers, then a public disclosure post with a CVE-style identifier assigned.

Phase 5, Root cause analysis (≤ 7 days): an RCA covering the detection path, which gate failed, and countermeasures; the RCA document is appended.

Phase 6, Improvement (≤ 30 days): DD-110 requires a written RCA within 7 days and a durable improvement within 30 days, both measured from yank confirmation. Implement the durable improvement (new L1 rule, DD-79 flag, attribute-gate field, SOP step, or DD-75 trip-wire), append the improvement commit hash, and hold the post-incident review meeting.

Five-minute yank, seven-day RCA, thirty-day improvement: DD-107 sets the yank clock. DD-109 gates disclosure on the four-dimension blast-radius block. DD-110 enforces a written RCA within 7 days and a durable improvement within 30 days.
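
The DD-109 gate (all four blast-radius dimensions computed before disclosure) could be modeled roughly as follows. This is a minimal sketch, not the registry's actual schema: the `BlastRadius` class, its field names, and `can_disclose` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class BlastRadius:
    """Hypothetical blast-radius block; None means 'not yet computed'."""
    install_footprint: Optional[int] = None            # how many installs affected
    capability_exposure: Optional[List[str]] = None    # what permissions were granted
    bundle_propagation: Optional[List[str]] = None     # transitive dependents
    time_exposure_hours: Optional[float] = None        # how long it was live

    def missing(self) -> List[str]:
        """Names of dimensions that have not been computed yet."""
        return [f.name for f in fields(self) if getattr(self, f.name) is None]


def can_disclose(block: BlastRadius) -> bool:
    """Disclosure may proceed only when every dimension is filled in."""
    return not block.missing()
```

An empty list counts as computed here (for example, no transitive dependents), which keeps "zero impact" distinct from "not yet analysed".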

Detection

Two channels: automated trip-wires (anomalous install patterns, capability-usage spikes, version-churn matching attack signatures) and user reports via arc report or the registry UI.
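
As an illustration of what one automated trip-wire might look like, the sketch below flags an anomalous install pattern by comparing the current count against a recent baseline. It is an assumption-laden example, not the designed detector: the function name, the hourly-count input, and the 3-sigma threshold are all hypothetical.

```python
from statistics import mean, pstdev
from typing import List


def install_spike(history: List[int], current: int, threshold: float = 3.0) -> bool:
    """Return True when `current` sits more than `threshold` standard
    deviations above the mean of `history` (e.g. hourly install counts)."""
    if len(history) < 2:
        return False  # not enough data to establish a baseline
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # Perfectly flat baseline: treat any increase as anomalous.
        return current > mu
    return (current - mu) / sigma > threshold
```

A trip-wire like this would only raise a report for triage; the decision to yank stays with a steward.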

A trusted steward (highest trust tier) can also flag a blueprint they originally sponsored — this carries elevated priority because they have context the system doesn’t.

Keep it simple — for a two-person team, detection often means “you notice something is wrong.”

Future: The observability stack (structured logging, security event alerting, health monitoring) is designed and will power the automated trip-wires described above — replacing manual observation with structured, machine-readable detection. See the operational infrastructure design spec (I-500, I-501, I-502).

The yank

A trusted steward can trigger a yank directly from the admin UI — no ceremony required at this team size. The steward confirms the incident, issues the yank, and the system signs and publishes a new revocation list.
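
The "sign and publish a new revocation list" step can be sketched as below. The source does not specify the signature scheme or the list format, so this example substitutes HMAC-SHA256 from the Python standard library for whatever signing the registry really uses, and the `version`, `revoked`, and `signature` field names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Dict


def _canonical(payload: Dict) -> bytes:
    # Stable serialization so signer and verifier hash identical bytes.
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode()


def publish_revocation_list(current: Dict, yanked_sha256: str, key: bytes) -> Dict:
    """Build revocation-list v+1 containing the yanked hash, and sign it."""
    revoked = sorted(set(current["revoked"]) | {yanked_sha256})
    payload = {"version": current["version"] + 1, "revoked": revoked}
    signature = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    return {**payload, "signature": signature}


def verify_revocation_list(signed: Dict, key: bytes) -> bool:
    """Recompute the MAC over version + revoked and compare in constant time."""
    payload = {"version": signed["version"], "revoked": signed["revoked"]}
    expected = hmac.new(key, _canonical(payload), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signed["signature"])
```

A production registry would more likely use an asymmetric signature so that consumers hold only a public key; HMAC is used here purely to keep the example dependency-free.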

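The revocation propagates globally via Workers KV, typically within seconds (KV is eventually consistent, so some locations can lag by up to a minute or so). Clients poll the revocation list every 5 minutes; once a client's next poll picks up the new list, arc install and arc upgrade refuse the yanked blueprint.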
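
On the consumer side, the poll-and-refuse behaviour might look like the following sketch. The `RevocationCache` class, the `fetch` callable, and the list shape are hypothetical; only the 5-minute poll interval and the SHA-256 match come from the text above. Signature verification of the fetched list is omitted for brevity.

```python
import hashlib
from typing import Callable, Dict, FrozenSet, Optional

POLL_INTERVAL_SECONDS = 5 * 60  # revocation list is polled every 5 minutes


class RevocationCache:
    """Caches the revocation list and refreshes it once per poll interval."""

    def __init__(self, fetch: Callable[[], Dict]):
        self._fetch = fetch  # hypothetical: returns {"version": int, "revoked": [...]}
        self._version = 0
        self._revoked: FrozenSet[str] = frozenset()
        self._fetched_at: Optional[float] = None

    def refresh_if_stale(self, now: float) -> None:
        stale = self._fetched_at is None or now - self._fetched_at >= POLL_INTERVAL_SECONDS
        if not stale:
            return
        latest = self._fetch()
        if latest["version"] >= self._version:  # never roll back to an older list
            self._version = latest["version"]
            self._revoked = frozenset(latest["revoked"])
        self._fetched_at = now

    def is_revoked(self, blueprint_bytes: bytes, now: float) -> bool:
        """True when the blueprint's SHA-256 appears in the current list."""
        self.refresh_if_stale(now)
        return hashlib.sha256(blueprint_bytes).hexdigest() in self._revoked
```

With this shape, the longest a client keeps accepting a yanked blueprint is about one poll interval plus however long the new list takes to reach the client's nearest location.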

In parallel, the publisher’s submission privileges are frozen until the incident is resolved.

As the team and trust network grow, this can evolve to require dual control (two stewards must both approve the yank). The architecture supports it, but requiring it now would mean nobody can act fast when it matters most.
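
One way to keep that evolution cheap is to make the number of required approvals a parameter from the start. The sketch below assumes a hypothetical `yank_approved` check and steward identifiers; it is not the admin UI's actual logic.

```python
from typing import Iterable, Set


def yank_approved(approvals: Iterable[str], trusted_stewards: Set[str], required: int = 1) -> bool:
    """Count distinct trusted stewards who approved and compare to `required`.

    required=1 models today's single-steward yank; required=2 models
    dual control, where one person approving twice does not count.
    """
    distinct = {steward for steward in approvals if steward in trusted_stewards}
    return len(distinct) >= required
```

Moving to dual control would then be a configuration change (`required=2`) rather than a redesign.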

After the yank

Within a week: write up what happened. How did the blueprint get through the pipeline? Which gate should have caught it? Was the detection path optimal?

Within a month: ship at least one concrete improvement — a new clean room rule, a new attribute gate check, a detection signal, or a process change.
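
Both deadlines are measured from yank confirmation (7 days for the write-up, 30 days for the improvement, per DD-110), so they can be computed the moment the yank is confirmed. A small sketch, with a hypothetical function name and dictionary keys:

```python
from datetime import datetime, timedelta
from typing import Dict

RCA_WINDOW = timedelta(days=7)           # written RCA due within 7 days
IMPROVEMENT_WINDOW = timedelta(days=30)  # durable improvement due within 30 days


def follow_up_deadlines(yank_confirmed_at: datetime) -> Dict[str, datetime]:
    """Derive both follow-up deadlines from the yank confirmation time."""
    return {
        "rca_due": yank_confirmed_at + RCA_WINDOW,
        "improvement_due": yank_confirmed_at + IMPROVEMENT_WINDOW,
    }
```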

Every incident should leave the system better than it found it. The write-up and the improvement are both recorded in the audit log so there’s an auditable chain from incident to fix.
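
To make the "auditable chain" concrete, here is a minimal hash-chained log in which each entry commits to the hash of the one before it, so editing or removing an earlier record breaks verification. The entry layout (`kind`, `body`, `prev`, `hash`) and the function names are assumptions for illustration, not the registry's real log format.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(kind: str, body: Dict, prev: str) -> str:
    record = {"kind": kind, "body": body, "prev": prev}
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


def append_entry(log: List[Dict], kind: str, body: Dict) -> Dict:
    """Append an entry whose hash covers its content and the previous hash."""
    prev = log[-1]["hash"] if log else GENESIS
    entry = {"kind": kind, "body": body, "prev": prev, "hash": _digest(kind, body, prev)}
    log.append(entry)
    return entry


def verify_chain(log: List[Dict]) -> bool:
    """Recompute every hash and check each entry links to its predecessor."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != _digest(entry["kind"], entry["body"], entry["prev"]):
            return False
        prev = entry["hash"]
    return True
```

An incident would then appear as a run of linked entries: the yank decision, the write-up, and the improvement commit hash.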
