Crypto & Anti-Tamper for Avionics Mission Systems
Crypto & Anti-Tamper in avionics builds a hardware-anchored trust chain so only authenticated software can run, keys stay non-exportable, and integrity can be proven with safe evidence logs. When intrusion is detected, the system executes policy-driven lockdown/zeroization that remains effective even across resets and power loss.
H2-1 · What “Crypto & Anti-Tamper” owns in avionics
This section defines strict scope boundaries and a single engineering chain (RoT → Boot → Keys/TRNG → Tamper → Zeroize → Provision/Validate) so every later chapter stays vertically deep without drifting into sibling pages.
Crypto & Anti-Tamper in avionics is the trust core that proves software integrity at boot, keeps keys non-exportable, detects physical or fault-injection intrusion, and can rapidly zeroize secrets. It centers on a Root of Trust (RoT) that enforces the chain of trust, validates updates, manages key lifecycles using proven randomness, and records auditable security evidence.
The output of this page is evidence: verifiable boot state, key usage boundaries, tamper events, and zeroization completion signals.
Boundary contract (scope guard)
This page owns
- Root of Trust selection and boundary enforcement
- Secure boot chain, rollback protection, and measured boot / attestation evidence
- Key hierarchy, storage, and lifecycle (including TRNG/DRBG)
- Tamper detection, graded response, and zeroization
- Provisioning and validation workflows that produce audit artifacts
This page does not own (link only; no deep expansion)
- Waveform/modem/protocol deep-dives for tactical datalinks (handled by the datalink pages)
- Network-level timing distribution design (handled by timing/sync pages)
- Aircraft power front-end surge/EMI implementation details (handled by power front-end pages)
- Avionics bus protocol stacks (handled by interface pages)
The 6-block trust chain (system view)
- RoT: anchors identity and non-exportable secrets; provides verification and signing capability.
- Boot: verifies each stage; enforces version/rollback rules; fails safely with deterministic evidence.
- Keys/TRNG: derives and wraps keys using proven randomness; blocks extraction by policy and hardware boundary.
- Tamper: detects enclosure, probing, glitching signals; drives graded response (log → lockdown → wipe).
- Zeroize: irreversibly clears secrets (RAM + NVM scopes); remains correct under power loss edge cases.
- Provision/Validate: factory injection and service workflows that scale; produce audit-friendly test artifacts.
Each block should answer three questions: Input, Output, and Evidence.
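The three-question contract can be sketched as a simple record check. The six block names come from this page; the example input/output/evidence strings are illustrative, not a normative specification:

```python
from dataclasses import dataclass

# Hypothetical contract record: every trust-chain block must declare
# its input, its output, and the evidence artifact it produces.
@dataclass(frozen=True)
class BlockContract:
    name: str
    inputs: str
    outputs: str
    evidence: str

# Sketch of the 6-block chain from this page (wording is illustrative).
CHAIN = [
    BlockContract("RoT", "power-on state", "verified identity, signing ops", "boot state record"),
    BlockContract("Boot", "stage images + signatures", "verified next stage", "reason codes, counter state"),
    BlockContract("Keys/TRNG", "entropy, policy", "derived/wrapped keys", "TRNG health flags, key versions"),
    BlockContract("Tamper", "sensor signals", "graded response request", "time-ordered event chain"),
    BlockContract("Zeroize", "trigger + target list", "cleared secrets, latched state", "wipe completion flag"),
    BlockContract("Provision/Validate", "factory workflow", "enrolled identity", "provisioning audit log"),
]

def incomplete_blocks(chain):
    """Return names of blocks missing any of the three answers."""
    return [b.name for b in chain
            if not (b.inputs and b.outputs and b.evidence)]
```

A review gate can then be mechanical: a chain is not design-complete until `incomplete_blocks` returns an empty list.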
H2-2 · Threat model that actually matches avionics reality
A practical threat model must map attacker capability → attack path → protected asset → control point → evidence. The goal is engineering closure: every control has a measurable artifact, and every artifact supports certification-grade review.
Attacker capability tiers (what changes across field, depot, lab, supply chain)
- Field access (minutes–hours): limited tools, opportunistic access to maintenance ports or enclosures; aims for quick extraction or bypass.
- Depot access (hours–days): device can be opened; parts may be swapped; storage can be probed; update media and service workflows become a target.
- Lab attacker (days–weeks): fault injection (voltage/clock/temperature), side-channel probing, micro-probing; attempts to break extraction boundaries.
- Supply chain adversary: pre-injection of modified firmware, counterfeit RoT devices, or contaminated provisioning steps; targets trust anchors before deployment.
Design implication: defenses should be graded (block, detect, limit blast radius, prove state) rather than assuming a single attacker profile.
Protected assets (prioritized by blast radius)
Rule of thumb: if compromise enables persistent impersonation or silent downgrade, it is S-Class and must be non-exportable or provably measured.
Attack surface → Control point → Evidence (engineering-ready)
| Attack surface (path) | Control point (what blocks/limits) | Evidence (what can be audited) |
|---|---|---|
| Debug / service ports (JTAG/SWD/UART, service shells) | RoT-enforced debug policy; authenticated unlock; rate-limited access; tamper event on unexpected enablement; secrets never exposed to host. | Debug state record; signed boot state includes “debug locked”; tamper log entry on policy violation. |
| Firmware update path (maintenance load, removable media, ground tools) | Secure boot verification for every stage; signed update manifests; rollback prevention; atomic A/B switching; failure is fail-closed with deterministic logs. | Signed measurement record; monotonic counter state; update attempt log with result code and version hash. |
| NVM / storage probing (flash, eMMC/UFS/NVMe partitions) | Keys stored only inside RoT boundary; key wrapping with KEK; encrypted sensitive blobs; integrity tags; zeroization target list covers caches and staging buffers. | “Non-exportable” enforcement proof (device policy); wipe completion flag; storage integrity failure counters. |
| Enclosure / mesh (open-case, light, mesh break) | Tamper sensors feed shortest path to RoT; graded response: log → lockdown → zeroize; avoid nuisance by debounced, cross-checked sensor fusion. | Tamper event chain (time-ordered); signed tamper status snapshot; zeroize trigger reason codes. |
| Power / clock glitch (undervoltage, clock injection, temp extremes) | RoT monitors anomaly windows; secure boot hardened against glitch timing; TRNG health tests detect entropy collapse; tamper escalation on repeated anomalies. | Anomaly counters; TRNG health test status; boot failure reason codes correlated to glitch events. |
| Supply chain contamination (counterfeit parts, injected firmware) | Strong provisioning controls: device identity enrollment, certificate chain pinning, verified programming stations, post-injection lock + audit; incoming authenticity checks. | Provisioning audit log; certificate chain evidence; locked-state attestation at shipment. |
Use this table as a chapter map: each later section should strengthen a control point and define its evidence format.
Engineering closure checklist (use as “chapter exit criteria”)
- Every control point has a named evidence artifact (record, counter state, event chain, wipe proof, TRNG status).
- Every evidence artifact has a collection rule that does not leak secrets (no raw keys, no sensitive dumps).
- Tamper response is graded and deterministic: same trigger class → same action + same reason codes.
- Power/clock anomaly handling is auditable: anomaly counters correlate with boot failures and tamper escalation.
Next chapters should reference this mapping table instead of restating threat theory, keeping the page vertically deep and non-overlapping.
H2-3 · Root of Trust options and how to choose (SE vs HSM vs discrete)
A Root of Trust (RoT) is not “a crypto chip.” It is the smallest boundary that can keep root secrets non-exportable, enforce boot policy, and produce auditable evidence. Selection should start from required duties and measurable artifacts, not from a part category label.
- Non-exportable root secrets: root key / seed never leaves the RoT boundary; only “use” operations are exposed (sign/unwrap/derive).
- Signature verification: verifies image signatures and certificate chains; rejects unknown issuers; records deterministic reason codes.
- Key derivation & wrapping: derives KEKs/DEKs and wraps blobs inside the RoT policy boundary; enforces per-key access control.
- Anti-rollback state: monotonic counter / version state updated inside the RoT; prevents downgrade even if external storage is modified.
- Optional secure time: only required when evidence timelines need trusted monotonicity beyond a boot counter (e.g., signed event sequencing).
Implementation rule: each duty should map to a named evidence artifact (counter state, boot record, tamper state snapshot), never to “trust by design intent.”
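The anti-rollback duty can be illustrated with a minimal sketch, assuming the monotonic counter lives inside the RoT boundary; the class name and reason-code strings are hypothetical:

```python
# Minimal sketch of RoT-side anti-rollback enforcement. In real hardware
# the counter is a monotonic NVM cell inside the RoT; here a plain
# attribute stands in so the policy logic stays self-contained.
class RollbackError(Exception):
    pass

class AntiRollbackCounter:
    def __init__(self, initial: int = 0):
        self._value = initial  # monotonic: never decreases

    @property
    def value(self) -> int:
        return self._value

    def check_and_advance(self, image_version: int) -> str:
        """Accept an image only if its version >= the stored counter,
        then ratchet forward. Emits a non-secret reason string."""
        if image_version < self._value:
            raise RollbackError(
                f"ROLLBACK_REJECTED v{image_version} < counter {self._value}")
        self._value = max(self._value, image_version)
        return f"ACCEPTED v{image_version}, counter={self._value}"
```

The counter state itself is the evidence artifact: a snapshot of `value` after a rejected attempt proves the downgrade was blocked.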
Implementation classes (engineering differences)
Secure Element (SE)
- Best fit for: non-exportable keys, signing/verification, secure counters, constrained policy.
- Typical constraints: fixed command set, limited throughput, integration is “policy by API,” not by custom firmware.
HSM-class device
- Best fit for: larger algorithm suites, higher verification volume, richer key domain separation.
- Typical constraints: more complex host integration, stricter secure-channel management and lifecycle controls.
Discrete RoT (MCU + secure flash)
- Best fit for: custom policy/state machines or unique provisioning flows.
- Typical constraints: assurance depends heavily on implementation quality and update hardening (a bug becomes a trust failure).
Common selection mistakes (avoid silent RoT bypass)
- Placing anti-rollback state outside RoT (host NVM only) enables downgrade attacks despite valid signatures.
- Using RoT as a “signing oracle” while policy lives in host firmware allows policy bypass via host compromise.
- Choosing a part that lacks the required algorithms or cert-chain depth, then moving verification into the host, defeats the RoT boundary.
- No tamper-to-RoT closure: tamper events exist but cannot drive lockdown/zeroize within the shortest response path.
- Discrete RoT without hardened update path: a weak RoT firmware update mechanism undermines the entire chain of trust.
A correct RoT design makes “extracting secrets” strictly harder than “using secrets,” and provides evidence when policy boundaries are crossed.
Decision table (5–8 criteria that force clear boundaries)
| Criterion | Secure Element (SE) | HSM-class | Discrete RoT |
|---|---|---|---|
| Key exportability | Strong fit: keys stay non-exportable by design; operations exposed via limited API. | Strong fit: non-exportable keys with richer domains; requires careful channel policy. | Risk depends on implementation; strict non-exportability must be enforced in firmware + hardware boundary. |
| Cert-chain scale | Good for modest chains; check max objects, parsing support, and storage strategy. | Best when chains/issuers/roles are larger or multi-domain; more flexible validation policies. | Flexible if correctly implemented; requires robust parser hardening and update strategy. |
| Upgrade frequency | Good when policies are stable; update flow must not require “unlocking” secrets. | Good for frequent updates/attestation workloads; supports broader policy sets. | Good for bespoke flows; must harden RoT firmware update itself to avoid trust collapse. |
| Algorithm suite | Check supported curves/hashes and constraints; avoid host-side verification fallback. | Broad support and throughput; useful for higher-volume verification or mixed algorithms. | Potentially broad; validation and certification effort increases with complexity. |
| Tamper response | Works well with direct tamper inputs and fast lockdown/zeroize hooks (part-dependent). | Strong if integrated with tamper domain separation; ensure response path is short and deterministic. | Possible but needs careful sensor policy and robust response code paths. |
| Assurance needs | Often simpler story: fixed-function boundary with clear non-exportability claims. | Strong for complex assurance when evidence/roles must be richer; manage complexity carefully. | Heavily dependent on development rigor; “custom” increases review surface. |
| Integration complexity | Lowest: narrow API, fewer moving parts. | Highest: secure channels, domains, provisioning roles, richer policies. | Medium: flexible integration but more firmware responsibility and long-term maintenance. |
| Lifecycle ops | Good if factory injection is well-defined; limited modes reduce leakage risk. | Best when multiple roles/tenants exist; requires strict process control and audit trails. | Feasible; must design RMA/service boundaries to avoid cloning and key reuse. |
Selection outcome should explicitly state which duties are enforced inside the RoT and which evidence artifacts will be collected in validation.
H2-4 · Secure boot chain of trust (from ROM to application)
Secure boot is an engineering sequence, not a checkbox: each stage must verify the next, enforce version rules, and emit deterministic evidence. The chain should remain correct under interruption (power loss during update) and under policy edge cases (certificate rotation, rollback attempts).
- ROM / immutable stage: anchors trust (pinned hash/key) and verifies the first loadable stage.
- 1st stage: performs minimal initialization; verifies the second stage and enforces debug policy gates.
- 2nd stage / loader: verifies OS/RTOS image(s), critical drivers, and the update manifest policy.
- OS/RTOS: verifies trusted services and configuration; optionally records measured state for audit.
- Application: verifies mission-critical tasks/modules and enters mission mode only under valid state.
Verification should be bound to an RoT-enforced anti-rollback counter. “Valid signature” must not be sufficient for downgrade.
Key mechanisms (and why each exists)
- Signature verification: ensures authenticity of each stage image; rejects unauthorized signers.
- Certificate chain handling: supports issuer rotation and revocation without “unlocking” trust anchors.
- Hash measurement: records what ran (for audit) without replacing enforcement; measurement is evidence, not permission.
- Rollback protection: monotonic counter prevents replay of older-but-signed images.
- A/B images: enables atomic switching and robust recovery after interruptions.
- Deterministic reason codes: every failure path yields an explainable, non-secret diagnostic record.
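The verify-each-stage flow can be sketched as a loop that stops at the first failure with a deterministic reason code. To keep the sketch self-contained, a pinned SHA-256 digest stands in for real asymmetric signature verification; the function names are illustrative:

```python
import hashlib

def measure(image: bytes) -> str:
    """Stage measurement: SHA-256 digest of the image."""
    return hashlib.sha256(image).hexdigest()

def boot_chain(stages, pinned, versions, min_version):
    """Verify each stage in order; stop at the first failure.
    stages: list of (name, image bytes), in boot order
    pinned: name -> expected digest (provisioned at build time)
    versions: name -> image version
    min_version: RoT-held rollback floor, checked at every stage"""
    evidence = []
    for name, image in stages:
        if pinned.get(name) != measure(image):
            evidence.append((name, "FAIL_MEASUREMENT"))
            return False, evidence        # fail-closed, deterministic
        if versions[name] < min_version:
            evidence.append((name, "FAIL_ROLLBACK"))
            return False, evidence        # signed-but-old is still rejected
        evidence.append((name, "OK"))
    return True, evidence
```

Note that the rollback floor is checked per stage, matching the pitfall below about counters that are only enforced once.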
Failure policy boundary (fail-closed vs safe/degraded mode)
| Failure class | Required action | Evidence + recovery path |
|---|---|---|
| Signature / chain fail | Fail-closed for mission mode; optionally enter restricted safe mode with locked secrets. | Signed boot record + reason code; recover via authenticated reflash or approved A/B fallback (no downgrade). |
| Rollback attempt | Reject image even if signed; remain in safe mode or prior valid slot; do not decrease counter. | Counter state snapshot; recovery requires newer signed image or authorized service procedure. |
| Update incomplete | Do not boot partially written slot; fall back to last known-good slot; require atomic commit flag. | Update state record; recovery is automatic fallback + reattempt with authenticated tool. |
| Tamper active | Lockdown and block sensitive operations; optionally trigger zeroization per policy. | Tamper event chain; recovery only after authorized inspection and re-provisioning if required. |
| TRNG health fail | Block key generation/derivation; restrict boot to safe mode if cryptographic operations are required. | TRNG status flags; recovery via controlled reset, diagnostics, or module replacement if persistent. |
Safe/degraded mode must be explicit: limited interfaces, limited secrets, and auditable entry conditions.
Engineering pitfalls (symptom → cause → fix)
- Certificate rotation bricks fleet: issuer changes without overlap window → implement dual-trust window and versioned chain policy.
- Debug unlock becomes a backdoor: unlocking not bound to RoT policy or not audited → require authenticated unlock token + time limit + reason codes.
- Boot-time budget blow-up: excessive chain depth or heavy algorithms → partition verification (critical first) while preserving measured evidence for the rest.
- Power loss during update causes “brick”: non-atomic writes → use A/B slots, write-then-verify, and commit flags only after full verification.
- Signed downgrade still boots: rollback counter outside RoT or not checked at every stage → keep monotonic counter inside RoT boundary and enforce per-stage.
A secure boot design is “done” when every pitfall above has a deterministic test case and produces expected evidence artifacts.
H2-5 · Measured boot & attestation (without becoming a network-protocol page)
Secure boot enforces what is allowed to run; measured boot proves what actually ran. Attestation packages measurements into an auditable evidence bundle signed by the Root of Trust (RoT). This chapter focuses on evidence shape and anti-replay rules, not transport protocols.
Measured boot records stage-by-stage measurements (hashes and policy snapshots) so “what executed at boot” can be proven later. Attestation signs a summarized evidence package inside the RoT and lets a verifier decide compliance (allowed version ranges, revocation state, debug/tamper gates). The output is a deterministic, storable artifact—useful for audits, maintenance checks, and fleet governance.
Secure boot vs measured boot (practical comparison)
| Dimension | Secure Boot | Measured Boot |
|---|---|---|
| Primary goal | Prevent untrusted code from executing (enforcement). | Prove what executed (auditability and observability). |
| What is checked | Signatures, certificate chain, rollback counters, policy gates. | Stage measurements (hashes), policy snapshots, optional runtime configuration fingerprints. |
| Failure behavior | Fail-closed or restricted safe mode when verification fails. | Verifier can mark “non-compliant” even if device runs (evidence drives decision). |
| Evidence artifact | Reason codes, counter state, boot outcomes. | Signed evidence package: measurements + policy snapshot + nonce binding. |
| Where policy lives | Enforced locally in boot chain and RoT boundary. | Judged by verifier using allowlists, revocation lists, and gates (transport not covered here). |
| What it does NOT replace | Does not replace audit evidence requirements by itself. | Does not replace secure boot; measurement without enforcement cannot stop malicious code. |
Design rule: treat measured boot as a compliance proof layer; secure boot remains the enforcement gate.
Minimal attestation structure (evidence shape, not protocol)
- Measurement: stage hashes (ROM/stage1/stage2/OS/app), plus a compact measurement digest for signing.
- Policy snapshot: debug lock state, rollback counter, tamper state, TRNG health flags (non-secret indicators).
- Nonce binding: evidence must include a verifier-provided challenge (or a verifiable freshness token) to prevent replay.
- RoT signature: RoT signs the evidence digest that includes nonce + counter + measurements to make it unforgeable and time-ordered.
- Verifier decision: checks allowlists/versions/revocations and records the verdict with the evidence for audit.
Avoid protocol sprawl: keep the chapter at “inputs/outputs and invariants.” Transport can vary by platform without changing evidence validity.
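The evidence shape above can be sketched under one simplifying assumption: an HMAC key stands in for the RoT's device signing key (a real RoT would sign with an asymmetric key it never exports). Field names are illustrative:

```python
import hashlib
import hmac
import json

def build_evidence(rot_key: bytes, measurements, policy_snapshot,
                   nonce: bytes, counter: int):
    """Assemble and 'sign' an evidence bundle inside the RoT boundary."""
    body = {
        "measurements": measurements,   # stage hashes (non-secret)
        "policy": policy_snapshot,      # debug/tamper/TRNG indicators
        "nonce": nonce.hex(),           # verifier challenge, inside signed material
        "counter": counter,             # monotonic freshness anchor
    }
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).digest()
    return {"body": body,
            "sig": hmac.new(rot_key, digest, hashlib.sha256).hexdigest()}

def verify_evidence(rot_key: bytes, bundle, expected_nonce: bytes) -> bool:
    """Verifier side: signature must check AND nonce must be fresh."""
    body = bundle["body"]
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).digest()
    sig_ok = hmac.compare_digest(
        bundle["sig"], hmac.new(rot_key, digest, hashlib.sha256).hexdigest())
    return sig_ok and body["nonce"] == expected_nonce.hex()
```

Replaying an old bundle against a new verifier nonce fails, which is exactly the anti-replay invariant this section requires.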
Engineering rules (nonce, anti-replay, retention, and version strategy)
- Nonce must be signed: challenge value must be included in the signed material; otherwise old evidence can be replayed.
- Replay defense needs two anchors: nonce freshness plus a monotonic counter/policy version snapshot prevents “old-but-valid” reuse.
- Retention must be non-secret: store hashes, reason codes, counters, and signatures—never store raw secret material or unwrap outputs.
- Version strategy must be explicit: evidence should carry version identifiers and policy IDs so verifier rules stay deterministic over time.
- Audit must be explainable: verifier verdict should be reproducible from evidence + policy rules, not from undocumented heuristics.
Validation idea: replay the same evidence with a new nonce and confirm it is rejected; change only policy rules and confirm verdict changes are explainable.
H2-6 · Key hierarchy & storage (how keys live, rotate, and stay non-exportable)
Key management succeeds when “using keys” stays easy but “extracting keys” stays hard—even if the host software is compromised. This chapter focuses on key ladders, wrapping boundaries, slot policies, and lifecycle operations (rotate/revoke/destroy) with service/RMA boundaries.
- Root key / root seed: exists only inside the RoT; used for signing or deriving other keys; never used to encrypt bulk data.
- KEK (Key Encryption Key): wraps/unwraps DEKs; defines who can unlock data keys; must not be reused across devices as a single shared secret.
- DEK (Data Encryption Key): encrypts data/config; often stored only as a wrapped blob; rotates on schedule or on events.
- Session keys: short-lived keys for runtime sessions; should be volatile and derived from TRNG/DRBG + policy.
Practical rule: Root never leaves RoT; KEK governs DEK access; DEK binds to data domains; session keys do not persist.
Key storage model (slots, wrapping, and non-exportability)
- Key slots: RoT-managed identifiers with attributes (usage, lifetime, origin, and exportability = NO).
- Wrapping boundary: DEKs are stored externally only as wrapped blobs (ciphertext + metadata), never as plaintext backups.
- Policy/ACL: host can request “unwrap-and-use” but cannot request “export plaintext,” and every request is policy-checked.
- Metadata is part of security: key version, domain ID, and policy ID must bind to wrapping to avoid mix-and-match attacks.
Non-exportable means “no plaintext extraction path exists,” not merely “the API does not document export.”
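The root → KEK → DEK ladder can be sketched with an HKDF-style derivation (the RFC 5869 extract-and-expand construction, here with a fixed salt for brevity). Binding device and domain IDs into the `info` field is what makes cross-device blob cloning fail; the label strings are illustrative:

```python
import hashlib
import hmac

def hkdf(key: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-SHA256 (extract-then-expand), fixed zero salt for brevity."""
    prk = hmac.new(b"\x00" * 32, key, hashlib.sha256).digest()   # extract
    okm, t, i = b"", b"", 1
    while len(okm) < length:                                      # expand
        t = hmac.new(prk, t + info + bytes([i]), hashlib.sha256).digest()
        okm += t
        i += 1
    return okm[:length]

def derive_kek(root: bytes, device_id: bytes) -> bytes:
    """KEK is device-bound: a different device ID yields a different KEK."""
    return hkdf(root, b"KEK|" + device_id)

def derive_dek(kek: bytes, domain_id: bytes, key_version: int) -> bytes:
    """DEK is bound to a data domain and a key version (rotation-aware)."""
    return hkdf(kek, b"DEK|" + domain_id + b"|" + str(key_version).encode())
```

Because the device ID participates in derivation, a wrapped blob copied to another device unwraps under the wrong KEK and fails deterministically, which is the validation idea below.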
Key table (type / where stored / who can use / rotation)
| Key type | Stored where | Exportable | Who can use | Rotation / evidence |
|---|---|---|---|---|
| Root | RoT internal slot only | No | RoT only (sign/derive) | Rare; evidence: key ID + policy lock state |
| KEK | RoT slot (or derived inside RoT) | No | RoT unwrap/wrap operations | Rotate by policy; evidence: key version + domain ID |
| DEK | Wrapped blob in NVM + metadata | No | RoT unwrap-to-use; plaintext never persisted | Rotate scheduled/event; evidence: wrapped-blob version + revoke list |
| Session | Volatile RAM only | No | Runtime crypto operations | Short-lived; evidence: session policy + TRNG health flags |
Service boundary: RMA procedures must never require exporting root/KEK material; recovery should be re-provisioning, not “secret backup restore.”
Lifecycle checklist (generate → provision → use → rotate → revoke → destroy)
- Generate / provision: key origin is defined (TRNG/controlled injection); device identity binds to certificate chain.
- Activate: switch from factory mode to deployed mode; lock debug/provision interfaces with auditable state.
- Use: expose only “use” operations; log non-secret audit events (reason codes, key version, policy ID).
- Rotate: allow overlap windows (old+new key versions) and define deterministic cutover rules.
- Revoke: maintain a revocation policy keyed by version/domain; verifier decisions remain reproducible.
- Destroy / zeroize: define which slots, caches, and staging buffers are cleared; ensure correct behavior on reset/power loss.
- RMA / repair boundary: define what can be preserved (non-secret logs) vs what must be re-provisioned (keys/identity).
Validation idea: prove that cloning a wrapped DEK blob onto another device does not decrypt data (device/domain binding must fail deterministically).
Common pitfalls (what breaks, and why it becomes unrecoverable)
- Reusing one KEK across devices enables fleet-wide compromise after one leak → enforce per-device or strictly segmented domains.
- Key version confusion makes verifier decisions inconsistent → bind key version + policy ID into wrapping and evidence.
- Over-detailed logs leak sensitive context → store reason codes and versions, never raw secrets or unwrap results.
- “Backup” that contains recoverable root material defeats zeroize → root secrets must be non-backupable by design.
- Dev/prod mixing creates supply-chain shortcuts → isolate environments and make the switch-to-deployed state auditable.
H2-7 · TRNG/DRBG: entropy you can defend in a certification review
A TRNG provides raw entropy; a DRBG expands it into high-rate random output. In avionics-grade reviews, the key question is not “does it look random,” but “is entropy measurable, continuously monitored, and fail-safe for security decisions.”
TRNG is the entropy source. DRBG is the controlled expansion mechanism. Both must be backed by health tests and a clear failure policy, because temperature, voltage, aging, and startup conditions can reduce effective entropy. A defensible design produces audit-safe evidence (reason codes, counters, and status snapshots) without exposing secrets.
TRNG vs DRBG (what each “owns” in practice)
- TRNG owns entropy: it must produce unpredictable raw bits under real operating conditions (cold start, hot soak, brownout margins).
- DRBG owns determinism control: it expands entropy into output with defined state, reseed rules, and testable failure handling.
- Conditioning is not optional: whitening/conditioning reduces bias, but does not replace health tests that detect “entropy collapse.”
- Security decisions depend on freshness: key generation and nonce creation should be gated by TRNG/DRBG health state.
Review-friendly framing: “entropy is measured and policed,” not “randomness is assumed.”
Entropy sources (high-level) and the evidence that matters
The source type is less important than what can be measured and monitored. Typical sources include jitter-based oscillators, thermal-noise-based circuits, and metastability-based elements. In all cases, the defensible part is the evidence chain: health tests → reason codes/counters → policy gates.
- Startup entropy shortfall: early bits may be biased before steady-state; require startup tests and “no-keygen-before-healthy.”
- Temperature/voltage drift: bias and repetition can rise at corners; require continuous tests and drift-aware alert thresholds.
- Entropy decay/aging: long-term degradation can hide until stressed; require trendable counters and periodic self-check routines.
TRNG/DRBG health-test checklist (what reviewers expect to see)
| Test class | What it detects | What the system should do | Audit-safe evidence |
|---|---|---|---|
| Startup test | initial bias, stuck behavior, unstable conditioning at boot | block keygen/nonce issuance until healthy; latch failure state | reason code + boot-time health state + counter |
| Continuous repetition | repeated patterns indicating entropy collapse | raise alert; gate sensitive operations; trigger reseed | repetition counter + last-seen timestamp |
| Stuck-bit / stuck-word | frozen output or reduced toggling | lockdown TRNG output; deny new secrets; require maintenance action | stuck counter + severity level + latch |
| Bias / proportion | unbalanced 0/1 proportions beyond tolerance | degrade mode; increase reseed rate; flag non-compliance | bias score bucket + counter trend |
| DRBG reseed policy | overlong deterministic runs without fresh entropy | enforce reseed intervals; stop output if entropy unavailable | reseed version + reseed failures count |
Implementation invariant: evidence must be non-secret (no raw entropy samples, no internal state dumps).
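Two of the test classes above can be sketched in the spirit of the SP 800-90B repetition count and adaptive proportion tests. The cutoffs shown are illustrative placeholders, not certified values:

```python
def repetition_count_test(samples, cutoff: int = 8) -> bool:
    """Fail if any value repeats `cutoff` times in a row
    (a stuck source / entropy collapse indicator)."""
    run, prev = 0, None
    for s in samples:
        run = run + 1 if s == prev else 1
        prev = s
        if run >= cutoff:
            return False
    return True

def adaptive_proportion_test(samples, window: int = 512,
                             cutoff: int = 400) -> bool:
    """Fail if the first value of a window recurs too often within
    that window (gross bias indicator)."""
    for start in range(0, len(samples) - window + 1, window):
        w = samples[start:start + window]
        if w.count(w[0]) >= cutoff:
            return False
    return True
```

Note the tests consume samples and return only pass/fail: the audit-safe evidence is the verdict plus counters, never the raw entropy stream itself.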
Failure policy (safe handling without guessing)
- Level 0 — log only: transient anomalies produce reason codes and counters for review, without changing security posture.
- Level 1 — gate sensitive ops: block new key generation and nonce issuance; allow limited non-secret functions.
- Level 2 — key lockdown: deny unwrapping/signing operations until health is restored and verified.
- Level 3 — security escalation: if combined with tamper or repeated failures, trigger stronger actions (session wipe / zeroize path).
The policy should be deterministic and testable: same input health state → same allowed/denied outcomes.
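That determinism can be expressed as a pure lookup from health level to allowed operations. The level names mirror the ladder above; the specific allow/deny cells are an illustrative assumption:

```python
from enum import IntEnum

class HealthLevel(IntEnum):
    LOG_ONLY = 0        # transient anomaly, posture unchanged
    GATE_SENSITIVE = 1  # block new keygen / nonce issuance
    KEY_LOCKDOWN = 2    # deny unwrap/sign until health restored
    ESCALATE = 3        # combine with tamper policy (wipe/zeroize path)

# Policy table: same health state always yields the same decision.
POLICY = {
    HealthLevel.LOG_ONLY:       {"keygen": True,  "unwrap": True,  "sign": True},
    HealthLevel.GATE_SENSITIVE: {"keygen": False, "unwrap": True,  "sign": True},
    HealthLevel.KEY_LOCKDOWN:   {"keygen": False, "unwrap": False, "sign": False},
    HealthLevel.ESCALATE:       {"keygen": False, "unwrap": False, "sign": False},
}

def allowed(level: HealthLevel, op: str) -> bool:
    """Deterministic, table-driven decision: no heuristics, fully testable."""
    return POLICY[level][op]
```

A table-driven policy is trivially auditable: the review artifact is the table itself plus a test that exercises every cell.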
H2-8 · Intrusion detection & tamper sensors (and what “good” looks like)
Anti-tamper is not a single “tamper pin.” A robust design combines sensors, a tamper hub (debounce, self-test, correlation), and a shortest-path response into the RoT. This chapter focuses on the tamper loop only—environment monitoring is handled elsewhere.
A “good” intrusion/tamper design has controllable false alarms, traceable events, a short response path to security controls, and high bypass difficulty. The system should support graded actions—log only, key lockdown, session wipe, and full zeroize—while producing audit-safe evidence (sensor ID, reason codes, latch state, and counters) that enables reproducible post-event review.
Tamper sensor categories (what they mitigate)
- Lid / enclosure open: detects physical access to internals; pair with latch + self-test to avoid simple bypass.
- Light exposure: detects enclosure breach; best used with multi-point sensing and correlation (not a single photodiode).
- Temperature extremes: detects thermal attacks and abnormal service conditions; require debounce and policy thresholds.
- Acceleration / shock: detects forced access events; use correlation to reduce false positives.
- Power / clock anomaly: detects glitch-style manipulation attempts; tie into shortest-path security gating.
- Tamper mesh (probe grid): detects probing; require continuity monitoring and open/short detection with latching.
Keep the scope security-focused: only tamper-relevant sensing and response are covered here.
Response levels (log → lockdown → session wipe → zeroize)
- Log only: record event chain with sensor ID and reason code; used for low-confidence or single-sensor anomalies.
- Key lockdown: immediately deny signing/unwrapping and other sensitive key operations in the RoT boundary.
- Session wipe: clear volatile session material; forces re-attestation and re-establishment of trusted state.
- Full zeroize: destroy selected RoT-managed secrets and set a latched “tamper” state requiring controlled recovery/re-provisioning.
The response ladder should be deterministic and testable with fault injection and bypass attempts.
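A correlation-aware ladder can be sketched as a pure decision function. The sensor classes and escalation thresholds here are illustrative assumptions, not a certified policy:

```python
# Sensors whose confirmed trigger alone carries high confidence
# (illustrative grouping based on the matrix below).
HIGH_CONFIDENCE = {"mesh", "power_clock"}

def decide_response(events) -> str:
    """events: list of (sensor_name, confirmed) within one correlation window.
    Returns one rung of the ladder: LOG / LOCKDOWN / SESSION_WIPE / ZEROIZE."""
    confirmed = [s for s, ok in events if ok]
    if not confirmed:
        return "LOG"                       # low-confidence anomaly: record only
    if any(s in HIGH_CONFIDENCE for s in confirmed):
        # high-confidence sensor; correlation with a second sensor escalates
        return "ZEROIZE" if len(confirmed) > 1 else "SESSION_WIPE"
    # multiple independent confirmed sensors justify a stronger action
    return "SESSION_WIPE" if len(confirmed) >= 2 else "LOCKDOWN"
```

Being a pure function, the same event set always maps to the same response, which makes the ladder testable with scripted fault injection.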
Matrix: sensor → attack mitigated → recommended response → evidence
| Sensor | Attack mitigated | Typical bypass attempt | Response level | Evidence artifact |
|---|---|---|---|---|
| Lid open | unauthorized physical access | switch short / magnet spoof | lockdown → wipe | sensor ID + latch + counter |
| Light | enclosure breach / exposure | light blocking / localized cover | log → lockdown | threshold bucket + time |
| Temperature | thermal manipulation / abuse | slow ramp to stay under threshold | log → lockdown | max/min bucket + duration |
| Shock | forced opening attempts | vibration masking | log (correlate) | event count + correlation flag |
| Power/clock | fault injection / glitch | low-amplitude repeated pulses | lockdown → wipe | window flags + counter |
| Mesh | probe / micro-drilling | bridge repair / localized bypass | wipe → zeroize | mesh continuity fault + latch |
Use correlation to control false alarms: multi-sensor triggers can justify stronger actions than single noisy channels.
H2-9 · Zeroization controls: erase fast, verify, and survive power loss
Zeroization must be defined as a repeatable, testable loop: scope → trigger → execute → verify → evidence. This chapter covers the minimum time/energy window needed for a secure erase, without expanding into hold-up power architecture. (See Hold-Up & Emergency Power for system-level energy delivery.)
“Zeroize” means sensitive material becomes non-recoverable or non-usable under a latched security state. A defensible implementation identifies every storage location (RoT, external NVM, RAM/cache/DMA), defines trigger sources and response levels, meets a worst-case time budget, and produces audit-safe evidence (reason codes, latch state, counters, and verification status) without exposing secrets.
Zeroize targets (what must be covered to avoid “leftovers”)
A reliable zeroization definition starts with targets and where they can persist. Group targets by boundary and by persistence risk to avoid missing common leak paths.
| Target class | Where it lives | Recommended handling | Verification & evidence |
|---|---|---|---|
| Root / KEK | RoT secure slots / internal NVM | destroy slot material; latch tamper/zeroize state | slot status + latch flag + reason code |
| DEK blobs | external NVM (wrapped), file-like storage | invalidate + overwrite (if required) + deny unwrap under latch | blob version invalidation + unwrap-deny counter |
| Session keys | RAM, crypto engines, FIFOs | RAM scrubbing + engine reset + FIFO purge | scrub complete flag + reset reason code |
| Sensitive config | NVM, shadow RAM, caches | clear shadow copies; re-load only after re-provision | config state hash/ID + cleared-state marker |
| Temp buffers | DMA buffers, caches, crash areas | explicit wipe + prevent dumping secret-bearing regions | wipe counter + policy ID (dump disabled) |
Non-negotiable: include cache/DMA paths and any “diagnostic dump” region that might capture plaintext.
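One way to make the target table enforceable is a machine-checked target map whose completion flag only sets when every mapped class was wiped. A minimal sketch, assuming hypothetical target names, action labels, and a caller-supplied wipe routine:

```python
# Hypothetical zeroize target map: target class -> wipe action label.
# Cache/DMA and diagnostic-dump regions are first-class entries so they
# cannot be "forgotten" leftovers.
TARGET_MAP = {
    "rot_slots":   "destroy_slot",
    "ram":         "scrub",
    "cache_dma":   "scrub",
    "diag_dump":   "wipe_and_disable",
    "ext_blobs":   "invalidate",
    "sens_config": "clear_shadow",
}

def run_zeroize(wipe_fn, map_version=3):
    """Apply wipe_fn to every mapped target; return evidence, never secrets."""
    evidence = {"target_map_version": map_version, "completed": {}}
    for target, action in TARGET_MAP.items():
        wipe_fn(target, action)
        evidence["completed"][target] = evidence["completed"].get(target, 0) + 1
    # The completion flag is derived, not asserted: it only becomes true
    # when every target class in the map produced a completion counter.
    evidence["zeroize_complete"] = set(evidence["completed"]) == set(TARGET_MAP)
    return evidence
```

Deriving `zeroize_complete` from the map (rather than setting it manually) turns a missing target into a visible failure instead of a silent gap.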
Trigger sources and response ladder (avoid over-destruction)
Zeroization should be policy-driven. Not every abnormal event should immediately destroy all secrets. Use a response ladder so serviceability is not accidentally eliminated.
- Tamper (high confidence): latch security state → key lockdown → session wipe → zeroize (per policy).
- Debug policy violation: latch violation → key lockdown; escalate on repetition or correlation with tamper.
- Attestation/authentication failure: deny privileged actions; escalate after threshold/time-window rules.
- Maintenance mode switch: wipe sessions; restrict key ops; require controlled re-verify before full enable.
- Commanded zeroize: accept only via authenticated control path; always produce an evidence record.
Keep the shortest path: triggers should reach the RoT boundary without depending on untrusted host software.
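The response ladder above can be expressed as a policy table with per-trigger escalation counters, so repeated low-confidence events harden the response without destroying secrets on the first anomaly. A minimal sketch with hypothetical trigger names, tiers, and a threshold of three:

```python
# Hypothetical graded policy: trigger -> (base response, escalated response).
# None means the base response is already terminal for that trigger.
POLICY = {
    "tamper_high":     ("zeroize", None),
    "debug_violation": ("lockdown", "zeroize"),
    "attest_fail":     ("deny", "lockdown"),
    "maint_switch":    ("session_wipe", None),
}

counters = {}  # per-trigger repetition counters (also audit evidence)

def respond(trigger, threshold=3):
    """Return the policy response, escalating after `threshold` repeats."""
    base, escalated = POLICY[trigger]
    counters[trigger] = counters.get(trigger, 0) + 1
    if escalated and counters[trigger] >= threshold:
        return escalated
    return base
```

Keeping the counters alongside the policy means the evidence log can show why an escalation fired (trigger source plus repetition count), without logging any secret.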
NVM “irreversibility” (what can be guaranteed under power loss)
Not all storage behaves the same under wear leveling, journaling, or partial writes. A defensible approach defines two safety goals: (A) non-recoverable (material is physically destroyed), or (B) non-usable (latched policy prevents use even if remnants exist).
- Prefer “no plaintext in external NVM”: store only wrapped blobs that require RoT approval to unwrap.
- Invalidate first: make old blobs/config invalid immediately, so a power cut cannot “restore” usability.
- Overwrite when required: when policy demands physical removal, overwrite with verification readback (concept-level).
- Latch blocks recovery: after a tamper/zeroize latch, deny unwrap/sign operations and require controlled re-provisioning.
The safest definition under brownout is often “non-usable under latch,” because physical erase may not complete for every medium.
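The invalidate-first and latch-blocks-recovery rules can be sketched as an ordering constraint: the latch (a tiny, atomic state write) is set before any slower invalidation or overwrite, so a power cut at any later point still leaves the material non-usable. Names and the in-memory model below are illustrative only:

```python
class NvmKeyStore:
    """Sketch: 'non-usable under latch' beats physical erase under brownout."""

    def __init__(self):
        self.valid_versions = {1}   # versions of wrapped blobs accepted for unwrap
        self.latched = False        # would persist in RoT-internal NVM

    def zeroize(self):
        # Step 1: latch first (small, atomic). If power dies after this
        # line, the device reboots into a deny-by-default state.
        self.latched = True
        # Step 2: invalidate blob versions so old blobs cannot "restore"
        # usability even if remnants survive wear leveling.
        self.valid_versions.clear()
        # Step 3 (policy-dependent): physical overwrite with readback may
        # follow, but correctness no longer depends on it completing.

    def unwrap(self, version):
        if self.latched or version not in self.valid_versions:
            return None  # deny; a deny counter would increment here
        return "plaintext-key-handle"
```

The ordering is the point: each step is safe to interrupt because the preceding step already removed usability.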
Time budget and minimal energy window (link to hold-up design)
Zeroization must complete within a worst-case t_total budget: detection → latch → erase/wipe → verify → proof log. The platform must guarantee a minimal energy window to meet t_total at the worst operating corner.
- t_detect: trigger detection and debounced qualification.
- t_latch: security latch set (no further secret use allowed).
- t_erase: wipe targets (RoT slots, RAM/cache/DMA, policy-defined NVM actions).
- t_verify: verify outcomes (status flags, deny counters) + record proof evidence.
System-level energy delivery methods are covered by the Hold-Up & Emergency Power page; this chapter only defines the required window.
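The four phases above sum into a worst-case `t_total` that must fit inside the guaranteed energy window. A minimal budget check, assuming illustrative millisecond values and a hypothetical design margin:

```python
def zeroize_budget_ok(t_detect, t_latch, t_erase, t_verify,
                      energy_window_ms, margin=1.2):
    """Check that worst-case zeroize phases fit the guaranteed hold-up window.

    All times in milliseconds at the worst operating corner; `margin` is an
    illustrative design margin, not a standard value.
    """
    t_total = t_detect + t_latch + t_erase + t_verify
    return t_total * margin <= energy_window_ms, t_total
```

Running it with sample numbers (0.5 + 0.1 + 18.0 + 2.0 = 20.6 ms against a 30 ms window) passes with margin; pushing `t_erase` to 28 ms fails, which is exactly the signal to either shrink the immediate-erase scope or raise the hold-up requirement on the power page.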
Zeroize checklist (scope · trigger · time · evidence)
| Checklist item | Pass criteria | Evidence artifact |
|---|---|---|
| Scope coverage | RoT slots + RAM + cache/DMA + external blobs/config all mapped | target map version ID |
| Trigger ladder | triggers are graded (log → lockdown → wipe → zeroize) | policy ID + escalation counters |
| Power-loss safe | latch prevents secret use even if power cuts during erase | latch state + deny counters |
| Verification | wipes produce verifiable completion signals | scrub complete + slot status |
| Proof log | events recorded without secret leakage | reason code + timestamp + counter |
H2-10 · Provisioning & maintenance: secure manufacturing without bricking serviceability
Provisioning is a closed loop: generate identity material → inject into the secure boundary → verify → lock lifecycle state → ship with evidence. Maintenance must remain possible, but only via controlled modes that never export root secrets or bypass the RoT.
Secure manufacturing does not mean “a powerful fixture with secrets.” It means per-device identity and keys are created under control, injected into the secure boundary (RoT/SE/HSM-class element), verified via a measurable evidence step, and then the device is moved to a production-locked lifecycle. Serviceability is preserved through controlled maintenance modes that restrict operations and require re-verification.
Roles and boundaries (who can do what)
| Role | Allowed operations | Forbidden operations | Required evidence |
|---|---|---|---|
| Factory | inject identity/cert chain into secure boundary; run verify; lock lifecycle | export root/KEK; keep reusable secrets on fixture | policy ID + lock state + fingerprints |
| Operator | consume device evidence; approve onboarding; operate in production mode | force debug unlock; bypass verify for updates | onboarding record + evidence check |
| Service | enter controlled maintenance mode; reflash firmware; reprovision under authorization | dump secrets; permanently disable security checks | maintenance ticket ID + re-verify record |
Keep “serviceability” by controlling modes, not by making debug permanently open.
Factory loop (generate → inject → verify → lock → ship)
- Generate identity material: per-device identity and certificate chain are prepared in a controlled environment.
- Inject into secure boundary: secrets land in RoT/SE slots; external storage holds only wrapped/derived artifacts.
- Verify: run an evidence step (attestation-style) to confirm keys, policy, and firmware state.
- Lock lifecycle: switch from development to production (debug restrictions, rollback policy enabled).
- Ship with evidence bundle: fingerprints, policy ID, lock state, version IDs—no secrets.
Fixtures should trigger operations, not store reusable secrets.
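The five-step factory loop can be sketched end to end: the secret never leaves the slot structure, and the shipped evidence bundle contains only non-secret metadata. Everything here is illustrative (the SHA-256 derivation stands in for HSM-internal key generation; field names are hypothetical):

```python
import hashlib

def factory_provision(device_id):
    """Sketch of generate -> inject -> verify -> lock -> ship."""
    # Generate: stand-in for per-device material created inside an HSM/SE.
    secret = hashlib.sha256(f"per-device:{device_id}".encode()).digest()
    # Inject: secret lands in a non-exportable slot, not on the fixture.
    slot = {"material": secret, "exportable": False}
    # Verify: a non-secret fingerprint serves as the measurable evidence step.
    fingerprint = hashlib.sha256(secret).hexdigest()[:16]
    # Lock: lifecycle moves to production (debug restricted, rollback on).
    lifecycle = "production-locked"
    # Ship: the evidence bundle deliberately excludes the secret itself.
    evidence = {
        "device_id": device_id,
        "fingerprint": fingerprint,
        "lifecycle": lifecycle,
        "policy_id": "POL-7",
    }
    assert "material" not in evidence  # guard: no secrets in shipped evidence
    return slot, evidence
```

The structural choice worth copying is the split return: the slot (secret-bearing) stays inside the secure boundary, while only the evidence dict crosses it.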
Maintenance / RMA boundary (secure recovery without bypass)
A service workflow should support controlled re-provisioning while keeping the RoT boundary intact. A practical approach is to use a restricted maintenance mode that allows updates and re-binding, but not secret export.
- Entry control: authorized transition into maintenance mode, time-limited, with explicit evidence logging.
- Allowed actions: firmware reflash, certificate refresh, controlled reprovisioning steps.
- Denied actions: exporting root/KEK, permanent debug unlock, bypassing verify/attestation.
- Exit control: re-verify state and re-lock lifecycle before returning to production mode.
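The entry/exit controls above amount to a small gate: an explicit allow-list, a time limit, and an audit line for every decision including the re-lock on exit. A minimal sketch with hypothetical action names and ticket IDs:

```python
# Hypothetical allow/deny lists for a restricted maintenance mode.
ALLOWED = {"reflash", "cert_refresh", "reprovision"}
DENIED_EXAMPLES = {"export_kek", "debug_unlock_permanent", "bypass_verify"}

class MaintenanceMode:
    """Authorized, time-limited service session with a full audit trail."""

    def __init__(self, ticket_id, ttl_s):
        self.ticket_id, self.ttl_s = ticket_id, ttl_s
        self.active, self.log = True, []

    def request(self, action, t):
        # Deny-by-default: anything not on the allow-list is refused,
        # as is any request after expiry or exit.
        ok = self.active and t <= self.ttl_s and action in ALLOWED
        self.log.append((self.ticket_id, action, "allow" if ok else "deny"))
        return ok

    def exit(self):
        self.active = False
        self.log.append((self.ticket_id, "re-verify+relock", "allow"))
```

Note that denied actions are logged, not silently dropped: a later audit can show every attempted boundary crossing under the ticket ID.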
Common pitfalls (and how to prevent them)
- Fixture leakage: a programming fixture stores reusable secrets → prevent by removing secrets from fixtures and enforcing per-device derivation/injection.
- Key reuse across a batch: “same KEK for convenience” → prevent with per-device uniqueness and domain binding; detect via evidence checks.
- RMA bypass: service path can skip RoT checks → prevent with maintenance mode limits and mandatory re-verify before relock.
Evidence-based gates (verify + lock state) turn these pitfalls into detectable failures, not silent compromises.
H2-11 · Validation & attack-informed test plan (what proves it’s done)
“Done” means: for secure boot, keys, RNG, tamper, and zeroization, there is (1) a concrete stimulus, (2) a deterministic expected behavior, and (3) a stored evidence trail that survives resets and supports certification review.
- Must record: boot reason codes, policy IDs, active firmware version/measurement ID, monotonic counters (rollback), update state transitions, tamper source + action taken, zeroize completion/latch flags.
- Must avoid: raw key material, TRNG raw samples, plaintext sensitive payloads, replayable tokens, or anything that enables recovery of secrets.
- Evidence format: event ID + version + counter + reason code + optional timestamp/sequence (monotonic) so production and service can correlate logs.
- Pass/Fail gate: each test row ends with one explicit gate (e.g., fail-closed, controlled degraded mode, requires reprovision).
System-wide self-test orchestration belongs to BIT/BIST & Health Monitoring. This page only validates the crypto/anti-tamper chain and its proof artifacts.
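The evidence format bullet can be made concrete as a record builder that emits only decision metadata with a monotonic sequence number. The field names below are illustrative, not a mandated schema:

```python
import itertools
import time

_seq = itertools.count(1)  # stand-in for a hardware monotonic counter

def evidence_record(event_id, version, reason_code, counter):
    """Minimal audit-safe record: decision metadata only, never secrets."""
    return {
        "event_id": event_id,   # e.g. BOOT_DENY
        "version":  version,    # active firmware version / measurement ID
        "counter":  counter,    # rollback or deny counter value
        "reason":   reason_code,
        "seq":      next(_seq), # monotonic sequence for log correlation
        "ts":       int(time.time()),  # optional timestamp
    }
```

Because the schema is closed (a fixed set of metadata fields), a review can verify by inspection that no secret-bearing field can be logged.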
| Test | Setup | Stimulus (attack-informed) | Expected behavior | Evidence artifact | Pass/Fail gate |
|---|---|---|---|---|---|
| Secure boot: image integrity (tamper detection path) | Production policy enabled; debug locked; known-good golden image present. | Integrity check fails for a boot stage image (corrupt or mismatched digest). | Boot is denied; system enters a safe stop or defined recovery path (no “best-effort boot”). | Boot reason = SIG/DIGEST_FAIL; deny counter increments; measurement ID recorded. | fail-closed |
| Secure boot: rollback protection (monotonic counter) | Rollback counter provisioned; multiple signed versions available. | Attempt to boot a validly signed but older version than policy allows. | Boot is blocked (or requires explicit authorized service mode); no silent downgrade. | Reason = ROLLBACK_BLOCK; counter value + policy ID logged. | fail-closed |
| Update: power-loss recovery (A/B images) | A/B enabled; recovery state machine enabled; evidence logging on. | Power loss during update state transitions (download/write/switch point). | System returns to last known-good image or recovery image; no partial image ever boots. | Update state trace; selected-slot record; recovery-used flag. | controlled recovery |
| Cert status: revocation (policy-driven trust) | Revocation list / trust policy updated; audit logging enabled. | Boot chain encounters a revoked signer/cert in the verification path. | Reject image; require a permitted re-provision/maintenance flow if applicable. | Cert status = REVOKED; policy ID; reject reason code. | requires reprovision |
| Keys: export boundary (non-exportable RoT) | RoT/SE key slots created as non-exportable; least-privilege policy applied. | Unauthorized attempt to read/export a key or wrapped secret outside policy. | Operation is denied; repeated attempts optionally escalate to lockdown. | Key op log = EXPORT_DENY; attempt counter; optional tamper escalation code. | deny + log |
| Keys: domain separation (KEK/DEK binding) | Key hierarchy defined (Root→KEK→DEK/session); permissions enforced per key purpose. | Attempt to use a KEK/DEK outside its bound domain or purpose. | Policy violation is rejected; system never “falls back” to a weaker rule. | Policy-violation record; key version; caller context (non-secret). | deny + evidence |
| Keys: rotation & revoke (lifecycle proof) | Rotation procedure available; old/new slots defined; logs enabled. | Rotate active slot; revoke old slot; validate old secrets cannot unwrap new data. | New slot becomes active; revoked slot cannot be used; services remain operable under service policy. | Active slot ID; key version; revoke flag; post-rotate health event. | must survive service |
| RNG: startup health gate (no entropy, no keys) | Cold boot; RNG health tests enabled; keygen path requires RNG PASS. | Startup health test fails (insufficient entropy / failed continuous test). | Key generation and critical provisioning are blocked until RNG health is restored. | RNG health = FAIL; boot gate reason; counter of failures. | fail-closed |
| RNG: corner operation evidence (temp/voltage extremes) | Environmental corner campaign; evidence logging enabled. | Operate at hot/cold and supply corners during RNG use and periodic tests. | Health tests remain within policy; if degraded, system enters defined degraded mode and logs it. | Health counters; degrade flag; policy ID; event sequences. | defined degrade |
| Tamper: sensor trigger path (short closed loop) | Tamper sensors enabled; response levels defined (log→lock→zeroize). | Any enabled intrusion sensor indicates an event (open/mesh/light/temp/accel as designed). | Response executes within the shortest path (prefer RoT-mediated), without reliance on application code. | Tamper source; confidence; action taken; latched state; sequence number. | must latch |
| Tamper: nuisance rate control (false positives) | Operational vibration/temperature profile defined; debouncing configured. | Long-duration operation in representative environment (maintenance/transport/idle). | False event rate stays below threshold; every event is traceable to a source + policy action. | False-event counter; tamper histogram; debouncing version ID. | meets threshold |
| Zeroize: scope coverage (RAM + NVM + buffers) | Zeroize target map defined; includes sensitive RAM regions and persistent slots. | Trigger zeroize via defined source (tamper, admin command, mode transition). | All targets in the map are cleared/invalidated; no “forgotten” DMA/cache regions remain in scope. | Target-map version; per-target completion counters; zeroize completion flag. | must cover all |
| Zeroize: time budget (erase fast) | Worst-case timing corners defined; logging enabled; power margin known. | Trigger zeroize at worst-case conditions (temperature/supply corners). | Zeroize completes within budget; if not, enters defined locked state where secrets are unusable. | Start/end timestamps; timeout reason code; locked-state flag. | complete or lock |
| Zeroize: power-loss survival (proof after reboot) | Latched zeroize policy; deny-by-default when latch is set. | Power loss occurs during zeroize window. | After power returns, system remains latched in a safe state; sensitive operations stay denied until authorized reprovision. | Latched flag persists; deny counters; recovery state record. | deny persists |
| Provisioning: uniqueness (no batch key reuse) | Factory flow enabled; device identity provisioning enabled; audit logs on. | Verify per-device identity & key material are unique across a lot (without exposing secrets). | Uniqueness is proven via non-secret IDs, cert chains, slot/version metadata. | Device ID; cert serial; key slot/version metadata; provisioning report signature. | must be unique |
| Maintenance: service boundary (repair without bypass) | Authorized service mode defined; debug lock policy documented. | Attempt maintenance actions outside authorized boundary. | Unauthorized paths are blocked; authorized service actions produce explicit evidence and do not weaken production policy afterward. | Service-mode entry proof; policy restore event; audit trail. | policy restores |
Tip for audits: keep a one-page “evidence dictionary” mapping each reason code/counter to the test rows above. That makes review faster than long prose.
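The one-page evidence dictionary can itself be a small lookup table mapping each reason code to its test row and plain-English meaning. Entries below are illustrative examples drawn from the table above:

```python
# Hypothetical evidence dictionary: reason code -> (test row, meaning).
EVIDENCE_DICT = {
    "DIGEST_FAIL":    ("secure-boot/image-integrity", "boot denied, bad digest"),
    "ROLLBACK_BLOCK": ("secure-boot/rollback", "older signed version refused"),
    "EXPORT_DENY":    ("keys/export-boundary", "key export attempt denied"),
    "RNG_FAIL":       ("rng/startup-health", "keygen gated on entropy health"),
}

def explain(reason_code):
    """One-line audit answer for a reason code; flags unmapped codes."""
    row, meaning = EVIDENCE_DICT.get(
        reason_code, ("unmapped", "add this code to the dictionary"))
    return f"{reason_code}: {meaning} (test row: {row})"
```

Unmapped codes get an explicit “add to dictionary” answer rather than silence, which keeps the dictionary and the test plan from drifting apart.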
- NXP EdgeLock SE050 — example OPN: SE050C2HQ1/Z01SDZ (product)
- Infineon OPTIGA™ Trust M — example OPN: OPTIGA-TRUST-M-EXPRESS (product)
- Microchip CryptoAuthentication™ ATECC608B — example OPN: ATECC608B-MAHCZ-S (product)
- ST STSAFE-A110 — example family: STSAFE-A110 (product)
Keep this list as “reference devices” in the validation plan; do not treat it as a universal recommendation list.
- Vibration/acceleration evidence: ADXL355 (3-axis accelerometer) (ADI)
- Temperature evidence: TMP117 (digital temperature sensor) (TI)
- Supply supervisor evidence: TPS3890 (voltage supervisor / reset) (TI)
This page uses these only as tamper-chain inputs & evidence sources (not as a full environment-monitoring architecture).
- Side-channel / fault-injection evaluation platform: ChipWhisperer-Husky (docs)
- Evidence capture basics: logic analyzer / power monitor / environmental chamber (for corner campaigns).
Use evaluation tools to confirm that detection and policy responses work as designed; do not document exploitation steps in production-facing manuals.
H2-12 · FAQs ×12 (Crypto & Anti-Tamper)
FAQ goal: capture long-tail questions with short, evidence-focused answers and clear links back to the relevant sections.
Each answer states the boundary, the practical rule-of-thumb, and the evidence that can be stored without leaking secrets.
1. Secure boot and measured boot—what problem does each solve?
Secure boot prevents unauthorized software from running by verifying signatures before execution (a “block the bad” control). Measured boot records what actually loaded by producing signed measurements that can be reviewed later (a “prove what ran” control). Many avionics programs use both: secure boot enforces trust, while measurements provide audit evidence via non-secret hashes and reason codes.
2. Secure element vs HSM—what selection boundary actually matters?
The boundary is not the label, but the Root-of-Trust duties and evidence outputs: non-exportable key storage, policy-enforced signing/decrypt, monotonic counters (rollback), and auditable reason codes. Secure elements fit fixed, low-power trust anchors; HSM-class devices fit higher throughput, richer algorithms, and more complex interfaces. Choose by lifecycle needs (updates, rotation, service mode), not by buzzwords.
3. How is rollback protection implemented without killing maintainability?
Rollback protection typically ties boot acceptance to a monotonic version counter and an allowed-version policy. Maintainability is preserved by defining an authorized service path: controlled maintenance mode, signed recovery images, or supervised re-provisioning that updates policy without enabling silent downgrades. Evidence should capture the blocked version, policy ID, and counter state—never keys—so service teams can diagnose why boot was denied.
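The counter-plus-policy rule in this answer can be sketched as a small gate: normal boots require the image version to meet the counter, and the authorized service path permits a supervised downgrade without ever lowering the counter itself. Names and the tuple shape are illustrative:

```python
def accept_boot(image_version, counter, service_mode_authorized=False):
    """Monotonic-counter rollback gate with a supervised service path.

    Returns (accepted, reason_code, new_counter). The counter only ever
    ratchets upward; even a service downgrade leaves it untouched so the
    evidence trail shows the downgrade explicitly.
    """
    if image_version >= counter:
        return True, "OK", max(counter, image_version)
    if service_mode_authorized:
        return True, "SERVICE_DOWNGRADE", counter  # logged, not silent
    return False, "ROLLBACK_BLOCK", counter
```

The evidence rule from the answer falls out naturally: a denied boot yields the blocked reason code and the counter state, never any key material.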
4. What should be non-exportable, and what can be wrapped/exported safely?
Root keys and KEKs that protect many assets should be non-exportable and only usable inside the RoT/SE for unwrap/sign operations. Data keys (DEKs) and session keys can be wrapped when portability is required, but only under strict domain separation (purpose binding, device binding, and versioning). Safe export means exporting a wrapped blob plus metadata (key version, policy ID), not plaintext material or replayable secrets.
5. How do TRNG health tests fail in the field, and what should the system do?
Field failures often come from startup entropy shortage, temperature/voltage corners, aging-driven entropy reduction, or stuck behavior that only appears intermittently. A defensible design gates sensitive operations on TRNG health: if health tests fail, key generation and provisioning are blocked, and the system enters a defined degraded state with explicit logs. Evidence should be PASS/FAIL flags and counters, never raw random bits.
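The “stuck behavior” failure mode in this answer is typically caught by a continuous health test. Below is a minimal sketch in the spirit of the repetition count test from NIST SP 800-90B (the cutoff value is illustrative and depends on the source's assessed entropy, so treat it as a placeholder, not a recommendation):

```python
def repetition_count_test(samples, cutoff=34):
    """Fail if any value repeats `cutoff` times in a row (stuck source).

    Sketch of an SP 800-90B-style continuous test; a real implementation
    runs inline on the noise source and latches a health-FAIL flag that
    gates key generation, rather than returning a boolean.
    """
    run, prev = 0, None
    for s in samples:
        run = run + 1 if s == prev else 1
        prev = s
        if run >= cutoff:
            return False  # stuck-at behavior detected
    return True
```

The gating rule from the answer then follows: on a FAIL, key generation and provisioning are blocked and only a PASS/FAIL flag plus a failure counter are logged, never the raw samples.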
6. How to design tamper sensors to avoid false alarms but still be hard to bypass?
Start with layered sensing: enclosure open plus at least one harder-to-bypass signal (mesh, light, or anomaly detection). Control nuisance alarms with debouncing, contextual thresholds, and “log-only” tiers before destructive actions—while keeping a short closed-loop path for high-confidence triggers. “Good” means every event produces a source + confidence + action record, and the bypass cost rises because multiple independent sensors must be defeated.
7. What triggers should cause partial wipe vs full zeroization?
Use graded responses. Low-confidence or operational triggers (e.g., policy violations) can justify session wipe and lockdown without destroying long-term identity. High-confidence physical intrusion or repeated critical violations should trigger full zeroization of root/KEK material and any sensitive cached data. The decision must be policy-driven and logged: trigger source, response tier, and latched state. This preserves serviceability while keeping strong anti-tamper outcomes when needed.
8. How to guarantee zeroization under power loss (time/energy budget)?
Guarantee comes from two elements: a minimal time budget for the erase/invalidations that must happen immediately, and a persistent latch that denies sensitive operations even if power is lost mid-wipe. The system should be safe-by-default after reboot (deny unwrap/sign until authorized reprovision). This page only defines the zeroize window concept and proof artifacts; detailed hold-up design belongs to the Hold-Up & Emergency Power page.
9. What is a safe factory key-injection flow that scales without leaks?
A scalable flow keeps secrets inside a protected boundary: generate or unwrap keys inside a trusted HSM/SE, inject into non-exportable slots, verify via signed non-secret proofs (cert serials, slot IDs, policy IDs), then lock debug and record a provisioning report. Common leak sources are shared fixtures, reused batch keys, and missing audit trails. The best “scale” control is repeatable evidence, not manual exceptions.
10. How to handle RMA/service while preventing key cloning?
Define a service lane that is authorized, temporary, and fully auditable: controlled maintenance mode, explicit policy IDs, and post-service re-lock with proof. When boards are swapped, identity and key material should not be copied; instead, re-provision under a controlled process that binds credentials to the new hardware. Evidence should show service entry/exit, policy restoration, and key-slot/version outcomes—without exposing secrets.
11. What evidence/logs are acceptable to prove integrity without leaking secrets?
Acceptable evidence is metadata that supports verification: reason codes, counters, policy IDs, firmware versions, measurement IDs (hashes), slot IDs, and latched states. Unacceptable evidence includes raw key material, TRNG raw samples, or logs that reconstruct sensitive plaintext. A good rule is: store “what decision was made and why” (deny/allow, policy, counter), not the secret inputs that drove the decision.
12. Which tests are the minimum “done” criteria for anti-tamper readiness?
Minimum “done” criteria cover five chains: secure boot (tamper/rollback/power-loss update), key boundary (export denial and rotation/revoke proof), RNG health gating (startup and corner evidence), tamper response (sensor trigger + nuisance rate control), and zeroization (scope coverage + latch proof after reboot). Each test must end with a clear gate (fail-closed, recovery, or reprovision) and an evidence artifact that survives resets.