Security & Anti-Tamper for Machine Vision Systems
Security & Anti-Tamper ensures a machine-vision device can prove its firmware, identity, and video evidence are authentic—even under update, service, and physical-access risks. It combines secure boot and key protection with anti-rollback updates, signed/watermarked streams, controlled debug/RMA unlock, tamper response, and append-only audit logs.
Threat Model for Vision Devices
A practical threat model turns “security goals” into a build-and-test checklist: define protected assets, map attack surfaces, state realistic attacker capabilities, describe business-visible impacts, then bind each threat to a control and a verification method.
1) Assets (what must remain authentic / secret)
- Boot & firmware: bootloader stages, recovery image, verified manifests, security policy configuration.
- Identity & keys: device identity key/certificate, key-derivation materials, anti-rollback counters.
- Content authenticity: frame/chunk signatures, watermark keys, sequence counters/nonce rules.
- Evidence & traceability: audit logs, tamper events, update history, boot measurements (hash digests).
Engineering note: the most common failure is not “no crypto,” but assets stored or used outside the intended trust boundary (e.g., unsigned configs, read-out-able keys, erase-able logs).
2) Attack surfaces (where compromise usually starts)
- Boot chain: replace/patch early-stage components, modify boot configuration, fault-inject verification checks.
- Update path: deliver modified packages, replay old packages, downgrade to vulnerable versions.
- Debug access: JTAG/SWD/UART pads, insecure RMA unlock flow, over-privileged diagnostic modes.
- External media / interfaces: unsigned parameters/resources, replaceable storage, injection via attachable peripherals.
- Video stream path: replay/insert/drop frames, replace metadata, spoof “trusted” origin without proof.
- Physical access: chip-off attacks, probing buses, component swap, glitching power/clock/reset.
3) Attacker capability levels (use to set priorities)
- L1 Remote: network reachability; no physical contact.
- L2 Close access: can touch ports and enclosure; can trigger resets/power cycles.
- L3 Teardown: can open device and access PCB, flash, test pads.
- L4 Lab: can do voltage/clock glitching, fault injection, side-channel attempts.
Priorities must match the assumed attacker ceiling: many industrial vision deployments must treat L2/L3 as baseline.
4) Business-visible impacts (why it matters)
- Fake evidence: replayed or manipulated video frames that still look “valid” downstream.
- Persistent compromise: modified boot chain or update agent enabling hidden backdoors.
- Device cloning: copied identity keys enabling counterfeit devices and license bypass.
- Downgrade exploitation: forced rollback to a known-vulnerable firmware release.
- Lost traceability: erased or modified logs preventing root-cause analysis and accountability.
Threat-to-Control Mapping (deliverable that prevents scope creep)
Each threat must map to: a control (what to implement) and a verification (how to prove it works). This table becomes the “contract” that later chapters must fulfill.
| Threat (what attacker tries) | Affected asset | Typical entry point | Control (where to stop it) | Verification (evidence / test) | Residual risk |
|---|---|---|---|---|---|
| Boot component replacement | Boot chain, policy config | Boot media / config | Verified boot + manifest coverage + fail-closed | Replace a component → device must refuse boot; logs show reason | Lab fault injection may still target checks |
| Firmware downgrade (rollback) | Firmware integrity | Update path / replay | Anti-rollback with monotonic counter + signed update metadata | Attempt old version install → must be rejected with non-erasable record | Counter storage must be tamper-resistant |
| Key extraction / device cloning | Identity keys, certificates | Teardown / debug | Secure element / OTP/eFuse policy + debug lock + key never readable | API must support sign/derive only; read-out must be impossible in production | Advanced side-channel remains possible |
| Video replay / frame injection | Stream authenticity | Stream path | Frame/chunk signing + hash chain + anti-replay counters; watermark as supplement | Replay old segment → verifier detects counter/hash-chain break | Verifier implementation must enforce policy |
| Log wiping / cover tracks | Audit evidence | File system access | Append-only logs + integrity chaining + signed checkpoints | Modify/delete records → integrity check must fail; sequence must not go backwards | Storage exhaustion needs policy |
Root of Trust & Chain of Trust (Boot ROM → BL → OS/App)
A secure vision device must establish trust at the earliest possible stage (Root of Trust) and extend that trust through every executable and critical configuration component (Chain of Trust). The design must also define deterministic “fail-closed” behavior with testable evidence.
1) Root of Trust forms (role boundaries, not platform tutorials)
- SoC Boot ROM RoT: immutable verification logic at the first instruction; strongest “start point” for trust.
- External Secure Element / TPM: isolates private keys and can attest device identity; relies on secure integration and policy.
- FPGA/MCU bridge RoT: can gate boot or enforce policy, but must itself be verified or anchored to a RoT.
The RoT objective is not "more crypto": it is that keys cannot be read out, verification cannot be bypassed, and the trust boundary is established before any attacker-controlled code executes.
2) Chain of Trust (what must be covered)
- Stages: ROM → BL1 → BL2/UEFI → kernel → core user-space/app.
- Critical configs: boot arguments, device tree, manifest lists, recovery/maintenance policy.
- Coverage rule: a signed manifest must enumerate hashes of all security-relevant components, not only the “main image shell”.
The most common break is unsigned configuration or missing components from the signed manifest, which enables subtle but powerful substitutions.
3) Measured boot vs Verified boot (what each solves)
- Verified boot: prevents untrusted code from running (block execution).
- Measured boot: produces auditable evidence of what ran (prove device state via hashes).
For vision systems that require data authenticity, measured boot helps support device-side proof that the signing/watermarking pipeline is running in an approved state (without describing any cloud architecture here).
4) Fail-closed policy (deterministic state machine)
- Strict stop: verification failure → halt (max security, low availability).
- Recovery-only: verification failure → enter signed-recovery mode (common for field devices).
- Limited mode: verification failure → restricted operation with full audit logging (use with caution).
A correct policy is predictable: the same failure produces the same state and the same evidence trail.
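The policy above can be sketched as a small lookup-driven state machine. This is a minimal illustration, not a platform API; the state names and reason codes are assumptions chosen to mirror the three policies listed:

```python
from enum import Enum

class BootState(Enum):
    NORMAL = "NORMAL"
    HALT = "HALT"          # strict stop
    RECOVERY = "RECOVERY"  # signed-recovery mode
    LIMITED = "LIMITED"    # restricted operation with full audit logging

# Hypothetical policy table: the same failure reason must always map to
# the same state and the same evidence record (the determinism requirement).
FAIL_POLICY = {
    "SIG_FAIL_BL2": BootState.RECOVERY,
    "SIG_FAIL_KERNEL": BootState.RECOVERY,
    "MANIFEST_MISSING": BootState.HALT,
    "ROLLBACK_DENIED": BootState.RECOVERY,
}

def on_verification_failure(reason: str):
    """Return the fail-closed state and the evidence record to emit."""
    # Unknown reasons fail closed to the most restrictive state.
    state = FAIL_POLICY.get(reason, BootState.HALT)
    evidence = {"event": "boot_verify_fail",
                "reason_code": reason,
                "state": state.value}
    return state, evidence
```

Because the mapping is a static table, determinism is mechanically checkable: the same reason code always yields the same state and the same evidence fields.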
Evidence / Acceptance Criteria (minimum tests that settle arguments)
- Boot measurement digest: a stable location for stage-hash output (register snapshot or protected audit log entry).
- Signature failure injection: tamper a bootloader signature → device must refuse normal boot and enter the defined fail-closed state.
- Coverage test: swap an “often-forgotten” component (e.g., boot config/DTB/manifest entry) → device must detect and refuse.
- Determinism: repeated identical failures must lead to identical state + identical error code/log format.
Secure Boot Details (Signatures, Certificate Chain, Policy & Performance)
Secure boot is a parameter set, not a slogan. A robust design defines: signature scheme, key roles and certificate chain, what exactly is covered by the signed manifest, which components are mutable vs fixed, and how verification time is budgeted across boot stages with deterministic fail-closed behavior.
1) Signature mechanisms (engineering tradeoffs)
- RSA: mature tooling; typically larger signatures/certs; verification cost and storage footprint can dominate small boot partitions.
- ECDSA: smaller signatures than RSA; verification time depends on implementation and acceleration; careful constant-time coding required.
- EdDSA: simpler usage patterns and often cleaner implementations; adoption depends on platform/tooling and audit readiness.
Selection should be driven by (a) verification frequency in the chain, (b) signature + cert size, (c) implementation auditability, and (d) update tooling maturity.
2) Certificate chain & key roles (separation of duties)
- Root key (offline): only signs intermediate certificates; never used for daily image signing.
- Intermediate (rotatable): partitions product lines / factories / batches; enables controlled rotation and revocation.
- Image signing key (high-use): signs boot components and manifests; must be replaceable without bricking deployed devices.
- Recovery signing key (optional): signs recovery-only images; prevents the “recovery path” from becoming a backdoor.
A practical rotation plan defines: how new intermediates are accepted, how old keys are denied, and how attempts are recorded in audit logs.
3) Policy: mutable vs fixed components (scope lock)
- Must be fixed or extremely constrained: trust anchor, early-stage verification policy, and the recovery entry policy.
- May be updated (signed + anti-rollback): later boot stages, OS/kernel, security services, update agent, stream-signing engine.
- Config coverage rule: any boot-critical configuration (boot args / DTB / policy flags) must be bound by the signed manifest.
The common bypass is not a broken signature algorithm—it is unsigned configuration or components missing from the manifest.
4) Boot-time budget (how to estimate without hard numbers)
- Count verifications: number of stages × number of signed objects per stage.
- Measure per-object cost: signature verify + hash compute + manifest parse.
- Total estimate: sum stage-by-stage, then add margin for worst-case storage latency and cold-cache behavior.
Optimization lever: sign a single manifest per stage (hash list + version + policy flags) and verify component hashes, instead of verifying many scattered files repeatedly.
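A minimal sketch of this manifest-per-stage scheme, assuming SHA-256 component hashes and a platform-supplied signature check (abstracted here as a callable, since the real primitive is platform-specific):

```python
import hashlib
import hmac

def verify_stage(manifest: dict, components: dict, verify_sig) -> dict:
    """One signature verification per stage, then per-component hash checks.

    manifest   -- {"hashes": {name: hex_digest}, "sig": ...} (illustrative layout)
    components -- {name: raw bytes} loaded from storage
    verify_sig -- callable abstracting the platform signature primitive
    """
    if not verify_sig(manifest):  # exactly 1 signature verify for the stage
        return {"result": "FAIL", "reason": "SIG_FAIL", "hash_checks": 0}
    checks = 0
    for name, expected in manifest["hashes"].items():
        data = components.get(name)
        if data is None:  # missing component = coverage failure, refuse boot
            return {"result": "FAIL", "reason": f"MISSING:{name}", "hash_checks": checks}
        actual = hashlib.sha256(data).hexdigest()
        checks += 1
        if not hmac.compare_digest(actual, expected):  # constant-time compare
            return {"result": "FAIL", "reason": f"HASH_MISMATCH:{name}", "hash_checks": checks}
    return {"result": "PASS", "reason": None, "hash_checks": checks}
```

The returned `hash_checks` count maps directly onto the `verify_objects` / `hash_checks` fields in the evidence records below.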
Evidence / Acceptance Criteria (what settles disputes)
- Verify-time profile: record `verify_time_per_stage` and `objects_verified` at each stage.
- Coverage proof: produce the signed manifest showing hashes for every boot-relevant component (not just a single "image shell").
- Negative injection: tamper with a manifest signature or a component hash → device must enter the defined fail-closed state.
Example per-stage verification records:
`stage=BL2 verify_objects=1(manifest) hash_checks=7 verify_ms=__ hash_ms=__ result=PASS`
`stage=KERNEL verify_objects=1(manifest) hash_checks=5 verify_ms=__ hash_ms=__ result=PASS`
Implementation details are platform-specific; the requirement is that the record is consistent, comparable, and auditable.
Key Storage, TRNG/DRBG, and Device Identity
Keys must be classified by domain and lifetime, stored within a clear security boundary, and generated from a verifiable entropy pipeline. Device identity must be provable using device-side evidence (certificate chain and challenge-response signatures) without relying on any cloud description.
1) Key classes (domain + lifetime)
- Device identity key: long-term; used for attestation and device uniqueness; must be non-exportable.
- Firmware verification keys: trust anchors and certificate chain used to validate boot and updates (public keys + policy).
- Stream-signing / watermark keys: protects frame authenticity; should be isolated from general application code.
- Session keys: short-lived; derived per session/boot; must be erasable and renewable.
Rule of thumb: long-term private keys must be non-readable; short-term keys must be rotatable and erasable.
2) Storage options (where keys live)
- OTP / eFuse: strong anchor storage; good for root public key hash, policy fuses, monotonic counters.
- Secure Element / TPM: hardware boundary for private keys; supports sign/derive without exporting keys.
- PUF-derived keys: keys derived from silicon uniqueness; requires stability strategy and error handling design.
- Encrypted flash: protects at-rest data; must be anchored to a non-exportable root key or it becomes “copyable encryption”.
Common misconfiguration: encrypted flash with the encryption key stored in the same flash. A secure design anchors secrets to a RoT/SE boundary.
3) TRNG vs DRBG (entropy pipeline)
- TRNG: produces raw entropy from a hardware source; must run health tests at startup and continuously.
- DRBG: expands a seed into high-volume cryptographic randomness; must handle reseed policy and reboot state.
- Operational requirement: record TRNG health status and ensure DRBG does not repeat outputs after resets.
“Looks random” is not sufficient. A field-debuggable system needs health-test evidence and a defined reseed strategy.
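As an illustration of a continuous health test, here is a repetition-count-style check in the spirit of NIST SP 800-90B. The cutoff value below is illustrative only; a real cutoff must be derived from the source's assessed entropy per sample:

```python
def repetition_count_test(samples, cutoff: int = 32) -> dict:
    """Flag FAIL if any raw sample value repeats `cutoff` times in a row.

    Catches a common failure mode: an entropy source stuck at a constant
    value (e.g., after power glitch or hardware fault).
    """
    run, last, max_run = 0, None, 0
    for s in samples:
        run = run + 1 if s == last else 1
        last = s
        max_run = max(max_run, run)
        if run >= cutoff:
            return {"health": "FAIL", "max_run": max_run}
    return {"health": "PASS", "max_run": max_run}
```

The PASS/FAIL result plus `max_run` counter is exactly the kind of field-readable evidence the acceptance criteria below require.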
4) Provisioning (manufacturing vs on-site)
- Factory provisioning: scalable; requires strong manufacturing chain controls and auditable injection records.
- On-site provisioning: reduces supply-chain exposure; increases deployment complexity and process requirements.
- First-boot binding: can generate/lock identity on first boot; depends on a reliable entropy pipeline and immutable policy.
The design should specify what the device can output as proof: certificate chain and a challenge-response signature, verifiable offline.
Evidence / Acceptance Criteria (device-side, verifiable)
- Entropy health logs: record startup and periodic TRNG health results (PASS/FAIL + counters).
- Non-exportable key boundary: in production mode, private keys must never be readable; only `sign()`, `derive()`, and `wrap()` operations are allowed.
- Identity attestation evidence: device outputs a certificate chain and signs a verifier-provided challenge; verification works offline.
Allowed: `sign(msg)`, `verify(msg,sig)`, `derive(ctx)`, `wrap(key)`, `unwrap(blob)`
Forbidden: `read_private_key()`, `export_key_material()`
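A sketch of what this boundary looks like as an API surface. HMAC stands in for the hardware-backed signing primitive, and Python obviously cannot enforce hardware isolation; the point is only the interface shape (sign/derive exist, export does not):

```python
import hashlib
import hmac

class KeyBoundary:
    """Illustrative non-exportable key boundary (names are assumptions).

    In a real device the secret lives inside an SE/TPM/OTP boundary;
    this class only models the allowed operation set.
    """
    def __init__(self, secret: bytes):
        self.__secret = secret  # held inside the boundary, never returned

    def sign(self, msg: bytes) -> bytes:
        # HMAC-SHA256 stands in for the hardware signature operation.
        return hmac.new(self.__secret, msg, hashlib.sha256).digest()

    def derive(self, ctx: bytes) -> bytes:
        # Derived keys may leave the boundary; the root secret does not.
        return hmac.new(self.__secret, b"derive|" + ctx, hashlib.sha256).digest()

    # Deliberately no read_private_key() / export_key_material() methods.
```

Acceptance testing of the real part is then a negative test: the production API must expose sign/derive/wrap and nothing that returns raw key material.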
Secure Firmware Update & Rollback Protection (A/B, Version Index, Monotonic Counter)
Firmware rollback is a primary risk: attackers can downgrade to an older vulnerable release even when signatures are valid. A secure update design must prove three properties: (1) authenticity of the package, (2) anti-rollback enforcement, and (3) atomic recoverability under power loss without creating an unrecoverable brick.
1) Update threats (what must be engineered out)
- Package substitution: MITM swaps payload or metadata.
- Tamper: payload or manifest altered after signing, or signature coverage is incomplete.
- Replay: a previously valid update package is resent.
- Downgrade: a validly signed but older vulnerable version is installed.
- Power-loss brick: interruption during write/switch/commit leaves both slots non-bootable.
The most common bypass is not cryptography failure—it is unsigned metadata (version/slot/policy) or missing components in the signed manifest.
2) Signed update package + manifest coverage
- Payload: firmware image(s) for the inactive slot.
- Manifest: hash list, `rollback_index`, human-readable version, device/model constraints, slot policy, and key identifier.
- Signature: signs the manifest (and therefore all payload hashes and policy fields).
Required proof: manifest lists every boot-relevant component hash and policy flag. Signing only a “container file” is insufficient.
3) Anti-rollback logic (monotonic anchor)
- Comparison rule: allow install only if `incoming.rollback_index >= stored.rollback_index`.
- Anchor options (conceptual): OTP/eFuse, secure element/TPM counter, or secure version store anchored to a RoT boundary.
- Commit rule: update `stored.rollback_index` only after successful switch + health checks (commit stage).
Separating `rollback_index` (the security monotonic counter) from the "marketing version" avoids ambiguous ordering and prevents downgrade gaps.
4) A/B slots + atomic switch + controlled rollback
- Write to inactive slot: download → verify → install always targets the non-active slot.
- Trial boot: switch selects candidate slot with a bounded retry counter.
- Commit gate: only after health checks pass, mark new slot committed and advance anti-rollback index.
- Rollback condition: repeated boot failures or failed health checks trigger automatic reversion to the last known-good slot.
Power-loss safety is a state-machine property: no stage should overwrite the last known-good slot before the new slot is verifiably bootable.
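The anti-rollback gate and the A/B commit flow can be sketched as one small state machine. Slot names, counter values, and the retry limit below are illustrative:

```python
def check_install(incoming_index: int, stored_index: int):
    """Anti-rollback gate: allow only if incoming.rollback_index >= stored."""
    if incoming_index >= stored_index:
        return True, "OK"
    return False, "ERR_ROLLBACK_DENIED"

class SlotManager:
    """A/B switch sketch: the last known-good slot is never overwritten
    before the candidate passes health checks; rollback_index advances
    only at the commit stage."""
    def __init__(self):
        self.active, self.inactive = "A", "B"
        self.committed = "A"
        self.stored_index = 5      # illustrative stored rollback_index
        self.retries_left = 3      # bounded trial-boot retry counter

    def install(self, incoming_index: int) -> str:
        ok, err = check_install(incoming_index, self.stored_index)
        if not ok:
            return err             # logged with incoming/stored index + key_id
        return "INSTALLED_TO_" + self.inactive  # always the non-active slot

    def trial_boot_result(self, incoming_index: int, healthy: bool) -> str:
        if healthy:
            # Commit gate: only now swap slots and advance the counter.
            self.committed = self.inactive
            self.active, self.inactive = self.inactive, self.active
            self.stored_index = max(self.stored_index, incoming_index)
            return "COMMITTED"
        self.retries_left -= 1
        if self.retries_left <= 0:
            return "REVERTED_TO_" + self.committed
        return "RETRY"
```

Note the power-loss property falls out of the structure: the known-good slot is only demoted after `COMMITTED`, so an interruption at any earlier stage leaves it bootable.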
Evidence / Acceptance Criteria (testable and field-readable)
Rollback attempt tests
- Install an older signed build with a lower `rollback_index` → must be rejected.
- Modify manifest fields (version/slot policy) without re-signing → must fail signature verification.
- Replay the same package → allow/deny is policy-specific, but must be logged deterministically.
Required outputs: an explicit error code (e.g., `ERR_ROLLBACK_DENIED`) and a log record including `incoming_index`, `stored_index`, and `key_id`.
Atomicity & power-cut injection tests
- Inject power loss during Download, Verify, Install, Switch, and Commit.
- After reboot, device must return to a bootable state (either last known-good slot or candidate slot).
- Never allow a state where both slots become non-bootable.
This chapter defines the software recovery logic; hardware hold-up details should be treated as a separate topic and kept out of scope here.
Stream Authenticity: Signing, Watermarking, and Anti-Replay
Vision pipelines need device-side evidence that frames are genuine: integrity (not modified), freshness (not replayed), and provenance (from the claimed device/session). This chapter focuses on deterministic verifier outputs without diving into codec internals or any cloud architecture.
1) Route A — frame/chunk signing + hash chaining
- Granularity options: sign every frame (strongest, highest overhead) or sign every N frames/chunk (practical).
- Hash chain: link frames by hashing current frame metadata + content hash + previous chain hash.
- Detects: frame insertion/deletion/modification via chain break or signature mismatch.
Freshness is mandatory: without seq/nonce/timestamp, an attacker can replay old but intact signed frames.
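A minimal hash-chain sketch for Route A, assuming SHA-256 and per-frame metadata; the periodic signature over the chain head is omitted for brevity:

```python
import hashlib

GENESIS = b"\x00" * 32  # illustrative chain start value

def chain_hash(prev_hash: bytes, meta: bytes, content: bytes) -> bytes:
    """Link a frame into the chain: H(prev || meta || H(content))."""
    h = hashlib.sha256()
    h.update(prev_hash)
    h.update(meta)                                  # seq/timestamp/etc.
    h.update(hashlib.sha256(content).digest())      # content hash
    return h.digest()

def verify_chain(frames, genesis=GENESIS) -> int:
    """frames: list of (meta, content, recorded_chain_hash).
    Return the index of the first chain break, or -1 if intact."""
    prev = genesis
    for i, (meta, content, recorded) in enumerate(frames):
        expected = chain_hash(prev, meta, content)
        if expected != recorded:
            return i   # insertion, deletion, or modification detected here
        prev = expected
    return -1
```

Any insertion, deletion, or edit changes the recomputed chain from that point on, which is why the verifier can localize the first tampered position.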
2) Route B — watermarking (provenance and tamper evidence)
- Robust watermark: survives common transformations; supports provenance and post-hoc tracing.
- Fragile watermark: breaks under edits; provides a tamper-evidence signal.
- Engineering role: watermark complements signing (deterministic integrity) rather than replacing it.
Output must be machine-readable: `wm_detected`, `wm_score`, and a failure reason for audit and field diagnosis.
3) Anti-replay (freshness rules)
- Sequence counter (seq): monotonic increasing; verifier flags reuse, rollback, or abnormal jumps.
- Nonce binding: session challenge binds frames to a specific session to prevent offline recording + later injection.
- Timestamp: supports time-window checks; should be treated as an input to freshness logic, not a stand-alone trust source.
A practical verifier emits separate flags for sequence anomalies and chain breaks to distinguish packet loss from tampering.
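A sketch of such a freshness classifier; the `max_gap` threshold is an illustrative policy parameter that a real deployment would tune against its loss model:

```python
def classify_seq(prev_seq: int, seq: int, max_gap: int = 100) -> str:
    """Distinguish replay/rollback from plausible packet loss.

    Returns one flag per anomaly class so downstream policy can treat
    'seq_gap' (maybe loss) differently from 'seq_rollback' (tampering).
    """
    if seq == prev_seq:
        return "seq_reused"      # exact replay of a seen counter value
    if seq < prev_seq:
        return "seq_rollback"    # counter went backwards
    if seq - prev_seq > max_gap:
        return "seq_gap"         # large unexplained jump; policy decides
    return "ok"
```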
4) Key boundary (identity vs session keys)
- Device identity / attestation key (long-term): proves device identity; should not sign every frame.
- Session key (short-term): used for frame/chunk signing or MAC; derived per boot/session and rotated.
- Traceability fields: verifier records `key_id` / `cert_id` for forensic linkage.
Keeping long-term keys isolated and using session keys for high-volume operations reduces blast radius and improves rotation agility.
Evidence / Acceptance Criteria (deterministic checks)
Replay and sequence tests
- Replay old frames/segments → verifier flags `seq_reused` or `freshness_fail`.
- Sequence rollback (decreasing seq) → verifier flags `seq_rollback`.
- Large unexplained gaps → verifier flags `seq_gap` (distinguish from packet-loss policies).
Tamper and chain integrity tests
- Insert or delete frames → verifier flags `chain_break` and/or `sig_fail`.
- Modify frame content → verifier flags `hash_mismatch` or `sig_fail`.
- Watermark edit/removal → verifier reports `wm_detected=0` or a low `wm_score`.
Recommended verifier outputs: `auth_ok`, `seq_anomaly`, `chain_break`, `wm_detected`, `wm_score`, plus a compact reason code.
Secure Debug, JTAG Lock, and RMA / Service Unlock
Debug ports are one of the most frequent “self-inflicted” breakpoints: a device can be cryptographically sound yet fully compromised if JTAG/SWD/UART access is uncontrolled. A robust design must enforce fail-closed defaults, allow only short-lived and least-privilege service sessions, and generate audit-grade evidence for every unlock attempt.
1) Debug port policy as explicit states
Treat debug as a state machine with a strict allowlist. Avoid “blacklist” thinking.
- LOCKED (default): JTAG/SWD disabled or restricted to minimal identification only; no memory read/write.
- LIMITED_DIAG: read-only diagnostics (health counters, temperatures, boot/update reason codes, compact log summaries).
- SERVICE_SESSION: time-bounded session with a command allowlist; still forbids exporting secrets or dumping sensitive regions.
- DENIED: triggered after repeated failures or risk signals; blocks access and increments an immutable failure counter.
This state model keeps the “what is allowed” surface mechanically checkable and prevents accidental escalation in service tooling.
2) Controlled unlock (challenge-response + short TTL)
Unlock should be bound to device, time window, and privilege level.
- Preconditions: device must enter a service posture (e.g., physical action, service pin, or explicit service mode).
- Challenge: device outputs a nonce + device identity + monotonic counter + policy version.
- Response: service tool returns a signed authorization token that encodes level and TTL.
- Enforcement: device verifies token, opens only the approved allowlist, and auto-relocks at TTL expiration.
Freshness is mandatory: include a nonce or monotonic counter to prevent replay of a previously valid unlock token.
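A sketch of device-side token verification. HMAC stands in for the service signer's public-key signature, and the field names are assumptions chosen to mirror the steps above:

```python
import hashlib
import hmac
import time

def verify_unlock_token(token: dict, expected_nonce: bytes,
                        service_key: bytes, now=None):
    """Check signature, freshness (nonce), and TTL, in that order.

    token = {"nonce", "device_id", "level", "expires", "mac"} (illustrative).
    """
    now = time.time() if now is None else now
    payload = b"|".join([token["nonce"],
                         token["device_id"].encode(),
                         token["level"].encode(),
                         str(token["expires"]).encode()])
    mac = hmac.new(service_key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, token["mac"]):
        return False, "ERR_BAD_SIGNATURE"
    if not hmac.compare_digest(token["nonce"], expected_nonce):
        return False, "ERR_STALE_NONCE"      # replay of an old token
    if now >= token["expires"]:
        return False, "ERR_TOKEN_EXPIRED"    # TTL enforcement / auto-relock
    return True, "OK"
```

Because the nonce is bound into the signed payload, a previously valid token cannot be replayed against a new challenge, and TTL expiry maps directly onto the auto-relock requirement.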
3) RMA / service legitimacy (device-side evidence)
- Legitimate service proof: authorization token must be verifiable by the device (trusted service signer identity).
- Context binding: token binds to device ID and may require a physical service posture to reduce remote abuse.
- Minimal exposure: allow diagnostics needed for triage, not a full debug environment.
The design goal is “repair without backdoor”: a service session can observe health and run bounded tests while secrets remain non-exportable.
4) Irreversible vs reversible locks (engineering tradeoff)
- Irreversible (eFuse-style): strongest for disabling high-risk capabilities; cannot be undone if mistakenly fused.
- Reversible (policy-controlled): supports maintainability; must be paired with TTL, allowlist, and immutable audit.
- Hybrid approach: permanently disable secret export paths while keeping a constrained diagnostic gate reversible.
A practical rule: irreversible locking is suited for “never acceptable” capabilities (secret read-out), while reversible sessions cover bounded diagnostics.
Evidence / Acceptance Criteria (must pass negative tests)
Audit-grade evidence
- Every unlock attempt increments `unlock_counter` and records `level` + `reason_code`.
- Repeated failures increment `fail_counter` and may trigger `DENIED`.
- Evidence persists across reboot and is not silently erasable from normal runtime.
Minimum viable auditing: monotonic counters + last-reason fields. Stronger designs add log digests anchored to a trusted boundary.
Negative tests (unauthorized and over-privileged)
- Unauthorized JTAG/SWD access must fail deterministically (explicit error + counter increment).
- Authorized session must still block secret read-out and sensitive dumps (attempts must fail and be logged).
- TTL expiration must auto-relock and reject further commands without new authorization.
A secure design proves both: “no access without authorization” and “authorization never grants excessive capability.”
Physical Anti-Tamper: Detection, Response, and Evidence
Physical anti-tamper is most effective when treated as an on-device control system: signals are filtered and scored, a policy selects a graded response, and the device emits auditable evidence. This chapter focuses on device-side engineering hooks rather than materials or enclosure craft.
1) Typical physical attacks (as observable categories)
- Enclosure intrusion: case open, probing access, connector manipulation.
- Component replacement: external storage swapped or replayed from an older image.
- Fault injection indicators: abnormal voltage/clock/reset patterns intended to bypass checks.
The objective is not to describe attack tooling, but to define which observable signals can be used to trigger policy decisions.
2) Detection signals (choose by reliability, cost, false positives)
- Case open / switch: simple and low cost; bypassable but valuable as a policy input.
- Mesh / intrusion loop: stronger intrusion indicator; must handle contact faults and manufacturing variance.
- Light sensor: effective for enclosure open events; requires false-positive evaluation in real environments.
- Glitch / brownout abuse detect: monitors suspicious voltage patterns and repeated undervoltage edges.
- Clock / reset anomaly detect: flags abnormal frequency or reset patterns that correlate with bypass attempts.
A practical implementation uses debouncing + time windows + counters to avoid one-shot false alarms driving irreversible actions.
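The debounce-and-score idea can be sketched as a sliding-window accumulator. Signal names, weights, window size, and thresholds below are illustrative and would need per-product tuning:

```python
class TamperScorer:
    """Score-based escalation sketch: individual samples only add weight to
    a windowed score, so one-shot glitches cannot by themselves drive an
    irreversible response."""
    WEIGHTS = {"case_open": 40, "light": 20, "brownout": 10, "reset_abuse": 15}

    def __init__(self, window: int = 10):
        self.window = window            # sliding time window (ticks)
        self.events = []                # list of (tick, weight)

    def observe(self, tick: int, signal: str) -> str:
        self.events.append((tick, self.WEIGHTS.get(signal, 5)))
        # Drop events that fell out of the window (debouncing over time).
        self.events = [(t, w) for t, w in self.events if tick - t < self.window]
        score = sum(w for _, w in self.events)
        if score >= 80:
            return "LOCK"
        if score >= 40:
            return "LIMITED_MODE"
        return "LOG_ONLY"
```

This structure also makes responses explainable: the retained `(tick, weight)` events are exactly the evidence of which signals crossed which threshold.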
3) Response policy (graded and auditable)
- LOG_ONLY: record the event and increase the risk score.
- LIMITED_MODE: degrade functionality (e.g., disable privileged interfaces, reduce features, enable read-only operation).
- LOCK: block sensitive operations until a legitimate service unlock path is used.
- ERASE_KEYS: highest-severity action; should require high confidence and must be backed by strong evidence to avoid self-destruction.
The response must be explainable: which signal(s) triggered it, what thresholds were crossed, and what action was taken.
4) Evidence and recoverability
- Event record: `tamper_reason_code`, `event_counter`, `policy_action`, optional timestamp.
- Persistence: evidence survives reboot and cannot be silently cleared during normal operation.
- Recovery path: limited/locked modes should have a defined service unlock path; irreversible actions must be rare and justified.
“Auditable + recoverable” means incident forensics is possible while still enabling legitimate repair workflows.
Evidence / Acceptance Criteria (injection + false-positive control)
Tamper event injection tests
- Case open event → device logs reason and enters the intended policy state.
- Reset abuse pattern → device increments counters and escalates only after threshold/time window conditions.
- Undervoltage oscillation → device flags brownout abuse and selects a graded response (log/limited/lock).
Required outputs: `tamper_event_logged`, `reason_code`, `policy_action_taken`, `event_counter`.
False-positive evaluation method
- Environment sweep: measure normal ranges for light, temperature, vibration, and supply variation.
- Threshold tuning: apply debouncing, windows, and counters; avoid single-sample triggers.
- Rate tracking: quantify false triggers (events/hour) and constrain irreversible actions to high-confidence conditions.
High-false-positive signals must never directly trigger irreversible outcomes; they should contribute to a score-based policy.
Security Logging & Forensics on Device
Security becomes operational only when incidents produce a verifiable evidence trail. An on-device audit log should be append-only, tamper-evident, and replay-resistant under power cycles, while remaining low-cost in storage and compute. This chapter defines a practical event taxonomy, a minimal record schema, and integrity protections that can be validated with negative tests.
1) Event taxonomy (what must be recorded)
Focus on security-critical transitions and denied actions, not verbose debug strings.
- Boot & trust: boot measurement, verified-boot failure, unexpected digest change.
- Update: update verification failure, install failure, commit failure, rollback attempt (rejected).
- Debug: unlock attempt, unlock granted (level/TTL), unauthorized probe blocked.
- Tamper: case open, glitch/brownout abuse flagged, policy escalation taken, key erase event (if any).
- Stream authenticity: sequence gap, hash-chain break, watermark verify fail (flag).
Each event should carry `event_type`, `reason_code`, `severity`, and a compact context digest (policy version, firmware version, slot, session level).
2) Minimal record schema (structure over strings)
A log record is a structured object designed for verification and ordering.
- `seq` (mandatory): monotonically increasing sequence number (primary ordering anchor).
- `t_ref` (optional): local time reference (RTC time, boot ticks, or a time-quality flag).
- `event_type` + `reason_code`: enumerated values (mechanically checkable).
- `context_digest`: hash of key context fields (fw_version, policy_version, slot_id, level).
- `prev_hash` + `self_hash`: hash-chain pointers for tamper evidence.
The design goal is deterministic parsing and verification. Keep records short, stable, and versioned via a `schema_version` field.
3) Ordering and time base (without external time systems)
- Sequence-first ordering: `seq` is the canonical timeline; it must never silently decrease.
- Time is auxiliary: RTC can be helpful but should not be the trust anchor; record a time-quality flag if available.
- Cross-reboot continuity: store `seq` in a protected state so the next boot continues forward or can detect rollback.
Forensics typically needs correct ordering and provenance more than absolute wall-clock accuracy.
4) Integrity protection (hash chain + signed checkpoints)
- Hash chain: each record includes `prev_hash`; deletion or modification breaks the chain.
- Signed checkpoint (every N records): create a compact checkpoint that signs the current head hash + `seq` + policy version.
- Append-only policy: normal runtime cannot rewrite history; truncation must be detectable and handled explicitly.
Checkpoints reduce verification cost and provide a trusted “anchor” across power loss, while the chain provides fine-grained tamper evidence.
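A compact sketch of the chain-plus-checkpoint design. HMAC stands in for the real checkpoint signature, and the JSON record layout is illustrative:

```python
import hashlib
import hmac
import json

class AuditLog:
    """Append-only log sketch: hash chain per record, plus a signed
    'checkpoint' over the head hash every N records."""
    def __init__(self, key: bytes, checkpoint_every: int = 4):
        self.key, self.n = key, checkpoint_every
        self.records = []          # (body, prev_hash, self_hash)
        self.checkpoints = []      # (seq, head_hash, signature)
        self.head = b"\x00" * 32   # genesis value

    def append(self, event: dict) -> None:
        seq = len(self.records)
        body = json.dumps({"seq": seq, **event}, sort_keys=True).encode()
        self_hash = hashlib.sha256(self.head + body).digest()
        self.records.append((body, self.head, self_hash))
        self.head = self_hash
        if (seq + 1) % self.n == 0:
            # Checkpoint signs head hash + seq (HMAC as stand-in signature).
            sig = hmac.new(self.key, self.head + str(seq).encode(),
                           hashlib.sha256).digest()
            self.checkpoints.append((seq, self.head, sig))

    def verify(self) -> int:
        """Return index of first broken record, or -1 if the chain is intact."""
        prev = b"\x00" * 32
        for i, (body, rec_prev, self_hash) in enumerate(self.records):
            if rec_prev != prev or hashlib.sha256(prev + body).digest() != self_hash:
                return i
            prev = self_hash
        return -1
```

A verifier only needs to walk from the last valid checkpoint forward, which is the cost reduction the text describes, while the per-record chain still localizes the first tampered entry.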
5) Capacity and cost controls (low-cost, still useful)
- Ring buffer for detail: keep the last N detailed records; older info can be summarized.
- Critical vs informational: critical events should be retained longer; informational events can rotate faster.
- Compression by aggregation: repeated identical failures can increment counters instead of emitting verbose duplicates.
Avoid deep storage algorithms here: the key requirement is predictable retention plus integrity, not high-throughput logging.
Evidence / Acceptance criteria (negative-test driven)
- Integrity test: deleting or modifying a record must be detected (hash-chain break or checkpoint signature fail).
- Monotonicity test: power loss and reboot must not silently decrease `seq`.
- Truncation test: mid-write power loss must be recovered by rolling back to the last valid checkpoint/record boundary.
- Replay test: replacing storage with an older log image must be detected via `seq` / checkpoint mismatch.
Validation & Pen-Test Checklist
A security design is only as strong as its repeatable tests. This checklist turns core domains (boot, keys, update, stream, debug, tamper, logs) into an engineering SOP: each domain defines the threat target, the minimal test, the expected outcome, and the evidence that must be emitted. Prioritize tests that are repeatable and automation-friendly.
1) Checklist template (use the same 4 lines everywhere)
- Threat target: the specific action to prevent or detect.
- Minimal test: the smallest tool + steps to reproduce.
- Expected: block/detect behavior (fail-closed or flagged).
- Evidence: required reason codes, counters, and integrity signals (e.g., log chain intact).
This format avoids re-explaining mechanisms and keeps the checklist executable by firmware and validation teams.
2) Automation split (CI vs bench)
- CI-friendly: signature tamper, downgrade package, replay snippets, delete/modify log entries, policy regression tests.
- Bench-friendly: power loss during update stages, case-open/tamper trigger, reset/brownout abuse patterns (trigger category only).
Keep “attack how-to” out of scope. The goal is verifiable outcomes and emitted evidence, not tooling instructions.
Pass/Fail Matrix (template)
Use a matrix to ensure each domain is covered by representative negative tests. Cells should encode outcomes such as BLOCK + LOG or DETECT + LOG, not just a boolean.
| Attack action | Boot | Keys | Update | Stream | Debug | Tamper | Logs |
|---|---|---|---|---|---|---|---|
| Modify (tamper) | BLOCK + LOG | BLOCK + LOG | BLOCK + LOG | DETECT + LOG | BLOCK + LOG | DETECT + LOG | DETECT + LOG |
| Replay | DETECT + LOG | DETECT + LOG | BLOCK + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG |
| Downgrade | DETECT + LOG | DETECT + LOG | BLOCK + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG |
| Unauthorized access | BLOCK + LOG | BLOCK + LOG | BLOCK + LOG | DETECT + LOG | BLOCK + LOG | DETECT + LOG | DETECT + LOG |
| Power / reset abuse | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG |
| Delete / truncate evidence | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG | DETECT + LOG |
Evidence gating: each “+ LOG” outcome must map to an auditable record with seq, reason_code, and an integrity-valid chain/checkpoint.
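The evidence-gating rule can be enforced mechanically: a matrix cell marked "+ LOG" only passes if the device actually emitted auditable records. A minimal sketch of such a gate (the record shape follows this page's `seq`/`reason_code` convention; the function name is illustrative):

```python
def gate(outcome: str, records: list) -> bool:
    # Every "+ LOG" cell must map to at least one auditable record
    # carrying a strictly increasing seq and a reason_code.
    if not outcome.endswith("+ LOG"):
        return True
    if not records:
        return False
    seqs = [r.get("seq") for r in records]
    return (all(isinstance(s, int) for s in seqs)
            and seqs == sorted(set(seqs))          # strictly monotonic
            and all(r.get("reason_code") for r in records))

assert gate("BLOCK + LOG", [{"seq": 0, "reason_code": "sig_fail"}])
assert not gate("DETECT + LOG", [])              # outcome claimed, no evidence
assert not gate("DETECT + LOG", [{"seq": 0}])    # record missing reason_code
```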
IC / Module Selection Guide (with example MPNs)
This section translates the threat model and test matrix (H2-1 to H2-10) into BOM decisions. The goal is not "one perfect part," but a repeatable way to select device families by the parameters that determine coverage of Boot → Keys → Update → Stream → Debug → Tamper → Logs. The listed part numbers are concrete examples; final selection must be validated against current datasheets, supply, and certifications.
A) Start from “coverage requirements” (not from catalogs)
- Boot (H2-2/3): signature verification must be anchored to a non-bypassable RoT; fail-closed recovery behavior must be testable.
- Keys/Identity (H2-4): keys must be non-exportable; RNG health status must be observable; device identity must be attestable.
- Update/Rollback (H2-5): monotonic version/counter must be enforced; A/B slot state must be auditable.
- Stream authenticity (H2-6): signing/watermark flags must be verifiable; replay/tamper should produce detectable evidence.
- Debug/RMA (H2-7): controlled unlock (time-bound, role-based) + audit logging is mandatory.
- Tamper (H2-8): tamper events must drive a policy action (lock/limit/erase) and produce evidence.
- Logs/Forensics (H2-9): append-only log integrity (hash chain + checkpoint) must be testable under power cycles and truncation.
B) Practical selection rules (procurement + engineering)
- Prefer parts with test hooks: explicit status codes, counters, and “deny paths” that can be verified (H2-10).
- Prefer non-exportable key APIs: `sign/verify`, `wrap/unwrap`, `derive` instead of "read key."
- Time budget matters: verify time per stage (H2-3) and checkpoint signing (H2-9) need measurable performance.
- Lifecycle matters: provisioning, RMA unlock, revoke/rotate keys without replacing the whole product line.
- Second-source strategy: define “non-negotiable” parameters, then qualify multiple candidates via the same H2-10 checklist.
Tip: treat the H2-10 matrix as the acceptance gate; if a candidate cannot pass/emit evidence, it is not a safe substitution.
C1) Secure Element / TPM / HSM class (Root of Trust anchor)
Key parameters: interface, object capacity, attestation support, non-exportability, lifecycle, assurance level.
- Interfaces: I²C/SPI (signal integrity, EMI robustness, driver complexity).
- Key/object capacity: number of key slots, certificate objects, monotonic counters (if supported).
- Attestation: ability to emit verifiable identity proof (device cert chain / attestation statement).
- Assurance: published countermeasure level / certifications (where required by the project).
- Lifecycle: provisioning modes, lock states, RMA handling, secure delete/retire flows.
Example MPNs (Secure Element / Trust anchor):
- Microchip `ATECC608B` (CryptoAuthentication secure element family)
- NXP EdgeLock `SE050` (secure element family; e.g., `SE050C2` variants)
- Infineon OPTIGA Trust M family (e.g., `SLS32AIA010MS2` variants)
- STMicroelectronics `STSAFE-A110` / `STSAFE-A120` families
Example MPNs (Discrete TPM 2.0 options):
- Infineon OPTIGA TPM family: `SLB9670` / `SLB9672` (common discrete TPM families)
- Nuvoton discrete TPM family: `NPCT75x` (example family used in TPM applications)
Coverage focus: Boot → Keys → Update → Logs (anchors H2-2/3/4/5/9 and provides evidence hooks for H2-10).
C2) TRNG / Crypto acceleration (latency + evidence-friendly)
Key parameters: entropy health visibility, throughput, early-boot usability, power/latency budget impact.
- TRNG health hooks: “health test OK/FAIL” status and counters that can be logged (H2-4, H2-9).
- Acceleration scope: hash, signature verify, MAC/auth (impacts H2-3 verify budget and H2-9 checkpoints).
- Early boot availability: usable before full OS bring-up (important for verified boot chains).
- Power/latency: measurable “verify time per stage” impact.
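The "verify time per stage" budget is easy to baseline before committing to a part. This sketch times only the hashing portion of a stage verify (the signature check on a secure element adds its own, datasheet-dependent latency on top); the sizes and function name are illustrative:

```python
import hashlib, time

def verify_budget_ms(image: bytes, repeats: int = 20) -> float:
    # Average the hash cost over several runs to smooth scheduler noise.
    t0 = time.perf_counter()
    for _ in range(repeats):
        hashlib.sha256(image).digest()
    return (time.perf_counter() - t0) / repeats * 1000.0

stage_image = bytes(2 * 1024 * 1024)  # stand-in for a 2 MiB boot stage
ms = verify_budget_ms(stage_image)
assert ms > 0.0
```

Running the same measurement on the target CPU (and against the candidate SE's signature-verify latency) gives a defensible per-stage boot-time figure for H2-3.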
Example MPNs (crypto/TRNG building blocks):
- Microchip crypto co-processor family (secure key + ECC): `ATECC608B` (also used as a crypto accelerator)
- NXP secure element families (crypto + key store): `SE050` (as above)
- Infineon OPTIGA Trust families: `SLS32AIA010MS2` (as above)
Note: many vision SoCs include internal crypto/TRNG. External accelerators are typically chosen when a stronger key boundary, attestable identity, or better auditability is required.
C3) Tamper controller & sensors (event → policy → evidence)
Key parameters: input channels, response latency, false-positive control, audit hooks, power-domain behavior.
- Inputs: number of tamper inputs; support for case-open, light, mesh, voltage/reset abuse flags.
- Response latency: trigger-to-action timing (lock/limit/erase policy behavior must be defined in H2-8).
- False positives: thresholding, debounce, voting, environment compensation (industrial deployments).
- Evidence hooks: counters/log triggers that can feed H2-9 append-only logs.
Example MPNs (sensor building blocks used in tamper designs):
- Ambient light sensor (case-open / light tamper): `VEML7700` (Vishay)
- Hall switch (magnet / cover detect): `US5881` (Melexis family) / `A3213` (Allegro family)
- Voltage supervisor / window monitor (brownout abuse detect building block): `TPS3703` (TI) / `LTC2962` (Analog Devices)
- MCU companion with tamper pins (as "tamper controller" role in some architectures): examples include secure-MCU families such as `LPC55S69` (NXP) / `STM32U5` (ST) used as security companions in embedded systems
Coverage focus: Tamper → Logs → Debug (tamper events must produce auditable evidence, not just a GPIO interrupt).
C4) Secure storage protection (security attributes only)
Key parameters: integrity/authentication, replay resistance, secure partitions (RPMB/OTP), auditability.
- Data-at-rest: encryption capability and device-bound keying (prevents “chip-off read”).
- Integrity: authenticated reads/writes (MAC/tag) for critical metadata (logs, counters, policies).
- Anti-rollback: monotonic counters / protected version storage (supports H2-5 and H2-9).
- Secure partitions: RPMB/OTP/secure area for audit anchors (e.g., checkpoint head hash, counters).
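The anti-rollback rule above reduces to a single comparison against the anchored counter, with a mandatory evidence record on rejection. A minimal sketch under this page's conventions (function name and event dictionary are illustrative; the anchored counter would live in the RoT/SE/RPMB, not in a Python variable):

```python
def check_version(anchored_version: int, candidate_version: int):
    # Anti-rollback: accept only if the candidate is not older than the
    # monotonic counter anchored in protected storage; on reject, emit an
    # auditable rollback_attempt event (H2-5 + H2-9).
    if candidate_version < anchored_version:
        return False, {"event_type": "rollback_attempt",
                       "reason_code": "version_lt_anchor"}
    return True, {"event_type": "update_accept", "reason_code": "none"}

ok, evt = check_version(anchored_version=7, candidate_version=5)
assert not ok and evt["event_type"] == "rollback_attempt"
ok, evt = check_version(anchored_version=7, candidate_version=8)
assert ok
```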
Example MPNs / families (storage primitives commonly used):
- eMMC/UFS with RPMB support (family feature; choose an RPMB-capable MPN from the storage vendor shortlist)
- Serial NOR families with OTP/unique-ID features (family feature; qualify with H2-10 “replay/rollback” tests)
- Secure element used as a secure counter/anchor (device-bound): `SE050` / `ATECC608B` / OPTIGA Trust M family
This page intentionally avoids wear-leveling details; selection is based on integrity, replay resistance, and audit anchoring capability.
C5) Secure debug & service unlock (challenge-response + audit)
Key parameters: authentication method, time-bound unlock, privilege granularity, audit, revoke/rotate.
- Challenge-response: short-lived tokens; rate limit + deny evidence.
- Certificate-based auth: role separation (factory vs service vs field); revocation-ready design.
- Privilege granularity: limited diag vs full attach; “key never readable” must be enforced.
- Audit: every unlock attempt and grant must emit a reason code and be append-only logged (H2-9).
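A time-bound challenge-response unlock can be sketched with an HMAC token; this is a simplified stand-in (a real design would keep `SERVICE_KEY` inside the secure element and likely use certificate-based asymmetric auth, and every attempt would feed the H2-9 log):

```python
import hmac, hashlib, os, time

SERVICE_KEY = os.urandom(32)  # stand-in; held non-exportable in the SE in practice

def issue_challenge() -> bytes:
    # Device side: a fresh random challenge per unlock attempt.
    return os.urandom(16)

def respond(challenge: bytes, expires_at: float) -> bytes:
    # Service side: the token binds challenge + expiry, so it is single-use
    # and time-bound.
    msg = challenge + str(expires_at).encode()
    return hmac.new(SERVICE_KEY, msg, hashlib.sha256).digest()

def unlock(challenge: bytes, expires_at: float, token: bytes, now: float) -> bool:
    if now > expires_at:
        return False  # expired tokens are denied (and would be logged)
    expected = hmac.new(SERVICE_KEY, challenge + str(expires_at).encode(),
                        hashlib.sha256).digest()
    return hmac.compare_digest(expected, token)

c = issue_challenge()
exp = time.time() + 300
tok = respond(c, exp)
assert unlock(c, exp, tok, now=time.time())
assert not unlock(c, exp, tok, now=exp + 1)                      # time-bound
assert not unlock(issue_challenge(), exp, tok, now=time.time())  # wrong challenge
```

Privilege granularity (limited diag vs full attach) sits on top of this gate: even a valid token should unlock only the role it encodes.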
Example MPNs (typical building blocks used as debug-auth roots):
- Secure element used for service authorization tokens: `ATECC608B`, `SE050`, `STSAFE-A110`
- Discrete TPM (when debug auth is tied to TPM-based identity): `SLB9670` / `SLB9672` families
D) Evidence / acceptance: map specs → H2-10 tests
Use the same acceptance gates for every candidate and every substitute.
| Domain (H2-10) | Non-negotiable capability | What to verify (minimal negative test) | Evidence required (H2-9) | Typical device family | Example MPNs |
|---|---|---|---|---|---|
| Boot | Non-bypassable RoT, fail-closed | Corrupt signature → must refuse/enter safe state | `event_type=boot_fail`, reason_code, monotonic seq | SE/TPM/HSM or SoC RoT + anchor | SE050, ATECC608B, SLB9672 |
| Keys | Non-exportable keys, RNG health visibility | Attempt key readout path → must be impossible; RNG health flag visible | `event_type=key_access_deny` or health status log | Secure element / TPM | STSAFE-A110, SLS32AIA010MS2, ATECC608B |
| Update | Anti-rollback enforcement | Install older signed image → must reject | `event_type=rollback_attempt`, reason code, seq monotonic | RoT + secure counter anchor | SE050, SLB9670, ATECC608B |
| Debug | Controlled unlock + least privilege | Unauthorized attach → must fail; authorized unlock → limited scope | `event_type=debug_unlock` (attempt/grant), audit chain intact | SE/TPM + policy engine | STSAFE-A110, ATECC608B, SLB9672 |
| Tamper | Event → policy action + evidence | Case-open/light/brownout abuse trigger → must log + enforce policy | `event_type=tamper`, reason codes, checkpointed log | Sensors + tamper ctrl/secure MCU | VEML7700, TPS3703, LPC55S69 |
| Logs | Append-only, tamper-evident | Delete/modify record → detect chain break; truncation recoverable | Hash-chain break flag, checkpoint signature verify fail | Secure storage + checkpoint signer | SE050, ATECC608B, RPMB-capable storage |
Substitution rule: a “drop-in alternative” must match the non-negotiable capabilities and pass the same negative tests while emitting the same evidence.
FAQs — Security & Anti-Tamper (evidence-first)
Each answer stays inside the device-side evidence chain: what to check first, what proves the root cause, and what fix boundary applies. No cloud backend assumptions.
Q1: Secure boot enabled but malware still runs — verified boot or only measured boot?
If the device records hashes but still boots modified code, it is likely measured boot, not verified boot. First check whether signature failures force a fail-closed state, and whether verification is enforced before executing each stage. The discriminator is a deliberate image tamper: verified boot must refuse/enter recovery, while measured boot continues and only logs digests.
Q2: Rollback protection fails after board swap — where should the monotonic counter live?
Rollback fails after a swap when the monotonic version is stored in a replaceable component (external flash or removable storage). First check where the counter is anchored (RoT/SE/TPM vs filesystem metadata), and whether the device identity is bound to that counter. The discriminator is a swap test: a secure anchor keeps the counter from decreasing across hardware changes and logs the attempt.
Q3: Firmware update sometimes bricks devices — A/B slot or commit policy issue?
Intermittent bricking is often a state-machine or commit policy bug, not just a signature problem. First check the update state transitions (download→verify→install→switch→commit), and whether power-loss injection is covered at each edge. The discriminator is where it dies: failures before commit should roll back to the last known-good slot; post-commit failures indicate the commit criteria are too weak.
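The download→verify→install→switch→commit path is naturally tested as an explicit state machine with power-loss injected at each edge. A minimal sketch (state and event names are illustrative, not a specific updater's API):

```python
# Legal forward transitions along the update path; power loss before
# commit must fall back to the last known-good slot.
VALID = {
    "idle": {"download"},
    "download": {"verify"},
    "verify": {"install"},     # entered only after the signature check passes
    "install": {"switch"},
    "switch": {"commit"},
    "commit": set(),
}

def next_state(state: str, event: str) -> str:
    if event == "power_loss":
        # Anything not yet committed rolls back to the known-good slot.
        return "idle" if state != "commit" else "commit"
    if event in VALID.get(state, set()):
        return event
    raise ValueError(f"illegal transition {state} -> {event}")

s = "idle"
for e in ("download", "verify", "install"):
    s = next_state(s, e)
assert next_state(s, "power_loss") == "idle"   # pre-commit loss → rollback
```

Enumerating power loss at every state in CI is exactly the "where it dies" discriminator: only post-commit survival should keep the new slot.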
Q4: Can attackers replay recorded video streams and still pass verification?
Replay can pass if authenticity checks lack freshness (nonce/sequence/timestamp) or if the verifier does not enforce continuity. First check that each frame/chunk is bound to a monotonic sequence and that verification rejects repeats. The discriminator is a controlled replay: re-inject an old segment and confirm the verifier flags duplicated sequence IDs or a broken hash chain, then logs the exact failure point.
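The freshness half of this check is a one-variable verifier: track the highest accepted sequence number and reject anything at or below it. A minimal sketch (the closure-based helper is illustrative):

```python
def make_verifier():
    last_seq = -1
    def accept(chunk_seq: int) -> bool:
        nonlocal last_seq
        # Freshness: repeated or out-of-order (replayed) chunks are rejected.
        if chunk_seq <= last_seq:
            return False
        last_seq = chunk_seq
        return True
    return accept

accept = make_verifier()
assert all(accept(s) for s in (0, 1, 2))
assert not accept(1)   # re-injected old segment → flagged
assert accept(3)
```

The signature/hash-chain check then ensures the sequence number itself cannot be forged; freshness and authenticity must both hold.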
Q5: How to sign video without killing bandwidth or latency?
Don’t sign everything at maximum granularity by default. First choose the coarsest signing unit that still detects insert/delete/reorder (e.g., per GOP or per N frames) and include a continuity mechanism (hash chain + sequence). Then profile verification time per stage and signer throughput under worst-case FPS. The discriminator is whether an attacker can splice frames without breaking the chain at the chosen granularity.
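Coarse-grained signing with chained digests can be sketched as follows. HMAC stands in for the real signer here (a production design would sign inside the SE, typically with an asymmetric key); the chunk layout is illustrative:

```python
import hashlib, hmac

KEY = b"demo-signing-key"  # stand-in; real signing key stays in the SE

def sign_chunk(frames: list, prev_digest: bytes, seq: int) -> dict:
    # Coarse unit: one signature per N frames; chaining prev_digest makes
    # insert/delete/reorder between chunks break verification.
    body = hashlib.sha256(prev_digest + b"".join(frames)).digest()
    tag = hmac.new(KEY, seq.to_bytes(4, "big") + body, hashlib.sha256).digest()
    return {"seq": seq, "digest": body, "tag": tag}

def verify_chunk(frames, prev_digest, chunk) -> bool:
    body = hashlib.sha256(prev_digest + b"".join(frames)).digest()
    tag = hmac.new(KEY, chunk["seq"].to_bytes(4, "big") + body,
                   hashlib.sha256).digest()
    return hmac.compare_digest(tag, chunk["tag"])

frames = [b"f0", b"f1", b"f2", b"f3"]
c0 = sign_chunk(frames[:2], b"\x00" * 32, seq=0)
c1 = sign_chunk(frames[2:], c0["digest"], seq=1)
assert verify_chunk(frames[:2], b"\x00" * 32, c0)
assert not verify_chunk([b"f2", b"fX"], c0["digest"], c1)  # spliced frame
```

The cost scales with chunk count, not frame count, so the signing unit (per GOP, per N frames) is the main bandwidth/latency knob.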
Q6: Watermark detected inconsistently — capture pipeline issue or key/session mismatch?
Inconsistent watermark detection is commonly caused by session/key mismatch or inconsistent processing paths, not the watermark itself. First correlate watermark pass/fail with session identifiers and key-rotation events, and confirm the watermark insertion and detection points are fixed across modes. The discriminator is repeatability: if failures align with session changes or pipeline branches, the issue is key/session binding; if they align with exposure/ISP modes, it is pipeline sensitivity.
Q7: JTAG is disabled but secrets still leak — what about debug UART or logs?
Disabling JTAG closes one door, but leaks often come from debug UART, crash dumps, verbose logs, or unprotected diagnostics. First enumerate all debug surfaces and verify they default to deny without authorization. Then check that logs never include key material, raw memory dumps, or sensitive buffers. The discriminator is an unauthorized access attempt: it must fail with an auditable reason code; “silent success” indicates an uncovered debug path.
Q8: Field tech needs debugging — how to do RMA unlock without exposing keys?
RMA unlock should be time-bound, role-based, and least-privilege, with keys remaining non-exportable. First require challenge-response using a service credential and limit the granted capabilities to diagnostics only. Then enforce that secure boundaries expose only sign/verify or wrap/unwrap APIs, never “read key.” The discriminator is a negative test: even in unlocked mode, attempts to dump secure regions or export keys must still be blocked and logged.
Q9: Tamper triggers too often in cold/heat — how to reduce false positives?
False positives usually come from thresholds without compensation, poor debounce, or single-sensor decisions. First analyze tamper event rates across temperature and vibration profiles, and correlate with sensor raw readings. Then apply multi-signal voting, adaptive thresholds, and explicit debounce windows while keeping the response auditable. The discriminator is environmental replay: if events cluster at specific temperature ramps or mechanical shocks, calibration/compensation is required; if random, the sensor path is noisy or under-filtered.
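Debounce plus voting can be sketched as a sliding-window majority filter; the window and vote thresholds below are arbitrary illustrative values that would be tuned against the environmental profiles:

```python
from collections import deque

def make_tamper_filter(window: int = 5, votes: int = 3):
    recent = deque(maxlen=window)
    def feed(triggered: bool) -> bool:
        # Debounce + voting: fire only when `votes` of the last `window`
        # samples agree, suppressing single-sample glitches.
        recent.append(triggered)
        return sum(recent) >= votes
    return feed

feed = make_tamper_filter()
assert not feed(True)    # one glitchy sample: suppressed
assert not feed(False)
feed(True); feed(True)
assert feed(True)        # sustained trigger: fires
```

Multi-signal voting (e.g., light plus case switch) and temperature-compensated thresholds layer on the same idea; every suppressed and fired decision should still be auditable.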
Q10: Glitch attacks cause bypass — what detectors or brownout policies matter most?
Bypass under glitches indicates missing fault detection or an unsafe recovery policy. First confirm brownout and clock/reset anomaly detectors are active and their events are handled before continuing execution. Then validate the system always transitions to a fail-closed state when detectors trigger during boot or update. The discriminator is a glitch/brownout injection test: detectors must raise an event, the device must lock/limit or enter recovery, and the audit log must capture the trigger source and timing.
Q11: Logs get wiped after power loss — how to make audit logs append-only?
Audit logs need tamper-evidence and monotonic ordering, not just storage space. First implement an append-only structure with a hash chain and periodic signed checkpoints stored in a protected anchor. Then ensure sequence numbers never decrease across reboots and that truncation is detectable. The discriminator is a destructive test: delete or modify a record and verify the chain breaks; cut power mid-write and confirm recovery preserves monotonic sequence and marks incomplete entries.
Q12: How to prove device identity to an offline verifier without cloud?
Offline identity proof requires a verifiable credential chain and a fresh challenge response. First provision a device identity certificate anchored to a trusted root stored by the verifier, and ensure the private key is non-exportable. Then perform a nonce-based attestation or stream-signing challenge so the verifier confirms liveness. The discriminator is replay resistance: repeating a previous attestation must fail because the verifier tracks nonce/sequence usage, and the device logs the attestation event with a reason code.
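The nonce-tracking half of this flow can be sketched compactly. HMAC with a shared key is a stdlib-only stand-in here; a real deployment would use an asymmetric scheme (e.g., ECDSA over the device certificate chain) with the private key non-exportable in the SE:

```python
import hmac, hashlib, os

DEVICE_KEY = b"provisioned-device-key"  # stand-in for a non-exportable SE key

used_nonces: set = set()  # verifier-side freshness state

def attest(nonce: bytes) -> bytes:
    # Device side: prove possession of the key over a fresh challenge.
    return hmac.new(DEVICE_KEY, nonce, hashlib.sha256).digest()

def verify(nonce: bytes, proof: bytes) -> bool:
    if nonce in used_nonces:
        return False  # replayed attestation must fail
    ok = hmac.compare_digest(attest(nonce), proof)
    if ok:
        used_nonces.add(nonce)
    return ok

n = os.urandom(16)
p = attest(n)
assert verify(n, p)
assert not verify(n, p)   # replaying a previous attestation is rejected
```

The device would additionally log each attestation event with a reason code, giving the offline verifier and the audit trail the same evidence.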