A Face Access Controller is an on-edge security endpoint: it turns depth/IR sensing into a door decision through a trusted pipeline (liveness → match → policy → relay), while keeping templates and keys non-exportable.
When failures happen (sunlight FRR spikes, reboots, OSDP drops, template loss), the fastest fix comes from an evidence chain—scores/histograms + waveforms/log fields—so symptoms map to measurable causes rather than guesswork.
H2-1. What a Face Access Controller Is (and is NOT)
On-device face decision · Depth/IR + liveness lighting · Secure element + encrypted templates · Door I/O (OSDP/Wiegand/Relay) · PoE PD + evidence logs
A Face Access Controller is an edge device that performs Depth/IR capture → liveness decision → face
inference/match → access actuation, while protecting identity assets (templates/keys) using a
hardware-backed trust anchor and producing auditable field evidence (power, security state, I/O events).
This page is hardware- and evidence-driven: every diagnosis must map to at least one of these:
(1) power/timing proof, (2) AI pipeline proof, (3) security-state proof.
Out of scope (handled by sibling pages): intercom/door-station audio or SIP/VoIP behavior, door-lock motor/solenoid mechanics, multi-door panel topology, NVR/VMS ingest/storage platforms, PoE switch/PSE internals, cloud orchestration/app tutorials.
Field triage: “belongs to this page” vs “belongs to a sibling page” (5 rules)
Use the rules below to prevent scope creep during field debug. Each rule is designed to be
mechanically checkable with a minimal evidence set (two data points).
F1 locks scope: the page covers the controller, sensor + illumination, secure element, access I/O, and PoE PD power evidence—while excluding intercom, lock mechanics, NVR/VMS platforms, panel topology, and PoE PSE design.
Cite this figure: #fig-f1 — ICNavigator, “Face Access Controller”, Figure F1 (Scope Boundary)
H2-2. Reference Architecture: Data Paths + Trust Boundaries
A reliable face access controller is best understood as four flows sharing a single trust root:
(1) capture flow, (2) inference/match flow, (3) actuation flow, and (4) management/update flow.
The purpose of this chapter is to make the system auditable: every security claim must map to a
verifiable boundary and a measurable artifact.
Four data flows (each must have measurable evidence)
Three non-negotiable trust invariants (must never break)
These invariants are written as system constraints plus a proof method. The proof items should be
logged or otherwise observable without requiring vendor-only lab tools.
Invariant #1 — Templates/identity assets never exist in plaintext outside the trusted boundary.
Proof: encrypted template DB flag + key handle usage (SE/TEE), no plaintext export paths, TLS enforced for any identity transport.
Invariant #2 — Only signed firmware can boot; updates are signed; rollback is blocked.
Proof: secure-boot state + verified hash, signature chain record, monotonic counter (SE/OTP/RPMB) increments on accepted updates.
Invariant #3 — Debug/physical attack surfaces are controlled and tamper is observable.
Proof: debug port locked state, tamper events written to audit log with timestamp, optional key invalidation policy indicator.
Architectural rule: if a symptom cannot be mapped to one of the four flows and proven by at least one telemetry or waveform artifact,
the diagnosis is not complete.
F2 makes the system auditable: four flows (capture, inference/match, actuation, update) are mapped to a trusted boundary,
with a secure element providing key handles (not plaintext keys), and three invariants that must never break.
H2-3. Depth/IR Front-End Choices (ToF vs Stereo vs Structured Light)
In access control, depth/IR is not a “nice-to-have camera upgrade”—it is a spoof-resistance input.
The front-end choice must survive outdoor sunlight, keep false reject rate (FRR) stable across faces/angles,
and stay inside power/compute budgets while remaining field-debuggable.
Why depth matters in the door context (threat-model driven)
3D spoof (mask/print): requires depth + additional cues (reflectance behavior, micro-geometry limits, temporal stability) at the controller level.
Outdoor deployment: sunlight/IR flood can wash out signals; each depth method has a distinct failure signature that must be observable.
Selection matrix (3×6): method vs engineering evidence
Each cell lists dominant failure mode and a minimum evidence pair (one measurable metric + one log/counter).
This turns “trade-offs” into a field-checkable selection.
Decision path (pick based on constraints, then validate with evidence)
Hard outdoor sunlight + reflective entry (glass/metal): favor ToF or structured light, but require confidence/contrast telemetry; validate with invalid-depth/stripe-quality under max lux.
Low-power and simpler optics priority: stereo can win if texture is acceptable or IR-assist is stable; validate with inliers ratio and disparity invalid ratio.
Long range + consistent depth needed: ToF tends to be predictable if timing & thermal derating are controlled; validate with phase jitter and derating events.
Factory/field service constraints: choose the method whose “dominant failure mode” is easiest to observe with logs + one scope capture in your deployment.
Practical rule: if the chosen depth method cannot expose a stable quality/confidence metric and a sync/health flag,
field debugging will degrade into “AI seems random”—avoid that architecture.
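The practical rule above can be made concrete as a small gate that routes bad frames to optics/sync debugging instead of the matcher. A minimal Python sketch; the field names (confidence_mean, invalid_depth_ratio, sync_locked) and the thresholds are illustrative assumptions, not a vendor API:

```python
from dataclasses import dataclass

@dataclass
class DepthFrameHealth:
    """Per-frame health snapshot (hypothetical export from the depth front-end)."""
    confidence_mean: float      # mean depth confidence, 0.0 to 1.0 (method-specific)
    invalid_depth_ratio: float  # fraction of pixels with no valid depth
    sync_locked: bool           # illumination-to-frame sync/health flag

def depth_quality_gate(h: DepthFrameHealth,
                       min_conf: float = 0.6,
                       max_invalid: float = 0.25) -> str:
    """Return 'ok' or a quality-fail reason. A quality fail is routed to
    optics/sync debugging and must never be reported as an identity mismatch."""
    if not h.sync_locked:
        return "quality_fail:sync_unlocked"
    if h.confidence_mean < min_conf:
        return "quality_fail:low_confidence"
    if h.invalid_depth_ratio > max_invalid:
        return "quality_fail:invalid_depth"
    return "ok"
```

Separating "quality fail" from "identity fail" this way keeps FRR statistics attributable to a cause instead of looking like random AI behavior.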
F3 compares depth methods using the same module vocabulary (emitter/projector, optics, sensor, timing, preprocessing, NPU input) to keep selection evidence-driven and field-debuggable.
H2-4. Liveness Lighting Driver: Pulse, Sync, Thermal, Audit
Liveness lighting is a synchronized signal generator, not just “more IR brightness”.
The driver must deliver repeatable optical power while keeping timing alignment, thermal limits,
and auditability intact. If any one of these breaks, FRR and spoof resistance degrade in a way that
can look like random AI behavior.
What the lighting subsystem must guarantee (engineering, not marketing)
Repeatable optical energy: peak current and pulse width must be achieved (no hidden current clipping).
Deterministic alignment: exposure window must overlap the valid light pulse (or modulation/pattern timing).
Thermal safety: junction temperature must remain bounded via NTC feedback and duty derating.
Auditable operation: peak/duty/derating events must be logged to correlate with score drift.
The 4 quantities that must be synchronized (minimum evidence chain)
These four must be observable as either waveforms or counters. Treat this as a non-negotiable
debug checklist whenever “night works, day fails” or “scores drift after update” appears.
Quantity | Where to measure | What “correct” looks like | If misaligned → typical symptom
① Light pulse (LED/VCSEL EN) | Driver strobe pin / enable GPIO; timestamp marker in logs. | Stable phase relationship to the frame trigger; no jitter bursts during busy periods. |
Phase/pattern lock | | Lock stays asserted; phase/pattern markers align with illumination timing. | Depth confidence collapses; invalid depth holes increase under sunlight.
Thermal + eye-safety enforcement (keep it practical)
NTC feedback loop: NTC near emitter/board hotspot feeds driver derating; log derating_events with timestamps.
Duty/peak constraints: treat peak current, pulse width, repetition rate as a signed configuration—changes must be auditable.
Power interaction: pulsed emitters cause rail dips; correlate current waveform with reset/brownout counters during actuation.
Debug shortcut: if liveness or match quality changes with ambient/time-of-day, capture a single trace that includes
① strobe, ② current sense, and ③ frame sync. If those three do not maintain stable alignment, fix sync/power/derating first.
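One way to check the alignment offline is to pair strobe and frame-trigger timestamps from a single capture and bound their phase spread. A hedged sketch; the timestamp arrays (microseconds, sorted) and the 50 µs jitter bound are hypothetical values, not device specs:

```python
def phase_offsets(strobe_ts, frame_ts):
    """Pair each frame trigger with the nearest strobe edge and return the
    strobe-to-frame offsets in microseconds."""
    offsets = []
    for f in frame_ts:
        nearest = min(strobe_ts, key=lambda s: abs(s - f))
        offsets.append(nearest - f)
    return offsets

def sync_is_stable(offsets, max_jitter_us=50):
    """Alignment is 'stable' when the spread of the phase offsets stays bounded;
    a jitter burst shows up as one offset escaping the band."""
    return (max(offsets) - min(offsets)) <= max_jitter_us
```

The same two functions work whether the timestamps come from a scope export or from log markers, which keeps the "single trace" shortcut scriptable.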
F4 shows the lighting subsystem as a synchronized signal chain: pulse and current must align with camera exposure and ToF/pattern timing, while NTC feedback enforces safe derating and logs events for audit and debug correlation.
H2-5. Optical Path & Outdoor Robustness (Glare, Reflection, IR Leakage, Scatter)
Outdoor access control failures are rarely “AI problems” first. They usually start as
optical energy arriving at the sensor through unintended paths—direct glare, window reflections,
dust scatter, fog halos, or IR leakage around an IR-cut mechanism. This chapter turns those risks into
testable constraints and a path-based debug method.
Design constraints that must be validated (not assumed)
Window + bezel must block off-axis glare: baffles/light-traps should prevent shallow-angle sunlight from hitting sensor-facing surfaces.
Reflection return paths must be controlled: internal double-reflection from the window must not re-enter the lens as a “fake scene.”
IR-cut state must be deterministic: day/night transitions must not cause unpredictable IR leakage or focus/MTF shifts.
Contamination must be survivable: fingerprints/dust should not push black level and flare beyond what the pipeline can tolerate.
Fog/rain must be observable: scattering-driven SNR loss should be detectable via metrics, not discovered as rising FRR in the field.
Temperature drift must be bounded: lens focus shift and window stress effects must not collapse ROI contrast or depth confidence.
Six mandatory scenarios (each maps to measurable observables)
Each scenario below includes a minimum evidence set. The goal is to detect the optical failure mode
before it becomes a security or user-experience failure.
Fast rule: if FRR or depth confidence changes with time-of-day or weather, treat the root cause as an
optical path problem until the six scenarios above are proven stable by metrics.
A “path diagram + metric evidence” closes the loop faster than subjective visual judgement.
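The fast rule lends itself to a tiny log-side check: bucket genuine-attempt results by hour and flag time-of-day dependence. Sketch only; the event tuple shape and the 5% swing band are assumptions:

```python
from collections import defaultdict

def frr_by_hour(events):
    """events: iterable of (hour, accepted) genuine-attempt records
    (a hypothetical log export). Returns {hour: FRR}."""
    total, rejects = defaultdict(int), defaultdict(int)
    for hour, accepted in events:
        total[hour] += 1
        if not accepted:
            rejects[hour] += 1
    return {h: rejects[h] / total[h] for h in total}

def suspect_optical_path(frr_map, band=0.05):
    """Fast rule from this chapter: if FRR swings with time of day beyond
    `band`, treat the root cause as an optical-path problem until the six
    scenarios are proven stable by metrics."""
    values = list(frr_map.values())
    return (max(values) - min(values)) > band
```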
F5 visualizes how outdoor failures map to specific optical paths: direct glare, reflection return, IR leakage, and scattering.
Use this together with the six mandatory scenarios table to close the “symptom → evidence → path → fix” loop.
H2-6. AI SoC / NPU Pipeline (Preprocess → Embedding → Match → Policy)
The on-device recognition stack should be treated as a real-time pipeline.
Reliability depends on latency determinism, thermal stability, and memory bandwidth control,
plus auditable outputs (policy decisions and logs). This chapter describes the pipeline as an engineering system:
measurable stages, bounded resources, and observable failure signatures.
Pipeline stages (engineering view)
Ingress: capture / decode / timestamp the frame; establish ROI and privacy constraints early.
Targets below are design goals for interactive door access. Values should be tuned by product constraints,
but the structure is stable: each stage needs its own timer and its own symptom mapping.
Stage | Target (ms) | Over-budget symptom | First evidence to check | First fix direction
Capture / Ingress | 8–12 | Inconsistent response; occasional “stalls” when multiple faces appear. | dropped_frame_count + ingress_queue_depth | Reduce copies; cap input FPS; enforce ROI early.
Preprocess | 6–10 | Latency drift after updates; CPU load spikes; frame jitter increases. | preproc_ms + DDR_bw_est | Fuse ops; use zero-copy buffers; minimize format conversions.
Detect + Align | 10–18 | Queue grows in busy periods; recognition becomes “late” but still correct. | det_align_ms + roi_area_ratio | ROI gating; limit max faces; early reject low-quality frames.
Embedding (NPU) | 18–35 | Heat-triggered slowdowns; FRR rises under thermal throttle due to timing/AE coupling. | embed_ms + thermal_throttle_event | Thermal headroom; lower duty; model quantization; schedule load.
Match | 3–8 | Occasional incorrect reject under load; audit logs show retries/timeouts. | match_ms + template_read_retries | Cache hot templates; optimize secure storage access; batch comparisons.
Policy + Decision | 2–6 | Allow/deny inconsistencies across identical inputs due to missing quality gates. | policy_path_id + quality_gate_flags | Make gates explicit; separate “quality fail” from “identity fail”.
I/O Output | 5–12 | Decision is fast but unlock is late; relay/OSDP timing variance. | io_ms + bus_retry_count | Prioritize I/O task; debounce; verify supply margin for relay.
Resource constraints to enforce (what keeps the pipeline deterministic)
Thermal: throttle events must be logged and correlated with stage timers; otherwise FRR drift is misdiagnosed.
DDR / bandwidth: avoid extra frame copies; control intermediate tensor sizes; keep ROI tight and stable.
ROI + “sub-stream” concept: crop/scale policies must be explicit so busy scenes do not explode compute cost.
Privacy masking: apply masking before recording or exporting frames; log policy decisions without leaking raw content.
Debug shortcut: when users report “sometimes slow” or “fails when crowded”, check the three coupled signals first:
stage timers, thermal throttle, and queue depth. If those are stable, investigate optics/illumination quality gates next.
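The debug shortcut can be encoded as a fixed triage order over the three coupled signals. A sketch using the upper bounds from the stage table above; the stage keys, the queue cap, and the return strings are illustrative shorthand:

```python
STAGE_BUDGET_MS = {  # upper bounds of the design targets in the table above
    "capture": 12, "preprocess": 10, "detect_align": 18,
    "embedding": 35, "match": 8, "policy": 6, "io": 12,
}

def over_budget(stage_timers_ms):
    """Return the stages whose measured time exceeds the design target."""
    return [s for s, t in stage_timers_ms.items()
            if t > STAGE_BUDGET_MS.get(s, float("inf"))]

def triage(stage_timers_ms, thermal_throttle, queue_depth, queue_cap=8):
    """Shortcut order: stage timers, then thermal throttle, then queue depth.
    If all three are stable, investigate optics/illumination quality gates."""
    late = over_budget(stage_timers_ms)
    if late:
        return "check_stage:" + late[0]
    if thermal_throttle:
        return "check_thermal"
    if queue_depth > queue_cap:
        return "check_queue"
    return "check_optics_quality_gates"
```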
F6 makes the edge AI stack auditable: stage timers define deterministic latency, ROI and privacy gates control cost and exposure,
and policy output is tied to secure templates and an audit log for traceable decisions.
H2-7. Secure Storage: Templates, Keys, and Anti-Rollback State
A face access controller is only as trustworthy as its identity store. The design goal is simple:
templates, private keys, and anti-rollback state must never exist as plaintext at rest, and cloning or rollback
must produce detectable, auditable signals. This chapter defines a storage tier strategy for
Secure Element (SE/TPM), TEE, and encrypted storage (eMMC RPMB / encrypted partitions).
Objects that must never land in plaintext
Face template / embedding vectors: store only encrypted blobs; any decode must be bound to device trust state.
Template KEK / key-wrapping keys: non-exportable; generated and used inside SE/TPM only.
Device identity private keys (TLS / signing): non-exportable; SE performs sign/handshake without key release.
Anti-rollback state (monotonic counter, sealed version): must be write-protected against rollback (SE counter or RPMB authenticated writes).
Attestation / audit signing keys: non-exportable; used only to sign measured boot + policy state for non-repudiation.
Practical rule: if any of the above can be recovered by reading a file system image, the system is
clonable by design. Encryption without device binding does not prevent template transplant.
Common anti-pattern: storing secrets without device binding (e.g., a raw DEK in a file), or relying on encryption without a root of trust. That yields confidentiality only; it does not help unless keys are bound to device state.
Anti-rollback / anti-clone: three detection signals (with evidence hooks)
Monotonic counter: a strictly increasing boot/version counter stored in SE or RPMB. Evidence: boot_version_counter, fw_rollback_detected.
Device binding: template blobs are encrypted with a DEK that is wrapped to a device-unique KEK in SE/TPM (or sealed to measured boot state). Evidence: template_unwrap_fail, device_bind_id.
Attestation: measured boot hash + policy version signed by a non-exportable key to prove runtime state. Evidence: measured_boot_hash, attest_quote_id.
Recommended audit minimum: for every allow/deny decision, log (policy_version, quality_gate_flags, template_version, stage timers, device_bind_id) and anchor the record with a signature or hash chain so that offline operation remains accountable.
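One simple way to anchor offline records is a SHA-256 hash chain. A minimal stdlib-only sketch; on a real device the chain head would additionally be signed with an SE-held key, which is out of scope for this snippet:

```python
import hashlib
import json

def append_event(chain, event):
    """Append an allow/deny record to a hash chain so offline decisions stay
    tamper-evident. Each record carries the hash of its predecessor."""
    prev_hash = chain[-1]["record_hash"] if chain else "0" * 64
    body = dict(event, prev_hash=prev_hash)
    record_hash = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append(dict(body, record_hash=record_hash))
    return chain

def verify_chain(chain):
    """Recompute every link; mutating any earlier record breaks verification."""
    prev = "0" * 64
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()).hexdigest()
        if body["prev_hash"] != prev or recomputed != rec["record_hash"]:
            return False
        prev = rec["record_hash"]
    return True
```

After reconnection, the backend only needs the last trusted hash to confirm that no offline record was inserted, dropped, or edited.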
F7 shows a practical layering strategy: large objects (templates, logs) live in encrypted storage, while non-exportable keys and monotonic counters
live in SE/TPM (and/or RPMB-backed sealed state). The system must produce explicit rollback/clone evidence hooks.
H2-8. Interfaces & Door I/O: Wiegand/OSDP, Relays, Tamper, Offline Modes
Door access hardware fails most often at the edges: wiring, surges, ground reference, and inductive loads.
This chapter focuses on OSDP (RS-485), Wiegand, relay/lock outputs, and
door-contact / REX / tamper inputs, with a repeatable troubleshooting method:
symptom → first measurement point → confirm → first fix.
OSDP vs Wiegand: the practical boundary
OSDP (RS-485): differential bus, long distance, secure channel optional, but sensitive to termination, common-mode noise, and topology.
Wiegand: simple pulse lines (D0/D1), easier wiring but more exposed to EMI, ground shifts, and edge distortion.
Engineering rule: if the environment is noisy/outdoor and security matters, OSDP with proper protection is usually the stable path.
Eight common field symptoms → first measurement point
Symptom
Likely domain
First measurement point
What confirms it
First fix
Controller resets when relay pulls in
Power/ground, relay coil
Main rail at MCU/PMIC + relay coil node (scope both)
Rail droop / ground bounce synchronized to pull-in edge
Separate return path; add hold-up; slow edge/limit current
Unlock causes random freezes
Lock back-EMF / backfeed
Lock terminals vs controller ground; rail reverse current indicator
Large negative/positive spikes coincide with lock release
Flyback/TVS near connector; isolate driver; improve return
Queue overflow or atomic-write failures during power loss
Bounded queue; atomic commit; signed/chained logs for later sync
Offline modes (device-side only)
Policy cache: store policy_version + expiry; deny risky operations when cache is stale.
Identity cache: store encrypted template blobs with device binding; refuse unlock when binding check fails.
Event buffering: bounded queue with atomic commit and power-loss resilience; avoid “half written” records.
Audit continuity: sign or hash-chain events so offline decisions remain verifiable after reconnection.
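A bounded queue with atomic commit can be sketched with a temp-file-plus-rename pattern, so a power loss never leaves a half-written record. The file layout and the drop-oldest overflow policy are assumptions for this sketch:

```python
import json
import os
import tempfile
from collections import deque

MAX_QUEUE = 1000  # bound on buffered events; deque drops the oldest on overflow

def buffer_event(queue, event):
    """Bounded enqueue; deque(maxlen=...) enforces the drop-oldest policy."""
    queue.append(event)

def commit_events(path, queue):
    """Atomically persist the event queue: write a temp file in the same
    directory, fsync, then os.replace(). A power loss leaves either the old
    file or the new file on disk, never a partial write."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(list(queue), f)
        f.flush()
        os.fsync(f.fileno())      # force data to storage before the rename
    os.replace(tmp, path)         # atomic rename on POSIX filesystems
```

Usage: create the queue as `deque(maxlen=MAX_QUEUE)`, buffer events as they occur, and call `commit_events()` at decision boundaries or before planned shutdown.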
Field priority: if a door system misbehaves, confirm power integrity + surge return paths before tuning protocol retries.
Wiring faults and inductive load spikes are the top source of “random” issues.
F8 connects wiring to failure modes: RS-485 protection and termination for OSDP, relay flyback for lock loads,
and filtered/monitored digital inputs (door contact, REX, tamper). Robust ground and surge return paths prevent “random” field issues.
H2-9. Power Tree & PoE PD: Cold-Start, Brownout, Hold-up, Thermal
Intermittent face recognition failures and random resets are often power problems in disguise.
This chapter turns “fails sometimes” into a measurable power evidence chain across
PoE negotiation, cold-start inrush, rail sequencing, load transients,
hold-up, and thermal throttling.
Power evidence chain (what to confirm first)
PoE budget: confirm class/type and the effective power limit under real cable + switch conditions.
Cold-start: inrush or UVLO loops present as repeated boot attempts or partial initialization.
Rail health: PGOOD drops, sequencing violations, or brownout interrupts correlate with “sudden reject” events.
Load transients: NPU bursts + liveness LED pulses + relay switching create fast dips and ground bounce.
Thermal: throttle changes latency; missed deadlines can look like “model accuracy issues.”
Field priority: if failures correlate with LED pulses, unlock events, or peak CPU/NPU usage, verify
SoC/NPU rail at the load and PD output rail before changing AI thresholds.
Four must-log power event fields (minimum)
Field | Definition & trigger | How to interpret in the field
brownout_count | Increment on PMIC/MCU brownout interrupt or PGOOD drop; include timestamp. | Rising count without obvious mains loss indicates transients, current limit, or poor return path.
last_reset_reason | POR / WDT / UVLO / thermal / software; store the last cause across boots. |
poe_class_power_limit | Record PoE class/type and effective system power limit after negotiation. | Low limit on certain ports/cables commonly causes “LED on → reject or reboot.”
thermal_throttle_state | 0/1/2 or target frequency; derive from NTCs, SoC sensors, and PD thermal events. | Summer/outdoor failures often map to throttle → inference latency over budget → higher reject rate.
Recommended add-ons (optional): pgood_drop_count, npu_peak_current_est, led_pulse_peak_mA, rail_min_mv. These turn “maybe power” into a clear diagnosis.
Cold-start and hold-up: what “stable” looks like
Cold-start: no repeated UVLO loops; rail rise order matches the SoC + sensor requirements; PGOOD stays asserted through LED enable.
Hold-up: on cable unplug or PoE drop, there is enough energy to complete an atomic event record and shut down cleanly.
Transient immunity: LED pulses and NPU bursts do not pull the SoC rail below the brownout threshold.
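Both checks reduce to simple log/waveform arithmetic. A sketch; the 2900 mV brownout threshold, the millivolt sample units, and the 5 ms correlation window are illustrative values, not device specifications:

```python
def rail_margin_mv(rail_samples_mv, brownout_mv=2900):
    """Transient-immunity check: during coincident LED pulses and NPU bursts
    the SoC rail must stay above the brownout threshold. Returns the
    worst-case margin in mV and a pass/fail flag."""
    margin = min(rail_samples_mv) - brownout_mv
    return margin, margin > 0

def correlate(brownout_ts, event_ts, window_ms=5):
    """Two-point rule applied to power: a brownout 'correlates' with an
    LED/relay/NPU event when it lands inside the event's time window."""
    return [b for b in brownout_ts
            if any(abs(b - e) <= window_ms for e in event_ts)]
```

If `correlate()` returns hits and `rail_margin_mv()` fails during those windows, the evidence chain points at transient loads or return paths rather than AI thresholds.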
F9 ties field symptoms to power domains: PoE negotiation and effective power limit, cold-start/inrush control, rail sequencing,
transient loads (NPU bursts + illumination pulses + relay actions), hold-up behavior, and thermal throttling with mandatory event fields.
Cite this figure: #fig-f9 — ICNavigator, “Face Access Controller”, Figure F9 (Power Tree & PoE PD Evidence)
H2-10. Device Security: Secure Boot, Signed Updates, TLS, Debug Lockdown
Device security for a face access controller is defined by three chains:
boot trust (what code is allowed to run), update trust (how new code is installed safely),
and communication trust (how control and audit data are protected). This chapter keeps scope on-device:
secure boot, signed updates with rollback protection, TLS, debug lockdown, and certificate lifecycle.
Best practice: bind the baseline to an attestation record so that maintenance actions (template enrollment, policy changes, and updates)
produce verifiable evidence of the running firmware and security posture.
Boot and update chain: what must never be bypassed
ROM → Bootloader: signature verification is the first gate; failure must not fall back to an unsigned path.
Bootloader → OS/TEE: measured boot hash should be captured before high-level services start.
Update pipeline: verify signature → write inactive slot → boot as “pending” → commit only after health checks.
Rollback protection: monotonic version counter must advance only forward and be checked on every boot.
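The update pipeline and the anti-rollback rule can be sketched as a small state machine. This models the logic only; on a real device the monotonic counter lives in SE/OTP/RPMB and signature verification is performed by the boot ROM/bootloader, not application code:

```python
class UpdateError(Exception):
    pass

class SlotManager:
    """Minimal A/B update model: verify signature, stage the inactive slot,
    boot 'pending', commit only after health checks pass."""

    def __init__(self, counter):
        self.counter = counter      # stands in for the SE/RPMB monotonic counter
        self.pending = None

    def stage(self, version, signature_ok):
        if not signature_ok:
            raise UpdateError("signature_invalid")   # never fall back to unsigned
        if version <= self.counter:
            raise UpdateError("rollback_blocked")    # counter moves forward only
        self.pending = version

    def commit(self, health_ok):
        if self.pending is None:
            raise UpdateError("nothing_pending")
        if not health_ok:
            self.pending = None                      # revert to the active slot
            return self.counter
        self.counter = self.pending                  # advance the counter
        self.pending = None
        return self.counter
```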
Communications security: keep control and audit protected
Management traffic: enforce TLS; prefer mutual authentication where feasible.
Key storage: device identity private key must be non-exportable (SE/TPM).
Certificates: track expiry, rotation events, and pinning state to avoid silent “works but insecure” behavior.
Debug lockdown principle: maintenance is allowed, but it must be auditable and rate-limited. Any debug enablement should create a signed event record.
F10 shows the on-device trust chains: ROM and bootloader verification, measured boot, signed A/B updates with commit and rollback protection,
secure element key vault (verification keys and monotonic counters), TLS-protected management/audit traffic, and debug port lockdown.
Cite this figure: #fig-f10 — ICNavigator, “Face Access Controller”, Figure F10
H2-11. Troubleshooting Playbook (Symptom → Evidence → First Fix)
This playbook turns “works sometimes” into repeatable evidence. Each symptom starts with
two measurements (one image/score/log metric + one electrical/optical/waveform metric),
then provides discriminators, a short isolation sequence, and a first fix that stabilizes the system
before deeper tuning.
EEAT anchor rule: every diagnosis must end with a stored record: timestamp + environment tag + symptom_id + 2_measurements + first_fix_applied.
Don’t skip the “two-point” rule: one metric alone (only score, only waveform, only logs) produces guesswork.
The fastest isolations come from correlating two signals in the same time window.
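The anchor rule and the two-point rule can be enforced at record-creation time. A sketch; the measurement dict shape and the domain labels are assumptions layered on the required fields above:

```python
import time

def diagnosis_record(symptom_id, environment_tag, measurements, first_fix):
    """Build the stored record required by the anchor rule. Rejects any
    diagnosis that does not carry at least two measurements from two
    different domains (e.g., one log/score metric plus one waveform metric)."""
    domains = {m["domain"] for m in measurements}
    if len(measurements) < 2 or len(domains) < 2:
        raise ValueError("two-point rule: need 2 measurements from 2 domains")
    return {
        "timestamp": time.time(),
        "environment_tag": environment_tag,
        "symptom_id": symptom_id,
        "2_measurements": measurements,
        "first_fix_applied": first_fix,
    }
```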
Common “first-fix” parts (example MPNs)
The following are example parts commonly used in face access controllers. Selection depends on voltage/current,
isolation level, EMC class, temperature, availability, and compliance needs.
Area | Function | Example MPNs (pick per design constraints)
PoE PD | IEEE 802.3af/at PD interface | TI TPS2373-4, TI TPS2372-4, Analog Devices/LT LTC4267
Hot-swap / eFuse | Inrush limiting, OCP/OVP | TI TPS25947, TI TPS2595, Analog Devices/LT LTC4365 (surge stopper use cases)
Buck DC/DC | Core rails regulation | TI TPS62130, TI TPS62840, Analog Devices ADP2302
Supervisor | Reset/PGOOD monitoring | TI TPS3808, Maxim/ADI MAX16054
NIR LED driver | Constant-current pulse drive | TI TPS92515, TI TPS92662, Analog Devices LT3477
Flash/VCSEL pulse | High-peak strobe (use-case dependent) | TI LM3644, Analog Devices MAX25601
ToF sensor | Depth measurement | ST VL53L5CX, ams OSRAM TMF8801
Secure element | Key storage, identity, counters | Microchip ATECC608B, NXP SE050, Infineon OPTIGA Trust M (SLS32AIA)
RS-485 transceiver | OSDP physical layer | TI SN65HVD72, Maxim/ADI MAX13487E
Isolated RS-485 | Noise immunity / ground shift | Analog Devices ADM2587E, TI ISO1410 + external transceiver
Figure F11 — Decision Tree (Symptom → Evidence → First Fix)
F11 enforces the “two measurements first” rule: one metric from image/scores/logs plus one electrical/optical/waveform metric,
then a first fix that stabilizes the system before deeper tuning.
Acceptance checks:
each symptom includes First 2 measurements + Discriminator +
Isolate + First fix, and each fix can be tied to a log record or waveform capture.
H2-12. FAQs ×12 (Evidence-based; mapped to chapters)
Each answer stays inside this page boundary and points back to the measurable evidence chain
(histograms/confidence, pulse current, rail droop, RS-485 waveform, counters, and update/boot checks).
Outdoor sunlight causes sudden FRR spikes—optics flare or IR illumination saturation?
If FRR spikes track sun angle and persist with illuminator off, suspect flare/stray-light paths (window, baffles, IR leakage). If they appear only when the illuminator is on and the pulse current clips or droops, suspect illumination saturation or power limiting. Measure saturation ratio/ToF confidence + LED pulse current. First fix: stabilize pulse (e.g., TPS92662/TPS92515) and reduce glare paths.
Mapped to: H2-5 / H2-4
Liveness passes but wrong person accepted—embedding drift or template binding issue?
If false accepts rise across many users right after a model/pipeline change, suspect embedding drift or policy thresholds. If errors correlate with template import, board swap, or identity migration, suspect template binding/anti-clone logic. Measure embedding similarity distribution + template_version_counter/device binding (attested key). First fix: pin model version, enforce device-bound templates in a secure element (ATECC608B/SE050).
Mapped to: H2-6 / H2-7
Works at night but fails at dusk—auto-exposure curve or NIR duty control?
Dusk is the “control-loop trap”: ambient light changes fast and the exposure curve can oscillate while NIR duty ramps. If exposure/gain hunts or clips at the transition, tune AE hysteresis and ROI constraints. If NIR pulse amplitude/duty collapses as temperature rises, the driver is throttling. Measure exposure/gain stability + NIR pulse current/duty. First fix: bound the curve and cap thermal peaks.
Mapped to: H2-4 / H2-6
After firmware update, recognition changed—model versioning or calibration overwritten?
First separate “pipeline change” from “data corruption.” Check model/pipeline version hash and latency budget; then verify calibration/template partitions were not overwritten and counters didn’t roll back. Measure model version + template/calibration monotonic counter and verify status. First fix: signed updates with A/B slots and anti-rollback, and protect calibration/template storage behind SE-bound keys (SE050/OPTIGA Trust M).
Mapped to: H2-6 / H2-10 / H2-7
Device reboots when relay switches—ground bounce or lock kickback?
If resets align with relay_event_ts, check two things: SoC rail droop at the load and relay coil kickback/return-path injection. Large kickback spikes or ground bounce can trip UVLO or corrupt logic even when average power looks fine. Measure rail droop + coil/kickback waveform. First fix: tighten flyback paths (ULN2803A + diode/snubber), separate returns, and add eFuse/current limiting (TPS25947).
Mapped to: H2-8 / H2-9
OSDP intermittent only with long cable—termination, biasing, or surge protection?
Long cables amplify reflections, common-mode shifts, and surge events. If the RS-485 waveform shows ringing/undershoot, fix termination and biasing first. If CRC/timeouts spike during relay action or outdoor events, add robust TVS/clamp and filtering. Measure RS-485 differential + common-mode movement, and correlate with osdp_crc_err/osdp_timeout. First fix: SM712 TVS + correct termination; use isolated RS-485 (ADM2587E) when ground shift exists.
Mapped to: H2-8
Templates disappear after power loss—storage integrity or rollback protection triggered?
If template_verify fails without counter rollback, suspect non-atomic writes or power-fail during commit—correlate with brownout_count and last_reset_reason. If the monotonic counter rolls back or fw_rollback_detected asserts, anti-rollback logic likely rejected the state. Measure template_version_counter + verify result around power-loss events. First fix: journaled/atomic commits, hold-up for clean writes, and store counters/keys in a secure element (ATECC608B/SE050).
Mapped to: H2-7 / H2-9
ToF depth looks noisy near glossy surfaces—multipath or emitter timing?
Glossy surfaces often create multipath and backscatter that drop ToF confidence in angle-dependent patterns. If noise spikes follow surface angle/material even with stable timing, it’s multipath/stray light (optics). If noise correlates with frame trigger drift or modulation slip, it’s timing/sync. Measure ToF confidence/noise vs angle, plus emitter pulse timing/current stability. First fix: reduce glare paths (baffles/window treatment) and lock illumination-to-frame sync.
Mapped to: H2-3 / H2-4 / H2-5
PoE shows enough power but device throttles—thermal vs power classification limit?
If thermal_throttle_state rises and performance drops gradually, it’s thermal headroom (SoC/NPU, driver, enclosure). If poe_class_power_limit asserts or PD current caps during peaks, it’s a power budget/classification ceiling. Measure temperature/throttle flags + poe_class_power_limit and rail droop. First fix: reduce coincident peaks (NIR pulse + NPU burst), improve heat path, and verify PD/class (e.g., TPS2373-4) and inrush/eFuse behavior.
Mapped to: H2-9
Secure boot enabled but malware still runs—verified boot vs measured boot gap?
Measured boot only records what ran; verified boot blocks unsigned code. Malware can run if verification doesn’t cover OS/apps/config, or if debug/update paths remain open. Measure where signature checks occur in the boot chain, and confirm debug ports are locked and updates require signatures + anti-rollback. First fix: enforce verified boot end-to-end, disable JTAG/UART in production, and bind device identity/keys in a secure element (OPTIGA Trust M/ATECC608B).
Mapped to: H2-10
Face match latency doubles after enabling encryption—CPU offload or key store bottleneck?
Break latency by stages: capture → preprocess → infer → match → encrypt/Tx. If only the encrypt/Tx slice grows and CPU load spikes, crypto is on the CPU (no acceleration/session reuse). If stalls align with secure element transactions, the key store path is blocking. Measure per-stage latency + CPU utilization and SE call timing. First fix: enable HW crypto/offload and session resumption, and reduce synchronous SE operations (SE050/ATECC608B) in the hot path.
Mapped to: H2-7 / H2-6
How to rotate certificates without bricking offline doors?
Use a staged, overlap strategy: keep old+new certs valid together, install the new trust chain first, then switch identities after handshake success is confirmed. Offline devices need a local recovery path and anti-rollback-safe state transitions. Measure TLS handshake failures and cert state machine progress, and record monotonic counters for each stage. First fix: dual-slot cert store + timed cutover, with device identity anchored in SE (SE050/OPTIGA Trust M) and an offline fallback procedure.
Mapped to: H2-10 / H2-7 / H2-8