Power & Backup for Security Systems
Central idea: Security “Power & Backup” is engineered to keep critical loads running through outages (or degrade predictably) while leaving forensics-ready proof—measurable rail behavior, health metrics, and tamper-resistant local event logs—so every reboot or dropout is explainable and fixable.
It focuses on PSU selection, power-path control (eFuse/ORing), and hold-up/battery strategy to prevent “mystery failures,” especially under cold, pulse loads, and aging conditions.
H2-1. Mission & Boundary: What “Power & Backup” Owns in Security Systems
This page owns the engineering decisions that keep security endpoints powered through disturbances (blink, brownout, load spikes) and leaves a provable trail (health metrics + event logs) when anything goes wrong.
- Continuous power: AC-DC / DC-DC rails, sequencing, and transient behavior under real security loads (IR, heaters, small motors, edge compute).
- Backup continuity: two clear goals—hold-up (tens of ms to seconds) and battery/supercap backup (minutes to hours) with controllable shutdown policies.
- Power-path control: eFuse / hot-swap / ideal-diode ORing, inrush limiting, reverse-blocking, source switchover behavior.
- Provable operations: health metrics (brownout counters, thermal events, backup SoC/SoH) and event logs (reset reason, Vmin, duration, timestamps, monotonic counters).
Out of scope (referenced only, or covered on dedicated pages):
- PoE switching / negotiation (PSE, power allocation, IEEE 802.3 behaviors): only referenced as “external DC input” or “post-PD rail”.
- Outdoor surge/ESD/lightning protection (TVS/GDT, bonding, SPD) — dedicated “Surge/ESD Protection (Outdoor)” page.
- Timing networks (PTP/1588 distribution) — dedicated “Timing & Sync for CCTV” page.
- NVR/VMS platform ingest/codec/storage/compliance — dedicated recorder/integrity pages.
- Inputs: AC adapter or DC feed; optional “post-PD DC rail” as a generic input (no PoE deep dive).
- Loads: steady rails (SoC/MCU) + pulse rails (IR/heater/motor) + always-on domain (RTC/log/tamper).
- Signals: PG/RESET, fault IRQ, and a minimal telemetry/log interface (I²C/SPI/UART—named only, not protocol deep dive).
- Hold-up (ms–s): survive brief outages and source switching. Failure mode is usually a voltage-window problem (Vout dips below UVLO/PG) rather than average power.
- Battery / supercap backup (min–hours): maintain operation or perform controlled degradation (e.g., disable IR heater first, keep logs/RTC alive, then graceful shutdown). Failure mode is often SoC/ESR/temperature mismatch.
- Minimum measurable evidence: Vmin on main rail + duration below threshold; backup current snapshot; device temperature at event.
- Minimum log fields: reset reason (brownout/UVLO/OCP/OTP/watchdog), timestamp (RTC) + monotonic counter, Vmin, duration, SoC, temperature, and protection trip counters.
- Acceptance criterion: no reset under defined disturbances; if a reset happens, the system must output a complete log entry that points to a root-cause class.
Figure F1. Ownership map: inputs → rails → power-path control → security loads, with backup energy and a provable health/log trail.
H2-2. Topologies & When to Use: Flyback vs LLC in Security Power
Choose topology by power range, noise constraints, standby efficiency, and disturbance behavior—especially pulse loads (IR/heater/motor) and light-load burst conditions that can trigger resets.
- Best fit: low–mid power endpoints, cost-sensitive designs, flexible outputs.
- Primary risks: peak currents and magnetics/thermal limits reduce transient margin; clamp strategy impacts stress and noise.
- Security-specific trap: pulse rails (IR/heater/motor) create fast load steps—design must prioritize output sag and recovery time, not only efficiency.
- Best fit: mid–high power, efficiency/thermal/noise priority, higher continuous loads.
- Primary risks: light-load burst/skip behavior can introduce low-frequency ripple “packets” that couple into sensitive domains; regulation may degrade during input sag/hold-up window.
- Security-specific trap: night-time low load (idle camera) + periodic spikes (IR on) can repeatedly cross mode thresholds—this is where “random resets” often hide.
- Pulse load step (IR/heater/motor): measure Vout droop, recovery time, and PG/RESET threshold crossing. If Vout droops before PG deasserts or RESET asserts, the root cause class is power transient margin. First fix: increase effective output energy (cap/loop), tune response, and review power-path switching window.
- Light-load burst noise: measure burst period and ripple envelope, and correlate it with fault counters/reset reasons. If faults align with burst packets, the root cause class is light-load mode behavior. First fix: adjust mode thresholds, add controlled preload, or improve filtering on sensitive domains (keep changes within this page’s scope).
- Hold-up during input sag: measure time-to-UVLO and the rail voltage window during sag. If hold-up meets the calculation but fails at system level, the culprit is usually downstream UVLO/PG or path switchover timing rather than capacitor value alone.
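The three checks above all reduce to a few numbers pulled from one captured rail waveform. The following is a minimal Python sketch, assuming uniformly sampled voltage data exported from a scope or ADC; the names `droop_metrics` and `DroopMetrics` are illustrative and do not come from any instrument library.

```python
from typing import NamedTuple, Optional, Sequence


class DroopMetrics(NamedTuple):
    v_min: float                 # lowest sampled voltage in the capture
    time_below_s: float          # total time spent below the threshold
    recovery_s: Optional[float]  # first crossing -> first sample back at/above threshold


def droop_metrics(samples: Sequence[float], dt_s: float, threshold_v: float) -> DroopMetrics:
    """Summarize a uniformly sampled rail capture against a PG/UVLO threshold."""
    if not samples:
        raise ValueError("empty capture")
    below = [v < threshold_v for v in samples]
    time_below = sum(below) * dt_s
    recovery = None  # stays None if the rail never crossed or never recovered
    if True in below:
        first = below.index(True)
        for i in range(first, len(below)):
            if not below[i]:
                recovery = (i - first) * dt_s
                break
    return DroopMetrics(min(samples), time_below, recovery)
```

Comparing `time_below_s` and `recovery_s` against the supervisor's delay gives a quick indication of whether a droop is long enough to trip PG/RESET.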
Figure F2. Topology chooser for security endpoints: flyback favors low–mid power and flexibility; LLC favors mid–high power efficiency/noise—both must be validated against pulse-load steps and light-load burst behavior.
H2-3. Power Tree Architecture: Rails, Sequencing, and “Keep-Alive” Domains
A security endpoint power tree must separate “evidence survival” (always-on) from “service rails” (main) and “stress rails” (pulse loads). The objective is not only avoiding resets, but also ensuring every reset has a provable cause record.
- AON / Keep-Alive: RTC, event log buffer, tamper counters, minimal supervisor. Must survive brief outages long enough to finalize evidence.
- Main: MCU/SoC + networking/control rails. Must be stable under normal operation; can enter controlled degradation on brownout.
- Pulse-load: IR illuminator, heater, small motor/solenoid rails. High di/dt and high peak current; should be isolated and the first to shed.
- Power-up order: AON stable → Main rails stable → enable pulse-load rail(s). This prevents heavy loads from disturbing boot.
- PG chain: PG must represent the main rail window (not just “rail exists”). Use supervisors so reset release follows an explicit threshold + delay.
- Reset sources: treat reset as an input to the evidence system—every reset must map to a reason (brownout/UVLO/thermal/watchdog).
- Controlled degradation (preferred): on input sag or main-rail droop, disable pulse-load first (IR/heater/motor), keep AON alive, and give Main a short window to write a complete event record.
- Hard reset (only when necessary): if rails cross below a non-operational threshold, reset immediately—but the AON domain must still capture reset reason + Vmin + duration.
- First 2 measurements: Main rail Vmin & time-below-threshold + PG/RESET waveform on the same time axis.
- Prove causality: if Vmain droops before PG deasserts, the cause class is transient margin / domain isolation. If PG stays valid but resets occur, inspect reset counters (watchdog/thermal).
- Reset statistics: maintain counters for watchdog / brownout / UVLO / thermal events; trend frequency to identify the dominant failure mode.
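The causality rule above can be written down as a small decision function. This is a sketch only: it assumes two timestamps taken from a single capture on a shared time axis, and the third branch (PG deasserting with no prior rail droop) is an added assumption that the text does not cover.

```python
from typing import Optional


def classify_reset(t_vmain_below_s: Optional[float],
                   t_pg_deassert_s: Optional[float]) -> str:
    """Map two capture timestamps to a cause class; None means 'did not occur'."""
    if t_pg_deassert_s is None:
        # PG stayed valid, so the rail is not the proven cause.
        return "inspect reset counters (watchdog/thermal)"
    if t_vmain_below_s is not None and t_vmain_below_s <= t_pg_deassert_s:
        # Rail drooped first: transient margin or domain isolation.
        return "transient margin / domain isolation"
    # Assumption beyond the text: PG dropped without a preceding rail droop.
    return "PG chain suspect (sense point or supervisor threshold)"
```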
Figure F3. Three-domain power architecture with PG/RESET chain and brownout policy that sheds pulse loads first while preserving the always-on evidence domain.
H2-4. Protection & Power Path Control: eFuse, Hot-Swap, Ideal Diode ORing
“Protection” here means protecting the power path and backup switchover: prevent destructive faults, avoid false trips during legitimate inrush, and switch sources without corrupting rails or evidence.
- Limit current (ILIM): caps short/overload energy and protects upstream supply; must still allow legitimate pulse loads via time-window control.
- Blanking window (tBlank): prevents nuisance trips for allowed surges (IR turn-on, motor start, bulk cap charging).
- Safe operating area (SOA): defines how long the device can survive “half-fault” conditions without overheating.
- Ramp control (dV/dt): controls inrush and upstream droop by shaping the output rise and limiting charging current into bulk capacitance.
- Reverse blocking: prevents battery/supercap from back-feeding the adapter rail (heat, instability, and unexplained brownouts).
- Dual-source priority: adapter-first, backup-first, or “highest wins” policies—choose based on required continuity and evidence retention.
- Switchover window: allow either seamless switching or an acceptable micro-sag; use hold-up capacitor to bridge the crossing window.
- Inrush validation: measure input current waveform + input droop + eFuse fault flag/log. If input droops before the trip, the system is being collapsed by inrush, not a true short. First fix: tune dV/dt, adjust tBlank, or reduce bulk charging peak.
- ORing switchover: measure reverse current during voltage crossing + output sag relative to UVLO/PG thresholds. If reverse current appears, reverse-blocking is insufficient; if output sag crosses UVLO, bridge energy or switch speed is insufficient. First fix: adjust priority thresholds, improve ORing control, and add hold-up bridging.
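Before going to the bench, the inrush side of this validation can be estimated from the relation I = C · dV/dt. The sketch below assumes an idealized linear output ramp and treats tBlank as a window during which current above ILIM is tolerated; real eFuse behavior differs by part, so the function `inrush_check` and its verdict strings are illustrative only.

```python
from typing import Tuple


def inrush_check(c_bulk_f: float, v_rail_v: float, dv_dt_v_per_s: float,
                 i_load_a: float, ilim_a: float, t_blank_s: float) -> Tuple[float, float, str]:
    """Estimate bulk-cap charging current for a linear ramp and compare with ILIM/tBlank."""
    if dv_dt_v_per_s <= 0:
        raise ValueError("dV/dt must be positive")
    i_inrush = c_bulk_f * dv_dt_v_per_s   # I = C * dV/dt
    t_ramp = v_rail_v / dv_dt_v_per_s     # duration of the charging ramp
    i_total = i_inrush + i_load_a         # load may already draw current during the ramp
    if i_total <= ilim_a:
        verdict = "ok"
    elif t_ramp <= t_blank_s:
        verdict = "ok-within-blanking"
    else:
        verdict = "nuisance-trip-risk"
    return i_inrush, t_ramp, verdict
```

For example, 470 µF ramped at 1 V/ms draws roughly 0.47 A of charging current; ramping ten times faster raises that to about 4.7 A, which only survives a 2 A limit if the ramp finishes inside the blanking window.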
Figure F4. Adapter path (eFuse) and backup path (charger + battery/supercap) converge at ideal-diode ORing; hold-up capacitor bridges switchover and prevents evidence loss.
H2-5. Backup Energy Design: Hold-up Capacitor vs Supercap vs Battery
Backup energy must be engineered as a closed loop: define the target hold-up time, define the allowed voltage window, budget the effective load power after brownout policy, then validate by waveform and reset-cause evidence.
- Vhi / Vlo must be defined by real rail thresholds (PG/UVLO/reset) and the minimum voltage needed to finish evidence tasks (log finalize + counter update).
- P_load is the post-policy power: shed pulse loads first (IR/heater/motor), then keep AON alive, then keep Main only as needed for controlled shutdown.
- Low-voltage behavior matters: DC/DC efficiency and current limit near Vlo can shorten real hold-up time versus the ideal energy equation.
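The ideal energy equation referred to above is the standard capacitor relation E = ½ · C · (Vhi² − Vlo²), so the ideal hold-up time is t = η · E / P_load. The sketch below is a first-order estimate under stated assumptions: constant post-policy load power, a single fixed converter efficiency η, and an ESR penalty modeled as one initial step I · ESR taken off the top of the window. The function name and parameters are illustrative.

```python
def holdup_time_s(c_f: float, v_hi: float, v_lo: float, p_load_w: float,
                  efficiency: float = 1.0, esr_ohm: float = 0.0) -> float:
    """First-order hold-up estimate for a capacitor discharged from v_hi to v_lo."""
    if p_load_w <= 0 or not (0 < efficiency <= 1) or v_hi <= 0:
        raise ValueError("invalid load, efficiency, or voltage")
    # Current drawn from the capacitor at the start of the outage.
    i_start = p_load_w / (efficiency * v_hi)
    # ESR step: the usable window starts lower than v_hi.
    v_start = v_hi - i_start * esr_ohm
    if v_start <= v_lo:
        return 0.0  # the ESR drop alone already crosses the lower boundary
    energy_j = 0.5 * c_f * (v_start ** 2 - v_lo ** 2)
    return efficiency * energy_j / p_load_w
```

As a check, 1000 µF discharged from 12 V to 9 V stores 31.5 mJ of usable energy, which supports a 2 W load at 90 % efficiency for about 14 ms. The real figure is usually lower, because efficiency and current limit degrade near Vlo.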
Hold-up Capacitor
- Best for: ms–s ride-through, fast source switching bridge.
- Watch-outs: ESR causes deeper droop; usable energy scales with the voltage window (Vhi² − Vlo²), not with nominal voltage alone.
- Log: Vmin, time-below-threshold, reset reason at the event.
Supercap
- Best for: seconds–minutes, frequent outages, high cycle life.
- Watch-outs: leakage, balancing (series), low-temp ESR rise reduces transient margin.
- Log: cap voltage, charge/discharge count, cold-start failures.
Battery (Li-ion / LiFePO4 / Primary)
- Best for: minutes–hours, highest energy density, long evidence retention.
- Watch-outs: temperature window, protection (OVP/UVP), internal resistance at low temperature.
- Log: SoC/SoH, temperature, cycle count, UV cutoff events.
- Hold-up waveform: capture Vout during input loss, annotate Vhi/Vlo, and mark the exact time PG/RESET changes state.
- Timestamp & reset point: confirm the event record is finalized before Vout crosses the evidence boundary (AON must survive long enough).
- Low temp / aging: repeat at cold and at aged ESR/leakage conditions; many “calculated OK” designs fail due to ESR-driven droop, not missing nominal capacity.
Figure F5. A practical map of backup options with “best for” targets, critical watch-outs, and the minimum evidence to log for field validation.
H2-6. Charging & Discharging Control: Charger, Fuel Gauge, and Safe Cutoff
Charging and discharging must be treated as a system policy: power-path continuity, safe temperature windows, accurate state-of-charge decisions, and cutoffs that preserve the always-on evidence domain during brownout and cold start.
- Linear vs switching charger: choose by efficiency and thermal headroom; keep the interface as “state + limits”, not as a chip tutorial.
- Power-path preference: prioritize powering the system rail while charging storage, so switchover and load steps do not corrupt evidence rails.
- Temperature window: NTC gates charging enable, current level, and fault transitions; every violation must be logged.
- Discharge limiting: prevents the storage rail from collapsing when the main domain wakes or when pulse loads would otherwise step current.
- UV cutoff policy: choose a cutoff that protects the pack while still guaranteeing AON evidence completion (log finalize + counters).
- Cold start: at low temperature or low SoC, bring up AON first, then decide whether Main can start; record failures with timestamps.
- Coulomb count + OCV calibration: SoC is not “always correct”; define when SoC is trusted (rest windows, temperature range).
- SoH trend: runtime policy must consider aging; a pack can show high SoC but fail under load due to increased internal resistance.
- Decision logging: record SoC/SoH/temperature and the policy decision (keep recording, save logs, safe shutdown) for traceability.
- Charge curve: I/V/T versus time + state transitions (Precharge→CC→CV→Terminate) with reason codes.
- SoC drift & calibration points: log when OCV calibration occurs and how SoC changes; investigate large corrections.
- Cold-start proof: verify AON can rise at worst-case temperature/SoC; then verify Main start does not cause UVLO resets.
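The charge sequence and its gating can be sketched as a small state machine that returns a reason code with every transition, so each change can be logged. All thresholds below (3.0 V precharge exit, 4.2 V regulation, 0 to 45 °C window, 4 h safety timer) are placeholder values for illustration, not taken from any charger datasheet; the `SUSPEND` and `FAULT` states are assumptions added to represent temperature gating and timeout handling.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class ChargeState(Enum):
    PRECHARGE = "precharge"
    CC = "cc"
    CV = "cv"
    TERMINATE = "terminate"
    SUSPEND = "suspend"  # temperature outside the charge window
    FAULT = "fault"      # safety timer expired


@dataclass(frozen=True)
class ChargeLimits:
    v_precharge_exit: float = 3.0   # placeholder values, per-cell
    v_regulation: float = 4.2
    i_terminate: float = 0.05
    t_min_c: float = 0.0
    t_max_c: float = 45.0
    timeout_s: float = 4 * 3600.0


def next_state(state: ChargeState, v_cell: float, i_chg: float, temp_c: float,
               elapsed_s: float, lim: ChargeLimits = ChargeLimits()) -> Tuple[ChargeState, str]:
    """Return (new_state, reason_code); the caller logs every non-'no_change' reason."""
    if state in (ChargeState.TERMINATE, ChargeState.FAULT):
        return state, "no_change"
    if not (lim.t_min_c <= temp_c <= lim.t_max_c):
        return ChargeState.SUSPEND, "temp_window"
    if elapsed_s > lim.timeout_s:
        return ChargeState.FAULT, "safety_timeout"
    if state is ChargeState.SUSPEND:
        return ChargeState.PRECHARGE, "temp_recovered"
    if state is ChargeState.PRECHARGE and v_cell >= lim.v_precharge_exit:
        return ChargeState.CC, "precharge_done"
    if state is ChargeState.CC and v_cell >= lim.v_regulation:
        return ChargeState.CV, "reached_v_reg"
    if state is ChargeState.CV and i_chg <= lim.i_terminate:
        return ChargeState.TERMINATE, "taper_complete"
    return state, "no_change"
```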
Figure F6. Minimal charge/discharge state machine with temperature gating, timeout/fault handling, and discharge cutoff; all transitions should generate reason-coded logs.
H2-7. Health Metrics: What to Measure, How to Trend, When to Alert
Health metrics are only useful if they are observable, trended, and explainable. Each alert should bind to two evidence snapshots (forensics-friendly) without relying on cloud dashboards.
PSU (Input quality + Stress)
- Vin stats: Vin_min_24h, Vin_p05_24h, brownout_count_24h.
- Power: Pavg_1h, Ppeak_24h, power_step_events.
- Thermal/protection: Tpsu_max_24h, OVP/OCP/OTP counts.
Backup (Energy + Aging)
- SoC/SoH: latest SoC, SoH trend, cycle_count.
- Cold events: cold_event_count, cold_start_fail_count.
- ESR/Rint proxy: ΔV/ΔI estimate during a known load step.
Path (Switching + Trips)
- eFuse: efuse_trip_count, inrush_event_count, blanking_hits (optional).
- ORing: oring_switchover_count, reverse_current_flags (if available).
- Reset correlation: reset_reason_count by type.
- Short-term buffer: seconds–minutes around events (pre/post window) to support field forensics.
- Long-term roll-up: hourly/daily min/max/p95/count for weeks of retention with low storage cost.
- Trend rules: rising brownout frequency, downward Vin_min slope, rising ESR proxy, or baseline thermal shift under similar load.
Use a minimal severity model (OK/Warn/Fail or 0–100). Every non-OK alert should append two waveform equivalents:
- Vmin + duration: the minimum voltage and how long the rail stayed below a policy boundary.
- Power snapshots: P_before / P_after (or Iin/Iout snapshot) to distinguish input weakness from load escalation.
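The ESR/Rint proxy and a simple trend rule can be computed locally with very little code. This sketch assumes one proxy value per roll-up period and uses an ordinary least-squares slope over the sample index; the threshold parameters `warn_slope` and `fail_level` are hypothetical and would need to be set from characterization data.

```python
from typing import Sequence


def rint_proxy_ohm(v_before: float, v_after: float,
                   i_before: float, i_after: float) -> float:
    """Rint proxy = deltaV / deltaI across a known load step."""
    di = i_after - i_before
    if di == 0:
        raise ValueError("load step must change the current")
    return (v_before - v_after) / di


def trend_slope(values: Sequence[float]) -> float:
    """Least-squares slope of values against their index (units per period)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (y - y_mean) for i, y in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def severity(values: Sequence[float], warn_slope: float, fail_level: float) -> str:
    """OK/Warn/Fail from the latest level and the trend."""
    if values[-1] >= fail_level:
        return "Fail"
    if trend_slope(values) >= warn_slope:
        return "Warn"
    return "OK"
```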
Figure F7. Local health metrics flow from observable counters and rolling stats into threshold/trend rules, producing reason-coded alerts with evidence snapshots that can be audited in the field.
H2-8. Event Logs & Non-Repudiation (Local): What to Record for Forensics
This chapter stays strict: only power-domain event logs and the smallest practical integrity layer inside the device. The goal is reconstructable forensics without platform-level compliance scope.
Type & Cause
- event_type: brownout / UVLO / OVP / OCP / OTP
- reset_source: watchdog / brownout / thermal
- manual_power_cut flag (if detectable)
Time & Ordering
- rtc_timestamp (human readable)
- monotonic_counter (anti-replay)
- boot_seq / record_seq (optional)
Evidence Snapshot
- Vmin + duration_below
- P_before / P_after
- temperature, SoC, SoH (optional)
- CRC detects random corruption and partial writes.
- Hash chaining (prev_hash → this_hash) makes deletion/insertion detectable.
- Signature (optional): sign the chain head or key records with a secure element for device-origin proof.
- Two-phase record: write a pre-record on threshold crossing, then finalize with the last evidence values if AON power remains.
- Atomic validity flag: a record is either valid or marked partial; never treat partial as a clean event.
- Ring buffer with pointers: use stable pointers and monotonic record ids to survive power loss.
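The CRC and hash-chain layers can be illustrated with a fixed-size binary record. This is a sketch under assumptions: the 11-byte field layout is hypothetical, SHA-256 and CRC-32 are chosen here only because they are in the Python standard library, and the optional signature layer is omitted.

```python
import hashlib
import struct
import zlib
from typing import Iterable

BODY_FMT = "<IBHI"                    # record_seq, event_type, vmin_mV, duration_ms (hypothetical)
BODY_LEN = struct.calcsize(BODY_FMT)  # 11 bytes, no padding with "<"
HASH_LEN = 32
GENESIS = bytes(HASH_LEN)             # prev_hash used for the first record


def pack_record(seq: int, event_type: int, vmin_mv: int,
                duration_ms: int, prev_hash: bytes) -> bytes:
    """Layout: body | sha256(prev_hash + body) | crc32(body + hash)."""
    body = struct.pack(BODY_FMT, seq, event_type, vmin_mv, duration_ms)
    this_hash = hashlib.sha256(prev_hash + body).digest()
    crc = zlib.crc32(body + this_hash)
    return body + this_hash + struct.pack("<I", crc)


def record_hash(record: bytes) -> bytes:
    return record[BODY_LEN:BODY_LEN + HASH_LEN]


def verify_chain(records: Iterable[bytes]) -> bool:
    """True only if every CRC matches, every hash links, and seq strictly increases."""
    prev_hash, last_seq = GENESIS, -1
    for rec in records:
        if len(rec) != BODY_LEN + HASH_LEN + 4:
            return False  # partial write
        body, stored_hash = rec[:BODY_LEN], record_hash(rec)
        (crc,) = struct.unpack("<I", rec[-4:])
        if zlib.crc32(body + stored_hash) != crc:
            return False  # random corruption
        if hashlib.sha256(prev_hash + body).digest() != stored_hash:
            return False  # deletion or insertion upstream
        seq = struct.unpack(BODY_FMT, body)[0]
        if seq <= last_seq:
            return False  # replay or reordering
        prev_hash, last_seq = stored_hash, seq
    return True
```

Chaining alone does not reveal truncation of the newest records, which is one place where the optional signed chain head can help.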
Figure F8. Minimal power-domain event log record layout with evidence fields and a layered integrity mechanism (CRC + hash chain + optional signature).
H2-9. Validation Plan: Bring-Up, Power Cycling, Brownout, and Backup Runtime Tests
Validation should be executed as a repeatable SOP: goal, tools, steps, and pass/fail. The focus here is three failure drivers: power interruptions, path switching, and pulse loads.
Level 1 — Minimal
- Programmable DC supply (or AC/DC source feeding the adapter).
- Electronic load with step mode (or switched resistor bank).
- 2–4 channel oscilloscope for rails + PG/RESET (or ADC proxies for trend-only).
- Serial/console access for local event-log extraction.
Level 2 — Recommended
- AC source with dip/flicker scripting (if AC-front-end is in scope).
- Thermal chamber (low temp / high temp) for ESR/aging sensitivity.
- Current probe (inrush, eFuse current limit, ORing reverse current).
- Automated harness: trigger tests → fetch logs → evaluate pass/fail.
- Step 0 — Instrumentation: confirm rail naming, measurement points, PG/RESET chain, and log extraction path.
- Step 1 — Baseline steady-state: verify nominal rails, ripple proxies, thermal rise, and idle power.
- Step 2 — Controlled power cycling: repeated cold/warm cycles; confirm boot reason codes are consistent and monotonic counters increment correctly.
- Step 3 — Brownout/flicker: scripted dips and short interruptions; verify hold-up and graceful behavior.
- Step 4 — Pulse loads: inject IR/heater/motor-equivalent load steps; verify transient recovery without false trips.
- Step 5 — Backup runtime: run-down from full to cutoff; verify runtime, cutoff policy, and log completeness.
- Step 6 — Protection behavior: OCP/short/OTP and recovery mode (latch vs hiccup) under controlled conditions.
A) Input variation & brownout/flicker
- Stimulus: minimum input, dip depth (ΔV), dip duration, flicker rate (repeated dips), short interruption (ms–s class).
- Observe: Main rail Vmin + duration; AON rail survival; PG/RESET timing; reset reason counters.
- Pass/Fail: no unexplained reset; degrade allowed only if intended and logged; hold-up meets spec boundary.
- Required log fields: event_type, rtc_timestamp, monotonic_counter, Vmin, duration_below, P_before/P_after, Temp, SoC, CRC/hash.
B) Pulse-load steps (IR / heater / motor equivalents)
- Stimulus: load steps (ΔI), ramp rate, duty cycle, repetition frequency; include worst-case “simultaneous enable.”
- Observe: rail droop (Vmin), recovery time, eFuse current limit behavior, ORing output stability, thermal rise.
- Pass/Fail: no brownout reset; no nuisance eFuse trips; if protection triggers, it must be repeatable and logged with cause.
- Required log fields: power snapshots (P_before/P_after), efuse_trip_reason (if available), oring_switchover_count, Temp.
C) Backup runtime & safe cutoff
- Stimulus: run on backup from full charge to cutoff under typical and worst-case loads.
- Observe: runtime, cutoff threshold behavior, AON domain continuity, log finalization under low energy.
- Pass/Fail: t_runtime ≥ requirement (or policy target); cutoff is controlled (no corruption); logs remain verifiable.
- Required log fields: SoC/SoH (or estimate), Vmin/duration near cutoff, Temp, monotonic_counter progression.
D) Low-temperature & aging sensitivity (ESR/Rint proxy)
- Stimulus: cold start and backup discharge at low temperature; repeated cycles to expose aging trend.
- Observe: Rint_proxy = ΔV/ΔI on a known step; cold-start fail count; recovery after warming.
- Pass/Fail: no unexpected shutoff when SoC indicates sufficient energy; cold event must be recorded with evidence.
- Required log fields: Temp, SoC, Rint_proxy (or step droop proxy), cold_event_count, reason_code.
E) Protection behavior (OCP/short/OTP) & recovery
- Stimulus: controlled overload, hard short (with safety limits), and thermal ramp to OTP.
- Observe: protection trigger thresholds, hiccup vs latch behavior, restart delay, post-fault stability.
- Pass/Fail: protection must be deterministic; recovery must match policy; no silent failures.
- Required log fields: event_type (OCP/OTP/OVP), trigger count, last Vmin/duration, last power snapshot, CRC/hash ok.
- No unexplained reset: each reset must map to a clear reason code and evidence snapshot.
- Hold-up meets spec: t_hold ≥ target under defined Vwindow (use PG/UVLO boundary as Vlo).
- Log completeness: every tested event yields a record with time + ordering + evidence + integrity fields.
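An automated harness can turn these criteria into a verdict per test run. The sketch below assumes the log record has already been parsed into a dictionary using the field names from test group A (lower-cased here); the mapping of outcomes to the three verdicts is one possible reading of the criteria, not a fixed rule.

```python
from typing import Dict, List, Optional, Tuple

REQUIRED_FIELDS = (
    "event_type", "rtc_timestamp", "monotonic_counter",
    "vmin", "duration_below", "p_before", "p_after", "temp", "soc",
)


def missing_fields(record: Dict[str, object]) -> List[str]:
    """List required fields that are absent or None."""
    return [name for name in REQUIRED_FIELDS if record.get(name) is None]


def verdict(event_seen: bool, intended: bool,
            record: Optional[Dict[str, object]],
            integrity_ok: bool) -> Tuple[str, List[str]]:
    """Return ('pass' | 'degrade+log' | 'fail', missing_field_names)."""
    if not event_seen:
        return "pass", []                      # no reset and no degradation
    if record is None:
        return "fail", list(REQUIRED_FIELDS)   # silent event: nothing was logged
    missing = missing_fields(record)
    if intended and integrity_ok and not missing:
        return "degrade+log", []               # intended behavior, fully evidenced
    return "fail", missing                     # unintended, incomplete, or unverifiable
```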
Figure F9. Validation SOP ties every test stimulus to observable rail evidence and verifiable local logs, producing a deterministic verdict (pass / degrade+log / fail).
H2-10. Field Debug Playbook: Symptom → Evidence → Isolate → Fix
Field debug should minimize instruments and maximize evidence. Each symptom starts with two measurements and ends with a smallest viable fix (parameter / layout / policy).
Symptom: “Reboots immediately on power loss / recording drops”
- Rail evidence: Main rail Vmin + duration; check whether PG/UVLO boundary is crossed.
- Log evidence: event_type (brownout/UVLO) + monotonic_counter continuity; check if record is finalized (CRC/hash valid).
- PSU transient: Vmin crosses boundary with long duration; brownout counters climb.
- Path switching: Vmin does not cross boundary but reset occurs; ORing switchover count spikes.
- Backup ESR: SoC appears adequate but Vmin collapses only under load; Rint_proxy trend is high (especially cold).
- Parameter: adjust brownout policy boundary and hold-up target; tune ORing priority/threshold to avoid flapping.
- Layout: improve sense point (PG/RESET referenced to the true load rail); reduce ground bounce on PG/RESET returns.
- Policy: on brownout, disable pulse loads first and prioritize AON + log finalize.
Symptom: “Resets when IR illuminator / heater / motor turns on”
- Rail evidence: capture Vmin + recovery time during the enable edge; correlate to PG/RESET.
- Path evidence: check eFuse trip flags / inrush event count and the log power snapshots (P_before/P_after).
- PSU transient: rail droop proportional to ΔI with slow recovery; thermal rise increases droop over time.
- eFuse limitation: eFuse current limit or blanking mismatch causes repeatable trip exactly at load enable.
- Policy issue: simultaneous enables stack pulse loads; droop disappears when enabling is staggered.
- Parameter: tune eFuse ILIM/tBlank/dV/dt for inrush; apply soft-start to pulse loads.
- Layout: separate pulse-load return paths; ensure bulk caps sit on the load rail (not only at the source).
- Policy: stagger enable timing; cap total instantaneous power by scheduling (IR/heater/motor arbitration).
Symptom: “Battery shows charge, but device shuts off suddenly”
- Rail evidence: observe Vmin at the moment of shutdown; verify if UV cutoff is reached at the load rail.
- Backup evidence: log SoC + Temp + Rint/ESR proxy (ΔV/ΔI); check if cold events coincide.
- ESR/internal resistance: SoC is high but V collapses under load; Rint_proxy is rising (aging) or Temp is low (cold).
- Gauge drift: SoC is optimistic; OCV/learning points are missing; mismatch between SoC and cutoff behavior is consistent.
- Cutoff policy: cutoff threshold is too aggressive; discharge current limit triggers early.
- Parameter: adjust discharge cutoff and pre-discharge limits with temperature compensation; enforce precharge path if needed.
- Layout: reduce path resistance (connectors, sense lines); validate NTC placement and contact.
- Policy: on backup, reduce non-essential loads first; reserve energy for logging and controlled shutdown.
Symptom: “Intermittent eFuse trip; recovers and then normal”
- Path evidence: read trip reason (OCP/short/inrush) if available; check efuse_trip_count and inrush_event_count.
- Rail/log evidence: Vmin + duration around the trip; power snapshots to see if load surged.
- Inrush: trips occur only at startup or load enable; consistent timing; blanking mismatch.
- True overload: trips correlate with rising Ppeak or thermal; repeatable under high load.
- Intermittent short: trips are random; may correlate with vibration, moisture, connector movement.
- Parameter: tune ILIM/tBlank; choose recovery mode (hiccup vs latch) matching safety policy.
- Layout: confirm SOA margin; shorten high-current loops; improve connector strain relief and creepage where needed.
- Policy: rate-limit retries; require a log record per retry to avoid silent looping.
Symptom: “Worse in cold; drops out / cannot cold-start”
- Rail evidence: Vmin during start; check AON rail continuity and whether PG/RESET sequencing collapses early.
- Backup evidence: Temp + SoC + Rint_proxy; confirm cold_event_count and cold_start_fail_count in logs.
- ESR at cold: large droop for the same ΔI; behavior improves after warming.
- Cutoff / precharge: discharge cutoff is not temperature-compensated; precharge path is insufficient.
- Sequencing/policy: pulse loads start too early in boot; AON is not protected.
- Parameter: add temperature-dependent limits; enforce precharge before enabling heavy rails.
- Layout: check NTC coupling; reduce series resistance on the backup path.
- Policy: defer pulse-load activation until health checks pass; lock “AON + logs” as first priority.
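The isolation step shared by these symptoms can be condensed into a first-pass classifier over evidence already present in the logs. This is a simplified sketch: the boolean inputs and the ordering of the checks are assumptions, and a real playbook would still confirm each class with the two measurements listed per symptom.

```python
def isolate(vmin_crossed: bool, on_backup: bool, soc_ok: bool,
            rint_high: bool, oring_switchover_delta: int) -> str:
    """First-pass cause class from log evidence; not a substitute for waveforms."""
    if vmin_crossed and on_backup and soc_ok and rint_high:
        # SoC looked adequate but the rail collapsed under load.
        return "backup ESR / cold"
    if vmin_crossed:
        # Rail crossed the PG/UVLO boundary on the main supply path.
        return "PSU transient"
    if oring_switchover_delta > 0:
        # Reset without a boundary crossing while the ORing path was switching.
        return "path switching"
    return "unclassified: inspect reset counters"
```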
Figure F10. A minimal-instrument decision tree that isolates failures into PSU transient, path switching, or backup ESR/cold categories, then applies the smallest viable fix.
H2-11. IC/BOM Pointers (Examples): What Parts Typically Appear Here
This chapter provides part categories + common example MPNs for security power and backup designs, without turning it into a shopping list. Each category includes why it is used, what parameters matter, and a replacement window to keep designs robust across supply changes.
A) AC-DC / Primary Power Conversion (Flyback / LLC / Sync-Rect)
Primary AC-DC stage feeding the main DC rails. This block dominates efficiency, transient recovery, hold-up behavior, and acoustic/EMI risk during light-load modes.
- Brownout resilience: better transient recovery reduces “mystery resets” during AC dips and flicker.
- Pulse-load stability: avoids rail sag when IR illuminator/heater/motor-equivalent loads step.
- Lower idle noise: prevents burst/skip artifacts that can couple into sensitive analog or audio paths.
- Flyback / QR / PSR controllers: TI UCC28730, TI UCC28740, TI UCC28780
- Integrated offline switchers (flyback family): Power Integrations InnoSwitch3 (INN3xx), Power Integrations TinySwitch-4 (TNYxxx)
- LLC / resonant controllers: TI UCC25640x (e.g., UCC256404), ST L6599A, onsemi NCP13992, NXP TEA2016/TEA2017 family
- Synchronous rectifier controllers: TI UCC24610 / UCC24612, Infineon IR11688S, onsemi NCP4306A, ST SRK2001
- Power range + peak load steps: max continuous power, peak pulse power, recovery time target.
- Standby / light-load behavior: efficiency at low load, burst/skip frequency placement (avoid sensitive bands).
- Hold-up interaction: allowed Vwindow at the DC bus, control stability near low line / low bus.
- Thermal margin: controller drive capability, SR timing margin, transformer/core heating headroom.
- Choosing on “efficiency headline” only, ignoring burst/skip noise that triggers field instability in edge devices.
- Underspecifying transient recovery: the PSU “passes nominal” but fails on IR/heater enable steps.
- SR controller timing not validated at light load → reverse current or oscillation-like behavior.
B) Secondary Regulation & Feedback (TL431/Opto or Primary-Side Regulation)
Feedback and regulation path that determines output accuracy, dynamic response, and behavior across load and temperature.
- Shunt reference: TI TL431 (and equivalents), onsemi TL431 family
- Optocouplers (feedback isolation): Vishay VO615A, Everlight EL817, Lite-On LTV-817
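- PSR controllers (no opto): TI UCC28730 and similar primary-side-regulated flyback families. TI UCC28740 and UCC28780 regulate through optocoupled feedback, so they pair with the TL431/opto path instead.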
- Loop dynamics vs pulse loads: recovery time and overshoot control during IR/heater steps.
- Accuracy budget: reference tolerance, opto CTR spread/aging, compensation margin.
- Power-fail behavior: how regulation behaves as input collapses (affects hold-up predictability).
- Opto CTR drift not accounted for → field units show rail shift, triggering UVLO/brownout thresholds unexpectedly.
- PSR designs not re-validated across transformer variation → unit-to-unit differences in hold-up and dropout behavior.
C) Power-Path Control (eFuse / Hot-Swap / Ideal-Diode ORing)
Between sources (adapter/backup) and loads: limits inrush, prevents reverse feed, and manages seamless or controlled switching.
- eFuse / hot-swap controllers: TI TPS25947, TI TPS25982, TI TPS2660, TI TPS2663
- Ideal diode / ORing controllers: Analog Devices LTC4412, Analog Devices LTC4359, TI TPS2410, TI LM74700
- Commonly used families that support adjustable current limiting and robust fault behavior in DC distribution rails.
- ORing controllers that reduce diode loss and help avoid backfeed during source transitions.
- Good fit for “no unexplained reset” goals: predictable trip behavior + measurable indicators (trip counts, flags).
- Current limit + blanking: ILIM, tBlank, dV/dt control for inrush and pulse-load enables.
- SOA / fault energy: survivability during short/overload until protection acts.
- Reverse blocking: prevents backfeed from backup into adapter rail (and vice versa).
- Recovery policy: latch vs hiccup; retry timing (must match system safety policy).
- Setting ILIM without checking SOA → “works in lab, trips in field” under hot conditions.
- ORing thresholds too close → source flapping; output droop despite “two sources present.”
- Inrush blanking too short → nuisance trips at boot or at IR/heater enable.
D) Backup Management (Charger / Power-Path / Fuel Gauge / Protector)
Backup source management and reporting: charging, safe discharge, cutoff policy, SoC/SoH credibility, and cold behavior.
- Single-cell power-path chargers: TI BQ25895, TI BQ25601, Microchip MCP73871, TI BQ24075
- Multi-cell / higher-power chargers: TI BQ25713, TI BQ24650, Analog Devices LTC4015, Analog Devices LTC4162 family
- Fuel gauge / coulomb counters: TI BQ27441, TI BQ27Z561, Analog Devices MAX17048, Analog Devices MAX17055
- Protection (battery protectors / monitors): TI BQ2970 (1-cell protector family), TI BQ77915 (multi-cell protector family), ABLIC S-8254 family, common 1-cell protector DW01A (various vendors)
- Power-path behavior: can the system run while charging, and how does it transition when input is removed?
- Precharge & cold-start: low-voltage battery handling, temperature windowing (NTC), restart policy.
- SoC credibility: coulomb accuracy, OCV correction/learning points, temperature compensation, drift detection.
- Discharge limits: cutoff thresholds, current limits, protection coordination with eFuse/ORing.
- Using SoC only (ignoring Temp/Rint proxy) → “SoC looks OK but shuts down” failures in cold or aged packs.
- Charger without robust power-path behavior → path switching causes droops and unpredictable resets.
- Protector thresholds not aligned with system UVLO → unstable cutoff loops near end-of-discharge.
E) Always-On Evidence Domain (RTC / Supervisor / Watchdog / Log Memory)
AON domain that survives disturbances long enough to timestamp and finalize power events. This turns power anomalies into forensics-grade evidence instead of “random resets.”
- RTC: Analog Devices DS3231 family, Microchip MCP7940N, NXP PCF8523, NXP PCF8563
- Supervisors / watchdogs: TI TPS3823, TI TPS3839, Microchip MCP1316, Analog Devices MAX6369 family
- FRAM (power-fail tolerant logs): Infineon/Cypress FM24CL64B, Fujitsu MB85RC256V, Ramtron-lineage FM24C256 (now Infineon/Cypress)
- EEPROM (if endurance is acceptable): Microchip 24LC256/24AA256, ST M24C64, onsemi CAT24C256
- MRAM (robust logs, if used): Everspin MR25H256 family
- Timestamp integrity: RTC accuracy, backup supply method, brownout behavior.
- Reset explainability: supervisor thresholds and delays aligned to rail behavior and policy.
- Log endurance: FRAM/MRAM for high-write event logs; EEPROM only if write rate is low and power-fail safe scheme is proven.
- Power-fail safety: write time, atomicity strategy, CRC/hash chain verification.
- EEPROM used for frequent logs without wear control → silent corruption and lost forensics value.
- Supervisor threshold set too tight → nuisance resets during acceptable dips; too loose → logs cannot finalize.
- RTC backup not treated as an AON requirement → missing timestamps during the exact incidents that matter.
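The log-endurance criterion can be turned into a quick wear-out estimate. The following is a rough sketch, assuming a ring buffer that spreads writes evenly across its slots; the endurance figure and write rate are placeholder inputs to be replaced with the datasheet value and the real event rate.

```python
def years_to_wearout(endurance_cycles, slots, writes_per_day):
    """Estimate years until the most-written location reaches its endurance.

    Assumes writes are spread evenly over `slots` locations (ring buffer).
    A single fixed location, such as a counter, corresponds to slots=1.
    """
    writes_per_slot_per_day = writes_per_day / slots
    return endurance_cycles / writes_per_slot_per_day / 365.0


# Example inputs (assumptions): 1e6-cycle memory, one record per minute.
fixed_location = years_to_wearout(1e6, slots=1, writes_per_day=1440)
ring_of_64 = years_to_wearout(1e6, slots=64, writes_per_day=1440)
print(f"fixed location: {fixed_location:.1f} years")
print(f"64-slot ring:   {ring_of_64:.1f} years")
```

Under these assumptions a single fixed location wears out in about two years, which is why frequent EEPROM logging needs wear control, while FRAM or MRAM endurance is typically high enough that the same write rate is not a concern.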
Figure F11. BOM categories mapped onto the system blocks: AC-DC control, rail regulation, eFuse/ORing power path, charger/gauge/protector backup management, and always-on evidence/log components.
H2-12. FAQs ×12 (each answer maps back to chapters)
Each answer follows the same evidence-first format: one-line conclusion, two measurements, and one first fix. Scope is limited to power, backup, and local logs—no PoE negotiation, no surge/ESD parts, no cloud/VMS platform.
Hold-up time meets spec on bench, but fails in cold mornings—why?
Cold usually reduces usable energy: capacitor/supercap ESR rises (and capacitance can fall), so the terminal voltage reaches the converter's minimum input sooner, and converter efficiency tends to drop near the low end of its input range. Measure (1) the output Vmin and hold-up duration during a controlled power-cut at cold temperature, and (2) the backup source ESR/terminal droop under the same pulse load. First fix: re-budget energy for cold worst-case and adjust the Vwindow/cutoff policy.
Maps to: H2-5 (Energy & ESR) / H2-9 (Cold validation)
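The cold re-budget can be approximated with the capacitor energy equation plus an ESR term. This is a simplified sketch, not a validated model: it assumes a constant-power converter, ignores the energy dissipated in the ESR itself, and uses made-up component values.

```python
def holdup_time_s(c_f, v_start, v_min_in, p_out_w, eff, esr_ohm):
    """Approximate hold-up time for a capacitor feeding a constant-power load.

    The converter stops when its input falls to v_min_in. Because of the
    ESR drop, the capacitor itself must still be above v_min_in at that
    point, which shrinks the usable voltage window.
    """
    p_in = p_out_w / eff               # power drawn from the capacitor
    i_end = p_in / v_min_in            # worst-case current at end of window
    v_cap_end = v_min_in + i_end * esr_ohm
    if v_cap_end >= v_start:
        return 0.0                     # no usable window at all
    usable_j = 0.5 * c_f * (v_start**2 - v_cap_end**2)
    return usable_j / p_in


# Assumed example: 1 F bank, 5.0 V -> 3.0 V window, 4.5 W load, 90 % efficiency.
warm = holdup_time_s(1.0, 5.0, 3.0, 4.5, 0.9, esr_ohm=0.1)
cold = holdup_time_s(1.0, 5.0, 3.0, 4.5, 0.9, esr_ohm=0.4)
print(f"warm: {warm:.2f} s, cold: {cold:.2f} s")
```

With these inputs, quadrupling ESR alone cuts the estimate from roughly 1.5 s to roughly 1.16 s, before any loss of capacitance or efficiency is counted. The bench measurement at cold remains the reference; the calculation only shows where the margin goes.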
IR illuminator turns on and the device reboots—PSU transient or eFuse trip?
This is either a PSU transient sag or a protection-path event; it can be proven quickly. Measure (1) main rail Vmin + recovery time and whether PG/RESET asserts before the reboot, and (2) the eFuse FAULT/IMON (or trip counter) at the same moment. First fix: if eFuse trips, tune tBlank and dV/dt; if PSU sags, improve transient response or load-step sequencing.
Maps to: H2-2 (Transient behavior) / H2-4 (eFuse) / H2-10 (Field isolate)
Battery shows 40% but system shuts down—SoC estimation or pack ESR?
SoC is an estimate; shutdown is triggered by terminal voltage under load, often dominated by aging/cold ESR. Measure (1) battery terminal droop during a known load step and derive an Rint proxy, and (2) fuel-gauge SoC with temperature at the same time. First fix: base cutoff on SoC + Temp + Rint (not SoC alone) and recalibrate learning points after pack aging.
Maps to: H2-6 (Gauge & cutoff) / H2-7 (Metrics) / H2-10 (Debug)
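The Rint proxy in this answer is just the voltage step divided by the current step. A minimal sketch follows, assuming a purely resistive pack model and example readings; real packs also show slower relaxation effects that this ignores.

```python
def rint_proxy_ohm(v_before, v_after, i_before, i_after):
    """Internal-resistance proxy from one load step (dV / dI)."""
    return (v_before - v_after) / (i_after - i_before)


def survives_pulse(v_open, rint_ohm, i_pulse_a, v_cutoff):
    """True if the predicted loaded terminal voltage stays above cutoff."""
    return v_open - i_pulse_a * rint_ohm > v_cutoff


# Assumed readings: terminal drops 3.80 V -> 3.50 V when load steps 0.2 A -> 1.2 A.
rint = rint_proxy_ohm(3.80, 3.50, 0.2, 1.2)
print(f"Rint proxy: {rint:.2f} ohm")
# A 2.5 A pulse from 3.75 V open-circuit against a 3.2 V cutoff:
print("survives pulse:", survives_pulse(3.75, rint, 2.5, 3.2))
```

In this example the gauge could still report a comfortable SoC while the predicted loaded voltage (about 3.0 V) falls below the cutoff, which is the failure the question describes.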
Adapter unplug causes a reset even with battery present—ORing path switching issue?
A reset on unplug usually means the output rail experiences a brief dip during ORing transition or priority handover. Measure (1) output rail Vmin exactly at adapter removal, and (2) ORing controller gate/STATUS (or source voltage crossing) to see if the path “flaps.” First fix: widen the handover margin (threshold/priority), and add or reposition a small hold-up capacitor to bridge the switching gap.
Maps to: H2-4 (ORing & path) / H2-5 (Hold-up)
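Sizing the bridging capacitor follows from C = I·Δt/ΔV, with the capacitor's own ESR step subtracted from the allowed droop. The sketch below uses assumed numbers for load current, switchover gap, and droop budget; the actual gap should come from the measured ORing handover.

```python
def bridge_cap_f(load_a, gap_s, allowed_droop_v, esr_ohm=0.0):
    """Capacitance needed to carry the load through a source-switchover gap.

    The instantaneous ESR step (I * ESR) consumes part of the droop budget;
    the remainder is available for the capacitor's discharge ramp.
    """
    ramp_budget_v = allowed_droop_v - load_a * esr_ohm
    if ramp_budget_v <= 0:
        raise ValueError("ESR step alone exceeds the allowed droop")
    return load_a * gap_s / ramp_budget_v


# Assumed example: 1.5 A load, 50 us handover gap, 0.2 V allowed droop.
ideal = bridge_cap_f(1.5, 50e-6, 0.2)
with_esr = bridge_cap_f(1.5, 50e-6, 0.2, esr_ohm=0.05)
print(f"ideal: {ideal * 1e6:.0f} uF, with 50 mOhm ESR: {with_esr * 1e6:.0f} uF")
```

Placement matters as much as value: the capacitor only helps if it sits on the load side of the ORing element, which is why the answer mentions repositioning as well as adding capacitance.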
Why does the LLC supply “chirp” at night and cause random faults?
“Chirp” is typically light-load burst/skip behavior; the resulting ripple/noise can cross sensitive thresholds and trigger nuisance resets. Measure (1) the light-load switching/burst frequency and rail ripple amplitude, and (2) the correlation to brownout/reset counters in logs. First fix: adjust light-load mode (controller settings or a small dummy load) and verify PG/UVLO thresholds are not overly tight.
Maps to: H2-2 (LLC light-load behavior) / H2-9 (Validation)
How to log brownout events without corrupting storage?
The log must be power-fail safe: write a record in an atomic way, then mark it committed. Measure (1) whether log CRC/hash-chain verifies after repeated power-cuts, and (2) whether each record contains Vmin, duration, temperature, SoC, counter. First fix: use FRAM/MRAM when possible and implement a two-phase commit (write data → write commit flag) in the always-on domain.
Maps to: H2-8 (Event logs integrity) / H2-3 (AON domain)
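The two-phase commit and integrity check can be prototyped on a host before moving to firmware. The sketch below simulates the log memory with a `bytearray`; the record layout, the 8-byte truncated SHA-256 link, and the `0xA5` commit marker are all assumptions chosen for illustration, not a defined format.

```python
import hashlib
import struct
import zlib

# Hypothetical record: Vmin (mV), duration (ms), temp (0.1 C), SoC (%), counter.
PAYLOAD = struct.Struct("<HIhBI")
LINK_LEN = 8                              # truncated hash of previous record
SLOT = PAYLOAD.size + LINK_LEN + 4 + 1    # payload + link + CRC32 + commit flag
COMMITTED = 0xA5


def _link(body):
    return hashlib.sha256(body).digest()[:LINK_LEN]


def write_record(mem, slot, prev_link, fields, fail_before_commit=False):
    """Write data first, commit flag last. Returns the link for the next record."""
    body = PAYLOAD.pack(*fields) + prev_link
    base = slot * SLOT
    mem[base + SLOT - 1] = 0x00           # invalidate the slot before touching it
    mem[base:base + SLOT - 1] = body + struct.pack("<I", zlib.crc32(body))
    if fail_before_commit:                # simulated power loss after phase 1
        return prev_link
    mem[base + SLOT - 1] = COMMITTED      # phase 2: single-byte commit
    return _link(body)


def read_valid(mem, nslots):
    """Return records up to the first uncommitted, corrupt, or unchained slot."""
    expected, out = bytes(LINK_LEN), []
    for s in range(nslots):
        rec = bytes(mem[s * SLOT:(s + 1) * SLOT])
        body, crc, flag = rec[:-5], rec[-5:-1], rec[-1]
        if flag != COMMITTED or struct.unpack("<I", crc)[0] != zlib.crc32(body):
            break
        if body[PAYLOAD.size:] != expected:
            break
        out.append(PAYLOAD.unpack(body[:PAYLOAD.size]))
        expected = _link(body)
    return out


mem = bytearray(SLOT * 4)
link = write_record(mem, 0, bytes(LINK_LEN), (3100, 120, -55, 40, 1))
link = write_record(mem, 1, link, (2900, 80, -60, 38, 2), fail_before_commit=True)
print(read_valid(mem, 4))                 # only the committed record is returned
```

A record interrupted before the commit byte is simply ignored on the next boot, and the same slot can be rewritten. On real hardware the commit write must itself be atomic at the device level, which is one reason the answer prefers FRAM/MRAM over page-erased memories.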
eFuse trips occasionally with no short—what two signals prove inrush vs overload?
Inrush is a short, high peak at enable; overload is sustained current beyond the limit. Measure (1) input current waveform peak + width during the trip window, and (2) eFuse FAULT reason/IMON (or trip counter + latch/hiccup state). First fix: for inrush, tune dV/dt and tBlank; for overload, raise headroom or stagger load enables and validate SOA margin.
Maps to: H2-4 (eFuse selection/tuning) / H2-10 (Decision tree)
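The peak-plus-width distinction can be applied to a captured current waveform with a simple rule. This sketch is a heuristic only: `max_inrush_s` is a hypothetical threshold that would be set from the expected capacitor charge time, and the sample data is synthetic.

```python
def classify_trip(samples_a, dt_s, i_limit_a, max_inrush_s):
    """Classify an over-current capture as inrush, overload, or neither.

    samples_a:    current samples around the trip window, in amps
    dt_s:         sample interval in seconds
    i_limit_a:    eFuse current limit
    max_inrush_s: longest over-limit duration still considered inrush (assumed)
    """
    longest = run = 0
    for amps in samples_a:
        run = run + 1 if amps > i_limit_a else 0
        longest = max(longest, run)
    if longest == 0:
        return "no_overcurrent"
    return "inrush" if longest * dt_s <= max_inrush_s else "overload"


# Synthetic captures at 10 us per sample, 2.0 A limit, 1 ms inrush allowance.
enable_spike = [0.1] * 5 + [3.0] * 20 + [0.5] * 100   # 200 us above limit
sustained = [0.5] * 10 + [2.5] * 500                  # 5 ms above limit
print(classify_trip(enable_spike, 10e-6, 2.0, 1e-3))
print(classify_trip(sustained, 10e-6, 2.0, 1e-3))
```

The classification should be cross-checked against the second signal named in the answer (FAULT reason, IMON, or trip counter), since the waveform alone cannot show which protection mechanism actually fired.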
What’s the minimum set of health metrics that’s actually useful?
A minimal set should answer three questions: “Is input getting worse?”, “Is backup aging?”, and “Is the path misbehaving?” Record (1) brownout count + Vmin statistics, (2) backup SoH/Rint trend + cold events, and (3) eFuse trip + ORing switch counts. First fix: tie every alert to a concrete log snapshot (Vmin/time/temp/SoC) so field issues are explainable.
Maps to: H2-7 (Health metrics)
How do you validate backup runtime without waiting hours every time?
Use a two-layer method: fast predictive checks plus periodic full runs. Measure (1) average power at key modes (idle, record, event burst) and (2) cutoff voltage behavior near end-of-discharge to estimate runtime from the energy budget. First fix: define “segment tests” (boot, event logging, safe shutdown) as pass gates, then run full-duration tests only on sampled builds and temperature corners.
Maps to: H2-9 (Validation matrix)
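The fast predictive check amounts to a duty-weighted energy budget. Here is a minimal sketch, assuming measured per-mode power, an estimated duty split, and a single derating factor for cold and aging; all numbers are placeholders.

```python
def runtime_hours(usable_wh, modes, derate=1.0):
    """Estimate backup runtime from an energy budget.

    usable_wh: energy available between full and cutoff, in watt-hours
    modes:     list of (power_w, duty_fraction); fractions must sum to 1
    derate:    multiplier for cold/aging capacity loss (assumed, 0..1)
    """
    total_duty = sum(duty for _, duty in modes)
    if abs(total_duty - 1.0) > 1e-6:
        raise ValueError("duty fractions must sum to 1")
    avg_w = sum(power * duty for power, duty in modes)
    return usable_wh * derate / avg_w


# Assumed profile: idle 2 W (70 %), record 4 W (25 %), event burst 10 W (5 %).
profile = [(2.0, 0.70), (4.0, 0.25), (10.0, 0.05)]
print(f"nominal: {runtime_hours(20.0, profile):.1f} h")
print(f"cold/aged (0.7 derate): {runtime_hours(20.0, profile, derate=0.7):.1f} h")
```

The estimate is only as good as the cutoff behavior it assumes, so the periodic full-duration runs remain the way to confirm that the predicted end-of-discharge point matches what the hardware actually does.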
Supercap backup vs battery backup—what’s the real deciding factor?
The real decision is power vs time vs temperature, not “capacity on paper.” Measure (1) required backup power and allowed voltage window (Vhi→Vlo), and (2) cold performance via ESR/terminal droop under pulse load. First fix: choose supercap for seconds-to-minutes with high pulse power and long cycle life; choose battery for tens of minutes-to-hours and plan for aging, cutoff, and maintenance events.
Maps to: H2-5 (Backup options)
Device survives outage but loses timestamp—what domain must be always-on?
If the timestamp is missing, the always-on domain did not survive long enough to keep the RTC running and finalize the log. Measure (1) the AON rail during outage and recovery (does it dip below RTC/supervisor requirements?), and (2) log records for missing RTC time or monotonic counter gaps. First fix: power RTC + log memory from AON, and enforce a “finalize window” before hard reset.
Maps to: H2-3 (AON & sequencing) / H2-8 (Forensic logging)
After replacing the battery, logs look inconsistent—what must be reset vs preserved?
Preserve forensic continuity (identity, monotonic counter, hash-chain seed), but reset battery-dependent learning state. Measure (1) whether the log integrity chain stays continuous across maintenance, and (2) gauge parameters (capacity/impedance model) before and after replacement. First fix: write a “maintenance event” record, reset or relearn SoC/SoH baselines as designed, and keep non-repudiation fields unchanged to avoid breaking audit trails.
Maps to: H2-8 (Integrity & continuity) / H2-6 (Gauge relearn)
Figure F12. A repeatable FAQ format that forces each answer to land on measurable evidence (two signals) and a minimal fix, while ensuring power incidents produce forensics-ready local logs.