EMC, Safety & Energy Metering Subsystem

Central Thesis: This page distills a reusable, evidence-based EMC + safety + metering subsystem—covering protection, isolation/leakage safety, accuracy under noise, and brownout-safe event logging—so teams can diagnose failures fast and reduce field uncertainty without redesigning each product from scratch.

It focuses on measurable targets, “first two measurements,” and minimum viable logs that turn ESD/EFT/surge, leakage trips, and metering drift/tamper into actionable root-cause paths.

H2-1. Boundary & Architecture: What this subsystem is (and is not)

This subsystem page defines a reusable evidence chain for EMC protection, isolation/leakage safety, energy metering, and event recording. It is designed to be cited by device pages via interface points and acceptance checks—without turning into device-specific architecture.

  • Clear scope contract
  • Reusable deliverables
  • Interface points to cite
  • Acceptance-style wording

1.1 What this page delivers (reusable deliverables)

The subsystem is written as a set of portable building blocks. Each block is specified using the same engineering language: interface point → evidence to capture → acceptance check. This keeps the page reusable while preventing scope creep.

Protection (TVS/MOV/GDT/filters)

Defines protection zoning, clamp placement rules, and selection dimensions (clamp behavior, energy handling, parasitics), plus evidence points to prove “no reset / no damage / no silent corruption”.

Cite by: entry node + residual spike + reset/log check
Isolation & Leakage Safety

Defines barrier type boundaries, PCB keepout/slot intent, and leakage monitoring discriminators (true leakage vs transient common-mode events), with evidence and acceptance wording.

Cite by: barrier boundary + leakage loop + false-trip discriminator
Energy Metering (sensing + AFE/SoC)

Defines where to sense, which sensor class fits the constraints, and how accuracy survives EMI/noise and temperature drift. Focuses on drift budget, not just “typical accuracy”.

Cite by: metering tap + noise coupling + drift budget checks
Event Recording (black-box logging)

Defines minimum viable log fields, brownout-safe write strategy, timestamp trust model, and a forensics workflow: “2 waveforms + 3 counters + 1 log dump” to isolate the root cause fast.

Cite by: log continuity + reset cause + export point
Non-overlap rule: Device pages may reference only the interface points and acceptance checks from this subsystem. They must not duplicate full theory sections. This page avoids device-specific power trees, motor/control topics, protocol stacks, and cloud/app content.

1.2 Where it applies (AC/DC/PoE/SELV entry contexts)

The subsystem is anchored at the “system boundary” where external energy and disturbances enter. It supports AC mains front-ends, low-voltage DC entry (12V/24V), and PoE/SELV interfaces by describing what to clamp, what to filter, what to isolate, what to measure, and what to log—without assuming a specific appliance.

AC mains
  • Primary risks: surge energy, leakage/touch safety, insulation margin stress.
  • Subsystem emphasis: entry protection stack + leakage monitoring discriminators + event log continuity.
  • Citeable interface points: AC entry clamp node, barrier boundary, fault log export.
DC (12V/24V) & long harness
  • Primary risks: EFT bursts, ground bounce, cable-coupled common-mode spikes.
  • Subsystem emphasis: zoning + filtering + evidence-first verification.
  • Citeable interface points: I/O protection at connector, sensitive-rail droop evidence, reset/log counters.
PoE / SELV
  • Primary risks: interface ESD/EFT, common-mode noise coupling across PHY barriers.
  • Subsystem emphasis: interface clamp + isolation boundary hygiene + metering tap noise immunity.
  • Citeable interface points: port-side clamp point, barrier parasitic coupling control, logging of link drops.
Mixed domains
  • When metering shares ground with noisy power: accuracy must be protected by layout, filtering, and drift budget.
  • Subsystem emphasis: “return path design” + “metering trust loop” + “black-box logging”.
  • Citeable interface points: metering tap + drift checks + tamper/log fields.

1.3 What it explicitly does NOT cover (scope contract)

This page does not replace a device’s system design. It will not provide device-specific schematics, control loops, or protocol-stack walkthroughs. If a reader needs those, the correct approach is to reference the appropriate device page.

  • Not covered: device architecture deep dives (e.g., motor drives, HVAC thermodynamics), protocol stacks, cloud/app/OS tutorials.
  • Allowed only as one-line handoff: “For device-specific integration details, see the sibling page.”
Reason: Keeping this page strictly subsystem-level prevents duplicated content across Smart Home & Appliances subpages and keeps each child page vertically unique.

1.4 How sibling pages should cite this subsystem (interface + acceptance)

Each sibling page should cite this subsystem using a consistent “fill-in template” so readers can verify outcomes without repeating theory.

Cite template (copy into sibling pages)
  • Interface point: Identify the node/boundary (entry clamp node / isolation barrier / metering tap / log export).
  • Evidence to capture: Two waveforms + counters + a log excerpt (example set listed below).
  • Acceptance wording: Use outcome-based checks (“no reset”, “no silent corruption”, “log continuity preserved”).

Evidence primitives

  • Waveforms: connector residual spike, sensitive-rail droop, common-mode burst envelope.
  • Counters: reset cause, watchdog reason, comm error counter, protection trip count.
  • Logs: event_id, timestamp/sequence, peak/duration, affected rail/interface, firmware version/hash.
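
The log fields listed above can be sketched as a small record type. This is a minimal illustration, not a prescribed format: the class name `EventRecord`, the field units, the JSON encoding, and the trailing CRC32 are all assumptions chosen for readability.

```python
from dataclasses import dataclass, asdict
import json
import zlib


@dataclass(frozen=True)
class EventRecord:
    """One black-box log entry; fields mirror the evidence primitives."""
    event_id: int        # hypothetical numeric code for the event type
    sequence: int        # monotonic counter, usable even if the clock is untrusted
    timestamp_s: float   # RTC time in seconds (assumed unit)
    peak: float          # peak of the measured quantity
    duration_us: int     # event duration in microseconds (assumed unit)
    affected: str        # affected rail or interface, e.g. "3V3_AFE"
    fw_hash: str         # firmware version or build hash


def encode(record: EventRecord) -> bytes:
    """Serialize the record and append a CRC32 so a torn write is detectable."""
    payload = json.dumps(asdict(record), sort_keys=True).encode("utf-8")
    return payload + zlib.crc32(payload).to_bytes(4, "big")


def decode(blob: bytes) -> EventRecord:
    """Rebuild the record, rejecting any blob whose CRC does not match."""
    payload, crc = blob[:-4], int.from_bytes(blob[-4:], "big")
    if zlib.crc32(payload) != crc:
        raise ValueError("CRC mismatch: record is corrupt or partially written")
    return EventRecord(**json.loads(payload))
```

A record that fails `decode` is itself evidence: it marks the point where a write was interrupted, which is worth counting rather than silently discarding.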
[Figure: Subsystem Map (Reusable Blocks). Interface points + evidence + acceptance checks. Blocks: AC/DC/PoE system entry (interface: entry node); Protection (TVS/MOV/GDT, CMC/filter; evidence: residual spike); Isolation (barrier + CMTI, leakage safety, hot/cold domains; evidence: false-trip discriminator); Metering (shunt/CT/AFE; evidence: drift budget); MCU/SoC (counters, trip logic, sanity checks); Event Log (NVM + RTC, ring buffer, CRC, export); Service/debug log export (interface: export).]
Figure F1. Reusable subsystem blocks and citeable interface points. Device pages should reference these nodes and acceptance checks, not duplicate subsystem theory.
Suggested caption: “EMC/Safety/Metering subsystem map with protection zoning, isolation barrier, metering tap, and event logging.”

H2-2. Threat Model & Compliance Targets: turning standards into design targets

Threat names are not design inputs. This chapter converts ESD/EFT/surge/leakage/hi-pot into measurable targets, evidence points, and subsystem-level countermeasure strategies.

  • Coupling-path taxonomy
  • Outcome-based targets
  • First evidence to measure
  • Threat→chapter mapping

2.1 Taxonomy by coupling path (the map that matters)

EMC and safety failures repeat because energy enters through a small set of coupling paths. Classifying threats by where energy couples produces stable, reusable design rules that remain valid across different devices.

Direct discharge / clamp path (ESD)

Fast rise-time discharge couples into I/O, chassis seams, and exposed metal. The design goal is to clamp at the boundary and control the return path.

Primary: clamp placement + return path
Burst injection on harness / I/O (EFT)

Repeated bursts couple into long cables, relay wiring, and connector pins. The design goal is to prevent rail droop, false triggers, and silent corruption.

Primary: filtering + zoning + isolation
High-energy entry surge (surge)

Large energy couples through the power entry network. The design goal is to share energy across protection elements and verify post-stress aging signals.

Primary: MOV/GDT/TVS + fuse/limiter
Leakage / insulation degradation (safety)

Leakage increases via Y-cap paths, humidity/contamination, and insulation wear. The design goal is to monitor leakage and discriminate real faults vs transients.

Primary: monitoring + discrimination

2.2 Converting threats into outcome-based targets (verify, don’t guess)

Targets should be written as outcomes that can be verified on a bench with minimal ambiguity. Absolute standard levels vary by product class, so the subsystem uses stable acceptance language that device pages can bind to their final test plan.

  • ESD target: no hang/reset, no permanent damage, and no silent state corruption (log continuity preserved).
  • EFT target: no false trips, no communications collapse, and no lost events (counters and logs remain coherent).
  • Surge target: entry protection remains within safe temperature/leakage drift; device continues normal operation without degraded safety margin.
  • Leakage/hi-pot target: leakage faults are detected reliably; transient common-mode events do not cause chronic false trips.
Acceptance phrasing rule: Use “shall / shall not” outcome checks and tie each to a concrete evidence set (waveform + counter + log field). Avoid device-specific claims or certification procedure walkthroughs on this page.
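
One way to read the phrasing rule is that every "shall / shall not" line becomes a boolean tied to one piece of evidence. The sketch below assumes an ESD run and uses hypothetical names (`Evidence`, `esd_acceptance`) and a hypothetical pin abs-max limit; it is an illustration of the pattern, not a test procedure.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    residual_spike_v: float  # waveform: peak at the connector after the clamp
    reset_count_delta: int   # counter: resets observed during the stress run
    log_gap_count: int       # log field: missing sequence numbers after the run


def esd_acceptance(ev: Evidence, pin_abs_max_v: float) -> dict:
    """Map each outcome check to the evidence item that proves it."""
    return {
        "shall not exceed pin abs-max": ev.residual_spike_v <= pin_abs_max_v,
        "shall not reset": ev.reset_count_delta == 0,
        "shall preserve log continuity": ev.log_gap_count == 0,
    }


result = esd_acceptance(Evidence(4.8, 0, 0), pin_abs_max_v=5.5)
passed = all(result.values())
```

Keeping the checks as named entries makes a failed run self-describing: the failing key states which outcome was violated and which evidence item to inspect.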

2.3 First evidence to capture (two measurements first)

To keep debugging deterministic, each threat category starts with the same evidence discipline: capture the boundary electrical stress, capture the internal consequence, and verify log continuity.

Threat | Coupling node | First evidence | Acceptance check
ESD (fast discharge) | I/O, chassis seams, exposed metal | 1) connector residual spike; 2) sensitive-rail droop / reset-cause | No reset/hang; log continuity intact
EFT (burst) | long harness, relay wiring, connector | 1) ground bounce / rail droop; 2) error counters + false trip count | No false trips; counters coherent
Surge (high energy) | power entry network | 1) entry element temp/leakage drift; 2) post-stress log + function check | No degraded safety margin
Leakage/Hi-pot (safety) | Y-cap paths, insulation, humidity | 1) leakage trend vs environment; 2) trip discriminator evidence | Detect real faults; avoid chronic false trips

2.4 Threat → Coupling → Countermeasure → Evidence (chapter map)

Countermeasures are specified as strategy combinations, not single parts. Each strategy maps to later chapters where placement, selection dimensions, and verification points are defined in detail.

[Figure: Threat → Coupling → Countermeasure → Evidence. Design inputs must be measurable and portable.]
  • ESD (fast discharge): I/O + chassis seams (boundary coupling) → TVS clamp + return path, zoning + placement, filter only if needed → residual spike, reset cause, log continuity.
  • EFT (burst injection): harness + relays (ground bounce) → filter + isolation, zoning to protect rails, avoid false trips → rail droop, error counters, trip count.
  • Surge (high energy): entry network (energy flow) → MOV/GDT/TVS stack, fuse/limiter coordination, post-stress aging checks → temp / leakage, function check, log preserved.
Chapter links: ESD → H2-3/H2-4/H2-8 • EFT → H2-3/H2-8/H2-6 • Surge → H2-5 • Leakage/HiPot → H2-7/H2-6
Figure F2. Threat categories mapped to coupling nodes, strategy combinations, and first evidence. This keeps compliance targets measurable and prevents device pages from repeating subsystem theory.
Suggested caption: “Threat-to-evidence mapping for ESD/EFT/surge/leakage with countermeasure strategy blocks.”
Next-step usage: Subsequent chapters define the “how” for each strategy (placement, selection, and verification points). Device pages should reference only the relevant interface nodes and acceptance checks, keeping content non-overlapping.

H2-3. Protection Zoning: where to clamp, where to filter, where to isolate

Most field failures come from poor zoning and uncontrolled return paths—not from “weak parts”. This chapter turns protection into a repeatable layout rule set: 3 zones + return-path priority + two-point evidence.

  • 3-zone definition
  • Return-path priority
  • Clamp vs filter decision
  • Two-point evidence

3.1 The 3-zone model (Entry / Interface / Sensitive)

Protection must be designed as space + current loops. Each zone has a different goal and a different “allowed” component set. Mixing goals across zones is the fastest way to create instability.

Entry Zone

The boundary where energy enters (AC/DC/PoE). The goal is to handle higher energy and keep large currents out of internal reference planes. Typical elements: MOV/GDT, higher-energy TVS, CMC, fuse/limiter.

Do not place entry protection deep inside
Interface Zone

The boundary for connectors and I/O. The goal is fast clamping and controlled impedance shaping without injecting noise into sensitive rails. Typical elements: low-C TVS arrays, series R / ferrite, small RC.

TVS must sit at the connector boundary
Sensitive Zone

Where state and accuracy live (MCU, AFE, metering, clocks). The goal is stable reference and rail integrity. Typical elements: local decoupling, quiet reference routing, isolation boundary.

No protection return current through sensitive ground
Core rule: A protection component is “correct” only if it forces the transient current loop to close in the intended zone. The component value matters less than the loop.

3.2 Return-path priority (ground / chassis / PE)

Return path is the real design knob. The same TVS can be stable or unstable depending on where the discharge current is forced to flow.

  • Priority 1: High-current ESD/surge energy returns to chassis/PE (if available) using the shortest and widest path.
  • Priority 2: If PE is not present, return to the nearest boundary reference point without crossing the sensitive reference area.
  • Priority 3: Sensitive references connect to noisy returns only through a controlled tie (single-point / constrained coupling).
Keep-out intent: Do not allow the “protection loop” to share copper with metering/AFE references. A transient current divides among all available loops according to their impedance, so any shared path carries part of the strike into the sensitive reference.

3.3 Clamp-first vs filter-first (decision conditions)

“Clamp first” and “filter first” are both valid—when used in the correct zone and aligned to signal constraints. Use a stable decision based on bandwidth and energy, not habit.

Decision | Use when | Typical strategy
Clamp-first (peak control) | Fast edges and high peak stress (ESD-like), strict pin abs-max, or unknown external environments. Goal: reduce peak before anything enters the board. | TVS at boundary + shortest return path, then optional shaping (small R/CMC) downstream.
Filter-first (bandwidth) | Signal bandwidth is sensitive to TVS capacitance (some high-speed, touch/audio), and the main risk is common-mode injection / ground bounce. | Series R / ferrite / CMC to slow di/dt + low-C TVS at the boundary to catch residual peaks.
Failure hint: If adding TVS makes the system less stable, suspect capacitance + return-path injection before suspecting “TVS quality”. This maps directly to H2-4 (Cj + layout parasitics).
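
The decision conditions above can be written down as a small function so that the choice is recorded rather than made by habit. The function name, the boolean inputs, and the precedence (capacitance sensitivity checked first) are assumptions made for this sketch; the source lists the conditions but does not rank them.

```python
def choose_order(cj_sensitive: bool, common_mode_dominant: bool,
                 fast_high_peak: bool, strict_abs_max: bool,
                 unknown_environment: bool) -> str:
    """Return 'filter-first', 'clamp-first', or 'review' for one port."""
    # Filter-first: bandwidth is sensitive to TVS capacitance AND the main
    # risk is common-mode injection / ground bounce.
    if cj_sensitive and common_mode_dominant:
        return "filter-first"
    # Clamp-first: any peak-stress condition applies.
    if fast_high_peak or strict_abs_max or unknown_environment:
        return "clamp-first"
    # No condition matched: flag the port for an explicit engineering review.
    return "review"


touch_port = choose_order(True, True, False, False, False)
button_port = choose_order(False, False, True, False, True)
```

Either result still keeps a TVS at the boundary; the decision only sets what the transient meets first and how much capacitance the line can tolerate.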

3.4 Checklist + evidence (two measurements first)

A zoning plan is only real if it can be proven by two measurements: boundary stress and internal consequence. Capture both, then verify state continuity.

Checklist (placement & routing)
  • Protection order matches zone: connector boundary → clamp → shaping → sensitive.
  • TVS-to-connector trace is short; return path is wide and local.
  • Filter reference is consistent (no “ground hop” between elements).
  • Sensitive zone keep-out prevents protection-loop current from crossing.
Evidence (two points + state)
  • Point A: connector residual spike after clamp (boundary evidence).
  • Point B: sensitive-rail droop/noise envelope (internal evidence).
  • State: reset-cause / watchdog reason / error counters / log continuity.
[Figure: 3-Zone Protection Layout Concept. Clamp at boundary, control return path, protect sensitive rails. Entry zone: connector, MOV, GDT, CMC, return to chassis/PE. Interface zone: low-C TVS, R/FB, RC. Sensitive zone: MCU/AFE, sensitive rail, log/counters. Keep-outs: no long loops; no clamp current through AFE GND. Measurement points: Point A (residual spike), Point B (rail droop/noise).]
Figure F3. 3-zone protection concept. Clamp at the connector boundary, keep the high-current loop out of sensitive references, and prove the zoning with two measurements (boundary residual spike + sensitive-rail droop/noise).
Suggested caption: “Three-zone protection zoning with controlled return path and keep-out regions.”

H2-4. ESD & TVS Selection: clamp behavior + capacitance + robustness

TVS selection is not “pick the highest power”. It is a trade among clamp behavior, dynamic resistance, capacitance, leakage, and layout parasitics. This chapter provides a decision system and evidence checks to avoid unstable ports and hidden degradation.

  • Parameter-to-risk mapping
  • Interface-specific Cj budgeting
  • Combination strategies
  • ESD evidence checks

4.1 Translate TVS parameters into engineering risk

A TVS datasheet only becomes useful when each parameter is mapped to what can break in the system. The key is to distinguish “survives” from “stays stable and accurate”.

  • Vclamp: sets the peak voltage that reaches the protected node. Too high means the IC sees stress even if the TVS survives.
  • Rd (dynamic resistance): determines how much Vclamp rises at high peak current. Larger Rd means worse real-world clamping.
  • Ipp / peak power: relates to single-event survivability, but does not guarantee “no reset / no silent corruption”.
  • Cj (junction capacitance): loads signal lines and can create instability (touch drift, audio distortion, high-speed margin loss).
  • Reverse leakage: impacts standby power and bias errors; leakage drift with temperature can create field-only issues.
Rule of thumb: If a port becomes unstable after adding TVS, the first suspects are Cj and return-path injection, not Ipp.
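
The relationship between Rd and the real clamp level can be made concrete with a first-order linear model, V_clamp ≈ V_BR + Rd × I_peak. The model and the numbers below (6 V breakdown, 30 A peak) are illustrative assumptions, not values from any datasheet.

```python
def clamp_voltage(v_breakdown: float, r_dynamic: float, i_peak: float) -> float:
    """First-order estimate of the clamp level at a given peak current."""
    return v_breakdown + r_dynamic * i_peak


# Two hypothetical parts with the same breakdown voltage but different Rd.
low_rd = clamp_voltage(v_breakdown=6.0, r_dynamic=0.5, i_peak=30.0)   # 21.0 V
high_rd = clamp_voltage(v_breakdown=6.0, r_dynamic=1.0, i_peak=30.0)  # 36.0 V
```

Under these assumptions, doubling Rd adds 15 V at the protected pin for the same strike, which is why Rd often matters more than the headline peak-power figure.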

4.2 Interface-specific Cj budgeting (low-C vs robust)

Capacitance budget should be set by interface sensitivity. Use low capacitance where signal integrity is fragile, and prioritize robustness where it is not.

Interface | Main constraint | Typical protection combo
High-speed (margin) | Capacitance and impedance discontinuity | CMC/series shaping + low-C TVS at boundary; verify eye/margin indirectly via error counters
Touch / audio / sensors (stability) | Cj affects drift/distortion and noise injection | Low-C TVS array + series R/FB; keep return local; prove with noise envelope + no false triggers
Low-speed GPIO / buttons (robust) | Less sensitive to capacitance, more sensitive to direct pin stress | Robust TVS + optional series R; focus on clamp + shortest loop
Power lines (energy) | Energy handling dominates; Cj rarely the limiting factor | TVS + MOV/GDT (as needed) + fuse/limiter; verify post-stress leakage/temperature drift
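
A rough capacitance budget can be derived by treating the source impedance and the TVS junction capacitance as a single-pole RC low-pass, f_c = 1 / (2πRC). This is a simplification (it ignores trace, connector, and receiver capacitance), and the 50 Ω and 15 pF figures are assumed example values.

```python
import math


def corner_frequency_hz(source_ohms: float, cj_farads: float) -> float:
    """Single-pole RC corner frequency formed by source impedance and Cj."""
    return 1.0 / (2.0 * math.pi * source_ohms * cj_farads)


def max_cj_farads(source_ohms: float, required_bw_hz: float) -> float:
    """Largest Cj that keeps the RC corner at or above the required bandwidth."""
    return 1.0 / (2.0 * math.pi * source_ohms * required_bw_hz)


robust_tvs = corner_frequency_hz(50.0, 15e-12)   # about 212 MHz
low_c_tvs = corner_frequency_hz(50.0, 0.5e-12)   # about 6.4 GHz
```

In practice the usable budget is smaller than `max_cj_farads` suggests, because the TVS shares the budget with every other capacitance on the line.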

4.3 Combination strategies (TVS + R/FB/CMC)

Protection stability often improves when a small impedance element reduces di/dt and ringing, leaving the TVS to clamp the remaining peak. The combination must be placed within the correct zone (H2-3).

  • TVS + series R: limits peak current and adds damping; useful when the line tolerates small series impedance.
  • TVS + ferrite bead: shapes high-frequency energy and reduces injection into sensitive rails; verify it does not create resonance with line capacitance.
  • TVS + CMC: suppresses common-mode energy on paired lines; helps prevent burst-induced ground bounce from becoming a functional failure.
Placement rule: Put the clamp at the boundary. Place shaping elements so they do not push the high-current loop into the sensitive zone.

4.4 Evidence + failure pattern (TVS placed too far)

A classic failure is “TVS exists but the IC still gets hit” because trace inductance turns distance into voltage. The fix is usually placement and loop control, not a larger TVS.
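
How distance turns into voltage follows from V = L × di/dt. The sketch below assumes a rule-of-thumb trace inductance of about 1 nH per mm and approximates the strike as a linear ramp to 30 A in 0.8 ns, roughly the first peak of an 8 kV contact discharge under IEC 61000-4-2. All of these are assumed inputs for illustration, not measurements.

```python
def inductive_overshoot_v(trace_mm: float, nh_per_mm: float,
                          i_peak_a: float, rise_time_s: float) -> float:
    """Voltage developed across trace inductance during a linear current ramp."""
    inductance_h = trace_mm * nh_per_mm * 1e-9
    return inductance_h * i_peak_a / rise_time_s


far = inductive_overshoot_v(10.0, 1.0, 30.0, 0.8e-9)   # TVS 10 mm away
near = inductive_overshoot_v(2.0, 1.0, 30.0, 0.8e-9)   # TVS 2 mm away
```

Under these assumptions a 10 mm path adds roughly 375 V on top of the TVS clamp level for the duration of the edge, while 2 mm adds roughly 75 V, which is why placement usually beats a larger part.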

Failure pattern
  • ESD applied at connector → occasional reset, port damage, or latent instability.
  • Measured residual spike at connector remains high; ringing persists.
  • TVS is placed far from connector; return path crosses internal planes.
Repair sequence
  • Move TVS to the connector boundary; shorten TVS-to-connector trace.
  • Force return current to close locally (chassis/PE or boundary reference).
  • Add small series R/FB if signal constraints allow; re-check evidence.
[Figure: TVS Selection: Clamp + Capacitance + Robustness. ESD pulse (fast rise-time) → TVS clamp block (Vclamp, Rd, Cj, robustness, leakage) → residual to protected IC (pin abs-max), with the clamp path closing to return. Note: Cj can reduce margin and inject noise. Evidence checks: residual spike + rail droop + reset cause / counters + log continuity.]
Figure F4. TVS selection is a trade: clamp behavior (Vclamp/Rd), capacitance (Cj), leakage, and layout parasitics. Verify with electrical evidence (residual spike + rail droop) and system evidence (reset cause, counters, log continuity).
Suggested caption: “TVS clamp and capacitance concept with evidence checks for stable protection.”
Scope-safe reminder: This chapter defines selection dimensions and verification evidence, not device-specific schematics or protocol details. Device pages should cite the decision rules and evidence checks only.

H2-5. EFT & Surge Front-End: MOV/GDT/TVS + fuse + inrush as a system

Treat EFT and surge as energy management, not single-part selection. A stable front-end uses coordinated roles: MOV absorbs energy, GDT handles high current, TVS clamps fast edges, while fuse and inrush control keep the system predictable.

  • EFT vs surge behavior
  • Entry stack as a system
  • MOV aging & fuse coordination
  • Post-test acceptance checks

5.1 EFT vs surge (repeated small hits vs one big hit)

EFT tends to create functional instability through repeated fast transients and coupling on harness/entry nodes. Surge is dominated by energy and heat, where survivability and controlled failure (safe disconnect) matter most.

Threat | What typically breaks | Primary design focus
EFT (repetition) | Resets, false triggers, comm drops, noisy rails | Coupling control + filtering + zoning + evidence on counters/log continuity
Surge (energy) | Overheating, MOV degradation, leakage drift, insulation stress | Energy distribution + thermal margin + fuse/limiter coordination + post-test drift checks
Scope-safe rule: This page defines the reusable front-end stack behavior and acceptance evidence, not any appliance-specific power tree.

5.2 Typical entry protection stack (series path + shunt path)

A robust front-end separates the series path (normal current flow and inrush control) from the shunt path (transient energy absorption). The stack works only when energy is diverted at the boundary and return paths are short and controlled (see zoning rules in H2-3).

Series path (normal operation)
  • Fuse / breaker → sets the ultimate “safe disconnect”.
  • Inrush control (NTC / limiter) → prevents repetitive stress and nuisance trips.
  • EMI element (CMC / filter) → reduces burst injection into internal rails.
  • DC bus → feeds downstream converters / loads (not expanded here).
Shunt path (transient energy)
  • MOV absorbs surge energy (watch aging and heat).
  • GDT handles extreme current when applicable (coordination matters).
  • TVS clamps fast edges and residual spikes close to the protected node.
  • Return closes to chassis/PE or the entry reference node (do not cross sensitive references).
[Figure: AC/DC Entry Protection Stack. Series path (normal current): input (AC/DC/PoE) → fuse/breaker → inrush (NTC/limiter) → EMI (CMC/filter) → DC bus. Shunt path (transient energy): MOV (absorbs energy), GDT (high current), TVS (fast clamp) → return to chassis/PE. Energy flow should be diverted at the boundary (Entry Zone). After-test checklist: temp rise, leakage, insulation, reset/log continuity.]
Figure F5. Entry protection must be treated as a coordinated stack: series path controls normal current and inrush, shunt path diverts transient energy to chassis/PE or an entry reference node. Validate not only survival, but post-test drift (temperature rise, leakage, insulation, logs).
Suggested caption: “AC/DC entry protection stack with energy diversion and post-test checklist.”

5.3 Reliability & coordination (MOV aging, thermal, fuse interaction)

Front-end robustness is a lifecycle problem. MOVs can degrade with repeated surges and temperature, and coordination with a fuse/breaker defines whether failures remain safe.

  • MOV aging: repeated stress can shift clamp behavior and increase leakage; monitor drift rather than assuming “still OK”.
  • Thermal runaway risk: if energy is repeatedly dumped into a hot MOV without a disconnect path, temperature rise accumulates; because MOV leakage rises with temperature, the heating can become self-sustaining.
  • Fuse coordination: the system should disconnect safely under abnormal energy, rather than leaving a degraded shunt element as a leakage/heat source.
  • TVS role: clamp fast residual spikes; do not use it as the primary energy absorber for large surge energy.
Acceptance mindset: “No reset” is not enough. After stress, verify leakage drift, temperature rise, and log continuity to avoid field-only failures.
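
The post-stress check can be sketched as a before/after comparison against per-metric drift limits. The metric names and limit values here are hypothetical placeholders; real limits come from the device's own test plan.

```python
def drift_report(before: dict, after: dict, max_delta: dict) -> dict:
    """Return {metric: (delta, within_limit)} for every metric with a limit."""
    report = {}
    for name, limit in max_delta.items():
        delta = after[name] - before[name]
        report[name] = (delta, abs(delta) <= limit)
    return report


# Hypothetical readings taken before and after a surge test sequence.
pre = {"leakage_ua": 5.0, "temp_rise_c": 20.0}
post = {"leakage_ua": 9.0, "temp_rise_c": 21.0}
limits = {"leakage_ua": 2.0, "temp_rise_c": 5.0}

report = drift_report(pre, post, limits)
```

In this example the unit would pass a "no reset" check yet fail on leakage drift, which is exactly the silent degradation the acceptance mindset is meant to catch.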

5.4 Do / Don’t (placement and return discipline)

Do

  • Place MOV/GDT/TVS in the Entry Zone, as close to the boundary as possible.
  • Provide a short, wide return to chassis/PE or an entry reference node.
  • Keep hot components away from sensitive references and isolation keepouts.
  • Record post-test drift (temp/leakage/insulation) and keep event logs continuous.

Don’t

  • Don’t place shunt absorbers deep on the board (energy will travel through internal copper first).
  • Don’t let transient return current cross metering/AFE ground references.
  • Don’t rely on “high Ipp” TVS as a surge-energy solution.
  • Don’t ignore leakage drift after stress (degradation can be silent).

H2-6. Isolation Barrier: selecting the right isolation and keeping it real in layout

Isolation is a boundary discipline: device choice + creepage/clearance intent + CMTI behavior + isolated power noise control. This chapter focuses on the isolation “anatomy” and evidence that the barrier stays real under common-mode transients.

  • Isolation type boundaries
  • CMTI-driven evidence
  • Layout keepout/slot intent
  • Parasitic coupling control

6.1 Isolation types and boundaries (digital / analog / power)

Isolation choices should follow the signal and accuracy needs, not a default part. The barrier must be consistent for data and power.

  • Digital isolators: control and data paths; focus on delay/bandwidth/power and CMTI behavior under fast common-mode edges.
  • Optocouplers: suitable for some slower paths; watch long-term drift/aging and ensure the layout preserves the barrier.
  • Isolated amplifiers / isolated AFE: used when analog fidelity and common-mode range are key drivers.
  • Isolated DC-DC: defines whether the power domain truly respects the barrier or rebuilds a bridge across it.
Scope guard: Terms like working voltage and lifetime are treated as design labels here (no certification procedure details).

6.2 Key specs that predict field behavior (CMTI, working voltage, lifetime terms)

Instead of memorizing labels, link each specification to what can be measured and what can fail.

Spec label | Engineering meaning | Evidence to watch
CMTI (transients) | Common-mode transient tolerance without false switching | Bit/CRC errors, dropouts, unexpected resets, counter/log continuity
Working voltage (long-term) | Sustained stress boundary across the barrier (aging risk) | Post-stress leakage trend and long-run stability (no silent drift)
Isolation rating (label) | A capability label; meaningful only if layout preserves the barrier | Layout keepout/slot integrity + no unintended coupling paths

6.3 Layout reality: keepout, slots, and parasitic coupling paths

Layout determines whether the barrier is real. Any copper crossing the boundary (or running too close) creates a parasitic capacitor that can bypass isolation under fast edges.

  • Keepout intent: prevent copper, stitching, and long traces from bridging hot-to-cold domains.
  • Slots: interrupt surface leakage paths (lengthening the creepage path) and shrink the parasitic coupling channel.
  • Crossing traces: avoid routing signals or planes that create a predictable “capacitive bridge”.
  • Controlled coupling: if a coupling element is required, keep it explicit and predictable (not accidental).

6.4 Isolated power noise: don’t rebuild a bridge across the barrier

Isolated DC-DC can silently defeat an isolation barrier if its return currents or filtering loops cross domains. Keep power loops local to each side and avoid large area loops spanning the barrier.

  • Input filtering of isolated DC-DC should close its loop on the hot side without injecting into the cold reference.
  • Output decoupling should close its loop on the cold side without spanning the barrier.
  • Verify common-mode events do not translate into rail droop or false switching across the barrier.

6.5 Evidence under common-mode transients (bit errors, resets, log continuity)

Isolation must be verified as system behavior. Under common-mode stress, the barrier is “real” only if counters and logs remain consistent while data paths stay error-free.

Primary evidence
  • Link counters: CRC/bit errors, dropouts, unexpected retransmits.
  • System state: reset cause / watchdog / brownout indicators.
  • Continuity: event log remains monotonic and complete across stress.
Failure hint
  • Errors or resets often indicate a bypass path via parasitic capacitance, not an “isolation chip issue”.
  • Fix is usually keepout/slot discipline and power-loop control, then re-check evidence.
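
The "monotonic and complete" continuity check can be sketched over the sequence numbers pulled from a log dump. The function name and return shape are assumptions, and counter wraparound is deliberately not handled in this minimal version.

```python
def check_continuity(sequences: list) -> dict:
    """Report ordering violations and missing ranges in a list of sequence numbers."""
    gaps = []          # inclusive (first_missing, last_missing) ranges
    out_of_order = []  # (previous, current) pairs that repeat or go backwards
    for prev, cur in zip(sequences, sequences[1:]):
        if cur <= prev:
            out_of_order.append((prev, cur))
        elif cur != prev + 1:
            gaps.append((prev + 1, cur - 1))
    return {
        "monotonic": not out_of_order,
        "complete": not gaps and not out_of_order,
        "gaps": gaps,
        "out_of_order": out_of_order,
    }


clean = check_continuity([10, 11, 12, 13])
lossy = check_continuity([10, 11, 14, 15])
```

Here `lossy["gaps"]` is `[(12, 13)]`: two events were lost during the stress window, which points to an interrupted write path even if no reset was recorded.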
[Figure: Isolation Barrier Anatomy. Device choice + layout keepout/slots + CMTI evidence + isolated power loop control. Hot domain (line/entry side, noisy common-mode) and cold domain (logic/sensing side: MCU/AFE/meter, counters/log) are separated by the isolation barrier, crossed only by the isolator (digital/analog) and the isolated DC-DC. Parasitic capacitance is shown as the unintended coupling path for common-mode transients; layout keepout and slots reduce it. Evidence: no bit errors, no resets, log continuity under transients.]
Figure F6. Isolation must remain real under common-mode transients. Parasitic capacitance can bypass the barrier if keepouts/slots and power loops are not controlled. Validate with error counters, reset causes, and event-log continuity.
Suggested caption: “Isolation barrier anatomy with parasitic coupling and CMTI evidence.”
Scope-safe reminder: This chapter defines isolation boundaries and verification evidence. Exact distances, certification procedures, and device-specific schematics are intentionally out of scope.

H2-7. Leakage & Touch Safety: monitoring, limiting, and diagnosing leakage paths

Turn leakage and touch-safety risks into measurable engineering evidence. Map leakage loops, choose monitoring that matches the path, and distinguish nuisance trips from real insulation degradation using two-point measurements and logs.

  • Leakage path map
  • RCD/CT/insulation monitoring boundaries
  • Nuisance trips vs missed detection
  • Field debug checklist (symptom → evidence → isolate)

7.1 Leakage path map (from hot side to touchable metal / cold side)

Leakage is a loop, not a single node. Start by identifying whether the path is intentional (EMI components), conditional (moisture/contamination), or aging/fault driven (degraded insulation). Each class has different symptoms and different evidence.

Path class | Typical cause | Most common symptom
Intentional (predictable) | Y-cap / EMI coupling to chassis or reference | Stable baseline leakage; changes with input conditions more than humidity
Conditional (environment) | Moisture film, dust, salt, surface contamination | Worse in humid conditions; improves after drying/cleaning
Aging / fault (degradation) | Insulation damage, creepage path pollution, MOV/entry drift | Trend worsens over time; may correlate with temperature or stress events
Evidence-first rule: Always capture the leakage waveform class (spike vs baseline trend) and correlate with environmental conditions and event logs before changing parts.

7.2 Monitoring options (RCD/GFCI, CT/differential sensing, insulation monitoring)

Monitoring must match the leakage loop. A trip device proves a differential imbalance, but not necessarily the root path. A sensing chain provides trend and correlation—only if it can separate real leakage from transient common-mode injection.

RCD / GFCI (residual-current trip)
  • Best for catching true differential leakage loops.
  • May nuisance-trip under fast transients if the system injects common-mode spikes into the sensing loop.
  • Use logs and two-point measurements to separate “spike” events from baseline trend.
CT / differential leakage sensing (measurement + action)
  • Enables leakage trend tracking and event records (time-correlated with resets and EMI).
  • Requires bandwidth/filters that avoid treating fast CM bursts as real leakage.
  • Evidence output should include peak, RMS/mean, duration, and timestamp.
Insulation monitoring (health trend)
  • Helps detect degradation paths before catastrophic failure.
  • Most valuable when correlated to humidity/contamination and post-stress drift.
  • Focus on trend and repeatability, not one-time readings.
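The evidence output named for the CT/differential sensing chain (peak, RMS/mean, duration, timestamp) can be sketched as a small reduction over one capture window. This is a minimal illustration, not a prescribed implementation: the function name, the milliamp units, and the threshold parameter are all assumptions introduced here.

```python
import math
from dataclasses import dataclass

@dataclass
class LeakageEvidence:
    timestamp_s: float   # start time of the capture window
    peak_ma: float       # largest absolute sample in the window
    mean_ma: float       # arithmetic mean (baseline indicator)
    rms_ma: float        # root-mean-square over the window
    duration_ms: float   # total time spent above the threshold

def summarize_leakage(samples_ma, sample_period_ms, timestamp_s, threshold_ma):
    """Reduce one capture window to the evidence fields (hypothetical helper)."""
    if not samples_ma:
        raise ValueError("empty capture window")
    n = len(samples_ma)
    peak = max(abs(s) for s in samples_ma)
    mean = sum(samples_ma) / n
    rms = math.sqrt(sum(s * s for s in samples_ma) / n)
    # Duration counts every sample whose magnitude exceeds the threshold.
    above = sum(1 for s in samples_ma if abs(s) > threshold_ma)
    return LeakageEvidence(timestamp_s, peak, mean, rms, above * sample_period_ms)

# Example window: flat baseline with one short spike (values are illustrative).
evidence = summarize_leakage([0.2, 0.2, 3.0, 0.2], sample_period_ms=1.0,
                             timestamp_s=12.5, threshold_ma=1.0)
```

Keeping peak and RMS/mean as separate fields is what later allows a spike-only event to be told apart from a rising baseline.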

7.3 Nuisance trips vs missed detection (common sources and discriminators)

The highest-risk failure mode is confusing transient injection with true insulation leakage. Use discriminators that separate spike-only events from baseline drift and link both to entry stress and EMI coupling.

Nuisance trip (false)

  • Transient-driven spike: short pulses aligned with switching/entry events.
  • EMI capacitor injection: higher baseline but stable and repeatable across humidity.
  • CM burst coupling: sensor chain sees spikes while insulation trend stays flat.
Two-point discriminator: capture the leakage waveform at the sensor + capture entry noise at the same timestamp. If spikes align, treat as coupling/return-path issue first (see H2-8).

Missed detection (danger)

  • Slow drift: leakage baseline rises over days/weeks (aging/contamination).
  • Humidity sensitivity: strong correlation with moisture/condensation cycles.
  • Post-stress change: drift appears after surge/EFT events and stays elevated.
Trend evidence: track baseline RMS/mean over time + environment tags (humidity/temperature) + after-test drift checks (H2-5).
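The two discriminators above (spike alignment for nuisance trips, baseline trend for missed detection) can be combined in a short classifier. The sketch below is only an illustration under stated assumptions: the alignment window, the drift ratio, and the three return labels are hypothetical values chosen for the example, not thresholds from this page.

```python
def classify_trip(leak_spike_times_ms, entry_spike_times_ms,
                  baseline_rms_history_ma,
                  align_window_ms=1.0, drift_ratio=1.2):
    """Return 'baseline_drift', 'coupling', or 'inconclusive' (assumed labels)."""
    if not baseline_rms_history_ma:
        raise ValueError("baseline history is required")
    # A leakage spike is "aligned" if an entry-noise spike falls within the window.
    aligned = sum(
        1 for t in leak_spike_times_ms
        if any(abs(t - e) <= align_window_ms for e in entry_spike_times_ms)
    )
    spikes_aligned = bool(leak_spike_times_ms) and aligned == len(leak_spike_times_ms)
    first, last = baseline_rms_history_ma[0], baseline_rms_history_ma[-1]
    baseline_rising = first > 0 and last / first >= drift_ratio
    if baseline_rising:
        return "baseline_drift"   # check the dangerous case before anything else
    if spikes_aligned:
        return "coupling"         # treat as coupling/return-path issue first
    return "inconclusive"

result = classify_trip([10.0, 50.2], [10.3, 50.0], [0.20, 0.21, 0.20])
```

Baseline drift is tested first so that a rising trend is never masked by coincident transient spikes.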

7.4 Evidence fields (engineering records without certification procedures)

Record what makes leakage explainable. The goal is reproducible evidence that links leakage behavior to environment and to subsystem events.

  • Environment tags: humidity/temperature, “dry vs wet” surface condition, contamination notes.
  • Leakage waveform class: spike peak, duration, repetition; baseline RMS/mean trend.
  • Action & state: trip/no-trip, response delay category, reset causes (if any).
  • Correlation: entry stress events, EMI sampling points, rail ripple, event-log continuity timestamp.
Scope guard: This chapter focuses on evidence and diagnosis. Test standards, thresholds, and certification steps are intentionally out of scope.

7.5 Field debug checklist (symptom → two measurements → isolate → first fix)

Use a consistent field workflow. Each symptom starts with two measurements and ends with an isolation decision.

Symptom | Two measurements first | Isolation decision & first fix
Occasional trips (sporadic) | Leakage sensor waveform + entry noise timestamp alignment | Spike-aligned → treat as coupling/return-path first (H2-8); check entry stack behavior (H2-5)
Missed detection (risk) | Baseline trend (RMS/mean) + sensor chain headroom (no saturation/offset) | Trend present → suspect insulation/aging path; add environmental tagging; verify post-stress drift
Worse in humidity (environment) | Humidity/temperature log + surface-path indicator (repeatability after drying) | Strong correlation → surface contamination/moisture loop; prioritize sealing/cleanliness and keepout discipline
[Diagram: Leakage Current Loop. Hot/entry (line side) couples to touchable metal (accessible surface) through the Y-cap (intentional leakage) and through moisture/contamination paths; the return path (residual loop) runs via chassis/reference, where the CT/RCD residual detection is placed. The return should avoid crossing sensitive references.]
Figure F7. Leakage is a loop: intentional coupling (Y-cap) and conditional surface paths (moisture/contamination) can route current to touchable metal. Place residual sensing on the return loop and classify evidence as spike events vs baseline drift.
Cite this figure Suggested caption: “Leakage current loop with Y-cap and surface paths, plus CT/RCD detection point.”

H2-8. EMI Filtering & Grounding: designing the return path, not just adding parts

EMI control is return-path control. Identify common-mode vs differential-mode behavior, place filters to shrink loop area, and verify improvements using two measurements first: entry noise plus sensitive-rail ripple, correlated to resets, metering error, and logs.

  • CM vs DM quick mapping
  • Placement rules (short, executable)
  • Chassis/PE/logic ground principles
  • Two measurements first + correlation

8.1 Common-mode vs differential-mode (recognize first, then filter)

Adding parts without identifying the mode often moves noise rather than reducing it. Use mode recognition to choose the right element and the right placement.

Mode | What it looks like | Primary tools
Common-mode (CM) (harness/chassis) | Noise rides on multiple conductors together; couples to chassis and radiates | CMC + controlled return to chassis/PE (concept level) + placement at boundary
Differential-mode (DM) (loop current) | Noise is driven by high di/dt loops; dominated by loop area | X-cap/LC/RC (concept level) + minimize loop area + close return paths
Key principle: EMI filtering is loop shaping. The “best part” fails if the loop area stays large or the return path becomes accidental.

8.2 Placement rules (short, executable)

These rules keep the filter effective and prevent protection elements from turning into injection paths.

  • Boundary placement: place entry/interface filters at the boundary so noise is contained before it enters sensitive domains.
  • Close the return: capacitor return loops must be short and local; long returns create big radiating loops.
  • CMC discipline: a CMC needs a controlled return strategy; otherwise it shifts CM energy into sensitive references.
  • Minimize high di/dt loop area: shrink current loops rather than stacking parts deep inside the board.
  • Keep noise out of metering references: prevent transient return currents from crossing measurement/AFE reference nodes.

8.3 Ground / chassis / PE principles (high-level, device-agnostic)

Grounding is about defining where current is allowed to return. Separate roles at a principle level: chassis/PE anchors high-frequency return behavior, while logic/measurement reference must remain stable and protected from large transient currents.

  • Chassis/PE: serves as a high-frequency return anchor and safety reference point (concept level).
  • Logic/measurement ground: defines ADC/meter references; keep transient return currents from crossing it.
  • Controlled connections: any connection between chassis/PE and logic reference must be intentional and predictable, not accidental.
Out-of-scope: exact topology choices and physical distances are not prescribed here; only principles and evidence are provided.

8.4 Two measurements first (entry noise + sensitive-rail ripple) and correlation

EMI success must show up in both the electrical domain and the system domain. Start with two measurements and correlate them to symptoms and logs.

Measurement 1: Entry noise proxy
  • Capture at the entry boundary (conducted noise sampling point or near-field proxy).
  • Look for mode changes (CM vs DM behavior) after placement adjustments.
Measurement 2: Sensitive-domain ripple
  • Capture rail ripple and reference noise near metering/AFE/MCU.
  • Verify the ripple improvement aligns with reductions in resets, metering drift, and log anomalies.
Correlation evidence
  • Reset cause / watchdog counters do not increase under stress.
  • Metering error and leakage false-trip rates decrease.
  • Event log remains continuous with stable timestamps.
[Diagram: Return Path Concept. Good case: signal and return routed close together (small loop area), CMC at the boundary, capacitor with a short return. Bad case: return takes a detour (big loop, radiation and injection), CMC too far from the boundary, capacitor with a long return. Two measurements first: entry noise proxy and sensitive-rail ripple, correlated to resets, metering, and logs.]
Figure F8. EMI filtering effectiveness depends on return-path control and loop area. Keep signal and return close, place filters at the boundary, and validate with two measurements first: entry noise plus sensitive-domain ripple, correlated to resets, metering drift, and log continuity.
Cite this figure Suggested caption: “Return-path control: small loop vs big loop with boundary filter placement.”

H2-9. Energy Metering Architecture: where to sense, how to keep accuracy under noise

Metering accuracy is architecture: tap location + sensor type + front-end chain + noise coexistence. The goal is not only “accurate when clean,” but “continuous and explainable under switching noise and stress.”

  • Tap options (AC / DC / high-side / low-side)
  • Shunt vs CT vs Rogowski selection
  • Front-end chain choice logic
  • Accuracy killers + evidence checklist

9.1 Where to sense: AC line vs DC bus, high-side vs low-side

Start with the tap location. The same sensor behaves very differently depending on whether it sits on the AC line, on the DC bus, or on a high-side/low-side node where return currents and common-mode stress differ.

Tap option | Best when | Primary risk | Key evidence
AC line tap (often CT) | Energy accounting across varied loads; isolation boundary is explicit | CM stress and boundary coupling under EFT/surge and switching noise | Waveform stability + noise floor vs phase alignment
DC bus tap (often shunt) | Compact, cost-controlled, high resolution at low currents | Ground bounce and high di/dt return currents contaminating references | Differential sense waveform + rail ripple correlation
High-side (CM-aware) | System-true measurement without sharing load return currents | Common-mode range / isolation complexity and layout coupling | CM transient immunity + missing-sample flags
Low-side (simpler) | Simpler sensing chain and easier integration | Return-path pollution; sensitive references see transient currents | Step-current test + baseline drift under switching
Boundary rule: If the tap crosses a safety/isolation boundary, plan the isolation and return-path strategy first, then pick the sensor.

9.2 Sensor choice: shunt vs CT vs Rogowski (engineering tradeoffs)

Pick sensors by what breaks accuracy in the real environment: overload behavior, low-current resolution, bandwidth needs, and EMI sensitivity. Avoid choosing by headline datasheet ratings alone.

Sensor | Strength | Main limitation | Typical fit
Shunt (direct) | Strong low-current resolution; straightforward calibration | Self-heating and return-path/ground-bounce sensitivity | DC bus or controlled domains needing repeatable accuracy
CT (robust) | Wide current range; overload tolerance in many cases | Magnetic saturation/external field sensitivity; placement matters | AC line taps and high-range current measurement
Rogowski (fast) | Good for fast-changing currents; non-saturating coil concept | Needs an integrator/AFE chain; low-current noise can dominate | High di/dt environments where bandwidth matters
Practical shortcut: Large current range with frequent overload → CT/Rogowski direction; tight low-current accuracy + easy recalibration → shunt direction. In heavy switching-noise environments, return-path control determines success more than the sensor type (see H2-8).

9.3 Front-end chain: ΣΔ modulator vs metering AFE vs integrated SoC

The front-end chain must produce evidence, not only measurements. Prefer chains that expose saturation flags, missing-sample counters, coefficient integrity checks, and stable timestamping under stress.

ΣΔ modulator chain
  • Strong fit when isolation and high resolution are required.
  • Works best with a disciplined return path and controlled references.
  • Evidence focus: overload recovery and continuity flags.
Metering AFE
  • Good for integrated energy computation and calibration support.
  • Evidence focus: calibration coefficient CRC and internal status flags.
  • Use when “explainable accuracy” is a primary requirement.
Integrated metering SoC
  • Great for size/cost integration when evidence export is sufficient.
  • Evidence focus: counters/log fields must remain accessible.
  • Avoid if stress behavior cannot be observed and recorded.

9.4 Keeping accuracy under noise (and not losing samples)

Accuracy fails in three ways: the input saturates, the reference is injected, or the system loses continuity. Treat metering as a chain that must stay stable through switching noise and entry stress.

Failure mode | What it looks like | First isolation action | Proof field
Input saturation (overload) | Clipped waveforms; step error after transients; slow recovery | Limit/clamp before the sensitive input; keep stress in the entry zone | Saturation flag + residual error after event
Reference injection (return path) | Noise tracks switching state; baseline shifts with return currents | Fix return path and loop area; keep transient currents out of references | Rail ripple correlation + error reduction
Continuity loss (missing samples) | Missing/duplicated samples; timestamp jumps; drift after resets | Add continuity evidence (counters/flags) and align logs to events | Sequence counter + log continuity
Two measurements first: entry noise proxy + sensitive rail ripple. Accuracy must improve in both, and system evidence (resets/logs) must confirm it.
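Continuity loss is the failure mode that is easiest to prove from data alone. As a rough sketch, assuming the front-end tags each sample with a wrapping sequence counter (the 16-bit width and the function name are assumptions made for this example), missing and duplicated samples can be counted like this:

```python
def check_continuity(seq_counters, modulo=2**16):
    """Count missing and duplicated samples from a wrapping sequence counter.

    A backwards jump is indistinguishable from a large forward gap here and
    is reported as missing samples; that simplification is an assumption.
    """
    missing = 0
    duplicated = 0
    for prev, cur in zip(seq_counters, seq_counters[1:]):
        step = (cur - prev) % modulo   # modular difference handles wraparound
        if step == 0:
            duplicated += 1
        elif step > 1:
            missing += step - 1
    return {"missing": missing, "duplicated": duplicated}

# Counter wraps at 65535 → 0, repeats 1 once, then skips 2 and 3.
report = check_continuity([65534, 65535, 0, 1, 1, 4])
```

The resulting counts map directly onto the "Sequence counter + log continuity" proof field.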

9.5 Selection matrix + “accuracy killers” checklist

Use a compact decision matrix for early architecture selection, then verify against the most common accuracy killers.

Constraint | If true… | Lean toward | Watch out
Very wide current range | Overload is frequent and energy accounting is needed | CT / Rogowski direction | Magnetic effects; placement sensitivity
High low-current resolution | Standby / low-load accuracy is critical | Shunt direction | Self-heating; ground-bounce injection
Isolation boundary present | Tap crosses safety boundary | Isolated chain + evidence export | Parasitic coupling and CM stress
High EMI environment | Switching rails and stress events coexist | Return-path-first design | Saturation + missing samples
Accuracy killers (most common): ground bounce, temperature drift/self-heating, magnetic saturation/external fields, layout coupling to high di/dt loops, input overload recovery, and timestamp/sample discontinuity.
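For early architecture reviews, the selection matrix can be kept as a small lookup so that every true constraint returns both its direction and its caution. The sketch below simply encodes the four rows of the matrix; the key names and the function are hypothetical.

```python
# Rows of the selection matrix: constraint → (lean toward, watch out).
SELECTION_MATRIX = {
    "very_wide_current_range": ("CT / Rogowski direction",
                                "Magnetic effects; placement sensitivity"),
    "high_low_current_resolution": ("Shunt direction",
                                    "Self-heating; ground-bounce injection"),
    "isolation_boundary_present": ("Isolated chain + evidence export",
                                   "Parasitic coupling and CM stress"),
    "high_emi_environment": ("Return-path-first design",
                             "Saturation + missing samples"),
}

def recommend(constraints):
    """Return (lean_toward, watch_out) for each constraint flagged True."""
    unknown = set(constraints) - set(SELECTION_MATRIX)
    if unknown:
        raise KeyError(f"unknown constraints: {sorted(unknown)}")
    # Iterate in matrix order so the output is deterministic.
    return [SELECTION_MATRIX[name] for name in SELECTION_MATRIX
            if constraints.get(name)]

picks = recommend({"high_emi_environment": True, "very_wide_current_range": True})
```

Several constraints can be true at once, so the result is a list of directions to reconcile rather than a single verdict.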
[Diagram: Metering Tap Options. Three panels: AC line (L/N) with CT feeding an AFE/meter; DC bus with shunt (high-side or low-side, differential sense) feeding a ΣΔ/AFE; conductor with Rogowski coil feeding an integrator and AFE/meter. Each panel carries an "ISO?" marker. Tip: mark the isolation boundary early; accuracy under noise depends on return path, overload recovery, and sample continuity.]
Figure F9. Metering tap options: AC-line CT, DC-bus shunt (high-side/low-side), and Rogowski coil. Isolation needs must be decided at the architecture stage, before picking the front-end chain.
Cite this figure Suggested caption: “Metering tap options and early isolation markers for robust accuracy.”

H2-10. Calibration, Drift & Anti-Tamper: making metering trustworthy over life

Trust comes from a closed loop: calibration evidence, drift discrimination, tamper signals, and event logs that survive stress. This chapter focuses on engineering evidence and fields—without legal or certification claims.

  • Factory calibration vs in-field self-check
  • Drift sources + discriminators
  • Tamper signals → sensing → log fields
  • Evidence-to-log mapping + no overclaim

10.1 Calibration strategy: factory calibration vs in-field self-check

Factory calibration provides controlled reference conditions and repeatability. In-field self-check is a sanity loop that detects “no longer trustworthy” behavior and triggers service evidence—without pretending to replace calibration.

Method | Primary purpose | Required evidence fields | Failure indicator
Factory calibration (repeatable) | Establish coefficients under controlled conditions | Cal version/date, coefficient CRC, reference tags | CRC mismatch; coefficient change without authorization
In-field self-check (sanity) | Detect drift, abnormal behavior, and continuity loss | Trigger reason, pass/fail, deviation magnitude, temperature tag | Deviation trend; missing-sample anomalies; reset-aligned jumps
Design intent: Self-check produces evidence and service triggers. It is not a compliance claim and does not replace calibrated references.
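The coefficient CRC listed as a required evidence field can be illustrated with a short integrity check. This is a sketch under assumptions: the three coefficient names, the little-endian float layout, and the choice of CRC-32 (via Python's `zlib.crc32`) are all picked for the example; a real metering AFE defines its own format and checksum.

```python
import struct
import zlib

def pack_coefficients(gain, offset, phase_comp):
    """Serialize coefficients in a fixed little-endian layout (assumed)."""
    return struct.pack("<fff", gain, offset, phase_comp)

def seal(blob):
    """Append the CRC-32 of the blob so later reads can be verified."""
    return blob + struct.pack("<I", zlib.crc32(blob))

def verify(sealed):
    """Return True only if the stored CRC matches the recomputed one."""
    if len(sealed) < 4:
        return False
    blob = sealed[:-4]
    (stored,) = struct.unpack("<I", sealed[-4:])
    return zlib.crc32(blob) == stored

sealed = seal(pack_coefficients(1.0023, -0.0041, 0.12))
corrupted = bytes([sealed[0] ^ 0x01]) + sealed[1:]   # flip one bit
```

A failed `verify` corresponds to the "CRC mismatch" failure indicator; it shows the coefficients changed, not why they changed.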

10.2 Drift sources and discriminators (temperature, aging, magnetic effects)

Drift is diagnosable when it is categorized and correlated. Treat drift sources as classes with observable signatures, then decide what must be logged for explainability.

Drift class | Most likely contributor | Signature | First proof field
Temperature (T-correlated) | Shunt/refs/AFE offsets; self-heating | Slow drift strongly correlated to temperature | Temp tag + baseline trend slope
Aging (time) | Material changes, solder/mechanical stress, entry drift | Long-term trend; step change after stress events | Post-stress delta + trend persistence
Magnetic effects (CT/Rogowski) | Saturation or external field bias (CT core) | Waveform distortion at specific current/shape; phase anomalies | Waveform anomaly flag + operating current tag
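Two of the drift signatures above can be separated with simple arithmetic: a post-stress step delta, and the correlation between error and temperature. The following sketch assumes error is logged in ppm alongside a temperature tag; the correlation limit, the step limit, and the class labels are hypothetical values for illustration. Magnetic effects need waveform evidence and are not covered here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either series is flat."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def classify_drift(error_ppm, temp_c, post_stress_delta_ppm,
                   corr_limit=0.8, step_limit_ppm=200.0):
    """Return 'aging_or_stress', 'temperature', or 'unclassified' (assumed labels)."""
    # A persistent step after a stress event is checked first.
    if abs(post_stress_delta_ppm) >= step_limit_ppm:
        return "aging_or_stress"
    if abs(pearson(error_ppm, temp_c)) >= corr_limit:
        return "temperature"
    return "unclassified"

verdict = classify_drift([0, 50, 100, 150], [25, 35, 45, 55], post_stress_delta_ppm=10.0)
```

An "unclassified" result is itself evidence: it says the logged fields are not yet sufficient to explain the drift.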

10.3 Anti-tamper: signals → sensing → log fields

Anti-tamper is evidence. For each tamper category, define one sensing method and one log field so that service can distinguish user behavior, environment influence, and true tampering intent.

Tamper category | One sensing method | One required log field | Caution
Cover open / enclosure (physical) | Cover switch / optical / Hall (concept) | tamper_cover_open + timestamp | Proves access, not intent
Magnetic interference (field) | Mag sensor or waveform consistency statistics | tamper_magnetic_bias + magnitude | Environment can be a confounder
Bypass / reverse (routing) | Direction/consistency checks (V/I plausibility) | tamper_bypass_or_reverse | Needs robust plausibility logic
Waveform anomaly (stats) | Outlier statistics (shape/harmonics/power factor) | tamper_waveform_anomaly + score | Avoid false positives under EMI
Chain integrity (integrity) | Coefficient CRC + missing-sample counters | meter_chain_integrity | Must survive resets and stress
Evidence principle: Each tamper signal must be time-correlated to metering behavior and recorded as fields, not as a vague “tampered” state.

10.4 Evidence → event log mapping (what must be recorded)

Trust requires that evidence survives stress events. Record calibration state, self-check results, drift deltas, tamper flags, and continuity counters with stable timestamps so the service readout can reconstruct what happened.

  • Calibration: version/date, coefficient CRC, authorization state, reference tags.
  • Self-check: trigger reason, pass/fail, deviation magnitude, temperature tag.
  • Drift evidence: baseline trend slope, post-stress delta, persistence indicator.
  • Tamper: category field + magnitude/score + duration + recovery marker.
  • Integrity: missing-sample counter, timestamp continuity marker, reset cause linkage.

10.5 Don’t overclaim (engineering evidence, not legal guarantees)

This chapter provides an engineering trust loop: evidence fields, discriminators, and event logging. It does not claim legal compliance, certification completion, or absolute tamper-proof behavior. Trust is demonstrated by logs, coefficient integrity, anomaly statistics, and repeatable re-tests.

[Diagram: Metering Trust Loop. Metering (sensor + AFE) → sanity checks (self-check + plausibility) → tamper (signals + statistics) → event log (NVM + RTC, counters + fields) → service readout (diagnostics export) → actions (recal / service / isolate). Side inputs: missing-samples counter and coefficient CRC. Trust comes from evidence fields and timestamps, not from claims.]
Figure F10. A metering trust loop: metering outputs feed sanity checks and tamper signals; evidence fields are written to an event log with stable timestamps; service readout drives actions (recalibration/service/isolation). This is engineering evidence, not a legal guarantee.
Cite this figure Suggested caption: “Metering trust loop with sanity checks, tamper signals, and event logging.”

H2-11. Event Recording & Forensics: what to log, how to timestamp, how to survive brownouts

Event recording is an engineering SOP, not a checkbox. A good on-device “black box” logs the right fields at the right time, survives brownouts, and enables fast root-cause triage with a minimal evidence set.

  • Event taxonomy + dictionary
  • MVS (minimum viable schema)
  • FRAM/EEPROM/Flash + ring buffer
  • Timestamp vs sequence strategy
  • Brownout survivability
  • 5-step forensics SOP

11.1 Log taxonomy: turn symptoms into a consistent event dictionary

The event dictionary must be consistent across product lines: one event_id plus a scoped subcode (e.g., rail index, trip source, channel index). This makes filtering and forensics deterministic.

Power events
  • UVLO/BOR, rail dip, inrush fail, watchdog reset correlation
  • Required: rail_id, min_v, dip_ms, reset_cause
Protection events
  • ESD/EFT hit counters, surge counter, eFuse/OVP/OCP/OTP trips
  • Required: trip_source, peak, duration_ms, recovery state
Metering trust events
  • Tamper flags, overcurrent signatures, coefficient CRC mismatch
  • Required: tamper_type, score, coeff_crc, sample continuity
Safety events
  • Leakage trip, insulation degrade flags, nuisance-trip pattern detection
  • Required: leak_trip, threshold_bin, duration_ms, retry/lockout state
Rule of thumb: If an event changes system state (reset, lockout, trip, accuracy invalid), it must produce a log record that survives brownouts.

11.2 MVS (Minimum Viable Schema): the smallest set that still enables reconstruction

Avoid “data dumping.” A record should be small, deterministic, CRC-protected, and sufficient to answer: what happened, which domain, how severe, in what order, under which firmware build.

Identity: schema_version, event_id, subcode, fw_version, config_crc
Timing: rtc_state (valid/invalid), timestamp (optional), seq_no (mandatory), uptime_ms
Evidence: rail_id, min_v, dip_ms, peak, duration_ms, reset_cause, crc
Bottom line: “Trusted time” is optional; “trusted order” is mandatory. If RTC is invalid, seq_no + uptime_ms must still reconstruct the timeline.
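One way to picture a small, deterministic, CRC-protected record is a fixed-size binary layout. The sketch below covers a subset of the MVS fields; the field widths, the millivolt unit behind `min_mv`, the integer encoding of `rtc_state`, and the use of CRC-32 are assumptions made for the example rather than a defined schema.

```python
import struct
import zlib

# Subset of MVS fields, in a fixed order (widths are assumed for this sketch).
FIELDS = ("schema_version", "event_id", "subcode", "fw_version", "config_crc",
          "rtc_state", "timestamp", "seq_no", "uptime_ms",
          "rail_id", "min_mv", "dip_ms", "reset_cause")
_BODY = struct.Struct("<BHBHIBIIIBHHB")   # little-endian, no padding: 29 bytes

def encode(record):
    """Pack the record body and append a CRC-32 over it."""
    body = _BODY.pack(*(record[name] for name in FIELDS))
    return body + struct.pack("<I", zlib.crc32(body))

def decode(raw):
    """Return the record dict, or None if the size or CRC is wrong."""
    if len(raw) != _BODY.size + 4:
        return None
    body = raw[:_BODY.size]
    (crc,) = struct.unpack("<I", raw[_BODY.size:])
    if zlib.crc32(body) != crc:
        return None
    return dict(zip(FIELDS, _BODY.unpack(body)))

example = {"schema_version": 1, "event_id": 0x0101, "subcode": 2,
           "fw_version": 0x0304, "config_crc": 0xDEADBEEF,
           "rtc_state": 0, "timestamp": 0, "seq_no": 42, "uptime_ms": 123456,
           "rail_id": 1, "min_mv": 2710, "dip_ms": 18, "reset_cause": 3}
raw = encode(example)
```

With `rtc_state` set to 0 (invalid, by this sketch's convention), the record still carries `seq_no` and `uptime_ms`, which is what keeps the order recoverable.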

11.3 When to log: triggers, pre/post snapshots, and early commit for reset-type events

Many failures become unreproducible because “the interesting context” was only in RAM and disappeared after reset. Treat logging as a lightweight state-machine snapshot, not a waveform recorder.

  • Trigger points: BOR/UVLO IRQ, trip latch, leakage trip, tamper flag change, coefficient CRC mismatch, missing-sample spike.
  • Pre-event snapshot: keep the last N “min/peak/counter” values (no big payloads).
  • Post-event snapshot: record recovery state (auto-recover / lockout / degraded accuracy).
  • Early commit: for reset-type events, write the record before the system resets (or at BOR early warning if available).
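Engineering intent: Early commit gets the record into NVM before the reset, and the two-phase record format (commit byte written last, see 11.4) prevents half-written records from being interpreted as valid during brownouts.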

11.4 Storage strategy: FRAM vs EEPROM vs Flash, ring buffer, and write-life control

Storage selection is about write determinism and survivability, not only capacity. Use a ring buffer with a fixed record size and a commit byte written last.

FRAM (preferred for brownout-critical logs)
  • Fast, deterministic writes; excellent endurance for frequent events.
  • Best fit for: BOR/UVLO/rail dip and “last record must survive” scenarios.
EEPROM (middle ground)
  • Works for moderate event rates; pay attention to page write behavior.
  • Best fit for: infrequent critical events + configuration snapshots.
Flash (capacity, but manage write amplification)
  • Great for large history, but frequent small writes are risky without buffering.
  • Best fit for: aggregated counters, periodic summaries, lower-frequency events.
Ring buffer record layout (practical): [header | payload | crc | commit] — write commit last. On boot, scan for the latest valid seq_no.
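The layout above can be modeled on a host to check the commit-last behavior before touching hardware. This is a simplified simulation, not driver code: the 8-byte payload, the 0xA5 commit marker, the 0xFF erased value, slot selection by `seq_no % slots`, and the `power_fails_before_commit` switch are all assumptions for illustration, and seq_no wraparound is ignored.

```python
import struct
import zlib

BODY_SIZE = 12          # seq_no (4 bytes) + payload (8 bytes)
SLOT_SIZE = 17          # body + crc (4 bytes) + commit (1 byte)
COMMITTED = 0xA5        # assumed commit marker
ERASED = 0xFF           # assumed erased/invalid value

class RingLog:
    def __init__(self, slots):
        self.slots = slots
        self.mem = bytearray([ERASED]) * (slots * SLOT_SIZE)

    def append(self, seq_no, payload, power_fails_before_commit=False):
        base = (seq_no % self.slots) * SLOT_SIZE
        body = struct.pack("<I8s", seq_no, payload)
        # Phase 1: invalidate the slot, then write body and CRC.
        self.mem[base + SLOT_SIZE - 1] = ERASED
        self.mem[base:base + BODY_SIZE] = body
        self.mem[base + BODY_SIZE:base + BODY_SIZE + 4] = struct.pack("<I", zlib.crc32(body))
        if power_fails_before_commit:
            return              # simulated brownout: commit byte never written
        # Phase 2: the commit byte is written last.
        self.mem[base + SLOT_SIZE - 1] = COMMITTED

    def latest_valid_seq(self):
        """Boot-time scan: highest seq_no among committed, CRC-valid slots."""
        best = None
        for slot in range(self.slots):
            base = slot * SLOT_SIZE
            if self.mem[base + SLOT_SIZE - 1] != COMMITTED:
                continue
            body = bytes(self.mem[base:base + BODY_SIZE])
            (crc,) = struct.unpack("<I", self.mem[base + BODY_SIZE:base + BODY_SIZE + 4])
            if zlib.crc32(body) != crc:
                continue
            (seq,) = struct.unpack("<I", body[:4])
            if best is None or seq > best:
                best = seq
        return best

log = RingLog(slots=4)
for seq in range(1, 6):
    log.append(seq, b"evt")
log.append(6, b"bor", power_fails_before_commit=True)
```

After the simulated brownout, the scan still reports record 5 as the latest valid entry; the interrupted record 6 is ignored because its commit byte was never written.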
Concrete example MPNs (reference building blocks)
  • SPI FRAM: Infineon/Cypress FM25V10, FM25V02
  • I²C FRAM: Fujitsu MB85RC256V
  • I²C EEPROM: Microchip 24LC256, 24AA256
  • SPI NOR Flash: Winbond W25Q64JV, Macronix MX25L6406E
  • RTC: Microchip MCP7940N, NXP PCF85063A, ADI/Maxim DS3231
  • Reset supervisor / BOR helper: TI TPS3839, ADI/Maxim MAX809
  • Power-path / hold-up assist (examples): TI TPS2121, ADI LTC4412
  • Supercap (hold-up energy, example): Panasonic EEC-F5R5H105
Choose voltage/current ratings and packages per the product’s safety boundary; these MPNs are examples for architecture discussions.

11.5 Timestamping: RTC + backup, and “time untrusted” fallback with sequence numbers

Time is not always trustworthy. The log must carry an explicit RTC validity state and a monotonic sequence number.

  • When time is trusted: RTC has backup supply, passes sanity checks, and does not jump unexpectedly.
  • When time is untrusted: first power-up, RTC backup lost, or detected time discontinuity → mark rtc_state=invalid.
  • Always required: seq_no increments monotonically and survives resets (stored in NVM with wear-safe strategy).
Practical field rule: Sort by timestamp only if rtc_state=valid. Otherwise sort by seq_no, then uptime_ms.
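The field rule translates into a sort key that switches on RTC validity. In this sketch, records are plain dictionaries and `rtc_state` is a string; both are assumptions for illustration. The function also takes the conservative reading that a single untrusted record disables timestamp ordering for the whole set.

```python
def timeline(records):
    """Order records by trusted time if every RTC state is valid, else by sequence."""
    if records and all(r["rtc_state"] == "valid" for r in records):
        # seq_no breaks ties between records sharing a timestamp.
        return sorted(records, key=lambda r: (r["timestamp"], r["seq_no"]))
    return sorted(records, key=lambda r: (r["seq_no"], r["uptime_ms"]))

records = [
    {"seq_no": 2, "uptime_ms": 50,  "rtc_state": "valid",   "timestamp": 1000},
    {"seq_no": 1, "uptime_ms": 900, "rtc_state": "invalid", "timestamp": 0},
    {"seq_no": 3, "uptime_ms": 10,  "rtc_state": "valid",   "timestamp": 900},
]
ordered = [r["seq_no"] for r in timeline(records)]
```

Because one record has an invalid RTC state, the example falls back to sequence order even though the other two carry timestamps.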

11.6 Forensics SOP: the minimal evidence set + a 5-step reconstruction method

The fastest triage uses a minimal, repeatable evidence set: two waveforms, three counters, and one log dump. This is enough to separate “power collapse” from “measurement chain collapse” and from “protection latch behavior.”

Minimal evidence set (M-Set)
  • 2 waveforms: (1) most sensitive rail near MCU/AFE, (2) entry/interface noise proxy near protection zone.
  • 3 counters: missing samples, saturation/overload flag count, reset/trip counter.
  • 1 log dump: last K records + schema_version + fw_version.
5-step reconstruction
  1. Sort events by seq_no (or by trusted timestamp).
  2. Find the first “system turning-point” (BOR/UVLO/trip/leakage/tamper-integrity).
  3. Time-align that event to the two waveforms (rail dip vs entry spike window).
  4. Use counters to decide: saturation vs missing samples vs reset chain.
  5. Output the first fix direction: entry energy control / return path / clamp placement / recovery policy (no over-claim).
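Steps 1, 2, and 4 of the reconstruction are mechanical and can be sketched in a few lines. The event names in `TURNING_POINTS`, the record shape, and the "largest counter wins" rule are assumptions for this example; steps 3 and 5 need the waveforms and engineering judgment and are not modeled.

```python
# Events treated as system turning points (names are hypothetical).
TURNING_POINTS = {"BOR", "UVLO", "TRIP", "LEAKAGE", "TAMPER_INTEGRITY"}

def first_turning_point(records):
    """Steps 1-2: sort by seq_no, return the earliest turning-point record."""
    for rec in sorted(records, key=lambda r: r["seq_no"]):
        if rec["event"] in TURNING_POINTS:
            return rec
    return None

def dominant_counter(missing_samples, saturation_count, reset_count):
    """Step 4: name the counter with the largest increase, or 'none'."""
    counters = {
        "missing_samples": missing_samples,
        "saturation": saturation_count,
        "reset_chain": reset_count,
    }
    name, value = max(counters.items(), key=lambda kv: kv[1])
    return name if value > 0 else "none"

log_dump = [
    {"seq_no": 12, "event": "TRIP"},
    {"seq_no": 10, "event": "INFO"},
    {"seq_no": 11, "event": "BOR"},
]
pivot = first_turning_point(log_dump)
```

Here the BOR at seq_no 11 precedes the trip at 12, which points the waveform alignment in step 3 at the rail dip window first.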

Figure F11 — Black box recorder: event sources → queue → NVM → service export (survive brownouts)

The black box recorder is a deterministic pipeline. Brownout survivability is achieved by early commit, hold-up energy for the last write, and a two-phase commit record format.

[Diagram: Black Box Recorder (On-Device). Event sources (power: UVLO/BOR/rail dip; protection: ESD/EFT hit, trip; metering: tamper, overcurrent; safety: leakage trip) → event queue (priority + debounce, pre snapshot, post state) → NVM ring buffer (two-phase commit + CRC; FRAM fast writes, EEPROM page write, Flash history) → service log dump. Brownout survivability block: hold-up energy (supercap/backup), early commit (write last record on BOR early warning within the rail dip window), and commit byte written last.]
Figure F11. On-device black box recorder: multiple event sources feed an event queue; records are committed to an NVM ring buffer with CRC and a commit byte written last. Brownout survivability comes from hold-up energy and early commit around BOR/rail dip windows.
Cite this figure Suggested caption: “Black box event recorder pipeline with brownout survivability and NVM ring buffer.”
Implementation guardrails: keep records small and deterministic; never require a large write during brownout; always log schema_version + fw_version so field forensics can attribute changes.

H2-12. FAQs (Evidence-based; mapped to chapters)

Each answer stays inside this subsystem evidence chain: Protection, Isolation, Leakage Safety, Metering Trust, and Event Recording. No device-specific system designs are assumed.

1) After adding TVS, resets become more frequent: capacitance load or return path?

Treat this as an energy-routing problem, not a “TVS brand” problem. First capture two points: residual spike at the connector (after TVS) and the sensitive rail dip/BOR counter during the same hit. If spikes clamp but rails still dip, the return path is injecting ground bounce; if rails stay solid but the interface degrades, TVS capacitance/impedance is the likely load. Log event_id + reset_cause + seq_no.

Related sections: H2-3, H2-4, H2-8, H2-11
2) ESD does not crash the system, but metering “jumps”: front-end hit or rail ripple?

Separate “measurement chain saturation” from “reference corruption.” First capture metering AFE flags (overrange/saturation/missing-sample) and the metering-domain rail ripple around the ESD event. If the jump aligns with front-end saturation while rails remain stable, the input path needs protection/limiting; if the jump aligns with a rail dip or BOR warning, the power/return path is dominating. Preserve the record with event_id, peak, and seq_no.

Related sections: H2-4, H2-9, H2-11
3) EFT causes packet loss, but surge does not: why, and which filter stage first?

EFT is a repeated fast-edge aggressor, so coupling paths and return-path discontinuities matter more than bulk energy. First measure a common-mode noise proxy at the interface and the internal sensitive-rail ripple while logging drop counters. If loss occurs without rail dip, prioritize interface-side common-mode control (CMC placement + return-path continuity); if loss correlates with rail dips, prioritize entry energy management and rail resilience. Confirm by correlating seq_no with counter spikes.

Related sections: H2-5, H2-8
4) Surge passes, but MOV runs hot / leakage increases: how to judge aging risk?

“Pass” is not “healthy,” so post-test evidence must be captured. First record MOV temperature rise under a defined input condition and measure leakage/insulation-related indicators before and after the surge campaign. If both temperature rise and leakage drift upward, aging/derating risk is likely; if only temperature rises, thermal coupling or fuse/MOV coordination is more likely. Store the test stamp (fw_version/config_crc) with the log snapshot for traceability.

Related sections: H2-5
5) The isolator misbehaves on transients: insufficient CMTI or isolated supply coupling?

Use time alignment to discriminate CMTI stress from supply coupling. First collect link error/CRC counters and isolated-supply ripple or ground-reference movement across the barrier during the transient window. If errors line up with the common-mode edge while isolated supply remains quiet, CMTI/layout/keepout is the limiter; if errors track isolated-supply ripple, the isolated power and its return loop are coupling noise into the receiver. Log event_id + duration_ms + error_count with seq_no.

Related sections: H2-6, H2-8
6) Leakage alarms worsen in humidity: what are the three most common leakage paths?

Focus on measurable paths: (1) Y-cap related common-mode leakage across the safety boundary, (2) surface leakage from moisture/contamination films, and (3) insulation damage/aging creating a persistent resistive path. First record humidity/condensation conditions and the leakage-sense output (or trip timing) trend. Strong humidity correlation points to surface leakage; switching-state correlation points to Y-cap/common-mode behavior; slow monotonic growth points to aging. Store threshold_bin and duration_ms in the log.

Related sections: H2-7
7) RCD/GFCI nuisance trips: surge common-mode or Y-cap design effect?

Distinguish impulse-driven imbalance from an elevated steady leakage baseline. First capture a transient common-mode proxy during the trip window and the steady leakage baseline under normal operation. Trips that occur only during impulse windows suggest common-mode transients and return-path discontinuities; a consistently high baseline suggests Y-cap/insulation/leakage design drivers. Apply the first fix accordingly: improve common-mode control/placement for transients, or reduce baseline leakage contributors for steady issues. Log trip_source + peak + seq_no.

Related sections: H2-5, H2-7, H2-8
8) Same metering concept, new PCB revision is less accurate: sampling loop or ground reference?

Assume layout parasitics changed the measurement reference. First measure the metering input noise floor (at the AFE/ADC pins) and the sensitive-domain rail ripple/ground bounce under the same load condition. If noise tracks switching states and rail ripple, the ground/return reference is compromised; if noise changes mainly with routing distance/loop geometry, the sampling loop and its coupling are dominant. Implement the smallest fix first (reduce loop area, restore reference continuity, or increase spacing from aggressors), then confirm by repeating the same calibration points and logging drift deltas.

Related sections: H2-9, H2-8
9) Readings go low at high current: CT saturation or shunt thermal drift?

Discriminate by waveform shape versus time/temperature correlation. First observe the sensor output shape at peak current and capture a temperature proxy (or time-under-load) alongside the measurement drift. A sudden compression/flattening at peaks suggests CT saturation or magnetic bias; a gradual drift correlated with temperature rise suggests shunt self-heating and temperature-coefficient effects. Apply the matching fix: for CT paths, keep the core out of saturation; for shunt paths, reduce shunt heating and improve thermal modeling. Record peak, duration_ms, and temperature bin in the log.
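
As a rough illustration of the two discriminators above, the sketch below uses crest factor (peak divided by RMS) as a crude flattening indicator and a Pearson correlation between error and temperature as the drift indicator. It assumes saturation shows up as peak flattening as described; the function names, the 1.3 crest-factor limit, and the 0.8 correlation limit are placeholders for the example, not validated thresholds.

```python
import math

def crest_factor(samples):
    """Peak-to-RMS ratio; an ideal sine gives sqrt(2), a flattened one gives less."""
    rms = math.sqrt(sum(x * x for x in samples) / len(samples))
    if rms == 0:
        raise ValueError("signal is all zeros")
    return max(abs(x) for x in samples) / rms

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def classify_low_reading(peak_cycle, temps_c, errors_pct,
                         cf_limit=1.3, corr_limit=0.8):
    """Hypothetical first-pass triage; both limits are assumed values."""
    if crest_factor(peak_cycle) < cf_limit:
        return "ct_saturation_or_bias_suspected"
    if abs(pearson(temps_c, errors_pct)) >= corr_limit:
        return "shunt_thermal_drift_suspected"
    return "inconclusive"

# One full cycle of an ideal sine, and the same cycle flattened at 60% of peak.
sine = [math.sin(2 * math.pi * k / 1000) for k in range(1000)]
clipped = [max(-0.6, min(0.6, x)) for x in sine]
print(round(crest_factor(sine), 3), round(crest_factor(clipped), 3))
```

The crest-factor check is only meaningful on whole cycles of a near-sinusoidal load current; heavily distorted loads need a reference capture at low current for comparison.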

Related sections: H2-9, H2-10
10) Metering is suspected of tampering: minimal anti-tamper sensing + log fields?

Keep the design auditable with a minimal, deterministic loop. First select one tamper signal class (cover-open, magnetic bias, or bypass/reverse signature) and one plausibility statistic (waveform anomaly score or coefficient CRC integrity). If the signal asserts without a corresponding log record, the recorder chain is not trustworthy; if logs exist without fw_version/config_crc and record CRC, forensic attribution fails. Minimum fields: tamper_type, score, coeff_crc, seq_no, and timestamp/rtc_state. Tie each tamper event to an evidence snapshot.
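
One way to picture the minimum record is a fixed-size structure protected by its own CRC. The sketch below is an assumption-laden illustration: the field widths, the little-endian packed layout, and the use of CRC-32 are choices made for the example, and only the field names come from the text above.

```python
import struct
import zlib

# Hypothetical layout (little-endian, packed): seq_no, tamper_type, score,
# coeff_crc, timestamp, rtc_state, fw_version, config_crc.
_BODY = struct.Struct("<IBHIIBII")
_CRC = struct.Struct("<I")
_KEYS = ("seq_no", "tamper_type", "score", "coeff_crc",
         "timestamp", "rtc_state", "fw_version", "config_crc")

def encode_tamper_record(seq_no, tamper_type, score, coeff_crc,
                         timestamp, rtc_state, fw_version, config_crc):
    """Pack the fields and append a CRC-32 over the packed body."""
    body = _BODY.pack(seq_no, tamper_type, score, coeff_crc,
                      timestamp, rtc_state, fw_version, config_crc)
    return body + _CRC.pack(zlib.crc32(body))

def decode_tamper_record(blob):
    """Return the fields as a dict, or raise ValueError if the record is damaged."""
    if len(blob) != _BODY.size + _CRC.size:
        raise ValueError("unexpected record length")
    body = blob[:_BODY.size]
    (stored,) = _CRC.unpack(blob[_BODY.size:])
    if zlib.crc32(body) != stored:
        raise ValueError("record CRC mismatch")
    return dict(zip(_KEYS, _BODY.unpack(body)))

rec = encode_tamper_record(42, 1, 870, 0xDEADBEEF, 1700000000, 0,
                           0x00010203, 0x1234ABCD)
print(len(rec), decode_tamper_record(rec)["seq_no"])
```

Keeping fw_version and config_crc inside the CRC-protected body means a record cannot be silently re-attributed to a different firmware or coefficient set.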

Related sections: H2-10, H2-11
11) Frequent power loss causes “last event missing”: choose FRAM or add hold-up?

Start from the time budget: can the last record be committed before rails collapse? First measure BOR early-warning-to-reset time and estimate worst-case record write time with two-phase commit. If the time window is too short, hold-up energy (supercap/backup rail) or earlier commit is mandatory; if the window is sufficient but records still drop, move to deterministic storage such as FRAM. Practical parts for architecture discussion: SPI FRAM FM25V02/FM25V10 or I²C FRAM MB85RC256V plus a supervisor like TPS3839 to stabilize reset behavior. Verify by seq_no continuity across power cycles.
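
The time-budget reasoning above can be put into numbers with a short sketch. It assumes an ideal capacitor feeding a constant-power load, so the usable energy is 0.5 * C * (V_start^2 - V_min^2); ESR, regulator efficiency, and leakage are ignored. The function names and the 2x safety margin are assumptions for illustration.

```python
def holdup_time_s(c_farads, v_start, v_min, load_w):
    """Time an ideal capacitor feeds a constant-power load from v_start down to v_min."""
    return 0.5 * c_farads * (v_start ** 2 - v_min ** 2) / load_w

def required_capacitance_f(t_s, v_start, v_min, load_w):
    """Capacitance needed to hold a constant-power load for t_s seconds."""
    return 2.0 * load_w * t_s / (v_start ** 2 - v_min ** 2)

def choose_fix(warn_to_reset_s, write_time_s, records_still_dropping, margin=2.0):
    """Hypothetical decision helper following the order of checks in the text."""
    if warn_to_reset_s < margin * write_time_s:
        # Window too short for a worst-case two-phase commit.
        return "add_holdup_or_commit_earlier"
    if records_still_dropping:
        # Window is sufficient, so the storage path itself is suspect.
        return "move_to_deterministic_storage"
    return "budget_ok"

# Example: 50 mW load, rail allowed to sag from 3.3 V to 2.7 V, 10 ms needed.
c = required_capacitance_f(0.010, 3.3, 2.7, 0.050)
print(f"{c * 1e6:.0f} uF", choose_fix(0.002, 0.005, False))
```

Here write_time_s is meant to be the total worst-case time for both commit phases, measured on the target storage rather than taken from a typical datasheet figure.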

Related sections: H2-11
12) Use “two waveforms + one log” to tell ESD vs EFT vs power collapse: how?

Use a minimal reconstruction method. First capture (1) a connector/interface spike proxy and (2) the most sensitive internal rail waveform, then export the last K log records with reset_cause and seq_no. If spikes are large but rails remain stable, the aggressor is likely ESD/EFT coupling; if rails dip or BOR triggers, power collapse dominates. Differentiate ESD versus EFT by repetition patterns and event counters: single sharp hits versus repeated bursts. Confirm by aligning event_id timestamps/seq_no with the waveform windows.
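
The "two waveforms + one log" rule can be written down as a small decision function. This is a sketch under stated assumptions: the parameter names, the 5 V spike limit, the 10% rail-dip fraction, the burst count of 3, and the "BOR" reset_cause string are all placeholders that would need to be set from the actual hardware and log format.

```python
def classify_aggressor(spike_peak_v, spike_count, rail_min_v, rail_nominal_v,
                       reset_cause, spike_limit_v=5.0, dip_fraction=0.10,
                       burst_min=3):
    """Hypothetical first-pass classifier from two waveforms and one log field."""
    rail_dipped = rail_min_v < rail_nominal_v * (1.0 - dip_fraction)
    if rail_dipped or reset_cause == "BOR":
        # Rail evidence dominates: the supply collapsed, whatever caused it.
        return "power_collapse"
    if spike_peak_v >= spike_limit_v:
        # Stable rails with large interface spikes: separate by repetition.
        return "eft" if spike_count >= burst_min else "esd"
    return "inconclusive"

print(classify_aggressor(12.0, 1, 3.28, 3.3, "NONE"))
print(classify_aggressor(9.0, 40, 3.25, 3.3, "WDT"))
print(classify_aggressor(2.0, 0, 2.40, 3.3, "BOR"))
```

The result is only a starting hypothesis; it still needs to be confirmed by aligning event_id timestamps and seq_no with the captured waveform windows.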

Related sections: H2-2, H2-3, H2-11