
Isolated Comparator Chain (ΣΔ Modulator + Digital Isolator)


This page shows how to build a high-side fault “comparator” across an isolation barrier using a ΣΔ bitstream + digital isolator—turning dv/dt, CMTI, latency, and drift into measurable budgets and reliable trip decisions. It focuses on practical thresholds, guardbands, and verification steps so false trips stay near zero while response time stays inside protection timing.

What this page solves (and what it does NOT)

This page focuses on isolated fault-detection chains where a ΣΔ modulator on the high-side crosses an isolation barrier through a digital isolator, then becomes a digital “comparator-like” decision (threshold / window / debounce / latch) on the low-side. The goal is a trip that remains reliable under high dv/dt, strong EMI, and large common-mode swings.

Use cases (problem-shape first)

High-side shunt OCP / short detection
  • Signal: mV-level differential over a large, fast common-mode.
  • Isolation reason: ground bounce + dv/dt can corrupt a low-side threshold if the signal is not isolated cleanly.
  • Typical failure mode: dv/dt → injected transient → bit errors → false trip or missed short pulse.
Bus-voltage compliance window (OVP/UVP, “valid power”)
  • Signal: divided high voltage with large surge and fast edges.
  • Isolation reason: robust domain crossing without coupling surges into logic ground.
  • Typical failure mode: threshold drift (R + reference TC) + transient spikes → chatter around trip-point.
Inverter / gate-driver fault trigger (latch + report)
  • Signal: fault condition from current/voltage/protection nodes.
  • Isolation reason: deterministic latch behavior and diagnosable reporting across the barrier.
  • Typical failure mode: isolator default state at power-up → false fault unless “valid-window” is enforced.
Isolation monitor / ground-fault trigger (noise + slow ramps)
  • Signal: slow variable with occasional spikes and strong ambient EMI.
  • Isolation reason: avoid coupling and allow strong digital debounce/windowing.
  • Typical failure mode: insufficient hysteresis/debounce → multi-toggling (chatter) under slow ramps.

Success metrics (definition → how it is measured)

| Metric | Practical definition | Measurement hook (repeatable) |
| --- | --- | --- |
| False trip rate | Probability of a trip without a real fault under specified dv/dt + EMI conditions. | dv/dt injection + noise injection + temperature points → count false latches across N events. |
| Miss rate | Probability of not tripping within the required time after a true fault. | Controlled fault pulses (amplitude + width sweep) → find minimum detectable pulse with pass criteria. |
| Response time | Time from “input crosses limit” to “fault latch asserted” (end-to-end). | Break into: ΣΔ window + isolator delay + digital debounce + latch/report → verify each term separately. |
| Bit errors under CMTI | Bit flips / missing pulses caused by dv/dt and common-mode transient currents. | Known pattern (or stable input) under defined dv/dt → error counter (BER or errors per second). |
| Trip-point drift vs temperature | Change of the effective trip threshold (mV or A) across temperature and soak conditions. | Temperature sweep with soak + repeated threshold scans → extract drift and hysteresis band. |

These metrics are intentionally measurable. The rest of the page shows how to choose an architecture and tune the chain so the metrics meet targets under real dv/dt and EMI.

Out of scope (to avoid cross-page overlap)

  • Comparator families, output types, Schmitt triggers: covered on dedicated comparator/Schmitt pages.
  • Full ΣΔ theory and filter math: only the minimum needed for fault-decision budgeting is used here.
  • Safety standards tutorial: the focus is layout and verification hooks, not compliance clause-by-clause.
Figure: Problem domain map for an isolated comparator chain. Block diagram showing the high-side bus with dv/dt stress, an isolation barrier with a sigma-delta modulator and digital isolator, and low-side logic with filter, threshold, and fault latch/report.

System architecture options: where the “comparison” actually happens

In an isolated chain, the most important design choice is where the decision is made: on the high-side with analog thresholds, or on the low-side after isolation with digital filtering and a programmable threshold/window. This choice sets the latency floor, the false-trip / miss behavior under dv/dt, and the diagnostic leverage available in production and in the field.

Quick comparison (first-pass selection)

| Option | Where the decision happens | Latency floor | dv/dt sensitivity | Diagnostics |
| --- | --- | --- | --- | --- |
| A. Isolate bitstream, decide digitally | Low-side digital threshold / window after filtering (comparator-like in firmware/FPGA). | Set by averaging / debounce window (OSR + decision window). | Bit errors can corrupt short windows; longer windows reduce sensitivity but increase latency. | Strong: logging, counters, self-test, field forensics. |
| B. Decide on high-side, isolate 1-bit fault | High-side analog threshold/hysteresis; isolator carries a single fault flag. | Potentially very low (analog propagation + isolator delay). | dv/dt impacts the analog trip point directly; less leverage to “filter it out” digitally. | Limited: fewer observables beyond a fault edge. |
| C. Dual path: fast trip + slow confirm | Fast path trips quickly; slow path reconstructs / confirms for robustness and logging. | Fast path defines protection; slow path defines false-trip control and diagnosis. | Best if thresholds map consistently; otherwise “fast-only” false trips become hard to debug. | Strong: fast protection + slow evidence (counters, trip history). |

A reliable chain starts by picking an option that matches the required protection speed and the allowed false-trip rate. Later sections convert these choices into window sizes, guardbands, and verification tests.

Option A — High-side ΣΔ → isolation → low-side digital decision (main path)

When it fits
  • A trip must be repeatable under dv/dt and EMI, with tunable debounce/windowing.
  • Production and field need diagnostics: counters, logs, and self-test hooks.
  • Thresholds/windows may change by firmware (mode-dependent limits, temperature compensation).
Primary risks
  • Latency floor is set by decision window length; it cannot be tuned to zero without raising false trips.
  • Bit errors during dv/dt can bias short windows; robustness relies on window design and error tolerance.
  • Trip-point error is budgeted across divider/ref/modulator drift, not “assumed negligible”.
Key parameters to request/verify
  • Isolator CMTI and behavior under dv/dt (bit errors / missing pulses).
  • Modulator input range, offset/drift, and clock constraints (impact on decision window).
  • Decision method (moving average / simple CIC) and minimum stable window for target false-trip rate.

Option B — High-side analog threshold → isolate 1-bit fault (fast, but narrower)

When it fits
  • Protection must be extremely fast and a single trip edge is sufficient.
  • System can tolerate limited logging; primary goal is hard shutoff.
Trade-offs that must be accepted
  • Analog trip accuracy and chatter control live on the high-side; drift/noise/hysteresis are harder to tune post-bringup.
  • Less flexibility to separate dv/dt-induced glitches from real faults using digital statistics.
  • Field forensics is weaker because only a fault edge is transported.

Option C — Dual path: fast trip + slow confirm (recommended for tough dv/dt systems)

Why it exists
  • Fast path protects hardware; slow path provides evidence to control false trips and enable debugging.
  • Short glitches are handled by fast-path blanking and slow-path confirmation windows.
Non-negotiable rule

The fast and slow thresholds must be mappable to the same physical trip-point (mV or A). Otherwise, fast-only trips become untraceable, and “pass in the lab / fail in the field” repeats.

Figure: Three isolated-chain architectures and where the decision happens. Three parallel lanes showing option A with digital decision after isolation (tunable), option B with analog decision before isolation and a one-bit fault flag (fast), and option C with a dual path for fast trip and slow confirm (robust).

ΣΔ modulator in this chain: what must be true (without teaching ΣΔ theory)

In an isolated fault chain, the ΣΔ modulator is not used as a “precision ADC lesson.” It is used as a density-encoded signal source that feeds a windowed digital decision. A reliable trip requires a few non-negotiable truths about bitstream behavior, latency, and overload recovery.

Six must-know facts (each with an engineering consequence)

Fact 1 — The decision is “density over a window,” not single bits.
  • Consequence: short windows are fast but show higher density jitter → higher false trips near threshold.
  • Action: keep a guardband between the nominal density and the trip threshold, or increase window length.
Fact 2 — Window length sets a hard latency floor.
  • Consequence: faster protection requires smaller N or higher bit-rate; algorithms cannot erase this floor.
  • Action: budget latency explicitly as window time + isolator + digital logic + latch/report.
Fact 3 — Overrange can saturate the bitstream and force a wrong decision.
  • Consequence: large transients may look like a “permanent fault” until recovery completes.
  • Action: define overload handling: blanking, “valid-window” gating, and required recovery time.
Fact 4 — Clock/bit-rate defines the decision time-base.
  • Consequence: a “fixed N-bit window” changes real time if bit-rate changes or bits are swallowed.
  • Action: choose the window definition (N-bit vs fixed-time) and verify behavior under dv/dt and supply noise.
Fact 5 — Reference stability becomes trip-point stability.
  • Consequence: Vref noise/TC and reference routing errors appear as threshold drift (mV or A) over temperature.
  • Action: treat reference noise/TC as part of the threshold budget; verify with soak + repeated scans.
Fact 6 — Source impedance and bias/leakage shift the effective threshold.
  • Consequence: large dividers/RC networks can drift with temperature and humidity → silent trip-point movement.
  • Action: keep impedance realistic or buffer; measure threshold across temperature and environmental stress.
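Fact 1 can be made concrete with a toy behavioral model. The sketch below assumes an idealized first-order modulator with a constant input normalized to [-1, 1]; the names `first_order_sd` and `window_density` are hypothetical, and no specific device is modeled. It only illustrates that short windows scatter around the long-run density while long windows converge to it.

```python
def first_order_sd(x, n_bits):
    """Idealized first-order sigma-delta (assumption): bits for a constant input x in [-1, 1]."""
    integ = 0.0
    bits = []
    for _ in range(n_bits):
        bit = 1 if integ >= 0.0 else 0
        feedback = 1.0 if bit else -1.0
        integ += x - feedback  # integrate the error between input and fed-back bit
        bits.append(bit)
    return bits


def window_density(bits, n, start=0):
    """Fraction of ones in the n-bit window that begins at index start."""
    window = bits[start:start + n]
    return sum(window) / len(window)


# Illustrative input only: x = 0.3 has an ideal long-run density of (1 + 0.3) / 2 = 0.65.
bits = first_order_sd(0.3, 4096)
short = [window_density(bits, 8, s) for s in range(0, 64, 8)]
long_ = window_density(bits, 1024)
print("8-bit window densities:", short)
print("1024-bit window density:", long_)
```

An 8-bit window can only report densities in steps of 1/8, so a threshold placed within one step of the nominal density will chatter. That is the reason for the guardband or the longer window in the Fact 1 action.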

Latency floor (budget in engineering terms)

For a window-based decision, the minimum reaction time is dominated by the time needed to collect enough bits:

  • Window time: t_win ≈ N / f_bit
  • Total trip time: t_trip ≈ t_win + t_iso + t_dig + t_latch
  • Trade-off: larger N → fewer false trips near threshold, but higher latency and a greater chance of missing short faults.

A realistic design starts by setting N and f_bit from the response-time requirement, then verifying whether the resulting density jitter and dv/dt error rate meet the false-trip target.
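As a worked example of starting from the response-time requirement, the sketch below inverts the trip-time budget to find the largest window that still fits. The function name and all numbers are assumptions chosen for illustration, not datasheet values. Times are kept as integer nanoseconds to avoid rounding surprises.

```python
def max_window_bits(t_trip_max_ns, f_bit_hz, t_iso_ns, t_dig_ns, t_latch_ns):
    """Largest N with N / f_bit + t_iso + t_dig + t_latch <= t_trip_max (integer ns, integer Hz)."""
    t_win_max_ns = t_trip_max_ns - (t_iso_ns + t_dig_ns + t_latch_ns)
    if t_win_max_ns <= 0:
        return 0  # the fixed delays alone already exceed the budget
    return (t_win_max_ns * f_bit_hz) // 1_000_000_000


# Assumed example: 5 us trip requirement, 20 MHz bitstream, 200 ns of fixed delays.
n_max = max_window_bits(
    t_trip_max_ns=5_000, f_bit_hz=20_000_000, t_iso_ns=50, t_dig_ns=100, t_latch_ns=50
)
print("largest window that meets the budget:", n_max, "bits")
```

The result is only an upper limit on N. Whether a window of that length gives enough margin against density jitter and dv/dt errors still has to be verified by test.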

Key parameters that decide success (request → risk → test hook)

| Parameter | Failure signature | Engineering test hook |
| --- | --- | --- |
| Input range / headroom | Density clamps near 0% or 100% → wrong trip during transients. | Step beyond nominal + observe recovery time to stable density; add blanking if needed. |
| Input impedance / bias | Trip-point shifts with divider R, temperature, humidity. | Threshold scan vs temperature (and leakage stress if relevant); compare to budget. |
| Reference noise / TC | Density drift → false trips near limits over temperature. | Soak at temp points + repeated trip scans; inject reference ripple to quantify sensitivity. |
| Clock method / bit-rate stability | Window time drifts; “same N” does not mean “same ms.” | Measure t_trip across supply/temperature and under dv/dt; verify against response-time spec. |
| Overload recovery | After saturation, density remains biased for a finite time. | Apply worst-case fault pulse then return to nominal; verify “valid” decision timing and stability. |
Figure: Bitstream density changes with input, and windowing sets latency. An input step on top, a bitstream pattern below with denser ones after the step, and a window counter whose density crosses the threshold (TH) after enough bits are collected (windows A and B).

Digital isolator & CMTI: turning dv/dt into “bit errors” and how to budget them

“High CMTI” is not a magic shield. It is a probability statement: under a defined dv/dt stress, the isolator output exhibits a low enough rate of bit flips, missing pulses, or runt pulses. In a windowed decision chain, these errors translate into false trips, missed faults, or latency drift.

Practical model: dv/dt → transient current → waveform errors → decision errors

  1. dv/dt event drives transient common-mode current through parasitic capacitances across the barrier.
  2. internal sampling/comparison nodes are momentarily disturbed (supply bounce, threshold shift, timing slip).
  3. output errors appear as bit flips, missing pulses (pulse swallow), extra pulses, or runt pulses.
  4. window counter observes a biased ones_count or an altered window time-base → threshold/window can be crossed incorrectly.
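Step 1 of the model follows the capacitor relation i = C · dv/dt, which gives a quick order-of-magnitude check. The sketch below is a minimal calculation; the 2 pF barrier capacitance and 50 kV/µs slew rate are assumed illustrative values, and the function name is hypothetical.

```python
def common_mode_current_a(c_barrier_f, dv_dt_v_per_s):
    """Transient common-mode current through a barrier capacitance: i = C * dv/dt."""
    return c_barrier_f * dv_dt_v_per_s


# Assumed values: 2 pF total barrier capacitance, 50 kV/us edge.
i_cm = common_mode_current_a(2e-12, 50e3 / 1e-6)
print(f"i_CM is about {i_cm * 1e3:.0f} mA during the edge")
```

Even a few picofarads can carry tens of milliamps during a fast edge, which is why the return path for that current matters as much as the isolator's CMTI rating.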

Parameter → risk → mitigation (avoid hand-waving)

| Parameter / condition | Failure signature | System risk | Mitigation & test hook |
| --- | --- | --- | --- |
| CMTI test point dv/dt (kV/µs) | Burst errors near edges | False trip spikes / missed short pulses | dv/dt injection + error counter; design window and guardband for burst length |
| Data-rate / edge timing | Runt pulses / missing pulses | Window time-base drift; biased ones_count | Scope pulse integrity under dv/dt; pick window type (N-bit vs fixed-time) deliberately |
| Propagation delay / skew | Shifted sampling boundary | Trip latency mismatch between channels | Measure t_iso distribution; enforce synchronization or per-channel guardband |
| Supply sensitivity (isolator VDD bounce) | Bit flips correlated with switching | False trips during high current edges | Tight local decoupling + return path control; inject ripple to quantify BER slope |
| Default output behavior (power-up / UVLO) | Stuck-high fault flag / random toggles | Immediate false latch at boot | Implement “valid-window” gating and boot sequencing; test brown-in/out |
| Output drive / logic thresholds | Slow edges, overshoot, ringing | Extra transitions counted as bits | Route as controlled impedance if needed; scope at receiver pin under worst dv/dt |
| Burst error length (edge-correlated) | Short clusters of wrong bits | Window crosses threshold briefly | Design K-of-M confirm or dual-window confirm; measure cluster stats, not only average BER |
| Bit swallow (missing pulses) | Lower observed bit count in a time window | Latency drift; inconsistent filtering | Use time-based windows or monitor bit-rate; add “bit health” counter alarm |
| Ground return / layout near barrier | Errors vary by probe point / routing | “Pass in lab, fail in field” sensitivity | Minimize loop area, enforce clean returns, keep barrier crossing short; validate with dv/dt A/B routing tests |

From bit errors to false trips (a conservative budgeting method)

A typical decision counts ones in a window of N bits and compares to a threshold T. Each wrong bit can change the count by up to 1. A conservative guardband is built from the count margin:

  • Count margin: M = min |ones_count − T| under “no-fault” conditions.
  • Safe condition: ensure the worst expected error cluster stays below M (so the window cannot cross the threshold).
  • Rule-of-thumb (average case, not a worst-case bound): if the per-bit error probability is p, the expected number of wrong bits per window is N·p → keep N·p well below M.
  • Better for dv/dt: budget the burst length B during edges and require B < M, backed by K-of-M confirmation. (Here M is the count margin defined above, not the window count in “K-of-M”.)
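The count-margin budget above can be checked against recorded data. The sketch below assumes that no-fault window counts have already been logged; `count_margin` and `margin_report` are hypothetical helper names, and the numbers are illustrative.

```python
def count_margin(no_fault_counts, threshold):
    """M = min |ones_count - T| over windows recorded with no fault present."""
    return min(abs(c - threshold) for c in no_fault_counts)


def margin_report(no_fault_counts, threshold, n_bits, p_bit_error, worst_burst_bits):
    """Compare the average-case (N*p) and burst-case (B) error budgets with the margin M."""
    m = count_margin(no_fault_counts, threshold)
    expected_wrong = n_bits * p_bit_error
    return {
        "margin": m,
        "expected_wrong_bits": expected_wrong,
        "average_ok": expected_wrong < m,
        "burst_ok": worst_burst_bits < m,
    }


# Assumed data: 64-bit windows, threshold T = 52 ones, measured worst burst of 6 bits.
report = margin_report([40, 42, 39, 41], threshold=52, n_bits=64,
                       p_bit_error=1e-3, worst_burst_bits=6)
print(report)
```

In practice the burst check tends to be the binding one, because edge-correlated errors arrive in clusters that an average error rate hides.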
Recommended decision hardening (minimal complexity, high leverage)
  • Guardband: do not set T at the nominal boundary; reserve margin M for dv/dt and noise clusters.
  • K-of-M confirm: latch only if K of the last M windows exceed the threshold; the strictest form, used on this page, requires K consecutive windows (K = M).
  • Bit health counter: detect missing pulses / abnormal toggling and block decisions during unhealthy periods.
  • Valid-window gating: prevent boot / brown-out default states from generating false faults.

Repeatable verification hooks (what to measure, not what to “hope”)

  • dv/dt stress definition: specify edge amplitude, slew rate, source impedance, and ground reference points.
  • Error counter: count flips, missing pulses, runt pulses; report both average rate and worst burst length.
  • Correlation: tag errors by time relative to dv/dt edges to distinguish random noise from edge-coupled clusters.
  • Decision robustness: run no-fault conditions under dv/dt and record false-latch events (false trip rate).
  • Supply injection: add controlled ripple/step on isolator VDD to extract error sensitivity vs supply bounce.
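The error-counter and correlation hooks can be prototyped offline on captured bits. The sketch below assumes an aligned capture of a known pattern and a list of bit indices at which dv/dt edges occurred; it handles bit flips only (swallowed bits change the capture length and are rejected). `error_stats` and `near_edge_bits` are hypothetical names.

```python
def error_stats(expected, received, edge_indices, near_edge_bits=4):
    """Compare aligned bit captures; report errors, BER, worst burst, and edge-correlated errors."""
    if not expected or len(expected) != len(received):
        raise ValueError("captures must be non-empty and equal length (check for swallowed bits)")
    error_pos = [i for i, (e, r) in enumerate(zip(expected, received)) if e != r]

    # Longest run of consecutive wrong bits.
    worst_burst = run = 0
    prev = None
    for pos in error_pos:
        run = run + 1 if prev is not None and pos == prev + 1 else 1
        worst_burst = max(worst_burst, run)
        prev = pos

    # Errors that fall within near_edge_bits after any dv/dt edge.
    near_edge = sum(
        1 for pos in error_pos
        if any(0 <= pos - edge <= near_edge_bits for edge in edge_indices)
    )
    return {
        "errors": len(error_pos),
        "ber": len(error_pos) / len(expected),
        "worst_burst": worst_burst,
        "near_edge": near_edge,
    }


# Synthetic example: a 3-bit burst right at an edge plus one isolated flip.
expected = [1, 0] * 16
received = list(expected)
for i in (10, 11, 12, 25):
    received[i] ^= 1
stats = error_stats(expected, received, edge_indices=[10])
print(stats)
```

Reporting `worst_burst` and `near_edge` next to the average rate separates random noise from edge-coupled clusters.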
Figure: dv/dt causes bit errors that can push a windowed decision across threshold. Three stacked waveforms show a dv/dt step, an isolator output with a glitch and a missing pulse, and a windowed density trace crossing the threshold (TH) and triggering a fault latch.

Latency & response time: from analog event to a latched fault

Response time in an isolated ΣΔ fault chain is not a single number. It is a budget: the window required to see a real density change, plus isolator transport delay, plus the digital deglitch/confirm logic, plus the final latch and reporting path. A reliable design starts by choosing a time definition (detect vs latch vs report), then verifying worst-case behavior under dv/dt and supply stress.

Three time definitions (align expectations)

t_detect: first decision satisfied
Analog event crosses the effective threshold → the digital decision logic first evaluates “fault = true.”
t_latch: fault becomes latched
Analog event → confirm logic passes (K-of-M / debounce) → fault latch sets and remains asserted until reset/retry conditions.
t_report: system action completes
Analog event → latch → interrupt/FPGA path → gate disable / log / message sent. This depends on firmware and system load.

Budget the latch time (the chain’s engineering target)

  • Window time: t_win ≈ N / f_bit (N-bit window at bit-rate f_bit)
  • Confirm time: t_confirm ≈ K · t_win (K consecutive windows required)
  • Latch budget: t_latch ≈ t_confirm + t_iso + t_dig + t_latchHW
The dominant term is typically t_confirm. If a “fast trip” requirement cannot tolerate that floor, the design must change (higher f_bit, smaller N, or a separate fast fault path).
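A quick forward calculation shows how strongly the confirm term dominates the latch budget. The numbers below are assumptions for illustration (64-bit window, 20 MHz bitstream, K = 3), and `latch_time_ns` is a hypothetical helper.

```python
def latch_time_ns(n_bits, f_bit_hz, k_windows, t_iso_ns, t_dig_ns, t_latch_hw_ns):
    """t_latch = K * (N / f_bit) + t_iso + t_dig + t_latchHW, all in nanoseconds."""
    t_win_ns = n_bits * 1e9 / f_bit_hz
    t_confirm_ns = k_windows * t_win_ns
    return t_confirm_ns + t_iso_ns + t_dig_ns + t_latch_hw_ns


# Assumed values, not taken from any datasheet.
t_latch = latch_time_ns(n_bits=64, f_bit_hz=20e6, k_windows=3,
                        t_iso_ns=50, t_dig_ns=100, t_latch_hw_ns=50)
confirm_share = (3 * 64 * 1e9 / 20e6) / t_latch
print(f"t_latch = {t_latch:.0f} ns, confirm share = {confirm_share:.0%}")
```

With these assumed values the confirm windows account for nearly all of the latch time, so N, K, and f_bit are the knobs that matter and isolator delay is secondary.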

Latency budget table (typical scale, knobs, and side effects)

| Segment | What it means | Typical dominance | Knobs (what can be changed) | Side effects / risks | Verification hook |
| --- | --- | --- | --- | --- | --- |
| ΣΔ window accumulate | Time required for density to represent the event with useful margin. | Usually the main delay | N, f_bit, K, fast+slow windows | Small N increases false trips under dv/dt or noise; large N misses short faults. | Pulse-width sweep; record trip probability vs width and amplitude. |
| Isolator transport | Barrier propagation delay and integrity (skew, swallow, runt pulses). | Usually secondary | Isolator choice, VDD decoupling, routing near barrier | Errors can distort the window time-base and bias the decision. | dv/dt injection + bit-health counters; edge-correlated burst length stats. |
| Digital deglitch / confirm | Debounce logic that prevents chatter and dv/dt spikes from latching faults. | Often the “stability knob” | K-of-M, hysteresis thresholds, blanking windows | Too strict increases latency and can miss short faults. | No-fault dv/dt stress; measure false-latch rate per hour or per event. |
| Latch / report path | How “fault = true” becomes a latched bit, a pin change, and a system action. | Can dominate in firmware | HW latch vs SW latch, FPGA clock, interrupt priority | Non-deterministic firmware can break worst-case guarantees. | Worst-case load tests; repeated measurements under interrupt stress. |

What to lock in as requirements (prevents “fast/slow” ambiguity)

  • Time metric: define t_detect, t_latch, and t_report explicitly.
  • Minimum fault pulse width: specify the shortest event that must latch (under worst dv/dt and noise).
  • False latch rate: define allowed false events per hour (or per 10^6 dv/dt edges).
  • dv/dt stress condition: specify amplitude and slew-rate, reference points, and repetition rate.
  • Bit health criteria: define what “missing pulses / runt pulses” means and when decisions must be blocked.
Figure: Timing budget from analog event to latched fault (t_latch). A timing bar chart shows four segments: sigma-delta window, isolator transport, digital deglitch, and latch/report, ending at the fault-latched marker. Knobs: N, f_bit, K, blanking, latch path.

How to implement thresholds/windows digitally (so it behaves like a comparator)

After the isolation barrier, the “comparator” becomes digital. The goal is identical to an analog comparator: stable behavior near threshold, controlled hysteresis, glitch immunity, and a deterministic latch. The implementation must also survive dv/dt-induced burst errors and missing pulses without turning normal switching edges into fault latches.

Bitstream → comparable quantity (three practical implementations)

1) Moving average (ones counting)
  • Output: ones_count / N (or ones_count) over an N-bit window.
  • Why it works: direct comparator analogy: “count above T” equals “input above threshold.”
  • Best use: minimal logic, clear thresholds and windows.
  • Watch: burst errors can push count across T → guardband + confirm windows recommended.
2) Simplified CIC / boxcar integration
  • Output: an integrated value that behaves like a smoother density estimate.
  • Why it works: reduces random density jitter and improves stability near threshold.
  • Best use: when false trips must be very low and latency budget allows smoothing.
  • Watch: effective window length and latency must be budgeted like N / f_bit.
3) Dual-window (fast + slow)
  • Output: fast estimate for quick suspicion + slow estimate for confirmation.
  • Why it works: isolates short dv/dt bursts from real sustained faults.
  • Best use: systems with clustered edge-correlated errors or strict false-trip limits.
  • Watch: state machine must define entry/exit thresholds, blanking, and retry explicitly.
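The ones-counting and dual-window ideas can be sketched with a small sliding counter. This is one possible software model, not the only implementation (an FPGA would typically use a shift register and an up/down counter). The class name, window sizes, and thresholds are assumptions for illustration.

```python
from collections import deque


class OnesWindow:
    """Sliding N-bit window that tracks the number of ones."""

    def __init__(self, n_bits):
        self.n_bits = n_bits
        self.bits = deque(maxlen=n_bits)
        self.ones = 0

    def push(self, bit):
        if len(self.bits) == self.n_bits:
            self.ones -= self.bits[0]  # oldest bit is about to be evicted
        self.bits.append(bit)
        self.ones += bit
        return self.ones

    def full(self):
        return len(self.bits) == self.n_bits


# Assumed setup: fast 8-bit window (trip at 7 ones) and slow 64-bit window (trip at 48 ones).
fast, slow = OnesWindow(8), OnesWindow(64)
stream = [1, 0] * 32 + [1] * 64  # 50% density, then a sustained full-scale condition
fast_hit = slow_hit = None
for i, bit in enumerate(stream):
    f, s = fast.push(bit), slow.push(bit)
    if fast_hit is None and fast.full() and f >= 7:
        fast_hit = i
    if slow_hit is None and slow.full() and s >= 48:
        slow_hit = i
print("fast window trips at bit", fast_hit, "; slow window trips at bit", slow_hit)
```

The fast window raises suspicion within a few bits of the step, while the slow window needs tens of bits to agree. A short burst of wrong bits can fool the fast window but not the slow one.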

Make it behave like a comparator (hysteresis, debounce, latch)

Digital hysteresis (TH+ / TH−)
  • Enter fault: require the estimate to exceed TH+ (the upper, stricter threshold).
  • Exit fault (if allowed): require the estimate to fall below TH− (the lower threshold, TH− < TH+).
  • Benefit: prevents chatter around threshold without increasing window length.
Debounce / confirmation (K-of-M)
  • Rule: latch only if K consecutive windows satisfy the fault condition.
  • Benefit: rejects isolated bursts and reduces false latches under dv/dt edges.
  • Cost: adds latency ≈ K · (N / f_bit) unless a fast path is used.
Blanking + retry (control non-ideal periods)
  • Blanking: block state transitions during boot, overload recovery, or known edge windows.
  • Bit health gating: if missing pulses or runt pulses exceed limits, block decisions and raise a diagnostic.
  • Retry: for recoverable faults, require a cool-down timer and stable TH− condition before clearing.
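Hysteresis, consecutive-window confirmation, and bit-health gating combine into a few lines of logic. The sketch below is a non-latching comparator model under assumed policies: confirmation is required only on entry, and an unhealthy window holds the output and discards any partial confirmation. The class and parameter names are hypothetical.

```python
class HysteresisConfirm:
    """Comparator-like decision: enter above TH+ for K consecutive windows, exit below TH-."""

    def __init__(self, th_plus, th_minus, k_confirm):
        if th_minus >= th_plus:
            raise ValueError("th_minus must be below th_plus")
        self.th_plus = th_plus
        self.th_minus = th_minus
        self.k_confirm = k_confirm
        self.fault = False
        self.streak = 0

    def update(self, estimate, bits_healthy=True):
        if not bits_healthy:
            self.streak = 0  # blanking: hold the output, drop partial confirmation
            return self.fault
        if not self.fault:
            self.streak = self.streak + 1 if estimate > self.th_plus else 0
            if self.streak >= self.k_confirm:
                self.fault = True
                self.streak = 0
        elif estimate < self.th_minus:
            self.fault = False
        return self.fault


# Assumed thresholds in ones-count units for a 64-bit window.
cmp_ = HysteresisConfirm(th_plus=50, th_minus=44, k_confirm=3)
outputs = [cmp_.update(e) for e in [40, 52, 41, 51, 53, 55, 47, 45, 43]]
print(outputs)
```

The single excursion to 52 is rejected because the next window falls back, and once the fault is set the output ignores values between TH− and TH+.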

State machine template (copy-ready structure)

| State | Entry condition | Exit condition | Timers / counters | Outputs |
| --- | --- | --- | --- | --- |
| Normal | Estimate below TH+; bit health OK; blanking inactive. | Enter Suspect when estimate exceeds TH+ (or fast window trips). | Clear confirm counter; clear diagnostics. | Fault output deasserted. |
| Suspect | First crossing of TH+ or fast window triggers suspicion. | Return to Normal if estimate drops below TH−; enter FaultLatched if K confirms pass. | Increment confirm counter; optional time-out back to Normal. | Optional pre-fault warning flag. |
| FaultLatched | Confirm counter reaches K (or hard-fast path triggers). | Enter Retry only if fault is defined as recoverable; otherwise require reset. | Freeze snapshot: estimate, thresholds, bit health, and timestamp. | Fault asserted; gate disable/report may trigger. |
| Retry | Recoverable faults only; cool-down timer starts. | Return to Normal when estimate stays below TH− for the required time with healthy bits. | Cool-down timer; stable-below counter; retry limit counter. | Fault may remain asserted or pulse-based per system policy. |
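The template maps directly onto a small state machine. The sketch below is one possible reading of the table, with several assumptions labeled in the comments: the confirm count is held (not reset) while the estimate sits between TH− and TH+, the fault output stays asserted during Retry, and the time-out, retry-limit counter, timestamp, and fast path are omitted. All names are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    NORMAL = auto()
    SUSPECT = auto()
    FAULT_LATCHED = auto()
    RETRY = auto()


class FaultFsm:
    def __init__(self, th_plus, th_minus, k_confirm, cooldown_windows, recoverable=True):
        self.th_plus, self.th_minus = th_plus, th_minus
        self.k_confirm = k_confirm
        self.cooldown_windows = cooldown_windows
        self.recoverable = recoverable
        self.reset()

    def reset(self):
        """External reset: the only exit from FAULT_LATCHED for non-recoverable faults."""
        self.state = State.NORMAL
        self.confirm = 0
        self.stable_below = 0
        self.snapshot = None

    def step(self, estimate, bits_healthy=True):
        """Advance by one decision window; return True while the fault output is asserted."""
        if not bits_healthy:
            # Assumed blanking policy: no transition on untrusted data, restart counts.
            self.confirm = 0
            self.stable_below = 0
        elif self.state is State.NORMAL:
            if estimate > self.th_plus:
                self.state, self.confirm = State.SUSPECT, 1
        elif self.state is State.SUSPECT:
            if estimate > self.th_plus:
                self.confirm += 1
            elif estimate < self.th_minus:
                self.state, self.confirm = State.NORMAL, 0
            # Between TH- and TH+: hold the count (assumption).
        elif self.state is State.FAULT_LATCHED:
            if self.recoverable:
                self.state, self.stable_below = State.RETRY, 0
        elif self.state is State.RETRY:
            self.stable_below = self.stable_below + 1 if estimate < self.th_minus else 0
            if self.stable_below >= self.cooldown_windows:
                self.state, self.confirm = State.NORMAL, 0

        if self.state is State.SUSPECT and self.confirm >= self.k_confirm:
            self.state = State.FAULT_LATCHED
            self.snapshot = {"estimate": estimate, "th_plus": self.th_plus,
                             "th_minus": self.th_minus}
        return self.state in (State.FAULT_LATCHED, State.RETRY)


fsm = FaultFsm(th_plus=50, th_minus=44, k_confirm=2, cooldown_windows=2)
trace = [fsm.step(e) for e in [40, 52, 53, 60, 43, 42, 41]]
print(trace, fsm.state.name)
```

Keeping the snapshot at the moment of latching is what makes a field trip traceable afterwards; the fields stored here are a minimal subset of what the table lists.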
Figure: Digital comparator behavior using windows, hysteresis, debounce, and a latch. Block diagram shows bitstream input, N-bit averaging, compare with TH+ and TH−, K-of-M debounce, latch, and report outputs, with a bit health monitor controlling the blanking gate.

Threshold accuracy & drift: turning offset/TC/noise into “trip-point error”

Threshold accuracy in an isolated ΣΔ fault chain should be evaluated as a single engineering quantity: trip-point error (input-referred), expressed in mV (voltage trip) or A (shunt-current trip). This avoids mixing unrelated specs and makes guardband decisions measurable across temperature, dv/dt, and component tolerance.

Trip-point error: a consistent definition (what must be budgeted)

Static error (room, steady state)
Initial offset and divider ratio tolerance, plus bias × source resistance effects that shift the apparent threshold.
Drift error (temperature / gradient)
Offset drift, reference TC, resistor TC, and thermal gradients that create non-uniform temperatures across the divider and clamps.
Decision uncertainty (noise + windowing)
Short-term noise and window statistics spread the decision near threshold; dv/dt burst errors can add a temporary bias.

Guardband process (avoid over-reject while preventing missed faults)

  1. Build an input-referred budget: convert each contributor into an equivalent trip shift at the input (mV or A), then separate bias/drift (worst-case) from noise/uncertainty (statistical).
  2. Set a risk target: define acceptable false-latch rate and missed-fault rate under dv/dt and temperature extremes. Guardband is meaningless without a risk objective.
  3. Translate risk into margins: add worst-case bias/drift linearly, then add a statistical margin for uncertainty based on the chosen confirmation logic (window length and K-of-M). The final threshold uses Trip_set = Trip_required ± Guardband in the input domain.
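Step 3 can be written out as a short calculation. The sketch below adds the worst-case bias/drift terms linearly, as described above. For the statistical part it assumes independent noise sources combined as a root-sum-square and scaled by a factor `k_sigma`; that combination rule and every number are assumptions, and the function names are hypothetical.

```python
import math


def guardband(worst_case_shifts, noise_sigmas, k_sigma):
    """Linear sum of worst-case shifts plus k_sigma times the RSS of the noise terms (assumed model)."""
    bias = sum(abs(s) for s in worst_case_shifts)
    sigma = math.sqrt(sum(s * s for s in noise_sigmas))
    return bias + k_sigma * sigma


def trip_setpoint(trip_required, gb, direction):
    """direction = -1 places the set point below the requirement (trip early), +1 above it."""
    return trip_required + direction * gb


# Assumed input-referred contributors in mV: divider, bias x R, modulator drift, reference TC.
gb_mv = guardband([0.30, 0.20, 0.15, 0.10], noise_sigmas=[0.03, 0.04], k_sigma=4)
trip_set_mv = trip_setpoint(50.0, gb_mv, direction=-1)
print(f"guardband = {gb_mv:.2f} mV, Trip_set = {trip_set_mv:.2f} mV")
```

The choice of `k_sigma` is where the risk target from step 2 enters: a stricter false-latch or missed-fault target calls for a larger multiple, or for a longer window that shrinks the noise terms.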

Trip-point error budget table (fielded, input-referred)

| Source | Where it enters | Input-referred impact | Temp behavior | Calibratable | Primary knob | Mitigation | Verification hook |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Divider ratio tolerance | Sensor / divider | Direct trip shift (first-order) | TC + long-term drift | Partial | Resistor class, ratio matching | Use matched networks; place isothermally; avoid gradients. | Room + temp sweep; ratio inference from known stimulus. |
| Input bias × R_source | Front-end R/RC/divider | Trip shift grows with R_source | Bias often increases with temperature | Partial | Lower R_source, buffering | Keep R low where accuracy matters; use leakage-aware parts. | Measure trip vs added series R across temperature. |
| Modulator offset & drift | Modulator input stage | Input-referred trip shift | Offset drift dominates long spans | Partial | Device class, calibration strategy | Use stable references; calibrate where repeatable; log drift. | Temp sweep with fixed stimulus; track inferred offset vs T. |
| Reference Vref error & TC | Reference / threshold “ruler” | Scales all thresholds | TC + load regulation sensitivity | Partial | Reference grade, filtering, buffering | Keep Vref quiet; avoid dynamic loads; isolate grounds. | Measure trip shift vs VDD ripple and temperature. |
| Decision uncertainty (window + dv/dt) | Digital window + isolation integrity | Turns trip into probability near threshold | Burst errors correlate with switching edges | Yes (logic) | N, K, TH+/TH−, blanking | Use hysteresis + confirm; gate decisions on bit health under dv/dt. | No-fault dv/dt stress; compute false-latch rate and margin. |
Figure: Trip-point error budget funnel from sensor to digital decision. A funnel diagram shows layered contributors (sensor/divider, front-end bias × R, modulator offset/drift, reference Vref TC, and digital uncertainty) combining into the input-referred trip-point error in mV or A.

Front-end protection & survivability on the high side

High-side environments punish small-signal assumptions. Protection components can save the chain during surge, reverse connection, and switching noise—but they can also shift the threshold, slow the response, and extend overload recovery. The protection network must therefore be designed as a survive-first system with measurable impacts on trip accuracy and detection time.

High-side protection priorities (prevents “accuracy-only” mistakes)

  • Survive first: a chain that remains alive and predictable is more valuable than a chain that is “very accurate” only in benign lab conditions.
  • Limit current before clamp: clamps without current limiting can create ground bounce, heating, and recovery artifacts that look like faults.
  • Recovery matters: after a large event, define how long it takes until decisions are trustworthy again.

Protection network in 3 steps (each step has a cost)

Step 1 — Current limiting (series R / controlled impedance)
  • Goal: keep surge/ESD currents within the safe limits of clamps and input structures.
  • Cost: increases R_source → larger bias×R trip shift and slower settling.
  • Check: measure peak current during stress and verify trip stability near threshold.
Step 2 — Clamping (TVS / diode clamps / divider clamp)
  • Goal: prevent nodes from exceeding absolute maximum ratings during abnormal events.
  • Cost: leakage and dynamic resistance can shift trip points, especially at high temperature.
  • Check: temperature sweep of trip point; post-surge offset shift and recovery checks.
Step 3 — Filtering (RC / EMI shaping)
  • Goal: reduce high-frequency injection and suppress chatter from slow ramps or noisy wiring.
  • Cost: adds latency and may hide short faults; can lengthen overload recovery time.
  • Check: minimum fault pulse-width sweep and dv/dt false-latch testing.
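Two of the costs listed above are easy to estimate before any lab work: the bias × R trip shift from the series resistor and the settling delay added by an RC filter. The sketch below assumes a constant input bias current and a single-pole RC; the component values and function names are illustrative assumptions.

```python
import math


def bias_shift_v(i_bias_a, r_source_ohm):
    """Input-referred trip shift caused by bias current flowing through the source resistance."""
    return i_bias_a * r_source_ohm


def rc_settle_time_s(r_ohm, c_f, residual_fraction):
    """Time for a single-pole RC step response to come within residual_fraction of its final value."""
    return -r_ohm * c_f * math.log(residual_fraction)


# Assumed values: 50 nA bias, 1 kOhm series resistor, 1 nF filter capacitor, settle to 1%.
shift_v = bias_shift_v(50e-9, 1e3)
settle_s = rc_settle_time_s(1e3, 1e-9, 0.01)
print(f"trip shift = {shift_v * 1e6:.0f} uV, settling delay = {settle_s * 1e6:.2f} us")
```

Both results belong in the budgets already defined: the shift goes into the trip-point error budget and the settling delay into the latency budget, ahead of the ΣΔ window.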

Protection → accuracy/latency/recovery impact map (what to measure)

| Element | Threshold shift risk | Latency impact | Recovery impact | Primary mitigation | Verification hook |
| --- | --- | --- | --- | --- | --- |
| Series resistor (limit) | Bias×R shift increases with R | Settling slows | Often neutral | Keep R minimal for accuracy; place close to entry | Trip vs R sweep; pulse injection for edge cases |
| TVS / clamp diode | Leakage and dynamic resistance | Can distort edges | Can lengthen recovery after conduction | Limit current before clamping; choose low-leakage parts | Temp sweep; post-surge trip repeatability check |
| RC filter | Bias×R and leakage can shift | Adds delay | Can slow return to normal | Use smallest C that solves EMI; validate minimum pulse width | Pulse-width sweep; dv/dt false-latch stress |
| Overload recovery path | Temporary bias after large events | Can block valid detection windows | Defines “trustworthy again” time | Blanking + health gating; log recovery markers | Step stress then near-threshold probing to find stable decision point |
Figure: High-side input protection network for an isolated sigma-delta fault chain (survivability + predictable trip). Block diagram shows the high-side shunt/bus signal passing through a series resistor, RC filter, divider, and TVS clamp into the sigma-delta modulator input, with the differential path and the common-mode return path (ground bounce / dv/dt current) emphasized.

Powering & grounding the high-side modulator (without becoming an isolation PSU page)

High-side power and return paths directly shape fault decisions in an isolated ΣΔ chain. Supply ripple can shift the modulator’s effective threshold and bias the bitstream density, while dv/dt-driven common-mode currents can create ground bounce and burst-like decision errors. The goal is to make the high-side domain quiet, compact, and predictable under switching stress.

Translate “power/ground issues” into observable chain behavior

Slow bias (trip-point drift)
Supply/REF noise or drift changes the modulator’s internal “ruler”, shifting the bit density baseline and moving the apparent threshold.
Burst errors (dv/dt-correlated)
Common-mode currents through barrier parasitics force return path voltage steps (ground bounce), disturbing sampling and causing short error bursts.

The 3 loops that must be checked (each creates a different kind of misdecision)

1) Supply ripple loop
  • Mechanism: VDD/REF ripple changes internal thresholds and biases bit density.
  • What to log: local VDD ripple (p-p), REF node ripple (if used), correlation to false-latch events.
  • Actions: place decoupling at pins; minimize loop area; keep high di/dt currents inside a small local region.
2) Input loop
  • Mechanism: long or asymmetric input return paths convert common-mode injection into differential error near threshold.
  • What to log: dv/dt stress vs input-node glitch amplitude, chatter counts around threshold, temperature sensitivity.
  • Actions: keep input network compact and symmetric; avoid routing the input return near the barrier region.
3) Barrier loop
  • Mechanism: dv/dt drives iCM through parasitic capacitance; return path inductance creates ground bounce and burst-like errors.
  • What to log: burst length distribution (bit-health counter), false-latch rate under no-fault dv/dt stress.
  • Actions: control cross-barrier capacitance; keep the common-mode return path short and intentional.
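
The barrier-loop mechanism can be sized with two first-order relations: i_CM = Cpar · dv/dt, and a bounce estimate V = L · di/dt with di/dt approximated as i_CM divided by the current rise time. This is a rough sketch, not a field solver; the capacitance, inductance, and edge values are hypothetical and should be replaced with measured or extracted numbers.

```python
def common_mode_current(c_par_farad, dvdt_v_per_s):
    """Displacement current through barrier parasitic capacitance: i = C * dv/dt."""
    return c_par_farad * dvdt_v_per_s

def ground_bounce(l_return_henry, i_cm_amp, t_rise_s):
    """First-order bounce estimate: V = L * di/dt, with di/dt ~ i_cm / t_rise."""
    return l_return_henry * i_cm_amp / t_rise_s

# Hypothetical values (assumptions): 2 pF across the barrier, 50 kV/us edge,
# 5 nH return inductance, 5 ns current rise time.
i_cm = common_mode_current(2e-12, 50e9)
v_bounce = ground_bounce(5e-9, i_cm, 5e-9)
print(f"i_CM = {i_cm * 1e3:.0f} mA, estimated bounce = {v_bounce * 1e3:.0f} mV")
```

Comparing the estimated bounce against the receiver's logic threshold margin indicates whether reducing Cpar or shortening the return path should come first.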

Symptom → root cause → action map (keeps debug measurable)

| Observed symptom | Likely loop | Mechanism | Quick measurement | Fix priority | Regression test |
|---|---|---|---|---|---|
| Trip point drifts with load or switching state | Supply ripple | VDD/REF ripple biases bit density baseline | Local VDD & REF ripple at modulator pins | Decoupling placement + loop area | Repeat trip sweep vs switching states |
| False latches only near dv/dt edges | Barrier | iCM through Cpar causes ground bounce and burst errors | Bit-health counter + burst length histogram | Reduce cross-barrier capacitance; shorten return | No-fault dv/dt stress with fixed thresholds |
| Chatter around threshold under slow ramps | Input loop | Input injection and asymmetry create differential noise | Count toggles in a threshold window; measure node glitch | Tighten symmetry; shorten input return | Ramp tests at multiple dv/dt conditions |
[Figure: Common-mode current loop and ground bounce across the isolation barrier. Diagram shows high-side and low-side domains separated by an isolation barrier, with parasitic capacitance driving common-mode current during dv/dt edges, creating ground bounce and burst errors (dv/dt → Cpar → iCM → ground bounce → burst errors).]

Layout & isolation barrier implementation: creepage/clearance + symmetry + EMI

Layout is the real CMTI implementation. Barrier zoning, cross-gap capacitance control, and compact return paths decide whether the chain behaves predictably under dv/dt. The checklist below prioritizes actions that reduce unintended coupling and measurement traps that can create “fake glitches”.

Layout review checklist (priority-ordered)

P0 — Barrier rules (do these first)
  • Creepage/clearance: verify gap and keepout satisfy the intended insulation class; keep the barrier zone clean.
  • No accidental cross-gap copper: avoid copper pours, vias, test pads, and silkscreen that reduce effective distance or add Cpar.
  • Cross-gap capacitance control: treat any intentional capacitor across the barrier as a controlled element; remove unintentional ones.
P1 — Loop area & return paths
  • Decoupling loop: place capacitors at modulator pins; minimize VDD–C–GND loop area.
  • Input loop: keep the input network compact; avoid routing returns near the barrier zone.
  • High di/dt distance: keep switching nodes and gate-drive loops away from the barrier and sensitive inputs.
P2 — Symmetry & differential integrity (if applicable)
  • Input symmetry: match parasitics on both input legs; avoid asymmetry that converts common-mode injection into differential error.
  • Reference consistency: keep sensitive references local to the modulator domain; avoid cross-domain return mixing.
  • Controlled routing: keep critical traces short, direct, and away from noisy edges; only use “differential” where the chain truly is differential.
P3 — Test points & probing traps
  • Ground clip loop: long ground leads create loop pickup and can turn dv/dt into “fake spikes”.
  • Probe capacitance: loading can create ringing or swallow narrow pulses; use low-inductance probing methods.
  • Measurement planning: reserve safe probing pads in each domain so correlation tests do not alter the circuit behavior.

Cross-gap risk points (field-by-field checklist for barrier integrity)

| Risk point | Why risky | Where to look | Fix | How to verify |
|---|---|---|---|---|
| Cross-gap copper pour | Increases Cpar and couples dv/dt energy | Barrier edges, planes near isolator pins | Pull copper back; enforce keepout zones | No-fault dv/dt stress; compare burst counters |
| Test pads near barrier | Adds parasitics and encourages unsafe probing loops | Probe points close to isolation slit | Move pads inward; add domain-local references | Repeat measurements with low-inductance probing |
| Unintentional cross-gap capacitance | Creates a hidden iCM return path and ground bounce | Isolator package region; planes under components | Reduce area; increase distance; control intentional caps | Burst length distribution vs dv/dt edge timing |
[Figure: Layout zoning map and critical loops for an isolated sigma-delta chain. A simplified board map shows high-side, barrier keepout, and low-side zones with three highlighted loops: decoupling loop, input loop, and barrier common-mode loop. Probe hazards (ground clip) are marked.]

Verification plan: prove CMTI robustness + fault timing + false-trip rate

Reliability must be demonstrated as a repeatable test plan, not a promise. This section defines stimulus, setup, metrics, and pass gates for dv/dt-driven robustness (CMTI), end-to-end fault response time, and long-run false-trip rate. The same plan should be runnable in the lab, in EVT/DVT, and as a reduced “stress screen” in production.

Definitions (keep tests reproducible)

  • Event time (t0): the injected input crosses the configured threshold (or window boundary) at the high-side input node.
  • Decision time (tD): the low-side digital comparator state crosses its decision rule (window count / filter output / debounced state).
  • Latched time (tL): a persistent fault indication is asserted (hardware latch / FPGA flag / MCU latched status bit).
  • Bit error: a deviation in the isolated stream that would not occur under the same input without dv/dt stress (includes burst errors).
  • False trip: tL occurs in a “no-fault” condition where the input is held safely away from the threshold/window.

Test cases (Stimulus / Setup / Metrics / Pass criteria / Notes)

| Stimulus | Setup | Metrics | Pass criteria | Notes |
|---|---|---|---|---|
| No-fault dv/dt injection (input held away from threshold) | Apply worst-case dv/dt at the switching node; keep input margin > configured hysteresis + decision margin. | False-trip count; burst length histogram; bit-health counter (edges/ones-count variance). | False trips = 0 over a long-run dv/dt campaign (example gate: ≥1e6 dv/dt edges) and no burst crosses decision margin. | Run two margins: “near-threshold” and “far-from-threshold” to separate coupling from genuine sensitivity. |
| Fault step injection (cross threshold with controlled overdrive) | Inject a fast step at the high-side input (or shunt emulator). Time-align t0 at the actual input node. | Latency distribution: tD−t0, tL−t0 (p50/p99/p999); overshoot/ringing at input node. | p99 and p999 meet system timing budget; latch is monotonic (no de-latch) under dv/dt stress. | Repeat at multiple window lengths to validate the configured delay knobs. |
| Short-pulse fault (pulse width sweep) | Generate pulses across threshold with controlled width and amplitude; test both polarities if relevant. | Minimum detectable pulse width; miss rate vs width; false latch vs blanking. | No false latch below the configured blanking threshold; detection above the specified fault minimum. | This validates debounce/blanking policy and prevents “chatter-driven” trips. |
| Temperature sweep + drift separation | Sweep temperature points with soak; include a stable reference input and a dummy channel for measurement-chain tracking. | Trip-point vs temperature; drift vs time; difference between channel-under-test and reference/dummy. | Drift is within trip-point budget; observed drift is not dominated by measurement chain (checked via reference/dummy). | Record configuration per point (window, hysteresis, firmware version) to keep results comparable. |
| Long-run false-trip soak (no fault, real operating noise) | Run with real switching patterns and worst-case EMI environment; keep input safely away from threshold. | False-trip rate upper bound; time-to-false-trip; bit-health trends. | No false trips in the planned mission-time equivalent soak (convert to an upper bound rate). | Use consistent logging intervals and a monotonic event counter for credibility. |
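
A zero-false-trip campaign does not prove a zero rate; it bounds the rate. A minimal sketch of that conversion follows, assuming each dv/dt edge is an independent trial with the same false-trip probability (an idealization: burst-driven errors are often correlated). With zero events in n trials, the upper bound p at a given confidence solves (1 − p)^n = 1 − confidence.

```python
import math

def zero_failure_upper_bound(n_trials, confidence=0.95):
    """Upper bound on per-trial event probability after n_trials with zero events.

    Solves (1 - p)**n = 1 - confidence for p, using log1p/expm1 to keep
    precision when p is very small.
    """
    if n_trials <= 0:
        raise ValueError("n_trials must be positive")
    return -math.expm1(math.log1p(-confidence) / n_trials)

p = zero_failure_upper_bound(1_000_000)
print(f"95% upper bound after 1e6 clean edges: {p:.3e} per edge")
```

For large n the 95% bound is close to 3/n, which is a quick way to size a campaign: a target bound ten times lower needs ten times as many clean edges.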

Example reference chains (specific part numbers)

These are example ICs often used to build isolated ΣΔ fault-detection chains. Use them as a concrete starting point for the verification fixture and data logging.

  • Isolated ΣΔ modulator (bitstream across barrier): TI AMC1303 / AMC1336, ADI AD7403 / AD7405
  • Bitstream digital filter / decimation (low-side): TI AMC1210 (for CIC/decimation-style processing)
  • Extra digital isolation channels (GPIO / latch / status): TI ISO7721 / ISO7721-Q1, SiLabs Si86xx family
[Figure: Verification bench for dv/dt injection, fault timing, and false-trip statistics. Block diagram shows the dv/dt injection source and switching node, high-side input fault injection, the isolated sigma-delta chain, and low-side observation points with counters and timing capture. A temperature chamber is included as an environmental variable.]

Engineering checklist & production hooks (binning, self-test, diagnostics)

Production readiness is where isolated fault chains either become trustworthy products or endless debug loops. The goal is to define a minimal data schema, concrete self-tests, and binning rules that map failures back to adjustable knobs (window length, hysteresis, debounce, layout risk, power/return paths).

Production minimal data schema (fielded and traceable)

| Group | Must-have fields | Why it matters | Notes |
|---|---|---|---|
| Traceability | Serial number, lot/date code, PCB revision, timestamp | Enables correlation to suppliers and process shifts | Keep stable naming conventions across stations |
| Configuration | Threshold(s), window length, hysteresis, debounce/blanking, retry policy, firmware version | Prevents “same unit, different rules” confusion | Treat decision rules as part of the product spec |
| Environment | Temperature point(s), VDD_HS, VDD_LS, stress tag (optional dv/dt profile ID) | Separates drift from station variation | Use consistent soak policy across lots |
| Measurements | Trip-point (mV/A), timing (p50/p99), bit-health counters, false-trip count | Converts reliability into numbers that can be binned and trended | Log the decision parameters used for each measurement |
| Diagnostics | Open/short flags, saturation/overload recovery time, stuck-at flags, clock-present flag | Enables field returns to map to a root-cause class | Keep codes stable to support fleet analytics |

Binning rules (map failures back to knobs)

| Bin | Trigger condition | Most likely cause class | First knobs to adjust | Fast confirmation |
|---|---|---|---|---|
| BIN-FALSE | Any false trip in no-fault dv/dt stress | Barrier coupling / ground bounce / insufficient debounce | Debounce window, hysteresis, cross-gap capacitance control, return path | Burst counter vs dv/dt timing correlation |
| BIN-TIME | p99 latency exceeds timing budget | Window too long / filter too heavy / slow reporting path | Window length, filter stages, latch point (HW vs SW), IRQ policy | Timing histogram across configurations |
| BIN-TRIP | Trip-point error or temperature drift exceeds budget | Offset/TC, reference drift, bias×R, divider tolerance/gradient | Reference filtering, divider values, calibration policy, thermal placement | Compare to reference/dummy channel drift |
| BIN-LINK | Clock missing, stuck-at, no toggling, or invalid stream | Isolation link fault / clock chain / power integrity | Clock present monitor, stuck-at test, supply checks, connector integrity | Edge-count threshold + stream sanity windows |
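
The binning rules translate directly into a small classifier. The sketch below is illustrative only: the field names, the `Limits` structure, and the `"PASS"` label are assumptions introduced here, and a unit may land in several bins at once so that the dominant reject cause can be trended.

```python
from dataclasses import dataclass

@dataclass
class UnitResult:
    false_trips: int        # count during no-fault dv/dt stress
    p99_latency_s: float    # measured p99 of tL - t0
    trip_error_pct: float   # signed trip-point error vs nominal
    link_ok: bool           # clock present, not stuck-at, stream valid

@dataclass
class Limits:
    latency_budget_s: float
    trip_error_budget_pct: float

def assign_bins(result, limits):
    """Return every bin whose trigger condition is met, or ["PASS"]."""
    bins = []
    if not result.link_ok:
        bins.append("BIN-LINK")
    if result.false_trips > 0:
        bins.append("BIN-FALSE")
    if result.p99_latency_s > limits.latency_budget_s:
        bins.append("BIN-TIME")
    if abs(result.trip_error_pct) > limits.trip_error_budget_pct:
        bins.append("BIN-TRIP")
    return bins or ["PASS"]

limits = Limits(latency_budget_s=10e-6, trip_error_budget_pct=2.0)
print(assign_bins(UnitResult(0, 5e-6, 0.5, True), limits))
print(assign_bins(UnitResult(1, 20e-6, -3.0, True), limits))
```

Checking the link bin first is a deliberate choice: timing and trip-point numbers from a unit with an invalid stream are not trustworthy, so a production flow may prefer to stop at BIN-LINK.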

Example IC part numbers for production-ready hooks

  • Isolated ΣΔ modulator (bitstream across barrier): TI AMC1303 / AMC1336, ADI AD7403 / AD7405
  • Bitstream decimation / digital filtering companion: TI AMC1210
  • Extra isolated status/control channels: TI ISO7721 / ISO7721-Q1, SiLabs Si86xx family
[Figure: Production data flow: acquire → decide → bin → store → feedback. Flow diagram shows production measurement acquisition, the digital decision pipeline, self-test/diagnostics hooks, binning, data storage, and the feedback loop to design knobs (window, hysteresis, layout).]

FAQs: isolated ΣΔ chain + CMTI + digital decision (short, actionable)

These FAQs close long-tail issues around dv/dt robustness, isolated bitstream integrity, and “comparator-like” digital decisions. Each answer follows the same data-first structure: Symptom → Quick checks → Likely causes → Fix/guardband actions with measurable gates.

CMTI spec looks “enough,” but random false trips still happen. Which 3 loops to check first?
Symptom
Fault latch asserts in a “no-fault” condition, typically correlated with switching edges (high dv/dt) but not repeatable with simple DC tests.
Quick checks (≤3)
  • Correlate false-trip timestamps with switching-node dv/dt edges (same-cycle alignment is a strong clue).
  • Log bit-health counters: burst errors, missing edges, and window ones-count excursions during dv/dt.
  • Probe isolator RX-side supply ripple at the pins; look for edge-synchronous droop/spikes.
Likely causes (most common first)
  1. Barrier coupling loop: dv/dt drives common-mode current through parasitic capacitance across the barrier, disturbing RX logic thresholds.
  2. Supply/return loop: edge currents create local ground bounce or VDD droop at RX/decision logic, turning coupling into bit errors.
  3. Decision-margin loop: window/threshold margin is too tight, so rare bursts cross the decision boundary.
Fix / guardband actions (with gates)
  • Harden the 3 loops in order: reduce cross-gap capacitance (layout/partition), stabilize RX VDD/return (decoupling at pins), then widen decision margin (window + hysteresis + debounce).
  • Gate: false trips must be 0 over a dv/dt campaign of ≥ 1e6 switching edges (or the mission-time equivalent).
  • Gate: keep a statistical decision margin: window ones-count threshold should be separated by ≥ 6σ from “no-fault” distribution (reduce σ or increase margin).
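
The 6σ separation gate can be checked from logged no-fault window counts. This is a minimal sketch assuming the trip fires when the window ones-count rises above the threshold; the sample counts are invented for illustration, and a real check needs far more windows than shown.

```python
import math
import statistics

def sigma_margin(no_fault_counts, threshold):
    """Distance from the no-fault mean to the trip threshold, in sample standard deviations."""
    mean = statistics.fmean(no_fault_counts)
    sigma = statistics.stdev(no_fault_counts)
    if sigma == 0:
        return math.inf
    return (threshold - mean) / sigma

def meets_sigma_gate(no_fault_counts, threshold, k=6.0):
    """True when the threshold sits at least k sigma above the no-fault distribution."""
    return sigma_margin(no_fault_counts, threshold) >= k

counts = [30, 32, 34, 32, 30, 34]   # illustrative ones-counts per window
print(f"margin at threshold 48: {sigma_margin(counts, 48):.1f} sigma")
print(f"margin at threshold 40: {sigma_margin(counts, 40):.1f} sigma")
```

The estimate is only as good as the data behind σ: counts logged without dv/dt stress will understate the spread and overstate the margin.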
Under dv/dt, the bitstream occasionally “drops pulses.” How to tell isolator issues from ground-bounce or supply issues?
Symptom
Edge count on the received stream occasionally decreases (missing toggles), or short gaps appear, often during fast switching transitions.
Quick checks (≤3)
  • Time-align: capture dv/dt edge markers and RX edge gaps; check if gaps occur within a fixed time window after dv/dt edges.
  • Compare two probes: RX output vs RX VDD ripple; gaps that follow VDD droop are usually power/return driven.
  • Repeat at reduced switching dv/dt (slower edges / lower bus swing): if gaps scale strongly with dv/dt, coupling is dominant.
Likely causes (most common first)
  1. RX supply integrity collapse: transient VDD/ground movement causes internal sampling/threshold errors, producing pulse swallow.
  2. Barrier common-mode injection: dv/dt current couples into RX input stage or reference, creating short “blind windows.”
  3. Measurement artifact: probing/ground leads add capacitance/inductance that creates apparent gaps or suppresses narrow pulses.
Fix / guardband actions (with gates)
  • Stabilize RX pins: place high-frequency decoupling at RX VDD pins; minimize loop area; keep return continuous to the decision logic.
  • Gate: during dv/dt stress, RX VDD transient should stay within ±1% of its nominal level at the pins (or tighter if required by the device).
  • Gate: edge-gap rate should be below the decision tolerance: “burst length” must never exceed what can cross the window threshold.
Longer averaging windows are steadier, but latency exceeds budget. How to trade false-trip rate vs latency?
Symptom
With a long window, false trips reduce, but end-to-end detection time (tL−t0) becomes too slow for protection timing.
Quick checks (≤3)
  • Plot latency histogram (p50/p99/p999) vs window length; identify diminishing returns region.
  • Plot no-fault ones-count distribution vs window length; measure σ shrink vs added delay.
  • Check if false trips are burst-driven (dv/dt correlated) or noise-driven (random); strategy differs.
Likely causes (most common first)
  1. Single-window design: one window is trying to do both “fast protect” and “low false-trip,” forcing it to be long.
  2. Decision margin too tight: short windows exist but threshold is set too close to the no-fault distribution tail.
  3. Unmodeled burst errors: long windows only “average away” bursts instead of preventing them at the source.
Fix / guardband actions (with gates)
  • Use dual-path decisions: a short “fast suspect” window plus a longer “confirm” window (Normal → Suspect → FaultLatched).
  • Gate: fast path must meet p99 latency budget; confirm path must meet false trips = 0 in the dv/dt campaign gate.
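  • Guardband rule: set threshold margin by statistics (k·σ). Typical starting gates: k≈5 to k≈6. For an ideal Gaussian the one-sided tail beyond 5σ is about 0.3 ppm and beyond 6σ about 0.001 ppm; burst-driven distributions have heavier tails, so treat these figures as best case and confirm against measured tails.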
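
The Normal → Suspect → FaultLatched progression can be sketched as a small state machine. The class name, thresholds, and the convention of passing `confirm_count=None` while the long window is still filling are all assumptions made for this example; a real design would also decide what protective action, if any, the Suspect state triggers on its own.

```python
from enum import Enum

class State(Enum):
    NORMAL = "Normal"
    SUSPECT = "Suspect"
    FAULT_LATCHED = "FaultLatched"

class DualPathDetector:
    """Short window raises suspicion quickly; long window confirms before latching."""

    def __init__(self, fast_threshold, confirm_threshold):
        self.fast_threshold = fast_threshold
        self.confirm_threshold = confirm_threshold
        self.state = State.NORMAL

    def update(self, fast_count, confirm_count=None):
        """fast_count: ones-count of the latest short window.
        confirm_count: ones-count of a completed long window, or None if not ready."""
        if self.state is State.FAULT_LATCHED:
            return self.state                      # latch holds until reset()
        if self.state is State.NORMAL:
            if fast_count >= self.fast_threshold:
                self.state = State.SUSPECT
        elif confirm_count is not None:            # SUSPECT and a confirm window completed
            confirmed = confirm_count >= self.confirm_threshold
            self.state = State.FAULT_LATCHED if confirmed else State.NORMAL
        return self.state

    def reset(self):
        self.state = State.NORMAL

det = DualPathDetector(fast_threshold=12, confirm_threshold=80)
for fast, confirm in [(5, None), (14, None), (13, None), (13, 90)]:
    print(det.update(fast, confirm).value)
```

Keeping the latch monotonic (no exit except `reset()`) matches the verification requirement that the latch must not de-latch under dv/dt stress.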
Adding an input RC reduces false trips, but the trip point shifts. Is it bias×R or divider error?
Symptom
After adding series resistance / RC filtering, the threshold in volts or amps moves, even though the digital decision policy did not change.
Quick checks (≤3)
  • Change the series R by a known factor (e.g., ×2): if trip shift scales with R, bias×R dominates.
  • Measure trip point at two temperatures: divider TC shows temperature-correlated shift even with constant input source impedance.
  • Measure input node DC at the device pin (not at the source): bias/leakage errors appear as pin-vs-source mismatch.
Likely causes (most common first)
  1. Bias/leakage × source impedance: small input currents create a DC offset across added R.
  2. Divider tolerance/TC: the intended threshold reference moves with temperature or resistor mismatch.
  3. Clamp leakage: TVS/diodes leak more at temperature, shifting the pin DC level under high impedance.
Fix / guardband actions (with gates)
  • Reduce impedance at the decision pin: lower R values or buffer the source; keep bias×R error below the trip budget.
  • Gate: ensure estimated DC error from bias/leakage is < 10% of allowed trip-point error budget (tighter for precision windows).
  • Guardband: re-calculate trip-point error budget after adding RC, and re-set digital threshold using measured pin voltage.
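
The bias×R gate is plain arithmetic and worth scripting so it is re-run whenever R changes. The values below are hypothetical; the input bias or leakage current should come from the modulator and clamp datasheets at the worst-case temperature.

```python
def bias_r_error(i_bias_amp, r_series_ohm):
    """DC offset developed across the series resistance by input bias/leakage current (volts)."""
    return i_bias_amp * r_series_ohm

def within_gate(error_v, trip_budget_v, fraction=0.10):
    """True when the DC error stays below the given fraction of the trip-point error budget."""
    return abs(error_v) < fraction * trip_budget_v

# Hypothetical numbers (assumptions): 1 uA total bias + leakage, 2 mV trip-point budget.
for r in (100.0, 1000.0):
    err = bias_r_error(1e-6, r)
    print(f"R = {r:.0f} ohm: error = {err * 1e6:.0f} uV, gate ok = {within_gate(err, 2e-3)}")
```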
High-side overload recovery is slow and short-circuit pulses are missed. What are common root causes?
Symptom
After an overvoltage/overcurrent event, the chain stays “stuck” (saturated stream or stale decision), then fails to detect the next short pulse.
Quick checks (≤3)
  • Capture recovery time: time from overload removal to “normal” stream statistics (edge rate and ones-density return).
  • Check clamp conduction duration (TVS/diodes): long tail currents imply slow recovery and DC bias shift.
  • Check RC time constants at the input pin: large RC can “hold” the pin near saturation after the event.
Likely causes (most common first)
  1. Front-end saturation + long RC: input network stores charge, delaying return to the linear decision region.
  2. Clamp tail/leakage: protective elements keep injecting bias/leakage after the surge, shifting the pin.
  3. Decision “lock-up” policy: digital logic requires too long a “good stream” window before it re-arms.
Fix / guardband actions (with gates)
  • Limit and discharge: ensure the input network has a defined discharge path (bleed) and current limit that prevents deep saturation.
  • Gate: recovery-to-decision-ready time must be < 20% of the shortest fault pulse interval that must be detected.
  • Policy: implement a fast re-arm when stream sanity is restored (edge-count + density checks), then confirm with a longer window.
Chatter persists after digital filtering. Where should digital hysteresis live (before or after averaging)?
Symptom
The decision toggles repeatedly around the threshold (chatter), even when a moving average or CIC-like filter is used.
Quick checks (≤3)
  • Check correlation with dv/dt edges: if chatter aligns with dv/dt, treat as burst-driven (not analog noise).
  • Measure “distance to threshold” in statistics: how many σ separates no-fault distribution from the threshold?
  • Look at state transitions: is chatter coming from threshold crossing or from debounce/arming policy?
Likely causes (most common first)
  1. Insufficient decision hysteresis: filtering reduces noise but does not stop boundary toggling at the comparator state.
  2. Burst errors: dv/dt generates clusters of bit errors that push the filtered value across the boundary.
  3. Debounce policy too permissive: immediate re-arming causes repeated trips near the boundary.
Fix / guardband actions (with gates)
  • Default recommendation: place hysteresis after averaging (decision-domain hysteresis) and add a debounce window to prevent rapid toggling.
  • If analog-noise driven: add a small pre-filter deadband or input-domain hysteresis equivalent (reduce σ at the source).
  • Gate: require N consecutive decision windows in the same state before latch (example starting point: N=2–4 depending on latency budget).
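
A sketch of decision-domain hysteresis followed by an N-consecutive-window debounce is shown below. It operates on the already averaged value (for example, ones-density per window); the class name and the levels used are illustrative assumptions.

```python
class HysteresisDebounce:
    """Hysteresis on the averaged value, then N consecutive agreeing windows to change output."""

    def __init__(self, trip_level, release_level, n_consecutive):
        if release_level >= trip_level:
            raise ValueError("release_level must be below trip_level")
        self.trip_level = trip_level
        self.release_level = release_level
        self.n_consecutive = n_consecutive
        self.raw = False   # hysteresis comparator state
        self.out = False   # debounced decision
        self._run = 0      # consecutive windows where raw disagrees with out

    def update(self, averaged_value):
        # Hysteresis: different levels for asserting and releasing.
        if self.raw and averaged_value <= self.release_level:
            self.raw = False
        elif not self.raw and averaged_value >= self.trip_level:
            self.raw = True
        # Debounce: output follows raw only after N consecutive windows.
        if self.raw != self.out:
            self._run += 1
            if self._run >= self.n_consecutive:
                self.out = self.raw
                self._run = 0
        else:
            self._run = 0
        return self.out

h = HysteresisDebounce(trip_level=0.60, release_level=0.50, n_consecutive=3)
print([h.update(v) for v in (0.61, 0.55, 0.62, 0.58)])
```

In this example the dip to 0.55 does not restart the count because it stays above the release level, which is exactly what hysteresis after averaging buys over a single threshold.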
Trip point shifts a lot with temperature. Check reference TC first or modulator offset/drift first?
Symptom
The measured trip point (V or A) moves significantly across temperature, exceeding the guardband planned for protection windows.
Quick checks (≤3)
  • Hold a stable input and log window ones-count vs temperature; if it drifts without input change, the chain is drifting.
  • Monitor the reference path (or a dedicated reference channel) vs temperature; correlated drift points to reference/PCB thermal gradient.
  • Compare two boards: identical slope suggests intrinsic TC; large unit-to-unit spread suggests layout/gradient or resistor network TC.
Likely causes (most common first)
  1. Reference TC / noise-to-threshold coupling: the reference sets the effective threshold scale and drifts directly into trip error.
  2. Divider network TC / gradient: resistor ratio drift moves the effective threshold, often unit dependent.
  3. Modulator offset/drift: device offset and drift shift the decision baseline, especially near rails.
Fix / guardband actions (with gates)
  • Prioritize reference + ratio stability: move sensitive references away from hot zones and control divider ratio TC.
  • Gate: temperature-induced trip-point shift must stay within the allocated budget (e.g., < 50% of guardband) across the specified temperature range.
  • Calibration rule: use minimal calibration only when coefficients remain stable; otherwise widen guardband and improve the physical causes first.
Default output state causes power-up false alarms. How to design power-on blanking and a “valid” window?
Symptom
On power-up or reset, fault latch asserts briefly even though the input is safe, then clears after firmware starts.
Quick checks (≤3)
  • Confirm isolator/modulator default output behavior (high/low/floating) during unpowered or clock-missing states.
  • Verify MCU/FPGA input pin biasing: floating pins can interpret default states as faults.
  • Check clock start-up: stream may be invalid until the clock is stable and the RX domain is ready.
Likely causes (most common first)
  1. Unarmed decision logic: decision runs before stream sanity and configuration are valid.
  2. Default output interpreted as fault: default high/low maps to the fault polarity.
  3. Asynchronous resets: RX logic resets earlier than HS stream, creating transient invalid windows.
Fix / guardband actions (with gates)
  • Define a “valid” condition: clock-present + edge-count above threshold + ones-density within sane bounds before arming trips.
  • Power-on blanking: ignore trips for a fixed window, then require consecutive valid windows before enabling latch.
  • Gate: zero power-up false alarms across ≥ 100 cold-start cycles and ≥ 100 warm resets (log each cycle).
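
The “valid” condition and power-on blanking can be combined in one arming gate that the trip logic consults before it is allowed to latch. The following is a sketch under assumed names and thresholds; it disarms on any invalid window, and a real design would likely also raise a separate link-fault indication in that case.

```python
class ArmingGate:
    """Arms trip logic only after blanking plus a run of consecutive valid windows."""

    def __init__(self, blanking_windows, valid_windows_required, min_edges, density_bounds):
        self.blanking_windows = blanking_windows
        self.valid_windows_required = valid_windows_required
        self.min_edges = min_edges
        self.density_bounds = density_bounds   # (low, high) ones-density limits
        self._windows_seen = 0
        self._valid_run = 0

    def update(self, clock_present, edge_count, ones_density):
        """Call once per decision window; returns True when trips may be armed."""
        self._windows_seen += 1
        low, high = self.density_bounds
        valid = bool(clock_present) and edge_count >= self.min_edges and low <= ones_density <= high
        if self._windows_seen <= self.blanking_windows or not valid:
            self._valid_run = 0                # still blanking, or stream not sane
        else:
            self._valid_run += 1
        return self._valid_run >= self.valid_windows_required

gate = ArmingGate(blanking_windows=2, valid_windows_required=3,
                  min_edges=100, density_bounds=(0.2, 0.8))
print([gate.update(True, 500, 0.5) for _ in range(5)])
```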
Field EMI triggers false alarms, but the lab “passes.” How to build a more reproducible injection test?
Symptom
In real installations, rare false trips occur; in the lab, ad-hoc tests do not reproduce the issue consistently.
Quick checks (≤3)
  • Record field conditions: switching dv/dt profile, cable routing, grounding, and event timestamps (needed for replay).
  • Instrument stream health: edge count, burst length, ones-density variance; store with timestamps.
  • Check probe discipline: use consistent probe points and minimal loop area to avoid “measurement-made glitches.”
Likely causes (most common first)
  1. Wrong injection model: lab tests inject voltage but not the same common-mode current path as the field.
  2. Insufficient statistics: passing a few minutes does not bound rare-event probability.
  3. Environment coupling differences: long cables and chassis bonds create coupling absent on a bench.
Fix / guardband actions (with gates)
  • Reproduce the coupling path: emulate field return paths, cable harness, and dv/dt edge rates; inject common-mode current, not only differential voltage.
  • Gate: use a dv/dt edge-count target (e.g., ≥ 1e6 edges) with false trips = 0 to establish an upper bound.
  • Gate: acceptance is based on logs (bit-health + decision statistics), not only on a “clean-looking waveform.”
Multi-channel systems share a clock or isolated supply and show “linked” false trips. How to quickly find the coupling path?
Symptom
A disturbance on one channel causes other channels to false-trip at the same time (or with a consistent small delay).
Quick checks (≤3)
  • Measure cross-correlation: do false trips align within the same decision window (shared cause) or not (local cause)?
  • Disable one shared resource at a time (clock, isolated VDD, shared return) and watch if coupling disappears.
  • Compare RX supply transients across channels; shared-supply coupling often shows identical droop signatures.
Likely causes (most common first)
  1. Shared isolated supply impedance: one channel’s transient pulls shared VDD, shifting others’ thresholds.
  2. Shared clock integrity: clock edges distort under dv/dt, injecting coherent errors into multiple streams.
  3. Shared return/ground bounce: common return inductance creates system-wide reference motion.
Fix / guardband actions (with gates)
  • Partition shared resources: per-channel decoupling at pins; isolate clock distribution; control return paths with star/segmented routing.
  • Gate: under worst-case dv/dt, RX VDD transients must remain within the per-channel budget and must not correlate with other channels’ trips.
  • Decision policy: consider channel-specific margins (k·σ) rather than one global tight threshold across all channels.
Production guardband is too tight and causes over-reject. How to set limits back into a manufacturable range using statistics?
Symptom
Yield collapses because trip-point error, latency, or false-trip screens reject too many units, even though field failures are not observed.
Quick checks (≤3)
  • Plot distributions per temperature point and per configuration: mean, σ, and tail behavior (not only max/min).
  • Separate station variation from unit variation: use reference/dummy channels and repeatability runs.
  • Check which bin dominates rejects (trip-point vs timing vs link health); adjust the dominating screen first.
Likely causes (most common first)
  1. Limits set from worst-case guesses: guardband is chosen without measured distributions, inflating reject rate.
  2. Test condition mismatch: production stimuli and dv/dt conditions differ from design assumptions.
  3. Unstable station uncertainty: measurement drift or fixture coupling is being counted as device variation.
Fix / guardband actions (with gates)
  • Set limits by tails: define acceptable tail risk (ppm) and choose k in mean ± k·σ per station and temperature.
  • Gate: minimum sample size per condition: N ≥ 30 units per lot per temperature point (more for low-ppm tail targets).
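  • Rule of thumb: start with k≈5 to k≈6. For an ideal Gaussian the one-sided tail beyond 5σ is about 0.3 ppm and beyond 6σ about 0.001 ppm; production distributions are rarely that clean, so the extra margin covers non-Gaussian tails (validate with actual tails, not only Gaussian assumptions).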
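
When choosing k, the ideal Gaussian tail is a useful reference point. The sketch below evaluates it with the complementary error function; it says nothing about real distributions, which typically have heavier tails than a Gaussian, so treat the result as the best case.

```python
import math

def gaussian_tail_ppm(k, two_sided=False):
    """Fraction of an ideal Gaussian beyond k sigma, in parts per million."""
    one_sided = 0.5 * math.erfc(k / math.sqrt(2.0))
    return (2.0 if two_sided else 1.0) * one_sided * 1e6

for k in (3.0, 4.0, 5.0, 6.0):
    print(f"k = {k:.0f}: one-sided tail = {gaussian_tail_ppm(k):.4g} ppm")
```

If measured reject rates at a given k are far above these figures, the distribution is not Gaussian in the tail and limits should be set from the empirical tail instead.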
How to implement link self-test to detect stuck-at, clock loss, and isolator soft errors?
Symptom
The system needs a built-in method to classify link failures (stuck-at, missing clock, sporadic soft errors) without external instruments.
Quick checks (≤3)
  • Edge-count monitor: count transitions in a fixed time; near-zero indicates clock loss or stuck-at.
  • Density sanity: window ones-count must stay within plausible bounds under known safe inputs.
  • Variance/entropy: ones-count variance across windows detects “frozen” behavior even if edges exist.
Likely causes (most common first)
  1. Clock missing or unstable: no toggles or irregular toggles make decisions meaningless.
  2. Stuck-at in isolation path: output is forced high/low by a fault or power domain issue.
  3. Soft error bursts: rare dv/dt-driven bursts cause transient misclassification unless detected and gated.
Fix / guardband actions (with gates)
  • Three self-test hooks: (1) edge-count threshold, (2) density sanity bounds, (3) variance/entropy threshold across windows.
  • Gate: declare “clock missing” if edge count stays below threshold for M consecutive windows (starting point: M=2–5 depending on safety needs).
  • Production hook: log self-test codes with timestamp + configuration (window/hysteresis/firmware) so field returns can be classified and binned.
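
The three self-test hooks can be sketched as one function over recent decision windows. The tuple layout, code strings, and thresholds are assumptions made for this example; in particular the variance floor has to be derived from healthy-unit data, because a quiet input legitimately produces a low ones-count variance.

```python
import statistics

def link_self_test(windows, min_edges, density_bounds, min_variance, m_consecutive):
    """windows: list of (edge_count, ones_count, window_bits), oldest first.
    Returns a list of diagnostic codes (empty list means the link looks healthy)."""
    codes = []
    # Hook 1: edge count below threshold for the last M consecutive windows.
    recent = windows[-m_consecutive:]
    if len(recent) == m_consecutive and all(edges < min_edges for edges, _, _ in recent):
        codes.append("CLOCK_MISSING_OR_STUCK")
    # Hook 2: ones-density must stay inside plausible bounds.
    low, high = density_bounds
    if any(not (low <= ones / bits <= high) for _, ones, bits in windows):
        codes.append("DENSITY_OUT_OF_BOUNDS")
    # Hook 3: near-zero variance across windows suggests a frozen stream.
    if len(windows) >= 2 and statistics.pvariance([ones for _, ones, _ in windows]) < min_variance:
        codes.append("FROZEN_STREAM")
    return codes

healthy = [(500, 512, 1024), (498, 520, 1024), (505, 505, 1024)]
stuck = [(0, 1024, 1024)] * 3
print(link_self_test(healthy, 100, (0.2, 0.8), 1.0, 2))
print(link_self_test(stuck, 100, (0.2, 0.8), 1.0, 2))
```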