Bypass & Redundant Channel for LED Driver Systems

Q: Relay chatters at low line—coil brownout or debounce window too short?

Relay chatter at low line is typically coil brownout behavior combined with insufficient dwell/holdoff timing. Check whether repeated state transitions occur without stable CF confirmation, and review coil clamp choice and release time alignment with Verify windows. First fix: add UVLO-like gating and minimum on/off dwell; validate with a supervisor such as TPS3839 and a relay family such as Omron G5Q.

Q: We bypass correctly, but brightness steps visibly—transfer timing or current-loop settle?

Brightness steps after bypass usually come from transfer/verify timing that perturbs the current loop during settling. Check ILED settling and ripple aligned to BYPASS_CONFIRMED, and confirm Verify starts after actuator stabilization. First fix: add holdoff and soft-verify timing so the loop settles before declaring stable; enforce max transfers/hour.

Q: Voting disagrees intermittently—sensor drift or shared reference causing common-mode?

Intermittent vote disagreement is often due to common-mode coupling (shared reference/ground/ADC) rather than true drift. Check whether disagreements cluster during surge/transfer edges versus evolving with temperature/time, and verify independence of references and returns. First fix: separate references or stagger sampling; use robust isolation for cross-domain signals (e.g., ISO7721) and add a proof-test stimulus to validate channel independence.

Q: Contact is welded but system doesn’t detect it—missing Vdrop test or wrong injection point?

Welded contacts go undetected when open-state evidence is weak or the test stimulus does not traverse the suspect element. Check that Vdrop is measured (Kelvin) when open is commanded and/or that a bounded test-current injection produces an unambiguous signature. First fix: add a proof-test step that measures Vdrop/Vds under a known small stimulus and logs PROOF_TEST_PASS/FAIL; a telemetry-capable stimulus/protection reference is TPS25982 or TPS25940.

Q: False ‘open-string’ faults after maintenance—connector intermittency or sense wiring routing?

Post-maintenance open-string faults are usually intermittent connectors or disturbed sense wiring that turns noise into dropout signatures. Check dropout_cnt patterns for bursty behavior and validate sense routing integrity around high di/dt paths. First fix: increase confirmation robustness (Tconfirm + I/V correlation) and rework Kelvin sense routing; store key events in robust NVM such as MB85RS64V.

Q: Event logs show ‘bypass confirmed’ but field tech sees no bypass—feedback signal integrity or definition mismatch?

A ‘confirmed’ bypass without real bypass usually means the confirmation criteria are too weak, or the feedback definition or integrity is wrong across isolation boundaries. Check whether BYPASS_CONFIRMED requires both CF and conduction evidence (Vdrop/Vds + current recovery), and validate feedback health for stuck-at or inversion. First fix: require two independent proofs for confirmation and add periodic proof-tests; use a robust isolator such as ISO7721 for cross-domain feedback.

Q: Power-loss during fault causes corrupted history—log atomicity or monotonic counter handling?

Corrupted history under power loss is primarily an atomicity/commit problem and secondarily a monotonic continuity problem. Check commit markers and CRC so partial writes are always detectable, and verify monotonic_ctr does not roll back across reboot. First fix: implement two-phase commit (write→CRC→commit marker) and store critical counters in robust NVM such as MB85RS64V; use a stable RTC such as DS3231 for time provenance.

Q: System recovers too aggressively and oscillates—retry policy or thermal cooldown missing?

Oscillation is usually a retry/backoff policy failure—no cooldown, weak rate limit, or too-quick verify—not a redundancy concept problem. Check for repeated Transfer→Verify→Retry loops within short intervals and correlate with thermal evidence trends. First fix: add exponential backoff and thermal cooldown gates and enforce max transfers/hour with service-required latch after repeated failures.
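As a rough illustration of the backoff-plus-cooldown fix, the sketch below gates each retry on three conditions: a bounded attempt count, a thermal cooldown threshold, and an exponentially growing wait. The function names, the default base delay, the cap, and the 70 °C cooldown threshold are all assumptions chosen for the example, not values from any specific design.

```python
def retry_delay_s(attempt: int, base_s: float = 1.0,
                  factor: float = 2.0, cap_s: float = 600.0) -> float:
    """Exponential backoff: base_s * factor**attempt, limited to cap_s."""
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    return min(cap_s, base_s * factor ** attempt)


def retry_decision(attempt: int, elapsed_s: float, t_hot_c: float,
                   max_attempts: int = 5, t_cool_c: float = 70.0) -> str:
    """Return 'RETRY', 'WAIT', or 'SERVICE_REQUIRED' (hypothetical labels)."""
    # Repeated failures latch into a service-required outcome.
    if attempt >= max_attempts:
        return "SERVICE_REQUIRED"
    # Thermal cooldown gate: no retry while the hotspot is above the threshold.
    if t_hot_c > t_cool_c:
        return "WAIT"
    # Backoff gate: wait until the delay for this attempt has elapsed.
    if elapsed_s < retry_delay_s(attempt):
        return "WAIT"
    return "RETRY"
```

With the defaults shown, the third retry (attempt index 2) is held for 4 s, and any attempt index of 5 or more returns the service-required outcome regardless of temperature.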


Bypass and redundant channel design in LED drivers is not about adding extra hardware—it is about making every transfer evidence-driven, state-controlled, and auditable. A reliable system proves why it switched, how it switched, and that the mechanism still works, through measurable signals, structured voting logic, and verifiable event logs.

What “Bypass / Redundant Channel” Means in LED Driver Systems

A bypass design removes a failed element from the series energy path so the luminaire can keep operating in a controlled, traceable way. A redundant channel design switches the load to a separate healthy path (or ORs two paths) when the primary path becomes untrustworthy.

Series-path bypass · Parallel redundancy · Transfer / switchover · Fail-safe state · Evidence-driven decision

This topic stays focused on controlled power-path continuity under faults: (1) what exactly is bypassed or replaced, (2) what evidence proves a fault is real, and (3) what the system must do when evidence is incomplete. Interface protocols and converter topology details are treated only as boundary conditions, not as the subject of this page.

Two patterns that must not be mixed:

Auto-bypass (series-path): a bypass element bridges a failed point (e.g., an open segment), keeping current flowing through a defined safe path. The bypass element becomes part of the safety case and must be diagnosable.
Redundant channel (parallel): the load is transferred to a separate channel (Path-A → Path-B) using an ORing or transfer element. The system must prove the inactive path is not silently failed (latent fault).

Where it appears in lighting (as design triggers, not product categories):

Availability-critical installations (e.g., roadway/tunnel): “no full blackout” is the top requirement; degradation is allowed if auditable.
High maintenance-cost sites (industrial/high-bay): automatic transfer reduces downtime while logs guide service action.
Multi-segment / multi-string luminaires: isolating or bypassing a failing segment avoids cascading failure across the entire lamp.
Safety-dominant emitters (concept-level only): when hazard is high, the default state is conservative even if availability suffers.

Figure F1 — Concept boundary: series-path auto-bypass versus parallel redundant channel, plus the three-state safety model.

Scope boundary: this page focuses on controlled power-path continuity, health evidence, fail-safe policy, and auditable behavior. Interface protocols, converter derivations, and full EMC filter design are intentionally out of scope.

System Requirements & Failure Philosophy

“Reliability” becomes actionable only when it is expressed as testable requirement fields. A bypass/redundant design must be specified in terms of what is allowed to happen during faults, what must never happen, and what evidence is required before switching states.

Requirement fields to define before circuit choices:

Availability target: uptime objective and what “degraded operation” means (e.g., reduced current, reduced segments, limited runtime).
Max blackout time: the maximum allowable interruption during bypass/transfer (ms-level target for critical installations).
Switchover stress limit: maximum inrush/overshoot during transfer and the acceptable settling window.
Output accuracy bound: maximum allowed current error after transfer (steady-state and transient).
Recovery policy: automatic recovery vs service-only recovery; retry rate limit and latch conditions.
Auditability: what events must be recorded, how long logs must persist, and what minimum fields are required per event.
Proof-test interval: how often the bypass/transfer mechanism must be verified to prevent latent faults.

Fail-safe default must be explicitly chosen for the “uncertain evidence” condition. In other words, when health signals disagree, or when a suspected fault cannot be confirmed, the design must decide whether the safe action is OFF or BYPASS / TRANSFER. This decision depends on the hazard class: availability-critical installations may prioritize continuity, while safety-dominant emitters require conservative shutdown unless sufficient evidence supports continued operation.

Latent-fault tolerance is mandatory: the bypass mechanism, the transfer element, and the health monitor can fail silently. Requirements must specify (1) what constitutes a latent fault, (2) how it will be detected (diagnostic evidence), and (3) how the detection will be proven later (audit records). Without this, redundancy can create a false sense of safety.

Copyable requirement checklist (fill-in):

MTBF goal: ____
Max blackout time: ____ ms
Max inrush during switchover: ____ A (____ µs window)
Allowed current error after transfer: ____ % (transient) / ____ % (steady)
Max transfers per hour: ____
Default safe state when evidence is insufficient: OFF / BYPASS (choose one + rationale)
Log retention: ____ events or ____ days
Proof-test interval: ____ months

Figure F2 — Requirements field map: availability, switchover stress, fail-safe policy, and auditability converge into acceptance criteria.

Reference Architecture: Dual-Path Power + Dual-Sense + Control Plane

A redundant lighting channel design stays robust only when the architecture is expressed as three layers: power paths (what carries energy), evidence inputs (what proves health), and a control plane (what decides, supervises, and records). This reference architecture is reused across later chapters (voting, state machine, diagnostics coverage, audit logs, and verification).

Power paths: Path-A / Path-B · ORing / Transfer element · Bypass element · Evidence inputs: I / V / T · Contact feedback · Supply-good · CTRL + Watchdog + Event Log

Canonical blocks (kept intentionally protocol- and topology-agnostic)

Power paths: Path-A and Path-B carry energy to the same LED load. A transfer element enforces mutual exclusivity (one path at a time), while an ORing element supports parallel tolerance (both may conduct depending on conditions).
Bypass element: a dedicated actuator that bridges or isolates a failed point/segment under controlled policy. It is treated as a safety-relevant component that requires diagnosability and proof testing.
Sensing inputs (evidence only): current (I), voltage (V), temperature (T), contact feedback (CF), supply-good (PG), and isolation/ground-leak status (ISO) are modeled as inputs to the decision logic. Subsystem implementation stays out of scope here.
Control plane: a controller (concept-level) runs health evaluation and state transitions; a watchdog enforces a known-safe behavior during control failure; and a nonvolatile event log preserves evidence for later audit and service diagnostics.

Design rule: switching or bypass actions require at least two independent evidence signals (e.g., I + V, or Vdrop + CF) to avoid single-sensor false positives and common-mode failures.
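The design rule can be sketched as a small gate function. This is a minimal illustration only: the domain grouping in `DOMAIN` is an assumption (signals placed in the same domain are treated as sharing failure causes), and the signal names simply follow the abbreviations used on this page.

```python
from typing import Mapping

# Hypothetical grouping of evidence signals by measurement domain.
# Context-only signals such as PG are deliberately absent: they gate
# decisions but do not count as fault evidence.
DOMAIN = {
    "I": "current",
    "V": "voltage",
    "Vdrop": "actuator_voltage",
    "CF": "contact_feedback",
    "T": "thermal",
}


def action_permitted(evidence: Mapping[str, bool],
                     min_independent: int = 2) -> bool:
    """True only if enough independent domains assert a fault."""
    asserting = {
        DOMAIN[name]
        for name, faulty in evidence.items()
        if faulty and name in DOMAIN
    }
    return len(asserting) >= min_independent
```

A single asserting signal never permits an action under this sketch, and a context flag cannot substitute for a second evidence domain.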

Figure F3 — Reusable reference architecture: Path-A/Path-B, ORing/transfer, bypass actuator, evidence inputs, and a control plane with watchdog and event log.

Bypass Element Choices: Relay, MOSFET, SSR, eFuse — Tradeoffs That Matter

Bypass and transfer actuators must be selected under lighting-specific constraints such as surge exposure, long cable transients, thermal headroom, and creepage/clearance boundaries. The correct choice is not determined by a single “rated current” number; it is determined by whether the actuator remains controllable, diagnosable, and stable across fault and switchover conditions.

Mechanical relay

Strengths: very low conduction loss and strong surge tolerance. Risks: contact wear/weld, bounce, and coil-driven noise. Diagnostics must consider “commanded state” vs “actual contact state.”

Back-to-back MOSFET

Strengths: fast switching, no mechanical wear, and precise control. Risks: SOA and surge stress, plus gate-drive failure modes. Diagnostics rely on Vds + current + temperature evidence.

Solid-state relay (SSR)

Strengths: simple control and mechanical robustness. Risks: leakage in OFF state and limited surge capability. Leakage must be accounted for to avoid unintended residual current.

eFuse / hot-swap switch

Strengths: protection + telemetry in one device, often with fast fault response. Risks: conduction loss and transient behavior that can look like faults if thresholds and timing are not aligned with the system policy.

Decision fields (keep as bullet checks, not a giant table)

Peak surge current / energy: whether the actuator survives lightning/surge events without drifting into latent damage.
Steady loss (thermal): heat dissipation margin under sealed housings and high ambient temperature.
Isolation requirement: whether a physical open is required, and how creepage/clearance constraints shape component choice.
Diagnostic observability: whether welded-on, stuck-off, or half-on states can be proven by evidence (Vdrop/Vds, current, temperature, feedback).
Lifetime model: contact cycles, thermal cycling, and surge count endurance—not just steady current rating.
Response time: ability to meet max blackout time and settling windows defined in requirements.
Leakage: OFF-state leakage impact on residual current and unintended illumination behavior.

Figure F4 — Relay vs back-to-back MOSFET: compare current paths and the minimum sensing points needed to diagnose welded/stuck/half-on behaviors.

Selection principle: choose the actuator that can meet surge and thermal constraints and provide auditable evidence of its true state (not just its command state), aligned with the max blackout time and fail-safe policy.

Channel-Health Monitoring: What to Measure and What It Proves

A channel-health monitor must be evidence-driven: actions (bypass/transfer/fail-safe) should follow only from signal changes that prove a fault class and exclude common false positives such as brownout and transient coupling. Evidence is grouped into four domains: LED path, power path context, thermal, and actuator truth.

LED path: ILED / Vstring · Events: Ripple / Dropout · Context: PG / BusSag / Brownout · Thermal: ΔT / dT/dt · Actuator: CF / Vdrop / Gate

Core measurements (and the fault classes they support)

LED path evidence: ILED waveform and dropout events indicate whether energy is truly delivered. Vstring/headroom helps separate “open-like” behavior from “supply margin” issues. Ripple and dropout bursts are strong indicators for intermittent connectors and segment discontinuities.
Power-path context (flag level): input brownout margin and intermediate bus sag explain whether LED-path anomalies may be caused by upstream instability. Switch-node abnormality flags (not detailed waveforms) can annotate abnormal converter states without drifting into topology discussions.
Thermal evidence: hotspot-to-ambient ΔT and runaway signatures (dT/dt) distinguish overload and thermal coupling failures from benign ambient changes. Rate-of-rise is typically a stronger indicator than absolute temperature alone.
Bypass/transfer actuator truth: relay coil drive evidence, contact feedback (CF), MOSFET gate status, and Vdrop/Vds across the element validate “commanded state vs actual conduction,” enabling detection of welded-on, stuck-off, and half-on states.

Evidence rule: do not trigger bypass/transfer from a single sensor. Require a minimal evidence set (typically 2–3 signals) per fault type, and treat context-only signals (PG/BusSag) as gating evidence to avoid brownout-driven false actions.

Fault types (defined by minimal evidence sets)

Open LED string: ILED drops or becomes discontinuous and Vstring/headroom rises toward limit; dropout events increase.
Short / bypassed segment: Vstring collapses and ILED deviates (over/limited); context flags may indicate stress.
Current drift: ILED offset persists with supportive thermal/voltage evidence; distinguish aging vs transient conditions.
Intermittent connector: bursty dropout events with correlated Vstring jumps; avoid confusing with intended PWM dimming modes.
Thermal overload: ΔT or dT/dt indicates runaway; current limiting may appear as a secondary signature.
Actuator stuck/welded: commanded OFF yet Vdrop/current indicates conduction; CF mismatches command.
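One way to encode minimal evidence sets is a classifier over already-thresholded flags. The sketch below covers three of the fault types above; the flag names, the `SUSPECT`, `CONTEXT_GATED`, and `NO_FAULT` labels, and the rule ordering are assumptions made for illustration, and a real design would derive each flag from its own threshold and window.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flags:
    """Thresholded evidence flags (hypothetical names)."""
    iled_low: bool = False
    iled_deviates: bool = False
    vstring_high: bool = False
    vstring_low: bool = False
    vstring_jumps: bool = False
    dropout_burst: bool = False
    pg_ok: bool = True
    bus_sag: bool = False


def classify(f: Flags) -> str:
    # Context gating: during brownout or bus sag, no fault class is committed.
    if not f.pg_ok or f.bus_sag:
        return "CONTEXT_GATED"
    # Each fault class needs at least two supporting signals.
    if f.iled_low and f.vstring_high:
        return "OPEN_STRING"
    if f.dropout_burst and f.vstring_jumps:
        return "INTERMITTENT_CONNECTOR"
    if f.vstring_low and f.iled_deviates:
        return "SHORT_SEGMENT"
    # A lone signal is only a suspicion, never a confirmed class.
    lone = (f.iled_low, f.iled_deviates, f.vstring_high,
            f.vstring_low, f.vstring_jumps, f.dropout_burst)
    return "SUSPECT" if any(lone) else "NO_FAULT"
```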

Figure F5 — Compact fault signature matrix: map fault classes to minimal evidence signals to reduce false trips and improve diagnosability.

Voting Logic & Redundancy Patterns: 1oo2, 2oo2, 2oo3 (Practical View)

Voting logic is valuable only when it is framed in engineering terms: what it saves (missed faults, unsafe continuation) versus what it costs (false trips, hardware complexity, validation burden). Voting should operate on independent evidence channels rather than on duplicated signals that share the same failure causes.

1oo2 (one-out-of-two)

High availability: a single channel can trigger bypass/transfer. Cost: higher false-trip risk if one evidence chain is noisy or drifting. Requires strict retry limits and strong audit logging.

2oo2 (two-out-of-two)

High safety: action occurs only when both channels agree. Cost: reduced availability and slower response when one chain is degraded. Suits cases where false actions are more harmful than brief loss of availability.

2oo3 (two-out-of-three)

Balanced behavior: tolerates one bad chain while avoiding single-chain false trips. Cost: more sensors, logic, and validation effort. Best when both availability and safety are important and the system can afford complexity.
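The three patterns reduce to an M-out-of-N count over per-channel fault assertions. The sketch below is illustrative; the function names are hypothetical, and each input is assumed to be the already-evaluated verdict of one independent evidence channel.

```python
from typing import Sequence

# (M, N): at least M of exactly N channels must assert the fault.
VOTE_MODES = {"1oo2": (1, 2), "2oo2": (2, 2), "2oo3": (2, 3)}


def vote(channels: Sequence[bool], mode: str) -> bool:
    """Return True if the voting pattern calls for action."""
    if mode not in VOTE_MODES:
        raise ValueError(f"unknown vote mode: {mode}")
    m, n = VOTE_MODES[mode]
    if len(channels) != n:
        raise ValueError(f"{mode} expects {n} channels, got {len(channels)}")
    return sum(bool(c) for c in channels) >= m


def channels_disagree(channels: Sequence[bool]) -> bool:
    """True when the channels are not unanimous (a VOTE_DISAGREE candidate)."""
    return len({bool(c) for c in channels}) > 1
```

Note that 2oo3 can return an action verdict while `channels_disagree` is also true; in that case the dissenting chain is worth logging even though the vote itself succeeded.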

Handling disagreement (engineer-friendly policy)

Enter degraded mode: limit actions (e.g., reduce current, lock out repeated transfers) while collecting more evidence.
Increase sampling confidence: extend observation window, raise sampling rate, or require repeated consistent signatures before action.
Request service / proof-test: when evidence remains inconsistent, record a service-needed event and schedule a proof test of the actuator.
Escalate to fail-safe OFF: when hazard is high or evidence cannot be trusted, prioritize safe state over availability.

Independence rule: voting inputs should avoid common-mode failures (shared ADC reference, shared ground path, shared firmware bug). Prefer mixed-domain evidence (e.g., I + Vdrop + CF) over duplicated measurements with shared error sources.

Figure F6 — Practical voting block: independent evidence channels feed a voting decision, with common-mode pitfalls explicitly called out.

Switchover State Machine: Debounce, Transfer, Re-try, Latch Policies

Redundancy becomes stable only when it is governed by an explicit state machine. Without timers, rate limits, and verification, “smart switchover” can degrade into repeated transfers that look like random flicker. The state machine below separates transient filtering from confirmed faults, enforces cooldown to prevent oscillation, and records audit logs at each decision point.

Tconfirm (debounce) · Tretry (backoff) · Tholdoff (cooldown) · Rate limit: N/hour · Verify window · Latch vs Auto-recover

State intent (why each state exists)

Normal: establish baseline statistics (dropout counts, temperature trend, bus margin) and prevent unnecessary actions.
Suspect: apply debounce to exclude brief transients; require minimal multi-signal evidence before escalation.
Confirmed fault: commit a fault class decision using evidence sets; decide whether transfer/bypass is permitted for the hazard class.
Transfer / bypass: execute the action under rate limits and safe timing rules; immediately transition into verification.
Verify: confirm that the new path carries the load and the old path is truly isolated; detect actuator failure modes early.
Degraded run: stabilize operation when evidence is inconsistent or capacity is reduced; restrict further transfers and collect more data.
Service required: request proof-test/maintenance when repeated attempts exhaust the retry budget, a rate limit is reached, or ambiguity remains unsafe.
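The states above can be captured as an explicit transition table, which makes illegal transitions easy to spot in review. This is a sketch under assumptions: the event names are hypothetical labels for the conditions described in the text, and the choice that a passed verification lands in Degraded run reflects one possible policy.

```python
from enum import Enum


class State(Enum):
    NORMAL = "Normal"
    SUSPECT = "Suspect"
    CONFIRMED = "Confirmed fault"
    TRANSFER = "Transfer / bypass"
    VERIFY = "Verify"
    DEGRADED = "Degraded run"
    SERVICE = "Service required"


# Hypothetical event names; only the listed pairs cause a state change.
TRANSITIONS = {
    (State.NORMAL, "anomaly"): State.SUSPECT,
    (State.SUSPECT, "cleared"): State.NORMAL,
    (State.SUSPECT, "tconfirm_elapsed"): State.CONFIRMED,
    (State.CONFIRMED, "transfer_permitted"): State.TRANSFER,
    (State.CONFIRMED, "transfer_denied"): State.SERVICE,
    (State.TRANSFER, "actuated"): State.VERIFY,
    (State.VERIFY, "verify_pass"): State.DEGRADED,
    (State.VERIFY, "verify_fail"): State.SERVICE,
    (State.DEGRADED, "limit_reached"): State.SERVICE,
}


def step(state: State, event: str) -> State:
    """Apply one event; unknown (state, event) pairs leave the state unchanged."""
    return TRANSITIONS.get((state, event), state)


def run(events, start: State = State.NORMAL):
    """Return the sequence of states visited, starting from `start`."""
    trace = [start]
    for ev in events:
        nxt = step(trace[-1], ev)
        if nxt is not trace[-1]:
            trace.append(nxt)
    return trace
```

Each appended entry in the trace corresponds to a point where an audit record would be written.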

Key mechanisms (rules that prevent flicker-like oscillation)

Debounce windows: Suspect must persist for Tconfirm with consistent evidence (e.g., I+V or dropout counts) before confirming.
Rate limits: enforce max transfers per hour and a minimum dwell time on each path to protect actuators and reduce visible disturbance.
Cooldown / thermal settle: after transfer, hold actions for Tholdoff to allow electrical and thermal stabilization before any re-try.
Latch vs auto-recover: lock out automatic recovery when hazard class requires deterministic behavior; allow auto-recover only with bounded retries and logging.
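The rate-limit, dwell, and holdoff rules can be combined in one small guard object that the state machine consults before every transfer. The class below is a minimal sketch: the class name and parameters are hypothetical, and time is passed in explicitly (seconds from any monotonic source) so the logic stays testable.

```python
from collections import deque


class TransferLimiter:
    """Enforces max transfers per hour plus a minimum gap between transfers."""

    def __init__(self, max_per_hour: int, min_dwell_s: float, holdoff_s: float):
        self.max_per_hour = max_per_hour
        # The effective gap is the longer of dwell time and Tholdoff.
        self.min_gap_s = max(min_dwell_s, holdoff_s)
        self._times = deque()  # timestamps of transfers in the last hour

    def allow(self, now: float) -> bool:
        # Forget transfers that are at least one hour old.
        while self._times and now - self._times[0] >= 3600.0:
            self._times.popleft()
        # Dwell / holdoff gate relative to the most recent transfer.
        if self._times and now - self._times[-1] < self.min_gap_s:
            return False
        # Hourly budget gate.
        return len(self._times) < self.max_per_hour

    def record(self, now: float) -> None:
        """Call once for every transfer that was actually executed."""
        self._times.append(now)
```

When `allow` returns False because the hourly budget is exhausted, a typical policy would escalate to the service-required state rather than silently waiting.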

Audit requirement: log on entry to Suspect, on fault confirmation, on every transfer attempt, and on verify outcome. Include evidence snapshots (I/V/T/CF/Vdrop + context flags) so that false positives and actuator faults can be proven after the fact.

Figure F7 — State diagram with Tconfirm (debounce), Tretry (backoff), Tholdoff (cooldown), rate limits, and explicit logging points to keep redundancy stable and auditable.

Diagnostics Coverage: Detecting Welded Contacts, Stuck MOSFETs, and False Positives

Redundancy is auditable only when diagnostic coverage is explicit: welded contacts, stuck switches, and false positives must be detectable using repeatable tests and multi-signal correlation. Coverage improves when tests are executed inside a defined safe window (stable supply, not during transfer, energy-limited) and when measurements compare commanded states against measured conduction evidence.

Relay weld detection

When commanded OPEN, verify isolation by measuring Vdrop across contacts (and/or an energy-limited test stimulus where safe). A mismatch between command and conduction evidence indicates welded or stuck behavior.

MOSFET stuck-on detection

When commanded OFF, compare gate status against Vds and path current evidence. OFF command with low Vds or sustained current indicates stuck-on or shorted conduction.

Stuck-off detection

When commanded ON, verify that current rises and Vds/Vdrop falls. If no current flows, check the context signals (PG/BusSag) first and cross-check the alternate path, so that a supply collapse is not mistaken for an actuator failure.

False-positive suppression

Correlate evidence across domains (I + V + T, or I + CF + Vdrop) instead of single thresholds. Use context flags as gating signals to prevent brownout-driven false trips.

Built-in self-test (BIST) principles (safety-first)

Safe window: run BIST only when supply-good is stable, not during transfer, and with an energy-limited stimulus.
Expected reading: define what must change (and what must not) for each test so that coverage is verifiable.
Independence: do not rely on a single measurement chain; compare command vs independent conduction evidence.

Coverage statement: welded-on / stuck-on / stuck-off can be proven when BIST stimuli are available and command states are compared against Vdrop/Vds and path-current evidence under stable context (PG/BusSag gated).
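The command-versus-conduction comparison can be sketched as a single function. The thresholds below (0.2 V and 50 mA) are placeholder assumptions, not recommended values, and the result labels are hypothetical; the sketch also assumes that a stimulus or load is present so that an open element shows a measurable voltage across it.

```python
def diagnose(commanded_on: bool, vdrop_v: float, current_a: float,
             context_ok: bool,
             vdrop_on_max_v: float = 0.2,
             current_min_a: float = 0.05) -> str:
    """Compare the commanded state against measured conduction evidence."""
    # Context gating: without stable PG/BusSag the reading proves nothing.
    if not context_ok:
        return "INCONCLUSIVE"
    low_drop = vdrop_v <= vdrop_on_max_v
    has_current = current_a >= current_min_a
    # Current flowing across a large drop suggests a partially enhanced element.
    if has_current and not low_drop:
        return "HALF_ON_SUSPECT"
    # Low drop without current: no stimulus or no load, so nothing is proven.
    if low_drop and not has_current:
        return "INCONCLUSIVE"
    conducting = low_drop and has_current
    if commanded_on and not conducting:
        return "STUCK_OFF"
    if not commanded_on and conducting:
        return "STUCK_ON"  # welded contact or shorted MOSFET
    return "OK"
```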

Figure F8 — BIST injection points and expected readings: prove welded/stuck conditions by comparing commanded states against conduction evidence (Vdrop/Vds and path current) under a safe window.

Audit Logs & Evidence: What to Record So Failures Can Be Proven

Bypass and redundancy become “auditable” only when event logs preserve a complete evidence chain: what was measured, how the decision was made, and what action was executed. An evidentiary log is not a narrative; it is a set of records that can survive field disputes by showing the exact evidence snapshot, thresholds, firmware identity, and sequence ordering.

FAULT_DETECTED · VOTE_DISAGREE · BYPASS_COMMAND · BYPASS_CONFIRMED · RECOVERY_ATTEMPT · LATCHED_OFF · PROOF_TEST_PASS · PROOF_TEST_FAIL

Fields per event (what makes the record evidentiary)

Identity & ordering: event_type, channel_id, record_id, monotonic_ctr (no rollback), and record_crc/commit marker.
Time provenance: timestamp plus timestamp source (RTC / network / relative). If time is uncertain, the source must say so.
Decision reproducibility: reason_code, vote_mode, state_from→state_to, and the exact threshold set used.
Evidence snapshot: raw or reduced measurements (I/V/T/CF/Vdrop + context flags such as PG/BusSag) captured at the decision point.
Software traceability: firmware version plus config hash (or equivalent) so the active policy cannot be disputed later.

Retention concept: use a ring buffer with wear leveling. Preserve critical events (LATCHED_OFF, PROOF_TEST_FAIL, repeated BYPASS_CONFIRMED) longer than routine health summaries. Include a commit marker so partial writes after brownout are detectable.
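A commit marker plus CRC can be illustrated with a simple byte layout: header, payload, CRC-32, then a one-byte marker that is written last. The layout, the marker value 0xA5, and the function names are assumptions made for this sketch; a real design would match them to the chosen NVM and its erase state.

```python
import struct
import zlib

COMMIT_MARKER = 0xA5             # hypothetical value; must differ from erased NVM
HEADER = struct.Struct("<IH")    # monotonic_ctr (u32), payload length (u16)
TRAILER_LEN = 5                  # CRC-32 (4 bytes) + commit marker (1 byte)


def encode_record(ctr: int, payload: bytes) -> bytes:
    """Build the on-media image. A writer should store the marker byte last."""
    body = HEADER.pack(ctr, len(payload)) + payload
    crc = zlib.crc32(body)
    return body + struct.pack("<I", crc) + bytes([COMMIT_MARKER])


def decode_record(raw: bytes):
    """Return (ctr, payload), or None for a partial or corrupted record."""
    if len(raw) < HEADER.size + TRAILER_LEN or raw[-1] != COMMIT_MARKER:
        return None  # marker never written: the write was interrupted
    body = raw[:-TRAILER_LEN]
    (stored_crc,) = struct.unpack("<I", raw[-TRAILER_LEN:-1])
    if zlib.crc32(body) != stored_crc:
        return None
    ctr, length = HEADER.unpack_from(body)
    if length != len(body) - HEADER.size:
        return None
    return ctr, body[HEADER.size:]
```

Because the marker is the final byte, a power loss at any earlier point leaves a record that `decode_record` rejects instead of one that looks plausible but is incomplete.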

Copy-ready mini-template (compact event record schema)


    event_type: BYPASS_CONFIRMED
    ts: 2026-02-12T00:00:00Z
    ts_src: RTC
    monotonic_ctr: 0001234567
    channel_id: PATH_A
    state: TRANSFER->VERIFY
    reason_code: OPEN_STRING
    vote_mode: 2oo3
    snapshot: {I_led_avg, V_string, dropout_cnt, PG, BusSag, T_hot, CF, V_drop}
    thr_set: {Tconfirm, Tretry, Tholdoff, I_min, V_max, T_max, N_drop}
    fw_version: vX.Y.Z
    config_hash: 0x________
    record_crc: 0x________

Minimum evidence set (by event type)

FAULT_DETECTED: I/V/dropout + PG/BusSag + thr_set + state
VOTE_DISAGREE: per-channel inputs summary + vote_mode + reason_code
BYPASS_COMMAND: command parameters + rate-limit counters + state
BYPASS_CONFIRMED: CF + Vdrop/Vds + path-current evidence + verify window result
LATCHED_OFF: hazard class + retry counters + final evidence snapshot
PROOF_TEST_PASS/FAIL: stimulus_id + expected reading + measured summary

Figure F9 — Event log pipeline: sensor snapshot and context gating feed vote/state decisions, which build evidentiary records into an NVM ring buffer for service export.

Hardware Implementation Notes: Relay Drive, Isolation Boundaries, and Noise Immunity

The implementation details that matter most for bypass and redundancy are the ones that directly change switchover stability, verification reliability, and diagnostic evidence quality. This section focuses on coil-drive behavior, brownout chatter risks, isolation-aware feedback design, and measurement integrity around high di/dt bypass paths.

Relay coil drive: clamp choice changes release time and noise

Diode clamp: lower EMI but slower release; increases transfer overlap risk and extends verification windows.
Zener/TVS clamp: faster release but higher dv/dt; can inject noise into feedback and measurement paths.
RC clamp: balanced behavior but parameter-sensitive; requires validation across tolerance and temperature.

Coil brownout behavior: prevent chatter

Chatter risk: brownout can place coil current near the pickup/hold boundary, causing repeated contact toggling.
Mitigation concept: apply UVLO-like gating for coil drive, minimum on/off dwell time, and confirm state using CF/Vdrop evidence.
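The gating idea can be sketched as a hysteresis comparator with a minimum dwell between state changes. The class name, thresholds, and the decision to apply the same dwell to both directions are assumptions for illustration; in particular, whether a disable may be delayed during a real undervoltage event is a policy choice that needs to be checked against the relay's must-release behavior.

```python
class CoilGate:
    """UVLO-like enable for relay coil drive with hysteresis and dwell."""

    def __init__(self, v_on: float, v_off: float, min_dwell_s: float):
        if v_on <= v_off:
            raise ValueError("v_on must be above v_off to create hysteresis")
        self.v_on = v_on
        self.v_off = v_off
        self.min_dwell_s = min_dwell_s
        self.enabled = False
        self._last_change = None  # time of the most recent state change

    def update(self, v_supply: float, now: float) -> bool:
        """Feed one supply sample; returns whether coil drive is enabled."""
        want = self.enabled
        if self.enabled and v_supply < self.v_off:
            want = False
        elif not self.enabled and v_supply > self.v_on:
            want = True
        # Inside the hysteresis band `want` equals the current state.
        if want != self.enabled:
            dwell_ok = (self._last_change is None
                        or now - self._last_change >= self.min_dwell_s)
            if dwell_ok:
                self.enabled = want
                self._last_change = now
        return self.enabled
```

A supply that hovers between `v_off` and `v_on` never toggles the output, which is the property that removes chatter near the pickup/hold boundary.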

Isolation boundaries: keep cross-domain feedback simple and robust

Crossing signals: CF/state feedback crossing isolation should be low-complexity and noise-robust.
Evidence integrity: loss/stuck conditions on feedback signals should map to explicit reason codes and log events.

High di/dt bypass paths: measure for evidence, not just precision

Kelvin sense: Vdrop/Vds evidence should be sensed with Kelvin routing to avoid inductive and ground-bounce corruption.
Sampling windows: avoid measuring during transfer edges; align measurement windows with Verify/Tholdoff policy.

Figure F10 — Coil clamp options (diode vs Zener/TVS vs RC): release time and noise tradeoffs directly affect switchover timing, verification reliability, and diagnostic evidence quality.

Verification & Proof Testing: How You Validate It Won’t Fail Silently

Redundancy that “works in the lab” can still fail silently in the field when actuators weld, MOSFETs stick, logs corrupt during brownout, or surge/ESD causes false bypass/latch events. A verification plan must therefore test behavior (state transitions), evidence (I/V/T/CF/Vdrop consistency), and auditability (event sequence + monotonic counters) as a single system.

Functional faults · Transfer endurance · False-bypass / false-latch · Log integrity · Service proof-test

Test matrix (what to test, what evidence must appear, what to log)

1) Functional — forced open/short

Stimulus: emulate open LED string / short / drift.
Expected behavior: Normal → Suspect → Confirmed → Transfer/Bypass → Verify → Degraded or Service-required (policy-dependent).
Expected evidence: ILED dropouts + Vstring/headroom change; actuator evidence (CF + Vdrop/Vds) confirms the action.
Required logs: FAULT_DETECTED → (optional VOTE_DISAGREE) → BYPASS_COMMAND → BYPASS_CONFIRMED → (RECOVERY_ATTEMPT or LATCHED_OFF).

2) Functional — thermal ramp

Stimulus: controlled temperature rise to trip thermal policy.
Expected behavior: enter Degraded run or latch-off only with sufficient thermal evidence and context gating.
Expected evidence: T_hot / ΔT / dT/dt trends cross configured thresholds without contradictory PG/BusSag context.
Required logs: FAULT_DETECTED (THERMAL_*) with thr_set + snapshot; state transition fields must show policy execution.

3) Functional — intermittent connector

Stimulus: intermittent open events (bursty dropouts) to stress debounce and rate limit.
Expected behavior: Suspect filtering blocks oscillation; transfers are bounded by Tconfirm + max transfers/hour.
Expected evidence: dropout_cnt spikes, but transfers occur only after confirmation; Verify windows are respected.
Required logs: repeated FAULT_DETECTED is acceptable; repeated BYPASS_COMMAND without confirmation is a failure.

4) Transfer robustness — endurance + policy stability

Stimulus: repeated switchover cycles and max transfers/hour saturation.
Expected behavior: commands execute; Verify confirms; rate limit triggers Service-required or latch policy when exceeded.
Expected evidence: CF and Vdrop/Vds remain consistent after N cycles; no increasing mismatch rate over time.
Required logs: BYPASS_COMMAND / BYPASS_CONFIRMED sequences with retry counters and rate-limit counters captured.

5) Surge/ESD (mis-trigger focus only)

Stimulus: surge/ESD exposures relevant to false events.
Expected behavior: no “action without evidence.” If bypass/latch occurs, it must be preceded by valid evidence snapshots.
Expected evidence: BYPASS_COMMAND and LATCHED_OFF are allowed only when FAULT_DETECTED meets the minimum evidence set.
Required logs: preserve the evidence snapshot immediately preceding any action-class event.

6) Log integrity — power loss and monotonic continuity

Stimulus: remove power during NVM write and during export packaging.
Expected behavior: records are either fully committed (CRC/commit marker) or explicitly marked invalid; never ambiguous “half records.”
Expected evidence: monotonic_ctr never rolls back across reboot; discontinuities are detectable and reportable.
Required logs: record_crc + commit_marker fields; boot-time continuity check event (concept-level) if implemented.
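A boot-time continuity check over the committed records can be as small as the sketch below. It assumes that monotonic_ctr increments by exactly one per committed record, which is a simplifying assumption; the function name and the issue labels are hypothetical.

```python
def check_continuity(counters):
    """Return (index, kind) anomalies found in a sequence of monotonic_ctr values."""
    issues = []
    for i in range(1, len(counters)):
        prev, cur = counters[i - 1], counters[i]
        if cur <= prev:
            issues.append((i, "rollback"))  # counter repeated or went backwards
        elif cur != prev + 1:
            issues.append((i, "gap"))       # records missing or invalidated
    return issues
```

A gap is reportable but not necessarily a failure, since an interrupted write is expected to leave one; a rollback is always a failure of this test.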

Acceptance rule: any bypass/latch action must be provably linked to a prior evidentiary FAULT_DETECTED record (snapshot + thr_set + policy ID). “Action without evidence” is classified as false-bypass/false-latch.
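The acceptance rule lends itself to an automated pass over an exported log. The sketch below flags action-class events that have no earlier evidentiary FAULT_DETECTED record on the same channel. Two points are assumptions: config_hash is used as a stand-in for the policy ID, and a valid FAULT_DETECTED is treated as backing every later action on that channel, which is a simplification a real checker would tighten with a time or sequence window.

```python
ACTION_EVENTS = {"BYPASS_COMMAND", "LATCHED_OFF"}
# config_hash stands in for the policy ID (assumption).
REQUIRED_EVIDENCE = ("snapshot", "thr_set", "config_hash")


def find_unbacked_actions(log):
    """Return indices of action events lacking prior evidentiary backing."""
    backed_channels = set()
    unbacked = []
    for index, record in enumerate(log):
        channel = record.get("channel_id")
        event = record.get("event_type")
        if event == "FAULT_DETECTED":
            # Only a record with all evidence fields present counts as backing.
            if all(record.get(field) for field in REQUIRED_EVIDENCE):
                backed_channels.add(channel)
        elif event in ACTION_EVENTS and channel not in backed_channels:
            unbacked.append(index)
    return unbacked
```

Any index returned by this check corresponds to a false-bypass or false-latch candidate under the acceptance rule.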

Proof-test procedure (service can validate actuator + sensors with bounded impact)

Choose a safe window: PG stable, transfer disabled, energy-limited stimulus enabled.
Freeze policies: lock rate-limit counters for the test window; prevent state machine from reacting to test stimuli as real faults.
Inject stimulus: small Itest/Vprobe or controlled gate toggle (implementation-dependent) to verify commanded vs measured conduction.
Observe evidence: CF/Vdrop/Vds and path-current summary must match the expected result for OPEN/ON.
Log result: PROOF_TEST_PASS/FAIL with stimulus_id, expected reading, measured summary, and monotonic_ctr.

Example part numbers (MPN) commonly used to build and validate this plan

The verification strategy above maps to concrete hardware building blocks for actuation, protection, logging, time, and supervision. The list below provides example MPNs engineers often use as reference points in prototypes and verification fixtures.


Relay (signal/power examples): Omron G5Q series; Panasonic TQ2 series
Hot-swap / eFuse (telemetry-capable examples): TI TPS25940; TI TPS25982
Surge / TVS (rail clamp examples): Littelfuse SMBJ series; Vishay SMBJ series
Precision supervisor / reset (monotonic/log integrity helper): TI TPS3839; Analog Devices ADM809/ADM810 family
FRAM for robust event storage (example): Fujitsu MB85RS64V (SPI FRAM)
Real-time clock (timestamp source example): Maxim/ADI DS3231 (TCXO RTC)
Digital isolator for feedback crossing (example): TI ISO7721; Analog Devices ADuM1250 (I²C isolator class)

Figure F11 — Proof-test flow: a safe window and frozen policy allow bounded stimuli; expected evidence outputs (snapshot + actuator conduction proof + monotonic continuity) are logged as PROOF_TEST_PASS/FAIL.


FAQs (Bypass / Redundant Channel)

Each answer is structured as a one-sentence conclusion, two evidence checks, and one first fix. Every question maps back to H2-2…H2-11 to avoid scope creep.

1. Auto-bypass triggers during surge but the channel is actually fine—thresholds or common-mode pickup?

Conclusion: If bypass actions occur without a valid pre-event evidence snapshot, it is almost always a mis-trigger (threshold/context gating or common-mode pickup), not a real channel fault.

Evidence check A: In the log sequence, verify FAULT_DETECTED exists immediately before BYPASS_COMMAND, and that its snapshot includes I/V/dropout changes plus context (PG/BusSag) consistent with a real fault. (H2-5/H2-11)
Evidence check B: Compare surge timestamps with feedback integrity: do CF/Vdrop readings spike or glitch during the surge window, indicating common-mode injection into sensing/feedback routes? (H2-10)
First fix: Tighten “action requires evidence” gating: require 2-signal correlation (I + V) and a minimum debounce window before allowing bypass; add a rail clamp and validate mis-trigger immunity (e.g., TVS in SMBJ family) and ensure supervisor reset timing is stable (e.g., TPS3839). (H2-5/H2-10/H2-11)

Maps to: H2-5 / H2-10 / H2-11
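The "2-signal correlation plus debounce" gate can be sketched as follows. The fault signature used here (current below a floor AND voltage above a ceiling) is an illustrative open-string-like pattern; the function name and threshold parameters are assumptions, not values from this page.

```python
def bypass_allowed(samples, i_min_a, v_max_v, debounce_n):
    """True only if the last `debounce_n` samples all show BOTH signatures.

    `samples` is a list of (current_a, voltage_v) tuples, oldest first.
    """
    if debounce_n < 1:
        raise ValueError("debounce_n must be at least 1")
    if len(samples) < debounce_n:
        return False
    recent = samples[-debounce_n:]
    return all(i < i_min_a and v > v_max_v for i, v in recent)


# A surge glitch that disturbs only the voltage reading for one sample.
glitch = [(0.35, 36.0), (0.35, 52.0), (0.35, 36.0), (0.35, 36.0)]
# A persistent fault: current collapses and voltage rises together.
fault = [(0.35, 36.0), (0.0, 52.0), (0.0, 52.0), (0.0, 52.0)]
```

With a 3-sample debounce, the glitch trace never qualifies because the current signal does not agree, while the fault trace qualifies only after three consecutive correlated samples.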

2. Relay chatters at low line—coil brownout or debounce window too short?

Conclusion: Relay chatter at low line is usually coil brownout behavior amplified by insufficient minimum dwell/holdoff timing, not “random” firmware behavior.

Evidence check A: Correlate low-line events with coil drive supply and chatter frequency: do repeated state transitions occur without stable CF confirmation? (H2-7)
Evidence check B: Inspect coil suppression and release time: slow release can stretch transfer windows and cause repeated re-entries into Suspect/Transfer if verification is time-misaligned. (H2-10)
First fix: Add UVLO-like gating for the coil drive and enforce minimum on/off dwell; choose a clamp strategy consistent with the state machine timing and validate with a supervisor (e.g., TPS3839) plus a relay family suited for switching duty (e.g., Omron G5Q series). (H2-7/H2-10)

Maps to: H2-7 / H2-10
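UVLO-like gating with a minimum dwell can be sketched as a small state holder. The pull-in and drop-out thresholds and the dwell time below are placeholder numbers, not datasheet values for any relay or supervisor named on this page.

```python
class CoilGate:
    """UVLO-like gating with hysteresis plus a minimum dwell between changes."""

    def __init__(self, v_on=10.5, v_off=9.0, min_dwell_s=0.5):
        self.v_on, self.v_off, self.min_dwell_s = v_on, v_off, min_dwell_s
        self.energized = False
        self._last_change_s = None

    def update(self, want_on, v_coil, now_s):
        if self.energized:
            target = want_on and v_coil > self.v_off    # hold until below v_off
        else:
            target = want_on and v_coil >= self.v_on    # pull in only above v_on
        dwell_ok = (self._last_change_s is None
                    or now_s - self._last_change_s >= self.min_dwell_s)
        if target != self.energized and dwell_ok:
            self.energized = target
            self._last_change_s = now_s
        return self.energized


gate = CoilGate()
trace = [(11.0, 0.0), (8.5, 0.1), (9.5, 0.2), (8.5, 1.0),
         (10.0, 1.2), (11.0, 1.3), (11.0, 1.6)]
states = [gate.update(True, v, t) for v, t in trace]
```

The short dip at 0.1 s does not release the coil because the dwell has not elapsed, and the recovery to 10.0 V does not re-energize it because the supply is still inside the hysteresis band.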

3. We bypass correctly, but brightness steps visibly—transfer timing or current-loop settle?

Conclusion: Visible brightness steps after bypass are usually a transfer/verify timing issue that reinitializes or perturbs the current loop, not a bypass “failure.”

Evidence check A: Compare ILED ripple/settling before and after BYPASS_CONFIRMED; a transient ILED drop or overshoot aligned with Transfer→Verify indicates loop settle or headroom disturbance. (H2-5)
Evidence check B: Validate state-machine windows: if Verify begins too early (before coil release or MOSFET stabilization), it can force retries that look like flicker/steps. (H2-7)
First fix: Add a holdoff and a “soft-verify” stage (delay + filtered evidence) so current loop settles before declaring stable; enforce a maximum transfers/hour to prevent repeated perceptible steps. (H2-7/H2-5)

Maps to: H2-7 / H2-5

4. Voting disagrees intermittently—sensor drift or shared reference causing common-mode?

Conclusion: Intermittent vote disagreement is often caused by hidden common-mode coupling (shared reference/ground/ADC) rather than true sensor drift.

Evidence check A: Look at VOTE_DISAGREE records: do disagreements cluster during surge/transfer edges (common-mode), or do they evolve slowly with temperature/time (drift)? (H2-6)
Evidence check B: Check independence assumptions: are both channels using the same ADC reference or the same return path such that ground bounce can move both “independent” readings together? (H2-5)
First fix: Increase independence at the evidence layer: separate references/filters or staggered sampling windows; add a proof-test that injects a small stimulus and verifies each channel’s response distinctly. For cross-domain signals, use a robust isolator (e.g., ISO7721). (H2-6/H2-5/H2-11)

Maps to: H2-6 / H2-5 / H2-11
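Evidence check A can be approximated offline by testing whether disagreement timestamps cluster around known surge/transfer edges. The sketch below is a rough heuristic; the time window, the 70% fraction, and the returned labels are illustrative assumptions.

```python
def classify_disagreements(disagree_ts, edge_ts, window_s=0.05, edge_fraction=0.7):
    """Label VOTE_DISAGREE timestamps as edge-clustered or spread out.

    `disagree_ts` and `edge_ts` are lists of timestamps in seconds.
    """
    if not disagree_ts:
        return "no_data"
    near_edge = sum(
        1 for t in disagree_ts
        if any(abs(t - e) <= window_s for e in edge_ts)
    )
    if near_edge / len(disagree_ts) >= edge_fraction:
        return "likely_common_mode"
    return "check_drift"


edges = [10.0, 20.0]
clustered = classify_disagreements([10.01, 10.02, 20.03, 35.0], edges)
spread = classify_disagreements([5.0, 15.0, 25.0, 35.0], edges)
```

A "check_drift" result does not prove drift; it only indicates that edge coupling is not the dominant pattern, so the temperature/time trend should be examined next.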

5. Contact is welded but system doesn’t detect it—missing Vdrop test or wrong injection point?

Conclusion: Welded contacts go undetected when the design lacks a credible open-state evidence method (Vdrop measurement or safe test-current injection at the correct point).

Evidence check A: When “open” is commanded, is Vdrop across the contact measured (Kelvin) and logged as part of a proof-test or BIST event? (H2-8)
Evidence check B: If a test current is injected, confirm it traverses the suspect element and produces an unambiguous signature; wrong injection points can bypass the evidence path. (H2-11)
First fix: Add a bounded proof-test step that measures Vdrop/Vds with a known small stimulus and logs PROOF_TEST_PASS/FAIL; if using a telemetry-capable hot-swap/eFuse for controlled stimulus, a common reference part is TPS25982 or TPS25940. (H2-8/H2-11)

Maps to: H2-8 / H2-11
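A bounded Vdrop proof step can be sketched as an apparent-resistance check against the commanded state. This assumes the test current actually traverses the contact and that the stimulus source lets Vdrop rise when the contact is truly open; the resistance limits and the function name are placeholders.

```python
def contact_proof(commanded_open, vdrop_v, i_test_cmd_a,
                  r_closed_max_ohm=0.1, r_open_min_ohm=1000.0):
    """Classify a contact from Kelvin Vdrop under a known small test current.

    Returns (event, measured_state). `i_test_cmd_a` is the commanded test current.
    """
    if i_test_cmd_a <= 0:
        raise ValueError("test current must be positive")
    r_apparent = vdrop_v / i_test_cmd_a
    if r_apparent <= r_closed_max_ohm:
        measured = "CLOSED"
    elif r_apparent >= r_open_min_ohm:
        measured = "OPEN"
    else:
        return "PROOF_TEST_FAIL", "AMBIGUOUS"   # no unambiguous signature
    expected = "OPEN" if commanded_open else "CLOSED"
    event = "PROOF_TEST_PASS" if measured == expected else "PROOF_TEST_FAIL"
    return event, measured


welded = contact_proof(commanded_open=True, vdrop_v=0.0005, i_test_cmd_a=0.01)
healthy = contact_proof(commanded_open=True, vdrop_v=12.0, i_test_cmd_a=0.01)
```

A contact that still reads as a few tens of milliohms while commanded open fails the test, which is the welded-contact signature the proof step is meant to expose.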

6. MOSFET bypass runs hot in normal mode—Rds(on) margin or SOA underestimated?

Conclusion: A hot bypass MOSFET in normal operation is typically a margin issue (Rds(on) at temperature, gate drive, or current distribution), and verification should confirm SOA and thermal evidence under worst case.

Evidence check A: Measure and log Vds (or Vdrop) and current simultaneously to compute real conduction loss versus expectations, including at elevated temperature. (H2-4)
Evidence check B: Review proof-test/verification results for worst-case current and thermal ramp: if temperature rise is faster than predicted, SOA or cooling assumptions are wrong. (H2-11)
First fix: Increase conduction margin (lower Rds(on) device or parallel strategy) and add thermal gating that prevents repeated transfers under high junction temperature; validate with thermal ramp tests and record evidence snapshots for audit. (H2-4/H2-11)

Maps to: H2-4 / H2-11
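Evidence check A amounts to comparing measured conduction loss (Vds × I) with a temperature-scaled expectation (I² × Rds(on)). The sketch below uses a linear temperature model, R(T) = R25 × (1 + k × (T − 25)), which is a simplifying assumption; a real device should be checked against its datasheet curve. The coefficient and margin defaults are placeholders.

```python
def conduction_check(vds_v, i_a, rds_on_25c_ohm, tj_c,
                     tempco_per_c=0.005, margin=1.2):
    """Compare measured conduction loss with a temperature-scaled expectation."""
    p_measured_w = vds_v * i_a
    rds_expected = rds_on_25c_ohm * (1.0 + tempco_per_c * (tj_c - 25.0))
    p_expected_w = i_a ** 2 * rds_expected
    return {
        "p_measured_w": p_measured_w,
        "p_expected_w": p_expected_w,
        "within_margin": p_measured_w <= margin * p_expected_w,
    }


# 20 mΩ at 25 °C, 2 A load, 125 °C junction estimate.
nominal = conduction_check(vds_v=0.06, i_a=2.0, rds_on_25c_ohm=0.02, tj_c=125.0)
hot = conduction_check(vds_v=0.10, i_a=2.0, rds_on_25c_ohm=0.02, tj_c=125.0)
```

A measured loss well above the scaled expectation points to weak gate drive, uneven current sharing, or an optimistic Rds(on) assumption rather than a logic fault.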

7. False ‘open-string’ faults after maintenance—connector intermittency or sense wiring routing?

Conclusion: Post-maintenance “open-string” faults are most often intermittent connectors or disturbed sense wiring that turns benign noise into dropout signatures.

Evidence check A: Confirm dropout_cnt patterns: intermittent connectors create bursty dropouts with recoveries; a real open string produces persistent I=0 with consistent Vstring changes. (H2-5)
Evidence check B: Validate routing/sense integrity around high di/dt paths; poor reference routing can create false Vstring/headroom readings during switching edges. (H2-10)
First fix: Increase confirmation robustness (Tconfirm + 2-signal correlation) and rework sense routing for Kelvin reference; if log retention is fragile during repeated maintenance cycles, store key events in FRAM (e.g., MB85RS64V) for robust history. (H2-5/H2-10/H2-9)

Maps to: H2-5 / H2-10 / H2-9
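The dropout pattern described in evidence check A can be sketched as a run-length classifier on string-current samples. This simplified version looks at current only and ignores the Vstring correlation; the zero threshold, persistence length, and labels are illustrative assumptions.

```python
def classify_dropouts(i_samples_a, i_zero_a=0.005, persist_n=50):
    """Separate bursty dropouts (with recoveries) from a persistent I=0 state.

    Returns (label, dropout_cnt). `i_samples_a` is ordered oldest first.
    """
    dropout_cnt = 0      # number of distinct dropout episodes
    run = 0              # length of the current zero-current run
    for i in i_samples_a:
        if i <= i_zero_a:
            if run == 0:
                dropout_cnt += 1
            run += 1
        else:
            run = 0
    if run >= persist_n:
        return "persistent_open", dropout_cnt
    if dropout_cnt > 0:
        return "intermittent", dropout_cnt
    return "healthy", dropout_cnt


bursty = classify_dropouts([0.35] * 10 + [0.0] * 3 + [0.35] * 10
                           + [0.0] * 2 + [0.35] * 10)
persistent = classify_dropouts([0.35] * 10 + [0.0] * 60)
```

Bursty episodes that recover on their own point toward connectors or sense wiring, while a zero-current run that never recovers is the signature of a real open string.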

8. Event logs show ‘bypass confirmed’ but field tech sees no bypass—feedback signal integrity or definition mismatch?

Conclusion: “Confirmed” in the log without real bypass almost always means the confirmation criteria are too weak or the feedback definition/integrity is wrong across the isolation boundary.

Evidence check A: Inspect what “confirmed” means: does BYPASS_CONFIRMED require both CF and a credible conduction metric (Vdrop/Vds + current recovery), or just one noisy signal? (H2-9)
Evidence check B: Validate feedback integrity: check for stuck-at, polarity inversion, or cross-domain corruption; verify that CF transitions correlate with actual Vdrop change. (H2-10)
First fix: Strengthen confirmation to require two independent proofs (CF + Vdrop/Vds evidence) and add a periodic proof-test; if feedback crosses isolation, use a proven digital isolator class (e.g., ISO7721) and log a “feedback health” reason code on anomalies. (H2-9/H2-10/H2-11)

Maps to: H2-9 / H2-10 / H2-11
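A two-proof confirmation rule can be sketched as below: BYPASS_CONFIRMED requires the contact feedback (CF) and a conduction metric to agree, and any mismatch yields a feedback-health reason instead. The thresholds and reason strings are hypothetical.

```python
def bypass_confirmed(cf_closed, vdrop_v, i_led_a, vdrop_max_v=0.2, i_min_a=0.3):
    """Require two independent proofs before logging BYPASS_CONFIRMED.

    Returns (confirmed, reason).
    """
    conduction_ok = vdrop_v <= vdrop_max_v and i_led_a >= i_min_a
    if cf_closed and conduction_ok:
        return True, "CF_AND_CONDUCTION"
    if cf_closed:
        return False, "FEEDBACK_HEALTH_CF_WITHOUT_CONDUCTION"
    if conduction_ok:
        return False, "FEEDBACK_HEALTH_CONDUCTION_WITHOUT_CF"
    return False, "NO_EVIDENCE"


good = bypass_confirmed(cf_closed=True, vdrop_v=0.05, i_led_a=0.35)
stuck_cf = bypass_confirmed(cf_closed=True, vdrop_v=3.0, i_led_a=0.0)
```

The second case is the field symptom in this question: CF reports closed, but the conduction evidence shows no bypass, so the record carries a feedback-health reason rather than a confirmation.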

9. Power-loss during fault causes corrupted history—log atomicity or monotonic counter handling?

Conclusion: Corrupted history under power loss is an atomicity/commit problem first, and a monotonic continuity problem second; both must be verifiable after reboot.

Evidence check A: Verify records have commit markers and CRC; partial writes must be detectable and never interpreted as valid events. (H2-9)
Evidence check B: Confirm monotonic_ctr continuity across resets; rollbacks or unexplained gaps must be flagged or explainable. (H2-11)
First fix: Implement two-phase commit (write → CRC → commit marker) and store critical counters in robust NVM; FRAM is a common choice for write endurance (e.g., MB85RS64V), and a stable timestamp source for audit is a TCXO RTC (e.g., DS3231). (H2-9/H2-11)

Maps to: H2-9 / H2-11
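Evidence check B can be run at boot as a scan over the counters of committed records. This is a minimal sketch, assuming the counter is expected to increase by exactly one per record; the "ROLLBACK" and "GAP" labels are illustrative names.

```python
def check_continuity(ctrs):
    """Scan committed-record counters in storage order and report anomalies.

    Returns a list of (index, kind), where kind is "ROLLBACK" (counter did not
    increase) or "GAP" (counter jumped by more than one).
    """
    issues = []
    for idx in range(1, len(ctrs)):
        step = ctrs[idx] - ctrs[idx - 1]
        if step <= 0:
            issues.append((idx, "ROLLBACK"))
        elif step > 1:
            issues.append((idx, "GAP"))
    return issues


issues = check_continuity([1, 2, 3, 5, 6, 4, 5])
```

A gap may be explainable (for example, a record discarded as uncommitted), but a rollback should always be flagged because it breaks the ordering the audit trail depends on.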

10. System recovers too aggressively and oscillates—retry policy or thermal cooldown missing?

Conclusion: Oscillation is usually a retry/backoff policy failure (no cooldown, no transfer rate limit, or too-quick verify), not a flaw in the redundancy concept itself.

Evidence check A: Look for repeated Transfer→Verify→Retry sequences within a short interval; this indicates missing max transfers/hour and insufficient holdoff. (H2-7)
Evidence check B: Check requirements and hazard policy: if the system allows recovery without thermal settle, the evidence (T_hot/ΔT) will show rising stress while retries continue. (H2-2)
First fix: Add exponential backoff and thermal cooldown gates before retry; enforce a hard rate limit with a Service-required latch after repeated failures to prevent visible flicker and stress accumulation. (H2-7/H2-2)

Maps to: H2-7 / H2-2
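The first fix combines three gates: a hard failure limit, a thermal cooldown check, and exponential backoff. A minimal sketch follows; every limit is a placeholder value, and the returned labels are hypothetical names.

```python
def next_retry_decision(failures, t_hot_c, base_delay_s=1.0, max_delay_s=300.0,
                        t_cool_c=70.0, max_failures=5):
    """Decide what to do after a failed Transfer->Verify attempt.

    `failures` counts consecutive failed attempts; `t_hot_c` is the hottest
    monitored temperature. Returns (decision, delay_s or None).
    """
    if failures >= max_failures:
        return "LATCH_SERVICE_REQUIRED", None
    if t_hot_c > t_cool_c:
        return "WAIT_COOLDOWN", None
    delay = min(base_delay_s * (2 ** failures), max_delay_s)
    return "RETRY_AFTER", delay


first = next_retry_decision(failures=0, t_hot_c=50.0)
fourth = next_retry_decision(failures=3, t_hot_c=50.0)
```

Checking the latch first ensures a persistently failing channel cannot keep retrying once it cools, which is the pattern that produces visible flicker and stress accumulation.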

11. Redundant channel works in lab, fails in cold start—health monitor gating or timing assumptions?

Conclusion: Cold-start failures are commonly caused by incorrect gating assumptions (PG/bus readiness) and timer windows that do not account for cold behavior of actuators and sensing.

Evidence check A: During cold start, verify that health evaluation is gated by valid PG/BusSag conditions; if evidence is sampled before the system is settled, false faults and failed verify are expected. (H2-5)
Evidence check B: Compare transfer/verify timers at cold and warm: relay release time, MOSFET behavior, and sensor offsets can shift windows enough to cause systematic Verify failures. (H2-7/H2-11)
First fix: Add explicit “boot-safe window” gating and cold-calibrated timing margins; validate with a proof-test at cold start and log the exact thr_set and timer values used. If timestamps are needed for audit correlation, a stable RTC like DS3231 is a common reference. (H2-5/H2-7/H2-11)

Maps to: H2-5 / H2-7 / H2-11

12. How do we prove to an auditor the bypass is ‘controlled’ not ‘random’?

Conclusion: Bypass is provably controlled when every action is linked to a minimum evidence set and a reproducible policy record, and when periodic proof-tests prove the actuator and sensors still behave as assumed.

Evidence check A: For each bypass action, show the event chain: FAULT_DETECTED (snapshot + thr_set + fw/config) → BYPASS_COMMAND → BYPASS_CONFIRMED (CF + Vdrop/Vds + current recovery). (H2-9)
Evidence check B: Demonstrate proof-test logs: PROOF_TEST_PASS/FAIL records with monotonic_ctr continuity and commit integrity prove the mechanism hasn’t silently degraded. (H2-11)
First fix: Adopt a fixed event schema (type/reason/snapshot/thr_set/fw/config_hash/ctr/CRC) and enforce “action requires evidence”; store critical records in robust NVM (e.g., MB85RS64V) and keep time provenance via RTC (e.g., DS3231). (H2-9/H2-11)

Maps to: H2-9 / H2-11
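The fixed event schema named in the first fix can be sketched as an immutable record with a reproducible CRC. The field names follow the schema listed above; the JSON serialization and CRC32 choice are assumptions made for illustration, and a constrained target would more likely use a packed binary layout.

```python
from dataclasses import dataclass, asdict
import json
import zlib


@dataclass(frozen=True)
class EventRecord:
    type: str
    reason: str
    snapshot: dict
    thr_set: str
    fw: str
    config_hash: str
    ctr: int

    def to_bytes(self) -> bytes:
        # Canonical JSON (sorted keys) so the CRC is reproducible.
        return json.dumps(asdict(self), sort_keys=True).encode("utf-8")

    def crc(self) -> int:
        return zlib.crc32(self.to_bytes())


fields = dict(type="FAULT_DETECTED", reason="OPEN_STRING",
              snapshot={"i_a": 0.0, "v_v": 52.0}, thr_set="A",
              fw="1.0.0", config_hash="abc123", ctr=17)
event = EventRecord(**fields)
```

Because the serialization is canonical, an auditor can recompute the CRC from the exported fields and confirm the record has not been altered since it was committed.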

