Onboard PDU & Protection (eFuse, Surge, Telemetry)

An onboard rail PDU keeps the low-voltage bus stable and safe by combining fast electronic protection (eFuse/high-side switching), contactor/relay control, and surge/lightning energy routing, while continuously logging evidence (V/I/state/retry) so every trip or reset can be diagnosed and fixed quickly in the field.

The core goal is not only “pass/fail”, but “self-explainable behavior”: when something goes wrong, the PDU’s telemetry proves whether the cause was inrush, a true short, backfeed/miswire, clamping-induced brownout, contactor issues, or thermal derating, and indicates what to change first.

H2-1. What This PDU Page Covers (and What It Doesn’t)

This page treats the onboard PDU as a power-distribution + protection + evidence module. The goal is not to explain traction or signaling functions, but to make branch power behavior deterministic under real railway stress: inrush at power-up, hard/soft shorts, reverse connection, surge/lightning coupling, relay/contactor switching, and the telemetry needed to close the field-debug loop.

In the onboard power tree, the PDU sits downstream of the auxiliary power sources and DC buses and upstream of the loads. Conceptually, it is the “last mile” layer that decides whether a disturbance becomes a localized branch event or a whole-bus brownout.

The design intent throughout this page is simple: isolate the failing branch first, keep the shared bus stable when possible, and always emit enough evidence (reason codes + waveforms/peaks + counters) to classify the root cause without guesswork.

  • Design: choose the switching/protection style (eFuse/high-side vs relay/contactor vs hybrid) so power-up, normal load steps, and fault isolation are repeatable.
  • Validation: prove selectivity under abuse (short, overload, reverse, surge) and verify that the PDU logs the exact fault signature.
  • Field debugging: distinguish “false trip” vs “true short” vs “thermal derate” vs “surge coupling” using telemetry and event context.

Out of scope here: traction inverter topology, motor control loops, CBTC/ETCS functional design, wayside I/O cabinets, or general EMC textbooks. Those belong to their dedicated pages.

[Diagram: Onboard PDU, position and responsibilities. Blocks: HV source (catenary / battery); aux conversion (DC/DC, rails); DC bus (shared energy); PDU (switching, protection, telemetry); load branches (vital: keep alive; mission: degrade; comfort: shed first). Evidence output: trip reason, Ipeak, Vmin, temp proxy, retry count, contactor health, event timestamps, maintenance counters.]
Figure H2-1 — The onboard PDU sits between shared DC buses and multiple load branches, enforcing selectivity and exporting telemetry/evidence for fast field debugging.

H2-2. System Power-Tree Context: Where the PDU Sits

A rail PDU cannot be designed from voltage names alone. What matters is the disturbance profile of the upstream sources (interruptions, switchover gaps, overshoot, surge coupling) and the service continuity required by each downstream branch. This chapter establishes the power-tree context so later protection decisions (fast trip vs energy-limit, hiccup vs latch-off, precharge sequencing, and surge clamping paths) are anchored to clear system objectives.

Upstream sources typically include auxiliary converters, onboard batteries, and emergency supplies. Even if the nominal bus is stable most of the time, the PDU must be robust to three categories of stress:

  • Switching transients: source switchover, load steps, contactor operations, and bus recovery slopes.
  • Abuse faults: hard/soft shorts, overloads, reverse connection, and intermittent wiring faults (vibration/moisture).
  • Surge coupling: long harness energy, common-mode injection, and lightning-related spikes that shift local reference potentials.

Downstream, branches should be grouped by service continuity rather than by “device type.” A practical hierarchy is: Vital (must remain powered or degrade safely), Mission (temporary loss is tolerated but must be logged), and Comfort (shed first to protect the shared bus). This hierarchy directly determines: the protection response style, retry policy, and what evidence must be recorded.

With this context, the PDU’s three responsibilities become measurable engineering targets: Switching (repeatable power-up and controlled isolation), Protection (branch-first selectivity without collapsing the bus), and Telemetry (fault signatures that allow unambiguous classification: true short vs false trip vs thermal derate vs surge-induced upset).

Practical consequence: if “one branch fault collapses the bus,” the issue is rarely a single component. It is usually a power-tree problem: selectivity thresholds, shared impedance/return paths, precharge timing, or missing evidence fields that hide the real trigger.

[Diagram: Onboard power tree and PDU branches. Sources (aux converter, battery, emergency) feed a shared DC bus exposed to interruptions, switchover, and surge coupling, with bus sense of V / I / temp. PDU branch control (switch, protect, measure, log) feeds vital branches (safety-critical; degrade safely, log evidence; policy: controlled retry) and mission branches (operational; isolate fast, keep bus alive; policy: selectivity-first). Comfort loads are shed first to protect shared bus stability. Telemetry fields for root cause: trip reason, Ipeak / Vmin, temp proxy, retry counters.]
Figure H2-2 — Power-tree context: source disturbances and service-continuity groups determine selectivity policy, retry behavior, and the telemetry fields required for unambiguous field diagnosis.

H2-3. Switching Elements: eFuse vs High-Side Switch vs Relay/Contactor

Switching elements in an onboard PDU are selected by system objectives, not by component labels. The decision must balance selective isolation (branch-first protection), service continuity (avoid bus-wide brownouts), and evidence availability (telemetry fields that classify real shorts vs false trips vs thermal derates).

Isolation visibility • Trip speed • Restart policy • Telemetry evidence • Lifetime & vibration • Fault containment

When semiconductor switching is the right default

Semiconductor switching (eFuse / high-side switch) fits branches that require fast, repeatable protection with measurable behavior. It enables deterministic actions such as current limiting, timed fault qualification, controlled retry windows, and export of reason codes and peaks/minima for field diagnosis.

  • Use eFuse-style protection when branch behavior must be policy-driven: limit curves, fault timers, retry counters, and clear trip reasons.
  • Use high-side switching when remote on/off and basic protection are sufficient and a separate system layer handles deeper evidence/logging.

When relay/contactor switching is still required

Relays/contactors remain valuable for physical isolation, creepage/clearance constraints, and maintenance-safe disconnection. However, mechanical switching alone rarely delivers the telemetry resolution needed to classify transient events, so it often benefits from an electronic layer that records fault signatures and enforces selectivity at branch granularity.

  • Prefer contactors for clear open/close states, safety audits, and high-energy isolation.
  • Plan for diagnostics (coil health, weld suspicion, event counters) because mechanical behavior can be non-ideal under low bus voltage or vibration.

Hybrid channel: the common rail PDU pattern

A hybrid channel combines precharge (to control inrush), a contactor (for visible isolation), and an electronic branch protector (for fast selectivity + telemetry). This structure decouples three problems: controlled startup, safe isolation, and evidence-rich fault handling.

Goal: Branch-first isolation without collapsing the shared bus.
Proof: Trip reason + Ipk + trip time + V droop are logged for every event.
[Diagram: Hybrid PDU channel. Blocks: DC bus in; precharge path (R + switch); main contactor (visible isolation); node; eFuse / HSS (limit, trip, retry, reason codes); branch load (vital / mission / comfort). Telemetry: V / I / temp, trip reason. Channel state machine states: OFF, PRECHARGE, ON, FAULT. On a fault: log reason + peaks + counters, then isolate or retry by policy.]
Figure H2-3 — A hybrid channel separates controlled startup (precharge), visible isolation (contactor), and fast selectivity plus evidence (eFuse/high-side protection).
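
The channel state machine named in the figure (OFF, PRECHARGE, ON, FAULT) can be sketched as a small transition table. This is a minimal illustration, not a reference implementation: the event names (`enable`, `precharge_done`, `fault`, `retry`, `disable`) and the exact set of allowed transitions are assumptions introduced here.

```python
from enum import Enum


class State(Enum):
    OFF = "OFF"
    PRECHARGE = "PRECHARGE"
    ON = "ON"
    FAULT = "FAULT"


# Hypothetical transition table: (current state, event) -> next state.
# Event names are illustrative assumptions, not taken from a real PDU.
TRANSITIONS = {
    (State.OFF, "enable"): State.PRECHARGE,
    (State.PRECHARGE, "precharge_done"): State.ON,
    (State.PRECHARGE, "fault"): State.FAULT,
    (State.PRECHARGE, "disable"): State.OFF,
    (State.ON, "fault"): State.FAULT,
    (State.ON, "disable"): State.OFF,
    (State.FAULT, "retry"): State.PRECHARGE,
    (State.FAULT, "disable"): State.OFF,
}


class Channel:
    """One PDU channel that logs every state transition with a reason."""

    def __init__(self):
        self.state = State.OFF
        self.log = []  # tuples of (from_state, event, to_state, reason)

    def handle(self, event, reason=None):
        nxt = TRANSITIONS.get((self.state, event))
        if nxt is None:
            # Events that are not valid in the current state are ignored.
            return self.state
        self.log.append((self.state.value, event, nxt.value, reason))
        self.state = nxt
        return nxt
```

Keeping the transitions in one table makes illegal jumps (for example FAULT directly to ON without a new precharge) impossible by construction, and the log gives every transition a recorded reason.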

H2-4. Inrush, Pre-Charge, and Hold-Up: Don’t Trip Yourself

Many “mysterious trips” are not real faults. They are normal startup energy events that collide with protection thresholds. The PDU must separate legitimate inrush (capacitor charging and converter startup) from true short-circuit behavior. Without that separation, protection becomes a self-inflicted outage generator: the more sensitive the trip, the more frequent the false shutdown.

Where inrush comes from (as waveforms, not labels)

  • One-shot charging: large input capacitors and EMI capacitance create a high peak that decays quickly.
  • Startup loads: DC/DC and motor-driven auxiliaries draw elevated current for longer, often with step-like ramps.
  • Harness resonance: long cables and filters can ring or inject common-mode energy, confusing simple threshold logic.

Precharge strategy: control the trajectory

Precharge is a trajectory control problem: limiting peak current is not enough unless the bus recovery slope and switching handoff are also controlled. Common rail patterns include resistor precharge followed by contactor closure, then staged branch enabling so the shared bus does not cross brownout thresholds for vital loads.
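
For the resistor-precharge pattern, a first-order estimate of peak current and charge time helps place the contactor-closure handoff. The sketch below assumes an ideal RC circuit (no load current drawn during precharge, constant bus voltage); the function names and the example values are illustrative assumptions.

```python
import math


def precharge_peak_current(v_bus, r_pre):
    """Peak current at the instant the precharge path closes (ideal RC)."""
    return v_bus / r_pre


def precharge_time(r_pre, c_load, fraction=0.95):
    """Time for the load capacitance to reach `fraction` of the bus voltage.

    Ideal RC charging: v(t) = V * (1 - exp(-t / (R*C))).
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be between 0 and 1 (exclusive)")
    return -r_pre * c_load * math.log(1.0 - fraction)


# Example with assumed values: 110 V bus, 47 ohm resistor, 2200 uF load.
i_peak = precharge_peak_current(110.0, 47.0)
t_95 = precharge_time(47.0, 2200e-6, fraction=0.95)
```

Closing the main contactor before the capacitance is close to bus voltage produces a second inrush step, which is why the handoff point matters as much as the resistor value.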

Protection style must follow service continuity

  • Hiccup (auto retry): useful when brief recovery is expected, but requires cooldown windows to avoid oscillation.
  • Foldback (energy-limited): maintains bus stability while bounding device stress during overload-like conditions.
  • Latch-off: appropriate for non-critical branches or suspected hard faults, paired with clear reason codes and counters.

A robust restart policy prevents “bounce resets” where repeated retries keep the bus below thresholds. Use retry windows, thermal-aware cooldown, and classified counters so a transient event is not treated as a permanent short.
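
A retry window with a growing cooldown can be sketched as follows. The class name, the default limits, and the linear cooldown growth (a crude stand-in for thermal-aware cooldown) are assumptions for illustration only.

```python
class RetryPolicy:
    """Bounded retries inside a sliding time window, with growing cooldown."""

    def __init__(self, max_retries=3, window_s=60.0, cooldown_s=2.0):
        self.max_retries = max_retries
        self.window_s = window_s
        self.cooldown_s = cooldown_s
        self.trip_times = []

    def on_trip(self, now_s):
        """Return ("RETRY", earliest_retry_time) or ("LATCH_OFF", None)."""
        # Forget trips that fell out of the window so an old transient
        # is not counted as evidence of a permanent short.
        self.trip_times = [t for t in self.trip_times
                           if now_s - t < self.window_s]
        self.trip_times.append(now_s)
        count = len(self.trip_times)
        if count > self.max_retries:
            return ("LATCH_OFF", None)
        # Cooldown grows with each trip inside the window (assumed rule).
        return ("RETRY", now_s + self.cooldown_s * count)
```

The trip count inside the window is the same number that would be exported as the retry counter, so the log shows why a channel latched off.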

[Plot: Inrush vs protection in the time domain (current vs time). Shows the inrush curve, the current limit, the trip region (threshold / timer), and a thermal-aware retry/cooldown window. Rule: cooldown + retry counters + Vmin evidence prevent oscillating restarts.]
Figure H2-4 — Concept view: inrush trajectory, current limiting, and retry cooldown windows prevent false trips and repeated brownout oscillations.

H2-5. Short-Circuit & Overload Protection: SOA, Fast Trip, Selectivity

Short-circuit and overload handling must be engineered around protection priorities and selectivity. In rail PDUs, the practical objective is not “maximum sensitivity,” but branch-first containment that prevents harness/connector damage and avoids collapsing the shared bus for other branches.

Protection targets (order matters)

  • Harness & connectors: prevent thermal runaway and insulation damage.
  • Switching device SOA: prevent secondary failures during fault energy handling.
  • Load containment: prevent fault propagation and repeated stress.
  • Service continuity: preserve vital branches through selectivity and controlled policies.

Fast trip vs energy-limited response

Fast trip is essential for hard shorts to minimize energy. Energy-limited control (I²t/SOA-aware limiting) is often superior for soft shorts and overload-like faults because it keeps the bus stable while bounding device stress. These are not mutually exclusive: an effective PDU uses fault classification and timed qualification to choose the right action.
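
One way to combine the two responses is an instantaneous threshold for hard shorts plus an I²t accumulator for everything slower. The sketch below is a simplified model under stated assumptions: the thresholds are placeholder values, and the "cooling" term is a rough approximation rather than a real thermal model.

```python
def protect_step(i_a, dt_s, i2t_acc, i_fast=100.0, i_nom=10.0,
                 i2t_limit=500.0):
    """Advance the protection decision by one sample.

    Returns (action, new_accumulator). All thresholds are illustrative.
    """
    if i_a >= i_fast:
        # Hard short: trip immediately to minimize fault energy.
        return "FAST_TRIP", i2t_acc
    if i_a > i_nom:
        # Overcurrent: integrate excess I^2 over time.
        i2t_acc += (i_a ** 2 - i_nom ** 2) * dt_s
    else:
        # Below nominal: let the accumulator decay (assumed cooling rule).
        i2t_acc = max(0.0, i2t_acc - (i_nom ** 2 - i_a ** 2) * dt_s)
    if i2t_acc >= i2t_limit:
        return "I2T_TRIP", i2t_acc
    return "OK", i2t_acc
```

Calling `protect_step` once per current sample keeps a short inrush pulse from tripping the channel while a sustained moderate overcurrent still accumulates to a timed trip.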

Selectivity: engineering acceptance, not a slogan

  • Time selectivity: branch protection acts before upstream bus-level protection, with margin.
  • Energy selectivity: branch I²t remains within harness/connector limits and avoids bus droop crossing vital thresholds.
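
The two selectivity criteria can be written as an explicit acceptance check for a test report. This is a sketch only; the parameter names are hypothetical, and the limits would come from the harness ratings and the vital-load undervoltage thresholds of the actual platform.

```python
def selectivity_ok(branch_trip_s, bus_trip_s, margin_s,
                   branch_i2t, harness_i2t_limit,
                   bus_vmin_v, vital_uv_threshold_v):
    """Return (passed, per-check results) for one fault test."""
    checks = {
        # Time selectivity: the branch acts before the bus, with margin.
        "time": branch_trip_s + margin_s <= bus_trip_s,
        # Energy selectivity: let-through energy within harness limits.
        "energy": branch_i2t <= harness_i2t_limit,
        # Bus droop must stay above the vital undervoltage threshold.
        "droop": bus_vmin_v > vital_uv_threshold_v,
    }
    return all(checks.values()), checks
```

Returning the individual checks alongside the overall verdict shows which criterion failed, not just that the test failed.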

Fault types and what evidence is needed

  • Hard short: high Ipk + rapid trip time.
  • Soft short: limiting duration + temperature proxy trend.
  • Overload: moderate overcurrent + long duration + thermal accumulation.
  • Intermittent short: event counters + correlation to vibration/moisture + repeated V droop signatures.

Minimum evidence to log for each event (enables unambiguous field classification): Ipk, trip reason, trip time, bus V droop, Tj proxy, and retry count.
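
The fault signatures listed above can be turned into a simple rule-based classifier over a logged event. The field names, thresholds, and rule ordering below are assumptions made for illustration; a real PDU would tune them per branch.

```python
def classify_fault(ev, i_short=80.0, i_nom=10.0, fast_s=1e-3,
                   long_s=1.0, burst=3):
    """Classify one event dict using the evidence fields it carries.

    Expected keys (hypothetical names): ipk_a, trip_time_s, and optionally
    limit_duration_s, temp_rising, events_in_window.
    """
    # Repeated events in a short window point to an intermittent short.
    if ev.get("events_in_window", 0) >= burst:
        return "INTERMITTENT_SHORT"
    # High peak current with a rapid trip is the hard-short signature.
    if ev["ipk_a"] >= i_short and ev["trip_time_s"] <= fast_s:
        return "HARD_SHORT"
    # Time spent in current limiting plus a rising temperature proxy.
    if ev.get("limit_duration_s", 0.0) > 0.0 and ev.get("temp_rising", False):
        return "SOFT_SHORT"
    # Moderate overcurrent sustained for a long time.
    if ev["ipk_a"] > i_nom and ev["trip_time_s"] >= long_s:
        return "OVERLOAD"
    return "UNCLASSIFIED"
```

An explicit `UNCLASSIFIED` result is useful in its own right: it flags events where the logged evidence is insufficient, which usually means a field is missing from the record.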

[Diagram: Selectivity and SOA-aware protection. A shared DC bus with bus-level (upstream) protection that is slower and a last resort; branches A, B, and C each have their own branch protection and feed loads showing a hard/soft short, an overload, and an intermittent fault. Acceptance: branch trips first (time margin) and limits I²t so bus droop stays above vital thresholds. Log these 6 fields: Ipk, trip reason, trip time, bus V droop, Tj proxy, retry count.]
Figure H2-5 — Selectivity is an acceptance criterion: branch protection must act first (time margin) and constrain energy so other branches stay powered; evidence fields make faults classifiable.

H2-6. Reverse Polarity, Backfeed, and Miswiring Defense

Maintenance and module replacement can temporarily turn a clean one-direction power tree into a multi-source OR-ing network. The risk is not only component damage. A more severe failure mode is de-energized illusion: a branch appears disconnected yet remains powered through a backfeed path, creating safety hazards, false trips, and confusing field symptoms.

Reverse polarity • Backfeed paths • OR-ing isolation • De-energized verification • Voltage distribution diagnosis

Reverse polarity: block the direction, coordinate the energy

Reverse polarity defense should be treated as a layered strategy. Reverse blocking (ideal diode / back-to-back MOSFET) prevents reverse current while minimizing forward loss. For severe miswiring events, coordination with fuse/breaker elements is required so the system does not rely on semiconductor stress alone.

  • Reverse blocking layer: prevents reverse current and protects internal rails from negative injection.
  • Isolation layer: defines a verifiable off-state so a disconnected branch cannot remain energized.
  • Evidence layer: logs voltage and event signatures that expose the actual backfeed direction.

Backfeed scenarios: three high-probability sources

  • Parallel sources: two supplies are inadvertently OR-ed, allowing one to feed the other through a branch path.
  • Service injection: external maintenance power energizes a branch and backfeeds upstream nodes.
  • Energy loads: motor drives, storage, or large capacitors push energy back into the branch during shutdown or faults.

OR-ing and isolation: prevent “branch OFF but still live”

OR-ing and isolation design must ensure that an OFF command results in a measurable de-energized state. If isolation is only logical (software state) but not electrical, a branch can remain live through unexpected return paths. Practical defenses include reverse-blocking OR-ing, isolation placement near the hazard node, and off-state verification signals.

Field diagnosis: use voltage distribution to trace the backfeed path

A repeatable diagnostic method is to measure four nodes: bus input, PDU input, branch output, and load-side. Observe which node decays last after a controlled disconnect sequence. The node that stays high the longest typically indicates the backfeed source direction and the coupling path.

Check 1: Branch OFF, yet branch output voltage stays high → suspect OR-ing or service injection backfeed.
Check 2: Shutdown causes voltage rebound or slow decay → suspect energy load backfeed (storage / large capacitance).
[Diagram: Backfeed paths (concept), three common routes that keep a branch energized after “OFF”. A) Parallel sources: source 1 and source 2 through OR-ing + block to the branch and upstream node; compare V-branch with V-upstream. B) Service injection: external service power via a maintenance port into the branch, backfeeding the PDU / bus side; compare V-branch with V-bus. C) Energy load: motor drive / storage feeding the branch output back to the upstream node; compare V-out with V-upstream. Rule: a branch OFF must be electrically de-energized and verifiable by voltage distribution.]
Figure H2-6 — Backfeed is a path problem: parallel sources, service power injection, and energy loads can keep a branch energized after “OFF.” Diagnose by voltage distribution and isolate with reverse blocking + OR-ing rules.
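
The four-node diagnostic method described in this section can be automated once the node voltages are captured after a controlled disconnect. The sketch below assumes equally spaced samples per node and a single "de-energized" threshold; the node names and sample data are made up for illustration.

```python
def last_to_decay(traces, dt_s, v_threshold):
    """Find the node whose voltage stays above the threshold the longest.

    traces: dict mapping node name -> list of voltage samples taken every
    dt_s seconds after the disconnect command.
    Returns (node_name, decay_times); a node that never drops below the
    threshold gets a decay time of infinity.
    """
    decay = {}
    for node, samples in traces.items():
        idx = next((i for i, v in enumerate(samples) if v < v_threshold), None)
        decay[node] = float("inf") if idx is None else idx * dt_s
    return max(decay, key=decay.get), decay


# Illustrative capture: the load side holds up longest, which points
# toward an energy-load backfeed rather than an upstream source.
example = {
    "bus_in": [24.0, 5.0, 1.0, 0.0],
    "pdu_in": [24.0, 6.0, 1.0, 0.0],
    "branch_out": [24.0, 20.0, 12.0, 3.0],
    "load_side": [24.0, 22.0, 18.0, 12.0],
}
node, times = last_to_decay(example, dt_s=0.01, v_threshold=10.0)
```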

H2-7. Surge & Lightning: Clamps, Energy Paths, and Placement Rules

Surge protection is not achieved by adding more clamp parts. It is achieved by creating an intentional energy path that safely returns transient current without pulling the control reference or the shared bus into reset conditions. Layout and return routing often dominate part selection in determining whether the PDU survives and whether the system remains stable.

Energy path • Clamp loop area • Chassis return • Bus droop evidence • False trip prevention

Where surge energy enters (PDU-relevant routes)

  • Long harness injection: cable inductance and coupling deliver fast current spikes at the PDU input.
  • Chassis/common-mode shift: structural potential changes move the local reference, disturbing sense and logic thresholds.
  • Ground potential differences: return paths momentarily separate, producing unexpected voltage at control ground.
  • External interfaces: maintenance or auxiliary interfaces can import transient energy into a branch network.

Define the protection target before choosing clamps

A PDU typically protects three layers: (1) input switching and branch protectors (eFuse/high-side/control drivers), (2) vital branch loads that must avoid brownout resets, and (3) sense/control references that must not see common-mode shifts. The clamp network must serve the layer that dominates the failure risk in the installation.

Clamp combinations (roles, not datasheet lists)

  • TVS: fast peak clipping near sensitive nodes.
  • MOV: energy absorption for larger transient portions.
  • GDT: high-energy diversion toward chassis/ground (with trigger behavior).
  • RC shaping: reduces dv/dt and ringing that can trigger false fault detection.
  • Common-mode suppression: steers common-mode current into controlled return paths and protects measurement references.

Placement rules: make the discharge loop explicit

  • Rule 1: every clamp must have a defined current loop: input → clamp → return point → input (minimize loop area).
  • Rule 2: high-current discharge paths must be separated from sensitive sense/control reference returns.
  • Rule 3: measure both sides: one point before clamping, one after clamping, to classify “surge event” vs “true fault.”
  • Rule 4: verify that clamp action does not create a bus undervoltage that resets vital branches or chatters contactors.

Protection can still cause misoperation (and must be proven)

Typical rail symptoms include bus undervoltage resets due to clamp action, contactor chatter under reference shift, and eFuse false trips due to sensing disturbance. The PDU must log bus Vmin, Ipeak, trip reason, and timestamped events so surge-induced misoperation can be separated from genuine short-circuit faults.

[Diagram: Surge energy path in a PDU. Key question: where does surge current go? Long harness (injection + coupling) → PDU input (entry point) → branch network (protectors + loads). Clamp network roles and discharge paths: TVS, MOV, GDT, RC shaping (dv/dt + ringing), with sense points before and after clamping. Returns: chassis return (high-energy diversion) and bus return (controlled loop); the high-current loop area is minimized. Prove: clamp action must not cause Vmin brownout, contactor chatter, or eFuse false trips.]
Figure H2-7 — Surge design is an energy-routing problem: define the discharge loop (to chassis/return), minimize loop area, and place measurement points before/after clamping to distinguish surge events from true faults.

H2-8. Relay/Contactor Drive: Coil Power, Demag, Weld Detection, Lifetime

Contactors and relays are the mechanical–electrical boundary of an onboard PDU. Many rail field incidents happen here because failure modes are intermittent and state-dependent: pull-in failures, coil chatter, slow release due to demagnetization choices, and welded contacts that create a dangerous “OFF but still live” condition.

Pull-in vs hold • PWM hold • Demag strategy • Release time • Weld suspected • Lifetime counters

Coil power: separate pull-in energy from hold energy

A robust drive strategy treats coil energization as two phases. Pull-in requires higher energy to overcome the air gap and spring force. Hold reduces power and heat, often via PWM hold or a lower steady current. If the shared bus droops during pull-in, the coil can enter an unstable region where chatter becomes likely.

  • Pull-in phase: short high-energy window to guarantee closure under vibration and temperature spread.
  • Hold phase: lower power with enough margin to prevent chatter during bus disturbances.
  • Evidence to log: pull-in success time, coil brownout events, and chatter counters.

Demagnetization: clamp choice controls release time

The demag path (flyback handling) is a system decision, not a minor detail. A simple diode clamp reduces peak voltage but can make release slower. TVS clamping accelerates release by allowing a higher flyback voltage. Active clamping can target a controlled release window that satisfies safety timing without excessive EMI.
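
The release-time tradeoff can be estimated with a first-order coil model. The sketch assumes a freewheel clamp across the coil with a constant clamp voltage, constant coil inductance (a real coil's inductance changes as the armature moves), and a known drop-out current; all numeric values are assumed for illustration. With the drive switched off, L·di/dt = −(R·i + V_clamp), which gives the closed-form time below.

```python
import math


def release_time(l_h, r_ohm, i_hold_a, i_dropout_a, v_clamp_v):
    """Time for coil current to fall from hold to drop-out current.

    Solves L*di/dt = -(R*i + Vclamp) for the time at which i reaches
    i_dropout_a, starting from i_hold_a.
    """
    k = v_clamp_v / r_ohm
    return (l_h / r_ohm) * math.log((i_hold_a + k) / (i_dropout_a + k))


# Assumed coil: 1 H, 100 ohm, 0.24 A hold current, 0.05 A drop-out.
t_diode = release_time(1.0, 100.0, 0.24, 0.05, v_clamp_v=0.7)
t_tvs = release_time(1.0, 100.0, 0.24, 0.05, v_clamp_v=36.0)
```

With these assumed values the diode clamp gives roughly 14.7 ms and the 36 V clamp roughly 3.8 ms, which illustrates the direction of the tradeoff; actual release times also include mechanical travel that this model ignores.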

Weld detection: use multi-evidence consistency, not a single indicator

Welded contacts must be detected without confusing backfeed scenarios. A safe approach is to combine: command state, measured branch voltage/current, and optional auxiliary contact feedback. OFF command with sustained load-side voltage and current is stronger evidence than voltage alone.

  • Command OFF but branch still conducts: sustained voltage + current → weld suspected.
  • Open attempt leaves stable residual voltage: verify against backfeed signatures (H2-6) before declaring weld.
  • Aux contact mismatch: state disagreement escalates the confidence level.
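
The multi-evidence rules above can be sketched as a small decision function. The result labels, thresholds, and parameter names are hypothetical; the point is that voltage alone leads to a backfeed check rather than a weld declaration.

```python
def weld_confidence(cmd_off, v_load, i_load, aux_closed=None,
                    v_min=5.0, i_min=0.1):
    """Combine command state, load-side V/I, and optional aux feedback.

    aux_closed: True/False if an auxiliary contact is fitted, else None.
    """
    if not cmd_off:
        return "NOT_APPLICABLE"
    live_v = v_load > v_min
    conducting = i_load > i_min
    if live_v and conducting:
        # Voltage and current together are strong weld evidence;
        # an aux contact that also reads closed raises the confidence.
        return "WELD_HIGH_CONFIDENCE" if aux_closed else "WELD_SUSPECTED"
    if live_v:
        # Voltage without current: rule out backfeed before declaring weld.
        return "CHECK_BACKFEED"
    if aux_closed:
        # Branch looks dead but aux feedback disagrees with the command.
        return "AUX_MISMATCH"
    return "OPEN_OK"
```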

Lifetime & health: count what correlates with failure

  • Total cycles: basic wear accumulator.
  • Load-break cycles: higher stress than no-load switching.
  • Thermal exposure: time above elevated temperatures increases coil and contact risk.
  • Event counters: chatter, brownout, weld suspected, slow release flags.

Fail-safe: define the de-energized default and the recovery policy

Fail-safe behavior is a policy decision: which channels must default to OFF when drive power is lost, and what evidence is required before re-closing. A recovery policy should avoid repeated close-open oscillations when bus quality is poor.

[Diagram: Contactor drive and demag options. Coil driver (pull-in: high energy; hold: PWM hold) → coil (actuator), with three demag path options: diode (slow), TVS (fast), active clamp (controlled). Inset plot: release time (concept), I/flux vs time for the same coil with different clamp strategies: diode (slow), TVS (fast), active (controlled).]
Figure H2-8 — Coil drive is a two-phase problem (pull-in then hold). Demagnetization choice determines release time: diode (slower), TVS (faster), active clamp (controlled window). Weld detection should combine command-state with measured voltage/current (and aux contact if present).

H2-9. Telemetry & Event Logging: What to Measure So Field Debug Is Fast

Protection without telemetry is not a closed loop. An onboard PDU should produce a compact, consistent evidence set that can answer field questions quickly: was it a real short, a surge-induced misoperation, a brownout during pull-in, or a thermal derate that preserved the bus by design?

Must-have fields (minimal set that enables classification)

  • Bus voltage: Vmin and droop duration (brownout proof).
  • Branch current: Ipk and trip time (hard vs soft fault signatures).
  • Switch temperature proxy: indicates SOA/thermal stress and derating cause.
  • Trip reason: short / overload / UV / overtemp / surge event / weld suspected.
  • State: OFF / PRECHARGE / ON / FAULT (same state model used across the page).
  • Retry count: exposes oscillating restarts and policy-driven recovery.
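
The must-have fields map naturally onto a fixed event record. The sketch below is one possible layout, not a defined format: the field names, units, the added `timestamp_ms` and `channel` fields, and the JSON encoding are all assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class PduEvent:
    timestamp_ms: int      # assumed: monotonic time shared across subsystems
    channel: int           # assumed: branch index
    state: str             # OFF / PRECHARGE / ON / FAULT
    trip_reason: str       # e.g. SHORT, OVERLOAD, UV, OVERTEMP, SURGE, WELD
    bus_vmin_v: float      # minimum bus voltage during the event
    droop_ms: float        # how long the bus stayed below threshold
    ipk_a: float           # peak branch current
    trip_time_ms: float    # time from fault detection to switch-off
    temp_proxy_c: float    # switch temperature proxy
    retry_count: int       # retries so far for this channel


def encode(event):
    """Serialize one event to a JSON string with stable key order."""
    return json.dumps(asdict(event), sort_keys=True)


example = PduEvent(
    timestamp_ms=123456, channel=3, state="FAULT", trip_reason="SHORT",
    bus_vmin_v=21.4, droop_ms=0.8, ipk_a=142.0, trip_time_ms=0.2,
    temp_proxy_c=71.5, retry_count=1,
)
```

A fixed record with the same fields for every event type is what allows later tooling to compare events across channels and vehicles.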

Recommended fields (accelerate root-cause isolation)

  • Line impedance estimate: indicates harness/connection degradation and correlates with surge sensitivity.
  • Inrush signature: distinguishes normal charging from abnormal sustained overcurrent.
  • Contactor health flags: pull-in success time, chatter count, slow release, weld suspected confidence.

Event triggers (what should generate a record)

  • Protection events: trip, overload limit, latch-off, retry escalation.
  • Supply quality: brownout/UV, restart loop detection, bus recovery failure.
  • Surge indicators: clamp active, surge counter, abnormal V/I transient patterns.
  • Mechanical health: weld suspected, coil chatter, pull-in fail, slow release.

Interfaces (names only; protocol details belong elsewhere)

Logs and telemetry are commonly exported via CAN, Ethernet, or RS-485 depending on platform integration. The value is not the bus choice, but consistent timestamps and context fields that stay aligned across subsystems.

[Diagram: Power fault evidence chain: waveform → event → log → maintenance action. Waveforms: bus V, branch I, temp proxy. Event detector: trip, brownout, clamp active, weld suspected. Log record: timestamp, trip reason, Vmin / Ipk, state + retry, context. Action: inspect harness, replace contactor, tune policy, schedule service. Key: timestamp alignment + context fields make field debug fast.]
Figure H2-9 — A usable evidence chain connects waveforms to classified events, then to a timestamped log record with context, enabling fast maintenance actions instead of guesswork.

H2-10. Derating, Thermal, and Enclosure Reality (EN 50155 Style Thinking)

Rail PDUs often live in sealed, vibration-exposed enclosures where airflow is limited. Reliability depends on treating temperature as a control variable: thermal paths must be understood, and protection must be coordinated with service continuity. A “protect everything immediately” policy can cause repeated outages; a staged derating policy can preserve vital branches while preventing damage.

Thermal path: device to carbody (the only heat exit)

  • Junction → copper: local spreading and hot-spot management.
  • Copper → enclosure: interface resistance dominates in sealed boxes.
  • Enclosure → carbody: final sink; mounting and contact quality become reliability factors.

Derating strategy: thresholds, linear derate, staged shed

  • Warning threshold: log peak temperature and time-above-threshold for maintenance evidence.
  • Linear derate: limit branch current/power to keep bus stable while reducing stress.
  • Staged shed: shed non-vital loads first; preserve vital loads as long as safe.
  • Protect/shutdown: last resort when thermal path cannot remove heat.
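
The derating ladder can be expressed as a single function of the temperature proxy. The thresholds, the 50 % floor, and the return format are placeholder assumptions; real values depend on the device SOA and the enclosure's thermal path.

```python
def derating_stage(temp_c, i_max_a, t_warn=85.0, t_derate=95.0,
                   t_shed=105.0, t_off=115.0, min_frac=0.5):
    """Return (stage, current_limit_a, shed_non_vital) for one temperature.

    All thresholds are illustrative, not taken from any standard.
    """
    if temp_c >= t_off:
        return "SHUTDOWN", 0.0, True
    if temp_c >= t_shed:
        # Hold the minimum limit and shed non-vital loads first.
        return "SHED", i_max_a * min_frac, True
    if temp_c >= t_derate:
        # Linear derate from 100 % at t_derate to min_frac at t_shed.
        span = (temp_c - t_derate) / (t_shed - t_derate)
        return "DERATE", i_max_a * (1.0 - (1.0 - min_frac) * span), False
    if temp_c >= t_warn:
        # Warning only: log peak temperature and time above threshold.
        return "WARNING", i_max_a, False
    return "NORMAL", i_max_a, False
```

Applying this per load group with different thresholds is one way to let comfort branches reach the shed stage earlier than vital branches.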

Resolve “protection vs continuity” by grouping loads

Conflict is resolved by grouping channels (vital vs comfort) and assigning different derating policies. This prevents a single overheated branch from collapsing service for unrelated vital branches. Derating events should be recorded with the same timestamp and state fields used elsewhere so long-term patterns are visible.

Field feedback: make thermal history actionable

  • Peak temperature records: quantify excursions that shorten lifetime.
  • Thermal cycle counters: correlate expansion stress with intermittent faults.
  • Time-at-derate: proves that the system preserved service by policy rather than failing randomly.

[Diagram: Thermal path and derating policy in a sealed enclosure. Thermal path: junction → copper → enclosure → carbody (heat exits only through this chain). Derating ladder: warning (log peaks + time-above), linear derate (limit current/power), staged shed (non-vital first), protect / shutdown (last resort). Load grouping: vital (preserve longest), comfort (shed earlier); the policy uses these groups.]
Figure H2-10 — Thermal reality is a control problem: heat flows from junction to copper to enclosure to carbody. A staged derating ladder (warning → linear derate → shed non-vital → protect/shutdown) preserves vital loads while preventing damage, and produces actionable thermal history.

H2-11. Validation & Compliance Hooks (Rail Focus)

A rail PDU validation plan should be executable as a checklist and anchored in evidence. Passing is not only “no reset” or “no damage”. A stronger criterion is self-explainability: when a test fails, the PDU must provide enough telemetry to prove which mechanism caused the failure (surge clamp brownout, true short, miswire/backfeed, contactor chatter, or thermal derate).

Electrical abuse • EMC: source & victim • Vibration intermittent faults • Evidence-based pass/fail • Rail test language

Electrical abuse checklist (PDU-centric)

  • Inrush / precharge: prove that inrush stays below the protection envelope, and that the state machine transitions (OFF→PRECHARGE→ON) are stable.
  • Short / overload: prove fast trip or energy-limited behavior without collapsing the upstream bus; confirm selectivity (branch trips before bus).
  • Brownout: prove that critical branches either ride through or fail in a controlled way; confirm there is no restart oscillation.
  • Reverse polarity / miswire: prove reverse blocking and “de-energized verification” (a disconnected branch does not remain live via backfeed).
  • Surge / lightning-like transients: prove the intended energy path (to chassis/return) and that clamp action does not trigger false trips or contactor chatter.

Pass criterion A (classifiable events): every protection action yields a clear event type: trip reason + state + retry count + Vmin/Ipk snapshot.
Pass criterion B (time-aligned evidence): waveform window, event trigger, and log record share aligned timestamps so root cause can be reconstructed.
Pass criterion C (repeatable signatures): inrush, brownout, and intermittent faults produce stable patterns rather than random-looking logs.

EMC interface points: PDU as both emitter and victim

  • As an emitter: switching edges, loop area, and contactor demag spikes can radiate/conduct noise that disturbs local references and creates false trips.
  • As a victim: common-mode currents, chassis potential shifts, and clamp-induced bus droop can upset sensing thresholds and state machines.
  • Required hooks: clamp-active flag, bus Vmin + droop duration, trip reason, contactor chatter count, and state transitions around the event.

Mechanical environment: vibration-driven intermittent faults (and how to prove them)

  • Intermittent short (vibration/moisture): short Ipk spikes, frequent retries, and location-dependent repetition. Evidence: Ipk distribution + trip time + retry burst pattern.
  • Connector loosen / contact resistance: random droops and localized heating. Evidence: bus droop duration + impedance estimate proxy + temperature proxy trend.
  • Rule: mechanical faults should be detectable from event clustering and evidence consistency, not from a single “fault bit”.
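
Event clustering can start with something as simple as grouping event timestamps into bursts. The sketch below assumes timestamps in seconds and uses arbitrary gap and size limits; correlating bursts with vibration or location data is left out.

```python
def find_bursts(timestamps_s, gap_s=1.0, min_events=3):
    """Group event timestamps into bursts.

    A burst is a run of events where each is within gap_s of the previous
    one and the run has at least min_events members.
    Returns a list of (start_s, end_s, count) tuples.
    """
    bursts = []
    current = []
    for t in sorted(timestamps_s):
        if current and t - current[-1] > gap_s:
            # Gap exceeded: close the current run and start a new one.
            if len(current) >= min_events:
                bursts.append((current[0], current[-1], len(current)))
            current = []
        current.append(t)
    if len(current) >= min_events:
        bursts.append((current[0], current[-1], len(current)))
    return bursts
```

Isolated single trips are dropped by design, so what remains are the retry bursts that the list above associates with vibration-driven intermittent faults.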

Example MPN hooks (reference parts to implement measurable, testable behavior)

The following MPNs are examples commonly used to build measurable protection, clamping, sensing, and secure logging in harsh environments. Selection must match the platform bus voltage, energy level, thermal design, and safety rules.

High-side switch: Infineon BTS50055-1TMA / BTS7008-1EPP
eFuse / hot-swap: TI TPS2662 / TPS25982
Current sense amp: TI INA240A1 / INA281A1
Isolated amplifier: TI AMC1301 / AMC1311
Digital isolator: ADI ADuM141E / TI ISO7741
TVS diode: Littelfuse 5KP58A / 5KP64A (example surge clamps)
MOV: TDK EPCOS B72220S230K101 (example MOV)
GDT: Bourns 2038-09-SM-RPLF (example GDT)
Temp sensor: TI TMP117 / Maxim MAX31875
Secure element: Microchip ATECC608B (signing/attestation hook)
FRAM (event log): Fujitsu MB85RS64V / MB85RS256TY

Test item → expected telemetry evidence → pass/fail symptom

Test item (rail PDU view) | Expected telemetry evidence (minimum) | Pass/Fail symptom (what it should look like)
Inrush / precharge | State: OFF→PRECHARGE→ON; inrush signature; bus Vmin + droop duration; retry count | Pass: stable transition, no oscillation. Fail: repeated PRECHARGE resets, Vmin dips coincide with retries.
Hard short (branch) | Ipk + trip time; trip reason=SHORT; bus Vmin; upstream channel remains ON | Pass: branch trips fast, bus survives. Fail: bus collapses or wrong channel trips (no selectivity).
Soft overload | Current-limit mode flag; temperature proxy trend; trip reason=OVERLOAD; derate state | Pass: controlled limiting or timed trip. Fail: random resets without clear overload signature.
Brownout during pull-in | Bus droop duration; coil chatter count; state transition timing; retry burst pattern | Pass: either holds through or controlled open with clear reason. Fail: chatter + ambiguous resets.
Reverse polarity / miswire | Reverse-block event; branch V distribution; trip reason=MISWIRE/REVERSE; de-energized verification | Pass: no reverse conduction; branch truly de-energized. Fail: “OFF but still live” via unintended path.
Backfeed / OR-ing | Branch voltage holds after OFF; upstream node remains high; weld-suspected confidence stays low unless current also flows | Pass: backfeed classified correctly (not mislabeled as weld). Fail: false weld alarms or unsafe live branch.
Surge clamp event | Clamp active flag; bus Vmin; trip reason (none or surge); state remains stable; chatter count stays low | Pass: clamp event logged without false trip. Fail: UV resets, contactor chatter, or eFuse false trips.
Contactor weld suspected | Command OFF but current persists; residual voltage stability; aux-contact mismatch (if present) | Pass: multi-evidence confirms weld. Fail: weld-suspected flag triggered by backfeed-only voltage.
Thermal derate | Derate state; temp proxy; time-at-derate; load group actions (shed non-vital first) | Pass: staged derate preserves vital loads. Fail: sudden shutdown with no prior evidence trend.
Vibration intermittent fault | Event clustering by time; Ipk short spikes; retry burst signature; impedance proxy trend | Pass: intermittent fault becomes diagnosable pattern. Fail: scattered resets with no correlated evidence.

The goal is “testable + explainable”, not “one-time pass”.
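
One way to make the test table machine-checkable is to list the minimum evidence per test item and flag any log record that lacks it. The sketch below covers three rows only; the field names (`ipk_a`, `bus_vmin_v`, and so on) and the dictionary layout are assumptions for illustration, not a defined log format.

```python
# Minimum evidence per test item, taken from three rows of the table.
# Field names are hypothetical; map them to the real log schema.
REQUIRED_EVIDENCE = {
    "hard_short": {"ipk_a", "trip_time_us", "trip_reason", "bus_vmin_v"},
    "surge_clamp": {"clamp_active", "bus_vmin_v", "trip_reason", "chatter_count"},
    "thermal_derate": {"derate_state", "temp_proxy_c", "time_at_derate_s"},
}


def missing_evidence(test_item: str, record: dict) -> list:
    """Return the required fields that are absent or None in the record."""
    required = REQUIRED_EVIDENCE[test_item]
    present = {key for key, value in record.items() if value is not None}
    return sorted(required - present)


def is_self_explainable(test_item: str, record: dict) -> bool:
    """A result counts as explainable only if no required evidence is missing."""
    return not missing_evidence(test_item, record)


record = {"ipk_a": 180.0, "trip_time_us": 12, "trip_reason": "SHORT",
          "bus_vmin_v": None}
gaps = missing_evidence("hard_short", record)
```

Here the record would be rejected because the bus Vmin was never captured, even though the branch tripped correctly.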

[Figure: Validation Closed Loop (Rail PDU). Stimulus → Measure → Event → Log → Action → self-explainable pass/fail. Stimuli: surge, brownout, reverse, short/overload, vibration. PDU under test: measure (bus V, branch I, temp), detect (trip, derate, clamp), context (state, retry, time-aligned pre/post snapshot). Evidence: event reason, log timestamp, health counters. Rule: a failure is acceptable only if evidence proves why it failed.]
Figure H2-11 — Validation is a closed loop: rail-relevant stimuli must produce measurable evidence (time-aligned events, logs, counters). Pass/fail should be determined by whether failures are self-explainable, not by a single “did it reset” outcome.

H2-12. FAQs (Evidence-Linked, No Scope Creep)

Each answer follows the same rail field-debug structure: 1-sentence conclusion → 2 evidence checks → 1 first fix. Titles map back to the relevant evidence-chain chapters above.

FAQ 01 Trips immediately at power-up — inrush is real, or eFuse threshold/filtering is too sensitive? (→ H2-4/H2-5)

Conclusion: Most “instant trips” are threshold/filter sensitivity unless the inrush waveform shows a predictable single peak aligned with PRECHARGE/ON.

  • Evidence #1: Compare Ipk + trip time against the expected precharge/inrush envelope; real inrush is short and repeatable.
  • Evidence #2: Check state timing (OFF→PRECHARGE→ON) and whether trips occur at a fixed boundary (suggests filtering/windowing).
  • First fix: Add a startup blanking / two-stage threshold (looser during precharge, tighter after ON) and re-validate selectivity.
Example parts: TI TPS2662 / TPS25982 (eFuse/hot-swap), TI INA240A1 (current sense)
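
The first fix for FAQ 01 can be sketched as a small decision function. This is a hedged illustration: the class name `TwoStageLimit`, the state strings, and all numeric limits are assumptions, and a real eFuse would implement the equivalent in hardware or firmware.

```python
from dataclasses import dataclass


@dataclass
class TwoStageLimit:
    """Looser current limit during precharge, tighter limit once ON."""
    precharge_limit_a: float   # hypothetical: tolerates the inrush peak
    run_limit_a: float         # hypothetical: steady-state trip threshold
    blanking_s: float          # hypothetical: time after ON before tightening

    def should_trip(self, state: str, t_in_state_s: float, current_a: float) -> bool:
        if state == "PRECHARGE":
            return current_a > self.precharge_limit_a
        if state == "ON":
            # Inside the blanking window the looser limit still applies,
            # so a true short is caught but residual inrush is tolerated.
            if t_in_state_s < self.blanking_s:
                return current_a > self.precharge_limit_a
            return current_a > self.run_limit_a
        return False  # OFF: nothing to trip


limit = TwoStageLimit(precharge_limit_a=30.0, run_limit_a=10.0, blanking_s=0.005)
```

Keeping the looser limit active during blanking, rather than disabling protection, is a deliberate choice in this sketch so that a hard short at power-up still trips.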
FAQ 02 A branch short collapses the whole LV bus — selectivity is wrong, or return path common impedance? (→ H2-5/H2-2)

Conclusion: If the upstream bus droops before the branch trips, either trip selectivity is too slow or the return path/reference point is sharing impedance.

  • Evidence #1: Correlate bus Vmin/droop duration with branch trip time; bus-first droop indicates poor selectivity or wiring impedance.
  • Evidence #2: Compare “which channel tripped” vs fault location; wrong-channel trips often point to sensing reference/grounding issues.
  • First fix: Tighten branch fast-trip/I²t so the branch clears before bus collapse, and verify the current-sense reference location.
Example parts: TI INA281A1 (current sense), TI TPS2662 (eFuse), ADI ADuM141E (isolation for sense/control)
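
Evidence #1 for FAQ 02 amounts to comparing when the bus first crossed its undervoltage threshold against the branch trip time. The sketch below assumes a captured list of `(time_s, bus_voltage_v)` samples and a single droop per capture; the function names and the 18 V threshold are illustrative.

```python
from typing import List, Optional, Tuple


def droop_metrics(samples: List[Tuple[float, float]],
                  uv_threshold_v: float) -> Tuple[float, float, Optional[float]]:
    """Return (Vmin, droop duration, first crossing time).

    Duration is the span between the first and last sample below the
    threshold, which assumes one contiguous droop in the capture window.
    """
    vmin = min(v for _, v in samples)
    below = [t for t, v in samples if v < uv_threshold_v]
    if not below:
        return vmin, 0.0, None
    return vmin, below[-1] - below[0], below[0]


def bus_drooped_first(samples: List[Tuple[float, float]],
                      uv_threshold_v: float,
                      branch_trip_t_s: float) -> bool:
    """True if the bus crossed the UV threshold before the branch tripped."""
    _, _, first_cross = droop_metrics(samples, uv_threshold_v)
    return first_cross is not None and first_cross < branch_trip_t_s


capture = [(0.000, 24.0), (0.001, 20.0), (0.002, 15.0), (0.003, 16.0), (0.004, 23.0)]
vmin, duration, first_cross = droop_metrics(capture, uv_threshold_v=18.0)
```

A `True` result from `bus_drooped_first` points toward slow selectivity or shared return impedance, as described in the conclusion above.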
FAQ 03 No real short found, but it keeps hiccuping — intermittent harness short or thermal derate? (→ H2-5/H2-10)

Conclusion: Repeated hiccup is thermal if temperature proxy trends upward into a derate state; it is intermittent wiring if events cluster with short, random Ipk spikes.

  • Evidence #1: Look for temp proxy + derate state preceding hiccup; thermal causes show gradual drift, not single spikes.
  • Evidence #2: Check Ipk distribution and retry bursts; intermittent shorts show brief spikes with frequent retries and inconsistent intervals.
  • First fix: Add retry backoff + retry limit to avoid oscillation, then separate logs for “DERATE” vs “SHORT” signatures.
Example parts: TI TMP117 (temperature), TI TPS25982 (hot-swap/eFuse), TI INA240A1 (current sense)
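
A retry policy with backoff and a hard limit, as named in the first fix, might look like the following. The class name `RetryPolicy` and the default timings are assumptions; the point is only that delays grow and that retries eventually stop.

```python
from typing import Optional


class RetryPolicy:
    """Exponential backoff with a retry cap (illustrative defaults)."""

    def __init__(self, base_s: float = 0.1, factor: float = 2.0,
                 max_delay_s: float = 5.0, max_retries: int = 5) -> None:
        self.base_s = base_s
        self.factor = factor
        self.max_delay_s = max_delay_s
        self.max_retries = max_retries
        self.attempts = 0

    def next_delay(self) -> Optional[float]:
        """Delay before the next retry, or None when the channel should latch OFF."""
        if self.attempts >= self.max_retries:
            return None
        delay = min(self.base_s * self.factor ** self.attempts, self.max_delay_s)
        self.attempts += 1
        return delay

    def reset(self) -> None:
        """Call after a sustained healthy period to clear the retry history."""
        self.attempts = 0


policy = RetryPolicy(base_s=0.1, factor=2.0, max_delay_s=1.0, max_retries=5)
delays = [policy.next_delay() for _ in range(6)]
```

The sixth call returns `None`, which the caller would treat as "stop retrying and log a latched fault".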
FAQ 04 After a maintenance miswire, the system becomes unstable — weak reverse protection or an unblocked backfeed path? (→ H2-6)

Conclusion: If OFF branches remain biased or “half-alive”, backfeed/OR-ing is usually the root cause; if thresholds drift or trips become random, prior reverse stress may have damaged protection.

  • Evidence #1: Measure voltage distribution after OFF; sustained load-side voltage implies backfeed unless current also proves conduction.
  • Evidence #2: Compare trip reason + Ipk before/after the incident; a systematic shift suggests component stress/damage.
  • First fix: Implement/verify reverse blocking + OR-ing isolation so a disconnected branch cannot stay energized through backfeed.
Example parts: TI TPS2662 (eFuse w/ protection), Infineon BTS50055-1TMA (HSS), TI ISO7741 (isolator)
FAQ 05 Contactor chatters occasionally — coil undervoltage, or bus sag caused by surge clamping? (→ H2-8/H2-7)

Conclusion: Chatter is coil undervoltage unless chatter timestamps align with clamp-active + bus droop events, which indicates clamping-induced sag.

  • Evidence #1: Align coil chatter count / pull-in state with bus Vmin + droop duration.
  • Evidence #2: Check clamp active (surge indicator) near chatter; repeated correlation points to energy path and placement issues.
  • First fix: Increase coil hold margin (PWM hold strategy) and shorten/define the clamp energy loop to chassis/return.
Example parts: Littelfuse 5KPxx series (TVS example), Bourns 2038-xx (GDT example), TI TMP117 (temp logging)
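
The timestamp alignment in the two evidence checks can be approximated by counting how many chatter events fall near a clamp-active event. The helper name `fraction_near` and the 50 ms window are assumptions for this sketch.

```python
from typing import List


def fraction_near(events_s: List[float], reference_s: List[float],
                  window_s: float) -> float:
    """Fraction of events that lie within window_s of any reference event."""
    if not events_s:
        return 0.0
    hits = sum(
        any(abs(e - r) <= window_s for r in reference_s)
        for e in events_s
    )
    return hits / len(events_s)


chatter_times = [1.00, 5.00, 9.00]     # contactor chatter timestamps (s)
clamp_times = [1.02, 9.01]             # clamp-active timestamps (s)
ratio = fraction_near(chatter_times, clamp_times, window_s=0.05)
```

A ratio close to 1 over many events supports the clamping-induced sag explanation; a ratio near 0 leaves coil undervoltage as the default suspect.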
FAQ 06 Contactor releases too slowly and arcing worsens — wrong demag clamp choice or PWM hold policy? (→ H2-8)

Conclusion: Slow release is typically dominated by the demagnetization clamp path; PWM hold only matters if it delays or re-triggers the release window.

  • Evidence #1: Measure release time (command OFF → coil current decay / contact opening proxy) for each clamp option.
  • Evidence #2: Verify PWM hold exit timing and ensure there is no “hold residue” during OFF transitions.
  • First fix: Move from diode-only clamp to TVS or controlled active clamp to hit a bounded release time target.
Example parts: Littelfuse 5KPxx series (TVS example), Infineon BTS7008-1EPP (HSS option for coil supply control)
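
Why a higher clamp voltage shortens release can be shown with a first-order coil model. The sketch assumes a series R-L coil freewheeling into a clamp that holds a constant opposing voltage, so L·di/dt = −(R·i + Vclamp); all component values are hypothetical, and real release time also depends on armature mechanics.

```python
import math


def release_time_s(l_h: float, r_ohm: float, i_hold_a: float,
                   i_release_a: float, v_clamp_v: float) -> float:
    """Time for coil current to decay from i_hold_a to i_release_a.

    Closed-form solution of L*di/dt = -(R*i + Vclamp):
        t = (L/R) * ln((I_hold*R + Vclamp) / (I_release*R + Vclamp))
    """
    if not 0.0 <= i_release_a < i_hold_a:
        raise ValueError("need 0 <= i_release_a < i_hold_a")
    denominator = i_release_a * r_ohm + v_clamp_v
    if denominator <= 0.0:
        raise ValueError("release current or clamp voltage must be positive")
    return (l_h / r_ohm) * math.log((i_hold_a * r_ohm + v_clamp_v) / denominator)


# Hypothetical 24 V coil: 2 H, 48 ohm, 0.5 A hold current, 0.1 A drop-out current.
t_diode = release_time_s(2.0, 48.0, 0.5, 0.1, v_clamp_v=0.7)   # diode-only clamp
t_tvs = release_time_s(2.0, 48.0, 0.5, 0.1, v_clamp_v=58.0)    # TVS-level clamp
```

With these assumed values the diode-only clamp takes several times longer to reach the drop-out current than the TVS-level clamp, which is the effect Evidence #1 is meant to measure.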
FAQ 07 False trips are worse on stormy days — clamp loop too long, or common-mode current takes the wrong path? (→ H2-7/H2-11)

Conclusion: Weather-correlated false trips usually indicate an energy-path problem (loop length/return) or common-mode current coupling into sensing references.

  • Evidence #1: Check clamp active and whether bus Vmin dips coincide with trip events (misoperation by sag).
  • Evidence #2: Compare which channels trip; widespread or “wrong-channel” trips suggest common-mode/reference disturbance.
  • First fix: Shorten the clamp path and enforce a clear to-chassis / to-return discharge route, then re-run the validation checklist.
Example parts: TDK EPCOS B722xx (MOV example), Bourns 2038-xx (GDT example), TI ISO7741 (isolation for control I/O)
FAQ 08 TVS/MOV runs hot or fails — insufficient energy rating, or the loop dumps too much energy into the clamp? (→ H2-7)

Conclusion: Clamp overheating is more often an energy-path/placement issue than a pure rating issue; a bad loop can force the clamp to absorb most of the surge energy.

  • Evidence #1: Track clamp active counters and correlate with temperature rise; frequent clamp activation suggests repeated energy absorption.
  • Evidence #2: Observe bus behavior; if clamping causes deep droops or resets, the energy is not being routed correctly.
  • First fix: Redesign the discharge loop (short, low-inductance, defined return) before upsizing clamp ratings.
Example parts: Littelfuse 5KPxx (TVS example), TDK EPCOS B722xx (MOV example)
FAQ 09 Remote telemetry says “overcurrent”, but nothing is found on site — which log fields or trigger windows are missing? (→ H2-9)

Conclusion: Without Ipk/trip-time plus a short pre/post capture window, “overcurrent” is a label, not evidence—field teams cannot confirm or reproduce the cause.

  • Evidence #1: Verify the log contains Ipk + trip time and the exact trip reason (SHORT vs OVERLOAD vs UV side effects).
  • Evidence #2: Confirm a pre/post snapshot (bus Vmin, state, retry count) to distinguish true load fault from brownout/clamp-induced misoperation.
  • First fix: Add a single “fault record” template that always logs: Vmin, Ipk, trip time, state, retry, timestamp.
Example parts: Fujitsu MB85RS64V (FRAM log), Microchip ATECC608B (signed event hook)
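
The single "fault record" template from the first fix could be expressed as a fixed structure so that no field can be silently omitted. The class and field names are assumptions; `channel` and `trip_reason` are added here for context and are not part of the six-field list above.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class FaultRecord:
    """Minimum evidence logged for every trip (field names are illustrative)."""
    timestamp_us: int     # time-aligned with the waveform capture
    channel: int          # assumption: branch index, not in the source list
    state: str            # PDU state at the trip, e.g. "ON"
    trip_reason: str      # e.g. "SHORT", "OVERLOAD", "UV"
    bus_vmin_v: float     # lowest bus voltage in the pre/post window
    ipk_a: float          # peak branch current
    trip_time_us: int     # detection-to-open time
    retry_count: int      # retries since the last healthy period


def encode_record(record: FaultRecord) -> str:
    """Serialize with stable key order so records can be diffed or signed."""
    return json.dumps(asdict(record), sort_keys=True)


rec = FaultRecord(timestamp_us=1_000_250, channel=3, state="ON",
                  trip_reason="SHORT", bus_vmin_v=19.4, ipk_a=42.5,
                  trip_time_us=15, retry_count=0)
line = encode_record(rec)
```

Because every field is mandatory in the constructor, a record without Ipk or trip time cannot be created in the first place.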
FAQ 10 Residual voltage remains after branch shutdown — backfeed, or a load energy-storage “illusion”? How to tell? (→ H2-6/H2-9)

Conclusion: Backfeed is suspected only when OFF-state voltage is sustained and accompanied by current; pure energy storage shows a predictable decay with near-zero current.

  • Evidence #1: Check OFF-state voltage shape: stable plateau suggests backfeed; monotonic decay suggests stored energy discharge.
  • Evidence #2: Measure OFF-state current: sustained current confirms conduction path; near-zero current indicates storage-only behavior.
  • First fix: Implement “de-energized verification” logic that requires V + I consistency before declaring weld/backfeed faults.
Example parts: TI INA240A1 (current evidence), TI TPS2662 (blocking/monitoring hook)
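
The V + I consistency check in FAQ 10 can be sketched as a small classifier over the OFF-state voltage samples and the measured OFF-state current. The labels, the 50 mA current floor, and the 50 % decay criterion are assumptions; the logic only mirrors the two evidence checks above and returns "inconclusive" when they disagree.

```python
from typing import Sequence


def classify_off_state(v_samples: Sequence[float], i_off_a: float,
                       i_floor_a: float = 0.05,
                       decay_fraction: float = 0.5) -> str:
    """Classify residual OFF-state voltage using voltage shape and current."""
    if len(v_samples) < 2:
        raise ValueError("need at least two voltage samples")
    # Monotonic decay: each sample is no higher than the previous one.
    monotonic = all(b <= a for a, b in zip(v_samples, v_samples[1:]))
    decayed = monotonic and v_samples[-1] <= decay_fraction * v_samples[0]
    conducting = abs(i_off_a) > i_floor_a

    if conducting and not decayed:
        return "sustained_conduction"   # plateau plus current: a live path exists
    if decayed and not conducting:
        return "stored_energy_decay"    # load capacitance discharging
    return "inconclusive"               # evidence disagrees: gather more data


plateau = classify_off_state([24.0, 23.9, 24.0, 23.95], i_off_a=0.8)
decay = classify_off_state([24.0, 15.0, 8.0, 3.0], i_off_a=0.0)
```

Returning "inconclusive" instead of forcing a label is what prevents a voltage-only observation from being declared a weld or backfeed fault.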
FAQ 11 Works in winter but derates often in summer — poor enclosure heat path or temperature sensing bias? (→ H2-10)

Conclusion: If measured temperature correlates tightly with load current/power, the heat path is real; if derate triggers without matching power trends, sensor placement or bias is likely.

  • Evidence #1: Compare temp proxy vs current/power; real thermal limits show consistent correlation and lag.
  • Evidence #2: Check time-at-derate and peak trends; spiky triggers with low power suggest measurement bias or local hotspots unrelated to the main path.
  • First fix: Recalibrate thresholds and verify sensor placement, then improve the enclosure-to-carbody thermal interface if correlation confirms heat-path limits.
Example parts: TI TMP117 (temperature), Maxim MAX31875 (temperature), TI TPS25982 (power-path protection)
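
Evidence #1 for FAQ 11 is a correlation check between the temperature proxy and load power. The sketch uses a plain Pearson coefficient with no lag compensation; the 0.7 decision threshold and the label strings are assumptions, not validated limits.

```python
import math
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient; returns 0.0 if either series is flat."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def derate_hint(power_w: Sequence[float], temp_c: Sequence[float],
                threshold: float = 0.7) -> str:
    """Suggest where to look first, based on power/temperature correlation."""
    if pearson(power_w, temp_c) >= threshold:
        return "heat_path"
    return "sensor_or_hotspot"


power = [100.0, 200.0, 300.0, 400.0]
temp_tracking = [50.0, 60.0, 70.0, 80.0]      # rises with load
temp_unrelated = [85.0, 50.0, 90.0, 52.0]     # spiky, unrelated to load
```

Since real thermal behavior lags the load, shifting the temperature series by the expected thermal time constant before correlating would be a natural refinement.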
FAQ 12 Compliance tests pass, but field resets still happen — brownout policy or missing event context? (→ H2-11/H2-9)

Conclusion: Field-only resets usually require two fixes: a brownout/retry policy that avoids oscillation, and richer event context so the reset can be proven as brownout vs misoperation.

  • Evidence #1: Log bus droop duration + restart loop counters; repeated droops aligned with retries indicate policy-induced oscillation.
  • Evidence #2: Confirm the log includes state + timestamp alignment and a pre/post snapshot; otherwise the root cause stays ambiguous.
  • First fix: Add retry backoff + hard limit, and enforce a minimum event record: Vmin, Ipk, state, reason, retry, timestamp.
Example parts: Microchip ATECC608B (signed events), Fujitsu MB85RS256TY (FRAM log)

Tip: FAQ answers stay short on purpose—each one points back to the upstream chapters where waveforms, thresholds, selectivity, clamping paths, and logging rules are defined.