PoE PD Controller: Classification, Isolated DC-DC & Event Logging

Q: Class is correct, but the PD keeps rebooting under load — inrush retry storm or DC-DC hiccup?

Likely cause: Brownout loop (UVLO hysteresis too small) or converter hiccup colliding with PD retry/backoff.
Quick check: Capture Vin/Iin/Vbulk and log state transitions around the reboot; verify that the reboot aligns with a UVLO or OCP/OTP event.
Fix: Increase UVLO hysteresis (ΔV=X), tune the inrush profile, and enforce retry backoff (Tbackoff=X) before re-enable.
Pass criteria: No reboot for X minutes at Y% load step; retry count ≤ X per hour; Vbulk never drops below X V.

Q: Passes on bench, fails on long cable — marginal MPS or startup timing?

Likely cause: Cable drop increases input ripple; marginal MPS current during light-load phases triggers disconnect.
Quick check: Compare long-cable and bench runs: measure Vin ripple, record MPS-related events, and run a controlled sleep/load profile.
Fix: Increase hold-up (Cbulk=X) or adjust load shaping to keep MPS, and avoid startup windows with insufficient margin.
Pass criteria: No disconnect for X hours on a Z-meter cable; Vin_min ≥ X V during worst transient; MPS events = 0.

Q: Classification is unstable between boots — leakage/tolerance drift or input bridge drops?

Likely cause: Leakage paths or DET/CLS bias interact with bridge drop and the tolerance stack.
Quick check: Measure signature/class currents across X cold boots, varying humidity/temperature; compare the diode-bridge path with the ideal-bridge path if available.
Fix: Tighten the leakage budget, verify the Rsig tolerance stack, and re-bias the DET/CLS network to meet the window with margin.
Pass criteria: Class result identical across X boots and Y conditions; class current stays within ±X% window.

Q: SR improves efficiency but creates random faults — dead-time too tight or ringing false turn-on?

Likely cause: SR dead-time margin too small, or drain ringing couples into SR gate sense and causes false turn-on.
Quick check: Scope SR gate and secondary current; look for overlap and false pulses during ringing at worst load/line.
Fix: Increase dead-time (DT=X ns), add damping/snubber, and tighten SR gate routing/return.
Pass criteria: SR overlap time = 0; minimum DT margin ≥ X ns across Y corners; random fault rate ≤ X/day.

Q: Light-load sleep saves power but gets disconnected — MPS missing due to burst mode?

Likely cause: Burst mode or deep sleep pulls the average input current below the PD-side MPS requirement.
Quick check: Log MPS-related events and measure the input current profile over the sleep duty-cycle.
Fix: Add a controlled maintenance load and schedule periodic wake if needed; keep the AUX rail from collapsing into MPS loss.
Pass criteria: No disconnect for X hours with sleep duty-cycle Y%; maintenance pulses every X ms (or bleeder ≤ X mW).

Q: Fault pin toggles but logs show nothing — brownout wiped log or interrupt not latched?

Likely cause: Log write happens too late and gets lost during brownout, or INT/fault edge is not latched.
Quick check: Force a repeatable fault and verify log commit time, hold-up time, and INT latch behavior.
Fix: Log first-fault immediately, add brownout-safe commit, and latch INT until host ACK.
Pass criteria: First-fault record retained after X ms brownout; INT asserted ≥ X ms or until ACK; missing logs = 0/X trials.

Q: Thermal looks fine, still trips OTP — hotspot near SR/rectifier or sensor placement mismatch?

Likely cause: Local hotspot exceeds OTP while the measured point stays cool.
Quick check: Correlate OTP time with IR scan/thermocouples at SR devices, transformer, and sensor location.
Fix: Improve spreading (copper/vias), adjust SR timing to reduce loss, and align sensor placement to hotspot.
Pass criteria: Hotspot ΔT ≤ X °C at ambient Y °C; no OTP in X minutes at Y% load.

Q: Inrush looks within limit, yet startup fails — UVLO hysteresis or bulk cap ESR/ESL?

Likely cause: Vbulk droops due to ESR/ESL and triggers UVLO; or UVLO hysteresis is too small and creates oscillation.
Quick check: Capture Vbulk droop and UVLO threshold crossing; compare capacitor ESR/ESL and placement inductance.
Fix: Increase hysteresis (ΔV=X), reduce loop inductance, and choose cap ESR/ESL to meet hold-up window.
Pass criteria: Startup succeeds across X cold boots; Vbulk_min ≥ X V; UVLO transitions ≤ X per boot.

Q: Output short causes long recovery time — latch-off policy too strict or retry backoff too long?

Likely cause: Latch-off requires manual intervention, or conservative backoff delays re-power after transient shorts.
Quick check: Inject a controlled short for X ms; log fault code, action, and backoff timing.
Fix: Use hiccup for transient faults and latch for hard faults; tune backoff and limit restart attempts per window.
Pass criteria: Recovery ≤ X s for transient short; restart attempts ≤ X per Y minutes; first-fault cause recorded.

Q: Field units fail after an ESD event but still power — leakage shift breaks signature?

Likely cause: Post-ESD leakage increases and shifts signature/class behavior or corrupts DET/CLS bias.
Quick check: Before/after ESD: measure leakage and compare signature/class stability; correlate with boot failures.
Fix: Improve protection/leakage budget and add margin to signature/class networks.
Pass criteria: After IEC ESD stress, class/signature stable within ±X%; boot success ≥ X% across Y boots; leakage ≤ X µA.


A PoE PD controller turns Ethernet power into predictable, field-reliable rails by managing detect/class, inrush/startup, MPS, and fault/logging.

This page focuses on PD-side design hooks for integrated isolated DC-DC (including sync rectification), so power-up stays stable, disconnects are avoided, and every field failure leaves a usable “black-box” record.

H2-1. Definition & Scope of a PoE PD Controller (with Isolated DC-DC + Sync Rectification)

A PoE PD controller is the power-admission brain of a Powered Device: it negotiates power (PD-side), performs safe power-up, keeps the port alive (MPS), and exposes protection + telemetry—so the isolated converter can start reliably and recover predictably.

Key takeaways

PD controller ≠ PHY/MAC. PHY/MAC moves Ethernet data; the PD controller controls PoE power entry and power-state behavior.
“Power-on” is not the end. Inrush, MPS, retry policy, and fault recovery decide field reliability.
Isolated DC-DC + SR raises the bar. Startup sequencing, fault policy, and telemetry must cooperate with the converter and the load.

What a PD controller actually controls (PD-side responsibilities)

Signature & classification: presents the correct PD signatures so the port can grant power.
Inrush / hot-swap admission: limits input surge, ramps the bulk capacitor, and prevents repeated brownout loops.
MPS & disconnect behavior: maintains port validity during light-load and sleep modes, and handles safe removal.
Protection & recovery policy: UVLO/OVP/OCP/OTP responses, latch vs hiccup vs backoff retry.
Telemetry & event hooks: exposes PG/fault states, counters, and logs for service and forensics.

Responsibility map (to prevent “wrong-owner” debugging)

PD Controller (this page)

Classification, inrush/hot-swap, MPS, fault policy, PG/fault/telemetry, event logging and converter coordination.

Ethernet PHY / MAC (not owned here)

Link training, packet IO, timing/clocking for Ethernet data—separate from PoE power-admission behavior.

PSE policy & port power management (not owned here)

Per-port allocation, detection policy, and network-driven power advertisement belong to the PSE controller page.

When isolated DC-DC and sync rectification (SR) support matters

Multiple isolated rails: an isolated main rail plus a housekeeping rail needs clean sequencing and PG/fault gating.
Higher efficiency at low headroom: SR reduces secondary losses, but requires margin against ringing and false turn-on.
Field reliability: deterministic retry/backoff plus event logging prevents “mystery resets” and speeds service triage.

Practical integration options: either an external isolated flyback or active-clamp converter, with the PD controller coordinating enable/PG, or tighter PD-led sequencing with explicit fault policy and telemetry hooks.

Stop line (scope boundary)

This page stays strictly on the PD side: signature/classification, inrush/MPS/disconnect, isolated DC-DC coordination (including SR), and fault/event logging. PSE policy, magnetics/ESD/surge deep-dives, and TSN/PTP topics are handled on their dedicated pages.

PoE PSE Controller · Magnetics & CMC · Low-C TVS

System view: power path (blue) and control/telemetry hooks (dashed) to support deterministic startup and field diagnostics.

H2-2. Power Path & Interfaces: From RJ45 to Isolated Rails

Treat the PD as a traceable chain. Every field failure should map to one block on the path (input admission, bulk energy storage, isolation conversion, or telemetry/control). This section defines that chain.

The two paths you must keep distinct

Power path (energy): PoE input → hot-swap/inrush → bulk capacitor → primary switch → transformer → SR → isolated rails.
Control/telemetry path (behavior): DET/CLS decisions, gate control, enable/PG/fault, and I²C/PMBus reporting.

PD input front-end: diode bridge vs ideal bridge (what changes in practice)

Diode bridge: simplest polarity handling, but drops headroom and concentrates heat—classification margins and startup robustness become more sensitive.
Ideal bridge: reduces losses and improves thermal margin, but adds control behavior that must remain predictable across transients and fault recovery.

Design rule: choose the input method based on headroom and thermal budget first, then validate that detection/classification behavior remains stable across corners.

Bulk capacitor placement (energy store + startup stress)

Bulk is both a reservoir and a load: it stabilizes the converter input, but it is the main reason inrush exists.
Placement is part of the control loop: distance increases loop inductance, distorts inrush shape, and can trigger protection or brownout loops.
Size must match policy: bulk sizing must be consistent with inrush limit and retry/backoff strategy (large bulk + aggressive retry is a reboot amplifier).

Interfaces (grouped by function, not by pin list)

DET / CLS (power negotiation signals)

Controls how the PD presents its identity. Instability here shows up as inconsistent classification, intermittent power grant, or “works once, fails next boot.”

Gate drive / Hot-swap (power admission control)

Shapes inrush and defines safe startup. Failures look like reboot loops, “clicking” power, or brownout at load transients.

AUX / housekeeping (control survival rail)

Keeps state, logging, and control alive through short disturbances. A weak housekeeping strategy often causes missing logs and non-reproducible field behavior.

PG / Fault + I²C / PMBus (diagnostics)

Turns symptoms into evidence. Good logging design makes “random resets” measurable: first-fault cause, retry count, temperature peaks, and state transitions.

Output rails: isolated main vs auxiliary housekeeping (why both matter)

Isolated main rail: powers the payload. Its behavior dominates efficiency and thermal performance.
Housekeeping rail: powers control and diagnostics so state and logs survive short droops and controlled shutdown.

Design hook: treat housekeeping as the “black-box power supply.” If it collapses first, evidence disappears.

Stop line

This section defines the PD-side power/control chain only. Detailed magnetics/ESD/surge layout rules and PSE allocation policy are linked elsewhere and not expanded here.

Read failures left-to-right: input admission → bulk/inrush behavior → isolation conversion → rails and diagnostics.

H2-3. IEEE 802.3 PD Handshake (Only What PD Designers Must Use)

This section keeps only the PD-side handshake logic that directly constrains hardware design: what the PD must present in each phase, and what “Class” actually constrains (power budget, inrush behavior, and MPS/maintain requirements).

Key takeaways (PD designer view)

Handshake is phase-based: Detect → Class → Power-on → Maintain. Each phase has a PD “presentation” that is checked in a window.
Class is a constraint bundle: it bounds the budget, influences inrush limits, and affects MPS behavior during light-load or sleep.
Common pitfall: treating Class as “guaranteed usable power” ignores path losses and policy limits (cable, bridge, thermal, DC-DC efficiency, protection).

The four phases (what the PD must “present”)

Detect

The PD presents a signature (modeled as Rsig plus leakage/background paths). The check is sensitive to leakage and tolerance stack.

Class

The PD presents a classification current shape within a defined window. Measurement pitfalls include parasitics, bridge drops, and event-driven drift.

Power-on

The PD transitions to power conversion with controlled inrush. Bulk energy storage, gate shaping, and retry policy decide whether startup is deterministic or a reboot loop.

Maintain

The PD must remain “valid” under light load and sleep strategies. MPS constraints interact with burst modes, housekeeping rails, and logging retention.

What “Class” constrains (design inputs, not marketing labels)

Power budget constraint: caps average draw assumptions for system sizing and thermal planning.
Inrush behavior constraint: limits how quickly bulk and the converter input can be energized without violating windows.
MPS constraint: restricts “too-light” operating modes; maintain strategies must keep the port alive.

Sanity rule: usable payload power must be computed after subtracting path losses and policy headroom: cable drop + input bridge/ideal-bridge behavior + converter efficiency + thermal derating + protection/retry policy.
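
The sanity rule can be sketched as a short loss chain. This is a minimal illustration, not a sizing tool: the function name, the parameter names, and every number below are assumptions chosen for the example, and the model treats each loss as a simple series term (cable resistance, a fixed bridge drop, a single efficiency figure, and two multiplicative derating factors).

```python
def usable_payload_power_w(v_pse_v, r_cable_ohm, i_in_a, v_bridge_v,
                           converter_eff, thermal_derate, policy_headroom):
    """Estimate payload power after each loss named in the sanity rule."""
    v_pd_v = v_pse_v - i_in_a * r_cable_ohm             # cable drop
    p_after_bridge_w = (v_pd_v - v_bridge_v) * i_in_a   # bridge or ideal-bridge drop
    p_converter_out_w = p_after_bridge_w * converter_eff
    p_derated_w = p_converter_out_w * thermal_derate
    return p_derated_w * (1.0 - policy_headroom)        # protection/retry headroom


# Illustrative numbers only (assumptions, not datasheet or standard values).
payload_w = usable_payload_power_w(
    v_pse_v=50.0, r_cable_ohm=12.5, i_in_a=0.5, v_bridge_v=1.4,
    converter_eff=0.88, thermal_derate=0.90, policy_headroom=0.10)
print(f"usable payload power: {payload_w:.2f} W")
```

With these example inputs the source delivers 25 W at the PSE side, yet roughly 15 W remains for the payload, which is the gap the "Class equals usable power" assumption hides.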

Typical misconception: “Class equals guaranteed usable power”

Class is a handshake promise checked in defined windows; it is not a constant “always available” payload guarantee.
Field headroom shrinks with temperature, cable length, connector aging, and conversion losses.
Policy matters: inrush limiting and MPS compliance can force conservative behaviors that reduce usable payload power.

Stop line

This is a PD-side, minimal-use view of the handshake. It preserves phase semantics and constraints only. Full standard text and PSE allocation policy are not reproduced here.

Use phase mapping to debug: mis-detect/mis-classify → signature windows; reboot loops → inrush and bulk; late drops → MPS/maintain behavior.

H2-4. Detection & Classification Circuit Design (Accuracy, Tolerance, and Failure Modes)

Mis-detect and mis-classify rarely come from a single “bad part.” Robust PD design treats detection/classification as a three-layer error chain: tolerance stack, leakage/parasitics, and event-driven drift (plug, cold start, humidity, and post-ESD shifts).

Three-layer error chain (turn symptoms into a test plan)

Layer 1 — tolerance stack: Rsig tolerance, reference/threshold spread, temperature drift.
Layer 2 — leakage & parasitics: ESD clamp leakage, bridge behavior, PCB surface leakage, unintended parallel paths.
Layer 3 — multi-event interaction: plug/unplug + cold start + humidity + post-ESD drift creating corner-only failures.

Signature path design (tolerance stack + leakage reality)

Rsig is never alone: any parallel leakage path shifts the effective signature and narrows margin.
Temperature creates direction: leakage typically increases with temperature; Rsig drift depends on its technology and coefficient.
ESD structures are part of the circuit: clamp leakage (especially after stress) can change detect/class results without visible damage.

Design intent: ensure signature margins survive worst-case stacks: Rsig tolerance + drift + leakage + contamination + bridge variation.
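
A worst-case stack check along these lines can be sketched as below. It models only Rsig tolerance and a single parallel leakage path; bridge variation and contamination would add further terms. The window limits used here (23.75 kΩ to 26.25 kΩ) are the commonly quoted PD detection-signature range and are treated as an assumption to verify against the applicable IEEE 802.3 clause; the function names and the example leakage values are hypothetical.

```python
SIG_WINDOW_OHM = (23_750.0, 26_250.0)  # assumed detection window; verify against the standard


def parallel(r1_ohm, r2_ohm):
    """Equivalent resistance of two resistors in parallel."""
    return r1_ohm * r2_ohm / (r1_ohm + r2_ohm)


def signature_extremes(r_nom_ohm, tol, r_leak_min_ohm):
    """Return (lowest, highest) effective signature resistance."""
    r_high = r_nom_ohm * (1.0 + tol)                           # high-tolerance part, no leakage
    r_low = parallel(r_nom_ohm * (1.0 - tol), r_leak_min_ohm)  # low-tolerance part, worst leakage
    return r_low, r_high


def within_window(r_low, r_high, window=SIG_WINDOW_OHM):
    return window[0] <= r_low and r_high <= window[1]


# Hypothetical example: 24.9 kΩ, 1 % resistor with two leakage assumptions.
clean = signature_extremes(24_900.0, 0.01, 1_000_000.0)   # about 1 MΩ leakage
stressed = signature_extremes(24_900.0, 0.01, 500_000.0)  # leakage doubled, e.g. post-stress
print(within_window(*clean), within_window(*stressed))
```

In this example the same resistor passes with 1 MΩ of leakage and falls out of the window at 500 kΩ, which illustrates why the leakage budget belongs in the stack from the start.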

Classification current shaping: common measurement pitfalls

Edge spikes: fast transients can “look like” over-current inside the window even when average current is correct.
Plateau instability: a wobbling current plateau behaves like noise at the classifier’s measurement point.
Cold-start coupling: undervoltage and startup sequencing can distort current shape and cause intermittent class results.

Practical goal: current shape should be stable and repeatable across temperature and event sequences, not only on a warm bench.

Multi-event interaction (why “only sometimes” failures happen)

Plug/unplug: contact bounce and transient paths change what the detector sees.
Humidity/contamination: surface leakage adds a hidden parallel path that is absent on a clean bench.
Post-ESD drift: the system may still “work” but margins shrink; detect/class failures become corner-triggered.

Debug principle: isolate which layer dominates first (tolerance vs leakage vs event interaction), then design a targeted test sequence.

Stop line

This section covers detection/classification accuracy and failure mechanisms at circuit level. Detailed IEC test setup and layout rules live in the protection/magnetics pages.

Design for margins across the full stack: tolerance + drift + leakage + event interactions. Use TP1–TP3 to localize where the signature shifts.

H2-5. Inrush, Hot-Swap, and Safe Startup Sequencing

Startup failures are usually not “random.” They are outcomes of three coupled constraints: inrush limiting, bulk energy, and the retry/backoff state machine. This section turns those constraints into design inputs, waveforms, and verification hooks.

Failure map (symptom → phase)

Plug-in reboot loop: INRUSH ↔ UVLO oscillation or timeout before Vbulk reaches the RUN threshold.
Cold-start fails, warm-start OK: margin shrinks as Rds(on), leakage, and reference drift shift the inrush slope and thresholds.
Bigger bulk makes it worse: longer inrush window + added drop/heat increases retry probability and storm risk.
Starts once, then never again: thermal accumulation or lockout after repeated FAULT events.

Inrush current limiting strategy (what the PD enforces)

Constraint target: limit Iin peak/average and control dVbulk/dt to stay inside handshake/startup windows.
Design intent: charge bulk fast enough to reach RUN, but not so aggressively that upstream checks or protection trip.
Waveform requirement: a stable inrush plateau is safer than a spiky peak with the same average.

Placeholders for verification: Iin_limit = X, inrush_max_time = X, Vbulk_OK = X, UVLO = X.

Bulk capacitance sizing logic (PD-side, policy-agnostic)

Too small: Vbulk sags during load steps and falls into UVLO/brownout loops.
Too large: longer charge time under Iin_limit increases timeout risk and heats the hot-swap path during retries.
Matched sizing: bulk size must be compatible with (Iin_limit × allowed time) so Vbulk reaches Vbulk_OK with margin.

Practical rule: validate bulk not only for steady-state hold-up, but also for “time-to-RUN” under worst-case Vin and hottest path resistance.
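
As a first-order check of "time-to-RUN", the bulk capacitor can be treated as charging at a constant current equal to Iin_limit, giving t = C·ΔV / I. The sketch below assumes exactly that (no load current during charging, ideal current limit); the function names and the example values for Cbulk, Vbulk_OK, Iin_limit, and inrush_max_time are placeholders chosen for illustration.

```python
def time_to_run_s(c_bulk_f, v_start_v, v_bulk_ok_v, i_in_limit_a):
    """Constant-current charge time from v_start to Vbulk_OK: t = C * dV / I."""
    return c_bulk_f * (v_bulk_ok_v - v_start_v) / i_in_limit_a


def meets_inrush_window(c_bulk_f, v_start_v, v_bulk_ok_v,
                        i_in_limit_a, inrush_max_time_s):
    """True if bulk reaches Vbulk_OK inside the allowed inrush time."""
    t = time_to_run_s(c_bulk_f, v_start_v, v_bulk_ok_v, i_in_limit_a)
    return t <= inrush_max_time_s


# Hypothetical comparison: same limit and window, two bulk sizes.
small_ok = meets_inrush_window(100e-6, 0.0, 40.0, 0.1, 0.050)  # about 40 ms
large_ok = meets_inrush_window(470e-6, 0.0, 40.0, 0.1, 0.050)  # about 188 ms
print(small_ok, large_ok)
```

Any load drawn during charging, or a lower current limit at temperature, lengthens the real charge time, so this estimate is a lower bound to compare against captured waveforms.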

Startup state machine (retry/backoff, UVLO hysteresis, brownout loops)

UVLO hysteresis: insufficient hysteresis creates oscillation near thresholds (RUN↔FAULT).
Retry/backoff: backoff prevents thermal accumulation and avoids “retry storms.”
Brownout loop: load step pulls Vbulk below UVLO, causing repeated restarts unless policy breaks the loop.

Logging hooks: retry_cnt, last_fault_code, last_Vbulk_min, last_inrush_time (readable via the host interface; fault pins only signal that an event occurred).
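
The effect of UVLO hysteresis on RUN↔FAULT oscillation can be explored offline by replaying a captured Vbulk trace through a simple comparator model. The sketch below is an assumption-laden illustration: the function name, the rising threshold, and the sample trace are hypothetical, and a real controller adds deglitch timers that this model ignores.

```python
def count_uvlo_trips(vbulk_samples_v, v_rising_v, hysteresis_v):
    """Count RUN -> FAULT transitions for a comparator with hysteresis."""
    v_falling_v = v_rising_v - hysteresis_v
    running = False
    trips = 0
    for v in vbulk_samples_v:
        if not running and v >= v_rising_v:
            running = True            # Vbulk crossed the rising threshold: enter RUN
        elif running and v < v_falling_v:
            running = False           # Vbulk fell below the falling threshold: UVLO trip
            trips += 1
    return trips


# Hypothetical trace: each load step pulls Vbulk from 38.0 V down to 36.5 V.
trace_v = [30.0, 38.0, 36.5, 38.0, 36.5, 38.0, 36.5, 38.0]
print(count_uvlo_trips(trace_v, v_rising_v=37.0, hysteresis_v=0.2))  # narrow hysteresis
print(count_uvlo_trips(trace_v, v_rising_v=37.0, hysteresis_v=2.0))  # wide hysteresis
```

Replaying the same trace with different hysteresis values gives a quick estimate of how much ΔV is needed before committing to a hardware change.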

Verification hooks (minimum test capture)

Waveforms: Vin, Iin, Vbulk, PG/Fault, retry counter (trigger on plug-in and on FAULT edges).
Corner sequences: cold-start, long cable drop, repeated plug/unplug, load step at RUN entry.
Pass criteria (placeholders): start_time ≤ X, Iin_peak ≤ X, retry_cnt ≤ X within Y minutes, Vbulk_min ≥ X.

Debug by mapping symptoms to transitions: INRUSH timeout, UVLO oscillation, RUN brownout, and RETRY storms. Keep thresholds as “X” placeholders for product-specific limits.

H2-6. Maintain Power Signature (MPS) & Disconnect Behavior

A PD can “look alive” locally yet still get disconnected upstream when the maintain criteria are violated. The most common triggers are light-load gaps, burst/skip modes, and deep sleep states that drop the effective load below the maintain window.

What triggers MPS loss (PD-side)

Low-load operation: average power may be adequate, but the effective load can fall below the maintain window.
Burst/skip mode gaps: long “no-load” gaps create maintain holes even if bursts are large.
Deep sleep: main rails collapse while only housekeeping remains; the port can be seen as inactive.

Placeholder windows: MPS_min = X, max_gap = X, disconnect_delay = X.

Keeping MPS while optimizing efficiency

Bleeder load

Simple and robust. Trades efficiency and heat for maintain stability. Best for always-on connectivity and minimal firmware dependence.

Pulsed loading

Higher efficiency by injecting periodic load pulses. Requires timing design (period/duty) to avoid long gaps that violate the maintain window.

Design guard: ensure the longest “effective load gap” stays below the allowed maintain gap (placeholder X).
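
The design guard can be checked against a logged input-current profile. The following sketch assumes uniformly sampled current and uses hypothetical values for MPS_min, the sample period, and the allowed gap; substitute the limits that apply to the PD type in use.

```python
def longest_gap_ms(current_ma, dt_ms, mps_min_ma):
    """Longest continuous time the input current stays below MPS_min."""
    longest = run = 0
    for i in current_ma:
        if i < mps_min_ma:
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return longest * dt_ms


# Hypothetical profile sampled every 10 ms: 80 ms keep-alive pulses at 12 mA,
# separated by 250 ms and 200 ms of 2 mA sleep current.
profile_ma = [12] * 8 + [2] * 25 + [12] * 8 + [2] * 20
gap_ms = longest_gap_ms(profile_ma, dt_ms=10, mps_min_ma=10)
max_gap_ms = 200  # placeholder for the allowed maintain gap (X)
print(gap_ms, "ms gap,", "OK" if gap_ms <= max_gap_ms else "violates max_gap")
```

Running the same function over a long capture that includes communication bursts helps expose irregular profiles that a fixed pulse schedule alone would not reveal.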

Corner cases (sleep modes, AUX rails, periodic wake)

AUX-only operation: housekeeping may keep logs running while main rails are off; maintain can still be lost.
Periodic wake: long intervals create maintain holes; short intervals reduce savings and raise temperature.
Mixed bursts: communications or sensing bursts can create irregular load profiles that violate maintain windows unexpectedly.

Verification hooks (make MPS measurable)

Record: load profile (or duty-equivalent), longest gap, disconnect timestamp, and last keep-alive action.
Stress: deepest sleep, lowest ambient, highest cable drop, and repeated wake/sleep cycling.
Pass criteria (placeholders): no disconnect within Y hours; max_gap ≤ X; keep-alive energy ≤ X.

A PD can be disconnected when burst gaps or deep sleep drop the effective load below the maintain window. Keep the longest gap below the allowed limit (X) using bleeder or pulsed loading.

H2-7. Isolated DC-DC Integration: Primary Control, Feedback, and Sync Rectification (SR)

SR is not “just higher efficiency.” It introduces timing and recovery risks that couple secondary current, gate timing, feedback behavior, and PD enable/PG policy. This section turns SR into a controllable interface with measurable hooks.

Integration map (what must line up)

PD controller: inrush/hot-swap, MPS, enable sequencing, telemetry, and fault/event capture.
Primary control: switch drive and current limiting; startup dynamics determine whether SR ever sees valid current.
SR stage: gate timing must avoid reverse current and reduce diode conduction without creating overlap.
Feedback: opto vs PSR impacts light-load stability and how PG/FAULT should be defined.

Isolation topologies used with PD (integration constraints only)

Flyback (incl. QR/CCM variants)

SR sees pulsed energy transfer, especially at light load. Gaps and discontinuous current make gate timing margin sensitive. PD enable/PG must tolerate burst behavior without false FAULT transitions.

Active clamp flyback (ACF)

Higher efficiency with faster edges and richer waveforms. SR and feedback are more sensitive to parasitics and timing tolerance. In-depth EMI analysis belongs to sibling pages; here the goal is stable interfaces and recovery policy.

SR control modes (timing and dead-time risks)

Self-driven SR: simpler, but timing shifts with load, transformer parasitics, and temperature. Risks reverse current or excessive diode conduction.
Controller-driven SR: controllable and efficient, but dead-time must be tuned. Too short → overlap/reverse current. Too long → diode loss and heating.

Placeholder: dead-time = X (validate across light load, cold start, and worst-case parasitics).

Feedback and PD interaction (enable/PG stability)

Feedback choice defines how output stability is sensed during light-load gaps and sleep states:

Opto feedback

Strong regulation, but PG/FAULT definitions must account for startup and recovery edges. Define PG as “Vout ≥ X for ≥ X time” to avoid toggling in burst modes.

PSR (primary-side regulation)

Fewer isolated parts, but sensitivity rises with transformer tolerance and load profile. Validate output drift and recovery timing during deep sleep and periodic wake.

PD hooks: EN sequencing, PG stability window, FAULT latching policy, and event logs (first-fault cause).
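
A PG definition of the form "Vout ≥ X for ≥ X time" amounts to qualifying the comparator output over a hold window. The sketch below compares a raw comparator with a qualified one on a sampled Vout trace; the threshold, the hold length, the trace, and the function names are illustrative assumptions.

```python
def raw_pg(vout_v, v_ok_v):
    """Unqualified comparator: PG follows Vout sample by sample."""
    return [v >= v_ok_v for v in vout_v]


def qualified_pg(vout_v, v_ok_v, hold_samples):
    """PG asserts only after Vout stays at or above v_ok for hold_samples in a row."""
    pg, good = [], 0
    for v in vout_v:
        good = good + 1 if v >= v_ok_v else 0
        pg.append(good >= hold_samples)
    return pg


def count_edges(pg):
    """Number of PG transitions in the trace."""
    return sum(a != b for a, b in zip(pg, pg[1:]))


# Hypothetical burst-mode startup: Vout bounces around the 4.8 V threshold.
vout = [4.0, 4.9, 4.0, 4.9, 4.0, 4.9, 4.9, 4.9, 4.9]
print(count_edges(raw_pg(vout, 4.8)), count_edges(qualified_pg(vout, 4.8, 3)))
```

Here de-assertion is immediate; a design that also needs tolerance to short dips would add a second window on the falling side.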

Verification hooks (SR made measurable)

Scope points: SR gate, secondary current (or proxy), Vout ripple, SR MOS temperature, PG/FAULT edges.
Corner cases: light load + burst/skip, cold start, lowest Vin, load steps near sleep transitions.
Pass criteria (placeholders): reverse-current duration ≤ X, diode conduction share ≤ X, PG toggles ≤ X per Y minutes.

Treat SR as a timed interface. Validate dead-time (X) against secondary current across light-load gaps, cold start, and worst parasitics; define PG with stability windows to prevent burst-induced toggling.

H2-8. Protection & Fault Handling (PD + DC-DC Combined)

Fault handling should be expressed as symptom → cause → quick check → action. Combined PD + DC-DC designs must prevent retry storms, preserve first-fault evidence, and choose latch-off, hiccup, or auto-retry based on safety and recoverability.

Fault taxonomy (combined view)

Input: UVLO / OVP / inrush-OCP → startup loops, drop/reset, or no power-on.
Power stage: primary OCP, SR timing faults, feedback faults → current limit, ripple, or overheat.
Thermal: OTP in PD, primary, or SR → periodic dropouts and heat accumulation under retries.
Output: short/overload/OVP → collapse, hiccup cycling, or latch-off depending on policy.
Handshake-related: classification/detect faults → never enters RUN or powers briefly then stops.

Latch-off vs hiccup vs auto-retry (when to choose what)

Latch-off

Use for persistent shorts or suspected hardware damage. Prevents repeated heating and avoids uncontrolled on/off cycling.

Hiccup

Use for transient overloads. Limit duty cycle to control temperature, and keep fault windows long enough to avoid oscillation.

Auto-retry (with backoff)

Use only when the fault is recoverable and low-risk (e.g., upstream dips). Must include retry limits and exponential or staged backoff.

Anti-storm guard: retry_cnt ≤ X within Y minutes, plus temperature-aware backoff (placeholders).
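
One way to express the anti-storm guard is a capped exponential backoff schedule combined with a sliding-window retry limit. The sketch below is a minimal illustration; the function names and all timing values are assumptions, and temperature-aware scaling is left out.

```python
def backoff_schedule(base_s, factor, cap_s, max_retries):
    """Delay before each retry: base * factor**n, capped at cap_s."""
    return [min(base_s * factor ** n, cap_s) for n in range(max_retries)]


def retry_allowed(fault_times_s, now_s, window_s, max_in_window):
    """Allow a retry only if fewer than max_in_window faults fall inside the window."""
    recent = [t for t in fault_times_s if now_s - t < window_s]
    return len(recent) < max_in_window


# Hypothetical policy: 1 s base, doubling, capped at 30 s, at most 3 faults per 60 s.
print(backoff_schedule(base_s=1, factor=2, cap_s=30, max_retries=6))
print(retry_allowed([0, 10, 20], now_s=25, window_s=60, max_in_window=3))
print(retry_allowed([0, 10, 20], now_s=70, window_s=60, max_in_window=3))
```

When the window limit blocks a retry, the policy can escalate to latch-off or simply wait until the oldest fault ages out of the window.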

Safe recovery policy (log first-fault cause before action)

Latch first-fault: fault_code_first, timestamp, and the state at fault entry.
Snapshot: Vin_min, Vbulk_min, Vout_min, temp_max, retry_cnt (placeholders).
Apply policy: latch-off / hiccup / retry+backoff based on fault class and thermal margin.
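
The ordering "latch, snapshot, then act" can be sketched as a small handler. Everything named here (the class, the policy table, the field names) is hypothetical and only illustrates the sequence; the mapping from fault code to action is an example, not a recommendation for any specific controller.

```python
class FirstFaultLatch:
    """Keeps the first fault and its snapshot until the host acknowledges it."""

    def __init__(self):
        self.first = None
        self.count = 0

    def report(self, code, t_ms, state, snapshot):
        self.count += 1
        if self.first is None:  # later faults never overwrite the first record
            self.first = {"code": code, "t_ms": t_ms, "state": state, **snapshot}

    def ack(self):
        first, self.first, self.count = self.first, None, 0
        return first


# Example policy table (assumption): hard faults latch, transients hiccup or retry.
POLICY = {"SHORT": "LATCH", "OCP": "HICCUP", "UVLO": "RETRY+backoff"}


def handle_fault(latch, code, t_ms, state, snapshot, thermal_margin_ok=True):
    latch.report(code, t_ms, state, snapshot)  # 1) and 2): evidence first
    if not thermal_margin_ok:
        return "LATCH"                         # no thermal margin: stop cycling
    return POLICY.get(code, "LATCH")           # 3): then choose the action


latch = FirstFaultLatch()
a1 = handle_fault(latch, "UVLO", 1200, "RUN", {"vbulk_min_v": 34.1, "retry_cnt": 0})
a2 = handle_fault(latch, "OCP", 1450, "RETRY", {"vbulk_min_v": 33.0, "retry_cnt": 1})
print(a1, a2, latch.first["code"], latch.count)
```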

Express faults as “symptom → cause → check → action.” Preserve first-fault evidence, enforce retry limits (X), and use backoff to prevent storms.

H2-9. Telemetry, Event Logging & “Black-Box” for Field Diagnostics

Field issues are rarely reproducible on demand. A minimal black-box turns “intermittent” into an ordered event timeline with consistent counter definitions, brownout-safe retention, and fast readout.

What to log (3 layers that work in the field)

Event log (timestamped)

Power class / negotiation result
Start attempts and state transitions (IDLE → INRUSH → RUN → FAULT → RETRY)
Faults with codes (UVLO/OVP/OCP/OTP/SHORT/FB/SR)
Recovery action taken (LATCH / HICCUP / RETRY+backoff)

Snapshots (triggered “evidence”)

Capture a compact electrical snapshot on key events: Vin, Iin, Temp peak, State, FaultCode, RetryCount. Keep units and scaling consistent to enable cross-device comparison.

Counters (long-horizon trend)

Track totals such as start_attempts, fault_count_by_type, brownout_count, and otp_events. Counters are only useful when the denominator and time window are explicitly defined.

Counter definitions (avoid denominator/window mismatch)

Bind a denominator: per start_attempts (startup issues) or per uptime_minutes (run-time issues). Avoid mixing.
Declare the time window: since boot vs rolling window vs last N events. Keep one default for dashboards.
Define state coverage: whether INRUSH/RETRY are included. “RUN-only” metrics often hide startup storms.
Prevent endpoint mixing: per-port vs per-device totals must not be merged without labels.

Common misread

A “low fault rate” can be an artifact if the denominator is uptime while faults occur during repeated startups. For startup fragility, normalize by start_attempts and report retry_count distribution.
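
The misread is easy to reproduce with numbers. The sketch below normalizes the same fault count by two different denominators and summarizes retries per boot; the function names and the counts are made-up illustrations.

```python
from collections import Counter


def fault_rate_per_start(fault_count, start_attempts):
    return fault_count / start_attempts


def fault_rate_per_uptime_hour(fault_count, uptime_minutes):
    return fault_count / (uptime_minutes / 60.0)


def retry_distribution(retries_per_boot):
    """Map retry count -> number of boots that needed that many retries."""
    return dict(sorted(Counter(retries_per_boot).items()))


# Hypothetical unit: 40 startup faults over 50 start attempts and 10,000 minutes of uptime.
per_start = fault_rate_per_start(40, 50)
per_hour = fault_rate_per_uptime_hour(40, 10_000)
print(f"{per_start:.2f} faults per start, {per_hour:.2f} faults per uptime hour")
print(retry_distribution([0, 0, 3, 5, 0, 3]))
```

The per-hour figure looks benign while four out of five starts hit a fault, which is why the denominator has to be stated next to every rate.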

Interfaces and retention (brownout-safe evidence)

Readout: I²C or PMBus for structured fields; an INT pin for immediate “new event” notification.
Storage: use a ring buffer with schema_version and a commit marker to avoid partial records after brownout.
Brownout rule: log first-fault + snapshot before retry policy decisions. Preserve evidence before cycling.
Retention policy: keep last N events and last M faults, plus last boot record (placeholders).
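
The commit-marker idea can be modeled in a few lines: write the payload first, write the marker last, and have the reader ignore any slot whose marker is missing. The sketch below simulates storage with a Python list and simulates a brownout with a flag; the class, the marker value, and the schema number are hypothetical.

```python
SCHEMA_VERSION = 1
COMMIT = 0xA5  # marker written only after the payload is complete (assumed value)


class RingLog:
    def __init__(self, slots):
        self.slots = [None] * slots
        self.seq = 0

    def write(self, payload, power_fails_before_commit=False):
        idx = self.seq % len(self.slots)
        # Step 1: payload and sequence number, marker still cleared.
        self.slots[idx] = {"schema": SCHEMA_VERSION, "seq": self.seq,
                           "payload": payload, "commit": 0x00}
        self.seq += 1
        if power_fails_before_commit:
            return  # simulated brownout: record stays partial
        # Step 2: marker last, so a valid marker implies a complete payload.
        self.slots[idx]["commit"] = COMMIT

    def read_valid(self):
        valid = [s for s in self.slots if s and s["commit"] == COMMIT]
        return [s["payload"] for s in sorted(valid, key=lambda s: s["seq"])]


log = RingLog(slots=3)
log.write("UVLO@1200ms")
log.write("RETRY@1450ms")
log.write("OCP@1500ms", power_fails_before_commit=True)  # lost to brownout
log.write("RUN@2100ms")                                  # wraps and overwrites the oldest slot
print(log.read_valid())
```

On real non-volatile memory the same ordering applies, with the added requirement that the marker write is itself atomic at the device level.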

Minimal “forensics record” template (fields list)

Keep a compact record that supports timeline reconstruction without inflating storage or readout time:

EventID (enum)
Time (ms since boot; optional UTC if available)
Vin (X), Iin (X), Temp (X)
State (IDLE/INRUSH/RUN/FAULT/RETRY)
FaultCode (UVLO/OVP/OCP/OTP/SHORT/FB/SR)
RetryCount (X) and PolicyAction (LATCH/HICCUP/RETRY+backoff)
CommitFlag (valid/partial) and SchemaVersion
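
One possible fixed-size encoding of this field list is sketched below using Python's standard `struct` module. The field widths, units (mV, mA, 0.1 °C), enum values, and byte order are all assumptions made for the example; a real layout would follow the storage device and host tooling.

```python
import struct
from collections import namedtuple
from enum import IntEnum


class State(IntEnum):
    IDLE = 0
    INRUSH = 1
    RUN = 2
    FAULT = 3
    RETRY = 4


class FaultCode(IntEnum):
    NONE = 0
    UVLO = 1
    OVP = 2
    OCP = 3
    OTP = 4
    SHORT = 5
    FB = 6
    SR = 7


Record = namedtuple("Record", "schema_version event_id time_ms vin_mv iin_ma "
                              "temp_dc state fault retry_count action commit_flag")

# Little-endian, no padding: 2 x u8, u32, 2 x u16, i16, 5 x u8.
RECORD_FMT = "<BBIHHhBBBBB"
COMMIT_VALID = 0xA5  # assumed marker value


def encode(rec):
    return struct.pack(RECORD_FMT, *rec)


def decode(raw):
    return Record._make(struct.unpack(RECORD_FMT, raw))


rec = Record(1, 7, 1200, 48_500, 310, 652, State.RUN, FaultCode.UVLO, 2, 3, COMMIT_VALID)
raw = encode(rec)
back = decode(raw)
print(len(raw), State(back.state).name, FaultCode(back.fault).name)
```

Keeping the schema version as the first byte lets a reader select the right format string before unpacking the rest of the record.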

Use a timestamped event log with compact snapshots. Lock counter denominators/windows, and protect evidence with ring-buffer commits across brownouts.

H2-10. Verification Plan: Bench Bring-Up → System → Production Gates

Verification should run as gates. Each gate has must-test items, captured evidence, and pass criteria (X). Failures should route to the correct chapter (handshake/inrush/SR/protection/logging) without expanding scope.

Gate template (consistent evidence and pass criteria)

Purpose: what this gate proves (electrical truth, system robustness, or production screen).
Must-test: 3–5 checks that cover the dominant risks.
Evidence: waveforms + event logs aligned by time markers.
Pass: thresholds (X) and stability windows (X).
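
The gate template lends itself to a data-driven check, so the same evaluation code serves bench, system, and production gates. The sketch below is one hypothetical shape for it: the check names, limits, and measurements are placeholders, and a missing measurement is treated as a failure because the evidence is absent.

```python
import operator

OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq}


def evaluate_gate(checks, measurements):
    """Return a list of (name, value, op, limit) for every failed or missing check."""
    failures = []
    for name, (op, limit) in checks.items():
        value = measurements.get(name)
        if value is None or not OPS[op](value, limit):
            failures.append((name, value, op, limit))
    return failures


# Hypothetical design-gate limits standing in for the X placeholders.
design_gate = {
    "iin_peak_a": ("<=", 0.40),
    "vbulk_ramp_ms": ("<=", 50),
    "pg_stable_ms": (">=", 100),
    "retry_cnt": ("==", 0),
}
measured = {"iin_peak_a": 0.35, "vbulk_ramp_ms": 62, "pg_stable_ms": 150, "retry_cnt": 0}
print(evaluate_gate(design_gate, measured))
```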

Design gate (bench bring-up)

Handshake capture: detection/class/power-on windows (PD view).
Inrush: Iin_peak ≤ X, Vbulk ramp time ≤ X, no oscillation.
Startup stability: PG stable for ≥ X time after RUN entry.
SR timing margin: dead-time X validated against secondary current across load sweep.
Policy sanity: inject one controlled fault; confirm fault_code + action + first-fault log.

Bring-up gate (system corners)

Cable corners: validate startup and retry behavior under worst-case line drop (X).
Temperature corners: cold/hot stability; temp_peak ≤ X and no runaway retries.
Sleep/light-load: maintain stable operation without unintended drops; log continuity preserved.
Load transients: Vout droop ≤ X and no false FAULT/PG toggling.
Recovery behavior: backoff enforced; retry_cnt ≤ X per Y minutes (placeholders).

Production gate (fast screen + traceability)

Signature/class sanity: class result within expected window (X).
Fast startup: start_time ≤ X and PG asserted within X.
Log readout: schema_version + last N events + first-fault fields readable.
Param limits: key thresholds set to allowed ranges (X).
Controlled fault (optional): short pulse or overload pulse; action matches policy.

Gate the work: prove electrical truth on bench, then system robustness, then production screening and traceability. Keep thresholds as explicit placeholders (X).

H2-11. Applications

This page targets Powered Device (PD) designs that benefit from an isolated power stage, optional synchronous rectification (SR), and fault/event records for field diagnostics. Use the buckets below to map system needs to an implementable PD power architecture.

Bucket A · IP Cameras / Access Control

Power: 13–25 W · Isolated rails: 1–2 · Logging: recommended

Why isolated: long cable + chassis coupling + remote mounting often demand isolation to reduce ground-loop and noise injection.
Why SR matters: sealed enclosures and compact mechanicals make efficiency-to-thermal headroom a first-order constraint.
Why logging: repeated brownouts / restart loops / thermal peaks are costly without a minimal evidence record.

Example IC part numbers to shortlist (verify fit vs power & topology):

Integrated PD + flyback (size-first): TI TPS23758
PD interface + isolated controller: TI TPS23754-1, TI TPS23753A
Integrated PD + switching regulator: ADI LTC4269-1, ADI LTC4267
High-power PD interface (external PWM): TI TPS2373 + TI LM51551-Q1 (PWM)
Secondary SR controller (if external SR): TI UCC24610

Bucket B · Wireless APs / Edge Nodes

Power: 25–60 W · Isolated rails: 1 + AUX · SR: often worth it

When integrated isolated DC-DC is preferred: tight BOM, faster bring-up, fewer topology gotchas, and repeatable production limits.
Thermal/space trigger: if airflow is weak and heatsinking is limited, SR + spread-spectrum options become decision drivers.
Service trigger: field failures often present as “reboot loops”; logging should preserve first-fault cause across brownouts.

Example IC part numbers to shortlist:

High-power PD interface + external PWM: TI TPS2373 + TI LM51551-Q1
High-power PD interface (external pass FET option): TI TPS2379
PoE-PD interface (high power family): onsemi NCP1096
Integrated PD + switching regulator: ADI LTC4269-2 (forward + SR-friendly use cases)
Secondary SR controller: TI UCC24610

Bucket C · Industrial Sensors / Remote Nodes

Power: 7–20 W · Isolated rails: 1 · MPS: watch light-load

Primary risk: deep-sleep or burst-mode loads can fall below Maintain Power Signature (MPS) and trigger disconnect.
Design posture: choose a PD + isolated controller that supports predictable startup/backoff and clean PG/fault signaling.
Logging posture: record power-on attempts, fault codes, and thermal peaks to avoid “no-fault-found” returns.

Example IC part numbers to shortlist:

PD + isolated controller: TI TPS23754-1, TI TPS23753A
Integrated PD + flyback regulator: ADI LTC4267 (802.3af class range), ADI LTC4269-1
Secondary SR controller (if external SR): TI UCC24610

Bucket D · Remote I/O / Distributed Control

Power: 15–30 W · Isolated rails: 2+ · Fault policy: strict

When SR matters: multi-rail systems (isolated main + housekeeping) often run warm; SR recovers margin without larger heatsinks.
When logging matters: intermittent shorts or overloads can create retry storms; recovery policy must preserve evidence and rate-limit retries.

Example IC part numbers to shortlist:

PD interface + isolated converter controller: TI TPS23754-1
PD interface + external PWM (scales power): TI TPS2373 + TI LM51551-Q1
PoE-PD interface family: onsemi NCP1096
Secondary SR controller: TI UCC24610

When a PD with integrated isolated DC-DC / SR / logging is the right call

Isolation is non-negotiable: remote nodes, ground-loop exposure, or mixed chassis/field grounds.
Thermal density is high: limited airflow + compact enclosure + power above X W (placeholder).
Service cost is high: “reboot loop” returns require black-box evidence (first-fault + retry counters + thermal peaks).

Stop line: deeper RJ45/magnetics/CMC/TVS/EMI topics belong to sibling pages: Low-C TVS, Magnetics & CMC, Long cable & grounding.

Diagram · Application buckets (power / isolation / logging)

Buckets map system intent to PD architecture triggers. Labels show the minimum decision axes that affect stability, thermals, and serviceability.

H2-12 · IC Selection Logic + Engineering Checklist (Design → Bring-Up → Production)

The goal is a repeatable selection path: must-have gates → nice-to-have → risk hooks, ending with a checklist that forces each risk hook to have a measurable pass criterion (X placeholders).

Gate A · Must-have inputs (do not guess)

Required input power: target class/type and worst-case load (X W) plus startup margin.
Isolation requirement: yes/no, number of isolated rails (Y), and any functional partition constraints.
SR requirement: required efficiency/thermal headroom (ΔT = X) and enclosure airflow assumptions.
Telemetry/logging requirement: minimum record fields + retention across brownout (N events / T hours).
Adapter ORing: external adapter coexistence (priority policy) and enable/PG handshakes.

Gate A Output · Pick an architecture category

Category 1 — Integrated PD + flyback controller (fastest bring-up): TI TPS23758
Category 2 — PD + isolated converter controller (flexible power stage): TI TPS23754-1, TI TPS23753A
Category 3 — High-power PD interface + external PWM (scales power / tuning): TI TPS2373 + TI LM51551-Q1
Category 4 — PoE-PD interface family (system-defined DC-DC): onsemi NCP1096
Category 5 — Integrated PD + switching regulator options: ADI LTC4269-1, ADI LTC4269-2, ADI LTC4267

Note: SR can be integrated (device-dependent) or implemented via a dedicated secondary SR controller such as TI UCC24610.

Gate B · Nice-to-have (choose intentionally)

Programmable inrush profile: reduces startup surprises with large bulk caps.
Retry/backoff controls: prevents “storm” behavior under intermittent faults.
PG/fault semantics: unambiguous enable/disable of the downstream converter.
Sync / spread-spectrum options: helps manage switching interference without deep EMI detours.
Low-load behavior: avoids accidental MPS drop during sleep or burst-mode loads.
Log readout hooks: INT pin + I²C/PMBus (or simple GPIO codes) for field triage.

Gate C · Risk hooks (force a verification item)

Retry storm risk: short/overload + fast auto-retry can cause repeated inrush stress → require backoff (X) and first-fault logging.
Light-load MPS drop: deep sleep or burst mode can look like “dead PD” → require MPS retention test across modes (X).
SR timing margin: ringing/noise can collapse dead-time → require SR timing capture and minimum margin (X ns).
Thermal headroom illusion: enclosure airflow assumptions often fail → require ΔT measurement at worst case (X °C).
Counter definition mismatch: window/denominator confusion ruins diagnostics → freeze metric definitions in firmware (schema version).

Key specs checklist (with “why it matters”)

Keep targets explicit; use placeholders (X) until measured.

Spec	Target (X)	Why it matters	How to verify
Inrush limit profile	Iin ≤ X	Prevents PSE trips and startup oscillation	Capture Vin/Iin/Vbulk during startup
UVLO hysteresis	ΔV ≥ X	Avoids brownout loops and repeated restarts	Sweep input & observe state transitions
MPS retention at light load	No disconnect for X	Prevents “system runs then suddenly dies”	Sleep/load profile tests + logs
SR dead-time margin	DT ≥ X ns	Avoids cross-conduction and thermal spikes	Scope SR gate vs secondary current
Fault recovery policy	Backoff = X	Prevents repeated stress and false RMA	Fault injection + verify logs

Concrete part-number shortlist (starting points)

The list below is intentionally pragmatic: pick a category first, then validate class/power, topology, and thermal margin in the gate tests.

Block	Example IC P/N	Use when…
Integrated PD + flyback	TI TPS23758	Size/BOM reduction is the priority
PD + isolated controller	TI TPS23754-1, TI TPS23753A	Isolated stage needs control flexibility
High-power PD interface	TI TPS2373, TI TPS2379	Power scaling / external pass device is needed
PWM controller (flyback)	TI LM51551-Q1	Used with PD interface that supports advanced startup
Secondary SR controller	TI UCC24610	External SR is preferred or required
Integrated PD + switching regulator	ADI LTC4269-1, ADI LTC4269-2, ADI LTC4267	Complete front-end + regulator simplifies design
PoE-PD interface family	onsemi NCP1096	System-defined DC-DC; PD handshake + inrush managed

Engineering checklist (must produce evidence)

Design gate · define policies + measurement points

Inrush policy: set I-limit profile and bulk capacitance target (Cbulk = X) with waveforms as evidence.
Brownout policy: UVLO thresholds + hysteresis (ΔV = X) to stop oscillation loops.
SR policy: required dead-time margin (DT ≥ X ns) and capture plan (gate + current).
Fault taxonomy: map fault → action (latch/hiccup/retry) and set backoff (X).
Forensics schema: freeze counter definitions + event record fields + versioning.
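One way to make the fault taxonomy concrete is a lookup table that firmware and test scripts share. The sketch below is an assumption-laden illustration: the fault codes, the backoff values, and the choice to latch on unknown codes are all hypothetical and must be replaced by the project's own policy.

```python
from enum import Enum


class Action(Enum):
    HICCUP = "hiccup"
    LATCH = "latch"
    RETRY = "retry"


# fault code -> (action, backoff in seconds; None when the action never auto-restarts)
# All codes and timings are placeholders standing in for X.
FAULT_POLICY = {
    "OCP_TRANSIENT": (Action.HICCUP, 1.0),
    "OUTPUT_SHORT_HARD": (Action.LATCH, None),
    "OTP": (Action.RETRY, 10.0),
    "UVLO": (Action.RETRY, 2.0),
}


def resolve(fault_code):
    """Return (action, backoff_s); unknown codes latch off as a conservative default."""
    return FAULT_POLICY.get(fault_code, (Action.LATCH, None))


print(resolve("OTP"))
print(resolve("NOT_IN_TABLE"))
```

Keeping the table in one place makes the bring-up fault-injection test a direct comparison between the observed action and the documented one.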

Bring-up gate · capture waveforms + stress corners

Handshake capture: detect/class/power-on traces and signatures (windows = X).
Startup stability: verify no repeated restart loops across cable and temperature corners.
MPS retention: validate deep sleep/light-load profiles do not drop power for X minutes/hours.
SR margin: verify dead-time at worst ringing condition; confirm no cross-conduction spikes.
Fault injection: short/overload/OTP tests must leave usable first-fault logs.

Production gate · quick tests + param limits

Class sanity: verify correct class behavior under controlled input (pass = X).
Startup time: time-to-regulation within X under nominal load.
Log readout: verify event counters readable and schema version matches firmware.
Thermal spot-check: ΔT within X at controlled airflow condition.
Controlled fault: one scripted fault must produce consistent action and evidence record.

Diagram · Selection flow (inputs → gates → outputs)

The flow forces every “risk hook” to appear as a verification item with a measurable pass criterion.

Stop line: selection here is PD-power-centric; deep EMI/magnetics/TVS content stays on sibling pages to avoid scope overlap.

H2-13 · FAQs (PD classification / inrush / MPS / SR / logging)

Each answer uses a fixed, data-oriented format: Likely cause → Quick check → Fix → Pass criteria (X). Scope is strictly PD-side behavior (no PSE policy).

Class is correct, but the PD keeps rebooting under load — inrush retry storm or DC-DC hiccup?

Likely cause: Brownout loop (UVLO hysteresis too small) or converter hiccup colliding with PD retry/backoff.

Quick check: Capture Vin/Iin/Vbulk and log state transitions around the reboot; verify that the reboot aligns with a UVLO or OCP/OTP event.

Fix: Increase UVLO hysteresis (ΔV=X), tune inrush profile, and enforce retry backoff (Tbackoff=X) before re-enable.

Pass criteria: No reboot for X minutes at Y% load step; retry count ≤ X per hour; Vbulk never drops below X V.
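The effect of hysteresis on a brownout loop can be checked offline against a captured trace. The sketch below models the UVLO as an ideal hysteretic comparator and counts how often it re-enables; the function name, the thresholds, and the sample trace are hypothetical and chosen only to show the mechanism.

```python
def count_uvlo_restarts(v_samples, v_on, v_off):
    """Count OFF->ON transitions of an ideal hysteretic UVLO over a voltage trace."""
    if v_off >= v_on:
        raise ValueError("v_off must be below v_on (hysteresis = v_on - v_off)")
    enabled = False
    restarts = 0
    for v in v_samples:
        if not enabled and v >= v_on:
            enabled = True
            restarts += 1
        elif enabled and v <= v_off:
            enabled = False
    return restarts


# Illustrative trace (volts): the rail sags to 35.5 V each time the load is applied.
trace = [30.0, 38.0, 35.5, 37.5, 35.5, 37.5, 35.5, 38.0]

print(count_uvlo_restarts(trace, v_on=37.0, v_off=36.0))  # 1 V hysteresis
print(count_uvlo_restarts(trace, v_on=37.0, v_off=34.0))  # 3 V hysteresis
```

With 1 V of hysteresis the sag crosses the turn-off threshold on every load application, so the comparator re-enables four times; with 3 V it enables once and stays on. Running the same function over a real Vbulk capture gives a direct count to compare against the retry limit.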

Passes on bench, fails on long cable — marginal MPS or startup timing?

Likely cause: Cable drop increases input ripple; marginal MPS current during light-load phases triggers disconnect.

Quick check: Compare long-cable vs bench: measure Vin ripple, record MPS-related events, and run a controlled sleep/load profile.

Fix: Increase hold-up (Cbulk=X) or adjust load shaping (bleeder/pulsed load) to keep MPS, and avoid startup windows with insufficient margin.

Pass criteria: No disconnect for X hours on a Z-meter cable; Vin_min ≥ X V during worst transient; MPS events = 0.

Classification is unstable between boots — leakage/tolerance drift or input bridge drops?

Likely cause: Leakage paths (ESD/clamp/PCB contamination), or DET/CLS bias interacting with bridge drop and tolerance stack.

Quick check: Measure signature/class currents across X cold boots, varying humidity/temperature; compare with bridge vs ideal-bridge path if available.

Fix: Tighten leakage budget (cleanliness/guarding), verify Rsig tolerance stack, and re-bias DET/CLS network to meet window with margin.

Pass criteria: Class result identical across X boots and Y conditions; class current stays within ±X% window.

SR improves efficiency but creates random faults — dead-time too tight or ringing false turn-on?

Likely cause: SR dead-time margin too small, or drain ringing couples into SR gate sense and causes false turn-on.

Quick check: Scope SR gate and secondary current; look for overlap and false pulses during ringing at worst load/line.

Fix: Increase dead-time (DT=X ns), add damping/snubber, and tighten SR gate routing/return to reduce false triggering.

Pass criteria: SR overlap time = 0; minimum DT margin ≥ X ns across Y corners; random fault rate ≤ X / day.

Light-load sleep saves power but gets disconnected — MPS missing due to burst mode?

Likely cause: Burst mode or deep sleep lets input current fall below the MPS requirement during long idle windows.

Quick check: Log MPS-related events and measure input current profile over the sleep duty-cycle; confirm disconnect aligns with low-load segment.

Fix: Add a controlled maintenance load (bleeder or pulsed loading) and schedule periodic wake-ups if needed; make sure the AUX rail's light-load behavior cannot cause MPS loss.

Pass criteria: No disconnect for X hours with sleep duty-cycle Y%; input maintenance pulses occur every X ms (or bleeder ≤ X mW).
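A logged input-current profile can be screened for MPS risk before long soak tests. The sketch below reports the longest continuous stretch spent below a current threshold; the function name and all numbers are hypothetical, and the real threshold and allowed dropout time must come from the applicable IEEE 802.3 requirements and the PD controller datasheet.

```python
def longest_below_threshold_ms(samples_ma, sample_period_ms, i_mps_ma):
    """Longest continuous time (ms) the input current stays below i_mps_ma."""
    longest = 0
    run = 0
    for i in samples_ma:
        run = run + 1 if i < i_mps_ma else 0
        longest = max(longest, run)
    return longest * sample_period_ms


# Illustrative sleep profile sampled every 10 ms:
# 400 ms idle at 3 mA, then an 80 ms maintenance pulse at 12 mA, repeated.
profile = ([3.0] * 40 + [12.0] * 8) * 3

gap_ms = longest_below_threshold_ms(profile, sample_period_ms=10, i_mps_ma=10.0)
print(gap_ms)
```

If the reported gap exceeds the allowed dropout window, shorten the idle segment or raise the maintenance load before blaming the cable.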

Fault pin toggles but logs show nothing — brownout wiped log or interrupt not latched?

Likely cause: Log write happens too late and gets lost during brownout, or INT/fault edge is not latched/qualified.

Quick check: Force a repeatable fault and verify: (1) log commit time, (2) hold-up time, (3) INT latch behavior and debounce.

Fix: Log “first-fault” immediately, add brownout-safe commit (NVM/retention RAM), and latch INT until host acknowledges.

Pass criteria: First-fault record retained after X ms brownout; INT remains asserted ≥ X ms or until ACK; missing logs = 0 / X trials.
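The first-fault and latched-INT behavior can be modeled in a few lines to pin down the intended semantics before firmware is written. This is a host-side sketch only: `FaultLatch` and its fields are hypothetical, and on real hardware the record would live in NVM or retention RAM rather than a Python attribute.

```python
class FaultLatch:
    """Keeps the first fault and holds INT asserted until the host acknowledges."""

    def __init__(self):
        self.first_fault = None    # assumption: backed by NVM/retention RAM in hardware
        self.int_asserted = False
        self.fault_count = 0

    def on_fault(self, code, timestamp_ms):
        self.fault_count += 1
        if self.first_fault is None:
            # Commit the first fault before any other handling so a brownout cannot lose it.
            self.first_fault = (code, timestamp_ms)
        self.int_asserted = True

    def host_ack(self):
        """Return the first-fault record and release INT."""
        record = self.first_fault
        self.first_fault = None
        self.int_asserted = False
        return record


latch = FaultLatch()
latch.on_fault("UVLO", 100)
latch.on_fault("OCP", 105)   # later faults are counted but do not overwrite the first
print(latch.first_fault, latch.int_asserted, latch.fault_count)
```

The key design choice is that later faults increment the counter but never overwrite the first record, which is what keeps the root cause visible after a retry storm.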

Thermal looks fine, still trips OTP — hotspot near SR/rectifier or sensor placement mismatch?

Likely cause: Local hotspot (SR MOSFET/rectifier/transformer) exceeds OTP while the measured “average” point stays cool.

Quick check: Compare OTP trigger time with an IR scan (or thermocouples) at SR devices, transformer, and controller sensor location.

Fix: Improve hotspot spreading (copper/thermal vias), adjust SR timing to reduce loss, and align sensor placement to worst-case hotspot.

Pass criteria: Hotspot ΔT ≤ X °C at ambient Y °C; no OTP in X minutes at Y% load.

Inrush looks within limit, yet startup fails — UVLO hysteresis or bulk cap ESR/ESL?

Likely cause: Vbulk droops due to ESR/ESL and triggers UVLO; or UVLO hysteresis is too small and creates oscillation.

Quick check: Capture Vbulk droop and UVLO threshold crossing at startup; compare capacitor ESR/ESL and placement-induced inductance.

Fix: Increase hysteresis (ΔV=X), reduce loop inductance (placement/return), and choose cap ESR/ESL to meet hold-up window.

Pass criteria: Startup succeeds across X cold boots; Vbulk_min ≥ X V during enable; UVLO transitions ≤ X per boot.

Output short causes long recovery time — latch-off policy too strict or retry backoff too long?

Likely cause: Latch-off requires manual intervention, or conservative backoff delays re-power after transient shorts.

Quick check: Inject a controlled short for X ms; log fault code, action (latch/hiccup/retry), and backoff timing.

Fix: Use hiccup for transient faults, latch for hard faults; tune backoff and limit restart attempts per window.

Pass criteria: Recovery time ≤ X s for transient short; restart attempts ≤ X per Y minutes; first-fault cause recorded.
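Backoff and the per-window restart limit can be combined in one gatekeeper. The sketch below is a minimal model under stated assumptions: `RetryLimiter` is a hypothetical name, timestamps are in seconds from any monotonic clock, and the numeric settings stand in for the X and Y placeholders.

```python
from collections import deque


class RetryLimiter:
    """Allow a restart only after backoff has elapsed and the window budget permits."""

    def __init__(self, backoff_s, max_attempts, window_s):
        self.backoff_s = backoff_s
        self.max_attempts = max_attempts
        self.window_s = window_s
        self.attempts = deque()  # timestamps of granted restart attempts

    def may_restart(self, now_s, fault_time_s):
        # Forget attempts that have aged out of the sliding window.
        while self.attempts and now_s - self.attempts[0] >= self.window_s:
            self.attempts.popleft()
        if now_s - fault_time_s < self.backoff_s:
            return False  # still inside the backoff interval
        if len(self.attempts) >= self.max_attempts:
            return False  # restart budget for this window is used up
        self.attempts.append(now_s)
        return True


limiter = RetryLimiter(backoff_s=2.0, max_attempts=2, window_s=60.0)
print(limiter.may_restart(now_s=1.0, fault_time_s=0.0))    # inside backoff
print(limiter.may_restart(now_s=2.0, fault_time_s=0.0))    # first granted attempt
print(limiter.may_restart(now_s=12.0, fault_time_s=10.0))  # second granted attempt
print(limiter.may_restart(now_s=22.0, fault_time_s=20.0))  # budget exhausted
```

A transient short recovers after one backoff interval, while a persistent short runs out of budget and stops stressing the inrush path until the window rolls over.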

Field units fail after an ESD event but still power — leakage shift breaks signature?

Likely cause: Post-ESD leakage increases and shifts signature/class behavior or corrupts DET/CLS bias, making handshake fragile.

Quick check: Before/after ESD: measure leakage on the input path and compare signature/class stability; correlate with new boot failures.

Fix: Improve protection/leakage budget (layout return paths, clamp selection, cleanliness), and add margin to signature/class networks.

Pass criteria: After IEC 61000-4-2 ESD stress, class/signature stable within ±X%; boot success ≥ X% across Y boots; leakage ≤ X µA.

Different PD controller vendor in the same footprint changes behavior — DET/CLS pin bias mismatch?

Likely cause: Pin-level analog expectations differ (DET/CLS bias, thresholds, leakage, timing), even if footprint matches.

Quick check: Compare DET/CLS node voltages/currents during detect/class across vendors; validate required external component ranges.

Fix: Recalculate external networks (R/C values) for the new device’s biasing model; re-validate tolerance stack and leakage.

Pass criteria: Detect/class traces overlap within ±X%; class result stable across X boots; no false detect events in X trials.

Event counters disagree across firmware versions — window/denominator definition drift?

Likely cause: Counter definitions changed (window length, reset rules, denominator scope), producing incompatible metrics.

Quick check: Verify schema version and document: window=T, denominator=scope, reset=rule; replay identical test and compare raw events.

Fix: Freeze metric definitions, add schema version to logs, and publish a migration map between versions.

Pass criteria: Same test yields counters within ±X% across versions; schema version present in 100% of records; window = X s fixed.
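The window/denominator problem is easy to demonstrate with two records that carry the same raw count. The sketch below assumes a hypothetical `CounterRecord` layout with an explicit schema version and window length; field names and values are illustrative, not a defined log format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CounterRecord:
    schema_version: int
    window_s: int     # length of the counting window, fixed per schema version
    retry_cnt: int    # raw retries observed inside that window


def retries_per_hour(rec):
    """Normalize the raw count to a common denominator."""
    return rec.retry_cnt * 3600 / rec.window_s


def directly_comparable(a, b):
    """Raw counters may be compared only when the schema versions match."""
    return a.schema_version == b.schema_version


old_fw = CounterRecord(schema_version=1, window_s=60, retry_cnt=2)
new_fw = CounterRecord(schema_version=2, window_s=3600, retry_cnt=2)

print(directly_comparable(old_fw, new_fw))
print(retries_per_hour(old_fw), retries_per_hour(new_fw))
```

Both records report `retry_cnt = 2`, yet one means 120 retries per hour and the other 2. Storing the schema version and window in every record is what lets a migration map normalize them.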
