PoE PSE Controller (802.3af/at/bt): Design & Validation

Q: Class looks right, but port still trips at load step—first check policing window or inrush limit?

Likely cause: enforcement window too short (peak policing) or inrush timer/limit too tight for the PD step.
Quick check: compare I_peak window vs I_avg window; read trip reason + window_id; capture Iport with bandwidth ≥ X kHz over ≥ X ms.
Fix: widen peak window or raise inrush limit/timer within policy; align telemetry averaging with enforcement windows.
Pass criteria: no trip across X load steps (ΔP = X W) with trip rate ≤ X / hour and reason_code = none.

Q: Some PDs keep flapping every few minutes—MPS threshold too strict or auto-retry too aggressive?

Likely cause: MPS detect threshold/window rejects low-power/EEE behavior, or retry loop creates oscillation.
Quick check: log per-event state_reason; correlate flap period with MPS window; verify retry_count growth + backoff timing (X).
Fix: relax MPS threshold/window (or enable MPS-friendly mode); add bounded retry + exponential backoff; prevent immediate re-apply after drop.
Pass criteria: no flap for ≥ X minutes at low-load (Pport = X W) and retry_count ≤ X.

Q: Total PSU has headroom, but ports still deny power—allocator policy vs priority mismatch?

Likely cause: allocator uses conservative nameplate/requested budget or priority table blocks grant despite PSU margin.
Quick check: dump allocator inputs (P_total, P_alloc per-port, priority tier, deny reason); verify which accounting mode is active (nameplate/measured/requested).
Fix: correct accounting mode; adjust per-tier caps; ensure policy_version/commit_id updates atomically with tables.
Pass criteria: grant decisions match policy table for X scenarios; deny_reason only when P_total − ΣP_alloc < X W.

Q: False detect after ESD test—leakage path or debounce settings?

Likely cause: post-ESD leakage creates a fake signature, or detect debounce/retry is too sensitive.
Quick check: record detect V/I waveform in the detect window; compare before/after ESD; check debounce time and detect threshold drift (X).
Fix: increase debounce; tighten valid signature window; add “cool-down then re-detect” profile after an ESD-like event.
Pass criteria: false-detect rate ≤ X% over X cycles; detect_reason_code indicates valid only when PD is attached.

Q: Works with one PD vendor, fails with another—classification tolerance or startup ramp?

Likely cause: class current tolerance/multi-event expectation mismatch, or startup ramp/inrush profile conflicts with PD front-end.
Quick check: compare class measurement repeatability (σ ≤ X%) across vendors; log class result + event count; capture Vport ramp slope (dV/dt = X).
Fix: widen class tolerance (within standard intent); support required event mode; adjust ramp slope and inrush limit window.
Pass criteria: class result stable within ±X class-bin across X cycles; startup success ≥ X% for both vendors.

Q: Event log shows overload but scope current looks OK—measurement bandwidth/averaging artifact?

Likely cause: PSE senses a short peak that the scope setup averages out, or log uses a different window than the measurement.
Quick check: align measurement bandwidth (scope BW limit/LPF) and time window with the PSE sense path; read window_id; compare raw peak counter vs averaged telemetry; confirm sample rate ≥ X samples/s.
Fix: harmonize enforcement and reporting windows; add peak capture (max-hold) for fault events; tune filter/averaging to match policy windows.
Pass criteria: log peak and external measurement agree within ±X% over X events; false-overload events ≤ X / day.

Q: Thermal shutdown only on ports 5–8—layout heat coupling or per-port Rds(on) binning?

Likely cause: localized hot zone or higher loss path on those ports (effective Rds(on) or duty distribution).
Quick check: compare Tport/Tj (or proxy) across ports under identical load; correlate trips with Iport and on-time; verify per-port dissipation estimate (P≈I²·R, X).
Fix: redistribute high-power ports via policy; enable derating before shutdown; standardize FET selection/drive where applicable; reduce sustained limit for hot ports.
Pass criteria: port-to-port temperature delta ≤ X °C at Pport = X W; no thermal trips for X hours at worst-case load mix.

Q: 4-pair bt load causes 2-pair ports to drop—shared PSU droop or policy oversubscription?

Likely cause: transient V_in droop triggers UVLO or allocator shedding; policy underestimates bt ramp concurrency.
Quick check: log V_in minimum and UVLO reason; compare drop timing with bt startup; verify ΣP_alloc vs P_total window (X ms) during ramp.
Fix: add ramp staggering; increase droop tolerance where safe; tighten allocator concurrency limits; prioritize critical 2-pair ports.
Pass criteria: no 2-pair drops during bt ramp across X trials; V_in droop stays above UVLO + X V margin.

Q: Auto-retry causes network storm—backoff missing or retry criteria wrong?

Likely cause: retry has no hard cap or no increasing cool-down; the same fault re-triggers immediately.
Quick check: inspect retry_count and the retry-interval histogram; verify cool-down/backoff curve parameters; confirm latch-off conditions for repeated faults (X).
Fix: implement bounded retry (≤ X) + exponential backoff; escalate to latch-off after X repeats; log evidence per retry series.
Pass criteria: retry_count never exceeds X; minimum backoff grows to ≥ X s; storm events ≤ X / week.

Q: Port status OK but PD never powers—state machine stuck or interrupt lost?

Likely cause: host missed an interrupt/event edge, or the port state machine is waiting for a condition that never completes.
Quick check: read live port_state and last_state_reason; compare IRQ counter vs event log entries; verify INT pin toggles and IRQ batching settings (X).
Fix: enable level-triggered/latched interrupts; add periodic state polling watchdog; reset the port FSM on stale state timeout (X ms).
Pass criteria: no stale state beyond X ms; IRQ counter matches event log within ±X events over X hours.


A PoE PSE controller is the “power traffic cop” that detects, classifies, allocates, and protects per-port power—so multi-port systems stay deterministic under load, faults, and brownouts.

This page turns 802.3af/at/bt into executable knobs and pass/fail criteria (detect → class → inrush/MPS → policing → recovery → telemetry) to make bring-up and production repeatable.

H2-1 · What a PoE PSE Controller Does (and what it does NOT)

Fast answer (what it is)

Detect: decide whether a connected device looks like a valid PD (signature window + debounce).
Classify: infer the PD’s power intent (Type/Class → power budget knobs).
Power-up: ramp port power safely (inrush control, timers, undervoltage checks).
Police: enforce per-port limits (current/power windows, priority/deny policy).
Protect + log: respond to faults fast and record evidence (V/I/T + fault counters).
Telemetry: expose per-port state, measurements, and events to MCU/SoC.

Scope boundary (avoid buying / designing the wrong block)

In-scope for a PSE controller

Per-port detection & classification state machine.
Per-port power ramp, inrush timers, MPS/keep-power checks (at a controller level).
Per-port policing (power/current limits), fault response, retry/backoff policy, and logging hooks.
Host interface for control and telemetry (I²C/SMBus/SPI + IRQ/GPIO patterns).

Not a PSE controller (out-of-scope here)

Not a switch / TSN engine: VLAN/QoS/TSN scheduling belongs to switching/TSN pages.
Not a PD controller: PD-side signature/MPS/DC-DC behaviors belong to the PD page.
Not the 54V power plant: PSU sizing and isolated conversion belong to power-stage pages.
Not magnetics / surge design: RJ45 magnetics, CMC, TVS/surge return belong to protection/layout pages.

Endspan vs Midspan (PSE-controller-relevant differences only)

Power insertion point: Endspan injects power inside the switch; Midspan injects power inline. This changes how port power events map to system logs and service access.
Management ownership: Endspan typically integrates with switch/SoC management; Midspan often needs a standalone MCU for per-port telemetry and fault policy.
Budget enforcement: Both need per-port policing, but Midspan designs often require stricter global budget logic because the data path and power path are managed separately.

Practical deliverables (used later in this page)

Power budget fields: P_total, P_port_max, priority, deny_policy, averaging_window (X), thermal headroom (X).
Event/log fields: timestamp, port_id, port_state, class_result, fault_id, V/I/T snapshot, retry_count.
Validation matrix headings: detect window sanity, classification repeatability, inrush/ramp, policing accuracy, fault response time, recovery backoff.

Diagram: System view (controller boundary highlighted)

H2-2 · Standards Map: 802.3af / 802.3at / 802.3bt in PSE Terms

The key translation: “standard terms” → “PSE knobs” → “pass criteria”

For a PSE design, 802.3af/at/bt is most useful as a set of controllable knobs: Type/Class drives detection/classification behavior, 2-pair vs 4-pair power delivery intent, and how strict the port power policing must be. Every knob must map to a measurable pass criterion (threshold placeholders X).

Datasheet must-check (so “bt-ready” means something measurable)

Supported range: af / at / bt support, and whether 4-pair modes are supported.
Classification capabilities: classification mode(s), event support, repeatability hooks (counters/flags).
Policing knobs: per-port power limit (P_limit), current limit (I_limit), averaging window (X), and deny policy.
Startup controls: inrush limit / ramp timing (t_inrush, X), UVLO/OV thresholds, and state transitions.
Fault strategy: auto-retry vs latch-off, backoff (X), thermal handling, and per-fault counters.
Telemetry + events: per-port V/I/P/T (if available), IRQ behavior, and log-friendly registers.
System policy hook (optional): ability to accept host-set budgets (static or measured). LLDP negotiation belongs to the system/software layer.

Compatibility boundary (policy-level, not a standards lecture)

A high-capability PSE should not “force” higher power into a lower-capability PD. The port should supply power according to the detected/classified intent, and fall back to a conservative budget if intent is uncertain.

If classification is stable: allocate budget based on class result → apply policing knobs.
If classification is ambiguous: use a conservative budget (or deny) and log the reason.
If an optional event is not supported: fall back to the basic class path and enforce conservative limits.
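
The three fallback rules above can be sketched as a single decision function. This is a minimal illustration, not a controller API: `select_budget`, its parameters, the reason strings, and the `conservative_w` default are hypothetical. The per-class wattages are the nominal PSE-side figures commonly quoted for 802.3af/at/bt and should be checked against the standard and the controller datasheet.

```python
from typing import Optional, Tuple

# Nominal PSE-side power per class in watts, as commonly quoted for
# 802.3af/at/bt. Verify against the standard and the controller datasheet.
CLASS_POWER_W = {0: 15.4, 1: 4.0, 2: 7.0, 3: 15.4, 4: 30.0,
                 5: 45.0, 6: 60.0, 7: 75.0, 8: 90.0}


def select_budget(class_code: Optional[int], class_valid: bool,
                  event_supported: bool = True, pse_max_class: int = 8,
                  conservative_w: float = 15.4,
                  deny_on_ambiguous: bool = False) -> Tuple[float, str]:
    """Return (P_alloc in W, reason_code) for one port (hypothetical policy)."""
    # Ambiguous classification: conservative budget or deny, with a reason.
    if not class_valid or class_code not in CLASS_POWER_W:
        if deny_on_ambiguous:
            return 0.0, "deny_class_ambiguous"
        return conservative_w, "fallback_class_ambiguous"

    p_alloc = CLASS_POWER_W[class_code]
    reason = "class_ok"

    # Never grant more than the PSE itself is built to deliver.
    p_cap = CLASS_POWER_W[pse_max_class]
    if p_alloc > p_cap:
        p_alloc, reason = p_cap, "capped_to_pse_capability"

    # Optional event not supported: basic class path with a conservative cap.
    if not event_supported and p_alloc > conservative_w:
        p_alloc, reason = conservative_w, "fallback_event_unsupported"

    return p_alloc, reason
```

The reason code is returned alongside the budget so that every grant, cap, or deny can be written to the event log without a second lookup.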

Type/Class → PSE knobs (engineering view)

Use this mapping to turn standards language into configuration fields and testable thresholds.

Standard term	PSE knobs to set	Pass criteria (X placeholders)
Type / Class	P_limit, priority, deny_policy	P_limit accuracy within ±X% over Y s; deny reason logged
2-pair / 4-pair intent	pairing_mode, per-pair limits (if exposed)	pairing mode matches configuration; no unexpected brownout on load steps (X)
Classification event support	event_mode, class retry/debounce	class result repeatable in ≥ X of 100 plug cycles; logs include event flags
Power-up / inrush constraints	I_limit_inrush, t_inrush, ramp profile	no false trip during ramp; ramp time within X ms; peak I within X A
Policing windows	averaging_window, peak window (X)	no nuisance trips for valid load steps; faults correlate with V/I/T snapshot

Diagram: Standards map (Type/Class translated into PSE knobs)

H2-3 · Per-Port Power Path Architecture (Blocks you must budget)

Fast summary (what this chapter locks down)

Port power path is treated as a measurable chain: 54V_in → pass element → sense → port node → RJ45 center tap.
No missing blocks: pass FET (internal/external), current sense, gate driver, detect/class engine, telemetry/ADC, temp sense, fault logic.
Budget fields are defined per block (Rds(on), sense accuracy/bandwidth, thermal resistance, protection thresholds) and tied to test hooks (Vport/Iport/Tj).

Typical per-port blocks (what must exist somewhere in the design)

Pass FET: the controllable power element; determines conduction loss and thermal stress.
Current sense path: shunt/mirror + measurement chain; required for policing accuracy and fault detection.
Gate driver: controls ramp/inrush and prevents oscillation when external FETs are used.
Detect + class engine: front-end that runs the port state machine (detect → classify → power).
Telemetry (ADC + averaging): V/I/P/T snapshots, counters, and time windows used by policy and forensics.
Temp sense + fault logic: fast trip + recover strategy; must produce loggable evidence.

Two integration forms (engineering trade-offs only)

Fully integrated (internal FET) simplifies routing and reduces external BOM, but concentrates heat in one package and makes per-port thermal coupling a primary constraint.

Design focus: junction temperature margin, airflow/heatsinking, and consistent port-to-port derating.
Common failure mode: a subset of ports trips first due to heat coupling, not due to power policy.

Controller + external FET scales power and improves thermal spreading, but layout parasitics and measurement integrity become the dominant risks.

Design focus: gate loop stability, sense routing, and keeping the port node measurable and clean.
Common failure mode: nuisance trips caused by bandwidth/window mismatch or ringing at the gate/port node.

Port budget fields (use as a checklist and a test plan input)

These fields make later chapters measurable: inrush, policing, faults, and telemetry all depend on them.

Block	Budget fields	Why it matters	Pass criteria (X placeholders)
Pass FET	Rds(on), package thermal (θ), heat coupling	Sets conduction loss and the first thermal trip point	Tj margin ≥ X °C at P_port_max; no thermal trip in Y min
Current sense	Rsense, tolerance, offset/gain, bandwidth (BW)	Governs policing accuracy and nuisance trip behavior	I_limit within ±X%; load-step causes no false trip within window X
Gate driver	Gate loop parasitics, ramp slope, blanking timer (X)	Controls inrush stability and prevents oscillation/overshoot	Vport ramps smoothly; peak I ≤ X A; no ringing-triggered trip
Telemetry / ADC	Sampling rate, averaging window (X), snapshots (V/I/T)	Enables correlation between faults and evidence	Fault log includes V/I/T within X ms of event; counters monotonic
Fault thresholds	I_limit, P_limit, trip time, retry/backoff (X)	Prevents damage and avoids retry storms	Trip time ≤ X ms; backoff grows to ≥ X ms; no flapping
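
As a worked example of the Pass FET and Current sense rows, the sketch below estimates conduction loss and junction-temperature margin for one port. It assumes a lumped steady-state model (Tj = Ta + P·θ) and ignores port-to-port heat coupling, which the integrated case makes significant. All numbers and function names are hypothetical placeholders, not datasheet values.

```python
def conduction_loss_w(i_port_a: float, r_path_ohm: float) -> float:
    """P = I^2 * R for the series path (Rds(on) + Rsense)."""
    return i_port_a ** 2 * r_path_ohm


def junction_temp_c(t_ambient_c: float, p_loss_w: float,
                    theta_c_per_w: float) -> float:
    """Lumped steady-state estimate: Tj = Ta + P * theta."""
    return t_ambient_c + p_loss_w * theta_c_per_w


def tj_margin_c(tj_max_c: float, t_ambient_c: float, i_port_a: float,
                rds_on_ohm: float, rsense_ohm: float,
                theta_c_per_w: float) -> float:
    """Margin between the allowed Tj and the estimated Tj for one port."""
    p_loss = conduction_loss_w(i_port_a, rds_on_ohm + rsense_ohm)
    return tj_max_c - junction_temp_c(t_ambient_c, p_loss, theta_c_per_w)


# Hypothetical example: 0.6 A port, 0.10 ohm FET, 0.25 ohm sense resistor,
# 60 C/W effective thermal resistance, 70 C ambient, 125 C Tj limit.
EXAMPLE_MARGIN_C = tj_margin_c(125.0, 70.0, 0.6, 0.10, 0.25, 60.0)
```

With these placeholder inputs the loss is about 0.126 W per port and the margin is about 47 °C. Because loss scales with I², the same path at double the current dissipates four times as much, which is why the margin must be checked at P_port_max.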

Diagram: Port BOM block (54V_in → RJ45 center tap) with test hooks

H2-4 · Detection: Signature, Valid/Invalid, and false-detect traps

Detection logic (keep it as an observable state machine)

Idle: port is off; waits for a connection event or periodic probe.
Detect: apply a controlled probe and check the signature inside a V/I window.
Decision: valid/invalid decision with debounce and sanity checks.
Class: proceed to classification only after detection is stable.

The dominant failure mode is rarely “no PoE”; it is usually a window/timing mismatch: the probe result is correct, but the time window, debounce, or retry policy converts it into flapping.

Detection timeline (V/I window → debounce → retry/backoff)

Debug detection by aligning evidence to the timeline. If a false detect happens, it must appear as out-of-window samples, insufficient debounce, or an overly aggressive retry loop.
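
A minimal sketch of the Detect → Decision step, assuming a two-point probe (the slope ΔV/ΔI cancels a constant diode offset) and a debounce expressed as consecutive in-window results. The function names and the debounce scheme are illustrative. The 19 kΩ to 26.5 kΩ accept window is the commonly cited PSE acceptance range and should be confirmed against the standard and the controller's actual thresholds.

```python
from typing import List, Optional, Tuple

# Commonly cited PSE acceptance window for the PD signature resistance.
ACCEPT_MIN_OHM = 19_000.0
ACCEPT_MAX_OHM = 26_500.0

ProbePair = Tuple[float, float, float, float]  # (v1, i1, v2, i2)


def signature_resistance(v1: float, i1: float,
                         v2: float, i2: float) -> Optional[float]:
    """Two-point slope dV/dI; returns None if the current did not rise."""
    di = i2 - i1
    if di <= 0:
        return None
    return (v2 - v1) / di


def detect_decision(samples: List[ProbePair], debounce_hits: int) -> str:
    """Declare 'valid' only after debounce_hits consecutive in-window probes."""
    consecutive = 0
    for pair in samples:
        r = signature_resistance(*pair)
        if r is not None and ACCEPT_MIN_OHM <= r <= ACCEPT_MAX_OHM:
            consecutive += 1
            if consecutive >= debounce_hits:
                return "valid"
        else:
            consecutive = 0  # one out-of-window probe restarts the debounce
    return "invalid"
```

A single out-of-window probe resets the counter, so a leakage path that only occasionally looks valid cannot accumulate enough hits to pass.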

False-detect traps (symptom → likely cause → quick check)

Symptom	Likely cause (measurable)	Quick check
Detect succeeds on bench, fails in the field	Cable/port capacitance shifts probe waveform outside the sampling window	Compare detect V/I snapshots vs window (X); check debounce hits and retry count
Random false detect after ESD/surge events	Residual leakage path or bias shift causes “valid-looking” samples	Record detect attempts and invalid reasons; correlate with humidity/temperature logs
Port flaps on plug/unplug (connect bounce)	Debounce too short or retry policy too aggressive	Check debounce counter and retry/backoff (X); confirm stable decision in Y ms
Some ports detect reliably, others never do	Board-level leakage/contamination or port-to-port parasitic differences	Swap cable/PD across ports; compare detect V/I and invalid reasons; inspect port leakage paths

Observability requirements (minimum evidence to debug detection)

Counters: detect_attempt_count, valid_count, invalid_count, debounce_hit_count, retry_count.
State capture: last_state (Idle/Detect/Decision/Class) and last_invalid_reason code.
Snapshots: last_detect_V/I snapshot (if supported) aligned to the detect window.
Policy: configured window/debounce/backoff values stored alongside logs for correlation.

Diagram: Detection state machine + timeline (where false-detect traps appear)

H2-5 · Classification: Classes, Events, and How PSE decides power intent

What “classification” must produce in engineering terms

Classification is only useful when it becomes fields that can be logged, validated, and consumed by power policy. Treat it as an observable port output, not a checkbox feature.

Primary result: class_code (power intent bucket).
Event flags: multi-event / mode flags that refine intent.
Validity: valid/invalid + reason code (evidence for failures).
Attempts: attempt_count / retry_count (to catch flapping early).
Timing: class_time_ms (latency budget for link bring-up).
Snapshot (if supported): V/I sample tied to the class window.

Classification flow (logic chain + configurable knobs)

Precondition: Detection is stable (Detect OK). Classification should not mask detection defects.
Probe: Apply a controlled class probe and capture V/I evidence inside a defined window.
Decision: Convert samples into class_code with a validity flag and a reason code.
Optional events: Multi-event modes refine intent when supported and verified.
Finalize: Publish class_result fields into the policy input stage.

Config knobs to document and log with results: class_mode, t_class (X), t_gap (X), max_attempts, fallback_policy, event_mode.

Multi-event: engineering meaning (why it matters)

More precise intent: refines how much budget a port should request and how strict policing must be.
Higher-power path control: enables a controlled transition to higher port budgets when evidence is strong.
Safe fallback: if event_flags are missing or invalid, policy must degrade to conservative limits.

Conservative rule: invalid or incomplete event evidence → use lower P_alloc and tighter P_limit, and log the reason code for field forensics.

Verification: accuracy, repeatability, and stability (measurement criteria only)

Accuracy: expected_class vs observed_class; event_flags match expectation; invalid_reason rate ≤ X.
Repeatability: N plug cycles, class consistency ≥ X%; port-to-port variance ≤ X.
Stability: across input voltage and temperature, no class flapping; class_time_ms ≤ X; retries ≤ X.

Recommended log row: port_id, expected_class, observed_class, event_flags, attempts, class_time_ms, invalid_reason, V/I snapshot.
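
The repeatability and stability criteria can be evaluated directly from such log rows. The sketch below assumes each row is a dict using the field names from the recommended log row. The function name and the threshold arguments are hypothetical stand-ins for the X placeholders.

```python
from typing import Dict, List


def repeatability_report(rows: List[Dict], expected_class: int,
                         min_consistency_pct: float,
                         max_class_time_ms: float,
                         max_attempts: int) -> Dict:
    """Summarize N plug cycles for one port against pass thresholds."""
    if not rows:
        raise ValueError("no plug cycles recorded")
    matches = sum(1 for r in rows if r["observed_class"] == expected_class)
    consistency_pct = 100.0 * matches / len(rows)
    worst_time_ms = max(r["class_time_ms"] for r in rows)
    worst_attempts = max(r["attempts"] for r in rows)
    invalid_count = sum(1 for r in rows if r.get("invalid_reason"))
    passed = (consistency_pct >= min_consistency_pct
              and worst_time_ms <= max_class_time_ms
              and worst_attempts <= max_attempts)
    return {
        "cycles": len(rows),
        "consistency_pct": consistency_pct,
        "worst_class_time_ms": worst_time_ms,
        "worst_attempts": worst_attempts,
        "invalid_count": invalid_count,
        "passed": passed,
    }
```

Running the same report per port and comparing `consistency_pct` across ports gives the port-to-port variance figure.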

Diagram: Class result → policy inputs (P_alloc / P_limit / priority)

H2-6 · Power Allocation Policy (per-port budget, priority, and policing)

Three budget semantics (nameplate / measured / requested)

Nameplate (static): planned per-port caps; predictable and conservative.
Measured (dynamic): use telemetry to reclaim headroom and detect abnormal draw.
Requested (system interface only): an upper bound hint; never overrides safety limits.

Practical rule: P_alloc should be derived from class evidence + system budget, then guarded by measured reality and derating margins.

Per-port policy object (what allocation must set)

P_alloc: assigned budget (may be lower than class maximum when oversubscribed).
P_limit: sustained power policing limit (avg window X).
I_limit: peak / short protection (peak window X or blanking X).
priority: survival order when total power is insufficient.
deny_policy: admit/deny for new loads when headroom is gone.
fault_backoff: retry timing to prevent flapping (X).

Oversubscription behavior (deterministic, no flapping)

When total PSU power cannot satisfy all ports, behavior must be predictable and stable.

Admit control: deny or downgrade new ports first (no surprise brownout).
Throttle: lower P_limit on lower priority ports using windows to avoid nuisance trips.
Preempt: hard-off the lowest priority ports with backoff to stop oscillation.

Determinism rule: equal-priority ports should have a stable tie-break (e.g., last-in, most faults, or highest draw), and the choice must be logged.
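
A minimal sketch of admit control with a stable tie-break, assuming lower `priority` numbers are more important and that a "last-in loses" rule breaks ties. `PortRequest`, `allocate`, and the reason strings are hypothetical. Throttle and preempt actions are omitted for brevity.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class PortRequest:
    port_id: int
    priority: int        # assumption: lower number = more important
    p_request_w: float
    admit_order: int     # sequence number of the request (for tie-break)


def allocate(requests: Iterable[PortRequest], p_total_w: float,
             guardband_w: float = 0.0) -> Dict[int, Tuple[float, str]]:
    """Grant budgets in a fixed order so results never depend on input order."""
    budget = p_total_w - guardband_w
    decisions: Dict[int, Tuple[float, str]] = {}
    # Sort key: priority first, then earlier requests win, then port_id.
    ordered = sorted(requests,
                     key=lambda r: (r.priority, r.admit_order, r.port_id))
    for r in ordered:
        if r.p_request_w <= budget:
            budget -= r.p_request_w
            decisions[r.port_id] = (r.p_request_w, "granted")
        else:
            decisions[r.port_id] = (0.0, "deny_insufficient_headroom")
    return decisions
```

In this sketch a small lower-priority request can still be granted after a larger higher-priority one is denied. Whether that is acceptable is a policy decision; a stricter variant would stop granting at the first denial.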

Policing (P_limit + I_limit) with averaging and peak windows

P_limit (avg): sustained power gate aligned to avg_window (X).
I_limit (peak): fast protection for transients/shorts aligned to peak_window or blanking (X).
Window integrity: windows must match sense bandwidth and telemetry sampling; mismatch causes nuisance trips or slow fault response.

Verification hooks: log window values with each fault so field reports can distinguish policy decisions from physical defects.
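
The two-window idea can be sketched as a small per-port policer: a sliding average for P_limit and a consecutive-sample count for I_limit. Window lengths are expressed in samples for simplicity. The class name, trip strings, and sample-based windows are assumptions, not a description of any specific controller.

```python
from collections import deque
from typing import Optional


class PortPolicer:
    """Sliding-average power policing plus short-window peak current policing."""

    def __init__(self, p_limit_w: float, i_limit_a: float,
                 avg_window_samples: int, peak_window_samples: int):
        self.p_limit_w = p_limit_w
        self.i_limit_a = i_limit_a
        self.peak_window_samples = peak_window_samples
        self._power = deque(maxlen=avg_window_samples)
        self._over_count = 0

    def update(self, v_port: float, i_port: float) -> Optional[str]:
        """Feed one V/I sample; return a trip reason or None."""
        self._power.append(v_port * i_port)

        # Peak path: trip only if the overcurrent persists for the peak window.
        if i_port > self.i_limit_a:
            self._over_count += 1
            if self._over_count >= self.peak_window_samples:
                return "trip_i_limit_peak"
        else:
            self._over_count = 0

        # Average path: evaluate only once the averaging window is full.
        if len(self._power) == self._power.maxlen:
            if sum(self._power) / len(self._power) > self.p_limit_w:
                return "trip_p_limit_avg"
        return None
```

A one-sample spike passes both checks, a sustained overcurrent trips the peak path, and a sustained overpower trips the average path. Logging the two window lengths with each trip makes the distinction visible in field reports.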

Output template: Power Budget Sheet (field schema)

Scope	Fields	Pass criteria (X placeholders)
System	P_PSU_total, derating, guardband, thermal_cap (X), policy_version	No brownout under worst-case plug-in; headroom ≥ X%
Per-port	port_id, class_result, priority, P_alloc, P_limit, I_limit, avg_window (X), peak_window (X), deny_policy, backoff (X)	Oversubscription actions are deterministic; no port flapping beyond X retries
Evidence	measured_P, V/I/T snapshots, fault_count, last_event, timestamp, policy_snapshot	Fault record includes V/I/T within X ms; counters monotonic; policy snapshot present

Diagram: Total PSU budget → per-port allocator (admit / throttle / preempt)

H2-7 · Startup, Inrush, and Maintain Power Signature (MPS) without surprises

Define a “clean startup” as measurable criteria (not a feeling)

Ramp completion: Vport crosses V_on (X) within t_ramp (X) without oscillation.
Inrush bounded: Iport peak ≤ I_inrush_peak (X), and current-limit plateau ≤ X ms.
No nuisance trips: policing thresholds (P_limit/I_limit) do not trigger during legitimate bring-up.
MPS continuity: MPS checks pass under light-load and low-power states (no false disconnect).

Soft-start / slope control / inrush limiting: the three-knob coupling

Startup robustness comes from aligning t_ramp, I_inrush, and Vport thresholds to the port path and the connected load.

Too-fast ramp: Iport spikes → hits limit → Vport dips → restart loops.
Too-tight inrush limit: Vport rises too slowly → timeouts or unstable intermediate states.
Misaligned thresholds: “looks powered” but the controller never declares stable power.
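
The coupling between the inrush limit and the ramp timer can be checked with a first-order estimate: charging a bulk capacitance C through a constant current limit I over a voltage step ΔV takes roughly t = C·ΔV / I. The sketch assumes the PD draws no load current until the ramp completes. Function names and example values are hypothetical.

```python
def inrush_charge_time_s(c_bulk_f: float, v_step_v: float,
                         i_inrush_a: float) -> float:
    """First-order charge time of the PD bulk capacitor at constant current."""
    if i_inrush_a <= 0:
        raise ValueError("inrush current limit must be positive")
    return c_bulk_f * v_step_v / i_inrush_a


def ramp_fits_timer(c_bulk_f: float, v_step_v: float, i_inrush_a: float,
                    t_inrush_s: float) -> bool:
    """True if the estimated ramp completes before the inrush timer expires."""
    return inrush_charge_time_s(c_bulk_f, v_step_v, i_inrush_a) <= t_inrush_s
```

For example, 220 µF charged through 0.45 A over a 54 V step needs about 26.4 ms, which fits a 50 ms timer. At 470 µF the same limit needs about 56.4 ms and would time out, matching the "too-tight inrush limit" case above.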

MPS: why light-load power saving can look like a disconnect

Maintain Power Signature (MPS) is a keep-alive decision window. If the measured signature falls outside the MPS window, the port may be declared “not present” even while the cable is intact.

Light-load: average draw drops below the MPS threshold (X) during idle/sleep.
Bursty load: peaks exist, but the window average misses them (window mismatch).
Accounting mismatch: telemetry filtering is not aligned to the MPS decision window, hiding true behavior.

Engineering rule: log the MPS window ID and the matching sampling/averaging settings so field logs explain dropouts.
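
A minimal sketch of an MPS decision, assuming the port counts as "present" once current stays at or above a threshold for `t_on_ms`, and is dropped when time below the threshold accumulates to `t_dropout_ms`. Short pulses that never reach `t_on_ms` pause the dropout timer but do not reset it. The function name, this timer behavior, and the values used in the note below are illustrative; actual thresholds come from the standard and the controller datasheet.

```python
from typing import Optional, Sequence


def first_mps_dropout_ms(i_samples: Sequence[float], dt_ms: float,
                         i_mps_a: float, t_on_ms: float,
                         t_dropout_ms: float) -> Optional[float]:
    """Return the time (ms) of the first MPS dropout, or None if none occurs."""
    on_run_ms = 0.0    # length of the current above-threshold run
    absent_ms = 0.0    # accumulated below-threshold time since last valid MPS
    for n, i in enumerate(i_samples):
        if i >= i_mps_a:
            on_run_ms += dt_ms
            if on_run_ms >= t_on_ms:
                absent_ms = 0.0  # valid MPS seen: restart the dropout timer
        else:
            on_run_ms = 0.0
            absent_ms += dt_ms
            if absent_ms >= t_dropout_ms:
                return (n + 1) * dt_ms
    return None
```

With a 10 mA threshold, 60 ms on-time, and 300 ms dropout, a PD that pulses for 75 ms every 325 ms stays powered. A PD that pulses for only 20 ms every 120 ms is dropped even though its peaks exceed the threshold, which is the bursty-load case listed above.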

Plug events and brownout: bounded retries with backoff and clear recovery gates

Plug bounce: state machines re-enter detect/class/startup; retries must be bounded.
Brownout: V_in dips → Vport falls below a recovery threshold; recovery should wait for stable V_in (X ms).
Retry storm prevention: retry_count_max (X) + stepped/exponential backoff (X) + a stable recovery condition.

Minimum startup log fields: timestamp, port_state (startup/inrush/steady/mps_check), V/I snapshots, inrush_peak, t_ramp_measured, mps_fail_count, retry_count.

Diagram: Vport & Iport startup waveform + MPS window (X placeholders)

H2-8 · Fault Handling: short/overload/thermal/surge events and recovery modes

Fault taxonomy: classify first, then recover (for field and production consistency)

Fault handling must map symptoms to a stable fault_id so recovery actions are deterministic and comparable across ports and units.

short: fast protection path, typically requires latch-off or bounded retries.
overcurrent / overload: sustained draw beyond limits; may allow throttle + retry.
overpower: P_limit violation (windowed); often policy-controlled.
overtemp: thermal trigger; recovery gated by cool-down threshold (X).
undervoltage: distinguish system V_in dip vs port-local dip (inrush/policing).
detect anomaly: state machine instability; quarantine and bounded retries.

Recovery mode templates (avoid retry storms by construction)

auto-retry: allowed only with retry_count_max (X) and backoff (X).
latch-off: for hard faults or repeated trips; requires an explicit clear action or long cool-down.
cool-down: recovery is gated by temperature falling below a threshold (X), not by a blind timer.
backoff: stepped or exponential delays prevent power cycling loops and PSU collapse.

Determinism rule: each fault_id should map to a default recovery profile and a bounded retry envelope.
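
One way to express this mapping is a static table from fault_id to a recovery profile plus a function that returns the next action. The sketch below is illustrative: `RecoveryProfile`, `PROFILES`, the action strings, and every numeric value are hypothetical placeholders for the X thresholds.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RecoveryProfile:
    mode: str                        # "auto_retry", "latch_off", or "cool_down"
    retry_count_max: int = 0
    backoff_base_s: float = 0.0
    backoff_cap_s: float = 0.0
    cool_below_c: Optional[float] = None   # only used by "cool_down"


# Hypothetical default profiles; real values are product decisions.
PROFILES = {
    "short": RecoveryProfile("latch_off"),
    "overload": RecoveryProfile("auto_retry", 3, 1.0, 30.0),
    "overtemp": RecoveryProfile("cool_down", 2, 5.0, 60.0, cool_below_c=90.0),
}
DEFAULT_PROFILE = RecoveryProfile("latch_off")  # unknown faults fail safe


def next_action(fault_id: str, retry_count: int,
                tj_c: Optional[float] = None) -> Tuple[str, Optional[float]]:
    """Return (action, delay_s). Retries are bounded; backoff is exponential."""
    p = PROFILES.get(fault_id, DEFAULT_PROFILE)
    if p.mode == "latch_off" or retry_count >= p.retry_count_max:
        return "latch_off", None
    if p.mode == "cool_down" and (tj_c is None or tj_c >= p.cool_below_c):
        return "wait_cool_down", None   # gated by temperature, not a timer
    delay_s = min(p.backoff_base_s * 2 ** retry_count, p.backoff_cap_s)
    return "retry", delay_s
```

Because the retry bound is checked before anything else, no fault_id can produce an unbounded retry series, and an unrecognized fault_id latches off rather than retrying.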

Logging criteria: minimum evidence per fault event

Each fault event should record a reproducible “evidence packet” so field data explains both root cause category and recovery behavior.

timestamp (monotonic time reference)
fault_id + port_state (detect/class/startup/steady/mps)
V/I/T snapshots aligned to the trip window
retry_count + backoff stage + recovery mode
policy snapshot (P_limit, I_limit, window IDs) for attribution

Diagram: Fault tree (symptom → fault_id category → quick check)

H2-9 · Telemetry & Control: Interfaces, registers, interrupts, and “black-box” hooks

Control plane vs event plane (maintenance starts here)

Control plane: I²C / SMBus / SPI for configuration, readbacks, and versioned policy commits.
Event plane: GPIO pins (INT / FAULT / PG) for low-latency notifications and “no-miss” fault capture.
Evidence plane: black-box snapshots (pre/post X) aligned to the same window IDs used by decisions.

Engineering rule: interrupts report “something happened”; registers explain “what and why”; black-box preserves “how it evolved”.

Minimum telemetry set (portable across ports, boards, and firmware)

A system is diagnosable only if logs bind values to a decision context (window/debounce/policy version).

port_state: detect · class · startup · steady · mps_check · fault
Vport / Iport / Pport: with window_id (X) to match policing/MPS decisions
temperature: Tj or local sensor + thermal flags
class result: class/type + validity + reason code
fault counters: per-fault_id counts + last_trip_time
energy (if available): cumulative energy for slow overload / thermal accumulation attribution

Event reporting: aggregation, thresholds, debounce, and window alignment

Interrupt aggregation: avoid GPIO storms by grouping per-port events into a single system INT with per-port cause bits.
Threshold alarms: V/I/P/T thresholds should reference the same window used by enforcement.
Debounce (X): suppress noise and transient spikes; log debounce_id for auditability.
Sampling window (X): window_id must be recorded alongside every latched alarm and trip decision.

If telemetry uses one averaging window and decisions use another, field logs will never explain dropouts.
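
Interrupt aggregation can be sketched as a two-level read: a port-summary bitmask says which ports have pending events, and a per-port cause register says why. The bit layout, register access callback, and cause names below are entirely hypothetical. Real controllers define their own register maps.

```python
from typing import Callable, Dict, List

# Hypothetical per-port cause bits (bit index -> cause name).
CAUSE_BITS = {
    0: "detect_done",
    1: "class_done",
    2: "power_good_change",
    3: "overcurrent",
    4: "overtemp",
    5: "mps_fail",
}


def decode_events(port_summary: int, read_cause: Callable[[int], int],
                  window_id: str, num_ports: int = 8) -> List[Dict]:
    """Expand one system INT into per-port events tagged with window_id."""
    events: List[Dict] = []
    for port in range(num_ports):
        if not port_summary & (1 << port):
            continue                      # no pending event on this port
        cause = read_cause(port)          # e.g. an I2C register read
        for bit, name in CAUSE_BITS.items():
            if cause & (1 << bit):
                events.append({"port_id": port, "cause": name,
                               "window_id": window_id})
    return events
```

Attaching `window_id` at decode time keeps each latched alarm tied to the window that produced it, even if the policy is changed before the log is read.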

“Black-box” hooks: preserve evidence packets, not just counters

Each fault or unusual event should trigger an evidence packet with pre/post snapshots and a configuration fingerprint.

pre-trigger (X): V/I/P/T + port_state + key counters
trigger: fault_id + reason + threshold snapshot
post-trigger (X): recovery mode + backoff stage + retry evolution
fingerprint: fw version + config hash + policy_version + window/debounce IDs
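
The four-part packet maps naturally onto a ring buffer for pre-trigger samples plus a short post-trigger capture. The sketch below is one possible host-side structure. `BlackBox` and its methods are hypothetical, and window lengths are counted in samples rather than time.

```python
from collections import deque
from typing import Dict, List, Optional


class BlackBox:
    """Keeps the last pre_n snapshots and captures post_n more after a trigger."""

    def __init__(self, pre_n: int, post_n: int, fingerprint: Dict):
        self._pre = deque(maxlen=pre_n)
        self._post_n = post_n              # expected to be >= 1
        self._fingerprint = dict(fingerprint)
        self._pending: Optional[Dict] = None
        self.packets: List[Dict] = []

    def sample(self, snapshot: Dict) -> None:
        """Record one periodic snapshot (V/I/P/T, port_state, counters)."""
        if self._pending is not None:
            self._pending["post"].append(snapshot)
            if len(self._pending["post"]) >= self._post_n:
                self.packets.append(self._pending)   # packet is complete
                self._pending = None
        self._pre.append(snapshot)

    def trigger(self, fault_id: str, reason: str, thresholds: Dict) -> None:
        """Freeze the pre-trigger history and start the post-trigger capture."""
        if self._pending is not None:
            return   # keep the first trigger; a real design might count repeats
        self._pending = {
            "pre": list(self._pre),
            "trigger": {"fault_id": fault_id, "reason": reason,
                        "thresholds": dict(thresholds)},
            "post": [],
            "fingerprint": self._fingerprint,
        }
```

Copying the pre-trigger buffer at trigger time means later samples cannot overwrite the history that led up to the fault.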

Diagram: Telemetry loop (PSE → log/counters → MCU policy → PSE)

H2-10 · Engineering Checklist (Design → Bring-up → Production)

Three-gate workflow: build repeatability into the process

The checklist is not a list of tips. It is a gated pipeline with explicit inputs, outputs, and owners. No gate pass → no next stage.

Design Gate: budgets and observability defined before hardware spin.
Bring-up Gate: state machines validated with evidence packets and fault injection.
Production Gate: thresholds locked, test criteria frozen, configs versioned.

Design Gate (inputs → checks → outputs)

Power budget artifacts: per-port P_alloc plan + oversubscription policy (X).
Thermal planning: worst-case port dissipation + hotspot risk + derating knobs.
Protection & policies: P_limit/I_limit/MPS windows and recovery templates (X).
Observability contract: minimum telemetry set + window_id/debounce_id + black-box packet schema.
Owner: (Owner) · Pass: artifacts reviewed and versioned.

Bring-up Gate (prove the state machine with evidence)

Detect/Class/Startup/MPS: scope waveforms mapped to window IDs (X) and pass criteria (X).
Fault injection: short / overload / thermal / UV triggers produce correct fault_id and bounded recovery.
Closed-loop verification: event → log → policy update → apply confirmation (commit_id).
Owner: (Owner) · Pass: evidence packets captured for each key test.

Production Gate (freeze thresholds, freeze criteria, freeze configs)

Threshold lock: window_id/debounce_id and trip thresholds are fixed and versioned.
Test criteria lock: pass criteria (X) uses the same measurement windows as enforcement.
Config fingerprint: fw version + config hash + policy_version recorded per unit.
Audit sampling: periodic fault-injection sampling with evidence packet retention.
Owner: (Owner) · Pass: manufacturing can reproduce results across stations.
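
A config fingerprint can be as simple as a hash over a canonical serialization of the locked thresholds. The sketch below uses SHA-256 over sorted-key JSON so that the same settings always produce the same hash regardless of key order. The function name and the field names in the usage note are illustrative.

```python
import hashlib
import json
from typing import Dict


def config_fingerprint(fw_version: str, policy_version: str,
                       config: Dict) -> Dict[str, str]:
    """Build a per-unit fingerprint record from the frozen configuration."""
    # Canonical form: sorted keys and fixed separators make the hash stable.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    config_hash = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "fw_version": fw_version,
        "policy_version": policy_version,
        "config_hash": config_hash,
    }
```

For example, `config_fingerprint("1.2.0", "p7", {"P_limit": 30.0, "window_id": 3})` yields the same `config_hash` on every station, and any change to a threshold or window ID changes the hash, so a mismatch between stations is detectable from the log alone.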

Output template: Test matrix (copy into a sheet)

Columns: test item · stimulus · expected · pass criteria (X) · logs required · owner

Test item	Stimulus	Expected	Pass criteria (X)	Logs required	Owner
Detect + class	PD connect	valid signature + stable class	misclass ≤ X%	window_id + reason	(Owner)
Startup + inrush	cold start	no restart loop	I_peak ≤ X	pre/post X	(Owner)
Fault recovery	short/overload/UV	correct fault_id + bounded retry	retry ≤ X	commit_id	(Owner)

Diagram: 3-gate flow (inputs/outputs + owner placeholders)

H2-11 · Applications (where PSE controllers actually differ)

Application buckets (no horizontal expansion)

PSE controllers diverge most in budget control, thermal survivability, and maintainability. Each bucket below binds those themes to concrete knobs and acceptance checks (threshold placeholders: X).

Power budget: allocator, priorities, policing windows
Thermal: per-port dissipation visibility + derating hooks
Manageability: telemetry + interrupts + black-box evidence

Bucket A · Multi-port industrial switch (Endspan)

Focus: power budget · thermal · manageability

Differentiators are dominated by oversubscription behavior and per-port evidence quality when multiple loads ramp concurrently.

Knobs that matter

Allocator: total PSU budget → per-port P_alloc/P_limit mapping + priority shedding order.
Policing windows: average/peak windows aligned to telemetry window_id (X).
Thermal hooks: temp flags + derating path (avoid on/off oscillation).
Evidence: per-port state + fault_id + retry_count + pre/post snapshots.

Acceptance checks (X placeholders)

Oversubscription test: dropping/limiting order matches the configured priority table (no “random” port losses).
Field-debug test: each port trip produces fault_id + window_id + retry_count with completeness ≥ X%.
Thermal stress: derating reduces event rate without creating retry storms (bounded retries ≤ X).

Example controller BOM anchors (IC part numbers)

TI: TPS23880, TPS23881 (8-channel IEEE 802.3bt PSE controllers).
Microchip: PD69208T4 / PD69208M (PSE manager) + PD69200 (PSE controller).
Analog Devices: LTC4290 + LTC4271 (8-port PSE controller chipset); LTC4270 + LTC4271 (12-port chipset).
Broadcom: BCM59111 (4-channel IEEE 802.3at PSE controller; integration-oriented designs).

Bucket B · Midspan injector (power insertion platform)

Focus: detect/class robustness · hot-plug · evidence

Midspan failures are usually “state-machine ambiguity”: a PD appears incompatible, but the platform cannot prove which phase failed (detect, class, startup, MPS).

Knobs that matter

Hot-plug behavior: debounce and bounded retry/backoff (X) to avoid repetitive power-cycling.
False-detect immunity: robust signature discrimination and clear reason codes.
External FET friendliness: clear sense scaling + telemetry alignment (window_id).
Evidence packets: pre/post snapshots on every failed detect/class/startup.

Example controller BOM anchors (IC part numbers)

Analog Devices: LTC4290 + LTC4271 (endpoint or midspan 8-port PSE chipset).
Microchip: PD69204T4 (4-port PSE manager) or PD69208T4 (8-port PSE manager) + PD69200.
TI: TPS23861 (4-channel 802.3at PSE) for compact midspan modules; TPS23880/TPS23881 for bt-class platforms.

Bucket C · Outdoor / long-cable PSE (self-healing first)

Focus: recovery profiles · bounded retries · log quality

The platform must distinguish transient line events from genuine faults and recover without creating a retry storm. Protection component details are intentionally excluded (see sibling pages).

Knobs that matter

Recovery profiles: auto-retry / latch-off / cool-down / backoff, with a hard upper bound (≤ X).
Windowed policing: P_limit/I_limit windows (X) tuned to avoid false trips from short transients.
Evidence on every event: timestamp + V/I/T + state + retry_count.
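
A recovery profile with a hard upper bound might look like the sketch below. The function names `backoff_schedule` and `next_action`, the doubling factor, and the timing values are illustrative assumptions; real controllers expose these as device-specific settings.

```python
def backoff_schedule(base_s, factor, max_retries, cap_s):
    """Cool-down before each retry; the list length is the hard retry cap."""
    delays = []
    delay = base_s
    for _ in range(max_retries):
        delays.append(min(delay, cap_s))  # never wait longer than cap_s
        delay *= factor
    return delays


def next_action(retry_count, schedule):
    """Decide what to do after a fault, given the retries already attempted."""
    if retry_count >= len(schedule):
        return ("latch_off", None)        # bound reached: stop retrying
    return ("retry_after", schedule[retry_count])


# Hypothetical profile: 1 s base, doubling, at most 4 retries, 5 s ceiling.
schedule = backoff_schedule(base_s=1.0, factor=2.0, max_retries=4, cap_s=5.0)
print(schedule)                  # [1.0, 2.0, 4.0, 5.0]
print(next_action(0, schedule))  # ('retry_after', 1.0)
print(next_action(4, schedule))  # ('latch_off', None)
```

Because the schedule is a finite list, the retry bound and the cool-down curve live in one place and can be logged alongside `retry_count`.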

Example controller BOM anchors (IC part numbers)

Microchip: PD69204T4 (4 ports Type 4 capable) or PD69208T4 (8 ports) + PD69200.
TI: TPS23880/TPS23881 (bt-class designs with per-port telemetry support).
Analog Devices: LTC4290 + LTC4271 (8-port chipset; recovery behavior can be validated with reason codes + counters).

Bucket D · Mixed-load (camera + AP + I/O)

Focus: priority · measured vs nameplate · policy versioning

Mixed-load stability depends on consistent budget accounting and predictable shedding under aggregate limits.

Acceptance checks (X placeholders)

Fixed mixed-load script: the same ports are limited/dropped every run (deterministic priority behavior).
Policing events: telemetry window_id matches enforcement windows (no mismatched averages).
Remote tuning: every change carries policy_version/commit_id and can be audited.
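
One way to make remote tuning auditable is to derive `commit_id` from the policy content itself. The sketch below assumes the policy is a JSON-serializable dictionary and uses a truncated SHA-256 digest; the field names and the 12-character length are illustrative choices, not a documented format.

```python
import hashlib
import json


def policy_commit_id(policy):
    """Content hash of a policy table; key order does not affect the result."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def audit_record(policy_version, policy):
    """Record to store with every policy change (hypothetical schema)."""
    return {"policy_version": policy_version,
            "commit_id": policy_commit_id(policy)}


# Hypothetical policy tables: a and b have the same content, c differs.
policy_a = {"p_total_w": 240, "priority": {"1": 1, "2": 2}}
policy_b = {"priority": {"2": 2, "1": 1}, "p_total_w": 240}
policy_c = {"p_total_w": 200, "priority": {"1": 1, "2": 2}}

print(policy_commit_id(policy_a) == policy_commit_id(policy_b))  # True
print(policy_commit_id(policy_a) == policy_commit_id(policy_c))  # False
```

A content-derived identifier means two units reporting the same `commit_id` are running the same tables, which is what an audit needs to establish.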

Example controller BOM anchors (IC part numbers)

TI: TPS23881 / TPS23880 (bt-class multi-port control with per-port telemetry concepts).
Microchip: PD69208T4 + PD69200 (multi-port management + controller split).
Analog Devices: LTC4270 + LTC4271 (12-port chipset for higher port-density platforms).

Diagram: Application buckets (power budget · thermal · manageability)

H2-12 · IC Selection Logic (choose a PSE controller with evidence, not guesswork)

Selection pipeline (hard gates → scorecard → red flags)

Hard gates: ports × power × bt / 4-pair × integration form × interface constraints.
Scorecard: policy control + telemetry + recovery boundedness + thermal hooks.
Red flags: non-auditable decisions (no window IDs, no reason codes, no bounded retry).

Step 0 · Write requirements as structured inputs

A decision tree works only if requirements are measurable.

Ports: X (total) · Per-port target: X (W) · Total PSU: X (W)
Standard target: af/at vs bt (Type 3/4) · Pairing: 2-pair vs 4-pair
Form: endspan vs midspan · Integration: integrated vs external FET strategy
Manageability: basic vs black-box evidence · Host I/F: I²C/SMBus/SPI + INT lines

Step 1 · Hard gates (fail any gate → remove from shortlist)

bt / 4-pair required? If yes, shortlist bt-capable families first (then decide port granularity).
Port density: 4-port vs 8-port vs 12-port impacts thermal distribution and MCU load.
Integration form: integrated vs external FET determines dissipation placement and layout constraints.
Interface & logs: require at least V/I/P/T + state + reason codes + counters.

Step 2 · Selection scorecard (template)

Scoring scale: 1 (weak) → 5 (strong). Weights are placeholders (X) and can be tuned per project.

Category	Field	Why it matters	Weight (X)	Score (1–5)	Notes
Power & Ports	bt / 4-pair capability	Defines max delivered power and pairing modes	X	—	Type 3/4 needs
Policy Control	Allocator + priority shedding	Deterministic behavior under oversubscription	X	—	Order + audit
Telemetry	V/I/P/T + state + reason	Field diagnosis without guesswork	X	—	window_id required
Recovery	Bounded retry/backoff	Avoid retry storms and oscillation	X	—	retry ≤ X
Thermal	Derating + temp reporting	Survive worst-case ports and airflow	X	—	stable under load
Ecosystem	Reference designs + compliance support	Reduces integration risk and lab iterations	X	—	production readiness
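
The scorecard reduces to a weighted mean once weights and scores are filled in. The sketch below is one possible calculation; the weights and the candidate's scores are made-up placeholders, not an evaluation of any real part.

```python
def weighted_score(weights, scores):
    """Weighted mean of 1-5 scores; the result stays on the 1-5 scale."""
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"unscored categories: {sorted(missing)}")
    for name in weights:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"score out of range for {name}")
    total_weight = sum(weights.values())
    return sum(weights[c] * scores[c] for c in weights) / total_weight


# Placeholder weights (the X column) and scores for a hypothetical candidate.
weights = {"Power & Ports": 3, "Policy Control": 2, "Telemetry": 2,
           "Recovery": 1, "Thermal": 1, "Ecosystem": 1}
candidate = {"Power & Ports": 5, "Policy Control": 3, "Telemetry": 4,
             "Recovery": 4, "Thermal": 3, "Ecosystem": 2}

print(round(weighted_score(weights, candidate), 2))  # 3.8
```

Normalizing by the total weight keeps results comparable between projects that tune the weights differently.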

Step 3 · Red flags (one-vote veto)

No window IDs / no debounce IDs: telemetry cannot explain enforcement decisions.
No reason codes for state failures: detect/class/startup/MPS failures become un-debuggable.
Unbounded recovery: auto-retry without a hard cap (≤ X) creates field retry storms.
Counters without snapshots: error counters increment (CRC-style symptoms), but no evidence packet is captured on trips, so the cause cannot be reconstructed.
Non-auditable policy updates: no policy_version/commit_id → remote tuning cannot be trusted.

Example shortlists (IC part numbers by decision outcome)

These are BOM anchors for the controller function. Platform power entry, magnetics, and protection parts are intentionally excluded here (see sibling pages).

bt-class multi-port (8 channels)

TPS23880 · TPS23881 · PD69208T4 + PD69200

Compact 4-port (Type-2 / midspan modules)

TPS23861 · PD69204T4 + PD69200 · BCM59111

High port density (12 ports)

LTC4270 + LTC4271

Endpoint or midspan (8 ports, chipset approach)

LTC4290 + LTC4271

Diagram: Decision tree (ports × power × manageability → controller form)

H2-13 · FAQs (PoE PSE Controller: detect · class · power · fault · telemetry · policy)

How to use these FAQs

Each answer is a four-line executable recipe: Likely cause → Quick check → Fix → Pass criteria. Thresholds are placeholders (X) so the same structure works for design, bring-up, and production gates.

Class looks right, but port still trips at load step—first check policing window or inrush limit?

Likely cause: enforcement window too short (peak policing) or inrush timer/limit too tight for the PD step.

Quick check: compare I_peak window vs I_avg window; read trip reason + window_id; capture Iport with bandwidth ≥ X kHz over ≥ X ms.

Fix: widen peak window or raise inrush limit/timer within policy; align telemetry averaging with enforcement windows.

Pass criteria: no trip across X load steps (ΔP = X W) with trip rate ≤ X / hour and reason_code = none.
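
To see why the same load step can pass an average window and still trip a peak window, the sketch below applies both to one current trace. The trace, the window lengths, and the limits are hypothetical values in mA with one sample per ms; `window_max_mean` and `policing_trips` are illustrative names.

```python
def window_max_mean(samples, window):
    """Largest mean over any run of `window` consecutive samples."""
    if window <= 0 or window > len(samples):
        raise ValueError("window must be between 1 and len(samples)")
    running = sum(samples[:window])
    best = running
    for i in range(window, len(samples)):
        running += samples[i] - samples[i - window]  # slide by one sample
        best = max(best, running)
    return best / window


def policing_trips(samples, limit, window):
    """True if any window-length average exceeds the limit."""
    return window_max_mean(samples, window) > limit


# Hypothetical port current in mA: a 2 ms spike at a load step.
trace_ma = [300] * 10 + [900, 900] + [400] * 10

print(policing_trips(trace_ma, limit=700, window=2))   # True  (peak window)
print(policing_trips(trace_ma, limit=600, window=10))  # False (average window)
```

Only the window length changes between the two calls, so a trip log without `window_id` cannot say which limit fired.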

Some PDs keep flapping every few minutes—MPS threshold too strict or auto-retry too aggressive?

Likely cause: MPS detect threshold/window rejects low-power/EEE behavior, or retry loop creates oscillation.

Quick check: log per-event state_reason; correlate flap period with MPS window; verify retry_count growth + backoff timing (X).

Fix: relax MPS threshold/window (or enable MPS-friendly mode); add bounded retry + exponential backoff; prevent immediate re-apply after drop.

Pass criteria: no flap for ≥ X minutes at low-load (Pport = X W) and retry_count ≤ X.

Total PSU has headroom, but ports still deny power—allocator policy vs priority mismatch?

Likely cause: allocator uses conservative nameplate/requested budget or priority table blocks grant despite PSU margin.

Quick check: dump allocator inputs (P_total, P_alloc per-port, priority tier, deny reason); verify which accounting mode is active (nameplate/measured/requested).

Fix: correct accounting mode; adjust per-tier caps; ensure policy_version/commit_id updates atomically with tables.

Pass criteria: grant decisions match policy table for X scenarios; deny_reason only when P_total − ΣP_alloc < X W.
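
The effect of the accounting mode on a grant decision can be shown with a few lines of arithmetic. In this sketch the per-port dictionaries, the mode names as keys, and all wattages are assumptions for illustration.

```python
MODES = ("nameplate", "requested", "measured")


def committed_power(ports, mode):
    """Sum of per-port power under the selected accounting mode."""
    if mode not in MODES:
        raise ValueError(f"unknown accounting mode: {mode}")
    return sum(port[mode] for port in ports)


def can_grant(ports, p_total_w, new_port_w, mode, reserve_w=0.0):
    """Return (grant?, headroom_w) for a new port request."""
    headroom = p_total_w - committed_power(ports, mode) - reserve_w
    return new_port_w <= headroom, headroom


# Three powered ports (watts); the same hardware seen through three modes.
ports = [
    {"nameplate": 30.0, "requested": 25.5, "measured": 12.0},
    {"nameplate": 30.0, "requested": 25.5, "measured": 8.0},
    {"nameplate": 30.0, "requested": 25.5, "measured": 15.0},
]

for mode in MODES:
    print(mode, can_grant(ports, p_total_w=110.0, new_port_w=30.0, mode=mode))
# nameplate (False, 20.0)
# requested (True, 33.5)
# measured (True, 75.0)
```

The PSU and the loads are identical in all three rows; only the accounting changes, which is why the active mode belongs in the deny record.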

False detect after ESD test—leakage path or debounce settings?

Likely cause: post-ESD leakage creates a fake signature, or detect debounce/retry is too sensitive.

Quick check: record detect V/I waveform in the detect window; compare before/after ESD; check debounce time and detect threshold drift (X).

Fix: increase debounce; tighten valid signature window; add “cool-down then re-detect” profile after an ESD-like event.

Pass criteria: false-detect rate ≤ X% over X cycles; detect_reason_code indicates valid only when PD is attached.

Works with one PD vendor, fails with another—classification tolerance or startup ramp?

Likely cause: class current tolerance/multi-event expectation mismatch, or startup ramp/inrush profile conflicts with PD front-end.

Quick check: compare class measurement repeatability (σ ≤ X%) across vendors; log class result + event count; capture Vport ramp slope (dV/dt = X).

Fix: widen class tolerance (within standard intent); support required event mode; adjust ramp slope and inrush limit window.

Pass criteria: class result stable within ±X class-bin across X cycles; startup success ≥ X% for both vendors.

Event log shows overload but scope current looks OK—measurement bandwidth/averaging artifact?

Likely cause: PSE senses a short peak that the scope setup averages out, or log uses a different window than the measurement.

Quick check: align measurement bandwidth (scope bandwidth limit/low-pass filter) and time window; read window_id; compare raw peak counter vs averaged telemetry; confirm sample rate ≥ X samples/s.

Fix: harmonize enforcement and reporting windows; add peak capture (max-hold) for fault events; tune filter/averaging to match policy windows.

Pass criteria: log peak and external measurement agree within ±X% over X events; false-overload events ≤ X / day.
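
The averaging artifact is easy to reproduce offline. The sketch below block-averages a hypothetical raw trace (mA) the way averaged telemetry or a filtered scope capture would, then compares the result against a max-hold reading; `block_average` and `agree_within` are illustrative helpers, and the numbers are invented.

```python
def block_average(samples, n):
    """Average non-overlapping blocks of n samples (models averaged telemetry)."""
    return [sum(samples[i:i + n]) / n
            for i in range(0, len(samples) - n + 1, n)]


def agree_within(a, b, tol_pct):
    """True if a differs from b by no more than tol_pct percent of b."""
    return abs(a - b) <= abs(b) * tol_pct / 100.0


# Hypothetical raw current in mA with a single-sample overload peak.
raw_ma = [400] * 8 + [1200] + [400] * 7

raw_peak = max(raw_ma)                    # max-hold view
avg_peak = max(block_average(raw_ma, 8))  # averaged view

print(raw_peak, avg_peak)                     # 1200 500.0
print(agree_within(avg_peak, raw_peak, 10))   # False
```

With a 1000 mA overload limit, the max-hold view reports a fault while the averaged view never comes close, so both readings are consistent with their own windows.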

Thermal shutdown only on ports 5–8—layout heat coupling or per-port Rds(on) binning?

Likely cause: localized hot zone or higher loss path on those ports (effective Rds(on) or duty distribution).

Quick check: compare Tport/Tj (or proxy) across ports under identical load; correlate trips with Iport and on-time; verify per-port dissipation estimate (P≈I²·R, X).

Fix: redistribute high-power ports via policy; enable derating before shutdown; standardize FET selection/drive where applicable; reduce sustained limit for hot ports.

Pass criteria: port-to-port temperature delta ≤ X °C at Pport = X W; no thermal trips for X hours at worst-case load mix.
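
The per-port dissipation estimate P ≈ I²·R can be scripted to flag which ports run hot under a given load. In this sketch the path resistance is assumed to lump the sense resistor and FET Rds(on) into one value per port, and the currents, resistances, and limit are made-up numbers.

```python
def port_dissipation_w(i_port_a, r_path_ohm):
    """Conduction loss estimate P = I^2 * R, in watts."""
    return i_port_a ** 2 * r_path_ohm


def hot_ports(currents_a, r_path_ohm, limit_w):
    """Port IDs whose estimated dissipation exceeds limit_w, in port order."""
    return [port for port, current in sorted(currents_a.items())
            if port_dissipation_w(current, r_path_ohm[port]) > limit_w]


# Hypothetical values: port 5 has a higher-resistance path, port 6 more load.
currents_a = {1: 0.6, 5: 0.6, 6: 0.9}
r_path_ohm = {1: 0.25, 5: 0.35, 6: 0.25}

print(hot_ports(currents_a, r_path_ohm, limit_w=0.1))  # [5, 6]
```

Ports 1 and 5 carry the same current but only port 5 is flagged, which separates a resistance (binning or layout) cause from a load cause.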

4-pair bt load causes 2-pair ports to drop—shared PSU droop or policy oversubscription?

Likely cause: transient V_in droop triggers UVLO or allocator shedding; policy underestimates bt ramp concurrency.

Quick check: log V_in minimum and UVLO reason; compare drop timing with bt startup; verify ΣP_alloc vs P_total window (X ms) during ramp.

Fix: add ramp staggering; increase droop tolerance where safe; tighten allocator concurrency limits; prioritize critical 2-pair ports.

Pass criteria: no 2-pair drops during bt ramp across X trials; V_in droop stays above UVLO + X V margin.
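
Ramp staggering can be as simple as assigning start offsets in groups. The sketch below assumes every port has the same ramp duration and that concurrency is the only constraint; `stagger_starts` and the timing values are illustrative.

```python
def stagger_starts(port_ids, ramp_ms, max_concurrent):
    """Start offsets (ms) so that at most max_concurrent ramps overlap."""
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    return {port: (index // max_concurrent) * ramp_ms
            for index, port in enumerate(port_ids)}


# Five ports, 50 ms ramps, no more than two ramping at once.
offsets = stagger_starts([1, 2, 3, 4, 5], ramp_ms=50, max_concurrent=2)
print(offsets)  # {1: 0, 2: 0, 3: 50, 4: 50, 5: 100}
```

Each group starts only after the previous group's ramp window has elapsed, which bounds the worst-case concurrent inrush seen by the shared PSU.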

Auto-retry causes network storm—backoff missing or retry criteria wrong?

Likely cause: retry has no hard cap or no increasing cool-down; the same fault re-triggers immediately.

Quick check: check retry_count and retry interval histogram; verify cool-down/backoff curve parameters; confirm latch-off conditions for repeated faults (X).

Fix: implement bounded retry (≤ X) + exponential backoff; escalate to latch-off after X repeats; log evidence per retry series.

Pass criteria: retry_count never exceeds X; minimum backoff grows to ≥ X s; storm events ≤ X / week.

Port status OK but PD never powers—state machine stuck or interrupt lost?

Likely cause: host missed an interrupt/event edge, or the port state machine is waiting for a condition that never completes.

Quick check: read live port_state and last_state_reason; compare IRQ counter vs event log entries; verify INT pin toggles and IRQ batching settings (X).

Fix: enable level-triggered/latched interrupts; add periodic state polling watchdog; reset the port FSM on stale state timeout (X ms).

Pass criteria: no stale state beyond X ms; IRQ counter matches event log within ±X events over X hours.
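
A polling watchdog for stale states might look like the following. The state names, the `(state, entered_ms)` tuple layout, and the timeout are assumptions for the sketch; actual state encodings are device-specific.

```python
def find_stale_ports(port_states, now_ms, timeout_ms,
                     settled_states=("idle", "powered")):
    """Ports that have sat in a transitional state longer than timeout_ms.

    port_states maps port_id -> (state, entered_ms).
    """
    stale = []
    for port_id, (state, entered_ms) in sorted(port_states.items()):
        if state not in settled_states and now_ms - entered_ms > timeout_ms:
            stale.append(port_id)
    return stale


# Hypothetical snapshot taken at t = 5000 ms.
port_states = {
    1: ("powered", 0),
    2: ("classifying", 1000),  # stuck for 4000 ms
    3: ("starting", 4800),     # only 200 ms so far
}
print(find_stale_ports(port_states, now_ms=5000, timeout_ms=500))  # [2]
```

Running this check on a timer catches a stuck port even when the interrupt that should have reported it was lost.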

Power granted then removed instantly—UVLO threshold or inrush timer mis-set?

Likely cause: UVLO (input or port) is triggered by droop, or inrush timer expires before steady-state.

Quick check: log min V_in and min Vport at the drop; read UVLO reason flag; compare inrush timer value vs measured ramp time (X ms).

Fix: adjust UVLO thresholds within system margin; increase inrush timer or soften ramp; add staggered startup for multiple ports.

Pass criteria: V_in stays ≥ UVLO + X V; Vport reaches steady within X ms; drop rate ≤ X / 1k startups.

Field returns show no faults—logging fields incomplete (missing V/I/T/timestamp)?

Likely cause: fault events occur but evidence is not captured (missing timestamp/V/I/T/state/retry_count), making failures invisible.

Quick check: verify event schema includes: timestamp, port_id, port_state, reason_code, V/I/P/T snapshot, retry_count; check log drop rate ≤ X%.

Fix: add “fault evidence packet” on every trip; implement ring-buffer with watermark; guarantee sync-to-storage on critical events.

Pass criteria: evidence completeness ≥ X% across X field events; missing-field incidents ≤ X / month.
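
The completeness check can be automated against exported event logs. The field names below follow the schema listed in the quick check, with `v`, `i`, `p`, `t` standing in for the V/I/P/T snapshot; the exact key names and the sample events are assumptions.

```python
REQUIRED_FIELDS = ("timestamp", "port_id", "port_state", "reason_code",
                   "v", "i", "p", "t", "retry_count")


def missing_fields(event):
    """Required fields that are absent or None in one event record."""
    return [name for name in REQUIRED_FIELDS if event.get(name) is None]


def completeness_pct(events):
    """Percentage of events that carry every required field."""
    if not events:
        return 100.0
    complete = sum(1 for event in events if not missing_fields(event))
    return 100.0 * complete / len(events)


# Hypothetical log: the fourth event lost its temperature reading.
full = {"timestamp": 1, "port_id": 3, "port_state": "fault",
        "reason_code": "overload", "v": 53.8, "i": 0.71, "p": 38.2,
        "t": 61.0, "retry_count": 0}
events = [full, dict(full, timestamp=2), dict(full, timestamp=3),
          dict(full, timestamp=4, t=None)]

print(completeness_pct(events))    # 75.0
print(missing_fields(events[3]))   # ['t']
```

Reporting which field is missing, not just the percentage, points directly at the capture path that needs fixing.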
