Edge Aggregation Switch for Campus & Industrial TSN
An edge aggregation switch for campus/industrial networks is built to deliver deterministic TSN forwarding with hardware PTP timestamps, while safely powering endpoints via PoE and keeping reliability high through closed-loop thermal and power telemetry. It turns “spec-sheet numbers” into provable field behavior: bounded latency, stable time, predictable port power, and actionable alarms/logs.
A campus/industrial edge aggregation switch concentrates PoE endpoints and uplinks while enforcing TSN determinism, hardware PTP/802.1AS time stamping, and telemetry-driven power/thermal control—so latency, power, and failures stay bounded and explainable.
- Allowed (deep): TSN feature selection (Qbv/Qci/Qbu/CB), hardware time-stamp path (MAC/PHY), PoE budgeting/priority/port behavior, telemetry closed-loop (sense→decide→act→log), validation & field debug.
- Not in scope (mention only): GNSS/Grandmaster holdover & BMCA deep dive, dedicated Boundary Clock switch internals, P4/whitebox programming, UPF/MEC/DPU/SmartNIC, security gateway/ZTNA, probe/TAP capture, site backup power design.
H2-1 · What it is & where it sits (Boundary + Non-goals)
Goal: define the device by its system boundary, deployment position, and the four engineering pillars that the rest of the page will prove with measurable evidence.
Definition (what it is)
An edge aggregation switch for campus/industrial networks is an uplink-facing node that aggregates PoE-powered endpoints (APs, cameras, sensors, controllers) and enforces deterministic forwarding for selected flows. The differentiator is not raw throughput; it is the ability to keep latency/jitter bounded, time-stamp events in hardware, and maintain power/thermal stability through telemetry-driven policies.
Where it sits (typical placements)
- Industrial ring / cell network: aggregates machine endpoints; TSN flows must survive congestion without violating worst-case delay.
- Campus aggregation: concentrates access switches & PoE endpoints; power budget and thermal derating must be predictable.
- Edge cabinet / micro-closet: compact enclosure; telemetry and fault codes must enable remote diagnosis (port drops must be explainable).
The four pillars (what this page will deliver)
- TSN determinism: choose Qbv/Qci/Qbu/CB by use-case; validate the latency ceiling under realistic load.
- Hardware time stamping: understand where timestamps are captured (MAC/PHY), what error terms exist, and what must be monitored.
- PoE PSE behavior: design budget/priority and port lifecycle rules so “budget shortage” becomes a controlled outcome, not chaos.
- Telemetry closed-loop: define the sense→decide→act→log chain so thermal/power issues trigger predictable actions with traceable reasons.
Non-goals (intentional exclusions)
- No GNSS/Grandmaster holdover design: time-source engineering belongs to the time-hub page.
- No P4/whitebox pipeline programming: this page focuses on deterministic behavior, not reconfigurable data planes.
- No UPF/MEC compute offload: any appliance acceleration is out of scope here.
H2-2 · System architecture blueprint (Data / Time / Power / Management planes)
Goal: lock the page around a four-plane blueprint—so each later chapter can go deep without repeating or drifting into neighboring topics.
Architecture rule: four planes, one evidence chain
The system is easiest to reason about when separated into Data, Time, Power, and Management/Telemetry planes. Each plane must produce measurable evidence (counters, fault codes, logs) so field issues become diagnosable rather than anecdotal.
Data plane (TSN switching silicon)
- Inside: ingress classification, per-stream policing hooks, deterministic queues, egress shaping.
- Controls: worst-case delay bound, jitter under load, starvation resistance for critical streams.
- Must measure: per-queue occupancy, drop counters, gate-related anomaly indicators (when available).
- Field symptom: “throughput is fine” but delay spikes or periodic jitter appears under mixed traffic.
Time plane (PTP/802.1AS + hardware time stamping)
- Inside: timestamp capture (MAC/PHY), local time distribution, minimal jitter-cleaning inside the switch.
- Controls: schedule correctness for time-aware shaping and timestamp credibility for monitoring/debug.
- Must measure: timestamp error counters, sync/alignment state, time-related alarms tied to scheduling.
- Field symptom: sync looks “locked” yet TSN schedules still miss windows or drift over temperature/load.
Power plane (PoE PSE, budget & port behavior)
- Inside: 48–57V PoE input, PSE controllers, per-port sensing/limiting, system-level budget manager.
- Controls: deterministic port power behavior during budget shortage and fault conditions.
- Must measure: per-port V/I/P, negotiated power, fault reason codes, remaining budget headroom.
- Field symptom: port “flapping”, widespread derating after 802.3bt enablement, or priority inversion during shortage.
Management/Telemetry plane (sense → decide → act → log)
- Inside: management MCU/CPU, PMBus/I²C telemetry, fan control, alarm fan-in, event logs.
- Controls: thermal policy, PoE derating/disable actions, safe recovery behavior, remote diagnosability.
- Must measure: hotspot temperatures, fan RPM, PSU rails, port fault history with timestamps and cause codes.
- Field symptom: “cannot reproduce” incidents due to missing counters or ambiguous alarms.
Cross-plane coupling (why these planes cannot be treated independently)
- Time → TSN: local-time misalignment translates into gate schedule errors; determinism collapses even when utilization looks low.
- Power → Thermal → PoE: temperature rise triggers derating, which changes endpoint behavior (restarts, link renegotiation) and back-propagates into traffic patterns.
- Telemetry → Debug: missing evidence fields turn a 10-minute diagnosis into days of guesswork; define the evidence dictionary early.
H2-3 · Specs that actually matter (turn datasheet numbers into field behavior)
Goal: convert “good-looking specs” into bounded field outcomes—latency ceilings, timestamp credibility, PoE stability, and thermal survivability—each with an acceptance criterion and evidence chain.
Rule: every spec must map to (1) a failure symptom and (2) a measurable bound
A switch rarely fails because a single number is “low.” It fails when multiple small terms add up and exceed a hidden margin. The practical method is to translate specs into a budget (what must be bounded) and an evidence list (what must be logged and counted).
Determinism: build a latency ceiling budget (not a typical latency)
Determinism is a worst-case promise. A usable acceptance criterion is the end-to-end latency upper bound:
Dmax = Dfwd + Dqueue + Dgate + Dserdes + Dsync-margin
- Dfwd (forwarding): fixed pipeline delay inside the switch ASIC (store-and-forward vs cut-through matters here).
- Dqueue (queuing): worst-case contention from non-critical traffic; this is where “throughput looks fine” can still produce spikes.
- Dgate (gating): Qbv guard time + window alignment slack; too little slack creates periodic misses.
- Dserdes (PCS/SerDes): PHY/PCS/retimer path delay and temperature/line-rate mode effects.
- Dsync-margin: the time-alignment error budget that prevents schedule drift from turning into “gate misses.”
| Budget term | What it means in the field | How to obtain | Primary control knob |
|---|---|---|---|
| Dfwd | Base latency per hop | ASIC mode + vendor timing | Forwarding mode / pipeline |
| Dqueue | Spikes under mixed traffic | Worst-case traffic model + counters | Queue mapping / policing (Qci) |
| Dgate | Window misses / periodic jitter | Gate schedule + guard time definition | Qbv window sizing + guard bands |
| Dserdes | Mode/temperature drift | PHY/PCS latency + temp sweep test | PHY mode / retimer settings |
| Dsync-margin | Schedule alignment robustness | Sync/alignment telemetry & alarms | Timestamp path + alignment monitoring |
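The budget above can be made checkable in a few lines. The sketch below sums the five terms from the formula and tests the result against a contracted bound; all numeric values are illustrative placeholders, not vendor figures.

```python
# Sketch: turn the latency-ceiling budget into a checkable number.
# All term values (ns) are illustrative placeholders, not vendor figures.

BUDGET_NS = {
    "d_fwd": 3_000,         # fixed pipeline delay (store-and-forward)
    "d_queue": 8_000,       # worst-case contention from non-critical traffic
    "d_gate": 2_500,        # Qbv guard time + window alignment slack
    "d_serdes": 400,        # PHY/PCS/retimer path delay
    "d_sync_margin": 1_000  # time-alignment error budget
}

def latency_ceiling(budget: dict) -> int:
    """Dmax = Dfwd + Dqueue + Dgate + Dserdes + Dsync-margin."""
    return sum(budget.values())

def check_contract(budget: dict, contract_ns: int) -> bool:
    """Pass only if the worst-case sum stays under the contracted bound."""
    return latency_ceiling(budget) <= contract_ns
```

The useful property is that the acceptance criterion is a sum of bounded terms, so when a field measurement breaks the bound, each term can be re-measured in isolation to find the one that grew.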
Timestamp accuracy: MAC vs PHY time stamping (error sources that matter)
- MAC time stamp (inside switch pipeline): sensitive to internal data-path variability; load-dependent micro-variations can leak into perceived time if capture points move relative to queuing and shaping.
- PHY time stamp (near the line): reduces pipeline ambiguity; dominant errors shift to link calibration and PCS/SerDes mode effects (rate, encoding, temperature).
- Practical selection criterion: PHY time stamping is preferred when “line-event alignment” is required; MAC time stamping is often sufficient when trend and relative consistency are the goal.
| Capture point | Main error drivers | What to monitor | Failure symptom |
|---|---|---|---|
| MAC TS | Pipeline coupling, load sensitivity | Queue/gate anomalies, TS error counters | “Locked” sync but schedule drift |
| PHY TS | Link delay calibration, PCS/SerDes mode drift | Link-mode changes, temperature correlation | Step-like timing shifts after mode changes |
PoE: budget, priorities, and 802.3bt (4PPoE) thermal derating
- System budget: allocate a finite PoE pool with a fixed headroom so renegotiation and transient peaks do not trigger uncontrolled port drops.
- Port priority policy: define who is protected first (controllers/industrial endpoints) and who is degraded first (non-critical loads) under shortage.
- 802.3bt thermal behavior: higher delivered power raises cable and PSE heat; derating should be staged (limit → reduce → shut down) with explicit cause codes.
Industrial environment: temperature ranges, cooling style, MTBF, alarm thresholds
- Cooling style determines telemetry design: fanless designs need earlier derating thresholds; fan-cooled designs need fan RPM monitoring and stall detection.
- Alarms must be tiered: Warning → Derate → Shutdown, each with a recovery condition to prevent oscillation and “alarm storms.”
- MTBF is operational, not marketing: use event logs to prove stability under thermal and PoE stress, not only bench pass/fail.
H2-4 · TSN feature set selection (Qbv/Qci/Qbu/CB) mapped to campus & industrial use
Goal: translate TSN standards into actionable selection rules—what to enable, what it costs, and how to prove it works under real traffic and fault conditions.
Start from scenarios (not from acronyms)
- S1 — Periodic control streams: motion control / cyclic IO; requires bounded latency and predictable transmission windows.
- S2 — Mixed traffic on shared links: control + video + IT traffic; requires protection against bursty or misbehaving streams.
- S3 — No-downtime networking: rings or dual-homing; requires seamless redundancy with defined buffer and bandwidth costs.
Qbv (Time-Aware Shaper): when it is mandatory
- Use when: critical streams require time windows isolated from best-effort traffic (cyclic control, time-aligned AV/industrial sync).
- Key design lever: window sizing with guard time and alignment margin—windows must tolerate sync error and PHY/PCS variability.
- Cost: schedule management complexity; incorrect margins create periodic “gate misses” even when utilization is low.
- How to validate: stress with mixed traffic; confirm critical frames always exit within the defined gate window across load/temperature sweeps.
Qci (Per-stream filtering/policing): keep determinism from collapsing
- Use when: the network must survive abnormal or bursty streams without flooding deterministic queues (common in industrial installations).
- Key design lever: thresholds derived from stream models (rate + max burst); not from guesswork.
- Cost: incorrect thresholds can either (a) silently allow harm or (b) falsely drop valid traffic.
- How to validate: inject controlled bursts and malformed streams; check drop-reason counters and verify critical streams remain bounded.
Qbu / 802.3br (Frame preemption): protect small critical frames from large frames
- Use when: large frames share a link with strict-latency small frames and gate scheduling alone cannot protect the bound.
- Key design lever: preemption policy and compatibility—misalignment with endpoints can create confusing retransmissions or throughput instability.
- Cost: more complexity in link behavior and debug; wrong settings can look like “random” packet issues.
- How to validate: run large-frame background traffic while measuring the critical-frame latency bound; confirm preemption events and error counters behave.
802.1CB (FRER): redundancy without visible downtime
- Use when: ring/dual-homed paths must tolerate a single failure without loss or reordering impacts beyond defined limits.
- Key design lever: duplicate-and-eliminate window and sequence handling; window settings trade off buffer size vs loss risk.
- Cost: bandwidth overhead (duplicate traffic), sequence tables, buffering and a more complex debug surface.
- How to validate: cut one path during load; confirm no loss at the consumer and verify duplicates are eliminated with recorded counters.
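The duplicate-and-eliminate trade-off above can be illustrated with a minimal sequence-recovery sketch. A real implementation follows the 802.1CB recovery algorithms; this only shows the window trade-off (state/buffering vs loss risk), and the window size is an illustrative assumption.

```python
# Sketch: 802.1CB-style duplicate elimination with a sequence history window.
# Illustrative only; a real implementation follows the standard's recovery
# functions. Larger history_len = more state, lower risk of late duplicates.

class SequenceRecovery:
    def __init__(self, history_len: int = 16):
        self.history_len = history_len
        self.seen = set()
        self.highest = -1

    def accept(self, seq: int) -> bool:
        """True if the frame should be forwarded (first copy seen),
        False if it is a duplicate or has aged out of the window."""
        if seq <= self.highest - self.history_len:
            return False                  # too old: outside the window
        if seq in self.seen:
            return False                  # duplicate from the other path
        self.seen.add(seq)
        self.highest = max(self.highest, seq)
        # Drop state that has aged out of the window.
        self.seen = {s for s in self.seen if s > self.highest - self.history_len}
        return True
```

During the path-cut test, the recorded counter of `accept(...) == False` events is the evidence that duplicates were actually eliminated rather than silently forwarded.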
Engineering selection table: scenario → feature → cost → validation
| Scenario | Primary TSN feature | Main cost | Proof method |
|---|---|---|---|
| S1 Periodic control | Qbv | Schedule design + margins | Gate-window compliance under load/temp |
| S2 Mixed traffic | Qci (+ Qbu as needed) | Threshold tuning + debug counters | Burst/fault injection; check drop reasons |
| S2 Mixed + large frames | Qbu/802.3br | Link behavior complexity | Latency bound with large-frame background |
| S3 No downtime | 802.1CB | Bandwidth + buffering + sequence state | Path-cut test; verify duplicate elimination |
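For the S1 proof method (gate-window compliance), the core check is small: given a cyclic schedule, every critical-frame egress timestamp must fall inside its window in every cycle. The sketch below assumes a simple single-window schedule; cycle and window offsets are illustrative.

```python
# Sketch: gate-window compliance check for a Qbv validation run.
# cycle_ns / open_ns / close_ns are illustrative schedule parameters,
# offsets measured from the start of each cycle.

def in_gate_window(ts_ns: int, cycle_ns: int, open_ns: int, close_ns: int) -> bool:
    """True if an egress timestamp falls inside the critical-stream window."""
    phase = ts_ns % cycle_ns
    return open_ns <= phase < close_ns

def gate_misses(timestamps, cycle_ns, open_ns, close_ns):
    """Return the timestamps that landed outside the configured window."""
    return [t for t in timestamps
            if not in_gate_window(t, cycle_ns, open_ns, close_ns)]
```

Run against the egress timestamp series captured across load and temperature sweeps, a non-empty `gate_misses` result is the direct evidence of a Qbv margin problem.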
H2-5 · Hardware timestamping path (where time is captured, corrected, and consumed)
Hardware time stamping is not “a feature bit.” It is a path through the switch: capture points, correction logic, and where local time is consumed by TSN scheduling. This section explains the pipeline without drifting into grandmaster timing or holdover design.
Capture points: PHY vs MAC, and Ingress vs Egress
A time stamp becomes trustworthy only when its dominant error terms are understood and bounded. The most important architectural choice is where the capture happens and which parts of the packet path remain “in front of” the capture point.
| Capture option | What is included in the time stamp | Dominant error terms | Typical field symptom |
|---|---|---|---|
| PHY-side | Near-line event timing | Link delay calibration, PCS/SerDes mode drift, temperature correlation | Step-like timing shifts after mode/temperature changes |
| MAC-side | Pipeline-aligned timing | Pipeline coupling, load sensitivity, shaping/queue interaction | “Sync looks OK” but gate alignment still breaks determinism |
| Ingress | Before queuing/shaping decisions | Less queue-induced ambiguity; more reliance on correction model | Stable time stamps but egress latency still needs budgeting |
| Egress | After shaping/queue arbitration | More exposed to queue/gate effects; requires tight schedule & correction handling | Periodic misses if local-time alignment margin is insufficient |
Internal clock domains and queues: where variability enters
The switch has multiple internal timing domains: packet parsing, queueing/shaping, and port serialization. Queuing creates variable delay because the packet’s departure time depends on contention, shaping rules, and gate windows. Hardware time stamping makes the behavior controllable by tying capture points to deterministic correction and observable counters.
- Queue contention: best-effort traffic can push critical frames unless strict mapping and policing are enforced.
- Shapers and preemption: shaping/pacing changes departure timing; preemption changes how large frames block small frames.
- Serialization and PCS: line coding, retimers, and mode changes contribute fixed or step-like delays that must be tracked.
Coupling to TSN: Qbv relies on local time (drift becomes determinism loss)
Time-aware shaping (Qbv) depends on local time alignment. If local time drifts relative to the schedule, a frame that “should fit” can land outside its gate window. This converts timing error into a deterministic failure mode: periodic latency spikes or missed transmission opportunities.
- Mechanism chain: local time offset → gate window misalignment → frame waits an extra cycle → latency ceiling breaks.
- What must be monitored: local alignment state, gate-window compliance counters, and the correlation between drift alarms and latency spikes.
- Design implication: gate margins must include alignment error budget and PHY/PCS step-change tolerance.
Selection criteria: MAC vs PHY time stamping (proof-oriented)
| Need | Risk to bound | Preferred approach | Validation |
|---|---|---|---|
| Line-event alignment | Mode/temperature steps | PHY TS + mode-change tracking | Temperature sweep + link-mode transitions |
| Load robustness | Pipeline coupling | PHY TS or MAC TS with explicit correction | Mixed-traffic stress + queue/gate counters |
| Implementation simplicity | Opaque internal variability | MAC TS when bounds are relaxed and evidence is sufficient | Compare against reference under load/temp |
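The validation column above boils down to classifying a MAC−PHY delta series: smooth drift is predictable and can be budgeted; step changes break deterministic timing assumptions. A minimal classifier sketch, with illustrative thresholds that must be tuned to the bound being held:

```python
# Sketch: classify a MAC−PHY timestamp-delta series from a temperature/load
# sweep. step_ns / drift_ns thresholds are illustrative assumptions.

def classify_delta_series(deltas_ns, step_ns=50, drift_ns=20):
    """'step'   if any sample-to-sample jump exceeds step_ns (unpredictable),
    'drift'  if the total excursion exceeds drift_ns but moves smoothly,
    'stable' otherwise."""
    jumps = [abs(b - a) for a, b in zip(deltas_ns, deltas_ns[1:])]
    if any(j > step_ns for j in jumps):
        return "step"
    if max(deltas_ns) - min(deltas_ns) > drift_ns:
        return "drift"
    return "stable"
```

A "step" verdict correlated with a link-mode change or temperature event is exactly the failure symptom named in the table.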
H2-6 · PoE PSE subsystem engineering (power budget, port behavior, protection)
PoE is a behavioral contract per port: detect and classify safely, power on predictably, enforce a budget policy under shortage, and protect each port with clear fault codes. This section stays at the port level—no site backup or rack power topics.
Port lifecycle: Detect → Classify → Power On → Maintain → Monitor → Fault
The PSE should behave like a deterministic state machine. Each state must have (a) entry conditions, (b) actions, (c) exit conditions, and (d) a reason code when it fails. This prevents “mystery port drops” in the field.
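The deterministic-state-machine idea can be sketched as an explicit transition table with reason codes. States and codes here are illustrative, not a vendor register map; the point is that every transition (including illegal ones) leaves evidence.

```python
# Sketch: the PoE port lifecycle as an explicit state machine with reason
# codes. States and transition rules are illustrative assumptions.

from enum import Enum, auto

class PortState(Enum):
    DETECT = auto()
    CLASSIFY = auto()
    POWER_ON = auto()
    MAINTAIN = auto()
    FAULT = auto()

# Legal transitions; anything else is a firmware bug worth logging.
TRANSITIONS = {
    PortState.DETECT:   {PortState.CLASSIFY, PortState.FAULT},
    PortState.CLASSIFY: {PortState.POWER_ON, PortState.FAULT},
    PortState.POWER_ON: {PortState.MAINTAIN, PortState.FAULT},
    PortState.MAINTAIN: {PortState.FAULT, PortState.DETECT},  # re-detect on link drop
    PortState.FAULT:    {PortState.DETECT},                   # retry after cooldown
}

def step(state: PortState, target: PortState, reason: str, log: list) -> PortState:
    """Move to `target` if legal, always recording a reason code."""
    if target not in TRANSITIONS[state]:
        log.append((state.name, target.name, "ILLEGAL-TRANSITION"))
        return state
    log.append((state.name, target.name, reason))
    return target
```

Because every state change carries a reason string, a “mystery port drop” in the field reduces to reading the last few log tuples for that port.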
Budget policy: priorities, preemption, and staged power limiting
Total PoE capacity must be treated as a managed pool with headroom. Under shortage, ports should degrade in a predictable order (limit → reduce → shut down) rather than collapsing into random drops.
- Total budget: available PSU power (after temperature derating) minus reserved margin for stability.
- Port priorities: protect critical endpoints first (industrial controllers, safety cameras), then best-effort loads.
- Preemption rules: define which ports can be reduced or disabled when a higher-priority port requests power.
- Reason codes: “Budget-Preempt” must be distinguishable from “Overcurrent” and “Overtemperature.”
| Port class | Priority | Max power | Degrade order | Log reason |
|---|---|---|---|---|
| Industrial control | High | Capped by policy | Limit only | Budget-Limit |
| Security cameras | Medium | Negotiated | Reduce → Limit | Budget-Reduce |
| AP / best-effort | Low | Negotiated | Reduce → Shut | Budget-Preempt |
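The degrade order in the table can be expressed as a small budget resolver: grant in priority order and shed the lowest-priority ports first, always with a distinguishable reason code. Port names, priorities, and wattages below are illustrative.

```python
# Sketch: PoE budget shortage resolved by priority order with explicit
# reason codes. All numbers and port names are illustrative assumptions.

def resolve_shortage(ports, pool_w, headroom_w):
    """ports: list of (name, priority, requested_w); lower priority number =
    more protected. Returns (grants, log); log records every reduction."""
    budget = pool_w - headroom_w          # reserve fixed headroom for stability
    grants, log = {}, []
    for name, prio, req in sorted(ports, key=lambda p: p[1]):
        if req <= budget:
            grants[name] = req            # fully granted
            budget -= req
        elif budget > 0:
            grants[name] = budget         # partially granted
            log.append((name, "Budget-Reduce"))
            budget = 0
        else:
            grants[name] = 0              # shed entirely
            log.append((name, "Budget-Preempt"))
    return grants, log
```

Because the outcome is a deterministic function of priority and budget, the same shortage always produces the same ordered degradation, which is what makes the field behavior explainable.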
Port-level protection: OCP/short, thermal, surge (actions and recovery)
Protection logic must be staged so the port can degrade gracefully before hard shutdown, and it must always record a cause and a snapshot.
- Overcurrent / short: fast trip → cooldown → limited retries; lockout after repeated faults.
- Overtemperature: derate first, then shut down if necessary; use hysteresis to avoid oscillation.
- Surge / transient: record event count and last-trip cause; avoid turning a transient into an indefinite shutdown loop.
802.3bt (4PPoE) cable heating: derating thresholds and predictable behavior
Under 802.3bt power levels, cable and connector heating can dominate reliability. The PSE should expose a tiered response model: Warning → Derate → Shutdown, each with a clear recovery condition and a rate-limited alarm strategy.
- Warning: notify and prepare to reduce power on low-priority ports.
- Derate: apply staged power limits per port class while keeping critical endpoints alive.
- Shutdown: controlled port-off for lowest priority when thermal margin is exhausted.
H2-7 · Thermal design & telemetry closed loop (sense → decide → act → log)
Thermal design in an edge aggregation switch is a closed loop, not a heatsink checklist. The goal is predictable behavior under heat: measure the right points, decide with stable thresholds, act in stages, and leave proof in logs.
Heat-source decomposition (what actually drives temperature)
A campus/industrial aggregation switch concentrates four major heat contributors. Each source has a different “power shape,” which determines which telemetry matters and which control action works.
- Switch ASIC: load-dependent power (queues, shaping, high-throughput forwarding) can create short thermal spikes.
- PHY/retimers: link mode and speed changes can produce step-like power shifts and temperature transitions.
- PoE PSE: port power is often the largest contributor; endpoint mix and cable heating dominate steady-state thermal load.
- DC/DC stages: losses move hotspots across the board depending on input voltage and port distribution.
Sense: sensor placement that enables root-cause isolation
“More sensors” is not the same as “better diagnosis.” A usable thermal loop separates hotspots from ambient and airflow effects and correlates PoE power with temperature rise.
Decide → Act: staged control (fan curve, PoE derate, port shutdown)
The thermal controller should avoid oscillation. Use hysteresis and time windows, then apply staged actions: Warning → Derate → Shutdown. Derate should happen before shutdown, and shutdown should be selective by port priority.
| State | Trigger (example) | Actions | Recovery | Log reason |
|---|---|---|---|---|
| Warning | Hotspot trending up | Raise fan PWM, start trend logging | Temp slope normal | Thermal-Warn |
| Derate | Hotspot above limit | PoE staged limits by priority, fan curve max | Below threshold + hysteresis | Thermal-Derate |
| Shutdown | Critical temperature | Selective low-priority port off, protect silicon | Cooldown window | Thermal-Shutdown |
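The state table above can be sketched as a small controller with hysteresis, so the state cannot toggle rapidly around a threshold. The thresholds (°C) and hysteresis margin are illustrative assumptions.

```python
# Sketch: Warn → Derate → Shutdown staging with hysteresis. Escalation is
# immediate; de-escalation is one level at a time, and only once the
# temperature is HYST below the trigger of the current state.
# Thresholds are illustrative assumptions.

LEVELS = ["normal", "warning", "derate", "shutdown"]
THRESH = {"warning": 70, "derate": 85, "shutdown": 100}  # escalation triggers
HYST = 5                                                 # de-escalation margin

def next_state(state: str, temp_c: float) -> str:
    # Highest level whose trigger is currently met.
    target = "normal"
    for lvl in ("warning", "derate", "shutdown"):
        if temp_c >= THRESH[lvl]:
            target = lvl
    if LEVELS.index(target) >= LEVELS.index(state):
        return target                            # escalate (or hold)
    if temp_c < THRESH[state] - HYST:
        return LEVELS[LEVELS.index(state) - 1]   # staged recovery
    return state                                 # inside hysteresis band: hold
```

Note that recovery is staged (shutdown → derate → warning → normal) rather than jumping straight to normal, which matches the “cooldown window” recovery column and prevents alarm oscillation.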
Log: graded alarms and evidence snapshots
Thermal problems repeat in the field. Logging must capture what changed and why the system acted. Trend-based logging is more useful than single-point values.
- Graded alarms: Info / Warning / Critical aligned to actions (fan raise / derate / shutdown).
- Trend evidence: window average + slope for hotspot, plus inlet/outlet delta.
- Action evidence: fan PWM/RPM, PoE total W, affected ports, limit level, cooldown timers.
- Root-cause tags: port-off reason (Thermal vs OCP vs Budget) and PoE fault code when relevant.
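The evidence bullets above can be captured as one structured record per action, appended as a line of JSON. Field names here are illustrative, mirroring the trend/action/root-cause categories listed.

```python
# Sketch: one evidence snapshot per thermal action, so every derate or
# shutdown is explainable later. Field names are illustrative assumptions.

import json
import time

def thermal_snapshot(hotspot_c, slope_c_per_min, fan_pwm_pct, fan_rpm,
                     poe_total_w, affected_ports, action, reason):
    """Serialize a single action record (append-only log, one JSON per line)."""
    return json.dumps({
        "ts": time.time(),
        "hotspot_c": hotspot_c,
        "slope_c_per_min": slope_c_per_min,      # trend, not just a point value
        "fan": {"pwm_pct": fan_pwm_pct, "rpm": fan_rpm},
        "poe_total_w": poe_total_w,
        "affected_ports": affected_ports,
        "action": action,      # e.g. "fan-raise" | "derate" | "port-off"
        "reason": reason,      # e.g. "Thermal-Derate" (vs OCP vs Budget)
    })
```

Keeping the slope alongside the point value is what makes trend-based diagnosis possible after the fact.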
Telemetry map (measure → owner → use → threshold/action)
| Metric | Who measures | Used for | Threshold / action | Log fields |
|---|---|---|---|---|
| ASIC hotspot °C | On-die / board sensor | Derate + protect silicon | Warn/Derate/Shutdown | Temp, slope, state |
| PSE temperature °C | PoE controller | Port derate triggers | Tiered limit | Reason code |
| Inlet/Outlet °C | Board sensors | Fan curve stability | Curve select + alarms | Delta + slope |
| PoE total W | PSE/MCU | Thermal correlation | Derate thresholds | Ports impacted |
| Fan PWM/RPM | MCU | Cooling effectiveness | Fan fault → derate | PWM, RPM, alarm |
H2-8 · Ruggedization for campus/industrial (surge/ESD, isolation boundaries, uptime)
“Industrial-grade” means the box survives real field stress without unpredictable behavior. This section focuses on inside-the-chassis design: port surge/ESD resilience, grounding boundaries, and device-level uptime features.
Field killers (three ways deployments fail)
Most campus/industrial failures are repeatable patterns. A rugged switch should defend against these with evidence-driven telemetry: surge/ESD events, thermal stress, and configuration-driven instability.
Port-side surge/ESD: protection placement and side effects (inside the box)
Port entry protection must be designed as a chain: absorb fast transients, control return paths, and keep the link stable. The key is not “strongest clamp,” but predictable behavior and measurable impact.
- Placement principle: protect at the connector boundary and ensure the transient return path is controlled inside the chassis.
- Side-effect awareness: added parasitics can degrade signal integrity; monitor CRC/FEC counters and link-flap events.
- Evidence requirement: count transient trips and correlate with link retrain, error bursts, or port resets.
Grounding and shielding boundaries (chassis vs signal vs PoE return)
Rugged behavior depends on clear current boundaries inside the enclosure. The chassis, logic/signal domain, and PoE power return must be treated as distinct regions with intentional coupling points.
- Chassis domain: provides a controlled return path for transient energy and enclosure bonding.
- Logic/signal domain: protects sensitive timing and switching domains from high di/dt return currents.
- PoE return domain: high-power return currents should not pollute signal references; enforce boundary discipline.
Device-level uptime: dual power, fan redundancy, and self-recovery
High availability at the edge starts inside the device. Rugged switches should survive single failures without collapsing into long outages.
- Dual power inputs: device-internal switchover and monitoring with clear alarms and cause codes.
- Fan redundancy: fan failure should trigger a policy shift (raise fan targets on remaining fans and stage PoE derating).
- Port self-recovery: controlled retry, cooldown timers, lockout thresholds, and reason codes prevent endless flap loops.
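The retry/cooldown/lockout rule in the last bullet can be sketched as a tiny per-port policy object. Retry and cooldown limits are illustrative assumptions.

```python
# Sketch: controlled port self-recovery — limited retries with cooldown,
# then lockout — so a faulty port cannot flap forever. Limits illustrative.

class PortRecovery:
    def __init__(self, max_retries: int = 3, cooldown_s: float = 30.0):
        self.max_retries = max_retries
        self.cooldown_s = cooldown_s
        self.retries = 0
        self.last_fault_t = None

    def on_fault(self, now_s: float) -> str:
        """Called on each port fault; returns the action as a reason code."""
        self.retries += 1
        self.last_fault_t = now_s
        if self.retries > self.max_retries:
            return "LOCKOUT"               # manual intervention required
        return "RETRY-AFTER-COOLDOWN"

    def may_retry(self, now_s: float) -> bool:
        """Retry only after the cooldown window and while under the limit."""
        if self.retries > self.max_retries:
            return False
        return (self.last_fault_t is None
                or now_s - self.last_fault_t >= self.cooldown_s)
```

The lockout state, together with the reason codes, is what turns an endless flap loop into a single explainable event in the log.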
Threat → mitigation → observable evidence (inside-the-box checklist)
| Threat | Where it hits | Mitigation (inside box) | Observable evidence | Logs |
|---|---|---|---|---|
| Surge / ESD | Port entry | Protection chain + controlled return path | CRC/FEC bursts, link retrain counters | Transient event count |
| High temperature | Hotspots + airflow | Staged warn/derate/shutdown | Trend slope + inlet/outlet delta | Thermal reason codes |
| Misconfiguration | Policy plane | Guardrails + alarms + safe defaults | Gate-miss counters, drift alarms | Config-change audit |
H2-9 · Management & security baseline (OOB mgmt, firmware integrity, safe defaults)
A campus/industrial edge aggregation switch must be operable and basically trustworthy by default. This baseline focuses on OOB access, firmware integrity, safe defaults, and NOC-ready telemetry—without turning the device into a security gateway.
Scope guard (baseline only)
In scope:
- OOB/console access, break-glass recovery
- Config backup, change audit, rollback
- Secure/measured boot concepts, signed updates
- Safe defaults (min services, least privilege)
- NOC telemetry: thermal/power/ports/time alarms
Out of scope:
- Firewall, ZTNA, IDS/IPS, DPI
- DDoS mitigation, threat hunting, SOC workflows
- Network-wide security architecture
Management-plane access (OOB, console, and break-glass)
Operations depend on having at least one reliable management path even when the data plane is misconfigured or unstable. A practical baseline separates routine remote management (OOB) from last-resort local recovery (console).
- OOB Ethernet: dedicated management connectivity for inventory, monitoring, and controlled upgrades.
- Serial / USB console: break-glass access for recovery when IP access fails (e.g., wrong ACLs, bad certs, lost mgmt IP).
- Service minimization: only required management services enabled; risky or legacy services disabled by default.
Configuration lifecycle (backup → change audit → rollback)
Configuration must be treated as a controlled asset. A baseline that engineers trust provides versioned backups, auditable changes, and a “last-known-good” rollback path.
| Capability | What it enables | Field failure it prevents | Evidence to log |
|---|---|---|---|
| Versioned backup | Restore known state quickly | Irrecoverable drift | Config hash, timestamp |
| Change audit | Trace who/what/when | Silent outages from edits | User, diff tag, commit ID |
| Rollback (last-known-good) | Undo bad changes safely | Bricked mgmt plane | Rollback reason code |
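The versioned-backup plus last-known-good rollback pattern can be sketched as a content-hashed store. Class and method names are illustrative, not a vendor API.

```python
# Sketch: a "last-known-good" configuration store — versioned backups keyed
# by content hash, with an explicit rollback reason code. Illustrative only.

import hashlib

class ConfigStore:
    def __init__(self):
        self.versions = []        # list of (hash, config_text)
        self.known_good = None    # index of last validated version

    def commit(self, config_text: str) -> str:
        """Store a new version; the content hash is the audit identifier."""
        digest = hashlib.sha256(config_text.encode()).hexdigest()[:12]
        self.versions.append((digest, config_text))
        return digest

    def mark_known_good(self):
        """Call only after the new config passes validation in service."""
        self.known_good = len(self.versions) - 1

    def rollback(self, reason: str):
        """Return (config, evidence) for the last-known-good version."""
        digest, text = self.versions[self.known_good]
        return text, {"rollback_to": digest, "reason": reason}
```

The evidence dict is the log entry: a rollback without a recorded reason code is exactly the “silent outage” the table warns about.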
Firmware integrity (signed updates + non-bricking rollback)
Baseline trust comes from a controlled boot chain and controlled update chain: the device should refuse unauthorized images and recover from failed updates without becoming unreachable.
NOC-ready telemetry (minimum set that must be observable)
Telemetry is only useful if it drives decisions. A baseline set should cover thermal/power, port health, and time alarms, with clear severity and reason codes.
| Telemetry | Why it matters | Alarm examples | Evidence fields |
|---|---|---|---|
| Temps / fan RPM | Prevents silent thermal collapse | Warn/Derate/Shutdown | Temp, slope, PWM, reason |
| PoE total W + per-port W | Explains derating and port drops | Thermal vs budget derate | Ports impacted, fault code |
| Port error counters | Detects link instability | CRC/FEC bursts, flap | Counter deltas, timestamps |
| Time alarms | Protects TSN determinism | Sync loss, drift threshold | State, duration, reason |
H2-10 · Validation & production checklist (prove TSN/Time/PoE/Thermal works)
Validation is not “it seems fine.” It is repeatable proof across TSN, timestamps, PoE behavior, and thermal policies, with captured evidence that can be compared across firmware versions and production batches.
How to read this checklist
- Setup lists the minimum test tools and conditions.
- Steps define a repeatable sequence.
- Pass criteria use behavior-based thresholds (e.g., stable or smoothly drifting behavior vs random jumps).
- Evidence specifies counters/plots/log fields that must be saved for later comparison.
✅ TSN checklist (Qbv / Qci focus)
- Steps: Qbv window test — run periodic critical flows and record egress timing patterns per cycle.
- Steps: Qci injection test — inject abnormal/burst flows and verify policing/filtering protects critical queues.
- Pass criteria: egress timing aligns to the configured gate windows with stable periodicity.
- Pass criteria: abnormal flows are contained (drop/shape) without pushing critical traffic beyond its latency budget.
- Evidence: egress timestamp series, queue occupancy stats, per-stream violation counters.
- Evidence: drop/police counters linked to injected flows and test timestamps.
✅ Timestamp checklist (MAC vs PHY consistency under temperature/load)
- Steps: baseline consistency — compare MAC vs PHY timestamps on the same path under stable conditions.
- Steps: thermal drift — repeat comparisons while temperature ramps; record drift over time windows.
- Steps: load sensitivity — repeat comparisons under idle vs full load; look for random jumps vs monotonic drift.
- Pass criteria: MAC/PHY deltas remain stable or drift smoothly with temperature (predictable).
- Pass criteria: no random step changes tied to load or queue behavior that break deterministic timing assumptions.
- Evidence: delta vs time plots (MAC−PHY), temperature trace, load state tags.
- Evidence: queue stats and timestamp alarm states at the same time marks.
✅ PoE checklist (fault behavior, priority under budget limits, thermal interaction)
- Steps: short/overload — confirm the port enters a fault state and recovers with cooldown rules.
- Steps: budget starvation — push total PoE power beyond budget and verify priority behavior is deterministic.
- Steps: thermal coupling — raise thermal stress and verify derating happens before selective shutdown.
- Pass criteria: fault behavior is predictable (protect → log → recover); no endless flap loops.
- Pass criteria: budget actions match port priority — critical endpoints remain powered longer than non-critical ones.
- Pass criteria: reason codes distinguish thermal derate vs budget derate vs OCP/short.
- Evidence: per-port power/current traces, port state transitions, fault codes, and cooldown timers.
- Evidence: total PoE W and the ordered list of ports impacted by derating/shutdown.
✅ Thermal checklist (full load, hot chamber, fan failure, staged policy)
Steps:
- Full load @ hot: verify hotspots remain bounded and policies engage in the correct order.
- Fan failure: force a fan fault and confirm policy escalation (fan curve change + PoE derate).
- Recovery: cooldown and hysteresis prevent oscillation; verify a stable return to the normal state.
Pass criteria:
- Warn/Derate/Shutdown stages occur without rapid toggling.
- Derate happens before shutdown; shutdown is selective and logged with thermal reason codes.
Evidence to capture:
- Hotspot/inlet/outlet traces, fan PWM/RPM, PoE derate levels, port-off reason codes.
- Event timeline correlating thermal triggers to actions.
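The "no rapid toggling" criterion implies hysteresis between stage entry and exit. A minimal state-machine sketch of a Warn/Derate/Shutdown ladder; all thresholds are illustrative, not datasheet limits:

```python
# Hypothetical sketch: a Warn/Derate/Shutdown ladder with hysteresis so the
# policy cannot toggle rapidly around a single threshold. All temperatures
# are illustrative, not datasheet limits.

STAGES = ["normal", "warn", "derate", "shutdown"]
ENTER = {"warn": 75, "derate": 85, "shutdown": 95}  # escalate at/above
EXIT = {"warn": 70, "derate": 80, "shutdown": 90}   # de-escalate below

def next_state(current, hotspot_c):
    cur = STAGES.index(current)
    # Escalate to the highest stage whose entry threshold is crossed.
    target = 0
    for i, stage in enumerate(STAGES[1:], start=1):
        if hotspot_c >= ENTER[stage]:
            target = i
    if target > cur:
        return STAGES[target]
    # De-escalate one stage only after dropping below the exit threshold,
    # otherwise hold the current stage (this is the hysteresis band).
    if cur > 0 and hotspot_c < EXIT[STAGES[cur]]:
        return STAGES[cur - 1]
    return current

state = "normal"
trace = []
for t in [72, 76, 83, 86, 84, 79, 69]:
    state = next_state(state, t)
    trace.append(state)
print(trace)  # ['normal', 'warn', 'warn', 'derate', 'derate', 'warn', 'normal']
```

Note how 84 °C holds `derate` rather than bouncing back to `warn`; that 5 °C band is what keeps the stages from rapid toggling.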
Cross-domain test matrix (conditions × domains)
This matrix prevents “single-point validation.” Every test should be tagged by temperature, load, PoE power, and TSN enablement. The captured evidence becomes the comparison baseline across firmware versions and production units.
| Condition tag | TSN | Timestamp | PoE | Thermal | Evidence ID |
|---|---|---|---|---|---|
| Ambient · Idle · PoE low · TSN off | ✅/❌ | ✅/❌ | ✅/❌ | ✅/❌ | LOG-001 |
| Ambient · Full · PoE high · TSN on | ✅/❌ | ✅/❌ | ✅/❌ | ✅/❌ | LOG-002 |
| Hot · Full · PoE high · Fan fault | ✅/❌ | ✅/❌ | ✅/❌ | ✅/❌ | LOG-003 |
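Generating the full tag grid programmatically helps ensure no combination is silently skipped. A small sketch; the tag vocabulary and LOG-nnn scheme mirror the table above, and real test plans may prune combinations:

```python
# Hypothetical sketch: generate the full condition-tag grid so no combination
# is silently skipped, and bind each run to an evidence ID. The tag vocabulary
# and LOG-nnn scheme mirror the matrix; real plans may prune combinations.
from itertools import product

temps = ["Ambient", "Hot"]
loads = ["Idle", "Full"]
poe_levels = ["low", "high"]
tsn_modes = ["off", "on"]

matrix = []
for i, (t, l, p, s) in enumerate(product(temps, loads, poe_levels, tsn_modes), 1):
    matrix.append({"tag": f"{t} · {l} · PoE {p} · TSN {s}",
                   "evidence": f"LOG-{i:03d}"})

print(len(matrix))       # 16 tagged conditions
print(matrix[0]["tag"])  # Ambient · Idle · PoE low · TSN off
```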
H2-11 · Failure modes & debug playbook (symptom → isolate → confirm → fix)
Field issues become fast to solve when every symptom is treated as a repeatable workflow: read the right counters, run a minimal A/B toggle, confirm with a small experiment, then apply a fix with measurable verification. The playbook below stays inside the switch box (data/time/PoE/thermal) and avoids “network-wide” detours.
- Fingerprint the symptom (what is always true vs occasional noise).
- Isolate in 3 steps (each step = one field to read + one minimal action).
- Confirm with a cheap A/B test (load, temperature, TSN on/off, PoE budget).
- Fix & verify using a metric (counter drops to 0, jitter bound improves, port stops flapping).
TSN: periodic packet loss or latency/jitter spikes
Symptom fingerprint
- Spikes are periodic (repeat every N ms) or load-triggered (only under burst).
- Only critical flows are affected (gate/queue related), or all flows are affected (fabric congestion).
- Drops appear as egress drops (queue/policer) vs ingress drops (ingress policing/filtering).
Isolate in 3 steps
- Read gate/queue counters: `gate_miss`, `queue_occupancy`, `egress_drop`. Action: temporarily throttle best-effort burst (rate limit) and see if the spikes vanish.
- Read per-stream policing/filtering: `psfp_drop`, `psfp_violation`. Action: widen PSFP thresholds for one test flow only, compare drop counters.
- Read schedule alignment health: `schedule_state`, `time_sync_alarm`. Action: disable Qbv for a short window (same load), compare the tail latency bound.
Confirm test (cheap A/B)
- Load A/B: idle vs worst-case burst. If spikes scale with burst, prioritize queue/congestion paths.
- TSN A/B: Qbv/Qci off vs on. If spikes only exist with TSN enabled, prioritize gate schedule + local time alignment.
- Stream A/B: one known-good stream vs suspect stream. If only suspect stream drops, prioritize PSFP/Qci.
Fix & verify
- Gate window repair: enlarge the critical window margin, reduce conflicting best-effort burst near gate boundaries. Verify: `gate_miss` stops increasing.
- Queue discipline: enforce strict priority only where necessary; cap burst with ingress policing. Verify: occupancy peaks flatten; the tail latency bound improves.
- PSFP tuning: set per-stream burst/interval limits to match real traffic. Verify: `psfp_violation` → 0 during normal operation.
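When Qbu preemption is not in use, the gate-window margin must absorb one maximum-size best-effort frame that starts just before the protected window. A back-of-envelope sketch; the 1518 B frame and 8 B + 12 B overheads are standard Ethernet figures, while the helper name is illustrative:

```python
# Back-of-envelope sketch: minimum guard band before a protected Qbv window
# so a maximum-size best-effort frame started just before the gate cannot
# spill into it (only needed when Qbu preemption is not used).
# guard_band_ns is an illustrative helper name.

def guard_band_ns(max_frame_bytes, link_gbps):
    # Wire time of one frame including preamble+SFD (8 B) and the
    # minimum inter-frame gap (12 B). 1 Gb/s == 1 bit/ns.
    wire_bytes = max_frame_bytes + 8 + 12
    return wire_bytes * 8 / link_gbps

print(round(guard_band_ns(1518, 1.0)))   # 12304 ns (~12.3 us) at 1 Gb/s
print(round(guard_band_ns(1518, 10.0)))  # 1230 ns at 10 Gb/s
```

This is why "enlarge the critical window margin" matters most on 1 Gb/s ports: the guard band is an order of magnitude wider than on 10 Gb/s links.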
Telemetry fields to read (Ctrl+F friendly)
| Field / Counter | Meaning | What it isolates | Pass criteria |
|---|---|---|---|
| `queue_occupancy`, `queue_drop` | Queue pressure and drop point | Congestion vs configuration | No sustained saturation during critical windows |
| `gate_miss`, `gate_state` | Schedule miss / gate health | Qbv timing/schedule mismatch | Gate misses do not grow in steady state |
| `psfp_drop`, `psfp_violation` | Per-stream policing outcomes | Qci/PSFP too strict or wrong classification | Violations only during injected abnormal traffic |
| `egress_drop`, `ingress_drop` | Drop stage location | Ingress policing vs egress queue overflow | Drops align with deliberate stress only |
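The pass criteria above are about rates, not absolute counts, so snapshots should be differenced per polling interval. A minimal sketch; counter names mirror the table, and how the snapshots are read (CLI/SNMP/gNMI) is left as an assumption:

```python
# Hypothetical sketch: difference two counter snapshots so "does not grow in
# steady state" becomes testable per polling interval. Counter names mirror
# the table; how snapshots are read (CLI/SNMP/gNMI) is assumed elsewhere.

def deltas(prev, curr):
    """Per-interval increments for every counter present in curr."""
    return {name: curr[name] - prev[name] for name in curr}

prev = {"gate_miss": 10, "egress_drop": 500, "psfp_violation": 3}
curr = {"gate_miss": 10, "egress_drop": 740, "psfp_violation": 3}
d = deltas(prev, curr)
print(d["gate_miss"])    # 0 -> gate schedule healthy this interval
print(d["egress_drop"])  # 240 -> congestion/queue path is the active drop stage
```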
Time: sync lost or occasional time jumps (inside the switch)
Symptom fingerprint
- Jump correlates with link events (flap/retrain) or with load (queue blocking).
- Jump appears on specific ports only (timestamp path) vs global (local time domain/correction).
- Issue worsens when Qbv is enabled (schedule depends on local time coherence).
Isolate in 3 steps
- Read timestamp error counters: `ts_err`, `ts_overflow`, `one_step_fail`. Action: lock to one port and compare errors across ports.
- Read local time health: `time_domain_alarm`, `pll_lock`, `freq_offset_ppb`. Action: remove heavy traffic load (idle test) and see if the jumps disappear.
- Read PHY/MAC latency stability: `link_retrain`, `fec_uncorrect`, `pcs_err`. Action: force a stable link mode (no auto-negotiation during the test), compare drift.
Confirm test
- Load A/B: idle vs full mirror/PoE-heavy traffic. If only full load triggers jumps, prioritize queue/correction interactions.
- Temp A/B: room vs hot (localized heating). If drift scales with temperature, prioritize PHY/clocking stability and calibration.
Fix & verify
- Timestamp path sanity: align MAC/PHY timestamp mode with design intent; verify per-port timestamp errors stop increasing.
- Clock tree stability: ensure the jitter cleaner / PLL stays locked across traffic and temperature; verify `pll_lock` never deasserts during stress.
- Queue protection: prevent correction starvation under congestion; verify time-jump events disappear from logs under worst-case load.
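If the Temp A/B test shows drift scaling with temperature, a rough tempco estimate from logged (temperature, frequency-offset) pairs helps quantify it. A least-squares sketch with illustrative data; the helper name is hypothetical:

```python
# Hypothetical sketch: least-squares slope of frequency offset vs temperature
# (ppb per degree C) from logged samples. A large slope points at PHY/clocking
# calibration rather than queue/correction interactions. Data is illustrative.

def tempco_ppb_per_c(samples):
    """samples: list of (temp_C, freq_offset_ppb) pairs."""
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    num = sum((t - mean_t) * (f - mean_f) for t, f in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    return num / den

samples = [(25, 0.0), (35, 5.2), (45, 10.1), (55, 15.3)]
print(round(tempco_ppb_per_c(samples), 3))  # 0.508 ppb/°C
```

Comparing the fitted slope before and after a calibration or clock-tree fix gives a measurable "verify" step for this symptom.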
PoE: port power flapping (Detect → Classify → Power On → Monitor → Fault)
Symptom fingerprint
- Flap occurs at startup only (inrush/classification) vs during steady power (thermal/overload/LLDP).
- Only certain PD types flap (AP/camera) → suggests negotiation/class profile mismatch.
- Flap frequency increases with ambient temperature → suggests derating/thermal protection.
Isolate in 3 steps
- Read PSE state and reason codes: `pse_state`, `class_result`, `fault_code`. Action: swap in a known-good PD and compare reason codes.
- Read negotiation and allocation: `lldp_power_req`, `power_alloc_w`, `budget_remaining_w`. Action: cap port power to a stable value and see if the flap stops.
- Read protection triggers: `ocp_trip`, `inrush_trip`, `thermal_derate`. Action: distribute load across ports (avoid adjacent hot clusters), compare derate events.
Confirm test
- Budget A/B: full budget vs intentionally constrained budget. Verify port priority behavior matches policy.
- Thermal A/B: force high PoE load on adjacent ports vs spread ports. If only adjacent load fails, prioritize PSE thermal path and derate thresholds.
Fix & verify
- Classification robustness: adjust detection/class timing within standard limits; verify stable `pse_state` transitions (no loop).
- Budget policy: enforce priority tiers (critical PDs never preempted by low priority). Verify: expected ports stay on under deficit.
- Protection tuning: align inrush and OCP thresholds with cable + PD behavior. Verify: `inrush_trip`/`ocp_trip` only during injected faults.
- Derate strategy: derate before shutdown. Verify: derate events appear, but ports stop hard-cycling.
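"Ports stop hard-cycling" becomes testable with an explicit flap definition over the `pse_state` power-on timestamps. A sketch; the 5-cycles-in-60-s rule is illustrative policy, not an 802.3bt requirement:

```python
# Hypothetical sketch: an explicit flap definition over pse_state power-on
# timestamps. The 5-cycles-in-60-s rule is illustrative policy, not an
# 802.3bt requirement.

def is_flapping(power_on_times_s, window_s=60, max_cycles=5):
    """True if more than max_cycles power-ons fall inside any sliding window."""
    for i in range(len(power_on_times_s)):
        j = i
        while (j < len(power_on_times_s)
               and power_on_times_s[j] - power_on_times_s[i] <= window_s):
            j += 1
        if j - i > max_cycles:
            return True
    return False

print(is_flapping([0, 8, 17, 25, 33, 41, 50]))  # True: 7 power-ons in 60 s
print(is_flapping([0, 120, 300]))               # False: normal restarts
```

Running this before and after a threshold change turns "the flap stopped" from an impression into a pass/fail result.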
Thermal: over-temp alarms even though fans look “normal”
Symptom fingerprint
- Alarm triggers at specific workloads (PoE heavy vs traffic heavy) → points to which hotspot dominates.
- Fan RPM is nominal, but inlet/outlet delta is abnormal → airflow short-circuit or blocked path.
- Single sensor reads hot while neighbors stay cool → sensor placement or coupling issue.
Isolate in 3 steps
- Read the sensor map: `asic_temp`, `pse_temp`, `inlet_temp`, `outlet_temp`. Action: correlate temperature rise with PoE power and traffic separately.
- Read the fan control loop: `fan_pwm`, `fan_rpm`, `fan_fault`. Action: step fan PWM up for a short test; if the hotspot does not respond, suspect the conduction/airflow path.
- Read mitigation triggers: `poe_derate`, `port_shutdown_reason`. Action: force PoE load redistribution; compare hotspot response.
Confirm test
- Workload A/B: traffic stress only vs PoE stress only. Identify whether ASIC/PHY or PSE/DC-DC is the dominant heat source.
- Airflow A/B: temporary obstruction check (filters, vents) + inlet/outlet deltas. Confirm airflow effectiveness rather than RPM.
Fix & verify
- Sensor strategy: ensure at least one hotspot sensor per heat island (ASIC / PHY / PSE / DC-DC). Verify: hotspot trend matches real load changes.
- Control policy: fan curve + PoE derate ladder (derate → partial shutdown → hard shutdown). Verify: alarms stop escalating under sustained load.
- Logging: record “why” (threshold crossed, sensor ID, mitigation step). Verify: field RCA is possible from logs alone.
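The workload A/B test above amounts to asking which input the hotspot trace tracks. A minimal Pearson-correlation sketch with illustrative data; the variable names are hypothetical:

```python
# Hypothetical sketch: the workload A/B reduces to asking which input the
# hotspot trace tracks. Pearson correlation against PoE power and traffic
# load separately names the dominant heat island. Data is illustrative.

def corr(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    dx = sum((x - mx) ** 2 for x in xs) ** 0.5
    dy = sum((y - my) ** 2 for y in ys) ** 0.5
    return num / (dx * dy)

hotspot = [60, 64, 69, 73, 78]         # degC over five samples
poe_w = [100, 140, 180, 220, 260]      # total PoE power ramps with hotspot
traffic_gbps = [9, 10, 9, 10, 9]       # traffic load stays roughly flat
poe_driven = corr(hotspot, poe_w) > corr(hotspot, traffic_gbps)
print(poe_driven)  # True -> PSE/DC-DC island dominates, not the ASIC
```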
The part numbers below are common building blocks that directly surface in TSN/PTP/PoE/thermal debug, because they expose counters, alarms, and telemetry used in this chapter. Selection still depends on port count, PHY media, PoE power class, and industrial temperature grade.
| Subsystem | Example part numbers | Why it matters in H2-11 | Typical debug signals / telemetry |
|---|---|---|---|
| TSN / AVB-capable switch silicon | Marvell 88E6390X, Microchip LAN9662, NXP SJA1105 | Queue/gate behavior, TSN counters, cut-through/latency behavior | `queue_occupancy`, `gate_state`, `egress_drop`, per-stream policing counters |
| PHY-side IEEE 1588 timestamping | TI DP83640, Microchip VSC8574 | When "sync jump" correlates with PHY/link events; PHY timestamping reduces timestamp uncertainty close to the wire | `ts_err`, link retrain/PCS error counters, recovered clock / SyncE-related status |
| Jitter cleaner / clock multiplier | Silicon Labs Si5345 | Local time quality and lock stability directly affect Qbv schedule correctness and timestamp correction stability | `pll_lock`, alarm pins/logged events, frequency offset / hold status (device-local) |
| PoE++ PSE controller (802.3bt) | TI TPS23881, ADI LTC4291 + LTC4292 | Port flapping is usually visible as state/reason codes and protection triggers inside the PSE subsystem | `pse_state`, `fault_code`, `lldp_power_req`, `power_alloc_w`, `thermal_derate` |
| 48 V input hot-swap / inrush control | TI LM5069, ADI LTC4286 | Prevents brownouts and hard resets under load insertion; helps separate "power droop" from "TSN/PTP" symptoms | PG/fault pins, current-limit events, PMBus/SMBus telemetry (when available) |
| Digital power monitor (telemetry) | TI INA228 | Turns "it overheats" into measurable power/thermal correlation (PoE island vs ASIC island) | Shunt/bus voltage, current, power, alert thresholds via I²C/SMBus |
| Multi-fan controller (closed loop) | Microchip EMC2305 | Helps prove whether airflow control is working (RPM-based closed loop, stall detection) | `fan_pwm`, `fan_rpm`, `fan_fault`, alert interrupts |
H2-12 · FAQs × 12 (TSN / Time / PoE / Thermal)
These FAQs convert common field questions into actionable checks: each answer provides a quick root-cause split, the minimum counters/logs to read, and a small A/B action to confirm. Example part numbers are included as reference building blocks for this edge aggregation switch class.
Q1 · In TSN, why can throughput look “fine” but latency still be unstable?
Read `queue_occupancy`/`egress_drop`, `gate_miss`, and `psfp_violation` first. Confirm by throttling best-effort traffic or temporarily disabling Qbv for an A/B run. Example parts: Microchip LAN9662, NXP SJA1105.
Q2 · How should a Qbv gate window be sized to fit critical and normal traffic?
Size the critical window with enough margin that `gate_miss` stops growing. Confirm by replaying the same load and verifying a stable tail-latency bound. Example parts: Marvell 88E6390X, Microchip LAN9662.
Q3 · Why can enabling Qbu frame preemption cause strange loss or retransmissions?
Q4 · How do you set Qci policing thresholds without killing normal traffic?
Watch `psfp_violation` and `psfp_drop`; if they rise during normal operation, the contract is too tight or the classification is wrong. Confirm with an injected abnormal burst and ensure only the injected case trips. Example parts: NXP SJA1105, Microchip LAN9662.
Q5 · MAC timestamp vs PHY timestamp—what is usually more stable, and why?
Q6 · Why can time sync appear “locked” but TSN still occasionally hit the wrong gate window?
Read `pll_lock`/`time_alarm`, timestamp error counters, and queue congestion markers around the event. Confirm with a load A/B test (idle vs worst-case) while keeping the same schedule. Example parts: Silicon Labs Si5345, Microchip VSC8574.
Q7 · When the total PoE budget is insufficient, how should port priority be designed safely?
Read `budget_remaining_w`, `power_alloc_w`, and the per-port priority decisions. Confirm by intentionally constraining the budget and verifying that critical ports stay powered while low-tier ports derate or shut down in a predictable order. Example parts: TI TPS23881, ADI LTC4291 + LTC4292.
Q8 · Why do PoE ports “flap” (power cycling repeatedly), and what is the typical trigger chain?
Read `pse_state` and `fault_code` plus `inrush_trip`/`ocp_trip`/`thermal_derate`. Confirm by swapping in a known-good PD and capping port power for an A/B run; if stable, the issue is negotiation or protection thresholds. Example parts: TI TPS23881, ADI LTC4291.
Q9 · After enabling 802.3bt (4PPoE), why is the system hotter—cable, PSE, or DC/DC—and how do you prove it?
Track `port_power_w`, PSE temperature, DC/DC hotspot temperature, and the inlet/outlet delta. Confirm with two tests at the same total PoE power: (a) concentrated adjacent ports vs (b) distributed ports. Example parts: TI INA228, Microchip EMC2305.