TSN Switch / Bridge for Deterministic Industrial Ethernet
A TSN switch/bridge makes Ethernet deterministic by turning traffic into scheduled, admitted, and measurable streams—so latency, jitter, and loss stay within provable bounds under worst-case load.
This page explains the mechanisms (time-aware gating, shaping, admission control, hardware timestamping, and observability) and how to configure and verify them so “works in the lab” becomes repeatable in production and field deployments.
Definition & Scope Guard: What “TSN Switch/Bridge” Covers
Working definition
A TSN switch/bridge is a forwarding node that enforces deterministic behavior using hardware time windows, hardware timestamps, traffic shaping, and admission control, so latency/jitter/loss have explicit bounds and are verifiable.
Role in the system
- Isolation: separates time-aware flows from best-effort traffic to remove queueing uncertainty.
- Bounded behavior: turns “average performance” into a worst-case bound via gates/shapers/admission.
- Proof hooks: exposes measurable points (timestamps/counters/events) to validate deterministic guarantees.
“Deterministic” constrains three outcomes (plus time error)
1) Latency bound
One-way end-to-end worst-case delay is bounded (not just mean/P95).
2) Jitter bound
Packet delay variation is bounded (queue jitter + gate timing + timestamp noise).
3) Loss bound
Drop/late-drop/police-drop are kept below a defined limit under admitted load.
+ Time error
Offset/drift/holdover events shift window alignment and directly affect worst-case behavior.
When a TSN switch/bridge is needed (vs. a managed switch)
- A hard deadline exists (missed deadline is a functional failure, not a “slower UI”).
- Background load changes over time, so queueing jitter must be removed or bounded.
- Multiple endpoints require time coordination (scheduled windows / deterministic triggering).
If traffic is best-effort only, deadlines are soft, or critical flows run on a dedicated link, a managed switch with VLAN/QoS is often sufficient.
Scope Guard (strict)
In-scope (covered on this page)
- Hardware time windows: gating concepts, schedules, guard bands (concept-to-budget).
- Hardware timestamps: tap points, error terms, validation hooks.
- Shaping: controlling burstiness to keep worst-case bounds intact.
- Admission control: ensuring new flows cannot break existing guarantees.
- Verification metrics: counters/events/tests to prove bounds (X placeholders).
Out-of-scope (link only, no deep dive)
- PHY / magnetics / ESD / surge / layout — physical-layer integrity and protection. PHY Co-Design & Protection
- PoE / PoDL — power delivery and thermal/protection co-design. PoE / PoDL
- PROFINET / EtherCAT / CIP stack details & certification — endpoint protocol/stack domain. Industrial Ethernet Stacks
- Ring redundancy (MRP/HSR/PRP) — topology redundancy and switchover mechanics. Ring Redundancy
System map: time-aware vs best-effort flows across a TSN switch/bridge
The diagram highlights the only job of this page: enforce and verify bounds inside the TSN switch/bridge (gates, shaping, timestamps, admission). Physical layer, power, stacks, and redundancy are referenced but not expanded.
Determinism Goals & Key Specs: Latency, Jitter, Loss, and Time Error
What this section locks down
- A measurable definition of deterministic outcomes (bounds, not averages).
- A common dictionary so all later chapters reuse the same metric meanings.
- A budget template that decomposes one-way E2E delay into controllable blocks.
One-way E2E latency decomposition (switch/bridge-centric)
For deterministic design, the worst-case bound is driven by a small set of blocks; the dominant term is most often gate waiting (a missed window) or queueing under burst.
- Ingress processing: parse/classify; becomes a fixed per-hop constant.
- Gate waiting: time until the next open window; worst-case can approach one cycle if a window is missed.
- Queueing + shaping delay: depends on burstiness and shaper parameters; bounded by admission + isolation.
- Fabric + egress scheduling: forwarding latency inside the switch and port scheduling effects.
- Serialization + propagation: hard lower bounds set by frame length, link rate, and cable length.
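The last two blocks are hard lower bounds and can be computed directly. A minimal sketch, assuming a 1518-byte frame, a 1 Gbit/s link, and a copper velocity factor of roughly 0.66c (all illustrative values, not system requirements):

```python
def serialization_time_us(frame_bytes: int, link_rate_bps: float) -> float:
    """Time to clock one frame onto the wire (preamble/IFG ignored for simplicity)."""
    return frame_bytes * 8 / link_rate_bps * 1e6

def propagation_time_us(cable_m: float, velocity_factor: float = 0.66) -> float:
    """Propagation delay for copper cabling; ~0.66c is a common velocity factor."""
    c = 299_792_458.0  # speed of light, m/s
    return cable_m / (velocity_factor * c) * 1e6

# 1518-byte frame at 1 Gbit/s -> ~12.1 us serialization per hop
ser = serialization_time_us(1518, 1e9)
# 100 m of cable -> ~0.5 us propagation
prop = propagation_time_us(100.0)
```

These two terms set the floor of the budget; everything above them (gate waiting, queueing) is what the TSN mechanisms must bound.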
Jitter taxonomy (useful for verification and triage)
Queueing jitter
Variation caused by competing traffic and transient bursts; addressed by isolation, shaping, and admission.
Gate timing jitter
Window boundary uncertainty (schedule alignment, guard band, internal switching granularity).
Timestamp noise
Measurement noise from timestamp tap points and quantization; fixed by consistent tap definition and calibration.
Clock/time error
Offset/drift/holdover events shift schedules and can cause sporadic worst-case violations even when averages look good.
Metric Dictionary (standardized terms)
| Metric | Definition | Obs. point | Common misuse | Pass criteria |
|---|---|---|---|---|
| One-way E2E latency (worst-case) | Maximum one-way delay under admitted load | Endpoint timestamps | Using mean/P95 as “bound” | ≤ X |
| Per-hop residence time | Ingress-to-egress time inside a switch/bridge | Switch HW timestamp/counter | Mixing store-forward and cut-through paths | ≤ X |
| Gate waiting time | Time from arrival to next open window | Ingress + schedule timeline | Ignoring missed-window worst-case | ≤ X |
| PDV (delay variation) | Peak-to-peak or percentile spread of one-way delay | Endpoint timestamps | Comparing different timestamp tap points | ≤ X |
| Window miss count | Packets arriving too late/early for a configured window | Switch gate counters | Assuming “no drops” implies “no misses” | ≤ X / hour |
| Drop breakdown | Drops by queue / policer / buffer watermark | Per-queue counters | Only tracking CRC errors | ≤ X |
| Timestamp error | Tap-to-wire (or wire-to-tap) uncertainty | Calibrated test | Comparing uncalibrated devices | ≤ X ns |
| Time base events | Lock transitions / holdover enters / time step | Time-sync status/event log | Treating “locked” as “always stable” | 0 critical events |
All later chapters should reference these metrics by name. If a test cannot map to a metric above, it is likely not proving determinism.
E2E latency budget template (per-hop block view)
| Block | Worst-case bound | How to measure | Owner | Risk flag |
|---|---|---|---|---|
| Ingress processing | ≤ X | Switch counters / profiling | Switch config | Unexpected parsing path |
| Gate waiting | ≤ X (can approach one cycle) | Schedule timeline + gate counters | TSN schedule | Missed window / drift |
| Queueing + shaping | ≤ X | Per-queue occupancy / shaper stats | Shaping + admission | Microburst / oversubscription |
| Fabric + egress scheduling | ≤ X | Ingress/egress timestamps | Switch architecture | Unexpected store-forward path |
| Serialization | Frame_len / link_rate | Known constants | System spec | MTU growth |
| Propagation | Cable_len / velocity | Cable model / measurement | System design | Topology change |
Any deterministic claim should identify the dominant bound term and show how it is measured (or proven) under admitted load.
Latency budget waterfall: ingress-to-egress blocks and measurement points
The diagram is a repeatable template: each block should map to a measurement method and an owner. If a block cannot be measured or bounded, the design is not deterministic.
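The dominant-term rule above can be sketched numerically: sum the per-hop block bounds and flag the largest contributor. All values below are illustrative placeholders for the X entries, not recommendations.

```python
# Per-hop worst-case bounds (microseconds), one entry per budget block.
budget_us = {
    "ingress_processing": 2.0,
    "gate_waiting": 250.0,      # can approach one cycle if a window is missed
    "queueing_shaping": 40.0,
    "fabric_egress": 5.0,
    "serialization": 12.1,      # frame_len / link_rate
    "propagation": 0.5,         # cable_len / velocity
}

worst_case = sum(budget_us.values())
dominant = max(budget_us, key=budget_us.get)
print(f"worst-case bound: {worst_case:.1f} us, dominant term: {dominant}")
```

With these example numbers, gate waiting dominates, which is exactly the case where a schedule (and its verification counters) matters most.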
TSN Feature Map: Which Mechanism Solves Which Problem
Purpose (mechanism-first, not standards-first)
Determinism is achieved by choosing mechanisms that directly control the dominant term in the worst-case budget. This section maps each mechanism to the outcomes it can guarantee, the cost it introduces, and the scenarios where it is typically used. IEEE clause-level details and certification workflows are intentionally excluded to avoid overlapping with standards/certification pages.
Fast triage: pick the mechanism by the symptom
Hard deadline misses
Use Time-aware gating to bound waiting; add guard band / preemption when long frames overlap window edges.
PDV / jitter grows under load
Use Shaping (burst control) and admission control (resource cap); add policers to isolate misbehaving sources.
Looks “fine” but cannot be proven
Require hardware timestamping and per-queue counters/events; without observability, deterministic claims are not verifiable.
Mechanism-to-Outcome mapping (what each block actually guarantees)
| Mechanism | Primary outcome | Dominant term controlled | Cost / overhead | Typical use | Misuse symptom |
|---|---|---|---|---|---|
| Time-aware gating (Gate) | Bounded latency (windowed service), deterministic scheduling | Gate waiting (missed window dominates worst-case) | Schedule management (GCL), guard bands, time alignment dependency | Motion control, PLC cyclic traffic, deterministic triggers | Sporadic deadline misses; “great average, bad worst-case” |
| Traffic shaping (Shaper) | Bounded jitter under burst; predictable queue growth | Burstiness → queueing (microbursts / overshoot) | Added controllable latency; parameter tuning and validation effort | Imaging streams, mixed cyclic/acyclic networks, gateways | Jitter spikes when background load changes |
| Policing (Policer) | Isolation from misbehaving sources; protects deterministic domain | Ingress overload (rate/burst violations) | Drops/marking; requires clear violation definitions and counters | Multi-vendor integration, field expansion, mixed trust domains | “Random” drops without per-flow accountability |
| Admission control (Admission) | No starvation under admitted set; prevents future bound breakage | Resource cap (bandwidth/slots/queues/buffers) | Requires resource model + workflow (approve/rollback/versioning) | Scalable cells, factories with add-on nodes, shared infrastructures | Works until “one more device” is added, then bounds collapse |
| Hardware timestamp (Timestamp) | Proof and calibration; enables per-hop residence time measurement | Measurement credibility (tap-point consistency) | Requires clear tap definition + error budget + event logging | Any deterministic validation, multi-hop timing, commissioning | Conflicting measurements across tools/devices |
The mapping avoids standard clause references and focuses on engineering outcomes: which dominant term is controlled and how misconfiguration typically manifests.
Typical “mechanism recipes” by application (quick selection)
| Application | Gate | Shaper | Guard/Preempt | Admission | Timestamp |
|---|---|---|---|---|---|
| Motion control | ✓ | ✓ | ✓ | ✓ | ✓ |
| Imaging / vision | ✓ | ✓ | — | ✓ | ✓ |
| PLC cyclic + diagnostics | ✓ | ✓ | — | ✓ | ✓ |
| Robot cell (multi-axis) | ✓ | ✓ | ✓ | ✓ | ✓ |
The recipe matrix is a starting point. The final choice should be driven by the dominant worst-case term and the verification hooks available on the chosen switch/bridge.
TSN toolbox: five blocks and the outcomes they enable
The toolbox view emphasizes mechanism selection by outcome. Each block must map to measurable metrics and counters; otherwise deterministic claims cannot be validated.
Hardware Architecture: Ingress → Classify → Queue → Shape → Gate → Egress
Why the internal pipeline matters
Determinism is implemented inside the switch/bridge pipeline. The location of classification, queueing, shaping, gating, and timestamp taps determines what can be bounded and what can be measured. This section describes the data path and the time path without expanding into PHY signal integrity.
Typical data path (conceptual pipeline)
- Ingress parse/classify: maps frames to traffic class/queue/stream state.
- Per-stream state: eligibility, policing, accounting, and shaping context.
- Queues (per port / per TC): isolation boundary; occupancy drives worst-case delay.
- Shapers/policers: bound burst input and protect deterministic domain.
- Gate scheduler: time-aware service windows and guard-band handling.
- Egress scheduling: final arbitration and transmit timing.
Cut-through vs store-and-forward (mechanism-level implications)
Cut-through
- Lower typical latency, but worst-case must account for internal stalls and path variability.
- Determinism requires measurable per-hop residence time and clear tap-point definition.
Store-and-forward
- Adds a frame-length-dependent delay, but the forwarding model is usually easier to bound.
- When counters are available, the worst case is dominated by queueing/gating rather than by hidden internal variability.
Hardware knobs checklist (selection and bring-up)
| Knob | Why it matters | How to verify | Red flag |
|---|---|---|---|
| Queue / TC count | Sets isolation granularity and scheduling expressiveness | Map flows to queues; confirm no unintended sharing | Key flows share a queue with best-effort |
| Per-port buffer + watermarks | Microburst tolerance and drop behavior; impacts worst-case delay | Stress with bursts; watch occupancy and drop reason counters | Drops occur without accounting detail |
| Gate list depth + granularity | Limits schedule complexity and smallest achievable windows | Implement a representative GCL; verify window miss counters | Cannot express required slots/guards |
| Timestamp tap points + resolution | Determines credibility of residence time and E2E measurements | Confirm ingress/egress timestamp availability and calibration hooks | Tap points are undocumented/inconsistent |
| Per-queue counters/events | Enables proof: drops, late-to-window, shaper eligibility, policing | Check counter coverage; validate with controlled faults | Only global counters exist |
| Telemetry / mirror / event log | Supports field forensics and configuration drift detection | Verify event logs (time base step/lock change) and export path | No reliable black-box signals |
Where to measure (conceptual observation points)
| Observation point | Typical signal | Proves | Common pitfall |
|---|---|---|---|
| Ingress timestamp (T_in) | Hardware timestamp / ingress marker | Per-hop residence time (with T_out) | Tap point mismatch across devices |
| Queue occupancy | Per-queue depth / watermark | Queueing/jitter dominance under load | Only global buffer metrics available |
| Gate miss / late-to-window | Window miss counters/events | Schedule correctness and guard band adequacy | Assuming “no drops” means “no misses” |
| Shaper eligibility | Credit/state counters | Burst control and stability under background load | Misinterpreting shaped delay as failure |
| Egress timestamp (T_out) | Hardware timestamp / egress marker | Residence time + output scheduling effects | Using software timestamps for proof |
Observation points should map to the metric dictionary and the E2E budget blocks. If a critical bound term cannot be measured, deterministic validation is incomplete.
Switch pipeline block diagram: data path and time path (where determinism is enforced)
The diagram separates the data path (frames) from the time path (schedule/sync). Determinism is enforced at queues, shapers/policers, and gates, and proven via consistent timestamp tap points and per-queue counters.
Time-Aware Windows: Gate Control, Schedules, Guard Bands, and Preemption
What this section guarantees (mechanism-level)
Time-aware gating bounds worst-case waiting by turning contention into a scheduled service window. The engineering focus is: how to derive a gate schedule from periodic traffic, why guard bands are required, when preemption becomes necessary, and what multi-hop alignment must ensure. PTP algorithms and standard clause-level explanations are intentionally excluded to avoid overlapping timing/sync pages.
Bound model: what a gate window actually caps
- Window waiting cap: if a queue is closed, the worst-case waiting is bounded by one cycle plus guard-band effects (conceptually).
- Queueing vs scheduling: shaping bounds burst-driven queue growth; gating bounds service timing by design.
- Proof hooks: “late-to-window” / “gate-miss” counters and consistent timestamp tap points are required to verify the bound.
Deriving a gate schedule from periodic traffic (practical workflow)
Inputs required (no protocol details)
- Cycle time: X (control period / sync epoch).
- Per-flow envelope: period, max frame size, deadline, queue/TC mapping.
- Port constraints: link rate, queue count, gate switching overhead X, GCL depth X.
- Policy: how much best-effort bandwidth is allowed without harming key windows.
Step-by-step schedule build
- Group critical flows by deadline and isolation needs; assign them to dedicated queues/TCs where possible.
- Choose a shared epoch (cycle start) that aligns with the system’s control rhythm and commissioning reference.
- Allocate time slots per critical queue so that the slot capacity covers the per-cycle payload at line rate with margin X.
- Insert guard bands at window edges to prevent boundary infringement by non-critical frames.
- Place best-effort windows in remaining space and restrict them by shaping/policing as needed.
- Validate the schedule with “late-to-window / gate-miss” counters and per-hop residence time measurements.
Gate Control List (GCL) template (commissioning-ready)
| Cycle | Slot | Start (offset) | Duration | Gate state (Q0..Qn) | Guard band | Notes |
|---|---|---|---|---|---|---|
| T = X | S0 | 0 | X | Q0:OPEN, others:CLOSED | GB = X | Critical window |
| T = X | S1 | X | X | Q1:OPEN, others:CLOSED | GB = X | Secondary window |
| T = X | S2 | X | X | BE:OPEN (shaped), Q0/Q1:as needed | GB = X | Best-effort window |
| Repeat | S3.. | … | … | … | … | Version / rollback id |
Recommended commissioning practice: version the GCL, define a rollback plan, and validate per-hop residence time and window-miss counters before production rollout.
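The build steps above can be sketched as a single-port schedule generator. Cycle time, flow envelopes, the slot margin, and the guard-band value are illustrative stand-ins for the X placeholders; a real GCL must also respect the device's list depth and granularity limits.

```python
LINK_RATE_BPS = 1e9
CYCLE_US = 250.0
GUARD_BAND_US = 13.0   # must cover the largest best-effort frame near a window edge

flows = [  # (queue, max_frame_bytes, frames_per_cycle) -- example envelopes
    ("Q0", 256, 2),    # critical cyclic traffic
    ("Q1", 512, 1),    # secondary stream
]

def slot_us(frame_bytes, frames, margin=1.2):
    """Slot length: per-cycle payload at line rate, plus margin."""
    return frame_bytes * 8 * frames / LINK_RATE_BPS * 1e6 * margin

gcl, offset = [], 0.0
for queue, frame, n in flows:
    dur = slot_us(frame, n)
    gcl.append({"slot": len(gcl), "start": round(offset, 3),
                "duration": round(dur, 3), "open": queue, "guard": GUARD_BAND_US})
    offset += dur + GUARD_BAND_US

# Remaining cycle time becomes the (shaped) best-effort window.
gcl.append({"slot": len(gcl), "start": round(offset, 3),
            "duration": round(CYCLE_US - offset, 3), "open": "BE", "guard": 0.0})

assert offset < CYCLE_US, "critical slots + guard bands exceed the cycle"
```

Each emitted entry maps onto one row of the GCL template; validation against late-to-window counters happens after deployment, not here.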
Guard-band quick estimation (rule block)
Minimum guard band (structural form)
GB ≥ (MaxFrameTime at line-rate) + (Gate switching overhead X) + (Margin X)
- MaxFrameTime depends on maximum non-critical frame size X and link speed X.
- Switching overhead includes implementation latency when changing gate state (X).
- Margin covers clock error, timestamp uncertainty, and measurement bias (X).
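The structural rule becomes numeric once the X terms are filled in. A minimal sketch, using example values (1518-byte best-effort frame, 1 Gbit/s link, 0.5 us switching overhead, 0.4 us margin):

```python
def guard_band_us(max_be_frame_bytes, link_rate_bps, switch_overhead_us, margin_us):
    """GB >= max-frame time at line rate + gate switching overhead + margin."""
    max_frame_time = max_be_frame_bytes * 8 / link_rate_bps * 1e6
    return max_frame_time + switch_overhead_us + margin_us

# ~12.1 us frame time + 0.9 us overhead/margin -> ~13.0 us guard band
gb = guard_band_us(1518, 1e9, switch_overhead_us=0.5, margin_us=0.4)
```

Note how the frame-time term dominates: this is why shrinking best-effort MTU, or adding preemption, shrinks the required guard band almost one-for-one.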
Fast symptom check (guard band too small)
- Late-to-window / gate-miss counters increase in a periodic pattern.
- Failures correlate with large best-effort frames near window edges.
- Reducing best-effort MTU or adding preemption reduces misses immediately.
When preemption is necessary (decision logic + trade-offs)
| Trigger | Why guard band alone is inefficient | Expected gain | Cost / risk |
|---|---|---|---|
| Narrow windows + large BE frames | Guard band must cover near-max frame time, wasting slot capacity | Smaller guard band, better window utilization | Higher configuration/validation complexity |
| High utilization deterministic schedule | Static gaps accumulate and reduce usable bandwidth | More payload per cycle without violating window edges | Debugging and forensics become harder |
| Strict window-edge integrity must be proven | Without preemption, BE overlap risk must be eliminated by oversized guard bands | Cleaner proof of “no overlap” with proper counters | Requires consistent capability across the path |
Multi-hop note (no sync algorithm details): window phase alignment must ensure the flow’s arrival time lands inside the intended window at every hop with margin X.
Time wheel + gate timeline: cycle, slots, guard band, and preemption boundary
The timeline view highlights the non-obvious constraint: window edges must be protected from overlap. Guard bands and preemption are the primary tools to preserve edge integrity without sacrificing excessive slot capacity.
Not covered here (scope guard)
- PTP/SyncE algorithm internals and topology selection logic.
- Clause-by-clause IEEE interpretations and certification procedures.
Shaping for Determinism: CBS / ATS / Rate-Limit and Queue Discipline
When gating is not required
Many systems achieve sufficient determinism without explicit time windows by bounding burstiness and isolating queues. This section explains how shaping reduces queue growth and packet delay variation, where stability traps occur, and how to decide when a gate schedule becomes mandatory.
Mechanism comparison (what each shaper controls)
CBS
- Best for: steady or near-periodic streams needing bounded PDV.
- Controls: output smoothness via eligibility/credit dynamics.
- Typical pitfall: credit reset/initialization causes periodic “mystery jitter”.
ATS
- Best for: mixed or bursty traffic requiring per-flow timing discipline.
- Controls: burst spreading via time-based eligibility scheduling.
- Typical pitfall: queue coupling hides microbursts unless per-queue telemetry exists.
Rate-limit
- Best for: ingress protection and simple fairness, not strict deadlines.
- Controls: peak rate/burst envelope to prevent overload.
- Typical pitfall: “average looks OK” while microbursts still hit watermarks.
Shaper stability traps (symptom → quick check → fix)
Microburst dominance
Quick check: correlate PDV spikes with queue watermarks/occupancy peaks (not averages).
Fix: tighten ingress policing, isolate queues, and set buffer watermarks X with telemetry.
Credit/state discontinuities
Quick check: jitter spikes repeat with a fixed cadence tied to state resets or link events.
Fix: validate shaper initialization, avoid hidden resets, and log eligibility state transitions.
Queue coupling
Quick check: key flows share a queue/TC with best-effort or diagnostics traffic.
Fix: dedicate queues for key classes, separate buffer pools where supported, and enforce admission rules.
Shaper selection decision tree (gate vs CBS vs ATS vs rate-limit)
- If the requirement is a hard deadline bound at specific cycle phases → prefer Gate (and guard/preempt as needed).
- If traffic is near-periodic and the target is bounded PDV under load → choose CBS with strict queue isolation.
- If traffic is bursty/mixed and needs controlled eligibility timing → choose ATS and verify per-flow/queue telemetry.
- If the goal is ingress protection and preventing overload rather than strict determinism → apply rate-limit/policer plus admission.
- If adding one more stream risks breaking existing bounds → enforce admission control regardless of shaper choice.
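The decision tree above can be written as a function. The boolean requirement flags are an illustrative encoding, not a standard API; admission applies regardless of the branch taken.

```python
def pick_mechanism(hard_deadline: bool, near_periodic: bool,
                   bursty: bool, ingress_protection_only: bool) -> str:
    """Map the dominant requirement to a shaping/gating mechanism."""
    if hard_deadline:
        return "gate (+ guard band / preemption as needed)"
    if near_periodic:
        return "CBS with strict queue isolation"
    if bursty:
        return "ATS with per-flow/queue telemetry"
    if ingress_protection_only:
        return "rate-limit/policer + admission"
    return "review requirements: no dominant term identified"
```

For example, `pick_mechanism(False, True, False, False)` selects CBS, matching the near-periodic branch of the tree.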
Microburst control checklist (field-stable determinism)
- Ingress rate guard: define burst envelope and enforce with policing.
- Queue isolation: keep key flows away from best-effort and diagnostics queues.
- Buffer watermarks: set early warning thresholds X and log occupancy peaks.
- Telemetry: require per-queue occupancy peak, drop reason, eligibility/credit state.
- Admission: prevent incremental additions from silently breaking bounds.
Queue + shaper behaviors: no shaping vs CBS vs ATS (occupancy over time)
The panels focus on occupancy peaks rather than averages. In practice, microburst control must be validated with per-queue occupancy peaks, watermarks, and eligibility/credit telemetry.
Not covered here (scope guard)
- Protocol stack specifics, parameter tables for particular industrial profiles, and certification workflow details.
- PHY/magnetics/layout/EMI topics (handled in the PHY co-design and protection pages).
Admission Control & Isolation: Keep the Network from Becoming a Tuning Lottery
Why admission exists (determinism needs contracts)
Gates and shapers control forwarding behavior, but without admission the guarantees are eventually broken by new flows, new tenants, or untracked diagnostics traffic. Admission turns determinism into an enforceable contract by making bandwidth, queues, gate slots, buffers, and time budget explicit and versioned.
Admission input → decision → contract output
Inputs (flow descriptor)
- Period: X
- Max frame: X
- Priority / TC: X (and queue mapping)
- Path: ingress port → … → egress port(s)
- Bound required: latency X / jitter X / loss X
- Burst policy: allowed / disallowed (X)
Outputs (decision + reservation contract)
- Decision: Admit / Reject / Admit-with-conditions
- Reserved resources per hop: slot time X, shaper rate X, buffer quota X
- Isolation mapping: dedicated queue/TC + counters scope
- Monitoring hooks: thresholds X + violation actions
- Versioning: config id + rollback pack
Resource dimensions admission must account for
- Bandwidth: per-cycle service capacity and sustained load margins (X).
- Queues & priorities: queue count, TC mapping, and coupling risk from shared queues.
- Gate slots: available slot time, phase margin (X), and GCL depth (X).
- Buffers: per-port buffers, shared pools, watermarks (X), and drop reasons.
- Time budget: per-hop residence time tail + window waiting cap + guard band margin (X).
Common failure pattern: “average bandwidth looks fine” while slot occupancy, buffer peaks, or window-miss counters silently cross the determinism contract.
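A per-hop admission check can be sketched by comparing a flow descriptor against remaining slot time, bandwidth, and buffer quota. The dict schema and all numbers are illustrative placeholders for the worksheet's X values, not a vendor API.

```python
def admit(flow, hop):
    """Return (decision, reasons) for one hop; flow/hop are plain dicts."""
    reasons = []
    # Gate slot: the frame must fit in the remaining slot time at line rate.
    slot_need_us = flow["max_frame_bytes"] * 8 / hop["link_rate_bps"] * 1e6
    if slot_need_us > hop["free_slot_us"]:
        reasons.append("insufficient gate slot time")
    # Bandwidth: sustained rate implied by one max frame per period.
    rate_need_bps = flow["max_frame_bytes"] * 8 / (flow["period_us"] * 1e-6)
    if rate_need_bps > hop["free_rate_bps"]:
        reasons.append("insufficient shaper/bandwidth reservation")
    # Buffer: at least one max frame of quota must remain.
    if flow["max_frame_bytes"] > hop["free_buffer_bytes"]:
        reasons.append("insufficient buffer quota")
    return ("Admit" if not reasons else "Reject", reasons)

flow = {"max_frame_bytes": 512, "period_us": 250.0}
hop = {"link_rate_bps": 1e9, "free_slot_us": 20.0,
       "free_rate_bps": 50e6, "free_buffer_bytes": 4096}
decision, why = admit(flow, hop)
```

A full implementation would iterate this over every hop on the path and emit a versioned reservation contract on Admit; the point here is that the check is against slot/buffer peaks, not average bandwidth.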
Admission worksheet (template)
| Flow ID | Owner | Period | Max frame | Priority/TC | Path | Bound target | Reservation per hop | Monitor & thresholds | Admit? |
|---|---|---|---|---|---|---|---|---|---|
| F-001 | Line-A | X | X | TC0 / Q0 | P1→P3→P7 | Latency X, Jitter X | slot X, shaper X, buf X | wm X, miss X, drop X | Admit |
| F-002 | Diagnostics | X | X | BE / Qx | P2→P6 | Loss X | policer X, buf X | police-hit X, drop X | Conditional |
| … | … | … | … | … | … | … | … | … | Version |
Operational rule: any new flow must be recorded, evaluated, and admitted with a versioned contract; direct “add it and see” changes convert determinism into a tuning lottery.
Isolation rules (principles that prevent configuration collisions)
Principles
- Critical classes must have dedicated queue/TC mapping and independent bounds.
- Best-effort and diagnostics must never share a critical queue.
- Every admitted class must have scoped counters and thresholds (X).
Enforcement points
- Namespace configs by tenant / line / cell to avoid cross-edits.
- Lock “critical zones” (queues/slots) behind approval and rollback packaging.
- Assign hard quotas per tenant: bandwidth/slot/buffer/counter budgets (X).
Runtime monitoring & violation handling (contract must be provable)
- Monitor: queue occupancy peak, watermark hits, window-miss/late counters, drop reasons, policer hits.
- Detect: threshold breach (X) tied to a specific flow/tenant and port.
- Act: rate-limit, downgrade to BE, isolate to a sacrificial queue, or reject renewal.
- Audit: log config version + violation context for rollback and root-cause.
Admission pipeline: request → resource calc → policy → deploy → monitor → handle violations
The key requirement is traceability: admission decisions, reserved resources, and violation actions must all be tied to a versioned configuration and an auditable owner/tenant boundary.
Not covered here (scope guard)
- Industrial protocol stack parameters and certification procedures.
- Timing/sync algorithm internals (handled in Timing & Sync pages).
Hardware Timestamping & Time Base: Accuracy, Two-Step, and Residence Time
Why timestamps must be trustworthy
Deterministic forwarding still fails closed-loop control if timestamp tap points drift or time base states are not observable. This section focuses on hardware timestamp insertion points, residence time observability, one-step vs two-step engineering trade-offs, and a practical error budget structure.
Time base sources and observable states (no algorithm details)
- Local oscillator: simple deployment; requires drift monitoring and defined holdover triggers (X).
- Synchronized reference: better long-term accuracy; requires lock/offset/drift observability.
- Recovered/servo time: transitions into holdover on loss; must define re-lock behavior and alarm thresholds (X).
Minimum observables (sanity signals)
- Lock state (stable for X time)
- Offset (within X) and drift rate (within X)
- Holdover entry conditions and maximum allowed duration (X)
Timestamp tap points: where error is created
- Ingress timestamp: captures arrival into the switch pipeline; determines start of residence time.
- Egress timestamp: captures departure; determines the tail behavior under congestion and shaping.
- MAC vs PHY boundary: a more line-side tap reduces unmodeled delay; an inner tap is easier but adds tap uncertainty (X).
Residence time and one-step vs two-step (engineering view)
Residence time
The measurable quantity is the time spent inside the device from ingress tap to egress tap. Verification should focus on tail percentiles (P99/P999) with thresholds X, not only averages.
One-step vs two-step
- One-step: in-line update at line rate; demands tight implementation and validation.
- Two-step: follow-up event simplifies implementation; requires robust event matching and consistency checks.
- Debug impact: two-step can improve forensics by separating transmission and correction paths.
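The tail-percentile check on residence time can be sketched as follows. The timestamp data here is synthetic (a rare queueing tail injected every 100th frame), and the 25 us threshold stands in for the X value.

```python
def percentile(sorted_vals, p):
    """Nearest-rank percentile on a pre-sorted list."""
    k = max(0, min(len(sorted_vals) - 1, int(round(p / 100 * len(sorted_vals))) - 1))
    return sorted_vals[k]

# Synthetic ingress/egress timestamps (us): one frame per 250 us cycle,
# 8 us nominal residence, +12 us tail on every 100th frame.
t_in = [i * 250.0 for i in range(1000)]
t_out = [t + 8.0 + (12.0 if i % 100 == 99 else 0.0)
         for i, t in enumerate(t_in)]
residence = sorted(o - i for i, o in zip(t_in, t_out))

p99 = percentile(residence, 99.0)
p999 = percentile(residence, 99.9)
tail_ok = p999 <= 25.0   # threshold X, example only
```

Note that the P99 here looks clean while the P999 exposes the tail, which is why averages and low percentiles cannot substitute for a bound.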
Timestamp error budget (structure, thresholds as X)
| Error term | What it represents | How to observe | Threshold |
|---|---|---|---|
| Time base | offset/drift/holdover effects | lock/offset/drift counters | X |
| Quantization | timestamp resolution limit | timestamp LSB and distribution | X |
| Tap uncertainty | MAC/PHY boundary ambiguity | calibration + loopback compare | X |
| Queueing tail | residence time percentile tail under load | P99/P999 residence time (X) | X |
Pass criteria suggestion: treat time base and tap uncertainty as bounded terms, and validate queueing tail with a load profile representative of production (thresholds X).
Sync sanity checklist (commissioning & field ops)
- Lock: stable for X time; no flapping.
- Offset: within X; verify after temperature steps.
- Drift: within X per unit time; flag trend increases.
- Holdover: trigger conditions defined (X), and maximum allowed duration (X).
- Recovery: post re-lock behavior verified; re-calibrate window phase if required.
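The checklist reduces to a monitoring predicate over the time-base observables. Field names and thresholds below mirror the X placeholders and are examples only.

```python
def time_base_ok(status, max_offset_ns=100, max_drift_ppb=50,
                 max_holdover_s=10, min_lock_s=60):
    """status: hypothetical dict of observables from the time-sync status log."""
    checks = {
        "lock_stable": status["locked"] and status["lock_age_s"] >= min_lock_s,
        "offset":      abs(status["offset_ns"]) <= max_offset_ns,
        "drift":       abs(status["drift_ppb"]) <= max_drift_ppb,
        "holdover":    status["holdover_s"] <= max_holdover_s,
    }
    return all(checks.values()), [k for k, ok in checks.items() if not ok]

ok, failed = time_base_ok({"locked": True, "lock_age_s": 300,
                           "offset_ns": 40, "drift_ppb": 12, "holdover_s": 0})
```

The returned failure list is what should feed alarms and, per the configuration chapter, gate any schedule commit.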
Timestamp tap points: ingress/egress (MAC vs PHY boundary) and error sources
Practical verification: document the exact tap definition, treat residence time as a percentile distribution, and require time base lock/offset/drift observability with clear holdover triggers (X).
Not covered here (scope guard)
- PTP/SyncE/White-Rabbit protocol algorithm details and topology selection.
- PHY signal integrity and magnetics/layout considerations.
Configuration & Parameterization: GCL Tables, Profiles, and Guardrails
Why configuration must be engineered (not “tuned”)
TSN failures often come from table drift, partial updates, and unmanaged profiles rather than incorrect theory. A deterministic network needs configuration assets that are packaged, validated, staged, committed atomically, and rolled back reliably.
Configuration inventory: the tables a TSN switch/bridge must manage
Data-plane tables
- GCL: cycle, slots, gate masks, guard band (X).
- Stream/flow table: period, max frame, path, class, burst policy.
- Queue/TC mapping: priority → TC → queue; no holes or critical sharing.
- Shaper params: CBS / ATS / rate-limit knobs aligned to the chosen profile.
- Admission quotas: per-hop reservations and violation actions.
Time & timestamp configuration
- Time-base policy: lock/offset/drift/holdover thresholds (X).
- Timestamp mode: one-step/two-step, tap definition, residency measurement enable.
- Update gating: no schedule commit when time base is not locked.
Ops guardrails
- Thresholds & alarms: gate miss, late/early window, watermark, timestamp jump, holdover events.
- Rollback strategy: shadow bank + atomic commit + complete rollback pack.
- Audit: every applied profile ties to a version id and change record.
Parameter Set package: directory rules for a versioned config bundle
- /manifest.json — version, target device, port-rate constraints, prerequisites, checksums.
- /tables/gcl.csv — cycle, slot list, gate masks, guard band (X).
- /tables/queue_map.json — priority/TC/queue mapping rules.
- /tables/streams.csv — flow descriptors, path and bound targets.
- /tables/shapers.json — CBS/ATS/rate-limit parameters.
- /tables/quotas.json — admission reservations and violation actions.
- /ops/thresholds.json — alarms and thresholds (X).
- /ops/rollback/ — complete rollback pack (all dependent tables).
- /checks/ — offline guardrails: slot min, GCL depth, map integrity, time-lock gates.
Design rule: configuration is not a scattered set of register writes; it is a deployable asset with dependencies, validation, and rollback semantics.

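The manifest's checksum list is what makes the bundle a closed dependency set. A sketch of the offline integrity check, assuming the manifest carries a `checksums` map of path to SHA-256 hex digest (the structure is hypothetical, matching the `/manifest.json` description above):

```python
import hashlib

def validate_bundle(manifest: dict, files: dict) -> list:
    """Offline guardrail: every table listed in the manifest must be present
    and match its recorded SHA-256; files outside the manifest are flagged
    so the bundle stays a closed, auditable dependency set."""
    problems = []
    for path, expected in manifest["checksums"].items():
        data = files.get(path)
        if data is None:
            problems.append(f"missing: {path}")
        elif hashlib.sha256(data).hexdigest() != expected:
            problems.append(f"checksum mismatch: {path}")
    problems += [f"unlisted file: {p}" for p in files
                 if p not in manifest["checksums"]]
    return problems

gcl_bytes = b"cycle,slot,gate_mask\n1000000,200000,0x01\n"
manifest = {"checksums": {"tables/gcl.csv": hashlib.sha256(gcl_bytes).hexdigest()}}
assert validate_bundle(manifest, {"tables/gcl.csv": gcl_bytes}) == []
```

Rejecting the bundle here, before staging, is far cheaper than discovering a tampered or stale table after an atomic commit.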
Safe change process: avoid partial updates that “hang” the network
Recommended sequence
- Offline validate: manifest integrity + dependency checks.
- Stage: load into shadow bank (not active).
- Pre-check: time base lock/offset/drift within thresholds (X).
- Atomic commit: switch at a defined boundary (e.g., cycle edge).
- Canary rollout: a subset of nodes first; watch gate/window/queue tails.
- Rollback-ready: auto revert if key counters exceed thresholds (X).
- Audit: bind events and snapshots to config version.
Typical failure mode
A schedule update becomes unsafe when only part of the dependency set is applied (GCL updated, but queue map or shaper parameters remain old). Guardrails should enforce “all-or-nothing” activation or reject the commit.
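The "all-or-nothing" rule and the time-lock gate can be sketched as a single commit decision. The table names and return shape below are illustrative, not a device API:

```python
REQUIRED_TABLES = {"gcl", "queue_map", "shapers", "quotas", "thresholds"}

def commit_decision(staged: set, time_locked: bool, in_holdover: bool):
    """All-or-nothing activation: reject the commit unless the full
    dependency set is staged AND the time base is usable."""
    missing = REQUIRED_TABLES - staged
    if missing:
        return False, "reject: partial bundle, missing " + ", ".join(sorted(missing))
    if not time_locked or in_holdover:
        return False, "reject: time base unlocked or in holdover (time-lock gate)"
    return True, "commit at next cycle edge"

# GCL staged alone must be rejected -- this is the classic partial-update hang
ok, reason = commit_decision({"gcl"}, time_locked=True, in_holdover=False)
assert not ok
```

Note that the two rejections are independent: a complete bundle is still refused while the time base is in holdover, and a locked time base does not excuse a partial bundle.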
Guardrails: rules that prevent unsafe parameter sets
- Slot minimum: slot ≥ X (depends on max frame, link rate, and gate switch latency).
- Guard band presence: critical window transitions reserve guard band X.
- GCL depth: entry count ≤ X; reject truncated lists.
- Map integrity: TC→queue mapping has no holes; critical queues are not shared.
- Time-lock gate: no commit when time base is unlocked or in holdover.
- Rollback completeness: rollback must include all dependent tables (GCL + map + shapers + thresholds).
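The slot-minimum rule above is straightforward arithmetic: the window must fit the worst-case frame as it appears on the wire, plus the silicon's gate switch latency. A sketch (the 500 ns gate latency is an assumed example value, not a datasheet figure):

```python
def min_slot_ns(max_frame_bytes: int, link_rate_bps: int, gate_switch_ns: int) -> float:
    """Slot-minimum guardrail: a window must fit the worst-case frame on the
    wire (frame + 8 B preamble/SFD + 12 B inter-frame gap) plus the gate
    switch latency; any GCL slot shorter than this should be rejected."""
    wire_bits = (max_frame_bytes + 8 + 12) * 8
    return wire_bits * 1e9 / link_rate_bps + gate_switch_ns

# 1522 B VLAN-tagged frame at 1 Gbit/s with an assumed 500 ns gate latency
assert abs(min_slot_ns(1522, 1_000_000_000, 500) - 12_836.0) < 1e-6
```

The same serialization term sizes the guard band: a closed window must open no earlier than one worst-case serialization time after the last possible best-effort transmission start.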
When changes require re-computation (which tables must be re-validated)
| Change event | Tables impacted | Must re-check |
|---|---|---|
| Time topology change | time policy, timestamp mode, guard band margin | lock/offset/drift (X), window phase margin (X) |
| Path / hop change | streams, quotas, queue map | per-hop reservation, isolation integrity, tail percentiles (X) |
| Port rate change | GCL slot lengths, guard band, shapers | slot minimum (X), serialization budget, window boundary tests |
| Queue resource change | queue map, GCL masks, shaper policies | no holes, no critical sharing, watermark caps (X) |
Config bundle: a versioned Parameter Set staged to shadow banks and committed atomically across switches
A safe rollout requires dependency completeness (tables move together), pre-commit time-lock checks, and a rollback pack that restores all coupled parameters.
Not covered here (scope guard)
- Timing/sync algorithm details and protocol parameter internals.
- Physical layer SI/layout/magnetics and EMC design details.
Verification & Observability: Prove Bounds and Catch Field Drift
Determinism is only real when bounds are provable
Verification must demonstrate time stability, schedule correctness, and queue/shaper contract compliance under realistic worst-case load. Field observability must correlate errors with configuration versions and environmental conditions to enable fast forensics.
What must be proven (three bound categories)
- Time stability: lock/offset/drift are stable; holdover events are controlled (X).
- Schedule correctness: GCL is active; no late/early windows; gate misses near zero (X).
- Contract compliance: queue watermarks and drop reasons remain within limits; tail percentiles meet bounds (X).
Counters and events: organize by diagnostic value
Schedule & gate
- gate miss, late window, early window (threshold X)
- window transition counts and anomalies
Queue & buffer
- per-queue drops with drop reasons (overflow/policer/gate closed)
- watermark peaks and microburst indicators (X)
Timestamp & time base
- timestamp jump, lock state changes, holdover enter/exit events
- offset/drift trends (X) and stability windows
Port health (switch view)
- link flap events and transitions
- per-port error counters (concept-level)
Operational requirement: every snapshot must bind to port + queue + (optional) flow id and the config version.
Correlate failures to environment + version + events (field forensics)
- Error: gate miss spike / watermark over / tail percentile jump / drop reason change.
- Environment: temperature, voltage, power events.
- Version: config_version + change record.
- Link events: flap transitions as a frequent trigger source.
Bring-up test plan (minimum closed loop)
| Test | Method | Observables | Pass criteria |
|---|---|---|---|
| Sync lock test | stabilize and step temperature / load | lock, offset, drift, holdover events | stable within X |
| GCL boundary | probe near window edges | late/early window, gate miss | ≤ X |
| Worst-case background | stress BE + burst patterns | watermark, drops, tail percentiles | P99/P999 ≤ X |
| E2E bound proof | measure per-hop + end-to-end distribution | latency/jitter/loss distributions | upper bound ≤ X |
Field black-box schema (flight-recorder style)
- timestamp: local time + sync state
- config_version: active Parameter Set id
- event_type: holdover_enter, gate_miss_spike, watermark_over, link_flap, …
- context: port, queue, (optional) flow_id
- counters_snapshot: gate/window/queue/drop/time-base counters
- env_snapshot: temperature, voltage, power-event flags
- action_taken: rate-limit / downgrade / rollback / alert
Trigger strategy: periodic snapshots every X seconds plus event-triggered snapshots on threshold crossing, stored in a ring buffer of size X.
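The schema and ring-buffer trigger strategy above can be sketched directly; the field names follow the schema list, while the class and method names are illustrative:

```python
from collections import deque
from dataclasses import dataclass, field
import time

@dataclass
class Snapshot:
    timestamp: float
    config_version: str
    event_type: str                               # "periodic", "gate_miss_spike", ...
    context: dict = field(default_factory=dict)   # port, queue, (optional) flow_id
    counters: dict = field(default_factory=dict)  # gate/window/queue/drop/time-base
    env: dict = field(default_factory=dict)       # temperature, voltage, power flags
    action_taken: str = "none"

class BlackBox:
    """Flight-recorder ring buffer: old snapshots fall off automatically,
    so the most recent history around an event is always retained."""
    def __init__(self, depth: int):
        self.ring = deque(maxlen=depth)
    def record(self, snap: Snapshot):
        self.ring.append(snap)
    def around(self, event_type: str):
        return [s for s in self.ring if s.event_type == event_type]

bb = BlackBox(depth=4)
for i in range(6):   # 6 snapshots into a depth-4 ring: the first two drop off
    bb.record(Snapshot(time.time(), "v1.2.0", "periodic", counters={"seq": i}))
assert len(bb.ring) == 4 and bb.ring[0].counters["seq"] == 2
```

Because every snapshot carries `config_version`, a field dump can answer "what changed" without correlating against a separate change log.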
Observe & correlate: traffic → counters → event log → diagnosis (with env + version)
A useful black box ties counters and events to the active config version and environment snapshot, enabling fast “what changed” answers and safe rollback decisions.
Not covered here (scope guard)
- Industrial stack certification steps and protocol parameter deep-dives.
- Timing/sync algorithm internals and PHY SI/layout details.
Engineering Checklist: Design → Bring-up → Production Gates
Engineering gates that turn mechanisms into repeatable outcomes
Each gate enforces resource sufficiency, time stability, configuration integrity, and proof-quality evidence so determinism does not depend on manual tuning.
Resources & pipeline capacity
- Check: queue/TC count covers isolation (no critical sharing). Pass: isolation plan has no overlaps (X). Evidence: queue map + profile sheet.
- Check: per-port buffer covers worst-case microburst. Pass: watermark budget ≥ X. Evidence: buffer sizing worksheet.
- Check: GCL depth/slot count supports cycle composition. Pass: entry count ≤ X, slot min ≥ X. Evidence: GCL template + guardrail report.
- Check: shaper instances and granularity match profile. Pass: all required shapers available (X). Evidence: shaper allocation table.
Time base & commit safety
- Check: lock/offset/drift/holdover policy is defined. Pass: thresholds documented (X). Evidence: time policy card + manifest prereqs.
- Check: “time-lock gate” blocks schedule commits when unlocked. Pass: commit is rejected under unlock/holdover. Evidence: guardrail rule + test record.
Config asset & observability readiness
- Check: Parameter Set package is versioned and complete. Pass: manifest + dependent tables present. Evidence: config bundle tree + checksums.
- Check: counters/events cover schedule + queue + time base. Pass: must-have signals enabled (X). Evidence: counter map + black-box schema.
Time stability & schedule correctness
- Check: sync lock remains stable for X minutes. Pass: offset/drift within X; holdover events within X. Evidence: lock trend + event log.
- Check: GCL activation is correct. Pass: late/early window ≤ X; gate miss ≤ X. Evidence: counter snapshots tied to version.
- Check: window boundary probe is clean. Pass: no boundary violations (X). Evidence: boundary test record.
Worst-case background stress
- Check: key queues stay below watermark limits. Pass: watermark ≤ X; no overflow drops. Evidence: per-queue watermark log.
- Check: tail percentiles meet bounds. Pass: P99/P999 ≤ X. Evidence: latency distribution export.
- Check: drop reasons are explainable. Pass: expected-only drop reasons. Evidence: drop-reason breakdown.
Change safety rehearsal
- Check: shadow load + atomic commit works at cycle boundary. Pass: no partial activation. Evidence: commit audit with version ids.
- Check: rollback triggers on threshold crossing. Pass: rollback completes and restores bounds (X). Evidence: rollback event + post-rollback counters.
Config governance
- Check: every device boots with a known config_version. Pass: version is readable and logged. Evidence: boot log + inventory report.
- Check: dependency completeness is enforced. Pass: reject incomplete bundles. Evidence: guardrail rejection record.
Rollback readiness
- Check: rollback drill is performed and timed. Pass: restore within X; bounds recover. Evidence: drill report + counters before/after.
- Check: no-commit under unlock/holdover is enforced. Pass: commit blocked. Evidence: audit + event log.
Field black-box completeness
- Check: snapshot includes version + env + counters. Pass: required fields present. Evidence: schema validation report.
- Check: data capture completeness exceeds X%. Pass: completeness > X%. Evidence: periodic audit summary.
3-gate pipeline: design → bring-up → production → field forensics (pass criteria at each gate)
Each gate ties pass criteria to measurable counters, snapshots, and the active configuration version, enabling consistent outcomes across design, lab, production, and field.
Scope guard (not expanded here)
- PHY/layout/magnetics/EMC implementation details (refer to PHY co-design & protection pages).
- Timing algorithm internals and industrial stack certification procedures.
Applications: Where TSN Switch/Bridge Actually Pays Off
Use cases should map to mechanism recipes (not standard clause lists)
Each scenario is defined by workload patterns and bound targets. The mechanism recipe selects Gate/Shaper/Admission/Timestamp/Observe as a coherent set.
Use-case → mechanism recipe (concept-level)
| Use case | Bound target (X) | Gate | Shaper | Admission | Timestamp | Observe |
|---|---|---|---|---|---|---|
| Motion control | bounded latency + jitter (X) | on (windowed cyclic) | optional (BE smoothing) | required (no new flow breaks bounds) | optional (alignment proof) | gate/window counters + tail P99/P999 |
| Machine vision / imaging | burst control + trigger determinism (X) | sometimes (trigger windows) | required (microburst suppression) | recommended (capacity reservation) | recommended (event alignment) | watermark + drop reason + tail shift |
| PLC + distributed I/O | cyclic + acyclic coexistence (X) | optional (hard partitions) | recommended (class-based shaping) | recommended (resource isolation) | optional | queue isolation + BE starvation watch |
| Robot cell / multi-axis | multi-flow coordination bounds (X) | on when cyclic is strict | optional (tail control) | required (prevent “tuning lottery”) | recommended | per-flow quotas + violation events |
| Power / rail / utility | time integrity + auditability (X) | optional | recommended (traffic smoothing) | recommended | required (trusted stamps) | timestamp jump + holdover + event log |
| Edge gateway / bridging | mixed domains + safe updates (X) | optional | recommended | required (policy isolation) | recommended | versioned config + correlation black-box |
Mechanism recipes should be validated using the proof hooks defined in verification and observability, then locked into a versioned Parameter Set.
Recipe patterns (quick reference)
Cyclic hard bounds
Gate + Admission + Window counters. Shaper is secondary for smoothing non-critical traffic.
Burst-heavy data paths
Shaper + Watermark telemetry + Drop-reason breakdown. Gate is used only when trigger windows are strict.
Multi-tenant and expansion-safe
Admission quotas + strict isolation + versioned config bundles prevent new flows from breaking existing guarantees.
Recipe cards: each use case enables a different mechanism set (Gate / Shaper / Admission / Timestamp / Observe)
Recipes should be exported as versioned profiles and validated with schedule counters, watermark telemetry, and time-base stability checks.
Scope guard (not expanded here)
- Protocol stack specifics and certification checklists for PROFINET/EtherCAT/CIP.
- PHY/magnetics/EMC implementation detail and connector-level design.
IC Selection Logic (TSN Switch / Bridge)
This section avoids “product dumping” and instead defines a repeatable selection method: requirements → TSN mechanisms → resource sufficiency → verifiable observability → operable configuration governance. The goal is to filter out parts that claim TSN support but cannot prove or sustain deterministic bounds in production and field operation.
Selection funnel (5 steps that prevent “tuning lottery”)
- Hard constraints — port count / port speed / host interface / thermal & package / industrial temp grade. Output: shortlist that physically fits the design.
- Determinism profile — target bounds for latency, jitter, loss, and time error, plus traffic shape (periodic / bursty / mixed). Output: which mechanisms are mandatory.
- Mechanism coverage — gate windows, shaping, admission control, and hardware timestamps. Output: “must-have list” per use-case.
- Resource sufficiency — queue count, per-port buffer, GCL depth, shaper instances, policing granularity. Output: proof that worst-case traffic still fits.
- Operability — counters, events, mirror/trace aids, configuration bundle + rollback + safe commit. Output: ability to keep bounds stable over the lifecycle.
Reference material examples (non-exhaustive): Microchip LAN9662 / LAN9668; NXP SJA1105P/Q/R/S / SJA1110; Renesas RZ/N2L / RZ/T2M. Use these part numbers as anchors for capability checklists, not as an implied “best choice”.
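The funnel can be closed with a weighted scorecard in which "must-have" mechanisms act as hard gates rather than weighted points. A sketch, with hypothetical dimension names and weights:

```python
def score_part(weights: dict, results: dict, must_have: set):
    """Weighted scorecard with hard gates: any failed must-have dimension
    rejects the part outright; otherwise return the weighted pass ratio."""
    for dim in must_have:
        if not results.get(dim, False):
            return False, 0.0, f"rejected: must-have '{dim}' failed"
    total = sum(weights.values())
    earned = sum(w for dim, w in weights.items() if results.get(dim, False))
    return True, earned / total, "candidate"

weights = {"ports": 2, "gcl_update": 3, "timestamps": 3, "observability": 2}
results = {"ports": True, "gcl_update": True, "timestamps": True,
           "observability": False}
ok, score, verdict = score_part(weights, results,
                                must_have={"timestamps", "gcl_update"})
assert ok and abs(score - 0.8) < 1e-9
```

The hard-gate behavior is the point: a part that scores well overall but lacks verifiable timestamps never reaches the ranked list, which is exactly what "must-have rows pass with measurable evidence" means below.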
Example part buckets (how to read “fit”, not a shopping list)
- Industrial TSN switch with integrated CPU: Microchip LAN9668 (orderable examples: LAN9668-9MX, LAN9668-I/9MX). Fit: multi-port gateways / remote IO / TSN edge switches.
- Compact TSN end-point / small switch role: Microchip LAN9662 (orderable example: LAN9662/9MX). Fit: TSN-capable endpoints or small-port bridge designs.
- Automotive TSN switch family (AVB/TSN focus): NXP SJA1105P / SJA1105Q / SJA1105R / SJA1105S. Fit: deterministic multi-domain aggregation when safety/security hooks matter.
- Multi-gig safe & secure TSN Ethernet switch SoC: NXP SJA1110 family. Fit: TSN switching with security/safety features and strong ecosystem.
- MCU/MPU with integrated TSN-compliant small switch: Renesas RZ/N2L, RZ/T2M. Fit: TSN bridge / controller designs where compute + small-port TSN is sufficient.
Tip: selection should be driven by mechanism + resource + observability, not by a single “TSN supported” checkbox.
Selection scorecard (dimension → why → how to verify → pass (X) → red flags)
| Dimension | Why it matters | How to verify (engineering) | Pass criteria (X) | Red flags / reject fast |
|---|---|---|---|---|
| Ports & speeds (example materials: LAN9662, LAN9668, SJA1105P/Q/R/S, SJA1110) | Determines how many deterministic flows can be isolated without queue sharing. Port speed impacts guard band sizing and worst-case serialization time per hop. | Confirm port modes, link partners, and CPU/host port bandwidth. Validate that worst-case frame + background traffic still meets the end-to-end bound. | Ports ≥ X; required speeds supported; CPU/host port not a bottleneck (utilization < X%). | CPU/host port saturates under mirror/telemetry. “TSN supported” but only on a subset of ports or queue classes. |
| Switching mode (store-and-forward vs cut-through) | Impacts per-hop latency shape and how errors are contained. Determinism needs bounded delay, not just average. | Measure per-hop forwarding delay (ingress timestamp → egress timestamp) under load and with gate windows enabled. | Per-hop delay upper bound ≤ X; mode is stable across MTU and VLAN/priority mixes. | Cut-through path lacks deterministic gating integration (gate applies “after” unpredictable pipeline stages). |
| TSN mechanism coverage | Different traffic types require different tools: time windows for strict periodic, shaping for mixed loads, admission to keep bounds valid when new streams appear. | Map each required flow class to one of: Gate / Shaper / Policer / Admission / Timestamp. Verify all required blocks are hardware-backed and have counters. | Required blocks present; each block has measurable state & events; no “software-only” critical path. | “Supported” but missing gate-miss / timestamp-jump / policing-drop reasons. |
| Queue count, per-class isolation & mapping | If critical traffic shares queues with best-effort, tail jitter and microbursts become unbounded and hard to debug. | Confirm independent queues per traffic class, queue depth controls, and queue-level watermark counters. | Dedicated queues for critical classes; mapping has no “holes”; watermark visibility available. | Queue mapping cannot be audited; only port-level drops exist (no per-queue attribution). |
| GCL depth, time resolution & safe update | Determinism relies on the schedule being representable without compressing slots or merging classes. Updates must not “tear” live traffic. | Check: max entries, minimum slot time, and whether shadow/atomic commit exists for GCL. Validate “late/early window” counters during schedule changes. | GCL depth ≥ X; min slot ≤ X; schedule switch without packet loss beyond X ppm. | No shadow table; updates require “stop traffic” maintenance windows. |
| Guard band & preemption support (when needed) | Guard bands protect time windows from being invaded by long frames. Preemption reduces wasted guard time at higher utilization. | Validate calculated guard time vs measured gate edge behavior under maximum MTU background traffic. Confirm counters exist for “late/blocked due to guard”. | No gate-edge violations; guard time margin ≥ X; preemption behavior is deterministic when enabled. | Preemption “supported” but cannot be validated (no counters / no clear enable scope). |
| Shaping instances & stability under microbursts | Many designs do not need strict gating everywhere. Shapers must remain stable under burst, credit resets, and queue coupling. | Stress test with bursty best-effort plus periodic flows. Observe per-queue occupancy, drop reasons, and shaping state transitions. | Tail latency bound holds under worst-case background load; no unexplained credit “jumps”. | No per-queue occupancy; only aggregate port counters exist (debug becomes guesswork). |
| Admission control & resource accounting | Without admission, determinism can be invalidated the day a new stream is added. Admission is what turns “tunable” into “guaranteed”. | Confirm: per-stream descriptors, resource model (bandwidth / slot / queue / buffer), and enforcement. Validate rejection behavior and “violation events”. | Admission decision is explainable; violations are logged; rejection occurs before bounds are harmed. | “Admission” exists only as a software convention (no hardware enforcement / no event logs). |
| Hardware timestamping & tap-point clarity | Timestamp error couples into scheduling and closed-loop control. If the tap point is unclear, residence time and correction become unverifiable. | Confirm ingress/egress timestamp capture paths, correction reporting, and “timestamp jump / holdover” events. | Timestamp error budget ≤ X; tap point documented; residence time observable per hop. | Hardware timestamps exist, but no access to raw capture records or no error/step detection. |
| Counters, events, mirror/trace aids | Field failures are mostly diagnosability failures. The minimum set must isolate: queue drops, gate misses, timing anomalies, and configuration version. | Require per-port + per-queue counters, gate-miss/late/early flags, timestamp events, plus mirror support for capture. | Black-box completeness > X%; event-to-counter correlation works with config version tagging. | Only link-level counters exist; no queue-level attribution; no gate miss observability. |
| Configuration governance (bundle, shadow, rollback) | TSN failures are often configuration failures. Without safe updates, networks freeze and field drift becomes unmanageable. | Verify: config package versioning, dependency checks, atomic commit (shadow → swap), and rollback rehearsal. | Rollback success rate ≥ X%; shadow swap time ≤ X; “time base unlocked” prevents schedule activation. | Schedule/config updates require reboot or uncontrolled transient behavior; no “safe commit” path. |
Scorecard usage: assign weights according to the determinism profile, then require that every “must-have” row passes verification with measurable evidence.
Bring-up verification hooks (must exist before committing a part)
1) Traffic generation hooks
- Ability to inject periodic flows + bursty best-effort simultaneously (worst-case background load).
- Repeatable stress profiles: microburst, long-frame invasion attempts, mixed priority classes.
- Pass criteria: P99.999 latency ≤ X, jitter ≤ X, and zero unexplained drops in the critical class.
2) Data-plane self-test hooks
- Port/pipeline loopback modes (to separate “configuration vs environment” quickly).
- PRBS / built-in traffic test (if available) to validate datapath stability without external complexity.
- Pass criteria: self-test completes with error counters stable within X / hour.
3) Time & timestamp hooks
- Ingress/egress timestamp capture with clear tap-point definition.
- Residence time / correction evidence path (concept-level requirement).
- Events: timestamp jump, timebase unlock, holdover enter/exit.
- Pass criteria: time error ≤ X; no timestamp step under stress; holdover behavior matches guardrails.
4) Gate/shaper/queue enforcement hooks
- Counters: per-queue drop, per-queue watermark, gate miss, late/early window, policing drops (with reasons).
- Schedule update safety: shadow table + atomic swap, plus “deny activation if timebase not locked”.
- Pass criteria: zero gate-edge violations; all drops are attributable and bounded within X.
Reject-fast red flags (high probability of “un-debuggable determinism”)
- No gate miss / late/early window counters → schedule failures cannot be proven or localized.
- Hardware timestamps exist but tap-point is unclear or inaccessible → time error budget cannot be validated.
- Only port-level drops, no per-queue attribution → microburst and starvation become guesswork.
- No shadow/atomic commit for GCL and mapping tables → updates introduce uncontrolled transient behavior.
- CPU/host port becomes the bottleneck under observability → deterministic network collapses when debugging is needed most.
Decision output template (copy/paste into a design review)
Diagram — Selection funnel + scorecard + bring-up hooks (concept map)
The diagram is intentionally mechanism- and evidence-oriented: the part number is only useful if it passes the scorecard with measurable proofs.
FAQs (TSN Switch / Bridge)
Scope: long-tail troubleshooting only. Each answer follows a fixed, measurable format — Likely cause / Quick check / Fix / Pass criteria — with threshold placeholders X/Y.
Q1. Enabling Qbv causes sporadic packet loss — guard band underestimated or gate switch overhead ignored?
- Likely cause: T_guard smaller than worst-case frame serialization + internal gate edge latency; the gate schedule allows a long BE frame to overlap a critical window.
- Quick check: watch gate_miss_cnt, late_window_cnt, per_queue_drop_cnt during max-MTU background traffic; correlate drops with gate edge timestamps.
- Fix: recompute T_guard using worst-case MTU and link rate; move the BE queue to a fully closed window around critical slots; enable/validate preemption only if the utilization loss is unacceptable.
- Pass criteria: gate_miss_cnt=0 and late_window_cnt=0 over Y cycles; critical-class drops ≤ X ppm under worst-case background load.
Q2. End-to-end latency percentiles look great, but rare spikes violate the hard bound — microburst or admission leak?
- Quick check: compare p99_999_e2e_us vs max_e2e_us; inspect per_queue_watermark near spikes; check admission_violation_evt and policer_drop_reason.
- Pass criteria: max_e2e_us ≤ X over Y minutes of worst-case stress; per_queue_watermark never exceeds X% of capacity; admission_violation_evt=0.
Q3. Time sync shows “locked”, yet windows are shifted — timestamp tap point mismatch or time-base step?
- Quick check: inspect timebase_step_evt, holdover_evt, ts_jump_cnt; measure window edge error win_edge_err_ns at ingress/egress stamps per hop.
- Pass criteria: timebase_step_evt=0 and ts_jump_cnt=0 over Y minutes; absolute win_edge_err_ns ≤ X on every hop.
Q4. Preemption enabled, but throughput drops or retransmissions spike — fragment handling or queue mapping issue?
- Quick check: inspect preempt_frag_cnt, preempt_abort_cnt, reassembly_err_cnt, and per-queue watermark; confirm mapping table prio_to_queue_map matches design intent.
- Pass criteria: reassembly_err_cnt=0; retransmission indicators (CRC/retry counters) remain within X ppm.
Q5. One traffic class is always starving — CBS/ATS parameters wrong or gate schedule conflict?
- Quick check: inspect per_queue_tx_bytes (starved queue flat), cbs_credit_min/cbs_credit_reset_cnt, and gate open ratio gate_open_time_us per cycle.
Q6. Determinism breaks after topology change — multi-hop schedule alignment or residence-time update missing?
- Quick check: compare hop_count and per-hop res_time_ns before/after; check win_edge_err_ns growth across hops; verify config bundle version consistency across switches.
- Pass criteria: win_edge_err_ns ≤ X; end-to-end max_e2e_us ≤ X; no schedule/bundle mismatch events over Y minutes.
Q7. Temperature change triggers window miss — oscillator drift or aggressive holdover strategy?
- Likely cause: drift accumulates offset_ppb until schedule edges misalign; holdover enter/exit introduces a step or higher wander.
- Quick check: correlate temp_c vs offset_ns/drift_ppb; inspect holdover_evt and timebase_step_evt; count late_window_cnt.
- Pass criteria: drift_ppb ≤ X across operating temperature; timebase_step_evt=0; late_window_cnt=0 over Y thermal cycles.
Q8. Measured latency is significantly higher than theory — store-and-forward path or buffer watermark throttling?
- Quick check: measure per-hop delay ingress_ts→egress_ts; inspect store_fwd_active state; check per_queue_watermark and throttle_evt.
- Pass criteria: throttle_evt=0 during bounded-load tests; max_e2e_us ≤ X.
Q9. Background traffic increases critical jitter — queue isolation missing or ingress policing absent?
- Quick check: audit prio_to_queue_map, per_queue_watermark, policer_drop_cnt; validate that the critical queue has a dedicated gate/shaper.
- Pass criteria: jitter_us ≤ X under worst-case BE load; critical queue watermark stays below X%; critical drops ≤ X ppm.
Q10. After configuration update, the network sporadically flaps — bundle inconsistency/rollback gap or time-base relock?
- Quick check: compare config_bundle_ver across nodes; inspect atomic_swap_evt, rollback_evt, timebase_unlock_evt; correlate with link_flap_evt.
- Pass criteria: config_bundle_ver identical on all nodes; link_flap_evt ≤ X per Y hours; timebase_unlock_evt=0 during the swap window.
Q11. Mirroring/capture shows frame order “scrambled” — true multi-queue reordering or preemption visualization artifact?
- Quick check: track seq_id; inspect mirror_port_util, preempt_frag_cnt, and per-queue drain counters.
- Pass criteria: seq_id monotonic at egress; mirror overhead mirror_port_util ≤ X%; no unexplained reorder events over Y minutes.
Q12. Admission passes, yet congestion still happens — which resource dimension is missing (slot/queue/buffer)?
- Quick check: recompute slot_us, frames_per_cycle, burst_bytes, buffer_bytes, hop_res_time_ns; compare predicted vs observed watermark and gate misses.
- Pass criteria: gate_miss_cnt=0; congestion indicators (drops/watermarks) remain within X ppm under validation load.
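The Q12 resource comparison is worth making concrete: bandwidth admission alone does not prove a queue fits, because the buffer must absorb whatever part of a burst the open gate cannot drain within one slot. A sketch of that prediction (the function name and example figures are illustrative):

```python
def queue_fits(burst_bytes: float, slot_us: float, link_rate_bps: float,
               buffer_bytes: float):
    """Q12-style resource check: the buffer must absorb the part of the
    burst the open gate cannot drain within one slot; the leftover is the
    predicted peak watermark to compare against the observed one."""
    drain_bytes = slot_us * 1e-6 * link_rate_bps / 8
    peak = max(0.0, burst_bytes - drain_bytes)
    return peak <= buffer_bytes, peak

# 20 kB burst, 100 us slot at 1 Gbit/s drains 12.5 kB -> 7.5 kB peak occupancy
fits, peak = queue_fits(20_000, 100, 1_000_000_000, buffer_bytes=16_384)
assert fits and abs(peak - 7_500.0) < 1e-6
```

If the observed per_queue_watermark exceeds this predicted peak, the admission model is missing a resource dimension (slot length, queue depth, or buffer), which is exactly the failure mode Q12 describes.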