
Managed / Enterprise Switch IC: L2/L3, ACL, QoS & Telemetry


Core idea
A managed/enterprise switch IC is a deterministic packet-processing pipeline with finite tables, queues, and counters. The winning design turns features into measurable gates—resource budgets, observability, and acceptance targets—so field issues can be diagnosed and fixed fast.

H2-1 · What is a Managed / Enterprise Switch IC

Section intent
  • Define the enterprise-class switch IC boundary versus unmanaged/smart/TSN classes.
  • Turn “feature lists” into selection triggers and engineering expectations.
  • Keep TSN/PTP/PoE/stack mechanisms out of scope (only capability-level pointers).
Definition (one sentence)

A managed / enterprise switch IC is a multi-port switching silicon platform that delivers hardware forwarding plus scalable policy, traffic management, and observability hooks (L2/L3, ACL/TCAM, multicast control, QoS shaping, and line-rate telemetry) for networks that must be diagnosable and maintainable at scale.

Stop line: TSN time-window scheduling, PTP servo algorithms, and PoE power-path design are not expanded here.

The boundary is not “one feature” — it is a system capability level
  • Policy at scale: VLAN/LAG/STP + QoS + ACL + multicast control form a coherent pipeline, not isolated toggles.
  • Measurable hardware resources: TCAM entries, MAC/route/neighbor tables, queues, and buffers have visible limits and counters.
  • Operations & diagnostics: line-rate counters, mirroring, event hooks, and field logs enable fast root-cause analysis in real deployments.
Decision triggers (choose enterprise-class when…)
  • Multiple VLANs and segmentation policies must remain predictable after growth and maintenance cycles.
  • Traffic classes require explicit queue mapping, shaping, and congestion behavior (not “best effort”).
  • ACL rules must be enforced in hardware (TCAM) with counters and deterministic priority order.
  • Multicast must be controlled (IGMP/MLD snooping) to avoid flooding, jitter spikes, and CPU punt storms.
  • Field diagnosis must be fast: telemetry counters, mirroring, and “black-box” event trails are required.
  • Microbursts/oversubscription are real: buffer visibility and queue-depth instrumentation matter.
What this page will deliver
  • Switch-class feature matrix (unmanaged vs smart vs enterprise vs TSN).
  • Enterprise silicon block map: data-plane pipeline, TCAM/ACL, queues/shapers, multicast control, telemetry.
  • Engineering hooks: bring-up checks, counter baselines, and field-debug instrumentation.
  • Selection scorecard: resources (TCAM/buffers/queues/tables) + performance + manageability.
Switch Class Feature Matrix (unmanaged vs smart vs enterprise vs TSN)
  • Unmanaged: basic L2 forwarding, fixed configuration straps, minimal QoS, limited counters, no TCAM/ACL.
  • Smart: VLAN/trunk, limited QoS, IGMP snooping, basic SPAN, small tables and light management.
  • Enterprise (this page): L2 at scale, L3 offload, ACL via TCAM, QoS shaping, multicast control, line-rate telemetry.
  • TSN: time-aware shaping (GCL), deterministic latency, admission control, timing coupling (capability-level).
Diagram: A capability-level map. The “Enterprise” column is the scope of this page; TSN timing windows and sync algorithms are routed to dedicated pages.

H2-2 · Scope Guard & Page Routing

Section intent
  • Lock the page boundary (what is covered vs not covered) to prevent topic overlap.
  • Provide keyword-based routing rules so readers reach the correct deep-dive page fast.
  • Use “stop-line examples” to show how out-of-scope questions are handled without expanding this page.
In-scope (covered on this page)
  • L2 forwarding at scale: VLAN/trunk, LAG, MAC learning/aging/move behavior.
  • L3 offload at capability level: routing tables, neighbor tables (ARP/ND), ECMP behaviors and limits.
  • ACL policy in hardware: TCAM matching, rule priority, policers, per-rule counters.
  • Multicast control: IGMP/MLD snooping, querier presence, flooding guards.
  • QoS without TSN windows: class mapping, queues, schedulers, shapers, WRED/ECN hooks.
  • Line-rate telemetry: counters, mirroring, events, field logs, and “black-box” diagnostics hooks.
  • Engineering hooks: bring-up loopback/PRBS usage and counter baselining.
Out-of-scope (routed to dedicated pages)
Keyword routing rules (fast navigation)
If the question contains TSN window terms…
Keywords: Qbv, Qci, Qav, GCL, gate-control list, time-slot, admission control → TSN Switch / Bridge
If the question contains time-sync terms…
Keywords: PTP one-step/two-step, E2E/P2P, SyncE, holdover, WR → Timing & Sync
If the question contains power-over-cable terms…
Keywords: 802.3af/at/bt, PSE, PD, PoDL class, center-tap, power allocation → PoE / PoDL
If the question contains stack/certification terms…
Keywords: PROFINET IRT, EtherCAT DC, CIP Sync, certification, conformance → Industrial Ethernet Stacks
If the question contains EMC/protection terms…
Keywords: TVS, CMC, ESD, surge, shield bond, return path, creepage, intrinsic safety → PHY Co-Design & Protection
Stop-line examples (how out-of-scope questions are handled)
Question: “How to build the TSN gate-control list (Qbv)?”
Handling: This page only checks whether the switch silicon exposes TSN window capabilities; mechanism and parameter tables are routed to TSN Switch / Bridge.
Question: “Why does PTP offset drift over temperature?”
Handling: This page only covers timestamp exposure and switch-side observability counters; drift analysis is routed to Timing & Sync.
Question: “How to size PoE surge protection on the center-tap?”
Handling: Power-path and protection sizing are routed to PoE / PoDL and PHY Co-Design & Protection.
Scope Guard Routing Map: a central node for this page (Managed / Enterprise Switch IC: L2/L3 · ACL/TCAM · IGMP · QoS · Telemetry) with keyword-labeled arrows to TSN Switch / Bridge (Qbv/Qci/Qav, GCL), Timing & Sync (PTP, SyncE, WR), PoE / PoDL (PSE/PD, classes), Industrial Stacks (PROFINET, EtherCAT, CIP), PHY Protection (TVS, CMC, shield/return), and Security (MACsec, DTLS/TLS).
Diagram: A routing map. When keywords match TSN timing windows, sync, PoE, stacks, protection, or security, the deep-dive belongs to the linked page.

H2-3 · Silicon Block Diagram: Data Plane vs Control Plane vs Management

Section intent
  • Split switch silicon into data plane, control plane, and management plane to form a stable “map” for all later chapters.
  • Bind each block to an observability bucket (counters / status / table resources) so failures can be localized quickly.
  • Keep TSN/PTP/PoE mechanisms out of scope; only capability-level routing is allowed.
Three-plane engineering definition
Data plane (fast path)
Line-rate parsing, lookup, policy, buffering, scheduling, and egress. Primary risks are drop/latency behaviors driven by table misses, ACL actions, queue buildup, and buffer limits.
Control plane (slow path)
CPU/assist logic that maintains tables and handles exceptions (learn/age, route/neighbor maintenance, punts). Primary risk is CPU punt storms that turn line-rate traffic into a bottleneck.
Management plane (operations & telemetry)
Configuration, telemetry export, event logs, and remote access paths (PCIe/MDIO/I²C/EEPROM). Primary risk is incomplete visibility that prevents root-cause analysis and repeatable recovery.
Module inventory (what exists, where it belongs, what to observe)
Port MAC / PCS / SerDes
Data
Role: capture frames at line rate and surface link-quality signals. Observability: link flap, CRC/FCS, symbol/code errors, pause/PFC, EEE entry/exit.
Parser / Classifier
Data
Role: extract L2/L3/L4 fields and build metadata (VLAN/DSCP/flow key). Observability: parse errors, unknown types, punt reasons.
Lookup: FDB / LPM / Neighbor
Data
Role: decide forwarding domain and next hop using tables. Observability: table utilization, hit/miss, collision/rehash, route/ARP/ND pressure.
ACL / Policy (TCAM + actions)
Data
Role: enforce allow/deny/remark/police/mirror decisions. Observability: per-rule hits, policer drops, remark counts, mirror triggers.
Queue / Buffer Manager
Data
Role: enqueue traffic classes, manage shared buffers, mark or drop under congestion. Observability: queue depth peak, WRED/tail drops, pause/PFC events.
Scheduler / Shaper
Data
Role: determine service order and rate limits (SP/WRR/WFQ, token bucket). Observability: serviced bytes, shaper throttles, queue starvation indicators.
CPU / Exception Engine
Control
Role: maintain tables and handle punt traffic. Observability: punt rate by reason, CPU load, table update latency, event timestamps.
Config / Telemetry / Logs (PCIe/MDIO/I²C/EEPROM)
Mgmt
Role: configure, export counters, and preserve field evidence. Observability: config change log, export health, black-box ring buffer fields.
Observability taxonomy (counter/register buckets)
  • Port quality: link flap, CRC/FCS, symbol/code errors, pause/PFC, EEE.
  • Parse/classify: parse errors, unknown types, punt-to-CPU reasons.
  • Table resources: FDB/LPM/neighbor/TCAM utilization, hit/miss, collision/rehash.
  • Policy actions: ACL hit per rule, policer drops, remark/mirror counters.
  • Queues/buffers: depth peaks, WRED/tail drops, congestion marks, pause events.
  • Events/logs: config changes, thermal/power, port flap timeline, black-box fields.
Switch Silicon Blocks (Data vs Control vs Management): the data-plane fast path (SerDes/port MAC → parser building VLAN/DSCP/flow-key metadata → lookup: FDB (L2), LPM, ND/ARP → ACL/policy with TCAM and policer → queue/buffer manager with shared buffer → scheduler/shaper with SP/WRR, token bucket, and WRED/ECN hooks → egress rewrite (L2/L3), mirror/SPAN, transmit), with side blocks for the control plane (CPU, table maintenance, punt handling), the management plane (PCIe, MDIO, I²C, EEPROM, telemetry export, events/logs), and TSN shown as optional, capability-level only.
Diagram: The map used by later chapters. Each block binds to a counter/resource bucket so failures can be localized without mixing TSN/PTP/PoE mechanisms.

H2-4 · Switching Pipeline Deep Dive (Ingress→Decision→Egress)

Section intent
  • Explain the end-to-end packet journey using a fixed 8-stage pipeline.
  • Highlight the decision points (lookup, ACL, classify, queue/buffer) that dominate loss and latency.
  • Attach typical counters, failure modes, and pass-criteria placeholders (X) per stage.
The fixed 8-stage pipeline
  1. Ingress receive: Port MAC/PCS/SerDes accepts a frame and records link-layer health.
  2. Parse & metadata: Header fields become metadata (VLAN/DSCP/flow key).
  3. Lookup (L2/L3): FDB/LPM/neighbor tables decide destination and next hop. Decision
  4. ACL/Policy: TCAM match triggers allow/deny/remark/police/mirror. Decision
  5. Classify to QoS: Map to traffic class/queue/color. Decision
  6. Enqueue & buffer: Shared buffer/queue behavior determines drops/marks. Decision
  7. Schedule & shape: Service order and rate limiting decide timing.
  8. Egress transmit: Rewrite, mirror, and transmit on the destination port.
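The eight stages can be sketched as a chain of checks with per-stage counters. Everything below is illustrative, not a real switch SDK: the counter names, the FDB/ACL table shapes, and the frame fields are all assumptions made for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Counters:
    crc_errors: int = 0
    parse_errors: int = 0
    fdb_miss: int = 0
    acl_deny: int = 0
    transmitted: int = 0

FDB = {("aa:bb", 10): 2}    # (dst MAC, VLAN) -> egress port (hypothetical)
ACL = [("deny", "ee:ff")]   # ordered first-hit rules keyed on src MAC

def forward(frame, ctr):
    # 1 · Ingress receive: bad FCS is counted and dropped at the port
    if frame.get("bad_fcs"):
        ctr.crc_errors += 1
        return None
    # 2 · Parse & metadata: frames that cannot be classified are counted
    if "vlan" not in frame:
        ctr.parse_errors += 1
        return None
    # 3 · Lookup (decision): FDB hit picks a port, miss floods the VLAN
    port = FDB.get((frame["dst"], frame["vlan"]))
    if port is None:
        ctr.fdb_miss += 1
        port = "flood"
    # 4 · ACL/policy (decision): first-hit deny drops the frame
    for action, src in ACL:
        if src == frame["src"] and action == "deny":
            ctr.acl_deny += 1
            return None
    # 5-6 · Classify + enqueue (decision): PCP selects the queue
    frame["queue"] = frame.get("pcp", 0)
    # 7-8 · Schedule + egress transmit
    ctr.transmitted += 1
    return port
```

The point of the sketch is the counter placement: each stage that can lose a frame owns a counter, which is what makes the later stage table verifiable.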
Pipeline stage table (function → counters → failure → pass criteria)
Stage 1 · Ingress receive
Typical counters: link flap, CRC/FCS, code errors, pause/PFC events.
Common failure: physical errors masquerading as higher-layer drops.
Pass criteria: CRC/FCS error rate < X over Y minutes; link up stability > X hours.
Stage 2 · Parse & metadata
Typical counters: parse errors, unknown types, punt reasons.
Common failure: wrong VLAN/DSCP interpretation causes incorrect class or policy path.
Pass criteria: parse error count < X; metadata classification matches test vectors.
Stage 3 · Lookup (L2/L3) — Decision point
Typical counters: FDB/LPM/neighbor hit/miss, table utilization, collision/rehash.
Common failure: table miss or resource exhaustion leads to flooding or CPU punt.
Pass criteria: utilization < X%; miss rate < X per 1k frames during steady state.
Stage 4 · ACL/Policy — Decision point
Typical counters: per-rule hit, deny drops, policer drops, remark/mirror counts.
Common failure: rule priority/order mismatch causes unintended drops or remarking.
Pass criteria: expected rule-hit distribution; deny drops < X for allowed flows.
Stage 5 · Classify to QoS — Decision point
Typical counters: class-to-queue mapping stats, remark counters, per-queue byte counters.
Common failure: wrong mapping sends critical traffic into a congested queue.
Pass criteria: critical class always maps to target queue; mismatch rate < X.
Stage 6 · Enqueue & buffer — Decision point
Typical counters: queue depth peak, tail drops, WRED drops, pause/PFC events.
Common failure: microbursts build queues faster than drain rate; drops appear despite low average utilization.
Pass criteria: peak depth < X; drop rate < X%; recovery time < X ms.
Stage 7 · Schedule & shape
Typical counters: served bytes, starvation indicators, shaper throttles.
Common failure: strict priority starvation or shaper parameters too aggressive.
Pass criteria: starvation events < X; shaped rate within ±X% of target.
Stage 8 · Egress transmit
Typical counters: egress drops, mirror oversubscription, rewrite errors (if exposed).
Common failure: egress congestion or mirror path drops hide the root signal.
Pass criteria: egress drop rate < X; mirror capture completeness > X%.
Switching Pipeline (Ingress to Egress): a numbered 1-to-8 path with decision points highlighted: 1 ingress MAC/SerDes; 2 parse and metadata build; 3 lookup (FDB, LPM, ND); 4 ACL/policy (TCAM, police); 5 classify (DSCP/PCP to queue); 6 enqueue (queues, buffer); 7 schedule (SP/WRR, shaper); 8 egress rewrite and TX. Localization rule: observe at the decision points (lookup miss/resources, ACL order/action, QoS mapping, queue/buffer peaks) before tuning rate limits. TSN windows and PTP algorithms are routed to dedicated pages; this pipeline stays capability-level.
Diagram: Stages 3–6 dominate most “mystery drops” and latency spikes. Use counters at those decision points before changing shaping parameters.

H2-5 · L2 Feature Set: VLAN, MAC Learning, STP, LAG

Section intent & scope
  • Turn L2 terms into repeatable engineering checks: configuration entry points, observability, and failure patterns.
  • Cover capability-level interfaces and pitfalls on enterprise switches.
  • Not covered: ring redundancy protocol implementation details (MRP/HSR/PRP) and deep protocol internals.
VLAN & trunking (define the forwarding domain)
VLAN behavior is an L2 boundary contract. Most “mystery cross-talk” comes from boundary collapse: a wrong PVID/native VLAN pairing, an overly wide allowed list, or inconsistent tag/untag behavior across ports.
  • Ingress decision: untagged frames map into a VLAN (PVID / port VLAN).
  • Egress contract: access ports untag; trunk ports tag (plus native VLAN exception).
  • Allowed list: the trunk must explicitly permit the VLANs that should cross.
  • Common pitfall: VLAN leakage when native VLAN and PVID assumptions differ across links.
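The boundary contract above can be sketched as two small decision functions: ingress VLAN assignment (PVID/native) and egress tag/untag against the allowed list. Port names and the configuration shape are hypothetical, not from any real device.

```python
# Hypothetical port configuration for the sketch
PORTS = {
    "access1": {"mode": "access", "pvid": 10},
    "trunk1":  {"mode": "trunk", "native": 1, "allowed": {1, 10, 20}},
}

def ingress_vlan(port, tag):
    """Map an arriving frame into a VLAN; None means dropped at the boundary."""
    cfg = PORTS[port]
    if tag is None:                               # untagged frame
        return cfg["pvid"] if cfg["mode"] == "access" else cfg["native"]
    if cfg["mode"] == "trunk" and tag in cfg["allowed"]:
        return tag
    return None                                   # tag not permitted here

def egress_tag(port, vlan):
    """Return the tag to emit, None for untagged, or 'drop'."""
    cfg = PORTS[port]
    if cfg["mode"] == "access":
        return None if vlan == cfg["pvid"] else "drop"
    if vlan not in cfg["allowed"]:
        return "drop"
    return None if vlan == cfg["native"] else vlan  # native leaves untagged
```

The native-VLAN branch is exactly where the "VLAN leakage" pitfall lives: if two ends of a link disagree on native/PVID, untagged frames silently cross domains.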
MAC learning / FDB (learn → age → move)
Forwarding behavior depends on FDB health. When learning is polluted or aging is too aggressive, the switch falls back to flooding or exception handling, which can look like random loss or bursts of latency.
  • Learn: source MAC + VLAN binds to an ingress port in the FDB.
  • Age: inactive entries expire; traffic to unknown destinations floods.
  • Move: the same MAC appears on a different port; repeated moves become MAC flapping.
  • Common pitfall: flapping indicates loops, wiring errors, dual-homing mistakes, or mis-scoped VLANs.
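The learn/age/move cycle can be sketched as a small table with a flap counter; the aging timer and table shape are illustrative, and real silicon differs in how it counts moves on aged entries.

```python
class Fdb:
    def __init__(self, aging=300):
        self.table = {}           # (mac, vlan) -> {"port": p, "seen": t}
        self.aging = aging        # seconds before an entry expires
        self.moves = 0            # MAC-move (flap) evidence counter

    def learn(self, mac, vlan, port, now):
        key = (mac, vlan)
        entry = self.table.get(key)
        if entry and entry["port"] != port:
            self.moves += 1       # same MAC seen on a new port
        self.table[key] = {"port": port, "seen": now}

    def lookup(self, mac, vlan, now):
        entry = self.table.get((mac, vlan))
        if entry is None or now - entry["seen"] > self.aging:
            return "flood"        # miss or aged out: flood inside the VLAN
        return entry["port"]
```

A rising `moves` rate is the cheapest first signal for loops, wiring errors, or dual-homing mistakes, before any protocol-level capture is needed.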
STP (interface-level behavior, not protocol internals)
STP changes forwarding eligibility on ports. During state transitions, MAC movement and transient flooding can increase. The most actionable checks are port state and state-transition timing, not protocol packet details.
  • State visibility: blocking / learning / forwarding per port.
  • Failure symptom: “only one uplink works” when a port is blocked unexpectedly.
  • Interaction hotspot: STP + trunk + LAG can disguise the true failing link.
LAG (aggregation) — hash, symmetry, and imbalance evidence
LAG scales bandwidth by distributing flows across member links. The distribution is governed by a hash key (L2/L3/L4 fields). Imbalance is often traffic-shape driven (few large flows), not a silicon defect.
  • Hash key: choose fields that reflect flow diversity (e.g., include L3/L4 when possible).
  • Symmetry: ensure bidirectional flows keep a stable member mapping if required by the system.
  • Imbalance proof: compare per-member byte/packet counters and peak queue depth per egress.
  • Common pitfall: changing hash keys without verifying upstream NAT/encapsulation effects.
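The hash-distribution point can be demonstrated with a minimal sketch: a stable hash of the flow key picks a member, and per-member byte counters make imbalance visible. The member names and the CRC32-based hash are assumptions for illustration; real silicon uses proprietary hash functions.

```python
import zlib

MEMBERS = ["m1", "m2", "m3"]

def member_for(flow_key):
    # Stable hash of (src, dst, proto, sport, dport) -> one member link
    digest = zlib.crc32(repr(flow_key).encode())
    return MEMBERS[digest % len(MEMBERS)]

def distribute(flows):
    """flows: list of (flow_key, byte_count); returns per-member byte counters."""
    counters = {m: 0 for m in MEMBERS}
    for key, size in flows:
        counters[member_for(key)] += size
    return counters
```

A single elephant flow always lands on one member no matter how many links exist, which is why per-member counters (not average link utilization) are the right imbalance evidence.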
L2 checklist (configuration → verify → failure → pass criteria)
VLAN boundary
What to set: port mode (access/trunk), PVID/native VLAN, allowed VLAN list.
Quick verify: per-port VLAN membership, tagged/untagged egress behavior.
Failure symptom: VLAN leakage or silent blackholes on specific VLANs.
Pass criteria: only intended VLANs traverse trunks; leakage events = 0 (X).
FDB stability
What to set: aging time, learning rules (per VLAN if supported), storm guards.
Quick verify: learn/age/move counters, FDB utilization, MAC move rate.
Failure symptom: MAC flapping, flooding bursts, CPU punt increases.
Pass criteria: move rate < X/min; utilization < X%; flooding stable.
STP operational sanity
What to set: STP mode enablement, port roles, edge/portfast policy (if applicable).
Quick verify: per-port STP state, state-transition timing, topology-change indicators.
Failure symptom: unexpected blocking, intermittent reachability during convergence.
Pass criteria: expected ports in forwarding; convergence < X seconds.
LAG distribution
What to set: member links, LACP mode, hash key selection.
Quick verify: per-member counters, hash distribution, per-queue depth on egress.
Failure symptom: one member saturates while others idle; microburst drops on one path.
Pass criteria: member utilization within X%; drops < X under expected load.
L2 Engineering Flow (VLAN + FDB + LAG): lane A shows VLAN tag/untag decisions (ingress untagged frames take the PVID into the VLAN domain; egress access ports untag, trunk ports tag against the allowed VLAN list); lane B shows FDB learning (source MAC + VLAN), timer-based aging to expire-and-flood, and MAC moves feeding a flap counter with CPU punt risk; lane C shows LAG hashing of the flow key (L2/L3/L4) across members M1/M2/M3 with per-member counters as imbalance evidence.
Diagram: VLAN boundary decisions, FDB learning/aging/move, and LAG hashing are the three L2 pillars that most often explain “unexpected” reachability and drops.

H2-6 · L3 Feature Set: Routing, ARP/ND, ECMP, NAT (Capability-level)

Section intent & scope
  • Explain what L3 hardware forwarding actually does in silicon: lookup → next-hop → rewrite → egress.
  • Focus on resource bottlenecks that cause punts and performance cliffs (route, neighbor, ECMP, ACL).
  • Not covered: dynamic routing protocol courses (OSPF/BGP) and deep control-plane protocol design.
L3 hardware fast path (what happens on a hit)
On a route hit, silicon performs deterministic actions: longest-prefix match (or host route), next-hop selection, header rewrite, and handoff into the egress pipeline. When lookups miss or neighbors are unresolved, traffic is punted or flooded, producing sudden latency and throughput collapse.
  • Lookup: LPM and host routes decide the forwarding action.
  • Next-hop: bind to a resolved neighbor (ARP/ND) and output port.
  • Rewrite: MAC DA/SA, TTL decrement, IP checksum update.
  • Egress: enqueue/schedule applies the configured QoS policy (no algorithm deep dive here).
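The hit path can be sketched end to end: longest-prefix match, neighbor resolution, and header rewrite, with misses punting to the slow path. The route and neighbor tables are made up for the sketch, and the checksum update is omitted.

```python
import ipaddress

ROUTES = {                       # prefix -> next-hop IP (hypothetical)
    "10.0.0.0/8": "10.0.0.1",
    "10.1.0.0/16": "10.1.0.1",
}
NEIGHBORS = {"10.1.0.1": ("02:00:00:00:00:01", "eth1")}  # resolved ARP/ND

def lpm(dst_ip):
    """Longest-prefix match over ROUTES; returns the next-hop IP or None."""
    best = None
    for prefix, nh in ROUTES.items():
        net = ipaddress.ip_network(prefix)
        if ipaddress.ip_address(dst_ip) in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, nh)
    return best[1] if best else None

def route(pkt):
    nh = lpm(pkt["dst"])
    if nh is None:
        return "punt:no-route"                 # lookup miss -> slow path
    resolved = NEIGHBORS.get(nh)
    if resolved is None:
        return "punt:unresolved-neighbor"      # triggers ARP/ND resolution
    mac, port = resolved
    pkt.update(ttl=pkt["ttl"] - 1, dmac=mac)   # rewrite (checksum omitted)
    return port
```

Note that 10.1.2.3 matches both prefixes and the /16 wins, which is the whole point of LPM; the two punt strings mark exactly where the "cliff" behaviors in this section begin.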
ARP/ND (neighbor table) — the frequent hidden bottleneck
Many real systems fail due to neighbor pressure, not route count. Neighbor misses trigger resolution traffic and punts. Aging and churn can create intermittent “it works, then stalls” patterns.
  • Resource: ARP/ND entries and unresolved (incomplete) entries.
  • Symptoms: punt spikes, burst latency, periodic stalls as neighbors expire.
  • Observability: neighbor utilization, incomplete count, ARP/ND request rate (X).
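Neighbor pressure can be sketched with a tiny table that distinguishes resolved entries from incomplete ones and counts punts; capacity, method names, and the resolution model are all illustrative.

```python
class NeighborTable:
    def __init__(self, capacity=4):
        self.entries = {}         # ip -> mac, or None while incomplete
        self.capacity = capacity
        self.punts = 0            # punt counter: the key observability signal

    def resolve(self, ip, mac):
        if ip in self.entries:
            self.entries[ip] = mac       # ARP/ND reply fills the entry

    def next_hop(self, ip):
        mac = self.entries.get(ip)
        if mac:
            return mac                   # fast path: resolved neighbor
        self.punts += 1                  # miss or incomplete -> punt + request
        if ip not in self.entries and len(self.entries) < self.capacity:
            self.entries[ip] = None      # incomplete entry awaiting reply
        return None

    def incomplete(self):
        return sum(1 for m in self.entries.values() if m is None)
```

The `incomplete()` count and the punt rate together reproduce the "works, then stalls" pattern: every expiry reopens a window of misses until resolution completes.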
ECMP (multi-path) — throughput vs. explainability
ECMP spreads flows across multiple next-hops using a hash key. It improves aggregate throughput but introduces a distribution model that must be observable: per-path counters and stable hashing assumptions matter.
  • Hash key: field selection determines balance and flow stability.
  • Resource: ECMP groups and paths; exhaustion forces fallback behavior.
  • Evidence: per-path utilization and queue depth peaks, not average link usage.
NAT (capability-level only)
If NAT exists in an enterprise switch IC, the practical constraints are session-table capacity and timeout behavior. When session resources saturate, new-flow setup fails or punts to slow paths. This section records capability and resource impact only.
  • Resource: session entries, port allocation, per-session timers (X).
  • Symptoms: new connections fail, intermittent stalls, CPU punt storms.
Resource budget placeholders (capacity planning)
Route entries
X
Consumed by prefixes/VRFs. Exhaustion causes lookup miss and punt behavior. Pass criteria: utilization < X%.
ARP/ND entries
X
Consumed by active neighbors. Exhaustion increases unresolved entries and punts. Pass criteria: incomplete < X; utilization < X%.
ECMP groups / paths
X
Consumed by multi-path routes. Exhaustion forces reduced path diversity or software fallback. Pass criteria: utilization < X%.
ACL entries (TCAM)
X
Consumed by security and policy filters. Exhaustion leads to rule compromises or unexpected default behavior. Pass criteria: headroom > X%.
L3 Hardware Forwarding Path (capability-level): parser builds the IPv4/IPv6 key → lookup (LPM, host routes) → next-hop resolution (ARP/ND, output port) → rewrite (MAC DA/SA, TTL decrement, IP checksum update) → egress queue and transmit, with per-path counters. Misses punt to the control plane, which resolves ARP/ND and updates tables. Cliff behavior: route miss, neighbor miss, or ECMP exhaustion leads to punt and latency/throughput collapse. Watch utilization and incomplete entries before tuning QoS parameters.
Diagram: L3 offload is deterministic on hits. Most production cliffs come from route/neighbor/ECMP resource misses that punt traffic to slow paths.

H2-7 · QoS, Scheduling & Shaping (Non-TSN)

Section intent & boundary (Non-TSN determinism)
  • Convert QoS from “settings” into an explainable queue/scheduler/shaper model.
  • Define the non-TSN boundary: statistical service guarantees, not time-slot determinism.
  • Not covered: TSN time-aware shaping (Qbv) and time-slot scheduling. CBS is mentioned only as “capability supported or not”.
QoS truth chain: DSCP/PCP → class → queue → scheduler
All QoS outcomes depend on a single internal truth: which packets map into which traffic class and egress queue. If mapping is ambiguous, scheduler and shaping tuning becomes non-repeatable.
  • Inputs: DSCP (L3), PCP (VLAN), port-based policy, ACL remark.
  • Outputs: traffic class and queue ID with fixed counters per queue.
  • First verification: per-class and per-queue counters align with expected traffic.
Queues & buffers: microburst → depth → tail latency → drop/mark
Many industrial “P99 spikes” happen under moderate average load due to microbursts. The correct evidence is queue depth and watermarks, not average utilization.
  • Depth/occupancy: rising depth predicts latency growth before drops begin.
  • Watermarks: peak depth reveals brief congestion that averages hide.
  • Drop/mark: thresholds trigger WRED/ECN or tail drop depending on policy.
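The microburst claim is easy to reproduce numerically: a queue with a finite capacity and fixed drain rate drops during a burst even when average load sits below the drain rate. All numbers are made up for the sketch.

```python
def simulate_queue(arrivals, drain=2, capacity=10):
    """arrivals[i] = packets in slot i; drain = packets served per slot.
    Returns (peak depth, tail drops)."""
    depth, peak, drops = 0, 0, 0
    for a in arrivals:
        depth += a
        if depth > capacity:
            drops += depth - capacity    # tail drop above the watermark
            depth = capacity
        peak = max(peak, depth)          # watermark: what averages hide
        depth = max(0, depth - drain)
    return peak, drops

# 20 slots averaging 1.5 pkt/slot (below the drain rate of 2), but bursty:
burst = [15, 0, 0, 0, 15, 0, 0, 0] + [0] * 12
```

Running `simulate_queue(burst)` shows a saturated watermark and nonzero drops at 75% average load, which is exactly why depth peaks, not utilization averages, are the right evidence.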
Scheduling: SP vs WRR/WFQ (service guarantees and failure modes)
Scheduler choice defines who gets served first under contention. SP protects critical flows but can starve lower classes. WRR/WFQ improves fairness, but does not provide strict time-slot determinism.
  • SP (Strict Priority): predictable for high class; starvation risk for low class.
  • WRR/WFQ: controlled sharing; tail latency still depends on burstiness and buffers.
  • Evidence: per-queue service counters and per-queue drop/mark rates.
Shaping: token bucket (rate + burst) and pacing stability
Token bucket shaping enforces a long-term rate limit while allowing short bursts. A too-small burst creates frequent throttling; a too-large burst propagates congestion into downstream hops.
  • Rate: average bandwidth ceiling for the class.
  • Burst: short-term tolerance window; directly impacts jitter and microburst smoothing.
  • Evidence: shaper drop/mark counters and “sawtooth” throughput patterns.
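The rate-plus-burst behavior can be sketched with a textbook token bucket; the rate/burst values and the `allow` interface are illustrative, not tied to any chip's shaper registers.

```python
class TokenBucket:
    def __init__(self, rate, burst):
        self.rate = rate          # tokens (e.g. bytes) refilled per second
        self.burst = burst        # bucket cap = short-term burst tolerance
        self.tokens = burst
        self.last = 0.0

    def allow(self, size, now):
        # Refill proportionally to elapsed time, capped at the burst size
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True           # conforming: transmit
        return False              # exceeds rate + burst: drop or mark
```

The burst parameter is the jitter knob the section describes: too small and conforming traffic is throttled between refills, too large and bursts pass through to congest the next hop.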
Congestion control: WRED / ECN (trigger window, not algorithm theory)
WRED/ECN is best treated as a trigger window. Below the lower threshold, no action. Between thresholds, probabilistic mark/drop. Above the upper threshold, forced action. ECN requires end-to-end support to be effective.
  • Early action: mark/drop begins as depth enters a configured window.
  • Forced action: tail drop or forced mark at high depth.
  • Evidence: ECN mark counters, early drop counters, and depth correlation.
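The trigger window reads naturally as a three-branch function: pass below the window, forced drop above it, and a linear mark/drop probability inside it. Thresholds and the maximum probability are placeholders.

```python
import random

def wred_action(depth, lo=20, hi=60, max_p=0.5, rng=random.random):
    """Return 'pass', 'mark', or 'drop' for the current queue depth."""
    if depth < lo:
        return "pass"                          # below the window: no action
    if depth >= hi:
        return "drop"                          # above the window: forced action
    p = max_p * (depth - lo) / (hi - lo)       # linear ramp inside the window
    return "mark" if rng() < p else "pass"     # ECN mark needs end-to-end support
```

Passing a fixed `rng` makes the probabilistic middle branch testable, which mirrors the field practice of correlating mark counters with depth rather than trusting the configuration alone.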
Deliverable · QoS mapping (DSCP/PCP → class → queue → scheduler)
The mapping is presented as a list rather than a wide table. Each entry is a policy row that can be verified with per-queue counters.
Row A · Critical control
Example
Input: DSCP = X, PCP = X
Class: TC = X
Queue: Q = X
Scheduler: SP / WRR (X)
Verify: Q counters increase only for intended traffic.
Row B · Best-effort
Example
Input: DSCP = X, PCP = X
Class: TC = X
Queue: Q = X
Scheduler: WRR / WFQ (X)
Verify: depth watermarks and drop rate remain within targets.
Pass criteria placeholders (measured at defined points)
Loss rate
Target: drop rate < X per 1M packets on queue Q under load profile Y.
Tail latency
Target: P99 < X ms for class TC measured at egress port P.
Congestion recovery
Target: queue depth returns below threshold within X seconds after burst ends.
Non-TSN QoS Model (classify → queue → schedule → shape → egress): classify (DSCP/PCP, policy/ACL) → queues Q0–Q3 with observability points (depth, drop, mark) → scheduler (SP, WRR/WFQ) → token-bucket shaper (rate, burst) → egress port TX counters. Boundary: non-TSN, no time-slot determinism.
Diagram: QoS becomes explainable when mapping, queue depth, scheduler choice, shaping parameters, and drop/mark triggers are observed together.

H2-8 · Multicast Control: IGMP/MLD Snooping, Querier, Storm Guard

Section intent: first-principles troubleshooting
  • Explain multicast as a table-driven forwarding system (member ports vs. VLAN flooding).
  • Provide the shortest troubleshooting path for: flooding, subscription loss, and latency jitter.
  • Keep the focus on IGMP/MLD snooping and operational controls (querier, aging, storm guard), not protocol theory.
Core model: group table present → member-only forwarding
Snooping turns multicast from “flood inside VLAN” into “forward only to member ports”. If the group table is missing, stale, or not refreshed, behavior degrades to flooding or intermittent delivery.
  • IGMP: IPv4 multicast control messages.
  • MLD: IPv6 multicast control messages.
  • One shared principle: maintain group membership and refresh timers.
Deliverable · Snooping state machine (join / leave / aging)
Join
Trigger: host sends join (report).
Table action: add/update group entry + member port.
Observe: group entry count increases; member port list updates.
Leave
Trigger: host sends leave.
Table action: start confirmation (query/timeout) then remove member.
Observe: leave counters; member removal after confirmation window (X).
Aging
Trigger: refresh timeout.
Table action: expire group/member entries.
Observe: aging counters; sudden flooding if table empties unexpectedly.
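The join/leave/aging state machine above can be sketched as a group table whose lookup degrades to VLAN flooding when no live members remain; the aging value and method names are illustrative.

```python
class SnoopTable:
    def __init__(self, age_after=260):
        self.groups = {}              # group -> {member port: last refresh}
        self.age_after = age_after    # refresh timeout (illustrative)

    def join(self, group, port, now):
        self.groups.setdefault(group, {})[port] = now   # add/refresh member

    def leave(self, group, port):
        # Modeled as already past the confirmation (query/timeout) window
        self.groups.get(group, {}).pop(port, None)

    def egress_ports(self, group, vlan_ports, now):
        members = {p for p, t in self.groups.get(group, {}).items()
                   if now - t <= self.age_after}
        # Empty or fully aged table degrades to VLAN-wide flooding
        return members if members else set(vlan_ports)
```

The fallback branch is the diagnostic core of this section: the same lookup that gives member-only forwarding silently becomes flooding when the querier stops refreshing state.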
Shortest troubleshooting path: flooding
  • Step 1 — Querier: verify a querier exists on the VLAN; missing querier often triggers table decay and flooding.
  • Step 2 — Aging: check whether group entries age out faster than refresh; correlate with entry count drops.
  • Step 3 — VLAN & ACL: confirm IGMP/MLD control messages are not blocked or isolated.
  • Step 4 — CPU punt: if control-plane load spikes, snooping updates may lag and degrade to flood.
Shortest troubleshooting path: subscription loss / latency jitter
  • Subscription loss: confirm join arrives → confirm member port added → confirm forwarding only to members.
  • Latency jitter: verify whether flooding is occurring (unexpected fan-out) → check storm guard thresholds → check queue depth.
  • Common trap: storm control silently drops BUM traffic when snooping degrades to flood.
Storm guard (protection vs. false positives)
Storm control protects the fabric by limiting broadcast/unknown-unicast/multicast (BUM). When snooping fails and multicast turns into VLAN-wide flooding, storm guard can become the dominant loss mechanism. Thresholds must be validated for degraded modes.
  • Threshold placeholder: BUM rate limit = X (per port / per VLAN depending on chip).
  • Verify: storm-drop counters correlate with jitter/loss events.
Multicast Control (snooping vs flooding degradation): hosts join via report, the snooping switch builds a group table (members, aging timer), and forwards only to member ports while a querier refreshes state with periodic queries. When the querier is missing, the table decays and forwarding degrades to VLAN-wide flooding, where storm guard can dominate loss.
Diagram: With snooping, multicast forwards only to member ports. If querier/refresh fails, behavior degrades toward VLAN flooding and storm control may become the primary loss mechanism.

H2-9 · ACL & Security Hooks: TCAM, Policer, Isolation

Section intent & boundary (capability-level security hooks)
  • Explain ACL as a resource system: TCAM width/slices/priority and how rule overlap causes “false drops”.
  • Turn ACL from “can it be configured” into “can it be verified and scaled without conflicts”.
  • Not covered: deep MACsec/TLS/key management. Only “integrated or not” and interface hooks are discussed.
ACL insertion point defines what can be matched
ACL behavior is not only about “fields supported” but also about where ACL is evaluated in the forwarding pipeline. Parser availability, metadata visibility, and stage ordering define match fidelity.
  • Ingress (early): fast filtering, may have limited L4 visibility.
  • Post-lookup: can combine L2/L3 decision metadata with policy.
  • Egress (late): can police/shape/mirror with queue context.
TCAM resource model: entries ≠ capacity
Rule capacity depends on key width and slicing. A single “complex” rule can consume multiple slices/banks. Shared TCAM pools across features can also reduce effective ACL capacity.
  • Width expansion: IPv6 + 5-tuple + inner headers increase key width and slice usage.
  • Sharing: ACL/Policy/Filters may share TCAM slices (implementation-specific).
  • Practical check: track TCAM utilization and rule install failures at scale.
Priority & conflicts: overlap + default action causes false drops
ACL issues often present as “random” failures because overlapping rules and ambiguous priority resolve differently than expected. A misaligned default action can turn misses into widespread drops.
  • Overlap trap: a broad deny rule can shadow a narrow permit rule.
  • Stop condition: first-hit behavior vs. multi-hit behavior determines resolution.
  • Default action: permit/deny on miss must be explicitly defined and tested.
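The shadowing trap can be shown with a minimal first-hit evaluator (a sketch of the resolution semantics, not any vendor's rule engine; field names are hypothetical): a broad deny installed at higher priority silently absorbs traffic a narrower permit was meant to pass.

```python
def first_hit(rules, pkt):
    """First-hit resolution: rules checked in priority order; miss -> default."""
    for rule in rules:
        if all(pkt.get(f) == v for f, v in rule["match"].items()):
            return rule["action"], rule["name"]
    return "deny", "default"  # the miss action must be explicit and tested

# Broad deny above a narrow permit shadows it: the classic "false drop".
rules = [
    {"name": "deny_subnet", "match": {"dst_net": "10.0.0.0/8"}, "action": "deny"},
    {"name": "permit_nms",  "match": {"dst_net": "10.0.0.0/8", "dport": 161},
     "action": "permit"},
]
pkt = {"dst_net": "10.0.0.0/8", "dport": 161}
assert first_hit(rules, pkt) == ("deny", "deny_subnet")   # permit never hits
assert first_hit(list(reversed(rules)), pkt) == ("permit", "permit_nms")
```

Per-rule hit counters expose exactly this: a permit rule whose counter never increments while an overlapping deny counts up is shadowed, not broken.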
Action engine: permit/deny/remark/police/mirror
ACL actions can reshape traffic behavior, not just allow or drop. Policing and mirroring are common sources of “intermittent” symptoms when thresholds or overhead are underestimated.
  • Remark: DSCP/PCP rewrite can change QoS mapping downstream.
  • Police: exceeding CIR/PIR or burst limits triggers drops (often mistaken for link instability).
  • Mirror: SPAN overhead can amplify congestion and distort latency measurements.
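Why policing looks like "link instability" follows from the token-bucket mechanics; a minimal single-rate sketch (CIR plus a burst allowance CBS, both values illustrative) shows that back-to-back frames can be dropped even though the long-term rate is under CIR.

```python
def police(events, cir_bps, cbs_bytes):
    """Single-rate token bucket: conform if tokens cover the frame, else drop.
    events: list of (timestamp_s, frame_bytes), timestamps non-decreasing."""
    tokens, last_t = float(cbs_bytes), 0.0
    verdicts = []
    for t, size in events:
        tokens = min(cbs_bytes, tokens + (t - last_t) * cir_bps / 8)  # refill
        last_t = t
        if tokens >= size:
            tokens -= size
            verdicts.append("conform")
        else:
            verdicts.append("drop")  # often misread as physical-link trouble
    return verdicts

# 1 Mbit/s CIR with a 1500 B burst allowance: the second back-to-back frame
# exceeds CBS and drops, yet the average rate here is well under CIR.
v = police([(0.0, 1500), (0.0001, 1500), (1.0, 1500)], 1_000_000, 1500)
assert v == ["conform", "drop", "conform"]
```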
Verification loop: counters determine whether rules are provable
A scalable ACL design requires measurable evidence. Per-rule counters may be limited by a shared pool; policer and mirror counters are often the minimum proof required for acceptance and field debugging.
  • Per-rule hit: validate match correctness and detect shadowing.
  • Policer drops: confirm threshold behavior under load.
  • Mirror bytes: quantify overhead and potential congestion impact.
Deliverables · ACL design checklist & resource budget placeholders
ACL rule design checklist
Match fields: L2/L3/L4/metadata (capability-defined)
Priority model: rule ordering plan (avoid overlap ambiguity)
Default action: explicit permit/deny on miss
Action set: permit/deny/remark/police/mirror
Counters: hit + drops + mirror bytes (minimum proof)
Resource budget (placeholders)
TCAM slices: total X, reserved for ACL X
Effective entries: X (depends on key width)
Per-rule counters: X (shared pool size X)
Policer instances: X
Hitless update: supported? X
Figure · ACL/TCAM pipeline: parser (L2/L3/L4 fields) → TCAM lookup (rule match, priority resolve, hit/miss) → action engine (permit/deny/remark/police/mirror), with counter hooks (rule hit, police drop, mirror bytes) and TCAM resource placeholders (slices/entries/key width: X).
Diagram: The practical ACL design problem is TCAM resource consumption, rule overlap, and verifiable counters—not just “supported fields”.

H2-10 · Line-rate Telemetry & Manageability

Section intent: observability as a primary selection value
  • Prioritize line-rate counters and event logs that shorten field debug cycles.
  • Cover mirroring concepts, sampling concepts, in-band telemetry capability, and black-box hooks.
  • Provide a minimal counter map and a field log schema that closes the troubleshooting loop.
Observability layers: Port / Queue / Table / CPU
A reliable debug loop requires evidence across multiple layers. Without queue depth/watermarks, tail latency cannot be explained. Without punt-by-reason, control-plane degradations look like random data-plane failures.
  • Port: link flap, CRC/FCS, error bursts.
  • Queue: depth, watermark, drops, ECN marks.
  • Table: MAC moves, multicast group entries, aging.
  • CPU: punt rate and exception reasons.
Mirror, sampling, and in-band capabilities (concept-level)
Mirroring provides payload visibility but can add overhead. Sampling is low-impact but probabilistic. In-band telemetry can embed hop metadata if supported, but should be treated as a capability flag rather than a protocol deep dive.
  • SPAN/RSPAN/ERSPAN: concept only (payload visibility vs overhead).
  • sFlow: concept only (trend monitoring vs microburst reconstruction limits).
  • INT/IOAM: capability flag only (where overhead is paid).
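The sampling limitation above has a simple probabilistic form. A sketch (standard 1-in-N sampling math, not any specific sFlow agent implementation) shows why microsecond-scale bursts are effectively invisible to trend-oriented sampling:

```python
def p_burst_sampled(burst_pkts, sample_rate_n):
    """Probability a 1-in-N sampler catches at least one packet of a burst."""
    return 1 - (1 - 1 / sample_rate_n) ** burst_pkts

# A 100-packet microburst under a common 1-in-4096 sampling rate is almost
# always missed -- sampling shows trends, not microburst reconstruction.
assert p_burst_sampled(100, 4096) < 0.03          # ~2.4% chance of any sample
assert p_burst_sampled(100_000, 4096) > 0.99      # sustained traffic is visible
```

This is why the counter map below pairs sampling with queue watermarks: the watermark records the burst that the sampler statistically cannot see.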
Deliverable · Counter map (minimum set for closed-loop debugging)
Use grouped counter sets to answer root-cause questions quickly without requiring full packet capture.
Link / Port
CRC/FCS, link flaps, error bursts (X)
QoS / Queues
Per-queue depth, watermarks, drops, ECN marks (X)
Control / Exceptions
CPU punts by reason, exception drops, trap counters (X)
Policy / Security
ACL hits, policer drops, mirror bytes (X)
Deliverable · Field log schema (forensics-ready)
Time & identity
timestamp (X), device id (X), port id (X), VLAN/BD (X)
Link & traffic events
link up/down, flap count, CRC surge, storm trigger
System context
temperature, rails, reset reason, boot count, FW version, config revision
Policy and changes
ACL update, QoS change, mirror enable/disable, admin actions (X)
Figure · Line-rate telemetry: data-plane counters/events (port counters, queue depth/drops, table stats, punts/exceptions) → management CPU/SDK (aggregation, event log, black-box buffer) → export (SNMP, NETCONF, streaming telemetry). Field service reads the black-box plus counters to isolate root cause faster.
Diagram: Line-rate counters plus event logs reduce reliance on full packet capture and accelerate field forensics using export paths and a black-box buffer.

H2-11 · Performance Reality: Buffers, Microbursts, Latency, Head-of-Line Blocking

Section intent: convert marketing numbers into engineering budgets and verifiable tests
  • Explain why average utilization can look fine while microbursts still drop packets.
  • Separate throughput, P50/P99 latency, loss, and recovery time into measurable acceptance targets.
  • Address bufferbloat and head-of-line blocking as the common root causes of “mysterious” tail latency.
  • Not covered: TSN time-aware scheduling (Qbv/Qci) details.
Four metrics that must be separated: throughput / latency / loss / recovery
Line-rate throughput does not imply low tail latency. Zero loss does not imply predictable latency. Acceptance targets must explicitly include P50 and P99 latency plus recovery time after congestion events.
  • Throughput: line-rate under defined frame sizes and feature enablement.
  • Latency: P50 and P99 (tail) under defined load and queue policy.
  • Loss: microburst drops vs sustained congestion drops.
  • Recovery: time to return to baseline after burst or oversubscription.
Microbursts: why “20% average” can still drop
Microbursts occur when multiple ingress sources align toward the same egress in a short window. The average rate can remain low while the instantaneous arrival rate exceeds egress capacity.
  • Window matters: burst duration (μs–ms) determines required buffering.
  • Rate gap: ingress aggregate > egress rate causes queue buildup.
  • Evidence: queue watermark spikes + short drop bursts.
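The "rate gap × window" relationship above can be written as a one-line buffer budget (a first-order estimate ignoring per-packet overhead and shared-pool policy; the port counts and window are example numbers):

```python
def burst_buffer_bytes(ingress_gbps, egress_gbps, window_us):
    """Queue growth during a burst: (arrival rate - drain rate) * window."""
    delta_bps = (ingress_gbps - egress_gbps) * 1e9
    return max(0.0, delta_bps / 8 * window_us * 1e-6)

# Four 10G access ports bursting into one 10G uplink for a 50 us window:
need = burst_buffer_bytes(ingress_gbps=40, egress_gbps=10, window_us=50)
assert round(need) == 187_500   # ~183 KiB needed for this single alignment
```

This is the arithmetic behind the buffer budgeting card later in this section: pick the worst plausible fan-in alignment and window, and compare the result against the effective per-queue buffer, not the headline total.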
Buffer architecture: shared pool vs per-queue limits
Total buffer size is not the same as effective buffer per queue. Shared pools can be exhausted by a subset of flows, while per-queue caps can cause earlier drops. Headroom reservation protects priority traffic but reduces general capacity.
  • Shared: flexible but contention-driven (whoever arrives first can consume).
  • Per-queue cap: prevents starvation but may amplify microburst loss.
  • Headroom: reserved buffer for critical traffic (policy-defined).
Head-of-line blocking: one bad actor inflates others’ tail latency
HOL blocking appears when different traffic types are forced to share the same queue, or when shared buffering lets one class crowd out others. Tail latency grows even while throughput remains high.
  • Queue coupling: coarse classification maps too many flows into one queue.
  • Shared pool coupling: one class consumes shared buffer, others drop or stall.
  • Evidence: queue depth + watermark + drop counters correlate with P99 spikes.
Bufferbloat: no drops, but latency becomes uncontrollable
Deep queues can hide congestion by absorbing bursts, but the queueing delay becomes the dominant latency term. For industrial systems, the correct target is typically P99 latency, not only drop rate.
  • Queue limits: enforce an upper bound on worst-case queueing delay.
  • WRED/ECN: if supported, can reduce standing queues (capability-level).
  • Proof: queue watermark distribution + latency histogram.
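The "queue limits bound worst-case delay" claim is direct arithmetic: queueing delay is the standing queue drained at line rate. A small sketch (numbers illustrative) shows why a deep buffer with no limit can add milliseconds without a single drop:

```python
def worst_case_queue_delay_us(queue_limit_bytes, drain_gbps):
    """Upper bound on queueing delay once a queue limit is enforced."""
    return queue_limit_bytes * 8 / (drain_gbps * 1e9) * 1e6

# A 1 MB standing queue on a 1G port adds ~8 ms of delay -- zero drops needed.
assert round(worst_case_queue_delay_us(1_000_000, drain_gbps=1)) == 8000
# Capping the same queue at 125 kB bounds its contribution to 1 ms.
assert round(worst_case_queue_delay_us(125_000, drain_gbps=1)) == 1000
```

Inverting the formula gives the queue limit needed for a P99 latency target, which is how the "Target P99 latency" and "Pool policy" fields of the budgeting card connect.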
Deliverables · Buffer budgeting cards & pass criteria placeholders
Buffer budgeting card (template)
Port speed: X Gbps
Oversubscription: X:1 (ingress aggregate vs egress)
Burst size: X bytes / X μs window
Target P99 latency: X μs
Required buffer: X KB / MB
Pool policy: shared / per-queue cap / headroom (X)
Pass criteria (placeholders)
Latency: P50 ≤ X, P99 ≤ X
Microburst drops: drop rate ≤ X (define window)
Recovery: congestion recovery time ≤ X ms
Figure · Microburst reality: burst input above egress capacity → queue depth rises through the WRED start (optional) and drop thresholds → drops/WRED marking → P99 tail latency grows with queue depth.
Diagram: Microbursts build queues fast; WRED or hard drops can occur, and tail latency grows with queue depth even when average load looks low.

H2-12 · Engineering Checklist (Design → Bring-up → Production)

Section intent: move from “works” to “manufacturable and serviceable”
  • Use checklists that are evidence-driven: each item must have a proof point and a pass threshold placeholder.
  • Organize by project phase: Design, Bring-up, Production.
  • Make the result repeatable for teams and factories with versioning and black-box fields.
How to use this checklist (evidence + threshold + traceability)
Each checkbox item is written as an action with a required evidence field and a pass criterion placeholder. This format enables consistent bring-up, stable production release, and faster RMA forensics.
Design checklist (interface / clock / power / thermal / straps / EMC hooks)
  • ☐ Select host interface (SGMII/QSGMII/USXGMII/PCIe) → Evidence: lane map + pin plan → Pass: X
  • ☐ Define ref clock source and routing constraints → Evidence: jitter/route notes → Pass: X
  • ☐ Build power tree (rails, sequencing, PDN targets) → Evidence: rail list + probe points → Pass: X
  • ☐ Plan thermal path (heatsink/airflow/copper) for peak modes → Evidence: thermal model → Pass: X
  • ☐ Define strap/EEPROM policy and recovery path → Evidence: config map + versioning → Pass: X
  • ☐ Reserve EMC/ESD hooks (layout keepouts, grounding strategy) → Evidence: layout rules → Pass: X
Bring-up checklist (self-test / loopback / PRBS / FDB+ACL+IGMP / baselines / fault injection)
  • ☐ Port self-test and baseline counters → Evidence: CRC/drop/flap = X → Pass: X
  • ☐ PHY/MAC loopback per port and speed → Evidence: loopback log → Pass: X
  • ☐ PRBS patterns (if available) under load → Evidence: error counters → Pass: X
  • ☐ Validate FDB learning/aging and moves → Evidence: MAC table stats → Pass: X
  • ☐ Validate ACL core use cases and counters → Evidence: hit/drop proof → Pass: X
  • ☐ Validate IGMP snooping behavior (join/leave/aging) → Evidence: group table stats → Pass: X
  • ☐ Record counter baselines (idle + typical load) → Evidence: baseline snapshot → Pass: X
  • ☐ Fault injection (link flap, overload, mirror enable) → Evidence: recovery time → Pass: X
Production checklist (version lock / config backup / thresholds / black-box fields / aging)
  • ☐ Lock FW/SDK/config revision → Evidence: version stamp → Pass: X
  • ☐ Backup “golden config” + rollback config → Evidence: hash + storage location → Pass: X
  • ☐ Set telemetry thresholds (P99, watermarks, punts) → Evidence: threshold table → Pass: X
  • ☐ Define RMA black-box fields (reset, temp, rails, flap, key counters) → Evidence: log schema → Pass: X
  • ☐ Perform burn-in / thermal cycling and validate drift → Evidence: test report → Pass: X
Figure · Engineering checklist flow, Design → Bring-up → Production (evidence-driven): Design (interface, clock, power tree, thermal, straps/EEPROM, EMC/ESD hooks) → Bring-up (self-test, loopback, PRBS, FDB, ACL, IGMP, fault injection) → Production (version lock, config backup, telemetry thresholds, black-box fields, burn-in, thermal cycle).
Diagram: A phase-based checklist converts switch integration from “it works” into evidence-driven, repeatable bring-up and production readiness.

H2-13 · Applications + IC Selection Logic

Section intent: convert features into application gates, resource budgets, and verification-ready acceptance targets
In-scope (this page)
  • Applications framed as switch roles + failure patterns + measurable pass targets (X placeholders).
  • Selection gates: non-negotiables → resource budgets → field operability (maintainability).
  • Capability-to-verification mapping that closes the loop with counters, logs, and bring-up tests.
  • Reference material numbers (part numbers) for silicon shortlisting (not a recommendation).
Out-of-scope (route to sibling pages)
  • TSN time-aware scheduling (Qbv/Qci) deep details.
  • PTP/SyncE/WR timing math and calibration procedures.
  • PoE/PoDL classification, thermal, surge energy sizing.
  • Industrial stacks (PROFINET/EtherCAT/CIP) implementation details.
  • ESD/surge/magnetics component-by-component design deep dive.

Application buckets (role → pain → required capabilities → pass targets)

A1 · Industrial cabinet aggregation switch
System role: aggregate PLC/IO/drive ports into a deterministic, segmented plant network.
Top pain: VLAN leakage, MAC flapping, microburst drops during synchronized cycles.
Key capabilities: VLAN/LAG/STP basics, QoS mapping, robust counters (drops, watermarks).
Pass targets: P99 latency ≤ X, microburst drop ≤ X, recovery time ≤ X ms.
A2 · Edge switch + gateway uplink (segmentation first)
System role: enforce network zones at the edge and forward to an uplink/gateway.
Top pain: ACL rule conflicts, unexpected punts to CPU, hard-to-debug intermittent drops.
Key capabilities: TCAM ACL + per-rule counters, policer, mirroring, event logs.
Pass targets: ACL hit counters consistent, CPU punt rate ≤ X, drop hotspots identified within X minutes.
A3 · Vision / imaging multicast distribution
System role: distribute multicast streams to subscribers without flooding the plant.
Top pain: multicast flooding, missing querier, group aging issues, jittery delivery.
Key capabilities: IGMP/MLD snooping + querier option, storm guard, group-table observability.
Pass targets: no-flood guarantee under X joins/sec, group aging stable within X seconds, loss ≤ X.
A4 · Oversubscribed uplink (microbursts are the real enemy)
System role: fan-in many access ports into fewer uplinks (oversubscription).
Top pain: “average looks fine” but tail latency and burst drops appear randomly.
Key capabilities: queue watermarks, WRED/ECN (if available), per-queue limits, shaping.
Pass targets: watermark distribution controlled, P99 ≤ X, drop bursts ≤ X per hour.
A5 · Remote maintenance / fast triage is the #1 KPI
System role: prioritize observability and auditability over raw port density.
Top pain: slow root-cause, missing data, non-reproducible failures.
Key capabilities: line-rate counters, event logs, black-box snapshots, mirroring, sampling hooks.
Pass targets: RCA time ≤ X, black-box fields complete, config change trace ≤ X minutes.

Selection gates (cut the candidate list in minutes, not weeks)

Gate 1 · Non-negotiables
  • Port count + speeds + host interfaces (SGMII/QSGMII/USXGMII/PCIe) match the system design.
  • VLAN + trunking + LAG + STP primitives meet the topology needs.
  • IGMP/MLD snooping present if multicast bucket is in scope.
  • ACL/TCAM present if zone isolation is required.
  • Observability exists: queue watermarks, drops, punts, and mirroring hooks.
Gate 2 · Resource budgets (X placeholders)
  • FDB / MAC entries: X
  • VLANs / translation rules: X
  • ACL / TCAM entries (and slices): X
  • Multicast group entries: X
  • Queues per port + scheduling options: X
  • Total buffer + per-queue policy + headroom: X
  • Route / ARP/ND / ECMP (if L3 offload is required): X
Gate 3 · Field operability (maintainability)
  • Counter coverage: CRC/drop, queue depth/watermarks, ECN marks (if any), CPU punts.
  • Event logs: temperature, rails, link flap, config change, watchdog/reset reason.
  • Bring-up proof: loopback/PRBS (if available), baseline snapshots, fault-injection recovery.
  • Release controls: version lock, config backup/rollback, black-box schema for RMA.
  • Acceptance targets: P50/P99 latency ≤ X, microburst drop ≤ X, recovery ≤ X ms.

Capability → verification map (engineering closure)

QoS mapping
Verify: DSCP/PCP → class → queue mapping is deterministic.
Evidence: per-queue counters + watermark distribution.
Pass: P99 ≤ X, drops ≤ X, recovery ≤ X ms.
IGMP/MLD snooping
Verify: join/leave/aging → correct group forwarding only to members.
Evidence: group table stats + flood counters.
Pass: flood = 0 under X joins/sec, aging stable within X seconds.
ACL / TCAM rules
Verify: priority + default action + conflict rules behave as designed.
Evidence: per-rule hit/drop counters + mirror samples.
Pass: no false drops in X-hour soak, punt rate ≤ X.
Telemetry + black-box fields
Verify: counters + events enable fast root-cause.
Evidence: exported counters + change logs + snapshots on fault.
Pass: RCA time ≤ X, field schema completeness ≥ X%.
Buffers + microbursts
Verify: burst tests produce controlled watermarks and acceptable tail latency.
Evidence: queue depth (watermark), burst drop counters, latency histogram.
Pass: P99 ≤ X, microburst drops ≤ X per hour, recovery ≤ X ms.
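Several of the pass criteria above gate on P50/P99 rather than the mean; a minimal nearest-rank percentile sketch (sufficient for acceptance gating, though production tooling may use interpolated percentiles) shows why the mean hides exactly the behavior these gates exist to catch:

```python
def percentile(samples, p):
    """Nearest-rank percentile: value at ceil(p*n/100) in the sorted samples."""
    s = sorted(samples)
    k = max(0, -(-p * len(s) // 100) - 1)  # ceil(p*n/100) - 1, clamped to 0
    return s[int(k)]

# 98 fast forwards plus two queue-delayed outliers: the mean looks healthy,
# the P99 does not -- which is why acceptance must gate on the tail.
lat_us = [10.0] * 98 + [900.0, 950.0]
assert sum(lat_us) / len(lat_us) < 30      # mean ~28 us: "looks fine"
assert percentile(lat_us, 50) == 10.0      # P50 is fine too
assert percentile(lat_us, 99) == 900.0     # the tail tells the real story
```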

Shortlist template + reference material numbers (part numbers)

The part numbers below are provided as reference examples for shortlisting. Always confirm the exact feature set (L2/L3, TCAM, queues, counters), temperature grade, and current datasheets.
Managed L2 switch SoC (industrial/embedded)
  • Microchip: VSC7426, VSC7427 (SparX-III family examples)
  • Microchip: VSC7514 (10-port L2 switch example)
Embedded/SMB switch IC with integrated PHY + SerDes
  • Marvell Link Street: 88E6390X (11-port switch example)
  • Marvell family example: 88E6190 (model reference)
High port-count managed MAC switch controller
  • Realtek: RTL8393M-VC (52-port managed MAC switch controller example)
Enterprise / multilayer switch SoC examples
  • Broadcom: BCM56150 (family example)
  • Broadcom: BCM56160 (family example)
  • Broadcom: BCM53156XUB1KFBG (Robo family example)
Shortlist card (copy/paste template)
Candidate: X
Gate 1: pass / fail (interfaces, VLAN/LAG/STP, IGMP, ACL, counters)
Gate 2 budgets: FDB X · VLAN X · ACL X · Multicast X · Queues X · Buffer X
Gate 3 operability: logs + black-box fields + baseline + fault injection
Risks: X
Acceptance: P99 ≤ X · drops ≤ X · recovery ≤ X ms
Figure · Applications → selection gates → shortlist → verify (engineering closure): application buckets (A1 aggregation, A2 segmentation, A3 multicast, A4 microbursts, A5 fast triage) pass through Gate 1 (interfaces, L2 basics, ACL, IGMP, counters, mirroring), Gate 2 (FDB, VLAN, TCAM, multicast, queues, buffer policy, L3 budgets), and Gate 3 (logs, black-box, baselines, fault injection, P99 acceptance), producing a candidate shortlist; the verify loop (counters, watermarks, bring-up checklist) refines the gates and locks the release.
Diagram: application needs flow through three selection gates, producing a shortlist; verification closes the loop with counters, logs, and bring-up evidence.
Page boundary reminder
When the requirement becomes time-aware determinism, timing calibration, PoE power negotiation, or protocol-stack certification, route to the corresponding sibling pages to avoid scope overlap.


H2-14 · FAQs (Troubleshooting, data-driven)

Scope: only L2/L3 (capability-level), QoS (non-TSN), IGMP/MLD, ACL/TCAM, telemetry, buffers/microbursts
Each answer follows a strict 4-line format and ends with measurable pass criteria placeholders (X).
Average utilization is low, but packet drops appear in short bursts
Likely cause: microbursts from fan-in oversubscription; per-queue limit/headroom too small; tail-drop triggers before congestion becomes visible in averages.
Quick check: correlate burst drops with per-queue watermark(max); compare ingress aggregate vs egress line-rate in the same time window; verify which queue/class is dropping (per-queue drop counters).
Fix: split traffic into more queues; adjust buffer sharing/headroom; cap or reshape bursty sources (token-bucket shaping); validate LAG hashing if fan-in is via LAG.
Pass criteria: burst-drop rate ≤ X events/hour at Y% load; queue watermark peak ≤ X% of limit; recovery to baseline ≤ X ms.
No drops, but P99 latency explodes under load (bufferbloat)
Likely cause: deep standing queues hide congestion (bufferbloat); strict-priority or uneven scheduling keeps queues persistently filled, inflating tail latency.
Quick check: observe sustained queue depth/watermark plateau; plot latency histogram vs watermark; check ECN marks (if supported) and per-queue service rates.
Fix: enforce queue limits to bound worst-case delay; enable/tune WRED/ECN if available; adjust scheduler weights (WRR/WFQ) and apply shaping to dominant sources.
Pass criteria: P99 latency ≤ X µs at Y% load; standing queue ≤ X% for > Z ms; drop rate ≤ X.
One traffic class starves others (head-of-line blocking symptoms)
Likely cause: too many flows mapped into the same queue; strict-priority dominance; shared buffer coupling causes one class to crowd out others.
Quick check: verify DSCP/PCP→class→queue mapping; compare per-queue occupancy and drops; check strict-priority queue service share vs WRR/WFQ queues.
Fix: separate critical and bulk flows into distinct queues; cap strict-priority or add minimum service via WRR/WFQ; apply policers on noisy classes.
Pass criteria: no starvation longer than X ms; low-priority P99 ≤ X µs; high-priority loss ≤ X/1e6 packets.
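The "minimum service via WRR" fix can be illustrated with a packet-granularity weighted round-robin sketch (a simplification: real schedulers are byte- or quantum-based deficit round-robin, and the backlogs here are toy numbers). Even with the bulk queue permanently backlogged, the critical queue keeps its configured share:

```python
def wrr_schedule(queues, weights, rounds):
    """Packet-based WRR: each round, queue i sends up to weights[i] packets
    from its backlog. A nonzero weight guarantees a minimum service share."""
    sent = [0] * len(queues)
    for _ in range(rounds):
        for i, q in enumerate(queues):
            take = min(weights[i], q[0])  # q is a one-element backlog list
            q[0] -= take
            sent[i] += take
    return sent

# Bulk traffic is always backlogged; critical still gets its 3:1 share,
# which is the property a pure strict-priority scheme cannot guarantee
# in the opposite direction (low priority would starve entirely).
critical, bulk = [10_000], [10_000]
assert wrr_schedule([critical, bulk], [3, 1], rounds=100) == [300, 100]
```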
VLAN leakage: traffic appears in the wrong VLAN
Likely cause: trunk allow-list missing; wrong PVID/default VLAN; tag/untag action mismatch; ingress VLAN filtering disabled; stale FDB entries across VLANs.
Quick check: validate per-port allowed VLAN list and PVID; confirm tag/untag rules; check VLAN filter-drop counters; confirm MAC table entries are VLAN-scoped (VLAN+MAC).
Fix: enable ingress VLAN filtering; explicitly configure allow-lists; correct PVID and tag actions; clear and relearn FDB after fixing topology/config.
Pass criteria: cross-VLAN frames observed = 0 over X minutes; VLAN filter-drop counter stable within ≤ X/minute under steady state.
MAC flapping / frequent MAC moves cause instability
Likely cause: L2 loop, misconfigured LAG, or topology changes causing the same source MAC to be learned on multiple ports; aggressive aging amplifies churn.
Quick check: read MAC move/learn counters; track one flapping MAC across ports and VLANs; check STP state changes and link flap logs; verify LAG membership consistency.
Fix: eliminate loops (STP/loop-guard); correct LAG configuration; tune aging only after topology is stable; consider MAC pinning for fixed endpoints.
Pass criteria: MAC move events ≤ X/hour; topology-change events ≤ X/day; CPU utilization for control tasks ≤ X%.
LAG is enabled, but load is heavily imbalanced
Likely cause: hash key too narrow (e.g., only L2/L3); elephant flows dominate a single member; asymmetry from source traffic patterns.
Quick check: compare per-member byte/packet counters; identify top flows; check configured hash fields (L2/L3/L4) and whether symmetric hashing is enabled.
Fix: expand hash to include L4 where possible; split elephant flows at the source; add more LAG members or adjust traffic engineering policies.
Pass criteria: member utilization within ± X% over Y minutes; LAG member drops ≤ X/1e6 packets.
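The "hash key too narrow" failure mode can be demonstrated with a toy flow hash (CRC32 modulo member count stands in for the silicon's hash; field selection, not the hash function itself, is the point): many L4 flows between the same two hosts collapse onto one LAG member when only L3 fields are keyed.

```python
import zlib
from collections import Counter

def lag_member(flow_key: bytes, n_members: int) -> int:
    """Toy LAG/ECMP hash: CRC32 of the selected key fields, modulo members."""
    return zlib.crc32(flow_key) % n_members

def spread(flows, n_members, use_l4):
    """Count flows per member for an L3-only vs an L3+L4 hash key."""
    buckets = Counter()
    for src, dst, sport, dport in flows:
        key = f"{src}|{dst}" + (f"|{sport}|{dport}" if use_l4 else "")
        buckets[lag_member(key.encode(), n_members)] += 1
    return buckets

# 1000 TCP flows between the same host pair: L3-only hashing pins them all
# to a single member; including L4 ports spreads them across the LAG.
flows = [("10.0.0.1", "10.0.0.2", 49152 + i, 443) for i in range(1000)]
l3 = spread(flows, n_members=4, use_l4=False)
l4 = spread(flows, n_members=4, use_l4=True)
assert len(l3) == 1                               # everything on one member
assert len(l4) > 1 and max(l4.values()) < len(flows)
```

Note the elephant-flow caveat from the FAQ still applies: a single fat flow is one key, so no hash-field widening can split it across members.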
Multicast floods the whole network unexpectedly
Likely cause: IGMP/MLD snooping off or not active on the VLAN; querier missing; group aging collapses; control packets blocked by ACL so the table never builds.
Quick check: confirm snooping enabled per VLAN; verify querier presence; inspect group-table entries and aging; read unknown-multicast and multicast-flood counters.
Fix: enable querier on the correct VLAN; adjust aging/robustness parameters; ensure IGMP/MLD reports are permitted; apply storm guard for unknown multicast.
Pass criteria: multicast flood counter = 0 in steady state; unknown-multicast rate ≤ X pps; group table stable with ≥ X% expected members.
Multicast subscribers intermittently miss streams
Likely cause: join/leave timing vs aging mismatch; IGMP reports dropped under congestion; control packets punted then delayed; VLAN/ACL interaction blocks membership maintenance.
Quick check: confirm reports are seen on ingress (mirror); compare group entry aging vs report intervals; check control-queue drops and CPU punt rates; verify ACL permits IGMP/MLD.
Fix: tune IGMP/MLD timers to match endpoints; protect IGMP reports with QoS mapping; reduce punts; ensure VLAN membership and ACL rules allow report/querier traffic.
Pass criteria: join-to-forwarding latency ≤ X ms; stream loss ≤ X packets/1e6; group table matches membership with ≥ X% accuracy.
ACL blocks legitimate traffic (false drops)
Likely cause: TCAM rule priority conflict; default action too aggressive; match fields too broad; slice/entry exhaustion changes rule placement or disables intended rules.
Quick check: read per-rule hit/drop counters; verify rule order and default action; mirror packets before the drop stage; check TCAM utilization and slice allocation.
Fix: reorder by specificity; narrow matches; add explicit permits; align default action with policy; reserve TCAM slices for critical rules and disable noisy logging.
Pass criteria: false-drop counter = 0 over X-hour soak; intended-drop rate within ± X%; CPU punt rate ≤ X pps.
Enabling mirroring/telemetry makes performance worse
Likely cause: SPAN oversubscription (mirror egress slower than mirrored traffic); sampling/export overload; CPU interrupts and punts increase when telemetry is misconfigured.
Quick check: compare mirrored traffic rate vs mirror port speed; check mirror-port drop counters; observe CPU punt/IRQ rate; confirm export queue occupancy if streaming telemetry is enabled.
Fix: mirror only necessary subsets; rate-limit sampling; use a high-speed dedicated mirror port; shift telemetry export off the critical path; reduce per-packet logging.
Pass criteria: enabling telemetry increases P99 latency by ≤ X%; mirror-port drops ≤ X/minute; CPU utilization increase ≤ X%.
Counters don’t match: hosts see loss, but switch stats look “clean”
Likely cause: counter scope mismatch (ingress vs egress); different sampling windows; loss happens outside the measured stage (e.g., mirror port, control queue, or offloaded path not included).
Quick check: align measurement windows; compare ingress/egress/queue-level counters simultaneously; run a controlled traffic burst and reconcile deltas; confirm if drops occur on mirror/control/export paths.
Fix: standardize counter definitions and collection cadence; record snapshot bundles (counters + watermarks + punts) on anomaly triggers; add correlation IDs in logs if supported.
Pass criteria: reconciliation error ≤ X% over Y minutes; timestamp skew ≤ X ms; anomaly snapshots captured within X seconds.
High CPU punt rate causes random loss or jitter
Likely cause: exception paths punt too much traffic (unknown L2/L3, TTL, ACL log, multicast control); CPU queue starvation and interrupt storms create unpredictable control-plane delays and drops.
Quick check: read punt reason counters/registers; check CPU queue depth/watermarks and IRQ load; correlate loss with punt bursts; verify which flows are hitting exception rules.
Fix: enable hardware handling for common exceptions; narrow ACL logging and reduce punt-generating rules; isolate control traffic into protected queues; rate-limit punt sources.
Pass criteria: punt rate ≤ X pps; CPU queue depth ≤ X% steady; loss under Y% load ≤ X/1e6 packets.