
Ethernet Controller & MAC (RMII, GMII, SGMII)


An Ethernet Controller/MAC is the “timekeeper and traffic cop” between the CPU and the PHY: it defines how frames are queued, moved (DMA), offloaded, and timestamped.

This page shows how to choose RMII/GMII/SGMII semantics, control FIFO/descriptor behavior, avoid latency/jitter traps, and validate PTP hardware timestamps with measurable pass criteria.

H2-1 · Definition & Where It Sits (MAC vs Controller vs NIC)

What this block is (and why it matters)

An Ethernet Controller/MAC is where packet I/O becomes measurable and controllable: it defines the data path (DMA/FIFOs/queues), latency behavior under bursts, and the exact tap points for PTP hardware timestamps.

MAC vs Controller vs NIC vs SoC MAC (practical boundaries)
MAC (block)
Provides frame TX/RX and standard host↔PHY interfaces (RMII/GMII/RGMII/SGMII). Does not guarantee system-level throughput or low tail-latency without a strong DMA/FIFO/queue design.
Ethernet Controller (chip / subsystem)
Adds the “production” pieces: DMA engines, RX/TX FIFOs, descriptor rings, offloads (checksum/TSO), timestamp paths, counters, and drop reasons—enabling repeatable performance and field diagnostics.
NIC (module / card)
A packaged solution (often controller + PHY + magnetics) optimized for integration and compliance, trading flexibility for faster time-to-deploy and well-defined interoperability.
SoC integrated MAC
Reduces BOM and pin count, but the real limit is often FIFO depth, DMA behavior, timestamp quality, and driver maturity—these determine burst tolerance and tail-latency.
Decision hooks (use these to avoid re-spins)
  • Latency-first: prioritize shallow/controllable queues, clear drop reasons, and predictable DMA servicing.
  • Timestamp-first: demand explicit PTP timestamp tap points and a measurable timestamp path budget.
  • Operability-first: require rich counters (CRC/drops/overruns/underruns) and event logging fields.
Stop line (scope guard)
This chapter stays at system placement and Controller/MAC responsibilities. PHY electrical behavior, SI/layout, TVS/CMC, and magnetics selection are linked out to Ethernet PHY and PHY Co-Design & Protection.
Diagram · System placement map (responsibility boundaries)
Block diagram: CPU/Memory (driver/stack) → MAC/Controller (DMA, RX/TX FIFO, checksum/TSO, PTP timestamps, counters) → PHY (link/EQ/AN) → Magnetics/Connector, connected over RMII · GMII/RGMII · SGMII. This page focuses on the MAC/Controller block.
The Controller/MAC boundary is defined by measurable data-path behavior (DMA/FIFOs/queues) and timestamp tap points. PHY electrical/SI topics are intentionally excluded here.

H2-2 · Interface Map: RMII / GMII / RGMII / SGMII (Semantics & Clocking)

Interface choice is a risk trade (not just a pin-count choice)
Selection ladder (minimal, practical)
  • 10/100: RMII is the common low-pin choice when a clean reference clock plan exists.
  • 1G: GMII/RGMII or SGMII—choose based on clocking tolerance, latency sensitivity, and debug visibility.
  • 2.5G and above: commonly SGMII-family / USXGMII-class interfaces (treated as an entry point only here).
Clocking & semantics (what must be explicitly defined)
  • Clock source ownership: which side provides the reference clock (MAC/Controller, PHY, or external).
  • Status/negotiation transport: how link speed/duplex/status is conveyed (explicit pins vs in-band signaling).
  • Latency stability knobs: clock-domain crossing and buffering choices that change tail-latency under bursts.
Common interface pitfalls (symptom → first check)
RGMII “ID” delay mismatch
Symptom: link-up but unstable throughput, sporadic errors, or intermittent drops under load.
First check: confirm whether TX/RX internal delays are enabled on the correct side (exactly once).
Fix: make delay ownership explicit and consistent across MAC and PHY configuration.
Pass criteria: error counters remain < X over Y minutes at sustained load.
SGMII auto-neg / in-band status mismatch
Symptom: speed/duplex reported incorrectly, flapping, or “works on bench, fails in system.”
First check: ensure both ends agree on in-band status usage and negotiation mode (forced vs AN).
Fix: standardize negotiation policy and validate partner status interpretation.
Pass criteria: negotiated mode stays stable for Z hours with repeated link cycles.
RMII reference clock source ambiguity
Symptom: link comes up intermittently, RX/TX stalls, or counters show underrun/overrun without clear CRC errors.
First check: verify which device is the clock master and whether the clock is present at reset/strap time.
Fix: define clock ownership, startup sequencing, and validate with a simple “clock-present” bring-up checklist.
Pass criteria: link-up success rate ≥ (100% − X%) over N cold boots.
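The bring-up pass criteria above can be expressed as a tiny check. A minimal sketch, assuming boot results are collected as booleans and the allowed failure percentage X is a placeholder:

```python
# Hypothetical pass-criteria check for the RMII bring-up gate:
# link-up success rate over N cold boots must stay >= (100% - X%).
def link_up_pass(boot_results, allowed_failure_pct):
    """boot_results: list of booleans (True = link came up on that boot)."""
    n = len(boot_results)
    success_rate_pct = 100.0 * sum(boot_results) / n
    return success_rate_pct >= 100.0 - allowed_failure_pct

# 49 successful cold boots out of 50, with X = 2% allowed failure:
print(link_up_pass([True] * 49 + [False], allowed_failure_pct=2.0))  # True
```

The same shape works for the RGMII and SGMII pass criteria: collect per-cycle observations, reduce to a rate, compare against the documented threshold.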
Stop line (scope guard)
This chapter explains interface semantics and clocking ownership only. PHY-side electrical constraints and PCB length-matching rules are linked out to Ethernet PHY and Ref Clock & Layout.
Diagram · Interface decision ladder (speed → interface → constraints)
Decision ladder mapping speed tiers to interfaces and constraints: 10/100 → RMII (ref-clock plan, pin cost); 1G → GMII/RGMII (ID delay ownership) or SGMII (in-band status/AN, pin cost); 2.5G+ → USXGMII-class interfaces (entry point only here). Each rung carries its own latency risk.
Interface selection should be documented as clock ownership + status transport + latency risk. This prevents “link-up but unstable under load” failures caused by ambiguous negotiation or timing assumptions.

H2-3 · Data Path Anatomy: RX/TX Pipeline, FIFOs, DMA, Descriptors

Why “throughput looks fine” but latency/drops are unstable

Instability typically comes from queueing and service mismatch inside the Controller/MAC data path: FIFO depth, DMA servicing cadence, descriptor supply (ring health), and backpressure behavior. This chapter turns RX/TX into a measurable pipeline with clear observability points.

Data-path contract (RX and TX)
RX pipeline (where drops and overruns appear)
MAC → RX FIFO → DMA → RX ring/descriptor → stack
Interpretation: RX FIFO absorbs bursts; DMA drains into memory; RX ring health determines whether frames are accepted or dropped.
TX pipeline (where stalls and underruns appear)
stack → TX descriptor → DMA → TX FIFO → MAC
Interpretation: TX ring feeds DMA; TX FIFO decouples memory timing from line timing; underruns typically indicate service cadence problems.
Minimum observability set (Controller-side)
  • Drops: per-reason counters (ring full, buffer unavailable, policy drop).
  • FIFO events: RX overrun, TX underrun, watermark hits.
  • Backpressure: pause frames sent/received and internal throttle events.
  • DMA / ring: descriptor starvation, wrap errors, ownership mismatches.
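A counter-delta pass over a fixed measurement window is the simplest way to use this observability set. A minimal sketch; the counter names and snapshot dicts are illustrative, not a real driver API:

```python
# Sketch of a counter-delta check over a measurement window. A snapshot is
# assumed to be a dict of raw controller counters; names (rx_overrun,
# ring_full_drops, ...) are hypothetical placeholders.
def counter_deltas(before, after):
    """Per-counter increase between two snapshots of the same window."""
    return {k: after[k] - before[k] for k in before}

before = {"rx_overrun": 10, "tx_underrun": 3, "ring_full_drops": 120}
after  = {"rx_overrun": 10, "tx_underrun": 3, "ring_full_drops": 150}
print(counter_deltas(before, after))
# {'rx_overrun': 0, 'tx_underrun': 0, 'ring_full_drops': 30}
```

Deltas, not absolute values, are what the pass criteria in this chapter compare against thresholds.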
RX anatomy (what each box guarantees)
RX FIFO (burst absorber)
Role: absorb micro-bursts and decouple line timing from DMA service.
Failure mode: overrun when service lag exceeds FIFO capacity; “no CRC errors but drops.”
Quick check: RX overrun counter + FIFO watermark events + pause/backpressure activity.
Pass criteria: RX overrun < X per hour at sustained load, with stable P99 latency.
DMA drain (service cadence)
Role: move frames from FIFO into memory with predictable bursts and alignment.
Failure mode: jittery service (bursty DMA) causes FIFO oscillation and tail-latency spikes.
Quick check: DMA completion pacing + ring fill level over time (avoid saw-tooth extremes).
Pass criteria: ring occupancy stays within a target band; no starvation events in Y minutes.
RX ring/descriptor (buffer supply)
Role: advertise buffer availability and preserve ordering/ownership contracts.
Failure mode: descriptor starvation (ring empty) drops frames even when line is clean.
Quick check: “buffer unavailable” drop reason + starvation counter + wrap/ownership flags.
Pass criteria: starvation < X per 10^6 frames and no wrap/ownership anomalies.
TX anatomy (why stalls look like “random latency”)
TX ring/descriptor (work queue)
Role: queue transmit work with clear ownership and batching behavior.
Failure mode: ring overfill or ownership mismatch causes long tail latency or periodic stalls.
Quick check: ring fullness + “TX busy” duration + descriptor reclaim rate.
Pass criteria: reclaim keeps up with enqueue; TX busy never exceeds X ms at steady load.
TX FIFO (line-timing decoupler)
Role: smooth memory/service jitter before frames hit the MAC/line.
Failure mode: underrun indicates service gaps or overly aggressive pacing/coalescing upstream.
Quick check: TX underrun counter + pause/backpressure correlation + DMA pacing.
Pass criteria: TX underrun = 0 in Y minutes at target throughput and packet size mix.
Descriptor engineering checklist (Controller-centric)
  • Burst: DMA burst length must match memory behavior to avoid saw-tooth latency.
  • Cache line & alignment: descriptor and buffers should align to reduce jitter from extra transactions.
  • Ownership: producer/consumer flags must be monotonic and recoverable after resets.
  • Wrap: ring wrap boundaries must be validated under stress (long soak, mixed packet sizes).
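The ownership and wrap contracts above can be illustrated with a toy ring model. This is a sketch only: real descriptor rings live in DMA-coherent memory with hardware-defined ownership bits; the class below just models the producer/consumer indices:

```python
# Toy descriptor ring: ownership handoff, full/empty detection, and wrap.
class DescRing:
    def __init__(self, size):
        self.size = size
        self.head = 0                      # producer (driver enqueue) index
        self.tail = 0                      # consumer (reclaim) index
        self.owned_by_hw = [False] * size  # ownership flag per slot

    def free_slots(self):
        return self.size - (self.head - self.tail)

    def enqueue(self):
        if self.free_slots() == 0:
            return False                   # ring full -> explicit drop/backpressure
        self.owned_by_hw[self.head % self.size] = True  # hand slot to hardware
        self.head += 1
        return True

    def reclaim(self):
        if self.tail == self.head:
            return False                   # nothing completed (starvation on RX)
        self.owned_by_hw[self.tail % self.size] = False  # take slot back
        self.tail += 1
        return True

ring = DescRing(4)
for _ in range(4):
    assert ring.enqueue()
print(ring.enqueue())   # False: ring full, reclaim must catch up first
ring.reclaim()
print(ring.enqueue())   # True: slot 0 is reused after the wrap
```

Note how "ring full" is a return value, never a silent loss: this is the observable-drop contract from the checklist.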
Stop line (scope guard)
This chapter stays at Controller/MAC pipeline contracts and Controller-side metrics. OS networking stack internals and driver parameter catalogs are intentionally excluded (only observable symptoms and interface-level expectations are used here).
Diagram · RX/TX pipeline with queues and observability points
Block diagram of the Controller pipeline (queues + service cadence + observability). RX (ingress): MAC → RX FIFO → DMA → RX ring → stack, with drop, overrun, and pause markers. TX (egress): stack → TX ring → DMA → TX FIFO → MAC, with underrun and pause markers.
Observability should be placed at queue boundaries (FIFO and ring) and service boundaries (DMA pacing). This makes “random” latency and drops reproducible.

H2-4 · Latency & Determinism: From Micro-bursts to Bufferbloat

Determinism starts with the correct metrics
Three metrics (do not mix them)
  • Throughput: average delivery rate.
  • Latency: time per packet, tracked as P50/P99 (tail latency matters most).
  • Jitter: latency variation (spread between P50 and P99/P999).
Two layers of determinism
  • End-to-end: application-to-application behavior (system-level).
  • Controller-internal: FIFO/ring queueing and DMA service cadence (this page’s focus).
Pass criteria template (placeholders)
P99 latency < X (target window), drop rate < Y, pause rate < Z, measured over a fixed window W with a representative packet mix.
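The template can be made concrete with a small percentile check. A sketch using a simple nearest-rank-style percentile; all thresholds are placeholders:

```python
# Sketch of the pass-criteria template: compute P50/P99 from latency
# samples and compare against placeholder thresholds.
def percentile(samples, p):
    """Nearest-rank-style percentile over a sample list."""
    s = sorted(samples)
    idx = min(len(s) - 1, int(round(p / 100.0 * (len(s) - 1))))
    return s[idx]

def passes(latencies, p99_limit, drops, drop_limit):
    return percentile(latencies, 99) < p99_limit and drops < drop_limit

latencies_us = [10] * 98 + [50, 400]   # mostly fast, one tail outlier
print(percentile(latencies_us, 50), percentile(latencies_us, 99))  # 10 50
print(passes(latencies_us, p99_limit=100, drops=0, drop_limit=1))  # True
```

Averages over this sample would hide the 400 µs outlier entirely; the percentile view is what makes the tail visible.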
Micro-bursts: short spikes that dominate tail latency
Mechanism (Controller view)
A micro-burst is a brief period where arrival rate exceeds service rate. Frames accumulate in RX FIFO / rings, turning into a sudden rise in P99 latency even when average throughput looks fine.
What to observe (fastest signals)
  • Ring occupancy leaving its normal band (rapid climb, slow drain).
  • FIFO watermark hits followed by drops or pause frames.
  • Tail latency (P99) rising while average throughput stays stable.
Bufferbloat: deep queues hide drops but destroy predictability
What it is
Bufferbloat happens when FIFO/rings are made “very deep” to avoid drops. Drops decrease, but the queueing delay becomes unbounded, and tail latency becomes unpredictable.
Low-latency FIFO strategy (Controller-first)
  • Shallow queues: keep queueing delay within a target window X.
  • Fast service: maintain consistent DMA drain/reclaim cadence during bursts.
  • Explicit drop policy: drops must have reasons (observable), avoiding “silent loss.”
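The explicit-drop strategy can be sketched as a bounded queue whose drops always increment a reason counter. Class and counter names are illustrative:

```python
from collections import deque

# Sketch of a shallow queue with an explicit drop policy: bounded depth,
# and every drop carries an observable reason instead of being silent.
class ShallowQueue:
    def __init__(self, depth):
        self.q = deque()
        self.depth = depth
        self.drop_reasons = {"queue_full": 0}

    def push(self, frame):
        if len(self.q) >= self.depth:
            self.drop_reasons["queue_full"] += 1   # observable, not silent
            return False
        self.q.append(frame)
        return True

    def pop(self):
        return self.q.popleft() if self.q else None

q = ShallowQueue(depth=2)
for frame in ["f1", "f2", "f3"]:
    q.push(frame)
print(q.drop_reasons)  # {'queue_full': 1}
```

Keeping `depth` small bounds the worst-case queueing delay; the counter keeps the cost of that choice explainable.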
Stop line (scope guard)
This chapter covers Controller-internal queueing and tail-latency control. TSN time-window scheduling (gate control lists / time slots) is intentionally excluded; if deterministic time windows are required, use TSN Switch / Bridge.
Diagram · Queue depth vs tail latency (P99) with a target window
Trend chart: P99 tail latency rises sharply with queue depth. A target latency window X marks the boundary between the micro-burst absorption region and the bufferbloat region, with the queue-policy boundary in between.
The goal is to keep typical queueing below Target window X while making drops explicit and observable. Deep queues can “look stable” on throughput but still break determinism via tail latency.

H2-5 · Offloads: Checksum / TSO / LRO / VLAN / Filtering (What Helps vs Hurts)

When offloads help — and when they hurt determinism and debugging

Offloads can reduce CPU cost and raise throughput, but may also increase tail latency, alter queue behavior, and make packet capture misleading. A latency-first approach evaluates every offload with the same four axes: determinism, observability, CPU, and throughput.

Latency-first evaluation checklist
Determinism
Measure P99 / P999 latency and queue occupancy drift. Any feature that changes batching or merge windows can increase jitter.
Observability
Confirm whether captures and counters still reflect reality. Some offloads postpone checksum work or coalesce segments, making tools “lie” unless configured.
CPU & throughput
Verify improvements under a representative packet mix. Averages can look better while tails get worse.
Checksum offload (benefit vs capture pitfalls)
What it helps
  • CPU cycles saved for per-packet checksum work.
  • Higher throughput when CPU is the bottleneck.
What can go wrong
Captures may show “bad checksum” because the checksum is inserted/validated by hardware late in the pipeline. This is often a tooling perspective issue, not a link-quality problem.
Practical rule
Debug mode: disable temporarily for clean captures. Production mode: enable if CPU budget is tight and determinism targets remain satisfied.
Pass criteria: capture interpretation matches hardware reality; no increase in P99 beyond X.
TSO and LRO/GRO (batching changes queue behavior)
TSO (TCP Segmentation Offload)
Benefit: fewer CPU operations by letting hardware segment large payloads into MSS-sized frames.
Risk: large “work items” can dominate TX ring service, increasing tail latency for small real-time flows.
Quick check: compare TX ring occupancy and P99 latency with/without TSO under mixed packet sizes.
Use when: throughput-first bulk traffic; avoid for strict cyclic real-time streams unless isolated.
LRO/GRO (Receive coalescing)
Benefit: fewer interrupts/processing overhead by combining packets before delivery.
Risk: merge windows add waiting time and reshape arrival timing, often increasing jitter for real-time traffic.
Quick check: P99/P999 jitter comparison and queue drift when coalescing is enabled.
Use when: non-real-time traffic or when real-time flows are isolated to different queues.
Isolation hint (Controller-side)
If coalescing is required for bulk traffic, keep real-time flows in a dedicated class/queue using VLAN/priority tagging and hardware filtering where available.
Pass criteria: real-time class P99 latency stays within X while bulk class consumes offload benefits.
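The isolation hint can be sketched as priority-based queue steering. The PCP-to-class mapping below is an assumption for illustration, not taken from any specific controller:

```python
# Sketch of VLAN-priority-based queue steering: assumed "real-time" PCP
# values go to a dedicated queue; everything else shares the bulk queue
# (where heavier coalescing/offloads are acceptable).
RT_PCPS = {5, 6, 7}   # illustrative real-time priority code points

def steer(frames):
    queues = {"rt": [], "bulk": []}
    for frame in frames:
        pcp = frame.get("pcp", 0)          # untagged traffic defaults to bulk
        queues["rt" if pcp in RT_PCPS else "bulk"].append(frame)
    return queues

frames = [{"id": 1, "pcp": 6}, {"id": 2, "pcp": 0}, {"id": 3, "pcp": 5}]
q = steer(frames)
print([f["id"] for f in q["rt"]])    # [1, 3]
print([f["id"] for f in q["bulk"]])  # [2]
```

In hardware this mapping is a filter/steering rule; the point is that the mapping is stable and the per-class counters stay separable.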
VLAN / QoS tagging and filtering (Endpoint Controller scope)
What hardware classification can do
  • Apply VLAN tags and priority markings for traffic classes.
  • Steer frames into dedicated RX queues (when supported).
  • Drop/accept based on simple rules to protect critical queues.
Latency-first use
Critical flows should use a stable class mapping and consistent counters (drops, queue occupancy, pause activity). Non-critical flows can use heavier batching/offloads without affecting determinism targets.
Pass criteria: critical class P99 < X and drop reasons remain explainable.
Stop line (scope guard)
This chapter covers Endpoint Controller/MAC offloads and their impact on queues, determinism, and observability. Switch-side ACL/QoS policy and shaping are intentionally excluded and belong to the Switch/TSN pages.
Diagram · Offload decision table (latency-first)
Decision table comparing offloads across benefit, risk, and use-when (latency-first):

Offload     | ✅ Benefit            | ⚠ Risk              | 🎯 Use when (latency-first)
Checksum    | CPU↓ / throughput↑   | capture confusing    | enable in production
TSO         | CPU↓ / big payloads  | P99↑ (batching)      | bulk flows
LRO / GRO   | interrupts↓          | jitter↑ (merging)    | non-real-time traffic
VLAN / QoS  | class steering       | policy drift         | isolate real-time flows
Filtering   | protects queues      | silent drops         | guard rails
Keep the table “keyword-dense” and validate each decision using P99 latency, queue drift, and explainable counters.

H2-6 · PTP Hardware Timestamping: Where the Timestamp Is Taken

Timestamp accuracy is dominated by the tap point and the timestamp path

Two devices can both claim “hardware timestamping” yet show different offsets and jitter. The difference usually comes from where the timestamp is taken and how much queueing and service jitter exists between the event and the software-visible readout.

One-step vs two-step (Controller capability boundary)
One-step
Timestamp correction is applied at transmit time by hardware. This requires a supported path to insert/adjust the timestamp field at the right place.
Best when: the tap point is close to MAC egress and queueing is controlled.
Two-step
The packet is sent first, then a follow-up conveys the precise timestamp. This can improve compatibility, but still depends on the tap point and path jitter.
Best when: system integration/compatibility is prioritized.
Timestamp path error sources (what actually moves offset and jitter)
  • FIFO queueing: variable waiting time before the tap point or before readout.
  • DMA delay: service cadence and descriptor availability change the visibility timing.
  • Clock-domain crossing: synchronization granularity and phase uncertainty add error.
  • Interrupt coalescing: reporting delay changes when software “sees” the event.
Timestamp path budget (template)
Break the path into segments and track fixed delay and jitter separately:
Tap → FIFO → DMA → memory → readout.
Pass criteria: total jitter < X and drift remains stable over window W.
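The budget template lends itself to a simple calculation: fixed delays add linearly, and independent jitter terms are often combined root-sum-square (an assumption; use worst-case addition if the terms are correlated). All segment numbers below are illustrative:

```python
import math

# Sketch of a timestamp path budget across the segments
# Tap -> FIFO -> DMA -> memory -> readout.
segments = {                      # (fixed_delay_ns, jitter_ns), illustrative
    "tap_to_fifo": (40, 5),
    "fifo_to_dma": (120, 20),
    "dma_to_mem":  (200, 15),
    "readout":     (500, 30),
}

fixed_total = sum(d for d, _ in segments.values())        # adds linearly
jitter_rss = math.sqrt(sum(j * j for _, j in segments.values()))  # RSS
print(fixed_total)            # 860
print(round(jitter_rss, 1))   # 39.4
```

The fixed total is calibratable (it becomes a constant offset); the jitter term is what the pass criterion "total jitter < X" constrains.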
Stop line (scope guard)
This chapter covers hardware timestamp tap points and the timestamp path budget. PTP servo algorithms and system-wide timing strategies belong to the Timing & Sync pages.
Diagram · Timestamp tap-point map (MAC → FIFO → DMA → memory)
Block diagram of a PTP packet flowing through MAC → FIFO → DMA → memory, with tap points marked (Tap A at the MAC, Tap B at the FIFO, Tap C at the software-visible readout) and error sources labeled: queueing jitter, DMA service jitter, CDC granularity, and IRQ/readout delay. Timestamp path budget: Tap → FIFO → DMA → Memory → Readout (delay + jitter).
Prefer tap points closer to the MAC/line side when determinism is critical, and quantify each segment’s delay/jitter using a consistent measurement window.

H2-7 · Clocks & Clock Domains: Ref Clock, PLL, CDC, Sync to Timebase

Where “timebase quality” comes from inside a Controller/MAC

Stable timestamps require a clean reference clock, predictable PLL behavior, and well-defined clock-domain crossings. This chapter focuses on the Controller/MAC interior: timebase construction, CDC risk points, and calibration fields used to keep drift and jitter observable.

Timebase building blocks (concept-level, implementation-facing)
  • Free-running counter: the internal time counter driven by a clock source (ref/PLL-derived).
  • Frequency adjust: fine rate trimming to reduce drift against an external time/frequency reference.
  • Phase adjust: offset alignment for fixed biases (e.g., deterministic pipeline delay compensation).
  • Capture/compare hooks: timestamp capture points and scheduled compare events for alignment tasks.
Practical reading
A timebase is “good” when the short-term noise floor (timestamp jitter) and long-term drift (frequency error vs temperature/age) are both measurable and controllable.
Ref clock and PLL (how clock quality maps into timestamp stability)
Short-term: jitter and phase noise
Timestamp noise floors track the timebase clock’s short-term stability. Any added jitter inside PLL/clock trees can raise the observed variance even when average offset looks stable.
Long-term: drift and temperature
Drift often correlates with temperature and supply conditions. A robust design logs temperature, lock status, and timebase adjustment activity to keep drift explainable over long windows.
CDC risk points (interface clock ↔ MAC core ↔ system timebase)
  • CDC granularity: synchronization steps inject quantization-like uncertainty.
  • Crossing FIFOs: occupancy and service cadence can add variable latency.
  • Mixed domains: interface clock (GMII/RGMII/SGMII) and system clock drift differently.
Quick validation pattern
Under controlled traffic, compare timestamp jitter across load steps and correlate with crossing FIFO occupancy.
Pass criteria: jitter remains < X and does not scale sharply with occupancy.
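The validation pattern can be reduced to a slope check across load steps. A sketch; the slope threshold and the measurement points are placeholders:

```python
# Sketch of the CDC validation pattern: step the load, record timestamp
# jitter vs crossing-FIFO occupancy, and flag sharp scaling.
def jitter_scales_sharply(points, slope_limit_ns_per_slot):
    """points: list of (occupancy_slots, jitter_ns), sorted by occupancy."""
    (o0, j0), (o1, j1) = points[0], points[-1]
    slope = (j1 - j0) / (o1 - o0)          # coarse end-to-end slope
    return slope > slope_limit_ns_per_slot

measured = [(2, 8.0), (8, 9.5), (16, 11.0)]   # illustrative load steps
print(jitter_scales_sharply(measured, slope_limit_ns_per_slot=1.0))  # False
```

A jitter floor that is flat across occupancy points at the clock domains themselves; a steep slope points at the crossing FIFOs and their service cadence.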
Calibration and monitoring (deterministic latency + drift fields)
Calibrate fixed terms
Measure deterministic latency terms (pipeline depth, fixed offsets) and treat them as a baseline. Keep these terms separate from jitter so “offset shift” and “noise growth” are not confused.
Monitor drift drivers
Log temperature, PLL lock transitions, link-state changes, and timebase adjustment activity. Drift that cannot be correlated to environmental fields is difficult to debug in production.
Scope guard for SyncE/PTP details
The controller perspective is limited to how an external time/frequency reference is applied via frequency and phase adjustment. Network-wide SyncE jitter templates and PTP servo algorithms belong to dedicated timing pages.
Stop line (scope guard)
This chapter covers Controller/MAC clock domains, timebase construction, CDC risks, and calibration observables. SyncE network jitter templates and PTP servo/network-topology topics are intentionally excluded.
Diagram · Clock domain crossing & timebase (3 domains)
Diagram of three clock domains. PHY interface clock domain: IF clock (RMII/RGMII/SGMII), RX/TX ports, line-side events, with link/lock/error observables. MAC core clock domain: ref clock, PLL (jitter shaping), pipelines/FIFOs. System clock + timebase domain: counter, frequency adjust, phase adjust, plus calibration logs (temperature/lock/drift). CDC bridges between the domains contribute phase uncertainty and service jitter.
Use this map to localize jitter sources: domain clocks set the noise floor, while CDC bridges and FIFO service cadence often dominate load-dependent variance.

H2-8 · Interrupts, Polling, Coalescing: The Hidden Latency Knobs

Throughput can look fine while tail latency collapses

Tail latency is often dominated by service cadence: interrupt storms, polling budgets, and coalescing thresholds shape when packets become visible to software. This chapter focuses on adjustable knobs, their symptoms, and validation patterns — not OS parameter catalogs.

Service models (behavior-level)
Interrupt-driven
Low-load responsiveness can be excellent. Under heavy load, frequent interrupts can fragment CPU time and inflate P99/P999 latency.
Polling-driven
More stable cadence at high load, but requires correct budgets. Too small causes “sawtooth” service; too large can starve other tasks.
Hybrid approach
Switches between interrupt and polling behaviors depending on load to reduce storms while maintaining acceptable responsiveness.
Coalescing knobs (timer / packet / budget) and what they reshape
The three knobs
  • Timer threshold: wait time window before raising an interrupt or delivering a batch.
  • Packet threshold: wait until N packets accumulate before service.
  • Budget: how many packets/bytes are processed per service pass.
Typical effects (latency-first)
Larger timer or packet thresholds often improve CPU efficiency but increase waiting time, inflating P99 latency. Larger budgets can reduce ring pressure but may starve other tasks if CPU time becomes too concentrated.
Validation pattern (no OS deep-dive)
Tune one knob at a time and validate using a layered pass criteria: P50/P99/P999 latency, drop reasons, and CPU utilization.
Pass criteria: P99 stays < X, drops remain explainable, CPU stays < X%.
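The knobs' effect on waiting time can be demonstrated with a toy coalescing model: a batch is delivered when either the packet threshold or the timer threshold fires first. This is a behavioral sketch, not a driver model:

```python
# Toy coalescing model: deliver a batch when N packets have accumulated
# (packet threshold) or T microseconds have elapsed since the first
# waiting packet (timer threshold). Returns per-packet waiting times.
def delivery_latencies(arrivals_us, pkt_thresh, timer_us):
    out, pending, batch_start = [], [], None
    for t in arrivals_us:
        if batch_start is None:
            batch_start = t
        pending.append(t)
        if len(pending) >= pkt_thresh or t - batch_start >= timer_us:
            out += [t - a for a in pending]      # delivered now
            pending, batch_start = [], None
    if pending:                                  # flush at timer expiry
        deliver_at = batch_start + timer_us
        out += [deliver_at - a for a in pending]
    return out

# A lone packet waits out the full timer window:
print(delivery_latencies([0], pkt_thresh=4, timer_us=100))            # [100]
# A burst trips the packet threshold early, so waits stay small:
print(delivery_latencies([0, 1, 2, 3], pkt_thresh=4, timer_us=100))   # [3, 2, 1, 0]
```

This is exactly the P99 trap from the text: at low load the timer threshold dominates, so a "CPU-friendly" window directly becomes per-packet latency.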
Stop line (scope guard)
This chapter explains adjustable service cadence knobs and their symptoms/validation patterns. OS-specific parameter catalogs and driver-implementation details are intentionally excluded.
Diagram · Coalescing trade-off map (knobs → cadence → outcomes)
Diagram: three knobs (timer threshold, packet threshold, budget) shape the service cadence (batch window, ring-recovery pressure vs fairness), which drives the outcomes: P99 tail latency, CPU load (efficiency vs churn), and drops/ring pressure. Typical trade-off: higher thresholds → CPU↓ but P99↑ (validate with P50/P99/P999).
Use the map to tune determinism: adjust thresholds to avoid interrupt storms while keeping coalescing windows small enough to meet P99 targets.

H2-9 · Measurement & Bring-up: What to Probe, What to Log

Bring-up needs an observability loop, not guesswork

Link-up and throughput alone are not enough. Robust bring-up defines what to probe at Controller/MAC boundaries, what counters to log, how to build a minimal test matrix, and how to set pass criteria for latency and timestamps.

Observability layers (Controller-centric)
Layer 1 · Link events
Link up/down, speed/duplex changes, auto-neg completion, pause state transitions.
Layer 2 · Queues and DMA
FIFO over/under, ring occupancy peaks, descriptor errors, DMA faults/timeouts, service cadence counters.
Layer 3 · Time observables
Timestamp error histogram, offset step events, lock transitions, timebase adjustment activity, temperature/voltage correlation.
Must-have counters (organized by failure mode)
Integrity
CRC errors, alignment errors (where applicable), malformed-frame counters.
Congestion and backpressure
Drops, RX overruns, TX underruns, pause frames (rx/tx), retry counters (if exposed).
Descriptor / ring contract
Descriptor ownership errors, wrap/stride errors, DMA faults, timeouts, alignment exceptions.
Service cadence
Interrupt counts, coalescing timer hits, budget exhaustion (concept-level), polling cycles (concept-level).
First triage mapping
CRC rises → treat as integrity path risk; cross-check with load and temperature fields.
Drops + FIFO overrun → treat as queue depth/service cadence risk.
Descriptor/DMA errors → treat as memory contract/alignment risk.
What to probe (Controller/MAC boundaries)
  • Ingress: RX entry counters and timestamp capture path.
  • Queues: RX/TX FIFO watermarks, ring occupancy peaks and dwell time.
  • DMA: completion rates, fault/timeout counters, burst behavior (concept-level).
  • Egress: TX underrun, pause behavior, egress timestamp tap-point.
Why these probes matter
Queue peaks explain burst-induced tail latency. DMA faults explain “looks fine until it stalls”. Tap-point visibility explains load-dependent timestamp drift when timestamps couple to FIFO and service cadence.
Test matrix (methodology, not tool specifics)
Minimal sufficient coverage
  • Path: internal loopback → external loopback → normal link.
  • Load: steady flow and micro-burst injections.
  • Targets: throughput, P50/P99 latency, drops, timestamp sanity.
  • Environment: temperature steps and voltage edges (where relevant).
What each stage proves
Loopback validates internal datapaths. Throughput validates sustained service capacity. Latency validates tail behavior under bursts. PTP sanity validates timestamp tap-point and timebase stability.
Black-box logging fields (field forensics)
  • Timestamp error histogram: buckets (0–50 ns / 50–200 ns / 200 ns–1 μs / >1 μs) with X thresholds.
  • Temperature / voltage: periodic samples + event-trigger snapshots.
  • Link events: up/down, speed change, pause state change, lock transitions (if exposed).
  • Queue peaks: FIFO/ring peak occupancy and dwell window.
  • Drop reason tags: overrun / no desc / underrun / budget / pause (as available).
The goal
Logs must answer “when it got worse” and “what the system state was” without relying on external instruments.
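The timestamp-error histogram field can be sketched as a simple bucketizer using the bucket edges listed above:

```python
# Sketch of the black-box timestamp-error histogram: bucket edges follow
# the 0-50 ns / 50-200 ns / 200 ns-1 us / >1 us scheme.
BUCKETS_NS = [(0, 50), (50, 200), (200, 1000)]   # last bucket is open-ended

def bucketize(errors_ns):
    counts = [0] * (len(BUCKETS_NS) + 1)
    for e in errors_ns:
        for i, (lo, hi) in enumerate(BUCKETS_NS):
            if lo <= e < hi:
                counts[i] += 1
                break
        else:
            counts[-1] += 1                      # > 1 us tail bucket
    return counts

print(bucketize([10, 40, 60, 300, 1500]))  # [2, 1, 1, 1]
```

Counts per bucket are cheap to log continuously, and the Gate 3 criterion below reduces to "tail buckets stay under X% of total".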
Pass criteria (bring-up gates)
Gate 1 · Function
Link stability and event sanity within X events per hour (threshold placeholder).
Gate 2 · Performance
Throughput ≥ X% of target, P99 latency < X, drops = 0 or explainable by reason tags (threshold placeholders).
Gate 3 · Time
Timestamp histogram tail buckets remain under X%; offset step events under X per hour (threshold placeholders).
Stop line (scope guard)
This chapter covers Controller/MAC probing, counters, logging, and test methodology. PHY-side TDR/return-loss and cable diagnostics are intentionally excluded.
Diagram · Bring-up test ladder (Link → Loopback → Throughput → Latency → PTP)
Five-step bring-up ladder, with rollback to the previous step on failure: 1 Link-up (speed, events, pause) → 2 Loopback (CRC, drops, ring) → 3 Throughput (pps, DMA, FIFO) → 4 Latency (P99, queue peaks, coalescing) → 5 PTP sanity (timestamp histogram, tap point, drift). Log fields at every step.
Move forward only when the step’s observables stay stable under both steady and burst loads; roll back one step when a failure appears to localize the layer.

H2-10 · Design Hooks & Pitfalls (Controller-centric)

Pitfalls are best handled as “symptom → first check”

This section consolidates controller-centric pitfalls without drifting into PHY electricals or PCB layout. Each item maps a visible symptom to the first check that narrows the root cause quickly.

Pitfall checklist (symptom → first check)
Ring too large → tail latency explodes
First check: ring occupancy peaks and dwell time; confirm bufferbloat-like growth.
Ring too small → burst drops
First check: drop reason tags under burst injection; correlate with FIFO overruns.
DMA burst vs cache line mismatch → jitter / CPU spikes
First check: descriptor alignment, wrap/stride errors, DMA timeout/fault counters.
Coalescing too aggressive → low-load latency worsens
First check: timer/packet thresholds vs P99; identify waiting-window inflation.
Offload conflicts with capture/diagnostics
First check: whether checksum/TSO/LRO-like features alter what is visible at the capture point.
Timestamp tap point wrong → offset drifts with load
First check: timestamp location coupling to FIFO and service cadence; review histogram tail growth under load.
Descriptor ownership / wrap issues → periodic stalls
First check: descriptor errors and DMA timeouts; confirm ring wrap boundary conditions.
Pause/backpressure misunderstood → “not congested but stuck”
First check: pause frame counters and queue peaks; verify whether backpressure is persistent or bursty.
Stop line (scope guard)
Pitfalls here are Controller/MAC-centric. PHY electrical effects and PCB length-matching/layout topics are intentionally excluded.
Diagram · Pitfall checklist map (symptom → first check)
A grid of pitfall cards (symptom → first check), each routing to one of three check hubs: counters (drops / FIFO / descriptor), logs (events / temperature / histograms), or knobs (ring / coalescing). Example routes: P99 up → ring peak; burst drop → FIFO overrun; jitter spikes → descriptor alignment; slow response → timer window; odd captures → offload state; offset drifts → tap point.
Each pitfall routes to a small set of check hubs. This keeps troubleshooting bounded to Controller/MAC observables before branching into other pages.
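Most first-checks in the table reduce to the same move: take a counter delta across an injected event window and route the dominant delta to a hub. A minimal sketch, with placeholder counter names standing in for whatever your driver actually exposes:

```python
# Minimal sketch: diff counters across an injected event window, then route
# the dominant delta to a check hub. Counter names are placeholders, not
# real driver fields.
def counter_deltas(before, after):
    """Return only the counters that moved during the event window."""
    return {k: after[k] - before[k]
            for k in before if after.get(k, before[k]) != before[k]}

def first_check(deltas):
    """Route to a hub: counters if something moved, otherwise logs/knobs."""
    if deltas.get("rx_fifo_overrun", 0) > 0:
        return "counters: FIFO overrun -> check service interval / coalescing"
    if deltas.get("rx_no_descriptor", 0) > 0:
        return "counters: descriptor starvation -> check ring refill headroom"
    if deltas.get("tx_pause_frames", 0) > 0:
        return "counters: pause frames -> check if backpressure is persistent"
    return "logs/knobs: no counter moved -> inspect event log and settings"
```

Keeping the routing table small is deliberate: it bounds troubleshooting to Controller/MAC observables before branching into other pages.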

H2-11 · Engineering Checklist (Design → Bring-up → Production)

This section turns controller/MAC decisions into three execution gates. Each checklist item is written as a testable contract: Check → How → Pass (threshold placeholder X) → Evidence.

Scope guard: Controller/MAC only. PHY electrical/PCB length matching, magnetics placement, and switch/TSN scheduling are intentionally out of scope here.

Gate A

Design gate — Freeze contracts & budgets before layout

  • Check: Interface contract (RMII/RGMII/SGMII semantics + status reporting).
    How: Document clock source, link state path, and auto-negotiation boundaries.
    Pass: Link-up/down events are deterministic within X retries; no “phantom link” states.
    Evidence: Link-event log + state diagram snapshot.
  • Check: Descriptor ring contract (alignment, ownership, wrap, cache behavior).
    How: Fix ring entry size, cache-line alignment, and DMA burst policy in a single spec page.
    Pass: Descriptor errors = 0 across stress tests; no wrap-related drops.
    Evidence: Driver debug counters + ring dump excerpt.
  • Check: Queue strategy (FIFO/ring depth vs tail latency).
    How: Define burst tolerance and explicit drop reasons (overrun/underrun/timeout).
    Pass: P99 latency < X and drops are classifiable (no “unknown”).
    Evidence: Queue peak histogram + drop-reason breakdown.
  • Check: Offload policy (debug-first vs latency-first vs throughput-first profiles).
    How: Decide default enable/disable for checksum/TSO/LRO/VLAN filtering and document side-effects on capture/visibility.
    Pass: Packet capture remains interpretable; latency profile does not regress beyond X.
    Evidence: A/B profile report + capture notes.
  • Check: PTP timestamp path budget (tap point, FIFO/DMA/jitter contributors).
    How: Write a “timestamp chain” budget: ingress/egress tap → queueing → DMA → memory visibility.
    Pass: Timestamp error tail buckets < X of samples under load.
    Evidence: Timestamp histogram + load annotation.
  • Check: Instrumentation hooks reserved (counters, logs, event triggers).
    How: Lock a must-have field list: CRC, drops, FIFO over/under, descriptor, pause, coalescing, thermal/power events.
    Pass: Field logs can explain every drop class without guessing.
    Evidence: “Black-box schema” v1 + sample export.

Example PNs (controller/MAC anchors — verify features in datasheets):

  • PCIe controllers with PTP focus: Intel i210-AT, Intel I225, Intel i350
  • PCIe→GbE controllers (PTP-capable variants exist): Microchip LAN7430, LAN7431
  • USB→GbE controller (offload anchors): Microchip LAN7800
  • Reference clock XO anchors (25 MHz examples): SiTime SiT1602AI-22-33E-25.000000, Abracon ASFL1-25.000MHZ-EC-T
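The Gate A timestamp path budget can be checked mechanically: sum the per-stage contributions (tap → queueing → DMA → memory visibility) and compare against the placeholder budget X. The stage names and nanosecond values below are illustrative, not measured:

```python
# Minimal sketch of the "timestamp chain" budget from Gate A. Stage values
# are illustrative nanoseconds; the budget is the placeholder X from the
# checklist, fixed here at 250 ns purely for the example.
def timestamp_budget_ok(stages_ns, budget_ns):
    """Return (within budget?, total ns, worst contributor) for the chain."""
    total = sum(stages_ns.values())
    worst = max(stages_ns, key=stages_ns.get)
    return total <= budget_ns, total, worst

ok, total, worst = timestamp_budget_ok(
    {"tap": 8, "queueing": 120, "dma": 40, "mem_visibility": 25},
    budget_ns=250)
```

Tracking the worst contributor makes the review actionable: in this illustrative budget, queueing dominates, which matches the general advice to keep timestamp taps close to the wire.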
Gate B

Bring-up gate — Execute a repeatable test ladder

  • Check: Bring-up ladder is followed (Link → Loopback → Throughput → Latency → PTP sanity).
    How: Freeze the step order and capture artifacts per step (counters + logs + plots).
    Pass: Each step meets its gate threshold X before moving forward.
    Evidence: One-page bring-up report (v1).
  • Check: Must-have counters are readable and meaningful (CRC, drops, FIFO over/under, descriptor, pause, retry).
    How: Correlate counter deltas with injected events (micro-burst, CPU load, link flap).
    Pass: “Symptom → First counter to check” mapping is stable across runs.
    Evidence: Counter snapshot set + correlation notes.
  • Check: Queue peak is captured (not just average throughput).
    How: Record peak ring occupancy and peak FIFO watermark in a fixed time window.
    Pass: No overflow at target burst profile; P99 remains < X.
    Evidence: Peak histogram + burst configuration.
  • Check: Latency metrics are standardized (P50/P99/P999 + measurement tap point).
    How: Lock the denominator: which timestamp, where taken, and how synchronized.
    Pass: Cross-lab deltas < X with the same method.
    Evidence: “Latency definition” page + sample dataset.
  • Check: PTP sanity under load (offset stability + histogram tail).
    How: Compare idle vs sustained traffic; watch for offset step events tied to queueing/coalescing.
    Pass: Offset drift and step rate within X over Y minutes.
    Evidence: Offset plot + histogram + event log.

Example PNs (bring-up friendly ecosystems): Intel i210-AT / I225 / i350 (PTP timestamping focus), Microchip LAN7430/LAN7431 (register-level visibility), TI AM3358 (CPSW timestamping module in industrial SDK ecosystems), NXP i.MX RT10xx (1588/PTP application-note ecosystems).
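The "lock the denominator" item in Gate B deserves one concrete convention. A minimal sketch using nearest-rank percentiles with integer arithmetic, so P50/P99/P999 come out bit-identical across labs (sample values are illustrative microseconds):

```python
# Minimal sketch: one fixed percentile method (nearest-rank) for latency
# reporting. Integer ceiling avoids float-rounding surprises at P999.
def percentile(samples, p_num, p_den=100):
    """Nearest-rank percentile; rank computed with an integer ceiling."""
    ordered = sorted(samples)
    rank = -(-(len(ordered) * p_num) // p_den)  # ceil(n * p) without floats
    return ordered[max(rank - 1, 0)]

latencies_us = list(range(1, 1001))  # stand-in for measured samples
p50 = percentile(latencies_us, 50)
p99 = percentile(latencies_us, 99)
p999 = percentile(latencies_us, 999, 1000)
```

The method itself matters less than freezing it: any percentile definition is fine as long as every lab uses the same one, on timestamps from the same tap point.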

Gate C

Production gate — Prove long-run stability & regression

  • Check: Soak stability (multi-day).
    How: Continuous traffic + periodic PTP sanity; log every event and counter drift.
    Pass: Unexplained drop/step events < X per day.
    Evidence: Soak log + daily summary.
  • Check: Temperature sweep (cold/room/hot) impacts on latency & timestamp tails.
    How: Step temperature, mark steady-state windows, and compare histogram tails under identical traffic.
    Pass: Tail buckets remain within budget X across temperature.
    Evidence: Temp-tagged histogram set.
  • Check: Power margin sweep (voltage corners + events).
    How: Induce controlled dips/noise; correlate with FIFO/desc errors and offset steps.
    Pass: No systematic correlation beyond X.
    Evidence: Power-event log + counter correlation.
  • Check: Regression suite is frozen (no “human memory” dependency).
    How: Lock the test ladder + key stress patterns; version every threshold.
    Pass: New firmware/driver does not break any gate threshold X.
    Evidence: CI-style regression report.
  • Check: Field log schema compatibility (forensics over months/years).
    How: Version log schema; keep backward-readable exports.
    Pass: Cross-version comparison is possible without re-parsing hacks.
    Evidence: Schema changelog + samples.

Production note: Favor parts with stable counter sets, clear timestamp capabilities, and well-documented register behaviors. Example anchors: Intel i350 (multi-port deployments), Intel I225 (TSN-adjacent ecosystems), Microchip LAN7430/LAN7431 (PTP-capable controller family).
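The frozen regression suite in Gate C reduces to a mechanical gate: version the thresholds and fail on any regression. Metric names and limits below are placeholders for your own versioned thresholds:

```python
# Minimal sketch of a versioned regression gate. Metric names and limits are
# placeholders; the point is that thresholds live in versioned data, not in
# anyone's memory.
THRESHOLDS_V1 = {"p99_latency_us": 200.0, "drops_per_1e6": 1.0,
                 "ptp_tail_pct": 0.1}

def regression_gate(measured, thresholds=THRESHOLDS_V1):
    """Return a list of (metric, measured, limit) violations; empty = pass.
    Missing metrics count as violations (no silent gaps in the report)."""
    return [(m, measured.get(m, float("inf")), lim)
            for m, lim in thresholds.items()
            if measured.get(m, float("inf")) > lim]

violations = regression_gate({"p99_latency_us": 180.0,
                              "drops_per_1e6": 2.5,
                              "ptp_tail_pct": 0.05})
```

A new firmware or driver build passes only when the violations list is empty against the pinned threshold version.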

Diagram · 3-Gate Checklist Cards (controller/MAC execution flow)

Three gate cards in sequence, evidence-driven. Gate A (Design): contracts, budgets, hooks (interface, clock, ring, PTP, offloads, logging). Gate B (Bring-up): test ladder, counters, correlation (link, loopback, throughput, latency, PTP, logs). Gate C (Production): soak, temperature, power, regression, schema. All three gates feed the same evidence outputs: counters, logs, reports.

H2-12 · Applications & IC Selection Logic (before FAQ)

The goal is not “pick a brand.” The goal is to select a controller/MAC capability set that can be verified against measurable outcomes: Latency, Timing (PTP), and Throughput.

Scope guard: This logic stays controller-centric. Switch/TSN scheduling parameters, PHY SI/layout rules, and compliance workflows are referenced only as “handoff points”.

A) Application buckets → primary goal → must-have capabilities

Gateway / Edge compute

Primary goal: high throughput + stable tail latency under mixed traffic.
Must-have: DMA robustness, multi-queue visibility, offload profiles that do not hide drops, clear counter taxonomy.
First validation: micro-burst stress → queue peak + drop reason + P99.

Example PNs: Intel i210-AT, Intel I225, Intel i350, Microchip LAN7430, Microchip LAN7800.

PLC / Industrial controller

Primary goal: predictable latency and debuggability across temperature and long uptimes.
Must-have: stable counter set, deterministic event logging, clear interrupt/coalescing knobs, timebase hooks for PTP or system time alignment.
First validation: soak + temp sweep → “unexplained drop” rate.

Example PNs: TI Sitara AM3358, NXP i.MX RT10xx (integrated Ethernet MAC families), Intel i210-AT (PTP-focused ecosystems).

Remote I/O / Field box

Primary goal: fast fault isolation and consistent behavior during link disturbances.
Must-have: explicit drop/overrun classification, link-event traceability, minimal “hidden buffering” features enabled by default.
First validation: link flap + burst injection → counter correlation.

Example PNs: NXP i.MX RT10xx, TI AM3358, Microchip LAN7431 (PCIe→RGMII class).

High-speed imaging / Motion control

Primary goal: timestamp integrity and bounded tail latency under load.
Must-have: hardware timestamping with well-defined tap point, histogram/step observability, queue visibility under bursts.
First validation: PTP sanity under sustained traffic → tail buckets + step events.

Example PNs: Intel i210-AT, Intel i350, Microchip LAN7430.

B) Key specs that actually matter (controller-centric)

Interface & porting

What to check: port count, host interface type (PCIe / USB / integrated), and MAC↔PHY interface contract (RMII/RGMII/SGMII).
How to validate: link event determinism + stable status reporting under flap conditions.
Pass: no ambiguous link states; recovery within X seconds.

Queues, DMA, offloads

What to check: FIFO/ring depth controls, DMA burst behavior, descriptor rules, checksum/TSO/LRO policy.
How to validate: micro-burst tests with queue peak logging + drop reason classification.
Pass: P99 bounded < X and drops remain explainable.

Timing, diagnostics, and forensics hooks

What to check: hardware timestamping capability, tap point clarity, histogram/step observability, black-box event logs.
How to validate: idle vs load PTP sanity; correlate offset steps with queueing/coalescing.
Pass: timestamp tail buckets within budget X under load.

C) Selection decision flow (Latency / Timing / Throughput)

Start with the primary objective, then lock the must-have capabilities and the first validation artifact.

Choose the entry by primary objective. Latency-first: must-have queue peak visibility and a drop reason taxonomy; first validation is P99 + peak histogram. Timing-first (PTP): must-have a clear tap point and TS histogram/steps; first validation is offset + tail buckets. Throughput-first: must-have DMA robustness and a TSO/checksum policy; first validation is pps + underrun/drops. Scope guard: controller/MAC selection logic only (hand off to PHY / Switch-TSN / Compliance pages when needed).

Example PNs by entry (anchors only)

  • Latency-first anchors: Intel i210-AT, Intel I225, Microchip LAN7431
  • Timing-first (PTP) anchors: Intel i210-AT, Intel i350, Microchip LAN7430, TI AM3358, NXP i.MX RT10xx
  • Throughput-first anchors: Intel I225, Intel i350, Microchip LAN7430, Microchip LAN7800

For each PN, the first pass is always the same: confirm timestamp tap point, counter availability, and queue observability in the datasheet + driver documentation.
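The entry-point flow can be captured as a literal lookup, so it survives as executable documentation rather than tribal knowledge. The mapping mirrors the decision flow above; the labels are this page's vocabulary, not vendor features, and it is a checklist, not a product recommendation:

```python
# Minimal sketch of the selection entry-point lookup. Objective keys and
# capability labels mirror the decision flow on this page.
SELECTION = {
    "latency": (["queue peak visibility", "drop reason taxonomy"],
                "P99 + peak histogram"),
    "timing": (["clear timestamp tap point", "TS histogram/steps"],
               "offset + tail buckets"),
    "throughput": (["DMA robustness", "TSO/checksum policy"],
                   "pps + underrun/drops"),
}

def select_entry(objective):
    """Return (must-have capabilities, first validation artifact)."""
    must_have, first_validation = SELECTION[objective]
    return must_have, first_validation
```

During part screening, each candidate PN is then scored against the must-have list before any datasheet deep dive.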


H2-13 · FAQs (Controller/MAC Troubleshooting)

These FAQs close out long-tail troubleshooting without expanding the main body. Each answer follows a fixed 4-line, measurable format: Likely cause / Quick check / Fix / Pass criteria (threshold placeholders X, Y, Z).

Scope guard: Interface semantics, FIFO/DMA, offloads, timestamping tap points, and measurement methodology only. PHY electrical/PCB SI, switch/TSN scheduling, and protocol-stack tuning are out of scope here.

Throughput is fine but P99 latency explodes under bursts — first knob to check?

Likely cause Queue depth/ring growth during micro-bursts (bufferbloat) or interrupt coalescing delaying service.

Quick check Capture ring occupancy peak + FIFO watermark in a fixed window; compare P50 vs P99; note coalescing timer/packet thresholds.

Fix Reduce effective queueing (smaller rings / lower watermarks), tighten coalescing (smaller timer or packet threshold), and ensure drops are explicit (reason-tagged) rather than silent delay.

Pass criteria Under burst load Y pps for Z s: P99 latency < X ms and ring peak stays below X% of capacity.
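The pass criteria combine two conditions, and it helps to evaluate them as one predicate. A minimal sketch with placeholder thresholds (1 ms P99 and 80% ring peak stand in for X):

```python
# Minimal sketch of the burst pass criteria: P99 below the latency limit AND
# ring peak below the occupancy limit. Thresholds are placeholders for X.
def burst_pass(p99_ms, ring_peak, ring_capacity, p99_limit_ms, peak_limit_pct):
    """True only when both the latency and the ring-occupancy gates hold."""
    peak_pct = 100.0 * ring_peak / ring_capacity
    return p99_ms < p99_limit_ms and peak_pct < peak_limit_pct

# Example: P99 = 0.8 ms, ring peaked at 192 of 256 entries (75%).
ok = burst_pass(0.8, 192, 256, p99_limit_ms=1.0, peak_limit_pct=80.0)
```

Requiring both conditions catches the bufferbloat case where P99 still looks acceptable but the ring is one burst away from overflow.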

Enabling TSO increases jitter — TSO segmentation point or queueing?

Likely cause Segmentation happens late (near DMA/MAC), creating bursty small-packet emission, or increases per-queue contention that amplifies tail jitter.

Quick check Compare TX queue peak, TX underrun, and pacing intervals with TSO on/off; inspect whether real-time flows share the same queue as bulk traffic.

Fix Disable TSO for latency-critical traffic (profile/queue split), or keep TSO but enforce smaller queueing windows (coalescing and ring limits) to prevent burst amplification.

Pass criteria With target traffic mix: jitter (P99-P50) < X µs and TX queue peak does not exceed X% during sustained load Y.

PTP offset drifts although PHY claims timestamping — wrong tap point?

Likely cause Timestamp is taken at a different point than assumed (MAC vs PHY vs PCS), or the driver reads the wrong timestamp source under load (queueing makes it visible as drift).

Quick check Log timestamp histogram (tail buckets) vs load; verify the reported tap point by matching the timestamp to ingress/egress events and confirming which register/path provides it.

Fix Force a single timestamp source (MAC-side or PHY-side) and ensure the driver uses the matching path consistently; reduce queueing variability (coalescing/queue limits) to prevent tap-point ambiguity.

Pass criteria Under traffic load Y: offset drift < X ns over Z min and tail bucket > X ns stays below X%.

RX drops with no CRC errors — FIFO overrun or descriptor starvation?

Likely cause RX FIFO overruns (service too slow) or RX ring runs out of descriptors (refill starvation), both of which drop frames before CRC would fail.

Quick check Compare RX FIFO overrun vs RX no-descriptor/ring empty counters; log ring occupancy peak and ISR/coalescing thresholds during the drop window.

Fix If FIFO overruns dominate: reduce coalescing latency and shorten service interval; if descriptor starvation dominates: increase refill headroom (ring size or refill policy) and ensure descriptors are cache-line aligned.

Pass criteria At load Y pps: RX drops < X per 10⁶ packets, and FIFO-overrun + no-descriptor counters remain at 0.
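The quick check reduces to comparing two counters over the drop window. A hedged sketch; the counter names are placeholders for whatever fields your driver exposes:

```python
# Minimal sketch of the RX-drop triage from this FAQ: decide whether FIFO
# overrun or descriptor starvation dominates a drop window. Counter names
# are placeholders.
def classify_rx_drops(fifo_overrun, no_descriptor):
    """Map the dominant counter delta to the matching first fix."""
    if fifo_overrun == 0 and no_descriptor == 0:
        return "no hardware-visible RX loss in this window"
    if fifo_overrun >= no_descriptor:
        return "FIFO overrun dominates -> shorten service interval/coalescing"
    return "descriptor starvation dominates -> increase ring refill headroom"
```

The split matters because the fixes pull in opposite directions: overruns call for faster servicing, starvation calls for more refill headroom.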

Link is stable but intermittent stalls — pause frames or backpressure?

Likely cause Flow-control pause/backpressure stops transmission (or drains RX servicing) without dropping link; stalls appear as “traffic freezes” with no CRC faults.

Quick check Read TX/RX pause frame counters, queue occupancy, and backpressure events during stall windows; check whether stalls correlate with queue high-watermarks.

Fix For diagnosis, disable flow control temporarily; then tune pause thresholds/watermarks or adopt shallower queues so backpressure triggers less often and resolves faster.

Pass criteria No stall longer than X ms over Z min at load Y, and pause/backpressure events stay below X per minute.

Different OS shows different latency — interrupt coalescing mismatch?

Likely cause Different driver defaults for interrupt moderation/coalescing, queue mapping, or polling budget lead to different service intervals and tail latency.

Quick check Compare coalescing timer/packet thresholds, number of queues used, and ISR rate. Measure P50/P99 under identical traffic and pin CPU affinity consistently.

Fix Align coalescing and queue settings across OS/driver profiles; prioritize smaller coalescing windows for latency-first modes and validate with the same measurement tap point.

Pass criteria Under the same traffic Y: P99 delta between OSes < X µs, and ISR/coalescing settings match within X%.

SGMII links up but negotiation looks wrong — in-band status mismatch?

Likely cause PCS “in-band status” encoding/decoding mismatch or one side forcing speed/duplex while the other expects auto-negotiation semantics.

Quick check Read PCS status for resolved speed/duplex, check whether in-band status is enabled, and compare link partner advertisement vs forced settings across both ends.

Fix For debug, force a known good mode on both ends (speed/duplex + in-band status policy). Then re-enable auto-negotiation with consistent in-band status handling.

Pass criteria Resolved mode matches expected (speed/duplex) across Z re-links, and mismatch events = 0 during Y minutes of traffic.

Checksum offload makes captures “look broken” — how to verify correctly?

Likely cause Captures occur before hardware inserts/verifies checksums; software displays “bad checksum” although on-wire frames are correct (or checksum status is carried out-of-band).

Quick check Use the driver’s RX checksum status flags/counters; compare captures with checksum offload enabled vs disabled; validate on-wire integrity using a known-good external capture point.

Fix For debugging, disable checksum offload (or force verification in software) and rely on hardware-reported checksum status once validated; document the capture tap point in the test report.

Pass criteria Captures remain interpretable (no false “broken” checksums) and checksum error counters < X per 10⁶ packets over Z minutes at load Y.

Timestamp is stable in lab but shifts with temperature — timebase vs CDC?

Likely cause Timebase frequency drifts with temperature, or clock-domain crossing (CDC) adds temperature-dependent delay/jitter visible in timestamp tails.

Quick check Log offset vs temperature alongside timestamp histogram tails; compare idle vs load across temp steps; flag deterministic “steps” versus smooth drift.

Fix Improve timebase stability (calibration/compensation hooks) and reduce CDC sensitivity by minimizing variable queueing and ensuring consistent clock-domain mapping for timestamp capture.

Pass criteria Across temperature range: drift slope < X ns/min, and tail bucket > X ns stays below X% under load Y.
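The drift slope is worth computing the same way every time so temperature sweeps are comparable. A minimal ordinary-least-squares sketch over (time, offset) samples; the values are illustrative:

```python
# Minimal sketch: fit offset-vs-time with ordinary least squares to get the
# drift slope in ns/min, then compare against the placeholder budget X.
def drift_slope_ns_per_min(times_min, offsets_ns):
    """Least-squares slope of offset (ns) over time (minutes)."""
    n = len(times_min)
    mt = sum(times_min) / n
    mo = sum(offsets_ns) / n
    num = sum((t - mt) * (o - mo) for t, o in zip(times_min, offsets_ns))
    den = sum((t - mt) ** 2 for t in times_min)
    return num / den

# Example: offset grows 5 ns every minute (smooth drift, not a step).
slope = drift_slope_ns_per_min([0, 1, 2, 3, 4], [100, 105, 110, 115, 120])
```

A smooth slope points at the timebase; residuals that jump in discrete steps point at CDC or queueing, which is why the FAQ asks you to flag steps separately from drift.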

DMA errors only at high load — cache line/alignment or burst length?

Likely cause Descriptor/data buffers are not cache-line aligned or coherent, or DMA burst/outstanding settings exceed the memory subsystem’s safe envelope under peak load.

Quick check Check DMA error counters vs traffic rate; validate descriptor alignment and buffer boundaries; log whether errors appear at a repeatable throughput threshold.

Fix Enforce cache-line alignment for descriptors/buffers and use a conservative DMA burst/outstanding configuration; verify memory coherency handling is consistent for both RX and TX paths.

Pass criteria DMA error counters remain at 0 up to load Y for Z minutes; throughput stays within X% of target.

Low-latency mode reduces throughput too much — what minimum FIFO target?

Likely cause Latency-first settings make queues too shallow and service too frequent, increasing overhead and reducing batching efficiency (especially for small packets).

Quick check Observe throughput vs CPU/ISR rate; check TX/RX underrun/overrun counters; log queue occupancy distribution (not just peak) to see if batching collapsed.

Fix Set a minimum FIFO/ring headroom target (small but non-zero), and use “tight coalescing” rather than “no batching” (e.g., small timer + modest packet threshold).

Pass criteria Throughput ≥ (100 − X)% of target while P99 latency < Y ms; underrun/overrun counters remain at 0.

Counters show drops but app sees none — where is loss hidden (driver ring)?

Likely cause Drops occur in a layer not directly visible to the application (e.g., ring-level drop recovered by higher-layer retries, or drops counted as “queue discard” before delivery).

Quick check Compare per-queue driver counters (ring drops, no-descriptor, FIFO overrun) against per-flow observations; confirm whether drops align with burst windows and ring occupancy peaks.

Fix Make loss explicit and attributable: enable reason-tagged drop accounting, reduce burst-induced ring starvation (ring headroom + tighter servicing), and separate latency-critical traffic from bulk queues.

Pass criteria Drops are fully explainable by counters (no “unknown loss”), and ring-level drops < X per 10⁶ packets under burst profile Y for Z s.
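The "no unknown loss" pass criterion is a reconciliation: per-reason drop counters must fully explain the observed loss. A minimal sketch with placeholder reason names:

```python
# Minimal sketch of drop-accounting reconciliation: sum per-reason counters
# and subtract from total observed loss; the remainder is "unknown loss",
# which the pass criteria require to be zero. Reason names are placeholders.
def reconcile_drops(total_lost, reasons):
    """Return explained vs unknown loss for the measurement window."""
    explained = sum(reasons.values())
    return {"explained": explained, "unknown": total_lost - explained}

r = reconcile_drops(15, {"ring_drop": 9, "fifo_overrun": 4,
                         "queue_discard": 2})
```

A nonzero "unknown" bucket is itself a finding: it means a layer is dropping without a reason tag, which is exactly the visibility gap this FAQ describes.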