Automotive Ethernet PHY (100/1000BASE-T1): TC10, EMC, ASIL Hooks

Core idea

Automotive Ethernet T1 PHY reliability is won by closing three loops: TC10 wake robustness, EMC-shaped transmit, and ASIL-grade evidence. This page turns each loop into measurable checks and pass/fail criteria so link stability, EMI compliance, and diagnosability can be proven—not guessed.

What this page delivers

This page focuses on 100/1000BASE-T1 PHY link robustness in vehicles—specifically TC10 sleep/wake, EMC-shaped transmit, and ASIL-friendly diagnostics—so design, validation, and troubleshooting can be executed with repeatable evidence.

H2-1. Definition & Where It Fits (100/1000BASE-T1 PHY in the vehicle)

Automotive Ethernet PHYs translate a controller/MAC interface into a single-twisted-pair in-vehicle link. The engineering goal is not only “link up”, but stable margins with evidence across harness variation, EMC constraints, low-power wake, and safety diagnostics.

Positioning: what problem is being solved
  • Physical link reality: in-vehicle wiring (connectors, branches, shields) creates reflections and common-mode paths that can destabilize training and increase burst errors.
  • EMC constraint: “works on a bench” may still fail radiated limits; transmit shaping and port co-design are required to reduce peaks without collapsing margin.
  • Low-power behavior: TC10 sleep/wake adds state transitions and timing windows; robust wake must handle noisy environments without wake storms.
  • Safety observability: ASIL-friendly hooks depend on counters, fault reasons, and consistent evidence logs—not just a link LED.
100BASE-T1 vs 1000BASE-T1: the practical differences that matter

The headline rate is not the only decision lever. For bring-up and production repeatability, the differentiators are margin sensitivity, EMC shaping headroom, and how harness variability shifts the training boundary.

  • 100BASE-T1: often easier to stabilize on diverse harnesses; EMC is still real, but the link typically tolerates more wiring variance before retrains appear.
  • 1000BASE-T1: higher throughput often means tighter constraints on port co-design (layout, protection parasitics, common-mode control) and tuning discipline for shaping vs margin.
  • Decision principle: select by system constraints first (harness, EMC limits, wake behavior, diagnostic evidence needs), then map to the minimum viable link rate.
Scope guard (this page)
In scope: TC10, EMC-shaped TX, ASIL hooks

Coverage stays at the PHY and its co-design boundaries (port, harness, wake, diagnostics). Topics below are intentionally out of scope here and should be handled on their own pages.

  • Out of scope: TSN (802.1 time scheduling/shaping), PTP/SyncE deep dives, industrial protocol stacks (PROFINET/EtherCAT/CIP), PoDL/PoE power classes.
  • Allowed mention: one-sentence constraints + link to the dedicated page (no tutorials inside this page).
Key terms (minimal, PHY-boundary only)
PHY vs MAC
The PHY handles signaling, training, and error visibility at the physical/PCS boundary; the MAC/controller handles frames, queues, and host interfacing.
TC10
The OPEN Alliance sleep/wake-up specification (TC10): a low-power state model where wake entry/exit timing and wake noise immunity become first-class link requirements.
EMC-shaped transmit
PHY-side knobs (edge/drive, pre-emphasis/EQ, common-mode control) used to reduce radiated peaks while maintaining a stable training boundary.
ASIL hooks
Diagnostics that make link health observable: fault reasons, counters, interrupt causes, and evidence fields for consistent logging.
Diagram — Vehicle Domain Link Map (PHY boundary + three main hooks)
[Figure: Vehicle Domain Link Map. A domain ECU (MAC/controller, MAC IF, 100/1000BASE-T1 PHY) connects over a single-twisted-pair STP harness (connectors, branches, shield paths) to a camera/sensor ECU (100/1000BASE-T1 PHY, MAC IF, link consumer/bridge). Main harness risks: reflection and common-mode. Engineering hooks: TC10 sleep/wake windows, EMC TX shaping knobs, ASIL diagnostic evidence chain. Out of scope here: TSN, PTP, PoDL, protocol stacks (handled on dedicated pages).]

The diagram intentionally keeps only PHY-boundary hooks: TC10 behavior, EMC transmit shaping knobs, and ASIL-friendly diagnostics evidence.

H2-2. Link Bring-up State Machine (Power-up → Link → Normal)

Bring-up must be driven by state + evidence. Each state below defines what must be true, what can be observed (interrupt reasons, MDIO status, counters), and the fastest next action when the link stalls or flaps.

Why a state machine matters (field-proof bring-up)
  • Avoid false diagnoses: a configuration/clock problem often looks like “training failure” unless states are separated.
  • Make failures comparable: production and field logs become actionable only when counters and state transitions are consistent.
  • Connect to main hooks: TC10 mainly affects transitions and recovery; EMC shaping affects training boundary and burst errors; ASIL hooks define evidence fields.
Pre-link prerequisites (the minimum set that prevents blind alleys)
  • Power rails: correct ramp + stable reset release; eliminate brownout oscillation during first training attempt.
  • Reference clock: stable and within spec; treat marginal clock as a primary cause for intermittent training or link flaps.
  • Configuration latch: straps/MDIO values must be read back and recorded as a baseline for correlation across units and harnesses.
  • Port co-design sanity: protection and common-mode parts must not introduce excessive mismatch; verify the PHY sees a consistent port environment.
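The configuration-latch prerequisite above can be sketched as a readback diff against a golden profile. This is a minimal illustration only: the field names and values (master_slave, tc10_enable, tx_profile_id) are hypothetical, and a real profile would follow the register map in the PHY datasheet.

```python
# Hypothetical field names and values; real ones come from the PHY datasheet.
GOLDEN_CONFIG = {
    "master_slave": 0x1,
    "tc10_enable": 0x1,
    "tx_profile_id": 0x3,
}


def diff_config(readback, golden=GOLDEN_CONFIG):
    """Return {field: (expected, actual)} for every field that differs.

    A field missing from the readback is reported with actual=None.
    """
    mismatches = {}
    for name, expected in golden.items():
        actual = readback.get(name)
        if actual != expected:
            mismatches[name] = (expected, actual)
    return mismatches


# Example: one unit latched the wrong TC10 strap.
unit_readback = {"master_slave": 0x1, "tc10_enable": 0x0, "tx_profile_id": 0x3}
print(diff_config(unit_readback))
```

Logging the returned dictionary per unit gives the baseline needed to correlate "training fails only on some units" with a strap or write-sequence mismatch.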
State-by-state bring-up guide (entry/exit/observables/signature/first action)
RESET
Entry: reset asserted or power-on reset active.
Exit: reset deasserted with rails + clock stable for ≥ X ms.
Observables: reset pin level; basic status shows “not ready”; no meaningful counters yet.
Failure signature: reset bounce / brownout oscillation → repeated partial bring-ups.
First action: validate rail stability + reset release timing; log the exact release moment and any droop events.
CONFIG (straps / MDIO)
Entry: reset released; strap latch window or MDIO configuration start.
Exit: configuration readback matches intended profile; interrupts cleared; baseline counters zeroed.
Observables: MDIO readback; strap status; interrupt cause register after clear.
Failure signature: “training fails” only on some units → mismatched strap pull or MDIO write sequence.
First action: always read back and log config; standardize a single golden config and diff against it.
INIT
Entry: clocks locked internally; port biasing active; link attempt begins.
Exit: PHY declares readiness to train (pre-train checks passed).
Observables: “ready” status bit; interrupt reason if pre-checks fail.
Failure signature: immediate error before training → clock/rail/config issue masquerading as link problem.
First action: confirm readiness flags and capture the first failing reason code (no guessing).
TRAIN
Entry: training starts (equalization / adaptation / convergence attempts).
Exit: convergence passes quality thresholds (e.g., quality metric ≥ X, no timeouts).
Observables: training progress/status; timeouts; error counters (should remain within X during training).
Failure signature: passes on bench but times out on harness → reflection/common-mode environment moves the boundary.
First action: capture training result codes + time-to-converge; avoid changing multiple knobs at once.
LINK_UP
Entry: training complete; link reported up.
Exit: stable counters and no retrain within Y seconds; baseline established.
Observables: link status; CRC/error counters; “retrain” events; interrupt reasons.
Failure signature: link up but unusable → burst errors, repeated retrains, or counter explosions under load.
First action: define a baseline window (Y seconds) and record counters per 1k frames or per minute (consistent denominators).
NORMAL
Entry: link stable with a known-good baseline; operating traffic present.
Exit: none; transitions occur to TC10 or recovery paths as designed.
Observables: health counters, temperature/voltage events, TC10 entry/exit events.
Failure signature: periodic flaps (minutes apart) → environmental coupling, recovery aggressiveness, or wake noise.
First action: correlate counter bursts with events (temp, rail noise, wake, EMC test steps) using a unified log schema.
ERROR ↔ RECOVERY
Entry: fault reason asserted (timeout, severe error burst, loss of lock, explicit remote fault).
Exit: recovery completes and retraining returns to LINK_UP without storm behavior.
Observables: fault reason codes; retry counters; backoff timers; retrain counts per hour.
Failure signature: retry storm → recovery criteria too aggressive or wake/noise repeatedly triggers transitions.
First action: log “fault reason + state transition timeline”; enforce a storm guard (rate limit) with pass criteria X.
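One way to keep recovery from turning into a retry storm is a capped exponential backoff between retrain attempts. The sketch below is an assumption about how such a timer could look, not a vendor algorithm; base_ms and cap_ms are illustrative defaults standing in for the X placeholders.

```python
def backoff_delay_ms(attempt, base_ms=100, cap_ms=5000):
    """Delay before retry number `attempt` (0-based).

    The delay doubles on each attempt and is clamped at cap_ms.
    base_ms and cap_ms are illustrative values, not recommendations.
    """
    if attempt < 0:
        raise ValueError("attempt must be >= 0")
    return min(base_ms * (2 ** attempt), cap_ms)


# Delay schedule for the first eight recovery attempts.
schedule = [backoff_delay_ms(n) for n in range(8)]
print(schedule)
```

The retry counter would normally reset once the link has held a stable baseline window, so that a single old fault does not penalize later recoveries.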
Diagram — Bring-up State Machine (with observability hooks)
[Figure: Bring-up State Machine. RESET → CONFIG (straps/MDIO) → INIT → TRAIN (converge ≤ X ms) → LINK_UP (baseline window Y) → NORMAL (traffic + events), with ERROR (fault reason code) and RECOVERY (retry + backoff + storm guard) loops. Observability at each state: INT reason, MDIO status, counters. Key pass evidence: retrain/hour ≤ X, errors/1k ≤ X.]

The same state machine is reused later for TC10, EMC shaping, ASIL diagnostics, and troubleshooting: every symptom must map to a state and an evidence set.
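To make "every symptom must map to a state" checkable, a logged state timeline can be validated against a transition table. The table below is an assumed reading of the states described above (TC10 transitions and power-cycle resets are deliberately left out), so treat it as a sketch to adapt rather than a definitive model.

```python
from enum import Enum, auto


class State(Enum):
    RESET = auto()
    CONFIG = auto()
    INIT = auto()
    TRAIN = auto()
    LINK_UP = auto()
    NORMAL = auto()
    ERROR = auto()
    RECOVERY = auto()


# Assumed legal transitions; adjust to the actual PHY and recovery policy.
ALLOWED = {
    State.RESET: {State.CONFIG},
    State.CONFIG: {State.INIT, State.ERROR},
    State.INIT: {State.TRAIN, State.ERROR},
    State.TRAIN: {State.LINK_UP, State.ERROR},
    State.LINK_UP: {State.NORMAL, State.ERROR},
    State.NORMAL: {State.ERROR},
    State.ERROR: {State.RECOVERY},
    State.RECOVERY: {State.TRAIN, State.ERROR},
}


def first_illegal_transition(timeline):
    """Return the first (previous, next) pair not in ALLOWED, or None."""
    for prev, nxt in zip(timeline, timeline[1:]):
        if nxt not in ALLOWED[prev]:
            return (prev, nxt)
    return None


S = State
good = [S.RESET, S.CONFIG, S.INIT, S.TRAIN, S.LINK_UP, S.NORMAL,
        S.ERROR, S.RECOVERY, S.TRAIN, S.LINK_UP]
bad = [S.RESET, S.CONFIG, S.TRAIN]  # skips INIT
print(first_illegal_transition(good))
print(first_illegal_transition(bad))
```

An illegal pair in a field log usually means either a missing log event or a state the firmware never reported, both of which are worth fixing before tuning anything.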

H2-3. Signal Integrity on Single-Twisted Pair (Reach, harness effects, reflections)

The in-vehicle STP harness is not a uniform transmission line: connectors, branches, and shielding paths create discontinuities that shift the training boundary. The practical goal is to prevent reflections and mode conversion from landing inside the PHY’s effective window, which otherwise leads to timeouts, burst errors, and retrain storms.

Key harness risks that reduce PHY stability (no theory detours)
  • Stub (drop line): a short branch behaves like a reflection generator; if the return timing overlaps the PHY’s effective window, training becomes fragile.
  • T-branch / multi-branch: every branch point adds impedance steps; multiple reflections can stack and move the link across the stability cliff.
  • Connector discontinuity: contact geometry and shield transitions introduce local mismatch that may be “invisible” on a bench but destabilizing in-vehicle.
  • Return loss (reflection strength): a practical indicator that the channel is deviating from its expected impedance profile.
  • Mode conversion (DM → CM leakage): asymmetry, shield bonding choices, and mismatched parasitics can convert differential energy into common-mode, hurting both EMC and margin.
100 vs 1000: what becomes tighter (engineering, not marketing)
  • 1000BASE-T1 typically has a narrower stability margin against discontinuities (branch points, connector mismatch, port parasitics), so the same harness shift can trigger retrains sooner.
  • 100BASE-T1 often tolerates more harness variance before training fails, but severe mode conversion can still cause burst errors and EMC peaks.
  • Practical implication: channel variability must be treated as a first-class input; a “golden bench cable” is not a sufficient bring-up reference.
Harness & connector rules checklist (design → bring-up → production)

Each rule is written as a constraint + symptom hook, so failures map back to the bring-up state machine (TRAIN / LINK_UP / NORMAL).

Rule 1 — Stub length ≤ X (placeholder)
Why: stub reflections can return inside the effective window and destabilize convergence.
Quick verify: training time distribution widens; retrain/hour rises; errors appear in bursts.
Rule 2 — Avoid extra T-branches and hidden spur additions
Why: branch points create stacked reflections and shift the training boundary under temperature and layout changes.
Quick verify: link works on bench but fails on vehicle harness; retrain events correlate with harness configuration.
Rule 3 — Keep connector transitions consistent (one “stack-up”)
Why: connector geometry changes local impedance and can turn marginal links into intermittent ones.
Quick verify: only one node or one harness batch shows errors; swapping connectors moves the problem.
Rule 4 — Treat shielding and return paths as part of the channel
Why: poor shield bonding increases mode conversion and injects common-mode noise into the PHY.
Quick verify: EMC steps trigger link flaps; burst errors correlate with chassis events or wiring reroutes.
Rule 5 — Keep the port environment symmetric (minimize Δparasitics)
Why: asymmetry converts differential energy into common-mode, reducing margin and raising emissions.
Quick verify: changing TVS/CMC vendor worsens BER even with the same footprint; errors appear immediately after the change.
Rule 6 — Avoid routing harness near strong noisy bundles when possible
Why: injected common-mode noise can push the channel over the edge and cause periodic retrains.
Quick verify: failures appear only in vehicle integration; rerouting changes the error rate without PHY config changes.
Rule 7 — Establish a “baseline harness” and log its identity
Why: without a baseline, “improvements” can be statistical artifacts from changing channels.
Quick verify: compare training time + retrain/hour + errors/1k against the baseline on each test station.
Rule 8 — Validate across temperature and supply corners
Why: edge rate and thresholds drift; a reflection timing that is “outside the window” can move inside it.
Quick verify: errors appear only at hot/cold; retrain/hour rises when rails are noisy or slightly low.
Rule 9 — Use a consistent counter denominator (per 1k frames / per minute)
Why: inconsistent windows create false stability/instability conclusions.
Quick verify: lock a baseline window Y seconds and compare errors/1k and retrain/hour across runs.
Rule 10 — Change one knob at a time (channel vs PHY tuning)
Why: channel changes and PHY tuning changes look similar in symptoms; mixing them blocks root-cause isolation.
Quick verify: keep harness constant while tuning, then keep tuning constant while swapping harness variants.
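Rule 9 asks for consistent denominators. A minimal sketch of the two normalizations used throughout this page follows; the function names mirror the errors_per_1k and retrain_per_hour fields, and the input counts are assumed to come from raw PHY or MAC counters sampled over a known window.

```python
def errors_per_1k(error_count, frame_count):
    """Errors normalized to 1000 received frames."""
    if frame_count <= 0:
        raise ValueError("frame_count must be positive")
    return 1000.0 * error_count / frame_count


def retrain_per_hour(retrain_count, window_s):
    """Retrain events normalized to one hour, from a window in seconds."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return 3600.0 * retrain_count / window_s


# 5 errors in 20,000 frames; 2 retrains in a 30-minute window.
print(errors_per_1k(5, 20_000))
print(retrain_per_hour(2, 1800))
```

Rejecting a zero denominator outright is deliberate: an empty window should show up as a logging problem, not as a perfect score.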
Fast mapping: risk → symptom → observable → fix (PHY-boundary only)
Case 1 — Bench OK, vehicle harness fails
Risk: added branch / longer spur / different connector stack-up moves reflection timing.
Symptom: TRAIN timeouts or repeated convergence attempts.
Observable: train_time_ms ↑, train_attempts ↑, retrain events appear immediately after plug-in.
Fix: remove hidden branches; enforce stub ≤ X; standardize harness variant and retest on the same baseline.
Case 2 — Only one node is “fragile”
Risk: local connector mismatch or a longer drop line on that node.
Symptom: LINK_UP but burst errors under load; retrain spikes on that endpoint.
Observable: errors/1k higher on one endpoint; swapping harness/connector moves the failure.
Fix: unify connector stack-up; shorten drop; verify port symmetry and shield termination at that node.
Case 3 — Errors appear only at hot/cold
Risk: drift changes edge timing and thresholds so a reflection moves into the effective window.
Symptom: NORMAL is stable at room temp but shows periodic burst errors at corners.
Observable: burst_flag asserted; errors correlate with temp_c changes; retrain/hour rises at corners.
Fix: tighten harness topology; increase margin via port symmetry and controlled shielding; re-baseline across corners.
Case 4 — EMC step triggers link flaps
Risk: mode conversion / common-mode injection pushes the channel over the stability cliff.
Symptom: NORMAL → ERROR → RECOVERY loops during specific EMC stimuli.
Observable: fault reason codes align with EMC steps; error bursts cluster in time windows.
Fix: stabilize shield bonding and return paths; reduce asymmetry; then tune TX shaping with a locked harness baseline.
Case 5 — Protection vendor change makes BER worse
Risk: Δparasitics (Cdiff / mismatch) increases mode conversion and local mismatch at the port.
Symptom: immediate shift in errors/1k after BOM change; training becomes slower.
Observable: train_time_ms ↑; errors rise even with unchanged harness; unit-to-unit variance increases.
Fix: enforce symmetry constraints; qualify parts by channel impact (not only ESD energy); re-run baseline with the same harness.
Case 6 — Retrain storms after harness reroute
Risk: increased coupled noise and altered shield return paths change common-mode behavior.
Symptom: ERROR ↔ RECOVERY repeats; link never settles into a stable baseline.
Observable: retrain/hour exceeds X; failures cluster when nearby loads switch.
Fix: restore routing separation; improve shield bonding strategy; apply a storm guard and re-qualify across events.
Diagram — Harness Reflection Map (trunk, branches, connectors, effective window)
[Figure: Harness Reflection Map. A trunk between ECU + PHY and a remote ECU (both 100/1000BASE-T1) with two connectors (CON) and two branch stubs (length X). A reflection-return arrow from a branch point shows whether the return lands inside the PHY effective window (timing margin); overlap with the window means a fragile link. A dashed rail marks the common-mode (CM) path and mode-conversion risk.]

Use this map to classify failures before tuning: first lock the harness topology (stubs/branches/connectors), then evaluate PHY settings against a stable baseline channel.

H2-4. Clocking & Latency Hooks inside the PHY (what matters for automotive)

Clocking affects link stability primarily through lock behavior (PLL/CDR) and the repeatability of internal boundaries. The actionable objective is to expose measurable hooks so training time, burst errors, and latency variation can be correlated to events.

Clock inputs (what to treat as first-class risk)
  • REFCLK stability: marginal reference clock frequently appears as intermittent training or periodic flaps rather than a clean fail.
  • Noise coupling path: clock routing and return-plane integrity can inject supply/EMI noise into PLL lock behavior.
  • Temperature dependency: clock source and internal thresholds drift; correlation fields (temp_c, vdd_mv) must be logged alongside link metrics.
Internal clock domains (minimal model for debugging)
  • REFCLK → PLL → TX path: transmit shaping and symbol timing rely on PLL stability; lock transitions can change observed error patterns.
  • Line → CDR → RX path: the receiver CDR must remain locked under harness and noise conditions; lock instability often appears as burst errors.
  • PCS → MAC IF boundary: monitoring should be anchored at PHY evidence (status/counters), not inferred from higher-layer traffic behavior.
What to measure (hooks that enable repeatable correlation)
Board-level hooks
  • REFCLK test pad: probeable reference clock point near the PHY (correlate with training outcomes).
  • Clean return path: preserve reference plane continuity so clock quality reflects the source, not layout artifacts.
PHY observability (MDIO / INT / counters)
  • Lock states: pll_lock, cdr_lock, and first fault reason code.
  • Training evidence: train_time_ms, train_attempts, timeout_count.
  • Stability evidence: errors_per_1k, retrain_per_hour, burst_flag (threshold X placeholders).
Recommended correlation fields (log schema)

Use a stable denominator and always log the same context fields alongside errors:

  • Context: harness_id, connector_stack, topology_variant
  • Environment: temp_c, vdd_mv, tc10_event, emc_step
  • Evidence: pll_lock, cdr_lock, train_time_ms, errors_per_1k, retrain_per_hour
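The correlation fields above could be captured as one fixed record type so every station logs the same columns. This is a sketch under assumptions: the field types and the JSON-lines output are choices made here for illustration, not something the schema above prescribes.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LinkSample:
    # Context
    harness_id: str
    connector_stack: str
    topology_variant: str
    # Environment
    temp_c: float
    vdd_mv: int
    tc10_event: str
    emc_step: str
    # Evidence
    pll_lock: bool
    cdr_lock: bool
    train_time_ms: float
    errors_per_1k: float
    retrain_per_hour: float


def to_log_line(sample):
    """Serialize one sample as a single JSON line with stable key order."""
    return json.dumps(asdict(sample), sort_keys=True)


# Example values are made up for illustration.
sample = LinkSample(
    harness_id="H-001", connector_stack="A", topology_variant="trunk-1stub",
    temp_c=85.0, vdd_mv=3300, tc10_event="none", emc_step="none",
    pll_lock=True, cdr_lock=True, train_time_ms=42.0,
    errors_per_1k=0.0, retrain_per_hour=0.0,
)
print(to_log_line(sample))
```

Because the dataclass has no defaults, a station that forgets a context field fails at construction time instead of producing a log that cannot be correlated later.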
Diagram — Clock Domain Block (REFCLK → PLL / RX CDR → PCS → MAC IF, with test hooks)
[Figure: Clock Domain Block. REFCLK (clock source, with TP probe hook) feeds the PLL (LOCK, INT, MDIO), which supports TX timing and the shaping knobs. On the receive side, the RX CDR (LOCK, MDIO) recovers the clock; the PCS (status + counters) bridges to the MAC IF (host interface). Evidence shown: errors/1k, retrain/hour, burst, fault reason, events.]

The diagram anchors debugging to PHY evidence: lock states, training time, counters, and a probeable REFCLK point—enabling repeatable correlation across harness and environment.

H2-5. TC10 Sleep/Wake Behavior (low-power with wake robustness)

TC10 power states must be engineered as an observable three-stage contract: Entry (enter reliably), Monitor (stay asleep without false wakes or storms), and Exit (wake, retrain, and remain stable). The goal is to prevent false wake, wake failure, and wake oscillation using PHY-visible hooks (state bits, interrupts, counters) and board-level signals.

TC10 Stage A — Entry (enter TC10 deterministically)
Entry prerequisites (keep it PHY-bound)
  • Power domain readiness: rails stable before state transition; avoid brownout-induced bounce.
  • Config completeness: strap/MDIO options consistent with intended TC10 mode; wake enables set.
  • Wake pin definition: polarity and pull network defined; avoid floating inputs.
Observable hooks (what must be readable)
  • PHY: tc10_state, wake_enable, int_reason (INT), fault_reason (MDIO).
  • Board: WAKE pin level/edges, local rail monitor, refclk presence (if applicable).
  • Counters: tc10_toggle_count, entry_attempts, entry_fail_count.
Pass criteria (placeholders X)
  • Entry success rate: ≥ X% over Y cycles (same harness + environment).
  • Entry time distribution: entry_time_ms P50 ≤ X and P95 ≤ X (avoid tail risk).
  • No bounce: tc10_state does not revert within Z ms after entry.
TC10 Stage B — Monitor (stay asleep; stop false wake & storms)
False wake (wake without real intent)
Typical PHY-visible triggers: common-mode injection, WAKE pin glitches, threshold jitter under harness/EMC events.
Quick checks: compare wake_reason codes vs WAKE edge count; correlate with temp_c / emc_step.
Fix direction: debounce/filter window (X ms), stabilize pull network, reduce asymmetry that increases mode conversion.
Pass criteria: false_wake_rate ≤ X/hour in Y-hour soak under corners.
Wake storm (sleep↔wake oscillation)
Definition (placeholder): wake_events_per_minute > X or tc10_toggle_count > X within Y minutes.
Observable: repeated INT reasons, retrain_count spikes, train_time_ms does not converge.
Fix direction: storm guard (rate-limited re-entry), widen stability window, lock baseline harness before tuning knobs.
Pass criteria: storm_events = 0 over Y hours; sleep_residency ≥ X%.
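The wake-storm definition above (more than X wake events inside a time window) can be evaluated over logged wake timestamps with a sliding window. The sketch assumes timestamps in seconds and treats max_events and window_s as the X and Y placeholders; neither value is specified by this page.

```python
def detect_wake_storm(wake_times_s, max_events, window_s=60.0):
    """Return True if more than max_events wakes fall in any window_s span."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    times = sorted(wake_times_s)
    start = 0
    for end, t in enumerate(times):
        # Advance the window start until it is within window_s of t.
        while t - times[start] >= window_s:
            start += 1
        if end - start + 1 > max_events:
            return True
    return False


# Four wakes in 30 s exceed a limit of three per minute.
print(detect_wake_storm([0, 10, 20, 30], max_events=3))
# The same count spread over 100 s does not.
print(detect_wake_storm([0, 30, 70, 100], max_events=3))
```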
TC10 Stage C — Exit (wake → retrain → stable normal)
Exit chain: detect → retrain → stabilize
  • Wake detect: classify as local wake vs remote wake (wake_reason), consider debounce window X ms.
  • Link retrain: train_attempts and train_time_ms must converge; record first fault reason.
  • Normal stability window: verify errors_per_1k and retrain_per_hour stay below X in the first Y minutes.
Pass criteria (placeholders X)
  • Wake-to-link distribution: wake_to_link_ms P50 ≤ X and P95 ≤ X.
  • Training tail control: train_time_ms P95 ≤ X; no repeated rollbacks within Z minutes.
  • Post-wake stability: errors_per_1k ≤ X and retrain_per_hour ≤ X during Y-minute observation window.
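The P50/P95 criteria above need an agreed percentile definition, otherwise stations disagree on the tail. A minimal sketch using the nearest-rank method follows; the choice of nearest-rank, the integer percentile argument, and the example limits are assumptions made here.

```python
def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile for an integer pct in 1..100."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def passes_timing(samples_ms, p50_limit_ms, p95_limit_ms):
    """Apply both the median and the tail limit to one distribution."""
    return (percentile_nearest_rank(samples_ms, 50) <= p50_limit_ms
            and percentile_nearest_rank(samples_ms, 95) <= p95_limit_ms)


# 18 fast wakes and 2 slow ones: the median is fine, the tail is not.
wake_to_link_ms = [100] * 18 + [900] * 2
print(percentile_nearest_rank(wake_to_link_ms, 50))
print(percentile_nearest_rank(wake_to_link_ms, 95))
print(passes_timing(wake_to_link_ms, p50_limit_ms=150, p95_limit_ms=300))
```

The example shows why the tail limit matters: a mean or median alone would hide the two slow wakes.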
Recommended evidence fields (log contract)
  • Events: tc10_entry, tc10_exit, wake_detect, retrain_start, link_up
  • Reasons: wake_reason, int_reason, fault_reason
  • Counters: wake_count, tc10_toggle_count, retrain_count, errors_per_1k
  • Context: temp_c, vdd_mv, harness_id, topology_variant, emc_step
  • Timing: entry_time_ms, wake_to_link_ms, train_time_ms
Diagram — TC10 Timeline (Sleep → Wake detect → Link re-train → Normal)
[Figure: TC10 Timeline. Horizontal time axis with four state blocks: SLEEP (TC10 active) → WAKE DETECT → RE-TRAIN → NORMAL. Marked windows: debounce (X ms), training (X ms), stability (X s). Local wake and remote wake arrows are shown. Observability tags: INT, MDIO, CNT. Fields: wake_reason, train_time_ms, errors_per_1k, retrain/hour, temp_c, vdd_mv.]

Use three windows as non-negotiable gates: debounce (X ms), training convergence (X ms), and post-wake stability observation (X s) to prevent “wake then fall back” behavior.
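The debounce gate can be illustrated on a recorded WAKE pin trace. The sketch assumes an active-high pin, edges given as (time_ms, level) pairs sorted by time, and a wake that is accepted once the pin has stayed high for debounce_ms; real PHYs may implement the filter differently.

```python
def debounced_wakes(edges, debounce_ms, end_ms):
    """Return the times (ms) at which a wake would be accepted.

    edges: list of (t_ms, level) tuples sorted by time, level 1 = asserted.
    end_ms: end of the recording, used for a pulse that is still high.
    """
    accepted = []
    high_since = None
    for t_ms, level in edges:
        if level and high_since is None:
            high_since = t_ms  # rising edge starts the debounce timer
        elif not level and high_since is not None:
            if t_ms - high_since >= debounce_ms:
                accepted.append(high_since + debounce_ms)
            high_since = None  # falling edge clears the timer
    if high_since is not None and end_ms - high_since >= debounce_ms:
        accepted.append(high_since + debounce_ms)
    return accepted


# A 2 ms glitch is rejected; a 30 ms pulse is accepted 10 ms after it rises.
trace = [(0, 1), (2, 0), (100, 1), (130, 0)]
print(debounced_wakes(trace, debounce_ms=10, end_ms=200))
```

Comparing the number of accepted wakes with the raw edge count is one way to produce the wake_reason versus WAKE edge comparison suggested for false-wake checks.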

H2-6. EMC-Shaped Transmit (spectral shaping, edge control, common-mode control)

Transmit shaping is a set of PHY-level knobs that reduce spectral peaks and manage common-mode behavior, but every knob trades off against link margin. The disciplined flow is: lock the harness baseline, tune one knob, and verify both EMI peak and BER margin across corners.

Shaping goals (keep scope to PHY effects)
  • Lower EMI peaks: reduce radiated hotspots by controlling edges and spectral distribution.
  • Control common-mode spectrum: limit DM→CM leakage sensitivity and reduce CM-driven emissions.
  • Preserve link margin: ensure training and stability remain robust across temperature and harness variation.
Knobs (each with effect, side effect, validation, pass criteria placeholders)
Knob — Slew / edge rate
Effect: slower edges reduce high-frequency spectral peaks.
Side effect: smaller eye opening under corner harness; training tail can widen.
Validate: train_time_ms P95, errors_per_1k, retrain/hour vs EMI peak scan.
Pass criteria: EMI_peak_reduction ≥ X and errors_per_1k ≤ X (same baseline).
Knob — Drive strength
Effect: adjusts signal amplitude and spectral energy distribution.
Side effect: too strong can worsen emissions; too weak can collapse margin on long/variable harness.
Validate: compare error-burst rate across harness variants; confirm stable retrain/hour.
Pass criteria: retrain_per_hour ≤ X and BER margin trend stable across corners.
Knob — Pre-emphasis / De-emphasis / EQ profile
Effect: compensates channel loss; can recover margin on difficult harnesses.
Side effect: over-emphasis amplifies noise; can create burst errors and worsen EMI in certain bands.
Validate: correlate errors with load and temperature; check EMI peaks with fixed harness baseline.
Pass criteria: errors_per_1k ≤ X and EMI_peak ≤ X (no new peak regressions).
Knob — Adaptive EQ (enable/limit/freeze)
Effect: tracks channel variation; improves robustness when harness variance is unavoidable.
Side effect: adaptation can drift under noise, producing instability or longer convergence tails.
Validate: train_attempts distribution and retrain/hour under EMC stimuli and corners.
Pass criteria: train_time_ms P95 ≤ X and storm_events = 0 in Y-hour soak.
Knob — Common-mode control
Effect: shapes CM behavior to reduce CM-driven emissions and sensitivity.
Side effect: CM changes can interact with wake detect and marginal channels if asymmetry exists.
Validate: monitor false_wake_rate and errors_per_1k before/after CM tuning with identical harness.
Pass criteria: CM emission trend improves and false_wake_rate ≤ X/hour.
Knob — Shaping profile selection (freeze after validation)
Effect: provides discrete, repeatable settings across units and stations.
Side effect: profiles that pass in one harness may fail in another; baseline lock is mandatory.
Validate: corner replay (temp + harness variants) with stable denominators (per 1k / per hour).
Pass criteria: EMI and margin criteria both met across all required corner runs.
Trade-off closure (how to avoid “passes EMC but becomes fragile”)
  1. Lock baseline: harness_id + topology_variant + connector_stack fixed.
  2. Tune one knob: change a single parameter; keep all others frozen.
  3. Measure both axes: EMI peak trend + errors_per_1k / retrain/hour.
  4. Corner replay: temperature + worst harness variants; verify training tail (P95).
  5. Freeze profile: document settings as a versioned profile for production repeatability.
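Steps 2 and 3 of the closure flow can be enforced mechanically: accept a tuning step only if exactly one knob changed and both axes pass. The knob names, the EMI metric, and the limits below are hypothetical placeholders for whatever the PHY and the EMC plan actually define.

```python
def changed_knobs(baseline, candidate):
    """Names of knobs whose value differs between two profiles."""
    keys = set(baseline) | set(candidate)
    return sorted(k for k in keys if baseline.get(k) != candidate.get(k))


def accept_step(baseline, candidate, emi_peak, emi_limit,
                errors_per_1k, errors_limit):
    """Accept only a single-knob change that passes EMI and margin limits."""
    one_knob = len(changed_knobs(baseline, candidate)) == 1
    both_axes = emi_peak <= emi_limit and errors_per_1k <= errors_limit
    return one_knob and both_axes


# Hypothetical profiles: knob names and codes are illustrative only.
baseline = {"slew": 2, "drive": 3, "pre_emph": 1}
one_change = {"slew": 1, "drive": 3, "pre_emph": 1}
two_changes = {"slew": 1, "drive": 2, "pre_emph": 1}

print(changed_knobs(baseline, one_change))
print(accept_step(baseline, one_change, 38.0, 40.0, 0.1, 0.5))
print(accept_step(baseline, two_changes, 38.0, 40.0, 0.1, 0.5))
```

Rejecting the two-knob candidate even when both measurements pass keeps each accepted step attributable to a single cause.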
Diagram — TX Shaping Knob Panel (knobs → EMI peak / BER margin trade-off)
[Figure: TX Shaping Knob Panel. Knobs SLEW, DRIVE, PRE-EMPH, EQ, CM CTRL, and PROFILE feed two outputs: EMI PEAK (trend) and BER MARGIN (errors/retrain), with the trade-off EMI ↔ margin between them. Validation tags: EMI, SCOPE, CNT. Note: keep harness baseline fixed, tune one knob, verify both axes.]

Treat shaping as a repeatable profile: adjust a single knob at a time and freeze the profile only after EMI and margin pass criteria remain stable across corner harness and temperature runs.

H2-7. ASIL Hooks & Diagnostics (fail-silent/fail-safe observability)

Functional safety integration at the PHY is about an auditable evidence chain: what is detected, how it reacts, and what is logged. Diagnostics should be expressed as items with clear windows, denominators, and thresholds (X) so system monitors can make deterministic decisions.

Diagnostic coverage map (PHY-visible)
  • Link integrity: link_state, flap detection, retrain tracking.
  • Signal quality: SQI/quality indicators and minima over windows.
  • Error accounting: CRC/frame errors, burst behavior, errors per 1k frames.
  • Remote fault: remote_fault flags and train_fail codes for correlation.
  • Latch & safe entry: fault latch state and controlled safe-state paths.
Diagnostic items (Detection → Reaction → Evidence log)
Item — Link flap monitor
Detection
link_flap_count > X within Y minutes, or retrain_count > X/hour. Use consistent denominators and windowing.
Reaction
warn → degrade (rate-limit re-train) → safe state if persistent beyond Z windows. Use latch if required by safety concept.
Evidence log
link_state, link_flap_count, retrain_count, retrain_reason, train_time_ms P95, temp_c, vdd_mv, harness_id.
Item — SQI / quality threshold
Detection
quality_min < X over window Y seconds, or quality_p95 drifts below X across temperature corners.
Reaction
request service / degrade link profile, then enter safe state if quality remains below X for Z windows.
Evidence log
sqi, quality_min, quality_p95, topology_variant, emc_profile_id, errors_per_1k, retrain_count, temp_c.
Item — CRC / error burst detector
Detection
errors_per_1k > X, or err_burst_count > X per minute with burst_duration_ms above X. Keep window consistent.
Reaction
isolate traffic class, degrade speed/profile, then safe state if bursts persist beyond Z windows.
Evidence log
crc_err, frame_err, err_burst_count, errors_per_1k, train_fail_code, remote_fault, temp_c, vdd_mv.
Item — Remote fault / train-fail correlation
Detection
remote_fault asserted or train_fail_code != 0 during bring-up/retrain. Count occurrences per X cycles.
Reaction
request service / stop repeated retrain storms, then safe state if failure repeats above X within Y minutes.
Evidence log
remote_fault, train_fail_code, retrain_reason, train_time_ms, harness_id, topology_variant, emc_profile_id.
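The detection → reaction pattern shared by these items can be sketched for the link flap monitor as a windowed escalation with a latch. The level names follow the warn → degrade → safe state sequence above; the thresholds, the reset-on-good-window rule, and the permanent latch are assumptions that a real safety concept would have to define.

```python
class FlapMonitor:
    """Escalate over consecutive bad windows: warn, degrade, safe_state."""

    def __init__(self, flap_limit, windows_to_safe):
        self.flap_limit = flap_limit            # X: flaps allowed per window
        self.windows_to_safe = windows_to_safe  # Z: bad windows before latch
        self.bad_windows = 0
        self.latched = False

    def evaluate_window(self, link_flap_count):
        """Feed one window's flap count and return the reaction level."""
        if self.latched:
            return "safe_state"  # latch holds until an external reset
        if link_flap_count > self.flap_limit:
            self.bad_windows += 1
        else:
            self.bad_windows = 0  # a clean window clears the streak
        if self.bad_windows >= self.windows_to_safe:
            self.latched = True
            return "safe_state"
        if self.bad_windows == 0:
            return "ok"
        if self.bad_windows == 1:
            return "warn"
        return "degrade"


persistent = FlapMonitor(flap_limit=2, windows_to_safe=3)
levels = [persistent.evaluate_window(n) for n in (5, 5, 5, 0)]
print(levels)

transient = FlapMonitor(flap_limit=2, windows_to_safe=3)
print([transient.evaluate_window(n) for n in (5, 0, 5)])
```

The returned level maps directly onto the reaction_level and latch_state evidence fields, so each decision can be logged alongside the counters that triggered it.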
Evidence field dictionary (recommended)
  • Link: link_state, link_flap_count, retrain_count, retrain_reason
  • Quality: sqi, quality_min, quality_p95
  • Errors: crc_err, frame_err, err_burst_count, errors_per_1k
  • Training: train_fail_code, train_time_ms, remote_fault
  • Action: reaction_level, safe_state_entered, latch_state
  • Context: temp_c, vdd_mv, harness_id, topology_variant, emc_profile_id, tc10_state
Diagram — ASIL Evidence Chain (PHY → MCU monitor → DTC/log → safe state)
[Figure: ASIL Evidence Chain. Four stages: PHY evidence (counters, SQI, IRQ, latch) → MCU monitor (window, threshold, vote, action) → DTC & log (DTC code, snapshot, counters, context) → safe state (degrade, isolate, reset). Example evidence fields: errors_per_1k, train_fail_code, reaction_level, safe_state_entered.]

Keep the evidence chain auditable: every decision must have a windowed detection rule and a reproducible snapshot field set.

H2-8. Protection & Survivability at the PHY Port (ESD/surge, CM choke, magnetics note)

Port protection must survive ESD/surge families while preserving differential integrity. The engineering focus is the stack (CMC / TVS / connector) and the return path. Low-cap TVS symmetry (Cdiff/ΔC) and CMC placement can improve EMC yet create margin loss and training tails if misapplied.

Risk → Check → Fix → Pass criteria (placeholders X)
Case — ESD/surge passes once, then link becomes “fragile”
Risk
Energy return path injects into sensitive reference, shifting common-mode balance and increasing burst errors.
Check
Compare pre/post event: errors_per_1k, retrain/hour, SQI minima. Verify return path continuity and avoid plane cuts under the stack.
Fix
Re-anchor clamp/return closer to connector and shorten loop area. Keep differential routing continuous with direct return reference.
Pass criteria (X)
After Y repeated events: errors_per_1k ≤ X and retrain/hour ≤ X; no new link_flap episodes in Z minutes.
Case — Same TVS footprint, different vendor worsens BER
Risk
TVS asymmetry introduces ΔC between the pair (Cdiff/ΔC), raising mode conversion and reflections that shrink training margin.
Check
A/B compare with identical harness: train_time_ms P95, errors_per_1k, and SQI minima. Inspect symmetry of placement and return routing.
Fix
Select lower ΔC options, enforce pair symmetry, and keep clamp path short and direct to the intended return.
Pass criteria (X)
Training success ≥ X% over Y cycles, errors_per_1k ≤ X, and no new burst-error pattern after the swap.
Case — Adding a CMC improves EMI but causes sporadic link flaps
Risk
CMC adds differential insertion loss/phase shift; training tails widen and retrain storms may appear on marginal harness corners.
Check
Compare before/after: train_time_ms P95, retrain/hour, error burst rate. Verify placement priority and keep routing continuous.
Fix
Place CMC per priority near connector, avoid stubs, and re-validate shaping profile if margins change. Freeze only after corners pass.
Pass criteria (X)
EMI improves without regressions: train_time_ms P95 ≤ X, retrain/hour ≤ X, errors_per_1k ≤ X across required corners.
Diagram — Port Protection Stack (PHY → CMC → TVS → Connector → Harness)
Figure: Port Protection Stack. Block diagram: PHY (TX/RX) → CMC (P2) → TVS (P1) → Connector (P3) → Harness, with a bold return path to chassis/shield and differential pair routing continuity markers. Callouts: short loop, clamp symmetry. Placement priorities: P1 clamp near connector; P2 CMC per guidance; P3 connector stack integrity.

Keep the protection stack symmetric and the return path short. Any change in TVS/CMC can shift training tails and false-wake behavior; re-validate with the same denominators and windows.
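The re-validation step can be sketched as a before/after comparison over one metric set. This is a minimal sketch: the function name `find_regressions`, the dict layout, and the key names (`train_time_ms_p95`, `retrain_per_hour`) are hypothetical, and the allowed increases stand in for X. It assumes every metric is "lower is better".

```python
from typing import Dict, List


def find_regressions(baseline: Dict[str, float],
                     candidate: Dict[str, float],
                     max_increase: Dict[str, float]) -> List[str]:
    """Return metrics where the candidate is worse than baseline by more than allowed.

    A metric missing from either run counts as a regression, because a
    comparison without the same fields is not usable evidence.
    """
    failed = []
    for name, allowed in max_increase.items():
        if name not in baseline or name not in candidate:
            failed.append(name)
        elif candidate[name] - baseline[name] > allowed:
            failed.append(name)
    return failed
```

Both runs should use the same harness, window, and denominator; otherwise the deltas are not comparable.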

H2-9. Reference Design & Layout (minimum rules that prevent re-spins)

Layout quality determines whether the PHY has stable training margin and repeatable EMC outcomes. The rules below are the minimum set that prevents common re-spins: differential continuity, correct port/shield handling near the connector, and power/ground practices that avoid hidden coupling paths.

The 10 hard rules (Rule → Why → Quick verify)

Each rule is written as an executable action with a fast verification step. Thresholds are placeholders (X).

A) Differential pair integrity (routing)
Rule 1 — Keep a continuous reference plane under the pair
Why
Plane cuts force return current detours, raising common-mode energy and shrinking training margin.
Quick verify
Review the pair path for any split/slot crossings. Any unavoidable crossing must include a short, direct return bridge (X rules).
Rule 2 — Match geometry and symmetry before chasing micrometer length match
Why
Over-aggressive meanders add discontinuities and mode conversion; stable symmetry usually beats excessive serpentine length match.
Quick verify
Limit serpentine density and keep both sides mirror-symmetric. Correlate any meander region with error burst hotspots (X).
Rule 3 — Keep via transitions paired, symmetric, and short
Why
Asymmetric transitions create differential imbalance and reflections that show up as longer training tails and intermittent CRC bursts.
Quick verify
Inspect every layer swap: two signal vias with symmetric spacing and a nearby return strategy. Flag any “single-via” deviation.
Rule 4 — Avoid T-branches and minimize board-level stubs
Why
Stub reflections can land inside training/sample windows, turning “link up” into periodic flaps under temperature or harness variation.
Quick verify
Count all branching points and measure any spur length. Target spur length < X (board-level), then re-check train_time_ms P95.
B) Connector & shield (port-near best practices)
Rule 5 — Provide a short, low-impedance shield bond near the connector
Why
High-frequency common-mode energy must return locally; otherwise it couples into board ground and amplifies radiated emissions and error bursts.
Quick verify
Confirm the shield bond loop is short and wide. Any “long trace to ground” around the connector is a red flag.
Rule 6 — Place Y-cap or discharge paths as close as possible to the entry point
Why
A discharge element that is far from the connector turns into an injection path; the loop area becomes the antenna.
Quick verify
Compare loop area for the discharge return. If the return detours around keep-outs or plane gaps, treat as a re-spin candidate.
Rule 7 — Follow the protection-stack priority (TVS/CMC) and do not “push it inward”
Why
Clamping after the energy has already entered the board defeats the purpose and can worsen both survivability and link margin.
Quick verify
TVS should sit near the connector (priority P1). Any relocation requires re-baselining train_time_ms P95 and errors_per_1k (X).
C) Power & ground (noise and coupling control)
Rule 8 — Use a staged decoupling network: near-HF + mid + bulk
Why
PHY internal clocking and transmit shaping are sensitive to supply transients; poor decoupling turns current steps into common-mode disturbance.
Quick verify
Check capacitor placement by loop area (pad-to-via-to-plane). Any long stubs to decaps should be flagged for revision.
Rule 9 — Identify and cut the coupling paths (REFCLK/IO → supply → port)
Why
Many field-only CRC bursts come from internal coupling, not from the harness. Layout must prevent digital noise from modulating the port common-mode.
Quick verify
Mark three likely coupling paths on the layout (X). Validate by correlating error bursts with switching activity and supply noise snapshots.
Rule 10 — Add test hooks without breaking symmetry or return continuity
Why
Test pads, probe grounds, and fixtures can create stubs and common-mode injection, producing misleading “fails” during debug.
Quick verify
A/B compare with and without probes/fixtures: sqi_min and errors_per_1k should not shift beyond X.
Diagram — Layout Do / Don’t map (return, vias, TVS placement, stubs)
Figure: Layout Do / Don’t Map. Two columns (DO, DON’T), each with four mini-scenes: return path (plane continuous vs. plane cut with detour), vias (via pair vs. asymmetry), TVS placement (short return at connector vs. long loop), and stubs (no branch vs. T-branch).

Treat the port region as a controlled system: return path, symmetry, and placement priorities decide both robustness and EMC repeatability.

H2-10. Validation & Test Hooks (what to measure, how to pass)

Validation should establish a reproducible baseline, prove stability across corners, and provide traceable evidence fields. The checklist below is organized as Design gate, Bring-up gate, and Production gate, with placeholder thresholds (X).

Checklist (Design gate → Bring-up gate → Production gate)
Gate A — Design gate (prevent “built-in” failures)
Must-check
  • Port stack placement: TVS/CMC priority and short return loops.
  • Pair continuity: no plane-split crossings, symmetric transitions, minimal stubs.
  • Supply integrity: staged decoupling and identified coupling paths.
Must-log (design traceability)
layout_rev, stack_variant, return_path_note, emc_profile_id, topology_variant, harness_class (X).
Pass criteria (X)
All critical placement/continuity items pass (no waivers). Any deviation must be paired with a baseline re-test plan.
Gate B — Bring-up gate (baseline + reproducibility)
Must-measure
  • Loopback / PRBS (if supported) to separate PHY margin from higher layers.
  • Training logs: success rate and train_time_ms distribution (P95 recommended).
  • Error counters baseline: errors_per_1k, err_burst_count, retrain/hour.
Must-log (evidence fields)
train_time_ms, train_fail_code, retrain_count, retrain_reason, errors_per_1k, err_burst_count, sqi, temp_c, vdd_mv, harness_id.
Pass criteria (X)
training success ≥ X% over Y cycles, errors_per_1k ≤ X, retrain/hour ≤ X, and train_time_ms P95 ≤ X.
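The bring-up pass criteria combine a success rate, two rate limits, and a tail statistic. The sketch below shows one way to evaluate them together. It assumes a nearest-rank definition of P95 and represents a failed training cycle as `None`; the function names and argument layout are hypothetical.

```python
from typing import List, Optional


def p95(samples: List[float]) -> float:
    """Nearest-rank P95: smallest sample with at least 95% of samples at or below it."""
    if not samples:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n), 1-based
    return ordered[rank - 1]


def bring_up_gate(train_times_ms: List[Optional[float]],
                  errors_per_1k: float, retrain_per_hour: float,
                  min_success_pct: float, max_errors_per_1k: float,
                  max_retrain_per_hour: float, max_p95_ms: float) -> bool:
    """train_times_ms holds one entry per cycle; None marks a failed training."""
    ok_times = [t for t in train_times_ms if t is not None]
    if not ok_times:
        return False  # no cycles at all, or every cycle failed
    success_pct = 100.0 * len(ok_times) / len(train_times_ms)
    return (success_pct >= min_success_pct
            and errors_per_1k <= max_errors_per_1k
            and retrain_per_hour <= max_retrain_per_hour
            and p95(ok_times) <= max_p95_ms)
```

P95 is computed over successful cycles only, so failed cycles affect the success rate rather than the timing tail. Whichever convention is chosen should be fixed and recorded with the results.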
Gate C — Production gate (traceability + fast health check)
Must-measure
  • Short-duration PRBS/loopback snapshot + counter readout for pass/fail.
  • Corner sampling plan: temperature/voltage points and harness variants (simplified).
  • EMC regression check: if TX shaping profile changes, re-baseline link metrics.
Must-log (station fields)
station_id, fixture_id, timestamp, fw_version, cal_id, plus the full evidence fields from Bring-up gate.
Pass criteria (X)
production fail rate ≤ X, train_time_ms P95 drift ≤ X between stations, and errors_per_1k ≤ X under the defined snapshot window.
EMC pre-scan loop (measure → map to knobs → re-verify)
  • Measure: record peak map (frequency + amplitude + setup tags) and link evidence fields during the same window.
  • Map: adjust TX shaping knobs (slew/drive/EQ/common-mode control) targeted to the dominant peak type.
  • Re-verify: confirm no regression in training success, errors_per_1k, and retrain/hour (threshold X).
Diagram — Validation ladder (bench → harness → vehicle → EMC → production)
Figure: Validation Ladder. Five steps of increasing realism, each with three must-test items, with gates between stages: bench (PRBS, loopback, baseline) → harness (train_time P95, errors_per_1k, retrain) → vehicle (temp/vdd, event log, flap) → EMC (peak map, knob adjust, re-verify) → production (health, station, trace).

Build confidence bottom-up: establish a baseline on the bench, then add harness realism, vehicle corners, EMC closure, and production traceability.

H2-11. Applications (camera, zonal, domain controller links)

This section maps real vehicle link use-cases to the page's three main threads: TC10 wake robustness, EMC-shaped transmit, and ASIL-grade observability. Only PHY-facing constraints and pass/fail gates are included; stack and TSN topics are out of scope.

Use-case A · Camera / ADAS edge link
Focus: TC10 + EMC + harness variability
Context
Point-to-point STP/UTP link to a remote sensor module (camera/radar/telematics). Cold-start and frequent sleep/wake cycles often combine with strict EMI limits and harness/connector variability across trims.
PHY constraints (only PHY-facing)
  • Wake path sensitivity: TC10 entry/exit windows, wake filter behavior, and wake-source arbitration (local vs remote).
  • EMI vs margin trade: TX spectral shaping can lower peaks but reduce eye/BER margin under temperature/harness drift.
  • Burst errors: error bursts often correlate with retrain attempts, CM noise events, or edge-rate shifts.
Design hooks
  • Profile ID: freeze a per-vehicle “PHY profile” (TC10 + TX shaping + diagnostics set) and log the profile_id with every field event.
  • Wake de-glitch: place wake filter/debounce close to the PHY wake pin and keep wake ground return short/quiet.
  • Closed-loop EMC: link pre-scan “hot bands” back to a limited set of TX shaping knobs (avoid uncontrolled tuning).
Pass criteria (placeholders)
  • Wake success rate ≥ X% across temperature/voltage corners; false-wake ≤ X/hour.
  • retrain_count ≤ X/day; errors_per_1k ≤ X and err_burst_count ≤ X within a defined time window.
  • EMI peak reduction ≥ X dB versus baseline with no BER regression beyond X.
Example BOM (non-exhaustive, for grounding the discussion)
Automotive Ethernet PHY examples: TI DP83TG720S-Q1 (1000BASE-T1), NXP TJA1121 (1000BASE-T1), Marvell 88Q2110 / 88Q2112 (100/1000BASE-T1), NXP TJA1101B (100BASE-T1), Microchip LAN8770 (100BASE-T1)
Low-C ESD / TVS examples for in-vehicle T1 lines: Nexperia PESD1ETH1G-LS, Nexperia PESD2ETH1G-T, Nexperia PESD2ETH100-T
Common-mode choke examples (use vendor reference values): TDK ACT1210G-800-2P-TL05, Pulse AE5002
Use-case B · Zonal (spur-rich harness)
Focus: stubs + false wake + diagnostics
Context
Short branches and topology variants dominate. Field issues often present as “random drops” or “wake storms” that only occur in certain harness builds.
PHY constraints
  • Reflection timing: spur/stub reflections can land inside the sampling/training window and create repeatable CRC bursts.
  • Wake ambiguity: CM events can look like wake activity unless wake filtering and grounding are controlled.
  • Serviceability: without standardized counters and fields, “which branch failed” becomes unanswerable.
Design hooks
  • Harness rule: enforce stub length < X and document topology_variant per vehicle variant.
  • Counters schema: define a single window/denominator for error rates (errors_per_1k) and log P50/P95.
  • Topology forensics: log harness_id + connector_id + cmc/tvs footprint revision to correlate failures.
Pass criteria (placeholders)
  • train_time_ms P95 ≤ X across harness variants; retrain/hour ≤ X.
  • False wake ≤ X/hour in the worst-case CM event environment.
  • CRC bursts correlate to a single captured topology signature within X minutes of logging.
Example BOM anchors
PHY examples often used in zonal links: NXP TJA1101B (100BASE-T1), TI DP83TC811R-Q1 (100BASE-T1), Broadcom BCM89810 (100BASE-T1), Microchip LAN8770 (100BASE-T1)
Use-case C · Domain controller (multi-port)
Focus: consistency + logs + isolation of noise paths
Context
Multi-port PHY deployment concentrates clock/power noise and increases the need for identical configuration, identical diagnostics, and repeatable production gates per port.
PHY constraints
  • Port-to-port drift: identical harnesses can behave differently if REFCLK routing, decoupling, or CM return is asymmetric.
  • Profile skew: mixed TX shaping / TC10 / counter settings destroy comparability and slow field triage.
  • Evidence alignment: logs must share the same field schema and time window to compare ports.
Design hooks
  • Per-port profile control: enforce a single “golden” configuration set and gate any deviations by profile_id.
  • Noise containment: isolate per-port return and avoid plane cuts under the MDI and protection parts.
  • Forensics-ready counters: snapshot counters on every wake/retrain/error burst with temp_c and vdd_mv.
Pass criteria (placeholders)
  • Port-to-port train_time_ms delta ≤ X (P95); port-to-port error_rate delta ≤ X.
  • All ports meet EMC target with a single shaping profile (no per-port “special tuning”).
  • Field triage can identify failing port + harness_id + protection_rev within X minutes.
Example BOM anchors
1G multi-port deployments commonly anchor on: TI DP83TG720S-Q1, NXP TJA1121, Marvell 88Q2110 / 88Q2112
Use-case Tiles (PHY view)
Figure: three tiles covering TC10, EMC shaping, and ASIL observability. Camera / ADAS: Domain ECU PHY → STP/UTP harness → Remote PHY → Camera (TC10 · EMC · harness drift). Zonal (many spurs): Controller + PHY → main trunk → nodes (stubs · false wake · counters). Domain controller: MCU / SoC with safety monitor + logs → Port 1/2/3/4 PHY → harness ports (consistency · profile · evidence).
Diagram intent: quickly select the matching scene, then apply the same three main threads (TC10 / EMC shaping / ASIL evidence) with scene-specific gates.

H2-12. IC Selection Logic (metrics → decision tree)

Selection is framed as a decision flow with explicit gates, not a vendor comparison. The goal is to minimize re-spins by deciding topology/harness first, then EMC strategy, then TC10 wake robustness, then diagnostics/ASIL evidence.

Decision flow (4 steps, each outputs a gate)
Step 1 · Topology / harness
Decide trunk vs spur-rich, expected harness variants, and connector environment. Gate outputs: stub_len < X, return_loss_margin ≥ X, topology_variant logged.
Step 2 · EMC strategy
Require a controllable set of TX shaping knobs and a measurable EMC closure loop. Gate outputs: shaping_range ≥ X, CM control mode available, no uncontrolled “field tuning”.
Step 3 · TC10 sleep/wake robustness
Confirm entry/monitor/exit behavior, wake sources, and false-wake suppression. Gate outputs: wake_success ≥ X%, false_wake ≤ X/hour, retrain_after_wake ≤ X.
Step 4 · Diagnostics / ASIL evidence
Ensure counters + IRQ/fault latch + logging fields exist to build an auditable evidence chain. Gate outputs: counters_set ≥ X, evidence_fields defined, safe-state controllable.
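Evaluating the four gates in a fixed order makes the first blocking decision explicit. The sketch below is illustrative only. The key names mirror the gate outputs above, but the units (mm, dB, steps) and every numeric threshold are invented placeholders for X. `first_failing_gate` and `GATES` are hypothetical names.

```python
from typing import Callable, Dict, List, Optional, Tuple

Gate = Tuple[str, Callable[[Dict[str, float]], bool]]


def first_failing_gate(candidate: Dict[str, float], gates: List[Gate]) -> Optional[str]:
    """Evaluate gates in order and stop at the first failure; None means all passed."""
    for name, check in gates:
        if not check(candidate):
            return name
    return None


# Placeholder thresholds stand in for X; they are not recommended values.
GATES: List[Gate] = [
    ("topology", lambda c: c["stub_len_mm"] < 20 and c["return_loss_margin_db"] >= 3),
    ("emc", lambda c: c["shaping_range_steps"] >= 4 and c["cm_control"] == 1),
    ("tc10", lambda c: c["wake_success_pct"] >= 99.9
                       and c["false_wake_per_hour"] <= 0.1
                       and c["retrain_after_wake"] <= 1),
    ("evidence", lambda c: c["counters_set"] >= 6
                           and c["evidence_fields_defined"] == 1
                           and c["safe_state_controllable"] == 1),
]
```

Stopping at the first failure reflects the ordering argument: there is little value in scoring TC10 behavior for a part that already fails the topology gate.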
Parameter checklist (ask these before committing)
  • Speed grade: choose 100BASE-T1 vs 1000BASE-T1 based on bandwidth + harness/EMC headroom.
  • TC10 behavior: entry/monitor/exit conditions, supported wake sources, and false-wake suppression knobs.
  • TX shaping: slew/drive/EQ/common-mode control range and how knobs are validated.
  • Diagnostics set: SQI/quality metrics, CRC/error counters, remote fault reporting, IRQ behavior.
  • Logging schema: required fields + window definition to compare across ports and vehicles.
  • Port protection compatibility: tolerance to low-C TVS mismatch (Cdiff/ΔC) and optional CMC insertion loss.
  • Temperature grade & EMC collateral: corner stability evidence and available compliance resources.
Concrete part-number anchors (examples, not a recommendation list)
100BASE-T1 PHY examples: NXP TJA1101B, TI DP83TC811R-Q1, Microchip LAN8770, Broadcom BCM89810
1000BASE-T1 PHY examples: TI DP83TG720S-Q1, NXP TJA1121, Marvell 88Q2110 / 88Q2112
ESD / TVS examples for T1 MDI lines: Nexperia PESD1ETH1G-LS, Nexperia PESD2ETH1G-T, Nexperia PESD2ETH100-T
CMC examples (validate against PHY vendor ref designs): TDK ACT1210G-800-2P-TL05, Pulse AE5002
Selection Flow (4-step gates): topology → EMC → TC10 → evidence
Figure: 1) Topology (stub < X, RL margin ≥ X) → 2) EMC plan (shaping ≥ X, CM control) → 3) TC10 wake (wake ≥ X%, false ≤ X/h) → 4) Evidence (counters ≥ X, fields defined). Outcome: choose 100 vs 1000BASE-T1, lock a PHY profile_id, and define evidence fields for production and field service. Outputs: profile_id (frozen), shaping knobs (bounded), evidence fields (standard).
Diagram intent: prevent re-spins by forcing decisions in the correct order and turning each step into a measurable gate.

H2-13. FAQs (TC10 / EMC shaping / ASIL hooks / harness & port protection)

Purpose: close long-tail troubleshooting without expanding scope. Each FAQ uses a fixed, auditable 4-line format and shares the same metric/log schema.

Shared metric schema (used in every Pass criteria)
Core metrics
  • errors_per_1k: PHY-layer error events per 1000 frames (window = W seconds, fixed across ports/vehicles).
  • burst_count: number of bursts where ≥ M errors occur within T ms (burst definition is fixed).
  • retrain_count: retrain attempts per hour; retrain_after_wake: retrains that occur within S seconds after wake.
  • train_time_ms P95: 95th percentile bring-up time across N cycles (captures tail failures).
Required log fields (minimum)
profile_id, harness_id, batch_id, topology_variant, wake_src, wake_reason, temp_c, vdd_mv, train_time_ms, errors_per_1k, burst_count, retrain_count
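The fixed burst definition (≥ M errors within T ms) can be sketched as follows. This is a minimal sketch: it assumes error events are available as millisecond timestamps and counts non-overlapping bursts, which is one possible reading of the definition. The function name `count_bursts` is hypothetical.

```python
from typing import List


def count_bursts(error_ts_ms: List[int], min_errors: int, burst_ms: int) -> int:
    """Count non-overlapping bursts: at least min_errors (M) errors within burst_ms (T)."""
    ts = sorted(error_ts_ms)
    bursts = 0
    i = 0
    while i + min_errors <= len(ts):
        # Does a run of min_errors errors starting at index i fit inside burst_ms?
        if ts[i + min_errors - 1] - ts[i] <= burst_ms:
            bursts += 1
            # Consume every error that falls inside this burst's span.
            end = ts[i] + burst_ms
            while i < len(ts) and ts[i] <= end:
                i += 1
        else:
            i += 1
    return bursts
```

For example, with M = 3 and T = 10 ms, errors at 0, 1, 2 ms form one burst, an isolated error at 100 ms forms none, and errors at 500 to 503 ms form a second burst.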
▸ TC10 enters sleep, then sporadic “wake storm” — check wake source or harness common-mode events first?
Likely cause: wake_src is truly asserted (legit remote/local wake) or CM disturbance is interpreted as wake activity (ground/shield/CMC/return path).
Quick check: correlate each wake with wake_reason + IRQ cause and the same-window burst_count / errors_per_1k; verify harness_id/batch_id pattern.
Fix: tighten wake filter/debounce (bounded change), then harden CM return (shield bond/Y-cap location/CMC placement) and freeze a single profile_id for TC10+filters.
Pass criteria: false_wake ≤ X/hour, wake storm disappears under worst CM events; burst_count ≤ X in window W.
▸ Remote wake success rate is low — wake window/debounce or power domains not ready?
Likely cause: the wake-detect window is too narrow (or the wake event is filtered out), or VDD/REFCLK/strap readiness arrives late, so the wake leads into a retrain failure.
Quick check: timestamp sequence: wake_detect → rails_ready → refclk_stable → retrain_start; flag cases where rails_ready arrives after wake window (S seconds).
Fix: adjust wake window/debounce (bounded), then gate wake exit on rails_ready/refclk_stable; keep reset/strap deterministic across all ports.
Pass criteria: wake_success ≥ X% across temp/voltage corners; retrain_after_wake ≤ X within S seconds.
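The timestamp check from the quick check can be automated per wake event. The sketch below assumes each event carries the four timestamps in milliseconds under the names used above, and that the wake window is expressed in the same unit. `check_wake_sequence` and the problem strings are hypothetical.

```python
from typing import Dict, List

ORDER = ["wake_detect", "rails_ready", "refclk_stable", "retrain_start"]


def check_wake_sequence(ts_ms: Dict[str, float], wake_window_ms: float) -> List[str]:
    """Return problems found in one wake event; an empty list means the sequence is clean."""
    missing = [name for name in ORDER if name not in ts_ms]
    if missing:
        return ["missing timestamp: " + name for name in missing]
    problems = []
    # Each stage must not start before the previous one.
    for earlier, later in zip(ORDER, ORDER[1:]):
        if ts_ms[later] < ts_ms[earlier]:
            problems.append(f"{later} before {earlier}")
    # Rails must be ready before the wake window closes.
    if ts_ms["rails_ready"] - ts_ms["wake_detect"] > wake_window_ms:
        problems.append("rails_ready after wake window")
    return problems
```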
▸ Link meets BER/CRC targets but EMI fails — tune TX shaping first or check CM return/CMC placement first?
Likely cause: TX shaping profile is not aligned to harness resonance or CM return path is uncontrolled (shield bond/return discontinuity/CMC location).
Quick check: lock harness_id and environment, then A/B one shaping knob at a time; if EMI hot-band moves with shaping, start there—if not, suspect return/CMC placement.
Fix: define a bounded knob set (slew/drive/EQ/CM control) → freeze profile_id; if still failing, rework CM return (short, continuous, near connector).
Pass criteria: EMI hot-bands meet target with fixed profile_id; errors_per_1k ≤ X and no burst_count regression beyond X.
▸ Increasing drive improves errors but worsens EMI — how to balance with a single metric set?
Likely cause: objective function is undefined—drive raises eye margin but spikes spectral peaks; “good BER” and “good EMI” are not gated together.
Quick check: use a 3-metric gate: errors_per_1k (window W) + EMI_peak at hot-band + retrain_count/hour; compare profiles by profile_id only.
Fix: bind drive to shaping/EQ as one profile, set hard upper bound for drive, and require EMI+retrain gates before accepting any “BER improvement”.
Pass criteria: errors_per_1k ≤ X, EMI_peak ≤ X, retrain_count/hour ≤ X under the same harness_id and corners.
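A joint gate makes the trade-off explicit: a profile is acceptable only if all three metrics pass in the same run. The sketch below is one possible formulation. The key names (`emi_peak_dbuv`, `retrain_per_hour`), the dBµV unit, and the tie-break of preferring the lowest EMI peak among passing profiles are assumptions, not rules taken from this page.

```python
from typing import Dict, Optional

METRICS = ("errors_per_1k", "emi_peak_dbuv", "retrain_per_hour")


def passes_gate(m: Dict[str, float], limits: Dict[str, float]) -> bool:
    """All three metrics must be within their limits in the same run."""
    return all(m[k] <= limits[k] for k in METRICS)


def pick_profile(results: Dict[str, Dict[str, float]],
                 limits: Dict[str, float]) -> Optional[str]:
    """Among profiles passing the joint gate, prefer the lowest EMI peak (ties by profile_id)."""
    passing = [pid for pid, m in results.items() if passes_gate(m, limits)]
    if not passing:
        return None
    return min(passing, key=lambda pid: (results[pid]["emi_peak_dbuv"], pid))
```

A profile with the best error rate but a failing EMI peak is rejected outright, which prevents accepting a "BER improvement" that only moves the problem.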
▸ Training fails sporadically after a TVS change — suspect Cdiff or ΔC mismatch first?
Likely cause: TVS adds too much differential capacitance (Cdiff) or pair mismatch (ΔC) creates mode conversion that breaks training margin.
Quick check: A/B compare old vs new TVS on the same board/harness; flag if failures correlate with direction, temperature, or only certain batch_id.
Fix: revert to lower-C/low-ΔC TVS, keep placement symmetric and as close as possible to the connector with a controlled return path.
Pass criteria: train_success ≥ X%, train_time_ms P95 ≤ X, retrain_count/hour ≤ X across corners.
▸ Adding CMC improves EMI but sporadic link drops appear — check DM insertion loss or CM saturation first?
Likely cause: CMC adds differential-mode loss/bandwidth limitation or saturates under certain CM bias/events, causing nonlinear bursts.
Quick check: correlate drops with burst_count + temp_c/vdd_mv; if failures cluster with high-current events or temperature rise, suspect saturation; otherwise suspect insertion loss.
Fix: select CMC validated for T1 bandwidth, place per reference design, and avoid pairing “aggressive shaping + marginal CMC” (freeze safe profile_id).
Pass criteria: EMI target met and retrain_count/hour ≤ X; burst_count ≤ X in W; errors_per_1k ≤ X.
▸ Link is stable at low temp but drops at high temp — TX amplitude drift or power-noise coupling first?
Likely cause: temperature shifts TX edge/level enough to reduce margin or supply/ground noise coupling increases at high temp (decoupling/return weakness).
Quick check: plot temp_c vs train_time_ms P95 and errors_per_1k for fixed harness_id/profile_id; if VDD variation co-moves with errors, suspect power coupling.
Fix: choose a temperature-robust shaping profile, then harden local decoupling and return continuity near the PHY and MDI network.
Pass criteria: across temp range: retrain_count/hour ≤ X, errors_per_1k ≤ X, train_time_ms P95 ≤ X.
▸ One harness batch has many issues — what are the two fastest correlation checks (log fields)?
Likely cause: harness electrical variation (impedance/return loss) or assembly/shield bond variation causing CM susceptibility.
Quick check: (1) batch_id/topology_variant vs train_time_ms P95 + errors_per_1k; (2) connector_id/shield_bond_rev vs false_wake + burst_count (same window W).
Fix: enforce required fields (batch_id, topology_variant, shield_bond_rev), quarantine failing batch, and tighten harness gate (stub<X, RL≥X).
Pass criteria: failing batches identified within X minutes of logs; replacement batch restores errors_per_1k ≤ X and retrain_count/hour ≤ X.
▸ Error counters spike but scope looks “fine” — metric window/denominator mismatch or burst mechanism?
Likely cause: counter is interpreted with the wrong window/denominator or errors occur as short bursts that the scope never triggered on.
Quick check: align the counter window W to scope trigger capture; check burst_count with a fixed definition (≥M errors within T ms) and snapshot at every burst.
Fix: standardize metric definitions (errors_per_1k, burst_count) and add burst-triggered captures/log snapshots tied to profile_id and harness_id.
Pass criteria: errors_per_1k ≤ X and burst_count ≤ X in W; no unexplained spikes after definition unification.
▸ ASIL diagnostic coverage is weak — which PHY-side evidence fields are most often missing?
Likely cause: evidence chain breaks because mandatory PHY observables are not logged consistently per event (wake/retrain/burst).
Quick check: for each event type, verify presence of: profile_id, wake_reason, train_time_ms, errors_per_1k, burst_count, retrain_count, temp_c, vdd_mv (completeness audit).
Fix: define an evidence schema + snapshot timing rules (on wake, on retrain, on burst), then connect to MCU safety monitor → DTC/log → safe state triggers.
Pass criteria: evidence completeness ≥ X% and event-to-DTC latency ≤ X ms; safe-state action taken within X when required.
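The completeness audit can be run over logged events as a simple field-presence check. The sketch below assumes each event is a dict and treats a missing key or a `None` value as incomplete. The required field list is taken from the quick check above; the function names are hypothetical.

```python
from typing import Dict, List

REQUIRED = ("profile_id", "wake_reason", "train_time_ms", "errors_per_1k",
            "burst_count", "retrain_count", "temp_c", "vdd_mv")


def completeness_pct(events: List[Dict[str, object]]) -> float:
    """Percentage of events that carry every required field with a non-None value."""
    if not events:
        return 0.0
    complete = sum(1 for e in events if all(e.get(f) is not None for f in REQUIRED))
    return 100.0 * complete / len(events)


def missing_by_field(events: List[Dict[str, object]]) -> Dict[str, int]:
    """Count how often each required field is absent, to show which one is dropped most."""
    counts = {f: 0 for f in REQUIRED}
    for e in events:
        for f in REQUIRED:
            if e.get(f) is None:
                counts[f] += 1
    return counts
```

Running `missing_by_field` per event type (wake, retrain, burst) points to the snapshot rule that needs fixing, rather than only reporting an aggregate percentage.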
▸ After wake, immediate repeated retraining occurs — retrain thresholds or over-aggressive EMC shaping?
Likely cause: retrain trigger threshold is too sensitive or EMC shaping reduces margin below what wake transient conditions require.
Quick check: inspect retrain_after_wake within S seconds and compare profiles by profile_id; if retrains cluster only on one aggressive profile, suspect shaping.
Fix: relax retrain thresholds within bounded limits and revert to a known-safe shaping profile for wake transitions; lock the wake profile_id.
Pass criteria: retrain_after_wake ≤ X, wake_success ≥ X%, errors_per_1k ≤ X in the first W seconds post-wake.
▸ Multi-port PHY: only one port is fragile — layout/return asymmetry or parameter inconsistency first?
Likely cause: per-port layout/return path asymmetry (MDI/protection/ground) or profile skew (different shaping/TC10/counter settings).
Quick check: audit profile_id equality across ports first; then compare port-to-port deltas for train_time_ms P95 and errors_per_1k under the same harness_id and window W.
Fix: enforce identical profile_id across ports; if fragility remains, correct return continuity and protection symmetry (placement + shortest CM return).
Pass criteria: port-to-port delta ≤ X (train_time_ms P95, errors_per_1k, retrain_count/hour); no single-port outliers across corners.
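The two-step triage (profile audit first, then port-to-port deltas) can be sketched as below. The sketch assumes all metrics are "lower is better" and measures each port against the best port. Comparing against a median would be an equally valid choice. Function names and dict layouts are hypothetical.

```python
from typing import Dict, List


def profile_skew(profile_ids: Dict[str, str]) -> bool:
    """True if ports do not all share the same profile_id."""
    return len(set(profile_ids.values())) > 1


def port_outliers(per_port: Dict[str, Dict[str, float]],
                  max_delta: Dict[str, float]) -> Dict[str, List[str]]:
    """For each metric, list ports that exceed the best port by more than max_delta."""
    if not per_port:
        return {}
    outliers: Dict[str, List[str]] = {}
    for metric, allowed in max_delta.items():
        best = min(p[metric] for p in per_port.values())
        bad = sorted(port for port, p in per_port.items() if p[metric] - best > allowed)
        if bad:
            outliers[metric] = bad
    return outliers
```

If `profile_skew` returns true, the delta comparison is not yet meaningful; fix the configuration first, then re-measure under the same harness_id and window W.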