CAN XL PHY (~10 Mbps): Triple-Speed Compatibility & Low-EMI
CAN XL PHY makes ~10 Mbps feasible by combining Classic/FD/XL coexistence with tighter physical-layer timing and topology discipline. Success is proven by measurable window margin and harness-level evidence—not by bench waveforms alone.
CAN XL PHY: what it is, what it solves, and scope guard
CAN XL PHY enables a higher-speed data domain (~10 Mbps) while remaining compatible with Classic CAN and CAN FD networks—at the cost of much tighter physical-layer timing, topology, and emission constraints.
- Timing margin shrinks: loop-delay symmetry and propagation uncertainty consume the sample-point window faster at ~10 Mbps.
- Topology becomes stricter: stubs, harness capacitance, and reflections that were “fine” at lower rates can dominate bit errors.
- EMI trade-offs sharpen: emission-suppressed transmit shaping reduces radiation, but can also reduce eye-opening if margins are not budgeted.
- Triple-speed coexistence from a PHY viewpoint (Classic/FD/XL)
- ~10 Mbps timing budget: loop delay, symmetry, sample window erosion
- Low-EMI TX shaping strategy and margin-aware validation hooks
- SIC/SIC-XL waveform deep-dive (use the SIC subpage)
- CMC / split-termination / TVS selection details (use EMC/Protection)
- Controller / protocol / gateway / DoIP (use CAN Controller/Bridge)
- Classic/FD is stable, but errors appear only when switching into XL.
- A harness/topology change flips pass/fail, even though silicon looks “identical”.
- EMI improves after shaping, but bit errors increase (margin vs shaping conflict).
Where CAN XL fits: triple-speed domains and system-level patterns
CAN XL is justified when the system needs a higher-speed data domain, and the harness/topology/EMI constraints can be engineered and validated. Otherwise, a slower domain often delivers higher end-to-end reliability at lower integration cost.
- Need: Is the payload/latency requirement genuinely above what Classic/FD can deliver in the target topology?
  Output: If FD capacity is sufficient, avoid XL to reduce harness constraints and validation burden.
- Harness reality: Can stub length, branch count, and connector parasitics be controlled to protect the ~10 Mbps sample window?
  Output: If topology cannot be improved, XL failure modes become reflection- and delay-driven.
- EMI budget: Can emission targets be met with margin-aware TX shaping and system co-design (routing/returns/termination policy)?
  Output: If shaping fixes EMI but breaks link margin, the design needs a timing/topology re-balance.
- Shorter harness segments → higher XL feasibility
- Explicit margin validation on real harness
- Edge shaping tuned to the cluster harness
- Topology discipline: stubs must stay short
- Coexistence stress: reflections and asymmetry
- Validate mode transitions under worst-case loads
- Topology risk: long stubs / many branches / unpredictable connectors that cannot be controlled or segmented.
- Timing risk: unknown or drifting loop-delay symmetry across temperature/aging, leaving no sample-window margin.
- EMI risk: stringent emission targets require aggressive shaping that collapses margin, and system remediation is not possible.
PHY building blocks: driver, receiver, common-mode control, and host interface
A CAN XL PHY can be treated as four cooperating blocks—mode/encode gate, TX shaping, RX decision, and host interface—plus side-channel protection hooks. This decomposition enables margin budgeting and fast failure attribution without drifting into controller/register details.
TX path (shaping & emissions)
- Knobs: Slew step, drive strength, symmetry trim.
- Trade-off: Lower EMI often means smaller timing margin.
- Observe: rise/fall mismatch, edge “ringing”, error sensitivity vs shaping level.
Practical goal: reduce common-mode excitation while preserving a stable sample window on real harnesses.
RX path (decision stability)
- Mechanisms: threshold/hysteresis, common-mode rejection, input filtering.
- Failure signature: errors cluster with CM noise, temperature, or supply events.
- Observe: decision jitter, CM disturbances, correlation with temperature/voltage.
Practical goal: maintain stable decision boundaries even when harness CM noise and reflections rise at higher speed.
Host interface & protection hooks
- Host rate: higher throughput requires predictable PHY latency (no register deep-dive).
- Fail-safe: defined behavior under open/short/undefined bus states.
- Dominant timeout: prevents a stuck-dominant condition from monopolizing the bus.
- Thermal: shutdown/recovery behavior must be observable and attributable.
Practical goal: expose clear flags/counters so failures can be attributed to protection triggers vs true margin loss.
- Errors change strongly with shaping level → TX/topology/EMI coupling is dominant.
- Errors track temperature/voltage → RX threshold/latency drift is likely dominant.
- Timeout/thermal flags appear → protection-triggered drops must be separated from margin failures.
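The three attribution rules above can be written down as a small triage helper. This is only a sketch: the function name, the boolean inputs, and the returned labels are illustrative assumptions, and deciding that errors "change strongly" or "track" a variable is left to the engineer's own thresholds.

```python
def triage(shaping_sensitive: bool,
           tracks_temp_or_voltage: bool,
           protection_flags: bool) -> list:
    """Map observed symptoms to the likely dominant failure class.

    Protection flags are listed first because protection-triggered drops
    must be separated out before margin failures are judged.
    """
    hints = []
    if protection_flags:
        hints.append("protection-triggered")
    if shaping_sensitive:
        hints.append("tx/topology/emi coupling")
    if tracks_temp_or_voltage:
        hints.append("rx threshold/latency drift")
    return hints or ["undetermined"]


# Example: errors move with shaping level only.
print(triage(shaping_sensitive=True,
             tracks_temp_or_voltage=False,
             protection_flags=False))
```

More than one label can come back at once, which is a prompt to hold one variable fixed and repeat the run.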
Backward/forward compatibility: coexistence rules and failure modes
Compatibility claims do not guarantee stability on mixed networks. Real-world coexistence is gated by PHY constraints: waveform symmetry, loop-delay consistency, termination policy, and disciplined mode transitions on real harnesses.
Legacy node distortion
- Symptom: XL errors appear only after adding an older node.
- PHY cause: edge deformation increases reflections and common-mode excitation.
- Quick discriminator: errors rise when shaping becomes “faster” or when a stub is activated.
- Fix direction: segment topology or enforce stricter edge/termination discipline.
Mixed harness reflections
- Symptom: short harness passes; long/branched harness fails at XL.
- PHY cause: stub/branch reflections collapse the sample window at ~10 Mbps.
- Quick discriminator: errors correlate with harness variant, branch activation, or connector swaps.
- Fix direction: tighten stub rules, refine termination policy, or segment the network.
Mode transition transient
- Symptom: stable in steady-state, but errors cluster at mode switches.
- PHY cause: transient shaping/latency mismatch triggers mis-decision under tight margins.
- Quick discriminator: errors align with transition events more than with steady bus load.
- Fix direction: discipline transitions, validate worst-case timing under real harness stress.
- Waveform symmetry: avoid unnecessary common-mode excitation and zero-crossing jitter.
- Loop-delay symmetry: keep timing uncertainty from consuming the sample-point window.
- Termination policy: ensure reflections are damped and consistent across variants.
- Mode-transition discipline: verify switch events under worst-case topology and EMI conditions.
Timing budget at ~10 Mbps: loop delay symmetry, sample-point window, propagation segments
At ~10 Mbps, the bit time is only 100 ns. The remaining sample-point window is what survives after subtracting harness propagation, TX/RX delays, asymmetry, and uncertainty. Stability depends on budgeting and measuring these terms on the real harness variant, not on a short bench cable.
- Bit time: Tbit = 1 / 10 Mbps = 100 ns.
- Sample-point window: the time region where the receiver decision is stable under worst-case harness and noise.
- Budget principle: keep an explicit “uncertainty bucket” so drift and unmodeled effects do not become hidden margin killers.
- Tprop — harness propagation (length + velocity factor).
- Ttx — TX path delay (incl. drift vs temperature/voltage/lot).
- Trx — RX path delay (incl. decision pipeline and drift).
- Tasym — asymmetry penalty (direction/edge/path mismatch consuming the window).
- Tunc + Tmargin — uncertainty bucket + explicit engineering margin.
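As a rough illustration of how these terms combine, the sketch below treats the budget as a simple linear subtraction from one bit time. That linear model, the class and function names, and every number are assumptions made for illustration; they are not values from a standard or a datasheet, and Tprop must follow whichever one-way or round-trip definition the project has fixed.

```python
from dataclasses import dataclass


def bit_time_ns(bitrate_bps: float) -> float:
    """Nominal bit time in nanoseconds for a given bit rate."""
    return 1e9 / bitrate_bps


@dataclass
class TimingBudget:
    """Budget terms in nanoseconds (simplified linear model)."""
    t_bit: float     # nominal bit time
    t_prop: float    # harness propagation
    t_tx: float      # TX path delay
    t_rx: float      # RX path delay
    t_asym: float    # asymmetry penalty
    t_unc: float     # uncertainty bucket
    t_margin: float  # explicit engineering margin

    def consumed(self) -> float:
        return (self.t_prop + self.t_tx + self.t_rx
                + self.t_asym + self.t_unc + self.t_margin)

    def window_remaining(self) -> float:
        """What is left of the bit time after all budget terms."""
        return self.t_bit - self.consumed()


# Placeholder numbers only, not device or harness data.
budget = TimingBudget(t_bit=bit_time_ns(10e6), t_prop=20, t_tx=15, t_rx=15,
                      t_asym=5, t_unc=10, t_margin=10)
print(f"Twindow_remaining = {budget.window_remaining():.1f} ns")
```

Keeping Tunc and Tmargin as separate fields makes it visible when drift data starts eating into the engineering margin.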
1. Measure Tprop: estimate propagation per harness variant and compute total propagation for the longest path.
   Output: Tprop_total and a consistent one-way vs round-trip definition.
2. Measure loop delay: align a TX-side event with an RX-side decision event using the same trigger definition.
   Output: Tloop_total, plus a clear statement of what constitutes “start” and “decision”.
3. Measure Tasym: compare dominant↔recessive edge timing or two direction paths and take worst-case mismatch.
   Output: Tasym = max(ΔTedge) as a window-eating penalty term.
4. Sweep drift (cold/hot): repeat 1–3 at temperature/voltage corners and build an uncertainty envelope.
   Output: Tunc bucket to prevent hidden margin loss.
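The measurement outputs can be reduced to budget terms with a few one-line helpers. The sketch below is a minimal example under stated assumptions: the function names are hypothetical, the per-metre delay is whatever the harness supplier or a TDR measurement gives, and Tunc is taken here as the worst deviation of corner measurements from the nominal loop delay, which is one possible envelope definition among several.

```python
def t_prop_total_ns(length_m: float, delay_ns_per_m: float,
                    round_trip: bool = False) -> float:
    """Propagation for the longest path; state one-way vs round-trip explicitly."""
    one_way = length_m * delay_ns_per_m
    return 2.0 * one_way if round_trip else one_way


def t_asym_ns(edge_deltas_ns) -> float:
    """Tasym = max(|dT_edge|) over all measured edge or direction mismatches."""
    return max(abs(d) for d in edge_deltas_ns)


def t_unc_ns(nominal_loop_ns: float, corner_loop_ns: dict) -> float:
    """Worst deviation of any corner measurement from the nominal loop delay."""
    return max(abs(v - nominal_loop_ns) for v in corner_loop_ns.values())


# Placeholder measurements for illustration only.
print(t_prop_total_ns(4.0, 5.0, round_trip=True))      # longest path, round trip
print(t_asym_ns([1.5, -3.0, 2.0]))                     # signed mismatches in ns
print(t_unc_ns(60.0, {"cold": 57.0, "hot": 66.0}))     # corner loop delays in ns
```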
Emission-suppressed transmit stages: shaping strategy vs robustness trade-offs
Faster edges do not automatically improve link quality. Aggressive edge rate increases radiation and crosstalk, while overly slow shaping can collapse the remaining sample window. Practical shaping selects a level that passes EMI targets while preserving timing margin on the worst harness variant.
- Faster edge: more high-frequency content → higher EMI and stronger coupling into nearby conductors.
- Slower edge: reduced spectral content but longer transition time → smaller decision window at ~10 Mbps.
- Engineering target: pass EMI while keeping Twindow_remaining above the required threshold (see the timing budget).
- Slew control: reduces dV/dt-driven radiation; too slow shrinks timing margin.
- Drive strength: improves load adaptation and edge control; too strong excites reflections.
- Symmetry trim: reduces common-mode excitation; improves mixed-network robustness.
- Short, controlled harness: prefer MaxMargin or Balanced to protect the decision window.
- Moderate branches / connector variants: default to Balanced and validate mode transitions on the worst variant.
- Long/heavy harness or mixed legacy nodes: prefer Aggressive shaping plus symmetry trim, and segment the network if the window collapses.
Boundary note: external CMC/TVS selection and placement belong to the EMC/Protection subpage; this section focuses on internal shaping strategy.
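The engineering target (pass EMI while keeping Twindow_remaining above threshold) amounts to a filter-then-pick step over a shaping sweep. The sketch below assumes each sweep point has been reduced to one worst-case emission figure and one measured window on the worst harness variant; the dictionary keys, the function name, and all numbers are hypothetical placeholders.

```python
def pick_shaping_level(sweep, emi_limit_db: float, min_window_ns: float):
    """Return the lowest-emission level that meets both limits, or None.

    sweep: list of dicts with keys 'level', 'emi_db', 'window_ns'
    (hypothetical field names; values come from the user's own measurements).
    """
    passing = [p for p in sweep
               if p["emi_db"] <= emi_limit_db and p["window_ns"] >= min_window_ns]
    if not passing:
        return None  # no level satisfies both: rebalance timing/topology first
    return min(passing, key=lambda p: p["emi_db"])["level"]


# Placeholder sweep results on the worst harness variant.
sweep = [
    {"level": "Aggressive", "emi_db": 30.0, "window_ns": 8.0},
    {"level": "Balanced",   "emi_db": 36.0, "window_ns": 18.0},
    {"level": "MaxMargin",  "emi_db": 44.0, "window_ns": 26.0},
]
print(pick_shaping_level(sweep, emi_limit_db=40.0, min_window_ns=15.0))
```

A None result is the useful outcome here: it signals that shaping alone cannot satisfy both constraints and that the harness or segmentation has to change.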
Network constraints: termination, stubs, harness capacitance, and topology limits for XL
At ~10 Mbps, XL failures are frequently topology-driven: reflections and ringing inflate zero-crossing jitter, while stubs and harness loading shrink the decision window. This section provides practical rules and fast checks to separate termination issues, stub-driven eye collapse, and load-capacitance constraints, without expanding into EMC component design details.
Termination: damping vs reflection
- Role: absorb energy and suppress ringing on the trunk.
- Failure signature: errors change strongly with harness length/variant or connector swaps.
- Fast check: compare worst-variant error behavior with and without selected branches connected.
Boundary: CMC / split-termination component details belong to the EMC subpage.
Stubs & T-branches: XL is stricter
- Mechanism: stubs create reflection sources that arrive near the sample window.
- Symptoms: eye collapse, ringing, zero-crossing jitter, mode-switch error clustering.
- Rule template: keep stubs short (X cm placeholder) and avoid uncontrolled T-branches.
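To see why stub length matters at a 100 ns bit time, it helps to estimate when a stub reflection returns to the trunk. The sketch below uses the basic relation that a reflection travels the stub twice; the 5 ns/m propagation delay is an assumed placeholder that should be replaced with the real cable figure, and the function names are illustrative.

```python
def stub_round_trip_ns(stub_length_cm: float, delay_ns_per_m: float = 5.0) -> float:
    """Time for a wave to travel down an unterminated stub and back."""
    return 2.0 * (stub_length_cm / 100.0) * delay_ns_per_m


def fraction_of_bit(delay_ns: float, bitrate_bps: float = 10e6) -> float:
    """Express a delay as a fraction of the nominal bit time."""
    return delay_ns / (1e9 / bitrate_bps)


for length_cm in (10.0, 50.0, 100.0):
    rt = stub_round_trip_ns(length_cm)
    print(f"{length_cm:5.0f} cm stub: {rt:4.1f} ns round trip, "
          f"{100.0 * fraction_of_bit(rt):4.1f}% of bit time")
```

The estimate only covers the first reflection; repeated bounces and connector discontinuities extend the disturbance, which is why the rule is validated on the harness rather than computed.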
Harness capacitance & loading
- Effect: heavier loading slows edges and reduces Twindow_remaining.
- Interaction: EMI-friendly shaping may further shrink timing margin on heavy harnesses.
- Fast check: if one shaping step changes errors dramatically, topology/loading is likely dominating the window.
- Worst-variant first: validate on the longest, heaviest, most-branched harness variant—not a short bench cable.
- Branch isolation test: disconnect a suspect branch and observe whether errors disappear (stub-driven reflection).
- Connector sensitivity: errors that follow connector swaps suggest impedance discontinuity and reflection hot-spots.
- Termination sanity: verify end nodes are true endpoints; “almost-endpoint” placements often behave as long stubs.
- Shaping sensitivity: if slower shaping increases errors, the decision window is being consumed—do not keep slowing edges blindly.
- Remediation ladder: shorten stubs → segment topology → re-evaluate termination policy → then tune shaping.
- Stub rule: all stubs ≤ X cm (platform threshold placeholder).
- Worst variant: passes error criteria within X per Y minutes on the worst harness.
- No clustering: no error bursts on branch connect/disconnect or mode transitions.
- Window tie-in: Twindow_remaining ≥ X ns after accounting for topology/loading (see timing budget).
Robustness & protections at PHY level: shorts, ESD, common-mode range, thermal behavior
Vehicle harnesses impose real stress: short-to-battery/ground events, ESD injections, large common-mode swings, and thermal overload. A robust PHY must survive, react predictably (limit/shutdown/retry), and expose clear logging hooks so protection triggers can be separated from true margin loss.
Shorts & recovery behavior
- Events: short-to-VBAT, short-to-GND, miswires.
- PHY actions: current limiting, shutdown, timed retry, dominant-timeout.
- Recoverability: depends on fault removal and thermal cooldown.
ESD: pass vs “degrade”
- Immediate: reset/drop or latch protection event.
- After-effect: higher error sensitivity indicates margin shrink.
- Judgment: compare pre/post error curves on the same worst harness.
Boundary: TVS selection and placement belong to the EMC subpage.
Common-mode & ground offset
- Impact: RX decisions become unstable when CM swings push beyond tolerance.
- Signature: errors correlate with load steps, ground shifts, or HV-domain activity.
- Note: isolation may be required when domain ground potential differences are large.
- Event type: short / thermal / ESD suspected / CM out-of-range.
- Counters: occurrences, duration, and timestamp of each trigger.
- Context: PHY mode, shaping level, and whether a mode transition occurred.
- Outcome: recovered (Y/N), cooldown time, post-recovery error sensitivity shift.
Diagnostics hooks: what to log, what counters matter, and how to attribute failures
Field failures become actionable only when symptoms are captured as structured logs. A minimal, repeatable log schema should answer: when the event happened, under what load/mode it happened, and under what environment/harness state it happened. With these three dimensions, failures can be attributed to timing margin, SI/reflection, or protection-triggered behavior.
PHY-visible signals (abstract, controller-agnostic)
- dominant-timeout
- thermal / shutdown
- retry / limit active
- standby / sleep
- selective wake status
- wake source tag
- error pin / interrupt
- fault latch
- mode transition marker
Boundary: protocol/controller register details are intentionally excluded; this section focuses on PHY-facing observability and cross-layer aggregation fields.
Required logs (three dimensions)
- t_start / t_end (timestamp)
- event_type (drop/burst/transition)
- duration_ms, burst_count
- bus_util_% (windowed)
- burstiness (peak/avg)
- mode_state (Classic/FD/XL)
- shaping_level (A/B/C)
- temp_C, vbat_V, vio_V
- harness_variant_id
- connector_state (abstract)
Engineering note: if one dimension is missing (time / load / environment), attribution collapses into guesswork and the same issue repeats across variants.
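One way to enforce the three-dimension rule is to reject any log record that lacks a field from one of the dimensions. The sketch below reuses the field names listed above as dictionary keys; which fields count as mandatory per dimension is an assumption made here, as is the function name.

```python
# Minimum fields per dimension (an assumed subset of the schema above).
DIMENSIONS = {
    "time": ("t_start", "t_end", "event_type"),
    "load": ("bus_util_%", "mode_state", "shaping_level"),
    "environment": ("temp_C", "vbat_V", "harness_variant_id"),
}


def missing_dimensions(record: dict) -> list:
    """Return the dimensions for which at least one required field is absent."""
    return [name for name, fields in DIMENSIONS.items()
            if any(record.get(f) is None for f in fields)]


# Placeholder record for illustration only.
record = {
    "t_start": 1200.00, "t_end": 1200.35, "event_type": "burst",
    "bus_util_%": 62.0, "mode_state": "XL", "shaping_level": "B",
    "vbat_V": 13.6, "harness_variant_id": "HV-03",
}
print(missing_dimensions(record))  # temp_C is absent in this record
```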
High-value counters (small set, high attribution power)
- protect_trigger_count (windowed)
- dominant_timeout_count
- thermal_event_count (enter/exit)
- recovery_success_rate
- error_burst_count (per Y minutes)
- mode_transition_error_count
- wake_false_count (if used)
- post_event_sensitivity_shift
These counters map directly into the attribution tree: timing-driven, SI/reflection-driven, or protection-triggered.
Pass criteria templates (threshold placeholders)
- error_burst_count ≤ X per Y minutes (worst harness variant)
- mode_transition_error_count ≤ X within window Y
- protect_trigger_count = 0 (or ≤ X) during validation
- post_event_sensitivity_shift ≤ X after stress/recovery
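These templates translate directly into an upper-limit check over the counter set. In the sketch below the limit values stand in for the X placeholders and are not recommendations; the function name is hypothetical, and a counter that was never logged is treated as a failure rather than as zero.

```python
def failed_criteria(counters: dict, limits: dict) -> list:
    """Return the names of counters that exceed their limit or were not logged."""
    failed = []
    for name, limit in limits.items():
        value = counters.get(name)
        if value is None or value > limit:
            failed.append(name)
    return failed


# Placeholder limits standing in for X; set per platform.
limits = {
    "error_burst_count": 2,
    "mode_transition_error_count": 0,
    "protect_trigger_count": 0,
    "post_event_sensitivity_shift": 0.05,
}
# Placeholder counters from one validation window on the worst harness variant.
counters = {
    "error_burst_count": 1,
    "mode_transition_error_count": 3,
    "protect_trigger_count": 0,
}
print(failed_criteria(counters, limits))
```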
Validation plan: how to measure XL margins on real harness (bench ≠ vehicle)
A credible XL validation plan must be layered: IC-level establishes a clean baseline, PCB-level introduces connector and protection parasitics, and harness-level validates the real topology and environmental corners. The delivery decision belongs to harness-level results, not short-cable bench runs.
Layered validation: A → B → C
- A) IC-level: short cable, idealized load → isolate PHY baseline capability.
- B) PCB-level: connectors, ESD structures, layout parasitics → ensure board implementation does not consume margin.
- C) Harness-level: real stubs/loads/temperature variants → determine real Twindow_remaining and robustness.
Required measurements (margin-focused)
- Waveform symmetry: confirm symmetry stability across load, variant, and temperature corners.
- Loop delay: measure a consistent Tloop definition and track drift across corners.
- Sample-point margin: compute Twindow_remaining using the timing budget framework (threshold X placeholder).
- Shaping robustness: validate three shaping presets (A/B/C) on the worst harness variant and record sensitivity curves.
SOP (execution order)
- Define worst harness variant: length, stubs, node count, connector version.
- Freeze logging schema: time/load/environment fields (aligned with diagnostics hooks).
- Run A-level baseline: verify symmetry, delay stability, and baseline margin.
- Run B-level baseline: repeat with connectors and protection parasitics present.
- Run C-level worst case: test three shaping levels on worst harness and compute Twindow_remaining.
- Sweep temperature corners: repeat C-level critical runs (cold/hot) and capture drift.
- Check transition clustering: count errors around mode transitions in fixed windows.
- Export evidence pack: KPI summary + window stats + log excerpts per variant.
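The transition-clustering step can be automated by counting how many error timestamps fall inside a fixed window around any mode transition. This is a minimal sketch assuming both event streams share one time base in seconds; the function name and the window width are illustrative choices.

```python
from bisect import bisect_left, bisect_right


def errors_near_transitions(error_ts, transition_ts, window_s: float):
    """Split errors into (near a transition, steady-state) counts.

    An error is 'near' if it lies within +/- window_s of any transition.
    Each error is counted once even if two transition windows overlap.
    """
    errors = sorted(error_ts)
    near = set()
    for t in transition_ts:
        lo = bisect_left(errors, t - window_s)
        hi = bisect_right(errors, t + window_s)
        near.update(range(lo, hi))
    return len(near), len(errors) - len(near)


# Placeholder timestamps in seconds.
errors = [1.00, 1.02, 5.00, 9.99, 10.01]
transitions = [1.0, 10.0]
print(errors_near_transitions(errors, transitions, window_s=0.05))
```

A near count that dominates the steady-state count points at the mode-transition transient rather than at steady-state margin.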
Common measurement traps (avoid false confidence)
- Probe ground bounce: can create fake ringing; keep return paths controlled and consistent.
- Bandwidth-limited “smoothness”: can hide edge stress; verify measurement bandwidth and trigger definitions.
- Trigger window misses transients: capture mode transitions and burst windows explicitly.
- Mixed reference points: do not mix TXD vs bus vs RX decision points without redefining delay metrics.
Applications (CAN XL PHY-focused): where XL wins and how systems deploy it
Covers: XL deployment patterns, PHY constraints, and verification hooks (timing window, topology control, shaping sweep, attribution logs).
Does NOT cover: controller/register details, SIC waveform deep-dive, or discrete EMC BOM (CMC/TVS/split termination).
CAN XL PHY delivers value when a controlled segment can be created: stable loop-delay symmetry, managed stubs/termination, and a repeatable harness evidence pack. XL success is determined more by topology control + verification discipline than by nominal data rate.
- PHY constraints: common-mode excursions and harsh fault profiles demand stable receiver decision and explainable protection behavior (thermal/short recovery).
- Deployment pattern: keep XL as a short, clearly terminated segment; avoid uncontrolled long stubs and variant-heavy branches.
- Verification hooks: temperature corners + burst load; correlate faults with protection flags and shaping states (attribution-first).
Example material numbers (verify package/suffix/availability)
- CAN XL / SIC XL PHY: TCAN6062-Q1, TJA1483A, NT156
- Port ESD (low-C examples): ESD2CANXL24-Q1, ESDCAN24-2BLY
- PHY constraints: mixed domains succeed only when segment boundaries are clean (termination, connectors, harness variants).
- Deployment pattern: treat XL as a “managed link” inside the zone; use gateways to isolate uncontrolled branches and legacy domains.
- Verification hooks: create a harness evidence pack per variant_id (loop delay, symmetry, shaping sweep, and transient capture).
Example material numbers (verify package/suffix/availability)
- CAN SIC XL PHY: TJA1482A, TJA1483A, NT156
- Mixed-bus sleep shielding (XL passive feature examples): TJA1446, TJA1465
- PHY constraints: short-run clusters tolerate higher rate when stubs are disciplined; long irregular harness branches collapse the sampling window first.
- Deployment pattern: keep clusters local; upgrade segments selectively instead of “XL everywhere”.
- Verification hooks: run shaping sensitivity curves and mode-transition stress in the exact connector + protection stack.
Example material numbers (verify package/suffix/availability)
- CAN XL PHY: TCAN6062-Q1
- CAN SIC XL PHY: TJA1482A, NT156
- Port ESD (examples): ESD2CANXL24-Q1, ESDCAN06-2BLY
IC selection logic (CAN XL PHY): the spec checklist + decision flow
Selection must map harness reality to PHY controllability. Prioritize observable timing margin, shaping flexibility, and failure attribution over datasheet-only “max bitrate” claims.
- Loop-delay symmetry: require stable TX/RX delay balance across voltage/temperature; verify on harness with repeatable method.
- Delay drift visibility: prefer devices that expose state/flags enabling timing-vs-environment correlation.
- Mode-transition stability: demand predictable behavior during domain switching; validate with transient capture and counters.
- Shaping steps: require multi-step slew/drive control to tune margin vs EMI on real harness.
- Short-to-VBAT/GND behavior: ensure defined protection action and recovery conditions (thermal throttling, retry policy).
- Common-mode tolerance: confirm receiver stability under ground offset and noisy return paths; isolate only when domain offsets are unavoidable.
- Fail-safe defaults: verify bus behavior during undervoltage, open, and controller reset scenarios.
- Protection flags: dominant timeout, thermal, wake/standby state—must be capturable with timestamps.
- Event counters: prefer counters that separate timing issues vs protection-triggered behavior.
- Attribution traceability: ability to link failures to shaping state, mode transition, and environment.
- Slew steps: multi-level edge control enables harness-dependent tuning.
- Drive strength: adjustable or segmented drive helps balance reflections vs emissions.
- Symmetry strategy: aim to reduce common-mode excitation; validate by comparing margin under shaping sweeps.
- Topology tolerance: match device capability to stub limits and harness capacitance reality.
- Connector/protection parasitics: ensure the PHY remains stable with the real ESD/protection stack.
- Evidence pack readiness: require a repeatable sign-off set: loop delay, symmetry, shaping sweep, and failure attribution logs.
Output profiles (capability bundles) with example material numbers
Choose when fault stress and explainable recovery dominate. Prioritize protection observability and stable receiver behavior under common-mode events.
- PHY examples: NT156, TJA1483A, TCAN6062-Q1
- ESD examples: ESD2CANXL24-Q1, ESDCAN24-2BLY
Default choice when both EMI and margin matter. Require multi-step shaping and a clean sign-off evidence pack on representative harness.
- PHY examples: TCAN6062-Q1, TJA1482A
- Mixed-bus sleep shielding (optional): TJA1446, TJA1465
Choose only when topology is controlled and evidence proves sufficient timing window after aggressive shaping. Avoid in variant-heavy or stub-rich harness.
- PHY examples: TCAN6062-Q1, TJA1482A
- ESD examples (lower C preferred): ESD2CANXL24-Q1
FAQs (CAN XL PHY): troubleshooting without expanding scope
Each answer is a fixed 4-line micro-SOP. All checks stay at the PHY level (timing window, topology sensitivity, shaping trade-offs, protection behavior, measurement pitfalls).
harness_variant_id · mode_state(Classic/FD/XL) · shaping_level(Aggressive/Balanced/MaxMargin) · error_burst_count(/window) · mode_transition_error_count(/transitions) · protect_trigger_count(thermal/timeout) · Twindow_remaining(ns or %bit) · Tloop_symmetry_delta(ns) · temp_C · vbat_V · observation_window(Y minutes)
Classic/FD are stable, but switching to XL drops immediately — check timing window first or stub/topology first?
Likely cause: Either Timing margin collapses (Twindow_remaining too small) or SI/Reflection dominates (stub/termination sensitivity in XL).
Quick check: Keep harness fixed; sweep shaping_level (Aggressive→MaxMargin) and log error_burst_count per observation_window (Y minutes) in XL.
Fix: If errors track shaping_level ⇒ rebudget timing + choose margin-first shaping; if errors track harness_variant_id/stub changes ⇒ shorten stubs, enforce clear end termination, and segment the XL link.
Pass criteria: error_burst_count ≤ X per Y minutes in XL across worst harness_variant_id and temp_C at Tmin/Tmax; mode_transition_error_count ≤ X per Y transitions.
Same 10 Mbps target: harness A is stable, harness B is not — what is the first loop-delay symmetry measurement?
Likely cause: Timing margin differs due to propagation + asymmetry; harness B increases Tloop_symmetry_delta and eats Twindow_remaining.
Quick check: Measure Tloop in both directions on both harnesses and compute Tloop_symmetry_delta = |T(A→B) − T(B→A)| (same probe points, same trigger definition).
Fix: If Tloop_symmetry_delta is large ⇒ tighten topology (stub discipline, segment boundaries) and select shaping_level that preserves Twindow_remaining on the worst harness_variant_id.
Pass criteria: Tloop_symmetry_delta ≤ X ns and Twindow_remaining ≥ X ns (or ≥ X% bit time) across Tmin/Tmax; error_burst_count ≤ X per Y minutes.
Emission-suppressed shaping improves EMI, but bit errors rise — suspect slew/drive first or sampling window first?
Likely cause: Timing margin is being traded away: slower edges reduce EMI but shrink Twindow_remaining (especially on worst harness_variant_id).
Quick check: Run a 3-point shaping sweep (Aggressive/Balanced/MaxMargin) and plot error_burst_count vs shaping_level at fixed mode_state=XL and fixed observation_window.
Fix: Choose the lowest-EMI shaping_level that still preserves Twindow_remaining; if only MaxMargin works, topology governance is required before aggressive EMI shaping is allowed.
Pass criteria: For the selected shaping_level, error_burst_count ≤ X per Y minutes and Twindow_remaining ≥ X ns across Tmin/Tmax and worst harness_variant_id.
Scope waveform looks symmetric, yet XL has intermittent errors — could probe/bandwidth create a false “clean” view?
Likely cause: Measurement artifact (bandwidth limiting, ground bounce, trigger window bias) hides fast transients and overstates margin.
Quick check: Repeat capture at the same node using two setups (different probe/grounding) and compare error_burst_count correlation to captures; also widen capture to include mode_transition windows.
Fix: Standardize a “reference measurement recipe” (probe point, bandwidth, grounding, trigger definition) and base sign-off on Twindow_remaining + counters, not on a single “pretty” trace.
Pass criteria: Under the standardized recipe, Twindow_remaining ≥ X ns and error_burst_count ≤ X per Y minutes; results repeat within ±X% across setups.
Passes at low temperature but fails at high temperature (or vice versa) — suspect TX/RX delay drift or threshold drift first?
Likely cause: Timing margin changes with temp_C (TX/RX delay drift) and/or receiver decision stability shifts (effective thresholds/hysteresis).
Quick check: Run the same harness_variant_id at Tmin and Tmax, log Twindow_remaining proxy (from timing budget inputs) + error_burst_count + protect_trigger_count.
Fix: If Twindow_remaining shrinks with temp ⇒ shift to margin-first shaping and tighten topology; if Twindow is stable but errors grow ⇒ focus on common-mode noise sources and receiver stability verification.
Pass criteria: error_burst_count ≤ X per Y minutes and protect_trigger_count ≤ X across Tmin/Tmax with stable mode_state=XL and fixed observation_window.
After adding ESD/protection, XL becomes less stable — check parasitic capacitance first or return-path/layout first?
Likely cause: Added parasitics (capacitance/mismatch) slow edges and distort symmetry, and/or return-path disruption increases common-mode excitation.
Quick check: A/B compare boards with and without the protection stack using the same harness_variant_id; log shaping_level sensitivity and mode_transition_error_count.
Fix: Keep the protection network tightly coupled and symmetric; if the added stack forces only MaxMargin shaping to pass, treat it as a topology/stack constraint in the evidence pack.
Pass criteria: With the final protection stack, error_burst_count ≤ X per Y minutes and Twindow_remaining ≥ X ns (or ≥ X% bit) for worst harness_variant_id.
Adding one node degrades the whole XL network — is it stub/load or a legacy node distorting edges?
Likely cause: SI/Reflection from increased stub length/load or mixed-node behavior causes ringing/zero-cross jitter that collapses XL eye margin.
Quick check: Toggle the node in/out and bucket results by connector_state; compare error_burst_count and shaping_level sensitivity on the same harness_variant_id.
Fix: If the node introduces high sensitivity, shorten its stub and enforce segment boundaries; if the node is legacy/mixed, isolate it behind a gateway segment instead of keeping it on the XL segment.
Pass criteria: With all intended nodes present, error_burst_count ≤ X per Y minutes and mode_transition_error_count ≤ X per Y transitions across worst connector_state.
Short bench cable is OK, but errors cluster in specific vehicle operating conditions — power/common-mode or harness motion first?
Likely cause: Vehicle conditions add environment coupling (common-mode shifts, supply noise, ground offsets) and/or variant dynamics (motion-related connector/harness changes).
Quick check: Correlate errors with temp_C and vbat_V; bucket error_burst_count by operating mode and harness_variant_id (vehicle vs bench).
Fix: If correlation tracks power/CM events ⇒ prioritize receiver stability checks and protection attribution; if correlation tracks motion/connector_state ⇒ treat as topology/variant governance issue and re-sign-off at harness level.
Pass criteria: Across defined vehicle scenarios, error_burst_count ≤ X per Y minutes with protect_trigger_count ≤ X, and the worst-case bucket is repeatable within ±X%.
Occasional dominant timeout / thermal flag appears — how to distinguish real faults from false triggers?
Likely cause: Protection-triggered behavior is either legitimately responding to stress (short/overtemp) or being provoked by transient margin collapse (timing/SI events).
Quick check: Time-align protect flags with error bursts: log protect_trigger_count and error_burst_count in the same observation_window; tag shaping_level and temp_C at each event.
Fix: If flags precede errors ⇒ treat as real protection stress and validate recovery (cooldown + retry); if errors precede flags ⇒ treat as margin collapse and address timing/topology before tuning protection thresholds.
Pass criteria: protect_trigger_count ≤ X per Y hours, recovery_success_rate ≥ X%, and post-event error_burst_count returns to baseline within X minutes.
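The flag-versus-error ordering check can be sketched as a nearest-neighbour pairing on timestamps. The function name, the pairing window, and the verdict labels below are assumptions; timestamps are assumed to come from one clock, since an offset between log sources would invert the verdicts.

```python
def flag_error_order(flag_ts, error_ts, pair_window_s: float):
    """For each protection flag, report whether the nearest error burst
    inside the pairing window came after it, before it, or not at all."""
    verdicts = []
    for f in flag_ts:
        candidates = [e for e in error_ts if abs(e - f) <= pair_window_s]
        if not candidates:
            verdicts.append((f, "unpaired"))
            continue
        nearest = min(candidates, key=lambda e: abs(e - f))
        verdicts.append((f, "flag_first" if f <= nearest else "error_first"))
    return verdicts


# Placeholder timestamps in seconds.
flags = [10.0, 50.0, 90.0]
errors = [10.2, 49.5]
for ts, verdict in flag_error_order(flags, errors, pair_window_s=1.0):
    print(ts, verdict)
```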
Lowering the data rate fixes it; pushing to max breaks — check propagation window first or driver shaping profile first?
Likely cause: At max rate, propagation + loop delay consumes the sampling window; shaping choice can either recover or further reduce Twindow_remaining.
Quick check: For the failing harness_variant_id, compare two conditions: max rate with MaxMargin shaping vs one-step lower rate with same shaping; log Twindow_remaining proxy + error_burst_count.
Fix: If only lower rate passes even with MaxMargin shaping ⇒ topology governance is required; if MaxMargin at max rate passes ⇒ lock the profile and forbid aggressive shaping on that harness variant.
Pass criteria: At the target max rate, error_burst_count ≤ X per Y minutes and Twindow_remaining ≥ X ns on worst harness_variant_id; shaping_level is fixed and documented.
Same ECU, different production lots show different margin — how to define production consistency metrics?
Likely cause: Delay drift distribution (TX/RX delay + asymmetry) and small receiver threshold shifts can move Twindow_remaining across the acceptance boundary.
Quick check: Define a fixed harness_variant_id + fixed shaping_level and measure lot-to-lot deltas: Δerror_burst_count, ΔTloop_symmetry_delta, and pass rate in Tmin/Tmax corners.
Fix: Freeze the sign-off recipe (harness variant, shaping profile, measurement recipe) and add a guard-band requirement on Twindow_remaining to absorb distribution tails.
Pass criteria: Across N samples, pass_rate ≥ X% with Twindow_remaining ≥ X ns and Tloop_symmetry_delta ≤ X ns; lot-to-lot drift ≤ X% on key metrics.
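A lot-consistency report under a frozen recipe can be as small as a pass rate plus the worst-case tails. The sketch below assumes each sample has been reduced to a measured window and a loop-delay symmetry delta; the key names, thresholds, and sample values are hypothetical.

```python
def lot_summary(samples, min_window_ns: float, max_delta_ns: float) -> dict:
    """Summarise N samples measured with one frozen sign-off recipe."""
    if not samples:
        raise ValueError("no samples")
    passed = sum(1 for s in samples
                 if s["window_ns"] >= min_window_ns and s["delta_ns"] <= max_delta_ns)
    return {
        "pass_rate_pct": 100.0 * passed / len(samples),
        "worst_window_ns": min(s["window_ns"] for s in samples),  # distribution tail
        "worst_delta_ns": max(s["delta_ns"] for s in samples),
    }


# Placeholder samples across lots; 'lot' is carried for traceability only.
samples = [
    {"lot": "A", "window_ns": 22.0, "delta_ns": 2.0},
    {"lot": "A", "window_ns": 25.0, "delta_ns": 3.0},
    {"lot": "B", "window_ns": 14.0, "delta_ns": 2.0},
    {"lot": "B", "window_ns": 30.0, "delta_ns": 6.0},
]
print(lot_summary(samples, min_window_ns=15.0, max_delta_ns=5.0))
```

Reporting the worst-case window alongside the pass rate shows how much guard-band is left before a distribution tail crosses the acceptance boundary.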
Acceptance passed, but performance degrades after weeks — connector/harness aging or cumulative ESD damage first?
Likely cause: SI/Reflection drift from connector/harness aging (increased impedance discontinuities) and/or latent damage shifting protection/receiver behavior.
Quick check: Compare current evidence pack vs baseline on the same harness_variant_id: error_burst_count, shaping sensitivity, and protect_trigger_count trend over time.
Fix: If shaping sensitivity increases ⇒ treat as discontinuity aging and tighten segment governance; if protection flags trend upward ⇒ investigate stress exposure and enforce post-event health checks.
Pass criteria: Over Z weeks, error_burst_count remains within baseline ±X% and protect_trigger_count ≤ X per Y hours; no monotonic worsening across repeated tests.