Selective Wake / Partial Networking (ISO 11898-6)
Selective Wake (ISO 11898-6) lets a CAN transceiver recognize specific wake-up frames while it remains in an ultra-low-power state; Partial Networking turns that capability into a system policy that wakes only the ECU domain that actually needs to act. This guide focuses on filter-table design, wake attribution, and validation so standby current targets are met without false wakes.
Scope Guard & Quick Map
This page focuses on ISO 11898-6 selective wake / partial networking end-to-end: how to design frame filters, attribute wake sources, validate ultra-low standby current, and debug false wakes without drifting into general CAN PHY waveform or EMC component selection.
What this page delivers
- Filter design: ID/mask, DLC and payload match strategy, table sizing and update policy.
- Wake attribution: a practical event schema to identify bus/local/timed/remote wake sources.
- Validation & debug: standby Iq measurement method, false-wake control, and reproducible tests.
- Selection logic: PN-centric specs that decide success (without vendor part-number lists).
Not covered here (to avoid cross-page overlap)
- CAN FD/XL waveform, sampling-point, PHY timing details (belongs to CAN FD/XL transceiver pages).
- CMC/TVS/split termination component-level EMC design (belongs to EMC/Protection & Co-Design page).
- SBC internal power-rail/WDG/reset architecture (belongs to SBC pages).
Rule: if a claim requires oscilloscope-level PHY waveform details or EMC component parasitics to justify, it is routed out to the dedicated sibling page instead of expanding this page.
One-page pipeline map (how the page is organized)
Sleep → Bus activity → Filter → Wake decision → Host wake → Network resume
Each block above maps to a later chapter: filter strategy, false-wake control, measurement, diagnostics, and selection.
Next decisions (what to pin down early)
- Define wake sources and an attribution schema before tuning filters.
- Decide the service/OTA update policy for filter tables to avoid field-side surprises.
- Set KPIs for standby Iq and false wake rate to keep later validation objective.
Why Partial Networking (Power Budget & Wake Cost)
Partial networking is not about saving a little current on a single node. Its value comes from scale: as node count grows, standby current adds up linearly, while false wakes create a nonlinear system cost (multiple ECUs and domains may power up together). The goal is targeted wake—wake only the ECUs that must act.
Standby current budget (engineering accounting, not marketing)
A useful budget separates what is unavoidable (always-on domain) from what can be kept in selective-wake listening. The accounting below keeps later measurement and validation objective.
Practical tip: budget by domain (body/comfort, gateway, always-on) rather than by a flat ECU list—this avoids mixing incomparable standby modes.
Wake event cost (why false wakes hurt)
A false wake is not only a momentary interrupt. It is often a chain reaction: one wake source triggers policy logic, powers multiple rails, starts clocks, logs events, and may wake adjacent domains through the gateway. That is why false-wake control is a first-class requirement, not a late-stage tweak.
- Energy: domain power-up + MCU active time + bus traffic during resume.
- Wear: repeated power cycling, thermal cycling, and NVM write amplification from event logging.
- Field behavior: parked battery drain and hard-to-reproduce “night wake” complaints.
KPIs to lock before implementation (placeholders)
- Standby Iq: ≤ X (by domain / by ECU role).
- False wake rate: ≤ Y per day (measured with defined harness/noise conditions).
- Attribution coverage: ≥ Z% wake events tagged with a source and filter hit ID.
These KPIs directly drive later chapters: filter strategy, false-wake robustness knobs, measurement setup, and diagnostics logging.
Next decisions (to keep the project controllable)
- Define the domain split and which ECUs must remain always-on versus PN-listening.
- Write the wake attribution fields (source, filter-hit ID, timestamp, version) before tuning filters.
- Lock KPI placeholders (X/Y/Z) and measurement conditions so validation does not drift later.
Frame Filter Design (ID/mask, DLC, payload match, table strategy)
A selective-wake filter table is a controlled policy asset: it converts raw bus activity into a reliable wake decision under low-power constraints. A robust design balances three tensions—false wake vs missed wake, table size vs coverage, and factory configuration vs OTA evolution vs service-mode bypass.
Outputs to lock (engineering deliverables)
- Min-set ID/mask: smallest mask coverage that matches the intended wake intents without table explosion.
- Second-stage checks: DLC and payload match rules that suppress noise-like frames and unintended IDs.
- Priority & conflicts: deterministic hit order and a defined default for “no hit”.
- Lifecycle governance: versioning, OTA update rules, rollback, and service bypass controls.
- Evidence hooks: filter_hit_id / pattern_id and policy_version for attribution and debugging.
Min-set ID/mask design (avoid table explosion)
Mask design works best as a two-stage funnel: a fast, broad ID/mask pre-filter that limits candidate frames, followed by a stricter DLC/payload match that collapses false positives. The goal is coverage with compression.
- Group by wake intent (diagnostic, comfort action, safety/critical intent) instead of per-ECU lists.
- Cluster IDs into maskable families (prefix/contiguous ranges) to reduce entries.
- Close the funnel using DLC/payload rules to prevent broad masks from waking on unrelated traffic.
Practical rule: coarse masks are acceptable only if a second stage removes the most common false hits.
DLC and payload match (reduce false wakes without missed wakes)
- Prefer stable fields: service/function codes over volatile counters or rolling values.
- Use compatibility groups: allow an OR-set of acceptable patterns if software evolves over time.
- Plan for OTA changes: if payload semantics may change, require policy_version discipline and rollback.
Table strategies (whitelist, pattern-based, service bypass)
Whitelist: most predictable; table may grow; best for stable wake intents.
Pattern-based: compact across multiple IDs; must be hardened against unintended matches.
Service-mode bypass: required for diagnostics; must be logged, time-bounded, and reversible.
Lifecycle governance (factory → OTA → service)
- policy_version: every shipped table is versioned and logged with each wake event.
- Rollback: a fail-safe path exists if an OTA policy increases missed wakes or false wakes.
- Compatibility: mixed-version nodes must not break core wake intents during rollout.
- Audit: bypass enablement is time-bounded and recorded (who/when/how long/how many wakes).
Next decisions (to keep the table controllable)
- Freeze an attribution schema: wake_source + filter_hit_id + policy_version.
- Define service bypass governance: time-bounded, logged, and reversible.
- Specify hit priority rules so behavior remains deterministic across updates.
Wake Events & Timing (sleep entry, latencies, edge cases)
Timing clarity is what makes selective wake measurable and debuggable. This chapter defines the PN state machine, the entry conditions for “true PN listening”, and the latency segments from bus activity to host readiness and network resume. Any explanation that requires CAN FD waveform or sampling-point details is routed to PHY-focused pages.
PN state machine (roles and transitions)
Active → Prepare-sleep → PN listening → Wake pending → Active. Each transition must be observable through a defined status bit, pin, or log event so that field failures are reproducible.
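The transition rules above can be made explicit so that every move through the state machine produces an observable log event, and illegal moves surface as faults. A minimal sketch under assumed names (`PNMachine`, `TRANSITIONS`); a real implementation would map the log to status bits or pins:

```python
from enum import Enum, auto

class PNState(Enum):
    ACTIVE = auto()
    PREPARE_SLEEP = auto()
    PN_LISTENING = auto()
    WAKE_PENDING = auto()

# Legal transitions; anything else is a fault worth logging, not silently absorbing.
TRANSITIONS = {
    (PNState.ACTIVE, PNState.PREPARE_SLEEP),
    (PNState.PREPARE_SLEEP, PNState.PN_LISTENING),
    (PNState.PREPARE_SLEEP, PNState.ACTIVE),       # sleep entry aborted
    (PNState.PN_LISTENING, PNState.WAKE_PENDING),
    (PNState.WAKE_PENDING, PNState.ACTIVE),
}

class PNMachine:
    def __init__(self, log: list):
        self.state = PNState.ACTIVE
        self.log = log                 # every transition becomes an observable event

    def to(self, new: PNState) -> None:
        if (self.state, new) not in TRANSITIONS:
            raise ValueError(f"illegal transition {self.state} -> {new}")
        self.log.append((self.state.name, new.name))
        self.state = new

log = []
m = PNMachine(log)
for s in (PNState.PREPARE_SLEEP, PNState.PN_LISTENING,
          PNState.WAKE_PENDING, PNState.ACTIVE):
    m.to(s)
assert log[0] == ("ACTIVE", "PREPARE_SLEEP") and len(log) == 4
```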
Sleep entry conditions (define “true PN listening”)
- Bus idle: bus inactivity holds for ≥ X ms (placeholder) before entering PN listening.
- Host handshake: MCU/SBC grants low-power entry (placeholder signal/flag).
- Transceiver mode: PN listening mode is confirmed by a readable status or event record.
Without these three definitions, “it was asleep” becomes untestable, and false-wake investigations become guesswork.
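The three entry conditions become testable when expressed as a gate that also reports which check failed. A sketch with placeholder values (the 500 ms idle threshold and the `"PN_LISTEN"` status string are assumptions, not from the standard):

```python
def pn_entry_allowed(bus_idle_ms: int, host_grant: bool, mode_status: str,
                     idle_threshold_ms: int = 500):
    """All three conditions must hold before declaring 'true PN listening'.

    idle_threshold_ms is the placeholder X from the text; mode_status stands in
    for the transceiver's readable mode register or event record.
    """
    checks = {
        "bus_idle": bus_idle_ms >= idle_threshold_ms,
        "host_handshake": host_grant,
        "transceiver_mode": mode_status == "PN_LISTEN",
    }
    # Returning the failing checks is what makes "it was asleep" testable.
    return all(checks.values()), [k for k, ok in checks.items() if not ok]

ok, failed = pn_entry_allowed(bus_idle_ms=800, host_grant=True, mode_status="PN_LISTEN")
assert ok and failed == []
ok, failed = pn_entry_allowed(bus_idle_ms=120, host_grant=True, mode_status="STANDBY")
assert not ok and failed == ["bus_idle", "transceiver_mode"]
```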
Wake latency as segments (assign responsibility)
- Detect latency: activity → match → wake asserted (transceiver + filter pipeline).
- Host wake latency: wake asserted → host ready (clock, reset, interrupt response, policy).
- Network resume latency: host ready → resume traffic (gateway/ECU resume policy window).
Segmenting latency prevents misdiagnosis: slow resume is not always a transceiver issue.
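The segmentation can be reduced to simple timestamp arithmetic on one shared time base. A sketch with illustrative names (the four timestamps are assumptions about what the logger captures):

```python
def latency_segments(t_activity: int, t_wake_asserted: int,
                     t_host_ready: int, t_resume: int) -> dict:
    """Split total wake latency into the three ownership segments (ms)."""
    return {
        "detect_ms": t_wake_asserted - t_activity,   # transceiver + filter pipeline
        "host_ms": t_host_ready - t_wake_asserted,   # clock/reset/IRQ/policy
        "resume_ms": t_resume - t_host_ready,        # gateway/ECU resume window
    }

seg = latency_segments(t_activity=0, t_wake_asserted=2, t_host_ready=35, t_resume=120)
assert seg == {"detect_ms": 2, "host_ms": 33, "resume_ms": 85}
# A slow total (120 ms) attributed mostly to resume policy, not the transceiver.
```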
Activity windows and debouncing (edge cases without waveform details)
- Detect window: require minimum activity duration ≥ A (placeholder) to ignore short spikes.
- Confirm window: second-stage match completes within B (placeholder) for a valid wake.
- Cooldown window: ignore re-triggers for C (placeholder) after a valid wake to suppress chatter.
- Evidence: count “activity with no hit” to separate noise from intended wake traffic.
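The three windows and the evidence counter can be combined into one gating object. A minimal sketch with placeholder defaults for A/B/C (5 ms / 20 ms / 1000 ms are assumptions, not recommendations):

```python
class WakeGate:
    """Debounce (A), confirm (B), and cooldown (C) windows, in milliseconds."""

    def __init__(self, a_ms: int = 5, b_ms: int = 20, c_ms: int = 1000):
        self.a_ms, self.b_ms, self.c_ms = a_ms, b_ms, c_ms
        self.last_wake_ms = None
        self.activity_no_hit = 0   # evidence counter for coupling/noise

    def evaluate(self, now_ms: int, activity_ms: int, match_delay_ms):
        """activity_ms: observed activity duration; match_delay_ms: time to a
        second-stage hit, or None if no filter entry matched."""
        if self.last_wake_ms is not None and now_ms - self.last_wake_ms < self.c_ms:
            return "suppressed_cooldown"        # window C: ignore chatter re-triggers
        if activity_ms < self.a_ms:
            return "ignored_spike"              # window A: short spikes never wake
        if match_delay_ms is None or match_delay_ms > self.b_ms:
            self.activity_no_hit += 1           # evidence: activity without a hit
            return "activity_no_hit"
        self.last_wake_ms = now_ms
        return "wake"

g = WakeGate()
assert g.evaluate(0, activity_ms=2, match_delay_ms=3) == "ignored_spike"
assert g.evaluate(10, activity_ms=50, match_delay_ms=None) == "activity_no_hit"
assert g.evaluate(20, activity_ms=50, match_delay_ms=8) == "wake"
assert g.evaluate(300, activity_ms=50, match_delay_ms=8) == "suppressed_cooldown"
assert g.activity_no_hit == 1
```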
Next decisions (to prevent timing ambiguity)
- Define “true PN listening” via bus idle, host handshake, and transceiver mode confirmation.
- Measure latency by segments (detect/host/resume) to avoid misattribution during debug.
- Introduce counters for “activity with no hit” to separate noise from intended wake traffic.
False Wake & Immunity (noise patterns, robustness knobs)
False wake is the dominant failure mode for selective wake and partial networking: it drains the standby budget and makes field issues hard to reproduce. This section turns “random wake-ups” into an engineering control loop: classify the observable symptom, map it to a coupling class, then apply a deterministic mitigation knob with measurable KPIs.
Scope lock (what is controlled here)
- Controlled: symptom classification, attribution evidence, filter/timing/policy knobs, and KPI definitions.
- Not expanded: CMC/TVS component implementation or detailed EMC layout; route those to EMC/Protection subpages.
- Not expanded: CAN FD waveform/sampling-point analysis; route those to PHY-focused pages.
False-wake source buckets (observable patterns)
Common-mode coupling: activity is detected, but second-stage hits are rare (“activity-no-hit” rises).
Harness pickup: correlation with environment or harness routing; sporadic wake clusters.
Ground bounce / return path: wakes align with local high-current switching events.
Post-ESD degradation: “passes once” but becomes more fragile; false wake rate drifts upward after stress.
Policy/software mismatch: appears after OTA or variant change; hit patterns shift with policy_version.
Robustness knobs (control levers + trade-offs)
Layer 1 · Filter knobs
- tighten ID/mask coverage; reduce candidates
- strengthen DLC/data match; remove false positives
- use compatibility OR-sets for evolving payloads
Trade-off: higher missed-wake risk unless service governance and versioning are enforced.
Layer 2 · Timing knobs
- debounce activity (A) to ignore short spikes
- confirm window (B) for second-stage match
- cooldown window (C) to suppress chatter
Trade-off: longer wake latency and a need to validate short-intent wake use cases.
Layer 3 · Attribution & policy knobs
- second-confirm wake_source in Wake Pending
- isolate service bypass (time-bounded + audited)
- arbitrate remote/timed wake vs PN filtering
Trade-off: more software policy, but much higher field serviceability.
KPIs and measurement scope (placeholders)
- False wake rate: ≤ X/day over a defined off/locked window (24–72 h), temperature and battery range as placeholders.
- Attribution completeness: ≥ Z% of wake events have wake_source + policy_version (+ hit_id if applicable).
- Activity-no-hit counter: tracked to separate coupling noise from intended wake traffic.
Fast control loop (field-friendly)
- Classify: check wake_source, filter_hit_id, and “activity-no-hit” counters.
- Decide knob: noise-like activity → timing knobs; unintended matches → filter knobs; drift after stress → confirm + audit.
- Freeze evidence: bind changes to policy_version and rerun the same KPI window.
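The classify-then-decide step above can be sketched as a small rule function over the counters. The thresholds (3x baseline, etc.) are invented placeholders; the point is that each symptom maps deterministically to one knob layer:

```python
def decide_knob(counters: dict, baseline: dict) -> str:
    """Map classification evidence to a mitigation layer.

    counters/baseline hold activity_no_hit, filter_hits, false_wake_suspects
    over the same KPI window; names follow the text, thresholds are placeholders.
    """
    if counters["activity_no_hit"] > 3 * max(baseline["activity_no_hit"], 1):
        return "timing"              # noise-like activity -> debounce/confirm/cooldown
    if counters["false_wake_suspects"] > 0 and counters["filter_hits"] > 0:
        return "filter"              # unintended matches -> tighten ID/mask, payload rules
    if counters["false_wake_suspects"] > baseline["false_wake_suspects"]:
        return "confirm_and_audit"   # drift after stress -> second-confirm + audit
    return "no_action"

base = {"activity_no_hit": 2, "filter_hits": 10, "false_wake_suspects": 0}
noisy = {"activity_no_hit": 40, "filter_hits": 10, "false_wake_suspects": 0}
assert decide_knob(noisy, base) == "timing"
bad_hits = {"activity_no_hit": 3, "filter_hits": 12, "false_wake_suspects": 4}
assert decide_knob(bad_hits, base) == "filter"
```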
Engineering Checklist (Design → Bring-up → Production)
This checklist converts selective wake and partial networking into executable gates. Each line item specifies what must be defined, what evidence must exist, and the pass criteria (placeholder thresholds) that prevent late-stage surprises.
Design gate
- Domains & roles: evidence: domain map + ECU role list; pass: coverage ≥ X%.
- Attribution schema: evidence: wake_source enum + required fields; pass: completeness = 100%.
- Filter strategy: evidence: whitelist/pattern plan; pass: table size ≤ X entries.
- Priority rules: evidence: hit order + default no-hit; pass: deterministic behavior = yes.
- OTA & rollback: evidence: policy_version + rollback plan; pass: rollback validated = yes.
- Timing anchors: evidence: T0–T3 + windows A/B/C; pass: measurable anchors = yes.
Bring-up gate
- Standby Iq: evidence: per-domain measurement log; pass: Iq ≤ X.
- Wake sensitivity: evidence: intended wake testcases; pass: success ≥ X%.
- False wake attempt: evidence: 24–72 h run log; pass: false wake ≤ Y/day.
- Activity-no-hit: evidence: counter + timestamps; pass: bounded within X.
- Attribution check: evidence: wake records; pass: completeness ≥ Z%.
- Version trace: evidence: policy_version present; pass: 100% of events tagged.
Production gate
- Temp sweep: evidence: cold/hot logs; pass: false wake ≤ Y/day.
- Aging drift: evidence: ≥ 72 h static log; pass: drift ≤ X.
- Post-ESD regression: evidence: before/after comparison; pass: delta ≤ X%.
- Harness variants: evidence: variant matrix; pass: no new wake_source classes.
- Service audit: evidence: bypass logs; pass: traceability = 100%.
- Black-box export: evidence: retrieval format; pass: export success = 100%.
Pass criteria placeholders (how to fill X/Y/Z)
Use a single KPI window definition across stages (off/locked duration, temperature range, battery range). Keep thresholds consistent across variants, then tighten only after attribution completeness is stable.
Measurement & Validation (standby Iq and wake sensitivity)
Partial networking succeeds only if standby current and wake sensitivity are measured with stable definitions, repeatable setups, and complete context logging. This section standardizes the measurement window, exposes the most common low-current traps, and defines a reproducible wake-injection protocol.
Lock the measurement definition first
- State proof: verify the DUT is in PN listening (not only “MCU asleep”) via mode/status evidence.
- Stabilization window: wait X minutes after entering PN listening before sampling Iq.
- Scope flags: mark service bypass, diagnostics, and remote/timed wake policies in every run record.
Standby Iq traps (how measurements get biased)
Burden voltage: meter drop shifts the DUT operating point; Iq changes with range or instrument model.
Autorange spikes: range switching introduces transients; entry-to-sleep readings can be misleading.
Ground/harness error: long loops and alternate returns inject offsets; the same board differs by cabling.
Fixture leakage: chambers, relays, and test adapters can add parallel leakage paths.
Fast sanity check: if Iq strongly depends on instrument range, harness routing, or fixture choice, the setup is dominating the result.
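The burden-voltage trap is plain arithmetic worth making explicit. A sketch with illustrative numbers (the 1 kOhm shunt and 100 mV meter burden are assumptions chosen to show the effect):

```python
def dut_voltage_with_burden(v_supply: float, i_dut_a: float,
                            shunt_ohm: float, meter_burden_v: float = 0.0) -> float:
    """Effective DUT supply after the shunt drop plus the meter's burden voltage.

    The point: microamp readings shift the DUT operating point whenever the
    series drop is not negligible, which is how Iq results get biased.
    """
    return v_supply - i_dut_a * shunt_ohm - meter_burden_v

# 12 V supply, 50 uA standby current through a 1 kOhm shunt: the shunt drop is
# only 50 mV, but a 100 mV meter burden on a low range dominates it.
v = dut_voltage_with_burden(12.0, 50e-6, 1_000, meter_burden_v=0.1)
assert abs(v - 11.85) < 1e-9
```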
Standardize the topology (where to measure and what to timestamp)
- Power path: PSU → Iq meter (or shunt measurement point) → DUT.
- Signal path: CAN tool injects patterns and load profiles, independently logged.
- Time anchors: record T_enter_PN and T_sample_start for every run.
Wake sensitivity (repeatable injection contract)
- Controlled input: pattern generator defines frame content, cadence, repetition count, and bus load profile.
- Noise profiles: test with spikes, bursts, and long-idle rare events (defined profiles, not ad hoc noise).
- Outputs: wake success rate, wake latency distribution (P50/P95 placeholders), and spurious triggers.
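The P50/P95 latency outputs can be computed directly from the logged samples of a repeated-injection run. A host-side sketch using the standard library; on an embedded target the same numbers would come from a logged histogram:

```python
import statistics

def latency_stats(latencies_ms: list) -> dict:
    """P50/P95 wake-latency summary for a repeated-injection run."""
    qs = statistics.quantiles(latencies_ms, n=100, method="inclusive")
    return {"p50": statistics.median(latencies_ms),
            "p95": qs[94],            # 95th percentile over the raw samples
            "n": len(latencies_ms)}

samples = [12, 13, 12, 14, 15, 13, 12, 40, 13, 12]   # one slow outlier
s = latency_stats(samples)
assert s["p50"] == 13.0 and s["n"] == 10
assert s["p95"] > 15    # the outlier shows up in the tail, not the median
```

Reporting both quantiles is what separates a policy-induced slow tail from a genuinely slow median.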
Required run record fields (for comparability)
Must-have
temperature, battery_voltage, bus_load_profile, node_count, filter_table_version, policy_version, T_enter_PN, T_sample_start
Strongly recommended
instrument_model, fixed_range, sampling_rate, harness_variant, ground_scheme, fixture_id
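The required fields above can be locked as a record type so incomplete runs fail early rather than surfacing as incomparable data. A sketch under assumed names (`RunRecord`, `stabilized`); field names mirror the must-have list:

```python
from dataclasses import dataclass, fields

@dataclass
class RunRecord:
    """One measurement run; must-have context first, recommended context after."""
    temperature_c: float
    battery_voltage_v: float
    bus_load_profile: str
    node_count: int
    filter_table_version: str
    policy_version: str
    t_enter_pn: float          # T_enter_PN, epoch seconds
    t_sample_start: float      # T_sample_start, epoch seconds
    # strongly recommended context (optional, defaulted)
    instrument_model: str = ""
    fixed_range: str = ""
    harness_variant: str = ""

    def stabilized(self, min_wait_s: float) -> bool:
        """True if the stabilization window elapsed before Iq sampling began."""
        return self.t_sample_start - self.t_enter_pn >= min_wait_s

r = RunRecord(temperature_c=23.0, battery_voltage_v=12.6,
              bus_load_profile="idle", node_count=8,
              filter_table_version="ft-3", policy_version="pn-1.4.2",
              t_enter_pn=100.0, t_sample_start=400.0)
assert r.stabilized(min_wait_s=300)
assert [f.name for f in fields(r)][0] == "temperature_c"
```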
Diagnostics & Safety Hooks (counters, attribution, minimal ASIL visibility)
Partial networking must be serviceable: every wake should be attributable, replayable, and comparable across policy versions. This section defines the minimal data set, the counters that close the loop, and the “fault injection” paths needed to validate diagnostics without expanding into full safety standard documentation.
Minimal service data set (versioned and small)
- timestamp: consistent time base (placeholder) for ordering and replay.
- wake_source: bus / local / timed / remote / diagnostic (enumerated).
- filter_hit_id or pattern_id: present when selective-wake evidence exists.
- policy_version and filter_table_version: required for comparability.
- outcome: wake_valid / spurious / ignored (policy-dependent placeholder).
Counters that close the loop
- wake_events_by_source: counts per wake_source (rolling window placeholder).
- filter_hits / filter_misses: evidence of intended vs unintended matches.
- false_wake_suspects: events failing second-confirm or policy validation.
- activity_no_hit: activity detected but no match (coupling/noise indicator).
- rate_limited_wake: suppressed by cooldown windows (chatter indicator).
Each counter must specify: window definition and reset policy (key-on, service reset, or rolling).
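The counters-plus-reset-policy requirement can be sketched as one object that carries its window definition and refuses resets outside the declared policy. Names (`WakeCounters`, the policy strings) are illustrative:

```python
from collections import Counter

class WakeCounters:
    """Loop-closing counters with an explicit window and reset policy."""

    RESET_POLICIES = {"key_on", "service_reset", "rolling"}

    def __init__(self, reset_policy: str = "key_on", window_s: int = 86_400):
        assert reset_policy in self.RESET_POLICIES
        self.reset_policy = reset_policy
        self.window_s = window_s           # window definition travels with the data
        self.by_source = Counter()         # wake_events_by_source
        self.filter_hits = 0
        self.activity_no_hit = 0
        self.rate_limited_wake = 0

    def record_wake(self, source: str, hit: bool) -> None:
        self.by_source[source] += 1
        if hit:
            self.filter_hits += 1
        else:
            self.activity_no_hit += 1

    def reset(self, reason: str) -> bool:
        if reason != self.reset_policy:
            return False                   # only the declared policy may clear
        self.by_source.clear()
        self.filter_hits = self.activity_no_hit = self.rate_limited_wake = 0
        return True

c = WakeCounters()
c.record_wake("bus", hit=True)
c.record_wake("bus", hit=False)
assert c.by_source["bus"] == 2 and c.activity_no_hit == 1
assert not c.reset("service_reset") and c.reset("key_on")
```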
Fault injection (validate diagnostics, not to “break” the system)
- bus wake injection: send matching / non-matching / boundary frames and verify counters + records.
- local wake injection: toggle local wake input and verify wake_source attribution.
- hit/miss forcing: select specific filter entries and confirm filter_hit_id behavior.
- acceptance: records are complete (required fields) and ring buffer replay is consistent.
Minimal ASIL-facing visibility (hook principles)
- PN state: Active / PN listen / Wake pending (enumerated).
- last_wake_source: last classified source (enumerated).
- diagnostic flag: threshold-based alert derived from counters (placeholder).
- policy_version: the active strategy version for traceability.
Scope note: only interface hooks and minimal data are defined here; full safety standard processes are intentionally out of scope.
Applications (HV domains, gateways, and isolation topology patterns)
Isolated CAN/CAN FD is most valuable at domain boundaries: where ground potential differences (GPD), fast dv/dt, and fault energy make “same-ground assumptions” unreliable. The goal is a predictable boundary port that remains stable under switching noise, service faults, and wake/sleep policies.
HV e-Drive / Inverter “island”
- Topology: LV ECU ↔ isolated CAN FD ↔ inverter island harness.
- Why isolation: large dv/dt from power stage + noisy return paths + service fault energy.
- Practical focus: CMTI margin, correct common-mode current return, and stable wake behavior.
BMS / HV battery island
- Topology: cell monitor / stack controller ↔ isolated boundary port ↔ vehicle backbone.
- Why isolation: HV stack reference moves; service and charging events create large GPD transients.
- Practical focus: low standby current + reliable selective wake without false-wake storms.
Gateway / Domain controller boundary port
- Topology: secure gateway ↔ isolated CAN-FD “port” ↔ HV sub-network.
- Why isolation: protects the gateway ground from HV island noise and fault return currents.
- Practical focus: fault observability (bus/local/power) and serviceability logs.
Isolation topology patterns (with example material numbers)
Pattern A — Integrated isolated CAN FD (signal + isolated power)
- Fastest integration: one IC provides the barrier and an isolated DC/DC.
- Examples: TI ISOW1044; ADI ADM3055E / ADM3057E.
- When it wins: tight BOM + predictable isolated supply start/stop behavior.
Pattern B — Integrated isolated CAN FD (signal only) + external isolated power
- Signal IC: TI ISO1042-Q1; NVE IL41050TFD-1E.
- Isolated power options: transformer drivers TI SN6505A-Q1 (5 V systems) or SN6507-Q1 (wide input); or a module-class option such as the Murata NXJ2 series (reinforced insulation; confirm grade/qualification per program).
- When it wins: strict EMI budget, custom isolated-rail sequencing, or higher isolated power needs.
Pattern C — Discrete digital isolator + standard CAN FD transceiver
- Digital isolators: TI ISO7721-Q1 (2ch) or ISO7741-Q1 (4ch); ADI ADuM120N (2ch).
- CAN FD PHY choices: TI TCAN1042-Q1, NXP TJA1044, Microchip MCP2562FD, Infineon TLE9255W.
- When it wins: reuse an existing CAN FD PHY footprint, or isolate only selected signals.
Notes: material numbers above are examples to anchor selection logic; always verify package, suffix, automotive grade (AEC-Q), and availability.
IC Selection Logic (decision tree + spec-to-risk mapping)
Selection should map requirements → risk → verification. The correct part is the one whose isolation rating, CMTI behavior, timing margin, low-power policy, and diagnostics match the failure modes expected in the target HV domain.
Decision tree (requirements-first)
- Isolation rating: reinforced vs basic; check working voltage / creepage / surge expectations.
- dv/dt environment: choose CMTI headroom for switching events; prioritize correct return-path strategy.
- CAN FD timing: data rate + harness loading define loop-delay and symmetry margin requirements.
- Isolated power: integrated iso DC/DC vs external rails; match EMI/noise budget and wake sequencing.
- Low-power: standby current, wake sources, false-wake tolerance, and how wake attribution is reported.
- Diagnostics: fault observability for service (bus/local/power) and safety hooks for monitoring.
Shortlist bucket A — Integrated isolated transceiver + integrated isolated power
- TI: ISOW1044 (isolated CAN FD with integrated DC/DC).
- ADI: ADM3055E / ADM3057E (isolated CAN FD with integrated isolated DC/DC).
- Use when: isolated rail sequencing must be simple, and BOM/power-tree risk must be minimized.
Shortlist bucket B — Integrated isolated transceiver (signal only) + external isolated supply
- Signal IC examples: TI ISO1042-Q1; NVE IL41050TFD-1E.
- Isolated power examples: TI SN6505A-Q1 (push-pull transformer driver) or TI SN6507-Q1 (wide-VIN push-pull driver); module-class example Murata NXJ2 (reinforced insulation; confirm grade).
- Use when: EMI constraints require a custom isolated converter frequency/transformer, or isolated power needs exceed integrated limits.
Shortlist bucket C — Discrete isolator + standard CAN FD PHY (modular approach)
- Digital isolators: TI ISO7721-Q1, TI ISO7741-Q1, ADI ADuM120N.
- CAN FD PHY examples: TI TCAN1042-Q1, NXP TJA1044, Microchip MCP2562FD, Infineon TLE9255W.
- Use when: separating isolation and PHY helps reuse legacy designs, or isolates only specific signals/wake paths.
Spec-to-risk mapping (what each spec protects against)
Isolation rating & working voltage
Risk: barrier overstress or insufficient creepage margin under contamination and surge.
Verify: program-specific insulation coordination review + layout creepage audit on final PCB.
Mitigate: choose reinforced/basic per safety target; keepout + slot strategy; conformal coating policy.
CMTI (dv/dt immunity)
Risk: inverter switching injects displacement current through barrier capacitance → false toggles, resets, bus errors.
Verify: test during worst-case dv/dt events; correlate errors with switching edges and ground noise.
Mitigate: higher-CMTI parts + correct return paths + controlled stitching/Y-cap usage (path-first).
Loop delay & symmetry (CAN FD timing)
Risk: sample-point margin collapse at higher bit rates; asymmetry behaves like phase error.
Verify: measure with real harness + temperature sweep + load/stub variants; confirm margin at targeted data rate.
Mitigate: select parts with strong loop-delay specs; avoid front-end components that distort edges at FD rates.
Isolated power architecture
Risk: isolated DC/DC noise back-injection → bus-side ground bounce → CANH/L offsets and intermittent errors.
Verify: probe VISO ripple + bus common-mode + error counters under load transitions and sleep/wake.
Mitigate: integrated low-emission iso DC/DC (when suitable) or external converter tuned to EMI budget.
Protection & front-end parasitics
Risk: TVS/CMC/split termination placed like a non-isolated design forces common-mode current into the wrong reference.
Verify: emissions/immunity + fault tests with correct return paths; inspect where surge current actually returns.
Mitigate: low-cap CAN ESD devices (e.g., Nexperia PESD2CANFD60VT-Q, Littelfuse SM24CANB-02HTG) + CMC choices (e.g., TDK ACT45B family, Murata DLW5BSN152SQ2L) placed to preserve paths.
Diagnostics & fail-safe behavior
Risk: faults become un-attributable (bus vs local vs power), slowing service and safety reaction.
Verify: confirm fault pins/telemetry map to system logs; verify dominant-timeout and thermal behaviors.
Mitigate: choose parts with clear status reporting; architect MCU monitoring and black-box fields early.
Practical rule: every “must-have spec” must pair with a “how to verify it in the real harness” plan.
FAQs (Selective Wake / Partial Networking)
Each FAQ uses a fixed 4-line structure: Likely cause / Quick check / Fix / Pass criteria. Thresholds are placeholders, but units and measurement windows are defined for comparability.
▶ Standby Iq misses target: check burden voltage or PN mode entry first?
Likely cause: instrument burden/range switching biases Iq, or the ECU never reached PN listening (still in a higher-power standby).
Quick check: log PN-state evidence (mode/status + timestamps) and repeat Iq with fixed range and known burden (same harness/fixture).
Fix: freeze the measurement contract (fixed range + stabilization window) and enforce a “PN entered” gate before sampling.
Pass criteria: Iq(PN) ≤ X µA over Y min at Vbatt=Z V, T=T°C, with PN-state proven.
▶ False wake happens a few times per day: filter table issue or noise coupling?
Likely cause: unintended filter hit (too broad ID/mask or conditions), or bus activity is detected without a valid hit (coupling/noise patterns).
Quick check: compare filter_hit_id present vs activity_no_hit counters during false wakes (same policy version).
Fix: tighten whitelist (min set), add debounce/second-confirm for suspicious sources, and rate-limit wake during chatter windows.
Pass criteria: false_wake_rate ≤ X/day over Y days, attribution completeness ≥ Z%.
▶ It should wake but does not: ID/mask too tight or DLC/payload conditions wrong?
Likely cause: the intended frame never matches due to an over-tight ID/mask, or the second-stage match (DLC/payload) rejects valid variants.
Quick check: replay the exact wake frame set and log per-frame hit/miss reason (ID match vs DLC/payload mismatch).
Fix: redesign the minimal whitelist (group IDs with masks) and relax DLC/payload to tolerate known-safe variants (service/OTA controlled).
Pass criteria: wake_success ≥ X% for N repetitions; filter_miss for intended frames = 0 under defined bus load.
▶ Gateway remote wake is flaky: attribution wrong or domain policy conflict?
Likely cause: wake_source is misclassified (remote vs bus vs timed), or the domain policy blocks wake (cooldown/rate-limit/bypass rules).
Quick check: correlate remote requests with request_origin, policy_version, and the resulting counters (rate_limited_wake / wake_events_by_source).
Fix: formalize remote-wake boundary rules and ensure gateway emits either (a) policy-approved direct wake, or (b) filtered wake frames with audit.
Pass criteria: remote_wake_success ≥ X% over N trials; attribution completeness ≥ Z% with consistent wake_source.
▶ Service mode requires full-network wake: how to bypass filtering safely and roll back?
Likely cause: a service session needs broad access, but an unrestricted bypass risks permanent standby regression and uncontrolled wakes.
Quick check: verify the bypass path records request_origin, duration_limit, and policy_version before enabling.
Fix: implement time-bounded service bypass with automatic expiry + explicit rollback to the prior filter_table_version.
Pass criteria: bypass expires within X min; post-service Iq(PN) returns to baseline within Δ ≤ Y µA; all bypass events are auditable.
▶ False wakes increase after ESD: how to judge degradation and run regression?
Likely cause: post-stress sensitivity shift increases spurious activity detection or changes filter behavior under noisy conditions.
Quick check: compare pre/post false_wake_rate, activity_no_hit, and hit distributions under the same injected noise profile.
Fix: tighten debounce/second-confirm for suspect paths, re-validate filter table minimality, and gate release on post-stress regression results.
Pass criteria: post-stress false_wake_rate increase ≤ ΔX/day; wake success for intended frames unchanged within ±Y%.
▶ Temperature changes shift false-wake rate: threshold drift or noise spectrum first?
Likely cause: sensitivity/threshold and debounce behavior change with temperature, or the environment changes coupling (harness/ground) producing different activity patterns.
Quick check: sweep temperature with identical injection profiles and log T, Vbatt, activity_no_hit, and hit_id histogram.
Fix: adjust debounce/second-confirm strategy for temperature corners and ensure policy versions are temperature-qualified with recorded limits.
Pass criteria: false_wake_rate ≤ X/day across Tmin..Tmax, with attribution completeness ≥ Z%.
▶ A frame matches, but the ECU does not wake: wake pin chain or host handshake?
Likely cause: hit occurs but wake signal is not delivered (pin/power domain), or the host policy rejects/does not acknowledge wake.
Quick check: timestamp three points: hit → wake pin assertion → host-ready; identify the first missing transition.
Fix: correct wake pin domain/pull configuration or modify host policy to preserve wake evidence through reset/boot.
Pass criteria: 100% of intended hits lead to host-ready within P95 ≤ X ms under defined load; missing-transition count = 0.
▶ False wake attribution is unclear: what is the minimal required logging set?
Likely cause: wake events lack versioning or evidence fields, making root cause indistinguishable (filter-hit vs no-hit activity vs local/timed).
Quick check: inspect records for presence of source, type, timestamp, policy_version, filter_table_version, and filter_hit_id (when applicable).
Fix: implement a ring buffer schema with the minimal set plus counters (wake_events_by_source, filter_hits, activity_no_hit, suspects).
Pass criteria: attribution completeness ≥ Z% over Y days; every wake maps to exactly one source/type and a policy version.
▶ Problems appear after OTA filter-table update: how to validate version/rollback and compatibility?
Likely cause: updated rules change hit coverage (unexpected misses/hits), or the update is not aligned with gateway/domain policy versions.
Quick check: A/B replay the wake-frame set using old vs new filter_table_version and compare hit/miss stats plus wake success/latency distributions.
Fix: enforce semantic versioning + rollback, and gate deployment on regression: intended wakes unchanged, false-wake does not increase.
Pass criteria: rollback success = 100%; regression pass: wake_success Δ within ±X%, false_wake_rate Δ ≤ Y/day.
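The A/B replay check can be sketched as a comparison of the same wake-frame set against both filter-table versions, reporting exactly the two regression classes that gate deployment. The predicate form and field names are assumptions; in practice the predicates would be built from the two `filter_table_version` contents:

```python
def ab_replay(frames: list, match_old, match_new) -> dict:
    """Replay one wake-frame set against old and new tables.

    match_old/match_new are predicates frame -> bool; each frame carries an
    'intended' flag saying whether it is supposed to wake the node.
    """
    report = {"lost_intended": [], "gained_unintended": []}
    for f in frames:
        old, new = match_old(f), match_new(f)
        if f["intended"] and old and not new:
            report["lost_intended"].append(f["id"])       # regression: missed wake
        if not f["intended"] and not old and new:
            report["gained_unintended"].append(f["id"])   # regression: false wake
    return report

frames = [
    {"id": 0x240, "intended": True},
    {"id": 0x300, "intended": False},
]
old = lambda f: f["id"] == 0x240
new = lambda f: f["id"] in (0x240, 0x300)        # broader table after the "update"
rep = ab_replay(frames, old, new)
assert rep["lost_intended"] == [] and rep["gained_unintended"] == [0x300]
```

Gating the rollout on an empty report for both lists is the executable form of "intended wakes unchanged, false-wake does not increase."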
▶ After waking, current never returns to PN: policy not restored or periodic tasks keep the domain alive?
Likely cause: the system fails to re-enter PN listening (state machine stuck), or timed/remote activities repeatedly prevent sleep entry.
Quick check: log state transitions (Active → Prepare-sleep → PN listen) and count wake requests per hour (timed/remote/diagnostic).
Fix: implement an explicit “policy restore” path after service/remote operations and budget periodic wakes with cooldown enforcement.
Pass criteria: PN re-entry time ≤ X s after task completion; Iq returns within Δ ≤ Y µA of baseline; wake_request_rate ≤ Z/hour.