I3C Compatibility Hooks for Mixed I2C Systems

I³C Compatibility Hooks is a practical blueprint for running mixed I³C + legacy I²C buses safely: detect capabilities first, then escalate only when the bus is proven clean.

It focuses on engineering hooks—segmentation, gating, dynamic addressing, guarded HDR transitions, and fast hang recovery—so enumeration is repeatable, failures are observable, and recovery is measurable.

H2-1 Scope guardrails → no topic drift

Definition & Scope: What “I3C Compatibility Hooks” Means

Working definition (engineering-oriented)

I3C Compatibility Hooks are the required hardware, firmware, and verification “hooks” that keep a mixed bus (I3C targets + legacy I2C devices) reliable across bring-up, escalation (push-pull/HDR), fault recovery, and production test.

  • Compatibility is gated, not assumed: any “upgrade” requires capability proof or topology isolation.
  • Hang is a design input: detection → isolation → recovery must be deterministic and observable.
  • Production readiness matters: the system must expose measurable KPIs and repeatable tests.

Hook taxonomy (three pillars that must exist)

HW hooks (topology & electrical control)

  • Segmentation: switches/muxes to quarantine legacy segments and cut off a hung branch.
  • Isolation: safety/functional isolation with a defined delay budget and a common-mode transient immunity (CMTI) requirement.
  • Protection: low-C ESD arrays + series-R/RC damping to reduce false hangs and spurious edges.
  • Observability: test pads / sense points for SDA/SCL and segment enable/reset lines.

FW hooks (state machines & guarded transitions)

  • Enumeration: dynamic addressing flow + address table reconciliation after resets/hot-plug.
  • Capability gate: allow escalation only after consistent capability exchange and bus “purity” proof.
  • Degrade & recover: deterministic timeouts, retries, bus-clear ladder, segment reset, and re-enumeration.
  • Telemetry: structured logs (reason codes, retry counters, timestamps), not a single “timeout” bucket.

TEST hooks (verification & production acceptance)

  • Fault injection: emulate SDA/SCL stuck-low, brown-out, and “slow target” behaviors.
  • Health metrics: track NAK rate, retry distribution, recovery success rate, and time-to-recover.
  • Production pins/paths: BIST/loopback entry points and a fast “mixed-bus acceptance” script.

Mixed-bus scenarios & measurable success criteria

Typical mixed-bus boundaries

  • Base case: 1× I3C master + (I3C targets + legacy I2C devices) sharing SDA/SCL.
  • Multi-rail: pull-ups to different domains, isolated segments, and power sequencing constraints.
  • Hot-plug & brown-out: devices appear/disappear; partial state machines cause stuck lines.
  • Upgrade attempts: push-pull/HDR entry must be prevented unless preconditions are satisfied.

KPI-style pass criteria (placeholders)

  • Enumeration success: ENTDAA / discovery success rate ≥ X% across N boots.
  • Recovery time: hang detected → bus restored ≤ X ms (95th percentile).
  • HDR escalation reliability: HDR entry fail rate ≤ X ppm (or ≤ X%/day).
  • Observability: failures mapped to reason codes (timeout subtype, stuck-line type, capability mismatch).
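
As a rough illustration of how the placeholder KPIs above could be computed from raw test records, the sketch below derives an enumeration success rate and a 95th-percentile recovery time using the nearest-rank method. The function names and the sample data are hypothetical; the thresholds (X) stay project-specific.

```python
import math

def success_rate(outcomes):
    """Fraction of boots whose enumeration succeeded (outcomes: list of bool)."""
    if not outcomes:
        raise ValueError("no boots recorded")
    return sum(outcomes) / len(outcomes)

def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of the data at or below it."""
    if not samples:
        raise ValueError("no samples recorded")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical data: 20 boots with one failed enumeration; recovery times in ms.
boots = [True] * 19 + [False]
recover_ms = [3, 4, 4, 5, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9, 10, 11, 12, 14, 20, 45]

rate = success_rate(boots)
p95 = percentile_nearest_rank(recover_ms, 95)
```

A percentile is used instead of a mean so that a single long outlier (45 ms here) stays visible in the tail without dominating the pass/fail decision.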

Out-of-scope (kept for sibling pages)

  • Full I2C pull-up derivations and rise-time tables (only mixed-bus implications appear here).
  • Protocol encyclopedias (command-by-command coverage); only engineering gates and failure patterns are retained.
  • Deep PHY/HDR theory; only “why HDR needs gating/segmentation” and guarded transitions are covered here.

Scope map: mixed-bus system model and hook pillars (HW / FW / TEST).
[Figure: scope map. An I3C master with a capability gate shares SDA/SCL with I3C targets (Target A, Target B) and legacy I2C devices (Device 1, Device 2). Hooks that must exist: HW (segment/isolate, protect/sense), FW (enumerate/gate, degrade/recover), TEST (fault inject, metrics/logs).]
H2-2 Compatibility is not symmetric → escalation must be guarded

Compatibility Model: I2C-on-I3C and I3C-in-I2C Constraints

Engineering reading of compatibility (what can coexist, under which rules)

  • I3C master ⇄ legacy I2C device: generally workable in compatibility mode (open-drain behavior, conservative timing, and strict timeout/recovery policies).
  • I2C master ⇄ I3C device: often fallback-only or limited; behavior depends on whether the target implements a robust I2C-compatible interface path.
  • HDR / aggressive escalation: typically requires the active bus segment to be I3C-only (or proven “I3C-clean” by capability exchange), otherwise legacy devices can disrupt entry/exit.

Three “red lines” in mixed-bus design (keep these unbroken)

Red line #1: legacy devices can disrupt escalation

Consequence: HDR entry fails, bus enters unstable states, or transactions become non-deterministic.
First guard: capability gate + segmentation (HDR only on an I3C-only segment).

Red line #2: a single legacy hang can stall the entire bus

Consequence: discovery/ENTDAA/CCC can time out, causing system-wide communication loss.
First guard: deterministic timeouts + recovery ladder + segment cut capability.

Red line #3: address & recovery state consistency is fragile

Consequence: address tables drift from real bus state; devices “disappear” after recovery; mis-targeted writes occur.
First guard: reconcile/verify after recovery; re-enumerate when invariants fail.

Practical rule: treat performance escalation as a guarded state transition

Compatibility ladder (recommended): start in a safe mixed-bus mode → prove capability/purity → escalate on an isolated segment → verify → keep a fast rollback path.

  • Precheck invariants: bus idle observed, no stuck lines, stable power-good, no repeated timeouts.
  • Gate decision: escalation allowed only if capability exchange is consistent and legacy devices are absent or isolated.
  • Logging contract: every deny/rollback records a reason code (capability mismatch / stuck-line / timeout subtype / segment fault) + counters + timestamps.

Pass criteria: escalation decisions are repeatable; rollback restores communications within X ms without power-cycling the entire system.

Compatibility matrix: master vs target, plus an HDR gating note.
[Figure: compatibility matrix (engineering view). I3C master (supports gating): full feature with I3C targets, compat mode with legacy I2C devices. I2C master (often limited): fallback only with I3C targets, native with legacy I2C devices. HDR / aggressive escalation requires an I3C-only segment or a segment proven “I3C-clean” via the capability gate.]
H2-3 Electrical coexistence: two-phase strategy + rails + EMC

Electrical Coexistence Hooks: OD vs PP, Pull-ups, Level Domains

Key difference (engineering impact)

Legacy I2C relies on open-drain (OD) + pull-ups, which is naturally shareable but edge-rate limited. I3C can switch to push-pull (PP) for higher speed and lower dynamic power, but mixed-bus PP/HDR is fragile unless the bus segment is proven “I3C-clean” or physically isolated.

  • OD is tolerant; PP is sensitive: faster edges increase susceptibility to ringing, overshoot, and sampling window pollution.
  • Mixed-bus failures look like protocol issues: bit slips, false START/STOP, repeated retries, and “phantom hangs” can be electrical artifacts.
  • The fix is a system policy: start with a safe electrical common denominator, then escalate only behind a capability gate.

Two-phase signaling strategy (required for mixed buses)

Phase 1 · Common denominator (safe mixed-bus mode)

  • OD behavior enforced: conservative timing and bounded edge-rate.
  • Discovery & basic access first: ensure read/write stability before any performance upgrade.
  • Timeout policy active: “slow” and “stuck” are separated (retry vs recover).

Capability gate · Proof before escalation

  • Bus is idle: no stuck-low, stable rails, stable segment connectivity.
  • Capability exchange is consistent: targets respond predictably to required capability queries.
  • Legacy is absent or isolated: escalation only on an I3C-only segment or a proven “clean” segment.

Phase 2 · Escalation (PP / HDR only behind the gate)

  • Escalate on a controlled segment: avoid sharing PP/HDR with unknown legacy devices.
  • Rollback is mandatory: failed entry exits to Phase 1 quickly and records a reason code.
  • Verify after upgrade: quick post-check before long transfers and production acceptance.

Pass criteria: escalation decisions are repeatable; rollback restores stable communications within X ms.

Multi-rail domains & EMC hooks (what breaks mixed buses most often)

Pull-up rail selection (rule-of-thumb, mixed-bus safe)

  • Choose a rail with defined behavior during brown-out: prevent “ghost powering” through IO structures.
  • Preserve OD semantics across level/isolation: avoid translators that can drive ambiguous push-pull during noise.
  • Budget static power explicitly: pull-ups that are “too strong” waste power; “too weak” create slow edges and false triggers.

Edge-rate & ringing control (mixed-bus sensitivity)

  • Fast edges amplify artifacts: ringing → mis-sampling → retries → timeout → “fake hang”.
  • Simple damping helps: a small series-R (or RC shaping) placed at the edge source reduces Q and overshoot.
  • Return path continuity matters: split grounds and long stubs increase common-mode noise and false edges.

Quick verification (minimal but decisive)

  • Compare Phase 1 vs Phase 2 waveforms: overshoot/ringing must not grow beyond threshold X.
  • Track error signatures: retries/NAKs rising with PP/HDR indicates electrical gating is insufficient.
  • Log gate decisions: allow/deny/rollback must emit reason codes and time-to-recover.

Out-of-scope: pull-up derivations and full rise-time tables are kept in the I2C pull-up / timing subpage; only mixed-bus implications are retained here.

Two-phase signaling: start in OD + pull-up, then escalate to PP/HDR only behind a capability gate and/or segmentation.
[Figure: two-phase signaling on the SDA/SCL backbone with pull-ups. Phase 1 (OD + conservative timing: discovery, compat I/O, timeout and recovery enabled) → Gate (idle + capability, no stuck lines, legacy isolated) → Phase 2 (PP/HDR on an I3C-only segment: escalate, verify), with a required rollback path to Phase 1. Recommended segmentation: I3C master → switch → I3C segment and legacy segment.]
H2-4 Dynamic addressing as a system: table consistency + triggers + recovery

Dynamic Addressing Hooks: ENTDAA, Address Table, Collision Avoidance

ENTDAA is not just an action — it needs system hooks around it

Dynamic addressing becomes reliable only when the system maintains a consistent DA table and treats address assignment as a discover → assign → verify → reconcile loop across resets, hot-plug, and recovery events.

DA table fields (minimum useful set)

  • Identity tuple: stable device identity/feature signature (used to disambiguate same-model instances).
  • Last known address: dynamic address + validity stamp (fresh / stale).
  • Health state: ok / flaky / quarantined, plus last failure reason code.
  • Timestamps & counters: last-verified time, retry count, and recovery count.
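
One possible shape for a DA table entry holding the fields listed above is sketched here. The class and attribute names are illustrative assumptions, not taken from any particular I3C driver, and the identity tuple layout is hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional

class Health(Enum):
    OK = "ok"
    FLAKY = "flaky"
    QUARANTINED = "quarantined"

@dataclass
class DaEntry:
    identity: tuple                        # stable identity tuple (layout is an assumption)
    dynamic_addr: int                      # last known dynamic address
    fresh: bool = False                    # validity stamp: stale until verified
    health: Health = Health.OK
    last_reason: Optional[str] = None      # last failure reason code
    last_verified_s: Optional[float] = None
    retry_count: int = 0
    recovery_count: int = 0

    def mark_verified(self, now_s: float) -> None:
        """Record a successful verify: the entry becomes trusted again."""
        self.fresh = True
        self.last_verified_s = now_s

    def invalidate(self, reason: str) -> None:
        """Mark the entry stale and remember why."""
        self.fresh = False
        self.last_reason = reason

entry = DaEntry(identity=("vendor_a", "sensor", 0), dynamic_addr=0x09)
```

Entries start stale by default, so a freshly built table cannot be trusted until the verify step has run.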

Re-run triggers & collision avoidance (mixed I3C + I2C reality)

Recommended triggers to re-run discovery/addressing

  • Cold boot / warm reset: DA table starts “stale” unless verified.
  • Hot-plug / segment change: topology change invalidates assumptions.
  • Post-recovery: bus clear / segment reset requires reconcile + verify.
  • Invariant failure: repeated verify failures, identity mismatch, or abnormal disappearance rate.

Coexistence with I2C static addresses (avoid mis-targeted access)

  • Separate access paths: legacy devices are accessed only via a compatibility path; I3C targets via DA table.
  • Type confirmation before writes: a write must confirm target type from the DA table/capability gate.
  • Same-model multiplicity: require an identity tuple or topology anchor; otherwise mapping is unstable.

Pass criteria: after any trigger, the system re-establishes a consistent DA table and restores stable access within X ms.
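
The trigger and write-guard rules above can be sketched as two small functions. This is a minimal sketch using plain dictionaries; `on_trigger`, `may_write`, and the table layout are hypothetical names chosen for illustration.

```python
def on_trigger(table, reason):
    """Mark every entry stale; entries stay untrusted until re-verified."""
    for entry in table.values():
        entry["fresh"] = False
        entry["last_reason"] = reason

def may_write(table, addr, expected_identity):
    """Allow a write only to a fresh entry whose identity matches the caller's expectation."""
    entry = table.get(addr)
    if entry is None:
        return False  # address is not an I3C target known to the DA table
    return entry["fresh"] and entry["identity"] == expected_identity

# Hypothetical table with one verified target at dynamic address 0x09.
table = {0x09: {"identity": ("vendor_a", "sensor", 0), "fresh": True, "last_reason": None}}

ok_before = may_write(table, 0x09, ("vendor_a", "sensor", 0))
on_trigger(table, "post_recovery")
ok_after = may_write(table, 0x09, ("vendor_a", "sensor", 0))
```

Blocking writes on stale entries trades a short stall after each trigger for protection against mis-targeted writes.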

Failure modes (what to diagnose first)

ENTDAA stalls / times out

  • First check: stuck line vs slow target vs legacy hang dragging a segment.
  • Fix direction: isolate the suspect segment, recover, then re-run discover/assign behind a gate.
  • Log contract: record timeout subtype + segment identity + recovery action taken.

DA table drifts from real bus state

  • First check: verify/reconcile skipped after recovery, reset, or topology change.
  • Fix direction: enforce reconcile on triggers; stale entries become “untrusted” until verified.
  • Safety: block writes when identity mismatches; quarantine until a clean re-enumeration.

Same-model instances become indistinguishable

  • First check: identity tuple is missing or not stable across boots.
  • Fix direction: add a topology anchor (segment/port mapping) and require identity verification post-assign.
  • Rule: avoid “best-guess mapping”; it breaks production and recovery determinism.

Out-of-scope: full spec term glossaries are omitted; only the system logic required for reliable mixed-bus operation is retained.

Dynamic addressing state machine (reusable firmware flow): Idle → Discover → Assign → Verify → Run → Reconcile/Recover.
[Figure: dynamic addressing state machine. Idle (safe mode) → Discover (scan) → Assign (ENTDAA) → Verify (identity) → Run (normal) → Reconcile/Recover (re-enumerate) on fault, then back to Discover. The DA table (ID tuple, address + validity, health, timestamps + counters) is used by the flow. Triggers: hot-plug, segment change, post-recovery, invariant failure.]
H2-5 Capability gate before PP/HDR/high-rate escalation

CCC / Capability Exchange Hooks: Detect Before You Escalate

What CCC means in practice (not a glossary)

In mixed-bus systems, capability exchange is the admission control for escalation. It turns “assumptions” into evidence: which targets respond reliably, which features are consistent, and which segment is safe for PP / HDR / higher frequency.

  • Gate, not “nice to have”: escalation without a gate creates intermittent failures that resemble protocol bugs.
  • Repeatability matters: the same query should produce consistent results across resets and retries.
  • Rollback is required: a failed gate must drive a deterministic fallback to compatibility mode.

Three-stage admission (scan → query → escalate)

Stage 1 · Detect legacy / non-responsive targets

  • Output: bus_purity = i3c_only / mixed / unknown
  • Rule: “unknown” must behave like “mixed” (no escalation).

Stage 2 · Read the minimal “escalation-critical” capability set

  • Capture (placeholders): max_data_rate / hdr_support / ibi_support
  • Consistency: repeated reads must converge (no flip-flop behavior).
  • Output: cap_snapshot + cap_consistency = pass/fail

Stage 3 · Allow escalation only behind the gate (otherwise fallback)

  • Allow PP/HDR: only if bus_purity=i3c_only AND cap_consistency=pass.
  • Fallback: compatibility mode (OD/limited rate) + reason code.
  • Rollback: failed entry returns to safe mode within X ms.

Pass criteria: gate decisions are repeatable; false-allow rate ≤ X.
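
The three-stage admission can be reduced to a small pure function, which makes the gate easy to unit-test. This is a sketch under the rules stated above; the function name and the choice to report "unknown" purity with the `mixed_bus` reason code are assumptions.

```python
def gate_decision(bus_purity, cap_consistency, hdr_support):
    """Return (decision, reason_code) for one segment."""
    if bus_purity not in ("i3c_only", "mixed", "unknown"):
        bus_purity = "unknown"  # unrecognised scan result is treated as unknown
    if bus_purity != "i3c_only":
        return ("fallback", "mixed_bus")  # unknown behaves like mixed: no escalation
    if cap_consistency != "pass":
        return ("fallback", "inconsistent_caps")
    return ("allow_hdr" if hdr_support else "allow_pp", None)

decision, reason = gate_decision("unknown", "pass", True)
```

Because the function has no side effects, the same inputs always give the same decision, which is what the repeatability criterion asks for.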

Observability hooks (log contract)

Capability exchange must emit structured evidence so escalation failures are diagnosable and statistically measurable.

  • timestamp — event time
  • segment_id — which branch/switch path
  • scan_result — i3c_only / mixed / unknown
  • cap_set — capability items queried (minimal set)
  • resp_code — response/timeout subtype (placeholder)
  • retry_count — number of retries
  • cap_snapshot_hash — stable summary for comparison
  • gate_decision — allow_pp / allow_hdr / fallback / hold_safe
  • reason_code — no_response / inconsistent_caps / mixed_bus / stuck_line / timeout_subtype
  • time_to_recover — p95 ≤ X ms (placeholder)
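
A minimal sketch of how most of these fields might be assembled into one record follows, including one way to derive `cap_snapshot_hash` so that repeated reads can be compared. The helper names and the sample capability values are hypothetical; a truncated SHA-256 of canonical JSON is just one reasonable choice of stable summary.

```python
import hashlib
import json

def cap_snapshot_hash(cap_snapshot):
    """Stable short hash of a capability snapshot (key order does not matter)."""
    canonical = json.dumps(cap_snapshot, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]

def caps_consistent(reads):
    """True when every repeated read produced the same snapshot."""
    return bool(reads) and len({cap_snapshot_hash(r) for r in reads}) == 1

def gate_log_record(timestamp, segment_id, scan_result, cap_snapshot,
                    resp_code, retry_count, gate_decision, reason_code):
    """Build one structured record using field names from the log contract."""
    return {
        "timestamp": timestamp,
        "segment_id": segment_id,
        "scan_result": scan_result,
        "cap_set": sorted(cap_snapshot),  # names of the capability items queried
        "resp_code": resp_code,
        "retry_count": retry_count,
        "cap_snapshot_hash": cap_snapshot_hash(cap_snapshot),
        "gate_decision": gate_decision,
        "reason_code": reason_code,
    }

# Hypothetical snapshot values for illustration only.
snap = {"max_data_rate": 12_500_000, "hdr_support": True, "ibi_support": False}
record = gate_log_record(1.25, "A", "i3c_only", snap, "ok", 0, "allow_hdr", None)
```

Comparing hashes instead of full snapshots keeps the consistency check cheap and gives field logs a compact value to correlate across boots.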

Out-of-scope: CCC command-by-command explanations are omitted by design; only escalation-critical semantics are retained.

Capability gate: scan + capability query → decision → allow escalation or force fallback, with structured logs.
[Figure: capability gate (detect before escalate). Bus scan (legacy? unknown?) and capability query (minimal cap set) feed the gate (purity + caps consistent? i3c-only?), which selects Allow PP (segment-safe), Allow HDR (I3C-only), or Fallback (OD / limited rate). Logs and counters record reason_code, retries, segment_id, and gate_decision.]
H2-6 Power-up → idle check → probe → ENTDAA/CCC → tables → run, with deterministic fallback

Mixed-Bus Bring-up Sequencing: Power-up, Discovery, Fallback

Recommended sequence (each step leaves evidence)

  1. Power stable → record power_state=stable (PG/rails stable, placeholder).
  2. Bus idle check → bus_idle=pass/fail (no stuck-low, segment connectivity consistent).
  3. Light probe → bus_purity=mixed/i3c_only/unknown (unknown behaves as mixed).
  4. ENTDAA + CCC gate → build DA table + cap_snapshot.
  5. Build tables → device list, address map, health flags, segment_id mapping.
  6. Enter RUN → run_mode=compat/escalated with a quick post-verify before long transfers.

Design rule: never escalate immediately after power-up; escalate only after the gate and post-verify pass.
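
The sequence above can be sketched as a function that walks the steps in order and leaves evidence at each one. The inputs stand in for real hardware checks, and the `hold` and `recover` run modes are assumptions added for the early-exit cases.

```python
def bring_up(power_stable, bus_idle, bus_purity, cap_consistency, verify_ok):
    """Walk the bring-up steps in order; return the evidence, including run_mode."""
    evidence = {"power_state": "stable" if power_stable else "unstable"}
    if not power_stable:
        evidence["run_mode"] = "hold"      # wait for rails before touching the bus
        return evidence
    evidence["bus_idle"] = "pass" if bus_idle else "fail"
    if not bus_idle:
        evidence["run_mode"] = "recover"   # hand over to the recovery ladder
        return evidence
    if bus_purity not in ("i3c_only", "mixed"):
        bus_purity = "unknown"             # unknown behaves as mixed
    evidence["bus_purity"] = bus_purity
    evidence["cap_consistency"] = "pass" if cap_consistency else "fail"
    evidence["post_verify"] = "pass" if verify_ok else "fail"
    escalate = bus_purity == "i3c_only" and cap_consistency and verify_ok
    evidence["run_mode"] = "escalated" if escalate else "compat"
    return evidence

result = bring_up(True, True, "mixed", True, True)
```

Every exit path sets `run_mode`, so the log always shows which step stopped the flow and why the system stayed in compatibility mode.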

Deterministic fallback (avoid random behavior)

Fallback triggers (minimal set)

  • bus_purity != i3c_only (mixed/unknown)
  • cap_consistency=fail or repeated no-response
  • bus_idle=fail or stuck/timeout subtype observed
  • post-assign verify fails above threshold X

Fallback actions (must be reversible)

  • Lock compatibility mode: OD + limited rate.
  • Isolate suspect segments: switch out (via switch/mux) a branch when a legacy hang drags the bus.
  • Emit reason codes: enable field correlation and production triage.

Escalation criteria (aligns with H2-5)

  • i3c_only segment + consistent caps
  • DA table verified (identity + address mapping stable)
  • Rollback path active with time-to-recover p95 ≤ X ms

Hot-plug / brown-out boundaries (when tables become untrusted)

Invalidation conditions (DA table becomes stale)

  • Topology change: segment switch state changes.
  • Power domain drop: a target may reboot and lose its dynamic address.
  • Identity mismatch: verify fails or device signature changes unexpectedly.
  • Recovery event: bus clear/segment reset occurred without reconcile.

Re-enumeration triggers (how to re-enter the flow)

  • Hot-plug detected → return to Light Probe.
  • Post-recovery → reconcile + verify, then gate.
  • Invariant failure → quarantine stale entries, re-run discovery.

Pass criteria: after hot-plug/brown-out, the system returns to a stable mode and rebuilds trusted tables within X ms.

Mixed-bus bring-up flow: deterministic sequencing with Branch A (compatibility) and Branch B (escalation).
[Figure: bring-up flow. Power stable → bus idle? (no stuck-low) → light probe → ENTDAA → CCC gate → build tables → RUN. Branch A: compatibility mode (OD / low rate) for mixed/unknown. Branch B: escalate after gate (PP/HDR) for i3c-only. An idle failure leads to the recovery ladder (placeholder); hot-plug / brown-out re-enters at the probe step.]
H2-7 HDR is a guarded transition, not a toggle

HDR / Mixed Operation Hooks: Entry/Exit Without Breaking Legacy Devices

HDR entry preconditions (must be provable)

HDR entry requires evidence that the active segment is I3C-only, stable, and ready. Any uncertainty must force compatibility mode (OD / limited rate).

1) Bus-only-I3C proof (from capability gate)

  • Required: bus_purity = i3c_only
  • Record: gate_snapshot_hash / segment_id / gate_timestamp
  • Deny: mixed or unknown segments

2) Health proof (no hang, no storm)

  • Required: bus_idle=pass (guard time ≥ X)
  • Limits: hang_rate ≤ X (windowed)
  • IBI: ibi_storm_flag = 0 (placeholder)

3) Critical targets ready (defined set)

  • Required: target_ready_mask covers the critical set
  • Output: missing_ready_list (if any)
  • Rule: missing ready → no HDR entry

Pass criteria: HDR entry attempts are repeatable; entry failure is bounded and observable (fallback within X ms).

Guarded entry/exit (fast rollback with reason codes)

Entry budgets (placeholders)

  • entry_timeout_budget = X ms
  • entry_retry ≤ X
  • fallback_budget_p95 ≤ X ms

Failure handling (must be deterministic)

  • On entry fail: immediate fallback to compat mode (OD/limited rate).
  • Always log: hdr_fail_reason_code + segment_id + retry_count.
  • No silent retry loops: bounded attempts, then rollback.

Exit + verify (don’t assume “exit = safe”)

  • After exit: verify bus_idle + critical target responsiveness.
  • If verify fails: compat mode + re-enter discovery/reconcile.
  • Evidence: verify_result + time_to_stable.
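
A bounded entry loop with mandatory logging might look like the sketch below. `try_enter` is a hypothetical callable standing in for the real HDR entry attempt; it returns `(ok, reason)`. The log field names follow the ones used above.

```python
def guarded_hdr_entry(try_enter, max_attempts, segment_id):
    """Attempt HDR entry at most max_attempts times; otherwise fall back to compat."""
    log = []
    for attempt in range(1, max_attempts + 1):
        ok, reason = try_enter()
        if ok:
            log.append({"event": "hdr_entered", "segment_id": segment_id,
                        "retry_count": attempt - 1})
            return "hdr", log
        log.append({"event": "hdr_entry_fail", "hdr_fail_reason_code": reason,
                    "segment_id": segment_id, "retry_count": attempt - 1})
    # Bounded attempts exhausted: roll back instead of retrying silently.
    log.append({"event": "fallback", "mode": "compat", "segment_id": segment_id})
    return "compat", log

# Hypothetical scripted outcomes: first attempt times out, second succeeds.
outcomes = iter([(False, "timeout"), (True, None)])
mode, log = guarded_hdr_entry(lambda: next(outcomes), max_attempts=2, segment_id="A")
```

The loop has a fixed upper bound and every failed attempt is logged, so there is no path that retries without leaving evidence.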

In-HDR error policy (keep only escalation-critical logic)

Retry / CRC / timeout (placeholders)

  • retry_limit = X (exceed → exit HDR)
  • short timeout → bounded retry
  • long timeout → rollback + verify

Error taxonomy (for statistics)

  • bus-class (idle/line anomalies)
  • target-class (identity/ready mismatches)
  • link-class (CRC/retry/timeout)

“Claims HDR support, but behaves inconsistently” (policy table)

Mixed systems need a sanity test and a policy table so unstable targets do not poison HDR transitions.

  • capability_sanity_test: repeated reads across resets; fail → do not allow HDR.
  • policy_table: device_id → allow_hdr / allow_pp / compat_only (whitelist/blacklist/graylist).
  • graylist rule: always compat-only until consistent evidence exists.

HDR guarded transition: protected entry/exit with explicit precheck, verify, and fast fallback.
[Figure: HDR guarded transition state machine. Normal (compat) → Precheck (gate + health) → HDR (bounded) → Exit → Verify. A failed precheck, in-HDR errors, or a failed verify lead to Fallback (compat, OD/low rate) with reason_code and logs.]
H2-8 Detect fast → isolate fast → recover fast, without corrupting I3C state

Hanging Legacy I2C Devices: Detection, Isolation, and Safe Recovery

Hang types that matter in mixed operation

SDA stuck-low

Bus becomes unusable; must detect with line sampling and isolate the segment before recovery attempts.

SCL stuck-low

Clock is held; transactions time out and may be misdiagnosed unless timeout subtypes are logged.

Infinite clock stretching

Legacy slave delays indefinitely; requires stretch timeout + bounded recovery ladder.

Brown-out half state-machine

Partial reset causes unpredictable line behavior and identity drift; table invalidation + re-enumeration is required after recovery.

Detection hooks (line sampling + timeouts + statistics)

Line-state sampling (idle proof)

  • sda_level / scl_level (sampled)
  • idle_guard_time ≥ X (continuous)
  • stuck_low_counter (windowed)

Per-transaction timeouts

  • txn_timeout (per transaction)
  • stretch_timeout (clock stretch guard)
  • timeout_subtype (no_response / stretch / stuck_line)
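
To show how line sampling and timeouts can combine into a `timeout_subtype`, here is a small classifier meant to run after a transaction timeout has fired. It is a sketch: the sample lists are hypothetical 0/1 line levels captured over the guard window, and reporting SCL held low as `stretch` is an assumption, since line levels alone cannot separate indefinite stretching from a stuck SCL.

```python
def classify_timeout(sda_samples, scl_samples):
    """Classify a timeout from line levels sampled over the guard window."""
    if not sda_samples or not scl_samples:
        raise ValueError("need at least one sample per line")
    if not any(sda_samples):
        return "stuck_line"   # SDA low for the whole window
    if not any(scl_samples):
        return "stretch"      # SCL held low: stretch guard exceeded
    return "no_response"      # lines are free, the target simply did not answer

subtype = classify_timeout([0, 0, 0, 0], [1, 1, 1, 1])
```

Separating the subtypes matters because each one maps to a different response: a retry for no response, the recovery ladder for stretch and stuck lines.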

Metrics (for acceptance)

  • hang_rate ≤ X (windowed)
  • recovery_success_rate ≥ X%
  • time_to_recover_p95 ≤ X ms

Isolation hooks (stop the bleed before recovery)

In mixed systems, recovery is safer after isolation. Segment isolation protects healthy targets and prevents repeated recovery actions from corrupting dynamic addressing and HDR eligibility.

  • Segment isolation: switch/mux/isolator disconnects the suspect branch, preserving the trunk.
  • Logical quarantine: segment_health = healthy/suspect/quarantine (placeholder).
  • State hygiene: recovery implies DA-table invalidation and re-entry to discovery when needed.

Safe recovery ladder (soft → hard)

Level 1 · Soft

  • bus-clear (clock pulses)
  • bounded retries
  • exit if bus_idle=pass within X

Level 2 · Segment reset

  • disconnect/reconnect suspect segment
  • validate trunk stability
  • re-probe before reintegration

Level 3 · Target reset

  • reset the target (if available)
  • verify identity + responsiveness
  • failure → escalate ladder

Level 4 · Power cycle

  • power-domain reset (last resort)
  • invalidate DA table
  • re-enumerate + re-gate before HDR

Pass criteria: recovery is repeatable; time_to_recover_p95 ≤ X ms; success_rate ≥ X%.
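
The ladder above can be driven by a simple loop that tries each level in order and stops at the first one that restores an idle bus. In this sketch, `actions` and `bus_idle_ok` are hypothetical callables standing in for the real hardware operations; levels with no available action (for example, no target reset pin) are skipped.

```python
LADDER = ["bus_clear", "segment_reset", "target_reset", "power_cycle"]

def run_ladder(actions, bus_idle_ok):
    """Run recovery levels soft to hard; return (level that recovered or None, log)."""
    log = []
    for level, name in enumerate(LADDER, start=1):
        action = actions.get(name)
        if action is None:
            log.append((level, name, "skipped"))
            continue
        action()
        recovered = bus_idle_ok()
        log.append((level, name, "recovered" if recovered else "failed"))
        if recovered:
            return level, log
    return None, log

# Hypothetical simulation: bus-clear does not help, segment reset does.
state = {"idle": False}
actions = {
    "bus_clear": lambda: None,
    "segment_reset": lambda: state.update(idle=True),
    "power_cycle": lambda: state.update(idle=True),
}
level, log = run_ladder(actions, lambda: state["idle"])
```

The returned level and log give the data needed for the success-rate and time-to-recover metrics; DA table invalidation and re-gating would follow as a separate step.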

Hang recovery ladder: isolate first, then escalate recovery strength only as needed.
[Figure: hang recovery ladder (mixed-bus safe). Isolate first, then Level 1 soft (bus-clear, bounded retries; idle within X) → Level 2 segment reset (disconnect/reconnect) → Level 3 target reset (identity verify) → Level 4 power cycle (re-enumerate; re-gate before HDR). Targets: recover p95 ≤ X ms, success ≥ X%. HDR lockout during recovery → compat mode + verify only.]
H2-9 Turn “mixed bus” into a structure problem: segment, isolate, and gate escalation

Segmentation & Isolation Architectures for Mixed Buses

Three architecture templates (choose fault containment first)

Template 1 · Single bus

  • Best for: few targets, short traces, low legacy risk.
  • Main risk: one hang can stall the entire bus.
  • HDR posture: usually disabled or highly constrained.
  • Must-have hooks: strict timeouts + recovery + telemetry.

Template 2 · Switched segments

  • Best for: multiple branches and unknown legacy behavior.
  • Main benefit: fault containment + controlled reintegration.
  • HDR posture: allowed only on proven i3c-only segments.
  • Must-have hooks: segment health states + re-enumeration triggers.

Template 3 · Isolated domains

  • Best for: cross-ground/noisy systems or safety boundaries.
  • Main benefit: domain faults do not propagate across isolation.
  • HDR posture: per-domain policy; escalation requires stable evidence.
  • Must-have hooks: delay budget + OD semantics preservation.

Pass criteria: HDR is treated as a segment attribute, not a bus-wide switch.

Segment policy (what can escalate, what must stay compatible)

Mixed-bus robustness comes from a policy table that binds escalation permissions to evidence and segment health.

segment_id | segment_type | allow_pp | allow_hdr | gate_source      | quarantine_rule  | rejoin_condition
A          | i3c_only     | yes      | yes       | gate_snapshot_id | hang_rate > X    | verify pass + re-gate
B          | mixed        | no       | no        | compat evidence  | recover_fail > X | stable window ≥ X
C          | legacy_only  | no       | no        | compat-only      | stuck_low > X    | re-probe only

  • Rule 1: mixed segments default to compat-only (no HDR) unless isolation proves HDR traffic cannot touch legacy.
  • Rule 2: legacy-only segments are always compat-only.
  • Rule 3: quarantined segments must recover + verify + re-gate before any escalation.
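
A policy table like the one above can be encoded directly as data and consulted before any escalation attempt. This is a simplified sketch: the names are hypothetical, and the isolation exception in Rule 1 is not modelled, so mixed segments are always refused.

```python
SEGMENT_POLICY = {
    "A": {"segment_type": "i3c_only", "allow_pp": True, "allow_hdr": True},
    "B": {"segment_type": "mixed", "allow_pp": False, "allow_hdr": False},
    "C": {"segment_type": "legacy_only", "allow_pp": False, "allow_hdr": False},
}

def may_escalate(segment_id, mode, health, policy=SEGMENT_POLICY):
    """Return True only if the segment may use the given mode ('pp' or 'hdr')."""
    entry = policy.get(segment_id)
    if entry is None or health != "healthy":
        return False  # unknown, suspect, or quarantined segments never escalate
    if entry["segment_type"] != "i3c_only":
        return False  # Rules 1 and 2: mixed and legacy-only stay compat-only
    return bool(entry.get("allow_" + mode, False))

allowed = may_escalate("A", "hdr", "healthy")
```

Keeping the policy as data means the same table can be exported in logs and checked by the production acceptance script.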

Budgeting (segmentation/isolation changes timing and observability)

Budget items (placeholders)

  • added_delay (switch/isolator)
  • cap_increment (port input + switch parasitics)
  • edge_rate_change (slew / ringing)
  • recovery_time (isolate + recover + re-enumerate)

Acceptance checks (placeholders)

  • compat: transaction success ≥ X%, bus_idle pass
  • i3c_only HDR: entry success ≥ X%, fallback p95 ≤ X ms
  • isolation: cross-domain faults do not propagate (test-case verified)

Mixed-bus pitfalls (only the ones that break escalation)

Pitfall 1 · Fault containment failure

  • legacy hang corrupts bus-wide state (DA table, gate evidence)
  • solution: per-segment isolation + per-segment health + re-gate after recovery
  • evidence: segment_id tagged on every gate / recovery / entry log

Pitfall 2 · Ghost powering

  • powered-off segment is back-fed via pull-ups / protection paths → half state-machine
  • solution: pull-up to the correct rail, ensure true hi-Z when off, isolate by segment when needed
  • acceptance: off segment must not sink SDA/SCL; recovery must force re-enumeration

Segmented topology map: HDR allowed only on the proven i3c-only segment; mixed and legacy segments stay compat-only.
[Figure: segmented topology map. I3C master (gate + logs) → segment switch → Segment A (I3C-only, HDR allowed), Segment B (mixed, compat only), Segment C (legacy-only, compat only), with per-segment policy and health. Quarantine: isolate segment → recover → verify → re-gate.]
H2-10 Maintainable firmware system: state machine + policies + telemetry + production hooks

Firmware Robustness Hooks: State Machines, Timeouts, Telemetry

Unified state machine (actions are state-bounded)

Discover

Line-state checks + lightweight probing; no escalation allowed.

Assign

Build identity and address table; record retries and outcomes.

Run

Normal operation; separate compat-run from i3c-run; HDR only if gate says i3c-only.

Degrade

Uncertainty or failure trend → compat-only, no HDR/PP, minimal safe transactions.

Recover

Isolation + recovery ladder; then return to Discover/Reconcile with DA-table hygiene.

Pass criteria: every escalation action is gated by state + evidence (no hidden “try HDR” paths).
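
One way to make actions state-bounded is an explicit transition table plus a single predicate for HDR. The sketch below is illustrative only: the allowed transitions are an assumption inferred from the state descriptions above, not a definitive map.

```python
# Assumed transition map; adjust to the actual firmware design.
ALLOWED = {
    "discover": {"assign", "degrade", "recover"},
    "assign": {"run", "degrade", "recover"},
    "run": {"degrade", "recover", "discover"},
    "degrade": {"recover", "discover"},
    "recover": {"discover"},
}

def next_state(current, requested):
    """Return the requested state if the transition is allowed, else raise."""
    if requested not in ALLOWED.get(current, set()):
        raise ValueError(f"transition {current} -> {requested} not allowed")
    return requested

def hdr_allowed(state, bus_purity, gate_passed):
    """HDR is reachable only from Run, on an i3c_only segment, with gate evidence."""
    return state == "run" and bus_purity == "i3c_only" and gate_passed

state = next_state("discover", "assign")
state = next_state(state, "run")
```

Routing every HDR attempt through one predicate is what removes hidden "try HDR" paths: there is a single place to audit.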

Timeout & retry policy (bounded, configurable, never infinite)

Policy item                  | Default (placeholder) | Escalation rule
txn_timeout                  | X                     | timeout_subtype → retry or degrade
stretch_timeout              | X                     | exceed → recover ladder
retry_limit                  | X                     | exceed → degrade
segment_quarantine_condition | X                     | isolate segment → recover → verify
hdr_entry_budget             | X ms                  | fail → fallback + reason_code

Pass criteria: no infinite waits; escalation is deterministic and produces logs suitable for statistics.
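
The escalation rules in the policy can be expressed as one decision function so that no caller invents its own retry loop. This is a hedged sketch; `next_action` and its arguments are hypothetical, and treating a reached limit as exceeded is an assumption.

```python
def next_action(timeout_subtype, retries_so_far, retry_limit):
    """Map a timeout to the next step: 'retry', 'degrade', or 'recover'."""
    if timeout_subtype in ("stretch", "stuck_line"):
        return "recover"   # line-level faults go straight to the recovery ladder
    if retries_so_far >= retry_limit:
        return "degrade"   # retry budget used up: drop to compat-only
    return "retry"

action = next_action("no_response", retries_so_far=1, retry_limit=3)
```

Since the retry count is compared against a finite limit on every call, the policy cannot wait or retry forever.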

Telemetry schema (fields that close the debug loop)

Protocol outcomes

  • ccc_failure_code, ccc_retry_count
  • entdaa_retry_count
  • hdr_entry_attempts, hdr_entry_fail_reason

Bus/segment context

  • bus_purity (i3c_only/mixed/unknown)
  • segment_id, segment_health
  • gate_snapshot_id

Line/hang visibility

  • line_stuck_duration (SDA/SCL)
  • timeout_subtype distribution
  • fallback_time_ms (p50/p95)

Pass criteria: every failure can be binned by segment + phase + reason_code (not only free-form logs).
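
Binning by segment, phase, and reason code is straightforward once events are structured. The sketch below uses the standard-library `Counter`; the event dictionaries are hypothetical examples.

```python
from collections import Counter

def bin_failures(events):
    """Count failures per (segment_id, phase, reason_code) bin."""
    return Counter((e["segment_id"], e["phase"], e["reason_code"]) for e in events)

# Hypothetical failure events for illustration.
events = [
    {"segment_id": "B", "phase": "discover", "reason_code": "stuck_line"},
    {"segment_id": "B", "phase": "discover", "reason_code": "stuck_line"},
    {"segment_id": "A", "phase": "hdr_entry", "reason_code": "inconsistent_caps"},
]
bins = bin_failures(events)
worst = bins.most_common(1)[0]
```

A bin that dominates the counts points at a specific segment and phase, which is far easier to act on than a single undifferentiated timeout count.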

Production measurability hooks (structure only; no code)

  • BIST: line-state + minimal probe + policy snapshot export.
  • Loopback/echo: deterministic request/response path (placeholder).
  • Test command entry: manufacturing mode hooks (placeholder).
  • Pass criteria: success ≥ X%, recovery p95 ≤ X ms, no HDR attempts on mixed segments.

Firmware architecture blocks: bounded state machine, shared policy, and structured telemetry outputs.
[Figure: firmware architecture blocks. I3C bus driver, core state machine, address table (DA + identity + health), error manager (timeouts + degrade + recover), telemetry logger (fields + counters), and production test hooks (BIST + loopback + command entry); links between blocks carry events, policies, counters, and exports.]

Engineering Checklist: Design → Bring-up → Production

A mixed I³C + legacy I²C system is “production-ready” only when segmentation rules, fallback posture, and observability are frozen as measurable acceptance criteria. Each checklist item below binds an action to a quick check and a pass criterion (threshold placeholders).

Design checklist (architecture & board hooks)

  • Item: Freeze segmentation policy: which segments are I³C-only (HDR allowed) vs Mixed/Legacy (compat-only).
    Quick check: A segment policy table exists (segment_id → allowed modes) and is referenced by firmware logs.
    Pass criteria: HDR/PP escalation attempts on Mixed/Legacy segments = 0.
  • Item: Define “compat-first” posture: power-up and uncertain states stay in OD + limited speed until capability gate passes.
    Quick check: State machine has explicit phases (Discover/Assign/Run/Escalate/Degrade/Recover).
    Pass criteria: Degrade-to-compat transition latency (p95) < X ms.
  • Item: Reserve hardware “control hooks” for isolation and recovery (segment enable, reset, power-cycle).
    Quick check: Test points/pins exist for SDA/SCL, SEG_EN, RESET, PGOOD; controlled power switch available for last-resort recovery.
    Pass criteria: All critical hooks observable in lab and fixture: coverage ≥ X%.
  • Item: Prevent ghost-powering across powered-off modules/segments (true Hi-Z boundary).
    Quick check: In powered-off state, SDA/SCL do not clamp low/high through protection paths; isolation boundary does not back-feed I/O rails.
    Pass criteria: Off-segment leakage/phantom pull is below X µA (placeholder) and bus idle remains valid.
Reference materials (Design) — example part numbers
  • I³C hub / fanout (multi-target aggregation): NXP P3H2840HN (I³C hub family)
  • Segment switch / mux (address isolation, controlled fanout): TI TCA9548A (8-ch I²C mux), NXP PCA9846 (4-ch I²C switch w/ reset)
  • Hot-swap / bus buffer (capacitance isolation): NXP PCA9511A (hot swappable I²C/SMBus buffer)
  • Power-cycle hook (last-resort recovery): TI TPS22918 or TPS22910A (load switch families)
  • ESD protection (low-C dual lines): TI TPD2E2U06 (2-ch TVS/ESD array)
  • Pull-ups / damping (examples): Vishay CRCW06034K70FKEA (4.7 kΩ, 0603), Vishay CRCW060310K0FKEA (10 kΩ, 0603), Vishay CRCW060322R0FKEA (22 Ω series, 0603)
Notes: part numbers are examples; verify speed grade, package suffix, rail limits, capacitance/ESD ratings, and sourcing.
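
The segment policy table and the "HDR/PP escalation attempts on Mixed/Legacy segments = 0" criterion can be illustrated with a small host-side sketch. Everything named here (Mode, SegmentPolicy, POLICY, may_escalate, and the reason-code strings) is hypothetical and only shows one way a segment_id → allowed-modes table could gate escalation; it is not taken from any particular controller driver.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    OD_COMPAT = "od_compat"  # open-drain, limited speed (compat posture)
    SDR_PP = "sdr_pp"        # push-pull SDR
    HDR = "hdr"              # high data rate modes


@dataclass(frozen=True)
class SegmentPolicy:
    segment_id: int
    legacy_present: bool
    allowed: frozenset


# Hypothetical frozen policy table: segment 0 is I3C-only, segment 1 is mixed.
POLICY = {
    0: SegmentPolicy(0, legacy_present=False, allowed=frozenset(Mode)),
    1: SegmentPolicy(1, legacy_present=True, allowed=frozenset({Mode.OD_COMPAT})),
}


def may_escalate(segment_id, target_mode, gate_passed):
    """Return (allowed, reason_code); every denial carries a reason code."""
    policy = POLICY.get(segment_id)
    if policy is None:
        return False, "UNKNOWN_SEGMENT"
    if target_mode not in policy.allowed:
        return False, "MODE_NOT_IN_POLICY"
    if target_mode is not Mode.OD_COMPAT:
        if policy.legacy_present:
            return False, "LEGACY_PRESENT"
        if not gate_passed:
            return False, "GATE_NOT_PASSED"
    return True, "OK"
```

Calling may_escalate(1, Mode.HDR, True) is denied with MODE_NOT_IN_POLICY, so a firmware log that records the returned reason code for every denial makes the "attempts = 0" criterion directly countable.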

Bring-up checklist (deterministic sequencing & fault-injection)

  • Item: Run “compat-first” bring-up: bus idle check → light probe → ENTDAA/capability exchange → build tables → run.
    Quick check: Logs show ordered phases with timestamps; mixed-bus stays OD/limited speed.
    Pass criteria: Enumeration success rate ≥ X%; time-to-run (p95) < X ms.
  • Item: Capture addressing traceability: device identity → assigned address → health state (DA table consistency).
    Quick check: DA table export exists (per boot) and correlates with bus scan results.
    Pass criteria: Address collision count = 0; reconcile events ≤ X per hour.
  • Item: Fault-inject hanging legacy behavior (stuck-low, infinite stretch, brown-out half-state).
    Quick check: Recovery ladder triggers: timeout → bus-clear → segment isolate → optional power-cycle.
    Pass criteria: Recovery success rate ≥ X%; recovery time (p95) < X ms.
  • Item: Rehearse HDR entry/exit only on I³C-only segments (guarded transition).
    Quick check: Capability gate snapshot recorded before escalation; whitelist/blacklist supported for inconsistent targets.
    Pass criteria: HDR entry fail rate ≤ X ppm; fast fallback-to-compat < X ms.
Reference materials (Bring-up) — example part numbers
  • Stuck-bus recovery / auto-disconnect: TI TCA4307 (hot-swappable I²C buffer w/ stuck bus recovery)
  • Differential extender for noisy/long reach (compat segments): NXP PCA9615DP (dI²C buffer/extender)
  • Isolation for OD/compat segments: TI ISO1540 (I²C isolator), ADI ADuM1250 (hot-swappable I²C isolator)
Notes: classic I²C isolators are best aligned to OD/compat semantics; keep HDR/PP paths on I³C-only segments unless a bridge explicitly supports the mode.
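
The "ordered phases with timestamps" quick check lends itself to an automated log check. The sketch below assumes a log exported as (timestamp_ms, phase_name) tuples; the phase names and the function phases_in_order are illustrative placeholders that mirror the compat-first sequence, not fields from a real logging schema.

```python
# Hypothetical phase names mirroring: idle check -> light probe -> ENTDAA ->
# build tables -> run.
EXPECTED_ORDER = ["idle_check", "light_probe", "entdaa", "build_tables", "run"]


def phases_in_order(log_records):
    """Check a bring-up log for the compat-first phase sequence.

    log_records: iterable of (timestamp_ms, phase_name) tuples.
    Returns True when every expected phase appears exactly once, in the
    expected order, with non-decreasing timestamps. Unknown phase names
    are ignored.
    """
    seen = [(ts, name) for ts, name in log_records if name in EXPECTED_ORDER]
    names = [name for _, name in seen]
    stamps = [ts for ts, _ in seen]
    return names == EXPECTED_ORDER and stamps == sorted(stamps)
```

A log that reaches "entdaa" before "light_probe", or repeats a phase without an intervening recovery record, returns False and can be flagged for review.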

Production checklist (fast acceptance & KPI logging)

  • Item: Define a minimal acceptance sequence: probe → enumerate → export tables → inject one recoverable fault → verify recovery.
    Quick check: Fixture can read back DA table + counters; optional HDR test runs only on I³C-only segment.
    Pass criteria: Total test time ≤ X s.
  • Item: Lock KPI definitions (same fields in lab + production): retries, timeouts, stuck durations, fallback events.
    Quick check: Telemetry schema versioned; counters reset/rolled per unit and stored in manufacturing logs.
    Pass criteria: ENTDAA success ≥ X%; recovery time < X ms; HDR entry fail ≤ X ppm.
  • Item: Enforce “no silent failures”: every escalation denial logs a reason code and a gate snapshot.
    Quick check: Production report includes top-N denial reasons and correlation with segment_id/board revision.
    Pass criteria: Missing gate snapshot events = 0.
Figure: Checklist flow and probe points. Design (segment policy frozen, fallback posture defined, probe points reserved) → Bring-up (compat-first sequence, DA table traceability, fault injection + recovery) → Production (fast acceptance run, KPI schema locked, no silent failures). Recommended probe points: SDA / SCL, SEG_EN, RESET, PGOOD, power-cycle.
Diagram: checklist lifecycle and the minimum set of probe/control points to make failures reproducible and recoverable.
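
Several pass criteria above are stated as p95 latencies and success rates. A minimal sketch of how a fixture script might evaluate them is shown here, using the nearest-rank percentile definition (one of several common definitions, chosen as an assumption). The function names and the keys of the limits dictionary are hypothetical, and the threshold values remain the X placeholders of the checklist until a project fixes them.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Nearest-rank percentile: smallest sample with >= pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def accept_unit(recovery_ms, entdaa_ok, entdaa_total, limits):
    """Evaluate one unit against placeholder limits.

    limits: {"recovery_p95_ms": ..., "entdaa_success_min": ...} (hypothetical keys).
    Returns (passed, list_of_failed_criteria).
    """
    failures = []
    # The checklist states "recovery time (p95) < X ms", so equality fails.
    if percentile_nearest_rank(recovery_ms, 95) >= limits["recovery_p95_ms"]:
        failures.append("RECOVERY_P95")
    if entdaa_ok / entdaa_total < limits["entdaa_success_min"]:
        failures.append("ENTDAA_SUCCESS")
    return (not failures), failures
```

Using the same function in the lab and on the production fixture keeps the KPI definition identical in both places, which is the intent of locking the KPI schema.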

Applications: Where Compatibility Hooks Matter Most

Typical mixed-bus scenarios are best handled as “use-case → required hooks” mappings. Each card below lists common failure modes, the minimum hook set, and example materials.

Use-case A · Sensor aggregation (legacy I²C sensors + new I³C targets)

Typical failure modes: address conflicts, legacy device hangs blocking discovery, “unsafe escalation” causing intermittent faults.
Minimum required hooks: Dynamic addressing · Capability gate · Segmentation · Hang recovery · Telemetry
  • Pass criteria (placeholders): ENTDAA success ≥ X%; recovery time (p95) < X ms; address collisions = 0.
Example material stack
  • I³C aggregation/fanout: NXP P3H2840HN
  • Address isolation / same-address sensors: TI TCA9548A (mux) or NXP PCA9846 (switch)
  • Stuck-bus containment: TI TCA4307 (auto-disconnect + clock pulses)
  • ESD protection on SDA/SCL: TI TPD2E2U06

Use-case B · Board management / debug port (field access, hot-plug, brown-out)

Typical failure modes: intermittent bus lockups, repeated recovery cycles without root-cause visibility, field-induced ESD events.
Minimum required hooks: Timeout policy · Recovery ladder · Segment quarantine · Telemetry
  • Pass criteria (placeholders): recovery success ≥ X%; repeated recoveries ≤ X/day; missing reason-code events = 0.
Example material stack
  • Auto recovery / hot-swap buffer: TI TCA4307
  • Hot-swap capacitance isolation: NXP PCA9511A
  • Controlled fanout / isolation of risky targets: TI TCA9548A
  • Field ESD hardening: TI TPD2E2U06 + series resistor Vishay CRCW060322R0FKEA (22 Ω)

Use-case C · Multi-power modular system (plug-in modules, split grounds, safety boundaries)

Typical failure modes: ghost-powering, half-initialized legacy devices, inconsistent tables after module power events.
Minimum required hooks: Segmentation · Power boundary rules · Re-enumeration triggers · Telemetry
  • Pass criteria (placeholders): off-segment does not disturb bus idle; re-enumeration time < X ms; post-event DA table matches scan (mismatch = 0).
Example material stack
  • Power-cycle control for module segments: TI TPS22918 / TPS22910A
  • Isolation for compat-only segments: TI ISO1540 or ADI ADuM1250
  • Segment switching / quarantine: TI TCA9548A
  • Pull-up examples: Vishay CRCW06034K70FKEA (4.7 kΩ) / CRCW060310K0FKEA (10 kΩ)
Figure: Use-case to hooks mapping. Use-cases: A · Sensors on a mixed bus (legacy I²C + new I³C targets); B · BMC / debug port (field access, hot-plug, ESD); C · Multi-power modules (isolation & power boundaries). Hooks: Dynamic addressing (ENTDAA + DA table); Capability gate (CCC sanity checks); Segmentation / quarantine (switch/mux); Hang recovery ladder (timeouts → reset); Telemetry (reason codes + counters).
Diagram: common deployments mapped to the minimum hook set; keep HDR/PP escalation constrained to I³C-only segments.
Material-number quick list (examples)

NXP P3H2840HN (I³C hub) · TI TCA9548A (I²C mux) · NXP PCA9846 (I²C switch) · TI TCA4307 (stuck-bus recovery buffer) · NXP PCA9511A (hot-swap I²C buffer) · NXP PCA9615DP (differential I²C extender) · TI ISO1540 / ADI ADuM1250 (I²C isolators for compat segments) · TI TPD2E2U06 (dual-line ESD) · TI TPS22918 / TPS22910A (load switches) · Vishay CRCW06034K70FKEA, CRCW060310K0FKEA, CRCW060322R0FKEA (passives examples).

FAQs (Mixed I³C + legacy I²C compatibility hooks)

Each answer is intentionally compressed into four executable lines: Likely cause / Quick check / Fix / Pass criteria (threshold placeholders).

ENTDAA sometimes succeeds, sometimes times out — first log field to compare?
Likely cause: Bus not truly idle (stuck-low or long stretch) or segment policy/gate differs between runs (mixed vs I³C-only).
Quick check: Compare bus_idle_ok, sda_stuck_low_ms/scl_stuck_low_ms, and segment_id + ccc_snapshot_id for the good vs bad run.
Fix: Enforce “compat-first” sequencing (idle check → light probe → gate → ENTDAA), and quarantine any segment that violates idle/stretches before ENTDAA retries.
Pass criteria: ENTDAA success rate ≥ X% across N cold boots; ENTDAA timeout events ≤ X per 10^3 attempts.
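
Comparing a good run against a bad run is easy to automate. The sketch below assumes each run is exported as a flat dictionary keyed by the log field names listed in the quick check; the dictionary shape and the helper name diff_runs are assumptions for illustration.

```python
# Field names follow the quick check above; the record format is assumed.
FIELDS = ("bus_idle_ok", "sda_stuck_low_ms", "scl_stuck_low_ms",
          "segment_id", "ccc_snapshot_id")


def diff_runs(good, bad, fields=FIELDS):
    """Return {field: (good_value, bad_value)} for every field that differs."""
    return {f: (good.get(f), bad.get(f))
            for f in fields
            if good.get(f) != bad.get(f)}
```

An empty result means the compared fields cannot explain the difference, which points toward timing or signal-integrity causes rather than policy or gate state.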
Dynamic addresses change after warm reset — what did I forget to persist?
Likely cause: DA table not persisted/versioned, or warm reset path skips reconcile/verify and reuses stale identity mapping.
Quick check: Compare warm_reset_flag, da_table_version + da_table_crc, and device_id_hash → assigned_addr mapping before vs after reset.
Fix: Define a warm-reset rule: either (A) force re-enumeration after any reset that can disturb targets, or (B) persist identity+health+address, then run a mandatory “Verify/Reconcile” phase on resume.
Pass criteria: After warm reset, DA table mismatch count = 0 across N cycles; time-to-run (p95) < X ms.
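
A "Verify/Reconcile" phase can be sketched as a checksum over the persisted table plus a comparison against a fresh scan. The example assumes the table is a mapping of device_id_hash (string) to assigned address; the canonical JSON encoding and the helper names are illustrative choices, and real firmware would more likely checksum a packed binary table.

```python
import json
import zlib


def da_table_crc(table):
    """CRC32 over a canonical (key-sorted) JSON encoding of the DA table."""
    canonical = json.dumps(table, sort_keys=True, separators=(",", ":"))
    return zlib.crc32(canonical.encode("utf-8"))


def reconcile(persisted, scanned):
    """Compare the persisted mapping against a fresh scan.

    Returns a list of (device_id_hash, persisted_addr, scanned_addr) tuples
    for every mismatch; None marks a device missing on one side.
    """
    mismatches = []
    for dev in sorted(set(persisted) | set(scanned)):
        before, after = persisted.get(dev), scanned.get(dev)
        if before != after:
            mismatches.append((dev, before, after))
    return mismatches
```

A non-empty mismatch list after a warm reset is the signal to fall back to option (A) and force re-enumeration rather than resume with a stale table.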
Mixed bus works in SDR, but HDR entry always fails — what capability gate is missing?
Likely cause: Escalation attempted without proving “I³C-only” purity (legacy presence) or without validating required HDR capabilities for all critical targets.
Quick check: Confirm legacy_presence_flag=0 on the HDR segment and hdr_allowed_flag=1 derived from a recorded ccc_snapshot_id.
Fix: Move HDR to an I³C-only segment (switch/quarantine mixed devices), and require a gate whitelist of targets that pass capability sanity checks before each HDR attempt.
Pass criteria: HDR entry fail rate ≤ X ppm over 10^6 attempts; fallback-to-compat latency (p95) < X ms.
Legacy I²C device hangs and the whole bus dies — fastest isolation test?
Likely cause: A single legacy target holds SDA/SCL (stuck-low or infinite stretch) and the topology lacks a controllable isolation boundary.
Quick check: Toggle SEG_EN / switch channel to disconnect the suspect segment and re-check bus_idle_ok within X ms.
Fix: Add segmentation (mux/switch) so a bad segment can be quarantined, then apply recovery ladder on the isolated segment (bus-clear → reset → power-cycle).
Pass criteria: Quarantine restores main bus idle in < X ms; hang-induced global outage rate ≤ X per 10^6 transactions.
After recovery, some targets disappear until power-cycle — what state machine bug is typical?
Likely cause: Recovery path returns to Run without a mandatory reconcile/re-discover, leaving DA table and target state desynchronized.
Quick check: Verify recovery_step_level is followed by re_discover_count and verify_phase_time_ms (no “silent resume”).
Fix: Make recovery deterministic: Recover → Verify bus idle → Reconcile DA table → Re-enable only healthy segments → Run (with denial reasons logged).
Pass criteria: Post-recovery “missing target” events = 0 across N injected faults; recovery time (p95) < X ms.
IBI storms appear only with one vendor’s target — first sanity check?
Likely cause: Vendor target generates excessive IBIs due to misconfigured enable/thresholds, or a compatibility bug triggers repeated interrupt conditions.
Quick check: Log ibi_rate_per_s and ibi_source_addr; temporarily disable IBI for that target and confirm storm stops within X ms.
Fix: Apply a vendor quarantine policy (whitelist for IBI + rate limit + fallback to compat-only segment) until capability sanity checks pass consistently.
Pass criteria: IBI rate ≤ X / s for Y s window; storm-induced transaction loss ≤ X ppm.
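
The ibi_rate_per_s check can be approximated with a per-source sliding window. This is a sketch under assumptions: the class name, its parameters, and the use of host-side timestamps in seconds are all hypothetical, and the reaction to a storm (for example disabling events for that target) is left to the caller.

```python
from collections import deque


class IbiRateMonitor:
    """Flags a source address whose IBI count exceeds a limit within a window."""

    def __init__(self, max_per_window, window_s):
        self.max_per_window = max_per_window
        self.window_s = window_s
        self._events = {}  # source address -> deque of event timestamps (s)

    def record(self, source_addr, now_s):
        """Record one IBI; return True if this source now exceeds the limit."""
        q = self._events.setdefault(source_addr, deque())
        q.append(now_s)
        # Drop events that have aged out of the window.
        while q and now_s - q[0] >= self.window_s:
            q.popleft()
        return len(q) > self.max_per_window
```

Keeping one window per ibi_source_addr means a single noisy target can be quarantined without penalizing well-behaved targets on the same segment.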
Two identical boards: one can escalate to HDR, the other can’t — what’s the first SI/edge check?
Likely cause: Edge-rate/ringing differences or pull-up domain differences create marginal sampling windows; HDR guard rejects escalation due to intermittent precheck failures.
Quick check: Compare SDA/SCL tR/tF (rise/fall), overshoot/ringing, and bus_idle_ok just before HDR attempt; confirm the same ccc_snapshot_id yields hdr_allowed_flag.
Fix: Normalize edge control (pull-up + optional small series damping) and keep HDR on I³C-only segment; if marginal, widen guard (more precheck retries) without allowing mixed-bus HDR.
Pass criteria: HDR precheck pass rate ≥ X% across N boards; HDR entry fail ≤ X ppm.
Works on bench, fails in system with hot-plug — what should I debounce/sequence?
Likely cause: Hot-plug introduces brief brown-out/half-state targets; enabling the segment before rails settle creates false hangs and table inconsistencies.
Quick check: Log hotplug_event_count, power_good_to_en_ms, and bus idle within X ms after SEG_EN; correlate failures to short/variable enable delay.
Fix: Add debounce sequencing: require stable PGOOD for X ms, then enable segment; force compat-only until a fresh gate + enumerate completes.
Pass criteria: Hot-plug success ≥ X% over N insertions; post-plug recovery time (p95) < X ms.
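
The "stable PGOOD for X ms, then enable" rule can be checked offline against a captured trace. The sketch assumes the trace is a time-ordered list of (timestamp_ms, pgood) samples; the function name and the sample format are illustrative, and stable_ms stands in for the X placeholder.

```python
def pgood_stable_since(samples, stable_ms):
    """Find when PGOOD first satisfies the debounce requirement.

    samples: list of (timestamp_ms, pgood) in time order, pgood truthy = high.
    Returns the timestamp of the first sample at which PGOOD has been
    continuously high for at least stable_ms, or None if that never happens.
    """
    high_since = None
    for ts, pgood in samples:
        if not pgood:
            high_since = None      # any dropout restarts the debounce timer
        elif high_since is None:
            high_since = ts        # rising edge: start timing
        if high_since is not None and ts - high_since >= stable_ms:
            return ts
    return None
```

Comparing the returned timestamp with the logged SEG_EN assertion time shows whether the segment was enabled before the rail had settled.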
Switching/isolator added, now enumeration is flaky — what’s the first delay/edge artifact check?
Likely cause: Added device changes rise-time, edge symmetry, or introduces enable-time glitches; firmware starts discovery before signals stabilize.
Quick check: Measure SDA/SCL tR/tF before vs after insertion and verify segment_enable_settle_ms ≥ X ms before ENTDAA; compare entdaa_retry_count.
Fix: Add a settle window after enable, keep compat-only on mixed segments, and adjust pull-up/edge control (do not “mask” by excessive retries).
Pass criteria: ENTDAA retries (p95) ≤ X; enumerate time (p95) < X ms; flake rate ≤ X%.
Fallback to I²C mode is stable but too slow — what’s the safest “partial upgrade” strategy?
Likely cause: Mixed segment contains legacy behavior risk; full escalation (PP/HDR) is unsafe, but the topology allows an I³C-only subset to be upgraded.
Quick check: Identify a segment where legacy_presence_flag=0 and gate yields pp_allowed_flag/hdr_allowed_flag; confirm mixed segment remains compat-only with HDR attempts=0.
Fix: Segment the bus: keep mixed/legacy segment in OD/limited speed, while upgrading only the I³C-only segment (higher rate, optional PP/HDR) under a strict gate + whitelist.
Pass criteria: Throughput improvement ≥ X% on I³C-only segment; mixed segment hang rate does not increase (Δ ≤ X ppm); recovery time (p95) < X ms.
Some devices claim I³C but behave like I²C — how to quarantine safely?
Likely cause: Target supports limited I³C features, fails CCC sanity checks, or has vendor-specific behavior that breaks escalation assumptions.
Quick check: Compare ccc_fail_code, hdr_allowed_flag, and observed behavior under compat-only; if repeated gate failures occur, mark vendor_id/device_id_hash as “restricted”.
Fix: Quarantine by policy: force compat-only for that target (and/or move it to a mixed segment), disable HDR/PP for its segment, and require explicit whitelist for escalation.
Pass criteria: Gate-denial reason logged for 100% of blocked escalations; system stability maintained (error rate ≤ X ppm) with quarantine enabled.
Recovery clock pulses clear SDA but transactions still fail: what is the next rung of the recovery ladder?
Likely cause: Line-level stuck cleared, but target state machine remains corrupt (brown-out half-state) or DA table is out of sync after partial recovery.
Quick check: Confirm bus_idle_ok=1 yet retries/timeouts persist; check recovery_step_level and whether reconcile_count executed post-recovery.
Fix: Escalate recovery ladder: (1) bus-clear → (2) isolate segment → (3) segment reset/re-enable with settle window → (4) power-cycle only the affected domain, then re-enumerate and verify.
Pass criteria: Recovery success rate ≥ X%; max recovery step used ≤ Level X for Y% of cases; recovery time (p95) < X ms.
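
The escalating ladder can be sketched as a loop over ordered steps with a health check after each one. The step names, the steps dictionary of callables, and the bus_healthy callback are all hypothetical; bus_healthy is assumed to cover both line idle and a successful verification transaction, since idle lines alone do not prove the target state machine has recovered.

```python
# Hypothetical step names, ordered from least to most disruptive.
LADDER = ("bus_clear", "isolate_segment", "segment_reset", "power_cycle")


def run_recovery_ladder(steps, bus_healthy, max_level=len(LADDER)):
    """Apply ladder steps in order until bus_healthy() returns True.

    steps: dict mapping each step name to a zero-argument callable.
    Returns (recovered, level_used, trace). level_used is 1-based and is 0
    when the bus was already healthy; trace lists the steps executed.
    """
    trace = []
    if bus_healthy():
        return True, 0, trace
    for level, name in enumerate(LADDER[:max_level], start=1):
        steps[name]()
        trace.append(name)
        if bus_healthy():
            return True, level, trace
    return False, len(trace), trace
```

Logging level_used per event gives the "max recovery step used" statistic directly; re-enumeration and verification would follow a successful return and are outside this sketch.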