In BMS/HV systems, isolation must be treated as a complete, testable system—partitioning (CAN-FD vs isoSPI), reinforced insulation targets, and isolated power must be co-designed to survive HV dv/dt and to pass production gates with repeatable X/Y/N acceptance criteria.
H2-01 · What This Page Solves (Scope & Decision Map)
Purpose: Define the system problem and lock the decision inputs
This page turns BMS/HV isolation into a measurable system decision: isolation for communication, isolated power, safety/compliance targets,
EMC immunity, and production-ready validation.
Scope boundary: focuses on system-level choices and acceptance criteria. It does not explain CAN/isoSPI protocol internals
or generic DC-DC topology tutorials.
Card A · System Boundary (Pack ↔ Vehicle/Charger ↔ Service)
Interfaces
• Pack ↔ Vehicle/Charger: isolated CAN-FD (port stack defined by protection + isolation + MCU side).
• Pack internal sensing chain: isoSPI / daisy chain (segmentation/bypass policy defines fault containment).
• Service access (optional): isolated USB/Ethernet may exist, but only isolation constraints are referenced here.
Noise context
dv/dt class = X kV/µs, harness length = L m, EMC severity = E1/E2/E3 (placeholders for worst-case budgeting).
Safety inputs
insulation level = Basic / Reinforced, lifetime target = N years @ Tmax, altitude = A m, pollution degree = PDx.
Card B · Decision Tree (Isolated CAN-FD / isoSPI Chain / Hybrid)
Decision inputs (locked): pack size (P), node count (N), harness length (L), EMC severity (E), safety class (S), service requirement (D).
Each branch maps inputs → choice → trade-offs.
Isolated CAN-FD only: fits small N, short internal chain, strong service/maintenance requirement.
Trade-off: port EMC hardening and cable exposure dominate validation.
isoSPI/daisy chain only: fits internal sensing chain-centric designs where the vehicle-side gateway handles external buses.
Trade-off: chain fault containment and segmentation policy become critical.
Hybrid (common shipping template): isolated CAN-FD for Pack↔Vehicle/Charger, isoSPI chain for internal cell monitoring.
Trade-off: unified reinforced insulation targets must be met for both comm and power rails.
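The three branches above can be sketched as a small selection function. This is a minimal sketch under stated assumptions: the field names, the `MAX_CAN_ONLY_NODES` threshold, and the branch order are hypothetical stand-ins for the X/Y/N placeholders, which must be locked per program.

```python
from dataclasses import dataclass

# Hypothetical placeholder; the page leaves the real node-count limit as X.
MAX_CAN_ONLY_NODES = 4


@dataclass(frozen=True)
class DecisionInputs:
    node_count: int                # N: number of monitoring nodes
    external_bus_in_pack: bool     # False if a vehicle-side gateway owns the external bus
    service_access_required: bool  # D: service/maintenance requirement


def choose_partition(inputs: DecisionInputs) -> str:
    """Map locked decision inputs to one of the three partitioning choices."""
    if not inputs.external_bus_in_pack:
        # Vehicle-side gateway handles external buses: internal chain only.
        return "isoSPI chain only"
    if inputs.node_count <= MAX_CAN_ONLY_NODES and inputs.service_access_required:
        # Small N and a strong service requirement favor a CAN-FD-only design.
        return "isolated CAN-FD only"
    # Common shipping template: CAN-FD outside the pack, isoSPI inside.
    return "hybrid"
```

Writing the tree as code forces every input to be declared before a branch is taken, which is the point of locking the decision inputs first.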
Safety/Compliance pass: VIORM ≥ X V; creepage/clearance ≥ Y mm (PD/altitude derating applied);
hi-pot/PD per plan (pass/fail thresholds = X/Y/N placeholders).
Communication robustness pass: CAN error-burst rate ≤ X/hour at worst dv/dt; isoSPI chain CRC burst ≤ X/hour;
recovery time ≤ Y ms; criteria consistent from bench to vehicle EMC level E*.
Production-ready pass: EOL includes hi-pot/PD + comm margin + safe-state verification; traceability includes certificates, test reports,
and version-locked BOM/PCB revisions.
Goal: Convert isolation into three shippable templates
Each template explicitly defines communication path, isolated power path, and isolation barrier placement,
so validation and production tests map to a concrete architecture rather than ad-hoc fixes.
Page boundary reminder: protocol-stack details (frames/services) and generic power-topology tutorials are excluded.
Only isolation constraints and system-level acceptance are defined.
Centralized Pack Control
Topology snapshot: a central controller inside the pack collects measurements; isolated CAN-FD provides the external interface;
internal sensing links stay short and controlled.
Barrier placement
Barrier #1 (reinforced): Pack ↔ Vehicle/Charger/Service. Barrier #2 (optional): controller ↔ specific noisy subdomains (only if required by partitioning).
Comms path
isolated CAN-FD port stack: Connector → TVS → CMC → Isolated CAN PHY → MCU.
Power path
isolated DC-DC feeds secondary rails for isolated PHY/logic; UVLO defines safe-state defaults (threshold placeholders X/Y).
Failure containment
external cable exposure dominates risk; focus on dv/dt injection paths and service-ground reference shifts.
When to use
node count N ≤ X, harness length L ≤ Y m, EMC severity E ≤ E2 (placeholders).
Primary risks
surge/ESD at connector; shield bonding inconsistency; leakage constraints when adding Y-caps.
Distributed Modules (Segmented Sensing Chain)
Topology snapshot: multiple modules each host monitors; an isoSPI/daisy chain carries measurement/control across segments;
isolation and segmentation policies define robustness.
Barrier placement
a reinforced boundary still exists at Pack ↔ Vehicle; additional internal barriers may appear per module boundary (as required by safety partitioning).
Comms path
isoSPI chain with segmentation + bypass points to prevent a single module from collapsing the entire chain.
Power path
isolated bias strategy must avoid brownout-induced “false comm faults”; define UVLO + recovery policy explicitly (X/Y placeholders).
Failure containment
single-point failures are controlled by segment boundaries and bypass behavior; diagnostics must pinpoint the failing segment quickly.
When to use
node count N ≥ X, harness complexity high, EMC severity E ≥ E2 (placeholders).
Primary risks
retry storms after transient bursts; chain recovery time; module replacement compatibility and version control.
Modular Pack (Serviceable, Replaceable Units)
Topology snapshot: swappable modules demand repeatable isolation geometry and standardized tests;
“same rating, different outcome” is prevented by strict criteria and documentation.
Barrier placement
reinforced boundary at external interface; module-to-module boundaries require consistent creepage/clearance and controlled leakage paths.
Comms path
hybrid is common: isolated CAN-FD externally + segmented internal chain for monitors; each boundary uses unified acceptance criteria.
Power path
isolated power modules simplify compliance geometry; no-load loss and thermal headroom must be bounded for parked/standby operation.
Failure containment
replacement should not change system behavior; define version-locked BOM/PCB and a minimum EOL test set.
When to use
field-service requirement high; supplier diversity expected; compliance evidence must be reproducible across builds.
Primary risks
certificate/test-setup mismatch across labs; contamination/CTI drift; leakage-current regressions after EMC fixes.
Diagram intent: every architecture explicitly defines barrier placement, communication path, and isolated power path,
so EMC/safety validation and EOL tests map to a concrete template instead of ad-hoc patches.
Goal: Turn “reinforced + high VIORM” into executable acceptance targets
Reinforced insulation is not a single voltage number. This section locks the required inputs, maps them to device ratings
(VIORM/VIOTM, creepage/clearance, CTI), and defines the documentation evidence needed for repeatable compliance.
Scope boundary: engineering criteria and check items only. Standard text and long-form theory are excluded.
Lifetime and derating are expressed as placeholders (X/Y/N) to keep the page actionable and system-focused.
Card A · System Input Sheet (Fill These First)
• Voltage profile: Vwork (DC/RMS) = X, Vtransient = Y (duration = N), surge/impulse waveform = 1.2/50 µs or 10/1000 µs.
• Environment: Tmin/Tmax = X/Y, altitude A = X m, pollution degree PD = PDx.
• Lifetime target: N years @ Tmax (duty cycle = X%, contamination risk = low/med/high).
• Evidence strategy: hi-pot type (AC/DC) + duration + leakage limit = X/Y/N, partial-discharge test required = yes/no.
Input lock: Inputs must be consistent across electrical design, PCB geometry, and production tests.
Common failure: “High withstand voltage” without lifetime/altitude/pollution-degree context causes late-stage rework.
Card B · Input → Device Rating Mapping
Working & lifetime
Vwork + Tmax + N years → VIORM target (lifetime/derating applied).
Risk: rating used at room-temp only; derating missed.
Impulse/surge
surge/impulse class → VIOTM / impulse withstand (test condition must be normalized).
Risk: lab-to-lab differences (waveform, repetition, setup).
Altitude & PD
altitude A + PDx → creepage/clearance minimums on PCB + package geometry (X mm placeholder).
Risk: package pin pitch assumed to be sufficient; PCB surface path ignored.
Surface robustness
contamination risk → CTI class and coating/cleanliness strategy (process-controlled).
Risk: coating assumed perfect; manufacturing variation ignored.
System EMC coupling
EMC severity + leakage budget → barrier capacitance target (pF placeholder) and Y-cap policy.
Risk: EMI fixed via Y-caps but leakage compliance fails.
Fail-safe behavior
safety requirement → UVLO/power-down default states and diagnosable behavior.
Risk: brownout appears as “communication fault” without clear default-state definition.
Output of this mapping: a rating cluster (VIORM/VIOTM + geometry + CTI + coupling + default states) that must be satisfied together,
not independently.
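The "satisfied together" rule can be sketched as a single check that reports every missed target at once. This is a minimal sketch: the `RatingCluster` fields, their units, and the numeric values in the usage note are illustrative assumptions, not values from any datasheet or standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RatingCluster:
    viorm_v: float               # working-voltage rating
    viotm_v: float               # transient/impulse rating
    creepage_mm: float           # shortest surface path (PCB and package)
    clearance_mm: float          # shortest path through air
    cti_v: float                 # comparative tracking index of the surface
    barrier_c_pf: float          # coupling capacitance across the barrier
    default_state_defined: bool  # UVLO/power-down behavior is specified


def cluster_failures(device: RatingCluster, target: RatingCluster) -> List[str]:
    """Return the name of every target the device misses (empty list = pass)."""
    failures = []
    # Minimums: the device must meet or exceed the target.
    for name in ("viorm_v", "viotm_v", "creepage_mm", "clearance_mm", "cti_v"):
        if getattr(device, name) < getattr(target, name):
            failures.append(name)
    # Maximum: barrier capacitance must stay at or below the target.
    if device.barrier_c_pf > target.barrier_c_pf:
        failures.append("barrier_c_pf")
    # Required behavior: default states must be defined when the target asks for it.
    if target.default_state_defined and not device.default_state_defined:
        failures.append("default_state_defined")
    return failures
```

For example, a candidate with a generous VIORM but a 6.0 mm creepage against an 8.0 mm target still fails, and the returned list names `creepage_mm` rather than hiding it behind the voltage rating.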
Card C · Documentation & Certificates Checklist
• External evidence: VDE/UL/CB certificates (model + standard version + date + scope).
• Required declarations: VIORM/VIOTM + lifetime conditions, creepage/clearance geometry, barrier capacitance (typ/max), test conditions.
• Internal controlled docs: version-locked BOM/PCB/process (cleaning/coating), EOL test steps (hi-pot/PD + default-state + comm margin), and change-control rules for alternates.
Normalization: Test setup parameters must be recorded (waveform, duration, repetition, leakage limit).
Traceability: Certificate references must match the exact ordering code used in production.
Diagram intent: define a single ladder from working to transient to surge/impulse, then bind it to evidence tests.
Derating knobs (altitude/PD/CTI/geometry/coupling) must be applied consistently.
H2-04 · Communication Partitioning: When to Use Isolated CAN-FD vs isoSPI Chain
Goal: Assign roles and prevent the wrong bus from carrying the wrong risk
Isolation decisions become stable when each link has a clear role: isolated CAN-FD for Pack ↔ Vehicle/Charger/Service,
isoSPI/daisy chain for internal cell-monitor transport. Hybrid designs unify acceptance criteria while isolating faults by boundary.
Scope boundary: system-level role split only. Bit timing, sampling points, and isoSPI coding details are excluded.
Compare · What Each Link Optimizes
Latency/Determinism
CAN-FD optimizes external interoperability and service paths; isoSPI optimizes controlled internal transport for measurement chains.
Node count
CAN-FD scales externally but becomes cable-exposure dominated; isoSPI scales internally with segmentation and bypass policy.
Harness exposure
CAN-FD is typically exposed to vehicle harness and service tooling; isoSPI is usually inside the pack enclosure.
EMC
CAN-FD requires port-level surge/ESD and common-mode control; isoSPI requires burst-error containment and recovery policy discipline.
Diagnostics
CAN-FD supports vehicle/service tooling; isoSPI benefits from segment counters and chain-health observability for fast pinpointing.
Design rule: role split must align with the reinforced insulation targets defined in H2-03 (VIORM/lifetime/geometry/leakage).
Conclusion · Three Deployment Templates
Only Isolated CAN-FD: Pack ↔ Vehicle/Service carries control + diagnostics.
Isolation checkpoints: reinforced barrier + port protection chain.
Validation focus: harness exposure EMC + surge/ESD + ground reference shifts.
Only isoSPI Chain: internal measurement transport dominates.
Isolation checkpoints: segmentation/bypass to contain single-point failures.
Validation focus: burst-error behavior + recovery time + no retry storms.
Diagram intent: assign external role to isolated CAN-FD and internal role to isoSPI.
Fault isolation (F) and bypass (B) enforce containment without relying on protocol-specific explanations.
Scope: Make isolated CAN-FD stable in HV switching environments
This blueprint defines a shippable isolated CAN-FD port stack: the end-to-end signal chain, the rating cluster
(CMTI/surge/common-mode/fail-safe), and the dominant false-trigger paths (dv/dt injection, ground bounce, shield discontinuity).
Excluded: CAN frames, diagnostics services, and upper-layer stacks. The focus is port physics, defaults, and evidence-ready acceptance.
Card A · Port Stack (Connector → Protection → CMC → Termination → isoPHY → MCU)
2) Protection (TVS / series element)
Role: clamp energy before isoPHY.
Check: clamp @ I(t), energy class, Cj (X/Y/N).
Pitfall: oversized Cj degrades differential integrity and emissions.
3) CMC (Common-mode choke)
Role: suppress CM noise and radiated injection.
Check: impedance band & saturation margin (X/Y).
Pitfall: placed too far from connector; return loop becomes large.
4) Termination / Bias (Rterm / split)
Role: define the differential environment.
Check: symmetry, reference choice (X/Y).
Pitfall: asymmetry converts CM into DM noise.
5) Isolated CAN PHY (isoPHY)
Role: isolate HV common-mode from logic domain.
Targets: CMTI ≥ X kV/µs, CM range = X, fail-safe = defined.
Pitfall: withstand-only selection; default-state ignored.
6) MCU / Controller
Role: stable supply + diagnosable counters.
Check: brownout vs comm fault separation (X/Y).
Pitfall: supply dip looks like “bus issue”.
Output: a single, ordered port chain that can be audited, tested, and debugged segment-by-segment.
Card B · Selection Targets (Rating Cluster, Not Single Numbers)
CMTI
Target: CMTI ≥ X kV/µs at worst dv/dt.
Verify: burst rate ≤ X/hour over Y minutes under switching stress.
Surge/ESD
Target: ESD ±X kV; surge class = 1.2/50 or 10/1000 (N shots).
Verify: protection survives without short/open latent failure.
CM Range
Target: tolerates ground shifts and tool references (±X).
Verify: no false dominant/recessive latch during offsets.
Fail-safe
Target: power-down/UVLO defaults are explicit (dominant/recessive = X).
Verify: power cycling does not create “phantom bus-off”.
Barrier C
Target: barrier capacitance ≤ X pF (aligned with leakage/EMC).
Verify: EMI and comm stability pass together (not separately).
Thermal
Target: junction/ambient headroom at worst traffic + temperature (X/Y).
Verify: UVLO/thermal events are logged and bounded.
Rule: pass criteria must be written in system terms (burst/recovery/default-state), then mapped back to component ratings.
dv/dt injection
Symptom: errors correlate with switching edges.
Fast check: correlate error timestamps with dv/dt events; change edge rate (X).
Fix knob: reduce CM coupling (CMC placement, loop shrink, barrier C target).
Ground bounce
Symptom: instability when high current steps occur.
Fast check: compare comm faults vs supply dip counters (X/Y).
Fix knob: supply decoupling, reference strategy, shorten return paths.
Shield discontinuity
Symptom: faults triggered by harness movement or cabinet door state.
Fast check: continuity audit end-to-end; inspect bond points (N points).
Fix knob: stabilize shield bond, avoid floating segments and intermittent contacts.
Protection side effects
Symptom: EMI improves but burst errors increase after TVS/CMC swap.
Fast check: measure added capacitance/symmetry (X pF / ΔC).
Fix knob: balance DM integrity with CM suppression; normalize test conditions.
Brownout masquerade
Symptom: “bus down” only during load steps.
Fast check: check UVLO flags vs comm flags alignment (X ms).
Fix knob: raise supply margin; define fail-safe defaults and reset policy.
Output: field-debug begins at physical paths (injection/return/shield), not at protocol layers.
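The dv/dt fast check (correlating error timestamps with switching edges) can be sketched as follows. This is a minimal sketch, assuming both timestamp lists share one time base in seconds; the function name and the window value are hypothetical.

```python
from bisect import bisect_left


def fraction_near_edges(error_ts, edge_ts, window_s):
    """Fraction of error timestamps within +/- window_s of any switching edge."""
    edges = sorted(edge_ts)
    if not error_ts or not edges:
        return 0.0
    hits = 0
    for t in error_ts:
        i = bisect_left(edges, t)
        # The nearest edge is either just before or just after t.
        neighbors = edges[max(i - 1, 0): i + 1]
        if any(abs(t - e) <= window_s for e in neighbors):
            hits += 1
    return hits / len(error_ts)
```

A high fraction only suggests dv/dt injection. It is worth comparing against the fraction expected by chance (edge rate times window width) and repeating the measurement at a changed edge rate before concluding anything.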
Diagram intent: a single ordered chain (Connector→TVS→CMC→Rterm→isoPHY→MCU) with explicit EMC hooks (shield/chassis) and default-state focus.
Scope: Engineer chain robustness without relying on physical-layer theory
This blueprint defines the internal measurement transport as a controllable chain: topology choices, robustness knobs
(segmentation, bypass, redundancy, diagnostic taps), and acceptance criteria that prevent single-point failures from taking down the pack.
Excluded: isoSPI waveform/coding explanations. The focus is deployable constraints, containment boundaries, and testable outcomes.
Card A · Chain Topologies (What Actually Ships)
Single Chain: lowest wiring and BOM. Risk: single-point faults can propagate without segmentation.
Focus: segment boundaries + bypass policy.
Dual Path / Ring: higher availability via alternate paths. Risk: complexity and validation scope increase.
Focus: deterministic failover behavior and consistent acceptance windows.
Segmented Chain (default template): containment by boundaries. Risk: boundary implementation must be test-visible.
Focus: per-segment health counters and isolated recovery.
Scaling: Node count N and harness length L must be tied to a per-segment acceptance window (X/Y/N).
Containment: Every topology must declare “fault blast radius” as a design output, not an assumption.
Card B · Robustness Knobs (Implementation + Side Effects + Pass Criteria)
Segmentation
Prevents full-chain collapse.
Side effect: more boundaries to validate.
Pass: a segment fault does not affect others within X.
Bypass
Skips a bad node/segment.
Side effect: added elements to audit.
Pass: bypass engages within Y ms and chain returns stable.
Redundancy
Alternate path for availability.
Side effect: validation matrix growth.
Pass: failover does not create retry storms (≤ X retries).
Diagnostic Tap
Fast isolation of the failing segment.
Side effect: extra access points.
Pass: MTTR reduced and fault localization within N steps.
Health Counters
Turns “sporadic” into measurable bursts.
Side effect: counter definition must be normalized.
Pass: burst rate ≤ X/hour @ window Y min.
Power-Fault Decoupling
Prevents brownout masquerade.
Side effect: more power states to test.
Pass: UVLO events do not appear as link faults.
Rule: each knob must have an observable signal (counter/log) and a bounded recovery policy (latch/clear placeholders).
Card C · Chain Acceptance (BER/CRC, Retry Storm, Recovery Policy)
• Error quality: CRC/BER burst rate ≤ X/hour, measured over Y minutes, reported per segment (A/B/C).
• Recovery: drop/rejoin recovery ≤ Y ms; recovery must not cascade to adjacent segments.
• Retry control: retries capped at X; backoff policy prevents “more retries → less stability”.
• Evidence: acceptance must be reproducible across benches and labs by normalizing the same window/denominator.
Containment: Single segment failure must be isolated by boundary behavior (no full-chain blackout).
Observability: Faults must be diagnosable with counters/taps without opening protocol-level debates.
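The retry-control criterion (a hard cap plus backoff) can be sketched as below. This is a minimal sketch: exponential backoff is one possible policy chosen for illustration, and `transfer`, `sleep_ms`, and all numeric values are hypothetical names and placeholders rather than part of any isoSPI driver API.

```python
def backoff_schedule(max_retries, base_delay_ms, max_delay_ms):
    """Delays (ms) before each retry: exponential, capped, finite length."""
    return [min(base_delay_ms * (2 ** attempt), max_delay_ms)
            for attempt in range(max_retries)]


def run_with_retry_cap(transfer, max_retries, base_delay_ms, max_delay_ms, sleep_ms):
    """Call transfer() until it succeeds or the cap is hit.

    Returns (ok, retries_used) so the retry count can be logged as evidence.
    """
    if transfer():
        return True, 0
    schedule = backoff_schedule(max_retries, base_delay_ms, max_delay_ms)
    for used, delay in enumerate(schedule, start=1):
        sleep_ms(delay)  # wait before retrying so retries do not pile up
        if transfer():
            return True, used
    return False, max_retries
```

Because the schedule is a finite list, the number of retries is bounded by construction, and the growing delay keeps a disturbed segment from being hammered while a transient is still in progress.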
Diagram intent: segmentation boundaries (BND) and bypass blocks (BYP) constrain the fault blast radius, while diagnostic taps enable fast localization.
H2-07 · Isolated Power Strategy (Bias Rails, Start-Up, Brownout Behavior)
Intent: Bind isolation power behavior to comm stability and safety defaults
This section defines a deployable isolated power plan for BMS/HV nodes: required secondary rails, no-load loss targets,
and start-up / UVLO behavior that prevents brownout from masquerading as link failure.
Excluded: topology teaching (flyback/LLC details) and control-loop design. Focus is selection knobs + acceptance criteria (X/Y/N placeholders).
Card A · Secondary Rail Inventory (Voltages, Power Modes, No-Load)
Rail list (fill-in)
• 5V (logic) = X mA avg / Y mA peak
• 3.3V (MCU/IO) = X mA avg / Y mA peak
• ±V (analog/bias) = ±X V, I = Y mA
• Gate/bias (if used) = X V, Ipk = Y, ripple ≤ N mVpp
Power modes (fill-in)
• Sleep: P ≤ X mW, Iq ≤ Y µA
• Idle: P ≤ X mW
• Active: P ≤ X mW @ worst traffic
No-load loss
Target: ≤ X mW (aligned with pack standby budget).
Check: measure at Vin = X, Ta = Y, output load = 0.
Dynamic behavior
Load step: ΔI = X mA.
Allowed droop: ≤ Y mV.
Recovery: ≤ N µs (measurement bandwidth noted).
Evidence hooks: define mandatory test points on both sides of the barrier (primary input, secondary rails, PG/UVLO flags).
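The dynamic-behavior targets (allowed droop and recovery time after a load step) can be evaluated from a captured waveform roughly as follows. This is a minimal sketch, assuming samples are `(time_us, volts)` pairs that begin at the load step; the settling band `band_mv` is a hypothetical parameter standing in for the program's own definition of "recovered".

```python
def droop_and_recovery(samples, v_nominal, band_mv):
    """Return (max_droop_mv, recovery_us) for a load-step capture.

    recovery_us is the time of the first sample after which the rail stays
    within +/- band_mv of nominal, or None if it never settles in the capture.
    """
    max_droop_mv = max(0.0, max((v_nominal - v) * 1000.0 for _, v in samples))
    last_bad = -1
    for i, (_, v) in enumerate(samples):
        if abs(v - v_nominal) * 1000.0 > band_mv:
            last_bad = i  # remember the last out-of-band sample
    if last_bad == len(samples) - 1:
        return max_droop_mv, None
    recovery_us = samples[last_bad + 1][0] - samples[0][0]
    return max_droop_mv, recovery_us
```

The result depends on the sample spacing and on the measurement bandwidth, which is why the bandwidth should be stored next to the recovery number.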
Card B · Regulate→Isolate vs Isolate→Regulate (Noise, Efficiency, Cost, Measurability)
Noise containment
Prefer the flow that keeps high dv/dt energy away from sensitive secondary rails.
Pass: secondary ripple ≤ X mVpp under worst switching.
No-load efficiency
Pick the path that meets standby loss ≤ X mW.
Pass: sleep Iq ≤ Y µA for N hours stability.
Start-up sequencing
Choose the flow that guarantees isoPHY/MCU dependencies.
Pass: rails reach valid window within X ms in a defined order.
Measurability
Prefer architectures with clear test points for EOL.
Pass: PG/UVLO observable without opening the enclosure.
BOM risk
Minimize unique rails and parts when possible.
Pass: alternative parts keep acceptance unchanged (X/Y/N).
Failure containment
Define which rail faults trigger safe-state vs degraded mode.
Pass: no “bus held dominant” events during faults.
Rule: selection is an architecture-level decision tied to acceptance windows (noise/sequence/fault containment), not a single converter efficiency number.
Card C · Brownout / UVLO System Policy (Default · Detect · Recover)
Default (safe-state)
• On UVLO: isoPHY output defaults are defined (dominant/recessive = X).
• On UVLO: chain behavior is defined (bypass on/off = X).
• Goal: no retry storm and no “bus pinned” condition.
Detect (diagnosable)
• UVLO / OT / PG flags latched to logs (X fields).
• Counter window normalized (Y minutes).
• Brownout rate ≤ X/day under defined workload.
Recover (bounded)
• Re-enable criteria: Vsec > X for Y ms.
• Debounce / cooldown: N ms to prevent oscillation.
• Recovery time ≤ X ms without cascading failures.
Pass criteria (system)
• No phantom link faults during UVLO events.
• Recovery does not trigger new faults in adjacent segments.
• Default states remain consistent across power cycles (N runs).
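The Recover rules (re-enable only after Vsec stays above a threshold for a hold time, with a cooldown to prevent oscillation) can be sketched as a small state machine. This is a minimal sketch: the class name, the separate `v_off` hysteresis threshold, and all numeric values are assumptions used for illustration.

```python
class UvloRecovery:
    """Bounded re-enable logic for a secondary rail."""

    def __init__(self, v_on, v_off, hold_ms, cooldown_ms):
        self.v_on = v_on                # re-enable threshold (Vsec > X)
        self.v_off = v_off              # UVLO trip threshold (assumed hysteresis)
        self.hold_ms = hold_ms          # Vsec must stay above v_on this long (Y ms)
        self.cooldown_ms = cooldown_ms  # minimum time after a trip (N ms)
        self.enabled = False
        self._above_since = None
        self._disabled_at = None

    def update(self, t_ms, vsec):
        """Feed one (time, voltage) sample; return True while the link is enabled."""
        if self.enabled:
            if vsec < self.v_off:
                self.enabled = False
                self._disabled_at = t_ms
                self._above_since = None
            return self.enabled
        if vsec > self.v_on:
            if self._above_since is None:
                self._above_since = t_ms
            held = t_ms - self._above_since >= self.hold_ms
            cooled = (self._disabled_at is None
                      or t_ms - self._disabled_at >= self.cooldown_ms)
            if held and cooled:
                self.enabled = True
        else:
            self._above_since = None  # any dip restarts the hold timer
        return self.enabled
```

Keeping the hold timer and the cooldown as separate conditions means a rail that bounces around the threshold cannot re-enable the link repeatedly, which is one way to avoid the retry storm named in the Default rules.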
Diagram intent: rail inventory and UVLO/PG behavior must be explicit across the barrier so brownout becomes a diagnosable system event, not a “mystery link fault”.
H2-08 · EMC & Transient Immunity in HV Switching Environments
Intent: Make common-mode injection paths explicit and controllable
This section reduces HV dv/dt susceptibility to a small set of dominant common-mode injection paths, then binds each
mitigation knob (barrier C, Y-cap, shield/return, damping) to an evidence-ready test matrix and pass criteria placeholders.
Excluded: EMC encyclopedias. The focus is isolation-system paths, knobs, and acceptance matrices (ESD/EFT/Surge/ISO7637 placeholders).
Card A · The 3 Dominant Injection Paths (Source → Coupling → Victim)
Path 1: dv/dt → Barrier C / stray C → Isolator / receiver threshold
Symptom: burst errors, false toggles aligned with switching edges.
Path 2: Shield current → bond discontinuity → return reroute
Symptom: faults triggered by harness movement, enclosure state, or service tooling.
Path 3: Y-cap path → leakage current ↔ EMI margin
Symptom: EMI improves but leakage limits (medical/portable) are violated or new injection appears.
Rule: mitigation must state which path is being weakened and what new constraint is introduced (efficiency, leakage, thermal).
Card B · Mitigation Knobs (What They Change, Side Effects, Pass Criteria)
Edge-rate
Reduce dv/dt at the source.
Side effect: switching loss/thermal.
Pass: burst ≤ X/hour with Ta = Y.
CMC placement
Control CM currents at the port.
Side effect: layout constraints.
Pass: EMI + comm stability both pass after relocation.
Barrier C target
Lower capacitive coupling across the barrier.
Side effect: EMI may worsen without other knobs.
Pass: barrier C ≤ X pF and EMI passes.
RC damping
Snubbers / damping reduce ringing.
Side effect: added loss/heat.
Pass: peak overshoot ≤ X, no thermal runaway.
Return shaping
Make return path short and predictable.
Side effect: mechanical constraints.
Pass: no “door open” sensitivity in N trials.
Shield bonding
Define bonding policy and continuity.
Side effect: system grounding coordination.
Pass: continuity verified at N points, no intermittent spikes.
Y-cap minimalism
Use Y-caps sparingly and deliberately.
Side effect: leakage budget.
Pass: leakage ≤ X while EMI still passes.
Rule: never optimize for EMI alone—each knob must pass a dual gate: EMI + burst stability, and where applicable leakage.
ESD (±X kV)
Record: resets, reconnect time, latch states.
Pass: recovery ≤ Y ms, no stuck states (N shots).
EFT (X kV)
Record: burst density in a normalized window (Y min).
Pass: burst ≤ X/hour, no retry storm.
Surge (1.2/50 or 10/1000)
Record: TVS/CMC temperature & leakage drift.
Pass: drift ≤ N%, no latent short/open.
ISO 7637 (if automotive)
Record: UVLO/PG events vs “link fault” events.
Pass: no phantom link faults during power pulses.
Normalization rule: the same denominator and window must be used across benches and labs, otherwise “pass/fail” becomes non-comparable.
Diagram intent: show where CM energy originates, how it couples through barrier capacitance, shield/return discontinuities, and Y-cap paths, and why mitigation must pass both EMI and stability gates.
H2-09 · Layout, Creepage/Clearance & Leakage Control (The Non-Negotiables)
Intent: Turn insulation geometry and partitioning into reviewable layout rules
This section locks down non-negotiable PCB rules for isolated BMS/HV systems: primary/secondary partitioning,
keep-out geometry, creepage/clearance implementation (slot/coating/derating), and leakage-controlled Y-cap usage.
Excluded: full standards text. Only executable layout checklist + pass/fail gates (X/Y/N placeholders).
Card A · Partition Rules (No Cross-Gap Return, Controlled Cross-Barrier Parts Only)
Domains
Primary and Secondary copper/planes must not bridge the isolation gap; no “hidden bridges” (plane islands, via annuli, copper slivers).
Pass: keep-out violations = 0.
Return
Reference and return paths must stay inside the same domain; no test-clip grounds that create cross-gap return loops.
Pass: cross-gap return loops = 0; critical loop area ≤ X.
Crossing
Only safety-qualified cross-barrier components are allowed (e.g., certified isolation devices and safety Y-cap in a single controlled position).
Pass: non-safety cross-gap parts = 0.
Labeling
Domain boundaries and barrier line must be explicitly marked on drawings and review outputs.
Pass: labeling completeness = Y/N.
Review gate: if a return path or copper area crosses the gap, the design is considered non-compliant regardless of component ratings.
Slot / milling
Use slots to extend surface path and reduce contamination-driven leakage.
Side effects: mechanical strength + fab tolerance + cleaning requirements.
Pass: slot geometry meets X/Y; no burr/edge defects (N samples).
Keep-out geometry
Enforce minimum clearance with a defined keep-out band around the barrier.
Side effects: routing density increases; early stack-up planning required.
Pass: keep-out width ≥ X mm; opposing copper area ≤ Y%.
Conformal coating / potting
Apply coating to raise pollution tolerance and reduce surface leakage variability.
Side effects: rework difficulty; process consistency must be verified.
Pass: coating thickness ≥ X; coverage verified at N points.
Altitude derating
Increase geometric margins for high altitude environments.
Side effects: PCB area/cost increases.
Pass: altitude ≥ X requires margin adders = Y (placeholder).
Process gate: cleanliness and residue control must be treated as part of creepage reliability (recorded and auditable).
Card C · Y-Cap Boundaries (EMC Benefit vs Leakage Limits)
Placement
Y-cap is only allowed at the single controlled location shown in the geometry diagram; random placement is prohibited.
Pass: allowed location usage = Y/N.
Total C
Total Y-cap budget must be bounded to control both injection and leakage.
Pass: ΣCY ≤ X nF (placeholder) at defined operating voltage.
Leakage gate
Leakage/touch-current constraints must be satisfied in the target category (medical/portable tighter).
Pass: leakage ≤ X while EMI still passes (dual gate).
Dual gate rule: a design cannot “buy” EMI margin with Y-cap if leakage criteria are violated.
Diagram intent: the barrier is a geometric system, not a symbol. Keep-out bands, slots, and a controlled Y-cap location enforce creepage/clearance and predictable return behavior.
Intent: Make isolated faults diagnosable, reproducible, and accountable
This section defines an engineering-grade event model for BMS/HV isolation systems: a normalized event dictionary,
latch/clear policies, and black-box logging fields that enable field forensics and stable recovery without retry storms.
Excluded: ASIL/IEC theory. Only deployable event definitions, policies, and evidence capture requirements.
Card A · Event Dictionary (Normalized Names, Domains, Triggers, Actions)
UV · UVLO_SEC: secondary undervoltage (trigger: Vsec < X for Y ms) → safe-state → log rail + PG/UVLO + recovery time.
OT · OT_ISO: isolator/phy/module overtemp (trigger: T > X) → degrade or latch → log T profile + duty state.
OC · OC_SEC: secondary overcurrent (trigger: I > X for Y) → limiter/trip → log rail droop + fault duration.
Burst · CRC_BURST: CRC burst over a normalized window (burst > X in Y min) → classify severity → log window definition + denominator.
Bus · BUS_OFF: bus-off or link down event (trigger: X occurrences in Y) → bounded recovery → log retry count and cooldown.
Storm · RETRY_STORM: retries exceed cap (retries > X per Y) → throttle/backoff → log throttle state and cause.
Normalization rule: every rate must declare the denominator and window, otherwise events cannot be compared across benches or labs.
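The normalization rule can be enforced in tooling by refusing to compare rates whose window or denominator differ. This is a minimal sketch; the `RateReport` fields and the denominator labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateReport:
    count: int                # events counted in the window
    window_min: float         # declared counter window length
    denominator: str          # declared unit, e.g. "hour" or "kframes"
    denominator_value: float  # hours observed, or thousands of frames observed

    def rate(self) -> float:
        return self.count / self.denominator_value


def rate_difference(a: RateReport, b: RateReport) -> float:
    """Compare two reports only if they declare the same window and denominator."""
    if (a.window_min, a.denominator) != (b.window_min, b.denominator):
        raise ValueError("reports use different window/denominator")
    return a.rate() - b.rate()
```

Raising an error instead of silently converting units keeps a bench result in bursts per hour from being compared with a lab result in bursts per 1k frames.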
Card B · Policy (Latched vs Auto-Recover vs Degrade) + Clear Conditions
Latched
Use for safety-relevant or repeatable fault patterns that require accountability.
Clear: service tool command OR stable operation for Y minutes after repair (placeholder).
Pass: no silent self-clear for latched class events.
Auto-recover
Use for bounded transient events with controlled recovery and no cascading failures.
Recover: V/T back in window for Y ms, cooldown N ms (placeholder).
Pass: recovery attempts capped at X; no retry storms.
Degrade mode
Maintain essential functions with reduced performance when partial segments fail.
Pass: degraded operation still meets essential metric ≥ X (placeholder) without triggering new fault loops.
Hard rule: every automatic recovery must leave evidence in logs (action_taken + reason + bounded attempts).
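The hard rule (every automatic recovery leaves evidence) can be sketched by making the log write part of the policy decision itself. This is a minimal sketch: the event names come from the dictionary in Card A, but the policy assignment per event, the `MAX_ATTEMPTS` cap, and the log field layout are assumptions.

```python
# Hypothetical policy table; real assignments are a program decision.
POLICY = {
    "UVLO_SEC": "auto_recover",
    "CRC_BURST": "auto_recover",
    "OT_ISO": "latched",
}
MAX_ATTEMPTS = 3  # hypothetical cap (the page's X placeholder)


def handle_event(name, attempts_so_far, log):
    """Decide the action for an event and always append a log entry."""
    policy = POLICY.get(name, "latched")  # unknown events fall back to latched
    if policy == "auto_recover":
        if attempts_so_far < MAX_ATTEMPTS:
            action, reason = "recover", "transient within attempt cap"
        else:
            action, reason = "latch", "attempt cap reached"
    else:
        action, reason = "latch", "latched-class event"
    log.append({
        "event": name,
        "action_taken": action,
        "reason": reason,
        "attempts_so_far": attempts_so_far,
    })
    return action
```

Because the function cannot return without appending to `log`, a silent self-clear is not possible through this path, and an auto-recover event that exhausts its attempts is escalated to a latch.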
Card C · Field Forensics (Timestamp, Counter Window, Ring Buffer, Snapshot)
Timestamp
Single time base (placeholder) and declared resolution.
Pass: event ordering is stable under N resets.
Counter window
Window length = Y; denominator declared (per minute / per 1k frames / per segment).
Pass: same window used across tools and labs.
Ring buffer
Capacity: N events; retains ≥ X minutes history.
Pass: no overwrite before minimum retention.
Snapshot
Capture pre/post window: ±X ms around trigger.
Include: rails (PG/UVLO), thermal flags, link state, retry counters, recovery action.
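The ring buffer and snapshot requirements can be sketched with a fixed-capacity deque. This is a minimal sketch: the `BlackBox` class and its field names are hypothetical, and a real logger would also have to persist across resets.

```python
from collections import deque


class BlackBox:
    """Fixed-capacity event log with a time-window snapshot."""

    def __init__(self, capacity):
        # When full, appending drops the oldest entry first.
        self._events = deque(maxlen=capacity)

    def record(self, t_ms, name, **fields):
        self._events.append({"t_ms": t_ms, "event": name, **fields})

    def snapshot(self, trigger_t_ms, half_window_ms):
        """Events within +/- half_window_ms of the trigger time."""
        return [e for e in self._events
                if abs(e["t_ms"] - trigger_t_ms) <= half_window_ms]

    def retained_span_ms(self):
        """Time covered by the buffer, for checking the minimum-retention gate."""
        if not self._events:
            return 0
        return self._events[-1]["t_ms"] - self._events[0]["t_ms"]
```

Comparing `retained_span_ms()` with the required retention at the worst-case event rate is one way to size the capacity N before the "no overwrite before minimum retention" gate is tested.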
Diagram intent: faults become auditable events by enforcing a single pipeline—classification, bounded actions, and minimum logging fields with normalized windows.
H2-11 · Validation & Production Test Plan (Hi-Pot, PD, Comms Margining)
Convert “ship-ready” into executable gates: safety (hi-pot/PD/impulse), communication robustness (error bursts + recovery),
and regression across temperature/rail/dv/dt/harness/EMI. Thresholds are placeholders (X/Y/N) to be locked per program.
Card A · Safety Gates: Barrier Integrity (What to test + what to store)
Hi-pot gate: level = X kVrms, dwell = Y s, leakage ≤ N μA. Store: ramp rate, dwell, trip limit, instrument profile ID.
Partial discharge (PD) gate: PD inception ≥ X kVrms; PD magnitude ≤ N pC at Y kVrms. Store: bandwidth/method, sensor coupling, noise floor.
Impulse / surge gate: waveform class = X (e.g., 1.2/50 μs), peak = Y kV, repetitions = N. Store: source impedance, coupling path, DUT config.
Barrier screen: hi-pot quick screen at X for Y. Store leakage + trip count + profile ID.
Comms smoke: CAN ping + isoSPI scan. Total test time ≤ X s. Store CRC counts + recovery time.
Power policy: UVLO toggle. Restart time ≤ X ms. Store latch/clear state + reason code.
Golden record: board SN, firmware hash, test script hash, ambient, counters window definition, pass/fail codes, operator/site ID.
Field forensics requires reproducibility: keep “window definition + timestamp + counter snapshot + trigger reason + recovery-policy version” for every failure.
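A golden record can be sketched as an immutable structure serialized deterministically, with hashes computed from the actual firmware and script bytes. This is a minimal sketch: the field names follow the list above, but the types, the choice of SHA-256, and the JSON layout are assumptions.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class GoldenRecord:
    board_sn: str
    firmware_hash: str
    test_script_hash: str
    ambient_c: float
    counter_window_min: float
    pass_fail_codes: tuple
    site_id: str


def sha256_hex(data: bytes) -> str:
    """Hex digest used to pin the firmware image or test script."""
    return hashlib.sha256(data).hexdigest()


def serialize(record: GoldenRecord) -> str:
    """Stable JSON (sorted keys) so identical records produce identical text."""
    return json.dumps(asdict(record), sort_keys=True)
```

Sorting the keys makes the stored text byte-for-byte reproducible, so two stations that test the same board with the same script produce records that can be compared directly.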
Matrix is expressed as an SVG grid (not HTML tables) to prevent mobile overflow. Each cell is a program-locked gate: X/Y/N + script ID + evidence fields.
H2-12 · Applications & IC Selection (Quick Pairings for BMS/HV)
Provide reusable pairing templates (minimal stacks) and a selection funnel that maps directly to test gates (H2-11).
This section lists concrete part numbers as examples for BOM alignment.
H3-A
Applications (Reference Pairings)
Pairings are “templates”: keep the same port responsibilities and swap parts inside each layer (protection / choke / isoPHY / power) without changing system partitioning.
1) Pack-to-Vehicle / Charger: Isolated CAN-FD Port Template
Integrated isoCAN-FD + isolated power: Analog Devices ADM3057E (isolated CAN transceiver with integrated isolated DC/DC).
Isolated CAN-FD (signal only): Texas Instruments ISO1042 (galvanically-isolated CAN transceiver; add isolated power as needed).
Discrete stack (barrier + PHY): TI ISO7721 (reinforced digital isolator) + TI TCAN1051-Q1 (CAN FD transceiver).
Isolated flyback converter: Analog Devices LT8304 (no-opto isolated flyback, as a building block when higher power is needed).
This map only selects templates. Detailed constraints remain in the dedicated chapters (barrier targets, port stacks, EMC paths, and validation gates).
H3-B
IC Selection Logic (Funnel: Safety → Immunity → Timing → EMI → Power → Cost)
1) Safety
Lock system inputs first: Vwork, Vtransient, altitude, pollution degree, lifetime, PD gate. Then map to VIORM/clearance/creepage targets.
Example isoCAN-FD parts: ADM3057E, ISO1042 (use the program’s reinforced requirement to select the correct grade/package).
2) Immunity
Set CMTI/dv/dt target from the HV switching environment and verify via the H2-11 dv/dt sweep.
Discrete barrier option when integration is not desired: ISO7721 + TCAN1051-Q1.
3) Timing
CAN-FD: define loop delay + recovery SLO (≤ X ms). Avoid “works on bench, fails after burst”.
isoSPI: define segmentation and burst/retry ceilings. Host bridge example: LTC6820.
4) EMI
Barrier capacitance + common-mode emission are system properties; verify with a fixed layout and stable shield/Y-cap policy.
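The funnel order can be sketched as an ordered filter that reports the first stage at which a candidate drops out. Everything below is illustrative: the part names, attribute keys, and numbers are placeholders and not datasheet values, and only the four stages detailed above are modeled (Power and Cost would be appended the same way).

```python
# Ordered stages: a candidate is rejected at the FIRST stage it fails.
STAGES = [
    ("safety",   lambda p, r: p["viorm_v"] >= r["viorm_v_min"]),
    ("immunity", lambda p, r: p["cmti_kv_us"] >= r["cmti_kv_us_min"]),
    ("timing",   lambda p, r: p["loop_delay_ns"] <= r["loop_delay_ns_max"]),
    ("emi",      lambda p, r: p["barrier_c_pf"] <= r["barrier_c_pf_max"]),
]


def first_failed_stage(part: dict, req: dict):
    """Return the name of the first failed stage, or None if all pass."""
    for name, passes in STAGES:
        if not passes(part, req):
            return name
    return None


def run_funnel(parts: dict, req: dict):
    """Split candidates into survivors and {part: first_failed_stage}."""
    survivors, rejected = [], {}
    for name, attrs in parts.items():
        stage = first_failed_stage(attrs, req)
        if stage is None:
            survivors.append(name)
        else:
            rejected[name] = stage
    return survivors, rejected


# Placeholder requirements and candidates (NOT real part data).
requirements = {"viorm_v_min": 1000, "cmti_kv_us_min": 100,
                "loop_delay_ns_max": 250, "barrier_c_pf_max": 2.0}
candidates = {
    "PART_A": {"viorm_v": 1200, "cmti_kv_us": 150, "loop_delay_ns": 200, "barrier_c_pf": 1.0},
    "PART_B": {"viorm_v": 1200, "cmti_kv_us": 50,  "loop_delay_ns": 200, "barrier_c_pf": 1.0},
    "PART_C": {"viorm_v": 600,  "cmti_kv_us": 50,  "loop_delay_ns": 300, "barrier_c_pf": 3.0},
}
survivors, rejected = run_funnel(candidates, requirements)
```

Reporting only the first failed stage mirrors the funnel idea: a part that misses the safety target is never evaluated for timing or EMI, which keeps the selection record short and auditable.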
Field triage + acceptance only. Each item is constrained to: Likely cause → Quick check → Fix → Pass criteria (X/Y/N placeholders).
FAQ 01 · CAN-FD bench OK, but in-vehicle bus-off bursts: first suspect dv/dt injection or shield bond?
Likely cause: Common-mode dv/dt couples through barrier capacitance / return discontinuity, or shield bond is intermittent causing CM surge into the receiver.
Quick check: Correlate bus-off timestamps with inverter switching; log TEC/REC with a fixed window; verify 360° shield termination continuity (both ends) and measure CM at the connector.
Fix: Enforce a single, robust shield bond strategy; move CMC to the connector; tighten return paths; reduce edge-rate or add snubbers; prefer lower barrier-C / higher CMTI isolation where needed.
Pass criteria: Bus-off = 0 over Y minutes at X kV/µs; TEC/REC < N; recovery time ≤ X ms (same harness length L and EMI state).
FAQ 02 · isoSPI chain CRC bursts only when inverter switches: barrier-C coupling or segment grounding?
Likely cause: CM injection through the isolation/coupling path (effective barrier C) or segment reference bounce converting CM noise into differential errors.
Quick check: Run inverter switching OFF/ON A/B test; measure burst density (errors per Y frames) and burst length; probe CM at segment boundaries and verify segment ground/keep-out integrity.
Fix: Add/relocate CMC close to connectors; enforce segmentation + bypass; improve return continuity; reduce effective coupling; add dv/dt damping (snubber/edge control) at the source.
Pass criteria: CRC/PEC ≤ X per Y frames; burst length ≤ N; re-sync ≤ X ms under dv/dt = X kV/µs.
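The two metrics in the quick check, burst density (errors per Y frames) and burst length, can be computed from a per-frame error log. This sketch assumes the log is a simple sequence of 0/1 flags (1 = CRC/PEC error) and that windows are consecutive and non-overlapping; both are assumptions, and the window size stands in for the Y placeholder.

```python
def burst_lengths(flags):
    """Lengths of each run of consecutive errored frames."""
    runs, run = [], 0
    for bad in flags:
        if bad:
            run += 1
        elif run:
            runs.append(run)
            run = 0
    if run:  # log ended inside a burst
        runs.append(run)
    return runs


def errors_per_window(flags, window):
    """Error count in each full, non-overlapping window of `window` frames.

    A trailing partial window is dropped so every count shares one denominator.
    """
    n_windows = len(flags) // window
    return [sum(flags[i * window:(i + 1) * window]) for i in range(n_windows)]


# Illustrative log: 1 marks a frame with a CRC/PEC error.
log = [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0]
runs = burst_lengths(log)
density = errors_per_window(log, window=6)
```

Running the same two functions on the inverter-OFF and inverter-ON captures gives a like-for-like A/B comparison, because the window definition is identical in both runs.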
FAQ 03 · EMC passed after adding Y-caps, but leakage current fails: what knob first?
Likely cause: Total Y-capacitance (and its chassis reference point) drives 50/60 Hz leakage beyond the system limit, especially with new ground/PE paths.
Quick check: Measure leakage at worst-case line voltage and frequency; inventory Y-cap count and values; A/B test with one cap removed to estimate dI/dC.
Fix: Reduce total Y-cap C first (sum-C); relocate to a controlled chassis node; recover EMC margin using CMC placement, edge-rate control, and return-path cleanup (not more Y-cap).
Pass criteria: Leakage ≤ X (µA or mA) at Y Vac/Hz; EMC margin ≥ N dB in the target band with the finalized cap set.
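For a first-order budget, line-frequency leakage through the total Y-capacitance can be estimated as I = 2π·f·V·C. The sketch below assumes ideal capacitors, a sinusoidal voltage across them, and no other leakage path; the 230 V / 50 Hz / 4.7 nF figures are illustrative only and not program limits.

```python
import math


def ycap_leakage_a(v_rms: float, freq_hz: float, c_total_f: float) -> float:
    """RMS leakage current (A) through an ideal total Y-capacitance."""
    return 2 * math.pi * freq_hz * v_rms * c_total_f


def max_total_ycap_f(i_limit_a: float, v_rms: float, freq_hz: float) -> float:
    """Largest total Y-capacitance (F) that stays within a leakage limit."""
    return i_limit_a / (2 * math.pi * freq_hz * v_rms)


# Illustrative numbers: 4.7 nF total at 230 Vrms, 50 Hz.
leak_a = ycap_leakage_a(230.0, 50.0, 4.7e-9)
# Inverse question: how much total C fits under that same current?
c_back_f = max_total_ycap_f(leak_a, 230.0, 50.0)
```

Because the relation is linear in C, the A/B test with one capacitor removed should change the measured leakage in proportion to the removed capacitance; a much smaller change suggests another leakage path dominates.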
FAQ 04 · Different labs give different hi-pot/PD results: what must be normalized in setup & criteria?
Likely cause: Test profile mismatch (ramp rate, dwell, trip limit), PD bandwidth/noise floor differences, or fixture parasitics/humidity creating non-equivalent results.
Quick check: Compare profiles line-by-line (kV/s, dwell s, trip µA; PD band/method); run the same golden unit in both labs using the same fixture drawing and preconditioning.
Fix: Freeze a single test profile ID + fixture spec + preconditioning; define PD/hipot evidence fields as mandatory for every report; reject results without matching metadata.
Pass criteria: With fixed profile ID: leakage ≤ X µA; PD ≤ N pC at Y kVrms; cross-lab delta ≤ X% across N samples.
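The cross-lab delta in the pass criteria needs a fixed definition before two labs can be compared. One possible definition, assumed here rather than taken from any standard, is the worst-case relative difference between paired readings of the same golden units, referenced to the mean of each pair.

```python
def cross_lab_delta_pct(lab_a, lab_b):
    """Worst-case relative difference (%) over paired readings of the same units.

    Readings must be positive and listed in the same unit order for both labs.
    """
    if len(lab_a) != len(lab_b) or not lab_a:
        raise ValueError("need equal-length, non-empty paired readings")
    worst = 0.0
    for a, b in zip(lab_a, lab_b):
        ref = (a + b) / 2
        if ref <= 0:
            raise ValueError("readings must be positive")
        worst = max(worst, abs(a - b) / ref * 100)
    return worst


# Illustrative leakage readings (same two units measured in each lab).
delta = cross_lab_delta_pct([10.0, 20.0], [10.0, 22.0])
```

Whatever definition is chosen, it belongs in the frozen test profile alongside ramp rate and dwell, so that the X% limit means the same thing in every report.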
FAQ 05 · After cold soak, chain link flaps for first 2 minutes: UVLO threshold or start-up sequencing?
Likely cause: Secondary rail ramp/UVLO window is marginal at cold, or communication starts before bias rails and references settle (warm-up/soft-start gap).
Quick check: Capture secondary rail ramp at Tmin; log UVLO events with timestamps; compare reset deassert time vs first comm activity.
Fix: Increase UVLO hysteresis or raise threshold margin; delay comm enable until rail stable; improve soft-start/inrush control; prevent burst/skip mode during start-up if needed.
Pass criteria: Zero link flaps in first Y minutes at Tmin; UVLO events = 0; comm-ready ≤ X s after power-on.
FAQ 06 · One module swap breaks the whole chain: addressing/termination or bypass policy?
Likely cause: No segmentation/bypass containment, pinout/orientation mismatch, or termination/coupling discontinuity that propagates failure across the full chain.
Quick check: Locate the last-good node using a scan; verify bypass state and module ID; check connector keying/pinout; measure continuity across the swapped module.
Fix: Add segmentation boundaries; implement or enable bypass; standardize mechanical keying; add a diagnostic tap per segment; define swap procedure + acceptance gate.
Pass criteria: Any single-module swap affects ≤ 1 segment; remaining segments enumerate ≥ N-1; recovery ≤ X s with defined cooldown Y ms.
FAQ 07 · CAN recessive level looks fine, still random errors: CMC placement or return-path discontinuity?
Likely cause: CM noise converts to differential errors at discontinuities; CMC is too far from the connector; return path breaks across gaps or chassis bond is inconsistent.
Quick check: Inspect CMC location vs connector; verify return continuity and chassis bond; measure CM on bus lines and correlate errors with load switching events.
Fix: Place CMC at the connector; shorten stubs; enforce a consistent chassis reference strategy; add split termination with controlled center reference where appropriate.
Pass criteria: Error frames ≤ X per Y minutes; bus-off = 0; error bursts ≤ N and recovery ≤ X ms in the validated harness setup.
Likely cause: Lifetime model inputs are inconsistent: altitude derating, temperature acceleration, pollution/CTI, or transient stack-up pushes the real stress above the assumed VIORM conditions.
Quick check: Recompute Vwork + transients using measured data; confirm altitude and thermal profile; verify creepage/clearance and coating/cleanliness against the target class.
Fix: Apply altitude/temperature derating; increase creepage/clearance via slots/coating; upgrade package/class; reduce operating stress by repartitioning or lowering effective CM coupling.
Pass criteria: Lifetime ≥ Y years at Vwork = X and altitude = N; PD inception margin ≥ X% and post-stress leakage delta ≤ Y.
FAQ 09 · Service port works until charger connected: ground reference shift or CM surge path?
Likely cause: Charger introduces a new reference and CM surge path (PE/chassis), shifting common-mode levels and breaking previously stable return assumptions.
Quick check: Measure pack-to-chassis potential shift at connect; capture CM transients; A/B test with charger connected/disconnected while logging error bursts and recovery times.
Fix: Enforce a single-point chassis bond strategy; add CM choke/TVS at the service port; isolate service power; define connect/disconnect sequencing and validation cycles.
Pass criteria: No link drop over N connect cycles; CM transient at receiver ≤ X V; recovery ≤ Y ms with stable error counters.
FAQ 10 · EOL passes, field fails after a week: thermal aging of transformer/resin or contamination/CTI?
Likely cause: Time-dependent drift: transformer hot-spot aging / potting defects, or contamination reduces creepage and CTI margin under humidity and thermal cycling.
Quick check: Log hot-spot temperature and duty cycle; inspect residues/cleanliness; rerun leakage/PD after soak; compare to golden baseline deltas.
Fix: Reduce power loss or temperature rise; improve transformer/encapsulation selection; enforce cleaning + coating process; increase creepage via slots/keep-outs; add burn-in where justified.
Pass criteria: After Y hours at X °C (and humidity profile), leakage/PD within N; no upward trend beyond Δ = X vs baseline.
FAQ 11 · Burst errors appear only at high SoC: HV bus ripple coupling or bias converter no-load mode?
Likely cause: HV bus ripple increases at high SoC, and/or the isolated bias converter enters burst/skip mode at light load, injecting noise into references and thresholds.
Quick check: Measure HV bus ripple and bias-rail ripple versus SoC; correlate ripple peaks with burst errors; force the converter into CCM with a controlled minimum load.
Fix: Add minimum load or change converter mode; improve filtering/damping; adjust switching synchronization or add snubbers; reduce coupling via layout and return-path control.
Pass criteria: CRC/PEC ≤ X per Y minutes across SoC = N%…100%; bias ripple ≤ N mVpp; zero burst events above threshold.
Likely cause: Recovery retries are unbounded or synchronized across nodes; clear conditions are too permissive; faults are not latched, causing oscillation and repeated re-entry.
Quick check: Inspect retry counters per window; verify cooldown timers; reproduce with a single injected fault; confirm the latch/clear state machine via logs.
Fix: Add exponential backoff (min/max + jitter); latch faults with explicit clear criteria; rate-limit retries; constrain reset scope to the smallest impacted segment.
Pass criteria: Recovery attempts ≤ N per Y minutes; stable operation resumes within X s; no oscillation (burst count ≤ N and bus-off = 0).
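The backoff part of the fix (min/max + jitter, bounded attempts) can be sketched as a delay schedule. The function name, the doubling factor, and the ±20% jitter default are assumptions for illustration; the real values map to the X/Y/N placeholders of the program.

```python
import random


def backoff_delays(base_s, max_s, attempts, jitter=0.2, rng=None):
    """Retry delays (s): doubling from base_s, capped at max_s, with +/- jitter.

    `attempts` bounds the number of retries, so the schedule is finite.
    Each delay is clamped to [base_s, max_s] after jitter is applied.
    """
    rng = rng or random.Random()
    delays = []
    for k in range(attempts):
        nominal = min(max_s, base_s * 2 ** k)
        spread = nominal * jitter
        jittered = nominal + rng.uniform(-spread, spread)
        delays.append(min(max_s, max(base_s, jittered)))
    return delays


# Without jitter the schedule is deterministic: 0.1, 0.2, 0.4, 0.8, 1.0 s.
plain = backoff_delays(0.1, 1.0, attempts=5, jitter=0.0)
# With jitter, two nodes that fault together are unlikely to retry in lockstep.
node_a = backoff_delays(0.1, 1.0, attempts=5, rng=random.Random(1))
node_b = backoff_delays(0.1, 1.0, attempts=5, rng=random.Random(2))
```

The jitter term addresses the "synchronized across nodes" cause directly, while the attempt bound and the cap give the retry counter a ceiling that can be checked against the N-per-Y-minutes criterion.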