Fire/Alarm I/O Module: Supervised Isolated DI/DO

H2-1. Definition & Scope of a Fire/Alarm I/O Module

A fire/alarm I/O module is a field-facing expansion node that converts long-cable wiring states into trusted events through four functions: supervised DI, isolated DO, protected 24V, and diagnostic logs. This page stays at the module level (wiring → measurement → decision → event code), not the full panel/system architecture.

What this page covers (module boundary)

DI line supervision: EOL/DEOL windows for open/short/leakage/tamper, plus noise immunity for long cables.
DO actuation: relay or transistor outputs with protection and “proof” (commanded vs observed).
Isolation & protected power: where the barrier sits, what must not cross it, and how power faults are captured as evidence.

What is explicitly out of scope

Full access-control/fire-panel system architecture, building topology planning, or platform/cloud workflows.
Step-by-step protocol-stack tutorials (only interface requirements and measurable signals are referenced).
Certification walkthroughs (only engineering verification points and fault-injection evidence are included).

Engineering meanings (measurable definitions):
Line supervision = a window decision over sensed loop voltage/current (or equivalent ADC bins) that distinguishes normal vs open vs short vs abnormal resistance/leakage.
Isolation = a defined barrier (basic/reinforced) that limits fault energy and noise return paths; withstand ratings are typically “kV-class”, with the exact value set by the safety goal and layout (creepage/clearance).
Diagnostics = “commanded vs observed” checks + event codes that survive power dips (reset reason + brownout markers + log integrity).

Signal	Electrical form	Supervision / Proof	Common failures	Evidence to capture
DI (Supervised loop)	2-wire loop with EOL / DEOL resistors	ADC window bins (normal/open/short/tamper)	Open, short, leakage drift, bypass/tamper	Sensed V/I, ADC bin ID, time-in-fault, fault counter
DI (Dry contact)	Contact closure to reference (no loop power)	Debounce + glitch filtering	Contact bounce, induced spikes, ground noise	Edge timestamps, debounce rejects, noise-hit counter
DI (Wet/active input)	Externally powered sensor / open collector	Threshold + input protection sanity checks	Overvoltage, leakage, miswiring	Input clamp activity, threshold margin, event code
DO (Relay)	Isolated contact output	Optional contact feedback / coil current proof	Stuck contact, coil short/open, inrush reset	Coil current, flyback stress markers, commanded vs observed
DO (Low-side switch)	N-MOSFET sinking load	Current sense / VDS health window	Short-to-supply, overload, thermal shutdown	Load current waveform, OCP/OTP counters, retry mode
DO (High-side switch)	P-MOSFET / high-side driver sourcing load	Current limit + output voltage proof	Short to ground, surge stress, reverse battery	Output V/I, eFuse trips, UVLO resets, fault codes

Figure F1. Module-level boundary: field wiring is measured and classified into event codes through a defined isolation barrier and protected power.

Cite this figure Figure F1 — Fire/Alarm I/O Module scope, interfaces, and partitions (ICNavigator).

H2-2. System Block Diagram & I/O Taxonomy (DI/DO + Supervision)

This chapter establishes the complete signal flow (field wiring → measurement → decision → event code) and the power flow (24V → protection → rails), with the isolation boundary made explicit. The taxonomy below is organized by how correctness is proven, not by marketing labels.

Three non-negotiables (engineering intent)

Every DI state is a window decision: sensed loop V/I maps into bins (normal/open/short/tamper/leakage drift).
Every DO action is “commanded vs observed”: current/voltage/contact feedback proves the output actually happened.
Every fault leaves evidence: reset reason, brownout markers, and event log entries survive noisy 24V environments.

I/O taxonomy (what each class must prove)

Supervised DI loop: prove wiring integrity using EOL/DEOL windows (open/short/abnormal resistance/leakage).
Dry contact DI: prove stability using debounce + glitch rejection (separate bounce from real events).
Active/wet DI: prove safe input range (clamp activity, threshold margin, miswire detection).
Relay DO: prove actuation via coil current and optional contact feedback (detect stuck/contact issues).
FET DO (low/high side): prove load engagement using current sense and output voltage windows (short/overload/thermal).

How the rest of the page stays focused (no scope creep): later chapters deep-dive DI supervision windows, long-cable noise immunity, DO protection + proof, isolation return paths, protected power, and a fault-injection validation plan. No panel-level architecture, no video, no cloud workflows.

Figure F2. A closed evidence loop: measurements feed decisions, decisions emit event codes, and protected power preserves logs through real field disturbances.

Cite this figure Figure F2 — Fire/Alarm I/O Module system block diagram with supervision, isolation, protected power, and evidence paths (ICNavigator).

H2-3. Supervised DI Front-End: EOL/DEOL, Open/Short, Tamper Detection

A supervised DI channel converts long-cable wiring integrity into stable event codes by mapping the measured loop voltage/current into ADC bins with guard bands. The goal is not “a resistor value,” but a repeatable classification: Normal vs OpenFault vs ShortFault vs Tamper, plus SupervisionFail when the measurement can no longer be trusted.

What must be distinguishable (event semantics)

Normal: measurement stays in the expected window with margin.
OpenFault: loop breaks → sensed V/I moves toward “open” bin and remains there.
ShortFault: hard short → sensed V/I collapses into “short” bin with high confidence.
Tamper: loop is bypassed or manipulated → DEOL relationship breaks or falls into a “bypass” region.
SupervisionFail: ADC saturates, reference drifts, or leakage shifts bins so boundaries are ambiguous.

Single EOL vs DEOL (what more bins buy)

Single EOL: typically supports robust separation of Normal / Open / Short with guard bands, but has limited leverage to prove bypass/tamper beyond coarse inconsistencies.
DEOL: expands the “valid states” into multiple resistance combinations, making Tamper detection more reliable by enforcing relationship checks (not just a single threshold).
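To make the "more bins" point concrete, the sketch below computes the sense-node voltage for each loop state of a DEOL channel modeled as a simple resistive divider. All values and names (V_REF, R_PULLUP, R_EOL, R_ALARM, and the series wiring of the two resistors) are illustrative assumptions, not values from a specific design.

```python
import math

V_REF = 3.3        # assumed sense reference voltage (V)
R_PULLUP = 4700.0  # assumed pull-up between V_REF and the sense node (ohms)
R_EOL = 4700.0     # assumed end-of-line resistor (ohms)
R_ALARM = 4700.0   # assumed second DEOL resistor, bypassed when the contact closes (ohms)


def sense_voltage(r_loop, r_cable=0.0):
    """Sense-node voltage for a given loop resistance (ideal divider model)."""
    if math.isinf(r_loop):
        return V_REF  # open loop: no current flows, node sits at the reference
    r_total = r_loop + r_cable
    return V_REF * r_total / (R_PULLUP + r_total)


# Hypothetical loop resistance seen by the front-end in each wiring state.
LOOP_STATES = {
    "short": 0.0,
    "contact_closed": R_EOL,
    "contact_open": R_EOL + R_ALARM,
    "open": math.inf,
}

for name, r_loop in LOOP_STATES.items():
    clean = sense_voltage(r_loop)
    long_run = sense_voltage(r_loop, r_cable=100.0)  # assumed 100 ohm cable run
    print(f"{name:15s} {clean:.3f} V   with cable: {long_run:.3f} V")
```

With these assumed values the four states land at 0 V, 1.65 V, 2.20 V, and 3.30 V. A single-EOL loop would only produce three of these levels, which is why a bypass is harder to tell apart from a valid state.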

Hard problems in the field (where false states come from)

Cable resistance: long runs shift the “Normal” center and reduce margin; guard bands must account for worst-case length.
Leakage & moisture: humidity and contamination create slow drift toward “short-like” bins (partial short).
Protection leakage: TVS/ESD parts add bias/leakage that can compress bin spacing, especially at high temperature.

How to set thresholds without becoming a formulas class

Bins + guard bands: each state has a center region plus a buffer zone to prevent boundary chatter.
Baseline calibration: record an installation baseline (with checks to avoid “learning” a tampered condition).
Temperature handling: use segmented thresholds or small temperature-aware offsets to keep bins separated across drift.
Fail-safe: if separation collapses (ADC near rail / reference unstable), emit SupervisionFail instead of guessing.
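The bin, guard-band, and fail-safe rules above can be sketched as a small classifier. The window limits, the 12-bit ADC range, and the names (Bin, WINDOWS, classify) are hypothetical; the sketch also assumes the front-end keeps a hard short slightly above the lower rail so that "short" and "untrusted" remain distinguishable.

```python
from enum import Enum


class Bin(Enum):
    SHORT = "ShortFault"
    NORMAL = "Normal"
    TAMPER = "Tamper"
    OPEN = "OpenFault"
    SUPERVISION_FAIL = "SupervisionFail"
    GUARD = "GuardBand"  # between windows: caller should hold the previous state


# Hypothetical windows in 12-bit ADC counts; the gaps between them are guard bands.
WINDOWS = [
    (Bin.SHORT, 40, 400),
    (Bin.NORMAL, 1800, 2300),
    (Bin.TAMPER, 2600, 2900),
    (Bin.OPEN, 3500, 4055),
]
ADC_MIN_TRUSTED, ADC_MAX_TRUSTED = 40, 4055  # readings outside are too close to a rail


def classify(adc_code, temp_offset=0, reference_ok=True):
    """Map one ADC reading to a bin; refuse to guess when the reading is untrustworthy."""
    if not reference_ok or adc_code < ADC_MIN_TRUSTED or adc_code > ADC_MAX_TRUSTED:
        return Bin.SUPERVISION_FAIL
    for bin_id, lo, hi in WINDOWS:
        # temp_offset shifts every window by a small temperature-aware correction
        if lo + temp_offset <= adc_code <= hi + temp_offset:
            return bin_id
    return Bin.GUARD


print(classify(2000), classify(2450), classify(4090))
```

Returning a distinct GUARD result, rather than snapping to the nearest window, is one way to keep boundary chatter out of the event stream.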

Analog evidence (TP measurements): loop current or sense voltage at Rsense; check the margin to open/short bins, not only absolute values.

Digital evidence (bin behavior): ADC bin ID (Normal/Open/Short/Tamper), bin transition rate, and time spent near boundaries (guard band hits).

Event evidence (what the module reports): OpenFault / ShortFault / Tamper / SupervisionFail, plus counters and time-in-fault for each state.

Figure F3. The channel is engineered around stable classification (bins + margin), not a single resistor value.

Cite this figure Figure F3 — Supervised DI loop, sensing front-end, and example state bins (ICNavigator).

H2-4. Noise Immunity for Long Cables: Filtering, Debounce, and False Alarm Control

Long cables and harsh EMC environments create two failure patterns that look similar in a single sample: fast spikes (EFT/ESD/induction) and slow bias (leakage, protection capacitance). Noise immunity is a layered policy: analog shaping, digital confirmation, and latch/clear rules that turn samples into stable events.

Why false alarms happen (three root classes)

Fast transients: spikes push the reading across a bin boundary for only one or two samples.
Contact bounce / chatter: mechanical jitter creates rapid toggles that mimic open/short edges.
Protection side effects: TVS/ESD leakage and capacitance add a bias path that shifts bins (temperature-dependent).

Three-layer defense (what each layer is for)

RC filtering: suppress fast spikes; tradeoff is slower edge response (must not hide real open/short).
Digital debounce: reject short toggles; expressed as a time window or a required count of consistent samples.
Periodic confirmation: require N consistent samples before latching a fault; use separate clear logic to avoid oscillation.

Latch / clear policy (stability under boundary chatter)

Fault latch: enter Open/Short/Tamper only after N consecutive in-bin samples.
Fault clear: exit only after M consecutive normal samples (often M ≠ N to prevent ping-pong).
Evidence-first: record the first threshold crossing and the final latch time to separate spikes from persistent faults.
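A minimal sketch of the latch/clear policy, assuming one boolean "sample is in the fault bin" input per sampling tick. The class name FaultLatch and the default counts (N = 4, M = 8) are illustrative assumptions.

```python
class FaultLatch:
    """Latch after N consecutive in-bin samples; clear after M consecutive normal samples."""

    def __init__(self, n_latch=4, m_clear=8):
        self.n_latch = n_latch
        self.m_clear = m_clear
        self.latched = False
        self.fault_run = 0
        self.normal_run = 0
        self.first_crossing = None  # time of the first sample of the run that led to a latch
        self.latch_time = None      # time at which the fault was actually latched

    def update(self, in_fault_bin, now):
        if in_fault_bin:
            if self.fault_run == 0 and not self.latched:
                self.first_crossing = now  # evidence: first threshold crossing
            self.fault_run += 1
            self.normal_run = 0
            if not self.latched and self.fault_run >= self.n_latch:
                self.latched = True
                self.latch_time = now      # evidence: final latch time
        else:
            self.normal_run += 1
            self.fault_run = 0
            if self.latched and self.normal_run >= self.m_clear:
                self.latched = False
        return self.latched


latch = FaultLatch(n_latch=4, m_clear=8)
samples = [True, True, False] + [True] * 4  # a 2-sample spike, then a persistent fault
states = [latch.update(s, t) for t, s in enumerate(samples)]
print(states, latch.first_crossing, latch.latch_time)
```

The two-sample spike never latches, while the persistent fault latches on its fourth consecutive sample. The gap between first_crossing and latch_time is the evidence that separates the two cases.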

How to verify the tuning (evidence metrics)

fault count: how often a fault is entered.
time-in-fault: total duration of the fault state.
bin transition rate: how often samples hop between bins (spike signature).

Reading fault count: a high count with low time-in-fault usually indicates spikes or bounce, not a real wiring fault.

Reading time-in-fault: real open/short faults show sustained time-in-fault and a low bin transition rate once settled.

Reading bin transition rate: fast spike environments show rapid bin hopping; increasing confirmation N or adding RC can reduce false latches.
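The three metrics can be derived from a recorded sample stream. The sketch below assumes two equal-length lists captured at a fixed sample period: per-sample bin labels and per-sample latched-fault flags. The function name and the input layout are hypothetical.

```python
def evidence_metrics(bins, faults, sample_period_s):
    """Compute fault count, time-in-fault, and bin transition rate from a capture."""
    previous = [False] + faults[:-1]
    # A fault "entry" is a rising edge of the latched-fault flag.
    fault_count = sum(1 for prev, cur in zip(previous, faults) if cur and not prev)
    time_in_fault_s = sum(faults) * sample_period_s
    transitions = sum(1 for a, b in zip(bins, bins[1:]) if a != b)
    duration_s = len(bins) * sample_period_s
    rate_hz = transitions / duration_s if duration_s else 0.0
    return {
        "fault_count": fault_count,
        "time_in_fault_s": time_in_fault_s,
        "bin_transition_rate_hz": rate_hz,
    }


# Spiky capture: two one-sample excursions into the Open bin at 10 ms sampling.
bins = ["N", "O", "N", "O", "N", "N", "N", "N"]
faults = [False, True, False, True, False, False, False, False]
metrics = evidence_metrics(bins, faults, sample_period_s=0.01)
print(metrics)
```

In this capture the fault count is 2 but the time-in-fault is only about 20 ms, which is the spike signature described above rather than a sustained wiring fault.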

Figure F4. A stable fault is a policy outcome: confirmation and latch/clear rules prevent one-sample spikes from becoming events.

Cite this figure Figure F4 — Sampling/confirmation timeline for separating spikes from real open/short faults (ICNavigator).

H2-5. Isolated DO Drivers: Relay, High/Low-Side Outputs, Protection & Proof

A DO channel is complete only when it closes three loops: Drive (the load moves), Protect (the channel survives shorts and inductive kick), and Prove (the module can report Commanded vs Observed consistency). Proof prevents silent failures such as welded contacts, missing loads, and thermal foldback that never reaches “on”.

Output types (why protection and proof differ)

Relay output: coil inrush + flyback control; contact risk includes stick/weld and bounce.
Low-side MOSFET: simple switching; short-circuit and ground-bounce dominate stress paths.
High-side MOSFET: “power-to-load” behavior; inrush, SOA, and reverse/inductive energy handling are critical.

Relay drive (the “mechanical” failure modes still need evidence)

Coil energize: the first milliseconds set peak current and heat; size the driver for inrush.
Flyback/clamp: choose a clamp that protects silicon without making release too slow.
Contact anomalies: welded/stuck contacts can make “OFF” commands ineffective—proof must catch this.

MOSFET drive (shorts and heat are the true design center)

Short-circuit limiting: fast current limiting or foldback prevents destructive SOA events.
Thermal protection: OTP triggers should be counted and reported; repeated OTP indicates undersized margins.
Inductive loads: define the freewheel/clamp path so energy returns safely without corrupting sensing.

Proof (observability) — three evidence levels

Voltage readback: confirms the output node moved; fast and cheap but not a full “load proof”.
Current readback: proves the load actually drew current; detects missing load/open circuit.
Contact/loop proof: verifies relay state or load presence; catches welded contacts and bypass wiring.

Waveform evidence: output current profile (inrush → steady → foldback), plus output node voltage during ON/OFF transitions.

Protection evidence: overcurrent event code + timestamp, and thermal shutdown count (OTP_count) per channel.

Consistency evidence: Commanded vs Observed, tracked as cmd_state vs obs_state mismatch counters and last_mismatch_reason.

Practical rule: proof must detect both “commanded ON but no load current” and “commanded OFF but voltage/current still present” to expose missing loads, wiring bypass, and welded relay contacts without relying on external instruments.
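The two mismatch directions in this rule can be sketched as a single check. The thresholds and reason strings below are assumptions chosen for illustration; real limits depend on the load and the sense path.

```python
I_ON_MIN_A = 0.020   # assumed minimum load current that counts as "load engaged"
I_OFF_MAX_A = 0.002  # assumed maximum leakage current tolerated when OFF
V_OFF_MAX_V = 1.0    # assumed maximum output-node voltage tolerated when OFF


def check_proof(cmd_on, load_current_a, output_voltage_v):
    """Return a mismatch reason string, or None when command and observation agree."""
    if cmd_on and load_current_a < I_ON_MIN_A:
        return "on_but_no_load_current"   # missing load, open wiring, tripped protection
    if not cmd_on and (load_current_a > I_OFF_MAX_A or output_voltage_v > V_OFF_MAX_V):
        return "off_but_still_energized"  # welded contact, backfeed, failed switch
    return None


mismatch_count = 0
last_mismatch_reason = None
for cmd, amps, volts in [(True, 0.150, 23.8), (True, 0.000, 23.9), (False, 0.120, 23.5)]:
    reason = check_proof(cmd, amps, volts)
    if reason is not None:
        mismatch_count += 1
        last_mismatch_reason = reason
print(mismatch_count, last_mismatch_reason)
```

Note that the second sample shows full output voltage with no current: voltage readback alone would have passed it, which is why current readback is treated as the stronger evidence level.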

Figure F5. A DO channel is three coupled paths: command drives the switch, power feeds the load, and sense/proof closes the loop.

Cite this figure Figure F5 — DO channel: command path, power path, and proof/sense path with commanded-vs-observed checks (ICNavigator).

H2-6. Galvanic Isolation Architecture: Barrier, Isolated Power, Ground-Fault Considerations

Galvanic isolation is the foundation that keeps long field wiring from collapsing logic-domain decisions. A correct isolation design is a partition (what stays on the field side), a power strategy (isolated DC/DC + post regulation), and a leakage-aware view of real-world return paths that appear under moisture, shielding, and protection-device bias.

Isolation partition (what lives on each side)

Field side: terminals, protection, DI/DO AFE, sense references, field ground.
Logic side: MCU, bus interface, configuration, event logs.
Across barrier: digital signals (isolator/transceiver) and isolated power.

Digital isolators / isolated transceivers (selection focus)

CMTI behavior: prevents dv/dt-induced bit flips that look like random events.
ESD robustness: reduce susceptibility to port strikes that couple into logic I/O.
Failure bias: prefer predictable fail-safe behavior that does not turn outputs on unexpectedly.

Isolated power (noise becomes “decision drift”)

Isolated DC/DC: provides the barrier supply but introduces ripple and switching noise.
Post regulation: LDO/filters create a quiet rail for AFE/ADC references and reduce bin jitter.
Evidence linkage: ripple spikes correlate with increased bin transition rate and false supervision events.

Ground-fault & leakage paths (what defeats “perfect isolation”)

Cable moisture leakage: slow bias shift compresses DI bins toward short-like regions.
Shield bonding: shield-to-ground choices create return paths that vary by installation.
Protection leakage: TVS/ESD leakage increases with temperature and can couple across domains.
Parasitic capacitance: isolation components have capacitive coupling that carries common-mode noise.

Noise evidence: isolated rail ripple + AFE reference noise correlated with DI bin transition rate and fault count.

Leakage evidence: leakage trend under humidity/temperature that shifts baselines and increases supervision drift.

Engineering test focus: measure return-path suspects (shield bond, TVS leakage, parasitic coupling) before changing thresholds.

Figure F6. Isolation is a partition plus a leakage map. Dashed paths show how moisture, shielding, TVS leakage, and parasitic coupling can bias measurements.

Cite this figure Figure F6 — Galvanic isolation barrier with real-world leakage/return paths (ICNavigator).

H2-7. Protected Power Tree: 24V Front-End, Reverse Polarity, OVP/UVLO, eFuse, Hold-Up

Field 24V is rarely clean: reverse wiring, surge bursts, brownouts, and hot-plug inrush can all corrupt I/O decisions and erase forensic data. A protected power tree must close three loops: Survive (don’t burn), Stay deterministic (no erratic resets), and Preserve evidence (hold-up to finish critical writes).

24V front-end (the “dirty edge” that defines reliability)

Reverse polarity: block reverse current without creating a large dropout that triggers early UVLO.
Surge absorption: clamp fast spikes so downstream eFuse and DC/DC stay within safe operating windows.
Input current limiting: control hot-plug inrush into bulk capacitance and downstream converters.

eFuse / hot-swap (turn catastrophic faults into bounded events)

Soft-start: defines inrush profile, prevents input collapse, and avoids unintended brownout resets.
OCP / short response: fast limit + controlled retry policy prevents thermal runaway and repeated chatter.
Recovery strategy: distinguish “persistent short” vs “transient overload” using fault timers and counters.
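One way to separate a persistent short from a transient overload is to count trips inside a sliding time window. The sketch below assumes the firmware is notified of each eFuse trip with a timestamp; the class name, the limits (3 trips in 10 s), and the returned labels are hypothetical.

```python
from collections import deque


class RetryPolicy:
    """Classify trips and stop retrying once they recur too often."""

    def __init__(self, max_trips=3, window_s=10.0):
        self.max_trips = max_trips
        self.window_s = window_s
        self.trips = deque()      # timestamps of recent trips
        self.locked_out = False   # True once retries should stop
        self.trip_count = 0       # lifetime counter kept as evidence

    def on_trip(self, now_s):
        self.trip_count += 1
        self.trips.append(now_s)
        # Forget trips that fell out of the observation window.
        while self.trips and now_s - self.trips[0] > self.window_s:
            self.trips.popleft()
        if len(self.trips) >= self.max_trips:
            self.locked_out = True
            return "persistent_short"
        return "transient_overload"


policy = RetryPolicy(max_trips=3, window_s=10.0)
results = [policy.on_trip(t) for t in (0.0, 20.0, 21.0, 22.0)]
print(results, policy.locked_out, policy.trip_count)
```

The isolated trip at t = 0 ages out of the window, so only the three closely spaced trips escalate to a lockout.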

OVP / UVLO / brownout behavior (make resets explainable)

UVLO: ensures rails fall in a predictable order; prevents partial operation that corrupts supervision bins.
OVP: clamps or disconnects before converters saturate and inject noise into sensing references.
Reset accounting: brownout count and last_reset_reason provide a forensic anchor for field debugging.

Hold-up (event persistence under real outages)

Goal: guarantee a minimum “write-complete window” for event codes, timestamps, and last-good states.
Trigger: detect input collapse early (pre-UVLO) and switch to a safe write-and-freeze sequence.
Completion flag: store a “write_done” marker so the next boot can report whether the last event was finalized.

Inrush waveform evidence: measure input current during hot-plug and soft-start. Verify peak and settling time match the intended profile.

Reset and brownout evidence: track UVLO/brownout counters and last_reset_reason to separate “power instability” from “logic faults”.

Hold-up completion evidence: hold-up time must exceed log commit latency. Confirm “write_done” is set before rails drop below safe levels.
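A first-order estimate of the hold-up window follows from the energy stored in the bulk capacitor, E = ½·C·(V_start² − V_min²), divided by the load power. The sketch below assumes an ideal capacitor feeding a constant-power load through a converter of fixed efficiency; every numeric value is an illustrative assumption, and real designs should add margin for capacitor tolerance, ESR, and aging.

```python
def holdup_time_s(c_farads, v_start, v_min, p_load_w, efficiency=1.0):
    """Estimated time a capacitor can feed a constant-power load from v_start down to v_min."""
    usable_energy_j = 0.5 * c_farads * (v_start ** 2 - v_min ** 2)
    return usable_energy_j * efficiency / p_load_w


# Assumed example: 470 uF bulk cap, early-warning trip at 20 V, converter dropout at 9 V,
# 0.5 W load during the write-and-freeze sequence, 85 % converter efficiency.
t_holdup = holdup_time_s(470e-6, v_start=20.0, v_min=9.0, p_load_w=0.5, efficiency=0.85)
LOG_COMMIT_LATENCY_S = 0.010  # assumed worst-case time to finish the log write

print(f"hold-up: {t_holdup * 1e3:.1f} ms, commit latency: {LOG_COMMIT_LATENCY_S * 1e3:.1f} ms")
print("write_done reachable:", t_holdup > LOG_COMMIT_LATENCY_S)
```

With these assumptions the estimate is roughly 127 ms, comfortably above the assumed 10 ms commit latency. The useful check on hardware is the measured version of the same comparison.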

Engineering rule: when field issues appear as “random alarms,” validate power determinism first. A clean event timeline requires stable rails, bounded inrush, and a reliable hold-up commit path.

Figure F7. A protected 24V tree should bound inrush and faults, maintain deterministic UVLO/OVP behavior, and provide hold-up time to finalize event logging.

Cite this figure Figure F7 — Protected power tree with TP1/TP2/TP3 and hold-up intent (ICNavigator).

H2-8. Buzzer & Indicators: Alarm/Fault Annunciation Without Ambiguity

Indicators are not decoration; they are an operations instrument. The module must present a small set of unambiguous states with consistent mapping: State → LED code → Buzzer pattern → Event code. Ambiguity causes mis-triage, unnecessary dispatches, and missed wiring faults.

Buzzer drive & power budget (always-on patterns must be engineered)

Duty-cycled patterns: reduce average power while preserving urgency; avoid continuous drain during long faults.
Night/quiet modes: use “silence” that affects buzzer only, while LEDs remain truthful and persistent.
Driver robustness: protect the buzzer path so a shorted transducer does not collapse logic rails or mask faults.

LED encoding strategy (few channels, high information)

Core set: Power / Alarm / Fault / Comm / Zone (or channel group) is usually sufficient for field triage.
Pattern priority: Fault overrides Alarm only when it invalidates alarm meaning (e.g., sensing lost).
Consistency rule: the same event code always yields the same LED/buzzer code; no hidden “context modes”.

Silence & reset (module-level only)

Silence: acknowledges audible alert while preserving visible truth; never clears the underlying fault.
Reset: re-initializes the annunciation state machine and rechecks evidence inputs (DI bins / DO proof / power).
Auditability: log both actions as events (silence_request, reset_request) with timestamps.

Evidence mapping (make field behavior explainable)

One-to-one map: each state and sub-state maps to a single event code family.
No “mystery blinking”: a short lookup table should decode the pattern without manuals full of exceptions.
Proof-first behavior: if proof contradicts command, show “Fault/Trouble” rather than “Alarm”.

Indicator ↔ event integrity: validate that each LED/buzzer code corresponds to a single event code family; record last_state and last_event_id.

Silence semantics: silence should only affect buzzer output; LEDs and logs remain accurate and persistent until the root cause clears.

Power-aware signaling: confirm buzzer/LED patterns do not induce brownouts under weak 24V; verify brownout_count stays stable during alerting.

State	LED code (concept)	Buzzer pattern (concept)	Event codes / notes
Normal	PWR steady; COMM ok	silent	EVT_OK / heartbeat
Alarm	ALM blink pattern; Zone hint if used	burst / periodic	EVT_ALARM_* (latched until cleared)
Fault	FLT steady or fast blink	distinct cadence	EVT_FAULT_* (eFuse, OTP, proof mismatch)
Trouble	COMM/Zone slow blink; attention needed	short chirp	EVT_TROUBLE_* (supervision drift, intermittent)
Test	deterministic pattern sequence	limited duration	EVT_TEST_* (must be logged)
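The table can be held as a single lookup so that state, LED code, buzzer pattern, and event family cannot drift apart. This is a sketch only: the dictionary layout and the annunciate function are assumptions, and the strings are copied from the conceptual table above rather than from a real encoding.

```python
ANNUNCIATION = {
    "Normal":  {"led": "PWR steady; COMM ok",       "buzzer": "silent",           "events": "EVT_OK"},
    "Alarm":   {"led": "ALM blink pattern",         "buzzer": "burst / periodic", "events": "EVT_ALARM_*"},
    "Fault":   {"led": "FLT steady or fast blink",  "buzzer": "distinct cadence", "events": "EVT_FAULT_*"},
    "Trouble": {"led": "COMM/Zone slow blink",      "buzzer": "short chirp",      "events": "EVT_TROUBLE_*"},
    "Test":    {"led": "deterministic pattern sequence", "buzzer": "limited duration", "events": "EVT_TEST_*"},
}


def annunciate(state, silenced=False):
    """Return (led_code, buzzer_pattern, event_family) for a state.

    Silence changes the buzzer only; the LED code and event family stay truthful.
    """
    entry = ANNUNCIATION[state]
    buzzer = "silent" if silenced else entry["buzzer"]
    return entry["led"], buzzer, entry["events"]


print(annunciate("Fault"))
print(annunciate("Fault", silenced=True))
```

Because every output is derived from the same entry, the same state always yields the same LED and event family, and there is no code path where silence can alter what the LEDs or logs report.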

Figure F8. A compact annunciation state machine plus deterministic encoding blocks reduces ambiguity: state, indicator pattern, and event logs always agree.

Cite this figure Figure F8 — Annunciation state machine with deterministic LED/buzzer encoding blocks (ICNavigator).

H2-9. Self-Test, Diagnostics & Event Logging: Making Failures Forensic

A Fire/Alarm I/O module becomes field-serviceable only when failures leave a trace. The diagnostic loop must be closed end-to-end: Measurement → Classification → Event code → Log commit → Field triage. The key design rule is Commanded vs Observed: outputs are proven, inputs are supervised, and power anomalies are recorded as evidence.

DI self-test (supervision is verified, not assumed)

Weak stimulus checks: periodically apply a controlled, low-impact test stimulus and verify the sensed response stays within expected windows.
Window health: track “margin to threshold” so the module can detect supervision chains that are drifting toward false alarms.
Distinguish failure modes: separate true open/short from supervision degradation (moisture leakage, protection capacitance, cable aging).

Practical self-test goal: detect that the DI classifier still has separation between bins (Normal/Open/Short/Tamper) without disturbing normal operation.
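Window health can be tracked as the smallest distance between a Normal-bin reading and either edge of its window. The sketch below is a minimal illustration; the window limits, the warning margin, and the event name EVT_TROUBLE_MARGIN are hypothetical.

```python
class MarginTracker:
    """Tracks the worst-case margin of Normal-bin readings to the window edges."""

    def __init__(self, lo, hi, warn_margin):
        self.lo = lo
        self.hi = hi
        self.warn_margin = warn_margin
        self.di_window_margin_min = None  # smallest margin seen so far (ADC counts)

    def update(self, adc_code):
        margin = min(adc_code - self.lo, self.hi - adc_code)
        if self.di_window_margin_min is None or margin < self.di_window_margin_min:
            self.di_window_margin_min = margin
        # Warn before the reading actually leaves the window.
        return "EVT_TROUBLE_MARGIN" if margin < self.warn_margin else None


tracker = MarginTracker(lo=1800, hi=2300, warn_margin=100)
first = tracker.update(2050)   # centered reading
second = tracker.update(2230)  # drifting toward the upper edge
print(first, second, tracker.di_window_margin_min)
```

The second reading is still classified as Normal, yet it raises a warning because only 70 counts of margin remain. That is the drift-toward-false-alarm case the self-test is meant to expose.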

DO diagnostics (prove the output actually happened)

Proof paths: use current/voltage feedback or contact readback to confirm physical actuation.
Mismatches are first-class events: if commanded and observed disagree, log a proof mismatch rather than silently retrying.
Bounded fault behavior: overcurrent/thermal trips are counted and time-stamped so recurring overloads are detectable.

Evidence rule: a DO channel is “healthy” only when it can explain both success and failure using measurable feedback, not just firmware intent.

Event logging (ring buffer + power-loss safety)

Ring buffer: keep the last N events with monotonic IDs so ordering remains reliable even if absolute time drifts.
Commit marker: write event record first, then a commit flag; on next boot, report incomplete commits as forensic clues.
Brownout marker: record input collapse and reset reasons to separate “electrical environment” from “logic defects”.
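The record-then-commit sequence can be sketched against a byte array standing in for non-volatile memory. The slot layout (event ID, event code, channel ID, CRC32, commit byte), the marker values, and the simulated power-loss flag are all assumptions made for illustration.

```python
import struct
import zlib

PAYLOAD = struct.Struct("<IHB")    # event_id, event_code, channel_id (hypothetical layout)
SLOT_SIZE = PAYLOAD.size + 4 + 1   # payload + CRC32 + commit byte
ERASED, COMMITTED = 0xFF, 0xA5


class EventLog:
    def __init__(self, n_slots=8):
        self.n_slots = n_slots
        self.mem = bytearray([ERASED]) * (n_slots * SLOT_SIZE)  # stands in for FRAM/flash
        self.next_id = 0  # monotonic event ID

    def append(self, event_code, channel_id, power_fails_before_commit=False):
        base = (self.next_id % self.n_slots) * SLOT_SIZE
        payload = PAYLOAD.pack(self.next_id, event_code, channel_id)
        record = payload + struct.pack("<I", zlib.crc32(payload))
        self.mem[base:base + SLOT_SIZE] = record + bytes([ERASED])  # step 1: record only
        self.next_id += 1
        if power_fails_before_commit:
            return  # simulated power loss between the two writes
        self.mem[base + SLOT_SIZE - 1] = COMMITTED                  # step 2: commit flag

    def scan(self):
        """Boot-time scan: (committed events sorted by ID, IDs of incomplete commits)."""
        committed, incomplete = [], []
        for i in range(self.n_slots):
            slot = self.mem[i * SLOT_SIZE:(i + 1) * SLOT_SIZE]
            if slot == bytes([ERASED]) * SLOT_SIZE:
                continue  # never written
            payload, crc_bytes, flag = slot[:PAYLOAD.size], slot[PAYLOAD.size:-1], slot[-1]
            if zlib.crc32(payload) != struct.unpack("<I", crc_bytes)[0]:
                continue  # torn or corrupted record
            event = PAYLOAD.unpack(payload)
            if flag == COMMITTED:
                committed.append(event)
            else:
                incomplete.append(event[0])
        return sorted(committed), sorted(incomplete)


log = EventLog(n_slots=4)
log.append(event_code=0x10, channel_id=1)
log.append(event_code=0x11, channel_id=2, power_fails_before_commit=True)
committed, incomplete = log.scan()
print(committed, incomplete)
```

Sorting by the monotonic ID keeps ordering reliable after the ring wraps, and the incomplete list gives the next boot a concrete "last event was not finalized" clue.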

What to store (minimum viable forensic fields)

last_reset_reason, brownout_count, brownout_marker, fault_counters, last_N_events, commit_ok / commit_incomplete, di_window_margin_min, do_proof_mismatch_count

These fields enable fast triage: whether the system is power-limited, supervision-limited, or output-proof-limited.

Fault counters (trend evidence): counts reveal recurrence and escalation. Use per-channel counters to localize wiring vs module-wide issues.

Last N events (timeline evidence): event order distinguishes cause vs effect (e.g., brownout → classifier drift → false fault vs true open first).

Brownout marker + reset reason (environment evidence): record power collapse and reset causes to avoid misattributing power instability as I/O failure.

Observation	What it usually means	What to log / measure next
cmd=ON, obs=OFF	Driver cut off, power-limited, protection tripped, wiring open, load absent.	Proof mismatch event; trip counters; rail sag marker; channel current snapshot.
cmd=OFF, obs=ON	Relay contact stuck, leakage path, wiring backfeed, failed switch element.	Sticky contact suspect; observed current at OFF; post-event verification sample.
DI bin drifts	Protection leakage/capacitance, moisture, cable aging, reference shift after stress.	di_window_margin_min trend; post-stress false alarm rate; comparison vs clean baseline.
random resets	Brownouts, surge injection, isolation-side upset, hold-up insufficient.	last_reset_reason + brownout_marker; incomplete commit flag; retry policy counters.

Figure F9. A forensic module closes the loop from measurements to committed logs and makes Commanded vs Observed mismatches explicit rather than implicit.

Cite this figure Figure F9 — Forensic evidence chain with commanded vs observed closure (ICNavigator).

H2-10. EMC/ESD/Surge & Wiring Interface: Survive Outdoor/Long Runs

Wiring interfaces are where outdoor reality enters the module: long runs, fast ESD edges, EFT bursts, and surge energy. Protection must be engineered as a system: Protection elements + Return path + Layout separation. A key point is that protection devices can change supervision thresholds via leakage and capacitance, increasing false alarm risk if not accounted for.

Port protection (layered roles, not a random parts list)

TVS: clamps fast edges; may add capacitance/leakage that shifts DI sense windows.
GDT: handles high energy; requires deliberate return routing so discharge does not cross sensitive references.
RC: shapes bandwidth and impulse response; changes time constants that interact with debounce and fault confirmation.
CMC: reduces common-mode injection; effectiveness depends heavily on placement and return path integrity.

Return path (where surge current flows decides whether the module stays truthful)

Goal: keep surge current on a short, high-energy loop that returns to the correct ground reference.
Avoid: letting surge return cross the DI sense resistor/reference node, which re-biases the measurement during the event.
Isolation upsets: common-mode injection can trigger isolation-side resets or lockups; record these as event evidence.

The most damaging failure mode is “survived but untrustworthy”: the module stays powered, yet DI bins drift and false faults appear.

Protection side-effects (leakage/capacitance → threshold drift → false alarms)

Leakage: tiny currents can matter because supervised DI is often high-impedance; bins lose separation and margins shrink.
Capacitance: edges look like state transitions; without coordinated sampling windows, transient spikes become faults.
Post-stress drift: after EFT/surge, track DI window margins and false alarm rate to detect “soft damage”.

Layout rules (keep high-energy paths away from sensitive measurement)

Shortest clamp loop: place clamp elements close to the connector and route the discharge path with minimal loop area.
Sensitive keepout: keep DI sense resistor and reference routing away from surge return traces and shield drains.
Controlled coupling: separate noisy discharge nodes from ADC/reference nodes; avoid running them in parallel.

EFT/surge aftereffects (drift + false alarms): track supervision margin and false fault rate before/after stress; drift indicates leakage/capacitance or reference shifts.

ESD isolation-side anomalies (reset/lockup markers): log reset reasons and lockup indicators after ESD to separate transient upset from true wiring faults.

“Survived but biased” detection: use di_window_margin_min and event histograms to catch supervision bins moving toward overlap.

Figure F10. Protection is only correct when the surge return path is controlled. A return path that crosses DI sensing references can bias thresholds and increase false faults.

Cite this figure Figure F10 — Port protection and return-path control to prevent DI threshold drift (ICNavigator).

H2-11. Validation & Fault-Injection Test Plan (Engineering, Not Certification)

This plan is a repeatable engineering validation for a Fire/Alarm I/O module. It focuses on field-relevant faults (wiring degradation, load abuse, supply instability, surge aftermath) and verifies that the module produces correct event codes, stable decisions, and forensic logs even under power loss.

Test boundary (module-level, reproducible)

Validate DI supervision, DO drive + proof, protected power behavior, and logging integrity.
Use controlled injection and fixed measurement points (TP1/TP2/TP3) to keep results comparable across builds.
Do not convert this into a certification walkthrough; treat stress tests as engineering evidence for design closure.

Recommended fixtures: controlled resistor box for DI leak/half-short, switchable load bank for DO, and a programmable DC source for brownout/ramp profiles.

What must be captured (minimum forensic set)

event_code, channel_id, cmd_state, obs_state, fault_counters, brownout_marker, reset_reason, commit_ok / commit_incomplete, di_window_margin_min, do_trip_count

A test passes only when the observed electrical behavior matches the expected code path, and the log commits survive worst-case interruption.

MPN examples (components commonly used to implement/validate this plan)

These are example parts (drop-in alternatives exist). They help define realistic test expectations and proof paths.

Digital isolators (logic isolation): ADI ADuM1401 / ADuM1250; TI ISO7741; Silicon Labs Si8641.
Isolated RS-485 (host/bus): ADI ADM2587E (iso + DC/DC class); TI ISO1410 (isolated transceiver family).
Isolated DC/DC (barrier power): Murata NXE1/NME series; RECOM RxxPxx family (typical isolated converters).
Hot-swap / eFuse (24V front-end protection): TI TPS2660; ST STEF01; ADI/LTC LTC4368 (OV/UV and reverse-supply protection controller class).
High-side / low-side switches (DO transistor outputs): Infineon PROFET family BTSxxx; TI TPS1Hxxx high-side switch families; low-side drivers such as TI TPIC6B595 (power shift register class).
Relay drivers: TI DRV110 (solenoid/coil driver class); ULN darlington arrays ULN2803A (basic coil drive), with proper flyback strategy.
Current sense (proof path): TI INA180/INA181; ADI AD8418; shunt + ADC method.
TVS (line protection examples): SMBJ class SMBJ33A / SMBJ24A (select per rail); low-cap TVS families for fast lines when needed.
RTC / time base (for logs): Maxim/ADI DS3231 (high-stability RTC class) when local timestamping is required.
Non-volatile log storage: FRAM class MB85RC256V (Fujitsu/Cypress class); SPI NOR flash class W25Q64.

The test plan does not assume any single vendor. It assumes the design has: (1) measurable proof paths, (2) bounded protection behavior, and (3) a committed event log.

Fault injection matrix (core deliverable)

Each row defines a single controllable fault, the expected event code, the measurement point, and an objective pass criterion. Keep injection location explicit so results remain explainable.

Fault (Injection)	Expected event code	Measurement point	Pass criteria	Notes / MPN relevance
DI Open (break loop)	OpenFault	TP1: DI sense node; TP2: ADC input	State enters Open bin within confirmation window; no oscillation between Normal/Open; event logged + counter increments.	Classifier window + debounce must separate Open from noise; leakage from TVS should not fake “Normal”.
DI Short (hard short)	ShortFault	TP1: DI input; TP3: 24V rail sag (if any)	Short bin reached; protection does not latch system reset; event logged; no false recovery until fault removed.	If front-end includes eFuse/hot-swap (TPS2660/STEF01), verify it does not create ambiguous DI readings.
DI Half-short (mid resistance)	SupervisionFail or Tamper	TP2: ADC code; LOG: di_window_margin	ADC falls into defined abnormal window; margin shrinks as expected; event code deterministic across repetitions.	This test exposes bin overlap; protection leakage/capacitance (SMBJ TVS) often shifts bins, so it must be measured.
DI Parallel bypass (shunt around EOL)	Tamper	TP2: ADC code bins	Tamper bin triggers without being misclassified as Short; event persists per spec until cleared.	If using DEOL logic, prove distinguishability. Document expected bins for each loop configuration.
DI Moisture leakage (controlled leak to ground)	SupervisionFail (trend) or Trouble	TP1: bias shift; LOG: margin trend	No instant false alarms; margin trend indicates degradation; warning code issued when margin crosses threshold.	Validates “survived but biased” detection. Ensure log stores margin minimum and histogram.
DO Short (hard short at load)	OverCurrentTrip	TP3: rail current; TP2: proof sense	OCP triggers within bounded time; channel recovers per retry policy; counter increments; no module-wide brownout unless specified.	High-side switch families (TPS1Hxxx / BTSxxx) should produce consistent trip markers.
DO Overload (above rating)	Overload or ThermalLimit	TP2: current sense; LOG: thermal count	Thermal/OCP behavior matches datasheet intent; event codes differentiate “trip” from “proof mismatch”.	If proof uses INA180/INA181, verify ADC capture during trip is still valid and logged.
DO No-load (open load)	LoadMissing or ProofMismatch	TP2: proof current ~0	Command ON does not silently pass; proof path detects absence; mismatch logged without oscillation.	Separates “healthy command” from “physical actuation”. Proof must be implemented, not assumed.
Relay stuck (simulate: command OFF but path remains ON)	StuckContact or ProofMismatch	TP2: observed current at OFF	OFF command followed by observed ON triggers stuck/mismatch event; latch policy as specified; logged with channel ID.	If relay drive is ULN2803A class, verify flyback does not mask proof sensing after command transitions.
Brownout (dip below UVLO, then recover)	Brownout + ResetReason	TP3: 24V input; LOG: brownout marker	Marker recorded; reset reason matches; on reboot, last events remain ordered; no “silent reboot”.	If front-end uses TPS2660/LTC4368 class, verify restart behavior and logging window with hold-up.
Reverse polarity (input reversed, controlled)	PowerFault (optional) + safe state	TP3: input clamp; TP2: rails off	No damage; module remains safe/off; after correct polarity restored, normal boot and logging functional.	Validates reverse protection stage; document any fusing requirements for safe lab execution.
Surge aftermath (apply surge, then functional check)	PostStressDrift or Trouble (if margin shrinks)	TP1: DI margin shift; LOG: false fault rate	Module remains operational; DI bins remain separated; margin does not permanently collapse; any drift is detected and logged.	This directly validates TVS/GDT/return path design. Correlate with H2-10 return-path figure evidence.
Power cut during log write (interrupt commit)	CommitIncomplete marker on next boot	LOG: commit flag; LOG: last N events	On reboot, incomplete commit is detectable; log ordering preserved; counters not corrupted; next commits succeed.	If using FRAM (MB85RC256V class), validate atomicity; if using SPI NOR (W25Q64), validate commit strategy.

Suggested channel naming: DI1..DIn, DO1..DOm, and fixed measurement points TP1 (field-side sense), TP2 (ADC/proof input), TP3 (24V front-end). Keep the same naming in firmware logs so lab results and field reports match.
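
The power-cut row of the matrix depends on an incomplete commit being detectable at boot. One common way to get that property is to write the payload first and a commit flag last, then validate flag and CRC when scanning. The Python below is a host-side model of that idea, not firmware and not the module's actual log format: the record layout, the 0xA5 flag, the 0xFF erased value, and the names encode_record and scan_log are all assumptions for illustration.

```python
import struct
import zlib

COMMIT_FLAG = 0xA5                 # hypothetical "record complete" marker
ERASED = 0xFF                      # assumed value of unwritten bytes
HEADER = struct.Struct("<IHB")     # sequence number, event code, channel
RECORD_SIZE = HEADER.size + 4 + 1  # payload + CRC32 + commit flag


def encode_record(seq, event_code, channel):
    """Lay out one record: payload, CRC32 of payload, commit flag last."""
    payload = HEADER.pack(seq, event_code, channel)
    return payload + struct.pack("<I", zlib.crc32(payload)) + bytes([COMMIT_FLAG])


def scan_log(image):
    """Walk fixed-size records; return (committed_records, commit_incomplete)."""
    records = []
    for off in range(0, len(image) - RECORD_SIZE + 1, RECORD_SIZE):
        rec = image[off:off + RECORD_SIZE]
        if all(b == ERASED for b in rec):
            break                      # unwritten space: clean end of log
        payload = rec[:HEADER.size]
        (crc,) = struct.unpack("<I", rec[HEADER.size:HEADER.size + 4])
        if rec[-1] != COMMIT_FLAG or crc != zlib.crc32(payload):
            return records, True       # torn write: report it, keep earlier records
        records.append(HEADER.unpack(payload))
    return records, False


# Simulate a power cut after 5 bytes of the third record were written.
full = encode_record(1, 0x10, 3) + encode_record(2, 0x11, 3)
torn = full + encode_record(3, 0x12, 1)[:5] + bytes([ERASED]) * (RECORD_SIZE - 5)
records, incomplete = scan_log(torn)
```

With the torn image, the two committed records are returned in order and the incomplete flag is set, which corresponds to the "CommitIncomplete marker on next boot" expectation. Whether the real storage device guarantees that the flag byte lands last still has to be validated on hardware.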

Figure F11. A repeatable module-level validation flow: controlled injection, objective observation at TP points, deterministic event codes, and log integrity even when power is interrupted during commit.


H2-12. FAQs ×12 (each maps back to chapters)

What is the optimal EOL value? Why do long cables cause false alarms?

Answer: There is no single optimal EOL value; choose it from the loop resistance, the bias network, and environmental factors so that the Normal window keeps margin on both sides. As cable length increases, series resistance and voltage drop increase, which moves the sensed value toward a window threshold and can result in false alarms. Check: – Measure voltage at TP1 during test; – Check the event log for “SupervisionFail” or “Tamper” after line extension. Next step: Adjust the EOL value or the pull-up/pull-down resistor until the margin is reliable.
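
To make the cable-length effect concrete, here is a small Python sketch that models the loop as a simple divider: a pull-up resistor from the bias rail to the sense node, with cable resistance in series with the EOL resistor to ground. The topology, the resistor values, and the 10.0 V to 13.0 V Normal window are assumptions chosen for illustration; a real front-end may sense current or use a different bias network.

```python
def sense_voltage(v_bias, r_pullup, r_eol, r_cable):
    """Voltage at the sense node for a pull-up feeding (cable + EOL) to ground."""
    r_loop = r_eol + r_cable
    return v_bias * r_loop / (r_pullup + r_loop)


def window_margin(v_sense, v_low, v_high):
    """Distance to the nearest edge of the Normal window (negative means outside)."""
    return min(v_sense - v_low, v_high - v_sense)


# Hypothetical values: 24 V bias, 4.7 kOhm pull-up, 4.7 kOhm EOL.
V_LOW, V_HIGH = 10.0, 13.0
margins = {
    r_cable: window_margin(sense_voltage(24.0, 4700.0, 4700.0, r_cable), V_LOW, V_HIGH)
    for r_cable in (0.0, 200.0, 500.0)   # total loop cable resistance in ohms
}
```

In this model the margin to the upper (Open-side) threshold shrinks monotonically as cable resistance grows, which is the drift the TP1 measurement is meant to expose.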

Why does the module report “Open” even though the line is intact?

Answer: If an intact line is reported as “Open”, likely causes are insufficient filtering (noise crossing the Open threshold) or leakage currents caused by moisture or insulation degradation that shift the sensed value. Check: – Measure ADC code at TP2; – Measure the leakage current at TP1 (field-side sense) and confirm whether it moves the reading toward a threshold. Next step: Increase debounce window or improve insulation protection.
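
The debounce window mentioned above can be pictured as a confirmation counter: a new bin only becomes the reported state after a number of consecutive samples agree. The Python sketch below illustrates that behavior under stated assumptions; the 12-bit bin edges, the five-sample confirmation count, and the class name ConfirmedState are hypothetical, and real bin edges come from the EOL/DEOL design.

```python
from dataclasses import dataclass

# Hypothetical ADC bin edges (12-bit codes), lowest code first.
BINS = [(0, 400, "Short"), (400, 1800, "Tamper"),
        (1800, 2400, "Normal"), (2400, 4096, "Open")]


def classify(code):
    """Map a raw ADC code to its bin name."""
    for low, high, name in BINS:
        if low <= code < high:
            return name
    raise ValueError(f"ADC code out of range: {code}")


@dataclass
class ConfirmedState:
    confirm_samples: int = 5
    state: str = "Normal"
    _candidate: str = "Normal"
    _count: int = 0

    def update(self, code):
        """Feed one sample; the state changes only after N consecutive agreeing samples."""
        raw = classify(code)
        if raw == self.state:
            self._candidate, self._count = raw, 0      # back in current bin: reset
        elif raw == self._candidate:
            self._count += 1
            if self._count >= self.confirm_samples:
                self.state, self._count = raw, 0       # confirmed transition
        else:
            self._candidate, self._count = raw, 1      # new candidate bin
        return self.state


di = ConfirmedState(confirm_samples=5)
spikes = [di.update(c) for c in (2000, 3000, 2000, 3000, 2000)]  # isolated spikes
held = [di.update(c) for c in (3000,) * 5]                       # sustained open
```

Isolated spikes never reach the confirmation count, so the state stays Normal, while five consecutive Open-bin samples flip it once. A longer count trades detection latency for noise immunity.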

Why does it report “Short” even though resistance measurements are normal?

Answer: A “Short” event with normal resistance readings can occur due to protection device leakage or threshold drift caused by aging or capacitive coupling. Check: – Measure the voltage at TP1 during the Short event; – Review the log for ShortFault events and verify the margin to the window edge. Next step: Replace protection device or recalibrate thresholds for the DI channel.

The relay activates but the load doesn’t respond: How can I use “output proof” to determine whether it’s the contacts or wiring?

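Answer: Use the output proof path to compare the commanded state with the observed state. If proof current flows while the relay is commanded ON, the contacts and wiring form a complete circuit, so suspect the load itself; if proof current stays near zero, the contacts are not closing or the wiring is open. Check: – Measure proof current at TP2 when the relay should be active; – Compare the observed vs commanded state for the relay operation. Next step: Test contact continuity, or replace the relay or its driver if necessary.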
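
The commanded-vs-observed comparison can be sketched as a small decision function. This is an illustrative Python model only: the 20 mA ON threshold, the 50 ms blanking time after a command transition, and the function name proof_event are assumptions, while the event names are taken from the fault injection matrix.

```python
def proof_event(commanded_on, observed_current_a, ms_since_command,
                on_threshold_a=0.02, blanking_ms=50):
    """Return an event code when command and proof disagree, else None."""
    if ms_since_command < blanking_ms:
        return None                      # ignore samples during switching transients
    observed_on = observed_current_a >= on_threshold_a
    if commanded_on and not observed_on:
        return "ProofMismatch"           # ON commanded, no current: open contact, wiring, or load
    if not commanded_on and observed_on:
        return "StuckContact"            # OFF commanded, current still flows
    return None


results = [
    proof_event(True, 0.150, 200),   # healthy ON
    proof_event(True, 0.001, 200),   # ON commanded, no proof current
    proof_event(False, 0.150, 200),  # OFF commanded, path still conducting
    proof_event(True, 0.001, 10),    # inside blanking window
]
```

The blanking window is there so that relay bounce or flyback right after a transition is not logged as a mismatch; its length should be set from measured switching behavior.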

Why does the system reset when a load is connected? Should I check inrush or UVLO threshold first?

Answer: If the system resets when a load is attached, check both inrush current and the UVLO threshold, because the rail sag caused by inrush is what crosses UVLO. Check: – Measure the inrush current and rail sag at TP3 during load connection; – Verify the UVLO threshold at the power input and look for a brownout marker in the log. Next step: Adjust soft-start timing or increase UVLO margin.

After surge/ESD testing, why do false alarms increase? Which three points should I retest?

Answer: After surge/ESD events, false alarms may increase due to post-stress drift in the protection and sense path. Check: – Measure DI margin at TP1 after surge; – Check ADC bin separation at TP2; – Review the false fault rate and event codes in the log for inconsistencies. Next step: Re-calibrate DI window margins or re-route surge return paths.

The buzzer doesn’t sound even though there’s an alarm: Is it a driver issue or a state machine strategy issue?

Answer: If an alarm is triggered but the buzzer doesn’t sound, the cause is either the driver or the state machine strategy. If the log shows an ON command but no drive signal is observed, suspect the driver; if no ON command was issued for the alarm, suspect the state machine. Check: – Verify the buzzer driver signal at TP2; – Check the state machine flow for alarm activation. Next step: Replace driver IC or adjust alarm state transition logic.

Why does the module intermittently go offline but the I/O is fine? Which two event codes should I check first?

Answer: If the module intermittently goes offline but I/O remains functional, check the event codes for power instability and communication faults first. Check: – Monitor the event codes for “Brownout” or “CommFault”; – Check the brownout marker and reset reason in the log. Next step: Test power hold-up time and inspect communication paths.

After parallel sensors are added to a long cable, why do false alarms occur? Is it due to DEOL being damaged or a narrow threshold window?

Answer: If parallel sensors on long lines cause false alarms, verify if DEOL is compromised or if the threshold window is too narrow. Check: – Measure the margin to threshold at TP1; – Inspect the event codes and check if they align with the expected window. Next step: Adjust EOL resistance or widen threshold window.

Why are faults reported more often in humid conditions? How can I distinguish between insulation degradation vs port leakage?

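Answer: In humid conditions, both cable insulation degradation and leakage at the module port can shift the sensed value and cause false alarms. To separate them, repeat the measurement with the field cable disconnected: leakage that remains points to the port (protection devices, connector, board contamination), while leakage that disappears points to the cable insulation. Check: – Measure the leakage current at TP1 with and without the cable attached; – Compare insulation resistance with dry conditions. Next step: Improve insulation or enhance port protection.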

Why does the history get lost after power recovery? Is it due to insufficient hold-up time or a write strategy issue?

Answer: If history is lost after a power failure, check if the hold-up time is insufficient or if the write strategy is flawed. Check: – Monitor the hold-up time against commit completion; – Verify if event logs were successfully written during power down. Next step: Extend hold-up time or adjust write-back strategy.
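
A quick way to compare hold-up time against commit completion is an energy estimate for the bulk capacitor, t = C(V_start² − V_min²) / (2P). The Python sketch below applies that formula under simplifying assumptions: an ideal capacitor, a constant-power load, and hypothetical values (470 µF, 24 V start, 9 V minimum operating voltage, 1.5 W load, 20 ms worst-case commit). Real designs should substitute measured values and derate the capacitance.

```python
def holdup_time_s(c_farads, v_start, v_min, load_w):
    """Time for a constant-power load to discharge C from v_start to v_min."""
    if v_min >= v_start or load_w <= 0:
        raise ValueError("need v_start > v_min and a positive load")
    return c_farads * (v_start ** 2 - v_min ** 2) / (2.0 * load_w)


def commit_fits(holdup_s, commit_s, safety_factor=2.0):
    """True when hold-up covers the commit time with the given safety factor."""
    return holdup_s >= commit_s * safety_factor


holdup = holdup_time_s(470e-6, 24.0, 9.0, 1.5)   # about 77.6 ms with these values
ok = commit_fits(holdup, commit_s=0.020)
```

If the estimate does not cover the commit time with margin, the options named in the answer apply: more hold-up energy or a write strategy with shorter, atomic commits.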

What is the fastest way to perform a “fault injection” verification without missing any points?

Answer: To quickly verify fault injection coverage, focus on a minimal set of faults that can test the full system behavior. Check: – Review the fault injection matrix for coverage; – Verify if each fault has an associated event code, measurement point, and pass criteria. Next step: Start with DI Open and DO Short, then expand to environmental and power fault injections.
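
The coverage review can be automated once the matrix is kept as data. The Python sketch below checks that every row has an event code, a measurement point, and pass criteria; the FaultRow type, the missing_fields helper, and the sample rows (including the deliberately incomplete "DI Noise burst" row) are hypothetical and only mirror the column names of the fault injection matrix.

```python
from dataclasses import dataclass

REQUIRED = ("event_code", "measurement_point", "pass_criteria")


@dataclass(frozen=True)
class FaultRow:
    fault: str
    event_code: str
    measurement_point: str
    pass_criteria: str


def missing_fields(rows):
    """Return {fault name: [names of empty required fields]} for incomplete rows."""
    gaps = {}
    for row in rows:
        empty = [name for name in REQUIRED if not getattr(row, name).strip()]
        if empty:
            gaps[row.fault] = empty
    return gaps


matrix = [
    FaultRow("DI Open", "OpenFault", "TP1, TP2", "Open bin within confirmation window"),
    FaultRow("DO Short", "OverCurrentTrip", "TP3, TP2", "OCP triggers within bounded time"),
    FaultRow("DI Noise burst", "", "TP2", ""),   # incomplete on purpose
]
gaps = missing_fields(matrix)
```

An empty result means every listed fault is fully specified; it does not prove that the list of faults itself is complete, which still needs engineering review.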
