
SIS Logic Solver: Voting, Lockstep Safety MCU, Isolated I/O


A SIS Logic Solver is the decision core of a safety loop: it validates input health, applies deterministic 1oo2/2oo3 voting, and drives a defined safe-state sequence. Its credibility comes from isolation-aware signal integrity plus traceable evidence (snapshots, timestamps, cause codes) that makes every trip predictable, reproducible, and auditable.

H2-1. Role of the SIS Logic Solver in the Safety Loop

Definition

A SIS Logic Solver is the decision core of a Safety Instrumented Function (SIF): it validates safety-related inputs, applies deterministic voting/decision rules, and drives a defined safe output action while producing audit-ready evidence (reason codes, timestamps, and state snapshots). It does not measure the physical world and does not provide final power actuation.

What must be unambiguous (engineering acceptance)
  • Position in the SIF chain: Sensor → Logic Solver → Final Element. The Logic Solver is where “uncertain inputs” become a “certain decision.”
  • Boundary of responsibility: input credibility and consistency are handled here; sensor calibration/physics and actuator sizing are not.
  • Auditability requirement: identical validated inputs must yield identical outputs, and each decision must be explainable via a logged cause.
Evidence fields to anchor this page
  • Cause→Effect matrix (inputs → action)
  • Deterministic decision rules
  • Input validity / stale-data policy
  • Safe-state definition (Trip behavior)
  • Event log snapshot (timestamp + state)

Practical intent: each future chapter (voting, lockstep, isolation, diagnostics, logging) must map back to at least one of these evidence fields.

What this chapter deliberately does NOT cover
This page covers (Logic Solver scope) | Out of scope (handled elsewhere)
Voting / comparison rules, decision determinism, safe-state output logic, diagnostic flags, event evidence | SIL math and certification workflow, sensor physics/calibration, actuator power stage sizing, PLC/DCS network architecture
[Figure H2-1 diagram: Safety Instrumented Function (functional chain). Sensor (safety signals, status / trips) → Logic Solver (voting / decision, diagnostics, event log) → Final Element (trip / permit, safe state). Key boundary: the Logic Solver validates signals, decides deterministically, and records evidence; it does not perform physical measurement or power actuation.]
Figure H2-1 — Functional chain view (Sensor → Logic Solver → Final Element). The Logic Solver is highlighted as the decision + evidence hub.
Cite this figure ICNavigator • SIS Logic Solver

Transition to next chapter: because inputs can be wrong, late, stuck, or drifting, the Logic Solver must start from explicit failure assumptions and measurable targets.

H2-2. Failure Assumptions and Design Targets

Why this chapter exists

Voting and redundancy are not “features”—they are responses to explicit assumptions about how channels can fail. Without a written failure assumption set, it is impossible to justify a voting rule, set a detection time budget, or produce audit evidence that a safe state will be reached when needed.

Failure categories (engineering explanation only)
  • Random failures: device wear-out, intermittent connections, environmental stress. Mitigation often relies on redundancy, voting, and online detection.
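  • Systematic failures: design, software, requirements, or configuration errors that can repeat. Mitigation relies on deterministic rules, versioned configuration, and controlled change management; self-checks help only where a defect produces an invalid or divergent state.

Practical implication: voting primarily counters random channel faults, and lockstep counters random execution faults inside the MCU. Systematic faults that both cores execute identically are not exposed by lockstep, so they are constrained by diagnostics on invalid internal states and by change control.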

Design targets that must be measurable
  • Fault assumption list: what can fail, how it manifests, and which channel(s) it affects.
  • Detection latency: the maximum allowed time to detect each fault type before it can become dangerous.
  • Safe state definition: the required output action once a dangerous or unknown state is detected (including latch/reset policy).
Fault type → symptom • Symptom → detection method • Detection → safe action • Action → evidence log
Why inputs must be assumed imperfect
Input problem | Safety risk | Design direction (links to later chapters) | Evidence field to log
Wrong (incorrect state) | False permit or false trip | Voting consistency rules; mismatch tolerance | Mismatch reason code + channel snapshot
Late (timing skew) | Transient disagreement → wrong vote | Validation window; alignment policy | Timestamp delta + decision window ID
Stuck (no change) | Danger masked by stale signal | Stale-data detection; periodic self-test hooks | Stale counter + last-change timestamp
Drift (slow bias) | Long-term vote bias / nuisance trips | Window/trend compare; tolerance bands | Trend metric + tolerance threshold ID
Evidence Pack (what auditors and debuggers need)
  • Assumption registry: fault categories with symptoms and affected channels (versioned).
  • Latency budget: per fault type, the maximum detection time and the rationale.
  • Safe-state contract: output behavior on detection (de-energize/energize-to-trip, latch, reset conditions).
  • Fault codes: unique reason codes enabling reconstruction of “what happened” from logs.

Next chapter setup: once assumptions and targets are explicit, voting (1oo2/2oo3), lockstep, isolation, diagnostics, and logging become implementation choices to satisfy the latency and safe-state contract.

[Figure H2-2 diagram: Failure Assumptions → Targets → Evidence. Assumptions: wrong, late, stuck, drift. Measurable targets: fault list, detection latency, safe-state contract, reason codes. Evidence outputs: event log snapshot, timestamp integrity, stale / drift counters, trip cause trace. Assumptions define targets; targets define what must be logged as evidence.]
Figure H2-2 — Engineering flow: define failure assumptions, convert them into measurable targets (fault list, latency, safe state, reason codes), then produce evidence outputs for audit and debugging.

H2-3. Voting Logic Fundamentals: 1oo1, 1oo2, 2oo3

Input semantics (must be explicit)

Voting only becomes deterministic when each channel’s state has a fixed meaning. This page uses a Trip vote convention: 1 = Trip request (danger detected), 0 = No trip request. Any “invalid/stale” channel must be handled as an explicit state by policy (covered later under diagnostics and logging).

Trip=1 convention • Deterministic decision rule • Explicit invalid handling
Decision conditions (no probability derivations)
Mode | Trip rule (boolean form) | Engineering interpretation
1oo1 | Trip = A | Single channel decides. Simplest, but a single wrong channel can dominate.
1oo2 | Trip = A OR B | Any channel can force Trip. Reduces missed-trip risk when hazards can appear in only one channel, but increases nuisance-trip sensitivity.
2oo3 | Trip = (A + B + C) ≥ 2 | Majority required. More robust to a single wrong Trip signal, but depends more on channel coherency (timing and thresholds) to avoid split votes.
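
The three trip rules can be written directly as boolean functions. The sketch below is illustrative only: the function names are hypothetical, and it assumes every input has already been reduced to a valid 0/1 state under the Trip=1 convention (INVALID handling is a separate policy).

```python
def vote_1oo1(a: int) -> int:
    """Trip follows the single channel."""
    return 1 if a else 0


def vote_1oo2(a: int, b: int) -> int:
    """Trip if either channel requests a trip (A OR B)."""
    return 1 if (a or b) else 0


def vote_2oo3(a: int, b: int, c: int) -> int:
    """Trip if at least two of the three channels request a trip."""
    return 1 if (a + b + c) >= 2 else 0


# Walk a few channel patterns through the 1oo2 and 2oo3 rules.
for a, b, c in [(0, 0, 0), (1, 0, 0), (1, 1, 0)]:
    print(f"A={a} B={b} C={c}: 1oo2(A,B)={vote_1oo2(a, b)} 2oo3={vote_2oo3(a, b, c)}")
```

The pattern 100 is the interesting one: 1oo2 trips on it while 2oo3 waits for a second channel to corroborate.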
Availability vs safety (engineering outcomes)
  • Nuisance trip sensitivity: 1oo2 trips on any single asserted channel; 2oo3 rejects isolated single-channel assertions unless a second channel corroborates.
  • Hazard visibility assumption: if a real hazard can be observed by only one channel (coverage gaps), 1oo2 may trip earlier than 2oo3.
  • Timing sensitivity: majority voting can misbehave when channels are asynchronous; validation windows and coherency flags become mandatory controls.
Why 2oo3 is not “a safer 1oo2”

2oo3 and 1oo2 optimize different failure assumptions. 1oo2 prioritizes “trip if any credible channel requests trip,” which can be desirable when hazards may be partially observed. 2oo3 prioritizes “trip on corroborated majority,” which can reduce nuisance trips from single-channel noise or drift. Safety impact depends on the fault model and the coherency controls (thresholds, tolerances, windows).

Different fault assumptions • Different coherency needs • Different nuisance behavior
Evidence Pack (minimum fields)
  • vote_mode: 1oo1 / 1oo2 / 2oo3
  • input_snapshot: A,B,(C) states captured at decision time
  • decision: Trip / Safe
  • reason_code: ANY_TRIP / MAJORITY_TRIP / INVALID_INPUT / WINDOW_FAIL
  • window_id + timestamp: tie decision to a validation window and time base
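
As a rough illustration of how these fields might travel together, the sketch below bundles a 2oo3 decision with its evidence. The record layout, the `decide_2oo3` name, the `NO_TRIP` reason code, and the policy of tripping on any INVALID channel are all assumptions made for the example; the page leaves the INVALID policy to the designer.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

INVALID = None  # hypothetical marker for a stale or invalid channel


@dataclass(frozen=True)
class VoteRecord:
    vote_mode: str
    input_snapshot: Tuple[Optional[int], ...]
    decision: str       # "Trip" or "Safe"
    reason_code: str
    window_id: int
    timestamp: float


def decide_2oo3(snapshot, window_id: int, timestamp: float) -> VoteRecord:
    """Apply a 2oo3 vote and return the decision together with its evidence."""
    if any(state is INVALID for state in snapshot):
        # Assumed policy: an INVALID channel forces the safe action.
        decision, reason = "Trip", "INVALID_INPUT"
    elif sum(snapshot) >= 2:
        decision, reason = "Trip", "MAJORITY_TRIP"
    else:
        decision, reason = "Safe", "NO_TRIP"  # hypothetical code for "no trip"
    return VoteRecord("2oo3", tuple(snapshot), decision, reason, window_id, timestamp)


record = decide_2oo3((1, 1, 0), window_id=7, timestamp=123.456)
print(record)
```

Freezing the record (`frozen=True`) mirrors the intent that a snapshot is captured once at decision time and not edited afterwards.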
[Figure H2-3 diagram: Voting truth table (Trip vote: 1 = Trip request). Inputs A, B, C (0/1) are captured in the same window as a decision snapshot. Vote logic: 1oo1: Trip = A; 1oo2: A OR B; 2oo3: at least two of three. Representative cases: 000 → Safe; 100 → 1oo2 Trip; 110 → 2oo3 Trip. Outputs: Safe / Trip. Log fields: vote_mode, A,B,C snapshot, reason_code. Voting determines Trip/Safe from discrete channel states; coherency controls are required for asynchronous inputs.]
Figure H2-3 — Truth-table style view: A/B/C states enter a deterministic vote rule (1oo1 / 1oo2 / 2oo3) to produce Safe/Trip plus minimum log fields.

Chapter linkage: once the vote rule is fixed, the implementation must guarantee that channel states are comparable (thresholds, tolerances, and validation windows).

H2-4. Voting Comparator Architectures

Core problem

Voting rules assume discrete channel states, but real inputs are noisy, delayed, and drifting. Comparator architectures convert ambiguous signals into vote-ready states (0/1/INVALID) under controlled error bounds. The controls are not optional: threshold definition, mismatch tolerance, and debounce/validation windows determine whether voting behaves predictably.

Architecture families (implementation shapes)
  • Analog comparator voting: hardware thresholds with hysteresis and debounce to prevent chatter near trip points; deterministic latency is a key advantage.
  • Digital comparator + MCU voting: sampled/filtered signals mapped to states using window rules; enables richer diagnostics and drift-aware policies.
  • Window / trend compare (drift-aware): band checks flag out-of-range behavior; trend metrics detect slow bias that can poison long-term voting.
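
A minimal sketch of the window / trend idea, assuming a simple band check plus a moving average of each channel's deviation from a reference value. The class name, the choice of reference, and the mean-deviation metric are illustrative assumptions, not a prescribed method.

```python
from collections import deque


def in_band(value: float, low: float, high: float) -> bool:
    """Band check: True when the value lies inside the allowed window."""
    return low <= value <= high


class DriftMonitor:
    """Flags slow bias: mean deviation over a full window exceeds a tolerance."""

    def __init__(self, tolerance: float, window: int) -> None:
        self.tolerance = tolerance
        self.history = deque(maxlen=window)

    def update(self, channel_value: float, reference: float) -> bool:
        """Record one deviation sample; return True when drift is flagged."""
        self.history.append(channel_value - reference)
        if len(self.history) < self.history.maxlen:
            return False  # not enough samples for a trend yet
        trend = sum(self.history) / len(self.history)
        return abs(trend) > self.tolerance


monitor = DriftMonitor(tolerance=0.5, window=4)
for value in (10.1, 10.1, 11.0, 11.0, 11.0, 11.0):
    print(value, monitor.update(value, reference=10.0))
```

In practice the reference could be the median of the redundant channels, so a single drifting channel stands out against the other two.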
Sync vs async inputs (common failure mode)

Asynchronous channels can disagree briefly even when all channels are correct. Without a validation window and coherency rules, majority voting can oscillate: early edges create split votes; late edges “correct” them after the decision is already made. This is controlled by windowing (time alignment), debounce, and explicit INVALID handling.
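
One way to picture the windowing control is a check that all channel edges fall within `window_ms` of each other before a vote is allowed. The function below is a sketch under that assumption; the name `check_coherency` and the dictionary input format are hypothetical.

```python
def check_coherency(edge_times_ms: dict, window_ms: float):
    """Return (coherency_flag, timestamp_delta) for one validation window.

    edge_times_ms maps a channel name to the time its state was captured.
    """
    times = list(edge_times_ms.values())
    timestamp_delta = max(times) - min(times)
    return timestamp_delta <= window_ms, timestamp_delta


# Channels captured within 3 ms of each other, against a 5 ms window.
print(check_coherency({"A": 100.0, "B": 103.0, "C": 101.5}, window_ms=5.0))
# Channel B arrives late, so the window fails and the vote should not proceed.
print(check_coherency({"A": 100.0, "B": 110.0, "C": 101.5}, window_ms=5.0))
```

When the flag is False, the decision would be handled by the INVALID policy (for example, logged with a WINDOW_FAIL reason code) instead of being voted on.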

window_id + window_ms • timestamp_delta • coherency_flag • INVALID policy
Hysteresis and chatter (why nuisance trips happen)
  • Noise-induced chatter: signals near threshold flip rapidly; hysteresis and debounce convert chatter into a stable state transition.
  • Edge timing bounce: short-lived edges create disagreement across channels; validation windows prevent single-edge artifacts from becoming a vote input.

Requirement: each stability mechanism must be traceable (which threshold/band/window was active during the decision).
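
The two stability mechanisms can be combined in a small sampled comparator. This is a sketch, assuming a symmetric hysteresis band around the threshold and a debounce expressed as a count of consecutive samples (a stand-in for `debounce_ms`); the class and attribute names are hypothetical.

```python
class HysteresisDebounce:
    """Sampled comparator with a hysteresis band and a sample-count debounce."""

    def __init__(self, threshold: float, hysteresis_band: float,
                 debounce_samples: int) -> None:
        self.upper = threshold + hysteresis_band / 2
        self.lower = threshold - hysteresis_band / 2
        self.debounce_samples = debounce_samples
        self.state = 0    # debounced, vote-ready state
        self._raw = 0     # comparator output after hysteresis
        self._count = 0   # consecutive samples disagreeing with self.state

    def update(self, value: float) -> int:
        # Hysteresis: the raw output changes only when the value leaves the band.
        if value >= self.upper:
            self._raw = 1
        elif value <= self.lower:
            self._raw = 0
        # Debounce: accept a change only after enough consecutive samples.
        if self._raw != self.state:
            self._count += 1
            if self._count >= self.debounce_samples:
                self.state = self._raw
                self._count = 0
        else:
            self._count = 0
        return self.state


comparator = HysteresisDebounce(threshold=10.0, hysteresis_band=1.0, debounce_samples=3)
print([comparator.update(v) for v in (10.6, 9.4, 10.6, 9.4, 10.6, 10.6, 10.6)])
```

Rapid flips across the band never accumulate enough consecutive samples, so the vote-ready state changes only after the input has stayed above the upper limit for three samples in a row.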

Evidence fields (configuration-grade)
Control | Minimum fields to record | Purpose
threshold definition | threshold_id, threshold_value, hysteresis_band | Proves what “Trip” means and prevents chatter near the trip point.
mismatch tolerance | tolerance_id, allowed_delta, channel_pair | Controls how much disagreement is permitted before marking INVALID or forcing a safe action.
debounce / validation window | debounce_ms, window_ms, window_id, timestamp_delta | Aligns asynchronous inputs and prevents transient edges from corrupting a vote decision.
[Figure H2-4 diagram: Comparator Architectures → Vote-Ready States. Raw channel inputs: A (noisy), B (delayed), C (drifting). Coherency needs: threshold + tolerance, debounce + window. Comparator layer: analog compare (hysteresis + debounce), digital compare (sample window + rules), window / trend (band + drift metric). Output states: 0 / 1 / INVALID. Vote stage: voting rule 1oo2 / 2oo3 → Safe / Trip. Log controls: threshold_id, tolerance_id, window_id, debounce_ms. Validation windows align asynchronous edges; hysteresis/debounce stabilize threshold crossings.]
Figure H2-4 — Comparator architectures convert noisy/delayed/drifting inputs into vote-ready states (0/1/INVALID) using thresholds, tolerances, debounce, and validation windows.

Next chapter linkage: lockstep safety MCUs can internalize comparison and diagnosability, but the same evidence controls (windowing, reason codes, and state snapshots) remain mandatory.

H2-5. Lockstep Safety MCUs as Logic Solvers

What “lockstep” contributes to a Logic Solver

A lockstep safety MCU acts like an internal voter: two execution paths run the same instruction stream in tightly aligned cycles, and a cycle-by-cycle compare checks whether computed states remain identical. When a mismatch is detected, the device asserts a fault flag and transitions into a defined safety response (e.g., force safe outputs, latch fault state, and record evidence).
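
The compare-and-flag behavior can be imitated in software to make the idea concrete. The toy model below is only an analogy for what the hardware does: `step` is a hypothetical deterministic update, and the fault is injected artificially into one execution path.

```python
def step(state: int, sample: int) -> int:
    """Hypothetical deterministic update executed identically by both cores."""
    return (state * 31 + sample) & 0xFFFF


def run_lockstep(samples, fault_cycle=None):
    """Run two copies of the same computation and compare them every cycle.

    Returns (fault_flag, mismatch_cycle). fault_cycle optionally injects a
    single-bit upset into core B only, to imitate a random hardware fault.
    """
    state_a = state_b = 0
    for cycle, sample in enumerate(samples):
        state_a = step(state_a, sample)
        state_b = step(state_b, sample)
        if cycle == fault_cycle:
            state_b ^= 0x0004  # injected bit flip in core B
        if state_a != state_b:
            return True, cycle  # mismatch: assert fault flag, stop normal execution
    return False, None


print(run_lockstep([1, 2, 3, 4]))
print(run_lockstep([1, 2, 3, 4], fault_cycle=2))
```

Note what the model cannot see: if both copies receive the same wrong `sample`, their states stay identical and no mismatch is raised.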

cycle compare • state compare • fault flag • safe response
Core vs peripheral coverage (critical boundary)
  • Core-domain coverage: lockstep is strongest at catching internal random faults that cause divergent computation (register/ALU/control-flow divergence).
  • Peripheral-domain boundary: many peripherals are shared. A peripheral fault can feed identical wrong data to both cores, producing consistent but wrong execution that may not trigger a mismatch.
  • Practical control: peripheral health flags (timeouts, CRCs, overruns) and input plausibility rules complement lockstep by turning “shared wrong” into an explicit invalid state.
Internal compare vs external voting (layering)

Internal lockstep comparison and external multi-channel voting solve different problems. Lockstep focuses on execution integrity inside the MCU. External 1oo2/2oo3 voting focuses on input-channel consistency and hazard visibility across channels. A robust Logic Solver typically uses both: lockstep to detect internal divergence and external voting/comparison to validate safety inputs.

Layer | Primary protection | Minimum evidence
Internal (lockstep) | Detects divergent execution and invalid internal states (random faults causing mismatch) | compare_mismatch_flag, fault_reason_code
External (voting) | Validates channel consistency and resolves disagreement across inputs | vote_mode, input_snapshot, window_id
Lockstep ≠ system-level redundancy

Lockstep improves fault detection but does not automatically create independence. Shared resources (clock, power, memory paths) can introduce common-cause failures that affect both cores similarly. In addition, a systematic software defect can produce identical wrong behavior in both cores. Therefore, lockstep must be paired with explicit diagnostics, coherency controls, and evidence logging to maintain auditability.

Software error visibility (what is caught vs not caught)
Category | Typical outcome | Control / evidence
Caught | Random faults that cause divergent results (bit flips, transient execution anomalies) → mismatch triggers fault | compare_mismatch_flag, fault_latch_status
Not caught | Systematic defects that produce identical wrong logic in both cores (wrong threshold, wrong rule) | config_id, software_build_id, change control
Conditionally caught | Input sampling or timing differences can create divergence even without true faults (asynchronous edges) | window_id, timestamp_delta, coherency_flag
Evidence Pack (minimum fields)
  • lockstep_state: enabled / degraded / disabled
  • compare_mismatch_flag + fault_reason_code: why divergence was declared
  • fault_latch_status: whether the fault is latched and requires a controlled reset
  • decision_origin: internal fault vs external vote decision
  • software_build_id + config_id: ties behavior to a versioned configuration baseline
[Figure H2-5 diagram: Lockstep execution compare (internal integrity check). Shared resources: clock, power, memory. Core A and Core B each run fetch / decode, execute, and produce a state snapshot (regs / flags). Compare block: cycle check, state check → fault flag → safe response. Mismatch indicates divergent execution; shared resources can still create common-cause risk.]
Figure H2-5 — Lockstep compares Core A/Core B execution and state each cycle. A mismatch asserts a fault flag and triggers a defined safe response while producing audit evidence.

H2-6. Discrete vs MCU-Based Logic Solver Trade-offs

Purpose of this selection chapter

Discrete and MCU-based logic solvers can both implement voting, but they differ in diagnosability, timing determinism, audit evidence depth, and lifecycle cost. Selection should be driven by measurable requirements: diagnostic expectations, maximum decision latency, evidence obligations, and change-management maturity.

Comparison matrix
Dimension | Discrete logic solver | Lockstep safety MCU | Evidence focus
Diagnostic coverage | Strong for clear threshold/line faults; depends on added monitors for deeper visibility | Richer self-tests and state diagnostics; depends on software/config discipline | health flags, reason codes, diagnostic inventory
Latency determinism | Short, predictable paths; minimal scheduling uncertainty | Depends on sampling windows, task timing, and rule execution budget | latency budget, window_id, timestamp_delta
Auditability | Often needs external logging to explain “why Trip happened” | Can record snapshots, reason codes, and state traces internally | input_snapshot, state_snapshot, log_sequence
Lifecycle management | Low change frequency; simpler maintenance but limited feature evolution | Updatable and extensible; requires versioning, regression, and controlled rollout | software_build_id, config_id, change_log_ref

Reading tip: each dimension should be tied to an explicit requirement. For example, “maximum detection latency” and “minimum evidence fields” determine whether windowing and logging are mandatory.

Decision anchors (practical)
  • Choose discrete-first when deterministic latency and minimal complexity dominate, and evidence needs can be satisfied by external logging.
  • Choose MCU-first when richer diagnostics, configurable rules, and audit-grade traces are required, and lifecycle controls (versioning, regression) are feasible.
  • Hybrid reality is common: discrete front-end comparators for clean thresholds plus MCU logic for windowing, voting, and evidence logging.
[Figure H2-6 diagram: Discrete vs MCU Logic Solver (trade-off map). Dimensions: coverage, latency, audit, lifecycle. Discrete path: comparator(s) → vote logic → output action; external logging often needed. Lockstep MCU path: sampling window → rules + voting → lockstep checks; built-in evidence logs. Evidence outputs: reason_code • input/state snapshots • latency/window IDs • build/config IDs.]
Figure H2-6 — Engineering trade-off map: discrete logic emphasizes deterministic paths; lockstep MCUs enable richer diagnostics and audit evidence but require lifecycle discipline.

H2-7. Isolated I/O for Safety Signal Integrity

Why isolation matters for a Logic Solver

A Logic Solver does not measure the physical world, but it must guarantee that safety inputs remain clean, comparable, and independent. Isolation is not only an electrical barrier: it protects the assumptions behind voting by limiting common-mode coupling, ground shifts, and transient injection that can move multiple channels together and invalidate independence.

channel independence • threshold stability • CM event containment • auditable health
Input isolation vs output isolation
Isolation location | Primary purpose (logic-solver view) | Evidence focus
Input isolation | Preserves vote credibility by limiting shared reference errors and transient injection into comparator/threshold decisions across channels. | channel_health, CM transient logs, coherency flags
Output isolation | Preserves the ability to execute a safe action without back-injection from high-energy domains; prevents output disturbances from polluting inputs. | output_health, isolation fault flags, trip path status
How common-mode events break voting
  • Threshold reference shift: a CM event moves the effective threshold in multiple channels at once, producing “consistent” but wrong vote inputs.
  • Transient injection inside the decision window: a spike coincides with validation windows, causing multiple channels to sample the same wrong state.
  • Recovery mismatch: isolation elements may saturate or recover at different rates, creating short split votes and oscillating decisions if windowing is weak.

Practical implication: voting assumes independence; CM coupling can collapse independence into a single shared failure mode.

Isolation failure ≠ open failure

An open failure typically removes a channel and is often detected by timeouts or out-of-range checks. An isolation failure can be more dangerous: signals may still appear plausible while channel independence is degraded. This means “value looks normal” is not sufficient evidence—explicit isolation health and CM event awareness are required.

Open: missing channel • Isolation: plausible but coupled • Needs explicit health flags
Evidence Pack (minimum fields)
  • channel_health: OK / Degraded / Invalid (per channel)
  • isolation_fault_flags: isolation element health and fault latches
  • CM_transient_logs: timestamp + severity + affected channels
  • coherency_flag: whether inputs were comparable during the decision window
[Figure H2-7 diagram: Isolated input channel map (vote integrity). Field domain: Channel A, B, C signals and a CM event source (surge / ground shift) with a CM transient path. Input isolation: Iso A, Iso B, Iso C, plus isolation health fault flags. Logic Solver: compare / window coherency → vote-ready states 0 / 1 / INVALID. Logs: channel_health, isolation flags, CM transient. Isolation protects channel independence; CM events must be logged to prove vote integrity.]
Figure H2-7 — Isolated input channels limit common-mode coupling and support auditable channel health, isolation fault flags, and CM transient event logging.

H2-8. Diagnostic Coverage and Fault Detection

What diagnostics are (and are not)

Diagnostics are not “alarms” for operators. The primary safety purpose is to reduce dangerous undetected failures by converting them into detected faults that trigger defined safe actions, degraded modes, or maintenance interventions. Evidence must distinguish detected versus undetected fault classes and show how each is handled.

detect dangerous faults • drive safe state • prove detection
Three diagnostic layers (structure)
Layer | Role in fault detection | Evidence focus
Startup diagnostics | Blocks “faulty-at-boot” conditions by validating critical integrity before enabling safety functions. | startup_diag_result, config_id, integrity flags
Online diagnostics | Detects faults during operation (execution divergence, I/O integrity loss, window/coherency failures) and drives safe/degraded behavior. | fault_reason_code, window_id, channel_health
Periodic proof-test support | Complements online coverage by enabling verification of fault classes that are otherwise difficult to detect continuously. | proof_test_records, counters, test hooks
DC% categories (conceptual, evidence-driven)

Diagnostic coverage categories (often described as Low/Medium/High) reflect how much of the relevant dangerous fault space is converted from undetected to detected. The key is not a label: it is a maintained mapping between fault classes and the diagnostics that detect them, plus proof that detection triggers the intended safe response.

fault class list • detection mapping • response mapping
Detected vs undetected fault list (audit-critical)
Fault class | Detected by | Evidence fields (minimum)
Execution divergence | Online (lockstep compare) | compare_mismatch_flag, fault_reason_code, fault_latch_status
I/O integrity loss | Online (channel health + coherency) | channel_health, coherency_flag, window_id, timestamp_delta
Isolation degradation | Online (isolation flags) + event logs | isolation_fault_flags, CM_transient_logs, affected_channels
Systematic rule/config error | Not reliably detected online (requires lifecycle controls) | config_id, software_build_id, change_log_ref
Blind-spot fault class | Periodic (proof-test support) | proof_test_records, counters, last_test_timestamp

Requirement: “undetected” categories must be explicitly listed and mapped to compensating controls (proof-test support, maintenance actions, or design constraints).
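
This requirement lends itself to a mechanical check: every fault class should appear either in the detected list (with a diagnostic source and a response) or in the undetected list (with a compensating control). The sketch below assumes a simple dictionary layout; the response values such as `safe_state` are placeholders invented for the example.

```python
FAULT_CLASSES = [
    "execution_divergence",
    "io_integrity_loss",
    "isolation_degradation",
    "systematic_rule_config_error",
    "blind_spot_fault_class",
]

# detected class -> (diagnostic source, response); responses are placeholders.
detected_fault_list = {
    "execution_divergence": ("online: lockstep compare", "safe_state"),
    "io_integrity_loss": ("online: channel health + coherency", "safe_state"),
    "isolation_degradation": ("online: isolation flags + event logs", "degraded_mode"),
}

# undetected class -> compensating control
undetected_fault_list = {
    "systematic_rule_config_error": "lifecycle controls (config_id, change_log_ref)",
    "blind_spot_fault_class": "periodic proof-test support",
}


def audit_fault_lists(fault_classes, detected, undetected):
    """Return the fault classes that have neither a diagnostic nor a control."""
    gaps = []
    for fault_class in fault_classes:
        source_and_response = detected.get(fault_class)
        if source_and_response and all(source_and_response):
            continue  # detected, with both a source and a response
        if undetected.get(fault_class):
            continue  # undetected, but mapped to a compensating control
        gaps.append(fault_class)
    return gaps


print(audit_fault_lists(FAULT_CLASSES, detected_fault_list, undetected_fault_list))
```

Running such a check whenever the fault class inventory changes keeps a newly added class from silently falling outside both lists.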

Evidence Pack (minimum fields)
  • DC% categories: mapped to fault classes and diagnostics (inventory-style)
  • detected_fault_list: detected class → diagnostic source → response
  • undetected_fault_list: blind-spot class → compensating control
  • diag_event_logs: reason_code + timestamp + affected channels
[Figure H2-8 diagram: Diagnostic coverage map (detected vs undetected). Startup diagnostics: integrity checks; enable safety only if pass. Online diagnostics: lockstep mismatch, I/O integrity (health + coherency), response safe / degraded. Periodic proof-test support: covers blind spots; records + counters + timestamps. Maintain two lists, detected_fault_list and undetected_fault_list, each mapped to controls and evidence fields.]
Figure H2-8 — Coverage is built from startup + online diagnostics and supplemented by periodic proof-test support. The audit core is the detected vs undetected fault mapping.

H2-9. Safe State Handling and Output Actions

Why post-trip behavior matters

A trip decision is only half of safety. The Logic Solver must enforce a deterministic safe-state policy that defines what outputs do after a detected hazard, how long they persist, and what evidence is required before any recovery. The goal is predictable, repeatable, and auditable behavior under real fault and transient conditions.

deterministic outputs • explicit latching • controlled recovery • audit evidence
De-energize vs energize-to-trip (policy semantics)
Policy | What “Trip” means (logic view) | Evidence focus
De-energize-to-trip | Trip commands outputs into a de-energized safe state; loss-of-energy tends to align with the safe direction. | safe_state_policy, output_sequence_id
Energize-to-trip | Trip commands an energized state to enforce protection; correctness relies on output consistency and health under disturbances. | output_path_health, trip_output_command

Key engineering point: the difference is not “high vs low” but the defined safe direction under loss-of-energy and disturbances.

Latched vs auto-reset (recovery rules)

Safe-state persistence must be explicit. Latched handling keeps the system in safe outputs until reset preconditions are proven. Auto-reset allows recovery when preconditions are satisfied, but requires validation windows to prevent oscillation during noisy boundaries.

  • Reset preconditions: channel health is stable, coherency holds, and recent CM transient severity is acceptable for recovery.
  • Reset traceability: every reset must be logged with timestamp, actor/mode, and reason.
  • Anti-oscillation: recovery windows and throttling prevent rapid trip-reset cycles under marginal conditions.
Output consistency vs recoverability (state-machine view)

Safe-state behavior should be modeled as a sequence, not a single output assignment. A robust policy freezes decision context, commits evidence, drives safe outputs, then manages latching and recovery. Determinism ensures the same event leads to the same sequence, while recoverability ensures the system returns only when evidence-based conditions are met.
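
The sequence view can be sketched as a small state machine. The example below assumes a latched policy with a supervised reset and collapses the output sequence to a single flag; the class, state, and log entry names are hypothetical.

```python
from enum import Enum


class State(Enum):
    NORMAL = "NORMAL"
    SAFE_LATCHED = "SAFE_LATCHED"


class SafeStateController:
    """Trip sequence: freeze context, commit evidence, drive safe outputs, latch."""

    def __init__(self) -> None:
        self.state = State.NORMAL
        self.outputs_safe = False
        self.frozen_context = None
        self.log = []

    def on_trip(self, vote_context: dict) -> None:
        if self.state is not State.NORMAL:
            return  # already latched; the first cause is preserved
        self.frozen_context = dict(vote_context)          # 1. freeze vote context
        self.log.append(("TRIP", self.frozen_context))    # 2. commit evidence
        self.outputs_safe = True                          # 3. execute safe outputs
        self.state = State.SAFE_LATCHED                   # 4. hold until reset

    def request_reset(self, actor: str, health_stable: bool,
                      coherency_ok: bool, cm_severity_ok: bool) -> bool:
        """Allow recovery only when latched and every precondition holds."""
        allowed = bool(self.state is State.SAFE_LATCHED and health_stable
                       and coherency_ok and cm_severity_ok)
        self.log.append(("RESET_ATTEMPT", {"actor": actor, "allowed": allowed}))
        if allowed:
            self.outputs_safe = False
            self.state = State.NORMAL
        return allowed


controller = SafeStateController()
controller.on_trip({"vote_mode": "2oo3", "input_snapshot": (1, 1, 0)})
print(controller.request_reset("operator", True, False, True))
print(controller.request_reset("operator", True, True, True))
```

Every reset attempt is logged whether or not it succeeds, so a refused recovery leaves the same kind of trace as a granted one.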

freeze vote context → commit evidence → execute safe outputs → manage recovery
Evidence Pack (minimum fields)
  • safe_state_policy: DE_ENERGIZE / ENERGIZE_TO_TRIP
  • latch_state + reset_mode: latched behavior and reset control mode
  • reset_preconditions_met: boolean + reason code (why recovery was allowed)
  • output_sequence_id + sequence_step: ties behavior to a deterministic output sequence
  • reset_event_log: timestamp + mode/actor + reason
[Figure H2-9 diagram: Safe state policy (deterministic outputs + controlled recovery). Policy: de-energize-to-trip or energize-to-trip. Normal (vote active, outputs commanded) → Trip detected (freeze context, commit evidence) → Safe outputs (deterministic sequence steps) → Latched (hold safe state, manual/supervised reset) or Auto-reset (recover only if proven, window + throttling). Reset preconditions: health stable • coherency ok • CM severity ok. Outputs follow a defined sequence; recovery is evidence-based and traceable.]
Figure H2-9 — Safe-state handling is a deterministic policy: trip freezes context, commits evidence, drives safe outputs, then enforces latching and controlled recovery.

H2-10. Diagnostic Logging and Event Traceability

Why logging is the evidence core

Logging is not a “nice to have.” It is the mechanism that turns safety decisions into verifiable evidence. A traceable Logic Solver records a time-ordered sequence of events, vote snapshots, and health context so that post-trip analysis can reconstruct what happened, why it happened, and whether the decision was consistent with the configured rules.

timeline • snapshots • cause codes • forensic replay
Minimum event record (one log entry)
Field | Meaning | Example
timestamp | Orders events and enables reconstruction of causality and decision windows. | t=123.456s
event_type | Declares the semantic class of the event (trip, reset, CM transient, channel invalid, mismatch). | TRIP
cause_code | Explains why the event occurred (rule result, mismatch, coherency failure, isolation fault). | VOTE_2OO3
affected_channels | Lists channels implicated in the event and the decision context. | A,B
decision_origin | Distinguishes external voting decisions from internal integrity faults. | VOTE
log_sequence | Detects missing entries and supports integrity checks across the event stream. | #004218
Vote snapshot (frozen decision context)

Every trip should bind to a snapshot that captures inputs, voting state, and health context at the decision boundary. Without snapshots, logs show that a trip occurred but cannot prove why it was inevitable under the configured rules.

Snapshot group | Contents | Key fields
Inputs snapshot | Per-channel state and quality flags (0/1/INVALID + plausibility/coherency). | input_snapshot, channel_health
Voting snapshot | Vote mode, decision window ID, window state, and vote result at the boundary. | vote_mode, window_id, vote_result
Health snapshot | Lockstep and isolation context plus recent CM event references. | lockstep_state, isolation_flags, CM_log_ref
Post-trip forensic reconstruction (repeatable method)
  1. Build the timeline: sort by timestamp and mark key events (CM transient, channel invalid, vote snapshot, trip, safe output sequence, reset attempts).
  2. Bind decisions to snapshots: verify that each trip maps to exactly one snapshot and that snapshot fields match the configured vote rules and windows.
  3. Explain causality: determine whether the trip was driven by input disagreement, integrity faults, or coherency/CM events; confirm output sequence and latching followed policy.
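
The first two steps above can be sketched in a few lines. The example assumes events and snapshots are plain dictionaries, that a trip is bound to its snapshot through `window_id`, and that the configured rule is 2oo3 over valid 0/1 inputs; these choices and the finding labels are illustrative, not a defined log format.

```python
def reconstruct(events, snapshots):
    """Order the event stream, find sequence gaps, and check trips against snapshots.

    events: dicts with timestamp, log_sequence, event_type, optional window_id.
    snapshots: window_id -> dict with input_snapshot and vote_result.
    """
    # Step 1: build the timeline.
    timeline = sorted(events, key=lambda e: (e["timestamp"], e["log_sequence"]))

    # Integrity: any hole in log_sequence means an entry is missing.
    seqs = sorted(e["log_sequence"] for e in events)
    missing = [s for lo, hi in zip(seqs, seqs[1:]) for s in range(lo + 1, hi)]

    # Step 2: bind each trip to its snapshot and replay the (assumed) 2oo3 rule.
    findings = {}
    for event in timeline:
        if event["event_type"] != "TRIP":
            continue
        snap = snapshots.get(event.get("window_id"))
        if snap is None:
            findings[event["log_sequence"]] = "NO_SNAPSHOT"
            continue
        expected = "Trip" if sum(snap["input_snapshot"]) >= 2 else "Safe"
        consistent = expected == snap["vote_result"]
        findings[event["log_sequence"]] = "CONSISTENT" if consistent else "RULE_MISMATCH"
    return timeline, missing, findings


events = [
    {"timestamp": 2.0, "log_sequence": 12, "event_type": "TRIP", "window_id": 5},
    {"timestamp": 1.0, "log_sequence": 10, "event_type": "CM_TRANSIENT"},
    {"timestamp": 3.0, "log_sequence": 13, "event_type": "RESET"},
]
snapshots = {5: {"input_snapshot": (1, 1, 0), "vote_result": "Trip"}}
timeline, missing, findings = reconstruct(events, snapshots)
print([e["event_type"] for e in timeline], missing, findings)
```

Step 3 (explaining causality) remains an engineering judgment, but it starts from these outputs: the ordered timeline, the list of missing entries, and a per-trip consistency finding.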

Result: a defensible evidence chain that supports audits and root-cause analysis without relying on subjective interpretation.

Evidence Pack (minimum fields)
  • timestamp + log_sequence: ordered events and missing-entry detection
  • event_type + cause_code + affected_channels
  • input_snapshot + vote_mode + vote_result + window_id
  • channel_health + isolation_fault_flags + lockstep_state
  • output_sequence_id + sequence_step + latch_state + reset_event_log
[Figure H2-10 diagram: Event timeline and evidence chain (traceability). Timeline t0 to t5: CM event, channel invalid, snapshot, trip decision, safe outputs, reset attempt. Evidence chain (bind facts to decisions): event record (timestamp • cause), vote snapshot (inputs • mode • window), health context (channel • iso • lockstep), reconstruction (timeline → bind → explain), integrity (log_sequence • missing gaps). A defensible trip requires ordered events + snapshots + integrity markers for repeatable forensic replay.]
Figure H2-10 — Traceability binds a time-ordered event stream to vote snapshots and health context so trip decisions can be reconstructed and audited.

H2-11. Integration Considerations in SIS Architectures

Scope boundary (interface-only)

Integration is defined by contracts: what inputs mean, what outputs guarantee, and how diagnostics are acknowledged. This chapter covers interface semantics and evidence fields only; it does not describe PLC/DCS network topologies or plant-wide control architectures.

input contracts • output contracts • diagnostic handshakes • versioned evidence
Input contracts (value + timing + quality)

A voting logic solver depends on inputs that are comparable. A usable input contract defines three layers: value semantics (0/1/INVALID), timing semantics (freshness and maximum age), and quality semantics (OK/DEGRADED/INVALID) that governs how each channel participates in voting.

| Contract item | Definition (logic view) | Evidence fields |
| --- | --- | --- |
| Value semantics | Define allowed states and explicit INVALID behavior (e.g., stale/illegal/out-of-window becomes INVALID). | input_state |
| Timing semantics | Define update expectation and max-age; specify how stale values affect vote participation. | input_timestamp, age_ms |
| Quality semantics | Quality flag drives whether a channel is eligible for voting and how it is weighted/ignored. | input_quality, channel_health |
| Contract versioning | Inputs must be traceable to a specific contract revision to prevent silent semantic drift. | input_contract_id |

Practical rule: without a defined INVALID state and freshness rule, voting can mistake “data” for “truth.”
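
A minimal sketch of the three contract layers applied in order is shown below. The `Quality` enum, the function name, and the choice to let DEGRADED inputs still reach the voter are assumptions for illustration; a real contract would state that policy explicitly.

```python
from enum import Enum

class Quality(Enum):
    OK = "OK"
    DEGRADED = "DEGRADED"
    INVALID = "INVALID"

def evaluate_input(raw_state, age_ms, quality, max_age_ms):
    """Return what the voter may see: 0, 1, or "INVALID"."""
    if quality is Quality.INVALID:
        return "INVALID"          # quality layer vetoes the channel
    if age_ms > max_age_ms:
        return "INVALID"          # timing layer: stale data is not truth
    if raw_state not in (0, 1):
        return "INVALID"          # value layer: illegal state
    return raw_state              # OK and DEGRADED both pass (assumption)
```

For example, `evaluate_input(1, 60, Quality.OK, 50)` yields `"INVALID"` because the value is older than the 50 ms maximum age, even though its quality flag still reads OK.
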

Example parts: ISO1211 (TI), industrial digital input; ISO7741 (TI), 4-ch digital isolator; ADuM141E (ADI), 4-ch digital isolator; Si8642 (Skyworks/Silicon Labs), digital isolator; TLV1704 (TI), quad comparator (window/vote); LM339 (TI/others), quad comparator (vote)
Output contracts (action semantics + guarantees + verifiability)

Outputs must be specified as a policy, not a pin toggle: what “Trip” commands (de-energize vs energize-to-trip), whether the state is latched, and what evidence allows recovery. A robust output contract also defines determinism (same cause → same sequence) and verifiability (readback or acknowledgement of output state).

| Contract item | Definition (logic view) | Evidence fields |
| --- | --- | --- |
| Safe-state policy | Define the safe direction and how it is commanded (de-energize vs energize-to-trip). | safe_state_policy |
| Sequence determinism | Trip drives a defined ordered sequence (freeze → log → outputs) with stable steps. | output_sequence_id, sequence_step |
| Latching & reset | Specify latch behavior and the required evidence for reset (manual/auto/supervised). | latch_state, reset_mode, reset_preconditions_met |
| Verifiability | Define how output state is confirmed (readback/ack); log if confirmation is missing. | output_state_readback, output_ack, ack_status |
| Contract versioning | Outputs are traceable to a specific contract revision and policy set. | output_contract_id |
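
The output contract can be sketched as a fixed step sequence with per-step confirmation. In this hypothetical example, `execute_step` stands in for whatever drives and reads back the hardware, the step names are invented, and the policy of always completing the sequence while blocking recovery on a missing confirmation is an assumption.

```python
from typing import Callable, List, Tuple

# Fixed order from the contract: freeze -> log -> outputs.
TRIP_SEQUENCE = ("FREEZE_INPUTS", "WRITE_LOG", "DRIVE_OUTPUTS")

def run_trip_sequence(execute_step: Callable[[str], bool],
                      sequence_id: int) -> Tuple[List[dict], str, bool]:
    """Run every step in order; record ACK status; never skip a step."""
    trace = []
    recovery_blocked = False
    for step_no, step in enumerate(TRIP_SEQUENCE):
        acked = execute_step(step)           # True when readback/ACK confirms the step
        trace.append({
            "output_sequence_id": sequence_id,
            "sequence_step": step_no,
            "step": step,
            "ack_status": "ACKED" if acked else "MISSING",
        })
        if not acked:
            recovery_blocked = True          # missing confirmation blocks reset
    return trace, "LATCHED", recovery_blocked
```

Because the step order is a constant rather than a function of the cause, the same trip always produces the same trace shape, which is the determinism property the contract asks for.
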
Example parts: ISO1042 (TI), isolated CAN transceiver; ADM3053 (ADI), isolated CAN transceiver; ISO3082 (TI), isolated RS-485; ADM2587E (ADI), isolated RS-485 (with isolated power); ISO6721 (TI), dual-ch reinforced isolator
Diagnostic handshakes (fault → ACK → record)

Diagnostics must form a closed loop. A “fault event” becomes actionable only when it is acknowledged and linked to a record ID that supports audits and proof tests. Handshakes should define: fault announcement fields, acknowledgement timing, escalation rules for missing ACK, and proof-test hooks (start/end markers and record IDs).

| Handshake step | Requirement | Evidence fields |
| --- | --- | --- |
| Fault announcement | Emit event with cause code, severity, channels, and decision origin (vote vs integrity). | event_type, cause_code, severity, affected_channels, decision_origin |
| Acknowledgement | Define ACK semantics and timeouts; missing ACK must be visible and policy-driven. | diag_request_id, ack_id, ack_timeout_ms, ack_status |
| Escalation | Specify what changes when ACK is absent (latch hold, degraded mode, or recovery blocked). | escalation_state, latch_state, recovery_blocked |
| Proof-test support | Provide interface markers: enter test mode, record start/end, produce test record ID. | proof_test_record_id, test_start_ts, test_end_ts |
| Versioning | Handshake semantics must be versioned to prevent “same bits, different meaning.” | diag_handshake_version |
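
The acknowledgement and escalation rows can be illustrated with a small status check. This is a sketch only: the status strings (`ACKED`, `PENDING`, `TIMEOUT`), the `ESCALATED` marker, and the rule that recovery stays blocked until a timely ACK arrives are assumptions, not part of a defined handshake.

```python
def check_ack(announced_ms, now_ms, ack_ms, ack_timeout_ms):
    """Classify a fault announcement; ack_ms is None while no ACK has arrived."""
    deadline = announced_ms + ack_timeout_ms
    if ack_ms is not None and ack_ms <= deadline:
        status = "ACKED"
    elif now_ms > deadline:
        status = "TIMEOUT"        # missing or late ACK becomes visible
    else:
        status = "PENDING"
    return {
        "ack_status": status,
        "escalation_state": "ESCALATED" if status == "TIMEOUT" else "NONE",
        "recovery_blocked": status != "ACKED",
    }
```

Returning the escalation fields together with `ack_status` keeps the policy decision and its evidence in the same record.
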
Example parts: MB85RS64V (Fujitsu), FRAM for event logs; CY15B104Q (Infineon/Cypress), F-RAM; TPS3435 (TI), watchdog supervisor; MAX6369 (Maxim/ADI), watchdog supervisor
[Figure: Contract Boundary Map (interface-only integration). Upstream inputs (Channels A, B, C: value, timing, quality) → input contract (input_contract_id, INVALID rules) → Logic Solver (voting + integrity, snapshots, evidence fields: timestamp, cause, channels) → output contract (output_contract_id, sequence + latch) → downstream interface (readback/ACK); diagnostic handshake: fault → ACK → record ID.]
Figure H2-11 — Integration is contract-driven: input semantics (value/timing/quality), output guarantees (sequence/latch/readback), and diagnostic handshakes (fault→ACK→record).
MPN list (quick reference)

Example parts frequently used around Logic Solver interfaces (not system networking):

| Function | Example MPN | Why it fits this chapter |
| --- | --- | --- |
| Industrial digital input | ISO1211 (TI) | Encodes input state/threshold behavior into a predictable logic-level contract. |
| Digital isolation (multi-ch) | ISO7741 (TI), ADuM141E (ADI), Si8642 (Skyworks/Silicon Labs) | Supports channel independence and clean handshakes across isolation boundaries. |
| Comparator building blocks | TLV1704 (TI), LM339 (multi-vendor) | Implements window/threshold checks that feed explicit value semantics (0/1/INVALID). |
| Isolated CAN | ISO1042 (TI), ADM3053 (ADI) | Useful for isolated diagnostic/status interfaces without discussing network topology. |
| Isolated RS-485 | ISO3082 (TI), ADM2587E (ADI) | Provides isolated diagnostic/ACK signaling across domain boundaries. |
| Nonvolatile event log | MB85RS64V (Fujitsu FRAM), CY15B104Q (Infineon F-RAM) | Supports traceability records (sequence IDs, snapshots) with robust retention. |
| Watchdog supervisor | TPS3435 (TI), MAX6369 (Maxim/ADI) | Supports “handshake timeout / escalation” supervision at the interface-policy level. |

Note: MPNs are examples to anchor integration discussions; final selection depends on isolation rating, channels, data rate, and system constraints.

H2-12. FAQs

Each answer follows a fixed format: 1 conclusion sentence + 2 evidence checks + 1 first fix. (Maps back to H2-3…H2-11.)

FAQ 01 2oo3 still trips unexpectedly—logic threshold or channel drift?

Conclusion: Unexpected 2oo3 trips usually come from a boundary condition (threshold/window) or a slowly diverging channel that crosses the vote rule.

  • Evidence to check: vote snapshot at decision time (A/B/C states, vote_mode, vote_result, window_id/window_state).
  • Evidence to check: drift-aware comparison signals (mismatch tolerance, trend/window comparator flags) versus the configured threshold definition.
  • First fix: tighten the validation window with drift-aware gating (flag channel as DEGRADED/INVALID before voting) and re-baseline thresholds with an explicit mismatch tolerance.
Maps: H2-3, H2-4
FAQ 02 1oo2 never trips when one channel is stuck—INVALID handling or debounce window?

Conclusion: A stuck channel that never triggers 1oo2 often indicates the stuck state is being treated as “valid” or the debounce/validation window never reaches a decisive state.

  • Evidence to check: channel health flags and whether “stale/constant” behavior transitions to INVALID (age_ms, input_quality, channel_health).
  • Evidence to check: debounce/validation window telemetry (window_state transitions, window duration, and any suppression due to asynchronous arrivals).
  • First fix: enforce a freshness/age rule that forces a stuck channel to INVALID and make the vote rule explicitly count INVALID as ineligible for 1oo2 participation.
Maps: H2-3, H2-4
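
The first fix for FAQ 02 can be sketched as a 1oo2 vote that applies the freshness rule before counting. The function names are hypothetical, each channel is modelled as a `(state, age_ms)` pair, and tripping when no channel is eligible is a fail-safe assumption rather than a stated policy.

```python
def effective_state(state, age_ms, max_age_ms):
    """Force stale or illegal readings to INVALID before they reach the vote."""
    return state if age_ms <= max_age_ms and state in (0, 1) else "INVALID"

def vote_1oo2(ch_a, ch_b, max_age_ms):
    """Each channel is (state, age_ms); INVALID channels are ineligible."""
    states = [effective_state(s, age, max_age_ms) for s, age in (ch_a, ch_b)]
    eligible = [s for s in states if s != "INVALID"]
    if not eligible:
        return "TRIP"             # assumption: no trustworthy input means safe state
    return "TRIP" if any(s == 1 for s in eligible) else "NO_TRIP"
```

With a 50 ms limit, `vote_1oo2((0, 5000), (1, 10), 50)` still returns `"TRIP"`: the stuck channel is excluded by age, and the remaining healthy channel decides on its own.
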
FAQ 03 Vote result flips near the boundary—hysteresis missing or async inputs?

Conclusion: Boundary flip-flopping is typically caused by insufficient hysteresis or unsynchronized inputs arriving in different windows.

  • Evidence to check: comparator/window thresholds and hysteresis configuration (threshold definition, hysteresis value, chatter counters).
  • Evidence to check: input alignment vs validation window (per-channel timestamp skew, window_id consistency across channels).
  • First fix: add or increase hysteresis and require window-aligned sampling so inputs are compared within one coherent validation window.
Maps: H2-4, H2-3
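
To show why hysteresis suppresses boundary chatter, here is a minimal software comparator with separate assert and release levels. The class name and the convention that the release level sits `hysteresis` below the threshold are assumptions for this sketch.

```python
class HysteresisComparator:
    """Asserts at `threshold`, releases only at `threshold - hysteresis`."""

    def __init__(self, threshold, hysteresis):
        self.on_level = threshold
        self.off_level = threshold - hysteresis
        self.state = 0

    def update(self, value):
        if self.state == 0 and value >= self.on_level:
            self.state = 1
        elif self.state == 1 and value <= self.off_level:
            self.state = 0
        return self.state         # unchanged while value is inside the band
```

With a threshold of 100 and hysteresis of 5, the samples 99, 100, 98, 96, 95, 99 produce 0, 1, 1, 1, 0, 0: small dips below 100 do not release the output, so the vote input does not flip on every noisy sample.
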
FAQ 04 Lockstep MCU flags mismatch but inputs look normal—core fault or I/O timing skew?

Conclusion: A lockstep mismatch with “normal-looking” inputs usually points to internal execution divergence or a timing/latency skew between sampling and comparison boundaries.

  • Evidence to check: lockstep compare records (compare_mismatch_flag, lockstep_state, fault timestamp and persistence).
  • Evidence to check: event timeline alignment between input capture and decision (timestamp ordering, snapshot_id binding to the mismatch event).
  • First fix: freeze and log a synchronized snapshot on mismatch, then tighten the sampling-to-compare schedule so both cores and I/O capture share the same decision boundary.
Maps: H2-5, H2-10
FAQ 05 Lockstep is “healthy,” but voting disagrees—peripheral fault coverage gap or input contract violation?

Conclusion: Voting disagreement with a healthy lockstep core often indicates a peripheral/I/O fault not covered by lockstep or an input contract (value/timing/quality) mismatch.

  • Evidence to check: core vs peripheral coverage indicators (lockstep_state OK while I/O timing/quality flags degrade).
  • Evidence to check: input contract compliance (input_state validity, age_ms freshness, input_quality transitions into DEGRADED/INVALID).
  • First fix: enforce the input contract at the boundary (reject stale/low-quality inputs before voting) and add explicit peripheral self-check/handshake flags into channel_health.
Maps: H2-5, H2-11
FAQ 06 Discrete voting works in the lab but fails in the plant—CM transient or isolation fault flags?

Conclusion: Field-only failures usually trace to common-mode (CM) events or isolation-domain anomalies that corrupt apparent input agreement.

  • Evidence to check: CM transient logs (severity/occurrence near trip, correlation with channel invalidations).
  • Evidence to check: isolation health/fault flags (isolation_fault_flags, channel_health transitions, “fault ≠ open” semantics).
  • First fix: gate voting with isolation/CM-aware health checks so a CM event forces channels to DEGRADED/INVALID before the vote decision is accepted.
Maps: H2-7, H2-10
FAQ 07 Repeated spurious trips after maintenance—voting window or reset policy changed?

Conclusion: Post-maintenance spurious trips commonly come from altered validation windows/debounce or a reset policy that allows recovery before stability is proven.

  • Evidence to check: configured debounce/validation window parameters and window_state history around each trip.
  • Evidence to check: latch/reset evidence (reset_mode, reset_preconditions_met, recovery_window_ms, reset_event_log).
  • First fix: require supervised or stricter reset preconditions plus a recovery window, and re-validate the voting window configuration against expected input timing.
Maps: H2-9, H2-4
FAQ 08 Trip happens, but outputs don’t match the expected sequence—output contract or latch state mismatch?

Conclusion: Output misbehavior after a valid trip is usually a contract mismatch (semantics changed) or an unverified output path (no readback/ACK) that breaks determinism.

  • Evidence to check: output sequence trace (output_sequence_id, sequence_step progression, trip_output_command snapshot).
  • Evidence to check: output confirmation (output_state_readback/output_ack and whether missing ACK triggered escalation/latch hold).
  • First fix: enforce the output contract version and require readback/ACK for each critical step, blocking auto-reset when confirmation is missing.
Maps: H2-9, H2-11
FAQ 09 Diagnostic coverage looks high, but audit still fails—what evidence is missing?

Conclusion: Audits fail when coverage claims are not backed by traceable evidence—especially “detected vs undetected” lists and decision snapshots tied to timestamps.

  • Evidence to check: DC categories and the detected/undetected fault inventory (what is covered online vs only by proof test).
  • Evidence to check: evidence chain completeness (log_sequence gaps, snapshot presence for each trip, cause_code consistency).
  • First fix: publish a bounded fault coverage matrix and ensure every trip binds to a vote snapshot and ordered event records with missing-entry detection.
Maps: H2-8, H2-10
FAQ 10 Proof test passes, yet undetected faults remain—startup vs online diagnostics gap?

Conclusion: Passing proof tests does not eliminate undetected faults when the diagnostic plan leaves gaps between startup checks and online monitoring.

  • Evidence to check: which fault classes are only covered at startup versus continuously online (DC category mapping and intervals).
  • Evidence to check: proof-test records tied to events (proof_test_record_id, test_start/end timestamps, and linked cause_code results).
  • First fix: add targeted online diagnostics for the remaining gap classes and tie proof-test execution to immutable log records and evidence snapshots.
Maps: H2-8, H2-10
FAQ 11 A channel shows “OK,” but behaves stale—timestamp/age_ms contract or logging granularity?

Conclusion: “OK but stale” indicates a contract violation (freshness not enforced) or insufficient logging granularity to expose age and window alignment.

  • Evidence to check: input freshness fields (input_timestamp, age_ms) and the rule that transitions stale inputs into DEGRADED/INVALID.
  • Evidence to check: timeline resolution and snapshot binding (timestamp order, window_id consistency, snapshot_id presence per decision).
  • First fix: enforce a hard age_ms threshold in the input contract and log age/window_id at every decision boundary so stale behavior becomes visible and vote-ineligible.
Maps: H2-11, H2-10
FAQ 12 After a CM event, the system recovers too quickly—recovery window or reset preconditions too weak?

Conclusion: Fast recovery after a CM event is a policy weakness: reset is being allowed before channel health and coherency are stable for long enough.

  • Evidence to check: CM transient severity and proximity to reset (CM log entries correlated to reset_event_log and timeline).
  • Evidence to check: reset gate conditions (reset_preconditions_met reasons, recovery_window_ms, latch_state transitions).
  • First fix: require a CM-aware recovery window and strengthen reset preconditions so CM events force a minimum stability period before outputs can exit safe state.
Maps: H2-9, H2-7
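
A CM-aware reset gate along the lines of this first fix might look like the sketch below. The function signature and reason strings are hypothetical; `recovery_window_ms` follows the field name used in the evidence checks.

```python
def reset_preconditions_met(now_ms, last_cm_event_ms, channels_healthy, recovery_window_ms):
    """Return (allowed, reason); reset is refused until the stability period has elapsed."""
    if not channels_healthy:
        return False, "CHANNEL_UNHEALTHY"
    if now_ms - last_cm_event_ms < recovery_window_ms:
        return False, "RECOVERY_WINDOW_ACTIVE"   # CM event too recent
    return True, "OK"
```

Logging the returned reason alongside each reset attempt gives the `reset_event_log` an explanation for every refusal as well as every approval.
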