
Boiler / Furnace Control: Flame Detection, Drives & Comms


Core idea: A boiler/furnace control board is a safety-first controller that proves flame, verifies airflow, and authorizes the gas valve through a hard lockout path—while using rails, sensor chains, and event logs as the evidence to diagnose lockouts, false flame, and intermittent resets.

In practice: Most “mystery faults” become deterministic once flame/air/valve timing is correlated with rail droop and timestamped logs, enabling fast isolation and the smallest safe first fix.

H2-1 Featured Answer

Featured Answer (Engineering Boundary)

A boiler/furnace control board safely manages ignition and steady combustion by proving flame, enforcing interlocks, and driving the inducer/blower, gas valve, and igniter while monitoring temperature/pressure. This page focuses on board-level hardware evidence (signals, power rails, lockout paths, and event logs), not thermostat UI, cloud apps, or protocol/register-map business logic.

The content is organized around five evidence anchors that recur in every chapter: prove what happened, isolate why, then apply the smallest safe fix.

  • Fault code / event log
  • Flame signal (raw + decision)
  • Air-proving (DP/switch)
  • Valve/igniter drive
  • Rail droop / reset reason
Depth rule for this page: each claim must land on measurable board evidence (waveform, logic state, counter/log field), especially for lockout vs false-trip separation.
[Figure F0 diagram: Evidence Spine, five anchors used across the entire page (Symptom → Evidence → Isolate → Fix → Prevent). Blocks: Event Log (fault code, state/time, reset reason); Flame (raw sense, threshold, decision); Air-Proving (DP / switch, tach link, interlock); Drives (valve enable, igniter, fans/pumps); Power Integrity (3V3/5V rail droop, brownout, watchdog/reset; noise injection paths from igniter/relays/fan start).]
Figure F0 — The five evidence anchors used throughout this page (minimal text; measurement-driven).
Cite this figure: Figure F0 Suggested caption: “Evidence spine for boiler/furnace control board debugging.”
H2-2 System Domains

System Block & Safety Domains

A boiler/furnace controller is best understood as five coupled domains: a safety interlock chain that must fail-safe, high-energy actuator drives (fans/valves/igniter), low-level sensing (flame/temperature/pressure), logic & event recording (MCU/WD/log), and rugged communications (RS-485/Ethernet). Clear domain boundaries prevent false lockouts and speed field isolation.

Safety chain = prove + gate + lockout.
The flame decision must directly gate the gas valve enable path, and the lockout cause must be recoverable from an event record (fault code + timestamp + reset reason). This is the backbone for separating true hazards from false trips caused by EMI or rail droop.

What this chapter establishes (so later chapters stay vertical)

  • Safety interlock chain: flame detect → valve enable gating → lockout action (fail-safe).
  • Actuators: inducer/blower/pump/valves/igniter are treated as noise sources and load-step drivers.
  • Sensing: flame, temperature, and pressure/DP are handled as low-level evidence sources that require shielding from switching noise.
  • Low-voltage logic: MCU + AFE + watchdog/brownout + event log (diagnosis starts here).
  • Mains/high-voltage: relay/triac/SSR drive + isolation boundary (not a certification walkthrough).
  • Comms/diagnostics: RS-485/Ethernet hardware robustness (no protocol-stack deep dive).

First measurement entry points (minimal-tool, high-discrimination)

Domain | What it proves | First 2 measurements
Safety chain | Whether lockout is caused by true interlock failure vs false decision | Flame raw + flame_ok decision timing; valve_enable gating vs lockout flag/event
Actuators | Whether load steps inject droop/noise that cascades into sensing or resets | Fan/igniter/valve drive waveform; 3V3/5V rail droop during load events
Sensing | Whether sensor chain is stable and consistent with physics | Sensor supply/reference ripple; ADC/comp output before filtering/decision
Logic & record | Whether resets/WD/brownout are the root cause of “random” failures | Reset pin + reset reason; watchdog behavior + event log fields
RS-485/Ethernet HW | Whether comm activity correlates with flame false trips or dropouts | RS-485 A/B differential + common-mode; PHY reset/link vs rail noise
[Figure F1 diagram: Boiler / Furnace Control Board, domains + safety chain + minimal test points (TP). Mains / High Energy: AC entry + protection; relay / triac / SSR (valve / igniter / loads); isolation boundary. Low-Voltage Logic + Sensing: power rails 3V3/5V + supervisor; MCU + watchdog (event log / fault code, reset / WD); flame detect AFE (raw sense → window → flame_ok); temp / pressure (NTC/RTD, DP, transducer); actuator drives (fan/inducer, valve, igniter; air-proving); comms (RS-485, Ethernet). Test points TP1 to TP6 are marked, with TP3 = flame raw and TP4 = flame_ok. Overlays: safety chain; switching / igniter noise coupling path.]
Figure F1 — Domain view of a boiler/furnace controller. The highlighted safety chain shows how flame evidence gates valve enable and lockout.
Cite this figure: Figure F1 Suggested caption: “Boiler/furnace control board domains and safety-chain test points.”
H2-3 Core Differentiator

Flame Detection AFE Deep Dive

Flame detection becomes reliable when it is treated as a measurable signal chain rather than a binary “has flame / no flame” assumption. The controller must reject ignition noise, ground bounce, and leakage paths while still proving weak flames within a defined time window.

Two sensing routes (choose the correct evidence model)

  • Ionization current / flame rectification: most common; a high-impedance electrode signal is conditioned into a stable “flame present” decision.
  • UV flame sensing: used in specific burners; a phototube/photodiode current is converted by a TIA and filtered to reject ambient-light triggers.
Depth rule for this chapter: every claim must map to two measurable points—the raw flame sense node and the final decision (comparator output or MCU flame status).

Ionization / rectification AFE: pull a weak signal out of ignition noise

1) Limit + protect: survive spikes; preserve baseline recovery
2) Rectify: convert polarity bias into evidence
3) Integrate: average HF noise; create a slow variable
4) Window compare: thresholds + hysteresis for stability
5) Debounce: prove/hold/loss timing (no chatter)

Engineering intent: ignition generates large dv/dt and di/dt events; the AFE must settle fast enough to meet flame-proving time limits while rejecting synchronous spikes and common-mode jumps.
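Stages 4 and 5 of the chain above (window compare with hysteresis, then debounce) can be sketched in a few lines. This is a minimal illustration rather than a qualified flame-safeguard implementation: the class name `FlameDecider`, the thresholds, and the sample counts are hypothetical values chosen for the example, and the input is assumed to be the already rectified and integrated signal.

```python
from dataclasses import dataclass


@dataclass
class FlameDecider:
    """Hysteresis comparator plus prove/loss debounce (illustrative values only)."""
    on_threshold: float = 1.0   # assumed units, e.g. microamps of flame current
    off_threshold: float = 0.6  # lower than on_threshold to create hysteresis
    prove_samples: int = 3      # consecutive samples above on_threshold to assert
    loss_samples: int = 2       # consecutive samples below off_threshold to drop
    flame_ok: bool = False
    _count: int = 0

    def update(self, integrated: float) -> bool:
        if not self.flame_ok:
            # Not proven yet: count consecutive samples above the upper threshold.
            self._count = self._count + 1 if integrated >= self.on_threshold else 0
            if self._count >= self.prove_samples:
                self.flame_ok, self._count = True, 0
        else:
            # Proven: count consecutive samples below the lower threshold.
            self._count = self._count + 1 if integrated <= self.off_threshold else 0
            if self._count >= self.loss_samples:
                self.flame_ok, self._count = False, 0
        return self.flame_ok


def decide(samples, **kwargs):
    """Run a whole trace through a fresh decider and return the decision per sample."""
    decider = FlameDecider(**kwargs)
    return [decider.update(s) for s in samples]


trace = [1.2, 0.2, 1.2, 1.2, 1.2, 0.8, 0.5, 1.1, 0.5, 0.5]
decisions = decide(trace)
```

In this trace the isolated spike at the start and the isolated dip in the middle do not flip the decision; only sustained evidence does, which is the behaviour TP4 should show relative to TP3.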

UV AFE: TIA + filtering to avoid ambient-light false triggers

  • TIA stability: maintain a clean baseline (dark current + leakage) and controlled bandwidth.
  • Filter selection: reject slow ambient changes and supply ripple without delaying flame prove.
  • False triggers: reflections, sunlight, hot surfaces, and sensor contamination must not create a stable “flame_ok”.

High-value failure mode: false flame from leakage or contamination

  • Leakage path raises the integrated node even with no flame (humidity, residues, polluted surfaces).
  • Rectifier/integrator can turn leakage into DC bias that sits inside the comparator window.
  • Verification: compare “no flame” baseline drift across humidity/temperature changes and observe whether the decision flips without any ignition event.
Two-signal proof (mandatory):
Measure TP3 (flame raw) and TP4 (flame_ok decision) together. If TP4 flips while TP3 shows only baseline drift (no ignition/no flame evidence), suspect leakage/thresholding. If TP3 shows large synchronous spikes and TP4 chatters during ignition, suspect coupling/ground bounce/filtering.
[Figure F2 diagram: Flame Detection AFE, ionization (rectification) and UV routes with noise/leakage paths. Ionization route: electrode → limit + protect → rectify → integrate → window compare (threshold + hysteresis) → timing / debounce (prove window, min-hold, loss) → flame decision (comparator out / MCU flame_ok). UV route (specific burners): UV sensor → TIA (current → voltage) → filter (reject ambient) → compare (stable decision). False-trip paths: igniter coupling (synchronous spikes); ground bounce (common-mode jump); leakage / humidity (DC bias without flame). TP3: raw sense. TP4: decision.]
Figure F2 — Flame AFE blocks for ionization and UV sensing, plus common false-trip injection paths (igniter coupling, ground bounce, leakage).
Cite this figure: Figure F2 Suggested caption: “Flame detection AFE and false-trip injection paths for boilers/furnaces.”
H2-4 Evidence Timing

Ignition & Flame-Proving Sequence (Evidence-Timed)

The ignition flow is valuable only when it produces discriminating evidence. Each phase is defined by what must be true (interlocks), what is driven (igniter/valve/fan), and what the controller must record to make field failures recoverable.

Phases and what they must prove

Phase | Purpose | Required interlock | Failure evidence
Purge | Clear chamber; establish airflow | Air-proving true (DP/switch + tach consistency) | DP not reached; fan starts then rail droops; interlock toggles
Ignition | Create ignitable condition | Air-proving stays true; valve still gated | Igniter drive present but TP3 baseline never settles; TP4 chatters
Flame Prove | Prove flame within time window | Valve enabled only in allowed window; flame_ok must become stable | TP3 shows evidence but TP4 never asserts; or asserts then drops
Run | Maintain stable combustion | Flame_ok maintained; protections valid | Flame loss vs false trip separated by TP3+TP4 and event log
Retry / Lockout | Bounded recovery; fail-safe stop | Retry counter; lockout gate must cut valve enable | Log must preserve phase, cause code, and last-sample snapshot
Three-signal consistency check (mandatory):
Igniter drive proves the noise event occurred; valve enable proves the safety gate behavior; flame_ok proves decision stability. If flame_ok appears before valve enable, suspect false flame (leakage/threshold). If valve enable is present but flame_ok never asserts, suspect true no-flame or AFE suppression. If flame_ok chatters synchronized with ignition, suspect coupling/ground bounce/filtering.
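The consistency rules above reduce to a small decision function once the capture has been turned into timestamps. The sketch below is one possible encoding, assuming times are measured in seconds from the start of the ignition phase and `None` means the signal never asserted; the function name, the `chatter_limit` default, and the returned strings are hypothetical.

```python
def classify_ignition(valve_enable_t, flame_ok_t, flame_ok_toggles=0, chatter_limit=2):
    """Map three-signal timing onto the suspicion named in the consistency check.

    valve_enable_t / flame_ok_t: first assertion time in seconds, or None.
    flame_ok_toggles: number of flame_ok edges seen while the igniter was driven.
    """
    if flame_ok_toggles > chatter_limit:
        # Decision chatters in sync with ignition.
        return "coupling / ground bounce / filtering"
    if flame_ok_t is not None and (valve_enable_t is None or flame_ok_t < valve_enable_t):
        # Flame "proven" before gas could have been admitted.
        return "false flame (leakage / threshold)"
    if valve_enable_t is not None and flame_ok_t is None:
        return "true no-flame or AFE suppression"
    if valve_enable_t is None:
        # Valve never enabled: the interlock or lockout path is blocking.
        return "valve gated off (check interlock / lockout path)"
    return "consistent"


result_false_flame = classify_ignition(valve_enable_t=4.0, flame_ok_t=1.5)
result_no_flame = classify_ignition(valve_enable_t=4.0, flame_ok_t=None)
```

The chatter test is evaluated first because a chattering decision makes the first-assertion timestamps unreliable.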
[Figure F4 diagram: Ignition Timeline Evidence Strip. Three signals (igniter_drive, valve_enable, flame_ok) aligned with the phase windows Purge, Ignition, Prove, Run, Retry / Lockout. Annotations: prove window; igniter pulses; valve gated by safety; flame_ok min-hold and loss debounce; log snapshot; lockout code; phase boundaries.]
Figure F4 — A compact evidence timeline: align igniter drive, valve enable, and flame_ok with purge/ignition/prove/run/retry/lockout phases and log snapshots.
Cite this figure: Figure F4 Suggested caption: “Ignition evidence timeline for boilers/furnaces (igniter/valve/flame + log snapshots).”
H2-5 Most Common Pain

Fan / Blower Drive + Air-Proving (DP Interlock)

Blower and draft-inducer issues are the fastest way to trigger no-ignition, unstable flame, or “high-speed only” lockouts. The reliable approach is a closed evidence loop: command → fan rail → tach → DP response → interlock decision, with coupling checks into the flame AFE.

Board-level interfaces (what to measure first)

  • TP-FAN_V: fan supply rail
  • TP-PWM: command duty / frequency
  • TP-EN: enable / gate
  • TP-TACH: FG / Hall tach
  • TP-DP: DP sensor or DP switch state
  • TP3: flame raw (coupling check)

Control boards often drive an external fan module or a power stage through PWM/EN and then rely on tach + DP as the “air-proving” proof. The goal is not perfect speed control here; the goal is stable interlock evidence.

Two mandatory evidence chains (non-negotiable)

  • Chain A — Fan rail droop + tach build: capture TP-FAN_V during start and speed step; verify TP-TACH reaches a stable frequency without missing pulses.
  • Chain B — DP response curve: DP switch/state must change at expected airflow; DP analog must be monotonic with speed steps (not necessarily linear, but consistent).
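Chain B's monotonicity requirement is easy to automate once speed steps and DP readings are logged as pairs. The following is a minimal sketch under the assumption that each pair is `(tach_rpm, dp_value)`; the function name, units, and `tolerance` parameter are hypothetical.

```python
def dp_is_monotonic(speed_dp_pairs, tolerance=0.0):
    """Return True if DP never decreases (beyond tolerance) as fan speed increases.

    speed_dp_pairs: iterable of (tach_rpm, dp_value); order does not matter.
    tolerance: allowed DP decrease between adjacent speed steps (measurement noise).
    """
    pairs = sorted(speed_dp_pairs)  # sort by speed so adjacent pairs are adjacent steps
    return all(nxt[1] >= cur[1] - tolerance for cur, nxt in zip(pairs, pairs[1:]))


healthy = [(1000, 20), (2000, 55), (3000, 130)]   # nonlinear but consistent
suspect = [(1000, 20), (2000, 15), (3000, 130)]   # DP dips at the middle step
healthy_ok = dp_is_monotonic(healthy)
suspect_ok = dp_is_monotonic(suspect)
```

The check deliberately does not test linearity, since the text only requires the DP curve to be consistent with speed.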

Symptoms → discriminators (minimal measurements)

Symptom | First 2 measurements | Discriminator (what it proves)
Fan does not spin | TP-PWM/TP-EN + TP-FAN_V | If command exists but TP-FAN_V collapses, suspect inrush or rail limitation; if rail is stable but tach absent, suspect harness/driver/fan module.
Speed unstable / tach jitter | TP-TACH + low-voltage rail (3V3/5V) | Tach dropouts aligned with low-voltage ripple indicate common return/power integrity; tach noise with stable rails points to input conditioning or harness coupling.
DP never proves (air-proving fail) | TP-DP + TP-TACH | Normal tach but DP flat indicates a DP chain issue; low tach together with DP fail indicates fan rail/drive or excessive load (check TP-FAN_V to separate the two).
Only fails at high speed | TP-FAN_V droop at speed step + TP3 flame raw | If a high-speed step causes rail droop and TP3 noise to rise together, suspect shared return or injection into flame AFE; if droop dominates without TP3 noise, suspect rail capacity/inrush.
Coupling check (high value):
When fan speed changes, observe whether TP3 (flame raw) noise increases in sync with TP-PWM edges or with TP-FAN_V droop. PWM-synchronous noise suggests harness/edge coupling; droop-synchronous noise suggests common return or supply injection that shifts the AFE reference.
[Figure F5 diagram: Fan / Blower + Air-Proving Evidence Loop (command → fan rail → tach → DP response → interlock, plus flame AFE coupling check). Control board domain: MCU (PWM / EN) → driver / switch (MOSFET / IC) → fan supply rail (inrush / droop, TP-FAN_V). Fan / airflow domain: blower / inducer (load + inertia) → tach / FG (TP-TACH) → DP sensor/switch (TP-DP) → interlock (allow ignition). Coupling risk to flame AFE: harness / return path (edge + current loop) → noise injection (PWM-sync or droop-sync) → flame AFE raw sense (TP3).]
Figure F5 — Fan drive evidence loop (command → rail → tach → DP → interlock) and the coupling paths that can inject fan noise into the flame AFE (TP3).
Cite this figure: Figure F5 Suggested caption: “Fan/DP interlock evidence loop and flame AFE coupling paths.”
H2-6 Safety Actuation

Gas Valve / Damper / Pump Drivers (Protection + Hard Cut)

Actuation faults must be separated into logic (valve_enable), energy delivery (coil/motor voltage or current), and safe turn-off (clamp and hard-cut behavior). This chapter focuses on board-level drive topologies, protection, and measurable proof that lockout removes valve energy even if firmware fails.

Load map (classify by measurable evidence)

  • AC gas valve: relay / triac / SSR. Evidence is coil voltage waveform and guaranteed de-energize under lockout.
  • DC solenoid valve: low-side MOSFET or high-side switch. Evidence is coil current ramp + flyback clamp at turn-off.
  • Damper / 3-way valve: DC motor (H-bridge) or stepper. Evidence is drive phase waveforms + enable gating.
  • Pump: relay/triac/driver. Evidence is supply step response and return-path stress during start/stop.

Two mandatory evidence chains

  • Chain A — valve_enable + coil energy: capture valve_enable together with coil voltage or coil current; both must collapse when lockout asserts.
  • Chain B — turn-off transient: observe the turn-off spike and verify the clamp path (diode/TVS/snubber) keeps stress bounded.

Common failures → discriminators (minimal measurements)

Failure | First 2 measurements | Discriminator (what it proves)
Valve does not actuate | valve_enable + coil V/I | If enable is present but coil energy is absent, suspect drive path (relay/triac/MOSFET/harness). If enable is absent, interlock/sequence is blocking (refer to H2-4 timing).
Actuates then drops out | coil current vs supply droop | Current that ramps and then collapses with rail droop or gating indicates supply/drive limitation; stable current while the actuator still drops out suggests a non-electrical cause (use the capture to exclude the board path).
Coil overheats | steady-state current + duty/hold control | Overcurrent or excessive hold energy points to gating/drive strategy; verify the commanded hold state matches measured current.
Driver device fails (blows) | turn-off spike + clamp node | Unclamped spikes or long loops indicate insufficient clamp or poor return path; verify that the clamp (diode/TVS/snubber) actually conducts at the clamp node.
Hard-cut requirement (safety):
Lockout must remove valve energy through a hardware gating path that does not rely on firmware timing. Verification requires capturing lockout/valve_enable together with coil V/I and confirming that coil energy collapses immediately when lockout is asserted.
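The hard-cut verification above can be scored from a logged capture instead of by eye. The sketch below measures the delay between lockout asserting and coil current collapsing; the sample layout `(t_us, lockout, coil_current_a)`, the function name, and the 10 mA "off" threshold are assumptions for illustration, and the acceptable delay has to come from the applicable safety requirement, not from this example.

```python
def hard_cut_delay_us(samples, off_current_a=0.01):
    """Return microseconds from lockout assertion to coil current <= off_current_a.

    samples: iterable of (t_us, lockout, coil_current_a) in time order.
    Returns None if lockout never asserts or the coil never de-energizes
    (the latter is the "must-prove-off" failure case).
    """
    t_lockout = None
    for t_us, lockout, current in samples:
        if t_lockout is None and lockout:
            t_lockout = t_us  # first sample where lockout is seen asserted
        if t_lockout is not None and current <= off_current_a:
            return t_us - t_lockout
    return None


capture = [
    (0,    False, 0.30),
    (1000, True,  0.30),   # lockout asserts
    (2000, True,  0.12),   # coil current decaying through the clamp
    (3000, True,  0.00),   # coil de-energized
]
stuck = [(0, False, 0.30), (1000, True, 0.30), (2000, True, 0.30)]
delay = hard_cut_delay_us(capture)
stuck_delay = hard_cut_delay_us(stuck)
```

A `None` result on a capture where lockout did assert is the case that matters most: energy persisted after the gate should have cut it.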
[Figure F6 diagram: Actuator Drivers + Protection + Hard Cut (valve_enable → drive stage → load energy, with clamp and lockout gating proof). Control + safety gating: MCU commands → hard gate (lockout cuts), TP-VE valve_enable. Drive stages and loads: AC gas valve (relay, triac, SSR option); DC solenoid (MOSFET, HS switch, flyback clamp, TP-I); damper / valve (H-bridge, stepper, enable gating). Protection: TVS / snubber / clamp / current sense. Proof: TP-VE + coil V/I + clamp node; TP-V coil voltage.]
Figure F6 — Driver map for AC valves, DC solenoids, and dampers with protection blocks and a highlighted hard-cut gate. Verification uses TP-VE with coil V/I and clamp behavior.
Cite this figure: Figure F6 Suggested caption: “Actuator driver map with protection and lockout hard-cut proof points.”
H2-7 Robustness > Accuracy

Temperature & Pressure Sensing Chain (Robust Diagnostics)

The sensing chain should be engineered for robust field behavior rather than peak accuracy. Most “random over-temp” and “pressure false alarm” cases are traceable to raw ADC artifacts, sensor-supply/Vref ripple, and missing consistency gates against flame/airflow state.

Board-level measurement points (start here)

  • TP-ADC_T: temp raw codes (pre-filter)
  • TP-ADC_P: pressure raw codes (pre-filter)
  • TP-VREF: ADC reference ripple
  • TP-SENS_V: sensor supply ripple
  • TP3: flame raw (consistency)
  • TP-DP: DP state/value (air-proving)
  • TP-TACH: tach frequency

The minimum evidence pair for any jumpy sensor complaint is: pre-filter samples plus Vref / sensor-supply ripple. Without both, “filter tuning” becomes guesswork.

Temperature chain: multi-point sensors, noise, and open/short resilience

  • Multi-point role separation: supply/return water, flue, and heat-exchanger sensors should enable cross-checks (a single channel jumping alone is a strong wiring/chain signal).
  • Filter strategy (field-first): combine a light low-pass for ripple with outlier suppression for contact bounce; diagnostics should observe pre-filter behavior in parallel.
  • Open/short detection: detect rail-saturated codes early (before the protection logic interprets them as real over-temp).
  • Harness and return-path coupling: correlate sensor jumps with fan speed steps or ignition events to separate real thermal change from injected noise.

Pressure chain: ratiometric pitfalls and reference contamination

  • Sensor supply ripple: pressure sensors often track supply; a noisy TP-SENS_V can appear as “pressure oscillation”.
  • Vref ripple: a moving TP-VREF turns every ADC channel into a power-noise probe; pressure false alarms frequently align with load steps.
  • MUX/settling artifacts: fast channel scanning without adequate settle time can create step-like jumps in raw codes (distinct from true pressure dynamics).

Three-gate diagnostics: range + rate + consistency

Gate | What it blocks | Evidence to verify
Range check | Open/short or impossible sensor states being treated as real conditions | TP-ADC_T / TP-ADC_P saturations + explicit fault code (do not rely on filtered value)
Rate check | Unphysical jumps driving over-temp / pressure trips | Pre-filter sample-to-sample delta (raw slope), correlated with load-step timestamps
Consistency check | False trips that contradict combustion/airflow state | Temperature rise should align with flame_ok; DP changes should align with tach and commanded fan state
Minimal discriminator rules:
If raw ADC steps track TP-VREF or TP-SENS_V ripple, the root cause is reference/supply injection. If only a single channel jumps while others remain stable, suspect wiring/contact/chain. If temperature/pressure alarms contradict flame_ok or DP/tach state, tighten the consistency gate before adjusting thresholds.
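The range, rate, and consistency gates can be applied to each pre-filter sample in a fixed order. The sketch below is illustrative only: it assumes a 12-bit ADC whose code increases with temperature, and the limits (`lo`, `hi`, `max_step`, `rise_step`) and the function name are hypothetical placeholders, not recommended values.

```python
def gate_temp_sample(raw, prev_raw, flame_ok, lo=50, hi=4000, max_step=200, rise_step=50):
    """Classify one raw temperature code against the three gates.

    raw / prev_raw: pre-filter ADC codes (assumed 12-bit, rising with temperature).
    flame_ok: current flame decision, used as the consistency cross-check.
    """
    # Gate 1, range: rail-saturated codes mean open/short, not a real temperature.
    if raw <= lo or raw >= hi:
        return "range_fault"
    step = raw - prev_raw
    # Gate 2, rate: a jump larger than physics allows is rejected, not acted on.
    if abs(step) > max_step:
        return "rate_reject"
    # Gate 3, consistency: a clear rise with no flame contradicts combustion state.
    if step > rise_step and not flame_ok:
        return "inconsistent"
    return "ok"


open_sensor = gate_temp_sample(4095, 2000, flame_ok=True)
spike = gate_temp_sample(2600, 2000, flame_ok=True)
no_flame_rise = gate_temp_sample(2100, 2000, flame_ok=False)
normal = gate_temp_sample(2100, 2000, flame_ok=True)
```

Running the range gate first keeps a saturated code from ever reaching the rate or consistency logic, which matches the advice to detect open/short before protection logic interprets the value.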
[Figure F7 diagram: Temp & Pressure Robust Sensing Chain (pre-filter ADC evidence + Vref/sensor-supply ripple + range/rate/consistency gates). Sensors (multi-point): temp supply, return, flue, HEX; pressure water, gas / DP. Analog front-end: bias / excite (divider / I-source); input RC + protect (open/short ready). ADC + sampling: pre-filter data; MUX / settling (scan artifacts); TP-ADC. Diagnostics & evidence: range, rate, consistency; cross-check inputs are flame, DP (TP-DP), and tach (TP-TACH). Reference / supplies: Vref (TP-VREF); sensor V (TP-SENS_V); load-step noise.]
Figure F7 — Robust sensing chain: pre-filter ADC evidence plus Vref/sensor-supply ripple. Range/rate/consistency gates reduce false over-temp/pressure trips.
Cite this figure: Figure F7 Suggested caption: “Temp/pressure robust sensing chain with Vref and sensor-supply injection paths.”
H2-8 Board Consequences

Power Tree & Brownout / Surge Robustness (Board-Level Consequences)

This chapter avoids power-topology derivations and focuses on how rail events corrupt safety evidence. Load steps from ignition, fans, and valves can produce rail droop, threshold shift, and partial resets, leading to false trips or confusing lockouts unless reset/windowing and logs are designed for proof.

Power domains (what each rail “hosts”)

  • 12V (actuation domain): fans, relays/triacs/valves, ignition-related loads; primary source of load-step stress.
  • 5V (intermediate/sensor domain): sensor supply, comms, auxiliary logic; vulnerable to injection from 12V events.
  • 3V3 (logic/AFE domain): MCU, comparators/AFE, state tracking; most sensitive to brownout and reference drift.

Brownout classes (why “no reset” can still fail)

Class | What happens | Typical board symptom | Evidence to capture
Threshold shift (no reset) | Rails sag but do not cross POR/BOD; analog thresholds and references move | False flame/over-temp/DP trip without a reset record | TP-3V3/TP-5V ripple + TP-VREF + trip timestamp
Hard reset | POR/BOD triggers; state machine restarts | Unexpected lockout or retry loops | TP-RESET low pulse + event log “reset cause”
Partial init / timing window | Rails recover during boot; sampling begins before AFE/refs settle | Early false codes right after power-on | Boot timeline + first-sample snapshot vs rail settling

Power-on & reset windowing (safety default + valid sampling window)

  • Safety default: during boot/reset, valve energy must remain gated off by hardware (lockout/hard gate), not by firmware timing.
  • Valid sampling window: flame/temperature/pressure evidence should be accepted only after rails and references settle; pre-window samples should be logged as “invalid,” not interpreted as faults.
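The three brownout classes can be told apart from a handful of captured facts: the rail minimum, whether a reset was recorded, and whether the first accepted sample came before the rails settled. The sketch below encodes that triage; the function name, the 3.3 V nominal, and the 150 mV droop tolerance are assumptions chosen for illustration.

```python
def classify_rail_event(min_rail_v, reset_seen, first_sample_ms=None,
                        rails_settled_ms=None, nominal_v=3.3, droop_tol_v=0.15):
    """Triage a rail event into the brownout classes used in this chapter.

    min_rail_v: lowest rail voltage observed during the load step.
    reset_seen: True if TP-RESET pulsed low or the log holds a reset cause.
    first_sample_ms / rails_settled_ms: boot-relative times, if known.
    """
    if reset_seen:
        sampled_early = (first_sample_ms is not None
                         and rails_settled_ms is not None
                         and first_sample_ms < rails_settled_ms)
        # Sampling before rails/references settle produces early false codes.
        return "partial init / timing window" if sampled_early else "hard reset"
    if nominal_v - min_rail_v > droop_tol_v:
        # Rail sagged enough to move thresholds but never crossed POR/BOD.
        return "threshold shift (no reset)"
    return "no rail event"


sag_only = classify_rail_event(3.0, reset_seen=False)
reset_only = classify_rail_event(2.5, reset_seen=True)
early_sample = classify_rail_event(2.5, True, first_sample_ms=5, rails_settled_ms=20)
```

Samples taken in the `"partial init / timing window"` case are the ones the text recommends logging as invalid instead of interpreting as faults.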
Surge protection (mention only):
MOV/TVS/fuse/inrush limiting should be treated as “rail event shapers.” Detailed implementation belongs to the EMC/Safety subsystem page; here the focus remains on rail consequences, reset records, and proof.

Two mandatory evidence chains

  • Chain A — rail waveform + reset/WD: capture TP-3V3 and TP-5V during fan/ignition/valve steps, together with TP-RESET (or reset cause) and watchdog status.
  • Chain B — event log timeline: timestamps for load-step events and fault codes must align with the rail behavior (otherwise the logging chain is not robust under stress).
[Figure F8 diagram: Power Domains & Brownout Evidence Map (load step → rail droop → threshold shift / reset → false trip unless windowed and logged). 12V domain: fan / valve / relay / ignition loads; load-step events (fan start, ignition); TP-12V. 5V domain: sensors / comms / aux logic; Vref / sensor V; TP-5V, TP-VREF. 3V3 domain: MCU / AFE / comparators; state + trip logic; TP-3V3. Distribution / conversion: rail coupling paths. Evidence + records: reset / WD (TP-RESET, TP-WD); event log; false trip risk.]
Figure F8 — Power domains and how load steps can cause rail droop, threshold shift, resets, and false trips unless sampling is windowed and reset/WD/log evidence is robust.
Cite this figure: Figure F8 Suggested caption: “Power domains, brownout classes, and proof points (TP-3V3/5V + reset/WD + logs).”
H2-9 HW Robustness

RS-485 / Ethernet Hardware Integration (Evidence-Driven)

Intermittent comms, random dropouts, and post-surge port failures are usually rooted in common-mode stress, ESD/surge paths, and PHY/transceiver supply or reset integrity. This section stays at the hardware layer: measure the physical evidence first, then decide whether termination, biasing, isolation, shielding reference, or rail windowing is the true limiter.

Board-level test points (minimum set)

  • TP-485_A: bus line A
  • TP-485_B: bus line B
  • TP-485_CM: common-mode (A+B)/2
  • TP-485_VIO: transceiver I/O rail
  • TP-485_EN: DE/RE enable
  • TP-ETH_RAIL: PHY rail ripple
  • TP-ETH_RST: PHY reset
  • TP-LINK: link status/LED drive
  • TP-CHASSIS: shield/chassis reference

For “CRC errors” or “sometimes works”: capture A/B differential and common-mode together. For “Ethernet drops”: capture PHY rail and PHY reset together.

RS-485 transceiver: what matters for field faults

  • ESD and surge tolerance: prevents latent port damage that manifests as rare errors after a discharge event.
  • Fault protection: survives A/B shorts to GND/V and bus contention without burning the interface.
  • Common-mode range (CMR): directly determines whether ground potential differences become CRC errors and random framing loss.
  • Failsafe behavior: avoids floating-bus noise being interpreted as data during idle or wiring faults.

Termination & biasing (principles, proven by measurements)

  • Biasing goal: idle A–B should settle to a stable polarity; a floating idle that wanders typically increases error bursts.
  • Termination goal: reduce reflections at the ends of the line; incorrect placement often shows up as edge ringing and timing margin loss.
  • Evidence pair: measure A/B differential plus TP-485_CM. Large common-mode jumps frequently track the error bursts more than the differential amplitude does.
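The evidence pair is a direct computation on a two-channel capture: differential is A minus B, and common-mode is (A+B)/2 as defined for TP-485_CM. The sketch below assumes both lines were sampled simultaneously against local ground, in volts; the function names are hypothetical. The default limits reflect the -7 V to +12 V common-mode range of the RS-485 standard and should be replaced by the datasheet range of the transceiver actually fitted.

```python
def diff_and_cm(a_volts, b_volts):
    """Return [(differential, common_mode), ...] for simultaneous A/B samples."""
    return [(a - b, (a + b) / 2.0) for a, b in zip(a_volts, b_volts)]


def cm_violations(a_volts, b_volts, cm_min=-7.0, cm_max=12.0):
    """Return sample indices where common-mode leaves the allowed range."""
    return [i for i, (_, cm) in enumerate(diff_and_cm(a_volts, b_volts))
            if not cm_min <= cm <= cm_max]


# Second sample: same 2 V differential, but ground shift pushes common-mode to 13 V.
line_a = [3.0, 14.0]
line_b = [1.0, 12.0]
pairs = diff_and_cm(line_a, line_b)
bad = cm_violations(line_a, line_b)
```

The example shows why differential amplitude alone is misleading: both samples have an identical 2 V differential, yet only the common-mode trace reveals the stress.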

Ethernet: PHY + magnetics + common-mode management

  • PHY rail and reset integrity: transient rail droop or reset glitches can create “link flaps” without any protocol involvement.
  • Magnetics isolation: transformer isolation helps, but common-mode transients can still couple through parasitics and shield paths.
  • CMC/ESD at the connector: treat these as the port’s “stress steering” elements; a damaged protection part can cause leakage that drags rails or distorts common-mode.

Isolation decision (when it is justified)

Interface | Trigger condition (principle) | Hardware evidence | What isolation changes
RS-485 | Uncontrolled ground potential difference, long cables, repeated surge/ESD exposure | TP-485_CM frequently approaches/exceeds transceiver CMR; errors correlate with CM jumps | Breaks the ground loop; improves CM headroom; shifts stress to isolated barrier + port protection
Ethernet | Shield/common-mode is the dominant coupling path; repeated port damage events | Link flaps correlate with TP-ETH_RAIL / TP-ETH_RST; shield reference behavior correlates with failures | Magnetics already isolate data; emphasis becomes shield/chassis strategy and rail/reset robustness
Minimal discriminator rules:
If RS-485 errors rise while TP-485_CM swings widely, prioritize common-mode/ground reference and isolation decisions. If Ethernet drops align with TP-ETH_RAIL ripple or TP-ETH_RST glitches, prioritize PHY supply windowing and reset conditioning. If a surge event precedes persistent failure, check whether the port protection now leaks or clamps abnormally (evidence first; no protocol assumptions).
[Figure F9 diagram: RS-485 / Ethernet Hardware Integration (measure differential + common-mode, plus PHY rail/reset and link status). Control board domain: MCU (UART / MAC) → RS-485 transceiver (ESD / fault-protected, wide CMR; TP-485) → optional isolation (digital isolator + iso DC-DC) → RS-485 port entry (TVS, series-R, bias/term); A / B and CM (A+B)/2 at TP-485_CM. Ethernet hardware path: PHY rail / reset (TP-ETH_RAIL, TP-ETH_RST) → magnetics → CMC + ESD → RJ45 / port; shield → chassis (TP-CHASSIS); link status (TP-LINK); CM stress.]
Figure F9 — Hardware integration map: RS-485 differential/common-mode evidence and Ethernet PHY rail/reset + link status, with protection and isolation options.
Cite this figure: Figure F9 Suggested caption: “RS-485/Ethernet port protection, isolation option, and proof points (TP-485_CM, TP-ETH_RAIL/RST, TP-LINK).”
H2-10 Verifiable Safety

Functional Safety Hooks: Lockout, Self-Test, Event Recording

Safety must be expressed as verifiable engineering hooks: a lockout matrix tied to evidence, self-tests that detect stuck inputs and actuator faults, and an event record that reconstructs failures with minimal ambiguity. The goal is a proof-friendly chain: event log + reset reason + flame/air/valve triad.

Lockout trigger matrix (evidence-bound)

Trigger | Safety action | Required evidence snapshot | Notes (anti-false-trip)
Flame fail / flame loss | Immediate valve energy off + lockout/retry policy | flame_ok + valve_enable + timestamp | Debounce a momentary drop only while valve energy remains in a safe state
Air-proving fail (DP/pressure) | Block ignition/valve enable; escalate to lockout if persistent | DP state/value + fan command/tach + timestamp | Consistency: DP should track tach and commanded fan state
Over-temp | Reduce/stop heat; lockout if hard limit is crossed | temp raw/filtered + flame_ok + stage + timestamp | Reject single-sample spikes that contradict flame/air state
Pressure fault | Block ignition/valve; lockout if unsafe region reached | pressure raw + Vref/sensor supply status + timestamp | Differentiate sensor-chain noise vs real pressure change (pre-filter evidence)
Stuck relay/triac / valve drive fault | Hard gate off + record fault + lockout | command vs feedback/energy signature + timestamp | Treat as “must-prove-off” fault class

Hard cut-off path (survives MCU failure)

  • Lockout latch / gate: a hardware gate must collapse valve energy even if firmware is stalled.
  • Proof requirement: when lockout asserts, valve_enable and the valve energy signature must drop consistently.
  • Capture: log the lockout reason together with the last known flame_ok, DP, and valve_enable states.

Self-tests (signal-verifiable)

  • Flame input: stuck-high / stuck-low
  • Temp/Pressure: open/short saturation
  • Valve drive: open/short / no-response
  • Stuck relay/triac: command-off yet energy persists
  • DP vs tach: inconsistency

Each self-test should produce a recordable contradiction (command vs evidence), not a “silent fix.” The key is to capture the raw state at the moment the test fails.

Event recording: minimum fields that reconstruct failure

  • Timestamp (at least monotonic ordering)
  • Stage (purge / ignite / prove / run / retry / lockout)
  • Lockout reason (single enumerated code)
  • Evidence snapshot: flame_ok, DP state/value, valve_enable, fan command/tach, temp/pressure status flags
  • Retry counters (attempt index and remaining budget)
  • Reset reason (POR/BOD/WD/other)
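The minimum field list above maps naturally onto a fixed record type. The sketch below shows one possible layout together with a helper that flags command-versus-evidence contradictions in a single record; all type names, field names, and units are hypothetical, and a real log would use whatever encoding the firmware and storage budget allow.

```python
from dataclasses import dataclass
from enum import Enum


class Stage(Enum):
    PURGE = "purge"
    IGNITE = "ignite"
    PROVE = "prove"
    RUN = "run"
    RETRY = "retry"
    LOCKOUT = "lockout"


class ResetReason(Enum):
    NONE = "none"
    POR = "por"
    BOD = "bod"
    WD = "wd"
    OTHER = "other"


@dataclass(frozen=True)
class EventRecord:
    seq: int                 # monotonic ordering is the minimum timestamp requirement
    stage: Stage
    lockout_reason: int      # single enumerated code (0 = none, by assumption)
    flame_ok: bool
    dp_ok: bool
    valve_enable: bool
    fan_cmd_pct: int
    tach_hz: float
    sensor_flags: int        # temp/pressure status bits
    retry_index: int
    retries_left: int
    reset_reason: ResetReason


def contradictions(rec):
    """List evidence contradictions visible inside one snapshot."""
    issues = []
    if rec.valve_enable and rec.stage is Stage.LOCKOUT:
        issues.append("valve_enable set during lockout")
    if rec.flame_ok and not rec.valve_enable:
        issues.append("flame_ok without valve_enable")
    if rec.dp_ok and rec.tach_hz == 0:
        issues.append("DP proven with no tach")
    return issues


rec = EventRecord(seq=1, stage=Stage.LOCKOUT, lockout_reason=3, flame_ok=True,
                  dp_ok=True, valve_enable=False, fan_cmd_pct=60, tach_hz=0.0,
                  sensor_flags=0, retry_index=2, retries_left=1,
                  reset_reason=ResetReason.NONE)
found = contradictions(rec)
```

Keeping the snapshot in the same record as the lockout reason is what lets a contradiction be recovered after the fact without a scope attached.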

Watchdog gating (prevents “fake running”)

  • Feed condition: watchdog feed should be allowed only when critical safety checks have executed and evidence is internally consistent.
  • WD record: on WD reset, store reset cause and the last stage + last lockout reason so the next boot can report a coherent story.
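The feed condition can be expressed as a single predicate evaluated once per control loop. This is a minimal sketch assuming the firmware tracks which safety checks ran in the current cycle; the check names in `REQUIRED_CHECKS` and the function name are hypothetical.

```python
# Hypothetical names for the checks that must execute every cycle.
REQUIRED_CHECKS = frozenset({"flame", "air_proving", "valve_gate"})


def may_feed_watchdog(checks_run, evidence_consistent):
    """Allow a watchdog feed only if every required check ran and evidence agrees.

    checks_run: names of the safety checks that executed this cycle.
    evidence_consistent: result of the flame/air/valve consistency gate.
    """
    return REQUIRED_CHECKS.issubset(checks_run) and bool(evidence_consistent)


full_cycle = may_feed_watchdog({"flame", "air_proving", "valve_gate"}, True)
skipped_check = may_feed_watchdog({"flame", "valve_gate"}, True)
```

Tying the feed to executed checks, not to a timer interrupt, is what keeps a stalled safety task from looking like a healthy system.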
Failure reproduction template (mandatory triad):
Collect event log (timestamp + stage + reason), reset reason, and the triad waveform/states: flame_ok + air-proving (DP) + valve_enable/energy signature. If lockout occurs without reset, suspect threshold shift/noise injection; if WD reset aligns with load steps, suspect rail windowing and boot sampling timing.
[Figure F10 diagram: Functional Safety Hooks (Verifiable): lockout matrix + hard gate + self-test + event recorder + reset/WD linkage. Evidence inputs: flame (TP3); air / DP (TP-DP); temp / press; drive FB; reset reason / WD status (TP-RESET, TP-WD). Safety logic: lockout matrix (reason → action); self-test checks (stuck-high/low, open/short); consistency gates (flame ↔ air ↔ valve evidence triad). Hard cut-off + recording: hard gate / latch (survives MCU stall); valve energy off (TP-VE); event recorder (timestamp, stage, reason, retry, snapshot); reset reason + WD cause (proof fields).]
Figure F10 — Verifiable safety hooks: evidence inputs → lockout matrix/self-tests → hard gate + event recorder with reset/WD linkage.
H2-11 SOP / Field Debug

Validation & Field Debug Playbook (Symptom → Evidence → Isolate → Fix)

This playbook is designed for fast isolation with minimal tools. Every symptom is forced to land on one (or more) evidence anchors: flame, air (DP), valve, rails, log. Use the same 5-line template repeatedly to avoid “maybe causes” and converge on a discriminator.

  • First 2 measurements: highest information per minute
  • Discriminator: one rule that splits causes
  • First fix: smallest change first
  • Prevent: next-rev hardening

Required triad for safety-related incidents: event log + reset reason + flame/air/valve. If any of these are missing, add the missing hook before deep analysis.

1) Ignition fails (no flame proving) [flame • valve • log]

  • Symptom: Purge/ignite runs, but no flame_prove within the proving window.
  • First 2 measurements: (1) Flame sense raw (ionization waveform); (2) valve_enable + igniter_drive timing.
  • Discriminator: If valve_enable and igniter_drive are present yet the flame waveform never crosses the expected window → flame AFE threshold/injection dominates. If valve_enable is blocked → the interlock/lockout path dominates (air/limit chain).
  • First fix: Shift the flame sampling/proving window away from the highest ignition-noise interval; tighten input current limiting and reduce common-mode injection into the flame front end.
  • Prevent: Record the proving-window start/stop timestamps, the flame_ok decision, and the last flame_raw snapshot in the event log.

2) Flame drops after 2–10 s [flame • valve • log]

  • Symptom: Flame is proven, then flame loss triggers retry/lockout shortly after.
  • First 2 measurements: (1) flame_ok and/or flame_raw; (2) valve_enable (or valve energy signature if available).
  • Discriminator: If flame_ok drops before valve_enable drops → either true flame loss or flame signal corruption. If valve_enable drops first → hard gate, drive fault, or rail window dominates.
  • First fix: Add a short flame-loss debounce only when valve energy is confirmed stable; otherwise cut immediately and diagnose injection paths.
  • Prevent: Log a “3-signal snapshot” at flame loss: flame_ok + DP state + valve_enable.
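
The "which signal fell first" discriminator is easy to automate once flame_ok and valve_enable are captured on a common time base. The sketch below assumes a hypothetical capture format of (time_ms, flame_ok, valve_enable) tuples; the function names are illustrative.

```python
from typing import Optional, Sequence, Tuple

Sample = Tuple[int, bool, bool]  # (time_ms, flame_ok, valve_enable)


def first_falling_edge(samples: Sequence[Sample], index: int) -> Optional[int]:
    """Return the timestamp of the first True -> False transition of one signal."""
    prev = None
    for s in samples:
        cur = s[index]
        if prev is True and cur is False:
            return s[0]
        prev = cur
    return None


def flame_loss_path(samples: Sequence[Sample]) -> str:
    flame_t = first_falling_edge(samples, 1)
    valve_t = first_falling_edge(samples, 2)
    if flame_t is None and valve_t is None:
        return "no drop captured"
    if valve_t is None or (flame_t is not None and flame_t < valve_t):
        return "flame path: true flame loss or flame signal corruption"
    if flame_t is None or valve_t < flame_t:
        return "valve path: hard gate, drive fault, or rail window"
    return "simultaneous at this sample rate: capture faster"
```

If both edges land in the same sample, the capture rate is too slow to decide, which is itself a useful result.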

3) Fan start causes reset or fault [rails • log • air]

  • Symptom: Draft inducer/blower command triggers MCU reset, brownout, or immediate fault.
  • First 2 measurements: (1) 3.3 V/5 V rail waveform during fan start; (2) reset reason (POR/BOD/WD) + event timestamp.
  • Discriminator: If the rail crosses the BOR/BOD threshold and reset asserts → brownout is confirmed. If rails stay inside the window but a fault occurs → sensor/AFE injection or timing windowing dominates.
  • First fix: Separate the high di/dt fan domain return from the logic/AFE return; adjust reset/boot sampling windows to occur after rail stabilization.
  • Prevent: Add “load step validation”: fan start/stop at cold/hot line while logging rails + reset reason.
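
A rough sketch of the rail discriminator, applied to a captured rail trace plus the logged reset reason. The function name, the string reset codes, and the example threshold are assumptions; the real BOR/BOD threshold comes from the supervisor or MCU datasheet.

```python
from typing import Optional, Sequence


def fan_start_verdict(rail_samples_v: Sequence[float],
                      bor_threshold_v: float,
                      reset_reason: Optional[str]) -> str:
    """Combine the rail minimum during fan start with the logged reset reason."""
    rail_min = min(rail_samples_v)
    if rail_min < bor_threshold_v and reset_reason in ("BOD", "POR"):
        return "brownout confirmed"
    if rail_min >= bor_threshold_v:
        return "rails in window: suspect AFE injection or timing windowing"
    return "rail dipped but no brownout reset logged: check supervisor and logging"
```

The third branch matters in practice: a dip below threshold with no matching reset record usually means the capture or the reset-reason logging is incomplete.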

4) Lockout only at high power stage [air • rails • flame]

  • Symptom: Low stage runs, but high stage triggers lockout (air fail, flame fail, or mixed).
  • First 2 measurements: (1) DP/air-proving vs fan command/tach; (2) rail ripple when the power stage changes.
  • Discriminator: If DP fails to track fan state → airflow/DP chain dominates. If DP is consistent but flame misreports during high noise → flame AFE common-mode injection dominates.
  • First fix: Add a DP↔tach consistency gate; harden flame windowing and reduce common-mode coupling from high-power switching/commutation events.
  • Prevent: Validate at boundary conditions: high stage, maximum fan, cold line, and wet/humid stress.
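
One way to express a DP↔tach consistency gate is to flag a mismatch only when it persists beyond the fan spin-up time. In this sketch the thresholds (1500 rpm, 50 Pa) and the 2 s settle time are placeholder assumptions, and the (time_ms, tach_rpm, dp_pa) sample format is hypothetical.

```python
from typing import Optional, Sequence, Tuple


def dp_tach_consistent(tach_rpm: float, dp_pa: float,
                       min_rpm: float = 1500.0, min_dp_pa: float = 50.0) -> bool:
    """Fan evidence and airflow evidence must agree: both proven or both idle."""
    fan_running = tach_rpm >= min_rpm
    air_proven = dp_pa >= min_dp_pa
    return fan_running == air_proven


def first_persistent_mismatch(samples: Sequence[Tuple[int, float, float]],
                              settle_ms: int = 2000) -> Optional[int]:
    """Return the start time of the first mismatch lasting at least settle_ms."""
    start = None
    for t, rpm, dp in samples:
        if dp_tach_consistent(rpm, dp):
            start = None          # agreement resets the run
        else:
            if start is None:
                start = t
            if t - start >= settle_ms:
                return start
    return None
```

Returning the start of the mismatch, not the moment it is detected, gives the event log a timestamp that lines up with the stage transition.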

5) Enabling comms causes false flame [flame • rails • log]

  • Symptom: RS-485/Ethernet activity correlates with flame false-positive or false-negative.
  • First 2 measurements: (1) TP-485_CM (common-mode) or PHY rail ripple; (2) flame_raw + comparator/flame_ok output.
  • Discriminator: If false flame aligns with common-mode jumps → comms ground reference/CM injection dominates. If it aligns with rail ripple/reset behavior → rail windowing dominates.
  • First fix: Control common-mode at the port entry (biasing/ESD path/shield reference) and harden the flame input against CM injection; isolate only when evidence supports it.
  • Prevent: Include “comms on/off transient” in validation while logging TP-485_CM (or PHY rail) + flame decision.
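
The alignment test in the discriminator can be approximated by counting how many flame errors fall within a short window of each aggressor type. This is a sketch under stated assumptions: timestamps come from one shared clock, and the helper names and the 5 ms window are illustrative.

```python
from bisect import bisect_left
from typing import Sequence


def hits_near(events_ms: Sequence[int], aggressor_ms: Sequence[int], window_ms: int) -> int:
    """Count events that have at least one aggressor within +/- window_ms."""
    aggressor = sorted(aggressor_ms)
    hits = 0
    for t in events_ms:
        i = bisect_left(aggressor, t - window_ms)   # first aggressor >= t - window
        if i < len(aggressor) and aggressor[i] <= t + window_ms:
            hits += 1
    return hits


def comms_discriminator(flame_errors_ms: Sequence[int],
                        cm_steps_ms: Sequence[int],
                        rail_dips_ms: Sequence[int],
                        window_ms: int = 5) -> str:
    cm = hits_near(flame_errors_ms, cm_steps_ms, window_ms)
    rail = hits_near(flame_errors_ms, rail_dips_ms, window_ms)
    if cm > rail:
        return "common-mode injection"
    if rail > cm:
        return "rail windowing"
    return "inconclusive"
```

A tie, including zero hits on both sides, is reported as inconclusive so that a weak correlation is not mistaken for a root cause.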

6) Temperature spikes cause over-temp trip [rails • log]

  • Symptom: Heating appears normal, yet a sudden temp jump triggers over-temp.
  • First 2 measurements: (1) pre-filter ADC codes (raw); (2) Vref or sensor-supply ripple at the same timestamp.
  • Discriminator: If raw codes track Vref/supply ripple → reference/supply injection dominates. If single-sample spikes correlate with switching events → wiring/return coupling dominates.
  • First fix: Add range + rate gates, plus a “consistency gate” with flame/air state; reject impossible temp transitions that contradict operating stage.
  • Prevent: Record a short pre-trigger ring buffer (N samples) of raw codes around over-temp events.
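
The range, rate, and consistency gates can be sketched as one function that labels each new sample. All limits here (the -20 to 130 °C range, 5 °C/s maximum rate, 0.5 °C/s maximum rise without flame) are placeholder assumptions to be replaced with values derived from the actual sensor and heat exchanger.

```python
def gate_sample(prev_c: float, new_c: float, dt_s: float, flame_ok: bool,
                lo_c: float = -20.0, hi_c: float = 130.0,
                max_rate_c_per_s: float = 5.0,
                max_rise_no_flame_c_per_s: float = 0.5) -> str:
    """Classify one temperature sample before it reaches the over-temp logic."""
    if not (lo_c <= new_c <= hi_c):
        return "reject:range"            # open/short or saturated input
    rate = (new_c - prev_c) / dt_s
    if abs(rate) > max_rate_c_per_s:
        return "reject:rate"             # physically implausible step
    if not flame_ok and rate > max_rise_no_flame_c_per_s:
        return "reject:stage"            # rising fast with no flame contradicts the stage
    return "accept"
```

A rejected sample should still be logged with its raw code, and repeated rejections should escalate to a fault; otherwise the gate turns into the kind of silent fix this page argues against.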

7) Valve actuates but combustion is unstable [valve • flame • log]

  • Symptom: Valve command is present, but flame signal is unstable or feedback is inconsistent.
  • First 2 measurements: (1) valve_enable + energy signature (current/voltage proxy); (2) flame_raw (or flame_ok).
  • Discriminator: Valve energy stable but flame unstable → flame chain robustness/threshold/windowing dominates. Flame stable but valve energy fluctuates → drive protection, kickback clamp, or rail coupling dominates.
  • First fix: Verify kickback clamp behavior and drive timing; ensure the event log captures both command and energy proxy at transitions.
  • Prevent: Add a drive self-test: command-off must prove energy collapse within a fixed timeout (stuck relay/triac class).
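
The prove-off self-test reduces to one question: after the deadline, is any energy still present? The sketch below assumes a hypothetical capture of (time_ms, coil_current_a) pairs; the timeout and current threshold are placeholders.

```python
from typing import Sequence, Tuple


def stuck_on_after_off(samples: Sequence[Tuple[int, float]],
                       off_cmd_ms: int,
                       timeout_ms: int = 100,
                       off_threshold_a: float = 0.02) -> bool:
    """Return True if coil energy persists past the prove-off deadline."""
    deadline = off_cmd_ms + timeout_ms
    late = [amps for t, amps in samples if t >= deadline]
    if not late:
        # Without samples past the deadline the test proves nothing.
        raise ValueError("capture ends before the prove-off deadline")
    return any(amps >= off_threshold_a for amps in late)
```

Raising on a short capture is intentional: a self-test that cannot observe the deadline should fail loudly instead of passing by default.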

8) Humid/rainy conditions cause false flame or missed flame [flame • log]

  • Symptom: After humidity exposure, false flame is detected (no flame) or flame is missed (real flame).
  • First 2 measurements: (1) no-flame baseline of flame input (idle); (2) flame input bias / leakage indicators (baseline shift, CM drift, or comparator bias drift).
  • Discriminator: “Flame-like” rectification signature present with valve disabled → leakage/contamination path dominates. Flame present but amplitude is reduced with higher noise → window/threshold no longer robust under leakage/CM shift.
  • First fix: Increase leakage tolerance (input impedance strategy, bias control, cleaning/guarding) and retune proving windows based on measured wet baselines.
  • Prevent: Add wet/leakage stress validation and store baseline + threshold margin into the event record.
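
One possible definition of "threshold margin" is the distance between the flame threshold and the no-flame baseline plus a few standard deviations of its noise. The formula, the `k_sigma` value, and the assumption that flame_raw is in arbitrary units where flame reads higher than no-flame are all illustrative choices, not the page's prescribed method.

```python
from statistics import mean, pstdev
from typing import Sequence


def threshold_margin(baseline_samples: Sequence[float],
                     threshold: float,
                     k_sigma: float = 4.0) -> float:
    """Margin = threshold - (baseline mean + k_sigma * baseline noise).

    Positive: the no-flame baseline stays clear of the flame threshold.
    Negative: leakage or noise can cross the threshold with no flame present.
    """
    mu = mean(baseline_samples)
    sigma = pstdev(baseline_samples)
    return threshold - (mu + k_sigma * sigma)
```

Comparing the margin measured dry against the margin measured after wet stress shows how much headroom humidity consumes, and the value is small enough to store in the event record.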
[Figure F11 diagram: Field Debug Decision Map (symptom → evidence anchor → discriminator → first fix). Top 8 symptoms: ignition fail (no prove), flame loss (2–10 s), fan start → reset/fault, high stage → lockout, comms → false flame, temp spike → over-temp, valve on with unstable burn, humidity → false/miss. Evidence anchors: flame, air (DP), valve, rails, log; required triad: event + reset + flame/air/valve. Discriminator gates: timing consistency (flame ↔ valve ↔ DP), common-mode injection (port CM → flame AFE), rail windowing (BOR/WD + sampling). First fix actions: proving windows, CM control, hard gate proof, log fields.]
Figure F11 — Debug map that forces every symptom to land on flame/air/valve/rails/log evidence, then apply a discriminator gate and the smallest first fix.
H2-12 BOM / MPN Classes

IC / BOM Selection (MPN Classes) + RFQ-Ready Checklist

This section lists IC classes and concrete MPN examples that match the control-board evidence chain: flame sensing robustness, air-proving consistency, valve/fan drive survivability, rail/reset determinism, and comms common-mode tolerance. Select by “field symptom consequence” first, then map to specs.

A) Flame detect front-end (ionization/rectification) — MPN examples

Class | What it solves | Key specs to watch | Example MPNs (common families)
Low-bias op-amp / integrator | Stable rectified-signal integration under leakage/humidity; reduces false prove | Input bias current, offset drift, EMI robustness, supply range | TI OPA333 / OPA376 • ADI ADA4528-1 • Microchip MCP6V01
Low-power comparator (window building block) | Hard flame_ok thresholding after integration; supports proving windows | Input protection, hysteresis control, propagation vs noise immunity | TI TLV1701/TLV1702 • TI TLV3701 • ST TS881 • onsemi LMV331
Precision reference (for stable thresholds) | Prevents “works in lab, trips in field” due to Vref wander | Tempco, noise, load regulation, startup behavior | TI REF3330/REF5030 • ADI ADR4525/ADR3450 • onsemi LM4040

Practical selection hook: if “humidity → false flame/miss” is frequent, prioritize ultra-low input bias and leakage tolerance (guarding + bias strategy), then use a window comparator stage with controlled hysteresis.

B) ADC / Comparator / Reference for temperature & pressure chains — MPN examples

Class | What it solves | Key specs to watch | Example MPNs
ΔΣ ADC (sensor chains) | Stable readings with filtering; supports “raw code” evidence for spike triage | Noise, programmable data rate, input mux behavior, ref pin options | TI ADS1120 / ADS1220 • ADI AD7793 • Microchip MCP3561
I²C ADC (quick integration) | Fast bring-up for NTC/pressure monitoring; easy to log raw codes | Gain options, input range, conversion time, I²C robustness | TI ADS1115 • ADI AD7997 • Microchip MCP3421
Supervisor reference / shunt ref | Enables stable thresholding and open/short diagnostics under ripple | Dynamic impedance, temp drift, startup behavior | TI TL431 • onsemi NCP431 • ST TL431A

C) Drivers (relay/triac/SSR, solenoid, fan PWM) — MPN examples

Load | Selection focus | Key specs to watch | Example MPNs
Relay coil drive | Robust coil drive + clamp; avoids rail droop back-injecting into logic | Clamp strategy, current capability, thermal headroom | TI ULN2003A/ULN2803A • ST ULN2003A • onsemi ULN2803A
AC valve via triac/SSR | Predictable triggering; dv/dt tolerance; safe “prove-off” behavior | Isolation class (opto), zero-cross vs random-phase, dv/dt immunity | onsemi MOC3063 (zero-cross) • Vishay VO3063 • onsemi MOC3023 (random-phase)
DC solenoid / valve driver IC | Kickback control + current shaping; enables “energy signature” evidence | Peak/hold control, diagnostics, clamp behavior | TI DRV103 • TI DRV110 • onsemi NCV7751 (H-bridge/actuator class)
Fan PWM gate driver | Clean switching edges without logic rail upset | Gate drive current, UVLO, dV/dt immunity, supply range | TI TPS28225 • Microchip MCP1402 • onsemi NCP81071

Practical selection hook: if “fan start → reset” appears, treat driver choice + clamp/return path as a rail-integrity problem, not just a drive-current problem.

D) Supervisors (BOR / watchdog / sequencing) — MPN examples

Function | What it solves | Key specs to watch | Example MPNs
Voltage supervisor (BOR/reset) | Eliminates “half-boot” and undefined sampling during droop | Threshold accuracy, hysteresis, reset delay, reset output type | TI TPS3808 • TI TPS3823 • Analog Devices MAX809/810 family
Window watchdog | Prevents “fake running” and enforces safety-check cadence | Window settings, reset behavior, supply range | TI TPS3430 • Analog Devices MAX6369 • Microchip MCP1316/1416 family
Sequencing / power-good | Ensures AFE/ADC sampling only after rails are valid | PG thresholding, timing, open-drain behavior | TI TPS3890 • TI TPS229xx (load switch class) • ADI ADM6315 family

E) RS-485 transceiver (fault-protected, wide CMR) + isolation — MPN examples

Need | Selection focus | Example transceiver MPNs | Example isolated RS-485 MPNs
General robust 485 | ESD, fault protection, wide common-mode range, failsafe | TI THVD1500 • TI SN65HVD1781 • ADI ADM3485E • Maxim MAX13487E | ADI ADM2682E/ADM2687E • TI ISO1452 + transceiver (2-chip)
Port ESD clamp (board-level) | Steer ESD/surge away from transceiver; avoid leakage after hits | Semtech SM712 (RS-485 TVS) | n/a

Isolation decision must be evidence-based: if TP-485_CM repeatedly approaches the transceiver’s CMR and correlates with errors, isolation is justified.
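
The isolation rule above can be turned into a simple check over logged TP-485_CM readings. This sketch assumes the standard RS-485 common-mode range of -7 V to +12 V as the default; an extended-CMR transceiver would use its datasheet limits instead. The guard band, minimum event count, and error fraction are placeholder assumptions.

```python
from typing import Sequence, Tuple


def isolation_justified(records: Sequence[Tuple[float, bool]],
                        cmr_low_v: float = -7.0,
                        cmr_high_v: float = 12.0,
                        guard_v: float = 1.0,
                        min_events: int = 3,
                        min_error_fraction: float = 0.5) -> bool:
    """records: (common_mode_volts, comms_or_flame_error) pairs from one log.

    Isolation is supported only if the common-mode voltage repeatedly comes
    within guard_v of a CMR limit AND those excursions coincide with errors.
    """
    near_limit = [err for cm, err in records
                  if cm <= cmr_low_v + guard_v or cm >= cmr_high_v - guard_v]
    if len(near_limit) < min_events:
        return False
    return sum(near_limit) / len(near_limit) >= min_error_fraction
```

Requiring both repetition and correlation keeps a single surge event, or excursions that never cause errors, from driving the cost of an isolated port.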

F) Ethernet PHY + port protection hooks — MPN examples

Class | What it solves | Key hooks to verify | Example MPNs
10/100 PHY | Stable link under rail ripple and reset windowing | Rail quality, reset timing, strap configuration stability | Microchip LAN8720A • TI DP83825/DP83848 • Microchip KSZ8081/KSZ8051
Ethernet ESD array | Improves port survival; reduces latent leakage faults | Low capacitance, placement at connector entry | TI TPD4E1U06 • Nexperia PESD1ETH series • Littelfuse SP305x series

RFQ-ready selection checklist (copy/paste)

  • Appliance type: Boiler / Furnace (gas / electric / hybrid)
  • Flame method: ionization/rectification (default) or UV sensor (if applicable)
  • Actuators: valve type (AC triac/SSR or DC solenoid), igniter type, fan type (PWM/ECM interface), pump (if present)
  • Sensors: temp points (HX/flue/supply/return), pressure types (water/gas/DP)
  • Comms: RS-485 and/or Ethernet; isolation requirement (evidence: TP-485_CM behavior)
  • Rails: logic rails (3.3V/5V), actuator rails (12V/24V), brownout symptoms (yes/no)
  • Top symptom (from H2-11): pick 1–2 and provide available evidence: flame/air/valve/rails/log
[Figure F12 diagram: BOM Class Map (IC class → key hook → evidence anchor). IC/BOM classes: flame AFE (bias, window compare); ADC/comparator/Vref (raw codes); drivers (triac/SSR, solenoid, PWM); supervisors (BOR, WD, sequencing); RS-485 (CMR, fault, ESD, isolation); Ethernet PHY (rail/reset, ESD). Evidence anchors: flame, air, valve, rails, log. RFQ inputs: flame method, valve/igniter, fan/DP sensors, rails + brownout, comms + CM evidence.]
Figure F12 — BOM class map: IC classes and key selection hooks tied to the five evidence anchors and RFQ inputs.

Notes on MPN usage: verify voltage ratings, isolation requirements, safety approvals, and thermal margins against the target market and appliance class. MPNs above are example families; final selection should follow the evidence anchors and the RFQ checklist.

H2-13 FAQs ×12

FAQs (Evidence-Locked, Accordion-Ready)

Each question is answerable using only in-page evidence: flame, air (DP), valve, rails, log, or temp/pressure. Every answer follows the same rule: pick two measurements, apply a single discriminator, then do the smallest first fix.

  • Flame: raw waveform + flame_ok
  • Air: DP state/curve + tach
  • Valve: enable + energy signature
  • Rails: 3V3/5V + reset/WD
  • Log: timestamp + state + retry
  • Temp/Pressure: raw ADC + Vref ripple
[Figure F13 diagram: FAQ Evidence Router (Q → evidence anchor → first 2 measurements). Evidence anchors (Flame, Air (DP), Valve, Rails, Log, Temp/Pressure) mapped to the 12 FAQs by primary anchor. Rule: pick 2 proofs: waveform + state/time, or rail + reset reason, or DP + tach.]
Figure F13 — FAQ router that forces each question to map to one evidence anchor and a two-measurement starting point.
Q1

“Ignition is active” but lockout always happens—check flame first or valve_enable first?

Anchors: Valve • Flame • Log

Answer: Start with valve_enable because it proves whether the safety gate actually authorizes fuel; then confirm flame evidence.

First 2 measurements: valve_enable timing + flame_raw/flame_ok during the proving window (valve + flame).

Discriminator: If valve_enable is missing or drops first, the lockout path is upstream (interlock/limit). If valve_enable is stable but flame never crosses the window, the flame AFE threshold/noise window dominates.

First fix: Align proving windows away from peak ignition noise and log timestamps for valve_enable, flame decision, and retries.

Q2

Flame drops 3–5 seconds after lighting—true flame loss or flame AFE chatter? What two proofs decide?

Anchors: Flame • Valve • Log

Answer: Prove the order: flame decision vs valve authorization. The first signal to fail defines the root path.

First 2 measurements: flame_ok (or comparator output) + valve_enable edge timing at the drop moment (flame + valve).

Discriminator: If flame_ok falls while valve_enable stays high, suspect AFE injection/windowing or leakage drift. If valve_enable falls first, suspect hard gate/drive fault or rail windowing that forces shutdown.

First fix: Add a short, evidence-gated flame-loss debounce only when valve energy is confirmed stable; otherwise cut and log a 3-signal snapshot.

Q3

After rain/humidity, “false flame” appears—where is the most common leakage path and how to verify?

Anchors: Flame • Log

Answer: The highest-yield suspect is leakage/contamination around the high-impedance flame input, creating a rectification-like bias even with no flame.

First 2 measurements: Capture the no-flame baseline (valve disabled) of flame_raw + the flame comparator output (flame).

Discriminator: If “flame-like” signatures exist with valve disabled, it is leakage/PCB surface contamination. If real flame exists but amplitude collapses under noise, thresholds/windows lack margin under wet baselines.

First fix: Improve leakage tolerance (guarding/cleaning/bias strategy) and store wet baseline + threshold margin in the event record.

Q4

Fan start causes reset—rail droop from inrush or ground-bounce injection? What two points to measure first?

Anchors: Rails • Log • Air

Answer: Treat it as a rail determinism problem first; “no reset” and “reset” produce different fault narratives.

First 2 measurements: 3V3/5V waveform during fan command + reset_reason/WD flag with timestamp (rails + log).

Discriminator: If rails cross BOR/BOD and reset asserts, brownout is confirmed. If rails stay inside the window but symptoms correlate with switching edges, ground bounce/common-mode injection is more likely.

First fix: Separate high di/dt returns from logic/AFE returns and delay sensitive sampling until rails are stable; validate with repeated fan start/stop logs.

Q5

Air-proving intermittently fails—DP switch/sensor issue or weak fan? Which curve distinguishes them?

Anchors: Air (DP) • Rails

Answer: Compare airflow evidence against fan evidence. A DP chain that does not track fan state is the fastest split.

First 2 measurements: DP state/DP analog value versus tach/fan command over the same time window (air + tach).

Discriminator: Tach rises but DP does not move → DP sensor/switch chain or tubing path dominates. Tach is low or unstable with rail ripple during start → fan supply/inrush or rail windowing dominates.

First fix: Add DP↔tach consistency gating and log DP transitions with timestamps at each proving attempt.

Q6

RS-485 enabled → flame misreports—common-mode injection or power noise? How to pick evidence?

Anchors: Flame • Rails

Answer: Decide whether the disturbance enters through port common-mode or through rail ripple; the correlation tells the story.

First 2 measurements: Measure RS-485 common-mode (port reference jump) and capture flame_raw + comparator output during comms bursts (CM + flame).

Discriminator: If flame errors align with common-mode steps, port reference/ESD return paths inject into the flame AFE. If they align with 3V3/5V ripple or resets, rail windowing and sequencing dominate.

First fix: Control common-mode at the port entry and harden the flame input against CM injection; isolate only when TP evidence supports it.

Q7

Temperature jumps trigger over-temp—sensor open/short first, or ADC reference drift first?

Anchors: Temp/Pressure • Rails • Log

Answer: Check raw evidence before filters: true sensor faults and reference drift look different in pre-filter codes.

First 2 measurements: Capture pre-filter ADC codes and record Vref (or sensor-supply) ripple at the same timestamp (raw ADC + Vref).

Discriminator: Codes track Vref ripple → reference/supply injection dominates. Saturation to rail-like extremes → open/short is likely. Single-sample spikes aligned with switching events → wiring/return coupling dominates.

First fix: Add range + rate gates plus a consistency gate with flame/air stage; store a short raw-code ring buffer around trips.

Q8

Valve “has voltage” but won’t open—driver protection, coil open, or mechanical jam? How to tell?

Anchors: Valve • Rails • Log

Answer: Voltage alone is not proof of actuation; the decision needs an energy signature (current/decay behavior) plus the enable timeline.

First 2 measurements: Measure coil current (or equivalent energy proxy) and capture valve_enable edge timing (valve + log).

Discriminator: Voltage present but near-zero current → open circuit/connector. Current present but no mechanical response → jam/return spring issue. Driver enters protection or supply droops → short/overtemp/undervoltage dominates.

First fix: Verify kickback clamp behavior and log command + energy proxy at transitions for future root-cause retention.
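
The three-way split in the discriminator can be sketched as an ordered decision function. The parameter names, the 18 V and 50 mA limits (chosen as if for a nominal 24 V coil), and the `opened` flag, which stands for any independent evidence of mechanical response, are all assumptions for illustration.

```python
def valve_wont_open(coil_v: float, coil_a: float,
                    driver_fault: bool, opened: bool,
                    min_v: float = 18.0, min_a: float = 0.05) -> str:
    """Ordered checks: supply/driver first, then current, then mechanics."""
    if driver_fault or coil_v < min_v:
        return "driver protection or supply droop (short/overtemp/undervoltage)"
    if coil_a < min_a:
        return "open circuit or connector"
    if not opened:
        return "mechanical jam or return spring"
    return "valve actuated"
```

The order matters: current is only meaningful once the supply and driver are known to be healthy, and a mechanical verdict is only meaningful once current is confirmed.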

Q9

Lockout only at high power stage—ignition noise or unstable valve/air/combustion? Which two proofs first?

Anchors: Air (DP) • Flame • Rails

Answer: Start with the two state proofs that change with staging: airflow proof and flame proof. High-stage faults usually reveal which chain loses margin first.

First 2 measurements: Log DP/tach consistency and capture flame_raw (or flame_ok) during the stage transition (air + flame).

Discriminator: DP stops tracking tach → airflow/DP chain dominates. Flame becomes noisy or flips near commutation events → AFE injection/windowing dominates. If both are stable but lockout happens, check rails + reset/WD timing around the transition.

First fix: Add stage-transition validation: timestamped DP, flame decision, and rail minima logged for every high-stage entry.

Q10

After reboot the fault code “disappears”—what minimum event fields prevent “no evidence” investigations?

Anchors: Log • Rails

Answer: Store the last actionable story, not just a code. Without reset cause and timestamps, root cause becomes unrecoverable.

First 2 measurements: Ensure reset_reason (POR/BOD/WD) and event timestamp + last_state are captured in non-volatile storage (log).

Discriminator: If reset_reason is missing, brownout and watchdog faults look identical after reboot. If last_state/retry_count is missing, ignition failures and interlock failures are indistinguishable.

First fix: Minimum fields: timestamp, state, decision flags (flame/air/valve), retry count, and reset_reason; add a short snapshot of key analog minima if possible.

Q11

Flame signal is weak but can be maintained—adjust thresholds or fix return paths first?

Anchors: Flame • Rails

Answer: Do not tune thresholds blindly. Decide whether weakness is a stable low-amplitude condition or a noise-dominated injection problem.

First 2 measurements: Capture flame_raw amplitude/noise and correlate it with a known aggressor event (ignition pulses, fan start, comms bursts) (flame).

Discriminator: Weak but stable and not correlated to aggressors → threshold/window margin can be adjusted. Weak and strongly correlated to common-mode jumps or rail ripple → return paths/CM injection must be fixed first.

First fix: Improve CM/return control, then re-evaluate flame window thresholds with logged baselines to avoid trading false positives for false negatives.

Q12

Field issue is intermittent and cannot be reproduced in the lab—how to design a “minimal reproduction jig” and log fields?

Anchors: Log • Rails • Flame/Air/Valve

Answer: The jig should reproduce disturbances, not the whole appliance. Make the injection source controllable and the evidence capture deterministic.

First 2 measurements: Use a repeatable aggressor (fan start load step or comms on/off burst) and log event timestamps + reset_reason + flame/air/valve snapshot (log + rails).

Discriminator: If faults align with load steps, rail windowing/return coupling dominates. If they align with port bursts, common-mode injection dominates. If neither aligns, expand capture to include raw ADC minima and comparator edges.

First fix: Add a short ring buffer (N samples) for rails/Vref/flame decision around triggers so “one-off” becomes evidence-backed and debuggable.
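
A pre-trigger ring buffer is small enough to sketch in a few lines. This model uses a bounded deque; the class name and the idea of freezing a copy at the trigger are illustrative assumptions, and a firmware version would typically use a fixed array with a wrapping index instead.

```python
from collections import deque
from typing import Any, List


class PreTriggerBuffer:
    """Keeps only the most recent `depth` samples; older ones are overwritten."""

    def __init__(self, depth: int) -> None:
        self._buf: deque = deque(maxlen=depth)

    def push(self, sample: Any) -> None:
        self._buf.append(sample)

    def freeze(self) -> List[Any]:
        """Copy out the history (oldest first) at the moment a trigger fires."""
        return list(self._buf)
```

Usage: push one tuple per control-loop tick, for example (timestamp, rail minimum, Vref code, flame decision), and call `freeze()` when a lockout or over-temp trigger fires so the samples leading up to the event go into the log.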