
Smart Floor Heating: Temperature Sensing & SSR Power Control


Center Idea: Smart floor heating is a slow, high-inertia control system—stable comfort comes from trusted temperature sensing and measured power switching, not fast “on/off.” Use an evidence chain (T_floor/T_air curves + SSR_GATE/Vcc signals) to prevent overshoot, keep zones consistent, and diagnose reboots, alarms, and wireless dropouts quickly.

H2-1 — Set the Page Boundary & Center Idea (Make the Promise Clear)

Center idea: Smart floor heating is a slow thermal system. Stable comfort comes from trusted temperature sensing and predictable power delivery, validated by two evidence streams: temperature curves and power switching / load behavior.

This page stays device-side and focuses on the complete engineering loop: floor/air temperature sensing → SSR/relay power control → anti-overshoot control strategy → zoning & schedules → wireless coexistence under heating noise → validation & field debug evidence.

Search intents this page is built to answer (without scope creep)
  • Comfort: overshoot / swings / “hot feet”
  • Power stage: SSR heating / false trigger / buzz
  • Sensing: drift / probe failure / placement errors
  • Zones: one room cold / inconsistency
  • Robustness: reboot / wireless drops when heating
What you’ll get
  • A device boundary map with numbered test points that can be reused in validation and field SOP.
  • A consistent way to separate sensor vs power-stage vs safety failures using minimal measurements.
  • Control strategy guidance for slow thermal inertia: window, deadband, floor cap, and how to validate with curves.
  • Field-debug mindset: symptom → evidence → isolate → first fix (no “app tutorial” content).
Not covered here

Hub-level architecture, server-side services, whole-home energy panels, deep utility programs, and building-code walkthroughs. (This page remains device-side and evidence-based.)

Evidence rule: Any comfort claim must tie back to at least one measurable artifact: T-floor / T-air curve and/or power switching / load behavior. If no evidence is available, the section belongs in troubleshooting—not in “design conclusions.”

Figure F1: Smart Floor Heating device boundary with numbered test points (ICNavigator). Blocks: sensors (floor probe, air sensor), MCU + ADC (filter, setpoint, schedule, zones), control output (SSR_GATE), power stage (zero-cross SSR/relay), load (heating mat / cable), safety & protection (over-temp cutoff, thermal fuse, fault latch), device-side connectivity (Wi-Fi / Thread / Zigbee / BLE), aux PSU (3V3 / 5V rails). Test points: TP1 T_floor, TP2 T_air, TP3 SSR_GATE, TP4 AC_ZC, TP5 I_load (optional), TP6 Vcc 3V3/5V.

H2-2 — Architecture: Separate Sense Chain vs Power Chain (Plus Safety)

Most field failures become solvable once the system is split into three chains: Sense chain (temperature credibility), Power chain (actual heating delivery), and Safety chain (hardware cutoffs that remain effective under MCU faults). The goal is not “more features,” but minimum observability: a small set of signals that prove what is happening.

Engineering rule: Comfort issues should never be diagnosed from UI text alone. Prove “what the floor is doing” with TP1/TP2 curves, and prove “what power is delivered” with TP3/TP4 timing (and TP5 if available). If stability breaks during heating, correlate with TP6 rail behavior.

Typical controller building blocks (device-side)
  • Floor probe + ambient sensor → ADC front-end → filtering / plausibility checks.
  • MCU control → schedule / zone state machine → SSR/relay drive decision.
  • Power stage (zero-cross / triac / relay / SSR) → heating mat/cable load.
  • Aux PSU rails (3V3/5V) → brownout / watchdog reset handling.
  • Safety chain (over-temp policy + independent cutoff elements) → fault latch behavior.
  • Device-side wireless → coexistence under switching noise (evidence: retries/RSSI vs heating).
Module cards (What / Failure mode / Evidence / First fix)
Sense chain

What: Floor/air temperature → ADC → filter → validated temperature.

Failure mode: probe placement error, noise-injected readings, drift, open/short.

Evidence: TP1/TP2 jitter or implausible slope vs heating state; stable UI but unstable raw codes.

First fix: placement + wiring separation; RC/median filter; sample away from switching edges.

Power chain

What: SSR/relay timing → AC switching → load heating power.

Failure mode: false triggering (dv/dt), overheating, partial conduction, wiring faults.

Evidence: TP3/TP4 mismatch; TP5 shows missing power under “ON” state; abnormal temperature rise rate.

First fix: zero-cross timing; snubber strategy; thermal path; verify load current presence.

Safety chain

What: Hardware cutoffs + fault latch keep the system safe during MCU/PSU faults.

Failure mode: stuck-on power stage, missing cutoff, unsafe restart state.

Evidence: TP1 rises while control indicates “OFF”; FAULT state repeats after restart; cutoff not effective.

First fix: independent cutoff element; fail-safe default OFF; fault latch until manual clear.

Measurement points checklist (minimum observability set)
TP# | Signal | Typical tool | What it proves | Common pitfall
TP1 | T_floor (floor probe) | ADC log / DMM | Floor thermal state & slope | Placement error looks like “control bug”
TP2 | T_air (ambient) | ADC log | Room response vs floor cap behavior | Slow air dynamics misread as overshoot
TP3 | SSR_GATE / relay drive | Oscilloscope | Commanded switching intent | Looks correct even if power stage misbehaves
TP4 | AC_ZC (zero-cross) | Oscilloscope | Switching timing reference | Wrong reference causes extra EMI / heating jitter
TP5 (opt.) | I_load / power presence | Clamp / current sense | Actual delivered heating power | Partial conduction can hide as “some current”
TP6 | Vcc rails (3V3/5V) | Oscilloscope | Brownout / reboot correlation to heating | Probe ground loop can lie; keep leads short
Figure F2: Sense/Control/Power/Safety chain split with evidence hooks (ICNavigator). Sense chain: floor/air sensors (TP1, TP2) → ADC front-end (RC, sampling, plausibility). Control: MCU state machine (schedule, zones, setpoint) → switch command (TP3 SSR_GATE). Power chain: zero-cross timing (TP4 AC_ZC) → SSR/relay + load (TP5 I_load, optional). Safety chain (independently effective): over-temp cutoff, thermal fuse, fault latch, fail-safe OFF. Aux PSU rails: TP6 3V3/5V (brownout evidence).

H2-3 — Temperature Sensing: Probe Type, Placement, ADC Front-End & Noise Immunity

Core principle: Temperature “accuracy” in floor heating is a system property. Credible control depends on thermal coupling (where the probe actually measures), electrical integrity (how the signal survives switching noise), and sampling discipline (not turning transient spikes into false temperature).

This section covers only floor/air temperature inputs used by a floor-heating controller. It intentionally avoids IAQ sensors (PM/VOC/CO₂) and any cloud-side analytics.

Design checklist (ordered by impact on comfort stability)
1) Probe & placement (thermal truth)

Use a floor probe position that represents the controlled mass. Prefer a protective sleeve/tube that enables replacement and stabilizes coupling. Avoid direct proximity to heating wire paths.

2) Front-end & wiring (electrical truth)

Keep probe wiring away from mains/SSR switching loops. Use a simple RC input network and stable divider resistors; treat long cable runs as noise antennas.

3) Sampling & filtering (turn data into control)

Align sampling away from switching edges when possible. Prefer median/clamp for impulsive spikes, then use light low-pass smoothing to avoid adding control lag.

4) Fault diagnostics (safe failure behavior)

Implement open/short detection thresholds and plausibility checks (slope limits, impossible jumps). Fault states should drive fail-safe heating behavior and clear operator messaging.
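The sampling and filtering step in the checklist above can be sketched as a host-side model. This is a minimal illustration, not a reference implementation: the class name `TempFilter`, the 5-sample median window, the 0.5 °C step clamp, and the EMA weight of 0.2 are all assumptions chosen for readability and would need tuning against real raw codes.

```python
from collections import deque
from statistics import median


class TempFilter:
    """Median-of-N spike rejection, then a step clamp, then light EMA smoothing."""

    def __init__(self, window=5, max_step_c=0.5, alpha=0.2):
        self.raw = deque(maxlen=window)   # recent raw samples for the median
        self.max_step_c = max_step_c      # largest believable change per sample (assumed)
        self.alpha = alpha                # EMA weight: smaller is smoother but adds lag
        self.value = None                 # filtered temperature in deg C

    def update(self, sample_c):
        self.raw.append(sample_c)
        med = median(self.raw)
        if self.value is None:
            self.value = med
            return self.value
        # Clamp the innovation so a surviving spike cannot jerk the control input.
        step = max(-self.max_step_c, min(self.max_step_c, med - self.value))
        self.value += self.alpha * step
        return self.value


f = TempFilter()
readings = [24.0, 24.1, 24.0, 31.5, 24.1, 24.2, 24.1]  # 31.5 is an injected spike
filtered = [f.update(r) for r in readings]
print([round(v, 3) for v in filtered])
```

The median stage removes isolated spikes without adding much delay; the EMA weight is kept light because any extra lag is added on top of the floor's own thermal delay.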

Probe selection in floor-heating constraints (NTC vs RTD)
Decision axis | NTC (e.g., 10k / 50k) | RTD (PT100 / PT1000) | What to validate (evidence)
Long cable runs | Often practical with divider + ADC; noise immunity must be engineered. | Lead resistance can become a dominant error source (especially PT100). | Raw code stability vs SSR switching; compare reading drift vs cable length.
Linearity & calibration | Nonlinear; needs curve/segment mapping. | More linear; measurement can be more complex. | Two-point or segmented fit vs reference; verify across operating range.
Noise & sampling | Susceptible to injected spikes; median/clamp often effective. | Susceptible to pickup in lead wires; requires robust measurement method. | Correlate spikes with SSR edges (TP3/TP4); quantify jitter in steady state.
Fault detect | Open/short produces saturation codes; easy to threshold. | Open/short behavior depends on excitation/meas scheme. | Inject open/short and confirm safe state + clear diagnosis.
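For the NTC column, the "curve mapping" and "saturation code" points can be illustrated with the Beta equation and two thresholds. Everything numeric here is an assumption for the sketch: a 12-bit ADC, a 10k NTC on the low side of a divider with a 10k pull-up, Beta = 3950, and open/short thresholds near the ends of the code range. Real values come from the probe datasheet and the actual front-end.

```python
import math

ADC_MAX = 4095        # 12-bit ADC (assumption)
R_REF = 10_000.0      # pull-up resistor to Vref, ohms (assumption)
R25 = 10_000.0        # NTC resistance at 25 deg C (assumption)
BETA = 3950.0         # NTC Beta constant (assumption; take it from the datasheet)
OPEN_CODE = 4000      # codes at or above this are treated as an open probe
SHORT_CODE = 95       # codes at or below this are treated as a shorted probe


def ntc_code_to_celsius(code):
    """Return (temp_c, status). The NTC sits on the low side of the divider."""
    if code >= OPEN_CODE:
        return None, "open"
    if code <= SHORT_CODE:
        return None, "short"
    r_ntc = R_REF * code / (ADC_MAX - code)
    # Beta equation: 1/T = 1/T25 + ln(R/R25)/Beta, with T in kelvin.
    inv_t = 1.0 / 298.15 + math.log(r_ntc / R25) / BETA
    return 1.0 / inv_t - 273.15, "ok"


print(ntc_code_to_celsius(2048))
print(ntc_code_to_celsius(4095))
```

Returning a status alongside the temperature keeps the fault path explicit, so the caller can drive the fail-safe state instead of controlling on a saturated reading.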
Error source map (static vs dynamic)
Static errors (reading offset)

Sensor tolerance • divider tolerance • Vref drift • ADC INL/DNL • lead resistance (RTD).
Typical symptom: smooth curve, consistent bias.

Dynamic errors (comfort instability)

Switching noise injection (SSR dv/dt) • sampling at edges • cable pickup • ground return coupling.
Typical symptom: spikes/jitter that trigger control oscillation.

Common installation pitfalls: probe not in a sleeve/tube • probe placed too close to heating wires • incorrect depth • probe cable routed with mains/SSR wiring. These often present as “control bugs” but are proven by temperature spikes that correlate with switching edges.

Figure F2: Temperature AFE and dv/dt noise coupling path (ICNavigator). Signal path: sensor (NTC / RTD probe) → divider (Rref, stable Vref) → RC input (low-pass, anti-spike) → ADC (sampling window) → filter (median, clamp) → plausibility (open/short, slope). Noise source: SSR switching node (dv/dt edges, mains wiring), coupled by capacitive pickup, ground return, and cable antenna effects. Goal: stable raw codes → stable control.

H2-4 — Control Strategy: Slow Thermal Inertia Without Hot-Foot, Oscillation, or Overshoot

Control framing: Floor heating is dominated by thermal inertia and time delay. Aggressive switching rarely creates faster comfort; it more often creates overshoot, switching stress, and measurement-noise-driven chatter. The objective is stable comfort with predictable evidence.

Setpoint modes (device-side)
Mode A: Floor-limited

Primary regulation uses T_floor. This protects foot comfort and floor materials. Verification: T_floor reaches target with controlled slope; T_air may lag in high heat-loss rooms.

Mode B: Air-controlled with floor cap

Primary regulation uses T_air with a floor temperature cap. Verification: T_air converges without driving T_floor beyond cap; cap prevents hot-foot during long calls.
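Mode B can be sketched as a small decision function: air temperature drives demand through a deadband, and the floor cap overrides it. The function name `heat_demand`, the 0.5 °C deadband, the 29 °C cap, and the 1 °C cap-release margin are hypothetical values for illustration only.

```python
def heat_demand(t_air, t_floor, setpoint, heating,
                deadband=0.5, floor_cap=29.0, cap_release=1.0):
    """Return True if heating should be ON for the next control step."""
    if t_floor >= floor_cap:
        return False                       # the floor cap always wins over air demand
    if not heating and t_floor > floor_cap - cap_release:
        return False                       # let the floor cool before restarting
    if heating:
        return t_air < setpoint + deadband / 2   # keep heating until the upper edge
    return t_air < setpoint - deadband / 2       # start only below the lower edge


print(heat_demand(19.0, 24.0, 21.0, heating=False))  # cold room, cool floor
print(heat_demand(19.0, 29.2, 21.0, heating=True))   # cold room, floor at cap
```

The release margin is a design choice in this sketch: without it, a long heat call would chatter around the cap instead of forming a plateau.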

Control methods (ordered by practical deployability)
1) Hysteresis (deadband)

Simple and robust. Risk: larger temperature ripple. Evidence: sawtooth curve amplitude tracks deadband width.

2) Time-proportional control

Uses a fixed window (e.g., minutes) and modulates duty to smooth delivered heat. Evidence: reduced ripple without frequent relay chatter.

3) PI-lite (slow integral)

Eliminates steady-state error; must prevent integral windup. Evidence: no persistent heating after crossing setpoint; overshoot stays bounded.
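Methods 2 and 3 combine naturally: a slow PI computes a duty fraction once per window, and the window maps that duty to ON time. The sketch below assumes hypothetical gains (`kp=0.3`, `ki=0.02` per window), a 600 s window, and conditional integration as the anti-windup method; other anti-windup schemes would also fit the description.

```python
class PiLiteDuty:
    """Slow PI that outputs a duty fraction (0..1), updated once per control window."""

    def __init__(self, kp=0.3, ki=0.02):
        self.kp, self.ki = kp, ki
        self.integral = 0.0   # accumulated error, in deg C x windows

    def update(self, setpoint_c, t_meas_c):
        error = setpoint_c - t_meas_c
        candidate = self.integral + error
        unclamped = self.kp * error + self.ki * candidate
        duty = max(0.0, min(1.0, unclamped))
        # Anti-windup: do not integrate while the output is saturated
        # and the error would push it further into saturation.
        pushing_high = unclamped > 1.0 and error > 0
        pushing_low = unclamped < 0.0 and error < 0
        if not (pushing_high or pushing_low):
            self.integral = candidate
        return duty


def on_seconds(duty, window_s=600):
    """Time-proportional mapping: ON time inside one fixed window."""
    return round(duty * window_s)


ctl = PiLiteDuty()
cold_start = [ctl.update(25.0, 20.0) for _ in range(50)]  # long saturated warm-up
print(cold_start[-1], ctl.integral, on_seconds(ctl.update(25.0, 24.0)))
```

Because the integral stays frozen during the saturated warm-up, the controller does not keep heating after the setpoint is crossed, which is the evidence the PI-lite item asks for.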

Evidence-driven parameter tuning (no heavy math)

Tuning flow: choose a window that avoids chatter → set deadband/cap to prevent hot-foot → add min on/off and ramp limits to reduce stress → apply conservative cold-start behavior to avoid first-rise overshoot.

Parameter | What it controls | Too small / too short | Too large / too long | Evidence to watch
Window length | How often duty can change | Chatter, EMI, visible ripple | Sluggish response | T curve ripple frequency vs duty steps
Deadband | On/off sensitivity | Switching too frequent | Large temperature ripple | Sawtooth amplitude
Floor cap | Foot comfort & protection | May limit air comfort | Hot-foot risk | T_floor plateau vs T_air convergence
Min on/off | Switching stress reduction | Wear, audible relay noise | Delayed correction | Switch count per hour
Ramp / slope limit | How fast duty rises | Overshoot on cold start | Slow warmup | Cold-start overshoot and settling time
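The min on/off and ramp rows can be sketched as two small guards placed between the controller and the actuator. Names and numbers are assumptions for illustration: `SwitchGuard`, a 120 s minimum on and off time, and a maximum duty rise of 0.1 per window.

```python
class SwitchGuard:
    """Enforce minimum ON and OFF times and count transitions."""

    def __init__(self, min_on_s=120, min_off_s=120):
        self.min_on_s, self.min_off_s = min_on_s, min_off_s
        self.state = False
        self.since_change_s = float("inf")   # allow the very first transition
        self.switch_count = 0

    def step(self, want_on, dt_s):
        self.since_change_s += dt_s
        hold = self.min_on_s if self.state else self.min_off_s
        if want_on != self.state and self.since_change_s >= hold:
            self.state = want_on
            self.since_change_s = 0.0
            self.switch_count += 1
        return self.state


def ramp_limit(prev_duty, target_duty, max_rise=0.1):
    """Limit how fast duty may rise per window; decreases are never delayed."""
    return min(target_duty, prev_duty + max_rise)


g = SwitchGuard()
print(g.step(True, 1), g.step(False, 60), g.step(False, 60), g.switch_count)
print(ramp_limit(0.0, 1.0), ramp_limit(0.5, 0.2))
```

The `switch_count` field is what the "switch count per hour" evidence column would read; the ramp limit is one-sided so that a request to reduce heat is applied immediately.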
Figure F3: Temperature response vs duty windows in time-proportional control (ICNavigator). Upper panel: T_floor vs time with target, overshoot, and settling time marked. Lower panel: duty within fixed windows; short window → chatter, balanced window → stable, long window → sluggish.

H2-5 — Power Actuation: Choosing SSR / Triac / Relay Without Field Failures

Objective: A power stage is “good” only when it stays predictable in real wiring: it must switch when commanded, stay off when commanded, manage heat, and minimize dv/dt-triggered false turn-on and EMI. The most common field failures cluster into three buckets: false triggering, overheating, and EMI coupling into low-voltage domains.

Selection matrix (floor-heating relevant dimensions only)
Actuator | Switching frequency | Heat & thermal design | Noise / comfort | False trigger sensitivity | EMI profile | Best fit (typical)
Relay | Low (avoid frequent time-proportional toggling) | Low conduction loss; watch contact heating under stress | Audible clicks | Low dv/dt sensitivity; contacts can bounce or weld under abuse | Moderate; contact arcing can be noisy during abuse | Simple on/off with long min on/off; cost-focused designs
Triac + driver | Medium; depends on trigger strategy | Moderate; conduction drop causes heating | Silent | dv/dt + holding-current behaviors can cause unexpected conduction patterns | Can be high for random turn-on/phase methods | When controlled turn-on is engineered and verified by waveforms
Zero-cross SSR | Medium; still avoid unnecessary fast toggling | Often the dominant concern; requires heat path planning | Silent | Lower EMI than random turn-on; still has leakage and dv/dt limits | More friendly under proper layout and wiring | Time-proportional control with comfort stability focus
The three “failure buckets” (each one must be proven by evidence)
1) False triggering (dv/dt / leakage / timing)

Symptom: heating appears when command is OFF, or intermittent warm patches without schedule.
Evidence: correlate unexpected conduction with switching edges (AC_ZC / drive timing) and wiring proximity.
First fixes: reduce dv/dt injection loops, verify gate drive network, add snubber only when waveforms justify it.

2) Overheating (conduction loss → temperature rise)

Symptom: SSR/triac case runs hot; drift and intermittent faults appear after warm-up.
Evidence: estimate P ≈ V_on × I_rms, then measure case temperature rise by thermocouple/IR.
First fixes: improve heat path (copper area, thermal interface, airflow), reduce unnecessary toggling.

3) EMI coupling (power edges disturb sensing & radio)

Symptom: sensor spikes, resets, wireless dropouts only during heating transitions.
Evidence: supply dip/ground bounce aligned to switching edges; temperature ADC spikes aligned to SSR gate events.
First fixes: enforce wiring separation, shorten hot loops, strengthen low-voltage decoupling and reset immunity.

Driver & protection (device-side only)

Driver chain: MCU output → isolation/optocoupler (as required) → gate/input resistor → actuator input. Device-side surge control: use MOV/TVS only where they protect the product entry and are backed by layout discipline. Snubber: apply only when needed, proven by waveform ringing or dv/dt misbehavior—avoid adding components blindly.

Thermal evidence workflow (estimate → measure → decide)
Step 1: Estimate loss

Use actuator conduction drop at operating current. Treat it as continuous loss during ON windows.

Step 2: Measure rise

Measure case temperature after steady operation. Use consistent placement and time-to-steady-state.

Step 3: Decide action

If rise is high, upgrade heat path or derate; if only under edge events, prioritize EMI/trigger debugging.
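Steps 1 and 3 reduce to simple arithmetic, using the same P ≈ V_on × I_rms approximation as the overheating bucket. The worked numbers below are purely illustrative assumptions: an 1800 W mat on 230 V mains, a 1.2 V on-state drop, and a 5 °C/W case-to-ambient thermal resistance. Actual values come from the actuator datasheet and the measured heat path.

```python
def ssr_loss_w(v_on, i_rms, duty=1.0):
    """Approximate conduction loss: P = V_on * I_rms, scaled by ON duty."""
    return v_on * i_rms * duty


def temp_rise_c(power_w, r_th_c_per_w):
    """Steady-state rise above ambient for a given thermal resistance."""
    return power_w * r_th_c_per_w


i_rms = 1800.0 / 230.0              # load current for the assumed mat, amps
loss_full = ssr_loss_w(1.2, i_rms)  # continuous loss at 100 % duty
loss_half = ssr_loss_w(1.2, i_rms, duty=0.5)
print(round(i_rms, 2), round(loss_full, 2), round(temp_rise_c(loss_full, 5.0), 1))
print(round(loss_half, 2), round(temp_rise_c(loss_half, 5.0), 1))
```

If the measured case rise in Step 2 is far above this kind of estimate, the heat path is worse than assumed; if it is far below but faults still occur, the problem is more likely in the edge-event bucket.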

Figure F4: AC zero-cross and drive timing comparison (ICNavigator). Top: AC waveform with zero-cross (ZC) marks. Middle: zero-cross SSR, where SSR_GATE aligns near ZC → lower high-frequency content (EMI low). Bottom: random turn-on / phase-like edges, where SSR_GATE fires away from ZC → larger step → stronger coupling risk (EMI high). Validate by scope: AC_ZC + SSR_GATE + I_load (optional) + low-voltage rail stability.

H2-6 — Safety Chain: Over-Temperature, Leakage Symptoms, and Hardware Fail-Safes

Safety principle: Safety is not a feature; it is a stack. The chain must remain effective even when the MCU is wedged, the supply browns out, or the power device fails. Build protection so that the default fault outcome is heating OFF, and confirm each layer by evidence.

Protection stack (Trigger → Action → Evidence)
Layer 1 — Software limiting (floor cap)

Trigger: T_floor approaches cap or rises faster than allowed.
Action: reduce duty / switch OFF.
Evidence: duty decreases while temperature curve forms a controlled plateau near the cap.

Layer 2 — Independent thermal cutoff (thermal fuse / thermostat)

Trigger: local temperature exceeds independent threshold.
Action: physically opens the power path in series.
Evidence: load power stops even if MCU drive remains asserted.

Layer 3 — Runaway detection (stuck-on / slope anomaly)

Trigger: command OFF but temperature continues to climb abnormally, or load current persists with duty = 0.
Action: latch fault state; require manual intervention to resume.
Evidence: “state vs physics” contradiction in logs and traces.
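Layer 3 can be sketched as a latch that compares the commanded state against what the floor and load are doing. The class name, the 1 °C allowed rise after switch-off, and the 0.2 A current threshold are assumptions; in practice the rise threshold must sit above the normal coast caused by thermal inertia, which has to be measured per installation.

```python
class RunawayLatch:
    """Latch a fault when the command is OFF but physics says heat is still delivered."""

    def __init__(self, max_off_rise_c=1.0, max_off_current_a=0.2):
        self.max_off_rise_c = max_off_rise_c        # allowed coast after OFF (assumed)
        self.max_off_current_a = max_off_current_a  # leakage allowance (assumed)
        self.latched = False
        self.off_start_temp = None

    def update(self, commanded_on, t_floor, i_load=None):
        if self.latched:
            return True                  # stays latched until manual_clear()
        if commanded_on:
            self.off_start_temp = None
            return False
        if self.off_start_temp is None:
            self.off_start_temp = t_floor   # reference taken at the OFF transition
        rising = t_floor - self.off_start_temp > self.max_off_rise_c
        conducting = i_load is not None and i_load > self.max_off_current_a
        if rising or conducting:
            self.latched = True
        return self.latched

    def manual_clear(self):
        self.latched = False
        self.off_start_temp = None


r = RunawayLatch()
print(r.update(True, 25.0), r.update(False, 25.0), r.update(False, 26.2))
```

The latch deliberately ignores later "ON" commands once tripped, matching the requirement for manual intervention before heating resumes.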

Leakage / RCD interaction (device-side symptoms only)

Device-side view: an external RCD/GFCI trip is observed by the device as a power-loss event. Typical symptoms include trips on heater enable, intermittent trips during switching, and resets aligned with heating transitions. Device evidence should focus on brownout counters, reset reasons, and fault logs, without turning this page into a regulatory tutorial.

Fail-safe design anchors (without standards deep dive)
Isolation & partitioning

Keep mains and low-voltage domains physically and electrically separated. Maintain clear routing zones and controlled return paths.

Fail-safe OFF as default

On reset, watchdog, or sensor fault, the power output must move to OFF. Avoid ambiguous transitional states.

Fault latch & reset policy

Over-temperature or runaway detection should latch to prevent oscillating failures. Recovery should require explicit operator confirmation.

Figure F5: Safety chain with independent cutoff in series (ICNavigator). Path: sensors (T_floor, T_air) → MCU (control + limits) → SSR / triac (power switching) → load (heating mat), with an independent cutoff (thermal fuse / thermostat) in series in the mains path. Fault scenario: MCU wedged, output may stay ON. Evidence: command OFF but temperature rises → runaway detection; independent cutoff opens → load power stops even if drive is stuck.

H2-7 — Power Integrity & Immunity: Why Heating Causes Reboots or Dropouts

Core idea: Many “software-like” failures during heating are actually power-tree evidence. Switching edges and mains disturbances can cause rail sag, ground bounce, or brownout resets. The fix starts by proving causality: rail waveform aligned to SSR switching timing, plus reset reason and radio reconnect counters.

First 2 measurements (prove the cause in minutes)
  • M1 — Vcc rail at the load: probe 3V3/5V close to MCU or radio power pins (not only PSU output).
  • M2 — Switching reference: capture SSR_GATE or AC_ZC on the same timebase to correlate edge events.

Strong evidence: a repeatable Vcc dip or spike that occurs at each switching edge, with resets or dropouts clustered at the same timestamps.
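Once both measurements share a timebase, the clustering claim can be quantified from logged timestamps. This is a minimal sketch: the function name, the 50 ms tolerance, and the sample timestamps are hypothetical, and the inputs are assumed to be plain lists of seconds exported from a scope or event log.

```python
from bisect import bisect_left


def fraction_near_edges(event_times, edge_times, tolerance_s=0.05):
    """Fraction of events (resets, dropouts) within tolerance_s of any switching edge."""
    edges = sorted(edge_times)
    if not event_times or not edges:
        return 0.0
    hits = 0
    for t in event_times:
        i = bisect_left(edges, t)
        neighbours = edges[max(0, i - 1):i + 1]   # nearest edge before and after t
        if any(abs(t - e) <= tolerance_s for e in neighbours):
            hits += 1
    return hits / len(event_times)


edges = [10.0, 20.0, 30.0]              # SSR_GATE or AC_ZC edge timestamps, seconds
resets = [10.02, 19.99, 25.0, 30.04]    # reset or dropout timestamps, seconds
print(fraction_near_edges(resets, edges))
```

A fraction well above what random timing would give (roughly the share of the timeline covered by the tolerance windows) supports a switching-related cause; a fraction near that baseline points elsewhere.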

Power-tree weak points (what typically collapses first)
Radio rail sensitivity

Wireless modules often have tighter transient limits. A small sag can cause silent link loss and reconnect storms before a full reset.

MCU brownout behavior

Brownout can look like random firmware instability. Confirm with reset reason (BOR) and watchdog counters, then fix rail margin.

UI & sensing integrity

Ground bounce and switching injection can corrupt ADC readings and touch/UI states, triggering control oscillations and user-visible glitches.

Immunity levers (device-side, evidence-driven)
UVLO margin + brownout logging

Ensure rails stay above UVLO during switching edges. Log brownout/reset reasons and reconnect counters to close the evidence loop.

Bulk capacitance + placement

Bulk helps only if placed where transient current is demanded. Validate by reduced dip depth and shorter recovery time on Vcc.

Rail partitioning + return paths

Separate noisy power loops from sensitive domains. Ground bounce is proven when ADC noise and radio drops align with switching edges.

Entry suppression (surge/EFT device-side)

Use device-side protection and layout discipline to limit injected disturbances. Confirm with fewer rail excursions under edge events.

Pass condition: Heating transitions do not cause measurable Vcc excursions at MCU/radio pins, reset reasons remain clean, and reconnect counters do not spike during switching.

Figure F6: Power tree and transient path for heating-related reboots/dropouts (ICNavigator). Blocks: AC entry (surge / EFT), SSR switching node, aux PSU (5V, 3V3, fragile radio rail), sensitive domains (MCU, radio, sensors / UI / ADC). Transient path: switch edge / mains disturbance → rail dip. Correlation to prove cause: 1) Vcc dip aligned to SSR_GATE / AC_ZC; 2) reset reason shows brownout (BOR) or watchdog; 3) radio reconnect counter spikes during heating transitions.

H2-8 — Zoning & Scheduling: Consistency Across Rooms (Beyond “App Features”)

Core idea: Zoning is a control and evidence problem. The same setpoint can feel different across rooms because thermal resistance, thermal mass, and probe coupling differ. Use a single evidence metric—dT/dt at the same duty—to explain inconsistency and to justify per-zone parameters.

Zone sensing models (device-side feasible)
Model A — Per-zone floor probe

Best controllability. Each zone has direct floor feedback; consistency tuning is straightforward. Evidence: compare dT/dt distribution across zones at equal duty windows.

Model B — Shared ambient + per-zone actuation

Lower sensor cost but weaker floor insight. Use conservative caps and longer windows to avoid overshoot. Evidence: stability vs comfort tradeoff appears in settling time and ripple amplitude.

Model C — No probe (estimate-only)

Requires strict safety limits and slow control. Consistency is maintained by conservative windows and min on/off rules. Evidence: use rise-time and switching-count constraints to avoid chatter and hotspots.

Scheduling: preheat and hold (no pricing / DR expansion)
  • Preheat lead time: choose based on measured rise time (time-to-comfort), not guesswork.
  • Hold strategy: long window + min on/off reduces switching stress and improves comfort stability.
  • Night comfort: cap floor temperature and avoid rapid toggling to reduce noise and interference.
Consistency evidence (one metric that explains most complaints)

Metric: compare dT/dt across zones under the same duty window.
Interpretation: different slopes imply different coupling/installation/thermal paths.
Action: tune per-zone parameters (window, deadband, cap, min on/off) instead of forcing one global setting.
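The dT/dt metric can be computed as a least-squares slope over each zone's T_floor log, taken during the same duty window. The function name and the sample data are hypothetical; the only requirement is that every zone is logged under equal duty.

```python
def slope_c_per_min(times_s, temps_c):
    """Least-squares slope of temperature vs time, in deg C per minute."""
    n = len(times_s)
    mean_t = sum(times_s) / n
    mean_y = sum(temps_c) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, temps_c))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return 60.0 * num / den


times = [0, 60, 120, 180]                       # seconds, same duty window for all zones
zones = {
    "A": [20.0, 20.2, 20.4, 20.6],              # well-coupled zone (illustrative data)
    "B": [20.0, 20.05, 20.1, 20.15],            # slower zone (illustrative data)
}
slopes = {name: slope_c_per_min(times, temps) for name, temps in zones.items()}
print({name: round(s, 3) for name, s in slopes.items()})
```

A fitted slope is less sensitive to single noisy samples than a two-point difference, which matters when the rise over one window is only a fraction of a degree.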

User scenario cards (goal → strategy → parameters → validation)
Morning

Goal: comfort at wake-up.
Strategy: preheat lead time + stable hold window.
Parameters: longer window, conservative cap, clear min on/off.
Validation: time-to-comfort, overshoot size, switching count per hour.

Arrive Home

Goal: quick recovery without hotspots.
Strategy: controlled boost then settle into hold.
Parameters: ramp limits, avoid short windows that chatter.
Validation: settle time, ripple amplitude, zone-to-zone slope alignment.

Night

Goal: steady comfort, low disturbance.
Strategy: stable hold + strict cap.
Parameters: longer window, reduced transitions, strong min on/off.
Validation: no chatter, no radio drops, consistent floor curve.

Figure F7: Multi-zone closed-loop with per-zone parameters (ICNavigator). Shared layer: UI / connectivity (device-side only). Zones A, B, and C each have T_floor (estimated for zone C) → controller → duty window → actuator → load. Per-zone parameters: window, deadband, cap, min on/off. Evidence metric: dT/dt at the same duty window → explains zone inconsistency.

H2-9 — Connectivity Coexistence: Why Wireless Gets Worse Only During Heating

Core idea: “Heater on → wireless worse” is usually device-side coupling, not a cloud or gateway problem. The correct approach is to prove correlation: RSSI/PER/retry counters versus SSR duty and switching edges, then trace the coupling path (high di/dt loop → rails/ground → RF front-end).

Evidence first (what to correlate)
  • Sync reference: capture SSR_GATE or AC_ZC timestamps.
  • Wireless metrics: RSSI, PER / retry rate, disconnect reason, reconnect counter.
  • Power noise hints: radio rail ripple or short rail dips during switching transitions.

High confidence diagnosis: retries or dropouts cluster at specific switching edges or at certain duty bands (low/medium/high).
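The duty-band part of this diagnosis can be checked by bucketing link statistics by duty. The sketch assumes a hypothetical log of `(duty, packets_sent, retries)` tuples and three equal bands; both the tuple layout and the band edges are illustrative choices.

```python
def retry_rate_by_band(samples, bands=((0.0, 0.33), (0.33, 0.66), (0.66, 1.01))):
    """samples: iterable of (duty, packets_sent, retries). Returns one rate per band."""
    totals = [[0, 0] for _ in bands]          # [sent, retries] per band
    for duty, sent, retries in samples:
        for k, (lo, hi) in enumerate(bands):
            if lo <= duty < hi:
                totals[k][0] += sent
                totals[k][1] += retries
                break
    return [r / s if s else None for s, r in totals]


log = [(0.1, 100, 2), (0.2, 100, 4), (0.5, 100, 10), (0.9, 200, 60)]
print(retry_rate_by_band(log))   # low, medium, high duty bands
```

A retry rate that climbs from band to band suggests load-dependent coupling; a flat rate across bands makes a heating-related cause less likely.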

Likely coupling paths (device-side)
Path A — Rail ripple into RF

Switching noise raises ripple on the radio rail or 3V3. The RF front-end loses margin and retry rate climbs. Evidence: rail ripple rises with duty, retries rise with duty.

Path B — Ground bounce / return path pollution

High-current returns inject noise into the RF/baseband reference. Evidence: ADC noise and wireless errors increase together at edges.

Path C — Near-field coupling into antenna area

Noisy nodes or long harnesses near the antenna cause receiver desense. Evidence: RSSI looks stable but PER/retries spike.

Symptom → Evidence → Likely cause → First fix

Only heating

Symptom: Link drops occur only when duty window toggles.
Evidence: Reconnect spikes align with SSR_GATE edges; radio rail shows short dips.
Likely cause: Rail margin + switching injection into radio supply.
First fix: Stiffen radio rail locally; reduce edge injection; validate ripple reduction and fewer retries.

Specific power band

Symptom: Worst at medium or high duty, fine at low duty.
Evidence: PER/retry rate rises monotonically with duty; Vcc ripple increases with load.
Likely cause: Load-dependent coupling (rails or returns) rather than random RF interference.
First fix: Improve partitioning/returns and rail decoupling; confirm retry-vs-duty curve flattens.

RSSI ok

Symptom: RSSI looks normal but latency and retries spike.
Evidence: PER rises at switching edges; near-field probe shows noise near antenna keepout.
Likely cause: Receiver desense from near-field coupling; RF front-end loses effective SNR.
First fix: Enforce antenna keepout and move noisy nodes/loops away; re-test PER under heating.

Edge-specific

Symptom: Drops happen only at certain switching transitions.
Evidence: Errors cluster at edge timestamps; changing window phase shifts the cluster.
Likely cause: Switching event collides with RF critical timing window.
First fix: Apply switching synchronization (avoid RF critical slots); validate error clusters disappear.

Scope note: If the issue is proven to correlate with switching duty/edges, prioritize device-side coupling paths. Avoid jumping to gateway/cloud assumptions until the correlation evidence fails.

Figure F8: Noise coupling paths from SSR power loop into RF domain (ICNavigator). Power loop (high di/dt region, keep small): AC entry → SSR → load. Aux PSU: rails & ripple (3V3, radio rail), ground / returns. RF front-end: antenna, RFIC (retries / PER ↑). Coupling paths: rail ripple, ground bounce into RF, near-field coupling into the antenna keepout. How to prove: 1) SSR_GATE / AC_ZC timestamps; 2) RSSI + PER / retries + reconnect counter; 3) radio rail ripple (or short dips) during heating edges.

H2-10 — Validation Test Plan: Maximum Coverage with Minimal Instruments

Core idea: A compact validation plan should cover the highest risks—thermal behavior, power stage stress, immunity to disturbances, and safety under sensor faults—while producing consistent logs that enable fast field triage.

Minimum toolset (practical)
Thermal

Thermocouple or contact probe; optional thermal camera for SSR and enclosure hotspots.

Power

Oscilloscope/recorder for Vcc rails + switching reference (SSR_GATE or AC_ZC).

Logs

Serial/event logs for reset reason, reconnect counter, fault flags, and control state.

Unified logging fields (use the same fields across all tests)

Control: T_floor, T_air, duty, window length, min on/off, cap, zone ID
Power: Vcc_3V3/5V, (optional) radio rail, switching edge marker
Events: reset reason (BOR/WDT), reconnect counter, fault flags, safety state (fail-safe off / latched)
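One way to keep these fields identical across tests is a single record type that every test writes. The sketch below is an assumption about how such a log could look: the field names, units, and CSV output are illustrative and not a format defined by this page.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class LogRecord:
    t_s: float            # timestamp, seconds
    zone_id: int
    t_floor_c: float
    t_air_c: float
    duty: float           # 0..1
    window_s: int
    min_on_off_s: int
    floor_cap_c: float
    vcc_3v3: float        # rail voltage at MCU/radio pins
    edge_marker: bool     # True if a switching edge fell in this sample
    reset_reason: str     # "BOR", "WDT", or "" when no reset occurred
    reconnect_count: int
    fault_flags: int
    safety_state: str     # e.g. "normal", "failsafe_off", "latched"


def to_csv(records):
    """Serialize records with a fixed header so all tests share one schema."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(LogRecord)])
    writer.writeheader()
    for rec in records:
        writer.writerow(asdict(rec))
    return buf.getvalue()


rec = LogRecord(0.0, 1, 22.5, 19.8, 0.4, 600, 120, 29.0, 3.29, False, "", 0, 0, "normal")
print(to_csv([rec]))
```

Deriving the header from the dataclass means a field added for one test automatically appears in every other test's log.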

Compact checklist (test item / method / pass / log fields)
Thermal

Test item

Step response: cold start → target comfort (floor-limited and/or air-controlled with cap)

Method

Apply a setpoint step; record T_floor/T_air curve and duty over time; repeat for at least two zones or two simulated thermal paths.

Pass

Overshoot bounded; stable settling without chatter; consistent rise-time behavior per zone after tuning.

Log fields

T_floor T_air duty window zone

Test item

Steady-state hold: ripple and switching count

Method

Hold near setpoint for extended period; count switching transitions per hour; track ripple amplitude.

Pass

No oscillation; acceptable ripple amplitude; switching not excessive for the chosen actuator.

Log fields

duty min_on_off switch_count

Power stage

Test item

SSR temperature rise across duty bands

Method

Run low/medium/high duty profiles; measure SSR case temperature trend (thermocouple or IR).

Pass

No runaway rise; temperature stabilizes under steady conditions; margins remain under worst-case ambient.

Log fields

duty switch_count T_case

Robustness

Test item

Brownout / rail sag immunity during heating transitions

Method

Induce controlled rail stress (edge-heavy transitions); capture Vcc + SSR reference; observe reset reasons and reconnect counters.

Pass

No resets; no reconnect storms; if reset occurs, recovery enters fail-safe off and does not re-energize unexpectedly.

Log fields

Vcc SSR_GATE/AC_ZC reset_reason reconnect safety_state

Test item

ESD / EFT / surge (device-side injection points)

Method

Exercise entry points: AC entry, enclosure, probe cable; track resets and safe recovery behavior.

Pass

No unsafe output; faults are captured; system returns to known safe state after disturbance.

Log fields

fault_flags reset_reason safety_state

Sensor faults

Test item

Sensor open / short / drift injection

Method

Inject open/short; simulate drift with offset; verify detection thresholds and safe-state behavior.

Pass

Fault is detected quickly; output goes fail-safe off or enters strict limit mode; user-visible alarm is triggered.

Log fields

sensor_status, fault_flags, safety_state

Critical pass condition: After any reset or disturbance, the system must recover into a safe output state (fail-safe off or strict limited mode), never energizing heating unexpectedly.
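The step-response pass criteria in the table above (bounded overshoot, stable settling) can be computed directly from a logged T_floor curve. The following is a minimal Python sketch, not a definitive implementation: the function name `step_metrics`, the ±0.5 °C settling band, and the sample data are illustrative assumptions, and the real limits depend on the flooring type and the product's comfort targets.

```python
from typing import Optional, Sequence, Tuple


def step_metrics(
    times_s: Sequence[float],
    temps_c: Sequence[float],
    setpoint_c: float,
    band_c: float = 0.5,  # assumed settling band, tune per product
) -> Tuple[float, Optional[float]]:
    """Return (overshoot in degC, settling time in s) for one step response.

    Settling time is the time of the first sample after which every later
    sample stays within +/- band_c of the setpoint. It is None if the curve
    has not settled by the end of the log.
    """
    if not times_s or len(times_s) != len(temps_c):
        raise ValueError("times_s and temps_c must be non-empty and equal length")

    overshoot = max(0.0, max(temps_c) - setpoint_c)

    settled_at = None
    # Walk backwards from the end of the log while samples remain in band.
    for t, temp in zip(reversed(times_s), reversed(temps_c)):
        if abs(temp - setpoint_c) > band_c:
            break
        settled_at = t

    settling = None if settled_at is None else settled_at - times_s[0]
    return overshoot, settling


# Hypothetical log: one sample per minute after a setpoint step to 26 degC.
t = [0, 60, 120, 180, 240, 300]
floor = [20.0, 24.0, 27.2, 26.4, 26.1, 26.0]
overshoot_c, settling_s = step_metrics(t, floor, setpoint_c=26.0)
```

Running the same function on each zone's log gives comparable numbers for the per-zone consistency check.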

[Figure F9: Validation Coverage Map. Panels: Thermal (step response, overshoot, settling); Power Stage (SSR temperature rise, stress across duty); Robustness (brownout, ESD/EFT/surge, safe recovery); Fault Injection (open/short/drift, fail-safe off). All panels feed unified logs (T_floor, duty, Vcc, reset_reason, reconnect) for field triage.]
Figure F9 — Validation coverage map and unified logging fields (ICNavigator).

H2-11 — Field Debug SOP: Symptom → Evidence → Isolate → Fix

Field failures often look like “software bugs” but resolve faster by proving correlation between temperature curves, switching timing, and rail integrity. This SOP uses a repeatable 4-step template with two measurements first, then a single discriminator to isolate root cause.

Template (use for every symptom): First 2 measurements → Discriminator → Likely root cause (ranked) → First fix (fast)

Evidence signals used throughout this page
Thermal

T_floor, T_air, dT/dt (rise rate), overshoot, settling time

Switching

SSR_GATE timing, AC_ZC reference, duty/window length, min on/off, switch_count

Power & events

Vcc_3V3/5V (and optional radio rail), reset_reason (BOR/WDT), fault flags, reconnect/retry counters
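Keeping these signals on one timeline is easier when every sample lands in the same record. A possible shape for such a record is sketched below in Python. The class name `EvidenceRecord`, the field types, and the CSV export are assumptions for illustration. The field names follow the log fields used in the validation tests, but the actual firmware log format is not specified here.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable


@dataclass
class EvidenceRecord:
    """One time-aligned sample of thermal, switching, and power evidence."""
    t_s: float          # timestamp in seconds (assumed common time base)
    zone: int
    T_floor: float      # degC
    T_air: float        # degC
    duty: float         # 0.0 .. 1.0
    window_s: float     # time-proportional window length
    switch_count: int   # transitions since start of log
    Vcc: float          # volts at MCU/radio pins
    reset_reason: str   # e.g. "PowerOn", "BOR", "WDT"
    reconnect: int      # wireless reconnect counter
    fault_flags: int    # bit field, layout is product-specific
    safety_state: str   # e.g. "normal", "fail_safe_off"


def to_csv(records: Iterable[EvidenceRecord]) -> str:
    """Serialize records to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(EvidenceRecord)])
    writer.writeheader()
    for rec in records:
        writer.writerow(asdict(rec))
    return buf.getvalue()


sample = EvidenceRecord(0.0, 1, 21.5, 19.0, 0.4, 600.0, 3, 3.29, "PowerOn", 0, 0, "normal")
csv_text = to_csv([sample])
```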

Symptom A — Temperature Overshoot / “Too Hot” Floor

Applies when: electric floor heating with time-proportional control (SSR/relay/triac). Not for water loops or boiler systems.

First 2 measurements

  • T_floor curve (and T_air if available): log at 1–5 s interval.
  • Duty + window length (or SSR_GATE timing) aligned to the same timeline.

Discriminator (one-shot)

  • If overshoot grows when window is short or deadband is small, the control loop is “too eager” for a high-inertia system.
  • If overshoot happens even with long windows, suspect probe thermal coupling (probe does not represent actual floor surface temperature).

Likely root cause (ranked)

  1. Probe placement / thermal coupling error: probe too close to heating wire, wrong embed depth, no sleeve/tube, or local hotspot.
  2. Control parameters too aggressive: short windows, tiny deadband, missing min on/off, no slope limiting.
  3. Wrong control mode: air-controlled without a strict floor cap, or cap set too high for flooring type.

First fix (fast)

  • Containment: enforce a floor cap, increase deadband, add min on/off, and lengthen the time window.
  • Permanent: rework probe placement (sleeve/tube, avoid heating wire adjacency), then re-tune using step response.
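The containment steps above (floor cap, deadband, min on/off, longer window) can be combined in a small time-proportional control sketch. This Python example is illustrative only: the proportional gain, the "hold previous duty inside the deadband" rule, and all numeric defaults are assumptions, not the controller described on this page.

```python
def duty_command(
    t_ctrl_c: float,      # controlled temperature (air or floor)
    setpoint_c: float,
    t_floor_c: float,
    floor_cap_c: float,
    prev_duty: float,
    deadband_c: float = 0.5,  # assumed value
    gain: float = 0.25,       # assumed duty per degC of error
) -> float:
    """Duty (0..1) for the next time window."""
    if t_floor_c >= floor_cap_c:
        return 0.0  # floor cap always overrides the comfort loop
    error = setpoint_c - t_ctrl_c
    if abs(error) <= deadband_c:
        return prev_duty  # inside the deadband: do not chase small errors
    return min(1.0, max(0.0, gain * error))


def on_time_s(duty: float, window_s: float, min_on_s: float, min_off_s: float) -> float:
    """Convert duty to on-time in one window while enforcing min on/off."""
    if duty >= 1.0:
        return window_s
    on = duty * window_s
    if on < min_on_s:
        return 0.0  # pulse too short to be worth a switching event
    # Keep at least min_off_s of off-time inside the window.
    return min(on, window_s - min_off_s)


# Hypothetical 10-minute window with 60 s minimum on and off times.
d = duty_command(t_ctrl_c=20.0, setpoint_c=26.0, t_floor_c=24.0,
                 floor_cap_c=29.0, prev_duty=0.0)
on = on_time_s(d, window_s=600.0, min_on_s=60.0, min_off_s=60.0)
```

A longer `window_s` together with nonzero `min_on_s` and `min_off_s` directly lowers switch_count per hour, which is the quantity to re-check in the steady-state hold test.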

MPN examples (verify ratings & approvals)

NTC: TDK EPCOS B57560G104F; NTC: Semitec 104GT-2; RTD: TE PT1000 (platinum probe series); ADC (ext): TI ADS1115

Escalate/RFQ trigger: repeated overshoot after correct probe placement and tuning indicates thermal model mismatch or hardware limitations (actuator constraints, safety cap strategy, or required re-layout of sensing lines).

Symptom B — Temperature Display Jumps / Drifts

Goal: distinguish real thermal changes from sampling noise, coupling, or probe faults.

First 2 measurements

  • Raw ADC code (pre-filter) and filtered temperature output in parallel.
  • SSR edge reference (SSR_GATE or AC_ZC) to check edge-aligned noise bursts.

Discriminator (one-shot)

  • If spikes cluster at switching edges, the issue is coupled noise (layout/grounding/timing).
  • If drift persists without edge correlation, suspect probe aging, divider resistor drift, or reference drift.

Likely root cause (ranked)

  1. Probe cable picks up switching noise: probe routed with mains/heater lines; large loop area; no RC/guarding.
  2. Sampling timing issue: sampling during high dv/dt transitions; insufficient settling after switching.
  3. Probe connection intermittency: loose terminals; micro-cracks; moisture ingress causing leakage.
  4. Resistor/reference drift: divider resistor tempco or reference instability changes conversion gain.

First fix (fast)

  • Containment: move sampling away from edges, add median filtering and debounced fault thresholds.
  • Hardware quick-fix: add RC low-pass near ADC pin, shorten probe routing, separate from mains bundle.
  • Permanent: improve routing (twist/route away from SSR loop), tighten grounding, adjust front-end impedance.
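The containment step (sample away from switching edges, then median-filter) can be prototyped offline on logged raw ADC codes before changing firmware. The Python sketch below assumes timestamped samples and edge timestamps on the same clock. The 20 ms guard time and the filter width are placeholder values, not recommendations.

```python
from statistics import median
from typing import List, Sequence, Tuple


def reject_edge_samples(
    samples: Sequence[Tuple[float, int]],  # (time_s, raw ADC code)
    edge_times_s: Sequence[float],         # SSR_GATE or AC_ZC edge timestamps
    guard_s: float = 0.02,                 # assumed guard time around each edge
) -> List[Tuple[float, int]]:
    """Drop samples taken within guard_s of any switching edge."""
    return [
        (t, code)
        for t, code in samples
        if all(abs(t - e) > guard_s for e in edge_times_s)
    ]


def median_filter(codes: Sequence[float], width: int = 5) -> List[float]:
    """Sliding median; the window shrinks at both ends of the sequence."""
    half = width // 2
    out = []
    for i in range(len(codes)):
        lo = max(0, i - half)
        hi = min(len(codes), i + half + 1)
        out.append(median(codes[lo:hi]))
    return out


# Hypothetical capture: one spike (1800) coincides with an edge at 0.205 s.
raw = [(0.00, 1000), (0.10, 1002), (0.20, 1800), (0.30, 1001), (0.40, 999)]
clean = reject_edge_samples(raw, edge_times_s=[0.205])
smoothed = median_filter([code for _, code in raw], width=3)
```

If the spikes vanish after edge rejection alone, that supports the coupled-noise branch of the discriminator. If drift remains, the probe, divider, or reference is the next suspect.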

MPN examples (front-end & protection)

Opto ZC detector: Vishay H11AA1; ESD (probe): Nexperia PESD5V0S1BA; Divider R: Vishay TNPW (0.1%); Ref (if needed): TI REF3330

Symptom C — SSR Overheats / Fails

Focus: prove whether the dominant loss is conduction loss, triggering/EMI loss, or thermal path failure.

First 2 measurements

  • SSR case temperature trend (T_case) during low/med/high duty.
  • Load current presence (clamp meter) or power level estimate + switching count/hour.

Discriminator (one-shot)

  • If T_case rises roughly with current and on-time, conduction loss + thermal path dominates.
  • If T_case rises sharply with frequent switching rather than with on-time, switching or false-trigger loss dominates; reduce switching (window/min on/off) and re-check.

Likely root cause (ranked)

  1. Thermal design insufficiency: no heatsink margin, poor mounting, enclosure hot spots.
  2. Underrated SSR/triac: current/ambient derating ignored; repetitive surge events.
  3. dv/dt false triggering: inadequate snubber/MOV leading to unintended conduction heating.

First fix (fast)

  • Containment: increase time window, enforce min on/off, and limit maximum duty under hot ambient.
  • Hardware quick-fix: add snubber and MOV where appropriate; improve heatsinking contact.
  • Permanent: re-select SSR/triac with correct derating; redesign thermal path.
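To apply the conduction-loss discriminator, it helps to have a rough expected temperature rise to compare against the measured T_case. The Python sketch below uses a first-order estimate (average loss ≈ on-state voltage × load current × duty, rise ≈ loss × case-to-ambient thermal resistance). The default values of `v_on_v` and `r_th_ca_c_per_w` are placeholders, not datasheet figures, and must be replaced with values for the actual device and mounting.

```python
def expected_case_rise_c(
    i_load_a: float,
    duty: float,
    v_on_v: float = 1.2,           # placeholder on-state voltage drop
    r_th_ca_c_per_w: float = 5.0,  # placeholder case-to-ambient thermal resistance
) -> float:
    """First-order steady-state case temperature rise from conduction loss."""
    p_avg_w = v_on_v * i_load_a * duty
    return p_avg_w * r_th_ca_c_per_w


def excess_rise_c(measured_rise_c: float, i_load_a: float, duty: float, **kwargs) -> float:
    """Measured rise minus the conduction-loss estimate.

    A large positive excess that grows with switch_count points away from
    pure conduction loss (frequent switching, false triggering, or a poor
    thermal path).
    """
    return measured_rise_c - expected_case_rise_c(i_load_a, duty, **kwargs)


# Hypothetical run: 10 A load at 50 % duty, measured 45 degC above ambient.
estimate = expected_case_rise_c(i_load_a=10.0, duty=0.5)
excess = excess_rise_c(45.0, i_load_a=10.0, duty=0.5)
```

Evaluating `excess_rise_c` at low, medium, and high duty shows whether the gap scales with on-time or with switching count.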

MPN examples (actuator & drive)

Triac: ST BTA16-600B; Optotriac (ZC): onsemi MOC3063; Optotriac (random): onsemi MOC3023; Snubber cap (X2): KEMET R46 series; MOV: TDK EPCOS B722 series; SSR module: Crydom D2425 (example class)

Safety note: Any suspected “stuck-on” behavior must default to fail-safe off via independent hardware cutoff (thermal fuse / safety thermostat chain).

Symptom D — Reboots When Heating Turns On/Off

This symptom is often a power integrity problem triggered by switching transients.

First 2 measurements

  • Vcc rail at the MCU/radio pins (3V3/5V and optional radio rail), captured with edge timing.
  • SSR_GATE or AC_ZC aligned to Vcc to prove cause/effect.

Discriminator (one-shot)

  • If Vcc dips align with switching edges and reset_reason = BOR, the root cause is rail margin/return path.
  • If no Vcc dip is visible but reset_reason = WDT, suspect firmware lockup triggered by EMI or brownout side-effects.

Likely root cause (ranked)

  1. Insufficient bulk + high di/dt return: switching transient injects ground bounce or rail droop.
  2. UVLO margin too tight: PSU collapses briefly under mains disturbance or load steps.
  3. Entry disturbance: surge/EFT coupling into PSU, causing short rail interruptions.

First fix (fast)

  • Containment: reduce edge aggressiveness (switching rate), avoid rapid toggling, and log brownout events.
  • Hardware quick-fix: increase local bulk near MCU/radio; improve return routing; add input suppression.
  • Permanent: separate noisy power/returns from logic/radio domains; validate across worst-case mains events.
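The discriminator for this symptom reduces to two questions: do Vcc dips line up with switching edges, and what does reset_reason report? A minimal Python sketch of that triage on captured data follows. The dip threshold, the 5 ms alignment window, and the returned strings are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def vcc_dips_near_edges(
    vcc_samples: Sequence[Tuple[float, float]],  # (time_s, volts)
    edge_times_s: Sequence[float],               # SSR_GATE or AC_ZC edges
    threshold_v: float,                          # assumed "dip" level for this rail
    window_s: float = 0.005,                     # assumed alignment window
) -> List[float]:
    """Return timestamps of Vcc samples below threshold within window_s of an edge."""
    return [
        t
        for t, v in vcc_samples
        if v < threshold_v and any(abs(t - e) <= window_s for e in edge_times_s)
    ]


def classify_reset(reset_reason: str, dips_near_edges: Sequence[float]) -> str:
    """Map reset_reason plus edge-aligned dip evidence to a first suspect."""
    if reset_reason == "BOR" and dips_near_edges:
        return "rail margin / return path"
    if reset_reason == "WDT" and not dips_near_edges:
        return "firmware lockup (EMI or brownout side-effect)"
    return "inconclusive: capture more evidence"


# Hypothetical capture: a 2.60 V dip 1 ms after a switching edge at 0.100 s.
vcc = [(0.000, 3.30), (0.101, 2.60), (0.200, 3.29)]
dips = vcc_dips_near_edges(vcc, edge_times_s=[0.100], threshold_v=2.9)
verdict = classify_reset("BOR", dips)
```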

MPN examples (power integrity)

Supervisor: TI TPS3823; Supervisor: Microchip MCP1316; Buck: MPS MP1584; Buck: TI TPS62130; LDO: TI TLV75533; TVS (mains DC side): Littelfuse SMBJ series

Symptom E — Wireless Drops Only During Heating

Objective: prove correlation between retries/PER and switching duty/edges; then isolate rail vs near-field coupling.

First 2 measurements

  • Retries/PER/reconnect counter vs duty (log per minute or per window).
  • SSR edge timestamps (SSR_GATE or AC_ZC) to detect edge-clustered failures.

Discriminator (one-shot)

  • If failures cluster at edges, apply switching synchronization (phase shift). If clusters move or disappear, coupling is confirmed.
  • If RSSI is stable but PER spikes, suspect receiver desense from near-field coupling (antenna keepout violation).

Likely root cause (ranked)

  1. Rail ripple into radio domain (radio rail decoupling/partitioning insufficient).
  2. High di/dt loop near antenna (near-field coupling, harness antenna effect).
  3. Timing collision (switching edge overlaps RF critical window).

First fix (fast)

  • Containment: shift switching windows away from RF critical activity; reduce switching edge density.
  • Hardware quick-fix: strengthen radio rail decoupling; enforce antenna keepout; route noisy loops away.
  • Permanent: redesign return paths and partitioning; validate PER vs duty becomes flat.
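"Failures cluster at edges" can be made quantitative by comparing the fraction of failures that fall near a switching edge with the fraction expected if failures were spread uniformly over the log. The Python sketch below is one simple way to do that. The function name, the ±window definition, and the assumption that edge windows do not overlap are all illustrative.

```python
from typing import Sequence


def edge_cluster_ratio(
    failure_times_s: Sequence[float],  # timestamps of retries / lost packets
    edge_times_s: Sequence[float],     # SSR_GATE or AC_ZC edge timestamps
    window_s: float,                   # half-width of the "near an edge" window
    duration_s: float,                 # total log duration
) -> float:
    """Observed near-edge failure fraction divided by the uniform expectation.

    A ratio near 1 means no edge correlation; a ratio well above 1 means
    failures cluster at switching edges. Assumes edge windows do not overlap.
    """
    if not failure_times_s or not edge_times_s:
        return 0.0
    near = sum(
        1
        for f in failure_times_s
        if any(abs(f - e) <= window_s for e in edge_times_s)
    )
    observed = near / len(failure_times_s)
    expected = min(1.0, 2.0 * window_s * len(edge_times_s) / duration_s)
    return observed / expected


# Hypothetical 60 s log with edges every 10 s and four packet failures.
ratio = edge_cluster_ratio(
    failure_times_s=[10.01, 20.02, 30.0, 45.0],
    edge_times_s=[10, 20, 30, 40, 50],
    window_s=0.1,
    duration_s=60.0,
)
```

Repeating the calculation after applying the phase shift shows whether the clusters moved with the edges.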

MPN examples (EMI/ESD and radio rail)

CM choke: TDK ACM2012; ESD (RF): Nexperia PESD5V0X1B; LDO (radio): Microchip MCP1700-3302; TVS (IO): Littelfuse SP0502BAHT

Symptom F — One Zone Never Heats / Intermittent Heating

Goal: separate “control not commanding” vs “command present but no power delivery”.

First 2 measurements

  • Zone SSR_GATE presence (or relay coil drive) for the affected zone.
  • T_floor response (rise slope) within a controlled on-window.

Discriminator (one-shot)

  • If drive is present but T_floor does not respond, suspect power path / wiring / load.
  • If drive is absent, check sensor fault flags or cap/lockout conditions for that zone.

Likely root cause (ranked)

  1. Wiring/terminal issue: loose terminal, swapped zone wiring, neutral/line error.
  2. Actuator channel failure: SSR open, relay contact damage.
  3. Sensor fault lockout: probe open/short triggers fail-safe off for that zone.

First fix (fast)

  • Containment: force a short diagnostic on-window and confirm command + response.
  • Hardware quick-fix: verify terminals and actuator channel; replace suspect actuator; clear latched faults only after evidence.
  • Permanent: add per-zone current presence sensing or actuator health check hooks.
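The command-versus-response check can be automated for the short diagnostic on-window: fit a slope to the logged T_floor samples and combine it with whether the zone drive was present. This Python sketch is illustrative. The least-squares fit is a standard technique, but the `min_slope_c_per_min` threshold is an assumed placeholder that depends on floor construction and probe placement.

```python
from typing import Sequence


def slope_c_per_min(times_s: Sequence[float], temps_c: Sequence[float]) -> float:
    """Least-squares slope of temperature versus time, in degC per minute."""
    n = len(times_s)
    if n < 2 or n != len(temps_c):
        raise ValueError("need at least two paired samples")
    mean_t = sum(times_s) / n
    mean_y = sum(temps_c) / n
    num = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, temps_c))
    den = sum((t - mean_t) ** 2 for t in times_s)
    return 60.0 * num / den


def isolate_zone(
    drive_present: bool,
    times_s: Sequence[float],
    temps_c: Sequence[float],
    min_slope_c_per_min: float = 0.02,  # assumed threshold
) -> str:
    """Apply the Symptom F discriminator to one diagnostic on-window."""
    if not drive_present:
        return "check sensor fault flags / cap or lockout"
    if slope_c_per_min(times_s, temps_c) < min_slope_c_per_min:
        return "suspect power path / wiring / load"
    return "zone responds: command and power delivery OK"


# Hypothetical 3-minute window: drive is present but the floor does not warm.
result = isolate_zone(True, [0, 60, 120, 180], [20.0, 20.0, 20.0, 20.0])
```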

MPN examples (zone actuation)

Relay (mains): Omron G2RL series; Triac: Littelfuse Q6008 series; Optocoupler: Vishay VO615A

Symptom G — Intermittent False Alarms (Overtemp / Probe Fault)

Target: confirm whether alarms are triggered by real thermal conditions or by edge-coupled sensing noise.

First 2 measurements

  • Fault flags with timestamp (and latch state if implemented).
  • Raw sensor evidence around the alarm: ADC code, open/short thresholds, and edge reference.

Discriminator (one-shot)

  • If alarms align with switching edges, apply timing/RC changes. If alarms disappear, the root cause is sensing coupling.
  • If alarms occur without edge correlation, suspect real hotspots, probe intermittency, or moisture leakage.

Likely root cause (ranked)

  1. Thresholds too tight + insufficient debounce for a noisy environment.
  2. Probe intermittency: connector oxidation, micro-movement, moisture ingress.
  3. True thermal event: localized hotspot due to installation or insulation changes.

First fix (fast)

  • Containment: widen fault debounce and ensure safe output behavior (fail-safe off or strict limit mode).
  • Hardware quick-fix: improve probe ESD/EMI protection; add RC and routing separation.
  • Permanent: add probe integrity checks (open/short + plausibility) and alarm latching rules.
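Debounce and latching rules for probe faults can be expressed as a small state machine: a fault trips only after several consecutive out-of-range samples, and clears only after a longer run of good samples. The Python sketch below assumes a 12-bit ADC and uses placeholder thresholds and counts. The class name and its defaults are hypothetical, not taken from any specific firmware.

```python
class ProbeFaultDebouncer:
    """Debounced open/short detection with a latch and a stable-window clear."""

    def __init__(
        self,
        open_code: int = 4000,   # assumed: codes at or above this look like an open probe
        short_code: int = 50,    # assumed: codes at or below this look like a short
        trip_count: int = 5,     # consecutive bad samples needed to latch
        clear_count: int = 50,   # consecutive good samples needed to clear
    ) -> None:
        self.open_code = open_code
        self.short_code = short_code
        self.trip_count = trip_count
        self.clear_count = clear_count
        self.latched = False
        self._bad = 0
        self._good = 0

    def update(self, adc_code: int) -> bool:
        """Feed one raw ADC sample; return True while the fault is latched."""
        bad = adc_code >= self.open_code or adc_code <= self.short_code
        if bad:
            self._bad += 1
            self._good = 0
            if self._bad >= self.trip_count:
                self.latched = True
        else:
            self._good += 1
            self._bad = 0
            if self.latched and self._good >= self.clear_count:
                self.latched = False
        return self.latched


# A single edge-coupled spike does not trip; a sustained open reading does.
deb = ProbeFaultDebouncer(trip_count=3, clear_count=4)
states = [deb.update(code) for code in [4095, 2000, 4095, 4095, 4095]]
```

While `latched` is True the output should follow the safe behavior described above (fail-safe off or strict limit mode).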

MPN examples (fault robustness)

TVS (probe/IO): Littelfuse SMF5.0A; ESD array: TI TPD2E001; Supervisor/WDT: Maxim MAX809 (example class)

Quick MPN Shortlist (Common Fix Parts)

Examples only. Always verify voltage/current/creepage approvals, thermal derating, and safety requirements.

Zero-cross / timing

Vishay H11AA1; onsemi MOC3063; onsemi MOC3023

Actuation

ST BTA16-600B; Littelfuse Q6008; Omron G2RL; Crydom D2425

Protection & EMC

TDK EPCOS B722 MOV; KEMET R46 X2 cap; Nexperia PESD5V0S1BA; Littelfuse SMBJ TVS

Power integrity

TI TPS3823; Microchip MCP1316; MPS MP1584; TI TPS62130; TI TLV75533

[Figure F10: Field Debug Decision Tree (Evidence First). Flow: start from symptom → take 2 measurements → use one discriminator → isolate → first fix.
Overshoot (T_floor too hot): measure T_floor curve and duty + window; discriminator: does overshoot grow when the window is short?; first fix: cap + deadband, min on/off.
No Heat (zone cold): measure SSR_GATE presence and T_floor response; discriminator: drive present but no temperature rise?; first fix: wiring/actuator, sensor lockout.
Reboot (on heating edges): measure Vcc at MCU and SSR edge timing; discriminator: Vcc dip + BOR?; first fix: bulk + returns, UVLO margin.
Wireless drops (with heat): measure retries/PER vs SSR edges; discriminator: do edge clusters move with phase?; first fix: sync + keepout, radio rail.]
Figure F10 — Evidence-first decision tree for common smart floor heating field symptoms (ICNavigator).

H2-12 — FAQs (Evidence-First, Device-Side Only)

Each answer stays inside the device boundary and closes the loop with two measurements, a single discriminator, and the fastest fix. No HEMS/cloud/gateway deep-dive.

1. “It’s scheduled ON but still not warm” — probe placement or no real power?

First prove command vs response: capture SSR_GATE (duty/window) and T_floor rise rate (dT/dt). If SSR_GATE is active but dT/dt stays near zero, isolate wiring/load/actuator (open relay, failed triac/SSR, wrong terminal). If SSR_GATE is absent, check sensor plausibility or safety lockout. Fast fix: force a short diagnostic ON window and verify current presence.

Evidence: SSR_GATE + T_floor dT/dt | Maps: H2-3 / H2-5 / H2-11

2. Temperature looks stable, but comfort feels hot/cold — window too short or filtering too heavy?

Log duty + window length and compare raw ADC vs filtered T_floor. If comfort oscillation repeats with a period close to the control window, the window/deadband is too aggressive—lengthen the window and enforce min on/off. If the display lags real changes (raw moves but filtered barely moves), filtering/settling is too heavy—reduce smoothing and avoid sampling near switching edges.

Evidence: window + raw/filtered T | Maps: H2-4 / H2-3

3. Over-temp alarms happen often — real overheating or noisy probe wiring/contacts?

Correlate fault timestamps with SSR edges (SSR_GATE/AC_ZC) and inspect the raw ADC around the event. If alarms cluster at switching edges, it is usually coupling or intermittency—add RC near the ADC, separate probe routing from mains, and improve contact reliability. If alarms occur without edge correlation and dT/dt remains high, treat as a real thermal event and force fail-safe behavior.

Evidence: fault flag + edge alignment | Maps: H2-6 / H2-3 / H2-11 | MPN ex: PESD5V0S1BA, TPD2E001

4. Is a very hot SSR “normal”? How to judge risk quickly?

Quantify, then decide: measure I_load (or power level) and SSR case temperature trend during low/med/high duty. If temperature rises roughly with on-time and current, conduction loss plus thermal path dominates; improve heatsinking or derate maximum duty. If it heats abnormally during frequent toggling, increase the window and enforce min on/off to reduce switching density. Re-select the actuator only after evidence.

Evidence: I_load + T_case | Maps: H2-5 | MPN ex: BTA16-600B, Crydom D2425

5. Only at high power the link drops — EMI coupling or power droop? Which two waveforms?

Capture Vcc at MCU/radio and retries/PER while aligning with SSR edges. If Vcc dips or resets show BOR, it is rail margin/return-path—add local bulk, tighten UVLO margin, and separate noisy returns. If Vcc is stable but PER spikes at edges, it is EMI/near-field coupling—reduce edge density, improve snubber/MOV strategy, and enforce antenna keepout plus a clean radio rail.

Evidence: Vcc + PER (edge-aligned) | Maps: H2-9 / H2-7 | MPN ex: TPS3823, ACM2012, MOC3063

6. Set 26°C but it keeps hitting 29°C — tune parameters first or add a floor cap?

Apply safety-first layering: verify whether a floor cap exists and whether T_floor is capped correctly. If there is no cap (or it is too high), add or tighten the cap before tuning. If a cap exists and overshoot still happens, tune the slow system: lengthen the time window, increase deadband, and enforce min on/off. Validate by step response and overshoot reduction, not by “feel” alone.

Evidence: T_floor overshoot + mode/cap | Maps: H2-4 / H2-6

7. One room is always colder — thermal resistance difference or zone parameters not independent?

Compare zones under the same commanded duty: log duty and each zone’s dT/dt. If dT/dt differs strongly, the dominant factor is installation/thermal resistance or probe coupling—treat it as a physical delta and calibrate expectations. If zones share one probe or share parameters, the controller cannot correct per-room behavior—enable per-zone caps, offsets, and window settings, or add a per-zone probe where required.

Evidence: per-zone dT/dt vs duty | Maps: H2-8

8. After power loss, the state feels wrong — restore last duty or default OFF?

Default to fail-safe: use reset_reason (PowerOn/BOR/WDT) and latched faults to decide behavior. For BOR/WDT or any uncertain state, force outputs OFF and re-validate sensors and caps before re-enabling heat. Only restore a gentle ramp (not the last duty) when the system is clean and stable. Fast fix: add a supervisor and log brownout events; avoid “instant resume” on unstable mains.

Evidence: reset_reason + fault latch | Maps: H2-6 / H2-7 | MPN ex: TPS3823, MCP1316
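The restart rule in the answer to question 8 can be written as a small decision function plus a ramp limiter. This Python sketch is an assumption-laden illustration: the reset_reason strings follow the ones named above (PowerOn/BOR/WDT), but the policy labels, the 30-minute ramp, and the function names are hypothetical.

```python
def startup_output_policy(reset_reason: str, latched_faults: bool, sensors_valid: bool) -> str:
    """Decide how heating may resume after a reset.

    Returns "OFF" (stay off until re-validated) or "RAMP" (resume gently).
    The last duty before the reset is never restored directly.
    """
    if latched_faults or not sensors_valid:
        return "OFF"
    if reset_reason in ("BOR", "WDT"):
        return "OFF"  # uncertain state: re-validate sensors and caps first
    if reset_reason == "PowerOn":
        return "RAMP"
    return "OFF"  # unknown reset reasons are treated as uncertain


def ramp_limit(elapsed_s: float, ramp_s: float = 1800.0, max_duty: float = 1.0) -> float:
    """Upper bound on duty during a gentle restart (linear ramp, assumed 30 min)."""
    return max_duty * min(1.0, max(0.0, elapsed_s / ramp_s))


# Hypothetical clean power-on: ramp is allowed, duty capped at 50 % after 15 minutes.
policy = startup_output_policy("PowerOn", latched_faults=False, sensors_valid=True)
cap = ramp_limit(900.0)
```

The control loop would then apply `min(commanded_duty, ramp_limit(elapsed_s))` while the policy is "RAMP".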
9. Probe open/short — what is the safest expected behavior?

A safe design detects open/short by ADC thresholds with debounce, then forces a predictable output state. The minimum safe behavior is fail-safe OFF plus a visible error and a logged fault flag. Avoid “keep heating at last duty” under sensor uncertainty. Add plausibility checks (rate-of-change and range) and require a stable sensor window before clearing the fault. This prevents runaway heating from a broken probe or connector.

Evidence: ADC threshold + fault flag | Maps: H2-6 / H2-3

10. Relay clicking complaints — is switching to SSR enough? What are trade-offs?

Start with control strategy: if clicking comes from frequent toggling, increase the window and add min on/off first. Switching to SSR reduces audible noise but introduces heat dissipation, possible leakage current, and dv/dt sensitivity. A zero-cross drive can reduce EMI, but thermal design still matters. Decide with evidence: switch_count/hour, T_case, and EMI symptoms. Then choose relay/triac/SSR accordingly.

Evidence: switch_count + T_case | Maps: H2-5 | MPN ex: Omron G2RL, MOC3063, BTA16-600B

11. Floor temperature looks fine, but room air won’t warm — strategy issue or heat loss?

Use evidence to separate “control” from “capacity”: observe whether T_floor plateaus near the cap while T_air stays below target. If yes, the controller is doing what it is allowed to do; the limiting factor is heat transfer/heat loss (or too strict a cap). If T_floor is not reaching target and duty is constrained, tune window/deadband and verify sensor coupling. Confirm with step tests and steady-state error logs.

Evidence: T_floor plateau + T_air error | Maps: H2-4 / H2-10

12. Which two test categories are most often missed and cause painful field failures?

Two gaps dominate: (1) disturbance + recovery and (2) fault injection. Disturbance tests include brownout during switching, EFT/ESD at probe lines and enclosure, and verifying safe restart states. Fault injection includes probe open/short, stuck-on actuator detection, and alarm latching/clearing rules. A related “comfort gap” is skipping long-duration steady-state tests across different thermal resistances and multi-zone simultaneous heating.

Evidence: reset_reason + injected faults | Maps: H2-10
[Figure F11: FAQ Evidence Map (Device-Side). Five evidence streams (Thermal: T_floor / T_air / dT/dt; Switching: SSR_GATE / AC_ZC / duty; Power: Vcc rails / BOR / WDT; Wireless: RSSI / retries / PER; Safety: fault flags / latch / cap) map to FAQ buckets and fastest fixes: Comfort & Overshoot (Q2, Q6, Q11); Actuator & Heat (Q4, Q10); Power & Recovery (Q5, Q8, Q12); Zones & “No Heat” (Q1, Q7).]
Figure F11 — FAQ Evidence Map for Smart Floor Heating (ICNavigator).