Heat Pump & Radiant Heating Control: ΔT, Metering & Diagnostics
Heat pump radiant heating performance becomes predictable when the controller can prove delivery (Flow + ΔT), prove capacity (Compressor/Vdc/Iphase), and prove boundaries (metering + state gating) with complete logs. If any of these evidence chains is weak, COP, comfort, and fault conclusions will be misleading—so the fastest fix is to restore trustworthy sensing, stable control, and actionable snapshots.
H2-1 — Definition & boundary: Heat Pump + Radiant Hydronics
This chapter locks the scope so every later section stays evidence-based and non-overlapping: drives → ΔT/flow proof → metering/COP confidence → remote diagnostics logs.
One-sentence boundary
This page focuses on control and diagnostics for a heat pump feeding a hydronic radiant loop—compressor/valve/pump drives, supply/return ΔT and flow evidence, energy metering and COP estimation, plus remote fault logs—excluding thermostat UX, room-by-room zoning/TRVs, and cloud platform architecture.
In scope
- Drives: what is commanded, what feedback confirms execution, and what “trip signatures” prove a root cause.
- ΔT & flow evidence: where to measure, what ranges make sense, and what failure patterns look like.
- Metering: define the energy boundary (AC line vs DC link) and grade COP confidence based on sensor integrity.
- Remote diagnostics: minimum snapshot bundle + counters so “remote fault” becomes actionable.
Out of scope
- Thermostat UX / automation rules (schedules, UI flows, app integration).
- Room-by-room zoning, TRVs, manifold actuator design (end devices and zone logic).
- Boiler/furnace flame control (combustion safety chain).
- HEMS / whole-home energy orchestration (panel-level switching, solar/storage system control).
- Cloud backend architecture (pipelines, dashboards, multi-tenant services).
- Installation tutorials (plumbing practice, building codes, hydronic balancing procedures).
What this scope enables (reader outcomes)
- Fast discrimination: separate “not enough heat” into capacity vs circulation/mixing using ΔT + flow evidence.
- Trustworthy performance: compute thermal output and grade COP confidence instead of relying on misleading single sensors.
- Actionable remote support: define exactly what to log so the first remote report already contains proof-quality snapshots.
H2-2 — Reference architecture: Outdoor unit + hydraulic module + radiant loop
The goal here is not theory—it is a stable “evidence coordinate system” that every later control, metering, and diagnostics section can reference without drifting into thermostat or cloud topics.
How to read the system: four evidence paths
- Power → Drive path: AC input to PFC/DC link and inverter current/voltage prove whether trips, brownouts, or ramp limits are electrical/drive-driven.
- Refrigerant actuation path: compressor speed, EEV state, reversing valve state explain defrost transitions and capacity behavior without turning into a refrigeration textbook.
- Hydronic delivery path: pump + mixing valve + supply/return headers, validated by ΔT/flow/pressure, prove whether heat is actually delivered to the radiant loop.
- Diagnostics/log path: fault codes + snapshot bundle + counters convert “remote fault” into proof-quality evidence.
Blocks (and why each matters to control & diagnosis)
- AC in → PFC → DC link → inverter: defines the electrical-in boundary and explains start/defrost transient failures (DC link sag, current spikes).
- Compressor inverter drive: provides capacity; phase current + speed command are the quickest discriminators for overload vs control-limit behavior.
- EEV + reversing valve: must be logged as state during capacity drops/defrost events; “water-side symptoms” can be caused upstream.
- Hydraulic module (HX/buffer optional): acts as the coupling point between refrigerant-side capacity and water-side delivery stability.
- Circulation pump + mixing valve: determines radiant supply temperature and ΔT stability; hunting often comes from sensor lag + actuator deadband.
- Supply/return temps + flow + pressure: are the minimum set to prove circulation vs blockage/air-lock vs sensor error.
- Remote interface (RS-485/Modbus or Ethernet): used to deliver evidence—fault class, snapshots, and counters—not a protocol-stack deep dive.
Interface evidence table (what to measure, what it proves)
| Signal / Node | Sensor / Method | Typical range (field) | What it proves | Failure signature (common) | Logging rule |
|---|---|---|---|---|---|
| Supply temp (radiant supply pipe) | NTC / PT1000 clamp or immersion well | Low-to-mid temp water for radiant floors (system-dependent) | Confirms delivered water temperature at the loop entry; used with return temp for ΔT and thermal output estimation. | Loose clamp: noisy steps; poor insulation: drift with ambient; slow response causes mixing valve hunting. | 1–2 s |
| Return temp (radiant return) | Same as supply | Usually below supply; gap depends on load and flow | Enables ΔT; rising return temp with low ΔT often indicates high flow or insufficient load pickup. | Thermal coupling errors make ΔT appear “too small/negative,” breaking COP estimation. | 1–2 s |
| Flow rate | Hall/turbine flow sensor or inferred from ΔP | Varies by loop size and pump curve | Proves circulation actually occurs; required to distinguish “capacity issue” vs “delivery issue.” | Intermittent pulses: sticking rotor; flow reads OK but ΔP abnormal: sensor misread or bypass path. | 1–2 s |
| ΔP / pressure (across filter/loop) | Pressure sensor(s) or ΔP transducer | System-dependent; used as trend evidence | Separates clog/valve restriction/air-lock from normal flow; validates whether “flow” reading is believable. | High ΔP + low flow: blockage; normal ΔP + low flow: pump issue or closed valve; noisy pressure: cavitation/air. | 2–5 s |
| Pump RPM / tach | EC motor tach output or driver estimate | Varies by pump type | Confirms actuator execution; essential when remote diagnosis claims “pump running.” | RPM present but flow missing → air-lock / closed valve; RPM unstable → supply/driver issue. | 1 s |
| Mixing valve position | Stepper position or analog feedback | 0–100% (normalized) | Explains supply temp shaping and hunting; validates whether control action matches measured ΔT trends. | Deadband/stiction: repeated small commands with no temp change; overshoot: sensor lag or wrong loop gain. | 1–2 s |
| DC link voltage | Divider to ADC | Tracks input line and PFC state | Proves brownout/UVLO behavior; ties transient faults (start/defrost) to power integrity evidence. | Sag during defrost/start → resets / derating; spikes → OVP trips; ripple rise → capacitor aging. | Event snapshot (200–500 ms) |
| Inverter phase current | Shunt + amplifier or Hall sensor | Load-dependent | Separates “control-limited” from “hard overload”; supports OCP/root-cause classification. | Current spikes aligned with valve transitions; high RMS for same capacity → mechanical or refrigerant-side issue. | Event snapshot (200–500 ms) |
| Ambient / frost state | NTC + logic state | Climate-dependent | Explains defrost entry/exit and capacity derating triggers; prevents misdiagnosing “heat drop” as water-side only. | Frost sensor drift → excessive defrost; missing state in logs → remote diagnosis becomes guesswork. | 5–10 s + state changes |
Note: “Typical range” is intentionally described as field-dependent to avoid false precision across different loop sizes, pump curves, and regional design temperatures. The diagnostic value comes from trends and cross-checks (ΔT vs flow vs ΔP).
H2-3 — Control loops that matter (and how they couple)
This chapter defines a practical control coordinate system: what each loop controls, what evidence proves it is behaving, and how defrost/freeze-protect states override setpoints so faults are not misclassified.
Loop selection rule (keep it evidence-based)
Every loop below is defined by a controlled variable, an actuator, and 2–3 proof signals. If a claimed “loop issue” cannot be proven by these signals, treat it as a measurement integrity or override-state problem first.
Key loops (what controls what)
Compressor speed loop (capacity control)
- Controlled variable: delivered capacity proxy (speed/power trend), not abstract thermodynamic quantities.
- Actuator: inverter → compressor (BLDC/PMSM).
- Proof signals: speed command vs speed estimate; inverter phase current; DC link voltage/power trend.
- Common failure signature: command rises but current/power does not → derating, protection limit, or measurement error.
EEV control (state-driven stability)
- Controlled variable: stable EEV state consistent with operating mode (heating/defrost transitions).
- Actuator: EEV step/coil driver.
- Proof signals: EEV position/step count; driver status; mode/defrost state bits.
- Common failure signature: frequent state changes with weak system response → mechanical stiction, drive undervoltage, or missing logs.
Pump control (flow/ΔT stability)
- Controlled variable: flow (or flow-derived stability) and its impact on ΔT trend.
- Actuator: EC pump speed/PWM.
- Proof signals: pump RPM/tach; flow reading; ΔP/pressure trend (if available).
- Common failure signature: RPM present but flow missing → air-lock, closed valve, or bypass path.
Mixing valve control (supply temperature shaping)
- Controlled variable: radiant supply temperature (and/or ΔT target stability).
- Actuator: mixing valve actuator (stepper-driven, or analog-positioned with position feedback).
- Proof signals: valve position; supply temp response; sensor lag/filtered sampling window.
- Common failure signature: valve hunting usually indicates sensor lag + deadband + loop gain mismatch, not only a “bad valve.”
Defrost / freeze-protect interactions (what gets overridden)
- Controlled variable: system safety and survivability states (freeze avoidance, stable transitions).
- Actuator overrides: compressor/pump/valves may be forced to preset actions during defrost or freeze-protect.
- Proof signals: defrost/freeze-protect state bits; override reason; start/end timestamps; affected setpoints.
- Common failure signature: “heat drop” during defrost misread as water-side fault when override states are not logged.
Loop coupling map (if X changes, check Y)
- If mixing valve hunts (rapid position oscillation): check supply sensor response time (contact + insulation) and filter window; confirm valve deadband/stiction before changing gains.
- If ΔT collapses while supply temperature looks “fine”: check flow (pump RPM vs flow) and return sensor integrity; verify ΔP trend to rule out bypass/air-lock.
- If compressor command rises but heating performance does not: check inverter current/DC link evidence; then cross-check hydronic delivery (flow + ΔT) to separate capacity from delivery.
- If frequent defrost coincides with heat delivery drops: check defrost state transitions and overridden setpoints; ensure snapshots include valve states and DC link during entry/exit.
- If pump RPM is stable but ΔT is noisy or negative: suspect ΔT measurement integrity (placement/lag) before concluding “loop instability.”
H2-4 — ΔT capture: sensing chain, placement, accuracy, and sampling strategy
ΔT is the evidence currency for this topic. If ΔT is wrong, diagnostics and COP estimates become untrustworthy. This chapter treats ΔT as an integrity problem: sensor choice, placement, lag, and field validation steps.
Sensor options (NTC vs PT1000/RTD) — choose by failure signatures
- NTC clamp sensors: cost-effective but sensitive to contact quality, self-heating, and ADC reference stability; poor contact often appears as step noise and drift.
- PT1000/RTD: better linearity but more sensitive to lead resistance and installation method; incorrect wiring or contact can bias ΔT and break COP estimation.
- Selection rule: prioritize the option that yields repeatable trends under insulation and contact constraints, because trend integrity matters more than a single absolute value.
Placement pitfalls (most field failures originate here)
- Stratification / non-mixed points: a sensor placed too close to a mixing junction reads partially blended or stratified water rather than the true supply or return temperature.
- Pipe contact errors: loose clamps create noise and “teleporting” temperature steps.
- Insulation missing: ambient air affects the pipe sensor; return readings often drift more, creating a fake ΔT shrink.
- Response time mismatch: slow supply sensor response introduces phase lag, causing mixing valve hunting and overshoot.
Sampling & filtering (avoid lag that destabilizes control)
- Over-filtering risk: smoothing reduces noise but adds delay, which destabilizes mixing valve and pump coordination.
- Recommended pattern: a slow trend channel (1–2 s) for steady-state analysis plus a fast event snapshot (200–500 ms window) on defrost start/stop and trips.
- Cross-check rule: if supply temp changes but return temp shows no coherent delayed response, treat ΔT as untrusted until placement/contact is verified.
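The two-channel pattern above can be sketched as follows. This is a minimal illustration only: the `DualRateLogger` class, the 50 ms fast sample period, and the field names are assumptions made for the example, not values from a specific controller. The 1 s trend period and 500 ms window follow the ranges given above.

```python
from collections import deque


class DualRateLogger:
    """Slow trend channel plus a fast pre-event window (hypothetical sketch)."""

    def __init__(self, fast_period_s=0.05, trend_period_s=1.0, window_s=0.5):
        self.fast_period_s = fast_period_s
        # Number of fast samples between two trend points.
        self.trend_every = round(trend_period_s / fast_period_s)
        # Ring buffer that always holds the most recent window_s of raw samples.
        self.ring = deque(maxlen=round(window_s / fast_period_s))
        self.trend = []       # (time_s, delta_t_k) at the slow rate
        self.snapshots = []   # frozen fast windows, one per event
        self._n = 0

    def sample(self, t_supply_c, t_return_c):
        """Call once per fast period with the latest raw readings."""
        ts = self._n * self.fast_period_s
        self.ring.append((ts, t_supply_c, t_return_c))
        if self._n % self.trend_every == 0:
            self.trend.append((ts, t_supply_c - t_return_c))
        self._n += 1

    def on_event(self, name):
        """Freeze the fast window at an event edge (defrost start/stop, trip)."""
        self.snapshots.append({"event": name, "window": list(self.ring)})
```

Keeping the raw fast samples in the ring buffer, and filtering only the trend channel, means a heavy trend filter never hides what happened at the event edge.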
Two “first measurements” to validate ΔT quality in the field
- Co-located cross-check: place a second probe or contact thermometer at the same pipe point and compare trend alignment. Large divergence indicates contact/insulation issues.
- Small step response test: apply a small controlled change (pump speed or mixing valve position) and observe whether supply and return temperatures respond with a reasonable delay and direction. Lack of coherent response indicates lag/placement problems.
Deliverable checklist: ΔT sanity checks + symptoms of bad ΔT
ΔT sanity checks
- ΔT should not cross zero repeatedly in stable heating; frequent sign flips suggest contact/placement or filtering artifacts.
- Supply and return should move coherently with return lagging; return moving “first” often indicates ambient coupling or sensor error.
- ΔT trend should correlate with flow changes; if flow increases and ΔT stays random, at least one sensor chain is untrusted.
- Cross-check ΔT with ΔP/pressure trend when available to rule out bypass/air-lock misreads.
Symptoms of bad ΔT
- False “capacity loss” diagnosis when return sensor drifts or is ambient-coupled.
- False “pump insufficient” conclusion when supply sensor lag compresses ΔT.
- COP estimate jumping wildly due to ΔT noise dominating thermal-out calculation.
- Mixing valve “unstable control” blamed when the real cause is sensor phase lag and deadband interaction.
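The first sanity check in the list (no repeated zero crossings in stable heating) is easy to automate. The sketch below is a minimal example under stated assumptions: `delta_t_sanity` is a hypothetical helper, the samples are assumed to come from a stable heating segment with defrost already excluded, and the `max_sign_flips` threshold is a placeholder to tune per system. It does not cover the coherence and lag checks, which need flow and timing data.

```python
def delta_t_sanity(supply_c, return_c, max_sign_flips=1):
    """Return (ok, reasons) for paired supply/return samples in deg C.

    Hypothetical helper: checks only ΔT sign stability and mean sign.
    """
    dt = [s - r for s, r in zip(supply_c, return_c)]
    if not dt:
        return (False, ["no_data"])

    # Ignore exact zeros so a single flat sample is not counted as a flip.
    signs = [d > 0 for d in dt if d != 0]
    flips = sum(1 for a, b in zip(signs, signs[1:]) if a != b)

    reasons = []
    if flips > max_sign_flips:
        reasons.append("sign_flips")
    if sum(dt) / len(dt) <= 0:
        # In heating mode the supply should be warmer than the return on average.
        reasons.append("non_positive_mean")
    return (len(reasons) == 0, reasons)
```

A failed result is a reason to set a `ΔT_valid`-style flag to false rather than to change control gains.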
H2-5 — Flow & pressure sensing: proving circulation vs guessing
“Not enough heat” must be separated into a delivery problem (circulation/valves/air-lock) versus a capacity problem (compressor/derating). This chapter treats flow and pressure as proof signals that cross-check ΔT.
Circulation proof rule (ΔT + flow + pressure must agree)
A single sensor can lie. Circulation is proven only when ΔT trend, flow evidence, and pressure/ΔP evidence form a coherent story across mode transitions and pump commands.
Flow sensing options (choose by diagnostic strength)
- Best for: low-cost trending and “flow exists / not exists” proofs.
- Typical artifacts: intermittent pulse dropouts from bubbles, stiction, or debris.
- Proof pairing: always compare with pump RPM/tach and ΔP trend to avoid false confidence.
- Best for: blockage evidence and resistance change detection.
- Typical artifacts: pump cavitation shows as noisy ΔP and unstable ΔT.
- Proof pairing: combine with valve position and pump command to separate “restriction” vs “forced bypass.”
- Best for: stable long-term flow trending and low-maintenance sensing.
- Typical artifacts: air entrainment can reduce accuracy; treat “unstable readings” as evidence of bubbles.
- Proof pairing: validate against ΔT response to small pump steps (coherent response matters).
- Pump head evidence: outlet pressure and ΔP across loop correlate with commanded RPM.
- Blockage evidence: ΔP rises while flow drops (or inferred flow drops).
- Air-lock evidence: unstable ΔP + flow pulse dropouts + ΔT coherence loss.
Failure signatures (what the evidence looks like)
- Bubbles / air-lock: flow pulses intermittently missing, ΔP noisy, ΔT becomes erratic or flips sign during transients.
- Clog / restriction: ΔP rises and flow falls; ΔT can inflate (at low flow each unit of water gives up more heat, yet total delivery stays weak) or collapse (if circulation becomes intermittent).
- Stuck valve / wrong valve state: normal pump RPM with abnormal ΔP/flow correlation; supply looks “OK” but return response is delayed or absent.
- Wrong pump curve / insufficient head: pump command increases but flow barely changes; ΔP stays low; ΔT can grow (low flow) while comfort stays poor.
Discriminator table (symptom → ΔT trend → flow/pressure evidence → likely cause)
| Symptom | ΔT trend | Flow evidence | Pressure / ΔP evidence | Likely cause | First confirmation |
|---|---|---|---|---|---|
| Supply temp looks OK, rooms still cold | ΔT small or unstable | Flow low or intermittent; RPM may be normal | ΔP noisy or inconsistent with RPM | Air-lock / bubbles; intermittent circulation | Check air purge behavior and pulse continuity during small pump step |
| Heating slow; short bursts of warmth then fades | ΔT flips or drifts | Flow pulses drop out; flow reading sporadic | Pressure signal noisy; ΔP spikes then collapses | Bubble entrainment / cavitation events | Correlate events with pump RPM and valve position changes |
| System trips into protection when demand rises | ΔT may collapse abruptly | Flow does not increase as commanded | ΔP rises sharply at higher RPM | Restriction/clog or partially closed valve | Inspect filter/strainer state and valve commanded state |
| Comfort poor; pump command higher does not help | ΔT grows (low flow) | Flow barely changes with RPM | ΔP remains low even at higher RPM | Wrong pump curve / insufficient head | Compare head evidence versus expected RPM response; check bypass paths |
| ΔT looks “great” but feels wrong | ΔT large but inconsistent across time | Flow reading stable but mismatched with return response | ΔP contradicts flow or shows abnormal noise | Flow sensor bias or placement artifact | Cross-check with ΔP inference or secondary flow method |
Note: Diagnostic strength improves when flow and ΔP are evaluated as trends against pump commands and mode states (e.g., defrost transitions).
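As a rough illustration, the discriminator table can be reduced to a small decision function evaluated after a deliberate pump step-up. This is a simplified sketch: `classify_delivery` and its four boolean inputs are hypothetical names, and the real table carries more context (ΔT trend, valve state, mode) than four flags can express.

```python
def classify_delivery(flow_rises_with_rpm, dp_rises_with_rpm,
                      dp_noisy, pulse_dropouts):
    """Map pump-step evidence to the most likely cause (simplified sketch)."""
    if dp_noisy and pulse_dropouts:
        # Noisy ΔP together with missing flow pulses points to air in the loop.
        return "air-lock / bubbles"
    if dp_rises_with_rpm and not flow_rises_with_rpm:
        # Pump builds head but water does not move: something is in the way.
        return "restriction / partially closed valve"
    if not dp_rises_with_rpm and not flow_rises_with_rpm:
        # Neither head nor flow responds to the command.
        return "insufficient head / wrong pump curve"
    if flow_rises_with_rpm and not dp_rises_with_rpm:
        # Flow reading contradicts the pressure evidence.
        return "flow sensor bias or placement artifact"
    return "coherent: delivery evidence looks normal"
```

The order of the checks matters: air-related noise is tested first because it can make the other three comparisons unreliable.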
H2-6 — Drives & power chain (only what supports control + evidence)
Drives are included only as they impact stability, trips, and remote diagnostics. Each drive block is paired with the minimum signals required to confirm trip class and isolate whether the fault is electrical, thermal, or state-driven.
Drive evidence rule (trip class must be provable)
A trip label is actionable only when it can be confirmed by a time-aligned snapshot: DC link + current sense + driver status + temperature + state bits (mode/defrost).
Compressor inverter (evidence points that matter)
- Gate driver status: driver fault/UVLO flags distinguish power-stage events from control derating.
- Current sense: phase current peaks and trend confirm true OCP and identify pre-trip spikes.
- DC link monitoring: sag/spike evidence supports UVLO/OVP classification and correlates with resets.
- Mode coupling: defrost entry/exit often changes operating constraints; snapshots must include the state bits.
Valve & actuator drives (state + execution confirmation)
EEV drive
- Log: position/step count, enable, driver status (and drive current if available).
- Evidence goal: distinguish mechanical stiction from undervoltage or disabled drive.
Reversing valve
- Log: coil ON/OFF with timestamp, mode state, transition counters.
- Evidence goal: correlate transitions with capacity drops and defrost cycles.
Mixing valve
- Log: target vs actual position (if available), supply temperature response, sensor lag marker.
- Evidence goal: separate “stuck valve” from “ΔT / placement artifact” and control lag.
Circulation pump
- Log: PWM/command, tach/RPM, optional power/current trend.
- Evidence goal: prove execution and correlate RPM-to-flow consistency (ties back to H2-5).
Design constraints (minimal rugged note)
- Surge/EFT risk: DC link spikes/sags and MCU resets can mimic “random trips” if snapshots are missing.
- EMI risk: tach/flow pulses and ADC readings may glitch; use counters and plausibility checks rather than trusting a single sample.
Deliverable: trip taxonomy (OCP/OVP/UVLO/OTP) + confirmation signals
| Trip class | What it looks like | Must-log signals | Fast discriminator | Typical root causes (scope-safe) |
|---|---|---|---|---|
| OCP | Immediate shutdown or rapid derating during load increase | Phase current peak; gate driver fault/status; timestamp; mode/defrost state | Current spike aligned to trip edge | Mechanical load anomaly, short transient, incorrect current threshold, power-stage fault |
| OVP | Trip coincident with voltage spike; may happen at transitions | DC link spike snapshot; driver status; line/rectified trend (if available) | DC link exceeds limit before trip | Surge event, regen transient, clamp insufficiency, measurement bias |
| UVLO | Reset, brownout, or forced low-power mode under demand | DC link sag; MCU reset reason; driver UVLO flags; load/current context | Voltage sag with reset reason | Weak supply, inrush, PFC limit, wiring drop, excessive load step |
| OTP | Gradual derating then shutdown; repeatable after warm-up | Thermal sensor(s) trend; derating state bit; fan/pump states; timestamp | Temperature crossing threshold with derating flag | Insufficient cooling, blocked airflow (unit-level), sustained overload, sensor placement error |
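The "fast discriminator" column can be checked mechanically against a snapshot. The sketch below assumes a hypothetical snapshot dictionary and hypothetical threshold names; the numeric limits in the usage example are placeholders, not recommended settings.

```python
def confirm_trip(label, snap, limits):
    """Return True if the snapshot evidence supports the reported trip label.

    Hypothetical field names: snap holds peak/min/max values captured in the
    event window; limits holds the configured protection thresholds.
    """
    checks = {
        "OCP": snap["iphase_peak_a"] >= limits["ocp_a"],
        "OVP": snap["vdc_max_v"] >= limits["ovp_v"],
        "UVLO": snap["vdc_min_v"] <= limits["uvlo_v"],
        # OTP needs both the temperature crossing and the derating flag.
        "OTP": snap["temp_c"] >= limits["otp_c"] and snap["derating"],
    }
    return bool(checks[label])


# Usage with placeholder numbers (assumed, for illustration only).
limits = {"ocp_a": 30.0, "ovp_v": 420.0, "uvlo_v": 250.0, "otp_c": 95.0}
snap = {"iphase_peak_a": 34.2, "vdc_max_v": 385.0, "vdc_min_v": 372.0,
        "temp_c": 61.0, "derating": False}
print(confirm_trip("OCP", snap, limits))
```

A label that the snapshot does not support is itself useful evidence: it points to a measurement problem or a mislabeled fault code rather than the named root cause.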
H2-7 — Energy metering & COP estimation (electrical-in vs thermal-out)
Practical performance measurement requires two coherent stories: an electrical-input measurement and a thermal-output estimate. COP is meaningful only when ΔT and flow are trustworthy and operating states (e.g., defrost) are properly labeled.
Boundary (device-side, not HEMS)
Metering is limited to equipment-side signals (AC line or DC link, ΔT, flow). Results are for engineering diagnostics and trend verification, not utility billing or whole-home energy management.
Electrical input: where to meter (AC line vs DC link)
AC line metering
- What it proves: true equipment input including auxiliary loads (controls, pumps, valve drives).
- Accuracy tradeoffs: PF, harmonics, phase error; low-power segments require careful sampling windows.
- Best use: whole-unit trend, validation of “energy-in” across modes and transitions.
DC link metering
- What it proves: inverter-side energy that correlates tightly with capacity commands and trip evidence.
- Accuracy tradeoffs: excludes front-end losses and some auxiliary loads; boundary must be labeled.
- Best use: capacity diagnostics, derating correlation, and trip analysis (ties to H2-6).
Thermal output estimate: ΔT × flow × fluid heat capacity
- Thermal-out estimate: heat output ≈ mass flow × Cp × ΔT, where mass flow is volumetric flow × fluid density and Cp is an assumed value for the working fluid.
- Calibration note: fluid mixture and sensor scaling create systematic offsets; trend coherence matters more than a single absolute number.
- Sampling note: ΔT and flow must be evaluated in the same time window; asynchronous sampling creates artificial COP spikes.
- State note: defrost/freeze-protect segments must be labeled or excluded to avoid mixing incompatible regimes.
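A worked version of the thermal-out and COP arithmetic is sketched below. The function names are hypothetical, and the default Cp (4186 J/(kg·K)) and density (1.0 kg/L) are approximate values for plain water; a glycol mixture has a lower Cp and needs its own values.

```python
def thermal_output_w(flow_l_min, t_supply_c, t_return_c,
                     cp_j_per_kg_k=4186.0, rho_kg_per_l=1.0):
    """Estimate heat delivered to the loop in watts.

    Defaults assume plain water (approximate values); adjust for glycol mixes.
    """
    mass_flow_kg_s = flow_l_min / 60.0 * rho_kg_per_l
    return mass_flow_kg_s * cp_j_per_kg_k * (t_supply_c - t_return_c)


def cop_estimate(thermal_w, electrical_w):
    """Return thermal/electrical ratio, or None when the input power is not usable."""
    if electrical_w <= 0:
        return None
    return thermal_w / electrical_w


# Example: 10 L/min with a 5 K ΔT is roughly 3.5 kW of delivered heat.
q_w = thermal_output_w(10.0, 35.0, 30.0)
print(round(q_w), cop_estimate(q_w, 1000.0))
```

Both inputs to `cop_estimate` must come from the same time window and the same labeled metering boundary (AC line or DC link), otherwise the ratio is not comparable between runs.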
COP estimate: when it’s meaningful, when it lies
Meaningful when
- ΔT passes sanity checks (stable sign, coherent supply/return dynamics).
- Flow is consistent with pump RPM and/or ΔP evidence (no pulse loss patterns).
- Metering boundary is explicit (AC-input COP vs DC-link proxy).
Misleading when
- Bad ΔT: poor contact/insulation or heavy filtering causes lag and oscillation artifacts.
- Bad flow: bubbles, stiction, or bypass paths break coherence between RPM, ΔP and flow.
- State mixing: defrost segments included without labeling corrupt averages and trends.
Deliverable: COP confidence checklist (High / Medium / Low)
| Confidence level | Required conditions | Automatic downgrade triggers | How to use the COP number |
|---|---|---|---|
| High | ΔT sanity checks pass; flow correlates with pump RPM and/or ΔP; defrost state filtered or labeled; metering boundary explicit (AC or DC link). | Flow pulse loss, ΔT sign flipping, sensor error flags, or unlabeled defrost segments. | Use for performance trend, comparative commissioning checks, and remote verification of stability changes. |
| Medium | ΔT and flow mostly stable but partial evidence missing (e.g., no ΔP); DC link metering used with boundary label; occasional missing samples tolerated. | Increasing noise metrics, rising retry counters, or inconsistent RPM-to-flow response. | Use for trend direction only; avoid absolute claims; prioritize additional proof signals in snapshots. |
| Low | ΔT or flow not trustworthy (noise, drift, dropouts); defrost state not available; sampling windows not aligned; metering boundary unclear. | Frequent invalid flags; counters indicate missing/CRC errors; state mixing suspected. | Do not interpret COP; first repair measurement integrity (ΔT/flow) and logging schema (H2-8). |
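The checklist can be applied as a small grading function so every reported COP carries its confidence level. This is a simplified sketch: `cop_confidence` and its flag names are hypothetical, and the mapping collapses the table's downgrade triggers into a few booleans that a real controller would derive from validity flags and counters.

```python
def cop_confidence(dt_valid, flow_valid, defrost_labeled, boundary_explicit,
                   windows_aligned=True, cross_check_complete=True):
    """Grade a COP estimate as 'High', 'Medium' or 'Low' (simplified sketch).

    cross_check_complete means flow has been confirmed against pump RPM
    and/or ΔP; when that evidence is partial the grade drops to Medium.
    """
    required = (dt_valid, flow_valid, defrost_labeled,
                boundary_explicit, windows_aligned)
    if not all(required):
        # Any untrusted input or unlabeled state makes the number unusable.
        return "Low"
    return "High" if cross_check_complete else "Medium"
```

Storing the grade next to the COP value keeps a Low-confidence number from being quoted later as if it were a measurement.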
H2-8 — Remote diagnostics: what to log so you can actually debug
Remote diagnostics works only when logs contain time-aligned proof signals. This chapter defines an event-first logging model with snapshot bundles, counters, and communication reliability evidence—without drifting into platform architecture.
Remote diagnostics rule (event + snapshot + counters)
Every actionable fault must be supported by timestamped events, a snapshot bundle captured at the fault edge, and counters that turn intermittent issues into measurable evidence.
Event log structure (make faults self-explanatory)
- Required fields: timestamp, sequence ID, fault code, trip class, operating state (heating/defrost/freeze-protect), snapshot ID.
- Why sequence ID matters: reveals drops, reordering, and overwrites that otherwise look like “random missing data.”
- Why state bits matter: defrost and protection overrides change setpoints; events without state are not diagnosable.
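A short sketch of how a receiver might use the sequence ID follows. `sequence_gaps` is a hypothetical helper that assumes IDs increase by one per event and does not handle counter wraparound. A late arrival is counted as reordering but is not removed from the gap list.

```python
def sequence_gaps(seq_ids):
    """Return (missing_ranges, out_of_order_count) for IDs in arrival order."""
    missing = []
    out_of_order = 0
    prev = None  # highest ID seen so far
    for s in seq_ids:
        if prev is not None:
            if s > prev + 1:
                # IDs prev+1 .. s-1 never arrived (or have not arrived yet).
                missing.append((prev + 1, s - 1))
            elif s <= prev:
                # Duplicate or late record; keep the high-water mark unchanged.
                out_of_order += 1
                continue
        prev = s
    return missing, out_of_order


print(sequence_gaps([1, 2, 3, 6, 7, 5, 8]))
```

For the example input the function reports the range 4 to 5 as missing and one out-of-order record, which is the kind of evidence that separates a lossy link from a controller that never generated the events.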
Snapshot bundle (minimum useful set)
- Tsupply, Treturn, ΔT
- Flow
- Pump RPM (tach)
- Mixing valve position
- Compressor speed
- Inverter current (Iphase)
- DC link voltage (Vdc)
- Pressures / ΔP (if available)
- Ambient / frost state
- Defrost state and transition markers
- ΔT_valid, Flow_valid
- Sensor CRC / error flags
- Pulse-loss / missing-sample indicators
Counters (turn intermittent faults into statistics)
- Defrost counters: defrost count, defrost duration histogram (if feasible), entry/exit timestamps.
- Trip counters: count by class (OCP/OVP/UVLO/OTP), plus “last trip timestamp” per class.
- Retry counters: restart attempts, compressor retry, comm reconnect.
- Sensor integrity: ADC out-of-range count, CRC error count, flow pulse-loss count.
Communications (RS-485/Modbus or Ethernet) — reliability evidence only
RS-485 / Modbus
- CRC error count, frame timeout count, retry count
- bus idle time anomalies, re-init count
- sequence gaps (if events are polled)
Ethernet
- link up/down count, reconnect count
- socket reset / keepalive timeout
- packet drop indicator (if available)
Deliverable: minimum viable log schema (MVLS)
- Event record: timestamp · sequence_id · fault_code · trip_class · mode_state · snapshot_id · comm_status_summary (optional)
- Snapshot record: Tsupply · Treturn · ΔT · Flow · Pump_RPM · MixingValve_Pos · Compressor_Speed · Vdc · Iphase · Pressure/ΔP (opt) · Ambient/Frost · Defrost_State · validity_flags
- Counters: defrost_count · trip_count_by_class · reboot_count · retry_count · comm_crc_err · comm_timeout · comm_retry · link_flap · flow_pulse_loss · sensor_crc_err
- Trigger strategy: on_trip · on_defrost_enter/exit · on_flow_drop/pulse_loss · periodic baseline snapshot
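One possible in-memory shape for the event and snapshot records is sketched below. The field names mirror the schema above, but the units, types, and class names are assumptions made for illustration; an embedded controller would more likely use a packed binary layout.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Snapshot:
    """Snapshot record (units in the field names are assumed)."""
    t_supply_c: float
    t_return_c: float
    flow_l_min: float
    pump_rpm: float
    mixing_valve_pos_pct: float
    compressor_speed_rpm: float
    vdc_v: float
    iphase_a: float
    ambient_c: float
    defrost_state: str
    dp_kpa: Optional[float] = None              # optional pressure / ΔP
    validity_flags: dict = field(default_factory=dict)

    @property
    def delta_t_k(self) -> float:
        # Derived instead of stored so it can never disagree with its inputs.
        return self.t_supply_c - self.t_return_c


@dataclass
class Event:
    """Event record that points at its snapshot by ID."""
    timestamp_ms: int
    sequence_id: int
    fault_code: str
    trip_class: str
    mode_state: str
    snapshot_id: int
```

Deriving ΔT from the stored supply and return values is a design choice for this sketch; a controller that logs a filtered ΔT would store it as its own field and flag it accordingly.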
H2-9 — Protection & safety logic inside THIS controller (keep it local)
Protection is actionable only when it is expressed as controller-local triggers, deterministic actions, and mandatory proof logs. This section keeps safety tied to heat pump + hydronics behavior, not certification theory.
Local safety scope (what “local” means)
Local protection reacts using available sensors (ΔT, flow, pressures/ΔP, Vdc, Iphase, ambient/frost, state bits) and local actuators (compressor inhibit, pump force, mixing valve clamp, alarm/event logging).
Freeze protection (conditions → actions)
- Trigger candidates: low ambient/frost state, low Tsupply/Treturn, or persistent near-zero ΔT while flow exists.
- Primary actions: force pump to circulate, clamp mixing valve to a safe position, and mark state as freeze-protect.
- Escalation: if flow is absent or sensors invalid, upgrade to alarm + lockout with evidence snapshot.
Over-temp / floor safety (mixing valve + pump response)
- Trigger candidates: Tsupply above floor safety limit or rapid temperature rise with stable pump command.
- Primary actions: clamp mixing valve to reduce supply temperature; keep pump running to remove residual heat.
- Stability note: apply hysteresis/hold time to avoid oscillation between comfort control and protection.
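The hysteresis and hold-time note can be illustrated with a small latch. The `OverTempClamp` class and its default numbers (45 °C limit, 3 K hysteresis, 60 s hold) are placeholders chosen for the example, not recommended floor safety limits.

```python
class OverTempClamp:
    """Over-temperature latch with hysteresis and minimum hold time (sketch)."""

    def __init__(self, limit_c=45.0, hysteresis_k=3.0, hold_s=60.0):
        self.limit_c = limit_c
        self.hysteresis_k = hysteresis_k
        self.hold_s = hold_s
        self.active = False
        self._since_s = None  # time at which protection was entered

    def update(self, t_supply_c, now_s):
        """Return True while the mixing valve should stay clamped."""
        if not self.active:
            if t_supply_c >= self.limit_c:
                self.active = True
                self._since_s = now_s
        else:
            cooled = t_supply_c <= self.limit_c - self.hysteresis_k
            held = (now_s - self._since_s) >= self.hold_s
            # Release only when BOTH conditions hold, to avoid rapid toggling.
            if cooled and held:
                self.active = False
        return self.active
```

Requiring both the temperature margin and the hold time before release is what prevents comfort control and protection from fighting each other near the limit.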
Dry-run / no-flow protection (prove circulation)
- Trigger candidates: pump RPM present but flow ≈ 0 (or pulse-loss pattern), or ΔP signature indicates blockage/air lock.
- Primary actions: stop or limit compressor demand, force pump purge window, and capture “no-flow” proof bundle.
- Discriminator: differentiate “true no-flow” vs “flow sensor failure” using RPM↔ΔP↔ΔT coherence.
High-pressure / low-pressure trip evidence mapping
High-pressure (HP) trip
- Pressure rising trend into limit + compressor speed/current correlation.
- State bits (heating/defrost) captured to avoid mislabeling transitions.
- Action: compressor inhibit + fault event + retry policy with lockout timer.
Low-pressure (LP) trip
- Pressure falling/unstable + capacity command present + mode state captured.
- Check sensor validity flags and sampling window alignment.
- Action: compressor inhibit + fault event + conditional restart after stabilization.
Leak / abnormal ΔP detection (hydronic)
- Abnormal ΔP: ΔP increases while flow does not increase → likely restriction/clog; ΔP noise spikes → bubbles/cavitation patterns.
- Leak tendency (local evidence): pressure decay after pump off, unexpected refill behavior, or persistent mismatch between RPM and delivery evidence.
- Action goal: downgrade capacity demand and preserve evidence snapshots; avoid “silent failure” without logs.
Deliverable: Protection action matrix (trigger → action → what you must log)
| Trigger (local condition) | Immediate action (actuators) | Lockout / recovery rule | Must-log bundle (snapshot + counters) | Why it matters here |
|---|---|---|---|---|
| Freeze risk: ambient/frost low + Tsupply/Treturn below threshold | Force pump · clamp mixing valve · set Freeze-Protect state | Exit only after temp margin holds for N minutes; prevent rapid toggling | Tsupply/Treturn/ΔT · Flow · Pump_RPM · Mixing_Pos · Ambient/Frost · Mode_State · defrost_count | Prevents freezing while preserving proof signals for remote verification |
| Over-temp: Tsupply above safety limit or rapid rise | Clamp mixing valve down · keep pump running · reduce compressor demand | Hysteresis + minimum hold time; log exit condition | Tsupply/Treturn/ΔT · Flow · Pump_RPM · Mixing_Pos · Compressor_Speed · Vdc/Iphase | Floor safety depends on valve+flow coupling, not a single sensor |
| No-flow / dry-run: Pump_RPM present but Flow≈0 or pulse-loss | Inhibit compressor · purge/force pump window · raise alarm event | Retry limited; lockout escalates if repeated within time window | Flow · pulse_loss_count · Pump_RPM · ΔP/pressure (opt) · ΔT trend · trip_count_by_class | Separates delivery failure from capacity failure and prevents damage |
| HP trip: pressure rising into limit with correlated Iphase | Compressor inhibit · fault event · controlled restart policy | Restart only after stabilization; backoff timers; class-based counters | Pressure · Compressor_Speed · Iphase · Vdc · Mode_State · fault_code · sequence_id | Evidence mapping avoids false blame on ΔT/flow when root is capacity-side |
| LP trip: pressure falling/unstable with demand present | Compressor inhibit · fault event · conditional restart | Require stable pressure window; invalidate COP during LP fault | Pressure · Compressor_Speed · Vdc/Iphase · validity_flags · retry_count · fault_code | Prevents misleading COP and supports remote root-cause narrowing |
| Abnormal ΔP: ΔP↑ while flow not rising; or ΔP noise spikes | Reduce capacity demand · keep circulation as needed · log anomaly | Clear only when coherence returns (RPM↔flow↔ΔP) for N samples | ΔP/pressure · Flow · Pump_RPM · Tsupply/Treturn · comm_crc_err (if relevant) | Hydronic restrictions and air patterns are diagnosable only with ΔP evidence |
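The "retry limited; lockout escalates if repeated within time window" rule in the no-flow row can be sketched as a sliding-window counter. This is a minimal illustration, not the controller's actual logic: the class name `NoFlowGuard`, the thresholds, and the window length are all hypothetical, and `evaluate` is assumed to be called once per detected trip event rather than on every sample.

```python
from collections import deque


class NoFlowGuard:
    """Sliding-window retry limiter for the no-flow / dry-run trigger.

    All thresholds are illustrative assumptions, not values from this page.
    """

    def __init__(self, max_retries=3, window_s=600.0,
                 min_rpm=500.0, min_flow_lpm=0.5):
        self.max_retries = max_retries
        self.window_s = window_s
        self.min_rpm = min_rpm
        self.min_flow_lpm = min_flow_lpm
        self._trips = deque()      # timestamps of recent no-flow trips
        self.locked_out = False

    def is_no_flow(self, pump_rpm, flow_lpm):
        # Pump is demonstrably turning, but measured flow is close to zero.
        return pump_rpm >= self.min_rpm and flow_lpm < self.min_flow_lpm

    def evaluate(self, now_s, pump_rpm, flow_lpm):
        """Return 'ok', 'retry' (inhibit compressor and purge) or 'lockout'."""
        if self.locked_out:
            return "lockout"       # latched until a manual or policy reset
        if not self.is_no_flow(pump_rpm, flow_lpm):
            return "ok"
        # Forget trips that have aged out of the window, then record this one.
        while self._trips and now_s - self._trips[0] > self.window_s:
            self._trips.popleft()
        self._trips.append(now_s)
        if len(self._trips) > self.max_retries:
            self.locked_out = True
            return "lockout"
        return "retry"
```

Each `retry` or `lockout` result would be the point at which the must-log bundle from the table is captured, so the escalation itself is visible in the event history.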
H2-10 — IC/function selection map (with example part categories)
The selection map is organized by function blocks and the evidence/control needs of this controller. It provides procurement-ready requirements without turning into a brand catalog.
Selection principles (tied to this page’s evidence chain)
- Diagnosability first: time-aligned snapshots (ΔT/flow/pressure + Vdc/Iphase + state bits) must be feasible.
- Stability first: sensing latency/noise influences mixing valve control and COP confidence.
- Recoverability first: log memory write strategy determines whether remote debugging is possible after trips/resets.
Deliverable: compact selection table (category-level MPN types)
| Function block | Key specs (what to require) | Why it matters here | Example MPN types (category-level) |
|---|---|---|---|
| MCU / Control SoC | motor-control PWM resources; multi-channel ADC; fast capture window for events; safety IO (fault pins, watchdog); comm interfaces (RS-485 UART, Ethernet MAC/PHY support). | Enables MVLS snapshots and protection actions without timing drift; supports state bits and counters at fault edges. | Motor-control MCU class · Industrial MCU class · MCU + external ADC architecture |
| Gate driver (inverter) | robust fault reporting (OC/UVLO/desat class); compatible with power stage; fast fault response; clear fault pin/status readability. | Trip taxonomy requires proof (driver fault + current + Vdc) to distinguish real power-stage faults from control derates. | 3-phase gate driver family · Half-bridge driver family · Intelligent power module driver class |
| Current sense path | shunt/Hall options; bandwidth for trip capture; dynamic range; noise immunity; optional isolation where needed. | OCP classification and “spike vs sustained” evidence depends on capture fidelity around trip edges. | Shunt amplifier family · Hall current sensor IC class · Isolated current sense amplifier class |
| DC link monitoring | stable divider/reference; ADC range and sampling strategy; ripple/undervoltage event capture capability. | UVLO/OVP evidence and correlation with compressor demand is only credible with Vdc snapshots. | High-voltage ADC front-end class · Precision reference family · Supervisor/monitor IC class |
| Energy metering (optional AC input) | real power + PF handling; harmonic tolerance; line-voltage/current sensing interfaces; calibration support. | Enables AC-boundary COP and system-level energy-in trend; avoids ambiguous “DC-only” energy claims. | Single/3-phase energy metering IC family · Isolated metering AFE class |
| Temp sensor interface (NTC/RTD) | low-noise excitation/reference; wiring error detect; linearization support; sampling/filter knobs for loop stability. | ΔT quality controls COP validity and mixing valve stability; bad ΔT creates false diagnostics and oscillations. | RTD interface IC family · Precision ADC + front-end class · NTC AFE class |
| Pressure / ΔP conditioning | ratiometric conditioning; protection against transients; noise filtering without excessive lag; diagnostics flags. | Abnormal ΔP/leak evidence depends on stable trends; laggy filters hide bubbles/clog signatures. | Instrumentation amplifier class · Sensor AFE class · Pressure interface ADC family |
| EEV stepper / coil drive | step control + stall/OC detect (if available); position tracking; fault reporting; thermal robustness. | Remote debugging needs “command vs position vs effect” evidence to separate actuator failure from sensing errors. | Stepper motor driver family · Coil driver / smart switch class |
| Mixing valve actuator drive | H-bridge/stepper/servo interface (by actuator type); position feedback capture; fault detect. | Over-temp protection and supply shaping rely on deterministic valve response and measurable position evidence. | H-bridge driver family · Stepper/actuator driver class · Position feedback interface class |
| Pump drive interface | PWM/analog command support; tach capture; optional current/power trend; EMC-hardened input filtering. | No-flow discrimination requires RPM evidence and coherence with flow/ΔP; without tach the diagnosis collapses. | BLDC/EC motor controller class · Tach capture timer + driver interface class |
| Nonvolatile memory for logs | endurance; power-fail robustness; ring-buffer strategy; write amplification control; fast commit for events. | MVLS needs reliable preservation of “trip edge” snapshots; without robust NVM, remote debugging fails after resets. | FRAM family · SPI NOR flash class · EEPROM class (limited endurance) · Wear-leveling strategy |
| RS-485 / Ethernet physical layer (optional) | robust transceiver; ESD protection capability; error counters visibility; link status reporting. | Remote diagnostics requires reliability evidence (CRC/timeouts/reconnect) to separate comm faults from system faults. | RS-485 transceiver family · Ethernet PHY class · Isolated transceiver class |
Note: Example MPN types are intentionally category-level. Concrete reference part numbers for selected blocks are listed at the end of H2-11.
H2-11 — Validation & commissioning test plan (bench → loop → field)
A repeatable commissioning plan must prove control integrity, hydronic delivery, metering credibility, and remote diagnosability using pass/fail signals.
Staging model
Each stage uses a checklist and acceptance metrics tied to controller-visible signals (Tsupply/Treturn/ΔT, Flow, Pump_RPM, Mixing_Pos, Compressor_Speed, Vdc, Iphase, State bits, Fault codes, Counters).
Bench bring-up (prove rails, sensing, and actuation)
- Measure (rails and reset): Vdc + low-voltage rails (3.3V/5V as applicable), reset reason, reboot counter.
- Pass: no reset loops; rail dips do not coincide with false trips; reboot_count stays stable.
- Fail signals: Vdc sag events without load change; repeated reset_reason patterns; log gaps.
- Measure (current sense): Iphase offset/gain at known points; trip-edge capture window; driver fault pin status.
- Pass: stable offset; gain error within target; trip-edge snapshots include Vdc + Iphase + fault pin.
- Fail signals: drifting offset; clipped peaks; inconsistent Iphase vs commanded compressor speed.
- Measure (EEV/valve actuation): command vs position/step count (if available), coil drive status, response time trend.
- Pass: position changes are observable; faults are latched with event snapshots.
- Fail signals: command issued but no movement evidence; frequent coil faults without current proof.
- Measure (pump and flow): Pump_RPM tach capture; Flow response; ΔP/pressure trend (if available).
- Pass: RPM ↑ leads to Flow ↑ and ΔP trend coherence; pulse-loss counter remains low.
- Fail signals: RPM present but Flow≈0; ΔP noise spikes; tach dropout under EMI events.
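The "Iphase offset/gain at known points" check can be reduced to a straight-line fit of ADC counts against reference current. The sketch below assumes a linear sense path and uses hypothetical names (`fit_offset_gain`, `gain_error_pct`); the pass threshold for gain error is left to the project's own target.

```python
def fit_offset_gain(reference_amps, adc_counts):
    """Least-squares fit of counts = gain * amps + offset.

    Returns (offset_counts, gain_counts_per_amp). Assumes a linear sense path.
    """
    n = len(reference_amps)
    if n < 2 or n != len(adc_counts):
        raise ValueError("need at least two matched calibration points")
    mean_x = sum(reference_amps) / n
    mean_y = sum(adc_counts) / n
    sxx = sum((x - mean_x) ** 2 for x in reference_amps)
    if sxx == 0:
        raise ValueError("reference currents must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(reference_amps, adc_counts))
    gain = sxy / sxx
    offset = mean_y - gain * mean_x
    return offset, gain


def gain_error_pct(measured_gain, nominal_gain):
    """Signed gain error relative to the design value, in percent."""
    return 100.0 * (measured_gain - nominal_gain) / nominal_gain
```

Repeating the fit at bench start and after a thermal soak gives a simple number for the "drifting offset" fail signal: the change in `offset` between the two runs.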
Loop tests (prove ΔT/flow/mixing stability and transitions)
- ΔT step response: apply mixing valve step or capacity step → validate Tsupply/Treturn alignment, ΔT lag, and noise floor.
- Flow sweep: sweep Pump_RPM across range → verify Flow and ΔP coherence; identify air lock/clog patterns.
- Mixing stability: hold supply target → evaluate overshoot, oscillation, valve activity rate, and settle time.
- Freeze/defrost transitions: enter/exit transitions → ensure actions match protection matrix and logs include mode/state bits.
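For the mixing-stability test, overshoot and settle time can be extracted from a logged Tsupply trace with a few lines of post-processing. This is a sketch under stated assumptions: the function name `step_metrics` and the ±0.5 band are hypothetical, and the trace is assumed to be a rising step toward a fixed target.

```python
def step_metrics(times_s, values, target, band=0.5):
    """Return (overshoot, settle_time_s) for a rising step toward `target`.

    overshoot: largest excursion above target (0.0 if none), in value units.
    settle_time_s: start of the final run of samples that all stay within
                   +/- band of target, or None if the last sample is outside.
    """
    if not values or len(times_s) != len(values):
        raise ValueError("times and values must be non-empty and equal length")
    overshoot = max(0.0, max(values) - target)
    settle_time = None
    # Walk backwards from the end; stop at the first out-of-band sample.
    for t, v in zip(reversed(times_s), reversed(values)):
        if abs(v - target) > band:
            break
        settle_time = t
    return overshoot, settle_time
```

Valve activity rate can be derived the same way by counting Mixing_Pos changes over the hold window, which keeps all three stability metrics tied to logged signals.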
Metering tests (protect COP credibility)
- Known operating points: compare electrical-in trend vs thermal-out trend at several stable points (avoid chasing absolute COP only).
- Offset checks: verify ΔT offset and Flow scaling; confirm “COP confidence” downgrades when sensors become unreliable.
- Sanity cases: introduce pulse loss or ΔT lag → confirm COP is flagged low-confidence rather than reported as a clean number.
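The metering checks rest on one calculation: thermal power from flow and ΔT, divided by electrical input, with a confidence grade attached. The sketch below assumes plain water at roughly 1 kg per litre (glycol mixes need different constants) and uses hypothetical grading rules; the page only specifies that unreliable sensors and mixed states must downgrade confidence.

```python
CP_WATER_KJ_PER_KG_K = 4.186   # assumption: plain water, ~1 kg per litre


def thermal_kw(flow_lpm, delta_t_k):
    """Thermal power in kW from flow (L/min) and supply-return delta-T (K)."""
    return (flow_lpm / 60.0) * CP_WATER_KJ_PER_KG_K * delta_t_k


def cop_with_confidence(flow_lpm, delta_t_k, elec_kw, *, dt_valid=True,
                        flow_valid=True, pulse_loss=False, in_defrost=False,
                        min_delta_t_k=2.0):
    """Return (cop, grade) where grade is 'High', 'Medium' or 'Low'.

    The grading thresholds are illustrative assumptions.
    """
    if elec_kw <= 0:
        return None, "Low"                 # no usable electrical input
    cop = thermal_kw(flow_lpm, delta_t_k) / elec_kw
    if not (dt_valid and flow_valid) or pulse_loss or in_defrost:
        return cop, "Low"                  # sensor integrity or state gating
    if abs(delta_t_k) < min_delta_t_k:
        return cop, "Medium"               # small delta-T: offsets dominate
    return cop, "High"
```

As a worked point: 12 L/min at ΔT = 5 K gives 0.2 kg/s × 4.186 kJ/(kg·K) × 5 K ≈ 4.19 kW thermal, so 1.0 kW electrical input corresponds to a COP of about 4.2.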
Remote diagnostics tests (fault injection → log completeness verification)
- Inject no-flow: force Flow≈0 with RPM present → event must include Flow, Pump_RPM, ΔP (if available), ΔT, counters.
- Inject over-temp: exceed Tsupply threshold → event must include Tsupply/Treturn, Mixing_Pos, mode/state, action taken.
- Inject DC link sag: reduce Vdc margin → event must include Vdc, Iphase, compressor speed, and trip class (UVLO for a sag; OVP for the equivalent over-voltage injection).
- Inject comm errors: raise CRC/timeouts → logs must prove comm failure vs missing data generation (sequence gaps).
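Log completeness after each injection can be verified mechanically by comparing the captured event against a required-field set per fault class. The field names below follow the signals named in the list above, but the dictionary layout, the key spellings (for example `dT` for ΔT), and the function name are assumptions for illustration.

```python
# Required snapshot fields per injected fault class (illustrative schema).
REQUIRED_FIELDS = {
    "no_flow": {"Flow", "Pump_RPM", "dT", "pulse_loss_count"},
    "over_temp": {"Tsupply", "Treturn", "Mixing_Pos", "Mode_State", "action"},
    "vdc_sag": {"Vdc", "Iphase", "Compressor_Speed", "trip_class"},
    "comm_error": {"comm_crc_err", "timeout_count", "sequence_id"},
}


def missing_fields(fault_class, event):
    """Return a sorted list of required fields absent (or None) in `event`."""
    required = REQUIRED_FIELDS[fault_class]
    return sorted(name for name in required if event.get(name) is None)


if __name__ == "__main__":
    event = {"Vdc": 287.0, "Iphase": 11.2, "trip_class": "UVLO"}
    print(missing_fields("vdc_sag", event))   # lists the fields still missing
```

A test passes only when the returned list is empty; a non-empty list names exactly which schema fields or snapshot triggers need fixing.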
Deliverable: staged checklist + acceptance metrics (pass/fail signals)
| Stage | Step (what to do) | Stimulus | Measure (proof signals) | Pass/Fail metric | If fail, first suspect |
|---|---|---|---|---|---|
| Bench | Rail & reset integrity | Power cycle + load step | Vdc + LV rails + reset_reason + reboot_count | Pass: no reset loops; rail dips do not align with false trips | Supervisor thresholds / wiring / ground reference / sampling window |
| Bench | Current sense calibration | Known current points + trip-edge capture | Iphase offset/gain + Vdc + driver fault pin | Pass: stable offset; peaks captured without clipping; consistent correlation | Shunt amp bandwidth / ADC range / anti-alias filter / layout coupling |
| Loop | ΔT step response | Mixing valve step or capacity step | Tsupply/Treturn time series + ΔT lag/noise | Pass: bounded lag; no sign flips; noise within target | Sensor placement/insulation / filtering too heavy / sampling misalignment |
| Loop | Flow sweep coherence | Pump_RPM sweep | Flow + Pump_RPM + ΔP trend + pulse_loss_count | Pass: RPM↔Flow↔ΔP coherence; low pulse loss | Air lock / clog / wrong pump curve / flow sensor dropout |
| Field | Freeze/defrost transitions | Transition entry/exit scenarios | Mode/Defrost/Frost bits + actions + snapshots | Pass: correct action sequence; logs complete and ordered (sequence_id) | State machine gaps / missing log triggers / hysteresis too small |
| Any | Fault injection log completeness | No-flow / over-temp / Vdc sag / comm CRC | Fault code + trip class + snapshot bundle + counters | Pass: evidence fields present; counters increment; sequence continuity | Schema missing fields / snapshot trigger missing / buffer overwrite policy |
Concrete MPN examples (controller-relevant blocks used by this test plan)
The following part numbers are reference examples for selection and validation; equivalents are acceptable when key specs match. Choices depend on voltage class, isolation needs, and the targeted compressor/pump power level.
| Block | Why it is needed in H2-11/H2-12 | Concrete MPN examples (non-exhaustive) |
|---|---|---|
| Motor-control MCU | PWM/timers + ADC capture windows + fault pins + logging orchestration | ST STM32G4 series · TI C2000 F2837x series · Microchip SAM E70 series |
| 3-phase gate driver | Inverter trip proof (fault pins) and deterministic shut-down behavior | TI DRV8323 · Infineon 6EDL04I06NT · onsemi NCV7725 (driver class reference) |
| Shunt current sense amplifier | Accurate Iphase evidence near trip edges; avoids false OCP claims | TI INA240 · ADI AD8418A · TI INA181 |
| Hall current sensor IC/module | Galvanic isolation option and robust current trending for diagnostics | Allegro ACS758 · Allegro ACS70331 (GMR-based, not Hall) · LEM HO series (module family) |
| Energy metering (AC input) | Electrical-in credibility for COP confidence and commissioning reports | ADI ADE9153A · TI AMC3302 (isolated amplifier class) · Renesas RAA211x (metering class reference) |
| RTD/temperature interface | ΔT trust: stable excitation + linearization + diagnostics flags | ADI MAX31865 · TI ADS124S08 (precision ADC) · ADI ADT7310 (temp sensor option) |
| Pressure / ΔP analog front-end | Hydronic restriction/air patterns: ΔP trend evidence without excessive lag | TI INA826 (in-amp class) · ADI AD8421 · TI OPA333 (low-drift op-amp) |
| EEV / stepper driver | Command→movement evidence; fault reporting for actuator isolation | TI DRV8846 · ST L6470 · Allegro A4988 (stepper class reference) |
| Solenoid / coil driver | Reversing valve / solenoid actuation with open/short detection options | TI DRV103 · Infineon BTS500xx (smart high-side family) · ST VNQ/VN5E families |
| RS-485 transceiver | Remote diagnostics reliability evidence (CRC/timeouts) depends on robust PHY | TI THVD1450 · Analog Devices LTC2862 · Maxim/ADI MAX13487E |
| Ethernet PHY | Link up/down evidence and stable field connectivity | Microchip LAN8720A · TI DP83825I · Microchip KSZ8081 |
| Supervisor / watchdog | Proves resets and prevents silent lock-ups that erase evidence | TI TPS3823 · Maxim/ADI MAX706 · Microchip MCP131 |
| Log memory (NVM) | Preserves trip-edge snapshots for field root-cause | Fujitsu MB85RS64V (FRAM) · Cypress/Infineon S25FL (SPI NOR family) · Microchip 25AA1024 (EEPROM class) |
H2-12 — Field debug playbook (symptom → evidence → isolate → first fix)
Field debugging is fastest when each symptom is forced into an evidence chain: first 2 measurements → discriminator → first fix. Each item also lists the minimum must-log fields required for remote reproduction.
Rules for evidence-driven debugging
- Start with delivery evidence: Flow + ΔT outrank comfort impressions.
- Separate delivery vs capacity: Pump_RPM/ΔP coherence distinguishes circulation faults from compressor-side faults.
- Demand logs before conclusions: if snapshots are missing, treat it as an observability fault and fix logging first.
S1 “Supply temp looks OK, rooms still cold”
First 2 measurements: Flow + ΔT (Tsupply − Treturn).
Discriminator: Flow present but ΔT very small → mixing over-dilution / too much flow / ΔT lag; Flow weak or pulse-loss → delivery failure (air lock, clog, closed valve).
First fix: verify RPM↔Flow↔ΔP coherence; purge air / confirm valve open state; reduce filter lag on temp chain if ΔT shows unrealistic phase shift.
Must-log: Tsupply/Treturn/ΔT, Flow, Pump_RPM, ΔP/pressure (if available), Mixing_Pos, validity_flags.
S2 “Short-cycling (frequent on/off)”
First 2 measurements: ΔT trend + compressor speed (or run-time window).
Discriminator: ΔT near zero with high flow → capacity not transferring to slab (mixing/flow mismatch); ΔT oscillates with large lag → temp sensing/filtering causing control instability.
First fix: run ΔT sanity checks (sign stability, lag window); tune mixing control hysteresis/hold time; adjust pump command to keep ΔT in a stable band.
Must-log: Tsupply/Treturn/ΔT, Flow, Pump_RPM, Mixing_Pos, Compressor_Speed, mode_state.
S3 “Compressor trips on cold days”
First 2 measurements: Vdc sag + Iphase peak (trip-edge window).
Discriminator: trips coincide with defrost transitions/state changes → transient management; Vdc sag without state transition → power margin/UVLO thresholds; Iphase spikes with stable Vdc → current-sense/driver behavior.
First fix: enforce trip-edge snapshot capture (Vdc, Iphase, driver fault); adjust ramp limits during transitions; validate UVLO/OCP thresholds against real waveforms.
Must-log: Vdc, Iphase, Compressor_Speed, fault_code, trip_class, mode/defrost state, sequence_id, trip counters.
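The S3 discriminator maps directly onto a small decision function evaluated over the trip-edge window. The sketch below follows the order of the three cases in the text; the function name, the 10 % sag fraction, and the returned labels are hypothetical.

```python
def classify_cold_day_trip(vdc_min, vdc_nominal, iphase_peak, iphase_limit,
                           state_changed, sag_fraction=0.10):
    """Classify a compressor trip from trip-edge snapshot values.

    state_changed: True if a defrost/mode transition falls inside the window.
    sag_fraction:  illustrative threshold for calling a Vdc dip a sag.
    """
    sagged = vdc_min < vdc_nominal * (1.0 - sag_fraction)
    if state_changed:
        return "transient_management"       # trip coincides with transition
    if sagged:
        return "power_margin_or_uvlo"       # sag without a state change
    if iphase_peak >= iphase_limit:
        return "current_sense_or_driver"    # spike while Vdc stayed stable
    return "unclassified"
```

An `unclassified` result is itself useful evidence: it usually means the snapshot window missed the trip edge, which points back to the logging fix listed first.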
S4 “Frequent defrost / heat drops”
First 2 measurements: frost/defrost state evidence + valve position (EEV/Mixing where available).
Discriminator: defrost triggered without matching frost evidence → sensor/threshold issue; defrost triggered with evidence but delivery collapses → hydronic side not buffered (flow/mixing response).
First fix: log defrost entry/exit snapshots; verify transitions do not override pump/mixing into no-delivery; adjust state gating so COP and comfort are not computed across mixed modes.
Must-log: frost_state, defrost_state, Tsupply/Treturn/ΔT, Flow, Pump_RPM, valve positions, counters.
S5 “COP looks great/bad but feels wrong”
First 2 measurements: the COP confidence inputs (ΔT validity + Flow validity) and the state bits (defrost/freeze-protect).
Discriminator: ΔT or Flow invalid / pulse-loss present → COP is not trustworthy; state mixing (defrost windows included) → COP polluted.
First fix: enforce COP confidence gating (High/Medium/Low); fix ΔT placement/insulation and flow measurement coherence before interpreting COP.
Must-log: ΔT, Flow, validity_flags, mode/defrost state, electrical-in boundary marker (AC vs DC link if used).
S6 “Remote faults but no actionable data”
First 2 measurements: snapshot bundle completeness + sequence_id continuity.
Discriminator: missing fields with continuous sequence → schema gap; sequence gaps + CRC/timeouts rising → communication loss; continuous comm but no new events → event trigger missing.
First fix: implement minimum viable log schema: fault_code + trip_class + snapshot_id + counters; add triggers at trip edges and state transitions; protect ring buffer from overwrite of critical events.
Must-log: fault_code, trip_class, snapshot fields, sequence_id, comm_crc_err/timeout/retry counters.
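The three S6 cases can be told apart from sequence_id continuity plus a few counters. This is a minimal sketch: the function and argument names are hypothetical, and sequence_id wrap-around is deliberately ignored to keep the example short.

```python
def classify_log_gap(seq_ids, missing_field_count, crc_err_delta,
                     new_event_count):
    """Classify why remote fault data is not actionable.

    seq_ids:             sequence_id values received, in arrival order
    missing_field_count: required snapshot fields absent from the events
    crc_err_delta:       growth of CRC/timeout counters over the same period
    new_event_count:     events generated since the fault was reported
    """
    # Count breaks in the sequence (wrap-around not handled in this sketch).
    gaps = sum(1 for a, b in zip(seq_ids, seq_ids[1:]) if b - a != 1)
    if gaps > 0 and crc_err_delta > 0:
        return "communication_loss"
    if gaps == 0 and missing_field_count > 0:
        return "schema_gap"
    if gaps == 0 and new_event_count == 0:
        return "event_trigger_missing"
    return "inconclusive"
```

Sequence gaps without rising comm counters fall into `inconclusive` here; in practice that combination often deserves a look at the ring-buffer overwrite policy.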
S7 “Supply temperature oscillates; valve ‘hunts’”
First 2 measurements: Tsupply trend + Mixing_Pos activity rate.
Discriminator: Tsupply oscillation with slow ΔT sensors → measurement lag; oscillation with stable sensors → loop gain/hysteresis/hold-time mismatch.
First fix: reduce excessive filtering delay; add minimum hold time on valve moves; validate ΔT sampling alignment to prevent control chasing stale data.
Must-log: Tsupply/Treturn/ΔT, Mixing_Pos, Flow, Pump_RPM, validity flags.
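The "minimum hold time on valve moves" fix can be sketched as a gate between the loop output and the actuator command, combined with a small deadband. The class name, the 30 s hold, and the 1 % deadband are illustrative assumptions, not tuned values.

```python
class ValveMoveGate:
    """Suppress valve moves that are too small or too soon after the last one."""

    def __init__(self, min_hold_s=30.0, deadband_pct=1.0, initial_pct=0.0):
        self.min_hold_s = min_hold_s
        self.deadband_pct = deadband_pct
        self.position_pct = initial_pct   # last commanded position
        self._last_move_s = None          # time of the last accepted move

    def request(self, now_s, target_pct):
        """Return the position to command; the previous one if gated."""
        if abs(target_pct - self.position_pct) < self.deadband_pct:
            return self.position_pct      # change too small to act on
        if (self._last_move_s is not None
                and now_s - self._last_move_s < self.min_hold_s):
            return self.position_pct      # still inside the hold window
        self.position_pct = target_pct
        self._last_move_s = now_s
        return self.position_pct
```

A gate like this should sit only in the comfort control path; protection actions such as the over-temp clamp would reasonably bypass it so that safety response time is not extended by the hold window.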
S8 “Pump runs, but heat delivery collapses intermittently”
First 2 measurements: Pump_RPM + Flow (plus ΔP if available).
Discriminator: RPM steady but Flow drops with ΔP spikes → air/cavitation; RPM steady but Flow drops with ΔP rising trend → clog/filter restriction; Flow signal drops while ΔP/RPM coherent → flow sensor dropout.
First fix: purge air and verify filter/strainer; use coherence checks to isolate sensor failure; add pulse-loss counter thresholds to trigger “no-flow” protection with evidence snapshots.
Must-log: Flow, Pump_RPM, ΔP/pressure (if available), pulse_loss_count, ΔT trend, event timestamps.
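The S8 discriminator can be approximated by comparing the first and second half of a logged window for each signal. The sketch below is deliberately crude and assumes ΔP is available and non-zero; the function name, the fractional thresholds, and the output labels are all hypothetical.

```python
from statistics import mean, pstdev


def _halves(samples):
    mid = len(samples) // 2
    return samples[:mid], samples[mid:]


def classify_delivery_collapse(rpm, flow, dp, flow_drop_frac=0.3,
                               steady_frac=0.05, dp_rise_frac=0.15,
                               dp_noise_frac=0.10):
    """Classify an intermittent delivery collapse from one logged window.

    rpm, flow, dp: equal-period sample lists covering before and after the
    collapse. Thresholds are illustrative fractions of the 'before' mean.
    """
    if min(len(rpm), len(flow), len(dp)) < 4:
        raise ValueError("need at least 4 samples per signal")
    rpm_a, rpm_b = (mean(h) for h in _halves(rpm))
    flow_a, flow_b = (mean(h) for h in _halves(flow))
    dp_before, dp_after = _halves(dp)
    dp_a, dp_b = mean(dp_before), mean(dp_after)

    rpm_steady = abs(rpm_b - rpm_a) <= steady_frac * rpm_a
    flow_dropped = flow_b < (1.0 - flow_drop_frac) * flow_a
    if not (rpm_steady and flow_dropped):
        return "not_applicable"
    if pstdev(dp_after) > dp_noise_frac * dp_a:
        return "air_or_cavitation"        # dP spikes around the collapse
    if dp_b > (1.0 + dp_rise_frac) * dp_a:
        return "clog_or_restriction"      # dP rising trend
    return "flow_sensor_dropout"          # dP and RPM still coherent
```

If ΔP is not fitted, the same structure can be reused with ΔT trend as the third signal, at the cost of a slower and lower-confidence verdict.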
H2-13 — FAQs (evidence-anchored, controller-local)
Each answer lands back on measurable signals: ΔT, Flow/ΔP, drives (Vdc/Iphase), metering boundary, and log completeness. No thermostat UX, no zoning/TRVs, no cloud platform design.
FAQ 01 Why is ΔT sometimes “negative” for a few minutes—sensor placement or mixing valve behavior?
FAQ 02 Radiant loop warms slowly: check flow first or compressor capacity first—what proves it?
FAQ 03 COP estimate jumps wildly: which two signals are usually lying?
FAQ 04 Short-cycling: is it ΔT control instability or a minimum compressor speed limit?
FAQ 05 Pump running but no heat delivered: air lock vs clogged filter—what evidence separates them?
FAQ 06 Mixing valve keeps hunting: is it sensor lag, wrong PID, or actuator deadband?
FAQ 07 Compressor trips only during defrost transitions: what snapshot must be logged?
FAQ 08 Supply temperature is capped for floor safety: how to prove it’s protection logic, not capacity loss?
FAQ 09 Flow sensor reads OK but ΔP suggests otherwise: which one to trust, and how to validate it quickly?
FAQ 10 Electrical input metering disagrees with the utility meter: where is the measurement boundary wrong?
FAQ 11 Remote diagnostics says “EEV fault”: how to distinguish driver fault vs mechanical sticking remotely?
FAQ 12 After a power outage, the system is unstable for 10 minutes: which states must be restored/logged?
Notes: Answers intentionally reference controller-visible evidence only. If a signal is not available (e.g., ΔP), substitute the nearest coherence proof (RPM↔Flow↔ΔT) and log the confidence downgrade.