Telemetry & Ward Gateway (BLE/Wi-Fi/Cellular, ULP Power)
Core takeaway
A ward telemetry gateway saves power by running on a strict state machine (sleep → batch → transmit) and stays reliable by buffering data and enforcing bounded retries under weak RF. Power-loss hold-up is sized from a clear “must-finish” task list so critical data commits and graceful shutdown still complete during brownouts and outages.
H2-1 · What is a Ward Telemetry Gateway (and what it is not)
Practical definition (useful in design reviews)
A ward telemetry gateway is a ward-level aggregator that collects short-range wireless data (often BLE) from many bedside/wearable nodes,
batches and buffers it, and then performs reliable uplink over Wi-Fi and/or cellular—while meeting 24/7 uptime,
low-power operation, and power-loss data protection.
System boundary (prevents topic overlap)
This page covers
- Ward topology: many nodes → gateway → Wi-Fi/cellular uplink, including aggregation/buffering and retry logic.
- Gateway constraints: coverage & roaming symptoms, uptime, energy budget, data integrity, serviceability.
- Low-power architecture: Always-On (AON) vs Radio domains, wake sources, duty-cycle scheduling.
- Power-loss strategy: brownout detect → flush buffer → safe shutdown (hold-up concept).
This page does not cover (link out only)
- Bedside wired comms / time sync (PTP/TSN) architectures (see “Bedside / ICU Monitor Comms”).
- Hospital core network design and IT policy details (only interface expectations are mentioned here).
- Imaging data paths (frame grabbers, PCIe/DMA, recorder pipelines).
- Security deep dive (secure boot/HSM/TRNG) and EMC/isolation handbooks (only tests & boundaries are referenced).
Design targets (what must be true to call it a “gateway”)
- Aggregation: can manage many leaf nodes without scan/connect storms; supports grouping and scheduled collection windows.
- Buffering: absorbs uplink outages using RAM queue + persistent spool (with watermarks and backpressure rules).
- Reliability: retries are bounded; acknowledgements are explicit; duplicate detection is deterministic.
- Low power by state: average current is controlled by a state machine (sleep/sense/batch/transmit/confirm).
- Serviceability: logs and counters exist for field triage (reconnect counts, RSSI stats, outage time, buffer watermarks, brownout events).
- Power-loss protection: detects impending brownout early enough to flush critical records and mark last-known state.
Common failure patterns (symptoms → likely cause → quick check)
| Symptom | Most likely cause | Fast check |
|---|---|---|
| Frequent “offline/online” flips across many nodes | Scan/connect window too aggressive; RF congestion; retry storm | Plot connection attempts/min vs RSSI distribution; cap retries and add randomized backoff |
| Data gaps after uplink outages | No persistent spool; incorrect queue watermarks; overwrite without accounting | Force uplink down for N minutes; verify monotonic sequence IDs and spool watermark behavior |
| Random reboots during peaks (TX bursts) | Supply droop; insufficient peak current; brownout threshold too high/late | Capture rail droop with scope during uplink bursts; log brownout reasons and peak current |
| Data corruption after power loss | No “flush & mark” sequence; hold-up energy too small; non-atomic metadata updates | Perform randomized power-cut tests; verify journal/commit markers; check hold-up time margin |
Recommended links (no duplication)
- Bedside / ICU Monitor Comms (wired interfaces, time sync details)
- Compliance & EMC Subsystem (test items and mitigation patterns)
- Medical PSU & Isolation (isolation, leakage, PSU architecture)
- Image Compression & Security (security primitives and key storage)
H2-2 · Link Options: BLE vs Wi-Fi vs Cellular (selection matrix)
Key idea (how to pick without reading protocol textbooks)
Link selection is primarily driven by deployment control (hospital Wi-Fi access vs independent uplink),
payload pattern (small periodic vs bursty), and reconnect tail energy (how long the radio stays expensive after each transmit).
BLE is typically the leaf access layer; Wi-Fi/cellular are the uplink layers.
Selection matrix (engineering factors that change outcomes)
| Factor | BLE (leaf) | Wi-Fi (uplink) | Cellular (uplink) |
|---|---|---|---|
| Deployment dependency | Low (gateway-controlled) | Medium–High (hospital IT access) | Low (independent uplink) |
| Payload pattern fit | Small periodic / event bursts | Bursty uploads; local backhaul | Low-frequency periodic is ideal |
| Reconnect behavior | Scan/connect storms if mis-tuned | Roaming + retries can dominate energy | Weak coverage causes long retry tail |
| Tail energy (after each TX) | Usually short, tunable by intervals | Can be significant with keep-alives | Often dominant; mitigated by PSM/eDRX |
| Cost & operations | Low BOM; gateway complexity | Low recurring cost; IT coordination | SIM/data ops; coverage validation |
Practical combination patterns (avoid false either/or)
- Pattern A (common): BLE leaf access → gateway batching → Wi-Fi primary uplink when hospital Wi-Fi access is stable.
- Pattern B (independent deployment): BLE leaf access → gateway batching → cellular primary uplink for sites with limited IT integration.
- Pattern C (highest availability): Wi-Fi primary uplink + cellular fallback triggered by outage counters and queue watermarks.
Rule of thumb: prioritize deployment control first, then optimize tail energy via batching and bounded retries.
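Pattern C's fallback trigger can be sketched as a small decision function. The thresholds below (120 s outage, 80% queue fill) are illustrative assumptions, not recommended values; real gateways would tune them per site.

```python
def select_uplink(wifi_outage_s: float, queue_fill: float,
                  outage_threshold_s: float = 120.0,
                  watermark: float = 0.8) -> str:
    """Sketch of Pattern C: fail over to cellular when the Wi-Fi outage
    persists too long or the spool approaches its high watermark.
    Thresholds are illustrative, not recommended values."""
    if wifi_outage_s >= outage_threshold_s or queue_fill >= watermark:
        return "cellular"
    return "wifi"

# Healthy Wi-Fi with a small backlog stays on the primary uplink.
primary = select_uplink(wifi_outage_s=0.0, queue_fill=0.3)
```

Driving the decision from counters the gateway already logs (outage time, queue watermarks) keeps the fallback deterministic and auditable.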
Pitfalls to preempt (what usually breaks the plan)
- Choosing Wi-Fi without control: access credentials and captive portal policies can turn into months of deployment delay.
- Underestimating tail energy: frequent tiny uploads can consume more energy than rare batched uploads due to post-TX “radio expensive time.”
- Roaming surprises: intermittent weak coverage creates retries and reconnections that look like “software bugs” but are RF realities.
- Cellular coverage edge cases: indoor penetration and weak signal can create long retry tails and brownout-like resets.
- Overloading BLE: too many simultaneous connections cause scan/connect storms—aggregation must be scheduled and bounded.
Output of this chapter (what the reader should take away)
- BLE is a strong leaf access choice for many endpoints; uplink is selected by deployment control and tail energy.
- Wi-Fi works best when hospital access is stable; cellular is strongest when independent deployment is required.
- Battery life and stability improve when uploads are batched and retries are bounded.
H2-3 · Power-State Architecture (Always-On domain, wake sources, duty cycle)
Engineering takeaway
Ultra-low power is achieved by an auditable state machine, not by “adding a larger battery.” Each state must have
clear entry/exit rules, a maximum dwell time, and a measurable current bucket. Wake sources must be gated and rate-limited
to avoid wake storms that silently dominate average power.
Always-On (AON) domain: what must remain alive
- RTC + time base: defines sensing/upload/maintenance windows and guarantees periodic housekeeping.
- Wake arbitration: resolves multiple wake sources with priorities (e.g., brownout warning beats OTA).
- Voltage monitor + early warning: detects impending brownout early enough to flush critical records.
- Minimal bookkeeping: wake reason, reset cause, outage counters, buffer watermarks (for field triage).
- Optional ULP co-processor: performs tiny “pre-check” tasks (threshold, scheduling) to reduce main-domain wakeups.
Boundary: this section describes responsibilities and interfaces (not MCU/RTOS tutorials).
Wake sources (gated): source → gate → rate limit → target state
| Wake source | Gate (must be true) | Rate limit | Target state |
|---|---|---|---|
| Timer tick | within scheduled window; not in cooldown | fixed cadence; drift monitored | SENSE |
| Event flag (alarm/threshold) | event confirmed; debounce passed | burst allowed; then cooldown | AGGREGATE → TRANSMIT |
| Connection request (leaf join) | only during join/scan window; allowlist hit | cap attempts/min; randomized backoff | SENSE (short) or AGGREGATE |
| User button | debounce; long-press for costly actions | lockout against chatter | MAINTENANCE |
| Charger insert | stable input detected; thermal OK | one-shot until removed | MAINTENANCE (safe window) |
| Brownout warning | pre-warning threshold hit; hold-up present | no rate limit (highest priority) | CONFIRM (flush) → SLEEP |
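The source → gate → rate limit → target pattern in the table can be sketched as a tiny admission check. The class and its limits are illustrative, not a real firmware API; note that the brownout warning carries no rate limit.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WakeSource:
    """One row of the wake table: gate check, rate limit, target state.
    Names and limits here are illustrative, not a real firmware API."""
    name: str
    target: str
    max_per_min: int          # 0 = no rate limit (e.g. brownout warning)
    count_this_min: int = 0

    def admit(self, gate_ok: bool) -> Optional[str]:
        """Return the target state if the wake is admitted, else None."""
        if not gate_ok:
            return None       # gate failed: wake is dropped
        if self.max_per_min and self.count_this_min >= self.max_per_min:
            return None       # rate limit hit: defer to the next window
        self.count_this_min += 1
        return self.target

brownout = WakeSource("brownout_warning", "CONFIRM", max_per_min=0)
join_req = WakeSource("leaf_join", "SENSE", max_per_min=3)

# The fourth join attempt within the same minute is rejected by the rate limit.
join_results = [join_req.admit(True) for _ in range(4)]
```

Because every admitted wake goes through the same function, logging the wake reason and the reject reason becomes a one-line addition, which is what makes wake storms visible in field triage.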
Duty-cycle windows (why scheduling beats “more battery”)
- Sensing window: short and predictable; powers only what is needed to collect and pre-check.
- Upload window: expensive radio time; batch records and bound retries to minimize tail energy.
- Maintenance window: infrequent; only allowed when energy/thermal/network gates are satisfied (logs/updates).
Average current model (for validation): I_avg ≈ Σ(I_state × t_state) / T.
The goal is to keep TRANSMIT short and infrequent by batching, and keep unexpected wakeups near zero by gating.
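The average-current model is easy to validate numerically. The currents and dwell times below are hypothetical placeholders for one 60 s duty cycle, chosen only to show that TRANSMIT dominates even at sub-second dwell.

```python
# Hypothetical duty-cycle budget: (current in mA, dwell time in s) per cycle.
states = {
    "SLEEP":     (0.005, 58.0),   # ~5 µA baseline
    "SENSE":     (8.0,    1.0),
    "AGGREGATE": (6.0,    0.5),
    "TRANSMIT":  (180.0,  0.4),   # radio burst: short but dominant
    "CONFIRM":   (6.0,    0.1),
}

cycle_s = sum(t for _, t in states.values())
# I_avg = sum(I_state * t_state) / T
i_avg_ma = sum(i * t for i, t in states.values()) / cycle_s
```

With these numbers the 0.4 s TRANSMIT burst contributes most of the cycle energy, which is why batching (longer cycles, same burst) lowers I_avg faster than shaving µA off SLEEP.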
State machine checklist (entry/exit/timeout + observability)
| State | Entry | Exit | Timeout + current bucket | Must log |
|---|---|---|---|---|
| SLEEP | no pending work; gates satisfied | wake source triggers | indefinite; ~µA | wake reason + timestamp |
| SENSE | timer tick; join window | records collected or budget reached | bounded; mA | scan/connect attempts |
| AGGREGATE | new data queued | batch formed; watermark reached | bounded; mA | queue depth + watermark |
| TRANSMIT | upload window; energy OK | sent or retry budget exhausted | strict; burst | outage time + retries |
| CONFIRM | ACK or brownout warning | commit markers written | bounded; mA | last-ack + commit ID |
| MAINTENANCE | manual/charger; gates pass | task done; time budget hit | strict; burst | gates + outcome |
Verification checklist (quick, practical)
- Log wake reasons and state transitions; confirm unexpected wakeups are near zero during idle.
- Measure state currents and dwell times; reconcile with the average-current model (I_avg).
- Force uplink outage; verify buffering continues without raising average power uncontrollably.
- Run brownout tests; confirm pre-warning triggers flush/commit before reset.
H2-4 · ULP PMIC & Power Tree (rails, retention, sequencing)
Engineering takeaway
The power tree is not just “power delivery.” It is a domain control system that ensures only the required rails are on at the right time.
The critical retention path (AON + monitoring + minimal state) must be independent and verifiable. Sequencing with PG/EN must protect
data consistency during both normal shutdown and brownout events.
Multi-rail domains (domain → typical loads → power-off consequence)
- AON: RTC, wake arbiter, voltage monitor. Off = cannot wake or record last state.
- MCU: main compute. Off = cold restart; retention optional depending on boot time budget.
- RADIO: BLE/Wi-Fi/cellular. Off = no uplink; must be hard-gated to eliminate idle tail.
- SENSOR: sensor and front-end rails. Off = no sampling; best controlled by sensing windows.
- STORAGE: flash/spool and metadata. Off during write = corruption risk; must follow sequencing rules.
Boundary: isolation/leakage standards are handled on the dedicated PSU & Isolation page.
Power components (role → why it matters)
- Buck vs LDO: select by light-load efficiency and noise needs; light-load behavior dominates average power in 24/7 systems.
- Load switch: enforces domain off, reduces leakage, limits inrush, and prevents “half-on” failure modes.
- Ideal diode / OR-ing: enables seamless switchover between main input and hold-up source with low drop and no backfeed.
- Fuel gauge (if battery present): enables gating (allow maintenance windows only when energy margin is safe).
- PG/EN signals: turn power sequencing into a hardware-enforced dependency graph (no guessing in firmware).
Sequencing & data consistency (why PG/EN is part of reliability)
- Power-up: AON → MCU → STORAGE → RADIO. Reason: record state first, then safely write, then connect.
- Power-down: stop RADIO → flush STORAGE → enter safe shutdown. Reason: avoid high radio peaks during writes.
- Brownout event: pre-warning triggers a short “flush & mark” routine; commit markers ensure deterministic recovery.
- Reset gating: MCU reset release should depend on PG of critical rails (especially STORAGE and AON).
The retention path must survive long enough to: log reset reason → flush essential records → mark last-ack/commit.
Verification checklist (quick, practical)
- Measure peak current during radio bursts; confirm no brownout resets under worst-case uplink retries.
- Validate sequencing: MCU reset release depends on PG of critical rails (AON + storage readiness).
- Power-cut test during storage writes; confirm commit markers prevent corruption and recovery is deterministic.
- Confirm retention path remains alive during hold-up long enough to log reset reason and flush essentials.
H2-5 · BLE Low-Power Playbook (advertising, connection params, scanning)
Engineering takeaway
BLE average power is dominated by radio on-time (scan duty + connection-event rate) and by retry/reconnect frequency.
Savings come from windowing (short, scheduled scan/join windows), batching (fewer, denser connection events),
and gating (bounded retries + cooldown) to prevent reconnection storms in crowded wards.
Where BLE power really goes: advertising vs scanning vs connection events
| Phase | Main drivers | Hidden drain | Practical control |
|---|---|---|---|
| Advertising | adv interval, PHY, TX power | too-fast adv forces more gateway scanning | separate “join adv” from “presence adv” |
| Scanning | scan window/interval (scan duty) | continuous scan creates “always-on” radio | scheduled scan bursts + allowlist filters |
| Connection | conn interval, slave latency, event length | retries + reconnect storms in RF congestion | batch payloads + bounded retries + cooldown |
Connection parameters (engineering meaning, not textbook definitions)
- Connection interval: sets the “heartbeat” of connection events. Shorter intervals increase responsiveness but multiply radio wakeups and tail energy.
- Slave latency: allows skipping events without dropping the connection. It is a power lever for stable signals, but it increases worst-case report latency.
- Supervision timeout: defines when the link is declared dead. Too short creates false death → reconnect storms; too long delays failure detection and grows buffers.
Practical target: keep event frequency low enough for average power, while bounding worst-case latency and preventing false disconnect.
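The three parameters interact through two simple relations: worst-case report latency is the connection interval times (slave latency + 1), and the Bluetooth spec requires the supervision timeout to exceed twice that product. A minimal sanity check, with example values:

```python
def ble_link_budget(conn_interval_ms: float, slave_latency: int,
                    supervision_timeout_ms: float):
    """Illustrative BLE parameter sanity check (example values, not a stack API).
    Worst-case report latency: the peripheral may skip `slave_latency` events.
    The spec requires supervision_timeout > 2 * conn_interval * (1 + latency)."""
    worst_latency_ms = conn_interval_ms * (slave_latency + 1)
    timeout_ok = supervision_timeout_ms > 2 * worst_latency_ms
    return worst_latency_ms, timeout_ok

# 200 ms interval with latency 4: connection events may be 1 s apart at worst.
worst_ms, timeout_ok = ble_link_budget(200.0, 4, supervision_timeout_ms=4000.0)
```

Running this check at design time catches the classic mistake of raising slave latency for power without widening the supervision timeout, which manifests as false disconnects and reconnect storms.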
Multi-device aggregation (how to avoid collisions and reconnect storms)
- Stagger connection-event start times: distribute devices across time so the gateway is not hit by synchronized bursts.
- Group-and-window uploads: use short “batch windows” per group (bed/zone) and keep joining separate from reporting.
- Bounded retries: cap retries per record and per device, then enter a cooldown to avoid tail-dominated power.
- Admission control: in congestion, prioritize stability for already-connected devices; postpone new joins to a later join window.
Rule of thumb: prefer “fewer wakeups with larger batches” over “many tiny packets,” because the radio tail dominates.
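The bounded-retries-plus-cooldown rule can be sketched as a retry plan with randomized exponential backoff. The budget, base backoff, and cooldown values are illustrative; the jitter is what breaks synchronized reconnection storms across many nodes.

```python
import random

def bounded_retry_plan(max_retries: int, base_backoff_s: float,
                       cooldown_s: float, seed: int = 0):
    """Sketch of 'bounded retries + cooldown' (parameters are illustrative).
    Returns randomized backoff delays for each retry, then the cooldown
    before the next attempt window. Jitter de-synchronizes devices."""
    rng = random.Random(seed)
    delays = []
    for attempt in range(max_retries):
        backoff = base_backoff_s * (2 ** attempt)   # exponential growth
        delays.append(backoff * rng.uniform(0.5, 1.5))  # +/-50% jitter
    return delays, cooldown_s

delays, cooldown = bounded_retry_plan(max_retries=3, base_backoff_s=1.0,
                                      cooldown_s=60.0)
```

The key property is that total retry energy is bounded by construction: after `max_retries` attempts the device goes quiet for `cooldown_s` regardless of link state, so a crowded ward cannot drive the tail unbounded.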
Verification metrics (prove the playbook works)
- Power: scan duty on-time, connection-event rate, retry tail duration, average current across a 24/7 trace.
- RF health: packet error rate, retransmissions, disconnect frequency, time-to-reconnect distribution.
- System health: queue watermarks, batch sizes, join success rate under high device density.
- Fault injection: force weak RSSI and interference; confirm bounded retries + cooldown prevents storms.
H2-6 · Wi-Fi Low-Power & Reliability (DTIM, keep-alives, roaming traps)
Engineering takeaway
Wi-Fi power is often stolen by “staying online”: DTIM-driven wakeups, keep-alives, and network-stack retries.
Real savings come from windowed uplink (batch transfers inside an upload window), bounded retries (avoid tail storms),
and roaming control that prioritizes stable connectivity over frequent AP switching.
DTIM and power save (why cadence dominates average current)
- DTIM cadence: defines how often the client must wake to receive buffered traffic. More wakeups create a visible “comb” in current traces.
- Windowed behavior: place expensive uplinks inside a scheduled upload window, then allow the radio to return to deep sleep outside that window.
- Downlink tolerance: if the gateway is primarily uplink-driven, it can tolerate delayed downlink and keep wake cadence low.
Boundary: this is an engineering view of symptoms and controls, not an enterprise Wi-Fi design guide.
Keep-alives: the most common “power thief”
- Why tiny packets can be expensive: waking up, contending for airtime, transmitting, waiting for ACK, and settling back creates tail energy.
- Batch heartbeats: merge multiple status items into one report aligned to the upload window.
- Gate costly actions: maintain “always-on” connectivity only when queue watermark or alarm class requires it; otherwise allow disconnect/sleep.
- Weak-signal behavior: decrease keep-alive frequency and prefer local buffering to avoid repeated handshake tails.
Power tail traps (handshake, DHCP/DNS retries, weak-signal retransmissions)
Typical field symptoms → likely cause → practical strategy
- Frequent current spikes + delayed uploads → repeated association/handshake or DHCP/DNS loops → cap retries and enter cooldown; buffer locally.
- High power with low throughput → weak RSSI causing retransmissions → measure link quality first; upload only when above a minimum margin.
- Random long reconnect times → congestion or unstable AP → prefer stability; avoid aggressive roaming and avoid rapid reconnect loops.
Reliability rule: bounded retries + deterministic buffering is better than “try forever” because tail energy will dominate.
Roaming traps (symptoms and control strategy)
- Symptom: periodic dropouts, latency spikes, or packet bursts after AP switching.
- Control: roam only when metrics degrade beyond thresholds; avoid “ping-pong” switching under marginal RSSI.
- Fallback: if roaming fails, apply backoff and rely on buffering rather than repeated fast re-association loops.
- Operational view: stable uplink with bounded delay often beats peak throughput for ward telemetry.
Verification metrics (power + network + system)
- Power: DTIM comb amplitude/frequency, burst TX tail duration, reconnect/handshake energy cost.
- Network: association time, DHCP/DNS failures, retry counts, roaming attempts and failures.
- System: upload-window completion ratio, queue watermarks, backlog drain speed after outage recovery.
H2-7 · Cellular Power Strategy (PSM/eDRX, modem states, coverage pain)
Engineering takeaway
Cellular power is rarely dominated by “one payload.” It is dominated by connection and signaling tails and by
repeated failures under weak coverage. The strategy is to keep the modem in low-cost states as long as possible
(PSM/eDRX), transmit in scheduled bursts, and enforce bounded retries + cooldown to prevent runaway attach/TAU loops.
Modem state ladder (why average current looks like “steps”)
| State class | What triggers it | Power signature | Common pitfall |
|---|---|---|---|
| PSM / deep sleep | no immediate downlink need | near-zero baseline | waking too often defeats PSM |
| Idle with eDRX | periodic paging listen | comb-like periodic spikes | too-frequent cadence steals power |
| Connected | uplink burst / session | high steps + long tail | tiny frequent sends keep it alive |
| Attach / TAU loops | weak coverage / loss of registration | repeating spikes (storm pattern) | “try forever” destroys battery |
PSM vs eDRX (configuration logic for low-rate telemetry)
- Prefer PSM when uplink is periodic and downlink can be delayed until the next uplink window (lowest baseline).
- Use eDRX when occasional downlink reachability is needed, but seconds-to-minutes latency is acceptable.
- Keep “connected” short by batching: send multiple records in one burst, then return to idle/PSM.
Design goal: maximize time in low-cost states and make uplink energy predictable with scheduled bursts.
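State occupancy translates directly into daily energy. The per-state power figures below are rough illustrative assumptions (not modem datasheet values); the point is how strongly connected-time fraction dominates the total.

```python
def daily_energy_mwh(occupancy: dict, power_mw: dict) -> float:
    """Estimate daily modem energy from state occupancy (fractions of a day)
    and an average power per state class. All numbers are illustrative."""
    assert abs(sum(occupancy.values()) - 1.0) < 1e-9
    hours = 24.0
    return sum(occupancy[s] * power_mw[s] * hours for s in occupancy)

power_mw = {"psm": 0.02, "edrx_idle": 1.5, "connected": 250.0}
mostly_psm = daily_energy_mwh(
    {"psm": 0.98, "edrx_idle": 0.015, "connected": 0.005}, power_mw)
chatty = daily_energy_mwh(
    {"psm": 0.80, "edrx_idle": 0.15, "connected": 0.05}, power_mw)
```

Moving from 0.5% to 5% connected time raises the daily budget by roughly an order of magnitude in this sketch, which is why "keep connected short by batching" is the single highest-leverage lever.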
Weak coverage: symptom → detect → mitigate (to prevent power runaway)
Symptoms commonly seen in wards with dead zones
- Average current climbs with frequent spikes; uploads become jittery or stall.
- Repeated registration/attach attempts; reconnect time distribution widens dramatically.
- Backlog grows even though the modem appears “busy.”
Detect (log what matters)
- Signal quality trend: RSRP/RSRQ/SINR (trend + thresholds), not single snapshots.
- Failure counters: attach/registration failures, retry counts, time-to-connect percentiles.
- Radio on-time: total connected time per hour; tail duration per burst.
Mitigate (actions that save both power and data integrity)
- Gate uplink by coverage: if quality is below a minimum margin, switch to store-and-forward instead of forcing a burst.
- Bound retries: cap retries per burst and per time window; then enter a cooldown before the next attempt.
- De-rate “keep-alive”: reduce non-critical heartbeats under poor coverage; prioritize alarms only.
- Batch larger, less often: fewer sessions reduce repeated tails and signaling overhead.
Verification (what to measure to prove savings)
- State occupancy: percent of day in PSM/eDRX idle vs connected vs attach/TAU loops.
- Burst energy cost: energy per upload window (and its tail) under normal vs weak coverage.
- Storm prevention: after injecting weak coverage, confirm bounded retries and cooldown stop repeated spikes.
H2-8 · Data Pipeline: batching, buffering, and “store-and-forward”
Engineering takeaway
A ward gateway must assume link dropouts. Data integrity comes from priority classes, batch windows,
and a store-and-forward loop with sequence/ack and controlled flash wear. The goal is to avoid both “lost events”
and “flash death by tiny writes.”
Data classes (QoS): alarms vs trends vs debug
- Alarm (highest): small, urgent, may break the upload window; must be deduplicated and rate-limited during storms.
- Trend (medium): periodic samples; designed for batching; tolerant to short delays; ideal for store-and-forward.
- Debug (lowest): maintenance-only; strictly gated; uploaded in a service window with bandwidth and power limits.
Principle: separate paths by priority so an alarm cannot be blocked by trend backlogs or debug logs.
Batching (reduce session count to reduce tail energy)
- Upload window: aggregate trend points and non-urgent events, then transmit in one burst session.
- Alarm override (gated): alarms can transmit immediately, but enforce a cap and a cooldown to prevent power storms.
- Bundle framing: send one header for many records; avoid per-record handshake behavior.
Buffering: RAM ring buffer + Flash spool (roles and boundaries)
- RAM ring buffer: absorbs short outages and reduces flash writes by collecting records into batches.
- Flash spool: protects against long outages and power loss; stores append-only segments for replay.
- Spool trigger: move from RAM to flash when backlog exceeds a watermark, or when link quality gates uplink.
Boundary: flash is a durability tool, not a substitute for good batching. Tiny writes are the enemy.
Flash wear control (avoid “writing the flash to death”)
- Append-only segments: write sequentially; avoid random overwrites that amplify wear.
- Batch-to-flash: persist only after reaching a minimum batch size or after a timeout boundary.
- Minimal metadata churn: keep pointers/watermarks compact and update at controlled intervals.
- GC gating: reclaim only after confirmed ACK watermark; never delete “maybe delivered” data.
Delivery integrity: sequence → ACK watermark → de-dup → replay
- Sequence IDs: every record or bundle carries an increasing ID to support replay and ordering.
- ACK watermark: server acknowledges up to an ID; the gateway advances the durable watermark.
- De-dup: replays are allowed; server must ignore duplicates to avoid double-counting.
- Replay loop: on reconnection, send from flash spool starting at the last unacked watermark.
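The sequence → ACK watermark → replay loop can be sketched with a minimal in-memory spool. A real spool persists append-only segments to flash; the Python list here is a stand-in, and the record payloads are invented examples.

```python
class Spool:
    """Minimal append-only spool with an ACK watermark (illustrative sketch;
    a real spool would persist segments to flash, not a Python list)."""
    def __init__(self):
        self.records = []          # (seq_id, payload), append-only
        self.next_seq = 1
        self.ack_watermark = 0     # highest seq confirmed by the server

    def append(self, payload):
        self.records.append((self.next_seq, payload))
        self.next_seq += 1

    def replay_from_watermark(self):
        """On reconnection, resend everything above the durable watermark."""
        return [(s, p) for s, p in self.records if s > self.ack_watermark]

    def ack(self, seq_id):
        """Server acknowledges up to seq_id; GC may reclaim records at or
        below the watermark, never above it ('maybe delivered' is kept)."""
        self.ack_watermark = max(self.ack_watermark, seq_id)
        self.records = [(s, p) for s, p in self.records
                        if s > self.ack_watermark]

spool = Spool()
for v in ("hr=72", "hr=74", "spo2=97"):
    spool.append(v)
spool.ack(2)                       # server has records 1-2
pending = spool.replay_from_watermark()
```

Because replays start at the watermark rather than at "what the gateway thinks was sent," duplicates are possible by design, which is why server-side de-dup on sequence IDs must be deterministic.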
Power-fail behavior (fast, bounded, predictable)
- On power-fail warning: stop low-priority ingestion, flush a bounded critical batch to flash, and persist the current watermark.
- No “big work”: avoid compaction, long hashing, or re-indexing inside the hold-up window.
- On next boot: resume replay from durable watermarks; log the event for service visibility.
H2-9 · Power-Loss Hold-Up Sizing (supercap/battery/bulk caps) and budget math
Engineering goal (hold-up contract)
When a power-loss warning occurs, the gateway must complete a bounded “critical sequence”:
freeze ingress → persist minimal state → shed high loads → enter safe state.
Hold-up sizing is therefore an energy window problem (Vstart to Vend), not a “bigger capacitor is always better” problem.
Critical energy budget: define what must finish
Must finish (critical)
- Persist minimal metadata: ACK watermark, spool pointer, monotonic sequence stamp, and a power-fail reason code.
- Bounded flash commit: write the smallest durable record that makes replay deterministic after reboot.
- Shed high loads: stop RF transmit and disable non-critical rails to reduce Pcritical immediately.
- Enter safe state: keep RTC / always-on logic and store the last shutdown stage for diagnostics.
Nice-to-have (only if budget allows)
- Send a single power-fail notice only when link quality gates pass and the transmit tail is predictable.
- Persist a short diagnostic summary (not full logs, not compaction).
Forbidden during hold-up
- Any long network handshake, reconnect, or waiting for server response.
- GC/compaction/re-index work that can turn into unbounded flash writes.
Budget math: energy window + critical power
Step 1 — define the usable voltage window (Vstart to Vend)
- Vstart: the rail voltage at the moment the early warning triggers (before the system becomes unstable).
- Vend: the lowest voltage where flash commit and RTC/AON still behave deterministically (including regulator headroom).
Step 2 — compute energy available from the storage element
Capacitor energy window:
E_cap = 1/2 · C · (Vstart² − Vend²)
Hold-up time estimate (bounded critical sequence):
t ≈ E_usable / P_critical
Step 3 — size C from the time budget (useful design form)
C ≈ 2 · P_critical · t / ( η · (Vstart² − Vend²) )
Where:
- P_critical = only the rails that stay on during hold-up
- η accounts for conversion losses and real-world inefficiencies
- t is the required completion time (typically 50–200 ms for graceful shutdown)
Practical tip: the fastest way to shrink C is to reduce P_critical early (load-shedding) and make flash writes bounded.
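The sizing formula above is a one-liner to evaluate. The load, time budget, voltage window, efficiency, and derating factor below are example assumptions, not component recommendations; the derating term reserves headroom for temperature and aging per the corrections discussed later.

```python
def holdup_cap_farads(p_critical_w: float, t_s: float,
                      v_start: float, v_end: float,
                      eta: float = 0.85, derate: float = 0.7) -> float:
    """Size the hold-up capacitor from the page's formula
    (all numbers here are example assumptions):
        C = 2 * P_critical * t / (eta * (Vstart^2 - Vend^2))
    `derate` reserves margin for temperature/aging capacitance loss."""
    c_ideal = 2.0 * p_critical_w * t_s / (eta * (v_start**2 - v_end**2))
    return c_ideal / derate

# 0.5 W critical load, 150 ms budget, 5.0 V -> 3.0 V usable window.
c_needed = holdup_cap_farads(0.5, 0.150, 5.0, 3.0)   # roughly 16 mF
```

Re-running the function with a halved P_critical shows why early load-shedding is the cheapest capacitance you can buy: the required C scales linearly with the critical power.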
Option trade-offs: supercap vs small battery vs bulk caps (ward gateway scale)
Supercap
- Best for: short, deterministic hold-up to finish writes and shut down cleanly.
- Strength: high pulse current capability; long cycle life.
- Watch-outs: leakage/self-discharge, ESR at cold temperature, inrush limiting on recharge.
Small battery
- Best for: longer survival time and extended logging when mains can be absent for minutes.
- Strength: higher energy density; supports more extensive safe-state functions.
- Watch-outs: charger/BMS complexity, aging, and maintenance expectations.
Bulk capacitors
- Best for: very short hold-up and smoothing; often enough for fast metadata commits only.
- Strength: low cost; simple integration.
- Watch-outs: limited usable window and higher risk of brownout timing variability.
OR-ing and ideal diode devices are part of the hold-up system: they enforce one-way energy flow and prevent reverse discharge paths.
Real-world corrections (why margin is mandatory)
- Temperature: effective capacitance and ESR change with temperature; derate to the worst expected condition.
- Aging: capacitance fade and leakage drift over life; reserve extra energy headroom.
- Leakage: supercap self-discharge can dominate if “hold-up” must be available after long idle times.
- Recharge inrush: uncontrolled recharge can cause dips and resets; limit current and sequence rails.
H2-10 · Brownout Detection & Graceful Shutdown (what must happen in 50–200 ms)
Engineering goal (bounded response)
Brownout handling is a time-budgeted state machine. The response must be deterministic within a bounded window:
detect early → shed loads → persist minimal state → enter safe state. Unbounded actions (reconnect, long writes, compaction)
must be gated or skipped.
Power-loss detection chain (two-level triggers)
- Level-1 (early warning): PG de-assertion, ADC threshold crossing, or bus droop detector that interrupts early enough for flash commit.
- Level-2 (imminent brownout): hard supervisor/comparator threshold that forces minimal actions only (protect correctness, skip extras).
Design intent: Level-1 enables graceful shutdown; Level-2 protects against corruption when time is nearly gone.
Graceful shutdown sequence (strict order)
- Freeze ingress: stop adding new records; snapshot current queue watermarks.
- Load-shed: disable RF transmit and non-critical rails first to collapse Pcritical quickly.
- Bounded commit: persist minimal metadata and a power-fail stage marker (small, deterministic write).
- Reason code: store brownout cause and counters for service visibility.
- Safe state: enter a low-power mode that preserves RTC/AON and blocks heavy peripherals.
Skip policy: if voltage drops below the safe margin, skip network activity and any non-essential flash work.
Data consistency with a tiny two-phase commit (action-level, not file-system theory)
- Pre-commit marker: write a short “intent” record that a shutdown commit is starting.
- Payload + pointers: write the minimal durable watermarks (ACK level, spool pointer, sequence stamp).
- Commit marker: write a short “done” record. On next boot, missing “done” triggers replay/rollback safely.
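The intent/payload/done sequence can be sketched action-by-action. The journal list here is a stand-in for an append-only flash segment, and the marker names are illustrative:

```python
def write_shutdown_commit(journal: list, watermarks: dict):
    """Tiny two-phase commit sketch (journal stands in for a flash segment).
    Order matters: intent -> payload -> done."""
    journal.append(("INTENT",))
    journal.append(("PAYLOAD", dict(watermarks)))
    journal.append(("DONE",))

def recover(journal: list):
    """On boot: trust the payload only if a DONE marker follows it;
    otherwise fall back to replay from the previous durable watermark."""
    if journal and journal[-1] == ("DONE",):
        return journal[-2][1]      # committed watermarks
    return None                    # incomplete commit: replay/rollback

journal = []
write_shutdown_commit(journal, {"ack": 42, "spool_ptr": 7})
committed = recover(journal)
torn = journal[:-1]                # simulate power loss before DONE
```

A torn write (power lost between PAYLOAD and DONE) recovers to `None`, which forces the safe path: resume replay from the previous watermark rather than trusting a half-written state.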
Avoiding reset storms (brownout → reboot → brownout loops)
- Minimum voltage gate: do not enable high-load rails (RF/flash heavy writes) until voltage exceeds a safe threshold with margin.
- Cooldown timer: after a brownout, wait a minimum bounded time before retrying network-heavy actions.
- Retry counter: if brownouts repeat N times, enter a protective mode (RTC + minimal logging only) until power stabilizes.
- WDT policy: ensure watchdog behavior does not create extra resets during the brownout window; keep the shutdown path deterministic.
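The three storm guards (voltage gate, cooldown, retry counter) compose into one admission check for high-load rails. The thresholds below are illustrative assumptions:

```python
class BrownoutGuard:
    """Sketch of the reset-storm guard: voltage gate + cooldown + retry
    counter. Thresholds and counts are illustrative, not recommendations."""
    def __init__(self, v_safe: float, cooldown_s: float, max_brownouts: int):
        self.v_safe = v_safe
        self.cooldown_s = cooldown_s
        self.max_brownouts = max_brownouts
        self.brownouts = 0
        self.cooldown_until = 0.0

    def record_brownout(self, now_s: float):
        self.brownouts += 1
        self.cooldown_until = now_s + self.cooldown_s

    def may_enable_high_loads(self, v_rail: float, now_s: float) -> bool:
        if self.brownouts >= self.max_brownouts:
            return False           # protective mode: RTC + minimal logging
        if now_s < self.cooldown_until:
            return False           # still in post-brownout cooldown
        return v_rail >= self.v_safe   # voltage gate with margin

guard = BrownoutGuard(v_safe=4.8, cooldown_s=30.0, max_brownouts=3)
guard.record_brownout(now_s=0.0)
blocked_early = guard.may_enable_high_loads(v_rail=5.0, now_s=10.0)
allowed_later = guard.may_enable_high_loads(v_rail=5.0, now_s=31.0)
```

Because the check runs before any RF or heavy flash activity is enabled, a marginal supply cannot re-enter the brownout → reboot → brownout loop faster than the cooldown allows.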
H2-11 · Validation Checklist: power profiling, RF stress, outage drills, field telemetry
Definition of “done”
Validation is complete only when the gateway shows bounded energy per state, bounded retries under weak RF,
deterministic data consistency under power loss, and field counters that close the loop in production.
A) Power profiling by state machine (average, peaks, and “tails”)
Measure current as a segmented profile (SLEEP → SENSE → AGGREGATE → TRANSMIT → CONFIRM/RETRY),
not as a single average number. The goal is to verify both energy per event and upper bounds under worst-case retries.
What to record
- SLEEP/AON: Iavg, periodic wake spikes, RTC/AON stability across hours.
- Wake + compute: peak current and duration for parsing, batching, encryption (if enabled), queue ops.
- Transmit: peak current, burst duration, and the power tail energy (retries, DHCP/DNS, attach/TAU, scanning).
- Confirm/Retry: energy per retry, maximum retries allowed by policy gates.
Pass criteria (engineering-grade)
- Each state meets its budget: E(state) ≤ E_budget × (1 + margin) across normal and stress runs.
- Transmit “tail” is explainable and bounded (no unbounded reconnect loops).
- Energy per report remains bounded when RF is degraded (bounded retry policy is enforced).
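The per-state pass criterion is mechanical enough to automate in the test harness. The 20% margin and the millijoule figures below are example assumptions, not fixed requirements:

```python
def state_energy_ok(e_measured_mj: float, e_budget_mj: float,
                    margin: float = 0.2) -> bool:
    """Pass criterion from the checklist: E(state) <= E_budget * (1 + margin).
    The 20% margin is an example policy, not a fixed requirement."""
    return e_measured_mj <= e_budget_mj * (1.0 + margin)

# Hypothetical per-state measurements in millijoules.
checks = {
    "TRANSMIT": state_energy_ok(110.0, 100.0),   # within the 20% margin
    "SENSE":    state_energy_ok(15.0, 10.0),     # 50% over budget: fail
}
```

Encoding the criterion this way lets the same check run against lab traces and against field telemetry counters, keeping the lab and production definitions aligned.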
B) RF stress: weak signal, congestion, roaming traps, cellular edge coverage
RF validation must connect reliability metrics with energy cost. The same “bad RF” condition should produce
consistent signatures in reconnect counters, retry rates, and energy per event.
Stress stimuli (examples)
- Weak signal: controlled attenuation / obstructed path; verify retry gates and fallback behavior.
- Congestion: busy channel / high AP load; verify latency P95 and packet loss behavior.
- AP switch / roam: forced reassociation; verify bounded reconnection logic (no energy runaway).
- Cellular edge: poor RSRP/RSRQ; verify attach/TAU and retry pacing remain bounded.
Metrics to log (minimum set)
- Reconnect count, retry count, failure reasons (DNS/DHCP/auth/timeout), and RSSI/RSRP distributions.
- Packet loss and retransmissions; end-to-end latency (P50/P95).
- Energy per report under each RF stress profile.
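"No energy runaway" is easiest to verify when the retry policy itself is explicit: a gate that caps both the attempt count and the cumulative energy per report, with clamped exponential backoff for pacing. A minimal sketch with hypothetical caps (`MAX_RETRIES`, `MAX_ENERGY_MJ`) and helper names:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_RETRIES        5u
#define MAX_ENERGY_MJ   500.0   /* cumulative energy cap per report */
#define BASE_BACKOFF_S     2u
#define MAX_BACKOFF_S    120u

typedef struct {
    uint32_t attempts;
    double   energy_spent_mj;
} retry_gate_t;

/* Exponential backoff, clamped so retry pacing stays bounded. */
static uint32_t next_backoff_s(const retry_gate_t *g) {
    uint32_t delay = BASE_BACKOFF_S << g->attempts;   /* 2, 4, 8, ... */
    return delay > MAX_BACKOFF_S ? MAX_BACKOFF_S : delay;
}

/* Allow another attempt only while both caps hold. */
static bool retry_allowed(const retry_gate_t *g) {
    return g->attempts < MAX_RETRIES && g->energy_spent_mj < MAX_ENERGY_MJ;
}

/* Account one failed attempt and its measured energy cost. */
static void retry_record(retry_gate_t *g, double attempt_energy_mj) {
    g->attempts++;
    g->energy_spent_mj += attempt_energy_mj;
}
```

Under RF stress, the same gate counters (`attempts`, `energy_spent_mj`) double as the log signature that ties a bad-RF condition to its energy cost.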
C) Outage drills: random cuts, cold derating, supercap aging assumptions
A power-loss drill is successful only when data remains consistent and the device avoids reset storms.
Drills should be run across different RF states (idle / transmitting / retrying) to validate the worst-case “tail” behavior.
Drill set (recommended)
- Random cut: remove input power at random phases of the operating cycle; repeat across thousands of cycles.
- Cold derating: reduced usable window (simulate higher ESR / lower C); verify hold-up still meets the minimal contract.
- Aging assumption: shrink Vstart→Vend window / increase leakage assumption; verify bounded commit still succeeds.
Pass criteria
- After reboot, ACK watermark / spool pointer / sequence stamp are valid and monotonic (no duplicate or missing critical records beyond defined policy).
- “Brownout → reboot → brownout” loops do not occur (reset-storm guard works: voltage gate + cooldown + retry counter).
- Critical shutdown stage markers show the device reached safe state when budget allowed, and degraded cleanly when not.
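The "valid and monotonic" criterion can be turned into an automated post-reboot check: the spool must resume exactly one past the ACK watermark (no duplicates, no gap) and hold a non-inverted sequence range. A minimal sketch with hypothetical field names; an at-least-once replay policy would relax the duplicate check accordingly:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t ack_watermark;   /* last sequence confirmed by the server */
    uint32_t spool_head_seq;  /* oldest unsent record in the spool */
    uint32_t spool_tail_seq;  /* newest record in the spool */
} recovery_state_t;

/* After reboot, the spool must resume immediately after the ACK watermark
 * and hold a monotonic, non-inverted sequence range. */
static bool recovery_consistent(const recovery_state_t *r) {
    if (r->spool_head_seq > r->spool_tail_seq) return false;      /* inverted */
    if (r->spool_head_seq != r->ack_watermark + 1) return false;  /* dup/gap */
    return true;
}
```

Running this check on every drill reboot, and logging the three fields on failure, makes a thousands-of-cycles random-cut campaign scriptable.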
D) Field telemetry: counters that close the loop in production
Field observability should separate failures by domain (RF, power, storage, policy) without requiring invasive debugging.
The same metrics used in lab stress tests should exist in field telemetry with stable definitions.
Minimum counter dictionary
- RF: reconnect_count, retry_count, last_fail_reason, avg_RSSI/RSRP, roaming_events, time_to_attach.
- Power: brownout_count, early_warn_count, hold_up_entries, last_shutdown_stage.
- Storage: spool_high_watermark, commit_fail_count, replay_events, wear_estimate (at least erase/write counters).
- Performance: report_latency_P95, queue_delay, drops_by_policy (intentional drops vs corruption).
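One way to keep the lab and field definitions stable is to freeze the counter dictionary as a single telemetry record that both builds share. A minimal sketch mirroring the dictionary above; the field names and widths are illustrative, not a fixed schema:

```c
#include <assert.h>
#include <stdint.h>

/* Minimum counter dictionary as one telemetry record; grouped by domain
 * (RF / power / storage / performance) to match the dictionary above. */
typedef struct {
    /* RF */
    uint32_t reconnect_count;
    uint32_t retry_count;
    uint16_t last_fail_reason;      /* enum: DNS/DHCP/auth/timeout/... */
    int16_t  avg_rssi_dbm;          /* or RSRP for cellular builds */
    uint32_t roaming_events;
    uint32_t time_to_attach_ms;
    /* Power */
    uint32_t brownout_count;
    uint32_t early_warn_count;
    uint32_t hold_up_entries;
    uint8_t  last_shutdown_stage;
    /* Storage */
    uint32_t spool_high_watermark;
    uint32_t commit_fail_count;
    uint32_t replay_events;
    uint32_t erase_write_count;     /* coarse wear estimate */
    /* Performance */
    uint32_t report_latency_p95_ms;
    uint32_t queue_delay_ms;
    uint32_t drops_by_policy;       /* intentional drops, not corruption */
} field_counters_t;
```

Because the same struct feeds both the stress-test logs and the production uplink, a field anomaly can be reproduced in the lab by matching counter signatures rather than by invasive debugging.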
EMC note: this page lists only what to test (ESD/EFT/surge/radiated immunity) and what to record (symptoms + counters); mitigation details belong on the Compliance & EMC page.
Reference parts (example material numbers used in validation fixtures)
These part numbers are commonly used to make validation repeatable (accurate current/energy logging, precise power-fail triggers,
and measurable hold-up behavior). Actual selection depends on the chosen rails and current ranges.
- Power/energy profiling monitor: TI INA228 (digital power monitor; useful for per-state energy profiling).
- Rail supervisor / reset: TI TPS3839 (ultra-low-power supervisor for deterministic brownout triggers).
- Window supervisor (early warning + hard threshold): TI TPS3703 (dual-threshold monitoring for two-level triggers).
- Supercap backup controller (hold-up system reference): Analog Devices LTC3350 (supercap backup supply controller).
- Supercap state/health monitor (aging/derating evidence): TI BQ33100 (supercap monitor / health estimation).
- External flash for spool validation (example): Winbond W25Q64 (used widely for log/spool endurance exercises).
- BLE SoC platform (example): Nordic nRF52840 (for BLE stress + low-power parameter verification).
- Wi-Fi platform (example): Espressif ESP32-C3 (for DTIM/tail profiling and congestion stress).
- Cellular module platform (example): Quectel BG95 (Cat-M/NB family commonly used for edge-coverage stress).
Tip: keep the validation fixture BOM stable so “before/after” firmware changes can be compared with high confidence.
H2-12 · FAQs
These FAQs focus on low-power telemetry backhaul, store-and-forward reliability, and power-loss hold-up behavior for ward gateways.