Wearable Safety Beacon: Success Metrics (TTFA) & Trigger Evidence
Core idea: A wearable safety beacon “works” only when every emergency trigger reliably produces a fast, provable end-to-end alert—from wake → location → radio uplink → (optional) voice—under worst-case power, RF coexistence, and harsh field conditions.
Design and debug it with an evidence-first chain: always verify one rail + one subsystem metric, isolate the bottleneck (power / cellular / GNSS / UWB / audio / trigger logic), apply the first fix, then re-test using measurable KPIs (TTFA/TTFF, reset count, dropout rate, confidence).
H2-1. System Boundary & “Success Definition”
A wearable safety beacon is “successful” only when an emergency event produces a provable device-side evidence chain: Trigger → Wake → Location → Link → Payload/Voice → Delivery Proof. This chapter defines what to measure, what must be true, and what is explicitly out-of-scope (no cloud/app walkthroughs).
| Metric | Definition (device-side) | What to log / measure | Design levers (examples) |
|---|---|---|---|
| TTFA (time to first alert) | Time from event trigger (button or accel interrupt) to uplink delivered/acked (or local receipt confirmation). Break into T0–T5 checkpoints. | Timestamps: T0 trigger, T1 MCU awake, T2 location valid, T3 link ready, T4 payload sent, T5 ack/receipt. | Wake latency, GNSS retention/aiding, cellular attach strategy, payload sizing, retry/backoff policy, radio scheduling priority. |
| Energy per event (E_event) | Total energy consumed per emergency event including location acquisition + radio bursts + voice session start. Must survive peak current without brownout. | Current waveform: sleep (µA), confirm (mA), TX bursts (A peaks), audio (mA to 100s of mA). Compute ∫V·I·dt for event. | Power-domain gating, supercap/peak buffer, buck/boost selection, TX burst shaping, codec duty-cycle, “voice vs telemetry” arbitration. |
| Standby life (T_standby) | Achievable standby duration under specified assumptions: sleep current + periodic maintenance wakes + expected events/day. | Long-run sleep current distribution; wake frequency; event count/day; battery voltage droop vs temperature and aging. | Always-on island design, sensor hub configuration, maintenance interval, leak paths, fuel-gauge model tuning. |
| Coverage assumptions | Explicit boundaries: indoor/outdoor, GNSS availability, cellular registration state, and allowed fallbacks (UWB/BLE relay/last-known + confidence). | GNSS C/N0 + fix flags; RSRP/SINR/attach state; UWB range residuals; “confidence score” for last-known position. | Fallback ordering, “confidence-based” messaging, antenna/coexistence scheduling, retry route selection (device-side). |
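The E_event integral in the table can be approximated from a time-aligned voltage/current capture. The sketch below uses the trapezoidal rule; the function name and the coarse waveform values are hypothetical and only illustrate the arithmetic, not a measured device.

```python
def event_energy_joules(t_s, v_volts, i_amps):
    """Approximate E_event = integral of V*I dt with the trapezoidal rule."""
    if not (len(t_s) == len(v_volts) == len(i_amps)) or len(t_s) < 2:
        raise ValueError("need at least two time-aligned samples")
    energy_j = 0.0
    for k in range(1, len(t_s)):
        p_prev = v_volts[k - 1] * i_amps[k - 1]  # instantaneous power, W
        p_curr = v_volts[k] * i_amps[k]
        energy_j += 0.5 * (p_prev + p_curr) * (t_s[k] - t_s[k - 1])
    return energy_j


# Hypothetical waveform: 1 s confirm phase at 5 mA, then a 2 s TX phase at 0.5 A.
# The repeated timestamp at 1.0 s models the step between the two phases.
t = [0.0, 1.0, 1.0, 3.0]
v = [3.7, 3.7, 3.5, 3.5]
i = [0.005, 0.005, 0.5, 0.5]
e_event = event_energy_joules(t, v, i)
print(f"E_event = {e_event:.4f} J")
```

A real capture would have far more samples; the point is that the TX phase dominates the per-event budget even when it is short.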
Treat TTFA as a chain of observable checkpoints. Each checkpoint must have at least one “hard” signal (GPIO edge, rail step, modem status, counter) to avoid ambiguous blame.
- T0 Trigger: interrupt reason code captured (BTN_LONG / FALL_CONF / TAMPER_STRAP).
- T1 Wake: main MCU domain active (RTC → main), measured by rail step + firmware timestamp.
- T2 Location valid: GNSS fix flag OR UWB positioning valid OR last-known position with confidence & age.
- T3 Link ready: cellular attached / bearer ready OR BLE relay ready; include signal snapshot.
- T4 Payload sent: queue drained + sequence counter incremented; TX burst observed on current trace.
- T5 Delivery proof: ack/receipt flag or timeout fallback taken (device-side decision logged).
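One way to turn the T0–T5 checkpoints into an attributable TTFA is to compute per-segment deltas and report the largest one as the blocking segment. This is a minimal sketch that assumes the checkpoints are logged as monotonic millisecond timestamps; `ttfa_breakdown` and the example values are illustrative, not part of any firmware API.

```python
CHECKPOINTS = ["T0", "T1", "T2", "T3", "T4", "T5"]


def ttfa_breakdown(ts_ms):
    """Return (ttfa_ms, segments, blocking_segment) from checkpoint timestamps."""
    missing = [c for c in CHECKPOINTS if c not in ts_ms]
    if missing:
        raise ValueError(f"missing checkpoints: {missing}")
    segments = {}
    for a, b in zip(CHECKPOINTS, CHECKPOINTS[1:]):
        delta = ts_ms[b] - ts_ms[a]
        if delta < 0:
            raise ValueError(f"{b} precedes {a}")
        segments[f"{a}->{b}"] = delta
    # The longest segment is the first place to look for the bottleneck.
    blocking = max(segments, key=segments.get)
    return ts_ms["T5"] - ts_ms["T0"], segments, blocking


# Hypothetical event log: location acquisition dominates the alert time.
log = {"T0": 0, "T1": 40, "T2": 9000, "T3": 14500, "T4": 14700, "T5": 15600}
ttfa_ms, segments, blocking = ttfa_breakdown(log)
```

In this example the T1->T2 segment (wake to location valid) is the blocker, which points at GNSS/UWB rather than the cellular link.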
H2-2. Use-Case Matrix & Trigger Taxonomy
Triggers must be classified by evidence requirement and false-alarm cost. Ultra-low-power design depends on a two-stage wake approach: an always-on sensor/button interrupt only “opens a short confirmation window,” then the main system escalates to alert.
The table below is device-side only: sensors, timers, and local counters. No app/cloud workflows.
| Trigger type | Required evidence (minimum) | Confirm window | False alarm cost | Engineering levers |
|---|---|---|---|---|
| Manual (panic / long-press) | Button interrupt + stable press duration + release pattern (optional). Log press duration and debounce result. | 0–1 s typical (debounce + long-press timer). | High (unwanted emergency dispatch + battery hit). | Long-press thresholds, pocket/pressure guards, double-press patterns, haptic confirm prompt, cancel window. |
| Automatic (fall / impact) | Accel peak + orientation change + post-event inactivity. Include saturation/clipping flags. | 1–5 s typical (capture before/after window). | Very high (most prone to false positives). | Two-threshold gating, orientation filters, “quiet-time” after strong motion, adaptive thresholds by activity class. |
| Automatic (no-motion / man-down) | Low-motion timer + posture/orientation constraint (optional) + strap present. | 10–60 s typical (depends on duty-cycle and risk tolerance). | Medium–high (common during sleep/rest). | Multi-window confirmation, user-cancel prompt, strap contact validation, timer hysteresis. |
| Tamper (strap remove / open) | Strap loop open + duration > threshold + motion context (optional). | 0.5–3 s typical (debounce + re-check). | Medium (alarm credibility vs nuisance). | Contact debounce, dual-sensor corroboration, enclosure switch placement, EMI hardening of sense line. |
| Battery emergency | UVLO near-threshold + predicted remaining event energy (model-based). | Immediate (protect last-chance message). | High if mis-triggered (premature panic). | Fuel-gauge calibration under bursts, ESR estimation, “reserve capacity” policy, peak buffer enablement. |
Debounce is not only “button bounce.” In field failures, “false alarms” often come from mechanical shocks, sensor saturation, or power/EMI glitches. Confirmation must separate these with minimal extra energy.
- Mechanical: use time windows + multi-evidence corroboration (peak + posture + inactivity).
- Sensor limits: track saturation/clipping flags; if saturated, extend confirm window and require post-event stillness.
- Power/EMI: log rail droop / reset counters; if droop occurs near trigger, tag event as “suspect” and avoid immediate escalation.
- User intent: for manual panic, provide a short cancel window with a deterministic pattern (press+hold to cancel).
A strict state machine reduces ambiguity. Pair every transition with a reason code so that validation reports, FAQs, and debug playbooks can trace a real device's behavior by searching its logs for that code.
| Reason code | Meaning | Required log fields | Typical confirm gate |
|---|---|---|---|
| BTN_LONG | Manual long-press panic trigger. | press_ms, debounce_ok, cancel_window_used, T0–T1. | press_ms ≥ threshold; no bounce; optional second confirmation. |
| FALL_PRE / FALL_CONF | Fall/impact candidate then confirmed. | peak_g, orient_delta, post_still_ms, sat_flag, T0–T2. | multi-signal consistency + post-event stillness. |
| MAN_DOWN | No-motion timer expiration with posture constraints (optional). | still_ms, posture_state, strap_ok, recheck_count. | two-window recheck + user-cancel prompt if available. |
| TAMPER_STRAP | Strap loop open beyond debounce threshold. | open_ms, reclose_seen, EMI_suspect, rail_mv. | open_ms ≥ threshold and stable sense line. |
| BAT_EMERG | Battery near end-of-life for an event burst. | vbat_mv, ESR_est, reserve_policy, last_event_energy. | model-based “can still send SOS” decision. |
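The fall gate in the table can be sketched as a small pure function that returns a reason code plus an escalate decision. All thresholds below are placeholders, the doubling of the stillness requirement on saturation is an assumed policy, and `FALL_PRE_SUSPECT` is a hypothetical extra code for the "droop near trigger" case; none of these come from a specific product.

```python
from dataclasses import dataclass


@dataclass
class FallCandidate:
    peak_g: float
    orient_delta: float   # degrees of posture change
    post_still_ms: int
    sat_flag: bool        # accelerometer clipped during the impact
    rail_droop: bool      # rail droop observed near the trigger


def confirm_fall(c, peak_thr_g=3.0, orient_thr_deg=45.0, still_thr_ms=2000):
    """Return (reason_code, escalate). Thresholds are illustrative only."""
    if c.rail_droop:
        # Power/EMI glitch suspected: tag the event, do not escalate immediately.
        return "FALL_PRE_SUSPECT", False
    # Saturated sensor: require a longer stillness window (assumed 2x).
    need_still_ms = still_thr_ms * 2 if c.sat_flag else still_thr_ms
    confirmed = (
        c.peak_g >= peak_thr_g
        and c.orient_delta >= orient_thr_deg
        and c.post_still_ms >= need_still_ms
    )
    return ("FALL_CONF", True) if confirmed else ("FALL_PRE", False)


result = confirm_fall(FallCandidate(4.2, 70.0, 2500, False, False))
```

Keeping the gate free of side effects makes it easy to replay logged candidates offline when thresholds are retuned.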
H2-3. Multi-Radio Architecture: UWB + BLE + Cellular + GNSS (Who does what?)
Multi-radio reliability comes from clear responsibility and measurable handoff gates. Each radio is assigned a primary job and a fallback role, while the firmware maintains two independent selectors: Location Provider (GNSS/UWB/Last-known) and Uplink Bearer (Cellular/BLE relay/Store-and-forward).
| Radio | Primary role | Secondary role / fallback | Do-not-duplicate (anti-chaos) |
|---|---|---|---|
| BLE | Provisioning, short-range status, and optional phone/relay uplink when cellular is unavailable. | Low-duty advertising / scanning for presence and short bursts of metadata transfer. | Do not treat BLE as a “cloud channel.” Keep it as a local bearer with explicit timeouts and budgets. |
| UWB | Indoor precise ranging / anchor interaction to produce a timestamped location estimate (with residual/error). | Local integrity signals (session success rate, residual spikes) used as confidence indicators. | Do not expand into RTLS deployment or anchor network design. Only device session evidence and outputs. |
| Cellular (LTE-M / NB-IoT / Cat-1) | Wide-area alert delivery (TTFA critical path): attach/bearer ready → payload delivery → receipt/ack. | Voice path if required (gated by power/coverage policies). | Do not deep-dive operator core network. Focus on attach time, retry counters, and radio metrics. |
| GNSS | Outdoor absolute location fix with measurable quality (C/N0, fix type, HDOP). | Aiding/retention to reduce cold-start time; provide “fix age” and confidence. | Do not drift into anti-jam system design. Only device observable quality and fallback decisions. |
Handoffs must be driven by hard gates (timeouts, retry counts, confidence) rather than “best effort” heuristics. The goal is to avoid resource fights (RF time, current peaks) and to keep emergency paths deterministic.
- Location Provider selector: prefer GNSS when fix-quality passes threshold; else use UWB when session residual is stable; else last-known with age + confidence.
- Uplink Bearer selector: prefer cellular when bearer is ready; else BLE relay if available; else store-and-forward with bounded retry schedule.
- Priority rule: during emergency, scheduling favors the path that reduces TTFA; background scans must yield immediately.
- Exit rule: if the chosen provider fails N times or exceeds timeout, switch once and log the reason code (no infinite thrash).
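The selector and exit rules above could look roughly like the following. This sketch reads "switch once" as a forward-only fallback order within one emergency, so a bearer that was abandoned is not retried until the event ends; that interpretation, the class name, and the reason strings are assumptions.

```python
def select_location_provider(gnss_quality_ok, uwb_residual_stable):
    """Prefer GNSS, then UWB, then last-known (with age + confidence)."""
    if gnss_quality_ok:
        return "GNSS"
    if uwb_residual_stable:
        return "UWB"
    return "LAST_KNOWN"


class BearerSelector:
    """Forward-only uplink fallback: no path back, hence no thrash."""

    ORDER = ("CELLULAR", "BLE_RELAY", "STORE_AND_FORWARD")

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.index = 0
        self.failures = 0
        self.log = []  # (abandoned_bearer, reason_code) pairs

    @property
    def current(self):
        return self.ORDER[self.index]

    def report_failure(self, reason):
        self.failures += 1
        is_last = self.index == len(self.ORDER) - 1
        if self.failures >= self.max_failures and not is_last:
            self.log.append((self.current, reason))
            self.index += 1
            self.failures = 0
        return self.current


sel = BearerSelector(max_failures=2)
sel.report_failure("ATTACH_TIMEOUT")
bearer = sel.report_failure("ATTACH_TIMEOUT")
```

The two selectors stay independent: a location fallback never forces a bearer change, and vice versa.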
When TTFA is long, the cause is usually one of: registration delay, poor RF margin, or coexistence blocking. The evidence set below is the minimum to distinguish them.
- Link budget snapshots: BLE RSSI; cellular RSRP/SINR; GNSS C/N0; UWB session RSSI/residual.
- Retry counters: attach retries, TX retries, UWB session retries, BLE reconnect attempts.
- Registration histogram: attach time distribution (best/typical/worst) + success rate in weak coverage.
- State timestamps: map radio states to T0–T5 so delays can be attributed to “fix” vs “bearer ready.”
H2-4. RF Front-End, Antenna Placement, and Coexistence (The silent killer)
In multi-radio wearables, failures often appear only when transmitters, antennas, and power domains operate together. The goal is to keep RF performance predictable under simultaneous activity using three levers: placement (antennas/keepouts), front-end partition (filters/LNA/switch/ESD), and coexistence scheduling (time-slicing + priority).
- LTE vs GNSS: maximize isolation; keep GNSS antenna away from LTE PA region and high di/dt converters; reserve a GNSS keepout area.
- UWB: avoid proximity to large metal (battery cans, shields); keep a stable ground reference; prefer edge placement for radiation.
- BLE: avoid detuning by battery/metal straps; leave matching network access for tuning during EVT/DVT.
- Human-body detune: validate in “worn” conditions; track RSSI/RSRP shifts and retune margin.
Front-end parts should be chosen and placed to protect weak receivers (GNSS/UWB) from strong transmitters (LTE). The aim is not perfection, but stable margins that survive worst-case bursts, temperature, and aging.
- LNA placement: shortest RF path to antenna; clean bias rail; tight return path to reduce injected noise.
- Filtering: SAW/BAW or band-pass where it materially improves desense; verify insertion loss vs isolation trade.
- RF switch / duplexer: when sharing antennas, quantify insertion loss and ensure isolation is sufficient for coexistence.
- RF ESD: use low-capacitance devices; place at the entry with controlled return path to avoid re-radiation.
- Time-slicing: schedule GNSS measurement windows away from cellular TX bursts; pause UWB sessions during high-duty BLE scans if needed.
- Priority: emergency uplink and “proof-of-delivery” outrank background scans; prevent thrash loops by logging fallback reasons.
- Notch/filter strategy: apply only when tests show measurable improvement; otherwise keep the RF chain simpler.
- Return path discipline: route high di/dt loops away from RF; correlate near-field hotspots with receiver quality loss.
- Desense test: record GNSS C/N0 and fix rate while forcing cellular TX bursts; overlay time markers to prove causality.
- UWB degradation test: range residual / session fail rate vs BLE scan/connect duty; identify thresholds where errors jump.
- Near-field scan: find hotspots near PA/DC-DC/feeds; correlate with conducted/radiated emissions and receiver loss.
- “Worn” detune: repeat RF metrics with body proximity; compare to bench and adjust matching/placement margins.
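For the desense test, a simple first-pass metric is the difference between mean GNSS C/N0 in idle windows and in windows where a cellular TX marker is active. The sketch assumes each C/N0 sample has already been tagged with a TX-active flag from the time markers; the function name and sample values are hypothetical.

```python
from statistics import mean


def desense_delta_db(samples):
    """samples: iterable of (cn0_dbhz, tx_active). Returns idle mean minus TX mean."""
    samples = list(samples)
    tx = [cn0 for cn0, active in samples if active]
    idle = [cn0 for cn0, active in samples if not active]
    if not tx or not idle:
        raise ValueError("need samples from both TX and idle windows")
    return mean(idle) - mean(tx)


# Hypothetical capture: C/N0 drops while the modem transmits.
capture = [(40.0, False), (41.0, False), (33.0, True), (35.0, True), (39.0, False)]
delta_db = desense_delta_db(capture)
```

A delta that appears only when TX markers are present supports the coexistence explanation; a uniformly low C/N0 points at the antenna or sky view instead.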
H2-5. Voice I/O Chain for Emergency Use (Mic → Codec → Speaker)
Emergency voice must remain intelligible and continuous while the device is power-limited and radios may be struggling. This chapter defines the voice chain as a set of gated stages (wake → capture → enhance → encode → deliver/play) and ties each stage to measurable evidence: mic SNR, clipping/AGC counters, (optional) AEC convergence indicators, end-to-end latency, and dropout rate.
- Wake strategy trade: wake-on-voice can drain battery via false wakes; button-only is deterministic but requires user action.
- Tiny-speaker feedback risk: small transducers + enclosure leakage can cause howl/echo; the chain needs bounded gain and observable convergence.
- Poor-link scheduling: when cellular retries spike, RF time and peak current compete with audio; emergency policy must preserve voice continuity.
- Wind + friction noise: outdoor turbulence and body contact can saturate the mic/AFE; clipping control is as important as noise reduction.
| Stage | What to record | What it proves |
|---|---|---|
| Mic + AFE/ADC | mic_SNR_est, clip_count, AGC_state/steps, wind_flag (if available) | Distinguishes wind/friction saturation from link issues; explains harsh distortion vs mere noise. |
| DSP gates (VAD/NR/AGC/AEC) | VAD_on, NR_level, AEC_converged/residual (if used), limiter_hits | Shows whether enhancement is stable and bounded; flags “AEC never converges” or over-aggressive NR. |
| Codec + buffers | codec_mode, packetization_ms, jitter_buf_level, underrun_count | Explains choppiness from buffering/scheduling rather than from mic chain quality. |
| Radio scheduling | voice_priority_on, telemetry_suppressed_count, tx_retry_count | Proves voice was protected when the link was poor; detects starvation by telemetry or retries. |
| End-to-end | voice_latency_ms, dropout_rate, session_restart_count | Validates system-level targets; allows field comparison across coverage and temperature. |
Emergency mode should enforce a simple rule set that is provable by logs: voice frames are scheduled first, telemetry yields, and the system avoids thrashing when the link is unstable.
- Priority ladder: voice audio frames → delivery proof/ack metadata → location updates → non-critical telemetry.
- Bounded retries: if modem retries exceed a threshold, reduce telemetry frequency and protect audio buffers.
- Gain bounding: cap speaker/PA gain steps in emergency to prevent runaway feedback; log limiter hits.
- Fail-soft: if voice delivery degrades, fall back to short siren/beep patterns + minimal metadata payload (device-side behavior only).
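The priority ladder and the bounded-retry rule can be illustrated with a small priority queue. In this sketch telemetry is simply dropped (and counted) once modem retries exceed a threshold, which is a simplification of "reduce telemetry frequency"; the class, the kind names, and the threshold are assumptions.

```python
import heapq
from itertools import count

# Lower number = scheduled first, following the ladder above.
PRIORITY = {"voice": 0, "ack_meta": 1, "location": 2, "telemetry": 3}


class EmergencyTxQueue:
    def __init__(self, retry_threshold=5):
        self._heap = []
        self._seq = count()  # tie-breaker keeps FIFO order within a class
        self.retry_threshold = retry_threshold
        self.telemetry_suppressed_count = 0

    def push(self, kind, payload, modem_retries=0):
        if kind == "telemetry" and modem_retries > self.retry_threshold:
            self.telemetry_suppressed_count += 1  # evidence for the log
            return False
        heapq.heappush(self._heap, (PRIORITY[kind], next(self._seq), kind, payload))
        return True

    def pop(self):
        _, _, kind, payload = heapq.heappop(self._heap)
        return kind, payload


q = EmergencyTxQueue()
q.push("telemetry", "batt=71%")
q.push("location", "fix#12")
q.push("voice", "frame#1")
first = q.pop()
```

The suppressed counter maps directly onto the `telemetry_suppressed_count` field in the evidence table, so the policy is provable from logs.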
H2-6. Ultra-Low-Power Power Tree & Peak-Current Survival
Safety beacons must survive a power paradox: µA sleep for months, then high peak current during cellular TX bursts, GNSS acquisition, and speaker/siren drive. The power tree must be partitioned into sleep islands and peak domains, with buffers and bounded inrush so that emergency events do not cause brownouts.
- Sleep islands: RTC domain, wake controller, sensor hub, secure element keep-alive; ensure no back-powering through I/O.
- Peak handling: modem/PA TX bursts, GNSS acquisition, speaker/siren; keep peak paths isolated from always-on rails.
- Energy storage: Li-ion vs primary cell behavior under burst load; supercap buffers; bounded inrush charging.
- Fuel gauge correctness: separate ESR droop from true SoC; avoid false “empty” or sudden shutdown under bursts.
- Current profile: sleep (µA) / scan (mA) / TX burst (A peaks) / audio (hundreds of mA); capture all phases on one time-aligned trace.
- Rail droop: modem rail min voltage during worst burst; MCU rail min voltage; log reset reason codes.
- Brownout margin: worst-case ESR (cold + aged) model vs measured droop; verify UVLO thresholds and hysteresis.
- Inrush evidence: supercap charge current limit effectiveness; prove it does not trigger brownout at plug-in or wake.
- Fuel gauge sanity: SoC estimate stability during bursts; track “voltage sag vs SoC” correlation to prevent misreporting.
A robust pattern is to keep the always-on domain clean and minimal, then feed the modem/PA domain from a buffered branch. A supercap (or high-burst reservoir) can be OR-ed into the modem rail through an ideal-diode element, while inrush limiting prevents “buffer charging” from collapsing the system.
- Domain switches: load switches per domain to prevent leakage and back-power.
- Buffered modem rail: reservoir energy supports TX bursts without dragging down MCU/RTC rails.
- Bounded inrush: current limit + soft-start for supercap to avoid startup brownouts.
- Reason codes: log UVLO/brownout, watchdog, and modem fault states to separate power vs firmware issues.
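A first-order check of brownout margin is V_min ≈ V_oc − I_peak · ESR compared against the UVLO threshold, and a first-order reservoir size is C ≥ I · Δt / ΔV. The sketch below encodes both; it ignores path resistance, regulator dropout, and capacitor ESR, and every number in the example is hypothetical.

```python
def burst_droop_margin_v(v_open_circuit, i_peak_a, esr_ohm, uvlo_v):
    """Margin (V) between the estimated rail minimum and UVLO; negative means brownout risk."""
    v_min = v_open_circuit - i_peak_a * esr_ohm
    return v_min - uvlo_v


def min_buffer_capacitance_f(i_burst_a, burst_s, dv_allowed_v):
    """Capacitance (F) needed to supply the whole burst within the allowed droop."""
    if dv_allowed_v <= 0:
        raise ValueError("allowed droop must be positive")
    return i_burst_a * burst_s / dv_allowed_v


# Hypothetical cell: 3.6 V open-circuit, 1.5 A TX peak, 3.0 V UVLO.
warm_margin = burst_droop_margin_v(3.6, 1.5, 0.25, 3.0)  # fresh, room temperature
cold_margin = burst_droop_margin_v(3.6, 1.5, 0.50, 3.0)  # cold + aged, ESR doubled
c_min = min_buffer_capacitance_f(1.5, 0.002, 0.3)        # 2 ms burst, 0.3 V droop
```

With these numbers the warm case keeps positive margin while the cold, aged case goes negative, which is the situation a buffered modem rail is meant to cover.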
H2-7. Wake Sources, Sensor Hub, and Trigger Robustness
Trigger robustness is a system problem: an always-on sensor hub must wake the main MCU quickly, then a short confirmation window must discriminate real events from daily motion. This chapter defines the chain as a two-stage wake (threshold wake → confirm pipeline) and requires every decision to produce evidence: raw accel window + rail droop snapshot + reason/confidence codes.
- Accelerometer ODR & range: choose a low ODR for always-on thresholding, then raise ODR only inside the confirmation window.
- Thresholds + persistence: use a duration gate to avoid micro-spikes; track peak-g and duration as evidence.
- Orientation compensation: measure posture change (delta angle) to avoid “running vibration” masquerading as a fall.
- Motion classification (minimal set): a small feature set (impact peak, decay, posture flip, band energy) beats opaque heuristics.
- Strap / tamper debounce: treat strap removal and enclosure open as state machines with bounded debounce time.
| Evidence | What to capture | What it proves |
|---|---|---|
| Raw confirmation window | accel_raw_window (timestamped), peak_g, impact_duration_ms, orientation_delta_deg | Separates true impact/fall from cyclic vibration; supports post-event replay and threshold refinement. |
| Feature summary | band_energy_low/high, decay_slope, stillness_after_ms | Explains why an event was accepted/rejected without needing an app tutorial or ML training. |
| Power integrity at wake | VBAT_min, Vsys_min, droop_duration_ms, reset_reason_code | Detects “missed event” caused by rail droop, brownout, or sensor hub/main MCU transition issues. |
| Decision outputs | wake_reason, decision (alert/reject), confidence, reject_reason | Makes tuning convergent: every false alarm must map to a specific reject reason. |
Robust triggers require per-class accounting. A minimal confusion matrix should include: true fall, running, vehicle vibration, stairs/jumps, and drop-on-table. The goal is not perfect classification; the goal is bounded false alarms while keeping missed critical events near zero.
- True fall: single impact peak + posture flip, often followed by a short stillness window.
- Running: periodic energy with stable orientation; reject via cadence-like band energy signature.
- Vehicle vibration: low-frequency sustained energy, minimal posture change; reject via decay and orientation gates.
- Drop-on-table (desk drop): large impact peak but inconsistent posture + immediate subsequent motion; reject via post-impact pattern.
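Per-class accounting can be kept very small: count trials and alerts per labeled class, then report a detection rate for true falls and a false-alarm rate for everything else. The sketch assumes labeled bench or field trials are available as `(true_class, decision)` pairs; the class labels and trial counts are illustrative.

```python
from collections import Counter


def per_class_rates(trials):
    """trials: iterable of (true_class, decision), decision in {'alert', 'reject'}."""
    totals, alerts = Counter(), Counter()
    for true_class, decision in trials:
        totals[true_class] += 1
        if decision == "alert":
            alerts[true_class] += 1
    report = {}
    for cls, n in totals.items():
        rate = alerts[cls] / n
        # For true falls an alert is a detection; for other classes it is a false alarm.
        key = "detect_rate" if cls == "true_fall" else "false_alarm_rate"
        report[cls] = {key: rate, "n": n}
    return report


# Hypothetical trial set: 10 staged falls and 20 running sessions.
trials = (
    [("true_fall", "alert")] * 9 + [("true_fall", "reject")]
    + [("running", "reject")] * 19 + [("running", "alert")]
)
report = per_class_rates(trials)
```

Tracking the missed-fall count separately from the nuisance classes keeps threshold tuning from trading away critical detections for fewer false alarms.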
H2-8. Location Stack: GNSS + UWB Ranging + Dead-Reckoning (Evidence-first)
Location is only useful when its quality is explicit. This chapter defines a device-side location contract: every update carries a timestamp, source tag, confidence score, and fix age. GNSS, UWB ranging, and IMU-assisted dead-reckoning contribute as evidence-driven inputs, and fusion must degrade gracefully when any source weakens.
- GNSS start reality: cold vs warm start distributions; aiding retention reduces TTFF only when state is preserved and antenna noise is controlled.
- Antenna matching impact: degraded matching and desense show up as lower C/N0, fewer satellites, higher HDOP, and longer TTFF.
- UWB error sources: clock offset and multipath dominate; use range residuals and session success ratio to bound trust.
- Fusion rules: weight inputs by their quality metrics; confidence must drop when fix age grows or residuals rise.
- Dead-reckoning role: bridge short gaps and smooth transitions; never hide drift—expose it via confidence and fix age.
| Source | What to record | What it proves |
|---|---|---|
| GNSS | TTFF_ms, fix_type, sat_count, HDOP, C/N0_avg (or per-sat) | Explains outdoor accuracy and availability; exposes desense and poor antenna conditions. |
| UWB | range_residual_rms, session_ok_ratio, retry_count, NLOS_flag (if available) | Bounds indoor ranging trust; catches multipath and clock issues from device-side evidence alone, without anchor-deployment analysis. |
| IMU / DR | dr_age_ms, imu_quality_flag, smoothing_gain (optional) | Shows whether motion smoothing is bridging responsibly or drifting beyond safe use. |
| Fusion output | position, timestamp, source_tag, confidence_score, fix_age_ms | Guarantees responders and logs see not only where, but how trustworthy and how fresh. |
- Indoor transition: GNSS quality drops (C/N0↓, HDOP↑) → down-weight GNSS; rely more on UWB residual-bounded updates.
- UWB multipath: residuals spike or sessions fail → reduce UWB weight; hold last-known with explicit fix_age growth.
- All weak: both GNSS and UWB unreliable → DR bridges briefly; confidence decays with dr_age_ms and fix_age_ms.
- Never hide staleness: last-known must carry fix_age_ms and low confidence to prevent silent misinterpretation.
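The location contract and the "never hide staleness" rule can be sketched as an immutable record whose confidence decays with fix age. The exponential half-life law and its 30 s default are assumptions chosen for illustration; the field names follow the fusion-output row above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LocationUpdate:
    position: tuple          # (lat, lon)
    timestamp_ms: int        # when the fix was produced
    source_tag: str          # "GNSS", "UWB", "DR", or "LAST_KNOWN"
    confidence_score: float  # 0.0 .. 1.0 at fix time
    fix_age_ms: int = 0


def aged(update, now_ms, half_life_ms=30_000):
    """Return a copy with fix_age_ms set and confidence halved per half-life of age."""
    age_ms = max(0, now_ms - update.timestamp_ms)
    decay = 0.5 ** (age_ms / half_life_ms)
    return replace(
        update,
        fix_age_ms=age_ms,
        confidence_score=update.confidence_score * decay,
    )


fix = LocationUpdate((47.0, 8.0), 1_000, "GNSS", 0.8)
stale = aged(fix, now_ms=31_000)
```

Always derive the aged copy from the original fix rather than from a previously aged copy, otherwise the decay is applied twice.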
H2-9. Security & Privacy Hardware Chain (Without cloud talk)
The goal is simple and measurable: only signed firmware runs, keys never leave protected hardware, and every alert can carry a device-side trust state. This chapter stays strictly on the device: secure boot, rollback prevention, key vault / secure element boundaries, and integrity signals that reduce spoofing or tampering without cloud dependency.
- ROM root: immutable boot ROM (or equivalent) must be the first verifier of the next stage.
- Signed firmware: verify signature before execution; log the verify result and fail reason.
- Rollback protection: prevent “valid but older” images using monotonic counters or version fuses/secure storage.
- Runtime trust state: expose a compact attestation flag to be attached to alert payloads (device-side only).
- Key vault / secure element: keys live in protected silicon; crypto operations occur inside the vault/SE.
- Access control: separate key usage domains (identity, message auth, encryption) and track per-domain access counters.
- eSIM/iSIM: treat cellular identity credentials as hardware-isolated assets; keep interfaces narrow and auditable.
- Least disclosure: store only what the device must send (e.g., location + timestamp); avoid local caches that increase leakage risk.
Spoof resistance is bounded by power and form factor. The goal here is not perfect prevention, but detect-and-downgrade trust. When integrity signals degrade, the device should reduce location confidence, tag alerts accordingly, and avoid silently presenting stale or suspicious fixes.
- GNSS anomaly hints: sudden C/N0 pattern changes, unrealistic speed/accel, frequent fix-type toggling → spoof_suspect flag.
- UWB integrity hints: residual spikes, session failures/retries, abrupt range jumps → down-weight UWB contribution.
- Trust downgrade: confidence score and fix_age are the public truth; suspicious inputs must not look “normal.”
| Evidence | What to capture | Why it matters |
|---|---|---|
| Boot attestation flags | boot_state (ROM/BL/APP), sig_verify_result, rollback_check_result | Proves only verified images run; explains any boot anomalies in the field. |
| Signature verify logs | verify_fail_reason, image_version, secure_time (if available) | Separates corruption, tampering, and versioning mistakes. |
| Key audit counters | key_access_counter (per domain), key_export_attempts (should be 0) | Detects suspicious key use and enforces “keys never leave hardware.” |
| Anti-spoof indicators | gnss_spoof_suspect_flag, cn0_stats, uwb_residual_rms, session_ok_ratio | Supports detect-and-downgrade trust for location and ranging inputs. |
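One of the GNSS anomaly hints, unrealistic speed, can be checked from two consecutive fixes using the haversine distance. The 70 m/s plausibility limit is an arbitrary placeholder for a worn device, and a single exceedance should only raise a suspect flag rather than reject the fix outright.

```python
import math

EARTH_RADIUS_M = 6_371_000.0


def implied_speed_mps(lat1, lon1, t1_s, lat2, lon2, t2_s):
    """Great-circle distance between two fixes divided by their time difference."""
    dt = t2_s - t1_s
    if dt <= 0:
        raise ValueError("fixes must be in increasing time order")
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    distance_m = 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
    return distance_m / dt


def gnss_spoof_suspect(speed_mps, max_plausible_mps=70.0):
    """True when the implied speed exceeds what the wearer could plausibly do."""
    return speed_mps > max_plausible_mps


# Hypothetical jump of 0.01 degrees of latitude within one second.
speed = implied_speed_mps(47.00, 8.0, 0.0, 47.01, 8.0, 1.0)
suspect = gnss_spoof_suspect(speed)
```

The flag feeds the same detect-and-downgrade path as the other hints: it lowers confidence and tags the alert instead of silently discarding the position.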
H2-10. Ruggedization: EMC/ESD, Surge, Audio/RF Immunity, Environmental Limits
Ruggedization is not “add a TVS and hope.” It is an entry-point and return-path discipline: identify where energy enters (buttons, mic/speaker ports, RF feed, pads/cables), constrain where it returns, and protect the true victims (reset stability, RF sensitivity, GNSS acquisition, and emergency audio clarity). This chapter maps the hardening targets to measurable before/after evidence.
- ESD at buttons: clamp at the perimeter and keep discharge current out of sensitive reference grounds.
- Mic / speaker ports: treat apertures as ESD/EMI entry paths; protect nearby traces and bias networks.
- RF feed & antenna region: front-end ESD and filtering must avoid degrading sensitivity; keep return paths short and controlled.
- Pads / cables (if any): cable events inject surge/EFT-like energy; protect at connector and isolate fast edges from core rails.
- Environmental limits: low temperature raises battery ESR (droop during bursts); moisture/sweat increases leakage and bias drift.
- Buzz during TX: correlate audio noise bursts with modem TX scheduling; confirm whether coupling is via rails, ground, or near-field.
- Pop/click: check gain/AGC steps and rail droop at the same moment; “pop” is often a power integrity symptom.
- Isolation tactics (device-side): separate decoupling islands for audio; control return paths; avoid sharing high di/dt loops with mic bias.
| Evidence | What to record | Pass criteria style |
|---|---|---|
| Reset stability | reset_count, watchdog_count, brownout_count (pre/post stress) | Counters do not jump; reset reason codes remain clean under exposure. |
| RF sensitivity | RSSI/RSRP/SNR histograms, session_ok_ratio, retry_count (pre/post) | No significant sensitivity collapse; retries do not spike after events. |
| GNSS acquisition | TTFF_ms distribution, C/N0_avg, HDOP, fix_success_rate | TTFF and quality remain bounded; no persistent degradation after stress. |
| Audio immunity | audio_noise_delta during TX, pop_count, AGC anomaly markers | No audible buzz/pop during TX bursts; noise delta remains low and repeatable. |
| Trigger drift check | false_alarm_rate change, wake_reason distribution shift (pre/post) | Wake robustness remains stable; no drift toward increased false alarms. |
H2-11. Validation & Field Debug Playbook (Symptom → Evidence → Isolate → Fix)
This playbook forces a repeatable evidence chain for field failures. Each symptom is reduced with two measurements (one power-rail + one subsystem metric), then separated by a discriminator into a root-cause bucket, followed by the first fix and a re-test KPI. No cloud or app walkthrough is required.
| Field group | Minimum fields | Why it exists |
|---|---|---|
| Timeline (T0…T5) | T0 trigger, T1 MCU wake, T2 modem on, T3 location ready, T4 uplink sent/ACK, T5 voice start | Turns “slow” into a single blocking segment. |
| Power integrity | VBAT_min, VSYS_min, droop_ms, peak_current_hint (optional), brownout_count, reset_reason | Separates peak-current collapse from RF or logic issues. |
| Cellular | RSRP/RSRQ/SNR, attach_time_ms, retry_count, TX_burst_marker | Explains late SOS, choppy voice, and “fails only at edge coverage.” |
| GNSS / UWB | TTFF_ms, C/N0_avg, HDOP, fix_age_s, UWB_session_ok_ratio, UWB_residual_rms | Explains “no fix indoors” and integrity downgrades. |
| Audio | dropout_count, buffer_underflow, AGC_step_count, clip_count, audio_noise_delta_during_TX | Separates link loss from power/EMI coupling into audio path. |
| Trigger | wake_reason, confidence, reject_reason, false_alarm_counter, accel_feature_summary | Turns “false man-down” into measurable discriminators. |
- Rail: capture VBAT/VSYS min and droop_ms from T0→T4 (trigger to ACK).
- Cell metric: snapshot attach_time_ms + retry_count with RSRP/SNR.
- Power-dominant: VBAT_min dips near UVLO and attach restarts; brownout/reset counters increase around TX bursts.
- RF/coverage-dominant: rails stable but RSRP/SNR poor; attach_time histogram shifts long; retries spike at edge coverage.
- Scheduling-dominant: rails + RSRP are OK but delay sits between T1→T2 or T2→T4 (module not powered or priority wrong).
- Peak-current survival: add or resize a buffer path for the modem rail; reduce droop with lower ESR and controlled inrush.
- Priority: preempt non-critical scans; push SOS path to the highest priority window until ACK arrives.
- RF basics: quick antenna/keepout sanity check; avoid concurrent radios during initial attach window.
- TTFA distribution: worst-case shrinks; long tail collapses after fixes.
- Attach: attach_time_ms and retry_count reduce; no restart loops.
- Fuel gauge (burst-aware logging): TI BQ27441-G1, Maxim MAX17048
- Buck-boost for stable VSYS: TI TPS63070, TI TPS63031
- Load switch / rail gating: TI TPS22918, TI TPS22965
- Ideal diode / OR-ing: TI LM66100, Analog Devices LTC4412
- Supercap backup controller (modem burst buffer path): Analog Devices LTC3350
- LTE-M/NB-IoT modules (attach behavior varies by network): u-blox SARA-R5, Quectel BG95, Quectel BG77
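The power / RF-coverage / scheduling discriminator used for a late alert can be expressed as an ordered set of checks over the two measurements. This is a sketch only: the guard band, the RSRP limit, and the bucket names are assumed values that would need calibration per product and network.

```python
def classify_late_alert(vbat_min_mv, uvlo_mv, brownout_delta, rsrp_dbm,
                        droop_guard_mv=150, weak_rsrp_dbm=-115):
    """Map rail + cellular evidence to a root-cause bucket (thresholds are placeholders)."""
    # Power first: a rail that dips near UVLO or any new brownout explains attach restarts.
    if vbat_min_mv - uvlo_mv < droop_guard_mv or brownout_delta > 0:
        return "POWER_DOMINANT"
    # Rails stable but weak signal: coverage explains long attach and retries.
    if rsrp_dbm < weak_rsrp_dbm:
        return "RF_COVERAGE_DOMINANT"
    # Rails and RSRP both fine: look at the timeline for a scheduling gap.
    return "SCHEDULING_DOMINANT"


bucket = classify_late_alert(vbat_min_mv=3050, uvlo_mv=3000, brownout_delta=1, rsrp_dbm=-95)
```

Checking power before RF matters because a collapsing rail also degrades the radio metrics, so the reverse order would misattribute brownouts to coverage.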
- GNSS: capture TTFF_ms + C/N0_avg + HDOP plus fix_age_s.
- UWB: capture session_ok_ratio + residual_rms over a short window.
- GNSS input failure: persistently low C/N0 and long TTFF; confidence must drop instead of pretending “good fix.”
- Coexistence / desense: C/N0 collapses specifically during cellular TX bursts; correlates with TX markers.
- UWB failure: ok_ratio low or residual high (NLOS/multipath/slot collisions); indoor ranging unreliable.
- Fusion misuse: GNSS is bad but output confidence stays high; fix_age grows without downgrade.
- Trust downgrade first: enforce confidence + fix_age; tag “indoor/low-trust” rather than outputting stale location.
- Desense reduction: isolate GNSS supply/ground; schedule GNSS sampling away from TX bursts.
- UWB stabilization: improve session reliability (less concurrency, cleaner timing window); lower residual before “fusion trust.”
- Indoors: confidence drops quickly when GNSS is unusable; no “fake stable fix.”
- UWB: ok_ratio increases and residual_rms decreases; indoor location becomes explainable and repeatable.
- GNSS modules: u-blox MAX-M10S, u-blox NEO-M9N
- UWB transceiver / module: Qorvo DW3000, Qorvo DWM3000
- Low-noise LDO for RF islands: TI TPS7A20, Analog Devices ADP150
- RF switch (antenna path control / isolation use-cases): Skyworks SKY13351-378LF
- Rail: scope VSYS during TX; record VSYS_min + droop_ms.
- Reset evidence: read reset_reason + brownout_count aligned to TX markers.
- Power collapse: droop hits UVLO region; resets cluster at TX peaks; worse at cold temperature (higher ESR).
- Hardening/ESD path: no significant droop, but resets occur near touch/button/ports; resets correlate with disturbance events.
- Scheduling overload: droop occurs during CPU spikes (codec + modem + ranging); still proven by the rail waveform.
- Peak buffer: dedicated burst buffer path for modem rail (lower ESR, optional supercap branch, controlled inrush).
- Rail segmentation: gate non-essential islands off during attach/TX; avoid sharing di/dt loops with MCU core rails.
- Return-path discipline: keep ESD return current on the perimeter, away from sensitive grounds and reset pins.
- Resets: reset_count remains flat across repeated TX bursts and cold-start scenarios.
- Rails: VSYS_min margin increases; droop_ms shortens; no UVLO hits.
- Buck converters (main rail): TI TPS62840 (ultra-low-Iq, 750 mA class), TI TPS62130A (3 A class)
- Li-ion charger (thermal + robustness): TI BQ24074, TI BQ25895
- Battery protector (single-cell): TI BQ29700
- TVS arrays (ports/buttons proximity): Littelfuse SP0503BAHT, Semtech RClamp0524P
- Common-mode choke (cable/pads scenarios): TDK ACM2012-900-2P
- Link: capture RSRP/SNR + packet retry (or voice bearer error) during choppy moments.
- Audio evidence: capture dropout_count + buffer_underflow and audio_noise_delta_during_TX.
- Coverage-dominant: rails are stable; audio chain has no TX-correlated buzz, but retries/loss spike under weak RSRP/SNR.
- Power/EMI coupling: buzz/pop aligns with TX bursts; audio_noise_delta jumps; often tied to shared rails/returns.
- Scheduling-dominant: underflow spikes when concurrent tasks run (location + upload + voice); proven by buffer logs.
- Audio rail isolation: separate codec/amp supply decoupling island; avoid high di/dt return path overlap with mic bias.
- Priority: voice stream must preempt background ranging/log flush during active session.
- Link tolerance: switch to a more robust fallback voice mode under weak coverage.
- Audio: dropout_count and underflow drop to near-zero during repeated TX bursts.
- Immunity: audio_noise_delta_during_TX stays bounded; buzz/pop disappears.
- Audio codec: TI TLV320AIC3104, NXP SGTL5000
- Class-D speaker amp: Maxim MAX98357A, TI TPA2016D2
- Mic preamp / AGC (simple analog front-end option): Maxim MAX9814
- LDO for the audio island: TI TPS7A02 (nanopower, low-Iq), Analog Devices ADP7118 (low-noise)
- ESD on audio ports: Nexperia PESD5V0S1UL, Semtech RClamp0504S
- Trigger evidence: dump a short accel raw window + feature summary from stage-2 confirmation.
- Rail sanity: record VSYS_min during the same window to exclude power-induced sensor artifacts.
- Threshold/ODR mismatch: periodic vibration energy with small orientation change should be rejected (reject_reason must show it).
- Wear/attachment artifact: loose strap creates impulse peaks that look like impacts; raw window shows sharp spikes.
- State machine gap: no-motion timer and recovery logic conflict; repeated wake_reason patterns reveal it.
- Two-stage confirm: stage-1 sensitive wake is fine; stage-2 adds discriminators (orientation delta + post-event stillness).
- Debounce tamper: strap removal and enclosure sensors need stable debounce and reject_reason categories.
- Counter-driven tuning: log false_alarm_counter by reject_reason to converge quickly without guesswork.
- Confusion matrix: false positives drop while true events remain detectable.
- Explainability: reject_reason distribution becomes stable and scenario-consistent.
- Ultra-low-power accelerometer: ST LIS2DW12, Analog Devices ADXL362
- Higher-feature IMU (if fusion needed): Bosch BMI270
- Hall sensor for strap/tamper: TI DRV5032
- Secure element (tamper + key protection): Microchip ATECC608B, NXP SE050
- BLE SoC (sensor hub class): Nordic nRF52840, Nordic nRF5340
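The two-stage confirmation and the counter-driven tuning described in this section can be sketched as follows. This is a minimal illustration, not the device firmware: the thresholds, the reason strings, and the `stage2_confirm` helper are hypothetical names chosen for the example, and only `false_alarm_counter` and `reject_reason` come from the text above.

```python
from collections import Counter

# Illustrative thresholds only (assumptions); real values come from field tuning.
ORIENTATION_DELTA_MIN_DEG = 45.0
STILLNESS_MIN_S = 5.0

# Rejections counted per reject_reason, so tuning can follow the distribution.
false_alarm_counter = Counter()

def stage2_confirm(orientation_delta_deg, post_event_still_s):
    """Return (confirmed, reject_reason) for an event that already passed stage-1 wake."""
    if orientation_delta_deg < ORIENTATION_DELTA_MIN_DEG:
        reason = "small_orientation_delta"   # e.g. periodic vibration while running
    elif post_event_still_s < STILLNESS_MIN_S:
        reason = "no_post_event_stillness"   # wearer recovered quickly
    else:
        return True, None
    false_alarm_counter[reason] += 1
    return False, reason

running = stage2_confirm(orientation_delta_deg=8.0, post_event_still_s=0.0)
stumble = stage2_confirm(orientation_delta_deg=70.0, post_event_still_s=1.0)
fall = stage2_confirm(orientation_delta_deg=85.0, post_event_still_s=12.0)
print(running, stumble, fall, dict(false_alarm_counter))
```

Keeping stage-1 sensitive and pushing all rejection logic into one function means every rejected wake leaves a reason behind, which is what makes the reject_reason distribution usable as a re-test KPI.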
H2-12. FAQs ×12 (Evidence-first; no scope creep)
Each FAQ is answered using the same rule: First 2 measurements = one rail + one subsystem metric, then a discriminator to isolate the bucket, followed by the first fix and a re-test KPI. Device-side only (no cloud/app tutorials).
Panic button pressed, but the alert arrives 30–60 s late: blame GNSS or cellular attach first?
Break TTFA into segments. If attach_time and retry_count dominate while rails stay stable, the delay is cellular registration/coverage. If attach is fast but TTFF is long, or the device is waiting for an acceptable fix_age, GNSS is the blocker. The first fix follows the evidence: send an early payload with last-known location + confidence, then refine the location; re-test by confirming that the long tail of the TTFA distribution collapses.
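The segmentation can be sketched as below, assuming the T0–T5 checkpoints are logged as millisecond timestamps on one monotonic clock. The segment labels and the `ttfa_segments` helper are hypothetical names for illustration. The sketch also assumes the checkpoints are reached in order; if location and attach run in parallel, each stage would need its own start marker.

```python
# Hypothetical labels for the gaps between T0..T5:
# T0 trigger, T1 MCU awake, T2 location valid, T3 link ready, T4 payload sent, T5 ack.
SEGMENTS = ["wake", "location", "link", "send", "ack"]

def ttfa_segments(t_ms):
    """Split one event's checkpoint timestamps (ms) into per-segment durations."""
    if len(t_ms) != 6:
        raise ValueError("expected six checkpoints T0..T5")
    deltas = [b - a for a, b in zip(t_ms, t_ms[1:])]
    if any(d < 0 for d in deltas):
        raise ValueError("checkpoints must be non-decreasing (sequential stages assumed)")
    per_segment = dict(zip(SEGMENTS, deltas))
    dominant = max(per_segment, key=per_segment.get)
    return {"ttfa_ms": t_ms[5] - t_ms[0], "segments": per_segment, "dominant": dominant}

# Example event: a slow fix dominates a 41.2 s alert.
example = ttfa_segments([0, 40, 38_000, 39_500, 39_900, 41_200])
print(example["ttfa_ms"], example["dominant"], example["segments"])
```

Running this over many logged events and reporting the dominant segment per event gives a direct answer to "GNSS or cellular" without guessing.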
Works outdoors, fails indoors—prove it’s GNSS limitation vs UWB/anchor issue?
Compare input quality: low C/N0 with long TTFF indicates GNSS limitation indoors, while UWB health is shown by session_ok_ratio and residual_rms. If GNSS is poor but UWB residual stays low, rely on UWB with explicit confidence. If UWB ok_ratio drops or residual rises, it is an indoor ranging integrity failure. Re-test by confidence correctness and UWB residual stability.
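One way to make the "rely on UWB with explicit confidence" rule concrete is a small source selector. The sketch below is an assumption-heavy illustration: the threshold values, the metre unit for residual_rms, and the `select_source` helper are not from the source and would need to be set from field data.

```python
from dataclasses import dataclass

# Illustrative thresholds only (assumptions).
CN0_MIN_DBHZ = 30.0
FIX_AGE_MAX_S = 10.0
UWB_OK_RATIO_MIN = 0.8
UWB_RESIDUAL_MAX_M = 0.5

@dataclass
class LocationInputs:
    cn0_avg_dbhz: float
    fix_age_s: float
    uwb_ok_ratio: float
    uwb_residual_rms_m: float  # unit assumed to be metres

def select_source(x: LocationInputs) -> str:
    """Pick the location source whose input quality supports it, else downgrade."""
    gnss_ok = x.cn0_avg_dbhz >= CN0_MIN_DBHZ and x.fix_age_s <= FIX_AGE_MAX_S
    uwb_ok = (x.uwb_ok_ratio >= UWB_OK_RATIO_MIN
              and x.uwb_residual_rms_m <= UWB_RESIDUAL_MAX_M)
    if gnss_ok:
        return "gnss"
    if uwb_ok:
        return "uwb"
    return "low-trust"  # tag the output instead of reporting a stale fix

outdoor = select_source(LocationInputs(38.0, 2.0, 0.9, 0.2))
indoor_uwb = select_source(LocationInputs(22.0, 40.0, 0.95, 0.2))
indoor_bad = select_source(LocationInputs(22.0, 40.0, 0.4, 1.5))
print(outdoor, indoor_uwb, indoor_bad)
```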
Random reboot only during cellular TX—what 2 rails prove peak-current collapse?
Prove peak-current collapse with a TX-correlated rail capture: log VBAT_min/VSYS_min and droop_ms aligned to TX markers, then confirm with reset_reason and brownout_count. If droop reaches the UVLO region and resets cluster at TX bursts, the failure is peak-current collapse. First fix is a modem-rail buffer path (lower ESR, controlled inrush, optional supercap branch); re-test is zero resets across repeated TX bursts at cold temperature.
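Extracting the two rail metrics from a scope export can be sketched as follows. The sketch assumes uniformly sampled voltage values and defines droop_ms as the longest contiguous time below a chosen threshold; that definition, the threshold, and the `droop_metrics` helper are assumptions for illustration.

```python
def droop_metrics(samples_v, dt_ms, v_threshold):
    """Return (vsys_min, droop_ms) for one TX-aligned capture.

    droop_ms is taken as the longest contiguous time spent below v_threshold.
    """
    if not samples_v:
        raise ValueError("empty capture")
    longest = run = 0
    for v in samples_v:
        run = run + 1 if v < v_threshold else 0
        longest = max(longest, run)
    return min(samples_v), longest * dt_ms

# Example capture around one TX burst, sampled every 0.5 ms.
capture = [3.70, 3.68, 3.31, 3.12, 3.05, 3.20, 3.55, 3.69]
vsys_min, droop_ms = droop_metrics(capture, dt_ms=0.5, v_threshold=3.3)
print(vsys_min, droop_ms)
```

Comparing vsys_min against the UVLO threshold of the regulator or MCU supervisor shows the remaining margin, and the same function can be re-run after the buffer fix to confirm the droop shortened.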
Voice is choppy only when location updates run—scheduling or shared clock?
If dropout_count or buffer_underflow spikes exactly at the location update tick while rails and RF are stable, scheduling/resource contention is proven. If choppiness correlates with RF activity markers and audio noise rises during TX, shared supply/return coupling is likely. First fix is hard time-slicing: voice path preempts location work; location updates are rate-limited and shifted away from voice windows. Re-test is near-zero underflow and stable voice latency.
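The "spikes exactly at the location update tick" test can be made measurable by counting how many underflow events fall shortly after a tick. The sketch below assumes both logs share one millisecond timebase; `fraction_near_ticks` and the 50 ms window are hypothetical choices.

```python
def fraction_near_ticks(event_ms, tick_ms, window_ms):
    """Fraction of events that occur within window_ms after any tick."""
    if not event_ms:
        return 0.0
    hits = sum(any(0 <= e - t <= window_ms for t in tick_ms) for e in event_ms)
    return hits / len(event_ms)

underflows = [1005, 2010, 3002, 3650]        # buffer_underflow timestamps
location_ticks = [1000, 2000, 3000, 4000]    # location update ticks
ratio = fraction_near_ticks(underflows, location_ticks, window_ms=50)
print(ratio)
```

Running the same function against TX markers instead of location ticks gives the second discriminator: a high ratio against ticks points to scheduling, a high ratio against TX markers points to coupling.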
UWB ranging error jumps when BLE is active—desense or time-slice bug?
Track UWB_residual_rms against a BLE activity marker. If residual rises proportionally with BLE TX windows and improves when BLE is idle, the issue is coexistence/desense (front-end, antenna keepout, or coupling). If residual spikes at fixed schedule boundaries even with low BLE activity, it is a time-slice/priority bug. First fix is enforced non-overlap scheduling plus RF isolation (filters/keepouts); re-test is residual flattening and higher UWB ok_ratio during BLE traffic.
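A first-pass check of the desense hypothesis is to compare the mean residual in BLE-active windows against BLE-idle windows. The sketch below assumes each ranging sample is logged with a boolean BLE activity marker; `residual_by_ble_state` is a hypothetical helper.

```python
from statistics import mean

def residual_by_ble_state(samples):
    """samples: iterable of (residual_rms, ble_active) pairs.

    Returns (mean residual while BLE is active, mean residual while idle).
    """
    samples = list(samples)
    active = [r for r, ble in samples if ble]
    idle = [r for r, ble in samples if not ble]
    if not active or not idle:
        raise ValueError("need samples from both BLE-active and BLE-idle windows")
    return mean(active), mean(idle)

log = [(0.10, False), (0.12, False), (0.30, True), (0.34, True), (0.11, False)]
active_mean, idle_mean = residual_by_ble_state(log)
print(active_mean, idle_mean)
```

A clear gap between the two means supports coexistence/desense; similar means with spikes at fixed schedule boundaries point back to the time-slice hypothesis.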
False ‘man-down’ during running—thresholding vs orientation logic?
Dump the stage-2 confirmation window: accel raw + feature summary plus reject_reason. Periodic vibration with small orientation delta indicates threshold/features are misclassifying running; large orientation delta followed by rapid recovery suggests state-machine logic (confirm window or recovery gate). First fix is two-stage confirm: sensitive wake is kept, but stage-2 adds orientation delta + post-event stillness and stable debounce for strap/tamper. Re-test is a confusion-matrix drop in false positives without losing true events.
Missed fall events—sensor saturation or wake latency?
Separate “not captured” from “captured but rejected.” First measure saturation flags or max amplitude in the accel window, then measure T0→T1 wake latency and the confirmation window length. If saturation clips the event, range/ODR is wrong and features become unreliable. If the event is not in the window, wake/confirm latency is too long and pre-trigger buffering is insufficient. First fix is widening the confirm window (with enough pre-trigger buffering) and optimizing two-stage wake, plus correcting range/ODR if saturation was seen; re-test is improved detection rate across controlled motion scripts.
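Pre-trigger buffering can be illustrated with a fixed-size ring buffer that always holds the most recent samples, so the confirmation window still contains the impact even if wake is slow. This is a host-side sketch with a hypothetical `PreTriggerBuffer` class; on a real device the same role may be filled by a sensor FIFO or a firmware ring buffer.

```python
from collections import deque

class PreTriggerBuffer:
    """Keeps the most recent `pre_samples` accel samples for the confirm window."""

    def __init__(self, pre_samples):
        # deque with maxlen silently drops the oldest sample when full.
        self._ring = deque(maxlen=pre_samples)

    def push(self, sample):
        self._ring.append(sample)

    def snapshot(self):
        """Return buffered samples, oldest first, at the moment of the trigger."""
        return list(self._ring)

buf = PreTriggerBuffer(pre_samples=4)
for s in range(10):          # samples 0..9 arrive before the trigger
    buf.push(s)
window = buf.snapshot()
print(window)
```

The buffer depth needed is roughly the worst-case wake latency multiplied by the output data rate, plus the part of the event that precedes the interrupt threshold crossing.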
Battery shows 40% then dies during SOS—fuel gauge model or burst ESR?
Determine whether the device dies from voltage droop or from a wrong SOC estimate. Measure VBAT droop and temperature during TX bursts, then compare with fuel-gauge flags/model behavior. Large droop, especially at cold, indicates high ESR and insufficient burst buffering; fix peak survival before tuning the model. If droop is modest but the SOC estimate is inconsistent, adjust fuel-gauge parameters and learning. Re-test is no brownout during repeated SOS while SOC remains above a protected threshold.
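The droop measurement translates into an effective source resistance with a first-order estimate, ESR ≈ ΔV / ΔI. The sketch below assumes a purely resistive model (cell, protection FET, and traces lumped together, no capacitance); both helper functions and the example numbers are illustrative.

```python
def estimate_esr_ohm(v_idle, v_burst_min, i_idle_a, i_burst_a):
    """First-order effective source resistance from one TX burst capture."""
    di = i_burst_a - i_idle_a
    if di <= 0:
        raise ValueError("burst current must exceed idle current")
    return (v_idle - v_burst_min) / di

def predicted_vmin(v_open, esr_ohm, i_burst_a):
    """Rail minimum expected for a given burst current under the same model."""
    return v_open - esr_ohm * i_burst_a

# Example: 0.6 V droop for a 1.5 A current step.
esr = estimate_esr_ohm(v_idle=3.70, v_burst_min=3.10, i_idle_a=0.01, i_burst_a=1.51)
vmin_at_low_soc = predicted_vmin(v_open=3.50, esr_ohm=esr, i_burst_a=1.5)
print(round(esr, 3), round(vmin_at_low_soc, 3))
```

If the predicted minimum at the reported SOC falls below the UVLO threshold, the "40% then dies" symptom is explained by burst ESR rather than by the fuel-gauge model.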
GNSS fix time worsens over months—antenna aging or firmware retention?
Trend the RF input first: compare long-term C/N0_avg and satellite visibility across the same test route. If C/N0 steadily drops, suspect antenna matching drift, contamination, moisture ingress, or keepout changes. If C/N0 stays healthy but TTFF grows and warm-start rate falls, retention/backup-domain handling is failing (aiding not retained, cold starts increase). First fix is protecting antenna integrity and verifying backup/retention rails and state; re-test is stabilized TTFF distribution over time.
Audio has RF ‘buzz’ during TX—ground return or codec front-end?
Align audio artifacts to TX markers: measure audio_noise_delta_during_TX and check ripple on codec/speaker-amp rails or mic-bias. If buzz tracks rail ripple, it is supply/return coupling and needs stronger island decoupling and cleaner return paths. If buzz is channel-specific and changes with input routing or impedance, the codec front-end/mic path is picking up RF and needs filtering and layout fixes. First fix is audio island isolation with a cleaner return path; re-test is a bounded audio_noise_delta_during_TX and no audible buzz during repeated TX bursts.
Device is spoofed/jammed—what device-side signals detect it?
Use device-side integrity signals, not network assumptions. Flag GNSS anomalies when position jumps while IMU motion and UWB residual do not support the change, or when fix confidence is inconsistent with C/N0 patterns and fix_age. If GNSS degrades but UWB remains stable, treat GNSS as untrusted and downgrade. If both GNSS and UWB residuals worsen with RF metrics, treat as interference and enter conservative alert mode. Re-test is correct integrity flags and safe downgrade behavior in repeatable disturbance scripts.
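The "position jumps while IMU motion does not support the change" check can be sketched as a plausibility test on consecutive fixes. The sketch assumes positions are already in a local east/north frame in metres and that an upper bound on speed is available from the IMU motion state; `gnss_jump_suspicious`, `imu_speed_max_mps`, and the margin are hypothetical.

```python
import math

def gnss_jump_suspicious(p0, p1, dt_s, imu_speed_max_mps, margin_m=20.0):
    """Flag a GNSS position change that IMU-bounded motion cannot explain.

    p0, p1: (east_m, north_m) positions in a local frame.
    margin_m: allowance for normal GNSS noise (illustrative value).
    """
    jump_m = math.dist(p0, p1)
    plausible_m = imu_speed_max_mps * dt_s + margin_m
    return jump_m > plausible_m

# 500 m in 10 s while the IMU indicates walking pace: not supported by motion.
spoof_like = gnss_jump_suspicious((0.0, 0.0), (300.0, 400.0), dt_s=10.0, imu_speed_max_mps=2.0)
normal = gnss_jump_suspicious((0.0, 0.0), (3.0, 4.0), dt_s=10.0, imu_speed_max_mps=2.0)
print(spoof_like, normal)
```

A raised flag on its own only marks GNSS as untrusted; combining it with UWB residual and C/N0 patterns, as described above, decides between downgrade and conservative alert mode.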
ESD passes in lab but field resets happen—what entry points are most common?
Field resets usually enter through user-contact or aperture points: button, strap sensor, mic/speaker ports, charging pads, enclosure seams, and any exposed metal. Capture reset_reason and correlate to handling events, then check whether rail minima dip (brownout path) or remain stable (ESD/return-path path). If brownout is seen, reinforce power buffering and return routing; if not, add targeted TVS/RC at entry points and keep ESD current on the perimeter ground. Re-test is zero resets during controlled handling and repeated contact disturbances.
A navigation map that forces every FAQ back to the same evidence buckets (Power, Cellular, GNSS, UWB, Audio, Trigger/State).