Motion Coach Chest Strap: IMU + HR AFE, BLE, Power & Logging
Key takeaway: A motion-coach chest strap is only as reliable as its evidence chain—electrode contact quality, IMU integrity, time alignment, BLE framing, power states, and atomic logging must be measurable and gated. When any link degrades, the system should downgrade output with clear flags instead of producing confident but wrong coaching events.
H2-1. What a Motion Coach Chest Strap Is (and isn’t)
Definition (engineering-only): A motion-coach chest strap is a chest-mounted electrode interface that measures R-R interval via an ECG HR AFE, captures motion events via an IMU, extracts on-device features on an edge MCU, and delivers time-stamped packets over BLE—while meeting strict low-power and traceability requirements.
Why chest straps are unique (constraints that dominate design):
- Electrode contact is variable: strap tension, skin dryness vs sweat, and micro-slip cause impedance shifts, baseline drift, and transient saturation.
- High acceleration is a coupled disturbance: impact/vibration can create IMU clipping and also inject motion artifact into the electrode interface.
- Coaching depends on alignment: IMU events (cadence/impact) must be time-aligned to R-peak/R-R with controllable drift and buffering latency.
What “motion coach” means here: local, testable computations such as quality gating, event detection, and feature summarization (e.g., cadence/impact markers), not cloud training plans or app UX workflows.
Typical data path (testable signal chain):
- Electrodes → HR AFE → R-peak / R-R + quality flags (lead-off/saturation counters).
- IMU → motion features/events (clip count, RMS, orientation stability).
- Edge MCU → aligned feature packets (timestamp + sequence + quality).
- BLE → realtime notifications and/or batch sync; local log remains the source of truth for gaps.
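As a concrete sketch of the first hop (electrodes → AFE → R-R), the snippet below derives R-R intervals from R-peak timestamps while honoring quality flags. Function names and the 250–2000 ms plausibility window are illustrative assumptions, not values from any specific AFE driver:

```c
/* Sketch: turn R-peak timestamps into contact-gated R-R intervals.
 * Names (rr_from_peaks) and limits are illustrative only. */
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

#define RR_MIN_MS 250u   /* reject > 240 bpm as probable artifact */
#define RR_MAX_MS 2000u  /* reject < 30 bpm as a probable dropped beat */

typedef struct {
    uint32_t t_ms;     /* R-peak timestamp, MCU timebase */
    uint8_t  lead_on;  /* 1 = contact trusted at this peak */
} r_peak_t;

/* Writes accepted R-R intervals (ms) into out[]; returns count.
 * An interval is emitted only if both bounding peaks had contact
 * and the interval is physiologically plausible. */
size_t rr_from_peaks(const r_peak_t *peaks, size_t n,
                     uint16_t *out, size_t out_cap)
{
    size_t emitted = 0;
    for (size_t i = 1; i < n && emitted < out_cap; i++) {
        uint32_t rr = peaks[i].t_ms - peaks[i - 1].t_ms;
        if (!peaks[i].lead_on || !peaks[i - 1].lead_on)
            continue;                 /* gate: contact not trusted */
        if (rr < RR_MIN_MS || rr > RR_MAX_MS)
            continue;                 /* gate: implausible interval */
        out[emitted++] = (uint16_t)rr;
    }
    return emitted;
}
```

The gate-then-emit pattern is the point: an interval that cannot be trusted is dropped and counted, never silently smoothed into the stream.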
Not covered (to prevent scope creep):
- Cloud dashboards, mobile app walkthroughs, or training-plan content.
- Other form factors (smart rings, sleep headbands, EMG patches, pendants).
- Deep BLE protocol-stack internals (only device-level parameters + evidence).
- Medical diagnosis/claims (engineering evidence only).
Evidence to collect (first, before guessing)
- Contact / AFE: lead-off state, saturation count, and an AFE output sample window during motion.
- Transport: sequence-counter gaps (or retry counters) across realtime notifications.
- Alignment: timestamp drift between IMU event markers and R-peak/R-R markers (distribution over time).
H2-2. Target Metrics & Real Failure Modes (spec the problem correctly)
Chest-strap performance should be specified as an engineering contract: what “good” looks like, what evidence proves failure, and which root-cause bucket it belongs to. Three metric families (accuracy, continuity, power), plus logging integrity as a traceability backstop, are sufficient to drive the entire design and debug loop:
- Accuracy: R-R stability during motion without being dominated by contact-induced artifact.
- Continuity: data arrives complete and time-consistent (no silent gaps, no drifted alignment, controlled latency).
- Power: average current meets battery-life targets and peak events do not trigger brownout / reboot / reconnection storms.
Common failure buckets (use these to avoid guessing):
- Contact: electrode impedance swings, lead-off toggling, baseline shifts, AFE saturation.
- Sync/alignment: timestamp drift, buffering jitter, IMU-event ↔ R-peak misalignment.
- Link/queue: BLE notification pacing, sequence gaps, retry storms, reconnection loops.
- Power/wake storms: unexpected wake sources, peak current bursts, UVLO resets.
- Logging/storage: non-atomic commits, CRC gaps, duplicates/holes after resets.
Metric matrix (what to measure, and what it proves):
| Metric family | What “bad” looks like | Evidence to collect (minimal set) | Likely bucket |
|---|---|---|---|
| Accuracy (R-R under motion) | HR/R-R stable at rest but noisy or dropped during runs/intervals. | Lead-off state + saturation count + short AFE output window during motion; quality flags vs IMU intensity (impact windows). | Contact → (then) Sync |
| Continuity (complete & time-consistent) | Reordered packets, silent gaps, “laggy” coaching cues. | Sequence counter gaps; retry/reconnect counters; timestamp drift distribution (IMU events vs R-peak markers). | Link/queue or Sync |
| Power (battery life + peak robustness) | Battery drains fast only during workouts; random resets when transmitting or logging. | Current profile: sleep / active / radio burst / storage commit; wake-reason counters; UVLO/reset logs. | Power/wake storms |
| Logging integrity (traceability) | Data holes or duplicates after charging, syncing, or unexpected resets. | Timestamp + sequence + CRC per record; commit markers around resets; gap statistics. | Logging/storage |
Engineering rule of thumb: many “accuracy” complaints are actually continuity or alignment failures. Always prove the bucket using counters and timestamps before changing algorithms.
Evidence to collect (first, before tuning)
- E1 Contact: lead-off toggles + saturation counter + short AFE sample during motion windows.
- E2 Motion: IMU clip count + RMS level (impact intensity) across the same window.
- E3 Transport: seq gap rate + retry/reconnect counters during realtime vs batch modes.
- E4 Alignment: timestamp drift histogram between IMU events and R-peak markers.
- E5 Power: average current + peak current during radio bursts and storage commits.
H2-3. End-to-End Architecture (signal chain you can test)
Goal: define a complete signal chain that can be measured, time-aligned, and root-caused. A chest-strap “motion coach” is not validated by a single heart-rate number; it is validated by a chain of events + timestamps + quality flags that remain consistent from sensor front-ends to BLE packets and local logs.
Three parallel paths that must agree:
- HR event path: electrodes → HR AFE → MCU (R-peak/R-R + contact quality).
- Motion event path: IMU → MCU (cadence/impact features + clip indicators).
- Traceability path: MCU → local log (timestamp + sequence + CRC + commit marker) → later sync/recovery.
Data model (what the system really outputs):
- R-R events: time-stamped intervals and R-peak markers, accompanied by quality flags (lead-off state, saturation count, contact score).
- Motion features: time-stamped cadence/impact markers, with integrity flags (IMU clip count, RMS intensity, orientation stability).
- Transport frames: BLE payloads that include sequence counter + timestamp + mode (realtime notify vs batch sync) so gaps and reordering are provable.
- Log records: durable entries with time + seq + CRC and an explicit commit marker to survive resets without duplication/holes.
Timing domains (keep it testable)
All sensors and packets must reference a common notion of time. If the IMU has its own clock, drift must be observable via timestamps and corrected via alignment logic.
Root-cause readiness
Every segment must expose at least one “proof signal”: a waveform, a counter, or a log field that can confirm or exclude a bucket.
Observability checklist (TP points + what they prove)
- TP1 AFE output window: proves saturation/baseline drift vs true R-peak visibility.
- TP2 Lead-off / contact score: proves electrode contact transitions (dry/sweat/slip).
- TP3 IMU clip/RMS: proves impact/clipping that can create false cadence/impact events.
- TP4 MCU timebase: proves timestamp stability and alignment feasibility.
- TP5 Buffer/queue depth: proves latency comes from scheduling/queues (not “wireless magic”).
- TP6 BLE seq gaps/retries: proves packet loss, pacing issues, or reconnection storms.
- TP7 Log CRC/commit markers: proves durable integrity (holes/duplicates) across resets.
- TP8 Reset/UVLO reason: proves power-related resets during bursts (radio or storage commits).
H2-4. HR Front-End on a Chest Strap (electrode contact is everything)
The chest-strap HR front-end is dominated by the electrode–skin interface. When motion begins, contact impedance and micro-slip change faster than most people expect, and those changes can push the AFE into baseline drift, transient saturation, or repeated lead-off toggling. If these states are not observable, the system will mislabel contact artifact as “heart-rate variability.”
Contact-driven failure chain (field-realistic):
- Dry skin: higher impedance increases susceptibility to common-mode pickup; R-peaks shrink relative to noise floor.
- Sweat transitions: impedance drops but becomes unstable; polarization-like drift and sudden step changes become common.
- Micro-slip under impact: low-frequency baseline shifts and fast transients can clip the input or trigger false peaks.
- Strap tension changes: amplifies all three behaviors, causing “good at rest, bad when running.”
Input protection and RFI/ESD hardening (engineering meaning): protection networks must prevent damage without consuming the usable input range or introducing non-linear rectification paths that turn RF events into low-frequency drift. A robust design separates “survivability” from “signal integrity” by ensuring that protection does not dominate the waveform during motion.
AFE parameter priorities (what they influence in practice):
- Input range / recovery: determines whether impact-induced transients cause extended saturation (and long false gaps).
- CMRR + reference strategy: determines how well common-mode disturbances are rejected when contact changes.
- Input noise: determines detection margin for small R-peaks under high-impedance contact conditions.
- Lead-off / contact sensing: provides the only reliable way to gate output when the interface is not trustworthy.
Engineering rule: contact quality must be treated as a first-class signal.
- Lead-off state proves electrode detachment or intermittent contact.
- Contact impedance trend / contact score proves gradual degradation before obvious dropout.
- Saturation count / baseline drift indicators prove that the front-end is being driven out of its linear region.
Minimal waveform set (2 traces that settle most debates)
- Trace A: a short AFE output capture during a motion/impact window (shows clipping, baseline shifts, false peaks).
- Trace B: lead-off/contact-quality metric over the same time window (proves whether the interface changed).
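A minimal way to act on Trace B is a hysteretic gate, so lead-off chatter cannot toggle HR output on every sample. The thresholds here (open at 70, close at 40 on a 0–100 contact score) are illustrative assumptions:

```c
/* Sketch: hysteretic contact gate. Thresholds are placeholders to be
 * replaced by values validated against Trace A/B captures. */
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

typedef struct {
    bool trusted;          /* current gate state */
    uint16_t open_thresh;  /* score needed to (re)trust contact */
    uint16_t close_thresh; /* score below which trust is dropped */
} contact_gate_t;

/* Feed one contact-quality score (0..100); returns whether HR events
 * should be published for this window. */
bool contact_gate_update(contact_gate_t *g, uint16_t score)
{
    if (g->trusted && score < g->close_thresh)
        g->trusted = false;
    else if (!g->trusted && score >= g->open_thresh)
        g->trusted = true;
    return g->trusted;
}
```

The gap between the two thresholds is what absorbs sweat transitions and micro-slip: a score bouncing between 45 and 65 changes nothing.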
H2-5. IMU Subsystem for Coaching (sampling, bias, and shock)
Goal: treat the IMU as a coaching sensor, not a generic motion chip. The output is only useful when it consistently produces time-stamped events/features (cadence/impact/swing/stillness) with explicit quality flags (clip, RMS intensity, bias health, temperature drift). Without quality, high-energy motion turns into believable—but wrong—features.
Coaching-oriented outputs (what to log and validate):
- Events/features: cadence markers, foot-strike/impact markers, swing intensity, stillness state.
- Integrity flags: clip_count, RMS_intensity, axis energy ratio, orientation stability flag.
- Health flags: bias health, temperature drift flag, self-test pass/fail summary.
- Timing: per-sample or per-event timestamp suitable for cross-sensor alignment (see H2-6).
Sampling rate & bandwidth (spec from the feature need): coaching features are driven by two regimes: (1) lower-frequency cadence/swing patterns and (2) fast transients at impact. If the bandwidth is too low, impacts smear into slow ramps and event timing drifts; if bandwidth is high but quality flags are missing, high-frequency vibration becomes false triggers. Configuration must be validated via event timing stability and false-trigger rate, not by “nice-looking” waveforms.
Dynamic range & clipping (the most common root cause of fake events): impact can exceed IMU range during running, jumping, or abrupt torso motion. Once clipped, many features appear artificially “consistent,” and threshold crossings shift in time. A coaching pipeline must treat clip_count as a first-class input—otherwise “more motion” can look like “better cadence.”
Mounting & coordinate drift (chest strap specific): strap tension changes, sweat lubrication, and fabric movement can cause small rotations/offsets of the IMU module. This changes axis projection and can drift feature thresholds over a session. Drift must be observable using simple indicators: axis energy ratio, gravity direction consistency, and sudden orientation jumps that correlate with false cadence/impact events.
Self-check & consistency (bias / temperature / stillness): bias and temperature drift show up first in “quiet” segments. A robust system validates itself continuously: during stillness windows, RMS should drop near the motion floor and bias health should remain stable; during motion windows, RMS and clip_count should rise in predictable ranges, not explode unpredictably.
Evidence pack (minimal, field-friendly)
- Stillness window (30–60s): RMS_floor, bias_health, stillness_state, temperature trend.
- Run/impact window: RMS_intensity, clip_count, event_count (cadence/impact).
- Strap-shift window: axis energy ratio drift + orientation stability flag changes.
H2-6. IMU-HR Time Alignment (the silent killer)
Goal: make “coaching feel” measurable. HR events (R-peak / R-R) and IMU events (impact/cadence) can each be correct yet still fail coaching if they are not mapped onto the same time axis. Misalignment produces believable but wrong causality: impact appears to “cause” a heart-rate change too early/late, and cadence markers drift relative to R-R trends over a session.
Align events, not raw waveforms:
- HR side: align to R-peak event time (and R-R interval boundaries), not just a smoothed HR number.
- IMU side: align to impact/cadence event time, with clip/RMS flags to suppress corrupted windows.
- Record both: raw capture time and event decision time (so fixed pipeline latency stays visible).
Timestamps (single-clock vs dual-clock): single-clock systems stamp every event in the MCU timebase; dual-clock systems must map IMU time (FIFO sample counter or internal timestamp) into MCU time. Dual-clock mapping is never “set once”: drift (ppm-level), temperature changes, and resynchronization events shift the mapping, causing step errors in alignment unless drift and resync are explicitly tracked.
Buffers & queues (where fake delay is born): even with a perfect clock model, queueing can dominate. FIFO depth, task preemption, and packetization schedules introduce variable latency between “sample happened” and “event got stamped/sent.” If alignment error correlates with queue depth or task latency, the root cause is local scheduling—not the sensor.
- Every event record carries timestamp + sequence (and quality flags).
- Dual-clock systems maintain a simple drift estimator (slope) and offset; resync increments resync_count.
- Log queue_depth_max and task_latency_ms alongside event timestamps.
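A minimal dual-clock mapper can be sketched from periodic sync pairs (IMU tick count, MCU milliseconds); the names are illustrative, and a production build would low-pass the slope and bound step corrections instead of taking each pair at face value:

```c
/* Sketch: two-point drift estimator mapping an IMU tick counter into
 * the MCU millisecond timebase. Illustrative, not a library API. */
#include <stdint.h>
#include <assert.h>

typedef struct {
    uint32_t imu_ref;     /* IMU ticks at last sync */
    uint32_t mcu_ref_ms;  /* MCU time at last sync */
    double   ms_per_tick; /* current slope estimate */
    uint32_t resync_count;
} clk_map_t;

/* Record a sync pair; re-estimate the slope from the previous pair. */
void clk_map_sync(clk_map_t *m, uint32_t imu_ticks, uint32_t mcu_ms)
{
    uint32_t dt_imu = imu_ticks - m->imu_ref;
    if (dt_imu != 0)
        m->ms_per_tick = (double)(mcu_ms - m->mcu_ref_ms) / (double)dt_imu;
    m->imu_ref = imu_ticks;
    m->mcu_ref_ms = mcu_ms;
    m->resync_count++;   /* evidence field: how often we re-anchored */
}

/* Convert an IMU event timestamp into MCU milliseconds. */
uint32_t clk_map_to_mcu_ms(const clk_map_t *m, uint32_t imu_ticks)
{
    double dt = (double)(imu_ticks - m->imu_ref) * m->ms_per_tick;
    return m->mcu_ref_ms + (uint32_t)(dt + 0.5);
}
```

Logging `resync_count` alongside the offset step size is what turns alignment "feel" into the evidence outputs listed below.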
Alignment evidence outputs (what a build must report)
- alignment_error_ms distribution: p50 / p95 / max per session window.
- drift trend: drift_ppm (or slope) vs time (and temperature if available).
- resync list: timestamps of resync events + offset step size.
H2-7. Edge MCU & On-Device Features (what runs locally, why)
Goal: define a strict engineering boundary for “edge intelligence” on a motion-coach chest strap. The device runs feature/event extraction, quality control, and data integrity locally. It does not implement training plans, app tutorials, or cloud-driven coaching content. The priority is simple: never output believable-but-wrong events when contact, motion sensing, or timing integrity is compromised.
Local responsibilities (and why they must be local):
- Timing: unified timestamps + sequence counters so downstream can verify causality and continuity.
- Quality: gate events/features using contact/clip/alignment indicators to avoid false coaching cues.
- Integrity: log with commit markers + CRC so resets and reconnects do not create duplicates/holes.
- Resilience: watchdog and reset recovery that preserve session semantics, not just “boots again”.
Task layering (a testable pipeline): the firmware should be structured as a measurable pipeline with explicit boundaries. Each stage emits counters so field failures can be attributed to sampling, scheduling, detection, or transport—not guessed.
Stage 1 — Sampling
Acquire HR AFE / IMU with monotonic timebase. Track missed samples and FIFO stress.
Stage 2 — Preprocess
Basic conditioning + artifact tagging (clip/RMS). Keep fixed latency visible.
Stage 3 — Event detection
Produce R-peak / impact / cadence events with event_time and quality flags.
Stage 4 — Feature windows
Summarize over fixed windows; update rate adapts under quality gating.
Stage 5 — Output
Realtime notifications vs batch records; always include seq/timestamp/CRC.
Stage 6 — Logging
Atomic commit markers prevent duplicates/holes across resets and reconnects.
Quality gating (degrade, don’t hallucinate): edge logic must treat sensor health as an input. When contact is poor, IMU is clipped, or alignment error rises, the system should reduce or suppress misleading outputs and instead report explicit quality state and reason codes. This prevents “confident” but incorrect cadence/impact cues.
| Quality input | Observable evidence | Degrade output | Reason code |
|---|---|---|---|
| Contact unstable | lead-off toggles, contact_score drop, saturation count | hold HR events; publish quality-only + conservative features | Q1_CONTACT |
| IMU clipping | clip_count ↑, peaks flatten, false event bursts | suppress impact events; lower cadence confidence/update rate | Q2_CLIP |
| Alignment degraded | alignment_error_ms p95 ↑, resync_count ↑ | freeze cross-sensor features; keep separate streams + flags | Q3_ALIGN |
| Scheduler backlog | queue_depth_max ↑, task_latency_ms spikes | throttle realtime; prioritize logging continuity | Q4_QUEUE |
| Storage stress | write_retry ↑, commit fail, CRC fail | switch to reduced record rate; preserve seq continuity | Q5_LOG |
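The table collapses naturally into a priority-ordered reason-code mapper. The thresholds below are placeholders, to be replaced by limits validated in the field; contact outranks everything because it invalidates the HR events themselves:

```c
/* Sketch: map observed quality inputs to one degrade decision. Reason
 * codes follow the table above; numeric thresholds are illustrative. */
#include <stdint.h>
#include <assert.h>

typedef enum { Q_OK = 0, Q1_CONTACT, Q2_CLIP, Q3_ALIGN,
               Q4_QUEUE, Q5_LOG } reason_t;

typedef struct {
    uint16_t contact_score;    /* 0..100, higher is better */
    uint16_t clip_count;       /* IMU clips per window */
    uint16_t align_err_p95_ms; /* alignment_error_ms p95 */
    uint16_t queue_depth_max;  /* scheduler backlog */
    uint16_t write_retry;      /* storage stress */
} quality_in_t;

/* Returns the dominant reason code for this window. */
reason_t quality_reason(const quality_in_t *q)
{
    if (q->contact_score < 40)    return Q1_CONTACT;
    if (q->clip_count > 8)        return Q2_CLIP;
    if (q->align_err_p95_ms > 50) return Q3_ALIGN;
    if (q->queue_depth_max > 32)  return Q4_QUEUE;
    if (q->write_retry > 3)       return Q5_LOG;
    return Q_OK;
}
```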
Firmware robustness (watchdog + recovery with data semantics): a reset is not just a reboot. A chest strap must resume without silently corrupting session history. On restart, firmware should record reset_reason, restore sequence counters safely, and ensure logs are atomic (commit marker + CRC) so sync does not create duplicates, gaps, or reorder confusion.
Evidence pack (what to log and verify)
- Pipeline health: missed_samples, fifo_overflow, task_latency_ms, queue_depth_max.
- Quality state: quality_state, degrade_level, reason_code, holdoff_timer.
- Integrity: seq, timestamp, record_crc, commit_marker, reset_reason.
- Recovery success: post-reset seq continuity and no duplicate commit blocks.
H2-8. Bluetooth LE Link for Sports Data (low power, not low rigor)
Goal: treat BLE as a sports-data transport with two distinct operating modes. Real-time coaching requires predictable latency and cadence; post-workout sync requires throughput and recoverability. In both modes, the link must be provably complete using timestamps, sequence counters, and recovery behavior—without deep protocol-stack exposition.
Two-mode link design (do not “hard-carry” everything on one behavior):
- Mode A (realtime cues): low latency, stable notification rhythm, small frames.
- Mode B (batch sync): high throughput, fragmentation, resume after disconnect, dedup by seq/commit.
Connection parameters (engineering meaning, not a stack lecture): connection interval controls the density of transmit opportunities and average current; slave latency saves power but risks jitter and delayed delivery; MTU and packetization determine per-transaction overhead and the balance between throughput and peak current. In motion devices, these choices couple directly into dropouts because RF retries and CPU wakeups amplify peak current, which can trigger brownouts or scheduling backlog if margins are thin.
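A first-order sanity check of these couplings can be sketched as best-case notification throughput. This is an upper bound under stated assumptions (real stacks schedule fewer packets under retries); the only fixed overhead modeled is the 3-byte ATT notification header:

```c
/* Sketch: first-order BLE throughput intuition from connection
 * parameters. Best case only; retries and scheduling reduce this. */
#include <assert.h>

/* Usable payload per notification at a given ATT MTU
 * (3-byte ATT header: opcode + attribute handle). */
static inline int notify_payload_bytes(int att_mtu)
{
    return att_mtu - 3;
}

/* Best-case throughput in bytes/s: packets per connection event times
 * payload, divided by the connection interval. */
double throughput_bps(double conn_interval_ms, int pkts_per_event,
                      int att_mtu)
{
    return (double)pkts_per_event * (double)notify_payload_bytes(att_mtu)
           * (1000.0 / conn_interval_ms);
}
```

The same two knobs (interval, packets per event) also set the density of radio wakeups, which is why throughput tuning and average current cannot be optimized independently.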
Data integrity (detect → mark → recover): completeness must be verifiable. Every record frame includes timestamp + seq + flags + CRC. The receiver detects gaps (seq discontinuity), the device marks events (gap_count / retry_count / reconnect_count), and recovery resumes from a known seq/offset using the local log without creating duplicates (commit marker + dedup rules).
| Mode | Primary KPI | Typical evidence fields | Failure signature |
|---|---|---|---|
| Realtime | latency_ms, jitter_ms | notify_drop_rate, queue_depth_max, task_latency_ms | jitter spikes with queue depth |
| Batch Sync | throughput_kbps, resume_success | fragment_count, retry_count, dedup_count, CRC_fail | duplicates or holes after reconnect |
Evidence triple (the field-ready 3-column table)
- Drop rate: seq gap rate, notify loss rate, CRC fail rate.
- Reconnect: reconnect_count, resume_success_rate, dedup_count.
- Power: Iavg + peak current during retry bursts / sync bursts (correlate with resets).
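Receiver-side continuity accounting can be sketched wrap-safely over a 16-bit sequence counter; the counter names mirror the evidence fields above, and the structure is illustrative rather than a protocol definition:

```c
/* Sketch: wrap-safe sequence auditing on the receiver. A frame behind
 * the expected counter is counted as a late/duplicate arrival. */
#include <stdint.h>
#include <stdbool.h>
#include <assert.h>

typedef struct {
    uint16_t expected;   /* next seq we expect */
    bool     primed;     /* first frame seen? */
    uint32_t gap_count;  /* frames known missing */
    uint32_t dup_count;  /* duplicate or reordered-late frames */
} seq_audit_t;

void seq_audit_frame(seq_audit_t *a, uint16_t seq)
{
    if (!a->primed) {
        a->primed = true;
        a->expected = (uint16_t)(seq + 1);
        return;
    }
    int16_t delta = (int16_t)(seq - a->expected); /* wrap-safe distance */
    if (delta > 0) {
        a->gap_count += (uint32_t)delta;  /* frames skipped */
    } else if (delta < 0) {
        a->dup_count++;                   /* late or duplicated frame */
        return;                           /* do not advance expected */
    }
    a->expected = (uint16_t)(seq + 1);
}
```

These two counters, plus reconnect_count, are what make the "duplicates or holes after reconnect" failure signature provable rather than anecdotal.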
H2-9. Power Architecture & Battery Life (budget-first design)
Goal: treat battery life as a measurable current budget. A chest strap succeeds when average current is controlled for long life and peak current events (TX bursts, storage writes, high-ODR windows) never push rails into UVLO/brownout behavior. This chapter focuses on chest-strap patterns only and avoids charger-topology derivations.
Supply form factor (impact on system strategy):
- Coin cell (high source impedance): peak events dominate; strict peak shaping and shorter bursts become mandatory.
- Li-ion (better peak capability): focus shifts to rail-domain sequencing, UVLO evidence, and reset-proof transitions.
Peak events (the three pulses that break wearables): in chest straps, most unexpected resets or dropouts correlate with a few repeatable pulses. TX retries amplify RF peak demand; log writes create short but sharp write-current spikes; IMU high-ODR windows raise CPU and sensor duty cycle and can indirectly increase queue latency and radio contention.
Pulse A — Radio TX / Retries
TX burst current rises with retries; brownout risk increases when rail margin is thin.
Pulse B — Log Write / Commit
Write pulses can collide with TX bursts; commit failure becomes an integrity fault.
Pulse C — High-ODR Window
Higher ODR increases compute and wake density; can trigger wake storms if thresholds are noisy.
Coupled effect
Retries + backlog → longer awake time → higher Iavg; peaks + weak source → UVLO/reset signatures.
Power-domain partitioning (AFE/IMU/Radio): domain control is not only about saving energy; it protects data trust. If the radio domain browns out, link counters and reconnects must be visible. If sensor domains are unstable, quality gating must prevent plausible-but-wrong events. A practical approach is to make rail-good and UVLO evidence first-class telemetry and align domain transitions with the logging commit boundary.
| Domain | What can go wrong | Observable evidence | Mitigation intent |
|---|---|---|---|
| AFE domain | reference sag, contact sensing unstable | rail_good_fail, saturation count, lead-off toggles | gate HR events |
| IMU domain | clip bursts during shocks | clip_count, RMS spikes, ODR window markers | degrade impact features |
| Radio domain | retry storms, reconnect loops | retry_count, reconnect_count, TX burst markers | throttle mode / resume |
| Core rail | UVLO / brownout reset | uvlo_flag, brownout_reset_count, reset_reason | protect integrity + recover |
Low-power state machine (wake sources and wake storms): battery life collapses when wake sources are not budgeted. Typical wake sources are motion triggers (IMU), periodic timers, HR events, and BLE link activity. The critical failure mode is an erroneous wake storm: noisy thresholds or aggressive link activity causes excessive wakeups, raising Iavg and increasing peak-event collisions that produce UVLO signatures.
Evidence pack (budget + counters)
- Current profile: sleep / active / TX / write (Iavg + Ipk markers per state).
- Wake accounting: wake_count_by_source (IMU / HR / timer / BLE) + wake_storm_flag.
- Brownout proof: uvlo_count + brownout_reset_count + reset_reason correlation to TX/write windows.
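The budget itself is an auditable weighted sum. The state currents and duty fractions below are placeholders, not measurements; the point is that every term maps to one row of the current profile above:

```c
/* Sketch: battery-life budget from a per-state current profile. */
#include <assert.h>

typedef struct {
    double i_ma;  /* average current while in this state, mA */
    double duty;  /* fraction of time in this state, 0..1 */
} power_state_t;

/* Weighted-average current in mA (duties should sum to 1). */
double iavg_ma(const power_state_t *s, int n)
{
    double acc = 0.0;
    for (int i = 0; i < n; i++)
        acc += s[i].i_ma * s[i].duty;
    return acc;
}

/* Runtime in hours for a given capacity in mAh. */
double runtime_h(double capacity_mah, double i_ma)
{
    return (i_ma > 0.0) ? capacity_mah / i_ma : 0.0;
}
```

For example, placeholder states of sleep 0.01 mA at 90% duty, active 1 mA at 8%, TX 8 mA at 1.5%, and write 5 mA at 0.5% yield roughly 0.234 mA average; any wake storm shows up directly as a duty shift in this sum.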
H2-10. Logging & Data Integrity (you can’t coach what you can’t trust)
Goal: make local logging a closed engineering loop: data remains complete and interpretable even with disconnects, resets, and power loss. A motion-coach strap depends on sequence continuity, timestamps, and atomic commit so downstream analysis never has to guess whether a gap is real behavior or a transport artifact.
Storage choice (FRAM vs flash — tradeoffs that matter here): FRAM favors frequent small writes with low write energy and strong power-fail behavior; flash offers capacity but imposes program/erase constraints and can create higher write pulses and endurance pressure. The right choice depends on record rate, retention window, and how aggressively peak current must be controlled.
| Dimension | FRAM tendency | Flash tendency |
|---|---|---|
| Write energy | lower and more uniform; easier to budget | higher pulses; may collide with TX peaks |
| Endurance | high for small frequent records | limited; requires wear strategy |
| Power-fail behavior | favors atomic-style updates | must harden against partial-program states |
| Capacity | typically smaller | typically larger |
Structured record (minimum fields that make trust verifiable): every record must carry enough structure to detect duplication, reorder, and gaps. The minimum is timestamp + sequence + CRC + quality flags. If a quality gate was active, the record must carry that evidence so the stream remains interpretable across sessions and reconnects.
Minimum record schema (conceptual):
- time timestamp (monotonic domain)
- order seq (strictly increasing)
- trust quality_flags + reason_code (optional)
- proof record_crc + commit_marker
Power-loss and reset handling (atomic commit + checkpoint): logging must be resistant to partial writes. A robust approach is two-phase commit: write payload first, then write CRC/header, and only then write a small commit marker as the final step. On boot, recovery scans the tail region to find the last committed record, restores the write pointer, and sets seq_next without creating duplicates or holes.
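The boot-side half of the two-phase commit can be sketched as a tail scan. The in-memory record model below stands in for fixed-size slots on FRAM/flash; field names and layout are illustrative:

```c
/* Sketch: tail-scan recovery for a two-phase-committed log. The
 * crc/crc_calc pair models "stored CRC" vs "CRC recomputed over the
 * payload at boot"; a mismatch marks a torn write. */
#include <stdint.h>
#include <stddef.h>
#include <assert.h>

typedef struct {
    uint32_t seq;       /* strictly increasing record sequence */
    uint32_t crc;       /* CRC stored with the record */
    uint32_t crc_calc;  /* CRC recomputed over the payload at boot */
    uint8_t  committed; /* commit marker, written as the final step */
} log_rec_t;

/* Scan from the tail; return seq_next = last good seq + 1, skipping
 * any record whose commit marker or CRC check failed (torn write). */
uint32_t recover_seq_next(const log_rec_t *log, size_t n)
{
    for (size_t i = n; i > 0; i--) {
        const log_rec_t *r = &log[i - 1];
        if (r->committed && r->crc == r->crc_calc)
            return r->seq + 1;  /* resume after last committed record */
    }
    return 0; /* empty or fully torn log: start a fresh sequence */
}
```

Because `seq_next` is derived only from committed, CRC-clean records, a power cut between payload write and commit marker produces a record that is simply ignored, never a duplicate or a hole.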
Validation (statistics, not promises): integrity should be verified under randomized power cuts while writing and transmitting. Metrics must explicitly count duplicates, reorders, and gaps, and correlate these with reset_reason and commit_fail evidence. The acceptance goal is not “mostly okay” but “explainable and bounded.”
Evidence pack (what to measure in power-fail tests)
- Integrity stats: duplicate_rate, reorder_count, gap_count across N random cuts.
- Commit proof: commit_fail_count, CRC_fail_count, scan_recovery_time.
- Correlations: reset_reason vs (TX/write windows) vs gap/dup anomalies.
H2-11. Validation & Field Debug Playbook (symptom → evidence → isolate → fix)
Intent: convert Chapters H2-1 through H2-10 into a field-ready SOP. Each symptom starts with two measurements that separate contact vs timing vs link vs power vs logging. Fix actions aim to stop wrong outputs first (quality gates), then restore stable operation (state recovery), then harden evidence (seq/CRC/commit).
Use this SOP in the field
- Rule 1 — Only two measurements first: one waveform/rail evidence + one counter/timestamp evidence.
- Rule 2 — Always log a reason: every “bad data” moment must leave a quality_flag or a reset_reason.
- Rule 3 — Fix order: gate → stabilize → resume → prove (stats).
Concrete MPN examples (typical choices for this class of chest strap):
| Block | MPN examples | Why it appears in this SOP |
|---|---|---|
| ECG/HR AFE | ADI AD8232 / AD8233, Maxim MAX30003, TI ADS1292R | Lead-off/contact quality, saturation/baseline drift evidence, R-peak stability |
| IMU (6-axis) | Bosch BMI270 / BMI323, ST LSM6DSOX, TDK ICM-42688-P | Clip evidence, ODR windows, bias/temperature drift, shock artifacts |
| BLE SoC / MCU+BLE | Nordic nRF52832 / nRF52840 / nRF5340, TI CC2642R, Renesas DA14695 | Retry/reconnect counters, low-power modes, data framing (seq/resume) |
| FRAM (I²C/SPI) | Infineon/Cypress FM24V10 / FM25V10, Fujitsu MB85RC256V | Atomic commit-friendly writes, lower write energy, power-fail robustness |
| Flash (low-power) | Winbond W25Qxx, Macronix MX25Rxx | Capacity option; commit/CRC failures often correlate with write pulses |
| Buck / LDO (low IQ) | TI TPS62740 / TPS62743 / TPS7A02, Microchip MCP1700 | Iavg budget and Ipk margin; rail stability during TX/write collisions |
| Fuel gauge (Li-ion use) | TI BQ27441, Maxim MAX17048 | Correlate fast drain vs wake storms vs real capacity |
| ESD / TVS (I/O) | Nexperia PESD5V, Littelfuse SP050x | Protect external contacts/I/O; ESD events can mimic “random resets” |
| 32.768 kHz clock | Abracon ABS07, Epson MC-306 | Timestamp drift, dual-clock domain evidence (alignment stability) |
Notes: MPNs are examples to anchor debugging conversations and BOM targeting; final selection depends on packaging, supply, power budget, and interface choices.
Symptom A — HR drifts / drops out during motion
First 2 measurements
- AFE evidence: capture AFE output (or R-peak detect stream) and record sat_count/baseline_drift markers if available (typical AFEs: AD8232/AD8233/MAX30003/ADS1292R).
- Contact evidence: trend lead_off or a contact-quality metric vs time; look for toggles that correlate with strap movement.
Goal: separate “electrode interface instability” from “timing/link artifacts”.
Discriminator (evidence → bucket)
- If lead_off toggles or contact quality collapses while AFE baseline shifts → Contact / electrode interface (H2-4).
- If contact stays stable but HR events become late/early relative to IMU windows → Time alignment / scheduling (H2-6/H2-7).
- If dropouts align with reconnect bursts or retry storms → BLE/link pressure (H2-8) or power peak collision (H2-9).
First fix (stop wrong output first)
- Enable quality gating: when contact is poor, suppress or downgrade HR events and set quality_flags (H2-7/H2-10).
- Separate peaks: avoid TX + write overlap; schedule log commits away from TX bursts (H2-9/H2-10).
- Persist “why”: log lead_off, sat_count, and reset reason near events so field traces remain explainable (H2-10).
Typical suspects: ECG AFE (AD8232/AD8233/MAX30003/ADS1292R), ESD devices on electrode/I/O (PESD5V/SP050x).
Symptom B — Cadence / impact jumps or looks unstable
First 2 measurements
- IMU clip evidence: read clip_count (or saturate flags) during the “bad” window (typical IMUs: BMI270/BMI323/LSM6DSOX/ICM-42688-P).
- RMS / event density: compare IMU RMS (rest vs run) and check whether event density spikes in bursts.
Discriminator
- If clip_count spikes → dynamic range / shock clipping (H2-5).
- If clip is low but behavior changes with strap placement/orientation → mounting / coordinate drift (H2-5).
- If cadence instability appears together with timestamp misalignment → alignment drift (H2-6), often tied to clock domain stability (e.g., 32k crystal ABS07/MC-306).
First fix
- Raise usable headroom: adjust IMU range/filters and explicitly log clip_flag so “bad windows” are not treated as real motion (H2-5/H2-10).
- Add a “static self-check window” for bias sanity and placement detect; when violated, degrade motion features output (H2-5/H2-7).
- When alignment confidence is low, output HR-only or IMU-only events with a clear flag (H2-7/H2-10).
Typical suspects: IMU (BMI270/BMI323/LSM6DSOX/ICM-42688-P), 32k clock (ABS07/MC-306).
Symptom C — BLE disconnects often / latency becomes large
First 2 measurements
- Link counters: log retry_count, reconnect_count, and “disconnect timestamp” (BLE SoCs: nRF52832/nRF52840/nRF5340/CC2642R/DA14695).
- Power coupling: correlate Iavg with wake_count_by_source(BLE) to detect retry-driven wake storms (H2-9).
Discriminator
- If reconnects are high but wake(BLE) is normal → connection parameter mismatch / environment stress (H2-8).
- If reconnects are high and wake(BLE) dominates and Iavg climbs → retry storm / wake storm (H2-8/H2-9).
- If disconnects create seq gaps but commits are clean → resume/dedup missing (H2-8/H2-10).
- If gaps coincide with commit_fail/CRC_fail → logging integrity (H2-10) or peak collisions (H2-9).
First fix
- Split modes: real-time notify vs bulk sync; stop forcing bulk traffic into low-latency mode (H2-8).
- Add seq + resume window; after reconnect, request missing seq ranges instead of restarting blindly (H2-8/H2-10).
- Implement bounded retries + backoff; prevent BLE from monopolizing wake sources (H2-9).
Typical suspects: BLE SoC (nRF52/nRF5340/CC2642R/DA14695), buck/LDO stability (TPS6274x/TPS7A02/MCP1700).
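The "seq + resume window" fix boils down to computing which sequence ranges are missing after a reconnect and requesting only those. A minimal host-side sketch (the function name and range format are illustrative):

```python
def missing_ranges(received_seqs, expect_from, expect_to):
    """Return [(start, end), ...] seq ranges to re-request after a
    reconnect, instead of blindly restarting the whole transfer."""
    have = set(received_seqs)
    ranges, start = [], None
    for s in range(expect_from, expect_to + 1):
        if s not in have and start is None:
            start = s                     # open a gap
        elif s in have and start is not None:
            ranges.append((start, s - 1))  # close the gap
            start = None
    if start is not None:
        ranges.append((start, expect_to))  # gap runs to the end
    return ranges
```

Because the device's local log remains the source of truth, the peer only needs to replay these ranges; duplicates arriving anyway are dropped by seq-based dedup.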
Symptom D — Battery drains too fast
First 2 measurements
- 4-state current profile: measure sleep/active/tx/write contributions and annotate Iavg + Ipk (H2-9).
- Wake accounting: read wake_count_by_source (IMU/HR/timer/BLE) to locate the storm source.
Discriminator
- If wake is dominated by BLE → link scheduling/notify pressure (H2-8/H2-9).
- If wake is dominated by IMU → threshold noise / high-ODR window policy (H2-5/H2-9).
- If write pulses are frequent → logging strategy too chatty or commits issued too often (H2-10).
First fix
- Debounce wake sources and enforce minimum wake intervals; log a storm flag when thresholds chatter (H2-9/H2-10).
- Batch log writes and commit less often (still atomically), reducing write-pulse density (H2-10).
- Reduce peak collisions by staggering TX and write windows; protect Ipk margin on weak sources (H2-9).
Typical suspects: buck/LDO (TPS62740/TPS62743/TPS7A02/MCP1700), fuel gauge (BQ27441/MAX17048), flash write pulses (W25Q/MX25R) vs FRAM (FM24/MB85).
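The 4-state current profile is simple arithmetic once duty fractions and per-state currents are measured. The sketch below uses assumed example numbers (they are not from a specific design) to show how Iavg is assembled and why a small write/tx duty can still dominate:

```python
def avg_current_ma(profile):
    """profile: {state: (duty_fraction, current_mA)} for the four states.
    Duty fractions must sum to 1.0; returns average current in mA."""
    total_duty = sum(d for d, _ in profile.values())
    assert abs(total_duty - 1.0) < 1e-9, "duty fractions must cover 100%"
    return sum(d * i for d, i in profile.values())

# Illustrative numbers only: 95% sleep, 4% active, 0.8% TX, 0.2% write.
example = {
    "sleep":  (0.950, 0.005),   # mA
    "active": (0.040, 2.0),
    "tx":     (0.008, 8.0),
    "write":  (0.002, 12.0),
}
```

Here TX and write together contribute 0.088 mA of the 0.173 mA total despite 1% combined duty, which is why wake accounting (wake_count_by_source) matters more than sleep-current tweaks once a storm starts.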
Symptom E — Data gaps / reorders / duplicates
First 2 measurements
- Sequence continuity: compute gap_count, reorder_count, duplicate_rate over the failure window (H2-10).
- Commit proof: check commit_fail, CRC_fail, and reset_reason correlation (H2-10/H2-9).
Discriminator
- If gaps follow disconnects but commit/CRC are clean → resume/dedup not implemented (H2-8/H2-10).
- If duplicates/reorders follow resets → recovery pointer / checkpoint weakness (H2-10).
- If commit failures align with TX/write peaks or UVLO events → power peak collision causes partial writes (H2-9/H2-10).
First fix
- Enforce “commit marker defines validity”; ignore any in-flight record without commit on boot scan (H2-10).
- Add checkpoint_seq and restore write_ptr from last committed record; prevent duplication after reset (H2-10).
- Stagger TX and commit; increase Ipk margin (or move frequent small writes to FRAM) to avoid commit failures (H2-9/H2-10).
Typical suspects: FRAM (FM24V10/FM25V10/MB85RC256V), flash (W25Q/MX25R), rail stability (TPS6274x/TPS7A02), BLE SoC resume behavior (nRF52/CC2642R/DA14695).
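The "commit marker defines validity" rule can be sketched as a boot-scan pass. This is an illustrative host-side model (record layout, marker value, and CRC choice are assumptions), not a specific filesystem:

```python
import zlib

COMMIT = 0xA5  # assumed commit-marker value

def boot_scan(records):
    """Keep only records with a valid commit marker and matching CRC.

    records: iterable of dicts {seq, payload (bytes), crc, commit}.
    Returns (valid_records, last_seq); last_seq restores checkpoint_seq /
    write_ptr so the writer resumes without duplicating after a reset.
    """
    valid, last_seq = [], None
    for r in records:
        if r["commit"] != COMMIT:
            continue                     # in-flight record: never trust it
        if zlib.crc32(r["payload"]) & 0xFFFFFFFF != r["crc"]:
            continue                     # torn write: partial commit
        if last_seq is not None and r["seq"] <= last_seq:
            continue                     # duplicate after reset: drop
        valid.append(r)
        last_seq = r["seq"]
    return valid, last_seq
```

The order of checks mirrors the discriminators above: a missing commit marker and a CRC failure are different failure signatures, and both must be distinguishable from a seq duplicate.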
Minimum evidence checklist (must exist for field-debug closure)
- Power evidence: Iavg (sleep/active/tx/write), wake_count_by_source, uvlo_count, reset_reason
- Sensor evidence: AFE lead_off, sat_count; IMU clip_count, RMS, ODR markers
- Link evidence: retry_count, reconnect_count, mode flag (real-time vs bulk)
- Logging evidence: timestamp, seq, CRC, commit, quality_flags, gap/dup/reorder stats
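To make that checklist concrete, one compact record layout that carries timestamp, seq, quality flags, commit marker, and CRC might look like this. The field widths and byte order are illustrative assumptions, chosen only to show that the full evidence tuple fits in 12 bytes:

```python
import struct
import zlib

# Hypothetical record (little-endian): timestamp_ms (u32), seq (u16),
# quality_flags (u8), commit (u8), then crc32 over those 8 bytes.
FMT = "<IHBB"

def pack_record(ts_ms, seq, flags, commit=0xA5):
    body = struct.pack(FMT, ts_ms, seq, flags, commit)
    return body + struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF)

def unpack_record(blob):
    body, (crc,) = blob[:8], struct.unpack("<I", blob[8:])
    if zlib.crc32(body) & 0xFFFFFFFF != crc:
        return None                      # torn or corrupted record
    ts_ms, seq, flags, commit = struct.unpack(FMT, body)
    return {"ts_ms": ts_ms, "seq": seq, "flags": flags, "commit": commit}
```

A record this small keeps ring-buffer write energy low while preserving every field the debug closure checklist requires.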
H2-12. FAQs ×12 (Evidence-anchored, no scope creep)
Each answer points back to this page’s evidence chain using two observables (waveform/rail + counter/timestamp), then gives a first fix to stop wrong outputs before deeper redesign.
Resting HR is stable, but running makes it messy—contact or AFE saturation first?
Start with the electrode interface before blaming algorithms. Check AFE output baseline and saturation markers (sat_count) on an ECG AFE such as ADI AD8233, then correlate with lead_off/contact_quality toggles during motion. If contact is unstable, gate HR events and log quality_flags so “bad windows” do not create false coaching cues.
Mapped: →H2-4 / H2-11
Cadence/impact looks “fast then slow”—IMU clipping or mounting rotation?
Separate sensor saturation from orientation drift. First read clip_count (or saturate flags) on an IMU such as Bosch BMI270; spikes during impacts indicate dynamic-range clipping. If clipping is low, compare run vs rest RMS and event density; strong changes after strap repositioning suggest coordinate/mount rotation. First fix is to log clip_flag and degrade event output when clipping or instability is detected.
Mapped: →H2-5 / H2-11
HR and cadence never line up—timestamp drift or queue latency?
Treat this as a timing-domain problem. Compare HR event timestamps (R-peak) against IMU event timestamps and plot the offset distribution; a slow trend indicates clock drift, often tied to the 32 kHz timebase (e.g., Epson MC-306). If offsets jump in bursts, inspect task/queue latency (buffer depth, service jitter). First fix is seq + timestamp with periodic resync and a bounded queue to avoid “phantom delay.”
Mapped: →H2-6
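The "slow trend vs burst" distinction can be made quantitative with a least-squares fit of the HR-vs-IMU offset against time: a consistent slope is clock drift, while large residuals around a flat fit point at queue jitter. A sketch, assuming offsets in ms and time in s:

```python
def drift_ppm(times_s, offsets_ms):
    """Least-squares slope of timestamp offset vs time, in ppm.

    A slope of 1 ms/s corresponds to 1000 ppm of relative clock error;
    a typical 32 kHz crystal (±20 ppm) would show ~0.02 ms/s.
    """
    n = len(times_s)
    mt = sum(times_s) / n
    mo = sum(offsets_ms) / n
    num = sum((t - mt) * (o - mo) for t, o in zip(times_s, offsets_ms))
    den = sum((t - mt) ** 2 for t in times_s)
    return (num / den) * 1000.0          # ms/s -> ppm
```

If the fitted drift is tiny but individual offsets jump by whole buffer periods, the problem is service latency, not the timebase, and the fix moves from resync to bounded queues.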
BLE stays connected, but real-time prompts feel delayed—what parameters matter first?
Connection stability does not guarantee low latency. Inspect end-to-end delay from event creation to notification, then relate it to conn_interval and slave_latency on the BLE SoC (e.g., Nordic nRF52832). If latency is “one-step behind” without packet loss, the link is likely over-buffered or using bulk-friendly intervals. First fix is to split modes: short-interval low-payload for real-time, separate bulk path for sync.
Mapped: →H2-8
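A back-of-envelope bound makes the "one-step behind" feeling measurable. The sketch below assumes the stack fully honors slave latency even when the peripheral has data pending; many stacks relax this and transmit at the next connection event, so treat it as a conservative upper bound, not the typical case:

```python
def worst_case_notify_latency_ms(conn_interval_ms, slave_latency):
    """Upper bound on how long a freshly queued notification can wait
    for a connection event when the peripheral may legally skip
    `slave_latency` events. Stack queueing and retries come on top."""
    return conn_interval_ms * (slave_latency + 1)
```

For example, a 30 ms interval with slave latency 4 bounds notification wait at 150 ms, which already feels "delayed" for real-time coaching; that is why the real-time mode wants a short interval and zero (or very low) slave latency.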
Post-workout sync is very slow—packetization inefficiency or retransmission storm?
Use two counters to isolate the cause: retry_count / reconnect_count for retransmission storms, and seq_gap (or missing-range requests) for resume inefficiency. On a BLE SoC such as TI CC2642R, high retries with rising average current indicate a wake/retry storm; clean retries with persistent gaps indicate poor framing/resume. First fix is bounded retries + backoff, and a seq-based resume window instead of restart-from-zero behavior.
Mapped: →H2-8 / H2-10
Battery drains much faster when moving—wireless peaks or storage write peaks?
Break the drain into a 4-state current profile: sleep/active/tx/write. If peaks align with TX bursts, the radio schedule dominates; if peaks align with frequent commits, logging dominates. A low-IQ buck such as TI TPS62743 helps the average-current budget, but peak collisions still cause waste. First fix is to stagger TX and commit windows, batch writes atomically, and log wake_count_by_source to prove the storm source.
Mapped: →H2-9 / H2-10
After sweating, HR gets worse—electrode polarization or protection network bias?
Sweat can improve contact yet worsen baseline stability via polarization and leakage paths. If contact_quality looks good while AFE baseline drifts or sat_count rises, suspect electrode polarization or protection/ESD bias currents rather than strap looseness. For an AFE such as Maxim MAX30003, correlate baseline drift with sweat onset and check lead-off behavior. First fix is to separate “contact OK” from “front-end overrange” in quality_flags and gate outputs accordingly.
Mapped: →H2-4
Cold weather causes more disconnects—battery impedance or RF power limits?
Cold exposes power margin first. During TX, check rail droop and uvlo_count; if disconnects cluster with droops, battery impedance is likely dominating, even when RF conditions are unchanged. Then check retry_count; a spike without droop suggests RF margin/PA limits. On a BLE SoC like Renesas DA14695, first fix is to reduce peak concurrency (avoid TX+write overlap), add backoff to retries, and keep a conservative real-time mode under low-voltage conditions.
Mapped: →H2-9 / H2-8
Occasional data gaps—MCU reboot or non-atomic storage commit?
Decide using two proofs: reset_reason correlation versus commit/CRC_fail evidence. If gaps align with resets, recovery pointers or state restoration are suspect. If gaps occur without resets but show missing commit markers or CRC failures, the write path is not atomic. Using FRAM such as Infineon/Cypress FM24V10 can reduce write-energy risk, but atomic commit still matters. First fix is “commit marker defines validity” plus checkpoint_seq restore on boot.
Mapped: →H2-10 / H2-11
Random reboot during motion—what two power evidences should be checked first?
First check a rail-droop signature plus a wake/peak signature. Measure the main rail during the reboot window and log uvlo_count or brownout flags, then correlate with wake_count_by_source and TX/write timestamps to detect peak collisions. A low-IQ LDO like TI TPS7A02 reduces quiescent loss, but it will not prevent instantaneous peak-induced brownouts. First fix is to cap concurrency (TX vs commit), add backoff, and persist reset context in the log header.
Mapped: →H2-9 / H2-11
IMU seems “fine,” but coaching events are misdetected—how should quality gating work?
Quality gating prevents “confident wrong” outputs. Define at least two gate inputs: IMU integrity (e.g., clip_flag or placement instability) and timing integrity (alignment offset or queue jitter). On an IMU such as ST LSM6DSOX, flag windows with clipping or unusual RMS bursts; when triggered, degrade event outputs (OK → Degraded → Suppressed) and write quality_flags so downstream analysis ignores invalid windows. First fix is strict gating before tuning detection thresholds.
Mapped: →H2-7
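The OK → Degraded → Suppressed policy can be sketched as a two-input gate. The inputs, threshold, and escalation rule below are illustrative assumptions; the point is that a hard failure of both integrity channels suppresses output, while a single soft failure only degrades it:

```python
OK, DEGRADED, SUPPRESSED = "OK", "Degraded", "Suppressed"

def gate_level(clip_flag, align_ok, queue_jitter_ms, jitter_budget_ms=50):
    """Tiered quality gate over two evidence channels.

    clip_flag:        IMU integrity (clipping / placement instability)
    align_ok:         timing integrity (HR/IMU alignment confidence)
    queue_jitter_ms:  service jitter observed on the event queue
    """
    hard = clip_flag and not align_ok            # both channels bad
    soft = (clip_flag or not align_ok
            or queue_jitter_ms > jitter_budget_ms)
    if hard:
        return SUPPRESSED      # no event output, only quality_flags
    if soft:
        return DEGRADED        # output marked so downstream can ignore it
    return OK
```

Writing the gate level into quality_flags for every window is what lets detection thresholds be tuned later on clean data only.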
Field issues can’t be reproduced—what is the lowest-cost way to preserve evidence?
Preserve a minimal, explainable trace: timestamp + seq + CRC + commit + reset_reason + quality_flags, plus a light power ledger (4-state Iavg and wake_count_by_source). Store only these essentials in a ring buffer to keep write energy low; FRAM like Fujitsu MB85RC256V is a practical option. First fix is “make failures explainable,” then iterate on isolation using the H2-11 decision tree.
Mapped: →H2-10 / H2-11