E-Paper Tag / ESL Hardware Design & Validation Guide

E-Paper Tag / ESL design is a power-and-evidence problem: coin-cell pulse sag, temperature-binned e-paper waveforms, and shelf RF retries dominate lifetime and reliability more than “average” specs. Build around measurable hooks—Vsys_min/BOR, retry_cnt/RSSI/SNR, and CRC/SEQ + atomic commit—so every refresh and update can be proven robust in worst-case shelves and temperatures.

H2-1 · Definition & Boundary

What an E-Paper Tag / ESL Is (and what it is not)

An E-Paper Tag / Electronic Shelf Label (ESL) is an ultra-low-power, wireless display node that uses a bistable electrophoretic panel to hold an image with near-zero static power. Most of its lifetime is spent in deep sleep on a coin cell, waking occasionally for RF synchronization and a high-energy refresh event. The system succeeds only when rare pulses are engineered to avoid droop, corruption, and visible artifacts.

Boundary statements keep design intent clean and prevent hidden scope creep:

  • Not a tablet: it does not target continuous frame refresh, interactive UX, or high compute; the panel is meant to hold an image.
  • Not a smart hub: it is not a multi-protocol aggregator; the tag focuses on infrequent sync + display integrity.
  • Not signage playback: it is not designed for video, high brightness backlight, or sustained power draw; refresh is occasional and expensive.

System constraints that dominate ESL outcomes (each constraint ties to evidence you can measure):

Coin-cell pulse limit

Causes refresh-time droop and brownout risk. Evidence: Vbat/Vsys sag, BOR reason flags, refresh current profile.

Refresh energy + waveform sensitivity

Partial/full refresh tradeoffs; ghosting accumulates when waveforms or rails are unstable. Evidence: artifact patterns vs temperature bins.

RF transaction energy

Retries can silently dominate lifetime. Evidence: retry counters, RSSI variance by position, packet loss heatmap per aisle/shelf.

Cost & manufacturing robustness

Low BOM headroom requires fewer always-on blocks and strong test access. Evidence: test pads, factory programming yield, field return signatures.

Key idea: multi-year lifetime is rarely limited by the “pretty” sleep current. It is usually destroyed by event energy × event frequency (refresh bursts, RF retries, and non-atomic writes during droop).
| Spec knob | Typical bands | Why it matters (evidence-driven) |
|---|---|---|
| Update interval | minutes · hours · days | Sets refresh frequency and RF duty. Log: refresh count/day, transaction count/day, retry distribution. |
| Display size class | ~1.5–2.9″ · 4.2″ · 7.5″+ | Larger panels increase refresh energy and sensitivity to droop. Capture: refresh current vs temperature and aged cell. |
| Battery class | CR2032 · CR2450 (coin cell families) | Internal resistance + temperature define pulse sag. Measure: Vbat droop at worst-case pulse, cold start margin. |
| RF mode | Sub-GHz · BLE · dual | Range/coexistence vs infra trade. Record: RSSI, retries, airtime, energy per successful update. |
| Lifetime target | 1–2y · 3–5y · 5–7y bands | Must be supported by measured event profiles (not guessed averages). Store: current profile + counters as proof. |
Figure F1 (block diagram): ESL At-a-Glance, the four constraints that dominate multi-year battery life. Coin-cell pulse limit (droop → BOR/reset during refresh or RF bursts); refresh energy (waveforms + temperature decide ghosting risk); RF retries (retries quietly dominate energy per update); data integrity (non-atomic writes + droop → corruption/brick). Evidence points to log: Vbat/Vsys droop · BOR reason · refresh count · retry counters · CRC/sequence · temperature bin.
Figure F1. ESL is won or lost by pulse droop, refresh energy, RF retries, and atomic data integrity.
Reading tip: if a design claims “5-year lifetime,” require the evidence set above (current profiles + counters + temperature bins). Without measured event energy, “average current” is not a proof.
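
The "event energy × event frequency" idea can be made concrete with a small budget calculation. The sketch below is illustrative only: the cell capacity, usable fraction, sleep current, and per-event charge figures are placeholder assumptions, not measurements, and the names (`Event`, `lifetime_years`) are hypothetical. Replace every number with values taken from captured current profiles and logged counters.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Event:
    name: str
    charge_mc: float  # charge per event in mC (mA*s), from a measured profile
    per_day: float    # events per day, from logged counters


def lifetime_years(capacity_mah, usable_fraction, sleep_ua, events):
    """Estimate lifetime from sleep current plus per-event charge."""
    usable_mc = capacity_mah * 3600.0 * usable_fraction   # mAh -> mC
    sleep_mc_day = (sleep_ua / 1000.0) * SECONDS_PER_DAY   # uA -> mA, then mA*s per day
    event_mc_day = sum(e.charge_mc * e.per_day for e in events)
    return usable_mc / (sleep_mc_day + event_mc_day) / 365.0


# Placeholder numbers (assumptions for illustration only).
nominal = [Event("refresh", 30.0, 4), Event("rf_sync", 2.0, 24)]
retry_storm = [Event("refresh", 30.0, 4), Event("rf_sync", 6.0, 24)]  # 3x RF charge

print(round(lifetime_years(220, 0.7, 1.0, nominal), 2))
print(round(lifetime_years(220, 0.7, 1.0, retry_storm), 2))
```

With these placeholder inputs the sleep current is identical in both cases, yet tripling the RF charge per sync shortens the estimate by more than a year. That is the reason the budget has to be built from event counters rather than from an average current figure.
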
H2-2 · System Mental Model

System Architecture Map: MCU ↔ Display Driver ↔ RF ↔ Storage ↔ Power

ESL architecture is best understood as two parallel flows: energy flow (coin cell → power path → pulse events) and evidence flow (test pads + counters + logs that prove why an update succeeded or failed). The core trick is not “making it work once,” but ensuring the system remains stable when rare, high-current pulses occur against a high-impedance coin cell—especially at cold temperature and end-of-life cell conditions.

What typically dominates power and BOM in ESL designs:

  • Refresh bursts: the e-paper driver + panel waveform sequence concentrates energy into short windows; rail stability decides ghosting and reset risk.
  • RF transactions: energy per successful update expands with retries (coexistence, detuning, shelf placement); counters matter more than intuition.
  • Memory writes: image assets and state need an atomic commit strategy; brownout during a write is a top “silent killer” of fleet reliability.
  • Always-on leakage: dividers, indicator LEDs, sensor standby, and weak pull-ups can destroy baseline current if not controlled by design rules.
Architecture requirement: every field symptom must map to a block in Figure F2 below (power droop, refresh artifact, RF desync, or state corruption). This prevents “debug by guesswork” and keeps root-cause analysis tied to a specific, measurable block.
Figure F2 (block diagram): ESL System Architecture, energy flow (left→right) plus evidence probe points. Blocks: Coin Cell (CR2032 / CR2450) → Power Path (UVLO · bulk cap · rail gating) → System Rail (Vsys stable window) → ULP MCU (deep sleep · RTC wake · retention; SPI/I²C/GPIO to driver + memory), RF Front-End (Sub-GHz / BLE; transactions · retries · RSSI evidence), Memory (FRAM / Flash; CRC · seq · atomic commit), E-Paper Driver (waveforms/LUT · VCOM stability; full vs partial refresh), E-Paper Panel (bistable image hold; artifacts: ghosting · slow refresh at cold). Probe points: Vbat droop · UVLO/BOR reason · Vsys window · retry counter · CRC/seq · refresh current.
Figure F2. Architecture anchors every symptom to a block: power droop, refresh artifacts, RF retries, or non-atomic state writes.
Practical rule: treat refresh, RF, and writes as “events.” Store counters for each event type, and capture at least one real current profile per event. That evidence becomes the fastest path to root cause and lifetime proof.
H2-3 · Power Reality

Coin-Cell Physics, Pulse Load, and Brownout Survival

In ESL hardware, the coin cell behaves like a high-impedance source that powers a system dominated by rare, high-current events. The display may hold an image with near-zero static power, but refresh bursts, RF transactions, and memory writes concentrate energy into short windows. When a pulse arrives, the cell voltage can sag sharply, especially at cold temperature or in later-life conditions where effective resistance rises.

Why a tag “dies at ~40% state-of-charge”: open-circuit voltage can look normal, but the pulse droop under refresh/RF crosses UVLO/BOR thresholds. The result is reset, corrupted state, or an incomplete refresh—often followed by retries that drain the cell faster.
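
The gap between open-circuit voltage and loaded voltage can be sketched with the simplest resistive model, V = Voc − I × R_int. This is a static approximation under stated assumptions: it ignores bulk capacitance and time-dependent cell behavior, and the threshold, resistance, and current values below are hypothetical rather than taken from any datasheet.

```python
def loaded_voltage(v_oc, r_int_ohm, i_pulse_ma):
    """Cell terminal voltage under a pulse, using V = Voc - I * R_int."""
    return v_oc - r_int_ohm * (i_pulse_ma / 1000.0)


def survives_pulse(v_oc, r_int_ohm, i_pulse_ma, v_bor):
    """True if the loaded voltage stays above the brownout threshold."""
    return loaded_voltage(v_oc, r_int_ohm, i_pulse_ma) > v_bor


# Assumed values: open-circuit voltage barely moves, but resistance rises.
V_BOR = 1.8  # hypothetical BOR threshold in volts
fresh = dict(v_oc=3.0, r_int_ohm=15.0, i_pulse_ma=20.0)
aged_cold = dict(v_oc=2.9, r_int_ohm=60.0, i_pulse_ma=20.0)

for label, cell in (("fresh", fresh), ("aged_cold", aged_cold)):
    v = loaded_voltage(**cell)
    status = "ok" if survives_pulse(v_bor=V_BOR, **cell) else "BOR"
    print(label, round(v, 2), status)
```

In this example the aged, cold cell reads only 0.1 V lower at rest, but the same 20 mA pulse pulls it below the assumed threshold. A resting voltage check alone would not reveal the problem.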

Design patterns that improve brownout survival (event-focused, not topology-focused):

  • Bulk capacitor placement: add local energy near the pulse consumer (driver/rail). Validate by measuring Vsys sag reduction during the same event.
  • Staged updates: split a large refresh into smaller steps to reduce peak current. Track total event time and verify no increase in retry rate.
  • Pulse shaping: use soft-start / current limiting / rail gating so the pulse is flatter with a lower peak, accepting that the event may last longer. Confirm by current profile capture.
  • On-demand conversion: avoid unnecessary always-on conversion loss; enable higher-power rails only during events and shut them down cleanly.
  • Brownout-aware writes: treat flash writes as high-risk events; use atomic commit and record BOR reasons to avoid silent corruption.
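
For the bulk capacitor item above, a first-pass size can be estimated from C = I × Δt / ΔV. The sketch below assumes, conservatively, that the capacitor alone supplies the pulse, and it ignores ESR, DC-bias derating, and capacitor leakage (which adds to sleep current). The function name and example values are hypothetical; the result is a starting point to validate with a Vsys sag measurement, not a final value.

```python
def min_bulk_cap_uf(i_pulse_ma, t_pulse_ms, dv_allowed_v):
    """Smallest capacitance (uF) that keeps droop within dv_allowed_v.

    Uses C = I * dt / dV. mA * ms gives microcoulombs, so the result is in uF.
    """
    if dv_allowed_v <= 0:
        raise ValueError("allowed droop must be positive")
    return i_pulse_ma * t_pulse_ms / dv_allowed_v


# Assumed pulse: 20 mA for 5 ms with 0.3 V of allowed droop.
print(round(min_bulk_cap_uf(20.0, 5.0, 0.3), 1))
```

Long events such as a multi-second refresh cannot be carried by the capacitor alone. In that case the cell supplies the average current and the capacitor only covers the short peaks, so the pulse width used here should be the peak duration seen in the captured profile.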

Symptoms → Evidence → Fix (mini decision table)

| Symptom | First evidence to capture | Discriminator (prove which block) | First fixes (lowest risk) |
|---|---|---|---|
| Resets only during refresh | Vbat + Vsys droop, BOR reason, refresh counter | Vbat droops first → coin-cell pulse limit; Vsys collapses with stable Vbat → rail path/driver load | Increase/move bulk cap; reduce refresh peak (staged update); adjust UVLO strategy |
| “Battery looks fine” but blank update | Vsys minimum, driver rail stability, refresh current profile | Ghosting/blank correlates with Vsys dips → droop; correlates with temperature bins → waveform mismatch | Improve rail stability; add temperature bin selection; schedule periodic full refresh |
| Frequent desync after update | Retry counter, RSSI variance, reset count around RF events | Retry spikes without droop → RF/coexistence/antenna; retry spikes with droop → power path | Improve antenna/layout; reduce RF power spikes; add cap on RF rail; tune retry/backoff limits |
| Corrupted image/state after power event | CRC/sequence errors, BOR reason, write-in-progress flag | Errors right after BOR → non-atomic writes; errors without BOR → firmware integrity/asset format | Atomic commit; move frequent writes to FRAM; delay refresh when Vbat is weak |
Figure F2 (waveform panels): Coin-Cell Pulse & Rail Sag Evidence. Three traces captured during the same refresh / RF / write event: Iload (event pulse), Vbat (coin cell), and Vsys (system rail), with UVLO and BOR threshold markers and a marked BOR event. Logging points: Vbat min · Vsys min · BOR reason · event counter.
Figure F2. Measure Iload, Vbat, and Vsys during the same refresh/RF event. A short droop below UVLO/BOR explains resets and corrupted state.
Measurement minimum: one current profile for each event type (refresh, RF transaction, write) plus Vbat/Vsys droop capture at cold temperature.
H2-4 · E-Paper Update Mechanics

Waveforms, Partial Updates, Ghosting, and Temperature Effects

E-paper updates are driven by multi-step waveforms that move charged pigment particles through the capsule stack. Unlike LCDs, the display is bistable: it holds the image without continuous power, but each update is a controlled physical process. Update quality depends on correct waveform/LUT selection, temperature compensation, and stable driver rails (including VCOM and the panel drive window) throughout the refresh event.

Full vs partial refresh (engineering tradeoffs that affect both power and artifacts):

  • Full refresh: higher energy and longer time, but clears accumulated artifacts and re-centers the panel state.
  • Partial update: lower energy and faster, but can accumulate ghosting if waveforms are mismatched or rails droop.
  • Practical strategy: use partial updates as default, then schedule periodic full refresh based on update count, time, or temperature events.
Ghosting is not a single-cause symptom. It is typically one of three classes: incomplete erase, waveform/temperature mismatch, or voltage droop during the update window. Each class can be separated with evidence (temperature bins, droop capture, and artifact accumulation rate).
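
The "partial by default, periodic full refresh" strategy can be sketched as a small policy function driven by the three triggers named above: update count, elapsed time, and temperature events. The thresholds (`max_partials`, `max_age_s`) and all names are hypothetical placeholders; real limits should come from measured ghosting growth for the specific panel.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RefreshState:
    partials_since_full: int = 0
    last_full_s: Optional[float] = None   # None until the first full refresh
    last_temp_bin: Optional[int] = None


def choose_refresh(state, now_s, temp_bin, max_partials=20, max_age_s=24 * 3600):
    """Return "full" or "partial" for the next update."""
    if state.last_full_s is None:
        return "full"      # panel state unknown, start from a clean image
    if temp_bin != state.last_temp_bin:
        return "full"      # temperature bin changed since the last update
    if state.partials_since_full >= max_partials:
        return "full"      # update-count trigger
    if now_s - state.last_full_s >= max_age_s:
        return "full"      # elapsed-time trigger
    return "partial"


def record_refresh(state, kind, now_s, temp_bin):
    """Update counters after a refresh completes."""
    if kind == "full":
        state.partials_since_full = 0
        state.last_full_s = now_s
    else:
        state.partials_since_full += 1
    state.last_temp_bin = temp_bin
```

Keeping the counters in the same state record that is logged with each update makes it possible to check later whether ghosting growth tracks the partial count.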

What can be controlled (checklist aligned to measurable evidence):

  • Waveform / LUT selection

    Bind waveform tables to temperature bins. Evidence: artifact rate vs temperature and update count.

  • Temperature binning

    Measure temperature representative of the panel/driver area. Evidence: sudden degradation after cold soak indicates bin mismatch.

  • Update cadence (partial vs full)

    Set a cleanup cadence to prevent long-term artifact accumulation. Evidence: ghosting growth with partial count.

  • Driver rail & VCOM stability during refresh

    Prevent Vsys/driver rail dips during waveform steps. Evidence: droop correlated with blank/gray/ghost patterns.

  • Event sequencing

    Avoid stacking RF transactions and refresh in one droop window. Evidence: droop depth increases when events overlap.
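
Binding waveform tables to temperature bins, as in the first two checklist items, can be sketched as a sorted-edge lookup. The bin edges and LUT names below are hypothetical and for illustration only; actual bins come from the panel vendor's waveform set and from artifact-rate measurements.

```python
import bisect

# Hypothetical bin edges in degrees C; five bins need five waveform tables.
BIN_EDGES_C = [0.0, 10.0, 25.0, 40.0]
LUT_NAMES = ["lut_below_0", "lut_0_10", "lut_10_25", "lut_25_40", "lut_above_40"]


def temp_bin(temp_c):
    """Index of the bin containing temp_c (a value on an edge goes to the upper bin)."""
    return bisect.bisect_right(BIN_EDGES_C, temp_c)


def select_lut(temp_c):
    """Name of the waveform table bound to the measured temperature."""
    return LUT_NAMES[temp_bin(temp_c)]


for t in (-5.0, 0.0, 24.9, 25.0, 50.0):
    print(t, temp_bin(t), select_lut(t))
```

Logging the bin index with every refresh is what allows artifact rate to be plotted against temperature bin afterwards. A production version would likely add hysteresis at the bin edges, which this sketch omits.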

Figure F3 (two-lane sequence diagram): Full vs Partial Refresh + Ghosting Risk Points. Full refresh lane: Erase / Reset → Drive Waveform → Settle / Stabilize → Commit State. Partial update lane: Targeted Drive → Short Settle → Commit State. Risk markers: temperature bin mismatch · rail droop · VCOM stability. Key rule: use partial by default, schedule full refresh to clear artifacts.
Figure F3. Full refresh resets accumulated artifacts; partial update saves energy but requires correct temperature/LUT selection and stable rails.
Evidence separation: ghosting that worsens with temperature bins points to waveform selection; ghosting that correlates with Vsys dips points to droop during refresh.
H2-5 · MCU Selection

ULP Modes, Wake Latency, IO Fit, and State Memory Strategy

ESL MCU selection should follow the duty-cycle reality: a tag lives in deep sleep most of the time, then wakes for RF windows, refresh bursts, and state commits. Multi-year lifetime is rarely decided by a single “sleep current” headline. It is usually decided by wake cost (latency and active energy), timekeeping drift (missed windows → retries), and brownout-safe state handling.

Rule of thumb: treat every wake as an “event.” Minimize event time, minimize event peak, and make every write survivable under droop. Evidence comes from time-to-ready, retry counters, and BOR reasons, not from a single datasheet number.

Selection checklist (each item maps to measurable proof):

  • Deep-sleep + retention leakage

    Baseline must be truly low across temperature. Proof: sleep current profile at cold/room/hot with all pulls and dividers included.

  • Wake latency and “time-to-ready”

    Slow boot expands RF listening and overlaps with refresh pulses. Proof: timestamped wake→IRQ ready and wake→refresh start distribution.

  • RTC accuracy and drift stability

    Drift causes missed rendezvous windows → retries. Proof: drift statistics over days and retry rate vs time since last sync.

  • Retention RAM / fast context restore

    Preserve minimal state to avoid re-initializing everything. Proof: reduced active time per event and stable refresh success rate.

  • Interrupt sources and priority model

    RF IRQ, timer, button, motion can collide. Proof: no event overlap spikes in current profile; counters remain monotonic.

  • IO fit for driver + test evidence

    SPI + busy/ready + reset lines plus test pads for Vbat/Vsys and counters. Proof: repeatable factory programming and field-debug access.

  • State memory: FRAM vs flash (write risk)

    Flash writes are droop-sensitive; FRAM is safer for frequent small commits. Proof: CRC/sequence integrity after forced droop tests.
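
The link between RTC drift and RF cost in the checklist above can be estimated with simple arithmetic: the worst-case clock error grows linearly with time since the last sync, and the receive window has to be widened by that error on both sides. The sketch below assumes only the tag's clock drifts (the gateway is treated as ideal), and the 20 ppm figure is an assumed example, not a measured value.

```python
def guard_ms(drift_ppm, since_sync_s):
    """Worst-case clock error (ms) accumulated since the last sync."""
    return drift_ppm * 1e-6 * since_sync_s * 1000.0


def listen_window_ms(nominal_ms, drift_ppm, since_sync_s):
    """Receive window widened by the guard time before and after the slot."""
    return nominal_ms + 2.0 * guard_ms(drift_ppm, since_sync_s)


# Assumed 20 ppm drift and a 5 ms nominal window.
for hours in (1, 24):
    since = hours * 3600
    print(hours, round(guard_ms(20.0, since), 1), round(listen_window_ms(5.0, 20.0, since), 1))
```

Under these assumptions the guard time is 72 ms after one hour and about 1.7 s after a day without sync, which is why the listen window (and its energy) depends so strongly on sync interval and measured drift statistics.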

Common traps (what fails in the field):

  • Great sleep number, expensive wake: long initialization inflates active energy and increases missed RF windows.
  • RTC drift ignored: missed rendezvous windows silently turn into retry storms and rapid battery collapse.
  • Flash write as a background habit: droop during write creates non-reproducible corruption unless atomic commit is enforced.
  • No evidence hooks: without BOR reason + monotonic counters, failures look like “random bugs” and become unfixable at fleet scale.
Figure F4 (timeline diagram): ESL Duty-Cycle MCU Decision Map. Duty-cycle timeline: Deep Sleep → Wake → RF Window → Refresh → Commit State. MCU factors that decide ESL outcomes: wake latency · RTC drift · retention leakage · interrupt priority · write safety. Probes: time-to-ready · retry counter · refresh current · BOR reason · CRC/seq.
Figure F4. MCU selection should optimize event cost (wake + RF window + refresh) and ensure droop-safe state commits with evidence hooks.
Verification shortcut: log wake time, retries, BOR reasons, and CRC/sequence counters for every update. These four signals explain most “random” field failures.
H2-6 · RF Link Choices

Sub-GHz vs BLE in ESL (Evidence-based)

RF choice in ESL is best decided by energy per successful update, not by a theoretical range claim. In dense shelves and reflective retail environments, the dominant variable is often retry behavior. A link that looks “low power” on paper can become expensive when coexistence or placement drives repeated transactions.

Evidence-first approach: record RSSI/SNR distributions, retry counters, and packet loss versus location. Then estimate energy per successful update using current profiles and time-to-success statistics.
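
Energy per successful update can be computed directly from captured profiles and attempt logs. The sketch below is a minimal illustration under stated assumptions: the current samples are evenly spaced, the supply voltage is treated as constant during the transaction, and the function names and sample values are hypothetical.

```python
def profile_energy_mj(samples_ma, dt_s, v_supply):
    """Integrate an evenly sampled current profile: mA * s * V gives mJ."""
    return sum(samples_ma) * dt_s * v_supply


def energy_per_success(attempts):
    """attempts: list of (energy_mj, succeeded). Failed attempts still cost energy."""
    total = sum(energy for energy, _ in attempts)
    successes = sum(1 for _, ok in attempts if ok)
    return total / successes if successes else float("inf")


# Assumed profile: 8 mA for 50 ms sampled at 1 kHz on a 3.0 V rail.
e_attempt = profile_energy_mj([8.0] * 50, 0.001, 3.0)

good_position = [(e_attempt, True)] * 10
bad_position = [(e_attempt, False), (e_attempt, False), (e_attempt, True)] * 10

print(round(energy_per_success(good_position), 2))
print(round(energy_per_success(bad_position), 2))
```

Both positions use the same radio and the same per-attempt energy, yet the position that needs three attempts per update costs three times as much per successful update. Comparing Sub-GHz and BLE on this metric requires attempt logs from real shelf locations rather than a single bench measurement.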

When Sub-GHz tends to win (hardware-practical reasons):

  • Penetration and robustness: better tolerance to shelf density and obstacles, reducing retries for the same coverage.
  • Coverage per gateway: fewer infrastructure nodes for a given area, especially in aisle-heavy layouts.
  • Location variance control: more stable link margin across “good vs bad” positions reduces tail energy events.

When BLE tends to win (ESL-specific deployment advantages):

  • Phone commissioning: convenient local setup and service workflows without extra tooling.
  • Lower-cost infrastructure (in some deployments): if BLE gateways already exist or coverage is small and controlled.
  • Short-range reliability: when placement and coexistence are well-managed, transactions can be efficient.

Comparison table (decide with field evidence):

| Dimension | Sub-GHz | BLE | What to log (proof) |
|---|---|---|---|
| Coverage & penetration | Typically more robust across shelves and obstacles | Strong at short range; more sensitive to 2.4 GHz environment | RSSI/SNR distribution by location (not single-point) |
| Energy per successful update | Often stable if retries are controlled | Can be excellent or terrible depending on retries | Retry counters + time-to-success distribution + current profile |
| Infrastructure | May require dedicated gateways/planning | Can leverage phones/gateways in certain setups | Gateway density vs success rate and tail retries |
| Coexistence risk | Different interference sources; often less Wi-Fi coupling | 2.4 GHz coexistence can drive tail retries | Packet loss vs time-of-day; retry spikes near hotspots |
| Certification surface | Region-dependent; planning required | Common ecosystem; still region rules apply | Deployment region list + pre-scan interference snapshots |
Figure F5 (segmented bar chart, concept): RF Transaction Energy Budget. Each bar splits a successful update into Listen · Tx/Rx · Retry · Commit. Case A (low retries): energy stays predictable. Case B (retry storm): the Retry segment dominates. Evidence: retry counter ↑ · time-to-success tail ↑ · packet loss vs location. Decision metric: minimize energy per successful update, not “range.”
Figure F5. RF energy is dominated by retries and tail latency. Record retries and time-to-success to compare Sub-GHz vs BLE in real placement.
Field-proof checklist: collect RSSI/SNR distribution, retry counters, and packet loss vs location. Averages hide tail energy events that destroy coin-cell lifetime.
H2-7 · Memory & Data Integrity

Image Buffers, Delta Updates, Wear, and Rollback-Safe Recovery

ESL updates are constrained by storage size, write time, and dropout risk during the update window. Treat each image update as a transaction: receive payload, verify integrity, write to a staging slot, then atomically switch the active image only after a final commit flag. This prevents “half-updated” states that trigger retries, rapid battery drain, and confusing field failures.

Core objective: keep the display consistent with the last known-good image even if a brownout happens mid-transfer or mid-write. Use CRC + sequence counters + atomic commit to make recovery deterministic.

Firmware storage layout (minimal, robust pattern):

  • Image Slot A / Slot B: store full bitmap or compressed representation (two-slot layout enables rollback).
  • Metadata (tiny, high-integrity): active_slot, pending_slot, pending_version, image_crc, commit_flag, write_in_progress.
  • Monotonic sequence: accept only newer versions; never “flip active” without commit_flag.
  • Two-phase commit: PREPARE (pending set) → WRITE (payload stored) → COMMIT (flag set last) → ACTIVATE.

Delta updates (use when they shorten the risky window):

  • Wins: small region changes reduce RF airtime and reduce refresh time, lowering pulse exposure.
  • Traps: large diffs can cost CPU time and extend the active window; treat “delta” as a measured decision.
  • Evidence to decide: time-to-success distribution and write window length under worst placement and cold temperature.

Write media split (avoid hidden corruption):

  • FRAM (or equivalent): frequent small writes (counters, flags, last-good markers) with low dropout risk.
  • Flash: large image payloads with batching and CRC verification; avoid frequent metadata churn in flash.
  • Wear control: batch writes, minimize commit metadata size, and bound retry loops to prevent write storms.

Failure recovery flow (deterministic outcomes):

  1. Dropout during RF receive

    Discard pending payload. Keep active_slot unchanged. Increment retry counter with backoff.

  2. Dropout during flash write

    On boot, detect write_in_progress and invalidate pending_slot by CRC/flag. Keep active_slot.

  3. Dropout after write, before commit

    CRC may be valid, but commit_flag is not set. Keep active_slot; reattempt activation only after explicit commit.

  4. Dropout after commit

    If commit_flag and CRC are valid, activate pending_slot and clear write_in_progress; record last-good version.
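
The two-phase commit and the four recovery cases above can be exercised with a small host-side simulation. This is a sketch under stated assumptions, not a firmware implementation: storage is modeled as in-memory objects, `fail_at` is a hypothetical hook that injects a brownout at a chosen stage, and `active_version` is an added field used for the monotonic-sequence check. On real hardware each metadata update must itself be an atomic record write, which this model does not capture.

```python
import zlib
from dataclasses import dataclass, field


class Brownout(Exception):
    """Simulated power loss."""


@dataclass
class Meta:
    active_slot: int = 0
    pending_slot: int = 1
    active_version: int = 0
    pending_version: int = 0
    image_crc: int = 0
    commit_flag: bool = False
    write_in_progress: bool = False


@dataclass
class Store:
    slots: list = field(default_factory=lambda: [b"old-image", b""])
    meta: Meta = field(default_factory=Meta)


def activate(store):
    m = store.meta
    m.active_slot, m.active_version = m.pending_slot, m.pending_version
    m.commit_flag = False
    m.write_in_progress = False


def update(store, payload, version, crc, fail_at=None):
    m = store.meta
    # Verify before touching storage: reject stale versions and bad CRCs.
    if version <= m.active_version or zlib.crc32(payload) != crc:
        return False
    # PREPARE: mark the inactive slot as pending.
    m.pending_slot = 1 - m.active_slot
    m.pending_version, m.image_crc = version, crc
    m.commit_flag, m.write_in_progress = False, True
    if fail_at == "write":
        store.slots[m.pending_slot] = payload[: len(payload) // 2]  # torn write
        raise Brownout
    store.slots[m.pending_slot] = payload  # WRITE
    m.write_in_progress = False
    if fail_at == "before_commit":
        raise Brownout
    m.commit_flag = True  # COMMIT: set last
    if fail_at == "after_commit":
        raise Brownout
    activate(store)  # ACTIVATE
    return True


def recover_on_boot(store):
    """Return the image to display after a reset."""
    m = store.meta
    staged_ok = zlib.crc32(store.slots[m.pending_slot]) == m.image_crc
    if m.commit_flag and staged_ok:
        activate(store)  # case 4: finish the committed update
    else:
        m.commit_flag = False  # cases 2 and 3: drop pending, keep active
        m.write_in_progress = False
    return store.slots[m.active_slot]
```

Calling `update(..., fail_at="write")` or `fail_at="before_commit"`, catching `Brownout`, and then calling `recover_on_boot` leaves the old image active. With `fail_at="after_commit"` the recovery path completes the activation. Case 1 (dropout during RF receive) never reaches storage in this model because the payload fails verification or is never passed to `update`.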

Figure F6 (transaction diagram): Firmware Storage & Update Transaction Map. Flow: RF Payload (bitmap / compressed / delta) → Verify (CRC check + SEQ compare) → Staging Write (write_in_progress=1) into Flash / FRAM. Storage: Image Slot A (last-good), Image Slot B (pending / staging), and Metadata (tiny, high-integrity): active_slot, pending_slot, image_crc, pending_seq, COMMIT flag, in_progress. Paths: activate and rollback. Log points: CRC fail · SEQ · commit.
Figure F6. A/B image slots plus CRC/SEQ and an atomic commit flag make image updates rollback-safe under brownout and partial writes.
Dropout test: force a brownout at random offsets during receive/write/commit and confirm the display always returns to a valid last-good image.
H2-8 · Hardware Design Checklist

Layout, EMC/ESD, Antenna Detuning, and Minimum Test Points

ESL failures in production are often caused by ESD entry points, near-metal antenna detuning, and micro-leakage that silently destroys lifetime. A production-ready design should expose the minimum measurement hooks needed to separate power droop, RF retries, and driver rail instability.

Production mindset: design for evidence. Add test pads for Vbat/Vsys/driver rails, RF probing, and a debug interface. Without these, field failures look random and cannot be fixed at scale.

Copy/paste hardware checklist (build-ready):

  • ESD at touch/button/frame entry

    Check: series-R/TVS placement near entry · Verify: no lockups/resets during contact discharge; retry counter stays stable.

  • Antenna keepout and near-metal detuning

    Check: keepout region + matching network reserve · Verify: RSSI/SNR distribution remains acceptable in real mounting orientation.

  • Power path pulse handling

    Check: bulk cap placement on pulse rails · Verify: Vsys minimum during refresh stays above BOR/UVLO thresholds.

  • Driver rails and VCOM stability

    Check: rail decoupling and return paths · Verify: no droop during waveform steps; ghosting does not correlate with Vsys dips.

  • EMC noise hotspots (shelf environment)

    Check: sensitive traces, ground strategy, RF layout · Verify: packet loss vs location does not show hotspot spikes.

  • Low leakage rules

    Check: pullup/divider values, sensor/LED gating, floating IO · Verify: deep-sleep current across temperature stays within target.

  • Minimum test pads set

    Check: TP: Vbat, Vsys, driver rail, RF test, SWD/UART · Verify: quick diagnosis of droop vs RF retries vs driver instability.

Minimum test points (fast field and factory evidence):

  • TP: Vbat (coin cell) and TP: Vsys (system rail) to catch pulse droop.
  • TP: driver rails (panel/driver supply points) to correlate ghosting/blank updates with rail stability.
  • TP: RF (probe point or reserved matching pads) to measure detuning under real mounting.
  • TP: SWD/UART (or equivalent) for BOR reasons, retry counters, CRC/SEQ logging, and recovery diagnosis.
Figure F7 (block diagram): Production Killers Map + Test Points. ESL core blocks with their test access: Coin Cell (Vbat), Power Path (Vsys), MCU (SWD/UART), RF + Antenna (RF test), E-Paper Driver (driver rails). Highlighted production killers: ESD entry points · leakage traps · antenna detune · EMC. Test pads: TP: Vbat · TP: Vsys · TP: SWD/UART · TP: RF · TP: rails.
Figure F7. Production failures often trace to ESD entry points, antenna detuning near metal, EMC hotspots, and leakage traps. Add test pads for proof.
Sleep-current audit: measure deep-sleep current with real assembly (panel connected, sensors attached, pullups enabled) across temperature. Small leakage dominates lifetime.
H2-7 · Memory & Data Integrity

Image Buffers, Delta Updates, Wear, and Rollback-Safe Recovery

ESL updates are constrained by storage size, write time, and dropout risk during the update window. Treat each image update as a transaction: receive payload, verify integrity, write to a staging slot, then atomically switch the active image only after a final commit flag. This prevents “half-updated” states that trigger retries, rapid battery drain, and confusing field failures.

Core objective: keep the display consistent with the last known-good image even if a brownout happens mid-transfer or mid-write. Use CRC + sequence counters + atomic commit to make recovery deterministic.

Firmware storage layout (minimal, robust pattern):

  • Image Slot A / Slot B: store full bitmap or compressed representation (two-slot layout enables rollback).
  • Metadata (tiny, high-integrity): active_slot, pending_slot, pending_version, image_crc, commit_flag, write_in_progress.
  • Monotonic sequence: accept only newer versions; never “flip active” without commit_flag.
  • Two-phase commit: PREPARE (pending set) → WRITE (payload stored) → COMMIT (flag set last) → ACTIVATE.

Delta updates (use when they shorten the risky window):

  • Wins: small region changes reduce RF airtime and reduce refresh time, lowering pulse exposure.
  • Traps: large diffs can cost CPU time and extend the active window; treat “delta” as a measured decision.
  • Evidence to decide: time-to-success distribution and write window length under worst placement and cold temperature.

Write media split (avoid hidden corruption):

  • FRAM (or equivalent): frequent small writes (counters, flags, last-good markers) with low dropout risk.
  • Flash: large image payloads with batching and CRC verification; avoid frequent metadata churn in flash.
  • Wear control: batch writes, minimize commit metadata size, and bound retry loops to prevent write storms.

Failure recovery flow (deterministic outcomes):

  1. Dropout during RF receive

    Discard pending payload. Keep active_slot unchanged. Increment retry counter with backoff.

  2. Dropout during flash write

    On boot, detect write_in_progress and invalidate pending_slot by CRC/flag. Keep active_slot.

  3. Dropout after write, before commit

    CRC may be valid, but commit_flag is not set. Keep active_slot; reattempt activation only after explicit commit.

  4. Dropout after commit

    If commit_flag and CRC are valid, activate pending_slot and clear write_in_progress; record last-good version.

Figure F6 — Firmware Storage & Update Transaction Map Diagram of an ESL image update transaction with payload verification, staging write into A/B slots, atomic commit, activation, and rollback-safe recovery. Update Transaction CRC + SEQ + atomic COMMIT keeps last-good image safe RF Payload bitmap / compressed / delta Verify CRC check + SEQ compare Staging Write write_in_progress=1 Flash / FRAM Image Slot A last-good Image Slot B pending / staging Metadata (tiny, high-integrity) active_slot pending_slot image_crc pending_seq COMMIT flag in_progress activate rollback Log: CRC fail Log: SEQ Log: commit Cite this figure · F6
Figure F6. A/B image slots plus CRC/SEQ and an atomic commit flag make image updates rollback-safe under brownout and partial writes. Link
Dropout test: force a brownout at random offsets during receive/write/commit and confirm the display always returns to a valid last-good image.
H2-8 · Hardware Design Checklist

Layout, EMC/ESD, Antenna Detuning, and Minimum Test Points

ESL failures in production are often caused by ESD entry points, near-metal antenna detuning, and micro-leakage that silently destroys lifetime. A production-ready design should expose the minimum measurement hooks needed to separate power droop, RF retries, and driver rail instability.

Production mindset: design for evidence. Add test pads for Vbat/Vsys/driver rails, RF probing, and a debug interface. Without these, field failures look random and cannot be fixed at scale.

Copy/paste hardware checklist (build-ready):

  • ESD at touch/button/frame entry

    Check: series-R/TVS placement near entry · Verify: no lockups/resets during contact discharge; retry counter stays stable.

  • Antenna keepout and near-metal detuning

    Check: keepout region + matching network reserve · Verify: RSSI/SNR distribution remains acceptable in real mounting orientation.

  • Power path pulse handling

    Check: bulk cap placement on pulse rails · Verify: Vsys minimum during refresh stays above BOR/UVLO thresholds.

  • Driver rails and VCOM stability

    Check: rail decoupling and return paths · Verify: no droop during waveform steps; ghosting does not correlate with Vsys dips.

  • EMC noise hotspots (shelf environment)

    Check: sensitive traces, ground strategy, RF layout · Verify: packet loss vs location does not show hotspot spikes.

  • Low leakage rules

    Check: pullup/divider values, sensor/LED gating, floating IO · Verify: deep-sleep current across temperature stays within target.

  • Minimum test pads set

    Check: TP: Vbat, Vsys, driver rail, RF test, SWD/UART · Verify: quick diagnosis of droop vs RF retries vs driver instability.

Minimum test points (fast field and factory evidence):

  • TP: Vbat (coin cell) and TP: Vsys (system rail) to catch pulse droop.
  • TP: driver rails (panel/driver supply points) to correlate ghosting/blank updates with rail stability.
  • TP: RF (probe point or reserved matching pads) to measure detuning under real mounting.
  • TP: SWD/UART (or equivalent) for BOR reasons, retry counters, CRC/SEQ logging, and recovery diagnosis.
Figure F7. Production Killers Map + Test Points. Block diagram of the ESL core blocks (coin cell, power path, MCU, RF + antenna, e-paper driver) marked with the common production killers (ESD entry points, antenna detuning near metal, EMC hotspots, leakage traps) and the minimum test pads needed for proof: TP: Vbat, TP: Vsys, TP: rails, TP: RF, TP: SWD/UART.
Sleep-current audit: measure deep-sleep current with real assembly (panel connected, sensors attached, pullups enabled) across temperature. Small leakage dominates lifetime.
H2-9 · Validation Plan

Measurements That Prove Lifetime and Robustness

A repeatable ESL validation plan should produce a single evidence set that supports three claims: lifetime estimation is credible, updates survive worst-case conditions, and RF reliability holds in real shelf geometry. The plan below standardizes tests, instruments, pass criteria, and log fields so results remain comparable across revisions.

Evidence-first rule: always capture (1) current profile segments, (2) Vbat/Vsys minima during refresh and RF windows, (3) BOR reasons, (4) retry counters + RSSI/SNR distributions, and (5) ghosting matrix outcomes by temperature.

Validation table (Test → Instrument → Pass criteria → Data to log):

Test | Instrument / Setup | Pass criteria | Data to log (minimum fields)
Sleep baseline profile | Power analyzer or shunt + DMM; real assembly state; temp bins | Baseline current within target across temp; no unexpected leakage modes | temp, I_sleep_avg, I_sleep_p95, mode flags (pullups/sensors), Vbat
RF burst energy per success | Power profile capture during update; gateway/receiver logs | Energy per successful update bounded; tail retries remain within limit | rssi, snr, retry_cnt, t_success, E_success, seq, crc_fail
Refresh burst + droop margin | Scope on TP:Vsys and TP:driver rail; trigger on refresh start | Vsys_min stays above BOR/UVLO margin; driver rail stable in waveform steps | Vsys_min, Vbat_min, rail_min, refresh_type (full/partial), temp
Brownout worst-case survival | Cold condition + aged cell; forced update loop; BOR reason logging | No corrupted state; if BOR occurs, rollback to last-good succeeds | bor_reason, commit_flag, active_slot, pending_slot, crc_ok, reboot_count
Ghosting matrix | Temp bins × partial count; controlled content patterns | Ghosting boundary defined; mitigation triggers verified (full refresh policy) | temp, lut_id, partial_count, Vsys_min, score (visual/metric), refresh_policy
RF reliability shelf matrix | Distance/angle/metal shelf placements; log RSSI/SNR and success | Success rate meets target at defined placements; hotspots documented | placement_id, distance, angle, metal_state, rssi, snr, retry_cnt, success
ESD / touch-point robustness | Controlled discharge on touch/button/frame points; monitor resets | No lockup; reset count bounded; RF performance not degraded | reset_count, bor_reason, retry_cnt, rssi baseline delta, error flags
Lifetime model input: compute baseline energy from sleep current and duty cycle, then add E_success × update frequency. Use tail statistics (p95/p99 of retries and t_success) to avoid optimistic estimates.
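The lifetime model input described above can be sketched as follows. This is a minimal illustration, not a validated model: the function name, the 225 mAh / 3.0 V cell figures, the 0.7 usable-capacity derating, and every sample number in the usage note are assumptions chosen only to show the arithmetic.

```python
SECONDS_PER_DAY = 86_400


def lifetime_years(i_sleep_ua, v_nom, e_success_mj, updates_per_day,
                   capacity_mah, usable_fraction=0.7):
    """Estimate lifetime in years from sleep current plus per-update energy.

    e_success_mj should be a tail value (p95/p99) that already includes
    retries and the refresh burst. usable_fraction is an assumed derating
    for cold temperature, cell aging, and pulse sag.
    """
    # Usable cell energy in joules: mAh -> Ah -> coulombs, times voltage.
    usable_j = capacity_mah / 1000 * 3600 * v_nom * usable_fraction
    # Baseline energy per day from sleep current (uA -> A).
    sleep_j_per_day = i_sleep_ua * 1e-6 * v_nom * SECONDS_PER_DAY
    # Update energy per day (mJ -> J).
    update_j_per_day = e_success_mj * 1e-3 * updates_per_day
    days = usable_j / (sleep_j_per_day + update_j_per_day)
    return days / 365.0
```

With these assumed inputs, `lifetime_years(1.5, 3.0, 60, 4, 225)` gives roughly 7.4 years, while feeding a hypothetical p99 value of 150 mJ per update instead of 60 mJ drops the estimate to roughly 4.7 years, which is why the tail value belongs in the model.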
Figure F8. Validation Evidence Chain Map. The four validation tests (current profile, brownout margin, ghosting matrix, RF shelf matrix) feed one unified log schema (Vsys_min, BOR reason, retry_cnt, RSSI, SNR, temp/LUT, CRC/SEQ, t_success), which supports the four conclusions: lifetime estimate, robust update, ghosting boundary, and deployment limits. Standardize this schema so lifetime and robustness can be proven across revisions.
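One way to pin down the unified log schema is a single record type that every test writes. The sketch below is illustrative only: the field names follow the log fields listed in this section, but the units, types, the `UpdateLog` name, and the `bor_reason` encoding are assumptions.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class UpdateLog:
    seq: int            # monotonic update sequence number
    temp_c: float       # temperature at update time
    lut_id: int         # waveform/LUT selected for this refresh
    vsys_min_mv: int    # minimum system rail during refresh/RF window
    bor_reason: int     # assumed encoding: 0 = no brownout reset recorded
    retry_cnt: int      # RF retries needed for this update
    rssi_dbm: int
    snr_db: float
    crc_ok: bool
    t_success_ms: int   # time from wake to confirmed update


def to_csv(records):
    """Serialize records with a fixed header so revisions stay comparable."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=[f.name for f in fields(UpdateLog)])
    writer.writeheader()
    for rec in records:
        writer.writerow(asdict(rec))
    return buf.getvalue()
```

Keeping the column order fixed in one place means sleep, brownout, ghosting, and shelf-matrix runs can be merged and compared without per-test parsers.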
H2-10 · Field Debug Playbook

Symptom → Evidence → Isolate → Fix (Fast SOP Cards)

Field debugging should start with two measurements that split the problem space quickly: Vsys_min during refresh and RF retry behavior. Each symptom card below includes the first two measurements, a discriminator that isolates the root cause bucket, and a first fix that is safe to try early.

Default evidence hooks: TP:Vbat, TP:Vsys, TP:driver rails, TP:RF, and SWD/UART logs for BOR reason, retry counters, CRC/SEQ, and time-to-success.
Symptom: Blank / missing update

Random blank screen or “update applied” but nothing changes

  1. First 2 measurements: TP:Vsys_min during refresh · TP:driver rail minimum / driver-busy timing.
  2. Discriminator: if Vsys_min dips near BOR/UVLO → power droop bucket; if rails stable but ghosting/blank persists → waveform/temp bucket.
  3. First fix: decouple refresh rail, shorten refresh overlap with RF, enforce temp-binned waveform selection, and log refresh start/end markers.
Symptom: Desync / missing packets

Frequent desync: updates arrive late or not at all

  1. First 2 measurements: retry_cnt + t_success distribution · RSSI/SNR variance by placement.
  2. Discriminator: if retries spike only near metal or certain angles → antenna detune bucket; if retries spike by time-of-day → coexistence/EMC bucket.
  3. First fix: adjust antenna keepout/matching, document hotspot placements, add backoff and cap retries per update to protect coin-cell.
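The "add backoff and cap retries" fix can be sketched as a bounded retry loop. This is a hedged illustration: `send_once`, `sleep_ms`, and all timing defaults are hypothetical placeholders, not values taken from any radio stack.

```python
import random


def send_with_backoff(send_once, max_retries=4, base_ms=50, cap_ms=2000,
                      sleep_ms=lambda ms: None, rng=random.random):
    """Attempt one update transaction with a hard retry cap.

    send_once() returns True on success. Returns (success, retry_cnt).
    Delays grow exponentially and are jittered so dense tags do not
    retry in lockstep.
    """
    retry_cnt = 0
    while True:
        if send_once():
            return True, retry_cnt
        if retry_cnt >= max_retries:
            # Give up and wait for the next sync window to protect the cell.
            return False, retry_cnt
        delay = min(cap_ms, base_ms * (2 ** retry_cnt))
        sleep_ms(delay * (0.5 + 0.5 * rng()))
        retry_cnt += 1
```

The returned `retry_cnt` is the value to log per update, so the cap and the evidence hook come from the same code path.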
Symptom: Reset on update

Battery looks “healthy” but resets during update

  1. First 2 measurements: TP:Vbat droop under refresh pulse · BOR reason log (SWD/UART).
  2. Discriminator: if Vbat droops but baseline is fine → pulse load + internal resistance bucket; if droop aligns with RF+refresh overlap → scheduling bucket.
  3. First fix: increase local bulk near pulse rails, stage RF before refresh, and enforce rollback-safe update commit so resets cannot corrupt state.
Symptom: Ghosting

Ghosting grows over days as partial updates accumulate

  1. First 2 measurements: temp bin + LUT/waveform ID · partial_count + Vsys_min during refresh.
  2. Discriminator: if ghosting correlates with cold/hot bins → waveform/temp mismatch bucket; if ghosting correlates with droop events → rail stability bucket.
  3. First fix: define a full-refresh trigger (by partial_count or temp change), and verify rails during waveform steps with scoped minima.
Symptom: Loop / mismatch

Update loops, repeated retries, or version mismatch across tags

  1. First 2 measurements: SEQ monotonicity (logs) · CRC fail count per update.
  2. Discriminator: if CRC fails are non-zero → transfer or storage corruption bucket; if CRC OK but versions flip → commit/atomicity bucket.
  3. First fix: enforce A/B slots + atomic COMMIT, bind activation to commit_flag, and cap retry loops to avoid write storms and battery collapse.
Symptom: Bench vs shelf

Works on bench but fails on shelf or in-store

  1. First 2 measurements: shelf placement matrix logs (RSSI/SNR + retries) · reset/BOR counters over time.
  2. Discriminator: if failures cluster by placement → detune/hotspot bucket; if failures cluster by interaction → ESD entry bucket.
  3. First fix: adjust antenna matching for mounted condition, harden ESD entry points, and add minimum test pads to make evidence collection fast.
Figure F9. Symptom-to-Evidence Decision Tree (ESL). Symptoms (blank, ghosting, desync, reset/loop) map to the first measurements (Vsys_min + BOR reason, retry_cnt + RSSI/SNR variance, CRC/SEQ) and then to isolation buckets (power droop, waveform-temp, RF detune, commit/CRC, EMC). Fix priority: protect the coin cell by limiting retries and keeping updates rollback-safe.
Fast triage: if Vsys_min is healthy but retries explode, isolate RF placement/detune first; if Vsys_min collapses, isolate pulse and scheduling first.
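The fast-triage rule can be written as a small decision function for log post-processing. This is a sketch under assumptions: the 100 mV margin, the retry limit, and the bucket strings are placeholders, and the ordering (power first, then RF, then CRC, then waveform) is one reasonable reading of the symptom cards above.

```python
def triage(vsys_min_mv, bor_threshold_mv, retry_p95, retry_limit,
           crc_fail_cnt=0, margin_mv=100):
    """Return the first root-cause bucket to isolate, checking power first."""
    if vsys_min_mv < bor_threshold_mv + margin_mv:
        return "power droop: isolate pulse load and RF/refresh scheduling"
    if retry_p95 > retry_limit:
        return "RF: isolate placement/detune"
    if crc_fail_cnt > 0:
        return "memory: check transfer/storage corruption and commit"
    # Rails and link look healthy, so suspect waveform/temperature mismatch.
    return "waveform/temp: check lut_id against temperature bin"
```

For example, a tag with a healthy rail but a p95 of 12 retries against a limit of 5 lands in the RF bucket, matching the rule that exploding retries with good Vsys_min point to placement or detune.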
H2-11 · BOM / MPN Example Buckets

Selection-Oriented Parts Buckets (Non-exhaustive)

The list below is designed for fast ESL selection without drifting into protocol stacks or unrelated product teardowns. Each bucket provides (1) what to look for in an e-paper tag duty cycle, (2) concrete example parts, and (3) evidence hooks to validate choices.

How to use: treat MPNs as anchors for RFQ and cross-reference. Final selection should be proven with Vsys_min, BOR reasons, retry_cnt, RSSI/SNR variance, and CRC/SEQ logs under worst-case temperature and shelf placement.
Figure F10. BOM Buckets Map for ESL. The ESL core (coin cell + pulses, sleep + updates, shelf RF + metal) is surrounded by six buckets: ULP MCU (sleep · wake · RTC), Sub-GHz / BLE (retry energy · RSSI), e-paper driver (rails · LUT · busy), supervisor (nA Iq · thresholds), load switch (leakage · inrush), and FRAM / flash (atomic commit · CRC). Screen every bucket using the evidence hooks (Vsys_min + BOR, retry_cnt + RSSI/SNR, CRC/SEQ + t_success) under worst-case shelf and temperature conditions.
Bucket: ULP MCU (sleep · wake · RTC)

ULP MCU families (ESL duty-cycle oriented)

  • What to look for: deep sleep (incl. retention), wake latency/energy, RTC drift, fast peripheral bring-up, low-leakage GPIO defaults.
  • Why it matters: “nice sleep current” is not enough if wake + RF + refresh windows stretch and multiply energy per update.
  • Evidence hook: log t_wake, E_success, and retry tails (p95/p99).
Example parts (MPN) | Why they fit ESL | Quick screening knobs
TI MSP430FR series (e.g., MSP430FR5969 / MSP430FR2433) | FRAM-centric design supports frequent small writes; strong ULP heritage for long sleep windows. | sleep+retention, FRAM write strategy, RTC drift, wake path time
ST STM32L0 / STM32L4 (e.g., STM32L072 / STM32L476) | Low-power modes + broad ecosystem; useful when more compute is needed for delta logic or logging. | stop-mode current, wake latency, flash write time, Vmin range
Nordic nRF52 (e.g., nRF52810 / nRF52832) | Good when BLE is primary and the SoC consolidates MCU+radio to reduce BOM and wake overhead. | connection event energy, deep sleep, fast wake to radio
Silicon Labs EFR32 (e.g., EFR32BG22) | BLE-focused SoC option; integrates radio and supports logging and fast event handling. | radio event energy, sleep/retention, RF retry tail

Common trap: picking by headline sleep current while ignoring wake + radio start time and flash write overhead during updates.

Bucket: Sub-GHz (range · penetration)

Sub-GHz transceivers (shelf reliability first)

  • What to look for: energy per successful transaction (incl. retries), RX sensitivity, PA efficiency, fast turnaround, observable RSSI/SNR.
  • Why it matters: metal shelves and dense tags create hotspots where retries explode and drain coin-cells.
  • Evidence hook: track retry_cnt, t_success, and RSSI/SNR variance by placement.
Example parts (MPN) | Why they fit ESL | Quick screening knobs
TI CC1101 | Common sub-GHz transceiver used in low-power links; suitable for controlled shelf networks. | RX sensitivity, turnaround time, transaction energy with retries
TI CC1310 / CC1312 | Sub-GHz MCU+radio integration can cut BOM and reduce wake overhead for sub-GHz-centric ESL. | sleep+radio wake, PA efficiency, logging hooks
Silicon Labs Si4463 | Sub-GHz transceiver option for robust links; useful when pairing with separate MCU. | link margin, RX current, retry tail control
Semtech SX1262 / SX1261 | Sub-GHz transceivers often used where long-range/penetration is favored; screen by transaction energy for ESL. | TX/RX current profile, fast mode transitions, retry behavior

Common trap: using lab range numbers instead of placement matrix data; a few hotspot tags can dominate support cost.

Bucket: BLE (commissioning)

BLE SoCs (phone-friendly commissioning)

  • What to look for: energy per connection/advertising event, fast reconnect, low sleep+retention, stable RSSI reporting.
  • Why it matters: BLE success is dominated by coexistence and connection event overhead in dense environments.
  • Evidence hook: log E_success per update and retry distributions vs time-of-day congestion.
Example parts (MPN) | Why they fit ESL | Quick screening knobs
Nordic nRF52810 / nRF52811 | Cost-effective BLE SoCs; good for simple ESL tags where BLE is the primary link. | advertising/connection energy, sleep current, fast wake
Nordic nRF52832 | More headroom for logging/delta logic while maintaining BLE integration. | CPU active time vs update window, RF retry tail
Silicon Labs EFR32BG22 | BLE SoC family used for low-power BLE designs; screen by event energy and coexistence robustness. | event energy, sleep+retention, RSSI variance
TI CC2642R | BLE SoC option when TI ecosystem fits; evaluate by update transaction energy. | radio start latency, event energy, retry behavior

Common trap: focusing on peak current while ignoring connection window duration and repeated reconnect cost.

Bucket: E-paper driver (rails · LUT)

E-paper driver ICs (update stability + ghosting control)

  • What to look for: supported panel size/resolution, internal HV generation needs, LUT/temperature handling, busy/status visibility, rail tolerance during waveform steps.
  • Why it matters: ghosting and blank updates often come from LUT-temp mismatch or rail droop mid-waveform.
  • Evidence hook: correlate ghosting score with temp, lut_id, and Vsys_min.
Example parts (MPN) | Why they fit ESL | Quick screening knobs
Solomon Systech SSD1681 | Common e-paper driver/controller used in small-to-mid e-paper panels; verify rails and waveform behavior. | busy timing, waveform tables, refresh stability
Solomon Systech SSD1675A | Used in various e-paper modules; screen by partial/full refresh behavior and temp handling. | LUT/temp bins, partial refresh boundary
UltraChip UC8151 / UC8176 | Commonly seen in e-paper modules; evaluate by update current profile and ghosting matrix. | refresh burst profile, Vsys_min during steps
Good Display / module controllers (module-integrated driver variants) | When buying integrated panels/modules, focus on driver visibility (busy/status) and waveform options rather than hidden black boxes. | status pins, waveform access, refresh time

Common trap: treating partial refresh as “free” without defining a full refresh trigger by temp change or partial_count.

Bucket: Supervisor (nA Iq)

Nanoamp supervisors (brownout correctness over “sensitivity”)

  • What to look for: ultra-low Iq, threshold accuracy, delay/deglitch, manual reset, and behavior during pulse sag (avoid nuisance resets).
  • Why it matters: wrong thresholds/delays can reset the tag exactly during refresh pulses, corrupting update transactions.
  • Evidence hook: log BOR reason and reset counts during forced update loops at cold + aged cell.
Example parts (MPN) | Why they fit ESL | Quick screening knobs
TI TPS3839 | Low-power supervisor family; select by threshold and delay to avoid pulse-triggered resets. | Vth, delay, Iq, reset pulse width
TI TPS3840 | Supervisor option for battery-powered designs; validate behavior under refresh pulse sag. | Vth, deglitch, Iq
Analog Devices (Maxim) MAX16056 | Ultra-low-power supervisor option; screen by threshold fit to coin-cell worst-case Vmin. | Iq, threshold, hysteresis, delay
Microchip MCP112 / MCP121 | Common supervisors; evaluate Iq and threshold/delay against ESL pulse behavior. | Iq vs temp, Vth tolerance, reset delay

Common trap: selecting the highest threshold “for safety” and causing nuisance resets under cold pulse sag.

Bucket: Load switch (leakage)

Load switches / power gating (stop leakage, control inrush)

  • What to look for: very low off leakage, Rds(on) fit for pulse rails, reverse current blocking, controlled rise time (soft-start), clean enable defaults.
  • Why it matters: microamp leak paths silently kill lifetime; uncontrolled inrush worsens Vsys_min during refresh.
  • Evidence hook: measure sleep current by rail-gating states; scope Vsys_min during gated rail enable + refresh.
Example parts (MPN) | Why they fit ESL | Quick screening knobs
TI TPS22916 / TPS22917 | Load switch options often used for rail gating; verify leakage and inrush control for coin-cell. | Ioff, Ron, slew rate, reverse blocking
TI TPS22910A | Small load switch family; useful for gating sensors/aux rails to protect baseline current. | Ioff, enable leakage, Ron
Analog Devices ADP199 | Load switch option; screen for low leakage and stable turn-on under weak supply. | Ioff, turn-on behavior, reverse leakage
onsemi NCP45521 | Power path / load switch style part; validate leakage and droop impact in ESL update pulses. | reverse current, Ron, Iq/Ioff

Common trap: leaving dividers, pullups, LEDs, or sensor rails ungated and losing years of lifetime to microamps.

Bucket: Memory (atomic commit)

FRAM / Flash (image storage + rollback-safe metadata)

  • What to look for: write time/energy, endurance, brownout safety strategy (A/B slots + atomic COMMIT), CRC/SEQ integration.
  • Why it matters: ESL updates are transactions; half-written metadata causes loops and repeated RF + refresh drains.
  • Evidence hook: log commit_flag, active_slot, crc_ok, and reset causes across forced brownout tests.
Example parts (MPN) | Why they fit ESL | Quick screening knobs
Infineon (formerly Cypress/Ramtron) I²C FRAM (e.g., FM24C64B / FM24CL64B) | FRAM supports frequent small writes for metadata (SEQ/flags) with reduced wear and faster commit behavior. | write energy, interface (I²C/SPI), retention vs temp
Infineon (formerly Cypress) SPI FRAM (e.g., CY15B104Q) | FRAM option for robust metadata commits; use for commit_flag, counters, and last-good records. | write model, interface, brownout behavior
Winbond SPI NOR (e.g., W25Q32JV / W25Q64JV) | Common SPI flash for image A/B slots; must pair with atomic commit and CRC to survive resets. | page program time, erase strategy, Vmin, endurance
Micron / Macronix SPI NOR (e.g., MT25QL / MX25R series) | Flash families used for low-power storage; screen by program/erase energy and Vmin for coin-cell. | program current, erase time, deep power-down

Common trap: storing frequently-updated metadata in flash without wear/atomicity controls, causing loops and repeated updates.

Pre-RFQ checklist: verify (1) Vmin margin at cold + aged cell, (2) peak pulse handling during refresh, (3) retry tail behavior at metal shelves, (4) atomic update commit (A/B + CRC/SEQ), and (5) measurable debug hooks (BOR, retries, Vsys_min).

H2-12 · FAQs

ESL Hardware FAQs — Evidence First (12)

Each answer stays in-scope and lands on measurable ESL evidence: power droop, waveform/LUT + temperature, RF retries/RSSI/SNR, memory atomicity (CRC/SEQ), and a validation matrix.

Rule of thumb: start with Vsys_min + BOR reason for resets/blank screens, and retry_cnt + RSSI/SNR variance for RF complaints. Then confirm with CRC/SEQ and commit_flag.
Figure F11. FAQ Evidence Router (ESL). Routes symptoms (reset during refresh, blank/black updates, ghosting accumulation, RF desync/high retries) to the first evidence to capture (Vsys_min + BOR reason; lut_id + temp + partial_count; retry_cnt + RSSI/SNR variance; CRC/SEQ + commit_flag) and then to root-cause buckets (power droop, waveform-temp mismatch, RF detune/hotspots, memory atomicity). Use the router to pick the first two captures before deeper investigation.
Q1. Why does an ESL tag reset only during e-paper refresh even when the battery “looks fine”?
Root-cause bucket: Power droop during refresh pulses (often cold/aged-cell worst case).
  • First 2 evidence captures: scope Vsys_min during refresh + read BOR reason (or reset cause flags).
  • What it usually means: under pulse load the coin-cell voltage sags below the UVLO/BOR threshold even if the open-circuit voltage is healthy.
  • First fix (safe): shorten/segment refresh, avoid RF + refresh overlap, add controlled bulk near driver rails, tune supervisor threshold/delay to avoid nuisance resets.
  • Log fields: Vsys_min, temp, BOR reason, t_refresh, retry_cnt.
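The pulse sag behind this answer can be approximated with a first-order model before going to the bench. The sketch below assumes an ideal open-circuit source behind a series internal resistance, a bulk capacitor at the load, and a constant-current pulse; capacitor ESR and regulator dropout are ignored, and the function name and all example values are assumptions, not cell datasheet figures.

```python
import math


def vsys_min_estimate(v_oc, r_int_ohm, i_pulse_a, t_pulse_s, c_bulk_f):
    """Estimate the rail voltage at the end of a constant-current pulse.

    Model: V(t) = v_oc - I * R * (1 - exp(-t / (R * C))), starting from a
    capacitor charged to v_oc. With no bulk capacitor the sag is simply I * R.
    """
    if c_bulk_f <= 0:
        return v_oc - i_pulse_a * r_int_ohm
    tau = r_int_ohm * c_bulk_f
    sag = i_pulse_a * r_int_ohm * (1.0 - math.exp(-t_pulse_s / tau))
    return v_oc - sag
```

With an assumed 3.0 V cell, 40 Ω internal resistance (a stand-in for a cold or aged cell), a 15 mA pulse lasting 20 ms, and 100 µF of bulk capacitance, the estimate is about 2.40 V. A scope capture of Vsys_min on the real assembly remains the evidence; the model only indicates whether margin is plausible.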
Q2. How can coin-cell impedance be proven as the cause rather than a firmware bug?
Root-cause bucket: Power droop vs. software timing (separated by rail evidence).
  • First 2 evidence captures: capture Vbat_min/Vsys_min under the same refresh + RF pattern, then compare against a lab supply with the same UVLO/BOR thresholds.
  • What proves impedance: failures track pulse current and temperature; with a stiff supply the symptom disappears while firmware stays unchanged.
  • First fix (safe): reduce peak load (staged refresh), enforce “no-RF-during-refresh,” and validate worst-case at cold + aged cell.
  • Log fields: Vsys_min, temp, BOR reason, t_success.
Q3. Same firmware, different shelf positions show very different RF retry rates. Why?
Root-cause bucket: RF detuning and placement hotspots (near metal, orientation, human traffic).
  • First 2 evidence captures: record retry_cnt + map RSSI/SNR variance by position (distance/angle/near-metal matrix).
  • What it usually means: a few “bad geometry” points collapse link margin and multiply transaction energy, even with identical code.
  • First fix (safe): enforce antenna keepout, add matching adjust pads, change tag orientation, or add a gateway where hotspots dominate.
  • Log fields: retry_cnt, RSSI, SNR, t_success, pos_id.
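Mapping retries and RSSI variance by position is mostly a grouping exercise over the logged fields. The sketch below is one possible post-processing step; the tuple layout, the function names, and the mean-retry limit of 2.0 are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean, pvariance


def placement_summary(records):
    """Group (pos_id, rssi_dbm, retry_cnt) tuples and summarize each position."""
    by_pos = defaultdict(list)
    for pos_id, rssi_dbm, retry_cnt in records:
        by_pos[pos_id].append((rssi_dbm, retry_cnt))
    summary = {}
    for pos_id, rows in by_pos.items():
        rssi = [r for r, _ in rows]
        retries = [n for _, n in rows]
        summary[pos_id] = {
            "rssi_mean": mean(rssi),
            "rssi_var": pvariance(rssi),
            "retry_mean": mean(retries),
            "retry_max": max(retries),
        }
    return summary


def hotspots(summary, retry_mean_limit=2.0):
    """Return position ids whose mean retry count exceeds the limit."""
    return sorted(p for p, s in summary.items() if s["retry_mean"] > retry_mean_limit)
```

Positions returned by `hotspots` are the candidates for orientation changes, matching adjustments, or an extra gateway.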
Q4. Partial updates look OK initially. Why does ghosting accumulate after a week?
Root-cause bucket: Waveform-temp mismatch and missing “full refresh trigger” boundary.
  • First 2 evidence captures: log partial_count + track temp/lut_id used for each update.
  • What it usually means: partial refresh does not fully erase; temperature drift or LUT mismatch leaves residual charge that accumulates.
  • First fix (safe): define full-refresh triggers (temp delta, partial_count threshold), and ensure Vsys_min stays above the BOR/UVLO margin during waveform steps.
  • Log fields: temp, lut_id, partial_count, Vsys_min.
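A full-refresh trigger of the kind described above can be as small as one predicate. The thresholds below (10 partial updates, 8 °C drift) are placeholder assumptions; the real boundary should come from the ghosting matrix.

```python
def needs_full_refresh(partial_count, temp_c, temp_at_last_full_c,
                       max_partials=10, max_temp_delta_c=8.0):
    """Return True when the next update should be a full refresh."""
    # Too many partial updates since the last full refresh.
    if partial_count >= max_partials:
        return True
    # Temperature has drifted far from the last full-refresh condition.
    return abs(temp_c - temp_at_last_full_c) >= max_temp_delta_c
```

Logging the inputs to this decision alongside `lut_id` makes it possible to check later whether ghosting appeared before or after the trigger fired.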
Q5. Why does cold temperature suddenly cause “blank updates” or very slow refresh?
Root-cause bucket: Temperature-dependent e-paper waveform + higher coin-cell impedance at cold.
  • First 2 evidence captures: capture Vsys_min during refresh at cold + confirm lut_id matches the temperature bin.
  • What it usually means: cold requires longer waveform timing; at the same time the cell sags more under pulse, so rails collapse or steps fail, producing blank/weak updates.
  • First fix (safe): enforce temperature-binned LUTs, slow down only when required, and validate margin at cold + aged cell with worst-case content.
  • Log fields: temp, lut_id, Vsys_min, t_refresh, BOR reason.
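Temperature-binned LUT selection reduces to a lookup over bin edges. In this sketch the bin edges and LUT ids are placeholders, not values from any panel datasheet; the point is that the selected `lut_id` is deterministic and can be logged with each refresh.

```python
import bisect

# Ascending bin boundaries in deg C (placeholder values).
BIN_EDGES_C = [0, 10, 25, 40]
# One LUT id per bin: below 0, 0 to <10, 10 to <25, 25 to <40, 40 and above.
LUT_IDS = [0, 1, 2, 3, 4]


def select_lut(temp_c):
    """Return the LUT id for the temperature bin that contains temp_c."""
    return LUT_IDS[bisect.bisect_right(BIN_EDGES_C, temp_c)]
```

A reading of -5 °C selects LUT 0 and 22 °C selects LUT 2 under these assumed bins; a production version would likely add hysteresis at the bin edges to avoid flipping LUTs on sensor noise.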
Q6. What two waveforms should be captured first for a “random blank screen” complaint?
Root-cause bucket: Refresh rail collapse vs. driver/waveform sequencing error (distinguished by rail timing).
  • First 2 evidence captures: (1) Vsys / driver rail during refresh with trigger on droop, (2) driver BUSY (or status pin) aligned to refresh start/end.
  • How to interpret: droop aligned with BUSY steps suggests power margin; BUSY abnormal with stable rails suggests LUT/temp or sequencing issues.
  • First fix (safe): prevent RF overlap, reduce peak load, and lock LUT selection to temperature bins before deeper firmware changes.
  • Log fields: Vsys_min, busy_time, lut_id, temp.
Q7. FRAM vs flash: which reduces corruption risk during brownout?
Root-cause bucket: Memory atomicity and “update as a transaction” design.
  • First 2 evidence captures: force brownout during metadata writes and check CRC/SEQ + commit_flag consistency across reboot.
  • Practical rule: FRAM is safer for frequent small metadata commits; flash is fine for large A/B image slots when paired with atomic commit and CRC gating.
  • First fix (safe): keep counters/flags in FRAM (or wear-safe region) and never switch active image without a final commit step.
  • Log fields: active_slot, pending_slot, CRC_ok, SEQ, BOR reason.
Q8. Why does adding a big capacitor sometimes make RF worse?
Root-cause bucket: Inrush and impedance interactions that disturb RF rails or timing windows.
  • First 2 evidence captures: scope Vsys during cap charge + log retry_cnt/t_success before vs after the change.
  • What it usually means: uncontrolled inrush pulls the weak coin-cell down; the radio sees rail dip, longer startup, or more retries, increasing transaction energy.
  • First fix (safe): use controlled ramp (load switch soft-start), place bulk near the correct rail, and separate refresh energy storage from RF-sensitive rails.
  • Log fields: Vsys_min, t_radio_start, retry_cnt, BOR reason.
Q9. How to choose Sub-GHz vs BLE if infrastructure cost is fixed but lifetime must be 5 years?
Root-cause bucket: Energy per successful update (retries dominate) under real placement.
  • First 2 evidence captures: compare E_success distribution (or t_success + current profile) and retry_cnt tails for both links in the same shelf matrix.
  • Decision rule: pick the link with the lowest “retry tail” in worst hotspots; average numbers are misleading for multi-year claims.
  • First fix (safe): reduce retries (antenna/placement/gateway), then re-evaluate; link choice should follow evidence, not theory.
  • Log fields: retry_cnt, RSSI/SNR, t_success, Vsys_min.
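The decision rule above compares tails, not averages. The sketch below uses a nearest-rank percentile on per-update energy samples; the function names are assumptions and the sample numbers in the usage note are made up purely to show how the mean and the tail can disagree.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile (pct in 0..100) of a non-empty sequence."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def pick_link(e_success_by_link, pct=95):
    """Return (link with the lowest tail energy, per-link tail values)."""
    tails = {name: percentile(vals, pct) for name, vals in e_success_by_link.items()}
    return min(tails, key=tails.get), tails
```

With hypothetical samples in mJ, `{"sub_ghz": [40, 42, 45, 50, 55, 60, 62, 65, 70, 90], "ble": [30, 31, 32, 33, 35, 36, 38, 40, 100, 180]}`, the BLE mean is lower (55.5 vs 57.9) but its p95 is much higher (180 vs 90), so `pick_link` selects `sub_ghz`. Both links must be measured in the same shelf matrix for the comparison to mean anything.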
Q10. What layout mistake most often ruins antenna performance on metal shelves?
Root-cause bucket: Antenna detune from metal proximity and broken keepout/ground strategy.
  • First 2 evidence captures: compare RSSI/SNR variance with/without shelf metal proximity + inspect matching network adjustment range (pad option effectiveness).
  • Typical mistakes: no keepout, ground pour under antenna, nearby high-current loops, and no tuning options after enclosure/label mounting.
  • First fix (safe): enforce keepout + controlled ground edge, add matching pads, and validate in the final mounted state (not free space).
  • Log fields: RSSI, SNR, retry_cnt, pos_id.
Q11. How can a safe update/rollback be designed so the tag never bricks in the field?
Root-cause bucket: Atomic commit and rollback-safe A/B slots with CRC/SEQ gating.
  • First 2 evidence captures: validate commit_flag transitions + verify CRC/SEQ checks block activation of corrupted images.
  • Safe pattern: write to a staging slot, verify CRC, then flip active_slot only as the last step; on reset, boot selects last-good.
  • First fix (safe): add a monotonic SEQ and “pending vs active” metadata; never overwrite last-good until new is proven.
  • Log fields: active_slot, pending_slot, CRC_ok, SEQ, BOR reason.
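The safe pattern above (stage, verify CRC, flip active_slot last) can be modeled in a few lines to reason about reset points. This is a toy in-memory model, not firmware: the `SlotStore` class and its methods are hypothetical, and replacing the `meta` dictionary in one assignment stands in for whatever single atomic metadata write the real storage provides.

```python
import zlib


class SlotStore:
    """Toy A/B image store: stage, verify CRC, then flip active_slot last."""

    def __init__(self, image=b""):
        self.slots = {"A": image, "B": b""}
        self.meta = {"active_slot": "A", "seq": 0, "crc": zlib.crc32(image)}

    def stage(self, image):
        """Write the new image to the inactive slot; last-good is untouched."""
        pending = "B" if self.meta["active_slot"] == "A" else "A"
        self.slots[pending] = image
        return pending

    def commit(self, pending, expected_crc, seq):
        """Activate the pending slot only if CRC matches and SEQ advances."""
        if zlib.crc32(self.slots[pending]) != expected_crc:
            return False
        if seq <= self.meta["seq"]:
            return False
        # One replacement of the metadata record models the atomic COMMIT.
        self.meta = {"active_slot": pending, "seq": seq, "crc": expected_crc}
        return True

    def boot_image(self):
        """On reset, boot whatever the committed metadata points to."""
        return self.slots[self.meta["active_slot"]]
```

A reset at any point before `commit` returns True leaves `boot_image()` pointing at the last-good slot, which is the property the forced-brownout test is meant to prove on real hardware.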
Q12. What’s the minimum validation matrix to claim “multi-year” battery life credibly?
Root-cause bucket: Validation evidence chain (tails at worst temperature + worst shelf positions).
  • First 2 evidence captures: current profile for sleep + E_success distribution (including retry tails) under a shelf matrix.
  • Minimum matrix: temp bins (cold/room/hot) × aged-cell worst case × update types (partial/full) × placement hotspots.
  • First fix (safe): lock full-refresh triggers, cap retries, and prove Vsys_min margin so tails do not destroy lifetime models.
  • Log fields: I_sleep, E_success, retry_cnt, t_success, Vsys_min, temp.
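The minimum matrix can be enumerated up front so that no cell is silently skipped. In this sketch the axis labels are assumptions; in particular, including a "fresh" cell as a baseline next to the aged worst case and using just two placements are illustrative choices.

```python
from itertools import product

TEMP_BINS = ["cold", "room", "hot"]
CELL_STATES = ["fresh", "aged"]          # "fresh" is an assumed baseline
UPDATE_TYPES = ["partial", "full"]
PLACEMENTS = ["open_shelf", "near_metal_hotspot"]


def validation_matrix():
    """Return every test case as a dict, one per axis combination."""
    return [
        {"temp_bin": t, "cell": c, "update": u, "placement": p}
        for t, c, u, p in product(TEMP_BINS, CELL_STATES, UPDATE_TYPES, PLACEMENTS)
    ]
```

With these axes the matrix has 24 cases; each case should produce the log fields listed above so the tails can be compared across cells of the matrix.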