
Balise / Transponder: RF Demodulation, Diagnostics, Event Timing


A balise/transponder link is "reliable" only when it can prove every pass with a consistent RF → decode → timestamp → commit evidence chain. Unreadable balises, wrong reads, and post-ESD failures are then debugged by checking a few key fields (RSSI/AGC, CRC, timebase, commit status) and applying the first-line fix pattern before changing the architecture.

H2-1. Scope & Interfaces

This page locks the scope to the balise/transponder over-the-air read chain: the trackside balise and the onboard BTM (Balise Transmission Module) path that performs coupling, RF front-end processing, demod/decoding, and diagnostic evidence recording. It intentionally avoids broader ETCS/CBTC onboard architecture and any unrelated trackside subsystems.


1) System roles (responsibility boundary)

  • Trackside balise (passive / semi-active) is responsible for producing a valid telegram under allowed coupling conditions and environmental tolerances. The key requirement is not “responds once,” but responds repeatably across installation variation, temperature drift, and nearby metal effects.
  • Onboard BTM is responsible for the full read chain: coupling → RF conditioning → demod → sync → decode → integrity check → output, plus diagnostic evidence generation when reads fail (failure without evidence is not diagnosable).

2) Interface map (interfaces are evidence entry points)

Treat each interface as a place where the system must expose at least one measurable field for acceptance and field debugging:

  • Antenna / coupler interface → coupling strength indicators (RSSI proxy), AGC level, saturation/weak-signal counters.
  • RF front-end I/O and tap points → limiter flags, gain state, amplitude metrics (enough to classify “too strong vs too weak”).
  • Demod/decoder output → frame detect state, sync-failure reason code, CRC results, retry counters.
  • MCU interface (SPI/I²C/parallel) → structured fault codes, read attempt history, reset reason, temperature/voltage snapshots.
  • Timestamp source → event timepoints (frame start / decode OK / commit done) and drift indicators.
  • Storage & service/maintenance port → evidence packet commit status and sequence continuity (detect missing events).

3) Operating conditions (variation → mechanism → what to observe)

  • High-speed pass reduces the effective read window → raises sensitivity to sync timing margin and retry policy.
  • Gap / height / attitude changes create rapid coupling swings → AGC/limiter behavior and decision thresholds dominate.
  • Trackside EMC + nearby metal introduces common-mode pickup and reflection effects → front-end dynamic range and filtering must be provable via logged fields, not guessed.
Figure F1. System context map for the balise/transponder read chain. The upper-layer ETCS/CBTC block is shown only as an interface (out of scope).

ALT (for this figure): Balise transponder system context showing train underside antenna coupling to trackside balise, onboard BTM RF demod decode logging chain, and evidence outputs.

H2-2. User Intent: What “Good” Looks Like in the Field

In rail signaling, “good” is not a subjective impression. It must be measurable, repeatable, and diagnosable. A balise read chain is acceptable only when it can prove: (1) reads are reliable across boundary conditions, (2) decoded content is integrity-checked and consistent, (3) event timing is trustworthy, and (4) failures still produce a usable evidence packet.

Acceptance targets (metrics → evidence → decision)

A) Read reliability

  • Metric: success probability per pass, retry-count distribution, failure-type mix (sync vs CRC vs power).
  • Evidence fields: read_attempts, read_success, retry_hist, fail_reason_code
  • Decision: prove margin at boundary conditions (max speed / worst gap / temperature corners / EMC stress).

B) Timing correctness

  • Metric: timestamp jitter and alignment error versus speed/odometer reference (if available).
  • Evidence fields: t_frame_start, t_decode_ok, t_commit_done, clock_drift_est
  • Decision: confirm timepoints are taken at defined stages (not “some time later”).

C) Decode integrity

  • Metric: CRC/format check rate and multi-read consistency (same balise → same telegram hash).
  • Evidence fields: crc_fail_count, telegram_hash, consistency_score
  • Decision: avoid false confidence; a CRC pass alone is insufficient without consistency checks under noise.

D) Diagnostics completeness

  • Metric: evidence packet completeness rate, especially for failed reads and reset/brownout events.
  • Evidence fields: log_commit_status, reset_reason, last_event_seqno
  • Decision: failures must be classifiable without oscilloscopes on track.

A practical rule: if the system cannot tell whether a failure is weak coupling, front-end saturation, sync failure, or power/reset, the architecture is not field-ready.

Figure F2. A practical funnel that turns field symptoms into mechanisms, required evidence fields, and a first action.

ALT (for this figure): Balise read failure-to-evidence funnel mapping symptoms to mechanisms, logged evidence fields, and first corrective actions.

H2-3. Over-the-Air Coupling & Antenna Path

Balise read failures are most often margin problems, not protocol mysteries. The over-the-air path must be treated as a variable channel: coupling changes with height, tilt, nearby metal, and train speed. A field-ready design proves reliability by converting these variations into an explicit SNR margin budget and a time-window budget.


1) Practical coupling model (engineering, not academic)

  • Coupling is not constant. Effective coupling varies with installation height/gap, antenna attitude, and metal proximity (rails, fasteners, underbody structures).
  • Coupling variation becomes amplitude variation at the RF input. The read chain must classify failures as “too weak,” “too strong/saturated,” or “timing/sync limited,” based on observable fields.
  • Reflections and near-field distortion can change the apparent signal shape even when average level is similar, which is why a robust design tracks both level indicators (RSSI/AGC) and decode-stage outcomes (sync/CRC).

2) Matching network impact (Q, bandwidth, tolerance)

  • Higher Q can increase peak gain, but narrows bandwidth and increases sensitivity to component tolerance, temperature drift, and frequency offset. A “lab-perfect” tune can reduce field robustness.
  • Tolerance budgeting is mandatory. Component tolerance and temperature drift shift the resonance point, reducing effective coupling and changing the noise bandwidth seen by the detector.
  • Design goal: choose a Q/bandwidth that preserves SNR margin across worst-case installation and environment, rather than maximizing peak response at a single condition.

3) Speed-driven time window (read budget)

Higher speed primarily reduces the effective acquisition window. A shorter window limits synchronization time, reduces the number of retry opportunities, and tightens allowable processing latency. The design must explicitly budget: detect → sync → decode → verify → commit evidence within the available window.
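This window budget can be made concrete with integer arithmetic suitable for a small MCU. The zone length, speed, and budget-check logic below are an illustrative sketch, not values or rules from any standard:

```c
#include <assert.h>

/* Sketch: available read window (µs) for a given coupling-zone length and
 * train speed. All figures are illustrative placeholders. */
static long read_window_us(long zone_mm, long speed_mm_per_s)
{
    if (speed_mm_per_s <= 0) return -1;            /* invalid input */
    return (zone_mm * 1000000L) / speed_mm_per_s;  /* time spent in the zone */
}

/* Budget check: detect -> sync -> decode -> verify -> commit must fit,
 * with room for at least min_attempts read attempts before leaving the zone. */
static int window_budget_ok(long window_us, long pipeline_us, int min_attempts)
{
    return window_us >= pipeline_us * min_attempts;
}
```

For example, a hypothetical 1 m coupling zone at roughly 300 km/h (about 83 333 mm/s) leaves on the order of 12 ms, which bounds how many retries a 3 ms pipeline can afford.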

4) What to prove (margin, not anecdotes)

SNR margin budget

  • Coupling loss variation + mismatch loss + cable loss + interference pickup
  • Front-end noise figure and effective bandwidth
  • Demod threshold and required margin at boundary conditions

Time-window budget

  • Window length at max speed (worst geometry)
  • Minimum time for sync and frame detection
  • Retry policy that converges before leaving the coupling zone
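As a sketch, the SNR side of the budget can be kept in tenths of a dB so the check runs in integer math on the BTM itself; every figure and field width below is a placeholder for a real link budget:

```c
#include <assert.h>

/* SNR margin budget in tenths of a dB (dB*10), integer-only for MCU use.
 * Values are illustrative placeholders, not a normative budget. */
struct snr_budget {
    int rx_level_ddb;      /* received level at worst-case coupling */
    int mismatch_loss_ddb; /* matching-network mismatch loss */
    int cable_loss_ddb;    /* antenna-to-front-end cable loss */
    int noise_floor_ddb;   /* kTB + front-end noise figure, in-band */
    int demod_thresh_ddb;  /* minimum SNR the demod needs */
};

static int snr_margin_ddb(const struct snr_budget *b)
{
    int snr = (b->rx_level_ddb - b->mismatch_loss_ddb - b->cable_loss_ddb)
              - b->noise_floor_ddb;
    return snr - b->demod_thresh_ddb;  /* > 0 means margin remains */
}
```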
Figure F3. A budget-style view that turns installation and environment variation into explicit SNR margin and time-window constraints.

ALT (for this figure): Coupling and SNR budget diagram showing loss segments, demod threshold, remaining margin, and speed window impact for balise reads.

H2-4. RF Front-End Architecture

The RF front-end must be designed as an observable system. A non-observable front-end forces field teams to guess. A robust architecture separates “weak coupling,” “front-end saturation,” “sync failure,” and “power/reset” using measurable tap points and structured status codes.


1) Front-end blocks (what each block protects or proves)

  • LNA / gain stage preserves weak-signal sensitivity. Failure signature: low level with high AGC demand, repeated sync-fail without limiter activity.
  • Limiter prevents overdrive and clamps transient peaks. Failure signature: limiter active frequently, distorted amplitude leading to “CRC bursts” or unstable sync.
  • AGC stabilizes amplitude under coupling swings. Failure signature: AGC pinned high (too weak) or pinned low (too strong).
  • Filter trades interference rejection versus sensitivity. Too narrow harms frequency tolerance; too wide increases noise bandwidth.
  • Detector / demod interface must expose “why decoding failed” (sync code, CRC) rather than only “fail.”

2) Dynamic range requirement (near-strong vs far-weak)

The required dynamic range must cover the full combination of coupling variation, tolerance drift, temperature corners, and interference pickup. The architecture should prove that strong coupling does not saturate the chain, while weak coupling still exceeds the minimum detectable level with adequate SNR margin.

3) Built-in observability (minimum tap-point set)

  • Level indicators: RSSI proxy, AGC code, detector amplitude
  • Clipping indicators: limiter flag, saturation counter
  • Decode indicators: sync reason code, CRC fail count, retry histogram
  • Context snapshot: temperature, supply voltage, reset reason, commit status
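A minimal sketch of this tap-point set as a snapshot struct, with a coarse triage function applying the "too weak vs too strong vs decode-limited" classification; the threshold values are illustrative assumptions:

```c
#include <assert.h>
#include <stdint.h>

/* Minimum tap-point snapshot. Field names follow the evidence fields used
 * on this page; bin widths and thresholds are illustrative. */
struct fe_snapshot {
    uint8_t  rssi_bin;        /* quantized level indicator */
    uint8_t  agc_code;        /* 0 = min gain .. 255 = max gain */
    uint8_t  limiter_flag;    /* clipping seen during the window */
    uint16_t crc_fail_count;
    uint16_t sync_fail_count;
};

enum fe_class { FE_OK, FE_TOO_WEAK, FE_TOO_STRONG, FE_DECODE_LIMITED };

/* Coarse triage: limiter activity = saturation; AGC pinned high with a low
 * level = weak coupling; otherwise a decode-stage problem. */
static enum fe_class fe_classify(const struct fe_snapshot *s)
{
    if (s->limiter_flag)                         return FE_TOO_STRONG;
    if (s->agc_code > 240 && s->rssi_bin < 8)    return FE_TOO_WEAK;
    if (s->crc_fail_count || s->sync_fail_count) return FE_DECODE_LIMITED;
    return FE_OK;
}
```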
Figure F4. The front-end is structured to expose level, clipping, gain state, and decode outcomes, enabling fast root-cause classification.

ALT (for this figure): RF front-end block diagram for balise reader showing antenna, LNA, limiter, AGC, filter, demod, and diagnostic tap points for RSSI, AGC, limiter, and sync/CRC.

H2-5. Demodulation, Decoding & Telegram Integrity

A “wrong read” is rarely a single bug. It is the result of a chain where waveform quality, synchronization margin, bit decisions, and integrity policy interact. A field-ready design must convert every decode failure into a pipeline-stage outcome with a clear evidence field: “failed at frame detect,” “failed at sync,” “CRC burst,” or “CRC pass but inconsistent across reads.”


1) Demod approach (kept at engineering abstraction)

  • Envelope / amplitude path (ASK-like): sensitive to clipping, noise floor rise, and threshold jitter. Key evidence: amplitude stats, limiter activity, decision threshold state.
  • Phase / zero-crossing path (PSK-like): sensitive to phase noise and sampling-phase drift. Key evidence: sync quality score and phase/clock error bins (compressed codes are acceptable).
  • Correlation-based detect: sensitive to window length and multipath distortion. Key evidence: correlation peak ratio and peak position stability across attempts.

2) Frame detection and synchronization (deterministic failure classification)

  • Frame detect must distinguish miss-detect vs false-detect. If the design cannot log false-detect rate, field tuning becomes guesswork.
    Evidence: frame_detect_count, false_detect_count, preamble_quality
  • Sync lock must record where it failed (timing window, jitter, threshold). “Sync fail” without a reason code is not actionable.
    Evidence: sync_reason_code, sync_lock_time, sync_quality_score
  • Bit clock recovery failures often appear as “gets worse over the frame.” Capture slips/drift rather than only CRC.
    Evidence: bit_slip_count, phase_error_bin (or clock_drift_code)
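The deterministic classification above can be sketched as a reason-code function; the enum names and thresholds are illustrative assumptions, not codes from any balise specification:

```c
#include <assert.h>
#include <stdint.h>

/* Sketch: deterministic sync-failure classification. The point is that
 * "sync fail" alone is never logged without a reason code. */
enum sync_reason {
    SYNC_OK = 0,
    SYNC_NO_PREAMBLE,     /* frame never detected (miss-detect path)   */
    SYNC_FALSE_DETECT,    /* preamble matched but lock never confirmed */
    SYNC_WINDOW_TIMEOUT,  /* lock attempt exceeded the timing window   */
    SYNC_JITTER_LIMIT     /* lock achieved but clock jitter too high   */
};

static enum sync_reason classify_sync(int preamble_quality, int locked,
                                      uint32_t lock_time_us,
                                      uint32_t window_us, int jitter_bin)
{
    if (preamble_quality <= 0)    return SYNC_NO_PREAMBLE;
    if (!locked)                  return SYNC_FALSE_DETECT;
    if (lock_time_us > window_us) return SYNC_WINDOW_TIMEOUT;
    if (jitter_bin > 3)           return SYNC_JITTER_LIMIT;  /* placeholder bin limit */
    return SYNC_OK;
}
```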

3) Bit decisions (soft vs hard) and parameter traceability

Soft decisions can improve robustness near the SNR edge but cost compute and power. Hard decisions are simpler but require well-managed thresholds. In either case, a field-debuggable system must record the decision mode and a configuration version (or threshold ID) so a failure can be reproduced.

Hard decision (threshold-driven)

  • Risk: threshold jitter under noise/clipping
  • Log: decision_mode, threshold_id (or config_version)

Soft decision (confidence-driven)

  • Benefit: improved error tolerance near margin
  • Log: decision_mode, confidence_bin (compressed), compute budget

4) Telegram integrity: CRC pass is necessary, not sufficient

CRC proves internal consistency for one read attempt, but it does not prove correctness under noise and interference. A robust implementation adds a multi-read consistency gate: repeated reads of the same balise within the same pass must converge to a consistent telegram hash before output is accepted.

  • Consistency rule: accept output only when K-of-N attempts agree (majority or thresholded policy).
  • Non-convergence: output “inconsistent” with evidence fields preserved (do not silently pick a random pass).
  • Evidence: telegram_hash per attempt, read_group_id, consistency_score, crc_pass/fail stats
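The K-of-N gate can be sketched as follows, assuming per-attempt telegram hashes are already available; the function name and signature are hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* K-of-N consistency gate: accept a telegram only when at least k of the
 * n attempt hashes in one pass window agree. Returns 1 on convergence and
 * writes the winning hash; returns 0 so the caller can output
 * "inconsistent" with evidence preserved. */
static int kofn_accept(const uint32_t *hash, size_t n, size_t k,
                       uint32_t *agreed_hash)
{
    for (size_t i = 0; i < n; i++) {
        size_t votes = 0;
        for (size_t j = 0; j < n; j++)
            if (hash[j] == hash[i]) votes++;
        if (votes >= k) {
            if (agreed_hash) *agreed_hash = hash[i];
            return 1;
        }
    }
    return 0;  /* non-convergence: never silently pick one attempt */
}
```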
Figure F5. The decode chain is diagnosable only when each stage emits a clear outcome and evidence fields, not a single “read failed” bit.

ALT (for this figure): Decode pipeline timeline for balise reads showing sampling, frame detect, sync, demod, decode, CRC and multi-read consistency with symptoms and evidence fields.

H2-6. Low-Power MCU & Wake-Up Strategy

Low-power MCU design in a balise read chain is not only about reducing standby current. It must preserve read availability, prevent brownout mid-read, and guarantee that failures still produce a minimal evidence record. The wake-up strategy, power sequencing, and firmware state machine must be engineered as a single reliability system.


1) Wake-up paths (stability vs power)

  • RF-field wake: lowest standby power, but sensitive to threshold drift and false wake from interference.
    Log: wake_source, rf_wake_level, false_wake_count
  • External interrupt wake: more deterministic timing integration, but vulnerable to EMI on harness lines.
    Log: int_source_id, debounce_status, emi_event_counter
  • Timer wake: predictable for self-test and health checks, but increases energy cost if poorly scheduled.
    Log: wake_timer_id, schedule_version

2) Power sequencing and brownout prevention (avoid “dies mid-decode”)

Wake-up creates a steep current transient: clock start, RF front-end enable, demod compute, and non-volatile commit. The most common field symptom is partial reads followed by reset. Prevention requires:

  • Power-good gating: do not enter Acquire until VDD is stable above a defined threshold for a minimum time.
  • Staged enable: avoid overlapping peak loads (RF enable vs storage writes) unless hold-up is guaranteed.
  • Minimal evidence first: on any failure path, commit a compact evidence packet before optional processing.

3) Firmware state machine (deterministic transitions + timeouts)

A rail-grade implementation requires explicit Commit and Recovery states. Every state must define entry conditions, exit criteria, timeouts, and the evidence fields it updates.

  • Listen/Idle: wait for wake source, collect baseline context (temp, VDD, counters).
  • Acquire: enable RF chain, confirm power-good, open the decode window.
  • Decode/Verify: perform demod/decoding and CRC + multi-read consistency gating.
  • Commit: write result and evidence with sequence number; confirm completion.
  • Recovery: on any failure, store minimal evidence and return to safe state (sleep or controlled retry).
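The state machine can be sketched as a transition function whose Recovery path always records minimal evidence before returning to sleep. The states mirror the list above; the power-good threshold and the inputs to the step function are illustrative assumptions:

```c
#include <assert.h>
#include <stdint.h>

enum bt_state { ST_SLEEP, ST_IDLE, ST_ACQUIRE, ST_DECODE,
                ST_VERIFY, ST_COMMIT, ST_RECOVERY };

struct bt_ctx {
    enum bt_state st;
    uint16_t vdd_mv;                 /* last supply measurement */
    uint16_t seqno;                  /* incremented on every commit */
    int      min_evidence_committed; /* set by the Recovery path */
};

#define V_PG_MV 3000  /* placeholder power-good threshold */

/* One transition; a real BTM would drive this from ISRs and timers. */
static void bt_step(struct bt_ctx *c, int wake, int decode_ok, int verify_ok)
{
    switch (c->st) {
    case ST_SLEEP:   if (wake) c->st = ST_IDLE;                        break;
    case ST_IDLE:    /* enter Acquire only once VDD is provably stable */
                     c->st = (c->vdd_mv >= V_PG_MV) ? ST_ACQUIRE
                                                    : ST_RECOVERY;     break;
    case ST_ACQUIRE: c->st = ST_DECODE;                                break;
    case ST_DECODE:  c->st = decode_ok ? ST_VERIFY : ST_RECOVERY;      break;
    case ST_VERIFY:  c->st = verify_ok ? ST_COMMIT : ST_RECOVERY;      break;
    case ST_COMMIT:  c->seqno++; c->st = ST_SLEEP;                     break;
    case ST_RECOVERY:/* minimal evidence first, then back to safe state */
                     c->min_evidence_committed = 1;
                     c->seqno++;
                     c->st = ST_SLEEP;                                 break;
    }
}

/* Drive one full pass; returns 1 if it ended in Commit, 0 if in Recovery. */
static int bt_run_pass(struct bt_ctx *c, int decode_ok, int verify_ok)
{
    bt_step(c, 1, decode_ok, verify_ok);   /* wake: SLEEP -> IDLE */
    while (c->st != ST_SLEEP)
        bt_step(c, 0, decode_ok, verify_ok);
    return !c->min_evidence_committed;
}
```

Note that every exit path, success or failure, increments the sequence number, which is what makes missing events detectable downstream.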
Figure F6. A low-power design remains field-reliable only when wake-up, power-good gating, timeouts, and evidence commit are part of a single state machine.

ALT (for this figure): Low-power MCU power-state machine for balise reader showing sleep, idle, acquire, decode, verify, commit, and recovery with power-good thresholds and timeouts.

H2-7. Event Timestamping & Correlation

“Event timestamp” becomes useful only when it forms a provable time chain: the time source has a known trust level, each critical decode milestone is time-stamped, and the record can be correlated to speed/odometer inputs using explicit fields. This chapter defines a minimal, auditable timestamp set that preserves alignment without expanding into full vehicle positioning.


1) Time sources (trust levels + health evidence)

  • Local RTC provides a local axis but can drift or lose validity after power events.
    Log: timebase_mode, rtc_valid, rtc_epoch, rtc_health_code
  • TCXO / stable local timebase improves short-window stability. Temperature is still relevant.
    Log: timebase_mode, temp_c, timebase_health_code
  • External sync input (if present) enables cross-module alignment, but must detect loss/jitter.
    Log: sync_present, sync_lost_count, sync_offset_est, sync_jitter_bin

2) Timestamp strategy (milestones that form a closed evidence loop)

A reliable chain uses milestone stamps that map to both performance and integrity. The following minimal set supports root-cause classification and correlation:

  • t_frame_start: anchors the entry into the coupling window.
  • t_decode_done: proves processing time and window sufficiency.
  • t_verify_ok: marks CRC + consistency acceptance time.
  • t_commit_done: proves the evidence packet actually landed (not “lost mid-write”).
  • t_fail_stage: records failure stage time when verification does not converge.

3) Correlation (speed/odometer alignment by fields, not inference)

Correlation is implemented through explicit input snapshots and alignment quality codes. The intent is to prove that an event belongs to a specific pass window, not to compute vehicle position.

  • Inputs: speed_in, odometer_in, speed_sample_time
  • Outputs: event_speed, event_odometer, align_quality (OK / STALE / MISSING)
  • Grouping: read_group_id ties multiple read attempts within one pass window.
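One possible sketch of the alignment-quality decision, assuming a single staleness limit; the 50 ms figure and the choice to treat future-dated samples as MISSING are assumptions, not requirements from this page:

```c
#include <assert.h>
#include <stdint.h>

enum align_q { ALIGN_OK, ALIGN_STALE, ALIGN_MISSING };

#define ALIGN_MAX_AGE_US 50000  /* placeholder staleness limit (50 ms) */

/* Classify the speed/odometer snapshot relative to the event time.
 * A sample newer than the event is treated as MISSING here, since it
 * cannot belong to the pass window being stamped. */
static enum align_q align_quality(int have_sample,
                                  uint64_t event_time_us,
                                  uint64_t speed_sample_time_us)
{
    if (!have_sample || speed_sample_time_us > event_time_us)
        return ALIGN_MISSING;
    return (event_time_us - speed_sample_time_us <= ALIGN_MAX_AGE_US)
           ? ALIGN_OK : ALIGN_STALE;
}
```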
Figure F7. A provable chain requires timebase health + milestone stamps + explicit correlation fields, while keeping upper-layer positioning out of scope.

ALT (for this figure): Timestamp correlation map linking BTM timebase and milestone event timestamps with speed and odometer inputs and greyed upper-layer logs for alignment.

H2-8. Diagnostics & Evidence Packet (What to Log)

Diagnostics become a differentiator only when they form a copyable evidence contract. The evidence packet must answer the key field questions even when a read fails: whether the failure is power-related, RF-margin-related, synchronization-related, or commit-related. This chapter defines a fixed schema with a minimal subset that must survive brownout and a full subset used for deep root-cause analysis.


1) Minimal evidence (must survive failures)

Minimal evidence is written on every failure path before optional processing. It ensures that “read failed” can still be correlated to a specific pass window and root-cause direction.

  • Identity: event_id, seqno, read_group_id
  • Failure classification: fail_stage, fail_reason_code
  • Power proof: reset_reason, vdd_min_mv, brownout_count
  • Commit proof: commit_status, t_commit_done
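As a sketch, the minimal record can be a small fixed-layout struct with `commit_status` written last and a trailing CRC so a torn write is detectable on read-back; the field widths and the CRC-8 polynomial are illustrative choices:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Minimal evidence record: small enough to commit on every failure path
 * before any optional processing. Field set mirrors the list above. */
struct min_evidence {
    uint32_t event_id;
    uint16_t seqno;
    uint16_t read_group_id;
    uint8_t  fail_stage;        /* coded pipeline stage */
    uint8_t  fail_reason_code;
    uint8_t  reset_reason;
    uint8_t  brownout_count;
    uint16_t vdd_min_mv;
    uint8_t  commit_status;     /* written last: proves the record landed */
    uint8_t  crc8;              /* integrity over the preceding bytes */
};

/* Tiny CRC-8 (poly 0x07) so a torn or corrupted record fails validation. */
static uint8_t crc8(const uint8_t *p, size_t n)
{
    uint8_t c = 0;
    while (n--) {
        c ^= *p++;
        for (int i = 0; i < 8; i++)
            c = (uint8_t)((c & 0x80) ? (uint8_t)(c << 1) ^ 0x07
                                     : (uint8_t)(c << 1));
    }
    return c;
}
```

Keeping the record at 16 bytes means it fits a single small NVM write on many parts, which is what makes the "minimal evidence first" rule practical under brownout.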

2) Full evidence (answers the “why” with measurable fields)

Environment & state

  • temp_c, vdd_mv, power_good_state
  • reset_reason_code, boot_count

RF status

  • rssi_bin, agc_code
  • limiter_flag, limiter_count
  • sat_count, under_amp_count

Decode status

  • preamble_quality, sync_reason_code, sync_quality
  • crc_fail_count, crc_reason_code, retry_count
  • telegram_hash, consistency_score, k_of_n_result

Time & window

  • t_frame_start, t_decode_done, t_verify_ok
  • sample_window_us, timebase_mode, sync_present
  • event_speed, event_odometer, align_quality

3) Result policy (hash/summary instead of full payload when needed)

The result section can store a telegram summary rather than full payload depending on policy. The evidence contract remains valid when it includes a deterministic summary identifier and a schema version.

  • Result: telegram_hash, payload_len, store_policy_id
  • Versioning: schema_version, fw_version, config_version
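Any deterministic hash satisfies the summary-identifier requirement; FNV-1a is shown here only as one simple non-cryptographic candidate, not as an algorithm mandated by this page:

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Deterministic telegram summary identifier (32-bit FNV-1a).
 * Same payload -> same hash, which is all the evidence contract needs
 * when paired with schema_version and config_version. */
static uint32_t telegram_hash(const uint8_t *payload, size_t len)
{
    uint32_t h = 2166136261u;           /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= payload[i];
        h *= 16777619u;                 /* FNV prime */
    }
    return h;
}
```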
Figure F8. A fixed evidence packet schema turns failures into answerable questions, even under brownout or incomplete reads.

ALT (for this figure): Evidence packet schema diagram for balise diagnostics showing grouped fields for environment, RF, decode, time, and result with mappings to key troubleshooting questions.

H2-9. EMI/ESD/Environmental Hardening for Trackside Reality

Trackside conditions can disturb the balise read chain through two dominant failure routes: (1) front-end saturation / threshold collapse that drives CRC bursts and unstable hashes, and (2) power/reference perturbation that triggers brownout/reset and incomplete commits. Hardening is effective only when each mitigation maps to a specific coupling path and can be verified through observable fields.


1) ESD/EFT → front-end saturation → decode instability

Fast transients near the antenna/coupler or shield termination can inject common-mode current into the RF input path. The result is often not a total read loss but a destabilized decision process: limiter activity rises, AGC rails, and the demod threshold becomes noisy. The field signature is a combination of RSSI/AGC anomalies and CRC bursts.

  • Victim nodes: limiter/LNA input, AGC loop, ADC/demod front-end
  • Symptoms: RSSI jump, AGC at rails, limiter toggles, false detect increase, CRC bursts, inconsistent telegram_hash
  • Observable fields: rssi_bin, agc_code, limiter_flag/limiter_count, sat_count, crc_fail_count, sync_reason_code, false_detect_count

2) ESD/EFT → reference/power disturbance → reset & incomplete evidence

A second dominant route is reference disturbance: ground bounce and supply dip can force brownout reset during acquire/decode, or abort non-volatile writes. A robust design must ensure that failure paths still commit a minimal evidence packet and record commit completion explicitly.

  • Victim nodes: MCU BOR/WDT, clock start-up, NVM commit window
  • Symptoms: mid-read reset, missing t_commit_done, commit_status failures, boot-count jumps
  • Observable fields: reset_reason, vdd_min_mv, brownout_count, commit_status, t_commit_done

3) Shielding, grounding, and common-mode return (antenna-to-front-end specific)

Shielding is effective only when the common-mode return path is controlled. For the antenna-to-front-end segment, the goal is to prevent high-frequency return currents from flowing through sensitive reference networks. The most common failure pattern is a long/uncertain return path that converts transient current into threshold jitter or reset events.

  • Design focus: minimize RF input loop area and provide a low-impedance return to chassis/reference
  • Verification hint: correlate limiter/AGC anomalies with reset events and harness/shield configurations

4) Temperature and vibration: drift → margin loss → retries

Temperature and vibration can shift matching network parameters and degrade connector integrity. The typical signature is not a single failure, but a progressive loss of margin: sync becomes slower, retries rise, and K-of-N consistency converges less reliably under the same pass window.

  • Observable fields: temp_c, retry_count, consistency_score, k_of_n_result
  • Actionable linkage: tie drift evidence to the timestamp correlation fields (H2-7) for repeatability
Figure F9. Hardening is trackable only when each interference route is mapped to a coupling path, a victim node, and observable evidence fields.

ALT (for this figure): Interference path map for balise systems showing sources, coupling paths, victim nodes, mitigation actions, and observable evidence fields like RSSI, AGC, reset and commit status.

H2-10. Verification & Test Playbook (Bench + Field)

Verification becomes repeatable when each test stimulus is tied to expected evidence fields and pass/fail criteria. This playbook is organized as a closed loop: bench margin characterization, transient injection with evidence capture, pass-window stress (speed/time budget), and field regression driven by the evidence packet schema.


1) Bench margin characterization (coupling + SNR + saturation)

  • Coupling control: use a repeatable antenna-to-balise fixture to vary coupling loss in defined steps.
  • SNR sweep: reduce margin and observe convergence (retry and K-of-N consistency) rather than only CRC.
  • Strong-field sweep: increase field to locate limiter/AGC rail points and verify the system fails safely (classified stage + evidence retained).
  • Evidence focus: rssi_bin, agc_code, limiter_count, crc_fail_count, retry_count, consistency_score

2) Injection tests (ESD/EFT/transients) with commit-proof logging

  • Stimulus: apply ESD/EFT in a controlled set of points around antenna, shield termination, and I/O entry.
  • Expected evidence: limiter/AGC anomalies (route A) or reset/commit events (route B) must be captured, not inferred.
  • Pass condition: any fail must still produce minimal evidence: stage + reason + reset/VDDmin + commit_status.
  • Evidence focus: reset_reason, vdd_min_mv, commit_status, t_commit_done

3) Speed / pass-window stress (time budget + re-read policy)

  • Window scaling: shorten effective read window and confirm sync/verify still converges or fails with a clear stage code.
  • Policy validation: verify K-of-N convergence under reduced window without silently outputting unstable telegrams.
  • Evidence focus: t_frame_start, t_verify_ok, sample_window_us, retry_count, k_of_n_result
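A minimal sketch of a stage-coded window verdict built from these timing fields (the retry budget and stage-code names are assumptions for illustration):

```python
def window_verdict(t_frame_start_us, t_verify_ok_us, sample_window_us,
                   retry_count, retry_budget=3):
    """Stage-coded verdict for one pass window: never a silent output."""
    if t_verify_ok_us is None:
        return "FAIL_NO_VERIFY"          # never converged inside the pass
    if t_verify_ok_us - t_frame_start_us > sample_window_us:
        return "FAIL_WINDOW_EXCEEDED"    # converged, but too late to be usable
    if retry_count > retry_budget:
        return "PASS_MARGINAL"           # won by retries: flag for regression
    return "PASS"
```

The point is that every outcome, including marginal passes, carries a stage code, so window-stress results remain comparable across speed bands.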

4) Field regression loop (reproduce → evidence → root cause → fix → rerun)

  • Reproduce: define repeatable conditions (temperature band, speed band, installation state, location segment).
  • Capture: collect evidence packets with schema_version and configuration IDs.
  • Classify: map to RF margin / sync / power / commit routes using the evidence fields.
  • Fix & rerun: apply mitigation or policy change and rerun the same test-to-evidence matrix.
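The evidence-packet shape this loop depends on can be sketched as a versioned record (field names follow the page's schema; the concrete layout is an assumption, not a defined wire format):

```python
from dataclasses import dataclass, field, asdict

@dataclass(frozen=True)
class EvidencePacket:
    schema_version: str          # bump whenever any field changes meaning
    config_id: str               # firmware build + threshold-table identity
    read_group_id: int           # groups all reads of one pass window
    event_id: int                # unique per event
    stage: str                   # e.g. "SYNC", "CRC", "COMMIT"
    reason_code: int             # stage-specific failure reason (0 = OK)
    fields: dict = field(default_factory=dict)  # rssi_bin, agc_code, ...
```

Carrying schema_version and config_id in every packet is what makes "rerun the same test-to-evidence matrix" meaningful: two runs are only comparable when both IDs match.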
[Figure F10] Test-to-Evidence Matrix — tests as rows (coupling/SNR sweep, strong-field sweep, ESD injection, EFT/transient burst, window stress (speed), thermal/vibration soak) against evidence-field groups as columns (ENV, RF, DECODE, TIME, COMMIT, RESULT), plus a PASS column stating the primary acceptance intent (evidence present, safe fail, stage coded, commit proof, K/N convergence, stable trend). ✔ marks required evidence; ● marks recommended evidence.
Figure F10. A verification plan is complete only when each test maps to required evidence groups and a clear acceptance intent (evidence present, stage coded, commit proven).

ALT (for this figure): Verification matrix for balise systems mapping bench and field tests to evidence groups (environment, RF, decode, time, commit, result) with required check marks and pass criteria.

H2-11. Design Pitfalls & Fix Patterns

This chapter compresses common field failures into actionable fix patterns. Each pattern follows the same workflow: Symptom → 2-field checks → First fix → Confirm. Concrete MPN examples are provided as starting points for design reviews and lab trials.

2-field checks · Stage-coded failures · Commit-proof logging · RF margin vs saturation · Temp/vibration drift · K-of-N consistency

Pattern 1 — RSSI is high but CRC keeps failing

  • Symptom: strong field indicated, yet crc_fail_count stays high; output is unstable.
  • 2-field checks: (RF) agc_code + limiter_count/sat_count; (DECODE) sync_reason_code + crc_fail_count
  • Interpretation: a high RSSI reading can itself be distorted (clipping or AGC at rail) and is not proof of good margin.
  • First fix: shorten the input clamp loop and stabilize the AGC operating window (avoid rail + limiter chatter).
  • Confirm: run strong-field sweep (H2-10) and verify limiter_count decreases and CRC bursts disappear.
MPN starting points:
Input ESD/TVS (compact): PESD5V0S1UL (Nexperia), PESD5V0X1B (Nexperia), ESD9B5.0ST5G (onsemi)
RF limiter / front-end protection (broad use): PIN limiter diodes such as the Skyworks CLA series or SMP1330-series PIN diodes (Skyworks)
RF gain/detector building blocks (broad use): ADL5501 (Analog Devices, RF detector), AD8361 (Analog Devices, TruPwr detector), ADL5611 (Analog Devices, gain block)
Note: choose variants by frequency band and package constraints; these are evaluation starting points.
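A minimal sketch of the 2-field check for this pattern, using the page's field names (the AGC rail codes are illustrative for an 8-bit gain word; substitute the real rail values):

```python
def strong_rssi_is_real_margin(agc_code, limiter_count, sat_count,
                               agc_rail_low=0, agc_rail_high=255):
    """A high RSSI reading counts as margin only if the AGC sat inside its
    operating window and neither the limiter nor the saturation counter
    fired during the read; otherwise treat the RSSI as distorted."""
    agc_ok = agc_rail_low < agc_code < agc_rail_high
    return agc_ok and limiter_count == 0 and sat_count == 0
```

Running this check before trusting RSSI prevents the classic misread of this pattern: chasing decode thresholds while the front end is clipping.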

Pattern 2 — RSSI is low, reads sometimes succeed (high sensitivity to conditions)

  • Symptom: intermittent success; retry_count distribution becomes heavy-tail.
  • 2-field checks: (RF) rssi_bin + under_amp_count; (DECODE/TIME) retry_count + sample_window_us
  • Interpretation: margin is insufficient; the system “wins by retries” rather than stable convergence.
  • First fix: restore margin at the antenna/matching/connector path before changing decode thresholds (avoid false locks).
  • Confirm: coupling/SNR sweep shows smoother convergence and reduced retries at the same window.
MPN starting points:
Matching network components (robust MLCC families): GRM series (Murata), C0G/NP0 MLCC where possible (various vendors)
Low-loss RF switches (if needed in antenna path): SKY13385-679LF (Skyworks), ADRF5020 (Analog Devices)
Connector/vibration reliability (example families): JST GH/PH series (board-to-wire), TE MicroMatch (board-to-wire)
Note: connector choice is mechanical-system-dependent; use as a shortlist for reviews, not a single “correct” answer.

Pattern 3 — Fails only at low temperature

  • Symptom: field failures cluster at cold; warm conditions pass.
  • 2-field checks: (ENV/TIME) temp_c + timebase_health_code; (DECODE) sync_reason_code + retry_count
  • Interpretation: cold drift can move matching, oscillator, or decision thresholds.
  • First fix: use temperature-bucketed thresholds/AGC targets and a more stable timebase (TCXO) if sync drift dominates.
  • Confirm: thermal soak regression shows stage-coded failures disappear or migrate to a stable, explainable route.
MPN starting points:
TCXO examples: SIT5358 (SiTime), TXETBLSANF-26.000000 (Epson, example family), Abracon ASTX-H11 series (Abracon)
RTC examples (if RTC required): RV-3028-C7 (Micro Crystal), PCF2129 (NXP)
Temperature sensor (for evidence + compensation): TMP117 (Texas Instruments), MCP9808 (Microchip)
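The temperature-bucketed thresholds named in the first fix can be sketched as a lookup table (bucket edges and threshold values here are placeholders; derive the real ones from the thermal soak regression):

```python
TEMP_BUCKETS = [        # (low_c inclusive, high_c exclusive, thresholds)
    (-40, -10, {"agc_target": 52, "sync_thresh": 9}),
    (-10,  25, {"agc_target": 48, "sync_thresh": 8}),
    ( 25,  85, {"agc_target": 46, "sync_thresh": 8}),
]

def thresholds_for(temp_c):
    """Pick the threshold set for the measured temp_c, clamping at the ends."""
    if temp_c < TEMP_BUCKETS[0][0]:
        return TEMP_BUCKETS[0][2]
    for lo, hi, cfg in TEMP_BUCKETS:
        if lo <= temp_c < hi:
            return cfg
    return TEMP_BUCKETS[-1][2]
```

Logging the selected bucket alongside temp_c in the evidence packet makes cold-only failures directly attributable: either the bucket was wrong, or its thresholds were.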

Pattern 4 — Resets mid-read (read half then reboot)

  • Symptom: incomplete telegram handling; missing commit proof; boot counters jump.
  • 2-field checks: (POWER/COMMIT) reset_reason + vdd_min_mv; (COMMIT) commit_status + t_commit_done
  • Interpretation: brownout/wake-up current spike or “commit too late” causes evidence loss.
  • First fix: enforce “minimal evidence first” commit ordering and add brownout-safe power gating / supervisor policy.
  • Confirm: transient injection + power dip tests still produce minimal evidence and t_commit_done when possible.
MPN starting points:
Supervisors / BOR helpers: TPS3839 (Texas Instruments), MAX809 (Analog Devices/Maxim), MCP1316 (Microchip)
Load switch / eFuse (local rail control): TPS22918 (Texas Instruments), TPS25940 (Texas Instruments), LTC4412 (Analog Devices, ideal diode controller)
Hold-up / bulk (application-dependent): polymer electrolytic families (Panasonic OS-CON), low-ESR electrolytics (various vendors)
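The "minimal evidence first" commit ordering can be sketched as a two-phase write (nvm_write is a hypothetical fast-NVM writer, e.g. over FRAM; the key names are placeholders — the ordering is the point):

```python
def commit_read_event(nvm_write, evidence_min, evidence_full):
    """Two-phase, minimal-evidence-first commit ordering. A power dip after
    phase 1 still leaves stage + reason + reset/VDDmin recoverable."""
    nvm_write("evt_min", evidence_min)    # stage + reason + reset + VDDmin
    nvm_write("evt_min_valid", 1)         # phase-1 flag: proves the commit
    nvm_write("evt_full", evidence_full)  # larger payload may be lost safely
    nvm_write("evt_full_valid", 1)        # phase-2 flag
```

On boot, firmware checks the valid flags: evt_min_valid set without evt_full_valid is exactly the "read half then reboot" signature, now with evidence instead of silence.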

Pattern 5 — “CRC passes” yet the decoded output is wrong (rare but critical)

  • Symptom: incorrect telegram interpretation slips through single-pass checks; later correlation fails.
  • 2-field checks: (DECODE) false_detect_count + sync_reason_code; (RESULT) telegram_hash + k_of_n_result/consistency_score
  • Interpretation: frame detection or sync false-lock can yield “valid-looking” frames; CRC alone is not sufficient.
  • First fix: promote acceptance from “CRC OK” to “K-of-N convergence + hash consistency” and tighten frame/sync gating.
  • Confirm: edge SNR + interference injection no longer produces wrong outputs; failures become stage-coded.
MPN starting points:
FRAM for robust event hashing/logging (fast commit): FM24CL64B (Cypress/Infineon), MB85RS64V (Fujitsu)
Serial flash (if policy allows): W25Q32JV (Winbond), MX25R6435F (Macronix, low-power family)
Hardware CRC acceleration MCUs (example families): STM32L4 series (ST), MSP430FR series (TI, FRAM-based)
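The promoted acceptance rule ("K-of-N convergence + hash consistency" instead of CRC alone) can be sketched as (field names follow the page's schema; k is policy-dependent):

```python
def accept_telegram(reads, k=2):
    """reads: per-attempt dicts with crc_ok and telegram_hash. CRC OK is
    necessary but not sufficient; require k matching hashes before output."""
    hashes = [r["telegram_hash"] for r in reads if r["crc_ok"]]
    for h in set(hashes):
        if hashes.count(h) >= k:
            return h
    return None   # reject with a stage code upstream, never a "best guess"
```

A false-locked frame that happens to pass CRC once is unlikely to reproduce the same hash k times, which is what this rule exploits.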

Pattern 6 — After ESD, performance degrades over time (not immediate failure)

  • Symptom: baseline shifts; same installation now shows lower margin or higher retries days later.
  • 2-field checks: (RF) rssi_bin distribution shift + under_amp_count; (DECODE) retry_count trend + crc_fail_count
  • Interpretation: connector/matching drift or shield termination loosened; latent damage is plausible.
  • First fix: re-baseline with the same fixture (SNR sweep) and inspect the antenna/matching/shield path before firmware changes.
  • Confirm: post-ESD baseline matches pre-ESD within the test matrix; trend stabilizes.
MPN starting points:
Low-capacitance ESD for RF nodes: PESD1CAN (Nexperia, example family), ESD5Z series (various vendors)
Shield termination accessories (system-level): 360° EMC cable glands (HUMMEL/Pflitsch families), braid clamps (various vendors)
Adhesive/strain relief (mechanical): epoxy/RTV families (application-dependent)

Pattern 7 — Passes on bench, fails in trackside reality

  • Symptom: lab fixture is stable; field shows high variance and unexplained dropouts.
  • 2-field checks: (RF) agc_code stability + limiter_count; (TIME/DECODE) read_group_id clustering + retry_count tail
  • Interpretation: uncontrolled common-mode return, installation posture, metal reflections, or cable routing dominates.
  • First fix: make the field setup measurable: record installation state in logs and enforce a controlled return path (shield→chassis).
  • Confirm: field regression can reproduce failures and map them to one coupling path (F9), then resolve after the fix.
MPN starting points:
Isolated transceivers for maintenance/debug links: ISO3082 (Texas Instruments, isolated RS-485), ADM2587E (Analog Devices, isolated RS-485)
Common-mode choke (as an interface helper where applicable): WE-CMB series (Würth Elektronik), TDK ACM series (TDK)
Ethernet isolation magnetics (if applicable in the module boundary): Pulse H5007NL (Pulse, example family)

Pattern 8 — Logs exist but cannot be correlated to the same pass window (evidence is unusable)

  • Symptom: events cannot be aligned; upper-layer comparisons are ambiguous.
  • 2-field checks: (TIME) timebase_mode + align_quality; (ID) event_id + read_group_id
  • Interpretation: missing group IDs or timebase health makes timestamps non-evidentiary.
  • First fix: enforce group IDs and alignment quality codes; record timebase health fields on every event.
  • Confirm: multi-read events within one pass window cluster under the same read_group_id with OK alignment.
MPN starting points:
Secure timestamp / tamper-aware RTC options (if needed): NXP PCF85063A (RTC), Microchip MCP79410 (RTC)
Small backup supply element (application-dependent): ML1220 rechargeable coin cell (Panasonic), supercap families (various vendors)
Low-power MCU families with good retention: STM32L0/L4 (ST), EFR32BG (Silicon Labs), MSP430FR (TI)
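The confirm step for this pattern can be sketched as a correlation predicate over one pass window's events (field names follow the page's schema; the OK literal for align_quality is an assumption):

```python
def pass_window_correlates(events):
    """events: evidence dicts for one pass window. The window is usable as
    evidence only if all events share one read_group_id, event_ids are
    unique, and every event reports OK alignment quality."""
    gids = {e["read_group_id"] for e in events}
    eids = [e["event_id"] for e in events]
    return (len(gids) == 1
            and len(set(eids)) == len(eids)
            and all(e["align_quality"] == "OK" for e in events))
```

Running this predicate in the field regression loop makes "logs exist but cannot be correlated" a detectable failure class rather than an analysis dead end.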
[Figure F11] Root-Cause Decision Tree — start from the observed symptom (CRC fails / unstable output; reset mid-read; cold-only failure), run the matching two-field check (agc_code + limiter_count with sync_reason_code; reset_reason + vdd_min_mv with commit_status / t_commit_done; temp_c + timebase_health with sync codes and retries), then apply one first fix (short clamp loop + AGC window; minimal evidence first + supervisor; temperature-bucket thresholds + TCXO). All branches enforce acceptance integrity: K-of-N convergence plus telegram_hash consistency, never CRC alone.
Figure F11. A fast field-debug flow: start from the symptom, inspect two evidence fields, then apply one “first fix” before expanding scope.

ALT (for this figure): Root-cause decision tree for balise failures mapping symptoms to two evidence-field checks and first fix actions such as clamp loop tuning, brownout-safe commits, and temperature-bucket thresholds.

Implementation note: MPN examples are provided as evaluation starting points. Final selection must be validated against the actual RF band, interface impedance, transient levels, and mechanical constraints of the balise module and antenna path.


H2-12. FAQs (Evidence-Driven Troubleshooting)

Each FAQ is designed as a fast field SOP: Verdict → Evidence ×2 → First fix, and explicitly maps back to H2-3…H2-11.

RSSI is high but decoding always fails — front-end saturation or sync thresholds?
Maps to: H2-4 / H2-5 / H2-8
Verdict: Most cases are front-end clipping/AGC rail, not “good margin.”
Evidence: Check agc_code with limiter_count/sat_count (H2-4) and confirm whether failures cluster at sync_reason_code vs crc_fail_count (H2-5/H2-8).
First fix: Shorten the clamp loop / improve return path, then retune AGC target window before touching decode logic (H2-11).
Unreadable only at high speed — window too short or re-read policy wrong?
Maps to: H2-3 / H2-5 / H2-10
Verdict: High-speed failures usually come from time-budget collapse before convergence, not raw RF loss alone.
Evidence: Compare sample_window_us vs retry_count distribution (H2-3), and locate the dominant fail stage via sync_reason_code / CRC bursts (H2-5).
First fix: Enforce a speed-aware K-of-N policy (bounded retries + early abort stage codes) and validate with window-stress tests (H2-10).
Occasional wrong read but CRC shows no error — telegram version compatibility?
Maps to: H2-5 / H2-8
Verdict: Treat “CRC OK” as insufficient; false-lock/parse mismatch can still yield wrong semantics.
Evidence: Require telegram_hash consistency across reads and track k_of_n_result/consistency_score (H2-8). Also inspect false_detect_count and stage codes around sync/parse (H2-5).
First fix: Upgrade acceptance to “K-of-N + hash stable,” and log telegram_version_id/parser_path_id (H2-11).
Failure spikes in cold/heat — matching drift or clock drift causing sampling misalignment?
Maps to: H2-3 / H2-5 / H2-9
Verdict: Separate “RF margin drift” from “timebase/threshold drift” using trends, not guesses.
Evidence: Correlate temp_c with rssi_bin shift (margin drift, H2-3) and with sync_reason_code/sync_lock_time (timing drift, H2-5). Track interference sensitivity changes (H2-9).
First fix: Apply temperature-bucketed thresholds/AGC targets and add timebase health reporting before redesigning hardware.
Works in one direction but fails on the return pass — posture/reflection or install height?
Maps to: H2-3 / H2-9
Verdict: Direction-dependent reads point to coupling geometry and metal environment sensitivity.
Evidence: Compare rssi_bin and agc_code stability between directions (H2-3). If interference route changes, limiter_count spikes and stage codes shift (H2-9).
First fix: Log installation state (height/posture ID) and correlate with the coupling/SNR budget before firmware changes (H2-3/H2-10).
After ESD, reads fail — front-end damage or MCU reset loop?
Maps to: H2-9 / H2-6 / H2-8
Verdict: Distinguish “RF path injured” vs “power/reset collapse” using one RF field and one power field.
Evidence: Check reset_reason with vdd_min_mv (H2-6/H2-8) and compare limiter_count/sat_count behavior pre/post ESD (H2-9).
First fix: Ensure minimal evidence commits survive resets, then improve clamp/return path if RF saturation signatures dominate (H2-11).
Unreadable and no logs either — did power loss prevent commit?
Maps to: H2-6 / H2-8
Verdict: “No log” is a power/commit problem until proven otherwise.
Evidence: Look for missing t_commit_done and non-OK commit_status, then correlate with reset_reason and vdd_min_mv (H2-6/H2-8).
First fix: Move “minimal evidence first” ahead of heavy parsing and use a commit-proof scheme (FRAM or two-phase flags) (H2-11).
Needs many re-reads to stabilize — sync/threshold issue or insufficient SNR margin?
Maps to: H2-3 / H2-5
Verdict: Persistent retries mean the system is not converging; treat it as margin or gating instability.
Evidence: Use retry_count trend plus sync_reason_code to separate “can’t lock” from “locks then CRC bursts” (H2-5). Check whether rssi_bin is near the sensitivity knee (H2-3).
First fix: Restore margin first (coupling/matching), then tune gating thresholds to avoid false acceptance.
AGC stays pinned — weak coupling or wrong gain configuration?
Maps to: H2-4 / H2-8
Verdict: AGC rail can indicate either “input too small” or “gain path misconfigured”; differentiate by RSSI bins.
Evidence: Compare agc_code with rssi_bin and under_amp_count (H2-8). If RSSI is not low, the gain table/register path is suspect (H2-4).
First fix: Audit gain/AGC register configuration and tap-point observability before mechanical changes (H2-4).
Timestamps don’t align with speed/odometer — wrong marking point or clock drift?
Maps to: H2-7 / H2-10
Verdict: Misalignment is usually a definition/marking problem before it is a hardware clock problem.
Evidence: Compare align_quality with timebase_health_code (H2-7). Validate whether timestamps are taken at frame start/end/verify OK consistently using test logs (H2-10).
First fix: Standardize marking points (t_frame_start/t_verify_ok) and add alignment quality codes for every event (H2-7/H2-8).
“Read OK but position is still wrong” — missing correlation fields between telegram and upper layers?
Maps to: H2-7 / H2-8
Verdict: Position disputes are often correlation failures, not decode failures.
Evidence: Verify read_group_id clustering and event_id uniqueness per pass window (H2-7/H2-8). Check whether align_quality is OK when the upstream claims mismatch.
First fix: Add/validate correlation IDs and a minimal telegram summary (hash + version) so upstream mapping can be proven (H2-8).
Failure rate changes after switching balise batches — protocol nuance or RF tolerance distribution?
Maps to: H2-5 / H2-10 / H2-11
Verdict: Batch shifts must be separated into “parser/protocol differences” vs “RF margin distribution.”
Evidence: Compare telegram_version_id/parser_path_id outcomes (H2-5) and baseline rssi_bin + retry/CRC statistics under identical fixture conditions (H2-10).
First fix: Run A/B baseline with the test-to-evidence matrix and apply the decision-tree route before hardware changes (H2-11).