Rail Driver Desk & HMI Design Guide

Driver Desk & HMI is a rail-grade operator interface that must remain readable and controllable under power dips, EMI, temperature extremes, and vibration—while producing aligned, signed evidence packets for fast root-cause and compliance. This guide maps failures to measurable fields and first fixes across input, display, audio, networking/time sync, safety states, logging, and validation.

H2-1 · System Scope & Boundary

The Driver Desk & HMI page defines the operator-facing I/O endpoint that turns train states into actionable displays and turns human inputs into bounded, auditable commands. The goal is not “UI beauty”; it is deterministic interaction, evidence-grade event records, and predictable safe behavior under rail power and EMC stress.

In-scope responsibilities (what this page must fully answer)

Input acquisition: touch, rotary encoder, hard keys, emergency/safety-related inputs (HMI-side), with debouncing and fault detection.
Output presentation: display link + backlight control, annunciation (lamps/buzzer), and “freshness” of shown data (stale vs valid).
Connectivity edge: Ethernet/serial (RS-485/CAN/other) as the HMI’s boundary to TCMS/vehicle networks, including link supervision and watchdog behavior.
Evidence entry-point: event IDs, timestamps, operator action traces, power-state context, and version context sufficient for post-incident reconstruction.
Rail constraints: wide input rails and transients, temperature class expectations (e.g., EN 50155), vibration/connector robustness, HMI-side EMC survivability.

Out-of-scope (explicit exclusions to prevent overlap)

Traction control algorithms (FOC/SVPWM/torque loops) and traction inverter power-stage design. These belong to Traction Inverter / TCMS control pages.
Core network switching design (TSN scheduling policy, full-train ECN topology). This belongs to Train Backbone Ethernet/ECN/WTB/MVB Gateway pages.
CBTC/ETCS business logic, interlocking/vital logic, and signaling protocol semantics. These belong to signaling/safety pages.

Boundary checklist (the “hard variables” that must be stated up front)

Power domain: typical 24/48/110 Vdc vehicle supplies; define brownout/holdup expectations for display continuity and log commit.
Environment: temperature class target (e.g., EN 50155 TX) and vibration/shock expectations (e.g., EN 61373) that drive connector and mounting choices.
Interfaces: TCMS/vehicle network (Ethernet/serial), safety-related inputs (if any), and event recorder/log collector interface (local or remote).

Design intent: treat the Driver Desk & HMI as a bounded system with a clear “contract”: what enters (states, power, time), what leaves (commands, alarms, evidence packets), and what must remain true under stress (latency bounds, safe states, and record integrity).

Figure F1. System context map: the HMI sits between the operator and train control, while power/network health and evidence outputs must remain well-defined under rail stress.

Cite this figure: Driver Desk & HMI — System Context Map (F1)

H2-2 · Operational Intent & Failure Narrative

Rail HMI failures are rarely “UI bugs” in isolation. Most incidents are the result of a stress condition (transients, EMC, temperature, vibration) interacting with a missing evidence field or an undefined safe behavior. This chapter converts symptom-style questions into an evidence-first diagnostic structure that the rest of the page will consistently reference.

What operators and maintainers actually care about (symptom list)

Black screen / flicker while the system appears “alive”.
Touch mis-trigger / drift (especially after ESD or in wet/glove operation).
Encoder jitter (jumps, double-steps, direction reversals).
Audio echo / hum / noise during PA/intercom actions.
Network drop (Ethernet link flaps, serial timeouts).
Timestamp mismatch across subsystems (events cannot be aligned).
Slow response (button latency, screen update lag).
Missing logs after reboot or during transient disturbances.

Four rail stress scenarios (use as a diagnostic matrix)

Normal operation: baseline latency, display refresh, input accuracy, and log completeness.
EMC stress: ESD/EFT/surge exposure; focus on false inputs, link flaps, and recovery behavior.
Power disturbance: brownout/holdup windows; focus on “what stays on”, “what resets”, and “what is committed”.
Extreme temperature: drift, backlight derating, touch controller baseline stability, and boot-time impact.

Evidence discipline: each symptom will be mapped to a minimum set of evidence fields (flags, counters, raw channels, and timestamps) and a first corrective action. If an incident cannot produce Level-1/Level-2 evidence (events/waveforms), the design must be improved before deployment.

Figure F2. A practical funnel: symptom → suspected block → required evidence fields → first corrective action. The “Evidence Fields” layer is the anchor for later chapters and FAQ answers.

Cite this figure: Driver Desk & HMI — Failure to Evidence Funnel (F2)

H2-3 · Hardware Architecture Decomposition

The Driver Desk & HMI is best treated as six cooperating hardware blocks. Each block must publish a clear interface contract, declare its isolation boundary, meet a timing budget, and expose a minimum set of observable health fields (counters/flags/latency) so incidents can be diagnosed without guesswork.

Six blocks to decompose (fixed structure for vertical depth)

1) Processing Core (MCU / SoC)

UI rendering, input processing, protocol endpoints, event packet assembly. Track boot stages, load, resets.

2) Touch & Encoder Interface

Touch controller + encoder/key capture. Define debounce, drift recovery, and input confidence signals.

3) Display & Backlight

Display link + backlight driver. Separate “panel alive” vs “backlight alive”, and expose link-lock states.

4) Audio/Video Subsystem

Codec/DSP/amp chain. Expose clipping/overrun, echo-control states, and power/noise coupling indicators.

5) Network & Serial

Ethernet PHY + serial buses. Provide link flap counters, CRC/error stats, and timeout/retry telemetry.

6) Logging & Storage

Event queue + commit path. Publish queue depth, dropped logs, commit time, and sequence gap detection.
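The sequence gap detection named for the Logging & Storage block can be sketched as follows. This is a minimal illustration, assuming each committed event carries a monotonically increasing integer sequence number; the function name `find_sequence_gaps` is hypothetical and not taken from any specific logger.

```python
from typing import Iterable, List, Tuple


def find_sequence_gaps(seqs: Iterable[int]) -> List[Tuple[int, int]]:
    """Return inclusive (first_missing, last_missing) ranges in an ascending stream."""
    gaps: List[Tuple[int, int]] = []
    prev = None
    for seq in seqs:
        # A jump of more than one means events between prev and seq were never committed.
        if prev is not None and seq > prev + 1:
            gaps.append((prev + 1, seq - 1))
        prev = seq
    return gaps


def dropped_count(gaps: List[Tuple[int, int]]) -> int:
    """Total number of missing sequence numbers across all gaps."""
    return sum(last - first + 1 for first, last in gaps)
```

Running this over the committed sequence numbers after a reboot gives a direct count to compare against the dropped-log counter: if the two disagree, events were lost somewhere the counter does not observe.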

Per-block writing checklist (must appear for every block)

Interfaces: what enters/leaves (signals, data, power rails, and who owns the contract).
Isolation boundary: where common-mode suppression / isolation is applied (HMI-side only).
Timing budget: input → process → display/command, including jitter and freshness thresholds.
Fault observability: minimum counters/flags to localize the fault domain without ambiguity.

Observability rule: every block should emit (1) health counters (trendable) and (2) incident snapshots (captured at trigger time). This prevents “black-box” failures during EMC/power stress.

Figure F3. Modular decomposition: each block must define interfaces (IF), isolation boundaries (ISO), and minimum observability (OBS) so failures can be localized under rail stress.

Cite this figure: Driver Desk & HMI — Modular Block Diagram (F3)

H2-4 · Power Architecture & Brownout Behavior

In rail environments, HMI availability is constrained by inrush, short supply dips, cold-start latency, and reset storms. The design objective is not to “never reset”, but to guarantee a predictable degraded mode and an evidence-grade shutdown sequence when voltage crosses defined thresholds.

Primary rail power risks (and why they matter)

Start-up surge: inrush and backlight/amp load steps can sag the rail and trigger protection or brownout.
Transient dips: short drops can corrupt storage commits or desynchronize timestamps if policy is undefined.
Cold-start delay: boot chain latency changes with temperature; “UI ready” must be measurable and bounded.
Repeated resets: mismatched UV thresholds + watchdog policy can create reset loops that hide root causes.

Required building blocks (HMI-side, implementation-focused)

Wide-VIN front-end: define measurement points for rail monitoring and set explicit brownout thresholds.
eFuse / hot-swap: protection actions must be readable as telemetry (fault codes, retry counts, latch state).
Holdup budgeting: allocate a minimum window to keep the UI in “minimum mode” and to finish log commits.
Brownout policy: map voltage thresholds to actions: freeze commands → seal evidence → safe off.

Policy anchor terms: define (1) Minimum UI Mode, (2) Commit Window, and (3) Reset Reason. These terms become hard references for later validation steps and FAQ answers.
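The threshold-to-action mapping of the brownout policy can be sketched as a small lookup. The voltage values below are hypothetical placeholders for a 24 V nominal supply and must come from the actual power design; hysteresis and latching are deliberately left out to keep the sketch short.

```python
from enum import Enum


class PowerAction(Enum):
    NORMAL = "normal"
    FREEZE_COMMANDS = "freeze_commands"
    SEAL_EVIDENCE = "seal_evidence"
    SAFE_OFF = "safe_off"


# Hypothetical thresholds in volts, ordered V1 > V2 > V3 (not from the source).
V1_FREEZE = 16.8
V2_SEAL = 14.4
V3_OFF = 12.0


def brownout_action(vin: float) -> PowerAction:
    """Map a measured input voltage to the most severe action it demands."""
    if vin < V3_OFF:
        return PowerAction.SAFE_OFF
    if vin < V2_SEAL:
        return PowerAction.SEAL_EVIDENCE
    if vin < V1_FREEZE:
        return PowerAction.FREEZE_COMMANDS
    return PowerAction.NORMAL
```

A real implementation would likely latch the most severe action reached during a dip and log the reset reason, so that a rail bouncing around a threshold does not toggle the UI between modes.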

Figure F4. Holdup budgeting is a policy tool: it reserves time to enter Minimum UI Mode, complete a commit window, and then execute a predictable safe-off sequence tied to thresholds (V1/V2/V3).

Cite this figure: Driver Desk & HMI — Holdup Energy Budget (F4)
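As a rough worked example of holdup budgeting, the usable energy in a holdup capacitor discharging from v_start to v_min is 0.5 * C * (v_start^2 - v_min^2), and dividing by the load power gives the available time. The sketch below assumes a constant-power load and a single lumped converter efficiency; the capacitor value, voltages, load, and efficiency are illustrative assumptions, not values from this guide.

```python
def holdup_time_ms(cap_f: float, v_start: float, v_min: float,
                   load_w: float, efficiency: float = 0.85) -> float:
    """Estimate holdup time in milliseconds for a constant-power load.

    cap_f      : holdup capacitance in farads
    v_start    : capacitor voltage when the input collapses
    v_min      : lowest voltage at which the converter still regulates
    load_w     : load power in Minimum UI Mode, in watts
    efficiency : assumed converter efficiency (0..1)
    """
    usable_joules = 0.5 * cap_f * (v_start ** 2 - v_min ** 2) * efficiency
    return 1000.0 * usable_joules / load_w


def commit_window_fits(holdup_ms: float, commit_ms: float, margin: float = 2.0) -> bool:
    """True if the holdup time covers the commit window with the given safety factor."""
    return holdup_ms >= commit_ms * margin
```

With 4700 µF discharging from 24 V to 12 V into a 3 W load at 85 % efficiency, the estimate is about 288 ms. Whether that is enough depends on the measured worst-case commit time, which is why commit timing has to be validated rather than assumed.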

H2-5 · Touch & Encoder Interfaces

Input reliability in rail HMI is a measurable engineering problem: it must remain trustworthy under long harnesses, vibration, temperature drift, and EMC exposure. The design target is not only “responsive inputs”, but also evidence-based separation between false triggers and missed triggers, backed by counters, snapshots, and reject-reason codes.

Touch technologies (selection logic, not algorithm details)

Capacitive touch (typical)

Strong UI experience but sensitive to common-mode injection, ESD recovery, wet/glove operation, and baseline drift. Requires explicit mode/state evidence.

Resistive touch (legacy / niche)

Mechanically direct press behavior; different wear/aging profile. Often simpler for gloves but can trade durability and precision under vibration.

Glove / wet operation (mode, cost, and evidence)

Mode declaration: the HMI must expose which profile is active (normal / glove / wet) so incidents can be reconstructed.
Trade-offs: increasing sensitivity can raise false-trigger probability; reducing sensitivity can raise missed-trigger probability.
Minimum evidence fields: touch_mode, threshold_profile_id, touch_latency_ms, ghost_touch_cnt, baseline_reset_cnt.

EMI hardening (path + cut-point approach)

Common-mode path: long cables and shield reference shifts can inject into the sensor reference. Cut-point: HMI-side CM suppression and stable reference plane definition.
Electrode coupling: large sensor electrodes amplify parasitics under ESD/EFT. Cut-point: controlled recovery path and baseline discipline.
Supply injection: rail ripple can modulate measurements. Cut-point: sensor rail filtering and “snapshot on event” observability.

Encoder debouncing (edge-density model)

Edge density: abnormal A/B edge burst rate is a primary discriminator between mechanical bounce and injected noise.
Direction reversals: reverse_step_cnt highlights jitter and EMI-induced phase errors.
Reject accounting: debounce_reject_cnt must increase when steps are discarded; this prevents “silent misses”.
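The three encoder points above can be combined in one small filter. This is a sketch under stated assumptions: a standard quadrature (Gray-code) encoder, edges delivered with a millisecond timestamp, and hypothetical window and edge-limit values. The class name `EncoderFilter` is illustrative only.

```python
from collections import deque

# Valid quadrature transitions; states are encoded as (A << 1) | B.
_STEP = {
    (0b00, 0b01): +1, (0b01, 0b11): +1, (0b11, 0b10): +1, (0b10, 0b00): +1,
    (0b00, 0b10): -1, (0b10, 0b11): -1, (0b11, 0b01): -1, (0b01, 0b00): -1,
}


class EncoderFilter:
    def __init__(self, window_ms: float = 5.0, max_edges_in_window: int = 4):
        self.window_ms = window_ms          # hypothetical edge-density window
        self.max_edges = max_edges_in_window
        self.edge_times = deque()
        self.state = 0b00
        self.position = 0
        self.last_dir = 0
        self.debounce_reject_cnt = 0
        self.reverse_step_cnt = 0

    def on_edge(self, t_ms: float, a: int, b: int) -> int:
        """Process one sampled A/B state; return the accepted step (+1, -1, or 0)."""
        new_state = (a << 1) | b
        if new_state == self.state:
            return 0
        # Track edge density over a sliding time window.
        self.edge_times.append(t_ms)
        while self.edge_times and t_ms - self.edge_times[0] > self.window_ms:
            self.edge_times.popleft()
        step = _STEP.get((self.state, new_state))
        self.state = new_state
        # Reject illegal transitions (both bits changed) and abnormal edge bursts.
        if step is None or len(self.edge_times) > self.max_edges:
            self.debounce_reject_cnt += 1
            return 0
        if self.last_dir and step != self.last_dir:
            self.reverse_step_cnt += 1
        self.last_dir = step
        self.position += step
        return step
```

Every discarded edge increments `debounce_reject_cnt`, so a missed step always leaves a trace; a rising `reverse_step_cnt` during a burst is the signature of jitter rather than deliberate rotation.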

Redundant inputs & safety button hardware channel (HMI-side only)

Dual confirmation pattern: require two independent input channels for critical HMI actions (e.g., touch + physical confirm), focusing on channel independence.
Safety button channel: safety-related hard path should remain hardware-based; the HMI records press/release duration and debounce outcomes without implementing safety logic.
Minimum evidence fields: safety_btn_state, press_duration_ms, reject_reason, event_id, timestamp_source.

Evidence separation rule:

False trigger = an accepted input event exists (touch_down / encoder_step) with abnormal spatial/temporal patterns, often accompanied by ESD/EMC recovery signals.
Missed trigger = raw activity exists (raw_delta / edge activity) but the event is rejected or dropped; a reject_reason must be logged (debounce / out_of_region / stale / safety_lock / queue_full).
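The separation rule can be expressed as a small classifier over logged input records. This is a sketch only: the record layout and the `abnormal_pattern` flag are hypothetical stand-ins for whatever plausibility checks the input pipeline actually performs.

```python
from dataclasses import dataclass
from typing import Optional

# Reject reasons listed in the evidence separation rule.
REJECT_REASONS = {"debounce", "out_of_region", "stale", "safety_lock", "queue_full"}


@dataclass
class InputRecord:
    raw_activity: bool              # raw_delta or edge activity was observed
    accepted: bool                  # a touch_down / encoder_step event was emitted
    abnormal_pattern: bool = False  # hypothetical spatial/temporal plausibility flag
    reject_reason: Optional[str] = None


def classify(rec: InputRecord) -> str:
    """Classify one record as valid, false trigger, missed trigger, or idle."""
    if rec.accepted:
        return "false_trigger" if rec.abnormal_pattern else "valid"
    if rec.raw_activity:
        # Raw activity without an accepted event must carry a known reject reason.
        if rec.reject_reason in REJECT_REASONS:
            return "missed_trigger"
        return "missed_trigger_unexplained"
    return "idle"
```

The `missed_trigger_unexplained` outcome is the one to drive to zero in validation: it marks activity that was dropped without a logged reason, which is exactly the silent miss the rule is meant to prevent.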

Figure F5. A rail-grade input pipeline must explain “accepted” and “rejected” decisions with mode/profile state, debounce windows, and explicit reject reasons—enabling reliable false-vs-missed separation.

Cite this figure: Driver Desk & HMI — Input Integrity & Debounce (F5)

H2-6 · Display & Backlight Subsystem

In rail HMIs, “display” is a system: rendering, link timing, panel behavior, backlight power, and recovery policy. Failures must be diagnosable by layer (render vs link vs panel vs backlight), and the design must support visual evidence capture through brightness trends and backlight modulation characteristics.

Display links (LVDS / MIPI / eDP) — what matters for reliability

Link stability: treat link_lock and retrain_cnt as first-class health signals (separate “panel alive” from “backlight alive”).
Harness & vibration sensitivity: connector intermittency appears as burst errors and retraining events; keep counters and timestamps.
Minimum observability: link_lock, link_err_cnt, retrain_cnt, ui_fps, frame_drop_cnt, panel_temp.

Backlight control (constant-current vs PWM)

Constant-current (CC)

Brightness is regulated through current; focus on thermal derating states and current ripple evidence.

PWM dimming

Flexible control but can introduce flicker at low brightness and EMI peaks; pwm_freq and duty must be logged.

Low-brightness flicker (mechanism classification + evidence)

Type A (PWM frequency too low): flicker correlates with low pwm_freq and duty extremes; evidence: pwm_freq_hz + duty vs brightness steps.
Type B (beat with refresh/bit-depth): flicker correlates with frame timing changes; evidence: ui_fps/frame_drop + flicker reports vs render load.
Type C (rail ripple into LED current): flicker correlates with backlight rail ripple; evidence: bl_current ripple + supply ripple under load steps.
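The three flicker classes can be turned into a first-pass triage rule. The sketch below assumes the correlations between flicker reports and frame drops or rail ripple have already been computed elsewhere; the frequency floor and correlation threshold are hypothetical defaults, not recommended values.

```python
from typing import List


def flicker_candidates(pwm_freq_hz: float,
                       corr_with_frame_drop: float,
                       corr_with_rail_ripple: float,
                       min_pwm_freq_hz: float = 1000.0,
                       corr_threshold: float = 0.6) -> List[str]:
    """Return the flicker classes (A, B, C) that the evidence supports.

    More than one class may apply; an empty list means the logged
    evidence does not match any of the three mechanisms.
    """
    candidates: List[str] = []
    if pwm_freq_hz < min_pwm_freq_hz:
        candidates.append("A")  # PWM frequency too low
    if corr_with_frame_drop >= corr_threshold:
        candidates.append("B")  # beat with refresh / frame timing
    if corr_with_rail_ripple >= corr_threshold:
        candidates.append("C")  # rail ripple into LED current
    return candidates
```

Returning a list rather than a single class is deliberate: mechanisms can coexist, and the triage should narrow the measurement plan rather than declare a root cause.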

EMI vs backlight switching frequency (principles)

Noise source region: backlight switching edges can radiate via harness; isolate and bound the switching region, and measure outcomes via link and input counters.
Frequency selection: avoid sensitive bands (audio coupling, system sampling interactions); keep the selected pwm_freq and profile ID as evidence.

Thermal derating (display readability as a contract)

Derating state: expose brightness_derate_state and panel_temp, and define a minimum readability mode for alarm-critical UI.
Cold-start impact: track ui_ready_ms and link_lock time to detect temperature-dependent boot regressions.

Visual evidence requirement: the design should support capturing (1) a brightness curve (time vs brightness/current) and (2) a PWM modulation signature (freq/duty profile). These become the fastest discriminators for flicker root-cause classes.

Figure F6. Display is a layered system. Link-lock/retraining counters localize interface issues, while PWM/current signatures and brightness curves provide direct evidence for low-brightness flicker classes and EMI coupling.

Cite this figure: Driver Desk & HMI — Display Timing & Backlight (F6)

H2-7 · Audio/Video & Codec Chain

Cab audio/video issues are rarely single-component failures. In rail HMIs, audible noise, echo, burst dropouts, and sudden level changes are often caused by coupling across analog, digital, power, and ground/common-mode domains. The chain must provide enough observability to distinguish DSP-state problems from power/ground coupling without trial-and-error replacements.

Audio chain blocks (with minimum evidence fields)

Mic array & AFE

Analog/PDM/I²S capture with clipping and overrun evidence: mic_level_rms, mic_clip_cnt, adc_overrun_cnt, pdm_clk_err_cnt.

Codec/DSP path

Clock/stream stability evidence: codec_lock, sr_mismatch_cnt, buf_underflow_cnt, buf_overflow_cnt.

AEC (echo control)

Treat AEC as a state machine: aec_state, double_talk_cnt, residual_echo_level for incident reconstruction.

PA/GA interface

Amplifier protection and “pop” events: amp_ocp_flag, amp_otp_flag, amp_fault_cnt, audio_pop_event_cnt.

Video decode/encode (HMI-side impact only)

Load coupling: video workloads can steal memory bandwidth and raise UI/audio dropouts; evidence via decoder_load, ui_fps, frame_drop_cnt, thermal_state.
Priority discipline: alarm-critical UI and audio prompts should remain measurable under video stress (report latency and drop counters).

Noise root-cause domains (use correlation-first diagnosis)

Power coupling: rail ripple or load-step markers correlate with noise_floor or bursts → prioritize power-domain mitigation and snapshots.
Ground loop / common-mode: noise changes with external connections or shield reference → prioritize CM evidence and interface boundary checks.
Digital injection: tones correlate with PWM/PHY activity (fundamental or harmonics) → prioritize frequency profile evidence and isolation boundaries.

Noise Snapshot (recommended): when audio incidents trigger, capture a compact snapshot containing noise_floor, rail_ripple_mV, aec_state, buf_underflow_cnt, and a load_step_marker. This prevents ambiguous “it sounds noisy” reports.
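Correlation-first diagnosis can be illustrated with a plain Pearson correlation between two snapshot series, for example noise_floor against rail_ripple_mV. This is a minimal sketch; the sample values in the usage note are invented for illustration and the decision threshold is left to the reader.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length series.

    Returns 0.0 when either series is constant (correlation undefined).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)
```

For example, noise_floor samples of -70, -68, -60, -55, -69 dB taken alongside rail ripple of 20, 25, 60, 80, 22 mV correlate strongly, which would point at the power-coupling domain before any grounding rework is attempted.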

Figure F7. Noise should be diagnosed by coupling domain. Correlate audible symptoms with rail ripple, CM events, AEC state, and buffer counters to avoid “blind part swapping”.

Cite this figure: Driver Desk & HMI — Audio Noise Coupling Map (F7)

H2-8 · Networking & Time Synchronization

Networking is not only connectivity. For rail HMIs, time synchronization is the foundation of evidence: it enables cross-system log alignment between HMI events, vehicle control logs, and external recorders. The design must export a time-quality tag (source, offset, holdover, step events) so timestamps remain trustworthy during link loss and recovery.

Connectivity interfaces (HMI-side health and recovery)

Ethernet / TSN: expose link_up_time, link_flap_cnt, crc_err_cnt, and a freshness policy (stale threshold) for displayed state.
RS-485 / CAN: expose timeout_cnt, retry_cnt, and bus_off_cnt (CAN) to prevent silent data loss.
PHY watchdog: use a PHY watchdog or controlled resets to recover from stuck link states; record the reset count and the last reset reason.
Isolation boundary: isolate network interfaces and treat CM suppression as part of reliability evidence (isolation_fault_flag, cm_event_cnt).
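The freshness policy mentioned for displayed state can be sketched as an age check against a stale threshold. The 500 ms default and the `SignalValue` layout are assumptions for illustration; the timestamps are taken to be monotonic receive times so that wall-clock steps cannot make stale data look fresh.

```python
from dataclasses import dataclass


@dataclass
class SignalValue:
    name: str
    value: float
    updated_ms: int  # monotonic time at which the value was last received


def display_state(sig: SignalValue, now_ms: int, stale_threshold_ms: int = 500) -> str:
    """Return 'valid' if the value is fresh enough to show as live, else 'stale'."""
    age_ms = now_ms - sig.updated_ms
    return "valid" if age_ms <= stale_threshold_ms else "stale"
```

A UI would typically render a stale value differently (greyed out or flagged) rather than hiding it, so the operator can tell that the last known state is no longer being refreshed.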

Time synchronization (PTP hardware timestamps + quality tagging)

PTP hardware timestamping

Prefer hardware timestamp points for stable event alignment. Evidence: offset_to_master, ptp_lock_state, time_step_detected.

Holdover discipline

During link loss, holdover_state and drift indicators must be recorded so timestamps remain explainable.

Cross-system log alignment (the evidence contract)

Alignment key: use event_id + sequence to correlate across devices; timestamp alone is not enough when steps occur.
Time quality tag: every event should include timestamp_source, offset class, holdover_state, and time_step markers.
Conflict handling: when time_step is detected, seal an event and mark the affected window as reduced-quality for forensic reconstruction.

Time Quality Tag (recommended fields): timestamp_source (PTP/GNSS/RTC/monotonic), offset_to_master (ns/us class), holdover_state (locked/holdover/drifting), and time_step_detected (flag + magnitude bucket).
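One possible shape for the time quality tag is an immutable record attached to each event at capture time. The field names follow the recommended fields above; the offset class boundaries and the `reduced_quality` policy are hypothetical choices for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class TimeSource(Enum):
    PTP = "ptp"
    GNSS = "gnss"
    RTC = "rtc"
    MONOTONIC = "monotonic"


class HoldoverState(Enum):
    LOCKED = "locked"
    HOLDOVER = "holdover"
    DRIFTING = "drifting"


def offset_class(offset_ns: int) -> str:
    """Bucket an offset magnitude into a coarse class (boundaries are assumptions)."""
    magnitude = abs(offset_ns)
    if magnitude < 1_000:
        return "ns"
    if magnitude < 1_000_000:
        return "us"
    return "ms_or_worse"


@dataclass(frozen=True)
class TimeQualityTag:
    timestamp_source: TimeSource
    offset_to_master_ns: int
    holdover_state: HoldoverState
    time_step_detected: bool = False

    def reduced_quality(self) -> bool:
        """Hypothetical policy: any step or loss of lock marks the event reduced-quality."""
        return self.time_step_detected or self.holdover_state is not HoldoverState.LOCKED
```

Making the tag immutable reflects the intent that quality is recorded at capture time and never rewritten afterwards, even if synchronization later recovers.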

Figure F8. A “time sync ladder” makes timestamps explainable. Events carry a time quality tag at capture time, PTP provides offset/holdover evidence during transport, and recorders align by event_id/sequence across systems.

Cite this figure: Driver Desk & HMI — Time Sync Ladder (F8)

H2-9 · Safety, Redundancy & Fail-safe States

For rail HMIs, safety is defined by controllability and provability: the interface must enter predictable restricted states when its trust is degraded, while maintaining minimum critical visibility and evidence continuity. This section focuses on HMI-internal redundancy, watchdog discipline, and fail-safe UI behavior (not vehicle safety logic).

Functional failure vs safety failure (must be separable)

Functional failure

Non-safety features degrade (e.g., video, advanced pages). Recovery may reboot or load-shed without implying unsafe operation.

Safety failure

Trust in visibility, time quality, or critical inputs degrades. HMI must enter a restricted UI state and seal evidence.

Dual MCU / safety island (HMI-internal responsibilities)

UI domain: graphics, touch, audio/video, networking; high load and non-deterministic by nature.
Monitor domain: low-load supervision for heartbeats, time quality, logging triggers, and controlled resets.
Observed safety inputs: record/present only (e.g., dual-channel states); do not implement vehicle-level voting outcomes.
Minimum evidence: heartbeat_miss_cnt, cross_check_mismatch_cnt, monitor_reset_cnt, safe_state_latched_flag.

Watchdog strategy (tiered, evidence-backed)

Windowed watchdog: detects both “no kick” and “bad kick” patterns; record wdt_trip_cnt and last_reset_reason.
Independent monitor reset: enables recovery even when the UI domain stalls; record monitor_initiated_reset_cnt.
Reset continuity: reboot must preserve boot_counter, last_good_seq, last_log_commit_ts to maintain forensic continuity.
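The windowed watchdog idea can be modeled in a few lines for host-side testing. This is a behavioral sketch, not firmware: the class simulates a hardware window with hypothetical min/max values, treats a too-early kick as a bad kick, and treats an overdue kick as a timeout.

```python
class WindowedWatchdog:
    """Trips when a kick arrives too early (bad kick) or too late (no kick)."""

    def __init__(self, min_ms: int, max_ms: int, start_ms: int = 0):
        self.min_ms = min_ms
        self.max_ms = max_ms
        self.last_kick_ms = start_ms
        self.wdt_trip_cnt = 0
        self.last_reset_reason = None

    def _trip(self, reason: str, now_ms: int) -> None:
        self.wdt_trip_cnt += 1
        self.last_reset_reason = reason
        self.last_kick_ms = now_ms  # re-arm after the simulated reset

    def kick(self, now_ms: int) -> bool:
        """Service the watchdog; return False if this kick caused a trip."""
        elapsed = now_ms - self.last_kick_ms
        if elapsed < self.min_ms:
            self._trip("bad_kick_early", now_ms)
            return False
        if elapsed > self.max_ms:
            self._trip("no_kick_timeout", now_ms)
            return False
        self.last_kick_ms = now_ms
        return True

    def poll(self, now_ms: int) -> bool:
        """Called by the monitor domain; return False if the window has expired."""
        if now_ms - self.last_kick_ms > self.max_ms:
            self._trip("no_kick_timeout", now_ms)
            return False
        return True
```

The early-kick check is what distinguishes a windowed watchdog from a plain timeout: a runaway loop that kicks continuously is caught just like a stalled one.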

Fail-safe UI behavior (minimum UI mode + restrictions)

Degraded UI: load-shedding (video off, reduced effects) while keeping critical status readable and logging active.
Fail-safe UI: restricted navigation and inputs; allow only alarm acknowledgement and limited confirmations.
Emergency mode limits: block configuration, updates, and deep menus; expose read-only status + exportable error codes.
Required fields: safe_state_enter_reason, safe_state_current, ui_ready_ms, time_quality_degraded_flag.

HMI-side safety input handling: dual-channel input disagreement should lead to restricted UI and a recorded disagreement window (safety_in_disagree_ms), while the final safety decision remains outside the HMI scope.
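The degraded and fail-safe behavior described in this section can be sketched as a small latched state machine. The trigger names and their mapping to states are hypothetical examples consistent with the text (heartbeat loss, time steps, log commit failures, safety input disagreement); an actual product would define its own reason table.

```python
from enum import Enum
from typing import Optional


class UiState(Enum):
    NORMAL = 0
    DEGRADED = 1
    FAIL_SAFE = 2


# Hypothetical mapping from trigger reason to the minimum state it demands.
REASON_TO_STATE = {
    "thermal_derate": UiState.DEGRADED,
    "video_overload": UiState.DEGRADED,
    "heartbeat_miss": UiState.FAIL_SAFE,
    "time_step_detected": UiState.FAIL_SAFE,
    "log_commit_fail": UiState.FAIL_SAFE,
    "safety_in_disagree": UiState.FAIL_SAFE,
}


class SafeStateMachine:
    def __init__(self) -> None:
        self.safe_state_current = UiState.NORMAL
        self.safe_state_enter_reason: Optional[str] = None
        self.safe_state_latched_flag = False

    def on_trigger(self, reason: str) -> UiState:
        """Escalate to the state demanded by the reason; never de-escalate here."""
        target = REASON_TO_STATE[reason]
        if target.value > self.safe_state_current.value:
            self.safe_state_current = target
            self.safe_state_enter_reason = reason
            self.safe_state_latched_flag = target is UiState.FAIL_SAFE
        return self.safe_state_current

    def clear(self, authorized: bool) -> UiState:
        """Return to NORMAL; a latched FAIL_SAFE requires an authorized clear."""
        if self.safe_state_latched_flag and not authorized:
            return self.safe_state_current
        self.safe_state_current = UiState.NORMAL
        self.safe_state_enter_reason = None
        self.safe_state_latched_flag = False
        return self.safe_state_current
```

Because escalation only moves upward and the first reason that caused the current state is retained, the logged `safe_state_enter_reason` stays meaningful even when several triggers fire in quick succession.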

Figure F9. A fail-safe HMI is a controlled state machine. Transitions must be triggered by explicit reasons (heartbeat, time-step, log commit failures) and preserve evidence continuity across resets.

Cite this figure: Driver Desk & HMI — Safe-State Machine (F9)

H2-10 · Event Logging & Forensics

Event logging for rail HMIs must be designed as a forensic evidence packet: a bounded pre/post window, aligned timestamps with quality tags, and integrity protection. A record without time quality is not reconstruction-grade and cannot support consistent cross-system timelines.

Trigger windows (pre-buffer + freeze + post-tail)

Pre-trigger ring buffer: always-on rolling cache of key fields for a configurable time/event window.
Trigger freeze: a defined condition seals the start boundary (fault, threshold, operator action, time-step event).
Post-trigger tail: extend until system stabilizes or a fixed tail window is reached; record closure reason.
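The pre-buffer, freeze, and post-tail sequence can be sketched with a bounded deque. This is a simplified model under stated assumptions: records arrive one at a time, the post-tail is a fixed number of records (at least one), and triggers that arrive while a packet is open are folded into its tail. The class name and packet layout are illustrative.

```python
from collections import deque
from typing import Any, Dict, List, Optional


class EvidenceRecorder:
    def __init__(self, pre_len: int = 4, post_len: int = 2):
        self.pre = deque(maxlen=pre_len)   # always-on rolling pre-trigger cache
        self.post_len = post_len           # fixed tail length, assumed >= 1
        self.packet: Optional[Dict[str, Any]] = None
        self.post_remaining = 0
        self.sealed: List[Dict[str, Any]] = []

    def sample(self, record: Any, trigger: Optional[str] = None) -> None:
        if self.packet is None and trigger is not None:
            # Trigger freeze: snapshot the ring buffer as the pre-trigger window.
            self.packet = {"trigger": trigger, "pre": list(self.pre),
                           "event": record, "post": []}
            self.post_remaining = self.post_len
        elif self.packet is not None:
            # Post-trigger tail: keep collecting until the fixed window is reached.
            self.packet["post"].append(record)
            self.post_remaining -= 1
            if self.post_remaining == 0:
                self.packet["closure_reason"] = "tail_window_reached"
                self.sealed.append(self.packet)
                self.packet = None
        # The ring buffer keeps rolling regardless of packet state.
        self.pre.append(record)
```

A production recorder would also close the tail on a stabilization condition and record that as a different closure reason; only the fixed-window case is shown here.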

Evidence sections (minimum contents)

UI action trace

page_id • action_id • input_source • accept/reject_reason • latency_ms

System health

cpu_load • mem_pressure • thermal_state • watchdog_events • safe_state_reason

Network timeline

link_flap • crc_err • reconnect • timeout_cnt • bus_off (CAN)

Power timeline

vin • brownout_flag • rail_ripple_mV • holdup_state

Identity & config

fw_version • ui_build_id • config_profile_id • calibration_id

A/V snapshots

noise_floor • aec_state • buf_underflow_cnt • decoder_load

Timestamp requirements (trust must be explicit)

Dual time basis: monotonic_time for local ordering + wall_time for cross-device alignment.
Time quality tag: timestamp_source, offset class, holdover_state, and time_step markers on every critical event.
Quality transitions: time_quality_change events must be logged; affected windows are tagged as reduced-quality.

Signing & encryption (purpose and boundaries)

Signature: proves integrity (packet header + content hash). Store key_id and signature status for auditability.
Encryption: protects sensitive traces; record encrypt_flag and policy profile without exposing secrets.

Forensics rule: a log line without timestamp quality cannot be reliably aligned across systems. Evidence packets should carry time_quality_tag and a signed content_hash.
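A minimal sketch of sealing and verifying an evidence packet follows. It uses SHA-256 for the content hash and, as a stand-in for a real signature, an HMAC-SHA256 over that hash with a shared key. This substitution is an assumption made to keep the example within the standard library: a deployed system would more likely use an asymmetric signature with keys held in secure hardware. The packet layout and function names are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Any, Dict


def _canonical_bytes(header: Dict[str, Any], sections: Dict[str, Any]) -> bytes:
    """Serialize deterministically so the same content always hashes the same."""
    return json.dumps({"header": header, "sections": sections},
                      sort_keys=True, separators=(",", ":")).encode("utf-8")


def seal_packet(header: Dict[str, Any], sections: Dict[str, Any],
                key: bytes, key_id: str) -> Dict[str, Any]:
    content_hash = hashlib.sha256(_canonical_bytes(header, sections)).hexdigest()
    # HMAC is a symmetric stand-in for a signature (assumption, see lead-in).
    signature = hmac.new(key, content_hash.encode("ascii"), hashlib.sha256).hexdigest()
    footer = {"content_hash": content_hash, "key_id": key_id,
              "signature": signature, "encrypt_flag": False}
    return {"header": header, "sections": sections, "footer": footer}


def verify_packet(packet: Dict[str, Any], key: bytes) -> bool:
    content_hash = hashlib.sha256(
        _canonical_bytes(packet["header"], packet["sections"])).hexdigest()
    expected = hmac.new(key, content_hash.encode("ascii"), hashlib.sha256).hexdigest()
    footer = packet["footer"]
    return (hmac.compare_digest(content_hash, footer["content_hash"])
            and hmac.compare_digest(expected, footer["signature"]))
```

Storing `key_id` in the footer rather than the key itself lets an auditor select the right verification key without the packet exposing any secret.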

Figure F10. An evidence packet is a structured, signed bundle: header identity + timeline index + six evidence sections, then a footer with content hash/signature/encryption flags to support consistent reconstruction.

Cite this figure: Driver Desk & HMI — Evidence Packet (F10)

H2-11 · EMC & Rail Compliance Mapping

A rail HMI must comply with multiple standards, such as EN 50155 (temperature and voltage), EN 50121 (EMC), and EN 61373 (shock and vibration). This section maps these standards to design actions, test evidence, and the log fields needed to demonstrate compliance.

EN 50155 (Temperature & Voltage Compliance)

Design action: Implement thermal derating strategies and define measurement points for critical thermal points.
Test evidence: Thermal chamber curves, testing at critical voltage and transient drop conditions.
Log fields: thermal_state, derate_level, ui_fps, decoder_load, vin_min, brownout_flag.

EN 50121 (EMC Compliance)

Design action: Ensure proper shielding and isolation at input and output connections. Implement common-mode suppression strategies.
Test evidence: Rail ripple measurements, EMC testing under operational conditions.
Log fields: rail_ripple_mV, cm_event_cnt, noise_floor, touch_reset_cnt.

EN 61373 (Vibration Compliance)

Design action: Secure connectors and fixings; ensure resilience against vibration-induced intermittent failures.
Test evidence: Vibration testing and monitoring during operational conditions.
Log fields: link_flap_cnt, crc_err_cnt, log_commit_fail, input_reject_reason.

ESD Path Analysis & Touchscreen Electrostatic Recovery

Design action: Implement effective ESD suppression strategies, with special attention to touchscreen recovery.
Test evidence: Measurement of ESD event counts, recovery times for touchscreens.
Log fields: esd_event_cnt, touch_ctrl_reset_cnt, touch_lockout_ms.

Matrix for Compliance & Test Evidence

Compliance Matrix

Standard → Design Action → Test Evidence → Log Fields (Example format)
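The log-field column of the matrix lends itself to an automated completeness check. The sketch below encodes the fields listed in this section as a lookup and reports which ones a given logger does not export; the dictionary name and function are hypothetical, and the field sets simply mirror the lists above.

```python
from typing import Dict, Iterable, List

# Log fields per compliance area, copied from the lists in this section.
COMPLIANCE_LOG_FIELDS = {
    "EN 50155": {"thermal_state", "derate_level", "ui_fps",
                 "decoder_load", "vin_min", "brownout_flag"},
    "EN 50121": {"rail_ripple_mV", "cm_event_cnt", "noise_floor", "touch_reset_cnt"},
    "EN 61373": {"link_flap_cnt", "crc_err_cnt", "log_commit_fail",
                 "input_reject_reason"},
    "ESD": {"esd_event_cnt", "touch_ctrl_reset_cnt", "touch_lockout_ms"},
}


def missing_log_fields(available_fields: Iterable[str]) -> Dict[str, List[str]]:
    """Return, per compliance area, the required fields the logger does not export."""
    available = set(available_fields)
    report: Dict[str, List[str]] = {}
    for standard, required in COMPLIANCE_LOG_FIELDS.items():
        missing = sorted(required - available)
        if missing:
            report[standard] = missing
    return report
```

Running such a check in continuous integration against the logger's exported schema catches a dropped field before a test campaign, when it is still cheap to fix.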

Figure F11. Noise sources (Power, Amplifier Load, PHY Activity) couple into HMI modules (Mic AFE, Codec/DSP, PA Output), and can be tracked using appropriate evidence fields such as rail_ripple_mV, cm_event_cnt, and noise_floor.

Cite this figure: Driver Desk & HMI — DM/CM Path for HMI (F11)

H2-12 · Validation & Field Debug Playbook

The Validation and Debug Playbook defines how to approach testing, from initial boot verification to network consistency. It ensures that evidence is captured before, during, and after each test phase, including essential waveforms, logged fields, and recovery actions.

Key test phases and required actions

Boot Validation

Must capture: Power supply, reset waveform, UI startup latency. Log: boot_counter, boot_time_ms, ui_ready_ms.

Power Down Validation

Must capture: Power drop behavior, voltage hold-up, shutdown transition. Log: brownout_flag, holdup_state, last_log_commit_ts.

EMI Injection

Must capture: EMI event count, touchscreen recovery, input rejection. Log: esd_event_cnt, touch_ctrl_reset_cnt, touch_lockout_ms.

Long Run (Soak)

Must capture: Memory usage, thermal state, UI frame rate. Log: mem_pressure, thermal_state, derate_level, ui_fps, buf_underflow_cnt.

Network Consistency

Must capture: Timestamp drift, offset, network link health. Log: offset_to_master, time_step_detected, link_flap_cnt.

Test-to-Evidence Matrix

Figure F12. Test-to-evidence matrix maps testing phases (boot, power down, EMI) to the required evidence fields, waveforms, and recovery actions. This matrix ensures the system captures all essential data for validation.

Cite this figure: Driver Desk & HMI — Test-to-Evidence Matrix (F12)

H2-13 · Driver Desk & HMI FAQs

Black screen but system still running — Backlight or SoC?

Conclusion: If the system is responsive but the screen is dark, the issue is typically in the backlight chain rather than the SoC display engine.

Evidence: backlight_pwm_duty > 0 but no luminance output; ui_fps stable and display link status OK (see H2-6, H2-9).

First Fix: Verify backlight enable rail and PWM driver before resetting the SoC.

Ref: H2-6 / H2-9

Touch drifting intermittently — EMI or temperature drift?

Conclusion: Drift during EMI tests indicates common-mode coupling; drift with temperature change suggests calibration shift.

Evidence: cm_event_cnt increases during disturbance; thermal_state correlates with false_touch_event_cnt (see H2-5, H2-11).

First Fix: Validate shielding and common-mode suppression before recalibrating temperature compensation.

Ref: H2-5 / H2-11

Encoder occasional jump — debounce or mechanical wear?

Conclusion: Rapid spike events without mechanical noise indicate debounce filtering issues rather than hardware wear.

Evidence: debounce_error_cnt rising; encoder_signal_quality remains within limits (see H2-5).

First Fix: Increase debounce window and verify mechanical mounting.

Ref: H2-5

Audio hum present — ground loop or DC-DC ripple?

Conclusion: Persistent 50/60 Hz hum points to a ground loop; broadband noise indicates DC-DC ripple injection.

Evidence: noise_floor increase; rail_ripple_mV spikes during load transitions (see H2-7, H2-11).

First Fix: Validate grounding reference before redesigning power filtering.

Ref: H2-7 / H2-11

Log timestamps misaligned — PTP or RTC?

Conclusion: Large offset jumps imply PTP synchronization loss; gradual drift suggests RTC instability.

Evidence: offset_to_master out of range; time_quality_change_event logged (see H2-8, H2-10).

First Fix: Validate PTP hardware timestamp integrity before replacing RTC.

Ref: H2-8 / H2-10

Cold start takes too long — PMIC sequencing or filesystem?

Conclusion: If pmic_startup_time exceeds its limit while fs_mount_time stays within tolerance, the delay before UI ready originates from PMIC rail sequencing rather than storage mount.

Evidence: pmic_startup_time exceeds limit; fs_mount_time within tolerance (see H2-4, H2-12).

First Fix: Validate rail timing and reset release order.

Ref: H2-4 / H2-12

Touch mis-trigger with wet gloves — algorithm or shielding?

Conclusion: Mis-triggers under high moisture more often indicate insufficient rejection filtering than hardware shielding flaws.

Evidence: false_touch_event_cnt increases without cm_event rise; shielding_fault_count stable (see H2-5, H2-11).

First Fix: Adjust sensitivity threshold before redesigning shielding.

Ref: H2-5 / H2-11

Network link drops intermittently — PHY or power transient?

Conclusion: A brownout_flag assertion that coincides with a link_flap_cnt increment indicates a power transient affecting the PHY.

Evidence: vin_min below threshold; phy_reset_cnt logged (see H2-4, H2-8).

First Fix: Validate power stability before replacing PHY device.

Ref: H2-4 / H2-8

Reboot after power dip but logs missing — holdup insufficient?

Conclusion: Missing evidence after a dip most likely indicates an insufficient holdup energy budget for completing log commits.

Evidence: holdup_state insufficient; log_commit_fail incremented (see H2-4, H2-10).

First Fix: Increase holdup capacitance and validate commit window timing.

Ref: H2-4 / H2-10

System freezes during EMI test — common-mode path or shielding gap?

Conclusion: Freeze during high-field injection indicates common-mode coupling rather than logic crash.

Evidence: cm_event_cnt spike; safe_state_reason triggered without watchdog_reset (see H2-11, H2-9).

First Fix: Inspect shielding continuity and isolation boundaries.

Ref: H2-11 / H2-9
