
Sanding & Anti-Slip Control for Rail Rolling Stock


This article covers the design, testing, and maintenance strategies for sanding and anti-slip systems in rail transit. It addresses key topics such as sensor integration, actuator control, slip detection logic, and evidence-based diagnostics. Through detailed analysis and actionable guidelines, the content equips engineers with the knowledge to ensure reliable system performance, minimize false alarms, and maintain long-term operational integrity.

H2-1. Scope & Boundary: What this page covers and what it does not

This page focuses on Sanding & Anti-Slip as an onboard adhesion-assist function: detect slip/slide, command sanding, drive the valve/motor safely, and produce auditable evidence (events, counters, and trigger context). The goal is not to “describe rail systems” but to define an implementable, testable boundary with clear interfaces and evidence outputs.

Rolling stock onboard • Slip/slide detection • Valve/motor actuation • Event triggers & logging • EN 50155 / EN 50121 touchpoints

Primary deliverables of this page: (1) a minimal interface definition, (2) a trustworthy sensor-to-trigger path, (3) actuator drive + diagnostics, and (4) an evidence packet structure that survives EMC and power transients.

Upstream inputs (interface level only):

  • Wheelset sensing: axle speed sources such as encoder/VR/Hall (handled here only as conditioned inputs and quality flags).
  • Motion confirmation: acceleration channel(s) to suppress false positives and to validate rapid changes.
  • Commands: driver/TCMS sanding enable/disable and mode selection (no TCMS internal architecture here).
  • Traction/brake requests: “request/state lines” only (no traction inverter or brake controller internal algorithms).

System outputs (what this page owns):

  • Actuation: sand valve drive and/or auger/feeder motor control, including safe defaults and diagnostics.
  • Dosing primitives: commanded duty/level with speed-aware limiting and anti-chatter behavior.
  • Evidence: event triggers, fault codes, counters, and a compact record suitable for post-incident review.

Environmental constraints (rail-specific):

  • EN 50155: wide input variations, undervoltage dips, temperature extremes, and restart behavior that must not corrupt evidence.
  • EN 50121: a harsh EMC environment in which false triggers often originate from common-mode coupling and command-line glitches.

Explicit non-scope (to prevent overlap):

  • Traction inverter power stage design, IGBT/SiC gate-driver deep design, or DC-link control internals.
  • Brake control unit internal closed-loop algorithms and pneumatic/hydraulic control details (only interface-level signals are referenced).
  • CBTC/ETCS signaling, passenger systems, wayside/station systems, and traction substations.

[Figure F1: system context block diagram. Wheelset sensors (conditioned VR/Hall/encoder speed, IMU accel) feed the anti-slip logic (quality gating + hysteresis, slip score + confidence, trigger rules), which drives the sanding actuators (valve/motor; duty/level, current signature, jam/open/short flags) and the event recorder. Minimum evidence packet: timestamp + source; slip_score + confidence; cmd_duty + actuator_current; threshold_version + reasons. Driver/TCMS commands and traction/brake request lines enter as signals only; EN 50155/50121 constraints apply here.]
Figure F1. System context (interface-level): wheelset sensors → anti-slip logic → sanding actuators → evidence recorder. Labels emphasize evidence outputs and non-overlap boundaries.

H2-2. User Intent & Failure Narrative: What engineers actually search for

Sanding and anti-slip functions are rarely “tuned for comfort.” They are deployed to control risk under low adhesion (rain, snow, leaf film, oil contamination) while keeping the system measurable and defensible. Real-world searches cluster around a small set of failure narratives. Each narrative should map to: mechanism → evidence fields → first corrective action.

Narrative A — Slip detection is unreliable (false positives or missed events).

  • Likely mechanism: speed input quality collapses (jitter, dropout, saturation), or detection lacks stable hysteresis/gating.
  • Evidence to capture: speed_quality_flag, jitter_est, dropout_count, enter_reason, exit_reason.
  • First fix: enforce quality gating (do not confirm slip when the input is not trustworthy), then widen hysteresis to prevent chatter.

Narrative B — Sanding “works,” but slip persists.

  • Likely mechanism: actuator response is slow or dosing is not matched to speed and conditions (insufficient delivery, nozzle clog, valve stick).
  • Evidence to capture: cmd_duty, response_ms, actuator_current_peak, jam_flag, sand_rate_est.
  • First fix: verify response time and current signature on a confirmed command; if response is late/abnormal, treat it as delivery failure before retuning thresholds.

Narrative C — Sanding triggers, but causes secondary issues (overuse, contamination, nuisance).

  • Likely mechanism: triggers are driven by EMC/power glitches or overly permissive gating; the state machine is being pushed by spurious edges.
  • Evidence to capture: input_glitch_count, cm_noise_flag, brownout_counter, reset_reason, threshold_version.
  • First fix: tighten gating to require persistence and context (speed band + traction/brake state + quality flag), then address EMC coupling paths.

Narrative D — Audit/maintenance requires proof (post-incident defensibility).

  • Likely mechanism: the system cannot prove “what happened” because triggers lack context, timestamps are untrusted, or pre/post buffers are missing.
  • Evidence to capture: timestamp_source, event_id, pre_window_s, post_window_s, plus the detection state and actuator feedback.
  • First fix: implement a minimal evidence packet with trigger context and timebase health; tuning is secondary if proof is not reproducible.

A practical rule: treat “mechanism guessing” as a last resort. A reliable anti-slip system should narrow each narrative to a short list of evidence checks and a single first corrective action.

[Figure F2: failure → evidence funnel. Narratives A to D above, each laid out as phenomenon → likely mechanism → evidence to capture → first fix.]
Figure F2. Failure → evidence funnel: each common field narrative is reduced to mechanism, required evidence fields, and the first corrective action.

H2-3. Architecture Decomposition: Modules to implement

A sanding and anti-slip function becomes reliable only when it is decomposed into explicit modules with interface contracts. Each module below lists: inputs, outputs, isolation boundary, and diagnostic/evidence fields. Traction, brake, and TCMS are referenced only as interface-level signals.

Sensor → Quality → Detection • Actuation + Feedback • Safety interlocks • Evidence packet • Diag interface (isolated)

Module A — Sensor front-end (speed + acceleration)

Inputs: VR/Hall/encoder pulses (conditioned), accel channel(s), supply/ground reference health
Outputs: speed_raw, speed_filtered, accel_raw, quality_flag, signal_present, jitter_est
Isolation boundary: Isolate sensor domain when long harness/common-mode noise can corrupt logic ground
Key diagnostics: dropout_count, saturation_flag, estimator_mode (period/edge), min_edge_interval

Module B — Slip detection core (score + gating + hysteresis)

Inputs: speed_filtered, dv/dt estimate, accel_consistency_flag, quality_flag, traction/brake request lines (signal only)
Outputs: slip_score, confidence, enter_reason, exit_reason, trigger_candidate
Isolation boundary: Logic domain separated from actuator power return to reduce false edges coupling into detection
Key diagnostics: hysteresis_state, gating_mask, dwell_ms, inconsistency_count

Module C — Actuator control (valve / motor drive + sensing)

Inputs: sanding_cmd_level, mode, interlock_ok, supply_ok, temperature_ok
Outputs: cmd_duty, actuator_current_peak, response_ms, jam_flag, open_short_flag
Isolation boundary: Driver stage separated/isolated where common-mode and surge currents are expected
Key diagnostics: cycle_count, current_signature_id, overcurrent_trip_reason, actuator_temp

Module D — Safety & interlocks (inhibit + fail-safe)

Inputs: watchdog, supply_ok, sensor_health, maintenance_mode, inhibit lines
Outputs: inhibit_reason, safe_default_state, reset_policy, latch_policy
Isolation boundary: Safety decision signals protected from actuator noise and bus chatter
Key diagnostics: brownout_counter, reset_reason, watchdog_trip_count, safe_state_entries

Module E — Event logging (trigger rules + evidence packet)

Inputs: trigger_candidate, fault flags, interlock transitions, timebase health
Outputs: event_id, pre/post buffers, timestamp, threshold_version, compact evidence packet
Isolation boundary: Log commit path protected from power dips; evidence must survive resets
Key diagnostics: commit_status, buffer_overflow_count, packet_drop_count, time_sync_status

Module F — Communications (diagnostic interface only)

Inputs: diag requests, time sync status (if present), read/write access rules
Outputs: fault summary, counters, latest events, configuration versioning
Isolation boundary: Isolated transceivers for rail comms; suppress common-mode injection into logic
Key diagnostics: crc_error_count, link_reset_count, auth/role (if applicable)

A practical implementation pattern: treat evidence generation as a first-class module. When slip detection, actuation, and logging share a clear contract, false triggers become diagnosable rather than “mysterious.”

[Figure F3: architecture blocks and interface contracts. A) Sensors & AFE (VR/Hall/encoder, accel input; outputs speed_filtered, quality_flag) → B) Detection core (slip score, gating + hysteresis; outputs trigger_candidate, confidence) → C) Actuation (valve driver, motor drive; feedback actuator_current, jam_flag) → E/F) Evidence (event pre/post, timestamp health) and isolated diagnostic bus. D) Safety & interlocks: inhibit reasons, fail-safe default, watchdog; brownout_counter, reset_reason, safe_state_entries. An isolation boundary and a feedback → evidence path are marked.]
Figure F3. Architecture decomposition: sensor domain → detection core → actuator control, with safety interlocks and evidence/diagnostic interfaces as first-class modules.

H2-4. Speed & Acceleration AFE Design: Make slip detection trustworthy

Slip/slide decisions are only as trustworthy as the speed and acceleration inputs. This section focuses on interface-level signal conditioning, estimator selection, and the generation of quality flags that gate detection. When input quality is unknown, detection should downgrade to “suspect” rather than confirm and trigger sanding.

4.1 Speed input families (interface behavior)

  • VR (variable reluctance): amplitude changes with speed; low-speed edges are fragile and often require adaptive thresholds.
  • Hall/MR: digital-like edges; sensitive to wiring reference and EMC spikes that can look like valid pulses.
  • Encoder: high pulse density; robust at medium/high speed, but low-speed bounce and missing edges can create false micro-speed.

4.2 Conditioning primitives (noise immunity)

  • Schmitt / hysteresis shaping: suppresses edge chatter and short spikes that would inflate speed.
  • Adaptive thresholding (VR): tracks amplitude changes but must avoid “following noise” under strong EMI.
  • Debounce windows: enforce a minimum edge interval to reject impossible pulse rates.

4.3 Speed estimation choice: period capture vs edge count

Two estimators dominate rail speed capture: period capture performs better at low speed where few edges exist, while edge counting stabilizes at high speed. A stable system defines an estimator_mode and uses a switch hysteresis band to avoid mode-chatter.

Period capture (low speed): Measures time between edges; requires timeout rules to detect dropout and to support zero-speed hold.
Edge count (higher speed): Counts edges per window; requires window sizing to balance responsiveness vs noise sensitivity.
Mode switch guard: Use a hysteresis band + minimum dwell time to prevent estimator_mode toggling during EMI bursts.
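
The mode switch guard can be sketched as a small selector with a hysteresis band and a minimum dwell time. This is a minimal illustration, not a reference implementation: the class name, the pulse-rate thresholds (`up_hz`, `down_hz`), and the dwell value are assumptions chosen for the example.

```python
from dataclasses import dataclass

PERIOD = "period"  # period capture (low speed)
EDGE = "edge"      # edge counting (higher speed)


@dataclass
class EstimatorModeSwitch:
    """Selects estimator_mode with a hysteresis band and a minimum dwell."""
    up_hz: float = 120.0     # assumed: go to edge counting above this pulse rate
    down_hz: float = 80.0    # assumed: return to period capture below this rate
    min_dwell_ms: int = 200  # assumed: minimum time between mode changes
    mode: str = PERIOD
    _since_change_ms: int = 0

    def update(self, pulse_hz: float, dt_ms: int) -> str:
        self._since_change_ms += dt_ms
        # Hold the current mode until the dwell time has elapsed.
        if self._since_change_ms < self.min_dwell_ms:
            return self.mode
        if self.mode == PERIOD and pulse_hz > self.up_hz:
            self.mode, self._since_change_ms = EDGE, 0
        elif self.mode == EDGE and pulse_hz < self.down_hz:
            self.mode, self._since_change_ms = PERIOD, 0
        return self.mode
```

Rates between `down_hz` and `up_hz` never change the mode, and a short burst right after a switch is ignored because the dwell timer has not yet expired.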

4.4 Low/zero speed reliability

  • Signal-present gating: explicitly track signal_present and dropout_count rather than assuming missing edges imply zero speed.
  • Zero-speed hold: define zero_speed_hold_ms to avoid rapid flip-flopping when edges vanish at very low speed.
  • Minimum edge interval: reject physically impossible pulses using min_edge_interval as a hard filter.
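
A minimal sketch of these three rules in a period-capture estimator follows. The tooth count, wheel circumference, and all timing values are placeholder assumptions; only `signal_present`, `dropout_count`, `zero_speed_hold_ms`, and `min_edge_interval` correspond to fields named in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeriodCapture:
    pulses_per_rev: int = 60             # assumed tone-wheel tooth count
    wheel_circumference_m: float = 2.9   # assumed wheel circumference
    min_edge_interval_us: int = 200      # hard filter for impossible pulse rates
    edge_timeout_us: int = 200_000       # assumed silence before a dropout is declared
    zero_speed_hold_ms: int = 500        # assumed hold before reporting zero speed
    last_edge_us: Optional[int] = None
    speed_mps: float = 0.0
    signal_present: bool = False
    dropout_count: int = 0
    rejected_edges: int = 0

    def on_edge(self, t_us: int) -> None:
        if self.last_edge_us is not None:
            period_us = t_us - self.last_edge_us
            if period_us < self.min_edge_interval_us:
                self.rejected_edges += 1  # physically impossible pulse: ignore it
                return
            metres_per_pulse = self.wheel_circumference_m / self.pulses_per_rev
            self.speed_mps = metres_per_pulse / (period_us * 1e-6)
        self.last_edge_us = t_us
        self.signal_present = True

    def on_tick(self, now_us: int) -> None:
        if self.last_edge_us is None:
            return
        silent_us = now_us - self.last_edge_us
        # Missing edges are first a dropout, not proof of zero speed.
        if self.signal_present and silent_us > self.edge_timeout_us:
            self.signal_present = False
            self.dropout_count += 1
        # Only after the hold time does the estimate fall to zero.
        if silent_us > self.zero_speed_hold_ms * 1000:
            self.speed_mps = 0.0
```

Keeping the dropout decision separate from the zero-speed decision is the point of the sketch: the logger can see `signal_present` go false while the speed estimate is still held.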

4.5 Acceleration as a truth-check

Acceleration is used as a consistency check: if speed-derived changes are not consistent with acceleration trends, the input quality is downgraded. This reduces nuisance confirmations caused by track vibration, wiring motion, or EMC spikes on speed edges.

Suggested accel flags: accel_saturation_flag • accel_consistency_flag • accel_noise_est
Detection impact: When accel consistency fails, allow “suspect” only; block confirmation unless persistence and context are strong.
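
One simple way to derive accel_consistency_flag is to compare the speed-derived dv/dt over a short window with the mean measured acceleration. The sketch below assumes equally spaced samples and an illustrative tolerance (`tol_mps2`); how wide that tolerance must be so that a genuine slip is not misread as an input fault is a tuning question this example does not answer.

```python
def accel_consistency_flag(speed_mps, accel_mps2, dt_s, tol_mps2=1.5):
    """Return True when speed-derived dv/dt agrees with measured acceleration.

    speed_mps and accel_mps2 are equal-length sample windows taken every dt_s
    seconds. tol_mps2 is an assumed tolerance, not a recommended value.
    """
    if len(speed_mps) < 2 or len(speed_mps) != len(accel_mps2):
        return False  # not enough data to claim consistency
    window_s = dt_s * (len(speed_mps) - 1)
    dv_dt = (speed_mps[-1] - speed_mps[0]) / window_s
    accel_mean = sum(accel_mps2) / len(accel_mps2)
    return abs(dv_dt - accel_mean) <= tol_mps2
```

A steady 1 m/s² ramp on both channels passes; a 4 m/s step on the speed channel with a flat accelerometer fails, which is the pattern an EMC spike on the speed edges would produce.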

Minimum evidence fields produced by AFE: speed_raw, speed_filtered, accel_raw, quality_flag, signal_present, jitter_est, dropout_count, estimator_mode.

[Figure F4: signal integrity and quality flags (gate before detect). Example waveforms: normal, noise spikes, dropout, jitter. Flags: signal_present (edge timeout / dropout count), jitter_est (min edge interval / debounce), saturation_flag (clip / overload detection), summarized as quality_flag: OK → allow confirm; LOW → suspect only.]
Figure F4. Speed/accel signal integrity: waveform patterns map to quality flags (signal present, jitter, saturation) that gate detection before confirmation.

H2-5. Slip/Slide Detection Logic: Thresholds, hysteresis, and gating

Slip/slide triggering should not be treated as a single threshold. A reliable design uses state progression plus evidence fields: input quality gating, context gating (speed band and traction/brake state), hysteresis, and multi-axle consistency checks. This produces a defensible record of why the system entered and why it exited.

5.1 Definitions (what “slip/slide” means at interface level)

  • Slip score: derived from Δv and/or dv/dt between wheel speed and a reference speed.
  • Reference speed: selected from trusted wheel inputs (e.g., neighbor/median) under quality constraints; output ref_source_id.
  • Direction awareness: traction state and brake state are used as context (interface-only) to interpret sign and severity.
Core outputs (minimum): slip_score • confidence • enter_reason • exit_reason • ref_source_id
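
As a rough illustration of these definitions, the sketch below computes a per-axle slip score as the relative speed difference against a median reference built from quality-approved axles. The median choice, the `min_ref_speed` floor, and the returned `ref_source_id` strings are assumptions for the example; a median reference is also wrong when most axles slip together, which this sketch does not handle.

```python
from statistics import median


def slip_scores(axle_speeds, quality_ok, min_ref_speed=1.0):
    """Return (scores, ref_speed, ref_source_id).

    Positive score: wheel faster than the reference (slip under traction).
    Negative score: wheel slower than the reference (slide under braking).
    """
    trusted = [v for v, ok in zip(axle_speeds, quality_ok) if ok]
    if not trusted:
        return None, None, "none"  # no trustworthy reference available
    ref = median(trusted)
    denom = max(ref, min_ref_speed)  # avoid dividing by a near-zero reference
    scores = [(v - ref) / denom for v in axle_speeds]
    return scores, ref, "median_trusted"
```

Excluding a low-quality axle from the reference keeps one corrupted input from shifting the score of every other axle.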

5.2 Gating (when detection is allowed to confirm)

Gating prevents nuisance confirmations during low-speed ambiguity, sensor quality collapse, and rail EMC/power disturbances. A practical approach is to compute a gating_mask and a gating_block_reason that can be logged.

  • Speed band gating: disable confirmation below a defined speed floor; allow “suspect” only.
  • Context gating: require consistent traction/brake state lines (signal only) before confirmation.
  • Quality gating: block confirmation when quality_flag is LOW; keep state at SUSPECT.
  • Health gating: block confirmation on brownout/overtemp conditions; record health_ok and reset_reason.
Gating evidence fields: gating_mask • gating_block_reason • quality_flag • health_ok • traction_state • brake_state
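
The four gates can be encoded as a bitmask so that a single logged integer explains why confirmation was blocked. In this sketch the bit assignments, the speed floor, and the reason strings are illustrative assumptions, and "inconsistent context" is taken to mean traction and brake asserted at the same time.

```python
from enum import IntFlag


class Gate(IntFlag):
    NONE = 0
    SPEED_BAND = 1  # below the speed floor
    CONTEXT = 2     # traction/brake lines inconsistent
    QUALITY = 4     # quality_flag is LOW
    HEALTH = 8      # brownout/overtemp or other health failure


_REASONS = [
    (Gate.HEALTH, "health"),
    (Gate.QUALITY, "quality"),
    (Gate.CONTEXT, "context"),
    (Gate.SPEED_BAND, "speed_band"),
]


def compute_gating(speed_mps, traction_active, brake_active,
                   quality_ok, health_ok, speed_floor_mps=1.5):
    """Return (gating_mask, gating_block_reason); mask 0 means confirm allowed."""
    mask = Gate.NONE
    if speed_mps < speed_floor_mps:
        mask |= Gate.SPEED_BAND
    if traction_active and brake_active:
        mask |= Gate.CONTEXT
    if not quality_ok:
        mask |= Gate.QUALITY
    if not health_ok:
        mask |= Gate.HEALTH
    # Report the highest-priority blocking gate as the single reason code.
    for flag, name in _REASONS:
        if mask & flag:
            return mask, name
    return mask, "none"
```

Logging the full mask alongside the single reason keeps the record compact while still showing when several gates blocked at once.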

5.3 Hysteresis (avoid chatter and edge-driven triggers)

  • Enter/exit separation: use distinct enter and exit thresholds to avoid boundary oscillation.
  • Minimum dwell time: require persistence (enter_dwell_ms) to enter CONFIRMED and persistence (exit_dwell_ms) to leave it.
  • Reason codes: log enter_reason and exit_reason as first-class evidence, not debug text.

5.4 Multi-axle consistency (reduce false positives)

A single axle reporting a severe slip score while neighbor axles remain stable and high-quality often indicates sensor corruption rather than true adhesion loss. Consistency checks should influence confidence and can be encoded as neighbor_consistency_ok.

Consistency evidence fields: neighbor_consistency_ok • consistency_count • ref_source_id • confidence
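
A minimal sketch of the neighbor check: count how many high-quality neighbor axles show at least a mild score, and treat the axle as consistent only when enough of them agree. The `neighbor_thr` and `min_agreeing` values are assumptions, and the policy of scaling confidence from the result is left to the caller.

```python
def neighbor_consistency(scores, quality_ok, axle, neighbor_thr=0.05, min_agreeing=1):
    """Return (neighbor_consistency_ok, consistency_count) for one axle.

    scores: per-axle slip scores; quality_ok: per-axle quality gate results.
    A neighbor "agrees" when it is trusted and its |score| reaches neighbor_thr.
    """
    count = 0
    for i, (score, ok) in enumerate(zip(scores, quality_ok)):
        if i == axle or not ok:
            continue
        if abs(score) >= neighbor_thr:
            count += 1
    return count >= min_agreeing, count
```

With scores of `[0.0, 0.01, 0.3, 0.0]`, axle 2 stands alone and the check fails, which matches the "single severe axle, stable neighbors" pattern described above.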

Recommended logging rule: every state transition must produce a small record containing slip_score + confidence + gating_mask + enter/exit_reason. Without these fields, tuning becomes guesswork and post-incident review becomes inconclusive.

[Figure F5: detection state machine (state + evidence): NORMAL (stable speed, no trigger) → SUSPECT (evidence building, no confirm yet) → CONFIRMED (trigger sanding, log event) → RECOVERY (exit + cooldown, avoid re-trigger) → NORMAL.]

  • NORMAL → SUSPECT: cond slip_score > pre_thr; log enter_reason (SUSPECT).
  • SUSPECT → CONFIRMED: cond gating OK + enter_thr + dwell; log gating_mask, confidence, enter_reason.
  • CONFIRMED → RECOVERY: cond slip_score < exit_thr + dwell; log exit_reason, confidence.
  • RECOVERY → NORMAL: cond cooldown done; log state_exit + counters.
  • Gating inputs: quality, speed band, health, context. Multi-axle: neighbor_consistency_ok → confidence.
Figure F5. Detection is a state machine with evidence. Each transition emits reason codes and confidence fields for auditable behavior.
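
The four-state progression can be sketched as follows. Threshold and timing values are placeholders, the reason strings are invented for the example, and `gating_mask == 0` is assumed to mean "confirmation allowed". Every transition appends a record carrying slip_score, confidence, gating_mask, and a reason code.

```python
from dataclasses import dataclass, field

NORMAL, SUSPECT, CONFIRMED, RECOVERY = "NORMAL", "SUSPECT", "CONFIRMED", "RECOVERY"


@dataclass
class SlipStateMachine:
    pre_thr: float = 0.05      # assumed: score that starts evidence building
    enter_thr: float = 0.10    # assumed: score required to confirm
    exit_thr: float = 0.04     # assumed: score required to leave CONFIRMED
    enter_dwell_ms: int = 100
    exit_dwell_ms: int = 300
    cooldown_ms: int = 1000
    state: str = NORMAL
    timer_ms: int = 0
    log: list = field(default_factory=list)

    def _go(self, new_state, reason, slip_score, confidence, gating_mask):
        # One small record per transition: the minimum logging rule.
        self.log.append({"from": self.state, "to": new_state, "reason": reason,
                         "slip_score": slip_score, "confidence": confidence,
                         "gating_mask": gating_mask})
        self.state, self.timer_ms = new_state, 0

    def step(self, slip_score, confidence, gating_mask, dt_ms):
        args = (slip_score, confidence, gating_mask)
        if self.state == NORMAL:
            if slip_score > self.pre_thr:
                self._go(SUSPECT, "score_above_pre_thr", *args)
        elif self.state == SUSPECT:
            if slip_score <= self.pre_thr:
                self._go(NORMAL, "score_cleared", *args)
            elif slip_score > self.enter_thr and gating_mask == 0:
                self.timer_ms += dt_ms
                if self.timer_ms >= self.enter_dwell_ms:
                    self._go(CONFIRMED, "enter_thr_and_dwell", *args)
            else:
                self.timer_ms = 0  # gated or below enter_thr: persistence restarts
        elif self.state == CONFIRMED:
            if slip_score < self.exit_thr:
                self.timer_ms += dt_ms
                if self.timer_ms >= self.exit_dwell_ms:
                    self._go(RECOVERY, "exit_thr_and_dwell", *args)
            else:
                self.timer_ms = 0
        elif self.state == RECOVERY:
            self.timer_ms += dt_ms
            if self.timer_ms >= self.cooldown_ms:
                self._go(NORMAL, "cooldown_done", *args)
        return self.state
```

While any gate is set, a high score keeps the machine in SUSPECT and resets the dwell timer, so confirmation always requires uninterrupted persistence with gating clear.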

H2-6. Actuator Control: Valve/motor drive, dosing, and jam detection

Actuator control must convert a detection outcome into a measurable delivery action: command level, response timing, and current signature. This section covers valve and small motor actuation at the interface level, including protection, dosing curves, and jam/open detection without expanding into onboard PDU or brake controller internals.

6.1 Actuator types (implementation primitives)

  • Solenoid valve: PWM with “pull-in” and “hold” behavior; current signature is the primary evidence of motion.
  • Feeder motor / auger: duty-based control with current-based stall detection; advanced motor control is out of scope.

6.2 Drive protection and diagnosability

  • Overcurrent/short: trip fast, latch per policy, record overcurrent_trip_reason.
  • Open/line break: command present but current signature missing; raise open_short_flag.
  • Overtemperature: inhibit or derate; record inhibit_reason and temp_ok.
  • Back-EMF / harness stress: clamp and protect; evidence should not be lost during transient handling.
Protection evidence fields: open_short_flag • overcurrent_trip_reason • inhibit_reason • cycle_count • actuator_temp

6.3 Dosing control (speed-aware curve + limits)

A dosing strategy should be defined as a curve (or table) tied to operating context rather than a fixed duty. Typical rail constraints include low-speed compensation, high-speed limiting, and sand-rate budgeting to prevent overuse.

  • Low-speed compensation: stabilize delivery when speed estimation is sparse and adhesion changes rapidly.
  • High-speed limiting: cap sand rate to reduce contamination and excessive consumption.
  • Rate limiting / cooldown: prevent repeated triggers from exhausting sand; log rate_limit_active.
Dosing evidence fields: cmd_duty • sand_rate_est • curve_id • rate_limit_active • cooldown_active
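
A dosing curve can be sketched as a breakpoint table with linear interpolation, a duty cap, and a cooldown override. The breakpoints below are purely illustrative placeholders, not recommended values, and the shape of a real curve depends on the sander and the operator's rules.

```python
from bisect import bisect_right

# (speed_mps, duty) breakpoints: illustrative placeholders only.
CURVE = [(0.0, 0.30), (5.0, 0.60), (20.0, 0.80), (40.0, 0.50)]


def dosing_duty(speed_mps, curve=CURVE, cooldown_active=False, max_duty=0.8):
    """Return cmd_duty for the given speed from a piecewise-linear curve."""
    if cooldown_active:
        return 0.0  # rate limit / cooldown wins over the curve
    xs = [point[0] for point in curve]
    if speed_mps <= xs[0]:
        duty = curve[0][1]
    elif speed_mps >= xs[-1]:
        duty = curve[-1][1]
    else:
        i = bisect_right(xs, speed_mps)
        (x0, y0), (x1, y1) = curve[i - 1], curve[i]
        duty = y0 + (y1 - y0) * (speed_mps - x0) / (x1 - x0)
    return min(duty, max_duty)
```

Logging a `curve_id` next to the resulting `cmd_duty` lets a reviewer reproduce the commanded level from the table that was active at the time.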

6.4 Jam / clog detection (current signature + response time)

Jam detection is most robust when it uses two independent cues: current signature classification and response time window. This separates normal motion, mechanical stall, and open-circuit behavior.

Core actuator outputs (minimum): cmd_duty • drv_current_peak • response_ms • jam_flag • sand_rate_est

Logging rule: on every trigger, record cmd_duty, drv_current_peak, and response_ms. If sanding “was commanded” but these fields do not confirm action, the system should treat it as a delivery failure rather than tuning a detection threshold.
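
The two-cue idea can be sketched as a small classifier: one cue from the current signature (does the current settle below its peak?) and one from the response window (did a response arrive in time?). All thresholds, the `steady_current` input, and the returned labels are assumptions for illustration.

```python
def classify_actuation(cmd_duty, drv_current_peak, steady_current, response_ms,
                       open_max_a=0.05, stall_ratio=0.9, response_window_ms=80):
    """Classify one actuation as idle, open, jam, suspect, or normal.

    response_ms is None when no response was observed inside the window.
    """
    if cmd_duty <= 0:
        return "idle"
    if drv_current_peak < open_max_a:
        return "open"  # command present but almost no current flows
    # Cue 1: current stays near its peak instead of settling.
    no_settle = steady_current >= stall_ratio * drv_current_peak
    # Cue 2: response missing or outside the allowed window.
    late = response_ms is None or response_ms > response_window_ms
    if no_settle and late:
        return "jam"
    if no_settle or late:
        return "suspect"  # only one cue fired: log it, do not latch a jam
    return "normal"
```

Requiring both cues before declaring a jam keeps a single noisy measurement from latching a fault, while the "suspect" outcome still leaves a trace in the record.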

[Figure F6: actuator drive chain (cmd_duty → driver → actuator → current_sense) in three cases. Normal: current peaks, then settles to a steady level. Jam/stall: high current with no settle. Open/no load: current near zero. A measurement window (window_ms) yields response_ms and the current peak; flags: jam_flag, open_short_flag, trip_reason.]
Figure F6. Current signature as evidence: normal vs jam/stall vs open/no-load. The response window produces drv_current_peak and response_ms for diagnosable delivery.

H2-7. Coordination Interfaces: How sanding interacts with traction/brake (signals only)

This section defines the coordination contract between the sanding/anti-slip function and upstream traction/brake systems using signals only. It avoids traction inverter and brake unit internal algorithms and focuses on interface reliability: priority rules, debounce under EN 50121 disturbance, and evidence-friendly outputs.

signals only • priority & inhibit • debounce / anti-glitch • event_id linkage

7.1 Interface I/O (contract-level signals)

Inputs (from upstream): traction_active • brake_active • wheel_slip_request (status/request only)
Outputs (to upstream): sanding_active • inhibit_reason • fault_status • event_id
Timing intent: stable input windows → confirmed slip → command → feedback → event commit

7.2 Priority model (safety inhibition first)

Coordination is safest when inhibition is explicit and enumerable. The output inhibit_reason should be treated as a primary interface signal, not an internal detail, and it should be logged with event_id when it changes.

  • Safety inhibition: maintenance mode, diagnostic failure, supply/temperature unhealthy, or system-wide inhibit lines.
  • Interlock inhibition: upstream indicates “do not sand” for the current context (signal only).
  • Resource inhibition: cooldown/limit strategy active to prevent excessive consumption; report as a reason code.
Priority evidence fields: inhibit_reason • fault_status • health_ok • cooldown_active • event_id

7.3 Debounce and consistency (EN 50121 anti-glitch)

Long harnesses and strong common-mode noise can create short pulses that appear as valid status transitions. Interface protection should therefore combine debounce windows and consistency checks before state acceptance.

  • Debounce window: accept traction/brake state changes only after stability for if_debounce_ms.
  • Glitch counters: record if_glitch_count and optional if_inconsistency_flag when conflicting states occur.
  • Consistency gating: if traction_active and brake_active are simultaneously asserted in an invalid context, block confirmation and log a reason.
Interface robustness fields: if_debounce_ms • if_glitch_count • if_state_stable • if_inconsistency_flag

Recommended contract rule: sanding_active should represent the real “action-active” state (confirmed command + verified response), not just a software request. This prevents upstream systems from assuming delivery when actuation failed.
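
The debounce window and glitch counter from 7.3 can be sketched as a per-line filter. The class name and the 20 ms default are assumptions; a "glitch" here means a pending state change that reverted before it had been stable for if_debounce_ms.

```python
from dataclasses import dataclass


@dataclass
class DebouncedInput:
    if_debounce_ms: int = 20   # assumed stability requirement
    stable: bool = False       # last accepted state of the line
    if_glitch_count: int = 0
    _candidate: bool = False
    _candidate_ms: int = 0

    def sample(self, raw: bool, dt_ms: int) -> bool:
        if raw == self.stable:
            if self._candidate != self.stable:
                self.if_glitch_count += 1  # pending change reverted: count it
            self._candidate, self._candidate_ms = self.stable, 0
            return self.stable
        if raw != self._candidate:
            self._candidate, self._candidate_ms = raw, 0  # new pending change
        self._candidate_ms += dt_ms
        if self._candidate_ms >= self.if_debounce_ms:
            self.stable, self._candidate_ms = raw, 0      # accept after stability
        return self.stable
```

One instance per status line (for example traction_active and brake_active) gives each line its own glitch counter, which makes it easier to tell which harness is picking up noise.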

[Figure F7: interface timing and debounce over t0 to t5, in four lanes. Inputs: traction_active and brake_active, with a glitch ignored by the debounce window. Detection: slip_confirmed after enter_dwell_ms + gating_mask. Actuation: sanding_cmd (cmd_duty), then feedback (drv_current_peak, response_ms). Logging: event commit, with event_id propagated upstream.]
Figure F7. Interface timing: stable upstream inputs → confirmed slip → sanding command → verified feedback → event commit and event_id linkage.

H2-8. Event Triggers & Black-Box Logging: Evidence that survives audits

Black-box logging is the page’s differentiator: it turns detection and actuation into auditable evidence. A practical design uses enumerable trigger types, fixed pre/post windows, and a compact evidence packet that survives resets. Cryptographic integrity is referenced at the interface level only (hash/signature presence) without expanding into security architecture.

8.1 Trigger taxonomy (enumerable events)

  • slip_confirmed — detection reached CONFIRMED (include slip_score, confidence, gating_mask).
  • sanding_cmd — command issued (include cmd_duty, curve_id, cooldown/rate-limit status).
  • actuator_fault — jam/open/overcurrent/overtemp (include trip_reason, jam_flag, open_short_flag).
  • sensor_quality_drop — quality gating degraded (include quality_flag, dropout_count, jitter_est).
  • brownout_reset — reset/power dip (include supply_v summary, reset_reason, counters).
Event identity fields: event_type • event_id • event_seq • threshold_version • time_sync_status

8.2 Pre/post windows (why they exist)

Pre/post windows provide context: pre-window reveals whether input quality or supply conditions degraded before the trigger, and post-window confirms whether actuation produced a measurable response and whether recovery occurred. Windows should be configurable and recorded as configuration evidence.

Window configuration: window_pre_ms (e.g., 1000–2000) • window_post_ms (e.g., 3000–5000) • threshold_version

8.3 Minimum evidence set (audit-friendly layers)

Header: timestamp • event_id • event_type • threshold_version • time_sync_status
Context: traction_state • brake_state • speed_band • health_ok • inhibit_reason
Detection: slip_score • confidence • gating_mask • enter_reason • state (NORMAL/SUSPECT/CONFIRMED/RECOVERY)
Actuation: cmd_duty • drv_current_peak • response_ms • jam_flag • open_short_flag • sand_rate_est
Power/EMC: supply_v_summary • reset_reason • brownout_counter • if_glitch_count

8.4 Integrity hooks (interface-level only)

If a security module is present, the event packet can include a hash and a signature status field. This provides tamper-evidence without describing key management or remote attestation flows on this page.

Integrity fields: packet_hash • signature_status • security_module_present
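
The layered evidence set and the hash field can be sketched as a small record type with a deterministic serialization. JSON and SHA-256 are assumptions chosen for readability; the page does not specify an encoding or hash algorithm, and signing is deliberately left out.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass
class EvidencePacket:
    # Header layer
    timestamp_ms: int
    event_id: int
    event_type: str
    threshold_version: str
    time_sync_status: str
    # Remaining layers, kept as plain dicts in this sketch
    context: dict
    detection: dict
    actuation: dict
    power_emc: dict

    def to_bytes(self) -> bytes:
        # Sorted keys and fixed separators make the encoding deterministic,
        # so the same content always produces the same hash.
        return json.dumps(asdict(self), sort_keys=True,
                          separators=(",", ":")).encode("utf-8")

    def packet_hash(self) -> str:
        return hashlib.sha256(self.to_bytes()).hexdigest()
```

Because the hash covers every layer, a later change to any field (for example `slip_score`) produces a different `packet_hash`, which is the tamper-evidence property the integrity fields are meant to expose.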

Implementation rule: event commit should be designed to survive resets. If a brownout occurs during commit, the system should record commit_status and preserve the last valid event pointer rather than silently dropping evidence.
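
One common way to keep a "last valid event pointer" across an interrupted commit is to give each stored record a sequence number and a CRC, then pick the highest-sequence record that still verifies at boot. The record layout below (4-byte sequence, 2-byte length, payload, CRC-32) is an assumption for illustration, and storage is modelled as plain byte strings rather than real non-volatile memory.

```python
import struct
import zlib

_HEADER = "<IH"      # sequence number (uint32), payload length (uint16)
_HEADER_SIZE = 6
_CRC_SIZE = 4


def encode_slot(seq: int, payload: bytes) -> bytes:
    body = struct.pack(_HEADER, seq, len(payload)) + payload
    return body + struct.pack("<I", zlib.crc32(body))


def decode_slot(raw):
    """Return (seq, payload), or None when the slot is torn or corrupted."""
    if len(raw) < _HEADER_SIZE + _CRC_SIZE:
        return None
    seq, length = struct.unpack_from(_HEADER, raw, 0)
    end = _HEADER_SIZE + length
    if end + _CRC_SIZE > len(raw):
        return None  # write was interrupted before the CRC landed
    (crc,) = struct.unpack_from("<I", raw, end)
    if zlib.crc32(raw[:end]) != crc:
        return None
    return seq, bytes(raw[_HEADER_SIZE:end])


def last_valid(slots):
    """Return the newest record that verifies, or None if none does."""
    valid = [rec for rec in (decode_slot(s) for s in slots) if rec is not None]
    return max(valid, key=lambda rec: rec[0]) if valid else None
```

A slot that fails `decode_slot` is the natural place to set commit_status to a failure value and increment a commit-failure counter, rather than silently skipping it.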

[Figure F8: contents of one event packet. Header: timestamp, event_id, event_type, threshold_version. Context: traction_state, brake_state, speed_band, health_ok, inhibit_reason (if any). Snippets (summaries): speed/accel summary, slip_score, gating_mask, confidence, cmd_duty, drv_current_peak, response_ms. Counters: dropout_count, if_glitch_count, brownout_counter, commit_status, reset_reason. Integrity: packet_hash, signature_status. Triggers that create a packet: slip_confirmed, sanding_cmd, actuator_fault, quality_drop, brownout_reset.]
Figure F8. Evidence packet: fixed structure for auditability. Trigger types generate packets containing context, detection evidence, actuation proof, counters, and integrity hooks.

H2-9. Power Integrity & EMC in Rail: Why false triggers happen

False sanding triggers typically come from a repeatable cause chain: noise source → DM/CM coupling path → victim point (speed AFE / command input / MCU/logger) → observable fields. Rail environments amplify these effects through long harnesses, high-energy switching, and strong common-mode disturbances.

brownout & reset • DM / CM coupling • harness & chassis • ground bounce • evidence fields

9.1 Voltage dips and transients (how they break detection)

  • Threshold drift: references and comparators shift under dips, producing artificial spikes and saturation at the sensor interface.
  • Reset and timing rupture: MCU resets or clock instability corrupt dwell time, debounce windows, and timestamps.
  • Record integrity risk: event commit can be interrupted, creating missing post-windows unless commit status is recorded.
Power-related evidence fields: brownout_counter • reset_reason • supply_v_summary • commit_status • time_sync_status

9.2 DM vs CM coupling (where noise travels)

DM coupling follows a signal-return loop (input pair + return), while CM coupling rides the harness relative to chassis and seeks a return through shield termination and parasitic capacitance. Both can corrupt slip confirmation and interface debouncing.

  • DM victims: speed AFE thresholds, edge capture, and quality gating can be displaced by return-path disturbance (ground bounce).
  • CM victims: command/status inputs and isolated interfaces can see short pulses as valid edges unless common-mode currents are controlled.
Coupling symptoms to log: input_saturation_flag • quality_flag • if_glitch_count • if_inconsistency_flag

9.3 Suppression strategies (placement + partitioning)

  • Filter placement: apply input conditioning at the receiver entry (speed AFE and command input) before decision logic.
  • Common-mode control: provide a defined CM return to chassis via correct shield termination; avoid routing CM through AFE reference.
  • Partitioning: separate sensitive AFE reference and high dI/dt actuator return; reduce ground bounce injection into detection.
  • Clamping & saturation handling: detect and flag saturated inputs rather than letting them masquerade as valid slip evidence.

Recommended “false-trigger suspicion” rule: if slip_confirmed aligns with reset_reason, rising if_glitch_count, or input_saturation_flag, treat the episode as an EMC/power integrity incident and log it explicitly instead of tuning detection thresholds.
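
The suspicion rule can be expressed as a small check over one slip_confirmed record. The dictionary layout, the "none" sentinel for reset_reason, and the returned reason strings are assumptions made for the example.

```python
def false_trigger_suspected(event, prev_glitch_count):
    """Return (suspected, reasons) for one slip_confirmed event record.

    event: dict with optional keys reset_reason, if_glitch_count,
    input_saturation_flag. prev_glitch_count: counter value at the
    previous event, used to detect a rising glitch count.
    """
    reasons = []
    if event.get("reset_reason") not in (None, "none"):
        reasons.append("reset_reason")
    if event.get("if_glitch_count", 0) > prev_glitch_count:
        reasons.append("if_glitch_count_rising")
    if event.get("input_saturation_flag"):
        reasons.append("input_saturation_flag")
    return bool(reasons), reasons
```

Returning the list of matching reasons, not just a boolean, lets the episode be logged explicitly as an EMC/power integrity incident with its cause attached.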

[Figure F9: EMC coupling path map. Noise sources (traction switching, relay/contactor, harness discharge, surge/lightning) couple through a DM lane (signal + return loop, ground bounce) or a CM lane (harness ↔ chassis, CM current to chassis) into victim points: speed AFE (input_saturation_flag), command input (if_glitch_count), MCU/logger (brownout_counter, reset_reason, commit_status). Mitigations: filter placement, CM to chassis, partition & return.]
Figure F9. False triggers can be explained by coupling paths. The map links sources → DM/CM paths → victims → mitigation actions and fields.

H2-10. Diagnostics & Health Monitoring: Counters that enable maintenance

Maintenance value comes from counters and trends, not one-time functionality. A clear health model groups signals by subsystem, defines “redline” concepts (without disclosing proprietary numbers), and ties every counter to an actionable inspection step. Reporting is described as a minimal diagnostic set rather than a full TCMS stack.

10.1 Counter groups (dashboard-ready)

Sensor: sensor_dropouts • quality_low_time • jitter_est_trend
Actuator: actuator_cycle_count • jam_count • overcurrent_trip_count
Events: sand_usage_counter • false_trigger_suspected • event_rate
Power: brownout_counter • commit_fail_count • reset_reason_hist

10.2 “Redline” concepts (thresholds without numbers)

  • Rising dropout density suggests harness/shield/connector issues rather than detection tuning.
  • Jam and overcurrent frequency suggests mechanical resistance, nozzle clogging, or insulation degradation.
  • Sand usage vs slip rate mismatch suggests dosing curve issues or false triggering under EMC conditions.
  • Commit failures and brownouts suggest power integrity and hold-up weaknesses that jeopardize audit evidence.
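One way to make these redline concepts executable is to normalize the raw counters into rates and map each exceeded redline to its inspection step. The sketch below assumes hypothetical limits and three extra inputs that this page does not define (`distance_km`, `slip_confirmed_count`, `event_count`); real limits are project-specific.

```python
from typing import Dict, List

# Hypothetical redlines for illustration only; real limits are project-specific.
REDLINES: Dict[str, float] = {
    "dropouts_per_1k_km": 5.0,
    "jams_per_1k_cycles": 2.0,
    "sand_per_confirmed_slip": 3.0,
    "commit_fails_per_1k_events": 1.0,
}

# Each redline points at the inspection step suggested by the list above.
INSPECTIONS: Dict[str, str] = {
    "dropouts_per_1k_km": "inspect harness, shield, and connectors",
    "jams_per_1k_cycles": "check mechanical resistance, nozzle clogging, insulation",
    "sand_per_confirmed_slip": "review dosing curve and EMC false triggering",
    "commit_fails_per_1k_events": "review power integrity and hold-up",
}


def derive_rates(c: Dict[str, float]) -> Dict[str, float]:
    """Normalize raw counters so that values are comparable across vehicles."""
    return {
        "dropouts_per_1k_km": 1000 * c["sensor_dropouts"] / max(c["distance_km"], 1),
        "jams_per_1k_cycles": 1000 * c["jam_count"] / max(c["actuator_cycle_count"], 1),
        "sand_per_confirmed_slip": c["sand_usage_counter"] / max(c["slip_confirmed_count"], 1),
        "commit_fails_per_1k_events": 1000 * c["commit_fail_count"] / max(c["event_count"], 1),
    }


def inspections_due(c: Dict[str, float]) -> List[str]:
    """Return the inspection steps whose redline is exceeded."""
    rates = derive_rates(c)
    return [INSPECTIONS[name] for name, value in rates.items() if value > REDLINES[name]]
```

Working with rates rather than absolute counts keeps a high-mileage vehicle from looking unhealthy simply because it has run longer.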

10.3 Trend trust (version & calibration IDs)

Trends are only comparable when the system reports which configuration produced them. Maintenance reviews should therefore capture: detection threshold version, calibration/configuration ID, and confidence trend summaries.

Trust fields: threshold_version • calibration_id • confidence_trend • curve_id

10.4 Minimal reporting interface (no full-stack expansion)

  • Read-only summary: counters + trust fields via registers/diagnostic frames.
  • Event index readout: fetch event summaries by event_id without pulling full raw streams.
  • Controlled reset/clear: counters cleared only in maintenance mode, and the clear action is logged as an event.
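The three interface rules above can be illustrated with a small in-memory model. It is a sketch under assumptions: the `DiagStore` class, the `counters_cleared` event type, and the `operator_id` argument are hypothetical names, and a real unit would expose the same behavior through registers or diagnostic frames.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DiagStore:
    """Hypothetical in-memory model of the minimal reporting interface."""
    counters: Dict[str, int]
    maintenance_mode: bool = False
    events: List[dict] = field(default_factory=list)
    next_event_id: int = 1

    def read_summary(self) -> Dict[str, int]:
        """Read-only summary: callers get a copy, never the live counters."""
        return dict(self.counters)

    def clear_counters(self, operator_id: str) -> bool:
        """Clear counters only in maintenance mode and log the action as an event."""
        if not self.maintenance_mode:
            return False
        snapshot = dict(self.counters)
        for name in self.counters:
            self.counters[name] = 0
        self.events.append({
            "event_id": self.next_event_id,
            "event_type": "counters_cleared",
            "operator_id": operator_id,
            "previous": snapshot,  # keeps the trend history reconstructable
        })
        self.next_event_id += 1
        return True
```

Storing the pre-clear snapshot inside the event means a later review can still reconstruct the trend that led to the maintenance action.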

Recommended maintenance linkage: when false_trigger_suspected crosses a redline concept, review the last N events for input_saturation_flag, if_glitch_count, and reset_reason before altering thresholds.

[Figure F10: Health Dashboard Signals (Counters + Redlines). A top trust/version band (threshold_version, calibration_id, confidence_trend, curve_id) sits above four counter groups, each with a redline indicator: Sensor, Actuator, Events, and Power.]
Figure F10. Maintenance dashboard: grouped counters plus trust/version fields. Redline concepts highlight when inspections should start—before tuning detection thresholds.

H2-11. Validation Playbook: What to measure and how to prove it works

This playbook is an executable checklist. Every test item must produce traceable evidence fields and/or an event packet linked by event_id. The goal is not only “it works,” but “it can be proven during audits and maintenance reviews.”

test_case_id • pass/fail criteria • event_id evidence • power/EMC robustness

11.1 Bring-up checklist (inputs + safe actuation)

Bring-up validates full-range sensor ingestion and actuator protection before any slip logic tuning. Each case should record test_case_id, a short configuration fingerprint, and the resulting evidence fields.

  • Speed input coverage: low-speed boundary, high-speed boundary, missing tooth/dropouts, jitter/glitch pulses, reduced amplitude, saturation/clamp.
  • Acceleration sanity: vibration-like bursts must not be misread as slip evidence when gating is active.
  • Actuator safety: open/short detection, overcurrent, overtemperature, jam signature, and response timing window.
Evidence fields (minimum set): speed_raw • speed_filtered • quality_flag • signal_present • dropout_count • jitter_est • input_saturation_flag
Actuation proof fields: cmd_duty • drv_current_peak • response_ms • jam_flag • open_short_flag
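A bring-up record can be checked for completeness before its result is accepted. The sketch below is one possible shape, assuming records are plain dictionaries and that the configuration fingerprint is stored under a hypothetical `config_fingerprint` key; the evidence field names are the ones listed above.

```python
from typing import List

SENSOR_FIELDS = {
    "speed_raw", "speed_filtered", "quality_flag", "signal_present",
    "dropout_count", "jitter_est", "input_saturation_flag",
}
ACTUATION_FIELDS = {
    "cmd_duty", "drv_current_peak", "response_ms", "jam_flag", "open_short_flag",
}
# "config_fingerprint" is a hypothetical key name for the configuration fingerprint.
HEADER_FIELDS = {"test_case_id", "config_fingerprint"}


def missing_evidence(record: dict, needs_actuation: bool) -> List[str]:
    """Return the sorted names of required fields absent from a test record."""
    required = set(HEADER_FIELDS) | SENSOR_FIELDS
    if needs_actuation:
        required |= ACTUATION_FIELDS
    return sorted(required - set(record))
```

A test case passes the completeness gate only when the returned list is empty; a non-empty list is itself a finding, because the case cannot be audited later.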

Example MPNs (bring-up instrumentation & interfaces)

Fast current-sense amplifier: TI INA240 (PWM-friendly current sensing for valve/motor drive)
Solenoid/valve driver: TI DRV110 (current-controlled solenoid drive)
Motor driver (DC): TI DRV8876 (protected H-bridge / motor drive option)
Supervisor / reset: TI TPS3839 (voltage supervisor for reset_reason correlation)
Industrial digital isolator: TI ISO7721 (logic isolation on command/status interfaces)

11.2 Slip simulation (controlled injection + state stability)

Slip validation requires controlled stimuli: bench injection, simulation, or replay of captured traces. The objective is stable state transitions with hysteresis and gating—no chattering between states.

  • Injection levels: mild / medium / strong slip patterns (relative bins are sufficient).
  • State stability: enter_dwell_ms enforced; enter/exit thresholds separated; recovery behavior consistent.
  • Gating correctness: poor sensor quality or invalid traction/brake context must block CONFIRMED.
Detection evidence fields: slip_score • confidence • gating_mask • enter_reason • exit_reason • state
Expected event types: slip_confirmed (positive cases) • sanding_cmd (when allowed) • inhibit_reason changes (negative cases)
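The state-stability and gating requirements above can be exercised against a reference model during replay tests. The following is a minimal sketch, not the detection algorithm itself: the threshold and dwell values are hypothetical, the RECOVERY state is omitted for brevity, and gating is assumed to block entry only (it does not force an exit from CONFIRMED).

```python
from dataclasses import dataclass


@dataclass
class SlipDetector:
    """Reference model with enter/exit hysteresis, enter dwell, and gating."""
    enter_threshold: float = 0.6   # hypothetical value
    exit_threshold: float = 0.3    # hypothetical; kept below enter_threshold
    enter_dwell_ms: int = 50       # hypothetical value
    state: str = "IDLE"
    dwell_ms: int = 0
    enter_reason: str = ""
    exit_reason: str = ""

    def step(self, slip_score: float, gating_ok: bool, dt_ms: int) -> str:
        if self.state == "CONFIRMED":
            # Leave only when the score falls below the separate exit threshold.
            if slip_score < self.exit_threshold:
                self.state, self.dwell_ms = "IDLE", 0
                self.exit_reason = "score_below_exit"
            return self.state
        if not gating_ok:
            # Poor sensor quality or invalid context blocks CONFIRMED.
            self.state, self.dwell_ms = "IDLE", 0
            self.exit_reason = "gating_blocked"
            return self.state
        if slip_score >= self.enter_threshold:
            self.dwell_ms += dt_ms
            self.state = "SUSPECT"
            if self.dwell_ms >= self.enter_dwell_ms:
                self.state = "CONFIRMED"
                self.enter_reason = "score_above_enter_for_dwell"
        else:
            self.state, self.dwell_ms = "IDLE", 0
        return self.state
```

Feeding a score that sits between the two thresholds after confirmation should leave the state unchanged; that is the property a chatter test needs to demonstrate.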

Example MPNs (data acquisition / replay / logging blocks)

Multi-channel ADC (example): TI ADS131M04 (multi-channel precision sampling for validation rigs)
FRAM (robust non-volatile): Fujitsu MB85RS64V (event counters / small metadata with high endurance)
Quad SPI NOR flash: Winbond W25Q128JV (event packet storage option)

11.3 EMC & power transient (no false trigger + no evidence loss)

Under rail EMC and supply disturbances, validation must prove two outcomes simultaneously: no false triggers and logs that remain interpretable (even across resets).

  • Transient immunity: command inputs and sensor paths must not produce spurious CONFIRMED transitions.
  • Reset transparency: resets must be explainable via reset_reason and brownout_counter.
  • Commit durability: event commit interruptions must be visible via commit_status / commit_fail_count.
Power/EMC evidence fields: brownout_counter • reset_reason • supply_v_summary • if_glitch_count • input_saturation_flag • commit_status
Audit statement (what must be provable): “No slip_confirmed without valid gating” + “No silent evidence drop on resets”
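Both halves of the audit statement can be checked mechanically against an exported event log. The sketch below relies on assumptions that this page does not fix: events are dictionaries with consecutive `event_id` values, a `gating_mask` of 0 means no inhibit bit was set, and a gap in `event_id` is considered explained only when the next record is a `reset` event carrying a `reset_reason`.

```python
from typing import List, Tuple


def audit_violations(events: List[dict]) -> List[Tuple[int, str]]:
    """Return (event_id, violation) pairs found in a log ordered by event_id."""
    violations = []
    prev_id = None
    for ev in events:
        eid = ev["event_id"]
        # "No slip_confirmed without valid gating" (a missing mask also fails).
        if ev["event_type"] == "slip_confirmed" and ev.get("gating_mask") != 0:
            violations.append((eid, "slip_confirmed_without_valid_gating"))
        # Every record must state whether its commit completed.
        if ev.get("commit_status") is None:
            violations.append((eid, "missing_commit_status"))
        # "No silent evidence drop on resets": gaps need an explaining reset record.
        if prev_id is not None and eid != prev_id + 1:
            explained = ev["event_type"] == "reset" and bool(ev.get("reset_reason"))
            if not explained:
                violations.append((eid, "silent_event_gap"))
        prev_id = eid
    return violations
```

An empty result over the full EMC and power-transient campaign is the evidence that supports the audit statement; any entry points at the exact event to investigate.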

Example MPNs (surge/ESD protection & CM control)

TVS diode (automotive/industrial): Littelfuse SM8S series (high-energy surge clamping option)
ESD protection (I/O): TI TPD1E10B06 (ESD diode for sensitive digital inputs)
Common-mode choke (example): Würth Elektronik 744232 series (CM noise suppression on harness links)

11.4 Field regression (route conditions + audit alignment)

Field regression converts real routes into repeatable test libraries. Each scenario should map to an event packet that supports post-incident questions: when slip occurred, whether sanding actuated, whether inputs were trustworthy, and whether the power/EMC environment was healthy.

  • Low adhesion: rain/snow/leaves/oil; ensure sanding_cmd correlates with confirmed slip evidence.
  • Vibration hotspots: joints/switches; acceleration bursts must not bypass gating.
  • Post-maintenance: nozzle changes; detect jam trends and abnormal sand usage rates.
Required linkage: event_id → (slip_confirmed / sanding_cmd / actuator_fault) → evidence packet fields

[Figure F11: Test ↔ Evidence Matrix (Executable Proof). Rows (tests, each with test_case_id + pass/fail): Bring-up (speed coverage + safe drive), Slip simulation (state + hysteresis + gating), EMC / power transient (no false trigger + no loss), Field regression (route library + audit alignment). Columns (evidence groups): Sensor (quality_flag), Detect (slip_score), Actuate (drv_current), Power (reset_reason), Log (event_id). Legend: ✓ required evidence, □ optional / case-dependent.]
Figure F11. Validation is complete only when each test produces the required evidence fields (and event packets) to support audits and maintenance decisions.

H2-12. Field Feedback Loop: Update thresholds, models, and triggers from returns

The system should be treated as one that improves under change control: field returns are converted into structured evidence, parameter updates are versioned, validated against the F11 test-evidence matrix, deployed in subsets, and continuously monitored. Logging is not an accessory; it is the core of controlled improvement.

12.1 Return triage (from cases → evidence gaps)

Every return should start from event_id and the evidence packet, not from assumptions. The first goal is to identify whether the packet is sufficient; if not, the priority is to close evidence gaps before tuning thresholds.

Return archive key: event_id • event_type • threshold_version • calibration_id • (sensor/detect/actuate/power/log summaries)
Case categories: false positive • false negative • actuation failure • power/EMC disturbance • incomplete commit
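The evidence-first order of triage can be sketched as a function that refuses to categorize a return until the packet is sufficient. The details are assumptions made for illustration: the packet is a dictionary, `if_glitch_delta` is a hypothetical summary of glitch-counter growth, `independent_slip_evidence` stands for whatever external confirmation (crew report, trace review) the project accepts, and the bucket priority is one reasonable choice rather than a rule from this page.

```python
REQUIRED_KEYS = ("event_id", "event_type", "threshold_version",
                 "calibration_id", "commit_status")


def triage(packet: dict, independent_slip_evidence: bool = False) -> str:
    """Return an evidence-gap marker or one of the case categories."""
    # Step 1: close evidence gaps before any categorization.
    missing = [k for k in REQUIRED_KEYS if packet.get(k) is None]
    if missing:
        return "evidence_gap:" + ",".join(missing)
    # Step 2: buckets that make the rest of the packet untrustworthy come first.
    if packet["commit_status"] != "ok":
        return "incomplete commit"
    if (packet.get("reset_reason") or packet.get("input_saturation_flag")
            or packet.get("if_glitch_delta", 0) > 0):
        return "power/EMC disturbance"
    if packet.get("jam_flag") or packet.get("open_short_flag"):
        return "actuation failure"
    # Step 3: only now compare the detection outcome with independent evidence.
    confirmed = packet["event_type"] == "slip_confirmed"
    if confirmed and not independent_slip_evidence:
        return "false positive"
    if not confirmed and independent_slip_evidence:
        return "false negative"
    return "no_fault_found"
```

Placing the detection-quality buckets last reflects the principle that thresholds are questioned only after commit, power/EMC, and actuation explanations have been ruled out.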

12.2 Threshold strategy (versioning + subset rollout + rollback)

  • Versioning: every rule change increments threshold_version and is recorded in events.
  • Subset deployment: enable new versions on a limited fleet/route subset before broad rollout.
  • Rollback triggers: rising false_trigger_suspected, abnormal sand_usage_counter, or increased event density without supporting slip evidence.
Change-control evidence: threshold_version • rollout_group_id • rollback_reason • monitoring_window
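The rollback triggers can be evaluated by comparing a rollout group against a baseline group over the same monitoring_window. The sketch below is illustrative only: `WindowStats`, the `max_ratio` tolerance of 1.5, the two counts `event_count` and `slip_confirmed_count`, and the returned reason strings are assumptions, not values defined by this page.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class WindowStats:
    """Counters accumulated over one monitoring_window (same length for both groups)."""
    false_trigger_suspected: int
    sand_usage_counter: int
    event_count: int
    slip_confirmed_count: int


def rollback_reason(baseline: WindowStats, candidate: WindowStats,
                    max_ratio: float = 1.5) -> Optional[str]:
    """Return a rollback_reason string, or None when the candidate stays in bounds."""
    if candidate.false_trigger_suspected > max_ratio * max(baseline.false_trigger_suspected, 1):
        return "false_trigger_suspected_rising"
    # Sand used per confirmed slip, so a genuinely slippery window is not penalized.
    base_dose = baseline.sand_usage_counter / max(baseline.slip_confirmed_count, 1)
    cand_dose = candidate.sand_usage_counter / max(candidate.slip_confirmed_count, 1)
    if cand_dose > max_ratio * base_dose:
        return "abnormal_sand_usage"
    # Events that are not backed by confirmed slip evidence.
    base_unsupported = baseline.event_count - baseline.slip_confirmed_count
    cand_unsupported = candidate.event_count - candidate.slip_confirmed_count
    if cand_unsupported > max_ratio * max(base_unsupported, 1):
        return "event_density_without_slip_evidence"
    return None
```

Recording the returned string as rollback_reason alongside threshold_version and rollout_group_id keeps the decision itself auditable.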

12.3 Preventive maintenance windows (trend-driven)

Counters should drive maintenance scheduling. Trend-based triggers reduce unplanned downtime and avoid “threshold chasing” when the root cause is mechanical resistance or power/EMC disturbances.

  • Clog/jam trend: increasing jam_count + rising drv_current_peak trend → clean nozzle/feeder earlier.
  • Overuse mismatch: high sand_usage_counter but low slip evidence density → investigate false-trigger mechanisms.
  • Evidence risk: rising brownout_counter or commit_fail_count → prioritize power integrity improvements.
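The two trend-shaped rules in this list (clog/jam and evidence risk) can be prototyped with a plain least-squares slope over recent inspection intervals. This is a sketch under assumptions: `history` holds per-interval values rather than cumulative totals, any positive slope counts as "rising", and a fielded version would add a noise margin before scheduling work.

```python
from typing import Dict, List, Sequence


def slope(samples: Sequence[float]) -> float:
    """Least-squares slope per sample interval; 0.0 for fewer than two samples."""
    n = len(samples)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


def maintenance_actions(history: Dict[str, Sequence[float]]) -> List[str]:
    """Map rising trends to the maintenance actions named in the list above."""
    actions = []
    # Clog/jam trend: jams per interval and peak drive current both rising.
    if slope(history["jam_count"]) > 0 and slope(history["drv_current_peak"]) > 0:
        actions.append("clean nozzle/feeder earlier")
    # Evidence risk: brownouts or commit failures per interval rising.
    if slope(history["brownout_counter"]) > 0 or slope(history["commit_fail_count"]) > 0:
        actions.append("prioritize power integrity improvements")
    return actions
```

Requiring both jam count and drive current to rise before scheduling cleaning avoids reacting to a single noisy counter.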

Example MPNs (durable storage + integrity hooks)

Secure element (interface-level integrity option): Microchip ATECC608B (signature_status / packet_hash support, without expanding key management here)
Industrial MCU (safety-grade examples): TI TMS570 (lockstep-class family option) • NXP S32K (automotive-grade MCU option)
RS-485 transceiver (robust field bus option): ADI ADM485 / ADM2587E family (RS-485; isolated variants exist for harsh environments)

Operational rule: do not tune detection thresholds until evidence packets can separate sensor quality, interface glitches, and power/EMC incidents. Otherwise, “fixes” can reduce safety margin and create new failure modes.

[Figure F12: Closed-Loop Improvement (Controlled Iteration). Field events (event_id, returns) → analysis/triage (gap, root bucket) → parameter update (threshold_version) → validation matrix (pass/fail, F11) → deployment (subset rollout) → monitoring (redlines, rollback). Gate A: validate. Gate B: rollback trigger.]
Figure F12. Closed-loop improvement: returns drive evidence-first analysis, versioned parameter updates, validation gates, subset rollout, and monitoring with rollback triggers.

H2-13. FAQs

Each answer follows the same proof pattern: 1 conclusion + 2 evidence checks + 1 first fix. Chapter mapping is shown alongside each question.

Q1. Sanding triggers but slip still happens: dosing curve wrong or valve response slow? (Maps to: H2-6 / H2-5)

Conclusion: Most “sanding-but-still-slip” cases are actuator-response limited, not threshold-limited.

Evidence: Compare sanding_cmd with drv_current_peak and response_ms; slow or clipped current suggests a valve/feed restriction (e.g., DRV110 drive never reaches hold profile).

Evidence: Check whether slip_score decays during RECOVERY after sanding; if it stays high, dosing is not reaching the rail.

First fix: Validate valve/feeder response under cold/contamination and tighten jam detection using current signatures (INA240-style sensing).

Evidence fields: cmd_duty, drv_current_peak, response_ms, slip_score, state

Q2. Slip is detected only at high speed: sensor conditioning or low-speed estimator issue? (Maps to: H2-4)

Conclusion: This pattern usually indicates low-speed estimator/conditioning weakness rather than “no slip.”

Evidence: At low speed, inspect signal_present, quality_flag, and dropout_count; edge-capture can fail when pulse amplitude is near threshold.

Evidence: Compare jitter_est and period-capture stability; excessive jitter suggests poor hysteresis/Schmitt conditioning or noise pickup.

First fix: Improve low-speed conditioning and estimator path (threshold + debounce) before adjusting detection thresholds.

Evidence fields: signal_present, quality_flag, dropout_count, jitter_est

Q3. False slip alarms during EMC tests: CM coupling or hysteresis too tight? (Maps to: H2-9 / H2-5)

Conclusion: EMC false alarms are more often coupling/glitch driven than “too sensitive thresholds.”

Evidence: Correlate alarms with if_glitch_count and input_saturation_flag; spikes implicate CM currents and poor entry filtering/termination.

Evidence: Review enter_reason and state dwell time; if transitions occur without sustained evidence, hysteresis/dwell is not being enforced.

First fix: Add/verify interface debouncing and CM return strategy (e.g., ISO7721-isolated inputs plus proper shield-to-chassis termination).

Evidence fields: if_glitch_count, input_saturation_flag, enter_reason, state

Q4. Actuator works in lab but jams in the field: current signature or debris/icing? (Maps to: H2-6 / H2-10)

Conclusion: Field jams are best identified by current signatures and trend counters, not by command logs alone.

Evidence: Compare drv_current_peak waveforms: “debris/icing” shows rising current and slow response, unlike open/short faults (DRV8876 protections can mask symptoms if not logged).

Evidence: Confirm jam_count trending with actuator_cycle_count; repeated near-jam cycles indicate mechanical resistance build-up.

First fix: Tighten jam detection windows and schedule preventive cleaning when current trend crosses a redline.

Evidence fields: drv_current_peak, response_ms, jam_count, actuator_cycle_count

Q5. Events are logged but timestamps don’t align: timebase drift or trigger ordering? (Maps to: H2-8 / H2-7)

Conclusion: Misaligned timestamps usually come from trigger ordering and reset/time-sync state, not from “bad clocks” alone.

Evidence: Check time_sync_status alongside event timestamp; drift often appears after resets or loss-of-lock.

Evidence: Validate ordering: slip_confirmed should precede sanding_cmd and actuator feedback in the packet; if not, triggers are racing.

First fix: Enforce a single timebase and deterministic trigger ordering before increasing time-sync complexity.

Evidence fields: timestamp, time_sync_status, event_type, trigger_order

Q6. System resets during slip events: brownout margin or hold-up/log commit policy? (Maps to: H2-9 / H2-8)

Conclusion: Resets during slip episodes are a power-integrity problem until proven otherwise.

Evidence: Correlate reset_reason with brownout_counter and supply summaries; a supervisor (TPS3839-class) helps make resets explainable.

Evidence: Inspect commit_status/commit_fail_count; silent commit loss breaks auditability even if detection was correct.

First fix: Increase brownout margin/hold-up and harden commit policy so events remain interpretable across resets.

Evidence fields: reset_reason, brownout_counter, supply_v_summary, commit_status

Q7. Slip detection oscillates (on/off): missing hysteresis or gating conditions unstable? (Maps to: H2-5)

Conclusion: Oscillation is almost always hysteresis/dwell or gating instability, not “wrong threshold.”

Evidence: Compare enter_reason/exit_reason patterns; rapid toggling indicates missing enter/exit separation or too-short minimum dwell.

Evidence: Inspect gating_mask changes near transitions; if sensor-quality or context bits flap, the state machine will chatter.

First fix: Add enter/exit hysteresis plus minimum duration and stabilize gating inputs before retuning thresholds.

Evidence fields: enter_reason, exit_reason, gating_mask, state

Q8. Sand usage too high: over-triggering or no speed-dependent limiting? (Maps to: H2-6 / H2-10)

Conclusion: Excess usage is either over-triggering or missing speed-dependent dose limiting.

Evidence: Compare sand_usage_counter against slip_confirmed density; a high ratio suggests false triggers rather than genuine low adhesion.

Evidence: Verify whether cmd_duty clamps as speed increases; lack of limiting causes runaway consumption even on marginal slips.

First fix: Implement a speed-indexed cap (dosing curve limit) before lowering detection sensitivity.

Evidence fields: sand_usage_counter, slip_confirmed, cmd_duty, false_trigger_suspected

Q9. Sensor quality drops intermittently: wiring/reference or input saturation? (Maps to: H2-4 / H2-9)

Conclusion: Intermittent quality loss is either wiring/reference instability or saturation from coupled noise.

Evidence: If quality_flag drops with input_saturation_flag spikes, suspect EMC/CM coupling rather than a pure wiring open.

Evidence: If dropouts increase with brownout_counter or resets, suspect supply/reference disturbance in the AFE path.

First fix: Improve entry filtering and reference/return routing; then revalidate low-speed conditioning.

Evidence fields: quality_flag, input_saturation_flag, dropout_count, brownout_counter

Q10. Audit asks “prove sanding engaged”: which evidence fields are mandatory? (Maps to: H2-8 / H2-11)

Conclusion: Proof requires command + actuator response + integrity context, not just a “sanding_active” bit.

Evidence: Mandatory chain: event_id, sanding_cmd, cmd_duty, and a physical response indicator like drv_current_peak (INA240-class sensing).

Evidence: Integrity fields like commit_status and threshold_version must accompany the packet to show it is complete and version-traceable.

First fix: Lock a “minimum evidence packet template” and log an explicit fault event when any mandatory field is missing.

Evidence fields: event_id, cmd_duty, drv_current_peak, commit_status, threshold_version

Q11. After firmware update, behavior changed: threshold versioning or parameter migration bug? (Maps to: H2-12 / H2-8)

Conclusion: Post-update drift is usually version/migration related until evidence shows a real physics change.

Evidence: Compare threshold_version and configuration fingerprints across events; unexpected jumps indicate migration or default resets.

Evidence: Check monitoring counters (false_trigger_suspected, sand_usage_counter) for step changes aligned with the update window.

First fix: Roll back or subset-disable the new version, then re-run the F11 matrix before broad redeploy.

Evidence fields: threshold_version, calibration_id, false_trigger_suspected, sand_usage_counter

Q12. Two axles disagree on slip: sensor issue or multi-axle consistency rule too strict? (Maps to: H2-5 / H2-4)

Conclusion: Disagreement is a sensor-quality asymmetry first, and a consistency-rule issue second.

Evidence: Compare axle-wise quality_flag, dropout_count, and jitter_est; the weak axle typically drives false inconsistency.

Evidence: Inspect the consistency decision output (multi_axle_consistency_flag) and which gating bits suppressed CONFIRMED.

First fix: Fix the degraded axle input (wiring/reference/conditioning) before relaxing multi-axle rules.

Evidence fields: quality_flag, dropout_count, jitter_est, multi_axle_consistency_flag