UART Framing & Parity Errors Under Noise: Oversampling & Filters

Q: FE spikes only when motors switch—false start or threshold noise first?

Likely cause: Event-driven coupling causes false START (idle glitches) or threshold chatter (return-path injected noise).
Quick check: Trigger captures on the motor event and log false-start count + FE bursts; probe at the MCU RX pin to catch fast dips.
Fix: Enable start qualification (N consecutive low samples), set a de-glitch threshold, add series-R, and correct clamp/return routing.
Pass criteria: False-start rate < X/min during motor switching; FE < X per 10^N bytes; burst P95 < X bytes; recovery < X ms across the event set.

Q: Parity errors only on certain byte values—pattern test to tell noise vs parity mismatch?

Likely cause: Parity mode mismatch or value-dependent bit flips caused by edge/coupling sensitivity.
Quick check: Run 0x00/0xFF, 0x55/0xAA, walking-1/0, and PRBS; compare PE vs pattern and confirm both ends’ parity configuration.
Fix: Align parity mode first; if noise remains, reduce ringing (series-R), add hysteresis, and tighten sampling robustness.
Pass criteria: PE < X per 10^N bytes on all patterns; no value-dependent spikes; recovery < X ms after disturbances.

Q: FE but scope looks “clean”—what trigger/measurement is missing?

Likely cause: The disturbance is too rare/fast without the right trigger, or it happens at the MCU pin but not where the probe is placed.
Quick check: Trigger on FE (software windowing) and use scope glitch/holdoff + infinite persistence; probe at the MCU RX pin with a ground spring.
Fix: Add event-tagged captures, tighten probe technique, and enable start qualification/de-glitch.
Pass criteria: Captures reproduce the event on demand; FE < X per 10^N bytes and false-start < X/min under the same stress.

Q: Works on bench, fails in chassis—what return-path check is fastest?

Likely cause: Chassis bonding changes the return path, causing ground bounce or common-mode shifts that push RX across the threshold.
Quick check: Measure RX idle level and ground delta bench vs chassis; A/B with a temporary ground strap/shield bond and watch FE change.
Fix: Restore a controlled return path and keep clamp current out of signal reference; add conditioning if threshold chatter dominates.
Pass criteria: FE < X per 10^N bytes and burst P95 < X bytes in chassis, also passing on bench and across cable/route variants.

Q: Errors burst in clusters—what does burstiness imply about coupling?

Likely cause: Clustered bursts indicate intermittent/event-driven coupling rather than steady drift.
Quick check: Histogram burst length and align to event tags and supply ripple captures; check for repeated threshold crossings.
Fix: Remove the coupling path, slow edge aggressiveness (series-R), and enforce a safe resync/discard policy.
Pass criteria: Burst P95 < X bytes (and max < X if required); recovery < X ms; FE/PE remain below gates across the event set.

Q: Adding an ESD array increased FE—first capacitance/edge-rate sanity check?

Likely cause: Added Cpar slows edges and changes sampling margin, or leakage/clamp return injects noise that creates false START and FE bursts.
Quick check: A/B compare RX pin rise/fall time and idle level with/without the ESD array; verify single threshold crossing and event alignment.
Fix: Choose a lower-cap ESD part, route clamp return cleanly, add series-R, and retune qualification thresholds.
Pass criteria: FE < X per 10^N bytes and false-start < X/min after protection changes; edges remain single-crossing under stress.

Q: Oversampling 8x works, 16x worse—what setting interaction to check?

Likely cause: Sampling-point/majority logic and digital filtering interact with the oversampling ratio and can shrink mid-bit margin.
Quick check: Verify oversampling and sampling-point registers together; repeat pattern tests and compare FE/PE + false-start counts under identical stress.
Fix: Retune filter/qualification windows for 16x, ensure mid-bit alignment, and confirm a stable UART clock source.
Pass criteria: Selected mode meets FE/PE gates (FE < X, PE < X per 10^N bytes) and does not increase false-start or dropped-character rates.

Q: De-glitch filter reduces FE but drops characters—how to set minimum pulse width?

Likely cause: The filter is too aggressive and rejects legitimate START transitions, causing missing characters.
Quick check: Sweep the minimum pulse-width threshold and log false-start rate and dropped-character rate; compare against expected bit-time at current baud/oversampling.
Fix: Set the threshold above noise pulse width but below the shortest valid START signature; use qualify-then-confirm logic.
Pass criteria: False-start < X/min and dropped characters < X per 10^N bytes under stress; FE < X per 10^N bytes.

Q: PE rises with temperature—clock drift or input threshold drift first?

Likely cause: Either clock drift reduces sampling margin, or RX idle/threshold shifts via leakage/threshold drift with temperature.
Quick check: Compare PE under 0x55/0xAA vs 0x00/0xFF while logging RX idle level and VDDIO; prioritize the dimension that tracks PE.
Fix: Stabilize or calibrate the UART clock where supported and/or add hysteresis/conditioning to prevent threshold chatter.
Pass criteria: PE < X per 10^N bytes across temperature corners; FE remains below gate; recovery < X ms after stress events.

Q: FE only after brown-out—what “ghost powering” check is fastest?

Likely cause: IO back-powering through clamps or incomplete reset leaves the receiver in a bad state after brown-out.
Quick check: During brown-out cycles, monitor VDDIO and RX; check whether RX is driven while VDDIO is low and whether clamp conduction appears.
Fix: Add series-R to limit injection, enforce power sequencing/high-Z, enable BOR reset, and clear UART state + counters on recovery.
Pass criteria: After X brown-out cycles, FE < X per 10^N bytes and recovery < X ms; idle-only false-start remains below X/min.

This page turns UART framing/parity noise issues into an executable workflow: classify FE/PE into receiver decision points, then harden the link with oversampling, start-bit qualification/de-glitch, input conditioning, and safe resync policies.

The result is measurable robustness—lower FE/PE rates, shorter error bursts, and bounded recovery time under real board and system noise.

Problem definition & symptom taxonomy (FE/PE/noise)

This section converts “garbled bytes / missing bytes / sporadic failures” into a repeatable, engineering-first triage. The goal is to classify symptoms quickly and choose the first measurement that separates noise-induced receiver mis-detect from configuration/throughput issues—without expanding into baud-budget or PHY deep dives.

Scope guard: Covers FE/PE meaning, noise symptom patterns, first checks. Does not cover detailed baud error budgeting, frame-format selection strategy, or long-cable PHY design.

Minimal distinction (flags → first interpretation)

FE (Framing Error)

Meaning: stop-bit check failed at the receiver decision point.
First suspicion: false start → bit misalignment, or stop level pulled low by noise/glitch.
First check: capture idle→start→stop waveform; look for short low glitches near stop.

PE (Parity Error)

Meaning: parity check mismatched (computed vs received parity bit).
First suspicion: parity mode mismatch OR single-bit flips caused by noise.
First check: run a known pattern (walking-1/0 or PRBS); correlate PE to bit position/value.

OE (Overrun)

Meaning: RX FIFO/register overwritten before software/DMA drained it.
First suspicion: ISR latency, DMA configuration, or flow-control mismatch (not “noise” first).
First check: inspect FIFO level/overrun counters; reproduce with lower baud or reduced bursts.

BREAK / long-low

Meaning: line held low longer than a frame (receiver treats as break/abort).
First suspicion: brown-out dragging IO, contention, hot-plug transient, or deliberate break.
First check: measure low duration and correlate with power events and TX enable states.

Symptom map (translate “it fails” into first actions)

Symptom A: “FE spikes in bursts” (normally 0, then sudden clusters)

Primary suspicion: false start events or stop-bit glitches caused by impulsive EMI/ground bounce.
First check: trigger capture around the spike; look for short low pulses during idle/stop windows.
Fast divider: if errors align with a system event (motor/relay/ESD), treat as coupling-path first.

Symptom B: “PE only on certain byte values”

Primary suspicion: parity mode mismatch OR noise flipping specific bit positions near edges.
First check: send walking-1/0 (bit-sweep) and compare PE distribution per bit position.
Fast divider: deterministic “always wrong” points to configuration; probabilistic points to noise.

Symptom C: “Errors happen during idle” (no traffic)

Primary suspicion: false start detection (idle is not stable) or input threshold chatter.
First check: log false-start counts (if available) and observe RX pin idle stability + noise floor.
Fast divider: enable start qualification or de-glitch filtering and verify whether the “idle errors” collapse quickly.

Symptom D: “Garbled bytes without FE/PE” (parser fails, flags look clean)

Primary suspicion: higher-layer framing mismatch, buffer overrun without flag visibility, or analyzer decode mismatch.
First check: compare raw captured bits vs MCU flags; verify sampling point assumptions and decode settings.
Fast divider: if logic analyzer decodes cleanly but MCU flags errors, sampling-window/qualify settings are suspect.

Minimum logging (makes noise issues reproducible)

Counts: FE/PE/OE/BREAK counters with time windowing (per second or per N bytes).
Context: supply state (UVLO/brown-out indicator), temperature band, and “event tag” (motor/relay/hot-plug).
RX metrics (if supported): false-start rejects, majority-vote disagreement rate, or noise filter hit rate.
Recovery: time-to-resync and discard policy used (byte drop vs frame drop).
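The minimum logging set above can be sketched as a small windowed counter. This is an illustrative host-side model, not a driver API: the class name `ErrorWindowLog`, the flag strings, and the per-N-bytes window size are all assumptions chosen for the example.

```python
from collections import Counter


class ErrorWindowLog:
    """Accumulate UART error flags and event tags per fixed window of N bytes."""

    def __init__(self, window_bytes=1000):
        self.window_bytes = window_bytes   # hypothetical window size (N bytes)
        self.rows = []                     # one dict per completed window
        self._reset()

    def _reset(self):
        self._bytes = 0
        self._counts = Counter()
        self._tags = set()

    def record(self, flags=(), tag=None):
        """Record one received byte, its error flags, and an optional event tag.

        `flags` is an iterable of strings such as ("FE", "PE").
        """
        self._bytes += 1
        self._counts.update(flags)
        if tag is not None:
            self._tags.add(tag)
        if self._bytes >= self.window_bytes:
            # Close the window: emit one row, then start counting afresh.
            self.rows.append({
                "bytes": self._bytes,
                "FE": self._counts["FE"],
                "PE": self._counts["PE"],
                "OE": self._counts["OE"],
                "BREAK": self._counts["BREAK"],
                "tags": sorted(self._tags),
            })
            self._reset()
```

Each completed row carries counts plus the event tags seen in that window, which is enough to line up FE/PE spikes against motor, relay, or hot-plug events.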

Diagram intent: FE anchors to the stop-bit decision; PE anchors to the parity decision. OE/BREAK are included only as a minimal triage boundary.

What framing/parity errors really mean (receiver decision points)

UART errors are not “random.” Each flag corresponds to a specific receiver decision moment. Understanding where the decision happens turns vague symptoms into targeted checks: false start detection, sampling-window collapse, and stop/parity mis-judgment.

Receiver decision points (the only moments that matter)

Start qualify: decide whether a low level is a valid start bit or a glitch.
Sampling window: choose a sampling instant (often mid-bit; sometimes majority vote across sub-samples).
Parity decision: compute expected parity from data bits and compare to received parity bit.
Stop-bit decision: verify line is at the idle level during stop-bit check (framing integrity).
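The four decision points can be illustrated with a toy software receiver. This is a minimal sketch under stated assumptions: 8 data bits sent LSB first, one parity bit, one stop bit, a single mid-bit sample per bit, and no start qualification. The function names `encode_frame` and `decode_frame` are hypothetical and do not model any specific UART peripheral.

```python
def encode_frame(byte, parity="even", oversample=16, idle=4):
    """Return oversampled line levels (0/1) for one frame: start, 8 data, parity, stop."""
    data = [(byte >> i) & 1 for i in range(8)]           # LSB first
    ones = sum(data)
    pbit = ones % 2 if parity == "even" else 1 - ones % 2
    bits = [0] + data + [pbit] + [1]
    line = [1] * idle
    for b in bits:
        line += [b] * oversample
    return line + [1] * idle


def decode_frame(line, parity="even", oversample=16):
    """Decode one frame, sampling once at the centre of each bit."""
    start = line.index(0)                 # first low sample: unqualified START
    mid = start + oversample // 2         # centre of the start bit

    def sample(n):
        # Line level n bit-times after the start-bit centre.
        return line[mid + n * oversample]

    data = [sample(n) for n in range(1, 9)]
    byte = sum(b << i for i, b in enumerate(data))
    expected = sum(data) % 2 if parity == "even" else 1 - sum(data) % 2
    pe = sample(9) != expected            # parity decision
    fe = sample(10) != 1                  # stop-bit decision
    return {"byte": byte, "PE": pe, "FE": fe}
```

In this model, a one-sample low glitch placed exactly one bit time ahead of a real 0x55 frame is accepted as START, so every decision lands one bit early and the stop check samples the (low) parity bit, which reports FE. That is the false-start narrative in miniature.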

FE (Framing Error) = stop-bit decision failed

Most common narrative: a false start or edge glitch shifts the perceived bit boundary, so the “stop check” samples inside data/low.
Noise signature: clustered FE bursts aligned to impulsive events (switching, hot-plug, ESD, ground bounce).
Fast confirmation: capture the stop window and look for brief low pulses or ringing crossing threshold.

PE (Parity Error) = parity decision mismatched

Two dominant causes: parity mode mismatch (deterministic) OR single-bit flips from noise (probabilistic).
Noise signature: PE correlates with specific bit positions or edge-sensitive transitions; may rise with EMI events.
Fast confirmation: run a bit-sweep pattern and build “PE vs bit position” statistics.

Three common failure narratives (FE/PE combinations)

Case 1: FE + PE together (both spike)

Most consistent with false start or severe sampling-window collapse: the receiver locks onto the wrong bit boundary, then parity and stop checks both fail because the decision points are no longer aligned to the actual frame.

Case 2: FE dominates, PE is rare

Points to stop window disturbance: a short low glitch or ringing crosses threshold near stop sampling. The data bits may still be mostly correct, but stop validation fails under noise.

Case 3: PE dominates, FE stays near zero

Either a parity configuration mismatch (highly deterministic) or single-bit flips that do not disrupt stop validation. Pattern-driven statistics separate configuration from noise quickly.

What this implies next: false start → start qualification + de-glitch; sampling collapse → oversampling/majority vote; stop disturbance → input conditioning and edge control; parity mismatch → verify parity mode + pattern-based noise check.

Diagram intent: sampling points and qualify stages are the only places where short glitches become FE/PE. Later sections address oversampling, de-glitch filters, and resync policies.

Noise coupling paths that create FE/PE (board & system reality)

FE/PE spikes are usually not “mystery UART behavior.” They are the receiver’s decision points reacting to energy coupled into the RX path. This section classifies failures by coupling path, so the first measurement targets the right chain (ground, supply/threshold, near-field, or impulsive events).

Scope guard: This section provides diagnostic entry points (what to measure first). It avoids deep PHY/RS-485 transmission theory and baud-budget math.

Coupling-path taxonomy (signature → first measurement → fast mitigations)

1) Ground bounce / return-path discontinuity

Signature: clustered FE bursts aligned with switching; “idle errors” appear when nearby high di/dt toggles; stop window crosses threshold intermittently.
First measurement: probe RX pin and reference ground near the RX pad; compare to a distant ground to expose ground delta and reference movement.
Fast mitigations: enforce continuous return path; shorten RX return loop; add local ground stitching; reduce aggressor loop area near RX routing.

2) Conducted noise via supply / IO threshold (VDDIO/ground injection)

Signature: FE/PE increases with load transients or DC/DC state changes; false-start rate rises while RX waveform looks “fine” relative to a moving threshold.
First measurement: measure VDDIO ripple and ground at the RX domain; correlate error counters with supply events and threshold crossings.
Fast mitigations: improve local decoupling; isolate noisy rails; add series impedance where appropriate; verify input hysteresis behavior at the IO domain.

3) Crosstalk / near-field coupling (neighbor aggressors)

Signature: errors correlate with a specific neighbor activity (PWM, fast clocks); PE may correlate with bit transitions if coupling hits edges.
First measurement: probe the neighbor aggressor and RX simultaneously; check timing alignment (edge-to-error correlation) and measure overshoot/ringing at RX.
Fast mitigations: increase spacing; route with ground shielding; reduce edge rate at the aggressor; add small series-R to tame ringing at the victim input.

4) Impulsive events (motor/relay switching, hot-plug, ESD-like transients)

Signature: short high-amplitude disturbances; FE burst dominates; stop-bit window shows brief low glitches or threshold crossings.
First measurement: trigger on the event (relay coil, motor PWM edge, plug-in moment) and capture RX with a time-aligned window around the event.
Fast mitigations: add suppression on the event source (snubbers/TVS where appropriate); ensure robust return; harden RX input with conditioning and filtering strategy.

5) Post-ESD / stress drift (threshold shift, leakage, edge distortion)

Signature: system “passes once” but becomes fragile later; FE/PE rises without a clear new aggressor; idle level looks different across temperature/humidity.
First measurement: compare RX pin idle level and edge shape before/after stress; look for increased leakage or shifted effective threshold behavior.
Fast mitigations: re-check protection network loading; validate clamp capacitance/placement; confirm no partial damage or ghost-power paths affecting IO state.

“Only during a specific action”: EMI/coupling or clock/config?

Event correlation: if errors align tightly with a motor/relay/hot-plug/ESD-like event, treat coupling-path first.
Error morphology: bursts suggest impulsive coupling; value/bit-position correlation suggests edge-sensitive coupling or parity mismatch.
A/B quick experiment: one fast change (series-R, re-route cable path, local decoupling, qualify/filter setting) should shift error rate by >X if coupling dominates.

See also (do not expand here): Baud Rate & Error Budget (clock drift/ppm) · Voltage Levels & PHY (long-cable/RS-232/RS-485) · Start-bit qualification & de-glitch filters (later section on this page).

Oversampling fundamentals (8x/16x) & sampling window robustness

Oversampling improves noise tolerance by turning a fragile single decision into a more robust sampling window. FE/PE spikes often appear when the window collapses toward bit edges (where ringing, jitter, and short glitches are most harmful).

Sampling-window model (why mid-bit sampling is safer)

Mid-bit is stable: it maximizes distance from edges where noise and ringing cross thresholds.
Window collapse: if noise or reference movement shifts the effective crossing time, the “mid-bit” point drifts toward the edge and becomes error-prone.
Receiver symptom: stop-bit checks fail (FE) or parity mismatches rise (PE) when decisions land in the uncertain edge region.

8x vs 16x oversampling (principle-level trade-offs)

8x oversampling

Strength: simpler timing; fewer internal phases; often stable in basic implementations.
Risk: less phase granularity for qualify/vote; more sensitive when a single sample is used.
Best when: edges are clean and coupling is mild; filtering/qualify hooks are limited.

16x oversampling

Strength: finer phase resolution enables start qualify and multi-sample voting.
Risk: if voting window spans edge regions, short glitches can influence multiple sub-samples.
Best when: the receiver supports robust qualify/vote configuration and edge regions are controlled.

Majority vote (3-sample voting) as a short-glitch suppressor

Single-sample receiver: a short glitch hitting the sampling instant can flip the bit immediately.
3-sample voting: the same glitch must corrupt at least 2 out of 3 sub-samples to change the decision.
Design implication: keep the vote cluster centered (avoid edge proximity) and avoid overly wide spacing that overlaps edges.
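The difference between a single-sample and a 3-sample decision can be shown in a few lines. This is a sketch only: it assumes the three voted sub-samples are adjacent and centred on the nominal sampling instant, whereas real receivers differ in which sub-samples they vote on. `vote3` is a hypothetical helper name.

```python
def vote3(line, center):
    """Majority of three adjacent oversamples centred on index `center`."""
    trio = line[center - 1:center + 2]
    return 1 if sum(trio) >= 2 else 0


# One bit period at 16x oversampling, nominally HIGH.
bit = [1] * 16
bit[8] = 0                      # one-sample glitch exactly at mid-bit

single_sample = bit[8]          # single-sample receiver reads the glitch
voted = vote3(bit, 8)           # two clean neighbours outvote it
```

Here `single_sample` is 0 while `voted` is 1. Widening the glitch to two of the three voted samples flips the voted result as well, which is why the vote cluster should stay away from edge regions.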

Edge noise vs low-frequency drift (what to fear first)

Edge noise (ringing, crosstalk, impulsive spikes)

Typical outcome: stop window crosses threshold → FE dominates.
Most effective: center sampling + majority vote + edge conditioning (series-R / controlled slew) + de-glitch.

Low-frequency drift (threshold wander, reference movement, slow baseline shift)

Typical outcome: start qualify becomes fragile; sampling points drift toward edges → FE/PE rise over time.
Most effective: stronger qualify/resync policy + stable IO domain reference + correlation logging (events/rails/temperature).

Start-bit qualification & de-glitch filters (reject false start)

The most destructive path for FE/PE bursts is a false start: an idle line is pulled low by a transient, the receiver “locks” on the wrong bit boundary, and parity/stop checks land in the wrong time slots. This section hardens the RX front-end using start-bit qualification and de-glitch policies.

False-start signatures (idle disturbed)

Short low dip on idle: a narrow low pulse (glitch) that resembles start for a fraction of a bit time.
Threshold chatter: repeated threshold crossings around idle due to supply/ground movement.
Edge ringing: a fast aggressor creates overshoot/undershoot that crosses the RX threshold briefly.
Symptom pattern: FE/PE appear in bursts and align with a switching event (relay/motor/hot-plug).

Start-bit qualification (N-of-M low samples)

Core rule: declare START only after N consecutive oversamples are LOW (or N out of a short window are LOW, depending on implementation).
Practical effect: narrow idle glitches fail qualification and are rejected before the receiver commits to a bit boundary.
Where it helps most: event-driven EMI and ringing that produces brief threshold crossings.
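The consecutive-sample form of the core rule can be sketched as follows. The function name `find_qualified_start` and the list-of-samples input are assumptions for illustration; hardware implementations typically do this with a small counter clocked at the oversampling rate.

```python
def find_qualified_start(line, n_low):
    """Return the index where the first run of n_low consecutive LOW samples begins.

    Returns None if no run of that length exists (nothing qualifies as START).
    """
    run = 0
    for i, level in enumerate(line):
        run = run + 1 if level == 0 else 0
        if run == n_low:
            return i - n_low + 1
    return None


# Idle, a 2-sample glitch, idle again, then a real 16-sample start bit.
line = [1] * 5 + [0] * 2 + [1] * 5 + [0] * 16 + [1] * 4
```

With `n_low=1` the glitch at index 5 is taken as START; with `n_low=4` the glitch fails qualification and the real start bit at index 12 is found instead.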

De-glitch filters (minimum pulse width / vote window / digital integrator)

Minimum pulse width

Rule: pulses shorter than Tglitch are ignored.
Best for: sharp, narrow spikes (impulsive coupling).
Risk: if Tglitch is too long, true START edges may be delayed or missed.

Sliding-window vote

Rule: in a window of W oversamples, require ≥K LOW samples.
Best for: threshold chatter and edge ringing.
Risk: if W is too wide, edge regions are included and timing shifts worsen.

Digital integrator

Rule: LOW evidence accumulates; trigger when an internal score crosses a threshold.
Best for: noisy idle with frequent small crossings.
Risk: excessive accumulation delays START recognition (sampling point drift).
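Two of the filter styles above, the sliding-window vote and the digital integrator, can be modelled on a list of oversamples. This is a behavioural sketch with hypothetical names (`window_vote`, `integrator`) and made-up parameters; it is meant to show the rejection/delay trade-off, not to mirror any particular peripheral's filter.

```python
def window_vote(line, w, k):
    """Output LOW (0) at sample i when at least k of the last w samples are LOW."""
    out = []
    for i in range(len(line)):
        window = line[max(0, i - w + 1):i + 1]
        out.append(0 if window.count(0) >= k else 1)
    return out


def integrator(line, limit):
    """Up/down counter filter: LOW samples count up, HIGH samples count down.

    The output goes LOW when the score reaches `limit` and returns HIGH
    only when the score has decayed back to 0.
    """
    score, state, out = 0, 1, []
    for level in line:
        score = min(limit, score + 1) if level == 0 else max(0, score - 1)
        if score == limit:
            state = 0
        elif score == 0:
            state = 1
        out.append(state)
    return out


# Idle, a 1-sample glitch at index 4, idle, then 8 LOW samples starting at index 9.
line = [1] * 4 + [0] + [1] * 4 + [0] * 8 + [1] * 4
voted = window_vote(line, w=4, k=3)
integrated = integrator(line, limit=3)
```

Both filters ignore the glitch at index 4 and first report LOW at index 11, two samples after the real falling edge at index 9. That delay is the cost named under "Risk" for each filter, and it grows with W, K, or the integrator limit.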

Guardrails (avoid “filter so strong it breaks real UART”)

Do not push START confirmation too late: qualification should consume only a small fraction of one bit time (α·Tbit, α = X placeholder).
Set Tglitch with two constraints: reject the observed glitch width, but remain shorter than the stable-low portion of a real START edge (worst voltage/temperature).
Validate with A/B tests: inject narrow glitches and real frames; require both false-start rejection and frame-detect retention.
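The first two guardrails reduce to simple arithmetic on bit time. The sketch below assumes qualification is counted in oversample periods; the helper names and the example numbers are illustrative, and the acceptable fraction α remains a design-specific placeholder.

```python
def qualify_budget(baud, oversample, n_samples):
    """Return (t_bit, t_qualify, alpha) for N-sample start qualification.

    t_bit and t_qualify are in seconds; alpha = t_qualify / t_bit = N / oversample.
    """
    t_bit = 1.0 / baud
    t_qualify = n_samples * t_bit / oversample
    alpha = n_samples / oversample
    return t_bit, t_qualify, alpha


def tglitch_in_bounds(t_noise, t_glitch, t_start_low):
    """True when Tglitch covers the observed noise pulse width but stays
    shorter than the stable-low portion of a real START."""
    return t_noise <= t_glitch < t_start_low


t_bit, t_qualify, alpha = qualify_budget(baud=115200, oversample=16, n_samples=3)
```

For 115200 baud at 16x, three qualifying samples consume 3/16 of a bit time, roughly 1.63 µs out of about 8.68 µs. Whether that fraction is acceptable depends on the α chosen for the design.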

Pass criteria (threshold placeholders)

False-start rate: < X per minute (idle-only test window).
Event burst reduction: FE/PE burst length drops to < X bytes at the triggering event.
True-frame detect: ≥ X% frame detection under worst edge/voltage/temperature conditions.
Recovery: receiver returns to stable decoding within < X ms after a disturbance.

Parity error patterns (why only certain bytes trigger PE)

“Parity errors only for certain byte values” typically means either a deterministic configuration mismatch or a probabilistic single-bit flip driven by edge-sensitive noise. The fastest way to distinguish them is to use targeted test patterns and observe how PE maps to byte values and bit positions.

Deterministic fingerprints (config mismatch)

High, stable PE rate: errors reproduce consistently across runs and environments.
Weak event correlation: PE does not track switching events or noise injections.
Typical causes: parity enable/disable mismatch; even/odd/mark/space mismatch.

Probabilistic fingerprints (noise bit flips)

Variable PE rate: changes with event timing, edge quality, temperature, or supply noise.
Edge sensitivity: patterns with high transition density (e.g., alternating bits) may show higher PE.
Hot bit position: PE concentrates on a specific bit index when coupling targets one timing window.

Fast checks (patterns that separate mismatch from noise)

Pattern set A: walking-1 / walking-0

Goal: reveal a “hot” bit position (bit index sensitivity).
Interpretation: if PE clusters at one bit position, noise is likely aligning to one edge/window.

Pattern set B: 0x00, 0xFF, 0x55, 0xAA

Goal: compare low-transition vs high-transition density.
Interpretation: higher PE on 0x55/0xAA suggests edge-related coupling and sampling-window fragility.

Pattern set C: PRBS (long-run statistics)

Goal: separate stable deterministic mismatch from environment-dependent noise.
Interpretation: deterministic mismatch stays stable across conditions; noise-driven PE varies with events and coupling.
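The three pattern sets can be generated and scored with a short script. This is a sketch under assumptions: the PRBS generator uses the common PRBS7 polynomial x^7 + x^6 + 1 (the page does not specify which PRBS to use), and `pe_rate_by_byte` assumes the test harness can pair each sent byte with the receiver's PE flag. All function names are hypothetical.

```python
from collections import Counter


def walking_ones():
    """Pattern set A: one bit set, swept across all 8 positions."""
    return [1 << i for i in range(8)]


def walking_zeros():
    """Pattern set A: one bit cleared, swept across all 8 positions."""
    return [0xFF ^ (1 << i) for i in range(8)]


def transitions(byte):
    """Number of level changes between adjacent data bits (transition density)."""
    bits = [(byte >> i) & 1 for i in range(8)]
    return sum(a != b for a, b in zip(bits, bits[1:]))


def prbs7_bytes(count, seed=0x02):
    """Pattern set C: `count` bytes from a PRBS7 LFSR (seed must be non-zero)."""
    state = seed & 0x7F
    out = []
    for _ in range(count):
        byte = 0
        for _ in range(8):
            newbit = ((state >> 6) ^ (state >> 5)) & 1
            state = ((state << 1) | newbit) & 0x7F
            byte = (byte << 1) | newbit
        out.append(byte)
    return out


def pe_rate_by_byte(results):
    """Map each sent byte value to its observed PE rate.

    `results` is an iterable of (sent_byte, pe_flag) pairs.
    """
    sent, errors = Counter(), Counter()
    for byte, pe in results:
        sent[byte] += 1
        errors[byte] += int(bool(pe))
    return {b: errors[b] / sent[b] for b in sent}
```

With walking patterns, each byte value maps to one bit index, so the per-byte PE rate doubles as a per-bit-position rate. `transitions` makes the set B contrast explicit: 0x55 and 0xAA have seven data-bit transitions, 0x00 and 0xFF have none.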

Config check entry (do not expand into system-level framing guidance)

Parity enabled? verify both ends match (enabled/disabled).
Parity type: verify even/odd/mark/space match.
Observation: if PE is near-constant across patterns and conditions, treat mismatch first.
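A configuration mismatch can also produce a deterministic "only certain bytes" pattern, which is worth ruling out before chasing noise. The sketch below enumerates which byte values would raise PE for a given pair of transmitter and receiver parity modes; the function names are hypothetical and the model assumes 8 data bits.

```python
def parity_bit(byte, mode):
    """Parity bit a transmitter in `mode` would send for `byte` (8 data bits)."""
    ones = bin(byte & 0xFF).count("1")
    if mode == "even":
        return ones % 2          # makes the total number of 1s even
    if mode == "odd":
        return 1 - ones % 2      # makes the total number of 1s odd
    if mode == "mark":
        return 1
    if mode == "space":
        return 0
    raise ValueError(f"unknown parity mode: {mode}")


def mismatching_bytes(tx_mode, rx_mode):
    """Byte values for which the receiver would flag PE on an error-free link."""
    return [b for b in range(256)
            if parity_bit(b, tx_mode) != parity_bit(b, rx_mode)]
```

An even/odd mismatch fails all 256 byte values, while even against space fails only the 128 values with an odd number of 1 bits. In both cases the failing set is fixed and repeatable, unlike noise-driven PE.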

Pass criteria (threshold placeholders)

After mismatch correction: PE < X per 10^N bytes on stable patterns.
After noise hardening: PE reduces by > X% and “hot bit” concentration disappears or drops > X dB (placeholder).
Event correlation: PE bursts at switching events are < X bytes and recover within < X ms.

Hardware input conditioning (threshold, hysteresis, RC, series-R)

FE/PE bursts often originate at the RX threshold: ringing, spikes, and threshold chatter create false-starts or destabilize sampling windows. This section turns hardware measures into an executable checklist while keeping scope limited to single-ended UART RX conditioning (not PHY-level migrations).

Threshold stability & hysteresis (Schmitt behavior)

Why it matters: repeated threshold crossings on idle can trigger false START and shift the receiver’s bit boundary.
Hysteresis value: two thresholds reduce “chatter” around the switching point and suppress micro-glitches.
Quick checks: observe idle-level stability at the RX pin; look for multiple crossings near the threshold during events.

Series-R (edge damping)

Target: reduce ringing and overshoot that re-crosses the threshold.
Verification: count threshold crossings on a scope; require fewer crossings after adding R.
Guardrail: avoid slowing edges so much that start qualification or sampling becomes timing-fragile (placeholder: X).
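The threshold-crossing count used for verification, and the effect of hysteresis on it, can be modelled on an exported scope trace. This is a sketch: `count_crossings` is a hypothetical helper, and the voltages and thresholds below are made-up illustration values, not recommendations.

```python
def count_crossings(volts, v_low, v_high):
    """Count logic-state changes seen by an input with thresholds v_low <= v_high.

    The state goes LOW only below v_low and HIGH only above v_high.
    Setting v_low == v_high models a single threshold without hysteresis.
    """
    state = 1 if volts[0] >= (v_low + v_high) / 2 else 0
    count = 0
    for v in volts[1:]:
        if state == 1 and v < v_low:
            state, count = 0, count + 1
        elif state == 0 and v > v_high:
            state, count = 1, count + 1
    return count


# A falling edge that rings around mid-rail before settling (illustrative samples).
ringing_edge = [3.3, 1.0, 1.8, 1.4, 1.7, 1.2, 0.3, 0.0]

no_hysteresis = count_crossings(ringing_edge, 1.65, 1.65)
with_hysteresis = count_crossings(ringing_edge, 1.2, 2.1)
```

For this trace the single-threshold input sees five state changes, while the hysteretic input sees one. The same count taken before and after adding series-R gives a direct measure of how much ringing-induced re-crossing was removed.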

RC (glitch shaping)

Target: attenuate narrow spikes without changing bit-time structure.
Verification: compare spike width/amplitude vs RX threshold; require fewer false-starts.
Guardrail: avoid stacking strong analog RC with strong digital de-glitch (see Start-bit qualification & de-glitch filters), which can delay true START.

ESD/Clamp side effects (what to watch)

Parasitic capacitance: slows edges and changes ringing; can increase timing sensitivity and start-detect fragility.
Leakage drift: after ESD or in hot/humid conditions, leakage can bias idle level toward threshold and raise false-start risk.
Clamp current path: poor return routing can convert a spike into ground bounce that destabilizes RX threshold during events.
Quick check: A/B swap (same footprint, different part) and compare idle level, edge shape, and FE/PE statistics.

Escalation entry (when single-ended fixes are not enough)

Common-mode disturbance is large: long cabling, ground potential differences, or repeated event-driven spikes.
Best-effort conditioning still fails: after series-R/RC and reasonable protection, FE/PE cannot meet acceptance thresholds (X).
Next step: consider differential or isolation strategies and link to the relevant PHY/isolator topics.

Pass criteria (threshold placeholders)

Threshold crossings: ringing-induced multi-crossing reduces by > X% (scope-based).
False-start rate: < X per minute during idle + event stress.
Edge integrity: true-frame detect ≥ X% under worst-case voltage/temperature with conditioning enabled.

Firmware/driver handling (flags, resync, discard policy)

Robust UART systems treat FE/PE as a control problem: decide what to discard, how to resynchronize, what to count, and how to recover. This section defines practical handling policies without expanding into idle-detect or throughput-tuning topics.

FE handling (stop check failed)

Default stance: treat bit boundary as suspect; prefer discarding the current frame segment.
Resync trigger: enter a resync state and wait for stable idle-high or clean stop patterns.
Measure: burst length (bytes) and recovery time (ms) vs acceptance threshold X.

PE handling (parity mismatch)

Default stance: discard the byte or mark it invalid (depends on upper-layer tolerance).
Fingerprinting: if PE is stable across patterns/conditions, prioritize config mismatch checks.
Measure: PE rate per 10^N bytes, and whether PE concentrates on specific bit positions or byte values (hot-bit concentration; placeholder).
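
The fingerprinting idea can be illustrated with a noise-free model of what each configuration mismatch should produce. This is a minimal sketch, not a driver: `fingerprint` and its mode strings are hypothetical names, and the `"none"` case assumes the line is idle-high (or at the stop level) when the receiver samples its parity slot.

```python
def ones(byte):
    """Number of 1 bits in an 8-bit value."""
    return bin(byte & 0xFF).count("1")

def parity_bit(byte, mode):
    """Parity bit appended by a transmitter in 'even' or 'odd' mode."""
    even_bit = ones(byte) & 1          # makes the total count of 1s even
    return even_bit if mode == "even" else even_bit ^ 1

def fingerprint(tx_mode, rx_mode, patterns):
    """Map each byte to True if the receiver would flag PE on a clean link.

    tx_mode 'none' models a sender without parity: the receiver then
    samples the stop/idle level (1) where it expects a parity bit.
    """
    result = {}
    for b in patterns:
        seen = 1 if tx_mode == "none" else parity_bit(b, tx_mode)
        result[b] = seen != parity_bit(b, rx_mode)
    return result

PATTERNS = [0x00, 0xFF, 0x55, 0xAA, 0x01, 0x80]

for tx, rx in [("even", "even"), ("even", "odd"), ("none", "even")]:
    flagged = [f"0x{b:02X}" for b, pe in fingerprint(tx, rx, PATTERNS).items() if pe]
    print(f"tx={tx:4} rx={rx:4} PE on: {flagged}")
```

An even/odd mismatch flags every byte, while a parity/no-parity mismatch flags a fixed subset that depends only on the data. Measured PE that matches neither deterministic picture points toward noise.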

Overrun warning (avoid misclassification)

Risk: FIFO/DMA overruns can look like “noise corruption”.
Minimum check: track overrun flags and FIFO watermark alongside FE/PE counters.
Scope: no throughput tuning here; this section only covers classifying overruns and logging them.

Resynchronization (restore a clean boundary)

Enter resync: on FE bursts or repeated PE clusters within a short window.
Wait condition: stable idle-high for ≥ X time (placeholder) or a clean stop/idle pattern sequence.
Re-arm: re-enable start qualification (H2-5) before accepting the next START.
Exit criteria: sustained decode stability for ≥ X bytes after resync.
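
The enter/wait/re-arm/exit sequence above can be sketched as a small state machine. This is a host-side model under stated assumptions: the class and method names are hypothetical, a single FE is treated as enough to enter resync, "repeated PE clusters" is simplified to a streak of consecutive PE bytes, and re-arming start qualification is left as a comment because it is hardware-specific.

```python
from enum import Enum

class RxState(Enum):
    NORMAL = "normal"
    RESYNC = "resync"          # discarding, waiting for stable idle-high
    PROBATION = "probation"    # delivering bytes, counting clean ones

class ResyncPolicy:
    def __init__(self, idle_needed_us, clean_bytes_needed, pe_streak_limit=3):
        self.idle_needed_us = idle_needed_us
        self.clean_bytes_needed = clean_bytes_needed
        self.pe_streak_limit = pe_streak_limit
        self.state = RxState.NORMAL
        self.pe_streak = 0
        self.clean_bytes = 0
        self.resync_count = 0

    def _enter_resync(self):
        if self.state is RxState.NORMAL:
            self.resync_count += 1     # count one recovery per disturbance
        self.state = RxState.RESYNC
        self.pe_streak = 0
        self.clean_bytes = 0

    def on_idle(self, idle_us):
        """Report a measured idle-high duration."""
        if self.state is RxState.RESYNC and idle_us >= self.idle_needed_us:
            # Hardware-specific hook: re-arm start qualification here.
            self.state = RxState.PROBATION

    def on_byte(self, fe=False, pe=False):
        """Return True if the byte should be delivered to the upper layer."""
        if self.state is RxState.RESYNC:
            return False               # boundary is suspect: discard
        if fe:
            self._enter_resync()
            return False
        if pe:
            self.pe_streak += 1
            self.clean_bytes = 0
            if self.pe_streak >= self.pe_streak_limit:
                self._enter_resync()
            return False               # discard the byte with bad parity
        self.pe_streak = 0
        if self.state is RxState.PROBATION:
            self.clean_bytes += 1
            if self.clean_bytes >= self.clean_bytes_needed:
                self.state = RxState.NORMAL
        return True
```

Keeping the policy in one object makes recovery deterministic and gives a single place to read `resync_count` for the logging described next.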

Counters & logging (minimum evidence set)

Windowing: log every X ms or every N bytes (choose one) to keep statistics interpretable.
Counters: FE count, PE count, (optional) overrun count, recovery count.
Context tags: temperature (if available), VDDIO status (if available), and event labels (relay/motor/hot-plug).
Outcome: enable reproduction and regression checks across builds, boards, and environments.
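
A minimal sketch of the evidence set, assuming a byte-count window (the "every N bytes" option). `WindowedLogger`, `ErrorWindow`, and the tag keys are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field, asdict

@dataclass
class ErrorWindow:
    start_byte: int
    fe: int = 0
    pe: int = 0
    overrun: int = 0
    recovery: int = 0
    tags: dict = field(default_factory=dict)

class WindowedLogger:
    """Accumulates counters per fixed window of received bytes."""

    def __init__(self, window_bytes):
        self.window_bytes = window_bytes
        self.total_bytes = 0
        self.current = ErrorWindow(start_byte=0)
        self.closed = []               # finished windows as plain dicts

    def tag(self, **tags):
        """Attach context (event label, temperature, ...) to this window."""
        self.current.tags.update(tags)

    def record_byte(self, fe=False, pe=False, overrun=False, recovered=False):
        self.current.fe += int(fe)
        self.current.pe += int(pe)
        self.current.overrun += int(overrun)
        self.current.recovery += int(recovered)
        self.total_bytes += 1
        if self.total_bytes - self.current.start_byte >= self.window_bytes:
            self.closed.append(asdict(self.current))
            self.current = ErrorWindow(start_byte=self.total_bytes)

log = WindowedLogger(window_bytes=4)
log.tag(event="motor_start")
for fe in (False, True, False, False):
    log.record_byte(fe=fe)
print(log.closed)
```

Each closed window carries its own counters and tags, so a later comparison across builds or boards does not depend on reconstructing context from memory.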

Pass criteria (threshold placeholders)

Recovery time: < X ms from disturbance to stable decoding.
Burst containment: FE/PE burst length < X bytes, followed by ≥ X bytes error-free.
Logging completeness: every error window has counters + context tags (no missing fields).

Debug workflow (what to measure first, how to trigger)

Turn “garbled bytes” and “sporadic FE/PE” into a 10-minute executable path: capture the moment an error happens, classify the failure mode, then run fast A/B experiments to converge on the root cause.

Step 0 · Establish a minimal baseline (1 minute)

Window: count per X ms or per 10^N bytes (choose one; placeholder X).
Counters: FE, PE, (optional) overrun, recovery/resync count.
Tags: cable length, power mode, and event labels (motor start / relay / hot-plug).
Output: a baseline snapshot that makes A/B comparisons meaningful.

Logic analyzer (protocol decode)

Best for: pinpointing which byte/frame fails and how long bursts last.
Fingerprinting: deterministic PE patterns vs probabilistic noise.
Patterns: 0x00/0xFF, 0x55/0xAA, walking-1/0, PRBS (fast correlation).
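
The pattern set can be generated with a few lines of host-side code. The sketch below assumes PRBS7 (polynomial x^7 + x^6 + 1) as the PRBS variant, since the text does not name one. The function names and the seed are illustrative.

```python
def walking_ones():
    """0x01, 0x02, ... 0x80: one set bit moving across the byte."""
    return [1 << i for i in range(8)]

def walking_zeros():
    """0xFE, 0xFD, ... 0x7F: one cleared bit moving across the byte."""
    return [(~(1 << i)) & 0xFF for i in range(8)]

def prbs7_bytes(count, seed=0x02):
    """Pack a PRBS7 bit stream (x^7 + x^6 + 1) into bytes, MSB first."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("PRBS7 seed must be non-zero")
    out = []
    for _ in range(count):
        byte = 0
        for _ in range(8):
            newbit = ((state >> 6) ^ (state >> 5)) & 1
            state = ((state << 1) | newbit) & 0x7F
            byte = (byte << 1) | newbit
        out.append(byte)
    return out

FIXED = [0x00, 0xFF, 0x55, 0xAA]
test_stream = FIXED + walking_ones() + walking_zeros() + prbs7_bytes(32)
print(" ".join(f"{b:02X}" for b in test_stream))
```

Errors that follow specific fixed or walking patterns suggest a deterministic cause, while errors spread evenly over the PRBS portion look more like noise.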

Oscilloscope (edge & threshold behavior)

Best for: ringing, spikes, and multiple threshold crossings on RX.
Idle sanity: detect idle drifting near the threshold (leakage/return noise).
Event linkage: align RX behavior with supply ripple or motor/relay/hot-plug events.

Trigger cookbook (make “sporadic” reproducible)

Flag-trigger: capture pre/post windows around FE/PE flags (±X ms placeholder).
Long-LOW trigger: treat an abnormally long LOW as an “RX disturbance” capture (used only to grab the moment, not to interpret the protocol).
Event-trigger: tag motor/relay/hot-plug timestamps and align with FE/PE burst density.
Supply-trigger: correlate error clusters with VDDIO ripple/step transitions (scope + log tags).

A/B ladder (fast root-cause narrowing)

Change cable/length/route: strong length dependence suggests SI/return-path coupling.
Change ground reference/return: event-driven improvements point to ground bounce/return discontinuity.
Change supply mode: if idle errors disappear, suspect threshold drift via VDDIO noise.
Add/adjust series-R: fewer multi-crossings implies ringing from overly fast edges.
Adjust qualify/de-glitch thresholds: reduced false-start rate without “missed frames” indicates false-start dominance.
A/B protection parts: systematic shifts implicate Cpar/leakage/clamp path side effects.

Output: Debug decision tree

Classify failures into false start, sampling shift, or threshold noise, then select the smallest fix set: qualify/de-glitch, front-end conditioning, or policy/resync/logging.

Pass criteria & metrics (quantify robustness)

Replace “looks better” with quantified acceptance. Use a consistent counting window, measure burst behavior and recovery time, and gate results across defined stress conditions.

Metric definitions (consistent windows)

FE rate: FE per 10^N bytes (placeholder N, threshold X).
PE rate: PE per 10^N bytes (placeholder N, threshold X).
False-start rate: false START detections per minute under idle-only stress (threshold X).
Burst length: distribution of consecutive error bytes (P95 or max; threshold X bytes).
Time-to-recover: time from first error to ≥ X consecutive error-free bytes (threshold X ms).
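
These definitions can be computed directly from a per-byte log. The sketch assumes each record is a dict with `t_ms`, `fe`, and `pe` keys (a hypothetical log format) and uses the nearest-rank method for P95. Other percentile definitions would give slightly different numbers on short logs.

```python
import math

def error_rates(records, n_exp):
    """Return (FE, PE) counts scaled to 'per 10^n_exp bytes'."""
    if not records:
        raise ValueError("no records")
    scale = 10 ** n_exp
    fe = sum(1 for r in records if r["fe"])
    pe = sum(1 for r in records if r["pe"])
    return fe * scale / len(records), pe * scale / len(records)

def burst_lengths(records):
    """Lengths of runs of consecutive error bytes (FE or PE)."""
    bursts, run = [], 0
    for r in records:
        if r["fe"] or r["pe"]:
            run += 1
        elif run:
            bursts.append(run)
            run = 0
    if run:
        bursts.append(run)
    return bursts

def percentile_nearest_rank(values, pct):
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

def time_to_recover_ms(records, clean_needed):
    """Time from the first error byte to the byte that completes
    clean_needed consecutive error-free bytes; None if never reached."""
    first_err_t, clean = None, 0
    for r in records:
        err = r["fe"] or r["pe"]
        if first_err_t is None:
            if err:
                first_err_t = r["t_ms"]
            continue
        if err:
            clean = 0
        else:
            clean += 1
            if clean >= clean_needed:
                return r["t_ms"] - first_err_t
    return None

# Tiny synthetic log: one byte per ms, FE at bytes 2-3, PE at byte 6.
demo = [{"t_ms": t, "fe": t in (2, 3), "pe": t == 6} for t in range(10)]
print(error_rates(demo, 3), burst_lengths(demo),
      percentile_nearest_rank(burst_lengths(demo), 95),
      time_to_recover_ms(demo, clean_needed=2))
```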

Conditions pack (minimal but meaningful)

Temperature: room / hot / cold (placeholders).
Power: nominal / ripple injected / transient step (placeholders).
Events: motor start / relay toggle / hot-plug (tagged and repeatable).
Cabling: short vs long harness/trace (placeholders).
Config edges: highest baud and worst-case operating corners (placeholders; no budget derivation here).

Pass/Fail gates (threshold placeholders)

Gate 1: FE rate < X per 10^N bytes.
Gate 2: PE rate < X per 10^N bytes.
Gate 3: Burst length P95 (or max) < X bytes.
Gate 4: Time-to-recover < X ms.
Gate 5: False-start rate < X per minute (idle-only).
Rule: gates must pass across the defined conditions pack (or document exceptions explicitly).
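
The gate rule amounts to checking every metric against its limit in every condition. A minimal sketch, with made-up limits standing in for the X placeholders and hypothetical condition labels:

```python
def failed_gates(results, limits):
    """Return (condition, metric, value) for every gate that is not met.

    A missing metric counts as a failure so gaps cannot pass silently.
    """
    failures = []
    for condition, metrics in results.items():
        for metric, limit in limits.items():
            value = metrics.get(metric)
            if value is None or value >= limit:
                failures.append((condition, metric, value))
    return failures

# Placeholder limits standing in for the X thresholds above.
LIMITS = {"fe_rate": 5.0, "pe_rate": 5.0, "burst_p95": 4,
          "recover_ms": 10.0, "false_start_per_min": 1.0}

RESULTS = {
    "room/nominal/short": {"fe_rate": 0.0, "pe_rate": 0.0, "burst_p95": 0,
                           "recover_ms": 0.0, "false_start_per_min": 0.0},
    "hot/ripple/long": {"fe_rate": 7.5, "pe_rate": 1.0, "burst_p95": 3,
                        "recover_ms": 6.0},   # false-start metric missing
}

for failure in failed_gates(RESULTS, LIMITS):
    print("FAIL", failure)
```

Treating an unmeasured metric as a failure keeps the "document exceptions explicitly" rule honest: an exception has to be written down rather than implied by an empty field.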

Engineering checklist (design → bring-up → production)

Building a noise-tolerant UART RX is an evidence-driven process: control threshold behavior, avoid false-starts, measure burst morphology, and prove recovery under corner conditions. This checklist stays tightly scoped to FE/PE/noise robustness and recovery handling.

Design gate

Input chain: RX trace, reference plane, and return continuity are documented.
Filter plan: avoid stacking strong analog RC with strong digital de-glitch (define X guardrail).
ESD Cpar budget: verify protection capacitance/leakage won’t pull idle toward the threshold.
Ground path: clamp current return does not inject bounce into RX threshold.
Expected waveform: “single crossing” edges (no repeated threshold crossings during events).

Evidence: scope captures (idle + event), schematic notes (Cpar/return path), planned counters window.

Bring-up gate

Patterns: 0x55/0xAA, 0x00/0xFF, walking-1/0, PRBS (for PE fingerprints).
Triggers: FE/PE flag windows (±X ms), abnormal long-LOW capture, event-tag correlation.
Metrics: FE/PE rate, burst length, time-to-recover, false-start rate (X placeholders).
A/B ladder: cable/ground/power/series-R/thresholds/protection swap to classify root cause.

Evidence: decoded burst logs + scope snapshots aligned to events.

Production gate

BIST/loopback: fixed patterns + counter capture (prove observability and recovery).
Corners: temperature, supply ripple/transients, cable length, event injection (placeholders).
Minimum logs: window definition + FE/PE/burst/recovery/false-start + event tags.
Gate rules: pass/fail thresholds must hold across conditions pack (document exceptions).

Evidence: “PASS/FAIL” record sheet per build/board/lot.

Applications & IC selection notes (noise-tolerant UART)

Noise-tolerant UART reception focuses on false-start rejection, sampling robustness, and observable recovery. The notes below remain scoped to UART RX behavior (not a PHY migration guide).

Industrial service/debug port

Noise source: relays, motors, ESD events.
Signature: FE bursts aligned with events; idle errors.
Key hooks: start validation + de-glitch, series-R, hysteresis, resync + counters.
Metric: burst length < X and recovery < X ms under event tags.

Long harness console (cabinet / multi-board)

Noise source: return discontinuities, common-mode shifts.
Signature: errors vary strongly with cable length/route.
Key hooks: front-end conditioning + conservative sampling robustness.
Metric: FE/PE rate below gates across short/long cabling conditions.

High-noise bypass link (fallback channel)

Noise source: switching supplies and transient load steps.
Signature: clustered errors during power steps.
Key hooks: event-tag logging + strict recovery targets.
Metric: time-to-recover and resync count within gates.

Robust RX before low-power transitions

Noise source: rail ramping and thresholds shifting during mode transitions.
Signature: idle instability and false starts near transitions.
Key hooks: start qualification + conservative de-glitch + clear discard/resync policy.
Metric: false-start rate under idle-only stress < X per minute.

MCU/UART “noise-tolerance” feature checklist

Oversampling options: selectable oversampling (commonly 8×/16×) and robust mid-bit sampling behavior.
Start-bit validation: configurable START qualification or digital de-glitch support (or available via programmable filters).
Error observability: FE/PE/overrun flags plus counters or low-overhead logging capability.
Recovery hooks: resync/discard policy can be implemented deterministically without losing long frames.
Clocking flexibility: stable clock source options and divider granularity for robust sampling windows.

Example MCU/UART platforms (for feature comparison)

ST: STM32G0 series (e.g., STM32G071) — verify package/suffix/availability.
ST: STM32L4 series (e.g., STM32L476) — verify package/suffix/availability.
NXP: i.MX RT (e.g., MIMXRT1062) — verify package/suffix/availability.
Microchip: SAM E5x (e.g., ATSAME54P20A) — verify package/suffix/availability.
TI: MSPM0 (e.g., MSPM0G3507) — verify package/suffix/availability.

Note: these examples anchor a comparison checklist; feature availability varies by sub-family and revision.

Schmitt buffer / input conditioning ICs

TI: SN74LVC1G17 (Schmitt buffer) — verify package/suffix/availability.
Nexperia: 74LVC1G17 variants — verify package/suffix/availability.
Onsemi: NC7SZ17 (Schmitt buffer family) — verify package/suffix/availability.

Use when RX threshold chatter is dominant; validate input capacitance and edge timing margins.

Digital isolators (isolation entry)

ADI: ADuM1201 (dual-channel isolator) — verify package/suffix/availability.
TI: ISO7721 (dual-channel isolator) — verify package/suffix/availability.
Silicon Labs: Si8621 family — verify package/suffix/availability.

Use when common-mode disturbance dominates; evaluate propagation delay and edge shaping vs sampling windows.

Protection arrays (ESD entry examples)

TI: TPD2E007 (low-cap ESD protection) — verify package/suffix/availability.
Nexperia: PESD5V0 families — verify package/suffix/availability.
Semtech: RClamp families (low-cap arrays) — verify package/suffix/availability.

Always validate Cpar/leakage/clamp return path; A/B swap can reveal hidden edge and idle shifts.

Selection rule: example part numbers are reference anchors only; always verify package/suffix/availability and confirm capacitance/delay/edge effects against sampling robustness and false-start rejection gates.

FAQs (framing/parity errors under noise)

These FAQs cover long-tail troubleshooting cases without expanding the main text. Each answer has four fixed parts: Likely cause / Quick check / Fix / Pass criteria (thresholds use placeholders X and N).

FE spikes only when motors switch—false start or threshold noise first?

Likely cause

Event-driven coupling causes false START (idle glitches) or threshold chatter (return-path injected noise).

Quick check

Trigger captures on the motor event and log false-start count + FE bursts; probe at the MCU RX pin with short ground to catch sub-µs dips.

Fix

Enable start qualification (N consecutive low samples), set a de-glitch threshold, add series-R, and correct clamp/return routing to keep idle away from the threshold.

Pass criteria

False-start rate < X/min during motor switching; FE < X per 10^N bytes; burst P95 < X bytes; recovery < X ms across the event set.
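
The effect of "N consecutive low samples" can be explored offline on a captured, oversampled idle-line trace. The sketch assumes the trace is a list of 0/1 samples and only counts accepted START candidates. It does not model the rest of the frame, and the function name is hypothetical.

```python
def qualified_starts(samples, n_required):
    """Indices where a low run first reaches n_required samples,
    reported as the index of the first low sample of that run."""
    starts, run = [], 0
    for i, level in enumerate(samples):
        if level == 0:
            run += 1
            if run == n_required:
                starts.append(i - n_required + 1)
        else:
            run = 0
    return starts

# Idle line with a 2-sample glitch, then a real START held low for 8 samples.
trace = [1] * 10 + [0] * 2 + [1] * 10 + [0] * 8 + [1] * 5

print(qualified_starts(trace, 1))   # no qualification: glitch counts as START
print(qualified_starts(trace, 4))   # qualified: only the real START remains
```

Running the same trace at several values of `n_required` gives a quick estimate of how much qualification is needed before false starts disappear, without yet risking missed frames on hardware.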

Parity errors only on certain byte values—pattern test to tell noise vs parity mismatch?

Likely cause

Either parity mode mismatch (even/odd/mark/space) or value-dependent bit flips caused by edge/coupling sensitivity.

Quick check

Run 0x00/0xFF, 0x55/0xAA, walking-1/0, and PRBS; compare PE vs pattern and confirm both ends’ parity configuration side by side.

Fix

Align parity mode first; if noise fingerprints remain, reduce ringing (series-R), add hysteresis (Schmitt input), and tighten sampling robustness (oversampling + filters).

Pass criteria

PE < X per 10^N bytes on all patterns; no value-dependent PE spikes; recovery < X ms after injected disturbances.

FE but scope looks “clean”—what trigger/measurement is missing?

Likely cause

The disturbance is too rare/fast to see without the right trigger, or it happens at the MCU pin but not where the probe is placed.

Quick check

Trigger on FE (software windowing) and use scope glitch/holdoff + infinite persistence; probe at the MCU RX pin with a ground spring and compare to upstream nodes.

Fix

Add event-tagged captures, tighten probe technique, and enable start qualification/de-glitch so brief dips cannot create false frames.

Pass criteria

The capture method reproduces the event on demand; after fixes FE < X per 10^N bytes and false-start < X/min under the same stress.

Works on bench, fails in chassis—what return-path check is fastest?

Likely cause

Chassis bonding changes the return path, causing ground bounce or common-mode shifts that push RX across the threshold.

Quick check

Measure RX idle level and ground delta between endpoints in bench vs chassis; A/B with a temporary ground strap/shield bond and watch FE burst change immediately.

Fix

Restore a controlled return path (bonding strategy + routing), keep clamp current out of signal reference, and add hysteresis/conditioning if threshold chatter is dominant.

Pass criteria

FE < X per 10^N bytes and burst P95 < X bytes in chassis; same metrics also pass on bench and across cable/route variants.

Errors burst in clusters—what does burstiness imply about coupling?

Likely cause

Clustered bursts indicate intermittent/event-driven coupling (transients, switching edges) rather than steady drift.

Quick check

Histogram burst length and align bursts to event tags (motor/relay/hot-plug) and supply ripple captures; verify whether bursts coincide with repeated threshold crossings.

Fix

Remove the coupling path (return, routing, shielding), slow overly fast edges (series-R), and enforce a safe resync/discard policy to bound data loss during bursts.

Pass criteria

Burst P95 < X bytes (and max < X bytes if required); recovery < X ms; FE/PE rates remain below gates across the event set.

Adding an ESD array increased FE—first capacitance/edge-rate sanity check?

Likely cause

Added Cpar slows edges and changes sampling margin, or leakage/clamp return injects noise that creates false START and FE bursts.

Quick check

A/B compare RX pin rise/fall time and idle level with/without the ESD array; verify single threshold crossing and check whether FE bursts align to clamp current events.

Fix

Choose a lower-capacitance ESD part, relocate/route clamp return cleanly, add series-R for ringing control, and retune start qualification/de-glitch thresholds.

Pass criteria

FE < X per 10^N bytes and false-start < X/min after protection changes; edges remain single-crossing under event stress.

Oversampling 8x works, 16x worse—what setting interaction to check?

Likely cause

Sampling-point/majority logic and digital filtering interact with the oversampling ratio; an incorrect clock or filter window can shrink the robust mid-bit margin.

Quick check

Verify oversampling and sampling-point registers together; repeat the same pattern set at both ratios and compare FE/PE + false-start counts under identical stress.

Fix

Retune filter/qualification windows for 16x, ensure mid-bit sampling alignment, and confirm a stable UART clock source; keep only one “strong” filter layer if stacking causes misses.

Pass criteria

Selected oversampling mode meets FE/PE gates (FE < X, PE < X per 10^N bytes) and does not increase false-start or dropped-character rates under stress.

De-glitch filter reduces FE but drops characters—how to set minimum pulse width?

Likely cause

The filter is too aggressive and rejects legitimate START transitions (or legitimate short low pulses in the presence of slow edges), causing missing characters.

Quick check

Sweep the minimum pulse-width threshold and log (a) false-start rate and (b) dropped-character rate; compare against the expected bit-time at the current baud/oversampling.

Fix

Set the threshold above the observed noise pulse width but below the shortest valid START signature; use two-stage logic (qualify then confirm at mid-bit) instead of a single hard cutoff.

Pass criteria

False-start < X/min and dropped characters < X per 10^N bytes under the same stress; FE < X per 10^N bytes.
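
The threshold bounds can be expressed in oversampling ticks before sweeping on hardware. This sketch assumes the filter counts consecutive low samples, that sampling is asynchronous to the noise (so a pulse of width w can cover up to floor(w/T) + 1 samples), and that the shortest valid START signature is a configurable fraction of a bit. `start_fraction` and the function name are assumptions for illustration.

```python
import math

def deglitch_window(baud, oversampling, max_noise_pulse_s, start_fraction=0.5):
    """Return (min_count, max_count) of consecutive low samples that
    rejects the noise pulse yet still accepts a valid START, or None
    if no such count exists at this baud/oversampling."""
    samples_in_pulse = max_noise_pulse_s * baud * oversampling
    # Worst case the pulse covers floor(...) + 1 samples, so require one more.
    min_count = math.floor(samples_in_pulse) + 2
    max_count = math.floor(oversampling * start_fraction)
    return (min_count, max_count) if min_count <= max_count else None

print(deglitch_window(115_200, 16, 1e-6))   # 1 us noise at 16x
print(deglitch_window(115_200, 8, 1e-6))    # same noise at 8x
print(deglitch_window(115_200, 16, 6e-6))   # noise too wide for this filter
```

A `None` result suggests the noise is too wide for a pulse-width filter alone at that baud rate, which points back to front-end conditioning or removing the coupling path. The differing ranges at 8x and 16x also show why a count tuned for one ratio should be retuned for the other.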

PE rises with temperature—clock drift or input threshold drift first?

Likely cause

Either sampling margin degrades via clock drift, or the RX idle/threshold shifts via leakage/threshold drift with temperature.

Quick check

Compare PE under 0x55/0xAA vs 0x00/0xFF while logging RX idle level and VDDIO; if PE tracks edge density/phase, prioritize sampling/clock; if idle drifts, prioritize threshold conditioning.

Fix

Stabilize or calibrate the UART clock where supported, and/or add hysteresis/conditioning to prevent threshold chatter; keep acceptance tied to the same stress corners and counters.

Pass criteria

PE < X per 10^N bytes across temperature corners; FE remains below gate; recovery < X ms after stress events (clock-budget math is handled in the baud-budget subpage).

FE only after brown-out—what “ghost powering” check is fastest?

Likely cause

IO back-powering through clamps or incomplete reset leaves the receiver in a bad state; threshold and idle behavior change during/after brown-out.

Quick check

During brown-out cycles, monitor VDDIO and RX simultaneously; check whether RX is being driven while VDDIO is low and whether the RX pin shows abnormal clamp conduction signatures.

Fix

Limit injection with series-R, enforce power sequencing/high-Z during brown-out, enable BOR reset, and explicitly reset/clear UART state + counters on recovery.

Pass criteria

After X brown-out cycles, FE < X per 10^N bytes and recovery < X ms; no persistent false-start elevation in idle-only stress.

Noise causes random framing—what resync policy is safest: discard byte vs discard frame?

Likely cause

False-starts and stop-bit violations desynchronize the byte stream; continuing to parse “through errors” can amplify corruption.

Quick check

Compare two policies under the same disturbance: (a) discard single byte on FE vs (b) discard until a clean idle/stop is observed; measure recovery time and max data loss per burst.

Fix

Safest default is discard until idle is stable (≥ one character time) and then resync; keep bounded loss by using counters + a deterministic recovery state machine.

Pass criteria

Recovery < X ms and worst-case data loss < X bytes per disturbance; FE burst length is bounded and post-recovery FE/PE return below gates.

Analyzer decodes fine but MCU flags FE—what sampling-point mismatch check first?

Likely cause

The analyzer sees a different threshold/reference than the MCU, or the disturbance occurs at the MCU pin; oversampling/sample-point configuration may also be misaligned.

Quick check

Probe at the MCU RX pin and set analyzer thresholds to match VIH/VIL behavior; verify UART sampling-point/oversampling settings and confirm stop-bit level at the exact sampling moment.

Fix

Align measurement thresholds, add hysteresis (Schmitt buffer) if needed, and tune start qualification/de-glitch so the MCU cannot be fooled by edge chatter invisible to the analyzer.

Pass criteria

MCU FE counter aligns with external decode (near zero) under identical conditions; FE < X per 10^N bytes and false-start < X/min across the stress set.
