Speed & Timing Budgets for I²C, SPI and UART
Speed/Timing is not “frequency”—it is proven sampling margin at the worst node and worst corner. This page shows how to convert datasheet timing into a board-level budget and close the loop with measurements and pass criteria.
What “Speed/Timing” Means Across I²C / SPI / UART
“Speed” is not just a bus frequency number. It is the measurable sampling margin left after the real board (drivers, interconnect, thresholds, and sampling) consumes the ideal timing window.
- What are tSU/tHD? (setup/hold as a safety window around the sampling instant)
- Why do rise/fall limits cap speed? (threshold-crossing uncertainty shrinks the window)
- How to interpret “sampling window”? (where the receiver truly samples, and how margin is proven)
- Launch edge: where timing starts (slew, drive strength, clock quality).
- Interconnect: where edges distort (RC, reflections, skew, coupling).
- Threshold: where “logic 0/1” is decided (VIH/VIL, noise, common-mode shifts).
- Sample point: where the receiver commits a bit (sampling jitter, duty error, phase).
- Margin: the remaining safe window (must survive worst-case PVT and be measurable).
- Convert datasheet timing into a board-level timing budget (window → losses → guardband).
- Identify where margin is being consumed (edge distortion, threshold drift, sampling uncertainty).
- Define Pass criteria as measurable limits (placeholders used here; thresholds set per design).
This page focuses on timing budgets and sampling margin. Protection, isolation, topology, and full protocol behavior are handled on their dedicated pages; this page references them only when they directly change timing.
Timing Primitives: tSU/tHD, tR/tF, Skew, Jitter, Duty, Window
This chapter standardizes definitions and measurement conventions. Without consistent primitives, a timing budget can be numerically correct yet operationally wrong.
- “Faster edge is always better” — not if ringing/noise increases threshold-crossing uncertainty and shrinks the safe window.
- “Skew equals jitter” — skew is deterministic arrival mismatch; jitter is statistical timing uncertainty around an edge.
- “Setup/hold are abstract” — they are a physically drawable safety zone around a defined sampling instant (edge-based or threshold-based).
- tR/tF convention: measure using the same percentage points as the spec (10–90% or 30–70%). Record probe loading and bandwidth limits.
- Setup/hold reference: define whether timing is referenced to a sampling edge (clocked interfaces) or a threshold crossing (edge-detection cases). Do not mix these.
- Duty-cycle relevance: duty error reduces the effective half-cycle window, directly consuming timing margin even when frequency is correct.
tSU (setup time): minimum time the data must be stable before the defined sampling instant.
- Measured at: receiver pin or defined reference point (must be stated).
- Consumes window via: clock arrival skew, data propagation uncertainty, threshold-crossing jitter.
- Pass criteria placeholder: tSU_meas ≥ X (after guardband).
tHD (hold time): minimum time the data must remain stable after the defined sampling instant.
- Measured at: receiver pin or defined reference point (must be stated).
- Consumes window via: minimum data delay, clock/data relative timing uncertainty.
- Pass criteria placeholder: tHD_meas ≥ X (after guardband).
tR/tF (rise/fall time): transition time between defined voltage percentages (spec-dependent). It directly changes threshold-crossing timing spread and thus the usable sampling window.
- Measured with: correct percentage points and a clearly defined threshold reference.
- Consumes window via: slower slope → more time uncertainty under noise; faster slope → potential ringing/overshoot shifting effective crossing.
- Pass criteria placeholder: tR/tF within spec at the receiver pin (worst-case node).
- Skew (Δt): deterministic clock/data arrival mismatch across endpoints; subtracts directly from setup/hold.
- Jitter: statistical uncertainty of edge timing; shrinks safe window based on the chosen confidence level.
- Duty: changes the effective half-cycle; a correct frequency with wrong duty still reduces sampling margin.
Budget mapping rule of thumb: usable window = ideal window − (skew + jitter + edge/threshold uncertainty + propagation variation) − guardband.
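The rule of thumb above can be sketched as a small calculation. This is a minimal illustration, not a definitive tool: the function name `usable_window_ns` and every number in the example are assumptions chosen only to show the arithmetic.

```python
def usable_window_ns(ideal_ns, losses_ns, guardband_ns):
    """Window left after subtracting every bounded loss term and the guardband."""
    return ideal_ns - sum(losses_ns.values()) - guardband_ns

# Hypothetical worst-case bounds, all in nanoseconds.
losses = {
    "skew": 2.0,
    "jitter": 1.5,
    "edge_threshold_uncertainty": 3.0,
    "propagation_variation": 2.5,
}

margin = usable_window_ns(ideal_ns=50.0, losses_ns=losses, guardband_ns=5.0)
print(f"usable window: {margin:.1f} ns")
```

Keeping each loss as a named entry mirrors the idea that every term needs its own bound and its own measurement method.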
The Budget Method: Worst-Case Stack-Up + Guardband
A reliable interface is proven by worst-case sampling margin, not by a single “good-looking” waveform. This method converts a datasheet spec into a board-level, measurable timing proof.
- Spec → board: how to translate datasheet timing into a board-level budget.
- Worst-case: how to pick max/min combinations and avoid “average-value engineering”.
Express the timing window as “ideal window − losses”. Each loss must have a bound (max/min) and a measurement method.
- Driver delay / output uncertainty
- RC edge / threshold-crossing spread
- Trace/cable skew (endpoint mismatch)
- Receiver threshold shift (VIH/VIL, noise)
- Sampling uncertainty (jitter, duty, phase)
Worst-case is a combination, not “all maximum values”. Setup worst-case typically pairs latest data with earliest sampling; hold worst-case pairs earliest data change with latest sampling.
For each row, record: worst direction (max or min), why it is worst, and how it is bounded.
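The setup and hold pairings described above can be written out explicitly. The sketch below assumes all times are measured at the receiver pin relative to one common reference event; the function names and the example values are hypothetical.

```python
def setup_margin_ns(t_sample_earliest, t_data_valid_latest, t_su):
    """Setup worst case: latest data arrival paired with earliest sampling."""
    return t_sample_earliest - t_data_valid_latest - t_su

def hold_margin_ns(t_data_change_earliest, t_sample_latest, t_hd):
    """Hold worst case: earliest next data change paired with latest sampling."""
    return t_data_change_earliest - t_sample_latest - t_hd

# Hypothetical bounds in ns: sampling nominally at 50 ns with +/-2 ns uncertainty.
setup = setup_margin_ns(t_sample_earliest=48.0, t_data_valid_latest=30.0, t_su=10.0)
hold = hold_margin_ns(t_data_change_earliest=104.0, t_sample_latest=52.0, t_hd=5.0)
print(f"setup margin {setup:.1f} ns, hold margin {hold:.1f} ns")
```

Note that the two checks deliberately use opposite extremes of the sampling instant, which is why "all maximum values" is not the worst case.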
Guardband reserves margin for uncontrolled factors: PVT corners, assembly variation, probe loading, and measurement uncertainty. Treat guardband as a named budget row, not a hidden “extra”.
- Use a percentage of the ideal window and/or an absolute time floor.
- Choose the larger of the two: guardband = max(α·window, X) (α, X defined per program).
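A minimal sketch of the guardband rule follows. The values of α (`alpha`) and the absolute floor are placeholders that each program must define; the numbers here are illustrative assumptions only.

```python
def guardband_ns(window_ns, alpha, floor_ns):
    """guardband = max(alpha * window, floor): percentage rule with an absolute floor."""
    return max(alpha * window_ns, floor_ns)

# Wide window: the percentage term dominates.
wide = guardband_ns(window_ns=50.0, alpha=0.10, floor_ns=3.0)
# Narrow window: the absolute floor dominates.
narrow = guardband_ns(window_ns=20.0, alpha=0.10, floor_ns=3.0)
print(wide, narrow)
```

The floor keeps the guardband from shrinking to an unmeasurable value as the window narrows at higher speeds.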
Pass criteria must specify what is measured, where it is measured (receiver pin / worst node), and under what corner (voltage, temperature, load).
- Setup margin ≥ X ns at receiver pin (worst-case node).
- Rise/Fall within spec at receiver pin (same % convention as datasheet).
- Clock duty within X–Y% at the loaded clock node.
A single “typical” waveform may pass, while a production unit fails at a corner (temperature, voltage, longer harness, or worst node). Worst-case stack-up makes failure modes predictable and testable before scaling.
I²C Timing Budgets: tLOW/tHIGH, tSU:DAT, tHD:DAT, tSU:STA, tBUF
I²C speed is often limited by open-drain edges: the pull-up network and bus capacitance shape the rise time, which directly consumes tHIGH and shrinks the usable margin.
- Mode limits: why 100/400/1000/3400 kbps is usually constrained by timing, not “drive strength”.
- Same 400 kHz, different outcomes: bus capacitance, edge shape, and worst-node timing decide reliability.
- Open-drain + pull-up defines the rise time (and therefore the threshold-crossing moment).
- Rise time consumes tHIGH and reduces the time available for data validity and sampling margin.
- Worst node matters: the farthest / highest-capacitance branch is usually the true limiter.
START and STOP are recognized when SDA transitions while SCL is high. This makes rise time and threshold-crossing stability critical during the high period.
- tSU:STA: setup time for a repeated START condition (SCL must be high for this long before SDA falls).
- tHD:DAT / tSU:DAT: data stability around the sampling of bits (timing must be measured at the receiver pin).
- tBUF: bus free time between STOP and the next START (timing-only constraint).
Low/high periods must remain compliant at the worst node. Allow extra margin for devices that delay responses (e.g., clock stretching), because stretching lengthens the effective low period and changes the low/high timing seen by the controller.
- tR/tF: must meet mode requirements at the receiver pin (correct % convention).
- tHIGH/tLOW: must remain within the mode’s min/max under corner conditions.
- Margin note: state the master timeout policy elsewhere; here, only quantify the timing impact.
- Declare the target mode (100/400/1000/3400 kbps) and the measurement convention (10–90% or 30–70%).
- Measure at the worst node (highest Cbus / farthest branch), not only at the controller pins.
- Verify tR/tF and tHIGH/tLOW across voltage and temperature corners.
- Record probe loading, threshold level, and capture settings so results are reproducible.
- Publish pass criteria as placeholders: tR ≤ X, tHIGH ≥ X, tLOW ≥ X, tBUF ≥ X (at the named node).
I²C Rise/Fall (tR/tF) from RC: Pull-Up Sizing as a Timing Budget
For I²C, the rising edge is shaped by Rpull-up × Cbus. If tR is too slow, the threshold-crossing moment drifts later, consuming tHIGH and shrinking usable sampling margin.
- How to size pull-ups: compute a feasible R range and choose a safe point.
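- Why 400 pF is a trap: at a bus capacitance that high, slow edges consume tHIGH and increase crossing uncertainty.
- Too small vs too large: a small pull-up raises sink current, VOL, and power; a large pull-up slows the rising edge and costs timing margin.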
- Declare the target I²C mode and the rise/fall convention (10–90% or 30–70%) consistent with the spec.
- Measure at the worst node (highest Cbus / farthest branch), not only at the controller pin.
Cbus is the sum of device input capacitance, PCB trace/cable capacitance, connectors, and any added protection components. Use a conservative estimate for the worst node.
- Devices: ΣCin across all attached nodes on the segment
- Interconnect: trace/cable capacitance by length
- Connectors/harness: often non-trivial at the far end
- Added parts: ESD arrays, test points, probe loading
Rise time scales with the RC time constant. Use the mode’s rise-time limit to compute the maximum pull-up resistance that still meets tR at the worst node.
tR ∝ Rpull-up · Cbus. Use the spec convention (10–90% or 30–70%) to map tR_spec to an upper bound Rmax.
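For an ideal single-pole RC charge toward VDD, the proportionality constant depends only on the percentage points used, so the convention changes Rmax by a large factor. The sketch below assumes that ideal RC model; the 300 ns limit and 200 pF bus capacitance are illustrative inputs and should be replaced with the limit from the applicable spec and the worst-node Cbus estimate.

```python
import math

def rc_rise_factor(lo_frac, hi_frac):
    """Time between lo_frac and hi_frac of VDD, in units of R*C, for an ideal RC charge."""
    return math.log((1.0 - lo_frac) / (1.0 - hi_frac))

def pullup_max_ohm(tr_max_s, cbus_f, lo_frac=0.3, hi_frac=0.7):
    """Largest pull-up that still meets tr_max_s at the given bus capacitance."""
    return tr_max_s / (rc_rise_factor(lo_frac, hi_frac) * cbus_f)

k_30_70 = rc_rise_factor(0.3, 0.7)  # about 0.847
k_10_90 = rc_rise_factor(0.1, 0.9)  # about 2.197

# Illustrative inputs: 300 ns rise-time limit, 200 pF worst-node capacitance.
r_max = pullup_max_ohm(tr_max_s=300e-9, cbus_f=200e-12)
print(f"k(30-70%)={k_30_70:.3f}, k(10-90%)={k_10_90:.3f}, Rmax={r_max:.0f} ohm")
```

Using the 10–90% factor against a limit specified at 30–70% (or the reverse) skews Rmax by roughly 2.6×, which is why the convention must be declared first.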
The pull-up must not force a low-level current that exceeds device sink capability, and VOL must remain within limits. This defines a lower bound Rmin.
- Device datasheet: IOL (sink current) at the relevant VOL condition.
- System requirement: worst-case VOL(max) at the receiver pin.
If Rmin > Rmax, the target mode cannot be met on this segment without reducing Cbus (segmentation/buffering) or lowering the bus speed. Otherwise, select a mid-range value biased toward margin.
- Cbus estimate and worst-node definition
- Computed Rmin (VOL/IOL) and Rmax (tR_spec)
- Recommended R (margin-biased) with measurement plan
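The lower bound and the feasibility check can be sketched in the same style. The supply, VOL, and IOL figures below are assumed example values, and the two Rmax inputs stand in for results of a rise-time calculation at two different bus capacitances; take the real numbers from the device datasheets.

```python
def pullup_min_ohm(vdd_v, vol_max_v, iol_a):
    """Smallest pull-up that keeps sink current within IOL at VOL(max)."""
    return (vdd_v - vol_max_v) / iol_a

def feasible_pullup_range(r_min_ohm, r_max_ohm):
    """Return (Rmin, Rmax) when a pull-up value exists, otherwise None."""
    if r_min_ohm > r_max_ohm:
        return None
    return (r_min_ohm, r_max_ohm)

# Assumed example: 3.3 V bus, VOL(max) = 0.4 V at IOL = 3 mA.
r_min = pullup_min_ohm(vdd_v=3.3, vol_max_v=0.4, iol_a=3e-3)

ok = feasible_pullup_range(r_min, r_max_ohm=1770.0)      # lighter bus: a range exists
blocked = feasible_pullup_range(r_min, r_max_ohm=885.0)  # heavier bus: Rmin > Rmax
print(ok, blocked)
```

A `None` result corresponds to the case where the segment needs less capacitance (segmentation or buffering) or a lower bus speed.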
Series-R/RC is not only about emissions; it can stabilize the threshold-crossing time by reducing ringing and multiple crossings near VIH/VIL. A more deterministic crossing moment improves usable margin.
- If the waveform crosses the threshold multiple times, sampling becomes probabilistic.
- Prefer timing evidence: reduced crossing spread at the receiver pin under worst-node loading.
- tR ≤ X (receiver pin, worst node, spec % convention)
- VOL ≤ X at IOL condition (worst node)
- No multiple crossings near VIH/VIL (crossing spread ≤ X)
SPI Timing Budgets: CPOL/CPHA Sampling Window, tSU/tHD, tCO, tDO
SPI reliability is decided by whether the sampling edge lands inside the data-valid window. CPOL/CPHA choose the relationship between clock edges and data transitions; interconnect delay and uncertainty shrink the usable window.
- Why modes 0–3 slip bits: sampling edge too close to data transitions.
- How to align sampling: map sample edges to the data-valid window.
- What to do when tSU/tHD is tight: shift phase (CPHA), reduce uncertainty, or reduce SCLK.
Skew, reflections, buffering, and threshold drift move the sampling edge on the time axis.
Slave tCO, path delay, and edge drift define where data becomes valid and how long it stays valid before the next transition.
Start from the effective half-cycle (or phase window). Subtract bounded uncertainty terms to get the usable sampling window:
usable window = phase window − clock jitter − clock skew − data uncertainty − edge/threshold drift − guardband
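The subtraction above is easy to evaluate at several clock rates. This sketch assumes the phase window is an ideal half period (50% duty, sampling on the opposite edge from launch); the function name and all loss values are hypothetical.

```python
def spi_usable_window_ns(sclk_hz, jitter, skew, data_uncertainty, edge_drift, guardband):
    """Usable sampling window in ns, assuming the phase window is half an SCLK period."""
    phase_window = 0.5e9 / sclk_hz
    return phase_window - jitter - skew - data_uncertainty - edge_drift - guardband

# Hypothetical worst-case loss terms in ns, identical at both clock rates.
terms = dict(jitter=1.0, skew=2.0, data_uncertainty=15.0, edge_drift=3.0, guardband=5.0)

at_10mhz = spi_usable_window_ns(10e6, **terms)
at_25mhz = spi_usable_window_ns(25e6, **terms)
print(f"10 MHz: {at_10mhz:.1f} ns, 25 MHz: {at_25mhz:.1f} ns")
```

A negative result means the loss terms alone exceed the phase window, so no sample-point adjustment can recover margin at that rate.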
At the same SCLK, longer traces, heavier loads, and different buffering can increase skew/jitter and shrink the data-valid window, reducing margin even when frequency is unchanged.
- Shift sampling phase: choose CPHA so the sample edge lands deeper inside the data-valid region.
- Reduce uncertainty: bound skew/jitter/threshold drift (tighten both the sampling-edge terms and the data-valid terms).
- Reduce SCLK or shorten the clock/data path (recovers margin without protocol changes).
- Setup margin ≥ X ns at the sampling receiver pin (worst slave node)
- Hold margin ≥ X ns at the sampling receiver pin (worst slave node)
- Clock skew/jitter ≤ X at the loaded SCLK node (document probe + trigger)
SCLK Quality: Frequency, Duty Cycle, Edge Placement, Skew Control
Clock quality directly determines the phase window available for sampling. Even at the same frequency, duty distortion, edge uncertainty, jitter, and arrival skew can shrink the usable window and push the sampling edge too close to data transitions.
- Duty out of spec: how the half-cycle window gets squeezed.
- Edge placement issues: why crossing uncertainty becomes a timing penalty.
- Skew control: how to bound master↔slave SCLK arrival spread as a budget row.
usable window = phase window − duty distortion − edge/threshold spread − jitter − arrival skew − guardband
When duty deviates from 50%, one half-cycle becomes shorter. The effective sampling phase window can shrink even if frequency is unchanged.
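The squeeze can be quantified directly. The sketch below is a simple illustration under the assumption that the limiting phase window is the shorter of the two half-cycles; the 20 MHz clock and 45% duty are example values.

```python
def short_half_cycle_ns(sclk_hz, duty):
    """Length of the shorter half-cycle in ns for a given duty cycle (0..1)."""
    period_ns = 1e9 / sclk_hz
    return min(duty, 1.0 - duty) * period_ns

ideal = short_half_cycle_ns(20e6, 0.50)
distorted = short_half_cycle_ns(20e6, 0.45)
print(f"50% duty: {ideal:.1f} ns, 45% duty: {distorted:.1f} ns")
```

The difference between the two results is the amount to enter as the duty-distortion row of the budget.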
Very slow edges increase sensitivity to noise near the threshold; ringing can create multiple crossings. Both enlarge the uncertainty of the effective edge time.
Buffering/re-timing matters in timing terms when it reduces the master↔slave arrival spread and tightens the worst-case sampling edge placement.
- Duty cycle in X%–Y% at the loaded SCLK node (document probe + threshold)
- Edge-to-edge jitter ≤ X at the loaded SCLK node (consistent definition)
- Arrival skew (master vs worst slave) ≤ X (same threshold, same reference)
UART Timing: Baud Error Budget + Oversampling Sampling Window
UART is asynchronous: alignment starts at the start-bit edge (t0), then sampling targets the center of each bit cell. TX/RX clock error accumulates as sampling drift across the frame; long frames reduce end-of-frame margin.
- Where ±2% comes from: center-sampling tolerance before crossing bit boundaries.
- Why 1% + 1% can fail: errors add, then edge uncertainty consumes margin.
- Sampling drift: worst risk appears at the last data/stop region.
- TX clock error (XTAL tolerance, temp drift, PLL/divider error)
- RX clock error (same contributors)
- Noise / slow edges → threshold-crossing spread
- Input filtering / de-glitch → effective timing spread
- Choose lower-drift clock sources; add calibration hooks
- Use 8×/16× oversampling to stabilize center sampling
- Watch long frames: drift accumulates toward the end
Oversampling provides finer timing granularity for selecting a robust center sample point and reduces sensitivity to short-lived edge disturbances. It cannot compensate when the total drift pushes the center sample beyond bit boundaries.
- TX baud error ≤ X% across voltage/temperature corners
- RX tolerance: no framing errors within ±X% injected baud mismatch
- End-of-frame margin ≥ X% of bit time at the last data/stop region
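The accumulation of drift toward the end of a frame can be estimated with a simplified model. The sketch assumes worst-case TX and RX errors in opposite directions, a sample point at the nominal bit center, and a start-edge detection uncertainty of one oversampling tick; real receivers differ in detail, so treat the result as a first-pass budget. All names and numbers are illustrative.

```python
def end_of_frame_margin(tx_err, rx_err, last_bit_index, oversample, edge_unc_bits):
    """Remaining margin (in bit times) at the center of the last sampled bit.

    last_bit_index counts bits after the start edge: for 8N1 the stop bit is 9,
    so its center lies 9.5 bit times after the start-bit edge.
    """
    total_err = abs(tx_err) + abs(rx_err)         # worst case: opposite directions
    drift = (last_bit_index + 0.5) * total_err    # accumulated sample-point offset
    quantization = 1.0 / oversample               # start-edge detection granularity
    return 0.5 - drift - quantization - edge_unc_bits

# 8N1 frame, 16x oversampling, 5% of a bit lost to edge uncertainty (assumed).
m_1pct = end_of_frame_margin(0.01, 0.01, last_bit_index=9, oversample=16, edge_unc_bits=0.05)
m_25pct = end_of_frame_margin(0.025, 0.025, last_bit_index=9, oversample=16, edge_unc_bits=0.05)
print(f"1%+1%: {m_1pct:.4f} bit, 2.5%+2.5%: {m_25pct:.4f} bit")
```

Raising `last_bit_index` for longer frames shows how quickly the same baud mismatch erodes the remaining margin.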
Measurement & Validation: From Spec to Scope/LA Pass Criteria
Validation closes the loop by translating datasheet timing into board-level budgets, then proving margin at the worst node with repeatable probe points, triggers, instrument settings, and measurable pass thresholds.
- How to measure tR/tF (consistent thresholds and probe loading)
- How to measure setup/hold (align to the sampling event)
- Scope vs logic analyzer (analog margin vs long-run statistics)
- Worst node: far end / highest load / largest C branch
- Receiver-side timing: measure where sampling happens
- Document the node: connector, test pad, device pin, or branch end
- I²C: align to START/STOP or the relevant SCL edge
- SPI: align to CS active edge and the CPOL/CPHA sampling edge
- UART: align to the start-bit threshold crossing (t0)
- Bandwidth: too low hides ringing/jitter and distorts tR/tF
- Probe loading: capacitance + ground lead can change the edge
- Threshold: LA thresholds must match the intended VIH/VIL definition
- Sampling rate: LA undersampling can miss rare or narrow violations
- Scan probe nodes: driver-side vs receiver-side vs far end
- Scan conditions: voltage, temperature, load count, cable/trace length
- Combine tools: scope for analog margin + LA for long-run statistics
- Metric: tR / tF / setup / hold / duty / jitter / skew
- Definition: threshold convention + reference event
- Probe point: worst node (name it)
- Instrument: bandwidth / sampling / threshold / probe
- Threshold: ≤ X or ≥ X, with sample count N
- Scope screenshot with thresholds + time markers
- LA capture with decoding + statistics (errors/retries)
- Run log: node, corner, settings, pass/fail result
Principle: pass is evaluated at the worst node under repeatable worst-case conditions, not the cleanest-looking waveform.
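One way to keep the metric, definition, probe point, limit, and sample count together is a small record type that also evaluates the result. This is a hypothetical structure for a test script, not a prescribed format; the field names and example limit are assumptions.

```python
import operator
from dataclasses import dataclass

@dataclass
class PassCriterion:
    metric: str        # e.g. "tR"
    definition: str    # threshold convention + reference event
    probe_point: str   # named worst node
    comparison: str    # "<=" or ">="
    limit: float
    min_samples: int   # required sample count N

    def evaluate(self, samples):
        """Pass only if enough samples were taken and every sample meets the limit."""
        if len(samples) < self.min_samples:
            return False
        compare = operator.le if self.comparison == "<=" else operator.ge
        return all(compare(s, self.limit) for s in samples)

rise = PassCriterion("tR", "30-70% of VDD, SCL rising", "J3 far-end pad", "<=", 300.0, 3)
print(rise.evaluate([250.0, 280.0, 295.0]))  # all within limit
print(rise.evaluate([250.0, 310.0, 280.0]))  # one violation
```

Requiring a minimum sample count prevents a single clean capture from being recorded as a pass.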
Firmware Timing Hooks: Timeouts, Retries, Sampling Adjust, Guard Time
Firmware can consume or recover timing margin. Timeouts bound worst-case latency, retries change load and event density, sampling adjustments move the effective sample point toward the center of the window, and guard time reduces continuous window pressure under marginal conditions.
- Hardware looks fine but transfers still drop: timing knobs that change effective margin.
- Timeout/retry impact: how policy boundaries alter user-visible timing behavior.
- Sampling adjustment: how to validate sample-point movement with measurable pass limits.
I²C: timeout
- When: slow devices or stretching can stall the bus
- Set: cover acceptable delay + safety margin; avoid infinite waits
- Verify: no false timeouts in worst-case operation
- Pass: recovery time ≤ X; false timeout rate ≤ X
I²C: retry
- When: marginal edges cause occasional NAK
- Set: limit retry count; add spacing to avoid “retry storms”
- Verify: retry rate decreases under worst-node tests
- Pass: retries/transaction ≤ X; latency ≤ X
I²C: guard time
- When: back-to-back transfers increase event density
- Set: insert minimum spacing to reduce continuous window pressure
- Verify: fewer bursts of NAK/timeouts in long-run capture
- Pass: error burst rate ≤ X; throughput drop ≤ X%
SPI: sampling adjust
- When: setup/hold margins are tight; occasional bit errors
- Set: move sample point toward the center of data-valid window
- Verify: increased setup/hold at receiver node
- Pass: setup ≥ X; hold ≥ X; error rate ≤ X
SPI: retry
- When: transient errors appear under load
- Set: bounded retry count + spacing to avoid repeated marginal sampling
- Verify: long-run BER/CRC improves on LA capture
- Pass: retries/frame ≤ X; tail latency ≤ X
SPI: guard time
- When: continuous bursts reduce effective margin (heat/rail/noise)
- Set: inter-word spacing or chunking to reduce continuous stress
- Verify: fewer error bursts at worst node & corners
- Pass: error burst rate ≤ X; throughput drop ≤ X%
UART: baud calibration
- When: clock drift causes framing/parity errors
- Set: calibrate the baud divisor from known edges/characters
- Verify: baud estimate converges across corners
- Pass: baud error ≤ X%; framing errors ≤ X
UART: oversampling
- When: noisy edges create threshold-crossing spread
- Set: 8×/16× oversampling; keep center sampling stable
- Verify: fewer start/bit decision errors in long runs
- Pass: error rate ≤ X at worst node & corners
UART: inter-frame guard time
- When: back-to-back frames reduce recovery margin
- Set: minimum inter-frame spacing under marginal conditions
- Verify: fewer error bursts and lower tail latency
- Pass: framing bursts ≤ X; throughput impact ≤ X%
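The "bounded retry count + spacing" policy that recurs above can be sketched generically. Here `transfer` is a hypothetical callable standing in for one bus transaction that returns True on success; the retry limit and spacing are placeholders to be set per design.

```python
import time

def transfer_with_retry(transfer, max_retries=3, spacing_s=0.001, sleep=time.sleep):
    """Run transfer() until it succeeds, with a bounded retry count and fixed spacing.

    Returns the number of retries used; raises RuntimeError once the bound is hit.
    """
    for attempt in range(max_retries + 1):
        if transfer():
            return attempt
        if attempt < max_retries:
            sleep(spacing_s)  # spacing avoids hammering a marginal bus
    raise RuntimeError(f"transfer failed after {max_retries} retries")

# Simulated transaction: fails twice, then succeeds. Delays are recorded, not slept.
outcomes = iter([False, False, True])
delays = []
retries_used = transfer_with_retry(lambda: next(outcomes), max_retries=3,
                                   spacing_s=0.002, sleep=delays.append)
print(retries_used, delays)
```

Returning the retry count lets the caller log retries per transaction, which is the quantity the pass criteria above bound.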
Engineering Checklist: Design → Bring-up → Production (Timing-Focused)
This checklist turns timing budgets into repeatable sign-off. Each item is measurable at the worst node, under defined corners, and produces an evidence pack suitable for handoff and production audits.
- Item: one timing claim to verify
- Why: which margin/uncertainty it consumes
- How: probe point + trigger + sample count
- Evidence: scope / CSV / auto-test / LA stats
- Pass: ≤ X or ≥ X (placeholder)
Applications & IC Selection Notes (Timing-First)
Selection is framed as required timing capabilities, not a catalog. The example part numbers below are common building blocks; always verify package/suffix, voltage thresholds, and current datasheet timing bounds.
- Main risk: setup/hold squeezed by duty + skew + threshold spread
- Needed capabilities: low additive delay uncertainty, low skew, stable thresholds
- Example parts: SPI level shift/buffer SN74AXC8T245, clock buffer CDCLVC1102, I²C switch TCA9548A
- Main risk: tR/tF grows, threshold crossing spreads, skew rises across branches
- Needed capabilities: segmentation, rise-time acceleration, differential extension, bounded delay
- Example parts: I²C diff extender PCA9615, bus buffer P82B96, rise accel LTC4311
- Main risk: repeated edges amplify marginal timing under noise/temperature drift
- Needed capabilities: re-timing/clean buffering, adjustable sampling, guard-time support
- Example parts: SPI isolator (timing bounded) ADuM4151, I²C hot-swap buffer TCA4311A, MEMS XO SiT1602
- Edge control: slew options, rise-time assist, threshold stability
- Delay bounds: tPD (max/min), delay uncertainty vs temperature/voltage
- Skew: channel-to-channel skew and drift across corners
- Sampling alignment: adjustable delay/phase, re-timing support
- Clock impact: duty-cycle distortion, additive jitter (bounded)
- Rise-time accelerator: Analog Devices LTC4311 (tR support under heavy C)
- Hot-swap / stuck-bus recovery buffer: TI TCA4311A
- Bus buffers (capacitance isolation): NXP PCA9517A, TI TCA9803
- I²C switch/mux (fanout control): TI TCA9548A, NXP PCA9548A
- Differential I²C extender: NXP PCA9615 (long reach, timing bounded at ends)
- I²C isolators: Analog Devices ADuM1250/ADuM1251, TI ISO1540/ISO1541
- Open-drain level shift: NXP PCA9306 (verify rise-time impact under C)
- SPI isolation (bounded tPD): Analog Devices ADuM4151 (validate setup/hold at receiver)
- Direction-controlled level translator/buffer: TI SN74AXC8T245 (tight timing vs auto-direction)
- Small translators: TI SN74LVC1T45, TI SN74AXC1T45 (use for point-to-point)
- Clock buffer (low added uncertainty): TI CDCLVC1102 / CDCLVC1104 (validate duty & jitter at load)
- MEMS oscillator (XO): SiTime SiT1602 (select ppm/temp grade to meet budget)
- Crystal oscillator (example family): Epson SG-210STF series (verify frequency & stability option)
- USB-to-UART bridge (deep buffers for bursty traffic): FTDI FT232R, Silicon Labs CP2102N (timing impact: latency/buffering)
Note: example parts are listed for timing capability mapping. Always re-check tPD bounds, skew, and threshold conventions in the latest datasheet for the chosen package and temperature grade.
FAQs (Timing-Only): Quick Triage Without Expanding Scope
Each answer uses a fixed, measurable format and stays strictly within speed/timing boundaries. Replace X placeholders with project limits and test conditions.
I²C @ 400 kHz runs but shows occasional NAK — first compute tR or tHIGH margin?
Likely cause: Timing margin is consumed at the worst node by rise-time (tR) and/or tHIGH shrinking under real load.
Quick check: Probe SCL/SDA at the farthest device pins; measure tR using the same % definition as the spec and measure tHIGH/tLOW at that node.
Fix: Move pull-up into the feasible window (VOL/IOL bound), reduce effective Cbus, or segment the bus (buffer/switch) so worst-node tR and tHIGH recover.
Pass criteria: tR ≤ X ns @ worst node; tHIGH ≥ X ns @ worst node; NAK rate ≤ X per N transactions across temp/voltage corners.
I²C becomes less stable after switching to a smaller pull-up — ringing or threshold-crossing jitter?
Likely cause: Faster edges increase ringing/undershoot, creating multiple threshold crossings (effective “edge-placement jitter”).
Quick check: Probe at receiver pins; set scope threshold near the intended VIH/VIL boundary and look for multiple crossings or bounce within one edge.
Fix: Add small series-R near the driver, tune pull-up toward a calmer edge, or segment/accelerate edges with controlled devices rather than brute-force pull-up reduction.
Pass criteria: Single clean threshold crossing per edge; overshoot ≤ X% of VDD and undershoot ≥ −X V @ worst node; timing metrics stable over N events.
SPI at the same frequency shows bit-slip on one board but not another — check CPHA or SCLK arrival skew first?
Likely cause: The sampling edge lands too close to the data transition due to mode mismatch and/or arrival skew between SCLK and data at the receiver.
Quick check: Confirm CPOL/CPHA settings match on both ends, then measure SCLK and data at the slave pins to quantify clock-to-data timing at the sampling edge.
Fix: Correct CPHA/CPOL, add sampling delay/phase shift if supported, or reduce skew with buffering/re-timing so the sample point returns to the data-valid center.
Pass criteria: Setup margin ≥ X ns and hold margin ≥ X ns @ slave pins; bit-slip/CRC errors ≤ X per N frames at corners.
SPI edges look “square” on the scope but CRC spikes — check duty cycle or sample-edge placement first?
Likely cause: Duty distortion and/or edge placement shifts the sampling edge into the transition region even when the waveform amplitude looks ideal.
Quick check: Measure duty cycle at the receiver clock pin and overlay the sampling edge against the data-valid region (at receiver pins, not at the driver).
Fix: Reduce duty distortion (clock buffer/shorter distribution), adjust CPHA/sample delay, or lower SCLK so half-cycle window widens.
Pass criteria: Duty = 50% ± X% @ receiver; sample point ≥ X% away from nearest data transition; CRC error rate ≤ X over N frames.
SPI fails at max speed but works when slowed down — is it window shortage or tCO temperature drift?
Likely cause: The available sampling window (half-cycle or effective phase window) is consumed by tCO(max) spread plus skew/jitter, which often worsens at temperature corners.
Quick check: Measure tCO at the slave output pin across hot/cold points and compare worst-case tCO against the window budget at the sampling edge.
Fix: Increase window (lower SCLK or shift sample phase), reduce skew/jitter (buffer/re-time), or select a slave with tighter tCO(max) bounds for the target corner.
Pass criteria: Worst-case [tCO(max)+skew+jitter] ≤ [window − guardband] with guardband ≥ X%; errors = 0 over N frames at hot/cold and VDD corners.
UART framing errors appear occasionally with identical settings — compute baud error first or blame noisy edges?
Likely cause: The sampling point drifts out of the bit center due to total baud error (TX+RX, temp drift) and/or edge uncertainty from noise.
Quick check: Calculate worst-case ppm at temperature for both endpoints, then measure the start-bit edge timing and eye “thickness” at the receiver input (threshold definition recorded).
Fix: Tighten clock accuracy (XO/PLL bounds), enable autobaud or periodic calibration, and increase oversampling/robust sampling strategy where available.
Pass criteria: |baud_error_total| ≤ X% across corners; framing errors = 0 over N frames; sampling point remains within ±X% of bit center at worst case.
UART errors are more common on long frames — why does drift accumulate, and what is the first validation step?
Likely cause: With asynchronous sampling, a small baud mismatch causes bit-center drift that grows with bit count until the last bits lose margin.
Quick check: Capture one full frame at the receiver and mark the expected bit centers; quantify drift at the final data/stop bits relative to the ideal center.
Fix: Reduce total baud error (better timebase/calibration), increase oversampling robustness, or add periodic re-alignment opportunities (guard time/shorter burst framing).
Pass criteria: Final-bit sampling offset ≤ X% of bit time; 0 framing errors over N long frames at temperature and VDD corners.
Logic analyzer decoding looks correct but the system still fails — are thresholds/sampling definitions aligned?
Likely cause: LA thresholds and sampling assumptions differ from the real receiver, masking threshold-crossing ambiguity or sampling-edge proximity issues.
Quick check: Record LA threshold levels, sample rate, and probe at the receiver pin; cross-check with an analog scope to confirm single threshold crossing and true edge placement.
Fix: Align LA thresholds to the intended VIH/VIL convention, move probes to the worst node, and validate timing at the sampling event (not just protocol decode).
Pass criteria: Threshold definition documented; receiver-pin capture shows ≥ X ns setup/hold (or ≥ X% eye margin); system errors ≤ X over N events.
Scope-measured tR passes the spec but the system is unstable — can probe capacitance/bandwidth “fake” compliance?
Likely cause: Probe loading and bandwidth/threshold settings change the apparent edge, producing a measurement artifact that does not match the receiver’s experience.
Quick check: Repeat tR with a low-C active probe (or different probe/BW) and keep the same % definition; compare results at the same worst node.
Fix: Use lower-loading probing, record instrument settings in the evidence pack, and sign off using worst-node measurements that match the datasheet definition.
Pass criteria: Probe C ≤ X pF (or active probe used); tR variance ≤ ±X% across repeats; system errors ≤ X over N events.
The same design fails at a temperature corner in production — which timing field is most often missing from logs?
Likely cause: A corner-dependent timing term (clock accuracy drift, delay bound, threshold definition, or worst-node identity) is not recorded, blocking reproduction and guardband tuning.
Quick check: Audit one failing record for: node_id, threshold definition, temp, VDD, cable/fixture ID, sample rate, and the exact metric values used for pass/fail.
Fix: Add timing-only structured logs and bind each failure to a measurable metric (tR/tF, duty, skew, baud error, window margin) so the worst-case stack-up can be updated.
Pass criteria: Logs contain all required fields for ≥ X% of units; corner reproduction success ≥ X%; updated margin ≥ X at worst node/corner.
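The audit in the quick check can be automated with a simple field check. The field names below are hypothetical keys derived from the list in the quick check, and the record layout is an assumption; adapt both to the actual log schema.

```python
REQUIRED_FIELDS = (
    "node_id", "threshold_def", "temp_c", "vdd_v",
    "fixture_id", "sample_rate_hz", "metrics",
)

def missing_fields(record):
    """Return the required timing fields that are absent or empty in a log record."""
    return [name for name in REQUIRED_FIELDS if record.get(name) in (None, "", {})]

# Hypothetical failing record: the threshold definition and fixture ID were never logged.
record = {
    "node_id": "U7.SCL",
    "temp_c": 85,
    "vdd_v": 3.0,
    "sample_rate_hz": 500e6,
    "metrics": {"tR_ns": 310.0},
}
print(missing_fields(record))
```

Running the same check over every production record gives the "logs contain all required fields" percentage used in the pass criteria.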
SCLK frequency is compliant but duty cycle is not — what is the most direct damage to the sampling window?
Likely cause: Duty distortion shrinks the intended half-cycle (or phase) so the data-valid window is narrower even at the same frequency.
Quick check: Measure duty at the receiver clock pin and compute the reduced half-cycle window; compare against the required setup/hold (plus jitter/skew) at the sampling edge.
Fix: Improve clock distribution (buffer with low duty distortion), shorten/clean the clock path, or shift sampling phase so the sample returns to the window center.
Pass criteria: Duty = 50% ± X% @ receiver; effective window ≥ X ns after subtracting jitter+skew; errors ≤ X over N frames.
Margin looks sufficient but errors still occur — suspect jitter first or skew first?
Likely cause: The budget missed the dominant uncertainty: cycle-to-cycle jitter (random edge movement) or deterministic skew (fixed arrival offset that shifts with load/topology).
Quick check: Measure a distribution: (1) clock edge jitter at receiver over many cycles, (2) arrival delay difference between boards/nodes; identify which term dominates the window loss.
Fix: If jitter dominates: improve clock source/buffering or re-time; if skew dominates: reduce path mismatch, tighten distribution, or shift sampling phase to re-center the window.
Pass criteria: pk–pk jitter ≤ X ns @ receiver; Δskew ≤ X ns across worst nodes; error rate ≤ X per N events at corners.
Data note: placeholders (X, N) are intended for project-specific limits and sample sizes. Each “Pass criteria” line is designed to be directly copied into a test plan or production script.