
Timing & Synchronization: CDR/Retimers, Jitter Clean-Up, PTP


Timing & Synchronization is won by controlling the hardware timestamp path and the clock-noise chain together—then proving it with budgets, gated measurements, and repeatable pass criteria.

This page turns “jitter/offset problems” into a calculable workflow: quantify error terms (noise, bias, steps, drift), choose the right device category (cleaner/retimer/fanout/tap), and validate with X-threshold gates.

Page Boundary & Core Thesis

This page connects clock quality, data recovery behavior, and hardware timestamp error into one engineering workflow: budget → placement → measurement → bring-up gates. The goal is deterministic synchronization that survives temperature, traffic load, and configuration changes.

Scope (hard boundary)

This page covers

  • Timing roles of CDR / retimer / jitter cleaner: jitter clean-up, wander/holdover behavior, deterministic latency, and timestamp-path sensitivity.
  • TSN/PTP hardware timestamp placement and error sources across PHY/MAC/switch/bridge datapaths (tap point, quantization, residence time, load-dependent latency, asymmetry).
  • Engineer-ready execution: budgets, measurement setup, verification gates, fault isolation, and selection logic.

This page does NOT cover

  • PTP/TSN protocol stack and servo algorithm details — timestamps are treated here as hardware behavior only.
  • Receiver compliance specifics — spec-to-margin mapping is handled as a method, with no protocol detail.

When this page is the right tool

  • Cleaner jitter looks excellent, yet PTP offset still drifts or steps under temperature/traffic.
  • Adding a retimer “fixes” link stability but shifts time alignment (latency/asymmetry changes).
  • Different stations/instruments disagree on jitter: measurement correlation and gating are needed.

What this page delivers (engineer-ready outputs)

1) Budget template

A consistent field list for jitter/wander/latency/timestamp error budgeting (threshold placeholders).

2) Topology & placement rules

Where to clean jitter (near source vs near sink) and where retimers change latency/asymmetry.

3) Timestamp-path decomposition

Tap-point map + error sources (quantization, residence time, load dependence, asymmetry).

4) Measurement & correlation workflow

Instrument sanity checks to avoid “settings artifacts” (RBW/VBW, integration limits, record length).

5) Bring-up gates & production checklist

Link-stable → jitter-within-X → offset-within-X gating, with pass criteria placeholders.

6) Selection logic (symptom → mechanism → block)

Decision path to choose cleaner vs retimer vs clock distribution vs timestamp placement.

[Diagram] System timing chain overview: the clock path (REFCLK from XO/SyncE/PLL ref → jitter cleaner/PLL with transfer BW and holdover → SerDes/PHY/CDR with lock, JTOL, and latency) and the timestamp path (TS tap at MAC/PHY/switch → PTP/TSN time domains with GM/BC/TC and residence time) are shown separately. Risk bubbles mark BW/peaking (jitter transfer and wander), latency/FIFO (asymmetry risk), and TS-tap location (timestamp error, offset, drift, asymmetry).
Diagram focus: clock cleanliness and timestamp accuracy are coupled but not identical—budget and verify them as separate paths.

Timing Stack Primer: Clock vs Time vs Data Recovery

Synchronization failures often come from mixing three different layers: Clock quality (phase/jitter/wander), Time accuracy (offset/drift/asymmetry), and Data recovery (CDR lock/tolerance/transfer). A clean clock does not guarantee accurate timestamps, and accurate timestamps do not guarantee link margin.

Clock domain (phase noise / jitter / wander / holdover)

What it is

The oscillator/PLL phase stability that sets instantaneous edge timing and long-term frequency drift (wander).

What to measure

  • Phase-noise curve (offset frequency axis must be stated).
  • Integrated jitter over a clearly defined band (e.g., 12 kHz–20 MHz, placeholder).
  • Wander / holdover behavior under reference loss or temperature drift.

Common traps (3)

  • Non-comparable numbers: integrated jitter without specifying integration limits is meaningless.
  • Settings artifacts: RBW/VBW/averaging changes can “improve” plots without improving physics.
  • Short captures: record length too short hides wander and temperature-driven drift.

Time domain (PTP offset / drift / sync interval / asymmetry)

What it is

The accuracy of time transfer between nodes (offset) and its stability over time (drift), dominated by timestamp placement and path symmetry—not only by “clock cleanliness”.

What to measure

  • Offset distribution: mean (bias) vs RMS/jitter of offset (noise).
  • Drift slope vs temperature/voltage; correlation with fan/airflow changes.
  • Asymmetry indicators: uplink/downlink delay mismatch and step changes after link events.

Common traps (3)

  • Tap-point mismatch: comparing offsets when timestamps are taken at different layers (PHY vs MAC) is invalid.
  • Path mismatch: measurement traffic uses a different queue/path than production traffic, hiding load-dependent latency.
  • Bias vs noise: a constant asymmetry bias can look like “good stability” yet be wrong by tens of ns.

Data domain (CDR lock / jitter tolerance / jitter transfer)

What it is

The recovery loop behavior that decides if the link remains locked and how input jitter maps to the recovered clock. Retimers may add buffering that changes latency and asymmetry even when the link looks stable.

What to measure

  • Lock margin and lock events (training/re-lock frequency, step symptoms).
  • Jitter tolerance (JTOL) and sensitivity to low-frequency wander vs high-frequency jitter.
  • Jitter transfer and peaking behavior; latency determinism across modes.

Common traps (3)

  • Lock ≠ margin: a link can “lock” but still have near-zero tolerance under temperature or load.
  • Hidden peaking: retimer/CDR settings may amplify certain jitter bands while reducing others.
  • Elasticity side effects: buffering can change packet timing/timestamp-path asymmetry without changing eye quality.

If only one mental model is kept

  1. Stabilize data recovery first (no re-lock loops, no mode-flapping).
  2. Control clock cleanliness next (define integration bands; validate transfer/peaking).
  3. Validate time accuracy last (tap-point consistency + asymmetry + load-dependent latency).
[Diagram] Clock vs Time vs Data recovery, three-track model: the CLOCK track (phase noise, integrated jitter, wander/holdover), the TIME track (offset, drift, sync interval, asymmetry), and the DATA track (CDR lock, JTOL, jitter transfer, latency), coupled at the PLL loop bandwidth (transfer/peaking), the timestamp tap (PHY/MAC/switch), and FIFO elasticity (latency steps → offset steps). Debug rule: separate bias (asymmetry) from noise (jitter) before changing hardware.
Diagram focus: three layers must be analyzed separately, then linked via real coupling points (loop BW, tap location, FIFO elasticity).

Jitter/Wander Budget: Make Timing Quantifiable

A usable synchronization design starts with a budget table that is comparable across teams and instruments. The budget must capture noise shape (transfer/peaking), not only “one-number jitter”. Always record the integration band and observation window; without them, results are not comparable.

Budget template (field groups)

  • Input reference: fref, PN@offset, integrated jitter band (e.g., 12 kHz–20 MHz, placeholder), observation window, environment tags (temperature/airflow/rail mode).
  • Cleaner / PLL: jitter transfer (LF/MF/HF), jitter generation, loop BW, peaking, holdover wander.
  • CDR / retimer: jitter tolerance (JTOL concept), jitter peaking band, deterministic latency behavior (fixed vs step).
  • Receiver/system targets: receiver tolerance target (method only), jitter margin, time error target (offset RMS vs bias), pass criteria placeholders (X).
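One way to make the "integrated jitter band" row reproducible is a small numeric integration of the single-sideband phase-noise curve. A minimal sketch, assuming trapezoidal integration and illustrative numbers (the 156.25 MHz carrier and flat −140 dBc/Hz curve are examples, not values from this page):

```python
import math

def integrated_jitter_rms(freqs_hz, pn_dbc_hz, f_carrier_hz, band=(12e3, 20e6)):
    """RMS jitter (seconds) from an SSB phase-noise curve over a stated band.

    freqs_hz / pn_dbc_hz: sampled L(f) curve (offset frequency, dBc/Hz).
    The band MUST be reported next to the result, or numbers are not comparable.
    """
    # Keep only curve points inside the stated integration band.
    pts = [(f, p) for f, p in zip(freqs_hz, pn_dbc_hz) if band[0] <= f <= band[1]]
    if len(pts) < 2:
        raise ValueError("need at least two PN points inside the band")
    # Trapezoidal integration of the linear-power spectrum 10^(L(f)/10).
    area = 0.0
    for (f1, p1), (f2, p2) in zip(pts, pts[1:]):
        area += 0.5 * (10 ** (p1 / 10) + 10 ** (p2 / 10)) * (f2 - f1)
    phase_rms_rad = math.sqrt(2.0 * area)  # x2 accounts for both sidebands
    return phase_rms_rad / (2 * math.pi * f_carrier_hz)
```

Freezing `band` in one place is exactly the "comparable inputs" rule: two jitter numbers computed with different limits cannot be subtracted.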

Budget field table (copy & fill)

Budget item | Represents | How to obtain | Risk if wrong | Placeholder | Owner
f_ref | Clock reference frequency used by cleaner/CDR | Design spec + bring-up measurement | Wrong scaling and invalid comparisons | X | Systems
PN@offset | Phase noise at defined offsets (shape anchor) | Phase-noise analyzer / spectrum method | Cleaner BW choice becomes guesswork | X dBc/Hz | Test
Integrated jitter band | One-number jitter (only valid with band stated) | Specify integration limits (e.g., 12k–20M) | "Better" jitter number may be an artifact | X fs/ps rms | Test
Observation window | Capture length that reveals wander/drift | Long record (seconds/minutes), log temperature | Wander hidden; false confidence | X s | Test
Cleaner loop BW | Where ref noise is passed vs filtered | Vendor model + in-system validation | Slow lock or offset drift under load | X Hz/kHz | HW
Transfer & peaking | Noise-shape reshaping (band amplification) | Transfer curve + compare before/after | Target band gets worse despite lower RMS | X dB | HW/Test
CDR/retimer latency | Deterministic delay vs step changes | Timestamp A/B measurement across modes | Offset steps; asymmetry bias | X ns | Systems/Test
Receiver tolerance target | Acceptance boundary (method-only here) | Map spec → required margin (no protocol details) | Pass in lab, fail in system | X | Systems
Time error target | Offset RMS (noise) vs mean (bias/asymmetry) | Long log + split bias vs noise | Stable but wrong (bias), or noisy but centered | X ns | Systems/Test

Notes: keep the table “comparable” by freezing integration limits, record length, and measurement point. Otherwise, budget deltas cannot be trusted.

Budget workflow (Step 1 → Step 6)

  1. Define success gates: jitter target (X), wander/holdover target (X), deterministic latency requirement (X), time error target (offset RMS & bias, X).
  2. Freeze measurement semantics: integration band, RBW/VBW, record length, and measurement point-of-use.
  3. Model ref → cleaner mapping: apply transfer/BW/peaking to predict the post-cleaner noise shape.
  4. Add CDR/retimer behavior: note peaking bands, re-lock events, and whether latency is fixed or can step.
  5. Compare against tolerance: map receiver tolerance/spec to a margin method (no protocol detail here) and compute remaining margin.
  6. Translate to tests: convert each budget row into a bring-up measurement and a production pass criterion (X).
[Diagram] Jitter/wander budget funnel: flow width illustrates remaining margin from reference input (PN, jitter, wander) through cleaner/PLL transfer BW and peaking, then CDR/retimer filtering and latency, into receiver tolerance. The funnel must preserve comparable inputs (integration band, record length for wander, point-of-use tap), noise shape (transfer/peaking band, not only RMS jitter, latency steps tracked), and two targets (receiver tolerance; time error split into bias vs RMS), with pass criteria = X.
Diagram focus: a budget is valid only if measurement semantics are frozen; track transfer/peaking and latency steps, then convert rows into bring-up and production gates.

Clocking Topologies: Shared Ref vs Local XO vs Cleaner Placement

Topology selection is a trade between distribution risk, drift/holdover, and deterministic latency. The same “clean clock” can become unusable if the point-of-use is not controlled and verified under temperature and traffic load.

1) Shared ref + fanout (central distribution)

  • Best for: strong system coherence; many endpoints needing a common base.
  • Improves: drift alignment; consistent frequency reference.
  • Can break: noise injection on the ref tree, uncontrolled skew, ground/return coupling.
  • Verify gates: output-to-output skew within X; far-end PN/jitter within X; temperature sweep stability within X.

2) Local XO + async (endpoint self-hold)

  • Best for: harsh distribution environments; long or noisy ref routes.
  • Improves: immunity to ref-tree coupling; local stability when links flap.
  • Can break: drift accumulation, relock steps, stronger dependency on CDR/buffering.
  • Verify gates: holdover drift within X; relock step < X ns; traffic-load offset variance within X.

3) Cleaner near source (clean early)

  • Best for: stable distribution network; predictable routing.
  • Improves: shared cleanliness at the source; simpler central control.
  • Can break: “clean” clock gets polluted after distribution; far-end differs from source.
  • Verify gates: measure at point-of-use; compare source vs far-end delta < X; temp sweep delta < X.

4) Cleaner near sink (clean late)

  • Best for: sensitive endpoints; noisy or variable distribution paths.
  • Improves: local point-of-use cleanliness; direct control where it matters.
  • Can break: multi-point consistency; configuration drift between nodes; bias shifts if not aligned.
  • Verify gates: cross-node consistency within X; mode/traffic sweep step < X ns; temp drift within X.
[Diagram] Clocking topology comparison, four panels in the same block-diagram style: shared ref + fanout (skew and noise risks), local XO + async (drift and relock risks), cleaner near source (distribution risk after cleaning), cleaner near sink (point-of-use consistency risk). PoU = point-of-use; validate with temperature and traffic sweeps to catch latency/asymmetry steps.
Diagram focus: topology changes where noise enters, where it is cleaned, and where latency/asymmetry can step; verification must be done at the point-of-use.

CDR / Retimer Options: Data Recovery vs Re-timing

In timing-critical systems, the key difference is not the protocol name but the loop behavior and the timestamp path impact. A typical retimer behaves like CDR + elastic buffer, which can change deterministic latency, introduce latency steps, and create PTP asymmetry bias if paths are not symmetric.

Selection dimensions (timing-focused)

  • Lock range & programmable loop BW: which disturbances are tracked vs filtered (X).
  • Jitter transfer shape: LF follow vs HF clean-up; watch peaking bands (X dB).
  • Deterministic latency / hitless: fixed delay vs mode/temperature/load-dependent steps (X ns).
  • Asymmetry risk: direction-dependent buffering/queues can create stable but wrong time (bias X ns).
  • Timestamp tap impact: whether the tap point is before/after buffering and how residence time is accounted.

Comparison matrix (timing behavior)

Field | CDR | Retimer | Cleaner / PLL | Fanout
What it fixes | Recover clock from data; tolerate channel jitter within JTOL | Re-time data; often cleans HF jitter and reshapes LF content | Clean/shape reference clock; provide holdover options | Distribute clock copies; manage skew and loading
What it breaks | Peaking bands; lock/relock events; recovered-clock wander coupling | Latency steps (FIFO); asymmetry bias if paths differ | BW tradeoffs: can pass ref noise or worsen holdover/lock time | Noise injection via distribution tree; uncontrolled skew/return currents
Where to place | At the receiver side, where data is recovered | Where timing margin is needed; avoid impacting the timestamp tap unless accounted for | Near source (central) or near sink (PoU), based on distribution risk | Near source; route like a sensitive analog net
How to verify | Lock margin; peaking sensitivity; event counters (X) | Latency stability vs temp/load; tap-point correlation (X ns) | Transfer/peaking check; holdover test; band + window frozen (X) | Skew; far-end PN/jitter delta; EMI/ground coupling checks
Time risk | Noise risk (offset RMS) if peaking hits a sensitive band | Bias risk (asymmetry) + step risk (latency) | Noise-shape risk (BW/peaking) + wander risk | Skew/bias risk if paths are not controlled

Practical rule: verify at the point-of-use and separate bias (asymmetry) from noise (offset RMS).

Placement rules (fast decisions)

Rule 1: Measure at PoU

Any “improvement” must be validated where the clock/time is consumed (near PHY/MAC/timestamp tap).

Rule 2: Latency first

Confirm deterministic latency and absence of steps before optimizing jitter numbers.

Rule 3: Separate bias vs noise

Stable but wrong time indicates asymmetry bias; noisy time indicates jitter/noise. Fixes differ.

[Diagram] Retimer timing model: a simplified CDR loop (phase detector, loop filter setting loop BW, VCO) feeding an optional elastic buffer (FIFO, latency-step source) between Data In (from channel) and Data Out (to MAC/logic). Timestamp Tap A sits before the buffer, Tap B after it. Timing risks to log: peaking band (noise-shape changes), latency step (mode/temp/load), asymmetry bias (tap-path mismatch).
Diagram focus: “re-timing” often implies buffering; buffer placement relative to timestamp taps determines bias and step risk.

Jitter Clean-up: PLL Loop Bandwidth Is Benefit and Side Effect

Loop bandwidth (BW) sets what gets tracked and what gets filtered. Wider BW improves tracking and lock time but can pass reference disturbances; narrower BW improves clean-up but can worsen wander behavior and recovery time. A “better” phase-noise plot can still yield worse system time if BW reshapes noise into a sensitive band or if measurement settings hide wander.

BW choice logic (engineering mapping)

  • BW larger: faster lock and better LF tracking, but ref noise/disturbances can leak through → time drift under environment/load.
  • BW smaller: stronger HF clean-up, but slower acquisition and worse holdover/wander behavior → long-term offset issues.
  • Always check: peaking band (X dB), latency steps (X ns), and long-window logs (X s) before declaring “improved”.
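The BW trade above can be explored with the textbook second-order type-II PLL closed-loop transfer; the sketch below computes transfer magnitude and peaking under that standard model (the loop corner and damping values in the test are illustrative, not recommendations from this page):

```python
import math

def pll_jitter_transfer_db(f_hz, fn_hz, zeta):
    """|H(j2*pi*f)| in dB for a standard 2nd-order type-II PLL closed loop.

    H(s) = (2*zeta*wn*s + wn^2) / (s^2 + 2*zeta*wn*s + wn^2)
    """
    wn = 2 * math.pi * fn_hz
    w = 2 * math.pi * f_hz
    num = complex(wn * wn, 2 * zeta * wn * w)
    den = complex(wn * wn - w * w, 2 * zeta * wn * w)
    return 20 * math.log10(abs(num / den))

def peaking_db(fn_hz, zeta, fmax_mult=10, n=2000):
    """Max transfer gain over [fn/10, fn*fmax_mult] — the 'peaking' to gate on."""
    best = 0.0
    for i in range(n):
        # Logarithmic sweep across two decades around the loop corner.
        f = fn_hz / 10 * (10 ** (i / n * (1 + math.log10(fmax_mult))))
        best = max(best, pll_jitter_transfer_db(f, fn_hz, zeta))
    return best
```

Low damping concentrates gain near the corner (a peaking band that can amplify PJ spurs); higher damping flattens the transfer at the cost of a slower "follow vs clean" boundary, which is exactly the BW trade described above.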

Jitter component lens (used for debugging)

  • RJ (random): reduces margin; looks like broadband noise.
  • DJ (deterministic): correlates with system activity; often repeats with patterns.
  • PJ (periodic): shows spurs; can be amplified by transfer/peaking.
  • Wander (LF drift): dominates PTP offset drift; requires long-window capture.

Tuning ladder (stability → noise shaping → time accuracy)

Phase 1: Stability

  • No re-lock / mode flapping (X).
  • No latency steps > X ns.
  • Temp sweep behavior consistent (X).

Phase 2: Noise shaping

  • Freeze integration band (X).
  • Peaking not in sensitive band (X dB).
  • RBW/VBW/averaging frozen (X).

Phase 3: Time accuracy

  • Offset RMS < X ns.
  • Asymmetry bias < X ns.
  • Traffic sweep does not widen error (X).

Common illusions (measurement artifacts)

  • RBW/VBW changed: PN plot looks better but system is unchanged.
  • Averaging hides spurs: PJ remains but disappears visually.
  • Short record length: wander and thermal drift are invisible.
  • Wrong measurement point: source looks clean while point-of-use is polluted.
  • Instrument bandwidth limit: HF noise is clipped and under-reported.
[Diagram] Simplified jitter transfer vs offset frequency: two transfer curves (wide BW vs narrow BW) with a peaking region; low offsets are followed (tracking disturbances), high offsets are cleaned (filtering). Artifact risk: the PN plot looks "better" if RBW/VBW/averaging are not frozen.
Diagram focus: BW moves the “follow vs clean” boundary; peaking can amplify a sensitive band even if integrated jitter improves.

PTP / TSN Timestamp Path: Where the Error Comes From (Not the Algorithm)

In TSN-grade synchronization, the dominant uncertainty is often the hardware timestamp path: where the tap is located, what delay segments are included, and which segments are left as unknown or load-dependent. This section focuses only on timestamp-related hardware behavior (tap location, asymmetry, residence time, and drift), not protocol stack details.

Error decomposition (engineering view)

Total time error = clock error + timestamp quantization + asymmetry (bias) + temperature/voltage drift + load-dependent latency.

Practical split: bias (stable but wrong) vs noise (RMS spread).
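That split can be computed directly from an offset log: the mean is the bias candidate (asymmetry/tap), the spread about the mean is the noise candidate (jitter/queueing). A minimal sketch with illustrative numbers:

```python
import statistics

def split_bias_noise(offsets_ns):
    """Split a PTP offset log into bias (mean) and noise (RMS about the mean)."""
    bias = statistics.fmean(offsets_ns)
    noise = statistics.pstdev(offsets_ns)
    return bias, noise

# "Stable but wrong" example: tight spread around a 40 ns asymmetry bias.
log = [40.2, 39.8, 40.1, 40.0, 39.9, 40.3, 39.7, 40.0]
bias, noise = split_bias_noise(log)
```

Here `noise` is well under 1 ns (the plot would look excellent) while `bias` is 40 ns — the case the page calls stable but wrong.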

Error sources (mechanism → symptom → how to measure)

1) Timestamp location

Mechanism: tap point at PHY/MAC/switch ingress/egress includes different delay segments.
Symptom: offset mean shifts after moving the tap or swapping endpoint/switch design.
Measure: A/B compare mean offset change at fixed conditions (X ns).

2) Quantization / resolution

Mechanism: timestamp granularity or interpolation limits create “time ticks”.
Symptom: offset RMS hits a floor; distribution shows a step size.
Measure: detect minimum step in the offset histogram (X).
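A sketch of that histogram check: sort the unique offset values and take the smallest non-zero gap, which approximates the timestamp granularity (the 8 ns tick below is illustrative and corresponds to a 125 MHz timestamp counter):

```python
def min_quantization_step(offsets_ns, eps=1e-9):
    """Smallest non-zero gap between observed offset values ~= timestamp tick."""
    vals = sorted(set(offsets_ns))
    gaps = [b - a for a, b in zip(vals, vals[1:]) if b - a > eps]
    return min(gaps) if gaps else 0.0

# Offsets locked to an 8 ns tick (e.g., a 125 MHz timestamp counter).
samples = [0.0, 8.0, 16.0, 8.0, 24.0, 16.0, 0.0, 32.0]
```

If this recovered step equals the resolution floor, tightening the clock will not reduce offset RMS further; the fix is a finer tap or interpolation.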

3) Ingress/egress asymmetry

Mechanism: up/down directions traverse different FIFOs/retimers/paths → bias.
Symptom: stable but always off by a constant; bias can drift with temperature.
Measure: loopback/reversal tests; track mean bias vs temperature (X ns, X ns/°C).
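The bias mechanism follows from the standard two-way delay-exchange arithmetic: the offset estimate assumes symmetric delays, so any up/down mismatch shows up as half the difference. A numeric sketch with illustrative delays:

```python
def ptp_offset_estimate(t1, t2, t3, t4):
    """Two-way exchange: offset = ((t2 - t1) - (t4 - t3)) / 2.

    The estimate silently assumes d_ms == d_sm; any mismatch becomes bias.
    """
    return ((t2 - t1) - (t4 - t3)) / 2.0

true_offset = 0.0          # slave clock is actually perfect
d_ms, d_sm = 100.0, 60.0   # ns: asymmetric paths (e.g., a FIFO in one direction)

t1 = 0.0
t2 = t1 + d_ms + true_offset   # arrival at slave, in slave time
t3 = t2 + 50.0                 # slave replies after a turnaround
t4 = t3 - true_offset + d_sm   # arrival back at master, in master time

err = ptp_offset_estimate(t1, t2, t3, t4)  # reported "offset" = (d_ms - d_sm)/2
```

The log looks perfectly stable, yet the node is off by 20 ns — which is why the page insists on loopback/reversal tests and on tracking mean bias separately from RMS.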

4) Residence time / queueing / shaping

Mechanism: load and shaping policies change per-hop residence time (variable delay).
Symptom: offset RMS widens under traffic; periodic timing ripple under scheduled shaping.
Measure: traffic sweep; compare offset RMS/peak-to-peak at fixed clock quality (X ns).

5) Temperature / voltage drift

Mechanism: delay varies with device and interconnect temperature/voltage.
Symptom: slow offset drift correlates with airflow, fan modes, or rail states.
Measure: long record + temperature tags; fit drift slope (X ns/°C).
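The drift-slope fit is a plain least-squares line of mean offset against temperature; the slope is the ns/°C gate. A minimal sketch with synthetic points (the ≈0.5 ns/°C drift is illustrative):

```python
def drift_slope_ns_per_c(temps_c, offsets_ns):
    """Least-squares slope of offset vs temperature (ns/degC)."""
    n = len(temps_c)
    mt = sum(temps_c) / n
    mo = sum(offsets_ns) / n
    num = sum((t - mt) * (o - mo) for t, o in zip(temps_c, offsets_ns))
    den = sum((t - mt) ** 2 for t in temps_c)
    return num / den

temps = [25, 35, 45, 55, 65]
offs = [10.0, 15.1, 19.9, 25.2, 30.0]  # drifts about 0.5 ns/degC
```

The fit only means something if each offset point is tagged with the temperature at capture time — the "long record + temperature tags" requirement above.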

6) State changes (retrain / FIFO level)

Mechanism: link re-training, re-lock, or FIFO level changes cause delay steps.
Symptom: offset jumps to a new plateau, then stays stable.
Measure: event-aligned logging; step amplitude < X ns, event count (X).
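Plateau jumps can be flagged by comparing the means of adjacent windows; a jump beyond the gate is a latency step, not noise. A minimal sketch (window size and threshold are illustrative stand-ins for X):

```python
def detect_steps(offsets_ns, win=4, threshold_ns=5.0):
    """Indices where the mean of the next window jumps vs the previous window."""
    steps = []
    for i in range(win, len(offsets_ns) - win + 1):
        before = sum(offsets_ns[i - win:i]) / win
        after = sum(offsets_ns[i:i + win]) / win
        if abs(after - before) > threshold_ns:
            steps.append(i)
    return steps

# Offset jumps from a ~0 ns plateau to a ~12 ns plateau after a re-lock event.
trace = [0.1, -0.2, 0.0, 0.2, 12.1, 11.9, 12.2, 12.0]
```

Aligning detected indices with the event log (retrain, re-lock, FIFO level change) is what turns a "mystery jump" into a mechanism.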

Timestamp placement decision (target-driven)

Target: µs-class

  • Tap can be MAC-level if queueing is controlled.
  • Priority is stable behavior and observability.
  • Pass gates: offset RMS < X, no large steps > X.

Target: ns to sub-100ns

  • Prefer PHY or switch ingress/egress timestamps.
  • Require symmetric paths and deterministic latency.
  • Pass gates: asymmetry bias < X ns, step < X ns, drift < X ns/°C.
[Diagram] Timestamp tap locations and delay segments: a single data path through PHY → MAC → switch ingress → fabric/queue → egress, with tap points at each stage and Δt1–Δt4 segments marking where latency accumulates. Residence time is load-dependent, quantization sets a resolution floor, and up/down mismatch creates asymmetry.
Diagram focus: tap points determine which Δt segments are observable; queueing introduces load-dependent residence time that widens offset spread.

Deterministic Latency & Asymmetry: Hidden Killers of Sync Systems

Many synchronization failures are driven by delay steps and directional mismatch, not by “PTP math”. Deterministic latency determines whether calibration is meaningful; asymmetry creates a bias that can be stable yet permanently wrong. Verification must include mode, temperature, and load sweeps to catch steps and drift.

Three step/drift sources (mode / temp / load)

A) Mode-related (steps)

Re-training, re-lock, FIFO level changes, configuration profile switches can alter effective delay.
Gate: step amplitude < X ns; event-aligned logs exist (X).

B) Temperature/voltage (drift)

Delay shifts with device and interconnect temperature/voltage, creating slow bias movement.
Gate: drift slope < X ns/°C across the operating range.

C) Load-related (variance)

Queueing/residence time and shaping behavior widen the offset distribution under traffic.
Gate: offset RMS and peak-to-peak remain within X ns under load sweep.

Calibration strategy (engineering method only)

Static calibration

  • Use when the path is fixed and delay is deterministic.
  • Store a calibrated constant (X ns).
  • Invalidate if mode changes introduce steps.

Dynamic compensation

  • Use when temperature/load changes shift delay over time.
  • Update based on controlled triggers (X) and consistent measurement semantics.
  • Verify that updates reduce bias without adding noise (X).

Control the risk by structure, then prove it by sweeps

  • Design: keep up/down paths symmetric; lock configuration; avoid direction-dependent FIFOs/queues where possible.
  • Verify: temperature sweep, load sweep, and mode-switch sweep; log events and measure steps/drift (all X).
  • Decide: if bias dominates → asymmetry; if RMS dominates → clock/noise/queueing.
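The "Decide" step can be sketched as a small classifier over an offset log (the gate values stand in for the page's X placeholders):

```python
import statistics

def dominant_error_term(offsets_ns, bias_gate_ns=10.0, rms_gate_ns=10.0):
    """Route debugging: asymmetry (bias) vs clock/noise/queueing (RMS)."""
    bias = abs(statistics.fmean(offsets_ns))
    rms = statistics.pstdev(offsets_ns)
    if bias > bias_gate_ns and bias >= rms:
        return "asymmetry"              # stable but wrong -> check paths/tap
    if rms > rms_gate_ns:
        return "clock/noise/queueing"   # centered but noisy -> check PN/load
    return "within gates"
```

The point is the ordering: rule out the bias term first, because hardware swaps aimed at "jitter" cannot remove a directional delay mismatch.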
[Diagram] Symmetry mismatch creates offset bias: two parallel paths (Up and Down) through PHY → MAC → switch egress. In the symmetric case Δt_up = Δt_down and calibration holds; in the asymmetric case an extra FIFO in one direction makes Δt_up ≠ Δt_down, so the system can be stable but wrong (offset bias from the Δt mismatch).
Diagram focus: asymmetry creates a bias term; deterministic latency enables calibration, while steps and drift invalidate static assumptions.

Measurement & Instrumentation: Wrong Setup = Wrong Conclusions

Timing and synchronization debugging lives or dies by measurement semantics. Phase-noise/jitter results are only comparable when settings are frozen, and timestamp/offset results are only comparable when tap location, path, and load are controlled. This section provides practical checklists and a minimal trustworthy workflow.

Phase-noise / jitter setup checklist (settings traps)

Freeze these, or results are not comparable

  • RBW/VBW: changes can hide spurs or reshape the noise floor. Record RBW/VBW = X / X.
  • Integration band: integrated jitter depends on the band. Use X–Y (fixed).
  • Measurement duration: short captures miss wander/drift. Duration ≥ X s.
  • Window / averaging: smoothing can erase periodic components. Averaging policy = X.
  • Reference lock: unlocked reference injects its own error. Ref lock = OK.
  • Probe/instrument bandwidth: bandwidth limits can make jitter look “better”. BW ≥ X.

Always log the hidden variables

  • Environment: temperature, fan mode, airflow direction. Tag = X.
  • Power: rails, load state, regulator mode. Tag = X.
  • Config identity: firmware, register profile, loop BW selection. Hash/version = X.
  • Statistic semantics: RMS vs p-p, UI vs ns, filter choices. Define once and keep fixed.

PTP / offset test layout checklist (system traps)

Control these, or offset comparisons break

  • Tap location: PHY vs MAC vs ingress/egress changes what delay is included. Tap = X.
  • Same path: fixed route, fixed queue/shaping policy. Path ID = X.
  • Same load: empty vs loaded changes residence time. Load = X% (or sweep recorded).
  • Same timing baseline: establish a golden baseline before A/B changes. Baseline = X.

Interpretation shortcut

  • Mean offset shift → suspect asymmetry / tap mismatch / deterministic delay change.
  • RMS spread growth → suspect queueing/load effects or clock noise.
  • Step events → suspect re-training, re-lock, FIFO level changes.

Minimal trustworthy workflow (A/B-valid)

  1. Calibrate the source: verify reference mode and alarm-free distribution. Pass: ref status = X.
  2. Calibrate the instruments: lock Ref-in/trigger/timebase. Pass: ref lock = OK.
  3. Freeze PN/jitter settings: RBW/VBW, band, duration, averaging. Record = X.
  4. Tag environment: temperature, fan, rails, load. Log fields = X.
  5. Run golden path: capture baseline mean/RMS/p-p. Baseline = X.
  6. A/B change one variable: compare only deltas, not absolute charts. Δ metric = X.
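Steps 5 and 6 can be sketched as a baseline-vs-candidate delta report that refuses comparison when identity tags differ (field names are illustrative):

```python
import statistics

def run_summary(offsets_ns, tags):
    """One measurement run: offset statistics plus its identity tags."""
    return {
        "tags": dict(tags),
        "mean": statistics.fmean(offsets_ns),
        "rms": statistics.pstdev(offsets_ns),
        "pp": max(offsets_ns) - min(offsets_ns),
    }

def ab_delta(baseline, candidate):
    """Compare only deltas, and only under identical identity tags."""
    if baseline["tags"] != candidate["tags"]:
        raise ValueError("tags differ: A/B delta is not trustworthy")
    return {k: candidate[k] - baseline[k] for k in ("mean", "rms", "pp")}
```

Raising instead of silently comparing is deliberate: a delta taken across different temperature, load, or path tags is exactly the non-comparable data this section warns about.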
[Diagram] Probe points and instrumentation for comparable measurements: a reference chain (ref source → PLL → fanout/distribution), a DUT data path (PHY → MAC → switch/queue → egress) with probe points A and B on the same path, a traffic-load generator injecting load, and instruments (PN analyzer, time-interval counter, timestamp logger) locked to the same reference with record tags. Comparable data requires same tap + same path + frozen settings.
Diagram focus: keep tap location and path consistent; freeze PN/jitter settings; treat load as a first-class variable and log environment tags for every run.

Engineering Checklist: Design → Bring-up → Production

A synchronization system becomes repeatable only when it is gated. Use the same structure across design, bring-up, and production: every checklist item includes what to verify, a quick check, and a pass criterion (threshold X placeholders).

Design checklist (structure correctness)

  1. Ref-tree integrity: verify fanout/termination/return-path assumptions. Quick check: Δ jitter vs source = X. Pass: Δ < X.
  2. Timestamp semantics: define tap level (PHY/MAC/ingress/egress) per target. Quick check: tap = X. Pass: meets target (X).
  3. Placement rules: locate cleaner/retimer to reduce unknown delay and steps. Quick check: step events = X. Pass: step < X ns.
  4. Asymmetry control: keep up/down paths symmetric in type and mode. Quick check: bias = X. Pass: bias < X ns.
  5. Holdover strategy: specify acceptable drift under reference loss. Quick check: drift slope = X. Pass: < X.
  6. Config identity: define immutable profiles (FW/regs) for correlation. Quick check: hash = X. Pass: matches golden.
  7. Measurement contract: freeze PN/jitter settings and logging fields. Quick check: settings record = X. Pass: complete.
  8. Load model: define worst-case traffic and shaping regime for validation. Quick check: load sweep plan = X. Pass: executed.

Bring-up checklist (gated sequence)

  1. Gate 1 — Link stable: no repeated re-training/re-lock events. Quick check: retrain count = X. Pass: < X.
  2. Freeze identity tags: FW version, config profile, path ID. Quick check: tags present. Pass: all fields logged.
  3. Gate 2 — Jitter within X: verify under frozen PN settings. Quick check: integrated jitter = X. Pass: < X.
  4. Catch peaking: ensure loop BW/peaking does not amplify bands of interest. Quick check: peaking = X dB. Pass: < X.
  5. Gate 3 — Offset within X: evaluate mean + RMS, not charts. Quick check: mean/RMS = X/X. Pass: < X.
  6. Mode sweep: validate no delay steps after controlled switches. Quick check: step = X ns. Pass: < X.
  7. Temp sweep: quantify drift slope across range. Quick check: slope = X ns/°C. Pass: < X.
  8. Load sweep: quantify RMS growth under traffic. Quick check: RMS@load = X. Pass: < X.
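The gated order above can be sketched as a short-circuiting sequence: a later gate's data is not actionable unless the earlier gates pass (metric and limit names stand in for the page's X placeholders):

```python
def bringup_gates(metrics, limits):
    """Run Gate 1 -> Gate 2 -> Gate 3 in order; stop at the first failure."""
    gates = [
        ("link_stable", lambda m, l: m["retrain_count"] < l["retrain_max"]),
        ("jitter_within_X", lambda m, l: m["jitter_fs"] < l["jitter_max_fs"]),
        ("offset_within_X", lambda m, l: m["offset_rms_ns"] < l["offset_max_ns"]
                                     and abs(m["offset_mean_ns"]) < l["bias_max_ns"]),
    ]
    passed = []
    for name, check in gates:
        if not check(metrics, limits):
            return {"passed": passed, "failed": name}
        passed.append(name)
    return {"passed": passed, "failed": None}
```

Short-circuiting encodes the page's rule directly: if the link is re-training, the jitter and offset numbers collected afterward are not evidence of anything.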

Production checklist (repeatability and correlation)

  1. Golden profile lock: freeze registers, FW, and allowable modes. Quick check: hash = X. Pass: equals golden.
  2. Station correlation: align instruments and semantics across stations. Quick check: Δ between stations = X. Pass: < X.
  3. Fixture/path control: enforce same tap and routing in test setup. Quick check: path ID = X. Pass: fixed.
  4. Environmental control: test limits and compensation policy. Quick check: temp range = X. Pass: within spec.
  5. Gated pass/fail: Link stable → jitter → offset. Quick check: 3 gates pass. Pass: all green.
  6. Evidence retention: store logs, screenshots, and metadata for traceability. Quick check: record complete. Pass: audit OK.
  7. Escalation triggers: define step/drift/RMS alarms. Quick check: alarm thresholds = X. Pass: configured.
  8. Change control: re-qualify on FW/reg/topology changes. Quick check: change log = X. Pass: re-qual done.
[Diagram] Bring-up gating flow, three gates in order: Gate 1 link stable (pass: retrain/relock < X) → Gate 2 jitter within X (pass: integrated jitter < X) → Gate 3 PTP offset within X (pass: mean/RMS/step/drift < X), swept across temp/load/mode. Required evidence across the flow: record tags for config, FW, temperature, load, and path.
Diagram focus: enforce a gated bring-up order. If Gate 1 is not stable, Gate 2 and Gate 3 data are not actionable. Record tags enable station correlation and traceability.

IC Selection Logic: choose by “symptom → mechanism → device category”

Selection is a decision system, not a part-number list. Start from the observed timing symptom, force a mechanism check, then choose the device category that fixes the dominant error term without creating hidden asymmetry, latency steps, or timestamp bias.

Entry points (what is failing?)
  • Clock noise (integrated jitter / PN / wander / holdover)
  • Delay steps (mode switch, retrain, FIFO level change)
  • Skew & distribution (fanout, switching, multi-domain skew)
  • Timestamp error (tap location, quantization, load sensitivity)
Force a mechanism check (before choosing)
  • Noise vs bias: RMS grows (random) vs mean shifts (asymmetry)
  • Load coupling: error changes with traffic/load or queueing
  • Temp coupling: drift slope vs °C / fan / airflow changes
  • Tap semantics: timestamp taken at PHY/MAC/switch ingress/egress
Category mapping (what to use)
  • Ref/clock noise dominant → clock cleaner / DPLL / jitter attenuator
  • Data recovery + reach (and latency behavior matters) → retimer / CDR + controlled buffering
  • Skew / fanout / clock-tree control → fanout buffer / clock distribution IC
  • Redundancy switching (no glitch/step) → hitless / glitch-free clock mux/switch
  • Timestamp precision → prioritize timestamp placement (PHY/MAC/switch TSU) before swapping cleaners
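One way to make the mechanism check non-optional is to key device selection on the verified mechanism, never on the raw symptom. A sketch, with shorthand keys and category strings standing in for the lists above:

```python
# Mechanism-first lookup: an unknown key means "go measure first",
# not "pick the most familiar part".
CATEGORY_BY_MECHANISM = {
    "ref_noise":     "clock cleaner / DPLL / jitter attenuator",
    "wander":        "DPLL with holdover",
    "fifo_retrain":  "retimer / CDR with controlled buffering",
    "skew_fanout":   "fanout buffer / clock distribution IC",
    "ref_switch":    "hitless / glitch-free clock mux",
    "tap_semantics": "timestamp (TSU) placement change, not a part swap",
}

def select_category(mechanism):
    """Map a verified mechanism to a device category; refuse unverified input."""
    if mechanism not in CATEGORY_BY_MECHANISM:
        raise ValueError(f"unknown mechanism {mechanism!r}: run the mechanism check first")
    return CATEGORY_BY_MECHANISM[mechanism]
```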
Figure — Symptom → Mechanism → Device Category map. Symptoms (clock noise, delay steps, skew/switching, timestamp error) route through mechanisms (ref PN / loop peaking, wander/holdover, FIFO level / retrain, temp/load coupling, skew/fanout tree, tap/quantization/bias) to device categories (cleaner/DPLL, retimer/CDR, fanout/buffer, hitless mux, TSU placement). Verify with gates: jitter < X, peaking < X dB, step < X ns, bias < X ns, drift < X ns/°C, load ΔRMS < X.
Figure 11 — The same symptom can map to different device categories depending on the dominant mechanism (noise, bias, load, temperature, or tap semantics).

Minimum pass criteria (selection is only “correct” if these hold)

Gate A — Clock quality
  • Integrated jitter (band X–Y) ≤ X fs rms (same measurement settings)
  • Loop peaking ≤ X dB and no “too-good-to-be-true” PN artifact
  • Holdover / wander meets X over X s if applicable
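Gate A's integrated-jitter number can be derived from an SSB phase-noise trace with the standard conversion RJ_rms = sqrt(2·∫10^(L(f)/10) df) / (2π·f0). A minimal sketch using plain trapezoidal integration, assuming L(f) is supplied in dBc/Hz over the agreed band (function name is ours):

```python
import math

def integrated_jitter_s(freqs_hz, pn_dbc_hz, carrier_hz):
    """RMS jitter (seconds) from an SSB phase-noise trace L(f) in dBc/Hz.

    RJ_rms = sqrt(2 * integral of 10^(L(f)/10) df) / (2*pi*f0),
    integrated by trapezoids over the given offset-frequency band.
    """
    area = 0.0
    for i in range(1, len(freqs_hz)):
        y0 = 10.0 ** (pn_dbc_hz[i - 1] / 10.0)   # dBc/Hz -> linear power density
        y1 = 10.0 ** (pn_dbc_hz[i] / 10.0)
        area += 0.5 * (y0 + y1) * (freqs_hz[i] - freqs_hz[i - 1])
    return math.sqrt(2.0 * area) / (2.0 * math.pi * carrier_hz)
```

Because the result depends entirely on the integration limits and the trace settings, this is exactly where the "same measurement settings" clause in Gate A has teeth: comparing results taken over different bands is meaningless.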
Gate B — Deterministic latency
  • Mode switch / retrain latency step ≤ X ns (log events + measure Δt)
  • Temperature drift slope ≤ X ns/°C across the sweep range
  • Load sweep (0 → X% traffic) adds ≤ X ns RMS offset increase
Gate C — Timestamp correctness
  • Offset mean bias (asymmetry) ≤ X ns with up/down symmetry controls locked
  • Timestamp resolution / quantization floor ≤ X ns at the chosen tap location
  • Tap point is consistent (PHY/MAC/switch ingress/egress) across all test setups

Notes: thresholds use placeholders (X). Keep only one variable changing per A/B test (same tap point, same path, same load profile, same instrument settings).
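Gate B's drift slope reduces to a least-squares fit over the temperature sweep. A self-contained sketch (function name and sample data are illustrative):

```python
def drift_slope(temps_c, offsets_ns):
    """Least-squares slope of offset vs temperature, in ns/degC."""
    n = len(temps_c)
    mt = sum(temps_c) / n
    mo = sum(offsets_ns) / n
    num = sum((t - mt) * (o - mo) for t, o in zip(temps_c, offsets_ns))
    den = sum((t - mt) ** 2 for t in temps_c)
    return num / den
```

The fitted slope is then compared against the X ns/°C gate; large residuals around the fit hint at temperature-triggered steps rather than smooth drift, which fails a different gate.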

Concrete material numbers (examples by category; not exhaustive)

Use these as “category anchors” to speed up sourcing conversations. Always verify package, speed grade, suffix options, and availability for the actual design.

Clock cleaner / DPLL / jitter attenuator (ref noise + holdover + SyncE/PTP clocking)
  • Texas Instruments LMK04832 (clock jitter cleaner / distribution class)
  • Analog Devices AD9545 (DPLL clock synchronizer / translator)
  • Skyworks (Silicon Labs) Si5345 (high-performance jitter attenuator)
  • Microchip ZL30772 (IEEE 1588 / SyncE timing device family)
  • Renesas 8A34001 (system synchronizer for IEEE 1588 / SyncE)
Retimer / CDR (reach extension + controlled latency behavior)
  • Texas Instruments DS280DF810 (multi-rate 8-channel retimer class)
  • Timing note: retimers can change determinism via buffering; verify step & asymmetry gates
Fanout / clock distribution buffers (skew + multi-output distribution)
  • Texas Instruments LMK1C1104 (LVCMOS clock buffer family)
  • Analog Devices ADCLK948 (low-jitter clock fanout buffer class)
  • Skyworks (Silicon Labs) Si5332 (clock generator / fanout style clock tree building)
  • Microchip ZL40213 (precision LVDS fanout buffer class)
Glitch-free / hitless clock mux & redundancy switching (avoid switching steps)
  • Renesas ICS581-01 (glitch-free PLL-based clock multiplexer)
  • Renesas 8T49N285 (hitless reference switching + holdover class)
  • Microchip ZL30168 (glitch-less / hit-less reference switching class)
Timestamp-capable edges (fix timestamp placement before swapping cleaners)
  • Texas Instruments DP83640 (IEEE 1588 PTP Ethernet PHY, “close-to-the-wire” timestamp class)
  • NXP SJA1105 (TSN/AVB switch family; supports IEEE 1588v2 transparent clock update class)

Scope boundary: selection here focuses on timestamp tap location and latency behavior; protocol profile and TSN scheduling details belong to the TSN Switch/Bridge or Industrial Ethernet SoC pages.

The 10 questions that must be answered (turn this into a requirements sheet)

Question (what) | Why it matters (mechanism) | Quick measurement (how) | Target (X)
Integrated jitter band (X–Y) | Defines clock-noise budget and cleaner/DPLL need | Same RBW/VBW, same integration limits, same capture length | ≤ X fs rms
PTP offset RMS and mean | RMS = noise; mean shift = bias/asymmetry | Lock tap point; compare same path A/B only | RMS ≤ X ns; |mean| ≤ X ns
Max allowed delay step | Retimer/FIFO/retrain can create discrete jumps | Trigger on mode change; measure Δt distribution | Step ≤ X ns
Drift vs temperature / supply | Separates clock drift from path drift | Temp sweep; log fan state and rail modes | ≤ X ns/°C; ≤ X ns/V
Holdover requirement | Defines DPLL/holdover class and ref redundancy | Pull reference; record drift over time window | Drift ≤ X over X s
Allowed asymmetry budget | Up/down mismatch becomes mean offset bias | Swap directions; measure bias sign/magnitude | |Δt_up−Δt_dn| ≤ X ns
Timestamp resolution floor | Quantization can dominate at ns-level goals | Check TSU tick size and tap placement | ≤ X ns
Load sensitivity (queueing) | Offset RMS can "look like jitter" but is traffic-driven | Traffic sweep; keep path and tap constant | ΔRMS ≤ X ns
Skew budget across outputs | Fanout selection depends on skew + enable behavior | Measure output-to-output skew under real loading | Skew ≤ X ps
Switching requirement | Glitch/step on ref switching can reset the timing budget | Force LOS; check hitless behavior and recovery time | Hitless: pass; recovery ≤ X

Copy these rows into a project requirement sheet before selecting devices.


FAQs: Timing & Synchronization (hardware-path focused)

These FAQs cover long-tail troubleshooting without expanding the main text. Each answer follows a fixed, data-oriented four-line structure (likely cause, quick check, fix, pass criteria) and stays within clock/timestamp hardware semantics (tap point, asymmetry, latency steps, loop-BW effects, measurement setup).

Cleaner output jitter looks great, but PTP offset still drifts — tap point or asymmetry first?

Likely cause: Timestamp semantics are inconsistent (tap location differs) or a fixed asymmetry bias dominates mean offset, not clock noise.

Quick check: Lock the same tap definition (PHY/MAC/ingress/egress) and run a direction-swap test; compare mean offset vs RMS offset under identical path/load.

Fix: Standardize tap placement end-to-end; remove or compensate asymmetry sources (mismatched paths, buffering modes, unbalanced ingress/egress points); then re-check RMS as the next step.

Pass criteria: |mean offset| ≤ X ns and RMS offset ≤ X ns over X s, with the same tap + same path + same load.
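The mean-vs-RMS split used above is easy to compute directly, and the asymmetry relation is the standard PTP result that half of any up/down path mismatch lands in the mean offset (helper names are ours; a sketch, not a calibration routine):

```python
import statistics

def offset_stats(samples_ns):
    """Mean ('bias-like') and RMS ('noise-like') of a PTP offset record."""
    mean = statistics.fmean(samples_ns)
    rms = statistics.fmean(s * s for s in samples_ns) ** 0.5
    return mean, rms

def asymmetry_bias_ns(delay_up_ns, delay_dn_ns):
    """PTP assumes symmetric paths, so half the mismatch appears as mean offset error."""
    return (delay_up_ns - delay_dn_ns) / 2.0
```

If the direction-swap test flips the sign of the mean while RMS stays put, the asymmetry term is dominating and no amount of clock cleaning will move it.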

Same system, after swapping a retimer the offset shifted by a constant amount — deterministic latency change or FIFO-level behavior?

Likely cause: A deterministic baseline latency changed, or buffer/FIFO behavior introduces discrete steps depending on link state.

Quick check: Force repeated link re-train / mode toggles and record offset vs time; a constant shift stays constant, while FIFO/state issues produce step events correlated to state transitions.

Fix: Lock retimer mode and buffering policy; align up/down path devices and modes; calibrate a single constant offset only if the system proves step-free across state changes.

Pass criteria: Latency step ≤ X ns across X retrain/mode cycles; mean offset change is stable within ±X ns.

Offset gets worse at low/high temperature — XO drift or path-delay temperature drift first?

Likely cause: Temperature-dependent delay in the physical path (cables, buffers, retimers, switch fabric) dominates, or the timebase drifts due to reference/PLL behavior.

Quick check: Run a temperature sweep with fixed load and fixed tap; compute drift slope (ns/°C). Repeat with a “short/golden path” baseline to see whether slope follows the path or the clock domain.

Fix: Reduce temperature sensitivity of the dominant segment (stabilize path, lock modes, avoid state changes with temperature); if timebase drift dominates, adjust loop/holdover strategy and validate again.

Pass criteria: Drift slope ≤ X ns/°C over X–Y °C; no temperature-triggered steps > X ns.

Offset is great at idle, but degrades under traffic — queueing/shaping or load-dependent latency first?

Likely cause: Traffic introduces variable residence time/queueing or a load-coupled latency term, which increases offset RMS and can look like “more jitter”.

Quick check: Perform a load sweep (0 → X%); plot RMS offset vs load while keeping the same tap and the same route. Queueing effects typically scale with load and scheduling regime.

Fix: Fix the path and shaping policy for determinism; move timestamp closer to the wire if possible; lock switch/queue modes; validate again under the same load profile.

Pass criteria: RMS offset increase from 0→X% load ≤ X ns; mean bias shift ≤ X ns under the same tap/path.
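The load-sweep check reduces to comparing RMS offset at each load point against idle. A sketch; the dict layout (load percent → RMS in ns) is an assumption about how the sweep is logged:

```python
def load_delta_rms(rms_ns_by_load_pct):
    """Delta-RMS offset relative to idle (0% load) for each non-idle load point."""
    idle = rms_ns_by_load_pct[0]            # idle capture is the baseline
    return {load: rms - idle
            for load, rms in sorted(rms_ns_by_load_pct.items())
            if load != 0}
```

A delta that scales with load points at queueing/residence time; a flat delta with an offset points back at a fixed latency term that merely changed with configuration.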

Two devices both claim “hardware timestamps”, yet they won’t align — which MAC/PHY layer is actually stamping?

Likely cause: Timestamp tap points are different (PHY vs MAC vs switch ingress/egress), or timestamp quantization/resolution differs, producing a systematic bias and/or floor.

Quick check: Explicitly document the tap location and timestamp resolution on both sides; then run a same-path, same-load comparison. If mean offset shifts but RMS stays similar, suspect tap mismatch.

Fix: Harmonize tap semantics (choose a consistent stamping layer), reduce hidden pipeline/queue between tap and wire, and ensure identical ingress/egress definitions in the measurement setup.

Pass criteria: Tap location matches end-to-end; timestamp resolution ≤ X ns; |mean offset| ≤ X ns on a fixed path.

PN curve looks “better” after changing settings — how to detect RBW/VBW/integration-band artifacts?

Likely cause: Measurement settings reshaped the reported spectrum (RBW/VBW smoothing, averaging, or changed integration limits), hiding spurs or compressing the noise floor.

Quick check: Freeze settings (RBW/VBW, window, averaging, capture length, integration band) and repeat; then intentionally vary one setting to see whether “improvement” tracks the setting rather than the DUT.

Fix: Define a single measurement contract for the project (settings + integration band + duration) and only compare results taken under that contract; store settings with every plot/log.

Pass criteria: Under fixed settings, integrated jitter ≤ X fs rms (band X–Y) and spur amplitude ≤ X dBc (same RBW/VBW).
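The "measurement contract" can literally be a frozen settings record that every result is checked against before any comparison is allowed. The field names and values below are examples of what to freeze, not recommended instrument settings:

```python
# Frozen per-project measurement contract (example fields and values only).
MEASUREMENT_CONTRACT = {
    "rbw_hz": 10.0,
    "vbw_hz": 10.0,
    "averaging": 16,
    "capture_s": 1.0,
    "band_hz": (12e3, 20e6),
}

def contract_violations(instrument_settings):
    """Return {field: (expected, actual)} for every deviation; empty dict == compliant."""
    return {k: (v, instrument_settings.get(k))
            for k, v in MEASUREMENT_CONTRACT.items()
            if instrument_settings.get(k) != v}
```

Storing the contract alongside every plot/log means a "better" PN curve can be immediately checked against the settings it was taken under.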

Locks fast but occasionally loses lock — adjust loop BW first or improve input reference quality?

Likely cause: Loop BW/peaking makes the system sensitive to reference disturbances, or the input reference intermittently violates stability/noise assumptions.

Quick check: Correlate unlock events with reference status (LOS/alarms) and with environment/load; measure whether unlock coincides with a peaking band or reference disturbance.

Fix: First eliminate reference integrity issues (distribution, alarms, mode stability); then tune loop BW to reduce sensitivity while preserving required tracking/holdover behavior.

Pass criteria: Unlock events = 0 over X h; peaking ≤ X dB; integrated jitter ≤ X fs under the same test conditions.

Enabling SSC makes the link steadier but time accuracy worse — cleaner tracking or timestamp path sensitivity?

Likely cause: The timing chain tracks modulation differently (loop-follow behavior) or the timestamp path becomes more load/latency sensitive, increasing offset RMS or bias.

Quick check: With identical tap/path/load, compare mean offset vs RMS offset (SSC on/off). If mean shifts, suspect bias/asymmetry; if RMS grows with load, suspect path sensitivity.

Fix: Stabilize timestamp semantics (tap closer to wire; fixed queue/path modes), then adjust tracking strategy only if required by the dominant error term; re-validate with gated tests.

Pass criteria: ΔRMS (SSC on−off) ≤ X ns and |Δmean| ≤ X ns at load = X%, with fixed measurement semantics.

Same board, different test stations show different jitter — what is the first correlation calibration?

Likely cause: Station measurement contracts differ (reference lock, RBW/VBW, integration band, trigger/timebase), not the DUT.

Quick check: Run a station-to-station A/B with the same reference source, same cabling/fixtures, and identical settings; compare the delta between stations before comparing DUTs.

Fix: Standardize and lock the measurement contract; align references and timebases; store settings + screenshots/logs with results for traceability.

Pass criteria: Station delta ≤ X fs rms (same band X–Y) and ≤ X dB spur deviation under the same contract.

Offset occasionally “jumps by one step” — mode switching, re-training, or fan/thermal disturbance first?

Likely cause: Discrete latency step from state changes (retrain, mode transitions, FIFO level shift), or an environment-triggered event (fan mode, thermal control) that causes a path/loop reconfiguration.

Quick check: Time-align the step event with system logs (mode, link state, fan PWM, alarms); repeat controlled mode toggles to see if the step is reproducible and quantized.

Fix: Lock the system into a single stable mode; disable or bound automatic transitions during validation; if needed, calibrate only after proving step-free behavior across transitions.

Pass criteria: Step amplitude ≤ X ns and step rate ≤ X / hour under fixed configuration; no correlated mode events in logs.
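A quantized step is easy to separate from noise in post-processing: flag sample-to-sample jumps above a threshold, then time-align the flagged indices with the event log. A minimal sketch (the threshold is a placeholder for the project's X):

```python
def find_steps(offsets_ns, threshold_ns):
    """Indices where the offset jumps by more than threshold between adjacent samples."""
    return [i for i in range(1, len(offsets_ns))
            if abs(offsets_ns[i] - offsets_ns[i - 1]) > threshold_ns]
```

If every flagged index lines up with a logged mode/retrain/fan event, the step is configuration-driven; flagged indices with no log correlate suggest an unlogged state change or a measurement artifact.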

Cleaner placed near the source, yet the far end is still bad — distribution/return/coupling or remote CDR behavior first?

Likely cause: The distribution segment adds noise/coupling/skew, or the far-end recovery/clocking stage (CDR/retimer) dominates with its own transfer/peaking behavior.

Quick check: Measure at intermediate nodes (source → post-distribution → far end) with a consistent contract; identify where the delta appears. If degradation only shows after the far-end stage, suspect its transfer/peaking.

Fix: Harden the distribution path (mode/termination/return discipline) and lock far-end recovery modes; relocate cleaning only after the dominant segment is proven by segmented measurements.

Pass criteria: Segment-to-segment delta ≤ X fs rms (band X–Y) and far-end meets integrated jitter ≤ X fs rms under the same contract.

Reducing loop BW improved noise, but wander/holdover got worse — how to quickly confirm BW is the culprit?

Likely cause: Lower BW increases clean-up but reduces low-frequency tracking, making slow disturbances accumulate as wander or degrading holdover behavior.

Quick check: Run a controlled “slow disturbance” or reference-loss test; compare wander/holdover metrics for BW=A vs BW=B with identical conditions (tap/path/load).

Fix: Choose BW to satisfy both noise and tracking/holdover requirements; if necessary, use a staged tuning strategy (stability → noise → time accuracy) and re-validate gates.

Pass criteria: Wander metric ≤ X over X s and holdover drift ≤ X over X s, while integrated jitter remains ≤ X fs.
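For the BW=A vs BW=B wander comparison, a simple MTIE-style figure works: the worst peak-to-peak TIE over a sliding observation window. A naive O(n·w) sketch that is fine for short captures (production tools compute this per ITU-T definitions across many window sizes):

```python
def mtie(tie_ns, window_samples):
    """Max peak-to-peak time-interval error over every window of the given length."""
    return max(
        max(tie_ns[i:i + window_samples]) - min(tie_ns[i:i + window_samples])
        for i in range(len(tie_ns) - window_samples + 1)
    )
```

Computing this at the same window sizes for both loop bandwidths makes the noise-vs-tracking trade explicit: lower BW should win on integrated jitter but can lose on MTIE at long windows.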