Sync, Trigger & Timing for Medical Imaging & Monitoring
This page shows a practical system method for aligning multi-board, multi-channel acquisition using a low-jitter clock tree, deterministic trigger routing, hardware timestamps, and continuous jitter/latency monitoring—so timing quality can be proven, not assumed.
The result is fewer timing-driven artifacts (bands/misaligned frames/drift) and a measurable pass/fail workflow for calibration, production test, and field debugging.
H2-1 · What this page answers
This page provides a system method to align multi-board / multi-channel acquisition to one timebase using a low-jitter clock tree, deterministic trigger routing, a hardware timestamp loop, and continuous jitter/health monitoring.
Practical benefit: fewer stripe-like artifacts, fewer intermittent mis-frames, and fewer “events that do not line up” due to drift, skew, or trigger uncertainty.
What can be verified (engineering pass/fail vocabulary)
- Clock jitter: short-term timing uncertainty (ps RMS class) at endpoints after cleaning + distribution.
- Channel skew: inter-channel relative arrival mismatch (fixed + temperature/aging components).
- Trigger latency: mean delay and its distribution (σ, tails, multi-peak behavior) through the trigger fabric.
- Drift / holdover: long-term offset accumulation when external reference is lost or switched.
- Lock / health: lock state, holdover entries, ref-switch events, and threshold-based alerts with logs.
Scope boundary (to avoid cross-page overlap)
- Covered here: reference selection, PLL cleaning, clock-tree distribution, trigger routing, hardware timestamp loop, jitter/health monitoring.
- Not expanded here: PCIe/DMA capture pipelines, codec/security engines, recorder/storage architecture, modality-specific analog front-ends.
H2-2 · Where timing breaks systems (symptoms → fastest isolation path)
Typical search intents answered here
- Why do stripes, intermittent mis-frames, or trigger drift show up “sometimes” and then disappear?
- Is the issue clock jitter, channel skew, trigger latency, or software timestamps?
- What is the minimum set of measurements to isolate the root cause fast?
Failure modes (3 buckets) with “what to measure”
A) Excess jitter / phase noise (timing uncertainty is too large)
- What it looks like: elevated noise floor, unstable edge timing, phase-sensitive processing becomes inconsistent across runs.
- Why it happens: sampling instants become random variables; higher-frequency content becomes more sensitive to the same jitter level.
- Minimum verification: endpoint jitter stats (RMS + tails) or time-interval error (TIE) trend vs time, correlated with lock/holdover events.
- Likely knobs: reference quality, PLL cleaning bandwidth choices, fanout stages, endpoint retime strategy (do not guess—measure).
B) Channel skew / mismatch (relative alignment is broken)
- What it looks like: multi-channel fusion fails, “same event” appears at different times across channels/boards.
- Minimum verification: common trigger → all channels timestamp the event → inspect Δt distribution (mean, peak-to-peak, σ, temperature dependence).
- Key insight: skew contains a fixed component (calibratable) and a random component (jitter/thermal/wander) that must be budgeted.
- Likely knobs: topology (star vs daisy-chain), fanout skew spec, trace/connector symmetry, endpoint retime placement + calibration table.
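The Δt inspection in the verification step above can be sketched in a few lines (a minimal illustration; channel names and timestamp values are hypothetical):

```python
import statistics

def skew_stats(arrivals_ns):
    """Per-channel skew for one common-trigger event.

    arrivals_ns: dict of channel -> hardware timestamp (ns) of the
    same trigger edge. Skew is reported relative to the earliest
    arrival, so fixed offsets stay positive and calibratable.
    """
    t0 = min(arrivals_ns.values())
    skew = {ch: t - t0 for ch, t in arrivals_ns.items()}
    values = list(skew.values())
    return {
        "per_channel_ns": skew,
        "mean_ns": statistics.mean(values),
        "p2p_ns": max(values) - min(values),
        "sigma_ns": statistics.pstdev(values),
    }

# One event stamped on four channels (hypothetical numbers):
s = skew_stats({"ch0": 1000.0, "ch1": 1002.5, "ch2": 1001.0, "ch3": 1000.5})
```

Repeating this over many events and temperatures separates the fixed component (stable per-channel mean, calibratable) from the drifting component (σ and mean shift across conditions).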
C) Timestamp inconsistency (event ordering is wrong across devices)
- What it looks like: event order flips, replay does not line up, time gaps appear after resets or ref switches.
- Minimum verification: compare hardware timestamp vs software time for the same event; software tends to show queue/scheduling variability.
- Key insight: without a disciplined hardware counter (TSU), “time” becomes a software artifact that is hard to reproduce under load.
- Likely knobs: TSU discipline (PPS/PTP), counter reset rules, monotonicity guarantees, log correlation using unique event IDs.
Fastest isolation SOP (minimum steps)
- Check timebase health first: lock state, holdover entries, ref-switch timestamp, and whether the symptom aligns with these events.
- Measure trigger latency as a distribution: mean + σ + tails + multi-peak behavior (multi-peak often indicates routing/priority changes).
- Close the timestamp loop: PPS/PTP alignment error and counter drift; confirm event stamps remain monotonic and within threshold.
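The last SOP step (closing the timestamp loop) reduces to two checks that are easy to automate: a monotonicity check on the event stamps and a bound on the PPS alignment error. A minimal sketch, with a threshold that is a placeholder, not a recommendation:

```python
def timestamp_health(stamps_ns, pps_err_ns, max_pps_err_ns=100.0):
    """Close the timestamp loop (SOP step 3).

    stamps_ns: event timestamps in arrival order (hardware TSU).
    pps_err_ns: recent PPS alignment-error samples.
    Returns the individual checks plus an overall pass/fail.
    """
    monotonic = all(b >= a for a, b in zip(stamps_ns, stamps_ns[1:]))
    worst_pps = max(abs(e) for e in pps_err_ns)
    return {
        "monotonic": monotonic,
        "worst_pps_err_ns": worst_pps,
        "pass": monotonic and worst_pps <= max_pps_err_ns,
    }
```

A non-monotonic stamp sequence after a reset or ref switch is exactly the "time gap" symptom from bucket C, and it fails this check immediately.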
Common misdiagnoses that waste debugging time
- Treating a fixed latency problem as random jitter (fixed offsets are calibrated; random jitter is reduced/budgeted).
- Looking only at average delay and ignoring the distribution tails (field failures hide in tails).
- Explaining everything with software timestamps without a hardware TSU reference (reproducibility collapses).
H2-3 · Time reference choices (accuracy, short-term stability, holdover)
Engineering goal: pick a reference strategy that matches what the system actually needs—short-term phase stability, long-term time accuracy, and predictable behavior when the external reference disappears (holdover).
A stable reference is not the same as a clean sampling clock. This section focuses on the reference layer only: what to lock to, how to switch, and how to behave predictably during outages.
Three metrics that must be separated (to avoid “spec soup”)
- Short-term phase stability (ps-class): determines how stable the timing is over microseconds to milliseconds. This drives phase consistency and sets the foundation for low-jitter clock cleaning in the next section.
- Frequency accuracy (ppm): determines how fast time error accumulates over seconds to hours. Poor ppm causes timestamps to drift even if the waveform “looks fine” in a short capture.
- Holdover & wander (outage behavior): determines how predictable time error remains when GNSS/10 MHz is lost. A good design makes the growth of error monotonic, bounded, and loggable.
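The ppm metric maps directly onto accumulated time error, which makes the distinction between the three metrics concrete:

```python
def accumulated_error_us(offset_ppm, elapsed_s):
    """Time error accumulated by a free-running clock with a given
    frequency offset (ppm) over elapsed_s seconds, in microseconds.
    1 ppm corresponds to 1 us of error per second."""
    return offset_ppm * elapsed_s

# A 2 ppm reference left undisciplined for one hour:
err_us = accumulated_error_us(2.0, 3600)  # 7200 us = 7.2 ms
```

A capture window of a few milliseconds would "look fine" on a scope, yet cross-device timestamps would already disagree by milliseconds after an hour, which is why ppm and short-term phase stability must be specified separately.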
Selection boundary (fast decision questions)
Q1) Is the system more sensitive to short-term phase noise or long-term drift?
- Phase-sensitive alignment dominates → prioritize a cleanable reference chain (VCXO/OCXO discipline + well-behaved switching).
- Long-run time consistency dominates → prioritize low ppm, disciplined counter behavior, and tight logging of ref events.
Q2) What happens when external reference is lost, and for how long must operation continue?
- Outages are expected → require defined holdover mode: error growth is monitored, thresholded, and logged.
- Outages are rare or not critical → simpler reference still needs lock state + ref-switch logs (debuggability matters).
Q3) Does the system need standard external reference inputs?
- System-level synchronization → support 10 MHz and/or PPS as disciplined reference paths with safe switching.
- Standalone operation → still recommended to keep a service/calibration reference path (for validation and field correlation).
Practical outcomes (what “good” looks like)
- Predictable switching: reference changes do not create unexplained timing jumps; transitions are tracked and bounded.
- Traceable holdover: the timebase state (LOCK / HOLDOVER / REF-SWITCH) is always visible and recorded.
- Measurable drift: PPS/phase error trend is available to correlate field issues with timebase health.
H2-4 · Clock cleaning & PLL architecture (turn a “dirty” ref into clean sampling clocks)
Key idea: long-term alignment and short-term cleanliness are different problems. A practical architecture separates them so the system can stay locked and still meet endpoint jitter targets.
A two-stage PLL is common because it decouples availability (lock + holdover + switching) from phase-noise performance (jitter cleaning).
Why two PLL stages (and what each stage is responsible for)
- PLL1 (lock & holdover stage): tracks the selected reference, manages ref switching, and keeps timebase continuity during outages.
- PLL2 (jitter cleaner stage): suppresses short-term phase noise so endpoints receive stable clocks that meet the jitter budget.
- Fanout / retime layer: replicates clean clocks to multiple domains while controlling skew and added jitter (distribution details expand later, not here).
Loop bandwidth (LBW) intuition (no long math, but real consequences)
Wide LBW (fast tracking)
- Follows reference changes quickly (good for lock and switch behavior).
- Risks passing more reference phase noise through to the output if the reference is noisy.
Narrow LBW (strong filtering)
- Filters reference noise more aggressively (good for jitter targets).
- Tracks slowly; poor tuning can cause long lock times or undesirable transient behavior during switching.
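The trade-off above can be illustrated with a first-order loop model: reference noise is low-pass filtered at the loop bandwidth, local-oscillator noise is high-pass filtered at the same corner. This is an intuition sketch, not a real PLL model, and every curve passed in is hypothetical:

```python
def integrated_phase_noise(freqs_hz, ref_dbc, vco_dbc, lbw_hz):
    """First-order LBW intuition: output noise = reference noise
    low-passed at the loop bandwidth + VCO noise high-passed at the
    same corner. Integrates single-sideband phase noise (dBc/Hz)
    over the given offset grid (trapezoidal), returning rad^2."""
    out = []
    for f, r, v in zip(freqs_hz, ref_dbc, vco_dbc):
        h_ref = 1.0 / (1.0 + (f / lbw_hz) ** 2)   # |H_ref(f)|^2, low-pass
        h_vco = 1.0 - h_ref                        # |H_vco(f)|^2, high-pass
        out.append(10 ** (r / 10) * h_ref + 10 ** (v / 10) * h_vco)
    area = 0.0
    for i in range(len(freqs_hz) - 1):
        df = freqs_hz[i + 1] - freqs_hz[i]
        area += 0.5 * (out[i] + out[i + 1]) * df
    return area  # integrated SSB noise power, rad^2
```

Sweeping `lbw_hz` with a noisy reference and a quiet VCO (or the reverse) shows why the same cleaner needs different bandwidth settings in different systems: the optimum sits where the two filtered contributions cross.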
Engineering priority rule (prevents common spec traps)
- Do not optimize for frequency synthesis convenience first. The output phase noise / integrated jitter budget is the top constraint for timing integrity.
- Always validate as distributions: mean is not enough—track σ and tails for jitter and for trigger/clock-related timing paths.
- Make transitions observable: lock/holdover/ref-switch events must be logged and correlated with measured timing metrics.
Minimum verification checklist (what must be measurable)
- PLL lock: lock time, lock stability, and ref-switch transient behavior.
- Holdover behavior: drift trend over time while in holdover state (and clear thresholds for alerts).
- Endpoint clock quality: jitter statistics after PLL2 + distribution (RMS + tails).
- Observability: logs include ref source, state transitions, and measurement snapshots around events.
H2-5 · Clock tree & fanout distribution (topology, load, retime)
Goal: deliver clocks to every board/channel while staying inside a measurable jitter budget and a measurable skew budget, with any unavoidable fixed delays made calibratable.
A distribution design is correct only when its uncertainty is bounded and verifiable: added jitter per stage, skew per branch, and retime delay offsets per endpoint.
The three knobs that control outcomes (and what each one changes)
- Topology (star vs daisy-chain): shapes how skew accumulates and how failures propagate; it defines whether endpoints share similar paths or inherit cascaded delay.
- Load & fanout stages: each buffer/replicator has an added-jitter cost and may amplify sensitivity to power/ground noise if not budgeted.
- Retime points (endpoint retimer / PLL / clock recovery): can reduce cross-board variation, but turns differences into a fixed delay that must be measured and compensated.
Budget “ledger” (minimum accounting that prevents surprises)
Jitter budget (endpoint short-term uncertainty)
- Start with the cleaned clock at the distribution entry (post-PLL cleaning stage).
- Add Jadd for every fanout/buffer/retimer stage in the path.
- Validate at the endpoint: RMS + tails, not only a typical number.
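Assuming the stage contributions are uncorrelated, the ledger combines them as a root-sum-square. A minimal sketch (all numbers hypothetical):

```python
import math

def endpoint_jitter_rms_fs(entry_fs, stage_additive_fs):
    """Jitter ledger: entry (cleaned) clock jitter plus the additive
    jitter of each fanout/buffer/retimer stage in the path, combined
    as root-sum-square (valid when contributions are uncorrelated)."""
    total_sq = entry_fs ** 2 + sum(j ** 2 for j in stage_additive_fs)
    return math.sqrt(total_sq)

# 120 fs entry clock through two stages (50 fs and 80 fs additive):
rms = endpoint_jitter_rms_fs(120.0, [50.0, 80.0])  # ~152.6 fs
```

The ledger predicts the endpoint RMS only; tails must still be measured at the endpoint, because correlated noise (shared supply, coupled traces) breaks the RSS assumption exactly where it hurts.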
Skew budget (channel-to-channel arrival differences)
- Fixed skew: length/topology driven → can be minimized by symmetry or corrected by calibration.
- Drifting skew: temperature/power dependent → must be bounded and monitored across conditions.
- Report skew as a distribution (mean, p-p, σ) under defined conditions.
Fixed delay & calibration (what retiming changes)
- Endpoint retiming may add a stable, fixed ΔD per channel/board.
- Stability is useful when it is measured: store per-endpoint delay offsets with a calibration version ID.
- Any topology change, retimer configuration change, or ref-switch policy change must bump the calibration context.
Minimum verification checklist (fast to execute, hard to fake)
- Measure endpoint clock quality after the full tree: RMS + tails jitter.
- Measure skew distribution across channels/boards at multiple conditions (temperature and supply corners).
- If retiming is used, produce a CAL table: per-endpoint delay offset + calibration version ID.
- Log distribution health snapshots near ref-switch/holdover events (correlation matters).
H2-6 · Trigger routing fabric (matrix, priority, interlock)
Goal: route multiple trigger sources to multiple endpoints with predictable latency and traceable behavior. A correct design measures and logs mean, σ, and tails for trigger delay.
A trigger fabric is not “more GPIO.” It is a routing matrix with rules (priority + interlock) and a timestamp path so delay distributions can be verified and correlated.
Trigger shapes (three forms, three different failure modes)
- Pulse (edge/pulse): starts an action. Verify arrival distribution and repeatability across routes and conditions.
- Gate (window): enables a time window. Verify both edges (open/close) and window width error under switching and holdover states.
- Event marker + timestamp: records “what happened when” for cross-device ordering. Verify marker-to-stamp latency and its σ/tails.
Predictability criteria (what must be logged, not guessed)
- Latency mean: typical route delay (configuration dependent).
- Latency σ: jitter of the route (exposes synchronization and contention issues).
- Tails / worst-case: outliers that cause missed windows and hard-to-reproduce faults.
- Route profile ID: which matrix mapping and rule set is active.
- Interlock state: whether safety gating is allowing or blocking propagation.
- Timebase state: lock/holdover/ref-switch context for correlation.
Minimum validation SOP (turn it into a histogram, then trust the data)
- Route one input to multiple outputs; stamp each arrival with a timestamp unit (TSU).
- Generate a delay histogram per route: check mean, σ, and tails.
- Switch route profiles and priority rules: confirm distributions change in explainable ways (no multi-peak surprises).
- Toggle interlocks: confirm blocked triggers produce a log event with cause + timebase state.
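The histogram step of this SOP can be sketched as follows; the multi-peak check is deliberately crude (a gap between occupied histogram bins), and the bin width and gap threshold are placeholders:

```python
import statistics

def latency_profile(delays_ns, bin_ns=1.0):
    """Per-route latency profile: mean / sigma / p99 plus a crude
    multi-peak flag. Two clusters of delays separated by a gap of
    more than three empty bins suggest two populations, i.e. a
    routing or priority path change. Thresholds are illustrative."""
    delays = sorted(delays_ns)
    mean = statistics.mean(delays)
    sigma = statistics.pstdev(delays)
    p99 = delays[min(len(delays) - 1, int(0.99 * len(delays)))]
    bins = sorted({int(d / bin_ns) for d in delays})
    multi_peak = any(b2 - b1 > 3 for b1, b2 in zip(bins, bins[1:]))
    return {"mean": mean, "sigma": sigma, "p99": p99,
            "multi_peak": multi_peak}
```

Tagging each result with the active route profile ID, interlock state, and timebase state (as listed above) is what turns the histogram from a lab curiosity into field-debuggable evidence.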
H2-7 · Timestamp unit & time domains (local / global / capture)
This section builds a “timestamp closed loop”: discipline a local counter with PPS/PTP, stamp events at hardware points, and publish error stats + logs so alignment quality can be proven.
The practical goal is reliable cross-device ordering and replay alignment—without being dominated by software scheduling tails.
Time domains (define first, then align)
Local time (device counter)
- Monotonic ticks inside FPGA/MCU; ideal for in-device ordering and latency histograms.
- Drifts over long windows unless disciplined; drift must be tracked as a measured quantity.
- Use when verifying trigger-path latency and pipeline determinism.
Global time (aligned to PPS/PTP)
- Mapping of the local counter onto a common reference; enables cross-device event ordering.
- Always carry the timebase state (LOCK / HOLDOVER / REF-SW) so replay stays explainable.
- Use when correlating events between boards (triggers, capture stamps, frame tags).
Capture time (stamp at the physical event)
- Hardware stamps taken at trigger edges, sample-enable boundaries, or frame-tag insertion points.
- Minimizes uncertainty by avoiding queue/interrupt/scheduler tails.
- Use when debugging “rare” misalignment: tails usually dominate the fault signature.
Hardware vs software timestamps (why tails matter)
- Software stamps are separated from the physical event by scheduling, buffering, and contention; mean can look fine while tails break alignment.
- Hardware stamps are taken at a defined pipeline point; latency is modelable, measurable, and correctable as a calibrated parameter.
- Alignment quality should be reported as distributions: mean, σ, and tails, plus timebase state.
Minimal TSU implementation (small module, big leverage)
- Discipline Counter: update offset/ppm/phase-error registers from PPS/PTP alignment inputs.
- Timestamp Stamps: stamp trigger/capture/frame-tag events and attach event type + route profile ID.
- Error Stats: maintain mean/σ/tails for marker-to-stamp latency and PPS phase error.
- Logs: record timebase state (LOCK/HOLDOVER/REF-SW) and configuration changes for correlation.
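A minimal software model of the Discipline Counter idea above (a real TSU lives in FPGA/MCU hardware; the class shape, field names, and servo gain here are illustrative only):

```python
class DisciplinedCounter:
    """Sketch of TSU discipline: map free-running local ticks onto
    global time via an offset and a ppm correction, with the offset
    nudged toward each PPS/PTP alignment observation."""

    def __init__(self, tick_ns):
        self.tick_ns = tick_ns   # nominal local tick period (ns)
        self.offset_ns = 0.0     # local -> global offset
        self.ppm = 0.0           # measured frequency error

    def to_global_ns(self, local_ticks):
        t = local_ticks * self.tick_ns
        return t + self.offset_ns + t * self.ppm * 1e-6

    def discipline(self, local_ticks, reference_ns, gain=0.1):
        """One servo step: move the offset by a fraction of the
        observed error. A fuller TSU would also steer self.ppm from
        the error slope and log the residual as a health metric."""
        err = reference_ns - self.to_global_ns(local_ticks)
        self.offset_ns += gain * err
        return err
```

Each returned `err` is exactly the "PPS phase error" statistic that later sections trend and threshold; logging it alongside the timebase state closes the loop.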
H2-8 · Jitter / skew / latency budgets (from specs to measurable criteria)
A system-level spec stays honest only when it separates clock jitter, channel skew, and trigger latency, and ties each to a measurement point and a distribution (mean/σ/tails).
Budgets prevent “typical value marketing”: every segment contribution is listed, measured, and reconciled at the endpoint.
Three independent budgets (do not mix them)
- Clock jitter budget: short-term phase uncertainty. Track contributions from Ref, PLL, Fanout, Layout, and Endpoint.
- Skew budget: channel-to-channel arrival difference. Separate fixed skew (calibratable) from drifting skew (temperature/power dependent).
- Latency budget: trigger path delay. Measure per route profile and report mean, σ, and tails (multi-peak indicates path changes).
Measurement mapping (each budget must have a “proof method”)
- Jitter: measure at entry, after fanout tree, and at endpoints; publish RMS + tails under defined conditions.
- Skew: stamp the same event on multiple channels; publish Δt distribution (mean/σ/p-p) and its drift across corners.
- Latency: for each route profile, produce a histogram; publish mean/σ/tails and flag multi-peak signatures.
Anti-marketing rules (keep specs measurable)
- No “typical-only” numbers: require conditions (integration range / bandwidth / temperature) and tails.
- No single-point jitter: require endpoint reconciliation against the contribution ledger.
- No RMS-only acceptance: tails often cause rare artifacts and missed windows.
- No method, no metric: a spec is invalid without a measurement point and a histogram/stat summary.
H2-9 · Jitter monitoring & health telemetry (lock, drift, anomaly alerts)
Time quality becomes actionable only when it is exported as health metrics that can be trended, compared between devices, and correlated with logs and route profiles.
The monitoring “triad” is: state (LOCK/HOLDOVER), alignment error (PPS phase error vs time), and path determinism (trigger latency histograms).
Monitoring triad (what to record, how to interpret)
1) Lock / holdover state and transitions
- Record: timebase_state (LOCK / HOLDOVER / REF-SW), entry count, time in state, last switch reason.
- Interpret: frequent transitions indicate unstable reference, tight thresholds, or environmental sensitivity.
- Correlate: transitions must align with changes in PPS error trends and latency tail growth.
2) PPS phase error statistics (phase error vs time)
- Record: mean / σ in a short window, p99 (tails), and a long-window trend slope (wander).
- Interpret: σ growth suggests rising short-term noise; slope growth suggests frequency error or holdover drift.
- Correlate: annotate each sample with timebase_state so holdover segments are explainable.
3) Trigger arrival time distribution (latency histogram)
- Record per route profile: mean / σ / p99 plus multi-peak flag to detect path changes.
- Interpret: mean drift indicates added stages; σ growth indicates non-determinism; tail growth explains rare misalignment.
- Correlate: tag each histogram with profile_id, interlock_state, and timebase_state.
Minimal health metrics (export-friendly)
- timebase_state, lock_uptime_ratio, holdover_entry_count, ref_switch_count, time_in_holdover
- pps_phase_err_mean, pps_phase_err_sigma, pps_phase_err_p99, pps_trend_slope
- trigger_lat_mean, trigger_lat_sigma, trigger_lat_p99, multi_peak_flag
- profile_id, cal_version_id, fw_build_id, last_alarm_code, last_ref_switch_reason
Alert classes should include threshold (instant), trend (drift), and state (frequent transitions).
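The three alert classes can be sketched against the metric names above (field names mirror the export list; every limit is a deployment-specific assumption, not a recommendation):

```python
def classify_alerts(snapshot, prev_slope, limits):
    """Evaluate the three alert classes from one health snapshot:
    threshold (instant limit), trend (change of the long-window
    slope), and state (too many reference switches)."""
    alerts = []
    if snapshot["pps_phase_err_p99"] > limits["pps_p99_ns"]:
        alerts.append("threshold:pps_p99")
    if abs(snapshot["pps_trend_slope"] - prev_slope) > limits["slope_delta"]:
        alerts.append("trend:pps_slope")
    if snapshot["ref_switch_count"] > limits["max_ref_switches"]:
        alerts.append("state:ref_switch_rate")
    return alerts
```

Carrying `profile_id`, `cal_version_id`, and `timebase_state` with each snapshot (as in the export list) lets every alert be replayed against logs instead of argued about.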
H2-10 · Determinism, calibration & production test (consistent units at scale)
Determinism is measurable when fixed delays are calibrated into a per-unit table, while random jitter is controlled by design and accepted only by distribution-based tests (mean/σ/tails).
Production tests should create repeatable histograms for each route profile and record the calibration version used.
Fixed delay vs random jitter (treat them differently)
- Fixed delay (calibratable): stable offsets from routing, retimers, and pipelines; corrected by a calibration table (ΔD per endpoint).
- Random jitter (not calibratable): stochastic variation; reduced by architecture and validated by σ/tails acceptance limits.
- Acceptance: require single-peak histograms under a given profile; multi-peak indicates path variability.
Calibration flow (loopback method)
- Emit a known trigger edge (or pulse train) from a controlled source.
- Loop back through a defined physical path and capture stamps at the endpoint TSU.
- Compute per-endpoint ΔD offset and verify with a histogram (mean/σ/p99).
- Write offsets into a cal table and lock them to a cal_version_id for traceability.
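The per-endpoint ΔD computation and the version binding from the flow above can be sketched as follows (the σ acceptance limit and field names are illustrative):

```python
import statistics

def calibrate_delays(loopback_ns, cal_version_id):
    """Loopback calibration sketch: for each endpoint, the list of
    (stamp - emit) deltas from a known trigger train collapses to
    one fixed offset dD, accepted only if sigma stays inside the
    limit (a wide sigma means the delay is not fixed and must not
    be calibrated away). Offsets are bound to a cal_version_id."""
    SIGMA_LIMIT_NS = 0.5  # illustrative acceptance limit
    table = {"cal_version_id": cal_version_id, "offsets_ns": {}}
    for endpoint, deltas in loopback_ns.items():
        mean = statistics.mean(deltas)
        sigma = statistics.pstdev(deltas)
        if sigma > SIGMA_LIMIT_NS:
            raise ValueError(f"{endpoint}: sigma {sigma:.3f} ns exceeds limit")
        table["offsets_ns"][endpoint] = mean
    return table
```

Raising on excess σ enforces the fixed-vs-random split from the start of this section: only the stable component enters the cal table; the rest stays in the jitter budget.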
Production test checklist (release gates)
- PLL lock: lock time window + stable state (no thrashing).
- Frequency accuracy: offset/ppm snapshot under defined conditions; record drift indicator for holdover.
- Trigger latency: per profile, measure mean/σ/p99 and enforce single-peak behavior.
- PPS alignment: phase error σ and p99 must remain within thresholds; count discontinuities.
- Record: cal_version_id, fw_build_id, profile_id, timebase_state for every pass/fail record.
H2-11 · Component blocks & “how to choose” checklist (with example part numbers)
This section converts the system method (reference → cleaning → distribution → routing → timestamp → monitoring) into a practical, BOM-friendly checklist: what blocks are needed and which parameters decide the correct class.
Tip: never compare “jitter” numbers without stating the integration bandwidth (for example, 12 kHz–20 MHz) and the output format used.
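The band dependence is easy to make concrete: integrating a phase-noise curve L(f) over a stated band yields RMS jitter via sqrt(2·∫L(f)df) / (2π·f0). A minimal sketch, with placeholder numbers rather than any vendor's data:

```python
import math

def rms_jitter_fs(offsets_hz, ssb_dbc_hz, carrier_hz):
    """Integrate an SSB phase-noise curve L(f) (dBc/Hz) over the
    stated offset band (trapezoidal) and convert to RMS jitter in
    femtoseconds. The factor of 2 converts single-sideband power
    to total phase power."""
    area = 0.0
    for i in range(len(offsets_hz) - 1):
        p0 = 10 ** (ssb_dbc_hz[i] / 10)
        p1 = 10 ** (ssb_dbc_hz[i + 1] / 10)
        area += 0.5 * (p0 + p1) * (offsets_hz[i + 1] - offsets_hz[i])
    phi_rms = math.sqrt(2.0 * area)                      # rad
    return phi_rms / (2 * math.pi * carrier_hz) * 1e15   # fs

# Hypothetical flat -140 dBc/Hz floor, 12 kHz-20 MHz, 156.25 MHz carrier:
j = rms_jitter_fs([12e3, 20e6], [-140.0, -140.0], 156.25e6)
```

Changing only the integration band changes the result, which is exactly why "lower jitter" without a stated band is not comparable.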
90-second “how to choose” checklist (write these down first)
- What must align? sampling clock phase, trigger path latency, or event ordering by timestamp (three different budgets).
- How is the spec written? RMS jitter (with band), channel-to-channel skew, trigger latency (mean + σ + tails).
- External reference needed? 10 MHz and/or 1PPS input; define a holdover goal for when the GNSS/network reference disappears.
- Clock domains? ADC clock, FPGA fabric clock, SERDES reference clock—separate outputs often outperform “one clock for all”.
- Trigger forms? edge/pulse, gate/window, or event marker + timestamp; define which endpoints must be deterministic.
- What gets monitored online? timebase state, PPS phase error trend, trigger latency histogram per route profile.
- Production release gates? lock time, frequency offset, PPS phase error limits, latency histogram limits, calibration version tracking.
Component blocks (category → key parameters → example parts)
| Block | Key parameters (compare on the same basis) | Example parts (representative) | When to “upgrade” class |
|---|---|---|---|
| Reference source (TCXO / OCXO / VCXO / XO) | Short-term stability / phase noise; frequency accuracy (ppm); holdover drift (wander); power & warm-up; temperature sensitivity; frequency options. | SiTime SiT5356 (TCXO), SiTime SiT5711 (OCXO), SiTime SiT3808 (VCXO), Microchip DSC1001 (programmable XO), Crystek CVHD-950 (low phase noise VCXO class) | Move to OCXO or stronger holdover control when external reference dropouts must not break time alignment (holdover drift limit becomes a hard requirement). |
| PLL / jitter cleaner (DPLL / 2-loop) | Output phase noise & integrated jitter (band stated); loop bandwidth choices; reference input range; lock time; holdover behavior; number of outputs & frequency plan. | ADI AD9545 (DPLL / reference monitor / holdover class), TI LMK04828 (2-loop jitter cleaner class), TI LMK05318 (clock synchronizer / cleaner class), ADI LTC6952 (PLL + distribution class), Renesas 8A34001 (sync manager / jitter attenuation class) | Upgrade when holdover + reference validation must be hardware-enforced, or when the jitter target forces a dedicated cleaner stage separate from “lock/discipline”. |
| Fanout / clock tree (buffers / retime) | Additive jitter per stage; output-to-output skew; output standards (LVDS/LVPECL/CML/LVCMOS); supply noise sensitivity; output count & layout constraints. | ADI LTC6953 (distribution class), TI LMK1C1104 (LVCMOS buffer class), TI SN65LVDS105 (LVDS 1:4 fanout class) | Upgrade when channel-to-channel skew becomes a system-level spec, or when output standards and cable/connector uncertainty dominate the skew budget. |
| Trigger routing (matrix / priority) | Deterministic latency (mean + σ + tails); channel count (in×out); multi-source priority; interlock capability; input/output electrical format. | TI SN65LVDT125A (4×4 LVDS crosspoint), TI SN65LVCP22 (2×2 LVDS crosspoint), ADI MAX9393 (dual 2×2 crosspoint), Lattice MachXO3 (small CPLD/FPGA fabric for routing + interlock) | Upgrade when the system requires provable determinism per profile, multi-source routing rules, or safety interlocks that must be hardware-enforced. |
| Timestamp unit / TSU (counter / fine time) | Stamp point definition (capture vs trigger); counter resolution; discipline method (PPS/PTP input as a concept); drift statistics; ability to tag stamps with profile/state; optional fine-time interpolation. | TI TDC7200 (time-interval measurement / fine-time class), Lattice ECP5 LFE5U-25F (FPGA class for TSU logic), Intel 10CL016 (Cyclone-10-class FPGA for TSU logic) | Upgrade when event ordering across devices is safety-critical or when sub-nanosecond measurement is required (fine-time interpolation becomes necessary). |
| Monitoring & telemetry (phase / jitter / logs) | Timebase state visibility; reference validation; phase error statistics; histogram export; threshold + trend alerting; logging interface and rate control. | ADI AD9545 (reference monitor/holdover telemetry class), TI LMK05318 (sync/clean + status/telemetry class) | Upgrade when field service requires “time health” dashboards (state + trends + distributions) rather than single snapshot measurements. |
Note: part numbers are examples to anchor categories. Final selection must match availability, temperature grade, package, supply rails, and clock/trigger electrical standards.
Practical “do not get trapped” notes (high-impact pitfalls)
- Jitter numbers: always compare with the same integration band and the same output format. “Lower” without a band is not actionable.
- Skew control: treat connectors/cables/trace length as first-class skew contributors; do not assume fanout IC skew dominates.
- Trigger determinism: verify distributions (mean/σ/tails) per route profile; a deterministic mean is not enough if tails grow.
- Calibration: fixed delays should be stored as a versioned per-unit cal table; random jitter is accepted only by histogram limits.
- Monitoring: export tags (profile_id, timebase_state, cal_version_id) so field logs can be reproduced and traced.
H2-12 · FAQs × 12 (answers + FAQ schema)
These FAQs target the most common “how to choose / how to measure / how to verify / how to debug” timing questions. Jitter, skew, and latency are treated as measurable distributions (mean/σ/p99) with explicit measurement bands and states.