
Timing Cards & Modules: Integrated PLL, Cleaner, Fanout & Alarms


A timing card/module is a system-level “time backbone” that turns uncertain time sources into verified, maintainable, multi-domain clocks—delivering controlled jitter, repeatable phase alignment, disciplined holdover, and actionable alarms for production and field operations.

This page explains how to design, integrate, validate, and operate timing cards/modules using measurable checks, acceptance criteria, and selection logic—so timing performance stays predictable across temperature, power events, and failovers.

Definition: What is a Timing Card / Module?

A timing card/module is a deployable clock subsystem that turns one or more time references into controlled clock outputs with repeatable alignment, auditable alarms, and maintainable holdover. It is designed to be integrated, validated, and operated as a system capability (not a single IC).

The system pain it fixes (why it exists)

Controlled jitter profile
Symptom: “Looks OK on one bench” but fails elsewhere.
Impact: downstream lock/quality becomes unpredictable.
Repeatable alignment
Symptom: reboot/reseat changes relative phase.
Impact: multi-card systems lose deterministic timing.
Observable alarms (audit-ready)
Symptom: “Something drifted” with no traceable evidence.
Impact: field debugging turns into guesswork.
Maintainable holdover
Symptom: reference degrades → time “jumps” or “walks away”.
Impact: no stable behavior can be guaranteed during outages.

Timing card vs timing module vs “clock tree board”

Clock tree board
Focus: distribution, levels, terminations, skew.
Often missing: disciplined holdover + auditable alarms as a closed loop.
Timing module
Focus: embeddable subsystem with defined I/O and control.
Typical fit: constrained space, moderate telemetry, integrated platforms.
Timing card
Focus: deployable + maintainable (upgrade, logs, alarms, redundancy).
Typical fit: systems that require operational SLAs and audit trails.

Typical inputs/outputs (card-level view)

Inputs (time sources)
  • GNSS / ToD + 1PPS (absolute time anchor)
  • PTP hardware-timestamped port (network time feed)
  • SyncE recovered clock (transport-grade frequency)
  • 1PPS / 10 MHz (lab or system reference)
Outputs (clock domains)
  • Ref clocks (multi-output fanout to endpoints)
  • SYSREF / sync pulses (deterministic alignment hooks)
  • 1PPS out (system time marker)
  • ToD distribution (time-of-day delivery to systems)
Scope lock for this page
Focus is on system integration, validation, disciplining/holdover behavior, and alarms. PLL math and protocol stack details are intentionally kept out to avoid cross-page overlap.
Figure: Timing Card/Module in the System. Time sources on the left (GNSS, PTP, 1PPS, 10 MHz) feed an integrated timing card/module (PLL, cleaner, fanout, holdover discipline, alarms for lock/phase/temp), producing multiple output domains (RefClk, SYSREF, 1PPS out, ToD). Key idea: deliver a controlled clock + alignment + alarms + holdover as a subsystem.
System position: a timing card/module sits between time sources and multi-domain endpoints, making clock quality and state observable and repeatable.

When to Use It: Discrete vs Card/Module (Decision Triggers)

Choose a timing card/module when clock alignment, time stability, alarms, and failover must behave like a system-level SLA. If the system can tolerate manual tuning and limited observability, a discrete clock tree can be more cost-effective.

Decision triggers (engineer-first)

Must-have triggers
  • Multi-chassis / multi-card alignment must be repeatable (ps–ns class), including after reboot or reseat.
  • Time/clock health must be auditable: alarms, timestamps, counters, and clear state transitions are required for operations.
High-ROI triggers
  • Redundancy and failover are required (A/B references, hitless switching goals, defined recovery behavior).
  • Production consistency matters: calibration parameters must be fixed, and acceptance tests must be repeatable at scale.
Practical “quantifiable” framing (without going into math)
  • Alignment need: is the requirement “repeatable after reboot” or “stable over hours across temperature”?
  • Observability need: are time events required to be logged with timestamps and state transitions?
  • Failover need: is a defined maximum phase transient required during switching?
  • Production need: is there a fixed acceptance workflow with stored calibration data and audit trails?
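The trigger questions above reduce to a two-step decision: does the system need a system-level SLA, and if so, does it also need deployable operations (logs, upgrades, audit)? A minimal sketch, with hypothetical parameter names mirroring this page's decision flow (not any product API):

```python
def select_integration_level(*, needs_sla: bool, needs_deployable_ops: bool) -> str:
    """Map decision triggers to an integration level.

    needs_sla: alignment/alarms/holdover/failover must behave like an SLA.
    needs_deployable_ops: logs, upgrades, and audit trails are required.
    """
    if not needs_sla:
        return "discrete clock tree"   # single-domain, low ops burden
    if not needs_deployable_ops:
        return "timing module"         # embeddable subsystem, moderate telemetry
    return "timing card"               # deployable + maintainable, audit-ready
```

For example, a system needing repeatable alignment but no field audit trail would land on `select_integration_level(needs_sla=True, needs_deployable_ops=False)`, i.e. a timing module.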

Typical fits (fast sanity check)

Discrete clock tree
Best when: single board/domain, low operational burden, manual tuning acceptable, limited telemetry required.
Timing module
Best when: embedded integration is needed, defined I/O is preferred, moderate alarms/logs, controlled deployment footprint.
Timing card
Best when: system-level SLAs, auditability, redundancy, remote operations, and repeatable acceptance tests are required.

Common cost of picking the wrong level

Too light (stays discrete)
  • Field faults become non-reproducible and hard to audit.
  • Alignment drifts or resets are difficult to bound.
  • Production variability increases without a fixed validation template.
Too heavy (over-spec card)
  • Unnecessary BOM/power/complexity and longer bring-up time.
  • More configuration states to validate and operate.
  • The real bottleneck may be elsewhere (layout, power noise, or endpoint constraints).
Figure: Discrete vs Module vs Card Decision Flow. Decision triggers map to an integration level: no system-level SLA need (alignment, alarms, holdover, failover) → discrete clock tree (single-domain, low ops); SLA need without deployable operations (logs, upgrades, audit) → timing module; SLA need plus deployable operations → timing card.
Use triggers (SLA, auditability, redundancy, and production repeatability) to select the appropriate integration level instead of chasing single-component specs.

Internal Architecture: The “Timing Stack” Inside

A timing card/module behaves like a small timing subsystem. The fastest way to understand it is to separate three parallel planes: Clock plane (what time flows through), Control plane (how behavior is configured), and Telemetry plane (what can be measured, alarmed, and audited).

The three-plane mental model (subsystem view)

Clock plane
Reference → synth/clean → fanout/levels → output domains.
Control plane
Mode selection, loop profiles, output mapping, thresholds, and stored calibration.
Telemetry plane
Lock/phase/frequency/temperature/rail status → alarms + logs + audit trail.

Five functional bricks (role → interfaces → failure signature)

1. Reference sources
Role: provide predictable short/mid-term stability for holdover and tracking.
Interfaces: local osc, (optional) tuning input, temperature sensing.
Signature: temperature-correlated drift, warm-up behavior changes.
2. Synth / Cleaner
Role: shape the jitter profile and manage modes (track vs clean).
Interfaces: reference input, loop profile, lock detect.
Signature: abnormal lock time, phase steps on mode changes.
3. Fanout / Levels
Role: deliver the conditioned clock to many endpoints with controlled skew.
Interfaces: per-output enable, level select, delay trim.
Signature: one output degrades due to loading/termination mismatch.
4. Monitor
Role: convert health into alarms and audit signals.
Interfaces: taps, thresholds, debounce, event timestamps.
Signature: alarm storms (too sensitive) or missed drift (too loose).
5. Control plane
Role: configuration, stored calibration, logging, remote operations.
Interfaces: MCU/FPGA, EEPROM, mgmt links, firmware control.
Signature: version drift or config mismatch causing behavior changes.
Engineering takeaway
Clock quality issues are rarely “one chip” problems on a card. The correct debug axis is: which plane failed (clock vs control vs telemetry) and which brick is responsible (reference / cleaner / fanout / monitor / control).
Figure: Internal Architecture (Timing Stack). Five functional bricks (Reference: XO/VCXO/TCXO/OCXO; Synth/Cleaner with track/clean modes; Fanout/Levels: LVDS/HCSL/LVPECL/CMOS; Monitor: phase/freq/temp/missing pulse/rails; Control plane: MCU/FPGA/EEPROM/logs/remote management) connected by three link types: clock (solid), control (dashed), telemetry (dotted).
Internal stack: five bricks connected by three planes. Debugging becomes faster when a symptom is mapped to a plane (clock/control/telemetry) and then to a brick.

Inputs & References: Time Sources and Isolation Strategy

Cards/modules rarely rely on a single input. Multiple time sources are classified, then passed through health gates, and finally selected by priority + switching policy. The goal is predictable behavior during degraded inputs, not maximum sensitivity.

Input types (by timing meaning, not by connector)

Absolute time anchor
GNSS RF / ToD + 1PPS (time-of-day and a stable epoch marker).
Network time feed
PTP via a hardware-timestamped port (time updates plus path variability).
Frequency transport
SyncE recovered clock (frequency reference delivered by transport).
Local/lab reference
10 MHz and/or 1PPS from a system backplane or lab source.

Health gates (sanity checks that prevent “bad-but-preferred” inputs)

Freq offset gate
Reject inputs with frequency error beyond the allowed capture/hold window.
Phase step gate
Detect sudden phase jumps that would produce time “bumps” after selection.
Noise / stability gate
Prefer inputs with stable short-term behavior; avoid “flapping” between good/bad.
Continuity gate
Missing pulses/packets or unstable link states are treated as degraded even if averages look OK.
Key rule
Quality is not the same as priority. A high-priority input still must pass health gates. This prevents “preferred but unhealthy” references from dominating the system.
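The "gates before priority" rule can be sketched as a pure predicate over an input's health metrics. This is a minimal illustration; the field names and threshold parameters are hypothetical placeholders, and real gates would add dwell times and per-input policies:

```python
def passes_health_gates(ref, *, max_freq_offset_ppb, max_phase_step_ns,
                        max_short_term_dev):
    """Every input must pass every gate before priority is even considered.

    ref: dict of measured health metrics for one candidate reference.
    Thresholds are placeholders set by the system's capture/hold windows.
    """
    return (abs(ref["freq_offset_ppb"]) <= max_freq_offset_ppb   # freq offset gate
            and abs(ref["phase_step_ns"]) <= max_phase_step_ns   # phase step gate
            and ref["short_term_dev"] <= max_short_term_dev      # noise/stability gate
            and ref["continuity_ok"])                            # continuity gate
```

A high-priority input that fails any single gate is simply ineligible; priority only ranks the survivors.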

Isolation strategy (minimum set that prevents cross-domain contamination)

Power domain hygiene
Separate quiet rails for sensitive timing blocks; filter and control sequencing to avoid “healthy input, noisy output” surprises.
Ground/return control
Manage return paths across connectors and shields; avoid unintended current loops that convert cable motion into phase events.
Signal isolation
Use appropriate coupling/isolation on inputs; keep noisy digital edges from polluting reference-sensitive nodes.
Figure: Input Arbitration and Reference Selection. Inputs (GNSS, PTP, SyncE, 10 MHz) are classified, checked by health gates (frequency, phase, noise), then selected by a priority selector whose switch policy uses hysteresis and stable windows. Rule: inputs must pass gates before priority can select; switching uses hysteresis to avoid flapping.
Input arbitration: classify inputs, reject unhealthy references via gates, then select by priority with hysteresis/stable windows for predictable behavior.
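The anti-flap switching policy can be sketched as a small selector: a new best candidate must remain best for a full stable window before it becomes the active reference. Names and the counter-based window are illustrative assumptions, not a specific device's behavior:

```python
class RefSelector:
    """Priority selection with a stable-window requirement (anti-flap)."""

    def __init__(self, stable_window: int):
        self.stable_window = stable_window  # consecutive polls required
        self.active = None
        self._candidate = None
        self._count = 0

    def step(self, healthy_refs_by_priority):
        """healthy_refs_by_priority: ref names, best first, already
        filtered by health gates. Returns the active reference."""
        best = healthy_refs_by_priority[0] if healthy_refs_by_priority else None
        if best == self.active:
            self._candidate, self._count = None, 0  # no change pending
            return self.active
        # A different best ref must persist for the whole stable window.
        if best == self._candidate:
            self._count += 1
        else:
            self._candidate, self._count = best, 1
        if self._count >= self.stable_window:
            self.active = best
            self._candidate, self._count = None, 0
        return self.active
```

A one-poll transient toward another reference never triggers a switch; only sustained eligibility does, which is the hysteresis behavior the arbitration figure describes.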

Disciplining & Holdover: Control Modes and What “Good” Looks Like

The real value of a timing card/module is not “having clocks,” but bounded time behavior when references degrade or disappear. This section defines control modes as observable states, focuses on logs and measurable curves, and provides a reusable acceptance template (placeholders must be set by system requirements).

Three modes (defined by behavior, not by control theory)

A. Free-run
Intent: keep outputs running from the local oscillator only.
Entry: no valid external reference passes health gates.
Observable: phase error drifts according to local stability.
Log: mode, temp, tune word (if any), drift metrics.
B. Discipline (track)
Intent: steer local time/frequency to a selected reference.
Entry: reference selected + stable window satisfied.
Observable: phase error converges into a stable band.
Log: active_ref, loop profile, lock time, phase/freq error.
C. Holdover
Intent: maintain bounded time without the external reference.
Entry: reference fails gates; holdover policy asserted.
Observable: phase error grows within a defined envelope.
Log: last-good ref stats, temp, predicted drift, alarms.
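The three modes can be expressed as a gated transition table. This is a behavioral sketch using only the transitions this page defines (ref_ok, ref_fail, recover_stable); unknown (mode, event) pairs keep the current mode, which is the safe default:

```python
# Mode and event names follow this page's definitions; guard details
# (health gates, stable windows, hysteresis) are set by system policy.
TRANSITIONS = {
    ("FREE_RUN",   "ref_ok"):         "DISCIPLINE",  # ref selected + stable window
    ("DISCIPLINE", "ref_fail"):       "HOLDOVER",    # ref fails gates; holdover policy
    ("HOLDOVER",   "recover_stable"): "DISCIPLINE",  # recovery, gated by stability
}

def next_mode(mode: str, event: str) -> str:
    """Gated transition: unrecognized (mode, event) pairs are ignored."""
    return TRANSITIONS.get((mode, event), mode)
```

Each call site would log the transition with a timestamp and the configuration identity, so the mode timeline stays auditable.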

What “good” looks like (curves and signatures)

Discipline: stable convergence
  • Phase error moves into a steady band and stays there.
  • No periodic “time bumps” tied to mode updates.
  • Recovery does not produce a visible step beyond policy limits.
Holdover: bounded envelope
  • Phase error grows predictably (mostly smooth slope).
  • Temperature changes shift slope, but remain bounded.
  • Alarms reflect genuine degradation, not noise flapping.
Common bad signatures
  • Phase steps during switching or recovery (“time jump”).
  • Holdover slope changes abruptly with minor temperature swings.
  • Alarm storms caused by missing hysteresis/stable window.

Acceptance templates (placeholders; set by system requirements)

Holdover phase drift
Test: enter holdover and observe for X hours.
Metric: peak/percentile of phase_error(t).
Pass: |phase_error| ≤ Y within X hours.
Frequency error bound
Metric: freq_error(t) and slope stability.
Pass: |freq_error| ≤ Z (Z depends on wander budget).
Mode transition behavior
Events: track↔holdover and recovery.
Pass: no phase step > A, alarms clear within B.
Minimum log set for auditability
mode • active_ref • health_gate_state • loop_profile • phase_error • freq_error • tune_word (or DAC) • temp • event_timestamp • firmware_version • config_hash
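The holdover template can be sketched as a check over logged samples. X and Y remain placeholders bound to the SLA; the peak metric is shown here, and a percentile variant would follow the same shape:

```python
def holdover_acceptance(phase_error_ns, limit_Y_ns):
    """Pass if |phase_error(t)| stays within ±Y over the observed window.

    phase_error_ns: logged samples assumed to cover the full X-hour
    holdover observation; limit_Y_ns: the SLA-defined bound Y.
    Returns (pass, peak) so the margin is visible in the report.
    """
    peak = max(abs(e) for e in phase_error_ns)
    return peak <= limit_Y_ns, peak
```

Reporting the peak alongside the pass/fail keeps the acceptance record useful for trending, not just gating.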
Figure: Control Modes State Machine (behavioral view). Three modes (Free-run: local osc only; Discipline: track selected ref; Holdover: bounded drift) with transitions driven by reference health gates, stable windows, hysteresis, and recovery policy (edges: ref_ok, ref_fail, recover_stable, manual_force, no_valid_ref). Logs should capture mode transitions with timestamps and a config hash for reproducible acceptance.
Modes are operational states. Transitions must be gated (health/stable window/hysteresis) and logged (event timeline + configuration identity).
Figure: Holdover Phase Error vs Time (Conceptual). A target envelope band (phase error vs time) with a good curve staying inside and a bad curve drifting out after temperature influence. Acceptance template: within X hours of holdover, |phase_error| ≤ Y (X/Y set by system SLA). Holdover behavior is evaluated as a curve inside an envelope.
Holdover acceptance is defined by an envelope over time. A “good” system stays in band across temperature changes; a “bad” system exits the band due to drift slope changes.

Output Clocking: Domains, Alignment, and Distribution Rules

Output clocking is domain management. A timing card/module must feed multiple endpoints while keeping skew, phase continuity, and configuration traceability under control. This section describes a practical approach without relying on interface-specific standards.

Output domains (organized by meaning)

System clock
The platform-wide timebase used as the common root for other domains.
RefClk
Continuous reference clocks delivered to endpoints that require jitter control.
SYSREF / sync pulse
Event-like alignment markers; managed separately from continuous clocks.
1PPS / ToD
Epoch marker (1PPS) and time-of-day distribution (data), often used for audit and coordination.

Alignment strategy (repeatable after reboot/reseat)

Fixed delay
Use when topology is stable and paths are repeatable; minimizes configuration states.
Programmable delay
Compensate assembly and path variation; store per-channel trim values for field reproducibility.
Phase trim
Use for fine alignment; treat as a closed-loop adjustment with a measurable before/after delta.
Skew budget template (structure only)
total_skew_budget = source_variation + fanout_variation + trace/connector + endpoint_variation (set each term and guardband for temperature and restart repeatability).
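The budget template can be written as a plain sum plus guardband. Term names follow the template above; every value comes from the system's own measurements, and the guardband covers temperature and restart repeatability:

```python
def total_skew_budget_ps(source_ps, fanout_ps, trace_connector_ps,
                         endpoint_ps, guardband_ps):
    """Sum of contributor terms plus guardband, in picoseconds.

    Compare the result against the endpoint skew limit; if the sum
    exceeds the limit, one of the terms must be bought back (e.g.,
    with programmable delay trim or tighter cable-length rules).
    """
    return (source_ps + fanout_ps + trace_connector_ps
            + endpoint_ps + guardband_ps)
```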

Termination & levels (card-level rules, no protocol dependence)

HCSL
Keep return paths clean across connectors; ensure output drive and termination strategy are consistent per channel.
LVDS
Maintain differential impedance continuity; avoid common-mode injection from noisy domains.
LVPECL
Treat supply noise as a jitter contributor; enforce consistent termination and avoid long stubs.
LVCMOS
Fast edges amplify coupling risk; keep routes short, manage series damping, and avoid crossing split returns.

Output acceptance templates (placeholders)

Intra-domain skew
Pass: |skew| ≤ S across channels, verified after reboot/reseat and across temperature.
Deterministic alignment
Pass: phase relationship returns within R after restart, with stored trim map and config hash.
Traceable output map
Export: enable/level/delay status per output (output_id) for field comparison and audit.
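A traceable output map can be exported as a sorted, comparable snapshot. Field names follow the acceptance template above; the dict shape is an illustrative assumption:

```python
def export_output_map(outputs):
    """Per-output enable/level/delay snapshot, sorted by output_id so
    two exports (e.g., bench vs field) can be diffed line by line."""
    return [{"output_id": o["output_id"], "enable": o["enable"],
             "level": o["level"], "delay_ps": o["delay_ps"]}
            for o in sorted(outputs, key=lambda o: o["output_id"])]
```

Stable ordering is the point: a field unit's export should differ from the sealed production export only where something actually changed.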
Figure: Output Tree (Cleaner → Fanout → Endpoints). Clock distribution tree with per-branch delay trim, output domains (RefClk, SYSREF, system clock, 1PPS, ToD), and per-output configuration traceability. Rule: one clock tree, per-branch trim, and a traceable output_id map (enable/level/delay/status).
Output management: keep a clear hierarchy (cleaner→fanout→endpoints), add per-branch delay trim where variability exists, and make per-output configuration exportable for field audit.

Monitoring & Alarms: What to Measure, What to Log, How to Act

A timing card/module is operated as a closed-loop, observable system. Alarms are not “strings”; they are events with context, confidence (debounce/confirm), and a bounded action policy. This section focuses on on-card monitoring points and operational logic (not external NMS).

Alarm classes (organized by impact)

T. Time integrity
Examples: Loss-of-lock, phase step, wander out-of-band.
Impact: alignment/SLA risk.
Typical action: degrade or switch (policy-gated).
R. Reference health
Examples: freq offset, ref quality drop, GNSS degraded.
Impact: increased risk of drift and switching.
Typical action: tighten gates, change profile, prepare failover.
H. Hardware health
Examples: temp out-of-range, Vrail droop, sensor missing.
Impact: performance collapse or false positives if unhandled.
Typical action: protective degrade + strong logging.

Event chain (Detection → Debounce → Confirm → Report → Act)

Detection
Measure phase/freq error, lock state, ref quality, temperature, and rails. Treat each as a signal with a sampling policy.
Debounce
Use time windows/counters to avoid flapping. A single transient should not trigger a switch or storm.
Confirm
Require correlated evidence (e.g., phase step + lock transition) to raise confidence and reduce false positives.
Report
Emit an event object with context: active_ref, mode/profile, and before/after metrics around the trigger.
Act
Apply policy: log-only, degrade, or switch. Every action must be traceable to an event and reason code.
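The debounce and confirm stages can be sketched as a counter plus a correlated-witness requirement. N and the witness rule are policy placeholders; a real pipeline would also timestamp events and rate-limit reporting:

```python
class Debouncer:
    """Detect → debounce → confirm: raise a confirmed event only after
    N consecutive detections AND one correlated witness signal."""

    def __init__(self, n_consecutive: int):
        self.n = n_consecutive
        self.count = 0

    def feed(self, detected: bool, witness: bool) -> bool:
        # A single transient resets the counter instead of firing.
        self.count = self.count + 1 if detected else 0
        # Confirm = debounce window satisfied AND correlated evidence
        # (e.g., phase step plus a lock transition) present.
        return self.count >= self.n and witness
```

With `n_consecutive=3`, two noisy samples never confirm; three in a row with a witness do, which is the storm-avoidance behavior described above.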

Action policy (bounded, auditable)

Log-only
Informational events and early warnings. Used for trending and root-cause correlation.
Degrade
Change operating profile or tighten gates; keep outputs stable while risk is rising (e.g., GNSS degraded).
Switch / failover
Trigger only on high-confidence events; log switch points and post-check results to prove correctness.

Minimum log fields (to reproduce and audit decisions)

Event header
event_id • timestamp • severity • reason_code • state_before • state_after
Timing context
active_ref • ref_quality • loop_mode • loop_profile • phase_error • freq_error
Hardware context
temp • Vrail(s) • sensor_status • lock_state • switch_events • firmware_version • config_hash
Operational rule
Any switch must include: trigger event + before/after window metrics + post-check result + rollback path (if fail).
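The minimum log fields can be captured as one event record. This is an illustrative subset of the fields listed above; the types and the dataclass shape are assumptions, not a defined schema:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TimingEvent:
    """One auditable event: header + timing context + hardware context."""
    # Event header
    event_id: str
    timestamp: float
    severity: str
    reason_code: str
    state_before: str
    state_after: str
    # Timing context
    active_ref: str
    phase_error_ns: float
    freq_error_ppb: float
    # Hardware / configuration context
    temp_c: float
    firmware_version: str
    config_hash: str
```

Freezing the record (immutability) supports the audit requirement: an emitted event is evidence, not a mutable note.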
Figure: Alarm Pipeline (On-card). Detection points (lock detect, phase monitor, frequency offset, temperature, Vrail, GNSS quality) flow through Detect → Debounce → Confirm → Report (event object with context) → Act, with action branches Log-only, Degrade, and Switch/Failover. Key rule: alarms must be confidence-scored (debounce + confirm) before policy actions.
Treat alarms as event objects: detect, debounce, confirm, report, then act by policy (log-only / degrade / switch), with traceable before/after windows.

Redundancy & Failover: Hitless Switching and Guard Paths

Redundancy is not “having a backup.” It is a measurable switching policy that keeps critical domains stable under reference faults. This section describes on-card A/B references, guard paths, and acceptance templates for hitless behavior (placeholders set by system requirements).

Redundancy targets (what is actually duplicated)

Reference redundancy
A/B references (e.g., two independent sources). Health gates decide eligibility and priority.
Path redundancy
Main/backup routing, including guard/bypass paths. The backup is monitored continuously (not cold).
Module redundancy
Dual modules/cards are supported by consistent configuration identity and comparable telemetry.

Guard paths (keep the backup “ready and comparable”)

Continuous health
The guard path evaluates ref quality and lock readiness so switching is not a cold-start gamble.
Compare & pre-align
Maintain a comparable phase/freq view to reduce phase steps at the switch point.
Traceable readiness
Readiness is logged as a state: eligible/not eligible, with reason codes and thresholds used.

Hitless switching (defined by allowed transients)

Phase transient
Pass: |Δphase| ≤ P at switch (P set by system window).
Frequency transient
Pass: |Δfreq| ≤ F, settles within T.
No glitch behavior
Critical domains must avoid missing/double pulses at the switch point (domain-specific rules).
Exercise template (black-box + rollback)
Routine: force main→backup→main; record before/after windows; pass if Δphase ≤ P and Δfreq ≤ F within T, alarms clear within B; if post-check fails, auto rollback and log reason_code.
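The exercise template can be scored with a single predicate over the measured transients. P/F/T/B stay as placeholders bound to the system switching window; units are whatever the budget uses (e.g., ns, ppb, s):

```python
def hitless_switch_pass(d_phase, d_freq, settle_time, alarm_clear_time,
                        *, P, F, T, B):
    """Pass iff |Δphase| ≤ P, |Δfreq| ≤ F, settling ≤ T, and alarms
    clear within B. A failing post-check triggers rollback and a
    logged reason_code (handled by the caller, per the routine above)."""
    return (abs(d_phase) <= P and abs(d_freq) <= F
            and settle_time <= T and alarm_clear_time <= B)
```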
Figure: Redundancy (Main/Backup + Guard Path + Switch). Dual reference inputs (Ref A, Ref B) feed main and guard paths, each behind a health gate; the guard path continuously monitors backup readiness (phase compare, pre-align, eligible state, reason_code), and the switch point selects the output within a hitless acceptance window feeding RefClk, SYSREF/1PPS, and ToD. On post-check failure, roll back to the last stable source and log the reason_code with before/after metric windows.
Guard paths keep the backup ready and comparable. Hitless switching is defined by allowed phase/frequency transients (P/F/T), verified by post-check windows and rollback rules.

System Integration: Power, Thermal, EMC, and Backplane Reality

Timing cards/modules are unusually sensitive to supply noise, return paths, connector/backplane behavior, and thermal gradients. This section focuses on the integration-specific failure modes that typically turn “bench-good” into “system-bad”.

Power: low-noise rails, domain partitioning, filters, and boot sequencing

Define “quiet” rails
Treat the reference/cleaner/VCXO supply domain as a performance limiter. Track rail noise at the same time as jitter/phase error to prove causality.
Partition domains
Separate timing-analog from control/telemetry digital. Keep high di/dt loads out of the analog island return path and regulator headroom.
Filter with intent
Filters must target the dominant noise bands (switching fundamentals/harmonics and coupling points). Over-filtering can destabilize rails or increase droop during load steps.
Power-up sequencing
A repeatable sequence avoids false lock and alarm storms: stabilize rails → load config → enable outputs → enable alarm actions (policy gates last).

Thermal: gradients, airflow, and sensor placement

Gradient is the enemy
For OCXO/TCXO, a stable average temperature may still drift if the device sees a moving gradient. Monitor both temperature and its rate-of-change.
Airflow realism
Place oscillators away from pulsed airflow and adjacent hot spots. A “cold” location near a fan can create periodic thermal modulation.
Sensor placement
Sensors must represent the true drift source, not a distant board average. A “good-looking” sensor can hide a local gradient near the oscillator.

Backplane & chassis: returns, common-mode noise, reflections, and cable length

Return paths matter
Backplane and chassis returns can inject common-mode noise into clock paths, showing up as elevated jitter or slow alignment drift. Treat return topology as a first-class design input.
Connector reflections
Connectors/backplanes can create reflections that don’t “break” a scope check but do degrade phase stability. Validate at the output port and at the endpoint, not only at a nearby test pad.
Cable length discipline
For cross-card alignment, “controlled and repeatable” length beats “short.” Any mismatch becomes deterministic delay error and reduces margin.

Integration checklist (risk → quick check → fix → pass)

Rail noise couples into jitter
Risk: switching rail artifacts raise random jitter / create spurs.
Quick check: compare jitter/phase_error with management traffic OFF vs ON; log Vrail ripple at the same time.
Fix: separate rails, add targeted filtering, re-route returns to keep digital currents out of analog island.
Pass: incremental jitter/phase_error change ≤ ΔJ / ΔP (placeholders set by system budget).
Thermal gradient creates slow drift
Risk: stable average temperature but moving gradient causes wander and false trend alarms.
Quick check: correlate phase_error slope with temp slope and fan/airflow states; look for periodic modulation.
Fix: move oscillator away from hot spots/pulsed airflow; relocate/duplicate sensors near the drift source.
Pass: drift slope vs temperature stays within the holdover envelope (system-defined).
Backplane common-mode & reflections
Risk: connector/backplane behavior degrades phase stability even when waveforms look “ok”.
Quick check: measure at port and at endpoint; compare skew/jitter with direct-cable vs backplane path (one-variable change).
Fix: tighten terminations/levels, add common-mode control where required, enforce cable length rules for aligned domains.
Pass: endpoint jitter/skew meets budget with backplane installed (not only on bench).
Figure: Power Domains + Isolation Domains (integration view). Backplane power feeds DC/DC converters; the analog island (reference, PLL, VCXO, cleaner) and digital island (MCU, FPGA, telemetry/logs) are separated, both referenced to the chassis/backplane with controlled return paths, connectors, common-mode control, LC filters, and isolation bridges at the ports. Integration rule: separate domains, control returns, and validate at endpoints (not only at local pads).
A timing card’s performance is dominated by domain separation and controlled bridges (filters/isolation/returns). Backplane and chassis behavior must be treated as part of the timing system.

Validation & Acceptance: Bench Tests That Actually De-risk Deployment

Acceptance should be a repeatable engineering flow: define baselines, lock measurement windows, prove one-variable deltas, and record context. The goal is not “pretty plots” but de-risking deployment by isolating power/backplane/thermal effects and verifying modes (holdover and failover) with auditable evidence.

Test setup rules (so results remain comparable)

Baseline first
Measure the reference source and the DUT under a “golden” setup before stressing power/thermal/backplane variables.
One-variable deltas
Change a single variable per run (rail noise, airflow, backplane path, loop profile). Record configuration identity for every run.
Window discipline
Keep jitter bandwidth/integration time and alignment observation windows fixed. Report both absolute results and incremental deltas.

Output phase noise / jitter (measure points + windows + baselines)

Where to probe
Compare internal-cleaner output vs port output vs endpoint. This isolates fanout/connectors/backplane contributions.
Window definition
Use a fixed RMS jitter window (placeholder BW) and/or offset PN points that match system sensitivity.
Acceptance style
Prefer “delta to baseline” acceptance: jitter ≤ J or incremental increase ≤ ΔJ (placeholders set by budget).
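Delta-to-baseline acceptance can be expressed as one check with optional absolute and incremental limits. J and ΔJ are placeholders from the jitter budget; supply either or both:

```python
def jitter_accept(measured, baseline, *, J=None, dJ=None):
    """Pass if measured ≤ J (absolute) and/or measured − baseline ≤ ΔJ
    (incremental). Units must match the fixed measurement window
    (same jitter bandwidth / integration time for both values)."""
    ok = True
    if J is not None:
        ok = ok and measured <= J       # absolute budget limit
    if dJ is not None:
        ok = ok and (measured - baseline) <= dJ  # delta to golden baseline
    return ok
```

Preferring the incremental form makes unit-to-unit comparisons robust against instrument and fixture offsets that shift absolute numbers.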

Phase alignment (multi-channel + cross-card + temperature deltas)

Skew budget
Treat end-to-end delay as a budget: routing + connectors + cables + programmable delay trim. Verify budget at endpoints.
Cross-card proof
Validate with the intended backplane/cabling. A bench-only result may hide connector reflections and return-path coupling.
Temp sweep delta
Compare skew before/after temperature changes. Record Δskew vs ΔT and ensure drift stays within alignment margin.

Holdover (loss-of-reference, thermal change, and aging trend)

Reference cut test
Remove the external reference and log phase_error(t) and freq_error(t). Compare the envelope to the system budget.
Thermal change
Apply a controlled temperature step/ramp. Validate drift slope and compensation behavior under realistic gradients.
Aging trend
Use accelerated comparison or long-run deltas to confirm drift is predictable and consistent across units and time.
Acceptance placeholders (bind to SLA)
Pass examples: within X hours, |phase_error| ≤ Y; or |freq_error| ≤ Z. Values X/Y/Z depend on system alignment and service requirements.

Failover (transients, alarm correctness, recovery time)

Switch transient
Measure Δphase and Δfreq around the switch point using fixed observation windows. Validate “no glitch” rules for critical domains.
Alarm correctness
Confirm the full chain: detection → debounce → confirm → report → action. A switch without a traceable trigger is not acceptable.
Recovery time
Measure time from fault injection to stable outputs and cleared alarms. Define a rollback path and verify it in the same test plan.
Figure: Acceptance Bench Setup (Measurement Chain). A reference source (10 MHz / 1PPS) feeds the DUT (timing card: cleaner/fanout, ports/domains); instruments (phase-noise analyzer, time interval analyzer, scope, logger) measure phase noise, time interval, and glitches at fixed test points (TP1/TP2/TP3), while thermal and power stimulus blocks enable one-variable comparisons. Rule: fixed windows + baseline + one-variable deltas; record config_hash and environment for every run.
A de-risking acceptance flow measures at fixed test points (TP1/TP2/TP3), compares against baselines, and uses one-variable deltas (power/thermal/backplane) to isolate root causes.

Engineering Checklist: Bring-up → Production → Field

This section turns the timing-card “capabilities” into executable stage-gates. Each gate contains only actions, required evidence, and measurable pass criteria (placeholders such as X/Y/Z/T must be set by the system timing budget and SLA).

Gate G1: Bring-up (Lock → Mode transitions → Output sanity)

Goal
Prove the card locks, switches modes deterministically, and drives endpoints with correct electrical levels and domain mapping.
Actions
  • Lock check: verify reference selection, lock indicators, and “healthy” state under nominal input.
  • Mode walk: execute Free-run → Discipline → Holdover transitions; record the exact trigger used.
  • Output electrical: validate standard + termination at both the port and the endpoint (HCSL/LVDS/LVPECL/LVCMOS).
  • Domain mapping: confirm each output domain is routed to the intended consumer (system clock / refclk / sysref / pps / ToD).
  • Alarm sanity: inject a controlled fault (reference removed / degraded) and confirm alarm + log closure.
Evidence to capture
  • Config snapshot (register dump / profile ID), firmware version, and build hash.
  • Lock timeline and mode transition timestamps (with input ref quality label).
  • Endpoint measurement screenshots (level + termination + jitter/phase delta vs baseline).
  • Alarm event entries for each injected fault + recovery.
Pass criteria (placeholders)
  • Lock time ≤ T_lock and remains locked for ≥ T_stable under nominal conditions.
  • Mode transitions produce no endpoint faults; phase transient ≤ Δφ_switch.
  • Additive jitter at endpoints ≤ ΔJ_budget (window defined by the system spec).
  • Alarm injection produces: detect → debounce → confirm → report → action, all within ≤ T_alarm.
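The G1 pass criteria above can be encoded as a machine-checkable gate. A minimal Python sketch, assuming hypothetical threshold values and measurement field names that you would map to your own test harness and timing budget:

```python
from dataclasses import dataclass

@dataclass
class G1Limits:
    # Placeholder thresholds (T_lock, Δφ_switch, ΔJ_budget, T_alarm):
    # illustrative values only; set from the system timing budget and SLA.
    t_lock_s: float = 10.0
    dphi_switch_ps: float = 50.0
    dj_budget_fs: float = 150.0
    t_alarm_s: float = 1.0

def g1_pass(meas: dict, lim: G1Limits = G1Limits()) -> dict:
    """Evaluate bring-up gate G1; returns per-check pass/fail flags."""
    return {
        "lock":      meas["lock_time_s"]        <= lim.t_lock_s,
        "mode_walk": meas["phase_transient_ps"] <= lim.dphi_switch_ps,
        "jitter":    meas["additive_jitter_fs"] <= lim.dj_budget_fs,
        "alarm":     meas["alarm_chain_s"]      <= lim.t_alarm_s,
    }
```

A per-check result (rather than a single boolean) keeps the evidence trail: each failed key maps directly to one G1 action item.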
Gate G2

Production (Calibration → Sealing → Sampling plan)

Goal
Ensure repeatable timing behavior across units and lots by freezing calibration parameters and enforcing traceable acceptance records.
Actions
  • Calibration items: temperature compensation coefficients, frequency offset trim, delay table (channel alignment), holdover model parameters.
  • Sealing: write EEPROM/flash, verify CRC/signature, lock critical fields; bind to serial number.
  • Golden baseline: compare each unit against a golden reference (relative deltas preferred over absolute).
  • Sampling strategy: define lot sampling rate and re-test triggers (process change / firmware change / component swap).
Evidence to capture
  • Calibration record: coefficients + delay table + firmware/build ID.
  • Acceptance summary: jitter/phase delta vs golden; holdover short test snapshot.
  • Configuration hash + EEPROM CRC report (pass/fail).
Pass criteria (placeholders)
  • Config sealing succeeds (CRC/signature valid) with version match.
  • Relative deltas vs golden: jitter ≤ ΔJ_golden, skew ≤ ΔSkew_golden.
  • Lot statistics meet thresholds: out-of-family rate ≤ R_oof, drift shift ≤ ΔDrift_lot.
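The out-of-family screen in the G2 criteria reduces to a simple relative-delta test per lot. A sketch, assuming hypothetical placeholder limits ΔJ_golden and R_oof:

```python
def lot_stats(unit_jitter_fs, golden_fs, dj_golden_fs=100.0, r_oof=0.05):
    """Out-of-family screen against a golden unit.
    dj_golden_fs (ΔJ_golden) and r_oof (R_oof) are placeholders to be
    set by the lot acceptance plan; units here are illustrative (fs)."""
    oof = [u for u in unit_jitter_fs if abs(u - golden_fs) > dj_golden_fs]
    rate = len(oof) / len(unit_jitter_fs)
    return {"oof_rate": rate, "pass": rate <= r_oof}
```

Relative deltas versus the golden unit are preferred over absolute limits because they cancel fixture and instrument offsets shared by every unit on the same bench.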
Gate G3

Field (Alarm policy → Log rotation → OTA upgrade + rollback → Drills)

Goal
Make timing behavior auditable and recoverable: actionable alarms, complete logs, safe remote updates, and repeatable drills.
Actions
  • Alarm policy: define warning/degrade/failover thresholds and the exact action taken for each state.
  • Log rotation: set retention, roll size, and “must-keep” fields (timestamp, ref quality, loop mode, phase/freq error, temp, rail status, switch events).
  • Remote upgrade: enforce pre/post checks; require rollback trigger conditions and a validated recovery path.
  • Periodic drills: scheduled failover drill, holdover drill, and alarm chain drill (black-box success criteria).
Evidence to capture
  • Drill reports: trigger → detected → action → recovered timeline.
  • Upgrade reports: pre-check snapshot, post-check snapshot, and rollback record (if used).
  • Degrade/failover logs with root-cause tags (ref degraded, temp excursion, rail anomaly).
Pass criteria (placeholders)
  • Alarm-to-action latency ≤ T_action; false-trigger rate ≤ R_false.
  • Failover drill: phase transient ≤ Δφ_hitless; service impact = “none” by system definition.
  • Holdover drill: phase error envelope ≤ E_holdover(t) over X hours.
  • Rollback completes within ≤ T_rollback and restores last-known-good timing profile.
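The drill timeline above (trigger → detected → action → recovered) can be reduced to latencies automatically. A sketch, assuming illustrative event labels that you would map to your device's log fields; units follow whatever timestamp base the logger uses (ms in the test below):

```python
def drill_latencies(events):
    """Latencies from a failover-drill timeline.
    events: list of (label, timestamp) pairs. Labels 'trigger',
    'detected', 'action', 'recovered' are assumed names, not a spec."""
    t = dict(events)
    return {
        "detect":  t["detected"]  - t["trigger"],   # compare to T_alarm
        "action":  t["action"]    - t["detected"],  # compare to T_action
        "recover": t["recovered"] - t["trigger"],   # end-to-end drill time
    }
```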
Diagram: Stage-gates (G1/G2/G3) with Actions / Evidence / Pass criteria (conceptual)
Three stage-gates in sequence — G1 Bring-up (actions: lock/modes/levels), G2 Production (actions: calibrate/seal/sample), G3 Field (actions: alarms/logs/drills) — each with its Evidence and Pass-criteria blocks.

Applications & IC Selection Notes (Card-Level Selection Logic)

This is not a shopping list. It is a card-level selection method: required capabilities → system constraints → spec-writing rules. The material numbers below are starting points for datasheet lookup and lab validation; package, lifecycle, and availability must be verified.

A) Selection dimensions (capabilities to specify)

  • Inputs: GNSS / PTP (hardware timestamp) / SyncE recovered clock / 1PPS / 10 MHz; multi-source arbitration needed or not.
  • Outputs: count + standards (HCSL/LVDS/LVPECL/LVCMOS) + special domains (SYSREF / PPS / ToD).
  • Alignment: on-card channel skew budget and cross-card phase alignment target (ps–ns class).
  • Holdover: define an error envelope over time E_holdover(t) (phase vs time), not a single number.
  • Alarms: required signals/telemetry (GPIO/I²C/host) and a graded policy (warning/degrade/failover).
  • Redundancy: A/B ref, hitless definition (Δφ_hitless, Δf_hitless), and drill requirements.
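The selection dimensions above can be written as a machine-checkable requirement record, so "does candidate X cover the spec" becomes a set/budget comparison instead of a datasheet eyeball. A minimal sketch; field names are illustrative, not a vendor schema:

```python
from dataclasses import dataclass

@dataclass
class CardSpec:
    inputs: set            # e.g. {"GNSS", "1PPS", "10MHz", "PTP", "SyncE"}
    output_standards: set  # e.g. {"HCSL", "LVDS", "LVPECL", "LVCMOS"}
    skew_budget_ps: float  # on-card channel skew
    hitless_dphi_ps: float # max phase transient on A/B reference switch

def meets(candidate: CardSpec, required: CardSpec) -> bool:
    """True if the candidate covers the required inputs/outputs and its
    skew/hitless numbers fit inside the required budgets."""
    return (required.inputs <= candidate.inputs
            and required.output_standards <= candidate.output_standards
            and candidate.skew_budget_ps <= required.skew_budget_ps
            and candidate.hitless_dphi_ps <= required.hitless_dphi_ps)
```

A real spec would also carry the holdover envelope E_holdover(t) and alarm policy; they are omitted here to keep the comparison shape visible.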

B) System constraints (what silently breaks timing)

  • Thermal reality: airflow stability, gradients across OCXO/TCXO/MEMS area, and sensor placement for control decisions.
  • Power noise: rail noise density, load steps, and isolation between analog timing island vs digital control plane.
  • Backplane/cabling: reflections, common-mode injection, and ground potential differences across chassis.
  • Operations: remote-only vs local serviceability, allowed downtime for updates, required log retention and audit trail.

C) Risk notes (how to write specs that are testable)

  • Typical vs worst-case: require worst-case across temperature, rails, and chosen reference inputs; typ-only specs are not deployable.
  • Bind every number to a window: RMS jitter must state integration limits; phase/ToD must state averaging time and measurement method.
  • Budget alignment: card targets must roll up to a system budget (converter SNR, SerDes tolerance, network SLA).
  • Auditability: every “degrade/failover” decision must be provable by logs (fields + timestamps + thresholds).
  • Lifecycle reality: check PCNs, NRND/obsolete status, and second-source plan for long-life programs.
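The "bind every number to a window" rule has a concrete form for jitter: RMS jitter is the integral of single-sideband phase noise L(f) over stated limits, so the same hardware yields different numbers for different windows. A self-contained sketch using trapezoidal integration (the input arrays are illustrative):

```python
import math

def rms_jitter_s(freqs_hz, L_dbc_hz, f0_hz):
    """RMS jitter from SSB phase noise L(f) in dBc/Hz:
        RJ = sqrt(2 * integral of 10^(L/10) df) / (2*pi*f0)
    Integrated over [freqs_hz[0], freqs_hz[-1]] only; the window must be
    stated alongside the result."""
    area = 0.0
    pts = list(zip(freqs_hz, L_dbc_hz))
    for (fa, la), (fb, lb) in zip(pts, pts[1:]):
        area += 0.5 * (10 ** (la / 10) + 10 ** (lb / 10)) * (fb - fa)
    return math.sqrt(2.0 * area) / (2.0 * math.pi * f0_hz)
```

For example, a flat -150 dBc/Hz floor integrated from 12 kHz to 20 MHz on a 100 MHz carrier gives roughly 0.3 ps RMS; narrowing the window shrinks the number without changing the oscillator, which is exactly why typ-only, window-free specs are not deployable.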

D) Reference material numbers (starting points only)

Grouped by function blocks commonly found in timing cards/modules. Verify package suffix, grade, lifecycle, and timing performance in the intended measurement window.

Clock synchronizer / DPLL (time & frequency)
  • AD9545 (ADI) — clock synchronizer / DPLL platform
  • ZL30772 (Microchip) — packet/SyncE DPLL class device
Jitter cleaner / clock generator (converter & SerDes trees)
  • Si5345 (Skyworks/Silicon Labs line) — jitter attenuator family
  • LMK04828 (TI) — jitter cleaner + distribution class device
  • AD9528 (ADI) — JESD clock generator class device
  • HMC7044 (ADI) — dual-loop jitter attenuator class device
Fanout buffer / level translation (endpoint driving)
  • ADCLK948 (ADI) — low-jitter fanout buffer family
  • LMK00334 (TI) — clock buffer / level translator class device
Low-noise power (timing island rails)
  • ADM7150 (ADI) — ultralow-noise LDO class device
Isolation + sensors (control plane robustness)
  • ADuM1250 (ADI) — I²C isolator class device
  • TMP117 (TI) — digital temperature sensor class device
GNSS timing receiver modules (if GNSS disciplining is required)
  • ZED-F9T (u-blox) — timing GNSS module
  • LEA-M8T (u-blox) — timing GNSS module family
  • mosaic-T (Septentrio) — GNSS timing receiver module
  • LC29H (Quectel) — dual-band GNSS module series
Practical note: material numbers above are intended for “block matching” (DPLL / cleaner / fanout / rails / sensors / GNSS). Final selection must be driven by the measurable acceptance criteria (acceptance bench) and the stage-gates (engineering checklist) above.
Diagram: Scenario × Capability matrix (✓ required / ! high risk / – optional)
Conceptual matrix: rows are deployment scenarios (multi-card alignment, carrier timing, lab instrument, data center); columns are capabilities (alignment, holdover, outputs, alarms, redundancy, manageability); cells mark ✓ required, ! high risk, – optional.


FAQs (Troubleshooting Only)

These FAQs only close long-tail troubleshooting within the timing card/module boundary. Each answer is a data-driven 4-line checklist with measurable probes and pass criteria placeholders (X/Y/Z/T) that must be set by the system timing budget and SLA.

Recommended log fields (map to your device registers/telemetry)
ref_selected, ref_quality, loop_mode, phase_error, freq_error, switch_event, switch_reason, temp_osc, temp_board, rail_event, config_hash, fw_version
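Enforcing these fields at ingest keeps every record comparable across firmware versions and sites. A minimal validator sketch (the required set mirrors the recommended list above; trim or extend it to match your telemetry):

```python
# Required-field set taken from the recommended log fields above;
# adjust to your device's actual register/telemetry names.
REQUIRED_FIELDS = {
    "ref_selected", "ref_quality", "loop_mode", "phase_error", "freq_error",
    "temp_osc", "temp_board", "config_hash", "fw_version",
}

def missing_fields(entry: dict) -> list:
    """Return the required fields absent from one telemetry record,
    sorted for stable log output."""
    return sorted(REQUIRED_FIELDS - entry.keys())
```

A record failing this check should be flagged at write time, not discovered during a field incident.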
GNSS says “locked” but 1PPS phase still slowly drifts—first log which two counters?

Likely cause: GNSS lock indicates tracking, but timing quality is degraded so the disciplining integrator accumulates slow phase error.

Quick check: Trend phase_error_slope (ps/s) and freq_steer_word (or equivalent DAC/FCW) over ≥ T hours while recording ref_quality.

Fix: Tighten reference validation to reject “noisy-lock” and/or increase averaging/hysteresis; if GNSS receiver is marginal, validate with a timing-grade module (e.g., u-blox ZED-F9T / LEA-M8T) before changing loop targets.

Pass criteria: Over T hours the absolute phase drift rate stays ≤ X ps/s and the steer word remains within ±Y% of its nominal range without repeated ref-quality drops.
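The phase_error_slope trend in the quick check is just a least-squares fit over the logged window. A self-contained sketch (field names and units are illustrative):

```python
def phase_slope_ps_per_s(t_s, phase_ps):
    """Least-squares slope of phase_error vs time (ps/s).
    Compare the result against the X ps/s pass threshold."""
    n = len(t_s)
    mt = sum(t_s) / n
    mp = sum(phase_ps) / n
    num = sum((t - mt) * (p - mp) for t, p in zip(t_s, phase_ps))
    den = sum((t - mt) ** 2 for t in t_s)
    return num / den
```

Fitting a slope rather than differencing endpoints rejects the sample-to-sample jitter that would otherwise dominate a short log.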

Holdover is fine at room temp but fails across temperature—what trend plot reveals it fastest?

Likely cause: Holdover model is under-calibrated versus temperature (or sensor placement misses the actual oscillator gradient), so prediction error spikes during thermal transitions.

Quick check: Plot phase_error(t) together with temp_gradient = temp_osc - temp_board and holdover_residual during a controlled temp sweep.

Fix: Re-run temperature calibration and update coefficients/EEPROM; if the platform requires stronger holdover, validate DPLL/holdover devices (e.g., ADI AD9545 or Microchip ZL30772) with correct sensor placement and airflow constraints.

Pass criteria: Across the specified temperature range, holdover phase error remains inside the envelope E_holdover(t) ≤ X for at least T hours after reference loss.
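The envelope test in the pass criteria is naturally expressed with E_holdover(t) as a function, not a single number. A sketch, modeling the envelope as a callable (the linear form in the test is purely illustrative):

```python
def holdover_ok(t_hours, phase_err_ps, envelope):
    """True if |phase error| stays inside the envelope at every sample.
    envelope: callable t_hours -> max allowed |phase error| in ps,
    i.e. E_holdover(t) from the holdover spec."""
    return all(abs(p) <= envelope(t) for t, p in zip(t_hours, phase_err_ps))
```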

Cleaner output jitter is great, yet downstream FPGA occasionally loses lock—probe what at the connector?

Likely cause: The endpoint is failing on electrical integrity (swing/common-mode/termination/reflections) even though the source jitter is low.

Quick check: At the card connector measure differential swing, common-mode level, and reflection/ringing (overshoot/undershoot) with the intended termination populated at the endpoint.

Fix: Correct the output standard and termination, reduce stub length/return discontinuities, and if loading is heavy use a dedicated fanout/buffer stage (e.g., ADI ADCLK948 or TI LMK00334) per domain.

Pass criteria: FPGA lock drop count equals 0 over T hours and connector waveform meets limits (e.g., overshoot/undershoot ≤ X mV and stable common-mode within ±Y mV).

After failover, alignment is off by a fixed offset—what does that imply about delay table vs phase trim?

Likely cause: A fixed post-switch offset typically indicates an unaccounted fixed path latency (delay table mismatch) rather than random phase noise or lock instability.

Quick check: Compare delay_table_id and phase_trim_value pre/post failover and confirm the measured offset is constant (±X ps) across repeated switches.

Fix: Calibrate and store separate delay tables for main/backup paths (and for each output domain) and ensure the switch sequence applies the correct table before declaring “in-service.”

Pass criteria: After any failover event, residual fixed offset ≤ X ps and channel-to-channel skew remains within the budget ≤ Y ps without manual re-trim.
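The "constant offset across repeated switches" test from the quick check can be automated: if the post-failover offsets cluster tightly around a nonzero mean, suspect the delay table rather than noise. A sketch with an illustrative tolerance parameter:

```python
def fixed_offset(offsets_ps, tol_ps):
    """Check whether post-failover offsets are constant within ±tol_ps.
    Returns (is_fixed, mean_offset_ps); a fixed nonzero mean points at a
    delay-table mismatch rather than random phase noise."""
    mean = sum(offsets_ps) / len(offsets_ps)
    is_fixed = all(abs(o - mean) <= tol_ps for o in offsets_ps)
    return is_fixed, mean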

Periodic “time bump” every N minutes—how to tell disciplining step vs software timestamp jump?

Likely cause: The bump is either a deliberate phase step from disciplining policy or a discontinuity introduced by the timestamp/ToD distribution path.

Quick check: Correlate the bump timestamps with phase_step_event_count/discipline_step_log and the host/ToD event log; if only the host log jumps, the source is software.

Fix: If it is disciplining, switch to continuous steering or reduce step magnitude and increase smoothing; if it is software, enforce monotonic timestamp handling and audit the ToD update transaction.

Pass criteria: No phase step exceeds X ps in magnitude and ToD/timestamps remain monotonic with max discontinuity ≤ Y ns over T hours.

PTP input looks stable but card switches ref anyway—what health gate threshold is likely too tight?

Likely cause: Health gating is rejecting PTP on transient metrics (delay variation, offset spikes, or missing-stamp bursts) due to insufficient debounce/hysteresis.

Quick check: Inspect the last 60–300 s before switch: switch_reason, ptp_offset_peak, and missing_stamp_count versus the configured thresholds.

Fix: Add hysteresis and increase confirmation window for PTP degrade, and align thresholds to the system wander budget rather than instant jitter snapshots.

Pass criteria: With stable PTP, ref switching does not occur for ≥ T days and any switch is preceded by metrics exceeding thresholds continuously for ≥ X seconds.
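The hysteresis/confirmation-window fix can be sketched as a debounce rule: a switch is allowed only after the metric exceeds its threshold for N consecutive samples. Names and values below are illustrative:

```python
def confirmed_degrade(samples, threshold, confirm_n):
    """Return True only if the metric (e.g. ptp_offset_peak) exceeds
    threshold for confirm_n consecutive samples; isolated transients
    reset the run and must not trigger a reference switch."""
    run = 0
    for s in samples:
        run = run + 1 if s > threshold else 0
        if run >= confirm_n:
            return True
    return False
```

This is the X-seconds-continuous condition from the pass criteria made explicit: alternating spikes never accumulate a qualifying run.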

Why does enabling SSC reduce EMI but break one output domain—what compatibility check first?

Likely cause: The affected endpoint PLL/CDR does not tolerate the applied spread depth/rate, even if other domains remain fine.

Quick check: Verify SSC is enabled on the failing domain only, then measure modulation depth (ppm) and modulation rate at that output and compare to the endpoint tolerance spec.

Fix: Disable SSC on sensitive domains while keeping it on EMI-critical ones, or route the sensitive domain through a non-spread path (typical clock-tree uses jitter attenuators like Si5345-class or conditioners like LMK04828-class with per-domain policy).

Pass criteria: EMI peak reduction meets target while the sensitive endpoint shows 0 lock-loss events over T hours and phase/frequency excursions remain ≤ X/Y.

Multi-output skew is good at boot but degrades over hours—what thermal gradient check?

Likely cause: Channel delay elements and routing experience drift under thermal gradients, so skew slowly walks even if the source remains locked.

Quick check: Log per-channel skew_error alongside temp_osc and temp_board, then compute correlation with temp_gradient.

Fix: Improve airflow/heat spreading, relocate/duplicate sensors, and enable periodic phase re-trim if supported (ensure trims are logged and bounded).

Pass criteria: Over T hours and across operating temperatures, skew drift stays ≤ X ps (p-p) and does not correlate strongly with temperature (|r| ≤ Y).
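The |r| ≤ Y criterion is a plain Pearson correlation between logged skew and temperature gradient. A self-contained sketch:

```python
def pearson_r(xs, ys):
    """Pearson correlation between, e.g., per-channel skew_error and
    temp_gradient = temp_osc - temp_board. |r| near 1 implicates the
    thermal gradient as the drift driver."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5
```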

Alarm storms appear during power events—what to filter vs what must be immediate?

Likely cause: A rail transient triggers many dependent alarms simultaneously, and missing policy separation causes repeated debounce/retry loops.

Quick check: Align timestamps of rail_uv/ov_event (or brownout) with the alarm burst rate (alarms/min) and verify whether resets coincide with switch_event.

Fix: Debounce and rate-limit “secondary” alarms during known power-sequencing windows, but keep “hard” timing integrity alarms (loss-of-lock, missing pulse) immediate with clear single-shot actions.

Pass criteria: During power events, alarm rate ≤ X alarms/min with no repeated oscillation, and critical alarms still assert within ≤ Y ms when truly violated.
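The "filter secondary, keep hard alarms immediate" policy can be sketched as a small gate: hard timing-integrity alarms always pass, while secondary alarms are budget-limited inside a known power-sequencing window. Alarm names and the budget value are illustrative:

```python
class AlarmGate:
    """Policy sketch: rate-limit secondary alarms during power-sequencing
    windows; hard alarms (loss-of-lock, missing pulse) pass immediately."""
    HARD = {"loss_of_lock", "missing_pulse"}  # illustrative hard set

    def __init__(self, max_secondary_per_window=10):
        self.budget = max_secondary_per_window

    def admit(self, name, in_power_window):
        if name in self.HARD:
            return True               # never delay timing-integrity alarms
        if in_power_window:
            if self.budget <= 0:
                return False          # suppress storm overflow
            self.budget -= 1
        return True
```

Suppressed alarms should still be counted in the log so the storm itself remains auditable.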

One channel shows higher jitter than others—how to isolate fanout loading/termination issue quickly?

Likely cause: The “bad” channel is seeing different loading/termination or crosstalk, increasing deterministic jitter and edge distortion.

Quick check: Swap endpoint loads between two outputs and see whether the higher jitter follows the load, and measure connector reflections (ringing amplitude) on the affected path.

Fix: Normalize termination and loading, reduce stubs, and use a robust per-output buffer if needed (e.g., ADCLK948 / LMK00334 class fanout) to isolate domains.

Pass criteria: Channel-to-channel RMS jitter delta ≤ X fs (in the defined integration window) and reflection/ringing at the connector is ≤ Y mV (p-p).

Phase monitor shows noise but system works—what measurement bandwidth/window mistake is common?

Likely cause: The monitor is integrating the wrong band/timebase (mixing jitter with wander or using inconsistent averaging), producing “noise” that is not relevant to the system budget.

Quick check: Record the analyzer integration limits (f1..f2) and averaging time, then re-run using the exact window defined in acceptance (same reference path and trigger).

Fix: Standardize a single measurement recipe (window + averaging + reference) and validate it against a known-good baseline trace before concluding a hardware issue.

Pass criteria: With the correct window, measured RMS jitter/phase stats fall within the system budget ≤ X and correlate with observable system behavior (no false-fail alerts).
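One way to enforce a single measurement recipe is to derive a stable ID from its parameters and stamp it into every run's log; two results are only comparable if their IDs match. A sketch with illustrative field names:

```python
import hashlib
import json

def recipe_id(window_hz, avg_s, ref_path):
    """Short stable identifier for a measurement recipe
    (integration window f1..f2, averaging time, reference path)."""
    blob = json.dumps(
        {"f1_f2": window_hz, "avg_s": avg_s, "ref": ref_path},
        sort_keys=True,
    )
    return hashlib.sha256(blob.encode()).hexdigest()[:12]
```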

Firmware update changed timing behavior—what “golden log snapshot” should you compare?

Likely cause: Default profiles or calibration mappings changed (loop bandwidth, thresholds, delay tables), shifting behavior even if hardware is unchanged.

Quick check: Compare a golden snapshot set: fw_version, profile_id, config_hash, plus loop_mode, ref_quality, and summary stats of phase_error/freq_error under the same input conditions.

Fix: Restore the prior timing profile, migrate EEPROM calibration fields explicitly, and re-run a short acceptance suite; if the design uses DPLL/cleaner blocks, validate config equivalence for devices like AD9545, ZL30772, Si5345, LMK04828 class parts.

Pass criteria: Post-update deltas versus golden remain within limits (jitter ≤ X, skew ≤ Y, holdover envelope unchanged) and no new unexpected switch events occur over T hours.
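The golden-snapshot comparison reduces to a field-by-field diff of the two records. A sketch (snapshot keys are illustrative and should follow the golden set listed in the quick check):

```python
def snapshot_diff(golden: dict, current: dict) -> dict:
    """Fields that differ between the golden and post-update snapshots,
    mapped to (golden_value, current_value); missing keys show as None."""
    keys = golden.keys() | current.keys()
    return {k: (golden.get(k), current.get(k))
            for k in keys if golden.get(k) != current.get(k)}
```

An empty diff plus in-limit jitter/skew deltas closes the gate; any changed field (profile_id, config_hash, loop parameters) is the first suspect for a behavior shift.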