Glitch-Free Clock Mux for Hitless Main/Backup Switching

Q: Why does a “glitch-free” mux still cause a noticeable phase step at switchover?

Likely cause: Glitch-free switching prevents illegal pulses but does not guarantee phase continuity; if inputs A and B differ in frequency (Δf) or their relative phase drifts, the output takes a phase step at switchover. Quick check: Measure phase step |Δt| and TIE_peak around the event; verify Δf at mux pins ≤ ____. Fix: Use/enable windowed switching or “wait-for-safe-window”; relax the requirement to L2 (allow bounded phase step) if inputs cannot be aligned. Pass criteria: |Δt|max ≤ ____, TIE_peak ≤ ____ (window ____), and runt/double/missing = 0.

Q: Why do I see occasional double clocks during failover only at cold temperature?

Likely cause: Cold changes edge rate/swing/duty or worsens reflections, causing qualifier/gating mis-detection during switching. Quick check: At cold, probe at mux input pins: swing/overshoot/duty; check alarm/counter for rapid LOS/qualify toggling. Fix: Correct termination/return path and reduce stubs; add debounce (N_bad) and hold-off to prevent borderline chatter. Pass criteria: Cold corner: toggle X=____ times; double=0, runt=0, missing=0, and counters show no repeated back-to-back switches.

Q: Switching looks fine on the scope, but the FPGA sometimes miscounts—what trigger should be used?

Likely cause: Rare narrow pulses are missed by normal triggering; probing method creates false confidence. Quick check: Use pulse-width / runt / dropout triggers + persistence; measure at the receiver/termination point, not a stub. Fix: Use proper differential probing and short return; add an event counter (or FPGA edge counter) to correlate with failover events. Pass criteria: Over N_switch=____ events: trigger hits=0, FPGA miscounts=0, and waveform failure counters remain 0.

Q: Why does revertive switching “flap” between main and backup even though both clocks look present?

Likely cause: No hysteresis/soak on return; quality gate is too permissive, so marginal inputs cause oscillation at thresholds. Quick check: Inspect alarm/counter logs for frequent “good/bad” toggles; confirm T_holdoff and T_soak are non-zero. Fix: Add hysteresis + hold-off after each switch; require “good for T_soak + N_good” before revert; use non-revertive if needed. Pass criteria: Under disturbance/temperature sweep: switch_rate ≤ ____, no back-to-back switches within T_holdoff=____.

Q: What is the first check when only some outputs fail after switching (same mux, same source)?

Likely cause: Branch-level differences (termination, stub, load, routing) after the mux; not the mux core. Quick check: Compare “pass vs fail” branch: termination location/value, stubs, return path discontinuities; probe at each receiver. Fix: Normalize termination and remove stubs; relocate mux/fanout hierarchy so switching happens before divergent routing where possible. Pass criteria: All outputs pass the same morphology + timing windows (runt/double/missing=0; |ΔT|≤____; |Δt|≤____).

Q: Can clocks of different standards (e.g., LVCMOS ↔ LVDS) be switched without glitches?

Likely cause: Cross-standard switching violates threshold/termination/common-mode assumptions and often breaks qualification logic. Quick check: Confirm the device explicitly supports dual-standard inputs/translation; verify both paths meet the same input-qualify limits at the pins. Fix: Convert to a single standard before the mux (recommended); or choose a mux explicitly designed for that standard combination and re-validate. Pass criteria: At mux pins both inputs meet qualify limits; switching shows 0 illegal pulses and bounded |Δt|≤____ across corners.

Q: Why does additive jitter look worse after inserting a mux even when the datasheet seems small?

Likely cause: Different jitter integration windows or measurement modes; switch-correlated spurs inflate apparent jitter. Quick check: Re-measure using the same f1–f2 window and same instrument settings; separately check top spurs around switch events. Fix: Align measurement definitions; improve control/power isolation to the mux; choose a device with lower additive jitter if ΔJ remains over budget. Pass criteria: ΔJ_RMS(additive)≤____ with identical f1–f2; no new event-correlated spur above ____ dBc.

Q: How should debounce/qualification time be set to avoid false failover but still meet availability targets?

Likely cause: One set of timers is incorrectly used for both “switch away” and “switch back,” causing either false failover or slow recovery. Quick check: Measure the duration distribution of real disturbances; log N_bad/N_good events and compare to timer settings. Fix: Use fast qualify for hard faults (LOS/LOL) and slower soak+quality gate for revert; tune N_bad, N_good, T_soak, T_holdoff independently. Pass criteria: False failover rate ≤____, max outage ≤____, and flapping=0 under the defined disturbance profile.

Q: Why does the backup clock pass frequency checks but still causes downstream link errors after switchover?

Likely cause: Frequency windowing is necessary but insufficient; phase transient, jitter spectrum, or event spurs exceed downstream tolerance. Quick check: Measure |Δt|/TIE_peak at switchover and compare spurs/jitter with identical settings; check if revert is gated by “quality OK.” Fix: Add quality-gate before using/reverting to a source; keep backup in warm-standby if supported; tighten termination/routing for sensitive endpoints. Pass criteria: Link errors=0 and all defined windows pass: |Δt|≤____, TIE_peak≤____, ΔJ≤____, top spur≤____ dBc.

Q: What is the simplest production test to prove “no runt pulse / no missing cycle” at scale?

Likely cause: Production relies on scope screenshots instead of event-driven triggers and counters, so rare failures are missed. Quick check: Use a pulse-width/runt trigger with persistence and log hit_count while toggling MAIN↔BACKUP X times. Fix: Standardize a minimal script: X toggles + corner sweep (temp/voltage) + automated pass/fail counters; keep deeper jitter/PN as audit sampling. Pass criteria: Over X=____ toggles: runt=0, double=0, missing=0, and event counters match expected totals.

A glitch-free clock mux keeps critical clock trees running through main/backup failover without illegal pulses (runt/double/missing), while making phase steps and jitter impact measurable and bounded. The core engineering task is to define switching policy, qualification/hold-off parameters, and acceptance windows so failover is predictable, testable, and diagnosable in production and the field.

What is a Glitch-Free Clock Mux (and where it sits in a redundant clock tree)

A glitch-free clock mux switches between two clock sources without generating runt pulses, double clocks, or missing cycles at the output. In redundant systems it enables main/backup failover so downstream endpoints continue to see a valid clock during faults—provided the sources are compatible and the switching decision is properly qualified.

Terminology: three practical acceptance levels

Glitch-free (no invalid pulses)

Output switching produces no runt pulses, no double edges, and no missing cycles.

Hitless (no clock outage)

Output remains continuous (no “dead time”). Phase steps may still occur depending on source alignment.

Seamless / near phase-continuous (controlled phase transient)

Phase transient is bounded and small enough for the endpoint’s tolerance window. This typically requires tight frequency match, qualified switching windows, and a defined policy.

Where it sits in the clock tree (typical)

Reference sources (main + backup) feed the system clock chain.
Cleaning / conditioning (if used) ensures both inputs meet level, jitter, and stability requirements.
Glitch-free mux performs failover switching.
Fanout / distribution replicates the selected clock to multiple endpoints (FPGA/SerDes/Converters/PHY/SoC).

This page focuses on hitless switching mechanics, decision logic, and validation. Details of PLL loop design, crosspoint routing, or fanout buffer architectures are intentionally not expanded here.

Key prerequisites for true hitless behavior

Same frequency (or tightly bounded Δf): if the inputs drift apart, phase difference will walk, and “seamless” is not feasible.
Same signaling standard and valid levels: mismatched common-mode, swing, or termination can break edge qualification and create false switching.
Comparable quality signals: there must be reliable indicators (LOS/LOL/frequency window/phase window) to drive a deterministic decision.
Warm standby (recommended): the backup path should be stable before it is needed, otherwise failover becomes recovery-with-transient.

The mux is only one element of hitless redundancy: qualification, policy, and measurement close the loop.

What “glitch-free / hitless” really means: failure modes and pass criteria

“Glitch-free” and “hitless” only matter if they are tied to observable metrics. A failover can look acceptable on a slow timebase yet still break endpoints due to rare runt pulses, one-in-a-thousand double clocks, or a phase step that exceeds tolerance. This section defines the failure modes and a measurement-friendly acceptance template.

Failure modes to look for (grouped by impact)

Waveform validity (logic-level fatal)

Runt pulse: too narrow / too small but still crosses an input threshold.
Double clock: two valid edges occur within one expected period.
Missing cycle: output period stretches beyond the allowed limit.
Abnormal duty: duty-cycle distortion shifts edge timing or triggers mis-detection.
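
The definitions above can be turned into counters once edge data has been captured. The following is a minimal sketch, assuming rising-edge timestamps and high-pulse widths are already available from a scope export or an edge counter. The function name and the three threshold fractions (`min_period_frac`, `max_period_frac`, `min_width_frac`) are illustrative assumptions, not values from any datasheet.

```python
from dataclasses import dataclass


@dataclass
class ValidityCounts:
    runt: int = 0
    double: int = 0
    missing: int = 0


def classify_edges(rise_times, high_widths, t_nom,
                   min_period_frac=0.5, max_period_frac=1.5,
                   min_width_frac=0.25):
    """Count waveform-validity failures.

    rise_times  : rising-edge timestamps (any consistent time unit)
    high_widths : measured high-pulse widths (same unit)
    t_nom       : nominal clock period (same unit)
    The *_frac thresholds are assumed limits; set them from the endpoint spec.
    """
    counts = ValidityCounts()
    for prev, cur in zip(rise_times, rise_times[1:]):
        period = cur - prev
        if period < min_period_frac * t_nom:
            counts.double += 1       # two edges inside one expected period
        elif period > max_period_frac * t_nom:
            counts.missing += 1      # period stretched beyond the limit
    for width in high_widths:
        if width < min_width_frac * t_nom:
            counts.runt += 1         # pulse too narrow to be a valid clock high
    return counts


if __name__ == "__main__":
    # 10 ns nominal clock with one extra edge, one stretched period, one runt.
    edges_ns = [0.0, 10.0, 20.0, 23.0, 30.0, 50.0]
    widths_ns = [5.0, 5.0, 5.0, 5.0, 1.0]
    print(classify_edges(edges_ns, widths_ns, t_nom=10.0))
```

Amplitude-based runts (pulses that are too small rather than too narrow) are not visible in timestamp data and still need a scope runt trigger.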

Timing transients (endpoint tolerance dependent)

Phase step: a sudden time offset Δt at switchover (not necessarily a glitch).
Period / frequency step: short-term period error during decision or gating windows.
Temporary wander: low-frequency drift that accumulates phase error over time.

Clock quality degradation (budget-limited)

Additive jitter increase: RMS jitter rises beyond remaining system budget.
New spurs / transients: switching control injects discrete tones or wideband noise.

Acceptance template (application fills X/Y/budget)

Waveform: no runt pulses, no double edges, no missing cycles over N switches.
Timing: phase step (TIE) < X; max period error < Y.
Quality: additive RMS jitter increase < remaining budget; no new spur violates system mask.

Critical detail: RMS jitter numbers are only comparable if the integration window and measurement method are identical.

Why “glitch-free” ≠ “phase-continuous”

A mux can be perfectly glitch-free while still producing a measurable phase step at switchover. Phase continuity depends on source frequency match, the allowed switching window, and endpoint tolerance. For engineering clarity, success should be declared as Level 1 (glitch-free), Level 2 (hitless), or Level 3 (near phase-continuous) before tuning policies or thresholds.

A scope screenshot alone can be misleading—pair waveform validity checks with timing (TIE/phase step) and jitter-budget verification using consistent measurement windows.

Switching policies: revertive vs non-revertive, priority, manual override, warm-standby

A glitch-free mux becomes hitless in the field only when switching is driven by a deterministic policy: clear triggers, stable qualification windows, anti-flap guardrails, and a defined return strategy. The goal is to switch fast on real faults while avoiding “thrash” on marginal conditions.

Policy matrix (choose per availability vs stability needs)

Revertive vs Non-revertive

Revertive (auto return): returns to main when main is stable and qualified; best when main has a clear quality advantage.
Non-revertive (stay on backup): remains on backup until manual action or a strict return condition; reduces repeated switching risk.

Priority vs “Best clock”

Priority: main is preferred unless it is declared bad; simpler and easier to qualify.
Best-clock selection: chooses the better source based on quality metrics; requires stable measurements and strong anti-flap controls.

Warm-standby vs Cold-standby

Warm-standby: backup path is already stable (frequency/level/lock) before it is needed; enables faster failover.
Cold-standby: backup starts on demand; often cannot meet tight hitless requirements due to start/lock time.

The flapping problem (why systems switch back and forth)

“Flapping” is usually caused by unstable decision signals (borderline LOS/LOL, noisy frequency checks, marginal levels) rather than the mux core. Prevent repeated switching with guardrails that convert noisy observations into stable decisions.

Debounce: require N consecutive good/bad windows before asserting a state.
Hysteresis: use asymmetric thresholds for entering vs exiting a fault condition.
Hold-off: after any switch, block additional switching for T_holdoff.
Soak time: require main to be good for T_soak before any revertive return.

Practical engineering defaults (safe starting point)

Failover: if LOS/LOL is confirmed, switch quickly (availability first).
Return: never return immediately; require “main present” + quality OK + T_soak, then apply T_holdoff after the switch.
Manual override: allow forcing MAIN/BACKUP for commissioning, but always log the reason and block auto actions if policy requires.

A stable failover system switches quickly on confirmed faults and returns only after sustained “good” qualification (debounce + hysteresis + soak + hold-off).
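
The hold-off and soak guardrails described in this section can be sketched as a small state machine. This is an illustrative model, not a device's actual control logic: the class name, the `update` signature, and the choice to let hold-off block every switch (including a fault-driven one) are assumptions. The `main_ok` and `backup_ok` inputs are assumed to be already debounced.

```python
class FailoverPolicy:
    """Revertive / non-revertive source selection with hold-off and soak."""

    def __init__(self, t_holdoff, t_soak, revertive=True):
        self.t_holdoff = t_holdoff      # quiet time after any switch
        self.t_soak = t_soak            # continuous "main good" time before revert
        self.revertive = revertive
        self.selected = "MAIN"
        self.switch_count = 0
        self._last_switch = None
        self._main_good_since = None

    def _switch(self, target, now):
        self.selected = target
        self._last_switch = now
        self.switch_count += 1

    def update(self, now, main_ok, backup_ok):
        # Track how long main has been continuously good.
        if main_ok:
            if self._main_good_since is None:
                self._main_good_since = now
        else:
            self._main_good_since = None

        # Hold-off: no further switching shortly after a switch.
        if self._last_switch is not None and now - self._last_switch < self.t_holdoff:
            return self.selected

        if self.selected == "MAIN":
            if not main_ok and backup_ok:
                self._switch("BACKUP", now)          # fast failover
        else:
            soaked = (self._main_good_since is not None
                      and now - self._main_good_since >= self.t_soak)
            backup_failed = not backup_ok and main_ok
            if (self.revertive and soaked) or backup_failed:
                self._switch("MAIN", now)
        return self.selected


if __name__ == "__main__":
    p = FailoverPolicy(t_holdoff=5, t_soak=10)       # times in ms (illustrative)
    for t, main_ok in [(0, True), (1, False), (2, True), (6, True), (12, True)]:
        print(t, p.update(t, main_ok, backup_ok=True))
```

In a real design a hard fault on the selected source during hold-off may need to override the hold-off timer; that trade is a policy decision and is not modeled here.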

Inside the box: architectures that make switching glitch-free

“Glitch-free” is achieved by controlling the switching instant. The mux core must ensure that output gating never produces illegal pulse widths, even when inputs are noisy or slightly misaligned. Different internal architectures implement this with different trade-offs in phase transient, latency, and tolerance to input imperfections.

Common implementations (what they optimize)

Windowed gating (synchronous gating + safe window)

Mechanism: switching is only allowed when the selected and candidate clocks are in a “safe” region that cannot create a runt pulse.
Strength: robust glitch prevention for fast clocks and tight validity requirements.
Risk: excessive input jitter, poor duty, or marginal levels shrink the window and increase decision sensitivity.

Phase compare + edge alignment (optional, transient-focused)

Mechanism: estimates relative phase and chooses a switch instant that minimizes phase step.
Strength: improved phase transient control when inputs are truly same-frequency.
Risk: phase estimation must be stable; frequency offset or noisy detection can worsen repeatability.

Elastic buffering / divider-domain switching (latency-tolerant)

Mechanism: switches in a buffered or lower-frequency domain where safe timing margins are larger.
Strength: simplifies glitch prevention for low-frequency or divided clocks.
Risk: adds latency and may introduce phase uncertainty that must be budgeted.

Why it is glitch-free (unified principle) + key input constraints

The core rule is simple: switch only in a safe window where the output cannot form an illegal pulse width. In practice, the safe window depends on edge detection quality, logic thresholds, noise, and duty-cycle.
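
To make the safe-window rule concrete, here is a small behavioral sketch (integer time steps, ideal square waves) comparing an immediate select change with one that waits until both clocks are low. It is a simplified model under stated assumptions: real devices use their own gating circuits, and the periods, phases, and function names here are illustrative only.

```python
def clk(t, period, phase):
    """Ideal 50% duty clock sampled at integer time t."""
    return ((t + phase) % period) < period // 2


def simulate(t_req, windowed, clk_a=(10, 0), clk_b=(10, 3), t_end=200):
    """Return the mux output samples for a switch request from A to B at t_req.

    windowed=False: select changes immediately (naive mux).
    windowed=True : select changes only when both inputs are low.
    """
    sel_b = False
    out = []
    for t in range(t_end):
        a = clk(t, *clk_a)
        b = clk(t, *clk_b)
        if t >= t_req and not sel_b:
            if not windowed or (not a and not b):
                sel_b = True
        out.append(b if sel_b else a)
    return out


def high_widths(samples):
    """Widths of completed high pulses (a trailing partial pulse is ignored)."""
    widths, run = [], 0
    for v in samples:
        if v:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    return widths


if __name__ == "__main__":
    for t_req in (21, 23, 26):
        naive = min(high_widths(simulate(t_req, windowed=False)))
        safe = min(high_widths(simulate(t_req, windowed=True)))
        print(f"t_req={t_req}: narrowest high pulse naive={naive}, windowed={safe}")
```

With these example phases the two low intervals overlap, so a safe window always exists. If the inputs were in exact anti-phase, this simple rule would never find a window, which is one reason practical policies also bound the waiting time.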

Constraint → common failure symptom mapping

Low swing / slow edge: unstable edge detection → runt pulses or sporadic double clocks.
Duty distortion: safe window shifts → missing-cycle risk or larger phase transient.
Common-mode / termination errors (differential): comparator mis-detect → rare validity failures.
High input jitter: safe window erodes → non-deterministic switch instant and tolerance violations.

Mechanism-level checks (quick validation before deep tuning)

Window stress: repeat switching while reducing input swing and observing whether failures appear as runt/double/missing.
Duty sensitivity: introduce controlled duty distortion and confirm the pass criteria remains satisfied.
Repeatability: measure the distribution of switching instants (phase step histogram) to detect non-deterministic behavior.

Conceptual blocks only: real devices vary, but glitch-free behavior always depends on qualification and a safe switching window that prevents illegal pulse widths.

Alignment & hitless mechanics: phase continuity, cycle slip, and controlled phase steps

“Hitless” is not a single promise. Switching can be glitch-free yet still introduce a phase step. For engineering clarity, define the target level (L1/L2/L3), then design the alignment window and switching rules to keep phase transients within endpoint tolerance.

Three acceptance levels (use as a system requirement)

L1 — Glitch-free

No runt pulses, no double clocks, no missing cycles. Phase continuity is not guaranteed.

L2 — Gapless

No clock outage and no missing periods (the “hitless” level in the terminology above). A bounded phase step (Δt) may occur.

L3 — Near phase-continuous

Switch timing is controlled to keep phase transient within a narrow window. Typically requires same-frequency inputs and a defined alignment strategy.

Alignment conditions (when L3 is meaningful)

Same frequency (or tightly bounded Δf): otherwise phase error drifts and “always continuous” is not realistic.
Switch window: switch only when the relative phase falls within a permitted window.
Controlled phase step: if perfect continuity is impossible, cap the step size and select the best switching instant.

Cycle slip (why phase error becomes a sawtooth with small Δf)

With a small frequency offset between sources, relative phase error accumulates over time and wraps, producing a sawtooth-like drift. In this case, the practical control lever is not “forcing phase continuity forever,” but choosing the switching instant and bounding the phase step.

Windowed switching: wait until phase error enters the allowed band.
Step cap: enforce |Δt| ≤ X (system-defined), otherwise delay switching or downgrade to L2 behavior.
Bounded waiting: set a maximum wait time to avoid excessive failover delay under fault conditions.
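
The three rules above can be illustrated numerically. The sketch below assumes an idealized model in which the relative time offset between the two sources walks linearly at the fractional frequency offset and wraps once per clock period; the function names and the sign convention (positive ppm makes Δt increase) are assumptions made for illustration.

```python
def wrap(x, period):
    """Wrap a time offset into [-period/2, period/2)."""
    return (x + period / 2) % period - period / 2


def wait_for_window(f_nom_hz, offset_ppm, dt0_s, window_s, max_wait_s):
    """Time to wait until |dt| <= window_s, or None if it exceeds max_wait_s.

    f_nom_hz   : nominal clock frequency
    offset_ppm : frequency offset between sources (phase walks at this rate)
    dt0_s      : relative time offset when the switch is requested
    window_s   : allowed phase-step cap (the X in |dt| <= X)
    max_wait_s : bounded-waiting limit
    """
    period = 1.0 / f_nom_hz
    drift = offset_ppm * 1e-6            # seconds of phase walk per second
    x0 = wrap(dt0_s, period)
    if abs(x0) <= window_s:
        return 0.0                       # already inside the window
    if drift == 0.0:
        return None                      # phase never moves into the window
    # Normalize so the offset always moves in the positive direction.
    y = x0 if drift > 0 else -x0
    if y < -window_s:
        distance = -window_s - y         # approach the window from below
    else:
        distance = period - window_s - y  # run to the wrap, then approach
    wait = distance / abs(drift)
    return wait if wait <= max_wait_s else None


if __name__ == "__main__":
    # 100 MHz clocks, 1 ppm apart: the sawtooth repeats every 10 ms.
    print(wait_for_window(100e6, 1.0, 3e-9, 1e-9, max_wait_s=20e-3))
    print(wait_for_window(100e6, 1.0, 3e-9, 1e-9, max_wait_s=1e-3))
```

When the function returns None, the policy has to choose between switching anyway with a larger step (L2 behavior) or extending the wait, as described above.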

With small Δf, phase error drifts and wraps. “Near-continuous” switching is achieved by choosing a switch instant inside the allowed window and bounding the phase step.

Fault detection & decision logic: LOS/LOL, frequency windowing, hysteresis, debounce

Field reliability depends on stable decisions, not raw indicators. A robust design converts raw monitors (LOS/LOL, edge counting, frequency windows, phase drift, lock pins) into a qualified state with debounce, hysteresis, timers, and return gating.

Detection inputs (use more than one when possible)

LOS (loss of signal): amplitude/edge disappearance indicates a hard failure.
LOL / lock pins: fast but may chatter near boundary; always filter.
Frequency windowing: counts edges in a measurement window and compares against a Δf window.
Phase drift trend: detects gradual degradation; useful for “quality” decisions.

Anti-false-trigger controls (turn noisy signals into stable states)

Debounce: require N consecutive windows before asserting GOOD/BAD.
Hysteresis: use different thresholds for enter vs exit (Δf_in vs Δf_out).
Hold-off: after switching, ignore transient alarms for T_holdoff.
Quality-gate (return): return to main only after lock is stable for T_qual and quality checks pass.

Executable parameter set (typical configuration knobs)

N_bad / N_good: debounce counts for BAD and GOOD qualification.
Δf_window_in / Δf_window_out: frequency window thresholds (hysteresis pair).
T_holdoff: post-switch quiet time to suppress transient mis-detection.
T_qual: lock/quality qualification time before allowing a return.
LOS threshold: amplitude/edge criteria for declaring LOS (if supported).
Lock filter: delay/filter applied to lock pins to avoid chatter.
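
As a rough illustration of how these knobs interact, the sketch below models a frequency-window qualifier with a hysteresis pair and separate debounce counts. It assumes Δf_window_in is the tighter limit needed to be re-qualified as GOOD and Δf_window_out is the wider limit that must be exceeded to be declared BAD; that reading of the names, along with the class and method names, is an assumption.

```python
class FreqQualifier:
    """Edge-count frequency monitor with hysteresis and debounce."""

    def __init__(self, f_nom_hz, df_in_ppm, df_out_ppm, n_bad, n_good):
        if df_in_ppm >= df_out_ppm:
            raise ValueError("df_in_ppm must be tighter than df_out_ppm")
        self.f_nom_hz = f_nom_hz
        self.df_in_ppm = df_in_ppm
        self.df_out_ppm = df_out_ppm
        self.n_bad = n_bad
        self.n_good = n_good
        self.good = True
        self._streak = 0

    def update(self, edge_count, gate_s):
        """Feed one measurement window; return the qualified GOOD/BAD state."""
        f_meas = edge_count / gate_s
        err_ppm = abs(f_meas - self.f_nom_hz) / self.f_nom_hz * 1e6
        if self.good:
            # Only errors beyond the wider exit limit count toward BAD.
            self._streak = self._streak + 1 if err_ppm > self.df_out_ppm else 0
            if self._streak >= self.n_bad:
                self.good, self._streak = False, 0
        else:
            # Only errors inside the tighter entry limit count toward GOOD.
            self._streak = self._streak + 1 if err_ppm <= self.df_in_ppm else 0
            if self._streak >= self.n_good:
                self.good, self._streak = True, 0
        return self.good


if __name__ == "__main__":
    q = FreqQualifier(100e6, df_in_ppm=20, df_out_ppm=50, n_bad=2, n_good=3)
    gate = 0.01   # 10 ms gate: one count is 1 ppm at 100 MHz
    for count in (1_000_000, 1_000_060, 1_000_060, 1_000_030, 1_000_010):
        print(count, q.update(count, gate))
```

Note that the gate time sets the measurement resolution: a 1 ms gate at 100 MHz resolves only 10 ppm per count, so the window thresholds and gate length have to be chosen together.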

A pipeline approach prevents false triggers: debounce and hysteresis stabilize raw monitors, timers qualify returns, and hold-off suppresses post-switch transients.

Clock-quality budgeting around a mux: additive jitter, phase noise, duty-cycle distortion, spurs

A mux should be treated as a budgeted impairment element: it can add random jitter (RMS), shape phase noise (close-in vs floor), distort duty cycle (especially LVCMOS), and introduce spurs correlated with switching or control activity. Robust budgeting separates these contributions and verifies them with consistent measurement conditions.

What the mux can change (focus on incremental impairment)

Additive RMS jitter (with defined integration limits)

“Additive”: the mux contribution beyond the input source.
Bandwidth matters: compare numbers only when the integration range is the same.
Use-case: captures broadband random effects that accumulate across stages.

Phase noise (close-in vs floor)

Close-in: slow phase wander and low-offset noise sensitivity.
Floor / far-out: wideband noise contributing to RMS jitter.
Practical rule: track both, because endpoints weigh them differently.

Duty-cycle distortion (DCD)

Most visible on LVCMOS: edge-rate and threshold effects shift duty.
Why it matters: can break downstream edge-based timing assumptions.
Report: min/typ/max duty under defined load/termination.

Spurs (event-correlated impurities)

Source: switching transients, control coupling, supply/ground bounce.
Risk: may not inflate RMS jitter much but can violate masks.
Verify: spur offsets and dBc, and correlation to switch events.

A practical budgeting framework (keep measurement conditions consistent)

For random jitter-like terms, treat each stage as an RMS contributor under the same integration bandwidth, then combine by RSS: J_total ≈ √(J_src² + J_cleaner² + J_mux² + J_fanout² + …). Track spurs separately as a mask/peak metric rather than folding them into RMS.

RMS line items: use additive jitter terms for “what the stage adds.”
PN split view: keep close-in and far-out summaries (or key offsets).
Spur checklist: top offsets + dBc + event correlation (switching/control).
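
A short worked example of the RSS combination follows. The helper names and the femtosecond figures are illustrative assumptions only; the calculation is valid only when every term uses the same integration bandwidth and the terms are uncorrelated.

```python
import math


def rss(*terms):
    """Root-sum-square of uncorrelated RMS jitter terms (same units, same bandwidth)."""
    return math.sqrt(sum(t * t for t in terms))


def remaining_budget(total_limit, *used):
    """RMS jitter still available for one more stage, or 0.0 if already over budget."""
    rem_sq = total_limit ** 2 - sum(t * t for t in used)
    return math.sqrt(rem_sq) if rem_sq > 0 else 0.0


if __name__ == "__main__":
    # Illustrative numbers in femtoseconds RMS.
    j_src, j_mux = 300.0, 400.0
    print("combined:", rss(j_src, j_mux))
    print("left for the mux under a 500 fs limit:", remaining_budget(500.0, j_src))
```

Because the terms add in quadrature, a stage that is much quieter than the dominant contributor barely moves the total, while spurs do not follow this rule and stay on the separate checklist.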

Budget random jitter terms by RSS under consistent bandwidth, and track spurs separately as peak/mask items (often correlated with switching/control activity).

Interfaces & board design: LVCMOS/LVDS/HCSL/LVPECL, terminations, skew, fanout placement

Switching reliability is often limited by interface discipline, not the mux core. Keep sources comparable (same standard and frequency), terminate correctly, maintain a continuous return path, and apply skew control only where it matters for the chosen hitless level.

Compatibility first (avoid cross-standard switching)

Same standard in: LVCMOS↔LVCMOS or LVDS↔LVDS is the default safe assumption.
Cross-standard risk: level translation can change edge integrity, duty, and threshold behavior.
Comparable conditions: identical termination and biasing strategy on both inputs improves deterministic switching.

Termination & common-mode (why some outputs “fail” while others pass)

HCSL / LVDS

Place Rt near RX: long stubs make reflections and threshold mis-detect more likely.
Keep diff pair coupled: avoid uneven spacing and abrupt geometry changes.
Return path continuity: crossing splits/voids increases mode conversion and overshoot.

LVPECL / LVCMOS

Bias/termination completeness: missing biasing can appear as “clipped” or shifted waveforms.
Edge rate control: too-fast edges can worsen overshoot and false edge detection.
Duty sensitivity: LVCMOS duty can shift with loading and threshold behavior.

Skew control + mux/fanout placement trade-off (keep it requirement-driven)

Where skew matters: mux output to fanout input (and any parallel endpoints requiring alignment).
Where it often does not: branches with no phase-relationship requirement beyond legal pulses.
Mux before cleaner: unified cleanup after switching, but recovery/qualify timing becomes critical.
Mux after cleaner: both paths are already “clean,” but layout symmetry and coupling become more sensitive.

Keep switching comparable (same standard), place termination near receivers, preserve return continuity, and apply skew control where alignment requirements demand it.

Failover timing: switchover time, holdover, and downstream tolerance windows

“Gapless output” does not automatically mean “zero phase disturbance.” A practical failover spec is a time-budgeted sequence: detection, decision, switching, then post-switch settling. Downstream tolerance should be expressed as fillable windows (phase/period/jitter/spur) instead of a single number.

What “switchover time” means in engineering terms

t_detect: raw fault becomes a qualified alarm (LOS/LOL, frequency window, phase drift).
t_decide: decision logic and policy gating (debounce, hysteresis, timers, priority).
t_switch: the switching actuation (may include bounded waiting for a safe/phase window).
t_settle: downstream stabilization / re-lock observation window after the switch.
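
The four terms can be added into a worst-case budget. The sketch below assumes t_detect is dominated by N_bad consecutive measurement windows and that t_switch includes the bounded wait for a safe/phase window plus a fixed actuation delay; those modeling choices, the function name, and all numbers are assumptions for illustration.

```python
def failover_budget(n_bad, gate_s, t_decide_s, max_wait_s, t_actuate_s, t_settle_s):
    """Worst-case switchover time split into the four phases (seconds)."""
    parts = {
        "t_detect": n_bad * gate_s,             # debounce over N_bad windows
        "t_decide": t_decide_s,                 # policy gating
        "t_switch": max_wait_s + t_actuate_s,   # bounded window wait + actuation
        "t_settle": t_settle_s,                 # downstream stabilization
    }
    parts["total"] = sum(parts.values())
    return parts


if __name__ == "__main__":
    budget = failover_budget(n_bad=2, gate_s=1e-3, t_decide_s=0.5e-3,
                             max_wait_s=2e-3, t_actuate_s=0.1e-3, t_settle_s=10e-3)
    for name, value in budget.items():
        print(f"{name}: {value * 1e3:.2f} ms")
```

Comparing the total against the endpoint's allowed outage or re-lock window shows quickly whether debounce depth or the window wait is the term to shorten.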

Downstream tolerance focus (use to choose acceptance windows)

SerDes / high-speed links

Primary sensitivity to continuity and event-correlated phase/jitter spectrum. Acceptance windows should emphasize phase step, TIE peaks, and spur emergence around switch events.

Converters / sampling-critical endpoints

Primary sensitivity to random jitter budget and spurs. Acceptance windows should lock integration bandwidth and track top spur offsets/dBc before vs after switching.

FPGA / SoC edge-based logic

Primary sensitivity to illegal pulses (runt/double/missing) and abnormal period/duty. Acceptance windows should prioritize morphology pass/fail with aggressive glitch-trigger coverage.

Fillable tolerance windows (use as a requirement template)

Continuity

No missing cycles: YES
Allowed outage < ____

Timing

Max phase step |Δt| < ____
TIE peak < ____ (window ____)
Period error |ΔT| < ____ (N cycles ____)

Quality

ΔJ_RMS(additive) < ____ (f1–f2: ____)
Top spurs < ____ dBc @ offsets ____
Return qualify time T_qual > ____

Recovery

Downstream re-lock < ____
No flapping for ____ after switch

A usable failover specification decomposes time into detection, decision, switching, and settling—then assigns tolerance windows per downstream sensitivity.

Validation & measurement traps: how to prove it is truly glitch-free

A single “clean looking” scope capture is not proof. Reliable validation is layered: morphology (illegal pulses), timing (TIE/phase step/period error), and quality (jitter/phase-noise bandwidth consistency and spurs). Each layer has common traps that can hide rare failures or create false glitches.

Layered validation (three complementary proof paths)

Morphology (scope)

Glitch / pulse-width triggers + persistence coverage.
Proves: no runt, no double, no missing cycles.
Report: 0 hits over N switching events.

Timing (TIE / Δt)

Capture peak phase step and TIE peaks in defined windows.
Proves: bounded transients even if output is gapless.
Report: |Δt|max, TIE_peak, |ΔT|max.

Quality (jitter / spurs)

Keep integration bandwidth identical across comparisons.
Separate RMS jitter from spur/mask checks.
Report: ΔJ(additive), top spurs (offset/dBc), event correlation.
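
For the timing layer, the reported numbers can be derived from captured rising-edge timestamps. This is a minimal sketch under stated assumptions: TIE is referenced to an ideal clock anchored at the first edge, the phase step is taken as the difference of mean TIE after and before the switch, and TIE_peak is measured relative to the pre-switch mean. Function names and the window convention are illustrative.

```python
import statistics


def tie(edges, t_nom):
    """Time interval error of each edge against an ideal clock from the first edge."""
    return [t - (edges[0] + n * t_nom) for n, t in enumerate(edges)]


def switch_metrics(edges, t_nom, k, window):
    """Timing metrics around a switchover.

    k      : index of the first edge after the switch (k >= 1)
    window : number of edges examined on each side of the switch
    Returns (phase_step, tie_peak, period_error_max) in the units of `edges`.
    """
    errors = tie(edges, t_nom)
    lo, hi = max(0, k - window), min(len(edges), k + window)
    before, after = errors[lo:k], errors[k:hi]
    baseline = statistics.mean(before)
    phase_step = statistics.mean(after) - baseline
    tie_peak = max(abs(e - baseline) for e in errors[lo:hi])
    periods = [edges[i + 1] - edges[i] for i in range(lo, hi - 1)]
    period_error_max = max(abs(p - t_nom) for p in periods)
    return phase_step, tie_peak, period_error_max


if __name__ == "__main__":
    # 10 ns clock with a +2 ns step at the sixth edge (illustrative data).
    edges_ns = [0.0, 10.0, 20.0, 30.0, 40.0, 52.0, 62.0, 72.0, 82.0, 92.0]
    print(switch_metrics(edges_ns, t_nom=10.0, k=5, window=5))
```

If the two sources have a frequency offset, TIE has a slope on one side of the switch and the mean-based step estimate becomes window-dependent, so the window length should be stated alongside the result.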

Measurement traps (why “it looks fine” can still fail)

Trigger miss: wrong trigger type or no persistence hides rare runt/double events.
Probe-induced artifacts: long ground loops or single-ended probing of a differential net creates fake glitches.
Bandwidth mismatch: insufficient bandwidth/sampling can smear or completely miss narrow pulses.
Different jitter windows: changing integration limits makes “better/worse” comparisons meaningless.
Wrong measurement point: measuring on a stub or away from termination exaggerates reflections.
RMS-only bias: RMS jitter may look stable while a new switch-correlated spur violates a mask.

Minimal pass test set (directly usable for validation / production templates)

Morphology

N_switch ≥ ____; glitch trigger hits = 0
Persistence coverage ≥ ____; runt/double/missing = 0
Duty in [____, ____] under defined load

Timing

|Δt|max < ____
TIE_peak < ____ (window ____)
|ΔT|max < ____ (N cycles ____)

Quality

ΔJ_RMS(additive) < ____ (f1–f2: ____)
Top spurs < ____ dBc @ offsets ____
No new switch-correlated spur emergence after events

Proving “glitch-free” requires a measurement chain that can catch rare illegal pulses, quantify phase transients, and compare jitter/PN under identical bandwidth settings.

Engineering checklist + Applications & IC selection logic

This section turns “glitch-free / hitless” requirements into an executable bring-up and production plan, then maps those requirements to concrete selection filters and representative IC part numbers (by candidate class, not as a universal BOM).

A) Engineering checklist (design → bring-up → validation → production)

A1) Input readiness (make A/B comparable before expecting “hitless”)

Standard & termination match: keep A/B in the same electrical standard (LVCMOS/LVDS/HCSL/LVPECL) and termination style; avoid “cross-standard switching” unless the mux explicitly supports it.
Frequency window: verify Δf ≤ ____ at the mux input pins (not only at the source connector).
Amplitude/CM/duty: confirm swing, common-mode, and duty-cycle are within the mux input qualification limits (duty in ____ to ____).
Power-up sequencing: define default path (MAIN/BACKUP), input-valid timing, and reset/enable ordering to prevent startup false-switching and “first-switch” artifacts.
Warm-standby option: if available, keep the backup path qualified/locked to reduce switching transient risk.

A2) Decision parameters (prevent flapping and false triggers)

Treat failover and revert decisions as a parameterized filter chain. Default engineering posture: fast switch on hard failure (LOS/LOL), and delayed/qualified return (soak + quality OK).

Parameter template (fill with system limits)

Debounce: N_bad = ____ cycles, N_good = ____ cycles
Hysteresis: Δ = ____ (threshold enter/exit separation)
Hold-off after switching: T_holdoff = ____ ms
Revertive soak/qualify time: T_soak = ____ ms
Frequency window: Δf_window = ____ ppm (or ____ Hz)
(Optional) Phase drift gate: Δφ_window = ____ (or TIE ____)

A3) Output health (unify waveform + timing + quality acceptance)

Waveform (scope)

Runt pulses: 0 events in persistence
Double clocks / extra edges: 0 events
Missing cycles: 0 events
Duty anomaly beyond ____: 0 events

Timing (TIE / phase step)

Max period error: |ΔT| ≤ ____
Max phase step (time): |Δt| ≤ ____
TIE_peak within window (____): ≤ ____

Quality (jitter / spurs)

Additive RMS jitter ΔJ ≤ ____ (integration: ____ to ____)
Spurs: top spur ≤ ____ dBc @ offsets ____
No event-correlated spur bursts during switching

A4) Control & observability (make failures diagnosable in the field)

Control plane: pin-strap vs I²C/SPI; manual override; priority; revertive/non-revertive configuration.
Alarms: LOS/LOL/frequency window/phase monitor as pin or status bits (and how they map to decisions).
Event counters: switch_count, fail_count, alarm_count to correlate intermittent issues.
Timestamp hook: capture switching events (edge or interrupt) for lab correlation with spurs/phase steps.

A5) Production minimum test set (small but decisive)

Toggle MAIN↔BACKUP for X = ____ cycles/events; waveform failures must remain 0.
Corner sweep: temperature ____, voltage ____, input disturbance ____.
Record alarms + counters; reject lots with abnormal switch_rate or fail_rate.
Reuse the same three-layer acceptance (waveform / timing / quality) for consistency across teams.
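
A production toggle test along these lines might be scripted as below. This is a sketch only: `select` and `read_counters` stand for whatever hardware access a given fixture provides, the counter names are hypothetical, and `FakeDut` is a simulated stand-in so the example runs without hardware.

```python
def run_toggle_test(select, read_counters, n_toggles):
    """Toggle MAIN<->BACKUP n_toggles times and evaluate counter deltas.

    select(source)   : hypothetical hook that forces "MAIN" or "BACKUP"
    read_counters()  : hypothetical hook returning a dict of event counters
    Returns (passed, delta).
    """
    start = read_counters()
    source = "MAIN"
    for _ in range(n_toggles):
        source = "BACKUP" if source == "MAIN" else "MAIN"
        select(source)
    end = read_counters()
    delta = {name: end[name] - start[name] for name in end}
    no_failures = all(delta[name] == 0 for name in ("runt", "double", "missing"))
    counts_match = delta["switch_count"] == n_toggles
    return no_failures and counts_match, delta


class FakeDut:
    """Simulated device for demonstration; not a real driver."""

    def __init__(self, runt_every=0):
        self.counters = {"switch_count": 0, "runt": 0, "double": 0, "missing": 0}
        self.runt_every = runt_every   # inject one runt every N switches (0 = never)

    def select(self, source):
        self.counters["switch_count"] += 1
        if self.runt_every and self.counters["switch_count"] % self.runt_every == 0:
            self.counters["runt"] += 1

    def read_counters(self):
        return dict(self.counters)


if __name__ == "__main__":
    for dut in (FakeDut(), FakeDut(runt_every=400)):
        passed, delta = run_toggle_test(dut.select, dut.read_counters, 1000)
        print("PASS" if passed else "FAIL", delta)
```

Running the same script at each temperature and voltage corner keeps the pass/fail definition identical across stations.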

B) Applications (strictly within this page boundary)

Redundant reference clock trees

Keep critical endpoints alive during MAIN reference failure without creating illegal pulses that can lock up digital logic.

Maintenance / test bypass

Switch to test sources or alternate references for on-line validation and service, with manual override and event logging.

High availability (HA) platforms

Combine dual refs + automatic failover + alarms so the system shifts from “hard failure” to “recoverable event”.

Field diagnosability

Alarm pins and counters make intermittent switching explainable, enabling faster root-cause closure and production screening.

C) IC selection logic (decision filters → candidate class → example part numbers)

Selection should be layered: hard constraints first, then switching behavior, then clock quality, then monitoring, then control/integration. Treat part numbers below as representatives per class; always verify package, suffix, and measurement conditions.

C1) Hard constraints (filter)

Input count: 2:1 vs n:1 (and whether multiple independent channels are required).
Electrical standard: LVCMOS / LVDS / HCSL / LVPECL (avoid cross-standard switching unless supported).
Frequency range and duty constraints at the mux pins.
Outputs: single output vs multi-output (mux-only vs mux+fanout architecture).

C2) Switching behavior (target class)

L1: Glitch-free — no runt/double/missing, but phase step may be allowed.
L2: Gapless / hitless (practical) — continuous output with controlled switching transient.
L3: Near phase-continuous — requires alignment/holdover mechanics and tighter input comparability.

C3) Clock quality & monitors (rank + qualify)

Additive RMS jitter must match the same integration window used in the system budget.
Duty-cycle distortion matters most for LVCMOS paths.
Event-correlated spurs must be checked around switching events.
Prefer devices with LOS/LOL/frequency window + counters when field diagnosis is required.
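
The layered filter in C1 to C3 can be expressed as successive list filters. The sketch below uses invented candidates (`PART_A` to `PART_C`) with made-up figures purely for illustration; none of these values describe a real device, and the dictionary keys are assumptions.

```python
# Hypothetical candidates; all values are illustrative, not datasheet data.
CANDIDATES = [
    {"name": "PART_A", "inputs": 2, "standard": "LVDS", "f_max_hz": 800e6,
     "level": 1, "additive_jitter_fs": 120, "monitors": {"LOS"}},
    {"name": "PART_B", "inputs": 2, "standard": "LVDS", "f_max_hz": 400e6,
     "level": 2, "additive_jitter_fs": 300, "monitors": {"LOS", "FREQ"}},
    {"name": "PART_C", "inputs": 4, "standard": "LVDS", "f_max_hz": 700e6,
     "level": 3, "additive_jitter_fs": 150, "monitors": {"LOS", "LOL", "FREQ"}},
]


def select(candidates, *, inputs, standard, f_hz, min_level,
           jitter_budget_fs, required_monitors):
    """Apply hard constraints, then switching class, then quality and monitors."""
    hard = [c for c in candidates
            if c["inputs"] >= inputs
            and c["standard"] == standard
            and c["f_max_hz"] >= f_hz]
    behavior = [c for c in hard if c["level"] >= min_level]
    qualified = [c for c in behavior
                 if c["additive_jitter_fs"] <= jitter_budget_fs
                 and required_monitors <= c["monitors"]]  # subset test
    # Rank the survivors by additive jitter, lowest first
    return sorted(qualified, key=lambda c: c["additive_jitter_fs"])
```

Filtering before ranking matters: a part with excellent jitter that fails a hard constraint never reaches the ranking step.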

Concrete example part numbers (by candidate class)

Use these as starting points for datasheet lookup and bench verification. Final selection must be driven by the filters above (standard, frequency, jitter window, monitors, control, and power/EMI constraints).

Class L1 — Glitch-free 2:1 clock mux

Renesas 580-01 — glitch-free switching, clock detect; for redundant clock trees.

Class L2 — Glitch-free mux with “zero-delay” style regeneration / multi outputs

Renesas 581G-02LF (ICS581-02) — PLL-based glitch-free mux, zero delay input-to-output, multi low-skew outputs.

Class L2/L3 — Hitless input switching + monitor + distribution outputs

Renesas (IDT) 873996 — dynamic clock switch monitors both inputs, automatic switch to good clock, LVPECL outputs.

Class L3 — DPLL/DSPLL devices with hitless reference selection + holdover mechanics

Texas Instruments LMK05028 — DPLL-based network synchronizer with hitless switching + digital holdover options.
Microchip ZL30105 — DPLL with hitless reference switching behavior and holdover-related mechanics.
Skyworks Si5345 / Si5344 / Si5342 — DSPLL family with hitless input clock switching (manual/automatic) and monitoring.
Skyworks Si5348 — DSPLL family option when higher output flexibility is required (verify switching mode constraints).
Skyworks Si5386 — DSPLL family option for advanced timing trees (use when L3 mechanics and monitoring are required).

Verification reminder (avoid false comparisons)

For any candidate, align the measurement definition: switching transient metric (phase step/TIE), additive jitter integration window, and the exact I/O standard termination used on the PCB.

SVG 11 — Selection flow (candidate class)

FAQs (troubleshooting) — Glitch-Free Clock Mux

Each answer is intentionally short and executable. Use the four-line format to keep decisions measurable: Likely cause → Quick check → Fix → Pass criteria.

Why does a “glitch-free” mux still cause a noticeable phase step at switchover?

Likely cause: Glitch-free prevents illegal pulses, but does not guarantee phase continuity; A/B have Δf or drifting phase.

Quick check: Measure phase step |Δt| and TIE_peak around the event; verify Δf at mux pins ≤ ____.

Fix: Use/enable windowed switching or “wait-for-safe-window”; downgrade requirement to L2 (allow bounded phase step) if inputs cannot be aligned.

Pass criteria: |Δt|max ≤ ____, TIE_peak ≤ ____ (window ____), and runt/double/missing = 0.
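
To see why a frequency offset makes phase alignment a moving target, the drift can be worked out directly. The helper names below are hypothetical, and the arithmetic assumes a constant offset Δf between the two inputs expressed in ppm.

```python
def phase_drift_rate(offset_ppm):
    """Relative phase drift in seconds of phase per second of elapsed time."""
    return offset_ppm * 1e-6


def beat_period_s(f_hz, offset_ppm):
    """Time for the A/B phase relationship to sweep one full clock period."""
    delta_f_hz = f_hz * offset_ppm * 1e-6
    return 1.0 / delta_f_hz


def safe_window_dwell_s(window_s, offset_ppm):
    """How long the phase difference stays inside +/- window_s per beat."""
    return 2.0 * window_s / phase_drift_rate(offset_ppm)


# Illustrative numbers only: 100 MHz inputs that differ by 10 ppm
print(beat_period_s(100e6, 10))          # roughly 1 ms per full phase sweep
print(safe_window_dwell_s(100e-12, 10))  # roughly 20 us inside a +/-100 ps window
```

With these example numbers a wait-for-safe-window scheme gets one short opportunity per millisecond, which bounds both the achievable phase step and the added switchover latency.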

Why do I see occasional double clocks during failover only at cold temperature?

Likely cause: Cold changes edge rate/swing/duty or worsens reflections, causing qualifier/gating mis-detection during switching.

Quick check: At cold, probe at mux input pins: swing/overshoot/duty; check alarm/counter for rapid LOS/qualify toggling.

Fix: Correct termination/return path and reduce stubs; add debounce (N_bad) and hold-off to prevent borderline chatter.

Pass criteria: Cold corner: toggle X=____ times; double=0, runt=0, missing=0, and counters show no repeated back-to-back switches.

Switching looks fine on the scope, but the FPGA sometimes miscounts—what trigger should be used?

Likely cause: Rare narrow pulses are missed by normal triggering; probing method creates false confidence.

Quick check: Use pulse-width / runt / dropout triggers + persistence; measure at the receiver/termination point, not a stub.

Fix: Use proper differential probing and short return; add an event counter (or FPGA edge counter) to correlate with failover events.

Pass criteria: Over N_switch=____ events: trigger hits=0, FPGA miscounts=0, and waveform failure counters remain 0.
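
When edge timestamps are available (from a scope export or an FPGA time-tagger), the capture can be screened offline. The sketch below assumes rising-edge timestamps in nanoseconds and a hypothetical 25 % tolerance; both the function name and the tolerance are illustrative choices.

```python
def classify_periods(rising_edges_ns, nominal_period_ns, tol=0.25):
    """Count edge-to-edge intervals that fall outside nominal +/- tol.

    Short intervals are candidates for runt pulses or double clocks;
    long intervals are candidates for missing or stretched cycles.
    """
    short = 0
    long_ = 0
    low = nominal_period_ns * (1.0 - tol)
    high = nominal_period_ns * (1.0 + tol)
    for earlier, later in zip(rising_edges_ns, rising_edges_ns[1:]):
        period = later - earlier
        if period < low:
            short += 1
        elif period > high:
            long_ += 1
    return {"short": short, "long": long_}


# Example capture around a switchover (made-up timestamps, 10 ns nominal period)
edges = [0, 10, 20, 23, 30, 40, 60, 70]
print(classify_periods(edges, 10))  # {'short': 2, 'long': 1}
```

An extra edge produces two short intervals in a row, so counts should be read as interval anomalies rather than as a one-to-one pulse count.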

Why does revertive switching “flap” between main and backup even though both clocks look present?

Likely cause: No hysteresis/soak on return; quality gate is too permissive, so marginal inputs cause oscillation at thresholds.

Quick check: Inspect alarm/counter logs for frequent “good/bad” toggles; confirm T_holdoff and T_soak are non-zero.

Fix: Add hysteresis + hold-off after each switch; require “good for T_soak + N_good” before revert; use non-revertive if needed.

Pass criteria: Under disturbance/temperature sweep: switch_rate ≤ ____, no back-to-back switches within T_holdoff=____.
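
The pass criteria for flapping can be checked from a log of switch timestamps. This is a small sketch with assumed inputs: a sorted list of switch times in seconds, the configured T_holdoff, and the total observation time.

```python
def flap_report(switch_times_s, t_holdoff_s, observation_s):
    """Summarize switch rate and hold-off violations from a switch log."""
    gaps = [later - earlier
            for earlier, later in zip(switch_times_s, switch_times_s[1:])]
    violations = sum(1 for gap in gaps if gap < t_holdoff_s)
    rate_per_hour = len(switch_times_s) * 3600.0 / observation_s
    return {"switch_rate_per_hour": rate_per_hour,
            "holdoff_violations": violations}


# Made-up log: four switches in two hours, two of them only 2 s apart
print(flap_report([10, 12, 500, 4000], t_holdoff_s=60, observation_s=7200))
# {'switch_rate_per_hour': 2.0, 'holdoff_violations': 1}
```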

What is the first check when only some outputs fail after switching (same mux, same source)?

Likely cause: Branch-level differences (termination, stub, load, routing) after the mux; not the mux core.

Quick check: Compare “pass vs fail” branch: termination location/value, stubs, return path discontinuities; probe at each receiver.

Fix: Normalize termination and remove stubs; relocate mux/fanout hierarchy so switching happens before divergent routing where possible.

Pass criteria: All outputs pass the same morphology + timing windows (runt/double/missing=0; |ΔT|≤____; |Δt|≤____).

Can clocks of different standards (e.g., LVCMOS ↔ LVDS) be switched without glitches?

Likely cause: Cross-standard switching violates threshold/termination/common-mode assumptions and often breaks qualification logic.

Quick check: Confirm the device explicitly supports dual-standard inputs/translation; verify both paths meet the same input-qualify limits at the pins.

Fix: Convert to a single standard before the mux (recommended); or choose a mux explicitly designed for that standard combination and re-validate.

Pass criteria: At mux pins both inputs meet qualify limits; switching shows 0 illegal pulses and bounded |Δt|≤____ across corners.

Why does additive jitter look worse after inserting a mux even when the datasheet seems small?

Likely cause: Different jitter integration windows or measurement modes; switch-correlated spurs inflate apparent jitter.

Quick check: Re-measure using the same f1–f2 window and same instrument settings; separately check top spurs around switch events.

Fix: Align measurement definitions; improve control/power isolation to the mux; choose a lower-additive class if ΔJ remains over budget.

Pass criteria: ΔJ_RMS(additive)≤____ with identical f1–f2; no new event-correlated spur above ____ dBc.
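
Additive jitter is commonly estimated by root-sum-square subtraction of the input measurement from the output measurement. The sketch below assumes the two contributions are uncorrelated and that both values were integrated over the same f1 to f2 window; the function name is hypothetical.

```python
import math


def additive_jitter_rms(j_out, j_in):
    """RSS estimate of the jitter added by the mux (same units as the inputs).

    Returns 0.0 when the output reads lower than the input, which usually
    means the difference is below the measurement noise floor.
    """
    if j_out <= j_in:
        return 0.0
    return math.sqrt(j_out ** 2 - j_in ** 2)


# Illustrative values in femtoseconds
print(additive_jitter_rms(500.0, 300.0))  # 400.0
```

If the two measurements use different integration windows, the subtraction is meaningless, which is one way a small datasheet number can appear to grow on the bench.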

How should debounce/qualification time be set to avoid false failover but still meet availability targets?

Likely cause: One set of timers is incorrectly used for both “switch away” and “switch back,” causing either false failover or slow recovery.

Quick check: Measure the duration distribution of real disturbances; log N_bad/N_good events and compare to timer settings.

Fix: Use fast qualify for hard faults (LOS/LOL) and slower soak+quality gate for revert; tune N_bad, N_good, T_soak, T_holdoff independently.

Pass criteria: False failover rate ≤____, max outage ≤____, and flapping=0 under the defined disturbance profile.
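
Independent timers for "switch away" and "switch back" can be sketched as a small state machine. Everything here is an assumption for illustration: the class and parameter names are hypothetical, BACKUP is assumed healthy, and hold-off is applied in both directions (a real design may bypass it for hard LOS).

```python
class FailoverController:
    """Asymmetric qualification: fast failover, slow revert."""

    def __init__(self, n_bad, n_good, t_soak_s, t_holdoff_s):
        self.n_bad = n_bad
        self.n_good = n_good
        self.t_soak_s = t_soak_s
        self.t_holdoff_s = t_holdoff_s
        self.selected = "MAIN"
        self.bad_run = 0         # consecutive bad samples on MAIN
        self.good_run = 0        # consecutive good samples on MAIN
        self.good_since = None   # time MAIN last became continuously good
        self.last_switch = float("-inf")

    def _switch(self, target, now_s):
        self.selected = target
        self.last_switch = now_s

    def step(self, now_s, main_ok):
        """Feed one monitor sample for MAIN; return the selected input."""
        if main_ok:
            self.bad_run = 0
            self.good_run += 1
            if self.good_since is None:
                self.good_since = now_s
        else:
            self.good_run = 0
            self.good_since = None
            self.bad_run += 1

        in_holdoff = (now_s - self.last_switch) < self.t_holdoff_s
        if self.selected == "MAIN":
            if self.bad_run >= self.n_bad and not in_holdoff:
                self._switch("BACKUP", now_s)
        else:
            soaked = (self.good_since is not None
                      and now_s - self.good_since >= self.t_soak_s)
            if soaked and self.good_run >= self.n_good and not in_holdoff:
                self._switch("MAIN", now_s)
        return self.selected
```

Keeping N_bad small and the revert conditions (N_good, T_soak, T_holdoff) large is what separates outage time from flapping risk, so the two can be tuned against the measured disturbance distribution independently.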

Why does the backup clock pass frequency checks but still cause downstream link errors after switchover?

Likely cause: Frequency windowing is necessary but insufficient; phase transient, jitter spectrum, or event spurs exceed downstream tolerance.

Quick check: Measure |Δt|/TIE_peak at switchover and compare spurs/jitter with identical settings; check if revert is gated by “quality OK.”

Fix: Add quality-gate before using/reverting to a source; keep backup in warm-standby if supported; tighten termination/routing for sensitive endpoints.

Pass criteria: Link errors=0 and all defined windows pass: |Δt|≤____, TIE_peak≤____, ΔJ≤____, top spur≤____ dBc.

What is the simplest production test to prove “no runt pulse / no missing cycle” at scale?

Likely cause: Production relies on scope screenshots instead of event-driven triggers and counters, so rare failures are missed.

Quick check: Use a pulse-width/runt trigger with persistence and log hit_count while toggling MAIN↔BACKUP X times.

Fix: Standardize a minimal script: X toggles + corner sweep (temp/voltage) + automated pass/fail counters; keep deeper jitter/PN as audit sampling.

Pass criteria: Over X=____ toggles: runt=0, double=0, missing=0, and event counters match expected totals.

Why does enabling SSC on one source break hitless switching?

Likely cause: SSC introduces intentional FM so A/B no longer stay within a stable Δf/phase window, defeating “safe-window” assumptions.

Quick check: Confirm SSC depth/rate and observe phase difference trend (rapid drift/sawtooth); verify Δf_window is still satisfied during modulation.

Fix: Disable SSC on hitless paths; or apply matched SSC to both sources and re-validate with updated Δf/Δt windows.

Pass criteria: During SSC operation: runt/double/missing=0 and phase/TIE windows remain within limits (|Δt|≤____, TIE≤____).
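
A quick feasibility check compares the peak SSC frequency offset with the allowed Δf window. The sketch assumes the other source is unmodulated at the nominal frequency; the function names and the 0.5 % example depth are illustrative, and the window value is a placeholder.

```python
def ssc_peak_offset_ppm(spread_percent, center_spread=False):
    """Peak frequency offset from nominal caused by SSC, in ppm.

    Down-spread swings from nominal to nominal * (1 - spread);
    center-spread swings +/- spread / 2 around nominal.
    """
    peak_ppm = spread_percent * 1e4  # 1 % equals 10 000 ppm
    return peak_ppm / 2 if center_spread else peak_ppm


def ssc_fits_window(spread_percent, window_ppm, center_spread=False):
    """True if the peak SSC offset stays inside the allowed delta-f window."""
    return ssc_peak_offset_ppm(spread_percent, center_spread) <= window_ppm


print(ssc_peak_offset_ppm(0.5))                  # 5000.0
print(ssc_fits_window(0.5, window_ppm=100))      # False
```

An offset of thousands of ppm against a window of tens or hundreds of ppm shows why SSC on only one source defeats safe-window switching.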

How can rare field events (brownout, intermittent LOS) be logged and diagnosed without a lab scope?

Likely cause: Lack of observability (no sticky status, no counters, no timestamped events) makes intermittent failures look “random.”

Quick check: Verify availability of alarm pins/status bits and counters (switch_count/fail_count); confirm MCU can timestamp interrupts/events.

Fix: Enable sticky flags + counters; log {timestamp, reason code, selected path, supply status} on every alarm/switch; add brownout detection gating.

Pass criteria: Field log reconstructs each event (time + reason + path) and shows bounded switch_rate and zero illegal pulses over ____ hours.
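
A field log of this kind might look like the sketch below. The record fields follow the list in the fix line ({timestamp, reason code, selected path, supply status}); the class names, reason strings, and JSON-lines output format are assumptions for illustration.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class ClockEvent:
    timestamp_s: float
    reason: str          # e.g. "LOS", "LOL", "FREQ_WINDOW", "BROWNOUT", "MANUAL"
    selected_path: str   # input selected after the event
    supply_ok: bool


class EventLog:
    def __init__(self):
        self.events = []
        self.sticky_flags = set()  # stay set until explicitly cleared
        self.alarm_count = 0
        self.switch_count = 0

    def record(self, timestamp_s, reason, selected_path, supply_ok, switched):
        """Log one alarm; `switched` says whether it caused an input change."""
        self.events.append(ClockEvent(timestamp_s, reason, selected_path, supply_ok))
        self.sticky_flags.add(reason)
        self.alarm_count += 1
        if switched:
            self.switch_count += 1

    def switch_rate_per_hour(self, observation_s):
        return self.switch_count * 3600.0 / observation_s

    def to_jsonl(self):
        """One JSON object per line, suitable for later offline analysis."""
        return "\n".join(json.dumps(asdict(event)) for event in self.events)
```

Logging the supply status alongside each alarm lets a brownout-induced LOS be separated from a genuine reference failure during later analysis.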
