
Synchronous Ethernet (SyncE) for Carrier & 5G Backhaul


Synchronous Ethernet (SyncE) distributes a stable frequency across Ethernet networks so carrier backhaul stays traceable and resilient. This page shows how to design, configure, and verify the full chain—from PHY recovery and jitter filtering to QL/SSM policy, protection switching, and holdover—using measurable pass criteria.

What is SyncE (Synchronous Ethernet) — and what it is NOT

Synchronous Ethernet (SyncE) is frequency distribution over Ethernet: a node recovers a stable line-derived clock at the PHY and uses it as a disciplined frequency reference for its local clock tree. It improves network-wide frequency coherence across hops—especially valuable in carrier/5G backhaul where ppm-level drift and noisy clock recovery can cascade into timing alarms and service instability.

What SyncE provides (in-scope)
  • A frequency base delivered hop-by-hop via PHY recovered timing (line clock → recovered clock → disciplined local outputs).
  • A controllable jitter/wander profile when paired with filtering (tracker vs cleaner partitioning) and well-defined pass criteria.
  • Operational traceability hooks (quality/alarms/holdover behavior) so timing degradation is visible and actionable.
What SyncE does NOT provide (out-of-scope)
  • Not absolute time-of-day (no “wall-clock time” delivery).
  • Not phase/time alignment by itself (phase/time convergence belongs to time protocols and external references).
  • Not a substitute for full timing budget ownership (it is one layer; filtering, switching, and observability still decide outcomes).
Engineering hooks (how to think about problems)
Problem classification
If the dominant symptom is frequency drift / frequency-quality alarms, the first-order lever is SyncE (recovery + filtering + quality selection). If the symptom is phase/time alignment, SyncE is typically the frequency base while phase/time convergence is handled elsewhere.
What “success” looks like
Validation is not a single number: it is (a) stable lock behavior, (b) controlled jitter/wander transfer, and (c) predictable switching/holdover transients with logs that allow root-cause replay.
Diagram: Timing stack (Frequency / Phase-Time / Absolute reference)
Three-layer timing stack: SyncE provides the frequency base, PTP handles phase/time, and GNSS/PRC provides the absolute reference. The frequency layer holds the PHY recovered clock, EEC, jitter/wander filter, and holdover; the phase/time layer holds hardware timestamping, servo, and delay correction. Solid lines mark base dependencies; dashed lines mark external references.

Where SyncE sits in a carrier timing chain (EEC/SEC/SSU roles)

In carrier networks, SyncE is not “a feature on a port”—it is a system timing chain. A high-quality upstream reference is transported over Ethernet links, recovered at each node, filtered, selected against alternatives, and redistributed to local endpoints. Carriers require traceability (knowing what is locked to what), switchability (fault isolation and fallback), and observability (alarms and logs that explain degradations).

Roles as actions (not definitions)
EEC (equipment clock)
  • Does: recover line frequency, filter it, and output a controlled local frequency reference.
  • Verify by: lock status + frequency-quality indication + jitter/wander transfer consistency under stress.
SEC / SSU (system selection & distribution)
  • Does: choose the best source, enforce quality policy, and distribute clocks to all local domains.
  • Verify by: stable source selection (no flapping) + deterministic switching + complete logs for cause/effect.
Holdover (controlled degradation mode)
  • Does: keep frequency stable enough when the reference is lost, then re-acquire without instability.
  • Verify by: drift trend across temperature/time + clean re-acquisition transient + bounded alarm behavior.
Why carriers insist on traceability & switchability
  • Fault isolation: upstream degradation should not silently poison many downstream nodes.
  • SLA protection: switching policy and holdover behavior are part of service continuity, not optional tuning.
  • Operational replay: alarms and logs must reconstruct “what changed” (source/quality/lock) when a timing event occurs.
The chain view that prevents design gaps

Treat the system as a repeated unit: recover → filter → select → distribute → monitor. Each hop can add noise and can also filter noise; the engineering job is to decide who owns which noise band and to make failures observable (lock, quality, switching, and holdover traces).

Minimal “must-log” fields (chapter-wide anchor)
  • Selected source ID + current quality level (QL/quality state)
  • Lock state timeline (locked → holdover → reacquire)
  • Switch reason + holdoff timers + revertive policy state
  • Key performance snapshots (output jitter/wander checks at defined points)
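The must-log fields above can be captured as one structured, replayable record. A minimal Python sketch, assuming nothing about any vendor API (all class and field names are illustrative):

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional

class LockState(Enum):
    LOCKED = "locked"
    HOLDOVER = "holdover"
    REACQUIRE = "reacquire"

@dataclass
class TimingLogRecord:
    """One replayable record covering the chapter-wide must-log fields."""
    timestamp: float                      # local time base
    source_id: str                        # selected source ID
    quality_level: str                    # current QL/quality state, e.g. "QL-PRC"
    lock_state: LockState                 # locked -> holdover -> reacquire timeline
    switch_reason: Optional[str] = None   # None when no switch occurred
    holdoff_active: bool = False          # holdoff timer running
    wtr_active: bool = False              # wait-to-restore timer running
    revertive: bool = True                # revertive policy state
    perf_snapshot: dict = field(default_factory=dict)  # jitter/wander checks at TPs

rec = TimingLogRecord(timestamp=0.0, source_id="port-1",
                      quality_level="QL-PRC", lock_state=LockState.LOCKED)
```

A record like this per selection event, plus periodic state samples, is enough to replay "what changed" after a timing incident.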
Diagram: End-to-end carrier timing chain (recover → filter → select → distribute → monitor)
A reference source (PRC/GM) feeds SyncE transport links; each node (aggregation, DU/RU) recovers timing at the PHY/CDR, filters jitter in a PLL/cleaner, selects the best source by QL, distributes clocks via fanout to endpoints, and reports alarms to monitoring/NOC. Solid lines are the clock path; dashed lines are alarms/status to operations. Every hop repeats recover → filter → select → distribute.

SyncE architecture model: Source → Link → Recovery → Distribution

A SyncE design is best treated as a repeatable chain: source quality enters the system, the Ethernet link transports line timing, the node recovers frequency at the PHY/PLL boundary, and the board distributes disciplined clocks to local endpoints. This model prevents gaps: every block must deliver both clock and truth (status) for operations.

The chain, by responsibility (role → output → risk)
Source (XO / TCXO / OCXO / GNSSDO)
  • Delivers: frequency stability and holdover potential.
  • Exposes: temperature/aging behavior that becomes long-term drift.
  • Primary risk: “good on bench, bad in field” due to thermal gradients and aging.
Link (Ethernet PHY line timing)
  • Delivers: transport of frequency over physical-layer timing.
  • Exposes: hop-to-hop noise accumulation and path dependency.
  • Primary risk: upstream degradation silently propagating downstream.
Recovery (CDR / PLL)
  • Delivers: recovered frequency plus lock/quality state.
  • Exposes: transfer behavior (what noise is tracked vs rejected).
  • Primary risk: “locked but unqualified” behavior without usable status.
Distribution (cleaner / fanout / endpoints)
  • Delivers: disciplined clocks (e.g., 25 / 125 / 156.25 MHz) with controlled skew.
  • Exposes: board coupling (power/EMI/return paths) that can re-inject noise.
  • Primary risk: the clock tree becoming a noise amplifier and skew generator.
Three knobs that decide outcomes (configure → verify → operationalize)
Knob #1 — Loop bandwidth (tracker vs cleaner)
  • Decision: what noise is tracked vs attenuated.
  • Bad sign: faster lock but worse output jitter, or quiet output but poor tracking.
  • Verify: compare recovered clock vs cleaner output under controlled disturbances.
Knob #2 — SSM/QL policy (selection & loop avoidance)
  • Decision: priority, holdoff timers, revertive behavior.
  • Bad sign: source flapping or timing loops after adding redundancy.
  • Verify: event logs reconstruct “source → quality → switch reason” unambiguously.
Knob #3 — Holdover (controlled degradation)
  • Decision: drift bounds, temperature behavior, re-acquisition behavior.
  • Bad sign: acceptable minutes-level drift but hours-level divergence in the field.
  • Verify: long-soak drift logs + temperature sweep with repeatable entry/exit criteria.
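The holdover knob ultimately reduces to a budget calculation: how much time error accumulates over the required window. A minimal sketch under a deliberately simple model (constant initial frequency error plus a linear drift ramp; all numbers are placeholders, not oscillator data):

```python
def holdover_time_error_us(init_ferr_ppb, drift_ppb_per_s, hold_s):
    """Worst-case accumulated time error (microseconds) during holdover.

    Model: fractional frequency error = init_ferr_ppb + drift_ppb_per_s * t,
    so time error is the integral of that over the holdover window.
    """
    ppb = 1e-9
    err_s = (init_ferr_ppb * ppb * hold_s
             + 0.5 * drift_ppb_per_s * ppb * hold_s ** 2)
    return err_s * 1e6
```

For example, 10 ppb of initial error plus 0.01 ppb/s of drift accumulates roughly 100 µs over one hour; that number either fits the service budget or bounds the allowed holdover duration.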
Diagram: Reusable SyncE module chain (with knobs and probe points)
Block diagram: the reference input (XO/TCXO/OCXO) enters a tracker PLL (CDR/recovery with lock + status), then a cleaner PLL (jitter filter with holdover mode), then a low-skew multi-output fanout to endpoints (PHY, SoC, FPGA at 25 / 125 / 156.25 MHz). Three knobs are marked: loop bandwidth (tracker vs cleaner), SSM/QL policy, and holdover. A monitor/log block collects source + QL, lock state, switch reason, and holdover state; probe points TP1–TP3 sit along the clock path. Solid = clock path; dashed = control/status.

Performance targets that matter: jitter vs wander, masks, and pass criteria

SyncE acceptance must be written as measurable behaviors, not abstract numbers. The key is separating jitter (short-term phase variations) from wander (slow frequency/phase drift), then assigning ownership across the chain (recovery vs cleaning vs distribution). Pass criteria should be derived from network budgets and equipment class, and expressed as repeatable tests.

Risk mapping (what breaks when each is wrong)
Jitter (short-term)
  • Typical symptom: intermittent lock instability, timing-quality alarms, endpoint tolerance issues.
  • Common causes: bandwidth partitioning mistakes, power/EMI coupling into PLLs, distribution re-injection.
  • Most useful check: compare recovered clock vs cleaner output under the same stimulus to locate ownership.
Wander (long-term)
  • Typical symptom: slow degradation, holdover drift, quality downgrades after hours or temperature changes.
  • Common causes: oscillator thermal gradients, aging model mismatch, policy-driven long settling after switching.
  • Most useful check: long-soak drift logs correlated with temperature and source-selection events.
Minimum acceptance test set (write criteria as “measure point + stimulus + pass”)
1) Output jitter checks
  • Measure point: cleaner output (TP2) and one endpoint clock (TP3).
  • Stimulus: normal operation + induced supply/EMI stress representative of deployment.
  • Pass: jitter stays within the allocated budget for the equipment class and endpoint tolerance.
2) Wander / drift statistics
  • Measure point: local frequency output used as timing reference (TP2/TP3).
  • Stimulus: long soak + temperature sweep + reference-loss entry into holdover.
  • Pass: drift stays bounded by the service budget over the required holdover window.
3) Lock / settling
  • Measure point: lock-state timeline and performance snapshots at defined instants.
  • Stimulus: link up/down, source change, and controlled degradations.
  • Pass: predictable settling without alarm flapping, consistent across temperature and lots.
4) Switchover transient
  • Measure point: the same output clock before/during/after switching (TP2/TP3).
  • Stimulus: forced primary loss and controlled revertive transitions.
  • Pass: transient stays within the network’s permitted window and is repeatable.
How to write pass criteria without hardcoding numbers
  • Use placeholders tied to ownership: X = network jitter budget, Y = equipment class requirement, Z = endpoint tolerance.
  • Express criteria as behavior: “After switching, output frequency error settles within ±X in Y seconds, with no quality flapping.”
  • Always bind criteria to measure point (TP2/TP3) and stimulus (temperature, link loss, forced switch).
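The behavior-style criterion above can be written directly as a check over logged samples. A hedged sketch in which `budget_x` and `settle_y` stand in for the placeholders X and Y (no real budget numbers implied):

```python
def switchover_passes(times, freq_err, quality_states, budget_x, settle_y):
    """Pass iff, after a switch at t=0, |frequency error| stays within
    +/- budget_x from settle_y seconds onward, and the quality state does
    not flap (at most one change over the whole record)."""
    settled = all(abs(e) <= budget_x
                  for t, e in zip(times, freq_err) if t >= settle_y)
    changes = sum(1 for a, b in zip(quality_states, quality_states[1:])
                  if a != b)
    return settled and changes <= 1
```

Feeding the same logged record with different `budget_x` values shows why the placeholder must come from the budget owner: a settling transient that passes a loose network budget can fail a tight endpoint tolerance.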
Diagram: Wander vs jitter zones and cascaded filtering (shape-only)
A log-frequency axis split into a wander zone (low offsets) and a jitter zone (high offsets), with transfer/attenuation shown as shape only. Two curves mark the tracker transfer and the cleaner attenuation with their bandwidth markers: the tracker owns tracking, the cleaner owns cleaning.

Clock recovery in Ethernet PHY: what can and cannot be controlled

In SyncE systems, the Ethernet PHY is the frequency recovery boundary: line timing enters on the left, a CDR/PLL recovers frequency, and the result is either consumed internally or exported as a recovered clock accompanied by status/alarms. Engineering success depends on separating knobs (what is configurable and observable) from limits (what is dictated by PHY architecture and board coupling).

The recovery chain (clock + truth)
  • Line timing in: frequency is embedded in data transitions; SyncE uses physical-layer timing, not packets.
  • CDR/PLL recovery: the PHY recovers frequency and maintains lock under defined jitter tolerance.
  • Recovered clock out: may be internal-only or exported (path and output standard matter).
  • Status/alarms: lock/quality indicators are mandatory for operations, switching, and postmortems.
Knobs: what can be configured and validated
Knob #1 — Recovered clock export path & output format
  • Decision: internal-only vs exported, and LVCMOS/LVDS/HCSL/LVPECL as applicable.
  • Why: mux and pin routes can inject noise; exported clocks may differ from internal clocks.
  • Verify: compare TAP (near PHY) vs system receive point under the same stimulus.
Knob #2 — PHY mode / performance grade
  • Decision: recovery behavior, lock criteria, and any jitter-optimized modes the PHY provides.
  • Why: the transfer behavior can change substantially between modes.
  • Verify: measure output jitter and lock robustness trends after mode changes.
Knob #3 — Reference input requirements and isolation
  • Decision: reference quality, routing, filtering, and power domain isolation for PHY timing.
  • Why: reference pollution can lift the recovered-clock noise floor directly.
  • Verify: isolate/swap reference and check whether recovered-clock metrics improve coherently.
Knob #4 — DPLL/control interface and observability
  • Decision: what can be read (lock/quality/freq offset) and what can be set (holdover entry/exit, priorities).
  • Why: without truth signals, field failures become blind tuning.
  • Verify: define minimum log schema: source ID, quality, lock state, switch reason, holdover state.
Limits: what cannot be tuned away
  • Architecture ceiling: recovered-clock jitter floor and jitter tolerance are PHY-dependent; configuration changes have bounded impact.
  • Board sensitivity: supply noise, ground bounce, and crosstalk can dominate high-frequency jitter even with a clean upstream source.
  • Export path penalties: muxing and pin routing may add noise; internal clocks can differ from exported clocks.
  • Black-box behavior: lock/quality alarm policies can be opaque; reliable diagnosis requires status plus controlled stimuli.
Practical takeaway: recovered clocks are often best treated as tracker inputs, while a downstream cleaner is used for final clock delivery.
Diagram: Simplified PHY recovery path (line → CDR/PLL → recovered clock + status)
Block diagram: line input enters the PHY receiver, passes through CDR and PLL (with mode, reference input, and DPLL interface), and produces a recovered clock output plus status/alarms (lock, QL, LOS) to a monitor/log block. An output mux and a measurement tap (TAP) sit on the export path. Solid = clock path; dashed = status/measurement.

Jitter filtering strategy: tracker vs cleaner loop bandwidth partitioning

Two-loop SyncE designs work when each loop has a clear job: the tracker follows upstream frequency (wider bandwidth), while the cleaner suppresses random jitter and isolates noise (narrower bandwidth). The system must enforce ownership—who passes, who attenuates, and where noise is re-added—and must avoid loop contention that turns “locked” into “unstable in the field.”

Roles and bandwidth intent (no formulas, just outcomes)
Tracker loop (wider BW)
  • Purpose: follow upstream frequency and maintain SyncE continuity.
  • Expected behavior: pass slow variations while avoiding unnecessary high-frequency noise transfer.
  • Common misuse: using tracker output as final delivery clock without downstream cleaning.
Cleaner loop (narrower BW)
  • Purpose: reduce random jitter and decouple upstream noise from local distribution.
  • Expected behavior: avoid chasing slow disturbances that belong to tracking/selection policy.
  • Common misuse: setting cleaner BW too wide, turning it into a noisy follower.
Ownership model: Pass / Attenuate / Add
  • Pass: slow frequency variations that must be tracked to stay aligned to the timing chain.
  • Attenuate: random jitter that should not reach distribution and endpoints.
  • Add: board-level coupling and distribution artifacts that can re-inject noise even after cleaning.
Typical failures (symptom → fast localization → fix direction)
Failure #1 — bandwidth contention (“both loops chase the same band”)
  • Symptom: lock holds, but alarms flap; switching causes overshoot or ringing.
  • Fast check: tracker output and cleaner output move in the same direction under the same stimulus.
  • Fix direction: enforce bandwidth hierarchy and clear ownership boundaries.
Failure #2 — coupled oscillation (stable loops, unstable cascade)
  • Symptom: periodic modulation appears after changes or during holdover transitions.
  • Fast check: lock stays true while frequency error or phase trend oscillates.
  • Fix direction: reduce coupling paths (control, power, policy), and widen separation of dynamic responses.
Failure #3 — fast lock, bad jitter (cleaner too wide)
  • Symptom: lock time improves but output jitter exceeds budget or endpoints lose tolerance margin.
  • Fast check: cleaner output “follows” upstream disturbances too closely.
  • Fix direction: narrow cleaner bandwidth and keep it as a true jitter attenuator.
Failure #4 — clean output, cannot track (tracker too narrow or policy misfit)
  • Symptom: good jitter at steady state, but drops lock or downgrades quality under drift/switch events.
  • Fast check: tracker output cannot accommodate wander-like changes without entering alarms/holdover.
  • Fix direction: widen tracker tracking intent and align policy timers with loop dynamics.
Minimum verification loop (stimulus → tap trend → pass template)
  • Wander-like stimulus: tracker should reflect tracking; cleaner should not over-chase.
  • Jitter-like stimulus: cleaner should attenuate; endpoint should gain margin vs recovered clock.
  • Switching stimulus: transients remain within budget-derived placeholders and do not cause alarm flapping.
  • Pass template: under defined stimuli, tracker and cleaner outputs must show distinct ownership trends rather than co-moving.
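The distinct-ownership trend can be demonstrated with a toy simulation: model each loop as a first-order low-pass (a crude stand-in for a PLL transfer shape) and pass a slow wander-like tone and a fast jitter-like tone through the cascade. All bandwidths, tone frequencies, and amplitudes below are illustrative, not design values:

```python
import math

def one_pole(xs, f_bw, fs):
    """First-order low-pass; a simplified stand-in for a PLL transfer."""
    a = 1.0 - math.exp(-2.0 * math.pi * f_bw / fs)
    y, out = 0.0, []
    for x in xs:
        y += a * (x - y)
        out.append(y)
    return out

def rms(xs):
    return math.sqrt(sum(x * x for x in xs) / len(xs))

fs = 10_000.0                                   # sample rate for the toy model
t = [n / fs for n in range(int(fs * 4))]        # 4 s of samples
wander = [math.sin(2 * math.pi * 0.5 * tt) for tt in t]        # slow, drift-like
jitter = [0.2 * math.sin(2 * math.pi * 200.0 * tt) for tt in t]  # fast, noise-like

TRACKER_BW, CLEANER_BW = 50.0, 2.0              # illustrative bandwidth hierarchy

# Linearity lets us pass each component separately and compare ownership.
w_track = one_pole(wander, TRACKER_BW, fs)
w_clean = one_pole(w_track, CLEANER_BW, fs)
j_track = one_pole(jitter, TRACKER_BW, fs)
j_clean = one_pole(j_track, CLEANER_BW, fs)
```

With these numbers the cascade passes nearly all of the wander to the cleaner output while attenuating the jitter tone far more at the cleaner than at the tracker: exactly the "distinct ownership trends rather than co-moving" signature the pass template asks for.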
Diagram: Cascaded PLL chain with noise ownership and bandwidth boundaries
The upstream reference (recovered clock) passes to the tracker PLL (follows low-frequency content), then to the cleaner PLL (filters high-frequency content), then to the outputs (PHY). Ownership labels mark Pass at the tracker, Attenuate at the cleaner, and Add at board-coupling points, with bandwidth boundary markers for each loop.

Quality Levels (QL), SSM/ESMC, and avoiding timing loops

In SyncE networks, frequency is selected and propagated through many nodes. Quality Levels (QL) carried by SSM/ESMC exist to make source selection deterministic: every node can tell which reference is more trustworthy, when to downgrade, and when to switch. Without consistent QL policy, timing loops can form—nodes indirectly reference each other—leading to drift-like behavior, alarm flapping, and unstable switching.

What QL/SSM/ESMC provides (engineering meaning)
  • QL: a network-visible trust label for frequency references, enabling consistent “best source” decisions.
  • SSM/ESMC: a transport mechanism for QL so downstream nodes see upgrades/downgrades in real time.
  • Operational value: prevents “link-up but wrong source” scenarios and makes failures isolatable by policy and logs.
Configuration panel (setting → risk → pass intent)
Input priority table
  • Setting: ordered list of eligible inputs.
  • Risk: priority-only selection can choose a high-priority but low-trust source.
  • Pass intent: selection must satisfy both priority and minimum QL gate.
QL gate (allow/deny)
  • Setting: acceptable QL set for “valid reference”.
  • Risk: loose gating spreads degraded sources downstream.
  • Pass intent: downgrades must propagate consistently; no local “false high quality”.
Revertive vs non-revertive
  • Setting: whether to return to primary after recovery.
  • Risk: immediate revertive behavior can cause repeated disturbances and flapping.
  • Pass intent: any return-to-primary must satisfy holdoff/WTR stability windows.
Holdoff timers & alarm gating
  • Setting: delay before switching, recovery wait, and alarm suppression window.
  • Risk: too short → flapping; too long → slow fault isolation and recovery.
  • Pass intent: switch triggers require persistent evidence, not transient events.
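The first two settings combine into one deterministic decision: a source is eligible only if it passes the QL gate, and priority breaks ties only among eligible sources. A minimal sketch (the QL rank table is an illustrative subset, and the tie-break order is one possible policy, not a standard mandate):

```python
# Illustrative QL ranking; higher number = more trustworthy. QL-DNU ("do not
# use") ranks lowest, so any sensible gate excludes it automatically.
QL_RANK = {"QL-PRC": 4, "QL-SSU-A": 3, "QL-SSU-B": 2, "QL-SEC": 1, "QL-DNU": 0}

def select_source(candidates, ql_gate="QL-SEC"):
    """candidates: list of (source_id, priority, ql); lower priority number
    is preferred. Selection must satisfy BOTH priority and the QL gate."""
    eligible = [c for c in candidates
                if QL_RANK.get(c[2], 0) >= QL_RANK[ql_gate]]
    if not eligible:
        return None  # no valid reference: the holdover policy takes over
    return min(eligible, key=lambda c: c[1])[0]
```

Note that a high-priority source advertising QL-DNU loses to a lower-priority but trusted source: priority alone never overrides the quality gate, which is exactly the pass intent above.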
Timing loop formation (conditions → symptoms)
  • Conditions: mutual reference selection, inconsistent QL propagation, revertive + short timers, or wrong advertisement of derived sources.
  • Symptoms: drift-like oscillation, QL/source flapping, alarms synchronized across nodes, “stable link but unstable quality”.
Fast localization: start from selection events (source + QL + reason + timers), then reconstruct the chosen reference path to detect closed loops.
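That reconstruction step is a cycle check over a "who references whom" map built from selection logs. A minimal sketch (node names and the map format are hypothetical):

```python
def find_timing_loop(selected_upstream):
    """selected_upstream: {node: upstream_node_or_None}, reconstructed from
    selection events (source + QL + reason). Returns a closed reference loop
    as a list of nodes, or None if every chain terminates at a real source."""
    for start in selected_upstream:
        seen, node = [], start
        while node is not None:
            if node in seen:
                return seen[seen.index(node):]  # closed reference loop found
            seen.append(node)
            node = selected_upstream.get(node)
    return None
```

Running this after every topology or policy change catches the "nodes indirectly reference each other" condition before it shows up as drift-like oscillation.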
Diagram: Multi-source selector with QL priority and a timing-loop example
Left (correct): sources A/B/C with QL labels (high/medium/low) feed an input selector that applies priority, the QL gate, holdoff, and WTR before driving local clocks and advertising QL downstream. Right (wrong): two nodes selecting each other as reference form a timing loop, shown with a warning outline. Blue = clock path; red = warning only.

Protection switching & holdover: hitless goals and what to log

Protection switching in SyncE is an operational requirement: references change because links fail, QL degrades, or maintenance occurs. A “hitless” goal is achieved when transients are contained within system tolerance windows and do not trigger service-impacting behaviors. Holdover is the controlled bridge between references; it must remain bounded, observable, and recoverable through disciplined timers and logging.

Switching triggers (type → risk)
  • Link loss (LOS): largest transient risk; must rely on predefined priorities and timers.
  • QL degrade: link remains up but is untrusted; risk is overreaction and flapping without gating.
  • Manual switch: maintenance action; must obey the same holdoff/WTR rules to avoid unnecessary disturbances.
  • Internal fault: local PLL/thermal/supply alarm; must be logged with root reason and configuration context.
Hitless criteria as acceptance language (windows, not fixed numbers)
  • Transient window: switching disturbance stays within tolerance derived from network budget and equipment grade.
  • Alarm-gate window: short post-switch disturbance does not cause persistent alarms or repeated switching.
  • Settling window: reacquisition converges within a defined window and remains stable (no flapping).
Holdover (entry → bounded drift → recovery)
  • Entry: reference loss or QL below gate; policy must define when to enter and what to freeze.
  • Bounded drift: temperature and aging dominate; holdover duration is budgeted, not assumed.
  • Recovery: restored references require holdoff/WTR and a settling gate to prevent “recover-then-flap”.
What to log (minimum schema for accountability)
Event log (every switch)
  • timestamp (local time base)
  • event type (LOS / QL degrade / manual / internal fault)
  • from source → to source (source ID)
  • decision inputs: QL, priority rank, timer states (holdoff/WTR/gate)
  • reason code (primary + secondary)
State log (periodic samples)
  • state (LOCKED / HOLDOVER / REACQUIRE)
  • freq offset trend / drift indicator
  • temperature / supply health snapshots
  • PLL profile ID (configuration “mode”, not raw registers)
Pass template: each switch must be reconstructable from logs to show why it happened and whether the system converged without repeated oscillation.
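That pass template presupposes a reconstructable state timeline. A minimal sketch of the LOCKED → HOLDOVER → REACQUIRE machine with a wait-to-restore gate (the class name, reason strings, and the 10 s timer are placeholders, not values from any standard or product):

```python
class SyncEClock:
    """Minimal LOCKED -> HOLDOVER -> REACQUIRE -> LOCKED state machine.

    Every transition is logged as (timestamp, new_state, reason) so a
    postmortem can replay why each switch happened.
    """
    def __init__(self, wtr_s=10.0):
        self.state = "LOCKED"
        self.wtr_s = wtr_s          # wait-to-restore window (placeholder)
        self._restored_at = None
        self.events = []

    def _log(self, t, reason):
        self.events.append((t, self.state, reason))

    def ref_lost(self, t):
        if self.state == "LOCKED":
            self.state = "HOLDOVER"
            self._log(t, "LOS/QL-degrade")

    def ref_restored(self, t):
        if self.state == "HOLDOVER":
            self.state = "REACQUIRE"
            self._restored_at = t
            self._log(t, "ref-restored")

    def tick(self, t):
        # Only return to LOCKED after the WTR window proves stability,
        # which is what prevents "recover-then-flap".
        if self.state == "REACQUIRE" and t - self._restored_at >= self.wtr_s:
            self.state = "LOCKED"
            self._log(t, "wtr-elapsed")
```

The `events` list is the replay artifact: it shows the trigger, the holdover entry, and that the WTR window actually elapsed before relock, with no repeated oscillation in between.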
Diagram: Hitless switching state machine and holdover timeline windows
Top: a LOCKED → HOLDOVER → REACQUIRE → LOCKED state machine with its triggers (LOS/QL degrade into holdover with bounded drift, reference restored into reacquire with a settle gate). Bottom: a timeline from t0 showing the holdoff, settling, and alarm-gate windows around the switch, reacquire, and stable markers. Window lengths derive from the system budget; the goal is to avoid flapping.

Hardware implementation: reference tree, power isolation, and distribution

A SyncE design succeeds or fails on the board. The practical goal is a clock tree that is hierarchical (clean near the root, distribute near the leaves), isolated (supply and return paths do not inject into PLL-sensitive nodes), and measurable (test points exist to separate “root quality” from “distribution damage”).

Recommended hierarchy (root → leaves)
  • Tree: Ref in → (selector/protection) → tracker/cleaner → fanout/buffers → endpoints (PHY / FPGA / SoC).
  • Root rule: place “cleaning” early so downstream branches do not multiply noise paths.
  • Leaf rule: fanout stages should focus on drive, standards, and load isolation—avoid asking them to “fix” jitter.
Power strategy (islands + injection control)
PLL / cleaner analog island
  • Dedicated low-noise rail (LDO or filtered branch) placed close to the load.
  • Short, tight decoupling loops to avoid turning filters into resonators.
  • Return path kept continuous; avoid forcing sensitive currents to detour.
Digital control & fanout rails
  • Separate control rail prevents I²C/SPI activity from modulating the analog island.
  • Fanout drivers can warrant their own branch to avoid switching noise back-injection.
  • If rails must be shared, enforce isolation with filters and a controlled return.
Distribution choices (ZDB vs fanout, differential routing, backplane)
  • ZDB: best for delay alignment across domains when feedback alignment is part of the requirement.
  • Fanout buffers: best for multi-output low-jitter distribution and level-standard matching.
  • Across connectors/backplanes: treat clock as a controlled-impedance channel with continuous reference planes and explicit return paths.
Measurement intent: TP1 separates “root clock quality” from downstream damage; TP2 validates the far-end endpoint sensitivity and channel integrity.
Diagram: Board-level clock tree (hierarchy, branches, and test points)
Ref A and Ref B feed a glitch-free selector (A/B policy). The selected reference drives a tracker/cleaner block (loop profile, holdover, alarms) supplied from an analog power island, with a separate digital LDO for control. A fanout block (LVDS/HCSL/LVPECL) distributes to endpoints (PHY, FPGA, SoC). TP1 sits after the cleaner output; TP2 sits at the far-end endpoint input. Keep cleaning near the root; distribute near the leaves.

PCB layout & real-world pitfalls (EMI coupling, terminations, SSC policy)

Most field failures are not caused by “wrong frequency,” but by injection paths that turn a clean clock into an unstable reference: supply ripple modulates PLLs, return paths create common-mode injection, and poor terminations amplify reflections. SSC can help EMI, but it must be treated as a policy knob—enabled only where compatibility and timing margins are proven.

Terminations & returns (mistake → symptom → quick check)
HCSL
  • Mistake: termination topology/placement inconsistent with the receiver channel.
  • Symptom: reflections and common-mode stress → jitter rises or lock alarms become sensitive.
  • Quick check: observe far-end waveform and common-mode stability around switching events.
LVDS
  • Mistake: impedance discontinuities or routing across split reference planes.
  • Symptom: stable on bench, fragile in chassis/backplane conditions.
  • Quick check: compare near-end vs far-end; if far-end degrades sharply, the channel is the culprit.
LVPECL
  • Mistake: incorrect bias/termination network leading to shifted common-mode point.
  • Symptom: reduced receiver margin and higher sensitivity to ground bounce.
  • Quick check: verify common-mode stability during activity bursts and thermal changes.
Three dominant coupling paths (evidence-driven isolation)
  • Ripple: PSU noise → PLL modulation. Evidence comes from a nearby analog-rail probe.
  • Crosstalk: fanout switching or parallel routing → differential-pair disturbance. Evidence comes from near-end vs far-end comparison.
  • Return: connector/backplane + discontinuous reference planes → ground bounce/common-mode injection. Evidence comes from local ground-sense observations.
SSC policy (when to enable vs when to forbid)
Enable SSC only if
  • downstream endpoints explicitly tolerate SSC and the timing margin is proven.
  • SSC does not sit inside the most sensitive synchronization branch (or can be isolated to non-critical outputs).
  • logs correlate SSC state with any switching/alarms for fast rollback decisions.
Forbid SSC if
  • the branch is timing-critical and jitter/alarms are already near the edge.
  • compatibility is uncertain and symptoms appear as “lock but flaps” or “training fails”.
  • switching behavior becomes unstable even when the link remains up.
Fast remediation order (highest yield first)
  1. stabilize the analog island rail and its return loop (prove with power probe evidence).
  2. repair return-path discontinuities and ground bounce injection paths.
  3. fix termination topology and common-mode stability for the chosen signaling standard.
  4. reduce long parallel coupling (routing spacing, stitching, layer planning) and connector return weakness.
Diagram: Layout pitfalls map (Ripple / Crosstalk / Return)
A central clock-cleaner/PLL block receives three coupling arrows: PSU/rail ripple into the PLL, fanout crosstalk into differential pairs, and connector/backplane return-path ground bounce. Test points: TP_PWR_A (power), TP_GND_SENSE (ground sense), TP1 (near end), TP2 (far end).

Engineering checklist (bring-up → verification → deployment)

This checklist is designed to deliver a repeatable bring-up: establish one stable timing baseline first, then introduce switching and alarms only after measurement taps prove the clock tree is clean at the root and intact at the far end. Each step below includes Action, Probe, and Pass criteria.

Bring-up
baseline-first
1) Freeze one reference path (no auto switching)
Action: force a single upstream reference (A or B), disable SSC, disable revertive policies, keep alarms gated.
Probe: reference selector status + lock indicators (LOS/LOL) + event counter baseline.
Pass: stable selection state (no flapping) within a defined observation window.
2) Lock the link and confirm recovered clock behavior
Action: bring the Ethernet link up; confirm the PHY recovered clock is present/valid (if exposed).
Probe: TP2 (far-end endpoint input) + PHY status (link/lock).
Pass: TP2 frequency stable and PHY lock indicators remain steady during normal traffic.
3) Enable cleaner profile (conservative first)
Action: enable jitter cleaning using a conservative loop profile; keep multi-input switching disabled.
Probe: TP1 (cleaner output near-end) + TP_PWR_A (analog rail ripple near the cleaner).
Pass: TP1 improves versus TP2 (root is cleaner than the far end) and no lock alarms appear under controlled load steps.
4) Open fanout in groups (avoid multi-variable changes)
Action: enable fanout outputs in small groups; keep routing/connector worst paths for last.
Probe: one near-end output + one far-end output per group (compare degradation).
Pass: enabling additional branches does not cause a measurable regression beyond the allocated distribution margin.
5) Only then enable QL policy, auto switching, and alarm gating
Action: validate manual switching first; then enable holdoff timers, revertive/non-revertive policy, and alarm gates.
Probe: state timeline (LOCKED → HOLDOVER → REACQUIRE) + switching reason codes.
Pass: switching is explainable (why/when/which) and does not trigger flapping or alarm storms.
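The baseline-first sequencing above can be enforced mechanically: a later step stays locked until every earlier step has a recorded pass. A minimal Python sketch (step names are illustrative shorthand for the five steps, not from any standard):

```python
# Bring-up steps in the order defined by the checklist above.
BRINGUP_STEPS = [
    "freeze_single_reference",         # 1) no auto switching
    "confirm_recovered_clock",         # 2) link up, TP2 stable
    "enable_cleaner_conservative",     # 3) TP1 better than TP2
    "open_fanout_in_groups",           # 4) no regression per group
    "enable_ql_policy_and_switching",  # 5) explainable switching
]

def next_allowed_step(passed):
    """Return the first step without a recorded pass; QL policy and
    auto switching stay locked until everything before them passes."""
    for step in BRINGUP_STEPS:
        if step not in passed:
            return step
    return "deploy"
```

For example, with only steps 1 and 2 passed, the function returns the cleaner step; enabling QL policy earlier is rejected by construction.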
Verification (prove margins under realistic stress)
Jitter
Stimulus: traffic load + control bus activity + power/load steps.
Probe: TP1 vs TP2 + analog rail ripple correlation.
Pass: RMS jitter meets the allocated budget for the equipment class; no timing alarms triggered by normal operations.
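As a pass-criteria helper, the short-window RMS figure can be computed directly from TIE (time interval error) samples after removing the mean offset; a minimal sketch, with the budget threshold as a placeholder to be filled from the equipment-class allocation:

```python
import math

def rms_jitter(tie_ps):
    """RMS jitter in ps from TIE samples (mean removed, so a static
    phase offset does not count against the random-jitter budget)."""
    mean = sum(tie_ps) / len(tie_ps)
    return math.sqrt(sum((x - mean) ** 2 for x in tie_ps) / len(tie_ps))

def passes_budget(tie_ps, budget_ps):
    """Illustrative pass check: run on TP1 and TP2 captures separately."""
    return rms_jitter(tie_ps) <= budget_ps
```

Running the same function on TP1 and TP2 captures also quantifies the TP1-vs-TP2 margin the bring-up steps rely on.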
Wander / frequency stability
Stimulus: slow reference variations and holdover entry/exit.
Probe: frequency offset statistics across short/long windows.
Pass: drift remains within the network budget and does not cause QL oscillation or repeated switching.
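The short/long-window offset statistic can be derived from the same phase record: the mean fractional frequency offset over a window is the phase change divided by the window duration (1 ns of drift per second equals 1 ppb). A minimal sketch:

```python
def freq_offset_ppb(phase_ns, sample_period_s, window):
    """Per-window mean fractional frequency offset in ppb, from
    uniformly sampled phase data; 1 ns/s of drift is 1 ppb."""
    offsets = []
    for i in range(0, len(phase_ns) - window, window):
        delta_ns = phase_ns[i + window] - phase_ns[i]
        offsets.append(delta_ns / (window * sample_period_s))
    return offsets
```

Computing this over both short and long windows from one capture yields the drift statistics the pass criterion asks for; a spread that widens with window length points at wander rather than a fixed offset.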
Switching transient
Stimulus: manual A/B switch; simulate QL downgrade; simulate link loss.
Probe: state timeline + TP1/TP2 transient capture.
Pass: transient stays inside the system tolerance window and settles without flapping.
Thermal & long-run
Stimulus: cold start → thermal steady-state; airflow changes; multi-hour stability run.
Probe: drift slope + alarm rate + switch count.
Pass: predictable behavior across temperature and stable alarm rate under sustained operation.
Deployment (policy, logging, rollback)
QL policy knobs
  • priority order + revertive / non-revertive selection
  • holdoff timers to prevent fast oscillation
  • clear loop-avoidance policy (single upstream ownership)
Alarm gating & thresholds
  • debounce gates for LOS/LOL and QL drop
  • event-rate limits to prevent alarm storms
  • SSC is a policy switch: must be reversible and logged
Minimal log schema (must be replayable)
Fields: timestamp, active reference, QL in/out, lock states, holdover enter/exit reason, switch reason code, switch count, alarm counters, optional TP summary (near/far).
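One way to keep the schema concretely replayable is to pin it down as a typed record; a minimal Python sketch (field names are illustrative, mirroring the list above, not a standardized format):

```python
from dataclasses import dataclass, field, asdict

@dataclass
class SynceLogRecord:
    timestamp: float                 # epoch seconds
    active_ref: str                  # e.g. "A" or "B"
    ql_in: str                       # received QL / SSM value
    ql_out: str                      # transmitted QL
    lock_state: str                  # LOCKED / HOLDOVER / REACQUIRE
    holdover_reason: str = ""        # enter/exit reason code
    switch_reason: str = ""          # reason code for the last switch
    switch_count: int = 0
    alarm_counters: dict = field(default_factory=dict)  # e.g. {"LOS": 0}
    tp_summary: dict = field(default_factory=dict)      # optional near/far stats

rec = SynceLogRecord(0.0, "A", "QL-PRC", "QL-PRC", "LOCKED")
row = asdict(rec)  # flat dict, ready for CSV/JSON replay
```

Emitting one such record per state change (not per poll) keeps the log small while preserving the "why/when/which" replayability required above.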
Rollback buttons (field-safe)
  • force single reference + disable auto switching
  • disable SSC + fall back to conservative loop profile
  • isolate non-critical branches (fanout group disable)
Diagram: Checklist flow (Bring-up → Verify → Deploy)
(Figure: three-lane checklist flow. Bring-up: freeze reference (A/B), lock link + TP2, enable cleaner + TP1, open fanout in groups, manual switch test. Verify: jitter (TP1 vs TP2), wander/drift statistics, switch transient, thermal + long-run. Deploy: QL policy + holdoff, alarm gating, log fields, rollback plan. Taps: TP1, TP2, TP_PWR_A, TP_GND_SENSE.)

Applications & IC selection notes (carrier / 5G backhaul patterns)

This section ties SyncE to real carrier topologies and turns “timing requirements” into a block-level bill of materials. The goal is not a brand catalog, but a repeatable selection flow that maps each deployment pattern to the minimum set of blocks (PHY + cleaner + fanout + switching + monitoring) required to meet the local jitter/wander budget.

A) Applications (deployment patterns)
1) Cell-site backhaul (DU/RU chains)
Objective: stable frequency distribution under outdoor temperature and power noise.
Pressure points: thermal drift, rail ripple, connector return-path noise.
Blocks: SyncE-capable PHY + root cleaner + fanout (grouped) + minimal monitoring.
Log: lock/holdover transitions, drift slope, alarm rate, switch counter (if dual ref).
2) Aggregation node (multi-upstream / multi-downstream)
Objective: clear “who is trusted” behavior with traceable switching.
Pressure points: QL oscillation, timing loops, policy flapping.
Blocks: multi-input DPLL/cleaner with input selection + holdoff timers + fanout + robust logs.
Log: active input, QL in/out history, switch reason codes, holdoff state, event counters.
3) Ring networks (redundancy + loop risk)
Objective: redundancy without timing loops or alarm storms.
Pressure points: loop formation, unstable priority when references are close.
Blocks: explicit QL policy + conservative revertive behavior (if needed) + stable alarm gating.
Log: QL changes and loop-related alarms correlated to traffic/routing events.
4) Dual-homing (hitless goals + rollback)
Objective: clean main/backup behavior with bounded switching transients.
Pressure points: frequent switching when inputs are similar; transient spikes.
Blocks: glitch-free selection + holdoff + holdover plan + detailed event logs.
Rollback: force one input + disable auto switching + disable SSC.
B) IC selection notes (decision flow + example material numbers)
Decision flow (block ownership)
  1. SyncE-capable endpoints? If recovered clock is usable and observable, keep architecture simple; otherwise add an external cleaner stage.
  2. Tight jitter budget at the far end? Put a dedicated jitter attenuator/cleaner at the root and prove TP1 vs TP2 margin.
  3. Need hitless / controlled switching? Require glitch-free selection, holdoff timers, and replayable logs (reason codes + counters).
  4. Need holdover? Select oscillator class and DPLL features based on temperature profile and expected outage duration.
  5. Need monitoring? Add clock monitors / frequency or phase measurement hooks to enable fast field isolation.
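The five questions above can be expressed as a small mapping from requirements to a minimum block set; a sketch with illustrative block names (not a vendor-specific taxonomy):

```python
def select_blocks(recovered_clock_ok, tight_far_end_budget,
                  need_hitless, need_holdover, need_monitoring):
    """Map the decision-flow answers to a minimum block set."""
    blocks = ["SyncE-capable PHY", "fanout"]
    if not recovered_clock_ok or tight_far_end_budget:
        blocks.append("root jitter cleaner")  # prove TP1 vs TP2 margin
    if need_hitless:
        blocks.append("glitch-free MUX + holdoff timers + replayable logs")
    if need_holdover:
        blocks.append("holdover-capable DPLL + oscillator class per outage spec")
    if need_monitoring:
        blocks.append("clock monitors / frequency-phase measurement hooks")
    return blocks
```

A cell-site chain with a usable recovered clock but a tight far-end budget maps to PHY + fanout + root cleaner, which roughly matches deployment pattern 1 above (add monitoring as needed).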
Example material numbers (starting points only)

These part numbers are provided to accelerate datasheet lookup and lab prototyping. Always verify suffix/package, supported input/output standards, telecom timing features, and availability for the target BOM.

Jitter attenuators / DPLL timing ICs
Si5341, Si5344, Si5345, ZL30792, ZL30772, 82P33931, AD9545, AD9548
Check: multi-input behavior, hitless switching support (if required), loop profiles, holdover capability, and output jitter performance.
Clock generators / distribution (low additive jitter)
AD9528, HMC7044, LMK04828, LMK05318
Check: output standards (LVDS/HCSL/LVPECL/LVCMOS), channel count, skew control, and additive jitter.
Fanout buffers (LVDS/HCSL/LVPECL)
LMK00334, LMK00804, ADCLK948, ADCLK905
Check: additive jitter, output swing/CM, termination requirements, and sensitivity to supply/ground bounce.
Programmable oscillators (platform flexibility / SKU management)
Si570, Si571, Si549
Check: phase noise/jitter suitability for the branch (root vs non-critical), control interface, and warm-up/aging behavior.
Monitoring & measurement hooks (frequency/phase visibility)
ZL30364
Check: alarm outputs, measurement resolution, and how the data maps into replayable logs.
Integration note: TP1 / TP2 taps should be designed in from day one. Selecting strong ICs cannot compensate for missing measurement hooks and uncontrolled return paths.
Diagram: Pattern map + selection flow
(Figure: left, topology patterns: point-to-point (upstream to downstream), ring (Node A/Node B, loop risk), and dual-homing (Upstream A/B into a MUX). Right, selection flow: tight jitter budget? add a cleaner at the root; need hitless switching? glitch-free MUX + holdoff; need holdover/monitoring? oscillator class + monitors. Resulting block set: PHY + cleaner + (MUX) + fanout + monitoring (+ holdover).)


FAQs (SyncE troubleshooting, 4-line answers)

Each FAQ is intentionally constrained to the SyncE boundary and uses the same 4-line, testable format: Likely cause / Quick check / Fix / Pass criteria.

ALARMS Link is stable, but SyncE quality alarms still happen—first “wander vs jitter” check?
Likely cause: A slow frequency/wander component is violating the timing mask even though short-term jitter looks acceptable.
Quick check: Compare TP1 vs TP2 using both a short-window RMS jitter view and a long-window wander view (MTIE/TDEV or frequency-offset statistics) and correlate with QL/ESMC events.
Fix: Re-partition tracker/cleaner loop profiles and tighten holdover/selection policy using a timing DPLL/cleaner (e.g., Si5344/Si5345, ZL30792, AD9545) at the root.
Pass criteria: Quality alarms remain cleared across the configured observation window and the measured wander/jitter metrics meet the allocated network budget for the equipment class.
LOCK Cleaner output jitter is great, yet PHY occasionally drops lock—probe what first?
Likely cause: The PHY CDR/lock is being disturbed by local coupling (supply/return-path/common-mode) or by an input level/termination mismatch rather than by root jitter.
Quick check: Time-correlate PHY LOS/LOL counters with a probe at the PHY clock pin and with TP_PWR_A ripple, and check whether failures occur only during traffic bursts or cable/connector disturbance.
Fix: Improve local power/return integrity and re-drive the PHY clock with a low-additive-jitter buffer/fanout (e.g., LMK00334/LMK00804, ADCLK948) while keeping the cleaner (e.g., Si5344/AD9545) at the tree root.
Pass criteria: PHY lock indicators stay stable and LOS/LOL event rate drops below the site alarm gate threshold under the same stress conditions.
BW Widening loop bandwidth reduced lock time but worsened jitter—how to confirm BW is the culprit?
Likely cause: A wider tracking bandwidth is passing more upstream reference noise into the output and shrinking the cleaner’s effective attenuation region.
Quick check: A/B test two loop profiles and compare TP1 jitter (and wander metrics if available) while keeping the same upstream reference and load, then confirm the degradation tracks the profile setting rather than traffic or temperature.
Fix: Restore a narrower tracking loop and let a dedicated cleaner handle random jitter (e.g., AD9545/AD9548 or Si5344/Si5345 with a “tracker + cleaner” profile split).
Pass criteria: Lock time remains within the deployment requirement while TP1/TP2 jitter metrics return to the allocated budget with no new alarm triggers.
TEMP Lock passes at room temp but fails across temperature—what to log and compare?
Likely cause: Temperature-driven drift or supply/threshold shift is pushing the DPLL/PHY into a marginal region that only appears during cold-start or thermal gradients.
Quick check: Log temperature, frequency offset, active input, QL in/out, LOS/LOL counters, and holdover enter/exit reason codes, then compare “fail” vs “pass” runs at the same topology and traffic profile.
Fix: Reduce thermal gradients and use a timing DPLL with well-defined holdover behavior (e.g., Si5344/Si5345 or AD9545) paired with a higher-stability timebase option (e.g., TCXO family such as SiT5356) where holdover dominates.
Pass criteria: The same temperature sweep no longer triggers LOS/LOL or QL instability and the drift trend stays within the allocated holdover budget for the required duration.
METRICS Why does recovered clock look fine on scope but MTIE/TDEV fails in report?
Likely cause: The scope view is emphasizing short-term jitter while the report captures longer-term wander/phase variations (or uses a different mask/filter configuration).
Quick check: Verify the test method (mask/class, observation interval, filtering) and re-run TP1/TP2 measurements using a time-interval/phase tool that outputs MTIE/TDEV rather than only time-domain jitter.
Fix: Adjust loop partitioning to reduce low-frequency phase wander and ensure the timing IC is configured for the required telecom mask (e.g., Si5344/Si5345 or ZL30792 profile selection).
Pass criteria: MTIE/TDEV passes the required mask/class for the full report interval and the result is reproducible across repeated runs.
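The gap between the scope view and the report is easy to reproduce numerically: MTIE is the worst peak-to-peak phase excursion over every window of a given length, so a slow ramp that looks flat on a short scope capture still accumulates. A minimal (unoptimized) sketch:

```python
def mtie(phase_ns, window):
    """MTIE for one observation-window length: the largest peak-to-peak
    phase excursion across all windows of `window` samples."""
    worst = 0.0
    for i in range(len(phase_ns) - window + 1):
        seg = phase_ns[i:i + window]
        worst = max(worst, max(seg) - min(seg))
    return worst

# A slow 1 ns/sample ramp: tiny short-window MTIE, large long-window MTIE.
ramp = [float(i) for i in range(100)]
short_view, long_view = mtie(ramp, 3), mtie(ramp, 60)
```

Here short_view is 2 ns while long_view is 59 ns: exactly the "scope looks fine, report fails" signature. Production tools additionally apply the mask/class filtering, which this sketch omits.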
SSC Why does enabling SSC reduce EMI but break SyncE stability in some nodes?
Likely cause: SSC introduces intentional frequency modulation that can exceed the timing chain’s wander tolerance or interact with tracking loops and selection thresholds.
Quick check: Toggle SSC on/off while holding all other variables constant and compare quality alarms plus frequency-offset statistics and MTIE/TDEV for the same observation interval.
Fix: Disable SSC on the SyncE-critical chain and confine SSC to non-timing branches using separate clock domains (e.g., feed SyncE via Si5344/ZL30792 while SSC is generated elsewhere).
Pass criteria: SyncE alarms and wander metrics remain within budget with SSC disabled on the timing chain while EMI objectives are met via non-critical clock domains.
LOOP Why does the node form a timing loop after adding a second upstream link?
Likely cause: Reference ownership is ambiguous and QL/SSM policy allows bidirectional selection that creates a closed timing dependency.
Quick check: Build an “active reference graph” from logs (active input + QL in/out + ESMC events) and look for simultaneous “follow” behavior between adjacent nodes.
Fix: Enforce a single-direction timing plan by locking priorities, filtering unacceptable QL, and adding holdoff/hysteresis in the selector/DPLL (e.g., Si5345/ZL30792 input-selection policy).
Pass criteria: No repeated source toggles occur across the configured holdoff window and loop-related alarms cease under steady-state and failover tests.
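The "active reference graph" check can be automated: treat each node's currently selected upstream as a directed edge and look for a cycle. A minimal sketch (log parsing omitted; the dict is assumed to be built from the active-input fields of the replayable logs):

```python
def find_timing_loop(follows):
    """follows maps node -> the node it currently takes timing from.
    Returns the nodes of the first cycle found, else None."""
    for start in follows:
        seen = []
        node = start
        while node in follows:
            if node in seen:
                return seen[seen.index(node):]
            seen.append(node)
            node = follows[node]
    return None
```

For instance, find_timing_loop({"agg1": "agg2", "agg2": "agg1", "cell1": "agg1"}) reports the agg1/agg2 cycle, while a clean single-direction plan returns None.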
SWITCH Hitless switchover still causes packet errors—what transient metric is usually missing?
Likely cause: A short phase step or clock interruption during switching is not visible in steady-state jitter numbers but is large enough to upset the packet/PHY timing window.
Quick check: Capture a switch event timeline with synchronized packet-error counters and a phase/time-interval trace at TP2 around the switching instant, not only before/after snapshots.
Fix: Use a device and configuration that guarantees glitch-free/hitless switching plus settle gating (e.g., Si5344/Si5345, ZL30792, AD9545) and apply holdoff/settling timers before releasing alarms.
Pass criteria: Packet-error counters do not spike during a controlled switch and the phase transient stays within the system tolerance window for the defined switch test profile.
QL Why does QL selection “flap” between two sources—what holdoff/revertive setting to check?
Likely cause: The selection logic is too sensitive (short holdoff, aggressive revertive mode, insufficient hysteresis) when two sources are close in quality.
Quick check: Plot active-input toggles against QL in/out and holdoff state to confirm switches occur at the boundary rather than being triggered by true LOS/LOL faults.
Fix: Increase holdoff/min-dwell time and prefer non-revertive behavior where appropriate, then enforce hysteresis using the timing device policy engine (e.g., Si5345 or ZL30792 selection settings).
Pass criteria: Source selection remains stable and the switch-count rate stays below the operational threshold under steady-state and controlled degradation tests.
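The holdoff/hysteresis interaction can be replayed offline against logged QL scores to tune both knobs before touching the live node; a minimal sketch (numeric scores stand in for QL levels, higher is better; names are illustrative):

```python
def replay_selector(samples, hysteresis, min_dwell):
    """Count switches when the alternate source must beat the active
    one by `hysteresis` AND the active source has dwelt at least
    `min_dwell` samples since the last switch."""
    active, dwell, switches = "A", 0, 0
    for ql_a, ql_b in samples:
        cur, other = (ql_a, ql_b) if active == "A" else (ql_b, ql_a)
        if other >= cur + hysteresis and dwell >= min_dwell:
            active = "B" if active == "A" else "A"
            dwell, switches = 0, switches + 1
        else:
            dwell += 1
    return switches

# Two near-equal sources trading places on every sample.
noisy = [(5.0, 5.4), (5.4, 5.0)] * 10
```

With no hysteresis and no dwell the selector flaps on every sample; with a hysteresis wider than the quality gap it never switches, which is the stability the pass criterion asks for.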
HOLDOVER Holdover looks good short-term but drifts after hours—what’s the first aging/thermal suspect?
Likely cause: A slow thermal gradient (board airflow, enclosure heating) or oscillator aging slope is dominating beyond the short-term holdover window.
Quick check: Trend frequency offset during holdover against local temperature sensors and DPLL holdover state variables (slope/trim) to see whether drift is temperature-correlated or time-correlated.
Fix: Improve thermal stability and select a holdover-capable timing DPLL (e.g., AD9545 or Si5344) with a higher-stability timebase option (e.g., TCXO family such as SiT5356, or an external OCXO where required).
Pass criteria: Holdover drift remains within the allocated budget for the required outage duration and the drift curve is consistent across repeated runs.
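The temperature-vs-time question reduces to one number: correlate the holdover frequency-offset trend against the local temperature log. A minimal Pearson-correlation sketch (a magnitude near 1 says the drift is temperature-driven; near 0 points at aging or another time-correlated mechanism; the sample data below is illustrative):

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

# Offset tracking temperature almost linearly -> thermal suspect first.
temps = [25.0, 30.0, 35.0, 40.0, 45.0]
offsets_ppb = [0.0, 2.1, 3.9, 6.0, 8.1]
```

Here pearson(temps, offsets_ppb) lands close to 1.0; rerunning the same trend against elapsed time instead of temperature separates aging slope from airflow effects.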
BOARD Why does long cable / noisy PSU cause intermittent SyncE lock loss even with good ref?
Likely cause: Common-mode injection and return-path discontinuities convert cable/PSU noise into clock/PHY threshold disturbance, causing sporadic LOS/LOL.
Quick check: Correlate lock-loss events with PSU ripple at TP_PWR_A and with cable movement while probing common-mode behavior at the connector and at the PHY clock input.
Fix: Restore controlled impedance/termination and a continuous return path, then harden the local clock power using low-noise regulators (e.g., LT3045 or TPS7A94) and a robust fanout/buffer stage (e.g., ADCLK948 or LMK00334).
Pass criteria: Lock-loss events disappear under the same cable/PSU stress test and the alarm counters remain flat across the defined observation window.
TRIAGE How to separate PHY recovery limitation vs board coupling vs cleaner configuration quickly?
Likely cause: The failure is owned by one block (PHY recovery, board coupling, or loop profile) but symptoms look similar because all three surface as lock alarms and quality degradation.
Quick check: Run a three-way A/B/C experiment: (A) bypass cleaner, (B) inject a known-clean source at TP1, (C) keep cleaner but reduce tracking BW, and compare TP1→TP2 degradation plus PHY LOS/LOL counts.
Fix: Assign ownership and act accordingly—upgrade cleaner/profile (e.g., Si5344/ZL30792/AD9545), harden distribution (e.g., LMK00804/ADCLK948), or accept PHY recovered-clock limits and re-architect the clock handoff.
Pass criteria: The identified owner change produces a consistent improvement (lower alarm rate and better TP1→TP2 margin) and the effect reproduces across repeated trials.