Industrial Time & TSN: 1588 HW Timestamps and Jitter-Cleaned Clocks
TSN switches/PHYs with IEEE 1588 hardware timestamps, jitter-cleaning PLL clock trees, and redundant time bases for deterministic scheduling—even under fault conditions.
Scope: Focus on timestamp placement, clock stability (jitter/wander), and redundancy behavior that preserves determinism. Protocol names appear only as needed to explain where errors enter and how to verify them.
H2-1. Center Idea — Deterministic Time Is a Hardware Problem
What “deterministic time” means in industrial TSN
In an industrial TSN network, determinism is not proven by claiming protocol support. It is proven when time error stays inside a declared budget while traffic is scheduled and faults occur (reference loss, failover, congestion). The deciding factors are physical: where timestamps are taken, how clean and stable the clock tree is, and how redundancy behaves during switchover.
The three engineering axes (each must be verifiable)
- Hardware timestamp location: The closer the timestamp is to the wire boundary (ingress/egress at MAC/PHY), the less “uncontrolled delay” leaks into the time model.
- Jitter-cleaned clock tree: A disciplined clock tree converts a noisy reference into a usable timebase; both short-term jitter and long-term wander must stay within budget.
- Redundant time base under failure: Redundancy is meaningful only if switchover is bounded (phase step, lock time) so schedules remain valid and events remain ordered.
Evidence fields to look for (no guessing)
“Deep” TSN timing validation starts by confirming that the implementation exposes measurable evidence: timestamping at defined points (ingress/egress), clock-quality indicators (PLL lock status, oscillator class), and failover observables (phase transient/time error step). These become the reference checks used throughout the page.
H2-2. System Architecture Overview — Where Time Is Born and Controlled
Time is a closed-loop system, not a feature checkbox
Industrial timing behaves like a feedback control system: timestamps measure error, the servo corrects the local timebase, and TSN scheduling consumes that corrected time. Determinism is achieved only when the timestamp path avoids uncontrolled delay and the clock tree remains stable during reference loss and recovery.
Clock-and-time blocks (what each block is responsible for)
- Grandmaster / External Reference: PTP GM, GNSS-disciplined source, or SyncE-derived reference that provides the “target time.”
- PLL / Jitter Cleaner: Converts reference quality into a usable clock (attenuates reference noise; enforces stability constraints).
- Oscillator (XO / TCXO / OCXO): Local timebase; dominates holdover drift when reference disappears.
- Servo Loop: Adjusts frequency/phase based on timestamp-derived error (keeps local time aligned to reference).
- TSN Switch Core + TSU: Switching pipeline with a Timestamp Unit that captures ingress/egress events and applies residence/correction modeling.
- Egress Queues (scheduling boundary): The place where uncontrolled queuing can destroy determinism if timestamps are too far away.
- TSN PHY (timestamp + calibrated latency): Physical-layer boundary where deterministic timestamping and latency compensation are enforced.
Where errors enter (architecture-level “error injection points”)
Even with correct standards support, time error commonly enters through a few physical mechanisms: queueing variability, PHY latency/asymmetry drift, PLL bandwidth trade-offs, and oscillator temperature drift during holdover. The architecture diagram below marks these points so verification stays grounded in evidence.
Verify in 5 minutes (first checks before deeper debugging)
- Timestamp point: Confirm ingress/egress hardware timestamp placement and whether egress timestamping excludes queue delay.
- Clock quality: Confirm PLL lock/health indication and oscillator class used for holdover expectations.
- Holdover path: Confirm reference-loss behavior (which timebase takes over) and whether phase/frequency transient is bounded.
H2-3. IEEE 1588 & 802.1AS Hardware Timestamp Path
Focus: where time error is injected (not protocol narration)
The timing outcome is dominated by latency uncertainty. The protocol fields (t1/t2/t3/t4, correctionField, residence time) only stay meaningful when timestamps are captured at deterministic points and when fixed latencies (PHY/MAC/pipeline) are compensated consistently.
Timestamp points: ingress vs egress (why placement matters)
- Ingress timestamp location: captures when a Sync (or Delay_Req) message enters the device time domain. If taken too far from the wire boundary, PHY/MAC latency between the wire and the capture point goes unmodeled and is hidden from the time model.
- Egress timestamp location: captures when a Sync (or Delay_Req) message leaves; Delay_Resp only carries t4 back and is not itself timestamped. If taken before the scheduling/queue boundary, variable queue delay leaks into the timing model.
Latency components that must be modeled or excluded
- Store-and-forward / pipeline latency: depends on frame length and internal arbitration. It can be compensated only if the device exposes a stable residence-time model.
- PHY latency compensation: RX/TX fixed delays and temperature drift. Without calibrated PHY latency, the same protocol fields can produce different time error in the field.
- Asymmetry error: the forward and reverse path delays are not equal. Unmodeled asymmetry shows up as a stable offset or a drifting bias depending on environment.
Key fields (what each field proves)
- t1/t2/t3/t4: the raw time capture points used by the offset/delay estimator; accuracy is bounded by timestamp placement and clock noise.
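- correctionField: carries the accumulated residence-time and delay corrections. On a transparent clock the value itself can change with load; if the time error that remains after correction varies with load, the timestamp point is too early or queue/pipeline delay is leaking into the model.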
- residence time: the device-local forwarding time; it must be stable and consistently included, otherwise “deterministic” scheduling cannot be proven.
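As a minimal sketch of how the four capture points feed the estimator, the delay request-response arithmetic can be written as below. The function name, the integer-nanosecond representation, and the split into two correction arguments are illustrative assumptions; a real stack also handles Follow_Up corrections and timestamp wraparound.

```python
def ptp_offset_and_delay(t1, t2, t3, t4, corr_sync_ns=0, corr_dreq_ns=0):
    """Return (offset_from_master_ns, mean_path_delay_ns).

    t1: master egress time of Sync
    t2: slave ingress time of Sync
    t3: slave egress time of Delay_Req
    t4: master ingress time of Delay_Req
    corr_*: accumulated correctionField per direction (assumed already in ns)
    """
    master_to_slave = (t2 - t1) - corr_sync_ns   # contains +offset
    slave_to_master = (t4 - t3) - corr_dreq_ns   # contains -offset
    # The estimator assumes both directions have the same delay.
    mean_path_delay = (master_to_slave + slave_to_master) / 2
    offset = master_to_slave - mean_path_delay
    return offset, mean_path_delay


# Example: true offset 50 ns, symmetric 400 ns path.
offset_ns, delay_ns = ptp_offset_and_delay(t1=1000, t2=1450, t3=2000, t4=2350)
```

Any error in where t1 to t4 are captured enters this arithmetic directly, which is why timestamp placement bounds the achievable accuracy.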
Why software timestamps fail for <100 ns targets
Software timestamps inherit non-deterministic delay from interrupt latency, OS scheduling, cache effects, and bus transactions. Even if the mean error looks acceptable, the tail latency breaks determinism. Sub-100 ns goals require a hardware timestamp unit (TSU) that captures events before software layers can inject variability.
H2-4. TSN Scheduling Engine — Deterministic Traffic Shaping
Scheduling consumes corrected time
TSN scheduling turns a stable timebase into deterministic traffic behavior. The schedule is only as deterministic as the clock that drives it: gate misalignment produces deterministic jitter, and clock drift accumulates as a phase shift across cycles.
Core building blocks (time-aware shaping, preemption, policing)
- 802.1Qbv (Time-Aware Shaper): gates open/close per time; deterministic windows are created by a Gate Control List (GCL).
- Gate Control List (GCL): a cycle schedule with offsets and window lengths; guard bands prevent overlap under bounded time error.
- 802.1Qbu (Frame Preemption): reduces blocking by allowing express traffic to interrupt preemptable frames; fragment timing must remain bounded.
- 802.1Qci (Policing): prevents bursty or misbehaving flows from violating the schedule and collapsing deterministic guarantees.
How gate misalignment creates deterministic jitter
If the local timebase drifts or experiences a servo transient, the gate timeline shifts relative to intended cycle boundaries. This does not look like random noise: it appears as repeatable, cycle-correlated timing error. A schedule can remain “correct” on paper while observed latency becomes periodic and deterministic due to misalignment.
Cycle time vs oscillator stability (why long cycles amplify error)
Cycle-based scheduling assumes bounded phase error. With a drifting oscillator, the phase offset grows with cycle time. Guard bands must cover the combined timing error budget (clock drift + servo transient + timestamp/PHY residuals) or the latency bound collapses.
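The guard-band budget described above can be sketched as a simple worst-case sum. The function and parameter names are hypothetical, the linear (non-RSS) summation is a conservative assumption, and the interval over which drift accumulates uncorrected is a design input that this sketch does not derive.

```python
def guard_band_ns(freq_error_ppb, interval_s, servo_transient_ns, residual_ns):
    """Worst-case timing error a guard band must absorb, in ns.

    freq_error_ppb: residual fractional frequency error of the local clock
    interval_s: time over which drift accumulates uncorrected
                (assumption: chosen by the designer, e.g. the cycle time)
    servo_transient_ns: bounded servo transient allowance
    residual_ns: timestamp/PHY calibration residual
    """
    drift_ns = freq_error_ppb * interval_s  # 1 ppb over 1 s equals 1 ns
    return drift_ns + servo_transient_ns + residual_ns


# Illustrative numbers only, not taken from any datasheet.
needed = guard_band_ns(freq_error_ppb=100, interval_s=0.125,
                       servo_transient_ns=50, residual_ns=20)
```

With these illustrative inputs the drift term contributes 12.5 ns and the total is 82.5 ns; doubling the interval doubles only the drift term, which is the sense in which long cycles amplify oscillator error.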
Worst-case latency bound: what must be included
- Gate wait time: time until the next window opens (dominant under misalignment).
- Preemption overhead: express interruption and fragment completion bounds.
- Forwarding + PHY latency: pipeline residence plus calibrated RX/TX delays.
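A hedged sketch of how the three terms above combine into a per-path bound follows. The `HopBudget` type and its field names are hypothetical, and each field is assumed to already hold a worst-case value obtained from measurement or a datasheet.

```python
from dataclasses import dataclass


@dataclass
class HopBudget:
    gate_wait_ns: float    # worst-case wait for the next gate window
    preemption_ns: float   # worst-case preemption/fragment overhead
    forwarding_ns: float   # pipeline residence time
    phy_rx_ns: float       # calibrated PHY receive delay
    phy_tx_ns: float       # calibrated PHY transmit delay

    def worst_case_ns(self):
        return (self.gate_wait_ns + self.preemption_ns +
                self.forwarding_ns + self.phy_rx_ns + self.phy_tx_ns)


def path_worst_case_ns(hops):
    """Sum per-hop worst cases; cable propagation is deliberately not modeled."""
    return sum(hop.worst_case_ns() for hop in hops)


# Illustrative values only.
hops = [HopBudget(5000, 1200, 800, 200, 150),
        HopBudget(0, 1200, 800, 200, 150)]
bound_ns = path_worst_case_ns(hops)
```

Keeping the terms separate per hop makes it visible when gate wait time dominates, which is the signature of misalignment rather than of forwarding or PHY delay.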
H2-5. Clock Tree Design — From Oscillator to PHY
Clock quality sets the timing ceiling before any protocol runs
Deterministic timing requires a timebase with bounded short-term jitter and bounded long-term wander. A clock tree is a chain of noise contributors: the oscillator defines baseline stability, the jitter-cleaning PLL reshapes phase noise, fanout/dividers add skew and additive jitter, and the PHY reference clock integrity determines the final timestamp floor at the wire boundary.
Oscillator selection: XO vs TCXO vs OCXO (what changes in the time error model)
- XO: simple and low cost; drift can accumulate into cycle misalignment and faster holdover error growth.
- TCXO: temperature-compensated; reduces drift and improves holdover stability under ambient variation.
- OCXO: oven-controlled; best long-term stability and holdover performance when a strict phase budget must be maintained.
Jitter-cleaning PLL: noise reshaping, not “magic cleaning”
- Reference noise vs PLL noise: the PLL attenuates part of reference phase noise but contributes its own noise floor.
- Bandwidth trade-off: narrow bandwidth improves jitter suppression but slows tracking; wide bandwidth tracks faster but can pass more reference jitter.
- Peaking risk: poor loop design can amplify noise in a band, appearing as periodic phase disturbance in measurements.
Phase noise contribution and RMS jitter budget (engineering view)
Phase noise is a spectrum; the practical requirement is an integrated RMS jitter budget across the clock tree. The budget should include oscillator noise, PLL residual noise, fanout/divider additive jitter, and supply-induced jitter. Timestamp accuracy can be no better than the jitter of the clock actually delivered to the switch core and PHY reference pins.
Clock distribution: fanout topology and PHY reference integrity
- Fanout topology: additive jitter and branch-to-branch skew can create stable offsets that mimic asymmetry.
- Divider strategy: divider noise and phase alignment influence multi-port consistency.
- PHY ref clock integrity: trace quality, power noise coupling, and clock tolerance affect PHY timestamp stability at the wire boundary.
H2-6. Servo Loop & Holdover Behavior
Servo loop: timestamps measure error; the loop corrects the timebase
A PTP servo is a feedback controller. Hardware timestamps provide phase/time error, a loop filter (often PI behavior) shapes the response, and the controlled oscillator/clock adjust mechanism (DCO/PLL control) applies frequency and phase correction. Proper tuning keeps the loop stable while tracking reference changes without injecting oscillation into the schedule timeline.
PI loop behavior (engineering interpretation)
- P-term: fast reaction to phase error; too aggressive tuning can cause oscillation or overshoot.
- I-term: removes steady-state bias; too large an integral gain, or an unclamped integrator, can wind up and create slow recovery transients.
- Stability symptom: loop instability often appears as periodic phase modulation or cycle-correlated jitter in scheduled traffic.
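The PI behavior above can be illustrated with a small discrete-time sketch. This is not any particular PTP stack's servo: the class name, the per-sample gain units, and the toy clock model in `simulate` are all assumptions made for illustration.

```python
class PiServo:
    """Discrete PI controller that outputs a frequency adjustment in ppb."""

    def __init__(self, kp, ki, max_adj_ppb):
        self.kp = kp                    # ppb per ns of offset, per sample
        self.ki = ki                    # ppb per ns of offset, per sample
        self.max_adj_ppb = max_adj_ppb  # clamp for output and integrator
        self.integral_ppb = 0.0

    def _clamp(self, value):
        return max(-self.max_adj_ppb, min(self.max_adj_ppb, value))

    def update(self, offset_ns):
        """offset_ns is local time minus reference time."""
        # Clamping the integrator is a simple anti-windup measure.
        self.integral_ppb = self._clamp(self.integral_ppb + self.ki * offset_ns)
        return self._clamp(-(self.kp * offset_ns + self.integral_ppb))


def simulate(servo, offset_ns, freq_error_ppb, interval_s, steps):
    """Toy plant: offset integrates (frequency error + adjustment)."""
    for _ in range(steps):
        adj_ppb = servo.update(offset_ns)
        offset_ns += (freq_error_ppb + adj_ppb) * interval_s
    return offset_ns


servo = PiServo(kp=0.7, ki=0.3, max_adj_ppb=100_000)
final_offset_ns = simulate(servo, offset_ns=1000.0, freq_error_ppb=200.0,
                           interval_s=1.0, steps=100)
```

In this toy model the offset decays toward zero and the integrator settles at the oscillator's frequency error, which is the value a holdover policy would want to retain. Raising the gains well beyond these illustrative values makes the same loop ring or diverge, matching the stability symptom described above.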
Holdover drift (what happens when the Grandmaster is lost)
When the reference source disappears, the servo stops receiving trustworthy error updates and the system enters holdover. The local oscillator becomes the time authority and drift accumulates over time. Holdover performance is therefore dominated by oscillator class (XO/TCXO/OCXO), temperature variation, and the last known frequency estimate at the moment of loss.
How long can holdover maintain ±100 ns?
The holdover window is determined by the drift rate and the permitted time error budget. A practical engineering method is: confirm oscillator stability class, estimate drift per second under worst ambient conditions, and compute the time until the accumulated error reaches the ±100 ns threshold. This bound must be compared to application requirements (fault duration, re-lock time, and scheduling guard bands).
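The method above reduces to solving for the time at which accumulated error reaches the budget. The sketch below assumes a constant residual frequency offset plus an optional linear frequency drift, so that error(t) = y0*t + 0.5*D*t^2; the function name and the model are simplifying assumptions, and real oscillators need temperature and aging characterization.

```python
import math


def holdover_seconds(budget_ns, freq_offset_ppb, freq_drift_ppb_per_s=0.0):
    """Time until |time error| reaches budget_ns during holdover.

    freq_offset_ppb: residual frequency error at the moment of reference loss
    freq_drift_ppb_per_s: linear frequency drift (assumed worst case)
    """
    y0 = abs(freq_offset_ppb)       # ns of error per second
    d = abs(freq_drift_ppb_per_s)   # ns per second squared
    if d == 0:
        return math.inf if y0 == 0 else budget_ns / y0
    # Positive root of 0.5*d*t^2 + y0*t - budget_ns = 0
    return (-y0 + math.sqrt(y0 * y0 + 2 * d * budget_ns)) / d


# Illustrative: a 10 ppb residual frequency error uses up 100 ns in 10 s.
window_s = holdover_seconds(budget_ns=100, freq_offset_ppb=10)
```

The result is the figure to compare against fault duration and re-lock time; if the window is shorter than either, the oscillator class or the guard band has to change.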
MTIE / TDEV (why both are useful)
- MTIE: worst-case time error over observation intervals (useful for bounding the maximum deviation).
- TDEV: statistical stability across time scales (useful for understanding noise vs drift regimes).
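As a rough illustration of the MTIE definition, the sketch below takes evenly sampled time-error values and returns the largest peak-to-peak excursion seen in any window of a given length. It is a naive O(N*n) implementation with hypothetical names, intended for small logs; TDEV is omitted here.

```python
def mtie(te_ns, window_samples):
    """Maximum time interval error for one observation window length.

    te_ns: evenly sampled time-error values
    window_samples: samples per window; with sample period tau0 the
                    observation interval is (window_samples - 1) * tau0
    """
    if window_samples < 2 or window_samples > len(te_ns):
        raise ValueError("window must span 2..len(te_ns) samples")
    worst = 0.0
    for start in range(len(te_ns) - window_samples + 1):
        window = te_ns[start:start + window_samples]
        worst = max(worst, max(window) - min(window))
    return worst


te = [0, 1, 3, 2, 5, 4]
curve = {n: mtie(te, n) for n in (2, 3, 6)}
```

Evaluating the function over several window lengths gives the MTIE curve, which by construction never decreases as the window grows.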
H2-7. Redundant Time Base & Failover
Failover is a time-transient problem, not only a selection problem
Redundancy is only successful if the timebase remains usable during a switchover. The engineering question is not “which GM wins,” but whether the switchover produces a phase step or a smooth slew, and whether the transient stays inside the application’s phase/jitter/latency budget.
BMCA (Best Master Clock Algorithm): what it changes in a real system
- Selection inputs: GM priority/quality and reachability decide which clock becomes authoritative.
- Operational consequence: a GM change triggers a servo transient that can shift the schedule phase if not bounded.
- Evidence to capture: GM change events, offset step magnitude, and re-lock time under load.
Dual GM domain: isolation vs switching behavior
- Single-domain redundancy: simpler transition path, but requires strong protection against a “bad GM” polluting the domain.
- Dual-domain strategy: improved isolation, but switching can behave more like re-synchronization if domain alignment is not controlled.
SyncE + PTP hybrid: why it improves hitless behavior
SyncE stabilizes frequency while PTP corrects phase/time. With a stable frequency anchor, a GM change is less likely to create a large phase jump, improving the chance of a hitless switchover.
What happens during switchover: phase step or smooth?
- Phase step: an instantaneous time jump; high risk for Qbv window boundary violations and event ordering.
- Smooth slew: gradual correction; requires enough guard band and bounded slew rate to preserve worst-case latency.
- Engineering criteria: step bound, re-lock time bound, and schedule continuity under stress traffic.
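To make the step-versus-slew trade concrete, the sketch below computes how long a bounded slew takes to absorb an offset and how far the gate timeline moves per schedule cycle while slewing. The names are hypothetical, and the sketch assumes a constant slew rate with no overshoot.

```python
def slew_plan(offset_ns, max_slew_ppb, cycle_time_s):
    """Return (duration_s, shift_per_cycle_ns) for a slew-only correction.

    offset_ns: time error to absorb after the switchover
    max_slew_ppb: bounded slew rate (1 ppb moves the clock 1 ns per second)
    cycle_time_s: schedule cycle time
    """
    if max_slew_ppb <= 0:
        raise ValueError("slew rate must be positive")
    duration_s = abs(offset_ns) / max_slew_ppb
    shift_per_cycle_ns = max_slew_ppb * cycle_time_s
    return duration_s, shift_per_cycle_ns


# Illustrative: absorb a 2 us offset at 500 ppb with a 1 ms cycle.
duration_s, shift_ns = slew_plan(offset_ns=2000, max_slew_ppb=500,
                                 cycle_time_s=0.001)
```

Here the correction takes 4 s and moves the schedule by about 0.5 ns per cycle. A phase step would remove the same 2000 ns within a single cycle, which is what puts Qbv window boundaries at risk.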
H2-8. PHY-Level Determinism
PHY determines the wire-edge timing floor
Sub-100 ns alignment depends on where the timestamp is captured and how fixed delays are compensated. Capturing timestamps closer to the wire boundary reduces exposure to variable pipeline and queue effects, but it also increases dependence on PHY calibration quality and long-term drift behavior.
PHY timestamp insertion and latency calibration
- Timestamp insertion point: capturing at the PHY boundary aligns time with SFD/wire events more directly.
- RX/TX fixed delay calibration: compensates internal receive and transmit path delays that otherwise appear as bias.
- Fixed vs variable: calibration targets the fixed delay components; variable queue delay cannot be calibrated out and must be excluded by correct timestamp placement.
Cable delay compensation and temperature drift
Cable changes, connectors, and media type shifts introduce delay changes that can masquerade as servo instability. Temperature variation can shift PHY internal delay and reference clock behavior. A stable system defines when to re-calibrate, and how to verify that offsets remain valid across the expected operating range.
Why PHY asymmetry matters (bias, not noise)
Asymmetry means forward and reverse delays are different. If the estimator assumes symmetry, the result is a persistent offset bias. Bias breaks absolute alignment even when short-term jitter looks small, so asymmetry must be explicitly modeled or compensated in the calibration process.
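The bias can be shown numerically. In the sketch below, a symmetric estimator is fed timestamps generated with unequal forward and reverse delays; the function names and the fixed 1000 ns turnaround are illustrative assumptions.

```python
def estimated_offset_ns(true_offset_ns, delay_ms_ns, delay_sm_ns):
    """Offset reported by an estimator that assumes symmetric delays."""
    t1 = 0                                      # master clock
    t2 = t1 + delay_ms_ns + true_offset_ns      # slave clock
    t3 = t2 + 1000                              # slave clock, arbitrary turnaround
    t4 = (t3 - true_offset_ns) + delay_sm_ns    # back on the master clock
    mean_path_delay = ((t2 - t1) + (t4 - t3)) / 2
    return (t2 - t1) - mean_path_delay


def asymmetry_bias_ns(delay_ms_ns, delay_sm_ns):
    """Persistent offset error: half the forward/reverse delay difference."""
    return (delay_ms_ns - delay_sm_ns) / 2


# 520 ns forward, 480 ns reverse, true offset 50 ns.
reported = estimated_offset_ns(50, 520, 480)
bias = reported - 50
```

The reported offset is 70 ns against a true 50 ns, so the error equals half the 40 ns delay difference and does not average out with more samples. That is why it has to be removed by calibration rather than filtering.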
How calibration offsets are stored and made auditable
- Runtime registers: offsets loaded into PHY/switch compensation registers at boot.
- NVM / EEPROM: persistent storage for calibrated offsets and revision metadata.
- Traceability: store offset value + version + timestamp + temperature point to support maintenance and audits.
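One possible shape for an auditable calibration record is sketched below. The field names and the JSON encoding are hypothetical choices made for readability; a real device would more likely use a compact binary layout with an integrity check.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class PhyDelayCalibration:
    port: int
    rx_delay_ns: float
    tx_delay_ns: float
    version: int               # bumped on every re-calibration
    calibrated_at_utc: str     # ISO 8601 timestamp of the calibration run
    temperature_c: float       # temperature point of the calibration


def to_nvm_blob(record):
    """Serialize a record for persistent storage (encoding is an assumption)."""
    return json.dumps(asdict(record), sort_keys=True).encode("utf-8")


def from_nvm_blob(blob):
    """Rebuild the record at boot before loading the runtime registers."""
    return PhyDelayCalibration(**json.loads(blob.decode("utf-8")))


record = PhyDelayCalibration(port=1, rx_delay_ns=212.5, tx_delay_ns=96.0,
                             version=3,
                             calibrated_at_utc="2024-01-01T00:00:00Z",
                             temperature_c=25.0)
restored = from_nvm_blob(to_nvm_blob(record))
```

Logging the restored version and temperature point alongside time-error events makes it possible to tell a calibration reload apart from a genuine clock transient.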
H2-9. Jitter Cleaning & Phase Noise Engineering
PLL “cleaning” is phase-noise shaping
Jitter cleaning is not a single number—it is the output phase-noise spectrum after loop shaping. The loop bandwidth determines how much reference noise is passed, how much VCO/PLL-internal noise dominates, and whether the system remains stable without peaking. A practical design starts from a timestamp/PHY sensitivity band and allocates an integrated RMS jitter budget.
Loop bandwidth vs jitter attenuation (engineering interpretation)
- Bandwidth too wide: more reference phase noise leaks to the output; jitter can rise unexpectedly.
- Bandwidth too narrow: internal VCO noise dominates and the loop tracks slowly (worse wander response).
- Targeted bandwidth: minimize output jitter over the application-relevant band while maintaining stable transient behavior.
Integrated phase noise → RMS jitter (concept only, no full derivation)
Phase noise is specified as a spectrum. RMS jitter is derived by integrating phase noise over a defined offset-frequency range (the choice of integration limits directly changes the reported jitter). To compare designs or vendors, the same integration band must be used. In a TSN system, the clock domain feeding TSU/PHY should meet the RMS jitter budget required to keep hardware timestamps stable.
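The integration can be sketched as follows, using the common relation between single-sideband phase noise L(f) in dBc/Hz and RMS jitter, sigma_t = sqrt(2 * integral of 10^(L(f)/10) df) / (2*pi*f0). The function name is hypothetical, and the linear trapezoid rule is a simplification that loses accuracy on steep slopes with sparse points.

```python
import math


def rms_jitter_s(offsets_hz, phase_noise_dbc_hz, carrier_hz):
    """Integrate SSB phase noise over the given offsets and return RMS jitter.

    offsets_hz: ascending offset frequencies (these are the integration limits)
    phase_noise_dbc_hz: L(f) at each offset, in dBc/Hz
    carrier_hz: clock frequency f0
    """
    if len(offsets_hz) != len(phase_noise_dbc_hz) or len(offsets_hz) < 2:
        raise ValueError("need matching lists with at least two points")
    linear = [10 ** (level / 10) for level in phase_noise_dbc_hz]
    area = 0.0
    for i in range(1, len(offsets_hz)):
        df = offsets_hz[i] - offsets_hz[i - 1]
        area += 0.5 * (linear[i] + linear[i - 1]) * df
    phase_rms_rad = math.sqrt(2 * area)   # factor 2 accounts for both sidebands
    return phase_rms_rad / (2 * math.pi * carrier_hz)


# Illustrative: flat -140 dBc/Hz across a 1 MHz band on a 100 MHz clock.
jitter = rms_jitter_s([1e3, 1.001e6], [-140.0, -140.0], 100e6)
```

With these illustrative inputs the result is roughly 225 fs. Changing only the integration limits changes the answer, which is why two jitter figures are comparable only when the bands match.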
PLL peaking: when “cleaning” becomes amplification
- Root cause: insufficient damping or poor loop filter tuning creates a gain peak near the loop natural frequency.
- Symptom: a “hump” in phase noise and periodic time-error oscillation that can appear as cycle-correlated jitter.
- Mitigation: adjust bandwidth/damping, validate spectrum shape (not only a single jitter number), and re-check under load/temperature.
Power supply noise coupling: the field failure multiplier
Supply ripple and ground impedance can modulate the VCO control node, fanout buffers, and PHY reference clock path. This creates phase modulation that appears as elevated jitter even if the loop tuning looks correct. A robust design measures supply noise at the PLL/VCXO rails and checks correlation between supply ripple and time error signatures.
H2-10. Measurement & Validation Methodology
Validation turns timing theory into bounded evidence
Deterministic time must be validated with repeatable evidence that separates bias (asymmetry/calibration error), noise (jitter/peaking), and system transients (failover/servo behavior). A minimal methodology uses time-error plots, distributions, packet-delay variation, and register evidence to map symptoms to root causes.
Time error histogram (distribution view)
- Narrow single peak: low random jitter, stable operation.
- Wide single peak: elevated jitter or noise coupling.
- Double peak: switchover steps or mixed time domains.
- Long tail: occasional blocking or transient behavior.
TE vs time plot (dynamic view)
- Slope: frequency error / wander (holdover drift signature).
- Step: GM switchover phase step, calibration change, or domain transition.
- Periodic oscillation: servo PI instability or PLL peaking band coupling.
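Two of these signatures can be extracted from a logged TE series with very little code. The sketch below fits a least-squares slope and finds the largest sample-to-sample jump; the function names are hypothetical, and detecting periodic oscillation (for example with a spectrum) is left out.

```python
def te_slope_ns_per_s(t_s, te_ns):
    """Least-squares slope of TE over time; ns per second reads as ppb."""
    n = len(t_s)
    if n < 2 or n != len(te_ns):
        raise ValueError("need at least two matching samples")
    mean_t = sum(t_s) / n
    mean_te = sum(te_ns) / n
    num = sum((t - mean_t) * (e - mean_te) for t, e in zip(t_s, te_ns))
    den = sum((t - mean_t) ** 2 for t in t_s)
    return num / den


def largest_step_ns(te_ns):
    """Largest jump between consecutive samples (candidate phase step)."""
    return max(abs(b - a) for a, b in zip(te_ns, te_ns[1:]))


drift_ppb = te_slope_ns_per_s([0, 1, 2, 3, 4], [5, 7, 9, 11, 13])
step_ns = largest_step_ns([0, 1, 0, 1, 500, 501])
```

A slope of about 2 ns/s in the first series points to frequency error, while the 499 ns jump in the second is a step that should be matched against GM-change or calibration-reload events in the log.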
PDV (packet delay variation): network determinism evidence
PDV quantifies scheduling, contention, and queue effects. A stable timebase with high PDV points to shaping/queueing issues; increasing TE jitter together with high PDV suggests compounded timing noise plus network-level delay variation under load.
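PDV can be summarized from per-packet one-way delays in several ways. The sketch below reports a percentile-based figure and the peak-to-peak figure, both relative to the minimum observed delay; the nearest-rank percentile method and the function name are assumptions.

```python
import math


def pdv_ns(delays_ns, percentile=99.0):
    """Return (percentile_pdv_ns, peak_to_peak_pdv_ns) relative to min delay."""
    if not delays_ns:
        raise ValueError("no delay samples")
    ordered = sorted(delays_ns)
    # Nearest-rank percentile: smallest value covering `percentile` % of samples.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    floor_delay = ordered[0]
    return ordered[rank - 1] - floor_delay, ordered[-1] - floor_delay


# 100 samples spread evenly from 100 ns to 199 ns.
p99_pdv, pk_pk_pdv = pdv_ns(list(range(100, 200)))
```

Comparing the two figures helps separate a long tail (peak-to-peak far above the percentile value) from uniformly elevated delay variation.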
Oscilloscope phase measurement (hardware confirmation)
- Measure phase/jitter at PLL output and PHY ref clock nodes.
- Check correlation between supply ripple and TE signatures to confirm power noise coupling.
Timestamp register readout (TSU/PHY evidence)
Register evidence is used to prove where error enters. Read timestamp/compensation fields and align them to TE/PDV observations. Without register evidence, asymmetry, incorrect compensation, and failover transients are often indistinguishable.
Minimum evidence set: two waveforms + one log field
- Waveform #1: PLL output clock or PHY reference clock (phase/jitter source).
- Waveform #2: schedule boundary marker (gate activity or a critical TSN stream arrival-time jitter).
- Log field: TE/offset at events + servo state (locked / holdover / locking) to classify step vs slew.
H2-11. Design Tradeoffs & Failure Modes
Determinism under fault is the qualification metric
A TSN system is not qualified by “good average timing” in a lab. Qualification means the timebase remains bounded during faults and transitions: no uncontrolled phase steps, bounded re-lock time, and schedule continuity under load, temperature, and redundancy events.
Failure Mode: Timestamp Jump
Symptom: TE vs time shows a step; histogram becomes double-peak; critical flows may cross Qbv boundaries.
Primary causes: GM switchover with phase step; servo forced re-lock; PHY delay compensation reloaded; PLL unlock/relock transient.
Minimum evidence set
- Waveform #1: PLL OUT or PHY REF clock phase/jitter
- Waveform #2: Schedule boundary marker (gate activity) or critical TSN stream arrival jitter
- Log field: offset + servo state + GM change event
First fix: enforce slew-only policy (bound step), enable holdover bridging, and freeze/sequence calibration reload to avoid “hot” offset changes.
- TSN switch (PTP/TSU capable): Microchip LAN9662; NXP SJA1105
- 1588-capable Ethernet PHY (HW timestamp support depends on config): TI DP83869; Microchip KSZ9131
- Jitter cleaner / clock generator: Silicon Labs Si5341; Renesas (IDT) 8A34001
Failure Mode: GCL Drift (Schedule Phase Drift)
Symptom: boundaries slowly shift over time; cycle alignment degrades; TE shows slope or low-frequency wander signature.
Primary causes: timebase wander; PLL bandwidth too narrow/wide for the target band; clock-domain mismatch between timebase and gate engine.
Minimum evidence set
- Waveform #1: gate boundary marker (per-cycle phase drift)
- Waveform #2: reference/PLL output stability (phase/frequency)
- Log field: offset trend + servo state
First fix: verify the scheduling engine is driven by the disciplined time domain; adjust loop bandwidth/damping and add guard band consistent with oscillator stability.
- Disciplined clock / time sync IC: Analog Devices AD9545; Renesas (IDT) 8A34001
- OCXO for longer holdover: Abracon AOCJY (OCXO family example)
- TCXO for cost/space: Epson TG-3530SA (TCXO family example)
Failure Mode: PLL Unlock / Relock Transient
Symptom: burst jitter appears; TE widens abruptly; scope shows unstable clock edges or phase excursions.
Primary causes: supply ripple coupling into VCO/control node; marginal damping/peaking; reference input quality below threshold; thermal corner cases.
Minimum evidence set
- Waveform #1: PLL rail ripple (at the IC pins)
- Waveform #2: PLL OUT / PHY REF clock phase behavior
- Log field: lock status / alarm bits
First fix: harden the PLL power path (filtering/LDO/return), reduce peaking risk (damping), then re-verify spectrum shape over temperature.
- Jitter cleaner / clock generator: Silicon Labs Si5341; Si5345
- Clock buffer/fanout (example): Analog Devices LTC6952 (buffer family example)
- Low-noise LDO (example): Analog Devices LT3042 (low-noise LDO family example)
Failure Mode: Servo Oscillation
Symptom: TE vs time shows periodic oscillation; histogram widens with a “structured” pattern; schedule jitter becomes periodic.
Primary causes: PI parameters too aggressive; integrator wind-up; timestamp noise increased; PLL peaking excites the loop.
Minimum evidence set
- Waveform #1: TE vs time (oscillation frequency)
- Waveform #2: PLL OUT / disciplined clock phase behavior
- Log field: servo state + loop status (locking/locked/holdover)
First fix: reduce P-gain / limit integrator, cap maximum slew per interval, and confirm the jitter cleaner has no peaking hump in the sensitive band.
- Time sync / disciplined clock IC: Analog Devices AD9545; AD9548
- Jitter cleaner: Silicon Labs Si5341; Renesas (IDT) 8A34002
Failure Mode: Thermal Drift
Symptom: TE slowly drifts with temperature; holdover window shrinks; PHY delay calibration bias changes across temperature.
Primary causes: XO/TCXO/OCXO temperature characteristics; PHY internal delay drift; supply/ground behavior changes with thermal load.
Minimum evidence set
- Waveform #1: temperature vs TE (correlation)
- Waveform #2: clock stability (phase/frequency) or calibrated offset readout over temperature
- Log field: calibration value + version + temperature point
First fix: implement temperature-point calibration strategy and store offsets with traceability; upgrade oscillator class if the holdover requirement demands it.
- OCXO (family example): Abracon AOCJY
- TCXO (family example): Epson TG-3530SA
- Temperature sensor (example): TI TMP117 (for correlation logging)
Failure Mode: Sync Storm (Over-Resynchronization)
Symptom: frequent GM changes; PDV spikes; TE shows repeated steps or burst jitter; servo state toggles repeatedly.
Primary causes: unstable GM quality triggers BMCA flapping; misconfigured domains/priority; missing hysteresis/holdover policy; noisy links causing repeated transitions.
Minimum evidence set
- Waveform #1: PDV under stress traffic
- Waveform #2: TE vs time (step density + burst behavior)
- Log field: GM-change counter + servo-state transition counter
First fix: add BMCA hysteresis and guard timers; prefer holdover bridging; reduce transition frequency before tuning deeper parameters.
- TSN switch: Microchip LAN9662; NXP SJA1105
- Jitter cleaner / timing hub: Renesas (IDT) 8A34001; Silicon Labs Si5341
- 1588 PHY: TI DP83869; Microchip KSZ9131
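The minimum-dwell idea in the first fix above can be sketched as a small gate placed in front of GM selection. This is a hypothetical local policy layered on top of whatever BMCA decides, not part of the algorithm itself, and it assumes holdover bridges the interval while a change is suppressed.

```python
class DwellGate:
    """Suppress GM changes that arrive sooner than min_dwell_s after the last one."""

    def __init__(self, min_dwell_s):
        self.min_dwell_s = min_dwell_s
        self.current_gm = None
        self.last_change_s = None
        self.suppressed = 0   # counter worth logging next to servo state

    def propose(self, gm_id, now_s):
        """Offer a candidate GM; return the GM the device should follow."""
        if self.current_gm is None:
            self.current_gm, self.last_change_s = gm_id, now_s
        elif gm_id != self.current_gm:
            if now_s - self.last_change_s < self.min_dwell_s:
                self.suppressed += 1          # too soon: keep the current GM
            else:
                self.current_gm, self.last_change_s = gm_id, now_s
        return self.current_gm


gate = DwellGate(min_dwell_s=10)
followed = [gate.propose(gm, t)
            for gm, t in [("A", 0), ("B", 3), ("B", 12), ("A", 15), ("A", 22)]]
```

In this trace the candidates offered at t=3 and t=15 are suppressed, so the device follows A, A, B, B, A. The suppressed counter gives a direct measure of how much flapping the policy absorbed.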
Three tradeoffs that decide fault determinism
- Fast re-lock vs schedule continuity: aggressive correction can create steps; bounded slew protects Qbv boundaries but converges slower.
- Loop bandwidth vs tracking agility: narrow bandwidth reduces certain jitter bands but may worsen low-frequency wander response; wide bandwidth can leak reference noise.
- Aggressive redundancy vs stability: rapid failover reduces outage time but increases transient density and “sync storm” risk without hysteresis/holdover.
Fault Qualification Map: symptoms → evidence → first fix
Note: MPNs listed are practical examples for reference and procurement shortlisting; final selection depends on required TSN features, timestamp integration points, temperature range, and clock/jitter budget.
H2-12. FAQs (10 Questions, Evidence-Based)
Each answer is evidence-driven: 1 sentence conclusion, 2 evidence points, and 1 first fix, with an explicit mapping back to the relevant H2 sections. This keeps troubleshooting deterministic and prevents scope creep.
Q1. SOE (sequence-of-events) timestamps jump 2 ms: GM switch or servo runaway?
Evidence #1: TE vs time shows a clear step (often with double-peak histogram).
Evidence #2: Logs show GM-change events and servo state toggling (locked → locking/holdover).
First fix: enforce a bounded slew-only correction during switchover and bridge with holdover to avoid hard steps.
Maps to: H2-6 / H2-7 / H2-10
- Timing/discipline: AD9545
- Jitter cleaner: Si5341
- TSN switch: LAN9662
Q2. Timestamp stable but latency drifts: PHY asymmetry?
Evidence #1: PDV stays moderate, but the measured one-way delay slowly biases over time/temperature.
Evidence #2: PHY RX/TX delay calibration values (or stored offsets) differ from the expected profile or change after resets.
First fix: lock calibration versioning and apply temperature-aware PHY delay compensation with traceable offsets.
Maps to: H2-8 / H2-10 / H2-11
- 1588 PHY: DP83869
- Switch/TSU: SJA1105
- Temp sensor (logging): TMP117
Q3. Holdover works in lab but not field: oscillator grade?
Evidence #1: TE vs time during GM loss shows a slope (frequency error) much larger than the lab baseline.
Evidence #2: Temperature correlation shows drift accelerates with thermal gradients; servo enters holdover but offset diverges quickly.
First fix: upgrade XO→TCXO/OCXO as required and re-tune holdover/servo policy with temperature-point characterization.
Maps to: H2-5 / H2-6 / H2-10
- OCXO family: Abracon AOCJY
- TCXO family: Epson TG-3530SA
- Jitter cleaner: 8A34001
Q4. TSN traffic still collides: GCL misalignment?
Evidence #1: Boundary marker shows window edges drifting relative to the critical stream arrival times.
Evidence #2: PDV spikes cluster around gate transitions, not uniformly across time.
First fix: re-validate GCL with measured residence time and add guard band sized to oscillator stability and worst-case latency bounds.
Maps to: H2-4 / H2-6 / H2-10
- TSN switch: LAN9662
- TSN switch: SJA1105
- Timing hub: AD9545
Q5. PLL reduces jitter but increases wander: bandwidth wrong?
Evidence #1: Integrated jitter over the target band drops, but TE vs time shows increased low-frequency slope/slow drift during disturbances.
Evidence #2: Phase noise shows peaking or insufficient suppression near the band that drives scheduling or servo behavior.
First fix: re-select bandwidth and damping to avoid peaking and meet both jitter and wander budgets over the defined integration bands.
Maps to: H2-9 / H2-6 / H2-10
- Jitter cleaner: Si5341
- Jitter cleaner: 8A34002
- Low-noise LDO: LT3042
Q6. Sync storm after failover: BMCA loop?
Evidence #1: Logs show frequent GM-change events and repeated servo state transitions; TE shows dense steps or burst jitter.
Evidence #2: PDV under load spikes during the transition window, indicating contention and unstable synchronization activity.
First fix: add BMCA hysteresis/holdover bridging and enforce minimum dwell time before re-electing a master.
Maps to: H2-7 / H2-10 / H2-11
- Timing hub: 8A34001
- TSN switch: LAN9662
- Jitter cleaner: Si5341
Q7. Sub-100 ns target not achieved: wrong timestamp location?
Evidence #1: t1–t4 variance is larger than expected and correlates with MAC/queue activity, suggesting non-wire-adjacent stamping.
Evidence #2: Correction/residence time fields do not match measured store-and-forward behavior, or PHY delay compensation is not applied consistently.
First fix: move stamping to HW TSU/PHY egress/ingress points and verify calibration offsets are persisted and versioned.
Maps to: H2-3 / H2-8 / H2-10
- TSN switch/TSU: SJA1105
- 1588 PHY: KSZ9131
- 1588 PHY: DP83869
Q8. PDV spikes during heavy traffic: queue starvation?
Evidence #1: PDV increases only when stress streams run; TE may remain stable, indicating a network scheduling issue, not a clock issue.
Evidence #2: Spikes align with GCL transitions or policing drops, visible as clustered delay outliers and missed windows.
First fix: validate Qbv window sizing against worst-case latency, enable/verify preemption policy, and isolate critical flows in scheduling.
Maps to: H2-4 / H2-10 / H2-11
- TSN switch: LAN9662
- TSN switch: SJA1105
- Timing hub: AD9545
Q9. Temperature causes phase step: XO drift?
Evidence #1: TE changes correlate with temperature inflection points rather than traffic load; histogram may show two clusters around the step event.
Evidence #2: Lock/alarm bits or calibration version changes appear near the event; PHY delay offsets may shift with temperature.
First fix: enforce temperature-point calibration with traceable offset versions and increase oscillator/clock-tree thermal margin.
Maps to: H2-5 / H2-8 / H2-11
- TCXO family: Epson TG-3530SA
- OCXO family: Abracon AOCJY
- Temp sensor: TMP117
Q10. Dual GM not seamless: phase alignment missing?
Evidence #1: TE vs time shows a step at the exact GM-change timestamp; step size repeats across tests, indicating policy/config rather than random noise.
Evidence #2: Logs show domain/priority changes, GM change count, and servo state transitions; PDV may spike during transition.
First fix: implement hitless switchover strategy (slew-only) with explicit domain alignment and minimum dwell timers to prevent rapid re-election.
Maps to: H2-7 / H2-6 / H2-10
- Timing hub: 8A34001
- Jitter cleaner: Si5341
- TSN switch: LAN9662