Timing & Sync for CCTV (PTP/IEEE-1588, PLL, RTC Holdover)
Core idea: CCTV-grade sync is not “PTP enabled”—it is a measurable budget of tails (95th/max offset), drift, and holdover/restore behavior proven end-to-end. Build it by enforcing true hardware timestamps, controlling network PDV with the right switch participation, and using a jitter-cleaned timebase that stays monotonic across load and power events.
Scope, Use-Cases, and “What Good Sync Looks Like”
This page is strictly about time infrastructure for CCTV: IEEE-1588/PTP with hardware timestamps, jitter-cleaning PLL/DPLL for clean clocks, and RTC hold-up for time continuity during outages. It does not cover PoE power allocation, surge devices, camera imaging/ISP, or NVR/VMS platform design.
“Good sync” in CCTV is not only about being close to an absolute time source; it is about being consistent across devices and auditable under real network load, temperature drift, and power events. The engineering target must be expressed in measurable metrics that can be verified in production tests and field diagnostics.
Three measurable success metrics
| Metric | What it captures | How it is proven (evidence) | Typical failure signature |
|---|---|---|---|
| Offset (time error now) | Instantaneous difference between device time and reference time. | Report the offset distribution over a defined window: mean, RMS, 95th, max. Use PHC readouts + PTP stats. | Looks “OK” on average but has a long tail (rare large errors) under load. |
| Drift (stability over time) | How fast offset changes when conditions vary (temperature, load, aging). | Track the slope of offset(t) over minutes/hours; compare across temperature bands. Store slope estimates in logs. | Offset slowly diverges even when PTP is “locked” (often oscillator grade/thermal). |
| Holdover error (when sync is lost) | Maximum time error during reference loss (link down / GM gone / reboot). | Force loss-of-sync and record error vs holdover time (e.g., 1/10/30 min). Require a holdover flag + recovery markers. | Time jumps backward/forward after power events; or error grows too fast in holdover. |
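The three metrics above reduce to a small, repeatable computation over a window of offset samples. A minimal sketch, assuming the samples come from PHC readouts or PTP daemon statistics (function names and sample values are illustrative, not from any specific tool):

```python
import math

def offset_stats(samples_ns):
    """Summarize a window of PHC-vs-reference offset samples (ns):
    mean, RMS, 95th percentile of |offset| (nearest-rank), and max |offset|."""
    n = len(samples_ns)
    mean = sum(samples_ns) / n
    rms = math.sqrt(sum(x * x for x in samples_ns) / n)
    ordered = sorted(abs(x) for x in samples_ns)
    p95 = ordered[min(n - 1, math.ceil(0.95 * n) - 1)]
    return {"mean": mean, "rms": rms, "p95": p95, "max": ordered[-1]}

# Example window: looks fine on average, but one congestion spike drives the tail
window = [5, -3, 8, -6, 4, -2, 7, 950, -4, 6]  # ns
stats = offset_stats(window)
```

Note how the mean (~96 ns) hides the 950 ns spike, while the max and 95th expose it; this is why acceptance language must be written in tails, not averages.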
Timing Metrics & Error Budget for Video/Events
CCTV timing design fails most often because requirements are expressed as “PTP on/off” rather than a measurable time error budget. A budget converts the system goal (alignment, correlation, forensics) into four practical contributors that can be engineered and verified: Network PDV, Timestamp path, PLL/DPLL residual jitter, and Oscillator/temperature/holdover drift.
Metrics that matter (and how they show up in the field)
| Term | Engineering meaning | Best evidence to capture | Most sensitive CCTV use-case |
|---|---|---|---|
| Offset | Instantaneous time error between endpoint and reference (GM). | Offset distribution: mean, RMS, 95th, max over window. | Multi-camera alignment & cross-device event ordering. |
| Jitter | Short-window variation of time error (high-frequency instability). | Short-window variance / peak-to-peak; compare idle vs loaded network. | Frame-level alignment, trigger consistency, “shimmering” event offsets. |
| Drift | Long-window trend of time error (slope), often temperature/oscillator driven. | Offset vs time slope under temperature bands; log slope estimates. | Long-running systems: slowly diverging timelines. |
| Skew | Relative error between two endpoints (A vs B), independent of absolute time. | Pairwise difference: offset_A - offset_B distribution. | Cross-camera correlation (what matters most in practice). |
| PDV (Packet Delay Variation) | Network-induced delay randomness that feeds directly into PTP servo noise. | Offset tail growth vs network load; queueing correlation; compare BC/TC vs unaware switch. | Large deployments, shared networks, congested uplinks. |
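Because skew is pairwise, any error common to both endpoints (e.g., both tracking a biased GM) cancels out. A minimal sketch of the offset_A - offset_B computation from the table (names and sample values are illustrative):

```python
import math

def pairwise_skew(offsets_a_ns, offsets_b_ns):
    """Distribution of offset_A - offset_B from time-aligned samples (ns).
    Common-mode error against the GM cancels; only relative error remains."""
    skew = [a - b for a, b in zip(offsets_a_ns, offsets_b_ns)]
    ordered = sorted(abs(s) for s in skew)
    n = len(ordered)
    p95 = ordered[min(n - 1, math.ceil(0.95 * n) - 1)]
    return {"p95": p95, "max": ordered[-1]}

# Both cameras sit ~1 µs off the GM, yet their relative skew is only a few ns
cam_a = [1000, 1004, 998, 1002]  # ns vs GM
cam_b = [997, 1001, 1000, 999]   # ns vs GM
skew = pairwise_skew(cam_a, cam_b)
```

This is why skew is often the real CCTV metric: cross-camera correlation survives an absolute offset that would fail a naive “offset to GM” check.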
“Most damaging” depends on which symptom must be prevented: jitter/PDV dominate under congestion, drift dominates over hours with temperature gradients, and holdover error dominates whenever the reference disappears (link/GM/power events).
Error budget: split the total time error into controllable contributors
An actionable budget turns “sync quality” into a set of engineering levers: reduce PDV exposure, move timestamps into hardware, attenuate jitter with DPLL, and choose an oscillator/RTC holdover strategy that matches outage and temperature realities.
- Network PDV contribution: queueing and path variation; mitigated by PTP-aware switching (BC/TC) and traffic isolation (principle only).
- Timestamp path contribution: if timestamps occur above the MAC/PHY boundary, software scheduling noise creates long-tail errors.
- PLL/DPLL residual contribution: loop bandwidth and lock strategy determine how much wander is passed vs filtered.
- Oscillator + temperature + holdover contribution: TCXO/OCXO grade and thermal environment set the stability ceiling during outages.
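One hedged way to roll the four contributors into a single number: root-sum-square (assuming the contributors are roughly independent) for the expected total, with the straight sum as a worst-case bound. The allocations below are purely illustrative, not recommendations:

```python
import math

def combine_budget(contributors_ns):
    """Combine per-contributor time-error allocations (ns).
    RSS assumes contributors are independent; the straight sum bounds the worst case."""
    rss = math.sqrt(sum(c * c for c in contributors_ns.values()))
    worst = sum(contributors_ns.values())
    return rss, worst

budget = {
    "network_pdv": 400.0,      # ns, after BC/TC mitigation (illustrative)
    "timestamp_path": 100.0,   # ns, with a true HW timestamp boundary
    "dpll_residual": 50.0,     # ns, post-filtering jitter
    "osc_drift": 200.0,        # ns, over the measurement window
}
rss, worst = combine_budget(budget)
```

If the RSS already exceeds the system-level target, no amount of tuning elsewhere will save the design; one contributor has to be re-engineered.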
Acceptance statement templates (copy into test plans)
- Steady-state: PHC offset to GM is ≤ X (95th percentile) and ≤ Y (max) over a Z-minute window under normal load.
- Stress / congestion: Under defined traffic stress, the 95th-percentile offset remains ≤ X, and the maximum does not exceed Y, with no backward time jumps.
- Holdover: After loss of reference, time error stays within X for 10 min of holdover and within Y for 30 min; the device logs enter_holdover / exit_holdover markers with timestamps.
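The templates translate directly into a mechanical pass/fail check over measured statistics. A minimal sketch (the X/Y thresholds and dictionary keys are placeholders to be filled from the test plan):

```python
def check_acceptance(stats, x_p95_ns, y_max_ns, backward_steps=0):
    """Evaluate the steady-state/stress templates against measured offset stats.
    Returns a list of failure strings; an empty list means the device passes."""
    failures = []
    if stats["p95"] > x_p95_ns:
        failures.append(f"95th {stats['p95']} ns exceeds X={x_p95_ns} ns")
    if stats["max"] > y_max_ns:
        failures.append(f"max {stats['max']} ns exceeds Y={y_max_ns} ns")
    if backward_steps:
        failures.append(f"{backward_steps} backward time step(s) observed")
    return failures

# Passes the 95th test but fails the max bound: a classic tail failure
result = check_acceptance({"p95": 800, "max": 2500}, x_p95_ns=1000, y_max_ns=2000)
```

Keeping the check this literal makes factory acceptance and field re-test directly comparable.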
PTP/IEEE-1588 in CCTV: Minimal Protocol View (Only What You Need)
This chapter intentionally avoids “PTP textbook depth”. It focuses on the smallest protocol view that directly supports CCTV engineering: which messages move which error terms, and what to verify in packet captures to prevent “PTP enabled but still unstable” deployments.
What Sync/Follow_Up changes (and what it cannot fix)
- Sync / Follow_Up are the “time sampling” path: they deliver timing information used to estimate current offset. Under congestion, variable queuing delay (PDV) can inflate the long-tail of offset even when the average looks fine.
- If cameras show occasional misalignment spikes, suspect the offset distribution tail, not the mean. A stable CCTV timeline must be written in percentiles (e.g., 95th) and worst-case bounds, not averages.
What Delay mechanism changes (and why PDV matters)
- Delay_Req / Delay_Resp (or peer-delay in some designs) estimates path delay. Any error or variation in this estimate feeds directly into offset and increases sensitivity to network load.
- PDV is the practical enemy: as network load rises, queueing variability can distort both the Sync sampling and Delay estimation, creating the exact symptom CCTV integrators hate: “mostly aligned, but occasionally wrong.”
Domain / priority / GM selection: practical meaning in CCTV
- Domain answers “which time universe is this device following?” A domain mismatch can produce large pairwise skew across cameras even if each camera claims it is “locked”.
- Priority & GM selection define time authority and failover behavior. Frequent GM changes can look like “random jitter” in offset; the cure is policy + logging: record GM changes and correlate them with offset spikes.
- For CCTV, the most important property is often consistency across endpoints and auditability, not raw absolute accuracy.
Packet capture checklist (minimum evidence)
| Field / item | Why it matters | Typical failure signature | What it affects (H2-2) |
|---|---|---|---|
| domainNumber | Confirms all endpoints follow the same time domain. | Some devices locked in a different domain → large camera-to-camera skew. | Skew / alignment |
| sequenceId | Pairs Sync ↔ Follow_Up and Delay_Req ↔ Delay_Resp for sanity checks. | Gaps/irregularities correlate with tail spikes (drops / reordering under stress). | Offset tails (95th/max) |
| correctionField | Shows switch timing corrections (TC/BC behavior) along the path. | Stagnant/absent correction suggests non-PTP-aware switching or wrong mode. | PDV sensitivity / tails |
| Message mix (Sync, Follow_Up, Delay_*) | Confirms the minimal timing loop is complete. | Missing Follow_Up or unstable Delay response → servo instability symptoms. | Offset / jitter |
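The checklist fields can be verified without a full protocol dissector: the PTPv2 common header has a fixed layout, so domainNumber, correctionField, and sequenceId can be decoded straight from the message payload. A minimal sketch; byte offsets follow the IEEE 1588-2008 common header, and the synthetic packet below is fabricated purely for illustration:

```python
import struct

def parse_ptp_header(payload: bytes):
    """Extract checklist fields from a PTPv2 common header (IEEE 1588-2008 layout)."""
    msg_type = payload[0] & 0x0F    # 0=Sync, 1=Delay_Req, 8=Follow_Up, 9=Delay_Resp
    version = payload[1] & 0x0F     # versionPTP (2 for PTPv2)
    domain = payload[4]             # domainNumber
    correction = struct.unpack(">q", payload[8:16])[0]   # correctionField, 2^-16 ns units
    seq_id = struct.unpack(">H", payload[30:32])[0]      # sequenceId
    return {
        "messageType": msg_type,
        "versionPTP": version,
        "domainNumber": domain,
        "correction_ns": correction / 65536.0,
        "sequenceId": seq_id,
    }

# Synthetic Sync header: domain 0, correctionField = 1 ns, sequenceId 42
hdr = bytearray(34)
hdr[0] = 0x00                          # Sync
hdr[1] = 0x02                          # PTPv2
hdr[8:16] = (0x10000).to_bytes(8, "big")
hdr[30:32] = (42).to_bytes(2, "big")
fields = parse_ptp_header(bytes(hdr))
```

A stagnant correction_ns of exactly 0.0 across many hops is the “non-PTP-aware switching” signature from the table above.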
Hardware Timestamping: Where the Timestamp Must Live
In CCTV deployments, PTP failures are often misdiagnosed as “protocol issues”. A more common root cause is simpler: timestamps are generated too high in the software stack, so OS scheduling, interrupts, and queueing create long-tail time errors. The result is the classic symptom: “PTP looks locked, but multi-camera alignment still slips.”
Where “HW timestamp” can live (and how to judge it)
- PHY-based: timestamp unit close to the physical interface; strongest isolation from software jitter.
- MAC/NIC-based: timestamp at transmit/receive boundaries (still wire-adjacent) and exposed via PHC.
- SoC-integrated Ethernet: can be valid HW TS if timestamping occurs at the Ethernet controller boundary, not inside an OS timestamp API above the driver.
Why “looks synchronized” but cameras still misalign
- Tail-driven slips: mean offset is small, but rare spikes break alignment. This is typical when timestamps are taken in software after queueing/scheduling delays.
- Load sensitivity: alignment degrades only when many streams run or when CPU load is high. That pattern is a strong indicator of software-path timestamp noise.
- Hidden time domains: camera-to-camera skew can remain large if devices follow different domains or unstable GM (validated in H2-3 via domainNumber and GM behavior).
HW TS vs SW TS: the A/B proof every project should run
| Test | How to run it | What to compare | Conclusion |
|---|---|---|---|
| Normal load | Record PTP stats and PHC offset over a fixed window. | mean, 95th, max offset; camera-to-camera skew. | Baseline time quality. |
| CPU + traffic stress | Increase video streams / CPU workload while keeping topology constant. | Tail growth: does 95th/max inflate dramatically? | Large tail growth implies software-path contamination. |
| HW TS vs SW TS | Compare true HW timestamp path against a degraded SW path (if available). | Tail ratio: HW should keep tails bounded; SW often shows spikes. | Confirms timestamp path as the limiting factor. |
| Pairwise skew | Measure offset_A - offset_B distribution across endpoints. | Skew percentiles under normal + stress. | Skew is often the real CCTV metric. |
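The A/B proof reduces to comparing tail statistics between two runs. A minimal sketch of the comparison; the sample values are synthetic, and in practice the two lists would be PHC offset windows captured at idle and under stress:

```python
import math

def p95_abs(samples_ns):
    """Nearest-rank 95th percentile of |offset| (ns)."""
    ordered = sorted(abs(x) for x in samples_ns)
    return ordered[min(len(ordered) - 1, math.ceil(0.95 * len(ordered)) - 1)]

def tail_growth(idle_samples, stress_samples):
    """Ratio of 95th-percentile |offset| under stress vs idle.
    A large ratio suggests software-path contamination of the timestamp path."""
    return p95_abs(stress_samples) / p95_abs(idle_samples)

idle = [10, -8, 12, -9, 11, -10, 9, -11, 10, -12]         # ns: tight and bounded
stress = [15, -12, 600, -14, 13, 800, -16, 14, -13, 700]  # ns: spikes under load
ratio = tail_growth(idle, stress)
```

A true HW timestamp path typically keeps this ratio close to 1 under CPU stress; a ratio in the tens, as here, is the contamination signature the table describes.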
Switch Participation: Boundary Clock vs Transparent Clock (Practical Implications)
In CCTV deployments, the switch layer often determines the synchronization ceiling. Protocol packets can look correct while alignment still fails because network PDV (packet delay variation) inflates the long tail of offset and skew. Switch participation (TC or BC) is the practical mechanism that prevents “multi-hop + mixed traffic” from turning into random timeline slips.
Transparent Clock (TC): what it fixes in practice
- A TC does not become a new clock authority. It forwards timing messages while writing path corrections (e.g., residence/forwarding time) into the packet’s correctionField.
- Practical impact: TC reduces how much multi-hop forwarding variability expands the offset tail. This is especially important when a CCTV network has multiple aggregation hops.
- What TC does not solve: a severely congested uplink can still produce PDV large enough to break tails. TC helps, but it cannot create bandwidth or eliminate queueing at saturation.
Boundary Clock (BC): when it becomes necessary
- A BC terminates the timing loop at the switch and runs a local servo, then re-originates timing downstream. This effectively segments the network and prevents upstream PDV from freely propagating through all hops.
- Practical impact: BC improves scalability and can make tail behavior predictable on large CCTV deployments. It also provides clearer “lock state” boundaries for auditing and debugging.
If switches are not PTP-aware: how to prove PDV is the bottleneck
The goal is to distinguish “network-limited” vs “endpoint-limited” synchronization using evidence that is repeatable and audit-friendly.
| Evidence step | How to run it | What to compare (must use tails) | Interpretation |
|---|---|---|---|
| A/B traffic load | Measure timing under idle network, then under representative CCTV traffic stress. | offset 95th/max, skew 95th/max between endpoints. | Tail inflation strongly correlated with load points to PDV dominance. |
| Path hop sensitivity | Compare a short path vs a longer/more aggregated path (same endpoints if possible). | Tail growth vs hop count. | Growth with hops indicates per-hop variability accumulation. |
| correction behavior | Inspect correctionField behavior in captures on TC/BC vs unaware switches. | Presence/stability of corrections vs tail reduction. | Absence of corrections + tail spikes is a classic “switch ceiling” signature. |
| Endpoint sanity check | Keep endpoints constant; change only load/topology. | Tails move with network conditions, not with endpoint CPU load. | Separates network PDV from endpoint timestamp/clock-tree issues. |
Clock Tree Inside Devices: PHC, Local Oscillator, and Distribution
PTP delivers timing information over the network, but an endpoint only becomes “sync-capable” when it can turn that information into a usable, low-noise clock and timebase inside the device. The internal clock tree is the bridge between PHC (hardware time), the local oscillator, and the clocks consumed by media pipelines and event logic.
PHC vs system clock vs media clock: separation goals
- PHC (PTP Hardware Clock): the time authority used by the PTP servo and used for time stamping and audit logs. It should stay stable and measurable (offset/skew percentiles).
- System/CPU clock domain: a noisy domain affected by DVFS, bus contention, interrupts, and scheduling. It should not be allowed to inject noise into timing-critical domains.
- Media/trigger clock domain: time-critical clocking for frame timing, encoder pacing, or event I/O alignment. This domain benefits from jitter-cleaned clocks and controlled distribution.
Clean clock vs recovered clock: which modules need which
| Clock consumer | Preferred input | Reason (engineering) | Failure signature if wrong |
|---|---|---|---|
| Timestamp/PHC boundary logic | Clean / disciplined | Directly impacts time error tails and stability. | Rare spikes; skew tails inflate under load. |
| Media pipeline / encoder timebase | Clean clock | Jitter/wander translates into frame pacing inconsistency. | Alignment slips even when PTP is “locked”. |
| Trigger / event I/O | Clean clock | Edge timing needs determinism for correlation. | Event stamps mismatch across endpoints. |
| General MCU control / low-rate telemetry | Recovered / tolerant | Less sensitive to short-term jitter. | Usually not the limiting factor. |
Evidence pattern that points to internal clock-tree coupling
- Offset mean remains stable, but 95th/max grows when CPU/video load increases.
- Pairwise skew grows under device load even when network conditions are unchanged.
- PTP capture fields look normal (domain/sequence/correction), yet alignment still slips: this shifts suspicion to internal distribution and clock cleanliness.
Jitter-Cleaning PLL / DPLL: How to Choose and Tune
In CCTV timing systems, “locking” is not the finish line. The real requirement is staying stable when network load, hop count, and endpoint activity change. A jitter-cleaning PLL/DPLL sits between a noisy time reference and the clocks consumed by media and event correlation. The objective is to reject short-term noise while keeping long-term error bounded and recovery behavior predictable.
Loop bandwidth: what it controls (and why “smaller” is not always better)
Loop bandwidth is the main knob that decides how much the output follows the input reference versus the local oscillator. It directly trades off noise rejection and correction speed.
| Bandwidth choice | What improves | What gets worse | Typical CCTV symptom |
|---|---|---|---|
| BW higher | Faster correction of frequency/phase error; faster “return to spec” after disturbances. | More input noise passes through (PDV/short-term jitter); tails can inflate. | Looks “responsive” but offset 95th/max grows when traffic is busy. |
| BW lower | Stronger filtering of short-term reference noise; cleaner output jitter. | Slower recovery; more dependence on oscillator drift/temperature; long settling tail. | Idle looks great, but after GM switch / reference drop, alignment takes too long to recover. |
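The bandwidth trade-off in the table can be felt with a toy one-pole servo, where a single gain stands in for loop bandwidth. This is a sketch only: a real DPLL runs a PI/PID servo with separate phase and frequency states, so the numbers here illustrate the trade-off, not any specific part:

```python
def discipline(reference, alpha):
    """One-pole servo sketch: each step the local estimate moves a fraction
    `alpha` of the way toward the (noisy) reference. alpha ~ loop bandwidth."""
    est = 0.0
    out = []
    for ref in reference:
        est += alpha * (ref - est)
        out.append(est)
    return out

# Trade-off 1: recovery speed after a 100 ns step disturbance
step = [100.0] * 50
fast = discipline(step, alpha=0.5)    # wide loop: converges in a few steps
slow = discipline(step, alpha=0.05)   # narrow loop: long settling tail

# Trade-off 2: pass-through of ±20 ns reference jitter (PDV-like noise)
jitter = [20.0 if i % 2 == 0 else -20.0 for i in range(200)]
fast_pp = max(discipline(jitter, 0.5)[100:]) - min(discipline(jitter, 0.5)[100:])
slow_pp = max(discipline(jitter, 0.05)[100:]) - min(discipline(jitter, 0.05)[100:])
```

The wide loop recovers from the step almost immediately but passes most of the jitter; the narrow loop cleans the jitter by an order of magnitude but is still far from spec after 10 steps, which is exactly the “slow recovery after GM switch” symptom.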
Holdover: who dominates the error when the reference is gone
- While disciplined: output stability is dominated by reference noise (PDV, switch behavior) and loop filtering strategy. The loop decides how much to “trust” the network reference.
- In holdover: error is dominated by the local oscillator (temperature drift, aging) and its environment. At that point, “good filtering” cannot compensate for an unstable timebase.
- At reference return: the servo strategy must prevent abrupt time steps and provide predictable convergence. Stability is measured by both transient shape and tail behavior after re-lock.
Acceptance measurements: choose one primary set and make it repeatable
For CCTV deployments, a practical acceptance method is to use offset distribution tails and a recovery curve under representative traffic. This directly maps to multi-camera alignment and auditability.
| Test condition | What to record | Pass/fail language (template) | What it proves |
|---|---|---|---|
| Steady under load | offset 95th, offset max (and/or skew tails) | “Under traffic profile X, offset 95th ≤ A and max ≤ B.” | Loop filters reference noise while keeping tails bounded. |
| Transient event | Offset vs time after disturbance (restore curve) | “After event Y, returns to spec within T seconds with no abnormal overshoot.” | Loop stability and predictable convergence. |
| Reference loss | Holdover drift vs time (or tail growth) | “In holdover for N minutes, error stays within D.” | Oscillator and holdover behavior meet the evidence budget. |
Holdover Timebase: RTC + Hold-Up Power + Temperature Reality
For CCTV evidence, the most damaging failure is not small drift — it is time discontinuity (time jumps, time going backward, or undefined gaps across power loss and reboot). Holdover design defines how a device preserves time continuity when power or reference timing is disturbed, and how it restores timing without breaking auditability.
RTC vs monotonic time: roles and boundaries
- RTC (wall time): preserves an approximate real-world time reference across power loss, enabling “close to correct” restoration. RTC accuracy is often temperature-dependent.
- Monotonic time: guarantees that local time used for ordering events never goes backward. Even if wall time is imperfect, monotonic continuity protects audit trails and sequencing.
- PTP/PHC re-discipline: aligns wall time back to the network reference after recovery while avoiding abrupt steps that break correlation.
Hold-up power goals: define duration and drift budget (not just hardware)
Hold-up power (supercap/battery) keeps the timing domain alive so the system can preserve continuity. But the error budget during holdover is dominated by RTC/oscillator drift and temperature reality, not by “capacity” alone.
| Goal dimension | What to define | How to express it (acceptance language) | Dominant contributor |
|---|---|---|---|
| Duration | How long the timing domain stays valid across outage. | “Holdover lasts N minutes without losing monotonic continuity.” | Hold-up rail + load of RTC/timing core. |
| Drift budget | Allowed error growth during holdover. | “During N minutes holdover, time error stays within D.” | RTC/oscillator temp drift + aging. |
| Restore behavior | How the device returns to the offset budget after power/reference recovery. | “Returns to spec within T seconds with no backward step.” | Servo strategy + DPLL settings. |
Evidence test: power loss → power restore (the restore curve)
Holdover acceptance should be demonstrated with a repeatable “loss-and-restore” test: run disciplined, force a controlled outage (power or reference), wait for a defined duration, then restore and record offset vs time until it returns to steady-state tails.
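Scoring the restore curve can be automated with a simple “re-entered spec and stayed there” detector. A minimal sketch; the sample curve is synthetic and the `hold` debounce length is an assumption to tune per deployment:

```python
def time_to_spec(curve_ns, spec_ns, hold=3):
    """First sample index where |offset| is within spec and remains so for
    `hold` consecutive samples (a crude 'returned to spec' detector).
    Returns None if the curve never settles within the window."""
    run = 0
    for i, x in enumerate(curve_ns):
        if abs(x) <= spec_ns:
            run += 1
            if run == hold:
                return i - hold + 1
        else:
            run = 0
    return None

# Synthetic restore curve after an outage (one sample per second, ns)
curve = [5000, 2500, 1200, 600, 900, 400, 200, 150, 120, 100]
settle = time_to_spec(curve, spec_ns=1000)
```

The debounce matters: without it, the transient dip at sample 3 followed by the 900 ns excursion could be miscounted as “settled” one sample too early in a noisier curve.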
Reference Options: GNSS, SyncE, and “When PTP Alone Isn’t Enough”
This section is a decision boundary (not a tutorial): choose the minimum reference stack that meets the long-term drift and audit requirements for CCTV timing. PTP can align endpoints very well inside a controlled network, but some deployments require a stronger long-term anchor or frequency discipline to maintain time continuity during long outages and harsh environments.
What each reference option primarily solves
| Option | Primary strength | Best for | Typical limitation (scope-safe) |
|---|---|---|---|
| PTP (IEEE-1588) | Distributes time and enables multi-endpoint alignment (when HW timestamp + controlled switching exist). | Multi-camera correlation inside a managed CCTV network. | Long-term stability depends on reference continuity and network behavior. |
| SyncE | Distributes a stable frequency reference (frequency discipline). | Reducing long-term frequency drift and improving holdover stability. | Does not directly provide absolute time-of-day by itself. |
| GNSS | Provides a long-term absolute time anchor (site-level truth source). | Cross-site audit, long outages, isolated sites, long holdover requirements. | Field conditions can affect availability; still requires a clean local timebase strategy. |
When CCTV deployments must introduce external reference
- Long holdover targets: if the site must maintain bounded time error for long periods without network reference, external anchoring or stronger timebase becomes necessary (GNSS and/or better oscillator class, and/or frequency discipline).
- Cross-site evidence correlation: multi-location incident reconstruction and audit trails often require a common long-term anchor.
- Uncontrolled switching / unstable PDV: external reference improves long-term behavior, but does not “fix” queueing variability; network ceilings still require evidence-based diagnosis (tails vs load/hops).
Validation & Compliance Tests: How to Prove Sync Works
Synchronization must be proven with repeatable evidence, not assumptions. This section provides a minimum viable test SOP that works for both factory acceptance and field re-testing: capture protocol evidence, read endpoint timing state, and report results in tail statistics and restore curves.
Minimal toolset (enough to close the evidence loop)
- Packet capture: confirm timing domains and message behavior (domain/sequence/corrections) at the network edge.
- PHC / endpoint timing status: confirm hardware timestamp path and servo state (locked/holdover/restore).
- Logs + statistics: compute offset/skew distributions and store restore curves for audit and regression.
A single table that works for factory acceptance and field re-test
Each test below includes: condition → measurement → pass/fail language → what it proves. The same table can be used in production as a baseline and in the field for re-validation after network changes.
| # | Test | Condition | What to measure | Pass/fail language (template) | What it proves |
|---|---|---|---|---|---|
| 1 | Steady (idle) | Low traffic, stable topology | offset 95th/max (and skew tails if needed) | “Idle: offset 95th ≤ A and max ≤ B.” | Baseline stability and measurement pipeline sanity. |
| 2 | Steady (load) | Representative CCTV streams + normal operations | offset 95th/max under load | “Load: offset 95th ≤ A’ and max ≤ B’.” | Tail robustness with real traffic and endpoint load. |
| 3 | Network congestion stress | Controlled congestion / PDV injection | Tail growth vs congestion level | “Under stress: tails remain within budget (or degrade predictably).” | Whether the ceiling is network PDV vs endpoint clock-tree. |
| 4 | Temperature drift | Temperature sweep (defined range/ramp) | Drift trend and tail stability across temperature | “Across temperature: drift ≤ D and tails stay bounded.” | Oscillator/RTC temperature reality vs requirement. |
| 5 | Power-loss holdover | Controlled outage for N minutes | Holdover drift + monotonic continuity | “Holdover N min: error ≤ D, no backward step.” | Continuity and drift budget across outages. |
| 6 | Reboot / restore | Reboot or reference return event | Restore curve (offset vs time), convergence time | “Returns to spec within T seconds, no abnormal step.” | Predictable recovery behavior and audit readiness. |
Field Debug Playbook: Symptom → Evidence → Isolate → Fix
This playbook is designed for field use: each symptom is closed with a minimal evidence loop. Every path starts with two measurements (fast, repeatable), then uses a discriminator to isolate the dominant contributor, and ends with a first fix (minimal change first). MPNs below are examples to guide BOM selection and replacement strategy.
Symptom A — Multi-camera videos drift further apart over time
Alignment looks acceptable at the beginning, but recorded timelines separate gradually (minutes to hours). This usually indicates a frequency/holdover dominance (local timebase and clock-tree reality) rather than pure network PDV.
- Offset trend: log offset (or skew) vs time for 10–30 minutes and estimate drift slope (not only snapshots).
- Endpoint timing state: record PHC/servo status (locked / holdover / restoring) and any re-lock events.
- Monotonic slope (steady drift direction) with stable jitter → local oscillator / temperature drift is dominant.
- No clear slope but heavy tails that correlate with traffic → network PDV is dominant (route to Symptom B).
- Frequent lock transitions → reference instability or servo configuration mismatch (route to Symptom C + restore behavior).
- Protect the clock-tree boundary: ensure media clock consumers use the cleaned clock, not a noisy CPU-derived clock.
- Upgrade timebase quality for drift budget (MPN examples):
- Temperature-compensated oscillator (TCXO): SiTime SIT5356 (TCXO family, typical 25/26 MHz options).
- MEMS oscillator (programmable): SiTime SiT8008 (common programmable oscillator family).
- Precision oscillator families: NDK NZ2520SDA (industry oscillator family; select frequency/ppm per need).
- If long holdover is required: add a disciplined reference path (see Symptom C) or move to GNSS/SyncE strategy (MPNs below).
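The drift-slope estimate in the first evidence step is a plain least-squares fit of offset against time; since offset is in ns and time in s, the slope is directly in ns/s, i.e. parts per billion. A minimal sketch with synthetic samples drifting at ~50 ppb:

```python
def drift_slope(times_s, offsets_ns):
    """Least-squares slope of offset vs time (ns/s, numerically equal to ppb).
    A steady non-zero slope points at oscillator/temperature dominance."""
    n = len(times_s)
    mt = sum(times_s) / n
    mo = sum(offsets_ns) / n
    num = sum((t - mt) * (o - mo) for t, o in zip(times_s, offsets_ns))
    den = sum((t - mt) ** 2 for t in times_s)
    return num / den

# 10 minutes of samples (one per minute) drifting at 50 ns/s with small jitter
t = list(range(0, 600, 60))
off = [50 * ti + (5 if i % 2 else -5) for i, ti in enumerate(t)]
slope_ppb = drift_slope(t, off)
```

Fitting a slope instead of comparing two snapshots is what makes the drift estimate robust to the jitter riding on top of it.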
Symptom B — Sync fails only during high traffic / congestion
The system is stable at idle, but tail events (large offset/skew) appear when video load increases or the network becomes busy. This pattern points to PDV (packet delay variation), switching participation limits, or an endpoint timestamp path that is not truly hardware-bounded.
- Tails vs load: compare offset 95th and offset max between idle and representative load.
- Capture evidence: inspect capture fields for timing behavior consistency (domain/sequence continuity; correction behavior if present).
- Idle good, load tails explode → PDV ceiling or queuing/jitter coupling into the servo loop.
- Tails correlate with hop count → switching participation (TC/BC capability) is likely the limiter.
- Tails correlate with endpoint CPU load → timestamp path may be contaminated by scheduling/interrupt latency.
- Verify endpoint timestamp boundary: ensure the NIC/PHY/MAC supports hardware timestamping at the correct boundary.
- Use a PTP-capable switch strategy (MPN examples, common in CCTV switching):
- Industrial Ethernet switch SoC families: Microchip LAN9668 (PTP-capable switch family; choose port count variant).
- PTP-capable industrial switch families: NXP Layerscape (LS10xx / LS20xx) (used in managed networking designs).
- If PDV dominates and cannot be reduced: tighten acceptance around tails and recovery; avoid masking network issues by over-filtering.
- Hardening on the edge: add a jitter-cleaning DPLL in the endpoint timing path (MPN examples):
- DPLL / jitter cleaner: Renesas (IDT) 8A34001 (SyncE/PTP timing family; pick variant per outputs).
- DPLL / jitter cleaner: Silicon Labs Si5341 (jitter-attenuating clock family; choose configuration as needed).
Symptom C — After power loss / reboot, time jumps or goes backward
This is a trust and compliance failure. Even small backward steps break event ordering and auditability. The root cause is usually holdover / RTC / restore policy, not “network accuracy.”
- Restore curve: record offset vs time from power restore until steady-state is reached.
- Monotonic violation evidence: check for backward-step flags/counters in logs (or detect discontinuity in timestamp sequence).
- Backward time step exists → restore policy is unsafe; must enforce monotonic continuity.
- No backward step but large initial error → holdover drift budget is not met (RTC/timebase dominance).
- Frequent restore oscillation → servo/loop configuration mismatch; limit correction rate and overshoot.
- Enforce monotonic restore: never step backward; converge with bounded slew rate to avoid audit breaks.
- Strengthen holdover timebase (MPN examples):
- RTC with good stability options: Microchip MCP7940N (I²C RTC family; choose accuracy grade as needed).
- High-accuracy RTC families: ABLIC S-35390A (RTC family used in low-power designs).
- Add hold-up power for the timing domain so RTC/timing core survives short outages (MPN examples):
- Supercap charger / backup controller: Analog Devices LTC3225 (supercap charge/balance family).
- Power-path / ideal-diode controller for backup OR-ing: Analog Devices LTC4412 (ideal diode controller family).
- If long outages must preserve absolute time: anchor with GNSS and discipline the local timing (MPN examples):
- GNSS receiver module: u-blox NEO-M9N (multi-GNSS module family).
- GNSS receiver module: Quectel L76K (GNSS module family).
- Harden the restoration clock path with a jitter cleaner / DPLL (examples: Si5341, 8A34001) to ensure predictable convergence.
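The “never step backward, converge with bounded slew” policy above can be sketched with a toy model: per tick the clock advances by its nominal tick plus a clamped correction, so every tick stays strictly forward. A real implementation would slew the clock frequency via the platform’s clock-adjustment API; the names and numbers here are illustrative only:

```python
def monotonic_converge(local_ns, ref_ns, tick_ns, max_adj_ns, steps):
    """Sketch of a monotonic restore policy: each tick applies a correction
    clamped to ±max_adj_ns, with max_adj_ns < tick_ns so the clock can never
    move backward even when it must 'lose' time relative to the reference."""
    assert max_adj_ns < tick_ns, "clamp must keep every tick strictly forward"
    trace = [local_ns]
    for _ in range(steps):
        err = ref_ns - local_ns                        # ref also advances each tick
        adj = max(-max_adj_ns, min(max_adj_ns, err))   # bounded slew
        local_ns += tick_ns + adj
        ref_ns += tick_ns
        trace.append(local_ns)
    return trace, ref_ns - local_ns                    # residual error after `steps`

# Local clock woke up 500 ns AHEAD of the reference: a naive step would go backward
trace, residual = monotonic_converge(local_ns=500, ref_ns=0,
                                     tick_ns=1000, max_adj_ns=100, steps=6)
```

The clock converges onto the reference in a handful of ticks, and the trace is strictly increasing throughout: exactly the audit property that a raw backward step would destroy.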
FAQs (PTP Timing & Sync for CCTV)
These FAQs target long-tail field questions without scope creep. Each answer is evidence-based: one result metric (tails / drift / restore curve) + one locating evidence (capture fields / PHC servo state), followed by a minimal first fix and a jump back to the relevant chapter(s).
Report tails (95th/max) and recovery curves, not averages.
PTP is enabled but cameras still drift apart over time—timestamping issue or oscillator drift?
Check whether this is a drift problem or a tail problem. First, log offset vs time for 10–30 minutes and estimate drift slope. Second, record PHC/servo state (locked/holdover/restore). A steady slope points to oscillator/clock-tree dominance; traffic-correlated bursts suggest a timestamp boundary or hidden software path. Fix by enforcing true HW timestamping and a clean clock tree.
Offset looks fine on average, but video alignment is still unstable—should I care about PDV?
Yes—video alignment is usually driven by tails, not the mean. Measure offset 95th and offset max at idle and under representative load, then correlate tail growth with congestion and hop count. If tails inflate with traffic, the network PDV ceiling is the limiter even when the average looks “good.” First fix is to validate BC/TC participation and write acceptance around tails.
When do I need a Boundary Clock switch instead of Transparent Clock?
Use a Boundary Clock when the network needs regeneration and isolation, not just correction. If tails grow quickly with hop count, mixed traffic, or multiple segments, BC terminates the timing session and re-times downstream, limiting how PDV accumulates. A Transparent Clock helps by accounting for residence time, but it does not re-create a clean local timing domain. Confirm the need by comparing tails across hop counts.
Hardware timestamp supported, yet jitter is high—where’s the hidden software path?
“Supported” does not guarantee the servo is using HW timestamps. First, compare timing noise at idle vs high endpoint CPU load: a CPU-correlated jitter increase often indicates a software-contaminated path (interrupt latency, scheduling, userspace stamping). Second, confirm the PHC is the active time source and not an OS clock proxy. First fix: force HW timestamp boundary usage, reduce timestamp handling in userspace, and validate with the H2-10 SOP.
How do I pick PLL loop bandwidth for “fast convergence” vs “clean output”?
Loop bandwidth trades recovery speed against noise pass-through. A wider loop converges faster after disturbances but tracks more PDV/wander, which can inflate alignment tails. A narrower loop cleans output but can converge slowly and may worsen long holdover error if misconfigured. Choose bandwidth from the error budget: specify tail limits and allowed restore time, then validate with step disturbances and congestion tests rather than intuition.
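The trade-off can be demonstrated with a toy first-order servo model, where `alpha` stands in for normalized loop bandwidth per sample. This is a teaching sketch under simplified assumptions (a single phase step plus bounded PDV-like noise), not a model of any real DPLL.

```python
# Minimal sketch: first-order tracking loop showing the bandwidth trade-off.
# Wide alpha -> fast settling, more noise pass-through; narrow -> the reverse.

def run_loop(alpha, steps=200):
    # Input: 1000 ns phase step plus deterministic pseudo-noise (illustrative)
    inp = [1000.0 + ((i * 119) % 200 - 100) for i in range(steps)]
    y, out = 0.0, []
    for x in inp:
        y += alpha * (x - y)          # first-order loop update
        out.append(y)
    settle = next(i for i, v in enumerate(out) if abs(v - 1000.0) < 200)
    ripple = max(out[steps // 2:]) - min(out[steps // 2:])  # steady-state noise
    return settle, ripple

for alpha in (0.5, 0.05):
    settle, ripple = run_loop(alpha)
    print(f"alpha={alpha}: settles in {settle} samples, ripple ~{ripple:.0f} ns")
```

The same shape holds for real loops: choose `alpha` (bandwidth) from the error budget's tail limits and allowed restore time, then confirm with step and congestion tests.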
Holdover passes 5 minutes but fails 30 minutes—RTC issue or oscillator grade?
Use the holdover restore curve to separate drift mechanisms. If error grows roughly linearly with outage duration, oscillator stability/temperature is dominant. If error shows steps or discontinuities at power transitions, the RTC/backup domain and restore policy are suspect. First measure drift rate per minute from 5 vs 30 minutes; then confirm whether the timing domain stayed powered. Fix by upgrading timebase quality or adding discipline (GNSS/SyncE) when the holdover budget is beyond the local timebase.
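Separating the two mechanisms can be automated on the logged error curve. A minimal sketch, assuming holdover error samples in microseconds at known times; the 50 µs step threshold and the data are illustrative assumptions.

```python
# Minimal sketch: split a holdover error log into linear drift (oscillator)
# and step discontinuities (RTC/backup domain or restore policy).

def find_steps(times, errors, threshold_us):
    """Flag sample-to-sample jumps larger than threshold."""
    steps = []
    for i in range(1, len(errors)):
        jump = errors[i] - errors[i - 1]
        if abs(jump) > threshold_us:
            steps.append((times[i], jump))
    return steps

# Illustrative log: 2 us/min linear drift plus one discontinuity at t=1200 s
times = [i * 60 for i in range(31)]                    # 0..30 min
errors = [2.0 * (t / 60) for t in times]
errors = [e + 500.0 if t >= 1200 else e for e, t in zip(errors, times)]

steps = find_steps(times, errors, threshold_us=50.0)
rate = (errors[-1] - errors[0] - sum(j for _, j in steps)) / (times[-1] / 60)
print(f"steps: {steps}")                 # one step flagged at t=1200 s
print(f"drift rate: {rate:.2f} us/min")  # residual slope -> oscillator share
```

A clean residual slope with no flagged steps argues for oscillator grade; flagged steps at power transitions argue for the RTC/restore path.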
After power loss, time jumps backward—what should be monotonic and what should be wall-clock?
Monotonic time must never go backward; wall-clock time may be corrected, but only with a bounded forward slew to preserve event ordering. First, detect and log any backward-step violations during restore. Second, capture the restore curve (offset vs time) until steady-state. If backward steps exist, fix the restore policy immediately: clamp corrections, preserve monotonic counters, and ensure the timing/RTC domain survives short outages with hold-up power.
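Both halves of that policy (detect violations, correct with a bounded forward slew) can be sketched directly. This is a simplified illustration: the per-tick slew limit is an assumed parameter, the log values are invented epoch seconds, and the loop ignores the real time that would elapse between ticks.

```python
# Minimal sketch: audit a restore log for backward wall-clock steps and
# apply a bounded forward slew instead of a hard (possibly backward) step.

MAX_SLEW_PER_TICK = 0.5   # seconds of correction per tick (assumption)

def audit_backward_steps(samples):
    """Return indices where reported wall-clock time went backward."""
    return [i for i in range(1, len(samples)) if samples[i] < samples[i - 1]]

def bounded_slew(current, target):
    """Move wall-clock toward target without ever stepping backward."""
    if target <= current:
        return current                  # never go backward; wait it out
    return current + min(target - current, MAX_SLEW_PER_TICK)

log = [1000.0, 1001.0, 1002.0, 998.5, 999.5]   # backward jump after restore
print("backward steps at indices:", audit_backward_steps(log))

t = 1002.0
for _ in range(4):                      # slew toward corrected time 1003.8
    t = bounded_slew(t, 1003.8)
print(f"slewed time: {t:.1f}")
```

The key property is that `bounded_slew` can only move time forward, so event ordering survives even while the correction converges.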
Only one site has bad sync—network congestion or switch feature mismatch?
Run the same SOP in both sites and compare evidence. First, measure tails (95th/max) at idle and load; if only the bad site inflates under load, congestion/PDV is likely. Second, capture key timing behavior and note hop count and switch capabilities (BC/TC participation). If idle is already poor, suspect endpoint timestamp boundary or clock-tree integrity. Field proof should bundle topology snapshot, tails, capture summary, and PHC/servo state.
Do I need SyncE if I already have PTP?
PTP can be sufficient for alignment inside a controlled network with short outages, but SyncE becomes valuable when frequency stability and long holdover drift dominate the requirement. SyncE disciplines the rate (frequency), while PTP carries time-of-day alignment. Decide using your holdover budget: if required drift per outage duration is tighter than the local timebase can sustain across temperature, add frequency discipline (SyncE) and/or a long-term anchor.
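The holdover-budget decision is simple arithmetic: frequency stability times outage duration gives a first-order worst-case error. The oscillator stability figures and the 1 ms budget below are illustrative placeholders, not vendor data, and the model ignores temperature-induced drift.

```python
# Minimal sketch: does the local timebase meet the holdover budget,
# or is frequency discipline (SyncE/GNSS) required?

def holdover_error_us(stability_ppb, outage_s):
    """First-order worst-case error: ppb * seconds -> microseconds."""
    return stability_ppb * 1e-9 * outage_s * 1e6

budget_us = 1000.0          # allowed error for a 30-minute outage (example)
outage_s = 30 * 60

for name, ppb in [("standard XO", 2000), ("TCXO", 200), ("OCXO", 10)]:
    err = holdover_error_us(ppb, outage_s)
    verdict = "OK" if err <= budget_us else "needs SyncE/GNSS discipline"
    print(f"{name}: ~{err:.0f} us over 30 min -> {verdict}")
```

When the required drift per outage duration is tighter than the local oscillator class can sustain across temperature, that is the signal to add frequency discipline rather than buy tighter PTP tuning.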
How can I validate sync without expensive lab gear?
You can, with a minimal evidence toolkit: packet capture, PHC/servo status, and logging/statistics. Execute a repeatable six-test set:
idle tails, load tails, congestion stress, temperature drift, power-loss holdover, and reboot/restore. Report 95th/max tails and the restore curve
with explicit pass/fail language so results are comparable across factory and field. This is sufficient to prove the ceiling and route root cause without specialized lab instruments.
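The "explicit pass/fail language" can be enforced mechanically by judging each test's tails against declared limits. The limit values and measured numbers below are illustrative placeholders, not a normative spec.

```python
# Minimal sketch: turn the six-test evidence set into comparable
# PASS/FAIL verdicts for factory and field reports.

LIMITS = {                        # (p95_us, max_us) acceptance per test
    "idle_tails":        (10.0,   50.0),
    "load_tails":        (50.0,  200.0),
    "congestion_stress": (100.0, 500.0),
    "temperature_drift": (50.0,  200.0),
    "power_holdover":    (500.0, 2000.0),
    "reboot_restore":    (100.0, 500.0),
}

def judge(results):
    """results: test_name -> (measured_p95_us, measured_max_us)."""
    report = {}
    for test, (lim_p95, lim_max) in LIMITS.items():
        p95, mx = results[test]
        report[test] = "PASS" if p95 <= lim_p95 and mx <= lim_max else "FAIL"
    return report

measured = {
    "idle_tails": (4.0, 20.0), "load_tails": (30.0, 150.0),
    "congestion_stress": (120.0, 600.0),   # tails inflate under congestion
    "temperature_drift": (20.0, 80.0), "power_holdover": (300.0, 900.0),
    "reboot_restore": (60.0, 200.0),
}
for test, verdict in judge(measured).items():
    print(f"{test}: {verdict}")
```

Keeping the limits in one declared table is what makes factory and field runs directly comparable: the same evidence, judged by the same numbers.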
Cameras sync fine individually, but fail when many streams run—why does load affect timing?
Load increases timing stress in two places: network queues (PDV grows) and endpoints (interrupt/scheduling pressure rises). The mean offset can remain “fine” while tails explode, which breaks frame alignment. First compare tails vs stream count and hop count; then check whether jitter correlates with endpoint CPU load. Fix by addressing PDV ceilings (switch participation/QoS) or hardening the timestamp boundary and clock distribution so media timing stays clean under load.
What logs/counters are most useful to prove the root cause in the field?
Prioritize evidence that closes the isolation loop: (1) tails and trends—offset 95th/max, drift slope, and restore curve; (2) servo state transitions—
locked/holdover/restoring; (3) PHC health counters and timestamp error indicators; (4) capture metadata—domain and sequence continuity plus correction behavior;
(5) event markers for power loss/reboot and network load. Bundle these with a topology snapshot so the same facts support “network PDV vs endpoint timestamp vs holdover policy” conclusions.