TCXO-Based RTC for Low-Drift I2C Timekeeping

Q: Why does drift look worse after enabling discipline?

Likely cause: The loop is chasing measurement noise (short window/torn reads) or using a temperature proxy that does not track the oscillator neighborhood (gradient/lag).
Quick check: Plot trim_code and time_error vs time; if trim toggles frequently while time_error variance rises, noise dominates. Also log i2c_retry_count and torn_read_count.
Fix: Increase averaging window, add deadband/hysteresis, slow the update cadence, and enforce trim-first / slew-only during steady state.
Pass criteria: trim_step_rate < K/hour and 24h_worst_case_time_error < X seconds under profile Y (budget-defined), with lower variance than discipline off.

Q: Why does time jump backward occasionally even if RTC is running?

Likely cause: A step correction is applied without a monotonic guard, or restore logic loads a stale saved timestamp/table and overwrites a newer time.
Quick check: Log monotonic_violation_count, corr_mode (step/slew/trim), and restore_phase; correlate backward jumps with correction events or restore entry.
Fix: Enforce monotonic time (clamp or slew-only during runtime); allow step only at boot with explicit time-was-untrusted flag; validate saved-table version/CRC before applying.
Pass criteria: monotonic_violation_count = 0 over N days of stress runs (including power cycles), and no negative time deltas in logs.

Q: My I²C reads sometimes show “mixed” seconds/minutes—what’s the first fix?

Likely cause: Torn reads across a rollover boundary, or missing the device’s snapshot/latch mechanism.
Quick check: Read time twice back-to-back; if values differ in an impossible way, torn reads are present. Track torn_read_count and burst_read_used.
Fix: Use snapshot/latch if available; otherwise use burst read + double-read-verify and avoid reads near rollover windows.
Pass criteria: torn_read_count = 0 over N consecutive reads, with i2c_retry_rate bounded.

Q: Why does calibration work at room temp but fail after a thermal cycle?

Likely cause: Calibration is overfit to one condition, or thermal gradients/sensor lag shift the effective oscillator temperature after cycling.
Quick check: Compare time_error vs temp_proxy correlation before/after cycle; log temp_proxy_offset and lag_tau; check cal_version and CRC on restore.
Fix: Use multi-point calibration with soak time, constrain the model, and validate across at least two independent thermal runs.
Pass criteria: After cycle, 24h_worst_case_time_error < X seconds under profile Y, and two repeated runs agree within margin M.

Q: How do I choose the update interval (too fast vs too slow)?

Likely cause: Too fast causes correction chatter from noise; too slow cannot track temperature drift/gradients.
Quick check: Sweep update_period and compare time_error variance, trim_step_rate, and correction_event_rate_per_day under the same thermal profile.
Fix: Tie update cadence to thermal time constant, low-pass the error estimate, and add a deadband around zero error.
Pass criteria: trim_step_rate bounded, correction_event_rate_per_day < K/day, and 24h_worst_case_time_error meets X under profile Y.

Q: Why does trim saturate at min/max and never recover?

Likely cause: Wrong sign/units in correction, insufficient trim range, or persistent temperature proxy mismatch biases the estimator.
Quick check: Log trim_code, time_error slope, and the sign of error→trim mapping; verify whether error improves when trim moves away from saturation.
Fix: Re-zero controller offset, clamp integral growth, add deadband, and validate direction with two-point perturbation; adjust architecture/policy if range is insufficient.
Pass criteria: trim_saturation_pct < P% over 24h and time_error slope converges inside the budget.

Q: How can I tell thermal gradient vs true oscillator drift?

Likely cause: Gradients change oscillator neighborhood temperature without changing the measured proxy; true drift stays consistent for the same neighborhood temperature.
Quick check: Hold chamber setpoint and toggle airflow/load; if time_error shifts while temp_proxy stays similar, gradients dominate. Track Δtime_error(airflow).
Fix: Improve sensor placement and thermal path management; slow corrections to match measured thermal lag.
Pass criteria: Under defined airflow/load toggles, Δtime_error < X seconds/day-equivalent and residual error correlates with the chosen proxy within tolerance.

Q: What’s the minimum data I should log to debug field drift?

Likely cause: Field drift becomes un-debuggable when correction context is missing.
Quick check: Ensure each correction event emits a compact record with consistent timestamps.
Fix: Log time_error, freq_error_est, temp_proxy, trim_code, update_period, corr_mode, i2c_retry_count, torn_read_count, power_state, restore_phase, cal_version, CRC, and a reason_code.
Pass criteria: For any anomaly window, logs reconstruct correction actions, bus health, temperature context, and calibration versioning with no gaps > T minutes.

Q: Why does backup mode lose the “disciplined” benefit after long off time?

Likely cause: Calibration/context is not retained (or fails CRC), backup-domain conditions differ, and restore does not re-enter discipline in a controlled sequence.
Quick check: Verify cal_version/CRC after restore; compare early time_error slope to baseline; check whether stale trim is applied.
Fix: Rate-limit writes, persist only stable estimates, and on restore validate table → coarse align (if allowed) → fine trim/slew with bounded gains.
Pass criteria: After off-time T, recovery_time < R minutes to return inside budget and CRC_fail_count = 0.

Q: Is it better to correct frequency (trim) or correct time (slew/step)?

Likely cause: Trim targets rate error; slew/step targets phase error; the wrong choice breaks monotonic logs or creates discontinuities.
Quick check: Identify allowed_step, max_slew_rate, and whether errors are rate-dominant vs phase-dominant; log corr_mode decisions.
Fix: Use trim-first for steady-state drift; use slew to remove residual phase error; reserve step for controlled boot-time alignment with explicit flags.
Pass criteria: monotonic_violation_count = 0, slew_rate ≤ S, and 24h_worst_case_time_error meets X under profile Y.


TCXO-Based RTC improves real-world timekeeping by disciplining a 32 kHz RTC with a stable TCXO reference and a measurable correction loop. It turns drift control into an engineering workflow—error budget, thermal strategy, firmware guards, validation gates—so accuracy is provable and maintainable in production.

What is a TCXO-Based RTC (and when you need it)

A TCXO-Based RTC keeps time in the usual low-power RTC domain (often a 32 kHz timebase), but uses a temperature-stable reference (a TCXO, or TCXO-referenced measurements) to discipline the RTC: measure drift, apply trim or corrections, and keep long-term error close to TCXO-grade behavior.

Definition in one line

RTC timekeeping remains in the always-on clock domain, while accuracy is enforced by TCXO-grade stability via periodic discipline (frequency trim and/or time correction).

What improves (quantified, in practical units)

Drift is best reasoned about in seconds per day, not just ppm. (Rule of thumb: 1 ppm ≈ 0.0864 s/day.)
Typical goal: move from tens of ppm down to a few ppm or better, with the exact target set by the system budget (temperature profile, holdover duration, logging requirements).
The gain comes from observability + correction: drift is measured against a stable reference and corrected before it accumulates.

When it is worth it (trigger checklist)

No GNSS / weak signal / offline device

Time-stamps must remain trustworthy over long deployments.

Long holdover / sleep intervals

Error must stay bounded across hours/days without an external time source.

Production consistency requirements

Tighter distribution across units after temperature cycling and aging.

Harsh temperature dynamics

Outdoor/industrial/vehicle thermal gradients dominate real-world drift.

Out of scope (to avoid topic overlap)

TCXO internal physics and phase-noise deep dives → see TCXO.
Generic RTC fundamentals (calendar/alarms/basic 32 kHz crystal design) → see RTC.
Backup switchover circuitry, supercap/coin-cell leakage troubleshooting → see RTC Backup & Switchover.
Network time distribution/discipline (PTP/SyncE/GPSDO) → see Timing & Synchronization.

System architectures: 3 ways to discipline an RTC with TCXO

TCXO-based timekeeping can be built in three practical ways. The best choice depends on power budget, required observability, temperature dynamics, and production calibration constraints. The goal of this section is to map architectures clearly; loop theory and deep PLL/cleaner behavior are intentionally excluded.

Key decision dimensions (fast scan)

Power budget: A = Medium–High, B = Low–Medium, C = Low–Medium
Hardware complexity: A = Medium, B = Low, C = Low
Observability (can drift be measured?): A = Strong, B = Strong, C = Depends
Production calibration friendliness: A = Medium, B = Strong, C = Strong

Labels A/B/C correspond to the architecture cards below. Values are directional; final choice must be validated against the drift budget and temperature profile.

A) Ref-in: TCXO provides a stable reference

What it is

A TCXO-grade clock is available in the system and acts as the stable reference for timebase generation or calibration, while the RTC interface remains I²C-based.

Minimum requirements

A stable TCXO output available to the timekeeping logic (directly or via measurement).
Defined behavior for startup/warm-up and reference validity detection.

Failure modes to watch

Calibration performed before the TCXO output is thermally settled (initial minutes dominate error).
Reference path contamination (power/ground noise) misleads the discipline loop.

B) Measure-and-trim: measure RTC drift, write trim

What it is

The RTC’s effective 32 kHz rate is measured against a TCXO-grade reference over a defined window, then corrected using the RTC’s digital calibration/trim mechanism.

Minimum requirements

A way to measure frequency or time error over a stable interval (counter/timer).
RTC supports trim granularity that can actually move the drift distribution.

Failure modes to watch

Too-short measurement window (noise dominates) or too-coarse trim step (no convergence).
Trim write timing near rollovers causes occasional apparent “time jumps”.
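
The measure-then-trim arithmetic can be sketched as follows. This is a minimal illustration, not a device driver: the function names, the 10 MHz reference, and the trim model (a signed code where each count shifts the rate by `step_ppm`) are assumptions, and the sign convention must be checked against the actual RTC.

```python
def rtc_error_ppm(rtc_ticks, ref_ticks, rtc_nominal_hz=32768.0, ref_hz=10_000_000.0):
    """Rate error of the RTC clock in ppm, measured over one gate.

    ref_ticks of the reference define the true gate length; rtc_ticks is
    what the RTC clock counted during that same gate.
    """
    gate_s = ref_ticks / ref_hz
    measured_hz = rtc_ticks / gate_s
    return (measured_hz - rtc_nominal_hz) / rtc_nominal_hz * 1e6


def trim_code_for(error_ppm, step_ppm, code_min, code_max):
    """Nearest trim code that cancels error_ppm, clamped to the register range.

    Assumption: a positive code speeds the clock up by step_ppm per count.
    """
    code = round(-error_ppm / step_ppm)
    return max(code_min, min(code_max, code))
```

With a 10 s gate, one RTC tick of counting uncertainty is 1/327680 of the count, or about 3 ppm, which is why a too-short window appears in the failure modes above.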

C) Soft discipline loop: firmware corrects time error (slew/step)

What it is

Firmware periodically estimates time error using a TCXO-grade reference, then corrects time in controlled steps (step) or gradually (slew) while enforcing monotonic behavior for logs.

Minimum requirements

Reliable time-error estimator (reference + stable logging cadence).
A correction policy that keeps application time monotonic when required.

Failure modes to watch

Over-correction creates visible time wobble (frequent steps) instead of reducing long-term drift.
State loss across power events breaks convergence until re-initialized properly.

Out of scope: PLL/clock-cleaner loop theory and deep jitter transfer. For those topics, use PLL and Jitter Attenuators / Clock Cleaners.

Error budget: from ppm to seconds/day (what dominates in reality)

Drift discussions become actionable only after choosing one consistent unit and an error budget structure. This section defines the conversion from ppm to seconds/day, then breaks total error into components that can be measured, controlled, and validated.

Conversion (fixed units for specs and acceptance)

ppm → seconds/day

seconds/day = ppm × 86400 / 1,000,000
Rule of thumb: 1 ppm ≈ 0.0864 s/day

ppm → seconds over holdover

seconds = ppm × holdover_seconds / 1,000,000
Example placeholder: X ppm × Y hours → Z seconds

Acceptance criteria should always specify the temperature profile and time window; ppm numbers without context are rarely predictive in real systems.
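
As a worked instance of the two conversions above, the following sketch implements them directly. The function names are illustrative, and the 2 ppm / 72 h inputs in the usage note are example values, not figures from a specific budget.

```python
SECONDS_PER_DAY = 86_400


def ppm_to_seconds_per_day(ppm):
    """Accumulated time error per day for a constant rate error in ppm."""
    return ppm * SECONDS_PER_DAY / 1_000_000


def ppm_to_holdover_seconds(ppm, holdover_seconds):
    """Accumulated time error over a holdover interval for a constant rate error."""
    return ppm * holdover_seconds / 1_000_000
```

For example, `ppm_to_holdover_seconds(2, 72 * 3600)` evaluates to about 0.52 s, assuming the rate error stays constant over the whole interval.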

Error terms (structured for engineering control)

Reference-limited

TCXO stability (temperature behavior and aging).
Reference validity during startup and warm-up.

Timebase & quantization-limited

RTC divider/counting granularity and trim step size.
Calibration quantization (cannot correct smaller than one step).

System-limited (often dominant in reality)

Thermal gradients (sensor location ≠ TCXO/RTC thermal domain).
Measurement window too short (noise dominates drift estimate).
Write/read timing (rollover, shadow registers, I²C latency).
Update strategy (step/slew policy causes apparent “drift”).

Out of scope: deep TCXO datasheet interpretation (phase-noise, internal compensation modeling). Use TCXO for device-level details.

“Expected root cause” vs “dominant real-world cause”

Upgrading TCXO does not improve drift

Likely dominant: board thermal gradient or wrong temperature proxy.
Quick check: log time_error and temperature during a chamber sweep and evaluate correlation and lag.

Short test looks good, long test is poor

Likely dominant: window too short or update too aggressive.
Quick check: increase measurement window by N× and verify whether the estimate converges or still wanders.

Occasional jump / mixed fields when reading time

Likely dominant: I²C atomicity (read across a rollover) or incorrect write timing.
Quick check: use a burst/latch read mode or double-read verify around second boundaries.

Acceptance template (reusable pass criteria)

Temperature profile: T(t) explicitly defined (step, ramp, cycle; include dwell time).
Primary metric: max |time_error| over the window (e.g., 24 h) < X seconds (system budget).
Secondary metric: error rate (seconds/day) bounded under the same profile.
Behavioral constraint: no prohibited time back-steps for logging/audit use-cases.
Health constraint: trim/correction does not remain saturated (otherwise the model/thermal proxy is wrong).

Units: seconds/day | Must specify: T(t) + window | Log: time_error + temp + trim
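
One way to apply the acceptance template to a recorded run is sketched below. It covers only the primary metric and the saturation health constraint; the function name, the sample layout, and the threshold parameters are assumptions for illustration.

```python
def evaluate_acceptance(samples, max_abs_error_s, trim_min, trim_max,
                        max_saturated_fraction):
    """Check one logged run against the primary metric and the health constraint.

    samples: list of (time_error_s, trim_code) recorded under one defined
    temperature profile and time window.
    """
    if not samples:
        raise ValueError("no samples")
    worst = max(abs(err) for err, _ in samples)
    saturated = sum(1 for _, trim in samples if trim <= trim_min or trim >= trim_max)
    saturated_fraction = saturated / len(samples)
    return {
        "worst_abs_error_s": worst,
        "saturated_fraction": saturated_fraction,
        "pass": worst < max_abs_error_s
                and saturated_fraction <= max_saturated_fraction,
    }
```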

Disciplining algorithms: open-loop trim vs closed-loop control

Discipline is an engineering policy: correct frequency (trim) or correct time (step/slew), while controlling noise sensitivity, thermal tracking, and time behavior constraints (especially monotonic logs). This section maps practical options without diving into network time algorithms.

Open-loop: temperature/LUT → trim

Input

Temperature proxy and calibration table (single-point / multi-point / chamber-derived LUT).

Output

RTC trim value (frequency correction). Behavior is stable and predictable when the thermal proxy matches the oscillator’s thermal domain.

Common failure

LUT becomes wrong under real airflow/heat sources due to thermal gradients and sensor mismatch.

Frequency-first | Low complexity
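
A minimal sketch of the open-loop path, assuming the LUT is a list of (temperature, trim code) pairs sorted by strictly increasing temperature. Linear interpolation and holding the endpoint value outside the calibrated range are illustrative choices, not a prescribed method.

```python
from bisect import bisect_right


def lut_trim(temp_c, lut):
    """Interpolate a trim code from a temperature LUT.

    lut: list of (temp_c, trim_code) with strictly increasing temperatures.
    Outside the calibrated range the nearest endpoint code is held.
    """
    temps = [t for t, _ in lut]
    if temp_c <= temps[0]:
        return lut[0][1]
    if temp_c >= temps[-1]:
        return lut[-1][1]
    i = bisect_right(temps, temp_c)          # first point above temp_c
    (t0, c0), (t1, c1) = lut[i - 1], lut[i]
    return round(c0 + (c1 - c0) * (temp_c - t0) / (t1 - t0))
```

The result is only as good as `temp_c`: if the proxy does not represent the oscillator's thermal domain, the interpolation is exact and the trim is still wrong.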

Closed-loop: error estimate → filtered correction

Error input

Time error or frequency error computed against a stable reference over a defined window.

Key knobs (engineering meaning)

Window: longer = less noise, slower reaction.
Update interval: too fast = chases noise, too slow = misses thermal drift.
Limits: clamp correction rate and handle saturation to preserve stability.

Stability signal

Convergence shows as a shrinking time_error envelope without frequent visible correction events.

Noise vs tracking trade-off | Requires observability
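
The knobs above can be combined in a small controller sketch: a first-order low-pass on the measured residual, a deadband, a per-update step clamp, and range clamping. The class name, parameters, and the trim model (positive code raises the rate by `step_ppm` per count) are assumptions, not a specific device's behavior.

```python
class TrimController:
    def __init__(self, alpha, deadband_ppm, max_step, code_min, code_max, step_ppm):
        self.alpha = alpha                  # low-pass weight, 0 < alpha <= 1
        self.deadband_ppm = deadband_ppm    # no action inside this band
        self.max_step = max_step            # max code change per update
        self.code_min = code_min
        self.code_max = code_max
        self.step_ppm = step_ppm            # assumed rate change per code
        self.filtered_ppm = 0.0
        self.code = 0

    def update(self, residual_ppm):
        """residual_ppm: rate error measured with the current trim applied."""
        self.filtered_ppm += self.alpha * (residual_ppm - self.filtered_ppm)
        if abs(self.filtered_ppm) <= self.deadband_ppm:
            return self.code                # deadband: ignore small errors
        wanted = round(-self.filtered_ppm / self.step_ppm)
        delta = max(-self.max_step, min(self.max_step, wanted))
        new_code = max(self.code_min, min(self.code_max, self.code + delta))
        # Account for the change just applied so the filter does not
        # keep reacting to error that has already been corrected.
        self.filtered_ppm += (new_code - self.code) * self.step_ppm
        self.code = new_code
        return self.code
```

If `code` sits at `code_min` or `code_max` for long periods, that is the saturation condition to monitor rather than something the loop can fix.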

Out of scope: PTP/SyncE academic algorithms and link delay correction. Use Timing & Synchronization.

Step vs Slew (time behavior policy)

Step

Best for initial alignment after boot or when the error is too large and the application permits a visible correction.

Slew

Best when logs require smooth behavior. Correction is spread over time to avoid sudden jumps.

Monotonic constraint

For audit/event ordering, enforce no back-steps. If negative correction is needed, apply it as a bounded slew or via an offset layer rather than rewriting time backward.

Policy: allowed jump? | Policy: max slew rate
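
A bounded slew can be sketched as a per-interval adjustment that never exceeds a configured rate. The function name and parameters are hypothetical; `max_slew_ppm` stands for the "max slew rate" policy value.

```python
def slew_step(remaining_s, max_slew_ppm, interval_s):
    """Return (adjust_s, new_remaining_s) for one update interval.

    The adjustment is limited to max_slew_ppm of the interval, so a negative
    correction slows reported time instead of moving it backward (as long as
    max_slew_ppm is well below 1e6).
    """
    limit = max_slew_ppm * 1e-6 * interval_s
    adjust = max(-limit, min(limit, remaining_s))
    return adjust, remaining_s - adjust
```

For example, removing a 0.5 s offset at a 500 ppm limit takes 0.5 / 500e-6 = 1000 s, which is the cost of avoiding a visible step.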

Update interval selection (fast vs slow costs)

Choose interval based on thermal time constant: fast thermal dynamics require faster tracking, but only if measurement noise is controlled.
Use a window long enough to average measurement noise; shrinking the window should not change the estimated drift direction dramatically.
Enforce rate limits: correction per update should be bounded to avoid visible time wobble.
Log these fields together to debug real behavior: time_error, temperature, trim/correction, interval.

Too fast → chases noise | Too slow → misses drift | Always clamp & monitor saturation

Temperature strategy: TCXO compensation, sensing, and thermal gradients

In TCXO-based timekeeping, temperature is not a single number. It is a field shaped by heat sources, copper planes, airflow, and thermal time constants. If the temperature proxy is wrong, even a strong ppm-grade TCXO can still produce visible drift at the system level.

TCXO internal compensation ≠ board temperature

What must match

The system temperature proxy must represent the same thermal domain as the TCXO/RTC. Physical distance alone is not sufficient; thermal coupling paths dominate.

Common mismatches

Sensor sits in a different airflow or copper plane than the TCXO.
Local self-heating changes package temperature without changing “board temp”.
Heat sources (SoC/PMIC) shift gradients during workload transitions.

Key risk: wrong proxy | Symptom: “ppm looks good” but drifts

Sensor placement & thermal path management (actionable)

Placement rules

Place the sensor in the TCXO/RTC thermal domain.
Avoid direct airflow and avoid high-gradient heat plumes.
Prefer stable coupling: predictable lag beats random airflow-driven noise.

Thermal path controls

Use keepouts / isolation slots to limit heat spreading from SoC/PMIC.
Avoid large copper planes that “drag” heat into the TCXO area.
Keep TCXO/RTC away from inductors and hot power stages.

Out of scope: oven-based stabilization details. For oven control strategies, use OCXO.

Thermal time constants: fast vs slow changes (control impact)

Fast changes (airflow, door open)

Risk: the proxy temperature moves faster than the oscillator domain. Updating too often can chase noise, creating visible wobble. Use longer windows, slower update, and correction clamps.

Slow changes (enclosure warm-up)

Risk: time error accumulates if the update interval is too long. Tracking requires an interval short enough to follow the thermal drift, with windows still long enough to keep estimates stable.

Log together: time_error + temp + trim | Knobs: window + interval + clamp

Typical root causes when drift persists

Thermal gradient dominates

Symptom: chamber looks stable; field drifts under airflow/workload changes.
First check: correlate time_error with temperature proxy and measure lag.

Package self-heating

Symptom: drift depends on output enable/load state.
First check: compare drift with identical temperature profile but different operating states.

Airflow-driven proxy noise

Symptom: correction becomes frequent without real improvement.
First check: increase window and reduce update rate; verify whether correction events decrease.
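
The "correlate and measure lag" check can be sketched with a plain cross-correlation over evenly sampled logs. The function names are illustrative. Note that rate error (the slope of time_error) is the quantity that physically follows temperature, so passing the slope series as `response` is usually the cleaner test.

```python
def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def estimate_lag_samples(temp_proxy, response, max_lag):
    """Lag (in samples) at which response correlates best with earlier temp_proxy."""
    if len(temp_proxy) != len(response) or max_lag >= len(temp_proxy) - 1:
        raise ValueError("series must match and be longer than max_lag + 1")

    def score(lag):
        return abs(pearson(temp_proxy[:len(temp_proxy) - lag], response[lag:]))

    return max(range(max_lag + 1), key=score)
```

A lag that changes between airflow or workload conditions points to a gradient problem rather than oscillator drift.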

Hardware design: reference routing, power, and I²C integrity

Hardware details often become “time drift” symptoms. Reference routing, low-noise power domains, and I²C integrity determine whether frequency correction is stable and whether time reads/writes remain atomic and reliable.

Reference path: routing & returns (key points only)

Keep the reference path short and keep a clean return nearby.
Avoid crossing split planes or noisy return loops that modulate threshold crossings.
Maintain separation from fast I/O and switching nodes to reduce injected jitter into measurement.
If level/interface options exist (CMOS/LVDS, etc.), treat them as a system choice; this section focuses on layout risk control.

Short + controlled return | Avoid split-plane crossings

Power domains: main / backup / always-on (why it matters)

Low-noise supply

TCXO/RTC should sit on a quiet LDO with clear domain boundaries to prevent digital load steps from modulating timekeeping.

Power sequencing

Avoid “half-powered” states that can corrupt RTC configuration or cause intermittent resets. Ensure backup switchover does not create ambiguous register behavior.

Domain boundaries | Quiet LDO for TCXO/RTC

I²C integrity: pullups, capacitance, and EMI (engineering keys)

Pullups set a trade-off: stronger pullups improve edge rate but increase backup-domain power.
Bus capacitance (trace length + devices) slows edges and reduces noise margin; intermittent retries can look like time “jumps”.
EMI coupling can flip bits or force retries. Use routing separation, clean returns, and robust read/write retry policies.
Clock stretching compatibility should be confirmed; mismatches can create silent timeouts and partial reads.

Backup power vs edges | EMI can masquerade as drift
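
The pullup trade-off can be put into rough numbers with a simple RC model. The sketch assumes the 30% to 70% rise-time definition used in the I²C-bus specification, which gives t_r ≈ 0.8473 × R × C; the function names and the example resistor, capacitance, and supply values are illustrative.

```python
import math


def i2c_rise_time_s(pullup_ohm, bus_capacitance_f):
    """30%-70% rise time of an RC-limited open-drain line."""
    return pullup_ohm * bus_capacitance_f * math.log(0.7 / 0.3)


def low_level_current_a(vdd_v, pullup_ohm):
    """Current drawn through one pullup while the line is held low."""
    return vdd_v / pullup_ohm
```

With 10 kΩ and 100 pF the rise time is roughly 850 ns, while the same resistor draws 0.3 mA from a 3.0 V rail whenever the line is low. Halving the resistor halves the rise time and doubles that current, which matters if the pullups sit on the backup domain.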

Out of scope: protocol teaching (addressing, arbitration, timing spec walkthroughs). This section focuses on failure modes that produce incorrect time reads/writes.

Shadow registers & atomic read (hardware + system cooperation)

Preferred behavior

Use a latch/snapshot mechanism if supported so that all time fields are read from a consistent captured state.

Fallback behavior

If snapshot is not available, use double-read consistency checks and avoid second-boundary reads; reliable edges and EMI control keep this effective.

Prevent mixed-field reads | I²C reliability required

Firmware timekeeping: read/write pitfalls, logging, and monotonicity

Many “RTC drift” issues are firmware artifacts: non-atomic reads, rollover-edge writes, inconsistent correction policy, or missing traceability. A disciplined RTC stack must make reads consistent, writes safe, and time behavior monotonic where required.

Read time correctly (avoid mixed-field reads)

Preferred

Use latch/snapshot read mode if supported.
Use burst read for full time fields in one transaction.

Fallback

Double-read-verify: read, then re-read and confirm stability.
Avoid sampling at second boundaries when snapshot is unavailable.

Goal: consistent fields | Risk: rollover mix
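
The double-read-verify fallback can be sketched as below. `read_fields` is a hypothetical callable that performs one burst read and returns the time registers as a tuple; the bus access itself is not shown.

```python
def read_time_verified(read_fields, max_reads=4):
    """Read until two consecutive burst reads agree.

    Returns (fields, reads_used). Raises RuntimeError if no two consecutive
    reads match within max_reads, so the caller can count it as a fault
    instead of using a possibly torn value.
    """
    previous = read_fields()
    for reads_used in range(2, max_reads + 1):
        current = read_fields()
        if current == previous:
            return current, reads_used
        previous = current
    raise RuntimeError("time registers did not stabilize")
```

A `reads_used` value above 2 is worth logging: it means a read landed on a rollover (or a bus error) and was caught.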

Write/correct safely (avoid rollover-edge writes)

Use a defined write window away from rollover edges (seconds/minutes boundary).
If the device supports shadow registers, use the recommended “write + commit” flow to avoid partial field updates.
Treat I²C errors as first-class events: retries must be bounded; repeated failures should trigger a fault state rather than silent drift.
Apply correction in the smallest domain possible (trim first; step only when policy allows).

Safe windowing | Atomic commit

Monotonicity: prevent time back-steps for logs

Policy

If the product requires ordered events, enforce no back-steps. When negative correction is needed, use bounded slew or an offset layer instead of rewriting time backward.

Step vs slew

Step is reserved for explicit re-sync conditions (boot, large error, non-audit mode). Slew is preferred when logs must remain smooth.

Constraint: no back-steps | Slew for audit logs
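
An offset layer with a monotonic guard might look like the following sketch. The class and attribute names are assumptions; the guard clamps rather than slews, so reported time holds still until the corrected clock catches up.

```python
class MonotonicTime:
    """Application time = raw RTC seconds + offset, never decreasing."""

    def __init__(self):
        self.offset_s = 0.0
        self.last_s = None
        self.clamped_count = 0   # how often the guard had to hold time

    def correct(self, delta_s):
        """Apply a correction to the offset layer; the RTC itself is untouched."""
        self.offset_s += delta_s

    def now(self, raw_s):
        t = raw_s + self.offset_s
        if self.last_s is not None and t < self.last_s:
            self.clamped_count += 1
            t = self.last_s      # hold instead of stepping backward
        self.last_s = t
        return t
```

A non-zero `clamped_count` does not break ordering, but it shows that a negative correction was larger than the elapsed time and would be better applied as a slew.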

Correction logging (required for field diagnosis)

Every correction should be traceable. Without a minimal record, thermal-proxy mismatch and I²C faults become indistinguishable from oscillator drift.

Log fields (minimum)

timestamp, rtc_time, system_time (or reference time)
time_error, estimated drift, trim/correction applied
temperature proxy, update interval, window length
reason code (boot, periodic, fault-recovery, manual)

Red flags

correction saturates for long periods (model/proxy mismatch)
frequent corrections without shrinking error envelope (noise chasing)
I²C retries spike near load transitions (EMI / return path)
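
The minimum log fields listed above could be carried in a record like this sketch. The field names, units, and JSON encoding are illustrative assumptions; a constrained target would more likely use a packed binary layout.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class CorrectionRecord:
    timestamp_s: float
    rtc_time_s: float
    reference_time_s: float
    drift_ppm_est: float
    correction_mode: str       # "trim", "slew", or "step"
    correction_value: float
    temp_proxy_c: float
    update_interval_s: float
    window_s: float
    reason: str                # "boot", "periodic", "fault-recovery", "manual"

    @property
    def time_error_s(self):
        return self.rtc_time_s - self.reference_time_s

    def to_json(self):
        fields = asdict(self)
        fields["time_error_s"] = self.time_error_s
        return json.dumps(fields, sort_keys=True)
```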

Out of scope: operating system time subsystems. This section covers only RTC discipline behavior and error-proof read/write strategy.

Holdover and backup: what happens when main power disappears

Backup mode must preserve continuity and preserve the “discipline state” so that restoration does not introduce a large discontinuity. The design must balance retention, write cadence, and leakage while keeping a clear re-discipline sequence after power returns.

Backup-mode goals (what must survive)

Continuous timekeeping (no resets, no ambiguous time state).
Retention of calibration parameters (trim/LUT/controller state if required).
Predictable restoration: no large time jump, no back-step for monotonic logs.
Low leakage discipline: backup budget is dominated by always-on load and bus pullups.

Continuity | Retention | No jump

Parameter storage: RTC RAM vs EEPROM vs FRAM (engineering trade-offs)

RTC RAM

Lowest write cost and fastest access, but retention depends on backup power continuity and device behavior.

EEPROM

Strong retention but has wear and higher write energy. Requires rate limiting and careful commit timing.

FRAM

Low write energy and high endurance, often preferred when frequent parameter updates are required in the field.

Key knob: write cadence | Key risk: wear / leakage

Write cadence (avoid draining backup or wearing storage)

Write only on meaningful state changes (model update, temperature regime change, fault recovery).
Rate limit periodic commits; prefer aggregated commits with a checksum and a version tag.
Do not write at the power-fail edge unless a safe “last-gasp” mechanism is guaranteed.
Log commit attempts and failures; a silent failure behaves like unexplained drift on restoration.

Commit with checksum | Avoid edge writes
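
A rate-limited commit policy along these lines could be sketched as follows. The class name and the two thresholds (minimum interval and minimum change) are hypothetical; the actual storage write and its checksum are outside this sketch.

```python
class CommitPolicy:
    """Decide when a stored parameter is worth rewriting."""

    def __init__(self, min_interval_s, min_change):
        self.min_interval_s = min_interval_s
        self.min_change = min_change
        self.last_commit_s = None
        self.last_value = None

    def should_commit(self, now_s, value):
        if self.last_commit_s is None:
            return True                      # nothing stored yet
        if now_s - self.last_commit_s < self.min_interval_s:
            return False                     # rate limit
        return abs(value - self.last_value) >= self.min_change

    def mark_committed(self, now_s, value):
        """Call only after the write has been verified."""
        self.last_commit_s = now_s
        self.last_value = value
```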

Restoration flow (coarse first, fine later)

Phase 1: coarse alignment

After power returns, re-establish a trusted baseline quickly (validate RTC, confirm monotonic policy, load retained parameters).

Phase 2: fine discipline

Resume periodic discipline with conservative window/interval until temperature and load settle, then return to normal tracking.

Out of scope: supercap/coin-cell switchover circuits and leakage troubleshooting. Use RTC Backup & Switchover.

Validation & measurement: how to prove drift improvement (bench + chamber)

This section turns “drift improved” into a measurable acceptance claim. The proof requires a trusted reference, consistent sampling, and traceability of the control output (trim/correction). The goal is time-drift validation, not RF purity or phase-noise deep evaluation.

A) What to measure (minimum set)

Measure both the outcome (time error) and the cause (trim/correction trajectory). Without control-output traceability, measurement artifacts can be misread as oscillator drift.

Primary signals

time_error: RTC vs reference (s/ms)
freq_error: counter or slope estimate
temp_proxy: TCXO/board sensor readout
trim/correction: code + mode (trim/slew/step)

Metadata (for debugging)

sampling interval Δt (actual, not assumed)
read method (snapshot/burst/double-verify)
I²C retries/timeouts (counts)
reason code (boot/periodic/fault-recovery)

Must log trim | Must log Δt

B) Bench methods (choose one primary + one verifier)

1) Counter / frequency gate

Best for freq_error. Use sufficiently long gate time to avoid mistaking short-term noise for drift. Correlate against temp_proxy.

2) I²C trace / logic analysis

Best for proving read/write correctness: snapshot/burst usage, retry storms, and rollover-edge behavior.

3) Long-run logging

Best for acceptance: time_error envelope over 24–72 hours with consistent sampling and full correction traceability.

Practical rule: trust improves when one method produces the primary metric and another independently verifies read/write integrity.

C) Chamber sweep (make temperature drift reproducible)

Define a temperature profile: ramp rate, dwell/soak time, and number of dwell points.
Record both chamber setpoint and temp_proxy to expose thermal lag and gradients.
Evaluate drift during transients (ramps) as well as at dwell points; ramps often dominate real systems.
Keep sampling interval stable; log correction trajectory to separate control behavior from the thermal plant.

Profile + soak | Log lag/gradient

D) Typical pitfalls (false drift)

Too-short window

Short gate or short log span makes noise look like drift. Use longer windows and report confidence bands.

Inconsistent sampling

Assuming a fixed Δt when task scheduling actually jitters distorts slope-based frequency estimates. Always log actual Δt.

Untrusted host time

PC clock corrections can dominate the measured error. Use a trusted reference source or explicitly log host sync events.

I²C delays as “drift”

Variable bus delay/retry changes apparent timestamps. Record retries/timeouts and avoid rollover-edge reads/writes.
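
To avoid the inconsistent-sampling pitfall, the drift rate can be fitted against the logged timestamps instead of an assumed Δt. The sketch below uses an ordinary least-squares slope; the function name is illustrative, and timestamps are assumed to come from the trusted reference.

```python
def drift_ppm_from_log(timestamps_s, time_errors_s):
    """Least-squares slope of time_error versus actual timestamp, in ppm."""
    n = len(timestamps_s)
    if n < 2 or n != len(time_errors_s):
        raise ValueError("need two or more paired samples")
    mean_t = sum(timestamps_s) / n
    mean_e = sum(time_errors_s) / n
    sxx = sum((t - mean_t) ** 2 for t in timestamps_s)
    if sxx == 0:
        raise ValueError("timestamps must not all be equal")
    sxy = sum((t - mean_t) * (e - mean_e)
              for t, e in zip(timestamps_s, time_errors_s))
    return sxy / sxx * 1e6   # seconds per second -> ppm
```

Because the fit uses real timestamps, irregular sampling does not bias the slope; it only changes how much each sample contributes.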

E) Pass criteria template (acceptance-ready)

Profile: temperature profile Y (ramp + soak defined)
Duration: continuous run T (e.g., 24–72 h)
Metric: worst-case |time_error|, plus drift rate (optional)
Criteria: worst-case error < X seconds, no persistent trim saturation, and no recurring I²C fault storms

Thresholds (X, Y, T) must come from system budget and workload profile. This page provides structure, not fixed numbers.

Production calibration & field maintenance: what to store, what to alarm

A TCXO-disciplined RTC must be manufacturable and maintainable. This section defines calibration strategies, storage structure, alarm triggers, and safe fallback behavior. Security/audit time signing is out of scope here and should be handled in a dedicated secure-time design.

A) Production calibration strategy (1-point / 2-point / multi-point)

1-point

Lowest cost. Suitable when thermal gradients are controlled and the error model is stable. Verification must still use a defined profile.

2-point

Balanced approach. Captures a dominant slope term without exploding complexity. Often the best default for production.

Multi-point (temperature points)

Used when temperature behavior must be explicitly modeled. Requires guardrails against overfitting and a stronger verification gate.

Overfitting guardrails

Use a verification profile different from the calibration points (do not “train on the test”).
Reject solutions that rely on persistent trim saturation to meet short tests.
Treat sensor/TCXO thermal mismatch as a model risk; more points can amplify the mismatch.

B) What to store (versioned + integrity-checked)

Storage must survive resets and field updates. Use explicit versioning and CRC so corrupted or incompatible data triggers a predictable fallback rather than silent drift.

Calibration record template

cal_version + model id
CRC (full record integrity)
points[]: temperature point(s) + coefficients/trim base
valid_range: temperature/profile label
write_count + commit sequence number

Versioned | CRC-protected | Predictable fallback
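
A reduced form of the record template can be sketched with a fixed binary layout and a CRC-32 trailer. The layout, field widths, and function names are assumptions, and the sketch omits model id, valid_range, and write_count for brevity. Returning `None` is the hook for the predictable fallback.

```python
import struct
import zlib

_HEADER = struct.Struct("<HHI")   # cal_version, point_count, commit_seq
_POINT = struct.Struct("<hh")     # temperature in 0.1 degC, trim code


def pack_record(cal_version, commit_seq, points):
    body = _HEADER.pack(cal_version, len(points), commit_seq)
    body += b"".join(_POINT.pack(t, c) for t, c in points)
    return body + struct.pack("<I", zlib.crc32(body))


def unpack_record(blob, supported_version):
    """Return (commit_seq, points), or None if the record must not be applied."""
    if len(blob) < _HEADER.size + 4:
        return None
    body = blob[:-4]
    (crc,) = struct.unpack("<I", blob[-4:])
    if zlib.crc32(body) != crc:
        return None                               # corrupted
    version, count, seq = _HEADER.unpack_from(body)
    if version != supported_version:
        return None                               # incompatible layout
    if len(body) != _HEADER.size + count * _POINT.size:
        return None                               # inconsistent length
    points = [_POINT.unpack_from(body, _HEADER.size + i * _POINT.size)
              for i in range(count)]
    return seq, points
```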

C) Field alarms (detect “discipline is lying”)

1) time_error rate

Trigger when error growth rate exceeds a budgeted threshold. First action: check temperature correlation and correction cadence.

2) trim saturation

Trigger when trim remains near limits for long periods. This often indicates model mismatch or a dominant unmodeled thermal gradient.

3) temperature mismatch

Trigger when temperature changes no longer correlate with expected error response (lag/gradient changed by airflow or load).

Alarm thresholds must be derived from system budget. This page defines signals and failure modes, not fixed numbers.

D) Maintenance actions + safe rollback

Re-calibrate when alarms indicate persistent mismatch and the environment profile has changed.
Re-verify after firmware updates that change sampling cadence, correction policy, or temperature source.
Fallback if discipline becomes unstable: degrade to “plain RTC behavior” while preserving logging and alarms.
RMA decision when CRC fails repeatedly, trim saturates across conditions, or I²C reliability collapses.

Out of scope

Secure audit / signed time should be handled in the Secure RTC page (avoid mixing production drift maintenance with security architecture).

Engineering checklist (bring-up → validation → shipment)

This checklist converts “TCXO-based RTC discipline” into verifiable actions. Each item defines what to check, what to log, and what counts as pass, so results remain reproducible across bench, chamber, and production.

A) Bring-up — make the time path trustworthy first

1) I²C read consistency (no torn reads)

Check: reading time/date never returns mixed fields across a rollover boundary.
How: snapshot/latch mode, burst read, or double-read-verify (two identical reads required).
Log: read method, retry count, torn-read counter, bus error codes.
Pass: 0 torn reads in long-run sampling; retry rate under a defined threshold (project-specific).
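A minimal sketch of double-read-verify, assuming a caller-supplied read_burst() that returns all time registers from one burst read as a tuple. The function name, the retry limit, and the mismatch counter are hypothetical. A mismatch can be a torn read or simply a legitimate tick between the two reads, so the counter is an upper bound on torn reads.

```python
def read_time_verified(read_burst, max_reads=4):
    """Accept a time value only when two consecutive burst reads agree.

    Returns (value, mismatches); raises RuntimeError if no two consecutive
    reads match within max_reads bus reads.
    """
    mismatches = 0
    previous = read_burst()
    for _ in range(max_reads - 1):
        current = read_burst()
        if current == previous:
            return current, mismatches
        mismatches += 1          # log this alongside the bus retry count
        previous = current
    raise RuntimeError("no two identical consecutive reads")
```

If the device offers a snapshot/latch mechanism, that is the preferred method and this loop becomes a verifier rather than the primary defence.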

2) Calibration register sanity (read/write/retain)

Check: trim/offset registers are writable and read back identically.
How: write → readback → soft reset → verify expected persistence behavior.
Log: written code, readback code, reset context, CRC/metadata if stored externally.
Pass: readback matches; post-reset behavior matches datasheet expectations.

3) Trim effectiveness (direction + stability)

Check: changing trim produces a consistent change in measured frequency/time error.
How: apply two trim codes; measure error sign and response repeatability.
Log: trim code vs time error trend; “trim activity” (how often changes occur).
Pass: monotonic direction; no high-frequency toggling that indicates chasing noise.
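The two-code check can be reduced to a slope and a repeatability test. This is a sketch under the assumption that frequency error is measured in ppm at each trim code; the function names and the spread tolerance are illustrative, and the expected sign must come from the device datasheet.

```python
def trim_sensitivity(code_a, ppm_a, code_b, ppm_b):
    """Measured frequency change per trim LSB between two trim codes."""
    return (ppm_b - ppm_a) / (code_b - code_a)


def trim_direction_ok(trials, expected_sign, max_spread_ppm_per_lsb):
    """trials: repeated (code_a, ppm_a, code_b, ppm_b) measurements.

    Passes only if every trial moves in the expected direction and the
    slopes agree with each other within the given spread.
    """
    slopes = [trim_sensitivity(*t) for t in trials]
    same_sign = all(s * expected_sign > 0 for s in slopes)
    repeatable = max(slopes) - min(slopes) <= max_spread_ppm_per_lsb
    return same_sign and repeatable
```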

B) Thermal — verify the temperature proxy and gradients

1) Chamber profile readiness

Check: ramp/soak steps are defined and repeatable.
Log: profile ID, setpoint, measured proxy temperature, time stamps.
Pass: repeated runs stay within the same thermal envelope (project-specific tolerance).

2) Sensor coherence (offset + lag)

Check: sensor reading represents the TCXO/RTC neighborhood (not a distant heat source).
Log: sensor vs board reference temperature; estimated time constant (lag).
Pass: offset/lag is stable enough to be compensated or bounded in the error budget.

3) Gradient sensitivity (airflow/load)

Check: time error does not jump when airflow/load conditions change.
Log: labels for airflow/load modes + time error trajectory.
Pass: delta stays within budget; otherwise enforce a more conservative update cadence.

C) Algorithm — cadence, step/slew policy, monotonicity

1) Update cadence (avoid chasing noise)

Check: corrections are not toggling rapidly (a sign of measurement noise domination).
Log: update period, time error variance, trim activity, outlier counters.
Pass: time error converges while trim activity remains bounded.

2) Step vs Slew (log integrity)

Check: policy matches the system’s audit/logging requirement.
Log: correction mode (step/slew/trim), correction magnitude, reason codes.
Pass: no forbidden step events; slews stay within allowable slope limits (project-specific).

3) Monotonicity guard (never go backwards)

Check: system time never decreases, even during correction and restore.
Log: monotonic violation counter, last-good timestamp, mitigation action.
Pass: 0 violations; or violations are explicitly flagged and quarantined by design.
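One simple way to implement the guard is to clamp any candidate timestamp that would move backward and count the event. The class below is a sketch with hypothetical names; it shows the clamp variant only, not slew-based mitigation.

```python
class MonotonicGuard:
    """Clamp backward time candidates and count each violation."""

    def __init__(self, start_s):
        self.last_good_s = start_s
        self.violations = 0

    def accept(self, candidate_s):
        """Return the timestamp the system should expose."""
        if candidate_s < self.last_good_s:
            self.violations += 1      # log with a reason code in a real system
            return self.last_good_s   # hold at the last good value
        self.last_good_s = candidate_s
        return candidate_s
```

During runtime the counter should stay at zero; a non-zero value after a correction or restore points at the path that needs a slew-only policy.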

D) Backup/Holdover — persistence, restore, and re-discipline

1) Power-fail entry and stability

Check: single clean transition into backup mode (no chatter).
Log: power-fail timestamp, entry flags, backup current snapshot (if measurable).
Pass: no repeated entry/exit; critical parameters remain intact.

2) Parameter persistence cadence

Check: storing calibration/metadata does not exceed energy or endurance budget.
Log: write count, write reason code, CRC/version, last-write age.
Pass: write schedule obeys policy; CRC always validates after restore.

3) Restore and re-discipline (no unexplained jumps)

Check: restore uses “coarse then fine” and respects monotonicity.
Log: restore phases + time error/trim trajectory, fallback triggers.
Pass: time error returns inside the budget within a defined recovery window.

Shipment gate — one-page acceptance template

1) Worst-case time error: 24 h worst-case error < X s under temperature profile Y (project-defined).
2) Control health: trim does not saturate long-term; correction activity stays bounded.
3) Robustness: I²C error storms and monotonicity violations are 0 (or within a defined spec).

Fail mapping: gate 1 → error budget/thermal/measurement; gate 2 → control strategy/thermal; gate 3 → I²C integrity/firmware handling/factory-field policy.

Diagram: Checklist gates (Functional → Thermal → Holdover → Field robustness)

Gate-based checklist keeps drift claims measurable and reproducible across bench, chamber, and production.

Applications & IC selection notes (TCXO-Based RTC)

This section turns “TCXO-based RTC” into a selection flow: choose the architecture (A/B/C), then select TCXO/RTC/NVM features that match the drift budget, holdover profile, and logging integrity needs. Part numbers below are starting points for datasheet lookup—verify suffix, package, temperature grade, and availability.

A) Application slices (within this page boundary)

Industrial data logging (unattended)

Need: controlled drift over long runtimes.
Constraint: weak/absent external time sources.
Accept: 24 h worst-case error < X s under profile Y (project-defined).

Billing / event logs (consistency first)

Need: predictable drift + monotonic time policy.
Constraint: corrections must not break audit trails.
Accept: no forbidden steps; all corrections are logged with reason codes.

Off-grid devices (no gateway time)

Need: stable time base without relying on periodic sync.
Constraint: power budget, backup domain behavior.
Accept: holdover drift stays inside the budget; restore is “coarse then fine”.

Weak GNSS environments (holdover quality)

Need: acceptable drift when sync is intermittent.
Constraint: unpredictable reacquisition times.
Accept: drift bounded across temperature and airflow perturbations.

B) Selection fields (TCXO / RTC / NVM) + concrete part numbers

Option 1 — “All-in-one” temperature-compensated RTCs (fastest BOM)

Use when minimal part count and predictable drift matter more than custom discipline loops.

DS3231SN# — I²C RTC with integrated TCXO + crystal.
DS3232S# — I²C RTC with integrated TCXO + crystal, plus battery-backed SRAM.
DS3234S# — SPI RTC with integrated TCXO + crystal, plus SRAM (SPI systems).
DS3231M — I²C temperature-compensated RTC with internal MEMS resonator (no external crystal).

Option 2 — External TCXO as the stable reference (architecture A/C)

Use when the system already has a stable MHz reference (MCU/SoC timer) or needs tighter stability than a plain 32 kHz crystal domain.

SiT5356AI-FQ-33E0-10.000000X — 10 MHz Super-TCXO (LVCMOS).
ASTX-H11-25.000MHZ-T — 25 MHz TCXO (HCMOS).
ASTX-H11-27.000MHZ-T — 27 MHz TCXO (HCMOS; common for platform clocks).

Note: external TCXO frequency can be divided to seconds via a timer/counter path; the RTC can be corrected (slew/step/trim) through firmware policy.
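To make the timer/counter path concrete, the sketch below estimates RTC rate error by counting TCXO ticks over a gate that the RTC defines (for example N periods of a 1 Hz output), then converts ppm to seconds per day using 1 ppm = 0.0864 s/day. It treats the TCXO as an ideal reference, which is an assumption; the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def rtc_error_ppm(tcxo_hz, gate_rtc_seconds, counted_ticks):
    """RTC fractional rate error in ppm; positive means the RTC runs fast.

    A fast RTC closes its gate early, so fewer TCXO ticks are counted than
    the nominal tcxo_hz * gate_rtc_seconds.
    """
    expected_ticks = tcxo_hz * gate_rtc_seconds
    return (expected_ticks / counted_ticks - 1.0) * 1e6


def ppm_to_seconds_per_day(ppm):
    """Convert a rate error in ppm to accumulated seconds per day."""
    return ppm * 1e-6 * SECONDS_PER_DAY
```

For example, a 10 MHz reference counted over a 10 s RTC gate that yields 99,999,800 ticks corresponds to roughly +2 ppm, or about 0.17 s/day. Longer gates reduce the one-tick quantization error of the count.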

Option 3 — Nonvolatile storage for calibration tables / metadata (CRC + version)

Use FRAM when frequent updates or “always-on logs” would wear out EEPROM; use EEPROM when writes are infrequent and power is tight.

MB85RC256V — 256 Kbit I²C FRAM (Fujitsu).
FM24CL64B-GTR — 64 Kbit I²C FRAM (Infineon).
24LC256 — 256 Kbit I²C EEPROM (Microchip; family includes 24AA256 / 24FC256 variants).

Support silicon (power) — concrete LDO examples

TPS7A0220PDQNR — nanopower-IQ LDO (TI).
MCP1700T-3302E/TT — low-IQ LDO 3.3 V option (Microchip).
ADP150AUJZ-3.3-R7 — ultra-low-noise LDO 3.3 V option (Analog Devices).

Selection rule: backup-domain current dominates holdover; main-domain noise can dominate short-term correction stability. Use two rails/domains when needed.

C) Architecture mapping (A/B/C) — the minimum decision set

Need the stable reference directly for a timer/counter? → prefer A) Ref-in.
RTC exposes fine trim/offset controls? → prefer B) Measure-and-trim.
MCU can stay active and enforce policy? → prefer C) Soft discipline loop.
Monotonic event logs required? → constrain to slew/trim-first behavior; avoid uncontrolled step events.

D) Risk notes (what breaks drift claims in real deployments)

Thermal gradients & airflow changes

A “ppm-grade” device can still drift if the temperature proxy does not track the oscillator neighborhood. Validate with airflow/load perturbations, not only steady soaks.

Measurement artifacts (I²C delay ≠ drift)

Short observation windows, irregular sampling, or torn reads can look like frequency error. Always log bus retries and read method alongside time error.

Field drift anomalies must be diagnosable

Store calibration version + CRC and correction history (offset/temp/trim/reason). When discipline fails, fall back to a safe RTC mode rather than producing silent time regressions.

Diagram: Selection flow (need → constraints → architecture → parts)

Keep the decision tree small: requirements → architecture A/B/C → feature picks → part numbers → measurable acceptance.

Reference examples (part numbers only; verify suffix/package/grade)

Compensated RTC: DS3231SN#, DS3232S#, DS3234S#, DS3231M
TCXO reference: SiT5356AI-FQ-33E0-10.000000X, ASTX-H11-25.000MHZ-T, ASTX-H11-27.000MHZ-T
FRAM: MB85RC256V, FM24CL64B-GTR
EEPROM: 24LC256 (family: 24AA256 / 24FC256)
LDO examples: TPS7A0220PDQNR, MCP1700T-3302E/TT, ADP150AUJZ-3.3-R7

FAQs (TCXO-Based RTC)

Short, actionable troubleshooting. Each answer follows a fixed 4-line structure: Likely cause / Quick check / Fix / Pass criteria.

Why does drift look worse after enabling discipline?

Likely cause:

The loop is chasing measurement noise (short window/torn reads) or using a temperature proxy that does not track the oscillator neighborhood (gradient/lag).

Quick check:

Plot trim_code and time_error vs time; if trim toggles frequently while time_error variance rises, noise dominates. Also log i2c_retry_count and torn_read_count.

Fix:

Increase averaging window, add deadband/hysteresis, slow the update cadence, and enforce “trim-first / slew-only” during steady state.

Pass criteria:

trim_step_rate < K/hour and 24h_worst_case_time_error < X seconds under profile Y (budget-defined), with lower variance than “discipline off”.

Why does time jump backward occasionally even if RTC is running?

Likely cause:

A step correction is applied without a monotonic guard, or restore logic loads a stale saved timestamp/table and overwrites a newer time.

Quick check:

Log monotonic_violation_count, corr_mode (step/slew/trim), and restore_phase. Correlate backward jumps with correction events or restore entry.

Fix:

Enforce monotonic time: clamp or slew-only during runtime; allow step only at boot with explicit “time-was-untrusted” flag. Validate saved-table version/CRC before applying.

Pass criteria:

monotonic_violation_count = 0 over N days of stress runs (including power cycles), and no negative time deltas in logs.

My I²C reads sometimes show “mixed” seconds/minutes—what’s the first fix?

Likely cause:

Torn reads across a rollover boundary (seconds increments between reading different registers), or missing the device’s snapshot/latch mechanism.

Quick check:

Read time twice back-to-back; if values differ in an impossible way, torn reads are present. Track torn_read_count and burst_read_used.

Fix:

Use device snapshot/latch if available; otherwise use burst read + double-read-verify (accept only identical reads), and avoid reads near known rollover windows.

Pass criteria:

torn_read_count = 0 over N consecutive reads (N sized for the product’s lifetime logging cadence), with i2c_retry_rate bounded.

Why does calibration work at room temp but fail after a thermal cycle?

Likely cause:

The calibration table is overfit to one condition, or thermal gradients/sensor lag shift the effective temperature seen by the oscillator after cycling.

Quick check:

Compare time_error vs temp_proxy correlation before/after cycle; log temp_proxy_offset and lag_tau (estimated). Check cal_version and CRC on restore.

Fix:

Use multi-point calibration across representative temperature points (with soak time), constrain the model (no over-parameterization), and validate across at least two independent thermal runs.

Pass criteria:

After cycle, 24h_worst_case_time_error remains < X seconds under profile Y, and two repeated runs agree within margin M (repeatability gate).
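A constrained multi-point model can be as simple as piecewise-linear interpolation between soaked calibration points, holding the end values outside the calibrated range instead of extrapolating. The sketch below assumes points are (temperature in °C, trim code) pairs sorted by temperature; the function name is hypothetical.

```python
from bisect import bisect_right


def trim_for_temperature(points, temp_c):
    """Interpolate a trim code from sorted (temp_c, trim_code) points.

    Outside the calibrated range the nearest end point is held, which keeps
    the model from extrapolating into conditions it was never fitted to.
    """
    temps = [t for t, _ in points]
    if temp_c <= temps[0]:
        return points[0][1]
    if temp_c >= temps[-1]:
        return points[-1][1]
    i = bisect_right(temps, temp_c)
    (t0, c0), (t1, c1) = points[i - 1], points[i]
    return round(c0 + (c1 - c0) * (temp_c - t0) / (t1 - t0))
```

With three points at -20, 25, and 70 °C, the model has no free parameters beyond the measured values, so a second independent thermal run tests the hardware and the proxy rather than the fit.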

How do I choose the update interval (too fast vs too slow)?

Likely cause:

Too fast: noise and bus jitter dominate, creating correction chatter. Too slow: the loop cannot track temperature drift and gradients.

Quick check:

Sweep update_period and compare time_error variance, trim_step_rate, and correction_event_rate_per_day under the same thermal profile.

Fix:

Choose an interval tied to the thermal time constant (avoid high-frequency chasing), add EWMA/low-pass on error estimates, and introduce a deadband around zero error.

Pass criteria:

trim_step_rate bounded, correction_event_rate_per_day < K/day, and 24h_worst_case_time_error meets the drift budget X under profile Y.
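The combination of EWMA filtering and a deadband can be sketched as a one-LSB-per-update trim loop. The sign convention (positive error means the clock is ahead, and a lower trim code slows it) is an assumption for this example and must be checked against the actual device; the class and attribute names are hypothetical.

```python
class DeadbandTrimLoop:
    """EWMA-filtered error with a deadband; moves trim at most 1 LSB per update."""

    def __init__(self, alpha, deadband_s, trim_min, trim_max):
        self.alpha = alpha              # EWMA weight, 0 < alpha <= 1
        self.deadband_s = deadband_s    # no action while |filtered error| is inside
        self.trim_min = trim_min
        self.trim_max = trim_max
        self.filtered_error_s = 0.0
        self.trim_code = 0
        self.trim_steps = 0             # feeds the trim_step_rate metric

    def update(self, time_error_s):
        """Feed one error measurement; return the trim code to apply."""
        self.filtered_error_s += self.alpha * (time_error_s - self.filtered_error_s)
        if abs(self.filtered_error_s) <= self.deadband_s:
            return self.trim_code
        # Assumed convention: clock ahead (positive error) needs a lower code.
        step = -1 if self.filtered_error_s > 0 else 1
        new_code = max(self.trim_min, min(self.trim_max, self.trim_code + step))
        if new_code != self.trim_code:
            self.trim_code = new_code
            self.trim_steps += 1
        return self.trim_code
```

Calling update() once per update_period ties the loop rate to the chosen interval; measurement noise smaller than the deadband then produces no trim activity at all.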

Why does trim saturate at min/max and never recover?

Likely cause:

Wrong sign/units in the correction, insufficient trim range for the true error, or a persistent temperature proxy mismatch (gradient/lag) that biases the estimator.

Quick check:

Log trim_code, time_error slope (seconds/day), and the sign of “error→trim” mapping. Verify whether error improves when trim moves away from saturation.

Fix:

Re-zero the controller (offset), clamp integral growth, add deadband, and validate trim direction with two-point perturbation. If range is insufficient, switch to a different architecture (trim + slew policy).

Pass criteria:

trim_saturation_pct < P% over 24h and time_error slope converges inside the budget (X seconds/day equivalent).

How can I tell thermal gradient vs true oscillator drift?

Likely cause:

Gradients change the oscillator neighborhood temperature without changing the measured proxy; “true drift” stays consistent for the same neighborhood temperature.

Quick check:

Hold the same chamber setpoint and toggle airflow/load; if time_error shifts while temp_proxy stays similar, gradients dominate. Track Δtime_error(airflow).

Fix:

Improve sensor placement near the oscillator region, manage thermal paths (isolation/heat spreading), and slow corrections to match the measured thermal lag.

Pass criteria:

Under defined airflow/load toggles, Δtime_error < X seconds/day-equivalent, and the residual error correlates with the chosen temperature proxy within tolerance.

What’s the minimum data I should log to debug field drift?

Likely cause:

Field drift is often “un-debuggable” because the correction context (temperature, trim, bus health, restore path) is missing.

Quick check:

Ensure each correction event logs a compact record with consistent timestamps (monotonic + wall time if available).

Fix:

Log at minimum: time_error, freq_error_est, temp_proxy, trim_code, update_period, corr_mode, i2c_retry_count, torn_read_count, power_state, restore_phase, cal_version, CRC, plus a reason_code.

Pass criteria:

For any anomaly window, logs reconstruct (1) correction actions, (2) bus health, (3) temperature context, and (4) calibration versioning with no gaps larger than T minutes.
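One possible shape for the compact record, plus a check for logging gaps larger than T. The field names follow the list above; the types, defaults, the mono_s/wall_s timestamp names, and the cal_crc spelling are assumptions made for this sketch.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class CorrectionRecord:
    mono_s: float                     # monotonic timestamp, seconds
    wall_s: Optional[float] = None    # wall time, if available
    time_error: float = 0.0
    freq_error_est: float = 0.0
    temp_proxy: float = 0.0
    trim_code: int = 0
    update_period: float = 0.0
    corr_mode: str = "trim"           # step / slew / trim
    i2c_retry_count: int = 0
    torn_read_count: int = 0
    power_state: str = "main"
    restore_phase: str = "none"
    cal_version: int = 0
    cal_crc: int = 0
    reason_code: str = ""


def gaps_longer_than(records, max_gap_s):
    """Return (start, end) monotonic timestamps of every gap over the limit."""
    return [(a.mono_s, b.mono_s)
            for a, b in zip(records, records[1:])
            if b.mono_s - a.mono_s > max_gap_s]
```

asdict(record) gives a plain dictionary that can be serialized in whatever compact format the product already uses.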

Why does backup mode lose the “disciplined” benefit after long off time?

Likely cause:

Calibration/context is not retained (or fails CRC), backup-domain conditions differ (temperature/voltage), or restore does not re-enter discipline in a controlled “coarse then fine” sequence.

Quick check:

After restore, verify cal_version/CRC, compare early time_error slope to pre-off baseline, and check whether the loop starts with stale trim.

Fix:

Rate-limit writes, persist only stable estimates, and on restore run a deterministic sequence: validate table → apply coarse alignment (if allowed) → enable fine trim/slew with bounded gains.

Pass criteria:

After an off-time of T hours, recovery_time < R minutes to return inside the drift budget, and CRC_fail_count = 0.
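The deterministic restore sequence can be written down as a small planning function, which also makes the restore_phase log values explicit. The phase names and the three input flags are hypothetical; a real implementation would attach the actual validation and alignment work to each phase.

```python
def restore_plan(table_valid, time_trusted, step_allowed):
    """Return the ordered restore phases for the given conditions."""
    phases = ["validate_table"]
    if not table_valid:
        # Bad version or CRC: never apply the table, run as a plain RTC.
        phases.append("fallback_plain_rtc")
        return phases
    if not time_trusted and step_allowed:
        phases.append("coarse_alignment")   # the only place a step may occur
    phases.append("fine_trim_slew")         # bounded-gain discipline resumes
    return phases
```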

Is it better to correct frequency (trim) or correct time (slew/step)?

Likely cause:

Trim targets long-term rate error; slew/step targets phase error. The wrong choice breaks monotonic logs or creates visible discontinuities.

Quick check:

Identify constraints: allowed_step (yes/no), max_slew_rate (ppm or ms/s), and how errors appear (rate vs phase). Log corr_mode decisions.

Fix:

Use trim-first for steady-state drift; use slew to remove residual phase error without time going backward; reserve step for controlled boot-time alignment with explicit flags.

Pass criteria:

monotonic_violation_count = 0, slew_rate ≤ S (policy-defined), and 24h_worst_case_time_error meets X under profile Y.
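For the phase-error part of this policy, the sketch below picks step only for boot-time alignment of untrusted time and otherwise computes how long a slew at the maximum allowed rate would take. Rate error is assumed to be handled separately by trim; the function name and return shape are illustrative.

```python
def choose_phase_correction(phase_error_s, at_boot, time_untrusted, max_slew_ppm):
    """Return (mode, duration_s) for removing a phase error.

    Step is reserved for boot with untrusted time. Otherwise the error is
    slewed out at max_slew_ppm, which takes |error| * 1e6 / max_slew_ppm
    seconds and never makes time run backward while the rate stays below 1e6 ppm.
    """
    if at_boot and time_untrusted:
        return ("step", 0.0)
    duration_s = abs(phase_error_s) * 1e6 / max_slew_ppm
    return ("slew", duration_s)
```

As a worked example, removing a 0.5 s phase error at a 100 ppm slew limit takes 5000 s, a little under 84 minutes.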

Why does storing calibration more often reduce accuracy?

Likely cause:

The system commits noisy estimates (short windows), writes during unstable thermal periods, or creates write-correlated artifacts (power/bus contention), degrading repeatability.

Quick check:

Correlate time_error variance with write_event timestamps; compare “store often” vs “store on stability” runs under the same profile.

Fix:

Store only when stable (soak + low variance), decimate updates (min interval), use median/trimmed-mean estimates, and add version/CRC so invalid tables never apply.

Pass criteria:

write_event_rate ≤ W/day, post_store_error_delta does not increase beyond margin M, and CRC_fail_count = 0.
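A "store on stability" gate might look like the sketch below: it enforces a minimum interval between writes, rejects noisy windows, and commits the median of the window rather than the latest sample. The function name, the minimum window size of three samples, and the use of population standard deviation as the noise measure are assumptions.

```python
import statistics


def estimate_to_store(window_ppm, now_s, last_store_s, min_interval_s, max_stdev_ppm):
    """Return the value to commit to NVM, or None if no write should happen."""
    if now_s - last_store_s < min_interval_s:
        return None                      # decimate: too soon since last write
    if len(window_ppm) < 3:
        return None                      # not enough samples to judge stability
    if statistics.pstdev(window_ppm) > max_stdev_ppm:
        return None                      # window too noisy (e.g. thermal transient)
    return statistics.median(window_ppm)
```

Each non-None result would be written together with the version and CRC metadata, so a rejected or interrupted write never replaces a valid table.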

How do I set pass criteria without a perfect external time reference?

Likely cause:

Absolute time is hard to prove without a trusted reference; however, drift improvement can be verified using repeatability, bounded-error budgets, and independent cross-checks.

Quick check:

Define X/Y/T (error bound / thermal profile / duration), then run at least two repeat tests and compare 24h_worst_case_time_error and repeatability_delta.

Fix:

Use a lab counter/known-good timebase when available; otherwise use “two independent references” (e.g., a stable counter source + repeated profile) and require consistency across runs rather than single-shot truth.

Pass criteria:

24h_worst_case_time_error < X seconds under profile Y, and repeatability_delta < M across ≥2 runs; measurement method and uncertainty are logged.
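The repeatability gate can be computed directly from the per-run error logs. In this sketch, repeatability_delta is taken as the spread between the worst-case errors of the runs, which is one reasonable definition and an assumption of this example; the function names are hypothetical.

```python
def worst_case_error(errors_s):
    """Largest absolute time error seen in one run."""
    return max(abs(e) for e in errors_s)


def acceptance(runs, max_error_s, max_repeat_delta_s):
    """runs: list of per-run time_error series (seconds) under the same profile.

    Returns (passed, worst_per_run, repeatability_delta).
    """
    worst = [worst_case_error(r) for r in runs]
    repeatability_delta = max(worst) - min(worst)
    passed = (len(runs) >= 2
              and max(worst) < max_error_s
              and repeatability_delta < max_repeat_delta_s)
    return passed, worst, repeatability_delta
```

The measurement method and its uncertainty still need to be logged with the result, since the numbers are only comparable across runs that used the same method.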

Data-structured answer format (fixed)

Likely cause → Quick check (signals/log fields) → Fix (1–2 actions) → Pass criteria (measurable thresholds: X/Y/K/M/N/T/P/W placeholders).
