
TCXO-Based RTC for Low-Drift I2C Timekeeping


A TCXO-based RTC improves real-world timekeeping by disciplining a 32 kHz RTC with a stable TCXO reference and a measurable correction loop. It turns drift control into an engineering workflow (error budget, thermal strategy, firmware guards, validation gates) so that accuracy is provable and maintainable in production.

What is a TCXO-Based RTC (and when you need it)

A TCXO-based RTC keeps time in the usual low-power RTC domain (often a 32 kHz timebase), but uses a temperature-stable reference (a TCXO, or TCXO-referenced measurements) to discipline the RTC: measure drift, apply trim or corrections, and keep long-term error close to TCXO-grade behavior.

Definition in one line

RTC timekeeping remains in the always-on clock domain, while accuracy is enforced by TCXO-grade stability via periodic discipline (frequency trim and/or time correction).

What improves (quantified, in practical units)

  • Drift is best reasoned about in seconds per day, not just ppm. (Rule of thumb: 1 ppm = 0.0864 s/day.)
  • Typical goal: move from tens of ppm down to a few ppm or better, with the exact target set by the system budget (temperature profile, holdover duration, logging requirements).
  • The gain comes from observability + correction: drift is measured against a stable reference and corrected before it accumulates.

When it is worth it (trigger checklist)

No GNSS / weak signal / offline device
Time-stamps must remain trustworthy over long deployments.
Long holdover / sleep intervals
Error must stay bounded across hours/days without an external time source.
Production consistency requirements
Tighter distribution across units after temperature cycling and aging.
Harsh temperature dynamics
Outdoor/industrial/vehicle thermal gradients dominate real-world drift.

Out of scope (to avoid topic overlap)

  • TCXO internal physics and phase-noise deep dives → see TCXO.
  • Generic RTC fundamentals (calendar/alarms/basic 32 kHz crystal design) → see RTC.
  • Backup switchover circuitry, supercap/coin-cell leakage troubleshooting → see RTC Backup & Switchover.
  • Network time distribution/discipline (PTP/SyncE/GPSDO) → see Timing & Synchronization.
Figure: Problem versus fix. A crystal RTC (32 kHz) drifts in seconds/day with temperature and aging; a TCXO-disciplined RTC keeps the same I²C timekeeping and tightens seconds/day drift through discipline.

System architectures: 3 ways to discipline an RTC with TCXO

TCXO-based timekeeping can be built in three practical ways. The best choice depends on power budget, required observability, temperature dynamics, and production calibration constraints. The goal of this section is to map architectures clearly; loop theory and deep PLL/cleaner behavior are intentionally excluded.

Key decision dimensions (fast scan)

Power budget
A: Medium–High B: Low–Medium C: Low–Medium
Hardware complexity
A: Medium B: Low C: Low
Observability (can drift be measured?)
A: Strong B: Strong C: Depends
Production calibration friendliness
A: Medium B: Strong C: Strong

Labels A/B/C correspond to the architecture cards below. Values are directional; final choice must be validated against the drift budget and temperature profile.

A) Ref-in: TCXO provides a stable reference

What it is
A TCXO-grade clock is available in the system and acts as the stable reference for timebase generation or calibration, while the RTC interface remains I²C-based.
Minimum requirements
  • A stable TCXO output available to the timekeeping logic (directly or via measurement).
  • Defined behavior for startup/warm-up and reference validity detection.
Failure modes to watch
  • Calibration performed before the TCXO output is thermally settled (initial minutes dominate error).
  • Reference path contamination (power/ground noise) misleads the discipline loop.

B) Measure-and-trim: measure RTC drift, write trim

What it is
The RTC’s effective 32 kHz rate is measured against a TCXO-grade reference over a defined window, then corrected using the RTC’s digital calibration/trim mechanism.
Minimum requirements
  • A way to measure frequency or time error over a stable interval (counter/timer).
  • RTC supports trim granularity that can actually move the drift distribution.
Failure modes to watch
  • Too-short measurement window (noise dominates) or too-coarse trim step (no convergence).
  • Trim write timing near rollovers causes occasional apparent “time jumps”.
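
The measure-and-trim step above can be sketched in a few lines. This is a minimal illustration, not a device driver: the function names, the 0.5 ppm trim step, the ±127 code range, and the convention that a positive code speeds the RTC up are all assumptions made for the example; a real RTC's trim register must be taken from its datasheet.

```python
def measured_error_ppm(rtc_counts: int, ref_counts: int,
                       rtc_nominal_hz: float, ref_nominal_hz: float) -> float:
    """RTC frequency error in ppm, treating the reference as exact.

    Both counters are assumed to be gated over the same interval.
    """
    gate_s = ref_counts / ref_nominal_hz     # gate length according to the reference
    rtc_hz = rtc_counts / gate_s             # observed RTC rate
    return (rtc_hz - rtc_nominal_hz) / rtc_nominal_hz * 1e6


def choose_trim(error_ppm: float, step_ppm: float, code_min: int, code_max: int):
    """Pick the trim code closest to cancelling error_ppm.

    Assumed convention: each positive code speeds the RTC up by step_ppm.
    Returns (code, residual_ppm, saturated).
    """
    ideal = round(-error_ppm / step_ppm)
    code = max(code_min, min(code_max, ideal))
    residual_ppm = error_ppm + code * step_ppm
    return code, residual_ppm, code != ideal


# Illustrative run: 10 MHz reference, 100 s gate, RTC running slightly fast.
err = measured_error_ppm(3_276_866, 1_000_000_000, 32_768.0, 10e6)
code, residual, saturated = choose_trim(err, step_ppm=0.5, code_min=-127, code_max=127)
print(f"error {err:.2f} ppm -> code {code}, residual {residual:.2f} ppm, saturated={saturated}")
```

With a 32768 Hz timebase, a one-count ambiguity over a 100 s gate is roughly 0.3 ppm, which is why the gate has to be long enough for counting quantization to sit below the trim step. The `saturated` flag is returned so that a code pinned at the register limit can be logged rather than silently accepted.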

C) Soft discipline loop: firmware corrects time error (slew/step)

What it is
Firmware periodically estimates time error using a TCXO-grade reference, then corrects time in controlled steps (step) or gradually (slew) while enforcing monotonic behavior for logs.
Minimum requirements
  • Reliable time-error estimator (reference + stable logging cadence).
  • A correction policy that keeps application time monotonic when required.
Failure modes to watch
  • Over-correction creates visible time wobble (frequent steps) instead of reducing long-term drift.
  • State loss across power events breaks convergence until re-initialized properly.

Out of scope: PLL/clock-cleaner loop theory and deep jitter transfer. For those topics, use PLL and Jitter Attenuators / Clock Cleaners.

Figure: Three-lane architecture map (A/B/C) inside the always-on domain. Lane A (ref-in): TCXO, MCU/timer, RTC. Lane B (measure-and-trim): TCXO, counter, RTC trim register. Lane C (soft discipline loop): TCXO, firmware, RTC slew/step. All three keep the same I²C timekeeping and differ only in the discipline mechanism.

Error budget: from ppm to seconds/day (what dominates in reality)

Drift discussions become actionable only after choosing one consistent unit and an error budget structure. This section defines the conversion from ppm to seconds/day, then breaks total error into components that can be measured, controlled, and validated.

Conversion (fixed units for specs and acceptance)

ppm → seconds/day

seconds/day = ppm × 86400 / 1,000,000
Rule of thumb: 1 ppm ≈ 0.0864 s/day

ppm → seconds over holdover

seconds = ppm × holdover_seconds / 1,000,000
Example placeholder: X ppm × Y hours → Z seconds
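
As a worked instance of the two conversions above, the sketch below plugs in illustrative numbers (a 2 ppm residual over a 72-hour holdover). These values are assumptions chosen for the example, not targets from this page, and the function names are hypothetical.

```python
SECONDS_PER_DAY = 86_400


def ppm_to_seconds_per_day(ppm: float) -> float:
    """Time error accumulated in one day by a constant frequency offset."""
    return ppm * SECONDS_PER_DAY / 1_000_000


def ppm_to_holdover_seconds(ppm: float, holdover_seconds: float) -> float:
    """Time error accumulated over an arbitrary holdover interval."""
    return ppm * holdover_seconds / 1_000_000


# Illustrative values only: a 2 ppm residual held for 72 hours.
per_day = ppm_to_seconds_per_day(2.0)                     # about 0.17 s/day
over_holdover = ppm_to_holdover_seconds(2.0, 72 * 3600)   # about 0.52 s
print(f"{per_day:.4f} s/day, {over_holdover:.4f} s over 72 h")
```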

Acceptance criteria should always specify the temperature profile and time window; ppm numbers without context are rarely predictive in real systems.

Error terms (structured for engineering control)

Reference-limited
  • TCXO stability (temperature behavior and aging).
  • Reference validity during startup and warm-up.
Timebase & quantization-limited
  • RTC divider/counting granularity and trim step size.
  • Calibration quantization (cannot correct smaller than one step).
System-limited (often dominant in reality)
  • Thermal gradients (sensor location ≠ TCXO/RTC thermal domain).
  • Measurement window too short (noise dominates drift estimate).
  • Write/read timing (rollover, shadow registers, I²C latency).
  • Update strategy (step/slew policy causes apparent “drift”).

Out of scope: deep TCXO datasheet interpretation (phase-noise, internal compensation modeling). Use TCXO for device-level details.

“Expected root cause” vs “dominant real-world cause”

Upgrading TCXO does not improve drift
Likely dominant: board thermal gradient or wrong temperature proxy.
Quick check: log time_error and temperature during a chamber sweep and evaluate correlation and lag.
Short test looks good, long test is poor
Likely dominant: window too short or update too aggressive.
Quick check: increase measurement window by N× and verify whether the estimate converges or still wanders.
Occasional jump / mixed fields when reading time
Likely dominant: I²C atomicity (read across a rollover) or incorrect write timing.
Quick check: use a burst/latch read mode or double-read verify around second boundaries.

Acceptance template (reusable pass criteria)

  • Temperature profile: T(t) explicitly defined (step, ramp, cycle; include dwell time).
  • Primary metric: max |time_error| over the window (e.g., 24 h) < X seconds (system budget).
  • Secondary metric: error rate (seconds/day) bounded under the same profile.
  • Behavioral constraint: no prohibited time back-steps for logging/audit use-cases.
  • Health constraint: trim/correction does not remain saturated (otherwise the model/thermal proxy is wrong).
Units: seconds/day · Must specify: T(t) + window · Log: time_error + temp + trim
Figure: Error budget stack (total seconds/day). Contributors: TCXO, RTC divider, quantization, thermal gradient, I²C read/write, update strategy. Often dominant in reality: thermal gradient, I²C atomicity, update policy. Use seconds/day for budgets and specify the temperature profile and measurement window.

Disciplining algorithms: open-loop trim vs closed-loop control

Discipline is an engineering policy: correct frequency (trim) or correct time (step/slew), while controlling noise sensitivity, thermal tracking, and time behavior constraints (especially monotonic logs). This section maps practical options without diving into network time algorithms.

Open-loop: temperature/LUT → trim

Input
Temperature proxy and calibration table (single-point / multi-point / chamber-derived LUT).
Output
RTC trim value (frequency correction). Behavior is stable and predictable when the thermal proxy matches the oscillator’s thermal domain.
Common failure
LUT becomes wrong under real airflow/heat sources due to thermal gradients and sensor mismatch.
Frequency-first · Low complexity
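
A minimal sketch of the open-loop path described above, assuming the calibration table is a short sorted list of (temperature, trim) pairs and that linear interpolation between points is acceptable. The table values and the name `lut_trim` are made up for illustration; holding the end-point value outside the calibrated range is one possible policy, not a recommendation from this page.

```python
from bisect import bisect_right


def lut_trim(temp_c, lut):
    """Interpolate a trim value (ppm) from a sorted list of (temp_c, trim_ppm).

    Outside the calibrated range the nearest end point is held, so the
    output never extrapolates beyond what was measured.
    """
    temps = [t for t, _ in lut]
    if temp_c <= temps[0]:
        return lut[0][1]
    if temp_c >= temps[-1]:
        return lut[-1][1]
    i = bisect_right(temps, temp_c)          # first table entry above temp_c
    (t0, v0), (t1, v1) = lut[i - 1], lut[i]
    return v0 + (v1 - v0) * (temp_c - t0) / (t1 - t0)


# Hypothetical three-point chamber table: (degrees C, trim in ppm).
table = [(-20.0, 8.0), (25.0, 0.0), (70.0, 10.0)]
print(lut_trim(2.5, table))
```

The function only maps a proxy temperature to a trim value; whether that proxy represents the oscillator's thermal domain is the failure mode named above and cannot be fixed in the lookup.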

Closed-loop: error estimate → filtered correction

Error input
Time error or frequency error computed against a stable reference over a defined window.
Key knobs (engineering meaning)
  • Window: longer = less noise, slower reaction.
  • Update interval: too fast = chases noise, too slow = misses thermal drift.
  • Limits: clamp correction rate and handle saturation to preserve stability.
Stability signal
Convergence shows as a shrinking time_error envelope without frequent visible correction events.
Noise vs tracking trade-off · Requires observability
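
The knobs above (window, update interval, limits) can be made concrete with a small simulation. The sketch below is one possible structure, assuming a first-order filter in place of an explicit window and an integrating trim with a per-update clamp and a range limit. The class name, gains, and limits are illustrative assumptions, and the plant is noise-free, so it shows convergence and saturation behavior only.

```python
class DisciplineLoop:
    """First-order filter plus a rate-limited, range-limited trim integrator."""

    def __init__(self, alpha, gain, max_step_ppm, trim_limit_ppm):
        self.alpha = alpha                    # filter weight: smaller acts like a longer window
        self.gain = gain                      # fraction of filtered error removed per update
        self.max_step_ppm = max_step_ppm      # clamp on correction per update
        self.trim_limit_ppm = trim_limit_ppm  # trim range available in hardware
        self.filtered_ppm = 0.0
        self.trim_ppm = 0.0
        self.saturated = False

    def update(self, residual_ppm):
        """residual_ppm: frequency error measured with the current trim applied."""
        self.filtered_ppm += self.alpha * (residual_ppm - self.filtered_ppm)
        step = -self.gain * self.filtered_ppm
        step = max(-self.max_step_ppm, min(self.max_step_ppm, step))
        wanted = self.trim_ppm + step
        self.trim_ppm = max(-self.trim_limit_ppm, min(self.trim_limit_ppm, wanted))
        self.saturated = self.trim_ppm != wanted   # log this; do not ignore it
        return self.trim_ppm


# Noise-free simulation with an assumed +12 ppm uncorrected offset.
loop = DisciplineLoop(alpha=0.5, gain=0.5, max_step_ppm=2.0, trim_limit_ppm=50.0)
offset_ppm = 12.0
for _ in range(60):
    loop.update(offset_ppm + loop.trim_ppm)
print(f"trim {loop.trim_ppm:.3f} ppm, residual {offset_ppm + loop.trim_ppm:.3f} ppm")
```

Shrinking `alpha` trades reaction speed for noise rejection, and `max_step_ppm` bounds how visible any single update can be. If the required trim exceeds `trim_limit_ppm`, the loop pins at the limit and `saturated` stays set, which is the condition the acceptance template treats as a sign that the model or thermal proxy is wrong.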

Out of scope: PTP/SyncE academic algorithms and link delay correction. Use Timing & Synchronization.

Step vs Slew (time behavior policy)

Step
Best for initial alignment after boot or when the error is too large and the application permits a visible correction.
Slew
Best when logs require smooth behavior. Correction is spread over time to avoid sudden jumps.
Monotonic constraint
For audit/event ordering, enforce no back-steps. If negative correction is needed, apply it as a bounded slew or via an offset layer rather than rewriting time backward.
Policy: allowed jump? · Policy: max slew rate
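
One way to realize the offset-layer idea above is to leave the RTC registers alone and apply corrections in a thin software layer that releases them at a bounded rate. The sketch below is a hedged illustration: `MonotonicTime`, its methods, and the 500 ppm slew limit are hypothetical names and values, and the raw RTC reading is modeled as a float number of seconds.

```python
class MonotonicTime:
    """Offset layer over a raw RTC reading (seconds) with bounded slew."""

    def __init__(self, max_slew_ppm: float):
        self.max_slew = max_slew_ppm * 1e-6   # seconds of correction per elapsed second
        self.offset = 0.0                     # correction already applied
        self.pending = 0.0                    # correction still to be slewed in
        self.last_raw = None
        self.last_out = None

    def request_correction(self, delta_s: float) -> None:
        """Queue a correction; negative values slow reported time down."""
        self.pending += delta_s

    def now(self, raw_s: float) -> float:
        if self.last_raw is not None:
            elapsed = max(0.0, raw_s - self.last_raw)
            budget = elapsed * self.max_slew
            applied = max(-budget, min(budget, self.pending))
            self.offset += applied
            self.pending -= applied
        self.last_raw = raw_s
        out = raw_s + self.offset
        if self.last_out is not None and out < self.last_out:
            out = self.last_out               # final guard: never report a back-step
        self.last_out = out
        return out


# Illustrative use: remove 2 s of accumulated error without a backward jump.
clock = MonotonicTime(max_slew_ppm=500.0)
clock.now(0.0)
clock.request_correction(-2.0)
readings = [clock.now(float(t)) for t in range(1, 10_001)]
print(clock.offset, clock.pending)
```

Because the slew rate is far below 1 second per second, reported time keeps advancing while the negative correction is absorbed; at 500 ppm the 2 s correction takes about 4000 s of elapsed time.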

Update interval selection (fast vs slow costs)

  • Choose interval based on thermal time constant: fast thermal dynamics require faster tracking, but only if measurement noise is controlled.
  • Use a window long enough to average measurement noise; shrinking the window should not change the estimated drift direction dramatically.
  • Enforce rate limits: correction per update should be bounded to avoid visible time wobble.
  • Log these fields together to debug real behavior: time_error, temperature, trim/correction, interval.
Too fast → chases noise · Too slow → misses drift · Always clamp & monitor saturation
Figure: Discipline control loop (device-internal). RTC timebase → time error estimator → filter/controller → limiter/clamp → trim (frequency) or slew/step (time), applied back to the RTC under the monotonic policy. Engineerable knobs: window, update interval, clamps, and monotonic policy.

Temperature strategy: TCXO compensation, sensing, and thermal gradients

In TCXO-based timekeeping, temperature is not a single number. It is a field shaped by heat sources, copper planes, airflow, and thermal time constants. If the temperature proxy is wrong, even a TCXO with a tight ppm rating can still produce visible drift at the system level.

TCXO internal compensation ≠ board temperature

What must match

The system temperature proxy must represent the same thermal domain as the TCXO/RTC. Physical distance alone is not sufficient; thermal coupling paths dominate.

Common mismatches
  • Sensor sits in a different airflow or copper plane than the TCXO.
  • Local self-heating changes package temperature without changing “board temp”.
  • Heat sources (SoC/PMIC) shift gradients during workload transitions.
Key risk: wrong proxy · Symptom: “ppm looks good” but drifts

Sensor placement & thermal path management (actionable)

Placement rules
  • Place the sensor in the TCXO/RTC thermal domain.
  • Avoid direct airflow and avoid high-gradient heat plumes.
  • Prefer stable coupling: predictable lag beats random airflow-driven noise.
Thermal path controls
  • Use keepouts / isolation slots to limit heat spreading from SoC/PMIC.
  • Avoid large copper planes that “drag” heat into the TCXO area.
  • Keep TCXO/RTC away from inductors and hot power stages.

Out of scope: oven-based stabilization details. For oven control strategies, use OCXO.

Thermal time constants: fast vs slow changes (control impact)

Fast changes (airflow, door open)

Risk: the proxy temperature moves faster than the oscillator domain. Updating too often can chase noise, creating visible wobble. Use longer windows, slower update, and correction clamps.

Slow changes (enclosure warm-up)

Risk: time error accumulates if the update interval is too slow. Tracking requires an interval that follows the thermal drift, while using windows long enough to keep estimates stable.

Log together: time_error + temp + trim · Knobs: window + interval + clamp

Typical root causes when drift persists

Thermal gradient dominates
Symptom: chamber looks stable; field drifts under airflow/workload changes.
First check: correlate time_error with temperature proxy and measure lag.
Package self-heating
Symptom: drift depends on output enable/load state.
First check: compare drift with identical temperature profile but different operating states.
Airflow-driven proxy noise
Symptom: correction becomes frequent without real improvement.
First check: increase window and reduce update rate; verify whether correction events reduce.
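
The "correlate and measure lag" check above can be approximated offline from logged data. The sketch below scans candidate lags and reports the one where the error rate (the per-sample change in time_error) correlates most strongly with the temperature proxy. The function names are hypothetical, and the data is synthetic with a known 3-sample lag built in, so it demonstrates the mechanics rather than any real board.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def best_lag(temp, error_rate, max_lag):
    """Return (lag_samples, correlation) where error_rate[i] best tracks temp[i - lag]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        r = pearson(temp[:len(temp) - lag], error_rate[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Synthetic log: error rate follows the proxy temperature three samples late.
rng = random.Random(0)
temp = [25.0 + rng.uniform(-5.0, 5.0) for _ in range(200)]
error_rate = [0.04 * temp[max(i - 3, 0)] for i in range(200)]
lag, r = best_lag(temp, error_rate, max_lag=10)
print(lag, round(r, 3))
```

Real temperature logs are smooth, so neighboring lags will score almost as well as the best one; treat the result as an estimate of the thermal lag rather than an exact figure. A weak correlation at every lag suggests the proxy is not in the oscillator's thermal domain.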
Figure: Thermal map and sensor placement (temperature is a field). SoC and PMIC heat sources, the TCXO/RTC area with an isolation slot, airflow, one well-placed sensor and one badly placed sensor. Key effects: gradient, lag, airflow. The proxy must match the TCXO thermal domain.

Hardware design: reference routing, power, and I²C integrity

Hardware details often become “time drift” symptoms. Reference routing, low-noise power domains, and I²C integrity determine whether frequency correction is stable and whether time reads/writes remain atomic and reliable.

Reference path: routing & returns (key points only)

  • Keep the reference path short and keep a clean return nearby.
  • Avoid crossing split planes or noisy return loops that modulate threshold crossings.
  • Maintain separation from fast I/O and switching nodes to reduce injected jitter into measurement.
  • If level/interface options exist (CMOS/LVDS, etc.), treat them as a system choice; this section focuses on layout risk control.
Short + controlled return · Avoid split-plane crossings

Power domains: main / backup / always-on (why it matters)

Low-noise supply

TCXO/RTC should sit on a quiet LDO with clear domain boundaries to prevent digital load steps from modulating timekeeping.

Power sequencing

Avoid “half-powered” states that can corrupt RTC configuration or cause intermittent resets. Ensure backup switchover does not create ambiguous register behavior.

Domain boundaries · Quiet LDO for TCXO/RTC

I²C integrity: pullups, capacitance, and EMI (engineering keys)

  • Pullups set a trade-off: stronger pullups improve edge rate but increase backup-domain power.
  • Bus capacitance (trace length + devices) slows edges and reduces noise margin; intermittent retries can look like time “jumps”.
  • EMI coupling can flip bits or force retries. Use routing separation, clean returns, and robust read/write retry policies.
  • Clock stretching compatibility should be confirmed; mismatches can create silent timeouts and partial reads.
Backup power vs edges · EMI can masquerade as drift

Out of scope: protocol teaching (addressing, arbitration, timing spec walkthroughs). This section focuses on failure modes that produce incorrect time reads/writes.

Shadow registers & atomic read (hardware + system cooperation)

Preferred behavior

Use a latch/snapshot mechanism if supported so that all time fields are read from a consistent captured state.

Fallback behavior

If snapshot is not available, use double-read consistency checks and avoid second-boundary reads; reliable edges and EMI control keep this effective.

Prevent mixed-field reads · I²C reliability required
Figure: Power domains and buses (reliability and low leakage). Main rail: SoC and PMIC, high activity and load steps. Always-on domain: quiet LDO, TCXO, RTC, temperature sensor. Backup rail: coin cell/supercap. SCL/SDA pullups are tied to a selected rail (Vpullup). EMI leads to retries/errors; use atomic read/snapshot.

Firmware timekeeping: read/write pitfalls, logging, and monotonicity

Many “RTC drift” issues are firmware artifacts: non-atomic reads, rollover-edge writes, inconsistent correction policy, or missing traceability. A disciplined RTC stack must make reads consistent, writes safe, and time behavior monotonic where required.

Read time correctly (avoid mixed-field reads)

Preferred
  • Use latch/snapshot read mode if supported.
  • Use burst read for full time fields in one transaction.
Fallback
  • Double-read-verify: read, then re-read and confirm stability.
  • Avoid sampling at second boundaries when snapshot is unavailable.
Goal: consistent fields · Risk: rollover mix
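
The double-read-verify fallback above can be sketched as follows. `read_fields` is a hypothetical callable standing in for one I²C burst read that returns the time fields as a tuple; the attempt limit and the exception type are assumptions for the example.

```python
def read_time_verified(read_fields, max_attempts=4):
    """Return (fields, extra_reads) once two consecutive reads agree.

    read_fields is a hypothetical callable that performs one burst read and
    returns the time fields as a tuple, e.g. (hours, minutes, seconds).
    """
    previous = read_fields()
    for extra_reads in range(1, max_attempts + 1):
        current = read_fields()
        if current == previous:
            return current, extra_reads
        previous = current
    raise RuntimeError("RTC fields did not stabilize; treat as a bus fault")


# Simulated bus: the first read is torn across the 12:59:59 -> 13:00:00 rollover.
fake_reads = iter([(12, 0, 0), (13, 0, 0), (13, 0, 0)])
fields, extra = read_time_verified(lambda: next(fake_reads))
print(fields, extra)
```

Returning the number of extra reads lets the caller log it next to the retry counters, so a rise in mismatches near load transitions is visible in the field data. The bounded attempt count keeps a persistently noisy bus from turning into an endless loop.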

Write/correct safely (avoid rollover-edge writes)

  • Use a defined write window away from rollover edges (seconds/minutes boundary).
  • If the device supports shadow registers, use the recommended “write + commit” flow to avoid partial field updates.
  • Treat I²C errors as first-class events: retries must be bounded; repeated failures should trigger a fault state rather than silent drift.
  • Apply correction in the smallest domain possible (trim first; step only when policy allows).
Safe windowing · Atomic commit

Monotonicity: prevent time back-steps for logs

Policy

If the product requires ordered events, enforce no back-steps. When negative correction is needed, use bounded slew or an offset layer instead of rewriting time backward.

Step vs slew

Step is reserved for explicit re-sync conditions (boot, large error, non-audit mode). Slew is preferred when logs must remain smooth.

Constraint: no back-steps · Slew for audit logs

Correction logging (required for field diagnosis)

Every correction should be traceable. Without a minimal record, thermal-proxy mismatch and I²C faults become indistinguishable from oscillator drift.

Log fields (minimum)
  • timestamp, rtc_time, system_time (or reference time)
  • time_error, estimated drift, trim/correction applied
  • temperature proxy, update interval, window length
  • reason code (boot, periodic, fault-recovery, manual)
Red flags
  • correction saturates for long periods (model/proxy mismatch)
  • frequent corrections without shrinking error envelope (noise chasing)
  • I²C retries spike near load transitions (EMI / return path)

Out of scope: operating system time subsystems. This section covers only RTC discipline behavior and error-proof read/write strategy.

Figure: Firmware timekeeping state machine. Boot → read RTC (snapshot/burst) → validate (double-verify) → log state → apply discipline (trim/slew/step per policy) → periodic update (window + interval), with alarm/fault handling for I²C errors and saturation. Supporting practices: snapshot read, double verify, monotonic policy, log every correction.

Holdover and backup: what happens when main power disappears

Backup mode must preserve continuity and preserve the “discipline state” so that restoration does not introduce a large discontinuity. The design must balance retention, write cadence, and leakage while keeping a clear re-discipline sequence after power returns.

Backup-mode goals (what must survive)

  • Continuous timekeeping (no resets, no ambiguous time state).
  • Retention of calibration parameters (trim/LUT/controller state if required).
  • Predictable restoration: no large time jump, no back-step for monotonic logs.
  • Low leakage discipline: backup budget is dominated by always-on load and bus pullups.
Continuity · Retention · No jump

Parameter storage: RTC RAM vs EEPROM vs FRAM (engineering trade-offs)

RTC RAM

Lowest write cost and fastest access, but retention depends on backup power continuity and device behavior.

EEPROM

Strong retention but has wear and higher write energy. Requires rate limiting and careful commit timing.

FRAM

Low write energy and high endurance, often preferred when frequent parameter updates are required in the field.

Key knob: write cadence · Key risk: wear / leakage

Write cadence (avoid draining backup or wearing storage)

  • Write only on meaningful state changes (model update, temperature regime change, fault recovery).
  • Rate limit periodic commits; prefer aggregated commits with a checksum and a version tag.
  • Do not write at the power-fail edge unless a safe “last-gasp” mechanism is guaranteed.
  • Log commit attempts and failures; a silent failure behaves like unexplained drift on restoration.
Commit with checksum · Avoid edge writes

Restoration flow (coarse first, fine later)

Phase 1: coarse alignment

After power returns, re-establish a trusted baseline quickly (validate RTC, confirm monotonic policy, load retained parameters).

Phase 2: fine discipline

Resume periodic discipline with conservative window/interval until temperature and load settle, then return to normal tracking.

Out of scope: supercap/coin-cell switchover circuits and leakage troubleshooting. Use RTC Backup & Switchover.

Figure: Power-fail timeline (holdover and restoration). Normal (disciplined) → power drop (event) → backup mode (holdover) → restore (boot) → re-discipline. Conceptual signals: VMAIN drop, VBACKUP, and the time_error envelope (disciplined stability, holdover drift, restore + fine). Backup focus: keep time continuous, keep parameters, restore with coarse-then-fine discipline.

Validation & measurement: how to prove drift improvement (bench + chamber)

This section turns “drift improved” into a measurable acceptance claim. The proof requires a trusted reference, consistent sampling, and traceability of the control output (trim/correction). The goal is time-drift validation, not RF purity or phase-noise deep evaluation.

A) What to measure (minimum set)

Measure both the outcome (time error) and the cause (trim/correction trajectory). Without control-output traceability, measurement artifacts can be misread as oscillator drift.

Primary signals
  • time_error: RTC vs reference (s/ms)
  • freq_error: counter or slope estimate
  • temp_proxy: TCXO/board sensor readout
  • trim/correction: code + mode (trim/slew/step)
Metadata (for debugging)
  • sampling interval Δt (actual, not assumed)
  • read method (snapshot/burst/double-verify)
  • I²C retries/timeouts (counts)
  • reason code (boot/periodic/fault-recovery)
Must log trim · Must log Δt

B) Bench methods (choose one primary + one verifier)

1) Counter / frequency gate

Best for freq_error. Use sufficiently long gate time to avoid mistaking short-term noise for drift. Correlate against temp_proxy.

2) I²C trace / logic analysis

Best for proving read/write correctness: snapshot/burst usage, retry storms, and rollover-edge behavior.

3) Long-run logging

Best for acceptance: time_error envelope over 24–72 hours with consistent sampling and full correction traceability.

Practical rule: confidence in the result improves when one method produces the primary metric and a second, independent method verifies read/write integrity.

C) Chamber sweep (make temperature drift reproducible)

  • Define a temperature profile: ramp rate, dwell/soak time, and number of dwell points.
  • Record both chamber setpoint and temp_proxy to expose thermal lag and gradients.
  • Evaluate drift during transients (ramps) as well as at dwell points; ramps often dominate real systems.
  • Keep sampling interval stable; log correction trajectory to separate control behavior from the thermal plant.
Profile + soak · Log lag/gradient

D) Typical pitfalls (false drift)

Too-short window

Short gate or short log span makes noise look like drift. Use longer windows and report confidence bands.

Inconsistent sampling

Assuming a fixed Δt when the system jitters in scheduling distorts slope-based frequency estimates. Always log actual Δt.

Untrusted host time

PC clock corrections can dominate the measured error. Use a trusted reference source or explicitly log host sync events.

I²C delays as “drift”

Variable bus delay/retry changes apparent timestamps. Record retries/timeouts and avoid rollover-edge reads/writes.
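
The inconsistent-sampling pitfall above is easiest to avoid by fitting the drift slope against the logged reference timestamps instead of an assumed fixed Δt. The sketch below uses an ordinary least-squares fit; the function name and the sample data are illustrative assumptions.

```python
def drift_ppm(ref_times_s, time_errors_s):
    """Least-squares slope of time_error versus reference time, in ppm."""
    n = len(ref_times_s)
    if n < 2 or n != len(time_errors_s):
        raise ValueError("need two or more paired samples")
    mean_t = sum(ref_times_s) / n
    mean_e = sum(time_errors_s) / n
    sxx = sum((t - mean_t) ** 2 for t in ref_times_s)
    if sxx == 0:
        raise ValueError("timestamps must not all be equal")
    sxy = sum((t - mean_t) * (e - mean_e)
              for t, e in zip(ref_times_s, time_errors_s))
    return sxy / sxx * 1e6


# Synthetic log: nominal 60 s cadence with scheduling jitter, true drift 3 ppm.
actual_t = [0.0, 61.0, 118.0, 185.0, 240.0, 302.0]
errors = [3e-6 * t + 0.010 for t in actual_t]
assumed_t = [60.0 * i for i in range(len(actual_t))]

print(drift_ppm(actual_t, errors))    # fit against logged timestamps
print(drift_ppm(assumed_t, errors))   # fit against an assumed fixed interval
```

The constant 10 ms term in the synthetic data stands for a fixed read latency; it shifts the intercept but not the slope, which is why a steady bus delay is harmless while a variable one is not.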

E) Pass criteria template (acceptance-ready)

  • Profile: temperature profile Y (ramp + soak defined)
  • Duration: continuous run T (e.g., 24–72 h)
  • Metric: worst-case |time_error|, plus drift rate (optional)
  • Criteria: worst-case error < X seconds, no persistent trim saturation, and no recurring I²C fault storms

Thresholds (X, Y, T) must come from system budget and workload profile. This page provides structure, not fixed numbers.
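
The pass criteria above can be turned into an automated gate over a logged run. The sketch below is one possible shape, assuming each log row carries the time error, the trim code in effect, and the I²C retry count. The field names, the thresholds, and the simplification that any sample above the retry limit counts as a fault storm are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    time_error_s: float   # RTC minus reference, seconds
    trim_code: int        # trim register value in effect
    i2c_retries: int      # retries needed to obtain this sample


def evaluate_run(samples, max_error_s, trim_min, trim_max,
                 max_saturated_samples, max_retries):
    """Return (passed, per-check results, worst-case |time_error|)."""
    worst = max(abs(s.time_error_s) for s in samples)
    longest = run = 0
    for s in samples:
        at_limit = s.trim_code <= trim_min or s.trim_code >= trim_max
        run = run + 1 if at_limit else 0      # length of current saturated streak
        longest = max(longest, run)
    storms = sum(1 for s in samples if s.i2c_retries > max_retries)
    checks = {
        "worst_case_error": worst < max_error_s,
        "no_persistent_saturation": longest <= max_saturated_samples,
        "no_i2c_fault_storms": storms == 0,
    }
    return all(checks.values()), checks, worst


# Illustrative run: small errors, trim mid-range, no retries.
run_log = [Sample(0.01 * i, 5, 0) for i in range(10)]
print(evaluate_run(run_log, max_error_s=0.5, trim_min=-127, trim_max=127,
                   max_saturated_samples=3, max_retries=2))
```

Reporting each check separately, rather than a single pass/fail bit, keeps a run that meets the error limit only through saturated trim from being mistaken for a healthy one.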

Figure: Measurement setup (bench + chamber). DUT board (TCXO, RTC, MCU/firmware with logs + correction) inside an optional chamber, connected to a trusted reference time source, a logger/PC (CSV/DB: time_error + temp + trim), a counter (freq_error), and an I²C sniffer. Logged tags: time_error, freq_error, temp, trim.

Production calibration & field maintenance: what to store, what to alarm

A TCXO-disciplined RTC must be manufacturable and maintainable. This section defines calibration strategies, storage structure, alarm triggers, and safe fallback behavior. Security/audit time signing is out of scope here and should be handled in a dedicated secure-time design.

A) Production calibration strategy (1-point / 2-point / multi-point)

1-point

Lowest cost. Suitable when thermal gradients are controlled and the error model is stable. Verification must still use a defined profile.

2-point

Balanced approach. Captures a dominant slope term without exploding complexity. Often the best default for production.

Multi-point (temperature points)

Used when temperature behavior must be explicitly modeled. Requires guardrails against overfitting and a stronger verification gate.

Overfitting guardrails
  • Use a verification profile different from the calibration points (do not “train on the test”).
  • Reject solutions that rely on persistent trim saturation to meet short tests.
  • Treat sensor/TCXO thermal mismatch as a model risk; more points can amplify the mismatch.

B) What to store (versioned + integrity-checked)

Storage must survive resets and field updates. Use explicit versioning and CRC so corrupted or incompatible data triggers a predictable fallback rather than silent drift.

Calibration record template
  • cal_version + model id
  • CRC (full record integrity)
  • points[]: temperature point(s) + coefficients/trim base
  • valid_range: temperature/profile label
  • write_count + commit sequence number
Versioned · CRC-protected · Predictable fallback
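
A calibration record along the lines of the template above can be packed into a fixed binary layout with a trailing CRC. The layout below (field widths, little-endian order, temperature stored in 0.1 °C units, CRC-32 from `zlib`) is an assumption made for the example, not a format defined by this page; the point is that a bad CRC or unknown version returns `None` so the caller takes the predictable fallback.

```python
import struct
import zlib

# Assumed layout: cal_version, model_id, write_count, sequence, t_min_c, t_max_c, n_points
HEADER = struct.Struct("<HHIIbbB")
POINT = struct.Struct("<hh")     # temperature in 0.1 degC, trim code
CRC = struct.Struct("<I")        # CRC-32 over header + points


def pack_record(cal_version, model_id, write_count, sequence,
                t_min_c, t_max_c, points):
    body = HEADER.pack(cal_version, model_id, write_count, sequence,
                       t_min_c, t_max_c, len(points))
    body += b"".join(POINT.pack(t, code) for t, code in points)
    return body + CRC.pack(zlib.crc32(body))


def unpack_record(blob, supported_version):
    """Return the record as a dict, or None so the caller can fall back."""
    if len(blob) < HEADER.size + CRC.size:
        return None
    body = blob[:-CRC.size]
    (stored_crc,) = CRC.unpack(blob[-CRC.size:])
    if zlib.crc32(body) != stored_crc:
        return None                               # corrupted record
    ver, model, writes, seq, t_min, t_max, n = HEADER.unpack_from(body)
    if ver != supported_version or len(body) != HEADER.size + n * POINT.size:
        return None                               # incompatible or malformed
    points = [POINT.unpack_from(body, HEADER.size + i * POINT.size)
              for i in range(n)]
    return {"cal_version": ver, "model_id": model, "write_count": writes,
            "sequence": seq, "valid_range_c": (t_min, t_max), "points": points}


# Illustrative two-point record.
blob = pack_record(1, 7, 3, 42, -20, 70, [(-200, 12), (250, -3)])
print(unpack_record(blob, supported_version=1))
```

Keeping `write_count` and `sequence` inside the CRC-protected body means a stale or partially written copy can be told apart from the latest valid commit when two storage slots are compared.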

C) Field alarms (detect “discipline is lying”)

1) time_error rate

Trigger when error growth rate exceeds a budgeted threshold. First action: check temperature correlation and correction cadence.

2) trim saturation

Trigger when trim remains near limits for long periods. This often indicates model mismatch or a dominant unmodeled thermal gradient.

3) temperature mismatch

Trigger when temperature changes no longer correlate with expected error response (lag/gradient changed by airflow or load).

Alarm thresholds must be derived from system budget. This page defines signals and failure modes, not fixed numbers.

D) Maintenance actions + safe rollback

  • Re-calibrate when alarms indicate persistent mismatch and the environment profile has changed.
  • Re-verify after firmware updates that change sampling cadence, correction policy, or temperature source.
  • Fallback if discipline becomes unstable: degrade to “plain RTC behavior” while preserving logging and alarms.
  • RMA decision when CRC fails repeatedly, trim saturates across conditions, or I²C reliability collapses.
Out of scope

Secure audit / signed time should be handled in the Secure RTC page (avoid mixing production drift maintenance with security architecture).

Figure: Factory-to-field lifecycle (calibration + maintenance). Calibrate (factory) → verify (profile gate) → store (version + CRC) → deploy (field) → monitor (error_rate / saturation / mismatch), leading to re-calibrate + re-verify, an RMA decision, or fallback to plain RTC mode. Store: version + CRC + wear policy. Alarms drive actions.

Engineering checklist (bring-up → validation → shipment)

This checklist converts “TCXO-based RTC discipline” into verifiable actions. Each item defines what to check, what to log, and what counts as pass, so results remain reproducible across bench, chamber, and production.

A) Bring-up — make the time path trustworthy first

1) I²C read consistency (no torn reads)
  • Check: reading time/date never returns mixed fields across a rollover boundary.
  • How: snapshot/latch mode, burst read, or double-read-verify (two identical reads required).
  • Log: read method, retry count, torn-read counter, bus error codes.
  • Pass: 0 torn reads in long-run sampling; retry rate under a defined threshold (project-specific).
2) Calibration register sanity (read/write/retain)
  • Check: trim/offset registers are writable and read back identically.
  • How: write → readback → soft reset → verify expected persistence behavior.
  • Log: written code, readback code, reset context, CRC/metadata if stored externally.
  • Pass: readback matches; post-reset behavior matches datasheet expectations.
3) Trim effectiveness (direction + stability)
  • Check: changing trim produces a consistent change in measured frequency/time error.
  • How: apply two trim codes; measure error sign and response repeatability.
  • Log: trim code vs time error trend; “trim activity” (how often changes occur).
  • Pass: monotonic direction; no high-frequency toggling that indicates chasing noise.
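The double-read-verify method from item 1 can be sketched as follows. This is a minimal illustration, not a driver: `read_burst` is a hypothetical callable standing in for the real I²C burst read of the time registers, and the retry limit is an arbitrary placeholder.

```python
from typing import Callable, Tuple

TimeFields = Tuple[int, int, int]  # (hours, minutes, seconds), already decoded


def read_time_verified(read_burst: Callable[[], TimeFields],
                       max_attempts: int = 4) -> Tuple[TimeFields, int]:
    """Return (time, mismatch_count) once two consecutive burst reads agree.

    A mismatch may be a torn read or simply a tick between the two reads;
    both cases are rejected and counted so the retry rate can be logged.
    """
    mismatches = 0
    previous = read_burst()
    for _ in range(max_attempts):
        current = read_burst()
        if current == previous:
            return current, mismatches
        mismatches += 1
        previous = current
    raise IOError("time read did not stabilize; log as a bus error")


# Simulated bus: the first read is torn across an hour rollover.
samples = iter([(10, 59, 0), (11, 0, 0), (11, 0, 0)])
time_now, mismatch_count = read_time_verified(lambda: next(samples))
```

In this simulation the torn value is discarded and the mismatch counter ends at 1, which is the kind of figure the "Log" bullet asks for.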

B) Thermal — verify the temperature proxy and gradients

1) Chamber profile readiness
  • Check: ramp/soak steps are defined and repeatable.
  • Log: profile ID, setpoint, measured proxy temperature, time stamps.
  • Pass: repeated runs stay within the same thermal envelope (project-specific tolerance).
2) Sensor coherence (offset + lag)
  • Check: sensor reading represents the TCXO/RTC neighborhood (not a distant heat source).
  • Log: sensor vs board reference temperature; estimated time constant (lag).
  • Pass: offset/lag is stable enough to be compensated or bounded in the error budget.
3) Gradient sensitivity (airflow/load)
  • Check: time error does not jump when airflow/load conditions change.
  • Log: labels for airflow/load modes + time error trajectory.
  • Pass: delta stays within budget; otherwise enforce a more conservative update cadence.
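The "estimated time constant (lag)" in item 2 can be derived from a chamber setpoint step. The sketch below assumes a first-order sensor response, so the lag is the time needed to cover 63.2 % of the step; the function name and the synthetic data are illustrative only.

```python
import math


def estimate_lag_tau_s(times_s, temps_c) -> float:
    """Time to reach 63.2 % of a temperature step (first-order assumption).

    The first sample is taken as the pre-step value and the last sample as
    the settled value, so the record must cover the full soak.
    """
    start, final = temps_c[0], temps_c[-1]
    target = start + 0.632 * (final - start)
    for i in range(1, len(temps_c)):
        y0, y1 = temps_c[i - 1], temps_c[i]
        if y0 != y1 and min(y0, y1) <= target <= max(y0, y1):
            # Linear interpolation between the two bracketing samples.
            frac = (target - y0) / (y1 - y0)
            return times_s[i - 1] + frac * (times_s[i] - times_s[i - 1]) - times_s[0]
    raise ValueError("no crossing found; record a longer soak")


# Synthetic step from 25 C to 65 C with a true time constant of 120 s.
times = [10.0 * k for k in range(121)]
temps = [25.0 + 40.0 * (1.0 - math.exp(-t / 120.0)) for t in times]
tau = estimate_lag_tau_s(times, temps)
```

Comparing this value across repeated runs gives a simple check on whether the lag is "stable enough to be compensated or bounded".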

C) Algorithm — cadence, step/slew policy, monotonicity

1) Update cadence (avoid chasing noise)
  • Check: corrections are not toggling rapidly (a sign of measurement noise domination).
  • Log: update period, time error variance, trim activity, outlier counters.
  • Pass: time error converges while trim activity remains bounded.
2) Step vs Slew (log integrity)
  • Check: policy matches the system’s audit/logging requirement.
  • Log: correction mode (step/slew/trim), correction magnitude, reason codes.
  • Pass: no forbidden step events; slews stay within allowable slope limits (project-specific).
3) Monotonicity guard (never go backwards)
  • Check: system time never decreases, even during correction and restore.
  • Log: monotonic violation counter, last-good timestamp, mitigation action.
  • Pass: 0 violations; or violations are explicitly flagged and quarantined by design.
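A monotonicity guard as described in item 3 can be as small as the sketch below. The class name and the "hold at last-good" mitigation are assumptions chosen for illustration; a product may prefer to slew or to flag and quarantine instead.

```python
class MonotonicClock:
    """Wraps corrected time so that reported time never decreases."""

    def __init__(self) -> None:
        self.last_good = None   # last timestamp handed out
        self.violations = 0     # attempted regressions, for the log

    def report(self, corrected_time_s: float) -> float:
        if self.last_good is not None and corrected_time_s < self.last_good:
            self.violations += 1
            return self.last_good       # mitigation: hold at last-good value
        self.last_good = corrected_time_s
        return corrected_time_s


clock = MonotonicClock()
outputs = [clock.report(t) for t in (100.0, 101.0, 100.4, 101.5)]
```

Here the third input would have stepped time backward; the guard holds 101.0 and records one violation.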

D) Backup/Holdover — persistence, restore, and re-discipline

1) Power-fail entry and stability
  • Check: single clean transition into backup mode (no chatter).
  • Log: power-fail timestamp, entry flags, backup current snapshot (if measurable).
  • Pass: no repeated entry/exit; critical parameters remain intact.
2) Parameter persistence cadence
  • Check: storing calibration/metadata does not exceed energy or endurance budget.
  • Log: write count, write reason code, CRC/version, last-write age.
  • Pass: write schedule obeys policy; CRC always validates after restore.
3) Restore and re-discipline (no unexplained jumps)
  • Check: restore uses “coarse then fine” and respects monotonicity.
  • Log: restore phases + time error/trim trajectory, fallback triggers.
  • Pass: time error returns inside the budget within a defined recovery window.
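The "CRC always validates after restore" criterion implies a stored record that carries its own version and checksum. A minimal sketch, assuming a hypothetical 8-byte payload layout (version, trim code, write count) followed by a CRC-32:

```python
import struct
import zlib
from typing import Optional, Tuple

# Hypothetical layout: uint16 version, int16 trim_code, uint32 write_count.
_PAYLOAD = struct.Struct("<HhI")


def pack_cal(version: int, trim_code: int, write_count: int) -> bytes:
    """Serialize the calibration record and append a CRC-32 of the payload."""
    payload = _PAYLOAD.pack(version, trim_code, write_count)
    return payload + struct.pack("<I", zlib.crc32(payload))


def unpack_cal(blob: bytes) -> Optional[Tuple[int, int, int]]:
    """Return (version, trim_code, write_count), or None if the CRC fails."""
    if len(blob) != _PAYLOAD.size + 4:
        return None
    payload = blob[:-4]
    stored_crc = struct.unpack("<I", blob[-4:])[0]
    if zlib.crc32(payload) != stored_crc:
        return None                     # caller falls back to plain RTC mode
    return _PAYLOAD.unpack(payload)


blob = pack_cal(version=3, trim_code=-12, write_count=41)
corrupted = bytes([blob[0] ^ 0x01]) + blob[1:]   # flip one bit
```

Restore logic would apply the trim only when `unpack_cal` returns a record, which keeps an invalid table from ever being used.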

Shipment gate — one-page acceptance template

  • X) Worst-case time error: 24 h worst-case error < X s under temperature profile Y (project-defined).
  • Y) Control health: trim does not saturate long-term; correction activity stays bounded.
  • Z) Robustness: I²C error storms and monotonicity violations are 0 (or within a defined spec).

Fail mapping: X → error budget/thermal/measurement; Y → control strategy/thermal; Z → I²C integrity/firmware handling/factory-field policy.
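The acceptance template can be evaluated mechanically from the logged figures. The sketch below is illustrative: field names and thresholds are placeholders for the project-defined values, and gate Z is treated as a strict zero-count requirement.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GateInputs:
    worst_24h_error_s: float     # measured under the chosen thermal profile
    trim_saturation_pct: float   # share of samples at a trim limit
    i2c_error_storms: int
    monotonic_violations: int


def failed_gates(m: GateInputs, max_error_s: float,
                 max_saturation_pct: float) -> List[str]:
    """Return the failed gates ("X", "Y", "Z"); an empty list means ship."""
    failed = []
    if not m.worst_24h_error_s < max_error_s:
        failed.append("X")
    if not m.trim_saturation_pct < max_saturation_pct:
        failed.append("Y")
    if m.i2c_error_storms != 0 or m.monotonic_violations != 0:
        failed.append("Z")
    return failed


result = failed_gates(GateInputs(0.31, 2.0, 0, 1),
                      max_error_s=0.5, max_saturation_pct=5.0)
```

The returned labels feed directly into the fail mapping above: a "Z" result points at I²C integrity, firmware handling, or factory-field policy.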

Diagram: Checklist gates (Functional → Thermal → Holdover → Field robustness)
Release gates (evidence-driven: logs, profile IDs, CRC/version, alarms/fallback): Gate A Functional (I²C clean, register check, trim works, no torn reads) → Gate B Thermal (profile, sensor lag, gradient, bounded drift) → Gate C Holdover (persist, restore, no jump, re-discipline) → Gate D Field (alarms, fallback, CRC/version, RMA rule) → Shipment (X/Y/Z acceptance met; logs + timestamps, CRC + versioning).
Gate-based checklist keeps drift claims measurable and reproducible across bench, chamber, and production.

Applications & IC selection notes (TCXO-Based RTC)

This section turns “TCXO-based RTC” into a selection flow: choose the architecture (A/B/C), then select TCXO/RTC/NVM features that match the drift budget, holdover profile, and logging integrity needs. Part numbers below are starting points for datasheet lookup—verify suffix, package, temperature grade, and availability.

A) Application slices (within this page boundary)

Industrial data logging (unattended)
  • Need: controlled drift over long runtimes.
  • Constraint: weak/absent external time sources.
  • Accept: 24 h worst-case error < X s under profile Y (project-defined).
Billing / event logs (consistency first)
  • Need: predictable drift + monotonic time policy.
  • Constraint: corrections must not break audit trails.
  • Accept: no forbidden steps; all corrections are logged with reason codes.
Off-grid devices (no gateway time)
  • Need: stable time base without relying on periodic sync.
  • Constraint: power budget, backup domain behavior.
  • Accept: holdover drift stays inside the budget; restore is “coarse then fine”.
Weak GNSS environments (holdover quality)
  • Need: acceptable drift when sync is intermittent.
  • Constraint: unpredictable reacquisition times.
  • Accept: drift bounded across temperature and airflow perturbations.

B) Selection fields (TCXO / RTC / NVM) + concrete part numbers

Option 1 — “All-in-one” temperature-compensated RTCs (fastest BOM)

Use when minimal part count and predictable drift matter more than custom discipline loops.

  • DS3231SN# — I²C RTC with integrated TCXO + crystal.
  • DS3232S# — I²C RTC with integrated TCXO + crystal, plus battery-backed SRAM.
  • DS3234S# — SPI RTC with integrated TCXO + crystal, plus SRAM (SPI systems).
  • DS3231M — I²C temperature-compensated RTC with internal MEMS resonator (no external crystal).
Option 2 — External TCXO as the stable reference (architecture A/C)

Use when the system already has a stable MHz reference (MCU/SoC timer) or needs tighter stability than a plain 32 kHz crystal domain.

  • SiT5356AI-FQ-33E0-10.000000X — 10 MHz Super-TCXO (LVCMOS).
  • ASTX-H11-25.000MHZ-T — 25 MHz TCXO (HCMOS).
  • ASTX-H11-27.000MHZ-T — 27 MHz TCXO (HCMOS; common for platform clocks).

Note: the external TCXO frequency can be divided down to a 1 Hz (seconds) tick through a timer/counter path; firmware policy then corrects the RTC by slew, step, or trim.
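One common way to use the MHz reference is to count its cycles over a gate defined by a whole number of RTC seconds. The sketch below assumes that setup and an illustrative sign convention (positive means the RTC runs fast); the numbers are made up.

```python
def rtc_error_ppm(ref_counts: int, ref_hz: float, gate_rtc_seconds: int) -> float:
    """Fractional RTC rate error in ppm; positive means the RTC runs fast.

    A fast RTC closes the gate early, so fewer reference cycles are counted
    than ref_hz * gate_rtc_seconds.
    """
    expected = ref_hz * gate_rtc_seconds
    return (expected / ref_counts - 1.0) * 1e6


def ppm_to_seconds_per_day(ppm: float) -> float:
    """1 ppm corresponds to 86400 s * 1e-6 = 0.0864 s/day."""
    return ppm * 86400.0 / 1e6


ppm = rtc_error_ppm(ref_counts=99_999_500, ref_hz=10e6, gate_rtc_seconds=10)
drift = ppm_to_seconds_per_day(ppm)
```

With a 10 MHz reference and a 10 s gate, one count of quantization is 0.01 ppm, so the example's 500 missing counts read as roughly 5 ppm, or about 0.43 s/day.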

Option 3 — Nonvolatile storage for calibration tables / metadata (CRC + version)

Use FRAM when frequent updates or “always-on logs” would wear out EEPROM; use EEPROM when writes are infrequent and power is tight.

  • MB85RC256V — 256 Kbit I²C FRAM (Fujitsu).
  • FM24CL64B-GTR — 64 Kbit I²C FRAM (Infineon).
  • 24LC256 — 256 Kbit I²C EEPROM (Microchip; family includes 24AA256 / 24FC256 variants).
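The FRAM versus EEPROM choice is mostly write-endurance arithmetic. The sketch below uses an assumed endurance of one million cycles as a placeholder for an EEPROM (take the real figure from the datasheet) and assumes every write lands on the same cells, with no wear leveling.

```python
def years_to_wear_out(endurance_cycles: float, writes_per_day: float) -> float:
    """Years until the rated write endurance is consumed at a fixed write rate."""
    return endurance_cycles / writes_per_day / 365.0


ASSUMED_EEPROM_CYCLES = 1e6   # placeholder; verify against the datasheet

every_minute = years_to_wear_out(ASSUMED_EEPROM_CYCLES, writes_per_day=1440)
every_hour = years_to_wear_out(ASSUMED_EEPROM_CYCLES, writes_per_day=24)
```

Under these assumptions a once-per-minute write schedule exhausts the budget in under two years, while an hourly schedule lasts more than a century, which is why the write cadence decides the memory type.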
Support silicon (power) — concrete LDO examples
  • TPS7A0220PDQNR — nanopower-IQ LDO (TI).
  • MCP1700T-3302E/TT — low-IQ LDO 3.3 V option (Microchip).
  • ADP150AUJZ-3.3-R7 — ultra-low-noise LDO 3.3 V option (Analog Devices).

Selection rule: backup-domain current dominates holdover; main-domain noise can dominate short-term correction stability. Use two rails/domains when needed.

C) Architecture mapping (A/B/C) — the minimum decision set

  • Need the stable reference directly for a timer/counter? → prefer A) Ref-in.
  • RTC exposes fine trim/offset controls? → prefer B) Measure-and-trim.
  • MCU can stay active and enforce policy? → prefer C) Soft discipline loop.
  • Monotonic event logs required? → constrain to slew/trim-first behavior; avoid uncontrolled step events.
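The decision set above is small enough to encode directly, for example as a design-review helper. The function and key names below are hypothetical; the logic only restates the four bullets.

```python
from typing import Dict, List, Union


def candidate_architectures(needs_ref_timer: bool, rtc_has_trim: bool,
                            mcu_stays_active: bool,
                            monotonic_logs: bool) -> Dict[str, Union[List[str], bool]]:
    """Map the four yes/no questions to architecture candidates and a step policy."""
    picks: List[str] = []
    if needs_ref_timer:
        picks.append("A: Ref-in")
    if rtc_has_trim:
        picks.append("B: Measure-and-trim")
    if mcu_stays_active:
        picks.append("C: Soft discipline loop")
    return {
        "architectures": picks,
        # Monotonic logs constrain runtime behavior to slew/trim-first.
        "allow_runtime_step": not monotonic_logs,
    }


choice = candidate_architectures(needs_ref_timer=False, rtc_has_trim=True,
                                 mcu_stays_active=True, monotonic_logs=True)
```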

D) Risk notes (what breaks drift claims in real deployments)

Thermal gradients & airflow changes

A “ppm-grade” device can still drift if the temperature proxy does not track the oscillator neighborhood. Validate with airflow/load perturbations, not only steady soaks.

Measurement artifacts (I²C delay ≠ drift)

Short observation windows, irregular sampling, or torn reads can look like frequency error. Always log bus retries and read method alongside time error.

Field drift anomalies must be diagnosable

Store calibration version + CRC and correction history (offset/temp/trim/reason). When discipline fails, fall back to a safe RTC mode rather than producing silent time regressions.

Diagram: Selection flow (need → constraints → architecture → parts)
Selection flow: Need drift improvement? → Long holdover / backup? → Monotonic logs required? → MCU can enforce policy? → RTC exposes trim/offset? → Choose architecture: A) Ref-in (stable MHz timer path), B) Trim (measure, write trim), C) Soft (slew/step policy) → Pick features, then parts: TCXO (stability, power, startup), RTC (trim, step, backup, robust I²C), NVM (CRC, endurance, energy) → Validate → Monitor → Maintain (bench + chamber + field logs).
Keep the decision tree small: requirements → architecture A/B/C → feature picks → part numbers → measurable acceptance.

Reference examples (part numbers only; verify suffix/package/grade)

  • Compensated RTC: DS3231SN#, DS3232S#, DS3234S#, DS3231M
  • TCXO reference: SiT5356AI-FQ-33E0-10.000000X, ASTX-H11-25.000MHZ-T, ASTX-H11-27.000MHZ-T
  • FRAM: MB85RC256V, FM24CL64B-GTR
  • EEPROM: 24LC256 (family: 24AA256 / 24FC256)
  • LDO examples: TPS7A0220PDQNR, MCP1700T-3302E/TT, ADP150AUJZ-3.3-R7

FAQs (TCXO-Based RTC)

Short, actionable troubleshooting. Each answer follows a fixed 4-line structure: Likely cause / Quick check / Fix / Pass criteria.

Why does drift look worse after enabling discipline?
Likely cause:
The loop is chasing measurement noise (short window/torn reads) or using a temperature proxy that does not track the oscillator neighborhood (gradient/lag).
Quick check:
Plot trim_code and time_error vs time; if trim toggles frequently while time_error variance rises, noise dominates. Also log i2c_retry_count and torn_read_count.
Fix:
Increase averaging window, add deadband/hysteresis, slow the update cadence, and enforce “trim-first / slew-only” during steady state.
Pass criteria:
trim_step_rate < K/hour and 24h_worst_case_time_error < X seconds under profile Y (budget-defined), with lower variance than “discipline off”.
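The deadband part of this fix can be sketched as a single update rule. The sign convention is an assumption (positive error means the RTC runs fast and a higher code slows it down); check the actual trim register polarity in the datasheet.

```python
def next_trim(trim_code: int, error_ppm: float, deadband_ppm: float,
              trim_min: int, trim_max: int) -> int:
    """Move trim by at most one LSB, and only when the error is outside the deadband."""
    if abs(error_ppm) <= deadband_ppm:
        return trim_code                    # inside deadband: hold, no chatter
    step = 1 if error_ppm > 0 else -1       # assumed polarity, see lead-in
    return max(trim_min, min(trim_max, trim_code + step))


held = next_trim(0, 0.05, deadband_ppm=0.1, trim_min=-128, trim_max=127)
moved = next_trim(0, 0.30, deadband_ppm=0.1, trim_min=-128, trim_max=127)
```

Feeding this rule a filtered error rather than raw samples keeps `trim_step_rate` low, because small noise excursions never leave the deadband.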
Why does time occasionally jump backward even though the RTC is running?
Likely cause:
A step correction is applied without a monotonic guard, or restore logic loads a stale saved timestamp/table and overwrites a newer time.
Quick check:
Log monotonic_violation_count, corr_mode (step/slew/trim), and restore_phase. Correlate backward jumps with correction events or restore entry.
Fix:
Enforce monotonic time: clamp or slew-only during runtime; allow step only at boot with explicit “time-was-untrusted” flag. Validate saved-table version/CRC before applying.
Pass criteria:
monotonic_violation_count = 0 over N days of stress runs (including power cycles), and no negative time deltas in logs.
My I²C reads sometimes show “mixed” seconds/minutes—what’s the first fix?
Likely cause:
Torn reads across a rollover boundary (seconds increment between reads of different registers), or the device’s snapshot/latch mechanism is not being used.
Quick check:
Read time twice back-to-back; if values differ in an impossible way, torn reads are present. Track torn_read_count and burst_read_used.
Fix:
Use device snapshot/latch if available; otherwise use burst read + double-read-verify (accept only identical reads), and avoid reads near known rollover windows.
Pass criteria:
torn_read_count = 0 over N consecutive reads (N sized for the product’s lifetime logging cadence), with i2c_retry_rate bounded.
Why does calibration work at room temp but fail after a thermal cycle?
Likely cause:
The calibration table is overfit to one condition, or thermal gradients/sensor lag shift the effective temperature seen by the oscillator after cycling.
Quick check:
Compare time_error vs temp_proxy correlation before/after cycle; log temp_proxy_offset and lag_tau (estimated). Check cal_version and CRC on restore.
Fix:
Use multi-point calibration across representative temperature points (with soak time), constrain the model (no over-parameterization), and validate across at least two independent thermal runs.
Pass criteria:
After cycle, 24h_worst_case_time_error remains < X seconds under profile Y, and two repeated runs agree within margin M (repeatability gate).
How do I choose the update interval (too fast vs too slow)?
Likely cause:
Too fast: noise and bus jitter dominate, creating correction chatter. Too slow: the loop cannot track temperature drift and gradients.
Quick check:
Sweep update_period and compare time_error variance, trim_step_rate, and correction_event_rate_per_day under the same thermal profile.
Fix:
Choose an interval tied to the thermal time constant (avoid high-frequency chasing), add EWMA/low-pass on error estimates, and introduce a deadband around zero error.
Pass criteria:
trim_step_rate bounded, correction_event_rate_per_day < K/day, and 24h_worst_case_time_error meets the drift budget X under profile Y.
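One way to tie the filter to the thermal time constant is to derive the EWMA smoothing factor from the update period and that time constant. This is a sketch under that assumption; the 60 s period, 600 s time constant, and sample values are illustrative.

```python
import math


def ewma_alpha(update_period_s: float, time_constant_s: float) -> float:
    """Smoothing factor of a first-order low-pass sampled every update_period_s."""
    return 1.0 - math.exp(-update_period_s / time_constant_s)


def ewma(samples, alpha: float):
    """Yield the exponentially weighted moving average of the samples."""
    estimate = None
    for x in samples:
        estimate = x if estimate is None else estimate + alpha * (x - estimate)
        yield estimate


alpha = ewma_alpha(update_period_s=60.0, time_constant_s=600.0)
raw_ppm = [2.0, 2.4, 1.6, 2.2, 1.8]
filtered = list(ewma(raw_ppm, alpha))
```

The raw samples swing by 0.8 ppm while the filtered estimate stays within a few hundredths of a ppm, so a deadband placed around it rarely triggers on noise alone.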
Why does trim saturate at min/max and never recover?
Likely cause:
Wrong sign/units in correction, insufficient trim range for the true error, or a persistent temperature proxy mismatch (gradient/lag) biases the estimator.
Quick check:
Log trim_code, time_error slope (seconds/day), and the sign of “error→trim” mapping. Verify whether error improves when trim moves away from saturation.
Fix:
Re-zero the controller (offset), clamp integral growth, add deadband, and validate trim direction with two-point perturbation. If range is insufficient, switch to a different architecture (trim + slew policy).
Pass criteria:
trim_saturation_pct < P% over 24h and time_error slope converges inside the budget (X seconds/day equivalent).
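Both the saturation metric and the integral clamp are short computations. The helper names and the limits in the example are assumptions for illustration.

```python
def trim_saturation_pct(trim_codes, trim_min: int, trim_max: int) -> float:
    """Percentage of logged samples where trim sits at either limit."""
    codes = list(trim_codes)
    if not codes:
        return 0.0
    at_limit = sum(1 for c in codes if c <= trim_min or c >= trim_max)
    return 100.0 * at_limit / len(codes)


def clamp_integral(integral: float, error: float, ki: float, limit: float) -> float:
    """Accumulate the integral term but keep it within +/- limit (anti-windup)."""
    return max(-limit, min(limit, integral + ki * error))


saturation = trim_saturation_pct([127, 127, 10, -128], trim_min=-128, trim_max=127)
integral = clamp_integral(9.5, error=10.0, ki=0.1, limit=10.0)
```

Without the clamp, the integral term keeps growing while trim is pinned at a limit, and the loop then takes a long time to come back once the error changes sign.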
How can I tell thermal gradient vs true oscillator drift?
Likely cause:
Gradients change the oscillator neighborhood temperature without changing the measured proxy; “true drift” stays consistent for the same neighborhood temperature.
Quick check:
Hold the same chamber setpoint and toggle airflow/load; if time_error shifts while temp_proxy stays similar, gradients dominate. Track Δtime_error(airflow).
Fix:
Improve sensor placement near the oscillator region, manage thermal paths (isolation/heat spreading), and slow corrections to match the measured thermal lag.
Pass criteria:
Under defined airflow/load toggles, Δtime_error < X seconds/day-equivalent, and the residual error correlates with the chosen temperature proxy within tolerance.
What’s the minimum data I should log to debug field drift?
Likely cause:
Field drift is often “un-debuggable” because the correction context (temperature, trim, bus health, restore path) is missing.
Quick check:
Ensure each correction event logs a compact record with consistent timestamps (monotonic + wall time if available).
Fix:
Log at minimum: time_error, freq_error_est, temp_proxy, trim_code, update_period, corr_mode, i2c_retry_count, torn_read_count, power_state, restore_phase, cal_version, CRC, plus a reason_code.
Pass criteria:
For any anomaly window, logs reconstruct (1) correction actions, (2) bus health, (3) temperature context, and (4) calibration versioning with no gaps larger than T minutes.
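The minimum field list maps naturally onto a fixed record type. The sketch below mirrors the fields named in the fix; the types, units, and the gap helper are assumptions for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CorrectionRecord:
    mono_time_s: float          # monotonic timestamp of the event
    time_error_s: float
    freq_error_est_ppm: float
    temp_proxy_c: float
    trim_code: int
    update_period_s: float
    corr_mode: str              # "step", "slew", or "trim"
    i2c_retry_count: int
    torn_read_count: int
    power_state: str
    restore_phase: str
    cal_version: int
    cal_crc_ok: bool
    reason_code: str


def max_gap_s(records) -> float:
    """Largest spacing between consecutive records, for the 'no gaps > T' check."""
    times = sorted(r.mono_time_s for r in records)
    return max((b - a for a, b in zip(times, times[1:])), default=0.0)


first = CorrectionRecord(0.0, 0.012, 1.4, 31.5, -3, 60.0, "trim",
                         0, 0, "main", "none", 3, True, "periodic")
log = [first, replace(first, mono_time_s=60.0), replace(first, mono_time_s=300.0)]
largest_gap = max_gap_s(log)
```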
Why does backup mode lose the “disciplined” benefit after long off time?
Likely cause:
Calibration/context is not retained (or fails CRC), backup-domain conditions differ (temperature/voltage), or restore does not re-enter discipline in a controlled “coarse then fine” sequence.
Quick check:
After restore, verify cal_version/CRC, compare early time_error slope to pre-off baseline, and check whether the loop starts with stale trim.
Fix:
Rate-limit writes, persist only stable estimates, and on restore run a deterministic sequence: validate table → apply coarse alignment (if allowed) → enable fine trim/slew with bounded gains.
Pass criteria:
After an off-time of T hours, recovery_time < R minutes to return inside the drift budget, and CRC_fail_count = 0.
Is it better to correct frequency (trim) or correct time (slew/step)?
Likely cause:
Trim targets long-term rate error; slew/step targets phase error. The wrong choice breaks monotonic logs or creates visible discontinuities.
Quick check:
Identify constraints: allowed_step (yes/no), max_slew_rate (ppm or ms/s), and how errors appear (rate vs phase). Log corr_mode decisions.
Fix:
Use trim-first for steady-state drift; use slew to remove residual phase error without time going backward; reserve step for controlled boot-time alignment with explicit flags.
Pass criteria:
monotonic_violation_count = 0, slew_rate ≤ S (policy-defined), and 24h_worst_case_time_error meets X under profile Y.
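The step-versus-slew policy can be written as one planning function per update interval. This is a sketch under assumed conventions: a positive phase error means the RTC is ahead of the reference, and the slew limit is expressed in ppm.

```python
from typing import Tuple


def plan_correction(phase_error_s: float, at_boot: bool, time_untrusted: bool,
                    max_slew_ppm: float, interval_s: float) -> Tuple[str, float]:
    """Return (mode, adjustment_s) to apply over one update interval."""
    if at_boot and time_untrusted:
        return ("step", -phase_error_s)         # flagged boot-time alignment only
    budget = max_slew_ppm * 1e-6 * interval_s   # largest shift allowed this interval
    adjustment = max(-budget, min(budget, -phase_error_s))
    return ("slew", adjustment)


mode, adjustment = plan_correction(0.5, at_boot=False, time_untrusted=False,
                                   max_slew_ppm=100.0, interval_s=60.0)
```

Because the slew budget is a small fraction of the interval, reported time keeps advancing while the phase error is removed, which preserves monotonic logs.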
Why does storing calibration more often reduce accuracy?
Likely cause:
The system commits noisy estimates (short windows), writes during unstable thermal periods, or creates write-correlated artifacts (power/bus contention), degrading repeatability.
Quick check:
Correlate time_error variance with write_event timestamps; compare “store often” vs “store on stability” runs under the same profile.
Fix:
Store only when stable (soak + low variance), decimate updates (min interval), use median/trimmed-mean estimates, and add version/CRC so invalid tables never apply.
Pass criteria:
write_event_rate ≤ W/day, post_store_error_delta does not increase beyond margin M, and CRC_fail_count = 0.
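A "store on stability" rule can be expressed as a single gate in front of the NVM write. The thresholds and the choice of population standard deviation as the stability measure are assumptions for illustration.

```python
import statistics
from typing import Optional, Sequence


def estimate_to_store(window_ppm: Sequence[float], max_stdev_ppm: float,
                      seconds_since_last_write: float,
                      min_interval_s: float) -> Optional[float]:
    """Return the median estimate if the window is stable and the minimum
    write interval has elapsed; otherwise None (skip this write)."""
    if seconds_since_last_write < min_interval_s or len(window_ppm) < 3:
        return None
    if statistics.pstdev(window_ppm) > max_stdev_ppm:
        return None                     # unstable period: do not commit
    return statistics.median(window_ppm)


stable = estimate_to_store([1.9, 2.0, 2.1, 2.0, 2.0], 0.2, 7200.0, 3600.0)
noisy = estimate_to_store([1.0, 3.0, 2.0, 5.0, 0.5], 0.2, 7200.0, 3600.0)
```

The median keeps a single outlier from shifting the stored value, and the interval check bounds `write_event_rate` regardless of how often the loop runs.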
How do I set pass criteria without a perfect external time reference?
Likely cause:
Absolute time is hard to prove without a trusted reference; however, drift improvement can be verified using repeatability, bounded-error budgets, and independent cross-checks.
Quick check:
Define X/Y/T (error bound / thermal profile / duration), then run at least two repeat tests and compare 24h_worst_case_time_error and repeatability_delta.
Fix:
Use a lab counter/known-good timebase when available; otherwise use “two independent references” (e.g., a stable counter source + repeated profile) and require consistency across runs rather than single-shot truth.
Pass criteria:
24h_worst_case_time_error < X seconds under profile Y, and repeatability_delta < M across ≥2 runs; measurement method and uncertainty are logged.
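The pass criteria can be checked from two logged runs. The sketch below defines `repeatability_delta` as the difference between the two worst-case errors, which is one possible definition and not the only one; X and M are passed in as project-defined limits.

```python
def worst_case_error_s(errors_s) -> float:
    """Largest absolute time error seen in a run."""
    return max(abs(e) for e in errors_s)


def repeatability_gate(run_a, run_b, max_error_s: float, max_delta_s: float) -> dict:
    """Evaluate the worst-case bound (X) and the run-to-run margin (M)."""
    worst_a = worst_case_error_s(run_a)
    worst_b = worst_case_error_s(run_b)
    delta = abs(worst_a - worst_b)
    return {
        "worst_a_s": worst_a,
        "worst_b_s": worst_b,
        "repeatability_delta_s": delta,
        "passed": worst_a < max_error_s and worst_b < max_error_s
                  and delta < max_delta_s,
    }


report = repeatability_gate([0.10, -0.30, 0.25], [0.05, -0.20, 0.35],
                            max_error_s=0.5, max_delta_s=0.1)
```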
Data-structured answer format (fixed)
Likely cause → Quick check (signals/log fields) → Fix (1–2 actions) → Pass criteria (measurable thresholds: X/Y/K/M/N/T/P/W placeholders).