Process Data Logger / Gateway (Integrity-Grade Logging)
A process data logger is not just a recorder—it is an evidence system that keeps records in order, keeps time consistent, survives power loss without silent corruption, and makes tampering detectable with verifiable signatures.
A “trustworthy” logger is judged by its proof: monotonic timelines, safe commit markers, recoverable storage, and audit-ready integrity fields—not by how many protocols it can read.
H2-1. Core Idea — What Makes a Logger “Trustworthy”
This chapter does not describe features. It sets the acceptance bar for what an industrial logger must prove—under audit, incident review, and worst-case power events.
A process data logger is not a recorder. It is an evidence system: it must preserve integrity, survive power loss without structural damage, keep time consistent, and make tampering detectable. Multi-protocol support is engineering workload; the real challenge is whether the recorded data remains defensible when something goes wrong.
- No-loss (Completeness): data gaps must be prevented where possible, and when unavoidable they must be measured and reported (drop counters, gap markers).
- In-order (Causality): the system must prove “what happened first” using sequence numbers and commit boundaries, not just arrival timing.
- Monotonic time (No backward jumps): wall-clock can step during sync, but logs must remain monotonic using a stable counter and recorded clock-step events.
- Power-fail safe (Structural self-consistency): brownouts must not leave half-written metadata or corrupted indices; recovery must be deterministic.
- Tamper-evident (Verifiable integrity): edits, deletions, or rollback must be detectable via hash chaining/signatures and key epoch rules.
These criteria define “trust” as a testable contract. The rest of the page maps each contract item to architecture choices and evidence fields that can be validated with power-yank tests, time-step tests, and integrity checks.
H2-2. System Architecture Overview
This architecture is best understood as four interlocking paths—data, power, time, and trust. Field failures become debuggable when each symptom is mapped to one (or a coupling) of these paths.
- Data path: ingress → aggregation → buffer → storage commit. Goal: preserve completeness and ordering under rate mismatch and bursts.
- Power path: brownout detect → freeze ingress → safe commit window → mark clean. Goal: prevent half-written structures and enable deterministic recovery.
- Time path: RTC/time sync → timestamp discipline → monotonic counter + clock-step events. Goal: logs remain monotonic even if wall-clock steps.
- Trust path: hash/sign → key epoch rules → verification status. Goal: tampering and rollback become detectable, not debatable.
A “module list” only becomes an evidence system when each module produces a small set of audit fields that can be checked after the fact: source identity, ingress time, sequence counters, commit markers, clean-shutdown flags, clock-step events, and signature verification state. Those fields are the bridge between engineering design and defensible records.
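The audit fields listed above can be gathered into one record shape. A minimal sketch in Python — the field names other than those the page defines (clean_flag, mono_ctr, sig_verify) are illustrative assumptions, not a standard layout:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AuditFields:
    # Hypothetical container; real systems would pack these into each
    # committed record or segment header.
    source_id: str    # source identity: which ingress produced the record
    ingress_ts: int   # evidence entry time, in monotonic ticks
    seq: int          # per-source sequence counter
    commit_id: int    # commit boundary this record belongs to
    clean_flag: bool  # true only after a verified clean shutdown/commit
    sig_verify: str   # "PASS" / "FAIL" / "UNCHECKED" verification state
```

Keeping these fields together makes post-incident queries mechanical: filter by source_id, sort by ingress_ts, and check seq continuity.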
H2-3. Multi-Protocol Aggregation Engine
This chapter avoids protocol details and focuses on aggregation principles: converting heterogeneous inputs into a single evidence event stream with identity, time meaning, order proof, and observable loss.
A logger becomes defensible only when the aggregation layer emits records that can answer audit questions: what arrived, from where, in what order, with what time meaning, and what was lost or degraded. The core deliverable is not “more protocols”—it is a normalized event model.
Collection mode sets the meaning of time
- Polling collection provides a scheduled sampling instant; jitter is dominated by task timing. The event time represents “observed at poll.”
- Interrupt/event collection captures edges and bursts; latency is dominated by ISR/queueing. The event time represents “entered evidence chain after interrupt.”
Because the physical meaning differs, the system must treat ingress timestamp as the canonical “evidence entry time” and keep any additional “source time” strictly optional and explicitly labeled in later time-integrity chapters.
Normalization: the minimum audit record
- Timestamp at ingress: the earliest consistent point where all sources can be measured on one clock reference.
- Per-source sequence: enables gap/duplicate/reorder detection without guessing based on arrival timing.
- Quality & loss observability: drops and degradations must be measurable (drop counters, reason codes, overrun flags).
Rate mismatch handling: choose a controlled degradation
- Backpressure: slow sources or pause ingestion when queue depth exceeds thresholds, with explicit bp_state.
- Decimation: down-sample with declared factors and windows, never silently.
- Drop policy: if dropping is required, record drop_cnt and reason_code so gaps are explainable.
Any degradation that does not leave an evidence trace becomes indistinguishable from tampering or malfunction.
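The per-source sequence rule above can be sketched as a small classifier that detects gaps, duplicates, and reordering without guessing from arrival timing. The function name and return strings are assumptions for this sketch:

```python
def classify(prev_seq, seq):
    """Classify an incoming per-source sequence number against the
    last one seen from the same source."""
    if prev_seq is None:
        return "first"                      # first record from this source
    if seq == prev_seq + 1:
        return "in_order"                   # contiguous: no loss
    if seq <= prev_seq:
        return "duplicate_or_reorder"       # arrived late or delivered twice
    # seq jumped forward: the missing count becomes an explicit gap marker
    return f"gap:{seq - prev_seq - 1}"
```

Any non-"in_order" result should be written into the evidence stream (gap marker, drop counter, reason code) rather than silently discarded.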
H2-4. Buffering & Ordering Strategy
This chapter is the engineering pivot: buffering is not a throughput trick. It defines commit semantics—what is considered “persisted,” what may be lost, and how order remains provable.
The primary KPI is ordering integrity, not peak throughput. Throughput can be degraded with explicit traces (drops/decimation/backpressure), but if ordering becomes unprovable, records lose evidentiary value.
Ring buffer vs linear write: choose by recoverability
- Ring buffer isolates burst intake and makes “last consistent point” discoverable using pointers and wrap counts.
- Linear write can be simpler but is more vulnerable to partial metadata damage unless commit boundaries are explicit.
Double buffering enables atomic commit boundaries
- Buffer A receives ingress events while Buffer B is committed to storage.
- Switching buffers defines a commit boundary: a unit that must be locatable, verifiable, and recoverable.
Commit boundaries: locatable, verifiable, recoverable
- Locatable: commit_ptr and commit_id identify the last completed boundary.
- Verifiable: each boundary carries CRC/hash so partial writes are detectable.
- Recoverable: on reboot, replay rules rebuild state up to the last verified boundary.
Atomic records: detect “half records” without guessing
- Records must be self-delimiting (length + end marker) so the system can distinguish complete vs partial entries.
- Partial records are quarantined to the uncommitted region and never contaminate the committed history.
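A minimal sketch of a self-delimiting record layout (length + CRC + end marker) and a scan that distinguishes complete records from a partial tail. The magic/end byte values and the exact layout are illustrative assumptions, not a defined format:

```python
import struct
import zlib

MAGIC = 0xA5  # assumed start-of-record byte
END = 0x5A    # assumed end-of-record marker

def encode_record(payload: bytes) -> bytes:
    # header: magic + little-endian length; trailer: CRC32 + end marker
    head = struct.pack("<BI", MAGIC, len(payload))
    tail = struct.pack("<IB", zlib.crc32(payload), END)
    return head + payload + tail

def scan(buf: bytes):
    """Yield ('complete', payload) per intact record, then at most one
    ('partial', offset) for a truncated/damaged tail to quarantine."""
    off = 0
    while off < len(buf):
        if buf[off] != MAGIC or off + 5 > len(buf):
            yield ("partial", off)
            return
        (length,) = struct.unpack_from("<I", buf, off + 1)
        end = off + 5 + length + 5
        if end > len(buf):
            yield ("partial", off)          # record runs past the data: half-written
            return
        payload = buf[off + 5 : off + 5 + length]
        crc, marker = struct.unpack_from("<IB", buf, off + 5 + length)
        if marker != END or crc != zlib.crc32(payload):
            yield ("partial", off)          # trailer damaged: do not trust
            return
        yield ("complete", payload)
        off = end
```

The key property: a power cut mid-write produces a detectable partial tail, never a record that parses as complete with wrong contents.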
H2-5. NAND / SD Storage Management
Storage is the most common root cause of “mysterious corruption.” The goal is not just to write data, but to keep a recoverable and verifiable history across aging, noise, and unexpected power loss.
Flash reality: “written” is not the same as “recoverable”
NAND and SD-backed media rely on ECC, mapping, and retries. As devices age or operate at high temperature, raw bit errors increase and recovery work grows. When power fails during internal updates, metadata can become inconsistent even when some files still appear readable. Integrity-grade logging therefore requires health counters and deterministic recovery rules, not just a mountable filesystem.
Wear leveling: physical location is not evidence
- Wear leveling relocates data across physical blocks to extend life, so physical addresses cannot be treated as stable proof.
- The evidence boundary must be the commit structure: segment headers, commit markers, and verifiable chain pointers.
- Rising ecc_err_cnt is an early signal of approaching end-of-life or thermal stress.
Bad block management: detect and contain degradation
- Factory bad blocks exist from day one; grown bad blocks appear over time.
- The logger should maintain a bad_block_tbl with growth trend and timestamps to make failures explainable.
- Weak blocks often show up first as increasing ecc_err_cnt and retry activity before hard failure.
Journaling filesystem vs raw partition: deterministic recovery wins
Non-transactional metadata updates can leave directory structures and allocation maps half-updated after power loss.
The result may be “mountable but wrong,” which is fatal for evidence systems. Journaling introduces a replay rule:
after restart, the system replays (or rolls back) to the last consistent metadata state, and records the outcome in journal_replay.
Raw partitions can also be safe, but only when they implement an explicit journal-like commit protocol at the application layer.
Metadata redundancy: protect the recovery entry point
- Redundant metadata copies (A/B or N-of-M) protect pointers, indices, and segment headers from single-point corruption.
- Updates must be write-new-then-switch: write a new metadata copy, verify it, then atomically advance the active pointer.
- Verification status should be summarized in verify_stat so post-mortem analysis does not require guesswork.
CRC vs cryptographic hash: different threats, different guarantees
- CRC is effective against random corruption (noise, bit flips, media errors) and is computationally cheap.
- Cryptographic hash detects structured changes and supports tamper evidence when combined into a chain.
- A hash chain makes deletions/insertions detectable by linking each segment to the previous one; hash_head identifies the current chain head.
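The hash-chain idea can be shown in a few lines of stdlib Python. The zeroed genesis value and function names are assumptions for this sketch:

```python
import hashlib

GENESIS = b"\x00" * 32  # assumed starting value for an empty chain

def chain_hash(prev_hash: bytes, segment: bytes) -> bytes:
    # Each link commits to the previous hash, so removing or inserting
    # a segment changes every subsequent link.
    return hashlib.sha256(prev_hash + segment).digest()

def verify_chain(segments, head: bytes) -> bool:
    """Recompute the chain and compare against the stored head
    (hash_head in the page's terminology)."""
    h = GENESIS
    for seg in segments:
        h = chain_hash(h, seg)
    return h == head
```

A CRC per segment would still pass after a whole segment is deleted; the chain head comparison is what makes the *history* verifiable, not just each block.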
H2-6. Power-Loss Hold-Up & Safe Commit
Hold-up does not prevent a power loss. It guarantees a deterministic safe-commit window: enough time to freeze ingestion, flush critical state, advance commit metadata, and mark a clean boundary.
Hold-up sizing is a timing budget problem
The sizing question is not “how many farads,” but “how many milliseconds of guaranteed work.” The budget must cover detection latency, ingress freeze, storage flush behavior, metadata commit, and verification margin. If the window is not guaranteed at end-of-life temperature and aging, the design is not deterministic.
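The timing-budget framing can be made concrete with simple arithmetic. Phase names and the millisecond values below are illustrative assumptions, not measured figures:

```python
def holdup_margin_ms(budget: dict, guaranteed_window_ms: float) -> float:
    """Return the margin left after summing worst-case phases of the
    safe-commit window. Negative margin means the design is not
    deterministic at that operating point."""
    return guaranteed_window_ms - sum(budget.values())

# Example worst-case budget (hypothetical numbers, to be measured
# at end-of-life temperature and capacitor aging):
budget = {
    "brownout_detect": 2.0,   # detection + debounce latency
    "ingress_freeze":  1.0,   # stop accepting events at a boundary
    "storage_flush":  25.0,   # worst-case device-internal flush
    "metadata_commit": 5.0,   # journal + commit pointer + hash head
    "verify_margin":   5.0,   # read-back / safety margin
}
```

With a 50 ms guaranteed hold-up window this example leaves 12 ms of margin; the same budget at end-of-life capacitance may not.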
Brownout detection threshold: early enough, not noisy
- Too late: voltage collapses before metadata commit finishes, creating partial states.
- Too sensitive: line ripple triggers frequent commit cycles, reducing performance and increasing wear.
- Count and timestamp events in bo_event, and track unexp_rst to prove stability over time.
Pre-commit window: freeze → commit → mark clean
- Freeze ingress: stop accepting new events at a defined boundary and preserve ordering.
- Commit metadata: advance journal entries, commit pointers, and hash head as the durable boundary.
- Mark clean: set clean_flag only after the commit is verifiably complete.
If power collapses mid-window, the restart must replay to the last verified boundary and record an incomplete_mk so the loss is explainable rather than ambiguous.
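The freeze → commit → mark-clean ordering can be sketched as a sequence over plain dictionaries; the point is only the ordering (clean_flag is written last), and the read-back check is a stand-in for real verification:

```python
def safe_commit(state: dict, storage: dict) -> dict:
    """Illustrative ordering only, using the page's field names.
    A power collapse at any earlier step leaves clean_flag unset,
    so recovery is deterministic."""
    state["ingress_frozen"] = True               # 1) freeze ingress at a boundary
    storage["commit_ptr"] = state["write_ptr"]   # 2) advance commit metadata
    storage["hash_head"] = state["pending_hash"]
    # stand-in for a real read-back verification of the commit
    if storage["commit_ptr"] == state["write_ptr"]:
        storage["clean_flag"] = True             # 3) mark clean only after verifying
    return storage
```

On restart, clean_flag == False means "replay to the last verified boundary", never "assume the tail is good".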
Supercap vs bulk capacitor: choose predictability under aging
- Supercap: longer windows, but requires charging control and health monitoring (leakage/ESR aging).
- Bulk capacitor: simpler, but shorter and less predictable under temperature and lifetime degradation.
- The best choice is the one that preserves the timing budget at end-of-life, not the one with the largest nominal energy.
Flush timing: a request is not proof
SD/eMMC devices may absorb writes internally and complete them later. A flush call is therefore a request, not a guarantee. The only safe proof is a committed boundary marker that can be verified after reboot: commit pointer advanced, journal entry consistent, and hash head updated.
H2-7. Time Integrity & Synchronization
Time is the core of evidence: it defines order, causality, and replay. The system must treat wall-clock time as a convenience and monotonic time as the ordering authority.
RTC drift: bounded, observable, and never assumed
RTC drift is inevitable under temperature variation and aging. Evidence systems therefore treat RTC as a local reference with
bounded error that must be measured and recorded. Storing rtc_offset makes time interpretation auditable:
it explains why timestamps diverge after long offline periods and how much correction was applied.
Discipline strategy: slew when possible, step only with a trace
- Slew gradually adjusts the wall clock to avoid discontinuities in human-readable time.
- Step may be required for large errors or leap-second style events, but must generate a clock_step_log entry.
- Quality gating prevents bad time sources from poisoning evidence; expose the result via sync_state.
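The slew-vs-step decision reduces to a small policy function. The 500 ms threshold below is an assumed example, not a recommendation; real policy depends on the deployment:

```python
def discipline(offset_ms: float, slew_limit_ms: float = 500.0):
    """Decide how to apply a measured wall-clock error.
    Returns the action plus the signed delta, which for a 'step'
    must also be written to clock_step_log with direction and source."""
    if abs(offset_ms) <= slew_limit_ms:
        return ("slew", offset_ms)   # gradual correction, no discontinuity
    return ("step", offset_ms)       # discontinuity allowed, but traced
```

Either way, mono_ctr ordering is untouched; only the wall-clock interpretation changes.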
Monotonic counter: the ordering key that never goes backward
Every record should carry a strictly increasing mono_ctr used for sorting, windowing, and latency metrics.
Wall clock time remains a secondary field for cross-system alignment and human reading. If wall time repeats or moves backward,
monotonic ordering still guarantees a consistent event sequence.
Wall clock vs monotonic: dual-time model per record
- monotonic_time (via mono_ctr): ordering and causality.
- wall_time: alignment and reporting; may step.
- sync markers: last_sync_ts and sync_state explain validity at capture time.
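The dual-time model is easiest to see with a backward wall-clock step in the data. A minimal sketch (field names follow the page's text):

```python
records = [
    {"mono_ctr": 1, "wall_time": 1000.0},
    {"mono_ctr": 2, "wall_time": 1001.0},
    {"mono_ctr": 3, "wall_time":  950.0},  # wall clock stepped backward here
]

# mono_ctr is the ordering authority; sorting by wall_time would
# silently reorder the third event before the first two.
ordered = sorted(records, key=lambda r: r["mono_ctr"])
```

Downstream analysis re-maps wall_time using clock_step_log entries, but never re-sorts by it.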
Leap seconds and clock steps: allow wall time to bend, never the log
Leap seconds, operator time changes, or upstream corrections can force wall time discontinuities. The logger must accept that wall time
can repeat or jump, but it must never break monotonic ordering. Each step event must be recorded with direction and magnitude in
clock_step_log so any downstream analysis can re-map wall time with full context.
H2-8. Signatures & Tamper Resistance
Evidence requires more than “checksums.” The system must detect modifications, prevent silent deletion/insertion, and resist rollback to old but valid histories.
Hash chain per block: detect deletions and insertions
A single hash can prove that one block was not altered, but it cannot prove that the history is complete.
A hash chain links each committed segment to the previous one (via prev_hash), making missing or inserted segments detectable.
The current chain head (hash_head) is part of the evidence state and must be advanced only at verified commit boundaries.
Segment-level signing: bind evidence to a device identity
- Each commit segment produces a segment hash; the logger signs it to create a tamper-evident record.
- Verification must produce an explicit sig_verify result (pass/fail + reason), not a silent best-effort.
- Signing granularity should align with commit boundaries to keep recovery and verification consistent.
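A sketch of segment-level verification producing an explicit sig_verify result. An HMAC stands in for the asymmetric signature a secure element would produce, and the in-RAM key is exactly what the next subsection argues against — both are assumptions for illustration only:

```python
import hashlib
import hmac

KEY = b"device-root-key"  # illustrative; real keys live in a secure element

def sign_segment(seg_hash: bytes) -> bytes:
    # HMAC-SHA256 as a stand-in for on-chip signing of the segment hash
    return hmac.new(KEY, seg_hash, hashlib.sha256).digest()

def sig_verify(seg_hash: bytes, sig: bytes) -> dict:
    """Explicit pass/fail with a reason, never a silent best-effort."""
    ok = hmac.compare_digest(sign_segment(seg_hash), sig)
    return {"result": "PASS" if ok else "FAIL",
            "reason": None if ok else "signature_mismatch"}
```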
Root key storage: signatures only matter if keys are non-extractable
- Software-stored keys are copyable and undermine evidentiary value.
- MCU secure storage raises the bar but still depends on platform hardening.
- Secure elements keep root keys non-extractable and perform signing internally; record the choice as key_src.
Anti-rollback counter: prevent “valid but old” histories
Without anti-rollback, an attacker can replay an older signed log that still verifies. A monotonic rollback_ctr,
stored in a domain that cannot be decremented, prevents reverting to prior histories or firmware states.
Binding evidence to the running software environment via fw_ver_hash closes the loop: the log can prove not only that it was unmodified,
but also that it was produced under the expected firmware lineage.
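The anti-rollback gate is a comparison, not cryptography: a replayed old history still verifies, so the counter and firmware-lineage checks must run after signature verification. A sketch using the page's field names (the helper name and reason strings are assumptions):

```python
def accept_segment(seg: dict, stored_rollback_ctr: int, expected_fw_hash: str):
    """Gate applied to a segment that already passed sig_verify.
    Rejects valid-but-old histories and wrong firmware lineage."""
    if seg["rollback_ctr"] < stored_rollback_ctr:
        return (False, "rollback")          # older signed history replayed
    if seg["fw_ver_hash"] != expected_fw_hash:
        return (False, "firmware_lineage")  # produced by unexpected firmware
    return (True, None)
```

The stored counter must live in a domain that cannot be decremented (e.g. a secure element's monotonic counter), or the gate is decorative.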
H2-9. Gateway Uplink Strategy (Optional)
Uplink is a copy path, not the source of truth. The local evidence store remains authoritative: only verifiable, committed segments should be transported and accepted.
Store-and-forward: upload only committed segments
- Transport unit is a committed segment (bounded by commit pointers and journal rules), not raw streaming bytes.
- Each uploaded segment should carry segment identity: segment ID, monotonic range, hash head, and signature.
- Uploading “in-progress” data breaks recovery semantics and makes gaps impossible to explain.
Offline mode: treat disconnection as normal
- Logging must continue locally even when uplink is down; backlog growth must not corrupt commit boundaries.
- Backpressure should apply to uplink throughput, not to evidence preservation.
- Queue depth and last-ack markers (if implemented) should be logged as operational evidence.
Retry backoff: stability and device health
Aggressive retries amplify congestion and increase energy draw. Use exponential backoff with jitter, and record failure classes (timeout, handshake, unreachable) so behavior remains auditable over time.
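Exponential backoff with full jitter fits in a few lines; the base and cap constants below are illustrative, and real systems should also log the failure class per attempt:

```python
import random

def backoff_delay(attempt: int, base_s: float = 1.0, cap_s: float = 300.0) -> float:
    """Delay before retry `attempt` (0-based): exponential growth capped
    at cap_s, with full jitter to de-synchronize fleets of devices."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return random.uniform(0.0, ceiling)
```

Full jitter (uniform over the whole window) avoids the synchronized retry waves that fixed delays produce after a shared outage.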
Duplicate suppression: dedup by identity, never by wall time
- Duplicate delivery is expected under retries; reception must be idempotent.
- Dedup keys should be derived from segment/record identity (e.g., segment ID + hash), not wall-clock timestamps that may step.
- Monotonic counters are the ordering authority; wall time is not safe for identity decisions.
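The dedup rule above can be sketched as an idempotent acceptor whose key deliberately excludes wall time. The function names and dict fields are assumptions for this sketch:

```python
def dedup_key(segment: dict) -> tuple:
    # Identity from stable fields only; wall_time may step or repeat
    # and must never participate in identity decisions.
    return (segment["segment_id"], segment["hash_head"])

seen = set()

def accept(segment: dict) -> bool:
    """Idempotent reception: the same segment delivered twice under
    retry is silently dropped the second time."""
    key = dedup_key(segment)
    if key in seen:
        return False
    seen.add(key)
    return True
```

A persistent receiver would back `seen` with storage bounded by the sender's retransmission window.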
TLS vs local storage trust: transport security is not evidence integrity
TLS protects data in transit, but it does not prove that records were not modified at rest or replayed from an older valid history. Evidence integrity still requires local commit boundaries, hash chains, signatures, and anti-rollback policies. A receiver should verify chain and signatures before accepting a segment as evidence.
Upload boundary
Only committed segments; never partial windows. Align with commit pointers and journal rules.
Acceptance rule
Verify signature + chain continuity before acknowledging. Reject unverifiable segments.
Backoff rule
Exponential backoff + jitter; classify failures to keep behavior explainable.
Dedup rule
Dedup by stable ID/hash; never by wall time. Reception should be idempotent.
H2-10. Failure Modes & Forensics
Forensics converts symptoms into evidence-driven conclusions. Each failure mode below maps to specific evidence fields and the chapters that define recovery, ordering, and integrity rules.
Corrupted SD card (→ H2-5 / H2-6)
- Looks like: mount failures, unreadable segments, “random” missing data.
- Check: rising ecc_err_cnt, growth in bad_block_tbl, frequent journal_replay, presence of incomplete_mk.
- Likely cause: media degradation plus incomplete safe-commit windows.
- First fix: strengthen journaling/metadata redundancy and re-budget hold-up for verified commit.
Time reset to epoch (→ H2-7 / H2-6)
- Looks like: timestamps jump to 1970/2000; ordering becomes confusing across reboots.
- Check: empty or stale last_sync_ts, abnormal rtc_offset, large deltas in clock_step_log, correlated brownout/reset events.
- Likely cause: RTC power domain loss or discipline policy accepting poor time sources.
- First fix: mark wall time invalid while preserving mono_ctr ordering; tighten sync quality gating.
Record gap (→ H2-4 / H2-5 / H2-6)
- Looks like: missing span in monotonic ranges; segment continuity breaks.
- Check: commit pointer jumps, incomplete_mk markers, chain discontinuity (hash_head / prev_hash), journal replay rollbacks.
- Likely cause: interrupted commit boundary during power loss, or buffer wrap without explicit boundary markers.
- First fix: enforce atomic commit boundaries and write explicit gap markers to keep evidence explainable.
Duplicate entries (→ H2-3 / H2-7 / H2-9)
- Looks like: identical records repeated, often around reconnect or retry events.
- Check: duplicate source+sequence, repeated mono_ctr ranges, uplink retry bursts; confirm dedup keys are not wall-time based.
- Likely cause: retries without idempotent acceptance; dedup driven by a wall clock that stepped.
- First fix: dedup by stable record/segment ID and require idempotent receiver behavior.
Brownout loop (→ H2-6 / H2-5)
- Looks like: repeated resets; system never reaches a clean commit state.
- Check: rapidly increasing unexp_rst, clean_flag rarely true, frequent bo_event, repeated journal_replay.
- Likely cause: brownout threshold too aggressive, or hold-up budget insufficient under load/inrush.
- First fix: re-budget the safe-commit window, adjust thresholds/debounce, and minimize work required inside the window.
H2-11. Validation & Test Strategy
Validation is only meaningful when each test produces a consistent evidence story: power events, time events, storage recovery, and tamper checks must be traceable in fields and logs.
Power yank test (→ H2-6 / H2-5 / H2-4)
Test actions
1) Run steady ingress + commit workload.
2) Yank power at three phases: (A) pre-commit, (B) inside commit window, (C) metadata flush boundary.
3) Reboot and execute recovery scan / journal replay.
4) Verify last committed segments offline (chain + signature).
Expected evidence fields
last_clean_shutdown_flag=false (for yank cases)
unexpected_reset_counter++
incomplete_record_marker present (B/C)
journal_replay_log indicates replay/repair (if enabled)
commit_pointer / hash_head consistent after recovery
sig_verify=PASS for last committed segment
MPN examples (validation fixtures / power path): TPS2663 (eFuse / hot-swap), TPS25982 (eFuse), TPS3808 (supervisor), LTC4365 (surge stopper), LTC3350 (supercap backup controller), LTC4040 (backup power manager).
Time rollback test (→ H2-7 / H2-6)
Test actions
1) Capture stable logs under a disciplined time source.
2) Force wall-clock steps (backward and forward).
3) Continue logging across multiple commit segments.
4) Compare ordering by mono_ctr vs wall time; validate step traceability.
Expected evidence fields
mono_ctr strictly increasing (no repeats/backward)
clock_step_log contains direction + delta + reason/source
last_sync_ts updates across discipline events
rtc_offset changes remain explainable / bounded
MPN examples (RTC / time base): DS3231M (temperature-compensated RTC), RV-3032-C7 (ultra-low-power RTC), Abracon ABS07 / Epson FC-135 (typical 32.768 kHz crystal families for RTC domains).
Corrupted block injection (→ H2-5 / H2-8)
Test actions
1) Select one committed segment and its metadata region.
2) Inject corruption (bit flip, unreadable block, index damage).
3) Reboot and force recovery path (scan + journal replay).
4) Run offline verification (hash chain + signature).
Expected evidence fields
ecc_err_cnt increases (or read-fail counters trigger)
bad_block_tbl updated (if remap occurs)
journal_replay_log records repair/rollback steps
sig_verify=FAIL (tamper/corrupt) for impacted segment, with reason
chain discontinuity detectable via hash_head/prev_hash
MPN examples (flash targets for injection): W25N01GW (SPI NAND family), MT29F / MT29F1G (raw NAND families), industrial microSD examples often used in logging validation: Swissbit S-45u / S-55u series (family naming), Kingston Industrial microSD families.
Wear endurance test (→ H2-5 / H2-6)
Test actions
1) Run two write profiles: (A) small-record high-frequency, (B) large-segment low-frequency.
2) Execute accelerated write cycles to a defined total written budget.
3) Periodically sample error counters and remap tables.
4) Randomly verify segments (signature + chain) throughout the run.
Expected evidence fields
ecc_err_cnt trend observable but bounded
bad_block_tbl growth rate explainable
journal_replay_log appears only on defined fault triggers
sig_verify remains PASS for all committed segments (no silent degradation)
MPN examples (endurance-oriented storage options): eMMC families such as Micron eMMC (industrial grades) or Kioxia eMMC (industrial grades), SPI-NAND options like W25N series; endurance testing should use the exact SD/eMMC/NAND candidates planned for deployment.
Tamper verification test (→ H2-8 / H2-7 / H2-5)
Test actions
1) Modify payload inside a committed segment (content tamper).
2) Delete one middle segment (history gap).
3) Insert a fabricated segment (history splice).
4) Replay an older valid log set (rollback / replay).
5) Run verifier: chain check + signature check + anti-rollback gate.
Expected evidence fields
sig_verify=FAIL for modified segments (reason recorded)
hash_head/prev_hash mismatch pinpoints deletion/insertion
rollback_ctr rejects older histories (monotonic requirement)
fw_ver_hash mismatch rejects wrong firmware lineage
MPN examples (root-of-trust / key protection): ATECC608B (secure element), NXP SE050 (secure element family), Infineon OPTIGA™ Trust M (secure element family). These enable non-extractable keys and on-chip signing for verifiable evidence.
H2-12. FAQs (Evidence-Driven)
Each answer follows a strict evidence pattern: 1 conclusion, 2 evidence checks, and 1 first fix—then maps back to the chapters that define the rules.
- Records out of order — timestamp or buffer commit issue? → H2-4, H2-7
- SD card corrupt after outage — missing hold-up or no journaling? → H2-5, H2-6
- Logs look intact but fail audit — missing signatures? → H2-8
- Time jumped backward — NTP step or RTC reset? → H2-7
- After firmware update logs unverifiable — key rotation issue? → H2-8
- Random gaps under high load — ingress overrun? → H2-3
- Duplicate entries after reconnect — retry logic missing dedupe? → H2-9
- Frequent unexpected-reset counter increments — brownout threshold wrong? → H2-6
- Audit asks for “proof of non-tampering” — what fields matter? → H2-8
- Storage wears out early — write amplification? → H2-5
- RTC drifts too much — no discipline strategy? → H2-7
- CRC passes but hash fails — partial rewrite? → H2-5, H2-8