Flight Data/Voice Recorder (FDR/CVR) Design Guide
Flight Data/Voice Recorders are designed to keep evidence trustworthy when power is interrupted: they freeze inputs, converge to a last-consistent point (LCP), and verify integrity so exported data matches what was recorded. The practical goal is not “fast storage,” but provable pre/post event capture with measurable health trends and repeatable validation of power-fail, trigger logic, and readback integrity.
H2-1 · What FDR/CVR is — scope, boundaries, and “what must never be lost”
A recorder becomes valuable only when its outputs remain trustworthy under the exact conditions that break ordinary logging: brownouts, abrupt power removal, internal resets, or media wear. This page focuses on the recorder’s internal chain (buffer → storage controller → NVMe/UFS → NAND → integrity → power-fail commit), and avoids deep dives into aircraft bus protocols, aircraft-wide power compliance, or full anti-tamper architectures.
| Aspect | FDR (Flight Data Recorder) | CVR (Cockpit Voice Recorder) |
|---|---|---|
| Data shape | Multi-channel parameter/event streams; bursts during abnormal events. | Continuous audio stream; continuity and gap detection are critical. |
| Write pattern | Segmented records + indexes; event windows must align to time. | Steady, always-on writes; short gaps are highly visible and unacceptable. |
| Typical failure symptom | Missing segments, broken time correlation, or “has data but cannot reconstruct timeline.” | Dropouts/short voids, partial overwrite, or audio present but index/manifest inconsistent. |
What must never be lost (engineering definition)
- Continuity: no silent gaps; segment order and sample counts stay consistent across resets/power events.
- Time correlation: records can be reassembled into a monotonic timeline; event windows map to the correct segments.
- No silent corruption: damaged content is detectable (CRC/ECC/hash layers), not “quietly wrong.”
- Recoverable readout: crash readout/offload produces verifiable output (manifest + checks) even after abrupt shutdown.
Scope boundary: focus on recorder-internal reliability (write path, power-fail closure, integrity checks, readout proof). Do not expand into avionics network protocols, aircraft 28V front-end compliance, or full crypto/anti-tamper system design.
H2-2 · System architecture — from acquisition to crash-survivable memory
At a system level, the recorder sits between acquisition sources (flight-data aggregation and cockpit audio acquisition) and two consumers: maintenance offload and crash survivable readout. The design goal is not “maximum throughput,” but predictable persistence: the ability to state, test, and prove where “data is safely on media” under worst-case bursts and abrupt shutdown.
End-to-end path (protocol-agnostic)
Acquisition (flight parameters / audio) → Recorder ingress → Input buffer → Segment builder & metadata → NVMe/UFS storage stack → NAND/CSMU → Readout (maintenance offload or crash readout).
Each internal module should be described by its responsibility and its failure signature, so diagnosis stays inside the recorder boundary:
- Ingress: frames and paces sources; uncontrolled pacing creates burst-driven buffer overrun or timing skew.
- Input buffer: absorbs bursts; weak buffering shows up as missing segments, dropouts, or window discontinuities.
- Segment builder & metadata: converts streams into segments + index/manifest; a fragile index can make valid media content unreconstructable.
- Storage stack (NVMe/UFS): defines what “write complete” means; confusion here causes “acknowledged but not durable” records.
- NAND/FTL: wear, bad blocks, and write amplification; symptoms appear as rising ECC corrections, retries, or variable latency.
- Integrity checks: detects corruption at multiple layers; missing layers enable silent wrong data.
- Power-fail closure: freezes ingress, drains critical queues, and commits a last-consistent point; without this, recovery becomes guesswork.
The three choke points (what they decide)
- Choke #1 — Input buffer: determines whether bursts turn into gaps (FDR segments) or audible dropouts (CVR).
- Choke #2 — FTL/commit point: determines whether acknowledgements map to a durable, reconstructable state on NAND.
- Choke #3 — Power-fail closure: determines whether crash recovery can rebuild a monotonic timeline and verify integrity.
This architecture view sets up later deep dives: NVMe vs UFS responsibility split (commit semantics), power-fail state machine (freeze/drain/commit), and integrity layering (CRC/ECC/hash + manifest).
H2-3 · Recording requirements that drive the design — bandwidth, retention, and worst-case bursts
A recorder rarely fails because its interface is “too slow” on average. It fails when short abnormal windows create bursts that overflow buffers, delay commits, or collide with flash management behavior. For CVR, continuous audio writes amplify this risk because flash housekeeping can create latency spikes and increase write amplification (WA). The goal is to turn “bandwidth and capacity” into a verifiable write budget that holds under stress.
Three numbers that must be defined (and later proven)
| Budget | Definition (what it means) | How to estimate | How to prove |
|---|---|---|---|
| Average write | Long-term sustained write rate across normal operation (steady-state logging). | Measure per stream and sum; include segment/manifest overhead. | Soak test at nominal loads; verify zero gaps and stable latency distribution. |
| Worst-case burst | Maximum data generated in a defined window during abnormal events (e.g., N seconds around triggers). | Use event scenarios to define bytes_per_window; include audio + parameter spikes. | Event replay test: burst windows repeated; verify no buffer overrun and commits meet deadlines. |
| Lifetime write budget | Total bytes written to flash over service life, including WA, retries, bad-block growth, and metadata journals. | Convert to “equivalent TBW” using WA factor and expected duty cycle. | Accelerated wear + periodic readback; track ECC corrections, retries, and bad blocks vs thresholds. |
Two practical consequences follow from these budgets:
- FDR burst protection: define the event window first (time-bounded burst), then size buffers and commit time so the window is never fragmented.
- CVR continuity protection: treat latency spikes as a first-class requirement (not a corner case); continuous writes + flash GC can cause dropouts unless buffered and committed deterministically.
Common requirement mistakes that later become “missing data”
- Using average bitrate as a sizing target: event windows, not averages, determine continuity under stress.
- Equating interface throughput with durable logging: “write complete” is not the same as “durably committed and reconstructable.”
- Ignoring WA and flash housekeeping: WA multiplies the true NAND write volume and changes lifetime and latency behavior.
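The budget arithmetic above can be sketched in a few lines. This is a minimal illustration, not a sizing tool: the `WriteBudget` fields, the assumption that the media sustains a fixed guaranteed drain rate during the burst window, and the single scalar WA factor are all simplifications introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class WriteBudget:
    # All field names are hypothetical; values come from project measurements.
    avg_bytes_per_s: float              # sum of all streams, incl. segment/manifest overhead
    burst_bytes: float                  # bytes generated inside one worst-case event window
    burst_window_s: float               # length of that window in seconds
    sustained_drain_bytes_per_s: float  # write rate the media still guarantees under GC stalls
    write_amplification: float          # assumed WA factor (>= 1.0)
    service_life_s: float               # total recording time over service life


def min_buffer_bytes(b: WriteBudget) -> float:
    """Bytes the input buffer must absorb so one burst window does not overflow."""
    drained = b.sustained_drain_bytes_per_s * b.burst_window_s
    return max(0.0, b.burst_bytes - drained)


def lifetime_nand_bytes(b: WriteBudget) -> float:
    """Host bytes over service life multiplied by WA (the 'equivalent TBW' input)."""
    return b.avg_bytes_per_s * b.service_life_s * b.write_amplification


if __name__ == "__main__":
    budget = WriteBudget(
        avg_bytes_per_s=200_000,
        burst_bytes=40_000_000,
        burst_window_s=10,
        sustained_drain_bytes_per_s=1_000_000,
        write_amplification=3,
        service_life_s=90_000_000,
    )
    print("buffer bytes:", min_buffer_bytes(budget))
    print("lifetime NAND bytes:", lifetime_nand_bytes(budget))
```

With these made-up inputs the buffer must hold 30 MB and the lifetime NAND volume is 54 TB, three times the host-side volume. The point of the sketch is that the burst window and the WA factor, not the average bitrate, dominate both results.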
H2-4 · NVMe vs UFS for recorders — what matters in power-fail and integrity
Both NVMe and UFS can support high write rates, but recorders care about commit semantics and recovery determinism. The most important questions are: (1) can the recorder force the storage stack to converge to a last-consistent point during a power-fail sequence, and (2) can metadata protection be tested and proven so corruption is detected rather than silent.
| Recorder concern | NVMe-style stack (host-led closure) | UFS-style stack (device-managed closure) |
|---|---|---|
| Power-fail behavior | Durability depends on host policy: freeze new writes, drain critical queues, and explicitly close segments/manifests at a defined commit point. | Durability depends on device caching and internal scheduling; host still needs segment closure but may rely more on device-managed persistence behavior. |
| Recovery determinism | Strong when the host defines and tests a “last-consistent point” contract (commit marker + verified manifest). | Strong when device recovery behavior is stable and testable across power-cut patterns; requires validation of rebuild outcomes. |
| Metadata protection | Host typically owns segment/manifest integrity and journaling strategy; easier to reason about if implemented explicitly. | Device may provide more built-in management; host still must ensure recorder-level manifests remain reconstructable and verifiable. |
| Implementation complexity | Higher host responsibility for closure timing and “acknowledged vs durable” mapping. | Potentially simpler host closure flow, but requires careful characterization of device caching/recovery behavior. |
| Testability | Excellent when commit semantics and closure steps are instrumented (events + counters + post-cut verification). | Excellent when power-cut matrices reproduce the same rebuild result and integrity proofs across units and temperatures. |
Two acceptance questions (make them test gates)
- Can power-fail force convergence? A randomized power-cut matrix should always recover a monotonic timeline and pass manifest verification.
- Is metadata protection provable? Segment/index/manifest corruption must be detectable (fail-closed), not silently misinterpreted as valid.
Later chapters should convert these gates into a concrete procedure: freeze ingress → drain queues → commit marker → verify manifest on next boot, then sample readback to confirm integrity across the reconstructed timeline.
H2-5 · Power-fail write — detection, hold-up, and “last-consistent-point” design
A power-fail event should not be treated as “try to write as much as possible.” The correct objective is convergence: freeze ingress, drain what can be made durable, commit a final marker that proves the last-consistent point (LCP), and then power down. A recorder that cannot prove its LCP risks silent gaps, broken segment order, or a manifest that points to data that was never fully committed.
Power-fail 5-step state machine (acceptance-friendly)
| Step | Action | What it protects | Failure signature if missing |
|---|---|---|---|
| 1) Detect | Early warning from V-rail drop, PG change, or UV interrupt. | Creates a bounded time window for closure. | Commit begins too late; recovery becomes non-deterministic. |
| 2) Freeze | Stop new ingestion or switch to read-only buffer mode. | Caps queue growth; stabilizes what must be closed. | Queues keep growing; drain cannot catch up. |
| 3) Drain | Drain write queues; prioritize segment tails and metadata. | Moves data to a reconstructable boundary. | Data exists but timeline/segments become unrebuildable. |
| 4) Commit | Write journal/commit marker (epoch) that defines the LCP. | Proves the last consistent state on media. | Half-updated manifest; “acknowledged but not durable.” |
| 5) Power-off | Shut down after the marker is durable (verified). | Ensures deterministic rebuild on next boot. | Random partial writes; inconsistent metadata versions. |
The “last-consistent point” is best implemented as a small, verifiable artifact on media: a commit marker (often tied to an epoch or monotonically increasing sequence) that is written only after the recorder has closed segment boundaries and updated the manifest/journal. If the marker is missing or invalid after restart, the recorder must fail closed (reject) and fall back to the previous valid epoch rather than guessing.
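One way to picture the commit marker and the fail-closed fallback is the sketch below. The marker layout (a `LCP1` tag, a 64-bit epoch, a CRC32 of the manifest, and a CRC32 over the marker itself) is an assumption made up for this example, not a format taken from any recorder or storage standard; a real design would likely use a stronger hash and redundant marker copies.

```python
import struct
import zlib
from typing import Iterable, Optional, Tuple

MAGIC = b"LCP1"                     # hypothetical marker tag
_FMT = "<4sQI"                      # magic, epoch, CRC32 of the manifest
_SIZE = struct.calcsize(_FMT) + 4   # plus a CRC32 over the marker body


def encode_marker(epoch: int, manifest: bytes) -> bytes:
    """Build the marker; written only after segments and manifest are closed."""
    body = struct.pack(_FMT, MAGIC, epoch, zlib.crc32(manifest))
    return body + struct.pack("<I", zlib.crc32(body))


def decode_marker(raw: bytes) -> Optional[Tuple[int, int]]:
    """Return (epoch, manifest_crc), or None if the marker is torn or corrupt."""
    if len(raw) != _SIZE:
        return None
    body = raw[:-4]
    (crc,) = struct.unpack("<I", raw[-4:])
    if zlib.crc32(body) != crc:
        return None
    magic, epoch, manifest_crc = struct.unpack(_FMT, body)
    if magic != MAGIC:
        return None
    return epoch, manifest_crc


def last_consistent_epoch(slots: Iterable[Tuple[bytes, bytes]]) -> Optional[int]:
    """slots holds (marker_bytes, manifest_bytes) pairs read back after restart.

    Returns the newest epoch whose marker and manifest both verify. Invalid
    slots are rejected rather than repaired, so the result falls back to the
    previous valid epoch; None means no provable state exists.
    """
    best = None
    for raw, manifest in slots:
        decoded = decode_marker(raw)
        if decoded is None:
            continue
        epoch, manifest_crc = decoded
        if zlib.crc32(manifest) != manifest_crc:
            continue
        if best is None or epoch > best:
            best = epoch
    return best
```

If the marker for epoch 8 was torn by the power cut, `last_consistent_epoch` skips it and returns epoch 7, which is the "fall back rather than guess" behavior described above.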
Design criteria (what must be budgeted and later proven)
- Early-warning margin: time from detection to power collapse must exceed worst-case closure time under load.
- Freeze latency bound: ingress must be frozen within a fixed upper limit after early warning.
- Closure time bound: freeze → drain → commit must complete before hold-up expires.
- Worst-case WA impact: closure must succeed even when flash WA/GC increases effective write volume and latency.
- Hold-up scoped to closure: hold-up energy targets “commit completion,” not extended recording duration.
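The timing criteria above reduce to one inequality: early-warning lead time must exceed freeze latency plus worst-case drain and commit time. The sketch below computes that margin under a deliberately crude model, where WA simply multiplies the bytes that must reach NAND and the media offers one worst-case write rate. All parameter names and numbers are hypothetical.

```python
def closure_margin_us(
    early_warning_us: float,
    freeze_us: float,
    drain_bytes: int,
    commit_bytes: int,
    worst_case_write_bytes_per_s: float,
    wa_factor: float,
) -> float:
    """Remaining time (microseconds) after freeze -> drain -> commit.

    A negative result means closure cannot finish before power collapses.
    Assumption: WA scales the effective write volume linearly.
    """
    effective_bytes = (drain_bytes + commit_bytes) * wa_factor
    write_us = effective_bytes / worst_case_write_bytes_per_s * 1_000_000
    return early_warning_us - (freeze_us + write_us)


if __name__ == "__main__":
    margin = closure_margin_us(
        early_warning_us=50_000,      # detection to rail collapse
        freeze_us=500,                # ingress freeze latency bound
        drain_bytes=256 * 1024,       # segment tails still queued
        commit_bytes=4096,            # journal + commit marker
        worst_case_write_bytes_per_s=20_000_000,
        wa_factor=2.0,
    )
    print("PASS" if margin > 0 else "FAIL", f"margin_us={margin:.0f}")
```

The same function can be evaluated once per row of a power-fail test matrix, using measured values in place of these placeholders.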
Verification hook (for later validation chapters): run a randomized power-cut matrix across normal and burst workloads and confirm every reboot can rebuild a monotonic timeline up to the latest valid commit marker.
H2-6 · Data integrity pipeline — CRC, ECC, journaling, and readback proof
A recorder’s integrity pipeline should be layered so each mechanism covers a different failure mode. Packet-level checks catch corruption introduced in transport or buffering, storage-level ECC handles media bit errors, and segment-level hashes protect reconstruction correctness across segments and manifests. Metadata consistency is maintained with journaling or double-write strategies so “data is valid but directory is broken” (or the reverse) cannot occur silently.
Integrity layers (what each layer proves)
| Layer | Protects | Detects / corrects | What to log |
|---|---|---|---|
| Transport CRC | Packets/frames in ingress, buffering, and offload path. | Corruption introduced before storage (DMA/buffers/link). | CRC fail count, source channel, timestamp window. |
| Storage ECC | Flash pages/blocks inside NAND + FTL mapping. | Bit errors; correctable vs uncorrectable events. | ECC corrected bits, uncorrectable-error count, retry counts, bad blocks. |
| Segment hash | Reconstructed segments and their ordering. | Wrong segment content, wrong assembly, stale pointers. | Hash mismatch rate, segment IDs affected, epoch ID. |
| Journal / double-write | Manifest/index updates and epoch/commit markers. | Half updates; directory/data mismatch after power cuts. | Journal replay count, last valid epoch, rollback events. |
Metadata should be treated as safety-critical because it defines reconstruction. Journaling (or a two-copy scheme with versioning) should ensure that after any reset or power cut, the recorder selects the latest valid metadata set using a simple rule: choose the newest version that passes integrity checks. If no valid set exists, the system must fail closed and report a fault rather than producing plausible but incorrect readout.
Readback proof (maintenance-side verification steps)
- 1) Verify reconstruction basis: load manifest/index for the latest valid epoch; confirm monotonic segment order.
- 2) Sample readback: read selected windows (recent + historical) and verify segment hashes against the manifest.
- 3) Check layer counters: review CRC failures, ECC corrections, retries, and bad-block trends for degradation signals.
- 4) Produce a health verdict: pass/fail plus trend flags (rising ECC, increasing retries, frequent journal replays).
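Steps 1, 2, and 4 of the readback proof can be sketched as follows. This assumes a manifest that stores one SHA-256 digest per segment and a caller-supplied `read_segment` function; both are illustrative choices, since the text only requires "segment hashes" without naming an algorithm.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class ManifestEntry:
    segment_id: int
    sha256_hex: str


def verify_readback(
    manifest: List[ManifestEntry],
    read_segment: Callable[[int], bytes],
    sample_ids: Iterable[int],
) -> Dict[str, object]:
    """Check segment order, then hash the sampled segments against the manifest."""
    ids = [e.segment_id for e in manifest]
    ordered = all(a < b for a, b in zip(ids, ids[1:]))  # strictly increasing IDs
    by_id = {e.segment_id: e for e in manifest}
    mismatches = []
    for sid in sample_ids:
        entry = by_id.get(sid)
        # A sampled ID missing from the manifest counts as a mismatch.
        if entry is None or hashlib.sha256(read_segment(sid)).hexdigest() != entry.sha256_hex:
            mismatches.append(sid)
    return {
        "ordered": ordered,
        "mismatches": mismatches,
        "passed": ordered and not mismatches,
    }
```

The layer counters from step 3 (CRC failures, ECC corrections, retries) would be read separately and merged into the verdict as trend flags.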
H2-7 · Event triggers — acceleration triggers, continuous ring buffer, and pre/post windows
A recorder does not capture meaningful pre-crash context simply by “having an accelerometer trigger.” The practical guarantee comes from how the ring buffer is segmented, how often segments are committed to a consistent point, and how the trigger locks a window without fragmenting it. The pre-window must be more than “still in RAM” — it must remain reconstructable after power loss, with a manifest that can prove window completeness.
Trigger criteria checklist (concept-level but testable)
| Criterion | What it means | Why it matters |
|---|---|---|
| Threshold | Acceleration magnitude exceeds a configured level. | Defines sensitivity; too low increases false triggers. |
| Duration | Time-over-threshold must persist for a minimum window. | Rejects short spikes and vibration bursts. |
| Multi-axis / composite | Combine axes or use a composite rule for impact patterns. | Improves robustness across orientation and mounting. |
| Voting | M-of-N conditions (for example, 2-of-3 axes) must agree before triggering. | Balances false-trigger reduction vs missed triggers. |
| Debounce / re-arm | Trigger is latched and re-armed only after cooldown. | Prevents event “chatter” and window fragmentation. |
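The criteria in the table can be combined into a small per-sample detector. The sketch below is one possible arrangement, assuming per-axis thresholding, a vote across axes, a duration counted in consecutive samples, and a cooldown counted in samples. The class and field names are hypothetical, and the `near_hits` counter mirrors the "threshold met but duration not satisfied" diagnostic.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TriggerConfig:
    threshold_g: float      # per-axis magnitude threshold
    min_samples_over: int   # duration criterion, in consecutive samples
    votes_required: int     # axes that must exceed the threshold together
    cooldown_samples: int   # re-arm delay after a trigger


class ImpactTrigger:
    def __init__(self, cfg: TriggerConfig) -> None:
        self.cfg = cfg
        self._over = 0       # consecutive samples satisfying the vote
        self._cooldown = 0   # samples left before re-arming
        self.near_hits = 0   # runs that met the threshold but not the duration

    def update(self, axes_g: Sequence[float]) -> bool:
        """Feed one multi-axis sample; return True when the trigger latches."""
        if self._cooldown > 0:
            self._cooldown -= 1   # latched: ignore input until re-armed
            return False
        votes = sum(1 for a in axes_g if abs(a) >= self.cfg.threshold_g)
        if votes >= self.cfg.votes_required:
            self._over += 1
        else:
            if self._over > 0:
                self.near_hits += 1
            self._over = 0
        if self._over >= self.cfg.min_samples_over:
            self._over = 0
            self._cooldown = self.cfg.cooldown_samples
            return True
        return False
```

Counting near hits alongside actual triggers gives the statistics needed to tell an over-strict duration or vote setting from a genuinely quiet input.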
The ring buffer should be treated as a continuous, segmented timeline. Segmentation creates fast boundaries for freezing and committing: pre-window data is guaranteed only if it resides in segments that already belong to a known, valid commit epoch (LCP). When the trigger fires, the recorder latches the event and freezes a combined pre/post window, then commits the window’s manifest so readout can prove that the window is complete and in-order.
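The rule that a window is guaranteed only when every segment in it belongs to a committed epoch can be expressed as a simple check. The sketch below models the ring as a bounded deque of sequence-numbered segments; `SegmentRing` and `provable_window` are illustrative names, and locking segments against overwrite is omitted for brevity.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass
class Segment:
    seq: int
    committed: bool = False   # True once covered by a valid commit epoch


class SegmentRing:
    def __init__(self, capacity: int) -> None:
        # Oldest segments fall out automatically when the ring is full.
        self._ring: Deque[Segment] = deque(maxlen=capacity)
        self._next_seq = 0

    def append_segment(self) -> Segment:
        seg = Segment(self._next_seq)
        self._next_seq += 1
        self._ring.append(seg)
        return seg

    def commit_all(self) -> None:
        """Model a commit epoch: everything currently in the ring becomes durable."""
        for seg in self._ring:
            seg.committed = True

    def provable_window(self, trigger_seq: int, pre: int, post: int) -> Optional[List[int]]:
        """Return the window's segment IDs only if all are present and committed."""
        wanted = range(trigger_seq - pre, trigger_seq + post + 1)
        have = {s.seq: s for s in self._ring}
        if any(q not in have or not have[q].committed for q in wanted):
            return None   # window is unprovable: reject rather than export partial data
        return list(wanted)
```

A window that reaches into uncommitted or already-overwritten segments returns `None`, which is why commit cadence and ring capacity both bound the pre-window that can actually be guaranteed.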
False trigger vs missed trigger (typical symptoms)
| False trigger (too sensitive) | Missed trigger (too strict / too late) |
|---|---|
| Frequent event windows during non-accident vibration. Window content looks normal; event density is abnormally high. | Accident occurs but no event marker is present. Post-window is incomplete because freeze/commit happens too late. |
| Trigger count rises with certain operational phases. Duration/voting rarely filters events. | Trigger counters show “near hits” (threshold met) but duration/voting not satisfied. Freeze latency exceeds the usable closure margin. |
Practical linkage to power-fail closure: pre-window guarantees depend on commit cadence and segment boundaries, so the trigger pipeline must align with the LCP design.
H2-8 · Crash survivable memory unit — packaging, thermal, shock/vibration, connectors
The crash survivable memory unit (CSMU) concentrates the recorder’s most valuable asset: the final, reconstructable storage timeline. Survivability depends on structural layering and on the weakest interfaces — especially connectors and solder joints — as well as on thermal behavior under sustained write workloads. A robust design treats mechanical and thermal risks as integrity risks because degradation ultimately appears as ECC trend changes, read retries, and bad-block growth.
Risk list and countermeasures (CSMU focus)
| Risk focus | Typical symptom | Mitigation (concept-level) | Proof hook |
|---|---|---|---|
| Connector / interface | Intermittent contact, transient read errors, partial window gaps. | Locking, strain relief, reduced fretting, stable contact design. | Shock/vibration runs + readback verification pass rate. |
| Solder joints / PCB | Errors rise after thermal cycling; sporadic uncorrectables. | Mechanical reinforcement, controlled stress paths, protective coating. | Thermal cycling + sustained-write + ECC trend comparison. |
| Thermal hot spots | Frequent throttling, rising ECC corrections, higher retry counts. | Thermal path design, power limiting, “closure-first” throttling policy. | Steady-state temperature vs error counters and throughput. |
Packaging is best understood as a layered system: an outer enclosure and damping/insulation protect against impact energy, while internal stiffening supports the PCB and reduces local strain. On the thermal side, the recorder should prioritize safe closure behaviors (commit markers and manifests) over raw throughput when approaching temperature limits, because the primary goal remains “readable and provable” storage after an event.
H2-9 · Health monitoring & built-in test — proving the recorder is still trustworthy
Maintenance decisions should not rely on a single “pass/fail” light. The objective is to combine built-in tests (BIT/BIST) with media-health telemetry so the recorder can answer three practical questions: Can it record now? Is risk rising? What action is required? A good health design is measurable: every critical signal is readable, loggable, and eligible for thresholds or trends.
BIT/BIST coverage (recorder-side)
| Test type | When it runs | What it proves | Failure handling |
|---|---|---|---|
| Power-on BIT | At boot before entering normal record mode. | Core subsystems are reachable; last shutdown can be reconstructed; metadata area is readable. | Fail-closed or restricted mode. |
| Periodic BIT | On a controlled schedule during operation. | Ongoing consistency checks and lightweight readback sampling without disrupting recording. | Raise monitoring level; escalate if trending. |
| Write-path BIST | On-demand or scheduled low-impact window. | End-to-end loop: buffer → write → media → readback verification (hash/CRC check). | Freeze/export if proof fails. |
Media-health indicators should be interpreted as a combination of irreversible degradation signals and trend-based early warnings. For example, a growing bad-block count is fundamentally different from an increasing ECC correction rate: one implies structural wear, while the other may indicate the recorder is “working harder” to maintain correctness. Thermal exposure history provides context: sustained high-temperature write workloads can accelerate error growth and increase retries.
Health metrics table (readable · loggable · alarmable)
| Metric | How to interpret | Log field | Alarm rule → action |
|---|---|---|---|
| Bad block growth | Irreversible media wear indicator; growth rate matters. | BadBlocksTotal, BadBlocksDelta | Rapid growth → Degrade / Replace planning. |
| ECC corrected bits trend | Early warning; rising trend implies shrinking margin. | EccCorrectedBits, EccTrendSlope | Trend up → raise sampling + schedule service. |
| Uncorrectable events | Hard fault signal; cannot be “averaged out.” | EccUncorrectableCount, AffectedSegmentIDs | Any event → fail closed: export provable evidence, then replace. |
| Read retry rate | Operational stress; rising retries reduce timing margin. | ReadRetries, RetryRate | Rising → Degrade mode / increase verification. |
| Thermal exposure history | Explains acceleration; used for derating policy. | TimeAboveLimit, PeakTemp, ThermalCycles | Excess exposure → throttle policy + service window. |
| Journal replay / recovery counts | Frequent recovery indicates repeated abnormal closures. | ReplayCount, LastValidEpoch | High frequency → investigate power-fail closure margin. |
Maintenance actions should be explicit and conservative. A recorder that cannot prove its write-path integrity or shows uncorrectable events should not be kept in service as “probably okay.” The safest policy is fail-closed: freeze, export what is provably valid, and replace the storage module when evidence indicates crossing a risk threshold.
Maintenance actions (decision-friendly)
| Action level | Typical entry conditions | Recorder-side actions |
|---|---|---|
| Continue | Stable trends; no uncorrectables; bad-block growth flat; retries normal. | Normal verification cadence; log counters for trend tracking. |
| Degrade / Plan service | ECC corrections rising; retry rate increasing; thermal exposure elevated. | Increase readback sampling; apply write-pressure limits; schedule maintenance. |
| Replace / Remove | Any uncorrectable event; BIT fails; repeated recovery anomalies; rapid bad-block growth. | Freeze or read-only; export provable evidence; replace CSMU/media. |
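The action table maps naturally onto a small decision function. The sketch below is an assumption-laden illustration: the snapshot fields loosely follow the log fields named earlier, the `Limits` thresholds are placeholders that a project would have to define, and "repeated recovery anomalies" is left out for brevity.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONTINUE = "Continue"
    DEGRADE = "Degrade / Plan service"
    REPLACE = "Replace / Remove"


@dataclass
class HealthSnapshot:
    ecc_uncorrectable_count: int
    bit_passed: bool
    bad_blocks_delta: int        # growth since the previous snapshot
    ecc_trend_slope: float       # corrected bits per unit time, fitted elsewhere
    retry_rate_slope: float
    time_above_limit_s: float


@dataclass
class Limits:
    # Hypothetical, project-defined thresholds.
    bad_blocks_delta_max: int
    ecc_slope_max: float
    retry_slope_max: float
    thermal_exposure_max_s: float


def decide(s: HealthSnapshot, lim: Limits) -> Action:
    """Hard faults are checked first so they can never be masked by trends."""
    if (
        s.ecc_uncorrectable_count > 0
        or not s.bit_passed
        or s.bad_blocks_delta > lim.bad_blocks_delta_max
    ):
        return Action.REPLACE
    if (
        s.ecc_trend_slope > lim.ecc_slope_max
        or s.retry_rate_slope > lim.retry_slope_max
        or s.time_above_limit_s > lim.thermal_exposure_max_s
    ):
        return Action.DEGRADE
    return Action.CONTINUE
```

Ordering the checks from most to least severe keeps the policy conservative: a single uncorrectable event returns `Action.REPLACE` even when every trend looks flat.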
H2-10 · Data offload & chain of custody — export, verify, and keep evidence consistent
Offload should be treated as a controlled recorder-side procedure, not an ad-hoc copy. The goal is to export a package that is complete, in-order, and provably consistent with the recorder’s commit epoch (LCP). This is achieved by entering a read-only export mode, building a manifest that enumerates segments, generating a verification chain, and recording an offload log that aligns with the exported segment IDs and epoch marker.
Offload procedure (6 steps, recorder-side)
| Step | Recorder action | Artifact produced | Verification gate |
|---|---|---|---|
| 1 | Enter export mode (freeze / read-only). | Session ID + current epoch | No further writes allowed. |
| 2 | Select target window (event or time range). | Segment list + window bounds | Bounds land on committed epoch. |
| 3 | Build export package from segments. | Package + manifest | Manifest self-consistent (count/order/size). |
| 4 | Compute verification chain. | Hashes / summaries | Sample readback matches manifest hashes. |
| 5 | Transfer the package. | Transfer log (chunks/retries) | Completion mark + summary match. |
| 6 | Finalize and log export outcome. | Offload log + final status | Offload log aligns with segment IDs and epoch. |
“Chain of custody” at recorder level is primarily about consistency: the export package, the manifest, and the offload log must describe the same segment set and the same commit epoch. If any gate fails — wrong boundaries, missing chunks, hash mismatch, or a log that does not align — the export should be treated as invalid and re-attempted from a known-good epoch.
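The consistency requirement can be checked mechanically once the three artifacts are reduced to their segment IDs and epoch. The sketch below assumes simplified `Manifest` and `OffloadLog` records invented for this example; hash verification of segment content is assumed to happen in a separate gate.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Manifest:
    epoch: int
    segment_ids: List[int]


@dataclass
class OffloadLog:
    epoch: int
    segment_ids: List[int]
    complete: bool   # completion mark written at the end of transfer


def export_gate_failures(
    package_ids: Sequence[int], manifest: Manifest, log: OffloadLog
) -> List[str]:
    """Return every failed gate; an empty list means the export is consistent."""
    failures = []
    if list(package_ids) != manifest.segment_ids:
        failures.append("package/manifest segment mismatch")
    if log.segment_ids != manifest.segment_ids:
        failures.append("offload log/manifest segment mismatch")
    if log.epoch != manifest.epoch:
        failures.append("epoch mismatch")
    if not log.complete:
        failures.append("transfer not marked complete")
    return failures
```

Returning the full list of failures, rather than stopping at the first, gives the offload log enough detail to explain why an export was rejected before it is re-attempted.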
Common failures and fail-closed handling
| Failure | Typical symptom | Recorder-side handling |
|---|---|---|
| Interrupted transfer | Missing chunks; no completion marker; count mismatch. | Resume or re-export; accept only when summary and counts match. |
| Verification mismatch | Hash mismatch; manifest inconsistency; readback proof fails. | Fail-closed: roll back to last valid epoch; rebuild package; log fault. |
| Wrong window boundaries | Pre/post not complete; window crosses uncommitted segments. | Force selection onto committed epoch boundaries; reject “unprovable” windows. |
H2-11 · Validation & production checklist — how to prove power-fail, integrity, and trigger logic
This checklist is written to be auditable. Each item includes a test condition, an observable artifact (log/counter/report), and a pass/fail rule. The structure is layered so engineering, production, and maintenance can each run a bounded set of tests without redefining correctness.
Definition of Done (acceptance rules)
- LCP closure: every forced power cut ends at a committed epoch (commit marker present) and recovery never produces a “quiet” mismatch.
- Integrity proof: post-recovery auto-check passes (manifest + segment hashes/CRCs), and readback sampling shows stable ECC/retry trends.
- Trigger completeness: for each trigger class, pre/post windows are complete and aligned to committed boundaries (no partial/unprovable segments).
- Fail-closed handling: any uncorrectable event or verification mismatch forces export-only / service action, not continued recording.
1) Engineering qualification (R&D validation)
R&D validation must cover worst cases, not averages. The matrix below focuses on the recorder’s internal choke points: input buffering, FTL commit, and the power-fail closure window. The objective is to show timing margin (early warning + hold-up) remains sufficient under the highest write pressure and the fastest rail collapse.
Power-fail test matrix (must be enumerated)
| Dimension | Levels to cover | Evidence + pass criteria |
|---|---|---|
| Write load | Low / Mid / High sustained + Burst-event profile. | Log LCP epoch, closure time, replay count; PASS if post-recovery verification is clean. |
| Ramp slope | Slow / Medium / Fast / Very fast rail collapse (project-defined). | Measure early-warning lead time vs closure duration; PASS if margin > 0 with worst WA. |
| Temperature points | Cold / Ambient / Hot (recorder operating limits). | Compare ECC corrections and retries vs baseline; PASS if trend remains within limits. |
| Cut-point zone | Buffer stage / Pre-commit / During commit / Post-commit (randomized). | PASS if recovery lands on a committed boundary and segment manifest stays consistent. |
Suggested minimal recorder-side log keys for audit:
EarlyWarn_us, Freeze_ts, Drain_us, Commit_us, LastValidEpoch,
ReplayCount, VerifyStatus, EccCorrectedBits, EccUncorrectableCount, ReadRetries.
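Enumerating the matrix explicitly keeps coverage auditable. The sketch below builds the full cross product of the four dimensions and attaches a seeded random cut offset to each case. The level names are shorthand for the table entries, and `cut_offset` (a fraction of the chosen zone's duration) is a hypothetical way to randomize timing within a zone.

```python
import itertools
import random
from typing import Dict, List

WRITE_LOADS = ("low", "mid", "high", "burst")
RAMP_SLOPES = ("slow", "medium", "fast", "very_fast")
TEMPERATURES = ("cold", "ambient", "hot")
CUT_ZONES = ("buffer", "pre_commit", "during_commit", "post_commit")


def power_fail_matrix(seed: int = 0) -> List[Dict[str, object]]:
    """Return every load x slope x temperature x zone case in shuffled order.

    The seed makes the randomized order and cut offsets reproducible, so a
    failing case can be replayed exactly.
    """
    rng = random.Random(seed)
    cases = [
        {
            "load": load,
            "slope": slope,
            "temp": temp,
            "zone": zone,
            "cut_offset": rng.random(),  # 0.0..1.0 position inside the zone
        }
        for load, slope, temp, zone in itertools.product(
            WRITE_LOADS, RAMP_SLOPES, TEMPERATURES, CUT_ZONES
        )
    ]
    rng.shuffle(cases)
    return cases


if __name__ == "__main__":
    matrix = power_fail_matrix(seed=42)
    print(len(matrix), "cases; first:", matrix[0])
```

The full product is 4 × 4 × 3 × 4 = 192 cases. Each case would be run several times with different seeds, with the log keys above recorded per run.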
Integrity proof (post-recovery loop)
- Randomized cut: run many power cuts with randomized timing relative to commit boundaries (cover all zones).
- Auto-check on boot: rebuild or replay journal metadata, then verify manifest counts/order and segment hashes/CRCs.
- Readback sampling: verify a defined fraction of recent segments and record ECC/retry counters as a time series.
- Pass rule: 0 uncorrectable events; 0 hash/manifest mismatches; trends do not accelerate unexpectedly after stress.
Trigger logic validation (coverage + statistics)
| What to cover | How to measure | Pass rule |
|---|---|---|
| Threshold / duration | Exercise low/medium/high thresholds and short/medium/long durations under controlled inputs. | Trigger fires only in intended region; debounce behaves predictably. |
| Multi-axis voting | 1-axis vs 2-of-3 vs 3-axis combinations; verify gating and vote outcomes. | Vote logic matches spec; no inconsistent state transitions. |
| False triggers | Run background vibration/noise profiles and count triggers per time window. | False-trigger rate within limit; mitigation (debounce/vote) reduces it measurably. |
| Window completeness | Confirm pre/post segments are present and on committed epochs. | No partial/unprovable windows; exported event package matches manifest. |
2) Production / EOL screening (fast, deterministic)
Production tests should be short and strict. Instead of running the full matrix, use a focused subset that is most likely to expose marginal hold-up timing, integrity mismatch, or a broken trigger chain. Production output must include a per-unit report snapshot.
Production checklist (minimum set)
- Power-fail subset: two write loads (mid + high) × two ramp slopes (medium + fast) × one temperature point (ambient; add hot if available).
- Integrity gate: write test payload → force cut → recover → auto-check must PASS; record key counters in the EOL report.
- Trigger quick-check: trigger chain self-test or simulated injection; confirm window boundaries align to committed epochs.
- EOL artifact: serial number, firmware ID, media batch ID, and a counter snapshot (ECC/retries/bad blocks/replay count).
3) Maintenance verification (periodic proof of trust)
Maintenance is about trend and proof, not exhaustive testing. The recorder should provide a lightweight write-path proof, plus a health snapshot that clearly maps to “Continue / Degrade / Replace.” If any verification gate fails, export should be treated as invalid until re-run from a known-good epoch.
Maintenance checklist (service-friendly)
- Read health snapshot: bad blocks, ECC corrected bits trend, retries, thermal exposure history.
- Run small write-path proof: write small segment → commit → readback verify (hash/CRC).
- Decision mapping: stable trends → Continue; rising trends → Degrade/plan service; any uncorrectable or mismatch → Replace/remove.
Example validation BOM (specific part numbers)
The list below is a validation reference (examples) to anchor measurements and acceptance criteria. Final selection must match project temperature range, certification needs, and supply constraints.
| Role in validation | Example part number | Why it matters for H2-11 tests |
|---|---|---|
| Early-warning / reset supervisor | TI TPS3890 | Generates deterministic early warning and reset behavior; used to measure lead time vs closure duration. |
| Hot-swap / eFuse protection | TI TPS25982 | Enables repeatable current limiting and fault handling under high write load; helps verify protection does not corrupt closure timing. |
| Power MUX / source switchover | TI TPS2121 | Supports controlled switchover behavior; used when validating hold-up switching and minimizing rail disturbances during closure. |
| Supercap monitor/manager | TI BQ33100 | Anchors hold-up budgeting with monitored stack health (capacitance/ESR); used to prove hold-up serves closure only. |
| Ideal diode controller | ADI LTC4359 | Reduces reverse current transients during source loss; helps keep closure behavior stable across repeated cut tests. |
| Low-drift trigger accelerometer | ADI ADXL357 | Supports stable threshold/duration tests and false-trigger statistics with low drift across temperature. |
| High-g impact trigger accelerometer | ADI ADXL372 | Targets impact-like trigger profiles; used to validate high-g event capture and window completeness logic. |
| Industrial NVMe with PLP option | Swissbit N3602 (powersafe) | Provides a realistic storage target for randomized cut and recovery verification; supports end-to-end data protection features. |
H2-12 · FAQs (FDR/CVR recorder: power-fail, integrity, triggers, and evidence)
These FAQs focus on what makes flight data/voice recording provable after power loss: last-consistent-point (LCP) closure, integrity layers, trigger windows, health trends, and verifiable offload packages.