Smart Glasses Cam-Audio Module: Sensor, ISP, Codec & Storage
Core idea: A smart glasses cam-audio module is only as reliable as its shared resources: USB 5V power, clocks, memory bandwidth, and storage write determinism. Turn complaints (frame drops, pops, drift, reboots, corrupt files) into measurable evidence (two-point USB 5V waveforms plus CSI, PLL, and storage counters), isolate one variable at a time, then apply the first hardware fix the evidence points to (inrush control or rail partitioning, return-path or ESD rework, clock and codec rail cleanup, or deterministic storage).
H2-1. Definition & Boundary: what this module covers
This page treats the Cam-Audio Module as the capture core of smart glasses: it combines the image sensor interface, ISP/SoC, and storage with the mic array and audio codec, and it must keep timestamps and capture stability intact under USB 5V power disturbances.
Boundary rule (scope lock)
Only hardware evidence that changes capture fidelity is covered: link margin, buffering/latency, clock/timestamp behavior, power integrity, and storage integrity. Any topic outside the module boundary is intentionally excluded to avoid scope creep.
The five problems this page must solve (engineer-facing)
- Frame drops / stutter: burst drops, periodic drops, temperature-correlated drops, or drops triggered by hot-plug / touch.
- Pops / clicks / noise jumps: start/stop artifacts, pops during storage write, noise floor rising under load.
- A/V drift: audio gradually leads/lags video; drift changes after PLL relock, thermal rise, or resync events.
- Random reboot / reset: brownout, UVLO, or host current limit events that reset ISP/SoC or storage.
- Corrupt / truncated files: missing tail segments, CRC/ECC errors, filesystem/journal faults, retry storms.
Included vs Excluded (fast checklist)
- Included: Sensor I/F margin (CSI errors), ISP/encode buffering, storage worst-case latency, codec clocking & pop evidence, USB 5V droop/inrush, reset/UVLO counters.
- Excluded: Any end-to-end AR user experience, display optics tuning, wireless audio ecosystem design, cloud delivery, or haptics driver circuits.
H2-2. Pass/Fail metrics: turn complaints into measurable checks
Capture problems become solvable only after translating them into measurable metrics and linking each metric to a minimal evidence point (scope, counter, or log). This chapter defines the “scoreboard” used by every later section.
How this chapter is used (repeatable workflow)
- Step 1 — Name the symptom: frame drops, pops, drift, reboot, corrupt file.
- Step 2 — Pick the metric: choose a single metric that rises/falls with the symptom.
- Step 3 — Record the evidence: scope at the right point + the right counter/log in the same time window.
- Step 4 — Correlate: align timestamps across logs and waveforms (the root cause is usually the correlated one).
- Step 5 — Decide directionally: worst-case spikes matter more than averages (latency spikes, droop spikes, error bursts).
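The correlation step can be sketched in a few lines of Python. This is a minimal illustration, assuming symptom and evidence events have already been reduced to timestamps in seconds on a common timebase; the function names and the 50 ms default window are hypothetical choices, not part of any tool named on this page.

```python
from bisect import bisect_left, bisect_right


def count_coincident(symptom_ts, evidence_ts, window_s=0.05):
    """For each symptom timestamp, count evidence events within +/- window_s."""
    evidence = sorted(evidence_ts)
    hits = []
    for t in symptom_ts:
        lo = bisect_left(evidence, t - window_s)
        hi = bisect_right(evidence, t + window_s)
        hits.append(hi - lo)
    return hits


def coincidence_ratio(symptom_ts, evidence_ts, window_s=0.05):
    """Fraction of symptom events that have at least one nearby evidence event."""
    hits = count_coincident(symptom_ts, evidence_ts, window_s)
    if not hits:
        return 0.0
    return sum(1 for h in hits if h > 0) / len(hits)


# Example: three frame-drop timestamps versus three logged 5V droop timestamps.
drops = [10.00, 25.30, 40.10]
droops = [9.98, 40.12, 70.00]
ratio = coincidence_ratio(drops, droops)
```

A ratio near 1.0 for one evidence stream and near 0.0 for the others is the kind of discriminator the workflow asks for; a middling ratio usually means the window is too wide or the timebases are not aligned.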
Video metrics (frame stability and link health)
- Dropped frames (burst count)
  Symptom: visible stutter or missing frames, often clustered in bursts.
  Metric: frame counter gaps per minute; burst length distribution (not just total).
  Where to measure: encoder output timestamps, frame sequence numbers, capture pipeline counters.
- CSI error count
  Symptom: intermittent video glitches, “works at 30 fps but fails at 60 fps”.
  Metric: CRC/ECC/deskew errors per time window; error bursts vs temperature/hot-plug.
  Where to measure: CSI/PHY error counters; log of link state transitions.
- ISP overflow / backpressure
  Symptom: stable link but frames still drop when load rises (HDR/NR/encode).
  Metric: FIFO/line-buffer overflow events; backpressure rate.
  Where to measure: ISP status registers; overflow interrupt counters.
- Encode underrun
  Symptom: video drops aligned with heavy CPU/storage activity.
  Metric: encode buffer underrun / emergency rate-control events.
  Where to measure: encoder logs and timestamps aligned to storage write bursts.
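As an illustration of the "burst length distribution" metric, the sketch below derives burst lengths from a list of frame sequence numbers. It assumes the sequence numbers increase by one per frame and do not wrap within the log; `drop_bursts` and `burst_histogram` are hypothetical helper names.

```python
from collections import Counter


def drop_bursts(seq_numbers):
    """Return the number of missing frames for every gap in the sequence."""
    bursts = []
    for prev, cur in zip(seq_numbers, seq_numbers[1:]):
        missing = cur - prev - 1
        if missing > 0:
            bursts.append(missing)
    return bursts


def burst_histogram(seq_numbers):
    """Map burst length -> number of bursts of that length."""
    return dict(Counter(drop_bursts(seq_numbers)))


# Example log: frames 4-6, 9, and 12-14 never reached the encoder output.
seq = [1, 2, 3, 7, 8, 10, 11, 15]
bursts = drop_bursts(seq)
hist = burst_histogram(seq)
```

Seven dropped frames spread as seven single drops and seven dropped frames in two bursts of three plus one single drop are different failures; the histogram keeps that distinction visible where a total would hide it.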
Audio metrics (noise, stability, and clock events)
- Noise floor shift
  Symptom: background hiss increases under certain load/conditions.
  Metric: RMS of a “silent” segment; noise-floor delta between idle and load.
  Where to measure: codec digital samples; correlate with rail noise and storage activity.
- Clip headroom / clipping flags
  Symptom: harsh distortion on loud sound, especially after gain changes.
  Metric: peak-to-RMS, clipping counters, limiter engagement (if available).
  Where to measure: codec status flags + captured waveform peaks.
- PLL unlock / relock events
  Symptom: pops/clicks or short gaps, often around hot-plug or droop.
  Metric: PLL lock-loss count; relock duration distribution.
  Where to measure: codec PLL lock status logs aligned to USB 5V droop.
- Buffer underrun (audio xrun)
  Symptom: periodic ticks or gaps during heavy pipeline activity.
  Metric: DMA ring underrun count; time between underruns.
  Where to measure: audio DMA counters + CPU/ISR load markers (only as timestamps, no OS deep dive).
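The noise-floor delta can be computed directly from captured samples. The sketch below is one possible way to do it, assuming 16-bit signed PCM (full scale 32768) and that the "silent" idle and load segments have already been cut out by hand; both function names are hypothetical.

```python
import math


def rms_dbfs(samples, full_scale=32768.0):
    """RMS level of a segment in dB relative to full scale."""
    if not samples:
        raise ValueError("empty segment")
    mean_sq = sum(s * s for s in samples) / len(samples)
    if mean_sq == 0:
        return float("-inf")
    return 20.0 * math.log10(math.sqrt(mean_sq) / full_scale)


def noise_floor_delta_db(idle_samples, load_samples, full_scale=32768.0):
    """Positive result means the floor rose under load."""
    return rms_dbfs(load_samples, full_scale) - rms_dbfs(idle_samples, full_scale)


# Synthetic example: the load segment has twice the amplitude of the idle one.
idle = [10, -10] * 100
load = [20, -20] * 100
delta = noise_floor_delta_db(idle, load)  # about +6 dB
```

Comparing the same physical channel at the same gain, with only the load condition changed, keeps the delta attributable to the load rather than to acoustics.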
Sync metrics (drift and timestamp trust)
- Drift ppm (audio vs video)
  Symptom: lip-sync slowly walks off over minutes.
  Metric: delta between (audio sample count / fs) and (video frame count / fps) over long windows.
  Where to measure: mux timestamps; periodic checkpoints (e.g., every N seconds).
- Timestamp monotonicity
  Symptom: A/V suddenly jumps or repeats for a short segment.
  Metric: non-monotonic timestamp events; duplicate timestamps count.
  Where to measure: mux/file timestamp stream; resync markers.
- Resync events
  Symptom: drift “fixes itself” but creates a visible audio or video discontinuity.
  Metric: resync count + resync magnitude; correlation with PLL relock or brownout.
  Where to measure: pipeline logs aligned to power events.
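The monotonicity metric is simple enough to check offline on an exported timestamp stream. The sketch below assumes the timestamps are already in a single unit (milliseconds here) and in file order; `timestamp_anomalies` is a hypothetical name.

```python
def timestamp_anomalies(timestamps):
    """Count duplicate and backward-going timestamps in stream order."""
    duplicates = 0
    backwards = 0
    for prev, cur in zip(timestamps, timestamps[1:]):
        if cur == prev:
            duplicates += 1
        elif cur < prev:
            backwards += 1
    return {"duplicates": duplicates, "backwards": backwards}


# Example video timestamp stream (ms) with one repeat and one backward jump.
stream = [0, 33, 66, 66, 100, 90, 133]
report = timestamp_anomalies(stream)
```

Running the same check separately on the audio and video streams shows which side introduced the discontinuity before the mux combined them.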
Power metrics (USB 5V reality check)
- USB 5V droop at two points
  Symptom: errors/pops/reboots that coincide with plug-in or load bursts.
  Metric: minimum voltage and droop duration at (A) connector and (B) PMIC input.
  Where to measure: scope probes at both points in the same event window.
- Inrush peak
  Symptom: device resets on hot-plug; host limits current; unstable startup.
  Metric: peak current/droop during the first milliseconds; repeatability across cables/hosts.
  Where to measure: hot-plug waveform capture + event counter.
- Reset/UVLO counters
  Symptom: “random” reboot that is actually deterministic under droop.
  Metric: reset reason histogram; UVLO trip count; brownout flags.
  Where to measure: reset reason register + persistent log counter.
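Minimum voltage and droop duration can be extracted from an exported scope capture. The sketch below assumes the waveform is available as parallel lists of time (seconds) and voltage samples; the 4.5 V threshold in the example is a placeholder, not a value taken from any PMIC datasheet, and the function names are hypothetical.

```python
def droop_stats(times_s, volts, threshold_v):
    """Return (minimum voltage, longest continuous time below threshold).

    Duration is measured between the first and last below-threshold samples
    of each excursion, so its resolution is one sample interval.
    """
    if len(times_s) != len(volts) or not volts:
        raise ValueError("need equal-length, non-empty sample lists")
    longest = 0.0
    start = None
    for t, v in zip(times_s, volts):
        if v < threshold_v:
            if start is None:
                start = t
            longest = max(longest, t - start)
        else:
            start = None
    return min(volts), longest


def path_drop_v(connector_min_v, pmic_min_v):
    """Voltage lost between the connector and the PMIC input at the minimum."""
    return connector_min_v - pmic_min_v


# Example: PMIC-input capture sampled every 1 ms during a load burst.
t = [0.000, 0.001, 0.002, 0.003, 0.004, 0.005]
v_pmic = [5.0, 4.4, 4.2, 4.3, 4.9, 5.0]
v_min, below_s = droop_stats(t, v_pmic, threshold_v=4.5)
```

Running `droop_stats` on both probe points from the same event gives the two minima for `path_drop_v`. As a rule of thumb, a large difference would suggest loss between the connector and the PMIC input, while both points dipping together would point upstream to the cable or host.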
Storage metrics (determinism and integrity)
- Worst-case write latency
  Symptom: frame drops or audio ticks during writes even if “average speed” is fine.
  Metric: latency histogram; 99.9th percentile; stall duration distribution.
  Where to measure: write timestamp logs around capture bursts.
- ECC/CRC/fs errors
  Symptom: corrupted or truncated files; missing tail; retry storms in field conditions.
  Metric: ECC correction count, CRC errors, fs/journal errors, retries per minute.
  Where to measure: storage driver logs + file integrity checks aligned to power dips/ESD events.
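To show why the tail matters more than the average, the sketch below summarizes a list of per-write latencies. It uses the nearest-rank percentile definition (other definitions interpolate and give slightly different numbers); the 100 ms stall threshold and the function names are illustrative assumptions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list (pct in 0..100)."""
    if not values:
        raise ValueError("empty list")
    ordered = sorted(values)
    # round() guards against float noise such as 999.0000000000001
    rank = max(1, math.ceil(round(pct * len(ordered) / 100.0, 9)))
    return ordered[rank - 1]


def latency_summary(latencies_ms, stall_ms=100.0):
    """Mean, tail percentiles, maximum, and count of stalls >= stall_ms."""
    return {
        "mean": sum(latencies_ms) / len(latencies_ms),
        "p99": percentile(latencies_ms, 99.0),
        "p99_9": percentile(latencies_ms, 99.9),
        "max": max(latencies_ms),
        "stalls": sum(1 for x in latencies_ms if x >= stall_ms),
    }


# Example: 998 fast writes and two long stalls.
log = [2.0] * 998 + [250.0, 400.0]
summary = latency_summary(log)
```

In this example the mean stays under 3 ms and even P99 looks clean, while the 99.9th percentile and the maximum expose the two stalls that would starve an upstream buffer.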
H2-3. System architecture: two pipelines sharing resources
The cam-audio module behaves like two real-time pipelines (video + audio) that depend on the same shared resources: clocks, power rails, memory bandwidth, and thermal headroom. Many “single-domain” symptoms are actually cross-domain coupling.
Pipeline view (what must be kept deterministic)
- Video pipeline: image sensor produces a fixed-rate frame stream; CSI-2 link integrity and ISP/encoder buffering decide whether frames are delivered or dropped.
- Audio pipeline: mic-array and codec produce a continuous sample stream; codec clock events (PLL unlock/relock) and DMA ring health decide whether audio stays gap-free and pop-free.
- Mux/file boundary: both streams are written into a file/container; any timestamp discontinuity or burst stall can appear as drift, jumps, or missing segments.
Shared resources (the common root-cause pool)
- Clocks: reference stability and PLL lock behavior; relock events can cause pops, short discontinuities, or resync markers.
- Power rails: USB 5V droop and rail partitioning; one droop can simultaneously trigger CSI errors, codec clock events, and storage retries.
- Memory bandwidth: ISP/encode + audio DMA + storage writes compete; worst-case latency spikes matter more than average throughput.
- Thermal headroom: temperature can reduce link margin, increase leakage, and force throttling; thermal-driven transitions often look “random” without correlation.
Top coupling paths (symptom → evidence → likely domain)
H2-4. Image sensor interface (CSI-2) margin: what breaks first and why
CSI-2 failures in smart glasses are rarely “random”. They usually follow a small set of margin killers that become visible as error bursts, temperature correlation, or touch/hot-plug triggers. This section focuses on module-level evidence, not generic protocol theory.
Module constraints that reduce CSI margin
- Flex + connector realities: bend radius, connector transitions, and ground discontinuities often dominate over ideal routing.
- Lane rate pressure: moving from 30 fps to 60 fps raises lane rate and shrinks eye margin; “works at low fps” is a key signature.
- Return-path continuity: breaks in reference ground or shield gaps inject common-mode noise and amplify sensitivity to disturbances.
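The lane-rate pressure can be estimated with simple arithmetic. The sketch below is a rough approximation only: it assumes the payload is spread evenly across lanes and folds blanking and protocol overhead into a single assumed factor (20% here, a placeholder). The real lane rate is set by the sensor's PLL configuration, and `csi_lane_rate_mbps` is a hypothetical name.

```python
def csi_lane_rate_mbps(width, height, bits_per_pixel, fps, lanes, overhead=0.2):
    """Approximate per-lane bit rate in Mbit/s for a given video mode."""
    payload_bps = width * height * bits_per_pixel * fps
    return payload_bps * (1.0 + overhead) / lanes / 1e6


# Example: 1920x1080 RAW10 over 2 lanes at 30 fps and at 60 fps.
rate_30 = csi_lane_rate_mbps(1920, 1080, 10, 30, lanes=2)
rate_60 = csi_lane_rate_mbps(1920, 1080, 10, 60, lanes=2)
```

With everything else fixed, doubling the frame rate doubles the estimated per-lane rate, which is why a link that is marginal at 60 fps can look healthy at 30 fps.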
Failure signatures (what is observable)
- CRC/ECC bursts aligned to frame drops or brief link retraining.
- Deskew fail / retrain loops when lane rate is high or when conditions change (touch/plug/temperature).
- Temperature-correlated errors: error rate increases after warm-up; “cold boot OK, hot fails”.
- Touch-triggered errors: touching frame/hinge/USB cable triggers errors within a repeatable window.
- Hot-plug-triggered errors: plugging USB or starting a heavy write burst increases CSI errors.
Likely root causes (mapped to engineering levers)
- Return-path discontinuity: shield/ground breaks, connector ground bounce, or flex crossing splits.
- Impedance discontinuity: connector/transition steps, sharp bends, via stubs (module-level transitions matter most).
- Rail noise / common-mode injection: switching noise or USB transient lifts common-mode, reducing receiver margin.
- ESD loading or leakage: TVS capacitance or leakage shifts with temperature; can create “warm-up failures”.
- Reference clock jitter: ref clock or PLL supply noise turns into timing uncertainty under load.
Fast validation (minimal actions → discriminator)
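- Reduce fps/resolution: if CSI error counts collapse, suspect link margin (impedance/jitter/return path); if the CSI counters were already clean and only ISP overflow or encoder underrun events collapse, the limit is ISP load rather than the link.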
- Stabilize USB 5V (short cable / stronger source): if errors collapse, suspect rail noise or droop-driven common-mode injection.
- Disable or throttle storage writes: if errors collapse, suspect bandwidth/ground noise coupling during write bursts.
- Temperature step test: if errors track temperature strongly, suspect ESD leakage, mechanical stress, or thermal drift of margin.
- Touch/hot-plug repeatability: if highly repeatable, prioritize return-path/ESD path investigation.
H2-5. ISP/SoC load + memory bandwidth + latency spikes (why stutter happens)
Stutter, pops, drift, and “random” resets often share a single mechanism: latency spikes that propagate through buffers while video + audio + storage compete for the same memory bandwidth. Average throughput can look fine while worst-case stalls destroy real-time capture.
Where buffers exist (and what fails first)
- Sensor / CSI receive FIFO: sensitive to short disturbances; error bursts and brief retraining show up as drops.
- ISP line/tile buffers: sensitive to compute load; overflow/backpressure grows when processing cannot keep pace.
- Encoder buffer (VBV / output queue): sensitive to downstream stalls; underrun indicates backpressure from storage/bus contention.
- Audio DMA ring buffer: sensitive to bus contention and service latency; underrun (xrun) is often the earliest warning sign.
- Mux / file write queue: sensitive to storage determinism; long-tail stalls propagate upstream as drops/pops.
Latency spike sources (the usual suspects)
- Storage GC / wear leveling: long-tail write stalls (rare but large) that align with drop bursts.
- fsync / journal commits: periodic stalls that look “every N seconds” under certain write patterns.
- Thermal throttling: sustained slowdown after warm-up; latency distribution shifts upward, not just a single spike.
- Power droop → retry/reset: droop causes retries, ECC bursts, or brief resets that create compound stalls.
- Bus contention: encode + audio DMA + storage bursts collide, creating frequent medium spikes that starve the audio ring.
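Whether a given stall turns into a drop or an xrun depends on how much time each buffer can cover. The sketch below is a back-of-envelope estimate, assuming a constant producer data rate and a consumer that stops completely during the stall; the function names and example sizes are hypothetical.

```python
def buffer_cover_ms(buffer_bytes, data_rate_bytes_per_s):
    """Time in ms that a buffer can absorb at the given data rate."""
    return 1000.0 * buffer_bytes / data_rate_bytes_per_s


def stall_survivable(buffer_bytes, data_rate_bytes_per_s, stall_ms, fill_fraction=0.0):
    """True if the free part of the buffer outlasts a stall of stall_ms."""
    free_bytes = buffer_bytes * (1.0 - fill_fraction)
    return buffer_cover_ms(free_bytes, data_rate_bytes_per_s) >= stall_ms


# Example: 48 kHz, 2-channel, 16-bit audio is 192000 bytes/s.
audio_rate = 48000 * 2 * 2
ring_ms = buffer_cover_ms(19200, audio_rate)  # 100 ms of audio
ok_80 = stall_survivable(19200, audio_rate, stall_ms=80)
ok_250 = stall_survivable(19200, audio_rate, stall_ms=250)
```

Comparing each buffer's cover time against the measured stall durations from the storage latency log shows which buffer fails first and how much headroom a fix needs to add.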
Fixed SOP: Symptom → shared resource → discriminator evidence → first mitigation
H2-6. Audio codec + mic-array AFE: noise & pops as hardware evidence
In compact wearables, “noise floor rising” and “start/stop pops” are often the fastest hardware evidence for bias routing, ground reference, codec rail cleanliness, and PLL lock stability. This section focuses on measurable hardware triggers rather than DSP algorithm details.
Mic array physical realities (what changes the analog input)
- Mic port / duct / membrane: mechanical changes alter sensitivity and channel matching; can look like “one mic is bad”.
- Bias routing: mic-bias impedance and decoupling shape how rail noise becomes audible noise.
- Shield + ground reference: small ground shifts become common-mode injection in tight layouts and flex transitions.
Codec clock domain (hardware-trigger paths only)
- MCLK/BCLK/LRCLK stability: any lock-loss or relock can create discontinuities that appear as pops/clicks.
- PLL sensitivity to rail noise: rail ripple and droop can trigger unlock/relock events.
- Start/stop pop mechanisms: bias settling, mute/unmute edges, and rail ramp behavior can inject an impulse if timing is wrong.
Pop/Noise triage SOP (symptom → first 2 measurements → discriminator → first fix)
H2-7. A/V sync & timestamps: drift budget, resync evidence
“Audio/video out of sync after a few minutes” is solvable only when sync is treated as a measurable quantity: timebase + timestamp meaning + drift shape (continuous drift vs discrete jumps). This chapter builds a minimal log set and a fast computation method to classify root causes.
Timestamp origins (what each timestamp actually represents)
- Video timestamps: may represent frame start (sensor/ISP side) or encode output (SoC side). Buffering can shift “when it was captured” vs “when it was written”.
- Audio timestamps: may represent sample count (codec clock domain) or a DMA/system timebase. Service latency and underruns can break continuity.
- Mux/file boundary: the container is only as correct as its inputs; discontinuity or resync markers must be logged, not guessed.
Where drift comes from (two shapes, different diagnostics)
- Continuous drift: ppm mismatch between clocks, temperature-driven frequency drift, slow load/thermal changes.
- Discrete jumps: PLL unlock/relock, resync events, brownout/reset time jumps, discontinuity insertion after xrun.
What to log (minimal, high-signal checklist)
- Choose two stable windows (e.g., 30–60 s and 330–360 s) away from start/stop transitions.
- Record frame count at each window boundary and obtain the corresponding video timestamps.
- Record cumulative audio sample count at the same boundaries (or audio timestamps tied to sample count).
- Compute durations: video duration from timestamps (preferred) or frames/fps; audio duration from samples/Fs.
- Compute drift: ΔT = T_audio − T_video; express as ms per minute and as ppm (ppm ≈ ΔT / T × 10⁶, where T is the elapsed duration between the two windows).
- Check drift shape: smooth growth → ppm/thermal drift; step changes → PLL relock/resync/brownout time jump.
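The computation above can be sketched as follows. This is a minimal illustration, assuming the video duration comes from timestamps and is used as the reference T, and that audio duration is derived from sample count and nominal Fs. The 5 ms step threshold used to separate smooth drift from jumps is a placeholder to tune per product, and both function names are hypothetical.

```python
def av_drift(audio_samples, fs_hz, video_duration_s):
    """Drift of audio relative to video over one measurement span."""
    t_audio = audio_samples / fs_hz
    delta_s = t_audio - video_duration_s
    return {
        "delta_ms": delta_s * 1000.0,
        "ms_per_min": delta_s * 1000.0 / (video_duration_s / 60.0),
        "ppm": delta_s / video_duration_s * 1e6,
    }


def classify_drift_shape(deltas_ms, step_ms=5.0):
    """'step' if any checkpoint-to-checkpoint change exceeds step_ms, else 'smooth'."""
    for prev, cur in zip(deltas_ms, deltas_ms[1:]):
        if abs(cur - prev) > step_ms:
            return "step"
    return "smooth"


# Example: 300 s of video against 14,400,720 audio samples at 48 kHz.
d = av_drift(14_400_720, 48_000, 300.0)
shape_a = classify_drift_shape([0.0, 1.0, 2.0, 3.0, 4.0])
shape_b = classify_drift_shape([0.0, 1.0, 2.0, 22.0, 23.0])
```

In the example the audio runs 15 ms long over five minutes, which is 3 ms per minute or 50 ppm. Logging ΔT at regular checkpoints and feeding the series to the shape check separates clock mismatch from relock or brownout jumps.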
H2-8. Storage choices & integrity: deterministic write matters more than peak speed
Storage selection for continuous A/V capture is not about peak MB/s. The deciding factors are worst-case write latency, error-retry cost, power-event behavior, and temperature stability. Most field failures are predictable when logs focus on long-tail latency and integrity signatures.
What to compare (capture determinism dimensions)
- Worst-case write latency: long-tail stalls cause upstream buffer starvation (drops/pops) even when average speed is high.
- Power-event consistency: behavior under droop/reset determines file tail loss vs clean recovery.
- Error retry / ECC cost: retries and correction bursts translate into stalls and timing discontinuities.
- Temperature stability: write behavior can shift with heat/cold; margins shrink and long-tail expands.
- File-integrity signature: truncate/corrupt/stutter/fail-with-space are the meaningful outcomes, not datasheet numbers.
Common field signatures (and what they usually imply)
- Missing file tail (truncate): reset/brownout or unflushed tail; confirm with reset reason and last flush timing.
- Corrupted file / decode errors: ECC/CRC bursts or fs errors; confirm with retry/ECC counters and fs error logs.
- Stutter tied to writes: long-tail latency stall; confirm with write latency histogram aligned to frame drops/pops.
- Write fails despite free space: allocation/timeout behavior under temperature or power instability; confirm with error codes + droop/temp alignment.
Three-bucket selection summary (engineering-first)
H2-9. USB 5V power entry: inrush, droop, and cross-domain failures
USB 5V is the most common root of “cross-domain” failures: a single droop event can trigger MIPI errors, codec PLL unlock pops, and storage stalls or resets. The only reliable approach is to treat power as evidence: two-point probing, event counters, and timestamp alignment to symptoms.
USB 5V in the real world (why “5V” rarely behaves like 5V)
- Cable and contact loss: long/thin cables and connector resistance create droop under bursts (encode + write + RF spikes).
- Host/hub current limit: foldback can produce repeated droop-recover cycles that look like random instability.
- Hot-plug transient: plug-in charging of bulk caps causes inrush and ground bounce, creating short but harmful disturbances.
- ESD / plug events: can cause soft faults (noise/error bursts) before any hard failure appears.
Power tree view (where a 5V event turns into multiple symptoms)
- Entry: USB-C/connector → protection (TVS / fuse / eFuse / load switch) → PMIC input.
- Conversion: 5V → buck(s) → LDOs to separate domains (sensor, CSI/IO, codec analog, storage, SoC I/O).
- Dependencies: POR/UVLO thresholds and rail sequencing decide whether the failure becomes a hard reset or a soft corruption.
Inrush: why “bigger cap” can be worse at the USB entry
- What creates inrush: large entry bulk caps + simultaneous rail bring-up + hot-plug charging current.
- Common outcome: host limit/foldback → voltage collapses → repeated retries → “works on some ports/cables only”.
- Engineering levers: soft-start and current limiting (load switch/eFuse/PMIC), staged enable of heavy loads, and distributed capacitance.
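A first-order estimate shows why entry capacitance and soft-start trade against each other. The sketch below uses the ideal capacitor relation I = C × dV/dt with a linear voltage ramp, ignoring ESR, cable inductance, and any load current drawn during the ramp. The example values are illustrative, not recommendations, and the function names are hypothetical.

```python
def inrush_peak_a(bulk_cap_f, v_step, rise_time_s):
    """Charging current in amps for a linear ramp of v_step over rise_time_s."""
    return bulk_cap_f * v_step / rise_time_s


def min_soft_start_s(bulk_cap_f, v_step, current_limit_a):
    """Shortest linear ramp that keeps charging current at or below the limit."""
    return bulk_cap_f * v_step / current_limit_a


# Example: 100 uF of entry capacitance charged to 5 V.
fast_edge_a = inrush_peak_a(100e-6, 5.0, 50e-6)    # uncontrolled 50 us edge
ramp_s = min_soft_start_s(100e-6, 5.0, 0.5)        # ramp needed for a 0.5 A budget
```

Under these assumptions a 50 µs edge demands about 10 A, far beyond what a typical host port will supply without folding back, while a 1 ms controlled ramp keeps the same capacitance within a 0.5 A budget. Doubling the capacitance doubles either the current or the required ramp time.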
Cross-domain coupling paths (droop → symptom chains)
- Droop → CSI margin shrink → MIPI CRC/deskew failures → frame drops (often induced by touch/plug events or heavy load edges).
- Droop → codec rail/clock perturbation → PLL unlock/relock → pops/clicks and elevated noise during capture transitions.
- Droop → storage rail reset / retry burst → write stall / long-tail latency → stutter + file tail loss under sustained record.
H2-10. EMI/ESD + thermal in glasses form factor: practical hooks (not a compliance tutorial)
In a glasses form factor, short flex runs, dense returns, and close switching sources make EMI/ESD/thermal issues appear as “random errors” unless the diagnosis is anchored to vulnerable nodes and quick validation hooks. This chapter uses a strict engineering template: Threat → vulnerable node → symptom → quick validation.
Key coupling realities in a compact module
- Switching EMI is often the hidden driver of CSI error bursts and codec PLL instability during load steps.
- ESD targets cluster around external touch/openings: USB port, FFC/flex connectors, mic openings and shields.
- Thermal drift changes margins: CSI link gets weaker, clock mismatch increases, and throttling creates latency spikes.
Threat: Switching EMI (buck harmonics, return bounce)
- Vulnerable node: CSI return path / ref clock → Symptom: CRC bursts, deskew fail, frame drops → Quick validation: error counters spike on load edges or during plug/touch events.
- Vulnerable node: codec PLL / analog rail → Symptom: pops/clicks, noise floor lift → Quick validation: PLL lock events or rail ripple align with pop windows.
- Vulnerable node: mic bias routing (acts like an antenna) → Symptom: one-channel noise, touch-sensitive hiss → Quick validation: bias ripple rises when switching load increases.
Threat: ESD (plug/touch discharge)
- Vulnerable node: USB entry / shield / ground return → Symptom: soft faults (noise/error bursts) → Quick validation: repeatable with touch/plug handling; counters spike without permanent damage.
- Vulnerable node: FFC/flex connector pins → Symptom: intermittent link failures → Quick validation: symptom depends on flex pressure/position; error signature correlates with handling.
- Soft vs hard: soft faults fluctuate and are event-correlated; hard faults persist and show up as a permanent rail or I/O failure.
Threat: Thermal (margin shrink + drift + throttle)
- Vulnerable node: CSI margin → Symptom: errors increase with temperature → Quick validation: error rate is temperature-monotonic or shows a sharp knee near a thermal threshold.
- Vulnerable node: clocks/timebase mismatch → Symptom: A/V drift grows over minutes → Quick validation: drift ppm increases as temperature rises (log drift vs temp).
- Vulnerable node: throttled SoC/ISP/storage path → Symptom: stutter and write stalls → Quick validation: write latency long-tail expands at elevated temperature.
H2-11. Validation & field debug playbook (copyable SOP)
This SOP is designed for fast root-cause isolation with minimal tools. The method is always the same: lock the symptom window → collect two-point power evidence and one domain counter → apply a single isolation toggle → choose the first hardware fix based on discriminator evidence (power vs link vs clock vs storage).
- Step 1 — Freeze the window: reproduce the symptom and record a single time window (start/end) for alignment.
- Step 2 — Two-point power first: probe USB 5V at the connector and at the PMIC input; capture min + droop duration.
- Step 3 — One domain counter: pick the counter that matches the symptom (CSI errors / PLL lock events / write-latency tail / reset reason).
- Step 4 — One isolation toggle: disable storage writes, lower FPS/bitrate, force fixed clock, or swap cable/port (one change at a time).
- Step 5 — First fix: choose the smallest hardware change that removes the discriminator evidence (rail partition, return path, inrush control, ESD leakage, clock cleanup).
Decision-tree bullets (symptom → first evidence → discriminator → isolation → first fix)
Symptom: Touch-triggered errors/pops/resets (handling the frame/port triggers it)
Symptom: Reboot / reset during record or plug-in
Symptom: Frame-drop bursts (CSI errors or encoder underrun) under load
Symptom: Pops/clicks (start/stop, bitrate change, or load steps)
Symptom: A/V drift (audio and video lose sync over minutes)
Symptom: Corrupt file / tail missing (record stops cleanly but file integrity fails)
- USB power protection / inrush control: TI TPS25947, TI TPS2595, onsemi NCP380, TI TPS22965
- Reset / UVLO supervision: TI TPS3839, Analog Devices ADM809
- Low-noise rails (codec/clock): TI TPS7A02, TI TPS7A20, Analog Devices ADP150
- High-speed ESD protection (USB/FFC/MIPI): TI TPD4E02B04, TI TPD4E05U06, Semtech RClamp0524P, Nexperia PESD2V0X1BSF
- Audio codec examples: TI TLV320AIC3254, Cirrus Logic CS42L42
- Clock examples (XO families): SiTime SiT1602, SiTime SiT5356, Abracon ASFL1
H2-12. FAQs (12) — evidence-based, no scope creep
Frames drop only when writing to storage: latency spike or bandwidth contention?
First evidence: Log a write-latency histogram (P99/P999) and align it with frame-drop timestamps plus encoder underrun flags.
Discriminator: If drops line up with rare long stalls, it is latency tail; if drops persist with smooth latency, it is bandwidth contention.
Isolation: Run RAM-only capture (no writes).
First fix: Prefer deterministic storage and protect the storage rail (eFuse + supervisor, e.g., TPS25947/TPS3839).
Reboots right after plugging USB: inrush or UVLO? Which two points first?
First evidence: Scope USB 5V at the connector and at the PMIC input during hot-plug, then read reset-reason/UVLO counters.
Discriminator: If 5V dips below UVLO or shows current-limit foldback, inrush/entry is dominant.
Isolation: Try a short, thick cable and delay heavy loads (encoder/storage) until after attach.
First fix: Add soft-start/current limiting (TPS22965/TPS25947) and a reset supervisor (TPS3839).
Drops only when hot: CSI margin drift or thermal throttling? Which counters matter?
First evidence: Track CSI error counters/deskew failures versus temperature and any throttle indicators (clock reduction, underrun without CSI errors).
Discriminator: If CSI errors rise with temperature even at reduced load, link margin is shrinking; if CSI stays clean but underruns rise, throttling/bandwidth is the trigger.
Isolation: Hold temperature constant and step the load (FPS/bitrate).
First fix: For margin: return-path/ESD loading/rail noise; for throttle: thermal headroom and load partitioning.
Pops on record start/stop: mute timing or rail dip? How to prove it?
First evidence: Capture codec analog-rail ripple (or mic-bias ripple) and read PLL lock/unlock or xrun/underrun events at the exact pop time.
Discriminator: Pops that coincide with rail ripple or PLL unlock point to power/clock; pops that coincide with buffer underrun point to scheduling headroom.
Isolation: Fix sample-rate/clock (no relock) and disable storage writes.
First fix: Clean codec rails (TPS7A02-class LDO) and tighten power sequencing; ensure transition buffering.
A/V desync after minutes: ppm mismatch or resync policy? How to compute drift fast?
First evidence: Compute drift from frame-count vs audio-sample-count over a fixed window, and log resync events or timestamp discontinuities.
Discriminator: Smooth drift that scales with temperature indicates ppm mismatch; step-like jumps indicate PLL relock or timebase jumps (often brownout related).
Isolation: Force fixed clocks (disable DFS/relock).
First fix: Use a stable oscillator (SiT1602/SiT5356-class) and reduce rail-noise-driven relock events with cleaner supplies.
One mic channel gets noisy: bias leakage or acoustic/structure? How to isolate?
First evidence: Measure mic-bias ripple/offset for that channel and compare the noise floor against other channels at the same gain.
Discriminator: If noise follows bias ripple or changes with USB/touch events, it is electrical (leakage/return/shield). If noise follows wind/pressure or a specific opening, it is structural/acoustic.
Isolation: Swap channel-to-physical-mic mapping or run single-channel mode.
First fix: Rework bias routing/return and check ESD leakage near mic openings.
Touching the frame makes CSI errors explode: return-path issue or ESD leakage? Fast experiment?
First evidence: Log CSI error bursts with touch timestamps and scope 5V + an IO/CSI rail for short bounce.
Discriminator: Instant burst with bounce suggests return-path discontinuity; a persistent post-touch offset/leakage current suggests ESD-related leakage/loading.
Isolation: Touch different zones (USB/FFC/mic opening), add a temporary ground strap, and compare burst rates.
First fix: Improve return continuity and use low-cap ESD parts (e.g., PESD2V0X1BSF/TPD4E05U06-class) with verified loading.
Files corrupt after low-battery or unplug: storage consistency or filesystem journal? Which logs?
First evidence: Align corruption events with power logs (5V droop/UVLO) and storage health logs (retry/ECC/CRC/fs errors).
Discriminator: If corruption aligns with droop or storage reset, it is power-path/rail integrity; if corruption aligns with long write stalls without droop, it is deterministic-write/journal policy.
Isolation: Controlled power-cut tests plus RAM-only capture.
First fix: Stabilize the storage rail (supervisor + eFuse) and validate worst-case write latency; prefer deterministic storage options.
Indoor OK but outdoor fails: heat/EMI environment or power-source differences? What to record?
First evidence: Record temperature, CSI/PLL/storage counters, and USB 5V droop signature for each power source (phone port, power bank, hub).
Discriminator: If failures follow a specific source, entry droop/inrush is likely; if failures follow temperature or EMI exposure, susceptibility dominates.
Isolation: Repeat outdoors with a known-good lab supply and controlled temperature steps.
First fix: Strengthen input filtering/rail partitioning and reduce EMI/ESD coupling at vulnerable nodes.
Higher bitrate causes noise/pops: DMA underrun or PLL relock? Which evidence points?
First evidence: Check audio xrun/underrun counters and PLL lock/unlock events, then align both with storage write-latency tail and 5V ripple.
Discriminator: xruns without PLL events indicate buffer starvation/bandwidth contention; PLL unlock that aligns with rail ripple indicates clock-domain sensitivity to power noise.
Isolation: Disable storage writes or reduce video load (one toggle).
First fix: Increase buffer headroom/scheduling determinism, and clean codec/clock rails (TPS7A02-class LDO + stable XO).
30 fps stable but 60 fps unstable: CSI lane margin or ISP bandwidth? Which KPI first?
First evidence: Look at CSI error counters/deskew failures (lane margin) and ISP/encoder underrun or overflow indicators (bandwidth/compute).
Discriminator: If CSI errors jump at 60 fps, link margin is the limiter; if CSI stays clean but underruns rise, memory/ISP bandwidth is the limiter.
Isolation: Keep 60 fps but reduce resolution/bitrate to cut bandwidth; compare.
First fix: For margin: return-path + low-cap ESD loading; for bandwidth: buffer sizing and deterministic storage writes.
microSD fine in the lab but fails in the field: worst-case latency, contact, or ESD? Which validation catches it?
First evidence: Capture write-latency tail (P999) with retry/ECC/fs errors, and correlate failures with motion/touch/ESD events.
Discriminator: Long-tail stalls with clean contacts indicate latency; error bursts tied to motion indicate contact; bursts tied to touch/ESD indicate coupling/leakage.
Isolation: Mechanically secure the card, repeat touch/ESD tests, and A/B swap to eMMC/UFS.
First fix: Improve retention/contact and ESD scheme, or move to deterministic storage if latency tail dominates.