
Compression / Codec for Machine Vision (H.26x, JPEG-LS)


Compression/codec in machine vision is an engineering control loop: it must deliver measurable fidelity (edges/defects/ROI), bounded latency, and debuggable determinism via GOP/RC/VBV/ROI/buffering telemetry. The practical goal is to spend bits where evidence matters while keeping recovery, traceability, and security hooks from breaking predictability.

H2-1. What This Page Owns: Compression in Vision Systems (Boundary + Use Cases)

Engineering definition (extractable)

Machine-vision compression is the engineering of bounded latency, measurable inspection fidelity, and traceable, secure streams using H.26x or JPEG-LS/lossless pipelines. The objective is not cinematic quality; it is predictable timing, analyzable edges/textures/defects, and verifiable metadata—while minimizing bandwidth and storage.

  • Goal 1: Lower bandwidth / storage
  • Goal 2: Bound end-to-end latency
  • Goal 3: Preserve analyzable fidelity (ROI)

What this page covers vs. what it only references

Owns (deep coverage)
  • Encoder pipeline knobs: rate control (QP/VBV), picture structure (GOP/references/slices), low-latency modes.
  • Internal buffering logic: line/frame/reference buffering as it affects latency bounds and burst behavior.
  • Inspection fidelity controls: lossless / near-lossless, ROI bit allocation, and how to validate “defect survives compression”.
  • Security hooks: encryption/signing/watermarking insertion points without breaking determinism.
  • Telemetry & validation: which counters prove “codec config” vs “content complexity”.
References (link-out only; not expanded here)
  • Machine-Vision Interfaces (PHY/SerDes/retimers, electrical trigger robustness).
  • Sync/Trigger & Timing Hub (PTP/1588 distribution, jitter-cleaning PLL network).
  • Local Buffering & Storage (SD/NAND/SSD controller, PLP hold-up, wear leveling/ECC internals).
  • Power/EMC/I/O (PoE PD, isolated rails, surge strategy, grounding implementation).

Rule: when those topics are required, reference them by link; do not duplicate their core content in this page.

The measurable triangle: Bitrate ↔ Latency ↔ Fidelity

Use measurable evidence for each corner; avoid “looks better” language.

Bitrate (not only average)
  • Average + peak bursts (bits/frame distribution; burst-to-average ratio).
  • Scene-change spikes and ROI-induced variability (needs counters, not guesswork).
Latency (bounded, not just low)
  • min/typ/max; the bound (max) is often the real requirement.
  • Primary latency multipliers: B-frame reordering, lookahead, VBV fullness, deep queues.
Fidelity (inspection-grade)
  • ROI metrics: edge/texture preservation, fine-defect contrast, OCR/detector accuracy stability.
  • Lossless/near-lossless criteria: “compression error must not exceed defect signal budget”.

Engineering goal: choose knobs that keep all three corners within spec, not merely maximizing compression ratio.
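
The bitrate corner above is directly computable from a per-frame bit-count log. A minimal sketch (trace values are invented for illustration):

```python
def burst_metrics(bits_per_frame):
    """Average, peak, and burst-to-average ratio for a bits/frame trace."""
    avg = sum(bits_per_frame) / len(bits_per_frame)
    peak = max(bits_per_frame)
    return avg, peak, peak / avg

# Steady P-frames with one I-frame / scene-change spike (illustrative numbers).
trace = [40_000, 42_000, 41_000, 180_000, 43_000, 40_000]
avg, peak, ratio = burst_metrics(trace)  # ratio ≈ 2.8: bursts dwarf the average
```

A burst-to-average ratio near 1 means the encoder output is smooth; a ratio like the one here signals that queueing and deadline behavior, not the average, will dominate.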

Scope Guard (mechanical check)

Use Ctrl+F on the keywords below to verify that the page stays inside the boundary.

Allowed (examples)

H.264/H.265/H.266, JPEG-LS, lossless, GOP, QP, VBV, low-latency, ROI, slices/tiles, metadata, encryption, watermarking

Banned (examples)

CoaXPress/10GigE PHY tuning, PTP hub architecture, PoE PD design, NAND/SD controller internals, PLP hold-up, cloud/VMS

Figure F1 — Vision encode chain boundary map

[Figure: block diagram. Sensor/ISP (placeholder) → preprocess (format / ROI map) → encoder core (RC: QP/VBV; GOP: I/P/B; BUF: line/frame) → bitstream + metadata/SEI → packetization → security/watermark hooks (encrypt / sign / WM) → output stream. The owned scope (preprocess → encoder → bitstream/metadata → security hooks) is highlighted; Interfaces (PHY/SerDes), Timing Hub (PTP/1588), Storage (NAND/SSD), and Power/EMC/I/O appear as link-out boxes. Evidence signals emphasized: QP distribution, VBV/queue depth, bits/frame burst, ROI accuracy delta, lossless error bound, recovery time after loss.]
F1 highlights the engineering boundary: this page goes deep on encoder knobs (RC/GOP/buffering), measurable latency bounds, inspection fidelity (ROI/lossless), and security insertion points. Interfaces/timing/storage/power are referenced by link only.
Cite this figure: Compression / Codec for Machine Vision — “Vision encode chain boundary map (F1)”, ICNavigator.

H2-2. Decision Tree: Choose Codec Family by “Task” Not Trend

Why task-first selection matters

Codec selection in machine vision should start from the inspection task, not from the newest standard. Different tasks impose different non-negotiables: some require lossless evidence, others require bounded latency, and some are primarily bandwidth-limited. Picking the codec family first often forces later “band-aids” (unstable rate control, unpredictable latency spikes, or ROI failures).

Task types: Inspection / Detection • Measurement / Metrology • Remote Monitoring • Evidence / Compliance

Minimum spec set (what must be known before choosing)

This list prevents “codec debates” without measurable inputs.

Input characteristics
  • Resolution and frame rate (e.g., 1920×1080 @ 60 fps).
  • Bit depth: 8/10/12; and pixel format (grayscale, YUV 4:2:0 / 4:2:2 / 4:4:4).
  • ROI definition: ROI area ratio, static vs dynamic ROI, and “must-preserve” feature type (edge/texture/defect/OCR).
Hard constraints (non-negotiables)
  • Latency requirement: specify max bound in milliseconds, not only typical.
  • Loss tolerance: acceptable mosaic probability, recovery time after loss, and whether any frame loss is allowed.
  • Traceability: whether metadata must be signed / watermarked; and whether replay must be bit-exact.
Acceptance metrics (measurable)
  • Bitrate: average + burst (bits/frame distribution, burst-to-average ratio).
  • Fidelity: ROI edge/texture preservation and task accuracy delta (false/true detections, OCR confidence stability).
  • Latency: min/typ/max + jitter; plus “queue depth” stability under scene changes.
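
The latency acceptance metric above (min/typ/max + jitter) reduces to a small report function; a sketch, assuming per-frame latency samples in milliseconds:

```python
import statistics

def latency_report(samples_ms):
    """min / typ (median) / max plus jitter; the max is the testable bound."""
    return {
        "min": min(samples_ms),
        "typ": statistics.median(samples_ms),
        "max": max(samples_ms),            # the bound that must hold
        "jitter_ms": statistics.pstdev(samples_ms),
    }

report = latency_report([8.1, 8.3, 8.2, 9.9, 8.2, 8.4])  # max tail = 9.9 ms
```

Note that the acceptance check compares `report["max"]` against the required bound, never the average.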

Task → constraints → recommended codec path (engineer-oriented)

Each task type maps hard constraints (what cannot break) to a recommended path (codec family + structure + verify):

Measurement / Metrology
  • Hard constraints: evidence must not be altered; repeatable replay; drift must be diagnosable. Latency can be moderate, but fidelity is absolute.
  • Recommended path: JPEG-LS lossless or near-lossless with an explicit error bound; prefer intra-only structure.
  • Verify: difference map / error bound compliance + ROI measurement repeatability.
Inspection / Detection
  • Hard constraints: ROI edges/textures/defects must survive compression; bounded latency required for closed-loop lines; burst behavior must be controlled.
  • Recommended path: H.264/H.265 low-latency with ROI allocation (QP map / tiles/slices) or near-lossless ROI; keep GOP short; limit references.
  • Verify: ROI accuracy delta + latency max + burst-to-average ratio.
Remote Monitoring
  • Hard constraints: bandwidth limited; tolerates some quality loss; the latency target is typically "low enough", but stability still matters.
  • Recommended path: H.265 (or H.264 if silicon constrains the choice) with controlled bitrate (VBV); use a moderate GOP; avoid deep lookahead if latency variance matters.
  • Verify: bitrate stability under scene changes + acceptable visual/ROI degradation.
Evidence / Compliance
  • Hard constraints: non-repudiation; tamper resistance; consistent replay; fidelity requirements are driven by audit policy.
  • Recommended path: lossless or constrained-loss + signing/watermarking hooks placed to preserve determinism; prefer frequent recovery points (IDR/CRA, limited propagation).
  • Verify: signature validity + replay consistency + recovery time after loss.

Fast “Do / Don’t” (prevents common mistakes)

  • Do specify a latency max bound; “low latency” without a bound is not testable.
  • Do measure burst (bits/frame), not only average bitrate; bursts drive queueing and missed deadlines.
  • Don’t enable B-frames/lookahead when determinism is required; it often creates unpredictable latency tails.
  • Don’t judge inspection quality by PSNR alone; include ROI task accuracy or edge/texture preservation checks.

Figure F2 — Codec choice decision tree (task-driven)

[Figure: decision tree. Starting from the task (inspection / measurement), three measurable checks are applied in turn: latency bound (the max must be guaranteed), fidelity bound (ROI must survive), and bandwidth bound (network/storage limited). The YES/NO branches lead to leaves: JPEG-LS lossless/near-lossless, H.265 low-delay + ROI, H.265 efficiency with VBV control, H.264 intra-only (simple), and H.266 (optional, when silicon supports it). Each leaf carries verify counters: latency max, ROI accuracy Δ, burst ratio, recovery time. Leaf recommendations are only "correct" when these counters meet the requirement.]
F2 forces a task-first choice: define latency bound, ROI fidelity bound, and bandwidth pressure, then pick a codec family and structure. Each leaf includes the verification counters that prevent subjective debates.
Cite this figure: Compression / Codec for Machine Vision — “Codec choice decision tree (F2)”, ICNavigator.

H2-3. Latency Anatomy: Where the Milliseconds Hide

Core idea: bounded latency is an engineering budget, not a slogan

In machine vision, the requirement is usually not “low latency” but bounded latency (a max tail that stays within a control-loop deadline). The only way to guarantee a bound is to decompose latency into measurable segments, then remove or cap the mechanisms that amplify the tail: B-frame reordering, lookahead, VBV queueing, and deep buffer backlogs.

Key principles: bound = max matters • tail latency & jitter • measure by segment

Three latencies (define what is controllable)

L1: Capture → Encode-Out (codec-core controllable)
  • From frame arrival at the encode pipeline to bitstream-ready.
  • Dominant multipliers: reordering, lookahead, reference/DPB access, internal queue depth.
L2: Encode-Out → Egress (pack/security tail)
  • From bitstream-ready to “ready-to-send” after packetization and optional encryption/signing.
  • Dominant multipliers: pack queues, crypto throughput limits, fragment policy, backpressure.
L3: Egress → Decode/Display (mostly external)
  • Useful mainly for back-inference: if L1/L2 are stable but end-to-end drifts, the cause is likely outside codec.

This page focuses on L1/L2 mechanisms and counters. Interface timing distribution, network stack behavior, and receiver-side buffering belong to other pages (link-out only).
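
One way to make L1/L2 measurable is to log stage timestamps per frame and attribute the worst tail to a segment. A sketch with invented field names (`capture_ts`, `encode_out_ts`, `egress_ts`, all in milliseconds):

```python
def segment_latencies(frames):
    """Split per-frame latency into L1 (capture → encode-out) and L2 (→ egress)."""
    return [{"id": f["id"],
             "L1_ms": f["encode_out_ts"] - f["capture_ts"],
             "L2_ms": f["egress_ts"] - f["encode_out_ts"]} for f in frames]

def worst_tail(rows):
    """Name the segment that owns the max total latency; fix that row first."""
    w = max(rows, key=lambda r: r["L1_ms"] + r["L2_ms"])
    return ("L1" if w["L1_ms"] >= w["L2_ms"] else "L2"), w["id"]

frames = [
    {"id": 1, "capture_ts": 0.0,  "encode_out_ts": 6.0,  "egress_ts": 7.0},
    {"id": 2, "capture_ts": 16.7, "encode_out_ts": 30.0, "egress_ts": 31.5},
]
segment, frame_id = worst_tail(segment_latencies(frames))  # tail lives in L1
```

This also supports the back-inference rule for L3: if L1/L2 stay stable while end-to-end drifts, the cause is outside the codec.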

Latency budget table (controllable vs. not) + what to log

Treat each row as a testable checkpoint. If the max bound breaks, identify which row grows first.

Line/Frame ingress (control: partial)
  • Tail amplifiers: deep input queues; frame pacing mismatch; sudden scene complexity.
  • Evidence to capture: queue depth; frame counter continuity; input arrival timestamps.
ME/RDO (control: yes)
  • Tail amplifiers: lookahead; heavy motion search; content-dependent compute spikes.
  • Evidence to capture: encode-start/encode-end timestamps; lookahead depth; per-frame compute time.
Entropy (control: yes)
  • Tail amplifiers: high-detail frames increase bits; backpressure from VBV/pack.
  • Evidence to capture: bits/frame; entropy stage time; stall events.
GOP reorder (control: yes)
  • Tail amplifiers: B-frame reordering; DPB/reference depth; long GOP.
  • Evidence to capture: frame-id vs output-order; reorder depth; reference count.
VBV / rate buffer (control: yes)
  • Tail amplifiers: VBV fullness sustained high; large VBV hides bursts as delay.
  • Evidence to capture: VBV fullness over time; burst-to-average ratio; skip/merge indicators.
Pack / fragment (control: partial)
  • Tail amplifiers: packet queueing; fragmentation policy; downstream backpressure.
  • Evidence to capture: pack queue depth; egress timestamp; bytes/packet distribution.
Encrypt / sign (control: partial)
  • Tail amplifiers: crypto throughput below peak; key ops stalling the pipeline.
  • Evidence to capture: crypto stage time; crypto queue depth; drop/retry counts (if any).

Practical rule: any “latency improvement” claim is incomplete unless it includes max and shows which row’s evidence improved.

Low-latency mode checklist (make the bound defensible)

Remove tail multipliers
  • Disable B-frames (avoid reordering tail).
  • Disable or cap lookahead (avoid hidden frame queues).
  • Limit reference frames / DPB depth (reduce dependency backlog).
Cap buffering and burst translation
  • Use a smaller VBV to prevent long “delay storage”.
  • Cap internal queue depth; avoid deep FIFO “safety nets” that become latency tails.
  • Prefer slices/tiles to reduce waiting and limit error propagation radius.
Verify (must be measured)
  • Latency: report min/typ/max and jitter.
  • Burst: bits/frame distribution and burst-to-average ratio.
  • Recoverability: time-to-recover after a loss event (shorter tail = smaller blast radius).

Figure F3 — Latency budget pipeline (timeline + reduction knobs)

[Figure: timeline from frame-n arrival through preprocess (format/ROI), ME/RDO (compute-heavy), entropy (bits/frame), pack (queue), and optional encrypt, to egress (ready-to-send). Reduction knobs are annotated per stage: disable/cap lookahead, smaller VBV, cap queue depth, slices/tiles to reduce waiting. Evidence to log so latency becomes explainable: capture TS, encode-start TS, encode-out TS, VBV fullness, bits/frame burst, pack queue depth, crypto stage time, egress TS. Report min/typ/max latency + jitter; fix the row that grows the max tail first.]
F3 shows where bounded latency is lost: lookahead/reordering create hidden frame queues; VBV and pack queues translate bursts into delay. The “evidence to log” set makes L1/L2 explainable and repeatable in validation.
Cite this figure: Compression / Codec for Machine Vision — “Latency budget pipeline (F3)”, ICNavigator.

H2-4. Rate Control That Engineers Can Debug (CBR/VBR/Low-Delay VBV)

Core idea: rate control is a closed loop with observable signals

Rate control (RC) is often treated as a black box, but in machine vision it must be debuggable. The job of RC is not only meeting a target bitrate; it is controlling burst behavior and protecting inspection-critical ROI while keeping latency bounds intact. If RC hides bursts by buffering (VBV), the system “pays” in tail latency and quality volatility.

Core counters: QP distribution • bits/frame burst • VBV fullness • skip/merge ratio

Minimum evidence set: 4 counters that separate “RC instability” from “content complexity”

1) QP distribution (global + ROI)
  • Watch the high-QP tail: ROI degradation often appears as a tail growth.
  • Compare ROI vs background QP (ROI should not be sacrificed when inspection is the priority).
2) bits/frame curve + burst ratio
  • Average bitrate can look fine while bursts cause queueing and deadline misses.
  • Log burst-to-average ratio and correlate spikes with scene changes and I-frame events.
3) VBV fullness (or equivalent RC buffer occupancy)
  • Sustained high fullness indicates RC is translating bursts into latency.
  • Low-delay modes require VBV to be capped and recover quickly.
4) skip/merge ratio (detail sacrifice indicator)
  • High skip/merge can explain “bitrate stable but details disappear”.
  • Combine with ROI QP to confirm whether inspection features are being removed.
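
These four counters support a simple discriminator: if bits/frame stays flat while the ROI QP tail or skip/merge grows, the problem is allocation, not bandwidth. A sketch (thresholds and names are illustrative, not normative):

```python
def qp_high_tail_fraction(qp_samples, threshold=40):
    """Fraction of blocks above a high-QP threshold (ROI tail-growth indicator)."""
    return sum(q > threshold for q in qp_samples) / len(qp_samples)

def classify_roi_degradation(bits_flat, roi_qp_tail_grew, skip_merge_up):
    """Separate RC/ROI allocation problems from plain content complexity."""
    if bits_flat and (roi_qp_tail_grew or skip_merge_up):
        return "allocation (RC/ROI)"      # re-tune ROI QP map / priority
    return "bandwidth / content complexity"

tail = qp_high_tail_fraction([30, 35, 45, 50])        # 0.5: tail is growing
verdict = classify_roi_degradation(True, tail > 0.3, False)
```

The point is not the specific threshold but that the verdict is derived from logged counters, not from watching the video.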

Debug cards: Symptom → Evidence → First knobs (codec-side only)

Symptom A: bitrate looks stable but fine defects / edges disappear
  • Evidence: ROI QP tail grows; skip/merge increases; ROI accuracy/OCR confidence drops.
  • Discriminator: if bits/frame stays flat but ROI metrics degrade, it is allocation (RC/ROI), not bandwidth.
  • First knobs: set ROI QP map/priority, cap ROI QP max, tighten scene-change behavior, avoid “ROI drift” in ROI mapping.
Symptom B: burst spikes cause congestion or missed deadlines (intermittent stalls)
  • Evidence: bits/frame spikes align with scene changes or I-frame events; VBV fullness saturates; pack queue depth rises.
  • Discriminator: if VBV stays high after a spike, the system is paying in latency tail.
  • First knobs: reduce VBV, constrain I-frame size, adjust scene-change threshold, shorten GOP, consider slice/tile to reduce wait.
Symptom C: ROI gets “crushed” (ROI flicker or unstable sharpness)
  • Evidence: ROI QP oscillates frame-to-frame; ROI boundary changes rapidly; bitrate remains within target but ROI accuracy varies.
  • Discriminator: global PSNR can remain similar while ROI inspection fails—use ROI accuracy delta as the acceptance metric.
  • First knobs: stabilize ROI map generation, use multi-level ROI (not binary), bind ROI to tiles/slices, avoid over-aggressive QP swings.

The “first knobs” above intentionally stay within this page: QP/VBV/ROI/GOP/slice. Interface/network/power/storage causes are handled by link-out pages, not duplicated here.

Knob → trade-off mapping (so tuning stays defensible)

QP cap / ROI weighting
  • Benefit: protects ROI detail and task accuracy.
  • Cost: bitrate may rise; background quality may drop if bitrate is constrained.
Smaller VBV (low-delay)
  • Benefit: reduces tail latency “stored in buffers”.
  • Cost: less ability to smooth bursts; may require stricter GOP/ROI strategy.
Shorter GOP / constrained I-frame
  • Benefit: reduces burst severity and improves recoverability.
  • Cost: compression efficiency decreases; average bitrate may increase.

Figure F4 — Rate control loop (observable signals + injection points)

[Figure: control-loop diagram. Content complexity drives the RC model (target + constraints), which produces per-frame/per-ROI QP decisions; bits/frame flow into the VBV buffer, and VBV fullness feeds back to the RC model. The ROI QP map and GOP policy (IDR / refresh) are explicit injection inputs. Observable signals that must be logged: QP distribution, bits/frame, VBV fullness, skip/merge. If bursts are "smoothed" only by VBV, the cost appears as tail latency and ROI volatility.]
F4 makes RC measurable: QP and bits/frame show allocation decisions; VBV fullness reveals whether bursts are being translated into delay. ROI QP maps and GOP policy are explicit injection points that should be validated with ROI accuracy deltas.
Cite this figure: Compression / Codec for Machine Vision — “Rate control loop (F4)”, ICNavigator.

H2-5. Picture Structure: GOP, References, Slices/Tiles (Determinism vs Efficiency)

Core idea: picture structure controls latency bound and error radius

In machine vision, picture structure is not a “codec preference”; it is a contract for bounded latency, error propagation radius, recovery time, and efficiency. Long dependency chains (deep references, long GOPs) can improve compression, but they also increase the blast radius when a unit is lost or corrupted.

Key trade axes: bounded latency • error radius • recovery time • efficiency

4 knobs × 4 outcomes (use as a design/validation checklist)

The table below is intended to be testable: each knob must map to measurable outcomes and evidence logs.

GOP length (short vs long)
  • Latency bound: short GOP reduces the worst-case dependency tail; long GOP can increase the tail under stress.
  • Error radius: a shorter chain limits how far errors persist; long GOP extends visible corruption.
  • Recovery time: short GOP recovers sooner (more frequent refresh opportunities).
  • Efficiency: long GOP usually improves compression efficiency.
  • Evidence to log: GOP length, frame type pattern, recovery frames-to-clear.
IDR / CRA frequency (more vs fewer)
  • Latency bound: more frequent refresh points tighten bounds after disturbances.
  • Error radius: refresh points cap propagation across time.
  • Recovery time: higher frequency improves "time-to-usable" after loss events.
  • Efficiency: more refresh increases bitrate overhead.
  • Evidence to log: refresh markers, bits/frame spikes at refresh, time-to-clear artifacts.
Reference depth (limited vs deep)
  • Latency bound: deeper references can increase queueing and tail sensitivity.
  • Error radius: deeper dependencies increase how many frames are affected by a single break.
  • Recovery time: limited references shorten the "prediction memory" and speed recovery.
  • Efficiency: deep references can improve efficiency in repetitive scenes.
  • Evidence to log: ref count, DPB depth, reorder depth (if present), artifact persistence.
Slices / Tiles (coarse vs fine)
  • Latency bound: finer partitioning can reduce waiting and improve pipeline behavior.
  • Error radius: partitions localize damage: a bad unit affects a region, not the whole frame.
  • Recovery time: localized damage shortens the "effective recovery" for ROI.
  • Efficiency: partitioning can cost bits (overhead) and reduce efficiency.
  • Evidence to log: partition map, region-level artifact logs, ROI recovery time.

Validation hint: always report max (not only average) for latency and recovery. “Tail is the spec.”
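
Worst-case recovery can be budgeted from the structure parameters alone; a minimal sketch, assuming the decoder must wait for the next refresh point (IDR/CRA) after a loss:

```python
def worst_case_recovery_ms(gop_length, fps, refresh_every_n_gops=1):
    """A loss just after a refresh persists until the next one, at worst."""
    frames_to_refresh = gop_length * refresh_every_n_gops
    return frames_to_refresh * 1000.0 / fps

# 60 fps, GOP of 30, IDR every GOP: up to ~500 ms of corrupted output.
tail_ms = worst_case_recovery_ms(gop_length=30, fps=60)
```

This is the "tail is the spec" number: halving the GOP halves the worst-case recovery window, independent of average behavior.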

Failure mode: one loss breaks the prediction chain (what to change first)

Symptom: a single loss/corruption causes long-lasting mosaics/artifacts
Evidence: artifacts persist across many subsequent frames; recovery is slow
Likely structure cause: long GOP + deep references + insufficient refresh points
First structural fix: shorten GOP, increase IDR/CRA cadence, limit reference depth
Symptom: ROI becomes unusable even when corruption started elsewhere
Evidence: corruption spreads spatially into ROI; ROI metrics collapse
Likely structure cause: no/insufficient slice/tile partitioning (damage is not localized)
First structural fix: introduce slices/tiles to localize damage; bind ROI to partitions where possible
Symptom: bounded latency target fails under stress (deadline misses)
Evidence: tail latency grows; output timing becomes less predictable
Likely structure cause: deeper dependency/queue sensitivity amplifies tail behavior
First structural fix: simplify structure (shorter GOP, limited refs, partitioning) before chasing minor RC tweaks

Figure F5 — Error propagation vs GOP (time + region comparison)

[Figure: two scenarios compared in time and space. (A) Long GOP, deep references, no slices/tiles: a single loss/corruption persists across many frames, produces full-frame artifacts, and recovers slowly. (B) Short GOP, limited references, multiple slices/tiles: slices/tiles localize the blast radius, the error lasts a few frames only, damage stays localized, and recovery is fast.]
F5 contrasts time and space: long dependency chains can cause long-lasting full-frame artifacts, while frequent refresh points and partitions (slices/tiles) cap the time radius and localize the spatial blast radius.
Cite this figure: Compression / Codec for Machine Vision — “Error propagation vs GOP (F5)”, ICNavigator.

H2-6. Lossless & Near-Lossless for Vision (JPEG-LS and Practical Constraints)

Core idea: lossless preserves evidence when defects sit near the noise floor

Lossless (and bounded-error near-lossless) matters when inspection evidence is subtle: micro-defects, fine textures, and measurement edges can sit close to the noise floor. In that regime, lossy compression error becomes indistinguishable from noise and can erase evidence, increasing false-negative risk. The trade-off is practical: lossless pushes higher and more variable bits/frame, raising buffering and throughput pressure.

Key themes: evidence preservation • noise-floor defects • bounded error • burst pressure

Application-driven: what must remain faithful (and how to validate)

Metrology / measurement
  • Preserve: edges and geometry used for dimensional accuracy.
  • Breaks first (lossy): edge softening and local bias.
  • Validate: difference images around edges; measure delta in edge/position error.
Micro-defect inspection
  • Preserve: low-contrast defect signatures close to noise.
  • Breaks first (lossy): compression error masks defect amplitude.
  • Validate: defect miss rate delta on ROI; difference image shows defect disappearance.
Forensics / traceability
  • Preserve: auditable evidence with minimal transformation.
  • Breaks first (lossy): disputed ambiguity (is it artifact or real).
  • Validate: checksum/bit-exact reproducibility; controlled re-decode comparison.
Dataset capture (bias control)
  • Preserve: training/benchmark inputs without codec-imposed bias.
  • Breaks first (lossy): systematic texture loss and block artifacts.
  • Validate: compare feature/ROI metrics across versions; keep a lossless reference set.

Note: this section focuses on codec-side constraints and validation methods, not algorithm internals.

Near-lossless: bounded error budgeting + ROI-first protection

Bound the maximum error (engineer the guarantee)
  • Define an explicit error ceiling (bounded deviation), instead of subjective “slightly lossy”.
  • Use the bound as an acceptance criterion alongside task metrics (ROI miss-rate / edge error).
ROI-first allocation (spend bits where evidence lives)
  • Keep ROI on a tighter error bound; allow larger error in background.
  • Validate with ROI metrics, not only global PSNR-like scores.
Practical constraint: bursts and buffering pressure
  • Lossless often increases bits/frame variability → higher burst-to-average ratios.
  • Track bits/frame and buffer/queue occupancy to ensure latency bounds remain intact.
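
The bounded-error acceptance above can be checked mechanically against a decode: max |original − decoded| must stay under the per-region ceiling, with the ROI on the tighter bound. A sketch on flat pixel lists (values and bounds are illustrative):

```python
def error_bound_ok(original, decoded, roi_mask, roi_bound=1, bg_bound=4):
    """True if every pixel error stays within its region's error ceiling."""
    for o, d, in_roi in zip(original, decoded, roi_mask):
        if abs(o - d) > (roi_bound if in_roi else bg_bound):
            return False
    return True

orig_px = [100, 102,  98, 120]
dec_px  = [100, 101,  99, 117]          # ROI deviates by <=1, background by 3
roi     = [True, True, True, False]
accepted = error_bound_ok(orig_px, dec_px, roi)   # passes both ceilings
```

In practice this check runs alongside the task metrics (ROI miss rate, edge error), not instead of them.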

Figure F6 — Lossless vs lossy impact on defect evidence (signal + noise + compression error)

[Figure: two paths compare a signal-plus-noise input containing a small defect peak. (A) Lossless / near-lossless (bounded error): codec error is approximately zero or bounded, the defect peak remains distinguishable in the protected ROI, and evidence is preserved. (B) Lossy (unbounded / task-unsafe error): compression error adds to the noise, the defect peak can be erased, and the output carries false-negative risk.]
F6 illustrates why lossless/near-lossless is task-driven: when defect amplitude approaches the noise floor, lossy error behaves like added noise and can erase evidence. Validation should rely on ROI deltas and difference images.
Cite this figure: Compression / Codec for Machine Vision — “Lossless vs lossy defect evidence (F6)”, ICNavigator.

H2-7. ROI Encoding for Inspection: Keep Edges, Spend Bits Where It Matters

Core idea: ROI encoding is evidence budgeting, not “pretty video”

ROI encoding allocates bits to preserve task evidence (edges, text strokes, micro-defects) under bandwidth and latency constraints. Success is measured by ROI stability (edge/contrast retention) and task metrics (OCR/detection accuracy) — not by subjective overall sharpness.

Key themes: ROI evidence • edge preservation • metric-driven validation • traceability (ROI metadata)

ROI strategy “pick one of three” (and what it costs)

1) Static ROI fixed regions
  • Use when: weld points, fixed OCR zones, repeatable inspection geometry.
  • Strength: stable control input → easier rate-control stability.
  • Risk: misses targets when camera/fixture drifts; ROI no longer covers evidence.
2) Dynamic ROI per-frame boxes
  • Use when: moving parts, variable target position, tracking-driven inspection.
  • Strength: spends bits where targets move; saves background.
  • Risk: ROI jitter/drift drives bitrate jitter and boundary flicker.
3) Multi-level ROI A/B/C priorities
  • Use when: “core evidence” + “secondary context” + “background” exist simultaneously.
  • Strength: most realistic budgeting; avoids binary ROI artifacts.
  • Risk: frequent level switching causes bits/frame spikes unless rate-limited.

Engineering rule: ROI maps are time-varying control inputs. Unfiltered ROI updates can masquerade as “content complexity changes” and destabilize rate control.
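
Filtering an ROI control input can be as simple as ignoring sub-threshold jitter (hysteresis) and snapping accepted updates to the block grid. A sketch with invented thresholds, where a box is (x, y, w, h):

```python
def stabilize_roi(prev_box, new_box, min_move=8, block=16):
    """Hysteresis: keep the old ROI for small jitter; snap real moves to grid."""
    moved = max(abs(a - b) for a, b in zip(prev_box, new_box))
    if moved < min_move:
        return prev_box                   # sub-threshold jitter: no RC impact
    snap = lambda v: (v // block) * block
    return tuple(snap(v) for v in new_box)

kept    = stabilize_roi((64, 64, 128, 128), (66, 65, 128, 128))   # jitter
snapped = stabilize_roi((64, 64, 128, 128), (100, 64, 128, 128))  # real move
```

Rate limiting (accepting at most N ROI updates per second) layers on top of this in the same spirit: the encoder sees a slow, grid-aligned control input instead of per-frame noise.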

Interfaces required (what signals must exist)

QP map / ROI priority map input
Per-block or per-CTU weight/offset used by the encoder/RC to bias bit allocation.
Tile / slice partition control (optional)
A structural path to localize ROI and reduce spatial blast radius (concept-level).
ROI metadata sideband (traceability)
Record ROI definition/source/version so inspection evidence remains auditable.

Failure modes → evidence → first fix

Symptom: ROI boundary flicker / shimmering
Evidence: ROI edge region shows frequent QP toggles; edge/contrast oscillates frame-to-frame
Likely cause: ROI mask updates too fast; boundary not aligned/smoothed
First fix: rate-limit ROI updates; apply hysteresis; align ROI to block grid; stabilize boundaries
Symptom: ROI drift → bitrate jitter (bits/frame spikes)
Evidence: bits/frame spikes correlate with ROI movement or ROI size changes
Likely cause: RC treats ROI changes as content complexity jumps
First fix: constrain ROI motion/area changes; cap ROI boost; reduce ROI update frequency
Symptom: ROI not traceable in replay/audit
Evidence: cannot reconstruct which ROI map/version was used for a given segment
Likely cause: missing ROI metadata sideband or inconsistent versioning
First fix: attach ROI definition hash/version + timestamps; keep ROI source consistent
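
The traceability fix can be sketched as a minimal sideband record: hash the ROI definition, attach source/version/timestamp, and store the record alongside the segment. Field names here are assumptions, not a standard:

```python
import hashlib
import json

def roi_sideband(roi_def, source, version, ts_ns):
    """Auditable record: which ROI map produced this segment, and when."""
    payload = json.dumps(roi_def, sort_keys=True).encode()
    return {"roi_hash": hashlib.sha256(payload).hexdigest(),
            "roi_source": source, "roi_version": version, "ts_ns": ts_ns}

rec = roi_sideband({"boxes": [[64, 64, 128, 128]], "levels": ["A"]},
                   source="tracker", version="1.3", ts_ns=1_700_000_000)
```

Because the hash is computed over a canonical serialization (`sort_keys=True`), replay/audit can verify that two segments used byte-identical ROI maps without storing the maps twice.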

Figure F7 — ROI QP map injection (control + traceability + test points)

[Figure: block diagram of the encode control path. An ROI generator (static / dynamic / multi-level) produces a per-block QP/priority map that feeds rate control (QP decision + stability) and the encoder core (structure + transform + entropy, ROI-aware allocation) on the way to the bitstream. Stability protections on the control path: rate-limit ROI updates, hysteresis, align to block grid. An ROI metadata sideband carries ROI hash/version, ROI source, and timestamps for traceability. Probes mark test points for ROI quality (edge/contrast, OCR/detection) and control stability (bits/frame jitter, ROI QP delta).]
F7 shows three critical “hooks”: (1) a QP/priority map interface for ROI biasing, (2) rate-control stability protections (rate limiting / hysteresis), and (3) ROI metadata sideband for auditability. Probes highlight test points for ROI quality and control stability.
Cite this figure: Compression / Codec for Machine Vision — “ROI QP map injection (F7)”, ICNavigator.

H2-8. DDR Buffering Inside the Encode Pipeline (What to Buffer, What to Avoid)

Core idea: internal buffering sets tail latency (bounded delay is the spec)

DDR buffering inside the encode pipeline is valuable for throughput, but it is also the primary source of tail latency. The goal is not “more buffer”; the goal is to keep queue depth observable and bounded so latency remains explainable under scene complexity changes.

Key terms: line / frame / DPB buffers • queue depth • tail latency • DDR read/write pressure

Three buffer classes (function → risk → evidence)


Line buffer (rows)
  • Buffers: row/stripe data for block processing and pacing.
  • Main risk: stalls under bandwidth contention (usually localized).
  • Evidence to track (concept-level): row/stripe stall counters; input/output pacing mismatch indicators.
Frame buffer (frames)
  • Buffers: full frames for pre-processing and pipeline decoupling.
  • Main risk: queue depth grows, so latency increases frame by frame.
  • Evidence to track (concept-level): frame queue depth; enqueue/dequeue rates; max encode latency (tail).
Reference / DPB (refs)
  • Buffers: reference frames used for prediction dependencies.
  • Main risk: deep refs increase DDR reads and sensitivity; longer dependency tail.
  • Evidence to track (concept-level): DPB occupancy (concept); ref read bandwidth pressure; artifact persistence under stress.

Symptoms → the internal cause chain (stay inside the encoder boundary)

Symptom: latency grows with scene complexity
Internal chain: complexity ↑ → bits/frame ↑ / compute ↑ → queue depth ↑ → tail latency ↑
Symptom: occasional drops / deadline misses
Internal chain: bursts exceed throughput → buffers fill → output misses bound → frames dropped downstream
Symptom: bitrate spikes and jitter “feel random”
Internal chain: control inputs (structure/ROI/scene changes) create bursts → DDR pressure ↑ → queue depth oscillates

This section intentionally avoids storage controller and network protocol details; the focus is DDR buffering inside the encode pipeline.

Buffer map “Do / Don’t” checklist

Do
  • Cap queue depth to enforce bounded latency (validate with max latency, not average).
  • Reduce reference depth in low-latency modes to cut DPB DDR reads.
  • Expose occupancy/queue stats (even if concept-level) so latency is explainable.
Don’t
  • Overuse lookahead if bounded latency is required (hidden queueing grows tail).
  • Allow implicit reordering that adds unpredictable waiting (tail jitter appears).
  • “Fix with bigger buffers” without measuring queue depth (tail latency becomes unbounded).
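The "validate with max latency, not average" and "cap queue depth" rules can be made concrete with a stdlib-only sketch. The drain model, units, and cap below are illustrative assumptions, not an encoder's real buffer accounting:

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile; no numpy, so the sketch stays dependency-free."""
    s = sorted(samples)
    return s[max(0, math.ceil(p / 100.0 * len(s)) - 1)]

def latency_report(latencies_ms, bound_ms):
    """Judge a latency bound on max and p99 — never on the mean."""
    return {
        "mean": sum(latencies_ms) / len(latencies_ms),
        "p99": percentile(latencies_ms, 99),
        "max": max(latencies_ms),
        "bounded": max(latencies_ms) <= bound_ms,
    }

def simulate_queue(bits_per_frame, drain_bits_per_frame, cap_frames):
    """Toy frame-queue model: depth grows when encode bursts exceed the drain
    rate; a depth cap (in frames of drain) trades unbounded waiting for
    observable drops — the tradeoff the Do/Don't checklist describes."""
    depth, drops, max_depth = 0.0, 0, 0.0
    for bits in bits_per_frame:
        depth = max(0.0, depth + bits - drain_bits_per_frame)
        if depth > cap_frames * drain_bits_per_frame:
            drops += 1
            depth = cap_frames * drain_bits_per_frame
        max_depth = max(max_depth, depth)
    return {"max_depth_frames": max_depth / drain_bits_per_frame, "drops": drops}
```

A single 50 ms outlier in an otherwise 10 ms stream barely moves the mean but fails the bound check, which is exactly why average-based validation hides tail latency.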

Figure F8 — Encoder buffering map (DDR read/write pressure + queue depth probes)

Pipeline diagram: input FIFO → preprocess (format / ROI hooks) → encoder core (predict / transform / entropy) → bitstream FIFO → output, with a DDR frame store (frame queue depth) and a DDR DPB reference store (ref occupancy) attached via write/read paths. Probes mark queue depth (bounded latency depends on it) and DDR read/write pressure (ref reads + frame reads + writes).
F8 highlights where DDR buffering turns into tail latency: frame queue depth and DPB reference activity. Track queue depth and DDR pressure points to keep latency explainable and bounded.
Cite this figure: Compression / Codec for Machine Vision — “Encoder buffering map (F8)”, ICNavigator.

H2-9. Security & Watermarking: Protect Streams Without Breaking Determinism

Core idea: security must preserve bounded latency and recovery

Vision pipelines require explainable, bounded latency and predictable recovery. Security mechanisms must protect confidentiality and integrity while avoiding hidden buffering, long-tail latency, or degraded loss recovery.

Encryption (confidentiality) • Signing (anti-tamper) • Watermarking (traceability) • Determinism first

The security triad (what it provides — and what it can break)

Encryption: keep content secret
  • Goal: prevent leakage (unauthorized viewing / copying).
  • Risk: added processing delay, throughput overhead, harder partial recovery if applied too early.
Signing: prove it was not modified
  • Goal: anti-tamper and non-repudiation (prove origin and integrity).
  • Risk: verification gates can introduce buffering or replay failure if metadata isn’t consistent.
Watermarking: trace the source
  • Goal: traceability (device/batch/channel), resist casual redistribution.
  • Risk: if watermark changes visual complexity, rate-control and ROI evidence may be harmed.

Practical rule: the insertion point often matters more than the cryptography/watermark method.

Where to implement: bitstream vs packet vs metadata (impact surface differs)


Bitstream layer (sensitive)
  • Typical insertion: encrypt/sign the encoded stream structure (concept)
  • Most likely to break: random access & local recovery; may complicate “partial decode” paths
  • Best for: strong end-to-end secrecy where recovery constraints are well defined
Packet layer (often safe)
  • Typical insertion: encrypt + sign after packetization (concept)
  • Most likely to break: throughput overhead; tail latency if buffering/retry mechanisms exist
  • Best for: determinism-first real-time pipelines (security after encoder decisions)
Metadata layer (low-disruption)
  • Typical insertion: sign timestamps / config hashes / ROI versioning (concept)
  • Most likely to break: traceability only, and only when metadata is incomplete or inconsistent
  • Best for: auditability and anti-tamper evidence with minimal effect on RC/quality

What to measure (the “do not regress determinism” checklist)

ΔLatency and tail latency
Measure added delay and whether max/99th-percentile latency grows.
Overhead (throughput / bitrate tax)
Confirm security overhead does not push the pipeline into congestion regimes.
Loss resilience
Verify recovery time and error blast radius under packet loss do not worsen.
Replay & verification consistency
Ensure stable playback and deterministic verification results across runs.
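A minimal security-off vs security-on comparison makes this checklist executable. The nearest-rank p99 and the 2 ms tail budget are assumptions for the sketch, not a standard:

```python
import math

def _p99(samples_ms):
    """Nearest-rank 99th percentile."""
    s = sorted(samples_ms)
    return s[max(0, math.ceil(0.99 * len(s)) - 1)]

def determinism_regression(baseline_ms, secured_ms, tail_budget_ms=2.0):
    """Compare latency with security off vs on. A stable mean with a
    growing p99/max tail is still a determinism regression."""
    report = {
        "d_mean": sum(secured_ms) / len(secured_ms)
                  - sum(baseline_ms) / len(baseline_ms),
        "d_p99": _p99(secured_ms) - _p99(baseline_ms),
        "d_max": max(secured_ms) - max(baseline_ms),
    }
    report["pass"] = (report["d_p99"] <= tail_budget_ms
                      and report["d_max"] <= tail_budget_ms)
    return report
```

The point of gating on `d_p99` and `d_max` rather than `d_mean` is the first checklist item: the failure mode to catch is "average stays similar, tail grows".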

Common failures → first fix (stay inside this page’s boundary)

Symptom: latency tail grows after enabling security
Evidence: max/99th latency rises while average stays similar
First fix: move insertion later (prefer packet/metadata layers); cap any security-stage queueing
Symptom: loss recovery gets worse (longer corruption window)
Evidence: “one loss → long visible/analytic impact” increases
First fix: avoid early bitstream rewrites; preserve recovery-friendly boundaries; validate with controlled loss tests
Symptom: watermark harms ROI evidence (edges/text degrade)
Evidence: ROI edge/contrast or OCR/detection metrics drift after watermark enabled
First fix: switch to metadata-based traceability or reduce watermark aggressiveness; re-validate task metrics

Figure F9 — Security insertion points (safe vs risky for determinism)

Diagram: the output security chain runs bitstream out → packetize → encrypt → sign → transmit, all after encoder decisions (safe). A risk zone marks early/intrusive insertion (bitstream rewrite, pixel-domain watermarking) that may harm recovery, RC stability, and ROI evidence. Low-disruption sidebands carry the watermark key (key store/version) and metadata signing (timestamps, ROI version, config hash) for auditability without changing RC. Probes: Δlatency/tail, overhead/throughput, loss recovery/replay.
F9 emphasizes insertion timing. Packet- and metadata-layer protection typically preserves encoder determinism, while early intrusive changes can degrade recovery and ROI evidence. Always validate Δlatency (including tail), overhead, loss resilience, and replay consistency.
Cite this figure: Compression / Codec for Machine Vision — “Security insertion points (F9)”, ICNavigator.

H2-10. Validation & Field Debug Playbook: Symptom → Evidence → Isolate → Fix

Debug principle: make codec problems measurable and repeatable

This playbook converts “codec feels wrong” into a repeatable SOP. Each symptom is diagnosed using two first measurements, then isolated into one of a few root causes inside this page: rate control, GOP/refs structure, buffering/queue depth, ROI policy, or security insertion.

Symptom → Evidence • Discriminator • First fix knobs • Bounded latency

Universal SOP template (use this for every incident)

Step 1 — Symptom
Write one sentence describing the failure mode (what regressed, under which trigger).
Step 2 — First 2 measurements
Always pick the smallest set with high discrimination power:
  • QP distribution + bits/frame curve
  • VBV fullness (concept) + queue depth / buffer occupancy
  • GOP / refs stats (IDR cadence, B/reorder, ref count)
  • Security overhead (Δlatency, overhead ratio) when security is enabled
Step 3 — Discriminator
Use the two measurements to split the cause into: RC / Buffering / Structure / ROI / Security insertion
Step 4 — First fix
Apply one primary knob, then re-measure the same two signals (do not “change five things”).
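Step 3's discriminator can be written down as an explicit rule table so triage is repeatable across incidents. A sketch; the dictionary keys are illustrative, not a real telemetry schema:

```python
def discriminate(m):
    """Route the SOP's 'first two measurements' to one cause bucket:
    rate-control / buffering / structure / roi / security-insertion."""
    if m.get("security_enabled") and \
            m.get("tail_growth_ms", 0.0) > m.get("tail_budget_ms", 2.0):
        return "security-insertion"
    if m.get("queue_depth_trend") == "grows-with-complexity":
        return "buffering"
    if m.get("qp_swing") == "high":
        # Hard QP swings: RC instability, unless jittery ROI input drives it.
        return "roi" if m.get("roi_update_jitter") else "rate-control"
    if m.get("bits_spike") and m.get("qp_swing") == "stable":
        return "structure"
    return "inconclusive"
```

Encoding the discriminator this way also enforces Step 4's discipline: one bucket, one primary knob, then re-measure the same two signals.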

Top symptoms (each one stays inside this page’s boundary)

Symptom: Bitrate spikes / unstable output → congestion events
First 2 measurements: bits/frame + QP distribution
Discriminator:
  • If bits/frame spikes with stable QP → structure/scene triggers or ROI changes driving bursts
  • If QP swings hard → RC instability or ROI priority input jitter
First fix: cap burstiness (tighten VBV), rate-limit ROI updates, tune I/IDR cadence for stability
Symptom: Latency “sometimes high” (not a fixed delay)
First 2 measurements: queue depth + VBV fullness (concept)
Discriminator:
  • If queue depth grows with complexity → buffering tail latency problem
  • If VBV oscillates while queue depth is stable → RC behavior dominating
First fix: cap queue depth, reduce lookahead, reduce ref depth in low-latency modes
Symptom: Small defects disappear / ROI edges look “washed”
First 2 measurements: ROI edge/contrast metric + ROI vs background QP delta
Discriminator:
  • If ROI QP delta collapses under load → ROI budget not enforced
  • If ROI metric drops when security/watermark enabled → insertion is harming evidence
First fix: enable/strengthen ROI QP map, stabilize ROI, move traceability to metadata signing
Symptom: One loss event → long corruption / long recovery window
First 2 measurements: GOP/IDR cadence + ref count / reordering presence
Discriminator:
  • Long GOP + deep refs → error persists longer
  • Hidden reordering adds recovery delay and jitter
First fix: shorten GOP, increase IDR/CRA frequency, reduce refs, avoid reordering where determinism is required
Symptom: Security enabled → latency jumps / occasional replay or verify failures
First 2 measurements: Δlatency (incl. tail) + verification success rate
Discriminator:
  • If tail latency grows → security stage introduces buffering/queueing
  • If verification is inconsistent → metadata/versioning inconsistency
First fix: move protection to packet/metadata layers, simplify insertion path, sign stable metadata (ROI/config hashes)
Symptom: ROI region flickers / quality alternates frame-to-frame
First 2 measurements: ROI update rate + ROI QP delta variability
Discriminator:
  • High ROI update rate → boundary shimmer and control jitter
  • QP delta toggles → RC reacting to ROI input jitter
First fix: add hysteresis/rate limit to ROI updates; align ROI to block grid; cap ROI boost
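For the loss-recovery symptoms above, the "recovery window" (frames until the picture is clean again) can be computed from any per-frame quality series (PSNR, SSIM, or a task metric). A sketch with an assumed hold criterion, so a single good frame inside a mosaic does not count as recovery:

```python
def recovery_window(frame_quality, loss_frame, clean_threshold, hold_frames=3):
    """Frames from the loss event until quality stays at or above the
    threshold for hold_frames consecutive frames ('mosaic cleared')."""
    run = 0
    for i in range(loss_frame, len(frame_quality)):
        run = run + 1 if frame_quality[i] >= clean_threshold else 0
        if run >= hold_frames:
            # First frame of the clean run, relative to the loss event.
            return (i - hold_frames + 1) - loss_frame
    return None  # never recovered inside the trace
```

Running this over a controlled loss test before and after a GOP/refs change turns "recovery feels faster" into a number that can regress in CI.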

Figure F10 — Codec debug decision tree (screenshot-ready SOP)

Decision tree: start from the symptom, then check QP + bits/frame (hard QP swings → RC instability or ROI input jitter; bits/frame spikes → structure/ROI bursts), check VBV + queue depth (queue growth → buffering dominates tail latency), check GOP/refs (long corruption window → structure drives persistence), check ROI stability (flicker → ROI input jitter), and check security overhead (Δlatency + verify success → move insertion later, prefer packet/metadata layers). Each branch ends in a single first-fix knob.
F10 is designed for field use: start with QP/bits and VBV/queue depth, then isolate structure (GOP/refs), ROI stability, and security overhead. Each leaf ends with a single “first fix” knob to apply and re-measure.
Cite this figure: Compression / Codec for Machine Vision — “Codec debug decision tree (F10)”, ICNavigator.

H2-11. IC / Silicon Selection Pointers (Codec-Side, Evidence-First)

What this chapter owns (and what it does not)

This section is not a shopping list for PoE, interfaces, storage controllers, or cloud stacks. It focuses on codec-side silicon (SoC/ASIC/FPGA) selection: which encoder capabilities matter, how to size multi-stream throughput, and—most importantly—how to validate vendor claims using exportable telemetry (QP/RC/VBV/queue/errors).

Codec modes & profiles • Multi-stream sizing • Low-delay determinism • Telemetry hooks • Security/watermark hooks

Note on VVC (H.266): hardware encode availability is uneven across vendors and product generations. Treat “supports H.266” as a claim to verify via SDK capability queries and measurable throughput tests.

Five capability pillars to evaluate (codec-side only)

A) Codec set & constraints
Which codecs and modes are supported under real limits (bit depth, chroma, max res/FPS, Intra-only, lossless/near-lossless).
B) Concurrency (streams × resolution × fps)
Capacity must be stated per mode (low-delay, ROI, security enabled). Ask for a concurrency matrix, not a single headline number.
C) Low-latency determinism
No-B / no-reorder, slice/tile structure, intra refresh, bounded VBV, bounded queue depth—these define predictable latency and recovery windows.
D) Observability (telemetry hooks)
If QP/VBV/queue/errors cannot be exported, field debug becomes guesswork. Observability is a silicon selection criterion.
E) Security & traceability hooks (at codec boundary)
Prefer post-encoder insertion (packet/metadata layers) and stable metadata signing; avoid solutions that feed back into RC or break recovery.

Vendor checklist — 12 questions that force “verifiable answers”

No brand preference. Each question demands an artifact: a matrix, a log export, or a reproducible mode demonstration.

1) Codec set & mode constraints (3 questions)
  • Exactly which encode modes are supported? (H.264/H.265/(H.266?), Intra-only, JPEG-LS/lossless/near-lossless). Provide SDK capability output, not marketing slides.
  • What are the operating limits per codec? (max res/FPS, bit depth, chroma format). Provide a limit table per profile.
  • Which features are mutually exclusive? (e.g., lossless + ROI + low-delay). Provide a feature compatibility matrix.
2) Concurrency sizing (3 questions)
  • Provide a concurrency matrix: streams × resolution × fps per mode (low-delay on, ROI on, security on). No single-number answers.
  • What happens at saturation? (latency tail growth, frame drops, QP collapse). Provide telemetry logs showing the transition.
  • Is throughput stable across scene complexity? Provide bits/frame and QP distributions under easy vs hard content.
3) Determinism & recovery knobs (3 questions)
  • Can the pipeline enforce no-B / no-reorder? Provide a mode demo with GOP stats export (IDR cadence, ref depth, reorder flag).
  • Can VBV and queue depth be bounded? Provide VBV fullness + queue occupancy export and the documented limits.
  • Can recovery windows be controlled? (IDR/CRA cadence, intra refresh, slice/tile). Provide a loss test report showing recovery time.
4) Observability & traceability (3 questions)
  • Which telemetry signals are exportable? QP histogram, bits/frame, VBV fullness, queue depth, error/recovery counters, event logs (scene-change/IDR insert). Provide sample dumps.
  • Can telemetry be correlated with frames? (timestamps/sequence counters/metadata). Provide a correlation example.
  • Can stable metadata be signed for audit? (timestamp/config hash/ROI version). Provide a verification consistency demo (pass rate over repeated replays).

Example codec-focused silicon (MPNs) — use as reference options

The list below provides specific part identifiers commonly seen in vision cameras / edge boxes where hardware encode and telemetry/control matter. Always confirm the encode (not decode-only) profiles, exact limits, and licensing in the latest datasheet/SDK.

Category 1 — Vision camera SoC families (hardware encoder-centric)
  • Ambarella CV series: CV22 / CV25 / CV28 / CV2 / CV5 (vision SoCs with ISP + HW encode class capabilities)
  • Rockchip: RK3568 / RK3588 (embedded SoCs commonly used in edge boxes with HW video codecs)
  • NXP i.MX family: i.MX 8M Plus (IMX8MP) (embedded vision-oriented SoC class; validate encode modes in BSP)
  • MediaTek Genio: Genio 700 / Genio 1200 (embedded SoCs with multimedia blocks; validate deterministic low-delay controls)

Validation focus for this category: low-delay mode (no-B/no-reorder), bounded buffering knobs, ROI hooks, exportable QP/VBV/queue telemetry.

Category 2 — Edge compute modules (multi-stream encode engines)
  • NVIDIA Jetson modules: Jetson Orin Nano / Orin NX / AGX Orin (module families with dedicated encode engines; validate stream concurrency and telemetry access)
  • Intel embedded platforms: Atom x6000E series (platform family where media acceleration may exist; validate encode support and driver exposure)

Use when “many cameras → one box” is the topology. Selection hinges on concurrency matrix and whether per-stream stats are exportable.

Category 3 — FPGA platforms for deterministic custom pipelines (codec IP / structure control)
  • AMD/Xilinx Zynq UltraScale+ EV: XCZU7EV / XCZU9EV (device families often used for video pipelines + codec IP integration)
  • AMD/Xilinx Kintex UltraScale: XCKU040 / XCKU060 (codec/pipeline acceleration class devices)
  • Intel Agilex: AGF-series (high-throughput FPGA family; validate available codec IP and telemetry integration approach)

FPGA selection is justified when you must enforce hard determinism (structure, buffering caps) and expose deep observability signals.

Category 4 — Dedicated/accelerator-class encoder silicon (edge/server gateway use)
  • NETINT encoder ASIC class: T408 / Quadra family (hardware video encoding/transcoding class devices; validate low-delay and per-stream telemetry exports)
  • PCIe accelerator examples: AMD/Xilinx Alveo U30 (FPGA accelerator class; validate codec IP stack and observability access)

Use when the system requires high channel density and strict throughput isolation between streams.

How to validate any MPN quickly (minimum acceptance evidence)

Evidence A — Concurrency matrix
streams × resolution × fps, with flags: low-delay on • ROI on • security on
Evidence B — Telemetry export
QP histogram, bits/frame curve, VBV fullness (concept), queue depth, GOP/refs stats, error/recovery counters
Evidence C — Determinism proof
no-B/no-reorder mode demo + bounded queue depth behavior + recovery window under controlled loss
Evidence D — Security does not break determinism
Δlatency (incl. tail), overhead ratio, replay/verify pass rate, and unchanged encoder decision path
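Evidence A can be enforced mechanically in an acceptance script instead of trusting a headline number. A sketch with an assumed row shape for the vendor concurrency matrix; the tuple layout and `need` keys are illustrative:

```python
def accept_concurrency_claim(matrix_rows, need):
    """Accept a vendor concurrency claim only if some measured row meets
    the stream count / resolution / fps with ALL required flags enabled.
    Row shape (assumed): (streams, w, h, fps, low_delay, roi, security)."""
    for streams, w, h, fps, low_delay, roi, security in matrix_rows:
        flags_ok = ((low_delay or not need["low_delay"])
                    and (roi or not need["roi"])
                    and (security or not need["security"]))
        if (streams >= need["streams"] and w >= need["w"]
                and h >= need["h"] and fps >= need["fps"] and flags_ok):
            return True
    return False
```

The point of the flag check is the "no single-number answers" rule: a big headline measured with low-delay, ROI, and security all off does not count as evidence for a pipeline that needs them on.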

Figure F11 — Silicon capability checklist map (Requirements → Capabilities → Evidence)

Three-column map: requirements (bitrate budget, latency bound, ROI fidelity, loss recovery, traceability, stream concurrency) → silicon capabilities (codec set & profiles, rate-control knobs, low-delay structure, buffer model, ROI hooks, security hooks, telemetry hooks as the core) → evidence outputs (QP distribution export, bits/frame curve, VBV + queue occupancy, GOP/refs statistics, Δlatency + overhead, replay/verify pass rate, error & recovery counters).
Use this map in procurement reviews: each requirement must land on a codec-side capability block and produce an exportable evidence artifact. Telemetry hooks are highlighted because they determine whether field debug is engineering—or guessing.
Cite this figure: Compression / Codec for Machine Vision — “Silicon capability checklist map (F11)”, ICNavigator.


H2-12. FAQs (Codec Engineering, Evidence-Based)

Each answer stays inside this page’s evidence chain: GOP/RC/QP/VBV/ROI/buffering/security insertion/validation/telemetry. Every FAQ includes: 2+ measurements + a discriminator + a first fix knob.

Bitrate is stable but tiny defects disappear — RC or GOP structure? → H2-4 / H2-6
If bitrate is steady yet micro-defects vanish, the issue is usually bit allocation or predictive smoothing, not “too few bits.” Check QP histogram (ROI vs background) and bits/frame vs scene complexity. If QP rises in ROI during fine textures, tighten QP max or inject an ROI QP map. If long prediction chains blur evidence, shorten GOP or use Intra-only / near-lossless bound.
Latency spikes only on complex scenes — lookahead, VBV, or queue depth? → H2-3 / H2-8
Complexity-triggered spikes usually come from hidden buffering: lookahead, VBV refill, or queue growth. Correlate bits/frame with queue depth / buffer occupancy and any VBV fullness export. If spikes track queue depth, cap queue and reduce reference frames. If spikes track VBV fullness, shrink VBV and cap burst. If lookahead is enabled, disable or reduce it for low-delay determinism.
One lost packet causes long mosaics — how to limit error propagation? → H2-5
Long mosaics mean the prediction chain cannot recover quickly. Inspect GOP length, IDR/CRA cadence, and reference depth (how many frames depend on a lost block). Use a controlled loss test and record the recovery window (frames to clean). First fixes: shorten GOP, increase IDR/CRA frequency, limit references, and split frames into more slices/tiles so errors stop spreading across the whole picture.
ROI looks sharp but flickers — QP map noise or ROI boundary instability? → H2-7 / H2-4
ROI flicker is typically temporal instability in the ROI mask or overly aggressive QP deltas at the boundary. Measure ROI mask stability over time (area/position changes) and compare ROI QP variance against bits/frame burstiness. If mask jitter drives QP changes, add ROI smoothing/hysteresis (hold ROI for N frames). If boundary shimmer persists, cap QP delta, use multi-level ROI, or align ROI to tiles/slices for stable partitioning.
CBR still bursts — what does VBV actually constrain? → H2-4 / H2-3
CBR targets an average rate; VBV constrains short-term burst and delay by forcing the encoder to respect a buffer model. Verify with short-window bits/frame and VBV fullness (or any “buffer level” export). If bursts remain, the VBV is too large or burst caps are loose. First fix: shrink VBV size, set a stricter max burst, and avoid structures that create large instantaneous I-frame spikes (e.g., overly long GOP without slices).
Intra-only mode — when is it worth the bitrate hit? → H2-2 / H2-5
Intra-only is worth it when determinism, fast recovery, and analyzable detail outweigh bandwidth cost (inspection evidence, forensics, measurement). Compare latency tail (p95/p99) and loss recovery window versus inter-coded GOP. If inter coding blurs edges or makes recovery too slow, switch to Intra-only or a compromise: short GOP with frequent IDR/CRA, limited references, and more slices. Validate with ROI accuracy / false-negative rate.
Near-lossless — how to choose error bound without breaking inspection? → H2-6
Choose the near-lossless bound from the smallest defect signal that must survive. Measure a difference image / residual distribution and track inspection accuracy (false negatives) as the bound increases. If fine defects disappear first, the bound is too loose or not ROI-aware. First fix: tighten the bound globally, or apply a stricter bound in ROI while allowing a looser bound in background. Re-validate with the same defect set and compare edge/texture metrics across runs.
Encryption increased delay — where should security be inserted? → H2-9 / H2-3
Delay increases when security is inserted in a way that adds buffering or blocks scheduling. Measure Δlatency (including tail) and throughput/overhead with security on/off, and confirm loss recovery window does not worsen. Prefer insertion that is post-encoder (packet layer) or metadata signing so it does not feed back into RC decisions. First fixes: move encryption/signing after bitstream generation, reduce added buffering, and keep watermark keys/metadata in a stable side channel.
Watermarking breaks compression efficiency — what’s the safer approach? → H2-9 / H2-4
Efficiency drops when watermarking increases spatial complexity or destabilizes RC. Compare bits/frame and QP shift with watermark on/off, and check whether ROI QP worsens. A safer approach is post-encode watermarking (bitstream/packet layer) or signed metadata for traceability, so image content is not perturbed. First fix: avoid watermarking inside ROI, keep insertion post-encoder, and verify overhead stays bounded without changing GOP/RC behavior.
Why does enabling B-frames make latency unpredictable? → H2-3 / H2-5
B-frames often introduce reordering and deeper reference chains, which inflate tail latency and complicate recovery. Validate by exporting reorder depth / DPB usage and measuring p95/p99 latency under stable input FPS. If latency distribution widens, switch to no-B (P-only low-delay), shorten GOP, and limit references. To keep robustness, increase slices and use more frequent IDR/CRA so loss recovery remains bounded.
How to prove it’s codec config, not network jitter? → H2-10
Separate encoder-side variability from transport variability using two timelines: encoder egress timestamps/sequence counters and network receive timestamps. If egress timing already jitters, the codec is the source; correlate with queue depth and VBV fullness. First fix is to lock deterministic codec settings: no-B, fixed GOP, bounded VBV, capped queue. Only after stable egress should network jitter be blamed; otherwise root cause remains ambiguous.
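The two-timeline split described here reduces to comparing interval jitter on egress and receive timestamps. A minimal sketch; the 1 ms jitter budget and the peak-deviation metric are assumptions for illustration:

```python
def _interval_jitter(timestamps_ms):
    """Peak deviation of frame intervals from their mean interval."""
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    mean = sum(gaps) / len(gaps)
    return max(abs(g - mean) for g in gaps)

def jitter_source(egress_ts_ms, recv_ts_ms, budget_ms=1.0):
    """Two-timeline split: if encoder egress intervals already exceed the
    jitter budget, blame codec config before looking at the network."""
    egress_j = _interval_jitter(egress_ts_ms)
    if egress_j > budget_ms:
        return "codec", egress_j
    recv_j = _interval_jitter(recv_ts_ms)
    return ("network", recv_j) if recv_j > budget_ms else ("stable", recv_j)
```

Only after egress is proven stable does blaming the network become defensible, which is the ordering rule this answer insists on.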
What telemetry counters are must-have from the encoder? → H2-11 / H2-10
Minimum must-haves: QP histogram (ROI vs background), bits/frame curve, VBV fullness or buffer level, queue depth, GOP/IDR cadence, reference depth, and error/recovery counters. Add event markers: scene-change, IDR insert, buffer overflow/underflow, security insertion enabled. Without these exports, field debug becomes guesswork and procurement risk rises. First fix is to require telemetry in the acceptance checklist before committing to silicon.

Figure F12 — FAQ map to chapters (index)

Index map: FAQs Q1–Q12 (defects disappear, latency spikes, long mosaics, ROI flicker, CBR bursts, Intra-only, near-lossless bound, encryption delay, watermark overhead, B-frame latency, codec vs network jitter, must-have telemetry) route to chapters H2-3 through H2-11 (latency anatomy, rate control QP/VBV, GOP/slices/tiles, lossless/near-lossless, ROI encoding, DDR buffering map, security/watermark hooks, debug SOP, silicon/telemetry). Rule: each answer cites ≥2 evidence signals (QP, bits/frame, VBV, queue, GOP stats, recovery counters).
This index keeps FAQs from scope creep: each question routes to a codec evidence chapter (latency, RC, GOP, ROI, buffering, security, SOP, telemetry).
Cite this figure: Compression / Codec for Machine Vision — “FAQ map to chapters (F12)”, ICNavigator.