Compression / Codec for Machine Vision (H.26x, JPEG-LS)
← Back to: Imaging / Camera / Machine Vision
Compression/codec in machine vision is an engineering control loop: it must deliver measurable fidelity (edges/defects/ROI), bounded latency, and debuggable determinism via GOP/RC/VBV/ROI/buffering telemetry. The practical goal is to spend bits where evidence matters while keeping recovery, traceability, and security hooks from breaking predictability.
H2-1. What This Page Owns: Compression in Vision Systems (Boundary + Use Cases)
Engineering definition (extractable)
Machine-vision compression is the engineering of bounded latency, measurable inspection fidelity, and traceable, secure streams using H.26x or JPEG-LS/lossless pipelines. The objective is not cinematic quality; it is predictable timing, analyzable edges/textures/defects, and verifiable metadata—while minimizing bandwidth and storage.
What this page covers vs. what it only references
Covered here:
- Encoder pipeline knobs: rate control (QP/VBV), picture structure (GOP/references/slices), low-latency modes.
- Internal buffering logic: line/frame/reference buffering as it affects latency bounds and burst behavior.
- Inspection fidelity controls: lossless / near-lossless, ROI bit allocation, and how to validate “defect survives compression”.
- Security hooks: encryption/signing/watermarking insertion points without breaking determinism.
- Telemetry & validation: which counters prove “codec config” vs “content complexity”.
Referenced only (link-out):
- Machine-Vision Interfaces (PHY/SerDes/retimers, electrical trigger robustness).
- Sync/Trigger & Timing Hub (PTP/1588 distribution, jitter-cleaning PLL network).
- Local Buffering & Storage (SD/NAND/SSD controller, PLP hold-up, wear leveling/ECC internals).
- Power/EMC/I/O (PoE PD, isolated rails, surge strategy, grounding implementation).
Rule: when those topics are required, reference them by link; do not duplicate their core content in this page.
The measurable triangle: Bitrate ↔ Latency ↔ Fidelity
Use measurable evidence for each corner; avoid “looks better” language.
- Bitrate:
  - Average + peak bursts (bits/frame distribution; burst-to-average ratio).
  - Scene-change spikes and ROI-induced variability (needs counters, not guesswork).
- Latency:
  - min/typ/max; the bound (max) is often the real requirement.
  - Primary latency multipliers: B-frame reordering, lookahead, VBV fullness, deep queues.
- Fidelity:
  - ROI metrics: edge/texture preservation, fine-defect contrast, OCR/detector accuracy stability.
  - Lossless/near-lossless criterion: "compression error must not exceed the defect signal budget".
Engineering goal: choose knobs that keep all three corners within spec, not merely maximizing compression ratio.
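The bitrate corner of the triangle can be quantified directly from a per-frame bits log. A minimal sketch (the log format and field names are assumptions, not from any specific encoder SDK):

```python
def bitrate_evidence(bits_per_frame, fps):
    """Summarize per-frame bits into the three numbers the triangle needs:
    average bitrate, peak frame size, and burst-to-average ratio."""
    n = len(bits_per_frame)
    avg_bits = sum(bits_per_frame) / n
    peak_bits = max(bits_per_frame)
    return {
        "avg_bps": avg_bits * fps,            # average bitrate in bits/s
        "peak_frame_bits": peak_bits,         # largest single frame
        "burst_to_avg": peak_bits / avg_bits, # spikes here usually align with I-frames/scene changes
    }

# Hypothetical counter dump: one scene-change spike in an otherwise flat stream.
log = [50_000, 52_000, 48_000, 210_000, 51_000]
m = bitrate_evidence(log, fps=60)
```

A burst-to-average ratio well above 1 on an "average-looking" stream is exactly the case where average bitrate hides queueing pressure.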
Scope Guard (mechanical check)
Ctrl+F keywords verify that the page stays inside the boundary.
In scope: H.264/H.265/H.266, JPEG-LS, lossless, GOP, QP, VBV, low-latency, ROI, slices/tiles, metadata, encryption, watermarking.
Out of scope (link-out): CoaXPress/10GigE PHY tuning, PTP hub architecture, PoE PD design, NAND/SD controller internals, PLP hold-up, cloud/VMS.
Figure F1 — Vision encode chain boundary map
H2-2. Decision Tree: Choose Codec Family by “Task” Not Trend
Why task-first selection matters
Codec selection in machine vision should start from the inspection task, not from the newest standard. Different tasks impose different non-negotiables: some require lossless evidence, others require bounded latency, and some are primarily bandwidth-limited. Picking the codec family first often forces later “band-aids” (unstable rate control, unpredictable latency spikes, or ROI failures).
Minimum spec set (what must be known before choosing)
This list prevents “codec debates” without measurable inputs.
- Resolution and frame rate (e.g., 1920×1080 @ 60 fps).
- Bit depth: 8/10/12; and pixel format (grayscale, YUV 4:2:0 / 4:2:2 / 4:4:4).
- ROI definition: ROI area ratio, static vs dynamic ROI, and “must-preserve” feature type (edge/texture/defect/OCR).
- Latency requirement: specify max bound in milliseconds, not only typical.
- Loss tolerance: acceptable mosaic probability, recovery time after loss, and whether any frame loss is allowed.
- Traceability: whether metadata must be signed / watermarked; and whether replay must be bit-exact.
Acceptance metrics (what you will measure against):
- Bitrate: average + burst (bits/frame distribution, burst-to-average ratio).
- Fidelity: ROI edge/texture preservation and task accuracy delta (false/true detections, OCR confidence stability).
- Latency: min/typ/max + jitter; plus "queue depth" stability under scene changes.
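The minimum spec set above can be captured as a checkable record so that codec debates cannot start until every field is filled. A sketch; the field names are illustrative, not from any SDK:

```python
from dataclasses import dataclass, fields

@dataclass
class CodecSpecInput:
    """The minimum inputs that must be known before choosing a codec family."""
    width: int
    height: int
    fps: float                # frame rate
    bit_depth: int            # 8 / 10 / 12
    pixel_format: str         # e.g. "Y8", "YUV420", "YUV422"
    roi_area_ratio: float     # 0..1, fraction of frame that is ROI
    roi_dynamic: bool         # static vs dynamic ROI
    latency_max_ms: float     # the bound, not the typical
    loss_tolerance: str       # e.g. "none", "recover<100ms"
    traceability: str         # e.g. "signed-metadata", "bit-exact-replay"

def is_complete(spec: CodecSpecInput) -> bool:
    # Every field must be set; None means the spec discussion is premature.
    return all(getattr(spec, f.name) is not None for f in fields(spec))
```

Gating the selection meeting on `is_complete()` is a cheap way to enforce the "no measurable inputs, no codec debate" rule.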
Task → constraints → recommended codec path (engineer-oriented)
| Task type | Hard constraints (what cannot break) | Recommended path (codec family + structure + verify) |
|---|---|---|
| Measurement / Metrology | Evidence must not be altered; repeatable replay; drift must be diagnosable. Latency can be moderate, but fidelity is absolute. | JPEG-LS lossless or near-lossless with an explicit error bound; prefer an intra-only structure. Verify: difference map / error-bound compliance + ROI measurement repeatability. |
| Inspection / Detection | ROI edges/textures/defects must survive compression; bounded latency required for closed-loop lines. Burst behavior must be controlled. | H.264/H.265 low-latency with ROI allocation (QP map / tiles/slices) or near-lossless ROI; keep GOP short and limit references. Verify: ROI accuracy delta + latency max + burst-to-average ratio. |
| Remote Monitoring | Bandwidth-limited; tolerates some quality loss. Latency target is typically "low enough", but stability still matters. | H.265 (or H.264 if silicon-constrained) with controlled bitrate (VBV); use a moderate GOP; avoid deep lookahead if latency variance matters. Verify: bitrate stability under scene changes + acceptable visual/ROI degradation. |
| Evidence / Compliance | Non-repudiation; tamper resistance; consistent replay; fidelity requirements are driven by audit policy. | Lossless or constrained-loss + signing/watermarking hooks placed to preserve determinism; prefer frequent recovery points (IDR/CRA, limited propagation). Verify: signature validity + replay consistency + recovery time after loss. |
Fast “Do / Don’t” (prevents common mistakes)
- Do specify a latency max bound; “low latency” without a bound is not testable.
- Do measure burst (bits/frame), not only average bitrate; bursts drive queueing and missed deadlines.
- Don’t enable B-frames/lookahead when determinism is required; it often creates unpredictable latency tails.
- Don’t judge inspection quality by PSNR alone; include ROI task accuracy or edge/texture preservation checks.
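The task-first decision table above can be encoded directly, so the selection is a lookup on the task, not a debate about standards. A hedged sketch of that mapping (the task keys and field names are illustrative):

```python
def codec_path(task: str) -> dict:
    """Map an inspection task to the codec path recommended in the table above."""
    table = {
        "metrology": {
            "family": "JPEG-LS (lossless / near-lossless with explicit error bound)",
            "structure": "intra-only",
            "verify": ["error-bound compliance", "ROI measurement repeatability"],
        },
        "inspection": {
            "family": "H.264/H.265 low-latency + ROI QP map (or near-lossless ROI)",
            "structure": "short GOP, limited references",
            "verify": ["ROI accuracy delta", "latency max", "burst-to-average ratio"],
        },
        "monitoring": {
            "family": "H.265 (or H.264 if silicon-constrained) with VBV-capped bitrate",
            "structure": "moderate GOP, shallow or no lookahead",
            "verify": ["bitrate stability under scene changes", "acceptable ROI degradation"],
        },
        "evidence": {
            "family": "lossless / constrained-loss + signing and watermarking hooks",
            "structure": "frequent IDR/CRA recovery points",
            "verify": ["signature validity", "replay consistency", "recovery time after loss"],
        },
    }
    return table[task]
```

The value of writing it down as code is that every path carries its own `verify` list, which keeps the acceptance criteria attached to the choice.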
Figure F2 — Codec choice decision tree (task-driven)
H2-3. Latency Anatomy: Where the Milliseconds Hide
Core idea: bounded latency is an engineering budget, not a slogan
In machine vision, the requirement is usually not “low latency” but bounded latency (a max tail that stays within a control-loop deadline). The only way to guarantee a bound is to decompose latency into measurable segments, then remove or cap the mechanisms that amplify the tail: B-frame reordering, lookahead, VBV queueing, and deep buffer backlogs.
Three latencies (define what is controllable)
- L1 (encode latency): from frame arrival at the encode pipeline to bitstream-ready. Dominant multipliers: reordering, lookahead, reference/DPB access, internal queue depth.
- L2 (packetize/secure latency): from bitstream-ready to "ready-to-send" after packetization and optional encryption/signing. Dominant multipliers: pack queues, crypto throughput limits, fragment policy, backpressure.
- L3 (end-to-end latency): useful mainly for back-inference: if L1/L2 are stable but end-to-end drifts, the cause is likely outside the codec.
This page focuses on L1/L2 mechanisms and counters. Interface timing distribution, network stack behavior, and receiver-side buffering belong to other pages (link-out only).
Latency budget table (controllable vs. not) + what to log
Treat each row as a testable checkpoint. If the max bound breaks, identify which row grows first.
| Stage | Control | Tail amplifiers (most common) | Evidence to capture (counters / timestamps) |
|---|---|---|---|
| Line/Frame ingress | Partial | Deep input queues; frame pacing mismatch; sudden scene complexity | Queue depth; frame counter continuity; input arrival timestamps |
| ME/RDO | Yes | Lookahead; heavy motion search; content-dependent compute spikes | Encode-start/encode-end timestamps; lookahead depth; per-frame compute time |
| Entropy | Yes | High-detail frames increase bits; backpressure from VBV/pack | Bits/frame; entropy stage time; stall events |
| GOP reorder | Yes | B-frame reordering; DPB/reference depth; long GOP | Frame-id vs output-order; reorder depth; reference count |
| VBV / rate buffer | Yes | VBV fullness sustained high; large VBV hides bursts as delay | VBV fullness over time; burst-to-average ratio; skip/merge indicators |
| Pack / fragment | Partial | Packet queueing; fragmentation policy; downstream backpressure | Pack queue depth; egress timestamp; bytes/packet distribution |
| Encrypt / sign | Partial | Crypto throughput below peak; key ops stalling pipeline | Crypto stage time; crypto queue depth; drop/retry counts (if any) |
Practical rule: any “latency improvement” claim is incomplete unless it includes max and shows which row’s evidence improved.
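The "identify which row grows first" step can be automated once per-frame stage timestamps exist. A sketch, assuming a telemetry dump where each frame reports per-stage durations in milliseconds (the stage names and the log shape are assumptions about your instrumentation):

```python
def stage_tails(samples, q=0.99):
    """samples: list of per-frame dicts, stage name -> duration in ms.
    Returns an approximate q-quantile (tail) per stage."""
    out = {}
    for stage in samples[0].keys():
        vals = sorted(frame[stage] for frame in samples)
        out[stage] = vals[min(len(vals) - 1, int(q * len(vals)))]
    return out

def worst_regression(baseline_tails, current_tails):
    """Name of the stage whose tail grew the most vs the baseline run."""
    return max(current_tails, key=lambda s: current_tails[s] - baseline_tails[s])
```

Comparing a known-good baseline capture against a failing capture this way turns "latency got worse" into "the VBV/pack row's 99th percentile grew", which is the level of evidence the table asks for.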
Low-latency mode checklist (make the bound defensible)
- Disable B-frames (avoid reordering tail).
- Disable or cap lookahead (avoid hidden frame queues).
- Limit reference frames / DPB depth (reduce dependency backlog).
- Use a smaller VBV to prevent long “delay storage”.
- Cap internal queue depth; avoid deep FIFO “safety nets” that become latency tails.
- Prefer slices/tiles to reduce waiting and limit error propagation radius.
Report with every low-latency change:
- Latency: report min/typ/max and jitter.
- Burst: bits/frame distribution and burst-to-average ratio.
- Recoverability: time-to-recover after a loss event (shorter tail = smaller blast radius).
Figure F3 — Latency budget pipeline (timeline + reduction knobs)
H2-4. Rate Control That Engineers Can Debug (CBR/VBR/Low-Delay VBV)
Core idea: rate control is a closed loop with observable signals
Rate control (RC) is often treated as a black box, but in machine vision it must be debuggable. The job of RC is not only meeting a target bitrate; it is controlling burst behavior and protecting inspection-critical ROI while keeping latency bounds intact. If RC hides bursts by buffering (VBV), the system “pays” in tail latency and quality volatility.
Minimum evidence set: 4 counters that separate “RC instability” from “content complexity”
- QP distribution (ROI vs background):
  - Watch the high-QP tail: ROI degradation often appears as tail growth.
  - Compare ROI vs background QP (ROI should not be sacrificed when inspection is the priority).
- Bits/frame (not just average bitrate):
  - Average bitrate can look fine while bursts cause queueing and deadline misses.
  - Log the burst-to-average ratio and correlate spikes with scene changes and I-frame events.
- VBV fullness over time:
  - Sustained high fullness indicates RC is translating bursts into latency.
  - Low-delay modes require VBV to be capped and to recover quickly.
- Skip/merge statistics:
  - High skip/merge can explain "bitrate stable but details disappear".
  - Combine with ROI QP to confirm whether inspection features are being removed.
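The four counters can drive a first-pass discriminator between "RC instability" and "content bursts being paid for in latency". A sketch with illustrative thresholds (the field names and the cutoff values are assumptions to tune against your own telemetry, not standard values):

```python
def classify_rc(frames):
    """frames: per-frame dicts with qp_roi, bits, vbv_fullness
    (skip/merge stats would refine this further)."""
    bits = [f["bits"] for f in frames]
    avg = sum(bits) / len(bits)
    burst = max(bits) / avg
    qp_roi = [f["qp_roi"] for f in frames]
    qp_swing = max(qp_roi) - min(qp_roi)
    vbv_high = sum(f["vbv_fullness"] > 0.9 for f in frames) / len(frames)

    if qp_swing > 10 and burst < 2:
        # QP thrashes while bits stay flat: allocation problem, not content.
        return "rc-instability"
    if burst >= 3 and vbv_high > 0.3:
        # Bit bursts plus sustained-high VBV: bursts are being stored as delay.
        return "content-bursts-paid-in-latency"
    return "stable"
```

The point is not the exact thresholds but the shape of the evidence: QP swing without bit bursts points at the controller, bit bursts with saturated VBV point at content plus buffering.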
Debug cards: Symptom → Evidence → First knobs (codec-side only)
Symptom: ROI quality degrades while bitrate stays on target.
- Evidence: ROI QP tail grows; skip/merge increases; ROI accuracy/OCR confidence drops.
- Discriminator: if bits/frame stays flat but ROI metrics degrade, it is allocation (RC/ROI), not bandwidth.
- First knobs: set ROI QP map/priority, cap ROI QP max, tighten scene-change behavior, avoid "ROI drift" in ROI mapping.

Symptom: bit bursts turn into latency spikes.
- Evidence: bits/frame spikes align with scene changes or I-frame events; VBV fullness saturates; pack queue depth rises.
- Discriminator: if VBV stays high after a spike, the system is paying in latency tail.
- First knobs: reduce VBV, constrain I-frame size, adjust the scene-change threshold, shorten GOP, consider slices/tiles to reduce waiting.

Symptom: ROI quality oscillates frame-to-frame.
- Evidence: ROI QP oscillates frame-to-frame; the ROI boundary changes rapidly; bitrate remains within target but ROI accuracy varies.
- Discriminator: global PSNR can remain similar while ROI inspection fails; use ROI accuracy delta as the acceptance metric.
- First knobs: stabilize ROI map generation, use multi-level ROI (not binary), bind ROI to tiles/slices, avoid over-aggressive QP swings.
The “first knobs” above intentionally stay within this page: QP/VBV/ROI/GOP/slice. Interface/network/power/storage causes are handled by link-out pages, not duplicated here.
Knob → trade-off mapping (so tuning stays defensible)
- ROI priority / ROI QP cap:
  - Benefit: protects ROI detail and task accuracy.
  - Cost: bitrate may rise; background quality may drop if bitrate is constrained.
- Smaller VBV:
  - Benefit: reduces tail latency "stored in buffers".
  - Cost: less ability to smooth bursts; may require a stricter GOP/ROI strategy.
- Intra refresh / constrained refresh size:
  - Benefit: reduces burst severity and improves recoverability.
  - Cost: compression efficiency decreases; average bitrate may increase.
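The smaller-VBV trade-off has a useful back-of-envelope bound: a rate buffer that can hold B bits draining at R bits/s can convert a burst into up to B/R seconds of added delay. A one-line sketch:

```python
def vbv_worst_added_delay_ms(vbv_bits: float, bitrate_bps: float) -> float:
    """Worst-case delay a full VBV can add: buffer capacity over drain rate."""
    return 1000.0 * vbv_bits / bitrate_bps

# Example: a 2 Mbit VBV draining at 8 Mbps can store up to 250 ms of delay.
delay = vbv_worst_added_delay_ms(2_000_000, 8_000_000)
```

This is why "make the VBV bigger" is never free in a bounded-latency system: the extra headroom is exactly the extra worst-case tail.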
Figure F4 — Rate control loop (observable signals + injection points)
H2-5. Picture Structure: GOP, References, Slices/Tiles (Determinism vs Efficiency)
Core idea: picture structure controls latency bound and error radius
In machine vision, picture structure is not a “codec preference”; it is a contract for bounded latency, error propagation radius, recovery time, and efficiency. Long dependency chains (deep references, long GOPs) can improve compression, but they also increase the blast radius when a unit is lost or corrupted.
4 knobs × 4 outcomes (use as a design/validation checklist)
The table below is intended to be testable: each knob must map to measurable outcomes and evidence logs. (Mobile: swipe horizontally.)
| Knob | Latency bound | Error radius | Recovery time | Efficiency | Evidence to log |
|---|---|---|---|---|---|
| GOP length (short vs long) | Short GOP reduces the worst-case dependency tail; long GOP can increase the tail under stress | Shorter chain limits how far errors persist; long GOP extends visible corruption | Short GOP recovers sooner (more frequent refresh opportunities) | Long GOP usually improves compression efficiency | GOP length, frame-type pattern, recovery frames-to-clear |
| IDR / CRA frequency (more vs fewer) | More frequent refresh points tighten bounds after disturbances | Refresh points cap propagation across time | Higher frequency improves "time-to-usable" after loss events | More refresh increases bitrate overhead | Refresh markers, bits/frame spikes at refresh, time-to-clear artifacts |
| Reference depth (limited vs deep) | Deeper references can increase queueing and tail sensitivity | Deeper dependencies increase how many frames are affected by a single break | Limited references shorten the "prediction memory" and speed recovery | Deep references can improve efficiency in repetitive scenes | Ref count, DPB depth, reorder depth (if present), artifact persistence |
| Slices / Tiles (coarse vs fine) | Finer partitioning can reduce waiting and improve pipeline behavior | Partitions localize damage: a bad unit affects a region, not the whole frame | Localized damage shortens "effective recovery" for the ROI | Partitioning can cost bits (overhead) and reduce efficiency | Partition map, region-level artifact logs, ROI recovery time |
Validation hint: always report max (not only average) for latency and recovery. “Tail is the spec.”
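The "recovery time" column is bounded by structure alone, before any content enters the picture. A sketch of that worst-case estimate (it assumes a loss can land anywhere in the GOP and that the next IDR/CRA fully clears propagated damage):

```python
def worst_case_recovery_frames(gop_length: int, idr_period: int) -> int:
    """Worst-case frames until the next clean refresh clears propagated damage.
    A loss landing just after a refresh waits almost a full refresh period."""
    refresh_period = min(gop_length, idr_period)
    return refresh_period - 1

def recovery_ms(gop_length: int, idr_period: int, fps: float) -> float:
    """Same bound expressed in milliseconds for comparison with the latency spec."""
    return worst_case_recovery_frames(gop_length, idr_period) * 1000.0 / fps
```

At 60 fps, a 60-frame GOP with one IDR per GOP implies nearly a full second of worst-case visible corruption; this is the number to report, not the average.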
Failure mode: one loss breaks the prediction chain (what to change first)
- Evidence: artifacts persist across many subsequent frames; recovery is slow. Likely structural cause: long GOP + deep references + insufficient refresh points. First structural fix: shorten GOP, increase IDR/CRA cadence, limit reference depth.
- Evidence: corruption spreads spatially into the ROI; ROI metrics collapse. Likely structural cause: no or insufficient slice/tile partitioning (damage is not localized). First structural fix: introduce slices/tiles to localize damage; bind ROI to partitions where possible.
- Evidence: tail latency grows; output timing becomes less predictable. Likely structural cause: deeper dependency/queue sensitivity amplifies tail behavior. First structural fix: simplify the structure (shorter GOP, limited refs, partitioning) before chasing minor RC tweaks.
Figure F5 — Error propagation vs GOP (time + region comparison)
H2-6. Lossless & Near-Lossless for Vision (JPEG-LS and Practical Constraints)
Core idea: lossless preserves evidence when defects sit near the noise floor
Lossless (and bounded-error near-lossless) matters when inspection evidence is subtle: micro-defects, fine textures, and measurement edges can sit close to the noise floor. In that regime, lossy compression error becomes indistinguishable from noise and can erase evidence, increasing false-negative risk. The trade-off is practical: lossless pushes higher and more variable bits/frame, raising buffering and throughput pressure.
Application-driven: what must remain faithful (and how to validate)
- Metrology / dimensional measurement:
  - Preserve: edges and geometry used for dimensional accuracy.
  - Breaks first (lossy): edge softening and local bias.
  - Validate: difference images around edges; measure the delta in edge/position error.
- Fine-defect inspection:
  - Preserve: low-contrast defect signatures close to the noise floor.
  - Breaks first (lossy): compression error masks defect amplitude.
  - Validate: defect miss-rate delta on the ROI; difference image shows defect disappearance.
- Evidence / audit:
  - Preserve: auditable evidence with minimal transformation.
  - Breaks first (lossy): disputed ambiguity (is it an artifact or real?).
  - Validate: checksum / bit-exact reproducibility; controlled re-decode comparison.
- Datasets / AI training:
  - Preserve: training/benchmark inputs without codec-imposed bias.
  - Breaks first (lossy): systematic texture loss and block artifacts.
  - Validate: compare feature/ROI metrics across versions; keep a lossless reference set.
Note: this section focuses on codec-side constraints and validation methods, not algorithm internals.
Near-lossless: bounded error budgeting + ROI-first protection
- Error budget:
  - Define an explicit error ceiling (bounded deviation) instead of a subjective "slightly lossy".
  - Use the bound as an acceptance criterion alongside task metrics (ROI miss rate / edge error).
- ROI-first protection:
  - Keep the ROI on a tighter error bound; allow larger error in the background.
  - Validate with ROI metrics, not only global PSNR-like scores.
- Throughput pressure:
  - Lossless often increases bits/frame variability → higher burst-to-average ratios.
  - Track bits/frame and buffer/queue occupancy to ensure latency bounds remain intact.
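The bounded-error acceptance check is mechanical once the ceiling is explicit: every pixel deviation must respect its region's bound. A minimal sketch over nested lists standing in for images (an ROI mask and two per-region bounds are assumed inputs):

```python
def check_error_bounds(orig, decoded, roi_mask, roi_bound, bg_bound):
    """Near-lossless compliance: |orig - decoded| must not exceed the
    region's error ceiling (tighter bound inside the ROI)."""
    for row_o, row_d, row_m in zip(orig, decoded, roi_mask):
        for o, d, in_roi in zip(row_o, row_d, row_m):
            bound = roi_bound if in_roi else bg_bound
            if abs(o - d) > bound:
                return False
    return True
```

Running this on a difference image turns "slightly lossy" into a pass/fail gate that can sit next to the ROI miss-rate check in acceptance tests.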
Figure F6 — Lossless vs lossy impact on defect evidence (signal + noise + compression error)
H2-7. ROI Encoding for Inspection: Keep Edges, Spend Bits Where It Matters
Core idea: ROI encoding is evidence budgeting, not “pretty video”
ROI encoding allocates bits to preserve task evidence (edges, text strokes, micro-defects) under bandwidth and latency constraints. Success is measured by ROI stability (edge/contrast retention) and task metrics (OCR/detection accuracy) — not by subjective overall sharpness.
ROI strategy “pick one of three” (and what it costs)
- Static ROI:
  - Use when: weld points, fixed OCR zones, repeatable inspection geometry.
  - Strength: stable control input → easier rate-control stability.
  - Risk: misses targets when the camera/fixture drifts; the ROI no longer covers the evidence.
- Dynamic (tracked) ROI:
  - Use when: moving parts, variable target position, tracking-driven inspection.
  - Strength: spends bits where targets move; saves background.
  - Risk: ROI jitter/drift drives bitrate jitter and boundary flicker.
- Multi-level ROI:
  - Use when: "core evidence" + "secondary context" + "background" exist simultaneously.
  - Strength: the most realistic budgeting; avoids binary-ROI artifacts.
  - Risk: frequent level switching causes bits/frame spikes unless rate-limited.
Engineering rule: ROI maps are time-varying control inputs. Unfiltered ROI updates can masquerade as “content complexity changes” and destabilize rate control.
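The "filtered control input" rule can be made concrete with a small rate-limiter plus hysteresis in front of the encoder's ROI hook. A sketch (class and parameter names are illustrative; real encoders take the ROI as a QP map or region list, but the filtering logic is the same):

```python
class RoiFilter:
    """Forward an ROI box change to the encoder only when it moves beyond a
    hysteresis band AND a minimum frame interval has elapsed."""

    def __init__(self, hysteresis_px=8, min_interval_frames=10):
        self.hyst = hysteresis_px
        self.min_interval = min_interval_frames
        self.current = None
        self.last_update = -10**9  # "never updated" sentinel

    def update(self, frame_idx, box):
        """box = (x, y, w, h). Returns the ROI actually sent to the encoder."""
        if self.current is None:
            self.current, self.last_update = box, frame_idx
        elif (frame_idx - self.last_update >= self.min_interval
              and max(abs(a - b) for a, b in zip(box, self.current)) > self.hyst):
            self.current, self.last_update = box, frame_idx
        return self.current
```

Small tracker jitter never reaches rate control, so QP-map changes stay attributable to real target motion rather than ROI input noise.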
Interfaces required (what signals must exist)
- An ROI map/mask input aligned to the encoder's block grid (QP map or tile/slice binding).
- ROI update-rate limit and hysteresis controls, so ROI changes do not masquerade as content-complexity changes.
- An ROI metadata sideband (definition hash/version + timestamps), so any segment can be traced to the ROI map that produced it.
Failure modes → evidence → first fix
- Evidence: the ROI edge region shows frequent QP toggles; edge/contrast oscillates frame-to-frame. Likely cause: ROI mask updates too fast; boundary not aligned/smoothed. First fix: rate-limit ROI updates; apply hysteresis; align the ROI to the block grid; stabilize boundaries.
- Evidence: bits/frame spikes correlate with ROI movement or ROI size changes. Likely cause: RC treats ROI changes as content-complexity jumps. First fix: constrain ROI motion/area changes; cap the ROI boost; reduce ROI update frequency.
- Evidence: cannot reconstruct which ROI map/version was used for a given segment. Likely cause: missing ROI metadata sideband or inconsistent versioning. First fix: attach an ROI definition hash/version + timestamps; keep the ROI source consistent.
Figure F7 — ROI QP map injection (control + traceability + test points)
H2-8. DDR Buffering Inside the Encode Pipeline (What to Buffer, What to Avoid)
Core idea: internal buffering sets tail latency (bounded delay is the spec)
DDR buffering inside the encode pipeline is valuable for throughput, but it is also the primary source of tail latency. The goal is not “more buffer”; the goal is to keep queue depth observable and bounded so latency remains explainable under scene complexity changes.
Three buffer classes (function → risk → evidence)
Mobile: swipe horizontally.
| Buffer | What it buffers | Main risk | Evidence to track (concept-level) |
|---|---|---|---|
| Line buffer (rows) | Row/stripe data for block processing and pacing | Stalls under bandwidth contention (usually localized) | Row/stripe stall counters; input/output pacing-mismatch indicators |
| Frame buffer (frames) | Full frames for pre-processing and pipeline decoupling | Queue depth grows → latency increases frame by frame | Frame queue depth; enqueue/dequeue rates; max encode latency (tail) |
| Reference / DPB (refs) | Reference frames used for prediction dependencies | Deep refs increase DDR reads and sensitivity; longer dependency tail | DPB occupancy (concept); ref read-bandwidth pressure; artifact persistence under stress |
Symptoms → the internal cause chain (stay inside the encoder boundary)
- Symptom: latency grows as scene complexity rises. Internal chain: complexity ↑ → bits/frame ↑ / compute ↑ → queue depth ↑ → tail latency ↑
- Symptom: frames are dropped downstream under bursts. Internal chain: bursts exceed throughput → buffers fill → output misses its bound → frames dropped downstream
- Symptom: queue depth oscillates without an obvious scene change. Internal chain: control inputs (structure/ROI/scene changes) create bursts → DDR pressure ↑ → queue depth oscillates
This section intentionally avoids storage controller and network protocol details; the focus is DDR buffering inside the encode pipeline.
Buffer map “Do / Don’t” checklist
Do:
- Cap queue depth to enforce bounded latency (validate with max latency, not average).
- Reduce reference depth in low-latency modes to cut DPB DDR reads.
- Expose occupancy/queue stats (even if concept-level) so latency is explainable.
Don't:
- Overuse lookahead if bounded latency is required (hidden queueing grows the tail).
- Allow implicit reordering that adds unpredictable waiting (tail jitter appears).
- "Fix with bigger buffers" without measuring queue depth (tail latency becomes unbounded).
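The first "Do" item, a depth-capped queue with an exported high-water mark, is small enough to sketch directly. The class and counter names below are illustrative, not from any encoder SDK:

```python
from collections import deque

class BoundedFrameQueue:
    """Frame queue whose depth cap enforces the latency bound.
    Overload is surfaced (rejected counter) instead of silently stored as delay."""

    def __init__(self, max_depth: int):
        self.q = deque()
        self.max_depth = max_depth
        self.high_water = 0   # exported occupancy stat: max depth ever reached
        self.rejected = 0     # frames refused rather than added to the tail

    def push(self, frame) -> bool:
        if len(self.q) >= self.max_depth:
            self.rejected += 1  # make the overload observable
            return False
        self.q.append(frame)
        self.high_water = max(self.high_water, len(self.q))
        return True

    def pop(self):
        return self.q.popleft() if self.q else None
```

Whether a rejected push means "drop the frame" or "stall the producer" is a system policy; the point is that either outcome is counted, so tail latency stays explainable.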
Figure F8 — Encoder buffering map (DDR read/write pressure + queue depth probes)
H2-9. Security & Watermarking: Protect Streams Without Breaking Determinism
Core idea: security must preserve bounded latency and recovery
Vision pipelines require explainable, bounded latency and predictable recovery. Security mechanisms must protect confidentiality and integrity while avoiding hidden buffering, long-tail latency, or degraded loss recovery.
The security triad (what it provides — and what it can break)
- Encryption (confidentiality):
  - Goal: prevent leakage (unauthorized viewing / copying).
  - Risk: added processing delay, throughput overhead, and harder partial recovery if applied too early.
- Signing (integrity):
  - Goal: anti-tamper and non-repudiation (prove origin and integrity).
  - Risk: verification gates can introduce buffering or replay failure if metadata isn't consistent.
- Watermarking (traceability):
  - Goal: traceability (device/batch/channel); resists casual redistribution.
  - Risk: if the watermark changes visual complexity, rate control and ROI evidence may be harmed.
Practical rule: the insertion point often matters more than the cryptography/watermark method.
Where to implement: bitstream vs packet vs metadata (impact surface differs)
Mobile: swipe horizontally.
| Layer | Typical insertion | Most likely to break | Best for |
|---|---|---|---|
| Bitstream (sensitive) | Encrypt/sign the encoded stream structure (concept) | Random access & local recovery; may complicate "partial decode" paths | When strong end-to-end secrecy is required and recovery constraints are well defined |
| Packet (often safe) | Encrypt + sign after packetization (concept) | Throughput overhead; tail latency if buffering/retry mechanisms exist | Determinism-first real-time pipelines (security after encoder decisions) |
| Metadata (low-disruption) | Sign timestamps/config hashes/ROI versioning (concept) | Traceability only, if metadata is incomplete/inconsistent | Auditability and anti-tamper evidence with minimal effect on RC/quality |
What to measure (the “do not regress determinism” checklist)
- Δlatency with security enabled vs disabled, including max/99th percentile, not only the average.
- Overhead ratio (bytes and compute) of the security stage.
- Recovery time under controlled loss with security enabled.
- Replay/verification pass rate across repeated runs.
- ROI/task metrics unchanged after enabling watermarking.
Common failures → first fix (stay inside this page’s boundary)
- Evidence: max/99th latency rises while the average stays similar. First fix: move insertion later (prefer packet/metadata layers); cap any security-stage queueing.
- Evidence: "one loss → long visible/analytic impact" becomes more frequent. First fix: avoid early bitstream rewrites; preserve recovery-friendly boundaries; validate with controlled loss tests.
- Evidence: ROI edge/contrast or OCR/detection metrics drift after the watermark is enabled. First fix: switch to metadata-based traceability or reduce watermark aggressiveness; re-validate task metrics.
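The first failure's evidence, "tail grows while the average looks fine", is worth automating as a regression gate run with security on vs off. A sketch (the sample format and the 1 ms budget are assumptions to set per system):

```python
def tail_ms(latency_samples, q=0.99):
    """Approximate q-quantile of per-frame latency samples (ms)."""
    vals = sorted(latency_samples)
    return vals[min(len(vals) - 1, int(q * len(vals)))]

def security_regresses_tail(lat_security_off, lat_security_on, budget_ms=1.0):
    """True if enabling encryption/signing grew the latency tail beyond budget.
    Comparing tails (not means) catches security-stage queueing that averages hide."""
    return (tail_ms(lat_security_on) - tail_ms(lat_security_off)) > budget_ms
```

Running this gate in CI for every security-configuration change keeps "determinism preserved" a measured claim rather than an assumption.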
Figure F9 — Security insertion points (safe vs risky for determinism)
H2-10. Validation & Field Debug Playbook: Symptom → Evidence → Isolate → Fix
Debug principle: make codec problems measurable and repeatable
This playbook converts “codec feels wrong” into a repeatable SOP. Each symptom is diagnosed using two first measurements, then isolated into one of a few root causes inside this page: rate control, GOP/refs structure, buffering/queue depth, ROI policy, or security insertion.
Universal SOP template (use this for every incident)
1) Describe: write one sentence describing the failure mode (what regressed, under which trigger).
2) Measure: always pick the smallest set with high discrimination power:
- QP distribution + bits/frame curve
- VBV fullness (concept) + queue depth / buffer occupancy
- GOP / refs stats (IDR cadence, B/reorder, ref count)
- Security overhead (Δlatency, overhead ratio) when security is enabled
3) Isolate: use the two measurements to split the cause into RC / buffering / structure / ROI / security insertion.
4) Fix & re-measure: apply one primary knob, then re-measure the same two signals (do not "change five things").
Top symptoms (each one stays inside this page’s boundary)
Symptom: bitrate unstable / bursty.
- First 2 measurements: bits/frame + QP distribution.
- Discriminator: if bits/frame spikes with stable QP, structure/scene triggers or ROI changes are driving bursts; if QP swings hard, it is RC instability or ROI priority-input jitter.

Symptom: latency grows under load.
- First 2 measurements: queue depth + VBV fullness (concept).
- Discriminator: if queue depth grows with complexity, it is a buffering tail-latency problem; if VBV oscillates while queue depth is stable, RC behavior dominates.

Symptom: ROI quality drops.
- First 2 measurements: ROI edge/contrast metric + ROI vs background QP delta.
- Discriminator: if the ROI QP delta collapses under load, the ROI budget is not enforced; if the ROI metric drops when security/watermarking is enabled, the insertion is harming evidence.

Symptom: slow recovery after loss.
- First 2 measurements: GOP/IDR cadence + ref count / reordering presence.
- Discriminator: long GOP + deep refs make errors persist longer; hidden reordering adds recovery delay and jitter.

Symptom: regression after enabling security.
- First 2 measurements: Δlatency (incl. tail) + verification success rate.
- Discriminator: if tail latency grows, the security stage introduces buffering/queueing; if verification is inconsistent, metadata/versioning is inconsistent.

Symptom: ROI flicker / boundary shimmer.
- First 2 measurements: ROI update rate + ROI QP delta variability.
- Discriminator: a high ROI update rate causes boundary shimmer and control jitter; QP-delta toggling means RC is reacting to ROI input jitter.
Figure F10 — Codec debug decision tree (screenshot-ready SOP)
H2-11. IC / Silicon Selection Pointers (Codec-Side, Evidence-First)
What this chapter owns (and what it does not)
This section is not a shopping list for PoE, interfaces, storage controllers, or cloud stacks. It focuses on codec-side silicon (SoC/ASIC/FPGA) selection: which encoder capabilities matter, how to size multi-stream throughput, and—most importantly—how to validate vendor claims using exportable telemetry (QP/RC/VBV/queue/errors).
Note on VVC (H.266): hardware encode availability is uneven across vendors and product generations. Treat “supports H.266” as a claim to verify via SDK capability queries and measurable throughput tests.
Five capability pillars to evaluate (codec-side only)
Vendor checklist — 12 questions that force “verifiable answers”
No brand preference. Each question demands an artifact: a matrix, a log export, or a reproducible mode demonstration.
1) Codec set & mode constraints (3 questions)
- Exactly which encode modes are supported? (H.264/H.265/(H.266?), Intra-only, JPEG-LS/lossless/near-lossless). Provide SDK capability output, not marketing slides.
- What are the operating limits per codec? (max res/FPS, bit depth, chroma format). Provide a limit table per profile.
- Which features are mutually exclusive? (e.g., lossless + ROI + low-delay). Provide a feature compatibility matrix.
2) Concurrency sizing (3 questions)
- Provide a concurrency matrix: streams × resolution × fps per mode (low-delay on, ROI on, security on). No single-number answers.
- What happens at saturation? (latency tail growth, frame drops, QP collapse). Provide telemetry logs showing the transition.
- Is throughput stable across scene complexity? Provide bits/frame and QP distributions under easy vs hard content.
3) Determinism & recovery knobs (3 questions)
- Can the pipeline enforce no-B / no-reorder? Provide a mode demo with GOP stats export (IDR cadence, ref depth, reorder flag).
- Can VBV and queue depth be bounded? Provide VBV fullness + queue occupancy export and the documented limits.
- Can recovery windows be controlled? (IDR/CRA cadence, intra refresh, slice/tile). Provide a loss test report showing recovery time.
4) Observability & traceability (3 questions)
- Which telemetry signals are exportable? QP histogram, bits/frame, VBV fullness, queue depth, error/recovery counters, event logs (scene-change/IDR insert). Provide sample dumps.
- Can telemetry be correlated with frames? (timestamps/sequence counters/metadata). Provide a correlation example.
- Can stable metadata be signed for audit? (timestamp/config hash/ROI version). Provide a verification consistency demo (pass rate over repeated replays).
Example codec-focused silicon (MPNs) — use as reference options
The list below provides specific part identifiers commonly seen in vision cameras / edge boxes where hardware encode and telemetry/control matter. Always confirm the encode (not decode-only) profiles, exact limits, and licensing in the latest datasheet/SDK.
Embedded vision SoCs (camera-side encode):
- Ambarella CV series: CV22 / CV25 / CV28 / CV2 / CV5 (vision SoCs with ISP + HW encode class capabilities)
- Rockchip: RK3568 / RK3588 (embedded SoCs commonly used in edge boxes with HW video codecs)
- NXP i.MX family: i.MX 8M Plus (IMX8MP) (embedded vision-oriented SoC class; validate encode modes in BSP)
- MediaTek Genio: Genio 700 / Genio 1200 (embedded SoCs with multimedia blocks; validate deterministic low-delay controls)
Validation focus for this category: low-delay mode (no-B/no-reorder), bounded buffering knobs, ROI hooks, exportable QP/VBV/queue telemetry.
Edge aggregation platforms (many streams into one box):
- NVIDIA Jetson modules: Jetson Orin Nano / Orin NX / AGX Orin (module families with dedicated encode engines; validate stream concurrency and telemetry access)
- Intel embedded platforms: Atom x6000E series (platform family where media acceleration may exist; validate encode support and driver exposure)
Use when “many cameras → one box” is the topology. Selection hinges on concurrency matrix and whether per-stream stats are exportable.
FPGA / adaptive-SoC options:
- AMD/Xilinx Zynq UltraScale+ EV: XCZU7EV / XCZU9EV (device families often used for video pipelines + codec IP integration)
- AMD/Xilinx Kintex UltraScale: XCKU040 / XCKU060 (codec/pipeline acceleration class devices)
- Intel Agilex: AGF-series (high-throughput FPGA family; validate available codec IP and telemetry integration approach)
FPGA selection is justified when you must enforce hard determinism (structure, buffering caps) and expose deep observability signals.
Dedicated encoder ASICs / accelerator cards:
- NETINT encoder ASIC class: T408 / Quadra family (hardware video encoding/transcoding class devices; validate low-delay and per-stream telemetry exports)
- PCIe accelerator examples: AMD/Xilinx Alveo U30 (FPGA accelerator class; validate codec IP stack and observability access)
Use when the system requires high channel density and strict throughput isolation between streams.
How to validate any MPN quickly (minimum acceptance evidence)
- Concurrency matrix: streams × resolution × fps, with flags: low-delay / ROI / security on.
- Telemetry export: QP histogram, bits/frame curve, VBV fullness (concept), queue depth, GOP/refs stats, error/recovery counters.
- Determinism demo: no-B/no-reorder mode demo + bounded queue-depth behavior + recovery window under controlled loss.
- Security overhead: Δlatency (incl. tail), overhead ratio, replay/verify pass rate, and an unchanged encoder decision path.
Figure F11 — Silicon capability checklist map (Requirements → Capabilities → Evidence)
H2-12. FAQs (Codec Engineering, Evidence-Based)
Each answer stays inside this page’s evidence chain: GOP/RC/QP/VBV/ROI/buffering/security insertion/validation/telemetry. Every FAQ includes: 2+ measurements + a discriminator + a first fix knob.