Edge Security / ZTNA Node Hardware Architecture
An Edge Security / ZTNA Node is an on-site policy enforcement point that turns identity and posture signals into inline, auditable dataplane actions—policy decision, crypto termination, and inspection—without relying on a distant core. “Done” means it can prove trust (secure/measured boot + attestation), hold performance under real traffic (Mpps/p99/sessions), and fail predictably (bypass/HA/update/rollback) with evidence logs.
Chapter 1 — What this node is at the edge (scope & boundary)
Role in one sentence (keep the boundary hard)
An Edge Security / ZTNA Node is an inline Policy Enforcement Point (PEP) deployed near MEC/edge workloads, turning identity and device posture into data-plane enforceable rules and session-bound access, while providing cryptographic termination and a verifiable device trust state (TPM/HSM-backed secure + measured boot).
Practical boundary: the node is judged by what it enforces on-wire (policy/session) and what it can prove (boot/attestation), not by cloud platform features.
Why the function must live at the edge
- Latency + tail control: policy enforcement and crypto termination at the near end avoids round-trip dependency on distant control planes and reduces p99 spikes during access bursts.
- Backhaul realism: when uplinks are constrained or expensive, local enforcement prevents “all traffic must hairpin to cloud” designs that waste bandwidth.
- Evidence in the field: edge deployments often need measurable, auditable trust (measured boot + attestation) to satisfy operational risk and compliance requirements.
Typical physical form (what arrives on the bench)
- 1U / short-depth rugged appliance form factor; multi-port high-speed Ethernet (10/25/50/100G common).
- Optional dual power inputs for site resilience; optional fail-open / fail-closed bypass depending on risk model.
- Clear internal split: traffic dataplane (NPU/ASIC + flow/policy/crypto/inspection) vs trust & management (TPM/HSM + secure boot chain + attestation + isolated admin).
What it is (in-scope for this page)
- Inline dataplane enforcement: parser → flow/session state → policy match → crypto offload → inspection → egress scheduling.
- Trust anchor you can verify: secure boot + measured boot, TPM/HSM-backed key handling, remote attestation hooks.
- Management isolation as an engineering requirement: least privilege, mTLS/RBAC, auditable changes and evidence paths.
What it is not (explicitly out-of-scope)
- Not a cloud SASE control plane or a generic “zero trust philosophy” overview.
- Not an operator 5GC/UPF deep dive (GTP-U, QoS flows, slicing internals).
- Not a timing architecture page (PTP/SyncE) and not a TAP/probe capture system.
- Not an edge rack/PDU/environment monitoring design guide.
Chapter 2 — Inline traffic path: where security functions sit
Why this map matters
An edge ZTNA node is best understood as a pipeline. Every feature—policy, crypto, inspection—occupies a specific stage in the inline path and consumes a measurable budget (Mpps, p99 latency, session state). This chapter pins each function to its stage so later chapters can reference a single backbone instead of repeating theory.
Reading rule: for each stage below, track function, state, bottleneck, and evidence.
Step 1 — Port/PHY (ingress/egress)
Function: frame ingress/egress, link negotiation, and PCS/FEC behavior that shapes effective throughput and stability.
State: per-port counters (FEC corrections, symbol errors), link training outcomes, pause/PFC events.
Bottleneck patterns: link flaps, “works at 10G but not 25G”, FEC mode mismatch, congestion backpressure that inflates tail latency.
Evidence: port error counters, re-train counts, pause/PFC stats, loss/CRC distribution by port and time window.
Step 2 — Parser / early classification
Function: L2–L4 header parsing and early classification to steer traffic into fast-path tables.
State: header-type hit rates, unknown/slow-path triggers, tunnel/extension header flags.
Bottleneck patterns: uncommon encapsulations forcing slow path, expensive parsing for variable headers, misclassification leading to policy misses.
Evidence: fast-path hit ratio, “unknown header” counters, slow-path CPU/NPU exception counts, per-class latency deltas.
Step 3 — Flow/session state
Function: build and maintain session/flow state (5-tuple + metadata) for stateful policy and secure access binding.
State: active sessions, new flows/s, eviction/aging events, table occupancy and collision pressure.
Bottleneck patterns: short-connection churn, flow table thrash, collisions/evictions causing drops or reclassification overhead.
Evidence: active/new/expired flow metrics, eviction reasons, table utilization heatmap, drop reason codes tied to conntrack.
Step 4 — Policy enforcement
Function: apply allow/deny/limit decisions using compiled policy artifacts derived from identity and device posture.
State: rule sets in dataplane memory, rule hit counters, per-tenant/session labels, policy version stamps.
Bottleneck patterns: large rule scale, frequent updates causing micro-stalls, non-atomic swaps leading to inconsistent enforcement.
Evidence: rule hit counters, policy update latency and failure logs, versioned rollbacks, per-tenant enforcement audits.
Step 5 — Crypto termination/offload
Function: bulk encryption/decryption and integrity checks; bind crypto state to sessions (keys, rekey timers, replay windows).
State: handshake rate, active crypto sessions, rekey schedules, error counters for MAC/auth failures.
Bottleneck patterns: handshake bursts saturating compute, rekey events causing p99 spikes, replay windows stressing memory/lookup.
Evidence: handshakes/s, active sessions, p99 latency during rekey, crypto error reasons, offload vs fallback ratios.
Step 6 — Inspection (DPI/IPS)
Function: content inspection and signature/policy matching on decrypted or pass-through traffic; enforce actions (drop/alert/limit).
State: signature set version, match/hit counters, inspection depth configuration, exception lists.
Bottleneck patterns: deep inspection reducing Mpps, signature explosion, false positives causing avoidable drops and operational noise.
Evidence: throughput with inspection toggles, signature hit distributions, drop reasons, inspection queue occupancy and time-in-stage.
Step 7 — Queueing & egress scheduling
Function: queueing, shaping, and scheduling that determine burst tolerance and tail behavior under mixed traffic classes.
State: queue occupancy, tail-drop events, shaping rates, per-class latency and drop counters.
Bottleneck patterns: bufferbloat, mis-sized queues, unfair scheduling during bursts, head-of-line effects when classes are mixed.
Evidence: queue depth traces, p99 latency vs load, drop counters by class, “burst drills” results with repeatable profiles.
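The staged path above can be sketched as a minimal pipeline model in which every stage either passes a packet on or records a drop with a stage-scoped reason code. This is an illustrative sketch, not a vendor API; the stage names and reason codes are assumptions made for the example.

```python
from collections import Counter

# Illustrative stage order for the inline path described above.
STAGES = ["phy", "parser", "flow", "policy", "crypto", "inspect", "queue"]

class Pipeline:
    """Toy inline path: each stage is a function returning None (pass)
    or a drop-reason string; drops are counted per stage for evidence."""
    def __init__(self, stage_fns):
        self.stage_fns = stage_fns            # {stage: fn(pkt) -> reason|None}
        self.drops = Counter()                # (stage, reason) -> count
        self.passed = 0

    def process(self, pkt):
        for stage in STAGES:
            reason = self.stage_fns.get(stage, lambda p: None)(pkt)
            if reason is not None:
                self.drops[(stage, reason)] += 1   # drop provenance, per stage
                return False
        self.passed += 1
        return True

# Example: a policy stage that denies one destination port; all else passes.
fns = {"policy": lambda p: "rule_deny" if p["dst_port"] == 23 else None}
pipe = Pipeline(fns)
pipe.process({"dst_port": 443})
pipe.process({"dst_port": 23})
print(pipe.passed, dict(pipe.drops))
```

The point of the shape is the evidence model: a later chapter can ask for "drop reason counters per stage" and this is exactly the data structure that answers it.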
Fail-open vs fail-closed (place it on the path, not as philosophy)
- Physical bypass (near Step 1): relay/NIC bypass keeps traffic moving when the node is unavailable; enforcement coverage changes and must be logged as evidence.
- Logical bypass (between Steps 4–6): selective feature bypass under fault (e.g., disable deep inspection) trades risk for availability; must be tied to explicit policy and audit trails.
Chapter 3 — Dataplane silicon choices: NPU vs ASIC vs FPGA vs DPU
The “spec sheet trap” behind small-packet collapse
A ZTNA node rarely fails because peak Gbps is too low. Real collapses happen when Mpps becomes the limiter: per-packet fixed work (parsing, table lookups, counters, crypto metadata, inspection bookkeeping) dominates at 64–256B packets. Once multiple features are enabled, the dataplane turns into a state + memory bandwidth problem, not a pure compute problem.
Verification principle: request performance under a real packet size mix with policy + crypto + inspection enabled, and require counters that prove where cycles and drops occur.
How each silicon type “pays” for ZTNA features
- NPU: flexible microcode for ACL/DPI/telemetry; performance depends on feature enablement and fast-path coverage.
- ASIC: fixed pipeline optimized for deterministic throughput/power; best when features and rules are stable across large deployments.
- FPGA: targeted acceleration (e.g., regex/DPI sub-blocks) or rapid iteration; cost is higher power, BOM, and verification complexity.
- DPU/SmartNIC: strong in host-adjacent or virtualization contexts; watch PCIe + memory bandwidth and queueing effects on tail latency (node-internal view only).
Comparison matrix (engineering decision lens)
Tip: treat this as a “questions to ask + acceptance evidence” checklist, not a marketing table.
| Dimension | NPU | ASIC | FPGA | DPU / SmartNIC |
|---|---|---|---|---|
| Peak throughput (what marketing shows) | High (feature-dependent) | Very high (deterministic) | Moderate–high (design-dependent) | Moderate–high (I/O- and host-integration-dependent) |
| Mpps at 64–256B (what breaks first) | Varies with microcode + fast-path hit rate | Strong and predictable if the feature set fits | Strong for targeted blocks; system Mpps depends on glue logic | Often limited by DMA/queues and memory traffic |
| Latency / p99 determinism | Good if fast path dominates | Best (pipeline stability) | Good for fixed designs; can degrade with complex fabric | Queueing and bus contention can inflate p99 |
| Programmability | High (microcode) | Low–medium (firmware knobs) | Very high (RTL/bitstream) | High (software dataplane frameworks) |
| Upgradeability (new features) | Strong, but validate the feature tax | Limited to what the silicon supports | Possible, but regression risk is high | Strong, but constrained by PCIe/memory |
| Power / thermal | Moderate | Best (perf/W) | Worst (often) | Moderate; depends on workload and I/O |
| BOM / supply risk | Medium | Medium–high (vendor lock) | High (cost + lifecycle) | Medium (platform variance) |
Practical acceptance tests (prevents “Gbps looks fine” surprises)
- Packet profile: report Mpps for 64B/128B/256B and a realistic mix (not only jumbo frames).
- Feature toggles: measure throughput and p99 with (A) policy only, (B) policy + crypto, (C) policy + crypto + DPI.
- State stress: drive new flows/s and verify table stability (evictions, collisions, cache miss behavior).
- Rule scale: grow rule count and verify update behavior (atomic swap, update latency, rollback evidence).
- Drop provenance: require “drop reason” counters per stage (parser/flow/policy/crypto/DPI/queue).
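The toggle-and-packet-size matrix above can be driven by a small harness. The `measure` function below is a stand-in stub (an assumption, not a real traffic-generator API); in practice it would drive the DUT and read Mpps/p99 back from counters.

```python
# Sketch of the feature-toggle / packet-size acceptance matrix.
# `measure` is a hypothetical stub standing in for a real traffic generator.
FEATURE_SETS = {"A": ["policy"],
                "B": ["policy", "crypto"],
                "C": ["policy", "crypto", "dpi"]}
PACKET_SIZES = [64, 128, 256]

def measure(pkt_size, features):
    # Stub numbers only: models the "feature tax" trend, not real hardware.
    base_mpps = 100.0 * 64 / pkt_size
    tax = 1.0 + 0.4 * len(features)
    return {"mpps": base_mpps / tax, "p99_us": 20.0 * tax}

def sweep():
    rows = []
    for name, feats in FEATURE_SETS.items():
        for size in PACKET_SIZES:
            rows.append({"set": name, "size": size, **measure(size, feats)})
    return rows

for row in sweep():
    print(row)
```

What matters is the output shape: one row per (feature set, packet size), so the acceptance report can show where Mpps falls and p99 rises as features are enabled, instead of a single "max Gbps" number.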
Chapter 4 — Crypto offload architecture (TLS/IPsec/WireGuard) and its real limits
Key idea: crypto is not one block—it is a resource system
In an edge ZTNA node, encryption performance is determined by two coupled planes: (1) handshake/control plane (public-key work, certificates, session creation) and (2) data plane (bulk encryption, integrity, sequence/replay protection). Real limits show up as handshake bursts, session table pressure, replay windows, and packet size mix—often long before peak Gbps is reached.
Common crypto mix at the edge (implementation-focused)
- TLS termination: bulk ciphers (AES-GCM / ChaCha20-Poly1305) plus certificate validation and session setup.
- IPsec ESP: per-SA state, sequence numbers, anti-replay windows, and rekey behavior under churn.
- WireGuard: lean dataplane with periodic rekeying and strict state expectations.
Boundary note: protocol theory is not the goal; the focus is where state, counters, and bandwidth/latency budgets are consumed inside the node.
Offload modes and what each one truly owns
- Full offload: bulk crypto + sequence handling + anti-replay window + per-session state. Best for stable, high-volume paths—but state scaling must be proven.
- Partial offload: bulk crypto only. Control logic and state bookkeeping remain elsewhere, so handshake/session churn can still bottleneck.
- Software fallback: rare algorithms, exceptions, and malformed flows fall back to software. If fallback rate rises, tail latency and throughput can collapse abruptly.
Symptom → likely cause → how to measure (field-proof pattern)
Symptom: logins/tunnels time out during busy periods
- Likely cause: handshake bursts saturate public-key/signature resources and session-creation queues.
- Measure: handshakes/s, handshake failure reasons, CPU/accelerator queue depth, session-creation latency.
Symptom: “Gbps looks fine” but p99 latency is unacceptable
- Likely cause: rekey events, session-table misses, replay-window updates, queue contention.
- Measure: rekey jitter, p99 under load, session cache hit rate, per-stage drop/queue counters.
Symptom: large packets pass but small packets collapse
- Likely cause: Mpps limit driven by per-packet crypto metadata and lookups.
- Measure: 64B/128B Mpps with crypto enabled, offload vs fallback ratio, per-packet CPU cycles (if exposed).
Evidence metrics to demand (turns claims into acceptance)
- handshakes/s: sustainable setup rate at target concurrency and certificate policy.
- active sessions: stable session capacity without eviction storms or p99 spikes.
- rekey jitter: tail impact during key rotation (not only steady-state throughput).
- p99 latency under load: measured with realistic packet size distribution and mixed flows.
- offload vs fallback: percentage of traffic handled by hardware vs slow path and the triggers for fallback.
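The evidence metrics above can be collapsed into a simple pass/fail gate for acceptance testing. All threshold values below are illustrative placeholders to be tuned per deployment, not recommendations.

```python
# Hedged sketch: an acceptance gate over the crypto evidence metrics above.
# Threshold values are placeholders, not recommendations.
THRESHOLDS = {
    "min_handshakes_per_s": 5_000,
    "max_fallback_ratio": 0.01,      # ≤1% of traffic on the software slow path
    "max_p99_us_under_rekey": 500,
}

def crypto_gate(metrics):
    """Return (ok, failures) for a dict of observed crypto metrics."""
    failures = []
    if metrics["handshakes_per_s"] < THRESHOLDS["min_handshakes_per_s"]:
        failures.append("handshake_rate")
    if metrics["fallback_ratio"] > THRESHOLDS["max_fallback_ratio"]:
        failures.append("fallback_ratio")
    if metrics["p99_us_under_rekey"] > THRESHOLDS["max_p99_us_under_rekey"]:
        failures.append("rekey_p99")
    return (not failures, failures)

ok, why = crypto_gate({"handshakes_per_s": 8_000,
                       "fallback_ratio": 0.03,
                       "p99_us_under_rekey": 300})
print(ok, why)   # fallback ratio above the 1% placeholder fails the gate
```

The failure list, not just the boolean, is the point: it names which claim (handshake rate, fallback ratio, rekey tail) broke, which maps directly to the "turns claims into acceptance" goal.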
Chapter 5 — Ethernet PHY/port subsystem: why ports decide real-world performance
Why “dataplane looks fine” but the node still fails in the field
In edge deployments, port behavior often defines throughput, latency tails, and stability more than the security pipeline itself. Multi-port density, high ambient temperature, long copper runs, and mixed optics can trigger training retries, FEC-induced delay/jitter, and flow-control side effects. Those effects appear as drops, retransmits, and tail latency spikes—before any NPU/crypto ceiling is reached.
Port mix (10/25/50/100G) and what it costs inside a ZTNA node
- Power density: optics/PHY power stacks across ports; thermal headroom becomes a hard KPI limiter.
- Training sensitivity: equalization and link training margins shrink at higher rates and higher temperatures.
- PCS/FEC overhead: stronger FEC can reduce uncorrected errors but introduces extra latency and jitter variation.
- Media behavior: copper length/connectors and optics/module variance change error patterns and stability under load.
Boundary: only port/PHY/PCS/FEC effects that impact ZTNA throughput and latency are covered here (no optical panel planning).
Field symptom checklist → likely mechanism → evidence to collect
Symptom: link flaps / “only one rate is unstable”
- Likely mechanism: training margin collapse (temperature, module variance, cable/connectors).
- Evidence: retrain counters, LOS/LOL events, error bursts aligned with temperature/load.
- Acceptance idea: stability at target rate across temperature and sustained traffic.
Symptom: throughput OK, but p99 latency/jitter is bad
- Likely mechanism: FEC correction variability and buffer/queue interactions.
- Evidence: corrected/uncorrected trends, latency distribution under realistic packet mix.
- Acceptance idea: define p99 budget with FEC enabled (not only best-case).
Symptom: one noisy flow “freezes” unrelated traffic
- Likely mechanism: Pause/PFC misbehavior causing head-of-line (HOL) blocking.
- Evidence: Pause/PFC counters, queue occupancy spikes, drops clustered after pause storms.
- Acceptance idea: verify isolation under congestion (no global stall).
Symptom: intermittent drops/retransmits only at high load
- Likely mechanism: buffer exhaustion + microbursts + port-side error recovery.
- Evidence: RX/TX drops by reason, burst loss patterns, per-queue drops.
- Acceptance idea: reproduce with microburst tests and real packet-size distributions.
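The symptom checklist above can be automated as a rolling check over per-port counter samples. The counter names (`retrains`, `fec_uncorrected`, `pfc_pause_rx`) are illustrative assumptions; real names depend on the NIC/switch telemetry interface.

```python
def port_health(samples, max_retrains=0, max_uncorrected=0):
    """samples: list of per-interval counter dicts for one port.
    Flags instability per the symptom checklist above (illustrative names)."""
    issues = set()
    for s in samples:
        if s.get("retrains", 0) > max_retrains:
            issues.add("link_training_margin")   # flaps / rate instability
        if s.get("fec_uncorrected", 0) > max_uncorrected:
            issues.add("fec_exhausted")          # errors past FEC's reach
        if s.get("pfc_pause_rx", 0) > 0 and \
           s.get("queue_peak", 0) > s.get("queue_limit", 1):
            issues.add("hol_blocking_risk")      # pause storms + full queues
    return sorted(issues)

samples = [{"retrains": 0, "fec_uncorrected": 0},
           {"retrains": 2, "fec_uncorrected": 1}]
print(port_health(samples))
```

Correlating these flags with temperature and load windows (as the evidence bullets suggest) is what separates "module variance" from "thermal margin collapse".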
Chapter 6 — Root of trust: secure boot vs measured boot (TPM/HSM integration)
Secure boot vs measured boot (why both matter)
- Secure boot: only signed images are allowed to execute, blocking unauthorized firmware.
- Measured boot: each stage is measured and recorded (PCR + event log), enabling remote attestation of what actually booted.
- Engineering consequence: secure boot answers “can it run”; measured boot answers “what is it running, and can it be proven.”
TPM vs HSM / secure element (mapped to evidence needs)
- TPM strength: standardized PCR semantics and attestation ecosystem for verifiable measurements.
- HSM/SE strength: stronger key isolation and secure cryptographic operations (often higher assurance levels).
- Common pattern: TPM provides measurement/attestation evidence, while HSM/SE protects high-value private keys and signing operations.
Boundary: focus is on the node’s evidence chain and integration points (not a broad standards history).
Boot & measurement chain (attack surface + acceptance points)
- ROM: immutable first code. Acceptance: root key/hash anchored in ROM or fused.
- BL1/BL2: early bring-up and verification. Acceptance: signed verification + measurement recorded.
- UEFI / bootloader: platform init and policy. Acceptance: signed components + event log continuity.
- Kernel: OS base. Acceptance: measured kernel/initramfs and verified modules policy.
- Dataplane image: fast path code. Acceptance: hash bound to attestation; rollback protection evidence.
- Policy engine: enforcement logic. Acceptance: policy package identity bound to runtime measurements.
- Attestation report: exportable proof. Acceptance: verifier can match PCR + log to approved baselines.
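Measured boot's core mechanic is the PCR extend: PCR_new = H(PCR_old ‖ measurement). A verifier replays the event log against a baseline and compares the result with the quoted PCR value. The sketch below shows SHA-256 PCR-extend semantics; the stage names and image bytes are made-up examples.

```python
import hashlib

def extend(pcr: bytes, measurement: bytes) -> bytes:
    # TPM PCR extend semantics: PCR_new = SHA-256(PCR_old || measurement)
    return hashlib.sha256(pcr + measurement).digest()

def replay_event_log(events):
    """Replay measured-boot events (stage name, image bytes) into a PCR value."""
    pcr = b"\x00" * 32                       # PCRs start zeroed at reset
    for stage, image in events:
        pcr = extend(pcr, hashlib.sha256(image).digest())
    return pcr

# Hypothetical boot chain matching the stage list above.
boot_chain = [("BL2", b"bootloader-v2"), ("kernel", b"kernel-6.6"),
              ("dataplane", b"dp-image-1.4"), ("policy", b"policy-pkg-77")]
expected = replay_event_log(boot_chain)      # baseline from an approved build
quoted = replay_event_log(boot_chain)        # what the device attests
print("attestation ok:", quoted == expected)
```

Because extend is order-sensitive and one-way, any substituted or reordered stage changes the final PCR, which is exactly the "verifier can match PCR + log to approved baselines" acceptance point.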
Chapter 7 — Key lifecycle & secrets handling: provisioning, rotation, zeroization
What this chapter makes executable
Secrets must be treated as auditable assets. A deployable ZTNA node needs a lifecycle plan that prevents mass compromise (shared keys at scale), supports safe rotation during operation, and provides provable zeroization and retirement evidence without leaking secret material.
Lifecycle timeline (Day0 → Decommission)
- Day0 (factory): per-device identity + initial trust anchor; prevent “same key for many units.”
- Day1 (site): minimal onboarding; short-lived bootstrap credentials; cutover to long-term identity.
- Runtime: rotation + session material control; key wrapping; privileged action separation.
- Incident: deterministic zeroize triggers; preserve security evidence without exposing secrets.
- Decommission: provable retirement; device cannot rejoin production trust domain.
Day0 / Day1 checklist (prevent mass compromise at scale)
Day0 — Factory provisioning
- Unique device identity: serial-bound certificate chain (auditable sampling).
- No-export private keys: generated/held inside TPM/HSM/SE boundary.
- Injection record: provisioning events are recorded (what/when/which unit).
- Anti-clone signal: identity cannot be duplicated by copying firmware alone.
Day1 — Site onboarding
- Bootstrap is short-lived: one-time token or short-lived cert, then cutover.
- Privilege separation: installer actions do not expose long-term master secrets.
- Post-onboarding proof: a “cutover complete” event exists for audit.
- Default secrets forbidden: no shared passwords, no shared client certs.
Runtime controls (rotation, sessions, wrapping, privileged actions)
- Rotation: long-term keys/certs rotate via policy; every rotation produces an auditable event record.
- Session material: short-lived tickets/keys are bounded by table capacity and timeouts (no unbounded growth).
- Key wrapping: stored/transferred secrets are wrapped; unwrap occurs only inside the trust boundary.
- Privileged actions: high-value operations support separation of duties (M-of-N concept without operational sprawl).
Boundary: focus is on node-local handling and evidence traits (not broad PKI theory or platform-level workflows).
Incident & decommission (zeroization triggers + proof)
Zeroize triggers (deterministic)
- Tamper: chassis tamper flag or security boundary violation.
- Trust failure: boot/measurement mismatch or policy package signature failure.
- Admin abuse signals: repeated critical auth failures with audit thresholds.
- RMA/reset: controlled reset pathway requires proven wipe completion.
Evidence preservation (without secret leakage)
- Reason code: zeroize event includes cause ID and scope (which secret domain).
- Post-wipe state: key version reset / attestation state change is verifiable.
- Audit continuity: minimal log chain survives to prove wipe occurred.
- Retire lock: decommissioned units cannot re-enroll without controlled re-provisioning.
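The evidence-preservation bullets above can be modeled as a small, secret-free record: cause ID, scope, and post-wipe state, with an integrity digest but no key material. Field names and reason codes below are illustrative assumptions.

```python
import hashlib
import json
import time

# Illustrative reason-code table matching the zeroize triggers above.
REASONS = {1: "tamper", 2: "trust_failure", 3: "auth_abuse", 4: "rma_reset"}

def zeroize_event(reason_id, secret_domain, key_version_before):
    """Build an auditable zeroize record: cause, scope, and post-wipe state.
    No secret material is included (only a domain label and key version)."""
    evt = {
        "event": "zeroize",
        "reason_id": reason_id,
        "reason": REASONS[reason_id],
        "scope": secret_domain,                  # which secret domain was wiped
        "key_version_after": key_version_before + 1,
        "ts": int(time.time()),
    }
    body = json.dumps(evt, sort_keys=True).encode()
    evt["digest"] = hashlib.sha256(body).hexdigest()   # integrity stamp
    return evt

e = zeroize_event(2, "session_keys", key_version_before=7)
print(e["reason"], e["key_version_after"])
```

The key-version bump is the verifiable "post-wipe state" signal: an auditor can confirm the wipe happened without ever seeing what was wiped.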
Chapter 8 — Control/management isolation: RBAC, mTLS, OOB boundaries (without becoming a BMC page)
Why management plane is the most common failure mode
The fastest dataplane can still be defeated if management/control access is exposed through dataplane ports, shared credentials, or unaudited privileged actions. This chapter defines what isolation and evidence a ZTNA node must provide without turning into a facility management or BMC deep dive.
Must-have vs Never (short, hard, auditable)
Must-have
- Plane separation: management services bind only to mgmt interface/VRF; never on dataplane ports.
- mTLS control: admin/orchestrator access uses client certs bound to identity.
- RBAC: least privilege roles; privileged operations are explicitly gated and logged.
- Non-repudiable audit: audit logs have integrity proof (hash-chain/signature summary).
- Service minimization: expose only required ports/services; exportable exposure list exists.
Never
- Default/shared secrets: default passwords, shared client certs, shared bootstrap keys.
- Mgmt on dataports: API/SSH/agent reachable from traffic ports.
- Unrotatable certs: certificates that cannot rotate or expire safely.
- Silent privilege: role changes or policy pushes without an auditable event trail.
- Deletable audit: logs that a local admin can erase without leaving evidence.
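The "non-repudiable audit" requirement above can be sketched as a hash-chained log: each entry commits to its predecessor, so a local admin who edits or deletes an entry breaks verification of everything after it. This is a minimal sketch of the idea, not a production logger.

```python
import hashlib
import json

class AuditChain:
    """Append-only hash chain: entry N stores SHA-256(body_N || link_{N-1})."""
    def __init__(self):
        self.entries = []
        self._last = "0" * 64                    # genesis link

    def append(self, event: dict):
        body = json.dumps(event, sort_keys=True)
        link = hashlib.sha256((body + self._last).encode()).hexdigest()
        self.entries.append({"event": event, "link": link})
        self._last = link

    def verify(self) -> bool:
        prev = "0" * 64
        for e in self.entries:
            body = json.dumps(e["event"], sort_keys=True)
            if hashlib.sha256((body + prev).encode()).hexdigest() != e["link"]:
                return False
            prev = e["link"]
        return True

log = AuditChain()
log.append({"op": "role_change", "who": "admin1", "role": "auditor"})
log.append({"op": "policy_push", "version": 42})
print(log.verify())                          # intact chain verifies
log.entries[0]["event"]["who"] = "admin2"    # tampering...
print(log.verify())                          # ...is detected
```

In practice the chain head (or periodic summaries) would additionally be signed and exported off-box, so even truncating the whole log leaves evidence.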
Boundary: only node-local isolation and evidence are covered (not a full BMC or facility management guide).
Common field pitfalls (symptom → consequence → node evidence)
- Symptom: sudden certificate failures → Consequence: forced admin fallback → Evidence: clear reason codes + last successful handshake timestamp.
- Symptom: “works in lab, exposed in field” → Consequence: dataplane-adjacent mgmt entry → Evidence: exportable service/port binding list.
- Symptom: inconsistent admin behavior → Consequence: role drift → Evidence: RBAC change log + immutable audit chain.
Chapter 9 — Performance engineering: Gbps vs Mpps, latency budget, and feature tax
What “real performance” means for this node
Throughput numbers alone do not define performance. A ZTNA edge node is considered “fast enough” only when it sustains the target feature set (policy + crypto + DPI/IPS options) while meeting a tail-latency budget and maintaining the required session concurrency and rule/signature scale. Feature tax is non-linear: enabling security features changes the hot path, shifts bottlenecks, and can cause cliff-like collapses.
Metric map (what must be reported together)
- Throughput (Gbps): large-packet friendly; useful only when packet-size mix is specified.
- Mpps: exposes small-packet and multi-rule cost; reveals parser + lookup limits.
- p99 latency: captures queueing and feature tax; often the first SLO to break in the field.
- Concurrent sessions: bounds conntrack/session tables; impacts memory + lookup behavior.
- Rule scale: changes match path and hit rates; can increase cache misses and conflicts.
- DPI signature load: “enabled set” matters more than total library size; drives per-packet work.
Requirement: every benchmark must declare packet-size distribution and session distribution; single “max Gbps” results are not actionable.
Three common misreads (and how to prove the real cause)
Misread #1 — “Gbps is high, so it’s fast”
- Reality: small packets (64B) + many rules shift the limit to Mpps and lookup hot spots.
- Typical cause: flow-table conflicts, cache miss storms, multi-stage match expansion.
- Evidence: report Mpps + p99 under a packet-size mix and a rule-scale sweep.
Misread #2 — “DPI is on, throughput is okay”
- Reality: user experience fails first via tail latency and queue buildup, not average Gbps.
- Typical cause: per-packet work rises; egress queue/buffer policy amplifies jitter.
- Evidence: latency CDF (p50/p95/p99) + drop reasons by queue/pressure thresholds.
Misread #3 — “Crypto throughput is enough”
- Reality: handshake bursts and session-table growth can collapse control-plane capacity.
- Typical cause: key exchange spikes, rekey jitter, replay/session bookkeeping pressure.
- Evidence: handshake/s, active sessions, rekey jitter, and p99 during bursts.
Validation method (traffic model, not a single number)
- Step 1 — Define the traffic image: packet-size distribution + session duration + concurrency.
- Step 2 — Build a feature matrix: baseline → +crypto → +DPI/IPS (toggle-based sweeps).
- Step 3 — Output an audit bundle: Gbps, Mpps, p99, sessions, rules, signatures, drop reasons.
- Step 4 — Find the knee point: the first curve bend reveals the true feature tax and bottleneck.
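The knee point in the last step can be located mechanically: sweep offered load and flag the first point where the p99-vs-load slope jumps well above the initial slope. The `slope_ratio` threshold below is an illustrative placeholder.

```python
def knee_point(loads, p99s, slope_ratio=3.0):
    """Return the last load inside the safe envelope: the point after which
    the p99-vs-load slope exceeds `slope_ratio` x the initial slope.
    Heuristic sketch; threshold is a placeholder to tune per deployment."""
    base = (p99s[1] - p99s[0]) / (loads[1] - loads[0])
    for i in range(1, len(loads) - 1):
        slope = (p99s[i + 1] - p99s[i]) / (loads[i + 1] - loads[i])
        if base > 0 and slope / base >= slope_ratio:
            return loads[i]
    return None

loads = [10, 20, 30, 40, 50]          # Gbps offered
p99s = [100, 110, 120, 200, 500]      # µs observed
print(knee_point(loads, p99s))        # the curve bends after 30 Gbps
```

The returned load, together with the enabled feature set and the bottleneck counter that moved first, is exactly the "safe operating envelope" documentation the pass criteria ask for.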
Boundary: only node-internal bottlenecks are discussed (lookup, cache, queue/buffer, handshake bursts), not wide-area or platform architecture.
Chapter 10 — Availability & fail behavior: bypass, HA, and field-hardening
The edge fear: “a security box takes the whole site offline”
Availability for an edge security node starts with an explicit fail behavior, not with marketing uptime numbers. Define whether the site must remain connected (fail-open with controlled bypass) or must remain secure even if connectivity is sacrificed (fail-closed). Then choose bypass mechanisms, HA mode, and a minimal hardening set that is testable and auditable.
Scenario → choice → evidence (keep it auditable)
Scenario A — Security-first
- Choice: fail-closed (deny by default when health/trust fails).
- Evidence: fault injection proves blocking behavior + logged reason codes.
- Proof outputs: fail event ID, timestamp, and policy state at failure.
Scenario B — Connectivity-first
- Choice: fail-open with a defined bypass path (not an accidental exposure).
- Evidence: power-loss and crash tests still route traffic via bypass as designed.
- Proof outputs: bypass engaged reason + duration + recovery event.
Scenario C — High uptime with control
- Choice: active-standby HA with a clear state-sync boundary.
- Evidence: cutover tests under real session mix quantify p99 and reconnection rate.
- Proof outputs: switch log, health reason, and post-fail steady state confirmation.
Bypass options (mechanism → failure mode → acceptance points)
- Relay bypass: least dependent on software; verify power-loss, MCU hang, and firmware crash behaviors.
- Bypass-capable NIC: controlled switching + readable state; verify switchover window and flap resistance.
- Software bypass: a downgrade path only; never the last line of defense; verify it cannot mask silent failures.
Requirement: every bypass entry must produce an auditable event (reason code + timestamp) to avoid “silent open” exposure.
HA requirements (what this node must provide)
- State boundary: define what must sync (essential session/conntrack state) vs what can rebuild.
- Key boundary: secret material is wrapped for replication; unwrap only inside the trust boundary.
- Cutover trigger: watchdog/heartbeat signals have explicit thresholds and enter the audit chain.
- Acceptance tests: failover drills under real session distribution + measured p99 impact.
Field-hardening (minimal set, security-relevant)
- Watchdog: deterministic recovery from hangs; reboot reason is logged and exported.
- A/B update + rollback: failed upgrades roll back to known-good images with evidence.
- Crash evidence minimization: preserve debug signals without leaking secrets (scope tags + redaction).
- Fail-mode logging: bypass/failover/degrade events are integrity-protected and reviewable.
Validation & acceptance checklist: what proves it’s secure and ready
“Ready” means measurable, auditable, and repeatable: the node must export a security evidence chain, hit performance floors under a declared traffic model, survive signed updates/rollback, and behave predictably under fault drills. This section validates only node-local deliverables (boot/attest/keys/audit/perf/update/fail behavior).
Checklist — trust & evidence chain
- Secure boot chain is enforced for every stage (ROM → BL → UEFI/bootloader → kernel → dataplane image → policy package).
- Measured boot exports stage measurements (PCR / measurement log) suitable for remote verification (attestation).
- Attestation telemetry is observable: success rate, failure reason code, and which stage broke the chain.
- Key isolation is enforced: device identity keys are non-exportable; wrapped keys and session material follow a declared boundary.
- Audit integrity: security-critical events are tamper-evident (hash-chain or signed summaries) and exportable for forensics.
Pass criteria (typical starting targets; tune per deployment)
- Attestation: ≥ 99.9% success across 1,000 consecutive validations; failures must include a stable reason code and stage identifier.
- Boot integrity: any signature/measurement mismatch triggers defined behavior (block / degrade / bypass) and emits an audited event.
- Audit logs: policy pushes, role changes, bypass/HA transitions, zeroize events are always recorded and integrity-verifiable offline.
Example material numbers (reference only; verify lifecycle & compliance)
Root-of-trust building blocks commonly used for secure/measured boot and attestation.
- Infineon OPTIGA TPM: SLB9670VQ20FW785XTMA1
- NXP EdgeLock SE: SE050C2HQ1/Z01SDZ
- Microchip SE: ATECC608B-SSHDA-B
Secure-boot storage examples for signed images & rollback manifests.
- Winbond SPI NOR: W25Q256JVEIQ
- Macronix SPI NOR: MX25L25645GM2I-08G
- Micron SPI NOR: MT25QL512ABB8E12-0SIT
Checklist — performance & traffic model
- Traffic model is declared: packet-size distribution, session distribution, and concurrency (not “single max Gbps”).
- Feature matrix is measured: baseline → +crypto → +DPI/IPS → +full policy, with the same traffic model.
- Export a KPI bundle: Gbps, Mpps, p99 latency, concurrent sessions, rule scale, drops by reason.
- Knee point is identified (the first non-linear collapse) with bottleneck evidence (lookup miss, crypto burst, queue overflow).
Pass criteria (typical structure)
- Under the declared traffic model: p99 latency ≤ [X], sessions ≥ [Y], rule scale ≥ [Z].
- With required feature set enabled: throughput stays ≥ [Floor Gbps] and p99 does not exceed [Ceiling].
- Knee point documentation includes: load condition, enabled features, primary bottleneck, and recommended safe operating envelope.
Example material numbers (performance-critical plumbing)
High-speed signal integrity parts that often gate “real” Mpps/latency stability in multi-port edge nodes.
- HS retimer: TI DS280DF810
Representative 10/25GbE controller reference used in common adapters (controller-family indicator).
- 10/25GbE controller family: Broadcom BCM57414
Checklist — updates & rollback
- Signed update manifests exist for dataplane images and policy bundles; signature status is auditable on-device.
- A/B update + rollback path is implemented and rehearsed (not “should be possible”).
- Version & attestation continuity: after update/rollback, boot measurements and attestation still prove what is running.
- SBOM / image summary is provided as a signed artifact (brief, practical—no compliance encyclopedia).
Pass criteria (practical)
- Failed update recovers to last-known-good within [T] and emits an auditable event with reason code.
- Rollback preserves attestation validity and produces an exportable “what changed” delta (version + hash + signature state).
Four drills that expose real deployment failures
- Link jitter / micro-burst: verify p99, drops-by-reason, and no accidental bypass toggles.
- Certificate expiry / time skew: verify stable reason codes, controlled degradation, and recovery path (no “manual insecure workaround”).
- Rule explosion: sweep rule scale and show knee point + chosen guardrails (limits, prioritization, or staged updates).
- Key rotation burst: measure handshake/s, rekey jitter, and p99 impact while preserving audit integrity.
Pass criteria (repeatable format)
- Each drill must output: Expected → Observed → Audit evidence (event IDs, reason codes, timestamps).
- If bypass/HA triggers: record reason, duration, traffic impact, and recovery event (no silent transitions).
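The repeatable drill format above maps naturally onto a fixed record type. A minimal sketch (the `DrillResult` type and its fields are assumptions for illustration):

```python
from dataclasses import dataclass
from typing import List

@dataclass
class DrillResult:
    """One fault-injection drill in the Expected → Observed → Audit-evidence format."""
    drill: str             # e.g. "cert_expiry", "rule_explosion"
    expected: str
    observed: str
    event_ids: List[str]
    reason_codes: List[str]
    timestamps: List[str]

    def is_auditable(self) -> bool:
        # A drill without complete audit evidence cannot count as a pass,
        # even if the observed behavior matched expectations.
        return bool(self.event_ids and self.reason_codes and self.timestamps)
```

A bypass/HA transition that produces an empty `event_ids` list is exactly the "silent transition" the pass criteria forbid.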
Example material numbers (availability & bypass primitives)
Representative bypass NIC part numbers used in inline appliances (model-level references).
Bypass NIC: Silicom PE2G2BPI80
Bypass NIC: Silicom PE310G2BPI71
Representative watchdog timer ICs used for “predictable recovery” paths.
Window watchdog: TI TPS3436-Q1
Watchdog timer: ADI/Maxim MAX6369KA+T
FAQs (Edge Security / ZTNA Node)
Each answer stays node-local: dataplane placement, crypto limits, trust root evidence, key handling, management isolation, performance proof, and fail behavior. Platform-wide SASE/5GC topics are intentionally out of scope.
Where is the practical boundary between a ZTNA node and a traditional firewall/UTM?
A ZTNA node acts as a site-local policy enforcement point (PEP) where identity/device posture becomes enforceable session rules on the dataplane. A firewall/UTM is typically IP/port-centric and boundary-oriented. Evidence to check: which stage converts policy tokens into flow rules, and how “deny/allow/encapsulate” is executed in the inline pipeline. (See H2-1, H2-2.)
Why can a “100Gbps” node collapse under small packets and many sessions?
Line-rate Gbps does not guarantee packet-rate. With 64B packets and many concurrent sessions, bottlenecks shift to parsing, flow/policy lookups, cache misses, queue pressure, and feature work (crypto/DPI). Measure Mpps, p99 latency, drops-by-reason, and knee points across a realistic packet-size/session distribution, not a single throughput test. (See H2-3, H2-9.)
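The Gbps-vs-Mpps gap is simple arithmetic, assuming standard Ethernet on-wire overhead:

```python
def line_rate_mpps(link_gbps: float, frame_bytes: int) -> float:
    """Theoretical maximum Ethernet packet rate in Mpps.

    Each frame carries 20 bytes of fixed on-wire overhead beyond the
    frame itself: 7B preamble + 1B start-of-frame delimiter + 12B
    inter-frame gap.
    """
    bits_per_frame = (frame_bytes + 20) * 8
    return link_gbps * 1e9 / bits_per_frame / 1e6

# A "100 Gbps" port must sustain ~148.8 Mpps at 64B frames but only
# ~8.1 Mpps at 1518B frames -- roughly an 18x packet-rate gap, which is
# why a large-packet throughput test says nothing about lookup capacity.
```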
In TLS, where do handshake bottlenecks usually live—and how to isolate them?
Handshake limits typically come from (1) public-key ops (ECDHE/ECDSA/RSA), (2) session/conntrack table churn, or (3) interrupt/queue handling when bursts arrive. Isolate by separating handshake/s from bulk throughput, then correlating active sessions, CPU/queue metrics, and latency spikes during rekey events. A clean profile shows where the control path steals cycles from the dataplane. (See H2-4, H2-9.)
What are the real pitfalls of IPsec “full offload / partial offload / software fallback”?
Full offload can hide state limits (SA scale, replay window handling, sequence management) until edge cases appear. Partial offload often fails on boundary sync: control-plane SA updates do not match dataplane timing. Software fallback is the classic long-tail trap: uncommon algorithms or exception flows pin CPU and inflate p99 latency. Validation must include SA scale, replay stress, and rekey bursts under realistic packet-size distributions. (See H2-4.)
Why does enabling DPI/IPS often create latency jitter—and how to prove the root cause?
DPI/IPS adds per-packet work and changes queue dynamics: deeper inspection, signature set growth, and backpressure can turn micro-bursts into tail-latency spikes. Prove the cause with a feature matrix: baseline vs DPI/IPS with identical traffic models, then compare p99 latency, queue high-water marks, and drops-by-reason. If jitter scales with signature load or packet mix, the “feature tax” is confirmed. (See H2-9.)
How to choose fail-open vs fail-closed, and prove bypass does not add new risk?
Fail-closed protects policy integrity but may break site connectivity; fail-open preserves connectivity but must not become an invisible security hole. Prove safety by making bypass transitions auditable (reason code, duration, recovery event) and by drilling power-loss/crash scenarios. Hardware bypass (relay/bypass NIC) must be tested for deterministic behavior and verified through logs, not taken on trust. Example bypass NIC model: Silicom PE310G2BPI71. (See H2-10.)
Why can rollback/downgrade attacks still work even with secure boot—and how to stop them?
Secure boot blocks unsigned images, but rollback attacks abuse older signed versions with known vulnerabilities. Prevention needs anti-rollback policy: monotonic counters, version binding inside the signed manifest, and a boot chain that rejects “valid but too old” images. The proof is a rollback drill: attempt to boot an older signed build and show a controlled block/degrade action plus an auditable event. (See H2-6, H2-11.)
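The anti-rollback gate described above is a small decision the boot chain makes before launching any image. A sketch under stated assumptions: the manifest fields (`sig_ok`, `security_version`) are illustrative, and `device_counter` stands in for a TPM/SE monotonic counter:

```python
def accept_image(manifest: dict, device_counter: int) -> tuple:
    """Anti-rollback gate run by the boot chain before launching an image.

    Manifest fields are illustrative: {"sig_ok": bool, "security_version": int}.
    `device_counter` models the hardware monotonic counter, which only
    ever increases as newer security versions are accepted.
    """
    if not manifest["sig_ok"]:
        return (False, "SIG_INVALID")
    if manifest["security_version"] < device_counter:
        # Validly signed, but older than what the device has already run:
        # this is the "valid but too old" case secure boot alone misses.
        return (False, "ROLLBACK_BLOCKED")
    return (True, "OK")
```

The rollback drill then consists of presenting an older signed build and checking that the result is `ROLLBACK_BLOCKED` plus an auditable event, not a boot.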
What are the most common real-world causes of attestation failure at the edge?
Attestation often fails due to (1) measurement chain drift (unexpected config/firmware differences across stages), (2) certificate/time problems (clock skew causing verification failure), or (3) missing observability (no stable reason codes). A deployable node must export attestation success rate, failure reasons, and which measured stage changed, so issues are fixable without “turning off security.” (See H2-6, H2-11.)
How should TPM and HSM/secure element split responsibilities to avoid waste?
TPM is strongest for standardized measured-boot evidence (PCRs, quotes, attestation workflows). An HSM/secure element excels at key isolation, policy-enforced non-exportability, and often better performance for specific crypto operations. The clean split: TPM for measurement evidence and platform identity; SE/HSM for key wrapping, device credentials, and protected key stores. Example parts: Infineon SLB9670VQ20FW785XTMA1, NXP SE050C2HQ1/Z01SDZ, Microchip ATECC608B-SSHDA-B. (See H2-6.)
Why can certificate/key rotation cause “instant drops” or performance spikes?
Rotation is a control-path event that can create dataplane transients: session re-establishment bursts, conntrack churn, cache invalidation, and temporary CPU/accelerator contention for handshakes. The fix is operational discipline: staged rotation, rate limits, and visibility into rekey jitter. Prove readiness by rotating under load and showing bounded p99 latency, stable session counts, and complete audit trails for key lifecycle events. (See H2-7, H2-9.)
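Staged rotation, the first of the disciplines above, can be as simple as batching rekeys instead of triggering them all at once. A minimal sketch (function name and batching policy are assumptions):

```python
from typing import List

def staged_rekey_batches(session_ids: List[str], batch_size: int) -> List[List[str]]:
    """Split live sessions into staged rekey batches so a key rotation
    never forces a simultaneous re-handshake burst across the whole
    session table; each batch is rekeyed and allowed to settle before
    the next one starts."""
    return [session_ids[i:i + batch_size]
            for i in range(0, len(session_ids), batch_size)]
```

Choosing `batch_size` from the measured handshake/s ceiling (with headroom) keeps rekey load inside the envelope proved during acceptance testing.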
What are the most dangerous “self-destruct” management-plane configurations?
Common failures include default credentials, shared data/mgmt ports without isolation, missing mTLS, oversized RBAC roles, unaudited privileged actions, and unmanaged certificate lifetimes. Another edge killer is time drift: certificate validation fails and operators disable checks to recover. A ZTNA node should ship with minimal services enabled, strict RBAC + mTLS, and tamper-evident audit exports. (See H2-8.)
How should acceptance testing be designed to prove “secure + performant + operable”?
Acceptance must deliver an evidence bundle: (1) secure/measured boot proof + attestation success and reason codes, (2) feature-matrix performance under a declared traffic model (Gbps, Mpps, p99, sessions, rule scale), (3) signed update + rollback drills, and (4) fault injections (cert expiry, rule burst, key rotation burst, link jitter) with Expected→Observed→Audit proof. This defines “done” without turning into a compliance encyclopedia. (See H2-11.)