Enterprise Wi-Fi Access Point (AP) Hardware Guide
An enterprise Wi-Fi access point is a Wi-Fi radio system plus a wired uplink and a PoE-powered rail tree. Field performance is determined by where payload goodput is lost along the RF → MAC airtime → uplink-queue chain, and it is proven with measurable evidence (airtime/retries/EVM, uplink error counters, rail and thermal logs).
This page maps KPIs to hardware blocks (RF front-end, antennas, Ethernet, PoE PD, clocks, thermal) and provides a troubleshooting playbook to isolate the failing segment quickly without blaming the wrong layer.
H2-1 · What an Enterprise Wi-Fi AP Is (and what it is not)
Definition (boundary-first): An enterprise Wi-Fi access point is a multi-radio wireless access node that bridges client airtime to a high-speed Ethernet uplink, powered primarily by PoE (PD-side), with telemetry and controls for fleet operations.
Not a gateway: Routing/NAT/firewall/CGNAT functions belong to other network appliances. If a user-facing symptom originates upstream (LAN congestion, policy enforcement, WAN issues), the AP can only report and surface evidence, not “fix the Internet.”
Typical enterprise AP form factors: ceiling-mount or wall-mount enclosures optimized for quiet operation and wide coverage. Hardware choices are driven by RF performance, uplink capacity, power budget, and thermal headroom.
- Single/dual/tri-radio: more radios increase peak current, heat density, antenna isolation challenges, and calibration complexity—not just “more throughput.”
- Internal vs external antennas: internal arrays simplify industrial design but raise detuning/isolation sensitivity to mounting surfaces and nearby metal.
- 2.5G/5G/10G uplink: uplink speed sets the “exit capacity.” A strong RF link cannot overcome a saturated uplink or excessive uplink errors.
- PoE-powered: long-cable voltage drop and transient dips become first-class reliability concerns (brownout, resets, PA back-off).
Acceptance target for this page: build a practical engineering loop from KPIs → hardware chain → validation → root-cause:
- Translate “slow Wi-Fi” into measurable loss points (RF → MAC airtime → queues → uplink → power/thermal derating).
- Define the minimum evidence set to prove a cause (EVM/retx/airtime + uplink counters + rail droop + temperature).
- Convert architecture into selection criteria (SoC/radios/FEM/PoE PD/DC-DC/clocks/thermal) without SKU dumping.
H2-2 · System context: data, power, and telemetry paths
Why this matters: An enterprise AP is not a single “radio.” It is a three-path system. Most field failures become diagnosable only after mapping symptoms to the correct path and collecting the right evidence.
- Data path: determines throughput, latency, and loss (RF airtime → queues → uplink).
- Power path: sets stability under peak load (PoE input dips → rail droop → resets/derating).
- Mgmt/telemetry path: turns complaints into proof (counters, sensors, logs, calibration state).
Data path (where bandwidth is actually lost):
- Client ↔ Radio: airtime contention, interference, and retransmissions often dominate “why goodput is low.”
- Radio ↔ SoC queues: scheduling and buffering decide whether many clients cause latency spikes.
- SoC ↔ Ethernet uplink: uplink speed and port errors can cap performance even with excellent RF.
Practical evidence to collect: MCS distribution, retry rate, airtime utilization, queue depth indicators (if exposed), uplink utilization and error counters.
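The raw counters behind these KPIs reduce to the two headline air-side numbers in a few lines. A minimal sketch; the counter names (`tx_attempts`, `tx_retries`, `busy_us`, `period_us`) are hypothetical stand-ins for whatever the driver actually exposes.

```python
def air_kpis(tx_attempts: int, tx_retries: int, busy_us: int, period_us: int) -> dict:
    """Reduce raw driver counters to retry rate and airtime utilization (0..1)."""
    retry_rate = tx_retries / tx_attempts if tx_attempts else 0.0
    airtime_util = busy_us / period_us if period_us else 0.0
    return {"retry_rate": round(retry_rate, 3), "airtime_util": round(airtime_util, 3)}

# Example: 25% retries while the medium is 75% busy points at contention,
# not at the uplink.
print(air_kpis(tx_attempts=10_000, tx_retries=2_500,
               busy_us=750_000, period_us=1_000_000))
```

Sampling both numbers over the same window is what makes them comparable; mixed windows are a common source of false conclusions.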
Power path (PD-side only):
- PoE input reality: long cables and connector losses create voltage drop; transients create dips.
- Conversion tree: PoE PD/front-end → main DC/DC → multiple rails (SoC/DDR, RF, PA bias).
- Failure signature: peak TX at high temperature raises current draw → rail droop → PA back-off, link instability, or brownout reset.
This page treats PoE switch behavior as an external dependency. Only the AP (PD) power tree, monitoring points, and resulting symptoms are in-scope.
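The cable-drop arithmetic above can be made concrete. A minimal sketch, assuming a constant-power PD behind a resistive cable; the 12.5 Ω figure used in the example is an illustrative loop resistance for a long Cat5e run over two pairs, not a measured value.

```python
def pd_input_voltage(v_pse: float, p_pd_w: float, loop_ohm: float) -> float:
    """PD input voltage for a constant-power load behind a resistive cable.

    Solves V_pd from  v_pse = V_pd + (p_pd_w / V_pd) * loop_ohm,
    i.e. V_pd**2 - v_pse*V_pd + p_pd_w*loop_ohm = 0 (high root = stable point).
    """
    disc = v_pse ** 2 - 4 * p_pd_w * loop_ohm
    if disc < 0:
        raise ValueError("load not deliverable over this loop resistance")
    return (v_pse + disc ** 0.5) / 2

# 25.5 W PD, 50 V at the PSE, ~12.5 ohm loop: the PD sees only 42.5 V,
# so any transient dip at the PSE eats directly into UVLO margin.
print(pd_input_voltage(50.0, 25.5, 12.5))
```

This is why the same AP can be rock solid on a short patch cable and brownout-prone at the end of a 90 m horizontal run.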
Mgmt/telemetry path (the “evidence chain”):
- Thermal: SoC/PA/DC-DC temperature trends explain throttling and long-term drift.
- Power events: brownout flags, rail minimums (if monitored), reset causes, PG trips.
- RF state: calibration table version, power/EVM trend snapshots (enough to detect “degraded RF”).
- Uplink health: link flap counts, negotiated rate changes, error bursts near thermal peaks.
Goal: a field unit can produce a compact, actionable bundle of logs/counters to reduce “no repro” RMAs.
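As a sketch of the "compact bundle" idea, the four evidence categories above can be serialized into one deterministic blob that attaches to an RMA ticket. Every field name below is illustrative, not a real agent schema.

```python
import json

def evidence_bundle(thermal: dict, power: dict, rf: dict, uplink: dict) -> str:
    """Serialize the four evidence categories into one deterministic JSON blob."""
    return json.dumps(
        {"thermal": thermal, "power": power, "rf": rf, "uplink": uplink},
        sort_keys=True, separators=(",", ":"))

blob = evidence_bundle(
    thermal={"pa_c_max": 92},
    power={"brownouts": 1, "reset_cause": "UV"},
    rf={"cal_version": "v7", "evm_db_trend": [-34, -31]},
    uplink={"flaps": 3, "rate_mbps": 2500})
print(len(blob), "bytes")
```

Deterministic ordering (`sort_keys`) matters: it makes bundles diffable across units and firmware versions.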
Enterprise deployment constraints → typical symptoms:
- Ceiling heat: temperature rise → PA back-off / SoC throttling → lower MCS and unstable peak throughput.
- Dense co-channel environment: more retries → lower goodput and higher jitter even when RSSI looks “fine.”
- ESD/surge on cables: uplink instability, PHY errors, renegotiation loops.
- High client concurrency: airtime contention dominates and queues amplify latency spikes.
H2-3 · Performance KPIs and why targets are missed (throughput ≠ experience)
Core idea: “Fast Wi-Fi” is not a single number. Enterprise experience is a stack of KPIs where each layer can cap goodput or inflate latency.
- PHY quality: MCS distribution and error-vector quality determine how efficiently the channel can carry bits.
- MAC airtime: contention, retries, and scheduling decide how much airtime becomes usable payload.
- Queues + uplink: buffering and uplink limits turn bursts into jitter (and can masquerade as “RF problems”).
- End-to-end: latency and jitter often matter more than peak throughput for voice, conferencing, and interactive apps.
KPI layers (what to measure at each boundary):
- RF / PHY: MCS spread (not only peak), EVM (or equivalent link-quality indicator), packet error trends.
- MAC / airtime: airtime utilization, retry rate, contention behavior under load, scheduling efficiency.
- Bridge / queues: queue depth indicators, drop counters, queueing delay (burst amplification).
- Uplink: port utilization, negotiated rate, PHY errors/CRC bursts, link flap counts.
- User experience: latency and jitter distribution (p50/p95), plus loss under peak concurrency.
Tip: A high reported PHY rate with low goodput almost always points to airtime/retries or queues/uplink—not “CPU performance.”
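The p50/p95 summaries used throughout are cheap to compute on-device. A nearest-rank sketch (the sample values are illustrative):

```python
def pctl(samples_ms, q: float) -> float:
    """Nearest-rank percentile (q in 0..100) over a list of latency samples."""
    if not samples_ms:
        raise ValueError("empty sample set")
    s = sorted(samples_ms)
    k = min(len(s) - 1, int(q / 100 * (len(s) - 1) + 0.5))  # round half up
    return s[k]

lat = [8, 9, 9, 10, 11, 12, 14, 30, 95, 120]  # ms, bursty tail
print("p50:", pctl(lat, 50), "p95:", pctl(lat, 95))
```

A healthy p50 with an exploding p95, as in this sample, is the classic queue-driven jitter signature rather than an RF problem.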
Three common misdiagnoses (and the correct mental model):
- “Rated speed” ≈ real throughput: protocol overhead, retries, and contention consume airtime; only goodput reflects usable payload.
- Many clients → CPU bottleneck: with high concurrency, the dominant limit is usually shared airtime and collision/retry behavior.
- Strong RF guarantees performance: uplink saturation or uplink errors can cap goodput and worsen jitter even when RSSI looks excellent.
Fast triage order (minimum evidence set):
- Step 1 — Goodput gap: compare goodput vs reported PHY rates; large gaps indicate losses below “headline speed.”
- Step 2 — Airtime + retries: confirm whether contention/retransmissions dominate under concurrency.
- Step 3 — Uplink proof: check uplink utilization, negotiated rate, and error counters to prove a hard exit cap.
- Step 4 — Jitter signature: validate queue-driven jitter with latency distribution (p95/p99 increases under bursts).
- Step 5 — Correlate with device state: correlate KPI drops with temperature and power events (derating or brownout symptoms).
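The five steps above form an ordered decision chain that can be encoded directly. A hedged sketch; every threshold (0.6 goodput ratio, 0.7 airtime, 0.2 retries, 0.85 uplink, 3× p95/p50) is an illustrative placeholder to be tuned per deployment, not a standard value.

```python
def triage(goodput_mbps, phy_rate_mbps, airtime_util, retry_rate,
           uplink_util, p50_ms, p95_ms, derated: bool) -> str:
    """Walk the evidence in the triage order and name the first proven stage."""
    if phy_rate_mbps and goodput_mbps >= 0.6 * phy_rate_mbps:
        return "no large goodput gap"
    if airtime_util > 0.7 or retry_rate > 0.2:
        return "airtime contention / retries dominate"
    if uplink_util > 0.85:
        return "uplink is the exit cap"
    if p95_ms > 3 * p50_ms:
        return "queue-driven jitter"
    if derated:
        return "power/thermal derating"
    return "inconclusive: widen the evidence set"

print(triage(goodput_mbps=180, phy_rate_mbps=1200, airtime_util=0.82,
             retry_rate=0.27, uplink_util=0.35, p50_ms=9, p95_ms=15,
             derated=False))
```

The ordering is the point: a busy medium must be ruled out before the uplink is blamed, and hardware state is correlated last.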
H2-4 · Hardware architecture overview (SoC, radios, FEM, uplink, power, clocks)
Architecture goal: map each KPI layer to the hardware blocks that can limit it. This avoids “guessing” when a field symptom appears.
- RF domain: radios + FEM + antennas determine EVM, MCS stability, and coverage.
- SoC domain: baseband/MAC + queues + telemetry shape concurrency behavior and latency spikes.
- Uplink domain: Ethernet PHY and magnetics determine exit capacity and link stability.
- Power domain (PD-side): PoE PD + DC/DC/PMIC rails determine peak-load stability and derating.
- Clock domain: XO/TCXO + PLLs influence high-order modulation stability and emissions margin.
- Sensing/logging: sensors and event logs provide proof for RCA (thermal, power, uplink, RF).
Block roles + interfaces (no SKU dumping):
- Wi-Fi SoC: MAC/baseband processing, buffering/queues, management agent. Common interfaces: internal buses, Ethernet MAC, control GPIO/interrupts.
- Radios (2.4/5/6 GHz) + FEM: RF conversion and front-end linearity/sensitivity. Interfaces: RFFE control + SPI/I²C for configuration.
- Antenna network: MIMO performance depends on isolation and layout. Key risks: detuning and coupling in real mounting conditions.
- Ethernet PHY + magnetics: 2.5G/5G/10G uplink behavior. Interface: MDIO for link status, negotiated rate, and counters.
- PoE PD + DC/DC/PMIC: PoE input handling and multi-rail delivery (SoC/DDR, RF, PA bias). Optional monitors: ADC/PG, sometimes PMBus.
- Clock tree: XO/TCXO → RF PLL/LO → digital clocks. Noise coupling here shows up as degraded EVM or unstable high MCS.
Debug hooks to design for (evidence-first):
- RFFE/SPI/I²C: radio state snapshots (band, channel width, TX power state, calibration table version).
- MDIO counters: link rate changes, error bursts, and flap counts that correlate with heat or ESD events.
- Power probes: input voltage, rail minimums, brownout/reset causes, PG events during peak TX.
- Thermal probes: SoC/PA/DC-DC temperatures to explain throttling or long-term drift.
H2-5 · RF front-end and the RF link: PA/LNA/filters/switches decide high-MCS stability
Core idea: High MCS is not “enabled by software.” It is earned by low distortion (EVM/linearity) and robust receiver behavior (blocker tolerance and selectivity) across temperature and peak load.
How key RF metrics map to real link outcomes:
- EVM / linearity: a hard gate for 1024/4096-QAM. When PA bias noise, compression, or LO-related spurs rise, EVM worsens and MCS collapses under load.
- ACLR / out-of-band leakage: in dense deployments, leakage turns into practical interference. It reduces coexistence margin even when RSSI looks “strong.”
- Receiver sensitivity vs blockers: range issues often come from a receiver that is compressed or desensed by nearby strong signals, DC-DC noise, or inadequate filtering.
Engineering mindset: treat “high PHY rate but unstable MCS” as an RF integrity problem until evidence proves otherwise.
FEM selection & layout: four practical levers that frequently dominate results
- PA supply/bias integrity: DC-DC ripple and ground bounce create AM-AM / AM-PM distortion → EVM degradation → MCS downshift under peak TX.
- LNA linearity: chasing ultra-low NF while ignoring compression can cause blocker-driven desense → “short range” and volatile throughput.
- Switch insertion loss & isolation: loss burns link budget; poor isolation creates self-interference paths that look like “random RF weakness.”
- Filter Q and temperature drift: selectivity and leakage margin change with temperature; dense deployment problems often worsen at high ceiling heat.
Multi-band / multi-radio interference sources (think “injection points”):
- LO leakage & spurs: LO/PLL artifacts and harmonics can land inside sensitive bands or raise the noise floor.
- DC-DC coupling: switch-node ripple couples into PA bias, LNA rails, or reference ground and degrades EVM/sensitivity.
- Self-interference: insufficient antenna isolation allows TX energy to leak into RX paths, especially in compact enclosures.
H2-6 · Antennas and MIMO: isolation and correlation matter more than “more antennas”
Core idea: MIMO gains require low correlation and adequate isolation. Antenna count alone does not guarantee spatial multiplexing, especially in compact ceiling enclosures.
Three real-world constraints that reduce MIMO benefit:
- Insufficient isolation: TX/RX leakage and antenna-to-antenna coupling reduce spatial separation → lower multi-stream efficiency and unstable peak throughput.
- Layout and routing coupling: feedline coupling and shared return paths raise correlation → ECC (envelope correlation coefficient) increases → “many antennas, little gain.”
- Enclosure and installation environment: ceiling/wall materials detune the array and distort the radiation pattern → coverage holes and band-specific weakness.
Express antenna/MIMO quality with measurable acceptance metrics:
- Isolation (dB): treat it as a matrix across adjacent and cross-pairs; the worst pairs often dominate behavior.
- ECC: a direct indicator of correlation; rising ECC predicts reduced spatial multiplexing gains.
- TRP / TIS: total radiated power / total isotropic sensitivity; system-level transmit/receive performance that captures detuning and losses beyond “bench RF.”
- OTA EVM: closes the loop with H2-5 by validating modulation stability in a realistic radiated setup.
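“The worst pairs often dominate” is easy to operationalize: gate acceptance on the minimum of the isolation matrix and the maximum ECC. A minimal sketch; the 25 dB and 0.3 limits below are illustrative gates, not standard values.

```python
def antenna_gate(isolation_db: dict, ecc: dict,
                 min_iso_db: float = 25.0, max_ecc: float = 0.3) -> dict:
    """isolation_db / ecc: {(ant_a, ant_b): value}. Report worst pairs, pass/fail."""
    worst_iso = min(isolation_db.items(), key=lambda kv: kv[1])  # lower dB = worse
    worst_ecc = max(ecc.items(), key=lambda kv: kv[1])           # higher = worse
    return {"worst_isolation": worst_iso, "worst_ecc": worst_ecc,
            "pass": worst_iso[1] >= min_iso_db and worst_ecc[1] <= max_ecc}

# One weak cross-pair (a1-a3) fails the whole array, which matches field behavior.
print(antenna_gate(
    isolation_db={("a1", "a2"): 31.0, ("a1", "a3"): 22.5, ("a2", "a3"): 28.0},
    ecc={("a1", "a2"): 0.12, ("a1", "a3"): 0.41, ("a2", "a3"): 0.18}))
```

Tracking these gates per band and per mounting condition (free space vs installed) catches the detuning failures described above.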
2.4/5/6 GHz coexistence design choices (practical tradeoffs):
- Shared vs dedicated elements: sharing saves space but stresses isolation and matching across bands.
- Diversity vs multi-feed: diversity improves robustness; multi-feed increases peak capacity but demands tighter correlation control.
- Mounting sensitivity: a design that works on a lab stand can shift when installed near metal grids or dense cabling above ceilings.
H2-7 · Ethernet uplink: 2.5G/5G/10G PHY, routing, EMI, and the real bottlenecks
Core idea: The uplink is the “exit capacity” of the AP. If uplink margin is insufficient or unstable, RF improvements will not translate into user goodput, and latency/jitter will inflate under bursts.
Uplink speed selection (2.5G vs 5G vs 10G): practical decision logic
- Start from delivered goodput, not PHY: compare realistic multi-client goodput at busy hour against uplink sustainable throughput (leave headroom for bursts and queueing delay).
- 2.5G is often sufficient when concurrency and sustained traffic are moderate, and the AP is not designed to aggregate multiple high-duty radios into a continuous exit stream.
- 5G/10G becomes necessary when tri-radio operation, dense concurrency, or continuous high goodput is expected—especially when the goal is to reduce queue-driven jitter, not just increase peak Mbps.
Rule-of-thumb framing: upgrade uplink when the bottleneck signature is “uplink utilization high + queueing delay spikes,” not when a single-device speed test disappoints.
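That bottleneck signature can be checked mechanically over a telemetry window. A sketch; the 85% utilization and 20 ms queueing-delay thresholds, and the 50%/10% window fractions, are assumptions to calibrate per deployment.

```python
def uplink_bottleneck(util_samples, qdelay_ms,
                      util_thresh=0.85, delay_thresh_ms=20.0) -> bool:
    """True when the window shows 'utilization high + queueing delay spikes'."""
    if not util_samples or not qdelay_ms:
        return False
    busy = sum(u >= util_thresh for u in util_samples) / len(util_samples)
    spiky = sum(d >= delay_thresh_ms for d in qdelay_ms) / len(qdelay_ms)
    return busy >= 0.5 and spiky >= 0.1

# Saturated exit plus delay spikes: upgrade the uplink, not the radios.
print(uplink_bottleneck([0.9, 0.95, 0.88, 0.91], [5, 40, 8, 35]))
```

A single-device speed test cannot produce this evidence; only busy-hour windows can.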
PHY + magnetics: what matters in real boards
- Signal integrity: return loss, crosstalk, and common-mode behavior determine error bursts and downshift events under temperature and cable variation.
- Common-mode noise control: poor common-mode containment can leak noise into system ground/reference and aggravate RF EVM or receiver sensitivity.
- ESD/surge placement: protection parts must steer energy to the correct return path; the wrong path creates intermittent instability and “mystery” flaps.
Common field failures (symptom → likely mechanism → proof)
- Repeated negotiation / link flaps: frequently a physical-layer stability issue (connector/cable/ESD after-effects). Proof: link up/down counters + negotiated rate changes + correlation to movement/ESD events.
- Thermal error bursts: stable at night, unstable at ceiling heat. Proof: error counters rising with temperature + goodput jitter spikes even when average utilization is moderate.
- Cable quality downshifts: the AP silently drops speed or becomes unstable on certain cable types/lengths. Proof: swap-cable A/B test + immediate speed/BER stabilization.
- EMI back-coupling into RF: common-mode return paths inject noise into RF/power reference. Proof: RF KPI drops coincide with uplink activity bursts and common-mode events.
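The flap and downshift proofs above come straight from the negotiated-rate history (readable over MDIO). A minimal summarizer, assuming the history is a list of negotiated rates sampled over time:

```python
def link_history_summary(rate_mbps_history):
    """Count renegotiations and flag a sustained downshift from the best rate."""
    changes = sum(1 for a, b in zip(rate_mbps_history, rate_mbps_history[1:])
                  if a != b)
    downshifted = bool(rate_mbps_history) and \
        rate_mbps_history[-1] < max(rate_mbps_history)
    return {"renegotiations": changes, "downshifted": downshifted}

# Three renegotiations ending below the best rate: classic cable/ESD suspect.
print(link_history_summary([10_000, 10_000, 5_000, 5_000, 10_000, 2_500]))
```

Correlating the timestamps of these transitions with temperature and ESD events separates the four failure modes listed above.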
H2-8 · PoE PD and the power tree: root-causing drops/reboots (PD-side only)
Core idea: Many “Wi-Fi instability” complaints are power events in disguise. PD-side design must survive cable drops, thermal derating, and PA peak current without rail collapse.
PD-side essentials (what to cover and why it fails):
- Classification & power ceiling: if available power is near the edge, peak events trigger throttling, brownout, or resets under multi-radio load.
- Input rectification & inrush: cable drop and transient behavior determine whether the PD input stays inside the safe window during bursts.
- DC/DC conversion + soft-start: rail stability under step loads matters more than steady-state efficiency.
- Hold-up behavior: short input dips should not turn into a full reset; hold-up margin is a design differentiator.
Boundary: PSE-side allocation and LLDP policy are not expanded here; link out to the PoE switch page for that.
Power budget breakdown (what actually pulls peak current):
- SoC + DDR: sustained compute and buffering (often steady but temperature sensitive).
- Multiple radios: parallel radio activity raises the baseline load and adds calibration activity.
- PA peak: the dominant step-load driver that can create rail droop, especially at high temperature.
- Aux loads (optional): fan/USB/IoT expansion can silently consume margin and trigger edge behavior.
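The budget breakdown above can be checked against the standard PD-side power ceilings (802.3af ≈ 12.95 W, 802.3at ≈ 25.5 W, 802.3bt class 6 ≈ 51 W, class 8 ≈ 71.3 W available at the PD). The block-level wattages in the example are invented for illustration.

```python
PD_AVAILABLE_W = {          # power available at the PD after cable loss
    "802.3af": 12.95, "802.3at": 25.5,
    "802.3bt-class6": 51.0, "802.3bt-class8": 71.3,
}

def peak_margin_w(pd_class: str, soc_ddr_w: float, radios_w: float,
                  pa_peak_w: float, aux_w: float = 0.0) -> float:
    """Worst-case simultaneous draw vs the PD ceiling; negative = trouble."""
    return PD_AVAILABLE_W[pd_class] - (soc_ddr_w + radios_w + pa_peak_w + aux_w)

# Illustrative tri-radio budget on 802.3at: 0.5 W in the red at PA peak,
# matching the "works in light load, fails in busy hour" signature.
print(peak_margin_w("802.3at", soc_ddr_w=8.0, radios_w=6.0,
                    pa_peak_w=10.0, aux_w=2.0))
```

The key discipline is budgeting simultaneous peaks, not the sum of typical draws.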
Field signatures (symptom → mechanism → proof)
- Throughput drops first, then occasional reboot: thermal derating + PA peaks pull rails down. Proof: temperature trend + PG/UV events + RF KPI (EVM/MCS) degradation.
- Random reboot under bursts: PoE input wobble causes brownout. Proof: reset cause + minimum-rail logs + input voltage dip correlation.
- Works in light load, fails in busy hour: classification/power ceiling is too tight. Proof: current limit indications + repeated UV events during peak radio activity.
H2-9 · Clocks and LO: how jitter/phase noise becomes lost throughput and coverage (no network timing deep-dive)
Core idea: Clock/LO quality directly shapes RF quality. When jitter or phase noise rises, EVM worsens, leakage increases, and receiver performance degrades—high-order QAM becomes unstable long before any “software bottleneck” is visible.
Clock “family tree” inside an enterprise AP
- Ref source (XO/TCXO): sets the noise floor and temperature behavior of the timing reference.
- RF PLL: translates the reference into the LO; sensitive to power noise and layout coupling.
- LO: directly impacts modulation accuracy and adjacent leakage in dense deployments.
- Baseband clocks + timers (TSF, the 802.11 timing synchronization function): align sampling, processing, and scheduling; instability shows up as performance variance under high MCS.
Boundary note: PTP/SyncE is a network timing system; only the AP-internal clock chain is covered here (link to the Timing Switch page if needed).
Visible consequences (what a user ultimately feels)
- EVM ↑ → high MCS cannot hold → rate flaps, retries increase, peak goodput becomes volatile.
- ACLR / leakage ↑ → co-channel and adjacent interference worsens → “RSSI looks strong but experience is poor” in dense floors.
- Rx sensitivity ↓ (noise floor / reciprocal mixing effects) → coverage shrinks earlier → edge throughput collapses sooner.
- High-order QAM instability: problems emerge under high MCS first, even when low MCS appears fine.
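The EVM cost of LO phase noise can be estimated by integrating the SSB phase-noise profile: phi_rms² ≈ 2·∫L(f)df, and for small angles the phase-noise-limited EVM floor is approximately phi_rms. A trapezoid-rule sketch; the profile numbers in the example are illustrative, not a real part's datasheet.

```python
import math

def evm_floor_pct(offsets_hz, l_dbc_hz):
    """Small-angle EVM floor (%) from an SSB phase-noise profile.

    phi_rms^2 ≈ 2 * ∫ L(f) df  (L converted to linear power, trapezoid rule);
    EVM_floor ≈ phi_rms for small phi.
    """
    lin = [10 ** (l / 10) for l in l_dbc_hz]
    area = sum((lin[i] + lin[i + 1]) / 2 * (offsets_hz[i + 1] - offsets_hz[i])
               for i in range(len(offsets_hz) - 1))
    return 100 * math.sqrt(2 * area)

# A flat -100 dBc/Hz from 1 kHz to 1 MHz already costs ~1.4% EVM (about
# -37 dB), uncomfortably close to what 4096-QAM demands.
print(round(evm_floor_pct([1e3, 1e6], [-100, -100]), 2))
```

This is why high-order QAM problems surface first: the phase-noise floor is invisible at low MCS and fatal at the top rates.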
Common pitfalls (where noise enters the chain)
- Power noise into PLL/ref: DC/DC ripple and poor decoupling translate into phase modulation and spur growth.
- Insufficient isolation/keepout: digital activity couples into sensitive reference routes and PLL nodes.
- SSC tradeoff: spread-spectrum may ease EMI signatures but can reduce RF margin if used without proper constraints.
H2-10 · Thermal design and reliability: why ceiling APs throttle and fail first
Core idea: Ceiling-mount APs often run with weak convection and persistent heat soak. Thermal limits usually show up as throttling/backoff (throughput drops first), then error bursts, and finally resets when margins are exhausted.
Heat source breakdown (who spikes vs who persists)
- PA: peak heat under sustained high-rate TX; often drives early EVM/MCS degradation.
- SoC: persistent heat under concurrency and processing; throttling often targets this node.
- DC/DC: local hot spots under high current; can reduce rail margin at high temperature.
- Ethernet PHY: temperature-sensitive behavior can elevate error bursts and link instability.
Fanless thermal path in a ceiling AP (where heat must flow)
- Die → package → TIM/pad → metal backplate → enclosure → air
- Ceiling constraints: limited airflow, heat trapping above tiles, dust buildup, and high ambient swings amplify long-term drift.
- Sensor placement: measure close to PA/SoC and at the backplate to capture both spikes and heat soak.
Derating controls and field observability (what to log)
- Controls: PA backoff, radio disable, SoC throttling (and fan curve if present).
- What to observe: temperature trend, current trend, reset cause, Ethernet link-flap counters, and RF KPI shifts under load.
- Reliability angle: prolonged high temperature accelerates drift (calibration tables and PA aging); trending data prevents “mystery” performance decay.
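A first-pass derating check needs only the two trend logs above: split goodput samples by a temperature threshold and compare means. A sketch; the 85 °C split and 20% drop criterion are illustrative assumptions.

```python
def derating_signature(temps_c, goodput_mbps, t_split_c=85.0, drop=0.2) -> bool:
    """True when mean goodput above the temperature split is >drop lower."""
    hot = [g for t, g in zip(temps_c, goodput_mbps) if t >= t_split_c]
    cool = [g for t, g in zip(temps_c, goodput_mbps) if t < t_split_c]
    if not hot or not cool:
        return False  # not enough thermal contrast in this window
    return sum(hot) / len(hot) < (1.0 - drop) * (sum(cool) / len(cool))

# Night vs afternoon heat soak on the same AP: ~45% less goodput when hot.
print(derating_signature([70, 72, 74, 90, 92, 94],
                         [900, 880, 910, 500, 480, 510]))
```

A positive signature justifies pulling PA backoff and throttling logs before touching RF configuration.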
H2-11 · Bring-up / Calibration / Troubleshooting: infer the failing segment from symptoms
Bring-up order (fastest path to a stable baseline)
Goal: establish a repeatable baseline before chasing “wireless performance.” Each step ends with a pass/fail proof.
- 1) Power rails first: verify rail stability, UV/PG events, and brownout history (scope the lowest rail dip under load).
- 2) Clock lock next: confirm ref/PLL lock and that RF/baseband clocks remain stable during traffic bursts.
- 3) Ethernet uplink stability: confirm negotiated rate is stable; check error bursts and link flaps under thermal soak.
- 4) SoC boot + logging: confirm reset cause, watchdog events, crash logs, and storage/NVM access are clean.
- 5) RF calibration: power, IQ balance, frequency offset, and EVM in conducted mode before OTA validation.
- 6) OTA + throughput validation: validate coverage/throughput in controlled scenes; then scale concurrency and airtime pressure.
Practical tip: add explicit test points for key rails (SoC core, RF, PA bias), PLL supply, and uplink PHY supply so “symptom → evidence” is measurable.
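The six gates above are strictly ordered, and a tiny runner makes that discipline explicit. The gate names mirror the list; the check callables are user-supplied stubs in this sketch.

```python
def bring_up(gates):
    """gates: ordered [(name, check_fn)]; stop at the first failing gate.

    Returns (True, None) on a clean pass, else (False, failed_gate_name),
    so RF work never starts on an unproven power/clock/uplink base.
    """
    for name, check in gates:
        if not check():
            return False, name
    return True, None

result = bring_up([
    ("power_rails", lambda: True),   # rail dips within spec under load
    ("clock_lock",  lambda: True),   # ref/PLL locked during traffic bursts
    ("uplink",      lambda: False),  # e.g. link flaps during thermal soak
    ("soc_boot",    lambda: True),
    ("rf_cal",      lambda: True),
    ("ota",         lambda: True),
])
print(result)
```

Stopping at the first failure is deliberate: a failed early gate invalidates every measurement taken at later gates.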
Calibration and test: conducted vs OTA (how to combine)
- Conducted: repeatable and best for isolating silicon/board issues (IQ, frequency offset, EVM, leakage).
- OTA: captures enclosure, antenna placement, detuning, and real coupling paths (TRP/TIS/OTA EVM).
- Recommended combo: use conducted to lock a board-level baseline first; use OTA to validate the final “installed product” behavior.
Calibration items that most directly impact high-order QAM stability: IQ imbalance, PA linearity margin, frequency offset, and EVM under bursty traffic.
Symptom → evidence → likely segment (a consistent debug playbook)
- Low throughput: check airtime + retries first; then check uplink utilization/queues; only then suspect CPU/firmware.
- Short range: check noise floor / blocking symptoms; then check antenna detune and isolation/correlation (OTA KPIs).
- Disconnect / reboot: correlate reset cause with rail dips and temperature trend; confirm if derating happens before failure.
- Error bursts / link flaps: separate uplink errors (PHY/cable/EMI) from wireless retries (RF quality/airtime).
Example material numbers (reference BOM items used for bring-up evidence)
These are commonly used parts that directly support the measurements above (availability/fit must be verified per design).
- PoE PD (PD side): TI TPS2372-4, TPS2373-4; TI TPS23753A (PD + PWM interface); ADI LTC4269 (PoE+ PD controller).
- eFuse / surge / input protection (helps prove brownout causes): TI TPS25947 (eFuse); ADI LTC4366 (surge stopper).
- Rail / power telemetry (to correlate throughput drops with rails): TI INA226, INA238 (current/voltage/power monitor).
- Temperature sensing (thermal soak evidence): TI TMP117, TMP102.
- 2.5G/5G/10G uplink PHY examples (uplink stability evidence): Marvell (Aquantia) AQR112C (2.5G), AQR113C (10G), AQR107 (10G); Realtek RTL8221B (2.5G).
- Wi-Fi FEM examples (RF margin / EVM / range): Qorvo QPF4216 (2.4 GHz FEM), QPF4550 (5 GHz FEM); Skyworks SKY85743-21 (2.4 GHz FEM), SKY85747-11 (5 GHz FEM).
Material numbers above are included because they map to measurable evidence in this chapter (rails, reset causes, thermal trend, uplink errors, RF margin), not as a vendor lock-in list.
H2-12 · FAQs (Enterprise Wi-Fi AP)
Format: each answer gives (1) boundary statement, (2) the fastest evidence to check, (3) a decision fork to locate the failing segment.
1) Where is the boundary between an enterprise AP and a home router—what should not be “blamed” on the AP?
- Fast checks: uplink link state + errors, DHCP/DNS failure counters (if exposed), AP event logs (reset/thermal/power).
- Decision: if uplink is stable and Wi-Fi KPIs are normal, suspect upstream routing/policy rather than RF/AP hardware.
2) Why does a “3Gbps/9Gbps” AP deliver only ~50% goodput in the field—what 3 counters should be checked first?
- Fast checks: airtime utilization, retry rate, and uplink utilization/queue depth.
- Decision: if airtime is high → air contention dominates; if retries spike → RF chain/coverage dominates; if uplink queues saturate → the uplink dominates.
3) When client concurrency rises and latency/jitter explodes—how to tell airtime contention from CPU/memory limits?
- Fork: high airtime + retries → RF/medium access problem; low airtime but deep queues + high CPU → SoC/uplink processing bottleneck.
4) When does a 2.5G Ethernet uplink become a “hidden bottleneck,” and how to prove it with data?
- Fork: if airtime is moderate but uplink stays near saturation with growing queues, the uplink is the limiting stage.
5) 6GHz coverage looks much worse—is that normal physics or an RF chain issue?
- Fork: normal EVM but lower SNR → physics; EVM/retries jump early → RF front-end, antenna detune, or interference coupling.
6) When high-order QAM (MCS) won’t hold—what is most often responsible: EVM, PA linearity, or clock phase noise?
- Fast checks: EVM vs power, ACLR/leakage trend (if available), EVM vs rail noise/thermal soak.
7) After ceiling-mount installation performance drops—how to test antenna detune/isolation reliably?
- Fast checks: OTA EVM/TRP/TIS trend (if available), retries vs orientation, isolation/ECC proxy measurements.
8) Why does an AP reboot under high load—how to distinguish PoE power limit vs thermal derating vs rail droop?
- Fork: temperature rises then throughput drops before reboot → thermal; sudden reboot with UV/PG evidence → power/rail droop; stable rails + crash signature → software/driver last.
9) PoE “seems sufficient” but occasional dropouts occur—how to isolate hold-up vs surge vs cable voltage drop?
- Fast checks: PD input minimum voltage, rail minimum + recovery time, event log flags around protection/UV.
10) Ethernet uplink link flaps—how to separate cable/magnetics/EMI issues from PHY thermal problems?
- Fast checks: negotiated rate history, error counter bursts, temperature near PHY + time-to-fail correlation.
11) Why don’t OTA results match conducted results—how should test coverage be designed?
- Coverage: conducted baseline → OTA free-space → OTA installed → thermal soak + traffic bursts.
12) During AP selection, what three “field stability” metrics are most often overlooked?
- The three: thermal-derating behavior under heat soak, PD-side power/brownout telemetry, and uplink error/flap counters.
- Decision: prefer platforms that can prove failures with counters/logs rather than “black-box” symptoms.