High-Current COB/CSP Constant-Current LED Driver Design
Core takeaway: High-current COB/CSP designs shift the bottleneck from “topology choice” to measurable thermal headroom, current-loop integrity, and controlled protection/derating. This section sets the decision boundary and the architecture map (test-point driven).
H2-1. Positioning & when to use (COB vs CSP, high-current boundary)
What changes in the “high-current” regime
- Thermal density becomes the design limiter: the same electrical current produces disproportionate hotspot risk because COB/CSP concentrates heat near the junction-to-case path.
- Accuracy and stability become layout-sensitive: shunt sensing, Kelvin routing, and ground return noise can directly translate into visible brightness jitter or protection mis-trips.
- Transient stress matters: start-up overshoot and fast current slews can accelerate lumen depreciation or trigger intermittent faults, even if steady-state is “in spec”.
Decision boundary: when an external power stage is the safer default
- Current & power headroom: when continuous LED current is in the multi-amp range, or module power is tens of watts and above, heat and switching loss are easier to spread using external FETs and magnetics.
- Thermal budget is tight: when the allowed junction temperature rise (ΔTj) is small under worst-case ambient, externalizing heat sources (FETs/shunt/inductor) reduces junction stress.
- Vf shifts with temperature: if Vf@hot differs materially from Vf@25°C, control margin (duty cycle, OVP thresholds, loop gain) must be designed for the hot corner, not the datasheet “typical”.
- Single vs multi-channel boundary: one COB string at high current is usually manageable in one channel; multiple high-current strings push toward distributed heat, robust current sharing, and testable per-channel sensing.
Evidence inputs (collect before schematic work)
- ILED rated / peak: nominal current, peak current (if any), acceptable ripple (%), and acceptable overshoot during start-up.
- Vf@T: Vf@25°C and Vf@hot (or a curve), plus string tolerance across bins.
- Power & rails: VIN range, target efficiency, allowable dissipation on the driver PCB (not only on the heatsink).
- Thermal limit: target max junction temperature and allowed ΔTj under worst-case ambient and airflow.
- Dimming as a constraint: dimming ratio and required smoothness define transient limits (current slew, soft-start shape), even if the control protocol is handled elsewhere.
H2-2. System architecture (controller + external MOSFET power stage)
Architecture goal: separate three paths that commonly corrupt each other
- Power path: VIN → switching stage → inductor → COB/CSP load (heat sources live here).
- Sensing/control path: shunt + Kelvin routing → amplifier/ADC → controller → COMP/PWM (noise here becomes brightness jitter or false trips).
- Thermal/protection path: NTC/temperature input + filtering → derating / OTP state machine (bad filtering here becomes “thermal flicker”).
Synchronous vs non-synchronous (why it matters at high current)
- Synchronous rectification: reduces diode conduction loss, typically lowering board temperature and improving efficiency at multi-amp output.
- Non-synchronous (diode): simpler and sometimes adequate at lower current, but heat concentration increases quickly as current rises.
- Key engineering signal: gate waveforms and device temperatures should confirm whether switching loss (Qg/drive) or conduction loss (Rds(on)) dominates.
Single-phase vs multi-phase (only the boundary logic)
- Single-phase: preferred starting point; simplest to validate and debug with fewer coupling paths.
- Multi-phase consideration: becomes attractive when inductor size/heat, ripple current, or local hotspots exceed mechanical/thermal constraints.
- Practical sign: if one inductor or one MOSFET pair runs disproportionately hot, distributed phases can spread heat—at the cost of added control complexity.
Test-point map (minimum set for evidence-based bring-up)
- TP_SW: switching node ringing, spikes, dv/dt (correlates with EMI risk and device stress).
- TP_GATE_H / TP_GATE_L: drive strength and Miller behavior (correlates with switching loss and heat).
- TP_SENSE: shunt signal integrity (correlates with current accuracy and protection robustness).
- TP_COMP: loop health (correlates with stability and transient overshoot).
- TP_NTC: filtered temperature input (correlates with derating smoothness and OTP correctness).
H2-3. Constant-current loop design (stability first, then performance)
Why high-current CC loops become “fragile” (root causes that are measurable)
- Sense signal integrity degrades: multi-amp switching current increases ground bounce and coupling, so the controller may “see” false current steps at the sense input.
- Switch-node dv/dt raises the coupling risk: faster edges inject noise through parasitic capacitance into sense/COMP, turning EMI into loop disturbance.
- Thermal and electrical dynamics interact: MOSFET/shunt/inductor heating shifts operating points, changing loop gain and transient behavior over warm-up time.
- Audible noise is often a symptom, not a component problem: low-frequency loop oscillation or bursty modulation can excite magnetics and mechanics.
Current-mode vs voltage-mode (key differences only, focused on stability and overshoot)
- Current-mode control: typically enables faster transient response and simpler compensation, but the loop is more sensitive to sense noise and layout-induced ground reference drift.
- Voltage-mode control: can be more tolerant to certain sense-noise artifacts, but often requires more conservative bandwidth and careful compensation to avoid overshoot and slow settling.
- High-current practical rule: the “best” mode is the one that keeps TP_SENSE clean and TP_COMP unsaturated across temperature and load steps.
Compensation knobs (how to set targets without turning this into a control-theory lecture)
- Bandwidth target (range-based): keep the CC loop fast enough to control load steps and dimming transitions, but not so fast that switching ripple and coupled noise dominate the error signal.
- Phase margin target (range-based): aim for a “healthy margin window” that prevents low-frequency oscillation (visible brightness wobble) and avoids marginal ringing after a step.
- Overshoot control: prioritize limiting ILED overshoot on start-up and during reference steps; in COB/CSP, transient overstress can accelerate lumen depreciation.
- Noise immunity: if raising bandwidth worsens TP_COMP “hair” or TP_SENSE pp noise, fix sense/layout/coupling first before pushing performance.
Evidence checklist (what to measure, what “healthy” looks like)
- TP_COMP shape: smooth control voltage with no low-frequency wobble; avoid repeated top/bottom saturation during normal operating conditions.
- Load-step response: bounded overshoot, short settling time, and no secondary oscillation. Verify both cold and hot corners.
- Correlation test: if TP_SENSE noise spikes align with TP_SW ringing or TP_GATE edges, the dominant issue is coupling/ground bounce rather than “wrong compensation values”.
- Audible symptom mapping: audible tone often aligns with low-frequency modulation visible on TP_COMP or ILED envelope; confirm with time-domain capture.
H2-4. Current sensing & Kelvin layout (accuracy + noise immunity)
Shunt loss, self-heating, and drift (turn “Rsense” into an error budget)
- Self-heating is an error source: at multi-amp current, shunt dissipation raises its temperature, shifting resistance and creating a slow ILED drift over warm-up.
- Temperature coefficient matters: Rsense ppm/°C plus board temperature rise defines the drift envelope; quantify it before chasing “mystery loop behavior”.
- Power distribution around the shunt: copper spreading and thermal relief influence shunt temperature gradient, which can create measurement asymmetry.
- Evidence output: a simple “ILED error vs temperature/current” curve separates drift-dominated issues from noise-dominated issues.
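The drift envelope described above can be estimated before any bench work. The sketch below is a first-order model only: it assumes a single lumped thermal resistance from shunt to ambient and a constant temperature coefficient, and every parameter name and value is hypothetical and should be replaced with datasheet and board data.

```python
def shunt_drift(i_led_a, r_sense_ohm, tcr_ppm_per_c, r_theta_c_per_w):
    """Estimate shunt self-heating and the resulting ILED error.

    Assumptions (not from any datasheet): one lumped thermal resistance
    from the shunt to ambient, and a constant TCR over the temperature rise.
    """
    p_w = i_led_a ** 2 * r_sense_ohm           # dissipation in the shunt
    dt_c = p_w * r_theta_c_per_w               # self-heating temperature rise
    dr_frac = tcr_ppm_per_c * 1e-6 * dt_c      # fractional resistance change
    # The loop regulates V_sense = I * R, so a rise in R lowers the
    # regulated current by roughly the same fraction.
    i_err_frac = -dr_frac / (1.0 + dr_frac)
    return p_w, dt_c, i_err_frac
```

With 3 A through a 50 mΩ shunt, 75 ppm/°C, and an assumed 60 °C/W, this gives 0.45 W, a 27 °C rise, and roughly −0.2 % current drift over warm-up. Measured drift far above such an estimate would point to noise or layout rather than the shunt itself.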
Kelvin routing rules (what makes it real Kelvin vs “two extra traces”)
- Sense pair must be a quiet signal path: route as a tight pair directly across the shunt element, avoiding the high di/dt power loop area.
- Single reference point: sense reference should return to the controller’s sense ground/quiet ground, not to the power ground plane that carries switching current.
- Avoid shared impedance: any shared copper with MOSFET return current becomes a “ground bounce amplifier” that appears as fake current ripple.
- Validation cue: TP_SENSE pp noise drops significantly when probe ground is moved from power ground to the intended sense reference node.
Sensing filter vs dynamic response (filter placement is the real knob)
- Too little filtering: TP_SENSE shows switch-synchronous spikes; these can trigger false SCP or create visible low-level brightness jitter.
- Too much filtering: the controller “sees” delayed current changes, increasing overshoot risk and slowing step recovery.
- Filter placement logic: a small, well-placed RC can reduce high-frequency injection while preserving step visibility—verify with TP_COMP and load-step response.
- Evidence: compare (a) TP_SENSE pp noise, (b) ILED overshoot, and (c) settling time before/after filtering changes.
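The filtering trade-off above can be put in numbers with a single-pole RC model. This is a sketch under the assumption of an ideal first-order low-pass with no source or amplifier input impedance; the function name and the example values are illustrative only.

```python
import math


def rc_filter_summary(r_ohm, c_farad, f_sw_hz):
    """Return time constant, corner frequency, and gain at f_sw for an ideal RC."""
    tau_s = r_ohm * c_farad                          # delay the loop will see
    f_c_hz = 1.0 / (2.0 * math.pi * tau_s)           # -3 dB corner
    gain_at_fsw = 1.0 / math.sqrt(1.0 + (f_sw_hz / f_c_hz) ** 2)
    return tau_s, f_c_hz, gain_at_fsw
```

For example, 100 Ω with 1 nF gives a 100 ns time constant and a corner near 1.59 MHz. A corner well above the switching frequency attenuates edge spikes but leaves switching ripple largely intact, while a corner below it delays the current information the loop relies on. The before/after comparison listed above remains the deciding evidence.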
Low-side vs high-side sensing (high-current, practical boundary only)
- Low-side sensing: simpler, but most vulnerable to power return contamination; Kelvin and reference strategy must be strict.
- High-side sensing: can reduce ground-bounce exposure, but demands suitable common-mode capability and careful layout of the high-side amplifier.
- Decision cue: if low-side TP_SENSE noise correlates strongly with MOSFET current return and cannot be fixed by layout, high-side sensing becomes the cleaner evidence path.
H2-5. External MOSFET & inductor selection (loss, saturation, thermal headroom)
MOSFET selection: Rds(on) vs Qg (choose the dominant loss, then validate by heat)
- Conduction loss dominant: at multi-amp output, low Rds(on) reduces heat quickly, especially on the device that conducts for the longest portion of the cycle.
- Switching/drive loss dominant: higher Qg demands stronger gate drive current and increases edge-related dissipation; this often shows up as hotter gate-loop regions and more TP_SW ringing.
- High-current practical rule: do not “win Rds(on)” and lose stability/EMI; if TP_SW spikes and dv/dt coupling contaminate TP_SENSE, overall stress rises even when DC loss looks good.
Thermal distribution: HS vs LS device (or rectifier) is rarely “equal”
- HS device tends to “switch-hot”: if switching loss dominates, the high-side device often runs hotter; TP_GATE edge quality and TP_SW overshoot correlate strongly.
- LS device tends to “conduction-hot”: if conduction dominates (high current, long conduction interval), the low-side device (or diode in non-synchronous designs) often becomes the main hotspot.
- Evidence-first approach: use temperature points T1/T2 (on the high-side and low-side devices) to confirm which loss dominates before changing silicon size or gate resistance.
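A rough loss split helps predict which device should run hotter before the temperature points are read. The sketch below assumes a synchronous buck stage in continuous conduction, ignores inductor ripple, dead-time and body-diode loss, and uses the common linear-overlap approximation for switching loss. All names and numbers are hypothetical.

```python
def buck_fet_losses(v_in, i_out, duty, rds_hs, rds_ls,
                    t_rise, t_fall, f_sw, q_g, v_drv):
    """First-order loss estimate for a synchronous buck (assumed topology)."""
    p_cond_hs = i_out ** 2 * rds_hs * duty            # high-side conducts for D
    p_cond_ls = i_out ** 2 * rds_ls * (1.0 - duty)    # low-side conducts for 1-D
    # Linear V-I overlap during each edge; hard switching is on the high side.
    p_sw_hs = 0.5 * v_in * i_out * (t_rise + t_fall) * f_sw
    # Gate-charge power per FET; mostly dissipated in the driver and gate resistor.
    p_gate = q_g * v_drv * f_sw
    return {"cond_hs": p_cond_hs, "cond_ls": p_cond_ls,
            "sw_hs": p_sw_hs, "gate_per_fet": p_gate}
```

With 48 V in, 3 A out, D = 0.75, 10 mΩ devices, 10 ns edges and 400 kHz, the high-side switching term (about 0.58 W) is far larger than its conduction term (about 0.07 W). In that case a lower Rds(on) part with higher Qg would likely make things worse. The estimate is only a starting hypothesis; device temperatures decide.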
Inductor selection: saturation, ripple (ΔIL), and temperature rise
- Saturation is a system failure mode: approaching saturation lowers effective inductance, increases ΔIL, raises peak current, and escalates MOSFET heat and TP_SW ringing—often triggering a “snowball” of stress.
- ΔIL is the shared knob: larger ΔIL increases peak current and EMI stress; smaller ΔIL usually costs size and copper but can reduce peak stress if the inductor remains linear.
- Thermal headroom matters more than nominal inductance: inductor temperature rise indicates combined copper + core losses; confirm at hot ambient and the worst duty/current corner.
- Evidence: compare ΔIL estimate vs current-probe measurement, then correlate with inductor temperature point T3.
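For the ΔIL estimate mentioned above, the standard buck relation is a reasonable starting point. The sketch assumes a buck stage in continuous conduction with an ideal inductor. The 80 % saturation margin is an illustrative choice, not a recommendation from this page.

```python
def buck_inductor_ripple(v_in, v_led, l_henry, f_sw_hz,
                         i_led_a, i_sat_a, margin=0.8):
    """Estimate ripple and peak inductor current for an assumed CCM buck."""
    if not 0.0 < v_led < v_in:
        raise ValueError("buck operation assumes 0 < v_led < v_in")
    duty = v_led / v_in
    delta_il = (v_in - v_led) * duty / (l_henry * f_sw_hz)   # peak-to-peak
    i_peak = i_led_a + delta_il / 2.0
    within_margin = i_peak <= margin * i_sat_a               # saturation headroom
    return delta_il, i_peak, within_margin
```

With 48 V in, a 36 V string, 47 µH and 400 kHz at 3 A, the estimate is about 0.48 A peak-to-peak and a 3.24 A peak. If the current-probe ripple is clearly larger than this at hot, the inductor may be losing inductance toward saturation.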
High-current copper & routing: voltage drop becomes “hidden loss + hidden error”
- IR drop adds real dissipation: narrow necks, via bottlenecks, connectors, and long paths can generate local hotspots that do not appear in component datasheets.
- IR drop also distorts sensing: shared impedance and return-path voltage can look like “extra Rsense”, creating apparent current ripple or drift.
- Evidence: measure segment drops (VIN→FET, FET→L, L→LED, LED return) and compare against thermal hot spots on the board.
H2-6. Soft-start, current slew, and overshoot control (protect the COB/CSP)
Why overshoot is expensive in COB/CSP (stress, reliability, and “mystery trips”)
- Instantaneous power density spikes: ILED overshoot produces transient junction stress that can accelerate lumen depreciation even if steady-state looks safe.
- Protection interaction: start-up overshoot can falsely trigger SCP/OVP or cause brief brownout cycles that look like “random flicker”.
- Evidence-first rule: overshoot must be proven (and fixed) using captured waveforms: TP_ILED, TP_SW, and TP_VIN.
Soft-start goal: define a predictable current trajectory (not just “slower is better”)
- Ramp shape matters: a controlled ILED ramp reduces peak stress and avoids COMP saturation that can cause a delayed second overshoot.
- Keep VIN stable: too-aggressive start can dip VIN; too-slow start can keep the system in a marginal region longer—confirm with TP_VIN.
- Evidence: capture ILED(t) from enable to steady state, annotate peak, slope (dI/dt), and settling behavior.
Current slew limiting: the shared knob for EMI and electrical stress
- Reference slew: limit how fast the current command changes; reduces “loop punch” during dimming transitions.
- Loop interaction: if the loop bandwidth is high but sense noise is present, fast command steps translate into visible ripple—fix sensing/coupling before pushing speed.
- Gate-edge influence: slowing edges can reduce TP_SW peaks and ringing, but may increase switching loss—verify with temperature points and efficiency.
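Reference slew limiting can be illustrated with a discrete-time ramp generator. This is a minimal sketch assuming the current command is updated at a fixed period by firmware or a digital controller; the function and parameter names are hypothetical.

```python
def slew_limited_ramp(i_target_a, max_slew_a_per_s, dt_s, i_start_a=0.0):
    """Return reference samples stepping from i_start_a to i_target_a.

    Each update moves the command by at most max_slew_a_per_s * dt_s.
    """
    max_step = max_slew_a_per_s * dt_s
    if max_step <= 0.0:
        raise ValueError("slew rate and update period must be positive")
    samples = [i_start_a]
    i_ref = i_start_a
    while i_ref != i_target_a:
        delta = i_target_a - i_ref
        if abs(delta) <= max_step:
            i_ref = i_target_a            # final step lands exactly on target
        else:
            i_ref += max_step if delta > 0 else -max_step
        samples.append(i_ref)
    return samples
```

The same limiter serves both soft-start (ramp from zero) and dimming steps (ramp between two setpoints), so one parameter sets the worst-case dI/dt seen by the loop.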
How to prove overshoot is fixed (minimum evidence set)
- ILED start-up waveform: peak current reduced and bounded; no secondary bump after settling; slope controlled.
- SW node peak: TP_SW peak and ringing reduced; no abnormal overshoot beyond device margin.
- VIN behavior: TP_VIN dip reduced; recovery faster; no repeated re-enable cycles.
- Fault flags (if available): SCP/OVP/UVLO counters stop increasing during start events.
H2-7. NTC compensation & thermal derating (fast enough, not too fast)
NTC placement & representativeness (board temperature ≠ junction hot spot)
- Pick the protected object first: LED board hot spot, MOSFET hot spot, and inductor hot spot can rise at different rates. NTC location decides what gets protected “early”.
- Board NTC is a proxy variable: it does not measure LED junction temperature directly, but it can be a stable indicator when the thermal path is consistent.
- Validate representativeness with a step test: apply a controlled power step and record NTC code + ILED + (optional) a hot-spot T-point to quantify lag and sensitivity.
Filtering & response time (fast enough to protect, slow enough to avoid flicker)
- Why flicker happens: ADC quantization + sampling noise + airflow/thermal micro-variation can jitter the temperature code; if I-limit follows instantly, ILED will “hunt”.
- Two-layer stabilization: smooth the temperature estimate (filter), then limit how fast the current limit can move (slew-limit) to avoid visible brightness steps.
- Evidence cue: if ILED ripple or low-frequency wobble aligns with NTC-code jitter, reduce noise sensitivity before adjusting loop speed.
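The two-layer stabilization above can be sketched as a first-order IIR filter on the temperature estimate followed by a step limit on the current limit. The class, the filter coefficient, and the `derate_fn` callback are illustrative assumptions; a real design would pick them from the measured NTC noise and thermal lag.

```python
class SmoothedLimit:
    """Low-pass the temperature, then slew-limit the current limit."""

    def __init__(self, alpha, max_step_a, t_init_c, i_init_a):
        self.alpha = alpha              # 0 < alpha <= 1, smaller = heavier filtering
        self.max_step_a = max_step_a    # largest limit change per update
        self.t_filt_c = t_init_c
        self.i_limit_a = i_init_a

    def update(self, t_raw_c, derate_fn):
        # Layer 1: smooth the temperature estimate.
        self.t_filt_c += self.alpha * (t_raw_c - self.t_filt_c)
        # Layer 2: move the limit toward its target by a bounded step.
        delta = derate_fn(self.t_filt_c) - self.i_limit_a
        delta = max(-self.max_step_a, min(self.max_step_a, delta))
        self.i_limit_a += delta
        return self.t_filt_c, self.i_limit_a
```

Because the limit moves by at most `max_step_a` per update, one noisy temperature sample cannot produce a visible brightness step, while a sustained temperature rise still pulls the limit down.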
Piecewise derating curve (start point, slope, max derate, hysteresis)
- Tstart: no derating below this point to prevent “nuisance derate” near ambient and to keep low-level dimming stable.
- Slope region: reduce I-limit predictably with temperature to keep power density within thermal headroom.
- Imin / max derate: enforce a floor so the system stays controllable and avoids repeated shutdown/retry.
- Hysteresis (Tstart vs Trecover): separate trip and recover boundaries to prevent boundary chatter and thermal oscillation.
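One way to realize the piecewise curve with hysteresis is a backlash on the temperature fed into the curve: the limit falls immediately on heating but recovers only after the temperature has dropped by the hysteresis band. This is one possible interpretation, not the only valid one; the class and parameter names are hypothetical, and Trecover here equals Tstart minus the band.

```python
class DerateCurve:
    """Piecewise-linear I-limit with backlash-style hysteresis (a sketch)."""

    def __init__(self, t_start_c, slope_a_per_c, i_max_a, i_min_a, hyst_c):
        self.t_start_c = t_start_c
        self.slope_a_per_c = slope_a_per_c
        self.i_max_a = i_max_a
        self.i_min_a = i_min_a
        self.hyst_c = hyst_c
        self.t_eff_c = None             # temperature actually fed to the curve

    def curve(self, t_c):
        # Flat below Tstart, linear slope above it, floored at Imin.
        i = self.i_max_a - self.slope_a_per_c * max(0.0, t_c - self.t_start_c)
        return max(self.i_min_a, i)

    def update(self, t_c):
        if self.t_eff_c is None or t_c > self.t_eff_c:
            self.t_eff_c = t_c                    # heating: follow immediately
        elif t_c < self.t_eff_c - self.hyst_c:
            self.t_eff_c = t_c + self.hyst_c      # cooling: follow with a lag
        return self.curve(self.t_eff_c)
```

With Tstart = 70 °C, 0.1 A/°C, a 3 A ceiling, a 1 A floor and a 5 °C band, heating to 80 °C yields 2 A, cooling to 78 °C holds 2 A, and full current returns only at 65 °C. The limit never jumps, which avoids the brightness step that an on/off hysteresis comparator can cause.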
Evidence checklist (minimum set to prove derating is controlled)
- NTC voltage / ADC code: capture the raw code and the filtered estimate; confirm stability (no fast jitter-driven limit changes).
- Temperature estimate curve: map code → temperature (or normalized temperature units) and verify repeatability across runs.
- ILED vs temperature curve: demonstrate the intended piecewise I-limit behavior and hysteresis window.
- Thermal step response: apply a controlled power step and confirm the derate action is fast enough to stop runaway, yet not so fast that it causes flicker/hunting.
H2-8. OTP strategy (where to sense, thresholds, hysteresis, recovery)
Where to sense temperature (choose the most relevant risk point)
- Controller die sensor: fast and simple, but it protects the controller first; it may not track MOSFET/inductor/LED hot spots under high current.
- Board NTC: closer to system behavior and easy to combine with derating (H2-7), but response depends on placement and thermal lag.
- External sensor near a hot spot: best representativeness for the protected object, but requires clean routing/ADC strategy to avoid noise-driven false trips.
- Practical rule: sense where temperature rises fastest under worst-case load; then add hysteresis and a recovery policy that avoids repeated thermal shocks.
Thresholds & hysteresis (avoid nuisance trips and boundary chatter)
- Trip temperature: define the boundary where continued operation is unsafe (or where derating is no longer sufficient).
- Recover temperature: require a cooler return point to re-enable; this hysteresis window prevents rapid state toggling near the threshold.
- Hold / debounce: apply minimum dwell time in each state so ADC noise or transient airflow does not create false trips.
- Evidence: log trip temperature, recover temperature, and dwell times to make behavior explainable in the field.
Recovery policy (latch / hiccup / foldback) — selection rationale
- Latch: safest for persistent faults (blocked airflow, heatsink failure). Clear and deterministic, but requires manual reset.
- Hiccup (retry): useful for temporary over-temperature, but can create repeated thermal cycling if the root cause remains.
- Foldback: reduce current to keep the system alive while pulling temperature back; usually best for lighting continuity, but must be stable and rate-limited.
- High-current recommended pattern: Normal → Foldback first; if temperature still rises, transition to Shutdown; then Retry based on cooldown conditions.
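The recommended pattern can be written as a small state machine. The sketch below is one possible encoding: thresholds, the minimum dwell time, and the choice to let the trip temperature bypass the dwell timer are all assumptions made for illustration.

```python
from enum import Enum


class State(Enum):
    NORMAL = "normal"
    FOLDBACK = "foldback"
    SHUTDOWN = "shutdown"
    RETRY = "retry"


def next_state(state, t_c, dwell_s, *, t_fold_c=85.0, t_trip_c=105.0,
               t_recover_c=75.0, min_dwell_s=5.0):
    """Return the next OTP state given temperature and time in current state."""
    # Assumption: reaching the trip temperature shuts down without waiting.
    if state is not State.SHUTDOWN and t_c >= t_trip_c:
        return State.SHUTDOWN
    # Every other transition waits out the minimum dwell (debounce).
    if dwell_s < min_dwell_s:
        return state
    if state is State.NORMAL:
        return State.FOLDBACK if t_c >= t_fold_c else State.NORMAL
    if state is State.FOLDBACK:
        return State.NORMAL if t_c <= t_recover_c else State.FOLDBACK
    if state is State.SHUTDOWN:
        return State.RETRY if t_c <= t_recover_c else State.SHUTDOWN
    # RETRY: restart at reduced current and rejoin the foldback path.
    return State.FOLDBACK
```

Restarting through Foldback rather than directly into Normal keeps the first seconds after a retry at reduced power, which limits the thermal shock if the root cause is still present.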
Evidence fields (make OTP behavior debuggable, not “random flicker”)
- Trip_T / Recover_T: record the sensor source and the values at each transition.
- Trip count: confirm whether a specific load or ambient corner causes repeated trips.
- State log (if available): Normal / Foldback / Shutdown / Retry timestamps and dwell times.
- Thermal path: identify which location rises first (NTC vs T1/T2/T3) to separate “power-stage hot spot” from “ambient” causes.
H2-9. Lumen maintenance (aging compensation without killing lifetime)
Make lumen maintenance an executable strategy (model → clamp → verify)
- Goal: maintain target luminous output over time while keeping current and temperature within safe lifetime constraints.
- Key rule: compensation is always bounded; uncontrolled “more current” accelerates aging and can raise hotspot risk.
- Implementation shape: compute a compensation coefficient from runtime and stress history, then apply it through a hard clamp and a rate limit.
Compensation inputs (strategy-only, low-cost signals)
- Runtime counter: hours/minutes accumulation (primary driver for scheduled compensation).
- Temperature history: use a histogram (time-in-bins) rather than instantaneous temperature to capture thermal stress exposure.
- Current stress: track time spent in current bands (or dimming bands) to differentiate gentle use from high-stress use.
Safety boundaries (compensation must respect lifetime and thermal headroom)
- Max compensation ratio: cap the coefficient so ILED never exceeds a defined lifetime boundary (prevent “aging by compensation”).
- Temperature gate: when temperature history indicates sustained high stress, limit or freeze upward compensation.
- Rate limiting: apply a slew limit to avoid visible brightness steps and to prevent abrupt electrical stress.
- Hard clamp precedence: OTP / derating (H2-7/H2-8) overrides lumen maintenance at high temperature.
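The model, clamp, and rate-limit chain can be sketched as follows. The linear aging term, the 10 % cap, the hot-time gate, and all names are placeholders chosen for illustration. A real policy would be fitted to the LED vendor's lumen-maintenance data.

```python
def update_k_comp(k_prev, runtime_h, hot_fraction, *, aging_per_kh=0.01,
                  k_max=1.10, hot_gate=0.30, max_step=0.001):
    """Return (new coefficient, clamp flag) for one compensation update.

    hot_fraction is the share of runtime spent in the high-temperature
    histogram bins (an assumed summary of the temperature history).
    """
    k_raw = 1.0 + aging_per_kh * runtime_h / 1000.0   # assumed linear model
    k_target = min(k_raw, k_max)                      # max compensation ratio
    clamped = k_raw > k_max
    if hot_fraction > hot_gate and k_target > k_prev:
        return k_prev, True                           # temperature gate: freeze
    step = max(-max_step, min(max_step, k_target - k_prev))
    return k_prev + step, clamped                     # rate-limited move


def iled_setpoint(i_nominal_a, k_comp, i_thermal_limit_a):
    """Thermal derating/OTP limit always overrides lumen compensation."""
    return min(i_nominal_a * k_comp, i_thermal_limit_a)
```

Logging the returned clamp flag alongside runtime and the coefficient makes it possible to explain later why compensation stopped increasing.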
Proof that compensation is effective and stable (record → compare → explain)
- Record fields: runtime, temperature histogram, compensation coefficient, ILED setpoint, clamp flag.
- Compare trends: same runtime under different temperature histories should yield different (more conservative) coefficients.
- Audit stability: coefficient and ILED setpoint should change smoothly (no jumps); clamp events should be explainable by temperature gates.
- Lumen sampling: use factory reference points or periodic spot checks to validate that compensation tracks expected lumen drift.
H2-10. Reliability pitfalls unique to high-current COB/CSP (hotspot, wiring, paralleling)
Hotspots dominate (average temperature is not the failure predictor)
- Thermal resistance chain matters: Junction → Case → TIM → Heatsink → Ambient is only as good as its worst segment.
- TIM & clamping force: non-uniform contact or voids raise local thermal resistance and create persistent hotspots.
- Actionable check: verify temperature rise sequence (which node heats first) using a few fixed measurement points.
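The resistance chain lends itself to a quick steady-state estimate that also shows which segment contributes most of the rise. The sketch assumes a one-dimensional series chain with all heat flowing through every segment. The segment names and values are hypothetical, and `p_heat_w` is the thermal power (electrical input minus radiated light), not the electrical power.

```python
def junction_estimate(t_ambient_c, p_heat_w, r_theta_c_per_w):
    """Return (Tj estimate, per-segment rise, name of the worst segment)."""
    rises = {name: p_heat_w * r for name, r in r_theta_c_per_w.items()}
    t_j = t_ambient_c + sum(rises.values())
    worst = max(rises, key=rises.get)       # segment with the largest rise
    return t_j, rises, worst
```

For example, 30 W of heat through 0.5, 0.3 and 1.2 °C/W segments at 40 °C ambient gives an estimated 100 °C junction temperature, with the heatsink-to-ambient segment dominating. A measured rise across one segment that is well above its estimate points to that interface, such as a TIM void.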
Wiring, connectors, and joints become “hidden heaters” at high current
- Milli-ohms matter: contact resistance in connectors, crimps, solder joints, and terminals can generate significant heat.
- Voltage drop is a diagnostic: measure segment drops across harness and joints to locate abnormal resistance before it becomes a burn point.
- Evidence: correlate local temperature rise with local voltage drop to isolate the weakest connection.
Parallel COB risks (current sharing basics, keep it minimal but enforceable)
- Sharing is not automatic: Vf and thermal gradients can bias current into one COB, increasing its temperature and pulling even more current.
- Basic principles: keep paths symmetric, introduce a sharing mechanism (even simple per-branch impedance), and verify balance at hot steady state.
- Evidence: measure per-branch current or per-branch drop at cold and hot; confirm no branch becomes a runaway hotspot.
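The effect of per-branch impedance on sharing can be seen with a linearized model. Each COB is approximated as a threshold voltage plus a dynamic resistance, which is only a sketch around one operating point. All names and numbers are assumptions, and the result is valid only while every branch current stays positive.

```python
def branch_currents(i_total_a, vf0_v, r_dyn_ohm, r_ballast_ohm=0.0):
    """Split a constant total current across parallel branches.

    Branch model (assumed): V = vf0 + I * (r_dyn + r_ballast).
    """
    r = [rd + r_ballast_ohm for rd in r_dyn_ohm]
    g_sum = sum(1.0 / ri for ri in r)
    # Common node voltage from the condition that branch currents sum to i_total.
    v_common = (i_total_a + sum(v / ri for v, ri in zip(vf0_v, r))) / g_sum
    return [(v_common - v) / ri for v, ri in zip(vf0_v, r)]
```

Two branches with a 0.2 V threshold mismatch and 0.5 Ω dynamic resistance split 4 A as 2.2 A and 1.8 A. Adding 0.5 Ω per branch narrows this to 2.1 A and 1.9 A at the cost of extra dissipation. Since the threshold voltage falls as a COB heats up, the hot steady-state measurement listed above is the one that counts.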
Evidence toolkit (minimum fields for field-debug and acceptance)
- Hotspot distribution: IR snapshot or thermocouple map (fixed points are sufficient if repeatable).
- Connector/joint rise: temperature at joints + segment voltage drops.
- Localization method: step the load and observe which node rises first and fastest (thermal path identification).
H2-11. Validation & field debug (symptom → evidence → isolate → fix)
Quick-debug discipline (minimum tools, maximum discrimination)
- Minimum tools: oscilloscope (prefer differential for SW), DMM, thermocouple or IR spot, and a current probe (or Rsense drop).
- Fixed measurement points: VIN, ILED (or Rsense), COMP, SW node, NTC / temperature, FAULT/PG (if available).
- Rule: capture two measurements first, then branch using evidence. Avoid “random part swapping”.
Symptom A — Visible flicker / shimmer (brightness hunting)
First 2 measurements
- ILED waveform (or Rsense drop): check low-frequency envelope, step-like drops, or sawtooth wobble.
- COMP waveform: check saturation, periodic oscillation, or slow “breathing” behavior.
Discriminator (evidence to separate the two causes)
- If ILED wobble is phase-aligned with COMP oscillation → loop stability / noise injection into the control path.
- If COMP is stable but ILED shows periodic clamp/steps → thermal derating (NTC) or protection state toggling.
First fix (minimal change) + example MPNs
- Loop/noise path: reduce sense noise before changing magnetics; add/adjust sense RC filtering; verify Kelvin routing. Example current-sense amplifiers: TI INA240A1/A2, TI INA181, ADI LT6106.
- Derating/protection chatter: add hysteresis/hold time and rate-limit derating updates. Example temp sensors (digital): TI TMP117, Microchip MCP9808. Example NTCs: TDK/EPCOS B57560 series, Murata NCP series.
Symptom B — Startup flash / overshoot (one bright spike at turn-on)
First 2 measurements
- ILED startup waveform: capture peak overshoot and rise slope.
- VIN waveform: look for dip/bounce during inrush or soft-start ramp.
Discriminator
- If ILED overshoots while VIN is clean → soft-start / current-slew control is too aggressive (reference injection or clamp timing).
- If VIN dips and SW shows bursts → input supply limitation or protection interaction (UVLO / brownout) causing re-entry behavior.
First fix + example MPNs
- Control-side clamp: slow Iref ramp, add a current-slew limiter, and ensure clamp precedes PWM enable. Example LED controllers (external stage capable): ADI LT3763, TI LM3409 (buck controller).
- Input transient control: add an input eFuse/hot-swap limiter when wiring is long or supply is soft. Example eFuses/surge stoppers: TI TPS25947, TI TPS2660, ADI LTC4365.
Symptom C — Unexpected dimming (output drops after warm-up or over hours)
First 2 measurements
- NTC/temperature trace + ILED setpoint: check whether current reduction tracks temperature or time.
- FAULT/PG or mode pin (if present): confirm whether the system enters derating/foldback states.
Discriminator
- If ILED reduction correlates with NTC rising past a boundary → thermal derating/OTP strategy.
- If ILED changes slowly with runtime but temperature is moderate → lumen-maintenance compensation policy (clamp/rate-limit verification needed).
First fix + example MPNs
- Thermal path first: validate hotspot vs average temperature; confirm TIM/contact quality before raising thresholds. Example TIM materials: 3M 8810 (thermally conductive adhesive transfer tape), Laird Tflex family.
- Policy side: add max clamp + rate limit on compensation coefficient. Example EEPROM/NVM (for storing coefficients): Microchip 24LCxx series, ST M24Cxx series.
Symptom D — Hot to touch / localized hotspot (one area overheats)
First 2 measurements
- Hotspot temperature map: IR scan or multi-point thermocouple (LED board, MOSFET, inductor, connector).
- Voltage drop across joints: measure ΔV across connector/crimp/solder joint under load.
Discriminator
- If hotspot is at TIM/contact region with small electrical ΔV → thermal contact / clamping force / void issue.
- If hotspot aligns with a joint showing abnormal ΔV → resistive connection (hidden heater) dominates.
First fix + example MPNs
- Connection heating: replace/upgrade the joint path; reduce contact resistance before changing silicon. Example high-current connector families: Molex Mini-Fit Jr., Molex Micro-Fit 3.0 (select exact housings/terminals by current rating).
- Power-stage hotspot: redistribute loss with a more suitable MOSFET (Rds(on)/Qg trade-off) and verify gate drive. Example gate drivers: TI UCC27211, Infineon 1EDN7550, ADI LTC4440.
Symptom E — Protection chatter (repeated foldback / shutdown / retry)
First 2 measurements
- VIN + ILED during the event: capture whether VIN dip precedes shutdown or ILED clamp precedes VIN dip.
- NTC/OTP state (or FAULT pin): confirm state transitions and dwell times.
Discriminator
- If VIN dip comes first → input supply / wiring / inrush limitation issue (UVLO interaction).
- If NTC crosses trip and recovery is too close → hysteresis/hold time too small (boundary chatter).
First fix + example MPNs
- Input-side: add controlled inrush and brownout immunity. Example eFuses/hot-swap: TI TPS25947, ADI LTC4365.
- OTP policy: add hysteresis + minimum off-time, then retry with foldback first. Example temperature sensors: TI TMP117, NXP PCT2075; Example NTCs: EPCOS B57560 series.
Symptom F — No light / intermittent light (dead or unstable output)
First 2 measurements
- SW node waveform: confirm switching activity, ringing severity, and abnormal pulse bursts.
- ILED (or Rsense): confirm whether current is truly zero or being clamped/shutdown by protection.
Discriminator
- If SW is present but ILED is near zero → open LED path / connector / sense path error / OVP behavior.
- If SW is absent and FAULT/UVLO indicates reset → VIN instability or controller brownout / latch-off state.
First fix + example MPNs
- Ring control: tame SW overshoot before blaming the LED; add snubber and gate damping. Example gate drivers: TI UCC27211, Infineon 1EDN7550. Example TVS families for rail clamping (board-level): Littelfuse SMBJ series, Vishay SMBJ series.
- Sense-path integrity: validate Kelvin sense and amplifier headroom. Example sense amplifiers: TI INA240, ADI LT6106.
Minimum field log (make issues explainable)
- VIN: min dip / max spike / repetition rate during fault.
- ILED: overshoot peak, ripple p-p, step response to dimming or load changes.
- COMP: saturation, oscillation envelope, recovery time.
- SW: ringing amplitude, edge speed, abnormal burst patterns.
- NTC / temperature: raw code and filtered estimate, trip/recover points, dwell times.
- Protection state: FAULT/PG transitions and counts.
H2-12. FAQs ×12 (evidence-first, no scope creep)
Answer format per question: Short answer (1 sentence) + What to measure (2 points) + First fix (1 point). Each fix includes example MPN(s).
COB brightness “jitters” then stabilizes — soft-start overshoot or loop compensation?
Short answer: If ILED spikes while COMP stays calm, it’s soft-start / current-slew; if COMP hunts with ILED, it’s loop compensation or sense noise injection.
What to measure: (1) ILED turn-on peak and rise slope. (2) COMP envelope (saturation/oscillation) during the same window.
First fix: Slow the current reference ramp (soft-start/slew) before changing power parts; add inrush control if VIN dips (e.g., TI TPS25947; controller example ADI LT3763).
When hot, current is unchanged but brightness drops a lot — thermal path or lumen-aging model?
Short answer: If hotspot temperatures rise abnormally at constant ILED, suspect the Rθ chain/TIM contact; if temperatures are stable but compensation coefficient ramps, suspect the aging/maintenance policy.
What to measure: (1) Hotspot map (Tcase/Tsink points) vs time at fixed ILED. (2) Runtime + k_comp + clamp flag (log fields).
First fix: Fix contact/TIM and sensor representativeness first, then clamp and rate-limit k_comp (sensor example TI TMP117; NVM example ST M24C02).
NTC connected and brightness starts “hunting” — filter too fast or derate slope too steep?
Short answer: Hunting usually comes from a noisy/fast temperature estimate or a steep derating slope near a boundary, which converts small temp jitter into visible current steps.
What to measure: (1) Raw NTC ADC code vs filtered temperature estimate. (2) ILED steps aligned to temp updates or threshold crossings.
First fix: Add low-pass + hysteresis + rate limit to the derate curve; keep sensing stable (NTC example EPCOS B57560 series; digital temp example Microchip MCP9808).
High current efficiency is poor and MOSFET is very hot — Rds(on) choice or Qg/drive shortage?
Short answer: If switching edges are slow and gate drive is weak, Qg/drive loss dominates; if edges look healthy but conduction drop is high, Rds(on) is the main limiter.
What to measure: (1) VGS waveform (peak, rise/fall time, ringing) under load. (2) MOSFET temperature split (high-side vs low-side) and duty.
First fix: Upgrade gate drive capability and damping before swapping MOSFETs (driver examples TI UCC27211 or Infineon 1EDN7550).
Inductor is not hot but it “squeals” — loop instability or ΔIL exciting magnetostriction?
Short answer: If COMP oscillates with the audible tone, it’s control hunting; if COMP is stable but ripple current is high, ΔIL can excite magnetic forces even with modest temperature rise.
What to measure: (1) COMP ripple/limit cycling at the squeal frequency. (2) Inductor current ripple ΔIL (current probe or Rsense method).
First fix: Separate “control oscillation” from “ripple excitation” first; then retune compensation, or reduce ΔIL by increasing L or adjusting fSW (sense amp example TI INA240 for cleaner current capture).
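To see how L and fSW trade against ripple, a quick estimate is useful. The sketch below assumes a buck stage in continuous conduction with the LED string voltage as the output; other topologies need a different expression, and the numbers are illustrative only.

```python
def buck_ripple_current(v_in, v_out, inductance_h, f_sw_hz):
    """Peak-to-peak inductor ripple for an ideal buck in CCM, in amps."""
    if not 0.0 < v_out < v_in:
        raise ValueError("buck requires 0 < v_out < v_in")
    duty = v_out / v_in
    # Delta-IL = Vout * (1 - D) / (L * fSW)
    return v_out * (1.0 - duty) / (inductance_h * f_sw_hz)


# Illustrative case: 48 V input, 36 V LED string, 22 uH, 400 kHz.
ripple_22u = buck_ripple_current(48.0, 36.0, 22e-6, 400e3)
# Doubling the inductance halves the ripple at the same frequency.
ripple_44u = buck_ripple_current(48.0, 36.0, 44e-6, 400e3)
```

If the measured ΔIL matches this estimate and COMP is quiet, the squeal is more likely ripple excitation than loop instability.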
Short-circuit protection trips only at turn-on — real short or false trip from sense noise/ground bounce?
Short answer: If the “short” event coincides with SW ringing and a narrow Rsense spike, it’s usually a false trip caused by layout/sense pickup rather than a true load short.
What to measure: (1) Differential Rsense waveform at turn-on (peak width and amplitude). (2) SW ringing amplitude and edge speed in the same time window.
First fix: Enforce Kelvin sense + add sense RC filtering + tame SW ringing (sense amp TI INA240; rail TVS example SMBJ series).
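The effect of a sense RC filter on a narrow turn-on spike can be estimated before changing the board. The sketch below models the spike as a rectangular pulse into a single-pole RC, which is a simplifying assumption; the component values are illustrative.

```python
import math


def rc_corner_hz(r_ohm, c_farad):
    """-3 dB corner frequency of a single-pole RC filter."""
    return 1.0 / (2.0 * math.pi * r_ohm * c_farad)


def filtered_spike_peak(amplitude_v, width_s, r_ohm, c_farad):
    """Peak at the RC output for a rectangular input pulse.

    The output charges toward `amplitude_v` for `width_s`, so the peak is
    A * (1 - exp(-tw / RC)).
    """
    tau = r_ohm * c_farad
    return amplitude_v * (1.0 - math.exp(-width_s / tau))


# Illustrative case: 200 mV, 50 ns spike into 100 ohm / 1 nF (tau = 100 ns).
peak = filtered_spike_peak(0.200, 50e-9, 100.0, 1e-9)
corner = rc_corner_hz(100.0, 1e-9)
```

The same time constant also delays the response to a real short, so the filter should be sized against the measured spike width and not made larger than needed.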
LED occasionally turns off then recovers — OVP/open-string false trigger or connector voltage drop?
Short answer: If FAULT/OVP indicates open/OV events while ΔV across a joint rises with temperature, wiring/contact resistance is likely; otherwise, OVP/open-string detection may need blanking/debounce.
What to measure: (1) FAULT/OVP pin/state during the dropout. (2) ΔV across the connector/joint plus its local temperature rise.
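First fix: If ΔV across a joint tracks temperature, repair that contact first; otherwise add a debounce/blanking window to OVP/open-string detection and reduce supply droop with an input limiter (e.g., TI TPS25947; input protection controller example ADI LTC4365).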
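A debounce/blanking window can be expressed as a small sample-based filter. The sketch below is a minimal illustration, assuming the raw OVP/open-string comparator is sampled at a fixed tick; `blank_n` and `debounce_n` are hypothetical tuning parameters.

```python
def debounced_fault(samples, blank_n, debounce_n):
    """Return the sample index at which a fault is accepted, or None.

    samples:    iterable of booleans, raw comparator state per tick.
    blank_n:    number of initial ticks ignored after turn-on (blanking).
    debounce_n: consecutive True ticks required to accept the fault.
    """
    run = 0
    for i, raw in enumerate(samples):
        if i < blank_n:
            continue              # inside the blanking window
        run = run + 1 if raw else 0
        if run >= debounce_n:
            return i
    return None
```

A short glitch inside the blanking window or shorter than the debounce count is ignored, while a sustained open-string condition is still reported after a bounded delay.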
Same BOM, different heatsink makes OTP easier — threshold too low or thermal path estimate wrong?
Short answer: If the sensed temperature is not representative of the hotspot (wrong location or poor coupling), OTP will look “too sensitive”; if sensing is correct, thresholds/hysteresis may be too tight for the new thermal dynamics.
What to measure: (1) Tcase/Tsink/hotspot points vs OTP trip/recover temperatures. (2) OTP state dwell time and trip count over repeated runs.
First fix: Fix sensing representativeness first, then widen hysteresis/min-off time (sensor examples TI TMP117 or Microchip MCP9808).
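Hysteresis plus a minimum off-time can be sketched as a two-state latch. This is an illustrative model only; the thresholds and the off-time are assumed values, and the trip counter corresponds to the "trip count over repeated runs" measurement above.

```python
class OtpLatch:
    """Over-temperature protection with hysteresis and minimum off-time."""

    def __init__(self, trip_c=110.0, recover_c=95.0, min_off_s=5.0):
        self.trip_c, self.recover_c, self.min_off_s = trip_c, recover_c, min_off_s
        self.tripped = False
        self.trip_time = None
        self.trip_count = 0

    def update(self, temp_c, now_s):
        """Feed one temperature sample; return True while output is off."""
        if not self.tripped:
            if temp_c >= self.trip_c:
                self.tripped = True
                self.trip_time = now_s
                self.trip_count += 1
        else:
            cooled = temp_c <= self.recover_c
            waited = (now_s - self.trip_time) >= self.min_off_s
            # Recover only when both conditions hold.
            if cooled and waited:
                self.tripped = False
        return self.tripped
```

Requiring both conditions keeps a fast-cooling sensor location from releasing the output while the real hotspot is still warm.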
Lumen maintenance is desired but lifetime is a concern — how to set the compensation upper limit?
Short answer: Set k_max from the worst-case hotspot temperature and the LED’s current/lifetime boundary, then clamp and rate-limit compensation so that holding lumen output never pushes the LED into runaway thermal stress.
What to measure: (1) Hotspot temperature distribution at the highest intended k_comp. (2) k_comp trend vs clamp flag frequency over runtime bins.
First fix: Implement hard clamp + temperature gate + slow update and store policy in NVM (NVM examples ST M24C02 or Microchip 24LC02B).
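The hard clamp, temperature gate and slow update can be combined in one update step. The sketch below is an assumption-laden illustration: `k_max`, `t_gate_c` and `max_step` are placeholder values that would come from the LED's lifetime data and the stored policy, and NVM access is left out.

```python
def next_k_comp(k_now, k_target, t_hot_c,
                k_max=1.15, t_gate_c=95.0, max_step=0.001):
    """Return (k_next, clamp_flag) for one compensation update.

    k_target: compensation requested by the lumen-maintenance model.
    clamp_flag is True when the request exceeded k_max (worth logging).
    """
    clamp_flag = k_target > k_max
    target = min(k_target, k_max)          # hard clamp
    if t_hot_c >= t_gate_c and target > k_now:
        return k_now, clamp_flag           # temperature gate: no increase while hot
    # Slow update: bounded step toward the clamped target.
    step = max(-max_step, min(max_step, target - k_now))
    return k_now + step, clamp_flag
```

Decreases are always allowed, so the gate can only slow compensation down, never hold it high.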
ILED ripple looks small, but visual comfort is still poor — which two waveforms prove the transient?
Short answer: Discomfort often comes from transient events (start/dim edges, burst recovery), not steady ripple; prove it by correlating ILED edge behavior with control or input recovery.
What to measure: (1) ILED during dimming edges and load steps (overshoot/undershoot and settling). (2) COMP recovery or VIN dip at the same moments.
First fix: Limit current slew and smooth edge transitions; stabilize input if droop triggers burst behavior (e.g., TI TPS25947; controller example ADI LT3763).
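Overshoot and settling on a dimming edge can be extracted from captured ILED samples. The sketch below assumes a rising step toward a known final current and a hypothetical ±2 % settling band; it works on plain lists exported from a scope.

```python
def edge_metrics(t, i, i_final, band=0.02):
    """Return (overshoot_percent, settling_time) for a rising current step.

    t, i:    equal-length lists of time and ILED samples.
    band:    settling band as a fraction of i_final (assumed 2 %).
    settling_time is measured from t[0]; None if never settled.
    """
    overshoot_pct = max(0.0, (max(i) - i_final) / i_final * 100.0)
    tol = band * abs(i_final)
    last_out = -1
    for k, sample in enumerate(i):
        if abs(sample - i_final) > tol:
            last_out = k                  # last sample outside the band
    if last_out == len(i) - 1:
        return overshoot_pct, None        # still outside at end of capture
    return overshoot_pct, t[last_out + 1] - t[0]
```

Running this on the same time window as the COMP or VIN capture makes it easy to line up each overshoot with the control or input event that caused it.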
Measured ILED is accurate, but color / uniformity is poor — hotspot first or connection drop first?
Short answer: If the hotspot map shows strong gradients, thermal non-uniformity is the top suspect; if hotspots are mild but ΔV varies across wiring/joints, electrical drop can skew Vf and local heating, hurting uniformity.
What to measure: (1) Hotspot distribution across COB/MCPCB and near contacts. (2) ΔV across key joints/harness segments at rated current.
First fix: Fix contact/TIM and high-ΔV joints before touching calibration (temp sensor example TI TMP117; sense amp example TI INA240 for clean ΔV capture).
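Turning ΔV readings into resistance and local dissipation makes the worst joint obvious. The sketch below is illustrative: the joint names, the example drops and the `r_limit_ohm` flag threshold are assumptions, not recommended limits.

```python
def joint_report(i_led_a, drops_v, r_limit_ohm=0.010):
    """Rank joints by resistance from measured voltage drops.

    drops_v: dict of joint name -> measured drop in volts at i_led_a.
    Returns rows of (name, resistance_ohm, power_w, flagged), worst first.
    """
    rows = []
    for name, dv in drops_v.items():
        r = dv / i_led_a              # R = dV / I
        p = dv * i_led_a              # local heating P = dV * I
        rows.append((name, r, p, r > r_limit_ohm))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Illustrative measurement at 3 A.
report = joint_report(3.0, {"connector": 0.090, "harness": 0.015, "solder": 0.006})
```

Repeating the measurement hot and cold shows whether a flagged joint also drifts with temperature, which points to contact resistance and away from calibration.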
After recovery, protection keeps tripping again — hiccup policy or hysteresis window is wrong?
Short answer: Re-trip loops usually mean the recovery threshold is too close to the trip point or retry is too aggressive; foldback-first with sufficient hysteresis often breaks the cycle without masking real faults.
What to measure: (1) FAULT/state transitions with timestamps (trip → retry → trip). (2) Temperature/NTC and VIN around each retry to see which boundary is crossed.
First fix: Increase hysteresis + minimum off-time and prefer foldback before full retry (temp sensor example Microchip MCP9808; controller example ADI LT3763 for programmable behavior).
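A foldback-first policy can be sketched as a small escalation ladder. The current levels and retry limit below are hypothetical; the point is the ordering: reduce current first, retry at the lowest level next, and latch off only when retries are exhausted so a real fault is not masked.

```python
class FoldbackPolicy:
    """Escalation on repeated trips: foldback, then hiccup retry, then latch."""

    LEVELS = (1.0, 0.7, 0.5)   # hypothetical current scale steps

    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self.level = 0          # index into LEVELS
        self.retries = 0        # hiccup retries used at the lowest level
        self.latched = False

    def on_trip(self):
        """Return (action, current_scale) for the next attempt."""
        if self.level < len(self.LEVELS) - 1:
            self.level += 1
            return "foldback", self.LEVELS[self.level]
        self.retries += 1
        if self.retries > self.max_retries:
            self.latched = True
            return "latch_off", 0.0
        return "hiccup_retry", self.LEVELS[self.level]
```

Logging each returned action with a timestamp gives the trip, retry, trip sequence listed under "What to measure".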