
Memory & HBM

HBM3e · HBM4 · CXL · DDR5 · DRAM scaling. Memory bandwidth is the primary performance constraint for AI inference, and HBM is the highest-margin product in DRAM.

Updated April 2026 · 7 min read


What It Is

High Bandwidth Memory (HBM) stacks multiple DRAM dies vertically using Through-Silicon Vias (TSVs) and places the resulting stack directly adjacent to the GPU die on a silicon interposer. The short electrical path and wide bus width (1,024–2,048 bits per stack vs. 64 bits for standard DRAM) deliver bandwidth 10–20× higher than conventional DDR at significantly lower power per bit.
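
Where those bandwidth figures come from is simple arithmetic: interface width times per-pin data rate. A minimal sketch follows; the pin rates are illustrative assumptions chosen to line up with the generation table below, not official spec values.

```python
# Minimal sketch: peak per-stack bandwidth = interface width (bits) x per-pin
# data rate, divided by 8 to convert bits to bytes. Pin rates here are
# illustrative assumptions, not authoritative spec figures.

def stack_bandwidth_gb_s(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak bandwidth of one memory stack (or channel) in GB/s."""
    return bus_width_bits * pin_rate_gbps / 8

print(f"HBM3  (1024-bit @ 6.4 Gb/s): {stack_bandwidth_gb_s(1024, 6.4):.0f} GB/s")  # ~819 GB/s
print(f"HBM3e (1024-bit @ 9.0 Gb/s): {stack_bandwidth_gb_s(1024, 9.0):.0f} GB/s")  # ~1,152 GB/s
print(f"DDR5 channel (64-bit @ 6.0 Gb/s): {stack_bandwidth_gb_s(64, 6.0):.0f} GB/s")  # ~48 GB/s
```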

Every generation of AI GPU has required more HBM. The NVIDIA H100 carried 80GB at 3.35 TB/s. The H200 upgraded to 141GB at 4.8 TB/s. The B200 carries 192GB at 8 TB/s across eight HBM3e stacks. NVIDIA's next-generation Rubin is designed for HBM4 at roughly 1.5 TB/s per stack. This progression is both a technical requirement (larger models and longer contexts need more weights and KV cache resident in memory) and a pricing ladder (HBM ASP grows with each generation).

Why It Matters for AI Capex

Memory bandwidth — not compute throughput — is the primary bottleneck for large language model inference. Serving a 70B-parameter model requires moving 70B × 2 bytes = 140 GB of weights through memory every generation step. At 3.35 TB/s (H100 HBM3), that takes ~42ms per forward pass. At 8 TB/s (B200 HBM3e), 17ms. Faster HBM directly translates to more tokens per second, lower cost per inference, and better economics for every AI application.
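
The same arithmetic generalizes into a simple bandwidth-bound latency model. A minimal sketch, assuming batch size 1 and 2-byte (FP16/BF16) weights, and ignoring KV-cache traffic and compute time, both of which only add latency:

```python
# Bandwidth-bound decode model: each generated token streams every weight
# through HBM once. Assumes batch size 1 and 2-byte weights; ignores
# KV-cache traffic and compute time.

def decode_ms_per_token(params_billion: float, hbm_tb_per_s: float,
                        bytes_per_param: int = 2) -> float:
    weight_bytes = params_billion * 1e9 * bytes_per_param
    return weight_bytes / (hbm_tb_per_s * 1e12) * 1e3  # seconds -> milliseconds

for gpu, bw in [("H100, 3.35 TB/s", 3.35), ("B200, 8 TB/s", 8.0)]:
    ms = decode_ms_per_token(70, bw)  # 70B-parameter model
    print(f"{gpu}: {ms:.1f} ms/token -> ~{1000 / ms:.0f} tokens/s ceiling")
```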

HBM is the most supply-constrained and highest-margin product in DRAM. SK Hynix commands ~50% share, Micron ~25%, Samsung ~25%. HBM3e sells for roughly $15–20 per GB (vs. $3–5/GB for standard DRAM), a 4–5× ASP premium. At those prices, each 192GB B200 carries roughly $2,900–3,800 of HBM content. With 100,000 B200 units per quarter, that is roughly $290–380M in HBM content per quarter from NVIDIA alone, and NVIDIA is one customer among many.
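
Those dollar figures are just the product of the assumptions above; the sketch below makes the dependencies explicit. Every input is an estimate quoted in this section (or, for unit volume, a hypothetical), not a reported figure.

```python
# Back-of-envelope HBM dollar content, using this section's estimates as
# inputs (assumed ASP range, B200 capacity, hypothetical unit volume).

hbm_gb_per_gpu = 192          # B200: eight 24 GB HBM3e stacks
asp_per_gb = (15, 20)         # assumed HBM3e ASP range, $/GB
gpus_per_quarter = 100_000    # assumed NVIDIA B200 shipments per quarter

for asp in asp_per_gb:
    per_gpu = hbm_gb_per_gpu * asp
    per_quarter = per_gpu * gpus_per_quarter
    print(f"${asp}/GB -> ${per_gpu:,} of HBM per GPU, "
          f"${per_quarter / 1e6:,.0f}M per quarter")
```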

HBM Generation Progression

| Generation | Capacity/stack | BW/stack | Layers | First GPU |
|---|---|---|---|---|
| HBM2e | 16 GB | 460 GB/s | 8 DRAM | NVIDIA A100 |
| HBM3 | 24 GB | 819 GB/s | 12 DRAM | NVIDIA H100 |
| HBM3e | 24–36 GB | 1.15 TB/s | 12 DRAM | NVIDIA H200, B200 |
| HBM4 | 48+ GB | ~1.5 TB/s | 16 DRAM + logic base | NVIDIA Rubin (2026) |

Beyond HBM: The Broader Memory Stack

DDR5 / LPDDR5X

Server DRAM powering the CPU complex and model-serving host nodes. AI servers require 1–3TB of DDR5 per rack. The AI supercycle is pulling DDR5 demand well ahead of the ordinary non-AI server refresh cycle; Micron's server DRAM revenue grew 60%+ YoY in 2025. Every AI server rack needs both HBM (on the GPU package) and large DDR5 pools (in the host CPUs).

CXL (Compute Express Link)

PCIe-based protocol enabling memory pooling and expansion. CXL 3.0 allows GPUs to access disaggregated DRAM pools over PCIe — enabling effective memory capacities beyond what HBM alone provides. CXL-attached memory is cheaper per GB than HBM and enables new inference server architectures. MU is investing in CXL memory devices.

NAND / Storage

AI training requires large dataset storage — model checkpoints, tokenized corpora, and intermediate activations can reach petabytes per training run. NVMe SSDs (high-speed NAND) are the preferred storage tier. WDC's enterprise SSD portfolio benefits from AI storage demand, though NAND is more commoditized than DRAM.

Supply Chain Players

Micron Technology (MU)

The only US-based HBM manufacturer. Micron is the fastest-growing HBM supplier, having taken significant share from SK Hynix in 2025 through superior HBM3e yield. MU reported $23.9B revenue in fiscal Q2 2026, up 196% YoY, driven by HBM and server DRAM. Management guided Q3 2026 at $33.5B. MU's HBM yield advantage is now a structural profit driver — HBM gross margins exceed 60% vs. ~40% blended.

Western Digital (WDC)

WDC plays the storage side of AI data centers rather than HBM. The company recently separated its NAND/flash business, spinning it off as Sandisk Corp., which now houses the enterprise NVMe SSD portfolio that benefits from AI storage buildout. WDC itself retains hard disk drive (HDD) exposure for bulk cold storage in hyperscaler archival tiers.

Lam Research (LRCX)

Lam is a leading supplier of etch and deposition equipment for DRAM manufacturing. HBM is Lam-intensive: TSV formation, DRAM stack etch, and bonding-surface preparation all use Lam tools. As HBM capacity expands to serve AI GPU demand, Lam's HBM-related equipment revenue grows proportionally. DRAM represents ~35% of Lam's revenue.

Metrics to Watch

  • MU HBM revenue and market share: MU discloses HBM revenue each quarter. Share gains against SK Hynix beyond guidance are a major positive surprise; Micron started from roughly 5% HBM share and is targeting 25% by end-2026.
  • HBM ASP trajectory: HBM pricing is underpinned by tight supply. ASP erosion would signal either oversupply or aggressive pricing by SK Hynix to defend share.
  • MU quarterly revenue guide: The strongest near-term indicator of DRAM cycle health. Q3 FY2026 at $33.5B was a massive beat-and-raise.
  • DRAM bit supply vs. demand: DRAM is supply-disciplined (a three-player oligopoly). If bit supply grows more slowly than demand, prices rise, which is positive for MU margins.
  • HBM4 yield reports: HBM4 with 16-layer stacks and logic base die integration is a significant yield challenge. Any early yield issues delay next-gen GPU schedules.
  • CXL memory adoption rate: Wider CXL adoption increases DRAM TAM by enabling server memory expansion beyond installed slots.

Investment Signals

Bullish Triggers

  • MU HBM market share gain above guidance
  • HBM ASP stability or increase
  • Next-gen GPU announced with higher HBM content
  • DRAM bit demand exceeding supply forecasts
  • CXL 3.0 ecosystem deployments accelerating

Bearish Triggers

  • HBM ASP erosion (oversupply signal)
  • PC/mobile DRAM recovery pulling capacity from HBM
  • Samsung yield improvement (increases HBM supply)
  • Tariffs on memory imports into US data centers
  • NVIDIA GPU shipment shortfall (reduces HBM demand)
