AI Capex

Custom Silicon / XPUs

TPU · Trainium · Maia · AVGO XPU · MRVL custom ASIC. How hyperscalers are building semiconductor leverage to reduce NVIDIA dependency.

Updated April 2026 · 9 min read


What It Is

Custom silicon — also called XPUs, custom ASICs, or domain-specific accelerators — refers to chips designed by or for a specific company to run a specific workload more efficiently than general-purpose GPUs. Google's Tensor Processing Units (TPUs) are the archetype: a matrix multiply engine tuned to TensorFlow operations that delivers 2–5× better performance-per-watt on Google's models versus NVIDIA equivalents.

The XPU wave is now being executed at scale by all five hyperscalers. The economic logic is simple: at $30,000–50,000 per NVIDIA GPU and hundreds of thousands of units per year, a custom chip that is 30% more efficient saves billions annually in both capex and electricity. The NRE cost ($200–500M per tapeout) amortizes quickly at hyperscaler volume.
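The amortization claim above can be sanity-checked with back-of-envelope arithmetic. The inputs below are illustrative midpoints of the ranges in the text (unit cost, efficiency gain, NRE), not disclosed hyperscaler economics:

```python
# Back-of-envelope: how quickly does custom-silicon NRE amortize?
# All inputs are illustrative midpoints of the ranges in the text,
# not disclosed hyperscaler figures.
GPU_UNIT_COST = 40_000      # $/GPU, midpoint of the $30k-50k range
ANNUAL_UNITS = 300_000      # "hundreds of thousands of units per year"
EFFICIENCY_GAIN = 0.30      # custom chip assumed 30% more efficient
NRE_COST = 350e6            # one-time tapeout cost, midpoint of $200-500M

annual_gpu_spend = GPU_UNIT_COST * ANNUAL_UNITS       # GPU-equivalent capex
annual_savings = annual_gpu_spend * EFFICIENCY_GAIN   # capex avoided per year
# Electricity savings are excluded, so this estimate is conservative.
breakeven_months = NRE_COST / annual_savings * 12

print(f"Annual GPU-equivalent spend: ${annual_gpu_spend/1e9:.1f}B")
print(f"Annual savings at 30% gain:  ${annual_savings/1e9:.1f}B")
print(f"NRE breakeven:               {breakeven_months:.1f} months")
```

Even with inputs from the unfavorable end of each range, breakeven lands in months rather than years at hyperscaler volume, which is the amortization argument in the text.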

Why It Matters for AI Capex

Custom silicon is simultaneously a threat to NVIDIA and a massive opportunity for the companies designing and packaging these chips, specifically Broadcom (AVGO) and Marvell (MRVL), the primary chip design partners for hyperscaler XPUs. AVGO disclosed in late 2024 that its three XPU customers represent a combined serviceable addressable market of $60–90B by fiscal 2027, with each customer a $1–3B annual revenue opportunity in the near term.

The key nuance: custom silicon does not replace NVIDIA for training frontier models — it optimizes inference (serving) and specific training workloads. A hyperscaler running Llama inference at 100M queries/day gets better economics from a custom inference chip than from H100s. This allows hyperscalers to use NVIDIA for peak training, custom silicon for steady-state inference — a bifurcated market that expands total semiconductor spending rather than simply shifting share.
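The inference-economics point can be made concrete with a hypothetical cost comparison. Every input here (hourly accelerator cost, serving throughput) is an assumption chosen only to show the shape of the calculation, not a measured Llama serving cost:

```python
# Hypothetical steady-state inference economics: custom XPU vs. merchant GPU.
# Every input is an illustrative assumption, not a measured serving cost.
QUERIES_PER_DAY = 100_000_000   # the 100M queries/day workload from the text

GPU_COST_PER_HOUR = 4.00        # assumed fully loaded $/accelerator-hour
XPU_COST_PER_HOUR = 2.00        # assumed cheaper custom inference chip
GPU_QUERIES_PER_HOUR = 50_000   # assumed serving throughput per accelerator
XPU_QUERIES_PER_HOUR = 60_000   # assumed: chip tuned for this one model

def daily_cost(cost_per_hour: float, queries_per_hour: float) -> float:
    """Accelerator-hours needed to serve the day's load, times hourly cost."""
    accel_hours = QUERIES_PER_DAY / queries_per_hour
    return accel_hours * cost_per_hour

gpu_daily = daily_cost(GPU_COST_PER_HOUR, GPU_QUERIES_PER_HOUR)
xpu_daily = daily_cost(XPU_COST_PER_HOUR, XPU_QUERIES_PER_HOUR)
savings = 1 - xpu_daily / gpu_daily
print(f"GPU fleet: ${gpu_daily:,.0f}/day  XPU fleet: ${xpu_daily:,.0f}/day")
print(f"Custom silicon serves the same load {savings:.0%} cheaper")
```

The structure of the math is the point: at steady-state, known-workload volume, both a lower hourly cost and a model-tuned throughput gain compound in the custom chip's favor, while NVIDIA retains the peak training workload.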

Hyperscaler XPU Programs

| Company | Chip(s) | Node | Focus | Design Partner |
| --- | --- | --- | --- | --- |
| Google | TPU v4/v5/v6 | TSMC N4/N3 | Training + inference (Gemini) | In-house |
| Amazon | Trainium 2, Inferentia 3 | TSMC N3 | Training (AWS customers) + inference | Annapurna Labs (in-house) |
| Microsoft | Maia 2 | TSMC N3 | Azure AI inference, OpenAI workloads | In-house + MRVL |
| Meta | MTIA v2 | TSMC N3 | Recommendation models, Llama inference | In-house |
| Apple | M-series Neural Engine | TSMC N3E | On-device inference (Apple Intelligence) | In-house |

Supply Chain Players

Broadcom (AVGO)

The dominant merchant silicon partner for hyperscaler XPU programs. AVGO's custom silicon business (Google TPU v4/v5, Meta MTIA, another undisclosed hyperscaler) is growing faster than any other segment. Management guided the custom ASIC TAM at $60–90B by 2027 across its known customers. AVGO also makes the networking silicon (Jericho/Tomahawk) and optical PHYs that connect XPU clusters — full-stack AI infrastructure exposure.

Marvell Technology (MRVL)

MRVL is AVGO's smaller but fast-growing competitor in custom silicon. Microsoft's Maia 2 uses MRVL as a key co-designer, and MRVL also supplies Amazon with custom networking and infrastructure silicon. MRVL guided to $2.5B+ in custom AI revenue for FY2026 at 70%+ gross margins, its highest-margin business.

Arm Holdings (ARM)

Most custom AI chips lean on Arm IP somewhere in the design: Apple's Neural Engine sits inside Arm-architecture M-series SoCs, Amazon pairs Trainium with Arm-based Graviton hosts, and many XPUs embed Arm CPU cores for on-chip control. ARM collects a royalty per chip shipped and licenses its architecture for new designs, so as XPU volume grows, ARM's royalty stream grows with it. The company's 2025 CSS (Compute Subsystem) product packages a pre-validated SoC template, lowering NRE for new XPU entrants.

TSMC (TSM)

All hyperscaler custom chips are manufactured at TSMC — N3E for current-gen, with N2 migration beginning in 2026. TSMC's CoWoS advanced packaging is also required for XPUs with HBM. The XPU wave is additive to TSMC's N3/N2 utilization alongside NVIDIA's GPU orders — another layer of structural demand for leading-edge wafer starts.

Metrics to Watch

  • AVGO AI revenue ($): AVGO breaks out AI revenue each quarter. This is the best single-number proxy for hyperscaler XPU momentum.
  • MRVL custom silicon bookings/revenue: MRVL discloses custom AI as a separate segment. $2.5B FY2026 guide is the baseline to beat or miss.
  • ARM royalty growth (data center tier): ARM's data center royalty ASP is 5–10× mobile — growing mix here is a structural margin tailwind.
  • New hyperscaler XPU design wins: Each new disclosed hyperscaler as AVGO/MRVL customer adds $1–3B to addressable revenue.
  • TSMC N2 tape-out pipeline: XPU tape-outs at N2 signal second-gen programs. Each tape-out is $50–200M NRE — a leading revenue indicator.
  • NVIDIA data center GPU shipment share: Any share loss to custom silicon shows up as NVIDIA data center revenue miss — the key competitive battleground.
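The royalty-mix point above (data center ASP at 5–10× mobile) implies that even a small shift in shipment mix moves ARM's blended royalty meaningfully. A hypothetical illustration, indexing the mobile royalty to 1.0 and assuming the 7.5× midpoint:

```python
# Hypothetical mix-shift illustration: ARM data center royalty per chip
# assumed at 7.5x mobile (midpoint of the 5-10x range cited above).
# Mobile royalty is indexed to 1.0; absolute ASPs are not disclosed here.
MOBILE_ROYALTY = 1.0
DC_ROYALTY = 7.5 * MOBILE_ROYALTY

def blended_asp(dc_mix: float) -> float:
    """Blended royalty per chip at a given data center shipment mix."""
    return dc_mix * DC_ROYALTY + (1 - dc_mix) * MOBILE_ROYALTY

for mix in (0.02, 0.05, 0.10):
    print(f"{mix:.0%} data center mix -> "
          f"blended royalty {blended_asp(mix):.2f}x mobile")
```

Moving data center units from 2% to 10% of the mix lifts the blended per-chip royalty by roughly half in this sketch, which is why mix, not just unit growth, is the metric to watch.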

Investment Signals

Bullish Triggers

  • New hyperscaler XPU customer at AVGO/MRVL
  • AVGO AI revenue beats guide by >10%
  • Hyperscaler says custom silicon >30% of AI compute
  • ARM data center royalty ASP inflects upward
  • New TSMC N2 XPU tape-out announced

Bearish Triggers

  • Hyperscaler XPU program delays/cancellations
  • AVGO AI revenue guide cut
  • NVIDIA CUDA moat widening (ecosystem lock-in)
  • NRE cost escalation at N2 deterring new programs
  • Hyperscaler capex freeze affecting chip orders
