Rasgon pointed out that the four largest U.S. hyperscalers – Amazon, Microsoft, Google, and Meta – plan to spend roughly $725 billion on capital expenditures in 2026, with most of that going to AI infrastructure. Memory pricing has gone vertical: DRAM prices rose approximately 90% quarter over quarter heading into 2026.
One of Rasgon's most striking observations is what he calls the "whack-a-mole" effect – bottlenecks propagating across the entire chip supply chain. "Everything is being pulled by this insatiable demand for AI compute. I have never seen anything at this scale in my career," Rasgon said.
He traced the spread: shortages started with GPU accelerators, then moved to HBM memory, then to semiconductor manufacturing equipment, then to networking and optics, then to power chips, and now even CPUs are in short supply.
A concrete example of the demand's reach: even Intel, which had "previously zero-valued inventory," has sold it out entirely. Customers reportedly told Intel, "We don't care; just sell it to us."
A critical bottleneck is high-bandwidth memory (HBM), which accounts for more than 85% of an AI chip's silicon area. Because of stacking yields and logic die overhead, manufacturing 1GB of HBM requires roughly four times the silicon area of standard DRAM. This math explains why memory supply has been unable to keep pace with GPU demand, and why memory pricing has become a dominant factor in chip costs.
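To make the HBM area math concrete, here is a minimal sketch of how a fixed silicon budget translates into total memory output as more of it is diverted to HBM. Only the roughly 4x area ratio comes from the figures above; the function name `relative_gb_output` and the HBM area shares (0%, 20%, 40%) are hypothetical inputs chosen for illustration, not reported data.

```python
# Illustrative only: the 4x area ratio is the figure cited in the article;
# the HBM area shares used below are hypothetical, not reported data.
HBM_AREA_PER_GB = 4.0    # silicon area per GB of HBM, relative to standard DRAM
DRAM_AREA_PER_GB = 1.0   # standard DRAM is the unit of comparison


def relative_gb_output(hbm_area_share: float) -> float:
    """Total GB produced from a fixed silicon budget, relative to a
    baseline where all of that silicon makes standard DRAM (baseline = 1.0)."""
    if not 0.0 <= hbm_area_share <= 1.0:
        raise ValueError("hbm_area_share must be between 0 and 1")
    hbm_gb = hbm_area_share / HBM_AREA_PER_GB
    dram_gb = (1.0 - hbm_area_share) / DRAM_AREA_PER_GB
    return hbm_gb + dram_gb


if __name__ == "__main__":
    for share in (0.0, 0.2, 0.4):
        total = relative_gb_output(share)
        print(f"HBM area share {share:.0%}: total GB output = {total:.2f}x baseline")
```

Under these assumptions, every unit of silicon moved to HBM yields only a quarter of the gigabytes it would have produced as standard DRAM, so total memory output falls even when the silicon budget stays constant.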
Rasgon highlighted a surprising data point: each 72-GPU rack also contains 36 CPUs, and those CPUs generate roughly $20 billion in CPU revenue for Nvidia. This illustrates how the AI buildout is creating massive chip demand far beyond the GPU accelerators themselves.
Rasgon emphasized that the market focus is moving from model training to AI inference – the core path to monetization. He cited Anthropic's revenue surging from $9 billion to $30 billion as direct evidence of this shift. As AI models move from research projects into production, the compute required for inference will likely dwarf training workloads.
A common investor question is whether custom ASICs (like those made by Broadcom) will eventually displace Nvidia's GPUs. Rasgon believes both will coexist long-term in a growing market. His framework: programmable GPUs are better suited for research and exploratory inference, while ASICs excel at predictable, high-volume inference workloads. The total addressable market is large enough to absorb both.
Rasgon concluded on a sobering note. The final constraint is not chips, not memory, not networking – it is energy. AI infrastructure requires roughly a 5% annual increase in U.S. power grid capacity to sustain the growth trajectory. That is a staggering demand on a grid that has seen minimal capacity growth for decades.
He argued that the next wave of AI innovation and bottlenecks will inevitably fall on energy generation, cooling, and nuclear power. Without significant grid investment, the supercycle itself could hit a power ceiling.
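For a sense of scale, the sketch below shows what the roughly 5% annual increase cited above adds up to over time. It assumes the rate compounds year over year and holds constant, which the source does not state; the function name `cumulative_capacity` and the 5-, 10-, and 15-year horizons are likewise illustrative choices.

```python
# Illustrative only: the ~5% annual figure is from the article; treating it as
# a constant, compounding rate and the horizons below are assumptions.
ANNUAL_GROWTH = 0.05


def cumulative_capacity(years: int, rate: float = ANNUAL_GROWTH) -> float:
    """Grid capacity after `years` of compounding growth, relative to today (1.0)."""
    if years < 0:
        raise ValueError("years must be non-negative")
    return (1.0 + rate) ** years


if __name__ == "__main__":
    for years in (5, 10, 15):
        multiple = cumulative_capacity(years)
        print(f"After {years} years at {ANNUAL_GROWTH:.0%}/yr: {multiple:.2f}x today's capacity")
```

Under that compounding assumption, capacity would need to reach about 1.63 times today's level within a decade and roughly double within 15 years.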
Rasgon's message is clear: as long as AI demand does not collapse, the semiconductor supercycle is real and sustainable. But the nature of opportunity is shifting. The easy money in GPU stocks may be giving way to a more complex landscape where the "bottleneck" itself – whether in HBM, power chips, or energy infrastructure – becomes the wealth generator.