More setbacks for Nvidia as Blackwell chips overheat in servers

Increasing adoption of artificial intelligence is creating surging demand for high-performance computing chips, leading to technical challenges as manufacturers push the boundaries of what’s possible.

Critical Development: Nvidia’s next-generation Blackwell GPUs are experiencing overheating issues in server configurations, potentially causing further delays to their planned release.

  • The server racks, designed to hold up to 72 GPUs each, are overheating when the chips are connected together, and the resulting thermal problems have required repeated redesigns
  • This setback could impact the scheduled openings of new data centers for major tech companies including Google, Microsoft, and Meta
  • A previous design flaw had already pushed back the launch from its initial Q2 2024 target

Technical Context: GPU performance and heat generation are intrinsically linked, creating unique challenges for high-density computing environments.

  • GPUs consume substantial energy during operation, with more powerful chips typically generating more heat
  • The cryptocurrency mining industry has faced similar challenges, sometimes employing immersion cooling techniques where hardware is submerged in liquid
  • Nvidia claims the Blackwell chips will be up to 30 times faster than the previous generation at some AI tasks, suggesting significantly increased power requirements

Industry Impact: The delays could have cascading effects across the AI industry and its infrastructure.

  • Tech giants are already struggling to secure adequate power supplies for their AI data centers
  • Companies like Meta, Microsoft, and Google have begun exploring nuclear power options to meet growing energy demands
  • Nvidia’s stock has surged over 180% in the past year despite these challenges, while competitor AMD has recently initiated layoffs

Nvidia’s Response: The company maintains that the ongoing engineering changes are part of normal development processes.

  • A company spokesperson told Reuters that Nvidia is working closely with cloud service providers as part of its engineering process
  • The statement suggests Nvidia is actively working on new server designs to address the thermal management issues
  • The company has not provided updated timeline estimates for the Blackwell GPU release

Broader Energy Implications: The situation highlights growing concerns about AI’s expanding energy footprint and infrastructure requirements.

  • Experts predict possible power shortages for AI data centers as soon as next year
  • The rate of data center construction is outpacing the addition of new power sources to the grid
  • Traditional power purchase agreements may not adequately address the fundamental energy challenges facing the AI industry
