IFTLE 563: Is CoWoS Capacity Causing a GPU Shortage?

Semi Analysis has detailed their thoughts on the current GPU shortage.

We all know that AI technology is upon us. This will require massive amounts of GPU computing. Semi Analysis recently noted that GPU sales are skyrocketing and they see companies scrambling to stockpile inventory. For example, OpenAI is reporting that the lack of GPUs is bottlenecking their long-term roadmap. They also see Chinese companies such as ByteDance (owner of TikTok) stocking up on GPUs before US export controls tighten even further, having recently ordered $1B worth of Nvidia GPUs.

It is reported that the highest-end Nvidia GPU, the H100, will remain sold out until Q1 of next year, despite Nvidia’s attempts to increase production drastically. Nvidia is ramping up to ship more than 400K H100 GPUs per quarter. Nvidia’s H100 is a 7-die module packaged using TSMC’s chip-on-wafer-on-substrate technology with a silicon interposer (CoWoS-S). The H100 GPU ASIC (814 sq mm) is surrounded by six stacks of HBM3 memory.

Nvidia’s GPU die is fabricated in TSMC’s Fab 18 in Taiwan. The availability of this die is not a limiting factor to production.

The high bandwidth memory (HBM) around the GPU is in limited supply, but ramping. This HBM is vertically-stacked DRAM dies connected by TSVs and currently bonded using thermocompression bonding (TCB), with hybrid bonding (HB) thought to be required for higher stack counts in the future.

As the pioneer of HBM, SK Hynix is the leader with the most advanced technology roadmap. SK Hynix started production of HBM3 in June 2022 and is currently the only supplier shipping HBM3 in volume, with over 95% market share. The maximum configuration of HBM is now 8-layer 16GB HBM3 modules. SK Hynix is producing 12-layer 24GB HBM3 for the AMD MI300X and Nvidia H100. Samsung and Micron are investing to catch up.
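As a quick back-of-the-envelope check on the two stack configurations quoted above, the short Python sketch below divides stack capacity by layer count. It assumes capacity is spread evenly across the DRAM layers of a stack, which is an assumption made here for illustration and is not stated in the Semi Analysis report; the function name `gb_per_layer` is likewise just illustrative.

```python
# Stack configurations quoted in the text: (number of DRAM layers, capacity in GB).
hbm3_stacks = {
    "8-layer HBM3": (8, 16),
    "12-layer HBM3": (12, 24),
}


def gb_per_layer(layers, capacity_gb):
    """Capacity per DRAM layer, ASSUMING an even split across layers."""
    return capacity_gb / layers


for name, (layers, capacity_gb) in hbm3_stacks.items():
    print(f"{name}: {gb_per_layer(layers, capacity_gb):.1f} GB per layer")
```

Under that assumption both configurations work out to the same capacity per layer, which suggests the 24GB part gets its extra capacity from the taller stack.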

Samsung expects to ship HBM3 in the second half of 2023. Semi Analysis believes these parts are designed for both Nvidia and AMD GPUs.

Micron is the furthest behind. Micron was heavily invested in the competing Hybrid Memory Cube (HMC) technology, but HBM won out to become the industry standard for 3D stacked DRAM. It was only in 2018 that Micron started to pivot away from HMC and invest in the HBM roadmap, which is why it trails the other two suppliers.

The GPU Shortage Is Due to TSMC’s CoWoS Capacity

In Semi Analysis’ opinion, the real bottleneck is the TSMC CoWoS packaging. As readers of IFTLE know, CoWoS is the “2.5D” packaging technology from TSMC where multiple active silicon dies are integrated on a passive silicon interposer. The interposer and active silicon are then attached to a packaging substrate which contains the I/O to attach it to the system PCB.

The high pad count and short trace length requirements of HBM necessitate 2.5D technologies like CoWoS to enable such dense, short connections that cannot be done on a PCB or even a package substrate. Since almost all HBM systems are currently packaged on CoWoS, and all advanced AI accelerators use HBM, the Semi Analysis report concludes that virtually all leading-edge data center GPUs are packaged on CoWoS by TSMC.

While 3D packaging technologies such as TSMC’s SoIC enable stacking dies directly on top of logic, they do not make sense for HBM applications due to thermal issues and cost. SoIC sits on a different order of magnitude regarding interconnect density and is better suited to expanding on-chip cache with die stacking, as seen with AMD’s 3D V-Cache solution. AMD’s Xilinx was the first user of CoWoS many years ago for combining multiple FPGA chiplets together.

While some other applications such as networking, supercomputing, and FPGAs also use CoWoS, the vast majority of CoWoS demand reportedly now comes from AI.

Figure 2: CoWoS quarterly output (Source: SemiAnalysis)

In June, TSMC announced the opening of its Advanced Backend Fab 6 in Zhunan. This fab has enough cleanroom space for potentially 1 million wafers per year of 3DFabric capacity. This includes not only CoWoS but also SoIC and InFO technologies. This fab is reportedly larger than all of TSMC’s other packaging fabs combined!

Although there are other 2.5D packaging technologies from Intel, Samsung, and OSATs (like FOEB from ASE), CoWoS is the only one being used in high volume, given that TSMC is by far the most dominant foundry for AI accelerators.

Figure 2 shows the Semi Analysis report of CoWoS production output (by customer).

Digitimes and Tom’s Hardware are both reporting that TSMC is in fact in the process of expanding CoWoS capacity.

TSMC is reportedly accelerating backend equipment orders as it expands CoWoS packaging capacity. Demand for compute GPUs for AI and high-performance computing (HPC) is reportedly responsible for the shortage of TSMC’s CoWoS packaging capacity. Major tech companies such as Nvidia, Amazon, Broadcom, Cisco, and Xilinx have reportedly all boosted their demand for TSMC’s advanced CoWoS packaging and are consuming all they can get. Nvidia has reportedly booked 40% of TSMC’s available CoWoS capacity for the coming year. However, due to the severe shortage, Nvidia has started exploring options with secondary suppliers, placing orders with Amkor Technology and United Microelectronics (UMC).

These reports suggest TSMC plans to increase its current CoWoS capacity from 8,000 wafers per month to 11,000 wafers per month by the end of 2023 and 14,500 to 16,600 wafers per month by the end of 2024.
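To put the reported expansion in perspective, the Python sketch below converts the quoted capacity figures into percentage increases. It uses only the wafer-per-month numbers cited above; the variable and function names are illustrative, and the result says nothing about how the added capacity would be split among customers.

```python
# CoWoS capacity figures quoted in the reports (wafers per month).
current_wpm = 8_000
end_2023_wpm = 11_000
end_2024_low_wpm, end_2024_high_wpm = 14_500, 16_600


def growth_pct(start, end):
    """Percentage increase going from start to end."""
    return (end - start) / start * 100


print(f"Now -> end of 2023: +{growth_pct(current_wpm, end_2023_wpm):.1f}%")
print(
    "Now -> end of 2024: "
    f"+{growth_pct(current_wpm, end_2024_low_wpm):.1f}% to "
    f"+{growth_pct(current_wpm, end_2024_high_wpm):.1f}%"
)
print(
    "End of 2023 -> end of 2024: "
    f"+{growth_pct(end_2023_wpm, end_2024_low_wpm):.1f}% to "
    f"+{growth_pct(end_2023_wpm, end_2024_high_wpm):.1f}%"
)
```

On these figures, the high end of the 2024 range would slightly more than double today's reported capacity.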

The Industry Supply Chain Status

IFTLE wants to bring your attention to an interesting report by Prismark Partners.

Prismark has observed that from Q1 2020 through Q1 2023, most segments of the electronics supply chain have experienced historic changes in supply and demand. Certainly, the pandemic played a significant role in shaping these dynamics. All segments witnessed an initial surge in demand followed by an oversupply and surplus inventory.

The graphic below illustrates the heating up of demand in Q4 2020, which continued to increase through most of 2021 (in the case of the foundry all the way through Q3 2022). By the end of 2022, almost all segments had experienced a major correction that continued into Q1 2023, representing a supply-demand re-balancing and excess inventory depletion.

The current state of the electronics industry’s supply chain presents an unusual scenario in that very rarely do all segments simultaneously enter a period of “substantial cooling”. Prismark notes that “Not since the depths of the recession of 2008-2009 have we witnessed such a profound cooling-off period”.

Prismark anticipates a recovery in most segments of the supply chain in the latter half of 2023. The semiconductor segments are expected to return to growth in Q4 2023, with 11% growth forecast for 2024. They expect this to be fueled by significant demand in AI and automotive electronics.

They expect most other segments to return to growth between Q4 2023 and Q2 2024, though “the recovery process might be gradual”.