
nVidia "Fermi" GeForce Die Sizes Exposed

One nVidia policy that changes from time to time is whether to disclose the die size of its chips. For instance, we asked an nVidia PR representative back in 2006 about the die size of G80, and the representative was more than happy to tell us that the die size was a then-gigantic 484mm2 and that was "the testament of how closely nVidia worked with TSMC in order to create largest piece of core logic in history of semiconductor industry."

Two years later, nVidia introduced GT200, i.e. GeForce GTX 280, and again there was no issue in finding out the die size, which was 576mm2. In fact, we learned of the 576mm2 die size during the Editors’ Day event from a marketing exec. However, when Fermi came onboard, nVidia turned into "Howdy, we don’t disclose die sizes of our chips… Sorry man!" Given that die sizes are quite easy to track, we were surprised by the answer.

On one hand, one could argue that nVidia wants to keep the die sizes secret. Yet will the company ever fulfill its dream of becoming a real industry leader if it is not confident in its own chips? As we all know, you can easily learn the die size of niche products such as Itanium 2 [Tukwila, i.e. Itanium 2 9300 series, is 699mm2], let alone products which directly compete with nVidia – the Cypress GPU, for instance, measures 337mm2.

Without any further delays, we bring you the die sizes of chips based on Fermi architecture, both present and future ones.

GF100 – 529.17mm2 for every GeForce GTX 465, 470, 480, 485[?]
Currently, GF100 is the largest monolithic chip in the world – no other competes with GF100 in terms of the number of transistors packed onto one piece of silicon, interlined with copper interconnects. Finding the size of the die was not exactly an easy task – in order to find it out, you have to cut through the sides of a heavy and large IHS [Integrated Heat Spreader] without knowing how large the die is – needless to say, a complex affair. Physically, GF100 has 16 clusters with 32 cores in each [512 total, 352/448/480 active in the shipping parts as of 2010/8/9], 48 ROPs and 60 Texture Address Units.

Long story short, the GF100 die, the massive three-billion-transistor 40nm part, is approximately 23×23mm in size, i.e. 529.17mm2. Direct competitor: AMD Cypress, 337mm2 and Hemlock, 674mm2 [2x 337mm2].

GF104 – 331.54mm2 for every GeForce GTX 460
We heard that this part was called "KickAss Fermi", i.e. a Fermi architecture-based part that will really kick the competition. nVidia went for a "widescreen", 16:9-looking die shape.
Physically, this "widescreen", 1.95 billion transistor die consists of eight clusters with 48 cores in each [384 total, 336 active in the only shipping part as of 2010/8/9], 32 ROPs and 56 Texture Address/Filtering Units. You know how well the part performs, and these 56 units are a very big reason for it [the second and equally important one is the higher core density]. Interestingly though, this is the second time this has happened [G80 and G92 were the first case, where the latter, in some cases, outperformed the former].

The GF104 die measures 13.7×24.2mm, i.e. 331.54mm2. Note that some publications incorrectly reported the die size as 366mm2 – thus, bigger than AMD’s current high-end chip. That is not correct, if only by a small margin [332 vs. 337 mm2]. Direct competitor: AMD Cypress, 337mm2.

GF106 – 238.60mm2 for the upcoming GeForce GTS 440[?], 450[?], 455[?]
The first Fermi part that belongs to a series other than "GTX" will be the intended replacement for the venerable G92 die, which started its life as 8800GTS/512 and continues to this day with 9800 GTX and GTS 250.
Not much is known ahead of its introduction in just a few weeks from now, but this is the part with which nVidia worked hard on securing the Holiday Season 2010 notebook and desktop design wins. The only thing that our sources confirmed is that the part is based more on the GF104 core cluster design than on the GPGPU-oriented "jack-of-all-trades-master-of-none" GF100.

The die measures 15.2 by 15.7 millimeters, almost a perfect square, or 238.60mm2. This part succeeds the G92b workhorse and its 231mm2 – thus, nVidia is only taking eight more square millimeters when compared to G92b [9800GTX+, GTS 250]. The direct competitor is AMD’s Juniper die, used in the Radeon HD 5700 Series on Desktop and in the HD 5800 Series on Mobile. Now, Juniper packs 1.05 billion transistors inside only 166mm2.
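As a quick sanity check on the figures quoted so far, the sketch below multiplies out the stated dimensions and compares the result with the stated areas. The dimensions and areas are the ones given in this article; the helper names and the 16:9 comparison are our own illustrative choices, and the quoted dimensions are presumably rounded, so small mismatches are to be expected.

```python
# Cross-check the quoted die dimensions against the quoted die areas.
# All figures come from the article; function names are illustrative only.

QUOTED = {
    # chip: (width_mm, height_mm, quoted_area_mm2)
    "GF100": (23.0, 23.0, 529.17),
    "GF104": (13.7, 24.2, 331.54),
    "GF106": (15.2, 15.7, 238.60),
}


def area_mm2(width_mm, height_mm):
    """Area of a rectangular die in square millimeters."""
    return width_mm * height_mm


def aspect_ratio(width_mm, height_mm):
    """Long side divided by short side, so the result is always >= 1."""
    return max(width_mm, height_mm) / min(width_mm, height_mm)


for chip, (w, h, quoted) in QUOTED.items():
    computed = area_mm2(w, h)
    print(
        f"{chip}: {w} x {h} mm = {computed:.2f} mm2 "
        f"(quoted {quoted:.2f}, diff {computed - quoted:+.2f}), "
        f"aspect {aspect_ratio(w, h):.2f}:1"
    )
```

The GF104 numbers multiply out exactly, while GF100 and GF106 differ from the quoted areas by a fraction of a square millimeter, which is what rounded dimensions would produce. The GF104 aspect ratio comes out at about 1.77:1, close to 16:9 (about 1.78:1).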

GF108 – 126.51mm2 for the upcoming GeForce GT 4xx
Oddly enough, the first leaked images of GF108 came from Hungary. This part will also launch in just a few weeks, in the week starting September 13th. This Fermi part targets the largest volume it can get and, according to our sources, its primary destination will be low and middle-class notebooks with discrete graphics.

The die size is rumored to be around 130mm2. Our sources told us that the die size is 126.51mm2. With its performance and pricing, the GF108 parts will compete against AMD Redwood [Radeon HD 5600 Series], which packs 627 million transistors in a piece of silicon measuring 104mm2.
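Putting the quoted figures side by side, the sketch below computes how each Fermi die compares in area to the AMD die we named as its direct competitor, plus transistor density where this article gives a transistor count. The numbers are the ones quoted above; the pairings follow our "direct competitor" notes, and the function and variable names are illustrative only.

```python
# Compare die areas and transistor densities using the article's figures.
# Names are illustrative; no data beyond the article is used.

MATCHUPS = [
    # (nvidia_chip, nvidia_area_mm2, amd_chip, amd_area_mm2)
    ("GF100", 529.17, "Cypress", 337.0),
    ("GF104", 331.54, "Cypress", 337.0),
    ("GF106", 238.60, "Juniper", 166.0),
    ("GF108", 126.51, "Redwood", 104.0),
]

# Only chips for which the article states a transistor count.
TRANSISTORS = {
    # chip: (transistor_count, area_mm2)
    "GF100": (3.0e9, 529.17),
    "GF104": (1.95e9, 331.54),
    "Juniper": (1.05e9, 166.0),
    "Redwood": (627e6, 104.0),
}


def area_ratio(area_a_mm2, area_b_mm2):
    """How many times larger die A is than die B."""
    return area_a_mm2 / area_b_mm2


def density_mtr_per_mm2(transistors, area_mm2):
    """Transistor density in millions of transistors per square millimeter."""
    return transistors / area_mm2 / 1e6


for nv, nv_area, amd, amd_area in MATCHUPS:
    print(f"{nv} vs {amd}: {area_ratio(nv_area, amd_area):.2f}x the area")

for chip, (count, area) in TRANSISTORS.items():
    print(f"{chip}: {density_mtr_per_mm2(count, area):.2f} Mtransistors/mm2")
```

On these numbers, GF104 is the only Fermi die smaller than its AMD counterpart, while GF100 and GF106 come in at roughly 1.6 and 1.4 times the area of Cypress and Juniper respectively.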

From the looks of it, nVidia definitely does not have the die size advantage that would enable cheaper manufacturing cost than AMD. However, that is only one part of the BoM equation. Our confidential sources told us that nVidia dominates in TSMC wafer orders, especially after it became common knowledge that AMD will switch the majority of its GPU production to GlobalFoundries. That is a four-step procedure happening right now, and TSMC reacted in the way they did.
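To put a rough number on what die area means for manufacturing cost, a common back-of-the-envelope approximation estimates the gross number of die candidates per wafer from the wafer diameter and the die area. This is only a sketch under our own assumptions: a 300mm wafer, no scribe lines or edge exclusion, and no yield losses. None of those parameters come from our sources, and the function name is hypothetical.

```python
import math

# Assumption (not from the article's sources): 300 mm wafers.
WAFER_DIAMETER_MM = 300.0


def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=WAFER_DIAMETER_MM):
    """Rough count of whole die candidates on a round wafer.

    Uses the common approximation
        N = pi * (d / 2)^2 / A  -  pi * d / sqrt(2 * A)
    where the second term accounts for partial dies lost at the wafer edge.
    Yield, scribe lines and edge exclusion are deliberately ignored.
    """
    wafer_area = math.pi * (wafer_diameter_mm / 2.0) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2.0 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)


# Die areas as quoted in the article.
for chip, area in [("GF100", 529.17), ("Cypress", 337.0),
                   ("GF106", 238.60), ("Juniper", 166.0),
                   ("GF108", 126.51), ("Redwood", 104.0)]:
    print(f"{chip}: about {gross_dies_per_wafer(area)} die candidates per wafer")
```

The estimate only counts candidates, before any defects are taken into account; since larger dies tend to suffer more from defects, the real gap in good dies per wafer is likely wider than this simple count suggests.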

Coming back to nVidia, when the company launched G80 [8800 Series], nVidia moved whole board production to Flextronics and reduced Add-In-Card manufacturers to "sticker stampers", keeping close tabs on costs and yields. AMD adopted a similar strategy and both companies are in a deadlock over board costs.

nVidia lost big time with the long-delayed Fermi architecture. However, it only lost in 2010, as the company is laying the groundwork for the future – nVidia was just awarded $25 million from DARPA to research Exascale architecture, and the number of Tesla wins in the HPC space is more than noticeable. In the end, the success of Tesla should reduce the cost of GeForce boards, just like Xeon and Opteron reduced the cost of Pentiums and Athlons and enabled the cheap computing of today.
Now, if they want to take the lead in graphics as such, AMD and nVidia need to adopt one major thing in which Intel excels: EXECUTE.

P.S. Execute applies to all the long-delayed products for which AMD and nVidia just can’t stop making excuses and pushing into future quarters: Fusion, Bulldozer, Bobcat, Tegra, Ion 2… have we missed anything?

P.P.S. We do not exclude the possibility that we were off by fractions of the measurement unit used in measuring these dies. We thank our sources for providing us with the invaluable information.