
Nvidia’s $50 card destroys ATI’s $500 one or “Why ATI sucks in Folding?”

As you might already know, I am a bit enthusiastic when it comes to distributed computing. I looked for aliens through SETI@home and later with BOINC, but then Folding@Home showed up and I became an enthusiast for this valuable project from Stanford University. My family has had its share of dealings with Alzheimer’s (AD) and Parkinson’s (PD) diseases, and I won’t go into the psychological and, ultimately, financial stress that families around the world, including my own, have to endure.
Folding@Home is also a project that pioneered the use of GPUs for distributed computing (if I am wrong on this one, feel free to correct me). Back in the summer of 2006, I heard that ATI and Stanford were working on a Folding@Home GPGPU client. I still remember my own articles, and articles from a lot of colleagues, that criticized Nvidia for not having a F@H client.

Nvidia's client may not look as nice as ATI's, but it's the efficiency that counts...

Fast forward to the GTX280 launch, when the Vijay Pande team debuted the Folding@Home client for Nvidia chips as well. Nvidia and ATI fought a short marketing war over who can fold better, and then things went quiet… apparently, for a reason.
The reason why things went quiet is probably the “inconvenient truth”: ATI showed up with the Radeon 4800 series and demolished Nvidia’s dominance in the segment, with the GTX260 and 280 going through radical price drops in order to stay competitive. However, there is one field where ATI’s Radeon 4800 series loses to cards that are 5-10x cheaper: Folding@Home.
The 10x argument lies in the comparison between ATI’s current flagship, the Radeon 4870X2, and Nvidia’s GeForce 9600GSO. This $50 card can easily out-fold the ATI Radeon 4870X2, which retails for more than 500 USD/450 EUR in the respective markets.
In the past weeks, I’ve conducted a series of tests with various graphics cards (all that I own or could get my hands on), and the results were quite depressing if you own an ATI card. I asked some of my contacts at AMD why the performance is so bad, and the answers ranged from “we wanted to make the best gamer’s card, not a card for Folding” to sad silence. It seems to me that the difference lies in shader type and clock: ATI’s R6xx and RV7xx architectures are built around a small number of big “fat” units and a lot of tiny ones (64+256 in the case of the Radeon 3800, 80+720 in the case of the Radeon 4800), and the clock is much lower than on GeForce cards. Nvidia went the other route and came up with a large number of “fat” units, and the company didn’t even count the “thin” (MADD) ones.
When we compare the GTX280 and the 4870X2, the difference is just astounding: over a period of a month, EVGA’s GTX280 SSC achieved an average of 6,802 points per day (PPD), while the ATI Radeon 4870X2 managed a puny 3,870 PPD. At the same time, I’ve witnessed higher PPD scores achieved even by the two-year-old GeForce 8800GTS 640MB, which was quite a surprise. Around two weeks ago, I started following PPD numbers using FahMon on a large number of systems that mostly share the same configuration: a dual-core processor or better, 2GB of system memory or more, and the graphics card in question. In all cases, with the help of my friends, I managed to check FahMon and KakaoStats for roughly 25 cards and came to a surprising result.
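To put those two averages side by side, here is a quick back-of-the-envelope sketch in Python. It uses only the two PPD figures quoted above; the variable names are my own, and nothing here comes from FahMon itself.

```python
# Month-long PPD averages quoted in the text above.
gtx280_ppd = 6802    # EVGA GTX280 SSC
r4870x2_ppd = 3870   # ATI Radeon 4870X2

# How many times more points per day the GTX280 produced.
ratio = gtx280_ppd / r4870x2_ppd

# The same gap expressed as "percent more points per day".
extra_percent = (ratio - 1) * 100

print(f"GTX280 vs 4870X2: {ratio:.2f}x the points ({extra_percent:.0f}% more per day)")
```

In other words, the GTX280 delivered roughly 1.76 times the daily points of the 4870X2 in my measurements.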
With the recent update to the GPU2 client and the new Fah_Core11.exe (ATI uses v1.17, Nvidia v1.15), the community witnessed a further fall in the number of completed packets per day. If you’re not familiar with Folding@Home packets, every packet contains a certain number of mathematical simulations for the tested protein: in the case of Nvidia, a packet consists of 25 million operations, while ATI’s contains 10 million. However, due to the different type of mathematical operations, Nvidia’s packet will usually yield 480 points, while ATI’s 10 million will return 548 points (or 338 points for the recently introduced ATI packets).
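For a rough feel of what these point values mean in completed packets, here is a small Python sketch. It is only an estimate under loud assumptions: it takes the month-long PPD averages I quoted for the GTX280 (6,802) and the 4870X2 (3,870), and pretends every packet a card received was worth the same number of points, which is not how it works in practice. The function and dictionary names are made up for this illustration.

```python
# Month-long PPD averages quoted earlier in this article.
ppd = {"GTX280": 6802, "4870X2": 3870}

# Points per packet as quoted above (assumed constant, which is a simplification).
points_per_packet = {"nvidia": 480, "ati_large": 548, "ati_small": 338}


def packets_per_day(points_per_day, points_each):
    """Estimate completed packets per day from PPD and a fixed per-packet value."""
    return points_per_day / points_each


gtx280_packets = packets_per_day(ppd["GTX280"], points_per_packet["nvidia"])
ati_large_packets = packets_per_day(ppd["4870X2"], points_per_packet["ati_large"])
ati_small_packets = packets_per_day(ppd["4870X2"], points_per_packet["ati_small"])

print(f"GTX280, 480-point packets: {gtx280_packets:.1f} per day")
print(f"4870X2, 548-point packets: {ati_large_packets:.1f} per day")
print(f"4870X2, 338-point packets: {ati_small_packets:.1f} per day")
```

Under those assumptions the GTX280 finishes about 14 packets a day, while the 4870X2 lands somewhere between about 7 and 11, depending on which ATI packet type it is fed.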
As I wrote above, the table below is not the result of a single packet score and an Excel calculation, but of continuous number crunching over the course of several weeks, with one week used for measurement.

Improvised Top 20 Folding@Home GPUs:

  1. Nvidia GeForce GTX280 1GB (EVGA SSC)
  2. Nvidia GeForce GTX260-216 898MB (EVGA SSC)
  3. Nvidia GeForce GTX260 898MB (EVGA Superclocked)
  4. Nvidia GeForce 9800GTX+ 512MB (ASUS TOP)
  5. Nvidia Quadro FX 4600 SDI 768MB (PNY)
  6. Nvidia GeForce 9800GTX 512MB (ASUS TOP)
  7. Nvidia GeForce 8800GTX 768MB (Zotac AMP! Edition)
  8. Nvidia GeForce 8800Ultra 768MB (XFX XXX Edition)
  9. Nvidia GeForce 8800GTS 512MB (Gainward)
  10. Nvidia GeForce 8800GT 512MB (Gainward)
  11. Nvidia GeForce 9600GSO 768MB (EVGA)
  12. Nvidia GeForce 8800GTS 640MB (LeadTek)
  13. ATI Radeon 4870X2 2GB (PowerColor)
  14. ATI Radeon 4870 512MB (PALIT)
  15. Nvidia GeForce 9600GT 256MB (Zotac)
  16. ATI Radeon 4850 512MB (PALIT)
  17. ATI Radeon 3870 512MB (Sapphire Atomic)
  18. ATI FireGL V8600 1GB (ATI)
  19. Nvidia GeForce 8600GTS 256MB (XFX XXX Edition)
  20. ATI Radeon 3850 256MB (Sapphire)

This is not a complete table by any means, since I am missing several new GPUs. But as you can see for yourself, the results are quite dramatic for the red team. Two-year-old GeForce GPUs demolished the otherwise-brilliant Radeon series, and it is incredible that even a GeForce 9600 will out-fold a Radeon 4850. This is a rude wake-up call for the guys in Markham, because this is just unbelievable.
Personally, I am running a combination of an AMD Spider platform (9850BE + 790GX + ATI Radeon 4870X2) and a hybrid Intel V8-Skulltrail platform with a Quadro FX 4600 SDI.
Of course, everything can be changed with a simple driver update. I don’t understand what happened with AMD/ATI, a company that led the field of GPGPU computing for so long. Why should AMD work on optimizing the Folding@Home client… I am aware that AMD poached Mike Houston from Stanford to work on Brook+ and now the OpenCL APIs, but surely the performance didn’t go downhill from the influence of just one person. Or just maybe…
Overall, I hope that Catalyst 8.11 or 8.12 will bring more performance for ATI cards, since I do not believe that it would be so hard to optimize drivers for GPGPU/GPU Computing usage. For now, in Folding@Home, ATI is a complete washout.

To close this article: if you feel that your GPU cycles could be used for something good, I invite you to read the following article and join the F@H family, regardless of which client (CPU or GPU) or team you choose in the end. Intel, AMD, ATI, Nvidia, Windows, Linux or Mac OS, it does not matter. Just join, if you want to, of course.