NVIDIA CUDA for ARM: Removing X86 From GPU Computing

Back in 2008, I wrote on Tom’s Hardware that NVIDIA was going to pair Tegra with Tesla and skip the X86 processor as the data feeder. Senior PR figures at NVIDIA insisted that I was a bit crazy and that the two products target completely different markets. Today, that idea became reality with the "CUDA for ARM Development Kit".

CUDA for ARM Development Kit combines a CUDA-enabled GPU with Tegra 3 SoC

The concept of the "CUDA for ARM Development Kit" is quite simple: pair a dual-core Tegra 2 or a quad-core Tegra 3 SoC with a CUDA-enabled GPU, be that a GeForce or a Quadro in MXM form factor. To build this solution, NVIDIA turned to its go-to company for skunkworks projects, the Italian firm Seco s.r.l.

SECOCQ7-MXM is the name of the host board, which takes a QUADMO747-X/T20 (Tegra 2) or QUADMO747-X/T30 (Tegra 3) module on one side and any MXM 3.0 Type A or Type B graphics card on the other. The combination is already supported in the CUDA Software Development Kit, but you will have to wait until next year to acquire the kit (estimated availability 1H 2012).

If you don’t feel like waiting, you can contact Seco and probably acquire the kit right now, but bear in mind that Tegra 2 is not nearly powerful enough to feed the GPU. We would not be surprised to see a Tegra 3 QUADMO module featuring DDR3L-1500 memory, which offers roughly twice the bandwidth of Tegra 2 and its LPDDR2-760.
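To put the memory comparison in numbers, here is a minimal back-of-the-envelope sketch. It takes the transfer rates quoted above (760 and 1500 MT/s) and assumes a 32-bit memory interface on both SoCs; the bus width and the helper name `peak_bandwidth_gbs` are our assumptions for illustration, not figures from NVIDIA or Seco.

```python
def peak_bandwidth_gbs(transfer_rate_mts: float, bus_width_bits: int = 32) -> float:
    """Theoretical peak memory bandwidth in GB/s (1 GB = 10**9 bytes).

    bus_width_bits defaults to 32, which is an assumption for both SoCs.
    """
    bytes_per_transfer = bus_width_bits / 8
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


# Transfer rates as quoted in the article, in MT/s.
tegra2 = peak_bandwidth_gbs(760)    # LPDDR2-760
tegra3 = peak_bandwidth_gbs(1500)   # DDR3L-1500
ratio = tegra3 / tegra2

print(f"Tegra 2: {tegra2:.2f} GB/s")   # 3.04 GB/s
print(f"Tegra 3: {tegra3:.2f} GB/s")   # 6.00 GB/s
print(f"Ratio:   {ratio:.2f}x")        # 1.97x
```

Under these assumptions the theoretical peak comes out at about 3 GB/s versus 6 GB/s. Either figure is small next to what a discrete GPU's own memory delivers, which is why the host SoC's ability to feed the GPU matters here.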

Also, bear in mind that existing Tesla 2000-Series MXM modules are not compatible with this board, as those modules came from T-Platforms, a Russian supercomputing company. They follow MXM mechanically but not electrically, and would not work.

Coming in the future: NVIDIA Tesla servers will sooner or later come with a Tegra SoC to feed the GPUs, replacing the current X86+Tesla configuration

Still, there is no doubt in our minds that NVIDIA’s own announcement of MXM-based Teslas and Tegra-Tesla servers is a matter of a few quarters. We believe that existing Fermi and Kepler GPGPUs are going to be used to develop such solutions, with full deployment arriving in the 20nm era (2013) in the shape of the Project Denver-powered Logan Tegra SoC and Maxwell-based Teslas.