VR World

NVIDIA Unveils Pascal GPU: 16GB of memory, 1TB/s Bandwidth

At the Japanese edition of NVIDIA GTC (GPU Technology Conference), NVIDIA finally revealed details behind its 2016 graphics architecture, codenamed Pascal. The architecture was launched at the main GTC event, which took place in San Jose on March 17th, 2015, with a keynote by Jen-Hsun Huang. GTC Japan was hosted by Marc Hamilton.

As always, the Pascal GPU will be manufactured by Taiwan Semiconductor Manufacturing Company (TSMC), using the brand-new 16nm FinFET process. The node is more than a smaller number: it marks the shift from planar (2D) transistors to FinFET (3D) transistors. The shift required engineers to rethink much of their design approach, and it should result in significant power savings.

NVIDIA Pascal proof-of-concept Engineering Board with 4GB HBM memory. Retail products will carry 16GB of HBM2 memory.

But that is just the beginning, as Pascal will support up to 32GB of HBM2 memory. The first products based on Pascal will launch with 16GB of HBM2, and larger capacities will depend solely on memory vendors such as SK Hynix and Samsung. What changes the most is bandwidth. Both the Kepler-based Tesla K40 and the Maxwell-based Tesla M40 featured 12GB of GDDR5 and achieved up to 288GB/s of memory bandwidth. The 16GB of HBM2 (packed in four 4GB stacks) will deliver 1TB/s of bandwidth, while internally the GPU surpasses the 2TB/s barrier.
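To put those figures side by side, the short Python sketch below recomputes the capacity and the bandwidth ratio from the numbers quoted above. The variable names are our own, and treating 1TB/s as 1000GB/s is an assumption about how the figure was rounded. The per-stack number is a simple division, not something NVIDIA stated.

```python
# Figures quoted in the article; the names are illustrative, not NVIDIA's.
HBM2_STACKS = 4
GB_PER_STACK = 4
GDDR5_BANDWIDTH_GBPS = 288   # GDDR5-based Tesla boards cited above
HBM2_BANDWIDTH_GBPS = 1000   # "1TB/s", assumed here to mean 1000GB/s

# Total capacity is the number of stacks times the capacity of each stack.
total_capacity_gb = HBM2_STACKS * GB_PER_STACK

# How many times more bandwidth HBM2 offers than the GDDR5 boards.
bandwidth_ratio = HBM2_BANDWIDTH_GBPS / GDDR5_BANDWIDTH_GBPS

# Even split of the total across stacks (our own division, not a quoted spec).
per_stack_bandwidth_gbps = HBM2_BANDWIDTH_GBPS / HBM2_STACKS

print(f"Capacity: {total_capacity_gb}GB")
print(f"Bandwidth gain over GDDR5: {bandwidth_ratio:.2f}x")
print(f"Bandwidth per stack: {per_stack_bandwidth_gbps:.0f}GB/s")
```

Under these assumptions the jump works out to roughly three and a half times the bandwidth of the GDDR5 boards.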

NVIDIA’s Marc Hamilton said: “Using 3D memory, not only the memory capacity will go up, the memory bandwidth will go up significantly. With a much faster GPU, and higher memory bandwidth, the existing interconnects in the server are just plain outdated. So, we had to develop our own interconnect called NVLink, five times faster than existing technology.”

Pascal will also be available in multi-GPU packaging, replacing the Tesla K80 (NVIDIA skipped a Maxwell-generation dual-GPU Tesla). The combined figures make for an interesting comparison: 24GB of GDDR5 and 480GB/s of bandwidth should give way to 32GB of HBM2 and 2TB/s of bandwidth, with the two GPUs connected through NVLink rather than PCIe. NVLink will enable up to 80GB/s and should replace the PLX PCIe Gen3 bridge chips, which can only support 16GB/s (8GB/s per GPU). This part should serve as a ‘warm-up’ for 2018 and the Volta architecture.
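The generational comparison can be reduced to three ratios. The Python sketch below uses only the figures quoted above. The dictionary and variable names are hypothetical, and 2TB/s is assumed to mean 2000GB/s.

```python
# Figures quoted in the article; the names are illustrative only.
NVLINK_GBPS = 80
PCIE_GEN3_BRIDGE_GBPS = 16

TESLA_K80 = {"memory_gb": 24, "bandwidth_gbps": 480}
# "2TB/s" is assumed here to mean 2000GB/s.
DUAL_PASCAL = {"memory_gb": 32, "bandwidth_gbps": 2000}

# GPU-to-GPU link speed: NVLink versus the PCIe Gen3 bridge.
interconnect_gain = NVLINK_GBPS / PCIE_GEN3_BRIDGE_GBPS

# Combined memory capacity and bandwidth of the dual-GPU boards.
memory_gain = DUAL_PASCAL["memory_gb"] / TESLA_K80["memory_gb"]
bandwidth_gain = DUAL_PASCAL["bandwidth_gbps"] / TESLA_K80["bandwidth_gbps"]

print(f"Interconnect: {interconnect_gain:.1f}x")
print(f"Memory capacity: {memory_gain:.2f}x")
print(f"Memory bandwidth: {bandwidth_gain:.2f}x")
```

The interconnect ratio of 5x lines up with Hamilton's "five times faster" remark, while memory bandwidth grows far more than memory capacity does.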

Pascal GPU architecture highlights: Higher performance through 16nm process and GPU Boost, stacked 2.5D / 3D memory, NVLink and mixed-mode calculation.

Unfortunately, the company did not disclose how much ECC (Error Correcting Code) would reduce memory performance or how much overhead it would add, but that is something all HBM-powered products will have to deal with. In any case, the company is gearing up for a battle with Intel's Xeon Phi, which in its most recent incarnation is becoming quite the competitor. Still, Pascal is expected to deliver double-digit single-precision TFLOPS performance, and a lot of focus will be placed on so-called mixed-mode precision (INT8, FP16 and FP32).

Pascal is expected to hit the market during the first half of 2016.