It looks like 2016 is turning into a year of anticipation and redemption for AMD, not just for its consumers, but also for customers who purchased millions of dollars of AMD hardware in the past and then felt left out. We all saw Oak Ridge National Laboratory, one of the first Opteron adopters, ditch a decade-old AMD collaboration for an IBM+NVIDIA team-up. Luckily for all involved, AMD seems to have finally “got their s*** together” and started a sales campaign which might be its most successful since Henri Richard led the sales team that took more than 50% market share from Intel (albeit only in the 4P and 8P enterprise markets).
What’s this all about? We’re talking about AMD’s Zen architecture, which in the (hopefully) high-end desktop, workstation and server markets just might be the jolt the PC world desperately needs. After seeing limitation upon limitation that Intel placed on its products and on how it wants consumers to buy them, we just might be getting a product line-up with which AMD can attack all markets, especially the money-making markets which command extremely high margins. For example, a certain ‘green meets chrome’ recently released Nikola Tesla-branded parts inside its partner’s blade servers. There are no product specs available publicly, just that they use the MXM format. But you can bet that the margin for that enterprise part is XXX% higher than for the consumer part.
Which is why we need an outrageous product from AMD. Radeon R9 Fury was OK, a good start, but it’s louder than equally-performing air-cooled products from the competition. Sadly, we did not get to see a FirePro-branded part, or a server-based part with unlocked IEEE double precision, where the R9 Fury, and especially the dual-GPU R9 Fury X2 or “R9 Fury VR”, would excel as a $1000 consumer and $5000 enterprise-grade part.
How AMD ‘Fast Forward’ Became a Product
In 2014, AMD presented “Fast Forward”, its vision for Exascale computing in line with DoE (U.S. Department of Energy) guidelines (and hundreds of millions of dollars in research and manufacturing grants). The company ultimately lost the $425 million contract for Lawrence Livermore National Laboratory and Oak Ridge National Laboratory (ORNL) to IBM and NVIDIA. Still, AMD was successful in “Fast Forward2”, a U.S. DoE and National Nuclear Security Administration (NNSA) grant program where AMD, Cray, IBM, Intel and NVIDIA all received funding for development of Exascale-class processing architectures.
ZEN According to CERN
Fast Forward is the project that brings us to today. In a recent presentation at CERN, AMD touted its upcoming ZEN architecture as a product it proudly calls the “ultimate product for the ultimate market”. With the future ZEN-based Opteron (it’s ‘probably’ getting a new brand too), AMD wants to show that it has not forgotten how to make a high-end processor, and the ZEN-based Opteron should represent the ultimate in fusion between a CPU and a GPU.
According to a sales pitch to CERN, AMD is upping the ante when it comes to the most powerful piece of silicon it plans to produce. While Intel specified 180W as the highest available TDP for its 2016 Broadwell-E/EX and 2017 Skylake-E/EX enterprise products, recent leaks show that AMD is specifying as much as 250 Watts for its enterprise-grade processors, which we hope to see in enthusiast desktops and workstations as well. The single-socket design for AMD ZEN calls for up to 32 physical cores with Simultaneous Multi-Threading (SMT). Yes, AMD is finally abandoning the controversial Clustered Multi-Threading (CMT) which debuted with the Bulldozer CPU architecture. The CPU and the APU will both feature a single 16-core die on the package; alongside it you will have either an evil twin, another 16-core die, or a Polaris GPU with HBM memory.
Even though we’re not quite certain how AMD will develop an ECC memory specification at a 34% higher clock than the current JEDEC standard, ZEN supports DDR4-3200 through an octa-channel memory controller. This is the first time we’re seeing an 8-channel memory controller mentioned since DEC launched the Alpha 7 architecture. Bear in mind that DEC’s Alpha 21264 processor was launched almost 20 years ago, in October 1996. At 25.6 GB/s per channel, this means AMD ZEN would achieve system memory bandwidth of 204.8 GB/s. For comparison, that is higher than the bandwidth of the L3 SRAM cache on some of its current processors.
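For readers who want to check the bandwidth math, here is a minimal sketch of how the 25.6 GB/s and 204.8 GB/s figures fall out of the DDR4-3200 transfer rate. It assumes a 64-bit data path per channel (ECC bits excluded) and decimal gigabytes; the function name `ddr_bandwidth_gbs` is ours, not anything from AMD's material.

```python
def ddr_bandwidth_gbs(transfer_rate_mts, channels, data_bits_per_channel=64):
    """Peak theoretical bandwidth in GB/s (decimal) for a DDR memory setup.

    transfer_rate_mts: mega-transfers per second, e.g. 3200 for DDR4-3200.
    data_bits_per_channel: assumed 64 data bits; ECC bits carry no payload.
    """
    bytes_per_transfer = data_bits_per_channel // 8
    return transfer_rate_mts * bytes_per_transfer * channels / 1000


per_channel = ddr_bandwidth_gbs(3200, channels=1)
eight_channel = ddr_bandwidth_gbs(3200, channels=8)

print(f"DDR4-3200, one channel:    {per_channel:.1f} GB/s")
print(f"DDR4-3200, eight channels: {eight_channel:.1f} GB/s")
```

These are peak theoretical numbers; sustained bandwidth in real workloads would be lower.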
The performance story does not end there, as AMD is integrating a die-optimized version of the Polaris GPU onto the CPU package. This enterprise-grade APU packs four dies on a single MCM (Multi-Chip Module) package, which should make for an impressive feat of engineering: a “16-core” (i.e. 8-core) ZEN x86 CPU cluster; a ‘half a Fury in Polaris clothing’ GPU with target performance of 4 TFLOPS Single Precision / 2 TFLOPS Double Precision; and, as the last two dies, two HBM2 chips with 4GB capacity each. Thus, based on the information we have so far, AMD’s enterprise line-up is scheduled to have two distinct processors: a CPU and an APU.
ZEN High End ‘Exascale’ CPU, 1-4 Socket (1P-4P) – specs as per CERN
- Multi-Chip Module (2×16-core)
- 32 ZEN x86 Core, 6-wide
- 128 KB L0 Cache (4 KB per core)
- 2 MB L1 D-Cache (64 KB per core)
- 2 MB L1 I-Cache (64 KB per core)
- 16 MB L2 Cache (512 KB per core)
- 64 MB L3 Cache (8 MB cluster per quad unit)
- 576-bit Memory Controller (two times 4×72-bit, 64-bit + 8-bit ECC)
- 204.8 GB/s via DDR4-3200 (ECC Off, 102.4 GB/s per die)
- 170.6 GB/s via DDR4-2666 (ECC On, 85.3 GB/s per die)
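As a sanity check on the list above, the totals can be rebuilt from the per-core figures. This is only a sketch: it assumes binary units (1 MB = 1024 KB) and that a "quad unit" means four cores sharing one 8 MB L3 cluster, which is our reading of the list rather than anything AMD has confirmed.

```python
KB_PER_MB = 1024  # assumption: cache sizes use binary units

cores = 32
per_core_kb = {"L0": 4, "L1 D": 64, "L1 I": 64, "L2": 512}

# Total size of each per-core cache level across the whole package, in KB.
totals_kb = {name: size * cores for name, size in per_core_kb.items()}

# L3 is listed per quad unit: assumed four cores share one 8 MB cluster.
quad_units = cores // 4
l3_total_mb = quad_units * 8

# Memory controller width: two dies, four channels each, 64 data + 8 ECC bits.
controller_bits = 2 * 4 * (64 + 8)

for name, kb in totals_kb.items():
    if kb >= KB_PER_MB:
        print(f"{name}: {kb // KB_PER_MB} MB")
    else:
        print(f"{name}: {kb} KB")
print(f"L3: {l3_total_mb} MB")
print(f"Memory controller: {controller_bits}-bit")
```

Under those assumptions every total matches the listed value, so the spec sheet is at least internally consistent.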
ZEN High End Exascale APU, 1-2 Socket (1P-2P) – rumored specs from Fast Forward
- 16 ZEN x86 Core, 6-wide
- 64 KB L0 Cache (4 KB per core)
- 1 MB L1 D-Cache (64 KB per core)
- 1 MB L1 I-Cache (64 KB per core)
- 8 MB L2 Cache (512 KB per core)
- 8 MB L3 Cache (512 KB per core)
- 288-bit CPU Memory Controller (4×72-bit, 64-bit + 8-bit ECC)
- 102.4 GB/s via DDR4-3200 (ECC Off)
- 85.3 GB/s via DDR4-2666 (ECC On)
- 102.4 GB/s between CPU and GPU via GMI
- ~2000-core Polaris GPU
- 2048-bit GPU Memory Controller
- 4 GB HBM SGRAM Memory (2 chips at 2 GB)
- 512 GB/s GPU Bandwidth
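The GPU memory figures can be cross-checked the same way. The per-pin data rate is not given anywhere in the rumored specs, so the sketch below works backwards from the listed 2048-bit bus and 512 GB/s to see what rate they imply; treat the result as a derived assumption, not a confirmed spec.

```python
bus_width_bits = 2048        # listed GPU memory controller width
listed_bandwidth_gbs = 512   # listed GPU bandwidth, decimal GB/s
hbm_chips = 2                # listed number of HBM chips

# Bandwidth (GB/s) = bus width (bits) * per-pin rate (Gbit/s) / 8,
# so the per-pin rate implied by the listed figures is:
implied_pin_rate_gbps = listed_bandwidth_gbs * 8 / bus_width_bits

# Split evenly across the chips (assumes both chips are identical).
bits_per_chip = bus_width_bits // hbm_chips
bandwidth_per_chip_gbs = listed_bandwidth_gbs / hbm_chips

print(f"Implied per-pin rate: {implied_pin_rate_gbps} Gbit/s")
print(f"Per chip: {bits_per_chip}-bit interface, {bandwidth_per_chip_gbs} GB/s")
```

In other words, the listed bandwidth only adds up if each pin runs at 2 Gbit/s across a 1024-bit interface per chip.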
The first chip should be manufactured using 14nm FinFET in GlobalFoundries Fab 8 in New York state, while the ZEN APU will probably be manufactured by Samsung Semiconductor, which recently managed to land AMD’s business. Both processes (i.e. transistor libraries) are mutually compatible, as GlobalFoundries, IBM and Samsung Semiconductor used to be part of the Common Platform, a foundry marketing alliance which ended in 2014.
Both designs are scheduled to become available during 2016, with the consumer parts to follow in 2017. We certainly hope AMD will not make the mistake of keeping the consumer parts solely on Socket AM4, and that we will see parts using the new server socket in consumer and professional (workstation) environments. While AMD is staying tight-lipped on the launch dates, our sources say the company is preparing a significant PR/Marketing/Sales push for the following conferences:
- ISC 2016, June 19-23, Frankfurt, Germany
- HotChips, August 21-23, Cupertino, USA
- SC 16, November 13-18, Salt Lake City, USA
Is 2016 the year AMD (finally) becomes competitive again in HPC and Enterprise?