AMD ZEN CPU and APU Specs Confirmed?

It looks like 2016 is turning into a year of anticipation and redemption for AMD, not just for consumers, but also for customers who purchased millions of dollars of AMD hardware in the past and then felt left out. We all saw Oak Ridge National Laboratory, one of the first Opteron adopters, ditch a decade-old AMD collaboration for an IBM+NVIDIA team-up. Luckily for all involved, AMD seems to have finally “got their s*** together” and started a sales campaign which might be the most successful since Henri Richard led the sales team in taking more than 50% market share from Intel (albeit only in the 4P and 8P enterprise markets).

What’s this all about? We’re talking about the AMD Zen architecture, which in the (hopefully) high-end desktop, workstation and server markets just might be the jolt the PC world desperately needs. After seeing limitation upon limitation that Intel placed on its products and on how consumers can buy them, we just might be getting a product line-up with which AMD can attack all markets, especially the money-making markets which command extremely high margins. For example, a certain ‘green meets chrome’ company recently released Nikola Tesla-branded parts inside its partners’ blade servers. There are no product specs available publicly, just that they use the MXM format. But you can bet that the margin for that enterprise part is XXX% higher than for the consumer part.

Which is why we need an outrageous product from AMD. Radeon R9 Fury was OK, a good start, but it’s louder than equally performing air-cooled products from the competition. Sadly, we did not get to see a FirePro-branded part, or a server-based part with unlocked IEEE double-precision performance, where that R9 Fury, and especially the dual-GPU R9 Fury X2 or “R9 Fury VR”, would excel as a $1000 consumer and $5000 enterprise-grade part.

How AMD ‘Fast Forward’ Became a Product

AMD “Fast Forward” High-End CPU/GPU Architectural concept

In 2014, AMD presented “Fast Forward”, its vision for Exascale computing in line with DoE (U.S. Department of Energy) guidelines (and hundreds of millions of dollars in research and manufacturing grants). The company ultimately lost the $425 million contract for Lawrence Livermore National Laboratory and Oak Ridge National Laboratory (ORNL) to IBM and NVIDIA. Still, AMD was successful in “Fast Forward 2”, a U.S. DoE and National Nuclear Security Administration (NNSA) grant program where AMD, Cray, IBM, Intel and NVIDIA all received funding for development of Exascale-class processing architectures.

ZEN According to CERN

Fast Forward is the project that brings us to today. In a recent presentation at CERN, AMD touted its upcoming ZEN architecture as what it proudly calls the “ultimate product for the ultimate market”. With the future ZEN-based Opteron (it’s ‘probably’ getting a new brand too), AMD wants to show that it has not forgotten how to make a high-end processor, and the ZEN-based Opteron should represent the ultimate fusion of a CPU and a GPU.

CERN is considering AMD’s upcoming ZEN-based server processors for their computing needs.

According to a sales pitch to CERN, AMD is upping the ante when it comes to the most powerful piece of silicon it plans to produce. While Intel specified 180W as the highest available TDP for its 2016 Broadwell-E/EX and 2017 Skylake-E/EX enterprise products, recent leaks show that AMD is specifying as much as 250 Watts for its enterprise-grade processors, which we hope to see in enthusiast desktops and workstations as well. The single-socket design for AMD ZEN calls for up to 32 physical cores with Simultaneous Multi-Threading (SMT). Yes, AMD is finally abandoning the controversial Clustered Multi-Threading which debuted with the Bulldozer CPU architecture. The CPU and the APU will both feature a 16-core die on the package; alongside it you will have either an evil twin, another 16-core die, or a Polaris GPU with HBM memory.

Even though we’re not quite certain how AMD will develop an ECC memory specification at a 34% higher clock than the current JEDEC standard, ZEN supports DDR4-3200 through its octa-channel memory controller. This is the first time we’re seeing an 8-channel memory controller mentioned since DEC launched the Alpha 7 architecture. Bear in mind that DEC’s Alpha 21264 processor was launched almost 20 years ago, in October 1996. At 25.6 GB/s per channel, this means AMD ZEN would achieve a system memory bandwidth of 204.8 GB/s. For comparison, that is higher than the bandwidth of the L3 SRAM cache on some of AMD’s current processors.
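The bandwidth figures above follow from simple arithmetic, sketched below. It assumes a standard 64-bit data path per DDR4 channel (ECC bits carry no payload) and decimal gigabytes; the function and constant names are ours, purely for illustration.

```python
def channel_bandwidth_gbs(transfer_rate_mts: float, data_width_bits: int = 64) -> float:
    """Peak bandwidth of one DDR channel in GB/s (1 GB = 10**9 bytes)."""
    bytes_per_transfer = data_width_bits / 8  # ECC bits are not counted as payload
    return transfer_rate_mts * 1e6 * bytes_per_transfer / 1e9


CHANNELS = 8  # octa-channel controller, as described in the article

per_channel = channel_bandwidth_gbs(3200)  # DDR4-3200 runs at 3200 MT/s
total = per_channel * CHANNELS

print(f"per channel: {per_channel:.1f} GB/s")    # 25.6 GB/s
print(f"{CHANNELS} channels: {total:.1f} GB/s")  # 204.8 GB/s
```

These are theoretical peaks; sustained bandwidth on real hardware would be lower.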

AMD ZEN x86 Core design

The performance story does not end there, as AMD is integrating a die-optimized version of the Polaris GPU onto the CPU package. This enterprise-grade APU packs four dies on a single MCM (Multi-Chip Module) package, which should make for an impressive feat of engineering: a “16-core” (i.e. 8-core) ZEN x86 CPU cluster; a ‘half a Fury in Polaris clothing’ GPU with a target performance of 4 TFLOPS single precision / 2 TFLOPS double precision; and, as the last two dies, two HBM2 chips with 4GB capacity each. Thus, based on the information we have so far, AMD’s enterprise line-up is scheduled to have two distinct processors: a CPU and an APU.

ZEN High End ‘Exascale’ CPU, 1-4 Socket (1P-4P) – specs as per CERN

  • Multi-Chip Module (2×16-core)
  • 32 ZEN x86 Cores, 6-wide
  • 128 KB L0 Cache (4 KB per core)
  • 2 MB L1 D-Cache (64 KB per core)
  • 2 MB L1 I-Cache (64 KB per core)
  • 16 MB L2 Cache (512 KB per core)
  • 64 MB L3 Cache (8 MB cluster per quad unit)
  • 576-bit Memory Controller (two times 4×72-bit, 64-bit + 8-bit ECC)
  • 204.8 GB/s via DDR4-3200 (ECC Off, 102.4 GB/s per die)
  • 170.6 GB/s via DDR4-2666 (ECC On, 85.3 GB/s per die)
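As a quick sanity check, the totals in the list above can be rebuilt from the per-core figures. This is only a sketch that takes the leaked numbers at face value; the variable names are ours.

```python
CORES = 32
CORES_PER_CLUSTER = 4   # the "quad unit" in the listing
L3_PER_CLUSTER_MB = 8

# Per-core cache sizes in KB, as listed above.
per_core_kb = {"L0": 4, "L1 D": 64, "L1 I": 64, "L2": 512}

totals_kb = {level: size_kb * CORES for level, size_kb in per_core_kb.items()}
l3_total_mb = (CORES // CORES_PER_CLUSTER) * L3_PER_CLUSTER_MB

# Two dies, four channels each, 64 data bits + 8 ECC bits per channel.
controller_bits = 2 * 4 * (64 + 8)

for level, kb in totals_kb.items():
    print(f"{level}: {kb} KB ({kb / 1024:g} MB)")
print(f"L3: {l3_total_mb} MB")
print(f"memory controller: {controller_bits} bits")
```

The results (128 KB, 2 MB, 2 MB, 16 MB, 64 MB and 576 bits) match the listing, so the leaked spec sheet is at least internally consistent.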

ZEN High End Exascale APU, 1-2 Socket (1P-2P) – rumored specs from Fast Forward

  • 16 ZEN x86 Cores, 6-wide
  • 64 KB L0 Cache (4 KB per core)
  • 1 MB L1 D-Cache (64 KB per core)
  • 1 MB L1 I-Cache (64 KB per core)
  • 8 MB L2 Cache (512 KB per core)
  • 8 MB L3 Cache (512 KB per core)
  • 288-bit CPU Memory Controller (4×72-bit, 64-bit + 8-bit ECC)
  • 102.4 GB/s via DDR4-3200 (ECC Off)
  • 85.3 GB/s via DDR4-2666 (ECC On)
  • 102.4 GB/s between CPU and GPU via GMI
  • ~2000-core Polaris GPU
  • 2048-bit GPU Memory Controller
  • 4 GB HBM SGRAM Memory (2 chips at 2GB)
  • 512 GB/s GPU Bandwidth
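The GPU figures can be cross-checked the same way. The sketch below derives the per-pin data rate implied by a 2048-bit bus delivering 512 GB/s, and the shader clock implied by the 4 TFLOPS single-precision target mentioned earlier. The clock estimate assumes 2 FLOPs per core per cycle (one fused multiply-add) and rounds the core count to 2000; both are our assumptions, not numbers from AMD.

```python
def implied_pin_rate_gbps(bus_width_bits: int, bandwidth_gbs: float) -> float:
    """Per-pin data rate (Gbit/s) needed to reach a given bus bandwidth."""
    return bandwidth_gbs * 8 / bus_width_bits


def implied_clock_ghz(target_tflops: float, shader_cores: int,
                      flops_per_core_per_clock: int = 2) -> float:
    """Clock needed to hit a FLOPS target (assumes one FMA per core per cycle)."""
    flops_per_clock = shader_cores * flops_per_core_per_clock
    return target_tflops * 1e12 / flops_per_clock / 1e9


pin_rate = implied_pin_rate_gbps(2048, 512)  # figures from the list above
clock = implied_clock_ghz(4, 2000)           # "~2000-core" GPU, 4 TFLOPS SP

print(f"implied HBM pin rate: {pin_rate:.1f} Gbit/s")  # 2.0 Gbit/s
print(f"implied GPU clock: {clock:.2f} GHz")           # 1.00 GHz
```

Under those assumptions the rumored numbers point to roughly 2 Gbit/s per pin and a GPU clock of about 1 GHz.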

The first chip should be manufactured using 14nm FinFET at GlobalFoundries’ Fab 8 in New York state, while the ZEN APU will probably be manufactured by Samsung Semiconductor, which recently managed to land AMD’s business. Both processes (i.e. transistor libraries) are mutually compatible, as GlobalFoundries, IBM and Samsung Semiconductor used to be part of the Common Platform, a foundry marketing alliance which ended in 2014.

Both designs are scheduled to become available during 2016, with the consumer parts to follow in 2017. We certainly hope AMD will not make the mistake of keeping the consumer parts solely on Socket AM4, and that we will see parts using the new server socket in consumer and professional (workstation) environments. While AMD is staying tight-lipped about the launch dates, our sources say the company is preparing a significant PR/Marketing/Sales push for the following conferences:

  • ISC 2016, June 19-23, Frankfurt, Germany
  • HotChips, August 21-23, Cupertino, USA
  • SC 16, November 13-18, Salt Lake City, USA

Is 2016 the year when AMD is (finally) coming back to being competitive in HPC and Enterprise?

  • Carnot Antonio Romero

    I honestly hope they pull this off. This is such an unforgiving industry– one false step and you’re in the wilderness for a couple of years. Especially since they didn’t have the leeway that Intel’s top-notch fabs have given them when they made mistakes like the Netburst architecture, or that Intel’s ability to fund multiple design teams gave them to come up with plan B quickly in that case.

    • Actually, I owned a Bulldozer a few years back. Wasn’t that bad really. It’s the issue of really giving the public a straight enough showcase of what the processor specs are, what they are capable of and how well they perform. If AMD made a transparent showcase, instead of the whole Bulldozer blunder, they would be in a far better position.

      Not to say AMD is doing bad these days, it’s just that sometimes they have an issue of communicating their ideas/nature of their products to the public. Nothing else! By the way, I can’t wait to see these.

  • So it’s confirmed that these Opterons are a separate core design to the mainstream and enthusiast models? I noted the lack of L3 cache in the Opteron APU, and would probably hope for AVX-512 for the Opterons.

    • BaronMatrix

      AVX 512 is just marketing for what GPUs have been doing for years…

      • All ISEs are just marketing terms. It seems as though this information has been debunked anyway.

    • All AMD is saying about the ZEN-based desktop parts is that they’re coming to the AM4 socket. They’re not disclosing anything about the new socket which succeeds Socket F (LGA-1944), which is only used in servers. My $0.02 is that it is quite pointless for AMD to keep a dual infrastructure, as it is not cost-effective. Yet, the company lacks the cojones to stop ordering PGA-based packaging and sockets from Foxconn and make the move which Intel made over a decade ago: BGA, LGA-115x, LGA-2011… even Itanium is an LGA design which attaches to the processor PCB.

      • You do actually raise a question that I have had for a while, which is, is there any information regarding Zen’s processor package? I know it’ll be using AM4 and FP4, but is there any information on how many pins or lands will be used?

        • At the current point in time, both AM4 (Desktop) and FP4 (Mobile) are unknown, as well as the enterprise-class next-gen ‘Socket X’, which needs to pack as much as eight 72-bit DDR4 lanes, meaning we’ll be looking probably at 2500-3000 LGA design.

      • BaronMatrix

        I think they’re differentiating on platform since Opteron needs ECC support… So there should be two sockets, one for desktop and one for server… They have to fit both CPUs and APUs… I guess they COULD have one platform where “AM4” doesn’t support ECC…

        They are being real quiet about Opteron… I mean, the limit of HT was MCM above 4P so they need to really expand if they are going to 32 cores which means eight packages of 4 cores or 8 interconnects per socket…

        But I guess they could use two modules per socket for 8 socket… That’s still too many nodes for HT though…

  • BaronMatrix

    The interesting thing to note in this is it says SIX WIDE… Unless the AGUs can double (through superscalar) as ALU the Linux patch corrects it to FOUR WIDE…

    But it shows that AMD is serious about getting back into servers…

    I’ll be glad… I can upgrade my FX…

    • That is one weird element of all the presentations I received from AMD about the ZEN core, and of those presentation slides that leaked online. AMD is putting in a lot of six elements, but I am not sure whether it is because they really have ‘six inside’ or because the ‘puzzle’ has nine elements and they’re focusing on keeping things CPU-focused with a 6:3 ratio, a 2:6:1 ratio for the GPU part etc.

      • BaronMatrix

        Since it was done by Keller the sky’s the limit…

    • TheDizz

      Actually it is the other way around, ALUs can be used as AGUs. This is primarily because address-generation calculations involve different integer arithmetic operations, such as addition, subtraction, modulo operations, or bit shifts. Though there are some drawbacks or tradeoffs in doing that, don’t quote me on this, I am no engineer, just recalled what I read somewhere.

      This is nothing new apparently so in recent years engineers could have figured out new methods to make ALUs as effective as AGUs without any trade offs, they are a clever bunch. 6 wide would then make sense if the ALUs are turned into AGU mode based on some signal or flag being set high or low on the fly. Something like switching from normal mode to test mode for JTAG operations.

  • Top

    AMD is fighting a battle on both fronts, AMD is holding her own!

    • PeterL

      AMD has already lost it, I fear.

      • Definitely can’t agree on this! AMD is on the offensive, really. Just need to stay tight and wait to find out how well they perform when the new products come.

  • PeterL

    I am curious, how many times should we, fans and media, repeat the same thing over and over again? 40% improvement is not enough. Period. According to many tests, modern AMD CPUs are 50-80% slower than Intel equivalents. So, how will this 40% better IPC get Zen on par with the latest Intel processors that will be 70-100% faster than current AMD CPUs? I feel that fans will have to find Phenom II X4 and X6 processors (I cannot find anything but the Phenom II X2, which is too slow for modern games). I rather foresee the sunset of

    • Absolute performance and IPC throughput are two different metrics. I am predicting a quad-core Zen APU @ 4.00 GHz to perform up to the levels of an i7-4790T @ 3.30 GHz, which is perfectly adequate for any task a consumer will throw at their processor. Even the most demanding games don’t care about the difference between a 4790T and a 4790K. Sure the AMD chip is running at a higher frequency (these frequencies are turbos with all cores active), but it gets the job done within the same TDP and will likely be cheaper.

    • Morgan

      Funny you say this considering that most testing is done on programs where Intel built in or had specific instructions put in that yield better results on, gasp… Intel CPUs. Real world performance has a different story to tell wherein the superiority of Intel is negligible at best and, in some cases, Intel is inferior. But if you wish to continue drinking the spiked Intel Koolaid, I’d suggest you at least eat the complimentary elitist hors d’oeuvres.

      I think everyone needs to wait until review units are sent out and reviewed with real world metrics before jumping on the hate bandwagon.

  • dave

    AMD better not take until 2017 to release consumer Zen based APUs and CPUs because if they do… they’re gonna lose out on a shit ton of money, people are fed up of waiting for AMD to get their act together and release something as competition to Intel and thus will turn to Intel again through lack of choice. AMD absolutely do not want to miss the golden window of when PC enthusiasts build a whole new system, which typically happens every 3 years or so.

    Enthusiasts won’t mind systems that are a bit fickle due to immature drivers and or UEFIs if it means shaving (for sake of argument) 3 months or more off of the wait time.