14nm AMD Greenland tapes out: Attack on NVIDIA Pascal, Intel Xeon Phi

A couple of days ago, GlobalFoundries issued a press release stating that they ‘demonstrated silicon success on the first AMD (NASDAQ: AMD) products using GLOBALFOUNDRIES’ most advanced 14nm FinFET process technology.’

“FinFET technology is expected to play a critical foundational role across multiple AMD product lines, starting in 2016. GLOBALFOUNDRIES has worked tirelessly to reach this key milestone on its 14LPP process. We look forward to GLOBALFOUNDRIES’ continued progress towards full production readiness and expect to leverage the advanced 14LPP process technology across a broad set of our CPU, APU, and GPU products,” said Mark Papermaster, Senior Vice President and Chief Technology Officer at Advanced Micro Devices.

According to our sources, the company focused on and pushed the development of GlobalFoundries’ 14nm process from the get-go. In January 2015, their 14nm process “was successfully qualified for volume production, while achieving yield targets on lead customer products.” Furthermore, “The performance-enhanced version of the technology (14LPP) was qualified in the third quarter of 2015, with the early ramp occurring in the fourth quarter of 2015 and full-scale production set for (early) 2016.”

Behind the cryptic talk lies AMD’s decision to ‘double down’ on its complicated relationship with GlobalFoundries, which went from great (after all, GlobalFoundries was a spin-off of AMD’s foundries) to disastrous when AMD’s failed then-CEO Thomas Seifert (now CFO at Symantec) invoked a clause that resulted in a $500 million cash penalty, which AMD had to pay GlobalFoundries, almost bankrupting the company in the process. Luckily for AMD, Thomas was soon shown the door, with Rory Read and later Lisa Su taking the post to lead the path of recovery.

Raja Koduri is the face behind AMD's graphics spin-off, Radeon Technologies Group.

So, what did AMD tape out? In a recent conversation with Forbes, Raja Koduri, head of graphics spin-off Radeon Technologies Group (RTG), disclosed that “RTG will need to execute on their architectural designs and create brand new GPUs, something that Advanced Micro Devices has struggled with lately. He promised two brand new GPUs in 2016,” followed by their plan to “make Advanced Micro Devices more power and die size competitive.”

Thus, we have at least two GPUs coming from AMD: the mainstream (Baffin / Ellesmere) and high-end (Greenland) members of the Arctic Islands GPU family, as the low end is now integrated inside AMD’s APUs (Accelerated Processing Units). According to unconfirmed rumors, the company also taped out the first prototypes carrying the Zen architecture, which is still more than a year away. Given that Zen effectively changes AMD / RTG from top to bottom, it will be interesting to see when the company can deploy multi-TFLOPS APUs based on the Zen x86 architecture and the Greenland GPU architecture.

AMD Goes “Made in America”

The ‘double down’ on GlobalFoundries resulted in AMD’s switch from TSMC to GlobalFoundries for its next-generation GPUs, which will utilize a more advanced process node than those of their main competitor, NVIDIA. Also, AMD has a slight advantage with time zones, as GlobalFoundries developed and deployed the 14nm FinFET process in its Fab 8 facility in New York state, which is where AMD will produce all of the taped-out parts.

“Diffused in Germany, Made in Malaysia” marking on AMD Fusion A6-3600 APU (first product to do so). All future 14nm processors are expected to carry “Diffused in United States” marking.

You’ve read it correctly, folks. For the first time since 2002 and the spin-off of its Austin factory, AMD will have “Made in United States” or, as AMD likes to put it, “Diffused in United States” markings on its chips. Given the advancement of HBM and HBM2 memory in AMD’s lineup, we expect to see “Diffused in United States, Assembled in South Korea” markings on their chips.

In the end, the key question is: what will the Greenland GPU be? Even if Greenland were nothing more than an improved Fiji GPU on a new process, going from 28nm to 14nm would bring power down by 40-50%. Our sources say this is far from being the case, which was confirmed when Lisa Su, CEO of AMD, stated: “We are also focused on delivering our next generation GPUs in 2016 which is going to improve performance per watt by two times compared to our current offerings, based on design and architectural enhancements as well as advanced FinFET products process technology.” The statement came during the last quarterly earnings call.
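
As a yardstick for that 2x claim, a pure shrink that cut power by 40-50% at unchanged performance would already land at roughly 1.7x to 2x performance per watt. Here is a minimal sketch of that arithmetic; the function name and the constant-performance assumption are ours, not AMD's.

```python
def perf_per_watt_gain(power_reduction: float, perf_gain: float = 1.0) -> float:
    """Performance-per-watt multiplier for a fractional power cut.

    power_reduction: fraction of power saved (0.40 means 40% less power).
    perf_gain: relative performance (1.0 means unchanged; this is an assumption).
    """
    if not 0.0 <= power_reduction < 1.0:
        raise ValueError("power_reduction must be in [0, 1)")
    return perf_gain / (1.0 - power_reduction)


# The 40-50% range quoted for a straight 28nm -> 14nm shrink.
for cut in (0.40, 0.50):
    print(f"{cut:.0%} less power -> {perf_per_watt_gain(cut):.2f}x performance per watt")
```

In other words, a 2x figure is reachable at the top of the shrink-only range, so any architectural gains would have to show up as extra performance at the same power.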

AMD Roadmap clearly defines 2016 as the year of adopting the 14nm FinFET process and radical improvement in performance/watt.

After almost reaching transistor/FLOP parity with Fiji (8.9 billion transistors, 8.6 TFLOPS), Greenland is expected to go past that mark. We expect to see a 15+ billion transistor part (the first 10+ billion monolithic chip in the world), connected to 16 and 32GB of HBM2 memory, in single- and dual-GPU configurations. The board design comes from what the company learned delivering the revolutionary R9 Nano and the upcoming R9 Fury X2 (codenamed Gemini).

In terms of internal architecture, we’re dealing with a new beast. While Fiji was a refined Hawaii GPU, offering improved efficiency of the old units and HBM memory, Greenland offers a new micro-architecture, and should not be considered a member of the GCN (Graphics Core Next) family by default. Unlike the current lineup of GCN 1.0, 1.1 and 1.2 chips, Arctic Islands is either ‘GCN 2.0’, or a new marketing name is on the way. After all, in 2016, the GCN naming will be five years old, and feature ‘only’ three iterations (depending on whether you consider Bonaire XT / Fiji XT the same brethren, or Fiji being named GCN 1.3).

Development of HBM memory started in 2007, with AMD CPU and dGPU tests, continuing with Cypress HBM prototype... a fascinating journey. 2016 brings Greenland and HBM2 memory.

In just a few months, the market will see AMD Greenland facing NVIDIA Pascal, which is also getting HBM2 memory. However, NVIDIA is not the only rival, as Intel’s Knights Landing architecture arrives as a stand-alone Xeon Phi workstation product as well. Our sources repeatedly said that AMD learned a great deal in developing both HBM memory and GDDR5X (memory for more affordable graphics cards), and that their goal is to ‘knock the ball out of the park’ when it comes to offering superior compute capabilities (AMD GPUs perform double-precision operations at half the clock, just like AMD and Intel x86 processors, unlike NVIDIA GPUs). We are not at liberty to disclose the targeted raw performance of the part, as it is too early to say, given that ‘speed binning’ is not taking place yet. But even if the company hits significant thermal/yield issues, it should have no problem beating Intel’s Knights Landing by a factor of two.

“Intel is perhaps the biggest ‘fraud’ in history of semiconductor industry,” one of our sources said. “They convinced the whole world they’re a logic company, while in fact they are a memory company that attaches logic to it.” The riposte was simple: ‘that attached logic is so good your founder copied it.’ Getting back on track, there is some truth to it. An architectural diagram of Knights Landing reveals a very large L1 (2.3MB) and an absolutely massive 36MB of L2 cache, which is bound to take a huge chunk of the die, with the logic once more being on the smaller side. Knights Landing is claimed to achieve 6 TFLOPS of Single Precision and 3 TFLOPS of Double Precision. AMD Fiji achieves 8.6 TFLOPS Single Precision, and locks Double Precision to a 1/16 rate, something which professional parts have unlocked. In the case of an unlocked Fiji, you would have 4.3 TFLOPS of double precision. With Greenland, AMD plans to offer professional products for servers and workstations, with a completely new lineup. Then again, bear in mind that NVIDIA’s latest, Maxwell-based Tesla products also come with castrated DP performance, achieving less than one fourth of what Fiji can deliver today.
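
The double-precision figures above follow directly from the single-precision number and the rate fraction. Here is a minimal sketch of that arithmetic, using only the figures quoted in this article; the helper name is illustrative, and the half-rate "unlocked" Fiji is a hypothetical part.

```python
FIJI_SP_TFLOPS = 8.6   # Fiji single precision, as quoted above
KNL_DP_TFLOPS = 3.0    # Knights Landing double precision, as claimed above


def dp_tflops(sp_tflops: float, dp_rate: float) -> float:
    """Double-precision throughput as a fraction of single-precision throughput."""
    return sp_tflops * dp_rate


locked = dp_tflops(FIJI_SP_TFLOPS, 1 / 16)   # consumer Fiji, DP locked to 1/16 rate
unlocked = dp_tflops(FIJI_SP_TFLOPS, 1 / 2)  # hypothetical half-rate Fiji

print(f"Fiji DP, locked at 1/16:   {locked:.2f} TFLOPS")
print(f"Fiji DP, unlocked at 1/2:  {unlocked:.2f} TFLOPS")
print(f"Unlocked Fiji vs Knights Landing DP: {unlocked / KNL_DP_TFLOPS:.2f}x")
```

At the locked 1/16 rate Fiji sits at roughly half a TFLOPS of double precision, which is why the unlocked figure matters for the comparison with Knights Landing.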

Do bear in mind that AMD Greenland is intended to hook up to a new server chip, codenamed Zeppelin. This is a multi-chip module (MCM) server (and hopefully workstation/high-end desktop) processor which will bring support for DDR4-3200 memory, targeting memory bandwidth exceeding 100GB/s, and communicating with the Greenland GPU through a faster bus than PCI Express, with its paltry 16GB/s of bi-directional bandwidth, can offer. Whether this is a resurrection of Torrenza, a socketed, multi-chip HyperTransport initiative, remains to be seen.
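
To put the 100GB/s figure in perspective, a standard 64-bit DDR4-3200 channel peaks at 25.6GB/s, so exceeding 100GB/s would take at least four channels. A quick sketch of that arithmetic follows; the channel count is our inference rather than anything AMD has disclosed, and the helper name is purely illustrative.

```python
import math


def ddr_channel_gbs(mega_transfers_per_s: int, bus_width_bits: int = 64) -> float:
    """Peak bandwidth of one DDR channel in GB/s (decimal gigabytes)."""
    bytes_per_transfer = bus_width_bits / 8
    return mega_transfers_per_s * bytes_per_transfer / 1000


TARGET_GBS = 100.0  # "exceeding 100GB/s" target quoted above
PCIE_GBS = 16.0     # PCI Express figure quoted above

per_channel = ddr_channel_gbs(3200)
# Smallest number of identical channels that strictly exceeds the target
# (an assumption: the real memory configuration has not been disclosed).
channels = math.floor(TARGET_GBS / per_channel) + 1
total = per_channel * channels

print(f"DDR4-3200, one 64-bit channel: {per_channel:.1f} GB/s")
print(f"Channels needed to exceed {TARGET_GBS:.0f} GB/s: {channels}")
print(f"Aggregate: {total:.1f} GB/s, or {total / PCIE_GBS:.1f}x the quoted PCI Express figure")
```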

Is 2016 a landmark year for AMD / RTG?

With a manufacturing advantage over NVIDIA (14nm vs 16nm), and a compute advantage over both NVIDIA and Intel, it looks like the only thing preventing AMD from being a runaway success is AMD itself. We have had our fair share of bad experiences with the company, watching it lose already-won deals and walk away from business opportunities which ended up earning millions for its competitors.

If AMD (finally) acts smart, treating Fiji as an announcement and positioning Arctic Islands / Zen-based processing beasts for the world of Augmented and Virtual Reality, the company might have a shot at becoming very profitable in 2016 and 2017, which would make their stock undervalued beyond belief.

If they drop the ball (again), with misguided investments such as Mantle, they will have nobody to blame but themselves. For the sake of competition, we hope to see killer parts based on the Zen and Arctic Islands architectures.

Let’s wait and see.

  • 12John34

    Fiji looks more like two Tonga (Antigua) XT chips glued together than a refined Hawaii.

    Mantle was far from a misguided investment. It was probably one of the few times they did something clever. They created a low-level API that can help almost every piece of hardware they sell, pushed it to Microsoft and Khronos, making sure that it would have all the features they wanted, and then let those two pay the bill for implementing it and developing it further.

    • Hawaii, Bonaire, Tonga – all members of the GCN architecture, but I do get your point. I do not agree with the Mantle strategy. I was approached by both AMD and NV over Mantle, and what happened – happened. AMD is a company in dire financial straits and it should push for features they can charge for – if you were a stockholder and bought AMD at $5-6-7, seeing your investment being worth less than half is not something you would like to see in your life. Employee morale is another thing.

      AMD needs to go for low-hanging fruit. The company had the opportunity to invest little and gain big, rather than investing big and receiving little. I am not at liberty to discuss specifics – but they will make for a great book when I retire. 🙂 HBM might be the best case in point, just like GDDR1/2/3/4, ending with NV having a higher-performing part using the technology AMD created.

      Luckily, RTG is now being formed into a content-driven company, and time will tell whether they can turn their hardware excellence into a profitable business.

      • Daniel Anderson

        With further driver improvements through 2017 and a brand new CPU architecture with the potential to create new-gen hardware for consoles in 2018, I think AMD’s looking up. I purchased stock at ~$2 year on year. I will be making some good returns if Zen and Arctic Islands don’t deviate from what they say. Keep in mind these are also the first products from a management team that has whipped AMD operations into shape. I have confidence that having an engineer at the helm will make a difference, especially considering AMD’s engineers haven’t had this level of customization with their chips in ages. It’s going to be one of their last chances to turn around the losses that started in ~2006.

        • Given the pedigree of people in RTG, with Raja doing a lot of great things at ATI and AMD, later joining Apple and then going back to AMD, and Roy on the content side (NVIDIA’s TWIMTBP is largely his work), AMD has a really good fighting chance.

          • Daniel Anderson

            The thing that really gave me confidence in AMD is the fact that Zen will use SMT, their own variant of Hyperthreading, along with the majority of the same instruction sets used by Intel. With no excuse left along the lines of “This software isn’t optimized for AMD”, it should put Zen in a good spot where we can let the architectural design speak for itself and for the engineers, with nothing left to point fingers at. I hope they’re able to do something decent with it.

          • wownwow

            The current CEO was raised and educated with the sense of accountability and responsibility, and let’s hope no “oops” in execution!

      • Corey

        Nah, Mantle has put AMD’s hardware in the position it should have been in several years ago. FX8350s will be comparable to the i5s in performance at least; power consumption is another thing though. One of the biggest issues with the dozer arch is that DX 9, 10 and 11 all have limitations on how many cores can push information to the GPU, and the cores are weak when they aren’t all utilised. Mantle and DX12 fix this to an extent, spreading the work out much better, enabling better scaling over more CPU cores and allowing more cores to send to the GPU. I think Mantle gained enough ground quickly enough to show Microsoft, Intel and other platform users/developers that it is a powerful API that would eventually be hardware/platform agnostic. AMD has been putting features into their GPUs that allow for better performance for a few generations (or the last few rebrands), which is key to gaining more performance in DX12, hence why the GCN cards are trading blows with NVidia’s top GPUs. AMD made sure they put their technology into DX12. This is going to position them very well, provided Zen and future cards can hold their own against NVidia’s next round of cards.

        • Sweetie

          I don’t know about that. An i3 running at 3.5 GHz soundly beat the 8370 in an Ashes test article I read, paired with a 290X in DX12 (where the i3, not the FX, made big gains over DX11).

          This may be lazy development on the part of that company, since it’s easier to make a poorly-threaded game, but the main benefit of DX 12 seems to be in improving the performance of AMD’s GPUs, especially Hawaii models.

          Until developers start to write games with truly complex AI and such (rather than mostly just pretty graphics on simple systems) then an overpriced i3 will probably still win the day.

          • Corey

            Yeah if ya can’t use all them cores anything based off of dozer performs poorly. Usually you get games that can utilise the 8350 right and they can come close to the i5’s. Perhaps the API isn’t the holy grail for more than 4 core chips.

          • Sweetie

            i3s are just two cores with hyperthreads

          • Corey

            …thanks… never would have figured that out on my own…

          • Sweetie

            glad to be of service

      • wownwow

        “The company … investing big and receiving little.”

        The results have been showing so. The Chihuahua has been barking for trophies but the Bulldogs, Intel and nVidia, have been going after money!

        “… seeing your investment being worth less than half is not something you would like to see in your life …”

        Well said. The Chihuahua so far only served one purpose, keeping the prices of non-high-end client chips low.

        “… ending with NV having a higher performing part of the technology AMD created”

        Hopefully not; the poor little Chihuahua is mutating under Lisa Su!

  • Nwgat

    well, you forgot that Vulkan is mantle…
    Vulkan is a clone of the mantle spec

    • AS118

      Mantle’s also being used in their LiquidVR initiative as part of the backbone. It’s being used, just maybe not in the limelight. Also, I doubt DX12 would’ve even happened if not for Mantle, so that’s a win for consumers at least.

  • OutWest01503

    loosing?

  • Xycosis

    AMD needs to overhaul the GCN architecture. The last 3 generations are power hogs and, knowing AMD, the “new” 4th generation will probably add features that will not improve power efficiency, relying only on a GPU die shrink to help lower power consumption. That’s not going to cut it.

    • Sweetie

      Put out compute-weak GPUs so they can be like Maxwell? Compute always tends to come at the price of heat generation. Remember the GTX 480?

    • Blue Gum

      The excess heat and power consumption are not entirely the fault of the architecture, though GCN does need to improve its power consumption.

      AMD employs incredibly aggressive density techniques to reduce die size, which basically squeezes transistors together very tightly and gives them less room to dispel heat, and higher heat results in higher power consumption.

      Nvidia Titan X – 601mm2 – 8 Billion transistors.
      AMD Fury X – 596mm2 – 8.9 Billion transistors.

  • Jonathan Aguilera

    Mantle is better in some scenarios than DX12, or am I wrong??! Guess the world didn’t want to take advantage!?

    • Peter Den Gamer

      I bought the full pro version of win7 64bit a year ago. I don’t want to be forced to move to win10 just to get directx12. Can I get the same graphics quality when staying on windows 7 with the new AMD GPU Mantle tech?

      • jalebi Singh

        Yes. Mantle works fully on Windows 7

  • roger crouch

    A brand new Xeon Phi couldn’t even compete with a Radeon 5650 in GPU style compute. Phi is to replace racks of processor computers, not graphics cards. It simply can’t compete with discrete GPUs because it isn’t designed to compete with them.

  • Busybee

    Marketshare-wise AMD still has a very long way to go, especially with Nvidia and Intel around. “Adopting” Nvidia’s CUDA http://www.digitimes.com/news/a20151117PR203.html is a small step forward, but better than none.

    • perfectlyreasonabletoo

      Paywall? What kind of moron would actually pay money for that site? lol

      • Busybee

        I can read/access it from my side fine. If you cannot read/access it then simply paste the URL into Google search and use the “Cached” option. Here’s the contents…

        “AMD launches Boltzmann Initiative to reduce barriers to GPU computing on AMD FirePro graphics
        Press release, November 17; Joseph Tsai, DIGITIMES [Tuesday 17 November 2015]

        The new AMD Boltzmann Initiative suite includes an HCC compiler for C++ development, greatly expanding the field of programmers who can leverage HSA. The new HCC C++ compiler is a key tool in enabling developers to easily and efficiently apply the hardware resources in heterogeneous systems. The compiler offers more simplified development via single source execution, with both the CPU and GPU code in the same file. The compiler automates the placement code that executes on both processing elements for maximum execution efficiency.

        To complement the new compilation tools, AMD has developed a new HPC-focused driver and system runtime. This new headless Linux driver brings key capabilities to address core high-performance computing needs, including low latency compute dispatch and PCIe data transfers; peer-to-peer GPU support; Remote Direct Memory Access (RDMA) from InfiniBand that interconnects directly to GPU memory; and Large Single Memory Allocation support.

        To bring applications written for CUDA onto AMD platforms, AMD announces the new HIP tool. AMD testing shows that in many cases 90% or more of CUDA code can be automatically converted into C++ by HIP with the final 10% converted manually in the C++ language. This greatly expands the installed hardware base available to run what were formerly exclusively CUDA-based applications.”

  • Po Tato

    AMD Goes “Made in America”

    Made in the USA, by whom?

    By the same fella who bankrolls the islamic terrorists

    If you guys haven’t realize that, Globalfoundries is owned by the same Arabs who gave support to the islamic terrorists who cold-bloodedly massacred over 130 innocent civilians in Paris as well as the 14 innocent Christians who attended a Christmas celebration in California !

    Globalfoundries is not an American company – it’s an Arabian company, and its profit might one day be used to send terrorists into your country to kill you and enslave your daughters as sex slaves

    • What you are saying is pretty offensive. How many american companies are ‘terrorizing’ foreign governments to drive their agenda? I am referring to the tobacco industry, of course (watch John Oliver’s excellent piece).

      Now, getting to the subject of UAE, I do understand you are a troll – and like to equalize foreign countries just like opponents of same-sex marriage with conservative/backward states in the US. If UAE sponsors terrorism, can you show some evidence?

      Mentioning that a country which uses female pilots to bomb the extremists (read the differences between Q’ran and the ‘modifications’ Islamic extremists use)… and yet they sponsor terrorism? Wake up. Have you ever been to the UAE?

      • Jun

        Probably Po Tato saw the ownership by UAE, found the word “Arab” in it and short-circuited to islamic = terrorist = bad. The world is full of ignorant people... best to ignore them. They want the information from fox news to be true... best to leave them in their comfortable cocoon.

  • Astroboy888

    There is no manufacturing advantage between GloFlo/Samsung 14nm and TSMC 16nm. As demonstrated by the Apple A9, the performance is a wash. However, the 14nm showed a higher leakage current, which translates into higher power consumption.

    Yield wise, the GloFlow had its 30% for Apple A9 taken away when it couldn’t meet minimum 50% yield requirement in early 2015. So it remains to be seen if AMD can actually manufacture this GPU, which is larger and more complicated than a GPU at volume.

    • If a foundry really has a yield of 30% or 50%, the company cannot operate profitably. It all depends on what kind of contract GlobalFoundries signed with Apple, AMD, Qualcomm and others. Is it ‘per wafer’ (cheapest) or ‘per functioning die’ (most expensive option)? I would not be surprised if nickel-and-diming Apple went for the cheapest option and got their behind handed to them. Time will tell where GlobalFoundries stands now that both GlobalFoundries (both pmGF and pmIBM) and Samsung use the same gate-last process and commonly-developed process.