Intel Moves to Attack NVIDIA Quadro, AMD FirePro Market

Intel Xeon Phi "Knights Landing" Chip Package

At the recently held Hot Chips 2015 conference, Avinash Sodani (KNL Chief Architect, Senior Principal Engineer, Intel) gave a talk on how Intel plans to expand the Xeon Phi product lineup from a server-only PCIe card concept into three different packages, which would appeal to workstation and server customers in different fields. At the SC’15 conference, which takes place in Austin, TX, Intel finally confirmed the strategy and is coming out with a workstation product that will feature a fully enabled Knights Landing (KNL) many-core processor.

Intel Xeon Phi "Knights Landing" product family. Announced workstation product is the one on the left.

In the first half of 2016, the company will ship an Intel-built, Intel-branded workstation powered by a self-booting Xeon Phi processor. The processor will be able to boot standard operating systems such as Linux distributions or Microsoft Windows. The main purpose of the workstation is to ‘enable researchers to develop and test their code before the deployment inside the supercomputer environment.’

Thus, this is not the discrete PCIe product that most news organizations are showing pictures of. It is a computer equipped with two sockets: one will feature the Xeon Phi chip, while the other will probably hold a single Xeon E3 that provides the display output (KNL does not support display outputs).

Intel 2nd Generation Xeon Phi family is based on Knights Landing processor.

As you can see in the picture above, Intel Knights Landing (KNL) brings 72 processing cores split into 36 tiles, with similarities to AMD’s Bulldozer architecture. Each tile packs two cores with their respective L1 caches, 1MB of L2 cache and four VPUs (Vector Processing Units). This will be the largest chip Intel has ever made, finally larger than the dead-end Itanium architecture. The new chip should be the fastest product Intel has ever made, with 16GB of on-board memory (Intel uses the acronym MCDRAM, i.e. Multi-Channel DRAM), and a memory controller with six 72-bit channels that can support an additional 384GB of DDR4-2400 memory.
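The per-tile figures above can be rolled up into chip-level totals, and the DDR4 side of the memory system can be turned into a rough peak bandwidth number. The Python sketch below does both. The tile counts, channel count, channel width and DDR4-2400 speed come from the text; the split of each 72-bit channel into 64 data bits plus 8 ECC bits is our assumption, and the result is a theoretical ceiling, not a measured figure.

```python
# Figures taken from the article text.
TILES = 36
CORES_PER_TILE = 2
VPUS_PER_TILE = 4
L2_MB_PER_TILE = 1

DDR4_CHANNELS = 6
CHANNEL_WIDTH_BITS = 72        # as stated in the article
ECC_BITS_PER_CHANNEL = 8       # ASSUMPTION: 64 data bits + 8 ECC bits
TRANSFERS_PER_SECOND = 2400e6  # DDR4-2400 means 2400 mega-transfers per second


def chip_totals():
    """Roll the per-tile figures up to whole-chip totals."""
    return {
        "cores": TILES * CORES_PER_TILE,
        "vpus": TILES * VPUS_PER_TILE,
        "l2_mb": TILES * L2_MB_PER_TILE,
    }


def ddr4_peak_bandwidth_gbs():
    """Theoretical peak DDR4 bandwidth in GB/s, counting data bits only."""
    data_bytes_per_transfer = (CHANNEL_WIDTH_BITS - ECC_BITS_PER_CHANNEL) // 8
    bytes_per_second = DDR4_CHANNELS * data_bytes_per_transfer * TRANSFERS_PER_SECOND
    return bytes_per_second / 1e9


if __name__ == "__main__":
    print(chip_totals())
    print(f"DDR4 peak bandwidth: {ddr4_peak_bandwidth_gbs():.1f} GB/s")
```

Under these assumptions the six channels top out at roughly 115 GB/s, which helps explain why Intel pairs the DDR4 with a separate pool of on-package memory.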

Performance, in theory, sounds great: 6 TFLOPS Single Precision, 3 TFLOPS Double Precision. However, as always with Intel, the numbers need to be brought into perspective. What the company is doing is amazing compared to what it has done so far, but as with integrated graphics performance, reality is something completely different.
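For readers wondering where peak figures of this kind come from, here is a minimal Python sketch of the usual theoretical-peak calculation: cores × vector units per core × vector lanes × operations per lane per cycle × clock. The 72 cores and two VPUs per core (four per tile) follow from the article; the 512-bit vector width, the fused multiply-add counted as two operations, and especially the 1.3 GHz clock are our assumptions for illustration, not figures confirmed in this piece.

```python
def peak_tflops(cores, vpus_per_core, vector_bits, element_bits,
                clock_ghz, flops_per_lane_per_cycle=2):
    """Theoretical peak in TFLOPS.

    flops_per_lane_per_cycle=2 models one fused multiply-add
    (a multiply plus an add) per lane per cycle.
    """
    lanes = vector_bits // element_bits
    gflops = cores * vpus_per_core * lanes * flops_per_lane_per_cycle * clock_ghz
    return gflops / 1000.0


# From the article: 72 cores, four VPUs per two-core tile.
KNL_CORES = 72
KNL_VPUS_PER_CORE = 2

# ASSUMPTIONS, not from the article: vector width and clock speed.
ASSUMED_VECTOR_BITS = 512
ASSUMED_CLOCK_GHZ = 1.3

if __name__ == "__main__":
    dp = peak_tflops(KNL_CORES, KNL_VPUS_PER_CORE, ASSUMED_VECTOR_BITS, 64,
                     ASSUMED_CLOCK_GHZ)
    sp = peak_tflops(KNL_CORES, KNL_VPUS_PER_CORE, ASSUMED_VECTOR_BITS, 32,
                     ASSUMED_CLOCK_GHZ)
    print(f"Double precision: {dp:.2f} TFLOPS")
    print(f"Single precision: {sp:.2f} TFLOPS")
```

With the assumed clock, the sketch lands very close to the quoted 3 and 6 TFLOPS. Keep in mind that this is a best case in which every vector lane performs a fused multiply-add on every cycle, which real code rarely achieves.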

AMD Fiji GPU

AMD Fiji GPU was launched in June 2015. In terms of raw performance, it eats a mid-2016 Intel part alive: 8.9 billion transistors, 8.6 TFLOPS SP and (unlocked) 4.3 TFLOPS DP performance.

The numbers look great on paper, but we need to look at the competition. AMD’s consumer board, the Radeon R9 Fury X, packs 8.6 TFLOPS of Single Precision, while the NVIDIA M40 brings 7 TFLOPS of Single Precision (all numbers are IEEE 754 standard). Both AMD and NVIDIA limit the Double Precision performance of their products, and for reasons we do not understand, the M40 and M4 are the first Tesla products where NVIDIA cut DP performance to a consumer level. Thus, Intel has a great opportunity in all the markets that demand Double Precision. The danger for Xeon Phi is that today you can buy higher performing parts for a lower price, and as we saw in several data centers, Intel Xeon Phi in its current iteration is far from being an efficient product. However, Xeon Phi is the only one that can boot the OS itself, while NVIDIA and AMD both need a CPU to boot.

2016 will see the arrival of two new architectures from AMD and NVIDIA. AMD will debut the Arctic Islands architecture, with the Greenland GPU on the high end, and NVIDIA will debut Pascal, which will bring a native connection not just to x86 processors like AMD Opteron and Intel Xeon, but to IBM POWER8 and the future POWER9 as well. AMD and Intel will use a 14nm process (AMD will use the GlobalFoundries 14nm FinFET process), while NVIDIA will utilize TSMC’s 16nm FinFET process.

To bring things into perspective, ever since it debuted in 2000, the NVIDIA Quadro family has ruled the roost of the professional graphics accelerator market. This was accented with the arrival of the HPC family named Tesla in 2007. On the other hand, AMD’s FirePro family always came a distant second, with its best market share being 2.1 graphics cards for every 7.9 Quadros sold. Over the past decade, the share stabilized at about 10% for AMD, and even though AMD sometimes came up with a better feature set or hardware capabilities, NVIDIA remained ‘king of the hill’.

That might change, though. When Intel envisioned its graphics processing unit, codenamed “Larrabee”, the company wanted to enter the discrete market with a bang. Unfortunately, after numerous years in development and several billion dollars, Larrabee was dead on arrival, and caused a significant management shakeup in the company. Now, the company has a chance to show all of its muscle and get into the market. This will not happen in 2016, but 2017 and 2018 might see more discrete parts based on future versions of Xeon Phi, which might give the competition a serious run for its money not just in the Tesla/FirePro S/Xeon Phi space, but in the Quadro/FirePro market as well. Remember, all these markets are very low volume, but extremely high margin.

Still, neither NVIDIA nor AMD has delivered ease of access when it comes to programming, which is, as we all know, much more important than sheer performance. In our conversations with the organizations we advise, we discovered that all three vendors are heavily criticized. AMD pushes everything on OpenCL, which is not ‘mature enough’, NVIDIA CUDA ‘limits you to a single platform’, but also ‘is the most scalable and provides highest efficiency that Amdahl’s law allows’, while Intel was criticized for ‘it is not true that (current) Xeon Phi is completely binary compatible, as you have to recompile all the applications,’ followed by ‘the product is too hot and does not scale as (Intel) promised.’ Due to NDA limitations, we cannot disclose the people behind the quotes we provided here, but we can confirm that they come from leaders of supercomputers ranked in the Top 10 and Top 25 of the Top500 list.
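Since one of the quotes leans on Amdahl’s law, a short illustration may help. The law says the speedup from spreading work over n processing units is capped by the part of the program that stays serial: speedup = 1 / ((1 - p) + p / n), where p is the parallel fraction. The Python sketch below is a generic illustration of that formula; the 95% parallel fraction in the example is a hypothetical value we picked, not a measurement from any of the products discussed here.

```python
def amdahl_speedup(parallel_fraction, n_units):
    """Upper bound on speedup according to Amdahl's law."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)


if __name__ == "__main__":
    # HYPOTHETICAL workload: 95% of the runtime parallelizes perfectly.
    for units in (2, 8, 72, 4096):
        print(f"{units:5d} units -> at most {amdahl_speedup(0.95, units):.2f}x")
```

However many cores or shaders are thrown at such a workload, the speedup can never exceed 1 / (1 - p), which is 20x for the hypothetical 95% case. That is why programmability and scaling behaviour matter at least as much as raw TFLOPS.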

Furthermore, the market opportunity lies in that ‘extra step’ Intel is making with KNL. Researchers are vying for more onboard memory, and neither AMD nor NVIDIA has done a good job of addressing that problem. The problems researchers and developers are trying to solve do not fit in products which were originally developed with consumers in mind (Radeon, GeForce), and the ability to create a complete server-oriented, commercially oriented part could be Intel’s golden ticket. None of the people we work with and continuously talk to accept the arguments from AMD and NVIDIA about adding more memory, because at the end of the day, Tesla is a GeForce with unlocked DP performance, a lower clock and (sometimes) more memory and ECC support (which again eats into available memory space).

There is a fascinating battle developing here… we can’t wait for more company, like Qualcomm’s server developments.


  • roger crouch

    The Xeon Phi and those GPUs excel at COMPLETELY DIFFERENT tasks. The Phi is for when you need more CPU power.

    • It is still too early to tell where Intel is positioning Xeon Phi as a platform, since the original ‘seamless accelerator’ platform fell flat, and their largest customer is far from being happy. Intel being Intel, the company is known for its engineering excellence and reaction to what the market demands. I for one am looking forward to seeing a Xeon E3 and Phi combo inside a single computer. And then an E5 mixing it up with a Phi.

      I do believe that AMD and NVIDIA have great hardware as well, just that Intel has shown time and time again that they can execute. Just not in the graphics space, and that space is becoming more important than ever: Virtual Reality and especially Augmented Reality will demand more than our current, and even 2016, products can offer. Latency + performance takes precedence over thermals. We used to think a 130W CPU and a 300W GPU were the most a system could throw. Now, we’re at 145W CPU (Xeon E5) and 300W GPU (Fiji), moving to 180W CPU (Broadwell-E) and 375W GPUs…

      • roger crouch

        It’s easy to tell where it’s positioned: it isn’t a series of three thousand individual math units, it’s a cluster of CPUs. Trying to claim it is in competition with GPUs is like trying to put the M1A2 on the same transport efficiency chart as the A380. Both are purpose-built hardware and near the limits of their technology, but a tank can’t fly and a plane can’t stop bullets; they simply aren’t built that way.

        Xeon Phi simply isn’t built in a way that it can ever directly compete with the GPUs in what they do well and intel never meant it to try. Phi is to replace clusters of cpu processors used in large scale modeling so that you can have several hundred cores in one box instead of having to develop special software to run a whole rack as a single computer.

        While Phi is getting stronger and quicker, it isn’t even on the same pace as the GPUs that have doubled in power in the last two years and are about to double in power again. With the drop to 14/16nm on GPU we’re going to see shader counts in the 4800/9600 range soon – the Phi is still at most 280 by comparison.

        This isn’t apples and oranges, this is apples and whales.

        Virtual and Augmented reality are a joke on the processor scale, a single core on my i7-4930 can keep up with me – I don’t need 60 boosters. Additionally you’re rating hardware by its power consumption? We’ve had 375 watt GRAPHICS CARDS since the 6970 and 580 but I can assure you the GPU itself doesn’t use all that power. High speed ram at any voltage is basically just a convoluted firehose of power drain.

        So I gotta say, you may fool 95% of the people 95% of the time with the level you yap at, but you’re not fooling me or anybody important.

        Go sell it to a poor farmer.

        • Roger,

          I do not believe I am claiming Intel is positioning Xeon Phi to go against the GPUs. I do not believe I am claiming NVIDIA or AMD are positioning themselves against the CPU. However, the realities are that the companies themselves pitch said products against each other. We have markets where these products go one against each other – HPC and Data Center, and now this is coming to a workstation level.

          Is NVIDIA Tesla a GPU or an ‘accelerator’ when it does not have any display outputs, and is incapable of ‘rendering’ any resolution due to a simple lack of TMDS’s on the actual board? Yet, it is powered by the same piece of silicon which powers your GeForce or a Quadro. Same story with AMD. FirePro S-edition is not a GPU, albeit a number cruncher.

          And to say that Augmented and Virtual reality aren’t demanding… I am sorry, but the world disagrees with you. Xeon Phi, Tesla and FirePro S-series are mutual competitors. The Intel Core i7-4930 you have features integrated graphics, which is a competitor to lower-end GPUs from both AMD and NVIDIA.

          • Corey

            Nah AMD and NVidia would love to get rid of the CPU. How much of an advantage would it be to just go back to one chip that does everything really well. No more sharing resources, just one chip to rule them all. Not to mention that’s all Intel have. Not to say they couldn’t start making GPU’s because Intel have the money to get into almost any market. The issue is Intel always like to use their chip design. I loved the thought of the Larrabee back in the day but mostly because it was said to add more x86 performance when not running graphics. But obviously that fell on its face. I think personally what will eventually happen is x86 will be a very small part of the market. Nvidia and AMD both are limited by what Intel decide due to Intel’s significant market share. What I mean by this before anyone starts raging is, NVidia cannot produce x86 processors and enter the CPU market, AMD are lucky to have the x86 license but it is hard for them to really push new technologies due to lack of supporting market share. Not trying to cry on behalf of AMD but Intel are clearly in the best position. Prob not an issue for AMD or Nvidia unless Intel really learn how to make GPU’s or dedicated ones for that matter. The only likely outcome is that NVidia will be pushed out of the sub $150aus or ARM will take a hold in the desktop industry even if it is only in the light end market. All this is merely opinion though.

          • RV

            Actually R9-395x and FirePro W7100 are both Tonga core GPU’s. The FirePro has dual precision T-flops enabled, runs a tad faster and is called a Tonga Pro, etc.

            The silicon is essentially the same; the board that carries it, however, may do more to define its use as an HPC binned part or a Workstation part rather than a pure gaming part. They both run DX12 and OpenGL and are both GCN capable.

            Did you really think that AMD makes a Pro series that is NOT the consumer silicon? NVidia does too. Rebranding Pro silicon is common-place. Fiji-pro however really smacks it with over 8 T-flops single and over 4 T-flops dual precision.

            Fiji Pro X2 with 2x all that PLUS 1 terabyte of ram!!!!! Maybe there is not enough of them to build the product but it would certainly be a monster.

            Knights Landing faces new generation silicon, and if Fiji has it beat, then what happens next April when Greenland is announced?
            AMD cutting prices across the board to clear the channel smacks of an early release for Greenland, maybe Summer 2016.

  • RV

    Intel does not enjoy the consumer subsidy that AMD and NVidia realize, which stimulates new GPU design. As a result Intel HPC KL is vastly overpriced compared to a comparable FirePro or Quadro. As a result AMD’s Fiji GPU with the DP enabled will be the industry benchmark for both price and performance. And Greenland is just around the corner.

    In fact HPC is ONLY made possible by this Consumer Subsidy and the entire reason that HPC is now using dGPU.

    In fact HPWIRE since late 2010 and 2011 has written several pieces and made this observation quite clearly.

    GPU’s are staggeringly expensive to design and develop; numbers such as $250-300MILLION have been tossed about.

    If that cost can not be recovered from consumer sales then that cost will be passed on to the workstation user. Workstation users are NOT that common.

    So while Knights Landing may look impressive on the specification side, it can not compete in terms of price/T-flop. While governments may spend recklessly, corporate and academic users do not.