NVIDIA’s positioning of the GeForce / Quadro / Tesla line-up has seen a lot of turnover over the past couple of years. The sequence of “launch as GeForce, downclock as Tesla, optimize and launch as Quadro” changed into “launch as Tesla, optimize as GeForce and be reliable as Quadro”. With Pascal, the story turned out to be almost the same. NVIDIA introduced GP100 as a Tesla in April 2016, followed by the GP102 chip as the Titan X (no longer branded as GeForce), the Quadro P6000 and the Tesla P40. At the same time, GP104/106/107 did not follow the same sequence, with only GP104 debuting as the Quadro P5000 and Tesla P4. Second day of
Can Supermicro’s Pascal Beat Nvidia’s Own DGX-1?
At the GPU Technology Conference, Nvidia introduced its DGX-1 supercomputer. Combining two 20-core Xeon E5 v4 processors with eight Tesla P100 cards, the DGX-1 is a 3U server that promises to deliver 85.2 TFLOPS of compute performance (FP32). For a price of $129,000, you can order the DGX-1 system today and get the ultimate performance out of a single rack. Yet at that same event, another product may have already upstaged the performance delivered by a single DGX-1 server. On the second day of the show, we encountered Supermicro’s 1U ‘Super GPU’ server. While Supermicro is known as a manufacturer of ultra-dense computers, and is
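As a sanity check on that 85.2 TFLOPS figure, here is a back-of-the-envelope sketch in Python. The per-GPU numbers are NVIDIA’s published Tesla P100 SXM2 specifications (3584 CUDA cores, 1480 MHz boost clock), assumed here rather than taken from the article:

```python
# Rough FP32 peak for a DGX-1-class machine.
# P100 figures are NVIDIA's published SXM2 specs (assumed here).
CUDA_CORES = 3584        # per Tesla P100
BOOST_GHZ = 1.480        # SXM2 boost clock
FLOPS_PER_CYCLE = 2      # one fused multiply-add per core per cycle

p100_tflops = CUDA_CORES * BOOST_GHZ * FLOPS_PER_CYCLE / 1000.0
print(f"One P100:    {p100_tflops:.1f} TFLOPS FP32")   # ~10.6
print(f"Eight P100s: {8 * p100_tflops:.1f} TFLOPS")    # ~84.9
```

The remaining fraction up to the quoted 85.2 TFLOPS comes from the two Xeon host CPUs.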
OTOY’s New Model Brings 20,000 GPUs into Cloud
From its early days, the main focus for OTOY has been disrupting the visual effects (VFX) industry. Split between quick-but-unreliable rasterization and slow-but-correct ray tracing, both computer games and movies suffer the same fate when it comes to rendering physically correct, perspective-correct worlds. OTOY is working on changing that through its Octane renderer and Brigade, a mixed RT/ROP engine. However, OTOY did not want to stop at creating a physically correct graphics engine. During our visit to the company’s HQ in 2013, OTOY was working on its first deployment of Kepler-based graphics cards, with an initial target of 3,000 GPUs for its cloud rendering business. Under the
NVIDIA’s GTC 2016 To Focus on AI and VR
NVIDIA announced on Wednesday that IBM Watson Chief Technology Officer Rob High will deliver a keynote at GTC 2016, the GPU Technology Conference in San Jose, California, next month. High joins other keynote speakers at the conference, including Toyota Research Institute CEO Gill Pratt and Nvidia’s own CEO, Jen-Hsun Huang. Back in November 2015, IBM revealed that its Watson cognitive computing platform had begun using Nvidia’s Tesla K80 GPU accelerators. Working with NVIDIA, the company incorporated GPU-accelerated computing-as-a-service capabilities into SuperVessel, a global cloud-based OpenPOWER ecosystem resource. Users can now instantly launch the Caffe, Torch and Theano deep-learning frameworks from the SuperVessel cloud
OTOY Ported CUDA to Non-NVIDIA Hardware
VentureBeat reports that Los Angeles-based OTOY has managed to reverse-engineer Nvidia’s CUDA language to run on chips other than Nvidia’s own GPUs. That means programs written in CUDA can now run on GPUs provided by Intel, AMD, and ARM. Thus, software built for NVIDIA GPUs will work on a multitude of devices, ranging from AMD-based consoles (PlayStation 4, Xbox One) to an Apple iPad or iPhone. The cloud rendering company launched in January 2009 and has developed a technology that uses “clusters of GPUs” in the cloud to render cinema-quality graphics that are streamed to a client within a web browser. The company also provides
Mantle Cycle is Complete as Khronos Releases Vulkan 1.0
AMD is a company well known for designing and championing standards that soon become ‘open’ and ultimately become industry standards. What makes its approach unique is that quite often AMD did not benefit from that strategy, as the standards would explode in markets where the company is not present. Still, the list of open standards created by the tiny giant from Sunnyvale / Austin is remarkable. The Khronos Group has just released a ‘final initial’ (v1.0) specification of the Vulkan low-level API (Application Programming Interface). Launched as Mantle, AMD’s in-house low-level API became two snowballs: Microsoft reacted to Mantle by developing DirectX 12 in as little as 17 months. Only four months prior to Mantle’s announcement, Microsoft informed
AMD ZEN CPU and APU Specs Confirmed?
It looks like 2016 is turning into a year of anticipation and redemption for AMD, not just for its consumers, but also for customers who purchased millions of dollars of AMD hardware in the past and then felt left out. We all saw Oak Ridge National Laboratory, one of the first Opteron adopters, ditch a decade-old AMD collaboration for an IBM+NVIDIA team-up. Luckily for all involved, AMD seems to have finally gotten its s*** together and started a sales campaign that might be its most successful since Henri Richard led the sales team that took more than 50% market share from Intel (albeit only in 4P and 8P
GDDR5X Memory Shows Better Than Expected Results
2016 will be marked by the arrival of two memory standards, which should spread across the mainstream and high-end / enthusiast line-ups like wildfire. First, we have HBM2, an improved version of the HBM memory that debuted with (and so far ships only inside) AMD’s R9 Fury family of cards. HBM2 promises a fourfold increase in capacity and double the memory bandwidth, meaning a single card can go from 4GB and 512GB/s to 16GB and 1TB/s. Given the low volume of HBM and HBM2 memory, the two will probably remain exclusive to enthusiast graphics cards, such as the recently renamed Greenland, the high-end Polaris graphics processor from AMD
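The jump from 512GB/s to 1TB/s follows directly from the per-pin data rates; a minimal sketch, assuming the usual four-stack layout with a 1024-bit interface per stack (first-generation HBM runs at 1Gbps per pin, HBM2 at 2Gbps):

```python
# Memory bandwidth from stack count, bus width and per-pin data rate.
# Per-pin rates are the published HBM/HBM2 figures, assumed here.
def bandwidth_gbs(stacks, bus_bits_per_stack, gbps_per_pin):
    return stacks * bus_bits_per_stack * gbps_per_pin / 8  # bits -> bytes

print(bandwidth_gbs(4, 1024, 1.0))  # HBM1:  512.0 GB/s (R9 Fury X)
print(bandwidth_gbs(4, 1024, 2.0))  # HBM2: 1024.0 GB/s, i.e. ~1 TB/s
```

Capacity scales the same way: 1GB per HBM1 stack gives 4GB per card, while 4GB HBM2 stacks give 16GB on the same four-stack layout.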
Must See: This CGI Image looks Completely Real
Breaking Bad is a legendary TV show that set new records for limited-viewing (subscription-only) series and will be remembered as one of the shows that marked the 2010s. What is most impressive, however, is that the show did not utilize any sort of digital trickery or digital magic. It was shot on 35mm film, which gave the producers that old, grainy look and feel. In fact, cinematographer Michael Slovis pushed for using 35mm cinema cameras instead of TV-standard 16mm or digital footage. In a 2013 interview with Forbes, Mr. Slovis was quoted as saying that they deliberately wanted to remain analog. “When we started the name of the station was
BOXX Launches the Most Advanced PC in the World?
Buying a compute device – be it a personal computer, a workstation or a server – typically comes with a certain set of features that does not change. All workstation vendors, for example, offer one to four expansion cards, typically a GPU plus computational or storage cards. Those limits are set by what motherboard vendors can offer within the ATX/eATX/SSI standards. BOXX Technologies decided to challenge that by launching the APEXX 5, a custom-designed system that can host up to seven expansion cards – for example, a Quadro GPU for sync/display output and five Tesla boards for compute. If you go with dual-GPU boards, such as Tesla
AMD Radeon Fury X: Potential Supercomputing Monster?
When AMD launched its Fiji-based graphics cards, all eyes were focused on their performance in consumer applications such as computer games. And while the first results forced Nvidia to launch a “Titan Lite” in the form of the GeForce GTX 980 Ti, DirectX 12 benchmarks are starting to show a different, brighter outlook for AMD, starting with Ashes of the Singularity. The focus of this article, however, is the chip’s potential in applications where the Fiji GPU will be branded as FirePro and FirePro S (Server) – where AMD can take an ASIC and upsell it to commercial clients, with full speed enabled for double-precision floating point
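To put numbers on what “full speed” double precision could mean, here is a hypothetical sketch. The shader count and clock are the published Fury X specs; the 1/2-rate figure is purely an assumption, since AMD had not announced DP rates for a FirePro-branded Fiji:

```python
# Fiji (Fury X) peak throughput under different FP64 rate scenarios.
# 4096 shaders and 1.05 GHz are published specs; the 1/2 rate is a
# hypothetical 'pro' scenario (consumer Fiji is limited to 1/16).
SHADERS = 4096
CLOCK_GHZ = 1.05

fp32_tflops = SHADERS * 2 * CLOCK_GHZ / 1000.0   # 2 FLOPs/cycle (FMA)
print(f"FP32 peak: {fp32_tflops:.1f} TFLOPS")    # ~8.6
for ratio in (16, 2):
    print(f"FP64 at 1/{ratio} rate: {fp32_tflops / ratio:.2f} TFLOPS")
```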
Google Translate Translates Almost Everything in 27 Languages
A couple of years ago, the world of mobile apps was shocked by the appearance of the Word Lens app, which detected and translated words in a live camera view. Even though the mobile phones of the time had quite limited camera capabilities, what Word Lens showed us was the future of translation. Naturally, revolutionary apps like that do not just disappear, as Google proved by acquiring the app’s maker, Quest Visual. Word Lens for Google Glass was one of the most impressive things I’ve personally used on Google’s first attempt at Augmented Reality, as you can see in the video below. Now, over a year has passed since the acquisition of
NVIDIA GeForce 980 Ti Final Specs Revealed
As we approach Computex and the majority of press and media analysts are on planes en route to Taipei, companies such as Intel, Nvidia and AMD are polishing their press releases for the first day of the show. One such product is the GeForce GTX 980 Ti, a product refresh that does not have a lot to do with the word ‘refresh’. The original GTX 980 was based on the GM204 GPU, featuring 2048 CUDA cores attached to 4 or 8GB of GDDR5 memory. As you might have guessed, the chip used a 256-bit memory bus. When you combine a GPU clock of 1.12 GHz and a GDDR5 clock
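The excerpt cuts off before the memory math, but the bandwidth calculation it is heading toward is straightforward; a sketch assuming the GTX 980’s stock 7Gbps effective GDDR5 data rate (an assumption, since the quoted GDDR5 clock is missing here):

```python
# GDDR5 bandwidth from bus width and effective per-pin data rate.
# The 7 Gbps effective rate is the GTX 980's published spec, assumed
# here because the excerpt ends before stating the memory clock.
def mem_bandwidth_gbs(bus_bits, effective_gbps):
    return bus_bits / 8 * effective_gbps

print(mem_bandwidth_gbs(256, 7.0))  # GTX 980:    224.0 GB/s
print(mem_bandwidth_gbs(384, 7.0))  # GTX 980 Ti: 336.0 GB/s on GM200's wider bus
```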
US Government Gives Intel Two Supercomputer Wins
While Intel may have lost some serious income from being virtually shut out of China’s HPC market, the US government has recently handed the company two impressive supercomputer wins.
Uncle Sam Shocks Intel With a Ban on Xeon Supercomputers in China
Just as Intel’s (NASDAQ: INTC) CEO Brian Krzanich opened the regular staff meeting ahead of a dramatically reduced IDF2015 conference in Shenzhen, China, it is a good time to review how governments and enterprises don’t see eye to eye when it comes to strategic business. Remember the Tianhe-2 machine at the Guangzhou Supercomputer Center, the current world number one according to the Top 500 supercomputer list? Unlike some other Chinese supercomputers with their mixed architectures, Tianhe-2 is a fully Intel-based machine, the world’s largest assembly of Intel Xeon CPUs and Xeon Phi accelerators. Even after Intel ‘opened the kimono’ and gave a nearly 70% discount on its processors and accelerators, it
Might AMD Hold The Solution For Connecting CPUs and GPUs?
As GPUs get more powerful, a better solution to bridge the connectivity gap with the CPU is needed. Might AMD have the solution?
Qualcomm Announces 20nm Snapdragon 808 and Snapdragon 810 64-Bit Chips
Qualcomm has been fairly quiet about its high-end ambitions and what is expected to follow the soon-to-launch Snapdragon 805 chipset. The Snapdragon 805 is the chip that will likely ship in devices next quarter, and Qualcomm markets it as its 4K chip with the Adreno 420 GPU. Now, even though the Snapdragon 805 (APQ8084) is a very powerful chip, it lacks 64-bit capability and doesn’t have an integrated modem, requiring a separate modem like Qualcomm’s 20nm MDM9x35 to enable cellular capability. It also sports an improved Krait CPU, the Krait 450, compared to the Krait 400 in the Snapdragon 800 and 801. However, it
YouTube video shows OpenCL running on Nvidia GPU
It looks like OpenCL is getting ready for prime time. A reader from across the English Channel contacted us with a link to a YouTube video that showcases OpenCL being processed on a GPU. If I recall correctly, a while ago AMD claimed the world’s first OpenCL demo, but it was done on a single core (and then scaled up to all four) of a Phenom II X4 CPU. If this video is correct, Nvidia takes pole position as the first company to demonstrate OpenCL working on a GPU, which is “usage as intended”. Judging from the video, Nvidia showed an N-body simulation while changing the following parameters:
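For context, an N-body demo computes the gravitational pull of every body on every other body at each timestep – the classic all-pairs O(n²) workload that maps naturally onto GPUs. A minimal, purely illustrative Python sketch of one step (the actual demo ran this math in an OpenCL kernel):

```python
# Naive all-pairs N-body update: O(n^2) force accumulation per step.
# Illustrative of the algorithm class only; the demo itself ran this
# arithmetic inside an OpenCL kernel on the GPU.
def nbody_step(pos, vel, masses, dt, softening=1e-3):
    n = len(pos)
    for i in range(n):
        ax = ay = 0.0
        for j in range(n):
            if i == j:
                continue
            dx = pos[j][0] - pos[i][0]
            dy = pos[j][1] - pos[i][1]
            inv_r3 = (dx * dx + dy * dy + softening) ** -1.5
            ax += masses[j] * dx * inv_r3   # G folded into the masses
            ay += masses[j] * dy * inv_r3
        vel[i][0] += ax * dt
        vel[i][1] += ay * dt
    for i in range(n):                       # integrate positions
        pos[i][0] += vel[i][0] * dt
        pos[i][1] += vel[i][1] * dt
```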
Nvidia discloses its DP performance limitations
When Nvidia launched the GT200 chip, the company claimed around 1 TFLOPS of single-precision computing power and roughly 150 GFLOPS of double-precision performance. This discrepancy was mostly due to the fact that Nvidia went with dedicated hardware for DP support: every eight-shader cluster had one dedicated double-precision unit, costing millions of additional transistors and resulting in doubtful performance. Fast forward to January 2009, and we have SP performance at 933 GFLOPS, while achievable DP performance dipped to 78 GFLOPS. That figure is roughly half of what Nvidia boasted at the time of launch, and sheer evidence that both manufacturers like to overstate the performance
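Both of those figures fall straight out of GT200’s shader layout; a quick check, assuming the GTX 280’s published 1296 MHz shader clock and NVIDIA’s dual-issue MAD+MUL counting for the SP peak:

```python
# GT200 peak throughput from its shader layout. The 1.296 GHz shader
# clock and MAD+MUL dual-issue counting are the published GTX 280
# figures, assumed here to reproduce the article's numbers.
SP_UNITS = 240
DP_UNITS = SP_UNITS // 8          # one DP unit per eight-shader cluster
SHADER_CLOCK_GHZ = 1.296

sp_gflops = SP_UNITS * 3 * SHADER_CLOCK_GHZ   # MAD (2) + MUL (1) per cycle
dp_gflops = DP_UNITS * 2 * SHADER_CLOCK_GHZ   # one FMA (2 FLOPs) per cycle
print(f"SP: {sp_gflops:.0f} GFLOPS")  # ~933, matching the article
print(f"DP: {dp_gflops:.0f} GFLOPS")  # ~78, matching the article
```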
University of Illinois streams its Parallel@Illinois seminars
Expanding on its role as a CUDA Center of Excellence, the University of Illinois at Urbana-Champaign is launching a 13-week seminar with a focus on parallel computing – well, GPU computing, that is. Parallel@Illinois is the name of the university’s overall parallel computing effort, and this seminar was organized by Prof. Sanjay J. Patel and Wen-mei Hwu. Under the not-so-scientific moniker Need For Speed Seminar Series, this 13-week course will feature Illinois’ own faculty such as Mark Hasegawa-Johnson, Dan Roth, Narendra Ahuja, Stephen Boppart, John C. Hart, Tom Huang and Seth Hutchinson, and guests such as Keith Thulborn (UI Chicago), Sam Blackman (Elemental), Nikola Bozinovic (MotionDSP), Mark Johns (Tapulous) and