Nvidia touts a massive performance improvement in its latest Tesla GPU compute products, including an upcoming compute engine built on a 7.1-billion-transistor GPU.
Nvidia launched its latest line of Tesla GPU compute engines at the
company’s GPU Technology Conference in San Jose today. One model,
shipping immediately, is based on the existing GK104 chip used in the
recently released GTX 680.
Dubbed the Tesla K10, the board delivers as much as 4.6 teraflops of
single-precision floating point performance, roughly three times the
single-precision throughput of the older, Fermi-based Tesla. The card
also offers an aggregate memory bandwidth of 320GB per second. This board
is targeted at oil exploration, signal processing and seismic
processing applications.
The more intriguing announcement is the Tesla K20. Built on a monster chip with 7.1 billion transistors, the K20 isn’t slated for release until Q4. Nvidia’s CEO, Jen-Hsun Huang, noted that the K20 is the largest, most complex semiconductor chip ever built. It will likely use the same 28nm manufacturing process as the GTX 680. The K20 is designed for computationally intensive HPC environments, particularly Finite Element Analysis (FEA), finance and physics applications. It offers three times the double-precision floating point performance of previous-generation Tesla products. In addition to the huge transistor count, the K20 will sport a 384-bit memory interface.
New Features
In addition to improved compute performance, the K20 will support several key features to keep the chip busy when being fed compute chores. Hyper-Q increases the number of work queues from a single queue in the previous-generation Fermi chip to 32 work queues. This improves GPU utilization, keeping more of the compute cores humming when running parallel compute applications.
Dynamic Parallelism behaves like a kind of parallel branch predictor. When fed tasks, the K20 can keep track of dependent tasks and spawn new compute kernels to complete those tasks, rather than having to request more work from the CPU.
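For readers who want to see what these two features look like from the programmer's side, here is a minimal CUDA sketch. It is not Nvidia's demo code, and the kernel names, chunk sizes and stream count are hypothetical choices made for illustration. Each CUDA stream is an independent line of work that hardware with multiple work queues can service concurrently, and the parent kernel launches child kernels from the GPU itself instead of returning to the CPU for the next step. Whether the streams actually overlap depends on the hardware and driver.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// Child kernel: doubles every element of one chunk.
__global__ void scaleChunk(float *chunk, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) chunk[i] *= 2.0f;
}

// Parent kernel: each thread owns one chunk and launches a child grid for it
// directly from the GPU, with no round trip to the CPU.
__global__ void processChunks(float *data, int chunkSize, int numChunks) {
    int c = blockIdx.x * blockDim.x + threadIdx.x;
    if (c < numChunks) {
        int threads = 128;
        int blocks = (chunkSize + threads - 1) / threads;
        scaleChunk<<<blocks, threads>>>(data + c * chunkSize, chunkSize);
    }
}

int main() {
    const int kStreams = 4;       // hypothetical number of independent work streams
    const int kChunks = 8;        // hypothetical chunks per stream
    const int kChunkSize = 1024;  // hypothetical elements per chunk
    const int n = kChunks * kChunkSize;

    std::vector<float> host(n, 1.0f);
    cudaStream_t streams[kStreams];
    float *dev[kStreams];

    for (int s = 0; s < kStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&dev[s], n * sizeof(float));
        cudaMemcpy(dev[s], host.data(), n * sizeof(float), cudaMemcpyHostToDevice);
    }

    // One parent launch per stream. The streams are independent, so hardware
    // with several work queues is free to run them side by side.
    for (int s = 0; s < kStreams; ++s) {
        processChunks<<<1, kChunks, 0, streams[s]>>>(dev[s], kChunkSize, kChunks);
    }

    // A parent grid is not complete until its child grids finish, so this
    // single host-side wait covers all of the GPU-launched work.
    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(host.data(), dev[0], n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("first element after processing: %f\n", host[0]);

    for (int s = 0; s < kStreams; ++s) {
        cudaStreamDestroy(streams[s]);
        cudaFree(dev[s]);
    }
    return 0;
}
```

Launching kernels from device code requires a GPU of compute capability 3.5 or later and relocatable device code, so a build line would look something like `nvcc -rdc=true -arch=sm_35 example.cu -lcudadevrt`, with the `-arch` value adjusted to the target GPU and toolkit.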
Huang demonstrated a simulation of particles colliding, first starting with the last generation Fermi chip. That GPU could handle 20,000 bodies colliding in real time at high frame rates. Then he went on to demonstrate real-time modeling of the Andromeda and Milky Way galaxies colliding – not something we need to worry about for the time being, since it won’t happen for 3.8 billion years. That simulation ran on a Kepler-based Tesla, showing over 208,000 bodies colliding.
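Nvidia did not say which algorithm the demo used. As a rough illustration of why body count matters so much, the sketch below assumes the simplest approach, an all-pairs gravitational calculation in which every body feels every other body. Under that assumption the work per time step grows with the square of the body count: 20,000 bodies means about 400 million pair interactions per step, while 208,000 bodies means roughly 43 billion, about 108 times as many. The kernel name, the softening value, the random starting positions and the choice of G = 1 are all assumptions made for the example.

```cuda
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// One thread per body: sum the gravitational pull of every body on body i.
// pos[i].xyz is the position and pos[i].w is the mass; G is taken as 1.
__global__ void accelerations(const float4 *pos, float3 *acc, int n, float softening2) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    float4 pi = pos[i];
    float3 a = make_float3(0.0f, 0.0f, 0.0f);
    for (int j = 0; j < n; ++j) {
        float4 pj = pos[j];
        float dx = pj.x - pi.x;
        float dy = pj.y - pi.y;
        float dz = pj.z - pi.z;
        // The softening term keeps the distance non-zero, so the j == i case
        // contributes exactly zero instead of dividing by zero.
        float r2 = dx * dx + dy * dy + dz * dz + softening2;
        float invR = rsqrtf(r2);
        float s = pj.w * invR * invR * invR;  // mass / r^3
        a.x += dx * s;
        a.y += dy * s;
        a.z += dz * s;
    }
    acc[i] = a;
}

int main() {
    const int n = 20000;             // body count quoted for the Fermi demo
    const float softening2 = 1e-3f;  // assumed softening term

    // Assumed initial condition: unit-mass bodies at random points in a unit cube.
    std::vector<float4> host(n);
    for (int i = 0; i < n; ++i) {
        host[i] = make_float4(rand() / (float)RAND_MAX,
                              rand() / (float)RAND_MAX,
                              rand() / (float)RAND_MAX,
                              1.0f);
    }

    float4 *dPos;
    float3 *dAcc;
    cudaMalloc(&dPos, n * sizeof(float4));
    cudaMalloc(&dAcc, n * sizeof(float3));
    cudaMemcpy(dPos, host.data(), n * sizeof(float4), cudaMemcpyHostToDevice);

    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    accelerations<<<blocks, threads>>>(dPos, dAcc, n, softening2);

    cudaError_t err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    float3 first;
    cudaMemcpy(&first, dAcc, sizeof(float3), cudaMemcpyDeviceToHost);
    printf("acceleration of body 0: (%f, %f, %f)\n", first.x, first.y, first.z);

    cudaFree(dPos);
    cudaFree(dAcc);
    return 0;
}
```

A real-time demo would run a step like this every frame and then integrate velocities and positions. Production codes typically add shared-memory tiling or tree-based approximations to cut the cost, which this sketch leaves out.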
The GPU in the K20, code-named GK110, is expected to be used in the new Titan supercomputer being built at the Oak Ridge National Laboratory and in the Blue Waters system at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign.