CUDA - let the GPU take the strain
By Simon Bisson & Mary Branscombe in Editorial
Posted in Processors, Silicon, Applications, Business, Server on
The barracuda is the wolf of the sea, a slim silver dart that hunts in deadly packs. It’s perhaps not surprising that NVIDIA has taken part of its name for its GPU-based supercomputing tools.
On a recent trip to the US, Mary and I met up with some of the folk behind CUDA at NVIDIA’s Santa Clara headquarters. It was a fascinating conversation - if only because I used to write scientific computing software, and something like CUDA would have sped up my work massively. When a problem takes days to solve, using something like CUDA to accelerate processing makes a lot of sense.
Prior to CUDA, NVIDIA had tried to use GPUs for compute, but had run into architectural problems. Things changed with its 8-series GPU, which was very different to anything it had built before, being designed for compute as well as graphics. That’s led to some tradeoffs - there’s silicon on the GPU that sits unused when it’s working as an accelerator (and vice versa). However, NVIDIA makes so many chips that there’s not really any financial issue; economies of scale cover the cost.
CUDA is more than just a set of chips - it’s a language framework for working with GPUs that can handle both sequential and parallel code together. Developers don’t need to learn anything new, and the framework gives programmers explicit - and simple - interfaces for running parallel code on NVIDIA’s GPUs. There is a long-term goal of providing tools for automating parallelism, but at this point you still need to work out for yourself which code can be parallelised. The result is code that’s very simple, and there’s much less of it, as CUDA handles repetitive calculations for you.
Simplicity comes from the hardware as well, as it manages threads for you. All you need to do is define the tasks the GPU will handle, and manage their interactions. The GPU then runs the calculations over the data, with groups of processors working on different functions at the same time. As RAM is directly attached to the GPU, there’s no need to use the PC’s own memory for caching data.
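To give a flavour of what that looks like in practice, here’s a minimal sketch of the model described above. It isn’t code from NVIDIA or from our conversation; the names `saxpy` and `saxpy_on_gpu` are my own illustrative choices, and the block size of 256 threads is an assumption rather than a tuned value. The kernel defines the task for a single element, the launch tells the GPU how many threads to run, and the hardware takes care of scheduling them. The data is copied into the GPU’s own memory before the kernel runs.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

// The parallel part: each GPU thread updates exactly one element of y.
// The hardware decides how the threads are scheduled across its processors.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // this thread's element
    if (i < n) {                                    // guard the final, partial block
        y[i] = a * x[i] + y[i];
    }
}

// The sequential part (hypothetical helper): copy the data into the GPU's
// own memory, launch the kernel, and copy the result back.
// Returns false if the sizes differ or any CUDA call fails.
bool saxpy_on_gpu(float a, const std::vector<float> &x, std::vector<float> &y)
{
    if (x.size() != y.size()) return false;
    const int n = static_cast<int>(x.size());
    if (n == 0) return true;
    const size_t bytes = n * sizeof(float);

    float *d_x = nullptr;
    float *d_y = nullptr;
    if (cudaMalloc(&d_x, bytes) != cudaSuccess) return false;
    if (cudaMalloc(&d_y, bytes) != cudaSuccess) { cudaFree(d_x); return false; }

    cudaError_t err = cudaMemcpy(d_x, x.data(), bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess)
        err = cudaMemcpy(d_y, y.data(), bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        const int threads = 256;                          // assumed block size
        const int blocks = (n + threads - 1) / threads;   // round up to cover n
        saxpy<<<blocks, threads>>>(n, a, d_x, d_y);
        err = cudaGetLastError();                         // catch launch errors
    }
    if (err == cudaSuccess)
        err = cudaMemcpy(y.data(), d_y, bytes, cudaMemcpyDeviceToHost);

    cudaFree(d_x);
    cudaFree(d_y);
    return err == cudaSuccess;
}

int main()
{
    const int n = 1 << 20;
    std::vector<float> x(n, 1.0f), y(n, 2.0f);

    if (!saxpy_on_gpu(3.0f, x, y)) {
        fprintf(stderr, "GPU run failed (is a CUDA device available?)\n");
        return 1;
    }
    // Every element should now hold 3 * 1 + 2.
    printf("y[0] = %.1f, y[n-1] = %.1f\n", y[0], y[n - 1]);
    return 0;
}
```

Notice that there’s no loop over the data and no thread management anywhere in the kernel: you still have to decide which work is parallel, but once you have, the per-element code is all you write.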
The numbers coming out of CUDA are impressive. Working with the VMD/NAMD molecular dynamics tools, researchers at the University of Illinois have seen a 240X speed-up in the VMD ion placement tool, and an 8 to 12X speed-up in NAMD. With an eye on greener computing, they’re also finding that CUDA gives them 1W/Gflop!
If you want this sort of power for your applications (and it’s remarkably suitable for large financial applications) you can buy NVIDIA’s Tesla systems. There are workstation versions, along with deskside offload processors. However, the version we were most impressed with comes as a 1U rack mount unit, containing 4 GPUs. Connected to a PC or a server via 5 Gbps PCI-Express connections, this is the way to give your data centre applications a significant speed-up, with significantly lower power requirements.
While Tesla may not yet meet NVIDIA’s aim of providing a Teraflop in a 1U unit, it certainly speeds things up. Oxford University researchers have used it to get a 149X speed-up in LIBOR risk analysis, for an 89X improvement in performance/Watt. That’s a good deal in anyone’s book - especially if you’re working with today’s fractious financial markets.
Add one to my list for the IT Santa!
–Simon


