CPU vs GPU, mythbusted or mythdirected?
By Simon Bisson & Mary Branscombe in Editorial
Posted in visualisation, Processors, Silicon on
The folk from Mythbusters were on hand at NVision08 to show the audience the difference between CPU and GPU computing. In true Mythbusters fashion they did it with vast amounts of paint, and what must have been one of the world’s largest paintball guns.
They began with a simple (for them) demonstration of serial operations - using a paintball-gun-wielding robot to draw a smiley face on a whiteboard. A hundred or so blue dots made the robot one of the slowest (and loudest) dot matrix printers we’ve seen.
Parallel operations would take something a little larger, and their 1,100-paintball inkjet printer filled much of the stage. Powered up, it would create a picture of the Mona Lisa in glorious 8-bit colour in a fraction of a second. Huge air tanks held the compressed air the device needed to simultaneously launch all the paintballs in all the tubes.
The demonstration was certainly impressive, but it was more than a little misleading.
The type of data-centric work that CUDA GPUs handle is more about using parallel processes to handle lots of small pieces of data, not about building complex images from small pieces of data. With a parallel architecture like that you develop algorithms that break down big problems and big data sets into smaller, easier to work with, pieces. Farmed out across tens and hundreds of processors in a GPU, each data block can be processed, before being reassembled and the results delivered.
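As a rough illustration of that split, process and reassemble pattern, here is a minimal Python sketch. It uses a CPU thread pool as a stand-in for the GPU's many processors, so it shows the shape of the algorithm rather than any real speed-up. The function names (`split_into_blocks`, `process_block`, `parallel_sum_of_squares`), the block size and the sum-of-squares workload are all hypothetical, chosen only for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(data, block_size):
    """Break one big data set into smaller, independent blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def process_block(block):
    """Work done on a single block; here, a sum of squares (illustrative only)."""
    return sum(x * x for x in block)


def parallel_sum_of_squares(data, block_size=4, workers=8):
    """Farm the blocks out to workers, then reassemble the partial results."""
    blocks = split_into_blocks(data, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_block, blocks))
    return sum(partials)


if __name__ == "__main__":
    values = list(range(10))
    print(parallel_sum_of_squares(values))
```

The point to notice is that `process_block` never needs to know about any other block. That independence is what lets the work spread across many processors.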
They’re not new techniques, either: the approach is at the heart of computational fluid dynamics and finite element analysis. The parallel techniques used in GPU computing are certainly impressive, and are already delivering supercomputing to the desktops of the scientists and engineers who need the power (an NVision session on using GPU-based supercomputers to model the plasma dynamics around neutron stars and the black hole at the centre of the galaxy was particularly impressive). Low-cost high-performance computing is the GPU’s strength, especially when compared to the hefty power requirements of an equivalent array of traditional CPUs.
The Mythbusters’ demonstration was good (and an enjoyable piece of theatre), but it really told a different story. So how could the intrepid special effects team have told the real story of GPU computing?
How about one robot carrying a large, heavy cube across the stage? Suddenly it’s joined and overtaken by a swarm of smaller machines, all carrying smaller cubes - cubes that together weigh as much as the single cube on the struggling robot. Or if paint is the preferred metaphor, a can of paint slowly emptying through a single pipe. Meanwhile another can empties through hundreds of holes in much less time.
So, how would you demonstrate it?
–S
Let’s get physical
By Simon Bisson & Mary Branscombe in Editorial
Posted in Processors, Silicon, Software on
Nvidia has decided that the visual computing world needs a conference, and has taken over San Jose to deliver just that. It’s an odd event, with a high-level academic parallel processing track running alongside highly analytical business sessions - and what’s billed as the world’s largest LAN party filling one of the conference halls.
Games may have made Nvidia, but it’s the rest of the graphics industry that keeps it going. Simulation and CAD drive much of today’s industrial design, while complex financial calculations can be run on GPU-powered parallel processors. It’s not just black hole plasma dynamics - it’s also the models that help calculate how a fusion reactor will operate. According to Nvidia, GPU computing is bringing supercomputing to the desks of the people who need it the most - for just the cost of a video card.
One of the keynotes showcased a NASCAR simulator used by drivers to hone their skills. On stage we heard a populist story of what it was like to be a driver, and what it was like to use simulation tools. Off stage we heard a more interesting story about how the simulator developers were looking at using the latest generation of GPUs in their application. The ability to use a GPU for parallel processing - and the availability of powerful hardware physics engines - has made them completely rethink their next generation, as the new hardware features mean that they can now work on making the simulation more realistic.
That’s what the drivers want. Asked what he really wanted from a simulator, Kyle Busch didn’t talk about new high-resolution graphics or realtime ray tracing. What he wanted was more accurate physical behaviours. In the real world passing on the left is different from passing on the right, while slipstreaming another car can change the performance dramatically. A simulation may look real, but without the physics it’s not realistic at all.
One plan for the next generation is to move away from the current car model, which has only six degrees of freedom. Instead, it really needs 72 degrees, for all the hinge and flex points - all of which are changing dynamically. That’s where parallel processing comes in, as it allows a car to be modelled in real time, taking advantage of physics engines to turn those model calculations into real world behaviours. Improving the simulation will mean more (and happier) customers - as well as a continually improving model that can be shared with vehicle manufacturers.
It’s an approach that requires specialist processing that goes beyond the traditional CPU. Don’t confuse it with the death of the CPU, though. There will always be a place for the traditional CPU - it’s just that silicon technology has become ubiquitous enough for specialist hardware to offload processor intensive functions.
Need to encrypt something? Just use the hardware cryptosystem built into a TPM. Need to do thread intensive Java? Hook up an Azul network processing appliance. Need to do complex vector calculations on large amounts of data? Use a GPU. Nvidia’s CEO Jen-Hsun Huang talks about it as heterogeneous computing, where the CPU handles general purpose tasks, and more specialised hardware handles the complex tasks that tax general purpose silicon.
Intel and AMD may still say that general purpose processors are just what the world needs - but they’re still investing in HyperTransport and QuickPath, the fast buses that specialised silicon needs. I wonder why they’re doing that, if specialised silicon is the dead end they say it is. Is there something about Moore’s Law they’re not telling us?
IDF: stress testing SSD – and user frustration
By Simon Bisson & Mary Branscombe in Editorial
Posted in Silicon, Storage, Hardware, Laptop, Intel on
Battery life? Performance? No, the important test Intel’s new SSD passes is known internally as P*ssmark…
That’s the nickname for the way Intel tests how much of a difference SSD makes to user experience. It’s not just about extra battery life, although I’d like the expected 14 hours of battery life I could get from the HP 2730p, the next version of my tablet, with an SSD and the thin slab battery I already get 8 or 9 hours from.
The improved performance isn’t just for looking good in benchmarks or running video editing apps most people don’t use, it’s for stopping you sitting at the screen hollering “what are you doing!” as the hard drive light flashes on and off and Outlook sits there staring blankly. Not that it’s always Outlook; Acrobat is pretty good at sitting on its thumb, as are plenty of other applications. And as notebooks get smaller and lighter and 5400rpm is seen as something to aspire to, you can be left waiting far too often.
According to Intel’s cheekily named and possibly unscientific internal benchmark, you’ll be gnashing your teeth ten times less with an Intel SSD than a hard drive. They worked this out by asking a group of Intel employees to mark on a log sheet how often they got fed up enough with their computer to remember that they were keeping score. After two weeks they swapped them over to SSDs. And then after another two weeks, they made them go back to hard drives instead, still ticking to show their frustration.
That frustration - and the tick marks - went down significantly, Intel’s Principal Engineer for NAND Stephen Wells told me. “Not to zero; I’d still get annoyed if Windows blue-screened or something,” he said. But ten times less frustration was very noticeable. “And oh, the moaning and whining you got when we made people go back to the hard drive. I know - I was one of them. Do you want to get rid of your mouse? No. Do you want to go back to DOS? No. In a few years will you want to get rid of your SSD? Absolutely not.”
Not only is flash faster than a hard drive, it’s more consistent. The 34 seconds it took to run through the photo and video tasks in one of Intel’s benchmarks always came out somewhere between 30 and 35 seconds, no matter how often the team ran it. But with the 5400rpm hard drive, Intel’s Chris Saleski told me the day before, the results were anywhere from one and a half minutes to two and a half minutes.
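To put rough numbers on that consistency, here is a small Python sketch that compares the two quoted ranges. Taking the midpoint of each range as a "typical" run is my own simplifying assumption, not something Intel reported, and the `spread` helper is a hypothetical name used only for this example.

```python
def spread(low, high):
    """Return (midpoint, relative spread) for a range of run times in seconds."""
    midpoint = (low + high) / 2
    return midpoint, (high - low) / midpoint


ssd_mid, ssd_spread = spread(30, 35)    # SSD runs: 30 to 35 seconds
hdd_mid, hdd_spread = spread(90, 150)   # hard drive runs: 1.5 to 2.5 minutes

print(f"SSD: about {ssd_mid:.1f}s, spread {ssd_spread:.0%}")
print(f"HDD: about {hdd_mid:.1f}s, spread {hdd_spread:.0%}")
print(f"HDD/SSD midpoint ratio: {hdd_mid / ssd_mid:.1f}x")
```

On those assumptions the SSD results vary by about 15% around their midpoint and the hard drive results by 50%, with the hard drive's midpoint nearly four times slower.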
Wells puts that down to the fact that data can be scattered anywhere around the disk and there’s an unknown latency in getting to it and getting it back that you don’t see with SSD - and he expects that to mean a more deterministic battery life with SSD as well. That way, when Windows says you have an hour of battery life left, you won’t find the machine hibernating to save your data fifteen minutes later. And that’s another thing I’d be ticking the frustration mark for…
-Mary
You say Express Gate, I say Palladium
By Simon Bisson & Mary Branscombe in Editorial
Posted in Futures, Silicon, virtualisation, Hardware, Laptop, Mobile, Security, Intel, Microsoft on
Imagine a second, simpler operating system on your PC with fixed features, so it’s more secure - after all, if you can’t add more programs you can’t add a virus either. It would have to start up quickly, so that Windows wasn’t waiting for it, so it would be ideal for listening to music and watching video. I’m not thinking about virtualization per se, although that’s one way to achieve something similar; this is two operating systems side by side, both with access to the PC hardware, but one of them does much more limited and circumscribed things.
Can you tell what it is yet?
No, actually, I’m not talking about Palladium - sorry, Microsoft Next Generation Secure Computing Base. That grew out of an attempt to reassure Sony that it would be OK to allow DVD movies to play on a PC without piracy becoming endemic and turned into a much more useful and visionary idea about using public key cryptography not to identify people but to secure machines. It would have been a good way to implement the DRM it was associated with in the public eye, though it wouldn’t have forced it on anyone who didn’t want to run it. Palladium loaded a secure piece of software called the TOR that acted as a secure area that could only run trusted code (written to public APIs), where the apps would be invisible to the main OS - all secured by the machine-specific key in your TPM and some new technology from Intel.
Ironically, trust was the issue with Palladium; nobody trusted Microsoft either to be building a secure system that didn’t impinge on a very robust interpretation of free speech or, if it was, to do it right. The smallest part of the concept made it into a couple of versions of Vista as BitLocker: whole disk encryption secured by the TPM.
But the Palladium concepts are showing up in a lot of other places, including the NSA’s Security Enhanced Linux and Citrix’s Security Enhanced Xen - a small OS that runs as a secure virtual machine with isolated applications, using the TPM and Intel’s new hardware virtualization technology …
Intel even uses the words Trusted Computing Base, which might be a hostage to fortune given the fate of Palladium. The DRM discussion hasn’t started yet, but there’s a trusted channel to the keyboard, mouse, memory - and the graphics subsystem, which is what some thought would allow copy-protected DVDs to be watched in the secure area of Palladium, without the option to copy them. This time around it’s more likely to be copy-protected downloads: killing off HD DVD has actually made Blu-Ray less likely to get mass adoption, as player and disc prices stay high.
There are far more benefits to Palladium-style secure computing than protecting the movie industry or saving the banking industry from having to upgrade anti-fraud backends. You may keep your AV up to date and your company documents secure, but one in six of all PCs that touch the Google site has a bot and they’re all sending you spam.
And while the systems that look so much like Palladium that I get déjà vu are still a little way off, Asus is already selling machines with Express Gate. Granted, this is more like the embedded operating systems you see on a lot of media notebooks; it boots up in eight seconds and lets you see your photos and play your music. It has an Internet connection, so you can browse the Web without waiting for Windows. But it also uses the TPM in Montevina and you can treat it as an isolated operating system, says the press release: “Friends and family can use your notebook to nip online, use IM, listen to music, play and view without having access to your data, the system or the Windows environment.” Very Palladian.
-Mary
Intel predicts an all IA future, consigns CUDA to the footnotes
By Simon Bisson & Mary Branscombe in Editorial
Posted in Silicon, Futures, Intel, Server on
With Intel’s 40th birthday on the horizon (and with it the 40th anniversary of the microprocessor), Intel’s Pat Gelsinger took a few minutes yesterday to ruminate on the past, present and future - and to take a few questions.
Beginning with a look back to the i386, and the shift from 16 to 32-bit computing, Gelsinger pointed to a time of technical and industry transition, much like today. It was the point where Compaq moved ahead of IBM, and Windows and Microsoft began to shape the software industry. We’re in the middle of another shift at the moment, what Gelsinger called the “third era of Moore’s Law”.
The first era was the age of invention, with the second concentrating on scale and manufacturing. Gelsinger calls the third era “The right hand turn”, where the industry starts to concentrate on energy efficiency. He went on to describe the industry’s success as resulting from “the power of compatibility”, where compatible software means that each generation of silicon can inherit the work of the entire industry (with just a little recompile along the way). There have been plenty of changes in microprocessor design, driven purely by increasing numbers of transistors - the power controller on Intel’s Nehalem processors is bigger than Gelsinger’s first processor. There’s a sheer complexity to these machines, which Gelsinger described as “the most advanced things ever built”.
That’s the past and today, so what about tomorrow? Intel reckons on having 10 years of visibility into the future of silicon. Gelsinger described silicon as “the scaffolding for half the periodic table”. The future will be much the same, even if it’s based on silicon nanowires and spintronics. The first big change will be in just a couple of years, with the shift to 450mm wafers. The investment this requires will be huge, and Intel expects this to trigger a wave of industry consolidations - just to help pay for the new fabs.
Gelsinger also sees Intel’s IA architecture as a key differentiator between it and the rest of the industry. As multicore systems become more and more common, and as IA scales up to teraflop terascale systems and down to milliwatts, software will be compatible between all the different versions of the architecture. There will of course be different languages and libraries (especially for parallel processing systems), but code will be portable.
The result will be what Gelsinger calls an “AE724” world. Bill Gates’ vision was a computer on every desk and in every home; Intel’s is much more ambitious. It’s a world where everyone has access to the Internet, with computing embedded into the environment and the infrastructure - everywhere you can imagine. It’s certainly a big picture - and one that will mean a shift in the way we develop applications and in how we design networks and data centres.
We blogged about GPU-based computing last week, and Gelsinger was asked about Intel’s response to NVIDIA’s CUDA and AMD’s CTM. Describing CUDA as “an interesting footnote in the history of computing”, Gelsinger talked about GPU computing as a cool idea that required a new programming model. He felt that this would be hard to deal with compared to general purpose computing techniques, and suggested that Intel’s massively multicore Larrabee would be the right answer in the long term.
It’s true the microprocessor and the software stack make a huge difference. I probably wouldn’t have dialed in to the conference call if Skype didn’t connect to US 1-800 numbers for free from anywhere in the world. Whether the future’s all Intel is another question. IA is an important architecture but there’s still space for low power alternatives like ARM, or for specialised co-processors from the likes of Toshiba, Azul, AMD and NVIDIA. General purpose silicon is just one way of working - and if you’re prepared to target a specific niche there’s still plenty of scope to make a very healthy profit with specialised silicon.
–Simon
More battery life, fewer explosions
By Simon Bisson & Mary Branscombe in Editorial
Posted in Futures, Silicon, Toys & gadgets, Hardware, Laptop, Mobile on
No battery ever lasts long enough. The extended battery on the HP 2710 tablets Simon and I carry gives us a full day of work: nine to ten hours, or less if we turn on Wi-Fi. I’ve been typing since 8am and online a few times and it’s now 1pm and I have four hours left. That’s just about acceptable, but it’s never enough - I’m wondering where the nearest power socket is. Two technologies we saw at the Future in Review conference this week could produce much longer battery life - if they ever make it to market.
Lithium ion batteries work by packing as much lithium as possible into the positive and negative electrodes inside the battery and then moving ions from them, through the electrolyte fluid and out to your device. The more lithium you can get into the electrode, the more ions you can get out of it. That’s how Yi Cui of Stanford is hoping to get a battery that lasts ten times longer. He’s replacing the usual copper electrodes with silicon, which can store ten times as many lithium ions.
That’s not news; we’ve known for 30 years that silicon stores more lithium, but it also swells up more than copper because of that - and when it swells up, the electrode breaks. Yi Cui’s breakthrough was using silicon nanowires that are much more supple; each wire is only 100 nanometres wide, but they’re very long. Silicon is also more stable than copper, so increasing the energy density doesn’t make it more likely for batteries to explode the way it does with current batteries. It doesn’t make it hotter either, because it’s the internal resistance of the battery that causes the heat, not the capacity.
Ten times as many lithium ions doesn’t mean ten times the battery life; by the time you add in the rest of the battery system, including the electrolytes and the packaging around it all, and some further developments that are still under wraps, you could get double the battery life of lithium ion today.
Startup Seeo is starting with the other half of the battery, replacing the electrolyte fluid with a plastic film that’s very like the polymers used to make motorcycle helmets. For one thing that means it’s much safer - no matter how hot the battery gets it won’t catch fire. But it also works with other battery chemistries than lithium; according to Seeo, some of the lithium replacements they’re looking at could give you 50 to 70 times the energy density of lithium, so you get a choice between smaller devices or longer battery life in the same size we lug around today.
We’ve seen a lot of new battery technologies over the years and few of them have made it to market. One promising zinc battery might finally show up in notebook PCs this year, maybe, possibly - four years after I first saw it running a laptop. It’s not just that the chemistry might turn out not to work as well as it did in the lab. At the moment you can only charge a silicon lithium battery 100 times before it won’t charge enough to be worth using; that has to go up to 500 times before you’d think about putting it in a mobile phone you’d keep for two years and more like 1,000 for a notebook. Both Seeo and Yi Cui are aiming to charge as quickly as lithium ion, but they’re not there yet - silicon lithium batteries could take an hour to charge.
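A back-of-the-envelope way to see why those cycle counts matter is sketched below in Python. It assumes (my assumption, not the researchers') that a battery is finished once it has used up its rated number of full charges, and the `days_between_charges` helper is a hypothetical name for this example only.

```python
def days_between_charges(rated_cycles, service_years):
    """Average days between full charges if the battery must last service_years."""
    return service_years * 365 / rated_cycles


phone_years = 2  # the two-year phone lifetime mentioned in the text

for cycles in (100, 500):
    gap = days_between_charges(cycles, phone_years)
    print(f"{cycles} cycles over {phone_years} years: "
          f"one full charge every {gap:.1f} days")
```

With 100 cycles you could only afford a full charge about once a week over two years. At 500 cycles that becomes roughly every day and a half, which is much closer to how people actually charge a phone.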
And hardware manufacturers have to see enough of a demand to change the power supply and charging system in a laptop or phone. Seeo’s lithium battery might fit into an existing device but that’s more about safety than longer battery life; a different chemistry will need a different charger. Silicon lithium batteries run at a slightly different wattage and the value that tells the system the battery is fully charged and doesn’t need more power is also different.
So are these new technologies going to languish the way others have? Maybe not. For one thing, people will pay more for longer battery life, so manufacturers have an incentive to switch. And for another, with the price of oil and petrol still rising, electric cars are looking more likely and both these technologies promise to scale up enough to power cars. When you can do that, a smaller battery for a phone or a PC almost comes for free.
-Mary
CUDA - let the GPU take the strain
By Simon Bisson & Mary Branscombe in Editorial
Posted in Processors, Silicon, Applications, Business, Server on
The barracuda is the wolf of the sea, a slim silver dart that hunts in deadly packs. It’s perhaps not surprising that NVIDIA has taken part of its name for its GPU-based supercomputing tools.
On a recent trip to the US, Mary and I met up with some of the folk behind CUDA at NVIDIA’s Sunnyvale headquarters. It was a fascinating conversation - if only because I used to write scientific computing software, and something like CUDA would have sped up my work massively. When a problem takes days to solve, using something like CUDA to accelerate processing makes a lot of sense.
Prior to CUDA, NVIDIA had tried to use GPUs for compute, but had run into architectural problems. Things changed with their series 8 GPU, which was very different to anything they’d built before, being designed for compute as well as graphics. That’s led to some tradeoffs - there’s silicon on the GPUs that’s unused when it’s used as an accelerator (and vice versa). However, NVIDIA makes so many chips that there’s not really any financial issue; it all comes out of the economies of scale.
CUDA is more than just a set of chips - it’s a language framework for working with GPUs that can handle both sequential and parallel code together. Developers don’t need to learn anything new, and the framework gives programmers explicit - and simple - interfaces for running parallel code on NVIDIA’s GPUs. There is a long-term goal of providing tools for automating parallelism, but at this point you still need to work out for yourself what code can be parallelised. The result is simpler programs with much less code, as CUDA handles repetitive calculations for you.
Simplicity comes from the hardware as well, as it manages threads for you. All you need to do is define the tasks the GPU will handle, and manage their interactions. The GPU then runs the calculations over the data, with groups of processors on different functions at the same time. As RAM is directly attached to the GPU there’s no need to use the PC’s own memory for caching data.
The numbers coming out of CUDA are impressive. Working with the VMD/NAMD molecular dynamics tools, researchers at the University of Illinois have seen a 240X speed-up in the VMD ion placement tool, and an 8 to 12X speed-up in NAMD. With an eye on greener computing, they’re also finding that CUDA gives them 1W/Gflop!
If you want this sort of power for your applications (and it’s remarkably suitable for large financial applications) you can buy NVIDIA’s Tesla systems. There are workstation versions, along with deskside offload processors. However, the version we were most impressed with comes as a 1U rack mount unit containing 4 GPUs. Connected to a PC or a server via 5 Gbps PCI-Express connections, this is the way to give your data centre applications a significant speed-up, with significantly lower power requirements.
While Tesla may not yet meet NVIDIA’s aim of providing a Teraflop in a 1U unit, it certainly speeds things up. Oxford University researchers have used it to get a 149X speed-up in LIBOR risk analysis, for an 89X improvement in performance/Watt. That’s a good deal in anyone’s book - especially if you’re working with today’s fractious financial markets.
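Those two Oxford figures can be combined to see what they imply about power draw. The Python sketch below assumes both numbers compare the same workload against the same baseline system, which the quoted figures don't spell out, so treat the result as indicative only.

```python
speedup = 149        # LIBOR risk analysis speed-up quoted above
perf_per_watt = 89   # improvement in performance per watt quoted above

# Performance-per-watt gain = speed-up / power ratio,
# so the implied power ratio (GPU system vs baseline) is:
power_ratio = speedup / perf_per_watt

# Energy per run = power x time, so it scales as power_ratio / speedup.
energy_ratio = power_ratio / speedup

print(f"Implied power draw: {power_ratio:.2f}x the baseline")
print(f"Implied energy per run: 1/{1 / energy_ratio:.0f} of the baseline")
```

In other words, on these assumptions the GPU setup draws roughly 1.7 times the power of the baseline, but finishes each run so much sooner that it uses about 1/89th of the energy.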
Add one to my list for the IT Santa!
–Simon