How Google is redesigning your data centre
Google famously builds its own servers out of commodity PCs and custom-designed power supplies - could its strategy be right for your data centre too?
Google has built a substantial infrastructure at a very low cost. By some estimates it has nearly half a million servers around the world, built with low-price, consumer-grade PC motherboards rather than reliable, server-grade, full-price hardware and running software that compensates for hardware outages.
In 2006, Douglas Merrill, the vice president of engineering and senior director of information systems at Google, claimed "because of the price-performance trade-off, under current market conditions I can get about a 1,000-fold computer power increase at about 33 times lower cost if I go to the failure-prone infrastructure."
But Google pays in other ways. Power supplies for consumer-grade PCs have multiple power outputs for the different components in the system; the graphics card and optical drive expect different voltages, so you'll have cables offering 12 volts, 5 volts, 3.3 volts and so on. Because of the conversions needed for the different voltages, only 55-70 per cent of the electricity that goes into the power supply makes it out the other side, so Google commissions its own 90 per cent efficient power supplies with a single 12-volt power rail.
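The saving is easy to sketch. A minimal calculation, using illustrative figures rather than Google's actual ones, for a server whose components draw 300 watts:

```python
# Illustrative PSU comparison - the wattage and efficiency figures are
# assumptions for the sketch, not Google's actual numbers.
COMPONENT_DRAW_W = 300.0  # power the server's components actually consume

def wall_power(draw_w, efficiency):
    """Power drawn from the wall by a PSU of the given efficiency."""
    return draw_w / efficiency

multi_rail = wall_power(COMPONENT_DRAW_W, 0.60)   # typical multi-voltage PSU
single_rail = wall_power(COMPONENT_DRAW_W, 0.90)  # 90% efficient 12 V supply

saving_w = multi_rail - single_rail
print(f"multi-rail: {multi_rail:.0f} W from the wall")
print(f"single-rail: {single_rail:.0f} W from the wall")
print(f"saving: {saving_w:.0f} W per server ({saving_w / multi_rail:.0%})")
```

At 60 per cent efficiency the wall draw is 500 watts; at 90 per cent it falls to around 333 watts - a third of the power bill per server, before cooling savings are counted.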
But you could just buy an enterprise-grade server and get the same thing, says Greg Huff, chief technology officer of the Industry Standard Server division at HP. "All our servers have had a 12-volt single-output supply since 1999. We started work on a 90 per cent efficiency power supply two years ago and it started shipping last year. Switching on and off to keep it in the most efficient range - we do that in the p-Class range already." And he doesn't give Google much credit for industry interest in 12 volt. "Google is driving a marketing message. The interest we're getting is because I can't have another watt, another BTU in my data centre and I can't meet the technology component of my business problem because of this. Nobody is interested in this because Google made it sexy - they're interested because it solves their problem."
Last year Google researcher Luiz André Barroso and senior vice president of operations Urs Hölzle asked the industry to introduce low power modes in network equipment, memory and hard drives - like the low power modes in CPUs that reduce the frequency when the system is less busy. Because other components run at full power no matter how loaded the system is, Hölzle calculates that a server with 10-50 per cent utilisation may manage as little as 20 per cent efficiency. And unless you're using virtualisation, your servers are typically 10-15 per cent used.
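That calculation can be sketched with a toy model: assume, purely for illustration, that half of a server's peak power is drawn even at idle while the other half scales with load, and that useful work scales linearly with utilisation:

```python
# Toy model of server energy efficiency vs utilisation.
# The 50% fixed idle draw is an assumption for the sketch, not a
# figure from Barroso and Hoelzle's paper.
FIXED_FRACTION = 0.5  # share of peak power drawn even when idle

def efficiency(utilisation):
    """Useful work per watt, relative to a fully loaded server.

    Work scales linearly with utilisation; power is a fixed idle
    component plus a component that scales with load.
    """
    power = FIXED_FRACTION + (1 - FIXED_FRACTION) * utilisation
    return utilisation / power

for u in (0.10, 0.15, 0.50, 1.00):
    print(f"utilisation {u:.0%}: relative efficiency {efficiency(u):.0%}")
```

Under these assumptions a server at 10 per cent utilisation delivers only about 18 per cent of the work-per-watt of a fully loaded one - the same shape of result Hölzle describes.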
Almost every business would benefit from low power modes in components, but it will take a lot of work by manufacturers to deliver. In the meantime, D-Link is offering network switches that reduce the power to ports that have less than the full length of Ethernet cable plugged in, although so far the range only goes up to a 24-port high-performance rack-mountable switch.
A special problem
Google is tremendously efficient at dealing with data. According to Matt Glotzbach, the director of Google enterprise products, YouTube accepts, transcodes and indexes seven hours of video content a minute. Every time you search the Web or read Gmail, the site has to index and profile pages and emails to match them with all relevant ads in the Google Adwords database.
Google is also far ahead of enterprises in thinking about power and performance issues; even the largest companies don't often understand their full energy footprint, says Curt Belusar, director of research and development for scalable data centre infrastructure at HP. "At Google they know what goes into a server to get so many transactions out, they've thought about the room. Many companies say 'a watt of power saved will save me so many dollars over a three-year time period'. At Google, they have the lifecycle of the watt: 'you're seven watts more for the same number of transactions and that costs me so many dollars'. Google is beyond understanding and into optimising."
But Google runs only a handful of applications on hundreds of thousands of servers, and they're all very similar and extremely parallelised. Google has its own distributed file system, its own middleware and its own grid computing platform, and it knows every application it runs inside and out, along with what it takes to make each perform as well as possible. It runs a few standard applications like Oracle Financials, but even project tracking is done by indexing emails rather than using an off-the-shelf structured project management tool. Few other companies beyond Web 2.0 and SaaS (Software-as-a-Service) providers have so restricted a mix of applications or so deep a knowledge of how their applications perform.
And while the Google applications are very reliable and constantly available, you can't say the same for their hardware. That's a deliberate choice; Google doesn't use redundant power feeds or power supplies, which would reduce power efficiency. The software is written to route around failure, and when you have 3,000 servers in a single data centre, losing one of them is almost trivial. Few enterprises run applications across multiple servers, and when they do, it's on a relatively small number of machines, so losing a single server is significant. For the enterprise, virtualising workloads can be a better approach than relying on a mass of unreliable servers for redundancy.
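The arithmetic behind that difference is simple. A minimal sketch, with assumed fleet sizes chosen to contrast a small enterprise cluster with a Google-scale room:

```python
# Fraction of capacity lost when one server fails, for different fleet
# sizes. The fleet sizes are illustrative assumptions, except 3,000,
# which is the per-data-centre figure cited above.
def capacity_lost(fleet_size, failed=1):
    """Fraction of total capacity lost when `failed` servers go down."""
    return failed / fleet_size

for fleet in (4, 40, 3000):
    print(f"{fleet} servers: one failure costs "
          f"{capacity_lost(fleet):.2%} of capacity")
```

One failure in a four-server enterprise cluster removes 25 per cent of capacity; one failure in 3,000 removes 0.03 per cent, which is why Google can afford to treat individual machines as disposable and let software absorb the loss.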