How to get high performance computing from the cloud


Traditionally, the users of high performance computing (HPC) have been carrying out hideously complicated academic exercises involving petabytes of human genome data or high-fidelity 3D simulations of nuclear warheads.

As such, their computing capacity was ‘procured’ (a process involving purchasing agreements, bureaucratic chains of command and delays when Dave in Bought Ledger gets something wrong just before he goes on holiday). But what if you want to buy that capacity instantly on the cloud?

Accessing HPC in the cloud, on a pay-as-you-go, per-petaflop basis, is an attractive proposition. But if you’re in a company big enough to employ a CIO, then it’s likely you are facing the decision of whether it’s more economically viable to own the capacity outright.

Don’t assume the cloud is always scalable

If you are going to buy HPC in the cloud, don’t assume that always means you’ll get a tailored, scalable solution, advises Volker Grappendorf, managing director at HPC vendor SGI, which supplied the supercomputer used by the Atomic Weapons Establishment.

“An environment made of tons of pizza-box servers to cover the increasing compute requirements will lead to an explosion of unwieldy IT management costs and data centre management headaches,” says Grappendorf.

Budgeting always has its surprise elements, too. Don’t choose a solution without first considering the amount of support you’ll need to optimise your software code and data analytics for the HPC system, says Grappendorf.

When to buy HPC in the cloud

Datacentre operators can talk you through the pros and cons of either accessing HPC services in the cloud or owning your own equipment and getting it hosted by a company that can manage the hardware, cooling systems and security efficiently.

A multinational datacentre operator like Equinix is a case in point. Sam Johnston, its director of cloud and IT services, says HPC was originally a niche aimed at industries with deep pockets. As such, if you haven’t had to buy a supercomputer yet, it’s highly unlikely you ever will.

However, there is a handy rule of thumb, and it’s all about timing: access HPC via the cloud if your company needs a lot of equipment, but only occasionally, says Johnston.

Banks exemplify this pattern of use. Many financial institutions run daily ‘value at risk’ calculations to work out how much money they can put into the market while still satisfying regulators. They need a large compute footprint because the more scenarios they run, the better the model; the better the model, the more they can invest and the more they earn. Surely this is a sector with deep pockets, where ownership would bring economies?

Not so. “This is only done once a day. If they had to buy it themselves then they would have a lot of infrastructure going unused most of the time, unless they could find other applications for it of course,” says Johnston.
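To make that workload concrete, here is a minimal sketch in Python of the sort of Monte Carlo ‘value at risk’ calculation Johnston is describing. Every figure in it (portfolio value, volatility, scenario count) is an illustrative assumption rather than anything a real bank would use; the point is that accuracy scales with the number of simulated scenarios, which is exactly why the once-a-day compute burst is so large.

```python
import numpy as np

def monte_carlo_var(portfolio_value, mu, sigma, confidence=0.99, n_scenarios=1_000_000):
    """Estimate one-day value at risk: the loss exceeded on only
    (1 - confidence) of simulated days."""
    rng = np.random.default_rng(seed=42)
    # Simulate one-day portfolio returns; normality is a simplifying assumption
    returns = rng.normal(mu, sigma, n_scenarios)
    losses = -portfolio_value * returns
    # VaR is the loss at the chosen percentile of the loss distribution
    return np.percentile(losses, confidence * 100)

# Illustrative figures only: a $100m book with 2% daily volatility
var_99 = monte_carlo_var(portfolio_value=100e6, mu=0.0, sigma=0.02)
print(f"One-day 99% VaR: ${var_99:,.0f}")
```

Pushing n_scenarios from a million towards billions, across thousands of instruments, is what turns this one-liner of statistics into an HPC job, and one that sits idle for the other 23 hours of the day.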

When to buy your own HPC

On the other hand, it’s different if you have a constant cycle of demand for heavy-duty data crunching. Then you have less to gain from cloud computing and may even find it more expensive, according to Johnston.

In that instance, a CIO is better advised to make long-term plans to buy the company its own equipment. Prepare for some long meetings and a grilling from the board, in which members show off their Wikipedia-sourced knowledge of supercomputers.
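For a feel of where the line falls, the back-of-envelope sketch below compares an assumed on-demand cloud price with the hourly cost of an owned node amortised over its life. Every number is an illustrative assumption, not a vendor quote, but the shape of the sum is the point: owned kit costs the same whether or not it is busy, so the higher your utilisation, the better ownership looks.

```python
# Rough break-even sketch; all prices and lifetimes are illustrative assumptions
cloud_cost_per_node_hour = 1.50      # assumed on-demand cloud price
owned_cost_per_node = 25_000.0       # assumed purchase price per node
lifetime_years = 4                   # assumed useful life of the hardware
hosting_overhead = 1.6               # assumed multiplier for power, cooling, space

hours_in_lifetime = lifetime_years * 365 * 24
owned_cost_per_node_hour = owned_cost_per_node * hosting_overhead / hours_in_lifetime

# Utilisation above which owning beats renting, per node
break_even_utilisation = owned_cost_per_node_hour / cloud_cost_per_node_hour
print(f"Owning wins above ~{break_even_utilisation:.0%} utilisation")
```

With these made-up figures the crossover lands around 76 per cent utilisation: a once-a-day VaR run falls well below it, a round-the-clock crunching cycle well above.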

Outsource the management

Owning your own HPC can make good sense, but get someone else to host it in a datacentre where it can be fed (electrical power, at the most economical tariffs) and watered (or immersed in oil, or whatever the latest cooling technique is) at the most efficient rate.

Hosting specialists like Equinix or Rackspace can handle the higher power densities and more demanding interconnections that HPC kit requires.

Get tooled up

If you outsource everything about the management of HPC to the cloud except your use of the services, then how you handle the virtual HPC becomes the game-changer.

One of the beauties of cloud access to HPC is that you can use cloud tools to manage your capacity. But which ones? It all depends on whether you are using HPC for image processing, credit-risk simulations, seismic analysis, machine learning, engineering design or plain old-fashioned gene splicing.

Many of the requisite tools can be found on GitHub (the code-hosting service). AWS naturally has the best-known tools, such as CfnCluster, a sample open-source framework for deploying and maintaining HPC clusters on AWS.
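As a rough illustration, CfnCluster is driven from the command line; the sketch below shows the general shape of a session. The cluster name is a placeholder and the exact commands depend on the version you install, so treat this as indicative rather than a reference.

```
pip install cfncluster      # the tool is distributed via PyPI
cfncluster configure        # one-off setup: AWS region, key pair, VPC details
cfncluster create myhpc     # provisions a cluster via AWS CloudFormation
cfncluster delete myhpc     # tears it down again once the jobs are done
```

The appeal is the last line: the whole cluster exists only for as long as you are paying for it.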

“Today companies and organisations of all types, from NASA to Novartis, now use AWS for a variety of High Performance Computational tasks,” says Ian Massingham, AWS technical evangelist.

Keep your options open

Don’t sign any long-term contracts, though, advises Flaviu Radulescu, CEO at service provider Bigstep.

Open source software is readily available and is helping to create cheaper high performance infrastructure. “The performance gap between what the public cloud could offer and the needs of big data has been bridged by the bare metal cloud,” says Radulescu.

The performance of bare metal infrastructure (direct memory and CPU access, hardware switching, wire-speed networking), combined with the flexibility of the cloud, makes it easier than ever to get started with big data without needing your own HPC system.

Don’t invest time learning a disappearing technology

The gradual commoditisation of every element of computing, and the maturity of OpenStack-supported technologies (like ZeroVM’s containers and micro-hypervisor technologies), will remove the arguments for needing dedicated HPC, because all high performance workloads will be able to run in the cloud, according to Nigel Beighton, Rackspace’s vice president of technology.

“The need for specialist custom technology will evaporate for most workloads in the future. Cloud is where we are all going, and eventually high performance computing will be obsolete thanks to the cloud,” he says.