Cray shows the value of Secret Sauce

The supercomputing firm's recent growth is no mystery

[Image: Cray XC30]

Why should it interest you that Cray's results show steady, well above-market growth for the last three years? If ever there was a remote, ivory-tower company, it would be one that fits Cray's surface description.

Founded in 1976, with supercomputing expertise that goes back practically to the dawn of information technology, Cray spent some time in the merger-and-acquisition wilderness as part of middle-sized fish SGI, before coming back into the limelight in the last few years. While the ownership may have changed, and the system design has moved a good distance on from the old "computer disguised as a hotel foyer sofa" days, Cray has at least stuck to its knitting: back in 1976 the primary business was supercomputing, and so it is now.


Which is pretty peculiar, when you think about what supercomputing has lately been all about. The most remarkable recent advances in the field have centred on Nvidia's CUDA architecture, which exploits the extreme calculation speeds achieved in graphics cards to do (partly) non-graphics maths work far faster than a general-purpose CPU burdened with all the other parts of a classical operating system. This has been a kind of game of Jenga on the part of systems designers: the initial tower of wooden blocks was built to feed ever-escalating appetites for faster and faster games, with screen sizes and frame rates accelerating well past the limits of human perception and into speeds where the hardware became more useful than the supercomputers of the day.
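As a concrete (and entirely generic, non-Cray) illustration of what that means, here is a minimal CUDA sketch of the classic SAXPY operation, the sort of plain arithmetic a graphics card now does far faster than a CPU, spread across thousands of lightweight GPU threads:

    #include <cstdio>
    #include <cuda_runtime.h>

    // SAXPY (y = a*x + y): a classic non-graphics workload that maps
    // neatly onto the thousands of lightweight cores a GPU provides.
    __global__ void saxpy(int n, float a, const float *x, float *y)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // one element per thread
        if (i < n)
            y[i] = a * x[i] + y[i];
    }

    int main()
    {
        const int n = 1 << 20;            // one million elements
        float *x, *y;
        cudaMallocManaged(&x, n * sizeof(float));  // unified memory keeps the sketch short
        cudaMallocManaged(&y, n * sizeof(float));
        for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        // Launch enough 256-thread blocks to cover all n elements.
        saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, x, y);
        cudaDeviceSynchronize();

        printf("y[0] = %f\n", y[0]);      // expect 4.0
        cudaFree(x);
        cudaFree(y);
        return 0;
    }

Each thread handles a single element; the graphics heritage shows only in the grid-and-block launch syntax.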


Now, the supercomputing marketplace has built that tower ever higher, with the attendant risk of it toppling over: it has become custom and practice for the compute work to be done by optimised add-on cards, generally hosted in generic rackmount server hardware. Intel has even made its own add-on compute card, one with no remaining role as a graphics card at all.


What is the toppling pressure? The shift from predictive models, which take a short list of starting conditions and values and recompute them over and over within a tiny walled garden of hyperspeed hardware, to models informed by altogether too much data. Cray's example is a baseball team that owns a Cray: as case studies go, this one doesn't travel especially well, but its basic premises still hold. A ton of data comes whistling in as the match progresses, and there are pieces of advice the machine can prioritise which depend on complete oversight not just of all that momentary data, but of the historical long tail too.

This is not the kind of job that can easily be cut up between thousands of tiny compute cores on a graphics board. Flooding data past the processors, in a structure amenable to the style of query being presented, becomes an absolutely vital skill.
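To give a flavour of what "a structure amenable to the query" means in practice, here is a small CUDA sketch (the EventAoS record and its field names are invented for illustration, not anything Cray-specific) contrasting the same summation query over two memory layouts. The struct-of-arrays version streams one contiguous column past the processors, so adjacent threads hit adjacent addresses and memory bandwidth is used far more efficiently:

    #include <cstdio>
    #include <cuda_runtime.h>

    // Hypothetical record: a few fields per event, as might stream in
    // from a live data feed. (Illustrative only.)
    struct EventAoS { float value; float weight; int id; int flags; };

    // Array-of-structs layout: a query that only needs 'value' still
    // drags the unused fields through memory with it.
    __global__ void sum_values_aos(int n, const EventAoS *e, float *out)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            atomicAdd(out, e[i].value);   // strided 16-byte accesses
    }

    // Struct-of-arrays layout: the same query reads one contiguous
    // stream, so the loads coalesce across neighbouring threads.
    __global__ void sum_values_soa(int n, const float *value, float *out)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n)
            atomicAdd(out, value[i]);     // contiguous 4-byte accesses
    }

    int main()
    {
        const int n = 1 << 20;
        EventAoS *aos; float *soa, *out;
        cudaMallocManaged(&aos, n * sizeof(EventAoS));
        cudaMallocManaged(&soa, n * sizeof(float));
        cudaMallocManaged(&out, 2 * sizeof(float));
        for (int i = 0; i < n; ++i) { aos[i] = {1.0f, 0.5f, i, 0}; soa[i] = 1.0f; }
        out[0] = out[1] = 0.0f;

        sum_values_aos<<<(n + 255) / 256, 256>>>(n, aos, out);
        sum_values_soa<<<(n + 255) / 256, 256>>>(n, soa, out + 1);
        cudaDeviceSynchronize();
        printf("aos sum = %.0f, soa sum = %.0f\n", out[0], out[1]);

        cudaFree(aos); cudaFree(soa); cudaFree(out);
        return 0;
    }

The atomicAdd reduction is deliberately naive to keep the sketch short; the point is the layout: a query that touches one field should only have to pull that field through memory.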


This is where Cray thinks it has a distinct advantage. What other projects in the supercomputing field leave as the "secret sauce" of the implementation and operations teams, Cray brings right into the system design, just as it always has since 1976. All through that long haul in the shadows, it has worked on ways to take the secrecy out of the sauce.

What is all this? First the analogy is to a dinner-party game best played drunk, then all of a sudden we're on to cookery. What's the theme here?


My first couple of pointers on this topic came from two very different sources (not sauces!): the first being an absolute evangelist for the One True Way in supercomputing. He was going to build a fluid flow modelling system using completely standard white-box machinery, lots of different lumps of Linux, and InfiniBand to glue it all together. We looked forward to a massive cardboard unboxing day and lots of screwing and plugging and so forth, only to be told there was no chance: everything had to be put together by the reseller, because if they left their "generic" components in the hands of the uninitiated, none of it would work. The second incident came while a nice professor at Cambridge was effusively thanking the team who keep the work queue going, feeding diverse data sets and problems into the multi-rack, megascale supercomputer put together for him by Dell and Intel. "These are the guys that add the secret sauce," he said.


Cray's point is: why is that secret, and why is it added invisibly and unaccountably by humans? The interconnection capability isn't magic, or for gurus only. It is meant to be pretty dull shovel-work, making sure the investment gets given enough data to really start to pay back. What this thinking reveals is how much of computing these days is handled under a "general magic" heading, with the nerds justified by what they guard, make possible, or are able to fix on a bad day.

Cray believes its approach eliminates the rather open-ended uncertainty common to this type of rollout. The interconnect, in Cray parlance, is as clearly defined as the compute resource, and the speed at which a whole pile of data is turned into a reference, or used to skew a result, is a predictable quantity, not something left to chance in the configuration of an unpopular or invisible low-level component. As supercomputing shifts from a rare, prediction-oriented mathematical dance on the head of a pin to a daily grind of trying to gain market advantage over your peers, this move from unknown magic to known performance statements makes a lot of sense to a lot of buyers.

Which takes the mystery out of its results, too.
