What Is Big Data?

Is Big Data yet another buzz term? Steve Cassidy ponders what it really means for the IT industry...

Two nice short words. Not since Churchill's "we shall fight on the beaches" speech has a concept been so pithily expressed in solely Anglo-Saxon vocabulary, and with such a memorable definition, at least as proposed by last week's BBC Horizon programme. According to the programme, Big Data is all about quantity and the creeping instrumentation of everyday life by lurking sensors and systems.

I suspect this casual, passing redefinition of a pre-existing IT insider's deeply technical label is beautifully recursive. That's another insider's term for a concept that refers back to itself for definition. Though on this occasion, the loop that recursion implies doesn't pass through Big Data. Instead, it passes through SEO.

When Tesco want to work out how many baked beans are sold before 11am, the resulting query has to climb into a computational helicopter. Hourglass icons animate, fans spool up, temporary disks get full, LEDs blink furiously.

What Horizon presented was a nice hour of eye-candy and talking heads on two subjects. The first was data mining, which is all about what you do with a lot of information. The second was a field I'm calling Instrumentation (because I can't find a pre-existing term for it), which is all about using sensors or very small computers (most often called "phones" these days) to collect a lot of information, so you can "mine" it and see things emerging from the data that were previously a matter for guesswork. This can be about people, and is most often cited as arising from tracking web-surfing habits, but it's not just that: predicting volcanic eruptions and tsunamis is a classic case of retrieving emergent trends from basic environmental data.

There is an excellent book on these two topics, by Ian Ayres, called "Super Crunchers" - and when I say excellent, I don't mean it's a complete walkthrough for setting up a Hadoop cluster in your garden shed. It's packed with case studies and factual examples and, in a neat twist, Ayres reveals that he used data mining to figure out what to call his book, by putting a list of potential titles up on the web and seeing which one got the most votes. This snippet alone shows him to be a lot savvier about the topic than the BBC, since he makes proper use of search techniques. When I tried to find the Horizon episode by typing "Horizon Big Data" into the BBC's search machinery, I drew a blank.

Here's the issue. Doubtless the BBC has SEO results data, just like we do here at IT Pro. This is meant to be a big secret in certain ways, which proves the point they were trying to make about the irresistible lure of data mining and the results that come from it. But when an egregious hijacking of perfectly good jargon terms hits the national TV networks, I become uneasy and filled with the need to set the record straight.

I suspect that "Big Data" is one of those terms marked as "trending" or "hot" or something like that, and this is the basis for the distortion which has made me so uneasy about their efforts. Just as with previous terms like "hacking" or "push email" or even "cloud computing", it looks like "Big Data" is coming to mean something quite different outside our business than it does inside it.
