What is big data?
Big data is a big deal for the business world - here's what it's all about...
Data is constantly referred to as "the new oil", while politicians compare tech giants to the US oil companies that rose to power over a century ago.
This "new oil" isn't being sucked up from the ground. Instead, it's being harvested in large volumes from people using online services, tools and applications.
There's so much data, in fact, that without the right tools to store and process it, organisations can struggle to make sense of it. This huge array of information is collectively termed 'big data'.
You only have to think of all the times you fill out an online form, sign up to a digital service, or complete a questionnaire to have an idea of the volumes being generated every day. Add to this the vast quantities of data generated by web-connected devices, social media and sensors all over the world, and you have an unimaginably large amount of information to contend with.
All this data is potentially incredibly valuable to businesses. If they can collect and store it properly, and analyse it effectively, they can extract information and insights that help them make important decisions.
Elements of big data
Before taking any steps towards implementing a big data analytics programme, it's important to know the fundamental principles that make it different to other data a company may traditionally find in its data stores.
Although there's some disagreement over what exactly constitutes big data, most experts agree on five core elements: volume, variety, velocity, veracity and value.
Volume: This is the defining component of big data. In the past, employees generated most of the data in an organisation; today the bulk of it comes from systems, networks, social media and IoT devices, producing a massive amount of data that needs analysing.
Variety: Data comes from widely varying types and sources, and falls into two broad forms: structured (data typically sourced from a database, so it's clear and well organised) and unstructured (data from elsewhere, including social media sites like Twitter, which is ordinarily more chaotic and includes photos, videos, documents, audio files and emails). The sheer variety of unstructured data can cause problems for storing, processing and analysing it. Big data tools aim to process this unstructured data and make sense of it; handling chaotic data is a fundamental part of big data.
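As a rough illustration (not from the article, and with invented field names and sample values), the difference between the two forms can be sketched in a few lines of Python: a structured record has a known schema you can query directly, while unstructured text has to be mined for anything useful.

```python
import csv
import io
import re

# Structured data: a CSV export from a database, with a known schema,
# so each field can be read directly by name.
structured = "customer_id,city,spend\n1001,London,250.00\n1002,Leeds,99.50\n"
rows = list(csv.DictReader(io.StringIO(structured)))
total_spend = sum(float(row["spend"]) for row in rows)

# Unstructured data: free text from social media, where any structure
# must be inferred -- here a regex pulls out anything resembling a hashtag.
post = "Loving the new release! #bigdata #analytics"
hashtags = re.findall(r"#(\w+)", post)

print(total_spend)  # 349.5
print(hashtags)     # ['bigdata', 'analytics']
```

The structured half needs no interpretation at all; the unstructured half only yields information because we guessed a pattern to look for, which is exactly the gap big data tools try to close.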
Velocity: With such a wide range of data coming from different sources, it's no surprise that the pace at which it arrives matters too. The flow of data is huge and continuous: emails, text messages, social media updates and credit card transactions arrive every minute of every day. To inform valuable business decisions, much of this data needs to be processed and analysed in real time, which requires highly available systems with failover capabilities to cope with the data pipeline.
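A minimal sketch of what "processing data as it arrives" means, assuming a simulated in-memory stream of transaction amounts (the values and function name are invented for illustration): rather than waiting for a complete dataset, the system updates a running statistic with each new event.

```python
from collections import deque

def rolling_average(events, window=3):
    """Emit an updated average over the last `window` events
    as each event 'arrives', instead of waiting for all the data."""
    recent = deque(maxlen=window)  # automatically drops the oldest event
    averages = []
    for value in events:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages

# Simulated stream of transaction amounts arriving one at a time.
stream = [10.0, 20.0, 30.0, 40.0]
print(rolling_average(stream))  # [10.0, 15.0, 20.0, 30.0]
```

Production streaming systems add buffering, distribution and failover on top of this idea, but the core shift is the same: the analysis keeps pace with the data rather than running after the fact.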
Veracity: Given the volume, variety and velocity of the data coming in, the challenge is to evaluate its quality, because this directly influences the quality of any analysis built on it. A big data project needs processes in place to ensure data is clean and to prevent dirty data from accumulating.
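In practice, "keeping data clean" often starts with simple validation rules applied before records enter the store. The sketch below is an invented example (the field names, rules and sample records are assumptions, not from the article) showing the kind of checks involved: dropping records with missing or implausible values and normalising the rest.

```python
def clean_records(records):
    """Keep only records that pass basic quality checks:
    a non-empty name and an age in a plausible range."""
    cleaned = []
    for rec in records:
        name = (rec.get("name") or "").strip()
        age = rec.get("age")
        if name and isinstance(age, (int, float)) and 0 <= age <= 120:
            cleaned.append({"name": name, "age": age})
    return cleaned

raw = [
    {"name": "Ada", "age": 36},
    {"name": "", "age": 29},      # missing name: dropped
    {"name": "Bob", "age": -5},   # implausible age: dropped
    {"name": " Cy ", "age": 51},  # stray whitespace: trimmed, kept
]
print(clean_records(raw))  # [{'name': 'Ada', 'age': 36}, {'name': 'Cy', 'age': 51}]
```

Rules like these are cheap to run at ingestion time, and catching dirty data early is far less costly than unwinding analysis built on it later.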
Value: In the end, the quantity of raw data matters less than how much value you can extract from it, and how smart you are with it. When you add together the previous four Vs, will the insights you draw from analysis actually be worthwhile for your organisation? A business may have access to a vast amount of data, but unless it's used intelligently, it may not deliver much value at all.