Firms full of dirty data

As much as 30 per cent of data held by companies could be “dirty” in some way, according to a BCS award winner.

Up to 30 per cent of an organisation's data could be "dirty", according to a top informatics researcher.

Delivering a lecture after winning the British Computer Society's (BCS) Roger Needham Award, Professor Wenfei Fan said that businesses and their customers could be negatively affected by "dirty data": any information which is inconsistent, inaccurate, incomplete or out of date. "Poor quality data can give us trouble. The problem is everywhere," he said.

According to Fan, Australia has 500,000 dead people with active Medicare cards, while the US Pentagon's bad data quality led that organisation to attempt to send over 200 dead soldiers back to Iraq.

"In the UK, we're doing no better," he said, noting that this country has issued 81 million national insurance numbers to a population of only 60 million.

But it's not just the public sector. Fan said that in a customer database of over half a million records, 120,000 become invalid within a year. Error rates in industry range from a fairly accurate one per cent to as high as 30 per cent.

And such dirt can lead to high costs, he said. Among other examples, he cited the case of Lehman Brothers, which inaccurately entered £300 million for a £3 million trade, taking £300 billion off the FTSE 100.

It's not just a problem for high finance, but for retail, too. Fan noted an example from Dell, which sold 15,000 computers in Chile for £79 when they were actually worth £303.

Fan claimed dirty data costs US businesses as much as $611 billion (£412 billion) and US customers as much as $2.5 billion (£1.68 billion) each year. "Real life data is dirty, and dirty data is costly," Fan said.

While he admitted 100 per cent accuracy was essentially impossible, Fan called for better tools to cross-reference databases to detect incorrect data.

Fan is chair of web data management in the School of Informatics at the University of Edinburgh. The Roger Needham Award is given by the BCS and Microsoft to a UK-based researcher within ten years of their PhD for their contribution to computer science.

The entertaining lecture can be viewed at http://emea25537091.emea.acrobat.com/bcs_needham_2008; click 'option eight' to go directly to Fan's lecture.
