Information archaeology

Some information is out of date within minutes, but plenty of what your organisation holds must be kept for many years. Take action now, or older file formats could become unreadable.

"We pulled in a load of data that had been stored on tape onto online storage that the customer wanted to mine for BI purposes, and we found out that the biggest part of the whole project was not the 500 million storage array but having to build the software to interpret the data from the mainframe system."

Brunel University is taking its email archiving in-house, using HP's Integrated Archive Platform to archive documents and email, including PST files from user machines. According to Iain Liddell, the policy development manager at Brunel, retrieving documents for a recent police investigation took two weeks and resulted in 800 pages of evidence. "If we'd had the system we are building now, it would be a morning and eight pages of evidence."

The system works with PDF files and retrieves attachments, but he's still looking for a solution for keeping CAD and graphics files accessible and he's following developments at the National Archive. "Building regulation documents may need to be kept for a hundred years - there's an interesting problem.

"At the moment, we do have some physical virtual machines in the estate office which are still running Windows 98. We look forward to finding a solution that will take these out of service and retain access to the CAD drawings."

The university plans to keep the archive manageable using role-based email. "There is a world of difference between the finance director emailing me about computing policy and the finance director emailing me about my pension. We're starting to build aliases for key staff so if he emails me about computing policy he will email to my alias which is my job title. If he is emailing me about pensions he will be mailing to me by name and our archive is being built to distinguish between the two. As we roll role-based email out to people, it will become more and more normal for me to think 'is this to the role or the person?' Are you saluting the uniform or the person?"
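The role-versus-person split Liddell describes could be sketched roughly as follows. This is a minimal illustration only, not Brunel's or HP's implementation: the alias table, the example address and the `classify_recipient` function are all hypothetical names invented for the example.

```python
from dataclasses import dataclass

# Hypothetical alias table: role addresses mapped to job titles.
# The address below is invented for illustration.
ROLE_ALIASES = {
    "policy-development-manager@example.ac.uk": "Policy Development Manager",
}


@dataclass
class ArchiveDecision:
    recipient: str
    category: str  # "role" or "personal"


def classify_recipient(address: str) -> ArchiveDecision:
    """Decide whether a message was sent to a job-title alias or to a named person."""
    normalized = address.strip().lower()
    category = "role" if normalized in ROLE_ALIASES else "personal"
    return ArchiveDecision(recipient=normalized, category=category)
```

An archive built this way could then apply one retention policy to mail addressed to the role and another to mail addressed to the individual.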

To make that work, he doesn't plan to rely on people remembering what address to use. "We're looking to develop a probabilistic system that will look at an email or a document the way that the anti-spam software looks at messages and say 'this looks like a building regulation, we've got to give this 100 years; this is about a student so we keep it for six years after graduation'."

Although full natural language parsing is still a research problem, he predicts it will be possible to classify using technical terminology. "We hope to be using simple dictionary-based systems in the coming months. We've already done it in terms of incoming mail, so we could quarantine something that wasn't spam but was not to be distributed." The university will still need the new data centre it's built. "This won't reduce the amount of storage we need, it will just take us slightly longer to fill it up."
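A simple dictionary-based classifier of the kind Liddell mentions might look something like the sketch below. It is an assumption-laden illustration rather than the university's system: the term lists, the `min_hits` threshold and the `classify` function are hypothetical, and only the two retention periods (100 years, and six years after graduation) come from the quotes above.

```python
import re

# Hypothetical term dictionaries. Only the retention periods are taken
# from the article; the terms themselves are invented for illustration.
RETENTION_RULES = [
    ("building_regulation",
     {"building", "regulation", "planning", "structural"},
     "keep for 100 years"),
    ("student_record",
     {"student", "graduation", "enrolment", "transcript"},
     "keep for six years after graduation"),
]


def classify(text, min_hits=2):
    """Return (label, retention) for the rule whose dictionary matches most words."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    best_label, best_retention, best_hits = None, None, 0
    for label, terms, retention in RETENTION_RULES:
        hits = len(words & terms)  # number of distinct dictionary terms present
        if hits > best_hits:
            best_label, best_retention, best_hits = label, retention, hits
    if best_hits < min_hits:
        # Too little evidence: leave the decision to a person.
        return ("unclassified", "refer for manual review")
    return (best_label, best_retention)
```

A probabilistic system like the one described earlier would replace the raw term counts with learned weights, but the shape of the decision (document in, retention period out) would be similar.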
