Information archaeology

While some information is old in minutes, there’s plenty of information in your organisation you need to keep for many years. Take action now, or older file formats could be unreadable.

And then there are in-house applications. You might be able to import the data from an app written in a DOS version of FoxPro, but could you get the custom code that generates reports working?

Finding solutions

The best answer so far is to virtualise the operating system and applications needed to open the files. Natalie Ceeney says "the vast majority" of government information is in Microsoft formats, so the National Archives worked with Microsoft to develop a system to make documents in older versions easily accessible in their original format.

It uses Virtual PC 2007 to run all previous versions of Microsoft Office on Windows 3.1, Windows 95, Windows 98 and Vista as required, on a single PC.

"Today it's reasonably simple to convert a document to the latest versions of the Office file format," she points out. "This is also about protecting the digital integrity of the document and making sure it can still be viewed and seen the way it was intended."

Email archiving is a related problem. For one thing, PST archives on individual hard drives aren't easy to search or even to find. As James Blake, product manager for email archiving service Mimecast, puts it: "That's all the intelligence of your business spread across the world and scattered among road warriors who may or may not lose laptops or have them stolen."

There's the same file format issue with attachments, and you can also run into problems with archiving messages from Exchange, says Blake. "In the last ten years we've gone through Exchange 5, Exchange 5.5, Exchange 2000, 2003 and now 2007. Over a ten-year retention period, you have to manage the migration of all these different email platforms and the underlying stored data in your archive. If I archived my data on a previous version and I'm now on Exchange 2003, I have no way to import that data to examine it. You have to install Exchange 5.5 or upgrade the data to an intermediate format."

With that in mind, Mimecast reads in email that arrives as SMTP as well as in the native Exchange format, and stores it in a custom XML format that splits the message into its component parts. "The whole message is cryptographically hashed so we can prove when we rebuild the message that it hasn't been tampered with. If it's been forwarded to one person without the original attachment we can single-instance store that and note how it was forwarded. We also store notes on how business processes like approvals and scanning have changed a message."
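To make the idea concrete, here is a minimal Python sketch of the general technique Blake describes: hash the whole message, split it into parts, and store each distinct part only once. It is not Mimecast's implementation or format. The function names (archive_message, verify_message), the record layout, the choice of SHA-256 and the in-memory dictionary standing in for an archive store are all assumptions made for illustration.

```python
import hashlib
from email import message_from_bytes, policy


def archive_message(raw: bytes, blob_store: dict) -> dict:
    """Split a raw email into parts and record integrity hashes.

    blob_store is a hypothetical stand-in for the archive's storage:
    a dict mapping a SHA-256 digest to the bytes of one stored part.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    record = {
        # Hash of the complete original message, used later to show
        # that the archived copy has not been altered.
        "message_sha256": hashlib.sha256(raw).hexdigest(),
        "subject": str(msg.get("Subject", "")),
        "from": str(msg.get("From", "")),
        "to": str(msg.get("To", "")),
        "parts": [],
    }
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no content of their own
        payload = part.get_payload(decode=True) or b""
        digest = hashlib.sha256(payload).hexdigest()
        # Single-instance storage: identical content is kept once,
        # however many messages refer to it.
        blob_store.setdefault(digest, payload)
        record["parts"].append({
            "content_type": part.get_content_type(),
            "filename": part.get_filename(),
            "sha256": digest,
        })
    return record


def verify_message(raw: bytes, record: dict) -> bool:
    """Check a message's bytes against the hash taken at archive time."""
    return hashlib.sha256(raw).hexdigest() == record["message_sha256"]
```

Archiving two messages that carry the same attachment adds only one copy of that attachment to the store, while each message keeps its own record and whole-message hash. A real archive would also need to keep the headers and structure needed to rebuild the message, which this sketch leaves out.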

The XML format is more efficient for searching. Blake claims you can search ten years of email using the Mimecast Outlook plug-in faster than you can search a local PST. And speed matters to your users more than the cost of tape versus hard drives: "People are emailing themselves documents so they know in five to ten years they can get them back in seconds, as opposed to internal backups on tape where they have to wait three to four hours for the tape to come back on a truck from Iron Mountain and then wait again for you to load the tape."
