Information archaeology

While some information is old in minutes, there’s plenty of information in your organisation you need to keep for many years. Take action now, or older file formats could be unreadable.

And then there are in-house applications. You might be able to import the data from an app written in a DOS version of FoxPro, but could you get the custom code that generates reports working?

Finding solutions

The best answer so far is to virtualise the operating system and applications needed to open the files. Natalie Ceeney says 'the vast majority' of government information is in Microsoft formats, so the National Archives worked with Microsoft to develop a system to make documents in older versions easily accessible in their original format.

It uses Virtual PC 2007 to run all previous versions of Microsoft Office on Windows 3.1, Windows 95, Windows 98 and Vista as required, on a single PC.

"Today it's reasonably simple to convert a document to the latest versions of the Office file format," she points out. "This is also about protecting the digital integrity of the document and making sure it can still be viewed and seen the way it was intended."

Email archiving is a related problem. For one thing, PST archives on individual hard drives aren't easy to search or even to find. As James Blake, product manager for email archiving service Mimecast, puts it: "That's all the intelligence of your business spread across the world and scattered among road warriors who may or may not lose laptops or have them stolen."

There's the same file format issue with attachments, and you can also run into problems with archiving messages from Exchange, says Blake. "In the last ten years we've gone through Exchange 5, Exchange 5.5, Exchange 2000, 2003 and now 2007. Over a ten year retention period, you have to manage the migration of all these different email platforms and the underlying stored data in your archive. If I archived my data on a previous version and I'm now on Exchange 2003, I have no way to import that data to examine it. You have to install Exchange 5.5 or upgrade the data to an intermediate format."

With that in mind, Mimecast's service reads in email that arrives as SMTP as well as in the native Exchange format, and stores it in a custom XML format that splits the message into component parts. "The whole message is cryptographically hashed so we can prove when we rebuild the message that it hasn't been tampered with. If it's been forwarded to one person without the original attachment we can single instance store that and note how it was forwarded. We also store notes on how business processes like approvals and scanning have changed a message."
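To make the idea concrete, here is a minimal Python sketch of splitting a message into parts, hashing the whole message, and storing each attachment only once. It is an illustration, not Mimecast's implementation: the function names (archive_message, verify), the use of SHA-256, and the plain dictionary standing in for the custom XML format are all assumptions made for the example.

```python
import hashlib
from email import message_from_bytes, policy


def archive_message(raw: bytes, attachment_store: dict) -> dict:
    """Split a raw message into parts and record a hash of the whole message.

    attachment_store is a hypothetical single-instance store: it maps the
    SHA-256 of an attachment's bytes to those bytes, so identical
    attachments are kept only once however many messages carry them.
    """
    msg = message_from_bytes(raw, policy=policy.default)
    record = {
        # Hash of the original bytes, kept so tampering can be detected later.
        "message_sha256": hashlib.sha256(raw).hexdigest(),
        # A list of pairs, because headers such as Received can repeat.
        "headers": [(name, str(value)) for name, value in msg.items()],
        "body": "",
        "attachments": [],
    }
    body = msg.get_body(preferencelist=("plain",))
    if body is not None:
        record["body"] = body.get_content()
    for part in msg.iter_attachments():
        payload = part.get_payload(decode=True) or b""
        digest = hashlib.sha256(payload).hexdigest()
        # Store the bytes only if this content has not been seen before.
        attachment_store.setdefault(digest, payload)
        record["attachments"].append(
            {"filename": part.get_filename(), "sha256": digest}
        )
    return record


def verify(raw: bytes, record: dict) -> bool:
    """Check that a message still matches the hash taken at archive time."""
    return hashlib.sha256(raw).hexdigest() == record["message_sha256"]
```

In this sketch the record refers to attachments by hash only, so two messages carrying the same file add a single entry to the store. A real archive would also need to rebuild the original message from its parts before checking the hash, which is left out here.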

The XML format is more efficient for searching. Blake claims you can search ten years of email using the Mimecast Outlook plug-in faster than you can search a local PST. And speed matters to your users more than the cost of tape versus hard drives: "People are emailing themselves documents so they know in five to 10 years they can get them back in seconds, as opposed to internal backups on tape where they have to wait three to four hours for the tape to come back on a truck from Iron Mountain and then wait again for you to load the tape."
