Free Republic
Browse · Search
News/Activism
Topics · Post Article

Warning of data ticking time bomb
BBC | 03 July 2007 | Unknown

Posted on 07/05/2007 9:30:29 AM PDT by ShadowAce

The growing problem of accessing old digital file formats is a "ticking time bomb", the chief executive of the UK National Archives has warned.

Natalie Ceeney said society faced the possibility of "losing years of critical knowledge" because modern PCs could not always open old file formats.

She was speaking at the launch of a partnership with Microsoft to ensure the Archives could read old formats.

Microsoft's UK head Gordon Frazer warned of a looming "digital dark age".

He added: "Unless more work is done to ensure legacy file formats can be read and edited in the future, we face a digital dark hole."

Research by the British Library suggests Europe loses 3bn euros each year in business value because of issues around digital preservation.

The National Archives, which holds 900 years of written material, has more than 580 terabytes of data - the equivalent of 580,000 encyclopaedias - in older file formats that are no longer commercially available.

Ms Ceeney said: "If you put paper on shelves, it's pretty certain it is going to be there in a hundred years.

"If you stored something on a floppy disc just three or four years ago, you'd have a hard time finding a modern computer capable of opening it."

"Digital information is in fact inherently far more ephemeral than paper," warned Ms Ceeney.

She added: "The pace of software and hardware developments means we are living in the world of a ticking time bomb when it comes to digital preservation.

"We cannot afford to let digital assets being created today disappear. We need to make information created in the digital age to be as resilient as paper."

But Ms Ceeney said some digital documents held by the National Archives had already been lost forever because the programs which could read them no longer existed.

"We are starting to find an awful lot of cases of what has been lost. What we have got to make sure is that it doesn't get any worse."

The root cause of the problem is the range of proprietary file formats which proliferated during the early digital revolution.

Technology companies, such as Microsoft, used file formats which were incompatible not only with software from rival firms, but also between different versions of the same program.

Mr Frazer said Microsoft had shifted its position on file formats.

"Historically within the IT industry, the prevailing trend was for proprietary file formats. We have worked very hard to embrace open standards, specifically in the area of file formats."

Costly deal

Microsoft has developed a new document file format, called Open XML, which is used to save files from programs such as Word, Excel and PowerPoint.

Mr Frazer said: "It's an open international standard under independent control. These are no longer under control of Microsoft and are free for access by all."

But some critics question Microsoft's approach and ask why the firm has created its own new standard, rather than adopting a rival system, called the Open Document Format.

Instead, Microsoft has released a tool which can translate between the two formats.

Ben Laurie, director of the Open Rights Group, said: "This is a well-known, standard Microsoft move.

"Microsoft likes lock-ins. Typically what happens is that you end up with two or three standards."

The agreement between the National Archives and Microsoft centres on the use of virtualisation.

The archive will be able to read older files in the format in which they were originally saved by running emulated versions of the older Windows operating systems on modern PCs.

For example, if a Word document was saved using Office 97 under Windows 95, then the National Archives will be able to open that document by emulating the older operating system and software on a modern machine.

Ms Ceeney said the issue of older file formats was a bigger problem than reading outdated forms of media, such as floppy discs of various sizes and punch cards.

"The media it is stored in is not relevant. Back-up is important, but back-up is not preservation."

Adam Farquhar, head of e-architecture at the British Library, praised Microsoft for its adoption of more open standards.

He said: "Microsoft has taken tremendous strides forward in addressing this problem. There has been a sea change in attitude."

He warned that the issue of digital preservation did not just affect National Archives and libraries.

"It's everybody - from small businesses to university research groups and authors and scientists.

"It's a huge challenge for anyone who keeps digital information for more than 15 years because you are talking about five different technology generations."

The British Library and National Archives are members of the Planets project which brings together European National Libraries and Archives and technology companies to address the issue of digital preservation.

He said that open file formats were an important step but there was still work to be done.

"Automation is a key area to work on. We need to be able to convert hundreds and even thousands of documents at a time," he said.
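A first step in automating conversion at that scale is usually an inventory of what formats an archive actually holds. The following is a minimal sketch, not anything described in the article: it assumes files can be sorted coarsely by their leading "magic" bytes, and the names `SIGNATURES`, `sniff_format` and `inventory` are hypothetical. Note that Open XML and OpenDocument files are both ZIP containers, so this check alone cannot tell them apart.

```python
from collections import Counter
from pathlib import Path

# Leading bytes of a few common containers. Illustrative only, not exhaustive.
SIGNATURES = {
    b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1": "ole2",  # legacy binary Word/Excel/PowerPoint
    b"PK\x03\x04": "zip",                         # Open XML and OpenDocument containers
    b"%PDF": "pdf",
}


def sniff_format(path):
    """Return a coarse format label based on the first bytes of the file."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    for magic, label in SIGNATURES.items():
        if head.startswith(magic):
            return label
    return "unknown"


def inventory(root):
    """Count the files under root by sniffed format."""
    return Counter(sniff_format(p) for p in Path(root).rglob("*") if p.is_file())
```

Calling `inventory("/path/to/archive")` returns a count per label, which shows how many files would need a legacy reader before any batch conversion is attempted.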


TOPICS: Technical
KEYWORDS: data; dataaccess; datasafety; format; legacyfiles; operatingsystems; preservation
To: ShadowAce
Ms Ceeney said: "If you put paper on shelves, it's pretty certain it is going to be there in a hundred years.

This is something the proprietor of The House of the Book in San Juan, Puerto Rico talked to us about almost 30 years ago while showing us some >500 year old books whose pages were still clear and bright.
61 posted on 07/05/2007 12:57:00 PM PDT by aruanan
[ Post Reply | Private Reply | To 1 | View Replies]

To: aruanan

BUMP!


62 posted on 07/05/2007 1:08:15 PM PDT by Publius6961 (MSM: Israelis are killed by rockets; Lebanese are killed by Israelis.)
[ Post Reply | Private Reply | To 61 | View Replies]

To: aruanan
This is something the proprietor of The House of the Book in San Juan, Puerto Rico talked to us about almost 30 years ago while showing us some >500 year old books whose pages were still clear and bright.

I believe the longevity of written documents is a function of the paper construction and the chemical makeup of the ink. I remember hearing that certain documents (not sure of the time period) were very much in danger due to acidic degradation.

63 posted on 07/05/2007 1:12:51 PM PDT by JeffAtlanta
[ Post Reply | Private Reply | To 61 | View Replies]

To: ShadowAce

OMG!!! It is the Y2K crap all over again. The world is going to end shortly!!!!


64 posted on 07/05/2007 1:16:14 PM PDT by RetiredArmy (Jorge Bush & his Marxist's Dim friends are enemies of the Republic with their amnesty Bill.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: aruanan
... while showing us some >500 year old books whose pages were still clear and bright.

Agree... Suppose our country was founded today -- and documents like the Declaration of Independence and the Constitution were composed in MS Word, and saved on a CD or DVD.

Would people still be able to read them after as many years as have passed since this country's founding?

65 posted on 07/05/2007 1:24:32 PM PDT by ken in texas (come fold with us.... team #36120)
[ Post Reply | Private Reply | To 61 | View Replies]

To: LIConFem
Wow, another vi user!! Thought I was the last one!

Not a chance! I use vi every day! It is a great piece of work!

66 posted on 07/05/2007 1:30:49 PM PDT by aragorn (Tag line? What tag line?)
[ Post Reply | Private Reply | To 14 | View Replies]

To: LIConFem
although VMS had its good points

To a Unix bigot like myself, VMS, well, sucks. (To be fair, it has been 10 years since I've been on a VMS box). I hated the way it always tried to be helpful by peering into, and converting between file formats. That is an application responsibility, not an operating system responsibility. I often used to wish for a "/just-leave-it-the-hell-alone" option!

67 posted on 07/05/2007 1:41:03 PM PDT by aragorn (Tag line? What tag line?)
[ Post Reply | Private Reply | To 21 | View Replies]

To: JeffAtlanta

Are yur zip drives are belong to us!


68 posted on 07/05/2007 1:47:11 PM PDT by Frank Sheed (Fr. V. R. Capodanno, Lt, USN, Catholic Chaplain. 3rd/5th, 1st Marine Div., FMF. MOH, posthumously.)
[ Post Reply | Private Reply | To 60 | View Replies]

To: JeffAtlanta
I remember hearing that certain documents (not sure of the time period) were very much in danger due to acidic degradation.

Modern paper is made from wood pulp, whereas older paper was made from cotton or linen. It was so cool to see a 500 year old book that was completely legible and the paper white and not crumbling, unlike pulp fiction of the 1950's and 60's.
69 posted on 07/05/2007 2:39:19 PM PDT by aruanan
[ Post Reply | Private Reply | To 63 | View Replies]

To: ShadowAce
If you put paper on shelves, it's pretty certain it is going to be there in a hundred years.

or the muzzies could be dancing around bonfires fueled by same in 20 years.

70 posted on 07/05/2007 2:43:08 PM PDT by Fitzcarraldo (Skip the Moon, go for Mars)
[ Post Reply | Private Reply | To 1 | View Replies]

To: taxcontrol
My recommendation to a business that faces a records retention need of 5 years or more is to create a “record” - actually an image of the standard operating system loaded with all of the appropriate applications necessary to read any of the file formats for that year. Then, during the annual records archival process, store that image along with the data.

That's not entirely a bad idea. With the advent of easy virtualization, you could just create an image of the entire system as a runnable VM image. VMware is the best thing since sliced bread IMO.

Another thing I'd strongly advise is for both individuals and companies to use document formats that are actually open and fully documented. That would rule out all of the Microsoft formats mentioned in the article. Store the full specs for the formats along with the VM images and you should be pretty much good to go for a lot of stuff.

Of course, you also have to deal with media obsolescence. Fortunately, we're making large leaps in data storage, so the space the data takes up that you have to store (in triplicate in separate locations btw) gets smaller with each generation. Approximately every 5 years, you should migrate your entire archive to new media (again in triplicate at multiple locations) to keep bit rot and entropy from destroying all your hard work.

Keeping up with data is hard, expensive work, and will be for quite some time.
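One way to check that a migration to new media did not silently lose anything is to keep a checksum manifest alongside the archive. What follows is only a sketch under assumptions: SHA-256 is my choice of hash, and the helper names `sha256_of`, `build_manifest` and `damaged_files` are made up for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large archive files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def damaged_files(manifest, copy_root):
    """List files that are missing from a copy or whose contents have changed."""
    copy_root = Path(copy_root)
    bad = []
    for rel_path, expected in manifest.items():
        candidate = copy_root / rel_path
        if not candidate.is_file() or sha256_of(candidate) != expected:
            bad.append(rel_path)
    return bad
```

Build the manifest once from the master copy, store it with the archive, and run `damaged_files` against each of the three copies after every media refresh. A copy that fails can be repaired from one that passes.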

71 posted on 07/05/2007 4:31:42 PM PDT by zeugma (Don't Want illegal Alien Amnesty? Call 800-417-7666)
[ Post Reply | Private Reply | To 7 | View Replies]

To: ShadowAce
Thought I was the last one! ;o)

Nope--I'm here also.

There are quite a few on this forum actually.

:wq! 

72 posted on 07/05/2007 4:35:27 PM PDT by zeugma (Don't Want illegal Alien Amnesty? Call 800-417-7666)
[ Post Reply | Private Reply | To 19 | View Replies]

To: LIConFem
another vi user!!

And another here. I have to deal with so many different Unix platforms over telnet, vi is the only thing I can be sure is always there.

73 posted on 07/05/2007 4:40:54 PM PDT by SauronOfMordor (Open Season rocks http://www.youtube.com/watch?v=ymLJz3N8ayI)
[ Post Reply | Private Reply | To 14 | View Replies]

To: cynwoody
But there will almost certainly be virtual machine applications capable of simulating a PC on the computers of 2233. They'll just need to be able to read the drive image off the physical media.

Perhaps by then the hardware will be up to the demands of MS-Vista. :-) 

74 posted on 07/05/2007 4:51:00 PM PDT by zeugma (Don't Want illegal Alien Amnesty? Call 800-417-7666)
[ Post Reply | Private Reply | To 41 | View Replies]

To: JeffAtlanta; scan59
I'd like to make one comment concerning the archival of tape. If you would like your tape media to last, be they cassettes, VHS, 9-track or reel-to-reel, it is important that, if you have the ability to, you run the tape to its end and NOT rewind it. For VHS tapes, don't fast-fwd when you hit the credits either. Let them run all the way to the snow at the end, then pop the tape. You'll find they last a lot longer that way. You'll have to rewind before you watch them again, but that's just the way it is. :-)

You can actually see a physical indication of why this is so rather easily. Take a VHS tape, fast forward it to the end, then pop it out. Now take another tape, push Play, and let it run to the end. Take both tapes and look at the tape through the little window that shows the spindle where the tape has been run to. Chances are, on the tape you ran quickly, you'll see ridges in the tape that indicate it wasn't all wound evenly on the spool. Look at the tape that was played all the way through and it will most likely appear smooth because it was wound a lot more evenly.

75 posted on 07/05/2007 5:06:54 PM PDT by zeugma (Don't Want illegal Alien Amnesty? Call 800-417-7666)
[ Post Reply | Private Reply | To 58 | View Replies]

To: LIConFem

I like Joe better.


76 posted on 07/05/2007 5:33:49 PM PDT by amigatec (Carriers make wonderful diplomatic statements. Subs are for when diplomacy is over.)
[ Post Reply | Private Reply | To 14 | View Replies]

To: sittnick; stainlessbanner

I swear, JAZ drives destroyed data more efficiently than a sledgehammer.


77 posted on 07/05/2007 5:41:15 PM PDT by dighton
[ Post Reply | Private Reply | To 57 | View Replies]

To: LIConFem; proxy_user; ShadowAce; N3WBI3; ken in texas; MarkL; aragorn; zeugma; SauronOfMordor
> Wow, another vi user!! Thought I was the last one! ;o)

Evidently not.

I had the good fortune to graduate from the line editors (ed, TECO) of the 70's directly to EDT (under VMS on a VAX-11/780) in the early 80's. Tried vi under Unix (Sys5) but I loved the EDT keypad programmability so much that I hacked MicroEmacs into an EDT-ish clone and ported it to all my Unix, Windows, and MacOS systems into the 90's. Used emacs for a while after that, but FINALLY, AFTER DECADES OF IGNORING IT, have come back to using vi more than anything else, because (as Sauron points out) it works in xterms over remote connections when nothing else will. And 80% of what I do these days is over xterms to remote systems. (Sauron said "telnet" but I'm sure he really meant "ssh".)

The other 20% is typing into TEXTAREA boxes on FreeRepublic posting pages using whatever editor my browser-du-jour and OS-du-jour give me.

There's a lot to be said for the common denominator, least or otherwise.

78 posted on 07/05/2007 7:48:45 PM PDT by dayglored (Listen, strange women lying in ponds distributing swords is no basis for a system of government!)
[ Post Reply | Private Reply | To 14 | View Replies]

To: dayglored
There's a lot to be said for the common denominator, least or otherwise.

Amen. 

About the only systems I've used extensively over the years on which I didn't have a copy of vi readily available were hp-1000 and hp-3000 systems running RTE-A and MPE-V respectively.

One thing that really kind of upsets me is that some lame nerd stole my copy of the O'Reilly vi reference manual. That is total suckage. Every time I thumbed through that, I'd find something that vi can do that I didn't know about.

The only editor I've ever really liked as much as vi was Brief. There's some stuff that Brief would do (though it was strictly DOS only) that I still haven't found a replacement for.

79 posted on 07/05/2007 9:30:08 PM PDT by zeugma (Don't Want illegal Alien Amnesty? Call 800-417-7666)
[ Post Reply | Private Reply | To 78 | View Replies]

To: zeugma
> One thing that really kind of upsets me is that some lame nerd stole my copy of the O'Reilly vi reference manual. That is total suckage.

Lowest of the low.

> Every time I thumbed through that, I'd find something that vi can do that I didn't know about.

There's tons I don't know about vi, so maybe I should find myself a copy.

I'm one of very few folks who still know and use 'ed' on occasion (typically, when a system hangs on boot due to fsck failing, you're in single-user, and all you've got to edit /etc/fstab is 'ed'). (Don't forget that initial 'P' to get a prompt!) You'd think I could translate that into vi easier than most, but there's some sort of mental block...

Then again, I shouldn't complain about 'ed'. There's always the REAL Sysadmin's editor, "cat >".

> The only editor I've ever really liked as much as vi was Brief. There's some stuff that Brief would do (though it was strictly DOS only) that I still haven't found a replacement for.

Brief was great on Microsoft systems, yep. On Windows I've gotten used to TextPad (www.textpad.com) which is quite decent. I've gotten Nedit to run on most of my Unix and Linux systems (it's a kick on the Mac under X11) and it's not bad.

One thing that still bites me with vi is that I got used to using cursor/keypad escape sequences in vi on a few modern systems, but not all my remote systems recognize them, and they interpret Esc[A as commands... so it's back to the HJKL home keys again... and I can't seem to keep straight which systems are "safe" for cursor keys... habits die hard.
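For anyone wondering what those sequences look like on the wire: many terminals send the cursor keys as the three bytes ESC, `[`, and a letter from A to D. Here is a rough sketch of the mapping back to the home keys. The names `ARROW_TO_VI` and `translate_keys` are made up for illustration, and it assumes the plain `ESC [` form (some terminals send `ESC O` in application cursor mode instead, which this does not handle).

```python
# ANSI cursor-key sequences (ESC [ A..D) mapped to the equivalent vi motion keys.
ARROW_TO_VI = {
    "\x1b[A": "k",  # up
    "\x1b[B": "j",  # down
    "\x1b[C": "l",  # right
    "\x1b[D": "h",  # left
}


def translate_keys(raw):
    """Replace arrow-key escape sequences in raw with h/j/k/l; pass the rest through."""
    out = []
    i = 0
    while i < len(raw):
        seq = raw[i:i + 3]
        if seq in ARROW_TO_VI:
            out.append(ARROW_TO_VI[seq])
            i += 3
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)
```

A vi that has no mapping for these keys just sees the ESC followed by two ordinary command characters, which is why the safe fallback is the home row.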

80 posted on 07/05/2007 9:49:04 PM PDT by dayglored (Listen, strange women lying in ponds distributing swords is no basis for a system of government!)
[ Post Reply | Private Reply | To 79 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.

FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson