Free Republic

Warning of data ticking time bomb
BBC ^ | 03 July 2007 | Unknown

Posted on 07/05/2007 9:30:29 AM PDT by ShadowAce

The growing problem of accessing old digital file formats is a "ticking time bomb", the chief executive of the UK National Archives has warned.

Natalie Ceeney said society faced the possibility of "losing years of critical knowledge" because modern PCs could not always open old file formats.

She was speaking at the launch of a partnership with Microsoft to ensure the Archives could read old formats.

Microsoft's UK head Gordon Frazer warned of a looming "digital dark age".

Costly deal

He added: "Unless more work is done to ensure legacy file formats can be read and edited in the future, we face a digital dark hole."

Research by the British Library suggests Europe loses 3bn euros each year in business value because of issues around digital preservation.

The National Archives, which holds 900 years of written material, has more than 580 terabytes of data - the equivalent of 580,000 encyclopaedias - in older file formats that are no longer commercially available.

Ms Ceeney said: "If you put paper on shelves, it's pretty certain it is going to be there in a hundred years.

"If you stored something on a floppy disc just three or four years ago, you'd have a hard time finding a modern computer capable of opening it."

"Digital information is in fact inherently far more ephemeral than paper," warned Ms Ceeney.

She added: "The pace of software and hardware developments means we are living in the world of a ticking time bomb when it comes to digital preservation.

"We cannot afford to let digital assets being created today disappear. We need to make information created in the digital age to be as resilient as paper."

But Ms Ceeney said some digital documents held by the National Archives had already been lost forever because the programs which could read them no longer existed.

"We are starting to find an awful lot of cases of what has been lost. What we have got to make sure is that it doesn't get any worse."

The root cause of the problem is the range of proprietorial file formats which proliferated during the early digital revolution.

Technology companies, such as Microsoft, used file formats which were not only incompatible with pieces of software from rival firms, but also between different iterations of the same program.

Mr Frazer said Microsoft had shifted its position on file formats.

"Historically within the IT industry, the prevailing trend was for proprietary file formats. We have worked very hard to embrace open standards, specifically in the area of file formats."

Microsoft has developed a new document file format, called Open XML, which is used to save files from programs such as Word, Excel and PowerPoint.

Mr Frazer said: "It's an open international standard under independent control. These are no longer under control of Microsoft and are free for access by all."

But some critics question Microsoft's approach and ask why the firm has created its own new standard, rather than adopting a rival system, called the Open Document Format.

Instead, Microsoft has released a tool which can translate between the two formats.

Ben Laurie, director of the Open Rights Group, said: "This is a well-known, standard Microsoft move.

"Microsoft likes lock-ins. Typically what happens is that you end up with two or three standards."

The agreement between the National Archives and Microsoft centres on the use of virtualisation.

The archive will be able to read older file formats in the format they were originally saved by running emulated versions of the older Windows operating systems on modern PCs.

For example, if a Word document was saved using Office 97 under Windows 95, then the National Archives will be able to open that document by emulating the older operating system and software on a modern machine.

Ms Ceeney said the issue of older file formats was a bigger problem than reading outdated forms of media, such as floppy discs of various sizes and punch cards.

"The media it is stored in is not relevant. Back-up is important, but back-up is not preservation."

Adam Farquhar, head of e-architecture at the British Library, praised Microsoft for its adoption of more open standards.

He said: "Microsoft has taken tremendous strides forward in addressing this problem. There has been a sea change in attitude."

He warned that the issue of digital preservation did not just affect National Archives and libraries.

"It's everybody - from small businesses to university research groups and authors and scientists.

"It's a huge challenge for anyone who keeps digital information for more than 15 years because you are talking about five different technology generations."

The British Library and National Archives are members of the Planets project which brings together European National Libraries and Archives and technology companies to address the issue of digital preservation.

He said that open file formats were an important step but there was still work to be done.

"Automation is a key area to work on. We need to be able to convert hundreds and even thousands of documents at a time," he said.
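As a rough illustration of the automation point, a batch conversion usually starts with an inventory of what is actually in the collection. The sketch below is only an assumption about how that first step might look, not the tooling used by the National Archives or the British Library. It sorts files by their leading bytes, using three widely documented signatures: the OLE2 compound-file header used by the older binary Office formats, the ZIP header shared by Open XML and Open Document Format packages, and the PDF header. The names `sniff_format` and `inventory` are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Well-known leading byte signatures. The labels are coarse, illustrative
# groupings, not an authoritative format registry.
SIGNATURES = [
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "ole2"),  # legacy binary Office files
    (b"PK\x03\x04", "zip"),                          # Open XML, ODF, plain zip
    (b"%PDF", "pdf"),
]


def sniff_format(path: Path) -> str:
    """Return a coarse format label based on the first bytes of the file."""
    with path.open("rb") as handle:
        head = handle.read(8)
    for magic, label in SIGNATURES:
        if head.startswith(magic):
            return label
    return "unknown"


def inventory(root: Path) -> Counter:
    """Count files under root by sniffed format, to size a conversion batch."""
    return Counter(sniff_format(p) for p in root.rglob("*") if p.is_file())
```

Calling `inventory(Path("some/archive"))` on a directory (the path here is a placeholder) returns a tally per label. A ZIP signature alone cannot distinguish Open XML from Open Document Format, so a real workflow would need to look inside the package as well.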


TOPICS: Technical
KEYWORDS: data; dataaccess; datasafety; format; legacyfiles; operatingsystems; preservation
To: proxy_user
Yep. Vi, grep (and egrep), sed, cat, more, et cetera... God, I love Unix/Linux... (although VMS had its good points).
21 posted on 07/05/2007 10:08:00 AM PDT by LIConFem (Thompson 2008. Lifetime ACU Rating: 86 -- Hunter 2008 (VP) Lifetime ACU Rating: 92)
[ Post Reply | Private Reply | To 18 | View Replies]

To: ShadowAce

That doesn't help with Excel-type data *including graphs*, which don't go into CSV, or, even worse, database info.


22 posted on 07/05/2007 10:09:13 AM PDT by N3WBI3 (Light travels faster than sound. This is why some people appear bright until you hear them speak....)
[ Post Reply | Private Reply | To 6 | View Replies]

To: BuffaloJack
Oh yeah, I've had plenty of coasters over the years. CD/DVD is not good media for archiving important data. Not sure what the industry standards are, but different manufacturers use varying thickness, too. The cheap media tends to cause problems with writing and accessing data.

There are really two parts to the discussion: the file format and the storage media. Both will need to be accessible and operational for future use.

Good advice on the two copies, too - thanks.

23 posted on 07/05/2007 10:11:26 AM PDT by stainlessbanner
[ Post Reply | Private Reply | To 17 | View Replies]

To: LIConFem

Raises hand..

You are not alone


24 posted on 07/05/2007 10:12:28 AM PDT by N3WBI3 (Light travels faster than sound. This is why some people appear bright until you hear them speak....)
[ Post Reply | Private Reply | To 14 | View Replies]

To: N3WBI3

That is indeed correct. I was just referring to the archival durability. A clay/stone tablet probably outlasts all!


25 posted on 07/05/2007 10:12:36 AM PDT by CarrotAndStick (The articles posted by me needn't necessarily reflect my opinion.)
[ Post Reply | Private Reply | To 20 | View Replies]

To: N3WBI3

True—however, as someone mentioned upthread, PDF should stick around for quite a while. While you will not be able to easily work with that data, exporting spreadsheets out to PDF for archival purposes should work OK.


26 posted on 07/05/2007 10:14:54 AM PDT by ShadowAce (Linux -- The Ultimate Windows Service Pack)
[ Post Reply | Private Reply | To 22 | View Replies]

To: LIConFem
although VMS had its good points

Ugh, our VMS guy left the company and we are getting rid of our old DEC VAX in the next 6-10 months, so I got pulled off of UNIX and stuck on the DEC... I hate it. What is "cd ~" on Unix is "set def [000000.home.n3wbi3]" here. Ugly to use...

27 posted on 07/05/2007 10:15:29 AM PDT by N3WBI3 (Light travels faster than sound. This is why some people appear bright until you hear them speak....)
[ Post Reply | Private Reply | To 21 | View Replies]

To: Frank Sheed
I still have zip/jaz and bernoulli drives. Looks like they have a bunch of drivers available that should do the trick.
28 posted on 07/05/2007 10:16:37 AM PDT by stainlessbanner
[ Post Reply | Private Reply | To 16 | View Replies]

To: ShadowAce
"If you stored something on a floppy disc just three or four years ago, you'd have a hard time finding a modern computer capable of opening it."

If it's not important, file it in the circular file. If it is important, and your machine lacks the needed drive, run out and buy a USB floppy drive.

I recently found a floppy from 1988 and managed to read it just fine. It contained a Lotus spreadsheet analysis of a proposed real estate investment.

29 posted on 07/05/2007 10:17:38 AM PDT by cynwoody
[ Post Reply | Private Reply | To 1 | View Replies]

To: stainlessbanner

Thanks! I checked and see that TigerDirect sells tons of external zip drives by Iomega for $140. I figure that even with my new laptop (coming soon), I can get one and use it on my USB port. All else will be saved as CD/DVD.

F


30 posted on 07/05/2007 10:22:08 AM PDT by Frank Sheed (Fr. V. R. Capodanno, Lt, USN, Catholic Chaplain. 3rd/5th, 1st Marine Div., FMF. MOH, posthumously.)
[ Post Reply | Private Reply | To 28 | View Replies]

To: LIConFem
...another vi user!! Thought I was the last one!

Don't despair... there are still a few of us out here. :-)

31 posted on 07/05/2007 10:25:09 AM PDT by ken in texas (come fold with us.... team #36120)
[ Post Reply | Private Reply | To 14 | View Replies]

To: N3WBI3
The CLI isn't as succinct as Unix, but for those of us who do heavy development, VMS has some wonderful system services and library functions. For example, whereas Unix has signals, VMS has ASTs (Asynchronous System Traps) that allow one to send two longs' worth of data along with an interrupt. And this can be done across cluster nodes as well. This was especially useful to me when I was writing real-time trading and market data applications.
32 posted on 07/05/2007 10:25:57 AM PDT by LIConFem (Thompson 2008. Lifetime ACU Rating: 86 -- Hunter 2008 (VP) Lifetime ACU Rating: 92)
[ Post Reply | Private Reply | To 27 | View Replies]
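
For readers following the signals-versus-ASTs comparison in comment 32, here is a minimal sketch of the Unix side only. It assumes a POSIX system and that the code runs in the main thread; the names `on_usr1` and `send_and_wait` are hypothetical. It shows that a classic signal handler is told which signal arrived and nothing else, which is the limitation the comment contrasts with the data an AST can carry. (POSIX real-time signals sent with sigqueue can carry one integer or pointer, but that is outside this sketch.)

```python
import os
import signal
import time

received = []


def on_usr1(signum, frame):
    # A classic Unix signal handler learns only which signal arrived;
    # no application data travels with it.
    received.append(signum)


def send_and_wait(timeout: float = 1.0) -> bool:
    """Send SIGUSR1 to this process and wait until the handler has run."""
    signal.signal(signal.SIGUSR1, on_usr1)
    os.kill(os.getpid(), signal.SIGUSR1)
    deadline = time.monotonic() + timeout
    while not received and time.monotonic() < deadline:
        time.sleep(0.01)
    return bool(received)
```

Any payload has to travel by some other route, such as a pipe or shared memory, with the signal acting only as the wake-up.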

To: ken in texas

Kwel!!!


33 posted on 07/05/2007 10:26:53 AM PDT by LIConFem (Thompson 2008. Lifetime ACU Rating: 86 -- Hunter 2008 (VP) Lifetime ACU Rating: 92)
[ Post Reply | Private Reply | To 31 | View Replies]

To: LIConFem
:1,1 s/Kwel/Kewl/

;o)
34 posted on 07/05/2007 10:27:28 AM PDT by LIConFem (Thompson 2008. Lifetime ACU Rating: 86 -- Hunter 2008 (VP) Lifetime ACU Rating: 92)
[ Post Reply | Private Reply | To 33 | View Replies]

To: proxy_user

“Even with this fancy formats, you can always cat them on a Unix box, or edit in vi.”

Unless they saved everything in their microfiche library.

I have a collection of cutting edge (at one time) computer junk. There are punched cards, 8-inch floppies, disk platters, reel to reel tape, etc. Most everything stored there is saved in a format that has been lost to the ravages of time: database info in proprietary formats, etc. The respective readers have been lost too. Who keeps a working 8-inch floppy drive handy anyway?


35 posted on 07/05/2007 10:28:11 AM PDT by FreeInWV
[ Post Reply | Private Reply | To 4 | View Replies]

To: ShadowAce
580 terabytes of data - the equivalent of 580,000 encyclopaedias

LOL

If it's important, Wikipedia will preserve it.

36 posted on 07/05/2007 10:29:35 AM PDT by RightWhale (It's Brecht's donkey, not mine)
[ Post Reply | Private Reply | To 1 | View Replies]

To: BuffaloJack

Some CDs last barely two years. They look as permanent as the Grand Canyon, but they come apart.


37 posted on 07/05/2007 10:33:27 AM PDT by RightWhale (It's Brecht's donkey, not mine)
[ Post Reply | Private Reply | To 17 | View Replies]

To: LIConFem
Re: your remark about especially useful

I never worked on VMS systems much, but from all I've heard and read, VMS had, and may still have, one of the best implementations for HA clustering in the business.

38 posted on 07/05/2007 10:33:57 AM PDT by ken in texas (come fold with us.... team #36120)
[ Post Reply | Private Reply | To 32 | View Replies]

To: LIConFem
:1,1 s/Kwel/Kewl/

No scan and replace needed... I know what you meant. ;-)

39 posted on 07/05/2007 10:36:08 AM PDT by ken in texas (come fold with us.... team #36120)
[ Post Reply | Private Reply | To 34 | View Replies]

To: ken in texas
I certainly think so. Even the crummy (by today's standards) HSC-based clusters I set up back in '93 were incredible in terms of reliability. We had our govvies system running across several nodes (two hot, one stby, IIRC) and I don't recall ever having any problems with availability. And using the Blocking AST feature of the VMS DLM, I was able to keep trading and market data synchronized across all nodes, so that if a client placed a bid on one node, it would immediately show up on all (this was all kept in global sections).

I love Unix, but I do sometimes miss VMS.
40 posted on 07/05/2007 10:39:49 AM PDT by LIConFem (Thompson 2008. Lifetime ACU Rating: 86 -- Hunter 2008 (VP) Lifetime ACU Rating: 92)
[ Post Reply | Private Reply | To 38 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.
