Free Republic
Browse · Search
News/Activism
Topics · Post Article


A new way to stop digital decay
The Economist ^ | September 15, 2005 | Economist Staff

Posted on 09/20/2005 4:13:50 PM PDT by Zuben Elgenubi




Sep 15th 2005
From The Economist print edition


Computing: Could a “virtual computer”, built from software, help to save today's digital documents for historians of the future?

WHEN future historians turn their attention to the early 21st century, electronic documents will be vital to their understanding of our times. Old web pages may not turn yellow and brittle like paper, but the digital documents of today's culture face a more serious threat: the disappearance of computers able to read them. Even a relatively simple electronic item, such as a picture, requires software to present it as a visible image, but 100 years from now, today's computers will have long since become obsolete. More complex items, like CD-ROMs or videos, will be unreadable even sooner.

In 1986, for example, 900 years after the Domesday book, the BBC launched a project to compile data about Britain, including maps, video and text. The results were recorded on laserdiscs that could only be read by a special system based around a BBC Micro home computer. But since the disks were unreadable on any other system, this pioneering example of multimedia was nearly lost for ever. It took two and a half years of patient work with one of the few surviving machines to move the data on to a modern PC (it can be seen online at www.domesday1986.com).

National libraries are just starting to grapple with this problem as part of their new mandate to preserve digital culture. “It is a major problem, but it is remarkable how little known it is,” says Hilde van Wijngaarden, head of digital preservation at the National Library of the Netherlands. “People just accept that things no longer work after ten years.”

Keeping working examples of all computer hardware is impractical, so the most popular preservation strategy is to copy files from one generation of hardware to the next. The problem is that today's word processors and web browsers, for example, do not always display files in the same way that older software did. An accumulation of subtle errors can eventually make the original item unreadable. An alternative approach, called emulation, uses software to simulate the old hardware on a modern computer, to allow old software to run. But today's emulators will need another emulator to run on the next generation of hardware, which will need another emulator for the next generation, and so on. This can also introduce errors.

So the National Library of the Netherlands is exploring a third option, using a simulated computer that exists only in software. It is called the Universal Virtual Computer (UVC) and is being developed by IBM, a computer giant. The researchers are writing programs to run on this virtual computer that decode different document formats. Future libraries will have to write software that emulates the virtual computer on each new generation of computer systems. But once that is done, they will be able to view all their stored documents using the decoders written for the virtual computer, which only have to be written once. “The decoder can be tested for correctness today, while the format is still readable,” says Raymond van Diessen of IBM.
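
The layering described above can be illustrated with a toy sketch in Python. It is not IBM's UVC: the tiny instruction set, the `run` interpreter and the run-length `RLE_DECODER` program are all invented here for illustration, and the "format" is a made-up one in which each pair of bytes means (count, value) with count at least 1. The point is only that the decoder is written once against the virtual machine, while `run` is the single piece a future library would need to rewrite for new hardware.

```python
# Toy illustration only: this is NOT IBM's UVC instruction set.

def run(program, data: bytes) -> bytes:
    """Interpret `program` (a list of instruction tuples) over `data`."""
    regs = {"a": 0, "b": 0}   # two general-purpose registers
    pc = 0                    # program counter
    pos = 0                   # read position in the input
    out = bytearray()
    while pc < len(program):
        op, *args = program[pc]
        if op == "READ":      # READ r: load the next input byte into r
            regs[args[0]] = data[pos]
            pos += 1
        elif op == "EMIT":    # EMIT r: append the value of r to the output
            out.append(regs[args[0]])
        elif op == "DEC":     # DEC r: subtract one from r
            regs[args[0]] -= 1
        elif op == "JNZ":     # JNZ r, t: jump to t if r is not zero
            if regs[args[0]] != 0:
                pc = args[1]
                continue
        elif op == "JEOF":    # JEOF t: jump to t if the input is exhausted
            if pos >= len(data):
                pc = args[0]
                continue
        elif op == "JMP":     # JMP t: unconditional jump
            pc = args[0]
            continue
        elif op == "HALT":
            break
        else:
            raise ValueError(f"unknown opcode: {op}")
        pc += 1
    return bytes(out)

# The "decoder", written once for the virtual machine.
# Hypothetical format: repeated (count, value) byte pairs, count >= 1.
RLE_DECODER = [
    ("JEOF", 7),      # 0: no more input -> halt
    ("READ", "a"),    # 1: a = count
    ("READ", "b"),    # 2: b = value
    ("EMIT", "b"),    # 3: output one copy of value
    ("DEC", "a"),     # 4: one fewer copy to go
    ("JNZ", "a", 3),  # 5: repeat while copies remain
    ("JMP", 0),       # 6: next pair
    ("HALT",),        # 7
]

decoded = run(RLE_DECODER, bytes([3, 65, 2, 66]))
```

Porting to a new machine would mean rewriting `run` alone; `RLE_DECODER` and the stored documents would stay exactly as they are.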

His team has written decoders for two common image formats, JPEG and GIF. They plan to move on to Adobe's PDF format. IBM is also talking to drug firms, which are required to store data from clinical trials for long periods. Ultimately, the aim is to be able to preserve anything from simple web pages to complex data sets. Ominously, some scientific data from the 1970s has already crumbled into unreadable digital bits.


TOPICS: Business/Economy; Culture/Society; Editorial; Technical
KEYWORDS: decay; diessen; digital; domesday; ibm
To: Zuben Elgenubi
So this Universal Virtual Computer is currently being written to work on today's technology.

What happens in the not too distant future when the hardware no longer supports a particular version of the Universal Virtual Computer?

Then won't they need to write another UVC to run the original UVC?

But then the hardware will continue to evolve and you will need an ever-increasing string of UVC updates to read each other and ultimately read the files.

This is all very silly.

All we need is a complete list of rules for reading each file type. At any time in the future someone can write a program to read any file for which the rules have been maintained.
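
A minimal sketch of what such a "list of rules" might look like, in Python. The file type here is made up (a 4-byte magic string followed by a 16-bit width and height, little-endian), and `TOY_IMAGE_RULES` and `read_header` are hypothetical names. Real formats such as JPEG or PDF need far longer rule lists than this.

```python
import struct

# Hypothetical rule list for a made-up image header: (field name, struct code).
TOY_IMAGE_RULES = [
    ("magic", "4s"),   # four identifying bytes
    ("width", "H"),    # unsigned 16-bit integer
    ("height", "H"),   # unsigned 16-bit integer
]

def read_header(data: bytes, rules, byte_order: str = "<") -> dict:
    """Apply the written-down rules to the start of `data`."""
    fmt = byte_order + "".join(code for _, code in rules)
    size = struct.calcsize(fmt)
    if len(data) < size:
        raise ValueError("file is shorter than the rules require")
    values = struct.unpack(fmt, data[:size])
    return dict(zip((name for name, _ in rules), values))

sample = b"TOYI" + struct.pack("<HH", 640, 480)
header = read_header(sample, TOY_IMAGE_RULES)
```

The reader is generic; only the rule list changes from one file type to the next, which is the part that would have to be maintained.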

In the future when humans all have brain implants or have been replaced by AI robots it will take microseconds to generate the code and run it against the sum of all documents.

21 posted on 09/20/2005 8:00:46 PM PDT by who_would_fardels_bear
[ Post Reply | Private Reply | To 1 | View Replies]

To: Zuben Elgenubi
More complex items, like CD-ROMs or videos, will be unreadable even sooner.

CD-ROMs last a much shorter time than good-quality paper. Hard disks and tapes last even less.

22 posted on 09/20/2005 8:37:39 PM PDT by A. Pole (Gov.Gumpas:"But that would be putting the clock back, have you no idea of progress, of development?")
[ Post Reply | Private Reply | To 1 | View Replies]

To: supercat
I would think that three such drives (one each for 8", 5.25", and 3.5") would be able to read 99.99% of the floppies produced in those sizes (and would also, with proper programming, be better able to deal with bit rot than the drives of yesteryear).

Floppies, CDs, etc. decay physically.

23 posted on 09/20/2005 8:39:24 PM PDT by A. Pole (Gov.Gumpas:"But that would be putting the clock back, have you no idea of progress, of development?")
[ Post Reply | Private Reply | To 3 | View Replies]

To: Mr_Moonlight
Information (data) has been 'lost' thru eons of history, but somehow scientists and historians have been able to recreate a fairly accurate picture of the past using only some of the smallest parcels of indirect data still in existence

Computer PRINTOUTS on good paper will survive.

24 posted on 09/20/2005 8:41:06 PM PDT by A. Pole (Gov.Gumpas:"But that would be putting the clock back, have you no idea of progress, of development?")
[ Post Reply | Private Reply | To 5 | View Replies]

To: sneakers
They tell me that the newer microfilm has a shelf-life of 500 years if stored properly.

Good paper lasts longer, and it has passed the test of time.


25 posted on 09/20/2005 8:47:39 PM PDT by A. Pole (Gov.Gumpas:"But that would be putting the clock back, have you no idea of progress, of development?")
[ Post Reply | Private Reply | To 15 | View Replies]

To: A. Pole
Computer PRINTOUTS on good paper will survive.

And all it takes is one careless flick of a Bic lighter to burn that theory to ash .. hence my point, that information (data) has always been lost thru the ages, regardless of its media type

26 posted on 09/20/2005 8:48:52 PM PDT by Mr_Moonlight
[ Post Reply | Private Reply | To 24 | View Replies]

To: supercat
On the contrary, we have documents from the early 1800's which are very readable on microfilm and we get good copies. Of course, it depends upon the condition of the document in the first place. We have to adhere to strict guidelines when filming and cannot dispose without the go-ahead of the State Historical Commission - and that is after they have looked over the archival copy of our film.

As far as daguerreotypes go, well, I'm not an expert on the evolution of photography, so I couldn't tell you anything about that. :)

27 posted on 09/20/2005 8:59:36 PM PDT by sneakers
[ Post Reply | Private Reply | To 17 | View Replies]

To: A. Pole

You are absolutely correct. We have early 19th century records which are holding up much better than a lot of our 20th century records. But it's always good to have a backup in case, God forbid, a fire should occur.

I wish they had filmed all the military records that were burned in the St. Louis fire in the early 1970's. My dad's WWII records were lost. They recreated as much as possible, but who knows what else was there.


28 posted on 09/20/2005 9:06:11 PM PDT by sneakers
[ Post Reply | Private Reply | To 25 | View Replies]

To: A. Pole
Floppies, CDs, etc. decay physically.

True, there is magnetic degradation. But signal-processing techniques should make it possible to discern data which is faded too much to be readable via standard hardware (which must 'get' everything in one go).

Of course, if the oxide falls off the disk that's another story.
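
One very simple way to picture the "not everything in one go" idea is to read the same sector several times and take a bit-by-bit majority vote. The Python below is only a rough sketch under that assumption: `majority_vote` is a hypothetical helper, and real recovery work would operate on the analog signal from the head rather than on already-decoded bytes.

```python
from typing import Sequence

def majority_vote(reads: Sequence[bytes]) -> bytes:
    """Combine several equal-length reads of one sector, bit by bit."""
    if not reads:
        raise ValueError("need at least one read")
    n = len(reads[0])
    if any(len(r) != n for r in reads):
        raise ValueError("all reads must have the same length")
    out = bytearray(n)
    for i in range(n):
        byte = 0
        for bit in range(8):
            # Count how many reads saw a 1 in this bit position.
            ones = sum((r[i] >> bit) & 1 for r in reads)
            if 2 * ones > len(reads):   # strict majority says 1
                byte |= 1 << bit
        out[i] = byte
    return bytes(out)

# Three noisy reads of the same two bytes; each read has at most one bad bit.
reads = [b"\x41\x42", b"\x40\x42", b"\x41\x43"]
recovered = majority_vote(reads)
```

With an odd number of reads there are no ties; with an even number, a tied bit is treated as 0 in this sketch.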

29 posted on 09/20/2005 9:29:32 PM PDT by supercat (Don't fix blame--FIX THE PROBLEM.)
[ Post Reply | Private Reply | To 23 | View Replies]

To: sneakers
On the contrary, we have documents from the early 1800's which are very readable on microfilm and we get good copies. Of course, it depends upon the condition of the document in the first place. We have to adhere to strict guidelines when filming and cannot dispose without the go-ahead of the State Historical Commission - and that is after they have looked over the archival copy of our film.

All depends on how well the film is made. Documents which are being archived for the purpose of preserving a good quality copy of the original will probably do pretty well. Documents which are thrown on microfilm because regulation XYZ says to do so even though nobody ever looks at them anyway may not do so well.

30 posted on 09/20/2005 9:31:44 PM PDT by supercat (Don't fix blame--FIX THE PROBLEM.)
[ Post Reply | Private Reply | To 27 | View Replies]

To: Zuben Elgenubi

Why don't they just ask the BATFEC and the IRS what to do about it. I'm sure the pasty-faced droids up there have figured out how to store every detail of personal information about every living human on Earth for at least the next 103,487 years with no possibility of escape, oops excuse me I mean data degradation.


31 posted on 09/20/2005 11:48:28 PM PDT by fire_eye (Socialism is the opiate of academia.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: supercat
A computer may be able to tell that a file contains a picture, and a human may be able to tell that it contains a picture of a woman holding a baby. But who are the woman and the baby? If there isn't anyone around to identify the significance of a picture, that significance will be lost even if the picture itself remains.

JPEG and TIFF accommodate embedded metadata of that type (and more). See http://www.aspjpeg.com/manual_06.html for details on embedded image metadata insertion and extraction.

32 posted on 09/21/2005 12:48:47 AM PDT by Prime Choice (E=mc^3. Don't drink and derive.)
[ Post Reply | Private Reply | To 7 | View Replies]

To: Zuben Elgenubi

The advertisement column is funny and misleading. Parsers for all interesting formats are easily written and maintained for UNIX systems. OS emulators that must be frequently updated are unnecessary, although many of those are also being written and maintained.


33 posted on 09/21/2005 12:50:40 AM PDT by familyop ("Let us try" sounds better, don't you think? "Essayons" is so...Latin.)
[ Post Reply | Private Reply | To 1 | View Replies]

To: sneakers
I wish they had filmed all the military records that were burned in the St. Louis fire in the early 1970's. My dad's WWII records were lost. They recreated as much as possible, but who knows what else was there.

Don't forget the wealth of historic records and photonegatives that were lost when the World Trade Center was destroyed. The basement of that building was used to house tons of historic content that is now forever lost.

34 posted on 09/21/2005 12:50:44 AM PDT by Prime Choice (E=mc^3. Don't drink and derive.)
[ Post Reply | Private Reply | To 28 | View Replies]

To: sneakers

Making daguerreotypes is what killed Daguerre.


35 posted on 09/21/2005 12:58:47 AM PDT by TypeZoNegative (Future Minnesota Refugee)
[ Post Reply | Private Reply | To 27 | View Replies]

To: A. Pole

CDs, if kept at a good temperature and humidity and away from light, will last several hundred years.


36 posted on 09/21/2005 1:00:30 AM PDT by familyop ("Let us try" sounds better, don't you think? "Essayons" is so...Latin.)
[ Post Reply | Private Reply | To 25 | View Replies]

To: Zuben Elgenubi

I have wondered about this "digital decay" every time I send off a sound recording to the Library of Congress for copyright registration. Not only will CDs be unreadable and obsolete in the near future (if they aren't already), but I heard that the digital data on the CDs themselves naturally degrades over time.


37 posted on 09/21/2005 1:09:45 AM PDT by Lancey Howard
[ Post Reply | Private Reply | To 1 | View Replies]

To: Zuben Elgenubi

Baconian, Websterian, Collierian, call it what you will, encyclopedic efforts are inspired by the very nature that dooms them - the light that eclipses itself.

Maybe "dark ages" are of the same order as "ice ages."


38 posted on 09/21/2005 1:18:16 AM PDT by Old Professer (Fix the problem, not the blame!)
[ Post Reply | Private Reply | To 1 | View Replies]

To: Zuben Elgenubi

Rose petals in Grandma's bible.


39 posted on 09/21/2005 1:19:42 AM PDT by Old Professer (Fix the problem, not the blame!)
[ Post Reply | Private Reply | To 4 | View Replies]

To: RockyMtnMan

The only way to freeze time is to unplug the clock.


40 posted on 09/21/2005 1:21:14 AM PDT by Old Professer (Fix the problem, not the blame!)
[ Post Reply | Private Reply | To 10 | View Replies]



Disclaimer: Opinions posted on Free Republic are those of the individual posters and do not necessarily represent the opinion of Free Republic or its management. All materials posted herein are protected by copyright law and the exemption for fair use of copyrighted works.
