Bug in backup software results in loss of 77 terabytes of research data at Kyoto University

TechXplore | 4 January 2022 | Bob Yirka

Posted on 01/04/2022 3:29:08 PM PST by ShadowAce

Computer maintenance workers at Kyoto University have announced that, due to an apparent bug in software used to back up research data, researchers using the University's Hewlett-Packard Cray computing system, which stores its data on a Lustre file system, have lost approximately 77 terabytes of data. The team at the University's Institute for Information Management and Communication posted a Failure Information page detailing what is known so far about the data loss.

The team, with the University's Information Department Information Infrastructure Division, Supercomputing, reported that files in the /LARGE0 storage (on the DataDirect ExaScaler storage system) were lost during a system backup procedure. Some in the press have suggested that the problem arose from a faulty script that was supposed to delete only old, unneeded log files. The team noted that it was originally thought that approximately 100TB of files had been lost, but that number has since been pared down to 77TB. They also note that the failure occurred on December 16 between the hours of 5:50 and 7pm. Affected users were immediately notified by email. The team further notes that approximately 34 million files were lost and that the lost files belonged to 14 known research groups. The team did not release the names of the research groups or say what sort of research they were conducting. They did note that data from another four groups appears to be restorable. Also unclear is whether the research groups who lost their data will be reimbursed for the money spent conducting research on the university's supercomputer system. Such costs are notoriously high, running into the hundreds of dollars per hour of computing time.

Some news outlets are reporting that the backup system was supplied by Hewlett-Packard and that the failure occurred after an HP software update. The same outlets are also reporting that HP has accepted blame for the data loss and is offering to make amends. The team at the university reported that the backup procedure was halted as soon as it became clear that something was awry, and university officials suggest that, in the future, incremental backup procedures will always be used to prevent the loss of data.
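
The article does not show the script itself, so the following is only a rough sketch of the kind of guard a "delete old log files" job might use, not the university's or HP's actual code. The function names find_old_logs and purge_old_logs, the *.log pattern, and the dry-run default are all assumptions made for illustration. The idea is to refuse an empty or root path up front and to report what would be deleted before deleting anything.

```python
import time
from pathlib import Path


def find_old_logs(log_dir, max_age_days, now=None):
    """Return *.log files directly under log_dir older than max_age_days."""
    # Refuse an empty or unset directory: an empty string would otherwise
    # silently resolve to the current working directory.
    if not log_dir or not str(log_dir).strip():
        raise ValueError("log_dir must be a non-empty path")
    root = Path(log_dir).resolve()
    if not root.is_dir():
        raise ValueError(f"not a directory: {root}")
    if root == Path(root.anchor):
        raise ValueError("refusing to operate on a filesystem root")
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in root.glob("*.log")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def purge_old_logs(log_dir, max_age_days, dry_run=True):
    """Delete old logs; with dry_run=True (the default) only report them."""
    victims = find_old_logs(log_dir, max_age_days)
    if not dry_run:
        for path in victims:
            path.unlink()
    return victims
```

Calling purge_old_logs(some_dir, 7) returns the list of files that would be removed and leaves them in place; passing dry_run=False performs the deletion.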

TOPICS: Computers/Internet
KEYWORDS: filesystem


To: Secret Agent Man

“i still remember one of my roommates telling me the story that he couldnt get his ibm pc-at with the 100mb hard drive because the store guy would not sell him one - “no one will ever need a 100mb hard drive”, he said.”

I don’t think 100 mb drives were available on the AT?

The old joke referred to the 256kb max RAM on the XT.

21 posted on 01/04/2022 3:57:56 PM PST by TexasGator (UF)


They lost it? Have they looked in China?

22 posted on 01/04/2022 4:02:42 PM PST by curious7


To: Flick Lives

“A professional IT Dept should have multiple generations of backups. This article makes it sound like they keep a grand total of 1 backup. Doesn’t say much for the Information Infrastructure Division.”

Amazon says their LUSTRE system can retain daily backups for up to 90 days.

23 posted on 01/04/2022 4:10:37 PM PST by TexasGator (UF)


To: dfwgator

Stop it. What for? No one is ever going to need more than 640K RAM.

24 posted on 01/04/2022 4:12:03 PM PST by gnarledmaw (Hive minded liberals worship leaders, sovereign conservatives elect servants.)


To: Secret Agent Man

Heck, I remember my surprise at the first disk full error on a retrofitted 10 meg drive added to an IBM-PC that originally had two 5-1/4” floppies at 360k each.

25 posted on 01/04/2022 4:16:33 PM PST by FreedomPoster (Islam delenda est)


To: ShadowAce

That’s...a lot of lost data...

26 posted on 01/04/2022 4:17:45 PM PST by Republican Wildcat


To: TexasGator

We were all computer guys back then

His dad had money so he could get a true ibm pc-at i286 processor

It may have been able to work with a special controller card

But i believe the story, i ran into salespeople like that too

27 posted on 01/04/2022 4:20:01 PM PST by Secret Agent Man (Gone Galt; not averse to Going Bronson.)


To: gnarledmaw

That’s what I had to play King’s Quest from Sierra games.

28 posted on 01/04/2022 4:25:29 PM PST by EEGator


To: Flick Lives

I keep multiple backups of my personal files, my video files, and my audio files. I never trust one backup machine.

29 posted on 01/04/2022 4:30:53 PM PST by ProtectOurFreedom (81 million votes...and NOT ONE "Build Back Better" hat)


To: ShadowAce

All my data is backed up on 6 separate nonlinked computers and multiple different types of storage media since 1968 when there wuz IBM tape drives and paper cards...never lost one byte...

30 posted on 01/04/2022 4:35:23 PM PST by bunkerhill7 (That`s 464 people per square foot! Is this corrrect..it was NYC.)


To: webheart

Back in the old days when disk was expensive, most business data was on tape. A magnetic tape reel could store up to 175MB. And company data centers had racks of them. What happened since then is more data stays on disk.

31 posted on 01/04/2022 4:46:09 PM PST by SauronOfMordor (A Leftist can't enjoy life unless they are controlling, hurting, or destroying others)


To: ShadowAce

Institute for Information Management and Communication lost 77TB... does it get any more embarrassing?

32 posted on 01/04/2022 4:47:33 PM PST by Chode (there is no fall back position, there's no rally point, there is no LZ... we're on our own. #FJB)


To: ShadowAce

A 12 terabyte hard drive is 300-400 bucks.

6 x 12 = 72 terabytes.

6 x $400.00 = 2400 bucks, add another, it's around 3000 bucks.

Who were the IT clowns that didn't want to shell out 3 grand to cover this?

Double it. Who were the IT clowns that didn't want to shell out 6 grand to cover this?

The research groups would have burned their hands pulling out their wallets to buy their own redundant backup if they had known that clowns were handling their data.

33 posted on 01/04/2022 4:58:29 PM PST by kiryandil (China Joe and Paycheck Hunter - the Chink in America's defenses)


To: ShadowAce

Might have been a good idea to verify the backups before you need them. Also have more than one, possibly even using different technology, etc. (and verify those too)

34 posted on 01/04/2022 5:09:15 PM PST by Still Thinking (Freedom is NOT a loophole!)


To: ShadowAce

Same kind of software used by Jones, Hansen and Mann for their source data that supports their Global Warming Conspiracy theory.

35 posted on 01/04/2022 5:52:09 PM PST by RetiredTexasVet (When Satan craps another demon possessed Progressive is born.)


To: webheart

The acronym GIGO comes to mind: Garbage In, Garbage Out.
______________________________________

Hey I did data entry on an IBM System 36. When you purchased yogurt and it rang up $1.23 rather than the correct price, THAT WAS ME.

36 posted on 01/04/2022 6:03:15 PM PST by BarbM (FU Pence. You refuse to be alone with a woman, but have no compunction in screwing the USA))


To: SauronOfMordor

In 1998 I worked for a company that still used this system. I ran backups. I couldn’t believe they still used the old system. Except, the system worked. It rarely, if ever, went down and they always had backup.

May God bless the men who invented IBM System 36.

37 posted on 01/04/2022 6:05:59 PM PST by BarbM (FU Pence. You refuse to be alone with a woman, but have no compunction in screwing the USA))


To: BarbM

You backup to tape, put the tape in a room, and buy more tape reels. It was cheaper to buy more tape than to risk losing data.

38 posted on 01/04/2022 6:12:38 PM PST by SauronOfMordor (A Leftist can't enjoy life unless they are controlling, hurting, or destroying others)


To: ShadowAce

Find the person at the university who designed a grossly inept backup "strategy" that apparently a) kept only one backup copy of the data set, and b) deleted that backup before writing a new one. No second backup to alternate; no set of three or more to rotate. So when the bug wiped the working data set, there was no backup.

I'm flabbergasted at this inconceivable level of irresponsibility. Granted HP screwed up with the script bug, but they should also take the guy who designed the "backup strategy" out back and shoot him.

39 posted on 01/04/2022 7:01:30 PM PST by dayglored ("Listen. Strange women lying in ponds distributing swords is no basis for a system of government.")


To: webheart

“I’ve been working in IT for decades and I have never backed up anything. Hahahahaaaa! Most of what people call data is useless.”

I’d back up the old drive on the new drive — it would take up a tenth of it. Somewhere on my desktop PC today is a 10mb backup of my first hard drive.

40 posted on 01/04/2022 7:12:14 PM PST by Born to Conserve

