Free Republic
Browse · Search
General/Chat
Topics · Post Article


1 posted on 06/16/2011 11:45:14 AM PDT by ShadowAce


To: rdb3; Calvinist_Dark_Lord; GodGunsandGuts; CyberCowboy777; Salo; Bobsat; JosephW; ...

2 posted on 06/16/2011 11:46:18 AM PDT by ShadowAce (Linux -- The Ultimate Windows Service Pack)

To: ShadowAce

Uh... okay; now, in English, please. ;-)


3 posted on 06/16/2011 11:56:56 AM PDT by Jack Hammer

To: ShadowAce

Which reminds me of the time I strolled into work at 5:00 AM and found the system administrator with her head down in tears on the keyboard and a front office executive standing over her. A system admin might just, possibly, be in at 5:00 AM, but a front office type, never. Seems she installed an update to the Solaris operating system on one unit, did some “tests”, decided that everything worked OK, and proceeded to install it on the other. As rosy fingered dawn broke over Ontario, the high bay became crowded with engineers and programmers who were “on the clock” with nothing to do but cheer good old Stella on.


5 posted on 06/16/2011 11:59:45 AM PDT by Lonesome in Massachussets (Somewhere in Kenya a village is missing its idiot)

To: ShadowAce

My favorite story, from ten years ago, involves an application program that destroyed the operating system. It was a Solaris 8 env. The first time we ran the program in production, it overwrote the root filesystem, making our powerful Sun box with 28 processors and 28 gigs of memory worthless.

We immediately went into disaster recovery mode, and brought up production on the UAT server. Of course, the first thing they did was run the same program, which wiped out that machine as well.


6 posted on 06/16/2011 12:02:26 PM PDT by proxy_user

bflr


13 posted on 06/16/2011 12:21:28 PM PDT by absolootezer0 (2x divorced tattooed pierced harley hatin meghan mccain luvin' REAL beer drinkin' smoker ..what?)

To: ShadowAce

Don’t forget the “Infinite troubleshooting” - customer has a problem. System taken offline to troubleshoot and repair. 12 hrs later, having still not reached a fix ... the executive finally made the call to execute DR for that system. RTO and RPO were both less than 2 hrs.
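
To put the arithmetic in code form, here is a minimal sketch of an escalation check. The helper name `should_declare_dr` and the timestamps are made up for illustration; only the 2-hour RTO and the 12-hour outage come from the story. The idea is simply that once the outage has run as long as the RTO, you stop debugging and fail over.

```python
from datetime import datetime, timedelta

# Hypothetical helper, not from any real DR tool: decide whether to stop
# troubleshooting and execute disaster recovery instead.
def should_declare_dr(outage_start: datetime, now: datetime, rto: timedelta) -> bool:
    """Return True once the outage has lasted at least as long as the RTO."""
    return now - outage_start >= rto

# Illustrative values only; the start time is an assumption.
outage_start = datetime(2011, 6, 16, 8, 0)
rto = timedelta(hours=2)

# One hour in: still inside the recovery time objective.
print(should_declare_dr(outage_start, outage_start + timedelta(hours=1), rto))
# Twelve hours in, as in the story: the RTO passed ten hours ago.
print(should_declare_dr(outage_start, outage_start + timedelta(hours=12), rto))
```

A check like this only helps if somebody is actually watching the clock, which was the missing piece here.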


16 posted on 06/16/2011 12:27:03 PM PDT by taxcontrol

To: ShadowAce

bkmk


17 posted on 06/16/2011 12:31:40 PM PDT by Sergio (An object at rest cannot be stopped! - The Evil Midnight Bomber What Bombs at Midnight)

To: ShadowAce

Not quite as bad as some of these, but one of my clients spent a great deal on “customized software” when comparable (probably better) software was available from a major vendor. They neglected to force the developers to provide documentation of any sort. The software they chose always had problems, and less than a year after the project was completed, the company that developed it went out of business and the developers scattered to the four corners of the earth.


18 posted on 06/16/2011 12:32:08 PM PDT by The Sons of Liberty (Psalm 109:8 Let his days be few and let another take his office. - Mene, Mene, Tekel, Upharsin)

To: ShadowAce
"We just acquired an implementation company that also did installs of our largest competitor's software....

...

"...what do you mean the acquired company's guys were using their inside knowledge to access our competitor's confidential information??"

(actually, this is now more of a tale of woe for Legal)

24 posted on 06/16/2011 1:00:34 PM PDT by martin_fierro (< |:)~)

To: ShadowAce

A buddy and his programming staff were given two weeks’ notice after meeting a deadline with code that met the specification. They had been expecting an ‘atta boy’ or a ‘congrats’, not a pink slip. Management takes the code to customers, who love the new software but ask, ‘could you make it do X & Z too!?’

Management goes to my buddy with the request, and he tells them it would take little effort to make those enhancements, but all the new programmers they would have to hire would need a few months to get up to speed on the code before they could tackle the changes. ‘What about you and your staff?’ ‘Sorry, but we all have new jobs and all of us leave tomorrow. Why did you get rid of us all?’

They confessed that they wanted to get rid of all those expensive programmers to save money and look smart to their managers.


26 posted on 06/16/2011 1:29:31 PM PDT by pikachu (After Monday and Tuesday, even the calender goes W T F !)

To: ShadowAce

bookmark


28 posted on 06/16/2011 1:53:21 PM PDT by FourPeas ("Maladjusted and wigging out is no way to go through life, son." -hg)

To: ShadowAce

I heard of a case back in the bad old days of removable platter drives where the admin got a call at home in the middle of the night to inform him that the primary copy of their data had failed. Not too concerned, he asked if they had mounted the backup copy, and was told that they had done so, only to find out that the problem was in the drive, when it destroyed the backup. Don’t know if they also had tape for a second layer of backup.


29 posted on 06/16/2011 2:00:46 PM PDT by Still Thinking (Freedom is NOT a loophole!)

To: ShadowAce
The cleaner unplugged it

Pay attention to log files. More than once I have seen perfectly planned and executed offsite failovers felled because nobody realised the cleaner at the backup site was liable to unplug the servers, for example to charge an iPod. This is not an urban legend.

Then there was the manager of the building containing the mission-critical mainframe processing real time test data. He conducted a tour of his facility for some visitors and at one point in the tour he pointed out the main power switch to the mainframe - and cycled the switch off and back on!!! Scratch one expensive test, and scratch (quite literally) all the big, expensive hard disks supporting the operation.

Sigh . . .

On a smaller scale, there was the large computer which would go crazy every now and then.

Who knew that the steel wool pad on the floor cleaning machine would put iron filings in the air, or that they would randomly short out whatever printed circuit they settled on? Certainly not the janitor!


35 posted on 06/16/2011 2:53:01 PM PDT by conservatism_IS_compassion (DRAFT PALIN)

To: ShadowAce

Hurricane Katrina made landfall directly over our manufacturing plant along the Gulf Coast in Mississippi. The computer center there was flooded to almost ceiling level. Our Dell storage area network with all the local servers, disk drives, etc. was completely submerged in a stinking, muddy mess.

Come to find out, our fancy ‘distributed’ document management system is a combination of central and local storage. Whenever someone ‘local’ would access a blueprint file in edit mode, the system would move the file from ‘central’ storage to ‘local’ storage - this to improve the speed of accessing the file.

The ‘local’ files were in a RAID 5 configuration with weekly full and nightly incremental disk-to-disk backup... plus tape backups stored in the datacenter - now all ruined.

The off-site, month-old backup was in a local bank deposit box. But the bank did not open for about a month after Katrina. The bank-located backups recovered just fine to a sister plant located in TN. But within the lost month, some of the company’s blueprint files had been moved to local storage. In all, a few dozen critical blueprints from across the company existed only on the muck encrusted data disks.

Luckily, a company specializing in recovering data from damaged disks was able to retrieve all the lost engineering files. But not until several weeks had passed and over $100k had been spent...
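
In hindsight, the exposure could have been spotted with a very simple inventory check. This is only a sketch under my own assumptions: the function `files_at_risk`, the file names, and the dates are all invented for illustration and have nothing to do with the actual document management system. It lists files that were moved to local storage after the last off-site backup, meaning they exist nowhere but the local site.

```python
from datetime import date

# Hypothetical inventory check; the function, file names, and dates are
# made up for illustration.
def files_at_risk(local_files: dict[str, date], last_offsite_backup: date) -> list[str]:
    """Return files moved to local storage after the last off-site backup.

    Those files exist only at the local site, so losing the site loses them.
    """
    return sorted(
        name
        for name, moved_on in local_files.items()
        if moved_on > last_offsite_backup
    )

# Invented example data: file name -> date it was moved to local storage.
local_files = {
    "pump-housing.dwg": date(2005, 7, 20),
    "valve-body.dwg": date(2005, 8, 15),
    "bracket.dwg": date(2005, 8, 25),
}

print(files_at_risk(local_files, last_offsite_backup=date(2005, 7, 29)))
```

Anything this returns is a file that a month-old off-site backup will not save.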


36 posted on 06/16/2011 4:11:39 PM PDT by cheee (Good, Fast, Cheap ... you can only pick two...)

To: ShadowAce

Between stupid users and self-inflicted pain, my horror stories are so numerous, I just don’t know where to begin. Lol


37 posted on 06/16/2011 4:17:48 PM PDT by KoRn (Department of Homeland Security, Certified - "Right Wing Extremist")

To: ShadowAce

Back in the day, no, the day before that, the difference between taking a full Friday night backup and conducting a full system restore was a sleepy operator typing either a “1” or a “2” at 04:00 Saturday morning. You guessed it. That operator mounted tape after tape, blindly following the system prompts until better than half of the production data on our System/370 had been overwritten with the previous week’s data. The boss, a couple of my cohorts, and I spent the next 55 hours in the DC straightening out that mess.
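
For what it’s worth, the usual guard against this kind of thing is to make the destructive choice harder to type than the safe one. A minimal sketch, assuming “1” meant backup and “2” meant restore (I am guessing at the order), with a made-up `choose_action` function that is not from any real operator console:

```python
# Hypothetical operator prompt logic. The menu codes "1" and "2" come from
# the story; which code meant which action, and everything else here, is an
# assumption for illustration.
def choose_action(menu_choice: str, confirmation: str = "") -> str:
    """Map an operator's menu choice to an action.

    Backup ("1") does not touch production data, so it needs no confirmation.
    Restore ("2") overwrites production data, so it proceeds only when the
    operator also types the word RESTORE. Anything else aborts.
    """
    if menu_choice == "1":
        return "backup"
    if menu_choice == "2" and confirmation.strip().upper() == "RESTORE":
        return "restore"
    return "abort"

print(choose_action("1"))             # safe path, no confirmation needed
print(choose_action("2"))             # slip of the finger: nothing happens
print(choose_action("2", "restore"))  # deliberate restore
```

One extra word to type at 04:00 is a lot cheaper than 55 hours in the DC.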


44 posted on 06/17/2011 6:11:04 AM PDT by Ol' Sox

To: ShadowAce
Here's my fun little horror story for this thread:

Years ago, I was working in a datacenter that had about 13 HP-3000s with about 120 or so big washtub disk drives strung out the back on the floor. We also had a water-cooled IBM 3090, and a couple of miscellaneous DEC 11/780s. With all these systems up and running, this was a loud room to work in.

One day we had some fine fellows hanging wallpaper in the corridor that connected the computer room to the secure area you had to go through to enter.

There was a Big Red Button on the wall that had the words "EMERGENCY POWER CUT OFF" printed over it in large red letters. This button also had a cover over it that had to be pulled up in order to press the button, so no one could bump it by accident.

You remember those fine fellows hanging wallpaper? Well, they had to take that cover off the BRB so they could hang their wallpaper.

It was about 4:30 or so and we were right in the middle of shift change. A bunch of us were standing around talking and passing on info about what had happened the previous shift and what was coming up. Suddenly, we heard a huge BANG and it went dark as we heard the slow wind-down of all the fans, drives, and computers.

It became really quiet in that room. A kind of quiet you seldom hear; our ears were so accustomed to the drone of the fans and drives that the lack thereof was even more profound than it might otherwise have been. We all kind of looked at each other and then looked out to the corridor, and saw where one of the fine fellows hanging wallpaper had accidentally brushed up against that Big Red Button.

The aftermath was kind of interesting. We went through, hit the individual power switches on all the disk drives and other peripherals, to turn them off, then brought power back on.

Once we were sure power was stable, we started turning on the drives to the HP-3000 boxes. Out of the 12 systems (which had thousands of users), 9 of them just kinda sat there and waited until all of their drives were fully up and connected, then started executing the next instruction in their stack. I didn't know it at the time, but apparently they contained their own battery backup in the chassis, and just waited until they could continue right on with their work! To the thousands of users connected to these systems, their terminals just froze for a while, then continued right where they had left off. Total downtime for most of these systems was 15-30 minutes. (This represented thousands of hours of cumulative user downtime.)

The 3090 didn't fare so well. Apparently, they don't particularly like it when someone just yanks power from underneath them. Took about 13 or so hours to get it fully operational again.

I'll never forget what the Sound of Silence is really like.

47 posted on 06/17/2011 9:48:10 AM PDT by zeugma (The only thing in the social security trust fund is your children and grandchildren's sweat.)

To: ShadowAce
Ah, geek war stories! Love 'em. Lemmeesee...there was the time one of the smartest bosses I ever had (no sarcasm there, the guy really was sharp) was testing this new patching software and only meant to scan every box in the environment but there was this one little checkbox that said "reboot"... Yeah. Every one of a little over 100 servers. Simultaneously. In the middle of the business day. So a shell-shocked CIO came down and asked why all the servers went offline simultaneously, and the boss looked up at him innocently and said, "Why, because I rebooted them all," and turned back to his monitor. Dead silence in the room. The CIO shrugs, and says, "Oh, OK."

After he left the boss turned to us, who were choking back tears, and said, "That won't work on me."

54 posted on 06/17/2011 10:46:58 AM PDT by Billthedrill

To: ShadowAce
6. The cleaner unplugged it. Pay attention to log files. More than once I have seen perfectly planned and executed offsite failovers felled because nobody realised the cleaner at the backup site was liable to unplug the servers, for example to charge an iPod. This is not an urban legend.

It really does happen. At the MN Supercomputer Center we spent months trying to figure out why the Crays were crashing inexplicably. As the lead software tech I was huddled at the console at 3am with a bunch of CRI engineers while the janitor was at the opposite end of the room with a power sweeper. We noticed that he brushed a rack on the far end of the room at exactly the moment that the Cray crashed. We had him do it again. The Cray crashed again.

What happened is that the racks were bolted to the metal grid of the raised floor panels. An electrical spark ran from his sweeper through the floor grid and into the heaviest ground wire in the room, which was connected to the Cray. So basically any spark or electrical glitch anywhere in the room was being funneled into the poor Cray.

57 posted on 06/17/2011 11:08:49 AM PDT by Gideon7



FreeRepublic, LLC, PO BOX 9771, FRESNO, CA 93794
FreeRepublic.com is powered by software copyright 2000-2008 John Robinson