Posted on 04/02/2020 9:41:53 AM PDT by dayglored
US air safety bods call it 'potentially catastrophic' if reboot directive not implemented
The US Federal Aviation Administration has ordered Boeing 787 operators to switch their aircraft off and on every 51 days to prevent what it called "several potentially catastrophic failure scenarios" including the crashing of onboard network switches.
The airworthiness directive, due to be enforced from later this month, orders airlines to power-cycle their B787s before the aircraft reaches the specified days of continuous power-on operation.
The power cycling is needed to prevent stale data from populating the aircraft's systems, a problem that has occurred on different 787 systems in the past.
According to the directive itself, if the aircraft is powered on for more than 51 days this can lead to "display of misleading data" to the pilots, with that data including airspeed, attitude, altitude and engine operating indications. On top of all that, the stall warning horn and overspeed horn also stop working.
This alarming-sounding situation comes about because, for reasons the directive did not go into, the 787's common core system (CCS), at heart an Intel Wind River VxWorks realtime OS product, stops filtering out stale data from key flight control displays. That stale data-monitoring function going down in turn "could lead to undetected or unannunciated loss of common data network (CDN) message age validation, combined with a CDN switch failure".
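To make "message age validation" concrete, here is a minimal sketch of the general idea: a receiver compares each message's timestamp with the current time and discards anything too old. This is not Boeing's implementation. The `Message` type, the field names and the 200 ms limit are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Message:
    name: str          # hypothetical parameter name, e.g. "airspeed"
    value: float
    sent_at_ms: int    # sender timestamp in milliseconds (assumed field)


MAX_AGE_MS = 200       # hypothetical freshness limit, not from the directive


def is_fresh(msg, now_ms, max_age_ms=MAX_AGE_MS):
    """Accept a message only if its age lies between 0 and max_age_ms."""
    age_ms = now_ms - msg.sent_at_ms
    return 0 <= age_ms <= max_age_ms


def filter_fresh(messages, now_ms):
    """Drop stale messages so that only current data reaches the display."""
    return [m for m in messages if is_fresh(m, now_ms)]
```

If a check like this silently stops running, old values pass through as though they were current, which is the kind of "misleading data" the directive describes.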
Solving the problem is simple: power the aircraft down completely before reaching 51 days. It is usual for commercial airliners to spend weeks or more continuously powered on as crews change at airports, or ground power is plugged in overnight while cleaners and maintainers do their thing.
The CDN is a Boeing avionics term for the 787's internal Ethernet-based network. It is built to a slightly more stringent aviation-specific standard than common-or-garden Ethernet, that standard being called ARINC 664. More about ARINC 664 can be read here.
Airline pilots were sanguine about the implications of the failures when El Reg asked a handful about the directive. One told us: "Loss of airspeed data combined with engine instrument malfunctions isn't unheard of," adding that there wasn't really enough information in the doc to decide whether or not the described failure would be truly catastrophic. Besides, he said, the backup speed and attitude instruments are for obvious reasons completely separate from the main displays.
Another mused that loss of engine indications would make it harder to adopt the fallback drill of setting a known pitch and engine power setting that guarantees safe straight-and-level flight while the pilots consult checklists and manuals to find a fix.
A third commented, tongue firmly in cheek: "Anything like that with the aircraft is unhealthy!"
A previous software bug forced airlines to power down their 787s every 248 days for fear that electrical generators could shut down in flight.
Airbus suffers from similar issues with its A350, with a relatively recent but since-patched bug forcing power cycles every 149 hours.
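The 248-day interval mentioned above happens to match the time a signed 32-bit counter takes to overflow when it ticks once every hundredth of a second. The article does not state the cause, so the following is arithmetic only, with the counter width and tick rate as assumptions.

```python
SECONDS_PER_DAY = 86_400


def days_until_overflow(bits, ticks_per_second, signed=True):
    """Days of continuous uptime before a tick counter of this width overflows."""
    max_ticks = 2 ** (bits - 1) if signed else 2 ** bits
    return max_ticks / ticks_per_second / SECONDS_PER_DAY


# Assumed: signed 32-bit counter, 100 ticks per second (10 ms per tick)
print(round(days_until_overflow(32, 100), 2))  # 248.55
```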
Persistent or unfiltered stale data is a known 787 problem. In 2014 a Japan Airlines 787 caught fire because of the (entirely separate, and since fixed) lithium-ion battery problem. Investigators realised the black boxes had been recording false information, hampering their task, because they were falsely accepting stale old data as up-to-the-second real inputs.
More seriously, another 787 stale data problem in years gone by saw superseded backup flight plans persisting in standby navigation computers, and activating occasionally. Activation caused the autopilot to wrongly decide it was halfway through flying a previous journey and manoeuvre to regain the "correct" flight path. Another symptom was for the flight management system to simply go blank and freeze, triggered by selection of a standard arrival path (STAR) with exactly 14 waypoints such as the BIMPA 4U approach to Poland's rather busy Warsaw Airport. The Polish air safety regulator published this mildly alarming finding in 2016 [2-page PDF, in Polish].
This was fixed through a software update, as the US Federal Aviation Administration reiterated last year. In addition, Warsaw's BIMPA 4U approach has since been superseded.
The Register asked Boeing to comment. ®
Just what one wants to hear as the plane gets in the air: a recording saying this plane is fail safe fail safe fail safe...
Wind River RTOS used to be about as solid as you can get, and was used in several space projects.
Is it an Intel problem created since they bought Wind River, or is it a sloppy programming/testing issue at Boeing's subcontractor?
I power my car down at least once a day for this very reason.
But if you do it too often, it screws up all the clocks.
Yep, and then there’s Leap Year...
Wish I could get 51 days out of Windows...
LOL
Link? Some reference? I'd like to follow that up.
As a former instrument technician and then a software guy, this sounds like a software problem to me.
Wish I could get 51 days out of Windows...
+++++++++++++++++++++++++++++
I have an old - close to 10 years - Win 7 machine that runs 24/7 for months on end. Only reboots by accident or power failure. So it’s possible, at least on Win 7.
Ctrl-Alt-Delete doesn’t work, whoda thunk?
I was making a software presentation to Boeing engineering once and one of their uh, multicultural software people asked me, "What's an opcode?"
sounds like someone used milliseconds in a variable that flips over after 51 days
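A rough sketch of the failure mode this comment guesses at, assuming an unsigned 32-bit millisecond uptime counter (an assumption, nothing in the directive confirms it). Such a counter wraps after 2^32 ms, which is about 49.7 days rather than exactly 51, so the guess is close but unverified. The function names are made up for illustration.

```python
MASK32 = 0xFFFFFFFF  # keeps values within an unsigned 32-bit range


def tick_ms(uptime_ms):
    """Uptime as an unsigned 32-bit millisecond counter would report it."""
    return uptime_ms & MASK32


def naive_age(now_tick, stamp_tick):
    """Plain subtraction: gives a negative age once the counter has wrapped."""
    return now_tick - stamp_tick


def wrap_safe_age(now_tick, stamp_tick):
    """Modular subtraction: correct across a single wraparound."""
    return (now_tick - stamp_tick) & MASK32


stamp = tick_ms(2**32 - 100)   # data stamped 100 ms before the wrap
now = tick_ms(2**32 + 50)      # checked 50 ms after the wrap

print(naive_age(now, stamp))      # -4294967146
print(wrap_safe_age(now, stamp))  # 150
```

With the naive subtraction, an age check that only tests for "older than some limit" never fires after the wrap, so stale data would pass as fresh.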
I flew home from Copenhagen on a Dreamliner. That 11-1/2 hour flight was the most pleasant, relaxing plane ride ever. And no jet lag because of the extra fresh air they pump in that keeps you refreshed. Sorry to hear about the problems.
That’s not a very good operating system. An op system must be robust and not do stupid things.
So, the plane is never taken offline (say, during FUELING), in that 51 day period?
I don’t buy it. PFQ would have you turn things off and back on again, I’d think, just to make sure they do come back up.
... and I wonder what exactly is involved in “turning off and turning on” a Boeing 787. I bet it is not as simple as flicking a light-switch. If it’s a “cold-start” rather than a simple reboot, it might take 12-24 hours ... very costly downtime for the airline!
So, write an instruction in the code so that every 30 days the entire system, with human intervention and approval, reboots. Like, perhaps, during refueling.
So if you're flying on day 50, you'll be fine. Promise.
“What’s an opcode?”
Boeing can be proud that they support such a noble cause. Proud all the way to the scrap heap of history.