Thoughts from the Boeing 737 Max 8 Saga

As an avid traveler with a fascination for airplanes, I have found the Boeing 737 Max 8 saga incredibly intriguing. For those who are unaware, last year a Lion Air flight crashed within 15 minutes of takeoff. Black box data showed the plane’s nose rising and falling as the pilots fought to climb. What we have since discovered is that the pilots were actually fighting against the airplane itself. Less than five months later, an Ethiopian Airlines flight also crashed while flying a relatively new 737 Max 8. The model has since been grounded worldwide until a permanent software fix from Boeing has been fully developed and rolled out.

History: Boeing had previously developed the 737-800. To differentiate the new model, they added larger, more powerful engines and moved them slightly forward on the wing. This inherently created a tendency for the nose to pitch up, so the plane could potentially stall during takeoff if the angle of attack became too steep. To prevent this hazard, Boeing, an airplane company, added a software feature to solve one problem, and in doing so inadvertently put millions of passengers and crew members at risk. Why? Because the change was undocumented and other safety features were “premium add-ons” behind a paywall. “Pilots did not have sufficient training to understand how MCAS worked, and two vital safety features—a display showing what the sensor detected, and a light warning if other sensors disagreed—were optional extras [paywall].” (1)

The Maneuvering Characteristics Augmentation System (MCAS) was installed to work with a sensor in the nose of the aircraft that detects when the angle of attack is steep enough to cause a stall. When that happens, MCAS engages and pushes the nose of the plane down. However, if a faulty sensor reported an inaccurate angle, MCAS would activate even though the plane was not actually at risk of stalling. There are 4 steps a pilot must take to disengage the system. The pilots performed the first step, hitting a “rebalancing switch”, over and over. However, they were unaware of the 3 additional steps needed to rectify the situation. (2)

  1. Additional Step 1: Press a switch to turn off a motor controlling the angle of the nose.
  2. Additional Step 2: Press a second switch to confirm the motor shut down.
  3. Additional Step 3: Turn a wheel to re-angle the plane’s nose and stop the dive.

Unfortunately, “the plane pushed its nose down for over 9 minutes before it hit the sea. The report (3) said audio from a cockpit recorder showed the pilots seeking a solution in the plane’s technical manual but ultimately running out of time.” (2)

“The manufacturer has said that to handle the situation there is a documented procedure that must be memorized. A different crew on the same plane the evening before encountered the same problem but solved it after running through three checklists, according to the November report. But they did not pass on all of the information about the problems they encountered to the next crew, the report said.” (3)

It’s also been reported that pilot certification on the 737 Max 8 consisted of a short online training course. “Pilots of Southwest Airlines and American Airlines took courses — lasting between 56 minutes and three hours — that highlighted differences between the Max 8 and older 737s, but did not explain the new [MCAS].” (4) “The self-administered transition course for American Airlines pilots was a 56-minute online course, Tajer said, which he completed on his iPad. It was broken up into four broad sections, including a general description of changes to the aircraft, its engines, and its instrument panel. But an explanation or even an acknowledgment of the MCAS system was again missing, Tajer said.”

In Summary:

I truly believe most people and businesses create and innovate with good intentions, but occasionally they execute the plan poorly. As a software company, Camelot makes many changes to our software, including fixes for issues that would otherwise cause problems.

Earlier I referred to Boeing as an “airplane company”, not a “software company”. What I’ve learned from my experience at Camelot is that software users will inevitably find every possible button and try it under strange and complicated conditions, regardless of whether those conditions, sequences, or combinations of settings have been tested or documented. There are also functions that depend on reliable inputs, and if those inputs are faulty, other functions may be affected.
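
To make the point about unreliable inputs concrete, here is a minimal Python sketch of one common defensive pattern: cross-check redundant inputs before acting on them, and refuse to act automatically when they disagree. This is only an illustration of the general idea. It is not how MCAS or any real flight software works, and the function names (`trusted_reading`, `should_intervene`) and both thresholds are assumptions made up for this example.

```python
from statistics import median
from typing import Optional, Sequence

# Hypothetical threshold: the largest spread (in degrees) we tolerate
# between redundant readings before we stop trusting them.
MAX_DISAGREEMENT_DEG = 5.0


def trusted_reading(readings: Sequence[float],
                    max_disagreement: float = MAX_DISAGREEMENT_DEG) -> Optional[float]:
    """Return a value that is safe to act on, or None if the inputs cannot be trusted."""
    if len(readings) < 2:
        # A single input has nothing to be checked against.
        return None
    if max(readings) - min(readings) > max_disagreement:
        # The inputs disagree with each other, so at least one is faulty.
        return None
    return median(readings)


def should_intervene(readings: Sequence[float], limit_deg: float = 15.0) -> bool:
    """Act automatically only when the inputs agree AND exceed the (hypothetical) limit."""
    value = trusted_reading(readings)
    if value is None:
        # Unreliable input: do nothing automatically and leave it to the human.
        return False
    return value > limit_deg


if __name__ == "__main__":
    print(should_intervene([16.0, 17.0]))  # inputs agree and exceed the limit
    print(should_intervene([4.0, 74.5]))   # inputs disagree, so no automatic action
```

The design choice worth noticing is that the "cannot trust the input" case is handled explicitly and is kept separate from the "input is within limits" case, so a downstream function never mistakes a faulty reading for a real one.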

In the case of an airplane, the stakes are much higher and the consequences are dire. Thank goodness what we do doesn’t directly determine whether people live or die, but there are a lot of parallels with Boeing. Documenting a feature is as important as programming it and putting it to use.

Takeaways to consider:

  • When your company experiences turnover, how do you ensure key information will pass from one “flight crew” to the next?
  • When your users find issues, are they documenting them and reporting them promptly?
  • When a “disaster” or system outage occurs, do you have clear steps/checklists to resolve and overcome your challenges?
    • What are your contingency plans?

Also See: Disaster Recovery and Planning for more information to increase your preparedness.

Sources:
