What the Max story says about safety oversight today

Yesterday the US Federal Aviation Administration joined most of the rest of the aviation world in grounding the Boeing 737 Max series of aircraft, the very latest version of the established 737 series. What took it so long?

Having entered service in May 2017, by early March this year the Max had suffered two fatal crashes within five months. This is extraordinary for a new commercial airliner today.

Evidence from the preliminary report on the earlier of the two accidents suggests a technical failure precipitated it. The first event, in October 2018, involved a nearly-new 737 Max 8 belonging to Indonesian carrier Lion Air. It crashed into the sea near Jakarta about 12min after take-off. The second accident, on 10 March this year, involved an Ethiopian Airlines aircraft of the same type, and it plunged into the ground within six minutes of take-off from Addis Ababa. Pilots of both aircraft radioed that they were having trouble controlling the aircraft’s height, and this was evident on flight tracking systems.

The FAA issued its grounding order on 13 March. This was three days after the Ethiopian crash, two days after China, Ethiopia and Singapore had banned Max operations, and a day later than the influential European Aviation Safety Agency – and many other states – had done the same.

Does this demonstrate that there are different safety standards – or safety philosophies – in different countries? Or does it suggest that the relationship – in this case – between the safety regulator and the manufacturer is too close?

On 12 March, resisting calls to ground the aircraft, the FAA said: “Thus far, our review shows no systemic performance issues and provides no basis to order grounding the aircraft.”

The next day it stated: “The FAA is ordering the temporary grounding of Boeing 737 MAX aircraft operated by U.S. airlines or in U.S. territory. The agency made this decision as a result of the data gathering process and new evidence collected at the site [of the Ethiopian crash] and analyzed today. This evidence, together with newly refined satellite data available to FAA this morning, led to this decision.”

The safety principle behind aircraft design, for more than half a century, has been that all systems should “fail safe”. This means that any one critical system or piece of equipment, if it fails, will not directly cause an accident. This is achieved either by multiplexing critical systems so there is backup if one of them fails, or by ensuring that the failure does not render the aircraft unflyable.
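As a purely illustrative sketch of the multiplexing idea, and not a description of any real aircraft system, the Python below cross-checks redundant sensor readings and refuses to let an automatic function act when they disagree or when there is only one source. The function names, the 5-degree disagreement limit and the 15-degree trigger value are all hypothetical, chosen only to make the example run.

```python
from statistics import median
from typing import Optional, Sequence

# Hypothetical limit: largest spread (in degrees) tolerated between redundant sensors.
MAX_DISAGREEMENT_DEG = 5.0


def voted_reading(readings: Sequence[float],
                  max_spread: float = MAX_DISAGREEMENT_DEG) -> Optional[float]:
    """Return a trusted value from redundant sensors, or None if none can be trusted."""
    if len(readings) < 2:
        return None  # a single source cannot be cross-checked
    if max(readings) - min(readings) > max_spread:
        return None  # sensors disagree, so no reading is trusted
    return median(readings)


def automation_may_act(readings: Sequence[float],
                       trigger_deg: float = 15.0) -> bool:
    """Allow the automatic function to act only on a cross-checked reading
    that reaches the (hypothetical) trigger value."""
    value = voted_reading(readings)
    return value is not None and value >= trigger_deg
```

With these assumed values, one sensor reading 22 degrees while its twin reads 4.9 degrees leaves the automation inhibited, because the failure of one unit is detected rather than acted upon. The design choice in this sketch is to fail passive: when the inputs cannot be trusted, the system does nothing and leaves control with the crew.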

The preliminary report from the Indonesian accident investigator NTSC suggests that a factor in the sequence of events leading to it was a faulty angle of attack (AoA) sensor. This device, says the report, sent false signals to a new stall protection system unique to the Max series of 737s, known as the manoeuvring characteristics augmentation system (MCAS). According to the report, these signals wrongly indicated a very high AoA, and the MCAS triggered the horizontal stabiliser to trim the aircraft nose-down. Finally, the crew seems not to have known how to counteract this nose-down control demand.

The implication of the NTSC report – not the final verdict – is that the MCAS was not designed according to fail-safe principles: a single unit failed, causing a software-controlled automatic system to motor the powerful horizontal stabiliser to pitch the aircraft nose-down, and it kept on doing this until the crew could not overcome the pitch-down force with elevator.

At that point disaster could still have been prevented if the crew had been familiar with the MCAS, or with the drill for a runaway stabiliser trim. But the MCAS would not have been expected to trigger at climb speeds during departure. The result was that in this case the crew failed to act as the final backup safety system.

In the months immediately following the Indonesian crash, some pilot associations in the USA whose members operate the Max claimed publicly that there was widespread ignorance among Max-qualified pilots of the very existence of the MCAS, and that many pilots assumed a runaway trim could be dealt with in exactly the same way as on all the earlier 737 marques. Actually the drill is quite different for the Max, as Boeing and the US Federal Aviation Administration (FAA) have pointed out. There is more detail on the MCAS in the preceding item in this blog – “This shouldn’t happen these days”.

Somehow, therefore, many 737 Max pilots in Boeing’s home territory had found themselves un-briefed on a system that was unique to the Max. They claimed lack of detail in the flight crew operations manual (FCOM), which described the system’s function but did not give it a name. US pilots who converted to the Max were all 737 type-rated and had flown the NG marque, but their conversion course to the Max consisted of computer-based learning, with no simulator time.

This ignorance among US pilots was soon corrected because the issue got plenty of intra-industry publicity, so if a US carrier crew had suffered an MCAS malfunction they would have known to apply the runaway trim checklist and select the STAB TRIM switches to CUT OUT. Was this confidence about US crew knowledge the reason the FAA was able to maintain its sang-froid over grounding for longer than the rest?

On the other hand it is not a good principle to use a pilot as the back-up for a system that is not fail-safe.

In the 1990s there were several serious fatal accidents to 737s caused by what became known as “rudder hard-over”. This was a sudden, uncommanded move of the rudder to one extreme or the other, rendering the aircraft out of control, and unrecoverable if it happened at low altitude. The problem was ultimately solved by redesigning the rudder power control unit, for which there was no backup, thus no fail-safe.

If a Boeing product has a fault the responsibility is Boeing’s, but it is equally the FAA’s. The FAA is the safety overseer, and should satisfy itself that all critical systems are fail-safe and that the manufacturer has proven this through testing.

If America has an image it is that of the can-do, the entrepreneurial risk-taker. Why would Boeing or the FAA be different? One of the FAA’s stated values is this: “Innovation is our signature. We foster creativity and vision to provide solutions beyond today’s boundaries.”

The world has benefited from the USA’s risk-taking culture, which has driven some aviation advances faster than they would have occurred in other more risk-averse cultures like that of Western Europe. An example of this is the massive extension of ETOPS (extended-range twin-engine operations) with the arrival on the market of the Boeing 777, which ultimately drove the four-engined Airbus A340 out of the market and influenced the early close-down of the A380 line. Boeing and the FAA took the risk together, and together they got away with it.

Is the 737 Max going to prove to be the one Boeing didn’t get away with? Time will tell.

But it is certain that Boeing will find a fix that will get the Max back in the sky. And although this episode, if it runs the course it seems likely to follow, will damage Boeing, the damage will be far from terminal. The company has an unbreakable brand name by virtue of being so good for so long, but trust will have suffered.

In the world at large, the art and science of safety oversight is changing dramatically. Technology is advancing so fast that the traditional system of close oversight by the regulator cannot work without stifling innovation, so “Performance-Based Regulation” (PBR) is the new watchword. Basically this means that the regulator prescribes what performance and reliability objectives a system or piece of equipment should meet, and the manufacturer has to prove to the regulator that it meets them. This is fine, providing that the regulator insists on the testing and the proof, and has the expertise and resources to carry out the oversight.

Although lack of oversight resources in the FAA seems unlikely, it would be a global disaster if it occurred. The same would be true of other national aviation agencies (NAAs) in countries where aviation manufacturing takes place.

That risk of under-resourcing NAAs is a serious worry for the future, because all the signs are that most countries consider it a very low political priority, especially at a time of budget austerity.

 

This shouldn’t happen these days

In the last five years, statistics for fatal accidents to commercial passenger jets were so low they looked set to prove that a permanent zero fatal accident target was achievable.

Technology is accepted to be the main contributor to these remarkable safety performance improvements. The superb engineering and smart systems in the latest jets made them as different from their predecessors as today’s generation of automobiles is from cars of the 1970s.

But, on 29 October 2018, Lion Air flight JT610 crashed only about 12min after take-off from Jakarta, Indonesia. The aircraft was a Boeing 737 Max 8 that had been delivered by the manufacturer to the airline less than three months before, one of 11 of this new marque in its fleet.

That was a shock, but when on 10 March this year another almost new 737 Max 8 also crashed within a few minutes of take-off from Addis Ababa, Ethiopia under circumstances that appear similar, a chill went through the entire aviation community.

Ethiopian Airlines has grounded its 737 Max fleet, Singapore has banned Max operations in its airspace, and the Chinese aviation authority CAAC has grounded all Maxes registered there – almost sixty of them. And on 12 March Australia, Ireland, France, Germany and the UK added themselves to the rapidly growing list of those who had banned operation of the type. Late on 12 March the biggest blow fell: the European Union’s European Aviation Safety Agency banned all 737 Max 8s and 9s from its skies except to fly, empty, to maintenance bases. The agency argued that it could not be ruled out that the Ethiopian accident was caused by the same failure as that which appears to have caused the Lion Air crash. And, shortly before midnight, India joined the doubters.

Now Latin America has begun a wave of groundings and, as a result, by the end of the Western European day on 12 March more than a third of all Maxes in service around the world had been affected by effective groundings. There has never been an event like this, where the original certificating authority has declared an aircraft airworthy but much of the rest of the world has decided it is not so confident.

Back to the accident issues. The two take-off airports couldn’t have been more different, one at sea level, the other at an elevation of more than 7,000ft, but in both cases it was daylight and the weather conditions were benign.

Both aircraft were seen to dive to impact.

The Indonesian investigator (NTSC) issued a preliminary factual report that doesn’t pretend to provide a verdict on the cause of the Lion Air crash, but suggests that a factor in the sequence of events leading to it was a faulty angle of attack (AoA) sensor. This device, says the report, sent false signals to a new stall protection system unique to the Max series of 737s, known as the manoeuvring characteristics augmentation system (MCAS). According to the report, these signals wrongly indicated a very high AoA, and the MCAS triggered the horizontal stabiliser to trim the aircraft nose-down. The crew seems not to have known how to counteract this nose-down control demand.

The NTSC did, however, provide fine detail about malfunctions on the same airframe on the previous day (28 October), when almost exactly the same sequence of events occurred, including the signal from the faulty AoA sensor to the MCAS. But on that occasion the captain stopped the nose-down stabiliser trim rotation by selecting the STAB TRIM switches to CUT OUT, and then proceeded safely to the scheduled destination.

Some pilot associations in the USA whose members operate the Max have said publicly that there was widespread ignorance among Max-qualified pilots of the very existence of the MCAS, and that many of them assumed a runaway trim could be dealt with in exactly the same way as on all the earlier 737 marques. Actually the drill is different for the Max, as Boeing and the US Federal Aviation Administration (FAA) have pointed out.

The MCAS was developed for the Max because its more powerful engines are heavier and fitted further forward than those on earlier marques, affecting the aircraft’s centre of gravity and thus its behaviour at low speeds approaching the stall, so the manufacturer wanted to boost stall protection. It looks as if Boeing had either not foreseen the potential effect of a false high AoA indicator input to the MCAS, or it had failed to warn pilots clearly what that effect could be and how to react. The FAA also, it appears, had not anticipated this.

After the Lion Air crash the FAA put out an emergency airworthiness directive requiring operators of the Max to make clear to pilots the procedures for dealing with a runaway stabiliser trim. Boeing maintained that information was already available.

Pilots converting from earlier 737 marques to the Max are not required to undergo a new full type rating course or simulator sessions, because all 737s are deemed to have sufficient commonality to operate under the same type rating. Thus 737-rated pilots being prepared for the Max are required only to undergo a brief academic “differences course”. For example, pilots at Southwest Airlines and American Airlines did their differences course entirely online.

On 11 March, a day after the Ethiopian crash, the FAA revealed it has required Boeing to solve the software problem – and if applicable the hardware – that at present means that a false AoA input can trigger the MCAS stall protection when it is not needed, effectively causing a stabiliser pitch trim runaway. Meanwhile it has declared that the 737 Max series is airworthy.

But if it were to be found that there is a common cause of these two Max crashes – whatever that cause is determined to be – the implications for the manufacturer and the airlines are significant, given the massive size of the order book for 737 Max series aircraft.