Boeing is in the hot seat again. One of its 777 airliners made an emergency landing in Moscow last Friday after the pilot detected an engine problem. This came less than a week after another Boeing 777 dropped engine parts over Denver, Colorado, in mid-flight. That prompted United Airlines, Korean Air, Japan Airlines, and a few others to ground their fleets.
But here’s what makes the incident most remarkable: The plane landed safely without any injuries.
Pilots train for this routinely. Their training includes handling engine failures and flying after one engine has quit. That ability is almost “innate” for any experienced pilot. Ed Coleman, a former Air Force pilot, noted the calm tone of the UA crew recorded during the incident. “Their voices don’t even go up an octave,” he said.
Statistically speaking, flying is far safer than driving. The lifetime odds of dying in a motor accident are 1 in 98. For air transport, the odds are 1 in 7,178, which is roughly 73 times lower. So, how exactly does the aviation industry manage to reduce catastrophe? How can you design a safe system when incidents are unavoidable? You may not be a pilot, but you might want to build a safer system for your own work. How?
Separate Fact-Finding and Policy Changes
Two main players are involved in any aviation incident. There is the National Transportation Safety Board, or the NTSB. Then there is the Federal Aviation Administration, or the FAA.
The NTSB operates as an independent federal agency. Its role is to investigate accidents and to formulate safety recommendations. The FAA, on the other hand, is a branch of the Department of Transportation. It is a regulatory agency responsible for enforcement of safety rules.
In other words, rule-making is separated from fact-finding. Enforcement is cleaved from investigation.
This distinction is key. The FAA will consider policy changes only after the NTSB has completed its investigation. Such separation limits the influence of industry lobbyists. But more importantly, it shields investigators from bias while they gather data.
Now, think about the approach a company should take to conduct any after-action review.
Inquire First, Without Blame
Companies also make mistakes. A new product launch can bomb. The company website can suffer an outage. A negotiation with suppliers can collapse. An important project can miss its deadline. An incident is always a combination of technical problems and human error. You need to first identify the root cause without bias in order to remedy both.
The first step is to assemble a timeline and gather details on what happened. The goal here is to enable people closest to the incident to share what they saw. The only rule here is that you can’t say, “I should have done X” or “If I had known about that, I would have done Y.” As I have written before, hindsight is always perfect. It is not acceptable for your countermeasures to be merely “be more careful” or “be less stupid.” In a crisis situation, no one person actually knows what’s really going on. Saying what people should have done doesn’t explain why it made sense for them to do what they did.
The first step to improvement is to stick with assembling the timeline. You need to know how the event unfolded step by step.
Obviously, at this stage of the inquiry, a lot depends on the leader’s behavior. The fleeting condition that allows everyone to share openly with their peers comes from psychological safety. Some team members might become distraught during the process. They might blurt out apologies. Remember, this is not a confession. The focus should always be: Why did it make sense to me when I took that action? Suspend judgement. Focus on facts.
What typically happens is that everyone comes to learn something about how the system works. In stark contrast with their mental models of how they thought it worked, they see the reality in a new light. There are likely very specific things that people can recommend that will prevent future mistakes.
Publish the Findings as Widely as Possible
The goal of a blameless post-mortem is to record what actions people took at what time and what effects people observed. This is a marked departure from the usual subjective narrative. It also documents what resolution is being considered.
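For teams that keep their incident notes in software, a timeline record along these lines might be sketched as follows. This is only a minimal illustration: the field names (`at`, `actor`, `action`, `observed_effect`) and the website-outage entries are assumptions invented for this example, not a format used by any company mentioned in this article.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimelineEntry:
    """One factual record: who did what, when, and what they observed."""
    at: datetime
    actor: str
    action: str
    observed_effect: str


def build_timeline(entries):
    """Return the entries in chronological order."""
    return sorted(entries, key=lambda e: e.at)


def render(timeline):
    """Format the timeline as plain text, one line per entry."""
    return "\n".join(
        f"{e.at:%H:%M} | {e.actor} | {e.action} | {e.observed_effect}"
        for e in timeline
    )


# Hypothetical entries for a website outage, invented for illustration.
entries = [
    TimelineEntry(datetime(2021, 3, 1, 14, 12), "on-call engineer",
                  "restarted the web server", "error rate unchanged"),
    TimelineEntry(datetime(2021, 3, 1, 14, 5), "monitoring",
                  "raised a latency alert", "checkout page timing out"),
]
timeline = build_timeline(entries)
print(render(timeline))
```

Notice that the record has no field for fault or opinion. Each entry holds only a time, an actor, an action, and an observed effect, which keeps the notes on facts rather than blame.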
Here is the hard part. After the inquiry, we need to publish the meeting notes and all the associated evidence. That could include the timeline, the chat logs, and the results. High-performing companies do this routinely. They tend to centralize findings in a single location so the entire organization can access them and learn from past incidents.
At Bridgewater Associates, founder Ray Dalio calls such an approach “radical transparency.” Bridgewater is the world’s largest hedge fund, with around $160 billion in assets. Yet there are no “closed-door” conversations at any rank. Before Zoom or WebEx, Dalio would videotape every managerial meeting, including his own. He then hired a small team to edit the tapes, focusing on the most important moments, and turned the lessons gleaned from the tapes into case studies for employee training.
Doing this ensures that an organization translates local learnings into global improvement. Google does this through its search engine. “As you can imagine, at Google everything is searchable. All the postmortem documents are in places where other Googlers can see them,” said Randy Shoup, former engineering director for Google App Engine. “And trust me, when any group has an incident that sounds similar to something that happened before, these postmortem documents are among the first documents being read and studied.”
Or consider Booking.com, which saves all its experiments. The successes and the failures are all on its IT platform. These results are all made searchable to anyone in the company. It doesn’t matter which division you are from. Every engineer gets access to all experimental protocols and the resulting data.
The Key to Improvement
The approach of fact gathering and information publishing has served all of us well. Commercial aviation fatalities have decreased by 95 percent during the past 20 years. The only way to improve performance is to make unbiased knowledge transparent. The converse is also true: to restrict access to information is to allow politics to fester. Non-transparency and incompetence go hand in hand.
Stay healthy,
P.S. Have you observed a good postmortem where everyone learned? What’s the best way to achieve “psychological safety” in this delicate process? Share your thoughts with us. Join the discussion.
6 comments
Thank you for underscoring the importance of . There are probably two levels of post-mortem. The first looks at the above-mentioned engine problems. The second goes beyond and looks for other types of problems, e.g. the software of a relatively newer model of plane. Then, try to understand what the hell is going on with that organization, possibly starting with people factors. Boeing and their engine supplier were unchallenged for a long time, then along came SudAviation and their engine supplier. In the medium term, the Chinese may join the party. These days nothing should be taken for granted.
Timeline, fact gathering, and analysis of actions taken or inactions by Gov Cuomo and/or members of his administration is what has to happen with respect to the assignment of those with covid to nursing homes and the under reporting of deaths from covid in those nursing homes. Timeline and fact gathering should be applied as well to his receipt of campaign donations from the industry and his grant of immunity from liability to the industry during the ongoing battle against the virus.
Very good and timely article Howard.
Your thinking could be very useful now in Texas during the ongoing investigation after the failure of the energy system during the recent February snow storm which left 4 million people without electricity, heating, communications and many without water. The government response was far less efficient than expected. And many people suffered physically, economically and emotionally.
It is easy to point fingers in a situation like this and start blaming different actors for lack of responsibility and weak response systems. But your approach provides another route: an opportunity to learn what happened, and where the system and different actors failed, in order to build more robust organizations and better leadership.
Thanks for sharing your thoughts
Again a great post, Howard, thank you!
One post-mortem I still remember after 15 years is the one we did after a specific round of A/B testing of email marketing campaigns. My team of company employees believed we could get better click rates than the ones our marketing agency achieved for us. We thought that content we’d curate or write ourselves, based on our in-depth knowledge of our own solution and on our tastes, would perform better.
We did the tests, our content vs. the agency’s, and measured everything (e.g. open rates, click rates, recency, cost of production). Together with the marketing agency team, we then held ruthless post-mortems where only the data mattered, not who had done what and why.
In the end, it was clear the agency content performed better than ours across the board. It was a lesson in individual humility of course, but also one in marketing humility: our thoughts and tastes about what worked or not were not relevant when designing a mass consumer marketing program. It was the customer reaction and the hard data that came back that spelled out what worked. We all knew that intellectually of course, but going through the experience and doing the post-mortem brought the message home.
Good thing I came across this informative article. Thank you for this!
Thanks for this informative article!