Tuesday, July 21, 2009

Lessons learning system

"Lessons learned" as a concept has been around a long, long time, and has been examined often. It is one of the underlying reasons for doing accident investigations, incident investigations, and other investigations of all kinds. Yet when it comes to using the lessons, the inquiry rates are very modest, and the reasons for not using them are numerous.

To find what knowledge has been gained about lessons learned in accident investigations, and who may have studied the topic, a Google search seemed like a good starting point. For a "lessons learned" search, Google produced around 20 million hits. Using "lessons learned" with accident or investigation produced 2,420,000 hits. That's a lot of lessons learned. "Lessons learned process" is another term used frequently in connection with these activities; an advanced Google search for "lessons learned process" produced around 300,00 hits in many diverse fields. When we narrow that search even further by looking for accident or investigation related lessons learned processes, an advanced search produced 2810 hits. Now if we want to analyze those processes using a system analysis approach, a search for accident or investigation "lessons learned system" produced a slightly more manageable 645 hits, but that still included many hits not related to accident investigations. To try to narrow the search further, "lessons learning system" and accident or incident were entered, resulting in 5 hits with Google and 5 with Yahoo (mostly my works).

Using another tack, the 2810 and 645 hit lists were scanned to find references to organizations that had lessons learned processes. Many do, and some of them derive lessons from investigations. When the 645 hits were scanned, the tenor of the references listed was observed to be focused on the lessons, rather than on the full breadth and depth of the learning process, from the time the data from which lessons are developed are acquired until changes based on the lessons have produced the expected results. A few exceptions were noted: when major accident processes are examined thoroughly, as in the Challenger space shuttle accident investigation or the Buncefield tank farm explosions, calls for improvement in lessons learned processes sometimes occur.

Possibly more significantly, goals, criteria, metrics, output specifications, quality assurance and other properties of lessons were noteworthy for their ambiguity or absence, with most references focusing on step by step actions to process the lessons that were input to the system.

How then can we reasonably expect lessons learning systems to be optimized, or even improved? This is what we are exploring. Some progress has been made and reported.

Contributions of criteria or suggestions for lessons learning system improvement, or critiques of some of the ideas, are invited.

Friday, January 23, 2009

Accident statistics: valid?

Statistical analyses of accidents have troubled me for many years. One of my main objections was that statistical correlation does not equate to "cause." Very recently an even more significant insight occurred to me during a conversation with a colleague, Ira Rimson, about the system descriptions used as inputs by safety analysts. We were discussing the description requirements to help safety analysts understand the dynamics of the systems they were being asked to analyze, and how those dynamics should be described. Our experience suggested that the descriptions presently offered were fragmented elements at best. Extending that notion to the descriptions of accidents led to a new concern involving the idea of "sampling" accident dynamics during accident investigations. What should be sampled, and how should the samples be documented?

Digitization of music provides an instructive analogy.

To digitally reproduce music, states of a musical work are sampled at various rates, typically ranging from 22k to 44k or more samples per second. The lower the sampling rate, the less faithfully the music is reproduced. Carried to its logical conclusion, a single sample of a song is useless if one wants to "hear" the data as music.

Think of the production of music or a song as a process, requiring the dynamic interactions of the people and instruments, and the pitch, frequency and other constantly changing relationships necessary to produce the notes as the song progresses from beginning to end. To digitize the music, the state of each of these attributes must be sampled frequently as the song progresses. Digital video works the same way, sampling the action as images at some number of frames per second. Same idea: the greater the number of samples, the greater the fidelity of the reproduction.
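For readers who like to see the analogy in numbers, here is a rough sketch of the sampling idea. Everything in it is an assumption made for illustration: the two-tone "song" in `signal`, the 10 millisecond duration, the chosen sampling rates, and the use of simple straight-line interpolation between samples to "play back" the recording. It is not how real audio equipment reconstructs sound; it only shows the trend that fewer samples give a less faithful reproduction.

```python
import math


def signal(t):
    """Hypothetical 'song': a 440 Hz tone plus a quieter 1760 Hz tone."""
    return math.sin(2 * math.pi * 440 * t) + 0.5 * math.sin(2 * math.pi * 1760 * t)


def reconstruction_error(rate_hz, duration=0.01, dense_rate=352_800):
    """RMS difference between the original signal and a version rebuilt
    from samples taken rate_hz times per second.

    Rebuilding uses straight-line interpolation between neighboring
    samples, an assumption chosen for simplicity.
    """
    # Take the samples, with one extra at the end so that every
    # reference point has a sample on each side of it.
    n = int(duration * rate_hz) + 2
    samples = [signal(i / rate_hz) for i in range(n)]

    # Compare against the original on a much finer time grid.
    m = int(duration * dense_rate)
    squared_error = 0.0
    for k in range(m):
        t = k / dense_rate
        pos = t * rate_hz      # position measured in sample intervals
        i = int(pos)           # sample just before time t
        frac = pos - i         # how far t lies toward the next sample
        estimate = samples[i] * (1 - frac) + samples[i + 1] * frac
        squared_error += (estimate - signal(t)) ** 2
    return math.sqrt(squared_error / m)


if __name__ == "__main__":
    for rate in (2_000, 8_000, 22_050, 44_100):
        print(f"{rate:>6} samples/s -> RMS error {reconstruction_error(rate):.4f}")
```

Running it prints one error figure per sampling rate, and the error shrinks as the rate rises. At the lowest rate the higher tone is sampled too sparsely to be recovered at all, which is the same trouble an analyst faces with a sparsely sampled process description.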

Apply similar thinking to the descriptions of systems and their operation, and how such information is presented to system safety analysts who are expected to find hazards in a system and predict system aberrations. How well is it possible to meet those expectations if the descriptions of the processes they are to analyze have insufficient samples of the "music" the system operation might produce?

Now, think also of accidents as processes, involving dynamic interactions of the people, objects and energies needed to produce the accident as it progresses from its beginning to its outcome over time. What data do we sample to predict and capture a description of that process? That question has been answered in part, after the fact, in some activities, such as aircraft system operations captured with digital flight data recorders and analog cockpit voice recorders. But before the accidents happen, how can you predict them from insufficiently sampled process descriptions?

The question seems worthy of more exploration.