Friday, January 23, 2009

Accident statistics: valid?

Statistical analyses of accidents have troubled me for many years. One of my main objections has been that statistical correlation does not equate to "cause." Very recently an even more significant insight occurred to me during a discussion with a colleague, Ira Rimson, about the system descriptions that safety analysts use as inputs. We were discussing what those descriptions must contain to help safety analysts understand the dynamics of the systems they are asked to analyze, and how those dynamics should be described. Our experience suggested that the descriptions presently offered are fragmented elements at best. Extending that notion to the descriptions of accidents led to a new concern involving the idea of "sampling" accident dynamics during accident investigations. What should be sampled, and how should the samples be documented?

Digitization of music provides an instructive analogy.

To digitally reproduce music, states of a musical work are sampled at various rates, typically ranging from about 22,000 to 44,100 or more samples per second. The lower the sampling rate, the less faithfully the music is reproduced. Carried to its logical extreme, a single sample of a song is useless if one wants to "hear" the data as music.

Think of the production of music or a song as a process requiring the dynamic interactions of people and instruments, along with the pitch, frequency and other constantly changing relationships necessary to produce the notes as the song progresses from beginning to end. To digitize the music, the state of each of these attributes must be sampled frequently as the song progresses. Digital video works the same way, sampling action as a series of images at some number of frames per second. Same idea: the greater the number of samples, the greater the fidelity of the reproduction.
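The relationship between sampling rate and fidelity can be made concrete with a small sketch. Everything in it is an assumption chosen for illustration: the two-tone "song," the one-second duration, the toy sampling rates (far below real audio rates), and the use of straight-line interpolation to rebuild the signal, which is a simplification and not how audio equipment actually reconstructs sound. The sketch samples the signal at several rates, rebuilds it from the samples, and measures how far the rebuilt version strays from the original.

```python
import math

def signal(t):
    # Hypothetical "song": a 1 Hz tone plus a quieter 3 Hz tone (illustrative only).
    return math.sin(2 * math.pi * 1 * t) + 0.5 * math.sin(2 * math.pi * 3 * t)

def sample(rate, duration=1.0):
    # Take evenly spaced (time, value) samples; a rate of 1 yields a single sample at t = 0.
    n = max(1, int(rate * duration))
    if n == 1:
        return [(0.0, signal(0.0))]
    return [(i * duration / (n - 1), signal(i * duration / (n - 1))) for i in range(n)]

def reconstruct(samples, t):
    # Rebuild the signal at time t by straight-line interpolation between neighbors.
    # With only one sample, the "reconstruction" is a flat line at that value.
    if len(samples) == 1:
        return samples[0][1]
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            return v0 + (v1 - v0) * (t - t0) / (t1 - t0)
    return samples[-1][1]

def rms_error(rate, duration=1.0, checkpoints=500):
    # Root-mean-square difference between the original and the rebuilt signal.
    samples = sample(rate, duration)
    total = 0.0
    for k in range(checkpoints):
        t = k * duration / (checkpoints - 1)
        total += (signal(t) - reconstruct(samples, t)) ** 2
    return math.sqrt(total / checkpoints)

# Error for each toy sampling rate (samples per second).
errors = {rate: rms_error(rate) for rate in (1, 10, 100, 1000)}

for rate, err in errors.items():
    print(f"{rate:>5} samples/s -> RMS error {err:.5f}")
```

Under these assumptions the error shrinks steadily as the rate rises, and the single-sample case rebuilds nothing but a flat line, which is the "useless" extreme described above.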

Apply similar thinking to the descriptions of systems and their operation, and to how such information is presented to system safety analysts, who are expected to find hazards in a system and predict system aberrations. How well can they meet those expectations if the descriptions of the processes they are to analyze contain too few samples of the "music" the system operation might produce?

Now, think also of accidents as processes, involving dynamic interactions of the people, objects and energies needed to produce the accident as it progresses from its beginning to its outcome over time. What data do we sample to predict and capture a description of that process? In some activities that question has been partly answered after the fact, for example in aircraft operations, where system behavior is captured by digital flight data recorders and analog cockpit voice recorders. But before the accidents happen, how can you predict them from insufficiently sampled process descriptions?

The question seems worthy of more exploration.