ΑΙhub.org
 

Robust human organizations: Lessons for artificial intelligence research and application


by Thomas G. Dietterich
11 April 2019




In the 1980s and 1990s, several researchers (most notably Todd LaPorte, Gene Rochlin, and Karlene Roberts) undertook the study of human organizations that operate high-risk systems while achieving very low error rates. They called these organizations “High Reliability Organizations” (or HROs), and they tried to characterize what aspects of these organizations accounted for their high reliability.

My attention was drawn to HROs by Paul Scharre’s excellent book Army of None (2018), which reviews the current state and future prospects of autonomous weapon systems. Among the many existing autonomous weapon systems that Scharre analyzes, I found his discussion of the Aegis cruisers to be the most interesting from an AI standpoint. Scharre discusses the tragic Vincennes incident, in which the Vincennes, a cruiser equipped with the Aegis autonomous ship defense system, accidentally shot down an Iranian civilian airliner, killing all passengers and crew. Scharre describes how, in response, the US Navy completely overhauled its procedures for training Aegis cruiser personnel and adopted many of the practices of HROs. He suggests that the safe deployment of an autonomous weapon system requires that the organization using that system be a high-reliability organization.

What are the properties of HROs? Weick, Sutcliffe, and Obstfeld (1999) give an excellent summary. They identify five practices that they believe are responsible for the low error rates of HROs:

1. Preoccupation with failure. HROs believe that there exist new failure modes that they have not yet observed. These failure modes are rare, so it is impossible to learn from experience how to handle them. Consequently, HROs study all known failures carefully, they study anomalies and near misses, and they treat the absence of anomalies and near misses as a sign that they are not being sufficiently vigilant in looking for problems. HROs encourage the reporting of all mistakes and anomalies.

2. Reluctance to simplify interpretations. HROs cultivate a diverse collection of expertise so that multiple interpretations can be generated for any observed event. They adopt many forms of checks and balances and perform frequent adversarial reviews. They hire people with non-traditional training, perform job rotations, and engage in repeated retraining. To deal with the conflicts that arise from multiple interpretations, they hire and value people for their interpersonal skills as much as for their technical knowledge.

3. Sensitivity to operations. HROs maintain at all times a small group of people who have deep situational awareness. This group constantly checks whether the observed behavior of the system is the result of its known inputs or whether there might be other factors at work.

4. Commitment to resilience. Teams practice managing surprise. They practice recombining existing actions and procedures in novel ways in order to attain high skill at improvisation. They practice the rapid formation of ad hoc teams to improvise solutions to novel problems.

5. Under-specification of organizational structures. HROs empower every team member to make decisions related to his/her expertise. Any person can raise an alarm and halt operations. When anomalies or near misses arise, their descriptions are propagated throughout the organization, rather than following a fixed reporting path, in the hopes that a person with the right expertise will see them. Power is delegated to operations personnel, but management is available at all times.

HROs provide at least three lessons for the development and application of AI technology. First, our goal should be to create combined human-machine systems that function as high-reliability organizations. We should consider how AI systems can incorporate the five principles listed above. Our AI systems should continuously monitor their own behavior, the behavior of the human team, and the behavior of the environment to check for anomalies, near misses, and unanticipated side effects of actions. Our AI systems should be built of ensembles of diverse models to reduce the risk that any one model contains critical errors. They should incorporate techniques, such as minimizing down-side risk, that confer robustness to model error (Chow et al., 2015). Our AI systems must support combined human-machine situational awareness, which will require not only excellent user interface design but also the creation of AI systems whose structure can be understood and whose behavior can be predicted by the human members of the team. Our AI systems must also support combined human-machine improvisational planning. Methods that plan in real time (e.g., receding horizon control, also known as model-predictive control) are likely to be better suited to improvisation than methods that execute fixed policies. Researchers in reinforcement learning should learn from the experience of human-machine mixed-initiative planning systems (Bresina and Morris, 2007). Finally, our AI systems should have models of their own expertise and models of the expertise of the human operators so that the systems can route problems to the right humans when needed.
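To make "minimizing down-side risk" concrete, here is a minimal sketch of one common risk measure, conditional value at risk (CVaR), computed empirically from sampled losses. It is not the algorithm of Chow et al. (2015), which optimizes CVaR in sequential decision problems. The function names, the action labels, and the loss numbers are all hypothetical and chosen only for illustration.

```python
import math


def cvar(losses, alpha=0.9):
    """Empirical CVaR: mean of the worst (1 - alpha) fraction of sampled losses."""
    if not losses:
        raise ValueError("need at least one loss sample")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(losses, reverse=True)  # largest (worst) losses first
    # Round before taking the ceiling so floating-point noise such as
    # 3.0000000000000004 does not add an extra sample to the tail.
    k = max(1, math.ceil(round((1.0 - alpha) * len(ordered), 9)))
    return sum(ordered[:k]) / k


def pick_action(loss_samples, alpha=0.9):
    """Return the action whose sampled losses have the smallest CVaR."""
    return min(loss_samples, key=lambda action: cvar(loss_samples[action], alpha))


# Hypothetical loss samples for two actions, e.g. one sample per model
# in an ensemble of diverse models.
loss_samples = {
    "aggressive": [0, 0, 0, 0, 0, 0, 0, 0, 0, 50],  # mean loss 5, rare disaster
    "cautious": [6, 6, 6, 6, 6, 6, 6, 6, 6, 6],     # mean loss 6, no disaster
}

if __name__ == "__main__":
    for action, losses in loss_samples.items():
        mean = sum(losses) / len(losses)
        print(f"{action}: mean={mean:.1f}, CVaR(0.9)={cvar(losses, 0.9):.1f}")
    print("risk-sensitive choice:", pick_action(loss_samples, 0.9))
```

In this toy example the "aggressive" action has the lower average loss, but its worst 10% of outcomes are far worse, so a CVaR-based rule picks "cautious". That is the sense in which a down-side risk criterion can confer robustness when some models in an ensemble predict rare but severe outcomes.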

A second lesson from HRO studies is that we should not deploy AI technology in situations where it is impossible for the surrounding human organization to achieve high reliability. Consider, for example, the deployment of face recognition tools by law enforcement. Quite aside from questions of privacy and civil liberties, I believe the HRO perspective provides important guidance for deciding under what conditions it is safe to deploy face recognition. The South Wales police have made public the results of 15 deployments of face recognition technology at public events. Across these 15 deployments, they caught 234 people with outstanding arrest warrants. They also experienced 2,451 false alarms, so 91.3% of the 2,685 alarms raised were false (South Wales Police, 2018). This is typical of many applications of face recognition and fraud detection. To come close to detecting every wanted person, we must set the detection threshold quite low, which leads to high false alarm rates. While I do not know the details of the South Wales Police procedures, it is easy to imagine that this organization could achieve high reliability through a combination of careful procedures (e.g., human checks of all alarms; employing fall-back methods to compensate for known weaknesses of face recognition software, such as reduced accuracy on dark skin; looking for patterns and anomalies in the alarms; continuous vetting of the list of outstanding arrest warrants and the provenance of the library face images; etc.). But now consider the proposal to incorporate face recognition into the “body cams” worn by police. A single officer engaged in a confrontation with a person believed to be armed would not have the ability to carefully handle false alarms. It is difficult to imagine any organizational design that would enable an officer engaged in a firefight to properly handle this technology.
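The arithmetic behind the South Wales figures can be checked with a short sketch. The function name and dictionary keys are hypothetical. Only the two counts, 234 and 2,451, come from the text. Note that the percentage is the share of alarms that were false. The rate of false alarms per face scanned cannot be computed here, because the number of faces scanned is not given.

```python
def alarm_summary(true_alarms, false_alarms):
    """Summarize alarm quality from counts of correct and false alarms."""
    total = true_alarms + false_alarms
    if total == 0:
        raise ValueError("no alarms to summarize")
    return {
        "total_alarms": total,
        # Share of all alarms that were false (the figure quoted in the text).
        "false_share": false_alarms / total,
        # Share of all alarms that were correct (precision).
        "precision": true_alarms / total,
        # How many false alarms a human must dismiss per correct alarm.
        "false_per_true": false_alarms / true_alarms if true_alarms else float("inf"),
    }


# Counts reported for the 15 South Wales Police deployments.
summary = alarm_summary(true_alarms=234, false_alarms=2451)

if __name__ == "__main__":
    print(f"total alarms:      {summary['total_alarms']}")
    print(f"false alarm share: {summary['false_share']:.1%}")
    print(f"precision:         {summary['precision']:.1%}")
    print(f"false per true:    {summary['false_per_true']:.1f}")
```

On these counts, a reviewer sees roughly ten false alarms for every correct one. That workload is manageable for a team that checks every alarm, but not for a single officer in a confrontation.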

A third lesson is that our AI systems should be continuously monitoring the functioning of the human organization to check for threats to high reliability. Just as any human member of an HRO can halt operations if they spot a problem, the AI system should also be empowered to halt operations. As AI technology continues to improve, it should be possible to detect problems in the team such as over-confidence, reduced attention, complacency, inertia, homogeneity, bullheadedness, hubris, headstrong acts, and self-importance.

In summary, as with previous technological advances, AI technology increases the risk that failures in human organizations and actions will be magnified by the technology with devastating consequences. To avoid such catastrophic failures, the combined human and AI organization must achieve high reliability. Work on high-reliability organizations suggests important directions for both technological development and policy making. It is critical that we fund and pursue these research directions immediately and that we only deploy AI technology in organizations that maintain high reliability.

References:

Bresina, J. L., and Morris, P. H. (2007). Mixed-initiative planning in space mission operations. AI Magazine, 28, 75–88.

Chow, Y., Tamar, A., Mannor, S., and Pavone, M. (2015). Risk-Sensitive and Robust Decision-Making: a CVaR Optimization Approach. Advances in Neural Information Processing Systems (NIPS) 2015.

Scharre, P. (2018). Army of None: Autonomous weapons and the future of war. W. W. Norton.

South Wales Police (2018). https://www.south-wales.police.uk/en/advice/facial-recognition-technology/ Accessed November 12, 2018.

Weick, K. E., Sutcliffe, K. M., and Obstfeld, D. (1999). Organizing for High Reliability: Processes of Collective Mindfulness. In R. I. Sutton and B. M. Staw (Eds.), Research in Organizational Behavior, Volume 21, pp. 81–123. Stamford, CT: JAI Press.




Thomas G. Dietterich is Distinguished Professor (Emeritus) and Director of Intelligent Systems, Institute for Collaborative Robotics and Intelligent Systems (CoRIS), at Oregon State University.

©2026.02 - Association for the Understanding of Artificial Intelligence