ΑΙhub.org
 

How sure is sure? Incorporating human error into machine learning


by
04 September 2023



share this:
1 red question mark and lots of black question marks

By Sarah Collins

Human error and uncertainty are concepts that many artificial intelligence systems fail to grasp, particularly in systems where a human provides feedback to a machine learning model. Many of these systems are programmed to assume that humans are always certain and correct, but real-world decision-making includes occasional mistakes and uncertainty.

Researchers from the University of Cambridge, along with The Alan Turing Institute, Princeton, and Google DeepMind, have been attempting to bridge the gap between human behaviour and machine learning, so that uncertainty can be more fully accounted for in AI applications where humans and machines are working together. This could help reduce risk and improve trust and reliability of these applications, especially where safety is critical, such as medical diagnosis.

The team adapted a well-known image classification dataset so that humans could provide feedback and indicate their level of uncertainty when labelling a particular image. The researchers found that training with uncertain labels can improve these systems’ performance in handling uncertain feedback, although humans also cause the overall performance of these hybrid systems to drop. Their results were reported at the AAAI/ACM Conference on Artificial Intelligence, Ethics and Society (AIES 2023) in Montréal.

‘Human-in-the-loop’ machine learning systems – a type of AI system that enables human feedback – are often framed as a promising way to reduce risks in settings where automated models cannot be relied upon to make decisions alone. But what if the humans are unsure?

“Uncertainty is central in how humans reason about the world but many AI models fail to take this into account,” said first author Katherine Collins from Cambridge’s Department of Engineering. “A lot of developers are working to address model uncertainty, but less work has been done on addressing uncertainty from the person’s point of view.”

We are constantly making decisions based on the balance of probabilities, often without really thinking about it. Most of the time – for example, if we wave at someone who looks just like a friend but turns out to be a total stranger – there’s no harm if we get things wrong. However, in certain applications, uncertainty comes with real safety risks.

“Many human-AI systems assume that humans are always certain of their decisions, which isn’t how humans work – we all make mistakes,” said Collins. “We wanted to look at what happens when people express uncertainty, which is especially important in safety-critical settings, like a clinician working with a medical AI system.”

“We need better tools to recalibrate these models, so that the people working with them are empowered to say when they’re uncertain,” said co-author Matthew Barker, who recently completed his MEng degree at Gonville & Caius College, Cambridge. “Although machines can be trained with complete confidence, humans often can’t provide this, and machine learning models struggle with that uncertainty.”

For their study, the researchers used some of the benchmark machine learning datasets: one was for digit classification, another for classifying chest X-rays, and one for classifying images of birds. For the first two datasets, the researchers simulated uncertainty, but for the bird dataset, they had human participants indicate how certain they were of the images they were looking at: whether a bird was red or orange, for example. These annotated ‘soft labels’ provided by the human participants allowed the researchers to determine how the final output was changed. However, they found that performance degraded rapidly when machines were replaced with humans.

“We know from decades of behavioural research that humans are almost never 100% certain, but it’s a challenge to incorporate this into machine learning,” said Barker. “We’re trying to bridge the two fields so that machine learning can start to deal with human uncertainty where humans are part of the system.”

The researchers say their results have identified several open challenges when incorporating humans into machine learning models. They are releasing their datasets so that further research can be carried out and uncertainty might be built into machine learning systems.

“As some of our colleagues so brilliantly put it, uncertainty is a form of transparency, and that’s hugely important,” said Collins. “We need to figure out when we can trust a model and when to trust a human and why. In certain applications, we’re looking at probability over possibilities. Especially with the rise of chatbots, for example, we need models that better incorporate the language of possibility, which may lead to a more natural, safe experience.”

“In some ways, this work raised more questions than it answered,” said Barker. “But even though humans may be miscalibrated in their uncertainty, we can improve the trustworthiness and reliability of these human-in-the-loop systems by accounting for human behaviour.”

The research was supported in part by the Cambridge Trust, the Marshall Commission, the Leverhulme Trust, the Gates Cambridge Trust and the Engineering and Physical Sciences Research Council (EPSRC), part of UK Research and Innovation (UKRI).

Read the research in full

Human Uncertainty in Concept-Based AI Systems, Katherine M. Collins, Matthew Barker, Mateo Espinosa Zarlenga, Naveen Raman, Umang Bhatt, Mateja Jamnik, Ilia Sucholutsky, Adrian Weller, Krishnamurthy Dvijotham. Paper presented at the Sixth AAAI/ACM Conference on Artificial Intelligence, Ethics and Society (AIES 2023).



tags:


University of Cambridge




            AIhub is supported by:



Related posts :



Forthcoming machine learning and AI seminars: October 2025 edition

  02 Oct 2025
A list of free-to-attend AI-related seminars that are scheduled to take place between 3 October and 30 November 2025.
monthly digest

AIhub monthly digest: September 2025 – conference reviewing, soccer ball detection, and memory traces

  30 Sep 2025
Welcome to our monthly digest, where you can catch up with AI research, events and news from the month past.

Botanical time machines: AI is unlocking a treasure trove of data held in herbarium collections

  29 Sep 2025
New research describes the development and testing of a new AI-driven tool.

All creatures, great, small, and artificial

  26 Sep 2025
AI in Veterinary Medicine and what it can teach us about the data revolution.

RoboCup Logistics League: an interview with Alexander Ferrein, Till Hofmann and Wataru Uemura

  25 Sep 2025
Find out more about the RoboCup league focused on production logistics and the planning.

Data centers consume massive amounts of water – companies rarely tell the public exactly how much

  24 Sep 2025
Why do data centres need so much water, and how much do they use?

Interview with Luc De Raedt: talking probabilistic logic, neurosymbolic AI, and explainability

  23 Sep 2025
AIhub ambassador Liliane-Caroline Demers caught up with Luc de Raedt at IJCAI 2025 to find out more about his research.

Call for AAAI educational AI videos

  22 Sep 2025
Submit your contributions by 30 November 2025.



 

AIhub is supported by:






 












©2025.05 - Association for the Understanding of Artificial Intelligence