AIhub.org

‘Probably’ doesn’t mean the same thing to your AI as it does to you

by The Conversation

17 April 2026

Why it matters

Far from being a linguistic quirk, this misalignment is a fundamental challenge for AI safety and human-AI interaction. As large language models are increasingly used in high-stakes fields like health care, government policy and scientific reporting, the way they communicate risk becomes a matter of public trust.

If an AI assistant helping a doctor, for instance, describes a side effect as “unlikely,” but the probability the model attaches to “unlikely” is much higher than the one the doctor has in mind, the resulting decision could be flawed.

What other research is being done

Scientists have studied how humans quantify uncertainty since the 1960s, when CIA analysts pioneered the field to improve intelligence reporting. More recently, a fast-growing body of research on large language models has sought to look under the hood of neural networks to better understand their “behaviors” and linguistic patterns.

Our study adds a layer of complexity by treating the interaction between humans and artificial intelligence as something like a biological system, in which meaning can degrade. It moves beyond simply measuring whether an AI is “smart” and instead asks whether it is aligned.

Other researchers are currently exploring whether so-called chain-of-thought prompting – asking the AI to show its work – can fix these errors. However, our study found that even advanced reasoning doesn’t always bridge the gap between statistical data and verbal labels.

What’s next

A goal for future AI development is to create models that don’t just predict the next likely word but actually understand the weight of the uncertainty they are conveying. Researchers are calling for more robust consistency metrics to ensure that if a model sees a 10% chance in the data, it chooses the same word every time.
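To make the idea of a consistency metric concrete, here is a minimal sketch in Python. It is not the metric used in the study: the function name `modal_consistency` and the sample responses are hypothetical, invented purely for illustration. It assumes the same 10% scenario has been put to a model several times and simply measures what share of the answers use the most common word.

```python
from collections import Counter


def modal_consistency(words):
    """Return the share of responses that use the most common word.

    1.0 means the model picked the same word every time; lower values
    mean its word choice varied across otherwise identical prompts.
    """
    if not words:
        raise ValueError("need at least one response")
    # Normalize case and whitespace so "Unlikely" and "unlikely" match.
    counts = Counter(w.strip().lower() for w in words)
    _, top_count = counts.most_common(1)[0]
    return top_count / len(words)


# Hypothetical responses to repeated prompts that all present a 10% chance.
responses = ["unlikely", "unlikely", "improbable", "Unlikely", "doubtful"]
print(modal_consistency(responses))  # 0.6
```

A score of 1.0 would mean the model chose the same word every time. A more robust metric might also treat near-synonyms as agreement, which this sketch does not attempt.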

As we move toward a world where AI summarizes scientific papers and manages people’s schedules, making sure that “probably” means “probably” is a vital step in making these systems reliable partners rather than just sophisticated parrots.

Mayank Kejriwal, Research Assistant Professor of Industrial & Systems Engineering, University of Southern California

This article is republished from The Conversation under a Creative Commons license. Read the original article here.

The Conversation is an independent source of news and views, sourced from the academic and research community and delivered direct to the public.
