AIhub.org
 

A deep learning model for identifying disease and risk factor biomarkers


30 October 2023




Mika Gustafsson and David Martínez hope that AI-based models could eventually be used in precision medicine to develop treatments and preventive strategies tailored to the individual. Photo: Thor Balkhed.

By Karin Söderlund Leifler

Artificial intelligence (AI) that finds patterns in complex biological data could eventually contribute to the development of individually tailored healthcare. Researchers at Linköping University (LiU) have developed an AI-based method applicable to a range of medical and biological questions. Their models can, for instance, estimate people’s chronological age and determine whether they have been smokers or not.

Genes play an important role in health, but so do environmental and behavioural factors, such as diet and level of physical activity. Epigenetics is the study of how behaviours and environment cause changes that affect the way genes work. Epigenetic changes do not change DNA sequences, but they can change how such sequences are read by the body.

Researchers at Linköping University (LiU) have used data with epigenetic information from more than 75,000 human samples to train a large number of AI neural network models. They hope that such AI-based models could eventually be used in precision medicine to develop treatments and preventive strategies tailored to the individual. Their models are autoencoders, a type of neural network that self-organises the information and finds patterns of interrelation in large amounts of data.

Smoking leaves traces in the DNA

To test their model, the LiU researchers compared it with existing models. Models of the effects of smoking on the body already exist, building on the fact that specific epigenetic changes reflect the effect of smoking on lung function. These traces remain in the DNA long after a person has quit smoking, and this type of model can identify whether someone is a current smoker, a former smoker, or has never smoked. Other models can, based on epigenetic markers, estimate the chronological age of an individual, or group individuals according to whether they have a disease or are healthy.

The LiU researchers trained their autoencoder and then used the result for three different tasks: determining age, determining smoker status, and diagnosing the disease systemic lupus erythematosus (SLE).

“Our models not only enable us to classify individuals based on their epigenetic data. We found that our models can identify previously known epigenetic markers used in other models, but also new markers associated with the condition we’re examining. One example of this is that our model for smoking identifies markers associated with respiratory diseases, such as lung cancer, and DNA damage,” says David Martínez, PhD student at Linköping University.

David Martínez, PhD student. Photo: Thor Balkhed.

The objective of the autoencoder models is to compress extremely complex biological data into a representation of the most relevant characteristics and patterns in the data.
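To make the idea of compression concrete, here is a minimal sketch of an autoencoder in plain Python. It is not the researchers' model: the published method is a deep network trained on real methylation data, whereas this toy is a single linear layer trained on synthetic data invented for illustration. The assumption is that six measured features are driven by two hidden factors, so two numbers per sample should be enough to reconstruct all six. All function names and settings are hypothetical.

```python
import random


def make_samples(n_samples, rng):
    """Synthetic data (an assumption): 6 features driven by 2 hidden factors."""
    samples = []
    for _ in range(n_samples):
        a = rng.uniform(-1.0, 1.0)  # hidden factor behind features 0-2
        b = rng.uniform(-1.0, 1.0)  # hidden factor behind features 3-5
        samples.append([f + rng.gauss(0.0, 0.05) for f in (a, a, a, b, b, b)])
    return samples


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def reconstruction_error(enc, dec, samples):
    """Mean squared distance between each sample and its reconstruction."""
    total = 0.0
    for x in samples:
        x_hat = matvec(dec, matvec(enc, x))
        total += sum((r - v) ** 2 for r, v in zip(x_hat, x))
    return total / len(samples)


def train_autoencoder(samples, n_latent=2, epochs=100, lr=0.02, seed=0):
    """Train a linear encoder/decoder pair with per-sample gradient descent."""
    rng = random.Random(seed)
    n_features = len(samples[0])
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(n_features)]
           for _ in range(n_latent)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_latent)]
           for _ in range(n_features)]
    for _ in range(epochs):
        for x in samples:
            h = matvec(enc, x)        # compressed representation
            x_hat = matvec(dec, h)    # reconstruction from the compressed form
            err = [r - v for r, v in zip(x_hat, x)]
            # Error propagated back to the latent layer, computed before
            # the decoder weights are changed.
            back = [sum(dec[j][k] * err[j] for j in range(n_features))
                    for k in range(n_latent)]
            for j in range(n_features):
                for k in range(n_latent):
                    dec[j][k] -= lr * err[j] * h[k]
            for k in range(n_latent):
                for j in range(n_features):
                    enc[k][j] -= lr * back[k] * x[j]
    return enc, dec


samples = make_samples(100, random.Random(42))
untrained_enc, untrained_dec = train_autoencoder(samples, epochs=0)
enc, dec = train_autoencoder(samples)
error_before = reconstruction_error(untrained_enc, untrained_dec, samples)
error_after = reconstruction_error(enc, dec, samples)
latent = matvec(enc, samples[0])
print("error before training:", round(error_before, 4))
print("error after training:", round(error_after, 4))
print("sample 0 compressed to", len(latent), "numbers")
```

The point of the sketch is only that the network is never told which features belong together. It has to find that grouping on its own in order to reconstruct six values from two.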

“We didn’t steer the model and had no hypotheses based on existing biological knowledge, but let the data speak for itself. When subsequently looking at what was happening in the autoencoder, we saw that data self-organised in a way similar to how it works in the body,” says Mika Gustafsson, professor of translational bioinformatics at Linköping University, who led the study now published in Briefings in Bioinformatics.

In the next step, the researchers can use the most important characteristics found by the autoencoder to create models able to classify a large number of environment-related, individual-specific factors for which there is not enough training data to train more complex AI models.
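As a rough illustration of why compressed features help when labelled data is scarce, the sketch below trains a very small classifier (logistic regression) on only 20 labelled samples. It rests on several assumptions that do not come from the study: the two-dimensional "latent features" are simulated rather than produced by a real autoencoder, the two classes are hypothetical, and logistic regression is simply a convenient stand-in for whatever downstream model is used in practice.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_latent_dataset(n_per_class, rng):
    """Simulated 2-D latent features for two hypothetical classes (0 and 1)."""
    data = []
    for label, centre in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            features = [rng.gauss(centre, 0.3), rng.gauss(centre, 0.3)]
            data.append((features, label))
    rng.shuffle(data)
    return data


def train_logistic(data, epochs=200, lr=0.1):
    """Fit logistic regression with per-sample gradient descent."""
    weights = [0.0] * len(data[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            score = sum(w * f for w, f in zip(weights, features)) + bias
            grad = sigmoid(score) - label  # gradient of the log loss
            weights = [w - lr * grad * f for w, f in zip(weights, features)]
            bias -= lr * grad
    return weights, bias


def predict(weights, bias, features):
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return 1 if score > 0 else 0


rng = random.Random(7)
train_data = make_latent_dataset(10, rng)   # only 20 labelled samples
test_data = make_latent_dataset(50, rng)    # held-out samples for evaluation
weights, bias = train_logistic(train_data)
correct = sum(predict(weights, bias, f) == y for f, y in test_data)
accuracy = correct / len(test_data)
print("held-out accuracy:", accuracy)
```

With only two input features, the classifier has three parameters to learn, which is why a handful of labelled samples can be enough. Training the same kind of model directly on many thousands of raw measurements would need far more labelled data.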

Interpretable AI models

Certain types of AI are sometimes likened to a black box: they provide answers, but humans cannot see how the AI arrived at them. Mika Gustafsson and his colleagues, however, strive to create interpretable AI models that, so to speak, let the researchers peek under the lid of the “black box” to understand what is going on inside.

“We want to be able to understand what the model shows us about the biology behind disease and other conditions. Then we’ll see not only whether someone is ill or not, but, by interpreting data, we’ll also have a chance to learn why,” says Mika Gustafsson.

Mika Gustafsson, professor. Photo: Thor Balkhed.

This research was funded by, among others, the Swedish Research Council, the Wallenberg AI, Autonomous Systems and Software Program (WASP) and the SciLifeLab & Wallenberg National Program for Data-Driven Life Science (DDLS).

Read the research in full

NCAE: data-driven representations using a deep network-coherent DNA methylation autoencoder identify robust disease and risk factor signatures, David Martínez-Enguita, Sanjiv K. Dwivedi, Rebecka Jörnsten and Mika Gustafsson, Briefings in Bioinformatics, (2023).





Linköping University



