AIhub.org
 

Using deep learning to find disease-related genes


27 February 2020



Photo credit: metamorworks.

By Karin Söderlund Leifler

An artificial neural network can reveal patterns in huge amounts of gene expression data, and discover groups of disease-related genes. This has been shown by a new study led by researchers at Linköping University. The scientists hope that the method can eventually be applied within precision medicine and individualised treatment.

Social media platforms often suggest people you may want to add as friends. The suggestion is based on you and the other person having contacts in common, which indicates that you may know each other. In a similar manner, scientists are creating maps of biological networks based on how different proteins or genes interact with each other. The researchers behind a new study have investigated whether it is possible to discover biological networks using deep learning, in which artificial neural networks are trained on experimental data. Since artificial neural networks are excellent at learning how to find patterns in enormous amounts of complex data, they are used in applications such as image recognition.

“We have for the first time used deep learning to find disease-related genes. This is a very powerful method in the analysis of huge amounts of biological information, or ‘big data’”, says Sanjiv Dwivedi, postdoc in the Department of Physics, Chemistry and Biology (IFM) at Linköping University.

The scientists used a large database with information about the expression patterns of 20,000 genes in a large number of people. The information was unlabelled, in the sense that the researchers did not give the artificial neural network information about which gene expression patterns were from people with diseases, and which were from healthy people. The AI model was then trained to find patterns of gene expression.
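To make the idea of training on unlabelled data concrete, here is a minimal sketch. The title of the research article mentions a deep autoencoder; the toy below is only a single-hidden-unit linear autoencoder trained on synthetic data, not the researchers' model. The four "genes", the hidden factor that drives them, and all function names are assumptions made for illustration.

```python
import random

def make_expression_data(n_samples, seed=0):
    """Synthetic stand-in for expression profiles: 4 'genes', one hidden factor."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        z = rng.gauss(0.0, 1.0)
        # Genes 0-1 rise with the factor, genes 2-3 fall with it.
        # No healthy/diseased label is attached to any sample.
        data.append([z + rng.gauss(0.0, 0.1), z + rng.gauss(0.0, 0.1),
                     -z + rng.gauss(0.0, 0.1), -z + rng.gauss(0.0, 0.1)])
    return data

def reconstruct(x, w_enc, w_dec):
    """Compress a profile to one number, then expand it back to 4 genes."""
    h = sum(w * xi for w, xi in zip(w_enc, x))
    return h, [w * h for w in w_dec]

def mean_squared_error(data, w_enc, w_dec):
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(x, w_enc, w_dec)
        total += sum((a - b) ** 2 for a, b in zip(x_hat, x))
    return total / (len(data) * len(data[0]))

def train_autoencoder(data, epochs=50, lr=0.01, seed=1):
    """Stochastic gradient descent on the reconstruction error only."""
    rng = random.Random(seed)
    n_genes = len(data[0])
    w_enc = [rng.uniform(-0.1, 0.1) for _ in range(n_genes)]
    w_dec = [rng.uniform(-0.1, 0.1) for _ in range(n_genes)]
    for _ in range(epochs):
        for x in data:
            h, x_hat = reconstruct(x, w_enc, w_dec)
            err = [a - b for a, b in zip(x_hat, x)]
            # Gradients of 0.5 * sum(err ** 2) with respect to each weight.
            grad_dec = [e * h for e in err]
            back = sum(e * w for e, w in zip(err, w_dec))
            grad_enc = [back * xi for xi in x]
            w_dec = [w - lr * g for w, g in zip(w_dec, grad_dec)]
            w_enc = [w - lr * g for w, g in zip(w_enc, grad_enc)]
    return w_enc, w_dec

if __name__ == "__main__":
    data = make_expression_data(200)
    w_enc, w_dec = train_autoencoder(data)
    print("reconstruction error:", round(mean_squared_error(data, w_enc, w_dec), 4))
```

The only training signal is how well each profile can be reconstructed from its compressed form. In this toy, the error can only fall if the model picks up the pattern the genes share, which is the sense in which a network can find structure without being told which samples are diseased.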

One of the challenges of machine learning is that it is not possible to see exactly how an artificial neural network solves a task. AI is sometimes described as a “black box” – we see only the information that we put into the box and the result that it produces. We cannot see the steps between. Artificial neural networks consist of several layers in which information is mathematically processed. The network comprises an input layer and an output layer that delivers the result of the information processing carried out by the system. Between these two layers are several hidden layers in which calculations are carried out. When the scientists had trained the artificial neural network, they wondered whether it was possible to, in a manner of speaking, lift the lid of the black box and understand how it works. Are the designs of the neural network and the familiar biological networks similar?
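The layered structure described above can be sketched in a few lines. This is an illustrative toy, not the network from the study: the layer sizes, the tanh non-linearity and the hand-picked weights are all assumptions. The point is that the values computed in each hidden layer can be recorded and inspected, which is one simple way to start "lifting the lid".

```python
import math

def dense_layer(inputs, weights, biases):
    """One layer: a weighted sum per unit, followed by a tanh non-linearity."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward_with_activations(x, layers):
    """Run x through every layer and keep each intermediate result."""
    activations = [list(x)]
    for weights, biases in layers:
        activations.append(dense_layer(activations[-1], weights, biases))
    return activations

# Hypothetical weights: 4 inputs -> 3 hidden units -> 2 hidden units -> 4 outputs.
LAYERS = [
    ([[0.5, 0.5, 0.0, 0.0],
      [0.0, 0.0, 0.5, 0.5],
      [0.5, -0.5, 0.5, -0.5]], [0.0, 0.0, 0.0]),
    ([[1.0, -1.0, 0.0],
      [0.0, 0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, 0.0],
      [1.0, 0.0],
      [-1.0, 0.0],
      [0.0, 1.0]], [0.0, 0.0, 0.0, 0.0]),
]

profile = [1.0, 1.0, -1.0, -1.0]  # made-up input values
acts = forward_with_activations(profile, LAYERS)
for name, values in zip(["input", "hidden 1", "hidden 2", "output"], acts):
    print(name, [round(v, 3) for v in values])
```

Printing the intermediate lists shows what each layer has computed for a given input. Interpreting those numbers biologically, as the researchers set out to do, is the hard part that this sketch does not attempt.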

“When we analysed our neural network, it turned out that the first hidden layer represented to a large extent interactions between various proteins. Deeper in the model, in contrast, on the third level, we found groups of different cell types. It’s extremely interesting that this type of biologically relevant grouping is automatically produced, given that our network has started from unclassified gene expression data”, says Mika Gustafsson, senior lecturer at IFM and leader of the study.

The scientists then investigated whether their model of gene expression could be used to determine which gene expression patterns are associated with disease. They confirmed that the model finds relevant patterns that agree well with biological mechanisms in the body. Since the model has been trained using unclassified data, it is possible that the artificial neural network has found totally new patterns. The researchers now plan to investigate whether such previously unknown patterns are relevant from a biological perspective.

“We believe that the key to progress in the field is to understand the neural network. This can teach us new things about biological contexts, such as diseases in which many factors interact. And we believe that our method gives models that are easier to generalise and that can be used for many different types of biological information”, says Mika Gustafsson.

Mika Gustafsson hopes that close collaboration with medical researchers will enable him to apply the method developed in the study in precision medicine. It may be possible, for example, to determine which groups of patients should receive a certain type of medicine, or identify the patients who are most severely affected.

Read the full research article:

Deriving disease modules from the compressed transcriptional space embedded in a deep autoencoder, Sanjiv K. Dwivedi, Andreas Tjärnberg, Jesper Tegnér and Mika Gustafsson, (2020), Nature Communications.

Mika Gustafsson is a Senior Lecturer in translational bioinformatics in the Department of Physics, Chemistry and Biology (IFM), Linköping University.

Sanjiv Dwivedi is a Postdoctoral researcher in the Department of Physics, Chemistry and Biology (IFM), Linköping University.

Note: this article was translated by George Farrants. It originally appeared on the Linköping University webpage.




Linköping University

©2026.02 - Association for the Understanding of Artificial Intelligence