AIhub.org
 

AlphaFold advances protein folding research


by Lucy Smith
03 December 2020



Protein PCMT1 PDB, by Emw CC BY-SA 3.0, via Wikimedia Commons.

The grand challenge of protein folding hit the news this week when it was announced that the latest version of DeepMind’s AlphaFold system had predicted protein structures with very high accuracy in CASP’s 2020 experiment.

Proteins are large, complex molecules, and the shape of a particular protein is closely linked to the function it performs. The ability to accurately predict protein structures would enable scientists to gain a greater understanding of how they work and what they do.

Protein folding is explained in this video from DeepMind:

How AlphaFold works

This new version of AlphaFold builds on the initial system, which you can read about in this paper. The associated code is available here. In that first version, the team trained a neural network to predict the distances between pairs of amino acid residues (the beads in the protein chain), which convey information about the structure. From these predictions, they constructed a potential of mean force that could accurately describe the shape of a protein. The resulting potential could then be optimized by simple gradient descent.
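To make the idea of optimizing a distance-based potential concrete, here is a minimal Python sketch. It is not DeepMind's code and not the potential used in AlphaFold: as a stand-in for the learned potential of mean force, it assumes a simple harmonic penalty on the difference between current and target pairwise distances, and it moves Cartesian coordinates directly. The function names, the step size, and the toy coordinates are all hypothetical choices made for illustration.

```python
import math
import random


def target_distances(coords):
    """Return {(i, j): distance} for every residue pair i < j."""
    n = len(coords)
    return {
        (i, j): math.dist(coords[i], coords[j])
        for i in range(n)
        for j in range(i + 1, n)
    }


def potential(coords, targets):
    """Assumed stand-in potential: sum of squared distance errors."""
    return sum(
        (math.dist(coords[i], coords[j]) - d) ** 2
        for (i, j), d in targets.items()
    )


def gradient(coords, targets):
    """Analytic gradient of the potential with respect to each coordinate."""
    grad = [[0.0, 0.0, 0.0] for _ in coords]
    for (i, j), target in targets.items():
        dist = math.dist(coords[i], coords[j])
        if dist == 0.0:
            continue  # direction undefined when two points coincide
        scale = 2.0 * (dist - target) / dist
        for k in range(3):
            delta = scale * (coords[i][k] - coords[j][k])
            grad[i][k] += delta
            grad[j][k] -= delta
    return grad


def gradient_descent(start, targets, steps=500, learning_rate=0.02):
    """Plain fixed-step gradient descent; the input list is left unchanged."""
    coords = [list(point) for point in start]
    for _ in range(steps):
        grad = gradient(coords, targets)
        for point, g in zip(coords, grad):
            for k in range(3):
                point[k] -= learning_rate * g[k]
    return coords


if __name__ == "__main__":
    # Made-up "true" structure, used only to produce target distances.
    true_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0),
                   (0.0, 3.8, 0.0), (1.9, 1.9, 3.0)]
    targets = target_distances(true_coords)
    rng = random.Random(0)
    start = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in true_coords]
    final = gradient_descent(start, targets)
    print("potential before:", potential(start, targets))
    print("potential after: ", potential(final, targets))
```

In this toy setting the target distances are exact, so descent only has to recover a shape consistent with them (possibly a mirror image). In the system described above, the distances came from a neural network's predictions rather than from a known structure.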

In version two, the team implemented new deep learning architectures. They created an attention-based neural network system, trained end-to-end, that attempts to interpret the structure of the spatial graph that represents the protein, while reasoning over the implicit graph that it’s building. The system uses evolutionarily related sequences, multiple sequence alignment (MSA), and a representation of amino acid residue pairs to refine this graph.

The system was trained on ~170,000 protein structures from the publicly available Protein Data Bank, together with large databases containing protein sequences of unknown structure.

About CASP

Critical Assessment of protein Structure Prediction (CASP) is a community-wide experiment for protein structure prediction that has taken place every two years since 1994. CASP provides an independent mechanism for the assessment of methods of protein structure modelling.

For the 2020 experiment, the organisers posted sequences of unknown protein structures for modelling from May to August this year. Protein models from various research groups around the world were then collected and evaluated as the experimental coordinates became available.

The main metric used by CASP to measure the accuracy of predictions is the Global Distance Test (GDT), which ranges from 0 to 100. GDT can be thought of approximately as the percentage of amino acid residues within a threshold distance of the correct position. This year, the AlphaFold system achieved a median score of 92.4 GDT across all targets. For the very hardest protein targets, AlphaFold achieved a median score of 87.0 GDT. This is a significant increase in accuracy compared to previous years: the best GDT in 2018, achieved with the first version of AlphaFold, was under 60, and in 2016 it was around 40.
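The "percentage of residues within a threshold distance" reading of GDT can be illustrated with a short Python sketch. It is a simplification rather than CASP's evaluation code: it assumes the predicted and experimental structures have already been superimposed, whereas the real test searches over many superpositions. It averages over the 1, 2, 4 and 8 angstrom cutoffs used by the GDT_TS variant of the score. The function name and the example coordinates are hypothetical.

```python
import math


def gdt_ts(predicted, reference, thresholds=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS for two already-superimposed residue traces.

    predicted, reference: equal-length lists of (x, y, z) positions in
    angstroms, one per residue. Returns a score between 0 and 100.
    """
    if len(predicted) != len(reference) or not reference:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # Per-residue deviation between the model and the experimental position.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    # Fraction of residues within each distance cutoff.
    fractions = [
        sum(d <= t for d in distances) / len(distances) for t in thresholds
    ]
    # Average over the cutoffs and rescale to 0-100.
    return 100.0 * sum(fractions) / len(thresholds)


if __name__ == "__main__":
    # Made-up four-residue example: deviations of 0.5, 1.5, 3.0 and 10.0 angstroms.
    reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 0.0, 3.0), (11.4, 10.0, 0.0)]
    print(gdt_ts(predicted, reference))  # prints 56.25
```

In the made-up example, one, two, three and three of the four residues fall within the 1, 2, 4 and 8 angstrom cutoffs respectively, so the score is the average of 25, 50, 75 and 75, which is 56.25. A perfect prediction scores 100.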

You can find the abstracts from all of the participating groups from 2020 here.

Find out more:

DeepMind’s blog post: “AlphaFold: a solution to a 50-year-old grand challenge in biology”.

Nature paper published on the first version of AlphaFold: “Improved protein structure prediction using potentials from deep learning”.




Lucy Smith is Senior Managing Editor for AIhub.

©2026.02 - Association for the Understanding of Artificial Intelligence