AIhub.org
 

Investigating neural collapse in deep classification networks


18 May 2022




Animation demonstrating the neural collapse phenomenon. Credit: X.Y. Han, Vardan Papyan, and David Donoho.

X.Y. Han, Vardan Papyan, and David Donoho won an outstanding paper award at ICLR 2022 for their paper “Neural collapse under MSE loss: proximity to and dynamics on the central path”. Here, they tell us more about this research, their methodology, and what the implications of this work are.

What is the topic of the research in your paper?

Our work takes a data scientific approach to understanding deep neural networks. We make scientific measurements that identify common, prevalent empirical phenomena that occur in canonical deep classification networks trained with paradigmatic methods. We then build and analyze a mathematical model to understand the phenomena.

What were your main findings?

When making measurements on canonical deep net architectures trained with paradigmatic practices, we observed an interesting behavior of the penultimate deep net layer: the features collapse to their class-means, both classifiers and class-means collapse to the same Simplex Equiangular Tight Frame, and classifier behavior collapses to the nearest-class-mean decision rule. We call this behavior “Neural Collapse”. In this paper, we derive the dynamics of Neural Collapse in explicit form under a popular mathematical model for deep nets.
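To make the Simplex Equiangular Tight Frame geometry concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' measurement code: the function names `simplex_etf` and `etf_deviation` are hypothetical, classes are assumed to be balanced, and the two deviation scores are only in the spirit of checking equal norms and equal pairwise angles.

```python
from typing import Tuple

import numpy as np


def simplex_etf(num_classes: int) -> np.ndarray:
    """Return a C x C matrix whose columns are the unit vectors of a standard Simplex ETF."""
    c = num_classes
    centering = np.eye(c) - np.ones((c, c)) / c
    # The scaling makes every column unit-norm; distinct columns have cosine -1/(C-1).
    return np.sqrt(c / (c - 1)) * centering


def etf_deviation(class_means: np.ndarray) -> Tuple[float, float]:
    """Score how far class means (d x C, one column per class) are from a Simplex ETF.

    Returns (norm_variation, angle_error); both are ~0 for an exact Simplex ETF.
    Assumes balanced classes, so the global mean is the plain average of class means.
    """
    c = class_means.shape[1]
    centered = class_means - class_means.mean(axis=1, keepdims=True)
    norms = np.linalg.norm(centered, axis=0)
    # Equal-norm check: coefficient of variation of the centered class-mean norms.
    norm_variation = float(norms.std() / norms.mean())
    unit = centered / norms
    cosines = unit.T @ unit
    off_diagonal = cosines[~np.eye(c, dtype=bool)]
    # Equal-angle check: worst deviation of pairwise cosines from -1/(C-1).
    angle_error = float(np.abs(off_diagonal + 1.0 / (c - 1)).max())
    return norm_variation, angle_error


# An exact Simplex ETF scores (numerically) zero on both measures.
etf = simplex_etf(4)
norm_variation, angle_error = etf_deviation(etf)

# Random class means, by contrast, are not expected to have this structure.
rng = np.random.default_rng(0)
random_means = rng.normal(size=(16, 4))
random_variation, random_error = etf_deviation(random_means)
```

In a real experiment, `class_means` would hold the per-class averages of penultimate-layer features, and one would track both scores over training epochs.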

Could you tell us about the implications of your research and why it is an interesting area for study?

Because of the complexity of modern deep nets, practitioners and researchers still regard them as mostly opaque but high-performing boxes. These opaque boxes are massively overparameterized, and researchers frequently engineer new adjustments to improve their performance. Thus, one might expect the trained networks to exhibit many particularities that make it impossible to find any empirical regularities across a wide range of datasets and architectures. Our findings are interesting because, on the contrary, we reveal that Neural Collapse is a common empirical pattern across many classification datasets and architectures. Careful theoretical analysis of this pattern and its limiting equiangular tight frame (ETF) structure can give insights into important components of the modern deep learning training paradigm, such as adversarial robustness and generalization.

Could you explain your methodology?

Building on previous work in which we demonstrated Neural Collapse under the cross-entropy loss, we first establish the empirical reality of Neural Collapse under mean squared error (MSE) loss by demonstrating its occurrence on three canonical networks and five benchmark datasets. We then present a new set of experiments showing that, at roughly any fixed point in training, the last-layer classifiers stay “least-squares optimal” relative to their associated features. When classifiers and features satisfy this correspondence, we say they are on a “central path”.
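One way to picture the “least-squares optimal” check is to solve for the best linear classifier given a fixed set of features and compare it with the classifier actually in use. The sketch below is a simplified illustration, not the paper's procedure: the helper names `least_squares_classifier` and `relative_gap` and the optional `ridge` parameter are assumptions made here, and the features are random stand-ins rather than real network activations.

```python
import numpy as np


def least_squares_classifier(features, targets, ridge=0.0):
    """Fit weights and bias minimizing squared error for fixed features.

    features: (d, N) array; targets: (C, N) one-hot array.
    ridge is a hypothetical optional penalty (applied to weights and bias alike).
    Returns (weights of shape (C, d), bias of shape (C,)).
    """
    d, n = features.shape
    # Append a constant row so the bias is fitted jointly with the weights.
    extended = np.vstack([features, np.ones((1, n))])
    gram = extended @ extended.T + ridge * np.eye(d + 1)
    solution = targets @ extended.T @ np.linalg.pinv(gram)
    return solution[:, :d], solution[:, d]


def relative_gap(weights, bias, ls_weights, ls_bias):
    """Relative Frobenius distance between a classifier and the least-squares one."""
    current = np.hstack([weights, bias[:, None]])
    optimal = np.hstack([ls_weights, ls_bias[:, None]])
    return float(np.linalg.norm(current - optimal) / np.linalg.norm(optimal))


rng = np.random.default_rng(0)
num_classes, dim, per_class = 3, 5, 50
labels = np.repeat(np.arange(num_classes), per_class)
features = rng.normal(size=(dim, labels.size))   # stand-in for last-layer features
targets = np.eye(num_classes)[:, labels]         # one-hot columns, shape (C, N)

ls_weights, ls_bias = least_squares_classifier(features, targets)

# A classifier equal to the least-squares solution has zero gap;
# perturbing the weights moves it away from that solution.
perturbed = ls_weights + 0.1 * rng.normal(size=ls_weights.shape)
gap_at_optimum = relative_gap(ls_weights, ls_bias, ls_weights, ls_bias)
gap_perturbed = relative_gap(perturbed, ls_bias, ls_weights, ls_bias)
```

Tracking a quantity like `relative_gap` across training checkpoints would be one way to see whether a trained classifier stays close to the least-squares solution for its current features.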

The simple analytic form of the MSE loss allows us to perform deeper mathematical analyses of training dynamics leading to Neural Collapse than is currently possible on cross-entropy. In particular, by adopting the popular “unconstrained features” (or “layer-peeled”) model, we derive explicit, closed-form dynamics of Neural Collapse on the central path.
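As a rough illustration of the “unconstrained features” idea, the last-layer features are treated as free optimization variables rather than as outputs of earlier layers. The notation below is an assumption made for this sketch and does not come from the interview: H is the d × N matrix of features, W and b are the last-layer weights and bias, Y is the C × N matrix of one-hot targets, and any regularization terms that a full formulation may include are omitted.

```latex
\[
  \min_{W,\, b,\, H} \; \mathcal{L}(W, b, H)
  = \frac{1}{2N} \left\lVert W H + b \mathbf{1}_N^{\top} - Y \right\rVert_F^2
\]
\[
  \bigl(W_{\mathrm{LS}}(H),\, b_{\mathrm{LS}}(H)\bigr)
  \in \operatorname*{arg\,min}_{W,\, b} \; \mathcal{L}(W, b, H)
\]
```

In this notation, features and classifier lie on the central path when W = W_LS(H) and b = b_LS(H), so the loss along the path depends on the features H alone.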

What further work are you planning in this area?

We are interested in examining the implications of Neural Collapse for generalization. As we describe in an “Open Questions” section of this paper, we do empirically observe the occurrence of Neural Collapse on test data as well, but at a much slower rate. In future work, we hope to present more extensive measurements characterizing Neural Collapse on out-of-sample data and to leverage these observations to predict a deep net’s test performance and adversarial robustness.

References

Vardan Papyan, X.Y. Han, and David L. Donoho. Prevalence of neural collapse during the terminal phase of deep learning training. Proceedings of the National Academy of Sciences (PNAS), 117.40 (2020): 24652-24663.

X.Y. Han, Vardan Papyan, and David L. Donoho. Neural collapse under MSE loss: proximity to and dynamics on the central path. International Conference on Learning Representations (ICLR) 2022.



X.Y. Han is a PhD Candidate in the School of Operations Research and Information Engineering at Cornell University. He earned a BSE in Operations Research and Financial Engineering from Princeton University in 2016 and an MS in Statistics from Stanford University in 2018.


Vardan Papyan is an assistant professor in Mathematics, cross-appointed with Computer Science, at the University of Toronto, affiliated with the Vector Institute for Artificial Intelligence and the Schwartz Reisman Institute for Technology and Society. His research spans deep learning, signal processing, high-performance computing, high-dimensional statistics, and applied mathematics. His research was recognized by the Natural Sciences and Engineering Research Council in Canada through a Discovery Grant and a Discovery Launch Supplement and by Compute Canada.


David Donoho is the Anne T. and Robert M. Bass Professor of Humanities and Sciences and a professor in the Department of Statistics at Stanford University.





AIhub is dedicated to free high-quality information about AI.




 















©2026.02 - Association for the Understanding of Artificial Intelligence