Each year the AAAI recognizes a group of individuals who have made significant, sustained contributions to the field of artificial intelligence by appointing them as Fellows. Over the course of the next few months, we’ll be talking to some of the 2026 AAAI Fellows. In this interview, we met with Tanya Berger-Wolf, who was elected as a Fellow “for significant contributions to advancing AI for nature, from science to impact in biodiversity and conservation”. We found out about her latest research developing a foundation model for biology, the insights this model can provide, interesting collaborations over the years, and what the future has in store.
I’m Tanya Berger-Wolf, a Professor of Computer Science and Engineering, Electrical and Computer Engineering, and Evolution, Ecology and Organismal Biology. I’m at the Ohio State University in Columbus, Ohio, US. My area of research is in AI for ecology, biodiversity, and conservation. This is a unique intersection of AI, the science of nature, and the practice of monitoring and protecting ecosystems. My work spans this entire arc from the very basic foundational research to collaborating with policy makers and practitioners in the field to deploy the solutions and the recommendations.
Did you know that there are only two million named species? Even biologists don’t have a good idea of how many species there actually are. Until about three years ago, the estimated total number of plants, fungi, and animals was about 10, maybe 15 million. Shockingly low, right? But because the number is based mostly on the rate of discovery and we’ve become much better at discovering new species, that number has recently been revised to about 30 to 50 million. And out of all of those, only two million are named. So we know vanishingly little, and we’re actually losing species faster than we can name them or even know that they were there. Therefore, one big motivation for our work is biodiversity monitoring, just knowing what’s out there.
I’ve been in the field for a long time and my foundational research has always been inspired, driven and motivated by the nature side of it – ecology, biodiversity, and conservation. But the research questions have always been advancing the frontiers of computer science, data science, machine learning and AI.
I lead two federally and internationally funded centers and institutes. Firstly, the Imageomics Institute, funded by the US National Science Foundation. I have a habit of establishing new fields of science, so the -omics part of “imageomics” covers all of biology (think “genomics”, “proteomics”), and the image- part means that we focus on traits and phenotypes observable from images. We extract not just the morphology, shape, size, color, and curvature, but things like behavior, movement, facial expressions, and so on.
We’re asking if we can get answers that aren’t just classification, that explain why a particular bird has been classified as it has (for example, because it has a yellow belly, black behind the beak, and it wobbles when it walks). Biology is a good testing ground, because not only do we have a lot of conceptually and semantically meaningful ground truths (which are verifiable), we also have structures to the data. Biologists are the ones who started taxonomy, right? So there is this taxonomy, phylogeny, evolutionary trees, countless biological ontologies, etc.
We can actually bring the structural knowledge of biology directly into the machine learning model architectures to advance both AI research and scientific discovery. We’re considering many of the questions that are current in computer science and AI research now, like interpretable and explainable AI, domain-grounded foundation models, complex system modeling, and hypothesis generation. One of our main projects has been building a foundation model for the Tree of Life.
The first Imageomics workshop, held at AAAI 2024.
The first version of our foundation model, BioCLIP, was released in 2024. The work was co-led by two fantastic PhD students, Jiaman (Lisa) Wu and Sam Stevens, and it won the best student paper award at CVPR 2024. We started by exploring whether adding taxonomy structure to the basic task of species classification from images improves performance, but we moved way beyond the classification task. Our model is the first foundation model for the Tree of Life, and it showed that adding biological information allows not just classification but also new species discovery and queries at different taxonomic levels, such as genus or family.
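One simple way to picture a query at a higher taxonomic level is to sum species-level scores up the taxonomy. The sketch below is only an illustration under stated assumptions: the `TAXONOMY` table, the helper names `roll_up` and `best_at_rank`, and the scores are all hypothetical, and this is not a description of how BioCLIP itself answers such queries.

```python
from collections import defaultdict

# Hypothetical lookup table: species -> (genus, family). Illustrative only.
TAXONOMY = {
    "Geospiza fortis": ("Geospiza", "Thraupidae"),
    "Geospiza magnirostris": ("Geospiza", "Thraupidae"),
    "Camarhynchus parvulus": ("Camarhynchus", "Thraupidae"),
    "Passer domesticus": ("Passer", "Passeridae"),
}
RANKS = {"genus": 0, "family": 1}


def roll_up(species_probs, rank):
    """Sum species-level probabilities into totals at the requested rank."""
    idx = RANKS[rank]
    totals = defaultdict(float)
    for species, p in species_probs.items():
        totals[TAXONOMY[species][idx]] += p
    return dict(totals)


def best_at_rank(species_probs, rank):
    """Return the taxon with the highest summed probability at that rank."""
    totals = roll_up(species_probs, rank)
    return max(totals, key=totals.get)


# Made-up classifier scores for a single image (assumed to sum to 1).
probs = {
    "Geospiza fortis": 0.30,
    "Geospiza magnirostris": 0.28,
    "Camarhynchus parvulus": 0.32,
    "Passer domesticus": 0.10,
}

top_species = max(probs, key=probs.get)
top_genus = best_at_rank(probs, "genus")
top_family = best_at_rank(probs, "family")
```

In this made-up case the single most likely species belongs to one genus, while the most likely genus is a different one, because two closely related species share the remaining probability. That is one reason an answer at genus or family level can be more reliable than a species-level guess.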
In the second version, BioCLIP2, which was a Spotlight Paper at NeurIPS in December 2025, we scaled up to 214 million images of almost half of all the named species, greatly expanding the taxonomic coverage from previous datasets and models. With this version of BioCLIP, we asked whether the scale mattered. And the answer is absolutely yes. It turns out that within the model embeddings not only is there clustering by species and hierarchical clustering following the biological taxonomy structure, which you would expect, but within species, there’s subclustering. There are dimensions, orthogonal to the species clustering, along which there is a separation by age (juvenile vs adult), sex (female vs male), or health (diseased vs healthy). And this is purely the effect of the scale of the data, as the images contained none of these labels.
On a field trip.
We can use the model as a starting point for all kinds of things. For example, we looked at the classic example of functional traits: the beak shapes and sizes of Darwin’s finches (the shape and size of a beak is strongly connected to the diet). We wanted to see if there was a dimension in the model embedding space where the finch species are aligned by beak size, and there was! It was mind blowing because we didn’t annotate the images with beak sizes. Clearly that information, when you have the right scale of data, starts popping up.
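To make the idea of a trait-aligned dimension concrete, here is a minimal sketch on synthetic data. It assumes one embedding vector per species and a known trait value for each, and it uses a simple covariance-weighted direction as the probe. The function names, the 4-dimensional embeddings, and the beak sizes are all invented for illustration; the interview does not say which method the team used.

```python
import random


def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def trait_direction(embeddings, trait):
    """Unit vector along sum_i (t_i - mean_t) * (e_i - mean_e)."""
    centre = mean_vector(embeddings)
    t_mean = sum(trait) / len(trait)
    direction = [0.0] * len(centre)
    for e, t in zip(embeddings, trait):
        for j in range(len(centre)):
            direction[j] += (t - t_mean) * (e[j] - centre[j])
    norm = sum(d * d for d in direction) ** 0.5
    return [d / norm for d in direction]


def project(embeddings, direction):
    """Scalar projection of each embedding onto the direction."""
    return [sum(a * b for a, b in zip(e, direction)) for e in embeddings]


# Synthetic stand-in for per-species embeddings: beak size is hidden along
# one axis of a 4-dimensional space, plus Gaussian noise. All values made up.
rng = random.Random(0)
hidden_axis = [0.6, 0.0, 0.8, 0.0]
beak_size = [8.0, 10.5, 12.0, 14.5, 17.0, 20.0]
embeddings = [[b * h + rng.gauss(0, 0.5) for h in hidden_axis] for b in beak_size]

direction = trait_direction(embeddings, beak_size)
scores = project(embeddings, direction)
r = pearson(scores, beak_size)
```

A caveat for real embeddings: with hundreds of dimensions and only a handful of species, a probe like this will almost always find some direction that correlates with the trait in-sample, so the direction would need to be checked on species or images that were held out when it was fitted.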
So this is actually a tool where scientists can start asking questions and generating hypotheses. For example, female-male separation within some species is greater than within other species. Evolutionary biologists are now looking at whether there is a correlation between how well the clusters of female and male birds separate and the type of mating system, because there is a really good hypothesis on the relationship between sexual dimorphism (differences in the appearance of the sexes) and the type of mating system. It has never been clearly shown because sexual dimorphism is hard to quantify. Well, now we have a tool to quantify it.
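One possible way to turn "how well the clusters separate" into a number is the distance between the female and male centroids divided by the average spread within each group. The sketch below is an assumption about what such a measure could look like, not the metric used by the researchers; `separation_index` and the toy 2-dimensional points are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def mean_distance_to(vectors, point):
    """Average Euclidean distance from each vector to a reference point."""
    return sum(math.dist(v, point) for v in vectors) / len(vectors)


def separation_index(female, male):
    """Between-centroid distance divided by the mean within-group spread.

    Larger values mean the two sexes sit further apart in embedding space
    relative to how scattered each group is. Undefined if both groups have
    zero spread.
    """
    cf, cm = centroid(female), centroid(male)
    between = math.dist(cf, cm)
    within = (mean_distance_to(female, cf) + mean_distance_to(male, cm)) / 2
    return between / within


# Toy 2-D "embeddings" for two imaginary species (all values made up).
females = [[0, 0], [0, 1], [1, 0], [1, 1]]
males_species_a = [[5, 5], [5, 6], [6, 5], [6, 6]]  # sexes far apart
males_species_b = [[1, 1], [1, 2], [2, 1], [2, 2]]  # sexes close together

index_a = separation_index(females, males_species_a)
index_b = separation_index(females, males_species_b)
```

Computed per species, an index like this could then be compared against mating system categories. A standard alternative with a similar purpose is the silhouette score.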
The other thing we are able to do is zoom all the way down to super fine-grained classification. The model is starting to point out new traits that we maybe didn’t notice, because humans, as visual as we are, don’t see everything. We’re much better at shape, size, and color than at things like, for example, curvature, or pointedness, or other subtle differences.
There was a paper a couple of years ago that showed that humans are not really good at separating red phenotypes of polymorphic moths because we don’t have enough red-orange acuity. I mean, this is obvious, right? These phenotypes didn’t evolve to please the human eye or even for the human vision model. No, they evolved because they were selected by the birds, who have a different vision model and different color acuity. So we are seeing the model pick up traits like this again and again, traits that humans have missed because they are in our blind spot.
Something else I’d like to highlight is that, as a result of our foundation model, we were approached by the OSU Infectious Diseases Institute, who work with the Ohio Department of Health. They were worried about ticks in the state. In the last 10 years, Ohio went from a state with one species of tick to six species. Ticks bring in infectious diseases, such as Lyme disease, and it’s a really big public health problem. To identify ticks currently, the Department has to get physical samples and identify them. We used BioCLIP2 to identify the ticks just from images. As it’s a foundation model, and not built for specific subsets of species, the initial accuracy was around 50%. However, after fine-tuning on pictures of ticks, the accuracy is now over 90%. This is way better than human accuracy for the same task. We’d like to develop an app so that a person in rural Ohio or Maine, for instance, can take a picture of a tick and find out whether it is a disease-carrying species (or even the specific individual!) or not.
We do have the results of an exciting experiment due out very soon, so watch this space!
More generally, we do already have glimpses of discovery – new species are being flagged by the model.
We’re also accelerating and scaling up processing. For example, we partner with the National Ecological Observatory Network (NEON), which has 41 sites throughout the US. Each one has full technological and sensor infrastructure to collect data over long periods (at least 30 years) with a uniform collection protocol from different ecosystems. One of the data streams they collect is ground beetles in traps. They have test tubes of beetles, but they’re really slow at processing, as you can imagine. We’ve partnered up with them to help with the data processing. We can calibrate size and color, for example, with segmentation down to the beetle parts, putting in landmarks on the samples that are biologically meaningful. It turns out that you can predict drought conditions before the normal methods do by observing ground beetle behavior. It seems obvious when you think about it – the beetles will be affected by the moisture in the soil, or soil composition, before humans can observe it.
Another project that we’re keen to progress is the building of a smart field lab as a test bed for this new area of technology and AI-enabled field science. It will require a lot of funding, but I think it is critical to be able to test and validate the new research methods.
I lead the US National Science Foundation (NSF) funded Imageomics Institute and the AI and Biodiversity Change (ABC) Global Center co-funded by the NSF and the Natural Sciences and Engineering Research Council of Canada (NSERC).
We also work in partnership with Wildlabs, which is a global online conservation community with a presence on every continent. With that collaboration we published a vision paper (which we hope is visionary) in the first issue of Nature Reviews Biodiversity. It’s a prospective paper on AI to fill biodiversity knowledge shortfalls. We have a lot of data and observations for a small fraction of species, but very little for the rest. We have almost nothing for fungi. In terms of geography, we have a lot of data for North America, Western Europe, some parts of Australia and South Africa, but vanishingly little in the biodiversity hotspots. We really need to fill these gaps. I think that’s what’s next in AI, and that’s what we’re building. Currently, AI is about speeding up and scaling up data collection and processing. We want to truly move towards whole system synthesis, understanding, and modeling, and on to hypothesis generation and scientific discovery. How do we push AI as a scientific partner? The future is multi-modal, multi-scale, and multi-sensory, so how do we put it all together?
With AI and Biodiversity Change (ABC) we are collaborating with Sara Beery, extracting interactions from iNaturalist data. iNaturalist is this big citizen science platform for nature observations. We’re asking how we can extract the value of already collected data.
One of my fantastic PhD students, Jenna Kline, is developing adaptive, intelligent multi-drone systems for animal behavior monitoring. If we are serious about understanding ecosystems across all of these aspects, scales, and interactions, how do we develop the systems to do it? This is very much an engineering and computational question. We don’t actually have a way of doing it at the moment for field sciences. I actually think that the future of AI is very much at the edge – embodied physical AI systems, not the way we think about it as an LLM.
Wild Me, which is now part of Conservation X Labs, was one of the first AI for conservation non-profits. I served on the board of directors for ten years. One of our projects, which I co-led, is called Wildbook, a platform that uses AI-based individual animal identification from images for population monitoring and tracking individuals. Yes, individuals, not just zebra or whale but Zippy the zebra and Willy the whale! We now have individual IDs for more than 200 species. The data from that project has been used for conservation decision making, intervention evaluations, tracking, population size information, IUCN [International Union for Conservation of Nature] Red List conservation status designations, policy making and resource allocation. It’s great that this project has had such an impact and led to actual changes in the welfare and well-being of species out there.
One of the papers that I’m most proud of in my career is a paper I’m not actually a co-author on. It’s a paper on the biology of whale sharks. They are a global species that can travel more than 5,000 miles. Our platform was deployed to recognize individual whale sharks. Before that, people didn’t realize that the same individuals are literally in Cancun in July and in the Philippines in January. And by combining data from all over the world about these global species, they were able to understand the biology, the movement, and the migration patterns, and calculate a much more accurate population size estimate. This then led to a change in the IUCN Red List designation from vulnerable to endangered. That designation matters because it means a very different protection policy, very different resource allocation, and it actually makes a difference in the wellbeing of the species.
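For readers wondering how re-sightings of known individuals can feed a population size estimate, a classic textbook illustration is the Chapman form of the Lincoln-Petersen mark-recapture estimator. With M individuals identified in a first survey, C in a second, and R seen in both, the estimate is (M + 1)(C + 1) / (R + 1) - 1. The sketch below uses made-up individual IDs. It assumes a closed population, which does not hold for a migratory global species, and it is not the method of the whale shark study, whose analysis is not described here.

```python
def chapman_estimate(first_survey_ids, second_survey_ids):
    """Chapman estimator of population size from two sets of sighted IDs.

    Textbook assumptions (strong ones): a closed population, equal chance
    of sighting for every individual, and no identification errors.
    """
    marked = set(first_survey_ids)       # M: individuals seen in survey 1
    caught = set(second_survey_ids)      # C: individuals seen in survey 2
    recaptured = marked & caught         # R: individuals seen in both
    m, c, r = len(marked), len(caught), len(recaptured)
    return (m + 1) * (c + 1) / (r + 1) - 1


# Hypothetical individual IDs, as an image-matching system might assign them.
first_survey = [f"WS-{i:03d}" for i in range(1, 10)]    # WS-001 to WS-009
second_survey = [f"WS-{i:03d}" for i in range(6, 15)]   # WS-006 to WS-014

estimate = chapman_estimate(first_survey, second_survey)
```

The example shows why individual identification matters: the estimate depends entirely on knowing which animals in the second survey were already seen in the first.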
I’d like to highlight that there is a whole team behind this work and I’m proud to have so many co-authors and collaborators. I think it is important to understand that today’s science is intensely interdisciplinary and is a team effort. The stereotype of a lonely scientist has never been true, and today, less true than ever.
Tanya at AAAI 2026, where she received recognition for being elected as a AAAI Fellow.
Dr Tanya Berger-Wolf is a Professor of Computer Science and Engineering, Electrical and Computer Engineering, and Evolution, Ecology, and Organismal Biology at the Ohio State University, where she is also the Director of the Translational Data Analytics Institute. A pioneer in AI for ecology, biodiversity, and conservation, she leads the NSF-funded Imageomics Institute and the US-Canada co-funded AI and Biodiversity Change (ABC) Global Center.
Dr Berger-Wolf serves on advisory and governance bodies including the US National Academies Board on Life Sciences, the Global Partnership on AI (GPAI)/OECD, the National Ecological Observatory Network (NEON), and The Nature Conservancy. She co-led Wild Me (now part of Conservation X Labs), one of the first AI conservation nonprofits, where she co-created Wildbook, recognized by UNESCO for advancing the UN Sustainable Development Goals. Her contributions have earned numerous honors, including recognition as one of the AI 100 Global Thought Leaders by H2O.ai. She is an elected Fellow of the Association for the Advancement of Artificial Intelligence (AAAI) and the American Association for the Advancement of Science (AAAS).
Prior to coming to The Ohio State University in January 2020, Berger-Wolf was at the University of Illinois at Chicago. She received her PhD from the University of Illinois at Urbana-Champaign in 2002.