AIhub.org
 

Museums have tons of data, and AI could make it more accessible − but standardizing and organizing it across fields won’t be easy


20 March 2025




By Bradley Wade Bishop, University of Tennessee

Ice cores in freezers, dinosaurs on display, fish in jars, birds in boxes, human remains and ancient artifacts from long-gone civilizations that few people ever see – museum collections are filled with all this and more.

These collections are treasure troves that recount the planet’s natural and human history, and they help scientists in fields such as geology, paleontology and anthropology. What you see on a trip to a museum is only a sliver of the wonders held in its collection.

Museums generally want to make the contents of their collections available for teachers and researchers, either physically or digitally. However, each collection’s staff has its own way of organizing data, so navigating these collections can prove challenging.

Creating, organizing and distributing the digital copies of museum samples or the information about physical items in a collection requires incredible amounts of data. And this data can feed into machine learning models or other artificial intelligence to answer big questions.

Currently, even within a single research domain, finding the right data requires navigating different repositories. AI can help organize large amounts of data from different collections and pull out information to answer specific questions.

But using AI isn’t a perfect solution. A set of shared practices and systems for data management between museums could improve the data curation and sharing necessary for AI to do its job. These practices could help both humans and machines make new discoveries from these valuable collections.

As an information scientist who studies scientists’ approaches to and opinions on research data management, I’ve seen how the world’s physical collection infrastructure is a patchwork quilt of objects and their associated metadata.

AI tools can do amazing things, such as make 3D models of digitized versions of the items in museum collections, but only if there’s enough well-organized data about those items available. To see how AI can help museum collections, my team of researchers started by conducting focus groups with the people who manage museum collections. We asked what they were doing to get their collections used by both humans and AI.

Collection managers

When an item comes into a museum collection, the collection managers are the people who describe that item’s features and generate data about it. That data, called metadata, allows others to find and use the item. It might include things like the collector’s name, the geographic location, the time the item was collected and, in the case of geological samples, the epoch it’s from. For samples from an animal or plant, it might include the taxonomy, which is the set of Latin names that classify it.
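To make that description concrete, here is a minimal sketch of what one item's metadata record might look like as a data structure. The class name, field names and sample values are all invented for illustration and do not come from any real collection or standard.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SpecimenRecord:
    """One item's metadata; every field name here is illustrative."""
    collector: str
    latitude: float
    longitude: float
    collected_on: str               # ISO 8601 date string
    epoch: Optional[str] = None     # geological samples only
    taxonomy: Optional[str] = None  # biological samples only


# Invented example values, not drawn from any real collection.
fish = SpecimenRecord(
    collector="J. Doe",
    latitude=35.96,
    longitude=-83.92,
    collected_on="1987-06-14",
    taxonomy="Salmo trutta",
)

# Convert to a plain dictionary and list the fields left empty.
record = asdict(fish)
missing = [name for name, value in record.items() if value is None]
print(missing)  # prints ['epoch']
```

Listing the empty fields is one simple way to see where a record has gaps that a person, or a tool, would need to fill.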

All together, that information adds up to a mind-boggling amount of data.

But combining data across domains with different standards is really tricky. Fortunately, collection managers have been working to standardize their processes across disciplines and for many types of samples. Grants have helped science communities build tools for standardization.

In biological collections, the tool Specify allows managers to quickly classify specimens with drop-down menus prepopulated with standards for taxonomy and other parameters to consistently describe the incoming specimens.

A common metadata standard in biology is Darwin Core. Similar well-established metadata standards and tools exist across the sciences to make the workflow of taking real items and putting them into a machine as easy as possible.

Special tools like these and metadata help collection managers make data from their objects reusable for research and educational purposes.

Many of the items in museum collections don’t have a lot of information describing their origins. AI tools can help fill in gaps.

All the small things

My team and I conducted 10 focus groups, with a total of 32 participants from several physical sample communities. These included collection managers across disciplines, including anthropology, archaeology, botany, geology, ichthyology, entomology, herpetology and paleontology.

Each participant answered questions about how they accessed, organized, stored and used data from their collections in an effort to make their materials ready for AI to use. While human subjects need to provide consent to be studied, most species do not. So, an AI can collect and analyze the data from nonhuman physical collections without privacy or consent concerns.

We found that collection managers from different fields and institutions have lots of different practices when it comes to getting their physical collections ready for AI. Our results suggest that standardizing the types of metadata managers record and the ways they store it across collections could make the items in these collections more accessible and usable.
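As a rough illustration of what standardizing metadata can mean in practice, the sketch below renames fields from two collections onto shared terms. The target names (`recordedBy`, `eventDate`, `scientificName`) are Darwin Core terms, but the collection names, local field names and sample row are hypothetical, and this is not the method used in our study.

```python
# Hypothetical local field names mapped to Darwin Core terms.
FIELD_MAPS = {
    "fish_collection": {
        "collector_name": "recordedBy",
        "date": "eventDate",
        "species": "scientificName",
    },
    "herbarium": {
        "collected_by": "recordedBy",
        "collection_date": "eventDate",
        "taxon": "scientificName",
    },
}


def harmonize(source: str, row: dict) -> dict:
    """Rename a row's keys to shared terms; keep unknown keys under 'unmapped'."""
    mapping = FIELD_MAPS[source]
    out = {}
    unmapped = {}
    for key, value in row.items():
        if key in mapping:
            out[mapping[key]] = value
        else:
            unmapped[key] = value
    if unmapped:
        out["unmapped"] = unmapped
    return out


# Invented sample row from the hypothetical herbarium.
row = {"collected_by": "A. Smith", "taxon": "Quercus alba", "shelf": "B4"}
print(harmonize("herbarium", row))
```

Keeping unmapped fields rather than discarding them means local details survive until someone decides how they fit the shared standard.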

Additional research projects like our study can help collection managers build up the infrastructure they’ll need to make their data machine-ready. Human expertise can help inform AI tools that make new discoveries based on the old treasures in museum collections.

Bradley Wade Bishop, Professor of Information Sciences, University of Tennessee

This article is republished from The Conversation under a Creative Commons license. Read the original article.




The Conversation is an independent source of news and views, sourced from the academic and research community and delivered direct to the public.



