AIhub.org
 

Machine learning powers new approach to detecting soil contaminants


06 June 2025




By Silvia Cernea Clark

A team of researchers at Rice University and Baylor College of Medicine has developed a new strategy for identifying hazardous pollutants in soil, even ones that have never been isolated or studied in a lab.

The new approach, described in a study published in Proceedings of the National Academy of Sciences, combines light-based imaging, theoretical predictions of compounds’ light signatures and machine learning (ML) algorithms to detect toxic compounds like polycyclic aromatic hydrocarbons (PAHs) and their derivative compounds (PACs) in soil. Common by-products of combustion, PAHs and PACs have been linked to cancer, developmental issues and other serious health problems.

Identifying pollutants in soil usually requires advanced laboratories and standard physical reference samples of the suspected contaminants. However, for many environmental pollutants that pose a public health risk, there is no experimental data available that can be used to detect them.

“This method makes it possible to identify chemicals that have not yet been isolated experimentally,” said Naomi Halas, University Professor and the Stanley C. Moore Professor of Electrical and Computer Engineering at Rice.

The new method uses a light-based imaging technique known as surface-enhanced Raman spectroscopy, which analyzes how light interacts with molecules, tracking the unique patterns, or spectra, they emit. Spectra serve as “chemical fingerprints” for each compound. The technique is refined through the use of signature nanoshells designed to enhance relevant traits in the spectra.

Using density functional theory, a computational modeling technique that can predict how atoms and electrons behave in a molecule, the researchers calculated what the spectra of a whole range of PAHs and PACs look like based on the compounds’ molecular structure. This allowed them to generate a virtual library of “fingerprints” for PAHs and PACs.
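To make the idea of a virtual library concrete, here is a minimal Python sketch of how predicted peak positions could be turned into smooth reference spectra. It is an illustration only, not the researchers' code: the compound names, peak positions, intensities and the Lorentzian line width are all hypothetical placeholders, and the density functional theory step that would produce real peak lists is not shown.

```python
def lorentzian(x, center, width):
    """Unit-height Lorentzian line shape centered at `center` (half-width `width`)."""
    return width ** 2 / ((x - center) ** 2 + width ** 2)


def broaden(peaks, shifts, width=5.0):
    """Turn (position, intensity) stick peaks into a smooth spectrum sampled on `shifts`."""
    return [
        sum(intensity * lorentzian(x, position, width) for position, intensity in peaks)
        for x in shifts
    ]


# Hypothetical stick spectra: (Raman shift in cm^-1, relative intensity).
# Placeholder values for illustration only, NOT results from any calculation.
stick_library = {
    "compound_A": [(600.0, 0.4), (1380.0, 1.0), (1600.0, 0.7)],
    "compound_B": [(750.0, 0.9), (1240.0, 0.5), (1620.0, 1.0)],
}

# Common axis: 400 to 1800 cm^-1 in 2 cm^-1 steps (an assumed range).
shifts = [400.0 + 2.0 * i for i in range(701)]

# The "virtual library": one simulated spectrum per compound, all on the same axis.
virtual_library = {
    name: broaden(peaks, shifts) for name, peaks in stick_library.items()
}
```

Sampling every simulated spectrum on the same axis is a design choice that keeps later comparison with a measured spectrum simple.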

The soil used in this study was collected from Harris Gully, a restored watershed and natural area on the Rice University campus. (Photo by Brandon Martin/Rice University)

Two complementary ML algorithms, characteristic peak extraction and characteristic peak similarity, were used to parse relevant spectral traits in real-world soil samples and match them to compounds mapped out in the virtual library of spectra.
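The article does not spell out how the two algorithms work internally, so the Python sketch below substitutes a deliberately simple stand-in: local-maximum peak picking followed by a score that counts how many reference peaks have a nearby peak in the sample. The function names, the threshold and tolerance values, and all data points are assumptions made for illustration; this is not the method from the study.

```python
def extract_peaks(shifts, intensities, min_height=0.2):
    """Return shift positions of local maxima at least `min_height` times the tallest point."""
    top = max(intensities)
    peaks = []
    for i in range(1, len(intensities) - 1):
        y = intensities[i]
        if y >= min_height * top and y > intensities[i - 1] and y >= intensities[i + 1]:
            peaks.append(shifts[i])
    return peaks


def peak_similarity(sample_peaks, reference_peaks, tol=20.0):
    """Fraction of reference peaks that have a sample peak within `tol` cm^-1."""
    if not reference_peaks:
        return 0.0
    hits = sum(
        any(abs(s - r) <= tol for s in sample_peaks) for r in reference_peaks
    )
    return hits / len(reference_peaks)


def best_match(sample_peaks, library, tol=20.0):
    """Score the sample against every library entry and return (best name, all scores)."""
    scores = {
        name: peak_similarity(sample_peaks, ref, tol) for name, ref in library.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical measured spectrum (Raman shift in cm^-1, relative intensity).
measured_shifts = [700, 725, 750, 775, 800, 1200, 1225, 1250, 1275, 1600, 1625, 1650]
measured_signal = [0.1, 0.3, 0.9, 0.3, 0.1, 0.1, 0.5, 0.2, 0.1, 0.4, 1.0, 0.3]

# Hypothetical library of predicted peak positions, placeholder values only.
peak_library = {
    "compound_A": [600.0, 1380.0, 1600.0],
    "compound_B": [750.0, 1240.0, 1620.0],
}

sample_peaks = extract_peaks(measured_shifts, measured_signal)
name, scores = best_match(sample_peaks, peak_library)
```

With these made-up numbers, the sample's three peaks all fall within tolerance of the peaks listed for `compound_B`, so it is returned as the best match. A real pipeline would also need baseline correction, noise handling and a way to deal with mixtures, none of which are attempted here.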

“We are using PAHs in soil to illustrate this very important new strategy,” Halas said. “There are tens of thousands of PAH-derived chemicals, and this approach (calculating their spectra and using machine learning to connect the theoretically calculated spectra to those observed in a sample) allows us to identify chemicals that we may not, or do not, have any experimental data for.”

The method addresses a critical gap in environmental monitoring, opening the door to identifying a much broader range of hazardous compounds, including those that have changed over time. This is especially important given that soil is a dynamic environment where chemicals are subject to transformations that can render them harder to detect.

Thomas Senftle, Rice’s William Marsh Rice Trustee Associate Professor of Chemical and Biomolecular Engineering, compared the process to using facial recognition to find an individual in a crowd.

“You can imagine we have a picture of a person when they’re a teenager, but now they’re in their 30s,” Senftle said. “In my group what we do is, on the theory side, we can predict what the picture will look like.”

The researchers tested the method on soil from a restored watershed and natural area using both artificially contaminated samples and a control sample. Results showed the new approach reliably picked out even minute traces of PAHs using a simpler and faster process than conventional techniques.

“This method can identify lesser-known and largely unstudied PAH and PAC pollutant molecules,” said Oara Neumann, a Rice research scientist who is a co-author on the study.

Naomi Halas and Ankit Patel (Photos by Jeff Fitlow/Rice University)

In the future, the method could enable on-site field testing by integrating the ML algorithms and theoretical spectral library with portable Raman devices into a mobile system, making it easier for farmers, communities and environmental agencies to test soil for hazardous compounds without needing to send samples to specialized labs and wait days for results.

Ankit Patel, assistant professor of electrical and computer engineering at Rice and assistant professor of neuroscience at Baylor, is a corresponding author on the study alongside Halas.

Other Rice co-authors include computer science doctoral alum Yilong Ju; doctoral students Sarah Denison, Peixuan Jin and Andres Sanchez-Alvarado; Peter Nordlander, the Wiess Chair in Physics and Astronomy and professor of electrical and computer engineering and materials science and nanoengineering; and Pedro Alvarez, the George R. Brown Professor of Civil and Environmental Engineering.

The research was supported by the National Institutes of Health (P42ES027725-01), the Welch Foundation (C-1220, C-1222) and the Carl and Lillian Illig Fellowship (Smalley-Curl Institute, H20398-239440). The content herein is solely the responsibility of the authors and does not necessarily represent the official views of the funding organizations and institutions.




Rice University




©2025.05 - Association for the Understanding of Artificial Intelligence