AIhub.org
 

Interview with Joseph Marvin Imperial: aligning generative AI with technical standards


by Lucy Smith
02 April 2025





In this interview series, we’re meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In the latest interview, we hear from Joseph Marvin Imperial, who is focussed on aligning generative AI with technical standards for regulatory and operational compliance.

Tell us a bit about your PhD – where are you studying, and what is the topic of your research?

I’m doing research at the beautiful University of Bath, focusing on aligning generative AI (GenAI) with technical standards for regulatory and operational compliance. Standards are documents created by industry and/or academic experts and recognized as ensuring the quality, accuracy, and interoperability of systems and processes (aka “the best way of doing things”). You’ll see standards in almost all sectors and domains, including the sciences, healthcare, education, finance, journalism, law, and engineering. Given the rapid deployment of GenAI-based systems across these domains, it is logical to explore research that anchors these systems to the same rules and regulations that experts follow through standards. I believe this direction of research will play a key role in both the oversight and trustworthiness of more powerful GenAI-based systems in the future.

Could you give us an overview of the research you’ve carried out so far during your PhD?

So far, I have explored evaluating and benchmarking the current capabilities of GenAI, specifically LLMs, on how well they can capture standard constraints through prompting-based approaches such as in-context learning (ICL). My motivation was to first measure how good or bad LLMs are at following specifications from standards pasted into prompts, which is also how interdisciplinary practitioners such as teachers, engineers, and journalists use AI.
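
To make this concrete, here is a minimal sketch, in Python, of what pasting a standard’s specification into an ICL prompt can look like. The CEFR wording, the exemplar, and the function name below are illustrative placeholders, not the actual prompts or data from the papers:

    # Minimal sketch of in-context learning (ICL) with a standard's
    # specification pasted directly into the prompt. The spec excerpt and
    # exemplar are illustrative placeholders, not data from the papers.

    CEFR_B1_SPEC = (
        "CEFR B1 (illustrative excerpt): use high-frequency vocabulary, "
        "mostly simple and compound sentences, and familiar everyday topics."
    )

    EXEMPLAR = (
        "Example of a B1-level paragraph:\n"
        "I moved to a new city last year. At first it was hard to make "
        "friends, but I joined a football club and met many people."
    )

    def build_icl_prompt(spec: str, exemplar: str, topic: str) -> str:
        """Assemble a prompt asking an LLM for standard-aligned content."""
        return (
            f"You must follow this standard:\n{spec}\n\n"
            f"{exemplar}\n\n"
            f"Now write a short paragraph about '{topic}' that complies "
            f"with the standard above."
        )

    print(build_icl_prompt(CEFR_B1_SPEC, EXEMPLAR, "learning to cook"))
    # The assembled prompt is then sent to an LLM; the evaluation step
    # checks whether the generated text actually matches the target level.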

In my first paper, published at the GEM Workshop at EMNLP 2023, in the context of education and language proficiency, I empirically showed that the way teachers use GenAI models such as ChatGPT does not actually maximize their instruction-following capabilities to generate content that aligns with standards in education such as the Common European Framework of Reference for Languages (CEFR). In my second long paper, published in the main conference of EMNLP 2024, I proposed a novel method called ‘Standardize’ that leverages knowledge artifacts such as specifications from standards, exemplar information, and linguistic flags to significantly improve how LLMs, including GPT-4, produce standard-aligned content that teachers can use.

In another paper at EMNLP 2024, I also built a benchmark called SpeciaLex to evaluate how state-of-the-art LLMs such as Gemma, Llama, GPT-4o, BLOOM, and OLMo perform on checking, identification, rewriting, and open-generation tasks based on the proper usage of vocabularies from standards. For this paper, I explored vocabularies from education (using CEFR data) and from the aviation and engineering domains in partnership with ASD Simplified Technical English.
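
For a flavor of what a “checking” task involves, here is a minimal sketch of a vocabulary compliance check. The approved word list and example sentence are made-up stand-ins, not CEFR or ASD-STE100 data, and SpeciaLex’s actual tasks are richer than this:

    # Minimal sketch of a vocabulary compliance check: flag words that
    # fall outside an approved lexicon, in the spirit of a "checking"
    # task. The word list below is a made-up stand-in.
    import re

    APPROVED = {"turn", "off", "the", "power", "before", "you", "open", "panel"}

    def check_vocabulary(text: str, approved: set[str]) -> list[str]:
        """Return the words in `text` that are not in the approved lexicon."""
        tokens = re.findall(r"[a-z']+", text.lower())
        return [t for t in tokens if t not in approved]

    print(check_vocabulary("Deactivate the power before you open the panel.", APPROVED))
    # ['deactivate'] -> flagged; a compliant rewrite would use 'turn off'.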

Is there an aspect of your research that has been particularly interesting?

Whenever I talk to non-AI colleagues working in standards- and regulation-driven domains such as healthcare, education, and finance, I always find it interesting and motivating to hear them express the need for technological innovations to ease the burden of following regulations and protocols, including standard compliance. Likewise, it’s interesting and challenging to learn how much standard reference data (SRD) across domains and sectors is inaccessible to researchers who want to automate or do research on compliance checking. I see this as an opportunity to build useful AI benchmarks, like the vocabulary-based one, SpeciaLex, that I started for the education and engineering domains.

What are your plans for building on your research so far during the PhD – what aspects will you investigate next?

I’m trying to be more ambitious with my upcoming projects for the remaining two years of my PhD. Over the coming months, I plan to explore the reasoning traces (also known as chain-of-thought) of LLMs and analyze how they arrive at certain conclusions when following conflicting and complementary rules in technical standards. This direction is particularly challenging, as the reasoning models I would need to use, such as DeepSeek-R1, are extremely large and computationally expensive. I also plan to build a larger standard-compliance benchmark that spans more than 10 domains to test how well state-of-the-art LLMs can follow standards for tasks including compliance checking and compliance-aligned generation.

How was the AAAI Doctoral Consortium, and the AAAI conference experience in general?

I think it’s the best Doctoral Consortium out of all the Doctoral Consortiums! (Disclaimer: I’ve only attended AAAI’s.) I love how well the event was organized by the program chairs. My favorite part was the mentorship session, where we were paired with senior researchers in AI; I was excited to talk face-to-face and share my research with Dr Francesca Rossi from IBM, the current AAAI president. I also met Dr Jindong Wang from William & Mary, who also works on NLP and LLM research, and went out to lunch with him and other PhD student participants at a nearby Chinese restaurant. Overall, I highly recommend the consortium to any PhD student who wants to meet new people outside their circles.

Joseph presenting his poster at the AAAI 2025 Doctoral Consortium with Program Chair Dr Lijing Wang.

What advice would you give to someone considering doing a PhD in the field?

Pursuing a PhD in anything related to AI is ridiculously competitive these days. Your success can depend on several key factors: your mental drive, your interest in the research topic, your physical health, your financial standing, and the advisor you will be working with. I’ve talked to enough PhD students, and have my own experience, to say that problems with any of these variables can set you back. However, don’t let this scare you off. I think mental, physical, intellectual, and financial maturity will get you a long way in completing a PhD. Also, it wouldn’t hurt to do a PhD in a beautiful and scenic place in another country.

Could you tell us an interesting (non-AI related) fact about you?

I grew up in the Philippines and speak two other languages, Filipino and Bikol, besides English. When I moved to the UK for my PhD, I became a casual football and rugby fan and have toured three stadiums (Anfield, Old Trafford, and Stamford Bridge) during holidays. Whenever I’m available, I always try to travel around beautiful and scenic places in the UK. Nothing is more mentally rejuvenating than being on trains and looking out the window while traveling through the English and Welsh countryside.

About Joseph

Joseph Marvin Imperial is a PhD candidate at the UKRI Centre for Doctoral Training in Accountable, Responsible and Transparent AI (ART-AI) and a researcher in the BathNLP group at the University of Bath, UK. His research focuses on the alignment and controllability of generative AI through expert-defined rules and guidelines, known as standards, for regulatory and operational compliance.

Outside of the university, Joseph maintains active collaborations with international partners, including serving as 1) a committee member of the British Standards Institution (BSI), contributing to the AI standardization landscape for the UK; 2) an advisory board member of SIGSEA and SEACrowd, advancing Southeast Asian language research; and 3) an academic member of the MLCommons AI Risks and Responsibility Working Group, exploring frontiers in AI safety research.





Lucy Smith is Senior Managing Editor for AIhub.



