AIhub.org
 

Interview with Joseph Marvin Imperial: aligning generative AI with technical standards


by Lucy Smith
02 April 2025





In this interview series, we’re meeting some of the AAAI/SIGAI Doctoral Consortium participants to find out more about their research. The Doctoral Consortium provides an opportunity for a group of PhD students to discuss and explore their research interests and career objectives in an interdisciplinary workshop together with a panel of established researchers. In the latest interview, we hear from Joseph Marvin Imperial, who is focused on aligning generative AI with technical standards for regulatory and operational compliance.

Tell us a bit about your PhD – where are you studying, and what is the topic of your research?

I’m doing research at the beautiful University of Bath, focusing on aligning generative AI (GenAI) with technical standards for regulatory and operational compliance. Standards are documents created by industry and/or academic experts and recognized as a way to ensure the quality, accuracy, and interoperability of systems and processes (aka “the best way of doing things”). You’ll see standards in almost all sectors and domains, including the sciences, healthcare, education, finance, journalism, law, and engineering. Given the rapid implementation of GenAI-based systems across these domains, it is logical to explore research that will anchor these systems to the same rules and regulations that experts follow through standards. I believe this direction of research is extremely important and will play a key role in both the oversight and trustworthiness of more powerful GenAI-based systems in the future.

Could you give us an overview of the research you’ve carried out so far during your PhD?

So far, I have explored evaluating and benchmarking the current capabilities of GenAI, specifically LLMs, on how well they can capture standard constraints through prompting-based approaches such as in-context learning (ICL). My motivation was to first measure how good or bad LLMs are at following specifications from standards pasted into prompts, which is also the way interdisciplinary practitioners such as teachers, engineers, and journalists use AI.
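To make this concrete, here is a minimal sketch of what “pasting a standard into a prompt” looks like in code. It assumes an OpenAI-style chat API; the CEFR descriptor, model name, and task below are illustrative placeholders, not the papers’ actual setup:

# A minimal in-context learning (ICL) setup: the standard's specification is
# pasted directly into the prompt. All names here are illustrative placeholders.
from openai import OpenAI

client = OpenAI()  # assumes an OPENAI_API_KEY is set in the environment

# A shortened, invented descriptor from a standard (here, CEFR level A2).
cefr_spec = (
    "CEFR A2 writing: short, simple sentences; high-frequency vocabulary; "
    "present and simple past tenses; everyday, concrete topics."
)

prompt = (
    f"Follow this specification when writing:\n{cefr_spec}\n\n"
    "Task: write a short story about a trip to the market."
)

response = client.chat.completions.create(
    model="gpt-4o",  # placeholder; any instruction-following chat model
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)

Evaluation then amounts to checking whether the generated text actually satisfies the pasted specification, which is exactly where current models can fall short.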

In my first paper, published at the GEM Workshop at EMNLP 2023, in the context of education and language proficiency, I empirically showed that the way teachers use GenAI models such as ChatGPT does not actually maximize their instruction-following capabilities to generate content that aligns with standards in education such as the Common European Framework of Reference for Languages (CEFR). In my second long paper, published at the main conference of EMNLP 2024, I proposed a novel method called ‘Standardize’ that leverages knowledge artifacts such as specifications from standards, exemplar information, and linguistic flags to significantly improve how LLMs, including GPT-4, can produce standard-aligned content that teachers can use.
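The exact prompt templates are in the paper; purely as an illustration of the idea, a Standardize-style prompt might assemble those knowledge artifacts along these lines (the specification, exemplar, and flags below are invented examples):

# Hedged sketch: combining standard-derived knowledge artifacts into one
# prompt. The template and artifact wording are invented for illustration.
def build_standard_aligned_prompt(spec: str, exemplar: str,
                                  flags: list[str], task: str) -> str:
    flag_text = "\n".join(f"- {f}" for f in flags)
    return (
        f"Specification from the standard:\n{spec}\n\n"
        f"Exemplar text at the target level:\n{exemplar}\n\n"
        f"Linguistic properties to control:\n{flag_text}\n\n"
        f"Task: {task}"
    )

prompt = build_standard_aligned_prompt(
    spec="CEFR B1: connected text on familiar topics, with some subordination.",
    exemplar="Last summer I travelled to the coast with my family...",
    flags=["average sentence length of roughly 15 words",
           "mostly high-frequency vocabulary"],
    task="Write a 150-word narrative suitable for B1 readers.",
)
print(prompt)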

In another paper at EMNLP 2024, I also explored building a benchmark called SpeciaLex to evaluate how state-of-the-art LLMs such as Gemma, Llama, GPT-4o, BLOOM, and OLMo perform on checking, identification, rewriting, and open-generation tasks based on proper usage of vocabularies from standards. For this paper, I explored vocabularies from education (using CEFR data) and from the aviation and engineering domains by partnering with ASD Simplified Technical English.
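As a toy illustration of the “checking” task (the word list and tokenization below are simplified stand-ins, not the benchmark’s actual data or format), vocabulary compliance boils down to flagging words that fall outside an approved controlled vocabulary:

# Toy vocabulary-compliance check in the spirit of a SpeciaLex "checking"
# task: flag words that fall outside an approved controlled vocabulary.
import re

approved_vocabulary = {"the", "pilot", "must", "check", "fuel", "before", "flight"}

def find_non_compliant_words(text: str) -> list[str]:
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in approved_vocabulary]

print(find_non_compliant_words("The pilot must verify fuel prior to the flight."))
# -> ['verify', 'prior', 'to']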

Is there an aspect of your research that has been particularly interesting?

Whenever I talk to non-AI colleagues working in standards- and regulation-driven domains such as healthcare, education, and finance, I always find it interesting and motivating to hear how they express the need for technological innovations to ease their burden in following regulations and protocols, including standard compliance. Likewise, it’s interesting and challenging to learn how much standard reference data (SRD) across domains and sectors is inaccessible to researchers who want to automate or do research on compliance checking. I see this as an opportunity to build useful AI benchmarks, like the vocabulary-based one, SpeciaLex, that I started for the education and engineering domains.

What are your plans for building on your research so far during the PhD – what aspects will you investigate next?

I’m trying to be more ambitious with my upcoming projects for the remaining two years of my PhD. Over the coming months, I plan to explore the reasoning traces (also known as chain-of-thought) of LLMs and analyze how they arrive at certain conclusions by following conflicting and complementary rules in technical standards. This direction is particularly challenging, as the reasoning models I would need to use, such as DeepSeek-R1, are extremely large and computationally expensive. Moreover, I also plan to build a larger standard-compliance benchmark that spans more than 10 domains to test how well state-of-the-art LLMs can follow standards on tasks including compliance checking and compliance-aligned generation.

How was the AAAI Doctoral Consortium, and the AAAI conference experience in general?

I think it’s the best Doctoral Consortium out of all the Doctoral Consortiums! (Disclaimer: I’ve only attended AAAI’s.) I love how well the event was organized by the program chairs. My favorite part was the mentorship session, where they paired us with senior researchers in AI; I was excited to talk face-to-face and share my research with Dr Francesca Rossi from IBM, the current AAAI President. I also met Dr Jindong Wang from William & Mary, who also works on NLP and LLM research, and went out to lunch with him along with other PhD student participants at a nearby Chinese restaurant. Overall, I highly recommend the consortium to any PhD student who wants to meet new people outside their circles.

Joseph presenting his poster at the AAAI 2025 Doctoral Consortium with Program Chair Dr Lijing Wang.

What advice would you give to someone considering doing a PhD in the field?

Pursuing a PhD in anything related to AI is ridiculously competitive these days. Your success can depend on several key factors: your mental drive, your interest in the research topic, your physical health, your financial standing, and the advisor you will be working with. I’ve talked to enough PhD students, and have enough experience of my own, to say that problems with any of these variables can impact your success. However, don’t let this scare you off. I think mental, physical, intellectual, and financial maturity will get you a long way in completing a PhD. Also, it wouldn’t hurt to do a PhD in a beautiful and scenic place in another country.

Could you tell us an interesting (non-AI related) fact about you?

I grew up in the Philippines and speak two other languages, Filipino and Bikol, besides English. When I moved to the UK for my PhD, I became a casual football and rugby fan and have toured three stadiums (Anfield, Old Trafford, and Stamford Bridge) during holidays. Whenever I’m available, I always try to travel around beautiful and scenic places in the UK. Nothing is more mentally rejuvenating than being on trains and looking out the window while traveling through the English and Welsh countryside.

About Joseph

Joseph Marvin Imperial is a PhD Candidate at the UKRI Centre for Doctoral Training in Accountable, Responsible and Transparent AI (ART-AI) and a researcher in the BathNLP group at the University of Bath, UK. His research focuses on the alignment and controllability of generative AI through expert-defined rules and guidelines, known as standards, for regulatory and operational compliance.

Outside of the university, Joseph maintains active collaborations with international partners, including serving as 1) a committee member of the British Standards Institution (BSI), contributing to the AI standardization landscape for the UK; 2) an advisory board member of SIGSEA and SEACrowd, advancing Southeast Asian language research; and 3) an academic member of the MLCommons AI Risks and Responsibility Working Group, exploring frontiers in AI safety research.





Lucy Smith is Senior Managing Editor for AIhub.





