AIhub.org
 

How can robots acquire skills through interactions with the physical world? An interview with Jiaheng Hu


12 February 2026




One of the key challenges in building robots for household or industrial settings is the need to master the control of high-degree-of-freedom systems such as mobile manipulators. Reinforcement learning has been a promising avenue for acquiring robot control policies; however, scaling it to complex systems has proved tricky. In their work SLAC: Simulation-Pretrained Latent Action Space for Whole-Body Real-World RL, Jiaheng Hu, Peter Stone and Roberto Martín-Martín introduce a method that makes real-world reinforcement learning feasible for complex embodiments. We caught up with Jiaheng to find out more.

What is the topic of the research in your paper and why is it an interesting area for study?

This paper is about how robots (in particular, household robots like mobile manipulators) can autonomously acquire skills by interacting with the physical world, i.e. real-world reinforcement learning. Reinforcement learning (RL) is a general framework for learning from trial-and-error interaction with an environment, and it has huge potential for allowing robots to learn tasks without humans hand-engineering the solution. RL for robotics is a very exciting field, as it opens up the possibility of robots self-improving in a scalable way, towards the creation of general-purpose household robots that can assist people in our everyday lives.

What were some of the issues with previous methods that your paper was trying to address?

Previously, most successful applications of RL to robotics trained entirely in simulation and then deployed the policy directly in the real world (i.e. zero-shot sim2real). However, this approach has big limitations. On one hand, it is not very scalable: you need to create task-specific, high-fidelity simulation environments that closely match the real-world environment you want to deploy the robot in, and this can take days or even months for each and every task. On the other hand, some tasks are very hard to simulate because they involve deformable objects and contact-rich interactions (for example, pouring water, folding clothes, or wiping a whiteboard); for these tasks, the simulation is often quite different from the real world. This is where real-world RL comes into play: if we can allow a robot to learn by directly interacting with the physical world, we don’t need a simulator anymore. However, while several attempts have been made towards realizing real-world RL, it is actually a very hard problem for two reasons:

1. Sample inefficiency: RL requires a lot of samples (i.e. interactions with the environment) to learn good behavior, which are often impossible to collect in large quantities in the real world.

2. Safety issues: RL requires exploration, and random exploration in the real world is often very dangerous. The robot can break itself and will never be able to recover from that.

Could you tell us about the method (SLAC) that you’ve introduced?

So, creating high-fidelity simulations is very hard, and directly learning in the real world is also really hard. What should we do? The key idea of SLAC is that we can use a low-fidelity simulation environment to assist subsequent real-world RL. Specifically, SLAC implements this idea in a two-step process: in the first step, SLAC learns a latent action space in simulation via unsupervised reinforcement learning. Unsupervised RL is a technique that allows the robot to explore a given environment and learn task-agnostic behaviors. In SLAC, we design a special unsupervised RL objective that encourages these behaviors to be safe and structured.
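To make the idea concrete, here is a minimal sketch (not the authors' code) of what this first step could look like: a decoder that turns a latent action into low-level joint commands, trained with an unsupervised reward that trades off behavioral diversity against safety. The PyTorch setup, the discriminator-based diversity term, the safety penalty and all names and hyperparameters below are illustrative assumptions, not the actual SLAC objective.

```python
# A hedged sketch of step one, assuming a PyTorch-style setup; not the authors' code.
import torch
import torch.nn as nn


class LatentActionDecoder(nn.Module):
    """Maps (robot state, latent action z) to a bounded low-level command."""

    def __init__(self, state_dim: int, latent_dim: int, action_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + latent_dim, 256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),  # keep joint commands in [-1, 1]
        )

    def forward(self, state: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([state, z], dim=-1))


class SkillDiscriminator(nn.Module):
    """Guesses which latent z produced a state; its confidence rewards diverse, distinguishable skills."""

    def __init__(self, state_dim: int, num_skills: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 256), nn.ReLU(),
            nn.Linear(256, num_skills),
        )

    def forward(self, next_state: torch.Tensor) -> torch.Tensor:
        return self.net(next_state)  # logits over discrete latent skills


def unsupervised_reward(discriminator, next_state, skill_idx, safety_violation):
    """Diversity term (log q(z | s')) minus a penalty on unsafe states.

    `safety_violation` is assumed to be a 0/1 flag from the simulator,
    e.g. a joint-limit or collision indicator.
    """
    log_q = torch.log_softmax(discriminator(next_state), dim=-1)
    diversity = log_q.gather(-1, skill_idx.unsqueeze(-1)).squeeze(-1)
    return diversity - 10.0 * safety_violation.float()
```

The discrete-skill, discriminator-based diversity bonus here is only a stand-in borrowed from standard unsupervised skill discovery; the paper designs its own objective specifically around safety and structure for whole-body control.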

In the second step, we treat these learned behaviors as the new action space of the robot, and the robot does real-world RL for downstream tasks such as wiping whiteboards by making decisions in this new action space. Importantly, this method allows us to circumvent the two biggest problems of real-world RL: we don’t have to worry about safety issues, since the new action space is pretrained to always be safe; and we can learn in a sample-efficient way, because our new action space is trained to be very structured.
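Continuing the sketch, the second step can be pictured as wrapping the real robot so that a downstream RL agent chooses latent actions, which the frozen decoder from step one turns into low-level commands. The gym-style interface, the robot_state() helper and the fixed decode horizon below are assumptions for illustration, not the actual SLAC implementation.

```python
# A hedged sketch of step two, reusing the LatentActionDecoder defined above.
import numpy as np
import torch


class LatentActionEnv:
    """Exposes the real robot to an RL agent through the pretrained latent action space."""

    def __init__(self, base_env, decoder, decode_horizon: int = 10):
        self.base_env = base_env      # assumed to offer reset(), step(), robot_state()
        self.decoder = decoder        # frozen LatentActionDecoder from step one
        self.decode_horizon = decode_horizon

    def reset(self):
        return self.base_env.reset()

    def step(self, z: np.ndarray):
        """One latent action unrolls into several low-level commands on the robot."""
        total_reward, obs, done = 0.0, None, False
        z_t = torch.as_tensor(z, dtype=torch.float32)
        for _ in range(self.decode_horizon):
            state = torch.as_tensor(self.base_env.robot_state(), dtype=torch.float32)
            with torch.no_grad():
                low_level = self.decoder(state, z_t).numpy()
            obs, reward, done, info = self.base_env.step(low_level)
            total_reward += float(reward)
            if done:
                break
        return obs, total_reward, done, {}
```

Because every command the decoder can emit was pretrained to be safe and structured, the downstream agent (for example, a standard off-policy RL algorithm) only ever explores within that safe set, which is what makes a small budget of real-world interaction sufficient.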

The robot carrying out the task of wiping a whiteboard.

How did you go about testing and evaluating your method, and what were some of the key results?

We tested our method on a real Tiago robot – a high degree-of-freedom, bi-manual mobile manipulator – on a series of very challenging real-world tasks, including wiping a large whiteboard, cleaning a table, and sweeping trash into a bag. These tasks are challenging in three respects: 1. They are visuo-motor tasks that require processing high-dimensional image information. 2. They require whole-body motion of the robot (i.e. controlling many degrees of freedom at the same time). 3. They are contact-rich, which makes them hard to simulate accurately. On all of these tasks, our method learns high-performance policies (>80% success rate) within an hour of real-world interaction. By comparison, previous methods simply cannot solve these tasks, and often risk breaking the robot. So to summarize, previously it was simply not possible to solve these tasks via real-world RL, and our method has made it possible.

What are your plans for future work?

I think there is still a lot more to do at the intersection of RL and robotics. My eventual goal is to create truly self-improving robots that can learn entirely by themselves without any human involvement. More recently, I’ve been interested in how we can leverage foundation models such as vision-language models (VLMs) and vision-language-action models (VLAs) to further automate the self-improvement loop.

About Jiaheng

Jiaheng Hu is a 4th-year PhD student at UT-Austin, co-advised by Prof. Peter Stone and Prof. Roberto Martín-Martín. His research interest is in Robot Learning and Reinforcement Learning, with the long-term goal of developing self-improving robots that can learn and adapt autonomously in unstructured environments. Jiaheng’s work has been published at top-tier Robotics and ML venues, including CoRL, NeurIPS, RSS, and ICRA, and has earned multiple best paper nominations and awards. During his PhD, he interned at Google DeepMind and Ai2, and is a recipient of the Two Sigma PhD Fellowship.

Read the work in full

SLAC: Simulation-Pretrained Latent Action Space for Whole-Body Real-World RL, Jiaheng Hu, Peter Stone, Roberto Martín-Martín.





Lucy Smith is Senior Managing Editor for AIhub.



