Prompt a Robot to Walk with Large Language Models




Large language models (LLMs) pre-trained on vast internet-scale data have showcased remarkable capabilities across diverse domains. Recently, there has been escalating interest in deploying LLMs for robotics, aiming to harness the power of foundation models in real-world settings. However, this approach faces significant challenges, particularly in grounding these models in the physical world and in generating dynamic robot motions. To address these issues, we introduce a novel paradigm in which we use few-shot prompts collected from the physical environment, enabling the LLM to autoregressively generate low-level control commands for robots without task-specific fine-tuning. Experiments across various robots and environments validate that our method can effectively prompt a robot to walk. We thus illustrate how LLMs can proficiently function as low-level feedback controllers for dynamic motion control even in high-dimensional robotic systems.


Overview


Method overview

We first collect data from an existing controller to initialize the LLM policy. We then design a text prompt consisting of a description prompt and an observation-and-action prompt. The LLM outputs normalized target joint positions, which are tracked by a PD controller. After each LLM inference step, the prompt is updated with the historical observations and actions. In our experiments, the LLM runs at a nominal 10 Hz (the simulation is paused while waiting for LLM inference), and the PD controller executes at 200 Hz.
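
To make this loop concrete, below is a minimal, runnable sketch of the pipeline. Here query_llm is a stub for the real LLM call, and the PD gains, joint limits, and single-mass plant dynamics are illustrative placeholders rather than the paper's values:

import numpy as np

LLM_HZ, PD_HZ = 10, 200            # control rates described above
SUBSTEPS = PD_HZ // LLM_HZ         # 20 PD steps per LLM action
KP, KD = 40.0, 0.5                 # illustrative PD gains
N_JOINTS = 12                      # the A1 has 12 actuated joints
Q_LOW, Q_HIGH = -1.0, 1.0          # placeholder joint limits (rad)

def query_llm(prompt: str) -> np.ndarray:
    """Stub for the real LLM call; returns actions normalized to [-1, 1]."""
    return np.zeros(N_JOINTS)

def denormalize(a: np.ndarray) -> np.ndarray:
    """Map normalized LLM outputs back to physical joint limits."""
    return Q_LOW + 0.5 * (a + 1.0) * (Q_HIGH - Q_LOW)

q = np.zeros(N_JOINTS)             # toy joint positions
dq = np.zeros(N_JOINTS)            # toy joint velocities
history = []                       # (observation, action) pairs for the prompt

for _ in range(100):               # 10 s of control at 10 Hz
    obs = np.concatenate([q, dq]).round(3).tolist()
    # The simulation is paused here while the LLM produces its action.
    prompt = f"history: {history[-10:]}\nobservation: {obs}\naction:"
    q_target = denormalize(query_llm(prompt))
    history.append((obs, q_target.round(3).tolist()))
    for _ in range(SUBSTEPS):      # PD tracking at 200 Hz
        tau = KP * (q_target - q) - KD * dq   # PD control law
        dq += tau / PD_HZ                     # toy double-integrator plant
        q += dq / PD_HZ                       # dt = 1/200 s

In the real system, the inner loop would command the simulator's actuators rather than this toy plant.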

Can we directly use a prompt to make an LLM perform low-level control?

Yes! Grounded in a physics-based simulator, the LLM outputs target joint positions that enable a robot to walk, given only a text prompt. The following video shows the A1 quadrupedal robot walking on flat ground in the MuJoCo simulator.


A1 quadrupedal robot walking on flat ground

To balance the LLM's token limit against the size of P_Hist, we execute the policy at 10 Hz. However, this leads to a walking gait that is noticeably worse than that of many RL-based walking policies running at 50 Hz or higher. All videos were produced through post-production rendering and are not real-time.
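
One simple way to keep P_Hist within the token budget is to retain only the most recent observation-action pairs that fit. The sketch below is an illustrative truncation rule with a hypothetical count_tokens argument, not necessarily the paper's exact scheme:

def fit_history(history, token_budget, count_tokens):
    """Keep the newest (obs, action) pairs whose total token cost fits the budget."""
    kept, used = [], 0
    for pair in reversed(history):        # walk from newest to oldest
        cost = count_tokens(str(pair))
        if used + cost > token_budget:
            break
        kept.append(pair)
        used += cost
    return list(reversed(kept))           # restore chronological order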

Does the proposed approach generalize to different robots and environments?

To answer this question, we test our method on a variety of robots and environments. The following videos show the A1 quadrupedal robot walking on uneven terrain in the MuJoCo simulator and the ANYmal quadrupedal robot walking on flat ground in the Isaac Gym simulator.


ANYmal quadrupedal robot walking on flat ground in Isaac Gym

A1 quadrupedal robot walking on uneven terrain in MuJoCo

How should we design prompts for robot walking?


Prompt design

We design a text prompt with two parts: a description prompt and an observation-and-action prompt. The description prompt has the following subparts:

- P_TD: task description
- P_IO: meaning of the input and output space
- P_JO: joint order
- P_CP: full control pipeline
- P_AI: additional illustration

The observation-and-action prompt contains P_Hist: the historical observations and actions. The LLM outputs normalized target joint positions.
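
As an illustration, these parts might be concatenated along the following lines; the subpart texts and the build_prompt helper are paraphrased placeholders, not the paper's exact prompt:

DESCRIPTION_PROMPT = "\n".join([
    # P_TD: task description
    "You are controlling a quadrupedal robot so that it walks forward.",
    # P_IO: meaning of the input and output space
    "Inputs are joint positions and velocities; outputs are target joint "
    "positions normalized to [-1, 1].",
    # P_JO: joint order
    "Joint order: FR hip, FR thigh, FR calf, FL hip, ... (12 joints total).",
    # P_CP: full control pipeline
    "Your outputs are denormalized and tracked by a 200 Hz PD controller.",
    # P_AI: additional illustration
    "Respond with exactly 12 comma-separated numbers and nothing else.",
])

def build_prompt(history, obs):
    """Combine the description prompt with P_Hist and the current observation."""
    lines = [DESCRIPTION_PROMPT, "History:"]
    for past_obs, action in history:
        lines.append(f"observation: {past_obs} -> action: {action}")
    lines.append(f"observation: {obs} -> action:")
    return "\n".join(lines)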

The LLM policy can make a robot recover from terrain disturbances

When the A1 robot is prompted to walk on uneven terrain in MuJoCo, the LLM policy enables it to recover from terrain disturbances.


The LLM policy can make the A1 robot recover from terrain disturbances

Want more details?

Please read our paper! If you have further questions, please feel free to contact Yen-Jen.

BibTeX

@article{wang2023prompt,
  title={Prompt a Robot to Walk with Large Language Models},
  author={Yen-Jen Wang and Bike Zhang and Jianyu Chen and Koushil Sreenath},
  journal={arXiv preprint arXiv:2309.09969},
  year={2023}
}

Acknowledgements

This work is supported in part by the InnoHK of the Government of the Hong Kong Special Administrative Region via the Hong Kong Centre for Logistics Robotics and in part by The AI Institute. The website template is from Learning Humanoid Locomotion with Transformers.