WarpDrive: Extremely Fast Reinforcement Learning on an NVIDIA GPU

blog.einstein.ai
tldr: WarpDrive is an open-source framework for running multi-agent RL end-to-end on a GPU. It achieves orders of magnitude faster multi-agent RL training with 2000 environments and 1000 agents in a simple Tag environment. WarpDrive provides lightweight tools and workflow objects to build your own fast RL workflows. Check out the code [https://github.com/salesforce/warp-drive], this blog, and the white paper [https://arxiv.org/abs/2108.13976] for more details!

The name WarpDrive is inspired by the science fiction concept of a superluminal spacecraft propulsion system. Moreover, at the time of writing, a "warp" is a group of 32 threads that execute at the same time on (certain) GPUs.

The Challenge of Multi-Agent RL

Multi-agent systems, particularly those with multiple interacting AI agents, are a frontier for AI research and applications. They are key to solving many engineering and scientific challenges in economics, self-driving cars, robotics, and many other fields. Deep reinforcement learning (RL) is a powerful framework for training AI agents. Deep RL has been used to master StarCraft [1], train robotic arms [2], and recommend economic policies [3,4].

However, multi-agent deep RL (MADRL) experiments can take days or even weeks, especially when many agents are trained. MADRL requires repeatedly running multi-agent simulations and training agent models. This takes a lot of time because MADRL implementations often combine CPU-based simulations with GPU-based deep learning models. For example, the Foundation economic simulation framework [5] follows this pattern.

This split between CPU and GPU introduces several performance bottlenecks. For instance, CPUs do not parallelize computations well across agents and environments, and data transfers between CPU and GPU are inefficient.
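To make these two bottlenecks concrete, here is a toy cost model in plain Python. It is only a sketch under simplifying assumptions: the CPU steps agents strictly one after another, the GPU updates all agents in all environments in one parallel wave per step, and each step of the hybrid setup needs one copy of observations to the GPU and one copy of actions back. The names `RolloutCost`, `hybrid_rollout_cost`, and `gpu_only_rollout_cost` are hypothetical and are not part of WarpDrive.

```python
from dataclasses import dataclass


@dataclass
class RolloutCost:
    """Toy cost counters; hypothetical bookkeeping, not a real GPU API."""

    sequential_agent_updates: int = 0  # update rounds that must run one after another
    host_device_copies: int = 0        # CPU <-> GPU transfers


def hybrid_rollout_cost(num_envs: int, num_agents: int, num_steps: int) -> RolloutCost:
    """CPU simulation plus GPU model, assuming no CPU-side parallelism."""
    cost = RolloutCost()
    for _ in range(num_steps):
        # The CPU steps every agent in every environment in turn.
        cost.sequential_agent_updates += num_envs * num_agents
        # Observations are copied to the GPU, actions are copied back to the CPU.
        cost.host_device_copies += 2
    return cost


def gpu_only_rollout_cost(num_steps: int) -> RolloutCost:
    """Idealized end-to-end GPU rollout (assumes unlimited parallel threads)."""
    cost = RolloutCost()
    # One-time upload of the initial simulation state (an assumption of this sketch).
    cost.host_device_copies = 1
    for _ in range(num_steps):
        # All agents in all environments are updated in a single parallel wave.
        cost.sequential_agent_updates += 1
    return cost


hybrid = hybrid_rollout_cost(num_envs=4, num_agents=5, num_steps=100)
gpu_only = gpu_only_rollout_cost(num_steps=100)
print(hybrid)
print(gpu_only)
```

In this toy model the sequential work of the hybrid setup grows with the number of environments times the number of agents, and its transfer count grows with the number of steps, while the idealized GPU-only rollout depends only on the number of steps. Real hardware has finite parallelism, so these counters illustrate the trend rather than predict actual speedups.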

To accelerate MADRL research and engineering, we built WarpDrive, an open-source framework for extremely fast MADRL. WarpDrive runs MADRL entirely on a GPU and so achieves orders of magnitude faster training. In the animation…
Stephan Zheng