"Stabilizing Multi-Agent Deep Reinforcement Learning by Implicitly Esti" by Yue Jin, Shuangqing Wei et al.

Faculty Publications

Title

Stabilizing Multi-Agent Deep Reinforcement Learning by Implicitly Estimating Other Agents' Behaviors

Authors

Yue Jin, Tsinghua University
Shuangqing Wei, Louisiana State University
Jian Yuan, Tsinghua University
Xudong Zhang, Tsinghua University
Chao Wang, Tsinghua University

Document Type

Conference Proceeding

Publication Date

5-1-2020

Abstract

Deep reinforcement learning (DRL) is able to learn control policies for many complicated tasks, but its power has not been unleashed to handle multi-agent circumstances. Independent learning, where each agent treats others as part of the environment and learns its own policy without considering others' policies, is a simple way to apply DRL to multi-agent tasks. However, since agents' policies change as learning proceeds, the environment is non-stationary from the perspective of each agent, which makes conventional DRL methods inefficient. To cope with this challenge, we propose a novel approach where each agent uses an implicit estimate of others' actions to guide its own policy learning. We demonstrate that, given the implicit estimate of others' actions, each agent can learn its policy in a relatively stationary environment. Extensive experiments show that our method significantly alleviates the non-stationarity and outperforms the state-of-the-art in terms of both convergence speed and policy performance.

Publication Source (Journal or Book title)

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

First Page

3547

Last Page

3551

Recommended Citation

Jin, Y., Wei, S., Yuan, J., Zhang, X., & Wang, C. (2020). Stabilizing Multi-Agent Deep Reinforcement Learning by Implicitly Estimating Other Agents' Behaviors. ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings, 2020-May, 3547-3551. https://doi.org/10.1109/ICASSP40776.2020.9053534

