Learning to Advise and Learning from Advice in Cooperative Multiagent Reinforcement Learning

Document Type

Conference Proceeding

Publication Date

1-1-2022

Abstract

We propose a novel policy-level generative adversarial learning framework to enhance cooperative multiagent reinforcement learning (MARL), consisting of a centralized advisor, MARL agents, and discriminators. The advisor is realized as a dual graph convolutional network (DualGCN) that advises agents from a global perspective by fusing decision information, resolving spatial conflicts, and maintaining temporal continuity. Each discriminator is trained to distinguish between the advisor's policy and an agent's policy. Leveraging the discriminator's judgment, each agent learns to match the advised policy in addition to learning through its own exploration, which accelerates learning and improves policy performance. Additionally, we propose an advisor-boosting method that incorporates the discriminators' suggestions into the training of the DualGCN to further improve the MARL agents. We validate our methods on cooperative navigation tasks; results demonstrate that our method outperforms baseline methods in both learning efficiency and policy efficacy.
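
To make the adversarial policy-matching idea in the abstract concrete, below is a minimal PyTorch sketch, not the authors' implementation: the PolicyDiscriminator network, both loss functions, and the weighting coefficient lam are illustrative assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class PolicyDiscriminator(nn.Module):
        """Hypothetical discriminator: given an observation and an action
        distribution, predicts whether that distribution came from the
        advisor (label 1) or from an agent (label 0)."""
        def __init__(self, obs_dim, n_actions, hidden=64):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(obs_dim + n_actions, hidden), nn.ReLU(),
                nn.Linear(hidden, 1))

        def forward(self, obs, action_probs):
            # Returns an unnormalized logit for the "advisor" class.
            return self.net(torch.cat([obs, action_probs], dim=-1))

    def discriminator_loss(disc, obs, agent_probs, advisor_probs):
        # Standard GAN objective: advisor policies are "real", agent policies "fake".
        real = disc(obs, advisor_probs.detach())
        fake = disc(obs, agent_probs.detach())
        return (F.binary_cross_entropy_with_logits(real, torch.ones_like(real))
                + F.binary_cross_entropy_with_logits(fake, torch.zeros_like(fake)))

    def policy_matching_loss(disc, obs, agent_probs):
        # The agent tries to make its policy indistinguishable from the advised
        # one, i.e. it is rewarded when the discriminator labels it "advisor".
        fake = disc(obs, agent_probs)
        return F.binary_cross_entropy_with_logits(fake, torch.ones_like(fake))

    # Each agent then minimizes: rl_loss + lam * policy_matching_loss(...),
    # combining its own exploration-driven RL objective with advice matching.

Minimizing the matching term alongside the usual RL loss is what lets advice accelerate learning without replacing the agent's own exploration.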

Publication Source (Journal or Book title)

Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems, AAMAS

First Page

1645

Last Page

1647
