Self-Improving Generative Adversarial Reinforcement Learning

Yang Liu, Yifeng Zeng, Yingke Chen, Jing Tang, Yinghui Pan

Research output: Chapter in Book/Report/Conference proceeding · Conference contribution · peer-review

8 Citations (Scopus)
108 Downloads (Pure)


The lack of data efficiency and stability is one of the main challenges in end-to-end, model-free reinforcement learning (RL) methods. Recent research addresses the problem by resorting to supervised learning methods that utilize human expert demonstrations, e.g. imitation learning. In this paper we present a novel framework that builds a self-improving process upon a policy improvement operator, which is treated as a black box so that it admits multiple implementations for various applications. An agent is trained to iteratively imitate behaviors that are generated by the operator. Hence the agent can learn by itself, without domain knowledge from humans. We employ generative adversarial networks (GAN) to implement the imitation module in the new framework. We evaluate the framework's performance over multiple application domains and provide comparison results in support.
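The loop the abstract describes — a black-box policy improvement operator proposes better behavior, and the agent's policy is trained to imitate it — can be sketched as follows. This is a minimal illustrative toy, not the paper's implementation: the one-step greedy operator and the tabular environment are hypothetical stand-ins, and the paper's GAN-based imitation module is replaced here by plain cross-entropy behavioral cloning for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setting: a one-step decision problem with a
# random reward table (stand-in for a real RL environment).
N_STATES, N_ACTIONS = 5, 3
R = rng.uniform(size=(N_STATES, N_ACTIONS))

def improvement_operator(policy_logits):
    """Black-box policy improvement operator: here, a greedy step
    toward the one-step-optimal action (a stand-in for e.g. a
    planner or search procedure)."""
    targets = np.zeros((N_STATES, N_ACTIONS))
    targets[np.arange(N_STATES), R.argmax(axis=1)] = 1.0
    return targets  # improved action distribution per state

def imitate(policy_logits, targets, lr=0.5, steps=50):
    """Imitation module: the paper uses GAN-based imitation; this
    sketch substitutes softmax cross-entropy behavioral cloning."""
    for _ in range(steps):
        probs = np.exp(policy_logits)
        probs /= probs.sum(axis=1, keepdims=True)
        # Gradient-descent step on CE loss w.r.t. the logits.
        policy_logits = policy_logits + lr * (targets - probs)
    return policy_logits

# Self-improving loop: operator proposes, agent imitates, repeat.
logits = np.zeros((N_STATES, N_ACTIONS))
for _ in range(3):
    targets = improvement_operator(logits)
    logits = imitate(logits, targets)

greedy = logits.argmax(axis=1)
```

After a few iterations the agent's greedy policy matches the operator's improved targets, with no human demonstrations involved; in the paper, the imitation step is adversarial (a discriminator distinguishes agent trajectories from operator trajectories) rather than supervised.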
Original language: English
Title of host publication: AAMAS 2019
Subtitle of host publication: Proceedings of the 18th International Conference on Autonomous Agents and MultiAgent Systems
Editors: N. Agmon, M. E. Taylor, E. Elkind, M. Veloso
Place of publication: Richland
Publisher: International Foundation for Autonomous Agents and Multiagent Systems (IFAAMAS)
Number of pages: 9
ISBN (Print): 9781450363099
Publication status: Published - 8 May 2019
Externally published: Yes
Event: The 18th International Conference on Autonomous Agents and MultiAgent Systems
Duration: 13 May 2019 → …

Publication series

Name: ACM International Conference on Autonomous Agents and Multiagent Systems. Proceedings
Publisher: Association for Computing Machinery
ISSN (Print): 1548-8403


Conference: The 18th International Conference on Autonomous Agents and MultiAgent Systems
Abbreviated title: AAMAS
Period: 13/05/19 → …


