TY - JOUR
T1 - Behavior Reasoning for Opponent Agents in Multi-Agent Learning Systems
AU - Hou, Yaqing
AU - Sun, Mingyang
AU - Zhu, Wenxuan
AU - Zeng, Yifeng
AU - Piao, Haiyin
AU - Chen, Xuefeng
AU - Zhang, Qiang
PY - 2022/10/1
Y1 - 2022/10/1
AB - One important component of developing autonomous agents lies in the accurate prediction of their opponents’ behaviors when the agents interact with others in an uncertain environment. Most recent studies focus on first constructing predictive types (or models) of the opponents, considering their various properties of interest, and subsequently using these models to predict the opponents’ behaviors. However, as the possible type space can be rather large, it is time-consuming, and sometimes even infeasible, to predict the actual behaviors of opponents with all candidate types. Thus, in this paper a tractable opponent behavior reasoning approach is proposed that facilitates (a) extraction of a small yet representative summary of all candidates using sub-modular-type maximization, and accordingly, (b) identification of the most appropriate type for real-time behavior prediction based on multi-armed bandits. In addition, we propose a knowledge-transfer scheme through demonstration learning to synchronize subject agents’ knowledge about their opponents’ behaviors. This further reduces the burden of reasoning with all models of their opponents from the perspective of individual subject agents. We integrate the new behavior prediction and reasoning method into a state-of-the-art evolutionary multi-agent framework, namely a memetic multi-agent system (MeMAS), and demonstrate empirical performance in two problem domains.
KW - behavior prediction and reasoning
KW - memetic computing
KW - multi-agent systems
KW - opponent modeling
UR - http://www.scopus.com/inward/record.url?scp=85124714421&partnerID=8YFLogxK
U2 - 10.1109/tetci.2022.3147011
DO - 10.1109/tetci.2022.3147011
M3 - Article
AN - SCOPUS:85124714421
SN - 2471-285X
VL - 6
SP - 1125
EP - 1136
JO - IEEE Transactions on Emerging Topics in Computational Intelligence
JF - IEEE Transactions on Emerging Topics in Computational Intelligence
IS - 5
ER -