TY - JOUR
T1 - Evolutionary Multiagent Transfer Learning With Model-Based Opponent Behavior Prediction
AU - Hou, Yaqing
AU - Ong, Yew-soon
AU - Tang, Jing
AU - Zeng, Yifeng
N1 - Funding information:
National Key Research and Development Program of China (2018YFC0910500)
National Natural Science Foundation of China (61906032)
Key Research and Development Program of Liaoning Province (2019JH2/10100030)
Liaoning United Foundation (U1908214)
PY - 2021/10
Y1 - 2021/10
N2 - This article embarks on a study of multiagent transfer learning (TL) to address the specific challenges that arise in complex multiagent systems where agents have different or even competing objectives. Specifically, building on the essential backbone of a state-of-the-art evolutionary TL framework (eTL), this article presents a novel TL framework with prediction (eTL-P) as an upgrade over the existing eTL, endowing agents with the ability to interact with their opponents effectively by building candidate models and predicting their behavioral strategies accordingly. To reduce the complexity of candidate models, eTL-P constructs a monotone submodular function, which facilitates the selection of the Top-K models from all available candidate models based on their representativeness in terms of behavioral coverage as well as reward diversity. eTL-P also integrates social selection mechanisms for agents to identify their better-performing partners, thus improving their learning performance and reducing the complexity of behavior prediction by reusing useful knowledge with respect to their partners' mind universes. Experiments based on a partner-opponent minefield navigation task (PO-MNT) show that eTL-P achieves higher learning capability and efficiency for multiple agents than state-of-the-art multiagent TL approaches.
AB - This article embarks on a study of multiagent transfer learning (TL) to address the specific challenges that arise in complex multiagent systems where agents have different or even competing objectives. Specifically, building on the essential backbone of a state-of-the-art evolutionary TL framework (eTL), this article presents a novel TL framework with prediction (eTL-P) as an upgrade over the existing eTL, endowing agents with the ability to interact with their opponents effectively by building candidate models and predicting their behavioral strategies accordingly. To reduce the complexity of candidate models, eTL-P constructs a monotone submodular function, which facilitates the selection of the Top-K models from all available candidate models based on their representativeness in terms of behavioral coverage as well as reward diversity. eTL-P also integrates social selection mechanisms for agents to identify their better-performing partners, thus improving their learning performance and reducing the complexity of behavior prediction by reusing useful knowledge with respect to their partners' mind universes. Experiments based on a partner-opponent minefield navigation task (PO-MNT) show that eTL-P achieves higher learning capability and efficiency for multiple agents than state-of-the-art multiagent TL approaches.
KW - Behavior prediction
KW - evolutionary transfer learning (eTL)
KW - monotone submodular model selection
KW - multiagent system (MAS)
U2 - 10.1109/TSMC.2019.2958846
DO - 10.1109/TSMC.2019.2958846
M3 - Article
SN - 1083-4427
SN - 2168-2216
SN - 2168-2232
VL - 51
SP - 5962
EP - 5976
JO - IEEE Transactions on Systems, Man, and Cybernetics: Systems
JF - IEEE Transactions on Systems, Man, and Cybernetics: Systems
IS - 10
ER -