This article embarks on a study of multiagent transfer learning (TL) to address the specific challenges that arise in complex multiagent systems where agents have different or even competing objectives. Specifically, building on the essential backbone of a state-of-the-art evolutionary TL framework (eTL), this article presents a novel TL framework with prediction (eTL-P) as an upgrade over the existing eTL, endowing agents with the ability to interact effectively with their opponents by building candidate models and predicting the opponents' behavioral strategies accordingly. To reduce the complexity of candidate models, eTL-P constructs a monotone submodular function, which facilitates the selection of the Top-K models from all available candidate models based on their representativeness in terms of behavioral coverage and reward diversity. eTL-P also integrates social selection mechanisms that let agents identify their better-performing partners, thus improving their learning performance and reducing the complexity of behavior prediction by reusing useful knowledge with respect to their partners' mind universes. Experiments on a partner-opponent minefield navigation task (PO-MNT) show that eTL-P achieves higher learning capability and efficiency for multiple agents than state-of-the-art multiagent TL approaches.
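The Top-K selection step can be illustrated with a minimal sketch. The abstract does not give the actual objective used by eTL-P, so everything below is an assumption made for illustration: each candidate model is reduced to a set of behaviors it covers plus a discretized reward bucket, and the hypothetical `score` function adds behavioral coverage to a weighted count of distinct reward buckets. Both terms are coverage counts, so the sum is monotone and submodular, and the standard greedy procedure for such functions under a cardinality constraint applies.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation (not from the article): each candidate model is
# described by the behaviors it covers and a discretized reward bucket.
Candidate = Tuple[FrozenSet[str], int]


def score(selected: List[str], candidates: Dict[str, Candidate],
          weight: float = 1.0) -> float:
    """Illustrative objective: behaviors covered plus weighted reward diversity.

    Both terms count distinct items covered by the selected set, so the sum is
    monotone and submodular.
    """
    behaviors = set()
    buckets = set()
    for name in selected:
        covered, bucket = candidates[name]
        behaviors |= covered
        buckets.add(bucket)
    return len(behaviors) + weight * len(buckets)


def greedy_top_k(candidates: Dict[str, Candidate], k: int,
                 weight: float = 1.0) -> List[str]:
    """Pick up to k models, each time adding the one with the largest marginal gain."""
    selected: List[str] = []
    remaining = sorted(candidates)  # sorted names give deterministic tie-breaking
    for _ in range(min(k, len(remaining))):
        base = score(selected, candidates, weight)
        best = max(
            remaining,
            key=lambda name: score(selected + [name], candidates, weight) - base,
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data, invented for this example only.
candidates = {
    "m1": (frozenset({"a", "b", "c"}), 0),
    "m2": (frozenset({"a", "b"}), 0),
    "m3": (frozenset({"d"}), 1),
    "m4": (frozenset({"c", "d"}), 1),
}
top_2 = greedy_top_k(candidates, k=2)
```

With this toy data, `m2` adds nothing once `m1` is chosen, because its behaviors and reward bucket are already covered. That diminishing-returns property is what makes greedy selection a reasonable way to prune redundant candidate models.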
|Number of pages||15|
|Journal||IEEE Transactions on Systems, Man and Cybernetics: Systems|
|Early online date||27 Dec 2019|
|Publication status||E-pub ahead of print - 27 Dec 2019|