One important component of developing autonomous agents is the accurate prediction of their opponents' behaviors when the agents interact with others in an uncertain environment. Most recent studies focus on first constructing predictive types (or models) of the opponents, considering their various properties of interest, and subsequently using these models to predict the opponents' behaviors. However, because the space of possible types can be very large, predicting the actual behaviors of opponents using all candidate types is time-consuming and sometimes infeasible. Thus, in this paper we propose a tractable opponent behavior reasoning approach that facilitates (a) extraction of a small yet representative summary of all candidates using submodular-type maximization, and accordingly, (b) identification of the most appropriate type for real-time behavior prediction based on multi-armed bandits. In addition, we propose a knowledge-transfer scheme through demonstration learning to synchronize subject agents' knowledge about their opponents' behaviors. This further reduces the burden on individual subject agents of reasoning with all models of their opponents. We integrate the new behavior prediction and reasoning method into a state-of-the-art evolutionary multi-agent framework, namely a memetic multi-agent system (MeMAS), and demonstrate its empirical performance in two problem domains.
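The two-stage idea in the abstract (summarize the candidate types, then pick among the summary online) can be illustrated with a minimal sketch. It is not the paper's implementation: the facility-location objective, the UCB1 rule, the non-negative similarity matrix, and the names `greedy_summary` and `UCB1` are all assumptions chosen for illustration, since the abstract does not state which submodular function or bandit algorithm is used.

```python
import math


def greedy_summary(similarity, k):
    """Greedily pick k representative type indices.

    Assumed objective (not from the paper): facility location,
    f(S) = sum_i max_{j in S} similarity[i][j], which is monotone
    submodular when all similarities are non-negative.
    """
    n = len(similarity)
    selected = []
    best_cover = [0.0] * n  # best similarity of each type to the summary so far
    for _ in range(min(k, n)):
        best_gain, best_j = -1.0, None
        for j in range(n):
            if j in selected:
                continue
            # Marginal gain of adding candidate j to the current summary.
            gain = sum(max(similarity[i][j] - best_cover[i], 0.0) for i in range(n))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        best_cover = [max(best_cover[i], similarity[i][best_j]) for i in range(n)]
    return selected


class UCB1:
    """UCB1 bandit over the summarized types (an assumed bandit rule).

    The reward is assumed to lie in [0, 1], e.g. 1.0 when the chosen
    type predicted the opponent's action correctly and 0.0 otherwise.
    """

    def __init__(self, arms):
        self.arms = list(arms)
        self.counts = {a: 0 for a in self.arms}
        self.means = {a: 0.0 for a in self.arms}
        self.total = 0

    def select(self):
        # Try every arm once before applying the confidence bound.
        for a in self.arms:
            if self.counts[a] == 0:
                return a
        return max(
            self.arms,
            key=lambda a: self.means[a]
            + math.sqrt(2.0 * math.log(self.total) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.total += 1
        # Incremental update of the running mean reward.
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

    def best(self):
        """Return the arm with the highest observed mean reward."""
        return max(self.arms, key=lambda a: self.means[a])
```

In use, `greedy_summary` would run once over the full candidate set, and a `UCB1` instance built from the returned indices would then be queried with `select()` and fed with `update()` at every interaction step, so only the summarized types are ever evaluated in real time.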
| Number of pages | 12 |
| Journal | IEEE Transactions on Emerging Topics in Computational Intelligence |
| Publication status | Accepted/In press - 14 Jan 2022 |