TY - JOUR
T1 - Tensor optimization with group lasso for multi-agent predictive state representation
AU - Ma, Biyang
AU - Tang, Jing
AU - Chen, Bilian
AU - Pan, Yinghui
AU - Zeng, Yifeng
N1 - Funding information: Dr. Bilian Chen and Dr. Yinghui Pan were supported in part by the National Natural Science Foundation of China (Grants No. 61772442, 61806089 and 61836005). Professor Yifeng Zeng and Dr. Biyang Ma acknowledge the support of the EPSRC New Investigator Award in 2019.
PY - 2021/6/7
Y1 - 2021/6/7
N2 - Predictive state representation (PSR) is a compact model of dynamic systems that represents state as a vector of predictions about future observable events. It is an alternative to a partially observable Markov decision process (POMDP) model for sequential decision-making problems under uncertainty. Most existing PSR research focuses on model learning in a single-agent setting. In this paper, we investigate learning a multi-agent PSR model from available agent interaction data. Learning a multi-agent PSR model turns out to be rather difficult, especially with limited samples and an increasing number of agents. We resort to a tensor technique to better represent dynamic system characteristics and address the challenging task of learning multi-agent PSR models through tensor optimization. We first focus on a two-agent scenario and use a third-order tensor (the system dynamics tensor) to capture the system interaction data. The PSR model discovery can then be formulated as a tensor optimization problem with group lasso, and an alternating direction method of multipliers is used to solve the embedded subproblems. Hence, the prediction parameters and state vectors can be learned directly from the optimization solutions, and the transition parameters can be derived via linear regression. Subsequently, we generalize the tensor learning approach to a multi-agent (N > 2) PSR model and analyze the computational complexity of the learning algorithms. Experimental results show that the tensor optimization approaches achieve promising performance in learning a multi-agent PSR model across multiple problem domains.
AB - Predictive state representation (PSR) is a compact model of dynamic systems that represents state as a vector of predictions about future observable events. It is an alternative to a partially observable Markov decision process (POMDP) model for sequential decision-making problems under uncertainty. Most existing PSR research focuses on model learning in a single-agent setting. In this paper, we investigate learning a multi-agent PSR model from available agent interaction data. Learning a multi-agent PSR model turns out to be rather difficult, especially with limited samples and an increasing number of agents. We resort to a tensor technique to better represent dynamic system characteristics and address the challenging task of learning multi-agent PSR models through tensor optimization. We first focus on a two-agent scenario and use a third-order tensor (the system dynamics tensor) to capture the system interaction data. The PSR model discovery can then be formulated as a tensor optimization problem with group lasso, and an alternating direction method of multipliers is used to solve the embedded subproblems. Hence, the prediction parameters and state vectors can be learned directly from the optimization solutions, and the transition parameters can be derived via linear regression. Subsequently, we generalize the tensor learning approach to a multi-agent (N > 2) PSR model and analyze the computational complexity of the learning algorithms. Experimental results show that the tensor optimization approaches achieve promising performance in learning a multi-agent PSR model across multiple problem domains.
KW - predictive state representations
KW - tensor optimization
KW - alternating direction method of multipliers
KW - group lasso
UR - http://www.scopus.com/inward/record.url?scp=85103131184&partnerID=8YFLogxK
U2 - 10.1016/j.knosys.2021.106893
DO - 10.1016/j.knosys.2021.106893
M3 - Article
SN - 0950-7051
VL - 221
JO - Knowledge-Based Systems
JF - Knowledge-Based Systems
M1 - 106893
ER -