TY - JOUR
T1 - Statistical Modeling of Spatially Stratified Heterogeneous Data
AU - Wang, Jinfeng
AU - Haining, Robert
AU - Zhang, Tonglin
AU - Xu, Chengdong
AU - Hu, Maogui
AU - Yin, Qian
AU - Li, Lianfa
AU - Zhou, Chenghu
AU - Li, Guangquan
AU - Chen, Hongyan
N1 - Funding information: This study was supported by the National Natural Science Foundation of China (No. 42071375, 41531179), Ministry of Science and Technology of China (2022YFC3600800; 2023YFF1305403), and the National Social Science Foundation of China (No. 21&ZD186).
PY - 2024/3/15
Y1 - 2024/3/15
AB - Spatial statistics is an important methodology for geospatial data analysis. It has evolved to handle spatially autocorrelated data and spatially (locally) heterogeneous data, which aim to capture the first and second laws of geography, respectively. Examples of spatially stratified heterogeneity (SSH) include climatic zones and land-use types. Methods for such data are relatively underdeveloped compared to the first two properties. The presence of SSH is evidence that nature is lawful and structured rather than purely random. This induces another “layer” of causality underlying variations observed in geographical data. In this article, we go beyond traditional cluster-based approaches and propose a unified approach for SSH in which we provide an equation for SSH, display how SSH is a source of bias in spatial sampling and confounding in spatial modeling, detect nonlinear stochastic causality inherited in SSH distribution, quantify general interaction identified by overlaying two SSH distributions, perform spatial prediction based on SSH, develop a new measure for spatial goodness of fit, and enhance global modeling by integrating them with an SSH q statistic. The research advances statistical theory and methods for dealing with SSH data, thereby offering a new toolbox for spatial data analysis.
KW - confounding
KW - inference
KW - sample bias
KW - spatial causality
KW - spatially stratified heterogeneity
UR - http://www.scopus.com/inward/record.url?scp=85184403696&partnerID=8YFLogxK
U2 - 10.1080/24694452.2023.2289982
DO - 10.1080/24694452.2023.2289982
M3 - Article
SN - 2469-4452
VL - 114
SP - 499
EP - 519
JO - Annals of the American Association of Geographers
JF - Annals of the American Association of Geographers
IS - 3
ER -