Cross-validation (CV) is a widely used method of predictive assessment based
on repeated model fits to different subsets of the available data. CV is
applicable in a wide range of statistical settings. However, in cases where
data are not exchangeable, the design of CV schemes should account for
suspected correlation structures within the data. Designing a CV scheme involves
selecting the left-out blocks and choosing the scoring function for evaluating
predictive performance.
This paper focuses on the impact of two scoring strategies for block-wise CV
applied to spatial models with Gaussian covariance structures. We investigate,
through several experiments, whether evaluating the predictive performance of
blocks of left-out observations jointly, rather than aggregating individual
(pointwise) predictions, improves model selection performance. Extending recent
findings for data with serial correlation (such as time-series data), our
experiments suggest that joint scoring reduces the variability of CV estimates,
leading to more reliable model selection, particularly when spatial dependence
is strong and model differences are subtle.
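To make the distinction between the two scoring strategies concrete, the following is a minimal sketch rather than the experimental setup of the paper. It assumes a zero-mean Gaussian spatial model with known covariance parameters, an exponential covariance function, and the log predictive density as the scoring function; the helper names `exp_cov` and `block_scores` are hypothetical. The joint score evaluates the left-out block under its full predictive covariance, while the pointwise score sums the marginal log densities and so ignores the correlation between left-out points.

```python
import numpy as np


def exp_cov(coords, variance=1.0, length_scale=0.5, nugget=1e-6):
    """Exponential covariance matrix (an assumed kernel, for illustration only)."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return variance * np.exp(-dists / length_scale) + nugget * np.eye(len(coords))


def block_scores(y, cov, test_idx):
    """Return (joint, pointwise) log predictive scores for one left-out block.

    Assumes a zero-mean Gaussian model with known covariance matrix `cov`.
    """
    test = np.asarray(test_idx)
    train = np.setdiff1d(np.arange(len(y)), test)
    k_tt = cov[np.ix_(train, train)]
    k_st = cov[np.ix_(test, train)]
    k_ss = cov[np.ix_(test, test)]

    # Gaussian conditional (predictive) mean and covariance of the block.
    pred_mean = k_st @ np.linalg.solve(k_tt, y[train])
    pred_cov = k_ss - k_st @ np.linalg.solve(k_tt, k_st.T)
    resid = y[test] - pred_mean

    # Joint score: multivariate normal log density of the whole block.
    _, logdet = np.linalg.slogdet(pred_cov)
    quad = resid @ np.linalg.solve(pred_cov, resid)
    joint = -0.5 * (len(test) * np.log(2 * np.pi) + logdet + quad)

    # Pointwise score: sum of marginal log densities (diagonal of pred_cov only).
    marg_var = np.diag(pred_cov)
    pointwise = np.sum(-0.5 * (np.log(2 * np.pi * marg_var) + resid**2 / marg_var))
    return joint, pointwise


# Simulated example: 30 random sites in the unit square (synthetic data).
rng = np.random.default_rng(0)
coords = rng.uniform(size=(30, 2))
cov = exp_cov(coords)
y = np.linalg.cholesky(cov) @ rng.standard_normal(len(coords))

# Leave out a spatial block: the five westernmost sites.
block = np.argsort(coords[:, 0])[:5]
joint, pointwise = block_scores(y, cov, block)
print(f"joint log score: {joint:.3f}, pointwise log score: {pointwise:.3f}")
```

The two scores coincide when the block contains a single observation or when the predictive covariance is diagonal; they differ only through the off-diagonal predictive covariance terms, which is where spatial dependence within the block enters.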