This package adds resampling methods to the {mlr3} framework that are suited for spatial, temporal and spatiotemporal data. These methods can help to reduce the influence of autocorrelation on performance estimates when performing cross-validation. While this article gives a rather technical introduction to the package, a more applied approach can be found in the mlr3book section on “Spatiotemporal Analysis”.
After loading the package via library("mlr3spatiotempcv"), the spatiotemporal resampling methods and example tasks provided by {mlr3spatiotempcv} are available to the user.
In mlr3, dictionaries provide an overview of the available methods. The following shows which dictionaries receive new entries when the package is loaded.
Additional task types:
TaskClassifST
TaskRegrST
mlr_reflections$task_types
#> type package task learner prediction
#> 1: classif mlr3 TaskClassif LearnerClassif PredictionClassif
#> 2: classif mlr3spatiotempcv TaskClassifST LearnerClassif PredictionClassif
#> 3: regr mlr3 TaskRegr LearnerRegr PredictionRegr
#> 4: regr mlr3spatiotempcv TaskRegrST LearnerRegr PredictionRegr
#> measure
#> 1: MeasureClassif
#> 2: MeasureClassif
#> 3: MeasureRegr
#> 4: MeasureRegr
Additional column roles:
coordinates
mlr_reflections$task_col_roles
#> $regr
#> [1] "feature" "target" "name" "order" "stratum" "group" "weight"
#>
#> $classif
#> [1] "feature" "target" "name" "order" "stratum" "group" "weight"
#>
#> $classif_st
#> [1] "feature" "target" "name" "order" "stratum"
#> [6] "group" "weight" "coordinates"
#>
#> $regr_st
#> [1] "feature" "target" "name" "order" "stratum"
#> [6] "group" "weight" "coordinates"
Additional resampling methods:
spcv_block
spcv_buffer
spcv_coords
spcv_env
sptcv_cluto
sptcv_cstf
and their respective repeated versions.
as.data.table(mlr_resamplings)
#> key params iters
#> 1: bootstrap repeats,ratio 30
#> 2: custom 0
#> 3: cv folds 10
#> 4: holdout ratio 1
#> 5: insample 1
#> 6: loo NA
#> 7: repeated_cv repeats,folds 100
#> 8: repeated_spcv_block folds,repeats,rows,cols,range,selection 10
#> 9: repeated_spcv_coords folds,repeats 10
#> 10: repeated_spcv_env folds,repeats,features 10
#> 11: repeated_sptcv_cluto folds,repeats 10
#> 12: repeated_sptcv_cstf folds,repeats 10
#> 13: spcv_block folds,rows,cols,range,selection 10
#> 14: spcv_buffer theRange,spDataType,addBG 0
#> 15: spcv_coords folds 10
#> 16: spcv_env folds,features 10
#> 17: sptcv_cluto folds 10
#> 18: sptcv_cstf folds 10
#> 19: subsampling repeats,ratio 30
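
The new entries can be used like any other mlr3 resampling. The following is a minimal sketch, assuming the `ecuador` example task shipped with the package and an arbitrarily chosen number of five folds; the variable names are illustrative only.

```r
library(mlr3)
library(mlr3spatiotempcv)

# example task shipped with {mlr3spatiotempcv}
task = tsk("ecuador")

# spatial CV based on clustering of the coordinates;
# folds = 5 is an arbitrary choice for illustration
resampling = rsmp("spcv_coords", folds = 5)
resampling$instantiate(task)

# row ids of the first train/test split
train_ids = resampling$train_set(1)
test_ids = resampling$test_set(1)

# train and test set of one iteration do not share observations
length(intersect(train_ids, test_ids))
```

After instantiation, `resampling$iters` reports the number of splits, and each split can be accessed by its index via `$train_set()` and `$test_set()`.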
Additional example tasks:
tsk("ecuador") (spatial, classif)
tsk("cookfarm") (spatiotemp, regr)

The following table lists all methods implemented in {mlr3spatiotempcv}, their upstream R package and scientific references.
Method | Package | Reference | mlr3 Sugar |
---|---|---|---|
Spatial Buffering | blockCV | Valavi et al. (2018) | rsmp("spcv_buffer") |
Spatial Blocking | blockCV | Valavi et al. (2018) | rsmp("spcv_block") |
Spatial CV | sperrorest | Brenning (2012) | rsmp("spcv_coords") |
Environmental Blocking | blockCV | Valavi et al. (2018) | rsmp("spcv_env") |
Leave-Location-and-Time-Out | CAST | Meyer et al. (2018) | rsmp("sptcv_cstf") |
Spatiotemporal Clustering | skmeans | Zhao and Karypis (2002) | rsmp("sptcv_cluto") |
Repeated Spatial Blocking | blockCV | Valavi et al. (2018) | rsmp("repeated_spcv_block") |
Repeated Spatial CV | sperrorest | Brenning (2012) | rsmp("repeated_spcv_coords") |
Repeated Env Blocking | blockCV | Valavi et al. (2018) | rsmp("repeated_spcv_env") |
Repeated Leave-Location-and-Time-Out | CAST | Meyer et al. (2018) | rsmp("repeated_sptcv_cstf") |
Repeated Spatiotemporal Clustering | skmeans | Zhao and Karypis (2002) | rsmp("repeated_sptcv_cluto") |
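
As a rough illustration of how the sugar functions in the table are used, the sketch below runs one of the repeated variants with `resample()`. The featureless learner, the fold and repetition counts, and the classification error measure are assumptions chosen only to keep the example small; they are not recommendations from the package.

```r
library(mlr3)
library(mlr3spatiotempcv)

task = tsk("ecuador")

# repeated spatial CV: 2 repetitions of 4 folds (values chosen for illustration)
resampling = rsmp("repeated_spcv_coords", folds = 4, repeats = 2)

# baseline learner from {mlr3}; any classification learner could be used instead
learner = lrn("classif.featureless")

# resample() instantiates the resampling on the task internally
rr = resample(task, learner, resampling)

# aggregated classification error over all folds and repetitions
score = rr$aggregate(msr("classif.ce"))
score
```

The other keys listed in the table can be substituted in the `rsmp()` call, although some methods need additional parameters or upstream packages to be installed.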