Spatio-Temporal

Misc

  • Packages
    • {magclass} - Data Class and Tools for Handling Spatial-Temporal Data
      • Has datatable under the hood, so it should be pretty fast for larger data
      • Conversions, basic calculations and basic data manipulation
    • {mlr3spatiotempcv} - Extends the mlr3 machine learning framework with spatio-temporal resampling methods to account for the presence of spatiotemporal autocorrelation (STAC) in predictor variables
    • {spStack} - Bayesian inference for point-referenced spatial data by assimilating posterior inference over a collection of candidate models using stacking of predictive densities.
      • Currently, it supports point-referenced Gaussian, Poisson, binomial and binary outcomes.
      • Highly parallelizable and hence, much faster than traditional Markov chain Monte Carlo algorithms while delivering competitive predictive performance
      • Fits Bayesian spatially-temporally varying coefficients (STVC) generalized linear models without MCMC
    • {spTimer} (Vignette) - Spatio-Temporal Bayesian Modelling
      • Models: Bayesian Gaussian Process (GP) Models, Bayesian Auto-Regressive (AR) Models, and Bayesian Gaussian Predictive Processes (GPP) based AR Models for spatio-temporal big-n problems
      • Depends on {spacetime} and {sp}
    • {stars} - Reading, manipulating, writing and plotting spatiotemporal arrays (raster and vector data cubes) in ‘R’, using ‘GDAL’ bindings provided by ‘sf’, and ‘NetCDF’ bindings by ‘ncmeta’ and ‘RNetCDF’
      • It supercedes the {spacetime}, which extended the shared classes defined in {sp} for spatio-temporal data. {stars} uses PROJ and GDAL through {sf}.
  • Resources
  • Papers
  • Notes from
    • spacetime: Spatio-Temporal Data in R - It’s been superceded by {stars}, but there didn’t seem to be much about working with vector data in {stars} vignettes. So, it seemed like a better place to start for a beginner, and I think the concepts might transfer since the packages were created by the same people. Some packages still use sp and spacetime, so it could be useful in using with those packages

Grid Layouts

  • {spacetime} Classes
  • Full-Grids
    • Each coordinate (spatial feature vs. time) for each time point has a value (which can be NA) — i.e. a fixed set of spatial entities and a fixed set of time points.
    • Examples
      • Regular (e.g., hourly) measurements of air quality at a spatially irregular set of points (measurement stations)
      • Yearly disease counts for a set of administrative regions
      • A sequence of rainfall images (e.g., monthly sums), interpolated to a spatially regular grid. In this example, the spatial feature (images) are grids themselves.
  • Sparse Grids
    • Only coordinates (spatial feature vs. time) that have a value are included (no NA values)
    • Use Cases
      • Space-time lattices where there are many missing or trivial values
        • e.g. grids where points represent observed fires, and points where no fires are recorded are discarded.
      • Each spatial feature has a different set of time points
        • e.g. where spatial feaatures are regions and a point indicates when and where a crime takes place. Some regions may have frequent crimes while others hardly any.
      • When spatial features vary with time
        • Scenarios
          • Some locations only exist during certain periods.
          • Measurement stations move or appear/disappear.
          • Remote sensing scenes shift or have different extents.
          • Administrative boundaries change (e.g., counties/tracts splitting or merging).
        • Example: Suppose you have monthly satellite images of vegetation over a region for three months
          • January: the satellite covers tiles A, B, and C
          • February: cloud cover obscures tile B; only A and C are available
          • March: a different orbital path covers B and D (a new area)
        • Example: Crime locations over time for a city
          • In 2018, you have GPS points for all recorded crimes that year (randomly scattered).
          • In 2019, you have a completely different set of GPS points (different crimes, different places).
          • The number of points per year also varies.
  • Irregular Grids
    • Time and space points of measured values have no apparent organization: for each measured value the spatial feature and time point is stored, as in the long format.
    • No underlying lattice structure or indexes. Each observation is a standalone (space, time) pair. Repeated values are possible.
    • Essentially just a dataframe with a geometry column, datetime column, and value column.
    • Use Cases
      • Mobile sensors or moving animals: each record has its own location and timestamp — no fixed grid or station network.
      • Lightning strikes: purely random events in continuous space-time.
  • Trajectory Grids
    • Trajectories cover the case where sets of (irregular) space-time points form sequences, and depict a trajectory.
    • Their grouping may be simple (e.g., the trajectories of two persons on different days), nested (for several objects, a set of trajectories representing different trips) or more complex (e.g., with objects that split, merge, or disappear).
    • Examples
      • Human trajectories
      • Mobile sensor measurements (where the sequence is kept, e.g., to derive the speed and direction of the sensor)
      • Trajectories of tornados where the tornado extent of each time stamp can be reflected by a different polygon

Data Formats

  • Misc
    • {spacetime} supported Time Classes: Date, POSIXt, timeDate ({timeDate}), yearmon ({zoo}), and yearqtr ({zoo})
    • Data and Spatial Classes:
      • Points: Data having points support should use the SpatialPoints class for the spatial feature
      • Polygons: Values reflect aggregates (e.g., sums, or averages) over the polygon (SpatialPolygonsDataFrame, SpatialPolygons)
      • Grids: Values can be point data or aggregates over the cell.
    • stConstruct creates “an STFDF object if all space and time combinations occur only once, or else an object of class STIDF, which might be coerced into other representations.”
  • Long - The full spatio-temporal information (i.e. response value) is held in a single column, and location and time are also single columns.
    • Example: Private capital stock (?)

      #>     state year region     pcap     hwy   water    util       pc   gsp
      #> 1 ALABAMA 1970      6 15032.67 7325.80 1655.68 6051.20 35793.80 28418
      #> 2 ALABAMA 1971      6 15501.94 7525.94 1721.02 6254.98 37299.91 29375
      #> 3 ALABAMA 1972      6 15972.41 7765.42 1764.75 6442.23 38670.30 31303
      #> 4 ALABAMA 1973      6 16406.26 7907.66 1742.41 6756.19 40084.01 33430
      #> 5 ALABAMA 1974      6 16762.67 8025.52 1734.85 7002.29 42057.31 33749
      • Each row is a single time unit and space unit combination.
      • This is likely not the row order you want though. I think these spatio-temporal object creation functions want ordered by time, then by space. (See Example)
    • Example: Create full grid spacetime object from a long table (7.2 Panel Data in {spacetime} vignette)

      head(df_data); tail(df_data)
      #>          state year region      pcap      hwy    water     util        pc    gsp    emp unemp
      #> 1      ALABAMA 1970      6  15032.67  7325.80  1655.68  6051.20  35793.80  28418 1010.5   4.7
      #> 18     ARIZONA 1970      8  10148.42  4556.81  1627.87  3963.75  23585.99  19288  547.4   4.4
      #> 35    ARKANSAS 1970      7   7613.26  3647.73   644.99  3320.54  19749.63  15392  536.2   5.0
      #> 52  CALIFORNIA 1970      9 128545.36 42961.31 17837.26 67746.79 172791.92 263933 6946.2   7.2
      #> 69    COLORADO 1970      8  11402.52  4403.21  2165.03  4834.28  23709.75  25689  750.2   4.4
      #> 86 CONNECTICUT 1970      1  15865.66  7237.14  2208.10  6420.42  24082.38  38880 1197.5   5.6
      #>             state year region     pcap      hwy   water     util       pc   gsp    emp unemp
      #> 731       VERMONT 1986      1  2936.44  1830.16  335.51   770.78  6939.39  7585  234.4   4.7
      #> 748      VIRGINIA 1986      5 28000.68 14253.92 4786.93  8959.83 71355.78 88171 2557.7   5.0
      #> 765    WASHINGTON 1986      9 41136.36 11738.08 5042.96 24355.32 66033.81 67158 1769.9   8.2
      #> 782 WEST_VIRGINIA 1986      5 10984.38  7544.99  834.01  2605.38 35781.74 21705  597.5  12.0
      #> 799     WISCONSIN 1986      3 26400.60 10848.68 5292.62 10259.30 60241.65 70171 2023.9   7.0
      #> 816       WYOMING 1986      8  5700.41  3400.96  565.58  1733.88 27110.51 10870  196.3   9.0
      
      yrs <- 1970:1986
      vec_time <- 
        as.POSIXct(paste(yrs, "-01-01", sep=""), tz = "GMT")
      head(vec_time)
      #> [1] "1970-01-01 GMT" "1971-01-01 GMT" "1972-01-01 GMT" "1973-01-01 GMT" "1974-01-01 GMT" "1975-01-01 GMT"
      
      head(spatial_geom_ids)
      #> [1] "alabama"     "arizona"     "arkansas"    "california"  "colorado"    "connecticut"
      class(geom_states)
      #> [1] "SpatialPolygons"
      #> attr(,"package")
      #> [1] "sp"
      • The names of the objects above don’t the match the ones in the example, but I wanted names that were more informative about what types of objects were needed. The package documentation and vignette are spotty with their descriptions and explanations.
      • The row order of the spatial geeometry object should match the row order of the spatial feature column (in each time section) of the data dataframe.
      • The data_df only has the state name (all caps) and the year as space and time features.
      • spatial_geom_ids shows the order of the spatial geometry object (state polygons) which are state names in alphabetical order.
        • At least in this example, the geometry object (geom_states) didn’t store the actual state names. It just used a id index (e.g. ID1, ID2, etc.)
      st_data <- 
        STFDF(geom_states, 
              vec_time, 
              df_data)
      
      length(st_data)
      #> [1] 816
      nrow(df_data)
      #> [1] 816
      class(st_data)
      #> [1] "STFDF"
      #> attr(,"package")
      #> [1] "spacetime"
      
      df_st_data <- as.data.frame(st_data)
      head(df_st_data[1:6]); tail(df_st_data[1:6])
      #>           V1       V2 sp.ID       time    endTime timeIndex
      #> 1  -86.83042 32.80316   ID1 1970-01-01 1971-01-01         1
      #> 2 -111.66786 34.30060   ID2 1970-01-01 1971-01-01         1
      #> 3  -92.44013 34.90418   ID3 1970-01-01 1971-01-01         1
      #> 4 -119.60154 37.26901   ID4 1970-01-01 1971-01-01         1
      #> 5 -105.55251 38.99797   ID5 1970-01-01 1971-01-01         1
      #> 6  -72.72598 41.62566   ID6 1970-01-01 1971-01-01         1
      #>             V1       V2 sp.ID       time    endTime timeIndex
      #> 811  -72.66686 44.07759  ID44 1986-01-01 1987-01-01        17
      #> 812  -78.89655 37.51580  ID45 1986-01-01 1987-01-01        17
      #> 813 -120.39569 47.37073  ID46 1986-01-01 1987-01-01        17
      #> 814  -80.62365 38.64619  ID47 1986-01-01 1987-01-01        17
      #> 815  -90.01171 44.63285  ID48 1986-01-01 1987-01-01        17
      #> 816 -107.55736 43.00390  ID49 1986-01-01 1987-01-01        17
      • So combining of these objects into a spacetime object doesn’t seemed be based any names (e.g. a joining variable) of the elements of these separate objects.
      • Converting back to a dataframe adds 6 new columns to df_data:
        • V1, V2: Latitude, Longitude
        • sp.ID: IDs within the spatial geomtry object (geom_states)
        • time: Time object values
        • endTime: Used for intervals
        • timeIndex: An index of time “sections” in the data dataframe. (e.g. 1:17 for 17 unique year values)
  • Space-wide - Each space unit is a column
    • Example: Wind speeds

      #>   year month day   RPT   VAL   ROS   KIL   SHA  BIR   DUB   CLA   MUL
      #> 1   61     1   1 15.04 14.96 13.17  9.29 13.96 9.87 13.67 10.25 10.83
      #> 2   61     1   2 14.71 16.88 10.83  6.50 12.62 7.67 11.50 10.04  9.79
      #> 3   61     1   3 18.50 16.88 12.33 10.13 11.17 6.17 11.25  8.04  8.50
      #> 4   61     1   4 10.58  6.63 11.75  4.58  4.54 2.88  8.63  1.79  5.83
      #> 5   61     1   5 13.33 13.25 11.42  6.17 10.71 8.21 11.92  6.54 10.92
      #> 6   61     1   6 13.21  8.12  9.96  6.67  5.37 4.50 10.67  4.42  7.17
      • Each row is a unique time unit
    • Example: Create full grid spacetime object from a space-wide table (7.3 Interpolating Irish Wind in {spacetime} vignette)

      class(mat_wind_velos)
      #> [1] "matrix" "array" 
      dim(mat_wind_velos)
      #> [1] 6574   12
      mat_wind_velos[1:6, 1:6]
      #>             RPT        VAL         ROS         KIL         SHA        BIR
      #> [1,] 0.47349183  0.4660816  0.29476767 -0.12216857  0.37172544 -0.0549353
      #> [2,] 0.43828072  0.6342761  0.04762928 -0.48431251  0.23530275 -0.3264873
      #> [3,] 0.76797566  0.6297610  0.20133157 -0.03446898  0.07989186 -0.5358676
      #> [4,] 0.01118683 -0.4751407  0.13684623 -0.78709720 -0.79381717 -1.1049744
      #> [5,] 0.29298091  0.2851083  0.09805298 -0.54439303  0.02147079 -0.2707680
      #> [6,] 0.27715281 -0.2860777 -0.06624749 -0.47759668 -0.66795421 -0.8085874
      
      class(wind$time)
      #> [1] "POSIXct" "POSIXt" 
      wind$time[1:5]
      #> [1] "1961-01-01 12:00:00 GMT" "1961-01-02 12:00:00 GMT" "1961-01-03 12:00:00 GMT" "1961-01-04 12:00:00 GMT" "1961-01-05 12:00:00 GMT"
      
      class(geom_stations)
      #> [1] "SpatialPoints"
      #> attr(,"package")
      #> [1] "sp"
      • See Long >> Example >> Create full grid spacetime object for a more detailed breakdown of creating these objects
      • The order of station geometries in geom_stations should match the order of the columns in mat_wind_velos
        • In the vignette, the data used to create the geometry object is ordered according to the matrix using match. See R, Base R >> Functions >> match
      st_wind = 
        stConstruct(
          # space-wide matrix
          x = mat_wind_velos, 
          # index for spatial feature/spatial geometries
          space = list(values = 1:ncol(mat_wind_velos)),
          # datetime column from original data
          time = wind$time, 
          # spatial geometry
          SpatialObj = geom_stations, 
          interval = TRUE
        )
      class(st_wind)
      #> [1] "STFDF"
      #> attr(,"package")
      #> [1] "spacetime"
      
      df_st_wind <- as.data.frame(st_wind)
      dim(df_st_wind)
      #> [1] 78888     7
      head(df_st_wind); tail(df_st_wind)
      #>   coords.x1 coords.x2 sp.ID                time             endTime timeIndex     values
      #> 1  551716.7   5739060     1 1961-01-01 12:00:00 1961-01-02 12:00:00         1  0.4734918
      #> 2  414061.0   5754361     2 1961-01-01 12:00:00 1961-01-02 12:00:00         1  0.4660816
      #> 3  680286.0   5795743     3 1961-01-01 12:00:00 1961-01-02 12:00:00         1  0.2947677
      #> 4  617213.8   5836601     4 1961-01-01 12:00:00 1961-01-02 12:00:00         1 -0.1221686
      #> 5  505631.2   5838902     5 1961-01-01 12:00:00 1961-01-02 12:00:00         1  0.3717254
      #> 6  574794.2   5882123     6 1961-01-01 12:00:00 1961-01-02 12:00:00         1 -0.0549353
      #>       coords.x1 coords.x2 sp.ID                time             endTime timeIndex    values
      #> 78883  682680.1   5923999     7 1978-12-31 12:00:00 1979-01-01 12:00:00      6574 0.8600970
      #> 78884  501099.9   5951999     8 1978-12-31 12:00:00 1979-01-01 12:00:00      6574 0.1589576
      #> 78885  608253.8   5932843     9 1978-12-31 12:00:00 1979-01-01 12:00:00      6574 0.1536922
      #> 78886  615289.0   6005361    10 1978-12-31 12:00:00 1979-01-01 12:00:00      6574 0.1325158
      #> 78887  434818.6   6009945    11 1978-12-31 12:00:00 1979-01-01 12:00:00      6574 0.2058466
      #> 78888  605634.5   6136859    12 1978-12-31 12:00:00 1979-01-01 12:00:00      6574 1.0835638
      • space has a confusing description. Here it’s just a numerical index for the number of columns of mat_wind_velos or the geometries in geom_stations
      • interval = TRUE, since the values are daily mean wind speed (aggregation)
        • See Time and Movement section
      • df_st_wind is in long format
  • Time-wide - Each time unit is a column
    • Example: Counts of SIDS

      #>          NAME BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79
      #> 1        Ashe  1091     1      10  1364     0      19
      #> 2   Alleghany   487     0      10   542     3      12
      #> 3       Surry  3188     5     208  3616     6     260
      #> 4   Currituck   508     1     123   830     2     145
      #> 5 Northampton  1421     9    1066  1606     3    1197
      • SID74 contains to the infant death syndrome cases for each county at a particular time period (1974-1984). SID79 are SIDS deaths from 1979-84.
        • NWB is non-white births, BIR is births.
      • Each row is a spatial unit and unique
  • Time and Movement
    • s1 refers to the first feature/location, t1 to the first time instance or interval, thick lines indicate time intervals, arrows indicate movement. Filled circles denote start time, empty circles end times, intervals are right-closed
    • Spatio-temporal objects essentially needs specification of the spatial, the temporal, and the data values
    • Intervals - The spatial feature (e.g. location) or its associated data values does not change during this interval, but reflects the value or state during this interval (e.g. an average)
      • e.g. yearly mean temperature at a set of locations
    • Instants - Reflects moments of change (e.g., the start of the meteorological summer), or events with a zero or negligible duration (e.g., an earthquake, a lightning).
    • Movement - Reflects objects may change location during a time interval. For time instants. locations are at a particular moment, and movement occurs between registered (time, feature) pairs and must be continuous.
    • Trajectories - Where sets of (irregular) space-time points form sequences, and depict a trajectory.
      • Their grouping may be simple (e.g., the trajectories of two persons on different days), nested (for several objects, a set of trajectories representing different trips) or more complex (e.g., with objects that split, merge, or disappear).
      • Examples
        • Human trajectories
        • Mobile sensor measurements (where the sequence is kept, e.g., to derive the speed and direction of the sensor)
        • Trajectories of tornados where the tornado extent of each time stamp can be reflected by a different polygon