Spatial
Misc
- Also see
- Packages
- {spdep}
- Notes from
- Spatial Data Science, Ch.15
- Comparing Implementations of Global and Local Indicators of Spatial Association (Bivand, Wong 2018) (Also in R >> Documents >> Geospatial)
- The concern about conditional permutations in Local Moran’s I was later shown to be an formulaic mistake. It’s now been corrected in {spdep} (source)
- Entitation - The process of dividing continuous space into discrete units of observation (counties, census tracts, grid cells, etc.).
- Residual spatial autocorrelation has an effect of standard errors and on coefficient values.
Causes of Spurious Spatial Autocorrelation
- Omitted Variables
- A variable you haven’t included in your model has its own spatial pattern injects apparent spatial structure into your results.
- By including mulitple (relevant) predictors, testing the residuals is testing whether the unexplained variation is spatially clustered, not just whether the raw (response) variable has any spatial structure at all.
- The fix for true spatial autocorrelation — where nearby observations genuinely influence each other (diffusion, spillovers, contagion), is a spatial lag or spatial error model.
- Example: You’re mapping crime rates and notice strong spatial clustering, but you haven’t controlled for poverty, which is itself spatially clustered .
- The crime autocorrelation may largely disappear once poverty is accounted for.
- The spatial pattern was real, but its cause wasn’t proximity per se — it was a third variable you left out.
- Model Misspecification
- Modeling a nonlinear relationship as linear (or vice versa) can produce spurious spatial residuals.
- Entitation
- Choosing the wrong geographic units can manufacture or mask spatial autocorrelation
- If your units are too large, you average out variation within them and neighboring units end up looking artificially similar
- Certain causal effects may be present only at particular scales and missing this scale can lead to spurious autocorrelation
- See
- Regression, Spatial >> Econometrics >> Examples >> Example 1
- Geospatial, General >> Terms >> Copying Out
- Examples:
- Counties were drawn for governance, not for studying disease spread or economic activity
- Census tracts were drawn to have roughly equal populations, not to reflect neighborhood identity
- Grid cells impose a rigid geometry that real phenomena don’t respect
- Your units cut across a real spatial boundary (e.g., a river that divides two distinct communities), you mask genuine dissimilarity
Global Measures
Misc
- A single value of autocorrelation for the total study area. These measures consider the average autocorrelation level across all observations
- Can be biased by edge effects where important spatial process components fall outside the study area.
Join Count Test
- Categorical, discrete, or group-wise test of spatial autorcorrelation
- Answers the question: given the spatial arrangement, do same-type or cross-type neighbors occur more or less often than you’d expect by chance?
- The observed counts are of neighbors with the same factor levels, known as same-color joins.
- Process
- The observed same-color join-counts are tabulated with their expectations based on the counts of levels of the input factor
- Example: Few joins would be expected between Warsaw boroughs, because there are very few (18) of them.
- The variance calculation uses the underlying constants of the chosen listw object and the counts of levels of the input factor.
- The z-value is obtained by dividing the difference between the observed and expected join-counts by the square root of the variance.
- The observed same-color join-counts are tabulated with their expectations based on the counts of levels of the input factor
- sampling - The basis for the construction of the variance of the measure, where the default “nonfree” choice corresponds to analytical permutation
- Jtot is the count of all different-color joins. It tests whether cross-type joins are more or less than expected
- Interpretation
- Positive z-score → More joins and more clustering than expected (i.e. positive spatial autocorrelation)
- Negative z-score → Fewer joins and more dispersion than expected (i.e. negative spatial autocorrelation)
- Adjust p-values
- Multiple tests on the same dataset and neighbor graph, so a p-value adjustment should be done.
- Do same color (category) joins represent a different testing family than cross-color joins (therefore different numbers of comparisions?) or are these all together considered part of the same research question?
- If you have a lot of categories and the adjustment flips some interpretations when you assume these should all be treated as the same family, then this becomes an important question.
- Use the Holm (FWER) or Benjamini-Yekutieli (FDR) correction since there’s likely negative correlations among test statistics
- Structural Dependence: The join counts are computed from the same spatial weights matrix (i.e. same neighborhood graph) and the same set of polygons. Their z-values are correlated.
- Compositionally Constrained: If same type joins are high, then cross-joins are lower since there is fixed number of joins. This induces negative correlations.
- Cross-type pairs sharing a category could induce positive coorelations
- e.g. Urban:Rural and Urban/rural:Rural both involve Rural polygons. If Rural polygons are more clustered than expected, both join counts might track together — suggesting positive correlation between some cross-type pairs
- Multiple tests on the same dataset and neighbor graph, so a p-value adjustment should be done.
- Examples
Example: Polish City Types
words
library(dplyr); library(spdep) data(pol_pres15, package = "spDataLarge") pol_pres15 |> count(types) #> Simple feature collection with 4 features and 2 fields #> Geometry type: GEOMETRY #> Dimension: XY #> Bounding box: xmin: 171677.6 ymin: 133223.7 xmax: 861882.2 ymax: 774774 #> Projected CRS: ETRF2000-PL / CS92 #> types n geometry #> 1 Rural 1563 MULTIPOLYGON (((282864 3248... #> 2 Urban 303 MULTIPOLYGON (((315888.7 27... #> 3 Urban/rural 611 MULTIPOLYGON (((244371.6 33... #> 4 Warsaw Borough 18 POLYGON ((631202.9 480068, ...- Thoughts
nb_poly_pol <- pol_pres15 |> poly2nb(queen = TRUE) lw_b_poly_pol <- nb_poly_pol |> nb2listw(style = "B") # polygon centroids coords_pol <- pol_pres15 |> st_geometry() |> st_centroid(of_largest_polygon = TRUE) # 18300 determined the distance for a connected graph nb_d183_pol <- coords_pol |> dnearneigh(0, 18300) # Inverse distance weights may be constructed by taking the lengths of edges, changing units to avoid most weights being too large or small (here from metre to kilometre), taking the inverse # https://r-spatial.org/book/14-Areal.html#weights-specification wgts_idw_d183 <- nb_d183_pol|> nbdists(coords_pol) |> lapply(function(x) 1/(x/1000)) # in km lw_d183_idw_b_pol <- nb_d183_pol |> nb2listw(glist = wgts_idw_d183, style = "B")- Thoughts
joincount.multi(pol_pres15$types, listw = lw_b_poly_pol) |> as.data.frame() |> tibble::rownames_to_column("type") |> mutate( p_value = 2 * pnorm(abs(`z-value`), lower.tail = FALSE), p_value_adj = p.adjust(p_value, method = "BY") ) |> arrange(desc(p_value_adj)) |> ebtools::fmt_hl(c(p_value, p_value_adj), emphatic = TRUE) #> type Joincount Expected Variance z-value p_value p_value_adj #> 1 Urban:Urban 110 104.7185351 93.2993687 0.5467831 0.5845 1 #> 2 Warsaw Borough:Urban 9 12.4830042 11.7580036 -1.0157509 0.3097 1 #> 3 Warsaw Borough:Urban/rural 8 25.1719985 22.3538161 -3.6319930 0.0002812 0.001038 #> 4 Urban/rural:Rural 2359 2185.7685388 1267.1313345 4.8664913 1.14e-06 4.72e-06 #> 5 Warsaw Borough:Rural 12 64.3925265 46.4599085 -7.6865272 1.51e-14 7.17e-14 #> 6 Rural:Rural 3087 2793.9201781 1126.5342033 8.7320000 2.50e-18 1.39e-17 #> 7 Urban/rural:Urban/rural 656 426.5255306 331.7590322 12.5986206 2.15e-36 1.43e-35 #> 8 Urban/rural:Urban 171 423.7286419 352.1895385 -13.4668567 2.45e-41 2.04e-40 #> 9 Jtot 3227 3795.4855729 1496.3984180 -14.6958878 6.85e-49 7.58e-48 #> 10 Urban:Rural 668 1083.9408630 708.2086432 -15.6297121 4.57e-55 7.59e-54 #> 11 Warsaw Borough:Warsaw Borough 41 0.3501833 0.3474277 68.9646203 0 0 # substantial z-value changes using an inverse distance based listw object does, because closer centroids are up-weighted relatively strongly # Rural:Rural and Warsaw Borough:Rural go to insignificant w/adj # Jtot goes from highly signif to very not signif # Urban:Urban goes from very not signif to very signif # Warsaw Borough:Urban goes from not signif to very signif joincount.multi(pol_pres15$types, listw = lw_d183_idw_b_pol) |> as.data.frame() |> tibble::rownames_to_column("type") |> mutate( p_value = 2 * pnorm(abs(`z-value`), lower.tail = FALSE), p_value_adj = p.adjust(p_value, method = "BY") ) |> arrange(desc(p_value_adj)) |> ebtools::fmt_hl(c(p_value, p_value_adj), emphatic = TRUE) #> type Joincount Expected Variance z-value p_value p_value_adj #> 1 Warsaw Borough:Urban/rural 3.267639 3.25448276 0.551796957 0.01771153 0.9859 1 #> 2 Jtot 481.832119 490.71758713 41.570110157 -1.37812859 0.1682 0.5586 #> 3 Warsaw Borough:Rural 5.650239 8.32529715 1.725992663 -2.03616849 0.04173 0.154 #> 4 Rural:Rural 346.476174 361.22539320 49.314210235 -2.10030797 0.0357 0.1482 #> 5 Urban/rural:Urban/rural 46.497610 55.14540240 9.613381390 -2.78911983 0.005285 0.02508 #> 6 Urban/rural:Urban 36.498919 54.78379321 8.858649324 -6.14339187 8.08e-10 4.47e-09 #> 7 Urban/rural:Rural 225.173194 282.59758674 35.891709966 -9.58515926 9.23e-22 6.13e-21 #> 8 Urban:Urban 29.044510 13.53903891 2.228091144 10.38767831 2.82e-25 2.34e-24 #> 9 Urban:Rural 202.062080 140.14250210 23.644875577 12.73384239 3.83e-37 4.25e-36 #> 10 Warsaw Borough:Urban 9.180046 1.61392517 0.253919428 15.01499993 5.86e-51 9.73e-50 #> 11 Warsaw Borough:Warsaw Borough 16.822284 0.04527513 0.006608328 206.38053024 0 0- Thoughts
Moran’s I
Misc
- A measure of global spatial autocorrelation or overall clustering of the data
- If there is no global autocorrelation or no clustering, there can still be clusters at a local level (See Local Moran’s I)
- Assumes homegeneity (i.e. only one statistic is needed to summarize the whole study area)
- A measure of global spatial autocorrelation or overall clustering of the data
Equations
\[\begin{align} &I = \frac{N}{S_0} \frac{\sum_i \sum_j w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2} \\ & \text{where} \;\; S_0 = \sum w_{ij}\\ &\mathbb{E} (I) = - \frac{1}{n-1}\\ &\text{Var}(I) = \mathbb{E}(I^2) - [\mathbb{E}(I)]^2 \\ &Z(I) = \frac{I - \mathbb{E}(I)}{\sqrt{\text{Var}(I)}} \end{align}\]
- \(I\) is the observed statistic
- \(N\) is the number of spatial units (e.g. counties)
- \(w_{ij}\) is an element of the spatial weights matrix
- \(\mathbb{E}(I)\) is the Expected I value under the null hypothesis
- \(\text{Var}(I)\) is the theoretical variance of I under the null hypothesis
- \(Z(I)\) is the Standard Deviate
- Functions like a z-score in the NHST
- The components of the analytical variance (\(\text{Var}(I)\)) equation (under normalization and randomization) isn’t shown because it’s long. It’s computed from the weights and number of neighbors with observations.
Interpretation
- Range: \(w_{\text{min}}\frac{N}{W} \lt I \lt w_{\text{max}}\frac{N}{W}\)
- For a row normalized weight matrix, \(\frac{N}{W} = 1\) (Wiki)
- In {spdep}, this would be style = “W”
- I don’t get this. W = 1, but why would N also equal 1? Not sure if this right.
- For a row normalized weight matrix, \(\frac{N}{W} = 1\) (Wiki)
- Values significantly below the expected value are negatively correlated
- Values significantly above the exected value are positively correlated
- Range: \(w_{\text{min}}\frac{N}{W} \lt I \lt w_{\text{max}}\frac{N}{W}\)
moran.test- randomisation - boolean; TRUE (default) says calculate the variance of I under the assumption of randomization, if FALSE assume normality
- The only difference between tests under normality and randomization is that an extra term is added if the kurtosis of the variable of interest indicates a flatter or more peaked distribution, where the measure used is the classical measure of kurtosis
- drop.EI2 - Used to reproduce results where the final component of the variance term is omitted as found in some legacy software implementations (CrimeStat <= 4.02)
- Output
- Standared Deviate
- p.value
- Moran’s I statistic - The observed I
- Expectation - Expected I value under the null hypothesis
- Variance - The theoretical variance under the null hypothesis given the method.
- randomisation - boolean; TRUE (default) says calculate the variance of I under the assumption of randomization, if FALSE assume normality
Other Flavors
Moran Plot
- Finds observations that have strong influence on (global) Moran’s I
- Plots the values of the variable of interest against their spatially lagged values, typically using row-standardized weights (W) to make the axes more directly comparable
- Returns an influence measures object that’s used to label observations exerting more than proportional influence on the slope of the line which represents Global Moran’s I.
- Uses the spatial weights list object to get spatial lag values of the variable. The regresses the variable against the spatial lag variable (e.g.
lm(lags ~ x)). Thelmobject is then used to get the influential observations (viainfluence.measures). - Quadrants are formed by lines located at the means of the variable and lagged variable
Examples
Example 1:
# codeExample 2: Moran Plot
- plot: relatively fewer influential observations in the high-low (lagged vs orig) and low-high quadrants.
- map: suggests that some edge entities exert more than proportional influence (perhaps because of row standardization), as do entities in or near larger urban areas.
Geary’s C
\[\begin{align} &C = \frac{(N-1) \sum_i \sum_j w_{ij}(x_i-x_j)^2}{2S_0 \sum_i (x_i - \bar x)^2} \\ &\mathbb{E}(C) = 1 \\ &Z(C) = \frac{\mathbb{E}(C) - C}{\sqrt{\text{Var}(C)}} \end{align}\]
- A measure of global spatial autocorrelation or overall clustering of the data
- The analytical variance (\(\text{Var}(C)\)) equation and its components (under normalization and randomization) isn’t shown because it’s long. It’s computed from the weights and number of neighbors with observations.
- More sensitive to local spatial autocorrelation than Moran’s I so it can pick-up on spatial autocorrelation that Moran’s I might have missed.
- More computationally demanding than Moran’s I, especially when the spatial weights are dense
- \(N\) is the number of analysis units on the map
- \(w_{ij}\) is an element of the spatial weights matrix
- \(S_0\) is the sum of all \(w_{ij}\)
- \(Z(C)\) is the standard deviate
Local Measures
Misc
- Values of autocorrelation for each areal unit
- Local metrics can suffer from multiple testing issues when the number of group units is large.
- Local measures are likely to mislead users in the presence of global autocorrelation, and where the mean model is mis-specified in other ways.
LOSH.csandLOSH.mcperform Chi-square and Bootstrapping-based tests respectively for local spatial heteroscedasticity. Values of the measure greater than 1 indicate heightened local spatial heteroscedasticity (Paper)- The statistic may be used to identify boundaries of clusters and to describe the nature of heteroscedasticity within clusters, i.e. trends of homogeneity or heterogeneity in the neighborhood of a given observation
- i.e. If heteroskedacity (p-value < 0.05) is detected around an observation near a cluster/in a cluster \(\rightarrow\) the local neighborhood is heterogeneous \(\rightarrow\) that location is on the boundary of the cluster.
- Results demonstrate that thebootstrap method can provide a more accurate approximation than the chi-squaremethod
- The statistic may be used to identify boundaries of clusters and to describe the nature of heteroscedasticity within clusters, i.e. trends of homogeneity or heterogeneity in the neighborhood of a given observation
Local Moran’s I
\[\begin{align} &I_i = \frac{x_i - \bar x}{m_2} \sum_{j=1}^N w_{ij} (x_j - \bar x) \\ &\text{where} \;\; m_2 = \frac{\sum_{i=1}^N (x_i - \bar x)^2}{N} \end{align}\]
- Moran’s I is just the average of all \(I_i s\), \(I = \sum_{i=1}^N I_i /N\)
- Notes from The Importance of Null Hypotheses: Understanding Differences in Local Moran’s Ii under Heteroskedasticity (Sauer 2021)
- The false discovery rate (FDR) adjustment of p-values (e.g. BY) gives a better picture of interesting (used by Anselin instead of significant) clusters than no adjustment
- The type of null you choose will determine the type of spatial randomness you’ll potentially detect
- The Conditional Null asks: what if this site’s value were fixed and everything around it were reshuffled?
- Given that this location has this particular value, are its neighbors more similar to it than we’d expect by chance? e.g. a location has high degree of permanence like a natural feature (e.g. mountain) or a city center.
- Fits naturally in disease surveillance or crime analysis, where a particular neighborhood’s characteristics are somewhat fixed and you’re asking whether clustering around it is unusual
- Tends to find more significant clusters, which matters in public health contexts where missing a real cluster has serious consequences
- The Total Randomization Null asks: what if all values were randomly reshuffled across the map?
- Seems more in line with how point patterns are assessed. All locations are treated as equally to change coordinates.
- Also more likely to detect fewer clusters which may be useful if False Discovery is a primary concern.
- The Conditional Null asks: what if this site’s value were fixed and everything around it were reshuffled?
- \(Z(I_i)\) estimates are inflated in the presence of heteroskedasticity (see \(\text{LOSH} (H_i)\) in Misc) for either choice of null
- This is like getting inflated z-scores, so it’ll result in more false positives.
localmoranandlocalmoran_perm- conditional: TRUE (default) says the expectation and variance are calculated using the conditional randomization Null
- FALSE says expectation and variance are calculated using the total randomization null
- mlvar (default TRUE) and adjust.x (default FALSE) permit matching with other implementations
- conditional: TRUE (default) says the expectation and variance are calculated using the conditional randomization Null
Local Geary’s C
\[\begin{align} &C_i = \frac{1}{m_2} \sum_j w_{ij}(x_i - xj)^2\\ &\text{where} \;\; m_2 = \frac{\sum_i (x_i - \bar x)^2}{N-1} \end{align}\]
- Geary’s C is \(C=\sum_i C_i/2W\)