Geospatial
Misc
Symmetry
- Most spatial statistics (e.g. Moran’s I) assume reciprocal relationships and will symmetrize weights if they are asymmetric.
- Symmetrizing weights can introduce bias if the true underlying spatial process is inherently asymmetric (i.e. influence flows in one direction).
- Examples:
- In watershed studies where upstream locations affect downstream locations but not vice versa
- In wind dispersal where particles move predominantly in one direction
- In economic studies where influence flows from larger to smaller markets but not equally in reverse
- Reasons symmetry is usually assumed:
- Statistical inference often assumes undirected (symmetric) relationships
- Asymmetry creates a logical problem: how can A influence B but B not influence A in a spatial relationship?
- spdep::is.symmetric.nb can be used to check for symmetry in neighbor lists (e.g. the output of spdep::graph2nb)
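To make the idea concrete, here is a minimal sketch in plain Python (not {spdep}) of what a symmetry check and a symmetrization step do. The dict-of-lists neighbor format, the function names, and the three-unit "downstream" example are all assumptions made for illustration.

```python
def is_symmetric(neighbors):
    """True if every link i -> j is matched by a link j -> i."""
    return all(i in neighbors[j] for i, js in neighbors.items() for j in js)


def symmetrize(w):
    """Return (W + W^T) / 2 for a square weights matrix given as nested lists."""
    n = len(w)
    return [[(w[i][j] + w[j][i]) / 2 for j in range(n)] for i in range(n)]


# Hypothetical directed links: unit 0 flows into 1, and 1 flows into 2
downstream = {0: [1], 1: [2], 2: []}
# The same three units with reciprocal links
undirected = {0: [1], 1: [0, 2], 2: [1]}

print(is_symmetric(downstream))  # False
print(is_symmetric(undirected))  # True

# Symmetrizing the directed weights spreads each one-way link over both directions
w_directed = [[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]]
w_sym = symmetrize(w_directed)
print(w_sym)
```

After symmetrizing, unit 1 appears to influence unit 0 even though the original links only ran downstream, which is the source of the bias described above.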
Disjointedness
- Disjoint components are groups of points that are connected to each other but completely separated from other groups
- Think of islands in an archipelago:
- Points within each island are connected
- But there are no connections between islands
- Each island is a “component” or disjoint subgraph
- Could indicate alternative clustering schemes in mixed effects models
- Issues
- Disjoint components can break an assumption of connectivity between observations (aka spatial continuity)
- Block diagonal spatial weight matrices can affect computation of certain models
- Might need to analyze each block separately
- spdep::n.comp.nb gives the number of disjoint connected subgraphs
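The counting itself can be sketched with a breadth-first search. This is plain Python for illustration, not how {spdep} implements it. The dict-of-lists neighbor format and the two-island example are assumptions, and the sketch assumes the neighbor list is symmetric.

```python
from collections import deque


def count_components(neighbors):
    """Count connected components in a symmetric neighbor list (dict of lists)."""
    seen = set()
    n_comp = 0
    for start in neighbors:
        if start in seen:
            continue
        # An unvisited unit starts a new component; flood-fill everything reachable
        n_comp += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in neighbors[i]:
                if j not in seen:
                    seen.add(j)
                    queue.append(j)
    return n_comp


# Hypothetical archipelago: island A = {0, 1}, island B = {2, 3, 4}
islands = {0: [1], 1: [0], 2: [3, 4], 3: [2], 4: [2]}
n_islands = count_components(islands)
print(n_islands)  # 2
```

A result greater than 1 means the weights matrix is block diagonal once the units are reordered by component.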
Spatial Autocorrelation
Misc
- Local metrics can suffer from multiple testing issues when the number of group units is large
Moran’s I
\[ I = \frac{N}{W} \frac{\sum_i \sum_j w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2} \]
- A measure of global spatial autocorrelation or overall clustering of the data
- If there is no global autocorrelation or no clustering, there can still be clusters at a local level (See Local Moran’s I)
- Assumes homogeneity (i.e. only one statistic is needed to summarize the whole study area)
- \(N\) is the number of spatial units (e.g. counties)
- \(w_{ij}\) is an element of the spatial weights matrix
- \(W\) is the sum of all \(w_{ij}\)
- Values significantly below the expected value (\(E[I] = -1/(N-1)\) under no spatial autocorrelation) indicate negative spatial autocorrelation
- Values significantly above the expected value indicate positive spatial autocorrelation
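- Range: \(w_{\text{min}}\frac{N}{W} \lt I \lt w_{\text{max}}\frac{N}{W}\), where \(w_{\text{min}}\) and \(w_{\text{max}}\) are the minimum and maximum eigenvalues of the weights matrix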
- For a row normalized weight matrix, \(\frac{N}{W} = 1\) (Wiki)
- In {spdep}, this would be style = “W”
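- In a row normalized matrix, each row sums to 1, so the sum over all \(N\) rows is \(W = N\) and therefore \(\frac{N}{W} = 1\) (assuming every unit has at least one neighbor). It is each row sum that equals 1, not \(W\) or \(N\).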
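The formula for Moran’s I can be evaluated directly on a toy example. This is a minimal sketch in plain Python rather than {spdep}. The four-unit chain, the values in x, and the function names are assumptions chosen for illustration, and row_normalize assumes every unit has at least one neighbor.

```python
def morans_i(x, w):
    """Global Moran's I for values x and a weights matrix w (nested lists)."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]                 # deviations from the mean
    w_sum = sum(sum(row) for row in w)          # W, the sum of all weights
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    den = sum(zi ** 2 for zi in z)
    return (n / w_sum) * num / den


def row_normalize(w):
    """Divide each row by its sum so that every row sums to 1."""
    return [[wij / sum(row) for wij in row] for row in w]


# Hypothetical example: four units in a chain 0 - 1 - 2 - 3 with a steady trend
x = [1, 2, 3, 4]
w_binary = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]
w_row = row_normalize(w_binary)

# After row normalization the weights sum to N, so N / W = 1
total_weight = sum(sum(row) for row in w_row)
print(total_weight, len(x))

print(morans_i(x, w_binary))  # about 0.333
print(morans_i(x, w_row))     # about 0.4
```

Both values are positive because neighboring units have similar values. The two differ because the weighting scheme changes how much each pair contributes.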
Local Moran’s I
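\[ \begin{align} &I_i = \frac{x_i - \bar x}{m_2} \sum_{j=1}^N w_{ij} (x_j - \bar x) \\ &\text{where} \;\; m_2 = \frac{\sum_{i=1}^N (x_i - \bar x)^2}{N} \end{align} \]
- Summing the \(I_i\) gives \(\sum_{i=1}^N I_i = W \cdot I\), so \(I = \sum_{i=1}^N I_i / W\). For row normalized weights (\(W = N\)), Moran’s I is just the average of the \(I_i\): \(I = \sum_{i=1}^N I_i /N\)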
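The link between the local and global statistics can be checked numerically. In the sketch below (plain Python, with a hypothetical four-unit chain and illustrative function names), the local values sum to \(W \cdot I\), which reduces to a simple average only when \(W = N\), as with row normalized weights.

```python
def local_morans_i(x, w):
    """Local Moran's I_i for each unit, following the formula above."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]
    m2 = sum(zi ** 2 for zi in z) / n
    return [z[i] / m2 * sum(w[i][j] * z[j] for j in range(n)) for i in range(n)]


def global_morans_i(x, w):
    """Global Moran's I, used here only to check the local values against."""
    n = len(x)
    mean = sum(x) / n
    z = [xi - mean for xi in x]
    w_sum = sum(sum(row) for row in w)
    num = sum(w[i][j] * z[i] * z[j] for i in range(n) for j in range(n))
    return (n / w_sum) * num / sum(zi ** 2 for zi in z)


# Hypothetical example: four units in a chain 0 - 1 - 2 - 3, binary weights
x = [1, 2, 3, 4]
w_binary = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]
w_total = sum(sum(row) for row in w_binary)   # W = 6, while N = 4

local_i = local_morans_i(x, w_binary)
global_i = global_morans_i(x, w_binary)

print(sum(local_i) / w_total, global_i)   # these two agree
print(sum(local_i) / len(x))              # differs, because W != N here
```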
Geary’s C
\[ C = \frac{(N-1) \sum_i \sum_j w_{ij}(x_i-x_j)^2}{2W \sum_i (x_i - \bar x)^2} \]
- A measure of global spatial autocorrelation or overall clustering of the data
- More sensitive to local spatial autocorrelation than Moran’s I, so it can pick up spatial autocorrelation that Moran’s I might miss.
- \(N\) is the number of analysis units on the map
- \(w_{ij}\) is an element of the spatial weights matrix
- \(W\) is the sum of all \(w_{ij}\)
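As with Moran’s I, the formula can be evaluated directly. This is a minimal sketch in plain Python, not {spdep}; the four-unit chain and the function name are assumptions for illustration.

```python
def gearys_c(x, w):
    """Global Geary's C for values x and a weights matrix w (nested lists)."""
    n = len(x)
    mean = sum(x) / n
    w_sum = sum(sum(row) for row in w)          # W, the sum of all weights
    num = (n - 1) * sum(
        w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n)
    )
    den = 2 * w_sum * sum((xi - mean) ** 2 for xi in x)
    return num / den


# Hypothetical example: four units in a chain 0 - 1 - 2 - 3 with a steady trend
x = [1, 2, 3, 4]
w_binary = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]

c = gearys_c(x, w_binary)
print(c)  # about 0.3
```

Because C is built from squared differences between neighbors, similar neighboring values make it small; here every neighboring pair differs by only 1 while the overall variance is larger.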
Local Geary’s C
\[ \begin{align} &C_i = \frac{1}{m_2} \sum_j w_{ij}(x_i - x_j)^2\\ &\text{where} \;\; m_2 = \frac{\sum_i (x_i - \bar x)^2}{N-1} \end{align} \]
- Geary’s C is \(C = \frac{\sum_i C_i}{2W}\)
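The relationship between the local and global versions can be verified on a small example. This sketch uses plain Python with a hypothetical four-unit chain; the function names are illustrative only.

```python
def local_gearys_c(x, w):
    """Local Geary's C_i for each unit, following the formula above."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((xi - mean) ** 2 for xi in x) / (n - 1)
    return [
        sum(w[i][j] * (x[i] - x[j]) ** 2 for j in range(n)) / m2
        for i in range(n)
    ]


def global_gearys_c(x, w):
    """Global Geary's C, used here only to check the local values against."""
    n = len(x)
    mean = sum(x) / n
    w_sum = sum(sum(row) for row in w)
    num = (n - 1) * sum(
        w[i][j] * (x[i] - x[j]) ** 2 for i in range(n) for j in range(n)
    )
    return num / (2 * w_sum * sum((xi - mean) ** 2 for xi in x))


# Hypothetical example: four units in a chain 0 - 1 - 2 - 3, binary weights
x = [1, 2, 3, 4]
w_binary = [[0, 1, 0, 0],
            [1, 0, 1, 0],
            [0, 1, 0, 1],
            [0, 0, 1, 0]]
w_total = sum(sum(row) for row in w_binary)   # W = 6

local_c = local_gearys_c(x, w_binary)
global_c = global_gearys_c(x, w_binary)

# The local values sum to 2W times the global statistic
print(sum(local_c) / (2 * w_total), global_c)
```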