Epidemiology

Misc

Disease Mapping

  • Goals
    • Provide accurate estimates of mortality/incidence risks or rates in space and time
    • Unveil underlying spatial and spatio-temporal patterns
    • Detect high-risk areas or hotspots
  • Risk estimation using metrics such as Standardized Mortality Ratio (SMR) when analyzing rare diseases or low-populated areas are highly variable over time, so it’s diffficult to spot patterns and form hypotheses
    • SMR = Observed number of cases / Expected number of cases
      • SMR > 1: risk is greater than the whole region under study
      • Guessing “Expected number of cases” is the average number of cases for the whole study region
  • Statistical models smooth risks by borrowing information from spatio-temporal neighbors
    • The smoothed gradient over the entire study region makes it easier to detect patterns and form hypotheses than highly variable, local area metric estimates (e.g. SMR in a low populated county)
  • Traditional Models
    • Types
      • Mixed Poisson with conditional autoregressive (CAR) priors for “space” and random walk priors for “time” that include space ⨯ time interactions (Knorr-Held, 2000, Bayesian modeling of inseperable space-time variation in disease risk)
      • Reduced rank multidimensional P-splines (Ugarte et al, 2017, One-dimensional, two-dimensional, and three dimensional B-splines to specify space-time interactions in Bayesian disease mapping)
    • Issues
      • Estimating the cov-var matrix becomes intractable with big data and many areas since the covariance must be estimated between each pair of areas
      • CAR models assume the same level of spatial dependence between all areas which isn’t likely.
  • {bigDM}
    • Scalable non-stationary Bayesian models for high-dim, count data
    • Dependencies
      • Uses {future} for distributed computing
      • Integrated, nested laplace approximation (INLA) method through {R-INLA}
    • K-order neighborhood model
      • Breaks up local spatial or spatio-temporal domains so that estimations can distributed and local area dependencies (neighborhoods) can be accounted for.
      • “Areas” are usually districts, counties, provinces, etc.
        • Package does provide a method to create a “random” area grid
          • Might be useful to compare a random grid model with the e.g. county model to see how much county boundaries influence the estimates
      • Each local area model includes k adjacent areas which creates a partition
        • The local area estimate is smoothed by taking information from the adjacent areas
        • Adjacent areas also have estimate posteriors computed
        • Each area will have multiple posterior estimates from local area models where the area is the local area or where it is the adjacent area
      • Merge or don’t merge estimate posteriors for each area
        • Merge: use weights proportional to the conditional predictive ordinates (CPO) ???
        • Don’t Merge: Use the posterior marginal risk estimates of an area corresponding to the original submodel.
          • i.e. use the posterior where the area is the “local area” in that local area model and not an adjacent area.
        • Primary functions
          • CAR.INLA() fits several spatial CAR models for high dim count data
          • STCAR.INLA() fits several spatio-temporal CAR models for high dim count data