Residual Covariance Structures

Misc

Notes from
- Mixed Models with R >> Common Extensions >> Residual Structures
Packages:
- {rstanarm}, {brms}, {mgcv}
- {graphpcor} - Models for Correlation Matrices Based on Graphs
  - I think this package can help divine the correlation structure you want to use for your model, but the documentation currently sucks.
- Also see
  - Mixed Effects, GLMM >> Misc >> Packages
    - {glmmTMB}
  - Mixed Effects, General >> Misc >> Packages
    - {mmrm}
The reason for why {lme4} doesn’t have methods for using different covariance structures is that lmer employs a method that won’t allow it incorporate other covariance structures. This method is also why it is faster and better performing than other packages.
- {nlme} isn’t typically supported by other complementary packages (e.g. broom, ggeffects, etc.).

Correlation Structures supported by {nlme}

`corAR1`	autoregressive process of order 1.
`corARMA`	autoregressive moving average process, with arbitrary orders for the autoregressive and moving average components.
`corCAR1`	continuous autoregressive process (AR(1) process for a continuous time covariate).
`corCompSymm`	compound symmetry structure corresponding to a constant correlation.
`corExp`	exponential spatial correlation.
`corGaus`	Gaussian spatial correlation.
`corLin`	linear spatial correlation.
`corRatio`	Rational quadratics spatial correlation.
`corSpher`	spherical spatial correlation.
`corSymm`	general correlation matrix, with no additional structure.

Matrix Structures

The overall variance-covariance matrix is made up of submatrices for each subject
- Example: Varying Intercepts Model
  - Shows 5 subjects — each with 6 observations ($6 \times 6$ matrices).
  - Observations from one person have no covariance with another person (light gray area)
  - Within the subject matrix, there are variances on the diagonal and covariances of observations on the off-diagonal
    - Cell values are from the GPA varying intecepts model (link).
    - The diagonal (orange) is the total variance (Between + Within-Subject).
    - The off-diagonal values is the Between-Subject intercept variance
  - The reason for using between-subject variances and the overall variance for the cell values comes from them being components of the ICC (I think). The ICC can be interpreted as the correlation of two observations from the same group (wiki).
Types
- Each matrix represents 1 subject (aka unit, group, cluster, etc.) and each row/column represents one observation. Every subject’s matrix is the same.
- Compound Symmetry
  \[ \Sigma = \begin{bmatrix} \sigma^2 & \rho & \rho \\ \rho & \sigma^2 & \rho \\ \rho & \rho & \sigma^2 \end{bmatrix} \]
  - Also referred to as an exchangeable correlation structure since the correlations between any two observations within the same cluster are assumed to be the same, regardless of the time points or the ordering of the observations.
  - Structure of the typical mixed effects model
  - Sphericity is a relaxed form of compound symmetry, where all the correlations have the same value, i.e. $\rho_1 = \rho_2 = \rho_3$, and all variances are equal.
    - A Mixed ANOVA assumption is sphericity
- Heterogeneous Variance
  \[ \Sigma = \begin{bmatrix} \sigma_1^2 & 0 & 0 \\ 0 & \sigma_2^2 & 0 \\ 0 & 0 & \sigma_3^2 \end{bmatrix} \]
  - Each measure for each subject has an estimated variance and those are constant for each subject.
  - My current understanding is that this is between-subject variance.
  - For example, $\sigma_1^2$ is the variance between each subject’s first observation.
- Unstructured (or Symmetric) Correlation
  \[ \Sigma = \begin{bmatrix} \sigma^2 & \rho_1 & \rho_2 \\ \rho_1 & \sigma^2 & \rho_3 \\ \rho_2 & \rho_3 & \sigma^2 \end{bmatrix} \]
  - Variances are the same, but the correlations between observations are different
- Autocorrelation
  \[ \Sigma = \begin{bmatrix} \sigma^2 & \rho & \rho^2 & \rho^3 \\ \rho & \sigma^2 & \rho & \rho^2 \\ \rho^2 & \rho & \sigma^2 & \rho \\ \rho^3 & \rho^2 & \rho & \sigma^2 \\ \end{bmatrix} \]
  - Autocorrelation is when there is a time relationship between observations. Variances are the same, but correlations are powers of a single correlation ($\rho$) value which is the lag 1 correlation (AR1).
  - This structure assumes observations closer in time would be more strongly correlated than those further apart, or that the variance changes over time

Examples

Heterogeneous Variance

Example 1: {glmmTMB}

Data

Code

dplyr::glimpse(gpa)
#> Rows: 1,200
#> Columns: 10
#> $ student  <fct> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, …
#> $ occas    <fct> year 1 semester 1, year 1 semester 2, year 2 semester 1, year 2 semester 2, year 3 semester 1, year 3 semester 2, yea…
#> $ gpa      <dbl> 2.3, 2.1, 3.0, 3.0, 3.0, 3.3, 2.2, 2.5, 2.6, 2.6, 3.0, 2.8, 2.4, 2.9, 3.0, 2.8, 3.3, 3.4, 2.5, 2.7, 2.4, 2.7, 2.9, 2.…
#> $ job      <fct> 2 hours, 2 hours, 2 hours, 2 hours, 2 hours, 2 hours, 2 hours, 3 hours, 2 hours, 2 hours, 2 hours, 2 hours, 2 hours, …
#> $ sex      <fct> female, female, female, female, female, female, male, male, male, male, male, male, female, female, female, female, f…
#> $ highgpa  <dbl> 2.8, 2.8, 2.8, 2.8, 2.8, 2.8, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 3.8, 3.8, 3.8, 3.8, 3.8, 3.…
#> $ admitted <fct> yes, yes, yes, yes, yes, yes, no, no, no, no, no, no, yes, yes, yes, yes, yes, yes, no, no, no, no, no, no, yes, yes,…
#> $ year     <dbl> 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, …
#> $ semester <fct> 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, …
#> $ occasion <dbl> 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, …

Model

library(glmmTMB)

mod_gpa_vi_hetvar <- 
  glmmTMB(
    gpa ~ occasion 
          + (1 | student) 
          + diag(0 + occas | student), 
    data = gpa
  )

summary(mod_gpa_vi_hetvar)
#> Random effects:
#> 
#> Conditional model:
#>  Groups    Name                   Variance Std.Dev. Corr                     
#>  student   (Intercept)            0.093123 0.30516                           
#>  student.1 occasyear 1 semester 1 0.129833 0.36032                           
#>            occasyear 1 semester 2 0.086087 0.29341  0.00                     
#>            occasyear 2 semester 1 0.046240 0.21503  0.00 0.00                
#>            occasyear 2 semester 2 0.017615 0.13272  0.00 0.00 0.00           
#>            occasyear 3 semester 1 0.008709 0.09332  0.00 0.00 0.00 0.00      
#>            occasyear 3 semester 2 0.017730 0.13316  0.00 0.00 0.00 0.00 0.00 
#>  Residual                         0.008065 0.08980                           
#> Number of obs: 1200, groups:  student, 200
#> 
#> Dispersion estimate for gaussian family (sigma^2): 0.00806 
#> 
#> Conditional model:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) 2.598788   0.026201   99.19   <2e-16 ***
#> occasion    0.106141   0.004034   26.31   <2e-16 ***

diag specifies the covariance structure
occasion is essentially the observation id variable for each student
- The numeric variable, occasion, is used as a fixed effect and the factor variable, occas, is used in covariance specification.
Each student has six observations, so there are six variances.
The variances decrease over time which would indicate that student GPAs become more similar over time. Potential reasons could be course difficulties, assessment methods, or other time-varying factors.
Dispersion estimate for gaussian family is just the Residual variance (see Random Effects section of the summary)
- For other distributions, it’s the dispersion (e.g. Poisson)

Extract heterogeneous variances

Code

VarCorr(mod_gpa_vi_hetvar)$cond$student.1

# or

mixedup::extract_het_var(mod_gpa_vi_hetvar, 
                         scale = 'var')

Autocorrelation

Example 1: {glmmTMB}

mod_gpa_vi_auto <- 
  glmmTMB(
    gpa ~ occasion 
          + (1 | student)
          + ar1(0 + occas | student),
    data = gpa
  )

summary(mod_gpa_vi_auto)
#> Random effects:
#> 
#> Conditional model:
#>  Groups    Name                   Variance  Std.Dev.  Corr      
#>  student   (Intercept)            3.384e-10 0.0000184           
#>  student.1 occasyear 1 semester 1 9.283e-02 0.3046760 0.84 (ar1)
#>  Residual                         2.791e-02 0.1670688           
#> Number of obs: 1200, groups:  student, 200
#> 
#> Dispersion estimate for gaussian family (sigma^2): 0.0279 
#> 
#> Conditional model:
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) 2.597577   0.023385  111.08   <2e-16 ***
#> occasion    0.106875   0.005493   19.46   <2e-16 ***

See Hetergeneous Example 1 for a description of the data
ar1 specifies the autocorrelation structure.
occasion is essentially the observation id variable for each student
- The numeric variable, occasion, is used as a fixed effect and the factor variable, occas, is used in covariance specification.
Extract ar1 correlation
Code
```
corr_obj <- VarCorr(mod_gpa_vi_auto)$cond$student.1
attr(corr_obj, "correlation")[1,2]
#> [1] 0.8441728

# or

mixedup::extract_cor_structure(mod_gpa_vi_auto, 
                               which_cor = "ar1")
#>   parameter student.1
#>   <chr>         <dbl>
#> 1 ar1           0.844
```
- Finding it and writing a general function for extracting it can be ridiculous. Suggest using the {mixedup} function
- Remember that in an autocorrelation structure, the correlation values are just powers of one $\rho$ value and that value will be at the [1, 2] and [2, 1] coordinates. (See Matrix Structures >> Autocorrelation)

Spatial Autocorrelation
- Example 1