Residual Covariance Structures

Misc

  • Notes from

  • Packages: {glmmTMB}, {rstanarm}, {brms}, {mgcv}

  • The reason for why {lme4} doesn’t have methods for using different covariance structures is that lmer employs a method that won’t allow it incorporate other covariance structures. This method is also why it is faster and better performing than other packages.

    • {nlme} isn’t typically supported by other complementary packages (e.g. broom, ggeffects, etc.).
  • Correlation Structures supported by {nlme}

    corAR1 autoregressive process of order 1.
    corARMA autoregressive moving average process, with arbitrary orders for the autoregressive and moving average components.
    corCAR1 continuous autoregressive process (AR(1) process for a continuous time covariate).
    corCompSymm compound symmetry structure corresponding to a constant correlation.
    corExp exponential spatial correlation.
    corGaus Gaussian spatial correlation.
    corLin linear spatial correlation.
    corRatio Rational quadratics spatial correlation.
    corSpher spherical spatial correlation.
    corSymm general correlation matrix, with no additional structure.

Matrix Structures

  • The overall variance-covariance matrix is made up of submatrices for each subject
    • Example: Varying Intercepts Model
      • Shows 5 subjects — each with 6 observations (\(6 \times 6\) matrices).
      • Observations from one person have no covariance with another person (light gray area)
      • Within the subject matrix, there are variances on the diagonal and covariances of observations on the off-diagonal
        • Cell values are from the GPA varying intecepts model (link).
        • The diagonal (orange) is the total variance (Between + Within-Subject).
        • The off-diagonal values is the Between-Subject intercept variance
      • The reason for using between-subject variances and the overall variance for the cell values comes from them being components of the ICC (I think). The ICC can be interpreted as the correlation of two observations from the same group (wiki).
  • Types
    • Each matrix represents 1 subject (aka unit, group, cluster, etc.) and each row/column represents one observation. Every subject’s matrix is the same.
    • Compound Symmetry
      \[ \Sigma = \begin{bmatrix} \sigma^2 & \rho & \rho \\ \rho & \sigma^2 & \rho \\ \rho & \rho & \sigma^2 \end{bmatrix} \]
      • Also referred to as an exchangeable correlation structure since the correlations between any two observations within the same cluster are assumed to be the same, regardless of the time points or the ordering of the observations.
      • Structure of the typical mixed effects model
      • Sphericity is a relaxed form of compound symmetry, where all the correlations have the same value, i.e. \(\rho_1 = \rho_2 = \rho_3\), and all variances are equal.
        • A Mixed ANOVA assumption is sphericity
    • Heterogeneous Variance
      \[ \Sigma = \begin{bmatrix} \sigma_1^2 & 0 & 0 \\ 0 & \sigma_2^2 & 0 \\ 0 & 0 & \sigma_3^2 \end{bmatrix} \]
      • Each measure for each subject has an estimated variance and those are constant for each subject.
      • My current understanding is that this is between-subject variance.
      • For example, \(\sigma_1^2\) is the variance between each subject’s first observation.
    • Unstructured (or Symmetric) Correlation
      \[ \Sigma = \begin{bmatrix} \sigma^2 & \rho_1 & \rho_2 \\ \rho_1 & \sigma^2 & \rho_3 \\ \rho_2 & \rho_3 & \sigma^2 \end{bmatrix} \]
      • Variances are the same, but the correlations between observations are different
    • Autocorrelation
      \[ \Sigma = \begin{bmatrix} \sigma^2 & \rho & \rho^2 & \rho^3 \\ \rho & \sigma^2 & \rho & \rho^2 \\ \rho^2 & \rho & \sigma^2 & \rho \\ \rho^3 & \rho^2 & \rho & \sigma^2 \\ \end{bmatrix} \]
      • Autocorrelation is when there is a time relationship between observations. Variances are the same, but correlations are powers of a single correlation (\(\rho\)) value which is the lag 1 correlation (AR1).
      • This structure assumes observations closer in time would be more strongly correlated than those further apart, or that the variance changes over time

Examples

  • Heterogeneous Variance
    • Example 1: {glmmTMB}

      • Data

        Code

        dplyr::glimpse(gpa)
        #> Rows: 1,200
        #> Columns: 10
        #> $ student  <fct> 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 5, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, 7, 7, 7, …
        #> $ occas    <fct> year 1 semester 1, year 1 semester 2, year 2 semester 1, year 2 semester 2, year 3 semester 1, year 3 semester 2, yea…
        #> $ gpa      <dbl> 2.3, 2.1, 3.0, 3.0, 3.0, 3.3, 2.2, 2.5, 2.6, 2.6, 3.0, 2.8, 2.4, 2.9, 3.0, 2.8, 3.3, 3.4, 2.5, 2.7, 2.4, 2.7, 2.9, 2.…
        #> $ job      <fct> 2 hours, 2 hours, 2 hours, 2 hours, 2 hours, 2 hours, 2 hours, 3 hours, 2 hours, 2 hours, 2 hours, 2 hours, 2 hours, …
        #> $ sex      <fct> female, female, female, female, female, female, male, male, male, male, male, male, female, female, female, female, f…
        #> $ highgpa  <dbl> 2.8, 2.8, 2.8, 2.8, 2.8, 2.8, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 2.5, 3.8, 3.8, 3.8, 3.8, 3.8, 3.…
        #> $ admitted <fct> yes, yes, yes, yes, yes, yes, no, no, no, no, no, no, yes, yes, yes, yes, yes, yes, no, no, no, no, no, no, yes, yes,…
        #> $ year     <dbl> 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, 2, 3, 3, 1, 1, 2, …
        #> $ semester <fct> 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, 2, 1, …
        #> $ occasion <dbl> 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, 3, 4, 5, 0, 1, 2, …
      • Model

        library(glmmTMB)
        
        mod_gpa_vi_hetvar <- 
          glmmTMB(
            gpa ~ occasion 
                  + (1 | student) 
                  + diag(0 + occas | student), 
            data = gpa
          )
        
        summary(mod_gpa_vi_hetvar)
        #> Random effects:
        #> 
        #> Conditional model:
        #>  Groups    Name                   Variance Std.Dev. Corr                     
        #>  student   (Intercept)            0.093123 0.30516                           
        #>  student.1 occasyear 1 semester 1 0.129833 0.36032                           
        #>            occasyear 1 semester 2 0.086087 0.29341  0.00                     
        #>            occasyear 2 semester 1 0.046240 0.21503  0.00 0.00                
        #>            occasyear 2 semester 2 0.017615 0.13272  0.00 0.00 0.00           
        #>            occasyear 3 semester 1 0.008709 0.09332  0.00 0.00 0.00 0.00      
        #>            occasyear 3 semester 2 0.017730 0.13316  0.00 0.00 0.00 0.00 0.00 
        #>  Residual                         0.008065 0.08980                           
        #> Number of obs: 1200, groups:  student, 200
        #> 
        #> Dispersion estimate for gaussian family (sigma^2): 0.00806 
        #> 
        #> Conditional model:
        #>             Estimate Std. Error z value Pr(>|z|)    
        #> (Intercept) 2.598788   0.026201   99.19   <2e-16 ***
        #> occasion    0.106141   0.004034   26.31   <2e-16 ***
        • diag specifies the covariance structure

        • occasion is essentially the observation id variable for each student

          • The numeric variable, occasion, is used as a fixed effect and the factor variable, occas, is used in covariance specification.
        • Each student has six observations, so there are six variances.

        • The variances decrease over time which would indicate that student GPAs become more similar over time. Potential reasons could be course difficulties, assessment methods, or other time-varying factors.

        • Dispersion estimate for gaussian family is just the Residual variance (see Random Effects section of the summary)

          • For other distributions, it’s the dispersion (e.g. Poisson)
        • Extract heterogeneous variances

          Code

          VarCorr(mod_gpa_vi_hetvar)$cond$student.1
          
          # or
          
          mixedup::extract_het_var(mod_gpa_vi_hetvar, 
                                   scale = 'var')
  • Autocorrelation
    • Example 1: {glmmTMB}

      mod_gpa_vi_auto <- 
        glmmTMB(
          gpa ~ occasion 
                + (1 | student)
                + ar1(0 + occas | student),
          data = gpa
        )
      
      summary(mod_gpa_vi_auto)
      #> Random effects:
      #> 
      #> Conditional model:
      #>  Groups    Name                   Variance  Std.Dev.  Corr      
      #>  student   (Intercept)            3.384e-10 0.0000184           
      #>  student.1 occasyear 1 semester 1 9.283e-02 0.3046760 0.84 (ar1)
      #>  Residual                         2.791e-02 0.1670688           
      #> Number of obs: 1200, groups:  student, 200
      #> 
      #> Dispersion estimate for gaussian family (sigma^2): 0.0279 
      #> 
      #> Conditional model:
      #>             Estimate Std. Error z value Pr(>|z|)    
      #> (Intercept) 2.597577   0.023385  111.08   <2e-16 ***
      #> occasion    0.106875   0.005493   19.46   <2e-16 ***
      • See Hetergeneous Example 1 for a description of the data

      • ar1 specifies the autocorrelation structure.

      • occasion is essentially the observation id variable for each student

        • The numeric variable, occasion, is used as a fixed effect and the factor variable, occas, is used in covariance specification.
      • Extract ar1 correlation

        Code

        corr_obj <- VarCorr(mod_gpa_vi_auto)$cond$student.1
        attr(corr_obj, "correlation")[1,2]
        #> [1] 0.8441728
        
        # or
        
        mixedup::extract_cor_structure(mod_gpa_vi_auto, 
                                       which_cor = "ar1")
        #>   parameter student.1
        #>   <chr>         <dbl>
        #> 1 ar1           0.844
        • Finding it and writing a general function for extracting it can be ridiculous. Suggest using the {mixedup} function
        • Remember that in an autocorrelation structure, the correlation values are just powers of one \(\rho\) value and that value will be at the [1, 2] and [2, 1] coordinates. (See Matrix Structures >> Autocorrelation)
  • Spatial Autocorrelation
    • Example 1