Multivariate

Misc

  • Also see Mathematics, Statistics >> Multivariate
  • Packages
    • Compositional
      • {coda.pack} - Meta-Package for Compositional Data Analysis. Loads the following:
        • {coda.base} - A minimum set of functions to perform compositional data analysis using the log-ratio approach
        • {coda.plot} - Provides a collection of easy-to-use functions for creating visualizations of compositional data using ‘ggplot2’
      • {Compositional} - Compositional Data Analysis
        • Regression, classification, contour plots, hypothesis testing and fitting of distributions for compositional data are some of the functions included.
        • Functions for percentages (or proportions) are also included.
      • {CompositionalSR} - Spatial Regression Models with Compositional Data
      • {DirichletRF} - Implementation of the Dirichlet Random Forest algorithm for compositional response data
      • {prefviz} - Ternary plots in two and higher dimensions. The plots are compatible with other interactive R packages, allowing users to explore their ternary plots interactively.
      • {ZIDM} (Vignette, Paper) - Model for Multivariate Compositional Count Data
        • Uses a Bayesian Zero-Inflated Dirichlet-Multinomial Regression model
    • General
      • {BCD} - Implementation of bivariate binomial, geometric, and Poisson distributions based on conditional specifications.
      • {heplots} - Mostly for visualizing hypothesis tests in multivariate linear models (“MLM” = {MANOVA, multivariate multiple regression, MANCOVA, and repeated measures designs}).
        • Also provides other tools for analysis and graphical display of MLMs.
        • robmlm fits a multivariate linear model by robust regression, using a simple M-estimator that down-weights observations with large residuals. Estimation uses iteratively reweighted least squares (IRLS).
      • {kernreg} - Fast implementation of Nadaraya-Watson kernel regression for either univariate or multivariate responses, with one or more bandwidths. K-fold cross-validation is also performed
      • {randomForestSRC} - Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)
        • New Mahalanobis splitting rule for correlated real-valued outcomes in multivariate regression settings
      • {savvyPR} - Implements the Savvy Parity Regression methodology for multivariate linear regression analysis
        • Solves an optimization problem that balances the contribution of each predictor variable to ensure estimation stability in the presence of multicollinearity
        • Supports two distinct parameterization methods: a Budget-based approach that allocates a fixed loss contribution to each predictor, and a Target-based approach (t-tuning) that utilizes a relative elasticity weight for the response variable.
      • {savvySh} - Implements a suite of shrinkage estimators for multivariate linear regression to improve estimation stability and predictive accuracy
        • Includes the Stein estimator, Diagonal Shrinkage, the general Shrinkage estimator (solving a Sylvester equation), and Slab Regression (Simple and Generalized)
  • Papers

Generalized Joint Regression Modelling (GJRM)

  • A flexible statistical framework that generalizes classical regression models to jointly model multiple responses, potentially of different types, while accounting for dependencies between them.
    • It is particularly useful when you have multiple outcomes (e.g., continuous, binary, count data) that may influence each other.
    • Handles nonlinear associations between the response variables
  • Packages
    • {GJRM} - Routines for fitting various joint (and univariate) regression models, with several types of covariate effects, in the presence of association between the equations’ errors, endogeneity, non-random sample selection or partial observability.
  • Papers
  • Comparison to a Gaussian Multivariate Regression Model
    • Allows for more flexible marginal distributions, not limited to normal distributions.
    • Dependence Structure
      • Multivariate regression models the correlation between responses using a multivariate normal distribution, which implies a linear association.
      • GJRM uses copulas to model the dependence structure, allowing for more complex, non-linear associations between responses.
    • Flexibility
      • In multivariate regression with splines, the same spline structure is typically applied across all response variables.
      • In GJRM, each marginal can have its own unique non-linear structure, potentially using different splines or smoothing approaches for each response variable.
    • GJRM allows a different link function for each marginal distribution, accommodating various response types rather than only continuous ones.
    • GJRM can handle mixed types of responses (e.g., a combination of continuous, binary, and count data) in a single model.
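
For two responses, the copula-based dependence structure described above is commonly written using Sklar's theorem. The symbols below (F_1, F_2 for the marginal CDFs, C for the copula, θ for its dependence parameter, x for the covariates) are notation introduced here for illustration and do not come from the notes:

```latex
F(y_1, y_2 \mid \mathbf{x}) = C\bigl(F_1(y_1 \mid \mathbf{x}),\; F_2(y_2 \mid \mathbf{x});\; \theta\bigr)
```

When C is a Gaussian copula and both marginals are normal, this reduces to the bivariate normal distribution, so the Gaussian multivariate regression model can be seen as one special case of this construction.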
  • Steps
    • First stage: Models each marginal distribution separately, allowing for different distributions and link functions for each response.
    • Second stage: Combines these marginals using a copula function to create the joint distribution.
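
The two stages above can be illustrated with a small simulation. This is a minimal sketch rather than the {GJRM} API or its estimation routine: it assumes a binary response with a probit link, a count response with a Poisson distribution and log link, and a Gaussian copula with correlation rho. All function names (poisson_quantile, draw_joint, simulate, pearson) and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used by the Gaussian copula


def poisson_quantile(u, lam):
    """Smallest k with P(Y <= k) >= u for Y ~ Poisson(lam)."""
    u = min(u, 1.0 - 1e-12)  # guard against u == 1.0
    k = 0
    pmf = math.exp(-lam)
    cdf = pmf
    while cdf < u:
        k += 1
        pmf *= lam / k
        cdf += pmf
    return k


def draw_joint(eta_binary, eta_count, rho, rng):
    """Draw one (binary, count) pair from the assumed joint model."""
    # Stage 1: each marginal has its own distribution and link function.
    p = STD_NORMAL.cdf(eta_binary)  # probit link: P(Y1 = 1)
    lam = math.exp(eta_count)       # log link: Poisson mean

    # Stage 2: a Gaussian copula ties the marginals together.
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
    u1, u2 = STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)

    # Push the correlated uniforms through each marginal quantile function.
    y_binary = 1 if u1 > 1.0 - p else 0
    y_count = poisson_quantile(u2, lam)
    return y_binary, y_count


def simulate(n, eta_binary, eta_count, rho, seed=0):
    """Return two lists of length n: binary draws and count draws."""
    rng = random.Random(seed)
    pairs = [draw_joint(eta_binary, eta_count, rho, rng) for _ in range(n)]
    return [b for b, _ in pairs], [c for _, c in pairs]


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    binary, count = simulate(5000, eta_binary=0.2, eta_count=1.0, rho=0.7)
    print("mean of binary response:", sum(binary) / len(binary))
    print("mean of count response:", sum(count) / len(count))
    print("correlation between responses:", pearson(binary, count))
```

Changing rho alters only the dependence between the two responses and leaves each marginal distribution unchanged, which is the separation of marginals and dependence that the copula construction provides.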