Eric’s Data Science Toolbox

A personal R package of miscellaneous data science functions

Installation

Install from GitHub with:

# install.packages("remotes")
remotes::install_github("ercbk/ebtools")

What’s in it

| Function | Description | Complementary Packages |
|---|---|---|
| EDA | | |
| skim_arrow | Runs a {skimr::skim}-style analysis on an arrow object to get summary statistics with minimal memory usage | arrow |
| Geospatial | | |
| add_spatial_lags | Adds separate orders of spatial lags of a numeric variable to a dataframe | spdep |
| est_buffernmax | Estimates the bufferNmax hyperparameter for krigeST | gstat |
| pooled_temporal_variogram | Computes a pooled temporal variogram | gstat |
| Processing | | |
| fmt_hl | Smartly formats very small numbers (e.g. p-values) with optional highlighting | |
| to_js_array | Creates a JS array column | dataui |
| Statistics | | |
| cles | Calculates the Common Language Effect Size (CLES) | |
| get_boot_ci | Simplifies computation of bootstrapped confidence intervals for a statistic by wrapping multiple functions and using sensible defaults | |
| Time Series | | |
| add_mase_scale_feat | Adds a scale feature by using a factor derived from the MASE error function | |
| add_tsfeatures | Adds 20 time series features from the tsfeatures package | tsfeatures |
| create_dtw_grids | Creates a nested parameter grid list | dtwclust, dtw |
| descale_by_mase | Back-transforms a group time series that has been scaled by a factor derived from the MASE error function | |
| dtw_dist_gridsearch | Performs a grid search using a list of parameter grids and a list of distance functions | dtwclust, dtw |
| prewhitened_ccf | Prewhitens time series, calculates cross-correlation coefficients, and returns statistically significant values | |
| scale_by_mase | Scales a group time series by using a factor derived from the MASE error function | |
| test_fable_resids | Checks residuals of multiple fable models for autocorrelation using the Ljung-Box test | fable |
| test_lm_resids | Checks residuals of multiple lm models for autocorrelation using Breusch-Godfrey and Durbin-Watson tests | |
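
To make two of the ideas in the table concrete, here is a minimal base-R sketch of the Common Language Effect Size and of scaling a series by a MASE-derived factor. This is not the package's code or API: the names `cles_sketch` and `mase_scale_factor` are hypothetical, the CLES shown is the nonparametric pairwise version, and the scale factor is assumed to be the MASE denominator (the mean absolute error of an in-sample naive forecast). See the function reference for the actual arguments and behavior of `cles`, `scale_by_mase`, and `descale_by_mase`.

```r
# Illustrative base-R helpers with hypothetical names; NOT the ebtools API.

# Nonparametric CLES: P(X > Y) + 0.5 * P(X == Y), taken over all (x, y) pairs.
cles_sketch <- function(x, y) {
  gt <- outer(x, y, ">")   # TRUE where an x value exceeds a y value
  eq <- outer(x, y, "==")  # TRUE where the pair is tied
  mean(gt) + 0.5 * mean(eq)
}

# Assumed scale factor: the MASE denominator, i.e. the mean absolute error
# of the in-sample naive forecast (seasonal naive when m > 1).
mase_scale_factor <- function(y, m = 1) {
  mean(abs(diff(y, lag = m)))
}

x <- c(5, 6, 7)
y <- c(1, 2, 6)
cles_val <- cles_sketch(x, y)

series <- c(10, 12, 11, 15, 14)
k <- mase_scale_factor(series)  # scale factor for this series
scaled <- series / k            # scale the series
restored <- scaled * k          # back-transform to the original units
```

Dividing by a per-series factor like this puts series with different magnitudes on a comparable scale, and multiplying by the same factor recovers the original values.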