Privacy/Security
Misc
Packages
- {xxhashlite} - Very fast hash functions using xxHash
- {{metasyn}} - For generating synthetic tabular data with a focus on privacy
- {simPop} (Vignette) - Tools and methods to simulate populations for surveys based on auxiliary data. The tools include model-based methods, calibration and combinatorial optimization algorithms
- {sdcMicro} (CRAN) - For the generation of anonymized (micro)data, i.e. for the creation of public- and scientific-use files.
- {diffpriv} - Implements the formal framework of differential privacy. Differentially private mechanisms can safely release statistics, fitted models, or arbitrary structures derived from privacy-sensitive data to untrusted third parties.
- {encryptr} - Encrypt and decrypt data frame or tibble columns using strong RSA public/private keys
- {encryptedRmd} - Encrypt HTML Reports Using ‘Libsodium’
- {randomForestSRC} - Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)
- Anonymous random forests for data privacy
- {deident} (JOSS) - A framework for the replicable removal of personally identifiable data (PID) in data sets.
- {rmonocypher} - Simple encryption of R objects using a strong modern technique.
- Also see Simulation, Data
- Tools
- rclone - A command-line program to manage files on cloud storage. It is a feature-rich alternative to cloud vendors’ web storage interfaces.
- Over 70 cloud storage products support rclone including S3 object stores, business & consumer file storage services, as well as standard transfer protocols.
- Can encrypt files before upload, with the keys kept on your local device
- {staticrypt} - Password protect a static HTML page, decrypted in-browser in JS with no dependency. No server logic needed. (Useful for HTML reports)
hrbrmstr: “Please use long, “complex” *passphrases* for this tool/docs. These documents are susceptible to brute-force attacks, so you gotta make it hard for the attacker.
It uses solid encryption practices (AES-256 encryption; PBKDF2 for password hashing w/decent iterations); but — unlike public/private key-based message exchanges — once the password leaks, there’s no access revocation”
- VeraCrypt - Encrypt files before cloud upload for extra security
Tags
Tag sensitive information in dataframes
names(df)
#> [1] "date" "first_name" "card_number" "payment"

# assign pii tags
attr(df, "pii") <- c("name", "ccn", "transaction")
- Personally Identifiable Information (PII)
Tag dataframes with the names of regulations that are applicable
attr(df, "regs") <- c("CCPA", "GDPR", "GLBA")
- CCPA is the privacy regulation for California
- GDPR is the privacy regulation for the European Union
- GLBA (Gramm-Leach-Bliley Act) is a United States law governing how financial institutions handle customers' private financial information
- Needed because df has credit card and financial information
- Saving objects as .rds files preserves tags
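The tagging and saving steps above can be sketched as a round trip. This is a minimal illustration, assuming a small made-up data frame and a temporary file path (both hypothetical); `saveRDS()` and `readRDS()` are base R and keep custom attributes, whereas plain-text formats such as CSV do not store them.

```r
# Hypothetical example data; column names follow the tagging example above
df <- data.frame(
  date = as.Date(c("2024-01-05", "2024-01-06")),
  first_name = c("Ana", "Ben"),
  card_number = c("4111111111111111", "5500000000000004"),
  payment = c(25.10, 99.99)
)

# Tag the data frame with PII categories and applicable regulations
attr(df, "pii") <- c("name", "ccn", "transaction")
attr(df, "regs") <- c("CCPA", "GDPR", "GLBA")

# Round trip through an .rds file (temporary path, for illustration only)
path <- tempfile(fileext = ".rds")
saveRDS(df, path)
df_restored <- readRDS(path)

# The tags are still attached after reading the file back
attr(df_restored, "pii")
attr(df_restored, "regs")
```

A downstream script could check `attr(df_restored, "regs")` before exporting or sharing the data.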
Hashing
- {digest}
- Hash Function
- Apply Hash Function to PII Fields
- Hash Function
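The {digest} steps listed above (define a hash function, then apply it to the PII fields) might look like the sketch below. The helper name `hash_pii`, the optional `salt` argument, the choice of SHA-256, and the example data are all assumptions made for illustration, not the original notes' code. `digest()` hashes its whole argument as a single object, so the helper loops over elements with `vapply()`.

```r
library(digest)

# Hypothetical helper: SHA-256 hash of each element of a vector.
# serialize = FALSE hashes the string itself rather than R's serialized object.
# `salt` is an assumed optional prefix; it is not part of the digest API.
hash_pii <- function(x, salt = "") {
  vapply(
    paste0(salt, as.character(x)),
    digest,
    character(1),
    algo = "sha256",
    serialize = FALSE,
    USE.NAMES = FALSE
  )
}

# Hypothetical example data
df <- data.frame(
  first_name = c("Ana", "Ben", "Ana"),
  card_number = c("4111111111111111", "5500000000000004", "4111111111111111"),
  payment = c(25.10, 99.99, 12.00)
)

# Apply the hash function to the PII fields only
pii_cols <- c("first_name", "card_number")
df[pii_cols] <- lapply(df[pii_cols], hash_pii)

str(df)
```

Identical inputs map to identical hashes, so hashed columns can still be used for joins and grouping. Unsalted hashes of low-entropy fields such as card numbers can be brute-forced, which is why the sketch leaves room for a secret salt.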