Privacy/Security
Misc
- Also see Simulation, Data
- Packages
- {xxhashlite} - Very fast hash functions using xxHash
- {{metasyn}} - For generating synthetic tabular data with a focus on privacy
- {simPop} (Vignette) - Tools and methods to simulate populations for surveys based on auxiliary data. The tools include model-based methods, calibration and combinatorial optimization algorithms
- {sdcMicro} (CRAN) - For generating anonymized (micro)data, i.e. for creating public- and scientific-use files.
- {diffpriv} - Implements the formal framework of differential privacy. Differentially private mechanisms can safely release statistics, fitted models, or arbitrary structures derived from privacy-sensitive data to untrusted third parties.
- {encryptr} - Encrypt and decrypt data frame or tibble columns using strong RSA public/private keys
- {encryptedRmd} - Encrypt HTML reports using 'libsodium'
- {randomForestSRC} - Fast Unified Random Forests for Survival, Regression, and Classification (RF-SRC)
- Anonymous random forests for data privacy
- Tools
- staticrypt - Password protect a static HTML page, decrypted in-browser in JS with no dependency. No server logic needed. (Useful for HTML reports)
hrbrmstr: “Please use long, complex *passphrases* for this tool/docs. These documents are susceptible to brute-force attacks, so you gotta make it hard for the attacker.
It uses solid encryption practices (AES-256 encryption; PBKDF2 for password hashing w/decent iterations); but, unlike public/private key-based message exchanges, once the password leaks, there’s no access revocation.”
- VeraCrypt - Encrypt files before cloud upload for extra security
Tags
Tag sensitive information in dataframes
names(df)
#> [1] "date" "first_name" "card_number" "payment"
# assign pii tags
attr(df, "pii") <- c("name", "ccn", "transaction")
- Personally Identifiable Information (PII)
Tag dataframes with the names of regulations that are applicable
attr(df, "regs") <- c("CCPA", "GDPR", "GLBA")
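- CCPA (California Consumer Privacy Act) is California's privacy regulation
- GDPR (General Data Protection Regulation) is the European Union's privacy regulation
- GLBA (Gramm-Leach-Bliley Act) is a United States regulation covering how financial institutions handle consumer financial information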
- Needed because df has credit card and financial information
- Saving objects as .rds files preserves tags
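The tagging steps above and the .rds round trip can be sketched as follows. The data values, file paths, and the `df_restored` name are made up for illustration; only the column names and tag values come from the notes. The CSV step is added for contrast, since a CSV file stores cell values only.

```r
# Hypothetical data frame with the column names used above
df <- data.frame(
  date = as.Date(c("2024-01-05", "2024-01-06")),
  first_name = c("Ana", "Ben"),
  card_number = c("4111111111111111", "5500005555555559"),
  payment = c(25.10, 99.99)
)

# Tag the data frame with PII categories and applicable regulations
attr(df, "pii") <- c("name", "ccn", "transaction")
attr(df, "regs") <- c("CCPA", "GDPR", "GLBA")

# Round trip through an .rds file: attributes are serialized with the object
rds_path <- tempfile(fileext = ".rds")
saveRDS(df, rds_path)
df_restored <- readRDS(rds_path)

attr(df_restored, "pii")
attr(df_restored, "regs")

# For contrast, a CSV round trip keeps only the cell values, so the tag is gone
csv_path <- tempfile(fileext = ".csv")
write.csv(df, csv_path, row.names = FALSE)
is.null(attr(read.csv(csv_path), "pii"))
```

If tags need to survive a text format, they would have to be written out separately (e.g. in a sidecar file); that is a design choice outside these notes.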
Hashing
- {digest}
- Hash Function
- Apply Hash Function to PII Fields
- Hash Function
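A minimal sketch of the two {digest} steps listed above, assuming SHA-256 and a caller-supplied salt. The helper name `hash_pii`, the salt string, the sample data, and the choice of which columns count as PII are all hypothetical and not taken from the notes; `digest()` with `algo` and `serialize` is the package's own function.

```r
library(digest)

# Hypothetical helper: salted SHA-256 of each element of a vector.
# serialize = FALSE hashes the raw string rather than R's serialized object,
# so the result matches SHA-256 computed by other tools on the same text.
hash_pii <- function(x, salt = "") {
  vapply(
    paste0(salt, as.character(x)),
    digest,
    character(1),
    algo = "sha256",
    serialize = FALSE,
    USE.NAMES = FALSE
  )
}

# Hypothetical data frame with the column names used in the Tags section
df <- data.frame(
  date = as.Date(c("2024-01-05", "2024-01-06")),
  first_name = c("Ana", "Ben"),
  card_number = c("4111111111111111", "5500005555555559"),
  payment = c(25.10, 99.99)
)

# Assumption: these are the columns holding PII
pii_cols <- c("first_name", "card_number")

# Replace each PII column with its hashed values
df[pii_cols] <- lapply(df[pii_cols], hash_pii, salt = "replace-with-a-secret-salt")

str(df)
```

Hashing is deterministic, so equal inputs still map to equal hashes and joins on the hashed column keep working. Low-entropy fields such as card numbers can be brute-forced from an unsalted hash, which is why the sketch takes a salt that should be kept secret.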