General
Misc
Installing from a git repo (From link)
- Make a fork of the repo and then clone it to your local machine.
- To update, first set an upstream remote, then pull from it:
git remote add upstream git://github.com/benfulcher/hctsa.git
git pull upstream main
- To update the submodule in the repo:
git submodule update --init
Start R project and Git repo in whichever order (I think)
Create R project in RStudio
- Choose “New Directory” for all the templated projects (e.g. quarto book, shiny, etc.). None of the other choices have them.
- If you’ve already created a directory, it will NOT overwrite this directory or add to it. So you’ll either have to alter the name of your old directory or choose a new name.
Create repo on Github
- Add license and readme
Do work
Tools >> Version Control >> Project Set-up >> Version Control System >> Select Git
Open terminal and go to working directory of project
git checkout -B main
git pull origin main --allow-unrelated-histories
git add .
git commit -m "initial commit"
git push --set-upstream origin main
Turn off “LF will be replaced by CRLF the next time Git touches it”
- This message spams the terminal when committing changes from a Windows machine. It has to do with line endings in Windows vs Unix.
- Turn off:
git config core.autocrlf true
- See SO post for more details
View HTML file in browser
- Syntax: “https://raw.githack.com/<acct name>/<repo name>/<branch name>/<directory name>/<file name>.html”
Download files from repositories
- https://raw.githubusercontent.com/user/repository/branch/filename
# Or evidently this way works too
# adds ?raw=true to the end of the url
feat_all_url <- url("https://github.com/notast/hierarchical-forecasting/blob/main/3feat_all.RData?raw=true")
load(feat_all_url)
close(feat_all_url)
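The URL pattern above can also be assembled in the shell. This is only a sketch: raw_url is a hypothetical helper name, and the user, repository, branch, and file name are taken from the example above as assumptions about where the file lives.

```shell
# Hypothetical helper: build a raw.githubusercontent.com URL from its parts.
raw_url() {
  # $1 = user, $2 = repository, $3 = branch, $4 = path to the file inside the repo
  printf 'https://raw.githubusercontent.com/%s/%s/%s/%s\n' "$1" "$2" "$3" "$4"
}

url="$(raw_url notast hierarchical-forecasting main 3feat_all.RData)"
echo "$url"
# To download (needs network access): curl -L -o 3feat_all.RData "$url"
```

The download itself is left as a comment so the sketch runs without network access.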
Get filelist from repo and download to a directory
- ** Directory urls change as commits are made **
library(httr)
library(purrr)    # map, pluck, detect_index, %>%
library(stringr)  # str_detect

# example: get url for the data dir of covidcast repo
req <- httr::GET("https://api.github.com/repos/ercbk/Indiana-COVIDcast-Dashboard/git/trees/master?recursive=1") %>%
  httr::content()
# alphabetical order
trees <- req$tree %>%
  map(., ~pluck(.x, 1)) %>%
  as.character()
# returns 20 which is first instance, so 19 should be the "data" folder
detect_index(trees, ~str_detect(., "data/"))
# url for data dir
req$tree[[19]]$url

# example
# Get all the file paths from a repo
req <- GET("https://api.github.com/repos/etiennebacher/tidytuesday/git/trees/master?recursive=1")
# any request errors get printed
stop_for_status(req)
file_paths <- unlist(lapply(content(req)$tree, "[", "path"), use.names = F)
# file_path_wanted <- filter file_paths to the file you want
# gets the very last part of the path
file_wanted <- basename(file_path_wanted)
origin <- paste0("https://raw.githubusercontent.com/etiennebacher/tidytuesday/master/", file_wanted)
destination <- "output-path-with-filename-ext"
# if file doesn't already exist, download it from repo into destination
if (!file.exists(destination)) {
  # if root dir doesn't exist create it
  if (!file.exists("_gallery/img")) {
    dir.create("_gallery/img")
  }
  download.file(origin, destination)
}
Config Options
Notes from: Popular git config options - More options listed that are not presented here.
Setting Options
- Add via CLI:
git config --global <name> <value>
  - Example:
git config --global diff.algorithm histogram
- Delete by going into ~/.gitconfig and deleting the parameter and value
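A minimal sketch of setting, reading, and removing an option from the command line. It writes to a scratch file through --file rather than touching the real ~/.gitconfig; the scratch file and the variable names are assumptions made for the example.

```shell
set -eu
cfg="$(mktemp)"    # scratch file standing in for ~/.gitconfig

# Add (or overwrite) an option
git config --file "$cfg" diff.algorithm histogram
# Read it back
value="$(git config --file "$cfg" --get diff.algorithm)"
echo "diff.algorithm = $value"
# Remove it again instead of editing the file by hand
git config --file "$cfg" --unset diff.algorithm
# --get exits non-zero when the key is absent
if git config --file "$cfg" --get diff.algorithm >/dev/null; then
  still_set=yes
else
  still_set=no
fi
echo "still set: $still_set"
```

Swapping --file "$cfg" for --global applies the same commands to the real global config.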
merge.conflictstyle diff3 - Provides extra information on merge conflicts
<<<<<<< HEAD
def parse(input):
    return input.split("\n")
||||||| b9447fc
def parse(input):
    return input.split("\n\n")
=======
def parse(text):
    return text.split("\n\n")
>>>>>>> somebranch
- Between <<<<<<< HEAD and ||||||| b9447fc: This is your local code (the branch you have checked out)
- Between ||||||| b9447fc and =======: This is the original version of the code (the common ancestor)
- Between ======= and >>>>>>> somebranch: This is the code from the branch being merged in
- Therefore, the correct merge conflict resolution is return text.split("\n"), since that combines the changes from both sides (the renamed argument and the new separator).
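The conflict above can be reproduced in a throwaway repository. This is a sketch only: the repository lives in a temporary directory, and the file name parse.py and the user identity are assumptions for the example.

```shell
set -eu
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main          # name the first branch main on any Git version
git config user.name "Example User"
git config user.email "user@example.com"
git config merge.conflictstyle diff3           # repo-local setting for this demo

# Common ancestor
printf 'def parse(input):\n    return input.split("\\n\\n")\n' > parse.py
git add parse.py && git commit -q -m "base"

# somebranch renames the argument
git checkout -q -b somebranch
printf 'def parse(text):\n    return text.split("\\n\\n")\n' > parse.py
git commit -q -am "rename argument"

# main changes the separator
git checkout -q main
printf 'def parse(input):\n    return input.split("\\n")\n' > parse.py
git commit -q -am "change separator"

git merge somebranch >/dev/null || true        # the merge stops with a conflict
cat parse.py
base_markers="$(grep -c '^|||||||' parse.py)"  # the base section only appears with diff3/zdiff3
```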
merge.conflictstyle zdiff3 - A newer version of merge.conflictstyle diff3
A
B
C
D
E
<<<<<<< ours
F
G
||||||| base
# Add More Letters
=======
X
Y
Z
>>>>>>> theirs
- Above <<<<<<< ours: This is the original code plus the code that belongs to the branch that got merged that is not in conflict with your code
- Below <<<<<<< ours: This is the code that is in conflict with the branch (e.g. main) you’re merging into.
- Below ||||||| base: This is the code from the original version that both sides replaced
- Above >>>>>>> theirs: This is code from another branch that was merged before yours and that conflicts with your code.
push.default current - Tells git push to always push the local branch to a remote branch with the same name.
- push.default simple is the default in Git. It means git push only works if your branch is already tracking a remote branch.
- It’s also possible to push a local branch to a remote branch of a different name (git push <remote> <local branch>:<remote branch>).
init.defaultBranch main - Create a main branch instead of a master branch when creating a new repo. I normally do this on Github.
commit.verbose true - This adds the whole commit diff in the text editor where you’re writing your commit message, to help you remember what you were doing.
rerere.enabled true - This enables rerere (“reuse recorded resolution”), which remembers how you resolved merge conflicts during a git rebase and automatically resolves conflicts for you when it can.
core.pager delta - The “pager” is what git uses to display the output of git diff, git log, git show, etc.
- Values:
- delta: A fancy diff viewing tool with syntax highlighting
- less -x5,9 - Sets tabstops, which I guess helps if you have a lot of files with tabs in them?
- less -F -X - Not sure about this one. -F seems to disable the pager if everything fits on one screen, but git seems to do that already anyway
- cat - To disable paging altogether
- Delta also suggests that you set up interactive.diffFilter delta --color-only to syntax highlight code when you run git add -p.
diff.algorithm histogram - Improves the Patience algorithm for presenting diffs. See link in article for more details.
Default (I think the default algorithm is Myers.)
-.header {
+.footer {
     margin: 0;
 }

-.footer {
+.header {
     margin: 0;
+    color: green;
 }
- footer didn’t actually have margin: 0 and color: green in the original code like this diff makes it seem. In reality, the two rules have switched order with header gaining the additional property, color: green.
Histogram
-.header {
-    margin: 0;
-}
-
 .footer {
     margin: 0;
 }

+.header {
+    margin: 0;
+    color: green;
+}
- This shows header’s old rule without color: green at the top being removed. footer is accurately depicted as unchanged. Then, it shows header with the additional property, color: green, added below footer.
includeIf - Allows you to use different options depending which directory your project is in.
Example: Use this config file only if you’re in the “work” directory
[includeIf "gitdir:~/code/<work>/"]
    path = "~/code/<work>/.gitconfig"
- Good if, for example, you want to have a work email set for work repos and a personal email set for personal repos
insteadOf - Useful for correcting little mistakes you often make
See article for other use cases
Example: If you accidentally clone using https when you want to use SSH
[url "git@github.com:"]
    insteadOf = "https://github.com/"
- Now when you accidentally clone a repo using the https address, Git will rewrite it to the ssh address. You’ll then be using ssh to push changes, which is more secure.
Submodules
status.submoduleSummary true
diff.submodule log
submodule.recurse true
- See thread for details
- The top two “make git status and git diff display some more useful information on how things differ in submodules.”
- The bottom one aids in the updating of submodules when switching branches
diff.colorMoved default - Uses different colours to highlight lines in diffs that have been “moved”
- diff.colorMovedWS allow-indentation-change - With diff.colorMoved set, also ignores indentation changes
gpg.format ssh - Allows you to sign commits with SSH keys
merge.tool meld (or nvim, or nvimdiff) - Enables the use of git mergetool to help resolve merge conflicts
Optimizations
- For large repos, simple actions, like running git status or adding new commits can take many seconds. Cloning repos can take many hours.
- Benefits
- It improves the overall performance of your development workflow, allowing you to work more efficiently. This is especially important when working with large organizations and open source projects, where multiple developers are constantly committing changes to the same repository. A faster repository means less time waiting for Git commands such as git clone or git push to finish.
- It helps to optimize the storage space, as large files are replaced by pointers which take up less space. This can help avoid storage issues, especially when working with remote servers.
- Misc
See How to Improve Performance in Git: The Complete Guide
- Explainer, config settings, advanced gc, checkout, and clone commands
- Use .gitignore
  - Generated files, like cache or build files
    - They will be modified at each different generation, and there’s no need to keep track of those changes.
  - Third-party libraries
    - Instead, aim for a list of the required dependencies (and the correct version) so that everyone can download and install them whenever the repo is cloned.
    - For example, with a package.json file for JavaScript projects you can (and should) exclude the /node_modules folder.
  - .DS_Store files (which are automatically created by macOS) are another good candidate
- Git LFS
Designed specifically to handle large file versioning. LFS saves your local repositories from becoming unnecessarily big, preventing you from downloading unnecessary data.
- Git LFS intercepts any large files and sends them to a separate server, leaving a smaller pointer file in the repository that links to the actual asset on the Git LFS server.
This is an extension to the standard Git feature set, so you will need to make sure that your code hosting provider supports it (all the popular ones do).
Also need to download and install the CLI extension on your machine before installing it in your repository.
Set-Up
$ git lfs install
$ git lfs track "*.wav"
$ git lfs track "images/*.psd"
$ git lfs track "videos"
$ git add .gitattributes
- git lfs track tells Git LFS which file extensions it should manage.
- Git LFS notes the file names and patterns in the .gitattributes text file and, just like any other change, it should be staged and committed to the repository.
- Can now add files and commit as normal
List all file extensions being tracked:
git lfs track
List all files being managed:
git lfs ls-files
- Don’t download the version history if you don’t need to
git clone --depth 1 git@github.com:name/repo.git
Troubleshooting
- Diverged Branches
- Keeps asking for username/password when pushing
- Solution: You (or usethis::use_github/git, if you used it) probably set up an https connection when you need an ssh connection.
  - See https://docs.github.com/en/get-started/getting-started-with-git/managing-remote-repositories#changing-a-remote-repositorys-url to change from https to ssh.
- Undo a commit, but save changes made (e.g. you forgot to pull before you pushed)
- Steps
  - git log - Shows commit history. Copy the hash of the commit just before the one you want to undo
  - git diff <previous commit hash> > patch - Saves everything changed since that commit (i.e. your latest commit) to a file
  - git reset --hard HEAD^ - Reverts to the previous commit
    - **After this, your changes will be lost locally**
  - git log - Confirm that you are now at the previous commit
  - git pull - Correct the mistake you made in the first place
  - patch -p1 < patch - Apply the changes you originally made
  - git diff - Confirm that the changes have been reapplied
  - Now, you do the regular commit, push routine
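The undo-and-reapply steps can be tried safely in a throwaway repository. This is a sketch under assumptions: there is no remote, so git pull is only marked by a comment; git apply stands in for patch -p1 so that only Git is required; file and variable names are made up for the example.

```shell
set -eu
repo="$(mktemp -d)"; patchfile="$(mktemp)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "first line" > notes.txt
git add notes.txt && git commit -q -m "base"
echo "second line" >> notes.txt
git commit -q -am "commit to undo"

prev="$(git rev-parse HEAD^)"       # hash of the commit before the one being undone
git diff "$prev" > "$patchfile"     # save everything changed since that commit
git reset -q --hard HEAD^           # drop the commit; the changes now live only in the patch
# ... in a real repo, git pull would go here ...
git apply "$patchfile"              # reapply the saved changes (stand-in for patch -p1)

commits="$(git rev-list --count HEAD)"
last_line="$(tail -n 1 notes.txt)"
git status --short                  # notes.txt shows as modified and ready to commit again
```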
- Undo uncommitted changes: git stash followed by git stash drop
  - “But only use if you commit often”
  - Guessing this is not good if your commit is somehow large and/or involves multiple files
- Search commits by string:
git log --grep <string>
- Pinpoint bugs in your commit history
  - Instead of sequentially searching each previous commit to look for the bad commit, git bisect helps you perform a binary search for the commit, which saves time.
  - Scenario: A bug is introduced in a codebase, but it is not discovered until later. The feature used to work, but now, it does not. The feature was definitely known to work 3 weeks ago.
Manual Workflow
Make sure you’re in the current commit that’s bad and start git bisect
- (1) This labels the current commit as bad (i.e. bug is present)
- (2) This lists every commit for the last 3 weeks
- (3) Switch to the commit that’s the version of the project from 3 weeks ago, when the feature was supposedly working. The first commit listed (i.e. top) will be the commit closest to 3 weeks ago, with older commits below it. You only need to use the first 6 or so digits of the commit hash.
Recompile code and test commit for bug
devtools::load_all()
- load_all will recompile your package using this current version’s code
- After recompiling code, use your reproducible example to see if the bug is present in this version
- If the bug is still present, then go to the next older commit and repeat the process. Keep loading older commits until you find one that doesn’t have the bug.
  - If this is the case and assuming you don’t have to go back too much further to find a “good” commit, then you can stop here since you’ll have found the bad commit that introduced the bug.
  - If you don’t find a good commit around this time period, then quit the current git bisect session using git bisect reset, choose whichever commit you stopped at as the new starting point for a new git bisect session, and repeat this whole workflow.
Go to terminal and mark this commit as good
git bisect good
- Git will automatically switch you to the commit that’s the midway point between the “start” commit and the commit you labeled as “good.”
- It tells you how many commits are currently between you and the “start” commit, which is the same amount as between this midway commit and the commit you labeled as “good.”
- It also tells you how many more bisections (“steps”) you’ll have to go through to find the commit responsible for the bug.
Repeat Step 2 and test this version for the bug. Then label the commit as good or bad
git bisect bad
- Afterwards, git will automatically checkout the commit that is midway between this commit and either the “start” commit or the end commit, depending on whether you labeled this current commit as good or bad.
Continue labelling commits until git’s message is “<some commit hash> is the first bad commit.”
- Git will also show you the commit message and a list of files that were changed.
Use git show <commit hash> to see the diff
(Optional) Use git bisect log > file-name to save the session to a file.
Use git bisect reset to exit and return to where you were at the start of this workflow (HEAD)
Automatic Workflow
Write a script that includes your reproducible example and have it return an exit code of 0 if the version does not contain the bug, or a non-zero exit code if it does contain the bug.
Example
devtools::load_all()
if (nr != nrow(df)) {
  stop("error")
}
- load_all will recompile your package using this current version’s code
- Rscript exits with a non-zero code when the check fails (i.e. nr != nrow(df) is TRUE, so stop() is called) and with 0 otherwise.
- Could also use stopifnot here.
(Optional) If you already know the hash of the commit from 3 weeks ago that does not have the bug, you can bypass step 3.
# Make sure you're in the current commit that's got the bug
git bisect start
git bisect bad              # (1)
git bisect good 23348b0     # (2)
- (1) This labels the current commit as bad (i.e. bug is present)
- (2) This labels the commit from 3 weeks ago that you know doesn’t have the bug
Do steps 1, 2, and 3 of the Manual Workflow
Run auto-bisect
git bisect run Rscript test.R
- test.R is the script from step 1 that determines whether the version (i.e. commit hash) of your code has the bug.
- This runs through all the steps of the Manual Workflow and determines whether each version of the code is “good” or “bad” by whether the script returns an exit code of zero or non-zero.
Read final message to get the commit hash with bug in it.
- Message will be “<some commit hash> is the first bad commit.”
- Git will also show you the commit message and a list of files that were changed.
See steps 6, 7, and 8 of the Manual Workflow
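The automatic workflow can be rehearsed on a toy history. In this sketch a shell one-liner stands in for Rscript test.R, the "bug" is simply the word broken in value.txt, and all file, commit, and variable names are assumptions made for the example.

```shell
set -eu
repo="$(mktemp -d)"; log="$(mktemp)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

# Toy history of 8 commits; the bug appears in commit 6 and stays
for i in 1 2 3 4 5 6 7 8; do
  if [ "$i" -ge 6 ]; then echo "broken" > value.txt; else echo "ok" > value.txt; fi
  echo "$i" > counter.txt
  git add value.txt counter.txt
  git commit -q -m "commit $i"
done
expected="$(git log --format=%H --grep='^commit 6$')"
first="$(git rev-list --max-parents=0 HEAD)"     # oldest commit, known to be good

git bisect start >/dev/null
git bisect bad >/dev/null                        # current commit has the bug
git bisect good "$first" >/dev/null              # known good commit
# Test command: exit 0 means good, exit 1 means bad
git bisect run sh -c 'grep -q "^ok$" value.txt' >"$log" 2>&1
found="$(git rev-parse refs/bisect/bad)"         # where bisect recorded the first bad commit
git bisect reset >/dev/null 2>&1

grep "is the first bad commit" "$log"
```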
Pulling
Save your changes, pull in an update, apply your changes
git stash
git pull
git stash pop
- git stash pop throws away the (topmost, by default) stash after applying it, whereas git stash apply leaves it in the stash list for possible later reuse (or you can then git stash drop it).
Regarding potential merge conflicts
- “For instance, say your stashed changes conflict with other changes that you’ve made since you first created the stash. Both pop and apply will helpfully trigger merge conflict resolution mode, allowing you to nicely resolve such conflicts… and neither will get rid of the stash, even though perhaps you’re expecting pop to. Since a lot of people expect stashes to just be a simple stack, this often leads to them popping the same stash accidentally later because they thought it was gone.”
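The difference between apply and pop shows up in the stash list. A minimal sketch in a throwaway repository; there is no remote here, so git pull is only marked by a comment, and the file and variable names are assumptions.

```shell
set -eu
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo one > file.txt && git add file.txt && git commit -q -m "base"
echo two >> file.txt                 # uncommitted local edit
git stash >/dev/null                 # shelve it; the working tree is clean again
# ... git pull would go here in a repo with a remote ...
git stash apply >/dev/null           # reapply but keep the stash entry
after_apply="$(git stash list | wc -l | tr -d ' ')"
git checkout -- file.txt             # discard the reapplied edit
git stash pop >/dev/null             # reapply and drop the stash entry
after_pop="$(git stash list | wc -l | tr -d ' ')"
echo "entries after apply: $after_apply, after pop: $after_pop"
```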
Fetching just gets the info about the commits made to the remote repo
git fetch origin
Some technical discussion for always using git pull --ff-only
- https://blog.sffc.xyz/post/185195398930/why-you-should-use-git-pull-ff-only-git-is-a
- https://megakemp.com/2019/03/20/the-case-for-pull-rebase/
- It’s still confusing, but pull rebase sounds fine to me
- The --global flag says do it for all my repos
- pull.rebase true makes git pull rebase instead of merge; pull.ff only makes git pull refuse to do anything other than a fast-forward
- git pull --help will open the doc in the browser
Pulling by rebase
Local: Using this method as default
git config pull.rebase true
git pull
Remote
git pull --rebase
Pulling by fast-forward
Local: Using this method as default
git config --global pull.ff only
git pull
Remote
git pull --ff-only
Branching
Misc
Operations
Create a branch (e.g. “testing”)
git branch testing
Work in a branch
git checkout testing
- The files in your working directory change to the version saved in that branch
- It adds, removes, and modifies files automatically to make sure your working copy is what the branch looked like on your last commit to it.
Create and work in a branch
# new way
git switch -c testing
# or
git checkout -b testing
# or
git branch testing
git checkout testing
- Creates the branch and switches you to working in that branch
- If you made a bunch of changes in a codebase, only to realize that you’re working on master, switch will bring those uncommitted local changes with you to the new branch, so you can commit them there instead of on master.
  - If you already committed to master, then those commits are in both your new branch and master, so you would still have to clean up the master branch.
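A small sketch of uncommitted edits travelling with you to a new branch. It uses git checkout -b so it also runs on older Git versions; git switch -c testing should behave the same on Git 2.23 or newer. The branch is called main here, and the repository, file, and variable names are assumptions.

```shell
set -eu
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo "stable" > file.txt && git add file.txt && git commit -q -m "base"
echo "work in progress" >> file.txt      # edit made while still on main, not committed
git checkout -q -b testing               # create the branch and move to it
branch="$(git symbolic-ref --short HEAD)"
dirty="$(git status --porcelain)"        # the edit is still there, uncommitted
echo "on branch: $branch"
echo "pending changes:$dirty"
```

Because nothing was committed, main and testing still point at the same commit; the edit only becomes part of testing once it is committed there.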
Moving between branches
From master to testing
git checkout testing
Local files are deleted and replaced with branch versions
- Alternative: worktree
Example
What happens when you move from branch A to branch B
BRANCH-A        BRANCH-B
alpha.txt       alpha.txt
bravo.txt
charlie.txt     charlie.txt
                delta.txt
- bravo.txt is deleted from your local disk and delta.txt is added
- If any changes to alpha.txt or charlie.txt have been made and no commit has been made, the checkout will be aborted
  - So either revert the changes or commit the changes
Untracked files or newly created files
- If you have branch A checked out and you create a new file called echo.txt, Git will not touch this file when you checkout branch B. This way, you can decide that you want to commit echo.txt against branch B without having to go through the hassle of (1) moving the file outside the repo, (2) checking out the correct branch, and (3) moving the file back into the repo.
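The branch A / branch B example, including the untracked echo.txt, can be replayed in a throwaway repository. This is a sketch; the branch names branch-a and branch-b and the file contents are assumptions.

```shell
set -eu
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"

echo a > alpha.txt; echo c > charlie.txt
git add alpha.txt charlie.txt && git commit -q -m "shared files"
git branch branch-b                      # both branches start from the shared commit
git checkout -q -b branch-a
echo b > bravo.txt && git add bravo.txt && git commit -q -m "add bravo on A"
git checkout -q branch-b
echo d > delta.txt && git add delta.txt && git commit -q -m "add delta on B"

git checkout -q branch-a
echo e > echo.txt                        # new, untracked file created while on A
on_a="$(echo *)"
git checkout -q branch-b                 # bravo.txt goes away, delta.txt appears, echo.txt stays
on_b="$(echo *)"
echo "A: $on_a"
echo "B: $on_b"
```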
Deleting a branch
Local branch
git branch -d testing
Remote branch
git push <remoteName> --delete <branchName>
See existing branches
git branch
See what has been committed to the remote repo branches
git fetch origin
git branch -vv
origin is the name of the remote
Result
  testing 7e424c3 [origin/testing: ahead 2, behind 1] change abc
  master  1ae2a45 [origin/master] Deploy index fix
* issue   f8674d9 [origin/issue: behind 1] should do it
  cart    5ea463a Try something new
- Format: branch, last commit sha-1, local branch status vs remote branch status, commit message
- The star indicates the HEAD pointer’s location (where you’re at, i.e. the branch you have checked out)
- testing branch
- ahead 2 means I committed twice to the local testing branch and this work has not been pushed to the remote testing branch repo yet.
- behind 1 means someone has pushed a commit to the remote testing branch repo and we haven’t merged this work to our local testing branch
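The "ahead 2, behind 1" state can be staged with a local bare repository playing the remote and a second clone playing the collaborator. A sketch only: all directory, file, and user names are assumptions.

```shell
set -eu
work="$(mktemp -d)"
git init -q --bare "$work/remote.git"            # stands in for the hosted repo

# First clone: create the testing branch and publish one commit
git clone -q "$work/remote.git" "$work/local" 2>/dev/null
cd "$work/local"
git symbolic-ref HEAD refs/heads/testing
git config user.name "Example User"
git config user.email "user@example.com"
echo 1 > mine.txt && git add mine.txt && git commit -q -m "first"
git push -q -u origin testing

# Second clone plays the collaborator, who pushes one commit
git clone -q -b testing "$work/remote.git" "$work/other"
(
  cd "$work/other"
  git config user.name "Other User"
  git config user.email "other@example.com"
  echo 1 > theirs.txt && git add theirs.txt && git commit -q -m "collaborator commit"
  git push -q origin testing
)

# Two local commits that are not pushed yet
echo 2 >> mine.txt && git commit -q -am "second"
echo 3 >> mine.txt && git commit -q -am "third"

git fetch -q origin
git branch -vv
ahead="$(git rev-list --count origin/testing..testing)"
behind="$(git rev-list --count testing..origin/testing)"
echo "ahead $ahead, behind $behind"
```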
Get the last 10 branches that you’ve committed to locally:
git branch --sort=-committerdate | head -n 10
Rename branch
# change locally
git branch --move <bad-branch-name> <corrected-branch-name>
# change remotely in repo
git push --set-upstream origin <corrected-branch-name>
# confirm change
git branch --all
HEAD
- Determines to which branch new commits are added
- Example
  - testing branch is created (not shown in above picture)
    - HEAD points at master branch
    - master branch and the new testing branch both point at commit, f30ab.
    - f30ab commit points to previous commit 34ac2
  - User executes checkout to testing branch (not shown in picture)
    - HEAD now points to testing branch
  - User commits 87ab2 (shown in pic)
    - 87ab2 is committed to the testing branch
    - testing branch is now ahead of the master branch by 1 commit
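The way HEAD decides where a commit lands can be checked directly. A minimal sketch in a throwaway repository; the commit hashes will differ from the ones in the example, and the file and variable names are assumptions.

```shell
set -eu
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo a > file.txt && git add file.txt && git commit -q -m "first"
git branch testing                                   # new branch, same commit as master
before="$(git symbolic-ref --short HEAD)"            # branch that HEAD points at
git checkout -q testing
after="$(git symbolic-ref --short HEAD)"
echo b >> file.txt && git commit -q -am "second"     # lands on testing only
ahead="$(git rev-list --count master..testing)"
echo "HEAD before: $before, after: $after, testing ahead of master by: $ahead"
```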
Merging
Notes
- NEVER merge your branch locally on your machine with the master branch, ALWAYS merge online via pull request
  - Steps
    - Push final changes and open a pull request
    - Switch to master branch locally and pull the merged changes
Update branch with work that’s been done in master branch
After updating your local branch, push to remote repo (no commit necessary)
# while in branch
git merge master
Fast-Forward
Example
Code (merging work in branch with the master branch for production)
# currently in test branch
git checkout master
git merge testing
Example
iss53 branch ahead of master by 2 commits (c3, c5) and behind 1 commit (c2)
Same code as Fast-Forward merge but git handles the merge a bit differently
git checkout master
git merge iss53
- C6 (bottom) is called a “merge commit.” It’s created by git and points to two commits instead of one.
- No need to merge with master (i.e. update local iss53 branch with C4 changes in master) before committing final changes
  - If there are changes in the same lines of code in C4 and C5, then there will be a conflict (See below, Conflicts >> Example)
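Both merge styles can be compared in one throwaway repository: the first merge fast-forwards, the second has to create a merge commit because both branches moved. A sketch; the commit labels only loosely follow the C-numbering above, and the file names are assumptions.

```shell
set -eu
repo="$(mktemp -d)"; cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"
echo base > app.txt && git add app.txt && git commit -q -m "C1"

# Fast-forward: master has not moved since testing branched off
git checkout -q -b testing
echo feature >> app.txt && git commit -q -am "C2 on testing"
git checkout -q master
git merge -q testing
ff_merges="$(git rev-list --merges --count HEAD)"     # merge commits so far

# Diverged: both branches gain a commit, so git creates a merge commit
git checkout -q -b iss53
echo fix > fix.txt && git add fix.txt && git commit -q -m "C3 on iss53"
git checkout -q master
echo hotfix > hotfix.txt && git add hotfix.txt && git commit -q -m "C4 on master"
git merge -q --no-edit iss53
# rev-list --parents prints the commit hash followed by its parent hashes
hashes="$(git rev-list --parents -n 1 HEAD | wc -w | tr -d ' ')"
echo "merge commits after fast-forward: $ff_merges"
echo "hashes on the merge commit line: $hashes"
```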
Conflicts
- Example
- Changed files in C4 (see above example) are in the same lines of the same files that you made changes to in C5
- Remember: you’re now in the master branch since you did checkout master as part of the merge code
- Steps
Check status to see which files are causing the conflict (e.g. index.html)
git status
Unmerged paths:
  (use "git add <file>..." to mark resolution)
    both modified:      index.html
Lines in file are marked
# <<<<<<< HEAD:index.html
# <div id="footer">contact : email.support@github.com</div>
# =======
# <div id="footer">
# please contact us at support@github.com
# </div>
# >>>>>>> iss53:index.html
Above ======= is the master branch version of the code and below is the iss53 branch version
Make necessary changes and save the file
git add .
# or
git add <resolved file>
- Tells git that conflict is resolved
Check status to confirm everything has been resolved
git status
On branch master
All conflicts fixed but you are still merging.
  (use "git commit" to conclude merge)
Changes to be committed:
    modified:   index.html
git commit
- No message required (there’s a default message) but you can add one if you want
Collaboration
- Add collaborators to your repository
- One person invites the others and provides them with read/write access (github docs)
- Steps
- Go to the settings for your repository
- Manage access >> “Invite a Collaborator”
- Search for each collaborator by full name, acct name, or email
- Click “Add <name> to <repo>”
- Each collaborator will need to accept the invitation
- Sent by email