1 03-VersionControl



1.1 Introduction: three problems

1.1.1 1. How to keep past versions of your stuff?

For example, in a day’s work you may produce:

~> ls
draft.py
final.py
final_real.py
final_real_real.py
actually_done.py
actually_done_v1.py
actually_done_v2.py
actually_done_v2.1.py
actually_done_v2.1-2019-12-10.py
...

Ok, I guess I should use Git…

Sub-problem:
You had a working version of your code at the beginning of the day,
but at the end of the day’s work, it’s broken.

03-VersionControl/vc-terrible.jpg

1.1.2 2. How to collaborate by making copies of a document or code, and then re-integrating those changes?

For example:
How to write code among thousands of people when everyone wants to work at once.
How to draft or rewrite a document (e.g., a constitutional amendment) with lots of people working on it at once.

1.1.3 3. How to back up your code?

In case of fire:
git commit, git push, leave the building!

1.2 Version control, the git that keeps on giving

https://en.wikipedia.org/wiki/Version_control
Like an MS Word document’s track changes, but better, with many more features, and for source code!
Comes in many flavors.

https://en.wikipedia.org/wiki/Distributed_version_control
A form of version control in which the complete code-base, including its full history, is mirrored on every developer’s computer.
This enables automatic management of branching and merging, speeds up most operations (except pushing and pulling), improves the ability to work offline, and does not rely on a single location for backups.
In general, distributed systems are more robust and favorable for end-users, when compared to centralized systems.

Version control is the Git…
Git is one byte short of a four-letter word.

https://en.wikipedia.org/wiki/Git
Git is a distributed VCS written by the original author of the Linux kernel,
https://en.wikipedia.org/wiki/Linus_Torvalds
Torvalds sarcastically quipped about the name Git (which means unpleasant person in British English slang): “I’m an egotistical bastard, and I name all my projects after myself. First ‘Linux’, now ‘git’.” …
The man page describes Git as “the stupid content tracker”.
The read-me file of the source code elaborates further:
random three-letter combination that is pronounceable, and not actually used by any common UNIX command.
Git: stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
“global information tracker”
“goddamn idiotic truckload of…”

1.3 Git

https://git-scm.com/
https://en.wikipedia.org/wiki/Git
03-VersionControl/git.png
Git is software. It exists locally on your machine and on other developers’ machines.
GitHub, GitLab, and Bitbucket are websites (servers) that interface with end users’ Git software.
They run their own Git-compatible server software,
which hosts Git repositories and talks to local Git processes.
Quite ironically, unlike GitHub (now a Microsoft product),
GitLab’s server-side software is actually:
https://en.wikipedia.org/wiki/Open_source
so anyone can host their own GitLab website/server.
GitLab itself also has a rich, positive development community.
MST IT hosts two installations of the GitLab server-side software,
on two different servers residing on campus (cool!!):
https://git.mst.edu
(permanent code, like lab code or personal projects)
https://git-classes.mst.edu
(class code, which gets deleted every now and then)

03-VersionControl/gitcomic.png
This is actually a good approach for now, if you break your “repo” but still have your code…
Later, you will want to learn branching and conflict handling better.

1.3.1 Demo 1

#!/bin/bash

# Make a repository.
# Show the gitlab view of it.

git clone <REPO_URL>
cd <REPO_NAME>   # the directory that clone just created
vim README.md
vim hello_world.py
# write, save, quit
git add .
git commit -m "my first repo!"
git push -u origin master

# show web interface

# edit a file locally
git push #?
git pull #?

# edit something in web, then, what happens?

git pull

1.3.2 Demo 2

Check out some real repositories:

https://github.com/explore
For example:
https://github.com/nasa
https://github.com/LLNL

https://gitlab.com/explore
For example:
https://gitlab.com/cryptsetup/cryptsetup
https://gitlab.com/inkscape/inkscape

1.3.3 Extra background

Reading about version control and Git. Read these roughly in order.

https://www.atlassian.com/git/tutorials/what-is-version-control
https://www.atlassian.com/git/tutorials/source-code-management
https://www.atlassian.com/git/tutorials/what-is-git

https://git-scm.com/
https://git-scm.com/videos
https://git-scm.com/docs/gittutorial
https://git-scm.com/book/en/v2
(read at least chapters 1 and 2)

https://docs.gitlab.com/ce/gitlab-basics/README.html
https://docs.gitlab.com/ce/gitlab-basics/start-using-git.html

https://marklodato.github.io/visual-git-guide/index-en.html
https://learnxinyminutes.com/docs/git/
http://think-like-a-git.net/
https://www.dangitgit.com/
https://learngitbranching.js.org/
http://git.rocks
tools-for-computer-scientists.pdf Appendix E, Chapter 1
03-VersionControl/03-version_control.pdf (my old slides)

Cheat sheets:
https://rogerdudler.github.io/git-guide/
https://github.com/hbons/git-cheat-sheet/raw/master/git-cheat-sheet.pdf
http://wall-skills.com/wp-content/uploads/2013/12/git-Cheat-Sheet_Wall-Skills1.pdf
https://rogerdudler.github.io/git-guide/files/git_cheat_sheet.pdf
https://www.atlassian.com/git/tutorials/atlassian-git-cheatsheet
https://about.gitlab.com/images/press/git-cheat-sheet.pdf

1.3.4 Tracking changes

1.3.4.1 Git version control?

Keeps track of changes to your code.
You don’t have to worry about accidentally losing or deleting code.
You can experiment with changes to your code, and then reset to a known good state.
Makes collaborating with others easier.

1.3.4.2 How does Git work?

Distributed
Everything is kept on your local machine and on your collaborators’ local machines, not primarily or necessarily in the cloud.

Repository
A collection of code and its history; a.k.a. “repo”.

Commit
A chunk of saved changes, like a snapshot in time, similar to a VM snapshot, but only for a particular folder (a git repo).

1.3.5 Distributed

Distributed version control
03-VersionControl/distributed.png

1.3.6 Snapshots

Snapshots (commits) include all files
03-VersionControl/snapshots.png

1.3.7 Storage landscape

Three places where edits exist
03-VersionControl/areas.png

1.4 Gitting Started…

Pre-use configuration: these settings are only metadata recorded in your commits, not login credentials.
git config --global user.name "<YOUR NAME>"
git config --global user.email <EMAIL>
git config --global core.editor vim
or your choice of text editor
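To see where this metadata ends up, here is a minimal sketch that sets a name and email for one throwaway repository (no --global, so your real settings are untouched) and then reads them back from a commit. The name, address, and file name are made up for illustration.

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q

# Without --global, these settings apply to this repository only.
git config user.name "Ada Example"       # hypothetical name
git config user.email "ada@example.com"  # hypothetical address

echo "hello" > README.md
git add README.md
git commit -q -m "first commit"

# The configured identity is recorded as the author of the commit.
author=$(git log -1 --format='%an <%ae>')
echo "$author"
```

The last line should print the name and email you configured, which is all these settings are used for.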

1.4.1 Basic local use:

git init Makes a new empty git repository out of your current working directory and its sub-directories.
git add <FILENAME> Adds FILENAME, or the changes to FILENAME, to the next commit. The argument can also be a directory, like . (everything under the current directory), or a shell wildcard, like *.py
git commit -m "some message" Takes a snapshot (commit) of any staged (added) changes.
Note: don’t skip the -m “message” or you may end up stuck in vim; if so, just hit ‘i’, type something, hit ‘esc’, then type ‘:wq!’

THESE SHOULD BE YOUR CONSTANT GO-TO:
git status Shows the status of the repository.
git diff Shows the changes in your working directory that have not been staged (added) yet
git diff fileofinterest.py
git diff commithash
git log --all --graph Shows a nice history
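The commands above can be tried end to end in a scratch directory. This is just a sketch: the temporary directory, the demo identity, and the file contents are assumptions for illustration, not part of the course demos.

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"         # hypothetical identity, this repo only
git config user.email "demo@example.com"

echo 'print("hello")' > hello_world.py
git status --short                       # "?? hello_world.py": untracked
git add hello_world.py
git status --short                       # "A  hello_world.py": staged
git commit -q -m "add hello_world.py"
git status --short                       # prints nothing: working tree is clean

echo 'print("bye")' >> hello_world.py
git diff                                 # shows the edit that is not staged yet
git log --all --graph --oneline          # one commit so far
```

Running git status between every step is the habit worth copying here; the output changes after each command.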

1.5 commit

echo hey >>README.md
git add README.md
git commit -m "a message"
echo hey >>README.md
git commit -am "b message"
echo hey >>README.md
git commit -am "c message"
git log -p --all --graph
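The -a in git commit -am stages every modified file that Git already tracks before committing; brand-new files still need a git add. A small sketch in a scratch repository (the identity and the file name notes.txt are made up):

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"         # hypothetical identity, this repo only
git config user.email "demo@example.com"

echo hey >> README.md
git add README.md
git commit -q -m "a message"

echo hey >> README.md                    # edit a file Git already tracks
echo scratch > notes.txt                 # new file, never added
git commit -q -am "b message"            # -a picks up README.md only

git status --short                       # "?? notes.txt": still untracked
git log --oneline                        # two commits
```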

++++++++++++++++++++++++++++
Cahoot-02c.1

1.6 branch

May the forks be with you!

# Create a branch, then switch to it:
git branch new-branch
git checkout new-branch
# (Shortcut that does both in one step: git checkout -b new-branch)
git log -p --all --graph
echo hey >>README.md
git commit -am "d message"
git log -p --all --graph
git checkout master
git log -p --all --graph
git checkout -b another
git log -p --all --graph
echo hey >>README.md
git commit -am "e message"
git log -p --all --graph

1.7 diff for branches

git diff branch1..branch2

1.8 merge

Incorporates changes from the named commits (since the time their histories diverged from the current branch) into the current branch.
git merge new-branch
git log -p --all --graph
git checkout master
git merge another
git log -p --all --graph
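Here is a self-contained sketch of two branches being merged without conflicts, because each one touches a different file. The file names and identity are invented, and the main branch name is read from Git rather than assumed, since it may be master or main depending on your setup.

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"         # hypothetical identity, this repo only
git config user.email "demo@example.com"

echo base > README.md
git add README.md
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)    # "master" or "main", depending on setup

git checkout -q -b new-branch
echo one > one.txt
git add one.txt
git commit -q -m "add one.txt on new-branch"

git checkout -q "$main"
git checkout -q -b another
echo two > two.txt
git add two.txt
git commit -q -m "add two.txt on another"

git checkout -q "$main"
git merge -q new-branch                  # fast-forward: main had no new commits
git merge -q -m "merge another" another  # real merge: the histories diverged
git log --all --graph --oneline
ls                                       # README.md one.txt two.txt
```

The first merge only moves the branch pointer forward; the second one has to create a merge commit, which shows up as a fork and join in the graph.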

1.8.1 Merge conflicts (oh Fork! …)

CONFLICT (content): Merge conflict in the-file.txt
Automatic merge failed; fix conflicts and then commit the result.

In the-file.txt:

<<<<<<< HEAD 
The current branch's contents 
=======
Stuff from the branch you're merging 
>>>>>>> new-branch 

Edit the-file.txt to keep the content you want and delete the three marker lines, then:
git add the-file.txt
git commit -m "message"
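To practice on purpose, the sketch below forces a conflict by changing the same line on two branches, then resolves it. The identity and the "agreed-upon" text are made up; in real life you would edit the file by hand instead of overwriting it with echo.

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"         # hypothetical identity, this repo only
git config user.email "demo@example.com"

echo "original line" > the-file.txt
git add the-file.txt
git commit -q -m "base"
main=$(git symbolic-ref --short HEAD)    # "master" or "main", depending on setup

git checkout -q -b new-branch
echo "Stuff from the branch you're merging" > the-file.txt
git commit -q -am "change on new-branch"

git checkout -q "$main"
echo "The current branch's contents" > the-file.txt
git commit -q -am "change on $main"

# Both branches changed the same line, so the merge stops with a conflict.
git merge new-branch || echo "merge stopped, as expected"
cat the-file.txt                         # shows the conflict markers

# Resolve: write the content you want (no markers), then add and commit.
echo "The agreed-upon contents" > the-file.txt
git add the-file.txt
git commit -q -m "merge new-branch, resolving the conflict"
```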

++++++++++++++++++++++++++++
Cahoot-02c.2

1.9 Exploration

1.9.1 Looking at stuff

git status shows summary data

git log Show a log of commits
--graph Neat ASCII graph
--all Shows all branches
-p Show what changed in each commit

git show firstfourofhashofcommit

git diff Show un-added, un-committed changes for all files
git diff firstfourofhashofcommit
git diff --cached shows diff with added but not committed changes
git diff branch1..branch2
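The difference between git diff and git diff --cached is easiest to see when one edit is staged and another is not. A minimal sketch, with an invented file name and identity:

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"         # hypothetical identity, this repo only
git config user.email "demo@example.com"

echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "snapshot"

echo "line 2 (staged)" >> notes.txt
git add notes.txt                        # line 2 is now in the staging area
echo "line 3 (not staged)" >> notes.txt

git diff                # working directory vs. staging area: only line 3
git diff --cached       # staging area vs. last commit: only line 2
git diff HEAD           # working directory vs. last commit: lines 2 and 3
```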

1.10 Git happens

Now, how to clean up a mess?

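1.10.1 Revert a single file to its latest committed version

git checkout file.py
Discards edits to file.py that have not been added, restoring the last added or committed version.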

1.10.2 reverting changes

git help revert
git revert <COMMIT> makes a new commit that undoes the changes introduced by <COMMIT>.

1.10.2.1 Undoing stuff since a commit

To discard all local changes in the current directory that have not been added to the staging area, while leaving untracked files/folders alone, type:
git checkout .

To un-stage changes that were added but not yet committed (the edits themselves stay in your working directory):
git reset .
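Both undo commands are safe to try in a scratch repository first. In this sketch (file contents and identity are made up), the first edit is thrown away completely, while the second is only un-staged:

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"         # hypothetical identity, this repo only
git config user.email "demo@example.com"

echo "good" > file.py
git add file.py
git commit -q -m "known good state"

echo "oops" >> file.py                   # edit that was never added
git checkout .                           # discard it
cat file.py                              # back to the committed content

echo "maybe" >> file.py
git add file.py                          # staged edit
git reset -q .                           # un-stage it; the edit is kept on disk
git status --short                       # " M file.py": modified, not staged
```

Note that git checkout . really deletes your edits, so run git status and git diff first if you are not sure.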

1.11 Remote repositories

1.11.1 Working with remotes

git clone <REPO_URL>
Makes a copy of a repository.

git push
Pushes changes from your current branch to the remote branch it tracks.
(You may need to run git config --global push.default simple.)

For example, to push your local commits to the master branch of the origin remote:
git push origin master

git pull
Pulls changes from the remote branch and merges them into your current branch.

git remote -v
To view your remote repositories.

git remote add <REMOTE_NAME> <REPO_URL>
Adds a remote to an existing repository.

For projects you work on:
A ‘git pull’ a day, keeps the conflicts away.

If there may be remote changes,
then commit before pull!

git commit -am "always commit before pull"
git pull
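You can rehearse clone, push, and pull without any server: a bare repository on your own disk can play the role of the remote. This sketch assumes nothing beyond Git itself; the directory names alice and bob, the identities, and the file contents are invented.

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

work=$(mktemp -d)                        # throwaway directory for the demo
cd "$work"
git init -q --bare origin.git            # stands in for the server-side repo

git clone -q origin.git alice            # warns that the repo is empty
cd alice
git config user.name "Alice Example"     # hypothetical identity
git config user.email "alice@example.com"
echo hey > README.md
git add README.md
git commit -q -m "first commit"
branch=$(git symbolic-ref --short HEAD)  # "master" or "main", depending on setup
git push -q -u origin "$branch"          # -u remembers which remote branch to track
cd ..

git clone -q origin.git bob
cd bob
git config user.name "Bob Example"       # hypothetical identity
git config user.email "bob@example.com"
echo "hey from bob" >> README.md
git commit -q -am "bob's change"
git push -q                              # goes to the branch the clone tracks
cd ../alice

git remote -v                            # origin points at the bare repository
git pull -q                              # fetches and merges bob's commit
cat README.md
```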

++++++++++++++++++++++++++++
Cahoot-02c.3

1.12 Working with others

1.12.1 Collaboration

You and your co-workers are working on a project simultaneously.
You clone the company’s repository:
git clone https://git.company.com/project.git
git checkout -b dougs-branch
to create your own development branch.
Modify files, git add <FILENAME> to stage them, and
git commit when they are in a working state.
Ready to merge with mainline?
git checkout master and
git merge dougs-branch
Your work is now merged with your local master branch (but not on the company’s repo).
Question: which branch is HEAD now pointing to?
Meanwhile, your co-workers might have made changes!
First, git pull to fetch and merge their changes.
Rectify merge conflicts (if any),
test the code, then
git add <FILENAME> to stage, and
then git commit when in a working state.
Only after pulling and merging the most recent changes should you
git push
Your work is merged with that of your co-workers, and now resides on the company repo.
Take a break.
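The whole workflow above can be sketched on one machine, with a local bare repository standing in for the company's repo and a second clone standing in for a co-worker. Everything named here (project.git, coworker, doug, the files, the identities) is hypothetical. The pull is written with --no-rebase and --no-edit so that it merges without stopping to ask how or opening an editor.

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

work=$(mktemp -d)                        # throwaway directory for the demo
cd "$work"
git init -q --bare project.git           # stands in for the company repository

# A co-worker seeds the shared repository with one commit.
git clone -q project.git coworker
cd coworker
git config user.name "Co Worker"
git config user.email "coworker@example.com"
echo "v1" > app.txt
git add app.txt
git commit -q -m "initial version"
main=$(git symbolic-ref --short HEAD)    # "master" or "main", depending on setup
git push -q -u origin "$main"
cd ..

# You clone, work on your own branch, and merge it locally.
git clone -q project.git doug
cd doug
git config user.name "Doug Example"
git config user.email "doug@example.com"
git checkout -q -b dougs-branch
echo "doug's feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git checkout -q "$main"
git merge -q dougs-branch                # merged locally only, not on the remote

# Meanwhile, the co-worker pushes another change.
cd ../coworker
echo "v2" > app.txt
git commit -q -am "update app"
git push -q

# Pull (fetch + merge) their work first, then push the combined result.
cd ../doug
git pull -q --no-rebase --no-edit
git push -q
git log --graph --oneline
```

After the final push, the shared repository holds both the co-worker's update and your feature, joined by a merge commit.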

1.12.2 Blaming your collaborators

When you need a scapegoat for that critical mistake in your code-base…
git help blame
git blame <FILENAME> shows, for each line of the file, which commit and author last changed it.
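A small sketch of git blame in action: two made-up authors edit the same file, and blame attributes each line to whoever touched it last. The second author is set with the --author option of git commit purely for the demo.

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Alice Example"     # hypothetical identity, this repo only
git config user.email "alice@example.com"

printf 'line one\nline two\n' > code.txt
git add code.txt
git commit -q -m "add code.txt"

# A second (hypothetical) author changes only the second line.
printf 'line one\nline 2, edited\n' > code.txt
git commit -q -a -m "edit line two" --author="Bob Example <bob@example.com>"

git blame code.txt                       # line 1: Alice, line 2: Bob
```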

1.12.3 Commit early, commit often!

A tip for version control, not for relationships…

++++++++++++++++++++++++++++
Cahoot-02c.4

1.13 Final Git Tips

Unlike GCC/G++, Git actually gives good error messages!
If something went wrong, it often tells you exactly what to do.
Actually read Git’s error messages!!!!
Make your commit messages descriptive.
Only git commit when the code works.
Don’t add generated files (like a.out) to your repo.
You can ignore certain files by putting their names in a .gitignore file in your repository.
When collaborating, work on separate branches and merge as you go along.
git help COMMAND will show you documentation.
git COMMAND --help usually will too.
man git COMMAND often does too.
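As a quick illustration of the .gitignore tip, this sketch fakes a compiler output file and shows that git add . skips it once its name is listed in .gitignore. The file contents and identity are invented; a.out here is just a text file pretending to be a binary.

```shell
#!/bin/bash
set -e                                   # stop at the first failing command

repo=$(mktemp -d)                        # throwaway directory for the demo
cd "$repo"
git init -q
git config user.name "Demo User"         # hypothetical identity, this repo only
git config user.email "demo@example.com"

echo 'int main(void) { return 0; }' > main.c
echo "pretend binary" > a.out            # stand-in for generated output
echo "a.out" > .gitignore                # one ignored name (or pattern) per line

git add .                                # adds everything that is not ignored
git status --short                       # lists .gitignore and main.c only
git commit -q -m "add source and .gitignore"
git ls-files                             # a.out is not tracked
```

Committing the .gitignore file itself means your collaborators get the same ignore rules.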

1.14 Time to Git-er-done: Continuous testing and integration

https://en.wikipedia.org/wiki/Continuous_testing
https://en.wikipedia.org/wiki/Continuous_integration
https://en.wikipedia.org/wiki/Deployment_environment

Continuous testing was originally proposed as a way of reducing waiting time for feedback to developers,
by introducing development environment-triggered tests as well as more traditional developer/tester-triggered tests.
Continuous testing is the process of executing automated tests as part of the software delivery pipeline,
to obtain immediate feedback on the business risks associated with a software release candidate.
For Continuous testing, the scope of testing extends from validating bottom-up requirements or user stories,
to assessing the system requirements associated with overarching business goals.

03-VersionControl/continuousintegrationcycle.png
03-VersionControl/continuous-integration.png

Your unit tests can be built into the GitLab CI framework!
Check out how we do it:
https://about.gitlab.com/ci-cd/
https://docs.gitlab.com/ee/ci/
https://about.gitlab.com/product/continuous-integration/
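At its core, a CI job is a script that runs on every push, and a non-zero exit status is what marks the job as failed. The sketch below is not GitLab's configuration format, just a hypothetical test script of the kind such a job could run; the add function and the check helper are invented for illustration.

```shell
#!/bin/bash
# Hypothetical function under test.
add() { echo $(( $1 + $2 )); }

failures=0

# check <description> <expected> <actual>
check() {
    if [ "$2" = "$3" ]; then
        echo "PASS: $1"
    else
        echo "FAIL: $1 (expected $2, got $3)"
        failures=$((failures + 1))
    fi
}

check "2 + 3" 5 "$(add 2 3)"
check "-1 + 1" 0 "$(add -1 1)"

echo "$failures test(s) failed"
# As the last command, this makes the script's exit status non-zero
# whenever at least one check failed.
[ "$failures" -eq 0 ]
```

Wiring a script like this into a real pipeline is covered by the GitLab CI links above.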

Remember, in learning to code, and trying new projects:

Fork it until you make it!