GIT DIY Exercise for Beginners

DEVESH CHANCHLANI, 3/24/2017

This exercise is for folks who are beginning to learn git, and maybe even transitioning from a traditional version control tool like svn or cvs. Before proceeding with this exercise, I would suggest a quick reading of the article Getting Started - About Version Control.


Pre-requisites :

1. Install git on your machine - https://git-scm.com/book/en/v2/Getting-Started-Installing-Git


2. Create an account on GitHub, if you do not have one.


3. Create a new repository on GitHub. [Quick link: https://help.github.com/articles/create-a-repo/ ]


4. Open Terminal or Command Prompt, switch to a folder of your choice and clone the repository locally, as follows:

  git clone <path to your repository on github>


You can either use “HTTPS” or “SSH” protocol to work with the GitHub repository.

An HTTPS path would look something like – https://github.com/deveshchanchlani/train4git.git,

while an SSH path would look something like – git@github.com:deveshchanchlani/train4git.git


The above command will create a new folder by the name of your repository, and place the repository contents in it.


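5. Now, initialize your repository. In my case it would be as below. (This exercise assumes the default branch is named master. Repositories created on GitHub since October 2020 default to main, so substitute main for master throughout if that applies to you.)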

    
  # 'train4git' is name of my new repository.
  cd train4git    

  # create a file to begin with. 
  touch README.md 

  # add the new file to be recognized by git.
  git add README.md 

  # provide a comment for your first commit.
  git commit -m "first commit"

  # push the new change-commit to remote GitHub repository. 
  git push -u origin master 



6. Finally, to get going with the exercise, run the following commands in the order shown:

  # create a local repository branch called 'canary'
  git branch canary

  # create a local repository branch called 'dev'
  git branch dev

  # push the local 'dev' branch to remote GitHub repo
  git push -u origin dev

  # push the local 'canary' branch to remote GitHub repo
  git push -u origin canary 
	


Now, browse to your repository page on GitHub. You would see a drop-down labeled "Branch: master". On clicking this drop-down, you would notice 3 branches - dev, canary and master. "master" is the default branch. The "dev" and "canary" branches are the ones we just created above for our exercise.
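If you would like to rehearse steps 4 to 6 without touching GitHub, the following is a minimal sketch that uses a local bare repository as a stand-in for the remote. The names remote.git and the "demo" identity are assumptions made purely for illustration. With a real GitHub repository you would clone its HTTPS or SSH path instead.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# Throwaway identity so commits work on a machine with no git config (assumption).
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in for the GitHub repository: a local bare repo (assumption).
git init -q --bare remote.git

# Step 4: clone it (git warns that the repository is empty, which is expected).
git clone -q remote.git train4git
cd train4git
git symbolic-ref HEAD refs/heads/master   # use the branch name this exercise expects

# Step 5: first commit and push.
touch README.md
git add README.md
git commit -q -m "first commit"
git push -q -u origin master

# Step 6: create the two exercise branches and publish them.
git branch canary
git branch dev
git push -q -u origin dev
git push -q -u origin canary

# All three branches should now exist on the stand-in remote.
git branch -r
```

The final 'git branch -r' should list origin/canary, origin/dev and origin/master, which is the offline equivalent of the branch drop-down on GitHub.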

Let's start rolling !


Knowing remote path of the repo
  git remote -v


This would show output similar to the following:

[If you have cloned using SSH]

  origin  git@github.com:deveshchanchlani/train4git.git (fetch)
  origin  git@github.com:deveshchanchlani/train4git.git (push)


[If you have cloned using HTTPS]

  origin  https://github.com/deveshchanchlani/train4git.git (fetch)
  origin  https://github.com/deveshchanchlani/train4git.git (push)


You see the URL there twice because Git allows you to have different push and fetch URLs for each remote in case you want to use different protocols or addresses for reads and writes.


Tracking branches

  $ git branch # lists all the "local" branches
    canary
    dev
  * master


Notice that the current working branch is prefixed by an asterisk (*).

  $ git branch -r # lists all the "remote" branches
    origin/canary
    origin/dev
    origin/master


You can see 3 remote branches - master, dev and canary. The branches dev and canary were created by us in the "pre-requisites" section.

  $ git branch -a # lists "all" the branches
    canary
    dev
  * master
    remotes/origin/canary
    remotes/origin/dev
    remotes/origin/master


The remote branches are prefixed with "remotes/". The branches are listed in alphabetical order, with local branches preceding the remote branches. This shows us that there are 3 local branches and 3 corresponding remote branches.

Tracking changes and commits

If you want git to ignore certain types of files (like *.log files), you list them in a file named .gitignore. This file is usually placed in the root directory of the repo. However, it can also be placed inside any directory in the repo, in which case its patterns apply relative to that directory.

Let us add the below contents to our .gitignore file. Do notice that the comment lines start with the character '#'.

  # ignore .jar files
  *.jar

  # but do track xyz.jar, even though you're ignoring   
  # .jar files above
  !xyz.jar

  # only ignore the TODO file in the current directory,
  # not subdir/TODO
  /TODO

  # ignore the build/ directory
  build/

  # ignore doc/*.txt files, but not doc/server/*.txt files
  doc/*.txt

  # ignore all .pdf files in the doc/ directory and any of its subdirectories
  doc/**/*.pdf


You can find more information on .gitignore at https://git-scm.com/docs/gitignore

A few .gitignore templates can be found at https://github.com/github/gitignore
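To see what these patterns actually do, here is a small sketch that writes the same rules into a scratch repository and asks git which files it still sees. The file names (app.jar, sub/TODO and so on) are made up for illustration. 'git ls-files --others' is used here as one convenient way to list untracked files.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo

# Same patterns as above, written without the display indentation.
cat > .gitignore <<'EOF'
*.jar
!xyz.jar
/TODO
build/
doc/*.txt
doc/**/*.pdf
EOF

# Hypothetical files, one or more per rule.
mkdir -p sub build doc/server
touch app.jar xyz.jar TODO sub/TODO build/out.o
touch doc/notes.txt doc/server/arch.txt doc/a.pdf doc/server/b.pdf

# Untracked files that git still sees (not ignored).
visible=$(git ls-files --others --exclude-standard | LC_ALL=C sort)
# Untracked files that match an ignore pattern.
ignored=$(git ls-files --others --ignored --exclude-standard | LC_ALL=C sort)

echo "visible:"; echo "$visible"
echo "ignored:"; echo "$ignored"
```

The "visible" list should contain xyz.jar (re-included by the '!' rule), sub/TODO (only the top-level TODO is ignored) and doc/server/arch.txt (doc/*.txt does not reach into subdirectories), along with .gitignore itself.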

Now if you fire -

  git status


It would show you the new file .gitignore under "Untracked files". This means the file is not yet tracked by git and will not be considered for committing. So, you are required to start tracking the file in order to commit it. This is done as:

  git add .gitignore


Now, if 'git status' command is fired again, it can be seen that the file .gitignore has been labeled as new file:, and is now being tracked by git.


Committing and pushing change-sets
  git commit -a -m "adding .gitignore"


The above command would commit the changes and create a change-set locally, that is, the change-set exists only on your machine. The change-set can be observed by typing

  git log


If you go to GitHub and check the commit history of the master branch, you would not find this change-set. To move this change-set to the remote (so that even your team can see it), you are now required to push it:

  git push origin master:master


The syntax for push command is:

  git push <remote> <local-branch-name>:<remote-branch-to-push-into>

We will talk more about “git push” going ahead.


Deleting a file and checking in

Before that, let us quickly add a few files and check them in, as follows:

  touch tempFile.txt file1.txt
  git add tempFile.txt file1.txt
  git commit -a -m "adding tempFile.txt and file1.txt"
  git push origin master:master


Assuming now you no longer need tempFile.txt, let us now begin to delete it from the git repo, as follows:

  rm tempFile.txt


Now, if you do 'git status' you would see "deleted: tempFile.txt" under the "Changes not staged for commit:" heading. This means that the change is in the “unstaged” bucket and will not be considered for committing. So, you are now required to bring it into the “staged” bucket before committing.

  git rm tempFile.txt


If you follow this with 'git status', you would see 'deleted: tempFile.txt' under the 'Changes to be committed:' heading. Now, just commit and push this change, as below:

  git commit -m "deleting tempFile.txt"
  git push origin master:master


Another way of making these changes would have been:

  rm tempFile.txt
  git commit -a -m "deleting tempFile.txt"
  git push origin master:master


In the above code-snippet the 'git rm' is missing. All that command did was move the change from the "unstaged" bucket to the "staged" bucket. Another way of doing this is to use the -a option of the 'git commit' command. This option makes sure that all the modified and deleted files are staged automatically. However, it does not work for new (untracked) files.
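The two buckets can be observed directly with 'git status --porcelain', which prints a two-character code per file: the first column is the staged state and the second is the unstaged state. The sketch below runs in a scratch repository. The "demo" identity and the file brandNew.txt are assumptions added for illustration.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q repo
cd repo

touch tempFile.txt file1.txt
git add tempFile.txt file1.txt
git commit -q -m "adding tempFile.txt and file1.txt"

# Plain rm: the deletion sits in the "unstaged" bucket.
rm tempFile.txt
unstaged=$(git status --porcelain)
echo "after rm:     [$unstaged]"      # [ D tempFile.txt]

# git rm moves the deletion into the "staged" bucket.
git rm -q tempFile.txt
staged=$(git status --porcelain)
echo "after git rm: [$staged]"        # [D  tempFile.txt]

# commit -a stages modified and deleted files, but not new ones.
touch brandNew.txt                    # hypothetical new file
git commit -q -a -m "deleting tempFile.txt"
leftover=$(git status --porcelain)
echo "after commit: [$leftover]"      # [?? brandNew.txt]
```

Note how brandNew.txt is still reported as untracked ("??") after the commit, which is the "does not work for new files" caveat in action.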

Playing with branches

Switching between branches

Doing a git branch will show that the current branch is master. Now fire

  git checkout canary


This will switch the current branch from master to canary. It can be confirmed by firing 'git branch' again. It will show canary being the current branch.

Next, on firing a 'git log', only a single change-set will be seen, with the title “first commit”. None of the other change-sets will be shown, since they were committed on the master branch! (Remember that each of those commits was made, and pushed, with master as the current branch.)

So, the visible change-sets also get switched when the current branch is switched from one to another. This is the expected behavior.

But, what happens to the "staged” and “unstaged” buckets? Answer: They remain intact !!

Try this out, keeping in mind the current branch is “canary”:

  touch unstagedFile.txt
  git checkout master
  git status


Notice that unstagedFile.txt is still shown in the status of the master branch, though it was created while on the canary branch. It will continue to show up across branches until it is committed to a particular branch.

For now, let us delete this file and move ahead, fire - 'rm unstagedFile.txt'
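The same observation can be reproduced in a scratch repository, as a quick sketch. The "demo" identity is an assumption for illustration, and an empty commit is used only so that there is something to branch from.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "first commit"
git branch canary

# Create an uncommitted file while canary is the current branch.
git checkout -q canary
touch unstagedFile.txt
on_canary=$(git status --porcelain)

# Switch branches: the uncommitted file travels along.
git checkout -q master
on_master=$(git status --porcelain)

echo "canary: $on_canary"    # ?? unstagedFile.txt
echo "master: $on_master"    # ?? unstagedFile.txt
```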


Creating new local branch

 # switch to 'canary' branch
 git checkout canary 

 # create new branch 'newFeatureCanary' identical 
 # to current branch 'canary'
 git branch newFeatureCanary

 #switch to the new local branch 
 git checkout newFeatureCanary


This would first switch to canary and then create a new local branch called newFeatureCanary, which has the same change-sets as canary. On firing 'git branch' right after the 'git branch newFeatureCanary' step, newFeatureCanary can be found listed, but the current branch would still be canary. That is why the last command switches to the newly created local branch.

Now, any commit made here will be exclusive to the newFeatureCanary branch, that is, it will not be seen in any other branch, including canary.

There's a shortcut, as below, to create a local branch and check it out instantly. Let's try it out on master.

 #switch to master branch
 git checkout master

 #create a new branch from master and check it out as well
 git checkout -b newFeatureMaster

Another very important thing to note here is that the branches newFeatureCanary and newFeatureMaster are local to your machine, and will not be visible to anyone else. These are different from remote branches. You can confirm this by checking the repository page on GitHub and observing the number of branches present.


Deleting a local branch
 #make sure the branch-to-delete 'newFeatureMaster' is not
 #the current branch; if it is, switch to some other branch
 git checkout master
  
 #delete the 'local' branch
 git branch -d newFeatureMaster


Listing commits

  #Getting change-sets unique to canary branch
  #and not in dev branch
  git log dev..canary

  #Getting change-sets unique to dev branch
  #and not in canary branch
  git log canary..dev

  #Getting unique change-sets in current branch wrt master
  git log master..

  #Getting unique change-sets in master branch wrt current branch
  git log ..master

  #Getting change-set log of any branch
  git log dev

  #Knowing incoming change-sets from 'canary remote'
  #to local current branch
  git fetch && git log ..remotes/origin/canary

  #Knowing outgoing change-sets to 'remote master'
  #from local current branch
  git fetch && git log remotes/origin/master..


We will discuss “git fetch” in detail a bit later.
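To get a feel for the two-dot range in 'git log', here is a small sketch with one extra commit on each of dev and canary. The commit messages "canary-1" and "dev-1" and the "demo" identity are made up for illustration.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
git commit -q --allow-empty -m "first commit"
git branch dev
git branch canary

# One change-set that exists only on canary ...
git checkout -q canary
git commit -q --allow-empty -m "canary-1"
# ... and one that exists only on dev.
git checkout -q dev
git commit -q --allow-empty -m "dev-1"

only_canary=$(git log --format=%s dev..canary)
only_dev=$(git log --format=%s canary..dev)
wrt_master=$(git log --format=%s master..)   # current branch is dev

echo "dev..canary : $only_canary"   # canary-1
echo "canary..dev : $only_dev"      # dev-1
echo "master..    : $wrt_master"    # dev-1
```

The shared "first commit" never shows up, because A..B lists only what is reachable from B and not from A.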

Updating from remote

There are 2 ways in which you could update from remote repo - "pull" and "fetch+merge".

Approach A. FETCH + MERGE

In GIT, there are 3 types of branches for each line of development:

  1. Remote branch
  2. Local branch
  3. Remote tracking branch

Let us consider canary as the line of work (or branch), and have a look at the below illustration of change-sets and where each branch points.


The branch remote/canary points to the current state of the remote repository. This branch is present in the remote and will reflect change-sets that others too would have pushed.

The remote-tracking-canary branch points to the change-set, which was at the tip when you last took an update from the remote. Hence, currently your local repo does not have any knowledge of change-sets 'D' and 'E'.

The branch local/canary points to the current tip of the local canary branch, and the change-sets 'F' and 'G' are your local change-sets.

So, git branch lists all your local branches and remote tracking branches.

Now, if you fire -

  git fetch


it would align the remote-tracking-canary and remote/canary pointers, thus bringing in the snapshots of change-sets 'D' and 'E' into the local repository copy. This means that the information regarding 'D' and 'E' is now there locally (in your local .git folder), but still it will not be reflected in the local branches. Now your local repo state would be something like below:



Your local branch canary has no knowledge of the change-sets 'D' and 'E', and it is in continuation from change-set 'C'. Now your local branch canary can be fast-forwarded to the current state of the remote repo. And, now if you do git checkout canary, this is what you would get to see:

 $ git checkout canary
 Switched to branch 'canary'
 Your branch is behind 'origin/canary' by 2 commits, 
 and can be fast-forwarded


Now to align your local branch canary with the remote canary, you would be required to do:

  #merge from remote-tracker-branch
  git merge remotes/origin/canary



This would bring your repo in the following state:


To summarize: You can do a git fetch at any time to update your local copy of a remote branch. This will update all the remote-tracking branches. This operation never changes any of your own local branches, and is safe to perform without changing your local working copy.
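The summary above can also be checked offline. The sketch below is an assumption-laden stand-in for the real setup: a local bare repository plays the role of GitHub, and two clones named alice and bob (hypothetical names) play the roles of a teammate and yourself. It shows that 'git fetch' moves only the remote-tracking branch, and that the merge is what moves your local branch.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Stand-in remote whose default branch is canary (assumption).
git init -q --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/canary

# alice publishes change-set C.
git clone -q remote.git alice
cd alice
git symbolic-ref HEAD refs/heads/canary
git commit -q --allow-empty -m "C"
git push -q -u origin canary
cd ..

# bob clones while the remote tip is C.
git clone -q remote.git bob

# alice then pushes D and E, which bob does not know about yet.
cd alice
git commit -q --allow-empty -m "D"
git commit -q --allow-empty -m "E"
git push -q origin canary
cd ../bob

before=$(git rev-parse canary)
git fetch -q
after_fetch=$(git rev-parse canary)                 # local branch untouched
incoming=$(git log --format=%s ..origin/canary)     # E and D are now known locally

git merge -q --ff-only origin/canary                # fast-forward local canary
tip=$(git log -1 --format=%s)

echo "incoming after fetch:"; echo "$incoming"
echo "tip after merge: $tip"                        # E
```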


Now to try Approach A, we may first create a change-set on canary branch through GitHub. (https://help.github.com/articles/creating-new-files/, ignore the last step to "create a pull request") Now, we need to update our local copy with the remote changes.

 # switch to canary
 git checkout canary

 # check change-sets unique to remote-tracker-canary
 # wrt canary branch
 git log ..remotes/origin/canary
 # should not show any change-sets

 # fetch remote changes into local repo copy
 git fetch

 # check again, change-sets unique to remote-tracker-canary
 # wrt canary
 git log ..remotes/origin/canary
 # will show the remotely added change-set
 # through GitHub in the log

 # merge changes from remote-tracker-canary into  current
 # i.e. canary branch
 git merge remotes/origin/canary


Approach B. PULL

Pull is just fetch and merge in one command: in the simplest terms, git pull does a git fetch followed by a git merge.

If you are on branch canary, and do git pull, it would fetch the remote-repo, update the local-repo pointers, but merge only the current branch and not others.

Let us consider the below illustration, as current repo state:



Now, if you were on branch canary, and do

  git pull


the new repo-state would be as below:


So, git pull actually aligned all the remote-tracking branches with their corresponding remote branches. Since the current local branch was canary, the local-canary pointer was also updated, but not the local-dev pointer.

To try it out, first create a change-set on each of the dev and canary branches on GitHub. Next fire -

 # check change-sets unique to remote-tracker-canary wrt canary
 git log canary..remotes/origin/canary
 # should not show any change-sets

 # check change-sets unique to remote-tracker-dev wrt dev branch
 git log dev..remotes/origin/dev
 # should not show any change-sets

 # pull remote changes into canary
 git checkout canary
 git pull

 # notice the new change-set present in canary
 git log canary

 # check again, change-sets unique to remote-tracker-canary
 # wrt canary
 git log canary..remotes/origin/canary
 # will not show any change-sets since canary
 # branch also got updated

 # check again, change-sets unique to remote-tracker-dev wrt dev
 git log dev..remotes/origin/dev
 # will show remotely added change-set to remote-dev
 # through GitHub, since local-dev branch does not get updated


Updating a local branch from desired remote branch

Let us create a new local branch called newCanaryFeature from canary branch, as follows:

  git checkout canary
  git checkout -b newCanaryFeature


Now in this case, the branch newCanaryFeature will not have a corresponding remote-tracker branch. Therefore, if you now try doing git pull, you would get an error message stating:

There is no tracking information for the current branch.

Please specify which branch you want to merge with.

There are 3 ways of updating the branch newCanaryFeature -

Approach A. (Pull + Merge)

  # 1. Update canary to latest
  git checkout canary
  git pull

  # 2. checkout new branch
  git checkout newCanaryFeature

  # 3. take merge from canary
  git merge canary



Approach B. (Fetch + Merge)

  # 1. Update local copy of repo, which means all remote-trackers
  git fetch

  # 2. checkout new branch
  git checkout newCanaryFeature

  # 3. take merge from canary remote-tracker
  git merge remotes/origin/canary



I always prefer the second approach to the first. The reason is that it lets me stick to a common mechanism for updating any local branch, irrespective of whether the branch has an associated remote-tracker. For the same reason, I prefer the fetch + merge approach to the pull approach.

Approach C. (Assign Remote-tracker)

  # 1. Assign remote-tracker to the local branch
  git branch --set-upstream-to=origin/canary newCanaryFeature

  # 2. pull recent changes from remote
  git pull



This approach simply adds a remote-tracker to newCanaryFeature, and then does a git pull on it. I never prefer this approach, since I like to have only one local branch corresponding to a remote-tracker.
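As a rough offline check of the pull error and of Approach C, the sketch below again uses a local bare repository in place of GitHub (an assumption, as is the "demo" identity). It records whether 'git pull' succeeds before and after a remote-tracker is assigned, assuming git's default branch configuration.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q --bare remote.git
git clone -q remote.git local
cd local
git symbolic-ref HEAD refs/heads/canary
git commit -q --allow-empty -m "first commit"
git push -q -u origin canary

# New local branch cut from local canary: no remote-tracker by default.
git checkout -q -b newCanaryFeature

if git pull -q 2>/dev/null; then pull_before=ok; else pull_before=failed; fi

# Approach C: assign origin/canary as the remote-tracker, then pull again.
git branch -q --set-upstream-to=origin/canary newCanaryFeature
upstream=$(git rev-parse --abbrev-ref 'newCanaryFeature@{upstream}')
if git pull -q 2>/dev/null; then pull_after=ok; else pull_after=failed; fi

echo "pull before: $pull_before"   # failed
echo "upstream   : $upstream"      # origin/canary
echo "pull after : $pull_after"    # ok
```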

Understanding "merges" between branches

Suppose you have two branches, "stable" and "new-idea", whose tips are at revisions E and F, respectively:


So the commits A, C and E are on "stable" and A, B, D and F are on "new-idea". If you then merge "new-idea" onto "stable" with the following commands:

  git checkout stable
  git merge new-idea


Then you have the following:



Observe that commit G is the merge-commit on branch "stable". If you carry on committing on "new-idea" and on "stable", you get:



So now A, B, C, D, E, F, G and H are on "stable", while A, B, D, F and I are on "new-idea".
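The membership of each branch can be verified with a small sketch that recreates the same history using empty commits named after the revisions. The "demo" identity is an assumption, and empty commits are used only to keep the example short.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/stable

commit() { git commit -q --allow-empty -m "$1"; }

commit A
git checkout -q -b new-idea
commit B; commit D; commit F
git checkout -q stable
commit C; commit E

# Merge new-idea onto stable; G is the merge-commit.
git merge -q -m G new-idea
merge_parents=$(git log -1 --format=%P | wc -w | tr -d ' ')

# Carry on committing on both branches.
commit H
git checkout -q new-idea
commit I

on_stable=$(git log --format=%s stable | LC_ALL=C sort | tr -d '\n')
on_new_idea=$(git log --format=%s new-idea | LC_ALL=C sort | tr -d '\n')

echo "parents of G: $merge_parents"   # 2
echo "stable      : $on_stable"       # ABCDEFGH
echo "new-idea    : $on_new_idea"     # ABDFI
```

The merge brought B, D and F into "stable", but nothing flowed the other way, so C, E, G and H remain absent from "new-idea".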


Creating Diff and Applying Patches

  # Getting diff between the tips of dev and canary
  # (the changes needed to go from dev to canary)
  git diff dev..canary > diff1.patch

  # Getting diff between the tips of canary and dev
  # (the changes needed to go from canary to dev)
  git diff canary..dev > diff2.patch

  # Getting diff from master to the tip of the current branch
  git diff master.. > diff3.patch
  # OR, to also include uncommitted changes in the working tree,
  git diff master > diff4.patch

  # Getting diff from the tip of the current branch to master
  git diff ..master > diff5.patch

  # Knowing incoming diff in canary after 'git fetch'
  git fetch && git diff ..remotes/origin/canary > diff6.patch

  # Knowing outgoing diff from current to remote-master branch
  git fetch && git diff remotes/origin/master.. > diff7.patch


The above commands redirect the output of the respective diff commands into *.patch files. To apply a patch -

  # Syntax to apply a patch => patch -p1 < {/path/to/patch/file}
  # To apply a patch on canary branch
  git checkout canary
  patch -p1 < diff.patch  

  # To reverse an applied patch (use the same -p level as when applying)
  patch -R -p1 < diff.patch


The patching example above makes use of patch command. If not already installed, you may need to install it.

The URL to install it on Windows - http://gnuwin32.sourceforge.net/packages/patch.htm

To install it on Ubuntu, fire - sudo apt-get install patch
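If installing patch is not convenient, git ships its own 'git apply' command, which can apply and reverse the same patch files. This is an alternative to the patch command used above, not what the exercise itself uses. The sketch below assumes a scratch repository with a hypothetical file notes.txt and a "demo" identity.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

printf 'line one\n' > notes.txt
git add notes.txt
git commit -q -m "base"
git branch dev

# canary gets one extra line.
git checkout -q -b canary
printf 'line two\n' >> notes.txt
git commit -q -a -m "canary change"

# Capture the dev -> canary difference as a patch file.
git diff dev..canary > "$work/diff1.patch"

# Apply it on dev's working tree, then reverse it again.
git checkout -q dev
git apply "$work/diff1.patch"
applied=$(cat notes.txt)
git apply -R "$work/diff1.patch"
reverted=$(cat notes.txt)

echo "applied:";  echo "$applied"
echo "reverted:"; echo "$reverted"
```

Like patch, 'git apply' only changes the working tree. You would still need to add and commit the result.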


GIT Repository Browser
  gitk


The above command brings up a graphical interface for browsing the git repository. More details on this can be found at - https://git-scm.com/docs/gitk
