Git Magic - From Chaos to Ordered Collaboration

Why Git?

Once the why is clear, the how is easy

This is easy. We want to be able to take snapshots of our code at critical junctures of the development cycle (read: so we can undo the mess we'll eventually create later on). We'd also like to collaborate on large-scale projects without the hassle of sending each other files. This is where git saves us! Git is:

  • Free
  • Open Source
  • Super fast
  • Scalable
  • Cheap to branch and merge

Using Git

We can use git via

  • Command line
  • Code editors and IDEs
  • GUI Clients

We tend to gravitate to the command line eventually, though, since it gives us a much greater degree of flexibility and we might be working in environments where we won't have access to GUI tools. So why not get comfortable with it from the get-go?

Configuring Git

The first time we have git on our systems, we probably want to customize our git environment to work with our existing tools.

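There are 3 levels of configuration under our control. When the same setting appears at more than one level, the more specific level wins (local over global, global over system):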

  1. System level: This applies to all users of the current system
  2. Global level: This applies to all repositories of the current user
  3. Local level: This applies to the current repository only

The most basic of these is setting our:

  • Name: git config --<level> user.name "<USER NAME>"
  • Email: git config --<level> user.email "<USER EMAIL>"
  • Default editor: git config --<level> core.editor "<PATH TO EDITOR>"
  • Line endings*: git config --<level> core.autocrlf true (on Windows; use input instead of true on Linux and macOS)

More details can be found here.
We can view all our settings and where they are coming from using:
git config --list --show-origin
*This is important since it's a point of difference between Windows and Linux-based machines!
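
As a minimal sketch of the local level, the commands below set a name and email in a throwaway repository and read them back. The temporary directory, "Demo User" and demo@example.com are placeholders chosen for illustration, not values from this guide.

```shell
# Work in a throwaway repository so real settings are left alone.
demo="$(mktemp -d)"
git init -q "$demo"

# Local level: stored in $demo/.git/config, applies to this repository only.
git -C "$demo" config --local user.name "Demo User"
git -C "$demo" config --local user.email "demo@example.com"

# Read one value back, then list this repository's local settings
# together with the file each one comes from.
git -C "$demo" config --get user.name
git -C "$demo" config --local --list --show-origin
```

Swapping --local for --global would write the same keys to the current user's configuration instead, so it is worth trying on a scratch repository first.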

Initializing Git Repository

To start tracking our code changes, we need to "initialize" a git repository which basically creates a directory called .git which stores all metadata about the code repository. We do this via
git init

We should be able to see the .git directory in the current directory on running ls -a.
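
A quick way to try this without touching a real project is to initialize a repository in a temporary directory (the directory here is just a scratch location picked for the example):

```shell
# Create an empty scratch directory and turn it into a git repository.
demo="$(mktemp -d)"
cd "$demo"
git init -q

# The listing now includes the hidden .git directory.
ls -a
```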

Git Workflow

Now that we've initialized the git repository, when we make any changes to the current directory, git will compare against the previous "state" and tell us the files that have changed since. To see this we can run:
git status

Adding Files

To add changed files, we can name them one by one or use wildcard patterns,
e.g. git add <file1> <file2> *.txt
This puts them in the staging area. To finalize these changes into a snapshot we need to "commit" them. This can be done by
git commit -m "<COMMIT MESSAGE>" or just git commit
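
The status, add and commit steps can be sketched end to end in a scratch repository. The file names and the demo identity below are made up for the example; the identity is set locally only so the commit works on a machine with no global configuration.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "first draft" > notes.txt
echo "scratch" > todo.md
git status --short        # both files are listed as untracked (??)

git add notes.txt         # stage one file only
git status --short        # notes.txt is staged (A), todo.md is still untracked

git commit -q -m "Add notes"
git log --oneline         # one commit, containing only notes.txt
```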

Now, some rules of thumb when committing.

  • Size matters in a commit. We don't want a commit to have too many changes. Nor should it be so small that it doesn't merit its own commit.
  • A commit should represent one logical change only. We shouldn't cram multiple unrelated changes into the same commit.
  • Commit messages should be meaningful and clearly describe the changes made.

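The add and commit steps above can be combined into a single line. Just be careful: -a stages every modified or deleted file that git already tracks, so all of those changes go into the snapshot, while brand-new (untracked) files are not included and still need git add: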
git commit -am "<COMMIT MESSAGE>"
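
To check exactly which files the shortcut picks up, here is a small experiment in a scratch repository (file names and identity are placeholders):

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "version one" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "version two, edited" > tracked.txt  # modify a file git already tracks
echo "brand new" > untracked.txt          # create a file git has never seen

git commit -q -am "Update tracked.txt"
git status --short                        # untracked.txt is still listed as ??
```

The edit to tracked.txt lands in the commit, while untracked.txt is left out until it is added explicitly.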

Removing Files

We can track removal using the following sequence:

  1. rm <file>
  2. git add <file>
  3. git commit -m "<COMMIT MESSAGE>"

OR

  1. git rm <file>
  2. git commit -m "<COMMIT MESSAGE>"

Note: Renaming can be achieved with git mv <filename old> <filename new> followed by a commit.
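
Both operations can be tried safely in a scratch repository. This is only a sketch; the file names and identity are invented for the example.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "keep me" > old.txt
echo "delete me" > temp.txt
git add old.txt temp.txt
git commit -q -m "Add two files"

# Remove a file and stage the removal in one step.
git rm -q temp.txt
git commit -q -m "Remove temp.txt"

# Rename a file and stage the rename in one step.
git mv old.txt new.txt
git commit -q -m "Rename old.txt to new.txt"

git ls-files                              # only new.txt is tracked now
```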

Ignoring certain files/directories in commit

We may have some intermediate files that we don't want in our repository. For this, git provides us a special file called .gitignore. The files/directories listed in this file will not be tracked or included in commits. We can append entries to .gitignore using commands like echo or an editor like nano. Or better yet, simply open this file in our favorite IDE and make the changes. Double check which files are in the staging area using git ls-files.

But what if the files are already tracked? Then even if we add them to the .gitignore file, changes to those files will continue to be tracked. To remove these files from the staging area (while keeping them on disk), we'll have to painstakingly work through the following:
git rm --cached -r <directory> <file>
Files that have been staged can be checked using git diff --staged
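
Here is a sketch of that cleanup, assuming a build/ directory that was committed by mistake (the directory, file names and identity are hypothetical):

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

mkdir build
echo "binary" > build/out.o
echo "code" > main.c
git add .
git commit -q -m "Initial commit (build/ included by mistake)"

echo "build/" > .gitignore                # ignore the directory from now on
git rm -r -q --cached build               # drop it from the index, keep it on disk
git add .gitignore

git diff --staged --name-status           # review what is staged before committing
git commit -q -m "Stop tracking build/"
git ls-files                              # build/out.o is no longer listed
```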

History

One of the most powerful features of git is the ability to see changes as they were added to the repository. We can examine the commits made so far using the git log command. Further, we can view the exact changes made in a commit using git show HEAD~<n>, where "n" is the number of commits before the current HEAD.
To list the files/directories that make up the snapshot at that commit, we use git ls-tree HEAD~<n>. To list only the files that changed in that commit, we can use git show --name-only HEAD~<n>.
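
The history commands are easier to follow with two small commits to look at. The sketch below builds them in a scratch repository (file names and identity are placeholders):

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "Add a.txt"

echo "two" > b.txt
git add b.txt
git commit -q -m "Add b.txt"

git log --oneline                 # newest commit first
git show --stat HEAD~1            # summary of the commit one step before HEAD
git show --name-only HEAD         # commit details plus the files it changed
git ls-tree --name-only HEAD~1    # contents of the snapshot at HEAD~1
```

At HEAD~1 only a.txt exists, so git ls-tree shows the whole snapshot at that point rather than a list of changes.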

Unstaging files

Sometimes, you may be too enthusiastic and just add files that don't have to be committed in the current commit. In this case, you can unstage the file from the staging area using:
git restore --staged <file>
On its own, git restore <file> discards changes in the working directory. With --staged, it only pulls the file back out of the staging area, i.e. the contents of the file itself don't change, so we can safely unstage without losing work.
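
A short check that unstaging leaves the edit itself alone, run in a scratch repository (file name and identity are invented for the example):

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "version one" > app.txt
git add app.txt
git commit -q -m "Add app.txt"

echo "version two, edited" > app.txt
git add app.txt
git diff --staged --name-only     # app.txt is staged

git restore --staged app.txt
git diff --staged --name-only     # prints nothing: the staging area is empty again
cat app.txt                       # the edit is still in the working directory
```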

Remove Untracked Files

We might want to remove any untracked files and directories from our working directory. For this, we use:
git clean -fd
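Here -f forces the deletion (by default git refuses to clean without it) and -d also removes untracked directories. It is essential to understand that using git clean in this manner permanently deletes untracked files and the operation cannot be undone. Running git clean -nd first shows what would be removed without deleting anything.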
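
Since the deletion is permanent, a cautious pattern is a dry run followed by the real thing. The sketch below uses a scratch repository with made-up file names:

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "tracked" > keep.txt
git add keep.txt
git commit -q -m "Add keep.txt"

echo "junk" > scratch.log                 # untracked file
mkdir tmp
echo "junk" > tmp/cache.bin               # untracked directory

git clean -nd                             # dry run: lists what would be removed
git clean -fd                             # actually delete untracked files and directories
ls                                        # only keep.txt remains
```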

Revert to last commit

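Okay. We made a mistake in the last commit. To restore a file to what it was in the commit before that, we use git restore --source=HEAD~1 <file(s)>. If we only want to throw away uncommitted changes and go back to the last commit, git restore <file(s)> is enough.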
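
A sketch of recovering a file from the commit before a bad one, using a scratch repository (file name, contents and identity are placeholders):

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "good version" > app.txt
git add app.txt
git commit -q -m "Add app.txt"

echo "bad" > app.txt
git commit -q -am "Break app.txt"

# Bring back app.txt as it was one commit before HEAD.
git restore --source=HEAD~1 app.txt
cat app.txt                       # the good contents are back
git status --short                # app.txt shows as modified, ready to be committed
```

Note that this changes only the working directory; the fix still has to be added and committed to become part of the history.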

Branching

A crucial feature of git is the ability to branch off from a common code base and make changes on that branch. This allows multiple people to work outwards from a stable code version. To create a new branch, we can use:
git checkout -b <BRANCH NAME>
If a branch already exists, we can simply use git checkout <BRANCH NAME> to switch to the existing branch.
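
Creating a branch and switching back can be sketched as follows. The branch name feature is arbitrary, and the starting branch name is read from git rather than assumed, since it may be master or main depending on configuration.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "stable" > app.txt
git add app.txt
git commit -q -m "Initial commit"

base="$(git symbolic-ref --short HEAD)"   # name of the branch we start on

git checkout -q -b feature                # create the branch and switch to it
git symbolic-ref --short HEAD             # prints: feature

git checkout -q "$base"                   # switch back to the existing branch
git branch                                # lists both branches, * marks the current one
```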

Merging branches

Now we've added the changes to two branches say branch1 and branch2 and currently we're on branch2. How do we bring changes made on branch1 to branch2? This is extremely simple with git (with a few caveats of course!). We can use:
git merge branch1
branch1 itself is unaffected by the merge; only branch2 gains the new changes.
Note: Merging can lead to merge conflicts (two branches have different changes on the same code). We'll have to manually fix those affected files.
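
The scenario above can be reproduced in a scratch repository. This sketch keeps the two branches on different files so the merge has no conflicts; file names and identity are placeholders.

```shell
demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "base" > shared.txt
git add shared.txt
git commit -q -m "Initial commit"
base="$(git symbolic-ref --short HEAD)"   # starting branch, whatever it is called

git checkout -q -b branch1
echo "from branch1" > one.txt
git add one.txt
git commit -q -m "Add one.txt on branch1"

git checkout -q -b branch2 "$base"        # branch2 also starts from the base
echo "from branch2" > two.txt
git add two.txt
git commit -q -m "Add two.txt on branch2"

# We are on branch2; bring in the work from branch1.
git merge -q --no-edit branch1
ls                                        # one.txt, shared.txt and two.txt
git log --oneline --graph                 # shows the two lines of work joining
```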

Adding a Remote Repository

Since git is decentralised (the files live on your PC!), we need a way for multiple collaborators to work together. This is where the remote comes in. We can move code to a remote repository, from where multiple folks can start working on their own changes. Before we can push our code there, we need to register the remote in our local repository:
git remote add origin <LINK TO REMOTE REPOSITORY>
Now we can go ahead and push our changes onto the remote repository:
git push -u origin <BRANCH NAME>
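
To practice without a hosting service, a bare repository on the local disk can stand in for the remote. This is an assumption made purely for the sketch; in practice the link would usually be an HTTPS or SSH URL.

```shell
# A bare repository in a temporary directory plays the role of the remote.
remote="$(mktemp -d)"
git init -q --bare "$remote"

demo="$(mktemp -d)"
cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity
git config user.email "demo@example.com"

echo "hello" > README.txt
git add README.txt
git commit -q -m "Initial commit"
branch="$(git symbolic-ref --short HEAD)" # current branch name

git remote add origin "$remote"
git push -q -u origin "$branch"           # -u remembers origin as the upstream

git remote -v                             # shows the fetch and push locations
git ls-remote --heads origin              # the branch now exists on the remote
```

After the first push with -u, a plain git push from this branch goes to the same place.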

Conclusion

Now this is not the end. It is not even the beginning of the end. But it is perhaps the end of the beginning.

Winston Churchill

Thanks for sticking around to the end! There is a lot more to git. We've only scratched the surface and seen the power it lends to developers. To learn more, use this guide.