Sir Windows Churchill once said:
To each programmer there comes in their lifetime a special moment when they are figuratively tapped on the shoulder and offered the chance to improve their workflow and save themselves from future pains in the ass. What a tragedy if that moment finds them unprepared or unqualified for that which could have been their finest project.
That tap is usually the first time one hears of this mystical thing called version control.
This post is an introduction to Git, arguably the best and most popular version control system around, meant for anyone with little or no experience on the subject. It isn’t supposed to be a detailed tutorial, as there are plenty of those on the internet; this is a reference guide to (hopefully) help you understand the basic principles and begin more thorough research on your own.
What is version control?
Programmers and artists alike have suffered when creating backups of their projects.
What if there were a magical way to keep versions of the different states a project goes through, without making a mess of a directory?
Version control, defined as «the management of changes to documents, computer programs, large web sites, and other collections of information,» is the key to achieving this.
What is Git?
Plenty of options, free or proprietary, offer version control in one way or another; Mercurial, SVK, Subversion and Perforce are some of them. One of them has steadily grown in popularity, overshadowing the rest: Git.
Created by Linus Torvalds in 2005, Git is nowadays one of the most popular version control systems around, designed to be robust, fast, efficient and distributed.
| Term | Definition |
|---|---|
| Repository | “On-disk data structure which stores metadata for a set of files and/or directory structure”; at least that’s what Wikipedia has to say about that. For our purposes, a repository is a gittified (?) folder. |
| Commit | A set of changes applied to the previous state of the repository. Each commit is identified by a unique string: its SHA-1 hash value. |
| Branch | Projects tend to follow a non-linear development; like a tree, branches may spring up and follow their own course. In other words, a branch is a deviation from a previous set of commits. |
| Merge | Although development may be non-linear, the desired state of the project is usually associated with a particular branch. Merging is the action of taking two branches and applying the commits from one to the other. It can be either the most beautiful thing you’ll see or hell itself. |
| HEAD | Pointer to the commit you currently have checked out, normally the latest commit of the current branch. When it points directly at a commit instead of a branch, it’s called a detached HEAD. |
| Tag | A user-defined name for a specific commit of interest. It’s usually used to denote the version of the software at a specific time. |
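To make the “unique string” in the Commit entry a bit less mysterious, here is a minimal sketch in Python of how Git derives an object ID: it hashes a short header plus the content with SHA-1. The sketch covers the simplest kind of object, a blob (the contents of a file); commit IDs are computed the same way, only with a `commit` header and the commit’s metadata as content. The function name `git_blob_id` is my own and not part of Git.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 ID Git assigns to a file with this content."""
    # Git prepends "<type> <size in bytes>\0" to the content before hashing.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The ID depends only on the content, never on the file name.
    print(git_blob_id(b""))
    print(git_blob_id(b"first draft\n"))
```

For an empty file this yields `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`, the same ID that `git hash-object` reports for an empty file.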
Sadly, I haven’t used any version control system other than Git. In a way, that was great, as it meant I wasn’t biased by or accustomed to other philosophies; the toll, however, is that I have no prior experience that lets me judge whether Git is actually better. But, from what I’ve researched and learned from friends, Git and Mercurial are the most popular choices.
One of the things that sets Git apart from its competition is its design. Most version control systems keep track of the difference, or delta, between the states of an object; Git, on the other hand, saves each state as a snapshot. It does that intelligently, though: objects that haven’t changed from one snapshot to the next are just stored as a pointer to the previous version. This approach makes switching between states of a project fast and easy.
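The snapshot idea can be sketched as a toy content-addressed store in Python. This is only an illustration under simplifying assumptions: the class `SnapshotStore` and its methods are hypothetical names, and real Git adds headers, directory (tree) objects and compression on top of the same principle.

```python
import hashlib
from typing import Dict, List


class SnapshotStore:
    """Toy store where each distinct file content is kept exactly once."""

    def __init__(self) -> None:
        self.objects: Dict[str, bytes] = {}        # object ID -> content
        self.snapshots: List[Dict[str, str]] = []  # one {path: object ID} per commit

    def commit(self, files: Dict[str, bytes]) -> int:
        """Record the full state of `files` and return the snapshot number."""
        snapshot = {}
        for path, content in files.items():
            object_id = hashlib.sha1(content).hexdigest()
            # Unchanged content hashes to an existing ID, so nothing new is stored.
            self.objects.setdefault(object_id, content)
            snapshot[path] = object_id
        self.snapshots.append(snapshot)
        return len(self.snapshots) - 1

    def checkout(self, number: int) -> Dict[str, bytes]:
        """Rebuild any earlier state directly, without replaying deltas."""
        return {path: self.objects[oid] for path, oid in self.snapshots[number].items()}


if __name__ == "__main__":
    store = SnapshotStore()
    first = store.commit({"a.txt": b"one", "b.txt": b"same"})
    second = store.commit({"a.txt": b"two", "b.txt": b"same"})
    # Two snapshots of two files, but only three stored objects:
    # b.txt did not change, so both snapshots point at the same one.
    print(len(store.objects))
    print(store.checkout(first)["a.txt"])
```

Because every snapshot is a complete map from paths to contents, going back to an old state is a lookup rather than a chain of patches, which is where the speed comes from.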
Furthermore, Git is intelligent enough to figure out how to merge (i.e. join) different states of the same object, provided a good workflow has been followed. The process should be transparent to the user; sometimes it isn’t, but more on that later.
There’s a constant debate on the internet between Git and Mercurial, so it wouldn’t hurt to do some research on that, even if you decide to stick with Git. As a fun fact, Facebook moved from Git to Mercurial in 2014 because its code base had become too large and they found the latter scaled better than the former.
How to Git?
Git is an open-source project and can be used freely just by installing it on a computer; however, projects usually require two things: backups and teamwork. The project has to be available to all of its developers, which means either a private server or an online service.
Even though Git works through the command line, some client apps provide a user-friendly GUI. Atlassian, owners of Bitbucket, developed SourceTree, for example. Some purists may call them aberrations, but I consider them a great stepping stone for anyone beginning to work with Git (or any other version control system). The initial configuration of git (the tool) is tedious; installing GitHub for Windows, for example, spares you that chore, as the app does it for you. For anyone interested in doing it the ideal way, check Set up Multiple SSH Identities on Git by Jorge Palacios.
One of the first roadblocks when learning how to git is its steep learning curve. Part of that difficulty comes from the number of commands available, which might seem overwhelming; some commands are even multipurpose, which makes them extremely powerful but also confusing for newcomers. This characteristic is one of the many reasons some developers prefer Mercurial, as it’s more consistent with its commands.
I’ll be working on a more detailed look at the basic and more advanced commands in a later post, but let’s check some of the most common ones and see what they do:
| Command | What it does |
|---|---|
| init | Sets up the current folder for git |
| clone | Like init, but using a remote repository as the source |
| status | Reports any changes in the working folder that are pending to be committed |
| log | Reviews the history of commits |
| checkout | A multi-purpose tool, but it’s generally used to move HEAD to the specified branch or commit |
| add | Marks the selected pending changes as ready to be committed |
| commit | Records the changes marked with add as a new state of the repository. Git expects a commit message by default, so write a descriptive one |
| merge | Applies the commits in the selected branch to the current branch |
| push | Shares the latest state of the current branch with the remote repository |
| fetch | Downloads the latest state of the remote repository without touching your local branches |
| pull | Shorthand for first executing git fetch and then git merge |
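To see a few of these commands working together, here is a small sketch that drives the real `git` program from Python inside a throwaway folder. It assumes git is installed and on your PATH; the file name `notes.txt`, the commit message and the `Example` identity are placeholders of mine, and the identity is passed with `-c` only so the commit works on a machine that hasn’t been configured yet.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def git(repo: Path, *args: str) -> str:
    """Run one git command inside `repo` and return what it printed."""
    result = subprocess.run(
        ["git", *args], cwd=repo, check=True, capture_output=True, text=True
    )
    return result.stdout


def first_commit_demo() -> str:
    """Walk through init, add, commit and log; return the one-line history."""
    with tempfile.TemporaryDirectory() as folder:
        repo = Path(folder)
        git(repo, "init")                            # turn the folder into a repository
        (repo / "notes.txt").write_text("first draft\n")
        git(repo, "add", "notes.txt")                # mark the new file as ready
        git(
            repo,
            "-c", "user.name=Example",               # placeholder identity
            "-c", "user.email=example@example.com",
            "commit", "-m", "Add first draft of notes",
        )
        return git(repo, "log", "--oneline")         # short ID plus message


if __name__ == "__main__":
    if shutil.which("git") is None:
        print("git is not installed, nothing to show")
    else:
        print(first_commit_demo())
```

On the command line the same steps are simply `git init`, `git add notes.txt`, `git commit -m "..."` and `git log --oneline`, typed one after the other inside the project folder.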
Are version control systems perfect? Well, that depends on your definition of the word; they can be magical, but they sure aren’t a silver bullet. As mentioned earlier, merging can be hell if the workflow isn’t good. Their Achilles heel is the human factor: if two soon-to-be-merged commits change an object at the same spot, the system can’t figure out which change is the correct one, and it will expect human input to untangle the mess.
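The rule behind that limitation can be sketched in a few lines of Python. This is a deliberately simplified three-way merge that compares files line by line and assumes all three versions have the same number of lines; real Git works on chunks of changed lines instead, and the function name `three_way_merge` is hypothetical.

```python
from typing import List, Tuple


def three_way_merge(
    base: List[str], ours: List[str], theirs: List[str]
) -> Tuple[List[str], List[int]]:
    """Merge two edited versions of `base`; return (merged lines, conflicting line numbers)."""
    merged: List[str] = []
    conflicts: List[int] = []
    for number, (b, o, t) in enumerate(zip(base, ours, theirs), start=1):
        if o == t:
            merged.append(o)          # both sides agree (or nobody touched the line)
        elif o == b:
            merged.append(t)          # only their side changed it
        elif t == b:
            merged.append(o)          # only our side changed it
        else:
            conflicts.append(number)  # both changed the same spot differently
            merged.append(b)          # keep the original line until a human decides
    return merged, conflicts


if __name__ == "__main__":
    base = ["title", "intro", "body"]
    ours = ["Title", "intro", "body v2"]
    theirs = ["title", "Intro", "body v3"]
    # Lines 1 and 2 merge cleanly; line 3 is reported as a conflict.
    print(three_way_merge(base, ours, theirs))
```

The first three branches never need a human; only the last one does, and a good workflow is mostly about keeping that branch rare.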
Additionally, these systems work best on objects that can be easily merged (text files, for example). How can the machine coherently merge binary files like images or audio? Even for a human with the right tools, the process wouldn’t be straightforward.
Is Git perfect? Far from it, but it’s one of the best solutions around. It is a great tool that, once mastered, will become a great ally for any developer. And while at first it may seem scary and strange, it only needs some practice and the will to learn it.
I’m still deciding if the next post on this topic will be about the basic commands available. Plenty of material and tutorials exist on the internet, so I don’t want to reinvent the wheel; maybe I’ll focus on how to get day-to-day tasks done and how to deal with (and avoid) basic roadblocks.
If you have any stories on how you learned to Git, feel free to share them in the comments.