First date with git-clone

Unraveling the cloning basics

Deepak Tunuguntla
4 min read · Apr 25, 2021

Working with a version control system such as Git is clearly beneficial, but it can also be bewildering in your early days as a Git user. Many of us get the hang of it after some practice, but another way to get a good grip on Git is to closely examine some of its core operations, such as git-clone. To begin with, let us

Create a remote repository
It is assumed that the reader has a GitHub account; if not, it is worth getting one given its perks. Once logged in, create a repository called my-repo with the settings of your choice. The result, my-repo, is remote storage provided by GitHub for storing, tracking and sharing your project, with the repository structure below

Figure 1: Schematised repository structure of my-repo, located at GitHub. This image is made by the author using diagrams.

Although one might expect your repository’s default version, main, to be referred to as the root, every line of development in a Git repository is a branch. This leads to the above repository structure and, more importantly, to calling my-repo’s default version the main branch. With my-repo at hand, the next logical step is to

Clone the remote repository
Git supports several URL protocols, among them ssh, http[s] and its own git protocol. For no special reason, we choose the https option and clone the remote my-repo via the command line as below

$ git clone https://github.com/<git_username>/my-repo.git
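What a clone leaves on disk can also be explored offline. The sketch below is not the GitHub workflow itself: it assumes a bare repository on the local disk (remote.git, a hypothetical stand-in for the GitHub remote), a throwaway seed repository, a demo commit identity and placeholder file contents. It clones that stand-in into my-repo and lists the result with and without hidden files.

```shell
set -eu
work="$(mktemp -d)" && cd "$work"

# Hypothetical stand-in for the GitHub remote: a bare repository on disk.
git init --quiet --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main

# Seed the stand-in with a README, a LICENSE and a .gitignore (placeholder contents).
git init --quiet seed
git -C seed symbolic-ref HEAD refs/heads/main
printf 'my-repo\n' > seed/README
printf 'placeholder licence text\n' > seed/LICENSE
printf '*.log\n' > seed/.gitignore
git -C seed add README LICENSE .gitignore
git -C seed -c user.name=demo -c user.email=demo@example.com commit --quiet -m "Initial commit"
git -C seed push --quiet ../remote.git main

# Clone it, as one would clone the https URL of my-repo.
git clone --quiet "$work/remote.git" my-repo

ls my-repo       # lists LICENSE and README only
ls -a my-repo    # also shows the hidden entries .git and .gitignore
```

Because the stand-in remote is a local path, no credentials are involved in this sketch.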

Cloning from GitHub over https will prompt you for your username and a password only if the repository is private; a public repository can be cloned without credentials. (GitHub has since stopped accepting account passwords for Git operations over https, so a personal access token takes the place of the password.) The operation itself is fairly self-explanatory: it creates a local copy of the my-repo located at GitHub in a folder on your local machine, which is also called my-repo. So, if your remote repository has a LICENSE and a README file, your local my-repo directory will contain copies of both. And, in case a .gitignore file was included while creating the remote my-repo, it is copied too, but it will not show up in a default directory listing because its leading dot makes it a hidden file. But, more importantly,

What does git-clone do under the hood?

When cloning a remote repository onto your local machine, git-clone performs a list of underlying operations, out of which we will focus on two.

Firstly, git-clone does not just make a local copy of the files that your remote repository contains but also creates a brand new local repository with the same repository structure as that of your remote repository, see below

Figure 2: Illustrates the git-clone operation that results in a new repository located on your machine. This image is made by the author using diagrams.
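That the clone is a repository in its own right, and not just a folder of files, can be checked with a small experiment. The sketch below again assumes a local bare repository (remote.git, a hypothetical stand-in for the GitHub remote) and a demo commit identity. It commits to the local main branch and then compares the number of commits on each side.

```shell
set -eu
work="$(mktemp -d)" && cd "$work"

# Hypothetical stand-in for the remote my-repo, seeded with one commit.
git init --quiet --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main
git init --quiet seed
git -C seed symbolic-ref HEAD refs/heads/main
printf 'my-repo\n' > seed/README
git -C seed add README
git -C seed -c user.name=demo -c user.email=demo@example.com commit --quiet -m "Initial commit"
git -C seed push --quiet ../remote.git main

git clone --quiet "$work/remote.git" my-repo
cd my-repo

# The clone carries its own .git directory and history.
git rev-parse --is-inside-work-tree     # prints "true"
git log --oneline                       # the commit that came from the remote

# Commit on the local main branch; nothing is sent to the remote.
printf 'edited locally\n' >> README
git -c user.name=demo -c user.email=demo@example.com commit --quiet -am "Local edit"

local_count="$(git rev-list --count main)"
remote_count="$(git --git-dir="$work/remote.git" rev-list --count main)"
echo "local commits: $local_count, remote commits: $remote_count"
```

The local main branch ends up one commit ahead, and the remote main branch stays as it was until the change is pushed.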

Simply put, the clone operation results in a new main branch, called the local main branch, that contains a copy of all the files present in your remote repository. Both the remote and the local repositories are Git repositories in their own right, which implies that you could use them as two independent repositories. In fact, on our local machine we use the local main branch for local file edits and then send (git-push) these local changes to the remote main branch. But, when performing the git-push operation, how does Git know that these local changes should be sent to the remote main branch? This is where git-clone does a tad more than just creating a local repository, see the illustration below

Figure 3: Illustrates origin — a default remote connection created when a clone operation is performed. This image is made by the author using diagrams.

Besides creating a local repository, the second key aspect is that git-clone also automatically creates an alias, called origin, for the remote repository’s URL. In Git terminology, origin here represents a remote connection, which Atlassian’s Git tutorials define as below

remote connections are more like bookmarks rather than direct links into other repositories. Instead of providing real-time access to another repository, they serve as convenient names that can be used to reference a not-so-convenient URL.

Hence, origin here is a convenient default reference to our remote my-repo’s long URL, https://github.com/<git_username>/my-repo.git. It is this under-the-hood reference, or remote connection, that enables a git-pull or git-push operation to keep both the local and the remote repositories in sync. You can confirm this using the git-remote command below, which shows the not-so-convenient URLs that origin refers to when a fetch or push operation is to be performed.

$ git remote -v
origin https://github.com/<git_username>/my-repo.git (fetch)
origin https://github.com/<git_username>/my-repo.git (push)
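The same information can be read back piece by piece. The sketch below assumes a local bare repository (remote.git, a hypothetical stand-in for the GitHub remote) and a demo commit identity. It prints the URL stored under the name origin and the branch that the local main was set to track by the clone, which is the link git-push and git-pull rely on.

```shell
set -eu
work="$(mktemp -d)" && cd "$work"

# Hypothetical stand-in for the remote my-repo, seeded with one commit.
git init --quiet --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main
git init --quiet seed
git -C seed symbolic-ref HEAD refs/heads/main
printf 'my-repo\n' > seed/README
git -C seed add README
git -C seed -c user.name=demo -c user.email=demo@example.com commit --quiet -m "Initial commit"
git -C seed push --quiet ../remote.git main

git clone --quiet "$work/remote.git" my-repo
cd my-repo

# The only remote connection after a clone is the default one.
remotes="$(git remote)"
echo "$remotes"                          # prints "origin"

# origin is a name for a URL, stored in the repository's own configuration.
origin_url="$(git config --get remote.origin.url)"
echo "origin -> $origin_url"

# The clone also linked the local main branch to the remote one.
upstream="$(git rev-parse --abbrev-ref 'main@{upstream}')"
echo "main tracks $upstream"             # prints "main tracks origin/main"
```

With a real clone from GitHub, the URL printed for origin would be the https address used in the git clone command.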

Relevant Git jargon
Now that we are familiar with the term origin, some terms a Git user often comes across are

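  • main: the default branch of the repository (the name GitHub gives it in new repositories), for example, see Figure 2.
  • origin main: the main branch of the repository that the remote connection origin points to. For example, in Figure 3, we could refer to the remote main branch as origin main. This is also the form used on the command line, as in git push origin main.
  • origin/main: a remote-tracking branch, that is, a reference in your local repository that records where origin main was the last time you communicated with origin. It is commonly used in several Git operations.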
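The difference between main and origin/main becomes visible once the two stop pointing at the same commit. The sketch below assumes a local bare repository (remote.git, a hypothetical stand-in for the GitHub remote) and a demo commit identity. It counts how many commits main has that origin/main lacks, before and after a push.

```shell
set -eu
work="$(mktemp -d)" && cd "$work"

# Hypothetical stand-in for the remote my-repo, seeded with one commit.
git init --quiet --bare remote.git
git --git-dir=remote.git symbolic-ref HEAD refs/heads/main
git init --quiet seed
git -C seed symbolic-ref HEAD refs/heads/main
printf 'my-repo\n' > seed/README
git -C seed add README
git -C seed -c user.name=demo -c user.email=demo@example.com commit --quiet -m "Initial commit"
git -C seed push --quiet ../remote.git main

git clone --quiet "$work/remote.git" my-repo
cd my-repo

# Lists the local main plus the remote-tracking branch origin/main.
git branch --all

# A local commit moves main but leaves origin/main where it was.
printf 'edited locally\n' >> README
git -c user.name=demo -c user.email=demo@example.com commit --quiet -am "Local edit"
ahead_before="$(git rev-list --count origin/main..main)"
echo "before push: main is ahead of origin/main by $ahead_before"

# "origin main" on the command line: the remote name, then the branch name.
git push --quiet origin main
ahead_after="$(git rev-list --count origin/main..main)"
echo "after push: main is ahead of origin/main by $ahead_after"
```

After the push, Git also updates origin/main in the local repository, so the two references agree again.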

Hopefully, the above insights were helpful and worth a read. In the follow-up article, we will shed some light on the other key aspects of git-clone. Till then, enjoy Gitting! 🎊 🎉
