Introduction – The Rise Of Git (And GitHub)
Git overtook Subversion in popularity around 2010. It has a slew of features that make it a better choice for open source, community-backed software projects. GitHub grew in popularity because it offered an easy-to-use interface for experienced Git users who wanted convenient remote source-code storage. It also enticed novice programmers to get acquainted with Git. It was a positive feedback loop. Eventually, GitHub became the Instagram for developers and Git became a relatively low-skill prerequisite.
Note that it’s definitely possible to be skilled in using Git. Here, I’m calling it low-skill because developers simply need to follow these steps:
- Initialize a repository before or after writing code with git init. Optionally, the branch may be renamed to
- Stage changes with git add <path>
- Commit the changes with git commit -m "<commit message>"
- Set up a remote with git remote add <remote name> <remote SSH/HTTPS path> (once per remote)
- Push the changes to the remote repository with git push -u <remote name> <master or main>
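To see the steps above end to end, here is a minimal sketch that can be run in a scratch directory. It rests on a few assumptions of mine: a local bare repository stands in for the real SSH/HTTPS remote, the remote is named origin, the branch is named main, and the file name, commit message, and identity are placeholders.

```shell
#!/bin/sh
set -eu

# Scratch area: everything lives in a throwaway directory.
workdir=$(mktemp -d)
remote_path="$workdir/remote.git"   # assumption: local stand-in for an SSH/HTTPS URL
project="$workdir/project"

# A bare repository plays the role of the remote host.
git init --quiet --bare "$remote_path"

# Step 1: initialize a repository.
mkdir "$project"
cd "$project"
git init --quiet

# Placeholder identity, set per repository so the sketch also runs
# on a machine without a global Git configuration.
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Step 2: stage changes.
echo "hello" > README.txt
git add README.txt

# Step 3: commit the changes.
git commit --quiet -m "Add README"

# Optional: force-rename the current branch (assumption: "main").
git branch -M main

# Step 4: set up a remote (once per remote).
git remote add origin "$remote_path"

# Step 5: push and set the upstream branch.
git push --quiet -u origin main

# Show what the remote now holds.
git --git-dir="$remote_path" log --oneline main
```

With a real host, only the value of remote_path changes; the rest of the sequence stays the same.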
Git is obviously a thousandfold more nuanced than that. These, however, are the steps that every developer must take when they want to publish their code on GitHub. In my humble opinion, it’s not that complicated. It practically becomes muscle memory after pushing a few dozen commits.
Why self-host a Git server?
If GitHub is so popular, why am I looking for an alternative? There are a few reasons, and I’d like to discuss them at length.
Monopoly isn’t good
Before GitHub, there were several project-hosting sites (many of them are still functional), like Google Code Project Hosting (2006-2016) and SourceForge (active as of 2020). When GitHub arrived, people flocked to it for its updated UI and pleasant UX. For a long while there were no good competitors to GitHub, until Bitbucket and GitLab arrived.
On GitLab and Bitbucket
Both of them are good options and they share a similar feature set. Sometimes one of them ships a feature that its competitors lack, but gradually these features become the industry standard.
An example of this is the fact that GitHub did not allow free private repositories until after GitLab had already supported this feature for a long time. It is therefore imperative to have competitors in the market. Otherwise, there is no incentive for the companies to actually try to make better products.
A recent development is the release of GitHub Copilot. It is a collaboration between GitHub and OpenAI, who used millions of lines of publicly available source code to train their specialised generative model. Unfortunately, it seems the licenses under which the original source code repositories were released were not considered when the training data was assembled. Not all publicly available code is open source. And not all open source software can be replicated without citations or references. This has the potential to erode the power of FOSS licenses, and it reinforces the dynamic that small, individual developers are powerless to protect their IP from the greed of massive corporations. Yikes!
What to do now?
We are spoiled by having access to remote locations for our source code backups. Our dependence on CI runners is also increasing. This is not a bad thing, but these services are ultimately under the direct control of GitHub and its parent company Microsoft. There is a silver lining to this rather difficult situation, though.
It is now easier than ever to learn about various technologies from online resources. These resources can be free or paid, but they are readily available to everyone, regardless of their background or geographical location. The outcome of this is the plethora of FOSS, self-hosted alternatives available for mainstream services.
The World of Self-Hosting
There are quite a few options to choose from. I tried self-hosting GitLab Community Edition, but it was just too much of a resource hog. Gitea is a lightweight alternative. It is written in Go and practically sips resources. To make up for the missing CI service, I self-hosted a Drone CI server.
You should try other options if you have the time. But if you’re in a hurry, I recommend a combination of these two services; it will serve you well as long as everything is set up correctly.
I recommend getting free credits from a VPS provider and performing a few experiments to try out the various options. For example, this (referral) link will give you $100 to use if you sign up for a Vultr account. There are definitely other options available, like Linode, Hostinger, and DigitalOcean. Choose any of them.
I will create a series of posts outlining the steps that I took to install these services and the various difficulties that I faced. One thing that I had to do for both VPSes was set them up and harden them properly. This article provides an excellent tutorial on how to do that. After that, I essentially followed the documentation and deviated slightly in a few instances.
It took me some time to understand all the underlying technologies and the relevant configurations, but I pulled through and installed Gitea and Drone CI properly. Now I can confidently push my sensitive code to a private repository and have it tested by a CI server. As a bonus, Microsoft and other tech giants cannot (easily) get their hands on my IP unless I am fine with it.
Self-hosting is a bit taxing but this is the next best thing to a course on system administration; you get plenty of hands-on experience and a working service in the end. I highly recommend this moderately difficult exercise!