I’ve been using Subversion for many years, and it has always served my needs well.
When I decided to switch to GitHub for hosting my ‘DarkGlass’ project, I didn’t foresee the difficulties that I would have with Git, and I could not have predicted that I would be drawn into another tool war! Ready yourself for a short story…
Why GitHub in the first place?
I first selected GitHub for hosting the DarkGlass project because the project is open source, intended to ultimately become a community-driven effort, and is even being written in public on a YouTube live stream. GitHub has a number of great community features, from message forums and issue tracking to demonstrating that you like a project by ‘starring’ it. This all makes GitHub a great place to collaborate online, to share ideas, to demonstrate support for a project and so on.
Git vs Subversion
Git is a version control system which was written to accommodate the development of the Linux kernel, and was intended to address what its authors saw as failings in older version control systems such as Subversion. The main difference between Git and Subversion is that Git is distributed, while Subversion is centralized. What does that mean exactly?
When using Subversion, you have a server which hosts the project that you’re working on. The general daily procedure for updating a project involves a ‘check-out’ operation, which copies the latest version of the code from the server to your local machine, then making updates to the code, and then a ‘commit’ of your updates back to the server.
Git requires no such server. The idea behind Git is that if you and another developer are working together, you each have a copy of the source code, you make changes to that copy, and then you can connect to the other developer’s copy and merge your code with theirs. This may seem like an odd way of doing things, and that’s because it is. It may make sense in certain scenarios, but the truth is that this is such a strange way to manage version control that the majority of developers using Git simply set up a server to host a central copy of the source code, ultimately defeating the purpose of it being distributed in the first place!
Now, I know there will be people out there for whom this mechanism makes sense. Different people work in different ways and have different processes. I’m not here to judge which is right or wrong, I’m simply trying to describe the differences between Git and Subversion as succinctly as I know how.
There are many other differences between the two systems, but what I’ve discovered is that there’s something of a war going on between their users. War is actually quite a dramatic term to use here, but we all know what I mean, right? Early IE vs Firefox, anyone? There are lots of people online giving heated opinions of both their preferred version control system and the other one. I guess this post is my own addition to the pile-on.
For some minor technical reasons, which I’ll discuss in a moment, my preference is for Subversion. Before I get to the actual technical reasons however, I’d like to give my analysis on why I think Git has become so popular, and it has nothing at all to do with it being ‘distributed’.
Why is Git so popular?
As I see it, Git has a number of advantages in the popularity race, none of which are really to do with the technology, but more to do with its users’ confidence and trends.
- It’s New!
A piece of software being newer has never really seemed to me like a sensible reason for its adoption. I prefer to think of older software as being tried and tested, or matured. Nonetheless, new technology always gets a buzz around it, and Git certainly takes advantage of that buzz. For example, an excerpt from the first page of the Git website reads “It outclasses SCM tools like Subversion, CVS, Perforce, and ClearCase”. This is a dubious claim. It seems that features such as local branching (*ahem* simply copying a directory!) bolster the claim that Git performs ‘faster’.
The truth behind this claim is that Git is no faster; it simply defers the moment at which you’ll take a performance hit. For example, with a centralized source control system, if you want a branch of the source code you ask the server to create one, and then you download it. This ensures that the server knows what version you’re dealing with, and that you have all the latest changes before you begin modifying your local copy. With Git, you can simply make a branch of your local copy on your local machine and begin modifying that. Of course it’s faster to do a local copy than to download a branch from a server, but ultimately in either case you’re going to have to merge your code with the work of others. At the moment the merge happens, it’ll be both faster and easier to do with Subversion. Why? Well, consider the bidirectional interaction required between two machines which each have their own repositories and branches, attempting to merge those branches. This complex interaction is computationally expensive, and it consumes just as much bandwidth as a branch download from a central server.
- It’s better!
Another boost to Git’s popularity is the number of claims about how it is in some way better than older systems. On one part of the Git website I found them referring to Subversion as ‘ancient’, just one example of how Git is promoted by knocking the competition. On deeper investigation, however, I found that this runs very deep in the Git community. There are claims about Subversion circulating that are flat out untrue, or that may once have had a grain of truth to them but have been exaggerated and no longer stand up. Here’s a great read regarding some of the myths and facts of Git vs Subversion: https://svnvsgit.com/
- Linus Torvalds
Git has the advantage of having been created by Linus Torvalds. If you don’t know who Linus is (have you been away some place?), here’s a wiki for you (https://en.wikipedia.org/wiki/Linus_Torvalds). Linus’s achievements speak for themselves and his name carries a lot of weight in the open source community. This does give Git a big advantage among developers that respect and trust him for his work.
If you were to ask me what the biggest advantage is for Git, it’s GitHub. There are online hosting solutions for other version control systems, and they even have collaboration tools, and generally support Git too now, but the simple fact is GitHub has done an amazing job of providing a developer community. They’ve marketed themselves well, they’ve priced themselves well, and their platform is both feature rich and well built. Hands down, this is a great thing for Git.
In my opinion, all of these advantages (except for GitHub perhaps) are relatively trivial. When you cut through the hype of Git being new, or in some way superior, you find that it’s just another tool with its own strengths and weaknesses. I still prefer Subversion, but why?
Why do I prefer Subversion?
Just as with any comparison of developer tools, each has its strengths and weaknesses. I’m not going to go down the ‘Git hype’ path of telling you how Subversion is better, smarter, faster. There are many great things about Subversion which make it stand out for me, but there is one in particular which Git just doesn’t seem to get right: dependency management.
Git’s dependency management fails to meet my needs for three quite distinct reasons, each of which on its own might not be the breaking point for me, but when combined they force me to reject Git for my project. I’m going to look at each of these failings and explain why Subversion works better for me in each case.
Reason 1: Git submodules are locked to a revision.
Suppose you have two projects, A and B. (I’m going to continue referring to them as A and B here)
A depends on B, so you tell Git to include B as a submodule within A. All appears well and good: when you clone project A along with its submodules, you’ll get a copy of B also. The problem is that if there have been any changes to project B since the submodule reference in A was last updated, you won’t get those. Now, this is considered a feature of Git, and with good reason. By locking the submodule to a particular revision you can be sure that A will continue to build even if B has breaking changes, because you won’t get those changes when you clone A.
This all seems to make perfect sense, so long as project B is a third party project. You don’t want some third party messing up your build with a breaking change. But what if you are the developer or organization authoring both projects A and B? You make a bug fix in project B and you want project A to take advantage of that bug fix. Well of course, you can alter project A to take advantage of the new revision of B, but what if B is a dependency of multiple other projects? Now you have the administrative task of going through all of your projects, determining which of them requires project B as a dependency, and updating them all to use the latest revision of B.
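To make that chore concrete, here is a minimal Python sketch that only builds (it does not run) the Git commands needed to move each dependent project to the latest revision of B. The project directories, the submodule path `libs/B` and the commit message are all hypothetical. It assumes `git submodule update --remote` is used to pull the submodule forward, after which the new revision still has to be committed in every parent project.

```python
import shlex


def submodule_bump_commands(projects, submodule_path="libs/B"):
    """Build the git commands that bump one submodule in every listed project.

    `projects` and `submodule_path` are hypothetical names used for illustration.
    """
    commands = []
    for project in projects:
        git = ["git", "-C", project]  # run git inside that project's directory
        # Move the submodule checkout to the tip of its tracked branch.
        commands.append(git + ["submodule", "update", "--remote", submodule_path])
        # Record the new submodule revision in the parent project.
        commands.append(git + ["add", submodule_path])
        commands.append(git + ["commit", "-m", "Update B to latest revision"])
    return commands


if __name__ == "__main__":
    for command in submodule_bump_commands(["projectA", "projectC"]):
        print(shlex.join(command))
```

Three commands per dependent project, and that is before each project is pushed and rebuilt.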
Subversion offers a solution for both conditions. In Subversion, project B is added to project A as an ‘external’. That means that B does not become a part of A, but rather, it is externally referenced using a URL. When you check out a copy of A you’ll get a copy of B. Any time you update A you’ll get the latest revision of B also. So what happens if B is a third party library, and you want to avoid the breaking change issue? Well, you can simply add a revision number to the URL for B. For example, svn://projectB@100 would lock the external to revision 100 of project B. On the other hand, if B is internal to your organization and you want to incorporate bug fixes from B, add B as an external without a revision number and you’re all set; you’ll get all the latest updates.
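As a rough illustration of the two cases, the Python sketch below formats a single line of an `svn:externals` property, with or without a pinned revision. The function name and the local directory `B` are hypothetical, and it assumes the `URL[@REV] LOCALDIR` layout used by Subversion 1.5 and later. The resulting text is what you would store with `svn propset svn:externals` on the parent directory.

```python
def external_definition(url, local_dir, revision=None):
    """Format one svn:externals line, optionally pinned to a revision.

    Assumes the Subversion 1.5+ layout: URL[@REV] LOCALDIR.
    """
    target = url if revision is None else f"{url}@{revision}"
    return f"{target} {local_dir}"


if __name__ == "__main__":
    # Third party library: pin to revision 100 to avoid breaking changes.
    print(external_definition("svn://projectB", "B", revision=100))
    # In-house library: no revision, so every update brings in the latest B.
    print(external_definition("svn://projectB", "B"))
```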
(As a side note for the Git fans now yelling at me about subtrees, I considered those too and they have similar issues.)
Reason 2: Git URLs are not nearly as flexible as Subversion URLs.
As I described above, in Subversion you can append a revision number to a dependency URL, but Subversion URLs have another option which is very useful, and utterly absent in Git. It’s possible to append a sub-directory path to a Subversion URL. For an example we’ll again consider projects A and B, where A depends on B. Imagine your Project B directory looks a little something like this…
The above screenshot is the directory structure of one of my own library projects. I have directories for dependencies to the project, as well as the project documentation, sample code, unit test projects etc. Project A depends on only one sub-directory of this project, the ‘src’ directory containing the source code. When I check out Project A, it has its own documentation, samples and unit tests; it does not need to check out those directories of project B. Well, again, we can simply append a path to our Subversion URL when including Project B as an external to Project A, like this: svn://projectB/src. What if I actually do want the dependencies from this project also, but not the other directories? Well then I can simply add the dependencies directory as another external, and so on.
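A small Python sketch of that idea: it builds a multi-line `svn:externals` value that pulls in only selected sub-directories of Project B. The sub-directory name `dependencies` and the local target paths are hypothetical, chosen only for illustration, and the `URL LOCALDIR` line layout is assumed as in Subversion 1.5 and later.

```python
def subdirectory_externals(base_url, subdirs):
    """Build an svn:externals value with one line per wanted sub-directory.

    `subdirs` maps a sub-directory of the remote project to the local
    directory it should be checked out into (both names are illustrative).
    """
    base = base_url.rstrip("/")
    lines = [f"{base}/{subdir} {local_dir}" for subdir, local_dir in subdirs.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    wanted = {"src": "B/src", "dependencies": "B/dependencies"}
    print(subdirectory_externals("svn://projectB", wanted))
```

Documentation, samples and unit tests of Project B are simply never listed, so they are never checked out.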
Now this may seem like quite a trivial feature, but consider this scenario…
Let’s say your application runs on Windows and has a suite of unit tests applied which are all passing nicely. Being the diligent developer that you are, and knowing that someday a customer is going to ask you for an OSX port of your application, you wrote all of your classes as implementations of platform agnostic interfaces. Great! Now in order to port your application to OSX you merely need to implement those same interfaces for the new target. You look at the work involved and decide to out-source this particular part of the development to a third party. Don’t worry, your interfaces will enforce a contract upon that third party, and your unit tests will ensure their code does what it should… but… well, third parties have a little less respect for your code than you do, and so when they can’t make a particular feature work, rather than consult you, they either alter an interface or they “Kobayashi Maru” and change the conditions of a unit test.
You of course task this developer with repairing what they did, on threat of non-payment, but what if you could have prevented this from happening in the first place?
With the Subversion externals mechanism you can put your interfaces and unit tests into an entirely different repository, one which your third party developer has read access to, but not write access. This means that they get essentially a read-only copy of the interfaces and tests, while still being able to submit work on the implementation. The temptation for them to behave like a Kirk is mitigated from the outset.
Reason 3: Ease of use.
Although there are UIs for using Git, not one of them has impressed me yet. They are typically quite limited in features, or platform specific, or have some other failing. The most comprehensive interface for Git is the command line. Looking at people posting questions online about how to achieve something in Git, the typical response from those that have learned every command and command-line switch is, “you really should take the time to learn how to use the git command-line.” This is often served with a condescending side-serving of “What have you got against learning, too lazy?” It’s all too familiar; it reminds me of my early years with Linux, before the community support for Linux really stepped up and created easily consumable distributions and strong well-written documentation.
Let me answer the question for you now, while also venting a little of my early frustrations with Linux as well as Git… I am a software developer both by trade and hobby, I could give you a long list of highly technical disciplines I have either mastered or taken part in, but I’ll abbreviate them to everything from writing an operating system kernel in assembler (albeit a small and stupid one), to writing neural network simulations. I’m not trying to boast here, I’m expressing that I’m not in any way afraid to learn, and in no way frightened of technical detail, however, I shouldn’t have to!
The highest possible praise I could give to any product, be it a physical product or a piece of software is, it does what it’s supposed to without being complicated to use. I especially don’t need or want additional technical difficulties arising from my choice of development tools. In a corporate setting, why should my version control system require weeks of training when there are options that require only a couple of hours of training?
If you wish to use Subversion from the command line, you can do so. It has quite a limited number of instructions and switches to learn, and therefore is actually really quite usable in command line form. However, it has some excellent UIs available. On Windows, I like to use TortoiseSVN, which provides additions to the context menu in the Windows file explorer. This way, it feels almost like a feature of the operating system and is very natural to use. For other platforms, such as Linux, there are alternatives, including RabbitVCS, which operates in a similar way to Tortoise, and RapidSVN, which provides a more “dedicated application” feel. Now it’s true that RapidSVN can’t be used on Windows, and TortoiseSVN can’t be used on Linux; however, my criticism of the lack of cross-platform options for Git comes down to the fact that when you do find a good client, it isn’t cross platform. With Subversion, I don’t really care that the UI clients aren’t cross platform because there are good feature rich / feature complete options on both platforms, and for Apple targets also.
(Yes Git fans, there is also TortoiseGit but it’s almost as confusing as the command line!)
Yes, this is a problem. While there are options for subversion, none of them is quite so well fitting for my project as GitHub. I don’t wish to remove the DarkGlass project from GitHub, and yet, I can’t serve the project well with Git. So what do I do?
Well, I recently spent a lot of time rebuilding my private server infrastructure to host projects on subversion. My plan is simple, I’m moving all of my code over to subversion, but then, I’ll do a regular export and merge that with GitHub. This way, I can keep my project hosted on the Hub, but return to the comfortable world of subversion for my daily activities. In fact, by adding the subversion hidden directory to the Git ignores list, there’s not really any reason to keep them in separate directories. I can commit the sources from a single directory to both subversion and git directly.
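For the shared-directory setup, a minimal Python sketch of the ignore step might look like the following. It appends `.svn/` to the working copy’s `.gitignore` only if it is not already listed. The helper name is hypothetical, and it assumes Subversion keeps its metadata in a `.svn` directory at the root of the working copy.

```python
from pathlib import Path


def ignore_svn_metadata(working_copy):
    """Ensure '.svn/' is listed in the .gitignore of the given directory."""
    gitignore = Path(working_copy) / ".gitignore"
    entries = gitignore.read_text().splitlines() if gitignore.exists() else []
    if ".svn/" not in entries:
        entries.append(".svn/")
        gitignore.write_text("\n".join(entries) + "\n")
    return gitignore


if __name__ == "__main__":
    print(ignore_svn_metadata(".").read_text())
```

The reverse step, telling Subversion to ignore Git’s `.git` directory, would presumably be handled through the `svn:ignore` property.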
I don’t recommend this as a common practice. There are some complications to consider. For example, suppose I use the Subversion interface to delete a file: Git will not know that the file is deleted and will continue to persist it in the GitHub repository. I’ll therefore have to first use a Git client to tell Git that the file should be deleted (while retaining the local copy), and then use the Subversion client to delete the file from the Subversion repository while deleting the local copy. Only then can I commit to both repositories without orphaning files. Nonetheless, this solution does solve the predicament that I find myself in, wanting to keep an existing project on GitHub while using Subversion as my version control system.
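The ordering matters, so here is a hedged Python sketch that lists the commands for deleting one file from both systems in the sequence described above. It only builds the command lines and does not execute them. The file name and commit message are hypothetical, and the final `git push` assumes the GitHub remote is already configured.

```python
import shlex


def dual_delete_commands(path, message):
    """List the commands to delete `path` from both Git and Subversion."""
    return [
        # 1. Stop tracking the file in Git but keep the local copy.
        ["git", "rm", "--cached", path],
        # 2. Schedule the deletion in Subversion; this removes the local copy.
        ["svn", "delete", path],
        # 3. Commit the deletion to both repositories.
        ["svn", "commit", "-m", message],
        ["git", "commit", "-m", message],
        ["git", "push"],
    ]


if __name__ == "__main__":
    for command in dual_delete_commands("old_unit.pas", "Remove obsolete unit"):
        print(shlex.join(command))
```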
If Git suits your way of working more closely than Subversion, great, use Git. I didn’t write this page to tell anyone that Subversion is better, or that Git is bad. I simply wanted to share my frustrations with Git, and my opinions of it, because it has been a very challenging tool to work with. If you could take all of the features of GitHub to provide an SvnHub, I would switch to it in a heartbeat, but of course, in true ‘Git’ style the domain name ‘svnhub.com’ is taken and contains a page disparaging the use of Subversion (it’s almost like the Git community is afraid of svn or something). For me however, the answer is clear and simple. The advantages of distributed source control are not what they’re hyped up to be, the negative messages around Subversion are either out of date or flat out false, and the flexibility of Subversion URLs and externals is superior to Git. I’ll be sticking with Subversion.
(For anyone interested in access to my open source code via Subversion, I’m still completing the migration. Keep an eye on my blog for an update.)