Twice now, I have tried to host moderately sized public projects on GitHub, and twice I’ve decided to move those projects away. I understand that this will come as a frustration to the few that follow my projects and that really appreciate GitHub. However, my decision to move my projects away from GitHub is as much for YOUR benefit as it is for mine.
I’ve blogged about this before, so I’ll try to avoid launching into a rant, but frankly the dependency management of Git is non-existent. I know, I know, you Git fans are going to say “SubTrees” or “SubModules”, or start talking about external dependency management software, or complain about the lack of dependency management in the available Pascal IDEs. Please, I’ve heard it all, and in my opinion all of these are merely excuses for a failing in Git. So let me tell you what I want, and why Git is not right.
- When I share a project with you, I’d like for you to be able to take a single action to get all of the code required to build the project. i.e. ‘Clone / Checkout’ and have all of the dependencies pulled out of their respective repositories automatically.
- I’d like to share dependencies between different repositories. As an example, I use the cwRuntime library as a dependency for just about all projects, shared or otherwise.
- When you obtain a copy of one of my projects, it should simply compile (with appropriate compiler/IDE) out of the box, using relative paths to ensure that you don’t have to go setting up search paths to make things work. This means being able to include a sub-directory of another repository at a given location in the parent repository, to keep those relative paths functioning correctly.
- I’d like to be able to decide for myself whether a dependency should be locked to a particular revision, or should update to the latest revision when the parent repository is updated. Generally, I’m using my own repositories as dependencies, and therefore want the dependencies to take advantage of bug fixes, but I’d still like to lock to a particular dependency revision for releases.
- I don’t want to have to perform additional steps to update all of the ‘parent’ repositories that depend on another when that dependency gets an update, and if I make changes to the dependency while working on the parent, I’d like it to be a simple process to distribute that change to all other repositories that depend on the same.
Why Git Fails.
Essentially there are two ways to include a dependency in Git. You can choose to use a SubModule or a SubTree.
Martin Owens’ blog post goes into more detail (https://martowen.com/2016/05/01/git-submodules-vs-git-subtrees/), but essentially the problem is that you either include a copy of the dependency or a reference to it. If you include a copy, then it becomes challenging to propagate changes to that dependency back to the original repository. If you include a reference, then someone else (YOU) cloning the repository would have to take extra steps to get the dependencies.
In both cases of SubTrees and SubModules, the dependency is locked to a particular revision, and there is no option on that. So if you (or in this case I) make a fix to a dependency, that fix has to be manually propagated to every repo that depends on it.
In both cases of SubTrees and SubModules, as with all git repositories, the repository URL is to the root of the repository, period. There is no way to include a sub-directory. This makes relative directory management close to impossible, or at the least very disorganized.
I have seen companies at which I’ve been employed deal with these deficiencies in Git in one of two ways. Either they keep their entire company-wide code-base in a single (often huge) repository, or they include a script in every repository to go pull the dependencies, and possibly adjust their paths. In the first case, with a single repository, not only is it time consuming to work with repositories of this size, but I’ve also seen third party integrations failing because the repositories were too large for them to handle. In the latter case, managing dependency scripts becomes just another annoyance to developers and additional overhead to keep them all correct.
Public GitHub projects often have instructions included with them to install some particular third-party dependency management software. This may seem reasonable, but as a consumer of those repositories, I now need to concern myself with installing and learning to use whichever dependency management software a given project owner has chosen to use. Should I have to understand things such as npm, Homebrew, dep, Pnpm, Cocoapods, Gvm, Bundler, Patch Package, Athens, GreenKeeper, Mint, BowerPhp, Cget, Poet…? You get my point: there are hundreds of these managers all trying to resolve a problem that could so easily be solved by simply using a better SCM.
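To make the second approach concrete, here is a rough sketch of the kind of per-repository script I mean. It is not taken from any real company's code base: the `Dependency` type, the `fetch_commands` helper, the example.com URLs and the `lib/...` paths are all invented for illustration. It only builds the `git clone` / `git checkout` commands rather than running them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Dependency:
    url: str                        # hypothetical repository URL
    path: str                       # where the parent project expects the code
    revision: Optional[str] = None  # commit or tag to lock to; None means latest


def fetch_commands(deps: List[Dependency]) -> List[List[str]]:
    """Build the git commands a hand-written dependency script would run."""
    commands = []
    for dep in deps:
        # Git can only clone from the repository root, so the whole
        # dependency lands at dep.path, samples and tests included.
        commands.append(["git", "clone", dep.url, dep.path])
        if dep.revision is not None:
            # Locking to a revision is a second, separate step.
            commands.append(["git", "-C", dep.path, "checkout", dep.revision])
    return commands


if __name__ == "__main__":
    # Example data only; these repositories do not exist.
    deps = [
        Dependency("https://example.com/cwRuntime.git", "lib/cwRuntime"),
        Dependency("https://example.com/other.git", "lib/other", revision="v1.2.0"),
    ]
    for cmd in fetch_commands(deps):
        print(" ".join(cmd))
```

Every repository that has dependencies needs its own copy of a list like `deps`, and every copy has to be kept correct by hand, which is exactly the overhead I'm complaining about.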
It really is a shame that Git fails so horribly for dependency management, because GitHub is an amazing thing. A place to host source code, with community features for commenting and liking, issue tracking features, features for forking, revenue driving features for open source, and many more. GitHub really is a great tool, and switching from it means finding alternatives to at least some of those features.
Like a trusted old hammer, subversion comes to the rescue!
There are many people that for some reason, dislike subversion. Having listened to many of the arguments against, I really don’t understand why, but I am convinced that subversion is misunderstood. If you’ve read my earlier blog post comparing Git to Subversion, you’ll understand what I mean. Suffice to say, Subversion solves all of the ‘wants’ that I listed above where Git fails.
(* Yes, I am aware that GitHub offers a subversion interface to git repositories, however, that interface lacks the same features that git does because it’s backed by git. It’s essentially a git-broken subversion :-/ *)
- Subversion ‘externals’ are the best of both worlds vs Git SubTrees & SubModules. When you check out (clone) a subversion repository, you’ll get a copy of all linked external repositories at the same time. Great when sharing code with others!
Not only that, but if you make a change to a dependency you can commit that change back to the dependency repository with a single commit instruction. The next time you update / check out another project that depends on the same dependency, it will be updated (assuming you didn’t opt to lock the revision).
I have a single repository at the root level and use externals to reference all of my repositories, so a single update at the root level will update all repositories and their dependencies at the same time! Perfect.
- Subversion externals are simply referenced subversion urls, which are flexible to dig into a project. For example, if I would like to include the main source only from the cwRuntime library, without samples and test projects, I can simply append the appropriate path. https://svn.chapmanworld.com/cwRuntime/trunk/src/main <- example. This enables me to keep all of the repositories organized in such a way that relative paths work out of the box. Great.
- Subversion urls can also be locked to a revision number by appending @<revision-no>. It’s as simple as that.
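As a small sketch of what those two points look like in practice, the helper below formats one line of an `svn:externals` property in the `URL[@REVISION] LOCAL_PATH` form. The function name `external_definition`, the local path `lib/cwRuntime` and the revision number 120 are assumptions made up for this example; only the cwRuntime URL comes from the text above.

```python
from typing import Optional


def external_definition(url: str, local_path: str,
                        revision: Optional[int] = None) -> str:
    """Format one svn:externals line as 'URL[@REVISION] LOCAL_PATH'."""
    if revision is not None:
        # Appending @<revision-no> locks the external to that revision.
        url = f"{url}@{revision}"
    return f"{url} {local_path}"


if __name__ == "__main__":
    src = "https://svn.chapmanworld.com/cwRuntime/trunk/src/main"

    # Floating external: follows the latest revision of the sub-directory.
    print(external_definition(src, "lib/cwRuntime"))
    # https://svn.chapmanworld.com/cwRuntime/trunk/src/main lib/cwRuntime

    # Locked external: revision 120 is a made-up number.
    print(external_definition(src, "lib/cwRuntime", revision=120))
    # https://svn.chapmanworld.com/cwRuntime/trunk/src/main@120 lib/cwRuntime
```

Lines like these could be saved to a file and applied to a working copy directory with something like `svn propset svn:externals -F externals.txt .` followed by a commit.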
Subversion development typically takes place in a single ‘master’ branch called ‘trunk’. The trunk is unstable, in-development code, but generally, depending on the development team’s practices, it should compile. While it is possible to branch in Subversion, there are technical differences from the branches you may be familiar with in Git; frankly, they’re far simpler. Branches are simply directories in the repository which begin at a particular revision. There’s also a commonly used concept of ‘tags’. Typically under this scheme, branches are used for feature development, while tags are used to ‘tag’ release revisions, though they are essentially the same thing.
The great part of all of this is, it’s possible to leave the trunk branch at the latest code revision, while locking tags and their dependencies to specific revision numbers. So in-development code always gets dependency bug fixes, while releases are protected from breaking changes.
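The release step described here can be sketched as a small transformation: take the floating externals used on trunk and pin each one to a known revision before the tag is created. This is only an illustration under simplifying assumptions: each definition is a plain `URL LOCAL_PATH` pair with no spaces in either part, and the `pin_externals` name, the paths and the revision numbers are hypothetical.

```python
from typing import Dict, List


def pin_externals(definitions: List[str], revisions: Dict[str, int]) -> List[str]:
    """Pin floating 'URL LOCAL_PATH' externals to fixed revisions.

    `revisions` maps a local path to the revision it should be locked to.
    Definitions that are already pinned, or that have no entry in
    `revisions`, are returned unchanged.
    """
    pinned = []
    for line in definitions:
        url, local_path = line.split()
        # A pinned URL carries '@<revision-no>' on its last path segment.
        already_pinned = "@" in url.rsplit("/", 1)[-1]
        if not already_pinned and local_path in revisions:
            url = f"{url}@{revisions[local_path]}"
        pinned.append(f"{url} {local_path}")
    return pinned


if __name__ == "__main__":
    trunk_externals = [
        "https://svn.chapmanworld.com/cwRuntime/trunk/src/main lib/cwRuntime",
    ]
    # Revision 120 is a made-up number for the example.
    for line in pin_externals(trunk_externals, {"lib/cwRuntime": 120}):
        print(line)
```

The trunk keeps the floating definitions, so it continues to pick up dependency fixes, while the tag gets the pinned ones.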
Leaving aside the technical arguments of ‘distributed’ vs ‘central’ code management, which is a pointless debate when most ‘distributed’ systems are used with a central repository *sigh*, the true strength of Git is GitHub and its features, as I’ve mentioned above.
When making the move to subversion for my public projects, as a permanent solution, I wanted to provide at least three things.
- I would like to give you a front page, similar to the ‘readme.md’ rendering that you’d get on GitHub or some other SCM hosting site.
- I don’t want to force you to install anything in order to get my code. While I would encourage you to use a subversion client, it should not be a requirement. So you must be able to download ‘zip’ packages of the source.
- There must be some system for issue tracking.
With this in mind, I created some custom code within my blog site to provide these features.
If you take a look at the front page of one of my project pages, you’ll find a tabbed navigation bar has been added…
The first page is an about document containing the typical readme content that you’d find in an SCM site.
The next tab is where you can get the source:
This is a functional source code browser, with a drop-down at the top enabling you to select between the latest ‘trunk’ code, or a release tag. Beneath the drop-down are a ‘zip’ icon for downloading the source code in zip format, and a subversion link to paste into your subversion client. The browser will also allow you to read source files, however, there are one or two css glitches still to be worked out 🙂
I had intended to build my own issue tracking system for the issues tab, and I may still do so, however for the time being I’ve included a plugin to my blog to provide this feature. The issues tab simply links to it.
With that, the GitHub features that I feel most important for presenting my projects are now available at my blog, and I’m able to migrate projects from GitHub. I hope that this will not disappoint you too much, but I will not be going back to GitHub. If you have suggestions for ways that I can improve integration with my blog site I’d be happy to hear those.
Let’s get coding.