Moving Projects from Java.net to GitHub

Posted in Java, Software Development by Dan on May 17th, 2010

How to move your project from Subversion on Java.net to Git on GitHub without losing the change history.

Cloning the Subversion Repository

The normal way to duplicate a Subversion repository with full history is to use the svnadmin dump and load commands. Unfortunately most SVN hosting services, including Java.net, do not provide access to svnadmin commands or direct access to the file system.

Fortunately there is another way to clone a repository, complete with all its history, that requires only read access to the repository: svnsync.

The first step is to create a local SVN repository into which you will mirror the remote Java.net repository.

svnadmin create myproject

Before cloning the contents it is important that you add a pre-revprop-change hook to your new local repository. This is a script that must complete successfully (exit code 0) before any changes to revision properties are accepted. The easiest way to do this to add an empty script and make it executable.

echo '#!/bin/sh' > myproject/hooks/pre-revprop-change
chmod +x myproject/hooks/pre-revprop/change

Bearing in mind that we want to preserve tags and branches too, not just the trunk, we can then mirror the top-level of the remote repository.

svnsync init file:///pathto/myproject https://myproject.dev.java.net/svn/myproject
svnsync sync file:///pathto/myproject

This may take a while if the repository is large and/or your connection is slow.

Stripping Java.net Web Content from the Repository

Java.net uses the project SVN repository to manage the project website, with the files stored under trunk/www. When migrating from Java.net you probably don’t want to continue with this approach. If that’s the case then you’ll want to remove all traces of the www directory from the repository. The usual way of deleting a file – removing it from the working copy and then committing – will not purge its history so instead we use the svndumpfilter command.

First dump the mirrored repository:

svnadmin dump myproject > dump

We can then remove all traces of the www directory. Any commits that only touched files under www are now empty and are dropped completely. All remaining revisions are renumbered to avoid gaps in the sequence.

svndumpfilter --drop-empty-revs --renumber-revs exclude trunk/www < dump > filtered

The final step is to restore the filtered dump in place of our local repository.

rm -rf myproject
svnadmin create myproject
svnadmin load myproject < filtered

Migrating the Repository from SVN to Git

Now that the local SVN repository contains only what we wish to keep, we are ready to migrate it to Git. To achieve this I follow Paul Dowman’s instructions.

The first step is to import the SVN repository into a new Git repository.

git svn clone file:///pathto/myproject --no-metadata -A authors.txt -t tags -b branches -T trunk myproject-git

The authors.txt file maps SVN users to Git users. Your Java.net repository may have commits attributed to the users root and httpd. You should probably just map these to your own user name. There should be one entry for each committer:

root = Your Name <you@example.com>
httpd = Your Name <you@example.com>
you = Your Name <you@example.com>
other = Someone Else <other@example.com>

Refer to Paul’s full instructions if you have branches and tags to maintain.

Pushing to GitHub

Create a new repository on GitHub and then add this as a remote for your local Git repository.

git remote add origin git@github.com:username/myproject.git

And then push:

git push origin master --tags

Job done.