StiltSoft: Using Submodules in Git

February 20, 2014

In this post we describe how we solved two problems with storing the source code of one of our projects. Initially, the code lived in a single Subversion repository. We wanted to switch to Git and to split the code into two separate repositories so that we could close public access to one of them. We will show you how to migrate from Subversion to Git and how to split a Git repository in two without making a mess of the file structure for a user who has access to both parts.

Step 1: Getting data from Subversion

First, we need a file that maps Subversion usernames to Git committers. It looks like this:

mkuzmich = Maxim Kuzmich <mkuzmich@stiltsoft.com>
alexkuznetsov = Alexander Kuznetsov <alexkuznetsov@stiltsoft.com>
rkirilenko = Roman Kirilenko <rkirilenko@stiltsoft.com>

Let’s save it as users.txt.
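If the list of committers is not known in advance, a first draft of users.txt can be derived from the Subversion log. The following is only a sketch: the function name authors_from_log and the example.com domain are placeholders of ours, and the real names and addresses still have to be filled in by hand.

```shell
# Build a draft authors file from `svn log -q` output read on stdin.
# authors_from_log and the example.com domain are placeholders, not part of
# the original setup; edit the result so that it holds real names and emails.
authors_from_log() {
    awk -F'|' '/^r[0-9]+ / {
        gsub(/^ +| +$/, "", $2)          # trim spaces around the username
        print $2 " = " $2 " <" $2 "@example.com>"
    }' | sort -u
}

# Usage (inside a Subversion working copy):
#   svn log -q | authors_from_log > users.txt
```

Each username appears once in the result, in the same "login = Name <email>" form as the file shown above.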

Now, we check out the repository to a temporary folder (in case something goes wrong we’ll have a copy of our repository):

git svn clone --stdlayout --no-metadata -A users.txt --username svn_username https://some_svn_url dest_dir-tmp

It is a good idea to fetch once more afterwards, since someone could have committed to the repository while we were checking it out:

cd dest_dir-tmp
git svn fetch

Step 2: Migrating data to Git and splitting the repository

First of all, we need a clean clone of the data we got from Subversion:

git clone dest_dir-tmp dest_dir

Then we unlink our repository from Subversion:

cd ./dest_dir
cp -Rf .git/refs/remotes/* .git/refs/heads/
rm -Rf .git/refs/remotes
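Copying files under .git/refs only picks up loose refs; refs that Git has packed into .git/packed-refs are not files in that folder. A sketch that goes through Git's own commands instead could look like this. The helper name promote_remote_branches is hypothetical, and "origin" is assumed to be the remote created by git clone.

```shell
# Create a local branch for every remote-tracking branch of a remote.
# Sketch only: promote_remote_branches is a hypothetical helper name and
# "origin" is assumed to be the remote created by `git clone`.
promote_remote_branches() {
    remote=${1:-origin}
    git for-each-ref --format='%(refname)' "refs/remotes/$remote" |
    while read -r ref; do
        name=${ref#"refs/remotes/$remote/"}
        # origin/HEAD is a symbolic ref, not a branch of its own
        [ "$name" = HEAD ] && continue
        # skip branches that already exist locally (e.g. the checked-out one)
        git show-ref --verify --quiet "refs/heads/$name" && continue
        git branch --no-track "$name" "$ref"
    done
}

# Usage (inside the fresh clone):
#   promote_remote_branches origin
```

Once the local branches exist, the remote can be removed with git remote rm, as is done for origin later in this post.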

Putting the private part into its own repository

To do that, we move the private data (the secure folder) to another folder and turn that folder into a repository by running the following commands inside it:

git init
git add .
git commit -m "initial commit"

Then we copy the private data to the server:

git remote add stash http://192.168.0.234:7990/scm/proj/secure.git
git push --all stash

And now it’s time to set access rights for the private part (we used Stash for that):

[Screenshot: Setting rights in Stash]

Removing info about the private part

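To prevent private data from reaching the server that hosts the public part, we need to remove it from Git's control in the original repository and then push that repository to the server. Keep in mind that git rm removes the folder only from new commits: earlier commits still contain it and are pushed too, so the history has to be rewritten first (for example with git filter-branch) if the old revisions must stay private. The commands we used: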

git rm -r secure
git commit -m "removed secure part"
git remote add stash http://192.168.0.234:7990/scm/proj/repo.git
git remote rm origin
git push --all stash
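Before pushing, it can be worth checking whether older commits still refer to the private folder. The sketch below wraps git log for that purpose; path_in_history is a hypothetical helper name, and "secure" is the folder used in this post.

```shell
# Report whether any commit reachable from any ref still touches a path.
# path_in_history is a hypothetical helper name used for illustration.
path_in_history() {
    # `git log --all -- <path>` lists the commits that changed the path;
    # empty output means no such commit is reachable from a ref.
    [ -n "$(git log --all --format=%H -- "$1")" ]
}

# Usage (inside the public repository):
#   path_in_history secure && echo "secure is still present in older commits"
```

If the check succeeds after the removal commit, anyone who can clone the public repository can still recover the folder from an earlier revision.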

Integrating the private part into the public part as a submodule

Now we add the private part as a submodule:

git submodule add http://192.168.0.234:7990/scm/proj/secure.git
git commit -m "added secure part as submodule"
git push stash

Step 3: Getting the whole repository (with the private part)

We cloned the repository:

git clone http://192.168.0.234:7990/scm/proj/repo.git

As a result, we got an empty folder named secure. It contains no data, and access rights to the submodule's content are not checked at this point, so anyone who can read the public repository can work with it. To get the private part, we need to initialize the submodule and fetch its content from the private repository:

git submodule init
git submodule update

At this stage, user rights are checked, and if everything is OK, the user gets the private part from the private repository.
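To see whether the private part has actually been fetched, the output of git submodule status can be inspected: Git prefixes a line with "-" when the submodule is not initialized and with "+" when the checked-out commit differs from the recorded one. A small sketch that turns this output into readable messages follows; submodule_state is a hypothetical helper name.

```shell
# Summarize `git submodule status` output read on stdin.
# Git prefixes each line with "-" when the submodule is not initialized,
# with "+" when its checked-out commit differs from the one recorded in the
# main repository, and with "U" when it has merge conflicts.
# submodule_state is a hypothetical helper name.
submodule_state() {
    while IFS= read -r line; do
        path=${line#?}          # drop the one-character status prefix
        path=${path#* }         # drop the commit hash
        path=${path%% \(*}      # drop the trailing "(describe)" part, if any
        case $line in
            -*) echo "$path: not initialized" ;;
            +*) echo "$path: differs from recorded commit" ;;
            U*) echo "$path: merge conflict" ;;
            *)  echo "$path: up to date" ;;
        esac
    done
}

# Usage (inside the main repository):
#   git submodule status | submodule_state
```

Right after a plain clone, the secure submodule would be reported as not initialized; after a successful git submodule update it would be reported as up to date.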

When working with submodules, please note that the main repository stores a link to a specific commit of the submodule (here, the private repository). If the data in the private repository have changed and the changes should be picked up by the main repository, you need to update the submodule and commit the new link to the main repository:

git submodule foreach git pull origin master
git add .
git commit -m "updated submodule"
git push origin
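The link to the submodule commit is stored in the main repository as a special tree entry (a "gitlink" with mode 160000). To check which commit is currently pinned, before or after an update, something like the following sketch can be used; recorded_submodule_commit is a hypothetical helper name.

```shell
# Print the submodule commit that the main repository currently records.
# recorded_submodule_commit is a hypothetical helper; "secure" is the
# submodule path used in this post.
recorded_submodule_commit() {
    # A submodule is stored in the tree as a "gitlink" entry (mode 160000)
    # whose object name is the pinned commit of the submodule repository.
    git ls-tree HEAD -- "${1:-secure}" | awk '$1 == "160000" { print $3 }'
}

# Usage (in the top-level folder of the main repository):
#   recorded_submodule_commit secure
```

If the printed hash is the same before and after the update commit, the main repository still points at the old state of the private part.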

Summary

Although it may look like a difficult task, migrating a small repository from Subversion to Git takes only around 15 minutes. You will need another 15 minutes to split the repository and set access rights for the private part. You may want to use Atlassian Stash to make working with Git more intuitive and user-friendly.