Monolithic Repository and CI/CD

After working with dozens of independent projects and dealing with the difficulties of sharing components, fighting technical debt, duplicating parameters from the projects into the infrastructure code, or simply helping people join our teams and projects, we decided to move to a monolithic repository. By doing so, we hoped to overcome some of the challenges we had faced, as a start-up, in developing and growing a micro-service architecture.

As you can guess, storing different services in a single repository does not come without challenges either. You want to enforce good practices. You need to optimize your CI/CD pipeline. You need to onboard new people and new teams quickly.

A monolithic repository comes with its own set of questions:
How do you organize it at the highest level? How do you keep your services independent when commits, tags, branches, pull/merge requests and issues are consolidated? How do you actually differentiate a "micro-service architecture" stored in a monolithic repository from a monolithic project? How do you manage versions, labels, interfaces and application artifacts? How do you mix services based on technologies as diverse as Java/Maven, Go, JavaScript/React, Elm and Docker/Kubernetes? How do you secure environments and separate responsibilities? We've come a long way, and it all started with a single step.

What files have changed?

The way we first considered it, a monolithic repository is just a collection of smaller repositories. So once you've put everything together, the first and simplest question that comes to mind when merging a change is "What files have changed?". Once you can tell exactly which part of the monolithic repository has been changed by the commit history that has led to a branch or tag, you can:
  • ensure your teams keep to the micro-service approach and modify only one service at a time, i.e. services are self-contained
  • trigger the right service build, test and deployment
  • monitor this service during the upgrade and perform an independent "rollback" if concerns appear in production
Obviously, our context differs from many others. Our development process relies on the GitHub flow. Our reference branch is master. We always start from it, and we tend to merge back to it as soon as possible to keep changes small and to keep production in sync with our code or, at least, most of it. We encourage people to rebase their changes rather than merge master back into their working branch. The level of understanding of git is probably above average. People are responsible for their branches as well as for what they push to production; this includes fixing conflicts if needed and having a reliable set of tests. We have only one consolidated production system even if, thanks to blue/green upgrades, its state might be transient. So the way we develop might differ a lot from yours.

As you can see from git merge-base --help, git provides a simple way to find the commit from which a branch diverged. Assuming your HEAD is on the branch, you can find its starting commit by executing:
git merge-base origin/master HEAD

Building on the previous command, the command below lists the files that have changed since the branch was created:
git diff --name-only "$(git merge-base origin/master HEAD)" HEAD
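
The list of changed files can then be mapped back to services. The sketch below is only an illustration and assumes a layout the article does not describe: every service sits in its own top-level directory. The function names changed_services and single_service, and the directory names used in the usage line, are hypothetical.

```shell
#!/bin/sh
# Sketch only. Assumption: every service sits in its own top-level directory
# (for example billing/ or frontend/; these names are hypothetical).

# Read changed paths on stdin and print the unique top-level directories.
# Files stored at the repository root are reported as ".".
changed_services() {
  awk -F/ '{ if (NF > 1) print $1; else print "." }' | LC_ALL=C sort -u
}

# Read a list of services on stdin; succeed only if at most one is listed.
single_service() {
  [ "$(grep -c .)" -le 1 ]
}
```

From a working branch, this could be used as: git diff --name-only "$(git merge-base origin/master HEAD)" HEAD | changed_services | single_service || echo "more than one service changed"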

Our development process also requires peer reviews, and people are responsible for their production releases. The check performed by the command above is just a tool, not the sole control.

Cloning with Travis-CI

This method assumes that the commit history, your branch and the master branch are all part of the repository you are working on. Unfortunately, when a Travis-CI job is kicked off by a commit, the clone command looks like the one below:
git clone --branch [branch] --single-branch [yourproject]

This command only copies the current branch. It also prevents git fetch/pull from working as they do with a regular repository. If you want more details about this, refer to the --single-branch section of git clone --help.

As a result, git merge-base origin/master HEAD simply does not work, because the clone has no reference to the master commits. To overcome this limit, you have to change the repository configuration so that it can fetch all the branches again. The setting involved is the remote.origin.fetch configuration property, which a single-branch clone restricts to the cloned branch. You can display its current value with:
git config remote.origin.fetch

To change it back to the standard configuration, which allows any branch to be fetched, run the command below:
git config remote.origin.fetch "+refs/heads/*:refs/remotes/origin/*"

Once done, you should be able to fetch the master branch again and, as a result, figure out which files have changed:
git fetch origin master
git diff --name-only "$(git merge-base origin/master HEAD)" HEAD
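
The three commands above can be wrapped into one function for a build script. This is a minimal sketch rather than the exact script we run: the function name changed_files_on_branch and its optional base-branch argument are hypothetical, and it assumes the remote is named origin, as in the commands above.

```shell
#!/bin/sh
# Sketch only; the function name and its optional argument are hypothetical.
# Prints the files changed on the current branch since it left the base branch.
changed_files_on_branch() {
  base_branch="${1:-master}"
  # Undo the --single-branch restriction so that any branch can be fetched.
  git config remote.origin.fetch "+refs/heads/*:refs/remotes/origin/*" &&
  # Bring in the base branch so that origin/<base_branch> exists locally.
  git fetch --quiet origin "$base_branch" &&
  # Diff the branch against the commit it shares with the base branch.
  git diff --name-only "$(git merge-base "origin/$base_branch" HEAD)" HEAD
}
```

Calling changed_files_on_branch with no argument compares against master; the steps are chained with && so that a failed fetch stops the function instead of producing a misleading diff.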

There you go: you can now easily write a script that works with Travis-CI to detect which files have been impacted by a branch.

The journey continues...

The biggest challenges we've experienced so far in moving to a monolithic repository are the changes to the CI/CD. It took us some time to fix everything, and this article covers only the first step: detecting which files have been impacted by a change. I will explore other issues and solutions later. So far, the impact on people has been very positive, or it might be that I suffered so much with a large set of distributed repositories that my view is totally biased. Anyway, I'm quite optimistic that the benefits will soon outweigh the difficulties. Hopefully, this article will trigger some questions. If that is the case, do not hesitate to leave a comment.

