CI/CD Task Force Report

Introduction

Back in April, the CI/CD Task Force was formed to look into the current state of continuous integration (CI) across all of the Hyperledger projects. The stated goal was to analyse the existing solutions, derive a set of requirements for a unified CI system, and then make recommendations to the TSC on ways to improve on what we have now. This report is the culmination of several months of meetings and research. It outlines the challenges we currently face and then proposes several short-term and long-term solutions for meeting Hyperledger project requirements for CI.

Problems

Currently only 5 of the 14 Hyperledger projects have a CI system in place, although a few others may have some CI builds running. Because of a variety of challenges, only one team, Hyperledger Fabric, is currently running CI on Hyperledger-provided infrastructure, using Jenkins and Gerrit. The rest are running CI on infrastructure that is owned and run by corporations. There are two major problems with Hyperledger projects running on corporate CI infrastructure. The first has to do with security, visibility, and analytics. The corporate-run CI platforms are not open to the Hyperledger staff for integration with our metrics gathering and security auditing tools. With Hyperledger Fabric and Sawtooth, the CI pipeline is open to the rest of the community for viewing. The second problem is one of resources and money. Hyperledger has a budget for CI infrastructure, and member corporations shouldn't have to pay for CI infrastructure to run builds of the stock open source version of a project.

Solutions

The task force spent a considerable amount of time looking at potential CI platforms to determine the best way to get all of the Hyperledger projects onto shared CI infrastructure. In the end, there is no one solution that fits all projects, and the solutions break down further into short-term and long-term thinking.

Short Term

The short-term solutions are things we could implement immediately to alleviate any acute CI issues. The Hyperledger Fabric team is concerned with the speed of the CI cycle on Hyperledger infrastructure. It takes much longer than it should, and one potential short-term solution would be to move the Jenkins minions to AWS servers. This would decrease the time per CI pass and give the Hyperledger Fabric developers greater control over the CI pipeline.

Another short-term solution would be to dedicate some portion of the Hyperledger budget to paying for CI resources. This could mean either reimbursing companies for their corporate-run CI infrastructure or paying for infrastructure that each team can use however it likes. The goal is to defray the costs of doing CI and encourage more teams to set it up.

The last short-term solution is to work with each team to adapt the existing CI systems to work with our analytics and security systems, so that they are much less opaque and we can do a better job of vetting releases.

Long Term

For the long term, we were able to look more at ideal cases rather than sticking with only the practical solutions outlined above. Eventually Hyperledger will need to develop a global budget for CI resources that all teams can use for either Hyperledger-run or corporate-run infrastructure. Managing this budget will require a set of policies and rules designed to divide the resources fairly between the teams. One thing we do know is that the total budget for CI infrastructure is likely to grow significantly from where it is now, which presents a large budget problem that needs to be solved.

The task force went back and forth on many of the potential CI systems available, hoping that one would meet all of the team requirements and that all teams would unite on a single platform. In the end, no one CI system was an easy target for all of the teams to move towards. This leaves us with the recommendation of picking one CI system and "blessing" it as the one that all of the Hyperledger projects will use. Currently CircleCI appears to be one of the best candidates, and the Fabric team has been experimenting with it over the last couple of weeks. Another strong contender is GitLab CI. The most exciting feature of GitLab CI is the ease with which community-sourced computer hardware can be added to the network of builders and contribute to making builds.

At the end of the day, though, blessing a single CI system means that switching to it will be painful for all of the projects. However, the increased transparency, security, and metrics gathering are what matter most.