Continuous Integration Part 1: The Fundamentals

Deepak Karanth, Thu, 07/06/2017 - 10:14

Introduction

Continuous integration, referred to in short as CI, is a practice first proposed by Grady Booch in which developers are encouraged to merge their code into the main source code repository frequently.

Each of these ‘merges’ or ‘commits’ into the repository is usually followed by a series of automated tasks: compilation of the code, execution of unit and integration tests, and static analysis to determine whether the code quality has degraded.

These automated tasks help to verify the sanity of the new code and to identify any breakages it introduces, that is, whether any integration issues have occurred. Most importantly, the developers receive this feedback quickly and early.
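As a rough illustration of the automated tasks described above, the following minimal sketch runs a series of checks in order and stops at the first failure. The step names and the lambda checks are hypothetical stand-ins; in a real setup each check would invoke a compiler, a test runner or a static analysis tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    name: str
    passed: bool


def run_pipeline(steps: List[Tuple[str, Callable[[], bool]]]) -> List[StepResult]:
    """Run each (name, check) step in order and stop at the first failure."""
    results = []
    for name, check in steps:
        passed = bool(check())
        results.append(StepResult(name, passed))
        if not passed:
            break  # fail fast so the developer hears about the breakage early
    return results


# Hypothetical checks that only simulate the outcome of each task.
steps = [
    ("compile", lambda: True),
    ("unit tests", lambda: True),
    ("integration tests", lambda: False),  # simulated breakage
    ("static analysis", lambda: True),
]

for result in run_pipeline(steps):
    print(f"{result.name}: {'ok' if result.passed else 'FAILED'}")
```

In this sketch the static analysis step never runs, because the simulated integration test failure stops the pipeline first.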

Having CI infrastructure and processes is seen as a fundamental requirement in most modern software companies, whether the organization is adopting Agile methodology, DevOps or both.

CI has become a mandatory part of the software development process by virtue of providing measurable benefits to both developers and customers. These benefits include improved software quality, faster time-to-market, quicker feedback cycles on defects, lower development costs and fewer integration issues throughout the whole software development process.

CI in Agile

Agile, the most commonly followed software development methodology of the last decade, is based on twelve principles. Three of these twelve principles emphasize fast development cycles and quick feedback loops:

  1. Customer satisfaction by early and continuous delivery of valuable software
  2. Working software is delivered frequently (weeks rather than months)
  3. Working software is the principal measure of progress

CI processes and tools are utilized to achieve part of these objectives.

In an environment that encourages early feedback loops and continuous delivery of working software, CI enables developers to commit changes frequently and with confidence by automating the quality assurance and build processes. Each check-in to the repository is verified by the CI infrastructure for integration issues and for an acceptable level of quality.

CI in DevOps

The term 'DevOps' is used to describe a set of practices and methodologies that bring together 'Development' and 'Operations' in order to increase the overall efficiency of delivering software projects.

In recent years, DevOps has garnered a lot of attention in many organizations, especially ones dealing with building large and complex software.

DevOps is a culture shift that encourages communication and collaboration among different departments within the organization - Development, QA and IT Operations.

There is no single tool that achieves the aims of DevOps. Instead, there is a set of phases, known as the “DevOps toolchain”, each with its own objectives. Various tools can be used to achieve those objectives, depending on the organization.

Typically, the DevOps toolchain consists of the following:

[Figure: DevOps toolchain]

Image credit: https://commons.wikimedia.org/wiki/File:Devops-toolchain.svg

The CI process is involved in two of the phases in the toolchain, namely Create and Verify. The Create phase includes developing code and checking it in to the version control system. The Verify phase includes automatically building the product and testing it.

The rest of DevOps culture is built on these two fundamental phases. Without CI, it is difficult to see the benefits of Agile and almost impossible to create a DevOps culture.

Development workflow with CI Process

Having introduced CI and outlined how it fits into the Agile and DevOps methodologies, we will now go through a typical CI process used to introduce a new feature into the main code base.

  1. Checkout code

The developer begins by checking out or cloning the latest code from the source control system onto the local development machine. The developer then proceeds to create a new feature branch based on the main branch. This branch will be used specifically to introduce changes only for the scope of the new feature.

  2. Develop the new feature

This phase is where the code needed for new functionality is written.

Remember that the changes include not only the functional code, but also unit tests and integration tests for the new code.

  3. Integrate early and often

When working in a team, it is likely that many developers are working on various features and committing changes into the main branch. The longer the new feature branch is kept out of sync with the main branch, the more outdated it becomes.

It is therefore recommended that developers sync their code with the main branch often. This step brings the changes from the main branch into the feature branch, so the developers are always working with the latest copy of the main branch code in addition to their own feature code.

After each sync, the developers have to execute the automated tests on their feature branch. If these tests or the compilation of the code itself fail, the developer has to fix the integration issues.

If the sync is withheld until the implementation is fully complete, there could be issues integrating the new code back to the main code base.

Typically, each developer is advised to sync the code from the main branch into their feature branch at least once every day.
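To make the idea of an outdated feature branch concrete, here is a toy sketch that models each branch as a list of commit identifiers and reports which main-branch commits the feature branch has not yet picked up. The identifiers and the helper name commits_behind are invented for illustration; a real version control system tracks history as a graph and does far more than this.

```python
from typing import List


def commits_behind(main_history: List[str], feature_history: List[str]) -> List[str]:
    """Return the main-branch commits that the feature branch has not picked up yet."""
    seen = set(feature_history)
    return [commit for commit in main_history if commit not in seen]


# Hypothetical commit identifiers, oldest first.
main_history = ["m1", "m2", "m3", "m4"]
feature_history = ["m1", "m2", "f1"]  # branched after m2, plus one feature commit

missing = commits_behind(main_history, feature_history)
print(missing)  # ['m3', 'm4']

# Syncing brings the missing main commits into the feature branch.
feature_history = feature_history + missing
print(commits_behind(main_history, feature_history))  # []
```

The longer the gap between syncs, the longer the list of missing commits grows, and the more changes have to be reconciled in one go.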

  4. Automated build and verification

Whenever a change is committed to the main branch, no matter how small, an automated build process compiles the code and executes the unit and integration tests.

This step verifies that the new code has not broken any of the previously working code.

Of course, it is assumed that the tests that run in this phase are provided by the developers themselves as part of the previous steps.

Although obvious, it needs to be highlighted that developers must provide enough tests to verify the code. In the absence of adequate tests, the verification cannot be carried out to the necessary level. The proportion of the code that is exercised by these tests is known as code coverage; higher coverage generally indicates more thorough verification.
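As a small worked example of the coverage idea, the sketch below computes line coverage as the percentage of lines exercised by tests. The function name and the sample numbers are made up for illustration, and line coverage is only one of several ways coverage is commonly measured.

```python
def coverage_percent(covered_lines: int, total_lines: int) -> float:
    """Return line coverage as a percentage of the total executable lines."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    if not 0 <= covered_lines <= total_lines:
        raise ValueError("covered_lines must be between 0 and total_lines")
    return 100.0 * covered_lines / total_lines


# Hypothetical numbers: tests exercise 420 of 500 executable lines.
print(coverage_percent(420, 500))  # 84.0
```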

  5. Commit changes into the main branch

When the feature is deemed to be complete to the satisfaction of the developer and stakeholders, the code in the feature branch needs to be merged into the main branch.

Before doing so, the feature branch needs to be synced with the main branch one more time and the developer needs to verify that the build and verification steps are passing with no errors.

This step is easier and stress-free if the developers have been integrating often. Otherwise, it generally turns into a nightmare of merge conflicts, which arise when developers working on the same code repository have made conflicting or overriding changes to the same files.
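One rough way to see where merge conflicts might arise is to compare the sets of files changed on each branch since they diverged. The sketch below does only that, using hypothetical file paths. It is a file-level approximation: real version control systems compare the changed regions within each file, so a file touched on both branches is merely a candidate for conflict, not a guaranteed one.

```python
from typing import Set


def conflict_candidates(main_changes: Set[str], feature_changes: Set[str]) -> Set[str]:
    """Return files modified on both branches since they diverged."""
    return main_changes & feature_changes


# Hypothetical file paths changed on each branch.
main_changes = {"src/billing.py", "src/auth.py"}
feature_changes = {"src/auth.py", "tests/test_auth.py"}

print(sorted(conflict_candidates(main_changes, feature_changes)))  # ['src/auth.py']
```

Frequent syncing keeps both sets small, which in turn keeps the overlap small.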

Observe how closely the above steps resemble and assist the aims of Agile and DevOps mentioned previously!

Advantages

The advantages of introducing CI are many, mainly to do with quick feedback (either positive or negative) on code changes and code quality.

  1. Automated steps to verify code changes. Usually these steps can also be run manually with a single command on the developer’s local machine!
  2. Early detection of defects
  3. Early detection of integration issues with other code and components
  4. Ability to test code automatically
  5. Happier customers because of improved code quality
  6. Provision to develop a piece of software in small bits, verifying that each bit is fully functional and devoid of errors
  7. Ability to introduce code quality measurements and code quality gates on each commit and also on the overall code
  8. Reduced time to market for developing complex software
  9. Fewer code merge conflicts and less unexpected behavior caused by a big-bang approach to software development, i.e. where everything is merged into the mainline code in large chunks, usually at the end of a project life cycle
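The code quality gates mentioned in the list above can be pictured as a set of minimum thresholds that every commit must meet. The following is a minimal sketch under that assumption; the metric names and threshold values are hypothetical, and real CI tools define their own metrics and configuration.

```python
from typing import Dict, List


def gate_failures(metrics: Dict[str, float], minimums: Dict[str, float]) -> List[str]:
    """Return the names of metrics that are missing or below their minimum."""
    failures = []
    for name, minimum in minimums.items():
        value = metrics.get(name)
        if value is None or value < minimum:
            failures.append(name)
    return failures


# Hypothetical thresholds and measurements for one commit.
minimums = {"coverage_percent": 80.0, "test_pass_rate": 100.0}
metrics = {"coverage_percent": 76.5, "test_pass_rate": 100.0}

failed = gate_failures(metrics, minimums)
print(failed)  # ['coverage_percent']
print("gate passed" if not failed else "gate failed")
```

A commit that fails the gate would be flagged immediately, which is the quick feedback that CI aims for.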

Difficulties

The word “Disadvantages” would be the wrong choice because CI can only benefit everyone (not just developers, but also customers), so I prefer to use the word “Difficulties”.

  1. The architecture of a legacy system may not be able to support CI without significant rework
  2. It is difficult to introduce CI in legacy code. Legacy code generally lacks automated build systems; builds are usually carried out manually or in a semi-automatic way. Overcoming this issue generally requires introducing a modern build system, which demands a heavy investment of time and money. Any legacy system looking to move to CI should first introduce an automated build system as the starting point
  3. Legacy code generally lacks automated tests. This makes it difficult or impossible to check the sanity of new code that has been committed into the main branch. The main purpose of CI, which is to catch problems early, cannot be achieved without tests
  4. Culturally, the organization might not have adopted Agile principles or the DevOps way of working. Continuous commits into the repository might not be happening during the development cycle, thereby making CI pointless. Apart from that, every organization has internal politics that can act as a deterrent to implementing CI

Summary

I have introduced you to the fundamentals of CI: why it is needed, the typical processes that are employed, and its benefits and difficulties.

In general, organizations make use of CI not only for compiling and executing tests on code, but also to determine other measurable aspects such as code quality, code coverage, performance measurements using automated tests etc.

Some organizations also extend the use of CI to not just verify the code, but also to package and install the builds on a test or a production environment. This is called Continuous Delivery or Continuous Deployment, respectively.

Continuous Integration should be the goal of most companies making complex software. Once the basic setup of CI is in place, there is no particular overhead to the development process. If anything, most teams will find that introducing CI processes and tools leads to significantly fewer integration issues and allows the team to develop software more rapidly and with confidence.

Moving forward - CI Server and Tooling

I have refrained from mentioning tools that aid in CI because it is a practice that requires no specific tooling apart from automated builds. That said, it is beneficial to have a source control repository and a Continuous Integration server. The CI server and other tooling will be discussed in the next post in this series on Continuous Integration.