Continuous Delivery in Big Data

07.09.2018

The daily life of a developer is filled with monotonous and repetitive tasks. Fortunately, we live in a pre-artificial intelligence age, which means computers are great at handling boring chores and they hardly ever complain about it!

Testing and deployment are two integral elements of development. With some automation mixed in, they become solutions commonly called Continuous Integration, Continuous Delivery and Continuous Deployment (otherwise known as CI/CD). The "continuous" aspect of these solutions means that your projects will be automatically tested and deployed, allowing you to focus more on writing code and less on herding it onto servers.

Continuous Integration

Continuous Integration (CI) is the process of automatically building and testing your software on a regular basis. How regularly this occurs varies. In the early days of agile, this meant daily builds, but with the rise of tools like Jenkins and Bamboo, builds can run as often as every commit. Each build runs a full suite of unit and integration tests against the commit (and because most commits happen on dev branches, we take advantage of the branch builds feature in Jenkins or Bamboo).

Continuous Delivery

Continuous Delivery (CD) is the logical next step from Continuous Integration. It can be thought of as an extension of Continuous Integration that helps us catch defects earlier. If your tests are run constantly, and you trust your tests to provide a good level of quality, then it becomes possible to release your software at any point in time. Note that Continuous Delivery does not always entail actually delivering, as your customers may not need or want constant updates. Instead, it represents a philosophy and a commitment to ensuring that your code is always in a release-ready state.

Continuous Deployment

Unlike Continuous Delivery, where code is always in a deployable state so that you can deploy it whenever you want, Continuous Deployment (CD) requires every change to be deployed automatically, without human intervention. The culmination of this process is the actual delivery of features and fixes to the customer as soon as the updates are ready.

In practice, there is a spectrum of options between these techniques, ranging from just running tests regularly to a completely automated deployment pipeline from commit to customer. The common theme through all of them, however, is a commitment to constant QA and testing, and a level of test coverage that inspires confidence in the readiness of your software for delivery.

[Diagram: Continuous Integration and Continuous Deployment]

As you can see in the diagram, if the newly built application is automatically deployed to production once the Continuous Integration stages are completed, we have Continuous Deployment. On the other hand, if we automate everything but include a step for human approval before the new version is deployed, we have Continuous Delivery. The difference might seem subtle, but it has enormous implications, making each technique appropriate for different situations.
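The distinction can be illustrated with a minimal Python sketch. It is not tied to Jenkins, Bamboo or any other tool; the names `run_pipeline`, `require_approval` and the stage functions are hypothetical and exist only for illustration. The only difference between the two modes is whether a human approval gate sits between a green build and the deployment.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PipelineResult:
    stages_run: List[str]
    deployed: bool


def run_pipeline(
    ci_stages: List[Callable[[], bool]],
    require_approval: bool,
    approved: bool = False,
) -> PipelineResult:
    """Run the CI stages in order, then decide whether to deploy."""
    stages_run: List[str] = []
    for stage in ci_stages:
        stages_run.append(stage.__name__)
        if not stage():
            # A failed stage stops the pipeline; nothing is deployed.
            return PipelineResult(stages_run, deployed=False)

    # Continuous Deployment: no gate, deploy as soon as CI is green.
    # Continuous Delivery: the build is release-ready but waits for a human.
    deployed = approved if require_approval else True
    return PipelineResult(stages_run, deployed)


# Hypothetical stand-ins for real build and test steps.
def build() -> bool:
    return True


def unit_tests() -> bool:
    return True


def integration_tests() -> bool:
    return True


stages = [build, unit_tests, integration_tests]
continuous_deployment = run_pipeline(stages, require_approval=False)
continuous_delivery = run_pipeline(stages, require_approval=True)
```

With the same green build, `continuous_deployment.deployed` is true straight away, while `continuous_delivery.deployed` stays false until someone calls the pipeline with `approved=True`.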

What are the Big Data DevOps use cases?

  • Platform Build and Code Release

[Diagram: Jira Big Data use cases with CI/CD]

The above pipeline implements:

  1. Continuous Delivery (for development releases such as snapshots and bugfixes)
  2. Continuous Deployment (for customer releases such as nightly and stable builds)


1. Continuous Delivery

  • The Data Engineer implements service requests for bugs and new features
  • Code commits in SCM trigger a build on Jenkins/Bamboo, which uses Maven/Ant, Ansible, Nexus and other tools to build, test and mark a new snapshot release
  • First, Maven, Ant or another build tool builds the code and runs the unit tests in Jenkins
  • Then Ansible ensures that the DEV/UAT environment (Hadoop stack) is ready for functional and load tests, and runs the tests
  • On successful completion of the build and all tests, Jenkins marks the build as passed and releases the artefact into Nexus (Snapshot)

 This completes the Continuous Delivery phase, where we ensure our code is ready for release at any point in time. If the delivery fails at any step, users are notified so that they can attend to and fix the issues with the build or tests.
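The fail-fast behaviour of this phase can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the stage actions are hypothetical stand-ins that return True or False rather than real calls to Maven, Ansible or Nexus, and `run_delivery_pipeline` and `notify` are made-up names.

```python
from typing import Callable, List, Optional, Tuple

# A stage is a (name, action) pair; the action returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_delivery_pipeline(
    stages: List[Stage],
    notify: Callable[[str], None],
) -> Optional[str]:
    """Run stages in order; return the name of the failed stage, or None."""
    for name, action in stages:
        if not action():
            # Stop at the first failure and tell the users what broke.
            notify(f"Stage '{name}' failed; build not released")
            return name
    return None


# Hypothetical stand-ins for the real build, test and publish steps.
def always_ok() -> bool:
    return True


def failing_load_test() -> bool:
    return False


messages: List[str] = []

snapshot_stages: List[Stage] = [
    ("build and unit tests", always_ok),
    ("prepare DEV/UAT stack", always_ok),
    ("functional and load tests", failing_load_test),
    ("publish snapshot artefact", always_ok),
]

failed_stage = run_delivery_pipeline(snapshot_stages, messages.append)
```

Because the pipeline returns at the first failing stage, the snapshot artefact is never published when the functional or load tests fail, and exactly one notification is recorded in `messages`.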

2. Continuous Deployment

  • The Release Manager pulls commits into the release branch (only those commits that passed the earlier builds)
  • Code commits in SCM trigger a build on Jenkins/Bamboo, which uses Maven/Ant, Ansible, Nexus and other tools to build, test and mark a new release for the consumers
  • First, Maven, Ant or another build tool builds the code and runs the unit tests in Jenkins
  • On successful completion of the build and all tests, Jenkins marks the build as passed and releases the artefact into Nexus (Release)
  • Ansible Tower then deploys the latest released product to the production and pre-production stacks

 Continuous Deployment ensures that a working product is delivered automatically to the end user.
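As a rough sketch of the release flow, the Python below filters candidate commits down to those whose earlier build passed and then rolls the release out to each environment. Everything here is hypothetical: the `Commit` type, the commit hashes, the version string and the `deploy_release` function are illustrative, not part of Jenkins, Nexus or Ansible Tower. Deploying to pre-production before production is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Commit:
    sha: str
    build_passed: bool


def select_release_commits(candidates: List[Commit]) -> List[Commit]:
    """Keep only the commits whose earlier build passed."""
    return [commit for commit in candidates if commit.build_passed]


def deploy_release(version: str, environments: List[str]) -> List[str]:
    """Return one log line per environment, in deployment order."""
    return [f"deployed {version} to {env}" for env in environments]


# Hypothetical commits and version number, for illustration only.
candidates = [
    Commit("a1b2c3", build_passed=True),
    Commit("d4e5f6", build_passed=False),
    Commit("0789ab", build_passed=True),
]

release_commits = select_release_commits(candidates)
deployment_log = deploy_release("1.4.0", ["pre-production", "production"])
```

Here the commit with the failed build is left out of the release branch, and the log records one deployment per environment in the order given.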

 

If you would like to find out more about how DevOps could help you fast-track your Big Data projects while enabling you to open your digital horizons, do give us a call at +44 (0)203 475 7980 or email us at marketing@whishworks.com
