CD: OOM framework for continuous E2E deploy validation of tagged commit/merge trigger docker snapshots

Description

 Issue: we currently build docker images daily, not on each gerrit merge of a developer change. We need per-merge builds so we can run CD on each commit

TODO: look at out of the box systems like gitlab, gocd, bamboo...

POC running now that can consume a tagged docker manifest for a commit under test
http://jenkins.onap.info/job/oom-cd/
against http://beijing.onap.info:8880/env/1a7/kubernetes/dashboard
tracked on http://kibana.onap.info:5601/app/kibana#/dashboards?_g=()

Manifests

https://onap.readthedocs.io/en/latest/release/release-manifest.html 

https://gerrit.onap.org/r/gitweb?p=integration.git;a=blob;f=version-manifest/src/main/resources/docker-manifest.csv;h=35e992adc0678ccd98328d33c9a5ffc88dbb4dfa;hb=refs/heads/master 

 

 existing flow (actual)

  • gerrit review commits - partial-CI runs - JobBuilder marks -1 (compile failure, sonar failure)

  • gerrit review commits - partial-CI runs - JobBuilder marks +1, committer +2 merges to master (later, the daily docker merge build tags the docker image blindly)

  • no way to know whether that commit degraded ONAP

 proposed flow (expected) - two phase JobBuilder +1 process

  • gerrit review commits - partial-CI runs - JobBuilder marks -1 (compile failure, sonar failure)

  • gerrit review commits - partial-CI runs - JobBuilder marks +1, (what we add below)
    kick in docker merge immediately, we tag the docker image, we adjust the manifest for this review
    run extra CDBuilder that runs CD deploy of OOM using above tagset manifest, healthcheck, vFW
    report failure - marks gerrit review as -1 - no tag set published for this failed commit
    or report pass - marks gerrit review as +1
    jenkins now retags "latest" to the docker image built above for the review (do not blindly tag "latest" to the last docker build whether it passes/fails CD)
    Tagset for that build becomes the latest stable master build of ONAP (as in an OOM deploy will run not from master tip but from a tagged set that omits the last breaking change)
    committer +2 merges to master, (docker merge and tagging should not be required as long as master has not moved during the 1 hour CD build)
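
A minimal sketch of the two-phase vote described above, written as a pure decision function. The names `Decision`, `gate` and `rebuild_needed_on_merge` are hypothetical and not part of JobBuilder or Jenkins; the actual voting and retagging would be done by the CI jobs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    vote: int              # gerrit vote to post: -1 or +1
    start_cd: bool         # kick off docker merge + CDBuilder run
    publish_tagset: bool   # publish the tag set for this review
    retag_latest: bool     # move "latest" to the image built for this review


def gate(ci_passed: bool, cd_passed: Optional[bool] = None) -> Decision:
    """Map CI/CD results to the actions in the proposed flow.

    cd_passed is None while the CD run has not reported yet.
    """
    if not ci_passed:
        # Phase 1 failed (compile or sonar failure): -1, nothing else happens.
        return Decision(-1, False, False, False)
    if cd_passed is None:
        # Phase 1 passed: +1 and start the docker build and CD deploy.
        return Decision(+1, True, False, False)
    if not cd_passed:
        # Phase 2 failed: -1 and no tag set is published for this commit.
        return Decision(-1, False, False, False)
    # Phase 2 passed: +1, publish the tag set and retag "latest".
    return Decision(+1, False, True, True)


def rebuild_needed_on_merge(master_at_cd_start: str, master_at_merge: str) -> bool:
    """A rebuild is only needed if master moved during the CD run."""
    return master_at_cd_start != master_at_merge
```

For example, `gate(True, False)` yields a -1 vote with nothing published, which is the "report failure" branch above.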

(todo: developer can destabilize this if the submit order is wrong - i.e. dcae will add an appc rest api, then check in dcae first and appc second)

Note: Align with upcoming config retrofit

Issues: (workarounds for both)

  • out of order commits on cross-project commits - 

  • repo drift during the CD build

We also need a way to do a full deployment of OOM containing the change in a tagged docker image (from the CI build) - OOM will need either a script to retrofit all the values.yaml docker version tags or a way to pull in an automated manifest that has the entire ONAP tag set.

The CD build, with healthcheck and a minimal vFirewall run (as much as is automated), is then run on this single commit trigger or on the set of triggers for the past hour - later the JobBuilder marks the gerrit reviews as passed/failed based on the CD run

A CD poc running on AWS EC2 is currently up that does part of this (jenkins, elk stack, cd server to run OOM) - but it needs to be finished and migrated into the LF infrastructure
OOM-393, OOM-540

 Proposal

1 - developer gerrit merge occurs
2 - CI system saves git hash -
3 - tag git based on when build starts (not when it finishes) -
4 - build docker image per merge - not daily
5 - insert artifact into docker image
6 - tag docker image with TBD timestamp standard tag
https://nexus3.onap.org/#browse/browse/components:docker.snapshot

currently we have non-standard tags

component   version
aai         1.2.0-SNAPSHOT-STAGING-20180121T124253
ccsdk       0.2.0-SNAPSHOT-STAGING-180117-133202
            0.1.1-STAGING-180121-124346
appc        1.3.0-STAGING-20180105T120937

7 - run CD system - after 1 hour collect results on running vFW for example

8 - tag gerrit review -1 or +1 via JobBuilder for CD results
9 - If CD succeeds: mark the tag set, including the new blind tag from the earlier gerrit merge, per component

9b - tag latest only if successful

10 - publish tag set in dynamic manifest
https://gerrit.onap.org/r/gitweb?p=integration.git;a=blob;f=version-manifest/src/main/resources/docker-manifest.csv;h=35e992adc0678ccd98328d33c9a5ffc88dbb4dfa;hb=refs/heads/master 
11 - retrofit manifest into all values.yaml files in OOM (replace v1.1.0 below - per component)
https://git.onap.org/oom/tree/kubernetes/aai/values.yaml?h=amsterdam 
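
A rough sketch of what the step 11 retrofit script could look like. It rests on two assumptions that are not confirmed by this ticket: that the manifest CSV has an `image,tag` header row, and that the values.yaml lines carry the image as `key: [registry/]image:tag` on one line (files that keep the version under a separate key would need a different rule). The function names are hypothetical.

```python
import csv
import io
import re
from typing import Dict

# Matches "key: [registry/]repo/name:tag"; the tag is the part after the last colon.
IMAGE_LINE = re.compile(r"^(?P<prefix>\s*\w+:\s*)(?P<ref>\S+):(?P<tag>[^\s:/]+)\s*$")


def load_manifest(csv_text: str) -> Dict[str, str]:
    """Read image -> tag pairs (assumes an 'image,tag' header row)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["image"]: row["tag"] for row in reader}


def retrofit(values_text: str, manifest: Dict[str, str]) -> str:
    """Return values.yaml text with tags replaced for images found in the manifest."""
    out = []
    for line in values_text.splitlines():
        match = IMAGE_LINE.match(line)
        if match:
            ref = match.group("ref")
            for image, tag in manifest.items():
                # Accept the bare image name or one prefixed by a registry host.
                if ref == image or ref.endswith("/" + image):
                    line = f"{match.group('prefix')}{ref}:{tag}"
                    break
        out.append(line)
    trailing = "\n" if values_text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Lines whose image is not in the manifest are left untouched, so the script can be run over every values.yaml in OOM without a per-component list.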

Acceptance Criteria

  • jobbuilder can mark a gerrit review CD pass/fail - just like we do for compile CI jobs

  • kibana history is available for CD jobs

  • run on any branch or tag

  • publicly visible server results

  • mitigation procedure on not using unvalidated tags until they pass

  • mitigation procedure on high volume commit times - lack of resources (group commits - has its own issues)

 Issues to fix

1 - tag timestamps are non-standard

https://nexus3.onap.org/#browse/browse/components:docker.snapshot

v2/onap/aai-resources/manifests/1.2.0-SNAPSHOT-STAGING-20171205T173445

v2/onap/aai/esr-gui/manifests/1.0.1-SNAPSHOT-STAGING-171205-122425

We can pull all the highest version tags in a query (after we fix above) and generate the tag set this way
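
As a sketch of how the tag set could be generated once the two timestamp styles are reconciled: the code below parses both forms seen above (`YYYYMMDDTHHMMSS` and `YYMMDD-HHMMSS`), rewrites them to one form, and keeps the newest tag per image. The standard is still TBD, so using `YYYYMMDDTHHMMSS` as the target is an assumption, as are the function names. Note that it picks the newest build by timestamp and does not compare version numbers.

```python
import re
from datetime import datetime
from typing import Dict, Iterable, Optional, Tuple

# Trailing build stamp in either of the two styles currently in nexus3.
_STAMP = re.compile(r"-(?:(?P<long>\d{8}T\d{6})|(?P<short>\d{6}-\d{6}))$")


def build_time(tag: str) -> Optional[datetime]:
    """Extract the build timestamp from a tag, or None if there is none."""
    match = _STAMP.search(tag)
    if match is None:
        return None
    if match.group("long"):
        return datetime.strptime(match.group("long"), "%Y%m%dT%H%M%S")
    return datetime.strptime(match.group("short"), "%y%m%d-%H%M%S")


def normalize(tag: str) -> str:
    """Rewrite the trailing stamp as YYYYMMDDTHHMMSS (assumed target format)."""
    stamp = build_time(tag)
    if stamp is None:
        return tag
    return _STAMP.sub("-" + stamp.strftime("%Y%m%dT%H%M%S"), tag)


def latest_tag_set(tags: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Pick the newest timestamped tag per image from (image, tag) pairs."""
    best: Dict[str, Tuple[datetime, str]] = {}
    for image, tag in tags:
        stamp = build_time(tag)
        if stamp is None:
            continue  # skip tags without a build stamp, e.g. "latest"
        if image not in best or stamp > best[image][0]:
            best[image] = (stamp, tag)
    return {image: normalize(tag) for image, (_, tag) in best.items()}
```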

 

2 - need to update the manifest and use the plugin

https://wiki.onap.org/display/DW/ONAP+Version+Manifest+Maven+Plugin 

 
Notes
http://blog.christianposta.com/deploy/blue-green-deployments-a-b-testing-and-canary-releases/
https://aws.amazon.com/about-aws/whats-new/2016/08/netflix-oss-spinnaker-on-the-aws-cloud-quick-start-reference-deployment/
https://medium.com/@gajus/the-missing-ci-cd-kubernetes-component-helm-package-manager-1fe002aac680

Former user May 19, 2018 at 5:48 PM
Edited

Former user May 3, 2018 at 6:10 PM

20180502: clustered AWS now - I added EFS wrapping NFS - will double the cluster size this weekend
https://wiki.onap.org/display/DW/Cloud+Native+Deployment#CloudNativeDeployment-AWSEFSshareforsharedNFS

Also thanks Ran for
https://github.com/kubernetes-helm/chartmuseum

Former user April 25, 2018 at 4:11 AM

Former user March 21, 2018 at 2:08 AM

Done

Created December 8, 2017 at 4:08 AM
Updated August 12, 2023 at 5:56 AM
Resolved May 20, 2019 at 1:47 PM