GitHub Actions Investigation [WIP]

As per JIRA: POLICY-4980: Add job for running automated s3p tests weekly (Closed)

Introduction

As part of our work to run our S3P tests on a weekly basis in the Policy Framework, we decided to investigate the possibility of using GitHub Actions for this task.

Running our S3P tests weekly would provide regular feedback about the stability and performance of our Policy Framework components. At present, our S3P tests are run in full once per release. Under the proposed approach, we would run shortened versions of our S3P tests on a weekly basis. This would add to the dynamic tests that we run as part of the Policy Framework.

In this document I will outline the findings from this investigation.

What is GitHub Actions?

GitHub Actions is a CI/CD platform for automating build or test pipelines. Further information on GitHub Actions can be found here: https://docs.github.com/en/actions/about-github-actions

Some key terms:

A workflow is a configurable automated process that will run one or more jobs. Workflows are defined by a YAML file checked in to your repository and will run when triggered by an event in your repository, or they can be triggered manually, or at a defined schedule.

An event is a specific activity in a repository that triggers a workflow run.

A job is a set of steps in a workflow that is executed on the same runner. Each step is either a shell script that will be executed, or an action that will be run.

An action is a custom application for the GitHub Actions platform that performs a complex but frequently repeated task. Use an action to help reduce the amount of repetitive code that you write in your workflow files.

A runner is a server that runs your workflows when they're triggered. Each runner can run a single job at a time. GitHub provides Ubuntu Linux, Microsoft Windows, and macOS runners to run your workflows. Each workflow run executes in a fresh, newly-provisioned virtual machine.
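The terms above can be mapped onto a small workflow file. The following is a minimal sketch, not our actual workflow: the workflow name, job name, schedule, and echo step are all hypothetical and chosen only for illustration.

```yaml
# Hypothetical minimal workflow, e.g. .github/workflows/example.yaml
name: example-workflow              # the workflow

on:                                 # events that trigger the workflow
  workflow_dispatch:                # manual trigger
  schedule:
    - cron: '0 6 * * 1'             # scheduled trigger: Mondays at 06:00 UTC

jobs:
  example-job:                      # a job
    runs-on: ubuntu-22.04           # the runner that executes the job
    steps:
      - uses: actions/checkout@v4   # a step that runs an action
      - name: Say hello             # a step that runs a shell script
        run: echo "Hello from the runner"
```

The five cron fields are minute, hour, day of month, month, and day of week, and schedules are evaluated in UTC.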

Our GitHub Actions Workflow File Explained

gerrit-s3p-performance.yaml

name: policy-api-performance-test  # (1)

on:
  workflow_dispatch:  # (2)
  # For Branch-Protection check. Only the default branch is supported. See
  # https://github.com/ossf/scorecard/blob/main/docs/checks.md#branch-protection
    inputs:
      GERRIT_BRANCH:
        description: 'Branch that change is against'
        required: true
        type: string
      GERRIT_CHANGE_ID:
        description: 'The ID for the change'
        required: true
        type: string
      GERRIT_CHANGE_NUMBER:
        description: 'The Gerrit number'
        required: true
        type: string
      GERRIT_CHANGE_URL:
        description: 'URL to the change'
        required: true
        type: string
      GERRIT_EVENT_TYPE:
        description: 'Gerrit event type'
        required: true
        type: string
      GERRIT_PATCHSET_NUMBER:
        description: 'The patch number for the change'
        required: true
        type: string
      GERRIT_PATCHSET_REVISION:
        description: 'The revision sha'
        required: true
        type: string
      GERRIT_PROJECT:
        description: 'Project in Gerrit'
        required: true
        type: string
      GERRIT_REFSPEC:
        description: 'Gerrit refspec of change'
        required: true
        type: string
  branch_protection_rule:
  # To guarantee Maintained check is occasionally updated. See
  # https://github.com/ossf/scorecard/blob/main/docs/checks.md#maintained

  # Run every Monday at 16:30 UTC
  schedule:  # (3)
    - cron: '30 16 * * 1'

jobs:  # (4)
  run-s3p-tests:  # (5)
    runs-on: ubuntu-22.04  # (6)

    steps:
      - uses: actions/checkout@v4  # checks out the repository onto the runner

      - name: Run S3P script  # (7)
        working-directory: ${{ github.workspace }}/testsuites
        run: sudo bash ./run-s3p-test.sh run performance

      - name: Archive result jtl  # (8)
        uses: actions/upload-artifact@v4
        with:
          name: policy-api-s3p-results
          path: ${{ github.workspace }}/testsuites/automate-performance/s3pTestResults.jtl

(1) - Workflow name

(2) - Boilerplate inputs required by LFN for workflows triggered from Gerrit

(3) - Schedule event that runs our jobs once a week, on Mondays at 16:30 UTC

(4) - Where we define our jobs

(5) - The only job defined in this workflow, called run-s3p-tests

(6) - Spins up a test VM running Ubuntu 22.04

(7) - Runs our S3P script, which triggers our performance tests

(8) - Archives the results of the JMeter performance test
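One possible refinement to step (8), shown here only as a sketch: by default a step is skipped when an earlier step in the same job has failed, so the archive step would not run if the S3P script exits with a non-zero code. Adding if: always() would upload whatever results exist even after a failed run. The if-no-files-found and retention-days values below are assumptions for illustration, not settings we have agreed on.

```yaml
# Fragment of the run-s3p-tests job; only the "if", "if-no-files-found"
# and "retention-days" lines differ from our current workflow file.
steps:
  - name: Run S3P script
    working-directory: ${{ github.workspace }}/testsuites
    run: sudo bash ./run-s3p-test.sh run performance

  - name: Archive result jtl
    if: always()                  # run this step even if the test step failed
    uses: actions/upload-artifact@v4
    with:
      name: policy-api-s3p-results
      path: ${{ github.workspace }}/testsuites/automate-performance/s3pTestResults.jtl
      if-no-files-found: error    # assumed: fail loudly when no results were produced
      retention-days: 30          # assumed retention period
```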

Triggering GitHub Actions

This page outlines the boilerplate workflow code needed to trigger GitHub Actions workflows from Gerrit events such as a merge or push: https://docs.releng.linuxfoundation.org/projects/gerrit-to-platform/en/stable/index.html

In our case, we are not planning to trigger jobs from Gerrit changes specifically. The GitHub Actions workflow we have written will run once weekly. (I am not sure if the boilerplate code is really necessary in this case, but I have been advised to include it regardless.)

We will store and manage our workflow files in the .github/workflows directory in each component's repo. This differs from the way we store our Jenkins JJB files, which currently live in the ci-management repo under the policy directory.
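Because a scheduled run is not started from a Gerrit event, none of the GERRIT_* inputs are populated for it; the required: true setting only applies to manual workflow_dispatch runs. If a step ever needs one of these values, an expression with a fallback is one way to handle both cases. This is a minimal sketch: the step, the TARGET_BRANCH variable, and the 'master' default are hypothetical and not part of our workflow.

```yaml
on:
  workflow_dispatch:
    inputs:
      GERRIT_BRANCH:
        description: 'Branch that change is against'
        required: true
        type: string
  schedule:
    - cron: '30 16 * * 1'

jobs:
  run-s3p-tests:
    runs-on: ubuntu-22.04
    steps:
      - name: Show which branch the run targets
        env:
          # Hypothetical fallback: scheduled runs carry no inputs,
          # so the expression falls through to the assumed default.
          TARGET_BRANCH: ${{ inputs.GERRIT_BRANCH || 'master' }}
        run: echo "Running against ${TARGET_BRANCH}"
```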

Testing

I took two approaches when testing this workflow.

1) Using "act"

2) Creating a fork of the GitHub repo and manually triggering the workflow in the GitHub environment

Using "act"

All documentation on act can be viewed here: https://github.com/nektos/act

When you run act it reads in your GitHub Actions from .github/workflows/ and determines the set of actions that need to be run. It uses the Docker API to either pull or build the necessary images, as defined in your workflow files and finally determines the execution path based on the dependencies that were defined. Once it has the execution path, it then uses the Docker API to run containers for each action based on the images prepared earlier. The environment variables and filesystem are all configured to match what GitHub provides.

There were a few drawbacks with act that I found while testing. The main one was that systemd is required to install snap packages, and systemd cannot easily be run inside a Docker container, which is where act executes each job.

A post on a Docker forum, which I found while researching the failure seen when running our run-s3p-test.sh script under act, explains the problem: "I wouldn’t even try it. Snap requires systemd and not just requires it, systemd has to run as the first process. Even if you solve that (which is challenging itself), snap requires capabilities to manage linux kernel namespaces. It would be like Docker in Docker which exists, but in that case the Docker daemon is the first process and it doesn't require systemd. Maybe you could run snapd as the first process as you would do with the Docker daemon, but you can’t use the snap command until the daemon is running and you can’t start snapd during the build, only maybe in same very nasty way in a single layer (single RUN instruction). We say it very often, because it is true and important to know. A container is NOT a virtual machine and it shouldn’t be used like that. The other thing that I say sometimes is that if you want to use a container similarly to a virtual machine, use LXD and run a full operating system with Systemd in it in an LXC container. Docker was not designed for that. It is for running a single process or a couple of processes with a process manager as the first process."
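act sets an environment variable named ACT in the containers it starts, and its documentation describes using it to skip steps during local runs. The sketch below assumes, hypothetically, that the snap-dependent setup could be separated into its own step; the step names and echo commands are placeholders rather than parts of our real script.

```yaml
# Fragment of a job's steps. The first named step is skipped under act,
# the second runs both locally and on GitHub-hosted runners.
steps:
  - uses: actions/checkout@v4

  - name: Snap-dependent setup (placeholder)
    if: ${{ !env.ACT }}           # ACT is set by act, so this is false locally
    run: echo "snap-based installation would run here"

  - name: Remaining steps (placeholder)
    run: echo "steps that do not need systemd would run here"
```

This would only let act exercise the parts of the workflow that do not depend on snap; the full S3P run would still need a real VM, such as a GitHub-hosted runner.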

Using the fork method

ONAP Wiki