OOM Meeting Notes - 2018-12-5

Agenda

  • OOM Branches now open

    • "casablanca" for Casablanca Maintenance - Casablanca-3.0.1 Sprint

    • "master" for Dublin release - Dublin backlog exists and continues to evolve


  • Results from Casablanca Post mortem with Integration Team
    1. Improvements
            a. Much cleaner than Beijing
            b. New Deploy function that separated out the charts helped tremendously
            c. Control Plane separation greatly stabilized helm/k8/rancher
            d. Pulling patches before merge for Integration helped keep us rolling while waiting.
    2. Issues
            a. Still delays since projects would update base code and then have to wait for OOM chart merges
            b. Projects aren't putting enough in helm charts - passwords, for instance, should all be in helm charts
            c. Still have rancher/helm/k8 issues under high load
            d. Multi-site and clustering
            e. Intervals are too short for full deployment
            f. If rancher node goes down need to do complete re-install
            g. Need something better than dockerdata-nfs
                    i. filled up dockerdata-nfs multiple times
            h. Log and Pomba are resource problems - stability improves if we turn these off
                    i. don't know if it's network bandwidth or disk I/O
            i. vCPE (SDNC) needs to be on assigned hosts so we can set up static routes
                    i. implement host label so nodeSelector can be used ?
                    ii. More general solution for apps that write to VNFs with source IP ACLs/NAT ?
            j. Slower clouds (everyone but windriver) used the public-cloud settings with relaxed timeouts
            k. Portal DB still not coming up correctly all the time outside of windriver
    3. Suggestions
            a. Use HA rancher deployment
            b. Alternative to dockerdata-nfs turned on by default
            c. Ability to override all periodic interval settings ?
            d. Move project helm charts to their repo in Dublin
            e. Portal can be set up on another domain, not just portal.api.simpledemo.onap.org
                    i. Should we set up a public external DNS that can be used to drive testcase ?
                    ii. portal.api.sb04.windriver.test.onap.org ? with net10 and 192 addresses as applicable ?
            f. AAI team wanted to get notified of AAI Cassandra issues automatically
                    i. Can we set up a Nagios or equivalent to monitor both rancher/k8 and the applications for rancher/k8 issues ?
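
    Note on issue 2.i (host label / nodeSelector): a minimal sketch of how the idea could work, not an agreed design. It assumes a hypothetical node label "onap-static-route=sdnc" (the notes do not name one) applied to the assigned hosts, for example with "kubectl label nodes <node> onap-static-route=sdnc". The helper names and the placeholder image are illustrative only. The snippet adds a nodeSelector to a pod spec and mimics the rule that every selector key/value must be present on the node.

```python
import json

# Hypothetical label key/value; the meeting notes do not specify one.
HOST_LABEL = {"onap-static-route": "sdnc"}


def with_node_selector(pod_spec, labels):
    """Return a copy of pod_spec whose nodeSelector includes `labels`."""
    patched = dict(pod_spec)
    selector = dict(pod_spec.get("nodeSelector", {}))
    selector.update(labels)
    patched["nodeSelector"] = selector
    return patched


def node_matches(node_labels, selector):
    """Approximate nodeSelector matching: all key/value pairs must match."""
    return all(node_labels.get(key) == value for key, value in selector.items())


if __name__ == "__main__":
    # Placeholder pod spec; a real chart would template this from values.
    spec = {"containers": [{"name": "sdnc", "image": "example/sdnc:placeholder"}]}
    print(json.dumps(with_node_selector(spec, HOST_LABEL), indent=2))
```

    In a helm chart the same selector would more likely live in values and be rendered into the pod template; the dict form above only shows the shape of the field.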

  • OOM offline Installer overview
