OOM Guilin release plan
TL;DR
If you don't retrieve your certificates automatically, this is the first thing you must do. See https://github.com/onap/oom/tree/master/kubernetes/nbi as an example. The certInit template (a.k.a. aafInit) is the strongly preferred way to do that.
No more than 1 main process per container. If you have more, it will be a blocker for updates
All logs to STDOUT
No direct commits to the Frankfurt branch: commit to master, then cherry-pick
All upstream components should use an upstream (dockerhub, googlehub) version
Common chart version bump
MariaDB common chart will be upgraded to 10.4.12
PostgreSQL common chart will be upgraded to 12.2
Cassandra common chart will be upgraded to 3.11.6
MongoDB common chart will be upgraded to 4.2.2
ElasticSearch common chart may be upgraded, waiting for SECCOM proposed version
AAF is an optional requirement, meaning your component must work without AAF (certificates and RBAC), even in degraded mode
MSB is an optional requirement, meaning your component must work without MSB
Your component can use http as server and client
Password removal will continue on common charts (PostgreSQL at least) and start on your component, so be prepared to receive some calls for help
Commit messages must be meaningful and follow the format shown below.
Proper crash (if your component fails, it must exit with code > 0, and not wait or exit with code 0)
Ingress will be the default deployment option (via Nginx Ingress). No more access via NodePort by default
New code will be submitted only if pods + healthchecks + basic tests are OK
No root access to any Database from application container
No configuration generation using sed in the application container
Certificates
Certificates in Docker images are not allowed in the Frankfurt release, per SECCOM recommendation. They won't be allowed in Helm charts either, starting from the Frankfurt branching. This means you must move to automatic retrieval at boot. You'll either have to:
use the certInit template (formerly known as the aafInit template) (see NBI as an example)
do it on your own and explain why you can't use the certInit template (subject to acceptance or rejection by both SECCOM and the OOM team)
No more than 1 main process per container
Several containers run several processes (main component + database(s)) in the same Docker container.
Each container should have only one concern. Decoupling applications into multiple containers makes it easier to scale horizontally and reuse containers. For instance, a web application stack might consist of three separate containers, each with its own unique image, to manage the web application, database, and an in-memory cache in a decoupled manner.
Limiting each container to one process is a good rule of thumb, but it is not a hard and fast rule. For example, not only can containers be spawned with an init process, some programs might spawn additional processes of their own accord. For instance, Celery can spawn multiple worker processes, and Apache can create one process per request.
Running several main processes in one container will not be accepted anymore, as it's a very bad pattern (https://docs.docker.com/develop/develop-images/dockerfile_best-practices/).
All logs to STDOUT
As per SECCOM (and testers ;) ) requests, all the logs must go to STDOUT. It will ease troubleshooting and future centralized logging work.
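As an illustration of the rule above, here is a minimal sketch of a process that sends all of its logs to STDOUT. It assumes a component written in Python; the function name configure_stdout_logging and the log format are hypothetical and not taken from any ONAP component.

```python
import logging
import sys


def configure_stdout_logging(level: int = logging.INFO) -> logging.Logger:
    """Route every log record of this process to STDOUT (no log files)."""
    root = logging.getLogger()
    root.setLevel(level)
    # Drop handlers installed earlier (for example file handlers),
    # so that STDOUT is the only destination.
    for old_handler in list(root.handlers):
        root.removeHandler(old_handler)
    handler = logging.StreamHandler(sys.stdout)
    # Hypothetical format, adapt it to your component.
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s %(message)s")
    )
    root.addHandler(handler)
    return root


if __name__ == "__main__":
    log = configure_stdout_logging()
    log.info("component started")
```

With this setup, the container runtime collects the output, and "kubectl logs" shows it without any access to files inside the container.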
Smart Healthchecks
If your healthchecks just verify that /status is answering 200, they are not useful, as readiness/liveness probes can do that. So create meaningful healthchecks, or remove them if they can be done via readiness/liveness probes.
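One possible reading of "meaningful" is a healthcheck that exercises the component's real dependencies instead of returning a fixed 200. The sketch below is only an assumption about what such a check could look like; the function run_healthcheck and the check names used in the example are hypothetical.

```python
from typing import Callable, Dict, Tuple


def run_healthcheck(
    checks: Dict[str, Callable[[], bool]]
) -> Tuple[bool, Dict[str, str]]:
    """Run each named dependency check and build a per-check report.

    A check is any callable returning True when the dependency works,
    for example a function running a trivial query against the database.
    """
    report: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            report[name] = "ok" if check() else "failed"
        except Exception as exc:  # a crashing check counts as unhealthy
            report[name] = f"error: {exc}"
    healthy = all(status == "ok" for status in report.values())
    return healthy, report


if __name__ == "__main__":
    # Hypothetical checks; real ones would call the actual dependencies.
    healthy, report = run_healthcheck(
        {"database": lambda: True, "message_bus": lambda: False}
    )
    print(healthy, report)
```

A check of this kind reports which dependency is broken, which a plain readiness/liveness probe on /status cannot do.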
Relevant commit messages:
The commit message (on OOM) must follow this form:
[NAME_OF_COMPONENT|DOC|COMMON|GENERIC] Meaningful title (from OOM side)

at least one sentence explaining the change done in this patch, cause and consequences
and possibly more of course

Issue-ID: AS_WE_ARE_FORCED_BUT_MEANINGLESS
Change-ID: xxx
Sign-off: xxx
The commit message is the last thing that will stay with our code, so it must clearly explain the changes, the "why" and the consequences. If the change affects OOM behavior in any way, the documentation must also be updated.
Starting from the Frankfurt branching, merge requests that do not follow this pattern will not be merged.
Please read the following pages and follow the guidelines for writing commit messages contained therein.
http://bit.ly/goodcommitmessages
http://who-t.blogspot.com/2009/12/on-commit-messages.html
http://dep.debian.net/deps/dep3/
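The commit message form described in this section can be checked mechanically before pushing. The following is a minimal sketch under stated assumptions: the function check_commit_message, the regular expression and the sample ticket number OOM-1234 are hypothetical, and the accepted trailer names simply combine the ones shown in the form above with the Change-Id and Signed-off-by trailers that Gerrit and git normally produce.

```python
import re
from typing import List

# "[COMPONENT] Meaningful title": a bracketed tag, one space, then the title.
HEADER_RE = re.compile(r"^\[[A-Za-z0-9_.-]+\] \S.*$")

# Trailer prefixes, so trailer lines are not counted as explanation.
TRAILER_PREFIXES = (
    "Issue-ID:", "Change-ID:", "Change-Id:", "Sign-off:", "Signed-off-by:",
)


def check_commit_message(message: str) -> List[str]:
    """Return a list of problems; an empty list means the form is respected."""
    problems: List[str] = []
    lines = message.strip().splitlines()
    if not lines or not HEADER_RE.match(lines[0]):
        problems.append("first line must look like '[COMPONENT] Meaningful title'")
    body = [
        line for line in lines[1:]
        if line.strip() and not line.startswith(TRAILER_PREFIXES)
    ]
    if not body:
        problems.append("body must explain the change in at least one sentence")
    if not any(line.startswith("Issue-ID: ") for line in lines):
        problems.append("missing 'Issue-ID:' trailer")
    return problems


if __name__ == "__main__":
    sample = (
        "[NBI] Use certInit template\n\n"
        "Retrieve certificates at boot instead of embedding them.\n\n"
        "Issue-ID: OOM-1234\n"  # hypothetical ticket number
    )
    print(check_commit_message(sample))
```

Such a check only validates the shape of the message; whether the title and the explanation are meaningful still has to be judged by a reviewer.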
No root access to any Database from application container
If you need to create users, tables, etc., do it from an init container. You can either:
use the common.mariadb-init template for MariaDB (will be extended for PostgreSQL)
use your own init container and explain why you can't use the common chart
No configuration generation using sed in the application container
Configuration can be delivered as a ConfigMap; if it has to be processed in some way, this should happen in an init container.
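As a sketch of what "processed in the init container" could mean, the snippet below renders a configuration template by replacing named placeholders, instead of patching a file with sed at application start. It is an assumption for illustration only: the function render_config, the placeholder DB_HOST and the sample values are hypothetical, and an init container could equally use another templating tool.

```python
from string import Template
from typing import Mapping


def render_config(template_text: str, values: Mapping[str, str]) -> str:
    """Replace ${NAME} placeholders with the given values.

    substitute() raises KeyError when a placeholder has no value, so an
    incomplete configuration fails in the init container rather than
    silently reaching the application.
    """
    return Template(template_text).substitute(values)


if __name__ == "__main__":
    # Hypothetical template, as it could be mounted from a ConfigMap.
    template_text = "db.host=${DB_HOST}\ndb.port=${DB_PORT}\n"
    print(render_config(template_text, {"DB_HOST": "mariadb", "DB_PORT": "3306"}))
```

The init container would write the rendered text to a volume shared with the application container, which then only reads its configuration.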
Kubernetes upgrade to 1.18
In order to move to Kubernetes 1.18, significant changes will be made to the Helm charts, using templates we've made. This shouldn't be harmful on your side.
Moving to 1.16+ will allow us to use startupProbe. If you have slow-starting components, please tell us!
Databases upgrades
Databases will move to the versions proposed by SECCOM. Please advise if your component is not compatible with the recommended version.
If you're using a specific (out of common) version of a component listed in the page, you'll either have to:
move to the common chart (preferred)
deal with the upgrade + explain why you can't use the common chart
All upstream components should use an upstream (dockerhub, googlehub) version
All upstream components that are used directly (databases, kafka, zookeeper, nginx, haproxy, ...) should use the upstream image directly and not be embedded into the component's own Docker image
AAF component is optional
AAF is used for certificates and RBAC. It is a non-mandatory requirement, meaning your component must be able to work without it for both features (although AAF is enabled by default). The mode can of course be degraded (no HTTPS, no RBAC but basicAuth) when AAF is disabled.
Please bear in mind that we're working on removing the need for AAF for certificates AND the need for AAF for RBAC, so this will be tested (see AAF Removal).
MSB component is optional
MSB is interesting when used on non-Kubernetes deployments, but the features it offers overlap with basic Kubernetes features. Furthermore, it prevents correct tracing of dependencies between ONAP components.
As such, your component must be able to work without MSB.
Password removal
Full status: https://wiki.lfnetworking.org/display/LN/2020+April+Technical+Event+Schedule?preview=/34605773/34606179/password_removal_update.pptx
MariaDB password removal is done. The work will continue on PostgreSQL and other common charts. Work will then start on components (Policy is underway), so be prepared to receive some calls for help.
Pods requests/limits
Every container (including init containers) should have requests/limits with "good" values (not too high, not too low). A check will be done and some values will be changed.
A "simple" rule for setting requests and limits:
requests should be the mean use of the container
limits should be somewhat higher than the expected MAX use.
For memory, if you can set a MAX value in the underlying program (like in Java), set the limit to 1.2 times this value. If not, set it to 1.5 times the max ever used.
For CPU, set the limit to 2 or 3 times the max ever used, as some CPUs may be significantly slower.
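The rule of thumb above is simple arithmetic and can be sketched as follows. The function names suggest_memory and suggest_cpu, the parameter names and the sample figures are hypothetical; units are whatever you measure in (for example MiB for memory and cores for CPU).

```python
from typing import Dict, Optional


def suggest_memory(
    mean_used: float, max_used: float, program_max: Optional[float] = None
) -> Dict[str, float]:
    """Memory request = mean use; limit = 1.2 x the program's own MAX
    (a Java max heap, for instance) when it can be set, else 1.5 x the
    max ever used."""
    if program_max is not None:
        limit = 1.2 * program_max
    else:
        limit = 1.5 * max_used
    return {"request": mean_used, "limit": limit}


def suggest_cpu(
    mean_used: float, max_used: float, factor: float = 2.0
) -> Dict[str, float]:
    """CPU request = mean use; limit = 2 to 3 x the max ever used."""
    if not 2.0 <= factor <= 3.0:
        raise ValueError("factor is expected to be between 2 and 3")
    return {"request": mean_used, "limit": factor * max_used}


if __name__ == "__main__":
    # Hypothetical measurements for one container.
    print(suggest_memory(mean_used=300, max_used=400, program_max=512))
    print(suggest_memory(mean_used=300, max_used=400))
    print(suggest_cpu(mean_used=0.2, max_used=0.5))
```

The results are starting points only; the values still have to be checked against the real behavior of the container.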
Ingress use as default deployment (instead of node ports)
Instead of using NodePorts, we want to push the use of Ingress by default. Significant changes will be made when this is ready (especially removing "NodePort" as the default service type from most ONAP services).
If your component is using "hardcoded" access to other components via NodePort (like Portal), please advise, as it won't be possible anymore (access will still be possible via ClusterIP for internal traffic and Ingress for external traffic).
Using node ports will still be possible by configuring the relevant services.
Dynamic PVC as default deployment (instead of /dockerdata-nfs)
Instead of pushing everything to /dockerdata-nfs, we'll use storage class(es) by default. It will still be possible to use /dockerdata-nfs, but it won't be the default way.
Service Mesh PoC
Wiki page: https://wiki.onap.org/display/DW/ONAP+Security+Model
A PoC using service mesh (Istio), Ingress (component tbd) and RBAC management (keycloak) will be started. For that, your components must be able to:
use HTTP (and not HTTPS), as Istio handles the TLS part for both the server and client side
disable RBAC / basicAuth, as keycloak will do the RBAC for your component (meaning that if a request reaches your component, it has already been granted by keycloak), for both the server and client side
Some calls for help may be launched if we don't understand your component's behavior. This work will start with the "Core" components (DMAAP, AAI, SDC, SO, SDNC) and will then be extended.