Overview: ONAP Next Generation Security & Logging Architecture, Design and Roadmap

Provided Rationale for Service-Mesh Security: uniform, open-source- and standard-based security at the platform-level
Istio was chosen for ONAP Service-Mesh Security to fulfill ONAP security requirements
Defined ONAP Security Architecture, leveraging Istio, Keycloak, Cert-Mgr, Ingress, Egress
Multi-Tenancy architecture and plan were addressed
OOM team defined the service-mesh implementation priority for Istanbul: service-to-service mTLS first, then AAA (AuthN, AuthZ, Audit)
Currently, collecting feedback from NSA
Implementation Plan for Istanbul:
- Phase 1: OOM will contribute mTLS with Service-Mesh (contribution target: SDC, DCAE, SO, AAI, DMaaP, SDNC)
  - Make selected components Service-Mesh compatible
  - Configure service-to-service mTLS communication
  - Configure peer authentication policies
  - Configure authorization policy for coarse-grained authorization
- Phase 2: OOM plans to contribute AAA, Multi-Tenancy and logging (stretch goals)
  - Deploy/Configure Keycloak as the reference IdAM
  - Configure Istiod for public key and JWT handling
  - Configure request authentication policies
  - Configure additional authorization policies

Migration from AAF to Service-Mesh Security

Why not ONAP AAF (Application AuthZ Framework)

It is an ONAP-specific solution that is maintained by ONAP only
The AAF CADI plugin supports Java only, whereas some ONAP components are written in other languages (Python, etc.)
Clients in various languages could have language-specific restrictions
AAF needs to keep up with the latest security technology evolution, but it has no active support in ONAP
AAF is not widely used by ONAP projects because:
Integration/enforcement of AuthN and AuthZ is done by each project
Mutual-TLS enablement and secrets/certificate management are done by each project
For encryption, each project needs to understand the AAF certificate management mechanism
Compatibility with a variety of external authentication systems is a challenge
Third-party microservices require modification to work with AAF
SSO and multi-factor authentication support is questionable

We want to achieve simpler, uniform, open source- and standard-based Security (Service Mesh-pattern Security):

Eliminate or reduce microservice infrastructure level security tasks in services
Security is handled at the platform level
Enabling uniform security across applications
Minimize code changes and increase stability
Platform-agnostic abstraction reduces risks on integration, extensibility and customization
Use of a sidecar, so the security layer is transparent to the application
External Interfaces secured at Ingress Gateway
Internal Interfaces secured at Service Mesh Sidecar
User Management and RBAC are managed by open-source technologies

Benefits to ONAP, O-RAN and External Components

The uniform, open-source-based and standard-based security would be a foundation for secure integration between ONAP / O-RAN and external components
Product impacts for security would be minimal
Ensure ONAP services have only the business logic of that service
Third-party microservices can participate in ONAP security by leveraging the platform-level security (sidecar)
Vendors can take advantage of open-source-based security, instead of having and maintaining their own security mechanism: cost effective

Service Mesh Security with Istio

The following diagram depicts the Istio mesh architecture overview:

Why Istio?

Istio was chosen because it fulfills the current ONAP security requirements with the following capabilities, while also providing a future-proof mechanism.

However, please note that a new platform could be considered in the future as needed.

Istio authentication and authorization:

Provides each service with a strong identity representing its role to enable interoperability across clusters and clouds.
Secures service-to-service communication with authentication and authorization.
Provides a key management system to automate key and certificate generation, distribution, and rotation.

Microservices have security needs:

To defend against man-in-the-middle attacks, they need traffic encryption.
To provide flexible service access control, they need mTLS (mutual TLS) and fine-grained access policies.
To determine who did what at what time, they need auditing tools.

With its control and data planes, mTLS, Ingress, Proxy and Egress, Istio provides security features for:

Strong identity
Powerful policy
Transparent TLS encryption
Authentication, Authorization and Audit (AAA) tools to protect your services and data

Goals of Istio-based security are:

Security by default: no changes needed to application code and infrastructure
Defense in depth: integrate with existing security systems to provide multiple layers of defense
Zero-trust network: build security solutions on distrusted networks

ONAP Security Architecture - Leveraging Istio, Keycloak, Cert-Mgr, Ingress, Egress

The following diagram depicts ONAP Security Architecture.

Security Functional Blocks

Ingress Controller (realized by istio-ingress gateway)
AuthN/AuthZ Gateway (sidecar) – future consideration
ONAP IdAM (realized by Keycloak)
User Administration
Tenant Administration
Proxy (sidecar) – for data plane
oauth2-proxy
External IdP (vendor-provided component)
External IdAM
Egress
Istiod – control plane (pilot, citadel, galley, cert-mgr)
Certificate Authority (CA)

Kubernetes Ingress vs. Istio Ingress Gateway

Kubernetes Ingress

One NodePort per service
It is a reverse proxy and Nginx is recommended
Different ports for different applications
It has limited traffic control/routing rules and capabilities

Istio Ingress Gateway

It is a POD with Envoy that does the routing
It is configured by Gateway and Virtual Service (which configures routing and enables intelligent routing)
It can be configured for routing rules, traffic rate limiting, policy-based checking and metrics collection
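
The Gateway and Virtual Service configuration mentioned above could look roughly like the following. This is a minimal sketch, not the ONAP configuration: the names `onap-gateway`, `portal-routes`, `onap-ingress-cert`, `portal-ui`, the host `portal.onap.example`, the port and the `/portal` prefix are all hypothetical.

```yaml
# Sketch only: every name, host and port below is a placeholder.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: onap-gateway
  namespace: onap
spec:
  selector:
    istio: ingressgateway              # label carried by the default Istio ingress gateway pods
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE                     # TLS is terminated at the gateway
      credentialName: onap-ingress-cert  # secret with cert/key, in the gateway's own namespace
    hosts:
    - "portal.onap.example"
---
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: portal-routes
  namespace: onap
spec:
  hosts:
  - "portal.onap.example"
  gateways:
  - onap-gateway                       # binds these routes to the Gateway above
  http:
  - match:
    - uri:
        prefix: /portal
    route:
    - destination:
        host: portal-ui                # hypothetical Kubernetes service
        port:
          number: 8080
```

The Gateway only opens the port and terminates TLS; the routing rules live in the VirtualService, which is why the two resources are always configured together.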

Security Architecture Highlights

Use of control plane and pluggable sidecar (platform container) proxies to protect application service
Intercept traffic and check AuthN/AuthZ with IdAM and IdP
Support SSO for human user access requests
Compatibility with a variety of external authN/authZ systems (e.g., external IdP, IdAM)
Allowing external IdP(s) and IdAM(s)
Enabling a foundation for authN federation
RBAC support, based on user role, group & permission
Multi-Tenancy support

ONAP Component Security Requirements

The following are the ONAP component security requirements.

#	Requirement
1	No implicit dependencies between common components and ONAP
3	istio-ingress is used as ingress controller
4	Separate Northbound (for user) and Southbound (xNF) interfaces
5	Every ingress gateway terminates the TLS and re-encrypts the payload before sending to the destination component using mTLS
6	ISTIO network policy must be configured to allow only authorized service-to-service communication
7	AuthN between services is done using certs via mTLS
8	OpenID Connect is used to authenticate user
9	ONAP IdAM is realized by Keycloak, but it can be replaced with an external IdAM that is compatible with OIDC; authentication federation is out of scope for Istanbul
10	Cert-manager and Citadel are used to manage (CRUDQ) certificates
11	Kubernetes is configured to use encryption at rest
12	ISTIO automated sidecar injection is configured in underlying Kubernetes
13	No root pods are used
14	All DBs are considered external
16	Support possible options to integrate with LDAP, Kerberos, AAF as IdP
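
As an illustration of requirement 12, automated sidecar injection is normally switched on per namespace with a label. A minimal sketch, assuming the ONAP components are deployed into a namespace called `onap` (the name is an assumption):

```yaml
# Sketch only: the namespace name is a placeholder.
apiVersion: v1
kind: Namespace
metadata:
  name: onap
  labels:
    istio-injection: enabled   # Istio injects the Envoy sidecar into new pods in this namespace
```

Pods that already exist are not modified; they pick up the sidecar the next time they are recreated.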

ONAP OOM Plan - Istanbul Priorities

The following diagram depicts ONAP OOM plan and Istanbul priorities:

Note:

Istio 1.10 is used
For Istanbul, the ONAP default IdAM (realized by Keycloak) will be used as the reference implementation/configuration.
Keycloak internal IdP will be used
Multi-tenancy support of external IdAM and IdP will not be configured

ONAP Component mTLS (A): make core ONAP deployable on Service Mesh (mTLS first without AAA): basic_onboarding, basic_vm automated integration tests

Service Mesh compatibility:
- DMaaP Compatibility on service mesh
- Cassandra (and Istio 1.10?)
- AAI is ongoing
  - issue with jobs; trying the "quitquitquit" solution here (cleanly exit the server)
  - issue with sparky-be: an AAF leftover part remains
  - cannot test part needing cassandra for now
- SDC is ongoing
- SO is planned
- DCAE is ongoing
- SDNC is planned
Better use of Service Mesh within components:
- SO: patch ongoing: https://gerrit.onap.org/r/c/so/+/122263
Plan to check whether these components can run without authentication; also check whether the number of components can be enlarged

Communication Protocol between Service and Proxy and between Proxies

The communication protocol between Service and Proxy is "HTTP".
- Service-Mesh Security architecture supports security at the platform level
- Application does not handle security directly
- For that, Service and the corresponding Proxy will sit on the same POD to make their communication internal
The communication protocol between Proxies is mTLS via "HTTPS".
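
Istio's automatic mTLS normally upgrades proxy-to-proxy traffic without extra configuration. If the behavior needs to be stated explicitly on the client side, a DestinationRule can be used. The following is a sketch under the assumption that the components run in a namespace named `onap`; the resource name is hypothetical.

```yaml
# Sketch only: namespace and resource name are placeholders.
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: onap-mtls
  namespace: onap
spec:
  host: "*.onap.svc.cluster.local"   # all services in the assumed "onap" namespace
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL             # client-side proxies use Istio-issued certificates for mTLS
```

The application container still speaks plain HTTP to its local proxy; only the proxy-to-proxy hop is encrypted.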

ONAP Component AAA (B) – stretch goal for Istanbul

onboard roles and realm on Keycloak for tests / reference implementation (use of OIDC / JWT)
add oauth2-proxy to the solution to redirect unauthenticated traffic to the SSO Portal (Keycloak as an example)
add some rules to enforce (AuthorizationPolicy)
add some service accounts (work ongoing)

OAuth2-based Access Token AuthN/AuthZ

The following diagram depicts OAuth2-based Access Token AuthN/AuthZ.

If an access token is used (unless it is a self-contained access token), the Resource Server needs to ask the Authorization Server to validate it.
This can cause chatty communication between the Resource Server and the Authorization Server.
By leveraging the JWT (JSON Web Token), this chatty communication can be reduced.
In ONAP, the JWT is used for authN and authZ of browser client and application client requests; see the following sections.

OAuth2-based JWT AuthN/AuthZ

The following diagram depicts OAuth2-based JWT AuthN/AuthZ.

By leveraging the JWT (JSON Web Token), the chatty communication between the Resource Server and the Authorization Server can be removed.
In ONAP, the JWT is used for authN and authZ of browser client and application client requests; see the following sections.

ONAP Browser Client Request Authentication and Authorization Use Case

The following diagram depicts the browser client request authentication and authorization process.

#	Sequence
0	Configuration steps: (a) Create the private key and public key in Keycloak. (b) Configure the login page at Keycloak. Note: the login page is configurable and centralized, not part of the applications; it is a must for SSO. (c) Configure the Envoy proxy JWT-auth filter in istiod: where to retrieve Keycloak's public key (JWKSUri), or the embedded public key in JWKS format; and the issuer, i.e. who generated the JWT (in this case, Keycloak). (d) Istiod distributes the configuration (e.g., public key, certificate, etc.) to all Envoy proxies.
1	Client sends a request without token
2	Ingress intercepts and forwards the request to Envoy proxy
3	If the request has no token, oauth2-proxy redirects the traffic to Keycloak (reference configuration; an external IdAM could be configured instead) for the login process. If a JWT is given in the request header, oauth2-proxy asks the Envoy proxy to validate the JWT (see step 10). If an authentication cookie is given, oauth2-proxy adds the JWT authentication header matching the cookie and asks the Envoy proxy to validate the JWT (see step 10).
4	Keycloak (with oauth2-proxy help) redirects the request to the login page. Note: this login page is configured at Keycloak, and is provided to the user. This redirection is important because users are completely isolated from applications and applications never see a user's credentials. Applications are instead given an identity token or assertion that is cryptographically signed.
5	Client logs in with user credentials
6	If the user credentials are valid, Keycloak generates a JWT using its private/public key pair (the token is signed with the private key) and sends the generated token to the client
7	Client stores JWT token in local storage
8	Client sends a request with JWT token
9	Ingress intercepts and forwards the request to Envoy proxy
10	Envoy proxy validates JWT token with a public key
11	If the JWT is valid, the Envoy proxy enforces the security policy (e.g., authN, authZ). Once all checks have passed, it sends the request to the service
12	If the JWT is not valid, go to step 3 for the re-login process. If the maximum number of tries is exceeded, error handling will be processed (details are TBD)

Machine Client Request Authentication and Authorization

The following diagram depicts the machine client request authentication and authorization process, by leveraging IdAM and Istio which support JWT.

#	Sequence
0	Configuration steps: (a) Create the private key and public key in Keycloak. (b) Configure the Envoy proxy JWT-auth filter in istiod: where to retrieve Keycloak's public key (JWKSUri), or the embedded public key in JWKS format; and the issuer, i.e. who generated the JWT (in this case, Keycloak). (c) Istiod distributes the configuration (e.g., public key, certificate, etc.) to all Envoy proxies.
1	Client logs in with client credentials towards Keycloak
2	If the client credentials are valid, Keycloak generates a JWT using its private/public key pair (the token is signed with the private key) and sends the generated token to the client
3	Client stores JWT token in local storage
4	Client sends a request with JWT token
5	Ingress intercepts and forwards the request to Envoy proxy
6	Envoy proxy validates JWT token with a public key
7	If the JWT is valid, the Envoy proxy enforces the security policy (e.g., authN, authZ). Once all checks have passed, it sends the request to the service
8	If the JWT is not valid, error handling will be processed (details are TBD)

Request Authentication Configuration

Request authentication policies specify the values needed to validate a JWT. Istio checks the presented token.

If a request carries a token that violates the rules in the request authentication policy, Istio rejects the request
Requests that carry no token are accepted by default; to reject them, authorization policies must be configured to validate the request
If more than one policy matches a workload, Istio combines all rules

See the authentication policy section for details.
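
The behavior described above can be sketched with a RequestAuthentication policy plus an AuthorizationPolicy that rejects requests without a token. This is an illustrative sketch only: the workload label `app: onap-a`, the namespace, the Keycloak host `keycloak.example` and the realm `onap` are assumptions, and the exact Keycloak URL layout depends on the Keycloak version and deployment.

```yaml
# Sketch only: labels, namespace, host and realm are placeholders.
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: onap-jwt
  namespace: onap
spec:
  selector:
    matchLabels:
      app: onap-a
  jwtRules:
  - issuer: "https://keycloak.example/auth/realms/onap"   # must equal the "iss" claim of the JWT
    jwksUri: "https://keycloak.example/auth/realms/onap/protocol/openid-connect/certs"
---
# RequestAuthentication alone still lets token-less requests through;
# this policy only allows requests that carry a valid JWT.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: onap-require-jwt
  namespace: onap
spec:
  selector:
    matchLabels:
      app: onap-a
  action: ALLOW
  rules:
  - from:
    - source:
        requestPrincipals: ["*"]   # any authenticated request principal
```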

Service-2-Service (Peer) mTLS Communication With Istio

The Service-to-Service authN is supported by leveraging mTLS certificates, instead of username/password
Also, authorization will be supported for mTLS by Istio authorization configuration
Two services will communicate using a direct connection (e.g., via REST APIs) without passing through the Ingress Controller or MSB

mTLS Certificate Management

The following diagram depicts the Istio certificate management and interaction with the Proxy (istio-agent and Envoy proxy work together with istiod to automate key and certificate rotation):

#	Sequence (mTLS Certificate Management)
1	When started, Istio agent creates a private key and a CSR (certificate signing request)
2	Istio agent sends the CSR with its credentials to istiod for signing
3	The CA in istiod validates the credentials in the CSR and, upon successful validation, signs the CSR to generate the certificate
4	The CA in istiod sends the signed certificate to istio-agent
5	When a workload is started, Envoy proxy requests the certificate and key from istio-agent via the SDS (secret discovery service) API
6	Istio-agent sends the certificate and private key to Envoy proxy
7	Istio-agent monitors the expiration of the workload certificate. The above process repeats periodically for certificate and key rotation
mTLS Service-2-Service Secure communication with Authentication and Authorization
The following diagram depicts the Service-to-Service mTLS secure communication with Authentication and Authorization:

#	Sequence (mTLS Authentication Flow) - TLSv1_2
1	Client application in a container sends a plain-text HTTP request towards the server
2	Client proxy container intercepts the outbound request
3	Client proxy performs a TLS handshake with the server-side proxy. A. This handshake exchanges client and server certificates. B. Istio has preloaded the certificates into the proxy containers (see diagram A)
4	Client proxy checks for 'secure naming' on the server's certificate, and verifies that the server identity (the service account presented in the server certificate) is authorized to run the target service
5	Client and server establish an mTLS connection
6	Istio forwards the traffic from the client-side proxy to the server-side proxy
7	The server-side proxy authorizes the request. A. If authorized, Istio forwards the traffic to the server application service through local TCP. B. If not authorized, an error condition arises


Authentication Mode
Peer authentication policies specify the mTLS mode that Istio enforces on target workloads:
PERMISSIVE: workloads accept both mTLS and plain text traffic
STRICT: workloads accept only mTLS traffic
DISABLE: mTLS is disabled
Permissive mode allows a service to accept both plaintext traffic and mutual TLS traffic at the same time. During the transition period, permissive mode would be allowed so that non-Istio clients and non-Istio servers can keep communicating. Once ONAP components are Service-Mesh enabled, permissive mode could be disabled.
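
For the transition period described above, a namespace-wide permissive policy could be sketched as follows. The namespace `onap` is an assumption; a PeerAuthentication without a selector applies to every workload in its namespace.

```yaml
# Sketch only: the namespace name is a placeholder.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: onap
spec:
  mtls:
    mode: PERMISSIVE   # accept both mTLS and plain text while components are being migrated
```

Switching `mode` to `STRICT` later, or adding workload-level policies with a selector, tightens the mesh step by step without touching the applications.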
Authentication Policy Configuration
Note: during Istanbul, peer authentication will be configured first (high priority). Then, OOM will visit the request authentication.
Authentication Policies are configured in .yaml files, and are saved in the Istio configuration storage once deployed. The Istio controller watches the configuration storage. Upon any policy changes, Istio sends configurations to the targeted endpoints (e.g., Envoy Proxy). Once the proxy receives the configuration, the new authentication requirement takes effect immediately on that pod.
For request authentication, the client application is responsible for acquiring and attaching the JWT credential to the request.
For peer authentication, Istio automatically upgrades all traffic between two PEPs to mTLS.
If authentication policies disable mTLS mode, Istio continues to use plain text between PEPs. In ONAP Istanbul, mTLS mode will be enabled; any use case that requires mTLS to be disabled will be reviewed as an exception request.
Example:
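apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: "onap-peer-policy"
  namespace: "onap_x"    # placeholder; real namespace names cannot contain underscores
spec:
  selector:
    matchLabels:
      app: onap_A        # placeholder workload label
  mtls:
    mode: STRICT         # this workload accepts mTLS traffic only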
Authorization Policy Configuration
Note: during Istanbul, service-to-service (workload-to-workload) authorization will be configured first (high priority). Then, OOM will visit end-user-to-service (workload) authorization.
The authorization policy enforces access control to the inbound traffic in the server side Envoy proxy. Each Envoy proxy runs an authorization engine that authorizes requests at runtime.
When a request comes to the proxy, the authorization engine evaluates the request context against the current authorization policies, and returns the authorization result, either ALLOW or DENY.
Istio authorization policies are configured using .yaml files.
<source: https://istio.io/latest/docs/concepts/security/#authentication-policies>
Authorization policies support ALLOW, DENY and CUSTOM actions. The following diagram depicts the policy precedence.
CUSTOM → DENY → ALLOW
In ONAP Istanbul, DENY and ALLOW will be configured first, as coarse-grained authorization. Then, the CUSTOM action would be considered for fine-grained authorization in the future (as time allows).

<source: https://istio.io/latest/docs/concepts/security/#authentication-policies>
Example:
<source: https://istio.io/latest/docs/concepts/security/#authentication-policies>
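
A coarse-grained service-to-service ALLOW policy of the kind described above might look like the following. This is a sketch, not an ONAP policy: the workload label `app: aai`, the service account `so`, the namespace `onap` and the path prefix are hypothetical, and the source principal is only available when the peers communicate over mTLS.

```yaml
# Sketch only: labels, service account, namespace and paths are placeholders.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-so-to-aai
  namespace: onap
spec:
  selector:
    matchLabels:
      app: aai                  # server-side workload being protected
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/onap/sa/so"]   # caller identity taken from its mTLS certificate
    to:
    - operation:
        methods: ["GET", "PUT"]
        paths: ["/aai/*"]       # prefix match
```

Once any ALLOW policy selects a workload, requests that match none of its ALLOW rules are denied, so the policy acts as an allow-list for that workload.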
Role-Based Access Control
ONAP Istanbul release supports Role-Based Access Control (RBAC). For that, roles (e.g., Admin, user, manager, etc.) will be defined, and mapping between users/groups and roles (m:n) will be set by leveraging Keycloak and Istiod.
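
One possible way to enforce such roles at the proxy is an authorization policy keyed on JWT claims. The sketch below assumes that Keycloak is configured to emit a flat `groups` claim and that a request authentication policy already validates the JWT for the workload; the label `app: sdc-fe`, the namespace and the `designer` value are hypothetical.

```yaml
# Sketch only: labels, namespace, claim name and role value are placeholders.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: sdc-designer-only
  namespace: onap
spec:
  selector:
    matchLabels:
      app: sdc-fe
  action: ALLOW
  rules:
  - from:
    - source:
        requestPrincipals: ["*"]         # request must carry a validated JWT
    to:
    - operation:
        methods: ["POST", "PUT", "DELETE"]
    when:
    - key: request.auth.claims[groups]   # assumed flat claim holding the user's roles/groups
      values: ["designer"]
```

The user-to-role mapping itself stays in Keycloak; the policy only checks what the token asserts.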
ONAP IdAM and External IdAM Interaction
The following diagram depicts the ONAP IdAM and External IdAM Interaction.

#	Choice
1	If the vendor chooses their own IdAM (external IdAM), the AuthN/AuthZ GW should be able to use the external IdAM instead of the ONAP IdAM (full integration choice). A. The external IdAM will handle AuthN/AuthZ for ONAP incoming traffic, including session cookie handling and session handling. B. The external IdAM will typically use its own IdP (external IdP) to verify its own user identities
2	If the ONAP IdAM is used, it could, based on business use cases, interface with the external IdAM through an IdAM-to-IdAM interface configuration. A. SSO. B. Authentication federation (SAML), which is not part of Istanbul
ONAP Security Components, IdAM External IdP(s), and Multi-Tenancy Support
Each service provider will likely have existing centralized IdP that ONAP should integrate with. This architecture supports that integration.
ONAP IdAM proxies, via a plugin, to one or more designated external IdP servers
ONAP IdAM must support fallback mode, where it can act as IdP for cases where external IdP is unavailable
Multi-Tenancy allows the same username inside different tenants; there is no SSO between different tenants
Realms will be defined to support multi-tenancy, by leveraging Keycloak.
Realms are isolated from one another and can only manage and authenticate the users that they control.
Single External IdP Use Case
The following diagram depicts the Multi-Tenancy Support with a single External IdP.

#	Sequence (Single External IdP) - A
1	If ONAP IdAM is connected to a single external IdP, the IdP contains the credentials of all the users from all the tenants. A. AuthN/AuthZ requests including username, password and tenant will be forwarded from ONAP IdAM to the external IdP. B. The external IdP performs the authN/authZ for the user in the tenant realm. C. The external IdP returns the AuthN/AuthZ response
Multiple IdPs Use Case
The following diagram depicts the Multi-Tenancy Support with multiple External IdPs. Each tenant has its own external IdP:

#	Sequence (Multiple IdPs (each tenant has its own external IdP)) - B
2	If each tenant has its own external IdP, the IdP contains the credentials for its tenant users. A. Based on the tenant info in the request, ONAP IdAM selects the correct external IdP. B. ONAP IdAM forwards the request to the selected external IdP. C. The selected external IdP performs AuthN/AuthZ. D. The selected external IdP returns the AuthN/AuthZ response

ONAP Multi-Tenancy Support with Istio
Objective: ONAP components implement the right access controls for different end users/tenants to:
Build & distribute the SDC artifacts
Build & distribute CDS blueprints
Build & distribute custom operational SO workflows
Build & distribute Policies
Build & distribute Control loops
Access / manage the AAI inventory data for objects they own
Publish & consume messages on DMaaP / Kafka topics they have access to
Build, deploy & manage DCAE collectors & DCAE analytics applications
Build, deploy & manage CDS executors
Approach & State:
Single logical instance, with Multi-Tenancy built-in:
SDC: partial implementation through user roles; RBAC for viewing, authoring, updating and distributing
A&AI: current support is not extensive enough to allow fine-grained control; permissions need to be set per owning group / user, not only per role
SO: different groups of people are allowed to orchestrate/view/change different instances
DMaaP: separate message bus topics based on consumers/producers
Several logical instances, where deployment/configuration for those instances can be handled independently:
Controllers (SDNC):
Collectors (DCAE): it can be deployed and controlled by the tenants, based on the roles
Policy executors (Policy): centralize PAP, Policy API; distribute PDPs by namespace
Istio Tenancy Model
Namespace tenancy: a cluster can be shared across multiple teams, each using a different namespace
Cluster tenancy: using clusters as a unit of tenancy
Mesh tenancy: each mesh can be used as the unit of isolation
<source: https://istio.io/latest/docs/ops/deployment/deployment-models/>
An ONAP option: Istio supports soft multi-tenancy with multiple Istio control planes
One control plane and one mesh per tenant
The cluster admin gets control and visibility across all the Istio control planes
The tenant admin gets control of a specific Istio instance
Separation between the tenants is provided by Kubernetes namespaces and RBAC (define Role and RoleBinding)
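
The Role and RoleBinding mentioned above could be sketched as follows, giving a tenant admin control over Istio resources in that tenant's namespace only. The namespace `tenant-a`, the user `tenant-a-admin` and the resource names are hypothetical.

```yaml
# Sketch only: namespace, user and resource names are placeholders.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-a-istio-admin
  namespace: tenant-a
rules:
- apiGroups: ["security.istio.io", "networking.istio.io"]
  resources: ["*"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-a-istio-admin-binding
  namespace: tenant-a
subjects:
- kind: User
  name: tenant-a-admin
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-a-istio-admin
  apiGroup: rbac.authorization.k8s.io
```

Because both objects are namespaced, the tenant admin has no rights in other tenants' namespaces; the cluster admin would instead hold a ClusterRole.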
ONAP Logging
Logging Propositions
Treat logs as event streams
An app should not concern itself with routing or storage of its output stream
Each running process writes its event stream, unbuffered, to stdout or stderr
Each process stream is:
captured by execution environment,
collated together with other streams from the app
routed to one or more final destinations for viewing and long-term archival
Archival destinations should not be visible to or configurable by the app (separation of concerns, security reasons)
Container logging is different
When pods are evicted, crashed, deleted or scheduled on a different node, the logs from the containers are gone.
Transferring transient local log data in the containers to the centralized (or even distributed) long-term log storage is a must.
Logging Types in Kubernetes
Kubernetes Container Logging: container logs are generated by containerized applications. The typical way to capture container logs is to use STDOUT and STDERR.
Kubernetes Node Logging: container logs are streamed by the container engine to a logging driver, which can be realized by an open-source tool such as Fluent Bit or Fluentd. Node-level logs could also be kernel logs or systemd logs
Kubernetes Cluster Logging: it refers to the logs of Kubernetes itself and all of its system components. Kubernetes does not provide a native solution for cluster-level logging, but various node-level logging agents or sidecar containers can be used, as defined in the following ONAP logging functional architecture.
Kubernetes Events Logging: events hold information about resource state changes, errors related to pod eviction, or decisions made by the scheduler.
Kubernetes Audit Logging: detailed descriptions of each call made to the kube-apiserver. It shows chronological activity event sequence.
Logging Scope
The ONAP application logging will be supported. Other logging scopes are under discussion.
ONAP Application Logging
Security Component Logging (e.g., Keycloak, IdP, external IdAM, external IdP)
Infrastructure Logging (e.g., Kubernetes logging)
xNF Logging (e.g., vFW, VNF, CNF)
Collating across logging scopes, as e2e tracing, is under discussion, but it is out of scope for Istanbul. It is also uncertain whether this can be achieved in a near-future release.
ONAP Logging Functional Architecture
The following diagram depicts ONAP logging functional architecture:

ONAP supports open-source- and standard-based Logging architecture.
Once all ONAP components push their logs into STDOUT/STDERR, any standard log pipe can work.
The logging component stack can be realized by the vendor's choice of components
ONAP provides a reference implementation/choice
Note: e.g., for apps that use a log4j/slf4j properties file, the stdout and stderr redirection can be configured as follows:
log4j.appender.stdout = org.apache.log4j.ConsoleAppender
log4j.appender.stdout.Threshold = TRACE
log4j.appender.stdout.Target = System.out
log4j.appender.stderr = org.apache.log4j.ConsoleAppender
log4j.appender.stderr.Threshold = WARN
log4j.appender.stderr.Target = System.err
<log4j:configuration>
  <appender name="stderr" class="org.apache.log4j.ConsoleAppender">
    <param name="threshold" value="warn" />
    <param name="target" value="System.err"/>
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%-5p %d [%t][%F:%L] : %m%n" />
    </layout>
  </appender>
  <appender name="stdout" class="org.apache.log4j.ConsoleAppender">
    <param name="threshold" value="debug" />
    <param name="target" value="System.out"/>
    <layout class="org.apache.log4j.PatternLayout">
      <param name="ConversionPattern" value="%-5p %d [%t][%F:%L] : %m%n" />
    </layout>
    <filter class="org.apache.log4j.varia.LevelRangeFilter">
      <param name="LevelMin" value="debug" />
      <param name="LevelMax" value="info" />
    </filter>
  </appender>
  <root>
    <priority value="debug"></priority>
    <appender-ref ref="stderr" />
    <appender-ref ref="stdout" />
  </root>
</log4j:configuration>
The stdout or stderr streams will be picked up by the kubelet service running on that node and end up in the PV (PVC).
ONAP logs will be exported to a different, centralized location for security, persistence and aggregation reasons
Log collector sends logs to the aggregator in a different container
Aggregator sends logs to the centralized database in a different container
Logging Functional Blocks:
Collector (one per K8S node)
Aggregator (few per K8S cluster)
Database (one per K8S cluster)
Visualization (one per K8S cluster)
ONAP reference implementation choice:
EFK: Elastic Search, Fluentd, Fluentbit, Kibana
LFG: Loki, Grafana, Fluentd / Fluentbit
ONAP logging conforms to SECCOM Container Logging requirements
Event Types
Log Data
Log Management
ONAP Reference Implementation
The above diagram also indicates reference realization, such as:
As a log driver, Fluent Bit is deployed per node as a collector and forwarder of log data to the Fluentd instance
Instead of doing heavy log transformations, just forward logs securely to the Fluentd Aggregator cluster
FluentBit (node-level logging agent) needs to be run on every node to collect logs from every POD, so FluentBit is deployed as a DaemonSet (a POD that runs on every node of the cluster)
When FluentBit runs, it will read, parse and filter the logs of every POD and could enrich each entry with the following information (metadata):
POD Name
POD ID
Container Name
Container ID
Labels
Annotations
To obtain this information, a built-in filter plugin called kubernetes talks to the Kubernetes API server to retrieve relevant information such as the pod_id, labels and annotations; other fields such as pod_name, container_id and container_name are retrieved locally from the log file names. All of this is handled automatically; no intervention is required from a configuration aspect.
A Fluentd instance (or a few Fluentd instances) is deployed per cluster as an aggregator - processing the data and routing it to a different destination (e.g., Elastic Search)
It can handle heavy throughput - aggregating from multiple inputs, processing data and routing to different outputs. Also, it can handle a much larger amount of input and output sources
centralized way to transform data; normalize log data and collect additional info from containers
e.g., filter_record_transformer built-in filter plugin

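<filter foo.bar>
  @type record_transformer
  remove_keys hostname,$.kubernetes.pod_id
</filter>
# a <record> block such as the following is placed inside a record_transformer <filter>;
# its ${...} Ruby expression requires enable_ruby on that filter
<record>
  container ${id = tag.split('.')[5]; JSON.parse(IO.read("/var/lib/docker/containers/#{id}/config.v2.json"))["Name"][1..-1]}
  hostname "#{Socket.gethostname}"
</record>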
or other custom plugins as needed
routing logs to ElasticSearch or equivalent component(s)
Note: for IoT (edge host/equipment), FluentBit could be installed per device, sending data to a Fluentd instance
Elastic Search provides centralized log data indexing and storage
Kibana / Grafana could be used for log data visualization.
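
The node-level Fluent Bit collector described above is typically deployed as a DaemonSet. The following is a minimal sketch, not the ONAP chart: the namespace `logging`, the service account `fluent-bit` and the image tag are assumptions, and the Fluent Bit configuration itself (inputs, kubernetes filter, forward output to Fluentd) is omitted.

```yaml
# Sketch only: namespace, service account and image tag are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluent-bit
  namespace: logging
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit   # needs read access to pod metadata for the kubernetes filter
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.8   # tag is an assumption
        volumeMounts:
        - name: varlog
          mountPath: /var/log          # container log files written by the container runtime
          readOnly: true
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
```

With a Docker runtime, the directory holding the container log files (commonly /var/lib/docker/containers) usually has to be mounted as well, because /var/log/containers only contains symlinks.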

FluentBit Internal Pipeline

Logging Generation and Collection Architecture
The following diagram depicts various ways of log collection.
By leveraging various collection agents and configurations, various types of log data will be collected and aggregated to be stored in the centralized persistent storage. However, it is not clear how heterogeneous log data would be collated. In many cases, ONAP cannot control the contents / formats of external log data in order to extract hints/attributes for collation.

xNF logs will be collected via the VNF collection agent (collector), and will be sent to the Aggregator.
Application (ONAP component) logs will be collected via the application collection agent (collector), and will be sent to the Aggregator.
Infrastructure logs will be collected via syslog collection agent (collector), and will be sent to the Aggregator.
The aggregator can process/normalize log data and send it to the centralized database. Note that there could be one or multiple databases, based on log data types. This would be the system provider's deployment choice.
Visualization can be centralized or distributed based on log data types.
System Component Logs
Similar to the container logs, system component logs in the /var/log directory should be rotated once the size exceeds the maximum.
Cluster-Level Logging Requirements and Options (preferred option)
ONAP needs to support Cluster-Level Logging:
Allow users to access application logs even if a container crashes, a pod gets evicted or a node dies
Require a separate backend to store, analyze and query logs
Write to standard output and standard error streams
E.g., a container engine handles and redirects any output that a containerized application generates on its STDOUT and STDERR streams (e.g., /dev/stdout, /dev/stderr)
Support multi-tenancy logging
Cluster-level Logging Architecture (A) - ONAP Reference implementation uses this option
Use a node-level logging agent (e.g., DaemonSet) that runs on every node
the logging agent has access to a directory with log files from all of the application containers on that node
one agent per node is used
~~Include a dedicated sidecar container for logging in an application pod~~
A container engine handles and redirects any output generated to a containerized application's stdout and stderr.
Push logs directly to a backend from within an application
Note:
node-level logging creates only one agent per node and does not require any changes to the applications running on the node
Prerequisite: Application containers write to STDOUT and STDERR
A node-level agent collects these logs and forwards them for aggregation
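The node-level agent described above could be deployed roughly as follows. This is a minimal sketch, not the ONAP reference deployment: the namespace `logging`, the service account `fluent-bit`, the ConfigMap `fluent-bit-config` and the image tag are all assumptions chosen for illustration.

```yaml
# Sketch only: names, namespace and image tag are hypothetical.
apiVersion: apps/v1
kind: DaemonSet            # one agent pod is scheduled on every node
metadata:
  name: fluent-bit
  namespace: logging
  labels:
    app: fluent-bit
spec:
  selector:
    matchLabels:
      app: fluent-bit
  template:
    metadata:
      labels:
        app: fluent-bit
    spec:
      serviceAccountName: fluent-bit   # needs read access to pod metadata for enrichment
      containers:
      - name: fluent-bit
        image: fluent/fluent-bit:1.8   # assumed tag
        volumeMounts:
        - name: varlog                 # container log files written by the container engine
          mountPath: /var/log
          readOnly: true
        - name: containers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: config                 # agent configuration (inputs, filters, outputs)
          mountPath: /fluent-bit/etc/
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: containers
        hostPath:
          path: /var/lib/docker/containers
      - name: config
        configMap:
          name: fluent-bit-config
```

Because the agent only reads the host log directories, the application pods themselves need no change, which is the main reason this option is preferred.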
Streaming Sidecar Container with the Logging Agent Architecture (B)
Allows users to separate log streams (sidecar) from applications, which can lack support for writing to STDOUT or STDERR
The sidecar container runs a logging agent, which is configured to pick up logs from an application container
It allows several log streams from different parts of an application to be separated
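As an illustration of the streaming sidecar pattern, the sketch below shows an application that can only write to a log file, with a sidecar that re-emits that file on its own STDOUT so that a node-level agent can pick it up. The pod name, container names, images and file path are hypothetical.

```yaml
# Sketch only: names, images and paths are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: app-with-log-sidecar
spec:
  containers:
  - name: app                          # application that writes to a file, not to STDOUT
    image: busybox
    args:
    - /bin/sh
    - -c
    - 'while true; do echo "$(date) INFO application event" >> /var/log/app/app.log; sleep 5; done'
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  - name: log-streamer                 # sidecar: streams the file to its own STDOUT
    image: busybox
    args: [/bin/sh, -c, 'tail -n+1 -F /var/log/app/app.log']
    volumeMounts:
    - name: app-logs
      mountPath: /var/log/app
  volumes:
  - name: app-logs                     # shared between the two containers
    emptyDir: {}
```

One sidecar per log file keeps the streams separate, at the cost of extra containers per pod.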
Pod-Level Logging Agent Architecture (C)
If the node-level logging agent is not flexible, write an application-specific sidecar

POD Sidecar Architecture
Why Sidecar Logging, instead of K8S Cluster Level Logging
Not all containers in K8S write logs to stdout, although it is recommended. Placing a log sidecar next to the application container facilitates uniform logging and distribution.
Note: In ONAP, the POD sidecar is not used because each ONAP application generates app logs to STDOUT or STDERR; i.e., no need to use the POD sidecar.

For the Log sidecar, define three major tags:
Source - define the file details to monitor/lookup and to set format to look out for
Filter - customize the event collected overwriting fields or adding fields
write custom scripts to transform log formats for uniform log contents or additional fields from containers
Match - define what to do with the matching data/log events and where to move

Logging Type Support
Application logging
Platform-level logging, including Kubernetes
Need to define traceability support scope (e.g., service, environment)
Can ONAP (or does it want to) trace external components and infrastructure logs?
Security component logging, such as IdAM, IdP
Transaction logging for tracing for specified collation id/request id (e.g., MDC-based logging)
Analysis of ONAP Application Logging Specification
Existing ONAP Application Logging Specification
Wiki page: ONAP Application Logging Specification v1.3 (Frankfurt)
LFN DDF June 2021 Presentation
The ONAP Next Generation Security and Logging Architecture, Design and Roadmap was presented at the LFN DDF in June 2021.
Please visit https://wiki.lfnetworking.org/display/LN/2021-06-08+-+ONAP%3A+Next+Generation+Security+and+Logging+Architecture%2C+Design+and+Roadmap for the presentation slide deck and recordings.
Note: this wiki page contains several updates on top of the presented materials, so it is the up-to-date version.
References
ONAP Next Generation Security Architecture, https://lf-onap.atlassian.net/wiki/pages/viewpage.action?pageId=16472429
Istio Architecture, https://istio.io/latest/docs/ops/deployment/architecture/
Istio Security, https://istio.io/latest/docs/concepts/security/#authentication-policies
Service Mesh Risk Analysis, https://lf-onap.atlassian.net/wiki/display/DW/Service+Mesh+Risk+Analysis
Service Mesh Impact on Project, https://lf-onap.atlassian.net/wiki/display/DW/Service+Mesh+Impact+on+Projects
ONAP Security Model, https://lf-onap.atlassian.net/wiki/display/DW/ONAP+Security+Model
Logging Architecture Proposal, https://lf-onap.atlassian.net/wiki/display/DW/Logging+Architecture+Proposal
Logging Enhancements Project proposal, https://lf-onap.atlassian.net/wiki/display/DW/Logging+Enhancements+Project+Proposal
SECCOM Container Logging Requirements, https://lf-onap.atlassian.net/wiki/display/DW/CNF+2021+Meeting+Minutes?focusedTaskId=239&preview=/93004634/100896899/2021-02-22_LoggingRequirementEvents_v8.pptx

Overview: ONAP Next Generation Security & Logging Architecture, Design and Roadmap
Provided Rationale for Service-Mesh Security: uniform, open-source- and standard-based security at the platform-level
Istio was chosen for ONAP Service-Mesh Security to fulfill ONAP security requirements
Defined ONAP Security Architecture, leveraging Istio, Keycloak, Cert-Mgr, Ingress, Egress
Multi-Tenancy architecture and plan were addressed
OOM team defined the service-mesh implementation priority for Istanbul; service-to-service mTLS first and then, AAA (AuthN, AuthZ, Audit)
Currently, collecting feedback from NSA
Implementation Plan for Istanbul:
Phase 1: OOM will contribute this mTLS with Service-Mesh (contribution target: SDC, DCAE, SO, AAI, DMaaP, SDNC)
make selected components Service-Mesh compatible
Configure service-to-service mTLS communication
Configure peer authentication policies
Configure authorization policy for coarse-grained authorization
Phase 2: OOM plans to contribute on AAA, Multi-Tenancy and logging (stretch goals)
Deploy/Configure Keycloak as the reference IdAM
Configure Istiod for public key and JWT handling
Configure request authentication policies
Configure additional authorization policies
Migration from AAF to Service-Mesh Security
Why not ONAP AAF (Application AuthZ Framework)
It is a solution for ONAP, which is maintained by ONAP only
AAF CADI plugin supports Java only, where some ONAP components are written in other languages (Python, etc.)
Clients in various languages could have language specific restrictions
AAF needs to keep up with the latest security technology evolution, but it has no active support in ONAP
AAF is not widely used by ONAP projects due to:
Integration/enforcement of AuthN and AuthZ are done by each project
Mutual-TLS enablement & Secrets/certificate management are done by each project
For encryption, each project needs to understand AAF certificate management mechanism
Compatibility with variety of external authentication systems is a challenge
Third-party microservices require modification to work with AAF
SSO and Multi-factor authentication support is questionable
We want to achieve simpler, uniform, open source- and standard-based Security (Service Mesh-pattern Security):
Eliminate or reduce microservice infrastructure level security tasks in services
Security is handled at the platform level
Enabling uniform security across applications
Minimize code changes and increase stability
Platform-agnostic abstraction reduces risks on integration, extensibility and customization
Use of Sidecar, so security layer is transparent for the application
External Interfaces secured at Ingress Gateway
Internal Interfaces secured at Service Mesh Sidecar
User Management and RBAC is managed by open-source technologies
Benefits to ONAP, O-RAN and External Components
The uniform, open-source-based and standard-based security would be a foundation for secure integration between ONAP / O-RAN and external components
Product impacts for security would be minimal
Ensure ONAP services have only the business logic of that service
Third-party microservices can participate in ONAP security by leveraging the platform-level security (sidecar)
Vendors can take advantage of open-source-based security, instead of having and maintaining their own security mechanism: cost effective
Service Mesh Security with Istio
The following diagram depicts the Istio mesh architecture overview:
Why Istio?
Istio was chosen because it fulfills the current ONAP security requirements/needs with the following capabilities and also provides a future-proof mechanism.
However, please note that a new platform could be considered in the future as needed.
Istio authentication and authorization:
Provides each service with a strong identity representing its role to enable interoperability across clusters and clouds.
Secures service-to-service communication with authentication and authorization.
Provides a key management system to automate key and certificate generation, distribution, and rotation.
Microservices have security needs:
To defend against man-in-the-middle attacks, they need traffic encryption.
To provide flexible service access control, they need mTLS (mutual TLS) and fine-grained access policies.
To determine who did what at what time, they need auditing tools.
With Control and Data planes, mTLS, Ingress, Proxy and Egress, Istio provides security features for:
Strong identity
Powerful policy
Transparent TLS encryption
Authentication, Authorization and Audit (AAA) tools to protect your services and data
Goals of Istio-based security are:
Security by default: no changes needed to application code and infrastructure
Defense in depth: integrate with existing security systems to provide multiple layers of defense
Zero-trust network: build security solutions on distrusted networks
ONAP Security Architecture - Leveraging Istio, Keycloak, Cert-Mgr, Ingress, Egress
The following diagram depicts ONAP Security Architecture.

Security Functional Blocks
Ingress Controller (realized by istio-ingress)
AuthN/AuthZ Gateway (sidecar) – future consideration
ONAP IdAM (realized by Keycloak)
User Administration
Tenant Administration
Proxy (sidecar) – for data plane
oauth2-proxy
External IdP (vendor-provided component)
External IdAM
Egress
Istiod – control plane (pilot, citadel, galley, cert-mgr)
Certificate Authority (CA)
Security Architecture Highlights
Use of control plane and pluggable sidecar (platform container) proxies to protect application service
Intercept traffic and check AuthN/AuthZ with IdAM and IdP
Support SSO for Human user access request
Compatibility with variety of external authN/authZ systems (e.g., external IdP, IdAM)
Allowing external IdP(s) and IdAM(s)
Enabling a foundation for authN federation
RBAC support, based on user role, group & permission
Multi-Tenancy support
ONAP Component Security Requirements
The following are the ONAP component security requirements.
# Requirement
1 No implicit dependencies between common components and ONAP
3 istio-ingress is used as ingress controller
4 Separate Northbound (for user) and Southbound (xNF) interfaces
5 Every ingress gateway terminates the TLS and re-encrypts the payload before sending to the destination component using mTLS
6 ISTIO network policy must be configured for only authorized services communications
7 AuthN between services is done using certs via mTLS
8 OpenID Connect is used to authenticate user
9 ONAP IdAM is realized by Keycloak, but it can be replaced with an external IdAM that is compatible with OIDC; authentication federation is out-of-scope from Istanbul
10 Cert-manager and Citadel are used to manage (CRUDQ) certificates
11 Kubernetes is configured to use encryption at rest
12 ISTIO automated sidecar injection is configured in underlying Kubernetes
13 No root pods are used
14 All DBs are considered external
16 Support possible options to integrate with LDAP, Kerberos, AAF as IdP
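Two of the requirements above (automated sidecar injection and TLS termination at the ingress gateway) map directly onto configuration. The following is a minimal sketch under stated assumptions: the namespace `onap`, the gateway name, the host `onap.example.org` and the secret `onap-ingress-cert` are hypothetical placeholders.

```yaml
# Requirement 12: enable automatic sidecar injection for a namespace.
apiVersion: v1
kind: Namespace
metadata:
  name: onap                            # hypothetical namespace
  labels:
    istio-injection: enabled
---
# Requirement 5: the ingress gateway terminates external TLS.
# Traffic from the gateway to in-mesh workloads is then sent by its proxy over mTLS.
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: onap-northbound-gateway         # hypothetical
  namespace: onap
spec:
  selector:
    istio: ingressgateway               # default label of the Istio ingress gateway workload
  servers:
  - port:
      number: 443
      name: https
      protocol: HTTPS
    tls:
      mode: SIMPLE                      # server-side TLS towards external clients
      credentialName: onap-ingress-cert # hypothetical TLS secret, located in the gateway workload's namespace
    hosts:
    - "onap.example.org"
```

A separate Gateway (or separate gateway deployment) could be defined in the same way for the southbound (xNF) interface, in line with requirement 4.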

ONAP OOM Plan - Istanbul Priorities
The following diagram depicts ONAP OOM plan and Istanbul priorities:
note:
Istio 1.10 is used
for Istanbul, the ONAP default IdAM (realized by Keycloak) will be used as the reference implementation/configuration.
Keycloak internal IdP will be used
Multi-tenancy support of external IdAM and IdP will not be configured

ONAP Component mTLS (A): make core ONAP deployable on Service Mesh (mTLS first without AAA): basic_onboarding, basic_vm automated integration tests
Service Mesh compatibility:
DMaaP Compatibility on service mesh
Cassandra (and Istio 1.10?)
AAI is ongoing
issue on jobs; trying the "quitquitquit" solution here (cleanly exit the server)
issue on sparky be: there's an aaf leftover part
cannot test part needing cassandra for now
SDC is ongoing
SO is planned
DCAE is ongoing
SDNC is planned
Better use of Service Mesh within components:
SO: patch ongoing: https://gerrit.onap.org/r/c/so/+/122263
Plan to check if we can run without authentication on these components; also check whether the number of components can be enlarged
Communication Protocol between Service and Proxy and between Proxies
The communication protocol between Service and Proxy is "HTTP".
Service-Mesh Security architecture supports security at the platform level
Application does not handle security directly
For that, Service and the corresponding Proxy will sit on the same POD to make their communication internal
The communication protocol between Proxies is mTLS via "HTTPS".

ONAP Component AAA (B) – stretch goal for Istanbul
onboard roles and realm on Keycloak for tests / reference implementation (use of OIDC / JWT)
add oauth2-proxy in the solution to redirect unauthenticated traffic to SSO Portal (keycloak as example)
add some rules to enforce (AuthorizationPolicy)
add some service accounts (work ongoing)
OAuth2-based Access Token AuthN/AuthZ
The following diagram depicts OAuth2-based Access Token AuthN/AuthZ.
If an access token is used (unless it is a self-contained access token), the Resource Server needs to ask the Authorization Server to validate the access token.
This can cause chatty communication between the Resource Server and the Authorization Server.
By leveraging the JWT (JSON Web Token), the above chatty communication can be reduced.
In ONAP, the JWT is used for the browser client and application client request authN and authZ - see the following

OAuth2-based JWT AuthN/AuthZ
The following diagram depicts OAuth2-based JWT AuthN/AuthZ.
By leveraging the JWT (JSON Web Token), the chatty communication between the Resource Server and the Authorization Server can be removed.
In ONAP, the JWT is used for the browser client and application client request authN and authZ - see the following
ONAP Browser Client Request Authentication and Authorization Use Case
The following diagram depicts the browser client request authentication and authorization process.

# Sequence
0
Configuration steps:
Create private key and public key in Keycloak
Configure login page at Keycloak. Note: login page is configurable and centralized, not part of applications. It is a must for SSO.
Configure Envoy Proxy in istiod
JWT-auth filter
where to retrieve Keycloak's public key (JWKSUri) or embed the public key in JWKS format
Issuer: who has generated the JWT token (in this case, Keycloak)
Istiod distributes the configuration to all the Envoy proxies (e.g., public key, certificate, etc.)
1 Client sends a request without token
2 Ingress intercepts and forwards the request to Envoy proxy
3
If the request has no token, oauth2-proxy redirects the traffic to Keycloak (reference configuration; could configure to use external IdAM) for the login process
If a JWT is given in the request header, oauth2-proxy asks the Envoy proxy for JWT token validation (for step 10)
If an authentication cookie is given, oauth2-proxy adds the JWT authentication header matching the cookie, and it asks the Envoy proxy for JWT token validation (for step 10)
4
Keycloak (with oauth2-proxy help) redirects the request to the login page.
Note: this login page is configured at Keycloak, and is provided to the user. This redirection is important because users are completely isolated from applications and applications never see a user's credentials. Applications instead are given an identity token or assertion that is cryptographically signed.
5 Client logs in with user credentials
6 If the user credentials are valid, Keycloak generates a JWT token with the private/public key pair and sends the generated token to the client
7 Client stores JWT token in local storage
8 Client sends a request with JWT token
9 Ingress intercepts and forwards the request to Envoy proxy
10 Envoy proxy validates JWT token with a public key
11 If JWT is valid, it enforces the security policy (e.g., authN, authZ). Once all is passed, it sends the request to service
12 If JWT is not valid, go to step 3 for the re-login process. If the maximum number of tries is exceeded, error handling will be processed (details are TBD)
Machine Client Request Authentication and Authorization
The following diagram depicts the machine client request authentication and authorization process, by leveraging IdAM and Istio which support JWT.

# Sequence
0
Configuration steps:
Create private key and public key in Keycloak
Configure Envoy Proxy in istiod
JWT-auth filter
where to retrieve Keycloak's public key (JWKSUri) or embed the public key in JWKS format
Issuer: who has generated the JWT token (in this case, Keycloak)
Istiod distributes the configuration to all the Envoy proxies (e.g., public key, certificate, etc.)
1 Client logs in with client credentials towards Keycloak
2 If the client credentials are valid, Keycloak generates a JWT token with the private/public key pair and sends the generated token to the client
3 Client stores JWT token in local storage
4 Client sends a request with JWT token
5 Ingress intercepts and forwards the request to Envoy proxy
6 Envoy proxy validates JWT token with a public key
7 If JWT is valid, it enforces the security policy (e.g., authN, authZ). Once all is passed, it sends the request to service
8 If JWT is not valid, error handling will be processed (details are TBD)
Request Authentication Configuration
Request authentication policies specify the values needed to validate a JWT. Istio checks the presented token.
If a request presents a token that violates the rules in the request authentication policy, Istio rejects the request as carrying an invalid token
When requests carry no token, they are accepted by default; authorization policies are needed to reject requests without tokens
If more than one policy matches a workload, Istio combines all rules
See the authentication policy section for details.
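The behavior described above can be sketched with two policies. The first tells Istio how to validate JWTs issued by Keycloak; the second rejects requests that carry no valid token, which the request authentication policy alone would let through. The namespace `onap`, the label `app: onap-a`, the Keycloak host name and the realm `onap` are assumptions for illustration; the URL layout follows Keycloak's `/auth/realms/<realm>` convention.

```yaml
# Sketch only: names, labels, host and realm are hypothetical.
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: onap-jwt
  namespace: onap
spec:
  selector:
    matchLabels:
      app: onap-a
  jwtRules:
  - issuer: "https://keycloak.example.org/auth/realms/onap"    # who issued the JWT
    jwksUri: "https://keycloak.example.org/auth/realms/onap/protocol/openid-connect/certs"  # where to fetch the public key
---
# Without this policy, requests that carry no token at all are still accepted.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: onap-require-jwt
  namespace: onap
spec:
  selector:
    matchLabels:
      app: onap-a
  action: ALLOW
  rules:
  - from:
    - source:
        requestPrincipals: ["*"]        # any authenticated JWT principal
```

Note that the second policy applies to every caller of the selected workload, so service-to-service callers without a JWT would need their own ALLOW rule.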
Service-2-Service (Peer) mTLS Communication With Istio
The Service-to-Service authN is supported by leveraging mTLS certificates, instead of username/password
Also, authorization will be supported for mTLS by Istio authorization configuration
Two services will communicate using a direct connection (e.g., via REST APIs) without passing through the Ingress Controller or MSB
mTLS Certification Management
The following diagram depicts the Istio Certificate management and interaction with the Proxy (istio-agent and envoy proxy work together with istiod to automate key and certificate rotation):

# Sequence (mTLS Certification Management)
1 When started, Istio agent creates private key & CSR (certificate signing request)
2 Istio agent sends CSR with its credentials to istiod for signing
3 CA in istiod validates credentials in the CSR, signs CSR to generate certificate upon successful validation
4 CA in istiod sends the signed certificate to istio-agent
5 When a workload is started, Envoy proxy requests certificate and key from istio-agent via SDS (secret discovery service) API
6 Istio-agent sends certificates and private key to Envoy Proxy
7 Istio-agent monitors the expiration of the workload certificate. The above process repeats periodically for certificate and key rotation
mTLS Service-2-Service Secure communication with Authentication and Authorization
The following diagram depicts the Service-to-Service mTLS secure communication with Authentication and Authorization:

# Sequence (mTLS Authentication Flow) - TLSv1_2
1 Client application in a container sends a plain-text HTTP request towards the server
2 Client proxy container intercepts the outbound request
3 Client proxy performs a TLS handshake with the server-side proxy
A. This handshake exchanges client and server certificates
B. Istio has preloaded the certificates into the proxy containers (see diagram A)
4 Client proxy checks for ‘secure naming’ on the server’s certificate, and verifies that the server identity (service account presented in the server certificate) is authorized to run the target service
5 Client and server establish an mTLS connection
6 Istio forwards the traffic from the client-side proxy to the server-side proxy
7 The server-side proxy authorizes the request
A. If authorized, Istio forwards the traffic to the server application service through local TCP
B. If not authorized, an error condition arises

Authentication Mode
Peer authentication policies specify the mTLS mode that Istio enforces on target workloads:
PERMISSIVE: workloads accept both mTLS and plain text traffic
STRICT: workloads accept only mTLS traffic
DISABLE: mTLS is disabled
Permissive mode allows a service to accept both plaintext traffic and mutual TLS traffic at the same time. During the transition period, permissive mode would be allowed so that non-Istio clients can keep communicating with servers that are being migrated to Istio. Once ONAP components are Service-Mesh enabled, the permissive mode could be disabled.
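For the transition period, a namespace-wide policy could look like the sketch below. The namespace `onap` is an assumption; a policy without a `selector` applies to all workloads in its namespace.

```yaml
# Sketch only: namespace is hypothetical.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: onap
spec:
  mtls:
    mode: PERMISSIVE      # accept both mTLS and plain text during migration
```

Once all components in the namespace are Service-Mesh enabled, changing `mode` to `STRICT` in this one resource would enforce mTLS for the whole namespace.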
Authentication Policy Configuration
Note: during Istanbul, peer authentication will be configured first (high priority). Then, OOM will visit the request authentication.
Authentication Policies are configured in .yaml files, and are saved in the Istio configuration storage once deployed. The Istio controller watches the configuration storage. Upon any policy changes, Istio sends configurations to the targeted endpoints (e.g., Envoy Proxy). Once the proxy receives the configuration, the new authentication requirement takes effect immediately on that pod.
for request authentication, the client application is responsible for acquiring and attaching the JWT credential to the request.
for peer authentication, Istio automatically upgrades all traffic between two PEPs to mTLS.
If authentication policies disable mTLS mode, Istio continues to use plain text between PEPs. In ONAP Istanbul, mTLS mode will be enabled. Otherwise, the disabled use cases will be reviewed as an exception request.
Example,
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: "onap-peer-policy"
  namespace: "onap-x"    # namespace names must be valid DNS labels, so no underscore
spec:
  selector:
    matchLabels:
      app: onap_A
  mtls:
    mode: STRICT
Authorization Policy Configuration
Note: during Istanbul, service-to-service (workload-to-workload) authorization will be configured first (high priority). Then, OOM will visit end-user-to-service (workload) authorization.
The authorization policy enforces access control to the inbound traffic in the server side Envoy proxy. Each Envoy proxy runs an authorization engine that authorizes requests at runtime.
When a request comes to the proxy, the authorization engine evaluates the request context against the current authorization policies, and returns the authorization result, either ALLOW or DENY.
Istio authorization policies are configured using .yaml files.
<source: https://istio.io/latest/docs/concepts/security/#authentication-policies>
Authorization policies support ALLOW, DENY and CUSTOM actions. The following diagram depicts the policy precedence.
CUSTOM → DENY → ALLOW
In ONAP Istanbul, DENY and ALLOW will be configured first, as coarse-grained authorization. Then, the CUSTOM action would be considered for fine-grained authorization in the future (as time allows).

<source: https://istio.io/latest/docs/concepts/security/#authentication-policies>
Example,
<source: https://istio.io/latest/docs/concepts/security/#authentication-policies>
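As a sketch of coarse-grained, service-to-service authorization with ALLOW and DENY actions, the two policies below protect a workload labeled `app: aai`. The namespace `onap`, the labels and the service account `so` are hypothetical; they are not taken from the ONAP charts.

```yaml
# Sketch only: namespace, labels and service account names are hypothetical.
# ALLOW: only the caller running as service account "so" may read or update.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: aai-allow-so
  namespace: onap
spec:
  selector:
    matchLabels:
      app: aai
  action: ALLOW
  rules:
  - from:
    - source:
        principals: ["cluster.local/ns/onap/sa/so"]   # identity taken from the mTLS certificate
    to:
    - operation:
        methods: ["GET", "PUT"]
---
# DENY: evaluated before ALLOW; rejects any caller from outside the namespace.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: aai-deny-outside-namespace
  namespace: onap
spec:
  selector:
    matchLabels:
      app: aai
  action: DENY
  rules:
  - from:
    - source:
        notNamespaces: ["onap"]
```

Both `principals` and `notNamespaces` rely on the peer identity, so they are only meaningful when mTLS is enabled between the workloads.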
Role-Based Access Control
ONAP Istanbul release supports Role-Based Access Control (RBAC). For that, roles (e.g., Admin, user, manager, etc.) will be defined, and mapping between users/groups and roles (m:n) will be set by leveraging Keycloak and Istiod.
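One possible way to express such a role check in Istio is a condition on a JWT claim. The sketch below assumes that Keycloak is configured to emit the user's roles or groups in a claim named `groups`, that a request authentication policy already validates the token, and that the workload is labeled `app: sdc`; all of these are illustrative assumptions.

```yaml
# Sketch only: claim name, role value, namespace and label are hypothetical.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: sdc-admin-writes
  namespace: onap
spec:
  selector:
    matchLabels:
      app: sdc
  action: ALLOW
  rules:
  - to:
    - operation:
        methods: ["POST", "PUT", "DELETE"]   # modifying operations
    when:
    - key: request.auth.claims[groups]       # claim from the validated JWT
      values: ["admin"]
```

The mapping between users/groups and roles stays in Keycloak; the mesh only checks the claim that Keycloak puts into the token.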
ONAP IdAM and External IdAM Interaction
The following diagram depicts the ONAP IdAM and External IdAM Interaction.

# Choice
1 If the vendor chooses their own IdAM (external IdAM), the AuthN/AuthZ GW should be able to use the external IdAM, instead of ONAP IdAM – full integration choice
A. The external IdAM will handle AuthN/AuthZ for ONAP incoming traffic; session cookie handling, session handling
B. The external IdAM will typically use their own IdP (external IdP) to verify their own user identities
2 If ONAP IdAM is used, the ONAP IdAM could, based on business use cases, interface with the external IdAM through an IdAM-to-IdAM interface configuration
A. SSO
B. Authentication federation (SAML) – not part of Istanbul
ONAP Security Components, IdAM External IdP(s), and Multi-Tenancy Support
Each service provider will likely have existing centralized IdP that ONAP should integrate with. This architecture supports that integration.
ONAP IdAM proxies via plugin to a designated external IdP server(s)
ONAP IdAM must support fallback mode, where it can act as IdP for cases where external IdP is unavailable
Multi-Tenancy allows the same Username inside different tenants; there is no SSO between different tenants
Realms will be defined to support multi-tenancy, by leveraging Keycloak.
Realms are isolated from one another and can only manage and authenticate the users that they control.
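On the mesh side, one realm per tenant could be reflected as one JWT rule per issuer, since each Keycloak realm has its own issuer URL and signing keys. This is a sketch under assumptions: the realm names `tenant-a` and `tenant-b`, the Keycloak host and the namespace are hypothetical.

```yaml
# Sketch only: realm names, host and namespace are hypothetical.
apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: onap-tenant-realms
  namespace: onap
spec:
  jwtRules:
  - issuer: "https://keycloak.example.org/auth/realms/tenant-a"
    jwksUri: "https://keycloak.example.org/auth/realms/tenant-a/protocol/openid-connect/certs"
  - issuer: "https://keycloak.example.org/auth/realms/tenant-b"
    jwksUri: "https://keycloak.example.org/auth/realms/tenant-b/protocol/openid-connect/certs"
```

Because the issuer identifies the realm, authorization policies could then distinguish tenants with a condition on `request.auth.claims[iss]`.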
Single External IdP Use Case
The following diagram depicts the Multi-Tenancy Support with single External IdP.

# Sequence (Single External IdP) - A
1 If ONAP IdAM is connected to a single external IdP, the IdP contains the credentials of all the users from all the tenants
A. AuthN/AuthZ requests including username, password and tenant will be forwarded from ONAP IdAM to the external IdP
B. The external IdP performs the authN/authZ for the user in the tenant realm
C. The external IdP returns with AuthN/AuthZ response
Multiple IdPs Use Case
The following diagram depicts the Multi-Tenancy Support with multiple External IdPs. Each tenant has its own external IdP:

# Sequence (Multiple IdPs (each tenant has its own external IdP)) - B
2 If each tenant has its own external IdP, the IdP contains the credentials for its tenant users
A. Based on the tenant info in the request, ONAP IdAM selects the correct external IdP
B. ONAP IdAM forwards the request to the selected external IdP
C. The selected external IdP performs AuthN/AuthZ
D. The selected external IdP returns with AuthN/AuthZ response

ONAP Multi-Tenancy Support with Istio
Objective: ONAP components implement the right access controls for different end users/tenants to:
Build & distribute the SDC artifacts
Build & distribute CDS blueprints
Build & distribute custom operational SO workflows
Build & distribute Policies
Build & distribute Control loops
Access / manage the AAI inventory data for objects they own
Publish & consume messages on DMaaP / Kafka topics they have access to
Build, deploy & manage DCAE collectors & DCAE analytics applications
Build, deploy & manage CDS executors
Approach & State:
Single logical instance, with Multi-Tenancy built-in:
SDC: partial implementation through user roles; RBAC for viewing, authoring, updating and distributing
A&AI: current access control is not extensive enough to allow fine-grained control; permissions need to be set per owning group / user, not only per role
SO: different groups of people are allowed to orchestrate/view/change different instances
DMaaP: separate message bus topics based on consumers/producers
Several logical instances, where deployment/configuration for those instances can be handled independently:
Controllers (SDNC):
Collectors (DCAE): it can be deployed and controlled by the tenants, based on the roles
Policy executors (Policy): centralize PAP, Policy API; distribute PDPs by namespace
Istio Tenancy Model
Namespace tenancy: a cluster can be shared across multiple teams, each using a different namespace
Cluster tenancy: using clusters as a unit of tenancy
Mesh tenancy: each mesh can be used as the unit of isolation
<source: https://istio.io/latest/docs/ops/deployment/deployment-models/>
An ONAP option: Istio supports soft multi-tenancy with multiple Istio control planes
One control plane and one mesh per tenant
The cluster admin gets control and visibility across all the Istio control planes
The tenant admin gets control of a specific Istio instance
Separation between the tenants is provided by Kubernetes namespaces and RBAC (define Role and RoleBinding)
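The Role and RoleBinding mentioned above could be sketched as follows, giving a tenant admin control over Istio security and networking resources in the tenant's own namespace only. The namespace `tenant-a`, the user `tenant-a-admin` and the resource names are hypothetical.

```yaml
# Sketch only: namespace, user and object names are hypothetical.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tenant-a-istio-admin
  namespace: tenant-a                  # permissions are limited to this namespace
rules:
- apiGroups: ["security.istio.io", "networking.istio.io"]
  resources: ["*"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tenant-a-istio-admin-binding
  namespace: tenant-a
subjects:
- kind: User
  name: tenant-a-admin
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: tenant-a-istio-admin
  apiGroup: rbac.authorization.k8s.io
```

The cluster admin would keep cluster-wide visibility through a ClusterRole, while each tenant admin is bound only inside the tenant namespace.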
ONAP Logging
Logging Propositions
Treat logs as event streams
An App should not concern itself with routing or storage of its output stream
Each running process writes its event stream, unbuffered, to stdout or stderr
Each process stream is:
captured by execution environment,
collated together with other streams from the app
routed to one or more final destinations for viewing and long-term archival
Archival destinations should not be visible to or configurable by the app (separation of concerns, security reasons)
Container logging is different
When pods are evicted, crash, are deleted or are scheduled on a different node, the logs from the containers are gone.
Transferring transient local log data in the containers to the centralized (or even distributed) long-term log storage is a must.
Logging Types in Kubernetes
Kubernetes Container Logging: container logs are generated by containerized applications. The typical way to capture container logs is to use STDOUT and STDERR.
Kubernetes Node Logging: container logs are streamed by the container engine to a logging driver, which can be realized by an open-source tool such as fluentbit or fluentd. Node-level logs could also be kernel logs or systemd logs
Kubernetes Cluster Logging: it refers to Kubernetes itself and all of its system component logs. Kubernetes does not provide a native solution for cluster-level logging, but various node-level logging agents or sidecar containers can be used, as defined in the following ONAP logging functional architecture.
Kubernetes Events Logging: events hold information about resource state changes or errors, such as pod evictions or decisions made by the scheduler.
Kubernetes Audit Logging: detailed descriptions of each call made to the kube-apiserver. They show the chronological sequence of activity events.
Logging Scope
The ONAP application logging will be supported. Other logging scopes are under discussion.
ONAP Application Logging
Security Component Logging (e.g., Keycloak, IdP, external IdAM, external IdP)
Infrastructure Logging (e.g., Kubernetes logging)
xNF Logging (e.g., vFW, VNF, CNF)
Collating across logging scopes is under discussion, as e2e tracing, but it is out of scope for Istanbul. It is also uncertain whether this can be achieved in a near-future release.
ONAP Logging Functional Architecture
The following diagram depicts ONAP logging functional architecture:

ONAP supports open-source- and standard-based Logging architecture.
Once all ONAP components push their logs into STDOUT/STDERR, any standard log pipe can work.
The logging component stack can be realized by the vendor's choice of components
ONAP provides a reference implementation/choice
ONAP logs will be exported to a different and centralized location for security, persistence and aggregation reasons
Log collector sends logs to the aggregator in a different container
Aggregator sends logs to the centralized database in a different container
Logging Functional Blocks:
Collector (one per K8S node)
Aggregator (few per K8S cluster)
Database (one per K8S cluster)
Visualization (one per K8S cluster)
ONAP reference implementation choice:
EFK: Elasticsearch, Fluentd, Fluentbit, Kibana
LFG: Loki, Grafana, Fluentd / Fluentbit
ONAP logging conforms to SECCOM Container Logging requirements
Event Types
Log Data
Log Management
ONAP Reference Implementation
The above diagram also indicates reference realization, such as:
As a log driver, Fluent Bit is deployed per node as a collector and forwarder of log data to the Fluentd instance
Instead of doing heavy log transformations, just forward logs securely to the Fluentd Aggregator cluster
FluentBit (node-level logging agent) needs to be run on every node to collect logs from every POD, so FluentBit is deployed as a DaemonSet (a POD that runs on every node of the cluster)
When FluentBit runs, it will read, parse and filter the logs of every POD and could enrich each entry with the following information (metadata):
POD Name
POD ID
Container Name
Container ID
Labels
Annotations
To obtain this information, a built-in filter plugin called kubernetes talks to the Kubernetes API server to retrieve fields such as the pod_id, labels and annotations. Other fields, such as pod_name, container_id and container_name, are retrieved locally from the log file names. All of this is handled automatically; no additional configuration is required.
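The local part of this enrichment can be illustrated with a short Python sketch. It is not Fluent Bit code: it assumes the usual kubelet naming convention for container log files (`<pod_name>_<namespace>_<container_name>-<container_id>.log` under /var/log/containers), and the function names and the `kubernetes` record key are chosen for illustration only.

```python
import re

# Assumed file-name convention for container logs on a node:
#   /var/log/containers/<pod_name>_<namespace>_<container_name>-<container_id>.log
LOG_NAME = re.compile(
    r"^(?P<pod_name>[^_]+)_(?P<namespace>[^_]+)_"
    r"(?P<container_name>.+)-(?P<container_id>[0-9a-f]{64})\.log$"
)


def metadata_from_path(path: str) -> dict:
    """Return the metadata that can be derived locally from a log file name."""
    file_name = path.rsplit("/", 1)[-1]
    match = LOG_NAME.match(file_name)
    if match is None:
        raise ValueError(f"unexpected log file name: {file_name}")
    return match.groupdict()


def enrich(record: dict, path: str, api_metadata: dict) -> dict:
    """Merge local metadata and API-server metadata (pod_id, labels, ...) into a record."""
    kubernetes = {**metadata_from_path(path), **api_metadata}
    return {**record, "kubernetes": kubernetes}
```

In this sketch `api_metadata` stands in for whatever the API server lookup returns (for example pod_id, labels and annotations); the lookup itself is left out.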
A Fluentd instance (or a few Fluentd instances) is deployed per cluster as an aggregator - processing the data and routing it to a different destination (e.g., Elastic Search)
It can handle heavy throughput - aggregating from multiple inputs, processing data and routing to different outputs. Also, it can handle a much larger amount of input and output sources
centralized way to transform data; normalize log data and collect additional info from containers
e.g., filter_record_transformer built-in filter plugin

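<filter foo.bar>
  @type record_transformer
  # Drop fields that are not needed downstream
  remove_keys hostname,$.kubernetes.pod_id
</filter>
<filter foo.bar>
  @type record_transformer
  # enable_ruby is required to evaluate the Ruby expression inside ${...}
  enable_ruby true
  # The container lookup assumes the tag carries the container ID as its sixth dot-separated part
  <record>
    container ${id = tag.split('.')[5]; JSON.parse(IO.read("/var/lib/docker/containers/#{id}/config.v2.json"))["Name"][1..-1]}
    hostname "#{Socket.gethostname}"
  </record>
</filter>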
or other custom plugins as needed
routing logs to ElasticSearch or equivalent component(s)
Note: for IoT (edge host/equipment), FluentBit could be installed per device, sending data to a Fluentd instance
Elasticsearch provides centralized log data indexing and storage
Kibana / Grafana could be used for log data visualization.
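The aggregator-side normalization described above (dropping unwanted keys and adding fields before routing) can be sketched in Python as follows. This is an illustration of the idea only, not the Fluentd plugin: the function names and the dotted-key notation for nested fields are assumptions made for this example.

```python
import copy


def remove_key(record: dict, dotted_key: str) -> None:
    """Delete a possibly nested key such as 'kubernetes.pod_id' in place."""
    *parents, leaf = dotted_key.split(".")
    node = record
    for name in parents:
        node = node.get(name)
        if not isinstance(node, dict):
            return  # path not present, nothing to remove
    node.pop(leaf, None)


def normalize(record: dict, remove_keys: list, extra: dict) -> dict:
    """Return a copy with the extra fields added and the unwanted keys dropped."""
    result = copy.deepcopy(record)  # never mutate the incoming record
    result.update(extra)
    for key in remove_keys:
        remove_key(result, key)
    return result
```

For example, `normalize(record, ["hostname", "kubernetes.pod_id"], {"aggregator": "agg-0"})` returns a new record without the two keys and with an extra `aggregator` field, leaving the input untouched.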

FluentBit Internal Pipeline

Logging Generation and Collection Architecture
The following diagram depicts various ways of collecting logs.
By leveraging various collection agents and configurations, log data of different types will be collected, aggregated and stored in centralized persistent storage. However, it is not yet clear how heterogeneous log data can be collated. In many cases, ONAP cannot control the contents or formats of external log data, so it cannot rely on them to extract hints/attributes for collation.

xNF logs will be collected via the VNF collection agent (collector), and will be sent to the Aggregator.
Application (ONAP component) logs will be collected via the application collection agent (collector), and will be sent to the Aggregator.
Infrastructure logs will be collected via syslog collection agent (collector), and will be sent to the Aggregator.
The aggregator can process/normalize log data and send it to the centralized database. Note that there could be one database or several, depending on log data types. This is the system provider's deployment choice.
Visualization can be centralized or distributed based on log data types.
System Component Logs
Similar to the container logs, system component logs in the /var/log directory should be rotated once the size exceeds the maximum.
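As an illustration of size-based rotation, the following Python sketch uses the standard library `RotatingFileHandler`. It only demonstrates the policy (rotate once a maximum size is reached, keep a bounded number of old files); the helper name, file path and limits are hypothetical, and system component logs on a node would normally be rotated by an operating-system tool rather than by application code.

```python
import logging
from logging.handlers import RotatingFileHandler


def build_rotating_logger(path: str, max_bytes: int, backups: int) -> logging.Logger:
    """Create a logger whose file is rotated before it would exceed max_bytes."""
    logger = logging.getLogger(f"rotating.{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the example output in the file only
    handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

With `backups=2`, at most `component.log`, `component.log.1` and `component.log.2` are kept; older data is discarded on each rotation.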
Cluster-Level Logging Requirements and Options (preferred option)
ONAP needs to support Cluster-Level Logging:
Allow users to access application logs even if a container crashes, a pod gets evicted or a node dies
Require a separate backend to store, analyze and query logs
Write to standard output and standard error streams
E.g., a container engine handles and redirects any output written to a containerized application’s STDOUT and STDERR streams (e.g., /dev/stdout, /dev/stderr)
Support multi-tenancy logging
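The requirement to write to the standard output stream can be met with very little application code. The following Python sketch configures a logger that emits one JSON object per line to STDOUT; the JSON layout and the names `JsonFormatter` and `configure_stream_logging` are assumptions for illustration, not an ONAP-defined format.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON line."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        })


def configure_stream_logging(name: str, stream=None) -> logging.Logger:
    """Attach a single stream handler (STDOUT by default) to the named logger."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream if stream is not None else sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger.handlers = [handler]
    return logger
```

Because the application only writes to a stream, the container engine and the node-level agent can collect the output without any further changes to the application.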
Cluster-level Logging Architecture (A)
Use a node-level logging agent (e.g., DaemonSet) that runs on every node
the logging agent has access to a directory with log files from all of the application containers on that node
one agent per node is used
Include a dedicated sidecar container for logging in an application pod
Push logs directly to a backend from within an application
Note:
node-level logging creates only one agent per node and does not require any changes to the applications running on the node
Prerequisite: Application containers write to STDOUT and STDERR
A node-level agent collects these logs and forwards them for aggregation
Streaming Sidecar Container with the Logging Agent Architecture (B)
Allows users to separate log streams (sidecar) from applications, which can lack support for writing to STDOUT or STDERR
The sidecar container runs a logging agent, which is configured to pick up logs from an application container
It allows several log streams from different parts of an application to be separated
Pod-Level Logging Agent Architecture (C)
If the node-level logging agent is not flexible, write an application-specific sidecar

POD Sidecar Architecture
Why Sidecar Logging, instead of K8S Cluster Level Logging
Not all containers in K8S write their logs to stdout, although this is recommended. Placing a log sidecar next to the application container facilitates uniform logging and distribution.

For the Log sidecar, define three major tags:
Source - define the file details to monitor/lookup and to set format to look out for
Filter - customize the event collected overwriting fields or adding fields
write custom scripts to transform log formats for uniform log contents or additional fields from containers
Match - define what to do with the matching data/log events and where to move
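The three stages above can be pictured as a small pipeline. The following Python sketch is a conceptual model only, not sidecar configuration: the line format (`<time> <LEVEL> <message>`), the field names and the routing by level are assumptions chosen for the example.

```python
import re
from typing import Callable, Dict, Iterable, Iterator

# Assumed input format: "<time> <LEVEL> <message>"
LINE = re.compile(r"^(?P<time>\S+) (?P<level>[A-Z]+) (?P<message>.*)$")


def source(lines: Iterable[str]) -> Iterator[dict]:
    """Source: parse raw lines in the expected format and skip the rest."""
    for line in lines:
        parsed = LINE.match(line.rstrip("\n"))
        if parsed:
            yield parsed.groupdict()


def filter_events(events: Iterable[dict], component: str) -> Iterator[dict]:
    """Filter: add a field and overwrite an existing one for uniform content."""
    for event in events:
        yield {**event, "component": component, "level": event["level"].lower()}


def match(events: Iterable[dict],
          routes: Dict[str, Callable[[dict], None]],
          default: Callable[[dict], None]) -> None:
    """Match: hand each event to the sink registered for its level."""
    for event in events:
        routes.get(event["level"], default)(event)
```

A call such as `match(filter_events(source(lines), "sdc"), {"error": alert_sink}, archive_sink)` would send error events to one destination and everything else to another; both sinks are hypothetical callables.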

Logging Type Support
Application logging
Platform-level logging, including Kubernetes
Need to define traceability support scope (e.g., service, environment)
Can ONAP trace external component and infrastructure logs, and does it want to?
Security component logging, such as IdAM, IdP
Transaction logging for tracing for specified collation id/request id (e.g., MDC-based logging)
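MDC-based transaction logging attaches a per-request identifier to every log line so that entries can later be collated. The following Python sketch shows one way to approximate this with `contextvars` and a logging filter; the names `request_id_var`, `RequestIdFilter`, `build_logger` and `handle_request` are hypothetical, and the field name `request_id` is an assumption rather than the key defined by the ONAP logging specification.

```python
import contextvars
import logging
import uuid

# Holds the identifier of the request currently being processed.
request_id_var = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Copy the current request ID onto every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


def build_logger(name: str, stream) -> logging.Logger:
    """Create a logger whose lines are prefixed with the request ID."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(request_id)s %(message)s"))
    handler.addFilter(RequestIdFilter())
    logger.handlers = [handler]
    return logger


def handle_request(logger: logging.Logger, incoming_request_id=None) -> str:
    """Reuse the caller's request ID if present, otherwise generate one."""
    request_id = incoming_request_id or str(uuid.uuid4())
    token = request_id_var.set(request_id)
    try:
        logger.info("processing request")
        return request_id
    finally:
        request_id_var.reset(token)  # do not leak the ID into the next request
```

Reusing an incoming identifier, rather than always generating a new one, is what allows log lines from several components to be joined into one transaction trace.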
Analysis of ONAP Application Logging Specification
Existing ONAP Application Logging Specification
Wiki page: ONAP Application Logging Specification v1.3 (Frankfurt)
LFN DDF June 2021 Presentation
The ONAP Next Generation Security and Logging Architecture, Design and Roadmap was presented at the LFN DDF in June 2021.
The presentation slide deck and recordings are available at https://wiki.lfnetworking.org/display/LN/2021-06-08+-+ONAP%3A+Next+Generation+Security+and+Logging+Architecture%2C+Design+and+Roadmap.
Note: this wiki page contains several updates on top of the presented materials, so this wiki page is the up-to-date version.
References
ONAP Next Generation Security Architecture, https://lf-onap.atlassian.net/wiki/pages/viewpage.action?pageId=16472429
Istio Architecture, https://istio.io/latest/docs/ops/deployment/architecture/
Istio Security, https://istio.io/latest/docs/concepts/security/#authentication-policies
Service Mesh Risk Analysis, https://lf-onap.atlassian.net/wiki/display/DW/Service+Mesh+Risk+Analysis
Service Mesh Impact on Project, https://lf-onap.atlassian.net/wiki/display/DW/Service+Mesh+Impact+on+Projects
ONAP Security Model, https://lf-onap.atlassian.net/wiki/display/DW/ONAP+Security+Model
Logging Architecture Proposal, https://lf-onap.atlassian.net/wiki/display/DW/Logging+Architecture+Proposal
Logging Enhancements Project proposal, https://lf-onap.atlassian.net/wiki/display/DW/Logging+Enhancements+Project+Proposal
SECCOM Container Logging Requirements, https://lf-onap.atlassian.net/wiki/display/DW/CNF+2021+Meeting+Minutes?focusedTaskId=239&preview=/93004634/100896899/2021-02-22_LoggingRequirementEvents_v8.pptx

#	Sequence (mTLS Certification Management)
1	When started, Istio agent creates private key & CSR (certificate signing request)
2	Istio agent sends CSR with its credentials to istiod for signing
3	CA in istiod validates credentials in the CSR, signs CSR to generate certificate upon successful validation
4	CA in istiod sends the signed certificate to istio-agent
5	When a workload is started, Envoy proxy requests certificate and key from istio-agent via SDS (secret discovery service) API
6	Istio-agent sends certificates and private key to Envoy Proxy
7	Istio-agent monitors the expiration of the workload certificate. The above process repeats periodically for certificate and key rotation
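The last step (monitoring expiration and repeating the process) amounts to a periodic check of the certificate validity window. The Python sketch below shows one possible form of that check. It is not Istio's actual rotation policy: the function name and the `grace_fraction` parameter are hypothetical and serve only to illustrate the idea of rotating before expiry.

```python
from datetime import datetime, timedelta, timezone


def rotation_due(not_before: datetime, not_after: datetime, now: datetime,
                 grace_fraction: float = 0.2) -> bool:
    """Return True once the remaining lifetime drops to grace_fraction of the total.

    grace_fraction is an assumed tuning knob for this sketch.
    """
    lifetime = not_after - not_before
    remaining = not_after - now
    return remaining <= lifetime * grace_fraction
```

For a 24-hour certificate and the default fraction, the check starts returning True once less than 4.8 hours remain, at which point a new key and CSR would be generated as in steps 1 to 4.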

#	Sequence (mTLS Authentication Flow) - TLSv1_2
1	Client application in a container sends a plain-text HTTP request towards the server
2	Client proxy container intercepts the outbound request
3	Client proxy performs a TLS handshake with the server-side proxy
	A. This handshake exchanges client and server certificates
	B. Istio preloads the certificates into the proxy containers (see diagram A)
4	Client proxy checks for ‘secure naming’ on the server’s certificate, and verifies that the server identity (service account presented in the server certificate) is authorized to run the target service
5	Client and server establish a mTLS connection
6	Istio forwards the traffic from the client side proxy to the server side proxy
7	The server side proxy authorizes the request
	A. if authorized, Istio forwards the traffic to the server application service through local TCP
	B. if not authorized, an error condition arises
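To make the "mutual" part of the handshake concrete, the following Python sketch builds the two TLS contexts with the standard `ssl` module. In the service mesh this work is done by the sidecar proxies, not by application code, so this is only an analogy: the function names are hypothetical, certificate loading is shown as comments with placeholder paths, and hostname checking is only a loose stand-in for the secure naming check.

```python
import ssl


def server_side_context() -> ssl.SSLContext:
    """Server-side context that requires the client to present a certificate."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    ctx.verify_mode = ssl.CERT_REQUIRED  # this is what makes the TLS mutual
    # ctx.load_cert_chain("server.crt", "server.key")   # placeholder paths
    # ctx.load_verify_locations("mesh-ca.crt")           # placeholder path
    return ctx


def client_side_context() -> ssl.SSLContext:
    """Client-side context that verifies the server certificate and its name."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    # PROTOCOL_TLS_CLIENT enables CERT_REQUIRED and check_hostname by default.
    # ctx.load_cert_chain("client.crt", "client.key")   # placeholder paths
    # ctx.load_verify_locations("mesh-ca.crt")           # placeholder path
    return ctx
```

With both sides requiring and verifying a peer certificate, a connection is only established when each end can prove its identity, which corresponds to steps 3 to 5 of the flow.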

#	Choice
1	If the vendor chooses their own IdAM (external IdAM), the AuthN/AuthZ GW should be able to use the external IdAM, instead of ONAP IdAM – full integration choice
	A. The external IdAM will handle AuthN/AuthZ for ONAP incoming traffic; session cookie handling, session handling
	B. The external IdAM will typically use their own IdP (external IdP) to verify their own user identities
2	If ONAP IdAM is used, based on business use cases, the ONAP IdAM could interface with the external IdAM, based on IdAM-to-IdAM interface configuration
	A. SSO
	B. Authentication federation (SAML) – not part of Istanbul

#	Sequence (Single External IdP) - A
1	If ONAP IdAM is connected to a single external IdP, the IdP contains the credentials of all the users from all the tenants
	A. AuthN/AuthZ requests including username, password and tenant will be forwarded from ONAP IdAM to the external IdP
	B. The external IdP performs the AuthN/AuthZ for the user in the tenant realm
	C. The external IdP returns with AuthN/AuthZ response

#	Sequence (Multiple IdPs (each tenant has its own external IdP)) - B
2	If each tenant has its own external IdP, the IdP contains the credentials for its tenant users
	A. Based on the tenant info in the request, ONAP IdAM selects the correct external IdP
	B. ONAP IdAM forwards the request to the selected external IdP
	C. The selected external IdP performs AuthN/AuthZ
	D. The selected external IdP returns with AuthN/AuthZ response
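The two sequences (A: a single shared external IdP, B: one external IdP per tenant) differ only in how the IdP is selected. The following Python sketch models that selection step. All names (`AuthRequest`, `UnknownTenant`, `select_idp`) and the example URLs are hypothetical, and the actual forwarding of the request to the IdP is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class AuthRequest:
    username: str
    password: str
    tenant: str


class UnknownTenant(Exception):
    """Raised when no external IdP is configured for the request's tenant."""


def select_idp(request: AuthRequest,
               idp_by_tenant: Dict[str, str],
               shared_idp: Optional[str] = None) -> str:
    """Return the external IdP endpoint that should receive the request."""
    if shared_idp is not None:
        return shared_idp  # sequence A: one IdP holds all tenants' users
    try:
        return idp_by_tenant[request.tenant]  # sequence B: per-tenant IdP
    except KeyError:
        raise UnknownTenant(request.tenant) from None
```

For example, `select_idp(req, {"tenant-a": "https://idp-a.example.org"})` returns the IdP for tenant-a, while a request for an unconfigured tenant is rejected before anything is forwarded.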
