Common information model, Data lake and Access control

A real concern is that the CPS could make coupling between ONAP components difficult to manage.

Decision

We will start with Architectural Approach A in the PoC with the aim of fully supporting Architectural Approach C.

Details

Micro-service architecture strives to control coupling. The deployment of micro-services enforces good behavior: only published interfaces are available at run-time. This is not new; encapsulation and data hiding have been around as long as software. What is new is the enforcement at run-time. Note also that mechanisms have been put in place to soften this encapsulation, such as the C++ friend declaration and Java package-private (default) access. There are reasons for both approaches and, as always, there is a balance. The two approaches are not mutually exclusive.

By making data available to all, this control is potentially eroded. Some interesting reading on the subject:

There are several aspects to this item. For the purposes of this conversation, here are a few deliberately limited definitions:

  • A Common Information Model is the ability to understand data. It is available to all.

  • A Data Lake is the ability to access data. It is available to all.

  • Access Control is a mechanism to grant or deny access to models or data.

Related to this is how data is stored in the DBMS. Using SQL,

  1. is the data (related to all models and instances) stored in a single database, or

  2. is the data (related to a specific model and instance) stored in separate databases?

    1. Secondary question for 2: within a shared or separate DBMS?
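The two storage options can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table layout and the model names (`inventory`, `policy`) are hypothetical and do not reflect any actual CPS schema.

```python
import sqlite3

# Option 1: one database holds the data for all models and instances.
# The 'model' column is the only thing separating one owner's data from another's.
shared = sqlite3.connect(":memory:")
shared.execute("CREATE TABLE instance (model TEXT, id TEXT, payload TEXT)")
shared.executemany(
    "INSERT INTO instance VALUES (?, ?, ?)",
    [("inventory", "node-1", "{}"), ("policy", "rule-1", "{}")],
)

# Option 2: each model gets its own database (hypothetical model names).
# Whether these databases live in one DBMS or several is the secondary question.
separate = {name: sqlite3.connect(":memory:") for name in ("inventory", "policy")}
for db in separate.values():
    db.execute("CREATE TABLE instance (id TEXT, payload TEXT)")
separate["inventory"].execute("INSERT INTO instance VALUES ('node-1', '{}')")
separate["policy"].execute("INSERT INTO instance VALUES ('rule-1', '{}')")

# With option 1, a single query can see every model's data.
shared_models = [row[0] for row in shared.execute(
    "SELECT DISTINCT model FROM instance ORDER BY model")]

# With option 2, a connection to one database sees only that model's data.
inventory_ids = [row[0] for row in separate["inventory"].execute(
    "SELECT id FROM instance")]
```

In the first layout, isolation between models depends entirely on query discipline or access control; in the second, it follows from which connection a component holds.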

Architectural approach A

All data access is via interfaces provided by the owning service.

Classic µService architecture. (One extreme)

  • No common information model

  • No data lake.

  • No need for access control.

Access to data is always via the µService API. This can cause

  • Latency in data access (multiple serializations)

  • Data transformation (multiple representations)

  • Data duplication (multiple representations 'cached' in multiple data stores)

  • Data deployment is purely a deployment concern.
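A minimal sketch of Approach A, assuming two hypothetical services (`InventoryService` as the owner and `MonitoringService` as a consumer) that are not part of ONAP. It shows where the serialization, transformation, and duplication costs listed above come from.

```python
import json


class InventoryService:
    """Hypothetical owning service: its store is private to the service."""

    def __init__(self):
        self._store = {"node-1": {"state": "active"}}

    def get_node(self, node_id: str) -> str:
        # The published interface returns a serialized copy, never the store itself.
        return json.dumps(self._store[node_id])


class MonitoringService:
    """Hypothetical consumer that keeps its own representation of the data."""

    def __init__(self, inventory: InventoryService):
        self._inventory = inventory
        self._cache = {}

    def refresh(self, node_id: str) -> None:
        raw = self._inventory.get_node(node_id)  # serialized by the owner
        node = json.loads(raw)                   # deserialized by the consumer
        # A transformed duplicate now lives in a second data store.
        self._cache[node_id] = {"up": node["state"] == "active"}

    def is_up(self, node_id: str) -> bool:
        return self._cache[node_id]["up"]


inventory = InventoryService()
monitoring = MonitoringService(inventory)
monitoring.refresh("node-1")
```

The consumer never touches `_store`, so the owner is free to change its internal representation, at the cost of an extra serialization round trip and a cached copy that can go stale.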

Architectural approach B

Data access to any data is allowed regardless of ownership.

Common information model and data lake. (Another extreme)

  • High risk of uncontrolled coupling between components

  • Difficult life-cycle management, in particular upgrade

  • Risk that complex inter-related objects become inconsistent thus breaking the owning µService

  • Implies that a single database is shared – data deployment involves understanding use cases

Access to data is either via a service or directly to the data store

  • Some use cases can have significant performance improvement

  • Integration options increase

Access control may be used to mitigate the negatives of this approach. However, this can be problematic for the selection of the DB technology and for the load on the DBMS.
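The inconsistency risk in Approach B can be illustrated with a small sketch, again using sqlite3. The `node` and `link` tables and the invariant are hypothetical; the point is only that a component writing directly to a shared store can break assumptions the owning service depends on.

```python
import sqlite3

# One shared store that every component can read and write (hypothetical schema).
lake = sqlite3.connect(":memory:")
lake.execute("CREATE TABLE node (id TEXT)")
lake.execute("CREATE TABLE link (id TEXT, a_end TEXT, z_end TEXT)")
lake.executemany("INSERT INTO node VALUES (?)", [("node-1",), ("node-2",)])
lake.execute("INSERT INTO link VALUES ('link-1', 'node-1', 'node-2')")


def dangling_links(db):
    """Invariant assumed by the owner: every link end refers to an existing node."""
    rows = db.execute(
        "SELECT id FROM link "
        "WHERE a_end NOT IN (SELECT id FROM node) "
        "   OR z_end NOT IN (SELECT id FROM node)"
    )
    return [row[0] for row in rows]


before = dangling_links(lake)

# Another component, unaware of the invariant, deletes a node directly.
lake.execute("DELETE FROM node WHERE id = 'node-2'")

after = dangling_links(lake)
```

Before the direct write there are no dangling links; afterwards `link-1` points at a node that no longer exists, and the owning service has no opportunity to prevent it.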

Architectural approach C

Data access is permitted only when there is an explicit agreement between components.

Hybrid approach.

  • A single component is designated owner of the model and data related to it. This is not a common information model.

  • Knowledge of the model is always via an interface provided by the µService that owns it.

  • Access to data is not possible without knowledge of the model. This is not a data lake.

  • Data deployment involves understanding use cases

µServices given access to the model may then query the data store directly, bypassing the owning µService.

This approach can have additional benefit in that it may help with data access race conditions.

I.e. it becomes a decision on the part of the owner to 'relax' encapsulation. The coupling is made visible.

Access to data is either via a service or directly to the data store (when provided with the model)

  • Some use cases can have significant performance improvement

  • Integration options increase

Access control may be used to enhance the 'security-through-obscurity', without impacting complexity, DB technology selection or data access performance.

This approach is backward compatible with Architectural Approach A.
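A minimal sketch of Approach C, under stated assumptions: the class `OwningService`, the helper `read_direct`, and the component names are hypothetical, and a plain dictionary stands in for the data store. In this toy the store is technically reachable by any Python code; the sketch models only the idea that direct reads are useful solely to a component that has been given the model.

```python
class ModelNotShared(Exception):
    """Raised when a component asks for a model it has no agreement for."""


class OwningService:
    """Hypothetical owner of one model and the data related to it."""

    def __init__(self, store: dict):
        self._store = store
        self._agreements = set()  # components the owner has agreed to share with
        self._model = {"collection": "nodes", "fields": ["state"]}

    def grant(self, component: str) -> None:
        # The owner's explicit decision to relax encapsulation; the coupling is recorded.
        self._agreements.add(component)

    def get_model(self, component: str) -> dict:
        if component not in self._agreements:
            raise ModelNotShared(component)
        return dict(self._model)

    def get_state(self, node_id: str) -> str:
        # The Approach A style interface remains available to every component.
        return self._store["nodes"][node_id]["state"]

    def coupled_components(self) -> list:
        return sorted(self._agreements)


def read_direct(store: dict, model: dict, node_id: str) -> dict:
    """Direct read of the data store, driven entirely by the shared model."""
    record = store[model["collection"]][node_id]
    return {field: record[field] for field in model["fields"]}


store = {"nodes": {"node-1": {"state": "active", "internal": "x"}}}
owner = OwningService(store)
owner.grant("analytics")

model = owner.get_model("analytics")
direct = read_direct(store, model, "node-1")

try:
    owner.get_model("dashboard")
    refused = False
except ModelNotShared:
    refused = True
```

Here `analytics` reads the store directly and sees only the fields the model exposes, `dashboard` is refused the model but can still call `get_state`, and `coupled_components()` lists exactly which components the owner has chosen to couple to.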

Discussion