CCSDK-2756 DM: [Spike] Propose C&PS Data Model

1        Flows

This chapter outlines the basic flows from an application perspective, to be used as a base for the PoC. The following diagrams outline the data and the relations between them.

An anchor point does not need to have a model attached until it receives data. The relation with A&AI (ANIA in diagram) is only logical – the anchor point name shall be supplied by the application creating it.

1.1       Scaffolding

1.1.1       Applications and dataspace(s)

Before any other operation, a system or human user will have to define an application that owns one or more dataspaces. Dataspaces are unique within the system. They are assumed to be well known by the application.

1.1.2       Model store

A system or human user should be able to store and retrieve one or more module sets. A module set is a combination of YANG modules, each identified by module name and revision, that together make up a specific version of the set.

A module set name and version shall be unique within the system. Duplicates shall be rejected.

As a user of CPS I want to be able to:

  1. Merge and store a set of YANG modules into a model set.

  2. List the models in the store

  3. Upgrade a model set
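The store, list and duplicate-rejection behaviour described above can be illustrated with a minimal in-memory sketch. The class and method names (`ModuleSetStore`, `DuplicateModuleSetError`, `store`, `retrieve`, `list_module_sets`) and the example module names are hypothetical and only serve the illustration; they are not part of any agreed CPS interface.

```python
class DuplicateModuleSetError(Exception):
    """Raised when a module set with the same name and version already exists."""


class ModuleSetStore:
    """In-memory stand-in for the model store (hypothetical, for illustration only)."""

    def __init__(self):
        # Maps (module set name, version) -> list of (module name, revision) pairs.
        self._module_sets = {}

    def store(self, name, version, modules):
        """Store a module set; reject it if name and version are already taken."""
        key = (name, version)
        if key in self._module_sets:
            raise DuplicateModuleSetError(
                f"module set {name!r} version {version!r} already exists"
            )
        self._module_sets[key] = list(modules)

    def retrieve(self, name, version):
        """Return the (module name, revision) pairs of a stored module set."""
        return self._module_sets[(name, version)]

    def list_module_sets(self):
        """Return all stored (name, version) pairs in sorted order."""
        return sorted(self._module_sets)
```

In this sketch an upgrade would simply be a second call to `store` with the same name and a new version, which is one possible reading of "upgrade" and is an assumption.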

1.1.3       Model instance loading and validation

A system or human user should be able to load a model instance from a file and validate it against a model set.

As a user of CPS I want to be able to:

  • (PoC) Store and validate a set of modules (a model file)

  • (Full project) Load a model instance and validate it given a model set reference (using a separate SPI).

1.2   Anchor persistence

1.2.1       Anchor persistence/retrieval

A system or human user should be able to create an anchor. This serves the purpose of providing a name (or ID) for the model data and a root for the structure (together with the odd case of floating leaves at the top of a module).

As a user of CPS I want to be able to: 

  • Create an anchor given a name and a dataspace 

  • Retrieve an anchor and the associated attributes given a name and a dataspace

  • List the anchors in the system given a dataspace

  • Delete an anchor given a name and a dataspace

An anchor point name shall be unique within a dataspace. Duplicates shall be rejected. If a dataspace is not defined in the system, it shall be created upon anchor point creation. 
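The anchor rules above (unique name per dataspace, duplicates rejected, dataspace created on first use) can be sketched as follows. This is a minimal in-memory illustration; `AnchorRegistry`, `DuplicateAnchorError` and the method names are hypothetical.

```python
class DuplicateAnchorError(Exception):
    """Raised when an anchor name is already used within the dataspace."""


class AnchorRegistry:
    """In-memory stand-in for anchor persistence (hypothetical, for illustration)."""

    def __init__(self):
        # Maps dataspace name -> {anchor name -> attribute dict}.
        self._dataspaces = {}

    def create_anchor(self, dataspace, name, attributes=None):
        # setdefault creates the dataspace implicitly if it is not yet defined.
        anchors = self._dataspaces.setdefault(dataspace, {})
        if name in anchors:
            raise DuplicateAnchorError(
                f"anchor {name!r} already exists in dataspace {dataspace!r}"
            )
        anchors[name] = dict(attributes or {})

    def get_anchor(self, dataspace, name):
        """Return the attributes associated with an anchor."""
        return self._dataspaces[dataspace][name]

    def list_anchors(self, dataspace):
        """Return the anchor names in a dataspace in sorted order."""
        return sorted(self._dataspaces.get(dataspace, {}))

    def delete_anchor(self, dataspace, name):
        del self._dataspaces[dataspace][name]
```

Note that the same anchor name may exist in two different dataspaces, since uniqueness is only required within a dataspace.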

1.2.2       Anchor model set association

A system or human user should be able to associate an anchor point to a model set. 

As a user of CPS I want to be able to: 

  • Associate an anchor point to a module set given an anchor point and a module set reference (upon creation).

1.3       Fragment (Node) Persistence

1.3.1       Fragment (Node) persistence

A system or human user should be able to persist and retrieve a model data fragment, i.e. a node, under a parent fragment. A fragment is an anchor point, a container or a list element; it consists of the xpath identifying the fragment together with its leaf and leaf-list attributes.

As a user of CPS I want to be able to:

  • Persist a fragment given an

    • Anchor

  • Retrieve a fragment given an

    • Anchor

    • Xpath expression (limited to path and key equality, e.g. /A/B/C will grab a container while /A/B/C/L[@key1='a key'][@key2='another key'] will grab a keyed list element)
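To make the restricted xpath subset concrete, the sketch below parses an expression consisting only of path steps and key-equality predicates and rejects everything else (for example access by index). The function name `parse_xpath` and the exact character set allowed in node and key names are assumptions made for this illustration.

```python
import re

# One path step: a node name followed by zero or more [@key='value'] predicates.
_STEP = re.compile(r"/([A-Za-z_][\w.-]*)((?:\[@[A-Za-z_][\w.-]*='[^']*'\])*)")
_PREDICATE = re.compile(r"\[@([A-Za-z_][\w.-]*)='([^']*)'\]")


def parse_xpath(xpath):
    """Split a restricted xpath into a list of (node_name, {key: value}) steps.

    Raises ValueError for anything outside the supported subset.
    """
    steps = []
    pos = 0
    while pos < len(xpath):
        match = _STEP.match(xpath, pos)
        if match is None:
            raise ValueError(f"unsupported xpath at offset {pos}: {xpath!r}")
        name, predicates = match.groups()
        steps.append((name, dict(_PREDICATE.findall(predicates))))
        pos = match.end()
    if not steps:
        raise ValueError("empty xpath")
    return steps
```

With the two expressions from the text, `parse_xpath("/A/B/C")` yields three steps without keys, while the keyed list element yields a final step `L` carrying both key values.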

1.4       Queries

1.4.1       Model instance fragment queries (parent)

A system or human user should be able to retrieve the parent of a fragment.

As a user of CPS I want to be able to:

  • Retrieve the parent of a fragment given an

    • xpath (this is a regular get, not really a query)

1.4.2       Model instance fragment queries (anchor point)

A system or human user should be able to retrieve the anchor point of a model instance fragment.

As a user of CPS I want to be able to:

  • Retrieve the anchor of a node given a

    • Node

1.4.3       Containment Query

A system or human user should be able to fetch the entire model data under an anchor point as a list of fragments.

As a user of CPS I want to be able to: 

  • Retrieve all the fragments under an anchor point given an

    • anchor point (with level restriction!)
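A containment query with a level restriction can be sketched as below. The sketch assumes fragments are held as a mapping from a path tuple (anchor name first, then one element per step) to an attribute dictionary; this representation and the name `fragments_under` are hypothetical and chosen only to keep the example small.

```python
def fragments_under(fragments, anchor, max_depth=None):
    """Return (path, attributes) pairs for all fragments under an anchor.

    fragments: dict mapping path tuples, e.g. ("anchor1", "A", "B"), to dicts.
    max_depth: optional level restriction; the anchor itself is at depth 0.
    """
    result = []
    for path, attributes in fragments.items():
        if path[0] != anchor:
            continue  # fragment belongs to another anchor
        depth = len(path) - 1
        if max_depth is not None and depth > max_depth:
            continue  # deeper than the requested level
        result.append((path, attributes))
    # Sort by path so that parents always come before their children.
    return sorted(result, key=lambda item: item[0])
```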

1.4.4       Type Query

A system or human user should be able to extract all the fragments that match a dataspace and a schema node identifier.   

As a user of CPS I want to be able to: 

  • Retrieve all the relevant fragments (nodes) given a

    • Schema node identifier 

2        Database backend

2.1 A Few considerations

  • If we can ignore mutability of key attributes, the schema largely holds: an XPATH can be used as a unique key for each data element (fragment).

  • We should stick to an XPATH with equality operators on key attributes only (e.g. no access by list index).

  • 255 characters may not be enough for an XPATH.

  • Conditions on multiple key attributes in the XPATH will have to be ordered to work as a single string.

  • 'MO type' will require another definition since it is path dependent in Yang. Schema node identifier is as close as it gets.

  • Given the work done in Neo4J to flatten attributes the use of a JSON format (possibly binary) in another DB technology makes sense – it will be easier to store and retrieve (no need to flatten the whole structure).
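The point above about ordering conditions on multiple key attributes can be illustrated with a small normalisation step applied before an XPATH is stored or looked up. The sketch sorts the key predicates of each step alphabetically by key name; that particular ordering rule, and the function name `canonical_xpath`, are assumptions (the order in which keys are declared in the model would be another option).

```python
import re

# A run of one or more consecutive [@key='value'] predicates on a single step.
_PREDICATE_RUN = re.compile(r"(?:\[@[^=\]]+='[^']*'\])+")
_PREDICATE = re.compile(r"\[@([^=\]]+)='([^']*)'\]")


def canonical_xpath(xpath):
    """Return the xpath with each step's key predicates sorted by key name."""

    def sort_run(match):
        pairs = sorted(_PREDICATE.findall(match.group(0)))
        return "".join(f"[@{key}='{value}']" for key, value in pairs)

    return _PREDICATE_RUN.sub(sort_run, xpath)
```

Two expressions that differ only in predicate order then map to the same string, so a plain string comparison or a unique index on the stored text is sufficient.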

2.1.1       Granularity

How do we break up the model instances into records in the DB? Leaves and leaf-lists do not pose a challenge: they can stay with whatever parent object they come from and be stored together in a JSON column in the database. Lists and containers need special care.

2.1.1.1       Lists/Container 

We can treat each list or container element as a database record. We can think of the path of that list/container in the model (schema node identifier) as the “type”. Asking for a certain type should retrieve all the data for that list/container element, that is, excluding its child elements.

The position in the JSON from where an element hangs is identified by the xpath of the child container so there should be no need to include special IDs in the JSON.

Given that a record without a parent makes no sense (we cannot build an xpath), each record should have a parent, with the exception of the root, which has a special interface for creation (in the current SPI: createRootManagedObject(String namespace, String type, String version, String name, String bucketName, DecoratedManagedObject parent, Map<String, Object> attributeMap)). The root object can be treated as a list that has only one element and an xpath of e.g.

/root/model[@mecontext='myEnodeBInTheMiddleOfNowhere']

There is no need to include the field mecontext in the child record. The database has no notion of lists/containers and no idea of the model either. This is a logical separation that happens between the CPS and the DB SPI so as long as we have the right xpath when we create the child, we are ok.

2.1.1.2       Lists

List elements with keys can store the keys in the xpath and be uniquely identified (under the assumption that the keys do not change). List elements without keys can only be identified by number. Because records may be inserted in the middle of a list, it is not desirable to have a list number in an xpath, since an insertion may impact all the following elements.

2.1.2       Relations

Relations can have a type (for queries). They are directional, and because of the way the DB schema is designed (both the to and from columns are indexed) they can be navigated in both directions. Relations can be parent-child or non-parent-child. The main difference is that a parent-child relation is implied in the xpath, and if we are storing the full xpath anyway for containment queries, one could ask why parent-child relations should be stored separately at all. Potential reasons:

  • Let’s say we want all the first-level child nodes. We can get there by grabbing everything from the DB that starts with the xpath of the parent and figuring out what comes next, but that is rather ugly. This becomes easy if we store the xpath as an array or if we store a parent reference in the children.

  • The same applies if we want to get the parent.
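The benefit of a stored parent reference can be shown with a small runnable sketch. It uses an in-memory SQLite database from the Python standard library purely as a stand-in for the real backend, and a deliberately reduced fragment table (id, xpath, parent_id); the helper names and the sample rows are hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table holding a tiny fragment tree."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fragment ("
        " id INTEGER PRIMARY KEY,"
        " xpath TEXT NOT NULL UNIQUE,"
        " parent_id INTEGER REFERENCES fragment(id))"
    )
    rows = [
        (1, "/anchor1", None),
        (2, "/anchor1/A", 1),
        (3, "/anchor1/A/B", 2),
        (4, "/anchor1/A/C", 2),
        (5, "/anchor1/A/B/D", 3),
    ]
    conn.executemany(
        "INSERT INTO fragment (id, xpath, parent_id) VALUES (?, ?, ?)", rows
    )
    return conn


def children_of(conn, xpath):
    """First-level children only: a join on parent_id, no string parsing."""
    cursor = conn.execute(
        "SELECT child.xpath FROM fragment AS child"
        " JOIN fragment AS parent ON child.parent_id = parent.id"
        " WHERE parent.xpath = ? ORDER BY child.xpath",
        (xpath,),
    )
    return [row[0] for row in cursor]


def parent_of(conn, xpath):
    """Return the xpath of the parent fragment, or None for a root."""
    row = conn.execute(
        "SELECT parent.xpath FROM fragment AS child"
        " JOIN fragment AS parent ON child.parent_id = parent.id"
        " WHERE child.xpath = ?",
        (xpath,),
    ).fetchone()
    return row[0] if row else None
```

A prefix match on the xpath would also return the grandchild `/anchor1/A/B/D` when asking for the children of `/anchor1/A`, which is exactly the post-filtering the parent reference avoids.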

For non-parent-child relations, the relation would have to be associated with some leaf in the container structure. We can store the relative xpath of the source and the destination node in the relation table, so we know where to attach them. The absolute xpath can be rebuilt from the id of the source (or destination) node. Not having them in the JSON will make it very easy to delete them, since we only have to delete a record from the relation table.

2.2       Generic schema (current proposal)

  • An anchor is a fragment, as such it can have properties

  • The first element of the xpath array is the name of the anchor. All the children append elements to that.

  • All the fragments have a reference to the anchor point.

  • The full xpath is stored as text

  • The parent ID column in the fragment table is there for transactional integrity. It references the id column to prevent creating orphan records under concurrency. It also serves lookup performance for parent/children.

  • A schema node identifier can be inferred from the xpath (in fact, for containers it is the xpath bar the first element). There is no need to link it to a model set.

  • The link between anchor points and module sets is purely for storing the association. The DB SPI has no notion of the model semantics.
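The observation above that a schema node identifier can be inferred from the xpath can be sketched as a small helper. For containers it simply drops the first element (the anchor name), as stated in the list; removing the key predicates of list elements as well is an assumption of this sketch, and the function name `schema_node_identifier` is hypothetical.

```python
import re

# A run of one or more consecutive [@key='value'] predicates on a single step.
_PREDICATES = re.compile(r"(?:\[@[^=\]]+='[^']*'\])+")


def schema_node_identifier(xpath):
    """Derive a schema node identifier from a stored xpath.

    Key predicates are removed first, so that a '/' inside a key value
    cannot be mistaken for a step separator; then the anchor step is dropped.
    """
    without_keys = _PREDICATES.sub("", xpath)
    steps = without_keys.strip("/").split("/")
    return "/" + "/".join(steps[1:])
```

All elements of the same keyed list then share one identifier, which is what a type query needs.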

The proposed DB for this schema is Postgres (as it uses some Postgres-specific datatypes).

Note 1. The module_set table above will not be used like this in the PoC. Instead we will start with a ‘modules’ table which will contain the complete source for each module, including columns for namespace and revision.

Note 2. The latest DB Schema (as implemented) is documented on CPS Internal Relation DB Schema

2.2.1       Schema Code

v 0.1 Initial Setup SQL
CREATE TABLE RELATION_TYPE
(
    RELATION_TYPE TEXT NOT NULL,
    ID SERIAL PRIMARY KEY
);

CREATE TABLE DATASPACE
(
    ID SERIAL PRIMARY KEY,
    NAME TEXT NOT NULL,
    CONSTRAINT "UQ_NAME" UNIQUE (NAME)
);

CREATE TABLE SCHEMA_NODE
(
    SCHEMA_NODE_IDENTIFIER TEXT NOT NULL,
    ID SERIAL PRIMARY KEY
);

CREATE TABLE MODULE_SET
(
    MODULE_SET_REFERENCE TEXT NOT NULL,
    ID SERIAL PRIMARY KEY
);

CREATE TABLE FRAGMENT
(
    ID BIGSERIAL PRIMARY KEY,
    XPATH TEXT NOT NULL,
    DATASPACE_ID INTEGER NOT NULL REFERENCES DATASPACE(ID),
    ATTRIBUTES JSONB,
    ANCHOR_ID BIGINT REFERENCES FRAGMENT(ID),
    PARENT_ID BIGINT REFERENCES FRAGMENT(ID),
    MODULE_SET_ID INTEGER REFERENCES MODULE_SET(ID),
    SCHEMA_NODE_ID INTEGER REFERENCES SCHEMA_NODE(ID)
);

CREATE TABLE RELATION
(
    FROM_FRAGMENT_ID BIGINT NOT NULL REFERENCES FRAGMENT(ID),
    TO_FRAGMENT_ID BIGINT NOT NULL REFERENCES FRAGMENT(ID),
    RELATION_TYPE_ID INTEGER NOT NULL REFERENCES RELATION_TYPE(ID),
    FROM_REL_XPATH TEXT NOT NULL,
    TO_REL_XPATH TEXT NOT NULL,
    CONSTRAINT RELATION_PKEY PRIMARY KEY (TO_FRAGMENT_ID, FROM_FRAGMENT_ID, RELATION_TYPE_ID)
);

CREATE INDEX "FKI_FRAGMENT_DATASPACE_ID_FK" ON FRAGMENT USING BTREE(DATASPACE_ID);
CREATE INDEX "FKI_FRAGMENT_MODULE_SET_ID_FK" ON FRAGMENT USING BTREE(MODULE_SET_ID);
CREATE INDEX "FKI_FRAGMENT_PARENT_ID_FK" ON FRAGMENT USING BTREE(PARENT_ID);
CREATE INDEX "FKI_FRAGMENT_ANCHOR_ID_FK" ON FRAGMENT USING BTREE(ANCHOR_ID);
CREATE INDEX "PERF_SCHEMA_NODE_SCHEMA_NODE_ID" ON SCHEMA_NODE USING BTREE(SCHEMA_NODE_IDENTIFIER);
CREATE INDEX "FKI_SCHEMA_NODE_ID_TO_ID" ON FRAGMENT USING BTREE(SCHEMA_NODE_ID);
CREATE INDEX "FKI_RELATION_TYPE_ID_FK" ON RELATION USING BTREE(RELATION_TYPE_ID);
CREATE INDEX "FKI_RELATIONS_FROM_ID_FK" ON RELATION USING BTREE(FROM_FRAGMENT_ID);
CREATE INDEX "FKI_RELATIONS_TO_ID_FK" ON RELATION USING BTREE(TO_FRAGMENT_ID);
CREATE INDEX "PERF_MODULE_SET_MODULE_SET_REFERENCE" ON MODULE_SET USING BTREE(MODULE_SET_REFERENCE);
CREATE UNIQUE INDEX "UQ_FRAGMENT_XPATH" ON FRAGMENT USING btree(xpath COLLATE pg_catalog."default" text_pattern_ops, dataspace_id);