CPS-1733: Upgrade YANG Schema-Set for CM Handle Using a Module Set Tag

https://lf-onap.atlassian.net/browse/CPS-1733

https://lf-onap.atlassian.net/browse/CPS-1734

A study is required to clearly define the scope and impact of updating YANG model schema sets for CM handles.

Requirements

Functional

Interface

Requirement

Additional Information

Sign Off

1

CPS-NCMP-I-01

Upgrade module set for CM-Handle(s) using a moduleSetTag

The module set tag is owned (defined) by the DMI plugin
Known module set tags should prevent 'trips' to the DMI for module information
Unused data shall be cleaned up
All datastores are accepted

Jul 25, 2023 

2

CPS-NCMP-I-01

Initial inventory shall support the (optional) moduleSetTag too

Existing performance shall not degrade.
Known module set tags should prevent 'trips' to DMI for Module information

Jul 25, 2023 

3

CPS-NCMP-I-01

Update module set for CM handle(s) using a blank (not defined) moduleSetTag

Use the 'old' algorithm from initial inventory

Jul 25, 2023 

4

CPS-E-05

Read moduleSetTag for a given CM handle (id)

Probably no code changes required (just a model change)

Jul 25, 2023 

5

CPS-E-05

Query Cm Handle(s) using CPS Path with moduleSetTag

Probably no code changes required (just a model change)

Jul 25, 2023 

6

CPS-E-05.e
(e for events)

A new notification informing the client of the old and new values of moduleSetTag

Use the same topic as CM Handle LCM Events (and future trust-level change notifications)

Jul 25, 2023 

Error Handling

#

Error Scenario

Expected behavior

Sign-off

1

Missing CM Handles (in list, some are OK)

Similar to 'Initial Inventory', i.e. the response should include a list of 'failed' CM handles



2

Upgrade request for 'cached' data (cache enabled)

Refuse request; not supported



Characteristics



#

Parameter

Expectation

Notes

1

Request Frequency
Upgrade to NEW moduleSetTags

3 upgrades  / second → 180 upgrades / minute

The assumption is that there are 100 modules in each moduleSet and that a moduleSet is available from a DMI plugin within 0.5 seconds.
Also assume there is a 90% overlap in modules across all CM handles.
Across 20,000 nodes there are 10 different moduleSetTags.

2

Request Frequency
Upgrade to already existing moduleSetTags

6 upgrades  / second → 360 upgrades / minute



3

Test Environment



  1. CPS and NCMP

requests:
    cpu: 2000m
    memory: 2Gi
limits:
    cpu: 3000m
    memory: 3Gi

  2. Postgres

requests:
    cpu: 4000m
    memory: 1Gi
limits:
    cpu: 6000m
    memory: 3Gi




4

Concurrent request

12 clients sending requests to 1 NCMP instance simultaneously



5

Number of CM Handles in one request





Performance results are posted here: Performance test for updating YANG schema set API
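The request-frequency expectations in the Characteristics table can be checked with a few lines of arithmetic. The sketch below only uses figures stated in the table (3 and 6 upgrades per second, 20,000 nodes). The scenario of upgrading every node back to back at the expected rate is an assumption made purely for illustration, and the function names are hypothetical.

```python
# Figures taken from the Characteristics table.
NODES = 20_000
NEW_TAG_RATE_PER_SECOND = 3       # upgrade to NEW moduleSetTags
EXISTING_TAG_RATE_PER_SECOND = 6  # upgrade to already existing moduleSetTags


def per_minute(rate_per_second):
    """Convert an upgrade rate per second into a rate per minute."""
    return rate_per_second * 60


def minutes_to_upgrade(nodes, rate_per_second):
    """Minutes needed to upgrade all nodes at a sustained rate (illustrative assumption)."""
    return nodes / per_minute(rate_per_second)


new_tag_minutes = minutes_to_upgrade(NODES, NEW_TAG_RATE_PER_SECOND)
existing_tag_minutes = minutes_to_upgrade(NODES, EXISTING_TAG_RATE_PER_SECOND)

print(f"new tags:      {per_minute(NEW_TAG_RATE_PER_SECOND)} upgrades/minute, "
      f"{new_tag_minutes:.1f} minutes for {NODES} nodes")
print(f"existing tags: {per_minute(EXISTING_TAG_RATE_PER_SECOND)} upgrades/minute, "
      f"{existing_tag_minutes:.1f} minutes for {NODES} nodes")
```

Under these assumptions, a full roll-out to already known tags takes half as long as one to new tags, which is the motivation for avoiding trips to the DMI.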

Out-of-scope

  • Upgrade of models for cached data: "ncmp-datastore:operational" is out of scope.

 Note: only upgrade of pass-through, i.e. non-cached, data is in scope; "ncmp-datastore:passthrough-running" and "ncmp-datastore:passthrough-operational" are in scope.

Assumptions

Assumption

Notes

Issues & Decisions

Issue

Notes 

Decision

1

Type of Interface REST or Kafka

REST

  1. easier/cheaper

  2. possibly (complete) re-use of (inventory) existing interface

  3. well documented by OpenAPI

  4. support test stubs (contract testing)

Kafka

  1. more complicated/costly

  2. plethora of topics and messages, not well documented (no standard)

  3. More robust (request persisted until acknowledged)

Meeting Jul 19, 2023: agreed to use the REST interface.

2

moduleSetTag based on hash not (yet) implemented!

This study seems to assume the module set tag is already implemented, but it isn't!



[kieran mccarthy]: The request is to introduce a module set tag (an identifier of some kind) for this new upgrade use case; it is not assumed to exist already. This will be costlier, but it is important for performance because it avoids pulling models that are already known to NCMP.

3

Expected Responses

when and what content?

  1. just acknowledge upon receipt

  2. response on completion

Should follow error handling for Inventory

Jul 25, 2023 @kieran mccarthy and team.
(Error) handling should be similar to initial inventory, i.e. be able to report partial success.
The initial request gets acknowledged; further processing is done asynchronously.
Any incorrect cm handle ids will be returned immediately using a 'DmiPluginRegistrationErrorResponse' (existing class)
CPS will use Hazelcast to manage (and persist!) the requested upgrades

4

Should CM-Handle state change (e.g. to 'locked')  'during' upgrade?

Yes, but it is important that the CM handle is not locked for long, which makes it important to use the moduleSetTag

@kieran mccarthy Jul 20, 2023: should be set to LOCKED until the new moduleSet is associated with the cmhandle. I think there is a LOCKED_UPGRADING if I remember right. The lock reason should mention "upgrade" and have the usual timestamps. A separate notification will be sent with details of the old and new values for moduleSetTag; see decision #11
See CPS-799 Spike: Define states and state handling for CM handle

5

moduleSetTag is Optional (owned and defined by DMI Plugin)

Upgrade without a moduleSetTag continues to be supported using the delete/add cm handle approach.
If the update includes a moduleSetTag, it is considered an upgrade

Meeting Jul 19, 2023: agreed to use the REST interface with "upgradedCmHandles".

6

moduleSetTag should be usable during initial inventory too!

Initial inventory should be sped up too (impact on capability requirements?!)
Note: the current inventory 'createdCmHandles' only supports a list of cm handle ids. Can this be modified in a backward-compatible way to optionally include a moduleSetTag?

Jul 25, 2023 @kieran mccarthy and team: yes; if needed, a backward-incompatible change can be handled as a new version of the interface

7

Exact name moduleSetTag



meeting Jul 19, 2023 agreed on moduleSetTag

8

How to store: hardcoded (Postgres schema), inventory YANG model or as additional property (private or public)?

Update the inventory YANG model so it can be queried (without code changes!) like other aspects such as 'state'


Meeting Jul 19, 2023: agreed to update the YANG model

9

additional operation for inventory Interface: 'upgradedCmHandles'

The interface currently supports

  1. createdCmHandles

  2. updatedCmHandles

  3. removedCmHandles

It could possibly be done as part of 'updatedCmHandles' by recognizing the moduleSetTag update, but this would be messy and confusing; also, properties could then theoretically be updated at the same time as the module set.



meeting Jul 19, 2023 agreed 



10

Clarify capabilities

  • Expected Response/process times

  • 'batch' size

  • concurrent requests combined with request frequency, i.e. 12 * 25 requests per second?!

Part of requirement listed above, to be finalized in a  meeting after the holidays on Aug 22, 2023 

11

Separate Notification on change of moduleSetTag

  • There will already be an LCM notification, maybe that is enough?

Jul 25, 2023 @kieran mccarthy separate notification see requirement #6

12

Reuse schema set name or create a new one?

Each cm handle now has a (unique) schema set name which is the same as the cm handle (id).
We could simply change the module references, but it might be more correct to create a NEW schema set and delete the old one. The new name could be a concatenation of the cm handle (id) and the module set tag

  • Pros of reusing the schema set name and DB Id :

    • less code change.

    • less DB calls. (Performance)

    • we can handle blank module set tag.

  • Cons  of reusing the schema set name :

    • No easy visibility from the schema set name (upgraded or not)

    • The cache needs to be cleared / upgraded as well

    • It seems incorrect

Aug 31, 2023 Team agreed to re-use existing name



13

Conflicting error codes: legacy codes for registration vs. event status codes (dataOperation and Subscription)

See

  • org.onap.cps.ncmp.api.models.CmHandleRegistrationResponse.RegistrationError

    • 2 digits

    • 00 unknown error

    • 01-03 error scenarios

    • No success code

  • org.onap.cps.ncmp.api.NcmpEventResponseCode

    • variable # of digits

    • 0 success

    • 1-99 reserved other success

    • 100-999 errors

ModuleSetTag error scenarios have overlapping codes for 'cm handle not found/not found'



Some of these are already in use!!! Can we fix now or live with inconsistencies forever?!



NEW

  1. Are the error codes being used? Most of them are (Csaba).

  2. Fixed number of digits: can all error codes have the same length of 3 digits (check other dependencies)? Agreed to leave as is

  3. http errors are returned - error 200



Create a new Jira and agree priority with E// - Prioritise it's a blocker 

Sep 28, 2023 



Solution Proposals 

Update Inventory REST Interface

A module upgrade request, made over either REST or a Kafka event, that supplies a 'moduleSetTag' property indicates to NCMP that a cmhandle has a new moduleSet (the node has been upgraded). The moduleSetTag is a unique identifier for the set of modules associated with a cmhandle. The request triggers NCMP to do one of two things. If NCMP does not yet know the moduleSetTag, it retrieves the new module set for this cmhandle from the DMI plugin. If NCMP already knows the moduleSetTag from a previous module retrieval for a cmhandle that supplied the same tag, it should not retrieve the modules from the DMI plugin again, as it already has them; instead it uses its stored moduleSetTag to get the set of modules. NCMP can then compute the delta against the cmhandle's existing moduleSet and update the moduleSet for the cmhandle.

The above proposal would require model updates to NCMP to store the association between a moduleSetTag supplied by a DMI plugin (during an updateCmHandles request) and the modules. Initially the moduleSetTag is not known to NCMP. However, after retrieving the modules for the cmhandle for the first time, NCMP can store the association. Any subsequent cmhandle that references the same moduleSetTag will not require NCMP to go back to the DMI plugin for the moduleSet.

Reuse the NCMP Inventory API (CPS-NCMP-I-01).

URI: POST /v1/ch

Request body (each entry gives the new moduleSetTag for that cmhandle):

{
  "upgradedCmHandles": [
    {
      "cmhandle": "<cmhandle-id-1>",
      "moduleSetTag": "ffsdfg55342"
    },
    {
      "cmhandle": "<cmhandle-id-2>",
      "moduleSetTag": "ffsdfg55342"
    },
    {
      "cmhandle": "<cmhandle-id-3>",
      "moduleSetTag": "ddger34324"
    }
  ]
}
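As a rough illustration of the proposed request body, the sketch below builds it from a mapping of cm handle ids to module set tags. The property names 'upgradedCmHandles', 'cmhandle' and 'moduleSetTag' are taken from the proposal above; the helper function and the sample ids are hypothetical, and the final API may differ.

```python
import json


def build_upgrade_request(tag_by_cm_handle_id):
    """Build an 'upgradedCmHandles' request body from a {cm handle id: moduleSetTag} mapping."""
    return {
        "upgradedCmHandles": [
            {"cmhandle": cm_handle_id, "moduleSetTag": module_set_tag}
            for cm_handle_id, module_set_tag in tag_by_cm_handle_id.items()
        ]
    }


# Sample data mirroring the proposal: two cm handles share one tag, a third has another.
body = build_upgrade_request({
    "cmhandle-id-1": "ffsdfg55342",
    "cmhandle-id-2": "ffsdfg55342",
    "cmhandle-id-3": "ddger34324",
})

# Serialized form that would be sent as the body of POST /v1/ch.
payload = json.dumps(body, indent=2)
print(payload)
```

Because two of the three cm handles share a tag, NCMP would need at most two module retrievals from the DMI plugin for this request.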

Proposed Inventory Model rev. 2023-08-23

Lines added: 11-14 (revision "2023-08-23") and 83-85 (leaf module-set-tag)

2023-08-23
module dmi-registry {

  yang-version 1.1;

  namespace "org:onap:cps:ncmp";

  prefix dmi-reg;

  contact "toine.siebelink@est.tech";

  revision "2023-08-23" {
    description
    "Added ModuleSetTag";
  }

  revision "2022-05-10" {
    description
    "Added DataSyncEnabled, SyncState with State, LastSyncTime, DataStoreSyncState with Operational and Running syncstate";
  }

  revision "2022-02-10" {
    description
    "Added State, LockReason, LockReasonDetails to aid with cmHandle sync and timestamp to aid with retry/timeout scenarios";
  }

  revision "2021-12-13" {
    description
    "Added new list of public additional properties for a Cm-Handle which are exposed to clients of the NCMP interface";
  }

  revision "2021-10-20" {
    description
    "Added dmi-data-service-name & dmi-model-service-name to allow separate DMI instances for each responsibility";
  }

  revision "2021-05-20" {
    description
    "Initial Version";
  }

  grouping LockReason {
    leaf reason {
      type string;
    }
    leaf details {
      type string;
    }
  }

  grouping SyncState {
    leaf sync-state {
      type string;
    }
    leaf last-sync-time {
      type string;
    }
  }

  grouping Datastores {
    container operational {
      uses SyncState;
    }
    container running {
      uses SyncState;
    }
  }

  container dmi-registry {
    list cm-handles {
      key "id";
      leaf id {
        type string;
      }
      leaf dmi-service-name {
        type string;
      }
      leaf dmi-data-service-name {
        type string;
      }
      leaf dmi-model-service-name {
        type string;
      }
      leaf module-set-tag {
        type string;
      }

      list additional-properties {
        key "name";
        leaf name {
          type string;
        }
        leaf value {
          type string;
        }
      }

      list public-properties {
        key "name";
        leaf name {
          type string;
        }
        leaf value {
          type string;
        }
      }

      container state {
        leaf cm-handle-state {
          type string;
        }

        container lock-reason {
          uses LockReason;
        }

        leaf last-update-time {
          type string;
        }

        leaf data-sync-enabled {
          type boolean;
          default "false";
        }

        container datastores {
          uses Datastores;
        }
      }
    }
  }
}



Upgrade With Module Set Tag

  1. Use new 'upgradedCmHandles' operation to upgrade CH-1

  2. Find a cmHandle with given 'moduleSetTag' 
    (if not found use algorithm defined in next section)

  3. Get all module references for cm handle (schemaset) with same tag (CH-2) 

  4. Update module references for the anchor/schemaset CH-1
    (See open issue # 12:  should we create a NEW schema set and delete the old one ?)

  5. Update inventory for CH-1 with given module set tag
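The steps above can be sketched with plain in-memory dictionaries standing in for the NCMP inventory and the CPS schema sets. This is only an illustration of the flow, not the CPS implementation: the data shapes and function names are hypothetical. The schema set name is taken to be the cm handle id, as noted under issue #12.

```python
def find_cm_handle_with_tag(inventory, module_set_tag, excluding):
    """Step 2: find another cm handle that already uses the given moduleSetTag."""
    for cm_handle_id, entry in inventory.items():
        if cm_handle_id != excluding and entry.get("moduleSetTag") == module_set_tag:
            return cm_handle_id
    return None


def upgrade_with_tag(inventory, schema_sets, cm_handle_id, module_set_tag):
    """Upgrade a cm handle from a known tag.

    Returns True when the upgrade was done without a trip to the DMI,
    False when the caller has to fall back to the algorithm without a tag.
    """
    if not module_set_tag:
        return False
    donor_id = find_cm_handle_with_tag(inventory, module_set_tag, excluding=cm_handle_id)
    if donor_id is None:
        return False
    # Steps 3 and 4: copy the module references of the cm handle with the same tag.
    schema_sets[cm_handle_id] = set(schema_sets[donor_id])
    # Step 5: record the new tag in the inventory.
    inventory[cm_handle_id]["moduleSetTag"] = module_set_tag
    return True


# Hypothetical sample data: CH-2 already carries the tag that CH-1 is upgraded to.
inventory = {"CH-1": {"moduleSetTag": "old-tag"}, "CH-2": {"moduleSetTag": "new-tag"}}
schema_sets = {
    "CH-1": {("module-a", "2020-01-01")},
    "CH-2": {("module-a", "2021-01-01"), ("module-b", "2021-01-01")},
}
upgraded = upgrade_with_tag(inventory, schema_sets, "CH-1", "new-tag")
print(upgraded, sorted(schema_sets["CH-1"]))
```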

Support for Upgrade Without Module Set Tag

Note: the same algorithm should be used when the module set tag is not set (leaf not present), is blank, or when no other cmHandle with the same tag can be found.

If the moduleSetTag JSON property is set to "" (empty string) or is absent, the request still indicates that the moduleSet for a cmHandle has been updated, but there is no associated moduleSetTag available for that cmHandle. This approach will always result in a full request to the DMI plugin for the module set of the cmHandle.

Basically the same steps as during initial inventory can be followed, except for the creation of the cmHandle (anchor), as that already exists.
Refer to org.onap.cps.ncmp.api.inventory.sync.ModuleSyncTasks#performModuleSync for the relevant code


  1. get all module references from the (upgraded) cm handle (via DMI)

  2. find out which module references are NEW to CPS-Core

  3. get the yang resources for the new modules from the (upgraded) cm handle (via DMI)

  4. create/save a new (TBC, issue #12) schema set using the existing module reference and new yang resources

  5. (new) update the schema set for the upgraded cm handle

  6. (new) set the moduleSetTag in the ncmp inventory for the upgraded cm handle (maybe this needs to be done at the start?)

  7. (new) delete the old schema set of the upgraded cm handle (depends on the decision re issue #12)
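A minimal sketch of these steps is given below, with a fake DMI client and dictionaries in place of CPS-Core and the NCMP inventory. All class, function and parameter names are hypothetical. It assumes the existing schema set name (the cm handle id) is re-used, so step 7 does not apply.

```python
class FakeDmi:
    """Stand-in for a DMI plugin; records which yang resources were requested."""

    def __init__(self, modules_by_cm_handle):
        self.modules_by_cm_handle = modules_by_cm_handle  # {cm handle id: {module ref: yang source}}
        self.resource_requests = []

    def get_module_references(self, cm_handle_id):
        return list(self.modules_by_cm_handle[cm_handle_id])

    def get_yang_resources(self, cm_handle_id, module_refs):
        self.resource_requests.append(list(module_refs))
        return {ref: self.modules_by_cm_handle[cm_handle_id][ref] for ref in module_refs}


def upgrade_via_dmi(cm_handle_id, module_set_tag, known_modules, schema_sets, inventory, dmi):
    """Upgrade a cm handle when no usable moduleSetTag is available."""
    # 1. get all module references from the upgraded cm handle
    module_refs = dmi.get_module_references(cm_handle_id)
    # 2. find out which module references are new to CPS-Core
    new_refs = [ref for ref in module_refs if ref not in known_modules]
    # 3. get the yang resources for the new modules only
    new_resources = dmi.get_yang_resources(cm_handle_id, new_refs) if new_refs else {}
    known_modules.update(new_resources)
    # 4 and 5. update the schema set, re-using the existing name (the cm handle id)
    schema_sets[cm_handle_id] = set(module_refs)
    # 6. set the (possibly blank) moduleSetTag in the inventory
    inventory[cm_handle_id]["moduleSetTag"] = module_set_tag
    return new_refs


ref_a, ref_b = ("module-a", "2021-01-01"), ("module-b", "2021-01-01")
dmi = FakeDmi({"CH-1": {ref_a: "module module-a {}", ref_b: "module module-b {}"}})
known_modules = {ref_a: "module module-a {}"}  # module-a is already stored
schema_sets = {"CH-1": {ref_a}}
inventory = {"CH-1": {}}
new_refs = upgrade_via_dmi("CH-1", "", known_modules, schema_sets, inventory, dmi)
print(new_refs)
```

Only the yang resource for module-b is requested from the DMI; module-a is already known, which is the delta behaviour described in steps 2 and 3.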

Use ModuleSyncWatchdog (parallel processing)

Initial inventory is driven by (batches of) CM handle state changes on the main thread, and then the org.onap.cps.ncmp.api.inventory.sync.ModuleSyncWatchdog processes those!

Examining this in detail led to some additional design questions

  1. should upgrade be dealt with in parallel too?

  2. should upgrade be done in batches (of 100) too?

  3. need to use a (new?) state for UPGRADE that the watchdog can see and process

  4. re-use existing methods and make creating an anchor optional, or create new upgrade-specific methods and reuse/slightly duplicate initial inventory methods? Methods like

    1. org.onap.cps.ncmp.api.inventory.sync.ModuleSyncService#syncAndCreateSchemaSetAndAnchor

    2. org.onap.cps.ncmp.api.inventory.sync.ModuleSyncService#createSchemaSetAndAnchor

Aug 31, 2023: the team agreed to re-use ModuleSyncWatchdog and batching, with the following considerations

  1. A new (shared) Hazelcast map is needed, with ModuleSetTags as keys (values: lists of module references), holding the tags that have been processed but not saved yet. It is to be used both inside a batch and across different instances to prevent unnecessary trips to the DMI/node

  2. The watchdog needs to use the lock state AND the lock reason to determine which nodes need to be upgraded

  3. Initial inventory and upgrade are not likely to happen at the same time, but the watchdog can handle both; of course performance would be affected if that does occur

  4. Cache(s) need to be cleared or updated, as the algorithm will re-use the existing schema set name

  5. Create new upgrade-specific methods and reuse/slightly duplicate initial inventory methods

  6. Legacy (and new) checks for lock need to check the lock reason too now, to differentiate between failed initial inventory and upgrade

  7. More specific (new) failure reasons are probably needed to differentiate between initial inventory and upgrade failures

  8. As usual, small commits and early reviews are advised to introduce all this functionality. Possible steps:

    1. set lock state and reason upon request

    2. watchdog just lists the nodes to be upgraded (and does not mix them up with failed initial inventory)

    3. perform first upgrade for a new ModuleSetTag

    4. introduce and use new Hazelcast map

    5. perform upgrade of a node with a moduleSetTag that is already in cache

    6. perform upgrade of a node with a moduleSetTag that is already in DB (i.e. introduce DB query)

    7. handle failure of upgrade (re-use same retry mechanism as initial inventory but with different lock reason!)

    8. etc.
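Two of the considerations above, selecting nodes by lock state AND lock reason, and sharing already processed module set tags, can be sketched as follows. A plain dictionary stands in for the shared Hazelcast map, and the lock reason values 'MODULE_UPGRADE' and 'MODULE_SYNC_FAILED' are hypothetical labels chosen for this illustration; the function names are hypothetical too.

```python
UPGRADE_LOCK_REASON = "MODULE_UPGRADE"  # hypothetical label for upgrade locks


def select_cm_handles_to_upgrade(cm_handles):
    """Pick locked cm handles whose lock reason marks them for upgrade."""
    return [
        cm_handle["id"]
        for cm_handle in cm_handles
        if cm_handle["state"] == "LOCKED" and cm_handle["lockReason"] == UPGRADE_LOCK_REASON
    ]


def module_refs_for_tag(module_set_tag, processed_tags, fetch_from_dmi):
    """Return module references for a tag, calling the DMI only for unknown or blank tags."""
    if module_set_tag and module_set_tag in processed_tags:
        return processed_tags[module_set_tag]
    module_refs = fetch_from_dmi()
    if module_set_tag:
        # Remember the result so other cm handles with the same tag skip the DMI trip.
        processed_tags[module_set_tag] = module_refs
    return module_refs


cm_handles = [
    {"id": "CH-1", "state": "LOCKED", "lockReason": "MODULE_UPGRADE"},
    {"id": "CH-2", "state": "LOCKED", "lockReason": "MODULE_SYNC_FAILED"},
    {"id": "CH-3", "state": "READY", "lockReason": None},
]
to_upgrade = select_cm_handles_to_upgrade(cm_handles)
print(to_upgrade)
```

In this sketch a failed initial inventory (CH-2) is not mixed up with an upgrade (CH-1), because the lock reason is checked in addition to the lock state.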

Use-Case Overview (Sync in watchDog)



Operation

Tag
Provided

Tag
Cached

Tag
In DB
(other cm handle)

Steps

1

Create

No

N/A

N/A

  1. get modules (delta) from Node (DMI)

  2. create schema set 

  3. create anchor

2

Create

Yes

No

No

  1. get modules (delta) from Node (DMI)

  2. create schema set

  3. create anchor

3

Create

Yes

No

Yes

  1. get modules from DB (other cm handle) No delta

  2. create schema set (blank map for new resources)

  3. create anchor

4

Create

Yes