Table of Contents |
---|
...
Assumption | Notes | |
---|---|---|
1 |
Issues and Decisions
# | Issue | Notes | Decision |
---|---|---|---|
1 | How fast should CPS (and DB) be able to process max heart beat failures? | is 60K really realistic if ENM goes down we should get a notification for each node do we ?! | PoC has shown 60 seconds is reasonable |
2 | Restart of NCMP | Should/Can this be handled? | As of now, there is no such case is being considered. |
3 | Does DMI Plugin provide NCMP with a health check URL during registration? Either, just rely on the default one provided with Spring boot actuator? | Document the contract. Its just the interface that matters and not the implementation. | Spring boot actuator interface |
4 | Error during cmHandle registration | If an error occurs during registration what trustlevel should the cmHandle be set to? IN eth following scenarios
| Agreed to Leave as is, if notification for a node already registered, we can process the other notification separately Team Notes: [Team] When state was provided to 'COMPLETE' or 'NONE' and the registration fails , state if trust level is still set to the provided state regardless of the current state of the cm handle (deleted/deleting, advise, ready, locked) |
5 | Module sync watchdog issues/error scenarios | If cmHandle is set to none/incomplete module sync will automatically retry (Is this acceptable?) If the module sync fails we will still send a Complete message (Is this acceptable?) Registering all cmHandles could take up to 20 mins, what should happen if the last sync fails as the notification would have been sent 20 mins ago? | When CMLevel is in: DELETING/DELETED - No Truslevel notification update ADVISE - No trustLevel notification update READY - Truslevel notification update LOCKED -Truslevel notification update
Team Notes: 12/10/2023 [Team] Notification SHALL only be sent when the Cm handle is set to Ready and locked regardless of the report from DMI Do we still update the cache? Yes. |
6 | When cm handle trustLevel state stays the same | Do we include that cm handle ID or not for notifications? | No you don't if no changes if it stays the the same
Team Notes: 12/10/2023 Scenario: DMI plugin up/down the previous state of the cm handle (trsutLevel) should be considered for notifications |
Description
- Define scenarios which cause a CM Handle to go stale.
- Implement changes to support tracking of CM Handle Freshness/Staleness.
...
Drawio | ||||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
Interface | Name | Trigger | Description | Type | Endpoint or Topic | Schema |
---|---|---|---|---|---|---|
1 | HealthCheck | 30 second interval (configurable) | NCMP is to perform a health check against each of the DMI Plugins | REST | http://<dmiPluginServiceName>/manage/health This endpoint will be the standard heath check endpoint provided by spring boot actuator. We don't store it anywhere. We just document it for now. | |
2 | CMHandle trust level change | A CMHandle managed by DMI Plugin's trust level has changed | data contains {trustLevel: ENUM} event id is cmhandle id in kafka header | Kafka | kafka topic: dmi-device-heartbeat | <cloudEvents-header> id : <cmhandleId> type : org.onap.cm.events.trustlevel-notification data : { |
3 | CMHandle Query API with trustLevel Query Condition | Client Request | CmHandle is to be returned based on the values in above CMHandle Trust Map | REST |
| { |
4 | Notification on Trust Level Change | NCMP | NCMP sends notification upon trust level changes | Kafka | kafka-topic: cm-events | <cloudEvents-headers> "data": { |
Managing TrustLevel
DMI Plugins
- NCMP is checking every DMI Plugin for health at interface 1 every 30 seconds using the Trust Level DMI Trust Map
- IF a DMI Plugin goes down, that DMI Plugin's trust level health status is updated to NONE in the Trust Level DMI Trust Map
- The CM handles corresponding to DMI should be set to NONE.
- IF a DMI Plugin comes back up, Trust level Heath status is set back to COMPLETE for that DMI plugin only.
More details of health check URL can be accessed via:
CPS-1857 Document watchdog job impl. with health check URL
...
- It is the responsibility of the DMI Plugins to update NCMP about the heartbeat of CMHandle.
- Through interface 2, DMI Plugins will provide a Kafka event on the changing of trustworthiness state of a CMHandle.
- NCMP receives this event and updates the CM Handle Trust Map accordingly
- Needs to be able to handle a throughput of 60,000 State changes per minute for 2 instances
...
Query CM Handle with Trust Level
Body of request will be in the format as below:
Code Block language text title Search Trust Level Request Body { "cmHandleQueryParameters": [ { "conditionName": "cmHandleWithTrustLevel", "conditionParameters": [ {"trustLevel": "COMPLETE"} ] } ] }
There are two end points will be subject to query:
http://<host>:<port>/ncmp/v1/ch/id-searches
http://<host>:<port>/v1/ch/searches- Interface 3
- NCMP will first check trust level query parameters to determine which trust level (NONE, COMPLETE) is being searched.
- if Then, the target trust level is NONE
- The cm handles stored in CM Handle Trust Map having NONE will be returned.
must be selected. - If the target trust level is COMPLETEThe cm handles stored in CM Handle Trust Map having COMPLETE will be returned(comes from the request) is equal to effective trust level (obtained in step a.),
then cm handle should be included in the response.
- if Then, the target trust level is NONE
Notifications on Trust Level Changes
...