...

Jira: CPS-1415


Issues and Decisions

| # | Issue | Notes | Decision |
|---|-------|-------|----------|
| 1 | How fast should CPS (and DB) be able to process the maximum number of heartbeat failures? | Is 60K really realistic? If ENM goes down we should get a notification for each node; do we? | PoC has shown 60 seconds is reasonable |
| 2 | Restart of NCMP | Should/can this be handled? | |
| 3 | | | |


Description

  1. Define scenarios which cause a CM Handle to go stale
  2. Implement changes to support tracking of CM Handle Freshness/Staleness

...

  1. DMI plugin identifies that the device is no longer contactable
  2. DMI plugin identifies that an underlying device manager managing the device (node) is out of sync with the device itself
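
The two scenarios above could be folded into a single check on the DMI plugin side. The following is a minimal sketch in Python, assuming only the two trust levels introduced by this epic (NONE and COMPLETE). `DeviceStatus`, its fields and `derive_trust_level` are hypothetical names used for illustration; they are not part of any NCMP or DMI API.

```python
from dataclasses import dataclass
from enum import Enum


class TrustLevel(Enum):
    NONE = "NONE"
    COMPLETE = "COMPLETE"


@dataclass(frozen=True)
class DeviceStatus:
    """Hypothetical view a DMI plugin has of one device (node)."""
    contactable: bool        # scenario 1: the device can still be reached
    manager_in_sync: bool    # scenario 2: the device manager is in sync with the device


def derive_trust_level(status: DeviceStatus) -> TrustLevel:
    """Assumption: either staleness scenario degrades the CM Handle to NONE."""
    if status.contactable and status.manager_in_sync:
        return TrustLevel.COMPLETE
    return TrustLevel.NONE


# Usage: a device that is reachable but whose manager is out of sync
level = derive_trust_level(DeviceStatus(contactable=True, manager_in_sync=False))
print(level.value)
```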

Requirements

Functional

| # | Interface | Requirement | Additional Information | Sign-off |
|---|-----------|-------------|------------------------|----------|
| 1 | CPS-NCMP-E-05 | The 'trustlevel' can be queried (is visible) on the same methods as currently the 'cm handle state' | Can be a new or existing (preferred) endpoint | |
| 2 | CPS-NCMP-E-05 | CM Handles can be queried (filter condition) on 'trustlevel' using a new 'trustLevel' condition (cannot use cpsPath condition) | | |
| 3 | CPS-NCMP-E-05 | Once a CM Handle is registered (TBD which state exactly?) the trust-level for that CM Handle should be reported to be 'COMPLETE' | | |
| 4 | CPS-NCMP-E-05 | Once DMI (plugin) is detected to be down the trust-level for all affected CM Handles should be reported to be 'NONE' | It might not need to be persisted. | |
| 5 | CPS-NCMP-I-01.e (REST or ASYNC TBD) | DMI plugin can report the current trustlevel of a single cm handle id (or a collection of cm handle ids?) | i.e. the DMI can tell NCMP the trustlevel is 'NONE' when a node heartbeat failure is detected and 'COMPLETE' once it is restored | |
| 6 | | Notification on trustlevel changes? | | |
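
To make the functional requirements above more concrete, here is a minimal in-memory sketch in Python. It is not the NCMP implementation: `TrustLevelRegistry` and all of its method names are hypothetical, and the sketch assumes a CM Handle becomes 'COMPLETE' as soon as it is registered, although the exact state is still TBD in the table.

```python
from enum import Enum
from typing import Dict, Iterable, List


class TrustLevel(Enum):
    NONE = "NONE"
    COMPLETE = "COMPLETE"


class TrustLevelRegistry:
    """Hypothetical tracker of trust levels per CM Handle id."""

    def __init__(self) -> None:
        self._trust_levels: Dict[str, TrustLevel] = {}
        self._dmi_plugins: Dict[str, str] = {}

    def register(self, cm_handle_id: str, dmi_plugin: str) -> None:
        # A registered CM Handle is reported as COMPLETE (assumed timing).
        self._dmi_plugins[cm_handle_id] = dmi_plugin
        self._trust_levels[cm_handle_id] = TrustLevel.COMPLETE

    def mark_dmi_down(self, dmi_plugin: str) -> None:
        # All CM Handles served by the DMI plugin that is down become NONE.
        for cm_handle_id, owner in self._dmi_plugins.items():
            if owner == dmi_plugin:
                self._trust_levels[cm_handle_id] = TrustLevel.NONE

    def report(self, cm_handle_ids: Iterable[str], trust_level: TrustLevel) -> None:
        # A DMI plugin reports the trust level of one or more CM Handles;
        # ids that were never registered are ignored in this sketch.
        for cm_handle_id in cm_handle_ids:
            if cm_handle_id in self._trust_levels:
                self._trust_levels[cm_handle_id] = trust_level

    def trust_level_of(self, cm_handle_id: str) -> TrustLevel:
        # The trust level is visible per CM Handle; raises KeyError if unknown.
        return self._trust_levels[cm_handle_id]

    def query(self, trust_level: TrustLevel) -> List[str]:
        # Filter condition on trust level; returns sorted CM Handle ids.
        return sorted(
            cm_handle_id
            for cm_handle_id, level in self._trust_levels.items()
            if level is trust_level
        )


# Usage with made-up ids
registry = TrustLevelRegistry()
registry.register("ch-1", "dmi-a")
registry.register("ch-2", "dmi-a")
registry.register("ch-3", "dmi-b")
registry.mark_dmi_down("dmi-a")
registry.report(["ch-1"], TrustLevel.COMPLETE)
print(registry.query(TrustLevel.NONE))
```

A dedicated `query` method mirrors the requirement that filtering uses its own 'trustLevel' condition rather than a cpsPath condition.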

Error Handling

| # | Error Scenario | Expected behavior |
|---|----------------|-------------------|
| 1 | NCMP restart | Options: (a) Trustlevels should remain as they were before the restart? (might depend on how much time has elapsed); (b) (preferred) (all instances) Trustlevels should be 'NONE' and need to be restored using an audit-request (not in scope). To be discussed, not sure if it can/should be handled. |
| 2 | | |
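
The preferred restart option could look roughly like the sketch below, written in Python under stated assumptions: the set of registered CM Handle ids survives a restart (persisted elsewhere), while trust levels are held in memory only. `VolatileTrustLevels` and its methods are hypothetical names, and `restore` stands in for the audit-request, which is not in scope.

```python
from enum import Enum
from typing import Dict, Iterable, Set


class TrustLevel(Enum):
    NONE = "NONE"
    COMPLETE = "COMPLETE"


class VolatileTrustLevels:
    """Hypothetical in-memory trust levels that are lost on restart."""

    def __init__(self, registered_ids: Iterable[str]) -> None:
        # Assumption: registered ids are reloaded from persistent storage.
        self._registered: Set[str] = set(registered_ids)
        # Not persisted, so this map is empty after every restart.
        self._levels: Dict[str, TrustLevel] = {}

    def get(self, cm_handle_id: str) -> TrustLevel:
        if cm_handle_id not in self._registered:
            raise KeyError(cm_handle_id)
        # Until restored, every known CM Handle is reported as NONE.
        return self._levels.get(cm_handle_id, TrustLevel.NONE)

    def restore(self, cm_handle_id: str, trust_level: TrustLevel) -> None:
        # Placeholder for whatever restores trust levels after a restart.
        if cm_handle_id in self._registered:
            self._levels[cm_handle_id] = trust_level


# Usage: constructing a new instance simulates a restart
after_restart = VolatileTrustLevels(["ch-1", "ch-2"])
print(after_restart.get("ch-1").value)
after_restart.restore("ch-1", TrustLevel.COMPLETE)
print(after_restart.get("ch-1").value)
```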

Characteristics

| # | Parameter | Expectation | Notes | Sign-off |
|---|-----------|-------------|-------|----------|
| 1 | dmi-down detection speed | 30 seconds (TBD) | | |
| 2 | Maximum number of cm-handles down reported by DMI in one request and/or per minute | 30,000 / minute | This looks like an 'ENM down'; not sure if that should be handled this way. A peak can be processed within 60 seconds. | |
| 3 | Processing time of all trustlevels for DMI-down and/or peak load by DMI | 1 second | | |
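
As a rough sanity check of the peak-load figures above, the sketch below converts them into a sustained rate and a per-report time budget. It assumes all 30,000 reports must be handled within the 60-second window and are processed one after another; the function names are illustrative only.

```python
def required_rate_per_second(reports: int, window_seconds: float) -> float:
    """Reports that must be processed per second to clear the peak in time."""
    return reports / window_seconds


def budget_ms_per_report(reports: int, window_seconds: float) -> float:
    """Average time available per report, in milliseconds, if handled sequentially."""
    return 1000 * window_seconds / reports


PEAK_REPORTS = 30_000   # cm-handles reported down per minute (from the table)
WINDOW_SECONDS = 60     # a peak can be processed within 60 seconds

print(required_rate_per_second(PEAK_REPORTS, WINDOW_SECONDS))  # 500.0
print(budget_ms_per_report(PEAK_REPORTS, WINDOW_SECONDS))      # 2.0
```

Batching or parallel processing would relax the per-report budget; the numbers only show the order of magnitude involved.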


Out-of-Scope

  1. This epic will only introduce trustlevels NONE and COMPLETE. PARTIAL and POOR may be added later as below.
  2. Re-registration, i.e. resolving trustlevel degradation, is not in scope of this epic
  3. NCMP will not send notifications on trustlevel changes for external consumers

High Level Interactions

[Diagram: Staleness Freshness Overview]

...