Fault Management

Scope

This page lists proposed improvements related to Fault Management.

The term "fault management (FM)" refers to the "F" in the "FCAPS" functions (fault, configuration, accounting, performance, security). Different standards developing organizations (SDOs) use different terms for the same function. At least on the presentation layer, a common abstraction is needed for easy and efficient operation. Such an abstraction would preferably also be used in the API to the presentation layer; however, modification of existing APIs should be handled carefully.



Proposed changes on the presentation layer for Fault Management:

In order to clearly differentiate between faults (reported as events or notifications) and alarms (persistent indications of a fault), the following labels should be changed:

  • "Current Problem List" → "Active Alarms"

  • "Alarm Notifications" → "Faults"

  • "Alarm Log" → "Fault Log" (an alternative label could be "Fault History" - to be decided)





Cleared, NonAlarmed, Normal, ..

If the fault-reporting entity indicates that a fault no longer needs to be indicated, it sends an event with a specific severity or with the flag 'is-cleared'. On the presentation layer, the term "cleared" should be used in all of these cases.
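
A minimal Python sketch of how the presentation layer could normalize these indications. The dictionary keys `severity` and `is-cleared`, the function names, and the set of severity values treated as cleared (taken from the heading of this section) are assumptions for illustration and do not describe an existing API.

```python
# Severity values treated as "cleared" (assumption, taken from the section heading).
CLEARED_SEVERITIES = {"cleared", "nonalarmed", "normal"}


def is_cleared(event: dict) -> bool:
    """Return True if the event indicates that the fault is no longer present."""
    # A flag set to true is sufficient on its own.
    if event.get("is-cleared") is True:
        return True
    # Otherwise fall back to the reported severity, compared case-insensitively.
    severity = str(event.get("severity", "")).strip().lower()
    return severity in CLEARED_SEVERITIES


def presentation_severity(event: dict) -> str:
    """Return the label to show on the presentation layer."""
    return "cleared" if is_cleared(event) else str(event.get("severity", ""))
```

With this mapping, an event reported as "NonAlarmed" or "Normal" and an event carrying the flag would all be displayed as "cleared".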



Alarm History

A new tab should list an alarm history. The alarm history basically correlates the notification sent when a fault was raised with the notification sent when the alarm was cleared.

Each list item shows how long an alarm existed. Currently active alarms have no entry in the Alarm History, because they are not history yet.

The Alarm History can be interpreted as a different representation of the Fault Log for all cleared alarms, created by merging the adjacent fault notifications.
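
The merging described above could be sketched in Python as follows. This is an illustration under assumptions, not a definitive implementation: the `FaultLogEntry` and `AlarmHistoryEntry` shapes are hypothetical, an alarm instance is assumed to be identified by node name, object id, and alarm type, and a clear notification without a preceding raise is assumed to be ignored.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FaultLogEntry:  # hypothetical shape of one Fault Log row
    timestamp: datetime
    node_name: str
    object_id: str
    alarm_type: str
    severity: str
    is_cleared: bool


@dataclass
class AlarmHistoryEntry:  # hypothetical shape of one Alarm History row
    raised: datetime
    cleared: datetime
    node_name: str
    object_id: str
    alarm_type: str
    severity: str  # severity as sent when the alarm was raised


def build_alarm_history(fault_log):
    """Pair each raise notification with the clear notification that follows it."""
    open_alarms = {}  # alarm instance key -> notification that raised it
    history = []
    for entry in sorted(fault_log, key=lambda e: e.timestamp):
        key = (entry.node_name, entry.object_id, entry.alarm_type)
        if not entry.is_cleared:
            # Only the first non-cleared notification marks the raise;
            # later severity changes keep the original raise.
            open_alarms.setdefault(key, entry)
        elif key in open_alarms:
            raised = open_alarms.pop(key)
            history.append(AlarmHistoryEntry(
                raised.timestamp, entry.timestamp, entry.node_name,
                entry.object_id, entry.alarm_type, raised.severity))
        # A clear without a known raise is ignored (assumption).
    return history
```

Alarms that are still active stay in `open_alarms` and produce no history entry, which matches the rule that only cleared alarms appear in the Alarm History.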

Columns

  • Raised: The timestamp of the fault notification with which a specific alarm instance changed its status from "is-cleared = true" to "is-cleared = false".

  • Cleared: The timestamp when the related alarm was cleared. 

  • Duration: The difference between "Cleared" and "Raised" in ISO 8601 duration format: P[nn]Y[nn]M[nn]DT[nn]H[nn]M[nn]S, where each [nn] should use leading zeros for sorting and filtering purposes.

  • Node Name: same as in Fault Log

  • Object Id: same as in Fault Log

  • Alarm Type: same as in Fault Log 

  • Severity: The severity as sent when the alarm was raised (must not be "cleared")
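
The Duration column could be computed as in the following Python sketch. It assumes that years and months are counted as whole calendar units between the two timestamps, with the remainder expressed in days, hours, minutes, and seconds; the page itself does not define this split, and the function names are hypothetical.

```python
import calendar
from datetime import datetime


def _add_months(dt: datetime, months: int) -> datetime:
    """Shift dt by whole calendar months, clamping the day to the month length."""
    year, month0 = divmod(dt.year * 12 + dt.month - 1 + months, 12)
    day = min(dt.day, calendar.monthrange(year, month0 + 1)[1])
    return dt.replace(year=year, month=month0 + 1, day=day)


def format_duration(raised: datetime, cleared: datetime) -> str:
    """Format cleared - raised as P[nn]Y[nn]M[nn]DT[nn]H[nn]M[nn]S with leading zeros."""
    if cleared < raised:
        raise ValueError("cleared must not be earlier than raised")
    # Largest number of whole calendar months that fits between the timestamps.
    months = (cleared.year - raised.year) * 12 + cleared.month - raised.month
    if _add_months(raised, months) > cleared:
        months -= 1
    # What is left after the whole months is always less than one month.
    rest = cleared - _add_months(raised, months)
    years, months = divmod(months, 12)
    hours, seconds = divmod(rest.seconds, 3600)
    minutes, seconds = divmod(seconds, 60)
    return (f"P{years:02d}Y{months:02d}M{rest.days:02d}D"
            f"T{hours:02d}H{minutes:02d}M{seconds:02d}S")
```

For example, an alarm raised at 10:00:00 and cleared at 10:05:30 on the same day yields "P00Y00M00DT00H05M30S". Because every field has a fixed width, plain string sorting orders entries by years first, then months, days, and so on.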



Notification message flow