Overview
Missing Heartbeat MS is one of the higher-level DCAE services. It tracks VES heartbeat events that VNFs generate and that VESCollector posts into DMaaP. Each VNF configured for monitoring is checked periodically by the service to see whether a heartbeat event was received. If heartbeat events from a specific VNF are missed 'x' times (configured through policy), the Heartbeat Service generates a control-loop event as output. Once the VNF starts sending heartbeat events again, the service automatically clears the Onset it originally created.
The following diagram explains the event flow involved.
Blueprint/image
Blueprint (deployment artifact) : https://git.onap.org/dcaegen2/services/heartbeat/tree/dpo/k8s-heartbeat.yaml
Input file (deployment input) : https://git.onap.org/dcaegen2/services/heartbeat/tree/dpo/k8s-heartbeat.yaml
Docker image : nexus3.onap.org:10001/onap/org.onap.dcaegen2.services.heartbeat:2.1.0
Deployment Prerequisite/dependencies
The Heartbeat service has the following dependencies:
DMaaP : Used for pub/sub of VES heartbeat events; this should be deployed using the OOM Helm chart.
ConfigBindingService : Used for fetching the latest configuration from Consul; this should be deployed using the DCAE OOM charts.
Postgres : Used as the event data store and for leader management (scale) and heartbeat computation. The blueprint includes the postgres plugin and a postgres node, which are used to set up the PG pod.
Deployment Steps
The Heartbeat Service can be deployed using the Dashboard UI, the Cloudify UI, or the CLI. The steps below use the CLI.
- Transfer the blueprint component file to the DCAE bootstrap POD under the /blueprints directory
- Transfer the blueprint component inputs file to the DCAE bootstrap POD under the / directory
- Log in to the DCAE bootstrap POD's main container
Validate blueprint
cfy blueprints validate /blueprints/k8s-heartbeat.yaml
Verify that the plugin versions in the target Cloudify instance match the blueprint imports
cfy plugins list
If the plugin versions differ, update the blueprint imports to match.
Deploy Service
Upload and deploy blueprint
cfy install -b heartbeat -d heartbeat -i /k8s-heartbeat-inputs.yaml /blueprints/k8s-heartbeat.yaml
To un-deploy
Uninstall running component and delete deployment
cfy uninstall heartbeat
Delete blueprint
cfy blueprints delete heartbeat
Initial Validation
After deployment, verify that the Heartbeat pod and the PG pod are running correctly
root@k8s-rancher:~# kubectl get pods -n onap | egrep "heartbeat|postgres"
dep-dcae-heartbeat-service-5ff7558fd-4nt6z   2/2   Running   2   4h
dep-hbpostgres-write-854dc899b6-nb22c        1/1   Running   0   4h
Then check the logs to confirm that the service can connect to DMaaP and is polling for events.
kubectl logs -f -n onap dep-dcae-heartbeat-service-5ff7558fd-4nt6z dcae-heartbeat-service
.
.
2019-05-02 03:10:15,697 | urllib3.connectionpool | connectionpool | _make_request | 396 | DEBUG | http://message-router.onap.svc.cluster.local:3904 "GET /events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000 HTTP/1.1" 200 2
2019-05-02 03:10:15,697 | __main__ | htbtworker | process_msg | 93 | INFO | ('HBT:', '[]')
2019-05-02 03:10:18,952 | __main__ | misshtbtd | main | 364 | INFO | ('MSHBT: hb_common values ', 7, 'RUNNING', 'dcae-heartbeat-service-', 1556766594)
2019-05-02 03:10:18,952 | __main__ | misshtbtd | main | 368 | INFO | ('MSHBD:pid,srcName,state,time,ctime,timeDiff is', 7, 'dcae-heartbeat-service-', 'RUNNING', 1556766594, 1556766619, 25)
2019-05-02 03:10:18,953 | __main__ | misshtbtd | main | 384 | INFO | ('MSHBD:config status is', 'RUNNING')
2019-05-02 03:10:18,962 | __main__ | misshtbtd | create_update_hb_common | 143 | INFO | MSHBT:Updated hb_common DB with new values
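When scanning a long log capture, it can help to split each record into its fields. The sketch below assumes, based only on the sample lines above, that every record has seven " | "-separated fields; the field names (`timestamp`, `logger`, `module`, `function`, `line`, `level`, `message`) and the helper `parse_log_line` are illustrative choices, not names used by the service.

```python
from typing import NamedTuple


class LogRecord(NamedTuple):
    # Field names are assumptions inferred from the sample log lines.
    timestamp: str
    logger: str
    module: str
    function: str
    line: int
    level: str
    message: str


def parse_log_line(raw: str) -> LogRecord:
    """Split one Heartbeat log line into its seven fields."""
    # Split at most six times so a message that itself contains " | "
    # stays intact in the last field.
    parts = raw.strip().split(" | ", 6)
    if len(parts) != 7:
        raise ValueError(f"unexpected log line: {raw!r}")
    timestamp, logger, module, function, line, level, message = parts
    return LogRecord(timestamp, logger, module, function, int(line), level, message)


sample = (
    "2019-05-02 03:10:18,953 | __main__ | misshtbtd | main | 384 | INFO | "
    "('MSHBD:config status is', 'RUNNING')"
)
record = parse_log_line(sample)
print(record.module, record.level, record.message)
```

Filtering on `record.module` (for example `htbtworker`, `misshtbtd`, or `db_monitoring`) is one way to follow a single part of the service through the output.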
Functional tests
The following default configuration is loaded into Heartbeat (set in the blueprint configuration):
{
  "vnfs": [
    {
      "eventName": "Heartbeat_vDNS",
      "heartbeatcountmissed": 3,
      "heartbeatinterval": 60,
      "closedLoopControlName": "ControlLoopEvent1",
      "policyVersion": "1.0.0.5",
      "policyName": "vFireWall",
      "policyScope": "resource=sampleResource,type=sampletype,CLName=sampleCLName",
      "target_type": "VNF",
      "target": "genVnfName",
      "version": "1.0"
    },
    {
      "eventName": "Heartbeat_vFW",
      "heartbeatcountmissed": 3,
      "heartbeatinterval": 60,
      "closedLoopControlName": "ControlLoopEvent1",
      "policyVersion": "1.0.0.5",
      "policyName": "vFireWall",
      "policyScope": "resource=sampleResource,type=sampletype,CLName=sampleCLName",
      "target_type": "VNF",
      "target": "genVnfName",
      "version": "1.0"
    },
    {
      "eventName": "Heartbeat_xx",
      "heartbeatcountmissed": 3,
      "heartbeatinterval": 60,
      "closedLoopControlName": "ControlLoopEvent1",
      "policyVersion": "1.0.0.5",
      "policyName": "vFireWall",
      "policyScope": "resource=sampleResource,type=sampletype,CLName=sampleCLName",
      "target_type": "VNF",
      "target": "genVnfName",
      "version": "1.0"
    }
  ]
}
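To get a feel for how "heartbeatcountmissed" and "heartbeatinterval" interact, the sketch below treats a VNF as missing once no heartbeat has arrived for heartbeatcountmissed × heartbeatinterval seconds. That rule is an assumption made for illustration; the service's actual computation may differ. The function names `missed_window_seconds` and `is_heartbeat_missing` are hypothetical, and the config is trimmed to the two fields used.

```python
import json

# Trimmed copy of the default configuration, keeping only the fields
# this sketch needs.
CONFIG = json.loads("""
{"vnfs": [
  {"eventName": "Heartbeat_vDNS", "heartbeatcountmissed": 3, "heartbeatinterval": 60},
  {"eventName": "Heartbeat_vFW", "heartbeatcountmissed": 3, "heartbeatinterval": 60}
]}
""")


def missed_window_seconds(vnf: dict) -> int:
    """Seconds without a heartbeat before the VNF counts as missing (assumed rule)."""
    return vnf["heartbeatcountmissed"] * vnf["heartbeatinterval"]


def is_heartbeat_missing(vnf: dict, last_seen_epoch_s: int, now_epoch_s: int) -> bool:
    """True when the silence since the last heartbeat exceeds the window."""
    return (now_epoch_s - last_seen_epoch_s) > missed_window_seconds(vnf)


for vnf in CONFIG["vnfs"]:
    print(vnf["eventName"], missed_window_seconds(vnf))
```

With the defaults above (3 missed heartbeats at a 60-second interval), this rule gives a 180-second window per event name.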
To simulate the event flow and trigger a missing-heartbeat event, we can post a simulated VES event to the Heartbeat subscription topic (using curl).
Generate Heartbeat CL Onset
Send a triggering event to DMaaP topic unauthenticated.SEC_HEARTBEAT_OUTPUT
Before sending, validate the following:
- The DMaaP address is correct
- The "eventName" field in the VES input matches an eventName configured in the Heartbeat Service
- "lastEpochMicrosec" reflects approximately the current timestamp
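Because the timestamp has to be roughly current, it can be convenient to generate the payload rather than edit it by hand. The sketch below builds an event with the same header fields as the sample payload used on this page; `build_heartbeat_event` is a hypothetical helper. The sample payloads carry 13-digit values (milliseconds since the epoch) in "startEpochMicrosec" and "lastEpochMicrosec", so the sketch follows that convention as an assumption; adjust it if your deployment expects a different unit.

```python
import json
import time
from typing import Optional


def build_heartbeat_event(event_name: str, source_name: str,
                          now_ms: Optional[int] = None) -> dict:
    """Build a VES heartbeat event modelled on this page's sample payload."""
    if now_ms is None:
        # Assumption: 13-digit epoch values, matching the sample payloads.
        now_ms = int(time.time() * 1000)
    return {
        "event": {
            "commonEventHeader": {
                "startEpochMicrosec": now_ms,
                "sourceId": "79e90d76-513a-4f79-886d-470a0037c5cf",
                "eventId": "Heartbeat_vDNS_10.0.0.1",
                "nfcNamingCode": "DNS",
                "reportingEntityId": "79e90d76-513a-4f79-886d-470a0037c5cf",
                "eventType": "applicationVnf",
                "priority": "Normal",
                "version": 3,
                "reportingEntityName": "testcmd001",
                "sequence": 36312,
                "domain": "heartbeat",
                "lastEpochMicrosec": now_ms,
                "eventName": event_name,    # must match a configured eventName
                "sourceName": source_name,  # identifies the VNF being tracked
                "nfNamingCode": "MDNS",
            }
        }
    }


payload = json.dumps(build_heartbeat_event("Heartbeat_vDNS", "testnode001"))
print(payload)
```

The printed JSON can be passed as the request body of the POST to the unauthenticated.SEC_HEARTBEAT_OUTPUT topic.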
curl -X POST http://10.12.5.8:30227/events/unauthenticated.SEC_HEARTBEAT_OUTPUT -H 'Content-Type: application/json' -d '{ "event": { "commonEventHeader": { "startEpochMicrosec": 1556753402000, "sourceId": "79e90d76-513a-4f79-886d-470a0037c5cf", "eventId": "Heartbeat_vDNS_10.0.0.1", "nfcNamingCode": "DNS", "reportingEntityId": "79e90d76-513a-4f79-886d-470a0037c5cf", "eventType": "applicationVnf", "priority": "Normal", "version": 3, "reportingEntityName": "testcmd001", "sequence": 36312, "domain": "heartbeat", "lastEpochMicrosec": 1556753402000, "eventName": "Heartbeat_vDNS", "sourceName": "testnode001", "nfNamingCode": "MDNS" } } }'
2019-05-02 03:35:08,638 | __main__ | htbtworker | process_msg | 78 | INFO | HBT:Getting :http://message-router.onap.svc.cluster.local:3904/events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000
2019-05-02 03:35:08,649 | urllib3.connectionpool | connectionpool | _new_conn | 208 | DEBUG | Starting new HTTP connection (1): message-router.onap.svc.cluster.local
2019-05-02 03:35:18,986 | urllib3.connectionpool | connectionpool | _make_request | 396 | DEBUG | http://message-router.onap.svc.cluster.local:3904 "GET /events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000 HTTP/1.1" 200 541
2019-05-02 03:35:18,987 | __main__ | htbtworker | process_msg | 93 | INFO | ('HBT:', '["{\\"event\\":{\\"commonEventHeader\\":{\\"startEpochMicrosec\\":1556753402000,\\"sourceId\\":\\"79e90d76-513a-4f79-886d-470a0037c5cf\\",\\"eventId\\":\\"Heartbeat_vDNS_10.0.0.1\\",\\"nfcNamingCode\\":\\"DNS\\",\\"reportingEntityId\\":\\"79e90d76-513a-4f79-886d-470a0037c5cf\\",\\"eventType\\":\\"applicationVnf\\",\\"priority\\":\\"Normal\\",\\"version\\":3,\\"reportingEntityName\\":\\"testcmd001\\",\\"sequence\\":36312,\\"domain\\":\\"heartbeat\\",\\"lastEpochMicrosec\\":1556753402000,\\"eventName\\":\\"Heartbeat_vDNS\\",\\"sourceName\\":\\"testnode001\\",\\"nfNamingCode\\":\\"MDNS\\"}}}"]')
2019-05-02 03:35:18,987 | __main__ | htbtworker | process_msg | 128 | INFO | ('HBT:Newly received HB event values ::', 'Heartbeat_vDNS', 1556753402000, 'testnode001')
2019-05-02 03:35:18,999 | __main__ | htbtworker | process_msg | 135 | INFO | HBT:vnf_table_2 is already there
2019-05-02 03:35:18,999 | __main__ | htbtworker | process_msg | 139 | INFO | ('HBT:', "Select source_name_count from vnf_table_1 where event_name='Heartbeat_vDNS'")
2019-05-02 03:35:19,002 | __main__ | htbtworker | process_msg | 156 | INFO | ('HBT:event name, source_name & source_name_count are', 'Heartbeat_vDNS', 'testnode001', 2)
2019-05-02 03:35:19,002 | __main__ | htbtworker | process_msg | 160 | INFO | ('HBT:eppc query is', "Select source_name from vnf_table_2 where event_name= 'Heartbeat_vDNS' and source_name_key=1")
2019-05-02 03:35:19,005 | __main__ | htbtworker | process_msg | 160 | INFO | ('HBT:eppc query is', "Select source_name from vnf_table_2 where event_name= 'Heartbeat_vDNS' and source_name_key=2")
2019-05-02 03:35:19,006 | __main__ | htbtworker | process_msg | 176 | INFO | ('HBT: The source_name_key and source_name_count are ', 1, 2)
2019-05-02 03:35:19,006 | __main__ | htbtworker | process_msg | 180 | INFO | ('HBT: Insert entry in table_2 : ', [('testcmd001',)])
2019-05-02 03:35:24,045 | __main__ | misshtbtd | main | 364 | INFO | ('MSHBT: hb_common values ', 7, 'RUNNING', 'dcae-heartbeat-service-', 1556768099)
2019-05-02 03:35:24,045 | __main__ | misshtbtd | main | 368 | INFO | ('MSHBD:pid,srcName,state,time,ctime,timeDiff is', 7, 'dcae-heartbeat-service-', 'RUNNING', 1556768099, 1556768124, 25)
2019-05-02 03:35:24,045 | __main__ | misshtbtd | main | 384 | INFO | ('MSHBD:config status is', 'RUNNING')
2019-05-02 03:35:24,059 | __main__ | misshtbtd | create_update_hb_common | 143 | INFO | MSHBT:Updated hb_common DB with new values
2019-05-02 03:35:24,411 | __main__ | db_monitoring | db_monitoring | 158 | INFO | DBM: Active DB Monitoring Instance
2019-05-02 03:35:24,421 | __main__ | db_monitoring | sendControlLoopEvent | 39 | INFO | ('DBM:Time to raise Control Loop Event for Control loop typ /target type - ', 'ONSET', 'VNF')
2019-05-02 03:35:24,421 | __main__ | db_monitoring | sendControlLoopEvent | 41 | INFO | DBM:Heartbeat not received, raising alarm event
2019-05-02 03:35:24,421 | __main__ | db_monitoring | sendControlLoopEvent | 117 | INFO | ('DBM: CL Json object is', '{"closedLoopEventClient": "DCAE_Heartbeat_MS", "policyVersion": "1.0.0.5", "policyName": "vFireWall", "policyScope": "resource=sampleResource,type=sampletype,CLName=sampleCLName", "target_type": "VNF", "AAI": {"generic-vnf.vnf-name": "testnode001"}, "closedLoopAlarmStart": 1556768124419, "closedLoopEventStatus": "ONSET", "closedLoopControlName": "ControlLoopEvent1", "version": "1.0", "target": "genVnfName", "requestID": "8c1b8bd8-06f7-493f-8ed7-daaa4cc481bc", "from": "DCAE"}')
2019-05-02 03:35:24,421 | __main__ | db_monitoring | sendControlLoopEvent | 121 | INFO | ('DBM:', 'http://message-router.onap.svc.cluster.local:3904/events/unauthenticated.DCAE_CL_OUTPUT')
2019-05-02 03:35:24,433 | urllib3.connectionpool | connectionpool | _new_conn | 208 | DEBUG | Starting new HTTP connection (1): message-router.onap.svc.cluster.local
2019-05-02 03:35:24,539 | urllib3.connectionpool | connectionpool | _make_request | 396 | DEBUG | http://message-router.onap.svc.cluster.local:3904 "POST /events/unauthenticated.DCAE_CL_OUTPUT HTTP/1.1" 200 41
2019-05-02 03:35:24,540 | __main__ | db_monitoring | sendControlLoopEvent | 126 | INFO | ('DBM:', 200, 'OK')
We can check that a new DCAE_CL_OUTPUT event has been published (make sure you target the correct DMaaP IP address)
curl http://10.12.5.8:30227/events/unauthenticated.DCAE_CL_OUTPUT/vv/1
["{\"closedLoopEventClient\": \"DCAE_Heartbeat_MS\", \"policyVersion\": \"1.0.0.5\", \"policyName\": \"vFireWall\", \"policyScope\": \"resource=sampleResource,type=sampletype,CLName=sampleCLName\", \"target_type\": \"VNF\", \"AAI\": {\"generic-vnf.vnf-name\": \"testnode001\"}, \"closedLoopAlarmStart\": 1556768124419, \"closedLoopEventStatus\": \"ONSET\", \"closedLoopControlName\": \"ControlLoopEvent1\", \"version\": \"1.0\", \"target\": \"genVnfName\", \"requestID\": \"8c1b8bd8-06f7-493f-8ed7-daaa4cc481bc\", \"from\": \"DCAE\"}"]
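The response above is a JSON array whose elements are themselves JSON-encoded strings, so it has to be decoded twice before fields such as "closedLoopEventStatus" can be read. A minimal sketch follows; `decode_dmaap_messages` is a hypothetical helper name, and the sample body is a shortened version of the response shown above.

```python
import json
from typing import List


def decode_dmaap_messages(body: str) -> List[dict]:
    """Decode a response body of the form ["<json string>", ...]."""
    # First pass yields a list of strings; second pass decodes each one.
    return [json.loads(item) for item in json.loads(body)]


# Shortened sample built with json.dumps so the nested escaping is exact.
sample_body = json.dumps([
    json.dumps({
        "closedLoopEventClient": "DCAE_Heartbeat_MS",
        "AAI": {"generic-vnf.vnf-name": "testnode001"},
        "closedLoopEventStatus": "ONSET",
        "closedLoopControlName": "ControlLoopEvent1",
        "requestID": "8c1b8bd8-06f7-493f-8ed7-daaa4cc481bc",
    })
])

for event in decode_dmaap_messages(sample_body):
    print(event["closedLoopEventStatus"], event["AAI"]["generic-vnf.vnf-name"])
```

An empty poll result, which appears as '[]' in the service logs, decodes to an empty list.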
Generate Heartbeat CL Abatement Event
Send a triggering event to DMaaP topic unauthenticated.SEC_HEARTBEAT_OUTPUT
Before sending, validate the following:
- The DMaaP address is correct
- The "eventName" field in the VES input matches an eventName configured in the Heartbeat Service
- "lastEpochMicrosec" reflects approximately the current timestamp
- The event uses the same "sourceName" as the original event sent
curl -X POST http://10.12.5.8:30227/events/unauthenticated.SEC_HEARTBEAT_OUTPUT -H 'Content-Type: application/json' -d '{ "event": { "commonEventHeader": { "startEpochMicrosec": 1556768319000, "sourceId": "79e90d76-513a-4f79-886d-470a0037c5cf", "eventId": "Heartbeat_vDNS_10.0.0.1", "nfcNamingCode": "DNS", "reportingEntityId": "79e90d76-513a-4f79-886d-470a0037c5cf", "eventType": "applicationVnf", "priority": "Normal", "version": 3, "reportingEntityName": "testcmd001", "sequence": 36312, "domain": "heartbeat", "lastEpochMicrosec": 1556768319000, "eventName": "Heartbeat_vDNS", "sourceName": "testnode001", "nfNamingCode": "MDNS" } } }'
2019-05-02 03:39:28,249 | __main__ | htbtworker | process_msg | 78 | INFO | HBT:Getting :http://message-router.onap.svc.cluster.local:3904/events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000
2019-05-02 03:39:28,257 | urllib3.connectionpool | connectionpool | _new_conn | 208 | DEBUG | Starting new HTTP connection (1): message-router.onap.svc.cluster.local
2019-05-02 03:39:28,929 | urllib3.connectionpool | connectionpool | _make_request | 396 | DEBUG | http://message-router.onap.svc.cluster.local:3904 "GET /events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000 HTTP/1.1" 200 541
2019-05-02 03:39:28,930 | __main__ | htbtworker | process_msg | 93 | INFO | ('HBT:', '["{\\"event\\":{\\"commonEventHeader\\":{\\"startEpochMicrosec\\":1556768319000,\\"sourceId\\":\\"79e90d76-513a-4f79-886d-470a0037c5cf\\",\\"eventId\\":\\"Heartbeat_vDNS_10.0.0.1\\",\\"nfcNamingCode\\":\\"DNS\\",\\"reportingEntityId\\":\\"79e90d76-513a-4f79-886d-470a0037c5cf\\",\\"eventType\\":\\"applicationVnf\\",\\"priority\\":\\"Normal\\",\\"version\\":3,\\"reportingEntityName\\":\\"testcmd001\\",\\"sequence\\":36312,\\"domain\\":\\"heartbeat\\",\\"lastEpochMicrosec\\":1556768319000,\\"eventName\\":\\"Heartbeat_vDNS\\",\\"sourceName\\":\\"testnode001\\",\\"nfNamingCode\\":\\"MDNS\\"}}}"]')
2019-05-02 03:39:28,930 | __main__ | htbtworker | process_msg | 128 | INFO | ('HBT:Newly received HB event values ::', 'Heartbeat_vDNS', 1556768319000, 'testnode001')
2019-05-02 03:39:28,936 | __main__ | htbtworker | process_msg | 135 | INFO | HBT:vnf_table_2 is already there
2019-05-02 03:39:28,937 | __main__ | htbtworker | process_msg | 139 | INFO | ('HBT:', "Select source_name_count from vnf_table_1 where event_name='Heartbeat_vDNS'")
2019-05-02 03:39:28,940 | __main__ | htbtworker | process_msg | 156 | INFO | ('HBT:event name, source_name & source_name_count are', 'Heartbeat_vDNS', 'testnode001', 3)
2019-05-02 03:39:28,940 | __main__ | htbtworker | process_msg | 160 | INFO | ('HBT:eppc query is', "Select source_name from vnf_table_2 where event_name= 'Heartbeat_vDNS' and source_name_key=1")
2019-05-02 03:39:28,944 | __main__ | htbtworker | process_msg | 160 | INFO | ('HBT:eppc query is', "Select source_name from vnf_table_2 where event_name= 'Heartbeat_vDNS' and source_name_key=2")
2019-05-02 03:39:28,945 | __main__ | htbtworker | process_msg | 160 | INFO | ('HBT:eppc query is', "Select source_name from vnf_table_2 where event_name= 'Heartbeat_vDNS' and source_name_key=3")
2019-05-02 03:39:28,946 | __main__ | htbtworker | process_msg | 168 | INFO | ('HBT: Update vnf_table_2 : ', 2, [('testnode001',)])
2019-05-02 03:39:28,946 | __main__ | htbtworker | process_msg | 176 | INFO | ('HBT: The source_name_key and source_name_count are ', 3, 3)
2019-05-02 03:39:34,820 | __main__ | misshtbtd | main | 364 | INFO | ('MSHBT: hb_common values ', 7, 'RUNNING', 'dcae-heartbeat-service-', 1556768350)
2019-05-02 03:39:34,820 | __main__ | misshtbtd | main | 368 | INFO | ('MSHBD:pid,srcName,state,time,ctime,timeDiff is', 7, 'dcae-heartbeat-service-', 'RUNNING', 1556768350, 1556768375, 25)
2019-05-02 03:39:34,820 | __main__ | misshtbtd | main | 384 | INFO | ('MSHBD:config status is', 'RUNNING')
2019-05-02 03:39:34,841 | __main__ | misshtbtd | create_update_hb_common | 143 | INFO | MSHBT:Updated hb_common DB with new values
2019-05-02 03:39:46,086 | __main__ | db_monitoring | db_monitoring | 158 | INFO | DBM: Active DB Monitoring Instance
2019-05-02 03:39:46,208 | __main__ | db_monitoring | sendControlLoopEvent | 39 | INFO | ('DBM:Time to raise Control Loop Event for Control loop typ /target type - ', 'ABATED', 'VNF')
2019-05-02 03:39:46,209 | __main__ | db_monitoring | sendControlLoopEvent | 77 | INFO | DBM:Heartbeat received, clearing alarm event
2019-05-02 03:39:46,209 | __main__ | db_monitoring | sendControlLoopEvent | 117 | INFO | ('DBM: CL Json object is', '{"closedLoopEventClient": "DCAE_Heartbeat_MS", "policyVersion": "1.0.0.5", "policyName": "vFireWall", "policyScope": "resource=sampleResource,type=sampletype,CLName=sampleCLName", "target_type": "VNF", "AAI": {"generic-vnf.vnf-name": "testnode001"}, "closedLoopAlarmStart": 1556768386209, "closedLoopEventStatus": "ABATED", "closedLoopControlName": "ControlLoopEvent1", "version": "1.0", "target": "genVnfName", "requestID": "8c1b8bd8-06f7-493f-8ed7-daaa4cc481bc", "from": "DCAE"}')
2019-05-02 03:39:46,209 | __main__ | db_monitoring | sendControlLoopEvent | 121 | INFO | ('DBM:', 'http://message-router.onap.svc.cluster.local:3904/events/unauthenticated.DCAE_CL_OUTPUT')
2019-05-02 03:39:46,217 | urllib3.connectionpool | connectionpool | _new_conn | 208 | DEBUG | Starting new HTTP connection (1): message-router.onap.svc.cluster.local
2019-05-02 03:39:46,260 | urllib3.connectionpool | connectionpool | _make_request | 396 | DEBUG | http://message-router.onap.svc.cluster.local:3904 "POST /events/unauthenticated.DCAE_CL_OUTPUT HTTP/1.1" 200 41
2019-05-02 03:39:46,262 | __main__ | db_monitoring | sendControlLoopEvent | 126 | INFO | ('DBM:', 200, 'OK')
2019-05-02 03:39:46,263 | __main__ | db_monitoring | sendControlLoopEvent | 129 | INFO | ('DBM:Status code for sending the control loop event is', 200)
2019-05-02 03:39:49,086 | __main__ | htbtworker | process_msg | 72 | INFO | ('\n\nHBT:eventnameList values ', ['Heartbeat_vDNS', 'Heartbeat_vFW', 'Heartbeat_xx'])
2019-05-02 03:39:49,086 | __main__ | htbtworker | process_msg | 78 | INFO | HBT:Getting :http://message-router.onap.svc.cluster.local:3904/events/unauthenticated.SEC_HEARTBEAT_OUTPUT/hbgrpID/1?timeout=15000
2019-05-02 03:39:49,108 | urllib3.connectionpool | connectionpool | _new_conn | 208 | DEBUG | Starting new HTTP connection (1): message-router.onap.svc.cluster.local
We can check that a new DCAE_CL_OUTPUT event has been published (make sure you target the correct DMaaP IP address)
curl http://10.12.5.8:30227/events/unauthenticated.DCAE_CL_OUTPUT/vv/1
["{\"closedLoopEventClient\": \"DCAE_Heartbeat_MS\", \"policyVersion\": \"1.0.0.5\", \"policyName\": \"vFireWall\", \"policyScope\": \"resource=sampleResource,type=sampletype,CLName=sampleCLName\", \"target_type\": \"VNF\", \"AAI\": {\"generic-vnf.vnf-name\": \"testnode001\"}, \"closedLoopAlarmStart\": 1556768386209, \"closedLoopEventStatus\": \"ABATED\", \"closedLoopControlName\": \"ControlLoopEvent1\", \"version\": \"1.0\", \"target\": \"genVnfName\", \"requestID\": \"8c1b8bd8-06f7-493f-8ed7-daaa4cc481bc\", \"from\": \"DCAE\"}"]
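In the sample output on this page, the ABATED event carries the same "requestID" as the earlier ONSET event. Assuming that holds in general and that events are processed in the order they were published, a test script could use the sketch below to check which onsets are still open; `open_onsets` is a hypothetical helper, and the event dicts are trimmed to the two fields it reads.

```python
from typing import Dict, Iterable, List


def open_onsets(events: Iterable[dict]) -> List[str]:
    """Return requestIDs whose latest status is still ONSET."""
    latest: Dict[str, str] = {}
    for event in events:
        # Later events overwrite earlier ones for the same requestID.
        latest[event["requestID"]] = event["closedLoopEventStatus"]
    return sorted(rid for rid, status in latest.items() if status == "ONSET")


onset = {
    "requestID": "8c1b8bd8-06f7-493f-8ed7-daaa4cc481bc",
    "closedLoopEventStatus": "ONSET",
}
abated = {
    "requestID": "8c1b8bd8-06f7-493f-8ed7-daaa4cc481bc",
    "closedLoopEventStatus": "ABATED",
}

print(open_onsets([onset]))          # onset only: still open
print(open_onsets([onset, abated]))  # onset followed by abatement: cleared
```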
Dynamic Configuration Update
Because the Heartbeat service periodically polls the Consul KV store through the ConfigBindingService APIs, its runtime configuration can be updated dynamically without redeploying or restarting the service. Configuration updates can be triggered from Policy (or CLAMP) or made directly in Consul.
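The polling pattern described above can be pictured as "fetch, compare, apply only on change". The sketch below is a simplified illustration and not the service's code: `fetch` is a hypothetical stand-in for whatever call retrieves the configuration from ConfigBindingService, and `poll_once` and `run_polls` are illustrative names.

```python
from typing import Callable, List, Optional, Tuple


def poll_once(fetch: Callable[[], dict],
              current: Optional[dict]) -> Tuple[dict, bool]:
    """Fetch the latest config and report whether it differs from the current one."""
    latest = fetch()
    return latest, latest != current


def run_polls(fetch: Callable[[], dict], rounds: int) -> List[dict]:
    """Poll a fixed number of times and collect each config that would be applied."""
    current: Optional[dict] = None
    applied: List[dict] = []
    for _ in range(rounds):
        current, changed = poll_once(fetch, current)
        if changed:
            applied.append(current)
    return applied


# Fake fetcher standing in for the real configuration source: the value
# of heartbeatcountmissed changes on the third poll.
responses = iter([
    {"heartbeatcountmissed": 3},
    {"heartbeatcountmissed": 3},
    {"heartbeatcountmissed": 5},
])
applied_configs = run_polls(lambda: next(responses), rounds=3)
print(applied_configs)
```

In this run the configuration is applied on the first poll and again when the value changes, while the unchanged second poll is ignored.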
Locate the service name by opening a shell in the Heartbeat Service pod and reading the value of the HOSTNAME environment variable
root@k8s-rancher:~# kubectl exec -it -n onap dep-s78f36f2daf0843518f2e25184769eb8b-dcae-heartbeat-servithzx2 /bin/bash
Defaulting container name to s78f36f2daf0843518f2e25184769eb8b-dcae-heartbeat-service.
Use 'kubectl describe pod/dep-s78f36f2daf0843518f2e25184769eb8b-dcae-heartbeat-servithzx2 -n onap' to see all of the containers in this pod.
misshtbt@s78f36f2daf0843518f2e25184769eb8b-dcae-heartbeat-service:~/bin$ env | grep HOSTNAME
HOSTNAME=s78f36f2daf0843518f2e25184769eb8b-dcae-heartbeat-service
Change the configuration for Service in KV-store through UI
http://<k8snodeip>:30270/ui/#/dc1/kv/