...
Metric | Metric available? | Exposed via Prometheus endpoint? | Comment |
---|---|---|---|
Availability of policy-api service | Yes | Yes | Exposed by the policy-api healthcheck and the policy-pap consolidated healthcheck. |
Latency | Yes | Yes | To be implemented for all CRUD endpoints exposed by policy-api. Sample S3P numbers for policy-api stress tests. |
Successful API request counter | Yes | Yes | Prometheus query for the number of successful API calls per minute. |
Failed API request counter | Yes | Yes | Prometheus query for the number of API calls per minute with a status code outside the 2xx family. |
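
The two request counters in the table split calls by status-code family and report them per minute. A minimal sketch of that bucketing is shown here, assuming a request log of (timestamp, status code) pairs; the function names and the sample data are hypothetical and are not taken from policy-api.

```python
from collections import Counter


def is_success(status_code: int) -> bool:
    """A call counts as successful when its status code is in the 2xx family."""
    return 200 <= status_code <= 299


def calls_per_minute(requests):
    """Bucket (unix_timestamp_seconds, status_code) pairs by minute.

    Returns two Counters keyed by the start of each minute:
    one for successful calls and one for failed calls.
    """
    succeeded, failed = Counter(), Counter()
    for timestamp, status_code in requests:
        minute_start = int(timestamp) // 60 * 60
        if is_success(status_code):
            succeeded[minute_start] += 1
        else:
            failed[minute_start] += 1
    return succeeded, failed


# Hypothetical request log, for illustration only.
sample_requests = [(0, 200), (10, 201), (59, 404), (60, 500), (75, 200)]
succeeded, failed = calls_per_minute(sample_requests)
```

In a real deployment the same split would normally be done in a Prometheus query over a labelled counter rather than in application code.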
...
Metric | Metric available? | Exposed via Prometheus endpoint? | Comment |
---|---|---|---|
Availability of the policy-pap service | Yes | No | policy-pap healthcheck API |
Status of PDPs as registered with policy-pap | Yes | No | policy-pap consolidated healthcheck API |
Successful API request counter | Yes | Yes | To be implemented for all the endpoints exposed by policy-pap. Sample S3P numbers for policy-pap stress tests. |
Failed API request counter | Yes | Yes | To be implemented for all the endpoints exposed by policy-pap. Number of API calls per minute with a status code outside the 2xx family. |
Latency | Yes | Yes | To be implemented for all the endpoints exposed by policy-pap. |
Policy deployment statistics: policyDeployFailureCount | Yes | Yes | Sample: |
...
Metric | Metric available? | Exposed via Prometheus endpoint? | Comment |
---|---|---|---|
Availability of the policy-distribution service | Yes | Yes | Exposed by the policy-distribution healthcheck and the consolidated policy-pap healthcheck. |
Successful API request counter | Yes | Yes | To be implemented for all the endpoints exposed by policy-distribution. Sample S3P numbers for policy-distribution stress tests. |
Failed API request counter | Yes | Yes | To be implemented for all the endpoints exposed by policy-distribution. Number of API calls per minute with a status code outside the 2xx family. |
Latency | Yes | Yes | To be implemented for all the endpoints exposed by policy-distribution. |
Policy distribution statistics: distributions | Yes | Yes | |
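
The Latency rows call for a timing measurement around every exposed endpoint. One way this could look is sketched here with a decorator that records the elapsed time of each call; the in-memory `latencies` store, the `timed` decorator, and the endpoint label are all hypothetical and do not reflect how the Policy Framework components actually collect latency.

```python
import time
from functools import wraps

# Hypothetical in-memory store: endpoint label -> list of latencies in seconds.
latencies = {}


def timed(endpoint):
    """Record how long each call to the wrapped handler takes."""
    def decorator(handler):
        @wraps(handler)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return handler(*args, **kwargs)
            finally:
                # Recorded even if the handler raises, so failures are timed too.
                elapsed = time.perf_counter() - start
                latencies.setdefault(endpoint, []).append(elapsed)
        return wrapper
    return decorator


@timed("GET /healthcheck")  # hypothetical endpoint label
def healthcheck():
    return {"healthy": True}


healthcheck()
healthcheck()
```

A production setup would more likely feed these timings into a histogram or summary metric so that Prometheus can compute percentiles.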
...
Metric | Metric available? | Exposed via Prometheus endpoint? | Comment |
---|---|---|---|
Availability of policy-apex-pdp | Yes | Yes | Exposed by the policy-apex-pdp healthcheck and the policy-pap consolidated healthcheck. |
TOSCA Policy Deployment counter (per apex-pdp instance): policyDeployCount | Yes | Yes | Exposed by policy-pap statistics. |
TOSCA Policy Execution counter (per apex-pdp instance): number of policies executed. Note: the stats currently display APEX policy counters. | Yes | Yes | |
Engine stats (by engineID per apex-pdp instance): Latency | No | No | |
eventCount: number of APEX events processed | Yes | No | |
Count of events processed (per engine thread, per apex-pdp instance): number of incoming trigger events processed by policy-apex-pdp. Note: the stats currently display APEX event counters processed by the engine. | No | No | |
, uptime is derived from this metric | Yes | Yes | |
Latency | Yes | Yes | Time taken for a TOSCA policy to process an incoming network trigger event. Note: the stats currently display the execution time for processing an APEX policy, which is a sufficient measure of system saturation. |
Kafka consumer lag | No | No | Can be implemented outside of the Policy Framework. Monitor increases in Kafka consumer lag on the Kafka/dmaap-message-router topics related to apex-pdp. |
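
Consumer lag for a topic is the gap between the latest offset written to each partition and the offset the consumer group has committed. The sketch below shows how a monitor outside the Policy Framework might compute that gap and flag a sustained increase; the function names and offset values are hypothetical, and fetching the offsets from a broker is left out.

```python
def consumer_lag(end_offsets, committed_offsets):
    """Total lag across partitions: log end offset minus committed offset.

    Both arguments map partition number -> offset. A partition with no
    committed offset is treated as committed at 0 (an assumption of this sketch).
    """
    return sum(
        max(end - committed_offsets.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    )


def lag_is_increasing(lag_samples):
    """True when every sample is strictly larger than the one before it."""
    return len(lag_samples) >= 2 and all(
        later > earlier for earlier, later in zip(lag_samples, lag_samples[1:])
    )


# Hypothetical offsets sampled at three successive polls of one topic.
samples = [
    consumer_lag({0: 100, 1: 50}, {0: 95, 1: 50}),
    consumer_lag({0: 120, 1: 60}, {0: 100, 1: 52}),
    consumer_lag({0: 150, 1: 80}, {0: 105, 1: 55}),
]
alert = lag_is_increasing(samples)
```

Alerting on a run of increasing samples rather than on a single high value avoids firing on short bursts that the consumer catches up with on its own.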
...