...

System Metrics that apply to all Policy components

...

These metrics have been available and exposed via a Prometheus endpoint since the Istanbul release.

Note: Standard metrics are already exposed for Policy DB (MariaDB) via common charts.

| Metric | Prometheus query |
|---|---|
| Memory usage | jvm_memory_bytes_used |
| CPU usage | process_cpu_seconds_total |
| JVM threads | jvm_threads_current, jvm_threads_daemon |
| Process uptime | process_start_time_seconds |
| Garbage collectors | GCs per second: rate(jvm_gc_collection_seconds_count[1m]); Avg GC time: rate(jvm_gc_collection_seconds_sum[1m]) / rate(jvm_gc_collection_seconds_count[1m]) |
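
To make the garbage collector arithmetic concrete, here is a minimal Python sketch of what the two calculations amount to: the collection rate is the per-second increase of jvm_gc_collection_seconds_count, and the average GC time is the increase of jvm_gc_collection_seconds_sum divided by the increase of jvm_gc_collection_seconds_count. The GcSample type, the function names and the sample values are illustrative assumptions. Prometheus' own rate() also handles counter resets and extrapolation, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass
class GcSample:
    """One scrape of the two GC counters (hypothetical container type)."""
    timestamp: float    # scrape time, in seconds
    seconds_sum: float  # value of jvm_gc_collection_seconds_sum
    count: float        # value of jvm_gc_collection_seconds_count


def gc_per_second(first: GcSample, last: GcSample) -> float:
    """Approximate the per-second rate of the _count counter."""
    elapsed = last.timestamp - first.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (last.count - first.count) / elapsed


def avg_gc_seconds(first: GcSample, last: GcSample) -> float:
    """Approximate the increase of _sum divided by the increase of _count."""
    collections = last.count - first.count
    if collections == 0:
        return 0.0  # no collections in the window; PromQL would yield NaN
    return (last.seconds_sum - first.seconds_sum) / collections


# Illustrative values only: two scrapes taken 60 seconds apart.
first = GcSample(timestamp=0.0, seconds_sum=1.5, count=10)
last = GcSample(timestamp=60.0, seconds_sum=2.25, count=16)
print(gc_per_second(first, last))   # collections per second
print(avg_gc_seconds(first, last))  # seconds spent per collection
```

With these sample values, six collections took 0.75 seconds in total over a 60-second window.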

Key metrics for Policy API

| Metric | Metric available? | Exposed via Prometheus endpoint? | Comment |
|---|---|---|---|
| Availability of policy-api service | Yes | No | Exposed by the policy-api healthcheck (GET /policy/api/v1/healthcheck, sample response below) and by the policy-pap consolidated healthcheck. |
| Latency | No | No | To be implemented for all CRUD endpoints exposed by policy-api. Sample s3p numbers for policy-api stress tests. |
| Request rate (API requests per minute) | No | No | Number of API calls per minute. |
| Failure rate (API errors per minute) | No | No | Number of API calls per minute that return a status code outside the 2xx family. |
| SSL certificate expiry time | No | No | |

Sample response for GET /policy/api/v1/healthcheck:

    {
        "name": "Policy API",
        "url": "dev-policy-api-87997d75c-2dsxf",
        "healthy": true,
        "code": 200,
        "message": "alive"
    }
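
Since availability of policy-api is currently derived from the healthcheck response rather than from a Prometheus metric, a monitoring script could interpret the payload itself. The following is a minimal sketch, assuming the response has the shape of the sample above. The function name is_available and the rule "healthy is true and code is 200" are illustrative assumptions, not the documented behaviour of any Policy component.

```python
import json


def is_available(healthcheck_body: str) -> bool:
    """Return True when the healthcheck payload reports a healthy service.

    Assumption: "healthy" must be true and "code" must be 200.
    """
    try:
        report = json.loads(healthcheck_body)
    except json.JSONDecodeError:
        return False  # an unparseable body is treated as unavailable
    if not isinstance(report, dict):
        return False
    return report.get("healthy") is True and report.get("code") == 200


# The sample response from the policy-api healthcheck shown above.
sample = """
{
    "name": "Policy API",
    "url": "dev-policy-api-87997d75c-2dsxf",
    "healthy": true,
    "code": 200,
    "message": "alive"
}
"""
print(is_available(sample))
```

Treating malformed or unexpected bodies as "unavailable" keeps the check conservative. A stricter or looser rule may suit a given deployment better.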

...

| Metric | Metric available? | Exposed via Prometheus endpoint? | Comment |
|---|---|---|---|
| Availability of the policy-pap service | Yes | No | policy-pap healthcheck API. |
| Status of PDPs as registered with policy-pap | Yes | No | policy-pap consolidated healthcheck API. |
| Request rate (API requests per minute) | No | No | To be implemented for all the endpoints exposed by policy-pap. Sample s3p numbers for policy-pap stress tests. |
| Failure rate (API errors per minute) | No | No | To be implemented for all the endpoints exposed by policy-pap. Number of API calls per minute that return a status code outside the 2xx family. |
| Latency | No | No | To be implemented for all the endpoints exposed by policy-pap. |
| Policy deployment statistics: policyDeployFailureCount, policyDeploySuccessCount, totalPolicyDeployCount | Yes | No | Available from GET /policy/pap/v1/statistics (sample response below). |
| SSL certificate expiry time | No | No | HTTPS is disabled for the entire Policy framework. |

Sample response for GET /policy/pap/v1/statistics:

    {
        "code": 200,
        "policyDeployFailureCount": 0,
        "policyDeploySuccessCount": 0,
        "policyDownloadFailureCount": 0,
        "policyDownloadSuccessCount": 0,
        "totalPdpCount": 0,
        "totalPdpGroupCount": 0,
        "totalPolicyDeployCount": 0,
        "totalPolicyDownloadCount": 0
    }
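
The deployment counters returned by the policy-pap statistics endpoint can be combined into a single failure ratio. The sketch below assumes a response with the field names shown in the sample above. The function name deploy_failure_ratio and the non-zero counter values are hypothetical and serve only to illustrate the calculation.

```python
import json


def deploy_failure_ratio(statistics_body: str) -> float:
    """Return failed deployments as a fraction of all deployments."""
    stats = json.loads(statistics_body)
    total = stats["totalPolicyDeployCount"]
    if total == 0:
        return 0.0  # nothing deployed yet, so there is nothing to have failed
    return stats["policyDeployFailureCount"] / total


# Counters as in the sample response: no deployments yet.
empty = '{"policyDeployFailureCount": 0, "policyDeploySuccessCount": 0, "totalPolicyDeployCount": 0}'

# Hypothetical counters, for illustration only.
busy = '{"policyDeployFailureCount": 2, "policyDeploySuccessCount": 6, "totalPolicyDeployCount": 8}'

print(deploy_failure_ratio(empty))
print(deploy_failure_ratio(busy))
```

Returning 0.0 when no deployments have been recorded is a design choice. A caller that needs to distinguish "no data" from "no failures" could return None instead.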

...