ONAP Application Logging Specification v1.2 (Casablanca)

ONAP Application Logging Specification v1.2 (Casablanca)

About This Document

Official R1 documentation snapshot in  https://onap.readthedocs.io/en/latest/submodules/logging-analytics.git/docs/

This document specifies logging conventions to be followed by ONAP component applications.  

ONAP logging is intended to support operability, debugging and reporting on ONAP. These guidelines address:

  • Events that are written by ONAP components.

  • Propagation of transaction and invocation information between components.

  • MDCs, Markers and other information that should be attached to log messages.

    • MDC = Mapped Diagnostic Context

  • Human and machine-readable output format(s).

  • Files, locations and other conventions. 

Original AT&T ONAP Logging guidelines (pre amsterdam release) - for historical reference only: https://wiki.onap.org/download/attachments/1015849/ONAP%20application%20logging%20guidelines.pdf?api=v2

The Acumos logging specification follows this document at https://wiki.acumos.org/display/OAM/Log+Standards 

Logback reference: Logging Developer Guide#Logback.xml based on https://gerrit.onap.org/r/#/c/62405

Introduction

The purpose of ONAP logging is to capture information needed to operate, troubleshoot and report on the performance of the ONAP platform and its constituent components. Log records may be viewed and consumed directly by users and systems, indexed and loaded into a datastore, and used to compute metrics and generate reports. 

The processing of a single client request will often involve multiple ONAP components and/or subcomponents (interchangeably referred to as ‘application’ in this document). The ability to track flows across components is critical to understanding ONAP’s behavior and performance. ONAP logging uses a universally unique RequestID value in log records to track the processing of every client request through all the ONAP components involved in its processing.

A reference configuration of Elastic Stack is being deployed using ONAP Operations Manager since the amsterdam release - see usage in Logging Analytics Dashboards (Kibana)

This document proposes conventions you can follow to generate conformant, indexable logging output from your component.

Supported Languages

Java (including Scala and Clojure - because they compile to the same bytecode) and Python are supported, but conventions may be implemented by other technologies like GO. 

Java

The library and aop wrapper are written in Java and will work for Clojure (thanks to a discussion with @Shwetank that reminded me of languages that compile to bytecode) and Scala as well. see: https://git.onap.org/logging-analytics/tree/reference

Clojure

See the java libraries

Scala

see the java libraries

Python

https://lists.onap.org/g/onap-discuss/topic/logging_python_logging/25307286?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,25307286

How to Log

ONAP prescribes conventions. The use of certain APIs and providers is recommended, but they are not mandatory. Most components log via EELF or SLF4J to a provider like Logback or Log4j.

Logging Specification Compliance

For the casablanca release the logging specification has been updated and finalized as of June 2018.  Implementation of this specification is required but the method of implementation is optional based on each team's level of possible engagement.

High Level the changes are introduction of MDC's (key/value pairs like requestID=...) and Markers (labels like ENTRY/EXIT)

The ELK stack (indexes and dashboards) is focused on the new specification in Casablanca - if time permits we will have a migration path and/or support for both the older Amsterdam/Beijing release and Casablanca - https://lf-onap.atlassian.net/browse/LOG-584

Level 0: Log specification unchanged

Just keep your code as-is and commit to migrating to the the MDC and Marker focused specification for Dublin - logs will be the older format for now

Level 1: Log specification compliance via local project changes, library or wrapper

Continue to use your own library/wrapper if you have one like in SDC and AAI, change individual source files.

Level 2: Log specification compliance via SLF4J supplied logging-analytics library

Use Luke's SLF4J library directly as wrapper for MDCs and Markers by calling the library from within each function.

Level 3: Log specification compliance via AOP library over SLF4J supplied logging-analytics library

Use a spring based Aspect library that emits Markers automatically around function calls and retrofit your code to log via Luke's SLF4J for internal log messages.

LoggingWithAOP

Logging Library Location and Use

see https://git.onap.org/logging-analytics/tree/reference/logging-slf4j

and usage ONAP Development#DeveloperUseoftheLoggingLibrary  ONAP Development#KubernetesDevOps and Logging User Guide#LoggingDevOps

see Spring AOP example (minimal changes to existing code base) WIP in https://lf-onap.atlassian.net/browse/LOG-135 documented at ONAP Development#LoggingWithAOP

EELF

EELF is the Event and Error Logging Framework, described at https://github.com/att/EELF.

EELF abstracts your choice of logging provider, and decorates the familiar Logger contracts with features like:

  • Localization. 

  • Error codes. 

  • Generated wiki documentation.

  • Separate audit, metrics, security error and debug logs. 

EELF is a facade, so logging output is configured in two ways:

  1. By selection of a logging provider such as Logback or Log4j, typically via the classpath. 

  2. By way of a provider configuration document, typically logback.xml or log4j.xml. See Providers.

SLF4J

SLF4J is a logging facade, and a humble masterpiece. It combines what's common to all major, modern Java logging providers into a single interface. This decouples the caller from the provider, and encourages the use of what's universal, familiar and proven. 

EELF also logs via SLF4J's abstractions as the default provider.

Providers

Logging providers are normally enabled by their presence in the classpath. This means the decision may have been made for you, in some cases implicitly by dependencies. If you have a strong preference then you can change providers, but since the implementation is typically abstracted behind EELF or SLF4J, it may not be worth the effort.

Logback

Logback is the most commonly used provider. It is generally configured by an XML document named logback.xml. See Configuration.

See HELM template https://git.onap.org/logging-analytics/tree/reference/provider/helm/logback

Log4j 2.X

Log4j 2.X is somewhat less common than Logback, but equivalent. It is generally configured by an XML document named log4j.xml. See Configuration.

Log4j 1.X

Strongly discouraged from Beijing onwards, since 1.X is EOL, and since it does not support escaping, so its output may not be machine-readable. See https://logging.apache.org/log4j/1.2/.

This affects OpenDaylight-based components like SDNC and APPC, since ODL releases prior to Carbon bundled Log4j 1.X, and make it difficult to replace. The Common Controller SDK Project project targets ODL Carbon, so remaining instances of Log4j 1.X should disappear by the time of the Casablanca release.

What to Log

The purpose of logging is to capture diagnostic information.

An important aspect of this is analytics, which requires tracing of requests between components. In a large, distributed and scalable system such as ONAP this is critical to understanding behavior and performance. 

General 

It isn't the aim of this document to reiterate Best Practices, so advice here is general: 

  • Use a logging facade such as SLF4J or EELF. 

  • Write log messages in English.

  • Write meaningful messages. Consider what will be useful to consumers of logger output. 

  • Log at the appropriate level. Be aware of the volume of logs that will be produced.

  • Safeguard the information in exceptions, and ensure it is never lost.

  • Use error codes to characterize exceptions. 

  • Log in a machine-readable format. See Conventions.

  • Log for analytics as well as troubleshooting.

Others have written extensively on this: 

Standard Attributes

These are attributes common to all log messages. They are either:

  • Explicitly required by logging APIs:

    • Logger

    • Level

    • Message

    • Exception (note that exception is the only standard attribute that may routinely be empty).

  • Implicitly derived by the logging provider:

    • Timestamp

    • Thread.

Which means you normally can't help but report them. 

See https://www.slf4j.org/api/org/slf4j/Logger.html and https://logback.qos.ch/manual/layouts.html#ClassicPatternLayout for their origins and use.

Logger Name

This indicates the name of the logger that logged the message.

In Java it is convention to name the logger after the class or package using that logger.

  • In Java, report the class or package name.

  • In Python, the class or source filename.

Most other languages will fit one of those patterns. 

Level

Severity, typically drawn from the enumeration {TRACE, DEBUG, INFO, WARN, ERROR}.

Think carefully about the information you report at each log level. The default log level is INFO.

Some loggers define non-standard levels, like FINE, FINER, WARNING, SEVERE, FATAL or CRITICAL. Use these judiciously, or avoid them.

Message

The free text payload of a log event. 

This is the most important item of information in most log messages. See General guidelines.

Internationalization

Diagnostic log messages generally do not need to be internationalized.

Parameterization

Parameterized messages allow serialization to be deferred until AFTER level threshold checks. This means the cost is never incurred for messages that won't be written. 

  • Favor parameterized messages, especially for INFO and DEBUG logging.

  • Perform expensive serialization in the #toString method of wrapper classes.

For example:

logger.debug("Relax - this won't hurt: {}", new ToStringWrapper(costlyToSerialize));

Exception

The error stacktrace, where applicable.

Log unabridged stacktraces upon error.

When rethrowing, ensure that frame information is not lost:

  • By logging the original exception at the point where it was caught. 

  • By setting the original exception as the cause when rethrowing.

Timestamp

Logged as an ISO8601 UTC datetime. Millisecond (or greater) precision is preferable.

For example:

2018-07-05T20:21:34.794Z

Offset timestamps are OK provided the offset is included. (In the above example, the "Z" is a shorthand indicating an offset of zero – UTC).

Thread

The name of the thread from which the log message was emitted.

Thread names don't necessarily convey useful information, and their reliability depends on the thread model implemented by different runtimes, but they are sometimes used in heuristic analysis.

Efficiency

There is tension between utility and efficiency. IO bandwidth is finite, and the cost of serialization can be significant, especially at higher diagnostic levels.

Methods and Line Numbers

Many loggers can use reflection to emit the originating (Java) method, and even individual line numbers.

This information is certainly useful, but very expensive. Most logging implementations recommend that this not be enabled in production.

Level Thresholds

Level indicates severity.

Logger output is typically filtered by logger and level. The default logging level is INFO, so particular consideration should be given to the efficiency of INFO-level logging.

When DEBUG-level logging is configured, it's probably for good reason, and a greater overhead is expected. Be aware that it's not unusual for DEBUG logging to be left enabled inadvertently, however.

WARN and ERROR-level messages are of higher value, and comparatively rare, so their cost is less of a concern.

Conditionals

A common pattern is to place conditionals around (expensive) serialization.

For example:

if (logger.isDebugEnabled()) { logger.debug("But this WILL hurt: " + costlyToSerialize); }

Parameterized logging is preferable.

Context

MDCs

A Mapped Diagnostic Context (MDC) allows an arbitrary string-valued attribute to be attached to a Java thread via a ThreadLocal variable. The MDC's value is then emitted with each message logged by that thread. The set of MDCs associated with a log message is serialized as unordered name-value pairs (see Text Output).

A good discussion of MDCs can be found at https://logback.qos.ch/manual/mdc.html

Example

From @Luke Parker's call graph work in https://git.onap.org/logging-analytics/tree/reference/logging-slf4j-demo

LogEntry(markers=ENTRY, logger=ComponentAlpha, requestID=eb3e0dc2-6c3c-4bb7-8ed6-e5cc4ec7aad2, invocationID=06c815ef-5969-45cc-b319-d0dbcde89329, timestamp=Tue May 08 04:23:27 AEST 2018)


Mapped Diagnostic Context Table

Pipe Order

Name

Type

Group

Description

Applicable

(per log file)

Marker Associations

Moved

MDC

to



standard

attribute



Removed

(was in

older

spec)

Required?

Y/N/C

(C= context dependent)

N = not required

L=Library provided



Derived

Historical

Acumos
ref

Use Cases

Code References

Pipe Order

Name

Type

Group

Description

Applicable

(per log file)

Marker Associations

Moved

MDC

to



standard

attribute



Removed

(was in

older

spec)

Required?

Y/N/C

(C= context dependent)

N = not required

L=Library provided



Derived

Historical

Acumos
ref

Use Cases

Code References

1

LogTimestamp

log system



use %d field - see %d{"yyyy-MM-dd'T'HH:mm:ss.SSSXXX",UTC}









L











2

EntryTimestamp

MDC



if part of an ENTRY marker log









C











3

InvokeTimestamp

MDC



if part of an INVOKE marker log









C











4

RequestID

MDC



UUID to track the processing of each client request across all the ONAP components involved in its processing









Y









In general

https://git.onap.org/logging-analytics/tree/reference/logging-slf4j-demo/src/test/java/org/onap/logging/ref/slf4j/demo/component/AbstractComponentTest.java

5

InvocationID

MDC



UUID correlates log entries relating to a single invocation of a single component
In the case of an asynchronous request, the InvocationID should come from the original request 









Y











6

InstanceID

MDC



UUID to differentiate between multiple instances of the same (named) log writing service/application









Y



was InstanceUUID







7

ServiceInstanceID

MDC













C











8

thread

log system



use %thread field









L











9

ServiceName





The service inside the partner doing the call - includes API name









Y











10

PartnerName





unauthenticated = The part of the URI specifying the agent that the caller used to make the call to the component that is logging the message.

authenticated = userid

  1. If an authenticated API, then log the userid

  2. Otherwise, if the HTTP header "X-ONAP-PartnerName" was provided, then log that (note: this was a direction that we seemed to be going but never completed)

  3. Otherwise, if the HTTP header "User-Agent" was provided, then log that