Overview

...

MUSIC was released in the ONAP Beijing release and provides a service with recipes that individual ONAP components and micro-services can use for state replication, consistency management, and state ownership across geo-distributed sites. This is a crucial component enabling ONAP components to achieve S3P in terms of resiliency both within and across sites (platform-maturity resiliency level 3).

In this release we plan to address the following items:

  • MUSIC as a service: while MUSIC was consumed internally by components in the Beijing release, in Dublin we intend to provide MUSIC as an independent multi-site clustered service
  • Enable automated failure detection and consistent failover across sites for ONAP components using MUSIC through the PROM recipe. It will require no change to the code of the ONAP components and just a few scripting/configuration steps to achieve single-step automated failover while ensuring that the new leader/owner has access to the latest state information.
  • Provide the design to make MUSIC a fully sharded, scale-out common service, where as many ONAP sites/component replicas as required can be added for performance. The significant technical challenge is to eliminate the need for Zookeeper and build MUSIC entirely on Cassandra while preserving all its guarantees. We expect this change to improve both deployability (just one tool – Cassandra) and performance (initial benchmarks indicate a throughput improvement of at least 4-5 times). This is a crucial precursor for its use in edge computing and as the state management service for a federated ONAP.
  • Provide the design and seed code to allow MUSIC to support database (RDBMS) clustering across sites using the mdbc recipe, wherein ONAP components that require it can continue using a SQL database within a site while using MUSIC as the underlying transport layer across sites, with much better performance than standard solutions like Galera clustering.
  • Continued adherence to ONAP S3P requirements in Dublin
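
The failover goal in the list above (a new leader/owner takes over in a single step and sees the latest state) can be illustrated with a toy model. The sketch below is not MUSIC's API or the PROM implementation; `ToyStateStore`, its methods, the replica names, and the heartbeat timeout are all hypothetical and exist only to show the idea of ownership tied to a lock queue, with stale owners evicted so that the next replica in line reads the last written state.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ToyStateStore:
    """Hypothetical in-memory stand-in for a replicated store with one lock queue."""
    state: Dict[str, str] = field(default_factory=dict)
    lock_queue: List[str] = field(default_factory=list)  # head of the queue is the owner
    heartbeats: Dict[str, float] = field(default_factory=dict)

    def request_ownership(self, replica_id: str, now: float) -> None:
        # Replicas queue up for ownership; the first in line is the active owner.
        if replica_id not in self.lock_queue:
            self.lock_queue.append(replica_id)
        self.heartbeats[replica_id] = now

    def owner(self) -> Optional[str]:
        return self.lock_queue[0] if self.lock_queue else None

    def heartbeat(self, replica_id: str, now: float) -> None:
        self.heartbeats[replica_id] = now

    def write(self, replica_id: str, key: str, value: str) -> None:
        # Only the current owner may update the shared state.
        if self.owner() != replica_id:
            raise PermissionError(f"{replica_id} is not the owner")
        self.state[key] = value

    def read_latest(self, replica_id: str, key: str) -> Optional[str]:
        # Only the current owner may perform the "latest state" read.
        if self.owner() != replica_id:
            raise PermissionError(f"{replica_id} is not the owner")
        return self.state.get(key)

    def evict_failed_owner(self, now: float, timeout: float) -> Optional[str]:
        # Drop owners whose heartbeat is older than `timeout`; return the new owner.
        while self.lock_queue and now - self.heartbeats[self.lock_queue[0]] > timeout:
            failed = self.lock_queue.pop(0)
            del self.heartbeats[failed]
        return self.owner()


def demo():
    store = ToyStateStore()
    store.request_ownership("site-a", now=0.0)   # site-a becomes the owner
    store.request_ownership("site-b", now=0.0)   # site-b waits as standby
    store.write("site-a", "job", "step-3")
    store.heartbeat("site-b", now=10.0)          # site-a has stopped heartbeating
    new_owner = store.evict_failed_owner(now=10.0, timeout=5.0)
    return new_owner, store.read_latest(new_owner, "job")


if __name__ == "__main__":
    print(demo())
```

In this model the failover step is a single call to `evict_failed_owner`, and the standby needs no application-level changes beyond queuing for ownership and sending heartbeats. A real deployment would replace the in-memory dictionaries with the replicated store and its locking primitives.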

...

  • Targeted goal for Dublin
    • OOF-Homing Optimizer (HAS) uses MUSIC for its state persistence (as a queue) and as a highly available distributed messaging service. (2) 
    • ONAP Portal will use MUSIC to store its HTTP session state across sites in a persistent manner.
  • Stretch goal for Dublin: SDN-C will use the MUSIC PROM recipe for automated and consistent failover across sites.

Minimum Viable Product

MUSIC service that can serve the geo-redundancy needs of ONAP HAS and ONAP Portal while satisfying the platform maturity requirements for the Dublin release. 

...

In the long term we hope that MUSIC will be a common, shared state-management system for all ONAP components and micro-services to manage geo-redundancy. For example, we envisage the use of MUSIC for multi-site state management in SO (to store Camunda state across sites), <SDN-C, AppC> (to store ODL-related state across sites), A&AI (to store its graph data), and most other ONAP components that need to manage state across sites. Further, we envision that these services will use the MUSIC recipes (mdbc, prom, musicCAS, musicQ) to achieve the goal of a multi-site active-active federated ONAP solution.

...

| Area | Actual Level | Targeted Level for current Release | How, Evidences | Comments |
|---|---|---|---|---|
| Performance | 1 | 1 | This file shows basic performance benchmarks performed for MUSIC on a 10-node cluster. | 0 – none; 1 – baseline performance criteria identified and measured; 2 & 3 – performance improvement plans created & implemented |
| Stability | 1 | 1 | As shown in this file, our experimental runs were all over 1 hour. | 0 – none; 1 – 72 hours component level soak w/random transactions; 2 – 72 hours platform level soak w/random transactions; 3 – 6 months track record of reduced defect rate |
| Resiliency | 2 | 2 | Within each container we have scripts that will detect failure of MUSIC and restart it. However, if the entire container fails, we will need OOM to bring it up. | 0 – none; 1 – manual failure and recovery (< 30 minutes); 2 – automated detection and recovery (single site); 3 – automated detection and recovery (geo redundancy) |
| Security | 2 | 2 | SSL communication between Cassandra cluster nodes. REST over HTTPS with AAF for authentication. | 0 – none; 1 – CII Passing badge + 50% Test Coverage; 2 – CII Silver badge; internal communication encrypted; role-based access control and authorization for all calls; 3 – CII Gold |
| Scalability | 1 | 1 | Among the MUSIC components [tomcat, zookeeper, cassandra], new MUSIC nodes with tomcat and cassandra can be added seamlessly to scale the cluster (MUSIC itself is stateless). Zookeeper nodes ideally should not be scaled since there are major performance implications. However, this can be done with reconfiguration. | 0 – no ability to scale; 1 – single site horizontal scaling; 2 – geographic scaling; 3 – scaling across multiple ONAP instances |
| Manageability | 1 | 1 | Using EELF with logback as the logging provider. | 1 – single logging system across components; instantiation in < 1 hour; 2 – ability to upgrade a single component; tracing across components; externalized configuration management |
| Usability | 1 | 1 | Use SWAGGER for the REST API and Installation Docs. Will need to enhance and update the documentation. | 1 – user guide; deployment documentation; API documentation; 2 – UI consistency; usability testing; tutorial documentation |

...

None identified so far. 

Resources

Updated the Resources Committed to the Release centralized page.

  • Release Milestone

The milestones are defined at the Release Level, and all supporting projects agreed to comply with these dates.

...