2019-08-29 AAI Developers Meeting

Attachments (including meeting recording)

  File: Screen-747.png (PNG), modified Aug 27, 2019 by Keong Lim



Status values: OPEN, IN PROGRESS, ON HOLD, DONE, CANCELLED

Agenda Items

START RECORDING


# | Title | Raised By | Status | Last discussed | Notes
1. Schema validation tool

IN PROGRESS

2019-08-29
2. Helm Chart Common Templates

IN PROGRESS

2019-08-29
OOM-1936

https://gerrit.onap.org/r/c/oom/+/91553

https://gerrit.onap.org/r/c/aai/oom/+/91550

Use common templates in the helm charts, driven from values.yaml, since all the helm charts are identical except for items such as volumes, volume mounts, ConfigMaps and secrets.

The goal is to extract all that information into the values.yaml file and use common templates for more consistency.

Each individual helm chart has a filebeat sidecar container. When creating a new helm chart, the author only has to state in values.yaml whether the chart should enable it.

3. Kubernetes job problem

DONE

2019-08-29

Message to discuss list: https://lists.onap.org/g/onap-discuss/topic/32859796

We have a Dublin environment that had a Cassandra problem which caused the graphadmin-create-db-schema job to fail repeatedly.

Question 1. Is there an ONAP-built-in monitoring solution for the shared Cassandra database?

The Cassandra problem was a combination of:
- node-0 had Out of Memory error in JVM
- node-1 had error in commit log processing
- node-2 had Out of Disk Space error on filesystem

However, Kubernetes showed node-0 and node-2 still "Running" and apparently healthy (zero restarts), with node-1 in CrashLoopBackOff (many restarts).
The failed Cassandra nodes caused the graphadmin-create-db-schema job to fail (many restarts), which causes all other AAI pods to wait in Init state.

The Cassandra problems have been manually fixed now, but should ONAP have a monitoring solution built-in to detect it?


Question 2. How can we re-run that kubernetes job?

Deleting the failed job pods cleaned up the list but did not prompt any new job pods to be created.
Deleting the graphadmin pod caused it to restart and wait for the job completion, but did not trigger a new job to start.

I think the job has now hit its backoff limit, so it no longer runs, even though we have since fixed the Cassandra problem.
Can we reset some parameter so that it will finally run to completion?
Is there an appropriate helm command or kubectl command to re-run the job?

The rest of AAI pods are obviously waiting for that job to complete before they can progress out of Init state.
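
On Question 2: Kubernetes does not restart a Job once it has hit its backoff limit, so the usual approach is to delete the Job and create it again from its manifest. If the manifest is exported from the cluster rather than re-rendered from the chart, the fields generated by the cluster have to be stripped first. The following is a minimal sketch of that clean-up step only, assuming the exported Job has been loaded into a Python dict. The sample manifest values, the `app` label and the function name are hypothetical, and the generated label keys are an assumption about the Kubernetes version in use.

```python
import copy

# Labels the Job controller adds to the pod template (assumed key names;
# newer Kubernetes versions also add prefixed variants).
GENERATED_LABELS = ("controller-uid", "job-name")
# Metadata fields populated by the API server.
SERVER_METADATA = ("uid", "resourceVersion", "creationTimestamp", "selfLink")


def resubmittable_job(job: dict) -> dict:
    """Return a copy of an exported Job manifest that can be created again."""
    clean = copy.deepcopy(job)
    clean.pop("status", None)  # old run results, including the failure count
    for field in SERVER_METADATA:
        clean.get("metadata", {}).pop(field, None)
    spec = clean.get("spec", {})
    spec.pop("selector", None)  # generated selector is tied to the old Job UID
    labels = spec.get("template", {}).get("metadata", {}).get("labels", {})
    for key in GENERATED_LABELS:
        labels.pop(key, None)
    return clean


# Hypothetical exported manifest, trimmed to the relevant fields.
exported = {
    "apiVersion": "batch/v1",
    "kind": "Job",
    "metadata": {"name": "graphadmin-create-db-schema", "uid": "1234", "resourceVersion": "42"},
    "spec": {
        "backoffLimit": 20,
        "selector": {"matchLabels": {"controller-uid": "1234"}},
        "template": {
            "metadata": {
                "labels": {
                    "app": "aai-graphadmin-job",
                    "controller-uid": "1234",
                    "job-name": "graphadmin-create-db-schema",
                }
            },
            "spec": {"restartPolicy": "Never", "containers": []},
        },
    },
    "status": {"failed": 21},
}

clean = resubmittable_job(exported)
print(sorted(clean["spec"].keys()))
```

The cleaned manifest would then be submitted after deleting the old Job (for example with `kubectl delete job` followed by `kubectl create -f`). Re-rendering the Job from the helm chart avoids the clean-up entirely.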

Update:

  • re-running the graphadmin-create-db-schema job (done!)
  • fixing the readiness-check that is waiting for cassandra
  • Also posted a message on rocketchat, hoping that FREEMAN, BRIAN D or another expert on the ready.py script in the readiness-check pod can assist.
  • Mahendra Raghuwanshi: do you know if there is a better way to monitor Cassandra?
  • Discussed at OOM Meeting Notes - 2019-08-14: Mike Elliott is trying to get some resources to look at the JIRA case OOM-2057.

Harish will work on AAI-2082.

4. BYOQ DSL wiki

IN PROGRESS

2019-08-29

Referring to https://wiki.onap.org/display/DW/cloud-region-fromVnf

and https://gerrit.onap.org/r/gitweb?p=aai/schema-service.git;a=blob;f=aai-queries/src/main/resources/schema/onap/query/stored-queries.json;hb=HEAD#l3

Wiki page says that output is:

vserver
vnfc
tenant
cloud-region

but stored-queries.json definition includes:

.createEdgeTraversal(EdgeType.COUSIN, 'vserver', 'pserver').store('x')

So the output should also include "pserver"?

Also, the wiki page shows


The formatting of the traversal suggests that only cloud-region would be returned, which is inconsistent with the sections above on the same page.

Which one is correct for this case?


This is my attempt at a translation of the Gremlin query into DSL query:

{
 "dsl":"generic-vnf*('vnf-id','id number') > [
  vnfc* > vserver* > [ pserver*, tenant* > cloud-region* ] ,
  vserver* > [ pserver*, tenant* > cloud-region* ] ]"
}

Please confirm whether this is accurate and correct.
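
One practical note on the draft above: a JSON string cannot contain raw line breaks, so the multi-line layout would have to be collapsed onto one line before it is sent. A minimal sketch of building the request body follows. The DSL text is the unconfirmed draft from these notes, and the function name and example vnf-id are hypothetical.

```python
import json

# Draft DSL from the notes above, kept on several lines for readability.
# The query itself is still awaiting confirmation.
DRAFT_DSL = """
generic-vnf*('vnf-id','{vnf_id}') > [
  vnfc* > vserver* > [ pserver*, tenant* > cloud-region* ] ,
  vserver* > [ pserver*, tenant* > cloud-region* ] ]
"""


def build_dsl_payload(vnf_id: str) -> str:
    """Fill in the vnf-id, collapse whitespace and serialise as JSON."""
    query = DRAFT_DSL.format(vnf_id=vnf_id)
    one_line = " ".join(query.split())  # removes the line breaks and indentation
    return json.dumps({"dsl": one_line})


payload = build_dsl_payload("example-vnf-id")
print(payload)
```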


5. Jenkins jobs

DONE

2019-08-29

El Alto branch is failing but master branch is passing.

The Ubuntu node works but the CentOS node fails.

https://jenkins.onap.org/view/aai-aai-common/job/aai-aai-common-maven-stage-elalto/jobConfigHistory/showDiffFiles?timestamp1=2019-08-07_17-28-00&timestamp2=2019-08-07_20-13-19

Problem with releasing artifacts, since self-release is blocked, but LF cannot help with release.

aai-common 1.5.2 appears in nexus, but it looks different from 1.5.1, missing the ".asc" files which appear to be GPG signatures.

This is now fixed.

6. AAI schema UML diagram

DONE

2019-08-29

2019-08-12 AAI Information Model Reverse Engineering

7. Review of AAI El Alto backlog

DONE

2019-07-29


8. AAI-EVENT configuration

IN PROGRESS

2019-08-29

While investigating and responding to https://lists.onap.org/g/onap-discuss/topic/32651336, we discovered that the topic name "AAI-EVENT" appears in some configuration property files but is also hard-coded in some Java files.

Is this topic name intended to be fully configurable or not?

e.g.

Configured: https://gerrit.onap.org/r/gitweb?p=aai/graphadmin.git;a=blob;f=src/main/resources/etc/appprops/aaiconfig.properties;hb=HEAD#l53

Configured: https://gerrit.onap.org/r/gitweb?p=aai/resources.git;a=blob;f=aai-resources/src/main/resources/etc/appprops/aaiconfig.properties;hb=HEAD#l58

Hard-coded: https://gerrit.onap.org/r/gitweb?p=aai/aai-common.git;a=blob;f=aai-core/src/main/java/org/onap/aai/dmaap/AAIDmaapEventJMSConsumer.java;hb=HEAD#l121

Hard-coded: https://gerrit.onap.org/r/gitweb?p=aai/aai-common.git;a=blob;f=aai-core/src/main/java/org/onap/aai/util/StoreNotificationEvent.java;hb=HEAD#l332

  1. We need to concentrate configuration (like the event name) in one place and not have it spread throughout the properties files and Java classes.
  2. We need a description of the payload, so that possible new consumers know what to expect. A light description would probably be enough.
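
As an illustration of point 1, the topic name could be resolved in a single helper that reads the properties file and falls back to a default, with every producer and consumer calling that helper instead of embedding the literal. The sketch below is in Python purely for brevity (the affected code is Java). The property key is an assumption and should be checked against the aaiconfig.properties files linked above, and the function names are hypothetical.

```python
DEFAULT_TOPIC = "AAI-EVENT"
# Assumed property key; verify against the real aaiconfig.properties.
TOPIC_KEY = "aai.notificationEvent.default.eventTopic"


def load_properties(text: str) -> dict:
    """Parse simple key=value lines, skipping blanks and comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def event_topic(props: dict) -> str:
    """Single lookup point for the topic name, with one shared default."""
    return props.get(TOPIC_KEY, DEFAULT_TOPIC)


sample = "# sample properties\naai.notificationEvent.default.eventTopic=MY-TOPIC\n"
print(event_topic(load_properties(sample)))
print(event_topic({}))
```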


9. AAI Alpine conversion - raised by Dmitry Puzikov (Deactivated)

DONE

2019-08-29

After some delay we were able to deploy ONAP with minimized AAI images, run the ONAP robot healthdist tests and test model-loader manually.

  • All AAI-related robot tests and model-loader tests passed.
  • The detailed report is in JIRA: https://jira.onap.org/browse/INT-1023
10. New UI Features / Historical Tracking - raised by William Reehil

DONE


The AT&T team has done an exciting POC at a sprint-a-thon event that they would like to share with the community.
11. Alternative meeting system

DONE


Just in case Zoom problems continue, try this instead:

https://meet.jit.si/onapaai


12. Review El Alto proposals

IN PROGRESS


AAI Epics/Stories in JIRA

13. 2 Types of logging in A&AI WS

ON HOLD


1st Nov 2018

There are 2 types of logging in the services

  • one read from EELFManager
  • the other Logger log = Logger.getLogger( ...

Is that correct? Shouldn't there be just 1 type?

1st Nov:

After the Casablanca release, investigate the logging guidelines and figure out which library to use in order to unify logging within A&AI.

26th Nov: See also ONAP Application Logging Specification - Post Dublin

29th Nov: how does this fit with LOG-877?

28th May: Stela Stoykova is fixing AAI-2462. Is there more that should be done for El Alto?

https://gerrit.onap.org/r/c/aai/cacher/+/85319

  • Former user (Deleted) to reach out to the logging enhancement team and ask whether there is an ONAP-wide logging system that we should use, as the A&AI microservices use at least 2 different approaches
14. AAI Standalone UI Access/setup

DONE

2019-05-23

Francis Paquette

James Forsyth

AAI-2418

  • In a Casablanca environment without Sparky-fe, how can we access the AAI-UI?
  • In a standalone AAI setup, how can I access the AAI UI? I am running resources, traversal, elastic-search, search-data-service and sparky-be, but I am not able to access the UI. Do I need to run any other services to access it?

Tried with the URL below:

http://<IP>:9517/services/aai/webapp/index.html

response:

Whitelabel Error Page

This application has no explicit mapping for /error, so you are seeing this as a fallback.
Tue May 21 05:07:26 UTC 2019
There was an unexpected error (type=Not Found, status=404).
No message available

15. Dublin Branching

DONE

2019-05-09

El Alto runs Jun - Sep - technical debt and S3P/deployability release

Frankfurt starts in Sep 

As of now, master will accept feature development for Frankfurt

We tag Dublin at release time (early June)

The Dublin branch becomes El Alto (similar to a Dublin Maintenance Release)

El Alto changes will be cherry picked back to master

16. GraphGraph demo - raised by Former user (Deleted)

DONE

2nd May 2019

A 5-10 minute demo of GraphGraph (AAI-531).

Feedback needed!

17. AAI Modeling Multi-part key for schema elements

DONE

25th Apr 2019

From https://lists.onap.org/g/onap-discuss/topic/31317665

Discussion:

  • The cloud-region schema element is unusual in that it has a two-part key i.e. "cloud-owner" and "cloud-region-id". There are not many other usages of it ("ctag-pool" , "service-capability" and "route-target" are three others, out of over 100 other schema elements)
  • Is it possible to enhance the error message to indicate that part of the key value is missing from the relationship-data? (AAI-2391)
  • Is it time to deprecate the relationship-data and switch over to using the related-link only?
  • Is there any modeling guidance that would steer new designs away from using multi-part key for schema elements?
  • Are there other caveats to using the multi-part key design for schema elements?
  • Can we get feedback from Chandra Cinthala on the key design for multi-part keys and whether this will be more common going forward?

    From: CINTHALA, CHANDRA [mailto:cc1196@att.com]
    Sent: Tuesday, April 30, 2019 12:16 AM
    To: Keong Lim <Keong.Lim@huawei.com>; FORSYTH, JAMES <jf2512@att.com>
    Cc: AGGARWAL, MANISHA <amanisha@att.com>
    Subject: Re: [confluence] Keong Lim has assigned tasks to you in "2019-05-02 AAI Developers Meeting"


    Keong,


    I think we have no plans to deprecate the relation-data option in the A&AI relationship payload.

    It's another option for the client to specify the relationship.


    Thanks

    Chandra

  • See also email from Marco Platania in https://lists.onap.org/g/onap-discuss/topic/31385256
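
The enhanced error message asked about above could come from a check that compares the relationship-data entries against the key attributes of the target node type. A minimal sketch follows, assuming the relationship-key values take the form "node-type.attribute". The key table lists only cloud-region, as discussed above, and the function names and message wording are hypothetical.

```python
# Key attributes per node type; only cloud-region is listed here, taken from
# the discussion above. A real check would derive this from the schema.
KEY_ATTRIBUTES = {"cloud-region": ("cloud-owner", "cloud-region-id")}


def missing_key_parts(relationship: dict) -> list:
    """Return the key attributes that relationship-data does not supply."""
    node_type = relationship.get("related-to", "")
    expected = KEY_ATTRIBUTES.get(node_type, ())
    supplied = {
        item.get("relationship-key")
        for item in relationship.get("relationship-data", [])
    }
    return [attr for attr in expected if f"{node_type}.{attr}" not in supplied]


def describe_missing(relationship: dict) -> str:
    """Build a message naming the missing key parts, or an empty string."""
    missing = missing_key_parts(relationship)
    if not missing:
        return ""
    node_type = relationship.get("related-to", "")
    return f"relationship-data for {node_type} is missing key part(s): {', '.join(missing)}"


# Hypothetical payload that supplies only one half of the two-part key.
partial = {
    "related-to": "cloud-region",
    "relationship-data": [
        {"relationship-key": "cloud-region.cloud-region-id", "relationship-value": "RegionOne"}
    ],
}
print(describe_missing(partial))
```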
18. Return codes and messages for WS

ON HOLD

25th Apr 2019

Is there a guide for the description of the error message and the error codes? How are new error states (message + code) added?

  • William LaMont will send James Forsyth the output of a script that formats the error.properties file to make a wiki page and readthedocs
  • James Forsyth should commit that script and create a wiki for the error properties
19. Purpose of fields in AAI

DONE

18th April 2019

Dénes Németh wrote in AAI-1104:

I think it would be good to answer what is the meaning of the field (collection of PEMs of the CA xor URL)

Questions:

1. Is AAI intended to strictly prescribe how the fields are used and what contents are in the values?
2. Or does AAI simply reflect the wishes of all the client projects that use it to store and retrieve data?

Even if (1) is true, AAI is not really in any position to enforce how clients use the data, so really (2) is always true and we need to consult the original producers of the data and the ultimate consumers of the data to document their intended meanings.

How do we push to have documentation on the purpose and meaning of the fields in AAI?

Where does all this documentation go?

Should the documentation be backed up by validation code?

See also discussion about AAI in 2018-11-28 ExtAPI Meeting notes

29th Nov: Started on new wiki page AAI Schema Producer-Consumer Pairings

18th Apr: Can we have this documentation go into ONAP in a generic way?

24th Apr: See also questions about "sw-version" in https://lf-onap.atlassian.net/wiki/display/DW/5G+-+PNF+Plug+and+Play?focusedCommentId=16367935


20. range query

IN PROGRESS

2019-08-29


21. Schema-service roadmap

DONE

2019-08-29

31st Jan 2019:

The schema-service is ready. Currently it provides file-sharing capabilities in terms of schema/edgerule files.

In order for GraphGraph to take advantage of the schema parsing/processing in schema-service additional abstractions have to be implemented on top of the crude file2string functionality currently in schema-service.

  • Venkata Harish Kajur will ask Manisha Aggarwal if the current functionality of the schema-service is the final version for Dublin and if there will be further enhancements in the next releases.

GraphGraph needs the following functionality:

Venkata Harish Kajur and Manisha Aggarwal: what is missing in schema-service that is needed in GraphGraph is the following:

  • REST call to get available schemas
  • list of all schema nodes/items (like vserver, tenant, p-interfaces..) for example on a REST path /schemas/{schema}/nodes
  • all relevant attributes of a given node/item for example on REST path /schemas/{schema}/nodes/{node}
  • edges/relationships with their attributes between schema nodes/items (for example on REST path /schemas/{schema}/edges where you specify a "from" "to" schema items as query params)
  • subgraph of the schema, where you specify 1. initial (root) items/node (like tenant or vserver) 2. schema version and 3. number of parent/cousin/child hops from the initial item/node
  • all paths in a given schema graph between 2 items/nodes (like vserver and tenant) for a given schema version
  • edges in the schema graph should be composed of edges in the schema file + edges created from the edgerules file
  • edges should contain basic attributes when delivered via the subgraph call (like parent/child relationship and important properties from edgerules) and have additional (or all) attributes when queried via the /schemas/{schema}/edges REST endpoint.
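
To make the subgraph and path requests above concrete, here is a minimal sketch of the two graph operations over an in-memory edge list. The edges are a small illustrative set using node names mentioned elsewhere in these notes, not the real schema or edgerules content. Treating edges as undirected is an assumption, and all function names are hypothetical.

```python
from collections import deque

# Illustrative edges only; a real implementation would combine edges from the
# schema file with edges from the edgerules file.
EDGES = [
    ("generic-vnf", "vnfc"),
    ("generic-vnf", "vserver"),
    ("vnfc", "vserver"),
    ("vserver", "pserver"),
    ("vserver", "tenant"),
    ("tenant", "cloud-region"),
]


def adjacency(edges):
    """Build an undirected adjacency map from (from, to) pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def within_hops(graph, root, hops):
    """Breadth-first search: nodes reachable from root in at most `hops` steps."""
    depth = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        if depth[node] == hops:
            continue  # do not expand beyond the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return set(depth)


def simple_paths(graph, start, goal, path=None):
    """Depth-first search: all paths from start to goal without repeated nodes."""
    path = (path or []) + [start]
    if start == goal:
        return [path]
    found = []
    for nxt in sorted(graph.get(start, ())):
        if nxt not in path:
            found.extend(simple_paths(graph, nxt, goal, path))
    return found


graph = adjacency(EDGES)
print(sorted(within_hops(graph, "cloud-region", 1)))
for p in simple_paths(graph, "generic-vnf", "cloud-region"):
    print(" > ".join(p))
```

Enumerating every simple path can grow quickly on the full schema graph, so a real endpoint would probably need a depth limit.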

20th Mar 2019:

Open questions for schema-service:

  1. what is the current implemented functionality?
  2. what are the business use-cases in ONAP for schema-service? Description of functionality in relation to other services/projects is needed. In other words who needs it and why?
  3. if no business use-cases can be formulated we should consider removing schema-service from A&AI and replacing it with standard file-sharing mechanisms.

21st Mar 2019:

Based on William Reehil's comments at https://lf-onap.atlassian.net/wiki/display/DW/AAI+Schema+Service?focusedCommentId=16325457: what is "our future proposed functionality"?

22. New AAF Certificates at startup - raised by Jimmy Forsyth

ON HOLD

24th Jan 2019

AAI-2476

AAF will generate certificates to be used by the containers at startup; AAI services should use the run-time generated certs instead of the ones that are in the repos or oom charts.

In Dublin the services will mount a volume with certificates. This is on the roadmap for Dublin as a feature.

  • Is this for all services and/or HAProxy?
  • Where are the certificates coming from (OOM/gerrit/generated by AAF)?

12th June 2019: Jonathan Gathman demonstrated aaf-hello functionality at DDF event

23. AAI HAProxy and 2-way-TLS

ON HOLD

2019-08-29

Technical solution to either decommission the proxy or make design changes to AAF to enable client side certificates.

After VF2F we will know if this is a requirement in Dublin. We will discuss after this date.

question raised: MSB - would client authentication be supported?

15th Dec: https://lf-onap.atlassian.net/wiki/display/DW/Pluggable+Security#PluggableSecurity-7.10Identifiedandsupportedpatternsandfeatures

James Forsyth  - please update if this agenda item is still relevant

24. AAI Backup and Restore

DONE

2019-08-29

FREEMAN, BRIAN D asked on Re: Backup and Restore Solution: ONAP-OOM :

what would be the approach to backup an entire ONAP instance particularly SDC, AAI, SDNC data ? would it be a script with all the references to the helm deploy releases or something that does a helm list and then for each entry does the ark backup ?

What is the AAI strategy for backup and restore?

What is the overall ONAP strategy for backup and restore?

Should it be unified with the data migration strategy as per "Hbase to Cassandra migration" on 2018-11-14 AAI Meeting Notes?

  • James Forsyth will raise the topic of backup and restore functionality in ONAP: whether it is feasible, whether it is on the roadmap, and what other PTLs think

Jimmy didn't directly raise the topic but there was movement: Keong Lim asked "if istio service mesh is a no-go, is there a replacement for secure onap communications? is backup/restore/upgradability included in s3p?"

Michael O'Brien replied that a reference tool set for backup and restore was introduced in Casablanca:  Backup and Restore Solution: ONAP-OOM

Mike Elliott said he would look at Brian's question; AAI will provide support as needed.