LOG Meeting Minutes 2019-02-19
Meeting at 1100 EST Tue - https://zoom.us/j/519971638
https://lists.onap.org/g/onap-discuss/topics
http://onap-integration.eastus.cloudapp.azure.com:3000/group/onap-integration
https://jira.onap.org/secure/RapidBoard.jspa?rapidView=143&view=planning.nodetail&epics=visible
Agenda
- Michael started parallel full time ONAP related DevOps position - should discuss impact on logging/pomba project
Notice to onap: https://lists.onap.org/g/onap-discuss/topic/michael_reduced_availability/29918628?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,29918628
Team,
I have taken on a full time DevOps kubernetes based role last week directly related to ONAP that may cause less focus on public in the short term and includes 20% travel.
We can discuss this in the affected meetings.
I have more than a couple pending mails to answer – sorry for the de-focus the last 2 weeks.
I am still working out the details of working privately and publicly – as I was previously 100% public – the team I work with on DevOps is very open to the idea of continuing the LOG and CD work – as there is also opportunity for up-sourcing as both sides are ONAP focused and the role is in support of production ONAP deployment.
So just a heads up
- Any committer can run for the PTL role - it is a short 2 phase process - 3 days to declare who runs (on onap-discuss), then 3 days for committers to vote
- Prep for M2 this Thursday - Logging Dublin M2 Deliverables for Functionality Freeze Milestone Checklist
- TSC-25 ramping down - CD work
- Continue coding changes for spec - casablanca spec is implemented in Dublin
- Answer pending questions/mails
- Review opentracing/zipkin - LOG-104
- Lorraine A. Welch - System.out / standard out - review in terms of not using a hardcoded file appender, and in terms of syslogs
- Discussion: DCAE logs
- log formatting questions - the 4 types of logs per microservice -
- Discuss: Acumos single log discussion - MARKERs to replace log file name key - optional for ONAP for now - TODO verify our spec
Attendees
Items
- Plan for
- current implementation
- Future spec for el-alto for VNFs below
- this week...
- logging work
- Dev environment back up - merging existing library - using vid-app-common as a template for usage of org.onap.portal.sdk
<epsdk.version>2.4.0</epsdk.version>
- prepping for splitting repos - 1 per component - will need 8+
- working with dmaap on charts and filebeat
- pylog issues for vfc are transient LF issues - posting response (with multicloud)
- release notes
- scorecard for S3P
- dcaegen2 work in DCAEGEN2-1166 under https://gerrit.onap.org/r/#/c/77910/
- questions on logging format on onap-discuss, including hv-ves, that I need to address: https://lists.onap.org/g/onap-discuss/message/14997?p=,,,20,0,0,0::Created,,log,20,2,20,29162034
- Discussion on VNF logs (CLAMP) @Sanjay - with Alok Gupta
- vnf behaviors on top of vnf events
- dmaap TCA like events - look into capturing these
- ?add our own log tracing when VNFs react to events - another tracing EPIC we should look at
- both for VES and non-VES format
- There is a gap in tracing VNF behaviour - via 5G RAN -
- cloud infrastructure logs - vm, k8s and cloud service logs (beyond the vm level) - CNI cloud-native plugins example
- need to think more about combining the logs - currently we just capture them.
- Provide log requirements to VNF onboarding team.
- Spondon Dey - feeding in to CL, policy - scaling behaviour - onap to drive more
- infrastructure work
- Helm ownership -
- CI/CD - going with Orange MQ robot for oom merges
- Perf and mostly crashloop avoidance
- Deploy changes for RHEL7.6
- Deploy order work
- ARM A1 testing of new containers from dockerhub on AWS
- 80g/vm images - reducing footprint, standard alpine java image, ARM/i64 compat
- Nodeports for dmaap
- Datalake (now part of DCAE) does not yet affect us but it will -
- ONS April conference prep work
- The rest of our backlog is still in progress - M2 is coming up on the 14th
- logging work
- last week....
- get committed resources for the next 2 months M2 to RC0 - so we can state what is in and out of the Dublin release
9 weeks to April 4th - M2 - functionality freeze - 21 Feb
M4 - April 4th
I have taken the liberty of adding some names - feel free to add your availability or edit this section - we will paraphrase it in the M1 report - Logging Dublin M1 Release Planning
- Michael O'Brien - 50% direct Logging work - really 40% dev/devops + 10% PTL/TSC/Project - the rest = related ONAP, CD, Doc, OOM, conference/customer
- Todo lowered?
- Prudence Au - doing half of the PTL work, template, meets, reviews - especially POMBA with James MacNider on reviews - representing on most Thu POMBA meets
- Avdhut Kholkar - thank you for all the commit reviews
- Luke Parker - co-PTL and reference code
- Sanjay - TODO: % of work on the project
- M2 - functionality freeze - 21 Feb
- Meeting at 1200 EST today on ARM docker images (affecting LOG images as we need to get the ARM layer into the image - wrap the dockerhub versions)
- LOG-331 - Stop using "latest" for any image - lock down the version tag for testing stability - see our use of busybox
- LOG-949 - Good news: We passed M1 last Thu
- Dublin scope finalized for M1
Release Planning#DublinReleaseCalendar
Logging Dublin Scope - New work for dublin
- Assist in 5G edge work via OOM/AWS work - meet is at 1100 EST Wed with Ramki Krishnan's team
- plus metric capture via Prometheus -
- LOG-911
- LOG-707
- Review/consolidate JIRAs
- opentrace - will try to get in by april - an LF project
- priority list
- infrastructure - filebeat sidecars (before DaemonSet refactor) - see Log Streaming Compliance and API
- format - via library - portal/sdk - minimal retrofit for markers/mdcs - LOG-600
- all s3p - security, perf (aai-log-3**) - scaling - run with 1 logstash
- Logstash used to be a DaemonSet - however filebeat needs to be a DaemonSet instead of individual sidecars - 1 container per vm - get story
- Additional tools - get POC for each - determine which goes to production level
PRI | Title | Responsible | Status OPEN DONE | In Dublin | Last Worked on | Start | Notes |
---|---|---|---|---|---|---|---|
Security Vulnerability template | Ongoing | IN | 20190122 | ||||
M1 template | DONE | IN | 20190124 | 20190122 | |||
ONS NA 2019 April Talk proposal | DONE | IN | 20190122 | ||||
Use manifest generation over raw oom values.yaml docker image tag names | DONE | IN | 20190124 | 20190117 | pending documentation in RTD. Team, in the TSC it was decided to treat the diff between oom and the manifest by always running the manifest generated yaml in your deployments – you will not need to do this for master work – just for Casablanca and RC0-2 work. Working out the details in https://jira.onap.org/browse/LOG-929 /michael | ||
S3P Logging compliance TSC/PTL | IN PROGRESS | IN | 20190115 | 20190114 | |||
El-Alto 1.4 logging spec change - plan only todo merge with Dave's below | IN PROGRESS | IN | 20190122 | ||||
Dublin Scope Planning | DONE | IN | 20190124 | ||||
RTD documentation | OPEN | IN | 20180129 | Attending Thu 1130-1230 meets | |||
restart log4j format and files example | IN PROGRESS | IN | 20190115 | 20190108 | https://gerrit.onap.org/r/#/c/62405/ for LOG-630 | ||
Work with portal/sdk library | Michael O'Brien | IN PROGRESS | IN | 20190129 | 20190115 | Update: 20190129 - Existing eclipse environ for the RI being retrofitted. At the pom stage, bringing in the jar via portal/sdk in use by aai, dmaap, sdk, vid (vid link into so maybe?) <groupId>org.onap.portal.sdk</groupId>. Epic - LOG-600. Jira - PORTAL-348. Review investigation in Log Streaming Compliance and API#ExistingLibraryResearch. Luke Parker discussion: need to use the portal library in an initiating project for tx processing, working likely with the SO team - via the work we are doing for them in https://gerrit.onap.org/r/#/c/69947/ (check the original spec - ODL specific - check appc/sdnc use of ccsdk) | |
New Committers | OPEN | 20190115 | We have room for 2-5 committers and will be reviewing the list in Logging Enhancements Project Proposal#KeyProjectFacts - add your details to Logging Committer Promotion Requests. 20190129 status - waiting on contributor documentation from each contributor | ||||
OPNFV/ONAP Paris | DONE | 20190108 | https://ddfplugfest19.sched.com/ Tue-Thu Clover Gambia on prior https://zoom.us/j/115579117 - 7 hours ahead https://ddfplugfest19.sched.com/event/K1Gy/opnfv-clover-utilizing-cloud-native-technologies-for-nfv | ||||
Security badging | IN PROGRESS | IN | 20190129 | Need to restart this | |||
Security Vulnerabilities | IN PROGRESS | IN | 20190129 | lower - but for M4 | |||
s3p Secure https endpoints LOG + POMBA | for djhunt | OPEN | IN | Discussion on whether we need to lock down the nodeport exposed ports. Can key off POMBA work already done. todo: get s3p page | |||
Format compliance - working with AAI team + perf | IN PROGRESS | IN | 20190115 | 20181101 | 20190115 - casablanca cherry pick in queue logstack 5 to 3 and 1 https://gerrit.onap.org/r/#/c/75702/ (+) 20190109 #6 on 2018-12-20 AAI Developers Meeting around LOG-376. Discussion with @Sanjay Agraharam and [~pau2882] on checking how cassandra is running on the vm and if debug levels are on - should be verified. Use labelling to split aai-cs and ls - no DaemonSet. Michael O'Brien to reduce core count for ls to 1 from 3 - LOG-915. Edited 2019-01-10 AAI Developers Meeting for the 10th | ||
AAI team - 2 types of logging AOP/non-AOP | OPEN | IN | 20181101 | #22 on 2018-12-20 AAI Developers Meeting | |||
Logging requests from Vendors | OPEN | LOG-877, LOG-876 - #15, 19 and 37 on SP priorities for Dublin | |||||
LOG Streaming compliance | IN PROGRESS | IN | Log Streaming Compliance and API - LOG-487, LOG-852 | ||||
opentracing via | IN (planning/POC for sure) | 20190123 | @Sanjay discuss integration - out of band processing - LOG-104 - see zipkin arch https://zipkin.io/pages/architecture.html - possibly tie both as a client of es? Tie in to ONS NA 2019 April demo booth for LF | ||||
discussion - remove | 20190108 | discuss tick/tock logging spec behaviour - casablanca implemented in dublin, dublin implemented in elalto | |||||
Log Checker | OUT | 20190109 | Mike to review with Horace | ||||
Search Guard | OPEN | Maybe | 20180109 | ||||
spec changes for Dublin | IN (planning) | 20190109 | 20190109 | Dublin spec changes for El-Alto: environment name, release name - check mail for reply. Michael O'Brien, Prudence Au: proposal of renaming the log file name itself for the release, i.e. 3.0.0-ONAP - will discuss later for next week | |||
Cluster logging behaviour S3P | IN | server name in clustered environments - I will add the details and the Jira right after this meet | |||||
LOG ELK stack indexing/dashboards with Prometheus below | OPEN | IN | 20190123 | ||||
Casablanca 3.0.1 work until 10th Jan Including POMBA | DONE | 20190122 | 20190113 | LOG-913 - revert Jira for data-router off TSC-92 - pending merge of https://gerrit.onap.org/r/#/c/75999/ | |||
LOG openlab tenant devops cluster creation/testing | Done pending vFW | IN | We have 2 clusters a 1+4 and 1+13 used for testing deployments and running the vFW | ||||
Wiki edits, RTD review | OPEN | IN | Requiring Updates, Merges or Marked Deprecated | ||||
Metric Streaming and Prometheus | IN PROGRESS | IN | 20181207 | LOG-911 - experimental chart on http://secure.solar:30000/graph - LOG-773, LOG-861 - work with Vaibhav Chopra - OOM-1504 - @Sanjay - note the prom chart assumes a k8s environment - what about bare metal | |||
Finish SO filebeat additions | IN PROGRESS | IN | 20181207 | ||||
Finish LOG common charts | OPEN | OUT to El-alto | 20190123 | 20181207 | James MacNider - bring in Prianka's common ELK charts and use them in Clamp, LOG, SDC, POMBA https://gerrit.onap.org/r/#/c/64767/ - OOM-1276 - defer to El-alto under LOG-936 | ||
Team Members Thank you and review | OPEN | IN | |||||
del | Review last 4 weeks since | IN PROGRESS | |||||
TSC/PTL meet actions | IN PROGRESS | ||||||
OOM transfer chart ownership to teams LOG is part of poc | IN PROGRESS | IN | 20190107 | Starting - will have a training session - will send out any meetings to onap-discuss. We may have the same symlink repo folder like we do for doc. Last discussed TSC 20180109 | |||
OOM Deployment priority base platform includes LOG | IN PROGRESS | OUT todo review with Mike Elliott | 20190123 | 20181207 | Q2) priority of system level containers like the ELK stack - OOM has a common services JIRA - DMaaP, AAF - TODO get JIRA - make sure log is in this! There is a cd.sh retrofit that sequences the pods in order for deployment stability - this will be phased out when tiered deployment comes in - LOG-898, LOG-326 https://gerrit.onap.org/r/#/c/75422/ via ONAP Development#WorkingwithJSONPath - DCAEGEN2-1067 | ||
k8s manifest or oom values.yaml for docker tags - truth | IN PROGRESS | 20190123 | |||||
Nexus3 routing slowdown | DONE | 20181222 | |||||
LOG compliance diagram/exercise | @Sanjay Agraharam | IN PROGRESS | 20181205 | Log Streaming Compliance and API - part of prometheus work now. Sanjay - diagram: FB must be split between using AOP and AOP+spec compliant - only this one should be green | |||
Ongoing | |||||||
CI/CD pipeline TSC poc | IN PROGRESS | 20180107 |
- Team Members
- Review last 4 weeks since LOG Meeting Minutes 2018-12-05