TSC 2018-12-20
Duration 90 minutes
Agenda Items | Presented By | Time | Notes/Links | JIRA Task |
---|---|---|---|---|
Casablanca Maintenance Release | | 5 mins | | |
Any Infrastructure Improvement/Plan | Linux Foundation | 10 mins | Any LF showstopper? #65866 - Nexus3 proxy verified 80-100x faster downloads since 20181217. #65794 - Nexus3 timing out - still getting 0.4MB/sec - not the usual 10+MB/sec. #65809 - Nexus3 slowdown 10X - docker pulls very slow in openlab. Example: at normal speed on the nexus3.onap.info:5000 proxy this pull takes 5 sec: sudo docker pull nexus3.onap.info:5000/onap/appc-cdt-image:1.4.3 while this one will take minutes: sudo docker pull nexus3.onap.org:10001/onap/appc-cdt-image:1.4.3 Nexus3.onap.org:10001 is experiencing a serious routing(?) issue, not the older 3 hour slowdown from 4-6 months ago that was fixed. Normally a prepull takes 30 min - it now takes 120+ hours for a full prepull since 20181217 - and an 80x slowdown on image downloads. Q) Why does jenkins have no issue with nexus3? A) They are on the same domain and don't go through a bell.ca exchange. ;; ANSWER SECTION: jenkins.onap.org. 60 IN CNAME cloud.onap.org. cloud.onap.org. 60 IN A 199.204.45.137 ;; ANSWER SECTION: nexus3.onap.org. 60 IN CNAME cloud.onap.org. cloud.onap.org. 47 IN A 199.204.45.137 Run a traceroute and notice the bell.ca hops: # this is from an AWS EC2 instance in us-east-2 ubuntu@ip-172-31-10-98:~$ traceroute nexus3.onap.org traceroute to nexus3.onap.org (199.204.45.137), 30 hops max, 60 byte packets 16 tcore3-ashburnbk_hundredgige0-1-0-0.net.bell.ca (64.230.125.182) 25.657 ms tcore4-ashburnbk_hundredgige0-1-0-0.net.bell.ca (64.230.125.184) 30.723 ms 30.673 ms discussions/helpdesk tickets Effect: anyone bringing up a clean ONAP system (casablanca, master, 3.0.0-ONAP) will need 35+ hours for it to come up, depending on what is deployed. For example, LOG uses dockerhub images, so it will come up fast, but AAF or any other pod that has images over 1G will each take a couple of hours. Once casablanca is pulled you are good, since those images don't change, but for master, every time you redeploy on a different day, all or some of the images will need to be pulled again. 
Temp Workaround: nexus3.onap.cloud:5000 alternate proxy (have the LF's back when on vacation), up on 20181218 - taking 2 days to saturate with casablanca images, 128G FS, ETA late Friday - as of 20h of pulling - see 21 images of https://git.onap.org/integration/tree/version-manifest/src/main/resources/docker-manifest.csv?h=casablanca 20181219:1700EDT nexus3.onap.cloud status: 16h of pulls - 26 images ubuntu@a-nexus3:~$ sudo docker images | wc -l 26 ubuntu@a-nexus3:~$ sudo docker images REPOSITORY TAG IMAGE ID CREATED SIZE nexus3.onap.cloud:5000/onap/aai-resources 1.3.4 723d184670e2 3 weeks ago 515 MB nexus3.onap.cloud:5000/onap/aai-graphadmin 1.0.1 ed643f4a192c 4 weeks ago 526 MB nexus3.onap.cloud:5000/onap/aai-traversal 1.3.3 4e5784b9e283 4 weeks ago 526 MB nexus3.onap.cloud:5000/onap/appc-cdt-image 1.4.3 10b6b253e1a9 4 weeks ago 160 MB nexus3.onap.cloud:5000/onap/appc-image 1.4.3 888018330bf5 4 weeks ago 2.88 GB nexus3.onap.cloud:5000/onap/babel 1.3.2 b4012e79495e 4 weeks ago 625 MB nexus3.onap.cloud:5000/onap/aaf/aaf_service 2.1.8 6eb295fed110 5 weeks ago 1.16 GB nexus3.onap.cloud:5000/onap/aaf/aaf_oauth 2.1.8 74dcdce76094 5 weeks ago 1.16 GB nexus3.onap.cloud:5000/onap/aaf/aaf_locate 2.1.8 2a4eaa6275ff 5 weeks ago 1.16 GB nexus3.onap.cloud:5000/onap/aaf/aaf_hello 2.1.8 495a01176053 5 weeks ago 1.16 GB nexus3.onap.cloud:5000/onap/aaf/aaf_gui 2.1.8 8caa6dc681f0 5 weeks ago 1.16 GB nexus3.onap.cloud:5000/onap/aaf/aaf_fs 2.1.8 3d663698534d 5 weeks ago 1.16 GB nexus3.onap.cloud:5000/onap/aaf/aaf_cm 2.1.8 0ba25c4ec3fb 5 weeks ago 1.16 GB nexus3.onap.cloud:5000/onap/aaf/aaf_agent 2.1.8 090b326a7f11 5 weeks ago 1.14 GB nexus3.onap.cloud:5000/onap/aaf/aaf_config 2.1.8 6506ac785cb5 5 weeks ago 1.14 GB nexus3.onap.cloud:5000/onap/aaf/aaf_cass 2.1.8 4b91e9b0b43f 5 weeks ago 323 MB nexus3.onap.cloud:5000/onap/aaf/smsquorumclient 3.0.1 f8cf701eadc3 7 weeks ago 18.2 MB nexus3.onap.cloud:5000/onap/aaf/sms 3.0.1 02363fccc6c7 7 weeks ago 35.4 MB 
nexus3.onap.cloud:5000/onap/aai/esr-server 1.2.1 00c9c28e8936 7 weeks ago 521 MB nexus3.onap.cloud:5000/onap/aai/esr-gui 1.2.1 4bd7ab7ae54a 7 weeks ago 512 MB nexus3.onap.cloud:5000/onap/aaf/testcaservice 3.0.0 fc717d0b071c 2 months ago 1.17 GB nexus3.onap.cloud:5000/onap/aaf/abrmd 3.0.0 00a91d2dc09d 2 months ago 1.15 GB nexus3.onap.cloud:5000/onap/aaf/distcenter 3.0.0 d8d9137ef2d3 2 months ago 1.09 GB nexus3.onap.cloud:5000/onap/aai-cacher 1.0.0 8ec3df246a35 2 months ago 466 MB slightly-not-useful-random graphs for reference. My private AWS proxy after 46h of pulls has 60 images of 180 - predicting 140h or 5.8 days ubuntu@ip-172-31-10-98:~$ sudo docker images | wc -l 52 # at nexus3.onap.info:5000/onap/externalapi/nbi 3.0.1 1ebc02237c1c 2 months ago 122 MB Similar incoming traffic profile from nexus3.onap.org. nexus4.onap.cloud:5000 alternate proxy for master - larger FS 200G - access instructions, cert, installation on Cloud Native Deployment#NexusProxy. The windriver lab also has a network issue (for example, if I pull from nexus3.onap.cloud:5000 (azure) into an aws EC2 instance - 45 sec for 1.1G - if I pull the same in an openlab VM - on the order of 10+ min) - therefore you need a local nexus3 proxy if you are inside the openstack lab - I have registered nexus3windriver.onap.cloud:5000 to a nexus3 proxy in my logging tenant - cert above. I take this tooooo seriously | TSC-79 |
Task Force Update | | 60 mins | E2E Process Automation. TODO: review the vFW automation in https://github.com/garyiwu/onap-lab-ci - thanks Yang Xu | |
Task Force Update | | 15 mins | Pair Wise Activities Update | |
TSC Activities and Deadlines | | | TSC Vice-Chair Election - Congratulations Lingli! | |
Incoming ONAP Events | | 5 mins | Jan 8-11 - Dublin Release F2F Developer Design Forum (France): https://wiki.lfnetworking.org/pages/viewpage.action?pageId=8257579 Feel free to request your visa: http://events.linuxfoundation.org/visa-request Submit your proposal: https://wiki.lfnetworking.org/display/LN/OPNFV-ONAP+January+2019+Session+Proposals Reminder - No TSC Call on December 27th, 2018 | |
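
The pull-time prediction in the infrastructure notes (60 of 180 images after 46h, predicting about 140h) appears to be a straight linear extrapolation. A minimal Python sketch of that arithmetic follows. The function name `estimate_total_hours` is hypothetical, and it assumes every image takes the same average time to pull, which the mixed image sizes in the listing make only a rough approximation.

```python
def estimate_total_hours(elapsed_hours: float, images_done: int, images_total: int) -> float:
    """Linear extrapolation: assumes each image takes the same average time (an assumption)."""
    if images_done <= 0:
        raise ValueError("need at least one completed image to extrapolate")
    return elapsed_hours * images_total / images_done


# Figures from the notes: 60 of 180 images after 46 hours of pulling.
total_hours = estimate_total_hours(46, 60, 180)
print(f"{total_hours:.0f} h total, about {total_hours / 24:.2f} days")
```

With the figures from the notes this gives 138 h, or 5.75 days, in line with the "140h or 5.8 days" quoted in the table.
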
Zoom Chat Log
06:02:28 From Milind Jalwadi : #info Milind Jalwadi, TechMahindra Ltd.
06:02:34 From Jason Hunt : #info Jason Hunt, IBM
06:04:19 From Srini Addepalli (Intel) : #info Srinivasa Addepalli, Intel
06:04:40 From Marc Fiedler (DT) : #info Marc Fiedler, DT proxy of Andreas
06:05:00 From Jason Hunt : Sorry. :)
06:09:36 From Murat Turpcu ( Turk Telekom) : #info, Murat Turpcu Turk Telekom
06:11:38 From Dan Timoney : I think it’s a fair question why a routing change was made while LF was out for holidays.
06:12:24 From Catherine Lefevre : #65866 - Nexus3 proxy verified 80-100x faster downloads since 20181217 #65794 - Nexus3 timing out - still getting 0.4MB/sec - not the usual 10+MB/sec #65809 - Nexus3 slowdown 10X - docker pulls very slow in openlab
06:12:38 From Michael O'Brien(Amdocs,LOG,OSX) : https://jira.onap.org/browse/TSC-79
06:16:03 From Eric Debeau : #info presentation E2E Automation: https://lf-onap.atlassian.net/wiki/download/attachments/16231787/ONAP-E2E-Automation-v3.pptx?api=v2
06:27:18 From Michael O'Brien(Amdocs,LOG,WIN) : The vFW is extremely difficult to demo to a client - with a lot of workarounds - I created that diagram back in Dec to help understand what was going on - it needs to be updated from Beijing, even with all I know about onap - I am still having issues running the simplest end to end use case
06:30:04 From Yang Xu : Integration team use Robot to automate vFW e2e in CI process for Casablanca
06:31:19 From Yang Xu : See http://onapci.org/jenkins/, it had been used by release manager to report progress daily during Casablanca
06:31:20 From Michael O'Brien(Amdocs,LOG,WIN) : I'll go through the deployment scripts again in the integration repo - I did notice the new vFW checks in onapci.org
06:32:56 From Yang Xu : The script is here https://github.com/garyiwu/onap-lab-ci
06:33:03 From Michael O'Brien(Amdocs,LOG,WIN) : the goal of the vetted page was to describe, in "exact" detail, every single step/fix/workaround/automation required to get from a set of 1+13 ubuntu vms - to 3 vFW deployed VMs - with some mitigation for any error in any interim step
06:36:29 From Michael O'Brien(Amdocs,LOG,WIN) : thanks yang never heard of that repo - will review - I would expect everything required to get ONAP and the vFW demo working would be in git.onap.org
06:39:48 From Yang Xu : there was a reason (no auto VPN access to LF network is allowed) that Gary put the code in GitHub. Everyone can access the github repo, and we will evaluate to see if we can move it within onap repo
06:40:24 From Michael O'Brien(Amdocs,LOG,WIN) : good reason - vpn issues on my end as well - thanks
06:46:01 From NingSo : #info Ning So, Reliance Jio
06:51:39 From Michael O'Brien(Amdocs,LOG,WIN) : these are the components involved in the vFW, vCPE actors are a bit different
06:51:39 From Michael O'Brien(Amdocs,LOG,WIN) : https://wiki.onap.org/download/attachments/3245081/20170727_closed_loop_flow_Screenshot%202017-07-27%2011.57.40.png?version=1&modificationDate=1501171169000&api=v2
06:53:32 From Michael O'Brien(Amdocs,LOG,WIN) : I definitely would like a completely automated one-click vFW robot script if it works - this was our goal since amsterdam
06:54:22 From Michael O'Brien(Amdocs,LOG,WIN) : we can always split the script into sections so manual workflow can be added
07:00:11 From Arash Hekmat (Amdocs) : To make ONAP use cases "generic" is to ask these two questions: On the Northbound, how can the use case be effectively managed (created, configured, monitored, deleted) by an external BSS system via the External API without having to know anything about the internal components of ONAP? On the Southbound, how can the use case work with Multiple Vendors network functions (and cloud infrastructures) without any modification to ONAP components?
07:01:15 From Michael O'Brien(Amdocs,LOG,OSX) : the wiki is a bit outdated - but to answer manual vFW (with a couple robot actions in the middle) - we have these
07:01:15 From Michael O'Brien(Amdocs,LOG,OSX) : https://lf-onap.atlassian.net/wiki/display/DW/Running+the+ONAP+Demos
07:02:46 From Catherine Lefevre : Next steps: Deep dive with the PTLs
07:07:17 From Michael O'Brien(Amdocs,LOG,WIN) : Thanks Catherine - at the end we need to run ONAP in front of our team or the customer - we are the last word and must answer for all of onap's issues in the 4+ hour intense hands-on in front of the team - sometimes traumatizing
07:12:14 From Steven Wright : use cases should be at the level where they only exercise external interfaces of the ONAP platform
07:12:48 From Catherine Lefevre : #Pair-Wise Proposal
07:14:02 From Michael O'Brien(Amdocs,LOG,OSX) : I like the flow of the french language - it sounds nice
07:14:44 From Srini Addepalli (Intel) : Yes Steve. +1 on that. Robot scripts are hiding the actual complexity. We need to run vFW or vDNS with robot scripts and understand the number of steps one needs to do. I feel this exercise is important to understand issues.
07:14:57 From Srini Addepalli (Intel) : s/with/without
07:17:37 From Catherine Lefevre : +1 on Srini's comments and my understanding is that this is what Eric and the task force have tried to identify
07:21:17 From Catherine Lefevre : wiki page shared by Eric - https://lf-onap.atlassian.net/wiki/display/DW/Dublin+Pair-Wise+Testing
07:22:31 From Michael O'Brien(Amdocs,LOG,OSX) : CD done on OOM via helm charts for 1 or more components before merge by gerrit magic word - in progress with 3 LF personnel - we are at the stage of paying for a target VM to host a minimal k8s/oom cluster right now - usually meets friday at 10
07:22:32 From Michael O'Brien(Amdocs,LOG,OSX) : https://jira.onap.org/browse/TSC-25
07:23:00 From Michael O'Brien(Amdocs,LOG,OSX) : above can run automated CSIT
07:24:10 From Michael O'Brien(Amdocs,LOG,WIN) : we have onapci.org and TSC-25 - two CD pocs
07:28:00 From Michael O'Brien(Amdocs,LOG,WIN) : we have helm-verify now - we will have helm-deploy magic word in TSC-25
07:30:13 From Michael O'Brien(Amdocs,LOG,OSX) : Logging and OOM team will work very closely with integration team (Yang and Gary) to work towards full CD
07:31:13 From Michael O'Brien(Amdocs,LOG,OSX) : one issue is that inside openlab there is a network bottleneck not seen outside the lab - the fix is to preload all your docker images on all hosts - just re-iterating for anyone bringing up a test cluster there
07:35:28 From Eric Debeau : Thank you and Happy Christmas
07:35:47 From Marc Fiedler : Merry Christmas
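
The advice in the chat to preload all docker images on every host could be scripted from the version manifest linked in the infrastructure notes. A rough Python sketch follows, assuming the manifest is a CSV with a header row naming `image` and `tag` columns (an assumption, not verified here). The helper `pull_commands` is hypothetical and only builds the command lines; it does not run them.

```python
import csv
import io
from typing import List


def pull_commands(manifest_text: str, registry: str) -> List[str]:
    """Build one `docker pull` command line per manifest row (nothing is executed)."""
    # Assumes a header row with `image` and `tag` columns (an assumption about the manifest).
    reader = csv.DictReader(io.StringIO(manifest_text))
    return [
        f"sudo docker pull {registry}/{row['image']}:{row['tag']}"
        for row in reader
    ]


# Two sample rows; the image names and tags come from the listing in the notes.
sample = "image,tag\nonap/appc-cdt-image,1.4.3\nonap/aaf/aaf_service,2.1.8\n"
for command in pull_commands(sample, "nexus3.onap.cloud:5000"):
    print(command)
```

The generated lines could be reviewed and then run on each host. Pointing `registry` at a local proxy rather than nexus3.onap.org follows the workaround described in the notes.
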