#action all: clean up the repos (integration/testsuite/Demo) and remove from master what is not used for Istanbul/Honolulu
versioning
xtesting images can now be created - 1.7.0 was created for guilin but is a bad version; 7.0.0 shall be created instead, as well as 8.0.0 for honolulu (shall be possible now..)
in progress....
onapsdk => new version for guilin (7.6.0) and for honolulu (8.0.0)
agreed => 2 versions needed #action Michal create the 2 tags (7.6.0 and 8.0.0)
Thanks to Lasse Kaihlavirta for the great work done over more than 2 years for the Integration project.
Action point follow-up
Lukasz Rajewski prepare a demo on the test automation managed by Lukasz and explain why it could not be integrated in CI for the moment (reflection for Istanbul)
Morgan Richomme sent a mail to PTLs on good practice. Reminder that "no patch in gate older than x months" is a criterion for milestones.. it is clearly not respected.
CCSDK previous showcase, still requires distribution of docker builds, not possible to rebuild locally
DMaaP (dmaap/buscontroller moved), OPTF (optf/odf and optf/has already moved), SDNC (almost finished but still some failures) transition in progress
AAI multitenancy done outside ONAP, planned to be incorporated into aai/resources and aai/traversal => use of Scala + Gatling (examples based on Robot so far)
Policy: CSIT moved to their repositories but tests do not follow the recommendations (test after merge, not project specific, JJB fully specific)
Still some active projects with old design (DCAEGEN2, Modeling, Multicloud, OOM-platform-cert-service, SO)
Usecase #action Morgan Richomme dig a little bit to understand if it corresponds to active use cases
VFC and VID do not seem really maintained
VNFSDK: only images for frankfurt => not maintained
CSIT tests disabled (aaf, appc, clamp, music)
clamp moved to SDC => what does it mean from a CSIT perspective
use case partially automated but would require a KUD (the test cannot use a namespace of the existing k8s cluster hosting ONAP)
As the use case is maintained and regularly includes feature changes, it would be interesting to include it in CI, to show that it is red/green as part of the use cases (more than smoke tests)
Open question: where shall the status of the xNF be available for an ONAP admin? The status of the service/VNF/module was available in VID and some test status was available in SDC, but it is not clear where the info on the xNF status would be the most accurate #action Morgan Richomme send a mail to the community to get feedback on this xNF status page
AoB
Action point follow-up
AP1: Morgan Richomme review Integration simu and release versions if possible + update doc accordingly
not done yet, waiting for the release of the xtesting dockers
AP2: all move or close the JIRA tickets in honolulu
done
AP3: report your feedback on honolulu lesson learned
congratulations Bartosz Gardziejewski elected as integration committer
Morgan Richomme send a reminder for Alexander: currently 4/4 but more than 4 votes needed
all the INFO.yaml actions will be performed once
PTL no candidate => escalation done to TSC
xtesting docker release
new way, need to tweak the scripts but could be OK. Need to fix the guilin CI chain first (issue due to a new pbr value)
baseline images: wait for seccom reco for istanbul
Morgan Richomme contact Seccom to prepare baseline images. Former user (Deleted) indicates that we need to discuss offline, especially for the python image.
Honolulu
regression on multicloud-k8s / pnf-macro
issue due to the helm format; no error with the helm charts used for Lukasz's use case. Fix done, docker images must be regenerated
action Lukasz Rajewski prepare a demo on the test automation managed by Lukasz and explain why it could not be integrated in CI for the moment (reflection for Istanbul)
Former user (Deleted) indicates that another resource from Samsung is working on the topic to improve the images. No feedback from the PTL. Morgan Richomme indicates that he reported the issue to PTLs during the PTL meeting. Not normal to have no feedback (e.g. CLI improvement patch in gate for a long time); David McBride sent a mail to Former user (Deleted) #action morgan send a mail to PTLs on good practice. Reminder that "no patch in gate older than x months" is a criterion for milestones.. it is clearly not respected.
stability tests
update on MariaDB tuning => reduce the replicas of non-core components / removal of appc / ... see stability_follow-up 2-2.pdf
resizing is enough to avoid DB crashes, but recommendation to TSC to reduce replicas for non-core components: 15 instances of MariaDB => 15x30 GB (450 GB) of additional RAM needed to avoid crashes under load conditions... 9 Cassandra nodes...
all integration committers are using the default DB values on their labs, no specific tuning; only the number of components may be reduced (because not used)
CSIT status after Honolulu (we may have to reserve some time for this so not necessarily today): Integration
introduction to the status of CSIT at the end of Honolulu
Morgan Richomme create a resiliency chapter in the doc and initiate Jiras for confirmed issues (appearing at least 2 times); restart of a controller still to be done to complete the test.
done see next section
Krzysztof Kuzmicki create a wiki page in Istanbul page for robot refactoring follow-up
Krzysztof Kuzmicki create a wiki page in Istanbul page for DCAE migration
Admin
INFO.yaml update needed (NS repos have no yaml + status of the project)
Morgan Richomme retry and wait for the 5 minutes (Eviction timeout) then investigate on the networking issue
done, Jira completed
Morgan Richomme check ansible image availability for chained-ci
done. Andreas Geißler gave it a new try, still an issue but not related to the image. Open question: shall we also reference the image in the CI page of the release?
Admin
INFO.yaml update needed (NS repos have no yaml + status of the project)
pnf-macro test to be done on master before integration in CI (working fine in guilin)
basic_vm_macro good results on guilin but not so good on master/honolulu => JIRA created
tern: tests run in CI but results not pushed yet to the LF backend => WIP to manage properly the push to the LF backend. On the last honolulu and master-weekly, artifacts were produced and pushed manually #action Former user (Deleted) finalize the push to the LF backend
stability/resiliency tests
Resiliency test
worker node restart (thanks Bartek Grzybowski for the full analysis) => how do we follow, shall we create Jiras?
test done, eviction looks good. Some issues detected on some pods, but not always trivial to reproduce; they may depend on the associated evicted pods. Sometimes Init error, sometimes pod Running but exception.. sometimes the first eviction looks good but the second seems to fail... different cases referenced in Honolulu Resiliency and Backup and Restore test
for the SDC issue, the restart may be due to the fact that SDC is trying to recreate tables that already exist in Cassandra
#action Morgan Richomme create a resiliency chapter in the doc and initiate Jiras for confirmed issues (appearing at least 2 times); restart of a controller still to be done to complete the test.
stability tests
SDC tests: 72h, 5 parallel onboardings. INT-1912 - SDC Stability test: success rate drops after ~500 onboardings (Open)
ChrisC reports that it was useful to detect new issues, especially the continuously increasing duration that may be linked to the DB and/or a middleware layer (explicit exception seen)
plan to integrate such test in weekly as part of benchmark test
instantiation tests
48h, 5 parallel basic_vm (initial issue on SDNC fixed by restart, then regular timeouts, then internal OpenStack issues). INT-1918 - Stability tests: // instantiation (Open)
more timeouts than in guilin, needs to be confirmed with a single test (no parallelization); mariadb-galera issue observed on some replicas
72h, 1 test basic_vm => problem with mariadb-galera, results will not be as good as in the guilin run
issue with mariadb-galera after 24h (but the tests ran after the 48h tests..); once finished (Thursday morning) => reinstallation of the honolulu weekly to reproduce the tests with the latest dockers
mariadb-galera workaround reported by Bell to OOM; seems to improve the overall quality of the gates but triggers a new issue on CDS. Trade-off between redundancy and efficiency not clear
Lasse Kaihlavirta also disabled useless SO CSIT tests (with very old images) - the Jira was closed without action by Seshu Kumar Mudiganti, so we removed some jobs as they are meaningless (tests with casablanca images)
Lasse Kaihlavirta added a browser cleanup to the robot healthcheck (vid) to save resources; it will be integrated in 1.8.0
Action point follow-up
Illia Halych review the official simu page (pythonsdk wrapper)
done, no example for Honolulu, to be completed in Istanbul
Morgan Richomme refine the information about the conditions of the resiliency tests of worker restart -
Krzysztof Kuzmicki INT-1907 - browser_setup.robot does not provide proper teardown (Open): no teardown for browser-based checks - high consumption of resources at the end
used by VID test, workaround to be confirmed
Maybe we need more information about DMaaP simulator use and to write it better in the simulator doc → Reported by Lasse Kaihlavirta
Globally we are not well structured on the simulators; in the release note of version X we should list all the versions of the simulators used for the validation. It shall be done by the use case teams that developed a simu and/or the integration team. The difficulty is to get the full view on the simulators: some are hidden in repositories, some are unmaintained,... #action Morgan Richomme review Integration simu and release versions if possible + update doc accordingly
Need of consolidation of versioning of images of simulators used in CSIT → Reported by Lasse Kaihlavirta
sure, it is a bit messy... and impossible to maintain; integration can clean up from time to time.. but the best way for functional tests => bring functional tests including their simu into their own repo. And if a simu can be used more widely, create a dedicated repo under integration/simulators..
Admin
INFO.yaml update needed (NS repos have no yaml + status of the project)
strange, an INFO.yaml is needed to create the repo => to be cross-checked. Morgan Richomme indicates to Thomas Kulik that it will be done with the next INFO update (needed as at least 2 committers announced they will stop after Honolulu)
thanks to Pawel and Marcin for their contributions and all the best for the next challenges
pnf_macro => OK on daily guilin, what do we do regarding CI integration?
kudos to Michał Jagiełło - first pythonsdk tests with a simulator - let's wait for the merge of Dan's fix for SDNC-1515; if OK #action Morgan Richomme add pnf-macro to CI
still SO bugs preventing Lukasz Rajewski from completing his test.. deja-vu?
#action Lukasz Rajewski properly tag SO bugs to reference them as blocking in the Integration blocking table
I started the first resiliency tests: TEST-308 - [Resiliency] Evaluate ONAP behavior on k8s worker node restart (In Progress); replay done, mail sent to the community (network issue?)
behavior differs on the Nokia (pod stuck forever in Terminating state when stopping a worker) and Samsung RKE2 clusters; continue the discussion on the ticket. Bartek Grzybowski indicates that we shall be careful especially with statefulsets, it could explain some issues
#action morgan retry and wait for the 5 minutes (Eviction timeout) then investigate on the networking issue
stability tests
SDC tests: 72h 5 parallel on boarding started on the 20th of April..wait and see...
First tests were all OK but the duration continuously increases. Now the error rate is high. Wait for the end of the tests to get the graphs.
Andreas Geißler mentioned that his CI chains were affected by the non-availability of an ansible image used in the runner => could be a problem; we need to release such images somewhere to avoid such issues #action Morgan Richomme check ansible image availability for chained-ci (2.7.13)
issue also on Orange chains last week (the build chain of the ansible image was broken, chain fixed)
ChrisC raises a question on python2.7 in the jenkins CI and in robot. Partly addressed by the robot refactoring initiative. Note: the xtesting dockers are also still using python2.7 due to a dependency on python-utils that has not evolved for a long time.
Morgan Richomme review Integration simu and release versions if possible + update doc accordingly
Lukasz Rajewski properly tag SO bugs to reference them as blocking in the Integration blocking table
if SDNC-1515 merged and pnf-simu OK in honolulu: Morgan Richomme add pnf-macro to CI
Morgan Richomme retry and wait for the 5 minutes (Eviction timeout) then investigate on the networking issue
Morgan Richomme check ansible image availability for chained-ci
Action point follow-up
Michał Jagiełło pnf_macro add processing for pnf_macro to wait for a clean startup of the simulator (currently time based, and it seems that it takes more time than expected)
still sometimes issues at SDNC startup; Dan Timoney already added a check mechanism but it seems there are still issues in mariadb-galera; the workaround (restart of SDNC) seems to be OK
basic_vm_macro does not work due to SDNC ()
basic_vm and basic_cnf fail randomly - mostly due to SDC problem or timeout in SO - this looks less stable than in guilin
observed problem with config-assign/config-deploy in vFW-CNF integration test - we need to cover config-assign/config-deploy in our smoke tests
to sum up: we have two major problems with service-macro-create
Remaining work on tests
basic_clamp: OK
basic_vm_macro: KO, shows a regression in SDNC ()
pnf_macro => Michał Jagiełło is working on the latest comments and needs further tests
refactoring of 5gbulkpm => OK (additional patch required, submitted by Krzysztof Kuzmicki and cherry-picked)
tern: test declared in CI => we went further on the weekly-honolulu (https://gitlab.com/Orange-OpenSource/lfn/onap/xtesting-onap/-/jobs/1165577708); ansible-playbook was missing on the runner... fix merged (we need to use an image with ansible already installed)... to be continued, but at least it is properly triggered
stability tests
I started the first resiliency tests: , replay today - we need to add a remark that this is only for stateless services - what about DBs?
stability: I plan to give Natacha's suite another try as soon as the other tests are OK
consolidation of versions
better exception catching
integration of the GUI part to generate an html reporting (for the moment I usually replay manually the tests, get the results and apply the GUI processing manually)
removal of the submodule: test in progress (seems there was an import in the simu step); once done we could review our docker build (to remove the developer mode) but we shall not forget to adapt xtesting-onap accordingly (/Src/onaptests/src/onaptests => /usr/lib/python3.8/sites-available/onaptests) => action morgan for the weekend.. to avoid breaking everything before my PTO..
Krzysztof Kuzmicki no teardown for browser-based checks - high consumption of resources at the end
Maybe we need more information about DMaaP simulator use and to write it better in the simulator doc → Reported by Lasse Kaihlavirta
Need of consolidation of versioning of images of simulators used in CSIT → Reported by Lasse Kaihlavirta
Action point follow-up
Michał Jagiełło pnf_macro add processing for pnf_macro to wait for a clean startup of the simulator (currently time based, and it seems that it takes more time than expected)
Not done
Morgan Richomme cleanup basic_clamp and re-run it on a daily master
basic_vm_macro (replaces vfw_macro): in progress, still some issues on the daily honolulu
pnf_macro => add a mechanism to have better control of the pnf simu launch (timer is not enough) => #action Michal
refactoring of 5gbulkpm => #action Krzysztof => test done this afternoon => merged
tern: test declared in CI, but issue => #action morgan & Alexander
stability tests
I started the first resiliency tests: TEST-308 - [Resiliency] Evaluate ONAP behavior on k8s worker node restart
Other labs are experiencing different issues
Nokia => pods remain stuck
#action Daniel, Andreas create a JIRA ticket for each resiliency test
#action morgan create a wiki page to consolidate resiliency tests
stability: I plan to give Natacha's suite another try as soon as the other tests are OK
to be tried asap: if enough resources => weekly Honolulu to be launched on Wednesday
consolidation of versions: I was waiting for Pawel's comeback but as far as I understood, you will move to other challenges soon.. :) so I need to review that
better exception catching
integration of the GUI part to generate an html reporting (for the moment I usually replay manually the tests, get the results and apply the GUI processing manually)
removal of the submodule: test in progress (seems there was an import in the simu step); once done we could review our docker build (to remove the developer mode) but we shall not forget to adapt xtesting-onap accordingly (/Src/onaptests/src/onaptests => /usr/lib/python3.8/sites-available/onaptests)
#action morgan review the patch dealing with submodule removal + cherry-pick + adapt xtesting-onap
CDS / CSIT => will not have the resource to do it for Honolulu => move to Istanbul
GUI tests: wait for completion of the wiki page and see if enough bandwidth => move to Istanbul
Discussion on the support for AAI gatling tests. By default the multitenancy support is disabled and they require a keycloak third party
first issue is the test env => we could imagine having a daily scenario daily_master_multitenancy but we need to take care of the resources. As long as we cannot rely on the windriver lab, it will be hard to create new scenarios...
installation of keycloak shall be done in a tooling namespace through a helm chart => OOM
Mohammad Hosnidokht indicated that gatling part is almost automated (docker verifying the roles)
William Reehil indicated that it will probably break the other tests
it is probably more relevant to start by creating a CSIT test (functional test) with AAI + keycloak rather than a daily..
#action Morgan Richomme complete the wiki Istanbul page to track this scenario
Michał Jagiełło pnf_macro add processing for pnf_macro to wait for a clean startup of the simulator (currently time based, and it seems that it takes more time than expected)
xtesting dockers are rebuilt daily based on the branches (guilin/master heads of the branches of all the components embedded in the docker); CI or end users cannot guess the format imposed by the LF rules used for components (e.g. 6.0-STAGING-20210315T185806Z)
Honolulu
status: INT-1883 - Reccurent various errors in smoke tests (In Progress)
better results on daily master
last patch for the release was the SDNC patch, currently in gate - shall be merged before the TSC
windriver: feedback from Intel => recontact them after Honolulu for Hardware upgrade
Automated tests
pnf_macro
issue with the start of the simulator: we wait 30s but it takes more time to start on master; a mechanism to check that everything is started shall be implemented #action Michał Jagiełło add a mechanism in the simu step to verify that everything is well started before sending a VES event.
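Such a check mechanism boils down to polling instead of sleeping a fixed 30s. A minimal sketch (`probe` stands for whatever readiness check the simu step can expose, e.g. a health-endpoint query - the names are illustrative, not the actual onaptests API):

```python
import time


def wait_until_ready(probe, timeout=120.0, interval=5.0):
    """Poll `probe` (a zero-argument callable returning bool) until it
    succeeds or `timeout` seconds elapse, instead of a blind sleep."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if probe():
            return True
        time.sleep(interval)
    return False
```

The simu step would then call something like `wait_until_ready(simulator_healthy)` before sending the first VES event, and fail fast with a clear message on timeout.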
basic_clamp
some regressions had been introduced in master. Cleanup in progress, tests to be done after the meeting #action morgan re-run a clean basic_clamp on daily
basic_vm_macro
test almost ready; a config-deploy file was requested to complete the deployment. Michał Jagiełło will include an empty file (nothing to do..) and re-test #action Michał Jagiełło complete basic_vm_macro tests
Michał Jagiełło pnf_macro add processing for pnf_macro to wait for a clean startup of the simulator (currently time based, and it seems that it takes more time than expected)
Morgan Richomme cleanup basic_clamp and re-run it on a daily master
Krzysztof Kuzmicki complete the official simulator page with NS illustration
Bartek Grzybowski review the official simu page (list of usable simu)
Illia Halych review the official simu page (pythonsdk wrapper)
Action point follow-up
AP1: Morgan Richomme amend pythonsdk_tests to include the workaround
new patch submitted yesterday (the first one was not efficient as no exception was raised by the SDK at this stage, just a 404 but no ResourceNotFound exception)
first attempt done last Friday but failed, new attempts to be done, see next section
AP5: Morgan Richomme contact use case owner to collect the possible needs and remind the need to update the doc
done see next section
AP6: Morgan Richomme & Krzysztof Kuzmicki create Epic and tasks to detail expectations on robot pod refactoring (alpine, split web/robot, python3, execution from outside the cluster, use python baseline image,....)
E2E Network Slicing use case requirements for Honolulu release: "We do the testing in CMCC lab, Winlab (Rutger’s University), and at Windriver (OOF tenant) within our use case team. So there is no other requirement from Integration team for resources (human as well as lab infra). Automation – we expect to make some progress in Istanbul release."
CCVPN - Transport Slicing use case requirements for Honolulu release Henry/Lin (some tests automated, dedicated repo created) "We are using our local lab in Ottawa to test the CCVPN use-case. And in parallel, the integration test of Network Slicing and Transport Slicing is done together with the Network Slicing use case team (i.e., using CMCC lab and Win lab)."
doc updated
Time to think about renewing committers (some are no longer very active) => suggest to promote Illia Halych
#action morgan initiate promotion procedure
Honolulu
Troubleshooting campaign status: INT-1883 - Reccurent various errors in smoke tests (In Progress)
good progress on Master, still lots of patches in gate
on daily Master still regular timeouts on basic_vm|cnf|network
these E2E basic tests leverage the a-la-carte BPMN; it seems that most Service Providers favor the macro BPMN
pnf_macro under integration in CI
it would make sense to keep 1 test with the a-la-carte workflow (basic_network) and move basic_vm and basic_cnf to macro; Michał Jagiełło and Morgan Richomme started working on this task
Daily Guilin very stable - not so many timeouts for the basic_ tests... so for sure the change of DB had an impact.. we are not very good at evaluating DB performance... important to address before the B&R tests
From a dashboard perspective => test in healthcheck = 1 project / smoke = test involving several projects, BUT tests still executed as part of smoke tests in CI (to avoid a race condition with the DCAE healthcheck)
tern integration in CI failed in last weekly (gitlab-ci under review)
use of a bad env var for the weekly; rules change merged, to be retested
Istanbul
Move Robot pod out of ONAP cluster (including robot)
certificate and robot pod to be executed in onap-testing cluster
deals only with xtesting (not the robot pod)
the idea is to keep the onap namespace cleaned
to be tested; the only open question: shall the CM be recreated in the new onap-testing namespace? The URL seems OK in the CM to allow cross-namespace exchanges
PythonSDK
which tests? what is poorly covered?
vFWCL? scaleout?
NBI
basic loop (basic_clamp extended)
ETSI bpmn?
use cases?
Robot
Bell/apex (if not done in Honolulu)
Add a retry capability in CI? other rules (if onap-helm or onap-k8s Fail "too much" => do not execute E2E tests to save CI time?)
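Such a retry capability could be a thin wrapper around a test entry point; a minimal sketch, with illustrative names only (this is not existing xtesting-onap code):

```python
import time


def retry(step, attempts=3, delay=10.0):
    """Re-run a flaky CI step up to `attempts` times before giving up;
    the last failure is re-raised so the job still goes red on a real bug."""
    for i in range(attempts):
        try:
            return step()
        except Exception as exc:
            if i == attempts - 1:
                raise
            print(f"attempt {i + 1} failed ({exc}); retrying in {delay}s")
            time.sleep(delay)
```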
Michał Jagiełło & Krzysztof Kuzmicki: Service Sanity check. Healthchecks are not enough; they may PASS when the system is not working at all. Problem already reported (lots of healthcheck tests do not provide more information than onap-k8s checking that the pods are up & running). We have several examples where the pods are up & running but the component does not work. Exceptions shall not be caught to hide the issues. We would need to formalize this a little to discuss with the PTLs.
Andreas Geißler we are missing a simple test checking the UI endpoint
#action andreas & morgan send a mail to PTLs to collect the admin and user UIs and create a simple test checking that the UIs are available
Former user (Deleted) add clair scan, even if it shall be done at LF, usually showing it in weekly is the fastest way for adoption
CI dashboard: improved dashboard to identify regressions over time (see xls graphs) - multi-platform?
congratulations to the team for the S3P.. now it will be cool to fix the issues in master/gating
windriver re-installation: to be planned after honolulu => answer sent to Intel team
Azure Staging lab reinstalled
#action Morgan Richomme send a mail to the community to indicate that the staging Azure lab is back (guilin maintenance release as master is broken)
Honolulu
Troubleshooting campaign status: INT-1883 - Reccurent various errors in smoke tests (In Progress)
several issues found
SDC races => including an offset to move to pseudo-sequential seems to reduce the number of SDC issues. Still one where certification seems to take too much time => possibility to add a workaround in the test to retry and/or delay the creation/certification
#action Morgan Richomme amend pythonsdk_tests to include the workaround
VID 500 due to the presence of a "U" char in a random password (which explains why we did not always have the error) - the password is deobfuscated by VID even if not obfuscated, using a jetty library that interprets U as a unicode escape char, breaking everything. Problem found and fixed by Krzysztof Opasiak => last gating seems reasonably better
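The failure class can be illustrated in a few lines of Python (the actual bug sat in VID's jetty stack, not in Python, and `safe_password` is a hypothetical mitigation, not the fix that was merged): a backslash-`U` sequence breaks any layer that naively applies unicode unescaping.

```python
import codecs
import secrets
import string

# A password containing a backslash followed by "U" blows up when a layer
# blindly unicode-unescapes it -- the same class of bug hit through jetty:
try:
    codecs.decode("s3cr3t\\Upwd", "unicode_escape")
except UnicodeDecodeError as exc:
    print("decode failed:", exc)


def safe_password(length=16):
    """Draw the password from an alphabet without escape-prone characters
    (no backslash), sidestepping the issue entirely."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))
```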
the SO error (so-bpmn stuck) was already observed from time to time in guilin, so it is not a new problem... it explains why an instantiation sometimes takes 3 minutes, sometimes 20; a JIRA has been created on this aspect
rebase/remerge in gate in progress...gates will burn over the next days...but it means that master chains should be back and usable
test done on Bell labs; adaptation needed to be integrated in CI (config in a dedicated file, not leveraging CM information)..
#Action Santosh bayas identify inputs to see if parameters are missing + publish robot code in testsuite for review
CDS regression test (session with JC planned on thursday to give a try on Orange lab and see how we can integrate in CI - helm chart finally with docker file in mock repositories)
#action Morgan Richomme contact use case owners to collect the possible needs and remind them of the need to update the doc
Istanbul => for information, to be discussed next week
Move Robot pod out of ONAP cluster (including robot)
certificate and robot pod to be executed in onap-testing cluster
CI dashboard: improved dashboard to identify regressions over time (see xls graphs) - multi-platform?
PythonSDK
which tests? what is poorly covered?
vFWCL? scaleout?
NBI
basic loop (basic_clamp extended)
Add a retry capability in CI? other rules (if onap-helm or onap-k8s Fail "too much" => do not execute E2E tests to save CI time?)
Robot refactoring: target (slot planned after the meeting if we are short in time)
#action Morgan Richomme & Krzysztof Kuzmicki create Epic and tasks to detail expectations on robot pod refactoring (alpine, split web/robot, python3, execution from outside the cluster, use python baseline image,....)
AoB
AP1: Morgan Richomme amend pythonsdk_tests to include the workaround
AP2: Santosh bayas identify inputs to see if parameters are missing + publish robot code in testsuite for review
AP3: Morgan Richomme & Krzysztof Kuzmicki check why we have a path difference between execution in the robot pod and in the xtesting docker
AP5: Morgan Richomme contact use case owner to collect the possible needs and remind the need to update the doc
AP6: Morgan Richomme & Krzysztof Kuzmicki create Epic and tasks to detail expectations on robot pod refactoring (alpine, split web/robot, python3, execution from outside the cluster, use python baseline image,....)
Action point follow-up
AP1: Andreas Geißler check if the doc of the official use cases explicitly references the robot pod
not done yet
AP2: Morgan Richomme include Nokia pod in onap-integration web site
however it is exactly the same code (xtesting-smoke-usecases:master) and the same launch (xtesting-onap:master)
the only difference is due to master changes
review of the OOM master patches from the 17/2 (all gates look pretty good). Daily master of the 18th and 19th were also good (note: the daily from the 1st of March was not that bad either..)
Merge Date | Date of the last patchset (or rebase) used for gating | ID | Patch | Status (at least 2 of the 3 basic_ E2E tests OK) | Links | Comments
17/02/21 00:12:00 | 12/02/21 08:48:00 | 117758 | [AAF] Give `identities.dat` to working deployments | | |
something seems broken since the 19/2...only 3 patches have been merged around this date: 117743, 117936, 117753
the subsequent OKs correspond to patches whose tests were executed earlier but merged after the 20/2; revert in progress to see if one of the 3 patches could be the root cause of the regression on the gating chain
sequential versus parallel
tests planned to move to E2E sequential to avoid side effects due to parallelization. Parallel tests could be done on the weekly... need to fix the master and gating chains first.
almost there: last issue due to DCAE TCA; seems there is a regression on DCAE due to python2.7 / python3. Wait for the fix before retrying.
wrapper
OK, question on where to host the helm charts for the simulator (pnf and future one)
#action contact the OOM team => similar question as for the CDS mock objects
it is possible to remove the submodule - just be careful: once removed => possible to deploy onapsdk as a "standard module", no longer as a developer module => impact on xtesting-onap as the paths will be different
consensus to work on the resiliency of the core component
#action Andreas Geißler a warning shall be added in the documentation, as the aarna network procedure is not really usable for a real backup/restore (more some examples of using velero)
Companies that tried velero had issues due to the fact that ONAP is not really cloud native
AoB
cmpv2 about to be moved from healthcheck to smoke
Krzysztof Kuzmicki indicates that 5Gbulkpm will be updated.. there may be some turbulence during the update
Actions
AP1 Illia Halych: ask OOM for the right place to host the helm chart
AP2 Andreas Geißler add warning in doc for backup and restore
AP3 Morgan Richomme contact TSC regarding B&R task force and possible scenario for Honolulu
AP4 Morgan Richomme test basic-pnf (macro) on daily master before integrating this use case in CI
Action point follow-up
no action point from last week (several docs)
action point from the 17/02
AP1: Morgan Richomme : contact vcpe tosca team to get their position in Honolulu ()
not really an action point but the advice is wise... be careful after removing the submodules + change the xtesting installation (using a standard python module, not the developer way... xtesting-onap shall follow)
AP6: Krzysztof Kuzmicki prepare a 10 minute demo on pnf-simu to nf-simu (explaining the graph shared by mail)
done
AP7: Marcin Przybysz plan a demo of VID new UI, how to launch it, how it is used in PNF macromode + see with VID how we can manage tests that are using legacy VID..
done. It is a free channel; no idea how LF decides to pay for a non-free channel.. if the community thinks that we shall be able to keep the history and then take a non-free channel, it is possible to report it to the TSC
M3 is over.. JIRA review needed to know what is possible (support/test cases) and what shall be postponed...
the release cadence is new, some projects were surprised. No new code will be accepted in the OOM gate. Lots of patches submitted... gating system under pressure.. wait for the merges to reach RC0
list of jenkins images known in the CSIT repo but not sure it is documented. Some issues due to python 2.7/3.5/3.6/3.7/3.8/3.9... lib dependency issues. NB: the seccom recommended version is 3.9
Morgan Richomme reminds that robot tests are run through pythonsdk; the question is linked to the robot pod, not the ability to run robotframework test cases, which is always possible. Who is using the e2e scripts that trigger processing on the robot pod? Lasse Kaihlavirta indicates that some use cases (under upstreaming (A1)) are in this case. For Honolulu we may ask for a waiver and remove the pod in Istanbul (if nobody is using it). Morgan Richomme sent a mail to the community but no feedback so far. Scaleout and vFWCL still probably require the pod; other use cases also require the init offered by the bash scripts #action Andreas Geißler look in the official documentation for which use cases explicitly reference the scripts
error due to the internal cert test executed in the onap ns (shall be moved to an external ns)
1 error on enhanced policy => Policy anticipated the OOM merge and already adapted the test; it shall be back OK once the new Policy dockers are merged
1 error in E2E due to SDC (exception not caught)
lots of patches in the OOM gate; if there are lots of errors it is not normal, as master is reasonably stable...
#action all help reviewing the OOM patches to give Integration feedback
Guilin results are not good
same CI errors in DT and Orange
running the tests directly is usually fine => issue in the way the tests are launched in Guilin
"old" ansible version in the xtesting-onap branch
tests in progress to use xtesting-onap master (assuming the test version is already provided by an env variable) => no need for a specific branch
delta guilin/current master shared with the community
as usual some projects anticipated and provided features continuously (Policy, DCAE, NBI, OOF, OOM), some did not, which explains the gating bottleneck at M3
core/internal nodeport tests can be long; any idea how to parallelize them?
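One possible answer to the parallelization question: if the nodeport checks boil down to independent TCP probes, they can be fanned out with a thread pool. A minimal sketch (the `check_port`/`check_ports` helpers, host, and worker count are hypothetical, not part of the actual test suite):

```python
# Hypothetical sketch: probe many NodePorts concurrently instead of sequentially.
import socket
from concurrent.futures import ThreadPoolExecutor


def check_port(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host, ports, workers=16):
    """Probe all ports in parallel; returns {port: reachable}."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return dict(zip(ports, pool.map(lambda p: check_port(host, p), ports)))
```

With independent probes the wall-clock time drops from the sum of the timeouts to roughly the slowest single probe per batch of workers.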
wait for Paweł Wieczorek to come back; same issue as for the versions
Backup & restore
postponed to next week
question from AJAY SINGH on the availability of the E// PNF simulator in Nexus
answer sent by mail; the changes done broke the build chain. Snapshot images are removed after 40 days; as the image was neither released nor rebuilt, the expiration date was reached. Need to fix the build or release an old version (pretty sure the Jenkins history is also cleaned, so that is probably not possible)
AoB
Action points for next meeting
AP1: Andreas Geißler check whether the docs of the official use cases explicitly reference the robot pod
AP2: Morgan Richomme include Nokia pod in onap-integration web site
AP3: all help reviewing the OOM patches to give Integration feedback
Action point follow-up
Admin
Honolulu
demo: Core NSSMF simulator (Zhang Min) 20'
Krzysztof Kuzmicki indicates that dcae-mod shall be tested also through a new healthcheck test
vFWCL cannot be integrated in CI straightforwardly: pre-provisioned resources are needed, plus dependencies on the installation (documented)
Not sure vFW (macro mode) will bring more than basic_VM + PNF, and not sure Andreas Geißler will have the resources to deal with it
VCPE TOSCA (CLI): it was not possible to run it on the daily labs (it ran only on the tester's lab); improvements were planned in Guilin (and seem to be done). #action 1
#agreed No objection to the new categories and new tests for the Honolulu branch
for tern/scancode: watch out for the OS versions; Debian 9 is too old. #action 2
very few projects seem to use the baseline images (which are up to date)
Pythonsdk
removal of the submodule?
#agree remove the pnf submodule as soon as possible; be careful with the side effects on the xtesting dockers / xtesting-onap ansible roles, as we could move back from the src/onaptests path to /usr/lib/python3.8
status on the wrapper (helm2/helm3)
Illia Halych indicates that there is a client supporting helm2 and one supporting helm3 but not both
#agree consider only helm3
Add the statement in the README, as done for the Python minimum version (3.7, same as pythonsdk)
#action 3
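The agreed minimum-version statement can also be enforced at runtime so users fail fast instead of hitting obscure import errors. A small illustrative sketch (the `require_python` helper is invented for this example, not an existing pythonsdk function):

```python
import sys


def require_python(minimum=(3, 7)):
    """Abort early if the interpreter is older than the documented floor
    (3.7, the same minimum as pythonsdk per the meeting agreement)."""
    if sys.version_info[:2] < minimum:
        raise RuntimeError(
            "Python %d.%d+ required, found %s"
            % (minimum[0], minimum[1], sys.version.split()[0])
        )
```

Calling `require_python()` at package import time turns an unsupported interpreter into one clear error message.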
status on the tests basic_pnf, basic_clamp
problem for basic_pnf seems to be fixed, patch merged on SDC
see if basic_pnf could be introduced in CI (test it first manually on the daily/weekly labs, then add the job in the xtesting-onap project that defines the CI chain) #action 4
Improve reporting
reporting may be misleading as it indicates only the results of resource creation. We recently hit an issue when deleting resources: the reporting was all green even though there was an error during vfModule deletion. Michał Jagiełło submitted a patch to take deletion into account in the reporting #action 5
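The underlying fix amounts to aggregating over every lifecycle step, not just instantiation. A hypothetical illustration of why a creation-only view misleads (step names and record structure are invented for the example, not the actual onaptests reporting format):

```python
def overall_status(steps):
    """A run is green only if every recorded lifecycle step passed."""
    return all(step["passed"] for step in steps)


run = [
    {"step": "create_service", "passed": True},
    {"step": "create_vf_module", "passed": True},
    {"step": "delete_vf_module", "passed": False},  # the failure a creation-only report hides
]

# Creation-only view looks green; the full-lifecycle view is red.
creation_only = all(s["passed"] for s in run if s["step"].startswith("create"))
```

Here `creation_only` is True while `overall_status(run)` is False, which is exactly the misleading green report described above.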
Solution to limit issues on master gate (gating on pythonsdk?, versioning,..)
pnf-simu > nf-simu: formal review
#agreed
demo to be planned next week #action 6
AoB
Marcin Przybysz indicates that a discussion is needed on VID: the new VID UI is needed for PNF macro-mode tests, but the old UI is needed for vFWCL. How do we manage that?
Michał Jagiełło tried to set up the new VID UI (VID 7.0.2) but had issues; Marcin Przybysz indicates that some flags have to be tuned #action 7
actions for next meeting
AP1: Morgan Richomme: contact the vCPE TOSCA team to get their position for Honolulu ()
AP6: Krzysztof Kuzmicki prepare a 10-minute demo on the pnf-simu to nf-simu migration (explaining the graph shared by mail)
AP7: Marcin Przybysz plan a demo of the new VID UI: how to launch it, how it is used in PNF macro mode; also see with VID how we can manage tests that use the legacy VID