The intent of the 72 hour stability test is not to exhaustively test all functions but to run a steady load against the system and look for issues like memory leaks that aren't found in the short duration install and functional testing during the development cycle.
This page will collect notes on the 72 hour stability test run for El Alto.
Setup
The integration-longevity tenant in the Intel/Windriver environment was used for the 72 hour tests.
The onap-ci job for "Project windriver-longevity-release-manual" was used for the deployment with the OOM and Integration branches set to elalto.
The deployment was fairly clean, but an environment issue, which looked like a network blip during the install, required a few pods to be recycled with the normal Kubernetes delete pod.
We also hit the environment DHCP bug where the VMs get an external DHCP address from a different network than OpenStack's DHCP. The symptom is not being able to log into the external IP of the VM.
This is solved by a forced reboot of the VM from the Horizon portal. Unfortunately the reboot prevents the installation of the demo VNF config files, so the VM install script has to be re-run from inside the VM.
Changes were made to the testsuite robot scripts for the instantiateDemoVFWCL robot flows so that the customer name/stack name generation again matches the Jenkins job setup for closed loop.
The mismatches were a side effect of the El Alto refactoring for the Python 2.7/3 migration that hadn't been detected in the previous test cases, because they only show up under the unique naming requirements of the Jenkins jobs.
Shakedown consisted of creating some temporary tags (stability72hrvLB, stability72hrvVG, stability72hrVFWCL) to make sure each sub-test ran successfully, including cleanup, in the environment before the Jenkins job started with the higher-level testsuite tag stability72hr that covers all three test types.
During shake down of the environment we exceeded the quota on key pairs again (a recurring problem due to testing in the environment where the keypair delete is not run after deleting the VMs).
We used the Horizon portal, with the common admin tenant, to delete the keypairs left by a large set of previous robot test runs. This should free up sufficient quota for the duration of the tests, but we will delete keypairs during the run if needed.
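The keypair cleanup described above could be scripted instead of done by hand. The following is only a minimal sketch of the selection step: it assumes robot-created keypairs follow the name layout of the single example recorded later on this page (RegionOne_ONAP-NF_20191013T150300143Z_olc-key_PlYL), and the function name `stale_keypairs`, the age threshold and the other sample names are hypothetical. It does not call OpenStack; the returned names would still have to be deleted through Horizon or the OpenStack client.

```python
import re
from datetime import datetime, timedelta

# Assumed name layout, inferred from one keypair name in the test notes:
#   RegionOne_ONAP-NF_20191013T150300143Z_olc-key_PlYL
# The timestamp is taken to be YYYYMMDDTHHMMSS followed by milliseconds and Z.
STAMP = re.compile(r"_ONAP-NF_(\d{8}T\d{6})\d{3}Z_")


def stale_keypairs(names, now, max_age):
    """Return the names whose embedded timestamp is older than max_age."""
    stale = []
    for name in names:
        match = STAMP.search(name)
        if match is None:
            continue  # not a robot-style name: leave it alone
        created = datetime.strptime(match.group(1), "%Y%m%dT%H%M%S")
        if now - created > max_age:
            stale.append(name)
    return stale


names = [
    "RegionOne_ONAP-NF_20191013T150300143Z_olc-key_PlYL",  # from the notes
    "RegionOne_ONAP-NF_20191001T080000000Z_olc-key_AbCd",  # made-up older name
    "operator-laptop-key",  # made-up name that does not match the pattern
]
selected = stale_keypairs(names, datetime(2019, 10, 13, 16, 0), timedelta(days=1))
print(selected)
```

Names that do not match the pattern are skipped on purpose, so keypairs created by people rather than by the robot scripts are never proposed for deletion.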
VNF Orchestration Tests
This test uses the onap-ci job "Project windriver-longevity-stability72hr" to automatically onboard, distribute and instantiate the ONAP opensource test VNFs vLB, vVG and vFWCL.
The scripts run validation tests after the install.
The scripts then delete the VNFs and clean up the environment for the next run.
The script tests AAF, DMaaP, SDC, VID, AAI, SO, SDNC, APPC with the open source VNFs.
These tests started with Jenkins job #243 on October 12 at 1:00 PM EST.
Each test run generates over 500 MB of data on the test through robot framework.
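To put the per-run data volume in context, a rough capacity estimate can be worked out from the log share figures recorded in the status table for job #261. This is only a sketch: it assumes the df numbers are 1K blocks (the df default), that a run writes exactly 500 MB (the notes say "over 500 MB", so the result is an upper bound), and the function name `runs_until_full` is hypothetical.

```python
def runs_until_full(available_kib, mib_per_run):
    """Whole number of runs that fit in the remaining space."""
    kib_per_run = mib_per_run * 1024
    return available_kib // kib_per_run


# "Available" column of the df line recorded at job #261 (assumed to be 1K blocks).
available_kib = 128894976
remaining_runs = runs_until_full(available_kib, 500)
print(remaining_runs)  # 251
```

Under these assumptions the share has room for roughly 250 more runs, which is why disk usage is spot-checked in the status rows rather than treated as an immediate risk.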
Test # | Comment | Message |
---|---|---|
Test Start: #243 1 PM Oct 12 | | |
245 | The Validate vServer step in the testsuite HeatBridge needed to wait for the AAI index update. Wrapped this step in Wait Until Keyword Succeeds. | post response: {"requestError":{"serviceException":{"messageId":"SVC3001","text":"Resource not found for %1 using id %2 (msg=%3) (ec=%4)","variables":["POST Search","getNamedQueryResponse","Node Not Found:No Node of type vserver found for properties","ERR.5.4.6114"]}}} |
260 | Environment Issue/Test tool issue. robot test script created conflicting data | Received failure response from so {"request":{"requestId":"79264729-04ab-4738-a27d-29013c59218c","startTime":"Sun, 13 Oct 2019 09:38:20 GMT","finishTime":"Sun, 13 Oct 2019 09:39:14 GMT","requestScope":"vfModule","requestType":"createInstance","requestDetails":{"modelInfo":{"modelCustomizationName":"VfwclVfwsnk0f6a8e47E64e..base_vfw..module-0","modelInvariantId":"e994097b-6285-49e1-a87c-76ba6e0371ab","modelType":"vfModule","modelName":"VfwclVfwsnk0f6a8e47E64e..base_vfw..module-0","modelVersion":"1","modelCustomizationUuid":"6ce786ef-31e8-4f00-bdb4-1c66f54eaffd","modelVersionId":"72f56293-fbf2-49fa-bb13-1df8f5f88548","modelCustomizationId":"6ce786ef-31e8-4f00-bdb4-1c66f54eaffd","modelUuid":"72f56293-fbf2-49fa-bb13-1df8f5f88548","modelInvariantUuid":"e994097b-6285-49e1-a87c-76ba6e0371ab","modelInstanceName":"VfwclVfwsnk0f6a8e47E64e..base_vfw..module-0"},"requestInfo":{"source":"VID","instanceName":"Vfmodule_Ete_vFWCLvFWSNK_031aaae1_0","suppressRollback":false,"requestorId":"demo"},"relatedInstanceList":[{"relatedInstance":{"instanceId":"fc4a3aac-e15e-4cf2-b85c-93eee3cdf3cc","modelInfo":{"modelInvariantId":"ed6ca1d8-cf38-455b-bb0a-75ae84d51715","modelType":"service","modelName":"vFWCL 2019-10-13 09:29:","modelVersion":"1.0","modelVersionId":"1c3dece0-945e-4f38-b5d2-f1d3fe7579e1","modelUuid":"1c3dece0-945e-4f38-b5d2-f1d3fe7579e1","modelInvariantUuid":"ed6ca1d8-cf38-455b-bb0a-75ae84d51715"}}},{"relatedInstance":{"instanceId":"d4cc80c3-367c-4de2-8dd2-52904466b60a","modelInfo":{"modelCustomizationName":"vFWCL_vFWSNK 0f6a8e47-e64e 0","modelInvariantId":"dcbe3ca3-b9c3-4042-a06f-5ad83f1be089","modelType":"vnf","modelName":"vFWCL_vFWSNK 0f6a8e47-e64e","modelVersion":"1.0","modelCustomizationUuid":"9eaff9be-ac20-4872-9804-7bd45515a351","modelVersionId":"2de4b9dd-b6d6-4822-92c0-670c9329557f","modelCustomizationId":"9eaff9be-ac20-4872-9804-7bd45515a351","modelUuid":"2de4b9dd-b6d6-4822-92c0-670c9329557f","modelInvariantUuid":"dcbe3ca3-b9c3-4042-a06f-5ad83f1be089","modelInstanceName":"vFWCL_vFWSNK 0f6a8e47-e64e 0"}}}],"cloudConfiguration":{"tenantId":"28481f6939614cfd83e6767a0e039bcc","cloudOwner":"CloudOwner","lcpCloudRegionId":"RegionOne"},"requestParameters":{"usePreload":true,"testApi":"VNF_API"}},"instanceReferences":{"serviceInstanceId":"fc4a3aac-e15e-4cf2-b85c-93eee3cdf3cc","vnfInstanceId":"d4cc80c3-367c-4de2-8dd2-52904466b60a","vfModuleInstanceName":"Vfmodule_Ete_vFWCLvFWSNK_031aaae1_0","requestorId":"demo"},"requestStatus":{"requestState":"FAILED","statusMessage":"STATUS: Received vfModuleException from VnfAdapter: category='INTERNAL' message='Exception during create VF org.onap.so.openstack.utils.StackCreationException: Stack Creation Failed Openstack Status: CREATE_FAILED Status Reason: Resource CREATE failed: Conflict: resources.vsn_0_onap_private_port_0: IP address 10.0.235.102 already allocated in subnet 4ed99c09-aed6-4eca-8f94-48357ab4e5d1\nNeutron server returns request_ids: ['req-f60a93ff-ecbf-4c5e-b149-8ebdf64e38f2'] , Rollback of Stack Creation completed with status: DELETE_COMPLETE Status Reason: Stack DELETE completed successfully' rolledBack='true'","percentProgress":100,"timestamp":"Sun, 13 Oct 2019 09:39:14 GMT"}}} |
Test Status: #261 7 AM Oct 13 | No leftover VMs or stacks after delete. Docker-data-nfs at 21% of available capacity (robot container: 10.0.0.4:/dockerdata-nfs/dev-robot/robot/logs 162420736 33509376 128894976 21% /share/logs). 17 keypairs under the demo account. Environment spot check while tests are not running looks okay (output header: NAME CPU(cores) CPU% MEMORY(bytes) MEMORY%). | |
Test Status: #267 12:00 PM Oct 13 | Disk usage: /dev/vda1 162420480 36636868 125767228 23% /. No leftover VMs or stacks from previous runs. RegionOne_ONAP-NF_20191013T150300143Z_olc-key_PlYL style keypairs were added in the morning; up to 27 keypairs. | |
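The fix noted for job #245 retries a validation step until the AAI index has caught up. The same retry-until-success idea can be sketched in plain Python as follows. This is an illustration only, not the testsuite code: `wait_until_succeeds`, `find_vserver` and the returned record are hypothetical, and the simulated lookup simply fails twice before succeeding.

```python
import time


def wait_until_succeeds(action, timeout_s, interval_s):
    """Call action() until it stops raising, or re-raise as a timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            return action()
        except Exception as error:
            if time.monotonic() >= deadline:
                raise TimeoutError(
                    f"still failing after {timeout_s}s: {error}"
                ) from error
            time.sleep(interval_s)


attempts = {"count": 0}


def find_vserver():
    # Simulated lookup: the first two calls fail as if the index were stale.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise LookupError("No Node of type vserver found")
    return {"vserver-name": "example-vserver"}  # made-up record


result = wait_until_succeeds(find_vserver, timeout_s=5, interval_s=0.01)
print(result, "after", attempts["count"], "attempts")
```

Retrying with a bounded timeout keeps a genuinely missing vserver a test failure while tolerating the short delay before the index update becomes visible.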
Closed Loop Tests
This test uses the onap-ci job "Project windriver-longevity-vfwclosedloop".
The test uses the robot test script "demo-k8s.sh vfwclosedloop". The script sets the number of streams on the vPacket Generator to 10, waits for the control loop to change the 10 set streams back to 5, then sets the streams to 1 and again waits for the control loop to restore 5 streams.
A successful run exercises the loop from the VNF through DCAE, DMaaP, Policy, AAI, AAF and APPC.
The tests started with job #1595 on October 12 at 4:00 PM EST.
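The pass criterion of the closed loop test can be illustrated with a small simulation. This is a sketch under stated assumptions, not the demo-k8s.sh logic: `FakePacketGenerator` stands in for the vPacket Generator and pretends the control loop restores 5 streams after a few polls, and all names and poll counts are hypothetical.

```python
import time


class FakePacketGenerator:
    """Stand-in for the vPacket Generator; a real test would query the VNF."""

    def __init__(self, corrected_after_polls=2):
        self.streams = 5
        self._delay = corrected_after_polls
        self._polls_left = 0

    def set_streams(self, count):
        self.streams = count
        self._polls_left = self._delay

    def get_streams(self):
        # Pretend the control loop resets the count to 5 after a few polls.
        if self.streams != 5:
            if self._polls_left <= 0:
                self.streams = 5
            else:
                self._polls_left -= 1
        return self.streams


def wait_for_streams(generator, expected, max_polls, interval_s=0.0):
    """Poll until the stream count equals expected or polls run out."""
    for _ in range(max_polls):
        if generator.get_streams() == expected:
            return True
        time.sleep(interval_s)
    return False


def closed_loop_check(generator, max_polls=10):
    """Force 10 streams, then 1, expecting a return to 5 each time."""
    for forced in (10, 1):
        generator.set_streams(forced)
        if not wait_for_streams(generator, 5, max_polls):
            return False
    return True


passed = closed_loop_check(FakePacketGenerator())
print(passed)
```

Checking both a high (10) and a low (1) starting value shows that the loop corrects the stream count in either direction, rather than only reacting to one kind of deviation.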
Test # | Comment | Message |
---|---|---|
Test Start: #1595 4 PM Oct 12 | | |
Test Status: #1610 7 AM Oct 13 | No issues. No failed tests | |
Test Status: #1615 12 PM Oct 13 | No issues. No failed tests | |
Summary
To be completed after the test run