History

Before Frankfurt

Until Frankfurt, there were two tests:

stability test: vFW (then vFWCL) run continuously for 72 hours (https://docs.onap.org/en/elalto/submodules/integration.git/docs/integration-s3p.html?highlight=stability)
resiliency test: some pods are destroyed, then the vFW use case is retested to check that it still works (only up to El Alto)

In Frankfurt, we also considered the stability of the installation through the Daily chains (https://docs.onap.org/projects/onap-integration/en/frankfurt/integration-s3p.html#integration-s3p)

Guilin

The stability tests considered for the release were:

1 week stability test based on basic_vm
1 day HC verification
Daily CI Guilin installation chain

See https://docs.onap.org/projects/onap-integration/en/guilin/integration-s3p.html#integration-s3p

Evolution for Honolulu

In Honolulu, we would like to revisit the stability/resiliency testing by introducing automated tests on the CI weekly chain.

It means we want to execute tests over a week to verify the resiliency and the stability of the solution during the development life cycle.

Definition of the KPIs

What do we want to test, and which figures do we need: the number of onboardings and instantiations, the test duration?

As a first step, we estimate our needs at:

10 parallel service onboardings: 10 simultaneous module uploads in ONAP
50 parallel instantiations: 50 simultaneous creations of services declared in ONAP
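
To make the notion of "N parallel" runs concrete, here is a minimal sketch of how such a series could be launched and its success rate computed. The function onboard_service is a hypothetical placeholder that only sleeps; it is not part of pythonsdk, and a real run would call the actual onboarding or instantiation there.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def onboard_service(name):
    """Hypothetical placeholder for one onboarding; returns (name, success, seconds)."""
    start = time.monotonic()
    time.sleep(0.01)  # stands in for the real onboarding work
    return name, True, time.monotonic() - start


def run_parallel(task, names):
    """Run task once per name, all at the same time, and collect the results."""
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        return list(pool.map(task, names))


# One series of 10 simultaneous runs, as in the first KPI.
results = run_parallel(onboard_service, [f"service_{i}" for i in range(10)])
success_rate = 100 * sum(ok for _, ok, _ in results) / len(results)
print(f"{len(results)} runs, success rate {success_rate:.0f}%")
```

Using one worker per name ensures that all runs of a series really start together, which is the situation the KPI describes.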

We could imagine additional KPIs:

number of simultaneous loop creations/instantiations
number of DMaaP messages
number of event messages

The question has been raised at the community level, especially among the service providers operating ONAP in production.

Testcase 1: Parallel onboarding tests

Description

The goal of this test is to create several services in parallel in SDC.

We estimate that this number is not very high in real operations, because it corresponds to the upload of a new service model, which does not occur frequently.

Environment

The tests were executed from 07/01/2021 to 13/01/2021 on a Guilin lab, reusing basic_vm with different service names (which means that all the SDC objects are recreated: VSP, VF, Services).

Two series were run several times:

5 simultaneous onboardings
10 simultaneous onboardings

The main component used for this test is the SDC (+AAI).

The reporting page can be described as follows:

The name of the service is basic_onboard_<random string>. The random string is needed to ensure that the onboarding mechanism is actually exercised on each run: with an already used name, pythonsdk would retrieve the service already onboarded.
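
The naming scheme can be sketched as follows. This is only an illustration: the helper random_service_name, the suffix length and the alphabet are assumptions, not the values used by the actual test.

```python
import secrets
import string


def random_service_name(prefix="basic_onboard_", length=6):
    """Build a service name from the prefix and a random suffix.

    The suffix length (6) and the alphabet are assumptions made for this sketch.
    """
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(length))
    return prefix + suffix


# One name per simultaneous onboarding in a series of 5.
names = [random_service_name() for _ in range(5)]
print(names)
```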

During the test, we monitored the ONAP cluster resources through Prometheus/Grafana:

Common Cassandra resource consumption:

Results

Data format is MM:SS

5 parallel onboarding (10 series)

Criteria \ Series	1	2	3	4	5	6	7	8	9	10	Global
Success rate (%)	100	100	100	100	100	100	100	100	100	100	100
Min duration	27:39	10:24	10:15	10:26	11:18	07:42	07:54	08:05	08:35	08:20	07:42
Max duration	27:43	10:36	10:17	10:27	11:22	07:53	08:00	08:19	08:42	08:44	27:43
Average duration	27:41	27:40	10:16	10:27	11:20	07:48	07:58	08:12	08:39	08:42	11:09
Median duration	27:41	10:26	10:16	10:27	11:19	07:49	07:59	08:13	08:40	08:38	09:30
Comments/Errors	/	/	/	/	/	/	/	/	/	/	/

Evolution of the average duration in seconds over time for series of 5.

10 parallel onboarding (5 series)

Criteria \ Series	1	2	3	4	5	Global
Success rate (%)	100	100	100	100	100	100
Min duration	16:04	15:24	16:32	19:40	19:07	15:24
Max duration	16:22	17:10	17:36	20:01	19:50	20:01
Average duration	16:15	16:51	17:23	19:52	19:46	18:00
Median duration	16:20	17:08	17:33	19:53	19:38	17:33
Comments/Errors	/	/	/	/	/	/
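
Since the durations are reported as MM:SS, they have to be converted to seconds before they can be compared or plotted. The sketch below shows one way to do it on the rows of the table above; the helper names to_seconds and to_mmss are introduced here for illustration only.

```python
def to_seconds(mmss):
    """Convert an 'MM:SS' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds):
    """Convert a number of seconds back to an 'MM:SS' string."""
    minutes, seconds = divmod(round(total_seconds), 60)
    return f"{minutes:02d}:{seconds:02d}"


# Rows copied from the table for 10 parallel onboardings (series 1 to 5).
min_durations = ["16:04", "15:24", "16:32", "19:40", "19:07"]
max_durations = ["16:22", "17:10", "17:36", "20:01", "19:50"]
avg_durations = ["16:15", "16:51", "17:23", "19:52", "19:46"]

# The Global column for min and max is the extreme value across the series.
global_min = to_mmss(min(to_seconds(d) for d in min_durations))
global_max = to_mmss(max(to_seconds(d) for d in max_durations))

# Average durations in seconds, the unit used for the evolution plots.
avg_seconds = [to_seconds(d) for d in avg_durations]
print(global_min, global_max, avg_seconds)
```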

Evolution of the average duration in seconds over time for series of 10.

Evolution of test durations over the campaign for series of 5 (red/first circle) and 10 (green/second circle).

Conclusions

ONAP Guilin is able to support 10 parallel onboardings, which is what we expected.

We may also observe that:

The number of previously onboarded services has no impact on the onboarding duration. The creation of resources is linear, which means that on series 10, 9 services have already been created. We could have expected a linear increase of the onboarding duration, because the client used for the test lists the services several times, and the more services there are in SDC, the bigger the list is. So, globally, the SDC resources increase continuously, because we cannot delete them, but this has no direct impact on the onboarding duration. The duration does not evolve linearly, and it seems to depend mainly on the cluster status.
The more parallel processing we have, the slower the onboarding is: duration = f(number of parallel onboardings) seems almost linear.
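
As a rough check of the last observation, the Global figures of the two result tables can be compared directly. The sketch below only uses values reported in the tables; the helper names are introduced for illustration.

```python
def to_seconds(mmss):
    """Convert an 'MM:SS' string to a number of seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


# Global figures copied from the two result tables (5 and 10 parallel onboardings).
global_average = {5: "11:09", 10: "18:00"}
global_median = {5: "09:30", 10: "17:33"}


def duration_ratio(figures):
    """Ratio between the duration with 10 and with 5 parallel onboardings."""
    return to_seconds(figures[10]) / to_seconds(figures[5])


average_ratio = duration_ratio(global_average)
median_ratio = duration_ratio(global_median)
print(f"average x{average_ratio:.2f}, median x{median_ratio:.2f}")
```

Doubling the number of parallel onboardings multiplies the global duration by about 1.6 (average) to 1.85 (median), where a strictly proportional relation would give 2. The first series of 5, at 27:41, pulls the average for 5 parallel onboardings up, which is why the median ratio is the closer of the two to 2.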
