CPS-1126 : CmHandle creation performance degradation

References:

CPS-1126: CmHandle creation performance degradationClosed

Problem Statement

  • With previous versions we were able to register 10k cmhandles (in 4 threads) in 12 minutes.

  • With the current CPS master we sent the 10k cmHandles  around 10 minutes to NCMP, so it is not faster than before. What I see is that NCMP starts getting modules in a background thread (seems to be one thread) and handling one cmhandle takes more seconds.

  • Currently in 12 minutes only 150 cmHandles got available with ready state in NCMP, because getting modules and changing the state is quite slow this way. 



Expectation/ goal: cm-handle registration + module sync → 25 cm handles / sec (see CPS Interface Performance Requirements Overview)

Improvements Overview

Test Environment

Secs Taken for

ModuleSyncWatchdog (Cm Handle moves ADVISED->READY)

Secs taken for
Registration sub-process

Commit id

Parrealel Request to DMI

Main improvement

Comments

Test Environment

Secs Taken for

ModuleSyncWatchdog (Cm Handle moves ADVISED->READY)

Secs taken for
Registration sub-process

Commit id

Parrealel Request to DMI

Main improvement

Comments

Ericsson Testing cluster (with Model adapter that is implementation of dmi-plugin)

Total 1458 cm Handles

6,998 → 0.20/sec

2,278 → 0.64/sec

1410509e33142c0c79ff0e111c63abc47f5936d3

https://gerrit.onap.org/r/gitweb?p=cps.git;a=commit;h=1410509e33142c0c79ff0e111c63abc47f5936d3

N/A (1)

N/A

Baseline as per the bug

600 → 2.43/sec

70 → 20.82/sec

c3596fe3a85381668598c75a1ffee4bf973a8518

https://gerrit.onap.org/r/c/cps/+/129829/5

N/A (1)

  1. Created a dedicated threadpool for scheduler. - Tuned async threadpool of notification executor from setting RejectedExecutionHandler and application.yml.

Note : Root cause : We have 3 watchdogs (2 @Scheduled into module sync and 1 into data sync) that was blocking each other as all @Scheduled methods share a single thread by default. )



540 → 2.70 /sec

88(?!) → 16.57/sec

6e97bc6ee9f1cb2722aafc42325918e98da5cff2

https://gerrit.onap.org/r/c/cps/+/130058

N/A (1)

  1. Mapped additional properties of YangModelCmHandle.

  2. @Async : scheduler are executing tasks asynchronously 



10,000 → 4.17 / sec

10K in 40 mins

N/A

0e7970a1ec12eac0f45fa016f98148da7a2266a8

https://gerrit.onap.org/r/c/cps/+/130946 (Master)

10 

1. Introduced timeout for connect and reading for RestTemplate while connecting dmi-plugin.

2. Modified data structure of moduleSyncSemaphore to hold YangModelCmHandle referance (cm handle id) as key of map

3. Working of Modulewatchdog is modified with parallelism and broken into async tasks.

4. ModuleSyncSemaphoreMap is now populated with advised cm handles and processed by Modulewatchdog by batch size (Dynamic batch size at cm registration).

5. Introduled parallelism level for async tasks that are suppiled.

6. Includes cm handle registration preformance code change.

7. Performance Improvement: Postgres Pool Configuration

8. Performance Improvement: Async Notification Pool Configuration.

9. Performance Improvement: Save/update batch of CM-Handle States in one operation.

10. Performance Improvement: Updating (many) child datanodes/datanode tree into one SQL operation

11. Performance Improvement: Native query to more efficiently get all yangResourceIds.

12. Performance Improvement: Detaching fragment entities early.

13. Performance Improvement: Decouple Temporal DB Notification from main thread.

Note : It includes bottleneck of in my recursive method (@Joe K is working on it to improve.

org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#toDataNode)



Local Development Machine with CSIT docker deployment (with dmi-plugin that is  implemented by CPS team) 

97+296 = 393
→ 3.81 / sec

30 → 48.6 / sec

e367b520ac01432715b701af6e2bd75d4f81bccf

https://gerrit.onap.org/r/c/cps/+/130148  (Latest)

5

Removed  YangModelCmHandle object as key of map and stores only the reference that is cm handle id.





Note :

  1. It includes bottleneck of in recursive method 

 (org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#toDataNode)@Joe K is working on it to improve.

   2. dmi-plugin is able to handle only 5-8 cm handle per seconds             concurrently.

125+378  = 503
→ 2.98  / sec

33 → 45.45 / sec

10

246 +356 = 602
→ 2.49 / sec

31 → 48.39 / sec

30

Local Development Machine wit h dmi-plugin-perf-stub

10,000
→ 4.76 / sec

10K in 35 mins

Not measured

0e7970a1ec12eac0f45fa016f98148da7a2266a8

https://gerrit.onap.org/r/c/cps/+/130946 (Master)

10

  • Using wiremock stub implemented by CPS team

  • 100  requests of 100 cm handles each

  • 2 sec between each requests

Test Configuration 

Use Case

#

Configuration

Description

#

Configuration

Description

1

Number of CPS Instance

1

2

Total number of threads to send each batch (Register cm handle)

4

3

Batch size

100

4

To register 1458 cm handles

Used to send 15 batchs (100 cm handles in each batch)

CPS Parameters

#

Parameter

Value

#

Parameter

Value

1

timers:     advised-modules-sync:         sleep-time-ms:

30000

2

timers:     locked-modules-sync:         sleep-time-ms:

300000

3

timers:   cm-handle-data-sync:        sleep-time-ms:

30000

4

data-sync:     cache:         enabled:

false

Improvement Details

Performance matrix of improved methods that are called by ModuleSyncWatchdog or Cm handle registration API 

Test Environment

Method

 Sequential processing time
(Task Parallelism Level = 1 )

Parallel processing time(Task Parallelism Level = 30)

Number of DB calls reduced by parallel / batch processing 

Best Case

Worst Case

Average

Best Case

Worst Case

Average

Local Development Machine with CSIT docker deployment (with dmi-plugin that is  implemented by CPS team) 

org.onap.cps.ncmp.api.impl.event.lcm.LcmEventsCmHandleStateHandlerImpl#registerNewCmHandle

















org.onap.cps.ncmp.api.inventory.InventoryPersistence#saveCmHandleStates

















org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#addChildDataNodes

















org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#replaceDataNodeTree

















org.onap.cps.spi.repository.FragmentRepositoryCpsPathQueryImpl#findByAnchorAndCpsPath

















org.onap.cps.spi.repository.SchemaSetYangResourceRepositoryImpl#insertSchemaSetIdYangResourceId

















org.onap.cps.spi.repository.YangResourceNativeRepository#getResourceIdsByModuleNameAndRevision

















#

Operations

Average Time elapsed (in ~ ms)

#

Operations

Average Time elapsed (in ~ ms)

1

To save cm handle state with id 

Batch #.Iteration #

Time elapsed (in ~ ms)

1.1

293 

1.2

84

2.1

14

2.2

8

Average Time elapsed per cm handle (in ~ ms)

53

2

To synchronize, create schema set and anchor with id 

Batch #.Iteration #

Time elapsed (in ~ ms)

1.1

38393 

1.2

599

2.1

237 

2.2

122

Average Time elapsed per cm handle (in ~ ms)

0.483 

3

Delete schema set If exists with id

Batch #.Iteration #

Time elapsed (in ~ ms)

1.1

17

1.2

2

2.1

2

2.2

2

Average Time elapsed per cm handle (in ~ ms)

11.5

4

Get cm handle state with id

Batch #.Iteration #

Time elapsed (in ~ ms)

1.1

83

1.2

7

2.1

10

2.2

12

Average Time elapsed per cm handle (in ~ ms)

56 

5

 To get yang models

Batch #

Batch Size

Time elapsed (in ~ ms)

Time elapsed per cm handle (in ~ ms)

1

601

63107

105.00

2

857

37337

43.56

Average Time elapsed per iteration (in ~ ms)

74.28

6

To shuffle cm handles

Batch #

Batch Size

Time elapsed (in ~ ms)

1

601

0

2

857

0

Average Time elapsed per iteration (in ~ ms)

0

7

Get / fetch cm handles by state ADVISED

Batch #

Batch Size

Time elapsed (in ~ ms)

Time elapsed per cm handle (in ~ ms)

1

601

19404

32.28

2

857

3485

4.06

Average Time elapsed per iteration (in ~ ms)

18.17