Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

References:

Jira Legacy
serverSystem Jira
columnIdsissuekey,summary,issuetype,created,updated,duedate,assignee,reporter,priority,status,resolution
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId4733707d-2057-3a0f-ae5e-4fd8aff50176
keyCPS-1126

...

Problem Statement

  • With previous versions we were able to register 10k cmhandles (in 4 threads) in 12 minutes.
  • With the current CPS master we sent the 10k cmHandles  around 10 minutes to NCMP, so it is not faster than before. What I see is that NCMP starts getting modules in a background thread (seems to be one thread) and handling one cmhandle takes more seconds.

...

  • Currently in 12 minutes only 150 cmHandles got available with ready state in NCMP, because getting modules and changing the state is quite slow this way. 

...

Analyzed the logs the summary is as below :

Patch used for testing :


Expectation/ goal: cm-handle registration + module sync → 25 cm handles / sec (see CPS Interface Performance Requirements Overview)

Improvements Overview

Test Environment

Secs Taken for

ModuleSyncWatchdog (Cm Handle moves ADVISED->READY)

Secs taken for
Registration sub-process
Commit idParrealel Request to DMIMain improvementComments

Ericsson Testing cluster (with Model adapter that is implementation of dmi-plugin)

Total 1458 cm Handles

6,998 → 0.20/sec2,278 → 0.64/sec

1410509e33142c0c79ff0e111c63abc47f5936d3

https://gerrit.onap.org/r/gitweb?p=cps.git;a=commit;h=1410509e33142c0c79ff0e111c63abc47f5936d3

N/A (1)N/ABaseline as per the bug
600 → 2.43/sec70 → 20.82/sec

c3596fe3a85381668598c75a1ffee4bf973a8518

https://gerrit.onap.org/r/c/cps/+/129829/5

N/A (1)
  1. Created a dedicated threadpool for scheduler. - Tuned async threadpool of notification executor from setting RejectedExecutionHandler and application.yml.

Note : Root cause : We have 3 watchdogs (2 @Scheduled into module sync and 1 into data sync) that was blocking each other as all @Scheduled methods share a single thread by default. )


540 → 2.70 /sec88(?!) → 16.57/sec

6e97bc6ee9f1cb2722aafc42325918e98da5cff2

https://gerrit.onap.org/r/c/cps/+/130058

N/A (1)
  1. Mapped additional properties of YangModelCmHandle.
  2. @Async : scheduler are executing tasks asynchronously 

10,000 → 4.17 / sec

10K in 40 mins

N/A

0e7970a1ec12eac0f45fa016f98148da7a2266a8

https://gerrit.onap.org/r/c/cps/+/130946 (Master)

10 

1. Introduced timeout for connect and reading for RestTemplate while connecting dmi-plugin.

2. Modified data structure of moduleSyncSemaphore to hold YangModelCmHandle referance (cm handle id) as key of map

3. Working of Modulewatchdog is modified with parallelism and broken into async tasks.

4. ModuleSyncSemaphoreMap is now populated with advised cm handles and processed by Modulewatchdog by batch size (Dynamic batch size at cm registration).

5. Introduled parallelism level for async tasks that are suppiled.

6. Includes cm handle registration preformance code change.

7. Performance Improvement: Postgres Pool Configuration

8. Performance Improvement: Async Notification Pool Configuration.

9. Performance Improvement: Save/update batch of CM-Handle States in one operation.

10. Performance Improvement: Updating (many) child datanodes/datanode tree into one SQL operation

11. Performance Improvement: Native query to more efficiently get all yangResourceIds.

12. Performance Improvement: Detaching fragment entities early.

13. Performance Improvement: Decouple Temporal DB Notification from main thread.

Note : It includes bottleneck of in my recursive method (Joe K is working on it to improve.

org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#toDataNode)


Local Development Machine with CSIT docker deployment (with dmi-plugin that is  implemented by CPS team) 

97+296 = 393
→ 3.81 / sec

30 → 48.6 / sec

e367b520ac01432715b701af6e2bd75d4f81bccf

https://gerrit.onap.org/r/c/cps/+/

...

130148  (Latest)

5

Removed  YangModelCmHandle object as key of map and stores only the reference that is cm handle id.



Note :

  1. It includes bottleneck of in recursive method 

 (org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#toDataNode)Joe K is working on it to improve.

   2. dmi-plugin is able to handle only 5-8 cm handle per seconds             concurrently.

125+378  = 503
→ 2.98  / sec
33 → 45.45 / sec10

246 +356 = 602
→ 2.49 / sec

31 → 48.39 / sec30
Local Development Machine wit h dmi-plugin-perf-stub

10,000
→ 4.76 / sec

10K in 35 mins

Not measured

0e7970a1ec12eac0f45fa016f98148da7a2266a8

https://gerrit.onap.org/r/c/cps/+/130946 (Master)

10
  • Using wiremock stub implemented by CPS team
  • 100  requests of 100 cm handles each
  • 2 sec between each requests

Test Configuration 

Use Case

#ConfigurationDescription
1Number of CPS Instance

1

2Total number of threads to send each batch (Register cm handle)4
3Batch size

100

4To register 1458 cm handlesUsed to send 15 batchs (100 cm handles in each batch)

CPS Parameters

#
WatchdogFixed Delay String (in ms)1
Parameter
Value
1
timers:
    advised-modules-sync:
        sleep-time-ms: 

30000

2
timers:
    locked-modules-sync:
        sleep-time-ms: 
300000
3
timers:
   cm-handle-data-sync:
       sleep-time-ms: 
30000
#Data Sync
Enabled ?
4
data-sync:
    cache:
        enabled: 

false

Improvement Details

Performance matrix of improved methods that are called by ModuleSyncWatchdog or Cm handle registration API 

Test EnvironmentMethod Sequential processing time
(Task Parallelism Level = 1 )

Parallel processing time(Task Parallelism Level = 30)

Number of DB calls reduced by parallel / batch processing 
Best CaseWorst CaseAverageBest CaseWorst CaseAverage
Local Development Machine with CSIT docker deployment (with dmi-plugin that is  implemented by CPS team) org.onap.cps.ncmp.api.impl.event.lcm.LcmEventsCmHandleStateHandlerImpl#registerNewCmHandle







org.onap.cps.ncmp.api.inventory.InventoryPersistence#saveCmHandleStates







org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#addChildDataNodes







org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#replaceDataNodeTree







org.onap.cps.spi.repository.FragmentRepositoryCpsPathQueryImpl#findByAnchorAndCpsPath







org.onap.cps.spi.repository.SchemaSetYangResourceRepositoryImpl#insertSchemaSetIdYangResourceId







org.onap.cps.spi.repository.YangResourceNativeRepository#getResourceIdsByModuleNameAndRevision







#Operations
Average Time elapsed (in ~ ms)
1To save cm handle state with id Batch #.Iteration #Time elapsed (in ~ ms)
1.1293 
1.284
2.114
2.28
Average Time elapsed per cm handle (in ~ ms)53
2To synchronize, create schema set and anchor with id Batch #.Iteration #Time elapsed (in ~ ms)
1.138393 
1.2599
2.1237 
2.2122
Average Time elapsed per cm handle (in ~ ms)0.483 
3Delete schema set If exists with idBatch #.Iteration #Time elapsed (in ~ ms)
1.117
1.22
2.12
2.22
Average Time elapsed per cm handle (in ~ ms)11.5
4Get cm handle state with idBatch #.Iteration #Time elapsed (in ~ ms)
1.183
1.27
2.110
2.212
Average Time elapsed per cm handle (in ~ ms)56 
5 To get yang modelsBatch #Batch SizeTime elapsed (in ~ ms)Time elapsed per cm handle (in ~ ms)
160163107105.00
28573733743.56
Average Time elapsed per iteration (in ~ ms)74.28
6To shuffle cm handlesBatch #Batch SizeTime elapsed (in ~ ms)
16010
28570
Average Time elapsed per iteration (in ~ ms)0
7
Get / fetch cm handles by state ADVISED
Batch #Batch SizeTime elapsed (in ~ ms)Time elapsed per cm handle (in ~ ms)
1601

19404

32.28
285734854.06
Average Time elapsed per iteration (in ~ ms)18.17