Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

References:

Jira Legacy
serverSystem Jira
columnIdsissuekey,summary,issuetype,created,updated,duedate,assignee,reporter,priority,status,resolution
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId4733707d-2057-3a0f-ae5e-4fd8aff50176
keyCPS-1126

Overview :

With previous versions we were able to register 10k cmhandles (in 4 threads) in 12 minutes.
With the current CPS master we sent the 10k cmHandles  around 10 minutes to NCMP, so it is not faster than before. What I see is that NCMP starts getting modules in a background thread (seems to be one thread) and handling one cmhandle takes more seconds.
Currenlty in 12 minutes only 150 cmHandles got available with ready state in NCMP, because getting modules and changing the state is quite slow this way. 

...

Analyzed the logs the summary is as below :

Patch used for testing :

https://gerrit.onap.org/r/c/cps/+/129829/5//COMMIT_MSG

Configurations: 

...

1

...

100

...

Fixed Delay String (in ms)

...

timers:
    advised-modules-sync:
        sleep-time-ms: 

...

30000

...

timers:
    locked-modules-sync:
        sleep-time-ms: 

...

timers:
   cm-handle-data-sync:
       sleep-time-ms: 

...

Enabled ?

...

data-sync:
    cache:
        enabled: 

...

false

Scenario 1: If there is no cm handle into Advised state

Average time elapsed to get/fetch cm handles by state ADVISED is ~ 3-4 ms

CPS -log :

2022-07-15 04:40:17.918  INFO [cps-application,57eae02b31a98818,57eae02b31a98818] 1 --- [  watchdog-th-9] o.o.c.ncmp.api.inventory.sync.SyncUtils  : Time elapsed to get cm handles by state ADVISED is : 3 ms | Fetched data node size is 0
2022-07-15 04:40:17.918  INFO [cps-application,57eae02b31a98818,57eae02b31a98818] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : No Cm-Handles currently found in an ADVISED state
2022-07-15 04:40:48.003  INFO [cps-application,e1fb65c00a949ce0,e1fb65c00a949ce0] 1 --- [  watchdog-th-9] o.o.c.ncmp.api.inventory.sync.SyncUtils  : Time elapsed to get cm handles by state ADVISED is : 4 ms | Fetched data node size is 0
2022-07-15 04:40:48.003  INFO [cps-application,e1fb65c00a949ce0,e1fb65c00a949ce0] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : No Cm-Handles currently found in an ADVISED state
 

Scenario 2: If we started registering 1458 with configuration mentioned above :

...

Average Time elapsed (in ~ ms)

...

Get / fetch cm handles by state ADVISED

...

19404

...

Time

...

Registration of 1458 cm handles started

...

 04:41:00

...

Average time to make all 1458 cm handles from ADVICED to READY state is: ~ 10 minutes

First Batch:

First iteration:

2022-07-15 04:41:37.407  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.ncmp.api.inventory.sync.SyncUtils  : Time elapsed to get cm handles by state ADVISED is : 19404 ms | Fetched data node size is 601
2022-07-15 04:41:37.408  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.ncmp.api.inventory.sync.SyncUtils  : Time elapsed to shuffle  advisedCmHandles list of size : 601 is 0 ms
2022-07-15 04:42:40.516  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.ncmp.api.inventory.sync.SyncUtils  : Time elapsed to get yang model cm handles is : 63107 ms
2022-07-15 04:42:40.600  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to get cm handle state with id 6D3F803EFCCB92F84FEE3CAEEEC0C48D is : 83 ms
2022-07-15 04:42:40.618  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to deleteSchemaSetIfExists with id 6D3F803EFCCB92F84FEE3CAEEEC0C48D is : 17 ms
2022-07-15 04:43:19.011  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to syncAndCreateSchemaSetAndAnchor with id 6D3F803EFCCB92F84FEE3CAEEEC0C48D is : 38393 ms
2022-07-15 04:43:19.307  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to executeAdvisedCmHandlePoll saveCmHandleState with id 6D3F803EFCCB92F84FEE3CAEEEC0C48D is : 293 ms
2022-07-15 04:43:19.307  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : 6D3F803EFCCB92F84FEE3CAEEEC0C48D is now in READY state

Second iteration :

2022-07-15 04:43:19.314  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to get cm handle state with id CA29889D3AE24486DD95344AEC60E0F7 is : 7 ms
2022-07-15 04:43:19.317  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to deleteSchemaSetIfExists with id CA29889D3AE24486DD95344AEC60E0F7 is : 2 ms
2022-07-15 04:43:19.916  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to syncAndCreateSchemaSetAndAnchor with id CA29889D3AE24486DD95344AEC60E0F7 is : 599 ms
2022-07-15 04:43:20.001  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to executeAdvisedCmHandlePoll saveCmHandleState with id CA29889D3AE24486DD95344AEC60E0F7 is : 84 ms
2022-07-15 04:43:20.001  INFO [cps-application,8ffa50adfe3fb407,8ffa50adfe3fb407] 1 --- [  watchdog-th-9] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : CA29889D3AE24486DD95344AEC60E0F7 is now in READY state

Second Batch:

First iteration: 

2022-07-15 04:45:51.513  INFO [cps-application,acf4595802b71109,acf4595802b71109] 1 --- [ watchdog-th-10] o.o.c.ncmp.api.inventory.sync.SyncUtils  : Time elapsed to get cm handles by state ADVISED is : 3485 ms | Fetched data node size is 857
2022-07-15 04:45:51.513  INFO [cps-application,acf4595802b71109,acf4595802b71109] 1 --- [ watchdog-th-10] o.o.c.ncmp.api.inventory.sync.SyncUtils  : Time elapsed to shuffle  advisedCmHandles list of size : 857 is 0 ms
2022-07-15 04:46:28.851  INFO [cps-application,acf4595802b71109,acf4595802b71109] 1 --- [ watchdog-th-10] o.o.c.ncmp.api.inventory.sync.SyncUtils  : Time elapsed to get yang model cm handles is : 37337 ms
2022-07-15 04:46:28.861  INFO [cps-application,acf4595802b71109,acf4595802b71109] 1 --- [ watchdog-th-10] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to get cm handle state with id F11218269B0D31012FBAFF88F9B4A345 is : 10 ms
2022-07-15 04:46:28.864  INFO [cps-application,acf4595802b71109,acf4595802b71109] 1 --- [ watchdog-th-10] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to deleteSchemaSetIfExists with id F11218269B0D31012FBAFF88F9B4A345 is : 2 ms
2022-07-15 04:46:29.101  INFO [cps-application,acf4595802b71109,acf4595802b71109] 1 --- [ watchdog-th-10] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to syncAndCreateSchemaSetAndAnchor with id F11218269B0D31012FBAFF88F9B4A345 is : 237 ms
2022-07-15 04:46:29.116  INFO [cps-application,acf4595802b71109,acf4595802b71109] 1 --- [ watchdog-th-10] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : Time elapsed to executeAdvisedCmHandlePoll saveCmHandleState with id F11218269B0D31012FBAFF88F9B4A345 is : 14 ms
2022-07-15 04:46:29.116  INFO [cps-application,acf4595802b71109,acf4595802b71109] 1 --- [ watchdog-th-10] o.o.c.n.a.i.sync.ModuleSyncWatchdog      : F11218269B0D31012FBAFF88F9B4A345 is now in READY state

Second iteration:

...

References:

Jira Legacy
serverSystem Jira
columnIdsissuekey,summary,issuetype,created,updated,duedate,assignee,reporter,priority,status,resolution
columnskey,summary,type,created,updated,due,assignee,reporter,priority,status,resolution
serverId4733707d-2057-3a0f-ae5e-4fd8aff50176
keyCPS-1126

Problem Statement

  • With previous versions we were able to register 10k cmhandles (in 4 threads) in 12 minutes.
  • With the current CPS master we sent the 10k cmHandles  around 10 minutes to NCMP, so it is not faster than before. What I see is that NCMP starts getting modules in a background thread (seems to be one thread) and handling one cmhandle takes more seconds.
  • Currently in 12 minutes only 150 cmHandles got available with ready state in NCMP, because getting modules and changing the state is quite slow this way. 


Expectation/ goal: cm-handle registration + module sync → 25 cm handles / sec (see CPS Interface Performance Requirements Overview)

Improvements Overview

Test Environment

Secs Taken for

ModuleSyncWatchdog (Cm Handle moves ADVISED->READY)

Secs taken for
Registration sub-process
Commit idParrealel Request to DMIMain improvementComments

Ericsson Testing cluster (with Model adapter that is implementation of dmi-plugin)

Total 1458 cm Handles

6,998 → 0.20/sec2,278 → 0.64/sec

1410509e33142c0c79ff0e111c63abc47f5936d3

https://gerrit.onap.org/r/gitweb?p=cps.git;a=commit;h=1410509e33142c0c79ff0e111c63abc47f5936d3

N/A (1)N/ABaseline as per the bug
600 → 2.43/sec70 → 20.82/sec

c3596fe3a85381668598c75a1ffee4bf973a8518

https://gerrit.onap.org/r/c/cps/+/129829/5

N/A (1)
  1. Created a dedicated threadpool for scheduler. - Tuned async threadpool of notification executor from setting RejectedExecutionHandler and application.yml.

Note : Root cause : We have 3 watchdogs (2 @Scheduled into module sync and 1 into data sync) that was blocking each other as all @Scheduled methods share a single thread by default. )


540 → 2.70 /sec88(?!) → 16.57/sec

6e97bc6ee9f1cb2722aafc42325918e98da5cff2

https://gerrit.onap.org/r/c/cps/+/130058

N/A (1)
  1. Mapped additional properties of YangModelCmHandle.
  2. @Async : scheduler are executing tasks asynchronously 

10,000 → 4.17 / sec

10K in 40 mins

N/A

0e7970a1ec12eac0f45fa016f98148da7a2266a8

https://gerrit.onap.org/r/c/cps/+/130946 (Master)

10 

1. Introduced timeout for connect and reading for RestTemplate while connecting dmi-plugin.

2. Modified data structure of moduleSyncSemaphore to hold YangModelCmHandle referance (cm handle id) as key of map

3. Working of Modulewatchdog is modified with parallelism and broken into async tasks.

4. ModuleSyncSemaphoreMap is now populated with advised cm handles and processed by Modulewatchdog by batch size (Dynamic batch size at cm registration).

5. Introduled parallelism level for async tasks that are suppiled.

6. Includes cm handle registration preformance code change.

7. Performance Improvement: Postgres Pool Configuration

8. Performance Improvement: Async Notification Pool Configuration.

9. Performance Improvement: Save/update batch of CM-Handle States in one operation.

10. Performance Improvement: Updating (many) child datanodes/datanode tree into one SQL operation

11. Performance Improvement: Native query to more efficiently get all yangResourceIds.

12. Performance Improvement: Detaching fragment entities early.

13. Performance Improvement: Decouple Temporal DB Notification from main thread.

Note : It includes bottleneck of in my recursive method (Joe K is working on it to improve.

org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#toDataNode)


Local Development Machine with CSIT docker deployment (with dmi-plugin that is  implemented by CPS team) 

97+296 = 393
→ 3.81 / sec

30 → 48.6 / sec

e367b520ac01432715b701af6e2bd75d4f81bccf

https://gerrit.onap.org/r/c/cps/+/130148  (Latest)

5

Removed  YangModelCmHandle object as key of map and stores only the reference that is cm handle id.



Note :

  1. It includes bottleneck of in recursive method 

 (org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#toDataNode)Joe K is working on it to improve.

   2. dmi-plugin is able to handle only 5-8 cm handle per seconds             concurrently.

125+378  = 503
→ 2.98  / sec
33 → 45.45 / sec10

246 +356 = 602
→ 2.49 / sec

31 → 48.39 / sec30
Local Development Machine wit h dmi-plugin-perf-stub

10,000
→ 4.76 / sec

10K in 35 mins

Not measured

0e7970a1ec12eac0f45fa016f98148da7a2266a8

https://gerrit.onap.org/r/c/cps/+/130946 (Master)

10
  • Using wiremock stub implemented by CPS team
  • 100  requests of 100 cm handles each
  • 2 sec between each requests

Test Configuration 

Use Case

#ConfigurationDescription
1Number of CPS Instance

1

2Total number of threads to send each batch (Register cm handle)4
3Batch size

100

4To register 1458 cm handlesUsed to send 15 batchs (100 cm handles in each batch)

CPS Parameters

#Parameter
Value
1
timers:
    advised-modules-sync:
        sleep-time-ms: 

30000

2
timers:
    locked-modules-sync:
        sleep-time-ms: 
300000
3
timers:
   cm-handle-data-sync:
       sleep-time-ms: 
30000
4
data-sync:
    cache:
        enabled: 

false

Improvement Details

Performance matrix of improved methods that are called by ModuleSyncWatchdog or Cm handle registration API 

Test EnvironmentMethod Sequential processing time
(Task Parallelism Level = 1 )

Parallel processing time(Task Parallelism Level = 30)

Number of DB calls reduced by parallel / batch processing 
Best CaseWorst CaseAverageBest CaseWorst CaseAverage
Local Development Machine with CSIT docker deployment (with dmi-plugin that is  implemented by CPS team) org.onap.cps.ncmp.api.impl.event.lcm.LcmEventsCmHandleStateHandlerImpl#registerNewCmHandle







org.onap.cps.ncmp.api.inventory.InventoryPersistence#saveCmHandleStates







org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#addChildDataNodes







org.onap.cps.spi.impl.CpsDataPersistenceServiceImpl#replaceDataNodeTree







org.onap.cps.spi.repository.FragmentRepositoryCpsPathQueryImpl#findByAnchorAndCpsPath







org.onap.cps.spi.repository.SchemaSetYangResourceRepositoryImpl#insertSchemaSetIdYangResourceId







org.onap.cps.spi.repository.YangResourceNativeRepository#getResourceIdsByModuleNameAndRevision







#Operations
Average Time elapsed (in ~ ms)
1To save cm handle state with id Batch #.Iteration #Time elapsed (in ~ ms)
1.1293 
1.284
2.114
2.28
Average Time elapsed per cm handle (in ~ ms)53
2To synchronize, create schema set and anchor with id Batch #.Iteration #Time elapsed (in ~ ms)
1.138393 
1.2599
2.1237 
2.2122
Average Time elapsed per cm handle (in ~ ms)0.483 
3Delete schema set If exists with idBatch #.Iteration #Time elapsed (in ~ ms)
1.117
1.22
2.12
2.22
Average Time elapsed per cm handle (in ~ ms)11.5
4Get cm handle state with idBatch #.Iteration #Time elapsed (in ~ ms)
1.183
1.27
2.110
2.212
Average Time elapsed per cm handle (in ~ ms)56 
5 To get yang modelsBatch #Batch SizeTime elapsed (in ~ ms)Time elapsed per cm handle (in ~ ms)
160163107105.00
28573733743.56
Average Time elapsed per iteration (in ~ ms)74.28
6To shuffle cm handlesBatch #Batch SizeTime elapsed (in ~ ms)
16010
28570
Average Time elapsed per iteration (in ~ ms)0
7
Get / fetch cm handles by state ADVISED
Batch #Batch SizeTime elapsed (in ~ ms)Time elapsed per cm handle (in ~ ms)
1601

19404

32.28
285734854.06
Average Time elapsed per iteration (in ~ ms)18.17