Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

Table of Contents

Summary

This is an assessment of current performance of CPS operations, as of August 2023.

...

Created device nodes110020030040050060070080090010001,000
Fragments created868,60017,20025,80034,40043,00051,60060,20068,80077,40086,000

Time (seconds)

0.302.364.367.159.7611.5014.7718.4319.7922.1626.54
Fragments per second86287

Image Removed

Observations

3,6443,9453,6083,5253,7393,4943,2663,47734933,240

Graph of time taken to store device nodes

Image Added

Observations

  • Storing data nodes has linear time complexity (as expected).
  • Raw performance is roughly 3000 3,500 fragments per second for the given test setup.
  • Performance can be improved by enabling write batching (CPS-1795)
  • There are edge cases with exponential complexity (adding books to the bookstore model).

...

In this scenario, 1,000 Open ROADM device nodes are already defined. A number of these existing data nodes will be updated using CpsDataService::updateDataNodeAndDescendants.

12.86
Updated device nodes11002003004005006007008009001000

Time (seconds)

0.221,000
Fragments updated868,60017,20025,80034,40043,00051,60060,20068,80077,40086,000

Time (seconds)

0.2212.7928.3844.2351.55

69.46

85.6795.02109.16117.00131.15
Fragments391672606

...

583

Observations

...

667619602634630662656

Graph of time taken to update device nodes and descendants

Image Added

Observations

  • Updating data nodes has linear time complexity (as expected).
  • Raw performance is roughly 600 fragments per second for the given model and test setup.
  • Updating data nodes is 5 times slower than storing data nodes.

...

This is the weak spot of all write operations. A custom algorithm comparing existing data against updated data (delta algorithm) may be required to improve performance here. This could also indicate that Hibernate is not being used effectively.

TODO mention json alphabetical sorting

Updating data leaves

In this scenario, 1,000 Open ROADM device nodes are already defined. The data leaves of a number of these There are a number of issues possibly affecting performance here. For example, after DataNodes are converted to FragmentEntity objects, they are being persisted using the Hibernate persist method, which will always write the changes, regardless of whether the data changes. There is an alternative method called merge which compares existing data to updated data. Even if this were implemented, there are additional issues, such as JSON encoding giving inconsistent ordering. For example, data leaves could be encoded as "{'x':'1', 'y':'2'}" or "{'y':'2','x':'1'}", depending on exact object type for storing leaves (HashMap, LinkedHashMap, ImmutableMap, etc.). There is an option for JsonObjectMapper to order the keys alphabetically during encoding.

Updating data leaves

In this scenario, 1,000 Open ROADM device nodes are already defined. The data leaves of a number of these existing data nodes will be updated using CpsDataService::updateNodeLeaves.

...

Code Block
languagetext
{
    'openroadm-device': [
        {'device-id':'C201-7-1A-1', 'status':'fail', 'ne-state':'jeopardy'}
    ]
}

Test Results

...

}

Test Results

10000.56
Updated device nodes11002003004005006007008009001,000
Fragments updated11002003004005006007008009001,000

Time (seconds)

0.200.270.280.280.320.380.390.470.490.520.380.390.470.490.52

0.56

Fragments per second53707141,0711,2501,3161,5391,4891,6331,7311,786

Graph of updating data leaves of device nodes

Observations

  • Updating data leaves has linear time complexity.
  • Raw performance is about 3000 around 1,500 fragments per second.
  • This is very fast compared to updating whole data nodes, and should be preferred where possible.

...

In this scenario, 300 Open ROADM device nodes are already defined. A number of these data nodes will be deleted using CpsDataService::deleteDataNodes. The types of nodes will be varied, for example, deleting container nodes, list elements, or whole lists.

Test results

N =
50100150200250300Example xpath
Delete top-level container node-----0.63/openroadm-devices
Batch delete N/300 container nodes0.150.260.380.450.550.69/openroadm-devices/openroadm-device[@device-id='C201-7-1A-10']/org-openroadm-device
Batch delete N/300 lists elements0.130.250.340.450.550.67/openroadm-devices/openroadm-device[@device-id='C201-7-1A-49']
Batch delete N/300 whole lists0.511.051.401.852.132.56/openroadm-devices/openroadm-device[@device-id='C201-7-1A-293']/org-openroadm-device/degree
Try batch delete N/300 non-existing0.250.540.670.951.151.32/path/to/non-existing/node[@id='27']

...

  • Delete performance is linear on the amount of data being deleted (as expected).
  • Raw performance of deleting container nodes is around 4035,000 fragments per second. (So we can delete data nodes more than 10x faster than creating them.)
  • Deleting lists is much slower than deleting the parent container of the list (this can be improved).
  • Of note, attempting to delete non-existing data nodes takes longer than actually deleting the equivalent amount of nodes with descendants - it is a slow operation.

...

Reading the top-level container node with no descendants:

...

node with no descendants:

Total device nodes5001,0001,5002,0002,5003,000
Fragments read111111
Time (milliseconds)475248564847

The above data clearly indicates constant time.

Reading the top-level container node with all descendants:

30002.67
Total device nodes5001,0001,50010001500200025002,0002,5003,000
Fragments read43,00086,000129,000172,000215,000258,000
Time (seconds)0.421.191.542.162.532.532.67
Fragments per second102,38172,26983,76679,63085,65796,629

Graph of time taken to read top-level container node with all descendants

Observations

  • Reading a single top-level container node with no descendants has constant time (as expected).
  • Reading a single top-level container node with all descendants has linear time (as expected).
  • Raw performance of reading with all descendants is roughly 100,000 fragments per second.

...

Test results

3000
Total device nodes5001,0001,5002,0002,50010001500200025003,000
Fragments read43,00086,000129,000172,000215,000258,000
Time (seconds)0.611.151.522.142.963.97

Special case: attempting to read multiple non-existing data nodes

In this case, we attempt to read many non-existing data nodes:

3000
Total devices nodes5001,0001,5002,0002,50010001500200025003,000
Fragments read000000
Time (milliseconds)10109978

The above case appears to be constant time, but theoretically must be linear - we'll call it negligible time.

...

Recently, functionality was added to enable reading whole lists (CPS-1696). Here we compare performance of reading a container node containing a list, versus reading the list (with all descendants).

3000
Total device nodes50010001500200025001,0001,5002,0002,5003,000xpath
Reading container0.3860.7121.5292.6671.7593.112/openroadm-devices
Reading list0.5851.3352.0362.8602.7693.949/openroadm-devices/openroadm-device

...

CM-handle registration

40000
ReleaseDateCmHandlesCM-Handles removed
10050010002000500010000200001,0002,0005,00010,00020,00040,000Comments
KohnOct 2022Time taken
8 sec16 sec19 sec33 sec1 min3 minERRORERRORError due to DB running out shared memory
LondonMay 2023Time taken
6 sec7 sec12 sec22 sec1 min2 3 min10 min32 min
currentAug 2023Time taken
6 sec7 sec10 sec16 sec31 sec57 sec71 sec108 sec

...

CM-handle de-registration

ReleaseDateCmHandlesCM-Handles removed
1005001000200050001000020000400001,0002,0005,00010,00020,00040,000Comments
KohnOct 2022Time taken
7 sec2 min7 min25 min2.5 hourest: 10 hourest: 2 daysest: 7 daysSome values estimated due to time constraints
LondonMay 2023Time taken
< 1 sec2 sec3 sec5 sec17 sec37 sec85 secERRORError due to 32K limit
current

Aug 2023

Time taken
< 1 sec1 sec3 sec4 sec14 sec23 sec65 sec2 min

Current release has exactly linear performance for CM-handle de-registration (on a single thread):

CM-handles RemovedTime (sec)CM-handles/sec
5001.5327
10002.6377
500013.2377
1000025.9385
2000056.1356

Commentary

CM-handle de-registration is a synchronous operation. Since the above tests send a single large batch, the operation is performed in a single thread. The linear performance reflects this. If the operation was broken into many smaller requests, the expectation is that it would complete many times faster - but this has not been tested.