TO DO Laptop specs
The performance tests are written in Groovy (a JVM language). As all CPS Core operations are synchronous, the results here are to be considered as single-threaded operation only.
Test data
Test data used complies with Open ROADM YANG model. Specifically, openroadm-device nodes consisting of 86 fragments are created. For example, a test that creates 1000 device nodes will result in 86,000 fragments in the database. Some tests use up to 3000 device nodes (258,000 fragments - equivalent to around 20,000 CM-handles in NCMP), with four anchors replicating the data, meaning that the system has been tested up to 1 million fragments.
A varying number of Open ROADM device nodes will be stored using CpsDataService::saveData.
Created device nodes | 1 | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
---|---|---|---|---|---|---|---|---|---|---|---|
Time (seconds) | 0.30 | 2.36 | 4.36 | 7.15 | 9.76 | 11.50 | 14.77 | 18.43 | 19.79 | 22.16 | 26.54 |
Observations
Commentary
The current database schema does not allow for Hibernate to use JDBC write batching. There is work in progress to address this.
JPA/Hibernate is not well-suited to tree-structured data. As an example, writing data nodes / fragments happens in two low-level steps:
There is work in progress to address this behavior using a custom algorithm using JDBC. The use of a graph database would implicitly fix such issues.
In this scenario, 1000 Open ROADM device nodes are already defined. A number of these existing data nodes will be updated using CpsDataService::updateDataNodeAndDescendants.
Updated device nodes | 1 | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
---|---|---|---|---|---|---|---|---|---|---|---|
Time (seconds) | 0.22 | 12.79 | 28.38 | 44.23 | 51.55 | 69.46 | 85.67 | 95.02 | 109.16 | 117.00 | 131.15 |
Observations
Commentary
This is the weak spot of all write operations. A custom algorithm comparing existing data against updated data (delta algorithm) may be required to improve performance here.
This could also indicate that Hibernate is not being used effectively.
In this scenario, 1000 Open ROADM device nodes are already defined. The data leaves of a number of these existing data nodes will be updated using CpsDataService::updateNodeLeaves.
Example JSON payload for updating data leaves for one device:
{ 'openroadm-device': [ {'device-id':'C201-7-1A-1', 'status':'fail', 'ne-state':'jeopardy'} ] } |
Test Results
Updated device nodes | 1 | 100 | 200 | 300 | 400 | 500 | 600 | 700 | 800 | 900 | 1000 |
---|---|---|---|---|---|---|---|---|---|---|---|
Time (seconds) | 0.20 | 0.27 | 0.28 | 0.28 | 0.32 | 0.38 | 0.39 | 0.47 | 0.49 | 0.52 | 0.56 |
Observations
Commentary
The performance of this function is excellent, and yet it may be improved by write batching.
In this scenario, 300 Open ROADM device nodes are already defined. A number of these data nodes will be deleted using CpsDataService::deleteDataNodes. The types of nodes will be varied, for example, deleting container nodes, list elements, or whole lists.
Test results
N = | 50 | 100 | 150 | 200 | 250 | 300 | Example xpath |
---|---|---|---|---|---|---|---|
Delete top-level container node | - | - | - | - | - | 0.63 | /openroadm-devices |
Batch delete N/300 container nodes | 0.15 | 0.26 | 0.38 | 0.45 | 0.55 | 0.69 | /openroadm-devices/openroadm-device[@device-id='C201-7-1A-10']/org-openroadm-device |
Batch delete N/300 lists elements | 0.13 | 0.25 | 0.34 | 0.45 | 0.55 | 0.67 | /openroadm-devices/openroadm-device[@device-id='C201-7-1A-49'] |
Batch delete N/300 whole lists | 0.51 | 1.05 | 1.40 | 1.85 | 2.13 | 2.56 | /openroadm-devices/openroadm-device[@device-id='C201-7-1A-293']/org-openroadm-device/degree |
Try batch delete N/300 non-existing | 0.25 | 0.54 | 0.67 | 0.95 | 1.15 | 1.32 | /path/to/non-existing/node[@id='27'] |
Observations
Suggested improvement: For whole list deletion, add a condition to the WHERE clause in the SQL for deleting lists, to narrow the search space to children of the parent. For example:
SELECT * FROM fragment WHERE (existing conditions)AND parent_id = (SELECT id FROM fragment WHERE xpath = '/parent-xpath')
This should narrow the performance gap in this case.
In these tests, a varying number of Open ROADM devices are created and retrieved.
In this test, CpsDataService::getDataNodes is used to retrieve the top-level container node.
Test results
Reading the top-level container node with no descendants:
Total device nodes | 500 | 1000 | 1500 | 2000 | 2500 | 3000 |
---|---|---|---|---|---|---|
Time (milliseconds) | 47 | 52 | 48 | 56 | 48 | 47 |
The above data clearly indicates constant time.
Reading the top-level container node with all descendants:
Total device nodes | 500 | 1000 | 1500 | 2000 | 2500 | 3000 |
---|---|---|---|---|---|---|
Time (seconds) | 0.42 | 1.19 | 1.54 | 2.16 | 2.53 | 2.67 |
Observations
This test uses CpsDataService::getDataNodesForMultipleXpaths with all descendants to retrieve a varying number of Open ROADM device nodes.
Test results
Total device nodes | 500 | 1000 | 1500 | 2000 | 2500 | 3000 |
---|---|---|---|---|---|---|
Time (seconds) | 0.61 | 1.15 | 1.52 | 2.14 | 2.96 | 3.97 |
Special case: attempting to read multiple non-existing data nodes
In this case, we attempt to read many non-existing data nodes:
Total devices nodes | 500 | 1000 | 1500 | 2000 | 2500 | 3000 |
Time (milliseconds) | 10 | 10 | 9 | 9 | 7 | 8 |
The above case appears to be constant time, but theoretically must be linear - we'll call it negligible time.
Observations
Recently, functionality was added to enable reading whole lists (CPS-1696). Here we compare performance of reading a container node containing a list, versus reading the list (with all descendants).
Total device nodes | 500 | 1000 | 1500 | 2000 | 2500 | 3000 | xpath |
Reading container | 0.386 | 0.712 | 1.529 | 2.667 | 1.759 | 3.112 | /openroadm-devices |
Reading list | 0.585 | 1.335 | 2.036 | 2.860 | 2.769 | 3.949 | /openroadm-devices/openroadm-device |
As can be seen, it is slower reading the list than reading the parent node containing the list.
Suggested improvement: Add a condition to the WHERE clause in the SQL for reading lists, to narrow the search space to children of the parent. For example:
SELECT * FROM fragment WHERE (existing conditions)AND parent_id = (SELECT id FROM fragment WHERE xpath = '/parent-xpath')
This should narrow the performance gap in this case.
TO DO
There have been massive orders-of-magnitude improvements to query performance. Formerly, queries had quadratic complexity, which is now linear. In the last month alone, performance has improved by 100x.
Release | Date | CmHandles | 100 | 500 | 1000 | 2000 | 5000 | 10000 | 20000 | 40000 | Comments |
---|---|---|---|---|---|---|---|---|---|---|---|
Kohn | Oct 2022 | Time | 8 sec | 16 sec | 17 sec | 33 sec | 1 min | 3 min | ERROR | ERROR | Error due to DB running out shared memory |
London | May 2023 | Time | 6 sec | 7 sec | 12 sec | 22 sec | 1 min | 2 min | 10 min | 32 min | |
current | Aug 2023 | Time | 6 sec | 7 sec | 10 sec | 16 sec | 31 sec | 57 sec | 71 sec | 108 sec |
CM-handle registration is multi-threaded, so performance may appear to scale better than linear, until the CPU cores are maxed out.
As can be seen below, CPU usage never reached 100% during the tests of the current version.
Release | Date | CmHandles | 100 | 500 | 1000 | 2000 | 5000 | 10000 | 20000 | 40000 | Comments |
---|---|---|---|---|---|---|---|---|---|---|---|
Kohn | Oct 2022 | Time | 7 sec | 2 min | 7 min | 25 min | 2.5 hour | est: 10 hour | est: 2 days | est: 7 days | Some values estimated due to time constraints |
London | May 2023 | Time | < 1 sec | 2 sec | 3 sec | 5 sec | 17 sec | 37 sec | 85 sec | ERROR | Error due to 32K limit |
current | Aug 2023 | Time | < 1 sec | 1 sec | 3 sec | 4 sec | 14 sec | 23 sec | 65 sec | 2 min |
Current release has exactly linear performance for CM-handle de-registration (on a single thread):
Removed | Time (sec) | CM-handles/sec |
---|---|---|
500 | 1.5 | 327 |
1000 | 2.6 | 377 |
5000 | 13.2 | 377 |
10000 | 25.9 | 385 |
20000 | 56.1 | 356 |