...
The current database schema does not allow Hibernate to use JDBC write batching. There is work in progress to address this.
...
There is work in progress to address this behavior with a custom algorithm using JDBC directly. The use of a graph database would implicitly fix such issues.
Updating data nodes
Batch updating a collection using updateDataNodesAndDescendants (plural)
In this scenario, 1,000 Open ROADM device nodes are already defined. A number of these existing data nodes will be updated using CpsDataService::updateDataNodesAndDescendants (plural).
...
Time (seconds)
...
69.46
...
Graph of time taken to update device nodes and descendants
Observations
- Updating data nodes has linear time complexity (as expected).
- Raw performance is roughly 600 fragments per second for the given model and test setup.
- Updating data nodes is 5 times slower than storing data nodes.
Commentary
This is the weak spot of all write operations. A custom algorithm comparing existing data against updated data (delta algorithm) may be required to improve performance here. This could also indicate that Hibernate is not being used effectively.
There are a number of issues possibly affecting performance here. For example, after DataNodes are converted to FragmentEntity objects, they are persisted using the Hibernate persist method, which always writes the changes, regardless of whether the data has actually changed. There is an alternative method, merge, which compares existing data against updated data. Even if merge were used, there are additional issues, such as JSON encoding giving inconsistent ordering. For example, data leaves could be encoded as "{'x':'1', 'y':'2'}" or "{'y':'2','x':'1'}", depending on the exact Map implementation used to store the leaves (HashMap, LinkedHashMap, ImmutableMap, etc.). There is an option for JsonObjectMapper to order the keys alphabetically during encoding.
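The key-ordering issue above can be sketched with plain Jackson (assuming CPS's JsonObjectMapper delegates to a Jackson ObjectMapper; the snippet below uses Jackson directly and is an illustrative sketch, not CPS code). Enabling ORDER_MAP_ENTRIES_BY_KEYS makes serialization deterministic regardless of the Map implementation holding the leaves, which would let a delta algorithm detect real data changes by simple string comparison:

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.SerializationFeature;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class DeterministicJsonDemo {
    public static void main(String[] args) throws Exception {
        // Sort map entries by key during serialization, so the same
        // leaves always produce the same JSON string.
        ObjectMapper objectMapper = new ObjectMapper()
                .enable(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS);

        Map<String, String> insertionOrdered = new LinkedHashMap<>();
        insertionOrdered.put("y", "2");
        insertionOrdered.put("x", "1");

        Map<String, String> hashOrdered = new HashMap<>();
        hashOrdered.put("x", "1");
        hashOrdered.put("y", "2");

        // Both maps now serialize to the same canonical string,
        // so string equality can be used to skip unchanged leaves.
        System.out.println(objectMapper.writeValueAsString(insertionOrdered));
        System.out.println(objectMapper.writeValueAsString(hashOrdered));
    }
}
```

Without this feature, the two maps above could serialize to differently ordered strings even though they hold identical data, making every leaf look "changed".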
WIP Comparison of updating using different approaches
There are multiple ways of updating data nodes, each with different performance characteristics.
In this scenario, 1,000 Open ROADM device nodes are already defined. A number of these existing data nodes will be updated using CpsDataService::updateDataNodeAndDescendants (singular).
...
To Do: this section is to be completed pending further analysis. See https://vladmihalcea.com/the-best-way-to-map-a-onetomany-association-with-jpa-and-hibernate/ for an explanation of why this occurs. In short, a unidirectional OneToMany mapping suffers from poor performance due to the order in which Hibernate writes out the entities. Switching to either a unidirectional ManyToOne mapping, or a bidirectional OneToMany and ManyToOne mapping, could solve the issue using pure Hibernate, but these come with drawbacks. (To Do: more details on this.)
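As a minimal sketch of the bidirectional alternative mentioned above (the entity, field, and column names here are illustrative assumptions, not the actual CPS schema): with a unidirectional OneToMany, Hibernate inserts the child rows and then issues extra UPDATE statements to populate the join column; letting the child side own the association avoids those extra statements.

```java
import jakarta.persistence.*;
import java.util.ArrayList;
import java.util.List;

// Hypothetical self-referencing fragment entity, for illustration only.
@Entity
public class FragmentEntity {
    @Id
    @GeneratedValue
    private Long id;

    private String xpath;

    // Owning side: the child row holds the parent_id foreign key, so
    // Hibernate can write each child with a single INSERT statement.
    @ManyToOne(fetch = FetchType.LAZY)
    @JoinColumn(name = "parent_id")
    private FragmentEntity parent;

    // Inverse side: mappedBy delegates column ownership to the child.
    @OneToMany(mappedBy = "parent", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<FragmentEntity> childFragments = new ArrayList<>();

    // Bidirectional mappings require keeping both sides in sync in memory.
    public void addChild(FragmentEntity child) {
        childFragments.add(child);
        child.parent = this;
    }

    public FragmentEntity getParent() {
        return parent;
    }

    public List<FragmentEntity> getChildFragments() {
        return childFragments;
    }
}
```

The drawback hinted at in the article is the bookkeeping: every place that attaches or detaches a child must go through a helper like addChild, or the in-memory state and the database diverge.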
Updating data nodes
NOTE: I have removed the previous test results here, as the test analysis was flawed. Analysis is ongoing as per CPS-1674.
Performance tests to support new analysis are provisionally available at: https://gerrit.nordix.org/c/onap/cps/+/19086
Updating data leaves
In this scenario, 1,000 Open ROADM device nodes are already defined. The data leaves of a number of these existing data nodes will be updated using CpsDataService::updateNodeLeaves.
...
Deleting data nodes
In this scenario, 300 Open ROADM device nodes are already defined. A number of these data nodes will be deleted using CpsDataService::deleteDataNodes. The types of nodes will be varied, for example, deleting container nodes, list elements, or whole lists.
Test results
N = | 50 | 100 | 150 | 200 | 250 | 300 | Example xpath |
---|---|---|---|---|---|---|---|
Delete top-level container node | - | - | - | - | - | 0.63 | /openroadm-devices |
Batch delete N/300 container nodes | 0.15 | 0.26 | 0.38 | 0.45 | 0.55 | 0.69 | /openroadm-devices/openroadm-device[@device-id='C201-7-1A-10']/org-openroadm-device |
Batch delete N/300 list elements | 0.13 | 0.25 | 0.34 | 0.45 | 0.55 | 0.67 | /openroadm-devices/openroadm-device[@device-id='C201-7-1A-49'] |
Batch delete N/300 whole lists | 0.51 | 1.05 | 1.40 | 1.85 | 2.13 | 2.56 | /openroadm-devices/openroadm-device[@device-id='C201-7-1A-293']/org-openroadm-device/degree |
Try batch delete N/300 non-existing | 0.25 | 0.54 | 0.67 | 0.95 | 1.15 | 1.32 | /path/to/non-existing/node[@id='27'] |
Observations
- Delete performance is linear in the amount of data being deleted (as expected).
- Raw performance of deleting container nodes is around 35,000 fragments per second. (So we can delete data nodes more than 10x faster than creating them.)
- Deleting lists is much slower than deleting the parent container of the list (this can be improved).
- Of note, attempting to delete non-existing data nodes takes longer than actually deleting the equivalent amount of nodes with descendants - it is a slow operation.
Suggested improvement: For whole list deletion, add a condition to the WHERE clause in the SQL for deleting lists, to narrow the search space to children of the parent. For example:
DELETE FROM fragment WHERE (existing conditions)
AND parent_id = (SELECT id FROM fragment WHERE xpath = '/parent-xpath')
This should narrow the performance gap in this case.
Reading data nodes
In these tests, a varying number of Open ROADM devices are created and retrieved.
Reading top-level container node
In this test, CpsDataService::getDataNodes is used to retrieve the top-level container node.
Test results
Reading the top-level container node with no descendants:
...
Reading the top-level container node with all descendants:
Total device nodes | 500 | 1,000 | 1,500 | 2,000 | 2,500 | 3,000 |
---|---|---|---|---|---|---|
Fragments read | 43,000 | 86,000 | 129,000 | 172,000 | 215,000 | 258,000 |
Time (seconds) | 0.42 | 1.19 | 1.54 | 2.16 | 2.53 | 2.67 |
Fragments per second | 102,381 | 72,269 | 83,766 | 79,630 | 85,657 | 96,629 |
Graph of time taken to read top-level container node with all descendants
Observations
- Reading a single top-level container node with no descendants has constant time (as expected).
- Reading a single top-level container node with all descendants has linear time (as expected).
- Raw performance of reading with all descendants is roughly 100,000 fragments per second.
Commentary
This is the fastest operation in CPS, in terms of fragments per second. This is not surprising, as a simple read should be the fastest operation.
Reading data nodes for multiple xpaths
This test uses CpsDataService::getDataNodesForMultipleXpaths with all descendants to retrieve a varying number of Open ROADM device nodes.
Test results
...
Special case: attempting to read multiple non-existing data nodes
In this case, we attempt to read many non-existing data nodes:
Total devices nodes | 500 | 1,000 | 1,500 | 2,000 | 2,500 | 3,000 |
---|---|---|---|---|---|---|
Fragments read | 1 | 1 | 1 | 1 | 1 | 1 |
Time (milliseconds) | 47 | 52 | 48 | 56 | 48 | 47 |
The above data clearly indicates constant time.
Reading many device nodes with all descendants:

Total device nodes | 500 | 1,000 | 1,500 | 2,000 | 2,500 | 3,000 |
---|---|---|---|---|---|---|
Fragments read | 43,000 | 86,000 | 129,000 | 172,000 | 215,000 | 258,000 |
Time (seconds) | 0.61 | 1.15 | 1.52 | 2.14 | 2.96 | 3.97 |
Observations
- Reading many data nodes with all descendants has linear time (as expected).
- Attempting to read many non-existing data nodes takes negligible time.
- Raw performance of reading many with all descendants is roughly 80,000 fragments per second.
...