...
There is work in progress to address this behavior with a custom JDBC-based algorithm. Using a graph database would implicitly fix such issues.
Updating data nodes
Batch updating a collection using updateDataNodesAndDescendants (plural)
In this scenario, 1,000 Open ROADM device nodes are already defined. A number of these existing data nodes will be updated using CpsDataService::updateDataNodesAndDescendants (plural).
...
Time (seconds)
...
69.46
...
Graph of time taken to update device nodes and descendants
Observations
- Updating data nodes has linear time complexity (as expected).
- Raw performance is roughly 600 fragments per second for the given model and test setup.
- Updating data nodes is 5 times slower than storing data nodes.
Commentary
This is the weak spot of all write operations. A custom algorithm comparing existing data against updated data (delta algorithm) may be required to improve performance here. This could also indicate that Hibernate is not being used effectively.
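The delta idea mentioned above can be sketched as follows. This is a minimal illustration, not the CPS implementation: the class name FragmentDelta and the representation of a data tree as a map from xpath to leaves are assumptions made for the example. It compares the existing snapshot with the updated one and reports only the xpaths that would need an insert, update, or delete, so unchanged nodes cause no write.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: works out which nodes actually need a database write. */
final class FragmentDelta {

    final Set<String> toInsert = new TreeSet<>();
    final Set<String> toUpdate = new TreeSet<>();
    final Set<String> toDelete = new TreeSet<>();

    /** Compares two snapshots keyed by xpath; each value holds that node's leaves. */
    static FragmentDelta between(Map<String, Map<String, Object>> existing,
                                 Map<String, Map<String, Object>> updated) {
        FragmentDelta delta = new FragmentDelta();
        for (Map.Entry<String, Map<String, Object>> entry : updated.entrySet()) {
            String xpath = entry.getKey();
            if (!existing.containsKey(xpath)) {
                // Node is new: it must be inserted
                delta.toInsert.add(xpath);
            } else if (!Objects.equals(existing.get(xpath), entry.getValue())) {
                // Node exists but its leaves differ: it must be updated
                delta.toUpdate.add(xpath);
            }
            // Otherwise the node is unchanged and needs no write
        }
        for (String xpath : existing.keySet()) {
            if (!updated.containsKey(xpath)) {
                // Node is absent from the update: it must be deleted
                delta.toDelete.add(xpath);
            }
        }
        return delta;
    }
}
```

Because Map equality in Java compares entries rather than iteration order, comparing leaves as maps (instead of as encoded JSON strings) is not affected by key ordering.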
There are a number of issues possibly affecting performance here. For example, after DataNodes are converted to FragmentEntity objects, they are persisted using the Hibernate persist method, which always writes the changes, regardless of whether the data has changed. There is an alternative method called merge, which compares existing data to updated data. Even if this were implemented, there are additional issues, such as JSON encoding giving inconsistent key ordering. For example, data leaves could be encoded as "{'x':'1', 'y':'2'}" or "{'y':'2','x':'1'}", depending on the exact object type used for storing leaves (HashMap, LinkedHashMap, ImmutableMap, etc.). There is an option for JsonObjectMapper to order the keys alphabetically during encoding.
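To illustrate why key ordering matters for comparing encoded leaves, here is a minimal sketch using only the Java standard library. The class CanonicalLeaves and its hand-rolled encoder are hypothetical and handle string values only; they are not how CPS encodes leaves. Jackson offers SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS for the same purpose, though whether JsonObjectMapper enables it is not confirmed here.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

/** Hypothetical encoder that always emits leaves with keys in alphabetical order. */
final class CanonicalLeaves {

    /** Encodes string-valued leaves as a JSON object, sorted by key. */
    static String encode(Map<String, String> leaves) {
        // Copying into a TreeMap sorts the keys, whatever map type was passed in
        Map<String, String> sorted = new TreeMap<>(leaves);
        return sorted.entrySet().stream()
                .map(entry -> quote(entry.getKey()) + ":" + quote(entry.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    private static String quote(String text) {
        // Minimal escaping for this sketch: backslash and double quote only
        return "\"" + text.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

With a canonical ordering, two maps holding the same leaves always produce the same string, so a string comparison between stored and updated data becomes meaningful.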
...
(NOTE: I have removed the previous test results here, as the testing was flawed).
Analysis is still ongoing, but here is an overview of some of the findings:
- For making CM-handle state updates, NCMP uses CpsDataService::updateDataNodesAndDescendants (plural, taking a Map<xpath, json>). The performance of this function is poor: because it takes a map of JSON, it needs to instantiate a new YangParser for each piece of JSON. Better performance would be achieved by using CpsDataService::updateDataNodeAndDescendants (singular, taking a single piece of JSON). Note that the plural API could be removed afterwards, since it is only used by NCMP and was never exposed as part of the public REST API.
- Related: CpsDataService::saveListElementsBatch likewise has very poor performance compared with CpsDataService::saveListElements, which can achieve the same effect if the input JSON is prepared correctly. The poor performance stems from needing a new YangParser instance for each chunk of JSON instead of a single YangParser for one JSON string. This API is not publicly exposed via the REST API and is only used in one place in NCMP, so it could be removed.
- The main public API for updating data nodes is CpsDataService::updateDataNodeAndDescendants (singular). Its performance varies widely depending on how the JSON is prepared. For example, given an update that affects exactly the same data nodes:
- updating a top-level node with parent path of '/' is much slower than updating the descendants. See table below.
- As such, it is now advised to update data trees at the deepest possible level.
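The point about preparing the input JSON as one string, rather than passing many small chunks, can be sketched as below. This is an assumption-laden illustration: the class ListElementJson is hypothetical, and it assumes each chunk is a single already-serialized list element (a JSON object). Chunks wrapped in their own list container would need unwrapping first.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that merges list-element JSON chunks into one document. */
final class ListElementJson {

    /**
     * Wraps already-serialized list elements in a single JSON object of the
     * form {"listName":[element, element, ...]} so it can be parsed in one pass.
     */
    static String combine(String listName, List<String> elementJsons) {
        String elements = elementJsons.stream()
                .map(String::trim)
                .collect(Collectors.joining(","));
        return "{\"" + listName + "\":[" + elements + "]}";
    }
}
```

The combined string could then be passed to a single-JSON API in one call, so that only one parser instance is needed for the whole batch.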
Comparison of updating using different approaches
There are multiple ways of updating data nodes, each with different performance.
In these scenarios, 1,000 Open ROADM device nodes are already defined. A number of these existing data nodes will be updated using CpsDataService::updateDataNodeAndDescendants (singular).
Scenario | Time (seconds) | Parent Xpath | Sample Json | Remarks |
---|---|---|---|---|
Update 1 device node under top-level container | 0.6 | /openroadm-devices | { "openroadm-device": [ ... ] } | |
Update a list of 100 device nodes under top-level container | 3 | /openroadm-devices | { "openroadm-device": [ ... ] } | |
Update top-level container having a list of 100 device nodes | 22 | / | { "openroadm-devices": { "openroadm-device": [ ... ] } } | |
...
Updating data leaves
In this scenario, 1,000 Open ROADM device nodes are already defined. The data leaves of a number of these existing data nodes will be updated using CpsDataService::updateNodeLeaves.
...