Grouping of generated delta reports in Delta Feature

| Date | Item | Owner | Notes | Relevant Links |
|---|---|---|---|---|
| Aug 9, 2024 | | @someone | Action | |

Requirement Analysis

Currently, the CPS Delta feature is designed so that it reports the delta of every data node individually, regardless of a data node's relationship to other data nodes.

For example:

If we have two data nodes in a parent-child relationship, then for each of the three possible delta cases (create, remove and replace) we get the following variations of the delta report:

[
  {
    "action": "create",
    "xpath": "/xpath/to/parent-node",
    "target-data": { "data": "leaf data of parent-node" }
  },
  {
    "action": "create",
    "xpath": "/xpath/to/child-node",
    "target-data": { "data": "leaf data of child node" }
  }
]

[
  {
    "action": "remove",
    "xpath": "/xpath/to/parent-node",
    "source-data": { "data": "leaf data of parent-node" }
  },
  {
    "action": "remove",
    "xpath": "/xpath/to/child-node",
    "source-data": { "data": "leaf data of child node" }
  }
]

[
  {
    "action": "replace",
    "xpath": "/xpath/to/parent-node",
    "source-data": { "data": "leaf data of parent-node before update" },
    "target-data": { "data": "leaf data of parent-node after update" }
  },
  {
    "action": "replace",
    "xpath": "/xpath/to/child-node",
    "source-data": { "data": "leaf data of child node before update" },
    "target-data": { "data": "leaf data of child node after update" }
  }
]

In all the above scenarios we can see that the delta feature reports each data node individually as a separate delta operation.

From the perspective of reporting the delta between data nodes, this approach is suitable because each data node is an individual entity with its own unique properties, such as its xpath and leaf data, and operations such as create, remove and update can be performed on each node individually.

However, this approach also makes the delta report more complicated in terms of the number of deltas it contains: the more data nodes there are in the JSON/XML data, the more deltas are reported.

This makes the delta report trickier and slower to use, because a consumer has to iterate over every delta in the report in order to make use of it.

Proposed Solution

The delta report can be further simplified by grouping the deltas of data nodes sharing a parent-child relationship.

The proposed solution would introduce this change as an optional feature, in which users can enable grouping of the generated deltas using a query parameter.

This would make the solution modular and add it as an extra feature to the existing delta feature without impacting the current functionality.

This would allow users to either generate the delta report in the existing format or group the deltas by parent-child relationship, depending on their requirements.

Proposed Approaches

Both approaches share a common entry point: a boolean flag (group-data-nodes) that enables or disables the grouping functionality.

Approach 1

In this approach, the delta is generated using the current algorithm. If the grouping flag is enabled, a grouping algorithm then runs over the generated delta report and groups the deltas of data nodes that share a parent-child relationship.
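
The post-processing step of Approach 1 could be sketched as follows. This is a minimal illustration in Python, not the CPS implementation: the function name group_deltas, the "children" key and the example xpaths are all assumptions, and the nested shape is only a placeholder for whatever grouped format the updated delta report adopts. The sketch nests each delta under the closest ancestor xpath that appears in the report with the same action.

```python
from typing import Any

Delta = dict[str, Any]


def group_deltas(deltas: list[Delta]) -> list[Delta]:
    """Nest each delta under its closest ancestor delta with the same action.

    Hypothetical sketch: the "children" key is an assumed output shape.
    """
    # Index a copy of every delta by (action, xpath) so ancestors can be looked up.
    nodes = {(d["action"], d["xpath"]): {**d, "children": []} for d in deltas}
    roots: list[Delta] = []
    for (action, xpath), node in nodes.items():
        parent = None
        cut = xpath.rfind("/")
        # Strip one trailing path segment at a time until an ancestor is found.
        while cut > 0 and parent is None:
            parent = nodes.get((action, xpath[:cut]))
            cut = xpath.rfind("/", 0, cut)
        (parent["children"] if parent is not None else roots).append(node)
    return roots


# Example xpaths are hypothetical and chosen so that one is nested in the other.
flat_report = [
    {"action": "create", "xpath": "/bookstore", "target-data": {"name": "Easons"}},
    {"action": "create", "xpath": "/bookstore/categories", "target-data": {"code": "1"}},
    {"action": "remove", "xpath": "/library", "source-data": {"name": "Old"}},
]
grouped_report = group_deltas(flat_report)
```

Because the flat report is produced first and left untouched, the ungrouped output stays identical to what the existing algorithm returns; the cost is the extra pass over every delta.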

Approach 2

In this approach, the data nodes are identified and grouped by their parent-child relationship before the delta operation is performed. Because the nodes are already grouped when the comparison starts, fewer iterations are needed to generate the delta.
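
One possible reading of Approach 2 is sketched below, again in Python and under stated assumptions. The node shape (a dict with "xpath", "leaves" and "children"), the function name generate_grouped_delta and the example data are hypothetical and not taken from CPS. The idea illustrated is that the source and target are compared as trees, so a subtree that exists on only one side is reported as a single delta without visiting its descendants.

```python
from typing import Any

# Assumed node shape: {"xpath": str, "leaves": dict, "children": list of nodes}
Node = dict[str, Any]


def generate_grouped_delta(source: list[Node], target: list[Node]) -> list[dict[str, Any]]:
    """Compare two lists of sibling nodes, reporting whole subtrees where possible."""
    src = {node["xpath"]: node for node in source}
    tgt = {node["xpath"]: node for node in target}
    report: list[dict[str, Any]] = []
    for xpath in sorted(src.keys() | tgt.keys()):
        if xpath not in tgt:
            # Whole subtree removed: one delta, no descent into children.
            report.append({"action": "remove", "xpath": xpath, "source-data": src[xpath]})
        elif xpath not in src:
            # Whole subtree created: one delta, no descent into children.
            report.append({"action": "create", "xpath": xpath, "target-data": tgt[xpath]})
        else:
            s, t = src[xpath], tgt[xpath]
            if s["leaves"] != t["leaves"]:
                report.append({
                    "action": "replace",
                    "xpath": xpath,
                    "source-data": s["leaves"],
                    "target-data": t["leaves"],
                })
            # Node exists on both sides, so only here do we recurse.
            report.extend(generate_grouped_delta(s["children"], t["children"]))
    return report
```

A sketch like this also shows where the consistency risk comes from: the comparison logic is separate from the existing per-node algorithm, so both would need to be tested against the same inputs.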

Pros and Cons of both approaches

Approach 1

Pros:

The delta is generated using the existing algorithm, ensuring consistent results between the grouped and ungrouped options.

Cons:

Slower, because the delta is first generated for every data node individually and the grouping is performed afterwards, which results in a higher number of iterations.

Approach 2

Pros:

Faster than Approach 1, because the grouping is done before the delta is generated, so the number of iterations is reduced.

Cons:

A separate algorithm might generate results that are inconsistent with the existing one, so careful planning and testing are needed alongside development to ensure consistent results between the two options.

Proposed changes to delta endpoints

  • GET /v2/dataspaces/{dataspace-name}/anchors/{source-anchor-name}/delta?target-anchor-name={target-anchor-name}&xpath={xpath}&descendants={descendants}&group-by-nodes={true}

| Parameter name | In | Required | Description |
|---|---|---|---|
| dataspace-name | Path | Yes | Dataspace name |
| source-anchor-name | Path | Yes | First Anchor Name/Reference Anchor |
| target-anchor-name | Query | Yes | Second Anchor Name/Comparand Anchor |
| xpath | Query | Yes | xpath of the node |
| descendants | Query | No | Level of descendants for delta comparison. |
| group-by-nodes | Query | No | Flag to enable/disable grouping of data nodes in delta report |
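
As an illustration of how a client might assemble the proposed GET request, the following Python sketch builds the URL from the parameters listed above. The helper name build_delta_url and the base URL used in the usage note are assumptions made for this example; only the path template and the parameter names come from the proposal.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def build_delta_url(
    base_url: str,
    dataspace: str,
    source_anchor: str,
    target_anchor: str,
    xpath: str,
    descendants: Optional[str] = None,
    group_by_nodes: bool = False,
) -> str:
    """Build the proposed delta GET URL (hypothetical helper, not part of CPS)."""
    # Path parameters are percent-encoded as single path segments.
    path = "/v2/dataspaces/{}/anchors/{}/delta".format(
        quote(dataspace, safe=""), quote(source_anchor, safe="")
    )
    # Required query parameters first, optional ones only when set.
    query = {"target-anchor-name": target_anchor, "xpath": xpath}
    if descendants is not None:
        query["descendants"] = descendants
    if group_by_nodes:
        query["group-by-nodes"] = "true"
    return base_url.rstrip("/") + path + "?" + urlencode(query)
```

For example, calling build_delta_url("http://localhost:8080/cps/api", "my-dataspace", "anchor1", "anchor2", "/bookstore", group_by_nodes=True) with that made-up base URL yields a URL ending in /delta?target-anchor-name=anchor2&xpath=%2Fbookstore&group-by-nodes=true. Leaving group_by_nodes at its default omits the flag, which corresponds to the existing ungrouped report.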

  • POST /v2/dataspaces/{dataspace-name}/anchors/{anchor-name}/delta?xpath={xpath}&group-by-nodes={true}

| Parameter name | In | Required | Description |
|---|---|---|---|
| dataspace-name | Path | Yes | Dataspace name |
| anchor-name | Path | Yes | Anchor Name/Reference Anchor |
| xpath | Query | Yes | xpath of the node |
| group-by-nodes | Query | No | Flag to enable/disable grouping of data nodes in delta report. |
| descendants | N/A | No | Level of descendants for delta comparison. Set to INCLUDE_ALL_DESCENDANTS by default. |

Updated Delta Report format

After the proposed changes are implemented, the updated delta report will be concise and have the data nodes grouped as follows: