We would like to add support for a contains operator in cps-path.
contains() is a function in XPath expressions. It is used to match an attribute whose value is only partially known or changes dynamically. Examples are given below.
Reference
Issues & decisions
Native query for Contains Operator using Like Keyword :
# | Issue | Notes | Decisions |
---|---|---|---|
1 | Which keyword to use? Do we want case sensitivity or not? Do we follow the XPath contains or do we become specific? | | As per discussion, Toine Siebelink prefers following the XPath contains, which is case sensitive. In PostgreSQL, LIKE is case sensitive and ILIKE is case insensitive, so the LIKE keyword would be suitable to implement a contains query that matches case-sensitive attribute values. Need to discuss with stakeholders. |
# | Json Data | CPS-PATH Syntax | Output |
---|---|---|---|
1 | Below is the sample data. Here are ways to use the contains keyword: | <cps-path>[contains(@<leaf-name>,'<string-value>')] | |
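To make the proposed syntax concrete, here is a minimal Python sketch that extracts the leaf name and search string from a cps-path ending in a contains predicate. It assumes the predicate follows the XPath function-call form `[contains(@<leaf-name>,'<string-value>')]`. The regular expression and the names `ContainsPredicate` and `parse_contains` are illustrative only and stand in for the real parser grammar.

```python
import re
from typing import NamedTuple, Optional


class ContainsPredicate(NamedTuple):
    """Hypothetical holder for the parts of a contains predicate."""
    path_prefix: str
    leaf_name: str
    value: str


# Assumed form: <cps-path>[contains(@<leaf-name>,'<string-value>')]
_CONTAINS_RE = re.compile(
    r"^(?P<prefix>.+)\[\s*contains\(\s*@(?P<leaf>[\w-]+)\s*,"
    r"\s*'(?P<value>[^']*)'\s*\)\s*\]$"
)


def parse_contains(cps_path: str) -> Optional[ContainsPredicate]:
    """Return the predicate parts, or None if the path has no contains predicate."""
    match = _CONTAINS_RE.match(cps_path.strip())
    if match is None:
        return None
    return ContainsPredicate(match["prefix"], match["leaf"], match["value"])
```

For example, `parse_contains("//categories[contains(@name,'Sci')]")` yields the prefix `//categories`, the leaf name `name` and the value `Sci`, while a path with an ordinary `[@name='Sci']` filter yields `None`.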
Native Query for contains keyword
1. Using LIKE Keyword:
The LIKE operator matches a string against a specified pattern. It has two wildcards:
% : Matches any sequence of zero or more characters.
_ : Matches any single character.
# | Query | Output |
---|---|---|
1 | cpsdb=# SELECT * FROM FRAGMENT WHERE anchor_id = 4 and attributes->>'lang' like '%en%'; | |
2 | cpsdb=# SELECT * FROM FRAGMENT WHERE anchor_id = 4 and attributes->>'lang' ilike '%En%'; | |
3 | cpsdb=# SELECT * FROM FRAGMENT WHERE anchor_id = 4 and attributes->>'lang' like 'en'; | |
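The queries above embed the pattern directly in the SQL text. A contains implementation would more likely build the pattern from the user-supplied string, in which case any `%`, `_` or backslash in that string should be escaped so it is matched literally (backslash is PostgreSQL's default LIKE escape character). The sketch below shows one way to do this. The function names, the `:name` placeholder style and the query layout are assumptions for illustration, not the actual CPS query.

```python
from typing import Dict, Tuple


def escape_like(value: str, escape_char: str = "\\") -> str:
    """Escape LIKE wildcards so the value is matched literally."""
    # Escape the escape character first, otherwise it would be doubled twice.
    escaped = value.replace(escape_char, escape_char * 2)
    escaped = escaped.replace("%", escape_char + "%")
    escaped = escaped.replace("_", escape_char + "_")
    return escaped


def build_contains_query(
    anchor_id: int, leaf_name: str, value: str, case_sensitive: bool = True
) -> Tuple[str, Dict[str, object]]:
    """Return a parameterised query and its parameters (hypothetical layout)."""
    operator = "LIKE" if case_sensitive else "ILIKE"
    sql = (
        "SELECT * FROM fragment WHERE anchor_id = :anchorId "
        f"AND attributes ->> :leafName {operator} :pattern"
    )
    params = {
        "anchorId": anchor_id,
        "leafName": leaf_name,
        # Surround the escaped value with % so it may occur anywhere.
        "pattern": "%" + escape_like(value) + "%",
    }
    return sql, params
```

Passing the pattern as a bound parameter rather than concatenating it into the SQL string also avoids SQL injection through the search value.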
2. Using SIMILAR TO Regular Expression Keyword:
The SIMILAR TO operator returns true or false depending on whether its pattern matches the given string. It is similar to LIKE, except that it interprets the pattern using the SQL standard's definition of a regular expression. Like LIKE, it must match the entire string and it supports the % and _ wildcards.
SIMILAR TO supports these pattern-matching metacharacters borrowed from POSIX regular expressions:
| : denotes alternation (either of two alternatives).
* : denotes repetition of the previous item zero or more times.
+ : denotes repetition of the previous item one or more times.
? : denotes repetition of the previous item zero or one time.
{m} : denotes repetition of the previous item exactly m times.
{m,} : denotes repetition of the previous item m or more times.
{m,n} : denotes repetition of the previous item at least m and not more than n times.
() : parentheses can be used to group items into a single logical item.
[...] : a bracket expression specifies a character class, just as in POSIX regular expressions.
# | Query | Output |
---|---|---|
1 | cpsdb=# SELECT * FROM FRAGMENT WHERE anchor_id = 3 and attributes->>'pub_year' similar to '%(94|95)%'; | |
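To illustrate how a SIMILAR TO pattern such as `'%(94|95)%'` behaves, the following Python sketch translates a pattern into an equivalent Python regular expression and requires it to match the whole string. It is a simplified approximation: escape characters are not handled and the contents of bracket expressions are passed through unchanged. The function names are hypothetical.

```python
import re


def similar_to_regex(pattern: str) -> re.Pattern:
    """Translate a simplified SIMILAR TO pattern into a Python regex."""
    parts = []
    in_bracket = False
    for ch in pattern:
        if in_bracket:
            # Inside [...] characters are copied as-is (simplification).
            parts.append(ch)
            if ch == "]":
                in_bracket = False
        elif ch == "[":
            parts.append(ch)
            in_bracket = True
        elif ch == "%":
            parts.append(".*")  # any sequence of characters
        elif ch == "_":
            parts.append(".")   # any single character
        elif ch in ".^$\\":
            # Not metacharacters in SIMILAR TO, so match them literally.
            parts.append("\\" + ch)
        else:
            # | * + ? { } ( ) keep the same meaning in both syntaxes.
            parts.append(ch)
    return re.compile("".join(parts), re.DOTALL)


def similar_to(value: str, pattern: str) -> bool:
    """Return True if the whole value matches the pattern."""
    return similar_to_regex(pattern).fullmatch(value) is not None
```

With this sketch, `similar_to("1994", "%(94|95)%")` is true and `similar_to("1996", "%(94|95)%")` is false, while `similar_to("1995", "(94|95)")` is false because the pattern without `%` must cover the entire string.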
Performance: As the change to the query is small, performance is expected to be similar to that of the existing query.
Implementation of Contains Operator
1. Update ANTLR parser to recognize this pattern
2. Implement required (native) query
3. Add db-container tests for
a. filter on string leaf-value
b. filter on integer leaf-value
c. filter on leaf-list value
4. Update documentation
5. Demo to team
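The three test scenarios in step 3 can be prototyped in memory before db-container tests exist. The sketch below approximates `attributes->>'<leaf>'` by using strings as they are and rendering other JSON values (numbers, lists) as JSON text, then applies a case-sensitive substring check. The sample fragments and function names are made up for illustration and are not CPS test data.

```python
import json
from typing import Any, Dict, List, Optional


def attribute_as_text(value: Any) -> Optional[str]:
    """Approximate ->> : strings as-is, JSON null as None, the rest as JSON text."""
    if value is None:
        return None
    if isinstance(value, str):
        return value
    return json.dumps(value)


def filter_contains(
    fragments: List[Dict[str, Any]], leaf_name: str, search: str
) -> List[Dict[str, Any]]:
    """Keep fragments whose leaf text contains the search string (case sensitive)."""
    result = []
    for fragment in fragments:
        text = attribute_as_text(fragment.get(leaf_name))
        if text is not None and search in text:
            result.append(fragment)
    return result


# Made-up sample data covering a string, an integer and a leaf-list.
FRAGMENTS = [
    {"lang": "english", "pub_year": 1994, "editions": ["first", "second"]},
    {"lang": "German", "pub_year": 2001, "editions": ["third"]},
]
```

Each scenario then becomes a one-line check, for example `filter_contains(FRAGMENTS, "pub_year", "94")` for the integer leaf or `filter_contains(FRAGMENTS, "editions", "second")` for the leaf-list.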